merge dataframe by lines

Asked

Viewed 58 times

2

I have 2 data frames and would like to merge a data frame with another data frame by the value column, subtracting the values of each line. The data frames I have are :

df1:

   UF  Ano   Valor
    1 AC 2007 2315798
    2 AC 2008 2425687
    3 AC 2009 2511285
    4 AC 2010 2578460
    5 AC 2011 2549497
    6 AC 2012 2634467

df2:

   UF   Ano  Valor
    1 AC  2007 446447
    2 AC  2008 417316
    3 AC  2009 418318
    4 AC  2010 485166
    5 AC  2011 455485
    6 AC  2012 362548

I would like to create another data frame from these, where you have the column UF, Year and Value, but the column Value subtracted df1-df2.

I mean, I’d like to keep it that way :

  UF   Ano  Valor
1 AC  2007 1869351
2 AC  2008 2008371
3 AC  2009 2092967
4 AC  2010 2033294
5 AC  2011 2094012
6 AC  2012 2271919

1 answer

2

On R base, one can do this with merge.

df3 <- merge(df1, df2, by = c('UF', 'Ano'))
df3[[3]] <- df3[[3]] - df3[[4]]
df3 <- df3[-4]
df3
#  UF  Ano Valor.x
#1 AC 2007 1869351
#2 AC 2008 2008371
#3 AC 2009 2092967
#4 AC 2010 2093294
#5 AC 2011 2094012
#6 AC 2012 2271919

With the package dplyr it may be so:

library(dplyr)

df1 %>%
  inner_join(df2, by = c('UF', 'Ano')) %>%
  mutate(Valor = Valor.x - Valor.y) %>%
  select(UF, Ano, Valor)

Dice.

df1 <- read.table(text = "
UF  Ano   Valor
1 AC 2007 2315798
2 AC 2008 2425687
3 AC 2009 2511285
4 AC 2010 2578460
5 AC 2011 2549497
6 AC 2012 2634467
", header = TRUE)

df2 <- read.table(text = "
UF   Ano  Valor
1 AC  2007 446447
2 AC  2008 417316
3 AC  2009 418318
4 AC  2010 485166
5 AC  2011 455485
6 AC  2012 362548
", header = TRUE)
  • Hello, using the merge, you are not merging, see: [1] UF Year Value. x Value. y <0 lines> (or Row.Names of length 0) >

  • @Ingledseiévini Is using the merge as it is in the answer? In particular, the argument by = c('UF', 'Ano') cannot include the column Valor. The above result was obtained with the data as they are in the question and works.

  • Yeah, I’m using the merge as it is, not including the Value column. The data is the same, I gave only one head and pasted here, the data frame is only a little bigger. when I open the new data frame after I’ve run the merge, 3 columns of UF, Year, Value. x and Value appear. y, and a message from "No data invailable in table".

Browser other questions tagged

You are not signed in. Login or sign up in order to post.