Problems with the merge function

1

I'm having trouble with the merge function: out of nowhere it stopped merging my data frames.

Instead of behaving like cbind, it seems to behave like rbind and, worse, to repeat rows as if with rep(each = 3): two 40-row data.frames result in a 134-row data.frame. (I know merge probably doesn't use these functions; I mention them only to illustrate the behavior.)

I already renamed the columns of both data.frames, keeping the same name only for the column that should "identify" the data, but it made no difference.

My goal is to combine the two breakdowns of a factorial design so that I can present the significant interaction, bringing together the results of the means comparison test.

I use the code like this: tab.final<-merge(tab1, tab2, by=c("means","means"))

tab1<-structure(list(Talhão = c("Abacate", "Abacate", "Abacate", "Abacate", 
"Abacate", "Banana", "Banana", "Banana", "Banana", "Banana", 
"Cacau", "Cacau", "Cacau", "Cacau", "Cacau", "Gliricidia", "Gliricidia", 
"Gliricidia", "Gliricidia", "Gliricidia", "Inga", "Inga", "Inga", 
"Inga", "Inga", "Manga", "Manga", "Manga", "Manga", "Manga", 
"Pupunha", "Pupunha", "Pupunha", "Pupunha", "Pupunha", "Seringueira", 
"Seringueira", "Seringueira", "Seringueira", "Seringueira"), 
    means = c(3.375, 2.875, 3.875, 3.125, 2.875, 3.125, 3.25, 
    2.875, 3.75, 3.625, 3.375, 4, 4, 2.125, 2.125, 3.625, 3.625, 
    3.625, 3.375, 2.125, 3.125, 2, 2, 2.125, 2, 3.875, 3.25, 
    2.875, 2.5, 2, 3.375, 3.375, 2.25, 2.375, 2.625, 2.125, 3.5, 
    3, 4, 3.75), let1 = c("A", "B", "A", "BC", "BC", "A", "AB", 
    "BC", "AB", "AB", "A", "A", "A", "D", "CD", "A", "AB", "AB", 
    "AB", "CD", "A", "C", "D", "D", "D", "A", "AB", "BC", "CD", 
    "D", "A", "AB", "CD", "CD", "CD", "B", "AB", "BC", "A", "A"
    ), Doses = c("0.001", "0.25", "0.5", "0.75", "1", "0.001", 
    "0.25", "0.5", "0.75", "1", "0.001", "0.25", "0.5", "0.75", 
    "1", "0.001", "0.25", "0.5", "0.75", "1", "0.001", "0.25", 
    "0.5", "0.75", "1", "0.001", "0.25", "0.5", "0.75", "1", 
    "0.001", "0.25", "0.5", "0.75", "1", "0.001", "0.25", "0.5", 
    "0.75", "1")), .Names = c("Talhão", "means", "let1", "Doses"
), row.names = c(NA, -40L), class = "data.frame")



 tab2<-structure(list(Doses = c("0.001", "0.001", "0.001", "0.001", 
"0.001", "0.001", "0.001", "0.001", "0.25", "0.25", "0.25", "0.25", 
"0.25", "0.25", "0.25", "0.25", "0.5", "0.5", "0.5", "0.5", "0.5", 
"0.5", "0.5", "0.5", "0.75", "0.75", "0.75", "0.75", "0.75", 
"0.75", "0.75", "0.75", "1", "1", "1", "1", "1", "1", "1", "1"
), means = c(3.375, 3.125, 3.375, 3.625, 3.125, 3.875, 3.375, 
2.125, 2.875, 3.25, 4, 3.625, 2, 3.25, 3.375, 3.5, 3.875, 2.875, 
4, 3.625, 2, 2.875, 2.25, 3, 3.125, 3.75, 2.125, 3.375, 2.125, 
2.5, 2.375, 4, 2.875, 3.625, 2.125, 2.125, 2, 2, 2.625, 3.75), 
    let2 = structure(c(2L, 2L, 1L, 1L, 1L, 1L, 1L, 5L, 3L, 2L, 
    1L, 1L, 3L, 2L, 1L, 2L, 1L, 3L, 1L, 1L, 3L, 3L, 3L, 3L, 2L, 
    1L, 3L, 1L, 3L, 4L, 3L, 1L, 3L, 2L, 3L, 3L, 3L, 5L, 2L, 2L
    ), .Label = c("a", "ab", "b", "bc", "c"), class = "factor"), 
    Talhão = c("Abacate", "Banana", "Cacau", "Gliricidia", "Inga", 
    "Manga", "Pupunha", "Seringueira", "Abacate", "Banana", "Cacau", 
    "Gliricidia", "Inga", "Manga", "Pupunha", "Seringueira", 
    "Abacate", "Banana", "Cacau", "Gliricidia", "Inga", "Manga", 
    "Pupunha", "Seringueira", "Abacate", "Banana", "Cacau", "Gliricidia", 
    "Inga", "Manga", "Pupunha", "Seringueira", "Abacate", "Banana", 
    "Cacau", "Gliricidia", "Inga", "Manga", "Pupunha", "Seringueira"
    )), .Names = c("Doses", "means", "let2", "Talhão"), row.names = c(NA, 
-40L), class = "data.frame")
  • The problem is that your merge column has no unique values, so the merge returns every combination of the rows that share a value.

  • I just realized that, haha. Do you know of any option that avoids the multiple combinations in this case?

  • It worked for me the first time: tab.final <- merge(tab1, tab2, by = "means"); tab.final <- tab.final[!duplicated(tab.final), ]. But there will clearly be many combinations of rows; look, for example, at the case means = 2.00. After removing duplicate rows you get 102 rows.

  • Rui, understood, but it should have only 40 rows. What is happening is that I am merging on the means column, and its repeated values are not really duplicates: they belong to different doses and/or plots, even though the means are equal, as in the case of means = 2.00. Another thing I noticed but did not report: for some reason, when I use the HSD.test function from the agricolae package, it puts some spaces in the trt column, which ends up in my Doses column. I believe this is another obstacle keeping the code from running.
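
To make the row multiplication described in these comments concrete, here is a minimal sketch with toy data. The data frames `a` and `b` and their `id_a`/`id_b` columns are made up for illustration and are not taken from the question. When a key value is repeated in both tables, merge returns every pairing of the matching rows.

```r
# Toy data (hypothetical): the key `means` is repeated in both tables.
a <- data.frame(means = c(2, 2, 3), id_a = c("a1", "a2", "a3"))
b <- data.frame(means = c(2, 2, 2, 3), id_b = c("b1", "b2", "b3", "b4"))

# Merge on the non-unique key.
m <- merge(a, b, by = "means")

# Predict the row count: for each key value present in both tables,
# (number of rows in a) * (number of rows in b), summed over the keys.
counts_a <- table(a$means)
counts_b <- table(b$means)
shared   <- intersect(names(counts_a), names(counts_b))
expected <- sum(as.vector(counts_a[shared]) * as.vector(counts_b[shared]))

# means == 2 contributes 2 * 3 rows and means == 3 contributes 1 * 1 row.
nrow(m) == expected
```

The same arithmetic applied to the means column of tab1 and tab2 explains why the merged result has far more than 40 rows.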

2 answers

1

I managed to solve it by combining rbind and cbind with eval(parse(text = paste(...))):

eval(parse(text=paste(c('tabela<-rbind(',rep("",(nv1*nv2)-1)),'cbind(tab1[tab1$Talhão=="',rep(lf1,each=nv2),'" & tab1$Doses=="',lf2,'",],tab2[tab2$Doses=="',lf2,'" & tab2$Talhão=="',rep(lf1,each=nv2),'",])',c(rep(",",(nv2*nv1)-1),")"),sep="")))

where: lf1<-levels(factor(tab1$Talhão)); nv1<-length(lf1); lf2<-levels(factor(tab2$Doses)); nv2<-length(lf2)

But I'd like a simpler command, in the style of merge(tab1, tab2, by=c("means","means"))
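
One possible merge-style call joins on the columns that identify each row (plot and dose) rather than on the means. The sketch below is only an illustration: `t1` and `t2` are small toy data frames with made-up values that reuse the question's column names, and it assumes each Talhão/Doses pair occurs exactly once per table. It also trims stray spaces from the key first, since spaces in the treatment labels were mentioned in the comments.

```r
# Toy data (hypothetical values) using the question's column names.
t1 <- data.frame(c("Abacate", "Abacate", "Banana", "Banana"),
                 c("0.001", "0.25 ", "0.001", "0.25"),  # note the stray space
                 c(2, 3, 2, 2),
                 c("A", "B", "A", "A"),
                 stringsAsFactors = FALSE)
names(t1) <- c("Talhão", "Doses", "means", "let1")

t2 <- data.frame(c("0.001", "0.001", "0.25", "0.25"),
                 c("Abacate", "Banana", "Abacate", "Banana"),
                 c(2, 2, 3, 2),
                 c("a", "a", "a", "b"),
                 stringsAsFactors = FALSE)
names(t2) <- c("Doses", "Talhão", "means", "let2")

# The columns that uniquely identify a row in each table.
key <- c("Talhão", "Doses")

# Remove leading/trailing spaces so that "0.25 " matches "0.25".
t1$Doses <- trimws(t1$Doses)
t2$Doses <- trimws(t2$Doses)

# Keep only the key and the letters from t2, so `means` is not duplicated
# as means.x / means.y in the result.
out <- merge(t1, t2[, c(key, "let2")], by = key)

nrow(out)  # one row per Talhão/Doses pair
```

On the question's data, the equivalent call would presumably be merge(tab1, tab2[, c("Talhão", "Doses", "let2")], by = c("Talhão", "Doses")), which should return 40 rows as long as every plot and dose combination is unique in each table.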

  • With the data in the question, this gave <0 rows> (or 0-length row.names).

  • I have fixed the data.frames; copy tab1 and tab2 again if you like. Sorry for the inconvenience: I had to replace the code in order to paste it here and did not notice that columns were missing from both data frames.

0

Wouldn't bind_rows from dplyr give you what you want?

tab.final <- bind_rows(tab1, tab2)

In that case you would have a single 80-row data.frame.

If not, could you shed more light on exactly what result you are after?
