Multiple Gather with 4 resulting "joint" columns

3

Hi,

I have the following situation:

I have a data frame with several columns, and I want to transform one group of them into key-value pairs. So far so good, but there are two groups of columns (one group of 3 and another group of 3), and the rows shouldn't simply be repeated for every combination of the two.

NOME     A1   A2   A3   B1   B2   B3
batata    6    4    7    2    1    1
maçã      9    4    8    1    2    0

I ran gather twice, once for A1, A2, A3 and once for B1, B2, B3. The problem is that the result has 2*3*3 = 18 rows, because every row produced from the A columns is repeated once for each B column.

NOME    keyA    valueA    keyB    valueB
batata   A1       6        B1       2
batata   A1       6        B2       1
batata   A1       6        B3       1
maçã     A1       9        B1       1
maçã     A1       9        B2       2
maçã     A1       9        B3       0

... (same pattern for A2 and A3)
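
The jump to 18 rows can be reproduced with a short sketch. It assumes the tidyr package is installed and uses the sample data above; the names step1 and step2 are only for illustration.

```r
library(tidyr)

# Sample data from the table above
x <- data.frame(
  NOME = c("batata", "maçã"),
  A1 = c(6, 9), A2 = c(4, 4), A3 = c(7, 8),
  B1 = c(2, 1), B2 = c(1, 2), B3 = c(1, 0)
)

# First gather: 2 rows x 3 A columns = 6 rows
step1 <- gather(x, keyA, valueA, A1, A2, A3)

# Second gather: each of those 6 rows is repeated for the 3 B columns = 18 rows
step2 <- gather(step1, keyB, valueB, B1, B2, B3)

nrow(step1)  # 6
nrow(step2)  # 18
```

So the second gather multiplies the rows instead of lining B1 up with A1, B2 with A2, and so on.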

What I need is for A1 to be paired only with key B1, A2 only with B2, and so on:

NOME    keyA    valueA    keyB    valueB
batata   A1       6        B1       2
batata   A2       4        B2       1
batata   A3       7        B3       1
maçã     A1       9        B1       1
maçã     A2       4        B2       2
maçã     A3       8        B3       0

Only 6 rows.

Can anyone help me? xd

2 answers

2


Here is a solution:

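library(tidyverse)

# Sample data from the question
x <- data.frame(
  NOME = c("batata", "maçã"),
  A1 = c(6, 9),
  A2 = c(4, 4),
  A3 = c(7, 8),
  B1 = c(2, 1),
  B2 = c(1, 2),
  B3 = c(1, 0)
)

x %>%
  # Stack the A columns: 2 rows become 6
  gather(keyA, valueA, starts_with("A")) %>%
  # Stack the B columns: 6 rows become 18
  gather(keyB, valueB, starts_with("B")) %>%
  # Keep only rows whose numeric suffixes match (A1 with B1, A2 with B2, ...)
  filter(parse_number(keyA) == parse_number(keyB))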

The function parse_number extracts the numeric part of each string (for example, 1 from "A1"), so you can use it to compare the keyA and keyB columns and keep only the rows you need.
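
To make the comparison concrete, here is a small sketch of what parse_number returns for keys like these. It assumes the readr package (loaded by tidyverse); the vectors keyA, keyB and keep are made up for illustration and are not the full gathered data.

```r
library(readr)

# Hypothetical key vectors, shaped like the columns produced by gather
keyA <- c("A1", "A1", "A2", "A3")
keyB <- c("B1", "B2", "B2", "B1")

parse_number(keyA)  # 1 1 2 3
parse_number(keyB)  # 1 2 2 1

# TRUE only where the numeric suffixes match; filter() keeps these rows
keep <- parse_number(keyA) == parse_number(keyB)
keep  # TRUE FALSE TRUE FALSE
```

Because the comparison is done on numbers, it should also work if the suffixes have more than one digit.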

  • Daniel, I had done exactly that. I was just opening this page to post the same answer you gave, haha. Thank you very much anyway!

  • Haha, nice! I read your mind :P

0

Taking the result of your gather call as a starting point, I compared the columns to get the new data.frame. I build a logical vector by comparing the digits contained in the keyA and keyB columns, and then use that vector to select the rows of the data.frame.

# First rows of the gather output from the question (A1 only)
produtos <- data.frame(nome = c('batata', 'batata', 'batata',
                                'maçã', 'maçã', 'maçã'),
                       keyA = c('A1', 'A1', 'A1', 'A1', 'A1', 'A1'),
                       valueA = c(6, 6, 6, 9, 9, 9),
                       keyB = c('B1', 'B2', 'B3', 'B1', 'B2', 'B3'),
                       valueB = c(2, 1, 1, 1, 2, 0))

library(stringr)
# Keep the rows where the digit in keyA equals the digit in keyB
produtos[str_extract(produtos$keyA, '\\d') == str_extract(produtos$keyB, '\\d'), ]
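
One caveat, shown as a small sketch: the pattern '\\d' matches a single digit, so it would probably give false matches if the column names ever went past 9. The names A10 and B1 below are hypothetical and do not appear in the question; using '\\d+' is one possible way around it.

```r
library(stringr)

# Hypothetical keys with a two-digit suffix (not in the original data)
keyA <- "A10"
keyB <- "B1"

# Single-digit pattern: both sides give "1", so the rows would wrongly match
one_digit <- str_extract(keyA, "\\d") == str_extract(keyB, "\\d")
one_digit  # TRUE

# One-or-more-digits pattern: "10" vs "1", no match
all_digits <- str_extract(keyA, "\\d+") == str_extract(keyB, "\\d+")
all_digits  # FALSE
```

For the A1..A3 / B1..B3 columns in the question the two patterns behave the same.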

  • Thinking about it yesterday, I came up with a possible solution using filter. I'll test it here and post it afterwards. If that doesn't work, I'll try the one you gave me. Thanks!
