R language - Column separation

I have the two columns below and I need to split each of them so that the digits go into separate columns. The goal is to end up with 4 columns: the team names and their scores.

The two columns of df2, both of type character:

Time1           Time2
Green Bay 35  Kansas City 10
Green Bay 33      Oakland 14

I need the digits in their own columns, but when I use the function below, it eats a digit and returns only "3" instead of "35", for example:

df3 <- df2 %>%
  separate(Time1, into = c("Time", "Score"), sep = "\\d",
           extra = "merge")

Does anyone know a way to fix this?

Thank you.

1 answer

Here are three ways to separate the words from the numbers in the two columns. The first uses base R's sub; the second and third use str_extract from the stringr package.

library(dplyr)
library(tidyr)
library(stringr)

df2 %>%
  mutate(
    Score1 = sub("^.* (\\b[^ ]+)$", "\\1", Time1),
    Time1 = sub(" [^ ]*$", "", Time1),
    Score2 = sub("^.* (\\b[^ ]+)$", "\\1", Time2),
    Time2 = sub(" [^ ]*$", "", Time2)
  )
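
To see what the two sub() patterns in this first approach do, here is a minimal sketch on a single string. The variable x is only an illustrative value taken from the sample data, not part of the original code.

```r
# Illustrative input: one value from the Time2 column.
x <- "Kansas City 10"

# Keep only the last space-delimited token (the score).
score <- sub("^.* (\\b[^ ]+)$", "\\1", x)

# Drop the last space and everything after it (leaves the team name).
team <- sub(" [^ ]*$", "", x)

score
team
```

Both calls return character strings, so the score would still need as.integer() if a numeric column is wanted.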

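df2 %>%
  mutate(
    # Scores are extracted first, while Time1/Time2 still hold the full strings.
    Score1 = str_extract(Time1, pattern = "[[:digit:]]+"),
    # str_trim() drops the space left between the team name and the score.
    Time1 = str_trim(str_extract(Time1, pattern = "[[:alpha:][:blank:]]*")),
    Score2 = str_extract(Time2, pattern = "[[:digit:]]+"),
    Time2 = str_trim(str_extract(Time2, pattern = "[[:alpha:][:blank:]]*"))
  )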

This third way negates the [:digit:] class to extract the words, so it perhaps has the clearest regexes.

df2 %>%
  mutate(
    Score1 = str_extract(Time1, pattern = "[[:digit:]]+"),
    # Everything before the first digit; str_trim() removes the trailing space.
    Time1 = str_trim(str_extract(Time1, pattern = "[^[:digit:]]*")),
    Score2 = str_extract(Time2, pattern = "[[:digit:]]+"),
    Time2 = str_trim(str_extract(Time2, pattern = "[^[:digit:]]*"))
  )
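
As for why the separate() call in the question loses a digit: whatever sep matches is treated as the separator and removed from the output, so with sep = "\\d" the first digit itself is thrown away. One possible alternative, not part of the three ways above, is tidyr::extract(), which takes capture groups instead of a separator. The sketch below assumes every value ends in a space followed by digits; the name team_score is just an illustrative helper, and the data frame is rebuilt here so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question, rebuilt so this block is self-contained.
df2 <- data.frame(
  Time1 = c("Green Bay 35", "Green Bay 33"),
  Time2 = c("Kansas City 10", "Oakland 14"),
  stringsAsFactors = FALSE
)

# Group 1 captures the team name, group 2 the trailing run of digits.
# The space between them is matched but not captured, so no digit is lost.
team_score <- "^(.*) ([0-9]+)$"

df3 <- df2 %>%
  tidyr::extract(Time1, into = c("Time1", "Score1"),
                 regex = team_score, convert = TRUE) %>%
  tidyr::extract(Time2, into = c("Time2", "Score2"),
                 regex = team_score, convert = TRUE)

df3
```

With convert = TRUE the score columns come back as integers rather than character, which may or may not be what is wanted here.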

Data

txt <- "Time1           Time2
'Green Bay 35'  'Kansas City 10'
'Green Bay 33'      'Oakland 14'"
df2 <- read.table(textConnection(txt), header = TRUE)
