Removing a special character in R

4

In the database below, column 7 (titled Rodada) contains values such as "1ª RODADA". How can I remove the "ª" and keep only "1 RODADA"? The code for reading the database follows.

url <- "https://raw.githack.com/fulgenciomath/stackOverflow/master/futdata.csv"
library(data.table)
data <- fread(url,encoding = "Latin-1")   

2 answers

5

Use the base R functions sub/gsub:

data$Rodada <- gsub("ª", "", data$Rodada)

Or, using data.table syntax:

data[, Rodada := gsub("ª", "", Rodada)]

You can also remove the " RODADA" part and convert the column to a number. Just use as.integer(sub("ª RODADA$", "", Rodada))
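As a minimal sketch of the two options above, the following uses a small made-up vector in place of data$Rodada (the values are assumed for illustration, not read from the CSV). The character "ª" is written as "\u00aa" so the script does not depend on the encoding of the source file.

```r
# Toy stand-in for data$Rodada; these values are assumptions, not the real column
rodada <- c("1\u00aa RODADA", "2\u00aa RODADA", "10\u00aa RODADA")

# Option 1: drop only the ordinal indicator.
# fixed = TRUE matches the pattern as a literal string, not as a regex.
sem_ordinal <- gsub("\u00aa", "", rodada, fixed = TRUE)
print(sem_ordinal)

# Option 2: drop the whole "ª RODADA" suffix and convert to integer.
# The "$" anchors the match at the end of each string.
numero <- as.integer(sub("\u00aa RODADA$", "", rodada))
print(numero)
```

With the data.table from the question, the second option would be written inside data[, ...], for example assigning to a new column so that the original text is kept.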

4

You can use the stringr package and call the str_replace_all function

url <- "https://raw.githack.com/fulgenciomath/stackOverflow/master/futdata.csv"
library(data.table)
data <- fread(url,encoding = "Latin-1")   

library(stringr)

data$Rodada <- str_replace_all(data$Rodada, "ª", "")

Output:

 [1] "1 RODADA" "1 RODADA" "1 RODADA" "1 RODADA" "1 RODADA" "1 RODADA"
 [7] "1 RODADA" "1 RODADA" "1 RODADA" "2 RODADA"
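To confirm that no "ª" is left after the replacement, a quick check with grepl can help. This is only a sketch on a made-up vector (one value is deliberately left uncleaned); with the real data you would pass data$Rodada instead.

```r
# Toy vector: the second element still contains the ordinal indicator (assumed data)
x <- c("1 RODADA", "2\u00aa RODADA")

# TRUE for every element that still contains "ª"
restam <- grepl("\u00aa", x, fixed = TRUE)

# Positions of the values that still need cleaning
print(which(restam))
```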
