How can I assign a variable to a table?

This table has only one column! I need to extract the age (Age) variable to compute the mean, variance, ..., but I can’t, because all the values are in the same column.

"Name","Sex","Age","Height","Weight","Team","NOC","Games","Season","City","Sport","Event","Medal"
1,"A Dijiang","M",24,180,80,"China","CHN","1992 Summer",1992,"Summer","Barcelona","Basketball","Basketball Men’s Basketball",NA
2,"A Lamusi","M",23,170,60,"China","CHN","2012 Summer","Summer","London","Judo","Judo Men’s Extra-Lightweight",NA
"Gunnar Nielsen Aaby","M",24,NA,"Denmark","DEN","1920 Summer",1920,"Summer","Antwerpen",""Football","Football Men’s Football",NA
4,"Edgar Lindenau Aabye","M",34,NA,NA,"Denmark/Sweden","DEN","1900 Summer","Summer","Paris","Tug-Of-War","Tug-Of-War Men’s Tug-Of-War","Gold"
5,"Christine Jacoba Aaftink","F",21,185,82,""Netherlands","NED","1988 Winter",1988,"Winter","Calgary","Speed Skating","Speed Skating Women’s 500 metres",NA
5,"Christine Jacoba Aaftink","F",21,185,82,""Netherlands","NED","1988 Winter",1988,"Winter","Calgary","Speed Skating","Speed Skating Women’s 1,000 metres",NA
5,"Christine Jacoba Aaftink","F",25,185,82,""Netherlands","NED","1992 Winter",1992,"Winter","Albertville","Speed Skating","Speed Skating Women’s 500 metres",NA
5,"Christine Jacoba Aaftink","F",25,185,82,""Netherlands","NED","1992 Winter",1992,"Winter","Albertville","Speed Skating","Speed Skating Women’s 1,000 metres",NA
"Christine Jacoba Aaftink","F",27,185,82,"Netherlands","NED","1994 Winter",1994,"Winter","Lillehammer","Speed Skating","Speed Skating Women’s 500 metres",NA
"Christine Jacoba Aaftink","F",27,185,82,"Netherlands","NED","1994 Winter",1994,"Winter","Lillehammer","Speed Skating","Speed Skating Women’s 1,000 metres",NA
6,"Per Knut Aaland","M",31,188,75,""United States","USA","1992 Winter",1992,"Winter","Albertville","Cross Country Skiing","Cross Country Skiing Men’s 10 kilometres",NA
6,"Per Knut Aaland","M",31,188,75,""United States","USA","1992 Winter",1992,"Winter","Albertville","Cross Country Skiing","Cross Country Skiing Men’s 50 kilometres",NA
6,"Per Knut Aaland","M",31,188,75,""United States","USA","1992 Winter",1992,"Winter","Albertville","Cross Country Skiing","Cross Country Skiing Men’s 10/15 kilometres Pursuit",NA
6,"Per Knut Aaland","M",31,188,75,""United States","USA","1992 Winter",1992,"Winter","Albertville","Cross Country Skiing","Cross Country Skiing Men’s 4 x 10 kilometres Relay",NA
"Per Knut Aaland","M",33,188,75,"United States","USA","1994 Winter",1994,"Winter","Lillehammer","Cross Country Skiing","Cross Country Skiing Men’s 10 kilometres",NA
"Per Knut Aaland","M",33,188,75,"United States","USA","1994 Winter",1994,"Winter","Lillehammer","Cross Country Skiing","Cross Country Skiing Men’s 30 kilometres",NA
"Per Knut Aaland","M",33,188,75,"United States","USA","1994 Winter",1994,"Winter","Lillehammer","Cross Country Skiing","Cross Country Skiing Men’s 10/15 kilometres Pursuit",NA
6,"Per Knut Aaland","M",33,188,75,""United States","USA","1994 Winter",1994,"Winter","Lillehammer","Cross Country Skiing","Cross Country Skiing Men’s 4 x 10 kilometres Relay",NA
"John Aalberg","M",31,183,72,"United States","USA","1992 Winter",1992,"Winter","Albertville","Cross Country Skiing","Cross Country Skiing Men’s 10 kilometres",NA
"John Aalberg","M",31,183,72,"United States","USA","1992 Winter",1992,"Winter","Albertville","Cross Country Skiing","Cross Country Skiing Men’s 50 kilometres",NA
"John Aalberg","M",31,183,72,"United States","USA","1992 Winter",1992,"Winter","Albertville","Cross Country Skiing","Cross Country Skiing Men’s 10/15 kilometres Pursuit",NA
"John Aalberg","M",31,183,72,"United States","USA","1992 Winter",1992,"Winter","Albertville","Cross Country Skiing","Cross Country Skiing Men’s 4 x 10 kilometres Relay",NA

1 answer

Your data is in comma-separated (CSV) format, but the values are wrapped in quotation marks. I believe a plain read.csv will not work (other suggestions are welcome).
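As one of the other suggestions invited above: read.csv() removes double quotes around values by default, so it may be enough when each record sits on its own line and every row has the same number of fields. The sketch below rests on that assumption. The inline sample is a hypothetical, cleaned-up excerpt, and the "ID" column name is an assumption (the header in the question does not name the first column).

```r
# Hypothetical cleaned-up excerpt: one record per line, consistent columns.
# The "ID" header is an assumption, not part of the original header.
txt <- '"ID","Name","Sex","Age","Height","Weight"
1,"A Dijiang","M",24,180,80
2,"A Lamusi","M",23,170,60
4,"Edgar Lindenau Aabye","M",34,NA,NA'

# read.csv() treats double quotes as field delimiters by default
# and reads an unquoted NA as a missing value.
athletes <- read.csv(text = txt, stringsAsFactors = FALSE)

str(athletes)       # Age, Height and Weight come back as numeric columns
mean(athletes$Age)  # the Age column is now directly usable
```

With a file on disk, the same call would take the file name instead of text =. If the real file has rows with differing field counts, as the pasted sample does, this is likely to fail or misalign columns, and the data would need cleaning first.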

One solution is to copy the data into a text file and read it with read.table. First, read the file using a space as the separator. That way, each element holds one entire row of the data frame.

# read the file (the warning is due to the document's lack of structure)
dados <- read.table('data.txt', sep = ' ')

# split into header and data
cab <- dados[1]
dados <- dados[-1]

Now let’s split each record at the commas with the strsplit() function.

# split each record at the commas, one data frame row per record
df <- data.frame(matrix(NA, nrow = 1, ncol = 15))
for (i in seq_along(dados)) {
  df[i, ] <- unlist(strsplit(as.character(dados[, i]), ','))
}
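One caveat with splitting at every comma: some event names in the sample contain a comma themselves ("1,000 metres"), which yields one field too many. A possible workaround, sketched here under the assumption that the record still carries its quotation marks (for example, if it was read with readLines()), is a quote-aware split with scan(). The record below is a shortened, hypothetical example.

```r
# Shortened, hypothetical record with the quotation marks still in place.
line <- '5,"Christine Jacoba Aaftink","F",21,"Speed Skating Women\'s 1,000 metres",NA'

# Naive split: the comma inside "1,000 metres" creates an extra piece.
naive <- strsplit(line, ",")[[1]]

# Quote-aware split: commas inside double quotes are kept in the field.
aware <- scan(text = line, what = character(), sep = ",",
              quote = "\"", quiet = TRUE)

length(naive)  # one more piece than there are fields
length(aware)  # one element per field
```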

# add the header
names(df) <- unlist(strsplit(as.character(cab[, 1]), ','))

# convert the numeric variables
for (i in c(4, 5, 6, 10)) {
  df[, i] <- as.numeric(df[, i])
}
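Once Age is numeric, the statistics asked about in the question can be computed directly. This is a minimal sketch on a hypothetical stand-in for df. The ages are taken from the sample, and the NA is added only to show why na.rm = TRUE is useful.

```r
# Hypothetical stand-in for the df built above, with Age already numeric.
df <- data.frame(
  Name = c("A Dijiang", "A Lamusi", "Edgar Lindenau Aabye", "Unknown"),
  Age  = c(24, 23, 34, NA)  # the NA is an illustrative assumption
)

age <- df$Age  # the Age column as a plain numeric vector

# na.rm = TRUE drops missing values; without it the results would be NA.
age_mean <- mean(age, na.rm = TRUE)
age_var  <- var(age, na.rm = TRUE)
age_sd   <- sd(age, na.rm = TRUE)

summary(age)  # quick overview: quartiles, mean, and the NA count
```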
