Organize data in Excel to open as table in R?

Asked

Viewed 3,977 times

11

I have a file with 50 observations and 3 variables in Excel file (.xlsx), and turned to .csv. I used banco1<- read.csv("teste1.csv") to create the table in R, however my 3 variables appear in the same column. How do I make them appear separately?

  • 2

    Hello Adriana, post some lines of your csv file, you will probably need to set a different tab (and maybe also the character that involves Strings). Anyway, did you know that there are packages that allow you to read xlsx files directly? Take a look at the package xlsx cran.

  • Thank you Anthony, I managed to install the xlsx package, very good tip.

  • Adriana, if any of the answers answered your question, you can accept it by clicking on the "ok" symbol on the left.

  • This question also deals with: http://answall.com/q/19125/6036

3 answers

5

From the result you have observed, I will assume you are using a Portuguese version of Windows Excel. Usually, when we save a Windows Excel spreadsheet in CSV, it is saved with ; as a separator. This happens mainly in Brazilian (some European) versions of Excel, because our decimal separator is the comma, and not the point, as it is in the United States.

In the R this is already, in a way, anticipated by the function read.csv2 (see here), that has the configuration of headers, decimals and separators that Windows Excel uses as default. See in the function setting, making ?read.csv

read.csv(file, header = TRUE, sep = ",", quote = "\"",
         dec = ".", fill = TRUE, comment.char = "", ...)

read.csv2(file, header = TRUE, sep = ";", quote = "\"",
          dec = ",", fill = TRUE, comment.char = "", ...)

In addition to this possible solution, as Anthony said, there are packages that make it easy to read files directly from Excel, such as packages xlsx and gdata. Behold here and here.

  • Thank you Julio!

2

Check the tabs in the file . csv, opening it in text mode. In function read.csv, you must set the Sep parameter. For a table with the "&" character, as separator: you should use

banco1 <- read.csv("teste.csv", sep="&")

Also use some of these options for reading xlsx that work well is the package "Xlconnect" (install.packages("XLConnect") ), which is better:

library(XLConnect)
banco1 <- readWorksheet(loadWorkbook("teste.xlsx"),sheet=1)

Or by using the "xlsx" (install.packages("xlsx")):

library(xlsx)
banco1 <- read.xlsx("teste.xlsx", 1 , stringsAsFactors=F)
  • 1

    Thanks for the tips Guilherme!

  • Your problem was solved with this, @Adriana? I suspected that the problem was like the one Guilherme diagnosed. Mark the question as answered.

0

Another alternative is to create in Excel a delimited file of type "Text (separated by tabulations)". Excel generates a file with extension . txt which, by own experience, is imported more easily by R Studio.

To put it another way, you will stop using a "comma separated values", and use a "tab separaeted value". You can also use a file. csv, but you’ll need to spend time setting the parameters of read.csv.

To later export an object called "mydata" to a tab-separated text file called mydata.txt, use the command:

write.table(mydata, "mydata.txt", sep="\t")

Browser other questions tagged

You are not signed in. Login or sign up in order to post.