The base functions for reading tables are sufficient for most cases. However, they are relatively slow, and there are faster alternatives for when there are many files and/or the files are very large; these alternatives also have a few other small advantages.
The readr package was created precisely to improve on the standard functions, in the following respects:
Argument names are more consistent with each other (e.g. col_names and col_types instead of header and colClasses).
They are approximately 10x faster.
A progress bar is shown if reading takes longer than a few seconds.
Strings are not transformed into factors by default.
Column names are not transformed into "valid" R names, that is, columns keep exactly the original name (even if it starts with a number, contains spaces, etc.).
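The point about column names can be seen with a small sketch. The file contents and column names below are invented for illustration, and the readr part only runs if that package happens to be installed:

```r
# Hypothetical sample file; the column names are made up for this example.
tmp <- tempfile(fileext = ".csv")
writeLines(c("2020 total,nome", "10,ana", "20,bia"), tmp)

# Base R rewrites names that are not syntactically valid.
base_df <- read.csv(tmp)
print(names(base_df))

# Base R can keep the original names, but only if asked to.
base_raw <- read.csv(tmp, check.names = FALSE)
print(names(base_raw))

# readr keeps the original names by default (skipped if readr is missing).
if (requireNamespace("readr", quietly = TRUE)) {
  readr_df <- suppressMessages(readr::read_csv(tmp))
  print(names(readr_df))
}
```

In base R the first column comes back as X2020.total unless check.names = FALSE is passed, while readr should keep "2020 total" as written in the file.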
In this package, the functions have names similar to those in base R, with the dot replaced by an underscore (_). For example:
# base:
variavel <- read.table("dados.csv", header = TRUE, dec = ",", sep = ";")
# or, more briefly (read.csv2 already defaults to header = TRUE, sep = ";" and dec = ","):
variavel <- read.csv2("dados.csv")
# readr:
library(readr)
variavel <- read_csv2("dados.csv")
Similarly, there are the functions read_csv(), read_table(), read_delim(), read_tsv(), read_lines() and read_fwf().
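As a rough illustration of how these functions relate, the layout that read_csv2() assumes can be spelled out by hand with read_delim(). This is only a sketch: the sample data is invented, the column types are declared explicitly rather than guessed, and the block is skipped if readr is not installed.

```r
# Hypothetical sample file using ";" as separator and "," as decimal mark.
tmp <- tempfile(fileext = ".csv")
writeLines(c("nome;valor", "ana;1,5", "bia;2,25"), tmp)

if (requireNamespace("readr", quietly = TRUE)) {
  # Separator and decimal mark are given explicitly instead of relying
  # on the defaults of read_csv2().
  dados <- readr::read_delim(
    tmp,
    delim = ";",
    locale = readr::locale(decimal_mark = ","),
    col_types = readr::cols(
      nome = readr::col_character(),
      valor = readr::col_double()
    )
  )
  print(dados)
}
```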
Another alternative is the fread() function from the data.table package. fread() is even faster (about 2x) than the readr functions, and it tries to detect automatically the separator, whether there are column names, etc. fread() has arguments with the same names as those of the base functions, such as sep, header and stringsAsFactors. For this example, it would look like this:
library(data.table)
variavel <- fread("dados.csv", sep = ";", header = TRUE)
Depending on the data format, sep and header may be omitted, but when in doubt it is safer to specify them explicitly.
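A quick way to check whether the automatic detection agrees with the explicit arguments is to read the file both ways and compare the results. The sketch below uses an invented sample file and is skipped if data.table is not installed:

```r
# Hypothetical sample file with ";" as separator and a header row.
tmp <- tempfile(fileext = ".csv")
writeLines(c("nome;idade", "ana;31", "bia;27"), tmp)

if (requireNamespace("data.table", quietly = TRUE)) {
  # Let fread() guess the separator and the presence of a header.
  auto <- data.table::fread(tmp)
  # State both explicitly, as recommended when in doubt.
  explicito <- data.table::fread(tmp, sep = ";", header = TRUE)
  # Compare the two results.
  print(all.equal(auto, explicito))
}
```

For a simple file like this one the two calls should agree; the explicit form matters more for unusual files where the guess could go wrong.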
Finally, it is important to note that it only makes sense to use these functions if reading performance is a problem, or if the package is already loaded anyway (as may be the case with data.table). Otherwise, there is no need to load a package to do something that can be done just as well in base R.
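One way to judge whether reading performance is actually a problem is simply to time the base function on a file of realistic size. The following is a minimal sketch with generated data (the size n and the column names are arbitrary assumptions); on real data the file path would replace tmp.

```r
# Hypothetical data, generated only to have something to time.
n <- 50000
tmp <- tempfile(fileext = ".csv")
dados <- data.frame(id = seq_len(n), valor = seq_len(n) / 10)
write.csv2(dados, tmp, row.names = FALSE)

# Time the base reader; "elapsed" is the wall-clock time in seconds.
tempo_base <- system.time(lido <- read.csv2(tmp))
print(tempo_base["elapsed"])
```

If the elapsed time is negligible for the files in question, the base functions are probably enough.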
Buddy, it worked, thanks.
– LongBoarder