Here are some tips to solve your problem:
1 - Read the file using another function:
> library(microbenchmark)
> microbenchmark(
+   base = read.csv(file = "df-write-csv.csv", header = T),
+   readr = readr::read_csv("df-write-csv.csv"),
+   data.table = data.table::fread("df-write-csv.csv"),
+   rio = rio::import("df-write-csv.csv", format = "csv")
+ )
Unit: microseconds
       expr      min        lq      mean    median        uq      max neval
       base 1836.230 1912.1815 2253.6071 1980.3995 2282.1675 4148.787   100
      readr  823.960  881.3625 1072.4790  921.6605 1120.2365 3538.359   100
 data.table  327.759  364.4810  442.5933  402.3295  458.7895  920.436   100
        rio  312.317  351.2260  444.1087  382.9325  439.7960 2938.490   100
Note that reading the file with data.table's fread or with rio's import is about 4x faster than with the native R function.
Make sure the file really was read.
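For example, a minimal sketch of that check (the object name dat is just an illustration; the file name is the one from the benchmark above):
> dat <- data.table::fread("df-write-csv.csv")  # fast read
> dim(dat)  # number of rows and columns actually loaded
> str(dat)  # column types, to confirm the data really came in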
2 - Check that you have actually been able to filter your database. Save the subset result to an auxiliary object.
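For example, a minimal sketch (hypothetical object and column names) of keeping the filtered result and checking it:
> dat_sub <- subset(dat, l == "a")  # store the filtered rows in an auxiliary object
> nrow(dat_sub)  # confirm the subset actually returned rows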
If that is where the problem lies, try filtering with functions from other packages, such as dplyr or data.table.
For heavy operations, data.table can be much faster:
> df <- data.frame(x = 1:100000, y = 1:100000, l = sample(letters, size = 100, replace = T))
> microbenchmark(
+   base = subset(df, l == "a"),
+   dplyr = dplyr::filter(df, l == "a"),
+   data.table = data.table::data.table(df)[l == "a", ]
+ )
Unit: milliseconds
       expr       min        lq      mean    median        uq      max neval
       base 10.329514 12.467143 14.962479 13.976907 17.171858  24.3761   100
      dplyr  7.331626  8.624356 10.063947  8.853807 11.140871  16.8939   100
 data.table  2.986519  4.580536  6.774548  4.824227  5.957255 119.9709   100
3 - Use the write_csv function from the readr package; it is roughly 2x faster than R's native write.csv.
microbenchmark(
  base = write.csv(df, file = "df-write-csv.csv", row.names = F),
  rio = rio::export(df, file = "df-rio.csv", format = "csv"),
  readr = readr::write_csv(df, path = "df-readr.csv")
)
Unit: microseconds
  expr     min       lq     mean    median       uq      max neval
  base 713.564 1097.534 2025.377 1467.4980 2996.136 4168.352   100
   rio 718.141 1156.998 2243.143 2011.5310 3106.479 7368.046   100
 readr 366.306  594.629 1265.297  734.0445 1793.405 5852.142   100
Anyway, if you were able to read the 5GB file, it is very likely that you can also write it, since it is already in the RAM of your computer.
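If you want to confirm how much RAM the object is actually occupying, a minimal sketch (df stands for whatever object holds your data):
> format(object.size(df), units = "MB")  # approximate size of the object in memory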
Did it not work because it never finished running? That is probably because the file is large. Have you run this analysis on data of this size before? How long did it run for? How much RAM does the computer have?
– Molx
André, add the details of the computer you are using to the question. It is also worth checking these answers to see if they help you: http://answall.com/questions/30631/strategy-parsingbases-databases-very big
– Carlos Cinelli
Carlos, my machine's configuration is as follows: 2.9 GB of memory, Intel CPU 585 at 2.16 GHz, 32-bit, running Ubuntu 14.04 LTS. Your link may help a lot; for now it has given me several ideas that I will see if I can put into practice.
– André Oliveira
Do you have a link to the csv? If possible, post the link here so we can test the solutions. Your question is very similar to this one: http://answall.com/questions/35469/pre-processr-grande-arquivo-de-texto-txt-substr-por The ff package probably solves your problem.
– Carlos Cinelli