Hello, how are you?
I need to import a large database that is in .txt format and split into 20 parts of approximately 5 GB each.
That database contains three data.frames, each with a different number of columns.
The columns are fixed-width (defined by position, not by a delimiter).
I’m trying to use data.table() and read_fwf(), but I’m having trouble separating the columns because of the three different layouts.
The "RECORD TYPE" column identifies the data.frames.
The database is this one (the layout is there too):
Does anybody have any tips? Thanks in advance!
ps¹: I tried to use devtools::install_github("georgevbsantiago/qsacnpj") with
qsacnpj::gerar_bd_cnpj(path_arquivos_txt = "C:/Users/Downloads/",
                       localizar_cnpj = "NAO",
                       n_lines = 10000,
                       armazenar = "csv")
But it takes too long! When I tried n_lines = 100000, the computer froze on the sixth file.
ps²: my computer does not have much memory.
Are you sure it’s not a limitation of the available RAM?
– Tomás Barcellos
About the slowness, I think that’s exactly it! I would like tips for working around this problem. That’s why I tried data.table(), but I can’t get it to work.
– RxT
@Tomásbarcellos when I say that there are three data.frames in that database, I mean they are "mixed" together in the same files. Hence the difficulty in separating them.
– RxT
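One possible way around both the mixed layouts and the limited RAM is to first split the files by record type, reading them in chunks, and only then parse each type with its own column widths. The following is a minimal base R sketch, not a tested solution for this specific database. It assumes the record type is the first character of every line; the function names `split_by_record_type()` and `parse_fixed_width()`, the output file names, and the widths in the demo are all hypothetical and would have to be replaced by the values from the real layout.

```r
# Split a mixed fixed-width file into one file per record type.
# Only `chunk_size` lines are held in memory at any time.
# ASSUMPTION: the record type is the first character of each line.
split_by_record_type <- function(path, out_dir, chunk_size = 100000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  repeat {
    lines <- readLines(con, n = chunk_size, warn = FALSE)
    if (length(lines) == 0L) break
    types <- substr(lines, 1L, 1L)
    for (type in unique(types)) {
      # Hypothetical naming scheme: type_1.txt, type_2.txt, ...
      out_path <- file.path(out_dir, paste0("type_", type, ".txt"))
      # Open in append mode so every chunk (and every input file)
      # accumulates into the same per-type file.
      out <- file(out_path, open = "a")
      writeLines(lines[types == type], out)
      close(out)
    }
  }
  invisible(NULL)
}

# Cut lines that all share one layout into columns by position.
parse_fixed_width <- function(lines, widths, col_names) {
  ends <- cumsum(widths)
  starts <- ends - widths + 1L
  cols <- lapply(seq_along(widths), function(i) {
    trimws(substr(lines, starts[i], ends[i]))
  })
  names(cols) <- col_names
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Tiny demo with made-up data: three record types mixed in one file.
demo_dir <- tempfile("split_demo_")
dir.create(demo_dir)
demo_file <- file.path(demo_dir, "mixed.txt")
writeLines(c("1AAA  0010", "2BB 7", "1CCC  0020", "6Z", "2DD 9"), demo_file)

out_dir <- file.path(demo_dir, "out")
dir.create(out_dir)
split_by_record_type(demo_file, out_dir, chunk_size = 2L)

# Hypothetical layout for record type "1": widths 1, 5 and 4.
type1 <- parse_fixed_width(
  readLines(file.path(out_dir, "type_1.txt")),
  widths = c(1L, 5L, 4L),
  col_names = c("record_type", "name", "code")
)
print(type1)
```

Once the split is done, each per-type file has a single layout, so it could also be read with readr::read_fwf() and fwf_widths() instead of the small helper above. Calling `split_by_record_type()` once per input part with the same `out_dir` would accumulate all 20 parts into the same per-type files. If the files are not in the session's native encoding, the connection may need an explicit `encoding` argument.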