Whenever possible, avoid using for loops in R. They tend to be computationally slow and make it easy to commit silly mistakes. For example, a for loop that starts like this
for(i in 199501:201703)
will iterate over 199501, 199502, ..., 199512, 199513, 199514 and so on, including values that are not valid months. Not a good idea.
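To make the problem concrete, here is a minimal sketch that counts how many values the integer range produces and, as one possible alternative, builds only the valid year-month codes from a sequence of dates. The variable names intervalo and meses are hypothetical, chosen only for this illustration.

```r
# The integer range contains every number between the two endpoints
intervalo <- 199501:201703
length(intervalo)   # 2203 values, most of which are not valid months

# One alternative: generate real months and format them as YYYYMM
meses <- as.integer(format(
  seq(as.Date("1995-01-01"), as.Date("2017-03-01"), by = "month"),
  "%Y%m"
))
length(meses)       # 267 valid months
head(meses, 3)      # 199501 199502 199503
```

Looping over unique(dados$data), as done further down, avoids the issue in a different way, since it only visits values that actually occur in the data.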
Another problem is trying to store something that has two dimensions (subset(dados,data==i)) in a position meant for a single value (dados[i]). This will not work. Ideally, save these results in a list. You were also trying to save the new objects inside the original object, which is a recipe for the loop to fail.
Assuming your dataset is called dados and it has a date column called data, one way to solve this problem using a for loop is the following:
dadosLista <- list()
for (i in unique(dados$data)){
dadosLista[[i]] <- subset(dados, data==i)
}
This has the minor inconvenience that the first 199500 positions of the list dadosLista will be NULL, and so will every position that does not correspond to a year and month present in the data, such as 199533. The advantage is that the command
dadosLista[[199803]]
will return the data for March 1998. You can remove the NULL entries by running
dadosLista <- Filter(Negate(is.null), dadosLista)
The drawback is that the indexes no longer correspond to the years and months. There is no free lunch.
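A possible middle ground, if you want to keep the for loop, is to index the list by name instead of by position. The sketch below assumes a small made-up data frame (the column valor and its values are hypothetical) and converts each date to a character string before using it as the index, so no NULL padding is created.

```r
# Toy data, invented for illustration only
dados <- data.frame(
  data  = c(199501, 199501, 199502, 199803),
  valor = c(10, 20, 30, 40)
)

dadosLista <- list()
for (i in unique(dados$data)) {
  # A character index creates a named element rather than position i
  dadosLista[[as.character(i)]] <- subset(dados, data == i)
}

length(dadosLista)       # 3, one element per distinct date
names(dadosLista)        # "199501" "199502" "199803"
dadosLista[["199803"]]   # the rows for March 1998
```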
However, there is a better solution. Assuming your dataset is called dados and it has a date column called data, do the following:
dadosLista <- split(dados, dados$data)
This will put your data in a list. It will be possible to access each of the separate datasets via commands similar to
dadosLista$`199501`
Thus, each position in the list is identified by a name identical to the desired year and month rather than by a number. This makes the code more organized and cleaner and, I believe, faster than using a for loop.
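As a small illustration of the split approach, the sketch below uses a made-up data frame (the column valor and its values are hypothetical). Because the element names start with a digit, they have to be quoted with backticks after $, or passed as strings to [[ ]].

```r
# Toy data, invented for illustration only
dados <- data.frame(
  data  = c(199501, 199501, 199502, 199803),
  valor = c(10, 20, 30, 40)
)

# One data frame per distinct value of the data column
dadosLista <- split(dados, dados$data)

names(dadosLista)            # "199501" "199502" "199803"
dadosLista[["199803"]]       # access by name using a string
nrow(dadosLista$`199501`)    # 2 rows; backticks are needed after $
```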
Thank you, Marcus. I ended up putting my data in a list. Much better, just as you said.
– T. Veiga