Consider the following data frame:
dados<-data.frame(idade=c(15,18,25,40,85,NA),
sexo=c("M","F",NA,"F","M","M"),
unidade.organica=c("EMEI CG","USP",NA,"UFSM","UFRGS","UPF"),
curso=c("TÉCNICO","SUPERIOR",NA,"SUPERIOR","SUPERIOR",NA),
ano.ingresso=c(2005,2011,NA,2014,1980,2015))
# displaying the data.frame that was created
dados
  idade sexo unidade.organica    curso ano.ingresso
1    15    M          EMEI CG  TÉCNICO         2005
2    18    F              USP SUPERIOR         2011
3    25 <NA>             <NA>     <NA>           NA
4    40    F             UFSM SUPERIOR         2014
5    85    M            UFRGS SUPERIOR         1980
6    NA    M              UPF     <NA>         2015
NOTE: This assumes that the missing values in your data are represented by NA.
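If the missing values in your file are not NA yet, they have to be converted first. The sketch below is only an illustration: the data frame bruto and the placeholder codes "" and -99 are assumptions, not something taken from the question.

```r
# Hypothetical raw data in which missing values are coded as "" and -99
# (both codes are assumed for illustration only)
bruto <- data.frame(idade = c(15, -99, 25),
                    sexo  = c("M", "", "F"),
                    stringsAsFactors = FALSE)

# Replace the placeholder codes with NA so that is.na() and na.omit() can see them
bruto$idade[bruto$idade == -99] <- NA
bruto$sexo[bruto$sexo == ""] <- NA

# Number of NA values per column
colSums(is.na(bruto))
```

When the data comes from a text file, the na.strings argument of read.csv() can do the same conversion at import time.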
## Filtering out the missing data (NA):
# Removing the NAs with the na.omit() function
dados.sem.NA<-na.omit(dados)
# Removing the NAs with the indexing function which()
# (if dados had no NA at all, this index would be empty and return zero rows):
dados.sem.NA<-dados[-unique(which(is.na(dados),arr.ind = TRUE)[,1]),]
Both approaches, which() and na.omit(), give the same result:
dados.sem.NA
  idade sexo unidade.organica    curso ano.ingresso
1    15    M          EMEI CG  TÉCNICO         2005
2    18    F              USP SUPERIOR         2011
4    40    F             UFSM SUPERIOR         2014
5    85    M            UFRGS SUPERIOR         1980
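A third way to get the same rows is base R's complete.cases(). This is a minimal sketch rather than part of the original approach; the names sem.na.1, sem.na.2 and parcial are made up for the example.

```r
# Same data frame as above, repeated so the example runs on its own
dados <- data.frame(idade = c(15, 18, 25, 40, 85, NA),
                    sexo = c("M", "F", NA, "F", "M", "M"),
                    unidade.organica = c("EMEI CG", "USP", NA, "UFSM", "UFRGS", "UPF"),
                    curso = c("TÉCNICO", "SUPERIOR", NA, "SUPERIOR", "SUPERIOR", NA),
                    ano.ingresso = c(2005, 2011, NA, 2014, 1980, 2015))

# na.omit() and complete.cases() keep exactly the same rows
sem.na.1 <- na.omit(dados)
sem.na.2 <- dados[complete.cases(dados), ]
rownames(sem.na.1)
rownames(sem.na.2)

# complete.cases() can also look at selected columns only:
# row 6 is kept here because its sexo and unidade.organica are present
parcial <- dados[complete.cases(dados[, c("sexo", "unidade.organica")]), ]
parcial
```

Checking only some columns is useful when an NA in a column you do not analyse should not discard the whole row.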
The age filter can be applied to either dados or dados.sem.NA; compare the two cases:
# Age filter on dados (which() drops the rows where idade is NA):
dados.por.idade<-dados[which(dados$idade>17 & dados$idade<70), ]
The result is:
dados.por.idade
  idade sexo unidade.organica    curso ano.ingresso
2    18    F              USP SUPERIOR         2011
3    25 <NA>             <NA>     <NA>           NA
4    40    F             UFSM SUPERIOR         2014
# Age filter on dados.sem.NA:
dados.por.idade<-dados.sem.NA[(dados.sem.NA$idade>17 & dados.sem.NA$idade<70), ]
The result is:
dados.por.idade
  idade sexo unidade.organica    curso ano.ingresso
2    18    F              USP SUPERIOR         2011
4    40    F             UFSM SUPERIOR         2014
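The same age filter can also be written with subset(), which treats an NA in the condition as FALSE, so rows with a missing idade are dropped without extra handling. This is a sketch under the same example data; the names adultos and adultos.completos are made up here.

```r
# Same data frame as above, repeated so the example runs on its own
dados <- data.frame(idade = c(15, 18, 25, 40, 85, NA),
                    sexo = c("M", "F", NA, "F", "M", "M"),
                    unidade.organica = c("EMEI CG", "USP", NA, "UFSM", "UFRGS", "UPF"),
                    curso = c("TÉCNICO", "SUPERIOR", NA, "SUPERIOR", "SUPERIOR", NA),
                    ano.ingresso = c(2005, 2011, NA, 2014, 1980, 2015))

# Ages strictly between 17 and 70; the row with idade = NA is left out
adultos <- subset(dados, idade > 17 & idade < 70)
adultos

# Same filter, but only on rows with no NA in any column
adultos.completos <- subset(na.omit(dados), idade > 17 & idade < 70)
adultos.completos
```

Here adultos keeps rows 2, 3 and 4 (row 3 still has NA in other columns), while adultos.completos keeps only rows 2 and 4.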
I hope I helped. Good luck!
Friend, I don't know R, but I know Python well! From what I have heard in lectures, R may not be very efficient in your case: with that many rows to process, it would take longer than in Python. I recommend the pandas library in Python for reading and building your dataset.
– Vinicius Mesel
Welcome to Sopt! Read How to Ask and MCVE to learn how to write a question that is easy to understand and therefore easier to answer. If you post the code you already have (data frame, columns, etc.), it will be much easier for the community to help you.
– carlosfigueira
@Viniciusmesel, R can easily handle a dataset with ~12k items; I've worked with data frames of over 1 million records without any trouble.
– carlosfigueira
Evandro, even though this is easy to do in R, you can also do it without difficulty in Excel using the filter tool. If your only reason for using R is to apply this filter, I don't think it is necessary.
– Molx
Evandro, a suggestion: even though you were clear overall, whenever possible provide a representative sample of the data; it makes the question easier to understand and makes life easier for whoever answers it. Also, if you only need to filter the data, I agree with Molx: Excel can handle it.
– Jean
Good afternoon. I don't know how to post an example here. It is for a school assignment, and one of the teacher's requirements is that it be done in R. Could you show me how to post an example on SO?
– Evandro Lopes