Hello, I have a dataset with about 50,000 observations, like the following (illustrative values only):
nome<-c("joão","pedro", "joãoo")
identificador<-c(123456,124578,123456)
valor<-c(2145,350,23)
dados=data.frame(nome,identificador,valor)
I would like to identify individuals with the same identifier and create a new variable as follows:
nome=c("joão","pedro", "joãoo","maria","mariaa","carla","felipe","vitor","pedro","vitorr")
identificador=c(123456,124578,123456,000,000,123,156,2222,3232,2222)
valor=c(2145,350,23,32,12,32,1,2,54,4)
validor=c(1,0,1,2,2,0,0,3,0,3)
dados=data.frame(nome,identificador,valor,validor)
I tried the following to identify the equal identifiers, but I did not manage to create this variable.
x<-dados$identificador
length(x)
i=1
k=1
validor=0
validor[1:50000]=0
for(i in 1:50000){
  for(j in 1:50000){
    if(x[j]==x[i] & i!= j ){
      # k never changes, so every repeated identifier is marked 1
      validor[j]=k
    }
  }
}
I would like to create a function that produces the validor variable as shown. I hope I have been clear, and thank you very much for your help.
Where is dados? I think you should have assigned the data frame to the variable: dados = data.frame(... The = operator works the same as <-, only it is one character shorter, so I like to use it. What result do you expect?
– Not The Real Hemingway
Edited. I would like to create the variable "validor", identifying the pairs, or sets, of equal identifiers with an algorithm.
– Adryan Fernandes
Can you explain how this sequence is formed: validor = c(1, 0, 1, 2, 2, 0, 0, 3, 0, 3)? It seems to me that you have created a vector that numbers the identifiers in the order in which they repeat. E.g.: 123456 is the first to repeat, 000 is the second to repeat, and finally 2222 is the third to repeat, so they get 1, 2 and 3 respectively. The others do not repeat, so they get 0.
– Not The Real Hemingway
Exactly that.
– Adryan Fernandes
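The rule confirmed in the comments (repeated identifiers are numbered 1, 2, 3, ... and all others get 0) can be sketched without nested loops. This is only one possible approach, not something from the original post: the helper name numerar_repetidos is hypothetical, and it assumes that groups are numbered by the first appearance of each repeated identifier, which matches the example data.

```r
# Example data from the question (000 is read by R as 0).
dados <- data.frame(
  nome = c("joão", "pedro", "joãoo", "maria", "mariaa",
           "carla", "felipe", "vitor", "pedro", "vitorr"),
  identificador = c(123456, 124578, 123456, 0, 0, 123, 156, 2222, 3232, 2222),
  valor = c(2145, 350, 23, 32, 12, 32, 1, 2, 54, 4)
)

# Hypothetical helper: numbers each repeated identifier, 0 for unique ones.
numerar_repetidos <- function(id) {
  # TRUE for every element whose value occurs more than once.
  repetido <- duplicated(id) | duplicated(id, fromLast = TRUE)
  # Distinct repeated values, in order of first appearance (assumption).
  grupos <- unique(id[repetido])
  # Position of each id within 'grupos'; NA when the id is not repeated.
  validor <- match(id, grupos)
  validor[is.na(validor)] <- 0
  validor
}

dados$validor <- numerar_repetidos(dados$identificador)
print(dados)
```

Because duplicated() and match() are vectorised base R functions, this avoids the 50,000 x 50,000 comparisons of the double loop. If the groups should instead be numbered by the position of the first repetition, the line that builds grupos would need to change.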