I have the following df:
n_words  Words
220      [('trabalho', 17), ('monitor', 17), ('via', 16...
3114     [('atend', 863), ('ortopedico', 863), ('proced...
5        [('anomalos', 2), ('feixes', 1), ('eletrofisio...
3        [('hr', 1), ('sistema', 1), ('fenotipagem'...
I need the number of distinct words, that is, the length of each list of tuples.
I tried:
df['palvras_dif'] = ""
i = 0
for row in df['Words']:
    df['palvras_dif'][i] = len(df['Words'][i])
    i += 1
df
But it doesn't count correctly. Can someone help me?
Are you using pandas?
– Woss
Yes, I am!
– Gisele Santos
And what does the number in each tuple represent? Should it be considered too, or just the word?
– Woss
It is the frequency with which the word appeared in another df. Example: on line 3 I had the list ['anomalies', 'electrophysiotherapy', 'bundles', 'anomalies', 'electrophysiotherapy'] and I built the list of tuples with each word and its frequency. I need to know how many words are different, so I wanted the size of the list of tuples...
– Gisele Santos
But should it be considered or not? For example, if there is ('trabalho', 2) and ('trabalho', 14), should they be considered the same word or separate occurrences?
– Woss
The case in your example does not happen: I never have the same word twice in a list, precisely because the number is the word's frequency.
– Gisele Santos
Then wouldn't it be enough to add up the values in n_words?
– Woss
No, because n_words holds the total number of words, including the repeated ones. I need the number of distinct words. Take the example on line 3: ['anomalies', 'electrophysiotherapy', 'bundles', 'anomalies', 'electrophysiotherapy']. I have n_words = 5 and I need the number of different words, which would be 3.
– Gisele Santos
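One way to get what is described in the comments, shown here only as a minimal sketch: apply len to each cell of Words instead of looping with a manual index. The sample data below is hypothetical and only mirrors the shape described in the question, and the columns palavras_unicas and freq_total are names made up for illustration. It assumes each cell really holds a Python list of tuples. If the column was loaded from a CSV, the cells may be strings, in which case len would count characters and the cells would need to be parsed first.

```python
import pandas as pd

# Hypothetical sample data: each cell of 'Words' is a list of
# (word, frequency) tuples, as described in the question.
df = pd.DataFrame({
    "n_words": [5, 3],
    "Words": [
        [("anomalies", 2), ("electrophysiotherapy", 2), ("bundles", 1)],
        [("hr", 1), ("sistema", 1), ("fenotipagem", 1)],
    ],
})

# Length of each list of tuples. This equals the number of distinct words
# as long as no word appears in more than one tuple of the same list.
df["palvras_dif"] = df["Words"].apply(len)

# More defensive variant (illustrative column name): count unique words
# explicitly, so a repeated word would only be counted once.
df["palavras_unicas"] = df["Words"].apply(
    lambda pairs: len({word for word, _ in pairs})
)

# Sanity check (illustrative column name): the frequencies in each list
# should add up to the total in n_words.
df["freq_total"] = df["Words"].apply(
    lambda pairs: sum(freq for _, freq in pairs)
)

print(df[["n_words", "palvras_dif", "palavras_unicas", "freq_total"]])
```

Assigning the whole column at once also avoids the chained indexing `df['palvras_dif'][i] = ...` from the question, which pandas may apply to a temporary copy rather than to df itself.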