I have the following df
n_words  Words
220 [('trabalho', 17), ('monitor', 17), ('via', 16...
3114 [('atend', 863), ('ortopedico', 863), ('proced...
5 [('anomalos', 2), ('feixes', 1), ('eletrofisio...
3 [('hr', 1), ('sistema', 1), ('fenotipagem'...
I need the number of distinct words, that is, the length of each list of tuples.
I tried:
df['palvras_dif'] = ""
i = 0
for row in df['Words']:
    df['palvras_dif'][i] = len(df['Words'][i])
    i += 1
df
But it doesn't count correctly. Can someone help me?
Are you using Pandas?
– Woss
Yes, I am!
– Gisele Santos
And what does the number in each tuple represent? Should it be taken into account too, or just the word?
– Woss
It is the frequency with which the word appeared in another df. Example: on line 3 I had the list ['anomalies', 'electrophysiotherapy', 'bundles', 'anomalies', 'electrophysiotherapy'] and I built the list of tuples with each word and its frequency. I need to know how many words are distinct, so I wanted the length of the list of tuples...
– Gisele Santos
But should it be taken into account or not? For example, if there are ('trabalho', 2) and ('trabalho', 14), should they be treated as the same word or as separate occurrences?
– Woss
In the example you gave, I never have the same word twice, precisely because the number is the word's frequency.
– Gisele Santos
Then wouldn't it be enough to add up the values in n_words?
– Woss
No, because n_words holds the total number of words, including the repeated ones. I need the number of distinct words. As in the example on line 3: ['anomalies', 'electrophysiotherapy', 'bundles', 'anomalies', 'electrophysiotherapy'] gives n_words = 5, and I need the number of distinct words, which would be 3.
– Gisele Santos
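A minimal sketch of one way to get that count, assuming each cell of Words holds a list of (word, frequency) tuples in which every word appears only once, as described above. The sample data and the helper name count_distinct are hypothetical. The string check covers one possible cause of wrong counts, namely cells stored as text (for instance after reading the df back from a CSV), in which case len would count characters rather than tuples; that cause is a guess, not something confirmed in the thread.

```python
import ast

import pandas as pd

# Hypothetical sample data mirroring the layout in the question.
# The second cell is deliberately a string, as it would be after a CSV round trip.
df = pd.DataFrame({
    "n_words": [5, 2],
    "Words": [
        [("anomalies", 2), ("electrophysiotherapy", 2), ("bundles", 1)],
        "[('hr', 1), ('sistema', 1)]",
    ],
})


def count_distinct(cell):
    """Return how many (word, frequency) tuples a cell contains."""
    # Assumption: a cell may be the text form of a list; parse it first.
    if isinstance(cell, str):
        cell = ast.literal_eval(cell)
    return len(cell)


# One tuple per distinct word, so the list length is the distinct-word count.
df["palvras_dif"] = df["Words"].apply(count_distinct)
print(df[["n_words", "palvras_dif"]])
```

Using apply on the whole column avoids the manual counter and the chained assignment df['palvras_dif'][i] = ..., which pandas does not guarantee will write back to the original DataFrame.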