I'm having serious difficulty understanding this mechanism.
In English it would simply be:
import nltk
tag_word = nltk.word_tokenize(text)
where text
is the English text I want to "tokenize", which works very well; for Portuguese, however, I still cannot find any example.
I am leaving aside the earlier steps of stop_words
and sent_tokenizer
here, just to make it clear that my question is specifically about tokenization.
Have you read this article or seen this repository?
– Woss
Hello @Andersoncarloswoss, yes, I have read it, but I still can't understand the flow. I managed to use stop_words with nltk.corpus.stopwords.words('portuguese'), but I still could not tag the words; the examples I found on the internet are not very didactic.
– Mueladavc