Wrong characters in KWIC output with quanteda

Hello, I am using R with quanteda to extract keywords in context (KWIC) from a corpus of texts from the agency Lupa. I am having problems with character encoding.

I import the corpus as follows:

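library(readtext)
library(quanteda)
corpus.fake.df <- readtext("../dados/analise/*.txt",
                      docvarsfrom = "filenames",
                      encoding = "UTF-8")
fake.corpus <- corpus(corpus.fake.df)
# Tokenize the corpus; this is the object passed to kwic() below
fake.tokens <- tokens(fake.corpus)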

After that, I run kwic() on a term:

k <- kwic(fake.tokens, "gomes", window = 5, case_insensitive = TRUE, encTo = "UTF-8", valuetype = "regex")

When I ask to view the result, a browser window opens instead of the RStudio viewer, and the accented characters are all displayed incorrectly:

[Screenshot: window showing the error]

Thank you very much!

  • The problem is in corpus.fake.df, not in the kwic function. See https://answall.com/questions/6805/howto avoid-problems-de-encoding-quando-pegadados-com-twitter

  • Thanks for the answer! I tried it, but I got the same problem.

  • Have you tried encTo='windows1252' or encTo="latin1"?

  • Yes, unfortunately it didn’t work

1 answer

I later managed to resolve the issue by installing an R package called DT. For the problem I reported above, using it together with quanteda in RStudio generates a very friendly visualization, with correct accents, that can be exported to HTML for later editing.
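
The exact call is not shown in this answer, so the following is only a minimal sketch of how DT might be combined with a kwic result. The sample text and the names toks, k.df, widget and the output file kwic.html are assumptions made for illustration; in the question's setting, k would come from fake.tokens instead.

```r
library(quanteda)
library(DT)

# Hypothetical sample text standing in for the corpus from the question
txt <- c(doc1 = "O candidato Ciro Gomes afirmou que a informação é falsa.")
toks <- tokens(txt)

# Same kind of search as in the question, without the encTo argument
k <- kwic(toks, pattern = "gomes", window = 5,
          valuetype = "regex", case_insensitive = TRUE)

# Convert the kwic object to a plain data frame so DT can display it
k.df <- as.data.frame(k)

# Build an interactive HTML table; in RStudio, printing it opens the Viewer pane
widget <- datatable(k.df)

# Export the table to HTML for later editing
# (selfcontained = FALSE avoids the pandoc dependency and writes a kwic_files folder)
htmlwidgets::saveWidget(widget, "kwic.html", selfcontained = FALSE)
```

Typing widget at the console displays the table. Whether the accents render correctly may still depend on the files having been read as UTF-8 in the first place.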

Thanks!
