Problem with encoding in R

I was asked to do text analysis and I'm having trouble with character encoding. Does anyone know how to convert these strings with accented characters directly?

Example of how the file appears:

vocês dizerem que não!!! Até quando

Another example:

â¤ï¸(...) Comilanças é amigo secreto na casa clean!ðŸŽ

I tried using this function:

    stringi::stri_enc_detect(dados$text)

and I got this output:

    [[1]]$Encoding
    [1] "UTF-8"        "windows-1252" "windows-1250" "UTF-16BE"     "UTF-16LE"     "Shift_JIS"    "windows-1254"
    [8] "IBM420_ltr"  

    [[1]]$Language
    [1] ""   "pt" "cs" ""   ""   "ja" "tr" "ar"

    [[1]]$Confidence
    [1] 1.00 0.63 0.28 0.10 0.10 0.10 0.02 0.01

If anyone can help I’d be grateful!
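
One possible reading of the output above: `stri_enc_detect()` ranks UTF-8 first with confidence 1.00, while the garbled sample looks like UTF-8 bytes that were decoded as windows-1252 (the second guess). Under that assumption, text that has already been read incorrectly can sometimes be repaired by reversing the wrong decoding. The sketch below uses only base R; `fix_mojibake` is a hypothetical helper name, not a function from any package.

```r
# Hypothetical helper: undo "UTF-8 bytes read as CP1252" garbling.
fix_mojibake <- function(x) {
  # Step 1: turn the wrongly decoded characters back into the CP1252 bytes
  # they came from; elements that cannot be converted become NA.
  repaired <- iconv(x, from = "UTF-8", to = "CP1252")
  # Step 2: declare that those bytes are really UTF-8.
  Encoding(repaired) <- "UTF-8"
  # Keep the original wherever the round trip failed or did not yield
  # valid UTF-8 (for example, strings that were never garbled).
  ok <- !is.na(repaired) & validUTF8(repaired)
  repaired[!ok] <- x[!ok]
  repaired
}

# "n\u00c3\u00a3o" is how a garbled "n\u00e3o" looks after the wrong decoding.
fix_mojibake(c("n\u00c3\u00a3o", "abc"))
```

Applied to the data in the question, this would be `dados$text <- fix_mojibake(dados$text)`. Strings containing bytes that windows-1252 does not define (some emoji fall into this group) may not survive the round trip and are then left unchanged, so re-reading the original file with the right encoding is usually the more reliable fix.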

  • What is the result when you use Sys.getlocale()?

  • Hi, the result was this:

  • [1] "LC_COLLATE=Portuguese_Brazil.1252;LC_CTYPE=Portuguese_Brazil.1252;LC_MONETARY=Portuguese_Brazil.1252;LC_NUMERIC=C;LC_TIME=Portuguese_Brazil.1252"
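
The locale shown above uses code page 1252, so R will by default treat text it reads as windows-1252 unless told otherwise. If the file on disk is actually UTF-8 (an assumption, based on the detection result in the question), declaring the encoding at read time avoids the garbling in the first place. A minimal sketch with base R follows; the temporary file and its contents are stand-ins for the real data, which the question does not show.

```r
# Stand-in for the real file: write one line of UTF-8 text to a temporary file.
path <- tempfile(fileext = ".txt")
con <- file(path, open = "wb")
writeBin(charToRaw(enc2utf8("voc\u00eas dizerem que n\u00e3o!!!\n")), con)
close(con)

# encoding = "UTF-8" marks the strings as UTF-8 instead of assuming the
# locale's native encoding (CP1252 in the locale shown above).
text_lines <- readLines(path, encoding = "UTF-8")

# Check how R has marked the string that was read.
Encoding(text_lines)
```

`read.csv()` accepts the same `encoding = "UTF-8"` argument (it is passed through to `read.table()`), which may be the more direct route if `dados` comes from a CSV file.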
