I am trying to read a PDF using the iText library, but accented characters are dropped from the extracted text. I have already checked the project encoding, and it is set to UTF-8.
PdfReader reader = new PdfReader("arquivo.pdf");
String conteudo = PdfTextExtractor.getTextFromPage(reader, 1);
System.out.println(conteudo);
Example:
- Text in the PDF:
Exercícios
- Output:
Exerc cios
That's strange. I built an example here that also reads by filename, with the project and all its resources set to UTF-8, and the accents come out fine. Try passing an InputStream of your file instead of the filename and see whether that works. If it still fails, try forcing the InputStream to UTF-8.
– Bruno César
I don't know iText well, so this is just a theory, but did you import a standard font (one that supports accents)?
– Guilherme Nascimento
It worked, Bruno. I don't know why the PDF file I was testing with had this problem. I tested with another one and it worked perfectly! Thank you.
– Johnata Rodrigo
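Since the cause here turned out to be the PDF file rather than the project encoding, it can help to check what the extracted string really contains before blaming the console. The sketch below is only an illustration using the standard Java library: the class AccentCheck, the helper describeCodePoints, and the hard-coded sample string (standing in for the text returned by PdfTextExtractor) are hypothetical names and are not part of iText.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class AccentCheck {

    // Hypothetical helper: lists every code point of the text as U+XXXX,
    // separated by spaces, so missing or replaced characters become visible.
    static String describeCodePoints(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Stand-in for the extracted text; \u00ED is the accented "í".
        String conteudo = "Exerc\u00EDcios";

        // Print through an explicitly UTF-8 stream to rule out the
        // platform default encoding as the source of the problem.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(conteudo);
        out.println(describeCodePoints(conteudo));
    }
}
```

If the dump shows U+0020 (a space) where the í should be, the character was probably already lost during extraction, which points at the PDF or its fonts. If it shows U+00ED but the console still prints it wrong, the console encoding is the more likely culprit.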