Read content from a PDF in JavaScript


-1

I have a PDF that I load via a file input, and I need to get the contents of the file in JavaScript in the browser, without using Node.js on the server side. I was able to get the contents as Base64, but the result is not readable.

  • Please add the part of the code that you use to read the contents of the file.

1 answer

-1


Base64 is an encoding. You must use the method atob() to convert from Base64.
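As a minimal sketch of that decoding step (the helper name `base64ToBytes` and the sample string are illustrative assumptions, not code from the question), `atob()` returns a "binary string" with one character per byte, which can then be copied into a `Uint8Array`:

```javascript
// Hypothetical helper: decode a Base64 string into raw bytes.
function base64ToBytes(base64) {
  // atob() yields a string in which each character holds one byte (0-255).
  const binary = atob(base64);
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}

// Assumed sample input: "JVBERi0xLjQ=" is the Base64 form of "%PDF-1.4",
// the kind of header a PDF file starts with.
const bytes = base64ToBytes("JVBERi0xLjQ=");
const header = String.fromCharCode(...bytes.slice(0, 5));
console.log(header); // prints "%PDF-"
```

Seeing `%PDF-` at the start of the decoded data is a quick way to confirm that the Base64 string holds the raw PDF file rather than its readable text.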

If you want to dig deeper into the subject, there is plenty of documentation in English, such as https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding


PS: As stated in this SO answer (https://stackoverflow.com/a/247261/1256062), you still should not expect to get readable text this way, because I imagine that the "Base64" you have is the binary data of the file and not its textual content.

Therefore, you would need an external library, such as pdf.js. I do not know the library, and I believe that even with it extraction will not always work, because the PDF has to store its content "as text". In many cases it does not, and you would then need an OCR tool to interpret the PDF "image" as text.
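A rough sketch of what text extraction with pdf.js could look like follows. It assumes the library is already loaded in the page and is passed in as `pdfjsLib`, and that `bytes` is a `Uint8Array` holding the PDF file; the function name `extractPdfText` is hypothetical. Pages without a text layer (scanned images, for example) will simply produce empty strings.

```javascript
// Hypothetical helper: collect the text of every page of a PDF using pdf.js.
// `pdfjsLib` is the loaded pdf.js library object; `bytes` is the PDF as a Uint8Array.
async function extractPdfText(pdfjsLib, bytes) {
  // getDocument() returns a loading task; its `promise` resolves to the document.
  const pdf = await pdfjsLib.getDocument({ data: bytes }).promise;
  const pages = [];
  // pdf.js numbers pages starting at 1.
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const content = await page.getTextContent();
    // Each text item exposes its string in `str`; join the items of a page with spaces.
    pages.push(content.items.map((item) => item.str).join(" "));
  }
  // One line of output per page.
  return pages.join("\n");
}
```

In the browser, the bytes for a file chosen through an input element could be obtained with something like `new Uint8Array(await file.arrayBuffer())`, and depending on the pdf.js build you may also need to configure its worker script before calling `getDocument()`.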

  • I did come across this lib, but the problem is that it is in Node and I wanted something client-side. I think that will not be possible.

  • I think it does work in the browser: https://mozilla.github.io/pdf.js/examples/index.html#Interactive-examples

  • I saw that you asked the question again and even used PDF.js. At least it got solved in the end.
