PDF reading - PHP - Content and ID

I’m looking for a way to read PDFs automatically. I receive hundreds of invoices and want to automate extracting their data. What I have tried:

Programs that convert the PDF to plain text, which is not reliable because some values come out garbled.

Programs that extract by coordinates, but the x,y positions are not stable: a field usually occupies one line, and when it occasionally takes two, the layout shifts and the coordinates no longer match.
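To illustrate what a position-independent approach could look like: instead of fixed coordinates, a value can be located by the label printed next to it. This is only a minimal sketch, assuming the text has already been extracted from the PDF by some other tool; the function name `extractInvoiceFields`, the labels ("Invoice No", "Total") and the sample strings are all hypothetical and would need to be adapted to the real invoices.

```php
<?php
declare(strict_types=1);

/**
 * Pull labelled fields out of text that was already extracted from a PDF.
 * The labels and patterns below are hypothetical examples, not taken
 * from any real invoice layout.
 */
function extractInvoiceFields(string $text): array
{
    $patterns = [
        // "Invoice No: A-1001" -> captures "A-1001"
        'invoice_no' => '/Invoice\s*No\.?\s*:?\s*(\S+)/i',
        // "Total: 1.234,56" -> captures "1.234,56" (\b avoids "Subtotal")
        'total'      => '/\bTotal\s*:?\s*([\d.,]+)/i',
    ];

    $fields = [];
    foreach ($patterns as $name => $pattern) {
        // Store the first captured group, or null when the label is missing.
        $fields[$name] = preg_match($pattern, $text, $m) === 1 ? $m[1] : null;
    }

    return $fields;
}

// Sample text where the description takes one line...
$oneLine = "Invoice No: A-1001\nDescription: Widgets\nTotal: 1.234,56\n";
// ...and where it wraps onto a second line, pushing "Total" further down.
$twoLines = "Invoice No: A-1002\nDescription: Widgets,\nextra packaging\nTotal: 99,00\n";

print_r(extractInvoiceFields($oneLine));
print_r(extractInvoiceFields($twoLines));
```

Because the match is anchored on the label and not on a line number or coordinate, the extra description line in the second sample does not affect the result. It still depends on the text extraction keeping each label and its value together, which is exactly the part that has been unreliable.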

I’m trying to find some stable pattern, maybe something like an ID. After reading http://webcheatsheet.com/php/reading_clean_text_from_pdf.php I wanted to see whether I could get at the PDF’s dictionary; perhaps the amount I want sits in the same dictionary on every invoice. Does anyone know of a library that would let me extract both the dictionary and the text? I believe the most complete library is Pdfparser (pdfparser.org), which supports more encodings, but beyond text the most it seems to offer is metadata extraction.

No answers
