Most voted "tika" questions
Apache Tika is a toolkit that detects and extracts metadata and text from many kinds of documents, from PPT to CSV to PDF, using existing parser libraries. Tika unifies these parsers under a single interface, so you can easily get a parser for more than a thousand different file types. Tika is useful for search engine indexing, content analysis, translation, and more.
1 question
2 votes, 1 answer, 467 views
Apache Lucene with Tika not returning words with accent
I implemented the Apache Lucene and Apache Tika libraries and got them working very well for what I need. However, I have a problem with accented words: the search does not return results for words…