In the first part of this article, I explained how to use Lucene to query a document (Word, PDF, etc.) and find matches for specific keywords. We needed this in order to automatically identify a document's category based on its content.
We've chosen a simple approach to demonstrate the automatic classification extension: if a document contains the name of a category, then...