It’s no secret that companies across industries need to digitise and automate their document management processes. Our goal is therefore to extract nearly 100% of invoice data automatically!
Our new solution: Fellow²KV
The reality is that companies receive huge volumes of invoices every day, month and year, yet not even 80% is extracted properly. Having document management software is already a big step towards saving costs, time and resources. But, as with everything in life, there is always room for improvement.
Fellow Consulting AG has developed a new solution called Fellow²KV. This machine learning plugin trains the Ephesoft Transact software to identify the correct data while you digitise your documents. It performs three functions at once:
- Analysis of documents via TF-IDF.
- Synchronisation of extraction rules.
- Learning new extraction rules from one-time human assistance.
How does it work?
Fellow²KV improves the accuracy of the Ephesoft Engine by feeding it extraction rules from our cloud-based extraction repository. Every time invoices are validated and then exported, the new extraction rules are saved back to that same repository.
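The validate, export and save loop can be pictured with a small sketch. Everything in it is an assumption made for illustration: the `RuleRepository` class, its method names, the layout identifier and the regular-expression rules are hypothetical stand-ins, not the Fellow²KV or Ephesoft API.

```python
from dataclasses import dataclass, field


@dataclass
class RuleRepository:
    """Hypothetical in-memory stand-in for the cloud-based repository."""
    rules: dict = field(default_factory=dict)

    def on_export(self, layout_id, validated_rules):
        # Merge the rules confirmed during validation into the stored set.
        self.rules.setdefault(layout_id, {}).update(validated_rules)

    def rules_for(self, layout_id):
        # A layout that was never validated has no stored rules yet.
        return dict(self.rules.get(layout_id, {}))


repo = RuleRepository()
# Invented example rules: field name -> pattern that locates the value.
repo.on_export("layout-a", {"invoice_number": r"Invoice No\.\s*(\d+)"})
repo.on_export("layout-a", {"total": r"Total\s*([\d.,]+)"})
print(repo.rules_for("layout-a"))
```

Each export only adds to or overwrites rules for the layout it belongs to, so the stored set grows with every validated batch.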
To apply those rules to specific invoice layouts, we have to match each invoice to certain terms, or more specifically to an identifier. This is done with TF-IDF.
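TF-IDF weighs the frequency of a term within a document (TF) against its inverse document frequency (IDF), so a term scores highly when it is frequent in one document but rare across the others. Each word or term that occurs in the text has its own TF and IDF score. Plotted by these scores, each term becomes a point.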
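To make the scoring concrete, here is a minimal sketch of one common TF-IDF variant (term count divided by document length, multiplied by the natural log of total documents over documents containing the term). The exact weighting used inside Fellow²KV is not stated here, so treat the formula, the `tf_idf` function name and the sample invoice texts as assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Invented invoice snippets, for illustration only.
invoices = [
    "invoice number total amount acme gmbh",
    "invoice number total amount globex ag",
    "invoice date net amount acme gmbh",
]
scores = tf_idf(invoices)
for term, score in sorted(scores[1].items(), key=lambda item: -item[1]):
    print(f"{term:10s} {score:.3f}")
```

Terms that appear on every invoice, such as "invoice", score zero, while supplier-specific terms score highest. That is what makes them usable as identifiers.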
Every point represents a different term identifier, but what we really want is to group the points that lie close to each other into a cluster. Each cluster simply represents an invoice layout, which can be shared by different suppliers. This grouping is done with K-Means clustering.
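The grouping step can be sketched with a bare-bones K-Means (Lloyd's algorithm). This is an illustration under stated assumptions, not the product's implementation: the 2-D coordinates are invented, and seeding the centroids with the first `k` points is a simplification chosen to keep the result reproducible.

```python
import math


def k_means(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    # Simplification: seed the centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return centroids, labels


# Invented points standing in for scored terms.
points = [(0.1, 0.2), (0.9, 0.8), (0.15, 0.25),
          (0.85, 0.9), (0.2, 0.1), (0.8, 0.85)]
centroids, labels = k_means(points, k=2)
print(labels)
```

Points that end up with the same label belong to the same cluster, which in the terms used above corresponds to one invoice layout.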
Now imagine repeating this process a million times and filling the Fellow²KV repository with more and more data. Nearly 100% accurate extraction is certainly feasible, don’t you think? That’s why you can forget about training the Ephesoft Engine for 4 to 6 months: with Fellow²KV you can start from day one!