Big Data has been a big buzzword around IT for the past couple of years. “Big Documents” could be the next big trend in the capture and IIM market. After all, aren’t documents and data simply different manifestations of the information piece of the IT equation?
In fact, several emerging trends are coming together to create a ripe environment for implementing Big Document solutions, or, as I like to think of it, capturing everything and letting the technology sort it out.
Here’s a look at those trends, followed by a summary of how they are enabling Big Document applications:
- Less-expensive, longer-lasting hardware
- Multi-channel capture capabilities
- Improved user interfaces (UIs)
- Better connectivity
- Increased intelligence in recognition technologies
- Desire for better governance
- Application of Big Data principles
Document scanners continue to become better, faster and cheaper. Basically, the performance a user paid $20,000 for 15 years ago is now attainable for around $1,000. Today’s higher-volume scanners are also designed to last longer and are easier to operate and maintain than their predecessors. In addition, vast improvements have been made in the scanning capabilities of MFPs, so hardware should no longer be a barrier for anyone wishing to capture documents.
In addition to paper documents, capture is increasingly used to onboard electronic documents like emails, attachments, fax images and even payments into document workflows. Mobile capture is another emerging trend. When you combine these multi-channel capture capabilities with the easy document onboarding offered by the friendly UIs of popular file sync and share systems, it is becoming easier than ever to get a document into a repository.
This is where new, advanced capture comes in. Today, there are multiple proven technologies for identifying and capturing data from many types of forms. These include advanced key-from-image as well as automated recognition, both of which can now be supplemented by broadband Internet connections that make it easier to distribute human keying and verification. There is also emerging technology in areas like natural language processing, semantic understanding and even artificial intelligence. These technologies are able to automatically classify any type of document and extract data from it, including unstructured narrative letters.
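To make "classify and extract" a little more concrete, here is a minimal Python sketch of the idea. It is deliberately simplistic: it uses keyword matching and regular expressions as a stand-in for the natural language processing and semantic technologies mentioned above, not an implementation of them. The class names, keyword lists, field patterns and function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical keyword lists per document class (illustrative only).
CLASS_KEYWORDS = {
    "invoice": {"invoice", "amount", "due", "remit"},
    "complaint": {"complaint", "dissatisfied", "refund", "unacceptable"},
}

# Hypothetical field patterns; a production system would more likely
# rely on trained recognition models than on hand-written rules.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"invoice\s*(?:no\.?|number|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "amount": re.compile(r"\$\s?(\d[\d,]*(?:\.\d{2})?)"),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
}


def classify(text):
    """Return the class whose keywords overlap most with the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {label: len(words & kws) for label, kws in CLASS_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # With no keyword hits at all, admit that the class is unknown.
    return best if scores[best] > 0 else "unknown"


def extract_fields(text):
    """Return the first match for each field pattern found in the text."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1)
    return fields


letter = (
    "I am writing to file a complaint. I was charged $1,250.00 on "
    "2014-03-02 and I am dissatisfied. Please issue a refund."
)
invoice = "Invoice No: INV-1042. Amount due: $310.50 by 2014-04-15."

letter_class, letter_fields = classify(letter), extract_fields(letter)
invoice_class, invoice_fields = classify(invoice), extract_fields(invoice)
```

Even in this toy version, the same two steps apply to a structured invoice and to a free-form letter: first decide what kind of document it is, then pull out whatever fields can be found. The difference with the emerging technologies is that they aim to do this without someone writing the rules by hand.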
So, why would anyone want to know the details of every document? Reasons like protection from liability, data mining to improve the business (similar to what you see with Big Data applications), and better-functioning IIM come to mind. No, not every document is going to need these Big Document technologies applied to it, but it’s important to know that the tools are becoming available, especially for organizations for which the information in their documents is as important as the information in their databases.