Getting your computer to see what you see when you read, collect, collate, check, and bundle documents for processing or filing: this is the next challenge of document process automation.
Recognizing form content is a big part of the next level of document process automation, but extracting text from documents and forms for indexing, validating signatures, removing or obscuring personal information, checking for completeness, and sorting into the right order are complex tasks that need a flexible approach.
- In legal discovery, for example, it may be a check for hand annotations on contracts, or specific signatures on letters.
- In healthcare, it might be detecting and redacting data for HIPAA compliance.
- In financial services it could be checking for completeness of new account forms and for signatures on compliance statements.
- In consumer goods it could be content extraction from inbound correspondence for automated indexing and prioritization.
- In pharmaceuticals it might be validating signatures against authorized professionals.
- For a utility or telecom company, it might be the ability to log comments collected in the field, shedding light on otherwise “dark data” by tagging it and making it available for big data analytics.
AIIM takes a close look at this in its study and related white paper, Shedding light on the dark data in your document capture processes, where AIIM finds that, among other trends, hand-written address, data and free-format fields play an important role in most business processes – and an increasing one as organizations seek to exploit the “big data” they may contain. AIIM surveyed 267 individual members of the AIIM community to reflect existing and emerging trends in data capture and process automation.
What did AIIM's look at today's document capture trends, practices and processes reveal?
- Records managers are increasingly coming round to the idea that computers can be as accurate as humans in classifying content for records purposes, and they would certainly agree that they are more consistent.
- Free-text search technology has been around for a while, relying on an approximate conversion of the text within a document to index and find it, and then displaying the image for confirmation. Indexing for keywords requires a more reliable conversion, but this is now becoming more sophisticated, using context aware analytics to more precisely tag content and update metadata. This is a key enabler for governance.
- Information governance extends control of a document across its whole lifecycle, both active and as a record. Security is key and captured data can be used to identify sensitivities in the document.
- Detection of Personally Identifiable Information (PII) such as social-security numbers or credit card numbers will affect the security rating, and these can be automatically redacted if required.
- Capture to financial process, particularly capturing inbound invoices into the accounts payable (AP) process, has been a very popular application for OCR, speeding up processing and potentially providing a “hands-off” matching and payment process. Validation of captured data against the transactional detail stored in the finance system is an important part of this process.
- Forms are fundamental to many business processes, particularly customer on-boarding, account opening and claims handling. Automated extraction of the contents of application forms is a huge productivity saver, although, of course, many of these forms are likely to be filled in by hand. Sorting and tallying forms and mandatory supporting documents into a case folder, and verifying completeness, can make a vital contribution to compliance.
- Organizations are coming to realize that no matter how sophisticated their analytics and big data capability is, if the content that needs to be analyzed resides on paper, that analysis is simply not going to be possible.
- The limit for many businesses in their use of capture will be reached too early if they confine their recognition to machine-text. Although PDF forms and web forms have much improved things, the majority of paper forms are filled in by hand – in fact in the business-to-business area, the passing of the typewriter has made this worse.
- The legacy requirement in many businesses for wet-ink signatures simply adds to the prevalence of paper-based, hand-written forms. In fact, 55 percent of responding organizations estimate that half or more of their forms have signature fields.
- Forms are generally littered with other hand-written fields for name and address, numerical and text data. For 29 percent of respondents, half or more of their forms contain free text or open-ended comment fields – which are often the most crucial as regards customer satisfaction, previous histories, extenuating circumstances, etc., and would represent key inputs to many analytics projects, according to AIIM.
- Driving routing and workflow is the most quoted benefit for users of hand-writing technology, automating that initial journey into the organization and directing the form or document to the correct process.
- Providing keywords or tags for search is the next biggest benefit – coming higher than the productivity savings from not having to key in things like name and address fields. Roughly 30 percent of the respondents can also see the benefit of recognizing keywords for research and big data analytics.
- Finally, information governance and the protection of sensitive or private information is seen as a benefit by a quarter of users, including the ability to detect and verify signatures.
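To make the PII point in the list above a little more concrete, here is a minimal Python sketch of how social-security and credit card numbers might be detected in already-captured text and automatically redacted. It is an illustration only, not how any particular capture product works. The patterns, the `redact_pii` and `luhn_ok` names, and the `[REDACTED]` mask are all assumptions made for this example, and the text is assumed to have already been extracted by OCR or hand-writing recognition.

```python
import re

# Assumed formats for illustration: US SSNs written as NNN-NN-NNNN, and
# card numbers of 13 to 16 digits, optionally separated by spaces or hyphens.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_pii(text: str, mask: str = "[REDACTED]"):
    """Mask SSN-like and card-like numbers; return (redacted_text, hit_count)."""
    hits = 0

    def mask_ssn(match):
        nonlocal hits
        hits += 1
        return mask

    def mask_card(match):
        nonlocal hits
        digits = re.sub(r"\D", "", match.group())
        # Only mask digit runs that pass the checksum, to reduce false
        # positives on reference or order numbers.
        if luhn_ok(digits):
            hits += 1
            return mask
        return match.group()

    text = SSN_RE.sub(mask_ssn, text)
    text = CARD_RE.sub(mask_card, text)
    return text, hits


if __name__ == "__main__":
    sample = "SSN 123-45-6789, card 4111 1111 1111 1111, ref 1234 5678 9012 3456"
    redacted, count = redact_pii(sample)
    print(redacted)  # SSN [REDACTED], card [REDACTED], ref 1234 5678 9012 3456
    print(count)     # 2
```

The hit count is the kind of signal that could feed a document's security rating, while the masked text is what would be passed on for indexing. A production system would need broader patterns, locale-specific identifiers and human review of borderline matches.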
Digital Transformation Age
At this time, there is a transformation impacting business decisions, strategies and growth - a digital transformation targeting the way enterprises manage documents. To remain competitive - and compliant - organizations must face today's digital realities and make a determined effort to integrate leading-edge digital initiatives across the enterprise to create a truly efficient and compliant organization.