Magazine Article | July 1, 1997

A New Code For Documents

Imaging integrator develops off-the-shelf solution for bar coding documents, then pushes for standardization to take sales to the next level.

Business Solutions, July 1997

As an integrator, Michael Coar of Innovative Workflow Engineering (IWE) found himself applying the same solution over and over again. His customers were looking for the most efficient way to file scanned document images to an indexed database. Bar coding the documents, Coar found, was often the best solution. Coar also found that integrators who wrote applications for bar coding documents did so on a proprietary basis. No off-the-shelf software could generate bar codes for documents and then use the information contained in those bar codes to index images of the documents into a file system for organized retrieval.

After writing a number of similar proprietary applications himself, Coar decided there had to be a better way. "It was crazy. Sure, the money I was being paid to write these applications was good, but I had to ask myself, 'How many times do I want to re-design the wheel?'"

A Quickly-Applied Solution
Coar decided to develop Barcode-In, his own off-the-shelf software, which IWE now resells and integrates into imaging systems to facilitate the filing and indexing of documents through the use of bar codes. "Barcode-In saves me from having to develop a new application each time I install an imaging system which uses bar-coded documents," says Coar. "It also impresses customers because I can have a working application installed within 20 minutes. An integrator developing a similar proprietary application can show a customer what he did for a similar situation, but will probably need a week and some advance money to come up with a working application for the new customer. I know because I've done it."

How Imaging Bar Code Software Works
Barcode-In wraps around a user's scanning software and integrates with the user's document management software. When documents are prepared for scanning, the data that will be used to index the document images is manually extracted from the customer's data system and assigned a bar code. For example, if a user has a health-care records system where documents are indexed by patient number/date of visit/type of form (billing, medical record, etc.), this information is encoded in a bar code, which is then assigned to the document. Bar codes are generated through Barcode-In and can either be printed and applied to the document or printed on a separator sheet.
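
The health-care example above can be sketched in a few lines of Python. This is only an illustration of packing index fields into a bar code payload and recovering them later. It is not Barcode-In's actual format, and the class name, field names, date layout and slash separator are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date

# Assumed delimiter, mirroring the slash-separated index described in the text.
FIELD_SEPARATOR = "/"


@dataclass(frozen=True)
class DocumentIndex:
    """Hypothetical index record for one health-care document."""
    patient_number: str
    visit_date: date
    form_type: str  # e.g. "BILLING" or "MEDICAL-RECORD"

    def to_payload(self) -> str:
        """Build the string that would be encoded in the bar code."""
        parts = [
            self.patient_number,
            self.visit_date.strftime("%Y%m%d"),
            self.form_type,
        ]
        for part in parts:
            if FIELD_SEPARATOR in part:
                raise ValueError(f"field {part!r} contains the separator")
        return FIELD_SEPARATOR.join(parts)

    @classmethod
    def from_payload(cls, payload: str) -> "DocumentIndex":
        """Recover the index fields from a decoded bar code string."""
        patient_number, raw_date, form_type = payload.split(FIELD_SEPARATOR)
        visit = date(int(raw_date[:4]), int(raw_date[4:6]), int(raw_date[6:8]))
        return cls(patient_number, visit, form_type)


index = DocumentIndex("104417", date(1997, 7, 1), "BILLING")
payload = index.to_payload()
restored = DocumentIndex.from_payload(payload)
```

The round trip matters more than the particular layout: whatever is printed on the document or separator sheet has to decode back into the same index fields the document management software expects.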

When the documents are scanned, scanning software reads the bar codes. Barcode-In then facilitates the indexing of the document by acting as an interface with the user's document management software. Barcode-In can also be used to check that all the images that were bar coded were entered into the system. By checking the bar codes of all scanned documents against a list of bar codes that were created, a user can discover which documents were missed. Coar says the only type of imaging application where bar coding of documents doesn't fit is where a company is imaging documents that have no data to index them with. In this situation, there are no indexes, just file names to reference the files. A bar code is reduced to a substitute for a file name, which is redundant.
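
The completeness check described above amounts to comparing two lists. The following Python sketch shows the idea under the assumption that both the issued and the scanned bar codes are available as plain strings. The function name and sample values are hypothetical, not part of Barcode-In.

```python
def reconcile(issued, scanned):
    """Compare bar codes that were created with bar codes read at scan time.

    Returns two sorted lists: codes that were issued but never scanned
    (missed documents), and codes that were scanned but never issued
    (for example, misreads or stray sheets).
    """
    issued_set = set(issued)
    scanned_set = set(scanned)
    missing = sorted(issued_set - scanned_set)
    unexpected = sorted(scanned_set - issued_set)
    return missing, unexpected


issued_codes = ["DOC-0001", "DOC-0002", "DOC-0003", "DOC-0004"]
scanned_codes = ["DOC-0001", "DOC-0003", "DOC-0009"]
missing, unexpected = reconcile(issued_codes, scanned_codes)
```

Reporting unexpected codes as well as missing ones is a design choice of this sketch. The text only describes finding documents that were missed.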

Standardization Is The Next Step
Coar did not stop with software development, though. "Once you've gone through the effort of creating the bar code, why shouldn't everyone who comes into contact with the document be able to read that bar code?" he asks. "This can be accomplished through creating a standardized format for the information contained in the bar code." Coar's background is in mortgage banking. Before founding IWE in 1995, Coar's claim to fame in the imaging world was designing an imaging system for the Federal National Mortgage Association, better known as Fannie Mae. Located in Washington, D.C., Fannie Mae is a publicly-held, government-sponsored corporation that buys and sells about 175,000 home mortgages per month. With these mortgages comes a load of paperwork.

Eliminating Data Reentry
"Typically, in the mortgage banking industry there is a lot of document transfer among brokers, underwriters, banks and agencies such as Fannie Mae. They are all buying and selling loans," says Coar. "These organizations all work with data extracted from the documents associated with these loans." This data can act as a stand-alone source of information, or it can double as an index for images of the documents. In either case, if the data required by the organizations were contained in a bar code created at the beginning of the process, and that bar code could be read by each organization, it would eliminate a lot of data reentry.

"Typically, each organization that receives the documents requires the same information," says Coar, "and typically, each time an organization receives a set of documents on a mortgage, it enters the data into its system manually. A standardized bar code format containing this data would eliminate the work involved with manually rekeying the data. All the data could be entered through the swipe of a bar code reader, or, if the document images are being transferred, the bar codes could be read and the data extracted by software which reads bar codes on images."

Several Markets Could Benefit From Standardized Format
Coar says the legal and health-care markets could also benefit from standardized formats of bar codes on documents. In the legal market, case records, testimonies and affidavits are often passed among law firms, and in the health-care industry, patient records are moved between facilities. "Many industries can benefit from a standardized bar coding system for documents. Any industry where there are large amounts of documents moving among different sites, and where those documents need to be indexed, or similar data from the documents needs to be extracted at each site to populate a database, can benefit," says Coar.

2-D Bar Code Needed To Contain Information
In order to contain the necessary data for applications like mortgage banking, law and health care, any standardized bar code format would have to be a PDF (Portable Data File) 417 two-dimensional bar code. The PDF 417 bar code can contain up to 2,725 data characters, about 100 times as many characters as the average Code 39 bar code (the most popular type of bar code, considered by many to be the standard for general commercial and industrial use). Coar acknowledges that the greater availability of Code 39 technology makes it attractive to integrators. "There is a lot of inexpensive software available to generate and read Code 39 bar codes," says Coar. "Integrators can get away with Code 39s for proprietary applications because they can cross-reference documents with data already stored in the system to cut down on the amount of data they need to include in the bar code. But in order to include enough data so that a document can be moved independently among different systems, a 2-D bar code is necessary."
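
The trade-off between the two symbologies can be sketched as a simple selection rule. The 2,725-character ceiling is the figure quoted above. The 27-character Code 39 limit is only an assumption derived from the "about 100 times" comparison, not a property of Code 39 itself. The character set is the standard 43-character Code 39 set. The function name is hypothetical.

```python
import string

# Standard Code 39 data characters: digits, uppercase letters, space, - . $ / + %
CODE39_CHARSET = set(string.digits + string.ascii_uppercase + " -.$/+%")

PDF417_MAX_CHARS = 2725    # capacity figure quoted in the article
CODE39_TYPICAL_MAX = 27    # assumption: roughly 1/100 of the PDF 417 figure


def choose_symbology(payload: str) -> str:
    """Pick a bar code type for a payload, using the limits assumed above."""
    if len(payload) > PDF417_MAX_CHARS:
        raise ValueError("payload exceeds the assumed PDF 417 capacity")
    fits_code39 = (
        len(payload) <= CODE39_TYPICAL_MAX
        and set(payload) <= CODE39_CHARSET
    )
    return "CODE39" if fits_code39 else "PDF417"


short_key = choose_symbology("104417/19970701/BILLING")
full_record = choose_symbology("Jane Doe, 12 Main St, loan 0012345, 30-year fixed")
```

A short index key that cross-references data already held in the system fits the one-dimensional code. A self-describing record that must travel between systems does not, which is the point Coar makes above.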

Settling On EDI Standards
Coar is a member of a committee set up through AIIM (The Association for Information and Image Management) which is working to develop a set of standards for bar coding documents. According to Coar, the committee has settled on using a variation of existing EDI document formats. "We asked ourselves, 'Why reinvent the wheel when there are already some standards out there?' The information contained in a typical EDI transaction can easily be put in a PDF 417 bar code." EDI involves transferring encoded data to a receiver where it is decoded and used to populate a database. There are existing EDI formats for most documents used in mortgage banking, health care and law.
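
As a rough illustration of decoding EDI-style data read from a bar code, the Python sketch below splits a payload into segments and elements. The asterisk and tilde delimiters follow a convention commonly seen in X12-style EDI, but the segment tags and values are invented for the example and do not come from any real transaction set or from the committee's work.

```python
def decode_segments(payload, element_sep="*", segment_end="~"):
    """Split an EDI-style string into {segment tag: [elements]}.

    Simplification: if a tag repeats, the later segment overwrites the
    earlier one. A real decoder would need to preserve repeats and order.
    """
    record = {}
    for segment in payload.split(segment_end):
        segment = segment.strip()
        if not segment:
            continue  # skip the empty piece after the final terminator
        tag, *elements = segment.split(element_sep)
        record[tag] = elements
    return record


# Hypothetical payload as it might be read from a 2-D bar code.
barcode_text = "NAM*DOE*JANE~LON*0012345~AMT*125000.00~"
record = decode_segments(barcode_text)
```

Once decoded, the same dictionary could populate a database row or serve as the index for the scanned image.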

"Some banks avoid data reentry through EDI, but if EDI information were contained in a bar code, users would no longer have to set up for an EDI transaction, which involves using a modem to transmit information. Instead, they could just swipe the information off the bar codes when they receive the documents," says Coar. "The information then could also be used to index the document images."

Snags To Final Standards
Standardizing bar code formats for documents has its snags, says Coar. One is resolving document-related issues that EDI does not address, such as where to list the type of document. "EDI is a data transfer," says Coar. "The data comes from a document but an EDI transfer doesn't list what type of document it has drawn the data from." The type of document is important not only for indexing purposes if the documents are imaged, but also if the bar code is used in conjunction with OCR (Optical Character Recognition).

"When the bar code is being created, the document it is assigned to may not be complete, or it may even be blank," says Coar. "Somewhere down the line when that document is filled in, we'd like to be able to use the bar code to tell an OCR program where to look for data. For instance, a 1003 mortgage loan form has 360 fields on it. When the bar code is created it could say: This is a 1003 form. Look in field one for customer name. Look in field two for loan number, etc." Some of the problems with agreeing on a standardized form are as minor as deciding how to define the X and Y coordinates of the fields. Larger snags have come from a lack of assistance from manufacturers of OCR and scanning software, says Coar.

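One way to picture a bar code that tells an OCR program where to look is a payload carrying the form type followed by named field zones. The Python sketch below parses such a payload. The layout, separators, field names and coordinates are all hypothetical, since the committee's handling of X and Y coordinates is not described here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FieldZone:
    """Rectangle on the page where an OCR engine should look for one field."""
    name: str
    x: int
    y: int
    width: int
    height: int


def parse_ocr_hints(payload: str) -> Tuple[str, List[FieldZone]]:
    """Parse 'FORM|name:x,y,w,h|name:x,y,w,h' into a form type and zones."""
    form_type, *field_specs = payload.split("|")
    zones = []
    for spec in field_specs:
        name, coords = spec.split(":")
        x, y, width, height = (int(value) for value in coords.split(","))
        zones.append(FieldZone(name, x, y, width, height))
    return form_type, zones


# Hypothetical hints for the first two fields of a 1003 form.
hint_text = "1003|customer_name:50,120,400,30|loan_number:50,160,200,30"
form_type, zones = parse_ocr_hints(hint_text)
```

An OCR program given these zones would only need to read the listed rectangles, rather than being trained on each new form layout.
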
"OCR manufacturers see this bar code format as a replacement for OCR, which it can be," says Coar. "OCR is also used to extract data from forms, but the problem is that every time a user gets a new form, the OCR system has to be trained to read that form. With a standardized bar code system, users' bar code-reading software would take the data from the bar code. But the bar codes can also be used to assist OCR, by telling the OCR program where to look for data, thus eliminating the need to manually ‘train' the system to read each new document." Coar says that most off-the-shelf, image-based data capture software is designed to facilitate manual data entry. "Data entered through a bar code would make scanning software obsolete in a lot of applications."

Motivation For Compliance Needed
The final problem with trying to create a standard for bar coding of documents, according to Coar, is that "nobody gets paid for creating standards." Coar believes that it will take the adoption of standards by two major interacting organizations to get the ball rolling. "Two businesses that are using separate bar coding systems to file their documents will come to the conclusion that they are duplicating data entry efforts. To communicate more effectively they will adopt standards. Once they find out how these standards expedite data transfer, they will encourage everyone they work with to adopt the same standards."

The ideal time for assigning the bar code to documents is when the documents are created, says Coar. Unfortunately, the documents are often created by the smallest companies in the document chain. "In a mortgage banking application, a broker might be the one generating the forms. You're lucky sometimes if a broker can afford an office, much less an imaging system."

Coar says that larger companies on the upper end of the hierarchy can really reap the benefits of more efficient data transfer if the bar code is applied early to the documents. "The larger players need to offer some sort of financial incentive to the brokers to motivate them to apply the bar code early in the process."

Competitors Working Toward Similar Goals
Coar promotes standards by offering IWE's bar code formats to all integrators who request them. "The more integrators who are writing bar codes similar to the IWE format, the better chance that, when a standard is developed, it will be close to what we are already using," says Coar. Coar also earns consulting fees from integrators to whom he gives his bar code language, when they are developing their applications. He adds, "Although I may be competing with some of these integrators in the future, we are all really working for the same thing. The more bar coded documents that are out there, the greater the chance that two organizations using them are going to want to communicate with each other, and some sort of standard will be developed. Once that standard is in place, and users find out how much easier this technology can make their jobs, integrators who are already offering a bar code solution for documents are going to have a head start on the rest of the competition."