Introduction
The OptiDoc2 imaging workstation application is versatile, feature rich, and user friendly. It is intended for use as an industrial strength front end application for high speed document imaging. It can drive all types of scanners, including ISIS and TWAIN scanners. It can correct and manipulate images, on the fly or as needed, with a large set of hardware based and software based DSP functions and filter options. It can recognize barcodes, including 2-D PDF417 barcodes.
Scanning Modes
In scanning modes, this application can use blank pages or patch codes to delimit batch separations, in combination with many other processing modes that streamline your workflow. This application can perform automated indexing and field validating operations. These features speed up overall data entry, and guarantee data accuracy.
Search Modes
This application can perform many types of field based and zone based barcode and OCR zone recognition operations. It can perform ultra fast field SQL searching, and in addition, it can perform sophisticated state of the art, Full Text and English Language searching. These advanced modes of searching are very useful for uncovering relationships in large repositories of unstructured documents. This modular integrated system provides complete transaction coordination with the database for guaranteed delivery and recall of documents flowing through the system.
Windows Standard User Interface
This application program uses a Windows standard MDI user interface. This interface has one large outer frame window that contains a variable number of smaller windows. The inside windows can be maximized, minimized, and moved to specific locations within the large frame window.
Each window in the OptiDoc Workstation module has a specific functional purpose. Each of these windows is completely configurable. User configuration is granular. This means that everything specific to screen layouts, positions, colors, and fonts is saved and remembered. An administrator or an end user can configure these settings.
This advanced flexibility and user interface technology provides the adaptability that is needed in high volume and commercial environments. This flexibility also allows a workflow to be divided up into the most efficient and most manageable units based on your business needs.
Configurable User Interface
The OptiDoc Workstation application user interface is completely customizable. It consists of a set of functional windows that the user or administrator can manage to best suit the task at hand. Each functional window exists within a master frame window. A new functional window can be created by choosing any of the "create" menu items under the window menu item from the main application menu bar. Each new window can be sized and positioned within the main frame window so that screen space is best utilized for the job.
Configurable Business Processes
For example, if your job is just to run a scanner and create project files, then you might fill up the entire main window with the input functional window, and choose to ignore all of the other functional windows. In this configuration, the OptiDoc Workstation application is simply a scanner driver and project making application.
However, if you want to be able to search for and retrieve documents, then the result list functional window must be present. When this window is present on the screen, the results of searches are populated into it. In this configuration, the OptiDoc Workstation application is simply a search and retrieval workstation.
Functional Windows
Other functional windows provide additional features, and these features can be mixed and matched together.
- The text functional window is used to view and edit raw text data that results from Full Text indexing, or simply to view ASCII text documents stored in the database.
- The fields functional window is used to attach and view index data with scanned documents before submission into the database.
- The image functional window is used to display pictures of raster image information, such as TIFF files.
- The input functional window is used to drive scanners, manage projects, and accept drag and drop input into the system.
- The result list functional window shows the results from searches in a matrix of columns.
For example, after you have performed a search, and have the results showing in the result list window, and then you click the mouse on a result, the first page of the image will appear in the image window. Two conditions must be met in order for this action to happen. The selected database document must contain a raster image, such as a TIFF file, and the image window must be on the screen.
Drag And Drop User Interface
This user interface also supports drag and drop operations. A user can rearrange the result list columns by dragging and dropping the column headers from place to place. A user can sort the result list on any column, according to the type of the data, by clicking on the column header. You can drag and drop documents from the result list directly into the input window to quickly create a copy of a document. You can also drag documents from the explorer interface and drop them into the input window. Pages and documents can be dragged and dropped within the input window to rearrange pages, delete unwanted pages, and change processing order.
Persistent User Interface
All configuration information is stored in the preferences file, and is automatically recalled and updated, so that all of the setup information persists each time the application is started. The entire functional organization, as well as the entire look and feel of the OptiDoc Workstation application is completely configurable.
Scanning Modes
Use this dialog to configure the OptiDoc Workstation application to interact with your scanning device. From the input window, right mouse click on the icon of the scanner and choose scanner settings from the drop down menu to set up and configure your scanning job. You can save sets of scanner settings under different job names for each different type of job that you are working on, and switch between these settings using the job name drop down menu. You can remove a job by choosing the job and then clicking on the delete job button. A scanning job is simply a name for all of the settings that are applied to the scanner, such as simplex or duplex modes, batching modes, blank page removal, etc. If you are working on several different types of jobs at the same time that require different scanner configurations, this is an easy way to remember the settings and to switch between them.
Single Page Mode
In single page mode the scanner expects a single page to be fed into the scanner with each invocation of the scanning process. Typically, there is a setup delay while the scanner adjusts the input tray and prepares to perform a run of scanning, and there is also a shutdown delay while the scanner puts the tray back down and terminates a run of scanning. In single page mode this overhead is incurred for each page that is fed into the scanner.
Single Batch Mode
In single batch mode the scanner will scan all of the pages that are loaded into the tray into one multi-page document. Each time the scanning process is invoked the new pages are accumulated into the existing multi-page document.
Specify Number Of Document Pages Mode
In specify number of document pages mode, the number of pages in the document is set to a fixed number of pages using the edit field that becomes enabled when this mode is selected. For example, if you have nine pages and set this scan mode to break after three pages, you will end up with three documents in the input window. Blank pages are always removed when using this mode.
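The page count rule above can be sketched in a few lines of Python. This is only an illustration, not the OptiDoc implementation: the function name `split_by_page_count` and the `is_blank` callback are hypothetical, and the sketch assumes that blank pages are removed before the pages are counted, which the description does not specify.

```python
def split_by_page_count(pages, pages_per_document, is_blank=lambda page: page == ""):
    """Group scanned pages into documents of a fixed number of pages.

    Assumption: blank pages are dropped before counting.
    """
    if pages_per_document < 1:
        raise ValueError("pages_per_document must be at least 1")
    kept = [page for page in pages if not is_blank(page)]
    return [kept[start:start + pages_per_document]
            for start in range(0, len(kept), pages_per_document)]


# Nine pages with a break after every three pages gives three documents.
documents = split_by_page_count([f"page{n}" for n in range(1, 10)], 3)
print(len(documents))
```

If the page count is not an exact multiple of the break value, this sketch leaves the remaining pages in a shorter final document.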
Multiple Batches Delimited By A Blank Page Mode
In multiple batches delimited by a blank page mode each document is accumulated until two adjacent page faces are blank. Typically, two adjacent page faces are blank when a single blank page is inserted into a batch of documents that are scanned in full duplex mode. When the blank page is detected, a new document is created, and the blank page is removed.
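The blank page rule can be illustrated with a short Python sketch. It assumes a duplex scan delivered as a flat list of page faces and a hypothetical `is_blank` test; the real blank detection works on image content and is not shown here. Two adjacent blank faces end the current document and are discarded, while a single blank face (for example, the empty back of a one sided page) is kept.

```python
def split_on_blank_sheets(faces, is_blank=lambda face: face == ""):
    """Split duplex page faces into documents at two adjacent blank faces."""
    documents, current = [], []
    index = 0
    while index < len(faces):
        next_index = index + 1
        if (next_index < len(faces)
                and is_blank(faces[index]) and is_blank(faces[next_index])):
            # A blank separator sheet: close the document and drop both faces.
            if current:
                documents.append(current)
            current = []
            index += 2
        else:
            current.append(faces[index])
            index += 1
    if current:
        documents.append(current)
    return documents


# "" stands for a blank face; the pair of blanks separates two documents.
faces = ["A1", "", "A2", "A2-back", "", "", "B1", ""]
batches = split_on_blank_sheets(faces)
```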
Multiple Batches Delimited By A Patch Code (Keep Patch Code)
In multiple batches delimited by a patch code (keep patch code) mode, each document is accumulated until a patch code is found. A patch code is a type of barcode that is used to indicate the end of a batch. In this mode the page with the patch code is kept with the document, and therefore can carry other information on the same page as the patch code.
Multiple Batches Delimited By A PDF417 Barcode (Keep Barcode)
In multiple batches delimited by a PDF417 barcode (keep barcode) mode, each document is accumulated until a PDF417 barcode is found. A PDF417 barcode is a type of barcode that is used to contain indexing information and to indicate the end of a batch. In this mode the page with the barcode is kept with the document, and therefore can carry other information on the same page as the barcode.
Multiple Batches Delimited By A Patch Code (Delete Patch Code)
In multiple batches delimited by a patch code (delete patch code) mode, each document is accumulated until a patch code is found. A patch code is a type of barcode that is used to indicate the end of a batch. In this mode the page with the patch code is deleted from the document, and therefore the patch code page should not contain any information that you might want to keep.
Multiple Batches Delimited By A PDF417 Barcode (Delete Barcode)
In multiple batches delimited by a PDF417 barcode (delete barcode) mode, each document is accumulated until a PDF417 barcode is found. A PDF417 barcode is a type of barcode that is used to contain indexing information and to indicate the end of a batch. In this mode the page with the barcode is deleted from the document, and therefore the barcode page should not contain any information that you might want to keep.
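The difference between the keep and delete variants of the patch code and PDF417 modes can be sketched as follows. The function `split_on_separator` and the `is_separator` test are hypothetical names used for illustration; the sketch assumes the separator page marks the end of the batch it closes, as described above, and says nothing about how the barcode itself is detected.

```python
def split_on_separator(pages, is_separator, keep_separator):
    """Split pages into documents at each separator page.

    keep_separator=True keeps the separator as the last page of the
    document it ends; keep_separator=False drops the separator page.
    """
    documents, current = [], []
    for page in pages:
        if is_separator(page):
            if keep_separator:
                current.append(page)
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents


def is_patch(page):
    # Stand-in for real patch code or PDF417 detection.
    return page == "PATCH"


pages = ["a1", "a2", "PATCH", "b1", "PATCH"]
kept = split_on_separator(pages, is_patch, keep_separator=True)
deleted = split_on_separator(pages, is_patch, keep_separator=False)
```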
Scan Type
The contents of the scan type drop down menu depend upon the scanner hardware, and change depending upon the type, model, and manufacturer of the scanner you are using. Features that might be available include front only, back only, simplex, duplex, flatbed, feeder, automatic, red drop out, and others. Please read the documentation that comes with your scanner for an explanation of what additional hardware features are available and what these features do.
Virtual Rescan
The apply filter checkbox allows you to create a virtual rescan operation. You can apply any DSP imaging filter to all of the scanned images. DSP filters perform functions such as deskew, despeckle, darken, contrast change, border removal, and many others. This is an advanced feature. Please refer to the section on DSP imaging filters, or call ATS, for help setting up and configuring DSP imaging filters.
Preferences
Login Pane
The login preference pane is used to set up default values for the login authentication method, and to provide a user name, a password, and the name of a database connection.
Windows Integrated Authentication
Windows Integrated authentication attempts to create a database connection with the name of the logged in user and does not require a password. Although more convenient in many respects, this database authentication method is less secure than typical SQL authentication.
SQL Authentication
SQL authentication requires you to enter a user name and a password and uses the SQL database to validate the user. SQL authentication is the preferred method. If you ask the OptiDoc Workstation application to perform an operation that requires a database connection, and no connection has yet been established, then this panel may present itself in order to setup the required database connection.
ODBC Timeout
The ODBC timeout value is the maximum number of seconds that are allowed to transpire before any given SQL query is determined to be dead. Network conditions, server speeds, and query values all conspire to change the length of any particular SQL query. A reasonable value of 30 seconds is the default, but if you have very large databases, or poor network connectivity, then longer values may be necessary.
Temp Pane
The OptiDoc Workstation application program creates and maintains a preference file that contains user configuration and preference information. If all users of a workstation are set up to use the same configuration, then the default preference file is loaded each time the application is launched. The physical name of this preference file is "default.dat". If each user has a different configuration, then the configuration has to be based on the user login. In this case a preference file with that user name is loaded when the application is launched. The Prefs pane in the preferences dialog is used to set these different preference file configurations.
Preference Files
Preference files and temporary files are normally written to default locations, but they can also be redirected to other locations. For example, network security might demand that no temporary file or preference file be written onto the local workstation. In this case, all of these files can be redirected to a specified known safe location.
Temporary Working Files
Please keep in mind that temporary working files can be very large, and enough disk space must exist at the specified location in order for the OptiDoc Workstation to carry out its operations properly. By default, preference files are written into the application folder, within a folder that is named "OptiCenter_Prefs", and temporary files are written into the standard Windows TEMP folder that is provided by the Windows environment.
This preference panel is used to change these default settings for any network or security configuration. All preferences are encrypted. The main preference file that contains the OptiDoc Workstation window configurations, arrangements, and settings can be different for each user that logs into the workstation, or they can be global and shared for all users.
Since the storage location of the preferences file cannot be stored in the preferences file itself, this information must be kept in the registry. The specific registry key is HKEY_LOCAL_MACHINE\SOFTWARE\ATS and the application must have read and write permissions to this key in order to find and work with its preference files. If the permissions to this key are not sufficient, a warning message is generated in the log file trace.
The application permissions to this key can be tested by pressing the "Test Registry Access" button. A key fact to remember is that preferences are first located by using the registry keys, and then all generated preferences files will follow the registry keys. The registry keys determine whether a global set of preferences is in use for all users, or if specific preferences are in use for each logged in user, or whether the preferences are stored locally or remotely, or whether temporary files are written locally or remotely.
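The lookup order described above can be pictured with a small Python sketch. It is not the application's code: the function `resolve_preference_file` and its `per_user` and `prefs_folder` parameters are hypothetical stand-ins for whatever values the application reads from the registry key, and the per user file name pattern of user name plus ".dat" is an assumption. Only the folder name "OptiCenter_Prefs" and the file name "default.dat" come from this manual.

```python
from pathlib import PureWindowsPath

DEFAULT_PREFS_FOLDER = "OptiCenter_Prefs"  # default folder named in this manual
DEFAULT_PREFS_FILE = "default.dat"         # shared preference file named in this manual


def resolve_preference_file(app_folder, user_name, per_user=False, prefs_folder=None):
    """Return the preference file path for the given settings.

    per_user and prefs_folder are hypothetical stand-ins for registry values.
    """
    if prefs_folder:
        folder = PureWindowsPath(prefs_folder)      # redirected location
    else:
        folder = PureWindowsPath(app_folder) / DEFAULT_PREFS_FOLDER
    # Assumption: a per user file is named after the user, with a .dat extension.
    file_name = f"{user_name}.dat" if per_user else DEFAULT_PREFS_FILE
    return folder / file_name


shared = resolve_preference_file(r"C:\OptiDoc", "alice")
personal = resolve_preference_file(r"C:\OptiDoc", "alice", per_user=True,
                                   prefs_folder=r"\\server\share\prefs")
```

The first call models a global configuration stored locally; the second models per user preferences redirected to a network location.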
Styles Pane
The OptiDoc Workstation application program allows the user to configure the font, face, color, layout, and size of each of the functional windows. This preference panel allows the user to choose a functional window and set the visual parameters for that window.
One operator might want a very large font for indexing, while another operator might want high contrast foreground and background colors in order to count pages more easily. This preference panel allows all of these parameters to be configured. The panel works by showing a drop down list of the names of each functional window. First, set the target of the style panel by using the drop down list to pick the name of the functional window that you want to configure. Then click on the foreground or background color boxes, or other options that may appear, to view and pick colors and fonts. The styles of each window can also be modified from within that window, by right mouse clicking inside of the window and choosing "styles" from the resulting properties menu.
Free Text Pane
The OptiDoc Workstation application program can have an integrated Free Text, or Full Text, or English search engine embedded into the functional windows. When this panel is activated by checking the box to enable the full text engine in the preferences, the OptiDoc Workstation application program has two additional and advanced modes of searching.
These modes of searching go beyond the scope of normal SQL builder searching capabilities. The functional window that is used for indexing is also automatically reconfigured to perform a full text submission into a special index of scanned and OCR'd documents. When the full text search engine is enabled, DLLs and other modules are automatically loaded from the specified engine path. This additional feature is installed separately.
The full text query engine provides a complete retrieval language that finds documents based upon words or phrases located anywhere in the scanned and OCR'd documents. The easiest way to use the free text search function is to simply type in some words that you are looking for into the query window. The results of the query will be documents that contain those words, or contain synonyms of those words. The following is a list of some of the advanced operations supported by the free text query engine.
- AND - Boolean AND: both words or phrases must appear in the same document
- OR - Boolean OR: locates documents which contain any one of the words or phrases
- NOT - Boolean NEGATION: locates documents which contain the first word or phrase, but not the second.
- /N,M/ - must appear from n words before to m words after, but within the same paragraph
- // - must appear in the same paragraph
- \N,M\ - must appear from n paragraphs before to m paragraphs after.
- \\ - must appear in one paragraph on either side
- … - second term must appear after the first term
- EXCEPT - paragraph level exclusion operator
- IN - first term must appear in a paragraph (or section) labeled by the second term
- TO - specifies an alphabetic, date, or numeric range search
- ( ) - Operator precedence
- AFTER - search for documents after a specific date
- BEFORE - search for documents before a specific date
- GE, LE - numeric range searching
- Wild Cards - In the full text search engine a wild-card symbol may appear once, anywhere in the word, much like the DOS wildcard character, (e.g. xyz*abc), or may occur at each end of a word.
- Conflation - In the full text search engine words may be suffixed or prefixed by the conflation operator which causes all tense-forms of the word to be retrieved. For example "worked~" would also retrieve "work", "working", "workers", but not "workstation". Conflation may also be used at the start of a word.
- Dates - The full text search engine includes optional intelligent date handling which can find dates regardless of the format in which they are expressed in the documents or the query. For example, Mar-20-96, March 20 1996, or even the 20th of March, 1996 are all matches.
- Numbers - The full text search engine includes intelligent recognition and indexing of numeric quantities, regardless of how they are expressed. For example, the phrase "two hundred and nine thousand one hundred and one" will match the numeric value. Likewise, for numeric quantities expressed in terms such as "10 million" or "1000".
- Fuzzy Searching - The full text search engine recognizes that data is often sourced from scanned and OCR'd material. While OCR software improves all the time, it is far from perfect, and manually inspecting and correcting OCR errors can be very time consuming. The fuzzy precompensator automatically adjusts for typical OCR scanning and typographical errors without operator intervention. For example, in a document about "ducks", if the word "cluck" suddenly appeared, the search engine may deduce that the "d" had been incorrectly OCR'd as a "cl". If you searched on the word "duck" you would also hit on the word "cluck". However, "cluck" would still be found if you specifically searched for it. In other words, the full text search engine is not so presumptuous as to correct seeming errors, only to compensate for them. In a document mainly about chickens, the reverse may be true. In a document about both ducks and chickens, the full text search engine would deem it too close to call. The full text search engine uses advanced heuristic processing to achieve fuzzy precompensation. Rather than basing this on a static dictionary, the dictionary it uses is the database itself, hence it is adaptive and will function correctly even on proper nouns.
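The Boolean operators and the wild card in the list above can be modeled with Python sets. This is a conceptual sketch only: it does not use the query syntax or the internals of the full text engine, it ignores synonyms, proximity, conflation, and fuzzy matching, and the sample documents and the helper name `matching` are made up for illustration. The wild card is approximated with the standard library `fnmatch` module.

```python
from fnmatch import fnmatchcase

# Made-up sample documents, keyed by a document number.
documents = {
    1: "the duck swam across the pond",
    2: "a chicken chased the duck",
    3: "the chicken clucked",
}


def matching(pattern):
    """Return the numbers of documents containing a word that matches pattern."""
    return {number for number, text in documents.items()
            if any(fnmatchcase(word, pattern) for word in text.split())}


both = matching("duck") & matching("chicken")       # duck AND chicken
either = matching("duck") | matching("chicken")     # duck OR chicken
only_duck = matching("duck") - matching("chicken")  # duck NOT chicken
wildcard = matching("c*d")                          # wild card inside a word
```

AND corresponds to set intersection, OR to set union, and NOT to set difference, which is why AND can only shrink a result list and OR can only grow it.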
Field Search Pane
After a user has logged into a database and has selected a collection then that user has the ability to search for documents in that collection. The most basic type of search is called a field search because this type of search combines together logical clauses that are made up from field relationships from the currently chosen collection.
Choosing the find menu item under the find menu, or pressing the F3 key, will bring up the find dialog. If you have a system that is configured for advanced searching capabilities, such as Full Text, or English searching, then these alternative search interfaces will appear as additional panes within the main search dialog. Field searches are always available.
If you have not logged into the OptiDoc database before you attempt to perform a search, then the login dialog will automatically be presented. If you have not yet chosen a collection from which to search, then you will also be prompted to specify a document collection. Keep in mind that you will only be presented with document collections that your logged in user has security access to.
Logical AND vs. Logical OR
Once you have the field search dialog on screen, you can enter your search into the dialog by completing each of the four possible search clauses. Each search clause can be combined using a joining operator such as AND or OR. Please be aware of the logical nature of creating your search. If you want to see the results of two distinct possibilities then each clause should be joined together using the OR operator. Be aware that the AND operator makes a search more specific, not more general. Conversely, using the OR operator makes a search more general, and not more specific.
How To Use The SQL Search Dialog
Each field that you have security access to will appear in the drop down list of fields in the left most part of the dialog. Choose the field that you want to inquire about from this drop down list. In the middle of the clause is a drop down list of operators that you can choose from. For text values the operators contain options that deal with text, such as contains, or begins with, while with numeric values the operators contain options that deal with numbers such as equal to, or greater than. The operators that you see in the drop down are dependent on the type of the field that you have chosen to inquire about. Finally, enter the value that you want to compare against, into the field along the right most part of the dialog. If you want a more sophisticated search, then you can check on an additional clause, and fill in all of the required information for that clause.
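One way to picture how the clauses of the dialog combine is to build a SQL WHERE clause from them. The Python sketch below is an illustration under stated assumptions, not the query that the application generates: the table, the field names `customer` and `amount`, the operator labels, and the function `build_where` are all hypothetical. It uses an in-memory SQLite database so that the effect of AND versus OR can be seen directly.

```python
import sqlite3

# Hypothetical operator labels mapped to a SQL comparison and a value pattern.
OPERATORS = {
    "contains": ("LIKE", "%{}%"),
    "begins with": ("LIKE", "{}%"),
    "equal to": ("=", None),
    "greater than": (">", None),
}
FIELDS = {"customer", "amount"}  # hypothetical fields of one collection


def build_where(clauses):
    """Build a parameterized WHERE clause from one to four search clauses.

    Each clause is (joiner, field, operator, value); the first joiner is ignored.
    """
    if not 1 <= len(clauses) <= 4:
        raise ValueError("a field search takes one to four clauses")
    parts, params = [], []
    for index, (joiner, field, operator, value) in enumerate(clauses):
        if field not in FIELDS:
            raise ValueError(f"unknown field: {field}")
        sql_operator, pattern = OPERATORS[operator]
        if index > 0:
            if joiner not in ("AND", "OR"):
                raise ValueError(f"unknown joiner: {joiner}")
            parts.append(joiner)
        parts.append(f"{field} {sql_operator} ?")
        params.append(pattern.format(value) if pattern else value)
    return " ".join(parts), params


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE documents (customer TEXT, amount INTEGER)")
connection.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [("Acme Corp", 50), ("Acme Ltd", 500), ("Zenith", 500)],
)


def search(clauses):
    where, params = build_where(clauses)
    query = f"SELECT customer, amount FROM documents WHERE {where} ORDER BY rowid"
    return connection.execute(query, params).fetchall()


first = (None, "customer", "begins with", "Acme")
narrow = search([first, ("AND", "amount", "greater than", 100)])  # more specific
broad = search([first, ("OR", "amount", "greater than", 100)])    # more general
```

Note that in SQL the AND operator binds more tightly than OR, so a search that mixes both joiners may need explicit grouping; this sketch does not attempt that.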
Saving Searches For Instant Recall
You can have up to four different clauses in a search. If you are performing a search many times over, then you can save that search for instant recall. First enter the search parameters into the dialog. Then hold down the shift key and click on the "search#" button that you want to use. A dialog box will appear asking you to provide a name for this search. The name that you enter into this dialog box will become the name of the button. Later on, you can recall this search by simply clicking on the button. When you click on a button that contains a saved search, the information is automatically recalled back into the search dialog. Up to four searches can be saved; however, each search is specific to a collection. Therefore, a search that has been designed for use with one collection will not work inside of a different collection.
Behaviors Pane
The OptiDoc Workstation application program has a characteristic set of behaviors that modify the overall way the application works. These behaviors have been implemented to maintain functionality with older OptiDoc systems and to allow a smooth transition into more advanced functionality. Sometimes behaviors are used to change how certain windows operate, so that more options are available.
One behavior is to automatically delete project files after they have been submitted.
About Project Files
A project file contains documents, image pages, indexing information, OCR information, and annotation information, as well as other information about a job, all packaged together as a single binary file. A project file has the advantage of being a single package for work in progress. If you can find the project file, then you have access to everything that has been worked on and saved into the project. This is in contrast to keeping up with large sets of related files. This design is resilient in real world conditions where confusion and damaged or misplaced files are common issues.
The OptiDoc Workstation application program can run a high end scanner and can automatically provide a project file that contains the scanned documents. A project file can then be passed along and manipulated, and passed from person to person, so that pages are arranged, filtered, corrected, OCR'd, indexed, and annotated. All of this can be done, in any logical order, with or without a physical database connection. The final step of submitting a project file into the database usually requires a physical database connection, but as a minimum only an internet connection is actually necessary.
This checkbox controls a special behavior for projects that enables the application to delete project files after they have been successfully inserted into the database. The normal mode of operation is that projects remain on disk after they are inserted.
Global Admin
Another behavior is the method that is employed for automating indexing features. Older OptiDoc systems used a "Global Admin" model, where a centralized database was used to implement features such as dropping down a list of choices, or making fields automatically fill in from the values entered into other fields. When the behaviors drop down menu is set to "Global Admin" then the automated indexing features behave in accordance with the older OptiDoc systems. Drop down menus, or reference tables, automatically appear, and field dependencies, or look up tables, automatically occur based upon the setup that is provided by the global administration module. These features have been implemented to model the older OptiDoc indexing functionality.
Active Fields
A more flexible approach to automated indexing features and functions is made available when the "Active Fields" behavior is chosen. When the active fields automated indexing behavior is chosen, a set of property panels can be used to set up many additional, specialized automated indexing features. Besides more flexible drop down or entry menus and complex field dependencies, multiple zone barcode recognition of all types, multiple zone OCR recognition of all types, and powerful 2-dimensional PDF417 auto population features are available.
Input Window Drag And Drop
A behavior of the input window is how files are managed when a group of files is dropped into the window. When a group of files is dragged from the desktop and dropped onto the input window, the files are always sorted into alphabetical order. Once the files are sorted by file name, the pages enter the input window tree as pages or as documents. There are two choices for how this is handled:
- All of the pages of all of the files can be merged together into one single document.
- Each file that is dropped onto the input window can be a single document in the input window.
To change this behavior set the drag and drop behavior of the input window.
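The two drag and drop behaviors can be sketched in Python as follows. The function `import_dropped_files` is a hypothetical name, the dropped files are modeled as a dictionary from file name to a list of pages, and the sort is assumed to ignore letter case, which this manual does not state.

```python
def import_dropped_files(files, merge):
    """Turn dropped files into input window documents.

    files maps each file name to its list of pages.
    merge=True puts every page into one document;
    merge=False makes one document per file.
    """
    # Assumption: alphabetical order ignores upper and lower case.
    ordered = sorted(files, key=str.lower)
    if not ordered:
        return []
    if merge:
        return [[page for name in ordered for page in files[name]]]
    return [list(files[name]) for name in ordered]


dropped = {"b.tif": ["b1", "b2"], "a.tif": ["a1"], "C.tif": ["c1"]}
merged = import_dropped_files(dropped, merge=True)
separate = import_dropped_files(dropped, merge=False)
```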
Application Logging Level
Another behavior is the application logging level. The default logging level is not to log anything. This setting allows the application to run fastest. However, many different levels of application logging are available for observing, analyzing, and diagnosing system and operator behavior. When detailed logs are generated, every function call is traced, and every mouse click, key press, and database operation is traced. Log files are limited to about 1 MB in size, and the last ten log files are kept on disk. These log files appear in the same folder as the application program. If a user encounters a problem with the OptiDoc Workstation application program, a detailed log that demonstrates the problem is the most direct way to diagnose it.
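A size limited, rolling set of log files like the one described above can be reproduced with the Python standard library, which may be useful if you write your own tools around the workstation. This sketch is not the application's logging code. The helper name `make_rotating_logger` is hypothetical, and it assumes that the current log file counts as one of the ten files kept on disk.

```python
import logging
import logging.handlers
import os
import tempfile


def make_rotating_logger(name, log_path, max_bytes=1_000_000, backups=9):
    """Return a logger whose file rolls over at about max_bytes.

    With backups=9, the current file plus nine older files stay on disk,
    ten in total (assumption: the current file counts toward the ten).
    """
    handler = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.addHandler(handler)
    return logger


# Demonstration with a tiny size limit so that rollover happens quickly.
with tempfile.TemporaryDirectory() as folder:
    demo = make_rotating_logger("demo", os.path.join(folder, "app.log"),
                                max_bytes=200)
    for number in range(200):
        demo.debug("mouse click %d", number)
    for handler in list(demo.handlers):
        handler.close()
        demo.removeHandler(handler)
    log_file_count = len(os.listdir(folder))
```

Older files beyond the limit are deleted automatically on rollover, so disk usage stays bounded at roughly ten times the size limit.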
Creator Identification Number
Another behavior is the creator identification number. When a document is inserted into the database an identification number is also inserted which cross references this application with the database record. This identification number can be set up on a workstation by workstation basis, so that records inserted from any particular workstation can be differentiated from records inserted from other workstations. This feature allows a system administrator to know how many records have been inserted from particular workstations within certain intervals of time, and whether the quality of work from any particular workstation is above or below requirements. The default creator identification number is 100, but it can be configured by a logged in administrator so that each individual workstation has its own unique identification number.
Submit Mode
Another behavior is to set online versus offline operations. When using offline operations it is still possible to insert documents from projects over a web connection. If the submit mode is "DIRECT" then documents are inserted into the system using the file system as the transport medium. If the submit mode is set to "HTTP POST" then documents are inserted into the system by web server proxy. When using "HTTP POST" to submit documents, the URL of the OptiWeb dataserver must be specified.
Document Properties Dialog
This window appears when you right click the mouse on the result list and choose the "document properties…" menu item. This dialog shows all the information relevant to a particular document in the system. It is much like getting file info with the standard Windows Explorer, and in addition, this dialog shows information that is important to the document imaging system. To see the file system information press the "properties" button and the Windows Explorer properties dialog will come up for the selected document without you having to find it in the directory. This function is very useful for system analysis and diagnostics. This dialog shows if the document has been stored into a storage group and if the backup system flag is turned on or off.
Dual Path Information
The backup system flag indicates if the administrative "dual path" operational mode has been set active for a storage group that contains documents. The dual path mode creates redundant documents in the system for remote storage security and data safety. Dual path algorithms are set up and configured using the administration module.
Subfolder Algorithm
If the subfolder algorithm is active then the document properties dialog also shows the name of the subfolder that the document has been saved within. When the administrative subfolder algorithm mode is turned on for a storage group, an upper bound is placed on the number of documents that will be written into any one subfolder. New subfolders are automatically created as necessary as documents are filed into the system. This feature helps file systems such as NTFS or NetWare balance the directory tree nodes, and can greatly speed up access to files stored in any storage group.
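The idea of capping the number of documents per subfolder can be sketched in a few lines of Python. This is a generic illustration, not the algorithm used by the administration module: the function `subfolder_for`, the running document counter, and the six digit folder names are all assumptions made for the example.

```python
from collections import Counter


def subfolder_for(document_number, max_per_subfolder):
    """Return the subfolder name for the Nth document filed (counting from 0).

    The six digit folder name is a hypothetical naming scheme.
    """
    if max_per_subfolder < 1:
        raise ValueError("max_per_subfolder must be at least 1")
    return f"{document_number // max_per_subfolder:06d}"


# Filing 2500 documents with an upper bound of 1000 per subfolder.
usage = Counter(subfolder_for(number, 1000) for number in range(2500))
```

With these values the first two subfolders each receive 1000 documents and a third is created for the remaining 500, so no directory grows without limit.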
Global Unique Identifier
The document properties dialog also shows the internal database document identification number, or GUID, as well as the cross reference value, or XRef, for quickly tracking down any document in the SQL database.
Primary And Secondary Path
This dialog also shows both the primary path and the optional secondary path where the document has been stored. If the backup flag is not turned on, then there is no secondary, or backup, path, and the secondary path is displayed as N/A.
Document Security Levels
This dialog also shows all of the document securities that have been applied to this document. By default, if no document securities have been specifically applied to a document, then the document will have a public security. Document securities provide a hierarchy of protection levels for system users. Document securities are set up and configured by the administrator.
Analyze Button
Clicking the "analyze" button first attempts to resolve the target file. If the target file can be resolved, an attempt is made to parse it, and if the document can be parsed, the result of the parse is shown. If the document cannot be resolved, analyzed, or parsed, then the document may be inaccessible, offline, or corrupt.
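The resolve-then-parse sequence can be sketched like this. It is an assumption-laden illustration rather than the workstation's own code: the function names are hypothetical, and the stand-in parser only inspects the standard TIFF header bytes.

```python
from pathlib import Path
from typing import Callable


def tiff_byte_order(data: bytes) -> str:
    """Stand-in parser: recognize the 4-byte TIFF header, else raise ValueError."""
    if data[:4] == b"II*\x00":
        return "little-endian TIFF"
    if data[:4] == b"MM\x00*":
        return "big-endian TIFF"
    raise ValueError("not a TIFF header")


def analyze(path: str, parse: Callable[[bytes], str]) -> str:
    """Step 1: resolve the file. Step 2: parse it. Report the first failure."""
    target = Path(path)
    if not target.is_file():
        return "unresolved: document may be inaccessible or offline"
    try:
        data = target.read_bytes()
    except OSError:
        return "unresolved: document may be inaccessible or offline"
    try:
        return f"parsed: {parse(data)}"
    except ValueError:
        return "unparseable: document may be corrupt"
```

A caller would pass the document path and a parser, for example `analyze("scan0001.tif", tiff_byte_order)`, where the file name is only a placeholder.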
Extract Marked Results To Folder
Once a result list has been made in the result list window, right mouse click on the list and choose the menu option that exports the current list to a selected network folder location. The OptiDoc Workstation application program must have write access to the chosen folder. When this feature is invoked, a dialog titled "Extract" appears and accepts the export choices and parameters.
Import Export Field Data
The first choice is for the type of the import/export field data file. This file is a text file, in one format or another, that contains the field information for the exported document. The different file formats are common formats that are used to import and export data into different systems for interoperability. When either the IRF or the XML radio button is chosen, one extra file will be created per document that is exported. The name of the extra file is the same as the name of the exported file, but with a different file extension. If the IRF option is chosen, then the extra file will be formatted as an OptiDoc IRF file, and it will have the extension IRF. If the XML option is chosen, then the extra file will be formatted as standard XML, and it will have the extension XML. When documents are exported with the type NONE, no extra file will be exported along with the document.
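The naming rule for the extra field data file can be sketched as below. The function name is hypothetical; only the rule itself (same name, extension IRF or XML, nothing for NONE) comes from the description above.

```python
from pathlib import Path
from typing import Optional


def sidecar_path(exported_file: str, field_format: str) -> Optional[Path]:
    """Return the path of the extra field data file, or None for type NONE."""
    fmt = field_format.upper()
    if fmt == "NONE":
        return None
    if fmt in ("IRF", "XML"):
        # Same folder and base name as the exported document, new extension.
        return Path(exported_file).with_suffix("." + fmt)
    raise ValueError(f"unknown field data format: {field_format!r}")


if __name__ == "__main__":
    print(sidecar_path("export/doc0001.tif", "XML"))
    print(sidecar_path("export/doc0001.tif", "IRF"))
    print(sidecar_path("export/doc0001.tif", "NONE"))
```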
TIFF Tags
In all cases, when the document type is TIFF, the standard TAGs are always updated to reflect the most recent field information from the database, as well as annotation information in the Kodak/Wang format. In most cases, when importing and exporting data to and from different OptiDoc systems, the extra field file is redundant because of the embedded XML TAGs, and is therefore not needed. All OptiDoc services and systems possess this TIFF TAG technology, actively recognize the embedded TAGs, and use this information when importing and exporting data to and from local or remote systems.
Export Max MB
The amount of data exported into the folder can be limited in MB increments by entering a value into a settings field within the dialog. This can be used to set up a data repository size suitable for CD ROM storage or DVD ROM storage.
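A size limit of this kind can be sketched as a simple running total. This is only an illustration: the function name is hypothetical, 1 MB is taken as 1024 × 1024 bytes, and the sketch assumes the export stops before the first document that would push the total over the limit, which the text above does not specify.

```python
MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes


def select_for_export(sizes_bytes, max_mb):
    """Pick documents in list order until the next one would exceed max_mb.

    Returns (indexes of chosen documents, total bytes chosen).
    """
    limit = max_mb * MB
    total = 0
    chosen = []
    for index, size in enumerate(sizes_bytes):
        if total + size > limit:
            break  # size limit reached; stop exporting
        chosen.append(index)
        total += size
    return chosen, total


if __name__ == "__main__":
    sizes = [300 * MB, 300 * MB, 200 * MB]
    print(select_for_export(sizes, 650))  # a CD-sized limit
```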
Delete After Extraction
If your OptiDoc user has permissions to delete files from the database then two additional options are available. The first option allows each extracted document to be removed from the database as it is extracted into the destination folder. The second option allows for the document that is stored within the database to also be deleted. In some extraction situations, a user might want to keep the stored documents, and in other situations a user might want to remove the stored documents. These options are useful for creating and managing collections of CD ROMs or DVD ROMs that are constructed as collections and then exported onto portable media.
OptiDoc Scanner External Search Engine
The OptiDoc System has an associated additional module, called the OptiDoc Scanner, that can create a mini database and search engine, and the whole application and mini database can be burned directly onto a CD ROM or a DVD ROM along with the exported data. This makes a complete and portable CD ROM or DVD ROM that can operate in a stand alone mode, and the media can simply be carried to and run from any other workstation within your organization. This is because all of the data, the mini database, and the searching software are contained on the CD ROM or DVD ROM.
Once the export process has started, the documents continue to be exported until all of the documents are exported, or the size limit has been reached. If the target disk drive fills up during the export, but before the target size limit is reached, an error message will be presented to the user. If a user attempts to delete a database storage document from a read only media storage device, such as a WORM storage device, an error message will be generated. At the bottom of the dialog is an entry field for specifying the target folder for the export function. This is where the exported documents will be transferred to.
Automated Indexing Using PDF417 Coversheets
When scanning documents into an OptiDoc database using the OptiDoc Workstation module, a special coversheet technology can be used. This coversheet technology uses 2-D error correcting PDF417 barcodes that act as both a document delimiter, and as a container for all of the index field data and other meta data.
These coversheets can be placed on top of a group of related documents and automatically scanned into the system in one big stack. If the scanner settings are set properly, then each 2-D PDF417 barcode that is detected will denote the start of a new document. In this way, varying numbers of backup documents or primary records can be added to the database.
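The way a coversheet delimits documents in a scanned stack can be sketched as follows. This is an illustration under assumptions: barcode decoding is not shown, each page is represented as a pair of a page identifier and the decoded PDF417 text (or None when no barcode was found), and the coversheet page itself is not kept as a document page. None of these names come from the OptiDoc software.

```python
def split_on_coversheets(pages):
    """Group a scanned stack into documents, starting a new one at each coversheet.

    pages: list of (page_id, payload) pairs, where payload is the decoded
    PDF417 text for a coversheet page and None for an ordinary page.
    """
    documents = []
    for page_id, payload in pages:
        if payload is not None:
            # A detected PDF417 barcode denotes the start of a new document.
            documents.append({"index_data": payload, "pages": []})
        else:
            if not documents:
                # Pages scanned before any coversheet get no index data.
                documents.append({"index_data": None, "pages": []})
            documents[-1]["pages"].append(page_id)
    return documents


if __name__ == "__main__":
    stack = [("p1", "INV=1001"), ("p2", None), ("p3", None),
             ("p4", "INV=1002"), ("p5", None)]
    for doc in split_on_coversheets(stack):
        print(doc)
```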
The coversheet contains all of the field information, in both 2-D PDF417 and human readable format. The main advantage of using this automatic indexing feature is that no keystrokes are performed by the scanning operator when indexing documents into the database. All of the field information comes from inside the special barcode. Menu items in the input functional window provide the operator with the ability to automatically look for, and index from, a PDF417 coversheet at the document level or at the project level.
After you have scanned a batch of 2-D PDF417 separated documents, choose the "PDF417 Barcode Index Project" menu item to invoke the automated indexing feature for the entire project, or choose the "PDF417 Barcode Index Document" menu item to invoke the automated indexing feature for just the selected document. In order to make use of this feature, your workflow must include the ability to create these specialized PDF417 coversheets.
Indexing Automation - Active Fields
When indexing documents into an OptiDoc imaging database using the indexing feature of this workstation module, the user or administrator has the ability to make settings that speed up the entry of, and enforce the accuracy of, the indexed data.
There are two modes of indexing automation. The older mode of operation is called "global admin" and can be selected using the "Behaviors" pane in the preferences dialog. This mode is backwards compatible with older OptiDoc imaging clients, and uses rules and settings that are configured from the administration module that are global for everyone using the indexing feature. The newer mode of operation is called "Active Fields" and can also be selected using the "Behaviors" pane in the preferences dialog.
The active field mode is not backward compatible with older OptiDoc clients, but contains all the functionality of the older mode, plus additional and more powerful options, including the ability to work with an unlimited number of zones, OCR, barcode, or otherwise, and pattern matching using regular expression syntax. Once you have selected active fields as your indexing automation method in the behaviors preference panel, you can right mouse click on the fields window and choose "setup field actions" from the resulting menu.
This action provides you with a tabbed dialog box for configuring each active field. Active field functionality falls into two categories.
Populate
All of the tabs on the left are populate categories. For example, a field can be populated using a barcode zone, or an OCR zone, or a rule.
Extras
The right most tab is considered the "extras" category. The extras category is used for the application of additional features that are not related to the population of the field. For example, a mandatory field, or a field that must match a specific syntax, is configured using the extras tab.
Active Field Setup
When the setup field actions dialog appears you can choose the field that you want to apply an automation feature to using the drop down menu in the upper left of the dialog. As you tab along the panes of the dialog, you will see each type of automation feature that can be applied to the selected field. To choose a class of automation for the selected field, click on the "enable" checkbox from within that pane. Multiple types of automated population features can be applied to the same field. However, the automation features must not be self contradictory. One or more extras features can be applied to a field, in addition to its population features.
Barcode Zone
You can populate a field from a barcode zone. To set the zone for a barcode, click on the "set sample document" button at the bottom left of the pane. This sample document should represent the kind of documents that will be scanned. Drag a rectangle around the barcode and the screen will zoom into the barcode region. You can adjust the zone by entering new coordinates into the upper left coordinate controls. Be sure to leave yourself enough room around the edges of the barcode so that the placement of the barcode can wiggle around a little bit. Try to keep just the barcode in the zone, and try to eliminate any extra text or lines that might interfere with the recognition process. Next choose the type of barcode from the array of barcode type checkboxes that are available. If none of the barcode checkboxes are checked then the software will perform a general analysis of the barcode and try to recognize it. However, the system will function faster and more accurately if you can specify the type of barcode that you want to recognize. Once you have enabled a barcode zone, then this field will be automatically populated from the barcode zone when you click on the document from the input window.
OCR Zone
You can populate a field from a section of printed text. This feature requires the OCR feature to be enabled in the application. To set the zone for a section of printed text, click on the "set sample document" button at the bottom left of the pane. This sample document should be representative of the documents that will be scanned. Drag a rectangle around the OCR zone and the screen will zoom into the region. You can adjust the zone by entering new coordinates into the upper left coordinate controls. Be sure to leave yourself enough room around the edges of the zone so that the placement of the zone can wiggle around a little bit. Try to keep just the text that you want in the zone, and try to eliminate any extra text or lines that might interfere with the recognition process. Click on the "Test OCR Zone" button to see if the OCR engine correctly recognizes your zone from the sample document.
An additional feature that goes along with an OCR zone is the "Jump Field" feature. When the jump field checkbox is enabled for an OCR zone, the image screen will automatically jump to the part of the image where the OCR zone has been set, when the user tabs or clicks on that field in the input field window. The jump field feature is good for quickly double checking that the OCR engine has done its job properly.
An additional feature that goes along with an OCR zone "Jump Field" is the "No OCR" feature. This lets you configure a jump field that does not require a preliminary OCR of the zone. This feature is only available if the "Jump Field" feature is enabled.
Barcode Edge
You can automatically populate a field from a selected set of barcodes that are found along the edges of a document. You can tell the barcode edge detector which edges to look inside of, and what the width of each edge will be. You can also tell the barcode edge detector in what order to look for barcodes, for example top then bottom, or top then right, or any order that you choose.
For example, to detect a 3 of 9 barcode along a 1 inch wide border along the top of a 300 dpi scanned document, first check the 3 of 9 barcode checkbox in the list of barcode types along the upper right hand side of the dialog. Then double click on the "top" edge list entry along the upper left hand side of the dialog. This action will bring up a settings dialog box. Check the detect checkbox, and enter the value 300, because a 1 inch deep edge at 300 dpi is 300 pixels.
To see how this will lay out on a particular document, click on the "choose sample document" button and choose a good representative document to work with. This document will appear in a preview window below the controls that guide the automatic edge detection. A red line will run horizontally across the document showing where the top edge will be located as it falls on the sample document. Any of the chosen barcode types that appear within that edge will be detected, in any specified edge, in any specified order. This can be tested at any time by clicking on the "test barcode edges" button and making adjustments as needed. If a barcode is detected, the value of that barcode will appear in a dialog box. If no barcode is detected, the dialog box will come up empty.
To add another edge to the barcode edge detector, select the edge name from the list in the upper left hand corner of the dialog. Then click on the edit button. This action will bring up a settings dialog box. Check the detect checkbox for the selected edge and enter the pixel value for that edge.
Once you have done this, the preview window will show two lines, where each line shows the boundary of one of the chosen edges as it lies upon the sample document. All edges, top, bottom, left, and right, can be active, in combinations, or all at once. To change the order of precedence of the edge detections, click on the name of the edge and then click on the up or the down button as needed. The edges are processed in order from top to bottom in the list, according to the specifications that you have entered for each edge. The first barcode that is detected in an edge according to the rules that you have set up will be the value that is used to automatically populate the field.
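The arithmetic and the order of precedence described above can be sketched as follows. This is a hedged illustration, not the detector itself: all function names are hypothetical, barcode recognition is replaced by a dictionary of already decoded values, and the page size used in the example assumes a letter sized page (8.5 by 11 inches) scanned at 300 dpi.

```python
def edge_depth_px(inches: float, dpi: int) -> int:
    """Convert an edge depth in inches to pixels at the given resolution."""
    return round(inches * dpi)


def edge_box(edge: str, depth: int, width: int, height: int):
    """Return the (left, top, right, bottom) pixel rectangle for an edge strip."""
    if edge == "top":
        return (0, 0, width, depth)
    if edge == "bottom":
        return (0, height - depth, width, height)
    if edge == "left":
        return (0, 0, depth, height)
    if edge == "right":
        return (width - depth, 0, width, height)
    raise ValueError(f"unknown edge: {edge!r}")


def first_barcode(edge_order, decoded_by_edge):
    """Walk the edges in list order and return the first barcode value found.

    decoded_by_edge stands in for a real detector: it maps an edge name to
    the decoded value found in that strip, or None when nothing was found.
    """
    for edge in edge_order:
        value = decoded_by_edge.get(edge)
        if value:
            return value
    return None


if __name__ == "__main__":
    depth = edge_depth_px(1, 300)
    print(depth, edge_box("top", depth, 2550, 3300))
    print(first_barcode(["top", "right"], {"top": None, "right": "ABC123"}))
```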
Choice List
You can populate a field from a dropdown list. The list can be dynamically populated from an old style reference table over an active database connection, or it can be populated from a static list that is defined at the workstation. A dropdown list may or may not allow entries besides the entries that appear in the list. Static list entries can be initialized from reference tables. List entries can be sorted, or reverse sorted, or organized in any order that is desired.
Rule
You can populate a field from a rule. Rules are general and flexible and come in many types and flavors.
- One type of rule is a prefill rule that populates a field with the current date.
- One type of rule is a prefill rule that populates a field with a set value that you can configure. A value can be a number or a word, or any text value that is appropriate for the selected field type.
- One type of rule is a prefill rule that populates a field with the current Windows user name.
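The three prefill rules listed above can be sketched in a few lines. The rule names, the function name, and the ISO date format are assumptions made for illustration; `getpass.getuser()` is used here as a stand-in for reading the current Windows user name.

```python
import datetime
import getpass


def prefill(rule: str, fixed_value: str = "") -> str:
    """Return the value a prefill rule would place into a field (sketch)."""
    if rule == "current_date":
        # Assumption: ISO format (YYYY-MM-DD); the real format is not documented here.
        return datetime.date.today().isoformat()
    if rule == "fixed_value":
        return fixed_value
    if rule == "user_name":
        return getpass.getuser()
    raise ValueError(f"unknown prefill rule: {rule!r}")


if __name__ == "__main__":
    print(prefill("current_date"))
    print(prefill("fixed_value", "INVOICE"))
    print(prefill("user_name"))
```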
Auto Populate
You can populate sets of other fields based on the value of a particular field. If a field is set up as an autopopulate field, then when the user tabs over that field, or drops a menu down on that field, other fields can be filled in using a kind of lookup table. These lookup tables are stored in the database, and therefore an online connection is needed in order to use this feature. Once you have enabled a field as an autopopulate field, it can trigger the values in other fields. If one of the other fields has also been set up as an autopopulate field, then those values will cascade and also autopopulate. However, after a field has been filled in and triggered, it cannot be triggered again. This allows cascades of autopopulate fields, but cancels out looping autopopulate fields. In summary, one autopopulate field can automatically fill in the values of other fields based on a value, or one autopopulate field can fill in the values of other fields that in turn cause other autopopulate fields to fill in the values of other fields, but the process is not allowed to loop back on itself.
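The cascade-without-looping behavior can be sketched with a queue and a set of already triggered fields. This is only an illustration: the lookup table is shown as an in-memory dictionary rather than a database table, and the function name and field names are hypothetical.

```python
def auto_populate(trigger_field, value, lookup):
    """Fill in fields by cascading through a lookup table, never looping back.

    lookup maps (field, value) to a dict of {other_field: other_value}.
    Returns the final field values.
    """
    fields = {trigger_field: value}
    triggered = set()            # fields that have already fired once
    queue = [trigger_field]
    while queue:
        field = queue.pop(0)
        if field in triggered:
            continue             # a field can only be triggered once
        triggered.add(field)
        for other, other_value in lookup.get((field, fields[field]), {}).items():
            if other in triggered:
                continue         # already filled and triggered: no loop back
            fields[other] = other_value
            queue.append(other)  # the filled field may cascade further
    return fields


if __name__ == "__main__":
    table = {
        ("Customer", "ACME"): {"Region": "West"},
        ("Region", "West"): {"Manager": "Lee", "Customer": "OTHER"},
    }
    print(auto_populate("Customer", "ACME", table))
```

In this example the "Region" entry tries to write back to "Customer", which has already been triggered, so that write is ignored and the cascade ends.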
Extras
- One type of extra function requires that data be entered into the field as mandatory. The field must be filled in before the document can be indexed into the system.
- One type of extra function makes the current contents of a field repeat forward into the next indexing screen. Once this value is set, the data value automatically moves forward into the next document. If you change the data value, then the changed data value is automatically moved forward into the next document.
- The last type of extra rule is a very powerful rule for data integrity. This type of rule is a pattern recognition rule that uses standard regular expressions to match the entered data. Any sequence or combination of letters, numbers, and symbols can be enforced as a data integrity rule. The value will be a regular expression written in the standard regular expression syntax. This feature allows exact syntax matching, on any pattern, to be done on any field that is input into the system. An understanding of regular expression syntax is necessary in order to make use of this feature.
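The mandatory and pattern checks can be sketched as a small validation function. This is an illustration only: the function name and messages are hypothetical, and Python's `re` module is used as a stand-in, so details of the workstation's own regular expression dialect (described below) may differ.

```python
import re


def validate_field(value: str, mandatory: bool = False, pattern: str = "") -> list:
    """Return a list of problems that would block indexing (empty list = OK)."""
    problems = []
    if mandatory and not value.strip():
        problems.append("field is mandatory")
    # The whole value must match the pattern, not just part of it.
    if pattern and value and re.fullmatch(pattern, value) is None:
        problems.append("value does not match the required pattern")
    return problems


if __name__ == "__main__":
    case_number = r"[A-Z][A-Z]\-[0-9]+"   # two capital letters, a dash, digits
    print(validate_field("AB-1234", mandatory=True, pattern=case_number))
    print(validate_field("1234-AB", mandatory=True, pattern=case_number))
    print(validate_field("", mandatory=True, pattern=case_number))
```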
Using regular expressions
See also the documentation on regular expressions in Wikipedia.
Regular expressions can be built from characters and special symbols. There are some similarities between regular expressions and arithmetic expressions. The most basic elements of arithmetic expressions are numbers and expressions enclosed in parens ( ). The most basic elements of regular expressions are characters, regular expressions enclosed in parens ( ) and character sets. On the next higher level, arithmetic expressions have '*' and '/' operators, whereas regular expressions have operators indicating the multiplicity of the preceding element.
Most basic elements of regular expressions
Individual characters. e.g. "h" is a regular expression. In the string "this home" it matches the "h" in 'this' as well as the "h" at the beginning of 'home'. For non printable characters, one has to use either the notation \xhh, where h means a hexadecimal digit, or one of the escape sequences \n \r \t \v known from "C". Because the characters
* + ? . | [ ] ( ) - $ ^
have a special meaning in regular expressions, escape sequences must also be used to specify these characters literally:
\* \+ \? \. \| \[ \] \( \) \- \$ \^
Furthermore, use '\ ' to indicate a space, because this implementation skips spaces in order to support a more readable style.
Character sets enclosed in square brackets [ ].
e.g. "[A-Za-z_$]"
matches any alphabetic character, the underscore and the dollar sign (the dash (-) indicates a range),
e.g. [A-Za-z$_]
matches "B", "b", "_", "$" and so on. A ^ immediately following the [ of a character set means 'form the inverse character set'.
e.g. "[^0-9A-Za-z]"
matches non-alphanumeric characters.
Regular expressions enclosed in round parens ( ). Any regular expression can be used on the lowest level by enclosing it in round brackets.
References to an already defined regular expression. e.g. "$Ident" stands for a user defined regular expression previously defined. Think of it as a regular expression enclosed in round parens, which has a name.
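The basic elements above can be tried out with Python's `re` module, used here only as a stand-in for the workstation's own regular expression engine. Two differences are assumptions worth noting: Python does not skip spaces inside a pattern, and it has no "$Ident" style named references, so neither is shown.

```python
import re

# Character set: any letter, the underscore, or the dollar sign.
letters = re.compile(r"[A-Za-z_$]")

# Inverse character set: anything that is not a digit or a letter.
non_alnum = re.compile(r"[^0-9A-Za-z]")

# A special character must be escaped to be matched literally.
literal_star = re.compile(r"\*")

# Round parens group a regular expression into a single element.
grouped = re.compile(r"([A-Z])")

if __name__ == "__main__":
    for ch in ("B", "b", "_", "$", "7"):
        print(repr(ch), bool(letters.fullmatch(ch)))
    print(bool(non_alnum.fullmatch("#")), bool(non_alnum.fullmatch("a")))
    print(bool(literal_star.fullmatch("*")))
    print(bool(grouped.fullmatch("Q")))
```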
Operators indicating the multiplicity of the preceding element
Any of the above basic regular expressions can be followed by one of the special characters * + ? \i
- * meaning repetition (possibly zero times); e.g. "[0-9]*" not only matches "8" but also "87576" and even the empty string "".
- + meaning at least one occurrence; e.g. "[0-9]+" matches "8", "9185278", but not the empty string.
- ? meaning at most one occurrence; e.g. "[$_A-Z]?" matches "_", "U", "$", .. and ""
- \i meaning ignore case
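The multiplicity operators listed above behave the same way in Python's `re` module, which is used here only as a stand-in for the workstation's engine. The one exception is "\i": Python has no such suffix, so the sketch assumes the `re.IGNORECASE` flag as the closest equivalent. The helper name is hypothetical.

```python
import re


def matches(pattern: str, text: str, ignore_case: bool = False) -> bool:
    """True when the whole text matches the pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, text, flags) is not None


if __name__ == "__main__":
    # * : zero or more repetitions
    print(matches(r"[0-9]*", "8"), matches(r"[0-9]*", "87576"), matches(r"[0-9]*", ""))
    # + : at least one occurrence
    print(matches(r"[0-9]+", "9185278"), matches(r"[0-9]+", ""))
    # ? : at most one occurrence
    print(matches(r"[$_A-Z]?", "U"), matches(r"[$_A-Z]?", ""), matches(r"[$_A-Z]?", "UU"))
    # ignore case (Python flag standing in for \i)
    print(matches(r"[A-Z]+", "abc"), matches(r"[A-Z]+", "abc", ignore_case=True))
```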
Catenation of regular expressions
The regular expressions described above can be catenated to form longer regular expressions.
E.g. "[_A-Za-z][_A-Za-z0-9]*"
is a regular expression which matches any identifier of the programming language "C", namely the first character must be alphabetic or an underscore and the following characters must be alphanumeric or an underscore.
"[0-9]*\.[0-9]+"
describes a floating point number with an arbitrary number of digits before the decimal point and at least one digit following the decimal point. (The decimal point must be preceded by a backslash, otherwise the dot would mean 'accept any character at this place'). "(Hallo (,how are you\?)?)\i" matches "Hallo" as well as "Hallo, how are you?" in a case insensitive way.
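The catenated expressions above can be checked with Python's `re` module, again only as a stand-in for the workstation's engine. Two adjustments are assumptions: Python does not skip spaces in a pattern, so the spaces in the greeting are written exactly as they must appear, and the `re.IGNORECASE` flag takes the place of "\i".

```python
import re

IDENTIFIER = r"[_A-Za-z][_A-Za-z0-9]*"   # a "C" style identifier
FLOAT = r"[0-9]*\.[0-9]+"                # digits optional before the point
GREETING = r"Hallo(, how are you\?)?"    # optional second part


def matches(pattern: str, text: str, ignore_case: bool = False) -> bool:
    """True when the whole text matches the pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, text, flags) is not None


if __name__ == "__main__":
    print(matches(IDENTIFIER, "_tmp1"), matches(IDENTIFIER, "1abc"))
    print(matches(FLOAT, ".5"), matches(FLOAT, "3.14"), matches(FLOAT, "3."))
    print(matches(GREETING, "hallo", ignore_case=True))
    print(matches(GREETING, "HALLO, HOW ARE YOU?", ignore_case=True))
```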





