A Quick Tour Of Ephesoft

by Clifford Laurin, Ike Kavas, Michael Muller, and Pat Myers | September 2012 | Open Source

Ephesoft has two user interfaces. One is intended for use by operators to review and validate Ephesoft's classification, separation, and extraction. The other is intended for use by system administrators in the configuration of Ephesoft. Not all aspects of Ephesoft can be configured through the administrative interface, however. For some of the configuration, administrators will need to use a text editor to modify files in Ephesoft's installation directory.

Before we begin, it is helpful to understand some commonly used terms. A batch, or batch instance, is a set of document images that are processed together. A batch class is a set of rules for processing a batch.

This article by Pat Myers, Ike Kavas, Michael Muller, and Clifford Laurin, authors of Intelligent Document Capture with Ephesoft, provides a brief introduction to Ephesoft's user interfaces:

  • The administrative user interface
  • The operator user interface


The administrative user interface

The administrative user interface has five tabs across the top that provide access to key areas of the system. They are as follows:

  • Batch Class Management
  • Batch Instance Management
  • Workflow Management
  • Folder Management
  • Reports

 

Batch Class Management

The Batch Class Management interface allows administrators to create, edit, and delete batch classes. The batch class configuration is broken down into sections for workflow modules and plugins, document types and fields, e-mail configuration, batch class fields, scanner profiles, and CMIS import.

Administrators can configure six aspects of a batch class. They are as follows:

  • Modules and Plugins: Modules are the major steps in the workflow. Each module is implemented by a series of plugins. An administrator can configure the plugins that comprise a module by selecting the module and pressing the Edit button.
  • Documents and Fields: The Document Types tab is where the documents that will be processed in the batch class are specified. Fields can be specified per document. Extraction rules for automatic indexing of field values can be created as well.
  • Email Configuration: Ephesoft can process e-mail messages and attachments. Ephesoft is configured with authentication information to check mail on an account, and it will process any e-mail sent to that account.
  • Batch Class Fields: Batch class fields prompt users for batch-level information in the Web Scanner. They can also be used in scripts to persist information at the batch level.
  • Scanner Profiles: This is where administrators can configure Web scanners associated with each batch class.
  • CMIS Import: Ephesoft can monitor an existing document repository and import its documents for classification and extraction.

 

Batch Instance Management

Batch Instance Management within the administrative interface allows administrators to see the status of batches and restart in-flight workflows starting at a previous step in the workflow.

 

Workflow Management

The Workflow Management interface allows users to add new custom plugins and create dependencies. The following screenshot shows the Workflow Management interface:

Ephesoft is designed to accommodate customizations to fit any customer's needs. For this reason, Ephesoft incorporates the jBPM workflow engine. This gives Ephesoft the ability to adapt to specific customer requirements from the capture workflow perspective.

Ephesoft's capture workflow can be thought of as workflow within a workflow. The workflow for each batch class consists of major steps called modules. The main modules that come with Ephesoft are the following:

  • Import
  • Page Processing
  • Document Assembly
  • Document Review
  • Extraction
  • Document Validation
  • Export

Each module above is composed of a series of substeps called plugins. It is worth noting that some plugins may depend on other plugins. For example, the CMIS export plugin requires the CreateMultiPage Files plugin so that it can upload documents instead of individual pages.
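The relationship between modules, plugins, and plugin dependencies can be sketched as a small data model. The following Python is only an illustration under stated assumptions, not Ephesoft's implementation: the function name, the data layout, and the plugin identifiers are hypothetical, and the sketch assumes that a required plugin must run earlier in the workflow than the plugin that depends on it.

```python
# Hypothetical plugin identifiers; the names in a real installation may differ.
DEPENDENCIES = {
    "CMIS_EXPORT": ["CREATEMULTIPAGE_FILES"],
}


def missing_dependencies(workflow, dependencies):
    """Return (plugin, required) pairs where `required` has not run before `plugin`.

    `workflow` is an ordered list of (module_name, [plugin_names]) pairs.
    """
    seen = set()
    problems = []
    for _module, plugins in workflow:
        for plugin in plugins:
            for required in dependencies.get(plugin, []):
                if required not in seen:
                    problems.append((plugin, required))
            seen.add(plugin)
    return problems


# A workflow that satisfies the dependency, and one that does not.
valid_workflow = [("Export", ["CREATEMULTIPAGE_FILES", "CMIS_EXPORT"])]
broken_workflow = [("Export", ["CMIS_EXPORT"])]
```

Calling missing_dependencies(broken_workflow, DEPENDENCIES) reports the CMIS export plugin together with the plugin it is missing, while the valid workflow produces an empty list.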

Ephesoft allows developers to add custom modules and plugins to the Ephesoft capture workflow. The ability to add custom modules allows Ephesoft to extend to meet any document capture need. The ability to remove unused modules allows Ephesoft to run as efficiently as possible, maximizing the use of expensive server hardware.

Folder Management

The Folder Management interface allows the administrator to upload new and updated files for batch class creation.

Reports

Reporting can be enabled to provide administrators with statistics on the average time that batches, documents, and pages spend in each module or plugin. The administrator can filter by batch class, start and end date, and type of report.

To access reporting, click on the Reports tab in the administrative user interface.


The operator user interface

The operator interface has tabs across the top that provide access to four key features:

  • Home/Batch List
  • Batch Detail
  • Web Scanner
  • Upload Batch

 

Batch List

The operator's Home screen shows the batches that are in the review and validation steps and allows the user to select batches to process.

The Review process involves documents that could not be assigned to a document type with sufficient confidence, based on the trained samples. Operators can split and merge pages of documents and specify the document type for each document.

The Validation process involves index fields that could not be extracted or where the index values do not comply with the validation patterns specified for the field.
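The two routing decisions described above can be sketched in a few lines of Python. This is a hedged illustration rather than Ephesoft's actual logic: the function names, the 0 to 1 confidence scale, and the date pattern are assumptions made for the example.

```python
import re


def needs_review(confidence, threshold):
    """A document goes to Review when its classification confidence is below the threshold."""
    return confidence < threshold


def needs_validation(value, pattern):
    """A field goes to Validation when it is empty or does not fully match its pattern."""
    if not value:
        return True
    return re.fullmatch(pattern, value) is None


# Hypothetical validation pattern for a date field in YYYY-MM-DD form.
DATE_PATTERN = r"\d{4}-\d{2}-\d{2}"
```

With this pattern, a value such as "2012-09-01" passes, while an empty value or free text such as "Sept 2012" would be sent to an operator for validation.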

Batch Detail

The Batch Detail screen presents the operator with the next available batch for processing according to priority and batch date.
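Selecting the next batch by priority and batch date can be sketched as a simple ordering. The Python below is a minimal sketch with assumed details: the record layout is hypothetical, a smaller number is taken to mean a higher priority, and ties are broken in favor of the oldest batch.

```python
from datetime import datetime


def next_batch(batches):
    """Return the batch with the best priority, breaking ties by the oldest date."""
    if not batches:
        return None
    return min(batches, key=lambda b: (b["priority"], b["created"]))


# Hypothetical batch records for illustration only.
batches = [
    {"id": "BI1", "priority": 2, "created": datetime(2012, 9, 1, 9, 0)},
    {"id": "BI2", "priority": 1, "created": datetime(2012, 9, 2, 8, 0)},
    {"id": "BI3", "priority": 1, "created": datetime(2012, 9, 1, 17, 0)},
]
```

Here BI2 and BI3 share the best priority, so the older of the two, BI3, is selected.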

Web Scanner

The Web Scanner feature uses a Java applet to enable the operator to send content directly to the server from any TWAIN-enabled scanner.

The first time a user logs into the operator interface and selects the Web Scanner tab, the user will have to choose a scanner. When the user presses the Select Source button, all TWAIN devices that have been installed on the machine are shown. Once the scanner is selected, the user can select the appropriate batch class and start the scan job. Once scanning is complete, the user presses Stop and then the Finish button to start the batch processing.

The Web Scanner scans directly to the server. Depending on the size of the batch and the bandwidth of the network, you may want to consider using a desktop capture tool that submits jobs to Ephesoft via the monitored folders.
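Submitting a job through a monitored folder amounts to copying the scanned files to a location that Ephesoft watches. The sketch below shows one way a desktop tool might do this in Python. The function name and folder layout are hypothetical, and the sketch assumes that each batch is delivered as its own subfolder of the monitored folder.

```python
import shutil
from pathlib import Path


def submit_batch(scanned_files, monitored_folder, batch_name):
    """Copy scanned files into a new batch subfolder of the monitored folder."""
    target = Path(monitored_folder) / batch_name
    # Fail rather than overwrite if a batch with this name already exists.
    target.mkdir(parents=True, exist_ok=False)
    for source in scanned_files:
        shutil.copy2(source, target / Path(source).name)
    return target
```

Refusing to reuse an existing subfolder is a design choice for the sketch; it avoids mixing pages from two scan jobs into one batch.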

Upload Batch

Operators can submit PDF and TIF files directly to Ephesoft for processing using the Upload Batch feature. Once the documents are selected and uploaded, the operator can select the appropriate batch class and start the batch processing.


File system

The following are some important directories that are created when Ephesoft is installed. These are subdirectories beneath the Ephesoft installation directory:

  • Apache 2.2: Apache can be used in front of Ephesoft for load balancing and failover. It is included in the installation but not configured.
  • Application: The Ephesoft web application is installed in this directory.
  • Application/i18n, images, css: These directories contain files to customize and localize the Ephesoft application.
  • Application/native/RecostarPlugin: This plugin provides the image OCR functionality using the RecoStar engine.
  • Application/native/Tesseract-OCR: This plugin provides the image OCR functionality using the Tesseract engine.
  • Dependencies/gs, ImageMagick: Applications which Ephesoft uses for image manipulation are installed here.
  • Dependencies/licence-util, licensing: These directories contain tools to collect information needed to generate and install license keys.
  • Dependencies/luke: Luke is a tool that helps troubleshoot problems with Lucene indexes.
  • JavaAppServer: This directory contains the Tomcat configuration for Ephesoft.
  • JavaAppServer/conf/catalina/localhost: This is where the contexts are defined for Ephesoft; it is where URLs are bound to Java code.
  • WEB-INF/classes/META-INF: System configuration property files are stored in this directory.
  • Report: The configuration for the automated updating of reporting data is stored here.
  • SharedFolders/BC99: The configuration for each batch class is stored here. The contents of the batch class folder can be modified through the Folder Management interface by a batch class or system administrator.
    • CMIS-plugin-mapping: This folder contains the properties file that maps the document fields in Ephesoft to the content model of the CMIS endpoint. Note that the Batch Class Administration allows you to create document field names that contain spaces, but names containing whitespace do not work with CMIS export.
    • Fuzzydb-index
    • Image-classification-sample
    • Learn-index
    • Lucene-search-classification-samples
    • Recostar-extraction
    • Scripts
    • Test-extraction
    • Test-table
  • SharedFolders/final-drop-folder: Processed batches are placed in this folder pending export to another system.

Any contents in the SharedFolders folder can be modified through the Folder Management interface by a system administrator.
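Because document field names that contain whitespace do not work with CMIS export, it can be useful to check field names before configuring the CMIS mapping. The following Python is a minimal sketch of such a check; the function name and the sample field names are hypothetical.

```python
def fields_unsafe_for_cmis(field_names):
    """Return the field names that contain any whitespace character."""
    return [name for name in field_names if any(ch.isspace() for ch in name)]


# Hypothetical document field names for illustration only.
sample_fields = ["InvoiceNumber", "Invoice Date", "Total"]
```

For the sample list, only "Invoice Date" would be flagged and would need to be renamed before it is mapped to the CMIS content model.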

Summary

In this article, we looked at the administrative user's interface and the operator's interface to Ephesoft. We also looked at the installation directory on the file system.

 


About the Authors:


Clifford Laurin

Cliff has over 17 years of professional experience as a software engineer, including 11 years in the field of Enterprise Content Management. He is currently an ECM Architect at Zia Consulting.

Ike Kavas

Mr. Kavas has more than 12 years of experience in document imaging, document management, workflow, and systems. He is the founder and Chief Technology Officer of Ephesoft, Inc., responsible for product design and roadmap. Mr. Kavas is a serial entrepreneur with three successful companies. He has both a keen technical background, which he developed by implementing several multimillion-dollar projects for Fortune 100 companies, and outstanding sales and business experience, which he demonstrated by achieving and exceeding revenue-based goals.

Before founding Ephesoft, Inc., Mr. Kavas managed professional services at Kofax, Inc. and co-founded other technology companies in southern California. Mr. Kavas holds a Bachelor of Science degree in Electronics & Electrical Engineering and CDIA+ certification.

Michael Muller

Michael Muller is Director of Engineering at Zia Consulting. He has 25 years of professional software development experience, currently specializing in enterprise content management.

Pat Myers

Pat Myers is the Executive Vice President and a co-founder of Zia Consulting, a content-centric solutions firm. Zia is a platinum Ephesoft and Alfresco partner that provides solutions from paper to mobile. Pat has over 10 years of Enterprise Content Management experience and 15 years of professional services and application development experience. Pat and Ike developed the official Ephesoft training.
