Document Analyzer
Summary
Use the LEADTOOLS Document Analyzer SDK to automate finding and extracting information from images and documents in formats such as TIFF, PDF, XPS, DOC, XLS, HTML, RTF, and Text. Leverage its AI and machine learning to find all data of interest, even when the file layouts are completely different or the data is in various formats. Develop complex workflows to process and handle documents. The SDK is composed of two assemblies, Leadtools.Document.Analytics and Leadtools.Document.Unstructured.
Leadtools.Document.Analytics and Leadtools.Document.Unstructured are high-level .NET assemblies that identify document components and features, which can then be used to recognize and classify scanned documents. Together, the two assemblies provide a set of classes and interfaces for automated processing of unstructured forms. The framework offers a higher-level way to use LEADTOOLS information extraction on text and to act on the extracted results.
With the LEADTOOLS Document Analyzer SDK, developers get an API that is easy to use and easy to integrate into their applications. End users benefit from a straightforward interface.
The SDK is available for .NET and Java.
Key Features
- Automatically detect and extract data from any type of structured or unstructured form, document, or image with simple rule-based configurations
- Access predefined rules for data types such as SSN, address, and email address
- Analyze input in any supported image or document format
- Automate processes like redaction, data anonymization, and information sanitization
- Create rulesets to find, extract, collect, and act on information
- Obtain confidence ratings for results that can be used for accept/reject decisions
- Perform conditional searches
- Perform partial or full regular expression (regex) matching
- Process structured and unstructured forms, tables, documents, images, and mixed content
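To make the rule-based features above more concrete, the following is a minimal, library-independent sketch of how a ruleset with regex matching, confidence scores, and redaction might fit together. It does not use the LEADTOOLS API: `RuleSketch`, `Rule`, `Finding`, `extract`, `matchesFully`, and `redact` are hypothetical names, the two patterns are deliberately simplistic, and the fixed per-rule confidence values are placeholders rather than anything the SDK computes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch only; none of these types belong to the LEADTOOLS API.
final class RuleSketch {

    // A named rule: a regex plus a placeholder confidence assigned to its matches.
    static final class Rule {
        final String name;
        final Pattern pattern;
        final double confidence;

        Rule(String name, String regex, double confidence) {
            this.name = name;
            this.pattern = Pattern.compile(regex);
            this.confidence = confidence;
        }
    }

    // One extracted value together with its position in the input text.
    static final class Finding {
        final String rule;
        final String value;
        final int start;
        final int end;
        final double confidence;

        Finding(String rule, String value, int start, int end, double confidence) {
            this.rule = rule;
            this.value = value;
            this.start = start;
            this.end = end;
            this.confidence = confidence;
        }
    }

    // Illustrative patterns; real rules would need to be considerably stricter.
    static final List<Rule> RULES = Arrays.asList(
            new Rule("SSN", "\\b\\d{3}-\\d{2}-\\d{4}\\b", 0.9),
            new Rule("Email", "[\\w.+-]+@[\\w-]+(?:\\.[\\w-]+)+", 0.8));

    // Partial matching: collect every occurrence of every rule within the text.
    static List<Finding> extract(String text) {
        List<Finding> findings = new ArrayList<>();
        for (Rule rule : RULES) {
            Matcher m = rule.pattern.matcher(text);
            while (m.find()) {
                findings.add(new Finding(rule.name, m.group(), m.start(), m.end(), rule.confidence));
            }
        }
        return findings;
    }

    // Full matching: the entire candidate string must satisfy the rule's pattern.
    static boolean matchesFully(Rule rule, String candidate) {
        return rule.pattern.matcher(candidate).matches();
    }

    // Redaction: mask each finding at or above the threshold with '*' characters
    // of equal length; findings below the threshold are left untouched.
    static String redact(String text, List<Finding> findings, double minConfidence) {
        StringBuilder out = new StringBuilder(text);
        for (Finding f : findings) {
            if (f.confidence >= minConfidence) {
                for (int i = f.start; i < f.end; i++) {
                    out.setCharAt(i, '*');
                }
            }
        }
        return out.toString();
    }
}
```

In this sketch, calling `extract` on a line of text returns one `Finding` per match, and passing those findings to `redact` with a threshold masks only the values whose confidence meets it. That is one simple way to express an accept/reject decision. A real analyzer would derive confidence from the recognition results rather than from a constant attached to the rule.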