Using Amazon Textract with Amazon Augmented AI for processing critical documents | Amazon Web Services

Documents are a primary tool for record keeping, communication, collaboration, and transactions across many industries, including financial, medical, legal, and real estate. For example, millions of mortgage applications and hundreds of millions of tax forms are processed each year. Documents are often unstructured, which means the content’s location or format may vary between two otherwise similar forms. Unstructured documents require time-consuming and complex processes to enable search and discovery, business process automation, and compliance control. When using machine learning (ML) to automate processing of these unstructured documents, you can now build in human reviews to aid in managing sensitive workflows that require human judgment.

Amazon Textract lets you easily extract text and data from virtually any document, and Amazon Augmented AI (Amazon A2I) lets you easily implement human review of machine learning predictions. This post shows how you can take advantage of Amazon Textract and Amazon A2I to automatically extract highly accurate data from both structured and unstructured documents without any ML experience. Amazon Textract is directly integrated with Amazon A2I so you can, for example, easily get humans to review low-quality scans or documents with poor handwriting. Amazon A2I provides human reviewers with a web interface with the instructions and tools they need to complete their review tasks.

AWS takes care of building, training, and deploying advanced ML models in a highly available and scalable environment, and you can take advantage of these services with simple-to-use API actions. You can define the conditions in which you need a human reviewer by using the Amazon Textract form data extraction API and Amazon A2I. You can adjust these business conditions at any time to achieve the right balance between accuracy and cost-effectiveness. For example, you can specify that a human review the predictions (or inferences) an ML model makes about the document content if the model is less than 90% confident about its prediction. You can also specify which form fields are important in your documents and send those to human review.

You can also use Amazon A2I to send a random sample of Amazon Textract predictions to human reviewers. You can use these results to inform stakeholders about the model’s performance and to audit model predictions.

Prerequisites

This post requires you to have completed the following prerequisites:

Step 1: Creating a private work team

A work team is a group of people that you select to review your documents. You can create a work team from a workforce, which is made up of Amazon Mechanical Turk workers, vendor-managed workers, or your own private workers that you invite to work on your tasks. Whichever workforce type you choose, Amazon A2I takes care of sending tasks to workers. For this post, you create a work team using a private workforce and add yourself to the team to preview the Amazon A2I workflow.

To create and manage your private workforce, you can use the Labeling workforces page on the Amazon SageMaker console. In the console, you have the option to create a private workforce by entering worker emails or importing a pre-existing workforce from an Amazon Cognito user pool.

If you already have a work team for Amazon SageMaker Ground Truth, you can use the same work team with Amazon A2I and skip to the following section.

To create your private work team, complete the following steps:

After you create the private team, you get an email invitation. The following screenshot shows an example email:

After you click the link and change your password, you are registered as a verified worker for this team. The following screenshot shows the updated information on the Private tab.

Your one-person team is now ready, and you can create a human review workflow.

Step 2: Creating a human review workflow

You use a human review workflow to do the following:

For this post, you want to trigger a human review if the key Mail Address is identified with a confidence score of less than 99% or not identified by Amazon Textract in the document. For all other keys, a human review starts if a key is identified with a confidence score less than 90%.
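
The rules above can also be written down as an activation-conditions document. The following is a minimal sketch, not the exact configuration the console generates: the condition types (ImportantFormKeyConfidenceCheck, MissingImportantFormKey), the parameter names, and the "*" wildcard for all other keys are taken from the Amazon A2I activation-conditions schema as best recalled, so verify them against the current documentation. The helper name build_activation_conditions is hypothetical.

```python
import json


def build_activation_conditions(important_key="Mail Address",
                                important_threshold=99,
                                default_threshold=90):
    """Return activation conditions mirroring the rules described in this step.

    Assumption: condition and parameter names follow the A2I schema for
    Amazon Textract; check them against the current documentation.
    """
    return {
        "Conditions": [
            {
                "Or": [
                    {
                        # Important key found, but with confidence below 99%.
                        "ConditionType": "ImportantFormKeyConfidenceCheck",
                        "ConditionParameters": {
                            "ImportantFormKey": important_key,
                            "KeyValueBlockConfidenceLessThan": important_threshold,
                            "WordBlockConfidenceLessThan": important_threshold,
                        },
                    },
                    {
                        # Important key not found in the document at all.
                        "ConditionType": "MissingImportantFormKey",
                        "ConditionParameters": {"ImportantFormKey": important_key},
                    },
                    {
                        # Any other key identified with confidence below 90%.
                        "ConditionType": "ImportantFormKeyConfidenceCheck",
                        "ConditionParameters": {
                            "ImportantFormKey": "*",
                            "KeyValueBlockConfidenceLessThan": default_threshold,
                            "WordBlockConfidenceLessThan": default_threshold,
                        },
                    },
                ]
            }
        ]
    }


# Serialize to the JSON string form that a flow definition would carry.
conditions_json = json.dumps(build_activation_conditions(), indent=2)
print(conditions_json)
```

Keeping the thresholds as parameters makes it easy to adjust the balance between accuracy and review cost later without rewriting the structure.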

For model-monitoring purposes, you can also randomly send a specific percent of pages for human review. This is the third option on the Conditions for invoking human review page: Randomly send a sample of forms to humans for review. This post does not include this condition.
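
If you do want random sampling, it can be expressed as one more activation condition. This is a small sketch under the assumption that the condition type is named Sampling and takes a RandomSamplingPercentage parameter, as recalled from the Amazon A2I schema; the helper name sampling_condition is hypothetical.

```python
def sampling_condition(percent):
    """Return a condition that sends roughly `percent`% of pages to review.

    Assumption: "Sampling" and "RandomSamplingPercentage" match the A2I
    activation-conditions schema; verify before use.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    return {
        "ConditionType": "Sampling",
        "ConditionParameters": {"RandomSamplingPercentage": percent},
    }


# Example: audit about 5% of pages regardless of confidence scores.
audit_conditions = {"Conditions": [sampling_condition(5)]}
```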

In the next steps, you create a UI template that the worker sees for document review. Amazon A2I provides pre-built templates that workers use to identify key-value pairs in documents.

Step 3: Sending the document to Amazon Textract and Amazon A2I

In this section, you start a human loop using the Amazon Textract API and send a document for human review.

Call Amazon Textract Analyze Document API operation

This post uses the AWS CLI for the following steps. If you prefer using Jupyter, see the following sample notebook in the GitHub repo.

You call the Amazon Textract Analyze Document API to do the following:

For this post, you create the input payload to send to the Amazon Textract Analyze Document API call. In the following code, replace the following values:

{  
    "Document": {  
        "S3Object": {  
            "Bucket": "{s3_bucket}",  
            "Name": "{s3_key}"  
        }  
    },  
    "FeatureTypes": [  
        "FORMS"  
    ],  
    "HumanLoopConfig": {  
        "HumanLoopName": "{human-loop-name}",  
        "FlowDefinitionArn": "{flow_def_arn}"  
    }  
}  

Save this payload to /tmp/textract-a2i-input.json, then run the following command:

aws textract analyze-document --cli-input-json file:///tmp/textract-a2i-input.json
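
If you work in Python rather than the AWS CLI, the same request can be sketched with boto3. The helper names build_analyze_request and analyze_with_review are hypothetical, and the client is passed in as an argument so the request-building logic can be checked without AWS credentials. The call itself relies on the boto3 Textract client's analyze_document method.

```python
def build_analyze_request(bucket, key, human_loop_name, flow_definition_arn):
    """Build the same payload as the JSON input file used with the CLI."""
    return {
        "Document": {"S3Object": {"Bucket": bucket, "Name": key}},
        # FORMS asks Amazon Textract to extract key-value pairs.
        "FeatureTypes": ["FORMS"],
        "HumanLoopConfig": {
            "HumanLoopName": human_loop_name,
            "FlowDefinitionArn": flow_definition_arn,
        },
    }


def analyze_with_review(textract_client, bucket, key,
                        human_loop_name, flow_definition_arn):
    """Call AnalyzeDocument with a human loop configuration attached."""
    request = build_analyze_request(
        bucket, key, human_loop_name, flow_definition_arn
    )
    return textract_client.analyze_document(**request)


# Usage (requires boto3 and AWS credentials; values are placeholders):
#   import boto3
#   response = analyze_with_review(
#       boto3.client("textract"),
#       bucket="{s3_bucket}", key="{s3_key}",
#       human_loop_name="{human-loop-name}",
#       flow_definition_arn="{flow_def_arn}",
#   )
```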

The response to this call contains the inference from Amazon Textract and the evaluated activation conditions, which may or may not have led to the creation of a human loop. If a human loop is created, the output contains HumanLoopArn. You can track its status using the DescribeHumanLoop API. The following is the output format of the preceding CLI command:

{
    Blocks: [...], // Amazon Textract inference
    DocumentMetadata: {...},
    AnalyzeDocumentModelVersion: "1.0", 
    HumanLoopActivationOutput: {
        HumanLoopArn: "arn:aws:sagemaker:us-east-1:{account-id}:human-loop/{human-loop-name}", // successfully created human loop arn 
        HumanLoopActivationReasons: [
            "ConditionsEvaluation" // reason for human loop creation i.e. in this case, activation conditions were evaluated to true
        ], 
        // evaluated conditions explaining individual conditions evaluation and overall evaluation to True
        HumanLoopActivationConditionsEvaluationResults: ""
    }
}

If a human loop was not created, the output looks like the following and does not contain a HumanLoopArn:

{
    Blocks: [...], // Amazon Textract inference
    DocumentMetadata: {...},
    AnalyzeDocumentModelVersion: "1.0", 
    HumanLoopActivationOutput: {
        // evaluated conditions explaining individual conditions evaluation and overall evaluation to False
        HumanLoopActivationConditionsEvaluationResults: ""
    }
}
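
In code, the two cases can be told apart by checking for HumanLoopArn in the response. The sketch below assumes the response is a Python dict as returned by boto3 and that HumanLoopActivationConditionsEvaluationResults, when present and non-empty, is a JSON-formatted string. The helper names are hypothetical. The status check uses the describe_human_loop method of the boto3 sagemaker-a2i-runtime client, passed in as an argument.

```python
import json


def human_loop_arn(response):
    """Return the HumanLoopArn if a human loop was created, else None."""
    return response.get("HumanLoopActivationOutput", {}).get("HumanLoopArn")


def activation_results(response):
    """Parse the evaluated conditions, or return None if they are empty."""
    raw = response.get("HumanLoopActivationOutput", {}).get(
        "HumanLoopActivationConditionsEvaluationResults"
    )
    # Assumption: the service returns this field as a JSON string.
    return json.loads(raw) if raw else None


def human_loop_status(a2i_client, human_loop_name):
    """Look up the current status of a human loop by name."""
    description = a2i_client.describe_human_loop(HumanLoopName=human_loop_name)
    return description["HumanLoopStatus"]


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   a2i = boto3.client("sagemaker-a2i-runtime")
#   if human_loop_arn(response):
#       print(human_loop_status(a2i, "{human-loop-name}"))
```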

Repeat this step for each document that you want analyzed.
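
When you repeat the call for many documents, each request needs its own human loop name. The following sketch builds one request per S3 key and derives a name from the key plus a random suffix. The naming convention and the helper names are hypothetical. The sketch assumes that human loop names are limited to lowercase letters, digits, and hyphens, up to 63 characters, which you should confirm in the API reference.

```python
import re
import uuid


def loop_name_for(key):
    """Derive a unique, name-safe human loop name from an S3 object key."""
    # Lowercase, collapse anything that is not a letter or digit to a hyphen.
    stem = re.sub(r"[^a-z0-9]+", "-", key.lower()).strip("-")
    # Leave room for the suffix so the result stays under 63 characters.
    stem = stem[:50].strip("-")
    suffix = uuid.uuid4().hex[:8]
    return f"{stem}-{suffix}" if stem else suffix


def build_requests(bucket, keys, flow_definition_arn):
    """Build one AnalyzeDocument request per document key."""
    requests = []
    for key in keys:
        requests.append({
            "Document": {"S3Object": {"Bucket": bucket, "Name": key}},
            "FeatureTypes": ["FORMS"],
            "HumanLoopConfig": {
                "HumanLoopName": loop_name_for(key),
                "FlowDefinitionArn": flow_definition_arn,
            },
        })
    return requests
```

Each resulting dict can be passed to the Amazon Textract client as keyword arguments, one call per document.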

Step 4: Completing human review of your document

To complete a human review of your document, complete the following steps:

You see instructions and the first document to work on. You can use the toolbox to zoom in and out, fit the image, and reposition the document. See the following screenshot.

This UI is specifically designed for document-processing tasks. On the right side of the preceding screenshot, the key-value pairs are automatically pre-filled with the Amazon Textract response. As a worker, you can quickly refer to this sidebar to make sure the key-values are identified correctly (which is the case for this post).

When you select any field on the right, a corresponding bounding box appears, which highlights its location on the document. See the following screenshot.

In the following screenshot, Amazon Textract did not identify Mail Address. The human review workflow identified this as an important field. Even though Amazon Textract didn’t identify it, the worker task UI asks you to enter Mail Address details on the right side.

Step 5: Seeing results in your S3 bucket

After you submit your review of the document, the results are written back to the Amazon S3 output location you specified in your human review workflow. The following are written to a JSON file in this location:

You can use this information to track and correlate ML output with human-reviewed output. To see the results, complete the following steps:

The output file (output.json) is structured as follows:

{
    // the original request made to Textract for this human loop
    aiServiceRequest:
    {
        "Document": {        // bytes are currently not supported with human loop configuration
            "S3Object": {
                "Bucket": "{s3_bucket}",
                "Name": "{s3_key}"
            }
        },
        "FeatureTypes": [
            "FORMS" // configuration for extracting key value pairs
        ],
        "HumanLoopConfig": {
            "HumanLoopName": "{human-loop-name}",  // name of human loop
            "FlowDefinitionArn": "{flow_def_arn}", // human review workflow created above
            "DataAttributes": {
                "ContentClassifiers": [  // required for sending work to public workforce
                    "FreeOfAdultContent",
                    "FreeOfPersonallyIdentifiableInformation"
                ]
            }
        }
    },
    // the original response from Textract for the request made as above
    aiServiceResponse: 
    {
        Blocks: [...], // Amazon Textract inference
        DocumentMetadata: {...},
        AnalyzeDocumentModelVersion: "1.0", 
        HumanLoopActivationOutput: {
            HumanLoopArn: "arn:aws:sagemaker:us-east-1:{account-id}:human-loop/{human-loop-name}", // successfully created human loop arn 
            HumanLoopActivationReasons: [
                "ConditionsEvaluation" // reason for human loop creation i.e. in this case, activation conditions were evaluated to true
            ], 
            // evaluated conditions explaining individual conditions evaluation and overall evaluation to True
            HumanLoopActivationConditionsEvaluationResults: ""
        }        
    },
    // all parts of Amazon Textract's inference that matched the activation conditions & modified by humans       
    selectedAiServiceResponse: 
    {...}
}
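
To work with this file programmatically, you can load it as JSON (the comments above are annotations, not part of the file) and walk the Amazon Textract blocks in aiServiceResponse. The following is a minimal sketch that pairs each form key with its value by following the VALUE and CHILD relationships between KEY_VALUE_SET and WORD blocks. The helper names are hypothetical, and only the output fields shown above are assumed to exist.

```python
def block_text(block, blocks_by_id):
    """Join the text of the WORD blocks that are children of `block`."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = blocks_by_id.get(child_id, {})
            if child.get("BlockType") == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def key_value_pairs(blocks):
    """Map each form key to its value text and the key block's confidence."""
    by_id = {b["Id"]: b for b in blocks}
    pairs = {}
    for block in blocks:
        is_key = (block.get("BlockType") == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if not is_key:
            continue
        value_parts = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                for value_id in rel["Ids"]:
                    value_parts.append(block_text(by_id[value_id], by_id))
        pairs[block_text(block, by_id)] = {
            "value": " ".join(value_parts),
            "confidence": block.get("Confidence"),
        }
    return pairs


def summarize_output(output):
    """Collect the human loop details and extracted fields from output.json."""
    response = output["aiServiceResponse"]
    activation = response.get("HumanLoopActivationOutput", {})
    return {
        "human_loop_arn": activation.get("HumanLoopArn"),
        "reasons": activation.get("HumanLoopActivationReasons", []),
        "fields": key_value_pairs(response.get("Blocks", [])),
    }


# Usage: summarize_output(json.load(open("output.json")))
```

Comparing these machine-extracted fields with the reviewed values in the same file is one way to correlate ML output with human-reviewed output.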

Conclusion

This post has merely scratched the surface of what Amazon A2I can do. Amazon A2I is available in 12 Regions. For more information about Regions, see the AWS Region Table.

For more information about use cases like content moderation and sentiment analysis, see the Jupyter notebook page on GitHub. For more information about integrating Amazon A2I into any custom ML workflow, see over 60 pre-built worker templates on the GitHub repo and Use Amazon Augmented AI with Custom Task Types.


About the Authors

Anuj Gupta is the Product Manager for Amazon Augmented AI. He is focusing on delivering products that make it easier for customers to adopt machine learning. In his spare time, he enjoys road trips and watching Formula 1.

Pranav Sachdeva is a Software Development Engineer in AWS AI. He is passionate about building high performance distributed systems to solve real life problems. He is currently focused on innovating and building capabilities in the AWS AI ecosystem that allow customers to give AI the much needed human aspect.

Talia Chopra is a Technical Writer in AWS specializing in machine learning and artificial intelligence. She has worked with multiple teams in AWS to create technical documentation and tutorials for customers using Amazon SageMaker, Amazon Augmented AI, MxNet, and AutoGluon. In her spare time she enjoys taking walks in nature and meditating.