pytesseract in Python: How to Build OCR Function | Python Central


Published: Wednesday 30th April 2025

Last Updated: Thursday 1st May 2025


pytesseract is a Python wrapper for Google's Tesseract-OCR Engine. It lets you extract text from images with just a few lines of code, which makes it useful for document digitization, data extraction, and image-to-text conversion. All you need is the basics of Python, which you will have if you keep visiting PythonCentral 🙂

In this detailed guide, we will learn how to use pytesseract effectively, including setup, usage examples, advanced techniques, best practices, common pitfalls, and tips for better OCR accuracy. Ready? Get. Set. Learn!

How to Install pytesseract

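Before you start using pytesseract, make sure you have Tesseract-OCR itself installed, because pytesseract only calls the tesseract program and does not include it. On Debian or Ubuntu you can typically install it with sudo apt install tesseract-ocr, on macOS with Homebrew using brew install tesseract, and on Windows by running a Tesseract installer such as the UB Mannheim build.

Once Tesseract is in place, install the pytesseract wrapper via pip by executing the command: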

pip install pytesseract

Then, install Pillow for image processing by executing the command:

pip install Pillow

How Can You Use pytesseract

Here is how you can extract text from an image using pytesseract:

from PIL import Image
import pytesseract

# Load the image
image = Image.open('ExampleImage.png')

# Perform OCR and print the recognized text
text = pytesseract.image_to_string(image)
print(text)

Specifying Tesseract Path in Windows

On Windows, if tesseract.exe is not on your PATH, tell pytesseract where to find it by setting this variable before you call any OCR function:

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
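If the same script has to run on several machines, hard-coding the path can be avoided. The following is a minimal sketch, not part of pytesseract itself: the helper name locate_tesseract is hypothetical, and it assumes the default Windows install location shown above. It uses only the standard library, so it can be tried before pytesseract is installed.

```python
import os
import shutil
import sys

# Default Windows install location, taken from the example above (an assumption
# about where the installer put the program on your machine).
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def locate_tesseract(windows_default=WINDOWS_DEFAULT):
    """Return a path to the tesseract executable, or None if it cannot be found."""
    # Prefer whatever is already on PATH; this works on every platform.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Fall back to the default Windows location only when running on Windows.
    if sys.platform.startswith("win") and os.path.isfile(windows_default):
        return windows_default
    return None


tesseract_path = locate_tesseract()
print(tesseract_path or "tesseract not found")

# With pytesseract installed, the result would be applied like this:
# import pytesseract
# if tesseract_path:
#     pytesseract.pytesseract.tesseract_cmd = tesseract_path
```

Checking PATH first means the explicit assignment is only needed on machines where the installer did not register Tesseract.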

How to Read Text from Different Languages

Here is how you can specify a language for OCR:

text = pytesseract.image_to_string(image, lang='fra') # For French

Each language needs its trained data file installed alongside Tesseract. You can download additional language packs from the official tessdata repository.
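Tesseract also accepts several languages at once when their codes are joined with a plus sign, for example eng+fra. Here is a small sketch of building that string; build_lang is a hypothetical helper name, not a pytesseract function.

```python
def build_lang(codes):
    """Join Tesseract language codes into the 'eng+fra' form accepted by lang=."""
    cleaned = [code.strip() for code in codes if code and code.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    return "+".join(cleaned)


print(build_lang(["eng", "fra"]))  # eng+fra

# With pytesseract and the language packs installed, usage would look like:
# text = pytesseract.image_to_string(image, lang=build_lang(["eng", "fra"]))
# pytesseract.get_languages() lists the languages Tesseract can currently see.
```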

Extracting Structured Data

Now that we have covered the basics, let us see some practical applications.

Getting Bounding Boxes for Characters

image_to_boxes returns one line per recognized character, along with the coordinates of its bounding box. Use this script to get them:

boxes = pytesseract.image_to_boxes(image)
print(boxes)
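The string returned by image_to_boxes has one line per character in the form "char left bottom right top page", with coordinates measured from the bottom-left corner of the image. Most drawing libraries measure from the top-left instead, so the values usually need flipping. The sketch below shows one way to parse and convert them; CharBox and parse_boxes are hypothetical names, and the sample string is made up for illustration.

```python
from typing import List, NamedTuple


class CharBox(NamedTuple):
    char: str
    left: int
    top: int      # distance from the top edge of the image
    right: int
    bottom: int   # distance from the top edge of the image


def parse_boxes(box_text: str, image_height: int) -> List[CharBox]:
    """Parse image_to_boxes output and flip y so the origin is the top-left corner."""
    boxes = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        left, bottom, right, top = (int(value) for value in parts[1:5])
        boxes.append(
            CharBox(parts[0], left, image_height - top, right, image_height - bottom)
        )
    return boxes


# Made-up sample in the same layout as the real output.
sample = "H 10 20 30 60 0\ni 34 20 40 60 0"
for box in parse_boxes(sample, image_height=100):
    print(box)

# With a real image: parse_boxes(pytesseract.image_to_boxes(image), image.height)
```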

Extracting Metadata at Word-Level

To extract word-level metadata from an image, you can use this script:

data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
print(data['text'])
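The dictionary returned by image_to_data holds parallel lists such as text, conf, left, top, width, and height, one entry per detected element. A common next step is to keep only the words Tesseract is reasonably sure about. The following is a sketch under assumptions: confident_words is a hypothetical helper, the threshold of 60 is an arbitrary starting point, and the sample dictionary is hand-written to mimic the real layout.

```python
def confident_words(data, min_conf=60.0):
    """Return words whose confidence is at least min_conf, with their boxes."""
    words = []
    columns = zip(
        data["text"], data["conf"],
        data["left"], data["top"], data["width"], data["height"],
    )
    for text, conf, left, top, width, height in columns:
        # conf may be an int, float, or string depending on the pytesseract
        # version; non-word entries carry -1 and empty text.
        if text.strip() and float(conf) >= min_conf:
            words.append(
                {"text": text, "conf": float(conf), "box": (left, top, width, height)}
            )
    return words


# Hand-written sample that mimics the structure of Output.DICT.
sample = {
    "text": ["", "Hello", "wrld", " "],
    "conf": ["-1", 96, "41.5", -1],
    "left": [0, 12, 80, 0],
    "top": [0, 10, 10, 0],
    "width": [200, 60, 55, 0],
    "height": [50, 22, 22, 0],
}
print(confident_words(sample))
```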

What is Preprocessing

Preprocessing cleans up an image before OCR to improve accuracy, because Tesseract performs best on clean, high-contrast input. Here is a basic example:

from PIL import ImageOps

# Convert the image to grayscale and increase contrast
image = ImageOps.grayscale(image)
image = ImageOps.autocontrast(image)

More advanced preprocessing typically involves upscaling small text, removing noise, and binarizing the image (turning it into pure black and white).
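As a rough illustration of upscaling, denoising, and binarizing, here is a sketch that uses only Pillow. The function name preprocess_for_ocr and the default values (2x upscaling, a threshold of 160) are assumptions to tune per document, not recommended settings.

```python
from PIL import Image, ImageFilter, ImageOps


def preprocess_for_ocr(image, scale=2, threshold=160):
    """Grayscale, upscale, denoise, and binarize an image before OCR."""
    gray = ImageOps.grayscale(image)
    # Upscaling helps when the text is small in the original scan.
    resized = gray.resize((gray.width * scale, gray.height * scale), Image.LANCZOS)
    # A small median filter removes isolated speckles.
    denoised = resized.filter(ImageFilter.MedianFilter(size=3))
    # Binarize: pixels brighter than the threshold become white, the rest black.
    return denoised.point(lambda pixel: 255 if pixel > threshold else 0)


# Demo on a generated light-gray image so the sketch runs without any file.
demo = preprocess_for_ocr(Image.new("RGB", (10, 5), (200, 200, 200)))
print(demo.size, demo.mode)

# With a real scan: text = pytesseract.image_to_string(preprocess_for_ocr(image))
```

A fixed threshold works for evenly lit scans; photos with shadows usually need an adaptive method from a library such as OpenCV.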

How to Perform OCR with PDF Files

You can use "pdf2image" to convert PDFs into images. Note that pdf2image relies on the Poppler utilities, which must be installed separately. As usual, we are going to use pip to install "pdf2image":

pip install pdf2image

Once you have installed it, use this script to convert each PDF page to an image and then perform OCR on it:

from pdf2image import convert_from_path

# convert_from_path returns one PIL image per page
images = convert_from_path('sample.pdf')
for img in images:
    print(pytesseract.image_to_string(img))
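For multi-page documents it is often handier to collect the text into one string with page markers instead of printing each page. The sketch below assumes a hypothetical helper named ocr_pages that receives the OCR function as an argument, so the page-joining logic can be tried without Tesseract or Poppler installed.

```python
def ocr_pages(pages, ocr_func):
    """Run ocr_func on each page and join the results with page markers."""
    chunks = []
    for number, page in enumerate(pages, start=1):
        text = ocr_func(page).strip()
        chunks.append(f"--- Page {number} ---\n{text}")
    return "\n\n".join(chunks)


# Stand-in "pages" and OCR function, only to show the output layout.
print(ocr_pages(["first", "second"], lambda page: f"  text of {page} page \n"))

# With the real libraries installed, usage would look like:
# from pdf2image import convert_from_path
# import pytesseract
# pages = convert_from_path('sample.pdf', dpi=300)
# print(ocr_pages(pages, pytesseract.image_to_string))
```

Raising the dpi value passed to convert_from_path generally gives Tesseract sharper input, at the cost of more memory per page.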

Some Advanced Use Cases

By now, you should be familiar with the basic use cases. Now it is time for some advanced real-world applications.

Best Practices

Common Errors and How to Fix Them

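The errors you are most likely to meet when working with pytesseract are a missing pytesseract package, a Tesseract program that cannot be found, and a Tesseract run that fails, for example because a requested language pack is not installed.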
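Three failures come up often in practice: the pytesseract package is not installed, the Tesseract program cannot be found, or Tesseract itself fails (for example because a language pack is missing). A minimal sketch of handling them follows. The wrapper name safe_ocr and its messages are hypothetical, while TesseractNotFoundError and TesseractError are the exception classes pytesseract exposes.

```python
def safe_ocr(image, lang="eng"):
    """Return (text, error_message); exactly one of the two is None."""
    try:
        # Imported here so that a missing package is reported like any other error.
        import pytesseract
    except ImportError:
        return None, "pytesseract is not installed; run: pip install pytesseract"

    try:
        return pytesseract.image_to_string(image, lang=lang), None
    except pytesseract.TesseractNotFoundError:
        return None, (
            "Tesseract program not found; install it or set "
            "pytesseract.pytesseract.tesseract_cmd to its full path"
        )
    except pytesseract.TesseractError as exc:
        return None, f"Tesseract failed (is the '{lang}' language pack installed?): {exc}"


# Usage with a PIL image:
# text, error = safe_ocr(Image.open('ExampleImage.png'), lang='fra')
# print(error if error else text)
```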

Wrapping Up

pytesseract helps Python developers add OCR capabilities to their applications with ease. Whether you are building document automation tools, digitizing printed media, or scraping screen data, pytesseract provides a versatile, open-source way to turn images into usable text.
