Releases · mindee/doctr

v1.0.1

v1.0.0

Note: docTR 1.0.0 requires python >= 3.10

What's Changed

Breaking Change

TensorFlow has been removed as a supported backend. docTR now comes with PyTorch as the default and only deep learning backend.

The installation options torch and tf have been removed. A plain install of docTR, without any backend extra, now comes with PyTorch support by default.
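A quick pre-flight check can help before upgrading. The sketch below is only an illustration: the helper names python_is_supported and module_available are hypothetical and not part of docTR, and the minimum version is taken from the note above.

```python
import importlib.util
import sys

# From the release note above: docTR 1.0.0 requires python >= 3.10.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter meets the documented minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def module_available(name):
    """Return True when a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("torch found:", module_available("torch"))
    print("doctr found:", module_available("doctr"))
```

Running the file as a script prints whether the current environment satisfies the Python requirement and whether torch and doctr can be found.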

Training script filenames have been updated to remove backend-specific extensions. For example:

recognition/train_pytorch.py → recognition/train.py
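If CI jobs or shell scripts still point at the old filenames, the rename can be applied mechanically. This is a minimal sketch that assumes every renamed script follows the same pattern as the example above (dropping a trailing _pytorch from the file stem); strip_backend_suffix is a hypothetical helper, not something shipped with docTR.

```python
from pathlib import PurePosixPath

# Assumption: only the "_pytorch" stem suffix was dropped, as in the example above.
BACKEND_SUFFIX = "_pytorch"


def strip_backend_suffix(script_path: str) -> str:
    """Map an old reference-script path to its backend-free name."""
    path = PurePosixPath(script_path)
    stem = path.stem
    if stem.endswith(BACKEND_SUFFIX):
        stem = stem[: -len(BACKEND_SUFFIX)]
    # Keep the directory and the extension, replace only the file stem.
    return str(path.with_name(stem + path.suffix))


if __name__ == "__main__":
    print(strip_backend_suffix("recognition/train_pytorch.py"))
```

Paths that already use the new naming are returned unchanged, so the helper can be run over a whole list of script paths.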

New features

A new crnn_vgg16_bn checkpoint was added

What's Changed

Breaking Changes 🛠

[BREAKING] Drop TensorFlow backend by @felixT2K in #1967

Bug Fixes

[bug] Fix viptr onnx export issue by @felixT2K in #1966
[Fix] Correct condition for image dilation in orientation estimation by @Razlaw in #1971

Improvements

[models] New crnn_vgg16_bn checkpoint by @felixdittrich92 in #1969
[CI] Add windows to build ci jobs by @felixdittrich92 in #1981

Miscellaneous

[misc] post release v0.12.0 by @felixT2K in #1965
[misc/quality] Adjust imports by @felixdittrich92 in #1984
[misc] Rename reference scripts & corr CI job paths by @felixdittrich92 in #1985

New Contributors

@Tanmay20030516 made their first contribution in #1980

Full Changelog: v0.12.0...v1.0.0

v0.12.0

Note: docTR 0.12.0 requires python >= 3.10
Note: docTR 0.12.0 requires either TensorFlow >= 2.15.0 or PyTorch >= 2.0.0

Warning

TensorFlow Backend Deprecation Notice

Using docTR with TensorFlow as a backend is deprecated and will be removed in the next major release (v1.0.0).
We recommend switching to the PyTorch backend, which is more actively maintained and supports the latest features and models.
Alternatively, you can use OnnxTR, which does not require TensorFlow or PyTorch.

This decision was made based on several considerations:

Allows better focus on improving the core library
Frees up resources to develop new features faster
Enables more targeted optimizations with PyTorch

Warning

This release is the last minor release supporting TensorFlow as backend

What's changed

New features

A new lightweight recognition model viptr_tiny was added
New built-in dataset added - COCO-Text V2
A new custom model loading interface

# NEW
model = vitstr_small(pretrained=False, pretrained_backbone=False)
model.from_pretrained("")  # local path or url to .pt or .h5

# Instead of depending on the backend
reco_params = torch.load('', map_location="cpu")
reco_model.load_state_dict(reco_params)
# Or with TensorFlow
reco_model.load_weights(..)
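The comment above says the argument may be a local path or a URL pointing to a .pt or .h5 file. As a rough illustration of telling those cases apart before calling the loader, here is a stdlib-only sketch. describe_checkpoint_source and the FORMATS table are hypothetical and say nothing about how docTR resolves the argument internally.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: the two extensions named in the comment above map to these backends.
FORMATS = {".pt": "PyTorch state dict", ".h5": "TensorFlow/Keras weights"}


def describe_checkpoint_source(source: str) -> tuple[str, str]:
    """Classify a checkpoint source as (location, format) from its string form."""
    parsed = urlparse(source)
    is_url = parsed.scheme in ("http", "https")
    location = "url" if is_url else "local path"
    # For URLs, look only at the path part so query strings are ignored.
    path = parsed.path if is_url else source
    suffix = PurePosixPath(path).suffix
    if suffix not in FORMATS:
        raise ValueError(f"unsupported checkpoint extension: {suffix!r}")
    return location, FORMATS[suffix]
```

Validating the extension up front gives a clearer error than a failed weight load later on.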

What's Changed

Breaking Changes 🛠

[Feat] Simplify and unify model loading - from_pretrained by @felixdittrich92 in #1915
[Build] Add TensorFlow deprecation warnings by @felixdittrich92 in #1948

New Features

[datasets] COCO-Text V2 integration by @sarjil77 in #1888
[references] Recognition - Allow built-in datasets usage by @sarjil77 in #1904
[Feat] PyTorch - VIP backbone and VIPTR recognition module by @lkosh in #1912

Bug Fixes

[Fix] Duplicated forward call in rec train scripts by @felixT2K in #1862
[Fix] Fix invalid hOCR format & PDF/A compatiblity with the kie preditor by @felixdittrich92 in #1870
[CI/CD] Fix conda dependency issue by @felixT2K in #1937
[Fix] Fix merge of short strings by @Razlaw in #1947
[bug] fix missing import by @felixT2K in #1958
Fix: Update min_loss only on improved validation loss to ensure best … by @sneakybatman in #1961

Improvements

[Docs] Fix typos by @jk4e in #1885
added russian vocab by @Madhavi258 in #1902
[FIXED] Add proper Hebrew Unicode chars vocab by @johnlockejrr in #1909
[classification] Add PyTorch pretrained VIP checkpoints by @felixdittrich92 in #1920
Use OpenCV instead of Shapely to compute area & length by @lachesis in #1922
[datasets] feat: add croatian vocabulary by @cyanic-selkie in #1923
[docs] Fix VIPTR docstring by @felixT2K in #1924
Migrating training scripts to torchrun by @lkosh in #1933
[datasets] Massively extend the pre-defined vocabs by @felixdittrich92 in #1928
[datasets] Update indic & arabic & other vocabs by @felixdittrich92 in #1941
[Fix] Improve merge for recognition of long words (#1936) by @Razlaw in #1939
[docs] Add tools section by @felixdittrich92 in #1950
Give instructions on how to enable GPU acceleration on apple silicon by @ulfaslak in #1955
[models] Add viptr tiny checkpoint by @felixdittrich92 in #1963

Miscellaneous

[misc] post release v0.11.0 & public docker cron job adjustment by @felixdittrich92 in #1860
[misc] Avoid tf onnx test to fail the CI & mypy fixes by @felixdittrich92 in #1903
[Docs] Fix typo in docstring by @jk4e in #1927
[misc] Fix mypy & merge test by @felixT2K in #1945
[docs] Add new community model (arabic) to docs by @felixT2K in #1956
[models] Drop viptr base config by @felixT2K in #1962
[misc] Increase minor version v0.12 by @felixT2K in #1964

New Contributors

@jk4e made their first contribution in #1885
@Madhavi258 made their first contribution in #1902
@johnlockejrr made their first contribution in #1909
@sebastianMindee made their first contribution in #1917
@lkosh made their first contribution in #1912
@lachesis made their first contribution in #1922
@cyanic-selkie made their first contribution in #1923
@ulfaslak made their first contribution in #1955
@sneakybatman made their first contribution in #1961

Full Changelog: v0.11.0...v0.12.0

v0.11.0

Note: docTR 0.11.0 requires python >= 3.10
Note: docTR 0.11.0 requires either TensorFlow >= 2.15.0 or PyTorch >= 2.0.0

What's changed

New features

Added torch.compile support (PyTorch backend)
Improved model training logging
Created a small labeling tool designed for docTR (early stage) --> doctr-labeler

Compile your model

Compiling your PyTorch models with torch.compile optimizes the model by converting it to a graph representation and applying backends that can improve performance.
This process can make inference faster and reduce memory overhead during execution.

Further information can be found in the PyTorch documentation

import torch
from doctr.models import (
    ocr_predictor,
    vitstr_small,
    fast_base,
    mobilenet_v3_small_crop_orientation,
    mobilenet_v3_small_page_orientation,
    crop_orientation_predictor,
    page_orientation_predictor,
)

# Compile the models
detection_model = torch.compile(fast_base(pretrained=True).eval())
recognition_model = torch.compile(vitstr_small(pretrained=True).eval())
crop_orientation_model = torch.compile(mobilenet_v3_small_crop_orientation(pretrained=True).eval())
page_orientation_model = torch.compile(mobilenet_v3_small_page_orientation(pretrained=True).eval())

# ocr_predictor is imported directly above, so it is called without a module prefix
predictor = ocr_predictor(detection_model, recognition_model, assume_straight_pages=False)

# NOTE: Only required for non-straight pages (`assume_straight_pages=False`) and non-disabled orientation classification
# Set the orientation predictors
predictor.crop_orientation_predictor = crop_orientation_predictor(crop_orientation_model)
predictor.page_orientation_predictor = page_orientation_predictor(page_orientation_model)

# `doc` is a document loaded beforehand (for example with doctr.io.DocumentFile)
compiled_out = predictor(doc)

What's Changed

New Features

[Feat] Add torch.compile support by @felixdittrich92 in #1791
feat: ✨ tqdm slack by @odulcy-mindee in #1837
[docs] Add note about labeling tool to training section by @felixdittrich92 in #1839
feat: ✨ ClearML training loss logging by @odulcy-mindee in #1844

Bug Fixes

[Fix] documentation deploy CI/CD job by @felixdittrich92 in #1781
[CI] Fix PR labeler job & Rollback doc deploy deps by @felixdittrich92 in #1786
[build] Fix tensorflow build dep by @felixdittrich92 in #1807
fix: 🐛 sanitize docker tag step by @odulcy-mindee in #1809
[Bug] Fix vocabs and add corresponding test case by @felixdittrich92 in #1813
[Bug] Replace mem leaking torch gaussian_blur in augmentations by @felixdittrich92 in #1822
[Fix] Fix tqdm slack message write : monkeypatch by @felixdittrich92 in #1852
[Fix] Fix loading targets in WILDRECEIPT init by @Razlaw in #1859

Improvements

[TF] Change eager mode by @felixdittrich92 in #1763
[models] Change Resize kwargs to args for each zoo predictor by @cmoscardi in #1765
[Docs] Add community docs by @felixdittrich92 in #1766
[CI/CD] update Dependabot by @felixdittrich92 in #1768
adding pytorch DDP script of detection task by @sarjil77 in #1777
docs: fix faulty code for prediction and recognition demos by @khanfarhan10 in #1800
feat: ✨ specify output_dir in reference scripts by @odulcy-mindee in #1820
[references] Unify sched + optim config and add AdamW as option by @felixdittrich92 in #1825
[misc] Update header year & clearer built-in datasets progress bar msg by @felixdittrich92 in #1831
Adding Gujarati Language support by @sarjil77 in #1845
[references] Update Logging by @felixdittrich92 in #1847

Miscellaneous

[misc] post release modifications v0.10.0 by @felixdittrich92 in #1756
Introducing docTR Guru on Gurubase.io by @kursataktas in #1760
[tests/style] Fix tf test and formatting by @felixdittrich92 in #1762
[build] Upgrade py 3.10 by @felixdittrich92 in #1770
[misc] Change prefered backend from tf to torch by @felixdittrich92 in #1779
bump: ⬆️ CUDA Version by @odulcy-mindee in #1769
[CI/code quality] Add clear github runner caches job + mypy fixes by @felixdittrich92 in #1783
ci(dependabot): remove frgfm from automatic reviewers by @frgfm in #1787
[CI] public docker job - change commit tag to version by @felixdittrich92 in #1789
[misc] Small DDP script adjustments by @felixT2K in #1793
[misc] Replace in deprecation dropped httpx.AsyncClient app arg by @felixT2K in #1802
[docs] Tiny documentation export page fix by @felixdittrich92 in #1824
[CI/CD] Replace deprecated parts conda & publish by @felixdittrich92 in #1842
[misc] Increase minor version to v0.11.0 by @felixdittrich92 in #1858

New Contributors

@agarkovv made their first contribution in #1758
@kursataktas made their first contribution in #1760
@cmoscardi made their first contribution in #1765
@sarjil77 made their first contribution in #1777
@Razlaw made their first contribution in #1859

Full Changelog: v0.10.0...v0.11.0

v0.10.0

Note: docTR 0.10.0 requires python >= 3.9
Note: docTR 0.10.0 requires either TensorFlow >= 2.15.0 or PyTorch >= 2.0.0

What's Changed

Soft Breaking Changes (TensorFlow backend only) 🛠

Changed the saving format from /weights to .weights.h5

NOTE: Please update your custom trained models and HuggingFace hub uploaded models, this will be the last release supporting manual loading from /weights.

New features

Added numpy 2.0 support @felixdittrich92
New and updated notebooks were added @felixdittrich92 --> notebooks
Custom orientation model loading @felixdittrich92
Additional functionality to control the pipeline when dealing with rotated documents @milosacimovic @felixdittrich92
Built-in datasets can now be loaded directly for detection with detection_task=True, comparable to the existing recognition_task=True @felixdittrich92

Disable page orientation classification

If you deal with documents that contain only small rotations (~ -45 to 45 degrees), you can disable the page orientation classification to speed up inference.
This will only have an effect with assume_straight_pages=False and/or straighten_pages=True and/or detect_orientation=True.

from doctr.models import ocr_predictor

model = ocr_predictor(pretrained=True, assume_straight_pages=False, disable_page_orientation=True)

Disable crop orientation classification

If you deal with documents that contain only horizontal text, you can disable the crop orientation classification to speed up inference.
This will only have an effect with assume_straight_pages=False and/or straighten_pages=True.

from doctr.models import ocr_predictor

model = ocr_predictor(pretrained=True, assume_straight_pages=False, disable_crop_orientation=True)
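The conditions under which the two disable flags take effect, as described above, can be written down as plain boolean logic. This is only a restatement of the release text, not docTR code: both function names are hypothetical, and the default argument values are assumed to mirror the ocr_predictor defaults.

```python
def page_orientation_flag_matters(
    assume_straight_pages: bool = True,
    straighten_pages: bool = False,
    detect_orientation: bool = False,
) -> bool:
    """True when disable_page_orientation would change behaviour, per the note above."""
    return (not assume_straight_pages) or straighten_pages or detect_orientation


def crop_orientation_flag_matters(
    assume_straight_pages: bool = True,
    straighten_pages: bool = False,
) -> bool:
    """True when disable_crop_orientation would change behaviour, per the note above."""
    return (not assume_straight_pages) or straighten_pages


if __name__ == "__main__":
    # With all defaults left untouched, neither flag has anything to disable.
    print(page_orientation_flag_matters(), crop_orientation_flag_matters())
```

Note that detect_orientation=True alone is enough for the page flag to matter, but not for the crop flag.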

Loading custom exported orientation classification models

You can now load your custom-trained orientation models; the following snippet demonstrates how:

from doctr.io import DocumentFile
from doctr.models import ocr_predictor, mobilenet_v3_small_page_orientation, mobilenet_v3_small_crop_orientation
from doctr.models.classification.zoo import crop_orientation_predictor, page_orientation_predictor

custom_page_orientation_model = mobilenet_v3_small_page_orientation("")
custom_crop_orientation_model = mobilenet_v3_small_crop_orientation("")

predictor = ocr_predictor(pretrained=True, assume_straight_pages=False, detect_orientation=True)

# Overwrite the default orientation models
predictor.crop_orientation_predictor = crop_orientation_predictor(custom_crop_orientation_model)
predictor.page_orientation_predictor = page_orientation_predictor(custom_page_orientation_model)

What's Changed

Breaking Changes 🛠

[TF] First changes on the road to Keras v3 by @felixdittrich92 in #1724
[Build] update minor version & update torch to >= 2.0 by @felixdittrich92 in #1747

New Features

Disable page and crop orientation by @milosacimovic in #1735

Bug Fixes

[Bug] fix straighten pages by @felixdittrich92 in #1697
[Fix] Remove image padding after rotation correction with straighten_pages=True by @felixdittrich92 in #1731
[datasets] Allow detection task for built-in datasets by @felixdittrich92 in #1717
[Bug] Fix eval scripts + possible overflow in Resize by @felixdittrich92 in #1715
[demo] Add missing viz dep for demo by @felixT2K in #1751

Improvements

[Datasets] Add Vietnamese letters by @MinhChien9 in #1693
feat: added ukrainian vocab by @holyCowMp3 in #1700
[orientation] Enable usage of custom trained orientation models by @felixdittrich92 in #1708
[demo] Automate doctr demo update via CI job by @felixdittrich92 in #1742
[TF] Move model building & unify train scripts by @felixdittrich92 in #1744
[demo/docs] Update notebook docs & minor demo update / fix by @felixT2K in #1755
[Reconstitution] Improve reconstitution by @felixdittrich92 in #1750

Miscellaneous

[misc] post release 0.9.1 by @felixT2K in #1689
[build] NumPy 2.0 support by @felixdittrich92 in #1709

New Contributors

@MinhChien9 made their first contribution in #1693
@holyCowMp3 made their first contribution in #1700
@milosacimovic made their first contribution in #1735

Full Changelog: v0.9.0...v0.10.0

v0.9.0

v0.8.1

v0.8.0

v0.7.0

Note: doctr 0.7.0 requires either TensorFlow >= 2.11.0 or PyTorch >= 1.12.0.
Note: We will release the missing PyTorch checkpoints with 0.7.1

What's Changed

Breaking Changes 🛠

We changed the preserve_aspect_ratio parameter to True by default in #1279
=> To restore the old behaviour you can pass preserve_aspect_ratio=False to the predictor instance
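To make the difference concrete, the arithmetic behind an aspect-preserving resize can be sketched as below. This assumes the usual scale-then-pad approach and is not taken from docTR's implementation; resize_keep_aspect and resize_stretch are hypothetical names, and the 1024 x 1024 target is an arbitrary example size.

```python
def resize_keep_aspect(height, width, target_h, target_w):
    """Return ((new_h, new_w), (pad_h, pad_w)) for a scale-then-pad resize."""
    # Use the smaller scale factor so the whole image fits inside the target.
    scale = min(target_h / height, target_w / width)
    new_h = min(target_h, max(1, round(height * scale)))
    new_w = min(target_w, max(1, round(width * scale)))
    # Whatever is left over has to be filled with padding.
    return (new_h, new_w), (target_h - new_h, target_w - new_w)


def resize_stretch(height, width, target_h, target_w):
    """Old-style behaviour: fill the target exactly and distort the content."""
    return (target_h, target_w), (0, 0)


if __name__ == "__main__":
    # A tall 2000 x 1000 page resized to a square target.
    print(resize_keep_aspect(2000, 1000, 1024, 1024))
    print(resize_stretch(2000, 1000, 1024, 1024))
```

For the tall page in the example, the aspect-preserving variant keeps the 2:1 shape and pads the width, whereas the stretched variant doubles the apparent width of every character.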

New features

Feat: Make detection training and inference Multiclass by @aminemindee in #1097
Now all TensorFlow models have pretrained weights by @odulcy-mindee
The docs were updated and corresponding model benchmarks were added by @felixdittrich92
Two new recognition models were added (ViTSTR and PARSeq) in both frameworks by @felixdittrich92 @nikokks

Addition of the KIE predictor

The KIE predictor is more flexible than the OCR predictor because its detection model can detect multiple classes in a document. For example, you can have a detection model that detects just dates and addresses in a document.

The KIE predictor makes it possible to use a detector with multiple classes together with a recognition model, with the whole pipeline already set up for you.

from doctr.io import DocumentFile
from doctr.models import kie_predictor

# Model
model = kie_predictor(det_arch='db_resnet50', reco_arch='crnn_vgg16_bn', pretrained=True)
# PDF
doc = DocumentFile.from_pdf("path/to/your/doc.pdf")
# Analyze
result = model(doc)

predictions = result.pages[0].predictions
for class_name in predictions.keys():
    list_predictions = predictions[class_name]
    for prediction in list_predictions:
        print(f"Prediction for {class_name}: {prediction}")

The KIE predictor results per page are in a dictionary format, with each key representing a class name and its value holding the predictions for that class.
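The per-page dictionary shape described above is easy to post-process with plain Python. The sketch below uses a made-up stand-in dictionary with strings as values; real docTR prediction objects carry more than a string, and the helper names flatten_predictions and count_per_class are hypothetical.

```python
def flatten_predictions(predictions):
    """Turn {class_name: [prediction, ...]} into a flat list of (class_name, prediction)."""
    return [
        (class_name, item)
        for class_name, items in predictions.items()
        for item in items
    ]


def count_per_class(predictions):
    """Count how many predictions each class produced."""
    return {class_name: len(items) for class_name, items in predictions.items()}


# Illustrative stand-in data only, not output from a real model.
page_predictions = {
    "date": ["2023-01-31"],
    "address": ["12 Example Street", "Springfield"],
}

if __name__ == "__main__":
    for class_name, item in flatten_predictions(page_predictions):
        print(f"{class_name}: {item}")
    print(count_per_class(page_predictions))
```

A flat list of pairs is often more convenient for export to CSV or a database than the nested dictionary.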

What's Changed

Breaking Changes 🛠

Feat: Make detection training and inference Multiclass by @aminemindee in #1097

New Features

feat: ✨ PyTorch Recognition Model Multi-GPU support by @odulcy-mindee in #1164
[Feat] Add PARSeq model TF and PT by @nikokks in #1205
[Feat] Predictor precision PT backend by @felixdittrich92 in #1204
feat: ✨ ClearML support for TensorFlow by @odulcy-mindee in #1257

Bug Fixes

fix classification model cuda move by @odulcy-mindee in #1125
fix: 🔧 docker api use GitHub repository by @odulcy-mindee in #1148
Error in unpacking archive of SROIE dataset by @HamzaGbada in #1178
[Fix] remove autogen version.py fix docs build and fix version identifier by @felixT2K in #1180
[FIX] Error in unpacking archive of CORD dataset by @HamzaGbada in #1190
chore(deps-dev): update docutils requirement from <0.20 to <0.21 by @dependabot in #1198
speed up VIT models and fix patch size by @felixdittrich92 in #1219
[Fix] PARSeq pytorch fixes by @felixdittrich92 in #1227
[Fix] PARSeq tensorflow fixes by @felixdittrich92 in #1228
[fix/chore] fix bug in tf det eval script / update dep version specifier by @felixdittrich92 in #1232
fix: 🐛 fix bug when training object detection by @aminemindee in #1254
[Fix] fix obj det train and suppress endless warning prints by @felixdittrich92 in #1267
[Fix] add ignore keys if classes differ - KIE training by @felixdittrich92 in #1271
change the way model is saved in ddp by @venkatapathy in #1289

Improvements

Improve pypdfium2 integration again by @mara004 in #1096
[build] replaces flake8 with ruff by @felixT2K in #1179
[datasets] Add IIIT HWS dataset by @felixT2K in #1199
feat: ✨ TF linknet_resnet18 checkpoint by @odulcy-mindee in #1231
[tests/bug] improve tests and fix a minor bug by @felixdittrich92 in #1229
[PyTorch] update transforms pytorch (classification / det / rec) by @felixdittrich92 in #1253
[docs] custom model load by @felixdittrich92 in #1263
feat: ✨ TF ViTSTR Small checkpoint by @odulcy-mindee in #1273
[predictor] aspect ratio true by default by @felixdittrich92 in #1279
feat: ✨ TF SAR Resnet31 checkpoint by @odulcy-mindee in #1281

Miscellaneous

chore: apply post release modifications v0.6.0 by @felixdittrich92 in #1081
chore: dev version downgrade from 0.7.0 to 0.6.1 by @felixdittrich92 in #1082
chore(deps-dev): update black requirement from <23.0,>=22.1 to >=22.1,<24.0 by @dependabot in #1140
chore(deps-dev): update docutils requirement from <0.18 to <0.20 by @dependabot in #1101
docs: Minor typo fix by @khanfarhan10 in #1150
Update utils.py by @weiwangmeta in #1177
[tests/TF/build] enable missing classification onnx tests and set tensorflow lower bound to 2.11 by @felixT2K in #1182
[build] update pytorch dependency by @felixT2K in #1188
[build] drop py3.6/3.7 support and update CI default to py3.8/3.9 by @felixT2K in #1184
[CI] change old cache action and skip TF classification onnx export temporarily by @felixT2K in #1201
[Fix] add missing mean/std defaults, add missing weight init for sar by @felixT2K in #1212
[classification] vit and magc_resnet checkpoints by @felixdittrich92 in #1221
[tests] update test cases by @felixT2K in #1233
chore: apply PIL major changes and increase min version specifier by @felixT2K in #1237
[chore]: Pypdfium2 compatibility fix by @felixT2K in #1239
[chore]: Replace tensorflow_addons by @felixdittrich92 in #1252
[style] Fix markdown style warnings by @felixdittrich92 in #1260
[docs] update export page to ONNX by @felixdittrich92 in #1261
[PyPi] Fix image display by @felixdittrich92 in #1268
[chore] increase version and update maintainers by @felixT2K in #1264
[demo] update models list for Tf / PT backend by @felixdittrich92 in #1280
[chore] update to new torchvision API in models as well by @felixT2K in #1291
[chore]: clean dependencies by @felixT2K in #1287
feat: ✨ TF Parseq checkpoint by @odulcy-mindee in #1305
feat: ✨ TF ViTSTR Base checkpoint by @odulcy-mindee in #1306
[docs] update benchmark page by @felixdittrich92 in #1234

New Contributors

@dependabot made their first contribution in #1140
@eltociear made their first contribution in #1119
@khanfarhan10 made their first contribution in #1150
@weiwangmeta made their first contribution in #1177
@HamzaGbada made their first contribution in #1178
@felixT2K made their first contribution in #1180
@nikokks made their first contribution in #1205
@odulcy made their first contribution in #1246
@venkatapathy made their first contribution in #1289

Full Changelog: v0.6.0...v0.7.0

v0.6.0

Highlights of the release:

Note: doctr 0.6.0 requires either TensorFlow >= 2.9.0 or PyTorch >= 1.8.0.

Full integration with Huggingface Hub (docTR meets Huggingface)

Loading from hub:

from doctr.io import DocumentFile
from doctr.models import ocr_predictor, from_hub
image = DocumentFile.from_images(['data/example.jpg'])
# Load a custom detection model from huggingface hub
det_model = from_hub('Felix92/doctr-torch-db-mobilenet-v3-large')
# Load a custom recognition model from huggingface hub
reco_model = from_hub('Felix92/doctr-torch-crnn-mobilenet-v3-large-french')
# You can easily plug these models into the OCR predictor
predictor = ocr_predictor(det_arch=det_model, reco_arch=reco_model)
result = predictor(image)

Pushing to the hub:

from doctr.models import recognition, login_to_hub, push_to_hf_hub
login_to_hub()
my_awesome_model = recognition.crnn_mobilenet_v3_large(pretrained=True)
push_to_hf_hub(my_awesome_model, model_name='doctr-crnn-mobilenet-v3-large-french-v1', task='recognition', arch='crnn_mobilenet_v3_large')

Documentation: https://mindee.github.io/doctr/using_doctr/sharing_models.html

Predefined datasets can also be used for the recognition task

from doctr.datasets import CORD
# Crop boxes as is (can contain irregular)
train_set = CORD(train=True, download=True, recognition_task=True)
# Crop rotated boxes (always regular)
train_set = CORD(train=True, download=True, use_polygons=True, recognition_task=True)
img, target = train_set[0]

Documentation: https://mindee.github.io/doctr/using_doctr/using_datasets.html

New models (both frameworks)

classification: VisionTransformer (ViT)
recognition: Vision Transformer for Scene Text Recognition (ViTSTR)

Bug fixes recognition models

MASTER and SAR architectures are now operational in both frameworks (TensorFlow and PyTorch)

ONNX support (experimental)

All models can now be exported into ONNX format (only TF mobilenet left for 0.7.0)

NOTE: a full production pipeline with ONNX / build is planned for 0.7.0 (the models can only be exported up to the logits, without any post-processing included)
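Since the exported models stop at the logits, turning them into text is left to the caller. As an illustration of what such a post-processing step might look like, here is a greedy CTC-style decode in plain Python. It assumes a CTC-trained recognition head (such as a CRNN) whose blank class is the last index; it is not docTR's own post-processor, and attention-based models need a different decode.

```python
def greedy_ctc_decode(logits, vocab, blank=None):
    """Greedy CTC decode: argmax per step, collapse repeats, drop blanks.

    logits: list of per-timestep score lists, each of length len(vocab) + 1.
    vocab:  string of characters; index i maps to vocab[i].
    blank:  index of the blank class (assumed to be len(vocab) if not given).
    """
    if blank is None:
        blank = len(vocab)
    # Best class index at each timestep.
    best = [max(range(len(step)), key=step.__getitem__) for step in logits]
    chars = []
    previous = None
    for idx in best:
        # Emit a character only when the class changes and is not the blank.
        if idx != previous and idx != blank:
            chars.append(vocab[idx])
        previous = idx
    return "".join(chars)


if __name__ == "__main__":
    toy_logits = [
        [5, 0, 0, 0],  # a
        [5, 0, 0, 0],  # a (repeat, collapsed)
        [0, 0, 0, 5],  # blank
        [5, 0, 0, 0],  # a (new character after the blank)
        [0, 5, 0, 0],  # b
        [0, 5, 0, 0],  # b (repeat, collapsed)
    ]
    print(greedy_ctc_decode(toy_logits, "abc"))
```

The blank between the two runs of "a" is what allows a doubled letter to survive the repeat-collapsing step.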

Further features

our demo is now also PyTorch compatible, thanks to @odulcy-mindee
it is now possible to detect the language of the extracted text, thanks to @aminemindee

What's Changed

Breaking Changes 🛠

feat: ✨ allow beam width > 1 in the CRNN postprocessor by @khalidMindee in #630
[Fix] TensorFlow SAR_Resnet31 implementation by @felixdittrich92 in #925

New Features

[onnx] classification models export by @felixdittrich92 in #830
feat: Added Vietnamese entry in VOCAB by @calibretaliation in #878
feat: Added Czech to the set of vocabularies in datasets/vocabs.py by @Xargonus in #885
feat: Add ability to upload PT/TF models to Huggingface Hub by @felixdittrich92 in #881
[feature][tf/pt] integrate from_hub for all tasks by @felixdittrich92 in #892
[feature] Part 2 from use datasets for recognition by @felixdittrich92 in #891
[datasets] Add MJSynth (Synth90K) by @felixdittrich92 in #827
[docu]: add documentation for datasets by @felixdittrich92 in #905
add a Slack Community badge by @fharper in #936
Feat/add language detection by @aminemindee in #1023
add ViT as classification model TF and PT by @felixdittrich92 in #1050
[models] add ViTSTR TF and PT and update ViT to work as backbone by @felixdittrich92 in #1055

Bug Fixes

[PyTorch][references] fix pretrained with different vocabs by @felixdittrich92 in #874
[classification] Fix cfgs by @felixdittrich92 in #883
docs: Fixed typo in installation instructions by @frgfm in #901
[Fix] imgur5k test by @felixdittrich92 in #903
fix: Fixed load_pretrained_params in PyTorch when ignoring keys by @frgfm in #902
[Fix]: Documentation add missing in vocabs and correct tab in sharing models by @felixdittrich92 in #904
Fix links in readme by @jsn5 in #937
[Fix] PyTorch MASTER implementation by @felixdittrich92 in #941
[Fix] MJSynth dataset: filter corrupted or missing images by @felixdittrich92 in #956
[Fix] SVT dataset: clip box values and add shape and label check by @felixdittrich92 in #955
[Fix] Tensorflow MASTER implementation by @felixdittrich92 in #949
[FIX] MASTER AMP and onnxruntime issue with master PT by @felixdittrich92 in #986
pytest-api test: fix ping server step by @odulcy-mindee in #997
docs/index: fix two minor typos by @mara004 in #1002
Fix orientation details export by @aminemindee in #1022
Changed return type of multithread_exec to iterator by @mtvch in #1019
[datasets] Fix recognition parts of SynthText and IMGUR5K by @felixdittrich92 in #1038
[Fix] rotation classifier input move to model device by @felixdittrich92 in #1039
[models] Vit: fix intermediate size scale and unify TF to PT by @felixdittrich92 in #1063

Improvements

chore: Applied post release modifications v0.5.1 by @felixdittrich92 in #870
[refactor][fix]: Part1 from use datasets for recognition task by @felixdittrich92 in #889
ci: Add swagger ping in API CI job by @frgfm in #906
[docs] Add naming conventions for upload models to hf hub by @felixdittrich92 in #921
docs: Improved error message of encode_string by @frgfm in #929
[Refactor] PyTorch SAR_Resnet31 make it ONNX exportable (again) by @felixdittrich92 in #930
Add support page in README by @jonathanMindee in #946
[references] Add eval recognition and update eval detection scripts by @felixdittrich92 in #933
update pypdfium2 dep and improve code quality by @felixdittrich92 in #953
docs: Moved need help section after code snippet by @frgfm in #959
chore: Updated TF requirements to fix grouped convolutions on CPU by @frgfm in #963
style: Fixed mypy and moved tool configs to pyproject.toml by @frgfm in #966
Updating the readme by @Atomme1 in #938
Update docs in using_doctr by @odulcy-mindee in #993
feat: add a basic example of text detection by @ianardee in #999
Add pytorch demo by @odulcy-mindee in #1008
[build] move requirements to pyproject.toml by @felixdittrich92 in #1031
Migrate static data from github to monitoring middleware. by @marvinmindee in #1033
Changes needed to be able to use doctr on AWS Lambda by @mtvch in #1017
[Fix] unify recognition dataset parts return signature by @felixdittrich92 in #1041
Updated README.md for custom fonts by @carl-krikorian in #1051
[refactor] detection script by @felixdittrich92 in #1060
[models] ViT add checkpoints and some rework to use pretrained ViT backbone in ViTSTR by @felixdittrich92 in #1072
upgrade pypdfium2 by @felixdittrich92 in #1075
ViTSTR disable pretrained backbone by default by @felixdittrich92 in #1080

Miscellaneous

[Refactor] commit tags by @felixdittrich92 in #871
Update io/pdf.py to new pypdfium2 API by @mara004 in #944
docs: Documentation the reason for keras version specifier by @frgfm in #958
[datasets] update IC / SROIE / FUNSD / CORD by @felixdittrich92 in #983
[datasets] revert whitespace filtering and fix svhn reco by @felixdittrich92 in #987
fix: update tensorflow-addons to match tensorflow version by @ianardee in #998
move transformers implementation to modules by @felixdittr...
