feat: add a basic example of text detection by ianardee · Pull Request #999 · mindee/doctr

A script to extract text from files.

Codecov Report

Merging #999 (99bbc86) into main (23d1a1e) will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #999      +/-   ##
  Coverage   94.85%   94.83%   -0.02%
  Files         134      134
  Lines        5558     5558
  Hits         5272     5271       -1
  Misses        286      287       +1

Flag	Coverage Δ
unittests	94.83% <ø> (-0.02%)	⬇️

Impacted Files	Coverage Δ
doctr/transforms/functional/base.py	94.20% <0.00%> (-1.45%)	⬇️

Hi @ianardee 👋,
thanks for the PR :)
Some minor things:

- Could we also add the XML output?
- A (minor/lightweight) CI test is missing.
- Maybe rename to extract_text? Otherwise, as a user, I would expect to get only the box coords, wdyt?
- Missing .cuda() if the backend is PyTorch and a GPU is available (silent move).
- The pass can be removed.
- For the string export, you should use the .render() method.
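
For the `.cuda()` point, a minimal sketch of a silent move to GPU could look like the following. The helper name `move_to_gpu_if_available` is hypothetical and not part of the PR; it assumes the predictor is a regular PyTorch module exposing `.cuda()`, and it falls back to returning the model unchanged when PyTorch is not installed (for example with the TensorFlow backend).

```python
def move_to_gpu_if_available(model):
    """Return the model moved to GPU when PyTorch and CUDA are both available.

    Hypothetical helper: the model is returned unchanged in every other case,
    so the caller does not need to know which backend is active.
    """
    try:
        import torch  # imported lazily so the script still runs without PyTorch
    except ImportError:
        return model

    if torch.cuda.is_available():
        # nn.Module.cuda() moves the parameters in place and returns the module
        return model.cuda()
    return model
```

In the script this would wrap the predictor right after it is built, e.g. `model = move_to_gpu_if_available(model)`.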

Sorry for the lazy review, I tried to do it from my mobile phone 😅

LGTM, thanks

Thanks a lot for the script!

I added some comments if you do a follow-up PR :)

@@ -0,0 +1,118 @@
# Copyright (C) 2021-2022, Mindee.

# Copyright (C) 2022, Mindee.

This script didn't exist in 2021 :)

Comment on lines +48 to +49

for word in line["words"]:
    out_txt += word["value"] + " "

Perhaps

out_txt += " ".join(word["value"] for word in line["words"])

or wrapping more of the nested loops inside a list comprehension 🤷‍♂️
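
To make the `join` suggestion concrete, here is a small sketch that flattens an export-style dictionary into plain text. The function name `text_from_export` and the sample data are made up for illustration, and the nesting (`pages` → `blocks` → `lines` → `words`) is an assumption about the exported structure based on the loop quoted above. Unlike the `+= word["value"] + " "` loop, `join` does not leave a trailing space at the end of each line.

```python
def text_from_export(export):
    """Flatten an export-style dict into text, one output line per OCR line.

    Assumes the nesting pages -> blocks -> lines -> words, where each word
    is a dict carrying its text under the "value" key.
    """
    lines = []
    for page in export["pages"]:
        for block in page["blocks"]:
            for line in block["lines"]:
                # join puts a single space between words and none at the end
                lines.append(" ".join(word["value"] for word in line["words"]))
    return "\n".join(lines)


# Hand-written sample, not real OCR output
sample = {
    "pages": [
        {
            "blocks": [
                {
                    "lines": [
                        {"words": [{"value": "Hello"}, {"value": "world"}]},
                        {"words": [{"value": "docTR"}]},
                    ]
                }
            ]
        }
    ]
}

print(text_from_export(sample))
# Hello world
# docTR
```

As noted in the replies, the built-in `.render()` method on the result object already covers the plain-string case, so a helper like this would only matter if the script needs a custom layout.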

out.render() ?

Oh yeah actually, I had forgotten about this haha

model = ocr_predictor(args.detection, args.recognition, pretrained=True)
path = Path(args.path)
if path.is_dir():
    allowed = (".pdf", ".jpeg", ".jpg", ".png", ".tif", ".tiff", ".bmp")

Perhaps we should move this to the top of the file with the other constants?
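
One possible shape for that refactor, as a sketch only: the constant name `ALLOWED_EXTENSIONS` and the helper `list_supported_files` are hypothetical names, not taken from the PR. The extension tuple is copied from the snippet above, and the lowercase comparison is an added assumption so that files such as `scan.PNG` are also picked up.

```python
from pathlib import Path

# Module-level constant, defined once next to the other constants
ALLOWED_EXTENSIONS = (".pdf", ".jpeg", ".jpg", ".png", ".tif", ".tiff", ".bmp")


def list_supported_files(directory):
    """Return the files in `directory` whose extension is supported, sorted."""
    return sorted(
        p
        for p in Path(directory).iterdir()
        # compare in lowercase so that upper-case extensions are accepted too
        if p.is_file() and p.suffix.lower() in ALLOWED_EXTENSIONS
    )
```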

            fh.write(out_str)
else:
    out_str = _process_file(model, path, args.format)
    print(out_str)

In one case we dump the string into a file, and in the other we print it?
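
If the two branches should behave the same way, one option is to route both through a single helper. This is only a sketch: `write_output` and its `destination` parameter are hypothetical, and whether the script should default to a file or to stdout is a design decision left to the follow-up PR.

```python
import sys
from pathlib import Path


def write_output(text, destination=None):
    """Write `text` to `destination` if given, otherwise to standard output.

    Hypothetical helper: both the directory branch and the single-file branch
    could call it, so the output behaviour is decided in one place.
    """
    if destination is None:
        sys.stdout.write(text + "\n")
    else:
        Path(destination).write_text(text, encoding="utf-8")
```

With this, the directory branch would pass a per-file path and the single-file branch would pass either a path or nothing, depending on what the user asked for.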

Hi @ianardee 👋, any updates on the refactor PR? :)
