page.search_for method returns rects not in given clip
Describe the bug (mandatory)
If I am not mistaken, the page.search_for method takes a clip argument that restricts the search to that rectangle. In my case, it somehow returns rects that lie outside the given clip rect.
This is how the search was done:
marAllRects = [page.search_for(i, clip=marginRect, quad=False) for i in string.ascii_lowercase + "0123456789"]
marAllRects = self.flatten(list(filter(None, marAllRects)))
marAllRects = [i for i in marAllRects if i in marginRect]  # this is my hacky solution
page.draw_rect(marginRect, color=getColor("red"))
It should return a list of rects that are all contained in the parent marginRect, but it did not.
Just after the code above, I ran:
for i in marAllRects:
    page.draw_rect(i, color=(0, 0, 0), fill=(0, 0, 0), overlay=False)
    if i not in marginRect:
        print("\n\n")
        print(marginRect)
        print(i)
It prints:
Rect(550.1610107421875, 0.0, 792.3422241210938, 782.3599853515625)  # this is the clip
Rect(215.37942504882812, 680.1986694335938, 215.37942504882812, 690.1986694335938)  # somehow included, although not contained in the clip
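The two rectangles printed above can be checked by hand. The following is a minimal, dependency-free sketch of the containment filter, using plain `(x0, y0, x1, y1)` tuples as a stand-in for `fitz.Rect`. The helper names `contains`, `width` and `filter_to_clip` and the `inside` rectangle are made up for illustration, and the comparison is an assumption about what "contained in the clip" should mean; it is not necessarily how PyMuPDF's own `in` operator behaves.

```python
from typing import List, Tuple

# (x0, y0, x1, y1); a hypothetical stand-in for fitz.Rect
RectT = Tuple[float, float, float, float]


def contains(outer: RectT, inner: RectT) -> bool:
    """Return True if `inner` lies entirely within `outer` (edges included)."""
    ox0, oy0, ox1, oy1 = outer
    ix0, iy0, ix1, iy1 = inner
    return ox0 <= ix0 and oy0 <= iy0 and ix1 <= ox1 and iy1 <= oy1


def width(rect: RectT) -> float:
    """Horizontal extent of a rectangle."""
    return rect[2] - rect[0]


def filter_to_clip(rects: List[RectT], clip: RectT) -> List[RectT]:
    """Keep only the rectangles that are fully inside `clip`."""
    return [r for r in rects if contains(clip, r)]


# Values copied from the output in this report.
clip = (550.1610107421875, 0.0, 792.3422241210938, 782.3599853515625)
stray = (215.37942504882812, 680.1986694335938,
         215.37942504882812, 690.1986694335938)
# Made-up hit inside the clip, for illustration only.
inside = (600.0, 100.0, 610.0, 110.0)

print(contains(clip, stray))                  # False: x0 is left of the clip
print(width(stray))                           # 0.0: x0 equals x1
print(filter_to_clip([stray, inside], clip))  # only `inside` survives
```

Besides lying left of the clip, the stray rectangle has identical x0 and x1, so it has zero width. That may be a useful hint when looking for where it comes from.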
When all the rects are drawn, it looks like this:
You can see the tiny black matchstick-like mark; that is the extra rect.
To Reproduce (mandatory)
Here is the pdf file I tested with:
My code is written above, where the clip is Rect(550.1610107421875, 0.0, 792.3422241210938, 782.3599853515625)
Expected behavior (optional)
page.search_for should return only rects that are contained in the given clip.
Screenshots (optional)
See above
Your configuration (mandatory)
- macOS Big Sur 11.6
- Python 3.9.7
- PyMuPDF 1.18.17, installed with pip
best,
Don
