page.search_for method returns rects not in given clip


Describe the bug (mandatory)

If I am not mistaken, the page.search_for method takes a clip argument that restricts the search to that rectangle. In my case it somehow returns rects that lie outside the given clip (rect).

This is how the search was done:

import string

marAllRects = [page.search_for(i, clip=marginRect, quads=False) for i in string.ascii_lowercase + "0123456789"]
marAllRects = self.flatten(list(filter(None, marAllRects)))

marAllRects = [i for i in marAllRects if i in marginRect] # this is my hacky solution

page.draw_rect(marginRect, color=getColor("red"))

I expected a list of rects that are all contained in the parent marginRect, but that is not what came back.
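
For anyone reproducing this, the search-and-filter steps above can be sketched as a standalone function. This is only a sketch under assumptions: `flatten` and `search_chars_in_clip` are hypothetical names (the original `self.flatten` is not shown), `page` is assumed to be a PyMuPDF page whose `search_for` accepts `clip` and `quads`, and `rect in clip` is assumed to test containment the same way the workaround relies on.

```python
import string
from itertools import chain


def flatten(nested):
    """Flatten a list of lists by one level (stand-in for the unseen self.flatten)."""
    return list(chain.from_iterable(nested))


def search_chars_in_clip(page, clip, chars=string.ascii_lowercase + string.digits):
    """Search for each character within clip and keep only hits inside clip."""
    # One search per character; each call returns a (possibly empty) list of rects.
    hits = flatten(page.search_for(ch, clip=clip, quads=False) for ch in chars)
    # Workaround from this report: drop any rect that is not contained in the clip.
    return [r for r in hits if r in clip]
```

Calling `search_chars_in_clip(page, marginRect)` would then stand in for the three assignments above, with the filter applied in one place.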

Right after the code above, I ran:

for i in marAllRects:
    page.draw_rect(i, color=(0, 0, 0), fill=(0, 0, 0), overlay=False)
    if i not in marginRect:
        print("\n\n")
        print(marginRect)
        print(i)

It prints:

Rect(550.1610107421875, 0.0, 792.3422241210938, 782.3599853515625) # this is the clip
Rect(215.37942504882812, 680.1986694335938, 215.37942504882812, 690.1986694335938) # somehow included, although not contained in the clip
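
The two printed rects can be checked independently of PyMuPDF. The following is a minimal sketch using plain (x0, y0, x1, y1) tuples; `contains` is a hypothetical helper written for this illustration, not the library's own containment logic.

```python
def contains(outer, inner):
    """Return True if rectangle inner (x0, y0, x1, y1) lies inside rectangle outer."""
    ox0, oy0, ox1, oy1 = outer
    ix0, iy0, ix1, iy1 = inner
    return ox0 <= ix0 and oy0 <= iy0 and ix1 <= ox1 and iy1 <= oy1


# Values copied from the printed output above.
clip = (550.1610107421875, 0.0, 792.3422241210938, 782.3599853515625)
stray = (215.37942504882812, 680.1986694335938, 215.37942504882812, 690.1986694335938)

print(contains(clip, stray))   # False: the stray rect lies to the left of the clip
print(stray[2] - stray[0])     # 0.0: x0 equals x1, so the stray rect has zero width
```

Going only by the printed coordinates, the stray rect is entirely to the left of the clip and has zero width, which matches the thin vertical mark in the drawing.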

When all the rects are drawn, it looks like this:

Screenshot 2021-09-25 at 19 26 11

The thin black matchstick-like mark is the extra rect.

To Reproduce (mandatory)

Here is the pdf file I tested with:

[paper2].Small.pdf

My code is shown above; the clip is Rect(550.1610107421875, 0.0, 792.3422241210938, 782.3599853515625)

Expected behavior (optional)

page.search_for should return only rects that are contained in the given clip.

Screenshots (optional)

See above

Your configuration (mandatory)

best,
Don