Fix MinerU table option sanitization by wangq8 · Pull Request #16118 · infiniflow/ragflow

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deepdoc/parser/mineru_parser.py` around lines 694-695, the condition `if
not table_enable:` incorrectly bypasses `_sanitize_section_text` for all block
types instead of just tables, because `table_enable` is true by default. Replace
this condition with a check that specifically verifies whether the current
section is a TABLE block type before skipping sanitization. This ensures that
non-table blocks (text, code, lists, images) are still sanitized, while raw HTML
is preserved only for TABLE sections. Additionally, add a logging statement in
this branch to record when sanitization is bypassed, following the coding
guideline to log new flows.
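
The requested change could look roughly like the sketch below. It is only an
illustration, not the code in `mineru_parser.py`: the `TABLE` label, the
`prepare_section` wrapper, and the body of `_sanitize_section_text` are
hypothetical stand-ins, since the real helper and block-type constants are not
shown here.

```python
import logging
import re

logger = logging.getLogger(__name__)

# Hypothetical block-type label; the parser's actual constant may differ.
TABLE = "table"


def _sanitize_section_text(text: str) -> str:
    """Stand-in sanitizer that strips HTML tags (assumed behaviour)."""
    return re.sub(r"<[^>]+>", "", text)


def prepare_section(text: str, block_type: str) -> str:
    """Skip sanitization only for TABLE sections; sanitize everything else."""
    if block_type == TABLE:
        # Log the bypass so the new flow is visible in the logs.
        logger.info(
            "Skipping sanitization for TABLE section (%d chars)", len(text)
        )
        return text
    return _sanitize_section_text(text)


raw_table = "<table><tr><td>1</td></tr></table>"
table_out = prepare_section(raw_table, TABLE)      # raw HTML kept
text_out = prepare_section("<b>hello</b>", "text")  # tags stripped
```

Keying the bypass on the block type rather than on a global option means a
non-table block can never skip sanitization, whatever the option's value.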