fix: tighten embedding truncation threshold by buua436 · Pull Request #16123 · infiniflow/ragflow

📝 Walkthrough

In LLMBundle.encode, the truncation guard condition is changed from token_size > self.max_length to token_size > self.max_length * 0.95, making the check consistent with the existing truncation target of int(self.max_length * 0.95).
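The effect of the changed guard can be illustrated with a small sketch. This is not the code from api/db/services/llm_service.py: the function names `old_guard` and `new_guard` are hypothetical, and a plain list stands in for whatever token counting and truncation LLMBundle.encode actually uses. Only the two conditions and the int(max_length * 0.95) target are taken from the summary above.

```python
def old_guard(tokens, max_length):
    """Hypothetical stand-in for the guard before the change."""
    # Truncation only fires once the input exceeds the full limit...
    if len(tokens) > max_length:
        # ...but then cuts down to 95% of the limit.
        return tokens[: int(max_length * 0.95)]
    return tokens


def new_guard(tokens, max_length):
    """Hypothetical stand-in for the guard after the change."""
    # Truncation now fires at the same 95% boundary it cuts to.
    if len(tokens) > max_length * 0.95:
        return tokens[: int(max_length * 0.95)]
    return tokens


if __name__ == "__main__":
    max_length = 512  # illustrative limit; 95% of it is 486.4, so the target is 486
    for size in (486, 487, 500, 600):
        tokens = list(range(size))
        print(size, len(old_guard(tokens, max_length)), len(new_guard(tokens, max_length)))
```

With an illustrative limit of 512, an input of 500 tokens passes through the old guard untouched, while an input of 600 tokens is cut to 486. The new guard cuts both to 486, so no input between 95% and 100% of the limit is sent through at its full length.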

Changes

Embedding Truncation Guard

| Layer / File(s) | Summary |
| --- | --- |
| Truncation condition threshold (api/db/services/llm_service.py) | The if token_size > self.max_length guard in LLMBundle.encode is changed to if token_size > self.max_length * 0.95, so the check now fires at the same 95% boundary used by the truncation target already computed on the next line. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

A bunny trims its token ears,
Not quite to the tip — just 5% clears.
The guard now matches the cut it would make,
Consistent thresholds for embedding's sake!
🐇✂️

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The title accurately describes the main change: adjusting the embedding truncation threshold from max_length to 95% of max_length. |
| Description check | ✅ Passed | The description includes the required section 'What problem does this PR solve?' with clear context, and properly identifies the change type as a bug fix. |

