fix: tighten embedding truncation threshold by buua436 · Pull Request #16123 · infiniflow/ragflow
📝 Walkthrough
In `LLMBundle.encode`, the truncation guard condition is changed from `token_size > self.max_length` to `token_size > self.max_length * 0.95`, making the check consistent with the existing truncation target of `int(self.max_length * 0.95)`.
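To illustrate why the mismatch mattered, here is a minimal sketch of the two guards side by side. It is not the code from `api/db/services/llm_service.py`: the function names are hypothetical, and a plain list of tokens stands in for whatever tokenizer and truncation helper the real method uses. Only the two conditions and the `int(max_length * 0.95)` target come from the walkthrough above.

```python
def truncate_old(tokens: list[str], max_length: int) -> list[str]:
    """Hypothetical pre-change behaviour: guard at 100%, target at 95%."""
    if len(tokens) > max_length:
        return tokens[: int(max_length * 0.95)]
    return tokens


def truncate_new(tokens: list[str], max_length: int) -> list[str]:
    """Hypothetical post-change behaviour: guard and target both at 95%."""
    if len(tokens) > max_length * 0.95:
        return tokens[: int(max_length * 0.95)]
    return tokens


# An input of 98 tokens against a limit of 100 falls in the gap between
# the 95% target and the old 100% guard.
sample = ["tok"] * 98
old_result = truncate_old(sample, max_length=100)
new_result = truncate_new(sample, max_length=100)
```

With the old guard, the 98-token input passes through untouched even though it exceeds the 95% target. With the new guard, it is cut to the target length. Inputs already at or below the target are unaffected in both versions.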
Changes
Embedding Truncation Guard
| Layer / File(s) | Summary |
|---|---|
| **Truncation condition threshold** (`api/db/services/llm_service.py`) | The `if token_size > self.max_length` guard in `LLMBundle.encode` is changed to `if token_size > self.max_length * 0.95`, so the check now fires at the same 95% boundary used by the truncation target already computed on the next line. |
Estimated code review effort
🎯 1 (Trivial) | ⏱️ ~2 minutes
Poem
A bunny trims its token ears,
Not quite to the tip — just 5% clears.
The guard now matches the cut it would make,
Consistent thresholds for embedding's sake!
🐇✂️
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The title accurately describes the main change: adjusting the embedding truncation threshold from max_length to 95% of max_length. |
| Description check | ✅ Passed | The description includes the required section 'What problem does this PR solve?' with clear context, and properly identifies the change type as a bug fix. |