rustdoc: simplify JS search routine by not messing with lev distance by notriddle · Pull Request #105796 · rust-lang/rust

notriddle

@rustbot rustbot added S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

T-rustdoc

Relevant to the rustdoc team, which will review and decide on the PR/issue.

labels

Dec 16, 2022

GuillaumeGomez

@notriddle

Since the sorting function already accounts for the index field, there's not much reason to also adjust the Levenshtein distance. Instead, we can simply skip using lev as a filter when index already has a non-sentinel value.

This change gives the index and the path part slightly more weight as search criteria than they had before. Some test cases change as a result, but not in any obviously "worse" way. In particular, substring matches now count for more than Levenshtein distances (we're assuming that a typo is less likely than someone just not typing the entire name).
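As a rough illustration of the filtering rule described above, the sketch below keeps a candidate whenever it has a substring match (a non-sentinel index) and only falls back to an edit-distance cutoff otherwise. The names `keepResult`, `NO_MATCH`, and `MAX_LEV`, as well as the sentinel and cutoff values, are assumptions made for this example and are not taken from rustdoc's search code.

```javascript
// Hypothetical sketch; names, the sentinel, and the cutoff are assumptions.
const NO_MATCH = -1; // assumed sentinel: the query is not a substring of the name
const MAX_LEV = 2;   // assumed edit-distance cutoff

// `index` is the position of the query inside the item name, or NO_MATCH.
// `lev` is the Levenshtein distance between the query and the item name.
function keepResult(item) {
    if (item.index !== NO_MATCH) {
        // A substring match is kept regardless of its edit distance.
        return true;
    }
    // Without a substring match, edit distance acts as the filter.
    return item.lev <= MAX_LEV;
}

// Candidates for the query "insert" (distances computed by hand).
const candidates = [
    { name: "get_or_insert_with", index: 7, lev: 12 },
    { name: "insret", index: NO_MATCH, lev: 2 },
    { name: "iter", index: NO_MATCH, lev: 3 },
];
const kept = candidates.filter(keepResult).map(item => item.name);
console.log(kept);
```

Under these assumptions, the long name survives because it contains the query, the near-typo survives because of its small distance, and `iter` is dropped.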

Based on rust-lang#103710 (comment)

@notriddle

@bors bors added the S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

label

Jan 16, 2023

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request

Jan 18, 2023

@matthiaskrgr

…-stop-doing-demerits, r=GuillaumeGomez

rustdoc: simplify JS search routine by not messing with lev distance

The biggest change is the addition of a path_lev field to result items. It's always zero for type queries and for queries with no parent path part, which makes the check in the sortResults function a no-op in those cases. When it's present, it is used to give the parent path and the tail different precedence.

Consider the query hashset::insert, which is already covered by an existing test case. We want the ordering shown in the test case:

        { 'path': 'std::collections::hash_set::HashSet', 'name': 'insert' },
        { 'path': 'std::collections::hash_set::HashSet', 'name': 'get_or_insert' },
        { 'path': 'std::collections::hash_set::HashSet', 'name': 'get_or_insert_with' },
        { 'path': 'std::collections::hash_set::HashSet', 'name': 'get_or_insert_owned' },
        { 'path': 'std::collections::hash_map::HashMap', 'name': 'insert' },

We do not want this ordering, which is the ordering that would occur if substring position took priority over path_lev:

        { 'path': 'std::collections::hash_set::HashSet', 'name': 'insert' },
        { 'path': 'std::collections::hash_map::HashMap', 'name': 'insert' }, // BAD
        { 'path': 'std::collections::hash_set::HashSet', 'name': 'get_or_insert' },
        { 'path': 'std::collections::hash_set::HashSet', 'name': 'get_or_insert_with' },
        { 'path': 'std::collections::hash_set::HashSet', 'name': 'get_or_insert_owned' },

We also do not want HashSet::iter to appear before HashMap::insert, which is what would happen if path_lev took priority over the appearance of any substring match. This is why the sortResults function has path_lev sandwiched between an index < 0 check and an index comparison check:

        { 'path': 'std::collections::hash_set::HashSet', 'name': 'insert' },
        { 'path': 'std::collections::hash_set::HashSet', 'name': 'get_or_insert' },
        { 'path': 'std::collections::hash_set::HashSet', 'name': 'get_or_insert_with' },
        { 'path': 'std::collections::hash_set::HashSet', 'name': 'get_or_insert_owned' },
        { 'path': 'std::collections::hash_set::HashSet', 'name': 'iter' }, // BAD
        { 'path': 'std::collections::hash_map::HashMap', 'name': 'insert' },
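The precedence described above can be sketched as a comparator. This is a minimal illustration, not the actual sortResults code: the field names index, path_lev, and lev follow the text, while `compareResults`, the final tie-break on lev, and all sample values are assumptions.

```javascript
// Hypothetical comparator; only the ordering of the three checks comes from the text.
function compareResults(a, b) {
    // 1. Items with any substring match (index >= 0) sort before items without one.
    const aMissing = a.index < 0;
    const bMissing = b.index < 0;
    if (aMissing !== bMissing) {
        return aMissing ? 1 : -1;
    }
    // 2. Closer parent-path matches come first.
    if (a.path_lev !== b.path_lev) {
        return a.path_lev - b.path_lev;
    }
    // 3. Earlier substring positions come first.
    if (a.index !== b.index) {
        return a.index - b.index;
    }
    // 4. (Assumed) fall back to the edit distance of the name.
    return a.lev - b.lev;
}

// Sample values for the query hashset::insert; the numbers are illustrative.
const results = [
    { path: "HashMap", name: "insert", index: 0, path_lev: 1, lev: 0 },
    { path: "HashSet", name: "iter", index: -1, path_lev: 0, lev: 3 },
    { path: "HashSet", name: "get_or_insert_with", index: 7, path_lev: 0, lev: 12 },
    { path: "HashSet", name: "insert", index: 0, path_lev: 0, lev: 0 },
    { path: "HashSet", name: "get_or_insert", index: 7, path_lev: 0, lev: 7 },
    { path: "HashSet", name: "get_or_insert_owned", index: 7, path_lev: 0, lev: 13 },
];
const ordered = [...results]
    .sort(compareResults)
    .map(item => `${item.path}::${item.name}`);
console.log(ordered);
```

With these sample values, the HashSet substring matches come first, HashMap::insert follows them, and HashSet::iter lands last because it has no substring match at all.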

The old code implemented a similar feature by manipulating the lev member based on whether a substring match was found and averaging in the path distance (item.lev = name_lev + path_lev / 10), so the path lev wound up acting like a tie breaker. The new code gives slightly different results for Vec::new, and the test case changes to reflect the slight shift in ordering priority.
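To see why the old formula made the path distance behave like a tie breaker, the following sketch evaluates it for a few inputs. Only the expression name_lev + path_lev / 10 comes from the text; the helper name `oldCombinedLev` and the sample distances are assumptions.

```javascript
// Hypothetical helper wrapping the formula quoted above.
function oldCombinedLev(nameLev, pathLev) {
    return nameLev + pathLev / 10;
}

// Same name distance: the smaller path distance wins the tie.
const exactPath = oldCombinedLev(0, 0);
const nearPath = oldCombinedLev(0, 1);

// Different name distance: for path distances below 10, the name distance dominates.
const worseName = oldCombinedLev(1, 0);

console.log(exactPath < nearPath);  // path distance breaks the tie
console.log(nearPath < worseName);  // name distance outweighs a small path distance
```

Because the path distance is scaled down by ten, it can only reorder items whose name distances are equal or nearly so, which is the behavior the separate path_lev field now makes explicit.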

bors added a commit to rust-lang-ci/rust that referenced this pull request

Jan 19, 2023

@bors

…mpiler-errors

Rollup of 8 pull requests

Successful merges:

Failed merges:

r? @ghost @rustbot modify labels: rollup

@notriddle notriddle deleted the notriddle/rustdoc-search-stop-doing-demerits branch

January 19, 2023 13:53

bors added a commit to rust-lang-ci/rust that referenced this pull request

Feb 6, 2023

@bors

bors added a commit to rust-lang/miri that referenced this pull request

Feb 7, 2023

@bors

wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request

Mar 20, 2023

@he32

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request

Apr 8, 2023

@he32
