Optimize insertion sort · Pull Request #40807 · rust-lang/rust
Conversation
This change slightly changes the main iteration loop so that LLVM can optimize it more efficiently.
Benchmark:
name before ns/iter after ns/iter diff ns/iter diff %
slice::sort_unstable_small_ascending 39 (2051 MB/s) 38 (2105 MB/s) -1 -2.56%
slice::sort_unstable_small_big_random 579 (2210 MB/s) 575 (2226 MB/s) -4 -0.69%
slice::sort_unstable_small_descending 80 (1000 MB/s) 70 (1142 MB/s) -10 -12.50%
slice::sort_unstable_small_random 396 (202 MB/s) 386 -10 -2.53%
The benchmark is not a fluke: performance on small_descending is consistently better after this change. I'm not 100% sure why this makes things faster, but my guess is that the compiler sees v.len()+1
as an expression that could in theory overflow, which gets in the way of optimization.
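For context, here is a minimal sketch of the two loop shapes involved. This is not the actual libcore code; `insert_into_sorted_prefix` is an illustrative helper. The old shape computes the upper bound as `v.len() + 1`, while the new shape iterates over `1..v.len()` and avoids the addition entirely:

```rust
/// Inserts `v[i]` into the already-sorted prefix `v[..i]`.
fn insert_into_sorted_prefix<T: Ord>(v: &mut [T], i: usize) {
    let mut j = i;
    while j > 0 && v[j - 1] > v[j] {
        v.swap(j - 1, j);
        j -= 1;
    }
}

/// Old shape: the upper bound is `v.len() + 1`, an expression the compiler
/// must treat as potentially overflowing.
fn insertion_sort_old<T: Ord>(v: &mut [T]) {
    for i in 2..v.len() + 1 {
        insert_into_sorted_prefix(&mut v[..i], i - 1);
    }
}

/// New shape: iterate over `1..v.len()` and insert element `i` directly,
/// so no `+ 1` appears in the loop bound.
fn insertion_sort_new<T: Ord>(v: &mut [T]) {
    for i in 1..v.len() {
        insert_into_sorted_prefix(v, i);
    }
}

fn main() {
    let mut a = [5, 3, 8, 1, 9, 2];
    insertion_sort_old(&mut a);
    assert_eq!(a, [1, 2, 3, 5, 8, 9]);

    let mut b = [5, 3, 8, 1, 9, 2];
    insertion_sort_new(&mut b);
    assert_eq!(b, [1, 2, 3, 5, 8, 9]);
}
```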
r? @aturon
(rust_highfive has picked a reviewer for you, use r? to override)
📌 Commit 2c816f7 has been approved by alexcrichton
alexcrichton added a commit to alexcrichton/rust that referenced this pull request
…=alexcrichton
Optimize insertion sort
bors added a commit that referenced this pull request
Rollup of 11 pull requests
- Successful merges: #40347, #40501, #40516, #40524, #40540, #40642, #40683, #40764, #40778, #40807, #40809
- Failed merges: #40771
Since this is internal to libstd/core, could you check whether the inclusive range syntax makes things better as well?
i.e. 2 ... v.len()
@nagisa Inclusive range syntax makes performance slightly worse, actually...
With the old insertion sort and inclusive range syntax, there's a bounds check at the beginning of the insertion sort. This PR removes that bounds check.
If you want to play with this, here's a playpen link.
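For reference, here is a sketch of what the inclusive-range variant looks like, written with today's `..=` syntax (in 2017 it was spelled `...`). This is only illustrative, not the code from the playpen link, but it shows where the slicing bounds check can survive:

```rust
// Illustrative only: the inclusive-range variant of the old loop shape.
fn insertion_sort_inclusive<T: Ord>(v: &mut [T]) {
    for i in 2..=v.len() {
        // `v[..i]` still needs `i <= v.len()` to be proven; with this loop
        // shape the compiler may keep a bounds check that the shape in
        // this PR avoids.
        let prefix = &mut v[..i];
        let mut j = prefix.len() - 1;
        while j > 0 && prefix[j - 1] > prefix[j] {
            prefix.swap(j - 1, j);
            j -= 1;
        }
    }
}
```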
⌛ Testing commit 2c816f7 with merge 04e47d7...
frewsxcv added a commit to frewsxcv/rust that referenced this pull request
…=alexcrichton
Optimize insertion sort
frewsxcv added a commit to frewsxcv/rust that referenced this pull request
…=alexcrichton
Optimize insertion sort
bors added a commit that referenced this pull request
ghost deleted the optimize-insertion-sort branch