bpo-29698: _collectionsmodule.c: Replace while
loop with for
by Kojoley · Pull Request #396 · python/cpython (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation8 Commits1 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
It is clear for me that n++; while (--n)
and for (; n; --n)
are interchangable statements, but here is the prof of it http://coliru.stacked-crooked.com/a/a6fc4108b223e7b2.
According to asm out https://godbolt.org/g/heHM33 the for
loop is even shorter (takes less instructions).
While I believe that location of cmp
/jmp
instruction makes no sense and performance is the same, but I have made a benchmark.
Benchmark
Run on (4 X 3310 MHz CPU s)
02/27/17 22:10:55
Benchmark Time CPU Iterations
------------------------------------------------------------
BM_while_loop/2 13 ns 13 ns 56089384
BM_while_loop/4 17 ns 16 ns 40792279
BM_while_loop/8 24 ns 24 ns 29914338
BM_while_loop/16 40 ns 40 ns 20396140
BM_while_loop/32 84 ns 80 ns 8974301
BM_while_loop/64 146 ns 146 ns 4487151
BM_while_loop/128 270 ns 269 ns 2492862
BM_while_loop/128 267 ns 266 ns 2639500
BM_while_loop/512 1022 ns 1022 ns 641022
BM_while_loop/4096 8203 ns 8344 ns 89743
BM_while_loop/32768 66971 ns 66750 ns 11218
BM_while_loop/262144 545833 ns 546003 ns 1000
BM_while_loop/2097152 4376095 ns 4387528 ns 160
BM_while_loop/8388608 17654654 ns 17883041 ns 41
BM_for_loop/2 13 ns 13 ns 56089384
BM_for_loop/4 15 ns 15 ns 49857230
BM_for_loop/8 21 ns 21 ns 32051077
BM_for_loop/16 37 ns 37 ns 19509351
BM_for_loop/32 81 ns 80 ns 8974301
BM_for_loop/64 144 ns 128 ns 4985723
BM_for_loop/128 265 ns 263 ns 3205108
BM_for_loop/128 265 ns 266 ns 2639500
BM_for_loop/512 1036 ns 1022 ns 641022
BM_for_loop/4096 8314 ns 8344 ns 89743
BM_for_loop/32768 67345 ns 66750 ns 11218
BM_for_loop/262144 541310 ns 546004 ns 1000
BM_for_loop/2097152 4354986 ns 4387528 ns 160
BM_for_loop/8388608 17592428 ns 17122061 ns 41
#include <benchmark/benchmark.h>
#define MAKE_ROTL_BENCHMARK(name)
static void BM_##name(benchmark::State& state) {
while (state.KeepRunning()) {
int n = name(state.range(0));
}
}
/**/
int while_loop(int n) { int sum = 0; n++; while (--n) { sum += 1; } return sum; }
int for_loop(int n) { int sum = 0; for(; n; --n) { sum += 1; } return sum; }
MAKE_ROTL_BENCHMARK(while_loop) MAKE_ROTL_BENCHMARK(for_loop) BENCHMARK(BM_while_loop)->RangeMultiplier(2)->Range(2, 8<<4); BENCHMARK(BM_while_loop)->Range(8<<4, 8<<20); BENCHMARK(BM_for_loop)->RangeMultiplier(2)->Range(2, 8<<4); BENCHMARK(BM_for_loop)->Range(8<<4, 8<<20);
BENCHMARK_MAIN()
This is not trivial change. Existing code is used for purpose and changes should be well founded. Please open an issue on the tracker for discussion.
Mariatta changed the title
_collectionsmodule.c: Replace bpo-29698: _collectionsmodule.c: Replace while
loop with for
while
loop with for
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Block the PR: I would like to hear Raymond first, and clarify the rationale of the change.
I already assigned both the PR and the issue to @rhettinger for this purpose. Isn't this enough for blocking?
I already assigned both the PR and the issue to @rhettinger for this purpose. Isn't this enough for blocking?
CPython GitHub workflow changed recently: any Python core dev can now approve a change if tests pass. I prefer to explicitly block the PR to avoid mistakes. I'm still not confortable with the new (changing) workflow :-)
Sorry, I'm going to reject this one. We compile with -O3 and the code in question already compiled optimally.
The two formulations are sematically equivalent but I find that the current one more readable in that it matches my way of thinking about the problem (as a decrement-skip-on-zero). The proposed rewrite looks weird to my eyes and interferes with the way I think about the code.