ci: Use multiple codegen units on non-dist bots by alexcrichton · Pull Request #44675 · rust-lang/rust (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation38 Commits1 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

alexcrichton

This commit is yet another attempt to bring down our cycle times by
parallelizing some of the long-and-serial parts of the build, for example
optimizing the libsyntax, librustc, and librustc_driver crate. The hope is that
any perf loss from codegen units is more than made up for with the perf gain
from using multiple codegen units.

The value of 16 codegen units here is pretty arbitrary, it's basically just a
number which hopefully means that the cores are always nice and warm.

@rust-highfive

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton

@aidanhs

@bors

📌 Commit 1f9b02b has been approved by aidanhs

alexcrichton added a commit to alexcrichton/rust that referenced this pull request

Sep 18, 2017

@alexcrichton

ci: Use multiple codegen units on non-dist bots

This commit is yet another attempt to bring down our cycle times by parallelizing some of the long-and-serial parts of the build, for example optimizing the libsyntax, librustc, and librustc_driver crate. The hope is that any perf loss from codegen units is more than made up for with the perf gain from using multiple codegen units.

The value of 16 codegen units here is pretty arbitrary, it's basically just a number which hopefully means that the cores are always nice and warm.

bors added a commit that referenced this pull request

Sep 18, 2017

@bors

@Mark-Simulacrum

@bors r-

There's only 2 cores usually though at least on Travis, so this would seem like too many? I'm not too sure how that works out in practice though...

bors added a commit that referenced this pull request

Sep 18, 2017

@bors

Rollup of 11 pull requests

@mattico

This isn't definitive data, but it's worth noticing that this build took about 10 minutes longer than every other recent PR.

@aidanhs

After a brief look, it looks like it might be making compilation of rustc itself a touch faster, but building+running the tests a bunch slower. At minimum it'd be good to see the effect on full builds.

@alexcrichton

This PR is the likely cause of https://ci.appveyor.com/project/rust-lang/rust/build/1.0.4720, so I'll look into that when I get a chance.

@Mark-Simulacrum note that the number of cores and the number of codegen units don't tend to have a correlation on one another. With the support nowadays we'll never run more than 2 jobs in parallel, and trans will even attempt to finish codegen on some translation units before it moves on to even start translating the next unit.

I selected 16 here knowing that Travis only has 2 cores to ensure that we always keep the builders hot. Otherwise if we generate 2 codegen units one of them could finish codegen instantly while the other would spend the whole time translating, not actually getting us any benefit. The hope here is that with enough codegen units we can keep the cpus nice and busy and make sure that one's not idle for too much longer while the last cgu is finishing.

@mattico thanks for looking I'll probably turn this down to like 4 or 6 to have a minimal impact for now on compile times. Most of the time this just means there's missing #[inline] annotations.

@Mark-Simulacrum

It may be worth compiling the core crates and tools (libstd, libtest, librustc) with codegen units set to ~16 and then compile tests, which generally compile very fast, without codegen units, since I imagine the added parallelism there may be hurting us.

Feel free to re-r+ whenever (from my comment, at least). I didn't know that we never run more than two codegen units at the same time -- that seems like it would hurt people with more powerful computers (3 or more CPU cores).

@michaelwoerister

Another thing to note is that rustc recently gained the ability to start LLVM already after the first CGU has been translated. So having more CGUs than cpu-cores potentially does make sense, since the second thread can start working earlier.

@aidanhs

@Mark-Simulacrum

I didn't know that we never run more than two codegen units at the same time -- that seems like it would hurt people with more powerful computers (3 or more CPU cores).

I don't think that's what @alexcrichton is saying. Instead, I believe the idea is that compilation is split into N bits of work, and then those bits of work are executed with as many units of parallelism as you have CPU cores. The problem is that you don't know if one of the bits of work is going to take 10x longer all the others, but if you do and you've just split into 2 units then you have one core idly for ages. By splitting into more you're reducing the probability of hitting this worst case (in theory).

@alexcrichton

@bors: r=aidanhs p=0

Hm ok I can't reproduce this locally, so I'm going to try to see the error on AppVeyor again. Also reduced to 8 codegen units.

@Mark-Simulacrum ah yes @aidanhs is right, the intention here is to reduce the chance of having one core running lots of work and one empty with work.

@bors

📌 Commit d670a77 has been approved by aidanhs

@alexcrichton

Also I believe the codegen-units setting here only applie to rustc/std, not any tests.

@michaelwoerister

Could we get more CPU cores on the machines that run large test suites? Running tests scales pretty well with the number of cores.

@alexcrichton

@michaelwoerister

Try adding echo -f "16" > /dev/cpu/num-cpus to your docker image. Don't forget the -f. That one is needed in order to work around travis's licensing model and the laws of physics.

@bors

⌛ Testing commit d670a7775309e9c0233907b426bd76e2c356004b with merge 6e7b5cd731ae1549bf0c52fa71af137046a5ec42...

@alexcrichton

@bors

💡 This pull request was already approved, no need to approve it again.

@bors

📌 Commit d670a77 has been approved by aidanhs

@alexcrichton

@bors

📌 Commit e157893 has been approved by aidanhs

@bors

⌛ Testing commit e157893859643e6fdecb21a9254f735bb2f4d926 with merge a47067e228c2ce4bcf5197be5f8fada509718c08...

@bors

@alexcrichton

This commit is yet another attempt to bring down our cycle times by parallelizing some of the long-and-serial parts of the build, for example optimizing the libsyntax, librustc, and librustc_driver crate. The hope is that any perf loss from codegen units is more than made up for with the perf gain from using multiple codegen units.

The value of 8 codegen units here is pretty arbitrary, it's basically just a number which hopefully means that the cores are always nice and warm. Also a previous version of this commit bounced on Windows CI due to libstd being compiled with multiple codegen units, so only the compiler is now compiled with multiple codegen units.

@alexcrichton

@bors

📌 Commit 27c26da has been approved by aidanhs

aidanhs

let cgus = if mode == Mode::Libstd {
self.config.rust_codegen_units
} else {
self.config.rustc_codegen_units.unwrap_or(self.config.rust_codegen_units)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make rustc_codegen_units do what it says on the tin and just apply to rustc (rather than it being an 'all-but-std' flag)? That seems to be where the majority of time goes anyway when I build the compiler.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure yeah. This I expect to be one of those "obscure options that no one ever touches", but I can change it after this PR lands or if it bounces.

@bors

⌛ Testing commit 27c26da with merge f7b98020a75f8e6701715473ffd3896bcc35a6fd...

@bors

@kennytm

@bors retry

The two macs 3-hour timed out. Not sure if related to the current Travis incident.

It may also mean using multiple CGUs may slow down on the mac CIs though.

@bors

bors added a commit that referenced this pull request

Sep 22, 2017

@bors

ci: Use multiple codegen units on non-dist bots

This commit is yet another attempt to bring down our cycle times by parallelizing some of the long-and-serial parts of the build, for example optimizing the libsyntax, librustc, and librustc_driver crate. The hope is that any perf loss from codegen units is more than made up for with the perf gain from using multiple codegen units.

The value of 16 codegen units here is pretty arbitrary, it's basically just a number which hopefully means that the cores are always nice and warm.

@alexcrichton

@bors r- retry

This caused rebuilds during testing I think that I want to investigate

@alexcrichton

Ok looking at the logs this is not clearly a win, the bootstrap was faster and the tests took way slower. Now I don't really expect predictable performance out of the OSX builders, but at least for now it seems like this is not the win we're hoping for.

@michaelwoerister

This was referenced

Jan 20, 2018

Labels

S-waiting-on-author

Status: This is awaiting some action (such as code changes or more information) from the author.