Make initial incremental/watch builds as fast normal builds by sokra · Pull Request #42960 · microsoft/TypeScript

Currently, just running tsc is much faster than tsc --watch or tsc --incremental. There are already multiple issues describing that problem (I didn't verify whether all of them are really the same problem).

After digging in the source code for a while I think I found the cause of that:

For incremental or watch builds TypeScript needs to compute 2 additional things for each module:
the "shape" and the "referenced modules".

These things are needed to calculate which modules need to be invalidated when a file has changed.
When a file has changed, TypeScript calculates a new "shape", and if the new shape differs from the old one, it follows the graph of "referenced modules" upwards and invalidates these modules too (recursively).
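
The invalidation walk described above can be sketched roughly as follows. This is a simplified model, not the compiler's code: `ModuleInfo`, `buildReferencedBy` and `invalidate` are hypothetical names, the shape is reduced to a plain string, and the walk invalidates all transitive referencing modules without re-checking their own shapes.

```typescript
// Simplified, hypothetical model; names do not mirror the compiler's internals.
interface ModuleInfo {
  shape: string;        // stand-in for the signature derived from the declaration emit
  references: string[]; // modules this module imports
}

// Reverse the import graph: for each module, collect the modules that reference it.
function buildReferencedBy(modules: Map<string, ModuleInfo>): Map<string, string[]> {
  const referencedBy = new Map<string, string[]>();
  modules.forEach((info, name) => {
    for (const ref of info.references) {
      const parents = referencedBy.get(ref) ?? [];
      parents.push(name);
      referencedBy.set(ref, parents);
    }
  });
  return referencedBy;
}

// Returns every module that has to be re-checked after `changed` was edited.
function invalidate(
  modules: Map<string, ModuleInfo>,
  changed: string,
  newShape: string,
): Set<string> {
  const invalidated = new Set<string>();
  invalidated.add(changed);
  const info = modules.get(changed);
  if (info === undefined || info.shape === newShape) {
    return invalidated; // shape unchanged: importing modules stay valid
  }
  info.shape = newShape;
  const referencedBy = buildReferencedBy(modules);
  const queue = [changed];
  while (queue.length > 0) {
    const current = queue.pop() as string;
    for (const parent of referencedBy.get(current) ?? []) {
      if (!invalidated.has(parent)) {
        invalidated.add(parent); // the set also guards against import cycles
        queue.push(parent);
      }
    }
  }
  return invalidated;
}
```

With this model, editing only the internals of a file invalidates just that file, while a shape change reaches every module that imports it directly or indirectly.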

Currently TypeScript uses "emit as declaration file" to calculate the "shape" and "referenced modules", and this is basically as expensive as doing a full "emit". Most of the time of the initial build is spent on that.

So the initial incremental/watch build is as slow as running tsc with emit.
But tsc --noEmit is very fast compared to that.

Refactoring idea

I think we actually don't really need to compute the "shape" on initial build. The "shape" is only an optimization so that non-shape-affecting changes to the file don't invalidate importing modules.

I would propose not computing the "shape" on the initial build. When a change to the file happens, use the module content (version) instead to check whether we need to invalidate the parents. This will cause unnecessary invalidation if only internals have changed, but that is an acceptable trade-off (keep in mind that typechecking is much faster than computing the shape).

In addition, compute the "shape" when a module is invalidated because of a file change or because of a "shape" change of a referenced module. The latter only applies to real "shape" changes: don't do it when the old shape was never computed. This avoids computing too many shapes during a watch rebuild.

Note: Due to not computing the shape we also don't have access to exportedModulesFromDeclarationEmit and have to use all references of the module instead. This will cause the module to be invalidated more often until the invalidation is triggered by a real shape change, which will cause it to compute its own shape and exportedModulesFromDeclarationEmit.

Summary: Lazily compute "shapes" and "exported modules" on first invalidation. Without an old shape and exported modules: invalidate referencing modules on file change instead of shape change, and invalidate a module if any referenced module changes instead of only exported ones.

So that's what I did.
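
As a rough illustration of the lazy scheme, here is a minimal sketch under stated assumptions. `FileState`, `ChangeResult`, `onFileChanged` and `invalidatingReferences` are hypothetical names invented for this example, and the shape is again modeled as a string; the actual changes in this PR live in the compiler's builder state and are more involved.

```typescript
// Hypothetical sketch of lazy shape computation; not the compiler's real types.
interface FileState {
  version: string;            // hash of the file content
  shape?: string;             // undefined until it is first computed
  exportedModules?: string[]; // undefined until the shape is computed
  references: string[];       // all modules referenced by this file
}

interface ChangeResult {
  invalidateReferencing: boolean; // must importing modules be re-checked?
  realShapeChange: boolean;       // true only if an old shape existed and differs
}

// Called when a file may have changed; `computeShape` is the expensive step.
function onFileChanged(
  state: FileState,
  newVersion: string,
  computeShape: () => string,
): ChangeResult {
  if (state.version === newVersion) {
    return { invalidateReferencing: false, realShapeChange: false };
  }
  const oldShape = state.shape;
  state.version = newVersion;
  state.shape = computeShape(); // first computed here, not on the initial build
  if (oldShape === undefined) {
    // No old shape to compare with: fall back to the version change,
    // which over-invalidates when only internals changed.
    return { invalidateReferencing: true, realShapeChange: false };
  }
  const changed = oldShape !== state.shape;
  return { invalidateReferencing: changed, realShapeChange: changed };
}

// Until exported modules are known, every reference can invalidate the module.
function invalidatingReferences(state: FileState): string[] {
  return state.exportedModules ?? state.references;
}
```

The `realShapeChange` flag is what a caller could use to decide whether referencing modules should compute their own shapes, matching the rule that only real shape changes trigger further shape computation.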

Benchmark

Running different test cases with a project with about 3000 files. Total time as reported by tsc. I did not do any averaging as the results are pretty clear:

with isolatedModules:

| Test case | master | This PR | Note |
| --- | --- | --- | --- |
| tsc | 23.34s | 23.01s | equal |
| tsc --incremental (initial) | 67.51s ⚠️ | 24.15s | large improvement |
| (with cache, no change) | 6.79s | 6.75s | equal |
| (with cache, non shape affecting change) | 8.12s | 9.19s ❗ | Initial shape computation, slower |
| (with cache, same file again) | 8.09s | 8.05s | shape is already computed, equal |
| (with fresh cache, shape affecting change) | 9.65s | 9.30s | Initial shape computation |
| (with cache, same file again) | 9.59s | 9.25s | shape is already computed |
| (with cache, same file again) | 9.58s | 9.29s | shape is already computed |
| tsc --watch (startup) | 70.98s ⚠️ | 26.24s | large improvement |
| (save without change) | 0.03s | 0.03s | equal |
| (non shape affecting change) | 0.36s | 1.47s ❗ | Initial shape computation, slower |
| (same file again) | 0.31s | 0.21s | shape is already computed, equal |
| (with fresh watcher, shape affecting change) | 2.11s | 1.40s | Initial shape computation |
| (same file again) | 1.49s | 1.21s | shape is already computed |
| (same file again) | 1.45s | 1.07s | shape is already computed |
| tsc --watch --incremental (initial) | 71.34s ⚠️ | 26.84s | large improvement |
| (from cache) | 9.78s | 9.69s | equal |

without isolatedModules:

| Test case | master | This PR | Note |
| --- | --- | --- | --- |
| tsc | 23.50s | 23.03s | |
| tsc --incremental (initial) | 67.68s ⚠️ | 24.02s | large improvement |
| (with cache, no change) | 6.89s | 6.77s | equal |
| (with cache, non shape affecting change) | 7.87s | 9.34s ❗ | Initial shape computation, slower |
| (with cache, same file again) | 7.92s | 8.04s | equal |
| (with fresh cache, shape affecting change) | 9.55s | 9.27s | Initial shape computation, ironically faster as shapes of referencing files are not computed |
| (with cache, same file again) | 9.56s | 10.37s ❗ | Initial shape computation of referencing files, slower |
| (with cache, same file again) | 9.55s | 9.70s | equal |
| tsc --watch (startup) | 71.30s ⚠️ | 26.27s | large improvement |
| (save without change) | 0.03s | 0.03s | equal |
| (non shape affecting change) | 0.34s | 1.42s ❗ | Initial shape computation, slower |
| (same file again) | 0.26s | 0.21s | equal |
| (with fresh watcher, shape affecting change) | 1.91s | 1.38s | Initial shape computation, ironically faster as shapes of referencing files are not computed |
| (same file again) | 1.55s | 2.15s ❗ | Initial shape computation of referencing files, slower |
| (same file again) | 1.52s | 1.47s | equal |
| tsc --watch --incremental (initial) | 73.30s ⚠️ | 26.99s | large improvement |
| (from cache) | 9.93s | 9.57s | equal |

Summary: tsc --incremental and tsc --watch are now as fast as plain tsc (see ⚠️); the first change to a file in watch/incremental mode takes a small hit (see ❗).

Test suite

All tests are passing. I updated a lot of baselines, as signatures are now missing from tsbuildinfo (they are 0, which marks them as lazily computed), but there is no functional change.

I needed to change some tests that verify that clean build and incremental build result in the same build info, which is no longer true when signatures are lazily computed.

I disabled lazy shape computation for unittests:: tsserver:: compileOnSave, unittests:: tsc-watch:: emit file --incremental, for some compileOnSave tests, and for 8 tests using assumeChangesOnlyAffectDirectDependencies, as these tests expect certain behavior that lazy shape computation would change. Note that the changed behavior is not wrong; it just doesn't fit the test cases.

Edge cases

There are a few edge cases one might run into:

A

Do a non shape affecting change to a file that affects the global scope (and is not a declaration file).

Since we don't know it's non shape affecting on this first change this will need to typecheck all files.

The second change will no longer have this behavior since shape is then computed.

Note that all CommonJS files are currently considered as "affecting global scope", so this might be a problem for CommonJS projects. I guess this is a bug and CommonJS modules should probably not be flagged in this way. Note: I fixed that.

B

Do a shape affecting change to a file that is referenced by many other modules.

On the second change this will trigger a shape computation on all referencing modules, which might cause an extra delay (similar to the initial shape computation before this PR).

We could add a limit on how many shapes are computed at most during a single build to avoid this. But in the worst case this would only make it perform like the current initial builds do.
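
Such a limit is only suggested here, not implemented. A minimal sketch of what a per-build cap could look like, assuming a hypothetical `ShapeBudget` helper and a caller-supplied `computeShape` function (neither exists in the compiler):

```typescript
// Hypothetical per-build cap on shape computations; purely illustrative.
class ShapeBudget {
  private used = 0;
  private readonly max: number;

  constructor(max: number) {
    this.max = max;
  }

  // Consumes one unit and returns true if another shape may be computed.
  tryConsume(): boolean {
    if (this.used >= this.max) return false;
    this.used++;
    return true;
  }
}

// Computes shapes for as many files as the budget allows.
// Files that are skipped simply keep the version-based fallback.
function computeShapesWithinBudget(
  files: string[],
  budget: ShapeBudget,
  computeShape: (file: string) => string,
): Map<string, string> {
  const shapes = new Map<string, string>();
  for (const file of files) {
    if (!budget.tryConsume()) break;
    shapes.set(file, computeShape(file));
  }
  return shapes;
}
```

Files left without a shape would be picked up by later builds, so the extra cost is spread over several rebuilds instead of hitting a single one.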

🔍 Search Terms

slow, incremental, watch

📃 Motivating Example

Incremental builds are unattractive compared to full builds when using TypeScript for typechecking with noEmit: true.

💻 Use Cases

What do you want to use this for?

next.js

What shortcomings exist with current approaches?

Incremental builds are too slow. So you have to choose between:

What workarounds are you using in the meantime?

Not using --incremental at all

PS: tsbuildinfo reference list optimization

As a little extra I changed the serialization of tsbuildinfo a bit so that duplicate lists of references are deduplicated (this is the first commit). This isn't strictly necessary, but in an intermediate version of this refactoring I just used all modules as fallback references, and this resulted in a huge slowdown due to writing tsbuildinfo, so I optimized it a bit. I left it in because it decreases the tsbuildinfo size, which is good when it has to be transferred, e.g. between CI builds.
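
The deduplication idea can be illustrated with a small sketch. The field names `lists` and `fileToList` and the function `dedupeReferenceLists` are assumptions made for this example and do not describe the real tsbuildinfo schema; the point is only that identical reference lists are stored once and referred to by index.

```typescript
// Hypothetical serialization shape; not the actual tsbuildinfo format.
interface DedupedReferences {
  lists: string[][];                  // each distinct reference list, stored once
  fileToList: Record<string, number>; // file name -> index into `lists`
}

function dedupeReferenceLists(references: Record<string, string[]>): DedupedReferences {
  const lists: string[][] = [];
  const indexByKey = new Map<string, number>();
  const fileToList: Record<string, number> = {};
  for (const file of Object.keys(references)) {
    // Sort a copy so that lists with the same members share one entry.
    const sorted = references[file].slice().sort();
    const key = JSON.stringify(sorted);
    let index = indexByKey.get(key);
    if (index === undefined) {
      index = lists.length;
      lists.push(sorted);
      indexByKey.set(key, index);
    }
    fileToList[file] = index;
  }
  return { lists, fileToList };
}
```

When many files share the same fallback list of references, the serialized output grows with the number of distinct lists rather than with the number of files times the list length.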