Rewrite HashSet's implementation based on Dictionary's by stephentoub · Pull Request #37180 · dotnet/runtime (original) (raw)

Fixes #37111
Contributes to #1989

This moves HashSet into corelib, and then effectively deletes HashSet's data structure and replaces it with the one used by Dictionary, then updated for the differences (e.g. just a value rather than a key and a value). HashSet used to have basically the same implementation, but Dictionary has evolved significantly and HashSet hasn't; this brings them to basic parity on implementation.

Based on perf tests, I veered away from Dictionary's implementation in a few places (e.g. a goto-based implementation in the core find method led to a significant regression for Int32-based Contains operations), and we should follow-up to understand whether Dictionary should be changed as well, or why there's a difference between the two. @benaadams, if you have some time, it'd probably worth looking at this again; maybe you'll get different numbers than I did.

Functionally, bringing over Dictionary's implementation yields a few notable changes, namely that Remove and Clear no longer invalidate enumerations. The tests have been updated accordingly.

With HashSet now in corelib, I also updated two Dictionary uses I found in corelib that were using Dictionary as a set and switched them to use HashSet.

Running the dotnet/performance perf tests:

dotnet run -c Release -f net5.0 --filter System.Collections.*.HashSet --corerun d:\coreclrtest\master\corerun.exe d:\coreclrtest\pr\corerun.exe --join
Type Toolchain Size Mean Ratio Allocated
CtorDefaultSize master ? 4.777 ns 1.00 64 B
CtorDefaultSize pr ? 6.473 ns 1.36 72 B
CtorDefaultSize master ? 13.706 ns 1.00 64 B
CtorDefaultSize pr ? 13.156 ns 0.96 72 B
ContainsTrueComparer master 512 5,882.158 ns 1.00 -
ContainsTrueComparer pr 512 5,474.605 ns 0.93 -
ContainsTrueComparer master 512 22,759.196 ns 1.00 -
ContainsTrueComparer pr 512 22,814.504 ns 1.00 -
AddGivenSize master 512 5,034.736 ns 1.00 8456 B
AddGivenSize pr 512 4,118.598 ns 0.82 8464 B
AddGivenSize master 512 17,223.668 ns 1.00 10536 B
AddGivenSize pr 512 11,328.307 ns 0.66 10544 B
CreateAddAndRemove master 512 13,473.343 ns 1.00 27712 B
CreateAddAndRemove pr 512 11,269.323 ns 0.84 27720 B
CreateAddAndRemove master 512 43,108.066 ns 1.00 34480 B
CreateAddAndRemove pr 512 28,070.657 ns 0.65 34488 B
CtorFromCollection master 512 8,066.837 ns 1.00 8488 B
CtorFromCollection pr 512 6,705.302 ns 0.83 8496 B
CtorFromCollection master 512 20,284.380 ns 1.00 10568 B
CtorFromCollection pr 512 12,863.063 ns 0.63 10576 B
CtorGivenSize master 512 505.879 ns 1.00 8456 B
CtorGivenSize pr 512 493.664 ns 0.97 8464 B
CtorGivenSize master 512 650.880 ns 1.00 10536 B
CtorGivenSize pr 512 640.043 ns 0.98 10544 B
ContainsFalse master 512 2,427.766 ns 1.00 -
ContainsFalse pr 512 2,360.871 ns 0.97 -
ContainsFalse master 512 15,994.814 ns 1.00 -
ContainsFalse pr 512 8,593.030 ns 0.54 -
ContainsTrue master 512 2,945.063 ns 1.00 -
ContainsTrue pr 512 2,794.102 ns 0.95 -
ContainsTrue master 512 16,805.925 ns 1.00 -
ContainsTrue pr 512 10,333.508 ns 0.61 -
CreateAddAndClear master 512 8,900.466 ns 1.00 27712 B
CreateAddAndClear pr 512 7,554.481 ns 0.85 27720 B
CreateAddAndClear master 512 22,007.874 ns 1.00 34480 B
CreateAddAndClear pr 512 15,382.689 ns 0.70 34488 B
IterateForEach master 512 1,348.107 ns 1.00 -
IterateForEach pr 512 977.800 ns 0.73 -
IterateForEach master 512 2,301.860 ns 1.00 -
IterateForEach pr 512 1,589.925 ns 0.69 -

cc: @benaadams, @danmosemsft, @GrabYourPitchforks, @eiriktsarpalis, @layomia