<random>: Use _Unsigned128 for linear_congruential_engine by StephanTLavavej · Pull Request #5436 · microsoft/STL

I noticed that we had a single remaining usage of separately compiled multiprecision arithmetic machinery. I believe that we can replace this with the modern _Unsigned128 type, which is far better tested and more widely exercised. As @AlexGuteniev observed, _Unsigned128 uses intrinsics, so it should also be faster than multprec.cpp. And as a final bonus, it makes this codepath strongly resemble the 32-bit and 64-bit codepaths above.
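For readers who want to see the shape of the computation, here is a minimal Python sketch of one generator step, `(a * x + c) mod m`, with the intermediate forced into 128 bits. It is not the STL code and does not model `_Unsigned128` itself: Python integers are unbounded, so the 128-bit type is only simulated with a mask. The names `lcg_next` and `MASK_128` are hypothetical, and the sketch assumes that all four parameters fit in 64 bits.

```python
MASK_128 = (1 << 128) - 1  # largest value a 128-bit unsigned integer can hold


def lcg_next(x: int, a: int, c: int, m: int) -> int:
    """One LCG step, (a * x + c) mod m, using a simulated 128-bit intermediate.

    Assumption: a, c, x, and m all fit in 64 bits, with 0 <= x < m and m > 0.
    """
    wide = (a * x + c) & MASK_128  # what a 128-bit unsigned type would store
    return wide % m


# Illustrative parameters only; they are not taken from the pull request.
a = 6364136223846793005
c = 1442695040888963407
m = 2**64 - 59
x = 12345

# The masked result agrees with unbounded arithmetic, so nothing was truncated.
assert lcg_next(x, a, c, m) == (a * x + c) % m
```

If the intermediate could exceed 128 bits, the mask would drop high bits and the final assertion would fail.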

👨‍🔬 Proof that 128 bits are sufficient

The function comment says:

// Choose intermediate type:
// To use type T for the intermediate calculation, we must show
// _Ax * (_Mx - 1) + _Cx <= numeric_limits<T>::max()
// Split _Cx:
// _Cx <= numeric_limits<T>::max()
// && _Ax * (_Mx - 1) <= numeric_limits<T>::max() - _Cx
// Divide by _Ax:
// _Cx <= numeric_limits<T>::max()
// && (_Mx - 1) <= (numeric_limits<T>::max() - _Cx) / _Ax

The first part, _Cx <= (2^128 - 1), is obviously true, since _Cx is at most 2^64 - 1.

The second part is (_Mx - 1) <= ((2^128 - 1) - _Cx) / _Ax. It is hardest to satisfy when the LHS is largest (the largest _Mx) and the RHS is smallest (the largest _Ax and the largest _Cx). With all three parameters at their 64-bit maximum of 2^64 - 1, the worst case is ((2^64 - 1) - 1) <= ((2^128 - 1) - (2^64 - 1)) / (2^64 - 1).

C:\Temp>py
Python 3.13.3 (tags/v3.13.3:6280bb5, Apr  8 2025, 14:47:33) [MSC v.1943 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> ((2**64 - 1) - 1)
18446744073709551614
>>> ((2**128 - 1) - (2**64 - 1)) // (2**64 - 1)
18446744073709551616
>>> ((2**64 - 1) - 1) <= ((2**128 - 1) - (2**64 - 1)) // (2**64 - 1)
True
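As a cross-check, the same conclusion can be reached from the unsplit condition in the first line of the function comment, which avoids the division by _Ax. The sketch below is an illustration rather than part of the change: `fits_in_128_bits`, `U64_MAX`, and `U128_MAX` are hypothetical names, and the random sampling is only a spot check, not a proof.

```python
import random

U64_MAX = 2**64 - 1    # largest 64-bit unsigned value
U128_MAX = 2**128 - 1  # largest 128-bit unsigned value


def fits_in_128_bits(a: int, c: int, m: int) -> bool:
    """Check the unsplit condition a * (m - 1) + c <= 2^128 - 1."""
    return a * (m - 1) + c <= U128_MAX


# Worst case: every parameter at the 64-bit maximum.
# a * (m - 1) + c collapses to (2^64 - 1)^2 = 2^128 - 2^65 + 1.
worst = U64_MAX * (U64_MAX - 1) + U64_MAX
assert worst == U64_MAX**2 == 2**128 - 2**65 + 1
assert fits_in_128_bits(U64_MAX, U64_MAX, U64_MAX)

# Spot check with random 64-bit parameters (m >= 1 so that m - 1 >= 0).
rng = random.Random(5436)
for _ in range(10_000):
    a = rng.randint(0, U64_MAX)
    c = rng.randint(0, U64_MAX)
    m = rng.randint(1, U64_MAX)
    assert fits_in_128_bits(a, c, m)
```

The closed form makes the margin explicit: the worst-case intermediate is 2^65 - 2 below the 128-bit maximum.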
