bpo-29882: Add an efficient popcount method for integers by niklasf · Pull Request #771 · python/cpython

Fix implemented, in the shape of an extra (uint32_t) cast on the multiplication result along with some U suffixes on the integer constants.

Note that the U suffix on 0x01010101U is actually necessary to avoid undefined behaviour: consider a (hypothetical, but permissible under the C standard) machine with a 48-bit int. If we do u * 0x01010101 on that machine, then first the integer promotions are applied to both arguments, and they both end up being of type int. Then we do an int-by-int multiplication. With the right (wrong?) value of d, this could overflow, giving undefined behaviour.

But with the U suffix, the integer promotions leave us multiplying an int by an unsigned int. The usual arithmetic conversions then kick in and the multiplication is performed in type unsigned int, which has well-defined wraparound behaviour.
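A 48-bit int is hard to come by, but the same promotion hazard can be seen on ordinary machines with narrower types. The sketch below is an analogy only, assuming a platform where int is 32 bits wide; the helper name mul16_wrap is made up for illustration and is not part of the PR. There, uint16_t operands are promoted to int, and 0xFFFF * 0xFFFF does not fit in a 32-bit int. Leading with an unsigned operand forces the whole product into unsigned int, just as the U suffix does above.

```c
#include <stdint.h>

/* Hypothetical helper (not from the PR): multiply two 16-bit values
 * and keep only the low 16 bits of the product.
 *
 * Writing `a * b` directly would promote both operands to int on a
 * platform where int is wider than 16 bits; with a 32-bit int,
 * 0xFFFF * 0xFFFF overflows, which is undefined behaviour.
 *
 * Starting the expression with 1U makes the usual arithmetic
 * conversions turn each operand into unsigned int, so the product
 * wraps around in a well-defined way instead. */
static uint16_t
mul16_wrap(uint16_t a, uint16_t b)
{
    return (uint16_t)(1U * a * b);
}
```

For example, mul16_wrap(0xFFFF, 0xFFFF) yields 1, since the full product 0xFFFE0001 is reduced to its low 16 bits by the cast.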

The other U suffixes aren't strictly necessary; they're just there to satisfy my OCD and make the code easier to think about.
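For reference, here is a minimal sketch of what a 32-bit SWAR popcount looks like with the cast and the U suffixes in place. It is not a copy of the code in the PR: the function name popcount32 and the exact layout are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical name; a sketch of a 32-bit SWAR population count. */
static int
popcount32(uint32_t u)
{
    /* Count bits in each 2-bit group. */
    u -= (u >> 1) & 0x55555555U;
    /* Sum adjacent 2-bit counts into 4-bit groups. */
    u = (u & 0x33333333U) + ((u >> 2) & 0x33333333U);
    /* Sum adjacent 4-bit counts into bytes. */
    u = (u + (u >> 4)) & 0x0f0f0f0fU;
    /* Add the four byte counts into the top byte. The U suffix keeps
     * the multiplication unsigned even if uint32_t would promote to a
     * wider int, and the (uint32_t) cast discards any bits above
     * bit 31 before the shift. */
    return (int)((uint32_t)(u * 0x01010101U) >> 24);
}
```

With this sketch, popcount32(0xFFFFFFFFU) returns 32 and popcount32(0) returns 0. On a platform where unsigned int is exactly 32 bits the cast is a no-op, but it costs nothing and keeps the result correct where unsigned int is wider.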