Performance Matters

Your CPU May Have Slowed Down on Wednesday

The death of hardware store optimization.

Ice Lake AVX-512 Downclocking

Examining the extent of AVX-related downclocking on Intel's Ice Lake CPU.

A Concurrency Cost Hierarchy

Concurrent operations can be grouped relatively neatly into categories based on their cost.

AVX-512 Mask Registers, Again

Taking a second look at the newly introduced mask registers, this time with the benefit of an SKX die shot from Fritzchens Fritz.

Ice Lake Store Elimination

We look at the zero store optimization as it applies to Intel's newest micro-architecture.

Hardware Store Elimination

Probing a previously undocumented zero-related optimization on Intel CPUs.

Adding Staticman Comments

Adding static comments to a static blog using staticman. Static.

The Hunt for the Fastest Zero

Unexpected performance deviations depending on how you spell zero.

Gathering Intel on Intel AVX-512 Transitions

Investigating some details of SIMD-related frequency transitions on Intel CPUs.

A Note on Mask Registers

Some mostly too-low-level-to-care-about hardware details of the mask registers introduced in AVX-512.

Clang-format Tanks Performance

Can using clang-format make your code slower? Kind of.

Incrementing Vectors

Incrementing vector<T> for various T may not perform as you'd expect.

Where Do Interrupts Happen?

Trying to determine exactly where asynchronous interrupts are delivered on Intel CPUs.

Performance Speed Limits

A laundry list of speed limits that your code can't exceed.

Beating Up on Qsort

Building sort functions faster than what the C and C++ standard libraries offer.

What Has Your Microcode Done for You Lately?

CPU microcode updates can cause silent and dramatic performance changes.

A Blog Appears!

If there’s one thing the internet needs, it’s another blog. So after messing around with Jekyll and GitHub Pages for way longer than is reasonable, here we are.