[Python-Dev] Rattlesnake progress
Daniel Berlin dan@dberlin.org
Tue, 19 Feb 2002 11:37:22 -0500
- Previous message: [Python-Dev] Rattlesnake progress
- Next message: [Python-Dev] Rattlesnake progress
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Tuesday, February 19, 2002, at 11:01 AM, Kevin Jacobs wrote:
> On Tue, 19 Feb 2002, Daniel Berlin wrote:
>> On Tuesday, February 19, 2002, at 09:51 AM, Neil Schemenauer wrote:
>>> Daniel Berlin wrote:
>>>> When you get to optimizations, you want Advanced Compiler Design and Implementation by Muchnick.
>>> Right now I'm not planning to do any optimizations (except perhaps limiting the number of registers used).
>> This is, of course, a tricky optimization to do. Limiting registers used involves splitting live ranges at the right places, etc.
> Why limit the number of registers at all? So long as they fit in L1 cache you are golden.
Err, what makes you think this? The largest problem on architectures like x86 is the small number of registers: you end up with about 4 usable ones. (Before someone mentions it: hardware register renaming only helps eliminate instruction dependencies.) Performance drops quickly when you start spilling registers to the stack.
In fact, I've seen multiple SPEC regressions of 15% or more caused by a single extra spilled register. Why? Because you have to save it and reload it multiple times. Those saves and reloads kill pipelining and instruction scheduling.
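To make the spilling point concrete, here is a minimal sketch of how the number of available registers decides which values get spilled. It is a simplified allocator in the style of linear scan over live intervals, not what Rattlesnake or any particular compiler does. The function name `linear_scan`, the interval representation, and the sample data are all assumptions made up for illustration.

```python
def linear_scan(intervals, num_registers):
    """Toy linear-scan-style register assignment (illustrative only).

    intervals: dict mapping a value name to an inclusive (start, end) live range.
    Returns (assignment, spilled): a name -> register map and a list of spilled names.
    """
    free = list(range(num_registers))
    active = []       # (end, name) pairs that currently hold a register
    assignment = {}
    spilled = []

    # Visit values in order of where their live range begins.
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers whose values died before this one starts.
        for old_end, old_name in list(active):
            if old_end < start:
                active.remove((old_end, old_name))
                free.append(assignment[old_name])

        if free:
            assignment[name] = free.pop()
            active.append((end, name))
        elif active and max(active)[0] > end:
            # No register free: spill the live value that ends furthest away
            # and hand its register to the new value.
            last_end, last_name = max(active)
            assignment[name] = assignment.pop(last_name)
            active.remove((last_end, last_name))
            active.append((end, name))
            spilled.append(last_name)
        else:
            # The new value itself lives longest (or there are no registers).
            spilled.append(name)

    return assignment, spilled


# Hypothetical live ranges: at most three values are live at the same time.
intervals = {"a": (0, 9), "b": (1, 3), "c": (2, 5), "d": (4, 8), "e": (6, 7)}

for k in (3, 2, 1):
    _, spilled = linear_scan(intervals, k)
    print(k, "registers ->", len(spilled), "spilled:", spilled)
```

With three registers nothing is spilled. With two, the long-lived value `a` is pushed to memory, and every later use of it becomes a reload. A real allocator would try to split the live range at a cheap point instead of spilling the whole interval, which is the tricky part mentioned earlier in the thread.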
It's also much harder to model the cache hierarchy well enough to guarantee that values fit in the L1 cache than it is to keep them in registers where needed in the first place.
Try taking a performance-critical loop that runs entirely in registers, and change it to store a value to memory and load it back into a register on every iteration. See how much slower it gets.
--Dan