hotspot heap and L1 and L2 cache misses
Vitaly Davidovich vitalyd at gmail.com
Mon Sep 17 13:40:15 PDT 2012
Andy,
A TLAB will satisfy the allocation requests in this case, so the object and its arrays would be in that thread-local buffer. However, some objects can be considered humongous (too large for a TLAB) and will be allocated straight out of the shared Eden space or even tenured space. Also, once a TLAB is exhausted, it's retired: a TLAB is itself just a chunk of Eden, so the objects already in it stay where they are, and the thread gets a fresh TLAB, possibly resized as well. If your objects survive GC and get tenured, they'll end up in the old gen. Once there, I don't think there's any guarantee that heap compaction will try to keep the object graph close together in memory, although maybe it ends up like that inadvertently. If your objects get copied into a survivor space, I'm not sure there's any guarantee of colocation either. Some of the above may be inaccurate, so check with the GC devs using the alias in Chris' last email.
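To make the pointer-bump idea concrete, here is a minimal conceptual sketch in Java. It is not HotSpot's code: the class `BumpArena`, its `int[]` backing chunk, and the `-1` "exhausted" return value are all hypothetical names and conventions chosen for illustration. It only shows why consecutive allocations from one thread-local buffer end up adjacent in memory.

```java
/**
 * Conceptual bump-pointer allocator (hypothetical; not HotSpot's implementation).
 * The int[] chunk stands in for a thread-local buffer carved out of a larger space.
 */
final class BumpArena {
    private final int[] chunk; // the buffer this allocator owns
    private int top = 0;       // cursor: index of the next free slot

    BumpArena(int capacity) {
        chunk = new int[capacity];
    }

    /**
     * Reserves `size` consecutive slots and returns the start offset,
     * or -1 if the buffer cannot satisfy the request (the caller would
     * then have to obtain a fresh buffer).
     */
    int allocate(int size) {
        if (size < 0 || size > chunk.length - top) {
            return -1;
        }
        int start = top;
        top += size; // the "bump": allocation is just a cursor update
        return start;
    }

    /** Number of slots handed out so far. */
    int used() {
        return top;
    }

    public static void main(String[] args) {
        BumpArena arena = new BumpArena(16);
        int first = arena.allocate(4);
        int second = arena.allocate(8);
        // The second block starts exactly where the first one ends.
        System.out.println(first + " " + second + " " + arena.used());
    }
}
```

Back-to-back requests return adjacent offsets, which is the colocation property being discussed. Whether that adjacency survives later garbage collections is a separate question that this sketch does not model.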
I'd also do as Chris suggested and re-run your benchmark on a recent HotSpot build and modern/recent hardware. You'd also need to profile using a profiler that supports hardware perf counters so that you can attribute the difference to cache misses; otherwise it's hard to say for sure.
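As a starting point for that kind of measurement, the following hedged sketch builds two workloads that do identical work but differ only in memory access order: one walks an index cycle sequentially, the other in shuffled order. The class `ChaseWorkload` and its methods are hypothetical names, not part of any library, and the sketch deliberately contains no timing code. The idea is to run each variant under a profiler with hardware counters and compare cache-miss counts, rather than to infer them from wall-clock time.

```java
import java.util.Random;

/** Hypothetical pointer-chasing workload for use under a hardware-counter profiler. */
final class ChaseWorkload {

    /**
     * Builds a single cycle over indices 0..n-1, where next[i] is the index
     * visited after i. With shuffled == false the cycle is 0 -> 1 -> 2 -> ...;
     * with shuffled == true the visiting order is a seeded random permutation.
     */
    static int[] buildCycle(int n, boolean shuffled, long seed) {
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        if (shuffled) {
            // Fisher-Yates shuffle of the visiting order.
            Random rnd = new Random(seed);
            for (int i = n - 1; i > 0; i--) {
                int j = rnd.nextInt(i + 1);
                int tmp = order[i];
                order[i] = order[j];
                order[j] = tmp;
            }
        }
        int[] next = new int[n];
        for (int i = 0; i < n; i++) {
            next[order[i]] = order[(i + 1) % n];
        }
        return next;
    }

    /** Follows the cycle for `steps` hops starting at index 0; returns the sum of visited indices. */
    static long chase(int[] next, int steps) {
        long sum = 0;
        int cur = 0;
        for (int s = 0; s < steps; s++) {
            sum += cur;
            cur = next[cur];
        }
        return sum;
    }
}
```

Because both variants visit every index exactly once per full lap, they return the same sum, so any difference a profiler reports comes from access order and not from the amount of work. The array needs to be much larger than the caches for the difference to show up.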
HTH, Vitaly
Sent from my phone

On Sep 17, 2012 3:50 PM, "Christian Thalinger" <christian.thalinger at oracle.com> wrote:

On Sep 17, 2012, at 12:20 PM, Andy Nuss <andrewnuss at yahoo.com> wrote:

> What about the case of a new class instance that creates and holds 2 references to medium length arrays? Is the new instance and its 2 arrays in the same area of the heap?

Depends on what you mean with "the same area". But these questions should better go to hotspot-gc-dev.

-- Chris

>
> From: Christian Thalinger <christian.thalinger at oracle.com>
> To: Andy Nuss <andrewnuss at yahoo.com>
> Cc: hotspot <hotspot-compiler-dev at openjdk.java.net>
> Sent: Monday, September 17, 2012 11:39 AM
> Subject: Re: hotspot heap and L1 and L2 cache misses
>
>
> On Sep 15, 2012, at 12:03 PM, Andy Nuss <andrewnuss at yahoo.com> wrote:
>
> > Hi,
> >
> > Let's say I have a function which mutates a finite automaton. It creates lots of small objects (my own link and double-link structures). It also does a lot of puts in my own maps. The objects and maps in turn have references to arrays and some immutable objects.
> >
> > My question is, all these arrays and objects created in one function that has to do a ton of construction, are there any things to watch out for so that hotspot will try to create all the objects in this one function/thread colocated on the heap so that L1/L2 cache misses are reduced when the finite automaton is executed against data?
> >
> > Ideally, someone could tell me that when my class constructors in turn create new instances of other objects and arrays of various sizes, they are all colocated on the heap.
> >
> > Ideally, someone could tell me that when I have a looping function that creates a lot of very small Linked List objects in succession, again they are colocated.
> >
> > In general, how does hotspot try with creating new objects to help the L1/L2 caches?
> >
> > By the way, I did a test port of my automaton to C++ where for objects like the above, I had big memory chunks that my inplace constructors just subdivided the memory chunk that it owned so that all the subobjects were absolutely as colocated as possible.
> >
> > This C++ ported automaton out-performed my java version by 5x in execution against data. And in cases where I tested the performance of construction-time cost of the automaton where the comparison is between the hotspot new, versus my simple inplace C++ member functions which basically just return the current chunk cursor, after calculating the size of the object, and updating the chunk cursor to point beyond the new size, in those cases I saw 25x performance differences (5 yrs ago).
>
> TLAB allocations do the same pointer-bump in HotSpot. Do the 5x really come from co-located data? Did you measure it? And maybe you should redo your 25x experiment. 5 years is a long time...
>
> -- Chris
>
> >
> > Andy
> >