On constant folding of final field loads
Vladimir Ivanov vladimir.x.ivanov at oracle.com
Tue Jun 30 19:00:42 UTC 2015
Aleksey,
Big picture question: do we actually care about propagating final field values once the object escaped (and in this sense, available to be introspected by the compiler)?
The Java memory model does not guarantee final field visibility once the object has escaped. The very reason deserialization works is that the deserialized object has not yet been published. That is, are we in line with the spec and general expectations by folding the final values, and not deoptimizing on the store?
Can you elaborate on your point and its interaction with the JMM a bit? Are you talking about not tracking constant-folded final field values at all, since the JMM gives no guarantee that such updates are visible?
Yup. AFAIU the JMM, there are no guarantees you would see the updated value of a final field after the object has leaked. So, spec-wise you may just use the final field values as constants. I think the only reason you have to do the dependency tracking is when constant folding depends on instance identity. So, my question is, do we knowingly make a goodwill call to deopt on final field store, even though it is not required by spec? I am not opposing the change, but I'd like us to understand the implications better.
That's a good question.
I consider it more of a quality-of-implementation aspect. Neither the Reflection nor the Unsafe API is part of the JVM/JLS spec, so I don't think the possibility of final field updates should be taken into account there.
In order to avoid surprises and inconsistencies (old value vs new value depending on execution path), which are very hard to track down, the VM should either completely forbid final field changes or keep track of them and adapt accordingly.
For example, I can see the change gives rise to some interesting low-level coding idioms, like:
    final boolean running = true;
    Field runningField = resolve(...); // reflective

    // run stuff for minutes
    void m() {
        while (running) { // compiler hoists, turns into while(true)
            // do stuff
        }
    }

    void hammerTime() {
        runningField.set(this, false); // deopt, break the loop!
    }

Once we allow users to go crazy like that, it would be cruel to retract/break/change this behavior.
But I speculate those cases are not pervasive. By and large, people care about final ops to jump through the barriers. For example, the final load can be commonned through the acquires / control flow. See e.g.: http://psy-lob-saw.blogspot.ru/2014/02/when-i-say-final-i-mean-final.html
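The reflective-store idiom discussed above can be tried out with a small self-contained sketch. The class and method names here are hypothetical. The field is assigned in a constructor rather than from a literal, because a final field with a constant initializer is a compile-time constant that javac inlines at every use. Whether a direct read after the store observes the new value depends on the JVM and on JIT state, so the sketch reads back only through reflection. It also assumes a JDK that still permits deep-reflective writes to final instance fields of ordinary classes.

```java
import java.lang.reflect.Field;

final class FinalFieldStoreDemo {
    // Overwrites Worker.running reflectively and returns the value seen
    // by a subsequent reflective read of the same field.
    static boolean storeAndReadBack(Worker w, boolean newValue) {
        try {
            Field f = Worker.class.getDeclaredField("running");
            f.setAccessible(true);      // needed before writing a final instance field
            f.setBoolean(w, newValue);  // the "hammerTime" store
            return f.getBoolean(w);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Worker w = new Worker(true);
        System.out.println("reflective read after store: " + storeAndReadBack(w, false));
    }
}

final class Worker {
    // Assigned in the constructor, so this is not a compile-time constant.
    private final boolean running;

    Worker(boolean running) {
        this.running = running;
    }
}
```

Code that reads the field directly in a hot loop is exactly where folding versus deoptimization would become visible, which is why the sketch makes no claim about it.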
Regarding alternative approaches to track finality, an offset bitmap on a per-class basis can be used (containing the locations of final fields). Possible downsides are: (1) memory footprint (1/8th of instance size per class); and (2) more complex checking logic (load the relevant piece of the bitmap from a klass, instead of checking a locally available offset cookie). The advantage is that it is completely transparent to the user: it doesn't change the offset translation scheme.
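A minimal sketch of the per-class bitmap idea, assuming one bit per byte of the instance layout (which is where the 1/8th footprint comes from). The class name, the method names, and the example layout are hypothetical; a real VM would keep this data in the klass rather than in a Java object.

```java
import java.util.BitSet;

final class FinalFieldBitmap {
    // One bit per byte of the instance; a set bit means the byte belongs to a final field.
    private final BitSet finalBytes;

    FinalFieldBitmap(int instanceSizeInBytes) {
        this.finalBytes = new BitSet(instanceSizeInBytes);
    }

    // Marks the bytes [byteOffset, byteOffset + fieldSizeInBytes) as final.
    void markFinal(int byteOffset, int fieldSizeInBytes) {
        finalBytes.set(byteOffset, byteOffset + fieldSizeInBytes);
    }

    // True if a write of accessSizeInBytes at byteOffset overlaps any final field.
    boolean touchesFinal(int byteOffset, int accessSizeInBytes) {
        int next = finalBytes.nextSetBit(byteOffset);
        return next >= 0 && next < byteOffset + accessSizeInBytes;
    }

    public static void main(String[] args) {
        // Assumed layout: 24-byte instance with a final int at offset 12.
        FinalFieldBitmap bitmap = new FinalFieldBitmap(24);
        bitmap.markFinal(12, 4);
        System.out.println(bitmap.touchesFinal(12, 4));
        System.out.println(bitmap.touchesFinal(16, 8));
    }
}
```

The check needs the holder class to find the bitmap, which is the "more complex checking logic" cost mentioned above, but plain byte offsets stay unchanged.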
I like this one. Paying with slightly larger memory footprint for API compatibility sounds reasonable to me. I don't care about cases when Unsafe API is abused (e.g. raw memory writes on absolute address or arbitrary offset in an object). In the end, it's unsafe API, right? :-)
Yeah, but with millions of users, we are in a bit of a (implicit) compatibility bind here ;)
That's why I deliberately tried to omit compatibility aspect discussion for now :-)
Unsafe is unique: it's not a supported API, but nonetheless many people rely on it. It means we can't throw it away (even in a major release), but still we are not as limited as with official public API.
As part of Project Jigsaw there's already an attempt to do an incompatible change for Unsafe API. Depending on how it goes, we can get some insights how to address compatibility concerns (e.g. preserve original behavior in Java 8 compatibility mode).
What I'm trying to understand right now, before diving into compatibility details, is whether the Unsafe API itself allows a change of the offset encoding scheme and what can be done to make it happen.
Though offset value is explicitly described in API as an opaque offset cookie, I spotted 2 inconsistencies in the API itself:
- Unsafe.get/set*Unaligned() require absolute offsets. These methods were added in 9, so they haven't leaked into public use yet.
Andrew, can you comment on why you decided to stick with absolute offsets rather than preserve the Unsafe.getInt() addressing scheme?
- Unsafe.copyMemory(): source and destination addressing operate on offset cookies, but the amount of copied data is expressed in bytes. In order to do bulk copies of consecutive memory blocks, the user should be able to convert offset cookies to byte offsets and vice versa. There's no way to do that with the current API.
Are you aware of any other use cases when people rely on absolute offsets?
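To make the cookie-versus-byte-offset mismatch concrete, here is a sketch of one possible encoding, assuming the finality flag is kept in the low bit of the cookie. This layout is purely hypothetical (the thread does not fix a scheme), as are the class and method names. It shows why byte arithmetic applied directly to an encoded cookie gives the wrong location, and why explicit conversion functions would be needed.

```java
final class OffsetCookie {
    private static final long FINAL_BIT = 1L;

    // Packs a byte offset and a finality flag into one opaque cookie.
    static long encode(long byteOffset, boolean isFinal) {
        return (byteOffset << 1) | (isFinal ? FINAL_BIT : 0L);
    }

    // Recovers the byte offset from a cookie.
    static long byteOffset(long cookie) {
        return cookie >>> 1;
    }

    // Recovers the finality flag from a cookie.
    static boolean isFinal(long cookie) {
        return (cookie & FINAL_BIT) != 0;
    }

    public static void main(String[] args) {
        long cookie = encode(16, false);
        // Naive byte arithmetic on the cookie: intended to address offset 24.
        long naive = cookie + 8;
        System.out.println("decoded naive offset: " + byteOffset(naive));
        // Converting to bytes first and re-encoding gives the intended location.
        long correct = encode(byteOffset(cookie) + 8, false);
        System.out.println("decoded correct offset: " + byteOffset(correct));
    }
}
```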
I thought about VarHandles a bit and it seems they aren't a silver bullet - they should be based on Unsafe (or stripped Unsafe equivalent) anyway.
Unsafe.fireDepChange is a viable option for Reflection and MethodHandles. I'll consider it during further explorations. The downside is that it puts the responsibility for tracking final field changes on the user, which is error-prone. There are places in the JDK where Unsafe is used directly, and each of them has to be analyzed case by case to see whether it updates a final field.
It's basically opt-in vs opt-out approaches. I'd prefer a cleaner approach, if there's a solution for compatibility issues.
So, my next question is how to proceed. Does changing the API and providing two sets of functions, working with absolute and encoded offsets, solve the problem? Or does leaving Unsafe as is (but clarifying the API) and migrating Reflection/j.l.i to VarHandles solve the problem? That's what I'm trying to understand.
I would think Reflection/j.l.i would eventually migrate to VarHandles anyway. Paul? The interim solution for encoding final field flags shouldn't leak into (even Unsafe) API, or at least should not break the existing APIs. I further think that an interim solution could add a single auxiliary Unsafe.fireDepChange(Field f / long addr) or something similar, and use it along with the Unsafe calls in Reflection/j.l.i, when wrappers know they are dealing with final fields. In other words, should we try to reuse the knowledge those wrappers already have, instead of trying to encode the same knowledge into offset cookies?
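The opt-in approach can be sketched as a wrapper that already knows, from the Field object, whether it is writing a final field, and fires a notification only in that case. Everything here is an assumption made for illustration: fireDepChange is a stand-in that merely records the call (no such method exists in Unsafe), the choice to notify before the store is arbitrary, and Point is a hypothetical example class.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

final class NotifyingFieldSetter {
    // Records which final fields were reported; stands in for invalidating dependent code.
    final List<String> notifications = new ArrayList<>();

    // Hypothetical stand-in for a VM hook such as Unsafe.fireDepChange(Field).
    void fireDepChange(Field f) {
        notifications.add(f.getDeclaringClass().getSimpleName() + "." + f.getName());
    }

    // Stores value into f on target, notifying first if the field is final.
    void set(Object target, Field f, Object value) {
        try {
            if (Modifier.isFinal(f.getModifiers())) {
                fireDepChange(f);
            }
            f.setAccessible(true);
            f.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Looks up a declared field, converting the checked exception.
    static Field field(Class<?> holder, String name) {
        try {
            return holder.getDeclaredField(name);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        NotifyingFieldSetter setter = new NotifyingFieldSetter();
        Point p = new Point(1, 2);
        setter.set(p, field(Point.class, "x"), 10); // final: notification fired
        setter.set(p, field(Point.class, "y"), 20); // non-final: no notification
        System.out.println(setter.notifications);
    }
}

final class Point {
    final int x;
    int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
```

The weakness noted earlier in the thread shows up directly: any code path that writes the field without going through such a wrapper silently skips the notification.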
II. Managing relations between final fields and nmethods
Another aspect is how expensive dependency checking becomes.
Isn't the underlying problem that the dependencies are searched linearly? At least in ConstantFieldDep, can we compartmentalize the dependencies by holder class in some sort of hash table?
In some cases (when coarse-grained (per-class) tracking is used), linear traversal is fine, since all nmethods will be invalidated. In order to construct a more efficient data structure, you need a way to order or hash oops. The problem with that is oops aren't stable - they can change at any GC. So, either some stable value should be associated with them (System.identityHashCode()?) or dependency tables should be updated on every GC.
Yeah, like Symbol::identityhash.
Symbol is an internal VM entity. Oops are different. They are just pointers to Java objects (OOP = Ordinary Object Pointer). The only doable way is to piggyback on the object hash code. I won't dive into details here, but there are many intricate consequences.
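The holder-class compartmentalization suggested above can be sketched as a hash table from holder class to the compiled methods that depend on it. This is only a model: nmethods are represented as strings, the class and method names are hypothetical, and the sketch covers per-class tracking only. Per-instance tracking would additionally need a stable key for each object, such as its identity hash, which is not attempted here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ConstantFieldDepIndex {
    // Holder class -> compiled methods that folded one of its final fields.
    private final Map<Class<?>, List<String>> byHolder = new HashMap<>();

    // Registers that compiledMethod depends on a final field declared in holder.
    void record(Class<?> holder, String compiledMethod) {
        byHolder.computeIfAbsent(holder, k -> new ArrayList<>()).add(compiledMethod);
    }

    // Returns only the dependents of holder, without scanning unrelated entries.
    List<String> dependentsOf(Class<?> holder) {
        return byHolder.getOrDefault(holder, List.of());
    }

    public static void main(String[] args) {
        ConstantFieldDepIndex index = new ConstantFieldDepIndex();
        index.record(String.class, "A::m");
        index.record(String.class, "B::n");
        index.record(Integer.class, "C::k");
        System.out.println(index.dependentsOf(String.class));
    }
}
```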
Unless the existing machinery can be sped up to an appropriate level, I wouldn't consider complicating things so much.
Okay. I just can't escape the feeling we keep band-aiding the linear searches everywhere in the VM on a case-by-case basis, instead of providing asymptotic guarantees with better data structures.
Well, class-based dependency contexts have been working pretty well for KlassDeps. They worked pretty well for CallSiteDeps as well, once a more specific context was used (I introduced a specialized CallSite instance-based implementation because it is simpler to maintain).
It's hard to come up with a narrow enough class context for ConstantFieldDeps, so it's probably a good time to consider a different approach to indexing nmethod dependencies. But assuming final field updates are rare (with the exception of deserialization), it may not be that important.
The 3 optimizations I initially proposed make it possible to isolate ConstantFieldDep from other kinds of dependencies, so dependency traversal speed will affect only final field writes. Which is acceptable IMO.
Except for an overwhelming number of cases where the final field stores happen in the course of deserialization. What's particularly bad about this scenario is that you wouldn't see the time burned in the VM unless you employ the native profiler, as we discovered in Nashorn perf work.
Yes, deserialization is a good example. It's special because it operates on freshly created objects, which, as you noted, haven't escaped yet. It'd be nice if the VM could skip dependency checking in such a case (either automatically or with explicit hints).
In order to diagnose performance problems with excessive dependency checking, VM can monitor it closely (UsePerfData counters + JFR events + tracing should provide enough information to spot issues).
Recapping the discussion in this thread, I think we would need more thorough performance work for this change, since it touches the very core of the platform. I think many people outside hotspot-compiler-dev understand some corner intricacies of the problem that we miss. JEP and outcry for public comments, maybe?
Yes, I planned to get quick feedback on the list and then file a JEP as a followup.
Thanks again for the feedback, Aleksey!
Best regards, Vladimir Ivanov
More information about the hotspot-compiler-dev mailing list