Valhalla EG minutes Feb 14, 2018

John Rose john.r.rose at oracle.com
Tue Feb 27 03:59:10 UTC 2018


On Feb 20, 2018, at 7:52 AM, Karen Kinnear <karen.kinnear at oracle.com> wrote:

attendees: Tobi, Mr Simms, Dan H, Dan S, Frederic, Remi, Karen
...
III. Value Types
Latest LWorld Value Types proposal: http://cr.openjdk.java.net/~acorn/LWorldValueTypesFeb13.pdf
Latest rough draft JVMS: http://cr.openjdk.java.net/~fparain/L-world/L-World-JVMS-4b.pdf

Feedback/Q&A:
1. creation of a new value type
- Remi - why not vnew? why default/withfield/withfield/withfield?
- transformations - e.g. Byteman - easier if arguments are on the stack
Frederic: First proposal had a factory bytecode, returning a single fully constructed value type
rejected: concern: cost of pushing all arguments, method signature and attribute to how signature maps to fields

Yep. This is really a FAQ. I'll take a shot at it.

"Why not just do vnew"? Because vnew would be a complicated construct requiring, for each vnew instruction, a detailed list of fields and values which amount to a whole record type. This would be resolved in the constant pool (again, for each vnew instruction). The constant pool would have to support a new variadic constant type, as a primitive constant type to supply the needs of vnew. This is beyond the complexity of any other constant pool type to date.

The data required to correctly link is analogous to (but more complex than) the code generation for enum-switch or strings-in-switch. Like those features, it is best implemented by a metafactory, not a single VM instruction.

Also, we have an independent need for one-field update to values (withfield) to express "wither" methods and similar shapes. But vdefault + withfield* covers the same functionality as vnew. And it reuses CONSTANT_Fieldref.

Also, Java constructors for value types translate naturally into vdefault + withfield*. They do not translate naturally into vnew. Some Java constructors translate naturally into vnew, but only those trivial ones which perform no logic other than blank final field assignment. Most Java constructors are not so trivial, and we do not intend to limit the expressiveness of value type constructors in such a way.

One objection to use of withfield is that it is hard to implement a series of withfield instructions in the interpreter, without creating many intermediate versions of a variable, each in its own buffer. Of course the JIT has no such problem, since it knows the liveness of each intermediate value, and can immediately reuse storage known to be unique to a particular value, to create the next version after a withfield. The problem in the interpreter can also be fixed, e.g., by appropriate use of approximate liveness tracking, such as reference counts. Such a local implementation issue must not be allowed to overturn the more basic design points noted above, which affect all classfiles and all compilers.

Specifically: If a value is loaded (using aload or vnew) to TOS, it may well have a reference count of 2 or more. But the first withfield will rebuffer it to unaliased storage. A chain of further withfields can just modify it in place. This pattern (of provably unaliased value buffers) can be detected on the fly by mechanical reference counts, or else by a lightweight pre-pass run at class load time, recoding each subsequent withfield after the first as "patching_withfield". Classfiles would not be allowed to mention patching_withfield, because it is grossly unsafe, but the interpreter would safely run those instructions where the class loader had determined that the operation is safe. There are lots of ways to skin this cat.
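The reference-count variant of this idea can be sketched in plain Java. This is only a toy model under stated assumptions, not interpreter code: the class ValueBuffer, its refCount field, and the static withfield helper are hypothetical names introduced for illustration, and fields are modeled as slots in a long[].

```java
import java.util.Arrays;

/** Toy model of an interpreter-side value buffer; all names are hypothetical. */
final class ValueBuffer {
    final long[] fields;   // field slots of the buffered value
    int refCount;          // how many locals/stack slots currently refer to this buffer

    ValueBuffer(long[] fields, int refCount) {
        this.fields = fields;
        this.refCount = refCount;
    }

    /**
     * Models withfield: consumes one reference to v and returns a buffer
     * holding the updated version of the value.
     */
    static ValueBuffer withfield(ValueBuffer v, int index, long newValue) {
        if (v.refCount > 1) {
            // Aliased: rebuffer into fresh storage that nobody else can see.
            v.refCount--;
            ValueBuffer copy = new ValueBuffer(Arrays.copyOf(v.fields, v.fields.length), 1);
            copy.fields[index] = newValue;
            return copy;
        }
        // Provably unaliased: patch in place, as "patching_withfield" would.
        v.fields[index] = newValue;
        return v;
    }
}
```

In this model the first withfield on a shared buffer pays for one copy, and every later withfield in the chain reuses that copy.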

Dan S: declared fields do not have an inherent ordering, so e.g. attribute to identify order - expected usage: factory method in the value class itself

Dan: also want withfield exposed at the language level to allow tweaking one thing

Yes, this is very important.

Karen: would be helpful to have a single way to create a value type or an object to allow more shared code - model is to move all toward a factory mechanism

For object classes this single way is new + <init> + putfield*. Plus reflective versions. For value classes this single way is + vdefault + withfield*. Plus reflective.

The two cases are in close correspondence in order to allow constructors to be translated compatibly for both object classes and value classes.

If we go to factory mechanisms it will be very frustrating (as with MVT) to pain-gram the constructor translation strategy. Remember that constructors usually intermix field sets with method calls (including on 'this') and control flow. You can't fit that into a factory or metafactory. You have to use a sequence of elemental operations expressed by bytecodes.

Frederic: - inside factory - it is not the same bytecodes for value type and object type creation - note: withfield returns a new value type - it does not have the same stack behavior as putfield

There are superficial differences, but the similarities outweigh them.

The initial binding of 'this' in an object constructor is the result of the 'new' instruction passed as the receiver argument to invokespecial <init>, and stored in local number zero. The initial binding of 'this' in a value constructor is produced locally, using 'vdefault', and stored (if necessary) in a local which does not correspond to any of the incoming arguments. (In fact it could be stored at the root of the stack, in many but not all cases.)

Those two sequences are different but in the rest of the constructor body, 'this' is uniformly available, perhaps in a partially uninitialized state. It's true: The object might have blank final fields containing their default values because they have not been putfield-ed to. The value might have fields containing their default values because they have not been withfield-ed to. There is no difference to the programmer: Getfield in either case will produce the default value. The rules of definite unassignment in the JLS make it hard to see this, of course, but it's there in both cases.

(Probably the JLS rules, as written today, are enough to ensure that no value type can contort itself to observe an uninitialized field, but this is not a necessary point. The JVM can see a default field value, if it wants, for both objects and values.)

After entry, the constructor body runs control flow and method calls, mixed with assignments (one per field along any path, per JLS rules) to the fields of the new instance (either object or value). At the end of the constructor, when it returns normally, the method returns void, leaving the passed-in object, in the required state. The factory method returns the new value, in the required state. The way I see it, these differences are at the surface, not in the basic semantics of constructors for the two kinds of classes.

A final observation: putfield and withfield have different stack behaviors, in that putfield doesn't return a result (just keeps hammering the same object over and over) while withfield returns a new version of the value. Again, this is a surface difference, because the translation of a value class constructor must simply pop the new version of the value and store it in the local variable (mentioned above) which was initially populated with a "vdefault". In some cases, a translator may be able to keep the value on the JVM stack through a series of peephole optimizations, but this point doesn't affect the semantic parity between the two classes of constructors.
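As a rough illustration of that translation, here is a sketch in ordinary Java that emulates a value class with an immutable class. Everything in it is an assumption made for the example: the class Point, the helper vdefault() standing in for the vdefault bytecode, the private withX/withY methods standing in for withfield, and the factory name make.

```java
/** Emulates a value class with an ordinary immutable class; a sketch only. */
final class Point {
    final int x, y;

    private Point(int x, int y) { this.x = x; this.y = y; }

    // Stand-in for vdefault: every field holds its default value.
    private static Point vdefault() { return new Point(0, 0); }

    // Stand-ins for "withfield x" and "withfield y": each returns a new version.
    private Point withX(int x) { return new Point(x, this.y); }
    private Point withY(int y) { return new Point(this.x, y); }

    /**
     * Hypothetical translation of the source constructor
     *   Point(int x, int y) { this.x = x; if (y < 0) y = 0; this.y = y; }
     */
    static Point make(int x, int y) {
        Point self = vdefault();   // initial binding of 'this', kept in a local
        self = self.withX(x);      // withfield, then store the new version back
        if (y < 0) {               // control flow intermixed with field sets
            y = 0;
        }
        self = self.withY(y);
        return self;               // the factory returns the finished value
    }
}
```

The "store back into the local" after each with-call is the only visible difference from a putfield-based object constructor.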

(Another note to us implementors: It would be somewhat reasonable, though uglier, if withfield took its first operand from a local rather than from the stack, and updated that local in place. The iinc instruction is a pre-existing example of this. There would be two benefits to using such an in-place withfield: First, there would be no need in a constructor or wither to push the local containing this on the stack and then pop the new version back off, for a small reduction in bytecode size. Second, there might be less duplication of buffers in an interpreter which used reference counts to track buffer usage. But these small advantages do not seem to me to outweigh the relative cleanliness of the current design of withfield.)

Dan H: factory proposal is better than defaultvalue/withfield - less throwing away extra created value types for the interpreter

I hate to disappoint Dan, but that would be the tail wagging the dog; see above.

3. withfield handling
Remi: why withfield?
Frederic: goal is to allow loop iteration with low cost
Remi: why restrict to within the value class itself?
Karen: concern: this creates a new value type, think of it as CopyOnWrite, it does NOT go through final and update an existing value type. So this is heavyweight
Remi: could we have the language decide restrictions on its usage rather than the JVMS?

That's the current scheme: We keep withfield private even if the field is public. This allows class writers to decide independently (a) how visible to make fields for reading, and (b) how much trust they give clients to create new values with arbitrary field settings. If a value type represents a checked capability, it must not be possible for external users to forge arbitrary new capabilities, in an unchecked manner. But public withfield would do this, or else force API designers always to hide fields behind accessors.

Same point even if the value type isn't a capability, but just asserts its right to validate and/or normalize field values. Raw withfield would subvert that.

A future version of withfield might allow a class to open up "raw" withfield access. But it is more likely that we will create a way for a class to open up a more "cooked" version of such access, such as some hook for "dumping" the state of a value and "reassembling" it from an altered state. In fact, that's just getfield* + constructor, in many cases, so the API points are already there. Crucially, the class writer has control over validation, normalization, permission checks, etc., in the reconstruction of the new value state.
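The "getfield* + constructor" route can already be written in today's Java, with an immutable class standing in for a value class. The sketch below is illustrative only; the names Ratio, RatioClient and withDen are hypothetical, and the zero-denominator check is just one example of validation a class writer might insist on.

```java
/** Immutable stand-in for a value class with publicly readable fields. */
final class Ratio {
    public final long num, den;

    public Ratio(long num, long den) {
        // The class writer's validation runs on every construction.
        if (den == 0) {
            throw new IllegalArgumentException("zero denominator");
        }
        this.num = num;
        this.den = den;
    }
}

/** An external client: it can read fields but cannot forge arbitrary states. */
final class RatioClient {
    static Ratio withDen(Ratio r, long newDen) {
        // "Dump" the state with field reads, then "reassemble" it through the
        // public constructor, so validation cannot be bypassed.
        return new Ratio(r.num, newDen);
    }
}
```

A raw public withfield would let the client skip the constructor, and with it the check.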

Dan S: future - if we want a general purpose withfield - we may want to put that in with extended field access controls - e.g. separate read vs. write. At that time you could use withfield if the field were accessible. - e.g. with Records - may expose readability, not availability

Yes. This is possible, even with object classes. I like to name this feature "sealed fields", since the sealed field is "usable but not redefinable", where "usable" = readable and "redefinable" = writable. (By analogy with sealed interfaces.) Value fields are sealed by default for reasons given above, but could be unsealed. Object fields are sealed if final, but unsealed if non-final. An intermediate state might make sense.

But: When I work out use cases for these intermediate states, I don't see anything promising yet. So I think we can stick with what we have now and make a note to reevaluate later.

Frederic: concern about confusing people - withfield with an immutable object

Dan S: language could make this clearer that this is not an assignment, but is a “new” Opinions?

Yes, we need a new syntax at the source code level to make it clear that (a) an old value instance is being operated on, but (b) a new version of that instance is the result of the operation. It seems promising to me to allow something like a constructor body (with field values in scope under their own names) to pull this off.

But I don't know what this looks like, except maybe in the easier case of a "named reconstructor" within a class:

    __ByValue class Rational {
      public final long num, den;
      public Rational(long num, long den) {
        this.num = num;
        this.den = den;
        assert(den != 0);
      }
      public __Reconstructor neg() {
        // constructor rules here, except fields appear mutable
        num *= -1;  // aload L0; dup; getfield num; iconst -1; imul; withfield num; astore L0
        return;  // aload L0; areturn
      }
    }

The rule is, inside a constructor you can assign to your fields. That's the rule already, of course, but in a reconstructor you can do the same things, plus refer to previous field values. (And perhaps 'this', although there's some doubt about which version should apply: Either the original or the current state.)
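For comparison, here is what such a reconstructor might desugar to if written by hand in current Java. This is a guess at one plausible reading, not a proposed translation: the class is an ordinary immutable class rather than a value class, and the exception in the constructor replaces the assert for the sake of a checkable example.

```java
/** Hand-written approximation of the reconstructor idea in current Java. */
final class Rational {
    public final long num, den;

    public Rational(long num, long den) {
        if (den == 0) {
            throw new IllegalArgumentException("den must be nonzero");
        }
        this.num = num;
        this.den = den;
    }

    /** Plays the role of the hypothetical "__Reconstructor neg()". */
    public Rational neg() {
        // Previous field values come into scope as mutable locals.
        long num = this.num;
        long den = this.den;
        num *= -1;                      // the "assignment" touches only the local
        return new Rational(num, den);  // a new version is the result
    }
}
```

Here 'this' always means the original state, which sidesteps the question of which version 'this' should denote inside a real reconstructor.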

4. arrays
We need a new bytecode to create a flattenable/non-nullable array
existing bytecodes do not create flattenable arrays with the new model of container marking flattenable rather than by type

Whoa.

I haven't yet seen a strong reason to do per-container flattenability. I'd rather not do this unless there is a strong reason.

And even if we need this, there's no reason to burn a new bytecode; it can be a reflective call, as java.lang.reflect.Array.newNullableInstance. (Yes, I think the existing bytecode should make the correct flatness.)

…

5. Arrays and nullability

Question: can you pass a VT[] where an Object[] is expected?
Yes you can pass the argument, and sub typing works.
Frederic: If you have an Object[], if you have non-flattenable values then elements are nullable, if you have flattenable values, then elements are not nullable

Yep. We are eating the cost of buffering up flat elements inside aaload, even though that requires a data-dependent check.

The costs of this can be reduced in the JIT, using profiling.

But if we allow flatness to be a new randomly changing bit on instances, the profiling will be less effective (until we profile that bit, perhaps, or perhaps not).

The JVM works very hard to feed type analysis to the JIT. Let's think twice before we make flatness not a property of types.
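A small model may make the data-dependent check concrete. In the sketch below, which is only an analogy and not how any JVM lays out arrays, the interface Elements plays the role of the Object[] static type, and two hypothetical implementations model a reference array and a flattened array of a two-int value. The dispatch on the implementation class stands in for the check that aaload must make.

```java
/** Toy model of flat versus non-flat arrays behind one static type. */
final class ArrayModel {
    /** Simple stand-in for a value: two ints. */
    static final class Pair {
        final int a, b;
        Pair(int a, int b) { this.a = a; this.b = b; }
    }

    /** Common view, playing the role of Object[] in the discussion. */
    interface Elements {
        Pair load(int i);            // models aaload
        void store(int i, Pair p);   // models aastore
    }

    /** Non-flattened: one reference per element, so null is allowed. */
    static final class RefElements implements Elements {
        private final Pair[] refs;
        RefElements(int n) { refs = new Pair[n]; }
        public Pair load(int i) { return refs[i]; }
        public void store(int i, Pair p) { refs[i] = p; }
    }

    /** Flattened: fields are stored inline, so elements are never null. */
    static final class FlatElements implements Elements {
        private final int[] slots;
        FlatElements(int n) { slots = new int[2 * n]; }
        public Pair load(int i) {
            // Buffering: materialize a fresh Pair from the inline fields.
            return new Pair(slots[2 * i], slots[2 * i + 1]);
        }
        public void store(int i, Pair p) {
            if (p == null) {
                throw new NullPointerException("flat element cannot be null");
            }
            slots[2 * i] = p.a;
            slots[2 * i + 1] = p.b;
        }
    }
}
```

When the implementation is determined by the element type, a profile of the call site tends to see one case; if flatness varied per instance, the same site could see both.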

(Note to self: This argument applies somewhat to frozen arrays also. But there reusing the same types seems to be a forced move. Non-flat arrays of flat types is not a forced move.)

5. Generics and nullability

Dan S: With generics, value types will work as is. In future, if we were to change a field to be non-nullable, then we could get NullPointerExceptions
Karen: if we were to change a field to be non-nullable, then if we wanted to we could support a different layout, and that would require specialization if the field were non-nullable depending on the parameter type.
This is a current open challenge - how to handle migration to non-nullable fields and arrays

We are working cases on this. It looks like the issue, often, is how tolerant to be about "polluting nulls" in code which is not fully type-correct. (It is not type-correct when old classfiles co-exist with new ones.) The decision points are at places like getfield, putfield, invoke, return, aastore, checkcast. When do we allow a polluting null to pass by, and when do we throw NPE? The basic right move, I think, is to throw NPE as soon as possible, as a service to the JIT, and also to the user who wants to know something is fishy in the code.

But legacy classfiles must never throw NPE, since they can't know any better. This implies that a number of our bytecodes must be sensitive to the classfile version, and throw NPE only on recompiled code. This behavioral divergence is not (IMO) enough to warrant a new slew of bytecodes, but it is a tight fit to support both old and new behaviors. Maybe we want a bytecode prefix that means "allow polluting nulls", and treat all relevant bytecodes in old code as if that prefix were present.
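The version-sensitive behavior can be written down as a small decision function. The sketch below is hypothetical throughout: the class NullPolicy, the method checkNonNullable, and especially the constant FIRST_VALUE_AWARE_MAJOR are invented for illustration, since no classfile version number is named here.

```java
/** Toy model of a null-sensitive bytecode whose behavior depends on its classfile. */
final class NullPolicy {
    // Hypothetical threshold; an assumption made only for this example.
    static final int FIRST_VALUE_AWARE_MAJOR = 99;

    /**
     * Models one decision point (such as checkcast to a value type).
     * Returns the reference if it may pass, otherwise throws NPE.
     *
     * @param classfileMajor      major version of the classfile holding the bytecode
     * @param allowPollutingNulls true if the hypothetical prefix is present
     */
    static Object checkNonNullable(Object ref, int classfileMajor, boolean allowPollutingNulls) {
        // Old code is treated as if the prefix were present on every bytecode.
        boolean legacy = classfileMajor < FIRST_VALUE_AWARE_MAJOR;
        if (ref == null && !legacy && !allowPollutingNulls) {
            throw new NullPointerException("null where a value type is required");
        }
        return ref;
    }
}
```

Under this model only recompiled code without the prefix ever sees the NPE.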

Note that in future we might want non-nullable identity objects as well as value types.

Yes. In which case the prefix might mean "invert treatment of polluting nulls". Or maybe the prefix comes with flag bit to explicitly get the NPE behavior vs. pass behavior.

We could also handle statically mandated null checks with a new nullcheck bytecode or even just invokestatic Objects.requireNonNull. That doesn't feel right to me, since the right behavior should also look relatively simple in bytecodes, compared to the wrong behavior.
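For reference, the library-call alternative looks like this at the source level. java.util.Objects.requireNonNull is an existing API; the wrapper class StaticNullCheck and its describe method are hypothetical and exist only to give the call a context.

```java
import java.util.Objects;

/** Shows an explicit, statically placed null check via an existing library call. */
final class StaticNullCheck {
    static String describe(Object maybeValue) {
        // Compiles to an invokestatic of Objects.requireNonNull; throws
        // NullPointerException with the given message if the argument is null.
        Object v = Objects.requireNonNull(maybeValue, "polluting null");
        return v.toString();
    }
}
```

The extra call at every use site is the cost being weighed: the checked path ends up looking more elaborate in bytecode than the unchecked one.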

To help migration, Brian would like us to find a way so that javac would detect a mismatch in expectations of nullability, so we catch them at compile time.

Ooh, that's a good idea. When javac resolves use of an API in a JAR that isn't up to Valhalla classfile version, it could check to see if the API signature mentions return types which are statically known to be value types. (And in other covariant positions, maybe.) That's an example of a classfile that could introduce polluting nulls, and merits a warning.

Thanks for pushing all this forward! — John


More information about the valhalla-spec-experts mailing list