Serialization problem
Osvaldo Doederlein opinali at gmail.com
Sun Jan 31 17:42:30 UTC 2010
It's sad to see this issue of serialization vs. final resurface so many times. I have complained about it myself a number of times. The 'final' modifier is counter-intuitive because it doesn't really prohibit modification (most developers don't know that even a 'private static final' field can be updated by reflection or JNI, as explicitly allowed by the JLS). On top of that, Serialization was introduced without enough care for final fields, so we are effectively forced to drop 'final' from fields that require custom deserialization. Now these problems will bite us much more often because the immutable-object technique is being increasingly adopted, sometimes by whole libraries, or even more radically by newer languages-for-the-JVM like Clojure (and I guess these languages would love to translate their semantically-immutable types into immutable JVM-level classes whenever possible, e.g. when mutable state is not introduced by the compiler as an optimization around extra allocations).
My suggestion (a big one I know, perhaps an idea for Java 8...) - add some mechanism (annotation, type modifier, etc.) that allows us to fix/strengthen the semantics of final and serialization (and more? some suggestions below...), as follows:
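To illustrate the point that 'final' does not really prohibit modification, here is a minimal sketch of overwriting a final instance field through reflection. The class and method names (FinalFieldReflectionDemo, Point, overwriteFinal) are hypothetical and exist only for this example. It covers only an instance field assigned in a constructor; static final fields and JNI are separate cases not shown here, and the behaviour may differ or produce warnings depending on the JDK release.

```java
import java.lang.reflect.Field;

final class FinalFieldReflectionDemo {

    // Hypothetical immutable-looking class, used only for this illustration.
    static final class Point {
        private final int x;

        Point(int x) {
            this.x = x;   // assigned from a parameter, so javac cannot inline it as a constant
        }

        int x() {
            return x;
        }
    }

    // Overwrites the final field 'x' via reflection and returns the value read afterwards.
    static int overwriteFinal(Point p, int newValue) throws ReflectiveOperationException {
        Field f = Point.class.getDeclaredField("x");
        f.setAccessible(true);   // suppresses the access check; needed before writing a final instance field
        f.setInt(p, newValue);
        return p.x();
    }

    public static void main(String[] args) throws Exception {
        Point p = new Point(1);
        System.out.println("before: " + p.x());
        System.out.println("after:  " + overwriteFinal(p, 42));
    }
}
```

Nothing in the source of Point hints that x can change after construction, which is the counter-intuitive part.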
- Final fields are guaranteed immutable, forever, after construction. They cannot be changed by magic reflection calls from trusted classes, or even by JNI calls. (If it's too expensive to write-protect against arbitrary JNI code, spec the result as 'undefined behavior', and perform the very expensive check only with -Xcheck:jni.)
- readObject() and other deserialization helpers can update final fields. The "final freeze" is defined to happen only after deserialization is complete. (If readObject() invokes some helper methods, even if these methods are in the same class and only used by deserialization, they cannot assign to final fields; no need to make this check complex.)
- In final fields of array type, the array elements are also immutable after the final-freeze. (I know this creates some challenges. Within the same class, maybe we can make enough effort to ensure that the array field is not assigned/aliased/reflected in a way that would allow modification of its elements. Then we can just decree that the array cannot escape its class; so if one wants a getter for the array, it's mandatory to return a copy of it. This technique - defensive copying - is already a best-practice, so it's not really extra cost. And the JIT can always eliminate copying in cases covered by Escape Analysis, e.g. in println(obj.getStuff()[5]), it's trivial to see that the array copy performed by getStuff() can be avoided at this particular callsite.)
- If the class is Serializable, providing serialVersionUID is mandatory.
- If the class overrides hashCode(), it must override equals() and vice-versa.
- If the class (or some of its superclasses except Object) doesn't override hashCode(), a call to hashCode() throws an exception; that is, relying on Object.hashCode() is banned.
- Some interaction with JSR-305 (enforcing the semantics of its annotations further - a real pluggable typesystem)? 8..) More?
The general idea is enforcing as many "Modern POJO Best-Practices" as possible (without requiring extra code, so I don't propose things such as mandating hashCode/equals to be overridden). This enforcement should be hard-line, with detection of noncompliance at both runtime and (when possible) compile-time. It should be robust enough (no possible circumvention) that the security system could rely on it to enforce security concerns without extra runtime checks, and the JIT optimizer could rely on it to enable aggressive optimizations.
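As a rough picture of what a class already following these practices by hand looks like today, here is a small sketch: an explicit serialVersionUID, a final array field that never escapes thanks to defensive copies, and hashCode()/equals() overridden together. The class name Samples and its single field are made up for illustration; nothing here is enforced by the JVM, which is exactly what the proposal above would change.

```java
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical value class; the name and field are invented for this example.
final class Samples implements Serializable {
    private static final long serialVersionUID = 1L;   // explicit, as the proposal would require

    private final int[] values;

    Samples(int[] values) {
        this.values = values.clone();   // copy in, so the caller's array is not aliased
    }

    int[] getValues() {
        return values.clone();          // copy out, so the internal array never escapes
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Samples && Arrays.equals(values, ((Samples) o).values);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(values); // overridden together with equals()
    }
}
```

With both copies in place, neither mutating the array passed to the constructor nor mutating the array returned by getValues() changes the object's state.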
A+ Osvaldo
2010/1/31 Alan Bateman <Alan.Bateman at sun.com>
Stephen Colebourne wrote:
I thought I'd raise an issue with serialization that I've had a problem with more than once. Perhaps there is an obvious easy solution, but I can't see it (I can see hard workarounds...)
In JSR-310 we have lots of immutable classes. One of these stores four fields:
private final String name
private final Duration duration
private final List periods
private final int hashCode
For serialization, I only need to store the name, duration and element zero from the periods list. (The rest of the period list is a cache derived from the first element. Similarly, I want to cache the hash code in the constructor as this could be performance critical.) Storing just these fields can be done easily using writeObject()
In the JDK there are places that use Unsafe's putObjectVolatile to work around this. It's also possible to use reflection hacks in some cases. There is more discussion here: http://bugs.sun.com/viewbug.do?bugid=6379948
Doug Lea and the concurrency group were working on a Fences API that included a method for safe publication so that one can get the same effects as final for cases where it's not possible to declare a field as final.
For the hashCode case above then perhaps it isn't necessary to compute the hash code in the constructor or when reconstituting the object. Instead perhaps the hashCode method could compute and set the hashCode field when it sees the value is 0 (no need to be volatile and shouldn't matter if more than one thread computes it).
-Alan.
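Alan's suggestion of computing the hash code lazily can be sketched roughly as follows. This is only an illustration under assumptions: the class NamedRule, its single name field and the roundTrip helper are hypothetical and are not the JSR-310 class discussed above. The cached hash is a non-final, non-volatile, transient int where 0 means "not computed yet", so it is simply recomputed on first use after deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class for illustration; not the JSR-310 type from the thread.
final class NamedRule implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String name;

    // Cache only: not final, not volatile, not serialized. 0 means "not computed yet".
    private transient int hash;

    NamedRule(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof NamedRule && name.equals(((NamedRule) o).name);
    }

    @Override
    public int hashCode() {
        int h = hash;                       // read the field once into a local
        if (h == 0) {
            h = 31 * 17 + name.hashCode();  // deterministic, so racing threads compute the same value
            hash = h;                       // if the result happens to be 0 it is just recomputed each call
        }
        return h;
    }

    // Serializes and deserializes the object in memory, to show the cache survives being absent.
    static NamedRule roundTrip(NamedRule in) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(in);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (NamedRule) ois.readObject();
        }
    }
}
```

Because the cache field is not final, no custom readObject() is needed for it; the trade-off is one extra branch in hashCode() and the loss of the 'final' guarantee on that one field.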