IBM Java PackedObjects
1. IBM Java PackedObjects: An Overview
Marcel Mitran – STSM, Architect Java on System z (mmitran@ca.ibm.com)
November 20th, 2012
IBM Software Group: Java Technology Centre
© 2012 IBM Corporation
2. Important Disclaimers THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES. ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE. IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE. IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF: - CREATING ANY WARRANTY OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS
3. PackedObject Delivery and Intended Use
PackedObject is an experimental feature in the IBM J9 Virtual Machine.
Goal(s) of Feature:
■ Improve serialization and I/O of Java objects
■ Allow direct access to “native” (off-heap) data
■ Allow for explicit source-level representation of compact data structures
Intended Use:
■ Provide an opportunity for feedback and experimentation
– Not meant for production support
– Not a committed language change
4. PackedObjects for IBM's Java
Features of today's Java work well in certain scenarios, poorly in others...
● Bloated Objects: Data headers and references required to access and use data stored outside of Java
● No direct access to off-heap data: Java Native Interface or Direct Byte Buffers required when accessing
● Redundant Data Copying: Copies of off-heap data required to incorporate/act-on changes to source data
● Suboptimal heap placement: Non-adjacent placement of objects in memory slows down serialization, garbage collection
...changing how Java data is represented, accessed and used introduces new efficiencies into the Java language
● Shared Headers & No References
● Direct access to native stored off-heap data
● Elimination of data copies
● In-lined data allows for optimal caching
5. Speak to me in 'Java', I don't speak 'Native'
■ Java only speaks ‘Java’…
– Data typically must be copied/(de)serialized/marshalled onto/off the Java heap
– Costly in path-length and footprint
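The copy-in/copy-out cost described above can be sketched with standard Java only. This is an illustration, not code from the presentation: the `Reading` class, its three fields and the 20-byte record layout are hypothetical, and a direct `ByteBuffer` stands in for native storage.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical 20-byte native record: int id (4) + long timestamp (8) + double value (8).
final class Reading {
    static final int SIZE = 20;
    int id;
    long timestamp;
    double value;

    // Copy-in: decode the native bytes into a fresh object on the Java heap.
    static Reading fromNative(ByteBuffer src, int offset) {
        Reading r = new Reading();
        r.id = src.getInt(offset);
        r.timestamp = src.getLong(offset + 4);
        r.value = src.getDouble(offset + 12);
        return r;
    }

    // Copy-out: encode the heap object back into the native bytes.
    void toNative(ByteBuffer dst, int offset) {
        dst.putInt(offset, id);
        dst.putLong(offset + 4, timestamp);
        dst.putDouble(offset + 12, value);
    }
}

final class CopyDemo {
    // Direct buffer used as a stand-in for storage owned by native code.
    static ByteBuffer sampleNative() {
        ByteBuffer buf = ByteBuffer.allocateDirect(Reading.SIZE).order(ByteOrder.nativeOrder());
        buf.putInt(0, 7).putLong(4, 1353369600000L).putDouble(12, 21.5);
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer nativeData = sampleNative();
        Reading r = Reading.fromNative(nativeData, 0); // first copy
        r.value += 1.0;                                // changes only the heap copy
        System.out.println("native before write-back: " + nativeData.getDouble(12));
        r.toNative(nativeData, 0);                     // second copy
        System.out.println("native after write-back:  " + nativeData.getDouble(12));
    }
}
```

Every update to the record costs two traversals of its fields plus a heap allocation, which is the path-length and footprint overhead the slide refers to.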
6. On-Heap PackedObjects Example
■ Allows controlled layout of storage of data structures on the Java heap
– Reduces footprint of data on Java heap
– No (de)serialization required
[Figure labels: I/O, Native storage (20 bytes), Java heap, JVM]
7. Off-Heap PackedObjects Example
■ Enable Java to talk directly to the native data structure
– Avoid overhead of data copy onto/off Java heap
– No (de)serialization required
[Figure labels: I/O, Native storage (20 bytes), Meta Data, JVM, Java heap]
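The closest approximation available in standard Java today is a hand-written view over a direct `ByteBuffer`, sketched below. It is not the PackedObject API: the `ReadingView` class and its 20-byte layout (int, long, double) are hypothetical, and the offsets that a packed class would derive from its declaration are maintained by hand here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Flyweight view over a hypothetical 20-byte native record:
// int id (offset 0), long timestamp (offset 4), double value (offset 12).
final class ReadingView {
    static final int SIZE = 20;
    private final ByteBuffer storage; // native (off-heap) memory
    private final int base;           // byte offset of this record in the storage

    ReadingView(ByteBuffer storage, int base) {
        this.storage = storage;
        this.base = base;
    }

    // Accessors read and write the native bytes in place; no heap copy of the data is made.
    int id()             { return storage.getInt(base); }
    long timestamp()     { return storage.getLong(base + 4); }
    double value()       { return storage.getDouble(base + 12); }
    void value(double v) { storage.putDouble(base + 12, v); }
}

final class ViewDemo {
    public static void main(String[] args) {
        // Two records stored back to back in one off-heap block.
        ByteBuffer nativeData =
                ByteBuffer.allocateDirect(2 * ReadingView.SIZE).order(ByteOrder.nativeOrder());
        nativeData.putInt(20, 8).putLong(24, 42L).putDouble(32, 3.0);

        ReadingView second = new ReadingView(nativeData, ReadingView.SIZE);
        second.value(second.value() * 2); // the update lands directly in native storage
        System.out.println(nativeData.getDouble(32));
    }
}
```

Only the small view object lives on the Java heap; the record itself is never copied or (de)serialized.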
8. Example: Distributed Computing High-Level Architecture
Communication between nodes (RDMA, hyper-sockets, ORB, etc):
● Data copy
● (De)serialization
Data persistency on each node (DB, file-system, etc):
● Data copy
● (De)serialization
Using Java packed objects, data can be moved between the persistency and communication layers without being copied or (de)serialized onto/off the Java heap.
[Figure: two nodes, each with a DB, a JVM and an App. Server]
9. Example: Inter-language Communication
Java requires data copies, marshalling and serialization across language boundaries
[Figure: COBOL foo(…){ … goo(); } calls Java goo(…){ … loo(); }, which calls C/C++ loo(…){ … }]
Java packed objects avoid data copies, marshalling and serialization
[Figure: the same COBOL foo(…), Java goo(…), C/C++ loo(…) call chain]
10. PackedObjects 101
■ A new PackedObject type for the Java language:
– Allows direct access to data located outside of the Java heap
– Allows contiguous allocation of all of an object's data (objects and arrays)
– Is not derived from Object, and hence disallows assignment and casting to Object types
– A special BoxedPackedObject is the glue for referencing a PackedObject from an Object
[Class hierarchy figure: java/lang/Object with java/math/BigDecimal etc…, java/lang/String, java/lang/HashMapEntry, java/lang/BoxedPackedObject; java/lang/PackedObject with java/lang/PackedArray etc…, java/lang/PackedString, java/lang/PackedHashMapEntry]
■ Current Java Capabilities
– Current Java logic requires language interpreters and data copies for execution.
– PackedObjects eliminate data copies across the Java Native Interface and the need to design and maintain Direct Byte Buffers
■ Using PackedObjects: an annotation (or, later, a packed keyword) above a class definition is required to create a packed class. Instances of the class can be accessed and modified in the same way as current Java objects.
11. Scope of Implementation
■ “@Packed” class annotation used to define a PackedObject class
■ “@Length” field annotation used to specify length of PackedObject arrays
Proposed Initial Rules
■ Packed types must directly subclass PackedObject
■ Packed inlining can only happen for field declarations which are primitives, PackedObjects or arrays of PackedObjects
■ Fields made up of arrays must provide a length that is a compile-time constant
■ Regular Java primitive types cannot be used to declare a PackedObject array. Boxed types for primitive arrays must be used instead.
■ A field declaration cannot introduce a circular class dependency
■ When a PackedObject is instantiated, only the constructor for the top-level PackedObject is called
■ Local variable assignment and parameter passing of a PackedObject is copy-by-reference
■ BoxedPackedObject is used to box a PackedObject with an Object reference
■ Allocating a PackedObject using the 'new' keyword creates an on-heap PackedObject
■ Off-heap PackedObject creation is done using a factory method provided in the class library
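Several of the rules above (in-lining of primitives and nested packed types, compile-time array lengths, no primitive arrays, no circular dependencies) can be made concrete with a small layout calculator. This is a sketch under stated assumptions: the `@Packed` and `@Length` annotations declared here are local stand-ins that only share their names with the slide, not the experimental IBM J9 API. The classes `Point`, `Triangle` and `PackedLayout` are hypothetical, and the requirement to subclass PackedObject is not modelled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashSet;
import java.util.Set;

// Local stand-ins for the annotations named on the slide (assumption, not the J9 API).
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
@interface Packed {}

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
@interface Length { int value(); }

@Packed final class Point { int x; int y; }

@Packed final class Triangle {
    int colour;
    @Length(3) Point[] vertices; // length is a compile-time constant
}

final class PackedLayout {
    // Payload bytes of a packed type if every field is in-lined (headers excluded).
    static int sizeOf(Class<?> type) {
        return sizeOf(type, new HashSet<>());
    }

    private static int sizeOf(Class<?> type, Set<Class<?>> open) {
        if (!type.isAnnotationPresent(Packed.class))
            throw new IllegalArgumentException(type + " is not @Packed");
        if (!open.add(type))
            throw new IllegalArgumentException("circular dependency via " + type);
        int total = 0;
        for (Field f : type.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            Class<?> ft = f.getType();
            if (ft.isPrimitive()) {
                total += primitiveSize(ft);
            } else if (ft.isArray()) {
                Length len = f.getAnnotation(Length.class);
                if (len == null) throw new IllegalArgumentException(f + " needs @Length");
                // Primitive element types are rejected by the recursive @Packed check.
                total += len.value() * sizeOf(ft.getComponentType(), open);
            } else {
                total += sizeOf(ft, open); // nested packed object is in-lined
            }
        }
        open.remove(type);
        return total;
    }

    private static int primitiveSize(Class<?> p) {
        if (p == long.class || p == double.class) return 8;
        if (p == int.class || p == float.class) return 4;
        if (p == short.class || p == char.class) return 2;
        return 1; // byte and boolean, assumed to take one byte each
    }
}
```

With these definitions `PackedLayout.sizeOf(Triangle.class)` adds 4 bytes for `colour` to 3 × 8 bytes for the in-lined `Point` elements.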
12. Code Snippets
■ Packed class definition
■ On-Heap Packed Allocation
■ Off-heap Packed Allocation
13. Functionality Changes (Current Java vs. PackedObject)
■ Data Field Allocation and Storage
– Current Java: Object fields limited to primitives or references to other objects; non-primitives must be initialized and copied into a format understood by Java.
– PackedObject: When allocating a PackedObject, all corresponding data fields get allocated simultaneously and packed into a single contiguous object (rather than referenced).
■ Child objects
– Current Java: Headers for child objects copied onto the Java heap when accessed.
– PackedObject: No headers for child objects, which all share the global header on the PackedObject.
■ Arrays
– Current Java: For arrays of objects, each element in an array has its own header and a reference to it. The elements are not contiguous in memory.
– PackedObject: Arrays packed together contiguously under one common header; array length marked in the PackedObject header. Full access to elements in the array and bounds checking still performed.
■ Off-heap
– Current Java: Data cannot be accessed or modified outside of the Java heap. Data must be converted into a Java version, and this copy can then be accessed and manipulated.
– PackedObject: Data that does not exist in Java can be accessed and modified directly by using the data's memory location. The Java Virtual Machine takes care of the accessors and modifiers internally.
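The array row can be turned into rough arithmetic. The sketch below compares an array of references to separately allocated elements with a packed array under one header. The header and reference sizes are parameters because they depend on the JVM and its settings. The values used in `main` (16-byte headers, 4-byte references) are illustrative assumptions, not measurements from the presentation, and alignment padding is ignored.

```java
// Rough footprint estimate; all sizes are in bytes and are caller-supplied assumptions.
final class FootprintEstimate {

    // Conventional layout: one array header, one reference per element,
    // and one separately allocated object (header + payload) per element.
    static long referenceArrayBytes(long elements, long payload, long header, long reference) {
        return header + elements * reference + elements * (header + payload);
    }

    // Packed layout: one shared header followed by the in-lined payloads.
    static long packedArrayBytes(long elements, long payload, long header) {
        return header + elements * payload;
    }

    public static void main(String[] args) {
        long n = 1000;      // number of elements
        long payload = 8;   // e.g. two int fields per element
        long header = 16;   // assumed object/array header size
        long reference = 4; // assumed reference size
        System.out.println("reference array: " + referenceArrayBytes(n, payload, header, reference));
        System.out.println("packed array:    " + packedArrayBytes(n, payload, header));
    }
}
```

Under these assumed sizes the per-element cost falls from header + reference + payload to payload alone, which is the effect the "one common header" entry describes.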
14. Off-Heap Benefit: Lowers Memory Footprint, Increases Performance
Before
● Java requires objects to be in primitive form to be accessed directly*
● If objects are not in primitive form, references and copies are required to access data; time-consuming conversion process
● When objects are graphed onto the heap, they are placed randomly and occupy more space than is needed
● Memory bloat occurs due to data copies (data must be accessed and copied, including headers)
*without the use of JNI or DBB
[Figure: headers and data in native memory are referenced from, and copied onto, the Java heap]
After
● PackedObjects eliminate the requirement for objects to be in primitive form
● PackedObjects can be accessed directly from source without the redundant copying; no conversion
● PackedObject allocates and packs all data fields (including other PackedObjects and arrays) into a single well-defined contiguous storage area
[Figure: a single header with direct access to native memory, no copy]
15. Copyright and Trademarks © IBM Corporation 2012. All Rights Reserved. IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., and registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web – see the IBM “Copyright and trademark information” page at URL: www.ibm.com/legal/copytrade.shtml