Quick overview: Structure of Java Virtual Machine (original) (raw)

Java-based applications run in Java Runtime Environment (JRE) which consists of a set of Java APIs and Java Virtual Machine (JVM). The JVM loads an application via class loaders and runs it by the execution engine.

JVM design principles

JVM runs on all kind of hardware where executes a java bytecode without changing java execution code. VM implements a Write Once Run Everywhere principle or so-called platform independence principle. Just to sum up JVM key design principles:

Java bytecode

Java uses bytecode as an intermediate representation between source code and machine code which runs on hardware. Bytecode instruction is represented as 1 byte numbers e.g. getfield 0xb4, invokevirtual 0xb6 hence there is a maximum of 256 instructions. If the instruction doesn’t need operand so next instruction immediately follows otherwise operands follow instruction according to instruction set specification. Those instructions are contained in class files produced by java compilation. The exact structure of the class file is defined in “Java Virtual Machine specification” section 4 – class file format. After some version information there are sections like constant pools, access flags, fields info, this and super info, methods info etc. See the spec for the details.

Java class loading

A class loader loads compiled java bytecode to the Runtime Data Areas and the execution engine executes the Java bytecode. Class is loaded when is used for the first time in the JVM. Class loading works in dynamic fashion on parent-child (hierarchical) delegation principle. Class unloading is not allowed. Some time ago I wrote an article about class loading on application server. Detail mechanics of class loading is out of scope for this article.

Java memory model

Runtime data areas are used during execution of the program. Some of these areas are created when JVM starts and destroyed when the JVM exits. Other data areas are per-thread – created on thread creation and destroyed on thread exit. FThe following picture is based mainly on JVM 8 internals (doesn’t include segmented code cache and dynamic linking of language introduced by JVM 9).

25AC9487-1AB9-48DE-936C-B6A9AC781637

Program counter

Program counter exist for one thread and has the address of currently executed instruction unless it is native then the PC is undefined. PC is, in fact, pointing to at a memory address in the Method Area.

Stack

JVM stack exists for one thread and holds one Frame for each method executing on that thread. It is LIFO data structure. Each stack frame has reference to for local variable array, operand stack, runtime constant pool of a class where the code being executed.

Native Stack

Native stack is not supported by all JVMs. If JVM is implemented using C-linkage model for JNI than stack will be C stack(order of arguments and return will be identical in the native stack typical to C program). Native methods can call back into the JVM and invoke Java methods.

Stack Frame

Stack Frame stores references that point to the objects or arrays on the heap.

Heap

Heap is area shared by all threads used to allocate class instances and arrays in runtime. Heap is the subject of Garbage Collection as a way of automatic memory management used by java. This space is most often mentioned in JVM performance tuning.

Non-Heap memory areas

Java runtime

Bytecode assigned to runtime data areas in the JVM via class loader is executed by the execution engine. Engine reads bytecode in the unit of instruction. Execution engine must change the bytecode to the language that can be executed by the machine. This can happen in one of two ways: