VM Spec The class File Format (original) (raw)

The bytes item of the CONSTANT_Integer_info structure represents the value of the int constant. The bytes of the value are stored in big-endian (high byte first) order.

The bytes item of the CONSTANT_Float_info structure represents the value of the float constant in IEEE 754 floating-point single format (§3.3.2). The bytes of the single format representation are stored in big-endian (high byte first) order.

The value represented by the CONSTANT_Float_info structure is determined as follows. The bytes of the value are first converted into an int constant bits. Then:

     int s = ((bits>> 31) == 0) ? 1 : -1;     int e = ((bits>> 23) & 0xff);     int m = (e == 0) ?      (bits& 0x7fffff) << 1 :     (bits& 0x7fffff) | 0x800000; Then the float value equals the result of the mathematical expression s·m·2e-150.

4.4.5 The CONSTANT_Long_info and CONSTANT_Double_info Structures

The CONSTANT_Long_info and CONSTANT_Double_info represent 8-byte numeric (long and double) constants:

    CONSTANT_Long_info {      u1 tag;      u4 high_bytes;      u4 low_bytes;     }

    CONSTANT_Double_info {      u1 tag;      u4 high_bytes;      u4 low_bytes;     }

All 8-byte constants take up two entries in the constant_pool table of the class file. If a CONSTANT_Long_info or CONSTANT_Double_info structure is the item in the constant_pool table at index n, then the next usable item in the pool is located at index n+2. The constant_pool index n+1 must be valid but is considered unusable.2

The items of these structures are as follows:

tag

The tag item of the CONSTANT_Long_info structure has the value CONSTANT_Long (5).

The tag item of the CONSTANT_Double_info structure has the value CONSTANT_Double (6).

high_bytes, low_bytes

The unsigned high_bytes and low_bytes items of the CONSTANT_Long_info structure together represent the value of the long constant ((long) high_bytes << 32) + low_bytes, where the bytes of each of high_bytes and low_bytes are stored in big-endian (high byte first) order.

The high_bytes and low_bytes items of the CONSTANT_Double_info structure together represent the double value in IEEE 754 floating-point double format (§3.3.2). The bytes of each item are stored in big-endian (high byte first) order.

The value represented by the CONSTANT_Double_info structure is determined as follows. The high_bytes and low_bytes items are first converted into the long constant bits, which is equal to ((long) high_bytes << 32) + low_bytes. Then:

     int s = ((bits>> 63) == 0) ? 1 : -1;     int e = (int)((bits>> 52) & 0x7ffL);     long m = (e == 0) ?      (bits& 0xfffffffffffffL) << 1 :     (bits& 0xfffffffffffffL) | 0x10000000000000L;

Then the floating-point value equals the double value of the mathematical expression s·m·2e-1075.

4.4.6 The CONSTANT_NameAndType_info Structure

The CONSTANT_NameAndType_info structure is used to represent a field or method, without indicating which class or interface type it belongs to:

    CONSTANT_NameAndType_info {      u1 tag;      u2 name_index;      u2 descriptor_index;     }

The items of the CONSTANT_NameAndType_info structure are as follows:

tag

The tag item of the CONSTANT_NameAndType_info structure has the value CONSTANT_NameAndType (12).

name_index

The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing either a valid field or method name (§2.7) stored as a simple name (§2.7.1), that is, as a Java programming language identifier (§2.2) or as the special method name <init> (§3.9).

descriptor_index

The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid field descriptor (§4.3.2) or method descriptor (§4.3.3).

4.4.7 The CONSTANT_Utf8_info Structure

The CONSTANT_Utf8_info structure is used to represent constant string values.

UTF-8 strings are encoded so that character sequences that contain only non-null ASCII characters can be represented using only 1 byte per character, but characters of up to 16 bits can be represented. All characters in the range '\u0001' to '\u007F' are represented by a single byte:

The 7 bits of data in the byte give the value of the character represented. The null character ('\u0000') and characters in the range '\u0080' to '\u07FF' are represented by a pair of bytes x and y:

x:
y:

The bytes represent the character with the value ((x & 0x1f) << 6) + (y & 0x3f).

Characters in the range '\u0800' to '\uFFFF' are represented by 3 bytes x, y, and z:

x:

1 1 1 0 bits 15-12

y:
z:

The character with the value ((x & 0xf) << 12) + ((y & 0x3f) << 6) + (z & 0x3f) is represented by the bytes.

The bytes of multibyte characters are stored in the class file in big-endian (high byte first) order.

There are two differences between this format and the "standard" UTF-8 format. First, the null byte (byte)0 is encoded using the 2-byte format rather than the 1-byte format, so that Java virtual machine UTF-8 strings never have embedded nulls. Second, only the 1-byte, 2-byte, and 3-byte formats are used. The Java virtual machine does not recognize the longer UTF-8 formats.

For more information regarding the UTF-8 format, see File System Safe UCS Transformation Format (FSS_UTF), X/Open Preliminary Specification (X/Open Company Ltd., Document Number: P316). This information also appears in ISO/IEC 10646, Annex P.

The CONSTANT_Utf8_info structure is

    CONSTANT_Utf8_info {      u1 tag;      u2 length;      u1 bytes[length];     }

The items of the CONSTANT_Utf8_info structure are the following:

tag

The tag item of the CONSTANT_Utf8_info structure has the value CONSTANT_Utf8 (1).

length

The value of the length item gives the number of bytes in the bytes array (not the length of the resulting string). The strings in the CONSTANT_Utf8_info structure are not null-terminated.

bytes[]

The bytes array contains the bytes of the string. No byte may have the value (byte)0 or lie in the range (byte)0xf0-(byte)0xff.


4.5 Fields

Each field is described by a field_info structure. No two fields in one class file may have the same name and descriptor (§4.3.2). The format of this structure is

    field_info {      u2 access_flags;      u2 name_index;      u2 descriptor_index;      u2 attributes_count;      attribute_info attributes[attributes_count];     }

The items of the field_info structure are as follows:

access_flags

The value of the access_flags item is a mask of flags used to denote access permission to and properties of this field. The interpretation of each flag, when set, is as shown in Table 4.4.

Fields of classes may set any of the flags in Table 4.4. However, a specific field of a class may have at most one of its ACC_PRIVATE, ACC_PROTECTED, and ACC_PUBLIC flags set (§2.7.4) and may not have both its ACC_FINAL and ACC_VOLATILE flags set (§2.9.1).

Flag Name Value Interpretation
ACC_PUBLIC 0x0001 Declared public; may be accessed from outside its package.
ACC_PRIVATE 0x0002 Declared private; usable only within the defining class.
ACC_PROTECTED 0x0004 Declared protected; may be accessed within subclasses.
ACC_STATIC 0x0008 Declared static.
ACC_FINAL 0x0010 Declared final; no further assignment after initialization.
ACC_VOLATILE 0x0040 Declared volatile; cannot be cached.
ACC_TRANSIENT 0x0080 Declared transient; not written or read by a persistent object manager.

Fields of classes may set any of the flags in Table 4.4. However, a specific field of a class may have at most one of its ACC_PRIVATE, ACC_PROTECTED, and ACC_PUBLIC flags set (§2.7.4) and may not have both its ACC_FINAL and ACC_VOLATILE flags set (§2.9.1).

All fields of interfaces must have their ACC_PUBLIC, ACC_STATIC, and ACC_FINAL flags set and may not have any of the other flags in Table 4.4 set (§2.13.3.1).

All bits of the access_flags item not assigned in Table 4.4 are reserved for future use. They should be set to zero in generated class files and should be ignored by Java virtual machine implementations.

name_index

The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure which must represent a valid field name (§2.7) stored as a simple name (§2.7.1), that is, as a Java programming language identifier (§2.2).

descriptor_index

The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure that must represent a valid field descriptor (§4.3.2).

attributes_count

The value of the attributes_count item indicates the number of additional attributes (§4.7) of this field.

attributes[]

Each value of the attributes table must be an attribute structure (§4.7). A field can have any number of attributes associated with it.

The attributes defined by this specification as appearing in the attributes table of a field_info structure are the ConstantValue (§4.7.2), Synthetic (§4.7.6), and Deprecated (§4.7.10) attributes.

A Java virtual machine implementation must recognize and correctly read ConstantValue (§4.7.2) attributes found in the attributes table of a field_info structure. A Java virtual machine implementation is required to silently ignore any or all other attributes in the attributes table that it does not recognize. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).


4.6 Methods

Each method, including each instance initialization method (§3.9) and the class or interface initialization method (§3.9), is described by a method_info structure. No two methods in one class file may have the same name and descriptor (§4.3.3).

The structure has the following format:

    method_info {      u2 access_flags;      u2 name_index;      u2 descriptor_index;      u2 attributes_count;      attribute_info attributes[attributes_count];     }

The items of the method_info structure are as follows:

access_flags

The value of the access_flags item is a mask of flags used to denote access permission to and properties of this method. The interpretation of each flag, when set, is as shown in Table 4.5.

Flag Name Value Interpretation
ACC_PUBLIC 0x0001 Declared public; may be accessed from outside its package.
ACC_PRIVATE 0x0002 Declared private; accessible only within the defining class.
ACC_PROTECTED 0x0004 Declared protected; may be accessed within subclasses.
ACC_STATIC 0x0008 Declared static.
ACC_FINAL 0x0010 Declared final; may not be overridden.
ACC_SYNCHRONIZED 0x0020 Declared synchronized; invocation is wrapped in a monitor lock.
ACC_NATIVE 0x0100 Declared native; implemented in a language other than Java.
ACC_ABSTRACT 0x0400 Declared abstract; no implementation is provided.
ACC_STRICT 0x0800 Declared strictfp; floating-point mode is FP-strict

Methods of classes may set any of the flags in Table 4.5. However, a specific method of a class may have at most one of its ACC_PRIVATE, ACC_PROTECTED, and ACC_PUBLIC flags set (§2.7.4). If such a method has its ACC_ABSTRACT flag set it may not have any of its ACC_FINAL, ACC_NATIVE, ACC_PRIVATE, ACC_STATIC, ACC_STRICT, or ACC_SYNCHRONIZED flags set (§2.13.3.2).

All interface methods must have their ACC_ABSTRACT and ACC_PUBLIC flags set and may not have any of the other flags in Table 4.5 set (§2.13.3.2).

A specific instance initialization method (§3.9) may have at most one of its ACC_PRIVATE, ACC_PROTECTED, and ACC_PUBLIC flags set and may also have its ACC_STRICT flag set, but may not have any of the other flags in Table 4.5 set.

Class and interface initialization methods (§3.9) are called implicitly by the Java virtual machine; the value of their access_flags item is ignored except for the settings of the ACC_STRICT flag.

All bits of the access_flags item not assigned in Table 4.5 are reserved for future use. They should be set to zero in generated class files and should be ignored by Java virtual machine implementations.

name_index

The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing either one of the special method names (§3.9), <init> or <clinit>, or a valid method name in the Java programming language (§2.7), stored as a simple name (§2.7.1).

descriptor_index

The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid method descriptor (§4.3.3).

attributes_count

The value of the attributes_count item indicates the number of additional attributes (§4.7) of this method.

attributes[]

Each value of the attributes table must be an attribute structure (§4.7). A method can have any number of optional attributes associated with it.

The only attributes defined by this specification as appearing in the attributes table of a method_info structure are the Code (§4.7.3), Exceptions (§4.7.4), Synthetic (§4.7.6), and Deprecated (§4.7.10) attributes.

A Java virtual machine implementation must recognize and correctly read Code (§4.7.3) and Exceptions (§4.7.4) attributes found in the attributes table of a method_info structure. A Java virtual machine implementation is required to silently ignore any or all other attributes in the attributes table of a method_info structure that it does not recognize. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).


4.7 Attributes

Attributes are used in the ClassFile (§4.1), field_info (§4.5), method_info (§4.6), and Code_attribute (§4.7.3) structures of the class file format. All attributes have the following general format:

    attribute_info {      u2 attribute_name_index;      u4 attribute_length;      u1 info[attribute_length];     }

For all attributes, the attribute_name_index must be a valid unsigned 16-bit index into the constant pool of the class. The constant_pool entry at attribute_name_index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the name of the attribute. The value of the attribute_length item indicates the length of the subsequent information in bytes. The length does not include the initial six bytes that contain the attribute_name_index and attribute_length items.

Certain attributes are predefined as part of the class file specification. The predefined attributes are the SourceFile (§4.7.7), ConstantValue (§4.7.2), Code (§4.7.3), Exceptions (§4.7.4), InnerClasses (§4.7.5), Synthetic (§4.7.6), LineNumberTable (§4.7.8), LocalVariableTable (§4.7.9), and Deprecated (§4.7.10) attributes. Within the context of their use in this specification, that is, in the attributes tables of the class file structures in which they appear, the names of these predefined attributes are reserved.

Of the predefined attributes, the Code, ConstantValue, and Exceptions attributes must be recognized and correctly read by a class file reader for correct interpretation of the class file by a Java virtual machine implementation. The InnerClasses and Synthetic attributes must be recognized and correctly read by a class file reader in order to properly implement the Java and Java 2 platform class libraries (§3.12). Use of the remaining predefined attributes is optional; a class file reader may use the information they contain, or otherwise must silently ignore those attributes.

4.7.1 Defining and Naming New Attributes

Compilers are permitted to define and emit class files containing new attributes in the attributes tables of class file structures. Java virtual machine implementations are permitted to recognize and use new attributes found in the attributestables of class file structures. However, any attribute not defined as part of this Java virtual machine specification must not affect the semantics of class or interface types. Java virtual machine implementations are required to silently ignore attributes they do not recognize.

For instance, defining a new attribute to support vendor-specific debugging is permitted. Because Java virtual machine implementations are required to ignore attributes they do not recognize, class files intended for that particular Java virtual machine implementation will be usable by other implementations even if those implementations cannot make use of the additional debugging information that the class files contain.

Java virtual machine implementations are specifically prohibited from throwing an exception or otherwise refusing to use class files simply because of the presence of some new attribute. Of course, tools operating on class files may not run correctly if given class files that do not contain all the attributes they require.

Two attributes that are intended to be distinct, but that happen to use the same attribute name and are of the same length, will conflict on implementations that recognize either attribute. Attributes defined other than by Sun must have names chosen according to the package naming convention defined by The Java Language Specification. For instance, a new attribute defined by Netscape might have the name "com.Netscape.new-attribute".[3](#41454)

Sun may define additional attributes in future versions of this class file specification.

4.7.2 The ConstantValue Attribute

The ConstantValue attribute is a fixed-length attribute used in the attributestable of the field_info (§4.5) structures. A ConstantValue attribute represents the value of a constant field that must be (explicitly or implicitly) static; that is, the ACC_STATIC bit (Table 4.4) in the flags item of the field_info structure must be set. There can be no more than one ConstantValue attribute in the attributestable of a given field_info structure. The constant field represented by the field_info structure is assigned the value referenced by its ConstantValueattribute as part of the initialization of the class or interface declaring the constant field (§2.17.4). This occurs immediately prior to the invocation of the class or interface initialization method (§3.9) of that class or interface.

If a field_info structure representing a non-static field has a ConstantValue attribute, then that attribute must silently be ignored. Every Java virtual machine implementation must recognize ConstantValue attributes.

The ConstantValue attribute has the following format:

    ConstantValue_attribute {      u2 attribute_name_index;      u4 attribute_length;      u2 constantvalue_index;     }

The items of the ConstantValue_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "ConstantValue".

attribute_length

The value of the attribute_length item of a ConstantValue_attribute structure must be 2.

constantvalue_index

The value of the constantvalue_index item must be a valid index into the constant_pool table. The constant_pool entry at that index gives the constant value represented by this attribute. The constant_pool entry must be of a type appropriate to the field, as shown by Table 4.6.

Field Type Entry Type
long CONSTANT_Long
float CONSTANT_Float
double CONSTANT_Double
int, short, char, byte, boolean CONSTANT_Integer
String CONSTANT_String

4.7.3 The Code Attribute

The Code attribute is a variable-length attribute used in the attributes table of method_info structures. A Code attribute contains the Java virtual machine instructions and auxiliary information for a single method, instance initialization method (§3.9), or class or interface initialization method (§3.9). Every Java virtual machine implementation must recognize Code attributes. If the method is either native or abstract, its method_info structure must not have a Code attribute. Otherwise, its method_info structure must have exactly one Code attribute.

The Code attribute has the following format:

    Code_attribute {      u2 attribute_name_index;      u4 attribute_length;      u2 max_stack;      u2 max_locals;      u4 code_length;      u1 code[code_length];      u2 exception_table_length;      { u2 start_pc;      u2 end_pc;      u2 handler_pc;      u2 catch_type;      } exception_table[exception_table_length];      u2 attributes_count;      attribute_info attributes[attributes_count];     }

The items of the Code_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "Code".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

max_stack

The value of the max_stack item gives the maximum depth (§3.6.2) of the operand stack of this method at any point during execution of the method.

max_locals

The value of the max_locals item gives the number of local variables in the local variable array allocated upon invocation of this method, including the local variables used to pass parameters to the method on its invocation.

The greatest local variable index for a value of type long or double is max_locals-2. The greatest local variable index for a value of any other type is max_locals-1.

code_length

The value of the code_length item gives the number of bytes in the code array for this method. The value of code_length must be greater than zero; the code array must not be empty.

code[]

The code array gives the actual bytes of Java virtual machine code that implement the method.

When the code array is read into memory on a byte-addressable machine, if the first byte of the array is aligned on a 4-byte boundary, the tableswitch and lookupswitch 32-bit offsets will be 4-byte aligned. (Refer to the descriptions of those instructions for more information on the consequences of code array alignment.)

The detailed constraints on the contents of the code array are extensive and are given in a separate section (§4.8).

exception_table_length

The value of the exception_table_length item gives the number of entries in the exception_table table.

exception_table[]

Each entry in the exception_table array describes one exception handler in the code array. The order of the handlers in the exception_table array is significant. See Section 3.10 for more details.

Each exception_table entry contains the following four items:

start_pc, end_pc

The values of the two items start_pc and end_pc indicate the ranges in the code array at which the exception handler is active. The value of start_pc must be a valid index into the code array of the opcode of an instruction. The value of end_pc either must be a valid index into the code array of the opcode of an instruction or must be equal to code_length, the length of the code array. The value of start_pc must be less than the value of end_pc.

The start_pc is inclusive and end_pc is exclusive; that is, the exception handler must be active while the program counter is within the interval [start_pc, end_pc).4

handler_pc

The value of the handler_pc item indicates the start of the exception handler. The value of the item must be a valid index into the code array and must be the index of the opcode of an instruction.

catch_type

If the value of the catch_type item is nonzero, it must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing a class of exceptions that this exception handler is designated to catch. This class must be the class Throwable or one of its subclasses. The exception handler will be called only if the thrown exception is an instance of the given class or one of its subclasses.

If the value of the catch_type item is zero, this exception handler is called for all exceptions. This is used to implement finally (see Section 7.13, "Compiling finally").

attributes_count

The value of the attributes_count item indicates the number of attributes of the Code attribute.

attributes[]

Each value of the attributes table must be an attribute structure (§4.7). A Code attribute can have any number of optional attributes associated with it.

Currently, the LineNumberTable (§4.7.8) and LocalVariableTable (§4.7.9) attributes, both of which contain debugging information, are defined and used with the Code attribute.

A Java virtual machine implementation is permitted to silently ignore any or all attributes in the attributes table of a Code attribute. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).

4.7.4 The Exceptions Attribute

The Exceptions attribute is a variable-length attribute used in the attributestable of a method_info (§4.6) structure. The Exceptions attribute indicates which checked exceptions a method may throw. There may be at most one Exceptionsattribute in each method_info structure.

The Exceptions attribute has the following format:

    Exceptions_attribute {      u2 attribute_name_index;      u4 attribute_length;      u2 number_of_exceptions;      u2 exception_index_table[number_of_exceptions];     }

The items of the Exceptions_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be ``the CONSTANT_Utf8_info (§4.4.7) structure representing the string "Exceptions".

attribute_length

The value of the attribute_length item indicates the attribute length, excluding the initial six bytes.

number_of_exceptions

The value of the number_of_exceptions item indicates the number of entries in the exception_index_table.

exception_index_table[]

Each value in the exception_index_table array must be a valid index into the constant_pool table. The constant_pool entry referenced by each table item must be a CONSTANT_Class_info (§4.4.1) structure representing a class type that this method is declared to throw.

A method should throw an exception only if at least one of the following three criteria is met:

4.7.5 The InnerClasses Attribute

The InnerClasses attribute5 is a variable-length attribute in the attributes table of the ClassFile (§4.1) structure. If the constant pool of a class or interface refers to any class or interface that is not a member of a package, its ClassFile structure must have exactly one InnerClasses attribute in its attributes table.

The InnerClasses attribute has the following format:

    InnerClasses_attribute {      u2 attribute_name_index;      u4 attribute_length;      u2 number_of_classes;      { u2 inner_class_info_index;      u2 outer_class_info_index;      u2 inner_name_index;      u2 inner_class_access_flags;      } classes[number_of_classes];     }

The items of the InnerClasses_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "InnerClasses".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

number_of_classes

The value of the number_of_classes item indicates the number of entries in the classes array.

classes[]

Every CONSTANT_Class_info entry in the constant_pool table which represents a class or interface C that is not a package member must have exactly one corresponding entry in the classes array.

If a class has members that are classes or interfaces, its constant_pool table (and hence its InnerClasses attribute) must refer to each such member, even if that member is not otherwise mentioned by the class. These rules imply that a nested class or interface member will have InnerClasses information for each enclosing class and for each immediate member.

Each classes array entry contains the following four items:

inner_class_info_index

The value of the inner_class_info_index item must be zero or a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing C. The remaining items in the classes array entry give information about C.

outer_class_info_index

If C is not a member, the value of the outer_class_info_index item must be zero. Otherwise, the value of the outer_class_info_index item must be a valid index into the constant_pool table, and the entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing the class or interface of which C is a member.

inner_name_index

If C is anonymous, the value of the inner_name_index item must be zero. Otherwise, the value of the inner_name_index item must be a valid index into the constant_pool table, and the entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure that represents the original simple name of C, as given in the source code from which this class file was compiled.

inner_class_access_flags

The value of the inner_class_access_flags item is a mask of flags used to denote access permissions to and properties of class or interface C as declared in the source code from which this class file was compiled. It is used by compilers to recover the original information when source code is not available. The flags are shown in Table 4.7.

Flag Name Value Meaning
ACC_PUBLIC 0x0001 Marked or implicitly public in source.
ACC_PRIVATE 0x0002 Marked private in source.
ACC_PROTECTED 0x0004 Marked protected in source.
ACC_STATIC 0x0008 Marked or implicitly static in source.
ACC_FINAL 0x0010 Marked final in source.
ACC_INTERFACE 0x0200 Was an interface in source.
ACC_ABSTRACT 0x0400 Marked or implicitly abstract in source.

All bits of the inner_class_access_flags item not assigned in Table 4.7 are reserved for future use. They should be set to zero in generated class files and should be ignored by Java virtual machine implementations.

The Java virtual machine does not currently check the consistency of the InnerClasses attribute with any class file actually representing a class or interface referenced by the attribute.

4.7.6 The Synthetic Attribute

The Synthetic attribute6 is a fixed-length attribute in the attributes table of ClassFile (§4.1), field_info (§4.5), and method_info (§4.6) structures. A class member that does not appear in the source code must be marked using a Synthetic attribute.

The Synthetic attribute has the following format:

    Synthetic_attribute {      u2 attribute_name_index;      u4 attribute_length;     }

The items of the Synthetic_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "Synthetic".

attribute_length

The value of the attribute_length item is zero.

4.7.7 The SourceFile Attribute

The SourceFile attribute is an optional fixed-length attribute in the attributestable of the ClassFile (§4.1) structure. There can be no more than one SourceFile attribute in the attributes table of a given ClassFile structure.

The SourceFile attribute has the following format:

    SourceFile_attribute {      u2 attribute_name_index;      u4 attribute_length;      u2 sourcefile_index;     }

The items of the SourceFile_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "SourceFile".

attribute_length

The value of the attribute_length item of a SourceFile_attribute structure must be 2.

sourcefile_index

The value of the sourcefile_index item must be a valid index into the constant_pool table. The constant pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a string.

The string referenced by the sourcefile_index item will be interpreted as indicating the name of the source file from which this class file was compiled. It will not be interpreted as indicating the name of a directory containing the file or an absolute path name for the file; such platform-specific additional information must be supplied by the runtime interpreter or development tool at the time the file name is actually used.

4.7.8 The LineNumberTable Attribute

The LineNumberTable attribute is an optional variable-length attribute in the attributes table of a Code (§4.7.3) attribute. It may be used by debuggers to determine which part of the Java virtual machine code array corresponds to a given line number in the original source file. If LineNumberTable attributes are present in the attributes table of a given Code attribute, then they may appear in any order. Furthermore, multiple LineNumberTable attributes may together represent a given line of a source file; that is, LineNumberTable attributes need not be one-to-one with source lines.

The LineNumberTable attribute has the following format:

    LineNumberTable_attribute {      u2 attribute_name_index;      u4 attribute_length;      u2 line_number_table_length;      { u2 start_pc;      u2 line_number;      } line_number_table[line_number_table_length];     }

The items of the LineNumberTable_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "LineNumberTable".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

line_number_table_length

The value of the line_number_table_length item indicates the number of entries in the line_number_table array.

line_number_table[]

Each entry in the line_number_table array indicates that the line number in the original source file changes at a given point in the code array. Each line_number_table entry must contain the following two items:

start_pc

The value of the start_pc item must indicate the index into the code array at which the code for a new line in the original source file begins. The value of start_pc must be less than the value of the code_length item of the Code attribute of which this LineNumberTable is an attribute.

line_number

The value of the line_number item must give the corresponding line number in the original source file.

4.7.9 The LocalVariableTable Attribute

The LocalVariableTable attribute is an optional variable-length attribute of a Code (§4.7.3) attribute. It may be used by debuggers to determine the value of a given local variable during the execution of a method. If LocalVariableTableattributes are present in the attributes table of a given Code attribute, then they may appear in any order. There may be no more than one LocalVariableTableattribute per local variable in the Code attribute.

The LocalVariableTable attribute has the following format:

    LocalVariableTable_attribute {      u2 attribute_name_index;      u4 attribute_length;      u2 local_variable_table_length;      { u2 start_pc;      u2 length;      u2 name_index;      u2 descriptor_index;      u2 index;      } local_variable_table[local_variable_table_length];     }

The items of the LocalVariableTable_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "LocalVariableTable".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the initial six bytes.

local_variable_table_length

The value of the local_variable_table_length item indicates the number of entries in the local_variable_table array.

local_variable_table[]

Each entry in the local_variable_table array indicates a range of code array offsets within which a local variable has a value. It also indicates the index into the local variable array of the current frame at which that local variable can be found. Each entry must contain the following five items:

start_pc, length

The given local variable must have a value at indices into the code array in the interval [start_pc, start_pc+length], that is, between start_pc and start_pc+length inclusive. The value of start_pc must be a valid index into the code array of this Code attribute and must be the index of the opcode of an instruction. Either the value of start_pc+length must be a valid index into the code array of this Code attribute and be the index of the opcode of an instruction, or it must be the first index beyond the end of that code array.

name_index, descriptor_index

The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must contain a CONSTANT_Utf8_info (§4.4.7) structure representing a valid local variable name stored as a simple name (§2.7.1).

The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must contain a CONSTANT_Utf8_info (§4.4.7) structure representing a field descriptor (§4.3.2) encoding the type of a local variable in the source program.

index

The given local variable must be at index in the local variable array of the current frame. If the local variable at index is of type double or long, it occupies both index and index+1.

4.7.10 The Deprecated Attribute

The Deprecated attribute7 is an optional fixed-length attribute in the attributestable of ClassFile (§4.1), field_info (§4.5), and method_info (§4.6) structures. A class, interface, method, or field may be marked using a Deprecatedattribute to indicate that the class, interface, method, or field has been superseded. A runtime interpreter or tool that reads the class file format, such as a compiler, can use this marking to advise the user that a superseded class, interface, method, or field is being referred to. The presence of a Deprecated attribute does not alter the semantics of a class or interface.

The Deprecated attribute has the following format:

    Deprecated_attribute {      u2 attribute_name_index;      u4 attribute_length;     }

The items of the Deprecated_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the string "Deprecated".

attribute_length

The value of the attribute_length item is zero.


4.8 Constraints on Java Virtual Machine Code

The Java virtual machine code for a method, instance initialization method (§3.9), or class or interface initialization method (§3.9) is stored in the code array of the Code attribute of a method_info structure of a class file. This section describes the constraints associated with the contents of the Code_attribute structure.

4.8.1 Static Constraints

The static constraints on a class file are those defining the well-formedness of the file. With the exception of the static constraints on the Java virtual machine code of the class file, these constraints have been given in the previous section. The static constraints on the Java virtual machine code in a class file specify how Java virtual machine instructions must be laid out in the code array and what the operands of individual instructions must be.

The static constraints on the instructions in the code array are as follows:

4.8.2 Structural Constraints

The structural constraints on the code array specify constraints on relationships between Java virtual machine instructions. The structural constraints are as follows:


4.9 Verification of class Files

Even though Sun's compiler for the Java programming language attempts to produce only class files that satisfy all the static constraints in the previous sections, the Java virtual machine has no guarantee that any file it is asked to load was generated by that compiler or is properly formed. Applications such as Sun's HotJava World Wide Web browser do not download source code, which they then compile; these applications download already-compiled class files. The HotJava browser needs to determine whether the class file was produced by a trustworthy compiler or by an adversary attempting to exploit the virtual machine.

An additional problem with compile-time checking is version skew. A user may have successfully compiled a class, say PurchaseStockOptions, to be a subclass of TradingClass. But the definition of TradingClass might have changed since the time the class was compiled in a way that is not compatible with preexisting binaries. Methods might have been deleted or had their return types or modifiers changed. Fields might have changed types or changed from instance variables to class variables. The access modifiers of a method or variable may have changed from public to private. For a discussion of these issues, see Chapter 13, "Binary Compatibility," in the first edition of The Java Language Specification or the equivalent chapter in the second edition.

Because of these potential problems, the Java virtual machine needs to verify for itself that the desired constraints are satisfied by the class files it attempts to incorporate. A Java virtual machine implementation verifies that each class file satisfies the necessary constraints at linking time (§2.17.3). Structural constraints on the Java virtual machine code may be checked using a simple theorem prover.

Linking-time verification enhances the performance of the interpreter. Expensive checks that would otherwise have to be performed to verify constraints at run time for each interpreted instruction can be eliminated. The Java virtual machine can assume that these checks have already been performed. For example, the Java virtual machine will already know the following:

The class file verifier is also independent of the Java programming language. Programs written in other languages can be compiled into the class file format, but will pass verification only if all the same constraints are satisfied.

4.9.1 The Verification Process

The class file verifier operates in four passes:

Pass 1:
When a prospective class file is loaded (§2.17.2) by the Java virtual machine, the Java virtual machine first ensures that the file has the basic format of a class file. The first four bytes must contain the right magic number. All recognized attributes must be of the proper length. The class file must not be truncated or have extra bytes at the end. The constant pool must not contain any superficially unrecognizable information.

While class file verification properly occurs during class linking (§2.17.3), this check for basic class file integrity is necessary for any interpretation of the class file contents and can be considered to be logically part of the verification process.

Pass 2:
When the class file is linked, the verifier performs all additional verification that can be done without looking at the code array of the Code attribute (§4.7.3). The checks performed by this pass include the following:

Pass 3:
During linking, the verifier checks the code array of the Code attribute for each method of the class file by performing data-flow analysis on each method. The verifier ensures that at any given point in the program, no matter what code path is taken to reach that point, the following is true:

Pass 4:
For efficiency reasons, certain tests that could in principle be performed in Pass 3 are delayed until the first time the code for the method is actually invoked. In so doing, Pass 3 of the verifier avoids loading class files unless it has to.

For example, if a method invokes another method that returns an instance of class A, and that instance is assigned only to a field of the same type, the verifier does not bother to check if the class A actually exists. However, if it is assigned to a field of the type B, the definitions of both A and B must be loaded in to ensure that A is a subclass of B.

Pass 4 is a virtual pass whose checking is done by the appropriate Java virtual machine instructions. The first time an instruction that references a type is executed, the executing instruction does the following:

A Java virtual machine implementation is allowed to perform any or all of the Pass 4 steps as part of Pass 3; see 2.17.1, "Virtual Machine Start-up" for an example and more discussion.

In one of Sun's Java virtual machine implementations, after the verification has been performed, the instruction in the Java virtual machine code is replaced with an alternative form of the instruction. This alternative instruction indicates that the verification needed by this instruction has taken place and does not need to be performed again. Subsequent invocations of the method will thus be faster. It is illegal for these alternative instruction forms to appear in class files, and they should never be encountered by the verifier.

4.9.2 The Bytecode Verifier

As indicated earlier, Pass 3 of the verification process is the most complex of the four passes of class file verification. This section looks at the verification of Java virtual machine code in Pass 3 in more detail.

The code for each method is verified independently. First, the bytes that make up the code are broken up into a sequence of instructions, and the index into the code array of the start of each instruction is placed in an array. The verifier then goes through the code a second time and parses the instructions. During this pass a data structure is built to hold information about each Java virtual machine instruction in the method. The operands, if any, of each instruction are checked to make sure they are valid. For instance:

Next, a data-flow analyzer is initialized. For the first instruction of the method, the local variables that represent parameters initially contain values of the types indicated by the method's type descriptor; the operand stack is empty. All other local variables contain an illegal value. For the other instructions, which have not been examined yet, no information is available regarding the operand stack or local variables.

Finally, the data-flow analyzer is run. For each instruction, a "changed" bit indicates whether this instruction needs to be looked at. Initially, the "changed" bit is set only for the first instruction. The data-flow analyzer executes the following loop:

  1. Select a virtual machine instruction whose "changed" bit is set. If no instruction remains whose "changed" bit is set, the method has successfully been verified. Otherwise, turn off the "changed" bit of the selected instruction.
  2. Model the effect of the instruction on the operand stack and local variable array by doing the following:
    • If the instruction uses values from the operand stack, ensure that there are a sufficient number of values on the stack and that the top values on the stack are of an appropriate type. Otherwise, verification fails.
    • If the instruction uses a local variable, ensure that the specified local variable contains a value of the appropriate type. Otherwise, verification fails.
    • If the instruction pushes values onto the operand stack, ensure that there is sufficient room on the operand stack for the new values. Add the indicated types to the top of the modeled operand stack.
    • If the instruction modifies a local variable, record that the local variable now contains the new type.
  3. Determine the instructions that can follow the current instruction. Successor instructions can be one of the following:
    • The next instruction, if the current instruction is not an unconditional control transfer instruction (for instance goto, return, or athrow). Verification fails if it is possible to "fall off" the last instruction of the method.
    • The target(s) of a conditional or unconditional branch or switch.
    • Any exception handlers for this instruction.
  4. Merge the state of the operand stack and local variable array at the end of the execution of the current instruction into each of the successor instructions. In the special case of control transfer to an exception handler, the operand stack is set to contain a single object of the exception type indicated by the exception handler information.
    • If this is the first time the successor instruction has been visited, record that the operand stack and local variable values calculated in steps 2 and 3 are the state of the operand stack and local variable array prior to executing the successor instruction. Set the "changed" bit for the successor instruction.
    • If the successor instruction has been seen before, merge the operand stack and local variable values calculated in steps 2 and 3 into the values already there. Set the "changed" bit if there is any modification to the values.
  5. Continue at step 1. To merge two operand stacks, the number of values on each stack must be identical. The types of values on the stacks must also be identical, except that differently typed reference values may appear at corresponding places on the two stacks. In this case, the merged operand stack contains a reference to an instance of the first common superclass of the two types. Such a reference type always exists because the type Object is a superclass of all class and interface types. If the operand stacks cannot be merged, verification of the method fails.

To merge two local variable array states, corresponding pairs of local variables are compared. If the two types are not identical, then unless both contain reference values, the verifier records that the local variable contains an unusable value. If both of the pair of local variables contain reference values, the merged state contains a reference to an instance of the first common superclass of the two types.

If the data-flow analyzer runs on a method without reporting a verification failure, then the method has been successfully verified by Pass 3 of the class file verifier.

Certain instructions and data types complicate the data-flow analyzer. We now examine each of these in more detail.

4.9.3 Values of Types long and double

Values of the long and double types are treated specially by the verification process.

Whenever a value of type long or double is moved into a local variable at index n, index n + 1 is specially marked to indicate that it has been reserved by the value at index n and may not be used as a local variable index. Any value previously at index n + 1 becomes unusable.

Whenever a value is moved to a local variable at index n, the index n - 1 is examined to see if it is the index of a value of type long or double. If so, the local variable at index n - 1 is changed to indicate that it now contains an unusable value. Since the local variable at index n has been overwritten, the local variable at index n - 1 cannot represent a value of type long or double.

Dealing with values of types long or double on the operand stack is simpler; the verifier treats them as single values on the stack. For example, the verification code for the dadd opcode (add two double values) checks that the top two items on the stack are both of type double. When calculating operand stack length, values of type long and double have length two.

Untyped instructions that manipulate the operand stack must treat values of type double and long as atomic (indivisible). For example, the verifier reports a failure if the top value on the stack is a double and it encounters an instruction such as pop or dup. The instructions pop2 or dup2 must be used instead.

4.9.4 Instance Initialization Methods and Newly Created Objects

Creating a new class instance is a multistep process. The statement

    ...     new myClass(i, j, k);     ...

can be implemented by the following:

    ...     new #1 // Allocate uninitialized space for myClass     dup // Duplicate object on the operand stack     iload_1 // Push i     iload_2 // Push j     iload_3 // Push k     invokespecial #5 // Invoke myClass.<init>     ...

This instruction sequence leaves the newly created and initialized object on top of the operand stack. (Additional examples of compilation to the instruction set of the Java virtual machine are given in Chapter 7, "Compiling for the Java Virtual Machine.")

The instance initialization method (§3.9) for class myClass sees the new uninitialized object as its this argument in local variable 0. Before that method invokes another instance initialization method of myClass or its direct superclass on this, the only operation the method can perform on this is assigning fields declared within myClass.

When doing dataflow analysis on instance methods, the verifier initializes local variable 0 to contain an object of the current class, or, for instance initialization methods, local variable 0 contains a special type indicating an uninitialized object. After an appropriate instance initialization method is invoked (from the current class or the current superclass) on this object, all occurrences of this special type on the verifier's model of the operand stack and in the local variable array are replaced by the current class type. The verifier rejects code that uses the new object before it has been initialized or that initializes the object more than once. In addition, it ensures that every normal return of the method has invoked an instance initialization method either in the class of this method or in the direct superclass.

Similarly, a special type is created and pushed on the verifier's model of the operand stack as the result of the Java virtual machine instruction new. The special type indicates the instruction by which the class instance was created and the type of the uninitialized class instance created. When an instance initialization method is invoked on that class instance, all occurrences of the special type are replaced by the intended type of the class instance. This change in type may propagate to subsequent instructions as the dataflow analysis proceeds.

The instruction number needs to be stored as part of the special type, as there may be multiple not-yet-initialized instances of a class in existence on the operand stack at one time. For example, the Java virtual machine instruction sequence that implements

new InputStream(new Foo(), new InputStream("foo"))

may have two uninitialized instances of InputStream on the operand stack at once. When an instance initialization method is invoked on a class instance, only those occurrences of the special type on the operand stack or in the local variable array that are the same object as the class instance are replaced.

A valid instruction sequence must not have an uninitialized object on the operand stack or in a local variable during a backwards branch, or in a local variable in code protected by an exception handler or a finally clause. Otherwise, a devious piece of code might fool the verifier into thinking it had initialized a class instance when it had, in fact, initialized a class instance created in a previous pass through a loop.

4.9.5 Exception Handlers

Java virtual machine code produced by Sun's compiler for the Java programming language always generates exception handlers such that:

4.9.6 Exceptions and finally

Given the code fragment

    ...     try {      startFaucet();     waterLawn();    } finally {      stopFaucet();    }     ...

the Java programming language guarantees that stopFaucet is invoked (the faucet is turned off) whether we finish watering the lawn or whether an exception occurs while starting the faucet or watering the lawn. That is, the finally clause is guaranteed to be executed whether its try clause completes normally or completes abruptly by throwing an exception.

To implement the try-finally construct, Sun's compiler for the Java programming language uses the exception-handling facilities together with two special instructions: jsr ("jump to subroutine") and ret ("return from subroutine"). The finally clause is compiled as a subroutine within the Java virtual machine code for its method, much like the code for an exception handler. When a jsr instruction that invokes the subroutine is executed, it pushes its return address, the address of the instruction after the jsr that is being executed, onto the operand stack as a value of type returnAddress. The code for the subroutine stores the return address in a local variable. At the end of the subroutine, a ret instruction fetches the return address from the local variable and transfers control to the instruction at the return address.

Control can be transferred to the finally clause (the finally subroutine can be invoked) in several different ways. If the try clause completes normally, the finally subroutine is invoked via a jsr instruction before evaluating the next expression. A break or continue inside the try clause that transfers control outside the try clause executes a jsr to the code for the finally clause first. If the try clause executes a return, the compiled code does the following:

  1. Saves the return value (if any) in a local variable.
  2. Executes a jsr to the code for the finally clause.
  3. Upon return from the finally clause, returns the value saved in the local variable. The compiler sets up a special exception handler, which catches any exception thrown by the try clause. If an exception is thrown in the try clause, this exception handler does the following:
  4. Saves the exception in a local variable.
  5. Executes a jsr to the finally clause.
  6. Upon return from the finally clause, rethrows the exception. For more information about the implementation of the try-finally construct, see Section 7.13, "Compiling finally."

The code for the finally clause presents a special problem to the verifier. Usually, if a particular instruction can be reached via multiple paths and a particular local variable contains incompatible values through those multiple paths, then the local variable becomes unusable. However, a finally clause might be called from several different places, yielding several different circumstances:

Verifying code that contains a finally clause is complicated. The basic idea is the following:


4.10 Limitations of the Java Virtual Machine

The following limitations of the Java virtual machine are implicit in the class file format:


1 The Java virtual machine implementation of Sun's JDK release 1.0.2 supports class file format versions 45.0 through 45.3 inclusive. Sun's JDK releases 1.1.X can support class file formats of versions in the range 45.0 through 45.65535 inclusive. Implementations of version 1.2 of the Java 2 platform can support class file formats of versions in the range 45.0 through 46.0 inclusive.

2 In retrospect, making 8-byte constants take two constant pool entries was a poor choice.

3 The first edition of The Java Language Specification required that "com" be in uppercase in this example. The second edition will reverse that convention and use lowercase.

4 The fact that end_pc is exclusive is a historical mistake in the design of the Java virtual machine: if the Java virtual machine code for a method is exactly 65535 bytes long and ends with an instruction that is 1 byte long, then that instruction cannot be protected by an exception handler. A compiler writer can work around this bug by limiting the maximum size of the generated Java virtual machine code for any method, instance initialization method, or static initializer (the size of any code array) to 65534 bytes.

5 The InnerClasses attribute was introduced in JDK release 1.1 to support nested classes and interfaces.

6 The Synthetic attribute was introduced in JDK release 1.1 to support nested classes and interfaces.

7 The Deprecated attribute was introduced in JDK release 1.1 to support the @deprecated tag in documentation comments.

Contents | Prev | Next | Index

The Java Virtual Machine Specification
Copyright © 1999 Sun Microsystems, Inc.All rights reserved
Please send any comments or corrections through our feedback form