Taurus JavaVM Implementation
21st November 1999, © Copyright 1999, Taurus Software
Introduction
This document outlines the internal structure and workings of the Taurus Java
Virtual Machine. Most of these features can be found in other Java VMs, but
the exact implementation of most of them will be unique to the Taurus JVM.
Stack
The execution stack consists of a linked list of Stack Frames. Each
Stack Frame consists of a single block of memory divided into the Frame
Header and spaces for the Evaluation Stack and Local Variables.
As a Java method defines both the maximum entries on the stack and the number
of local variables for its frame, the size of this block can be calculated
when it is created.
Figure 1: Stack Frames
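As a rough illustration of the single-block layout described above, the following sketch sizes and allocates a frame from a method's maximum stack depth and local variable count. The struct layout, the field names and the functions frame_size(), frame_push(), frame_pop() and frame_locals() are assumptions made for this example, not the Taurus JVM's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical frame layout: a header followed by the evaluation stack
 * slots and then the local variable slots, all in one allocation. */
typedef struct StackFrame {
    struct StackFrame *prev;  /* link to the caller's frame (the linked list) */
    uint16_t max_stack;       /* maximum evaluation stack entries for the method */
    uint16_t max_locals;      /* number of local variables for the method */
    uint16_t sp;              /* index of the next free evaluation stack slot */
    int32_t slots[];          /* max_stack stack words, then max_locals locals */
} StackFrame;

/* The block size is known as soon as the method's limits are known. */
size_t frame_size(uint16_t max_stack, uint16_t max_locals)
{
    return sizeof(StackFrame)
         + ((size_t)max_stack + max_locals) * sizeof(int32_t);
}

/* Allocate a zeroed frame and link it in front of the caller's frame. */
StackFrame *frame_push(StackFrame *caller, uint16_t max_stack, uint16_t max_locals)
{
    StackFrame *f = calloc(1, frame_size(max_stack, max_locals));
    if (f == NULL)
        return NULL;
    f->prev = caller;
    f->max_stack = max_stack;
    f->max_locals = max_locals;
    return f;
}

/* Free the frame and return the caller's frame. */
StackFrame *frame_pop(StackFrame *f)
{
    StackFrame *caller = f->prev;
    free(f);
    return caller;
}

/* The local variables start directly after the evaluation stack slots. */
int32_t *frame_locals(StackFrame *f)
{
    return f->slots + f->max_stack;
}
```

Keeping the header, stack and locals in one allocation means a method call costs a single allocation and a method return a single free.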
The Evaluation Stack in fact uses two stacks in parallel, the Value Stack
and the Type Stack. When a value is pushed onto the Value Stack, its
type is recorded on the Type Stack. The Type Stack is always checked before a
value is popped, and the VM can raise an error if a type-mismatch is detected.
This feature is critical in the development of the VM, and has helped catch
many bugs, but could be disabled in a release version. Beware though, as you
need to run all classes through a verifier to check the stack operations if you
are not going to type-check at runtime.
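The parallel Value Stack and Type Stack might look something like the sketch below. The type tags, the fixed STACK_MAX capacity and the names eval_push() and eval_pop() are hypothetical; a real frame would size both arrays from the method's maximum stack depth.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical set of type tags recorded on the Type Stack. */
typedef enum { T_INT, T_FLOAT, T_REF } SlotType;

enum { STACK_MAX = 16 };  /* fixed capacity, for illustration only */

typedef struct {
    int32_t values[STACK_MAX];  /* the Value Stack */
    uint8_t types[STACK_MAX];   /* the Type Stack, kept in step with values */
    int sp;                     /* number of entries currently on the stack */
} EvalStack;

/* Push a value and record its type. Returns 0, or -1 on overflow. */
int eval_push(EvalStack *s, SlotType type, int32_t value)
{
    if (s->sp >= STACK_MAX)
        return -1;
    s->values[s->sp] = value;
    s->types[s->sp] = (uint8_t)type;
    s->sp++;
    return 0;
}

/* Check the recorded type before popping. Returns 0 on success, or -1 on
 * underflow or type mismatch, in which case the stack is left unchanged. */
int eval_pop(EvalStack *s, SlotType expected, int32_t *out)
{
    if (s->sp == 0)
        return -1;
    if (s->types[s->sp - 1] != (uint8_t)expected)
        return -1;
    s->sp--;
    *out = s->values[s->sp];
    return 0;
}
```

In a release build the check in eval_pop() could be compiled out, which is only safe if every class has already passed a verifier.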
The Local Variables are implemented in the same way as the Evaluation Stack and
the type is recorded along with the value.
Heap
As we don't do any Garbage Collection, the Heap is incredibly simple.
Memory is allocated when needed using malloc(). Memory (apart from Stack
Frames) is never freed.
Object References
An idea of how to implement Object References is hinted at in Sun's Virtual
Machine Specification. An Object Reference simply consists of a pointer to a
block of memory, which in turn contains two more pointers. The first pointer
points to the heap memory for the object, the second pointer points to the
classfile object representing the object's type.
Figure 2: Object Reference
The current implementation also stores some additional information in an object
reference to distinguish arrays, which are also handled as objects.
Later in the implementation, we will use this to help implement a sliding heap
and garbage collection. At the moment, it just gives us simple runtime
type-checking of Object References.
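The two-pointer scheme can be sketched as follows. The names Handle, ObjectRef, object_new() and ref_is_exactly() are assumptions for this example, ClassFile is a stand-in for the loaded classfile object, and the single is_array flag only approximates the additional array information mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the loaded classfile object. */
typedef struct ClassFile {
    const char *name;
} ClassFile;

/* The block an Object Reference points to (hypothetical layout). */
typedef struct Handle {
    void *data;             /* first pointer: heap memory for the object */
    const ClassFile *cls;   /* second pointer: classfile for the object's type */
    bool is_array;          /* extra information to distinguish arrays */
} Handle;

typedef Handle *ObjectRef;  /* an Object Reference is a pointer to the block */

/* Allocate the block and the object memory; neither is ever freed here,
 * matching the current heap, which has no garbage collection. */
ObjectRef object_new(const ClassFile *cls, size_t size, bool is_array)
{
    Handle *h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;
    h->data = calloc(1, size ? size : 1);
    if (h->data == NULL) {
        free(h);
        return NULL;
    }
    h->cls = cls;
    h->is_array = is_array;
    return h;
}

/* Simplest possible runtime type check: exact class identity.
 * A fuller check would also walk the superclass chain. */
bool ref_is_exactly(ObjectRef ref, const ClassFile *cls)
{
    return ref != NULL && ref->cls == cls;
}
```

Because every access goes through the block, the data pointer can later be updated when an object moves, without touching the references held on stacks or in fields.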
Object Representation
Objects are just flat blocks of memory. The fields are allocated as words of
memory within the block; the offsets for these are calculated at class load time.
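One way the load-time offset calculation could work is sketched below. The 4-byte word, the two-word treatment of long ('J') and double ('D') fields, the start parameter for inherited fields and the names FieldInfo, field_words() and layout_fields() are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { WORD_SIZE = 4 };  /* assumed word size in bytes */

typedef struct {
    const char *name;
    char descriptor;  /* first character of the field descriptor, e.g. 'I' */
    size_t offset;    /* byte offset within the object, filled in at load time */
} FieldInfo;

/* Assumption: long and double fields occupy two words, all others one. */
size_t field_words(char descriptor)
{
    return (descriptor == 'J' || descriptor == 'D') ? 2 : 1;
}

/* Assign consecutive offsets starting at 'start' (for example the size of
 * the superclass's fields) and return the resulting object size in bytes. */
size_t layout_fields(FieldInfo *fields, size_t count, size_t start)
{
    size_t offset = start;
    for (size_t i = 0; i < count; i++) {
        fields[i].offset = offset;
        offset += field_words(fields[i].descriptor) * WORD_SIZE;
    }
    return offset;
}
```

With the offsets fixed at load time, a field access at runtime is just an indexed read or write into the object's block.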
Classfiles
Classfiles have an optimised representation once loaded. UTF8 strings are null-terminated for
convenience. Methods and fields are hooked up to their definitions. Code and
attribute offsets are converted to pointers.
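As an example of this kind of load-time conversion, the sketch below copies a CONSTANT_Utf8 constant pool entry (a two-byte big-endian length followed by the bytes) into a null-terminated C string. The function name utf8_load() and its interface are assumptions. The approach relies on the classfile format's modified UTF-8 encoding never containing a zero byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* 'p' points at the u2 length of a CONSTANT_Utf8 entry, just past its tag
 * byte. Returns a newly allocated null-terminated copy of the string, and
 * stores the address of the following entry in '*next' if 'next' is not NULL. */
char *utf8_load(const uint8_t *p, const uint8_t **next)
{
    size_t len = ((size_t)p[0] << 8) | p[1];  /* big-endian u2 length */
    char *s = malloc(len + 1);
    if (s == NULL)
        return NULL;
    memcpy(s, p + 2, len);
    s[len] = '\0';
    if (next != NULL)
        *next = p + 2 + len;
    return s;
}
```

Once the strings are null-terminated, names and descriptors can be compared with strcmp() when hooking methods and fields up to their definitions.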
Execution Engine
The core of the virtual machine is the Execution Engine. This consists
of a single function that interprets a single virtual instruction, which is
called in a loop.
An instruction is decoded from its first byte (the Opcode) by a single
switch statement. Instructions are implemented as functions which decode any
additional arguments and perform the appropriate actions. Some functions are
shared between several instructions and are parameterised.
After an instruction has been interpreted, the Program Counter (pc) is
incremented according to the size of the instruction (from a table).
In debug executables, the name of the instruction can be looked up in the table
and displayed.
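The shape of such an engine can be sketched as below for a handful of instructions. The opcode values are the standard JVM ones, but the Vm struct, the table layout and the function names are assumptions for this example, and stack bounds and type checks are omitted for brevity.

```c
#include <assert.h>
#include <stdint.h>

enum {
    OP_NOP = 0x00, OP_ICONST_0 = 0x03, OP_ICONST_1 = 0x04,
    OP_BIPUSH = 0x10, OP_IADD = 0x60, OP_RETURN = 0xb1
};

/* One table supplies both the instruction size and its name for debugging. */
typedef struct { const char *name; uint8_t size; } OpInfo;

const OpInfo op_table[256] = {
    [OP_NOP]      = {"nop", 1},      [OP_ICONST_0] = {"iconst_0", 1},
    [OP_ICONST_1] = {"iconst_1", 1}, [OP_BIPUSH]   = {"bipush", 2},
    [OP_IADD]     = {"iadd", 1},     [OP_RETURN]   = {"return", 1},
};

typedef struct {
    const uint8_t *code;  /* bytecode of the current method */
    uint32_t pc;          /* Program Counter */
    int32_t stack[8];     /* simplified evaluation stack */
    int sp;
    int running;
} Vm;

/* Parameterised function shared by the iconst_<n> instructions. */
void op_iconst(Vm *vm, int32_t n) { vm->stack[vm->sp++] = n; }

/* bipush decodes its own operand: one signed byte after the opcode. */
void op_bipush(Vm *vm) { vm->stack[vm->sp++] = (int8_t)vm->code[vm->pc + 1]; }

void op_iadd(Vm *vm)
{
    int32_t b = vm->stack[--vm->sp];
    int32_t a = vm->stack[--vm->sp];
    vm->stack[vm->sp++] = a + b;
}

/* Name lookup as a debug executable might use it. */
const char *op_name(uint8_t opcode)
{
    return op_table[opcode].name ? op_table[opcode].name : "unknown";
}

/* Interpret a single instruction. Returns 0, or -1 for an unknown opcode. */
int step(Vm *vm)
{
    uint8_t opcode = vm->code[vm->pc];
    switch (opcode) {
    case OP_NOP:                         break;
    case OP_ICONST_0: op_iconst(vm, 0);  break;
    case OP_ICONST_1: op_iconst(vm, 1);  break;
    case OP_BIPUSH:   op_bipush(vm);     break;
    case OP_IADD:     op_iadd(vm);       break;
    case OP_RETURN:   vm->running = 0;   break;
    default:          return -1;
    }
    vm->pc += op_table[opcode].size;  /* advance by the size from the table */
    return 0;
}

/* The loop that calls the single-instruction function. */
int run(Vm *vm)
{
    vm->running = 1;
    while (vm->running)
        if (step(vm) != 0)
            return -1;
    return 0;
}
```

Branch and invoke instructions would need to set the pc themselves rather than rely on the table increment; the sketch leaves them out.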
Native Methods
The mechanism for invoking native methods has been kept very simple and in the
current implementation, only statically linked methods can be invoked. The
invocation mechanism is not JNI compliant.
Native Methods are resolved at link time using a Table of Native Methods. At
present this table is static, but it may later be made dynamic to allow DLLs
containing native methods to be used.
A group of virtual instructions (the invokes) are responsible for the
invocation of new methods (both Java and native), and these are used to trigger
native method invocation. Native methods are flagged in the classfile, so upon
detection we can simply jump to the native code.
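A static Table of Native Methods and its link-time lookup might be sketched as follows. ACC_NATIVE (0x0100) is the access flag defined by the classfile format; the NativeFn calling convention, the table contents, the class name demo/Math and the functions native_resolve() and link_method() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACC_NATIVE 0x0100  /* classfile access flag marking a native method */

/* Hypothetical calling convention: arguments arrive as an array of words. */
typedef int32_t (*NativeFn)(const int32_t *args);

typedef struct {
    const char *class_name;
    const char *method_name;
    const char *descriptor;
    NativeFn fn;
} NativeEntry;

/* Example native method, statically linked into the VM. */
int32_t native_add(const int32_t *args) { return args[0] + args[1]; }

/* The static Table of Native Methods (contents invented for this sketch). */
const NativeEntry native_table[] = {
    { "demo/Math", "add", "(II)I", native_add },
};

/* Find a native implementation by class, name and descriptor. */
NativeFn native_resolve(const char *cls, const char *name, const char *desc)
{
    size_t count = sizeof native_table / sizeof native_table[0];
    for (size_t i = 0; i < count; i++) {
        const NativeEntry *e = &native_table[i];
        if (strcmp(e->class_name, cls) == 0 &&
            strcmp(e->method_name, name) == 0 &&
            strcmp(e->descriptor, desc) == 0)
            return e->fn;
    }
    return NULL;
}

typedef struct {
    const char *name;
    const char *descriptor;
    uint16_t access_flags;
    NativeFn native;  /* filled in at link time for native methods */
} Method;

/* Resolve a method at link time. Returns 0, or -1 if a method flagged
 * native has no entry in the table. Non-native methods are left alone. */
int link_method(const char *cls, Method *m)
{
    if (!(m->access_flags & ACC_NATIVE))
        return 0;
    m->native = native_resolve(cls, m->name, m->descriptor);
    return m->native != NULL ? 0 : -1;
}
```

An invoke instruction can then test the ACC_NATIVE flag and call m->native directly instead of creating a frame for Java bytecode.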
Execution Order
The virtual machine has a set order in which classes are loaded and executed.
In version 1.02a this is as follows:
- Load java.lang.System class.
- Execute java.lang.System.<clinit> (if there is one).
- Execute static method java.lang.System.initializeSystemClass()V.
- Load class specified on command line.
- Execute class.<clinit> (if there is one).
- Execute static method class.main([Ljava.lang.String;)V.
The execution of <clinit> methods is an inherent part of the
class loading procedure and takes place automatically.
The execution order for the 0.16a version is much simpler:
- Load class specified on command line.
- Execute class.<clinit> (if there is one).
- Execute static method class.main([Ljava.lang.String;)V.
This assumes a non-standard Java environment, which may or may not include a
System class. System class initialization must be done in the <clinit>
of the System class being used.