Taurus JavaVM Implementation
21st November 1999, © Copyright 1999, Taurus Software

Introduction
This document outlines the internal structure and workings of the Taurus Java Virtual Machine. Most of these features can be found in other Java VMs, but the exact implementation of most of them will be unique to the Taurus JVM.

Stack
The execution stack consists of a linked list of Stack Frames. Each Stack Frame consists of a single block of memory divided into the Frame Header and spaces for the Evaluation Stack and Local Variables. As a Java method defines both the maximum number of entries on the stack and the number of local variables for its frame, the size of this block can be calculated when the frame is created.


Figure 1: Stack Frames

The Evaluation Stack in fact uses two stacks in parallel, the Value Stack and the Type Stack. When a value is pushed onto the Value Stack, its type is recorded on the Type Stack. The Type Stack is always checked before a value is popped, and the VM can raise an error if a type mismatch is detected. This check has been critical during development of the VM and has helped catch many bugs, but it could be disabled in a release version. Beware, though: if you are not going to type-check at runtime, you need to run all classes through a verifier to check their stack operations.

The Local Variables are implemented in the same way as the Evaluation Stack and the type is recorded along with the value.
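The layout described above can be sketched in C as follows. This is a minimal illustration, not the Taurus source: the names Frame, SlotType, frame_new, frame_push and frame_pop, the set of type tags, and the 32-bit slot width are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type tags; the real VM's tag set is not given here. */
typedef enum { T_EMPTY = 0, T_INT, T_FLOAT, T_REF } SlotType;

/* Frame Header. The stacks and locals live in the same malloc'd block. */
typedef struct Frame {
    struct Frame *prev;      /* link to the calling frame (linked list) */
    uint16_t max_stack;      /* taken from the method definition */
    uint16_t max_locals;     /* taken from the method definition */
    uint16_t sp;             /* entries currently on the Evaluation Stack */
    int32_t *values;         /* Value Stack */
    int32_t *local_values;   /* Local Variable values */
    uint8_t *types;          /* Type Stack, parallel to values */
    uint8_t *local_types;    /* Local Variable types */
} Frame;

/* Allocate header, stacks and locals as one block sized from the method. */
Frame *frame_new(Frame *prev, uint16_t max_stack, uint16_t max_locals)
{
    size_t slots = (size_t)max_stack + max_locals;
    size_t size = sizeof(Frame) + slots * sizeof(int32_t) + slots * sizeof(uint8_t);
    Frame *f = calloc(1, size);
    if (f == NULL)
        return NULL;
    f->prev = prev;
    f->max_stack = max_stack;
    f->max_locals = max_locals;
    f->values = (int32_t *)(f + 1);
    f->local_values = f->values + max_stack;
    f->types = (uint8_t *)(f->local_values + max_locals);
    f->local_types = f->types + max_stack;
    return f;
}

/* Push a value and record its type. Returns 0, or -1 on overflow. */
int frame_push(Frame *f, int32_t value, SlotType type)
{
    assert(f != NULL);
    if (f->sp >= f->max_stack)
        return -1;
    f->values[f->sp] = value;
    f->types[f->sp] = (uint8_t)type;
    f->sp++;
    return 0;
}

/* Check the Type Stack before popping.
 * Returns 0 on success, -1 on underflow, -2 on a type mismatch. */
int frame_pop(Frame *f, SlotType expected, int32_t *out)
{
    assert(f != NULL && out != NULL);
    if (f->sp == 0)
        return -1;
    if (f->types[f->sp - 1] != (uint8_t)expected)
        return -2;
    f->sp--;
    *out = f->values[f->sp];
    return 0;
}
```

Because the whole frame is one allocation, returning from a method only needs a single free() of the frame pointer. The type checks in frame_pop could be compiled out in a release build if classes are verified beforehand.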

Heap
As we don't do any Garbage Collection, the Heap is incredibly simple. Memory is allocated when needed using malloc(). Memory (apart from Stack Frames) is never freed.

Object References
An idea of how to implement Object References is hinted at in Sun's Virtual Machine Specification. An Object Reference simply consists of a pointer to a block of memory, which in turn contains two more pointers. The first pointer points to the heap memory for the object; the second points to the classfile object representing the object's type.


Figure 2: Object Reference

The current implementation also stores some additional information in an object reference to distinguish arrays, which are also handled as objects.

Later in the implementation, we will use this to help implement a sliding heap and garbage collection. At the moment, it just gives us simple runtime type-checking of Object References.
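A rough C sketch of this handle scheme is shown below. The names ClassFile, ObjectHandle, ObjectRef, object_new, object_is_exactly and object_relocate are hypothetical, and the is_array flag is only a guess at the "additional information" mentioned above. The relocate function illustrates why the indirection would help a future sliding heap; the current VM never frees or moves objects.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the loaded classfile object (hypothetical fields). */
typedef struct ClassFile {
    const char *name;
    size_t instance_words;   /* object size in 32-bit words */
} ClassFile;

/* The block an Object Reference points to: two pointers plus a flag. */
typedef struct ObjectHandle {
    int32_t *data;           /* heap memory for the object */
    const ClassFile *type;   /* classfile object for the object's type */
    int is_array;            /* assumed extra information for arrays */
} ObjectHandle;

typedef ObjectHandle *ObjectRef;

static size_t object_bytes(const ClassFile *cls)
{
    size_t words = cls->instance_words ? cls->instance_words : 1;
    return words * sizeof(int32_t);
}

/* Allocate a zeroed object and the handle that refers to it. */
ObjectRef object_new(const ClassFile *cls)
{
    assert(cls != NULL);
    ObjectRef ref = malloc(sizeof *ref);
    if (ref == NULL)
        return NULL;
    ref->data = calloc(1, object_bytes(cls));
    if (ref->data == NULL) {
        free(ref);
        return NULL;
    }
    ref->type = cls;
    ref->is_array = 0;
    return ref;
}

/* Simple runtime type check: exact class match only, no inheritance. */
int object_is_exactly(ObjectRef ref, const ClassFile *cls)
{
    return ref != NULL && !ref->is_array && ref->type == cls;
}

/* Move the object's storage. Every ObjectRef stays valid because only
 * the handle's data pointer changes. */
int object_relocate(ObjectRef ref)
{
    size_t bytes = object_bytes(ref->type);
    int32_t *moved = malloc(bytes);
    if (moved == NULL)
        return -1;
    memcpy(moved, ref->data, bytes);
    free(ref->data);
    ref->data = moved;
    return 0;
}
```

The cost of this design is one extra pointer dereference on every field access, in exchange for being able to move objects without touching the references held on stacks and in other objects.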

Object Representation
Objects are just flat blocks of memory. The fields are allocated as words of memory within the block, and the offsets for these are calculated at class load time.
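One way the load-time offset calculation might look is sketched here. The Field structure, the function names and the base_words parameter (space assumed to be taken by superclass fields) are hypothetical. Giving long (J) and double (D) fields two words is also an assumption; the document only says that fields are allocated as words.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical loaded-field record. */
typedef struct Field {
    const char *name;
    const char *descriptor;  /* JVM field descriptor, e.g. "I" or "J" */
    size_t offset;           /* word offset within the object block */
} Field;

/* Assumption: long and double need two words, everything else one. */
static size_t field_words(const char *descriptor)
{
    return (descriptor[0] == 'J' || descriptor[0] == 'D') ? 2 : 1;
}

/* Assign consecutive word offsets starting at base_words.
 * Returns the total object size in words. */
size_t layout_fields(Field *fields, size_t count, size_t base_words)
{
    assert(fields != NULL || count == 0);
    size_t next = base_words;
    for (size_t i = 0; i < count; i++) {
        fields[i].offset = next;
        next += field_words(fields[i].descriptor);
    }
    return next;
}
```

With the offsets fixed at load time, a field access at runtime reduces to indexing the object's block by a precomputed word offset.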

Classfiles
Classfiles are converted to an optimised in-memory representation when they are loaded. UTF8 strings are null-terminated for convenience. Methods and fields are hooked up to their definitions. Code and attribute offsets are converted to pointers.
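As an illustration of the string conversion, the sketch below copies a length-prefixed constant-pool string into a null-terminated C string. In the classfile format a CONSTANT_Utf8 entry stores a big-endian two-byte length followed by the bytes. The function name utf8_load and its consumed parameter are hypothetical. Null-termination is safe because Java's modified UTF-8 never contains a zero byte: the NUL character is encoded as the two bytes 0xC0 0x80.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* p points at the u2 length of a CONSTANT_Utf8 entry (just past the tag).
 * Returns a malloc'd null-terminated copy, or NULL on allocation failure.
 * If consumed is not NULL it receives the number of bytes read. */
char *utf8_load(const uint8_t *p, size_t *consumed)
{
    assert(p != NULL);
    size_t len = ((size_t)p[0] << 8) | p[1];   /* big-endian u2 */
    char *s = malloc(len + 1);
    if (s == NULL)
        return NULL;
    memcpy(s, p + 2, len);
    s[len] = '\0';
    if (consumed != NULL)
        *consumed = 2 + len;
    return s;
}
```

After loading, names and descriptors can be compared with strcmp() and printed directly, which is presumably the convenience referred to above.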

Execution Engine
The core of the virtual machine is the Execution Engine. This consists of a single function that interprets a single virtual instruction, which is called in a loop.

An instruction is decoded from its first byte (the Opcode) by a single switch statement. Instructions are implemented as functions which decode any additional arguments and perform the appropriate actions. Some functions are shared between several instructions and are parameterised.

After an instruction has been interpreted, the Program Counter (pc) is incremented according to the size of the instruction, which is looked up in a table.

In debug executables, the name of the instruction can also be looked up in this table and displayed.
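The structure described in this section might be sketched as follows, using a handful of real JVM opcodes. This is only an assumed shape: the Vm structure, the function names, the fixed 16-entry stack and the VM_DEBUG macro are inventions for the example, and the type stack is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* A few opcodes with their values from the JVM specification. */
enum {
    OP_NOP = 0x00, OP_ICONST_0 = 0x03, OP_ICONST_1 = 0x04,
    OP_BIPUSH = 0x10, OP_IADD = 0x60, OP_RETURN = 0xb1
};

/* One table gives both the instruction size and its debug name. */
typedef struct { const char *name; uint8_t size; } OpInfo;

static const OpInfo op_table[256] = {
    [OP_NOP]      = { "nop", 1 },
    [OP_ICONST_0] = { "iconst_0", 1 },
    [OP_ICONST_1] = { "iconst_1", 1 },
    [OP_BIPUSH]   = { "bipush", 2 },
    [OP_IADD]     = { "iadd", 1 },
    [OP_RETURN]   = { "return", 1 },
};

typedef struct {
    const uint8_t *code;
    size_t pc;
    int32_t stack[16];
    int sp;
    int running;
} Vm;

/* Shared, parameterised implementation for iconst_<n> and bipush. */
static void do_push_int(Vm *vm, int32_t n)
{
    assert(vm->sp < 16);
    vm->stack[vm->sp++] = n;
}

static void do_iadd(Vm *vm)
{
    assert(vm->sp >= 2);
    int32_t b = vm->stack[--vm->sp];
    int32_t a = vm->stack[--vm->sp];
    /* Java int addition wraps, so add as unsigned. */
    vm->stack[vm->sp++] = (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Interpret a single instruction. Returns 0, or -1 for an unknown opcode. */
int vm_step(Vm *vm)
{
    uint8_t op = vm->code[vm->pc];
#ifdef VM_DEBUG
    fprintf(stderr, "%4zu: %s\n", vm->pc, op_table[op].name ? op_table[op].name : "?");
#endif
    switch (op) {
    case OP_NOP:      break;
    case OP_ICONST_0: do_push_int(vm, 0); break;
    case OP_ICONST_1: do_push_int(vm, 1); break;
    case OP_BIPUSH:   do_push_int(vm, (int8_t)vm->code[vm->pc + 1]); break;
    case OP_IADD:     do_iadd(vm); break;
    case OP_RETURN:   vm->running = 0; break;
    default:          return -1;
    }
    vm->pc += op_table[op].size;   /* advance by the size from the table */
    return 0;
}

/* The loop that repeatedly calls the single-instruction function. */
void vm_run(Vm *vm)
{
    while (vm->running && vm_step(vm) == 0) {
    }
}
```

In a fuller interpreter, branch and invoke instructions would set pc themselves, so their handlers would need to skip or compensate for the table-driven increment.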

Native Methods
The mechanism for invoking native methods has been kept very simple: in the current implementation, only statically linked methods can be invoked. The invocation mechanism is not JNI compliant.

Native Methods are resolved at link time using a Table of Native Methods. At present this table is static, but it may later be made dynamic to allow DLLs containing native methods to be used.

A group of virtual instructions (the invokes) are responsible for the invocation of new methods (both Java and native), and these are used to trigger native method invocation. Native methods are flagged in the classfile, so upon detection we can simply jump to the native code.
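A static Table of Native Methods and the check made by the invoke instructions could look roughly like this. The NativeFn calling convention, the table layout, the Method structure and the demo/Math.add entry are all hypothetical. ACC_NATIVE (0x0100) is the real access flag that marks a native method in a classfile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACC_NATIVE 0x0100   /* method access flag from the JVM specification */

/* Assumed native calling convention: an array of word-sized arguments. */
typedef int32_t (*NativeFn)(const int32_t *args, int nargs);

typedef struct {
    const char *class_name;
    const char *method_name;
    const char *descriptor;
    NativeFn fn;
} NativeEntry;

/* Example native method (hypothetical). */
static int32_t native_add(const int32_t *args, int nargs)
{
    assert(nargs == 2);
    return args[0] + args[1];
}

/* The static table, fixed when the VM executable is built. */
static const NativeEntry native_table[] = {
    { "demo/Math", "add", "(II)I", native_add },
};

/* Link-time resolution: find the C function for a native Java method. */
NativeFn native_resolve(const char *cls, const char *name, const char *desc)
{
    size_t n = sizeof native_table / sizeof native_table[0];
    for (size_t i = 0; i < n; i++) {
        const NativeEntry *e = &native_table[i];
        if (strcmp(e->class_name, cls) == 0 &&
            strcmp(e->method_name, name) == 0 &&
            strcmp(e->descriptor, desc) == 0)
            return e->fn;
    }
    return NULL;
}

/* Hypothetical loaded-method record. */
typedef struct {
    uint16_t access_flags;
    NativeFn native;         /* filled in at link time for native methods */
} Method;

/* What an invoke instruction might do. Returns 0 and stores the result
 * if the method is native; returns -1 for a Java method, where a real VM
 * would create a new Stack Frame instead. */
int invoke_method(const Method *m, const int32_t *args, int nargs, int32_t *result)
{
    if ((m->access_flags & ACC_NATIVE) && m->native != NULL) {
        *result = m->native(args, nargs);
        return 0;
    }
    return -1;
}
```

Making the table dynamic later would mainly mean replacing the fixed array with one that can be extended as libraries are loaded; the lookup itself could stay the same.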

Execution Order
The virtual machine has a set order in which classes are loaded and executed. In version 1.02a this is as follows:

  • Load java.lang.System class.
  • Execute java.lang.System.<clinit> (if there is one).
  • Execute static method java.lang.System.initializeSystemClass()V.
  • Load class specified on command line.
  • Execute class.<clinit> (if there is one).
  • Execute static method class.main([Ljava.lang.String;)V.
The execution of <clinit> methods is an inherent part of the class loading procedure and takes place automatically. The execution order for the 0.16a version is much simpler:
  • Load class specified on command line.
  • Execute class.<clinit> (if there is one).
  • Execute static method class.main([Ljava.lang.String;)V.
This assumes a non-standard Java environment, which may or may not include a System class. System class initialization must be done in the <clinit> of the System class being used.