Ozone Developers' Guide

Yanick Duchesne

Per Nyfelt

Ozone Documentation License, Version 1

This document is free software; you can redistribute it and/or modify it provided that the terms of the GNU Library General Public License as published by the Free Software Foundation version 2 of the License; and the following terms are met.

The Ozone Database Project <ozone@ozone-db.org>

Included in the ozone distribution is code and documentation made available by other copyright holders and under different licenses. All these licenses allow worldwide, royalty free distribution, whether alone or as part of a larger product. License, copyright and disclaimer of this software is included in this directory.

The document is Copyright (C) 1997-2001 by SMB GmbH, Rohrteichstr. 18, 04347 Leipzig, Germany, All rights reserved.

You must give prominent notice with each copy of the work that the document is used in it and that its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License.

The name ozone must not be used to endorse or promote software products derived from this software without prior written permission of SMB.

Software products derived from this document may not be called ozone nor may ozone appear in their names without prior written permission of SMB.

This document is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Library General Public License for more details.

Abstract

This document is aimed at developers who would like to contribute to Ozone's codebase, or who simply wish to acquire a better understanding of Ozone's inner workings, either for personal enlightenment or to learn how best to use Ozone in their own projects. This document gathers information from Ozone's initiators and contributors, and from discussions on the Ozone mailing lists.

The Ozone Developer's Guide was designed to provide a complete architectural layout of Ozone, as well as insights into its implementation. Together with Ozone's javadoc, the guide will help unleash Ozone's secrets to potential developers.


Table of Contents

1. Introduction
Purpose of this document
Getting the Ozone source code
Building Ozone from the sources
Contributing updates
Coding Standards
Submitting Patches
Getting commit access
JUnit tests
org.ozoneDB.test Package Description
OzoneTestRunner
2. Architecture
General View
Ozone Proxies
Ozone Modules
XML
3. Exploring Ozone
Client-Side
org.ozoneDB.ExternalDatabase
Factory Behavior
org.ozoneDB.LocalDatabase vs org.ozoneDB.RemoteDatabase
Thread management and Connection Pooling
org.ozoneDB.OzoneProxy and org.ozoneDB.OzoneInterface
Outside the server
Inside the server
Network
The Ozone Core
The Ozone Garbage Collector
Rationale
Design
Prerequisites
Remaining questions and problems
4. Creating new release versions of Ozone
Branching

The source code is maintained with the help of CVS. If you do not have CVS, or have it but do not know how to use it, take a look at cvshome. Check out /ozone/ozoneDoc at SourceForge using the setup described there. In short, the cvs command to use for anonymous access is:

      ~> cvs -d :pserver:anoncvs@cvs.ozone.sourceforge.net:/cvsroot/ozone co ozone
      
For developer read-write access, the settings and commands are as follows:
      ~> export CVS_RSH=ssh
      ~> export CVSROOT=:ext:yourSourceforgeId@cvs.ozone.sourceforge.net:/cvsroot/ozone
      ~> cvs co ozone
      
All components are found under the ozone directory. The core is under the server directory. Other things are in either modules or third party. OzoneDoc is special since it is only documentation.

You are now ready to build the core:

    ~/ozone> cd server
    ~/ozone/server> build.sh
    
or use the equivalent build.bat file on Windows.

To start ozone set your OZONE_HOME to ~/ozone/server/build and run ozone using the scripts in ~/ozone/server/build/bin. Don't try to run the server by using the scripts in server/bin as the paths will not be set up properly.

To build a binary distribution of the server, use the dist.bin target. This will create an ozone-[version]-dev directory that should be set as the OZONE_HOME in order to use it. To build a binary distribution of the server and all modules, use the dist.bin target in the ~/ozone root directory.

OZONE_HOME refers to the base location for the compiled, runnable structure, not the CVS project root dir. For version 1.0x the two locations happened to be the same but this is no longer the case for 1.1x.

As mentioned above, there are two possible settings for OZONE_HOME for developers:

  1. Set it to ozone/server/build. This allows for simple recompilation to get updated classes which is fast and simple.

  2. Set it to ozone/server/build/ozone-1.1.X-dev. This is the server distribution dir, i.e. what the structure will look like in the binary distributions. It is smaller since all classes are packaged into jars, but it takes longer to build (since the dist.bin target needs to be invoked to produce it). I typically copy the ozone-1.1.X-dev directory to somewhere such as /test/ozone-1.1.X-dev and point OZONE_HOME there so that I can do more formal testing of my system.

It might be easier to start with something small before attempting to take on a major feature enhancement. Things like javadoc on members and public attributes, implementations of methods that are empty, etc. are always welcome. More probably, though, you have found a bug in ozone while working on your application and are reading this document to figure out where to start looking for a fix. If none of this is the case, the CHANGES file contains a TODO list of things needed.

We need to expand our JUnit tests, so if it is possible to write a test for a fix or feature add-on you're working on, that would be great.

Here's a quick overview of our approach to JUnit tests:

For further reference see Log4J at apache and the JUnit homepage.

The Ozone architecture, very generally represented by the diagram further below, has four main layers:

The above architecture illustrates the most obvious use of Ozone, in which it acts as a remote server; but a client application can also instantiate an Ozone server as part of its own VM, thereby removing the network layer. Such a scheme is convenient for standalone use. Even then, access is implemented through a proxy pattern, which keeps client code portable across both configurations (local and remote). We will see further below what this proxy scheme consists of.

Database objects are the persistent objects designed by developers to fulfill their application logic needs. Database objects implement a given interface (in more concrete terms, a Java interface that extends org.ozoneDB.OzoneRemote), and this interface is the "visible" side of database objects. There is only one instance of a database object, which lives inside the database server. This database object is controlled via proxy objects.

A given proxy object represents its corresponding database object - inside the client applications and inside other database objects. A proxy object can be seen as a persistent reference. Proxy classes are automatically generated out of the database classes by the Ozone post-processor and implement the same public interface as their respective database object counterpart - which means that they also implement the OzoneRemote interface that their corresponding database object implements.

All ozone API methods return proxies for the actual database object inside the database. Therefore, the client deals with proxies only. However, this is transparent to the client: proxies can be used as if they were the actual database objects, since they implement the same interface.
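
The relationship between a database object and its proxies can be illustrated with a small, self-contained sketch. It does not use the Ozone classes: Car, CarImpl, Link, InVmLink and CarProxy are hypothetical stand-ins chosen for illustration, and the dispatch by method name is an assumption made to keep the example short. It only shows that a proxy holds an object id and a link, and that several proxies can reach the same single database object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; none of these types belong to the Ozone API.
interface Car {
    String name();
    void setName(String name);
}

// The "database object": exactly one instance exists, on the server side.
final class CarImpl implements Car {
    private String name = "";
    public String name() { return name; }
    public void setName(String name) { this.name = name; }
}

// Stand-in for the link a proxy uses to reach its database object.
interface Link {
    Object invoke(long objectId, String method, Object[] args);
}

// A link for a server living in the client's VM: calls are dispatched directly.
final class InVmLink implements Link {
    private final Map<Long, CarImpl> store = new HashMap<>();
    int calls = 0; // counts how many calls went through the link

    void put(long objectId, CarImpl car) { store.put(objectId, car); }

    public Object invoke(long objectId, String method, Object[] args) {
        calls++;
        CarImpl target = store.get(objectId);
        if (target == null) {
            throw new IllegalStateException("no such object: " + objectId);
        }
        switch (method) {
            case "name": return target.name();
            case "setName": target.setName((String) args[0]); return null;
            default: throw new IllegalArgumentException(method);
        }
    }
}

// The proxy implements the same interface but holds only an id and a link.
final class CarProxy implements Car {
    private final long objectId;
    private final Link link;

    CarProxy(long objectId, Link link) {
        this.objectId = objectId;
        this.link = link;
    }

    public String name() {
        return (String) link.invoke(objectId, "name", new Object[0]);
    }

    public void setName(String name) {
        link.invoke(objectId, "setName", new Object[] { name });
    }
}

final class ProxyDemo {
    public static void main(String[] args) {
        InVmLink link = new InVmLink();
        link.put(1L, new CarImpl());
        Car a = new CarProxy(1L, link);
        Car b = new CarProxy(1L, link); // a second reference to the same object
        a.setName("Beetle");
        System.out.println(b.name()); // prints "Beetle"
    }
}
```

Client code written against Car does not change if InVmLink is replaced by a link that forwards calls over the network, which is the portability the proxy scheme is meant to provide.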

Database objects are different from ordinary Java objects (other systems and specs, like JDO, respectively call them "primary" and "secondary", or "first-class" and "second-class"). Only one instance of a given database object reference exists in the database, as opposed to standard Java objects, which are treated in a "by-copy" fashion each time they are serialized. By analogy, database objects are a bit like rows in a relational database table, and members of these database objects that are standard Java objects correspond to the columns in the row - database object members would correspond to links to other tables, if we push the analogy.

Standard Java objects that are part of a database object can be directly used from the client application and from other database objects. In fact, database objects are sets of ordinary objects. For example the class "Car" contains several String members which are objects. One object of type Car and the dependent String members is a database object. In this regard modeling an ozone database is comparable to modeling a relational database.

Database objects in the server are loaded into memory as a whole when someone invokes a method on the object. Since the client actually uses a proxy to the server object, it gets loaded when a method on the proxy is invoked. This is a very important feature as it ensures that objects in the server are only loaded when they are needed. As an example, if the proxy you are working with contains a collection, the collection and the proxies in the collection will be available on the client but none of the corresponding objects in the server will be loaded. The object server loads objects based on client use of the proxy objects which helps reduce the load and memory usage in the server by only loading objects when needed.
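
As a rough illustration of loading on first use, here is a minimal sketch. LazyObjectStore and its methods are hypothetical names, not Ozone's storage code; the Supplier stands in for whatever reads an object from persistent storage.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch for illustration; this is not how Ozone's object store is written.
final class LazyObjectStore {
    // object id -> loader that would read the object from persistent storage
    private final Map<Long, Supplier<Object>> onDisk = new HashMap<>();
    // object id -> object already loaded into memory
    private final Map<Long, Object> inMemory = new HashMap<>();

    void store(long objectId, Supplier<Object> loader) {
        onDisk.put(objectId, loader);
    }

    // Called when a method is invoked through a proxy: loads the object only once.
    Object activate(long objectId) {
        Object loaded = inMemory.get(objectId);
        if (loaded == null) {
            Supplier<Object> loader = onDisk.get(objectId);
            if (loader == null) {
                throw new IllegalStateException("no such object: " + objectId);
            }
            loaded = loader.get();
            inMemory.put(objectId, loaded);
        }
        return loaded;
    }

    int loadedCount() {
        return inMemory.size();
    }
}
```

Holding a proxy (here, just an object id) costs the server nothing; only the call to activate brings the object into memory.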

The ExternalDatabase class implements the org.ozoneDB.OzoneInterface which, as shown below, defines the behavior expected from classes that allow interaction with the Ozone server - and the latter's subcomponents:

public OzoneProxy createObject( String className ) throws Exception;
public OzoneProxy createObject( String className, int access ) throws Exception;
public OzoneProxy createObject( String className, int access, String objName ) throws Exception;
public OzoneProxy createObject( String className, int access, String objName, String sig, Object[] args ) throws Exception;
public OzoneProxy copyObject( OzoneRemote rObj ) throws Exception;
public void deleteObject( OzoneRemote rObj ) throws Exception;
public void nameObject( OzoneRemote rObj, String name ) throws Exception;
public OzoneProxy objectForName( String name ) throws Exception;
public OzoneProxy[] objectsOfClass( String name ) throws Exception;
public Object invoke( OzoneProxy rObj, String methodName, String sig, Object[] args, int lockLevel ) throws Exception;
public Object invoke( OzoneProxy rObj, int methodIndex, Object[] args, int lockLevel ) throws Exception;
public OzoneCompatible fetch( OzoneProxy rObj, int lockLevel ) throws Exception;
public void reloadClasses() throws Exception;
public Node xmlForObject( OzoneRemote rObj, Document domFactory ) throws Exception;
public void xmlForObject( OzoneRemote rObj, ContentHandler ch ) throws Exception;
public OzoneService service (int serviceHandle) throws Exception;

As can be seen above, OzoneInterface defines methods pertaining to the management of database objects - through their corresponding proxies (instances of OzoneProxy); these methods correspond to the create-read-delete operations - the update operations being performed through the methods of the proxy - which are shared by its corresponding database object.

Proxies retrieved by client applications delegate their methods' behavior to their corresponding database object. This delegation is not direct, but an illusion: a proxy always goes through an OzoneInterface instance to "reach" its corresponding database object. For example, when a proxy is retrieved from an ExternalDatabase instance, it keeps a link to the latter and delegates its method calls to it. The method call is relayed to the Ozone server using Ozone's custom protocol. The Ozone server then performs the method call on the database object instance that corresponds to the proxy from which the call originates, in a transactionally safe manner, and returns the result, if any.

The following code excerpt was generated by the Ozone Post-Processor (OPP). It belongs to the Car example delivered with Ozone (and featured on the web site), and is in fact the Car proxy. The delegation mechanism is clearly visible:

import org.ozoneDB.*;
import org.ozoneDB.core.ObjectID;
import org.ozoneDB.core.Lock;
import org.ozoneDB.core.ResultConverter;

/**
 * This class was automatically generated by ozone's OPP.
 * Do not instantiate or use this class directly.
 */
public final class CarImpl_Proxy 
       extends OzoneProxy
       implements newtradetest.ozone.Car {

   static final long	serialVersionUID = 1L;

   public CarImpl_Proxy() {
      super();
      }


   public CarImpl_Proxy (ObjectID oid, OzoneInterface link) {
      super (oid, link);
      }

   public java.lang.String name () {
      try {
         Object target = link.fetch (this, Lock.LEVEL_READ);
         if (target != null) {
            return (java.lang.String)ResultConverter.substituteOzoneCompatibles (((newtradetest.ozone.CarImpl)target).name());
            }
         else {
            Object[] args = {};
            Object result = link.invoke (this, 8, args, Lock.LEVEL_READ);
            return (java.lang.String)result;
            }
         }
      catch (RuntimeException e) {
         e.fillInStackTrace();
         throw e;
         }
      catch (Exception e) {
         e.fillInStackTrace();
         throw new UnexpectedException (e.toString());
         }
      }


   public void setName (java.lang.String arg0) {
      try {
         Object target = link.fetch (this, Lock.LEVEL_WRITE);
         if (target != null) {
            arg0 = (java.lang.String)ResultConverter.substituteOzoneCompatibles (arg0);
            ((newtradetest.ozone.CarImpl)target).setName(arg0);
            }
         else {
            Object[] args = {arg0};
            Object result = link.invoke (this, 15, args, Lock.LEVEL_WRITE);
            }
         }
      catch (RuntimeException e) {
         e.fillInStackTrace();
         throw e;
         }
      catch (Exception e) {
         e.fillInStackTrace();
         throw new UnexpectedException (e.toString());
         }
      }
   }

The persistent garbage collector is a mark-and-sweep garbage collector which acts transparently in the background. Basically, it works as follows:

There are three disjoint virtual sets of objects (possiblyReachable, surelyReachable and doneReachable). All objects initially belong to possiblyReachable. The root objects are moved to surelyReachable. Then, every surelyReachable object is processed for the references it holds, and the referenced objects are added to the surelyReachable set. Each processed object is then moved to the doneReachable set.

The process ends when the surelyReachable set is empty. The objects in the doneReachable set are reachable, and the objects remaining in the possiblyReachable set are not. The latter are then reclaimed.
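
The marking phase described above can be sketched in a few lines. This is an illustrative, in-memory version only: ThreeSetMarker and findGarbage are hypothetical names, the object graph is passed in as a plain map of ids, and the concurrency and transaction concerns of the real collector are ignored.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the three-set scheme; not Ozone's GarbageCollector class.
final class ThreeSetMarker {

    /**
     * @param references object id -> ids of the objects it references;
     *                   every known object is assumed to have an entry
     * @param roots      ids of the root objects
     * @return the ids left in possiblyReachable, i.e. the objects to reclaim
     */
    static Set<Long> findGarbage(Map<Long, List<Long>> references, Collection<Long> roots) {
        // All objects initially belong to possiblyReachable.
        Set<Long> possiblyReachable = new HashSet<>(references.keySet());
        Deque<Long> surelyReachable = new ArrayDeque<>();
        Set<Long> doneReachable = new HashSet<>(); // kept only to mirror the description

        // The root objects are moved to surelyReachable.
        for (Long root : roots) {
            if (possiblyReachable.remove(root)) {
                surelyReachable.add(root);
            }
        }

        // Process until surelyReachable is empty.
        while (!surelyReachable.isEmpty()) {
            Long current = surelyReachable.poll();
            List<Long> refs = references.getOrDefault(current, Collections.emptyList());
            for (Long referenced : refs) {
                // Only objects still in possiblyReachable move; the sets stay disjoint
                // and cycles are visited once.
                if (possiblyReachable.remove(referenced)) {
                    surelyReachable.add(referenced);
                }
            }
            doneReachable.add(current);
        }
        return possiblyReachable;
    }
}
```

For a graph where object 1 is the only root, 1, 2 and 3 reference each other in a cycle, and 4 and 5 only reference each other, findGarbage returns the set containing 4 and 5.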

The process is done at object granularity so that there is no need to stop database operations for running the garbage collector.

The GarbageCollector considers the following database objects as reachable (they will not be garbage collected):

Under these rules, I can know that a proxy has died:
  • Because it sends a notification message to the server on the finalize() call

  • Because the connection to the client is lost.

Other objects are considered as "unreachable" and will be reclaimed once a GarbageCollector run has finished.

Named objects are not the only reachable objects: any object that can be reached from a reachable object is reachable as well. These "indirectly reachable" objects do not have names, yet they must not be reclaimed.

The GarbageCollector runs at object granularity, thus it does not block the rest of the database users. (It still disturbs them, because it is expected to cause heavy I/O traffic.)

The only issue that still seems to affect the theoretical implementability of such a garbage collector is the way transaction support is implemented. Transactions that are rolled back may recreate objects which the garbage collector has already treated as dead, unless the structure of the transaction support implementation is carefully observed.

How does it take into account that proxies (object references) can not only exist in other objects but also on the client side?

I'm tracking OzoneProxy instances which leave the database VM for database clients. When they are garbage collected on the client (finalize()), they "sign off", i.e. they tell the ozoneDB server that the reference they represented no longer exists. If the connection to the client breaks, all registered OzoneProxy references are declared dead. This is inefficient (due to higher network load), and currently it is not 100% secure (because ResultConverter.substituteOzoneCompatibles() does not substitute recursively as it should), but there is a solution for "embedded proxies" which is complicated but implementable. Before implementing the complicated solution, I'd like to know from the community whether an OzoneObject exists which could return an "embedded proxy" to its caller (i.e. a proxy reachable within the returned object, where the returned object is not a proxy object itself). If the complicated solution is not needed, it does not have to be implemented.
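
The bookkeeping this implies on the server side might look roughly like the following sketch. ExportedProxyRegistry and its method names are hypothetical, and so is the use of a string as client connection id; the sketch assumes one counter per client and object, so that a client holding several proxies for the same object only releases it after the last sign-off.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookkeeping sketch; class and method names are not Ozone's.
final class ExportedProxyRegistry {
    // client connection id -> (database object id -> number of live client-side proxies)
    private final Map<String, Map<Long, Integer>> byClient = new HashMap<>();

    // Called when a proxy leaves the database VM towards a client.
    void proxyExported(String clientId, long objectId) {
        byClient.computeIfAbsent(clientId, k -> new HashMap<>())
                .merge(objectId, 1, Integer::sum);
    }

    // Called when a client-side proxy is finalized and "signs off".
    void proxySignedOff(String clientId, long objectId) {
        Map<Long, Integer> held = byClient.get(clientId);
        if (held == null) {
            return;
        }
        // Returning null from the remapping function removes the entry.
        held.computeIfPresent(objectId, (id, n) -> n > 1 ? n - 1 : null);
        if (held.isEmpty()) {
            byClient.remove(clientId);
        }
    }

    // Called when the connection breaks: all references of that client are declared dead.
    void connectionLost(String clientId) {
        byClient.remove(clientId);
    }

    // A collector would treat objects for which this returns true as reachable.
    boolean isHeldByAnyClient(long objectId) {
        for (Map<Long, Integer> held : byClient.values()) {
            if (held.containsKey(objectId)) {
                return true;
            }
        }
        return false;
    }
}
```

The sketch does not address embedded proxies: a proxy hidden inside a returned non-proxy object would never reach proxyExported, which is the gap the paragraph above describes.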

Referential Integrity: One issue we might need to investigate is the case where calling the onDelete() method actually makes that object reachable again: is this handled properly by the garbage collector? This is a fine example of a sophisticated garbage collector test case.

GarbageCollector.processSurelyReachableObjectsWhichHaveToBeMarkedAsSuch(): What should be done in the catch (TransactionExc e) block?

GarbageCollector.interceptInvocationPre(): There is a design decision here which heavily affects performance: do we first check whether a parameter may be an OzoneProxy, or do we first check whether the call crosses the border between the different sets of database objects?

GarbageCollector.GarbageCollectorProxyObjectIdentificationObjectOutputStream

Maybe one GarbageCollectorProxyObjectIdentificationObjectOutputStream per GarbageCollector is sufficient, but state is maintained in the ObjectOutputStream, and flushing that state while another thread is using the same ObjectOutputStream produces situations which the developers of ObjectOutputStream did not anticipate. So this is currently not considered.

Once a branch is created no new functionality should go in there, only bug fixes. All new functionality goes into HEAD which is always regarded as in a state of flux.

The basic idea is to have a structure as follows [v][version_numbers][-type][-state] where type is branch, release etc. and state is alpha, beta etc.

* When version numbers are included in the name, change dots to underscore e.g. "1.1.x" becomes "v1_1_x"

* When branching always create a tag with the type suffix -root first e.g. "v1_1_x-root"

* Branches should have the type suffix -branch e.g. "v1_1_x-branch"

* Releases are tagged with the type suffix -release e.g. "v1_1-release"
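
The naming rules above are mechanical enough to express as a small helper. The class CvsTagNames below is a hypothetical illustration and not part of Ozone's build; it follows the [v][version_numbers][-type][-state] structure, with the state being optional.

```java
// Hypothetical helper illustrating the naming convention; not part of Ozone's build scripts.
final class CvsTagNames {

    // "1.1.x" -> "v1_1_x": prefix with "v" and change dots to underscores.
    static String versionPart(String version) {
        return "v" + version.replace('.', '_');
    }

    // The tag to create first when branching, e.g. "v1_1_x-root".
    static String rootTag(String version) {
        return versionPart(version) + "-root";
    }

    // The branch itself, e.g. "v1_1_x-branch".
    static String branchTag(String version) {
        return versionPart(version) + "-branch";
    }

    // A release tag, e.g. "v1_1-release"; a state such as "alpha" is appended if given.
    static String releaseTag(String version, String state) {
        String tag = versionPart(version) + "-release";
        if (state == null || state.isEmpty()) {
            return tag;
        }
        return tag + "-" + state;
    }

    public static void main(String[] args) {
        System.out.println(rootTag("1.1.x"));        // prints "v1_1_x-root"
        System.out.println(branchTag("1.1.x"));      // prints "v1_1_x-branch"
        System.out.println(releaseTag("1.1", null)); // prints "v1_1-release"
    }
}
```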

The "x" at the end of the version number marks the scope of the branch, whereas the additional "dev" means it is from HEAD. For example, after the branch for "1.1.x" is created, HEAD's build.properties is updated from "1.1.x-dev" to "1.2.x-dev".

"x" and "x-dev" should never appear in normal tags but only when creating the branch.

We create one branch for each release point (1.1, 1.2 etc.), and subsequent releases within this branch (1.1.1, 1.1.2 etc.) are tagged. For example, when working on "v1_1_x-branch" and releasing the first alpha, the tag would be "v1_1-release-alpha". Before tagging the release, build.properties is updated to match the tag name, so that the tag "v1_1-release-alpha" would be released as "ozone-1.1-alpha" and have "version = 1.1-alpha" in build.properties.

The following state suffixes are used:

-alpha This means the first cut after a branch. All functionality should be there but proper testing has not been made other than it builds and "seems to run fine".

-beta When all tests run without problems and sufficient time has passed after the alpha a beta is released.

If bugs are encountered after a beta release, a second beta may be needed. In that case, just add an incrementing number to the -beta suffix, e.g. -beta2 (note: -beta is considered -beta1, so the number 1 is never used).

The final version simply has no state suffix, e.g. "v1_1-release".
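
The numbering rule for repeated states is easy to get wrong by hand, so here is a minimal sketch of it. StateSuffix is a hypothetical helper, not part of Ozone; it assumes the iteration count starts at 1 and that the same rule applies to any state name.

```java
// Hypothetical helper for the state suffix rule; not part of Ozone's build scripts.
final class StateSuffix {

    // First iteration: "-beta"; later iterations: "-beta2", "-beta3", ...
    static String of(String state, int iteration) {
        if (iteration < 1) {
            throw new IllegalArgumentException("iteration must be >= 1");
        }
        // The number 1 is never written out.
        return "-" + state + (iteration == 1 ? "" : String.valueOf(iteration));
    }

    public static void main(String[] args) {
        System.out.println(of("beta", 1)); // prints "-beta"
        System.out.println(of("beta", 2)); // prints "-beta2"
    }
}
```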

If bugs are found in 1.1 then point releases fix them e.g. "v1_1_1-release" fixes bugs in 1.1 (v1_1-release) "v1_1_2-release" fixes bugs in 1.1.1 (v1_1_1-release) etc.

If necessary (i.e. major changes were made), it is possible to go through a beta cycle before releasing the bug-fixed version, e.g. "v1_1_1-release-beta" is first released as "ozone-1.1.1-beta" and, when stable, "v1_1_1-release" goes out as "ozone-1.1.1".

It is also possible, if really needed, to create a second branch off the current one, e.g. tag "v1_1_1_x-root", branch "v1_1_1_x-branch".

Another reason for creating a branch would be to explore some very different functionality for which it has not yet been decided whether it should go into HEAD. In this case, use the state suffix -devel when creating the branch and append a descriptive name after devel, separated by a dash; e.g. "v1_1-devel-garbage_collection" or "v1_1-devel-client_call_backs" would be good names.