Open Source XML Database
Quick Start

For a quick start, the following sections describe how to start the server and add some files to the database.

1. Basics

Beginning with version 0.8, eXist offers three ways to run the database: as a standalone server process, embedded into an application, or in connection with a servlet engine. All three alternatives are thread-safe and allow concurrent operations by multiple users.

In standalone mode, eXist runs in its own Java virtual machine. Clients have to access the database through the network, using either the XML-RPC protocol or simple HTTP requests.

In embedded mode, the database is controlled by the client application. It runs in the same Java virtual machine as the client, so no network connection is needed and the client has full access to the database.

eXist may also run directly in a servlet context. In this case, the database is deployed as part of a web application. All resources used by eXist will be relative to the web application's current context. For example, eXist will store all its database files in the WEB-INF/data directory of the web application. Servlets running in the same web application context have direct access to the database; a Cocoon-driven application is one example. External client applications may still use the supplied network interfaces.

Some features of eXist require a servlet engine: for example, the SOAP and WebDAV interfaces are implemented as servlets and are therefore not available in standalone mode.

2. Installation

For a quick start, it is recommended to run the server through a servlet engine, i.e. the third option described above. The standalone and embedded deployment options are covered by the deployment.xml document.

The source distribution includes a servlet engine (Jetty), so no additional software is required. If you already have a servlet engine running, you may alternatively download eXist as a bundled web application archive (.war file). However, the source distribution makes it easier to get started and to get an impression of the different clients and interfaces.

2.1. Installer

The installer will try to determine the correct paths and environment variables to launch eXist. On Windows, it can also create shortcuts. The installer is started either by double-clicking the downloaded jar file (Windows/Mac) or by calling java with the -jar option:

    java -jar eXist-{version}-installer.jar
After the installation has completed, you should be ready to launch eXist. If the installer has created shortcuts, you may simply click on the "Launch eXist" icon on the desktop or select the corresponding item in the Windows start menu. On other operating systems, you have to start eXist manually: open a shell or DOS command prompt, change into the directory where you installed eXist and enter

    bin/startup.sh

(Unix) or

    bin\startup.bat

(DOS/Windows). This will launch the included Jetty webserver and eXist.

2.2. Installing the Source Distribution

Simply unzip or untar the distribution file to some suitable location. Your JAVA_HOME environment variable should point to the Java installation directory. Now set the EXIST_HOME environment variable to eXist's base directory, e.g.:

    set EXIST_HOME=c:\Devel\eXist-{version}

or on a Unix system:

    export EXIST_HOME=/home/wolf/eXist-{version}
The bin directory of the distribution contains several shell and batch scripts to start the servers and various clients. To launch the included webserver, run startup.sh or startup.bat from the root directory of the distribution, for example:

    bin\startup.bat

The scripts set a few parameters and then call the startup bootloader in start.jar, which tries to determine the correct classpath settings. This also makes it possible to launch eXist by double-clicking start.jar or by using the -jar option of the Java launcher. To launch the included webserver this way, change into the newly created directory and double-click start.jar (Windows only) or type

    java -Xmx128M -Djava.endorsed.dirs=lib/endorsed -jar start.jar

in a shell or DOS command prompt.

2.3. Installing eXist as a Web Application

To install eXist with an existing servlet engine, download the .war file and put it where your servlet engine can find it. Most installations have a directory called webapps and will automatically launch any .war files found therein. Please rename the file to exist.war (otherwise some examples may not work correctly).

Most servlet engines will automatically unpack .war files (Tomcat does so). If yours does not, you have to do this by hand (otherwise eXist's data may get lost when you shut down the servlet engine). Create a sub-directory exist in webapps, cd into it and unpack the archive using jar. For example:

Example: Unpack the .war file

    mkdir exist
    cd exist
    jar xfv ../exist.war

Finally, remove the .war file. You may have to restart your servlet engine now.

If eXist is running inside a servlet engine, its home directory will always be WEB-INF. This is where the main configuration file conf.xml and the database files (in WEB-INF/data) reside. Log files will be written to WEB-INF/logs.

3. Check if the server is running

eXist's home page should now be available at http://localhost:8080/exist/index.xml. To see if the database engine is running, click on the Server status menu entry in the sidebar or browse directly to http://localhost:8080/exist/status. The server will show some information on the configuration, the data directory used by the engine and the available database instances.

4. Index some XML files

There is no preferred way to work with eXist. The server offers quite a number of different interfaces, including XML-RPC, SOAP, direct access via Cocoon, and the XML:DB API. Alternatively, you may use the administration interface at http://localhost:8080/exist/xadmin.xsp or WebDAV (see below) to add files.

This section explains how to add and query files using the command-line client. If you installed the source distribution, you will find the shell scripts to start the client in directory eXist-0.9/bin. Otherwise you have to change to the location where the exist.war file has been unpacked and find the bin directory there.

As an alternative to the shell scripts, you may also use the start.jar bootloader to start the client. In this case, pass the string "client" as the first argument:

    java -jar start.jar client
When using the shell scripts, please make sure that the EXIST_HOME environment variable points to the directory where the configuration file conf.xml can be found. The server will try to read the file from EXIST_HOME/conf.xml. For the source distribution, EXIST_HOME is simply the root directory of the distribution, as explained above. If you installed eXist as a web application, EXIST_HOME should be set to the WEB-INF directory of the web application, e.g.:

    set EXIST_HOME=c:\jakarta-tomcat-4.x\webapps\exist\WEB-INF
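Since a wrong EXIST_HOME is a common cause of startup trouble, a small pre-flight check can save some guessing. The following is only a sketch for Unix shells: the helper name check_exist_home is an assumption made for this example and is not part of the eXist distribution. It tests nothing more than what the paragraph above describes, namely that conf.xml is readable in the given directory.

```shell
#!/bin/sh
# Sketch: verify that a candidate EXIST_HOME contains conf.xml before
# launching the client. check_exist_home is a hypothetical helper name,
# not something shipped with eXist.
check_exist_home() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "EXIST_HOME is not set" >&2
        return 1
    fi
    if [ ! -r "$dir/conf.xml" ]; then
        echo "no readable conf.xml in $dir" >&2
        return 1
    fi
    return 0
}

# Example use (commented out so that sourcing this file has no side effects):
# check_exist_home "$EXIST_HOME" && bin/client.sh
```

The function only reports the problem and returns a non-zero status; what to do next (abort, fall back to another directory) is left to the caller.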
You should also ensure that you have write permissions for the data directory. Depending on your installation, you will find it in webapp/WEB-INF/data or WEB-INF/data.

The command-line client may either be controlled by parameters or used interactively. If no action is specified on the command line, the client will enter interactive mode and start a graphical user interface. More information on the client is available in the corresponding document. The scripts to start the client are called

    bin\client.bat

on DOS/Windows or, on Unix,

    bin/client.sh

If you don't want the graphical user interface, you can also start the client in pure shell mode. It offers the same functionality and usually needs much less time to start. Pure shell mode is selected if the parameter -s or --nogui is specified:

    bin/client.sh -s
Note
If you can't execute the shell script on Unix, try to do a chmod +x bin/client.sh first. If Java complains about unknown classes, please check that EXIST_HOME points to the correct location.

For a first try, let's index some sample files provided in directory samples using interactive mode. You can do the same job with command-line parameters as described below, so if you prefer the short way, skip ahead to the section on non-interactive mode.

If you start the client without arguments, it will connect to the database, enter interactive mode and present the graphical interface. In graphical mode, the client first prompts for a username and password. As long as no password is specified for the admin user, the database grants access to any user, so simply pressing enter is enough. The GUI will then show up:

[Screenshot: the client's graphical user interface]

At the bottom you see the shell window. It offers the same commands as pure shell mode. If you type help, a list of known commands is displayed.

eXist organizes all documents in hierarchical collections. Collections are like directories: they are used to group related documents together. So first of all we should add new collections for our documents. To create a new collection, press the [icon] button and create a collection called shakespeare. Now add another collection called plays below the shakespeare collection. You can change into the shakespeare collection by double-clicking on it in the table view. Finally, we add some documents to the collection. Just press the [icon] button.
Note
The second progress bar will only change if you run the client with an embedded database instance. Otherwise, the client has no information on parsing progress.

In addition to the Shakespeare plays, you should also create a collection /db/library and put the file samples/biblio.rdf into it as shown in the screen dump above. Finally, to experiment with XInclude, you should add the files in samples/xinclude to another collection /db/xinclude.

5. Using the shell

Instead of clicking through dialogs, the same operations can also be performed in shell mode, which experienced users may find more convenient. You can either use the shell window of the graphical user interface or start the client in pure shell mode by passing option -s on startup.

Typing mkcol shakespeare and pressing enter will create a shakespeare collection into which we will put some of the sample documents provided with eXist. To check that the new collection is present, enter ls to get a listing of the current collection's contents. The listing below shows an example session that adds the sample documents:

Example: Adding the sample documents

    exist:/db>mkcol shakespeare
    created collection.
    exist:/db>cd shakespeare
    exist:/db/shakespeare>mkcol plays
    created collection.
    exist:/db/shakespeare>cd plays
    exist:/db/shakespeare/plays>put samples/shakespeare/
    storing document hamlet.xml (1 of 4) ...done.
    storing document much_ado.xml (2 of 4) ...done.
    storing document r_and_j.xml (3 of 4) ...done.
    storing document shakes.xsl (4 of 4) ...done.
    exist:/db/shakespeare/plays> cd
    exist:/db>mkcol library
    created collection.
    exist:/db>cd library
    exist:/db/library>put samples/biblio.rdf
    storing document biblio.rdf (1 of 1) ...done.
    exist:/db/library>cd
    exist:/db>mkcol xinclude
    created collection.
    exist:/db>cd xinclude
    exist:/db/xinclude>put samples/xinclude

Adding files to the database is done using put. put expects either a single file, a file pattern or a directory name as argument. If a directory is specified, all XML and XSL files in that directory will be put into the database. To add the files in directory samples/shakespeare, simply enter put samples/shakespeare. To see if the files have actually been stored, you may view the contents of the current collection with ls. To view a document, use the get command, e.g.:

    get hamlet.xml
Note
put also accepts file patterns, i.e. a path with the wildcards ? or *. ** means "any sub-directory". So the command put samples/**/*.xml will parse any XML files found in the samples directory and any of its sub-directories.

In addition to the Shakespeare plays, the session above creates a collection /db/library for the file samples/biblio.rdf and, to experiment with XInclude, a collection /db/xinclude for the files in samples/xinclude.
Note
If you ever run into problems while experimenting with eXist and your database files become corrupted, just remove the data files created by eXist and everything should work again. The data files all end with .dbx. You will find them in directory webapp/WEB-INF/data or WEB-INF/data, depending on your installation. It is also fine to back up those data files so that you can restore them in case of database corruption.

6. Non-interactive mode

As mentioned above, the client can also be controlled by command-line parameters. This way we only need three commands to index our files:

    bin/client.sh -m /db/shakespeare/plays -p samples/shakespeare
    bin/client.sh -m /db/xinclude -p samples/xinclude
    bin/client.sh -m /db/library -p samples/biblio.rdf
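If the list of collections grows, the three commands above can be driven from a small loop. This is a rough sketch rather than anything shipped with eXist: the function name index_samples and the EXIST_CLIENT variable are assumptions made for this example. The only client options used are -m and -p, exactly as in the commands above.

```shell
#!/bin/sh
# Sketch: store several sample directories with one loop instead of three
# separate commands. EXIST_CLIENT is a hypothetical variable introduced here
# so the path of the client script can be overridden.
EXIST_CLIENT="${EXIST_CLIENT:-bin/client.sh}"

# index_samples reads "collection path" pairs from standard input and calls
# the client once per pair, using the -m and -p options.
index_samples() {
    while read -r collection path; do
        # Skip blank lines.
        [ -n "$collection" ] || continue
        # Redirect the client's stdin so it cannot swallow the remaining pairs.
        "$EXIST_CLIENT" -m "$collection" -p "$path" </dev/null || return 1
    done
}

# Example use (commented out; requires a running eXist installation):
# index_samples <<'EOF'
# /db/shakespeare/plays samples/shakespeare
# /db/xinclude samples/xinclude
# /db/library samples/biblio.rdf
# EOF
```

The loop stops at the first failing call, so a typo in one path does not go unnoticed among the output of later calls.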
Note
If you use start.jar instead of the shell-scripts, your command line should look like this: java -jar start.jar client -m ... -p ...
The -m option implicitly creates all required collections; -p interprets all remaining parameters on the command line as file or directory names to be stored in the database.

7. Querying documents

The interactive client understands the find command to send XPath queries to the server. Alternatively, you may use the web-based interface at http://localhost:8080/exist/xquery.xsp. For example, to find all SPEECH elements in which Juliet talks about love to her Romeo, you may use:

    document(*)//SPEECH[SPEAKER &= 'juliet' and . &= 'love romeo']

For more information on the extended XPath query syntax supported by eXist, please refer to the XPath HowTo. The interactive client will just print out the number of hits for your query. Use the show command to view results:

Example: Querying

    exist:/db>find document(*)//SPEECH[SPEAKER &='juliet' and . &= 'love romeo']
    document(*)//SPEECH[SPEAKER &='juliet' and . &= 'love romeo']
    found 6 hits.
    exist:/db/library>show
    <SPEECH exist:id="41669" exist:source="/db/shakespeare/plays/r_and_j.xml"
        xmlns:exist="http://exist.sourceforge.net/NS/exist">
        <SPEAKER>JULIET</SPEAKER>
        <LINE>O Romeo, Romeo! wherefore art thou Romeo?</LINE>
        <LINE>Deny thy father and refuse thy name;</LINE>
        <LINE>Or, if thou wilt not, be but sworn my love,</LINE>
        <LINE>And I'll no longer be a Capulet.</LINE>
    </SPEECH>
    displayed items 1 to 1 of 6
    exist:/db/library>

You can also pass your queries on the command line, but you have to take care that the parameters are correctly parsed by the shell. For example, on Linux I pass the query through standard input:

    echo "document(*)//SPEECH[SPEAKER&='juliet' and .&='love romeo']" | bin/client.sh -x
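The quoting concern mentioned above can be reduced by keeping the query in a variable and writing it to the client's standard input. The following is a minimal sketch under assumptions: run_query and EXIST_CLIENT are names invented for this example, and the only client option used is -x, as in the echo command above.

```shell
#!/bin/sh
# Sketch: pass an XPath query to the client on standard input so that the
# quotes and the & character inside the query are not reinterpreted by the
# shell. EXIST_CLIENT and run_query are hypothetical names.
EXIST_CLIENT="${EXIST_CLIENT:-bin/client.sh}"

run_query() {
    # printf '%s\n' writes the query verbatim; echo may treat backslashes
    # differently from shell to shell.
    printf '%s\n' "$1" | "$EXIST_CLIENT" -x
}

# Example use (commented out; requires a running eXist installation).
# A quoted here-document keeps the query free of any shell expansion:
# query=$(cat <<'EOF'
# document(*)//SPEECH[SPEAKER &= 'juliet' and . &= 'love romeo']
# EOF
# )
# run_query "$query"
```

With the here-document form, the query can be copied from the web interface unchanged, since neither single nor double quotes need escaping.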
8. Sample pages

Now that we have added some files to the database, you may have a look at the example pages available through the web interface.
9. Shutting down the database

If eXist is running within the supplied Jetty webserver or in standalone mode, you can use the bin/shutdown.sh or bin\shutdown.bat scripts to cleanly shut down the database and stop the webserver. Again, you can also use the start.jar bootloader to do the same thing:

    java -jar start.jar shutdown

You may also specify a different server URI if your database is not running within the supplied Jetty, for example to stop eXist when running in standalone mode (for more information see the document on Server Deployment):

    java -jar start.jar shutdown --uri=xmldb:exist://localhost:8081

If eXist has been deployed as a web archive (.war file), stopping the webserver is not possible, because eXist doesn't know anything about the environment it is running in. However, calling shutdown with the correct URI will at least stop the database engine and flush all open buffers to disk.
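For scripted use, the two shutdown variants above can be folded into one helper with an optional URI argument. This is only a sketch: shutdown_exist and EXIST_JAVA are hypothetical names introduced for this example, and the command lines it builds are the two shown above.

```shell
#!/bin/sh
# Sketch: wrap the start.jar shutdown call so that the server URI is optional.
# shutdown_exist and EXIST_JAVA are hypothetical names, not part of eXist.
EXIST_JAVA="${EXIST_JAVA:-java}"

shutdown_exist() {
    uri="$1"
    if [ -n "$uri" ]; then
        # A URI was given: pass it on with the --uri option.
        "$EXIST_JAVA" -jar start.jar shutdown "--uri=$uri"
    else
        # No URI: rely on the bootloader's default target.
        "$EXIST_JAVA" -jar start.jar shutdown
    fi
}

# Example use (commented out; run from the directory containing start.jar):
# shutdown_exist                               # supplied Jetty
# shutdown_exist xmldb:exist://localhost:8081  # standalone server
```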
10. Backup/Restore

The graphical user interface of the client provides a convenient way to create backups and to restore them after a server crash. You may back up any local or remote collection available through the XML:DB API.

To create a backup, simply click on the Create Backup button in the toolbar. You will be asked to select a collection and a directory. During backup, a hierarchy of directories will be created below the backup directory, corresponding to the hierarchy of collections found in the database. The tool will also back up user permissions for each collection and document. This information is written into the special file __contents__.xml placed in each subdirectory. You need these files to restore the database contents.

[Screenshot: selecting a __contents__.xml file]

To restore selected collections, click on the Restore Files from Backup button. As shown above, you will be asked to select one of the __contents__.xml files that have been created inside the backup directory. For example, to restore the entire database, select BACKUP_DIR/db/__contents__.xml. To restore a specific collection, just select the __contents__.xml file in the directory corresponding to that collection. Once you have selected a file, the client will begin to restore the collection contents, showing a progress monitor:

[Screenshot: restore progress monitor]
Note
Please note that the client will also restore the user settings which were in effect at the time you created the backup. If new users have been created in the meantime, they will be overwritten.

11. Using WebDAV

The distribution includes xincon, a WebDAV module to access an XML:DB database from any file manager or application supporting the WebDAV protocol. WebDAV makes it possible to manage database collections in eXist just like directories in a file system. You may copy, remove, view or edit files with any such application, including, for example, Windows Explorer, cadaver, Linux Davfs, XML Spy and many others.

Though xincon has been written for Xindice, it works well with eXist. However, xincon is still alpha, so please do not expect everything to work. On my machine, Windows Explorer will hang for some time when changing directories.

The xincon module listens at http://localhost:8080/exist/webdav/. If you browse to this URL with your web browser, you will see xincon's web-based administration interface. More interestingly, you may use WebDAV with a file manager like Windows Explorer. On Windows, just create a web folder using the above URL. The listing below shows an example session using the cadaver WebDAV client on Linux:

Example: Using WebDAV with cadaver

    [wolf@zarathustra eXist-0.8]$ cadaver http://localhost:8080/exist/webdav
    Looking up hostname... Connecting to server... connected.
    dav:/exist/webdav/> mkcol shakespeare
    Creating `shakespeare': succeeded.
    dav:/exist/webdav/> cd shakespeare
    dav:/exist/webdav/shakespeare/> mkcol plays
    Creating `plays': succeeded.
    dav:/exist/webdav/shakespeare/> put samples/shakespeare/hamlet.xml
    Uploading samples/shakespeare/hamlet.xml to `/exist/webdav/shakespeare/hamlet.xml':
    Progress: [=============================>] 100.0% of 288832 bytes succeeded.
    dav:/exist/webdav/shakespeare/>

12. Log Files

eXist uses the log4j package to write logging output. If you have installed eXist in a servlet engine, you will find the log files in the web application's WEB-INF/logs directory. Otherwise, logging output is written to webapp/WEB-INF/logs. These settings may be changed by editing the log4j section at the end of eXist's configuration file conf.xml.

If you experience any problems with eXist, please have a look at the corresponding log files. This is also recommended if Cocoon presents you with an error report. Cocoon log files reside in the WEB-INF/logs directory. Usually, looking at these files helps to understand why an exception occurred.

13. What Next?

To use all the power of eXist, you should read the XPath HowTo. eXist provides a number of extensions to standard XPath, and you can make your queries a lot faster by using them, so don't miss them. If you would like to know more about the different deployment options, read the document about Server Deployment. Security-related issues are covered by security.xml. Finally, developers should have a look at the Developer's Guide.