HØit Nr. 1-98 

The Digital Documentation Center at AUB

Børre Ludvigsen 

 

Introduction

The Digital Documentation Center (DDC) at the American University of Beirut (AUB) was started in June 1997 as a cooperative project between AUB and Østfold State College (HiØ[1]). The aim was to establish a unit that initially would complete a selection of pilot projects to illustrate the viability of the center and explore the resources required. In the long term the purpose of the center is to provide resources, advice and assistance to the university’s various departments and entities for digital archiving and publishing of material deemed important to AUB and its aims.
 

Background

The center was first proposed by Nabil Bukhalid of AUB's Personal Computing and Networking Services Department (PCNS). AUB and HiØ had previously collaborated on mirroring AUB's webserver[2] in Norway and HiØ's Al Mashriq[3] server in Beirut. Professor Børre Ludvigsen of the Informatics Department at HiØ was invited to spend a half-year sabbatical at AUB to start up the center.
The six-month establishment period started in mid-June 1997, and by the end of the year a series of 25 pilot projects was under development, with all but four completed and ready for archival or publishing. Most of the projects were digitized and presented in forms suitable for viewing with readily available public domain software on isolated media such as CDs, a local intranet or the World-Wide Web.
Administratively the center is run under the auspices of the Vice President for Administrative Affairs, George Tomey, with technical support from the PCNS Unit. Børre Ludvigsen headed the project in Beirut until the end of 1997, when he returned to HiØ in Norway. Romy Arslan has administrative responsibility for the continuing work, with Ludvigsen advising over the Internet. Wissam Adib, Patrick Abisalloum, Marwan Badran, Nader Daou and Mazen Khattab have been student assistants on the various projects.

Rationale

Publishing scholarly material digitally 

While there has always been a considerable gap between the high-ranking educational institutions and less privileged ones both in access to scientific information and their ability to publish research, the gap has been growing drastically in recent years. 
In a recent article in Scientific American, the predicament of imbalance in publication is discussed, detailing the enormous disproportion between countries in material published in the peer-reviewed scientific press. The article cites examples of material by the same author being readily accepted when submitted from a continental US institution and refused when it originated from the author's home university in an emerging nation. It is also well documented that the overwhelming amount of published material originates and is published in North America and Europe. Several explanations are offered, among them the obvious one of density of institutions, publications and the amount of research being done. An interesting reason, however, is the ease with which material is now transmitted for review in digital form and increasingly published in final and permanent form on the Internet itself. In fact, the Internet’s World-Wide Web is fast becoming the medium of choice for publication of scientific material. Edited and reviewed sites are proliferating and the web is set to surpass the printed media in volume of scholarly publication, if it has not already done so.

Availability of scientific literature 

Even more important than the possibility to publish is the availability and accessibility of scientific journals and literature. In the increasingly unfriendly economic climate in which vast numbers of educational institutions all over the world find themselves, libraries are invariably subject to cost cuts in both books and periodicals. There are innumerable examples of not only research but vital teaching being carried out on the basis of literature that is both out of date and factually wrong. With educational financing deteriorating relative to growing demands for quality education in all but the most privileged academic environments, the future for substantial improvements in access and availability of scientific publications in traditional form looks increasingly dismal. To compound the problem, printed materials that are acquired are often not procured in sufficient numbers, and are for all practical purposes unavailable to all but the few individuals who have secured the copies that exist.

Digital archiving and preservation

An increasing number of institutions view intranet and Internet communication technologies, supplemented by other forms of digital documentation, as a viable way out of their information access predicament. In this context "digital documentation" is perceived as encompassing methods for acquiring, archiving and publishing scientific and administrative information. It thus involves recording analog information in formats that are as close as possible to optimal for future retrieval, reuse and publishing. While online publishing is important, popular and cost-effective both in terms of information acquisition and publishing, simple archiving to tapes and CDs, for conservation as well as for duplication and backup, is a major part of efforts in methodical documentation.
 

Sharing knowledge

Of all the characteristics that can be attributed to the Internet, open sharing of information between institutions and individuals is definitely the most important. Exchange of knowledge and information was at the very roots of the development of the Internet, and in spite of the enormous growth of commercialization on the net in recent years, it continues to be the net's most enduring advantage. While many companies have invested in the net in expectation of huge profits in information trade, difficulties in implementing methods of payment continue to be a problem. Even when practical methods of remuneration are in common use, it is doubtful whether cultural and scientific information of serious educational use will become anything other than the responsibility of the academic community.
 

Similar projects 

  • American Memory - The Library of Congress[4] 
  • Digital Libraries Initiative Projects - NSF/DARPA/NASA[5] 
  • Federating Repositories of Scientific Literature - University of Illinois (UC)[6] 
  • Al Mashriq - Østfold State College and AUB[7] 

Practical considerations 

The following is an overview of some characteristics of digital documentation: 

Advantages of digital archiving  
Size  
As an example, 100 images of large posters in 4 resolutions, the largest of print quality, can be stored on a standard digital compact disk.
Retrieval and reuse  
Converting information to digital form ensures that each copy retrieved is exactly identical to the original recording. Thus material properly cataloged and stored on read-only digital media ensures unchanged copies. Cataloging of the material is also digital, providing immediate access. It can also be sent digitally as a copy to any location on the Internet or on hard media.
Ease of reproduction  
Apart from reproduction in unlimited numbers as digital copies, digitized material is available for use in print or other media that use digital production techniques.
Permanence and safety in duplication  
Digital archiving provides the advantage of making duplicates on various media and storing them in separate locations, ensuring permanent, destruction-free storage.
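
The size example above can be checked with simple arithmetic. The following is only a rough sketch: the 650 MB disk capacity and the assumption that each smaller resolution has half the width and height of the one above it are illustrative figures, not measurements from the center's own scans.

```python
# Rough storage budget for the poster example: 100 posters, 4 resolutions each.
# ASSUMPTIONS (not from the article): about 650 MB usable capacity on one CD,
# and each smaller resolution has half the width and height of the previous
# one, so it needs roughly a quarter of the bytes.

CD_CAPACITY_MB = 650
POSTERS = 100
RESOLUTIONS = 4

def largest_image_budget_mb(capacity_mb=CD_CAPACITY_MB,
                            posters=POSTERS,
                            resolutions=RESOLUTIONS):
    """Return the size budget in MB for the largest version of each poster."""
    per_poster = capacity_mb / posters
    # Relative sizes 1, 1/4, 1/16, ... add up to the weight of one full set.
    weight = sum(0.25 ** level for level in range(resolutions))
    return per_poster / weight

if __name__ == "__main__":
    print(f"Budget per poster: {CD_CAPACITY_MB / POSTERS:.2f} MB")
    print(f"Budget for the largest version: {largest_image_budget_mb():.2f} MB")
```

Under these assumptions each poster has 6.5 MB at its disposal, of which just under 5 MB can go to the print-quality version.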
 
Digital publication  
Universal availability  
Digitized information can be made universally available on the Internet (with limited personal access if necessary) and on various other readily available and common media such as CDs, digital tapes and disks. 
Low costs  
Publication costs are low, usually involving only the authoring or editorial processes and formatting of master documents.
Ease of publication  
With the emergence of standard cross-platform formats of common data types such as html for digital publication and sgml for print, material can often be prepared for final publication by archivists and authors themselves. Additional work is usually limited to editorial and design modification for adaptation into larger collections such as web sites or for marketing and packaging purposes. 
Search-ability  
Digital formats provide excellent search functionality, especially in text, but increasingly in other data where pattern recognition is applicable. Provided common, cross-platform standards are used, searches need not be limited to isolated collections of information. 
Collaboration  
By digitizing information and making it available universally or selectively, conditions for collaboration are greatly enhanced and time/space limitation (and costs) proportionately reduced. 
Immediacy of intellectual property rights 
Intellectual property rights (commonly understood as «copyright») are not limited to particular media. Consequently the limitations previously imposed by print publications, including loss and theft between submission and publication, are alleviated. Digital publication, for example on the Internet, ensures the immediate and permanent intellectual property rights of the owner of the material involved, provided it otherwise complies with international copyright regulations.
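
The text search mentioned under "Search-ability" above can be illustrated with a very small word index. This is a minimal sketch only; the catalog entries and the names `build_index` and `search` are hypothetical and do not describe the software used at the DDC.

```python
from collections import defaultdict

PUNCTUATION = ".,;:()"

def build_index(records):
    """Map each lowercase word to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in text.lower().split():
            index[word.strip(PUNCTUATION)].add(record_id)
    return index

def search(index, query):
    """Return the ids of records that contain every word in the query."""
    words = [word.strip(PUNCTUATION) for word in query.lower().split()]
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical catalog entries, loosely modeled on captions in this article.
catalog = {
    "img001": "Medical Gate at AUB, 1893",
    "img002": "Insects from the natural history collection",
    "img003": "Detail of tile, measured drawing collection",
}
index = build_index(catalog)

if __name__ == "__main__":
    print(sorted(search(index, "collection")))
    print(sorted(search(index, "medical gate")))
```

Because the index is plain text keyed by words, the same approach can span several collections as long as they share a common catalog format.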

 
© AUB  
The original Medical Gate at AUB, 1893. From the Moore collection of photographs at the Archives and Special Collections of the Jafet Library. 
 

 
© B. Ludvigsen   
The wall and car damaged by the bomb of Sep. 27, 1997. (Digital reporting.) 
 

 
Poster of President Amin Gemayel. (From the collection of the Jafet Library.) 
 

 
© AUB Digital Documentation Center  
Aref el Nakadi. From the Oral History project archived at the Jafet Library. 
 

 
Worldwide distribution of solar radiation into belts indicating feasibility of solar applications. (From Solar Disinfection of Drinking Water and Oral Rehydration Solutions, Guidelines for Household Application in Developing Countries, Aftim Acra, Zeina Raffoul and Yester Karahagopian, 1984.) 
 

 
© B. Ludvigsen, AUB  
Anencephaly is characterized by the absence of scalp, calvarium, and normal brain, which is replaced by an angiomatous mass. The typical appearance of the face (batracian phenotype) is due to absent frontal bones and shallow orbits causing protrusion of the eyeballs. (From the gross specimens collection of the Pathology Department.) 
 

 
© Ministry of Tourism, Lebanon, Fulvio Roiter 
Beduins in the Beqaa. From the Ministry of Tourism photobase. 

 
© B. Ludvigsen, AUB  
VRML model of the AUB campus constructed as a navigation aid for the extended campus tour. 
 

 
Sultan Abdul Hamid Khan II (1876 - April 1909). From the digitized collection of sultans of the History and Archaeology Dept. 
 

 
© B. Ludvigsen, AUB  
VRML reconstruction of glass find from the archaeological excavations in Beirut Central District. 
 

 
© AUB  
Detail of tile from Houssaineh, Mazraat al Sayed. From the measured drawing collection of the Dept. of Architecture. 
 

 
© B. Ludvigsen, AUB  
Insects from the natural history collection of the Biology Department. 
 

 
© Aftim Acra  
Mite in 140 million year old amber. From the Fossils from the Amber of Lebanon project. 
 
 

Prerequisites  
Infrastructure  
Archiving, conversion, transmission and especially publishing require sufficient infrastructure in the form of basic network facilities. The situation in 1997 is characterized by fair to excellent internal infrastructure at many educational institutions. External interconnectivity remains a severe limitation on collaboration and publication with all but the most privileged. The emergence of common cross-platform standards in both archiving and publication has seen an explosion of both innovative software and methods of presentation and transmission, all putting severe demands and strains on available bandwidth and the dynamics of academic inter-networks. Unfortunately, there does not seem to be any reason to expect effective improvements other than those that can be provided by government and academic networking organizations that see clearly the benefits for improvement in education and research. 
Technical resources  
The most important resources to note are equipment for recording in standardized formats that can easily be read and transferred to new standards as these are developed. Also important is the consideration of standardized, permanent recording media such as CDs and magnetic tapes.
New file formats are developed continually, especially for display over the Internet. Original recordings should be made in formats that preserve complete ranges of sound and color at as high a resolution as possible. Adaptations to various schemes of compression should be made only where the resulting loss of information is of no significance to either storage or display for end users.
The success of any archival and publishing venture is dependent on the skills of the people involved. For non-skilled administrative personnel, the simplest means of judging the skills and competence of the staff involved is the measure of their reliance on proprietary commercial software and standards. The widest possible access and best conformity for information sharing with other institutions and low-cost software is usually achieved by using standards that are freely available in the public domain. Public domain software, usually developed from research at educational institutions, is freely available on the Internet. Standards developed along the same lines are open and thus readily adaptable for use in both public and commercial software, ensuring much wider compatibility and use at reasonable costs.
Editorial competence  
To ensure that material be digitized in formats and ways that are compatible with future use, it is important that the venture has a certain amount of knowledge of formatting and editing data for publication on various digital media. «Publication» in this context does not necessarily mean making the digitized information globally available on the Internet, but simply assembling the archived material in such a way that it is retrievable in reasonably ordered form at any time by those who are responsible for the original analog information. 
In addition digital archiving and especially publication does demand knowledge and skills in organizing material, editing for reader interest and some basic knowledge of data communication. This includes the technical skills necessary to run the software needed for publication both at the client and server ends. Also important is sufficient experience and knowledge of Internet communications to be able to judge delivery bandwidths and the constraints these impose on document formats and sizes with respect to online publication. 
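
Judging delivery bandwidths, as described above, largely comes down to estimating how long a document of a given size takes to reach the reader. The sketch below shows the arithmetic; the link speeds are illustrative assumptions (a 28.8 kbit/s modem, a 64 kbit/s line and a 2 Mbit/s campus link), and real throughput will be lower because of protocol overhead and shared lines.

```python
# Transfer-time estimates for online publication.
# ASSUMPTION: the link speeds below are illustrative examples only, and the
# calculation ignores protocol overhead, latency and congestion.

EXAMPLE_LINKS_BPS = {
    "28.8 kbit/s modem": 28_800,
    "64 kbit/s line": 64_000,
    "2 Mbit/s campus link": 2_000_000,
}

def transfer_seconds(size_bytes, link_bits_per_second):
    """Ideal time in seconds to move size_bytes over the given link."""
    return size_bytes * 8 / link_bits_per_second

def max_size_bytes(link_bits_per_second, acceptable_seconds):
    """Largest document that arrives within acceptable_seconds on the link."""
    return link_bits_per_second * acceptable_seconds / 8

if __name__ == "__main__":
    image_bytes = 1_000_000  # a hypothetical 1 MB scanned image
    for name, speed in EXAMPLE_LINKS_BPS.items():
        print(f"{name}: {transfer_seconds(image_bytes, speed):.1f} s")
```

Working backwards with `max_size_bytes` gives a size ceiling for pages and images once an acceptable waiting time has been chosen for the slowest expected connection.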
Equity  
Given the various prerequisites above, which are mostly of a technical nature, the most important principle for success of any digital archival and publication venture is the doctrine that information retained by educational institutions is in the public domain. Knowledge retained by educational institutions is developed through research and collected by acquisition first for the benefit of students, teachers and researchers and second, for the public at large including the use of other institutions with similar aims. 
Attitudes of possessiveness other than those required to protect personal integrity and intellectual property rights simply negate the purpose and aims of archival for knowledge retrieval and dissemination.

Digitizing for archiving and digitizing for publication  
There is an important point to be made about the difference between digitizing for archiving purposes alone and subsequent publication of the same material. Storing information on a computer which is attached to a network does not mean that it is immediately available on the global Internet, any more than putting money in a bank makes it accessible to people on the street outside.
Digitized material can be stored on CDs for example, for safe physical storage. While ensuring permanence and safety, it does severely restrict retrievability. The World-Wide Web client-server software system in combination with configuration techniques in network routing allows for all the necessary variations in access control familiar to physical libraries and archives, with some fairly sophisticated additions. Access to archived digitized information can thus be controlled with respect to conditions such as location, area, personal identification, authentication, passwords, and so on. 
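
The kinds of access conditions listed above (location, personal identification, passwords) can be sketched as a small decision function. This is an illustration only, not the configuration used at AUB: the network range is a reserved documentation address block, the user table and level names are hypothetical, and a real web server would use its own access control configuration and a proper password hashing scheme.

```python
import hashlib
import hmac
import ipaddress

# ASSUMPTIONS: 192.0.2.0/24 is a reserved documentation range standing in for
# a campus network; the user table and access levels are hypothetical.
CAMPUS_NETWORK = ipaddress.ip_network("192.0.2.0/24")
USERS = {"archivist": hashlib.sha256(b"example-password").hexdigest()}

def may_access(level, client_ip, user=None, password=None):
    """Decide access for a collection marked 'public', 'campus' or 'restricted'."""
    if level == "public":
        return True
    on_campus = ipaddress.ip_address(client_ip) in CAMPUS_NETWORK
    if level == "campus":
        return on_campus
    if level == "restricted":
        stored = USERS.get(user)
        if stored is None or password is None:
            return False
        # Plain SHA-256 keeps the sketch short; it is not adequate for
        # storing real passwords.
        supplied = hashlib.sha256(password.encode("utf-8")).hexdigest()
        return on_campus and hmac.compare_digest(stored, supplied)
    return False  # unknown levels are denied

if __name__ == "__main__":
    print(may_access("campus", "192.0.2.17"))
    print(may_access("restricted", "192.0.2.17", "archivist", "example-password"))
```

Denying anything that does not match a known level mirrors the practice of a physical archive, where material is closed unless explicitly opened.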

Media shelf-life  
An often voiced concern with respect to archiving digitized material is the potential deterioration of the «permanent» media on which it is stored, or the rapid outdating of the retrieval technology, both hardware and software. Media shelf-life is dependent on several factors, not least of which is the market penetration of common technologies. The more common a certain piece of technology, the longer it will remain in use. Ultimately, the availability of digitized information will simply depend on its perceived value on the part of its custodians. In a well-managed archival institute, material will simply be moved from a medium that is about to become outdated to a more modern one as a matter of routine. The fact that the archived material is digital ensures that it can be moved with both accuracy and integrity preserved, irrespective of the storage medium. In the case of analog information such as print and film, such movement between storage media is prohibitively expensive and causes serious deterioration in the quality of the information itself.
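
The routine migration described above can be checked mechanically: a checksum computed before the move must match one computed afterwards. The sketch below assumes files are simply copied between mounted media; the function names and the choice of SHA-256 are illustrative and not taken from the center's own routines.

```python
import hashlib
import shutil
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def migrate(source, destination):
    """Copy a file to new media and verify that the copy is bit-for-bit identical.

    Returns the checksum so that it can be recorded in the catalog.
    """
    source, destination = Path(source), Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    expected = sha256_of(source)
    shutil.copyfile(source, destination)
    if sha256_of(destination) != expected:
        raise IOError(f"Checksum mismatch after copying {source} to {destination}")
    return expected
```

Storing the returned checksum alongside the catalog entry lets later migrations be verified against the original recording rather than only against the previous copy.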
 

Results 

During the first six months the work of the center fell into three main categories: pilot projects, guidelines for the center’s activities and administration[8]. 
Guidelines  
A general introduction to digital archiving and publication. 
Copyright matters. Intellectual property right essentials, FAQs on copyright, myths about copyright, links to Berne Convention for the Protection of Literary and Artistic Works and the Intellectual Property Center, differences in emphasis on digitizing for archiving and digitizing for publication and AUB’s liability and copyright in the form of present agreements. 
Methods and routines for digitizing. General information on aspects of in-house digitizing and out-sourcing. Recording methods for photographs and other images, audio, video, text, objects and physical environments. Necessary equipment such as computers, scanners, photographic and audio equipment. 
Methods and routines for publication 
A short list of similar projects world-wide. 
Administration  
With the aim of producing as much as possible on the pilot projects, administration was completely paperless. Whatever paper was addressed to the center was either discarded or filed with the PCNS Unit. Weekly plans and reports, student assistants' work reports, backup logs for the server, and miscellaneous documentation aids for methods were filed on the center’s webserver. Whenever the PCNS needed any documentation of ongoing activities or logs for payment to student assistants, they were referred to the material on the server.

Pilot projects 

The following projects were either completed or under work at the DDC by the end of December 1997 (for  uncompleted projects, estimates of remaining work are included in parentheses): 

Jafet Library - Archives and special collections: 360 Political posters 1960-1990 (5%). The Moore Collection of historical photographs. 1928 yearbook. Oral History - Aref el Nakadi (50%). 

AUH (American University Hospital): Pathology specimens. (This is an initial project demonstrating methods of digitizing and display, which the Pathology department intends to continue by itself.)

Saab Medical Library: Searchable bibliography of Lebanese medical articles and books. 

Internal audit: Digitizing old policies for archival. (Assistance and guidance for putting policies and procedures operatively on the intranet.) 

History and Archaeology: Picture database from the BEY006 archeological excavations. CD Supplement to Berytus 1998 (5%). VRML reconstructions of glass and pottery. Ottoman Sultans, proposal for presentation at the IMA exhibition, Oct. 1998. Brochure for Careers 97, Beirut Expo (History and Archaeology Department).
 
Environmental Health: Solar disinfection of drinking water 

Campus tour: An interactive version of the tour was completed with available material on buildings and departments, which will be supplemented as more becomes available. (A simplified version of the tour for ‘java-unabled’ browsers is under development.) Detailed campus maps. A VRML model of the campus.

Architecture: A selection of measured drawings (60%), a selection from the slide library (75%). 

Biology: Samples from insect collections. (A small selection for demonstration purposes.)  Searchable bird observation database. 

Miscellaneous: Amber fossils from Lebanon. Maps from MOT - Ministry of Tourism (1:200 000). The MOT’s photo database (15 000 photographs, searchable and geographically interfaced, 10%). Archive pictures from the infirmary 
 

Future work 

A selection of projects for the immediate future had already been identified by the end of 1997. These include an interesting experiment in digitizing 16mm film. During the emptying of some garbage bins in front of the DDC lab, a series of films appeared quite by chance some time in September 97. The majority were in canisters and appeared to be a Russian epic from the Soviet era. However one, which was found on an open reel, was discovered to be titled "South Lebanon: Tragedy and Steadfastness". The film, which is recorded on a highly unstable analog medium, should be digitized on to digital video and then transferred on to disk and CD for permanent archival.
Three particular projects have been singled out for 1998. One is an extensive digital documentation of the archeological excavations carried out by the AUB team in Beirut Central District. The presentation will be a supplement to objects to be exhibited at the Institut du Monde Arabe’s project on Lebanon during the fall of 1998. Digitization and publishing of papers and articles from the Department of Political Science and the Center for Middle East Studies is another area of obvious importance to the university itself and particularly to outside users. Finally, the center will attempt to initiate a project on documentation of Arabic calligraphy.
 

Problems and solutions 

The usual technical problems were encountered with equipment, such as the number of available computers during periods of heavy workload, hard disk space, insufficient server memory, slow scanner processing, and a slide scanner with a restricted slide size. It turns out that many of the slides offered are in the «superslide» 4x4cm format and the acquired scanner takes only 35mm. Of a more serious nature was slow and intermittent Internet access. Quite often supplemental software and essential information was available only over the net. Several essential bits of equipment were also lacking at the center. These include a good single-lens reflex camera with a high definition 55mm lens, a wide angle lens for panoramas, a 3CCD digital video camera (DV or miniDV) and a faster slide scanner with slide magazine, which were all borrowed from private sources.
The announcement of the startup project was circulated much too late and coincided with the summer break. Projects were therefore contributed late in the period. 
The student assistants were somewhat young, resulting in a long training period before they were prepared to take reasonable responsibility for their respective projects. Subsequently recruited volunteers, who were somewhat older, were assigned limited projects under their own responsibility.
Various ways of manning the center in future were discussed, including an individual as technical and administrative head, an administrative head with periodic external consultation, student projects and running by committee. A workable solution of heading the work over the net with periodic visits to Beirut is now being tried out.
Concerns about which material could be released for publishing both on the AUB intranet and the World-Wide Web were the only real problems that affected the availability of archived material. It was highly unclear whether or not the material from the Oral History project was of a restricted nature. As no documentation to that effect was brought forward, the project was started as an experiment. The tape recordings of the Oral History interviews were made in the early 1970s and were not of very high quality even then, so it is important that at least the tapes be digitized to prevent further deterioration.
 

Conclusion 

With the exception of the Department of Architecture, none of the entities that contributed material to the pilot project had any resources or experience in digital archiving or publishing. Architecture digitized their own drawings and provided student assistance for digitizing slides in the DDC lab. The idea was to train that student in ordering and presentation of the digitized material for subsequent archival on the department’s own server. As the student left the project because of illness, it was left to the DDC staff to complete the project.
Although it must be left to others to evaluate, the amount of material and its very varied content, much of which is unique both in terms of academic interest and in exposing the university’s achievements, would amply illustrate the viability of the center.
The policy of using open standards for digitizing (if not presenting[9]) all the material is important. It allows for complete flexibility and freedom in future replication and other forms of presentation and archival. 
While the initial training of the student assistants was slow, the emphasis on individual responsibility for particular projects was successful in view of the difficulties in resolving the long-term manning of the center. The uncompleted projects are under responsible development with the use of the Internet for guidance an essential tool. 
While both the establishment and leadership of the DDC as a collaborative effort between AUB and HiØ are an interesting experiment, more formalized agreements should be made between the institutions regarding both the distribution of time and the economic arrangements involved. The experience gained by the end of 1998 should be sufficient to demonstrate both the viability of the center at AUB and the usefulness of the project in terms of collaboration on method, content and technology.

[1] Høgskolen i Østfold 
[2] http://www.aub.edu.lb 
[3] http://almashriq.hiof.no
[4] http://lcweb2.loc.gov/ammem/  
[5] http://www.cise.nsf.gov/iris/DLHome.html 
[6] http://dli.grainger.uiuc.edu/  
[7] http://almashriq.hiof.no/ and http://almashriq.aub.edu.lb/ 
[8] The web pages of the DDC can be viewed on the AUB intranet at http://ddc.aub.edu.lb/ and http://almashriq.hiof.no/ddc/ at HiØ. 
[9] Audio material is digitized to the open standard AIFF format, but published as proprietary Real Audio, which can be played with freely available software downloadable from the Internet.




Copyright: 1998, Høgskolen i Østfold. Last Update: 01.07.98, Trond Løvereide