The Digital Documentation Center at AUB
Børre Ludvigsen
Introduction
The Digital Documentation Center (DDC)
at the American University of Beirut (AUB) was started in June 1997 as
a cooperative project between AUB and Østfold State College (HiØ[1]).
The aim was to establish a unit that initially would complete a selection
of pilot projects to illustrate the viability of the center and explore
required resources. In the long term the purpose of the center is to provide
resources, advice and assistance to the university’s various departments
and entities for digital archiving and publishing of material deemed important
to AUB and its aims.
Background
The center was first proposed by Nabil Bukhalid of AUB's Personal Computing
and Networking Services Department (PCNS). AUB and HiØ had previously
collaborated on mirroring AUB's webserver[2] in Norway and HiØ's
Al Mashriq[3] server in Beirut. Professor Børre Ludvigsen of the
Informatics Department at HiØ was invited to spend a half-year sabbatical
at AUB to start up the center.
The six-month establishment period began in the middle of June
1997 and by the end of the year a series of 25 pilot projects were under
development with all but four completed and ready for archival or publishing.
Most of the projects were digitized and presented in forms suitable for
viewing with readily available public domain software on isolated media
such as CDs, a local intranet or the World-Wide Web.
Administratively the center is run under the auspices of the Vice President
for Administrative Affairs, George Tomey, with technical support from the
PCNS Unit. Børre Ludvigsen headed the project in Beirut until the
end of 1997 when he returned to HiØ in Norway. Romy Arslan has administrative
responsibility for the continuing work with Ludvigsen advising over the
Internet. Wissam Adib, Patrick Abisalloum, Marwan Badran, Nader Daou and
Mazen Khattab have been student assistants on the various projects.
Rationale
Publishing scholarly material digitally
While there has always been a considerable gap between the high-ranking
educational institutions and less privileged ones both in access to scientific
information and their ability to publish research, the gap has been growing
drastically in recent years.
In a recent article in the Scientific American, the predicament of
imbalance in publication is discussed, detailing the enormous disproportion
in published material in the peer reviewed scientific press between various
countries. The article cites examples of material by the same author being
readily accepted when submitted from a continental US institution and refused
when it originated from the home university which was located in an emerging
nation. It is also well documented that the overwhelming amount of published
material originates and is published in North America and Europe. Several
explanations are offered, among them the obvious one of density of institutions,
publications and the amount of research being done. An interesting reason,
however, is the ease with which material is now transmitted for review in
digital form and increasingly published in final and permanent form on
the Internet itself. In fact, publication of scientific material on the
Internet’s World-Wide Web is fast becoming the medium of choice. Edited
and reviewed sites are proliferating and the web is set to surpass the
printed media volume of scholarly publication if it has not already done
so.
Availability of scientific literature
Even more important than the possibility to publish, is the availability
and accessibility of scientific journals and literature. In the increasingly
unfriendly economic climate in which vast numbers of educational institutions
all over the world find themselves, libraries are invariably subject to
cost cuts in both books and periodicals. There are innumerable
examples of not only research but vital teaching being carried out on the basis
of literature that is both out of date and factually wrong. With educational
financing deteriorating relative to growing demands for quality education
in all but the most privileged academic environments, the future for substantial
improvements in access and availability of scientific publications
in traditional form looks increasingly dismal. To compound the problem, printed
materials that are acquired are often not procured in sufficient
numbers, and are for all practical purposes unavailable to all but the few
individuals who have secured the copies that exist.
Digital archiving and preservation
An increasing number of institutions view intranet and Internet communication
technologies, supplemented by other forms of digital documentation as a
viable way out of their information access predicament. In this context
"digital documentation" is perceived as encompassing methods for acquiring,
archiving and publishing scientific and administrative information. It thus
involves recording analog information in formats that are as close as possible
to optimal for future retrieval, reuse and publishing. While online publishing
is important, popular and cost-effective both in terms of information acquisition
and publishing, simple archiving to tapes and CDs, both for conservation
and for duplication and backup, is a major part of efforts in methodical documentation.
Sharing knowledge
Of all characteristics that can be attributed to the Internet, open sharing
of information between institutions and individuals is definitely the most
important. Exchange of knowledge and information was at the very roots
of the development of the Internet and in spite of the enormous growth
of commercialization on the net in recent years, it continues to be the
most enduring advantage of the net. While many companies have
invested in the net in expectation of huge profits from the information
trade, difficulties in implementing methods of payment continue to be a
problem. Even when practical methods of remuneration are in common use,
it's doubtful if cultural and scientific information of serious educational
use will remain anything other than the responsibility of the academic
community.
Similar projects
- American Memory - The Library of Congress[4]
- Digital Libraries Initiative Projects - NSF/DARPA/NASA[5]
- Federating Repositories of Scientific Literature - University of Illinois (UC)[6]
- Al Mashriq - Østfold State College and AUB[7]
Practical considerations
The following is an overview of some characteristics of digital documentation:
Advantages of digital archiving
Size
As an example, 100 images of large posters, each in 4 resolutions, the largest
of print quality, can be stored on a standard digital compact disk.
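The arithmetic behind such an estimate can be sketched as follows. This is an illustration only: the 650 MB disc capacity and the file size at each resolution are assumed figures, not measurements from the center's poster project.

```python
# Rough storage budget for a set of images on one compact disc.
# All figures are illustrative assumptions, not values from the DDC projects.

CD_CAPACITY_MB = 650  # assumed capacity of a standard data CD

# Hypothetical file size (in MB) of one poster at each of four resolutions.
SIZES_MB = {"thumbnail": 0.02, "screen": 0.2, "large": 1.0, "print": 5.0}


def total_size_mb(image_count, sizes_mb):
    """Return the space needed to store every image at every resolution."""
    return image_count * sum(sizes_mb.values())


def fits_on_cd(image_count, sizes_mb, capacity_mb=CD_CAPACITY_MB):
    """Return True if the whole set fits on a single disc."""
    return total_size_mb(image_count, sizes_mb) <= capacity_mb


if __name__ == "__main__":
    needed = total_size_mb(100, SIZES_MB)
    print(f"{needed:.0f} MB needed, fits on one CD: {fits_on_cd(100, SIZES_MB)}")
```

With these assumed sizes the print-quality version dominates the total, so the number of images per disc is governed almost entirely by the largest resolution kept.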
Retrieval and reuse
Converting information to digital form ensures that each copy retrieved
is exactly identical to the original recording. Thus material properly
cataloged and stored on read-only digital media ensures unchanged copies.
Cataloging of the material is also digital, providing immediate access. It
can also be sent digitally as a copy to any location on the Internet
or on hard media.
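One way to check that a retrieved copy really is identical to the original recording is to compare checksums. The sketch below is a minimal illustration; SHA-256 is chosen only because it is widely available, and the source does not say which verification method, if any, the center used.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of a recording as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_exact_copy(original: bytes, copy: bytes) -> bool:
    """A copy is exact when its fingerprint matches the original's."""
    return fingerprint(original) == fingerprint(copy)


if __name__ == "__main__":
    master = b"scanned poster, print resolution"  # stand-in for a real file
    print(is_exact_copy(master, bytes(master)))    # compares an unaltered copy
    print(is_exact_copy(master, master + b"\x00"))  # compares an altered copy
```

Storing the fingerprint in the catalog next to each item would let a later reader verify a copy without access to the original.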
Ease of reproduction
Apart from reproduction in an unlimited number of digital copies, digitized
material would be available for use in print or other media that use
digital production techniques.
Permanence and safety in duplication
Digital archiving provides the advantage of making duplicates on various
media and storage in separate locations, ensuring permanent, destruction-free
storage.
Digital publication
Universal availability
Digitized information can be made universally available on the Internet
(with limited personal access if necessary) and on various other readily
available and common media such as CDs, digital tapes and disks.
Low costs
Publication costs are low, usually involving only the authoring or
editorial processes and formatting of master documents.
Ease of publication
With the emergence of standard cross-platform formats of common data
types such as HTML for digital publication and SGML for print, material
can often be prepared for final publication by archivists and authors themselves.
Additional work is usually limited to editorial and design modification
for adaptation into larger collections such as web sites or for marketing
and packaging purposes.
Search-ability
Digital formats provide excellent search functionality, especially
in text, but increasingly in other data where pattern recognition is applicable.
Provided common, cross-platform standards are used, searches need not be
limited to isolated collections of information.
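Searching text across several collections at once can be illustrated with a small inverted index. The collection names and catalog entries in the usage note are invented for the example, and the function names are hypothetical; this is a sketch of the general idea, not the search software used at the center.

```python
from collections import defaultdict


def build_index(collections):
    """Map each lowercase word to the (collection, document) pairs containing it."""
    index = defaultdict(set)
    for collection, documents in collections.items():
        for doc_id, text in documents.items():
            for word in text.lower().split():
                # Strip surrounding punctuation so "Beirut," matches "beirut".
                index[word.strip(".,;:()")].add((collection, doc_id))
    return index


def search(index, word):
    """Return sorted (collection, document) pairs that contain the word."""
    return sorted(index.get(word.lower(), set()))
```

Given two hypothetical collections, `{"posters": {"p1": "Political poster, Beirut"}, "photographs": {"m1": "Medical Gate, Beirut"}}`, a single call to `search(index, "Beirut")` returns matches from both, which is the point of using a common format for the catalog text.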
Collaboration
By digitizing information and making it available universally or selectively,
conditions for collaboration are greatly enhanced and time/space limitations
(and costs) proportionately reduced.
Immediacy of intellectual property rights
Intellectual property rights (commonly understood as «copyright»)
are not limited to particular media. Consequently the limitations previously
imposed by print publications, including loss and theft between submission
and publication are alleviated. Digital publication, for example on the
Internet, ensures the immediate and permanent intellectual property rights
of the owner of the material involved, provided it otherwise complies with
international copyright regulations.
© AUB
The original Medical Gate at AUB, 1893. From the Moore
collection of photographs at the Archives and Special Collections of
the Jafet Library.
© B. Ludvigsen
The wall and car damaged by the bomb of Sep. 27, 1997.
(Digital reporting.)
Poster of President Amin Gemayel. (From the collection
of the Jafet Library.)
© AUB Digital Documentation Center
Aref el Nakadi. From the Oral History project archived
at the Jafet Library.
Worldwide distribution of solar radiation into belts
indicating feasibility of solar applications. (From Solar Disinfection
of Drinking Water and Oral Rehydration Solutions, Guidelines for Household
Application in Developing Countries, Aftim Acra, Zeina Raffoul,
Yester Karahagopian, 1984.)
© B. Ludvigsen, AUB
Anencephaly is characterized by the absence of scalp,
calvarium, and normal brain, which is replaced by an angiomatous mass.
The typical appearance of the face (batrachian phenotype) is due to absent
frontal bones and shallow orbits causing protrusion of the eyeballs. (From
the gross specimens collection of the Pathology Department.)
© Ministry of Tourism, Lebanon,
Fulvio Roiter
Beduins in the Beqaa. From the Ministry of Tourism photobase.
© B. Ludvigsen, AUB
VRML model of the AUB campus constructed as a navigation
aid for the extended campus tour.
Sultan Abdul Hamid Khan II (1876- April 1909). From the
digitized collection of sultans of the History and Archaeology Dept.
© B. Ludvigsen, AUB
VRML reconstruction of glass find from the archaeological
excavations in Beirut Central District.
© AUB
Detail of tile from Houssaineh, Mazraat al Sayed. From
the measured drawing collection of the Dept. of Architecture.
© B. Ludvigsen, AUB
Insects from the natural history collection of the Biology
Department.
© Aftim Acra
Mite in 140 million year old amber. From the Fossils
from the Amber of Lebanon project.
Prerequisites
Infrastructure
Archiving, conversion, transmission and especially publishing require
sufficient infrastructure in the form of basic network facilities. The
situation in 1997 is characterized by fair to excellent internal infrastructure
at many educational institutions. External interconnectivity remains a
severe limitation on collaboration and publication with all but the most
privileged. The emergence of common cross-platform standards in both archiving
and publication has seen an explosion of both innovative software and methods
of presentation and transmission, all putting severe demands and strains
on available bandwidth and the dynamics of academic inter-networks. Unfortunately,
there does not seem to be any reason to expect effective improvements other
than those that can be provided by government and academic networking organizations
that see clearly the benefits for improvement in education and research.
Technical resources
Important resources to note are primarily equipment for recording in
standardized formats that can easily be read and transferred to new standards
as these are developed. Also important is the consideration of standardized,
permanent recording media such as CDs and magnetic tapes.
New file formats are developed continually, especially for displaying
over the Internet. Original recordings should be made in formats that preserve
complete ranges of sound and color at as high resolutions as possible.
Adaptations to various schemes of compression should be made only where
the resulting loss of information is of no importance to either storage
or display for end users.
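The difference between compression that preserves the original recording and compression that discards information can be shown in a few lines. The sketch below uses zlib as an example of a lossless scheme and a simple quantization step as a stand-in for a lossy one; both are illustrative choices, not formats named in this report.

```python
import zlib


def lossless_roundtrip(samples: bytes) -> bytes:
    """Compress and decompress with zlib; the result equals the input."""
    return zlib.decompress(zlib.compress(samples))


def lossy_roundtrip(samples: bytes, step: int = 16) -> bytes:
    """Round each 8-bit sample down to a multiple of `step`, discarding detail."""
    return bytes((value // step) * step for value in samples)


if __name__ == "__main__":
    original = bytes(range(256))  # stand-in for 8-bit sound or pixel samples
    print(lossless_roundtrip(original) == original)  # checks exact recovery
    print(lossy_roundtrip(original) == original)     # checks whether detail survived
```

An archive master would be kept in the first form; the second kind of reduction belongs only in copies derived for display or delivery.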
The success of any archival and publishing venture is dependent on
the skills of the people involved. For non-skilled administrative personnel,
the simplest means of judging the skills and competence of the staff involved
is the measure of their reliance on proprietary commercial software and
standards. The widest possible access and best conformity for information
sharing with other institutions and low-cost software is usually achieved
by using standards that are freely available in the public domain. Public
domain software, usually developed from research at educational institutions,
is freely available on the Internet. Standards developed along the same
lines are open and thus readily adaptable for use in both public and commercial
software, ensuring much wider compatibility and use at reasonable costs.
Editorial competence
To ensure that material be digitized in formats and ways that are compatible
with future use, it is important that the venture has a certain amount
of knowledge of formatting and editing data for publication on various
digital media. «Publication» in this context does not necessarily
mean making the digitized information globally available on the Internet,
but simply assembling the archived material in such a way that it is retrievable
in reasonably ordered form at any time by those who are responsible for
the original analog information.
In addition, digital archiving, and especially publication, demands
knowledge and skills in organizing material, editing for reader interest
and some basic knowledge of data communication. This includes the technical
skills necessary to run the software needed for publication both at the
client and server ends. Also important is sufficient experience and knowledge
of Internet communications to be able to judge delivery bandwidths and
the constraints these impose on document formats and sizes with respect
to online publication.
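The constraint that delivery bandwidth places on document size is simple arithmetic, sketched below. The 28.8 kbit/s figure in the usage note is an assumed dial-up modem speed chosen for illustration, and the calculation ignores protocol overhead and congestion.

```python
def transfer_seconds(size_bytes: int, bandwidth_bits_per_second: float) -> float:
    """Ideal transfer time for a document, ignoring overhead and congestion."""
    return size_bytes * 8 / bandwidth_bits_per_second


def max_size_bytes(bandwidth_bits_per_second: float, patience_seconds: float) -> int:
    """Largest document deliverable within the time a reader will wait."""
    return int(bandwidth_bits_per_second * patience_seconds / 8)


if __name__ == "__main__":
    modem = 28_800  # assumed reader connection, in bits per second
    print(f"1 MB image: {transfer_seconds(1_000_000, modem):.0f} s")
    print(f"10 s budget: {max_size_bytes(modem, 10)} bytes")
```

Under that assumption a one-megabyte image takes several minutes to arrive, which is why a lower-resolution version is usually prepared for online viewing alongside the archive master.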
Equity
Given the various prerequisites above, which are mostly of a technical
nature, the most important principle for success of any digital archival
and publication venture is the doctrine that information retained by educational
institutions is in the public domain. Knowledge retained by educational
institutions is developed through research and collected by acquisition
first for the benefit of students, teachers and researchers and second,
for the public at large including the use of other institutions with similar
aims.
Attitudes of possessiveness other than those required to protect personal
integrity and intellectual property rights simply negate the purpose and
aims of archival for knowledge retrieval and dissemination.
Digitizing for archiving and digitizing for publication
There is an important point to be made about the difference between
digitizing for archiving purposes alone and subsequent publication of the
same material. Storing information on a computer which is attached
to a network does not mean that it is immediately available on the global
Internet any more than putting money in a bank makes it accessible to people
on the street outside.
Digitized material can be stored on CDs for example, for safe physical
storage. While ensuring permanence and safety, it does severely restrict
retrievability. The World-Wide Web client-server software system in combination
with configuration techniques in network routing allows for all the necessary
variations in access control familiar to physical libraries and archives,
with some fairly sophisticated additions. Access to archived digitized
information can thus be controlled with respect to conditions such as location,
area, personal identification, authentication, passwords, and so on.
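The kinds of access control listed above can be sketched as a small policy check combining network location with a password. Everything specific here is an assumption made for illustration: the address range is a reserved documentation range, the user name and password are invented, and the three access levels are hypothetical rather than the center's actual configuration.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical policy values, invented for this example.
CAMPUS_NETWORK = ipaddress.ip_network("192.0.2.0/24")  # reserved documentation range
# A real system would use a salted, deliberately slow password hash.
PASSWORD_DIGESTS = {"archivist": hashlib.sha256(b"example-password").hexdigest()}


def on_campus(client_ip: str) -> bool:
    """Control by location: is the client inside the campus network?"""
    return ipaddress.ip_address(client_ip) in CAMPUS_NETWORK


def authenticated(user: str, password: str) -> bool:
    """Control by personal identification: does the password match?"""
    expected = PASSWORD_DIGESTS.get(user)
    if expected is None:
        return False
    supplied = hashlib.sha256(password.encode()).hexdigest()
    return hmac.compare_digest(supplied, expected)


def may_read(level: str, client_ip: str, user: str = "", password: str = "") -> bool:
    """Decide access for an item marked 'public', 'campus' or 'restricted'."""
    if level == "public":
        return True
    if level == "campus":
        return on_campus(client_ip)
    if level == "restricted":
        return on_campus(client_ip) and authenticated(user, password)
    return False  # unknown levels are refused
```

In practice such rules are usually expressed in the web server's or router's configuration rather than in application code; the sketch only makes the combination of conditions explicit.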
Media shelf-life
An often voiced concern with respect to archiving digitized material
is the potential deterioration of the «permanent» media on
which it is stored or the rapid outdating of the retrieval technology,
both hardware and software. Media shelf-life is dependent on several factors
not least of which is the market penetration of common technologies. The
more common a certain piece of technology, the longer it will remain in
use. Ultimately, the availability of digitized information will simply
depend on its perceived value on the part of its custodians. In a well
managed archival institute, material will simply be moved from one form
of media about to become outdated to a more modern version as a matter
of routine. The fact that the archived material is digital will ensure
that it can be moved preserving both accuracy and integrity irrespective
of the storage medium. In the case of analog information such as print
and film such movement between storage media is prohibitively expensive
and causes serious deterioration to the quality of the information itself.
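A routine migration of this kind amounts to copying every file to the new medium and confirming that each copy matches its source byte for byte. The sketch below assumes both media appear as ordinary directories and handles only the files directly inside the source directory; the function name and layout are hypothetical.

```python
import filecmp
import shutil
from pathlib import Path


def migrate(old_medium: Path, new_medium: Path) -> list:
    """Copy every file to the new medium; return names that failed verification."""
    new_medium.mkdir(parents=True, exist_ok=True)
    failed = []
    for source in sorted(old_medium.iterdir()):
        if not source.is_file():
            continue  # subdirectories are outside the scope of this sketch
        target = new_medium / source.name
        shutil.copy2(source, target)  # copies contents and timestamps
        # shallow=False forces a byte-by-byte comparison of the contents.
        if not filecmp.cmp(source, target, shallow=False):
            failed.append(source.name)
    return failed
```

An empty result means every file arrived intact, and only then would the outdated medium be retired.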
Results
During the first six months the work of the center fell into three main
categories: pilot projects, guidelines for the center’s activities and
administration[8].
Guidelines
A general introduction to digital archiving and publication.
Copyright matters. Intellectual property right essentials, FAQs on
copyright, myths about copyright, links to Berne Convention for the Protection
of Literary and Artistic Works and the Intellectual Property Center, differences
in emphasis on digitizing for archiving and digitizing for publication
and AUB’s liability and copyright in the form of present agreements.
Methods and routines for digitizing. General information on aspects
of in-house digitizing and out-sourcing. Recording methods for photographs
and other images, audio, video, text, objects and physical environments.
Necessary equipment such as computers, scanners, photographic and audio
equipment.
Methods and routines for publication
A short list of similar projects world-wide.
Administration
With the aim of effectively producing as much as possible on the
pilot projects, administration was completely paperless. Whatever paper
that was addressed to the center was either discarded or filed with the
PCNS Unit. Weekly plans and reports, student assistants' work reports, backup
logs for server, and miscellaneous documentation aids for methods were
filed on the center’s webserver. Whenever the PCNS needed any documentation
of ongoing activities or logs for payment to student assistants, they were
referred to the material on the server.
Pilot projects
The following projects were either completed or under work at the DDC by
the end of December 1997 (for uncompleted projects, estimates of
remaining work are included in parentheses):
Jafet Library - Archives and special collections: 360 Political
posters 1960-1990 (5%). The Moore Collection of historical photographs.
1928 yearbook. Oral History - Aref el Nakadi (50%).
AUH (American University Hospital): Pathology specimens (This
is an initialized project demonstrating methods of digitizing and display
which the Pathology department intends to continue by itself.).
Saab Medical Library: Searchable bibliography of Lebanese medical
articles and books.
Internal audit: Digitizing old policies for archival. (Assistance
and guidance for putting policies and procedures operatively on the intranet.)
History and Archaeology: Picture database from the BEY006 archeological
excavations. CD Supplement to Berytus 1998 (5%). VRML reconstructions of
glass and pottery. Ottoman Sultans, proposal for presentation at the IMA exhibition,
Oct. 1998. Brochure for Careers 97, Beirut Expo (History and Archaeology
Department).
Environmental Health: Solar disinfection of drinking water
Campus tour: An interactive version of the tour was completed
with available material on buildings and departments which will be supplemented
as it becomes available. (A simplified version of the tour for ‘java-unabled’
browsers is under development.). Detailed campus maps. A VRML model
of the campus.
Architecture: A selection of measured drawings (60%), a selection
from the slide library (75%).
Biology: Samples from insect collections. (A small selection
for demonstration purposes.) Searchable bird observation database.
Miscellaneous: Amber fossils from Lebanon. Maps from MOT - Ministry
of Tourism (1:200 000). The MOT’s photo database (15 000 photographs, searchable
and geographically interfaced, 10%). Archive pictures from the infirmary.
Future work
A selection of projects for the immediate future had already been identified
by the end of 1997. These include an interesting experiment in digitizing
16mm film. During the emptying of some garbage bins in front of the DDC
lab, a series of films appeared quite by chance some time in September
97. The majority were in canisters and appeared to be a Russian epic from
the Soviet era. However, one, which was found on an open reel, was discovered
to be titled "South Lebanon: Tragedy and Steadfastness". The film, which
is recorded on a highly unstable analog medium, should be digitized onto
digital video and then transferred to disk and CD for permanent archival.
Three particular projects have been singled out for 1998. One is an
extensive digital documentation of the archeological excavations carried
out by the AUB team in Beirut City District. The presentation will be a
supplement to objects to be exhibited at the Institut du Monde Arabe’s
project on Lebanon during the fall of 1998. Digitization and publishing
of papers and articles from the Department of Political Science and Center
for Middle East Studies is another area of obvious importance
to the university itself and particularly to outside users. Finally, the center
will attempt to initiate a project on documentation of Arabic calligraphy.
Problems and solutions
The usual technical problems were encountered with equipment, such as the number
of available computers during periods of heavy workload, hard disk space,
insufficient server memory, slow scanner processing, and a slide scanner
limited to a small slide size. It turns out that many of the slides offered are
in the «superslide» 4x4cm format and the acquired scanner takes
only 35mm. Of a more serious nature was slow and intermittent Internet access.
Quite often supplemental software and essential information was available
only over the net. Several essential bits of equipment were also lacking
at the center. These include a good single-lens reflex camera with a high
definition 55mm lens, a wide angle lens for panoramas, a 3CCD digital
video camera (DV or miniDV) and a faster slide scanner with slide magazine,
which were all borrowed from private sources.
The announcement of the startup project was circulated much too late
and coincided with the summer break. Projects were therefore contributed
late in the period.
The student assistants were somewhat young, resulting in a long training
period before they were prepared to take reasonable responsibility for
their respective projects. Subsequently recruited volunteers who were somewhat
older were assigned limited projects under their own responsibility.
Various ways of future manning of the center were discussed including
an individual as technical and administrative head, administrative heading
with periodic external consultation, student projects and running by committee.
A workable solution of heading the work over the net with periodic visits
to Beirut is now being tried out.
Concerns about which material could be released for publishing
both on the AUB intranet and the World-Wide Web were the only real problems
that affected the availability of archived material. It was highly unclear
whether the material from the Oral History project was of a restricted
nature. As no documentation to that effect was brought forward, the
project was started as an experiment. The tape recordings of the Oral History
interviews were made in the early 1970s and were not of very high quality even then,
so it is important that at least the tapes be digitized to prevent further
deterioration.
Conclusion
With the exception of the Department of Architecture, none of the entities
that contributed material to the pilot project had any resources or experience
in digital archiving or publishing. Architecture digitized their own drawings
and provided student assistance for digitizing slides in the DDC lab. The
idea was to train that student in ordering and presentation of the digitized
material for subsequent archival on the department’s own server. As the
student left the project because of illness, it was left to the DDC staff
to complete the project.
Although it must be left to others to evaluate, the amount of material
and very varied content, much of which is unique both in terms of academic
interest and in exposing the university’s achievements, would amply illustrate
the viability of the center.
The policy of using open standards for digitizing (if not presenting[9])
all the material is important. It allows for complete flexibility and freedom
in future replication and other forms of presentation and archival.
While the initial training of the student assistants was slow, the
emphasis on individual responsibility for particular projects was successful
in view of the difficulties in resolving the long-term manning of the center.
The uncompleted projects are under responsible development, with the
Internet as an essential tool for guidance.
While both the establishment and leadership of the DDC as a collaborative
effort between AUB and HiØ are an interesting experiment, more formalized
agreements should be made between the institutions regarding both the distribution
of time and economic arrangements involved. The experience gained by the
end of 1998 should be sufficient to demonstrate both the viability of the
center at AUB and the usefulness of the project in terms of collaboration
on method, content and technology.
[1] Høgskolen i Østfold
[2] http://www.aub.edu.lb
[3] http://almashriq.hiof.no/
[4] http://lcweb2.loc.gov/ammem/
[5] http://www.cise.nsf.gov/iris/DLHome.html
[6] http://dli.grainger.uiuc.edu/
[7] http://almashriq.hiof.no/ and http://almashriq.aub.edu.lb/
[8] The web pages of the DDC can be viewed on the AUB intranet at http://ddc.aub.edu.lb/ and http://almashriq.hiof.no/ddc/ at HiØ.
[9] Audio material is digitized to the open standard AIFF format, but
published as proprietary Real Audio which can be played with freely available
software downloadable from the Internet.
Copyright: 1998, Høgskolen i Østfold. Last Update: 01.07.98, Trond Løvereide.