Arabic script on RLIN

Date: 01 April 1992
Published date: 01 April 1992
DOI: https://doi.org/10.1108/eb047865
Pages: 59-80
Author: Joan M. Aliprand
Subject Matter: Information & knowledge management, Library & information science
ARABIC SCRIPT ON RLIN
Joan M. Aliprand
Arabic script is the most recent addition to the
scripts available on the Research Libraries
Information Network (RLIN). Bibliographic
control and retrieval using the authentic writing
system are available for titles in Arabic, Persian
(Farsi), Urdu, Ottoman Turkish, and other
languages written with Arabic script. RLIN is
the world's largest bibliographic database for
Middle Eastern language material.
This paper is a comprehensive description of the
Arabic script features of RLIN. It covers Arabic
character sets and RLIN's character repertoire
for Arabic script; how Arabic characters are
input and stored in the RLIN database; the
equipment needed for Arabic script support; the
indexing, retrieval, and presentation of records
containing Arabic script; the inclusion of
non-Roman data in USMARC bibliographic records;
and statistics on the RLIN databases. Sidebars
explain features of Arabic writing.
The discussion of data storage and presentation
of text is relevant to any computer application
that involves Arabic script.
It is known [1, 2] that romanized access is inadequate
for the bibliographic control of material written in
non-Roman scripts. Because of this, the Library of
Congress has continued to do original script cataloging
for Japanese, Arabic, Chinese, Korean, Persian (Farsi),
Hebrew, and Yiddish (the so-called JACKPHY languages),
first on cards and later in machine-readable form.
The Research Libraries Group, Inc. (RLG) also
recognized the inadequacy of romanization as a
substitute for original script access. Over the past decade,
RLG has undertaken a series of projects to add major
non-Roman scripts to its automated bibliographic
system, the Research Libraries Information Network
(RLIN); the first was Chinese/Japanese/Korean (CJK)
in September 1983 [3].
Input, retrieval, and display of each JACKPHY
language in its correct script are now available through
RLIN; Arabic, the last script necessary for complete
JACKPHY support, was put into production in November
1991. Major research institutions (including the
Library of Congress) use RLIN for online cataloging
of non-Roman material; the Library of Congress was
the first user of RLIN's Arabic capability [4]. (Throughout
this article, "Arabic" refers to the script used to write
languages such as Arabic, Persian (Farsi), and Urdu;
"language" will be added when the Arabic language
is meant.)
Aliprand is a programmer/analyst at the Research
Libraries Group, Inc., Mountain View, California.
Acknowledgments: The addition of Arabic to RLIN
was underwritten by a grant from the Kuwait Foundation
for the Advancement of Sciences. Laura Spurrier's
thoughtful comments were most useful. Most of RLG's
projects are a team effort, and Arabic is no exception.
I would like to thank my colleagues for their advice,
good ideas, and hard work.
ARABIC SCRIPT ON RLIN, ISSUE 40, 10:4 (1992) 59
SIDEBAR A: FEATURES OF ARABIC SCRIPT
Arabic writing is one of the world's great calligraphies.
The distinctive features of the script include:
connection of letters (Arabic is a cursive script);
varying letter widths;
varying graphic forms for a letter depending on its position in a word;
occasional use of an abnormal positional form for a letter;
representation of vowels and other indicators of pronunciation as signs above or below consonants (Arabic is usually written without vowels, and they are supplied in the reading process);
digraphs and other composite letter combinations.
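As a present-day illustration of two features from the list above (positional forms and vowel signs written above or below consonants), the following minimal sketch uses the Unicode character database shipped with Python. It is an assumption-laden example and does not describe RLIN's own character set or storage format; the names BEH, POSITIONAL_FORMS, base_letter, and strip_vowel_marks are hypothetical and chosen only for this example.

```python
import unicodedata

# ARABIC LETTER BEH: the abstract letter, independent of its drawn shape.
BEH = "\u0628"

# Unicode's compatibility block "Arabic Presentation Forms-B" encodes the four
# positional shapes of BEH as separate code points. They are used here only to
# make the shapes visible, not as a suggested storage format.
POSITIONAL_FORMS = {
    "isolated": "\ufe8f",
    "final": "\ufe90",
    "initial": "\ufe91",
    "medial": "\ufe92",
}


def base_letter(glyph: str) -> str:
    """Map a positional presentation form back to its abstract letter."""
    return unicodedata.normalize("NFKC", glyph)


def strip_vowel_marks(text: str) -> str:
    """Drop nonspacing marks (general category Mn), e.g. fatha, kasra, damma."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    for position, glyph in POSITIONAL_FORMS.items():
        # Each positional shape folds back to the same abstract letter.
        print(position, unicodedata.name(glyph), base_letter(glyph) == BEH)

    # KAF, TEH, BEH, each followed by the vowel sign FATHA ("kataba").
    vocalized = "\u0643\u064e\u062a\u064e\u0628\u064e"
    print(len(vocalized), len(strip_vowel_marks(vocalized)))
```

The sketch treats the positional shape as a matter of presentation and the vowel signs as separate marks attached to a consonant, which is one way to keep a single stored letter regardless of how it is drawn.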
Implementation of Non-Roman Scripts
Some of the design issues faced in the initial
project to add a non-Roman script capability to RLIN
were specific to the East Asian family of scripts; for
example, how are the thousands of Chinese ideograms
(used for Japanese and Korean as well as for Chinese)
to be uniquely encoded? (This question is addressed
in Smith-Yoshimura & Tucker [5].)
But other design issues were more general, and
related to the incorporation of a script other than the
Latin alphabet (with bibliographic extensions) into the
RLIN system. For example:
How is a user to switch back and forth between scripts?
How is a change of script to be indicated in the stored data?
If a user's terminal lacks non-Roman capability, can the user search for and see records containing non-Roman data?
How are the non-Roman data in a record in the RLIN database output on tape for bibliographic exchange?
What changes to USMARC need to be introduced?
RLG's solutions to these particular problems, which
have been implemented for all non-Roman scripts, are
discussed below.
Whenever a new non-Roman script is added to
RLIN, script-specific questions arise and have to be
addressed. Arabic-specific questions included:
What coding standard exists for this script? Is it adequate?
How will text in opposite directions be input and displayed?
How will the positional forms of letters and the lam-alif digraphs be accommodated?
Should vowels and marks of pronunciation that are "seated" on letters be allowed, or should text have to be written without vowels?
Are there any unique features of languages written in Arabic script that affect indexing?
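To make the questions about text direction, the lam-alif digraph, and indexing more concrete, here is a small sketch in Python built on Unicode character properties. It is only an illustration under stated assumptions: it is not RLIN's design, it does not implement the full Unicode Bidirectional Algorithm, and the functions direction_runs and index_key are hypothetical names introduced for this example.

```python
import unicodedata

LAM, ALIF = "\u0644", "\u0627"
# ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM (a compatibility character).
LAM_ALIF_LIGATURE = "\ufefb"


def direction_runs(text: str) -> list[tuple[str, str]]:
    """Split logically ordered text into coarse left-to-right / right-to-left runs.

    Bidi classes R and AL count as 'R', class L counts as 'L'. Any other
    character (digit, space, punctuation) joins the run in progress. This is
    a deliberate simplification for illustration.
    """
    runs: list[tuple[str, str]] = []
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):
            direction = "R"
        elif bidi == "L":
            direction = "L"
        else:
            direction = runs[-1][0] if runs else "L"
        if runs and runs[-1][0] == direction:
            runs[-1] = (direction, runs[-1][1] + ch)
        else:
            runs.append((direction, ch))
    return runs


def index_key(text: str) -> str:
    """Hypothetical index form: unfold ligatures and positional forms, drop vowel marks."""
    unfolded = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in unfolded if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    # Latin text followed by the Arabic word "kitab", stored in logical order.
    mixed = "RLIN " + "\u0643\u062a\u0627\u0628"
    for direction, run in direction_runs(mixed):
        print(direction, [unicodedata.name(ch) for ch in run])

    # The ligature and the two-letter spelling give the same index key.
    print(index_key(LAM_ALIF_LIGATURE) == index_key(LAM + ALIF))
```

Storing text in logical (reading) order and resolving direction only at display time is one possible answer to the direction question; folding the ligature and vowel marks before indexing is one possible answer to the indexing question. Both are assumptions of this sketch rather than statements about RLIN.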
Figure 1, an announcement of an international
conference on multilingual computing, shows Arabic text in
various typefaces. Distinctive features of Arabic script,
such as positional forms, "seated" vowels, and the
lam-alif digraphs, are described in detail in sidebars.
60 LIBRARY HI TECH, JOAN M. ALIPRAND
