Present-day bibliographic databases can automate tasks
such as cataloguing, searching (e.g. book title search, possibly
with a thesaurus), navigation, and browsing.
-
INSPEC is a database relevant for physics, electronics, and computing.
It contains citations with abstracts. Sources for INSPEC include
primarily journals and conference proceedings, along with books, reports,
and dissertations. Bibliographic information, indexing terms, abstracts,
property information, and element terms are all searchable. An online
thesaurus is available.
-
Over five million citations are available (3/96). INSPEC is updated
weekly and covers publications since 1969.
-
COMPUSCIENCE is a bibliographic database covering literature in the
field of computer science and computer technology with emphasis on the
subjects of software, computer systems organization, information
systems, computer graphics, new generation computing, and artificial
intelligence. It aims at complete coverage of European and American
publications. The database is produced by FIZ Karlsruhe in co-operation
with ACM, EACTS, GMD and GI. As a result, some of the entries occur
more than once. Citations are in English and contain bibliographic
information and indexing terms. Many records also include an abstract,
partly in English and partly in German. The citations are classified
according to the ACM Classification Scheme [ACM91].
-
The database comprises more than 441,210 records (1/97) and is updated
monthly with about 2,500 citations.
-
Search criteria are author, title words, keywords, words in the abstracts,
and ACM classification codes [ACM91].
-
The search can be constrained by entry date and document type,
e.g. ALGORITHM, CONFERENCE, REPORT.
The Geometry Literature Database is a BibTeX database of
papers in computational geometry, maintained as a collective effort by
members of the computational geometry community, under the gentle
supervision of Bill Jones at the University of Saskatchewan. It contains
over 8000 entries. For detailed information about the database, you may
visit Jeff Erickson's WWW page about the Geometry Literature Database.
-
Faceted thesauri and classifications are quite common, including AAT,
EI, ERIC, INSPEC, LISA and MeSH. However, these systems do not take
advantage of the classification in the user interface. The facet-space
interface integrates classification system browsing,
thesaurus-term identification, and object retrieval within one system.
The interface has three major components: cascaded-facet menus,
constraint lists, and a document shelf. The CR-classification labels
(defined by the ACM Computing Classification System [ACM91]) are chosen
from the cascaded menus. The constraint list presents the facets which are
active, and the shelf is updated with articles that match the constraints.
Thus, the context of the currently chosen constraints is always evident.
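The interplay of constraint list and shelf can be sketched in a few lines
of Python. This is only an illustration, not the facet-space implementation:
the record layout, the sample labels (which merely imitate the style of CR
categories), and the function name `shelf` are assumptions made for the
example.

```python
# Hypothetical documents, each tagged with classification labels.
# The labels only imitate the style of ACM CR categories.
DOCUMENTS = [
    {"title": "Thesis A", "labels": {"H.3.3", "I.2.7"}},
    {"title": "Thesis B", "labels": {"H.3.3", "H.5.2"}},
    {"title": "Thesis C", "labels": {"I.3.5"}},
]


def shelf(documents, constraints):
    """Return the titles of documents that carry every active constraint label."""
    active = set(constraints)
    return [d["title"] for d in documents if active <= d["labels"]]


# Each label added to the constraint list narrows the shelf.
print(shelf(DOCUMENTS, []))                  # no constraint: all documents
print(shelf(DOCUMENTS, ["H.3.3"]))           # one active facet
print(shelf(DOCUMENTS, ["H.3.3", "H.5.2"]))  # two active facets
```

Because the shelf is recomputed from the full constraint list each time,
removing a constraint widens the result again without any extra bookkeeping.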
-
The implementation is applied to 1381 summaries of computer science
dissertations as organized by the ACM Computing Reviews classification
system [ACM91]. (5/96)
-
ARIADNE is a tool for the
interactive production and distribution of up-to-date computer science
information on the World Wide Web. It is developed by the MeDoc-Project.
A user can participate
in the information exchange by adding new sources, by giving comments
concerning events, sources, publications etc., by reclassifying entries, by
supervising the quality of information via moderation, and by electronic
discussions.
-
ARIADNE manages URLs in a
central database and classifies them using the ACM Computing Classification
System [ACM91].
-
ARIADNE started with 8071 URLs gathered from existing bibliographies
(Bibliography server by M. Ley, Collection of
Computer Science Bibliographies by A. Achilles) and from data
gathered from several Information and Resource Discovery Systems.
-
Additional services offered by ARIADNE include an interface to other
information sources and services such as databases, bibliographies,
libraries, publishers, software repositories, technical report servers
and document delivery services.
-
ARIADNE offers a profile
service which keeps the user informed about changes at selected URLs.
-
The Science Citation Index (SCI) is a reference tool which presents
bibliographic data about published journal articles in all fields of
science, technology and biomedicine. Approximately 7000 research
journals are indexed. The journals are chosen partly according to their
importance, i.e. ISI preferably covers journals which are frequently cited.
Work published outside these journals cannot be retrieved.
-
Unlike the usual bibliographic databases, the SCI offers
cited reference searching.
An index of all cited authors enables forward-directed
searching, e.g. discovering all articles which cite a selected author.
In addition to the bibliographic data, the bibliography of each article is
retrieved. This also enables backward-directed searching.
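Forward and backward searching can be pictured as two lookups over the same
data. The sketch below is a simplified illustration and not the SCI's actual
data model: it works on hypothetical article identifiers rather than cited
authors, and the names `REFERENCES` and `build_citation_index` are invented
for the example.

```python
from collections import defaultdict

# Hypothetical records: article id -> ids of the articles in its reference list.
REFERENCES = {
    "A": ["C", "D"],
    "B": ["C"],
    "C": ["D"],
    "D": [],
}


def build_citation_index(references):
    """Invert the reference lists: cited article -> list of citing articles."""
    cited_by = defaultdict(list)
    for article, refs in references.items():
        for ref in refs:
            cited_by[ref].append(article)
    return dict(cited_by)


CITED_BY = build_citation_index(REFERENCES)

# Backward search: what does article A cite?
print(REFERENCES["A"])
# Forward search: which articles cite article C?
print(CITED_BY.get("C", []))
```

The backward direction needs only the stored reference lists, whereas the
forward direction requires the inverted index, which is what a conventional
bibliographic database lacks.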
-
Since 1986 the Science Citation Index has appeared quarterly
on CD-ROM. SciSearch is the online version of the SCI. It is
accessible via the hosts STN International (FIZ Karlsruhe), DataStar, DIALOG or
DIMDI. SciSearch contains nearly 16 million
(as of Sep 95) cited references since 1974. SciSearch's weekly updating gives
extremely fast access to the international literature of scientific and
technical research.
-
In comparison to BibRelEx the SCI doesn't support:
- the visualization of results,
- annotations,
- the distinction between private and public share.
-
The user starts with a free form textual query. The system consults an
on-line thesaurus and offers words and phrases related to the query
words. At any point the user can drag-and-drop any of the related
words/phrases into the positive or negative window. The negative words
are included in the query with a NOT operator.
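How the positive and negative windows might translate into a Boolean query
is sketched below. The source only states that negative words are attached
with a NOT operator; combining the remaining words with AND, the tiny
thesaurus, and all function names are assumptions made for this
illustration.

```python
# Hypothetical thesaurus: query word -> related words offered to the user.
THESAURUS = {
    "java": ["programming", "island", "coffee"],
}


def related_terms(query_words, thesaurus=THESAURUS):
    """Words the system would offer for the given query words."""
    return [t for w in query_words for t in thesaurus.get(w, [])]


def build_query(query_words, positive, negative):
    """Combine query, positive and negative words into one Boolean query string."""
    wanted = " AND ".join(list(query_words) + list(positive))
    unwanted = "".join(f" AND NOT {w}" for w in negative)
    return wanted + unwanted


def matches(text, query_words, positive, negative):
    """Evaluate the same Boolean query against a single document text."""
    words = set(text.lower().split())
    required = set(query_words) | set(positive)
    return required <= words and not (set(negative) & words)


print(related_terms(["java"]))
print(build_query(["java"], ["programming"], ["island", "coffee"]))
print(matches("Java programming guide", ["java"], ["programming"], ["island"]))
```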
-
The query result is a ranked list of document titles. Clicking on
any title with the mouse brings up the full document.
-
The interface provides visual feedback to the user about how the query
words influence the ranking of retrieved documents: For every query
word the system displays a bar chart. The leftmost column of bars
corresponds to the top-ranked document, with the columns progressing
to the right representing progressively lower-ranked documents.
-
The user can classify any document as relevant or non-relevant
by dragging and dropping it into the positive or negative
window. The words in the selected documents are used to expand the
query in the next iteration. Thus an improved ranking of documents is
achieved.
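One simple way to expand a query from user-classified documents is shown
below. The source does not say how the described system selects expansion
words, so the selection rule (most frequent words of the relevant documents,
skipping words seen in non-relevant ones) and the name `expand_query` are
assumptions of this sketch.

```python
from collections import Counter


def expand_query(query, relevant_docs, nonrelevant_docs, max_new_terms=2):
    """Add the most frequent words of the relevant documents to the query.

    Words that also occur in a non-relevant document are skipped.
    """
    rejected = {w for doc in nonrelevant_docs for w in doc.lower().split()}
    counts = Counter(
        w
        for doc in relevant_docs
        for w in doc.lower().split()
        if w not in query and w not in rejected
    )
    # Sort by descending frequency, then alphabetically, for a stable result.
    candidates = sorted(counts, key=lambda w: (-counts[w], w))
    return list(query) + candidates[:max_new_terms]


query = ["geometry"]
relevant = ["voronoi diagram geometry", "voronoi cell geometry"]
nonrelevant = ["diagram editor software"]
print(expand_query(query, relevant, nonrelevant))
```

Running the expanded query through the ranking step again gives the next
iteration of the feedback loop.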
-
The interface has facilities to browse the table of contents of
publications and to browse the list of articles written by a specific
author.
-
In comparison to BibRelEx, this system has the following properties:
- searching by query,
- overview diagrams are not available,
- relations between the documents are ignored.
Current systems typically provide full-text scanning (for small
databases), inversion (for large databases) or clustering. The major
ideas from clustering are relevance feedback (mostly based on a
vector space model) and the ability to provide ranked output (i.e.,
documents sorted by relevance).
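The difference between full-text scanning and inversion can be illustrated
with a toy collection. The documents and function names below are invented
for the example; real systems add tokenization, stemming and on-disk index
structures that are omitted here.

```python
from collections import defaultdict

DOCS = {
    1: "hypertext links and documents",
    2: "ranking of retrieved documents",
    3: "thesaurus terms for searching",
}


def scan(docs, word):
    """Full-text scanning: inspect every document for the word."""
    return sorted(doc_id for doc_id, text in docs.items() if word in text.split())


def build_inverted_index(docs):
    """Inversion: map every word to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[word].add(doc_id)
    return index


INDEX = build_inverted_index(DOCS)

# Both strategies return the same answer; the index avoids rereading the texts.
print(scan(DOCS, "documents"))
print(sorted(INDEX["documents"]))
```

Scanning costs time proportional to the size of the collection for every
query, while the inverted index pays that cost once at build time, which is
why inversion is preferred for large databases.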
Hyper-G is a client-server based Internet information system. The main goals
are to provide automatic structuring and maintenance of a large number of
distributed hypermedia documents, to provide orientational and navigational
aids to the user, to support user identification and access control, and
to provide access to existing information systems like the WWW.
-
Information can only be added in a structured way and only with tools provided
by Hyper-G.
-
Contrary to usual Web practice, links are stored in separate
databases and they are bidirectional. Links can be followed
back to their source. The separate link database has a number of
advantages: As links are separated from the documents, links can be
included in all kinds of documents, including text, images, video
clips, and 3D scenes. Links to a document can be removed
automatically when the document is removed. This ensures
referential integrity, i.e., no links pointing to documents
which no longer exist. The link structure is easy to visualize.
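A separate, bidirectional link store with automatic clean-up can be sketched
as follows. This is a minimal in-memory illustration of the idea, not
Hyper-G's actual design; the class `LinkDatabase` and its methods are
hypothetical.

```python
class LinkDatabase:
    """Links kept apart from documents and indexed in both directions."""

    def __init__(self):
        self.outgoing = {}  # document -> set of link targets
        self.incoming = {}  # document -> set of link sources

    def add_document(self, doc):
        self.outgoing.setdefault(doc, set())
        self.incoming.setdefault(doc, set())

    def add_link(self, source, target):
        if source not in self.outgoing or target not in self.outgoing:
            raise KeyError("both ends of a link must be known documents")
        self.outgoing[source].add(target)
        self.incoming[target].add(source)

    def remove_document(self, doc):
        """Delete a document together with every link that touches it."""
        for target in self.outgoing.pop(doc):
            self.incoming[target].discard(doc)
        for source in self.incoming.pop(doc):
            self.outgoing[source].discard(doc)


db = LinkDatabase()
for name in ("paper", "image", "video"):
    db.add_document(name)
db.add_link("paper", "image")
db.add_link("video", "image")
print(sorted(db.incoming["image"]))  # follow links back to their sources
db.remove_document("image")
print(db.outgoing["paper"])          # no dangling link is left behind
```

Since documents are opaque keys here, the same store works for text, images
or any other media type, which mirrors the advantage described above.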
-
Documents are grouped into collections, which may recursively be members
of other collections. Each document must belong to at least one
collection. Documents and collections may belong to multiple parent
collections, opening up the possibility of providing multiple views of
the same information. The navigation follows along the collection hierarchy.
-
Search facilities are fully integrated into Hyper-G. As documents and
collections are inserted into the database, they are automatically
indexed. This allows full text searches and boolean searches of all or some
collections. Collections can be marked by the user, and only the
marked collections will be searched. Documents and collections have
attributes, e.g. author, title, keywords and creation date which may be used
in searches. As all documents are automatically indexed when they are inserted
into a collection, the database size does not seem to affect the speed of the
search.
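Restricting an attribute search to marked collections can be sketched as
below. The hierarchy, the attribute values and the function names are
hypothetical, and the sketch assumes that the collection hierarchy contains
no cycles.

```python
# Hypothetical hierarchy: collection -> members (documents or sub-collections).
MEMBERS = {
    "root": ["geometry", "retrieval"],
    "geometry": ["doc1", "doc2"],
    "retrieval": ["doc2", "doc3"],  # doc2 belongs to two collections
}
ATTRIBUTES = {
    "doc1": {"title": "Voronoi diagrams", "author": "Smith"},
    "doc2": {"title": "Geometric search", "author": "Jones"},
    "doc3": {"title": "Ranking methods", "author": "Smith"},
}


def documents_in(collection, members):
    """Collect all documents below a collection, following sub-collections."""
    found = set()
    for member in members.get(collection, []):
        if member in members:  # member is itself a collection
            found |= documents_in(member, members)
        else:
            found.add(member)
    return found


def search(marked, attribute, value, members=MEMBERS, attributes=ATTRIBUTES):
    """Search only the marked collections for documents with a given attribute value."""
    scope = set()
    for collection in marked:
        scope |= documents_in(collection, members)
    return sorted(d for d in scope if attributes[d].get(attribute) == value)


print(search(["retrieval"], "author", "Smith"))
print(search(["root"], "author", "Smith"))
```

A document that belongs to several parent collections, such as `doc2` here,
is found through any of them but reported only once.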
-
Hyper-G supports a Unix-like security system. The system administrator controls
the rights to read, modify and link to documents at the document or collection
level. Each user has a home collection for personal documents. Documents may
also be annotated, and annotations may be private, public or shared with a group.
-
Existing Web browsers can be clients of Hyper-G servers (they
just won't get all the features), and Hyper-G browsers can be
full-featured clients of normal Web servers. Similarly, it includes
seamless access to popular Internet server technologies, such as WAIS
and Gopher.
BASISplus is a professional document management system. In addition to
BASISplus (the full-text search engine based on a relational
database management system) the BASIS WEBserver (WWW gateway) and the
BASIS SGMLserver (management of SGML documents) are offered.
-
Associating weights with the individual search terms defines a ranking.
-
The BASISplus Thesaurus supports the
ANSI-defined thesaurus relationships including synonym identification, concept
hierarchies, clarification of ambiguous terms, automatic or interactive term
switching and alternative language searching. Users may also define
their own word relationships. The central thesaurus manager defines
and manages several different thesauri simultaneously. Any
standards-compliant thesaurus (e.g. Roget, Houghton-Mifflin) can be
uploaded and used with BASISplus.
-
Security features: User ID and password protection may be applied to
entire databases, index files, documents, fields, subfields, and
read/write privileges. Safety-critical documents can be encrypted.
-
Three components:
- SearchServer: search engine,
- SearchBuilder: GUI tool,
- Surfboard: WWW gateway.
-
The human-computer interface must be programmed entirely by the developer.
Access to the data is based on the (command-line-oriented) SearchSQL.
-
Ranking: Fulcrum offers different ranking
methods, e.g. number of matches, term frequency, term weights (e.g. TF-IDF).
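The three kinds of ranking named above can give different orderings for the
same query, as the following toy comparison shows. It uses textbook
definitions of the scores and is not Fulcrum's implementation; the documents
and all function names are invented for the example.

```python
import math
from collections import Counter

DOCS = {
    "d1": "search search search",
    "d2": "search engine",
    "d3": "document filter",
}
TOKENS = {doc_id: Counter(text.split()) for doc_id, text in DOCS.items()}


def match_count(query, doc_id):
    """Number of distinct query terms that occur in the document."""
    return sum(1 for term in query if TOKENS[doc_id][term] > 0)


def term_frequency(query, doc_id):
    """Total number of occurrences of the query terms in the document."""
    return sum(TOKENS[doc_id][term] for term in query)


def tf_idf(query, doc_id):
    """Term frequency weighted by inverse document frequency."""
    score = 0.0
    for term in query:
        df = sum(1 for counts in TOKENS.values() if counts[term] > 0)
        if df:
            score += TOKENS[doc_id][term] * math.log(len(TOKENS) / df)
    return score


def rank(query, scorer):
    """Document ids sorted by descending score, ties broken by id."""
    return sorted(TOKENS, key=lambda d: (-scorer(query, d), d))


query = ["search", "engine"]
for scorer in (match_count, term_frequency, tf_idf):
    print(scorer.__name__, rank(query, scorer))
```

Pure term frequency favours `d1`, which merely repeats a common word, while
match counting and TF-IDF both prefer `d2`, which contains the rarer term as
well.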
-
Support for over 150 document formats (including Adobe PDF, Microsoft
Office, Word, PowerPoint and Excel) is enabled by using filters and
additional attributes. In addition to a full-text field, a record can
contain any other fields common in relational databases.
-
Topic uses a knowledge-based approach
based on the idea of concept retrieval. Instead of words or phrases it uses
hierarchically arranged topics to search for entire concepts.
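Searching by concept rather than by literal words can be pictured as
expanding a topic into the words below it in the hierarchy. The sketch is an
assumption-laden illustration, not Topic's algorithm: the topic tree, the
sample documents and the functions `expand` and `concept_search` are
hypothetical, and the tree is assumed to be free of cycles.

```python
# Hypothetical topic tree: a topic is defined by sub-topics and/or plain words.
TOPICS = {
    "information retrieval": ["ranking", "indexing"],
    "ranking": ["relevance", "score"],
    "indexing": ["inverted", "index"],
}


def expand(topic, topics=TOPICS):
    """Return all plain words below a topic in the hierarchy."""
    words = set()
    for child in topics.get(topic, []):
        if child in topics:
            words |= expand(child, topics)
        else:
            words.add(child)
    return words


def concept_search(topic, docs):
    """Documents containing at least one word of the expanded concept."""
    words = expand(topic)
    return sorted(d for d, text in docs.items() if words & set(text.lower().split()))


DOCS = {
    "d1": "Relevance feedback improves results",
    "d2": "An inverted file structure",
    "d3": "Drawing graphs in the plane",
}
print(concept_search("information retrieval", DOCS))
```

Neither matching document contains the phrase "information retrieval"
itself; they are found through words that the hierarchy places under that
concept.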
-
Several components:
- Topic Agent: intelligent search engine, supports the construction of
networked full-text databases,
- Topic Enterprise Server: full-text database,
- Topic Internet Server: WWW gateway,
- DeveloperKit: API,
- Topic CD Publisher,
- Topic News Server.
-
Search Features: word matching, concept-based search, Boolean logic,
proximity and field search. Fuzzy logic and natural language
capabilities allow search by example.
-
Ranking: fuzzy algorithm with precision calculation for every word.