BibRelEx:
Exploring Bibliographic Databases by Visualization of Contents-Based Relations

Research Status

Bibliographic Databases
Full-Text Databases/Information Retrieval Systems
- Hyper-G
- BASISplus
- Fulcrum
- Topic
Visualization Tools for Large Information Spaces
Annotation Systems
- ComMentor
- Group Annotation Transducer (GrAnT)

Bibliographic Databases

Present-day bibliographic databases can automate tasks such as cataloguing, searching (e.g. book title search, possibly with a thesaurus), navigation, and browsing.

INSPEC [inspec]

INSPEC is a database relevant for physics, electronics, and computing. It contains citations with abstracts. Sources for INSPEC include primarily journals and conference proceedings, along with books, reports, and dissertations. Bibliographic information, indexing terms, abstracts, property information, and element terms are all searchable. An online thesaurus is available.
Over five million citations are available (3/96). INSPEC is updated weekly and covers publications since 1969.

COMPUSCIENCE [COMPUS][fachinf]

COMPUSCIENCE is a bibliographic database covering literature in the field of computer science and computer technology, with emphasis on software, computer systems organization, information systems, computer graphics, new generation computing, and artificial intelligence. It aims at complete coverage of European and American publications. The database is produced by FIZ Karlsruhe in co-operation with ACM, EACTS, GMD and GI. As a consequence, some entries occur more than once. Citations are in English and contain bibliographic information and indexing terms. Many records also include an abstract, partly in English and German. The citations are classified according to the ACM Classification Scheme [ACM91].
The database comprises more than 441,210 records (1/97) and is updated monthly with about 2,500 citations.
Search criteria are author, title words, keywords, words in the abstracts, and ACM classification codes [ACM91].
The search can be constrained by entry date and document type, e.g. ALGORITHM, CONFERENCE, REPORT.

geombib

The Geometry Literature Database is a BibTeX database of papers in computational geometry, maintained as a collective effort by members of the computational geometry community, under the gentle supervision of Bill Jones at the University of Saskatchewan. It contains over 8000 entries.
For detailed information about the database, see Jeff Erickson's WWW page about the Geometry Literature Database.

Facet-Space Interface [Allen96]

Faceted thesauri and classifications are quite common; examples include AAT, EI, ERIC, INSPEC, LISA and MeSH. However, these systems do not take advantage of the classification in the user interface. The facet-space interface integrates classification system browsing, thesaurus-term identification, and object retrieval within one system. The interface has three major components: cascaded facet menus, constraint lists, and a document shelf. The CR-classification labels (defined by the ACM Computing Classification System [ACM91]) are chosen from the cascaded menus. The constraint list presents the facets which are active, and the shelf is updated with articles that match the constraints. Thus, the context of the currently chosen constraints is always evident.
The implementation is applied to 1381 summaries of computer science dissertations as organized by the ACM Computing Reviews classification system [ACM91]. (5/96)
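
The interplay of constraint list and document shelf can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the `shelf` function, and the sample classification labels are hypothetical and are not taken from the facet-space implementation.

```python
# Hypothetical records: each document carries a set of classification labels.
documents = [
    {"title": "Thesis A", "labels": {"H.3.3", "I.3.6"}},
    {"title": "Thesis B", "labels": {"H.3.3"}},
    {"title": "Thesis C", "labels": {"I.3.6", "H.5.2"}},
]

def shelf(docs, constraints):
    """Return the titles of documents that match every active constraint."""
    return [d["title"] for d in docs if constraints <= d["labels"]]

constraints = set()                    # the constraint list starts empty
constraints.add("H.3.3")               # a label chosen from a cascaded menu
print(shelf(documents, constraints))   # shelf after the first constraint
constraints.add("I.3.6")               # a second constraint narrows the shelf
print(shelf(documents, constraints))
```

Because the shelf is recomputed from the full constraint list each time, the displayed documents always reflect exactly the constraints that are currently active.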

ARIADNE - a computer science information retrieval system [ARIADNE]

ARIADNE is a tool for the interactive production and distribution of up-to-date computer science information in the World Wide Web. It is developed by the MeDoc-Project. A user can participate in the information exchange by adding new sources, by giving comments concerning events, sources, publications etc., by reclassifying entries, by supervising the quality of information via moderation, and by electronic discussions.
ARIADNE manages URLs in a central database and classifies them using the ACM Computing Classification System [ACM91].
ARIADNE started with 8071 URLs gathered from existing bibliographies (Bibliography server by M. Ley, Collection of Computer Science Bibliographies by A. Achilles) and from data gathered from several Information and Resource Discovery Systems.
Additional services offered by ARIADNE include an interface to other information sources and services such as databases, bibliographies, libraries, publishers, software repositories, technical report servers and document delivery services.
ARIADNE offers a profile service which keeps the user informed about changes at selected URLs.

Science Citation Index [SCI][Göb96a][Göb96b]

The Science Citation Index (SCI) is a reference tool which presents bibliographic data about published journal articles in all fields of science, technology and biomedicine. Approximately 7000 research journals are indexed. The journals are chosen partly according to their importance, i.e. ISI preferably covers journals which are frequently cited. Work published outside these journals cannot be retrieved.
As a complement to conventional bibliographic databases, the SCI offers cited reference searching. A register of all cited authors enables a forward-directed search, e.g. discovering all articles which cite a selected author. In addition to the bibliographic data, the bibliography given by the author of each article is retrieved. This enables a backward-directed search as well.
Since 1986 the Science Citation Index has appeared quarterly on CD-ROM. SciSearch is the online version of the SCI. It is accessible via the hosts STN International (FIZ Karlsruhe), DataStar, DIALOG or DIMDI. SciSearch contains nearly 16 million (in Sep 95) cited references since 1974. SciSearch's weekly updating gives extremely fast access to the international literature of scientific and technical research.
In comparison to BibRelEx the SCI doesn't support:
- the visualization of results,
- annotations,
- the distinction between private and public share.
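
The forward and backward searches offered by a citation index can be illustrated with a small sketch. The article identifiers and the `build_citation_index` helper are hypothetical, and the sketch indexes cited articles rather than cited authors to stay short; it does not reflect how the SCI is actually implemented.

```python
from collections import defaultdict

# Hypothetical data: each article maps to the list of articles it cites.
references = {
    "A": ["C", "D"],
    "B": ["C"],
    "C": ["D"],
    "D": [],
}

def build_citation_index(refs):
    """Invert the reference lists: cited article -> articles citing it."""
    cited_by = defaultdict(list)
    for article, cited in refs.items():
        for target in cited:
            cited_by[target].append(article)
    return cited_by

cited_by = build_citation_index(references)
forward = sorted(cited_by["C"])   # forward search: which articles cite C?
backward = references["A"]        # backward search: which articles does A cite?
```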

Querying, Navigating and Visualizing a Digital Library Catalog, Georgia Institute of Technology [VeeNav95]

The user starts with a free form textual query. The system consults an on-line thesaurus and offers words and phrases related to the query words. At any point the user can drag-and-drop any of the related words/phrases into the positive or negative window. The negative words are included in the query with a NOT operator.
The query result is a ranked list of document titles. Clicking on any title with the mouse brings up the full document.
The interface provides visual feedback to the user about how the query words influence the ranking of retrieved documents: for every query word the system displays a bar chart. The leftmost column of bars corresponds to the top-ranked document, with the columns progressing to the right representing progressively lower-ranked documents.
The user can classify any document as relevant or non-relevant by dragging and dropping it into the positive or negative window. The words in the selected documents are used to expand the query in the next iteration. Thus an improved ranking of documents is achieved.
The interface has facilities to browse the table of contents of publications and to browse the list of articles written by a specific author.
In comparison to BibRelEx, this system has the following properties:
- searching by query,
- overview diagrams are not available,
- relations between the documents are ignored.
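
The query loop described above (positive and negative words, followed by expansion from documents judged relevant) might look roughly like the sketch below. The matching rule (any positive word, no negative word) and the frequency-based choice of expansion words are assumptions made for illustration; the source does not specify how the system selects or weights words.

```python
from collections import Counter

def matches(doc_words, positive, negative):
    """A document matches if it contains some positive word and no negative word."""
    words = set(doc_words)
    return bool(words & positive) and not (words & negative)

def expand_query(positive, relevant_docs, top_n=1):
    """Add the most frequent new words from documents judged relevant."""
    counts = Counter(w for doc in relevant_docs for w in doc if w not in positive)
    return positive | {w for w, _ in counts.most_common(top_n)}

positive = {"graph"}
negative = {"database"}
relevant = [["graph", "layout", "force"], ["graph", "layout", "tree"]]
expanded = expand_query(positive, relevant)   # "layout" occurs in both relevant documents
```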

Full-Text Databases / Information Retrieval Systems

Current systems typically provide full-text scanning (for small databases), inversion (for large databases) or clustering. The major ideas from clustering are relevance feedback (mostly based on a vector space model) and the ability to provide ranked output (i.e., documents sorted by relevance).
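
As a minimal sketch of ranked output in a vector space model, documents and query can be represented as term-count vectors and sorted by cosine similarity. The whitespace tokenization, the sample documents, and the function names are assumptions for illustration only.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, docs):
    """Return document ids sorted by decreasing similarity to the query."""
    q = Counter(query.split())
    scores = {doc_id: cosine(q, Counter(text.split())) for doc_id, text in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)

docs = {
    "d1": "citation graph visualization",
    "d2": "full text retrieval",
    "d3": "graph visualization of citation graph links",
}
ranking = rank("citation graph", docs)
```

Here `d1` ranks above `d3` because the cosine measure normalizes for document length, and `d2` comes last because it shares no term with the query.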

Hyper-G [hyperg] [DalHey95][KaMaSh93][GMRS96]

Hyper-G is a client-server based internet information system. The main goals are to provide automatic structuring and maintenance of a large number of distributed hypermedia documents, to provide orientational and navigational aids to the user, to support user identification and access control, and to provide access to existing information systems like the WWW.

Information can only be added in a structured way and only with tools provided by Hyper-G.
Contrary to usual Web practice, links are stored in separate databases and they are bidirectional. Links can be followed back to their source. The separate link database has a number of advantages: As links are separated from the documents, links can be included in all kinds of documents, including text, images, video clips, and 3D scenes. Links to a document can be removed automatically when the document is removed. This ensures referential integrity, i.e., no links pointing to documents which no longer exist. The link structure is easy to visualize.
Documents are grouped into collections, which may recursively be members of other collections. Each document must belong to at least one collection. Documents and collections may belong to multiple parent collections, opening up the possibility of providing multiple views of the same information. The navigation follows along the collection hierarchy.
Search facilities are fully integrated into Hyper-G. As documents and collections are inserted into the database, they are automatically indexed. This allows full text searches and boolean searches of all or some collections. Collections can be marked by the user, and only the marked collections will be searched. Documents and collections have attributes, e.g. author, title, keywords and creation date which may be used in searches. As all documents are automatically indexed when they are inserted into a collection, the database size does not seem to affect the speed of the search.
Hyper-G supports a Unix-like security system. The system administrator controls the rights to read, modify and link to documents at the document or collection level. Each user has a home collection for personal documents. Documents may also be annotated, and annotations may be private, public or group.
Existing Web browsers can be clients of Hyper-G servers (they just won't get all the features), and Hyper-G browsers can be full-featured clients of normal Web servers. Similarly, it includes seamless access to popular Internet server technologies, such as WAIS and Gopher.
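
The idea of a separate, bidirectional link database with referential integrity can be sketched as follows. The `LinkDatabase` class and its method names are hypothetical and only illustrate the principle; they are not Hyper-G's actual data model or API.

```python
class LinkDatabase:
    """Links stored apart from documents, indexed in both directions."""

    def __init__(self):
        self.outgoing = {}   # source document -> set of target documents
        self.incoming = {}   # target document -> set of source documents

    def add_link(self, source, target):
        self.outgoing.setdefault(source, set()).add(target)
        self.incoming.setdefault(target, set()).add(source)

    def sources_of(self, target):
        """Follow links backwards: which documents point at this one?"""
        return set(self.incoming.get(target, set()))

    def remove_document(self, doc):
        """Drop every link from or to a removed document."""
        for target in self.outgoing.pop(doc, set()):
            self.incoming[target].discard(doc)
        for source in self.incoming.pop(doc, set()):
            self.outgoing[source].discard(doc)

db = LinkDatabase()
db.add_link("a", "b")
db.add_link("c", "b")
db.add_link("b", "d")
```

Because the documents themselves never store links, removing a document only requires updating the two indexes, and no dangling link can remain.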

BASISplus [BASIS][GMRS96]

BASISplus is a professional document management system. In addition to BASISplus (the full-text search engine based on a relational database management system) the BASIS WEBserver (WWW gateway) and the BASIS SGMLserver (management of SGML documents) are offered.

A ranking is defined by associating weights with the individual search terms.
The BASISplus Thesaurus supports the ANSI-defined thesaurus relationships including synonym identification, concept hierarchies, clarification of ambiguous terms, automatic or interactive term switching and alternative language searching. Users may also define their own word relationships. The central thesaurus manager defines and manages several different thesauri simultaneously. Any standards-compliant thesaurus (e.g. Roget, Houghton-Mifflin) can be uploaded and used with BASISplus.
Security features: user ID and password protection may be applied to entire databases, index files, documents, fields, subfields, and read/write privileges. Safety-critical documents can be encrypted.

Fulcrum [Fulcrum][GMRS96]

Three components:
- SearchServer: search engine,
- SearchBuilder: GUI tool,
- Surfboard: WWW gateway.
The human-computer interface must be programmed completely. Access to the data is based on the (command-line-oriented) SearchSQL.
Ranking: Fulcrum offers different ranking methods, e.g. number of matches, term frequency, term weights (e.g. TFIDF).
Support for over 150 document formats (including Adobe PDF, Microsoft Office, Word, Powerpoint and Excel) is enabled by using filters and additional attributes. In addition to a full-text field, a record can contain any other fields common in relational databases.
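
For readers unfamiliar with TFIDF term weights, the following sketch shows one common textbook variant (term frequency times the logarithm of inverse document frequency). It is not Fulcrum's formula, which the source does not give; the function name and sample documents are assumptions.

```python
import math

def tfidf_scores(query_terms, docs):
    """Score each document by the summed TF-IDF weight of the query terms."""
    n = len(docs)
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    scores = {}
    for doc_id, words in tokenized.items():
        score = 0.0
        for term in query_terms:
            tf = words.count(term)                                  # term frequency
            df = sum(1 for w in tokenized.values() if term in w)    # document frequency
            if tf and df:
                score += tf * math.log(n / df)
        scores[doc_id] = score
    return scores

docs = {
    "d1": "sql search engine",
    "d2": "search search gateway",
    "d3": "document filter",
}
scores = tfidf_scores(["search", "sql"], docs)
```

The rare term `sql` contributes more than the common term `search`, so `d1` scores higher than `d2` even though `d2` contains `search` twice.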

Topic [Topic][GMRS96]

Topic uses a knowledge-based approach based on the idea of concept retrieval. Instead of words or phrases it uses hierarchically arranged topics to search for entire concepts.
Topic consists of several components:
- Topic Agent: intelligent search engine that supports the construction of networked full-text databases,
- Topic Enterprise Server: full-text database,
- Topic Internet Server: WWW gateway,
- DeveloperKit: API,
- Topic CD Publisher,
- Topic News Server.
Search Features: word matching, concept-based search, Boolean logic, proximity and field search. Fuzzy logic and natural language capabilities allow search by example.
Ranking: fuzzy algorithm with a precision calculation for every word.

Visualization Tools for Large Information Spaces

LyberWorld, GMD, IPSI [Hem95][Thiel95][Engl95]

LyberWorld is a 3D graphical user interface for the probabilistic retrieval system INQUERY. It models the information space as a network of documents and terms.
LyberWorld provides two visualization tools:
- NavigationCones help to navigate along the query paths. The retrieved items are shown on the surface of a cone, alternating between document and term nodes. Selecting a document title on a cone displays a sublevel cone with terms; selecting a term produces a cone with document titles and so on.
- RelevanceSpheres help to judge the relevance of items within the result space. Terms selected in the NavigationCones are displayed as spheres equally distributed on the surface of the RelevanceSphere. The position of each document node is calculated by adding the attraction vectors between it and each of the term nodes. The resulting vector will determine its position within the sphere, using the sphere centre as its origin. Additional features allow the user to interact with the RelevanceSphere. Users can change parameters such as the document density, the term attraction, and the scaling of the visualization.
Compared to BibRelEx, this system has the following properties:
- It reduces the document-term network to a tree,
- It provides only the link type keyword.
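
The placement of a document inside the RelevanceSphere, as described above, amounts to adding one attraction vector per term with the sphere centre as origin. The sketch below assumes relevance weights in [0, 1], fixed term positions on the unit sphere, and an averaging step that keeps the result inside the sphere; none of these details are given in the source.

```python
def document_position(term_positions, relevance):
    """Add one attraction vector per term, scaled by the document's relevance to it."""
    x = y = z = 0.0
    for term, (tx, ty, tz) in term_positions.items():
        w = relevance.get(term, 0.0)      # assumed weight in [0, 1]
        x, y, z = x + w * tx, y + w * ty, z + w * tz
    n = len(term_positions)
    return (x / n, y / n, z / n)          # averaging keeps the point inside the unit sphere

# Hypothetical term nodes placed on the surface of the unit sphere.
terms = {
    "graph": (1.0, 0.0, 0.0),
    "citation": (0.0, 1.0, 0.0),
    "layout": (0.0, 0.0, 1.0),
}
pos = document_position(terms, {"graph": 0.9, "citation": 0.3})
```

The document ends up closest to the term it is most relevant to, which is the visual cue the sphere is meant to give.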

Butterfly System, Xerox Palo Alto Research Center [MaRaCa95]

Butterfly is a graphical user interface for simultaneously exploring multiple DIALOG bibliographic databases across the Internet. It uses 3D interactive animation techniques. A virtual landscape grows controlled by the user, while asynchronous query processes explore the resulting graph.
First the user starts with queries to find articles in the main areas of interest. Then the user browses through reference and citation links to find related articles. For that purpose the current query result is depicted as a butterfly: the head of the butterfly contains the title, author, year, and journal of the article. The wings of the butterfly display the article's references at the left and the citers of the article found in the Science Citation database at the right. Butterflies with folded wings, which the user has already explored by following link-generating queries from the current butterfly, are displayed at the left and right of the current butterfly.
Moreover a 3D scatterplot provides a complete view of the citation graph which has been explored by the user.
An animation loop buffers the user from long-running asynchronous query processes.
Compared to BibRelEx, this system has the following properties:
- No representation of the complete information space. The scatterplot only gives an overview of retrieved documents and does not support searches.
- The primary goal is a fast user interface in spite of a slow multiple repository.

Narcissus, HyperSpace, University of Birmingham [HDWB95][HenDre95][WBDH95][NarHyp][Young96]

Narcissus is an information visualisation system for complex systems which are composed of a large number of interacting components. Instead of enforcing a predetermined arrangement on the graph, the system allows the set of nodes and arcs to evolve into its own form. This self-organisation is done by randomly placing the nodes in three-dimensional space and allowing a set of forces to act on them until the structure reaches an equilibrium. There are two classes of forces:
- All objects in the system exert a repulsive force on all other objects.
- Active relationships between objects lead to attractive forces being exerted between related objects.
Inside this structure users navigate, select and manipulate objects. The system labels objects with attributes (e.g. their URL). A user can either display these attributes only for selected objects or set a distance threshold beyond which they are not displayed. Colours are used to represent properties of an object. In addition, it is possible to merge a cluster of individual objects into one compound object.
Currently, the behaviour of the objects is very simple and the rules which determine the behaviour are the same for all objects. In a future extension the system is going to provide a richer set of behaviour. The behaviour will depend on the object instance. Robots will wander through the structure and manipulate the objects and their behaviour.
Hyperspace is a system to visualize parts of the WWW. It is based on the Narcissus information visualisation system. Relationships between the objects are determined by hypertext links. Pages directly connected by links are placed close to each other. Pages which contain links to the same pages are considered to be related, too. As the forces are proportional to the numbers of links an object has, indexes and bibliographies stand out by clearing a space for themselves. Objects which are linked to several indexes appear in the gaps between them. Thereby the organisation of the pages is related to their content.
The system makes use of the Mosaic WWW browser API for data collection. If a page is visited, all children of this page and their relations with all the previously visited pages are added to the representation. Therefore the system supports only a local view.
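
The self-organising layout with its two classes of forces can be sketched as a simple iteration. The force laws (inverse-square repulsion, spring-like attraction), the constants, and the fixed number of iterations in place of an equilibrium test are all assumptions; the source states only that all objects repel each other and related objects attract.

```python
import math
import random

def layout(nodes, edges, iterations=500, step=0.1, seed=0):
    """Place nodes randomly in 3D, then let repulsion and attraction act on them."""
    rng = random.Random(seed)
    pos = {n: [rng.random() for _ in range(3)] for n in nodes}
    linked = {frozenset(e) for e in edges}
    for _ in range(iterations):
        force = {n: [0.0, 0.0, 0.0] for n in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    continue
                delta = [pa - pb for pa, pb in zip(pos[a], pos[b])]
                dist = max(math.sqrt(sum(d * d for d in delta)), 0.05)
                # Every object repels every other object (assumed inverse-square law).
                magnitude = 0.01 / dist ** 2
                # Related objects additionally attract each other (assumed spring law).
                if frozenset((a, b)) in linked:
                    magnitude -= 0.1 * dist
                for i in range(3):
                    force[a][i] += magnitude * delta[i] / dist
        for n in nodes:
            for i in range(3):
                pos[n][i] += step * force[n][i]
    return pos

positions = layout(["a", "b", "c"], [("a", "b")])
```

In this example the linked pair `a` and `b` settles at a short distance, while the unrelated object `c` drifts away from both.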

Navigational View Builder, Georgia Institute of Technology [MukFol95][MuFoHu95][RoCaMa93][Young96]

The Navigational View Builder is intended for interactively developing effective overview diagrams of hypermedia systems.
It uses structural and content analysis to reduce hypermedia graphs to pre-trees. A pre-tree is a mixture of a graph and a tree: it has a root which does not have any parent node, but the descendants of the root may be arbitrary graphs, with the restriction that there is no cross connection between them.
These hierarchies may be visualized as 2D trees, treemaps, cone trees or as tables of contents. A TOC is formed by listing nodes in depth-first order.
Users guide the process during the translation of the graph into a tree, e.g. by assigning different weights to different link types. Visualizing the tree is controlled by assigning different visual attributes to different information attributes in the views.
To reduce the complexity of the overview diagrams, the system supports contents-based and structure-based filtering algorithms.
Once a hierarchy is formed from the original graph structure, other views can be developed as well. For example, a perspective wall view of WWW pages as a linear structure sorted by the last-modified-time can be formed. The perspective wall folds the linear structure into a 3D wall that smoothly integrates a central region for viewing details with two perspective regions, one on each side, for viewing context.
Some differences from BibRelEx:
- based on fixed given graphs,
- supports no navigation.
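
To illustrate how link-type weights can guide the translation of a graph into a tree, and how a table of contents follows from a depth-first listing, here is a simplified sketch. The greedy strategy, the link types, and the function names are assumptions; the actual system works with pre-trees and also applies content analysis, which this sketch omits.

```python
def reduce_to_tree(links, root, weights):
    """Greedily grow a tree from the root, always adding the heaviest link
    that leads from a node already in the tree to a node not yet in it."""
    children = {root: []}
    while True:
        candidates = [(weights[kind], src, dst) for src, dst, kind in links
                      if src in children and dst not in children]
        if not candidates:
            return children
        _, src, dst = max(candidates)
        children[src].append(dst)
        children[dst] = []

def table_of_contents(children, node, depth=0):
    """List the tree nodes in depth-first order, indented by depth."""
    lines = ["  " * depth + node]
    for child in children[node]:
        lines.extend(table_of_contents(children, child, depth + 1))
    return lines

# Hypothetical hypermedia graph: (source, target, link type).
links = [
    ("home", "papers", "structural"),
    ("home", "staff", "structural"),
    ("staff", "papers", "reference"),
    ("papers", "p1", "structural"),
    ("staff", "p1", "reference"),
]
toc = table_of_contents(
    reduce_to_tree(links, "home", {"structural": 2, "reference": 1}), "home")
```

Changing the weights changes which links survive: if `reference` links are weighted higher than `structural` ones, `papers` and `p1` end up below `staff` instead.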

Graphical Fisheye Views, Brown University [SarBro94][Young96]

Graphical fisheye views are a technique for viewing and browsing large graphs. In a fisheye view, an area of interest is shown quite large and in detail. The viewer's point of interest is called the focus, and each vertex is assigned a user-defined a priori importance (API). The remainder of the graph is shown successively smaller and in less detail, depending on the distance from the focus and on the API. Furthermore, the user can control the minimum level of visual worth (determined by the distance between a vertex and the focus and by the API) that a vertex needs in order to be displayed.
The main difference from BibRelEx is that the graph and the APIs are predetermined.
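
The filtering by visual worth can be sketched as below. The formula used here (API minus the number of edges between vertex and focus) is a deliberately simple stand-in chosen for illustration; the source states only that visual worth depends on the distance from the focus and on the API, and does not give the actual function.

```python
from collections import deque

def graph_distances(adjacency, focus):
    """Breadth-first search: number of edges from the focus to each vertex."""
    dist = {focus: 0}
    queue = deque([focus])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def visible_vertices(adjacency, api, focus, threshold):
    """Keep vertices whose visual worth (assumed: API minus distance) reaches the threshold."""
    dist = graph_distances(adjacency, focus)
    worth = {v: api[v] - d for v, d in dist.items()}
    return {v for v, vw in worth.items() if vw >= threshold}

# Hypothetical path graph a - b - c - d with one important vertex c.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
api = {"a": 1, "b": 1, "c": 3, "d": 1}
shown = visible_vertices(adjacency, api, focus="a", threshold=0)
```

With the focus at `a`, the distant but important vertex `c` stays visible, while the equally distant region around `d` is suppressed.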

IVEE (Information Visualization and Exploration Environment), SSKKII, Department of Computer Science, Chalmers University [AhlWis95a][AhlWis95b][Wis95][Engl95][KuPlSh95]

The Information Visualization & Exploration Environment (IVEE) is a slider-based system for interactive information exploration using dynamic queries.
Queries are composed by the conjunction of all query components defined by widgets. Existing query widgets are:
- rangesliders to select a numeric value between a minimum and a maximum,
- alphasliders for selecting from a large set of strings,
- category selectors for selecting one or many categories of a categorical attribute.
Three 2D visualization methods (scatter plot, maps and clustering) are implemented in IVEE. Supporting more types of visualizations, such as treemaps and 3D scatter plots, is a goal of future work.
Manipulating a widget immediately affects the visualization, so the user gets direct feedback. Detecting anomalies and clusters is very simple.
The focus of IVEE is on database querying. In contrast to BibRelEx additional structures like hyperlinks are not visualized. Furthermore IVEE provides no representation of relations between documents, e.g. citation relationships.
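
A dynamic query formed as the conjunction of widget components can be sketched with plain predicates. The widget constructors, attribute names, and sample records are hypothetical and are not IVEE's API; the alphaslider is omitted for brevity.

```python
def range_slider(attribute, low, high):
    """Query component: numeric attribute must lie within [low, high]."""
    return lambda record: low <= record[attribute] <= high

def category_selector(attribute, chosen):
    """Query component: categorical attribute must be one of the chosen values."""
    return lambda record: record[attribute] in chosen

def run_query(records, components):
    """The query is the conjunction of all widget components."""
    return [r for r in records if all(c(r) for c in components)]

records = [
    {"title": "P1", "year": 1993, "type": "CONFERENCE"},
    {"title": "P2", "year": 1995, "type": "REPORT"},
    {"title": "P3", "year": 1996, "type": "CONFERENCE"},
]
components = [
    range_slider("year", 1994, 1996),
    category_selector("type", {"CONFERENCE"}),
]
hits = run_query(records, components)
```

Re-running `run_query` whenever a widget changes is what gives the user immediate feedback in the visualization.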

Hyperbolic Browser, Xerox Palo Alto Research Center [LaRaPi95][Engl95]

The hyperbolic browser is a system to visualize and manipulate large hierarchies by means of hyperbolic trees. The representation is similar to Graphical Fisheye Views. In contrast to the work of Sarkar et al., where conventional 2D layout techniques are used, the hyperbolic browser exploits hyperbolic geometry. The essence of the approach is to lay out the hierarchy on a hyperbolic plane and map this plane onto a circular display region. The hyperbolic browser handles arbitrarily large hierarchies, with a context which includes as many nodes as are included by 3D approaches and with modest computational requirements.
In comparison to BibRelEx, this system has the following properties:
- It supports only tree hierarchies and
- it provides only local views.
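
One standard way to map the hyperbolic plane onto a circular region is the Poincare disk model, in which a point at hyperbolic distance r from the origin is drawn at Euclidean radius tanh(r/2). The sketch below assumes this model and uses a hypothetical `to_disk` helper; the source does not say which mapping the browser uses.

```python
import math

def to_disk(r, theta):
    """Map hyperbolic polar coordinates (r, theta) to the unit disk (Poincare model)."""
    rho = math.tanh(r / 2.0)           # Euclidean radius, approaches 1 as r grows
    return (rho * math.cos(theta), rho * math.sin(theta))

# Tree levels placed at equal hyperbolic steps move ever closer to the rim.
radii = [math.hypot(*to_disk(level, 0.0)) for level in range(6)]
gaps = [b - a for a, b in zip(radii, radii[1:])]
```

The shrinking gaps show why nodes near the centre receive most of the display space while distant levels are compressed towards the edge of the circle.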

Annotation Systems

ComMentor, Stanford University [RöMoWi95a][RöMoWi95b]

ComMentor enables annotations on top of the existing WWW infrastructure. The meta-information is managed independently of the documents themselves on separate meta-information servers. Therefore no changes in the original documents are necessary, and annotated documents and annotation sets may be stored on different servers.
Meta-information items like annotations are organized as sets. Access control is managed per annotation set, e.g. private sets, group sets and public sets. Any authentication mechanism (e.g. private keys, public keys) can be used.
The ComMentor system uses a client-side component to query both the meta-information server and the target document server and to synthesize the two responses. This separation of functionality (meta-information server/client-side component) provides more autonomy and scalability, but ties the user interface and query functionality to a particular browser implementation.
Usages of the ComMentor system include public WWW comments, personal annotations, landmarks, and shared hotlists.
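
The organisation of annotations into sets with per-set access control, merged with the unchanged document on the client side, can be sketched as follows. The `AnnotationSet` class, its fields, the example URL, and the `merged_view` function are hypothetical and do not reproduce ComMentor's protocol or data format.

```python
from dataclasses import dataclass, field

@dataclass
class AnnotationSet:
    name: str
    visibility: str                      # "private", "group", or "public"
    owner: str
    members: set = field(default_factory=set)
    annotations: dict = field(default_factory=dict)   # document URL -> list of notes

    def readable_by(self, user):
        if self.visibility == "public":
            return True
        if self.visibility == "group":
            return user == self.owner or user in self.members
        return user == self.owner

def merged_view(document_text, url, sets, user):
    """Client-side step: combine the unchanged document with readable annotations."""
    notes = [n for s in sets if s.readable_by(user) for n in s.annotations.get(url, [])]
    return {"document": document_text, "annotations": notes}

url = "http://example.org/paper.html"    # placeholder URL
sets = [
    AnnotationSet("mine", "private", "alice", annotations={url: ["check proof"]}),
    AnnotationSet("team", "group", "bob", members={"alice"},
                  annotations={url: ["cite in survey"]}),
    AnnotationSet("world", "public", "carol", annotations={url: ["typo in title"]}),
]
view = merged_view("<html>...</html>", url, sets, "alice")
```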

Group Annotation Transducer (GrAnT), OSF Research Institute Cambridge [SchMaBr96]

The GrAnT is based essentially on the ComMentor system. In contrast to the latter the GrAnT system is browser-independent: the system takes advantage of an application-specific stream transducer; most of the functionality of the system is provided by a specialized proxy.
