Document Visualization
Emile Morse
Submitted: December 15, 1997
Revised: January 15, 1998
Table of contents
Abstract
1 Introduction
2 Scope and Definitions
2.1 Documents and Metadata
2.2 Shneiderman Framework
2.3 Information Retrieval versus Information Visualization
3 Document Data Types and Representation
3.1 Linear Text
3.2 Two-dimensional Text
3.3 Three-dimensional Text
3.4 Multidimensional Text
3.4.1 Text Analysis-the basic method
3.4.2 Refinements of the Basic Method
3.4.3 Alternative Methods for Encoding Documents
3.5 Temporal Text
3.6 Trees
3.7 Networks
3.8 Distributed Documents/Workspaces
4 Interface Issues
4.1 Overview
4.2 Zoom
4.3 Filtering
4.4 Details-on-Demand
4.5 Relate
4.6 History
4.7 External Memory/Extracts
5 Task Models
5.1 Wehrend -- a task level user model
5.2 Task Models from Library Environments
5.2.1 Marchionini
5.2.2 Bates
5.2.3 Belkin
5.3 VIRI Research Group Tasks
5.4 Summary of Task Models
6 Examples of Visualization Systems
6.1 Linear Text: TileBars
6.2 Two-dimensional Text: Pad++
6.3 Three-dimensional Text: WebBook
6.4 Multidimensional Text: SPIRE
6.5 Temporal Text: SeeSoft
6.6 Tree Text: Hyperbolic Tree
6.7 Networks: Navigational View Builder
6.8 Distributed Documents/Workspaces: CASCADE
7 Research Opportunities
8 Summary
Appendix A: Metadata
Appendix B: List of Tasks from Shneiderman
Appendix C: Research Design
9 References
List of Tables
Table 1: Shneiderman Classification
Table 2: Comparison of Tasks
Table 3: Information Seeking Dimensions (Belkin et al. 1995)
Table 4: List of Visualization Systems for Documents
List of Figures
Figure 1: Information retrieval model
Figure 2: Information visualization or browsing
Figure 3: TileBars embedded in a Scatter/Gather interface.
Figure 4: PAD++ rendering of a hypertext
Figure 5: WebBook close-up showing a book being riffled through
Figure 6: SPIRE Themescape showing topic distribution in a large document space
Figure 7: SeeSoft showing overview of a software project
Figure 8: Hyperbolic tree representation of document collection
Figure 9: Navigational View Builder showing relationships among a set of documents
Figure 10: CASCADE display demonstrating landmark feature including color-coded links, Mural and TileBar
Figure 11: VIBE display with Displacement feature activated causing tails to appear on document icons
Figure 12: WebVIBE showing the same document collection and POI selections as Figure 11
Evaluating visual interfaces for documents requires knowledge about each of the components of a human-computer system. The computer component manages the document and its various surrogates, indexes and representations. A framework developed by Shneiderman suggests characterizing objects by their data type. The proposed types include linear, 2-D, 3-D, multidimensional, temporal, hierarchical, network and distributed. The interface component of visualization systems is discussed in terms of the functionality provided for interaction of the user and the computer. The basic elements are static overview, and several dynamic requirements, such as zoom, filter and details-on-demand. The user component is discussed from the viewpoint of various taxonomies of elemental tasks that are required for satisfactory goal realization. Finally, existing visual interfaces that support document space exploration are reviewed.
Introduction
Visualization is a cognitive process performed by humans in forming a mental image of a domain space. In computer and information science it is, more specifically, the visual representation of a domain space using graphics, images, animated sequences and sound augmentation to present the data, structure and dynamic behavior of large, complex data sets that represent systems, events, processes, objects and concepts (Williams et al. 1995).
Visualizations may be characterized on a continuum ranging from physically concrete to purely abstract depending on the properties of the objects being rendered. Scientific visualization is mainly concerned with phenomena that are based in the physical world. The most concrete visualizations are renderings of objects as they exist in the world, e.g., a walk-through of a museum. Renderings of building plans could be placed a little further along the continuum since the building does not already exist but is made visible based on physical properties contained in the plan. Models that attempt to render properties that are not visible, such as forces in bridge beams or wind gusts in weather simulations, make visible things that cannot ordinarily be seen. Maps are another instance of visualizations that are rooted in the physical world; they can be used to magnify (e.g., computer chip map) or shrink (e.g., geographical maps). In addition, they may be used to display attributes that are physical (e.g., topography) or abstract (e.g., population density). Molecular modeling is an example of visualization that is both physically motivated and abstract. The objects, i.e., atoms and molecules, are concrete, but because they are invisible, chemists and physicists rely on models that have utility but that are not necessarily faithful representations of the underlying atomic species. Visualization of databases maps best to a portion of the abstract end of the physical/abstract scale. The attributes stored in many databases reflect characteristics of physical objects, but in many other systems, the objects may be abstract or a mixture of both. Since information has "no innate shape or color" (Koike 1993), its visualization has a purely abstract character.
Information visualization covers areas such as visual reasoning, visual data modeling, visual programming, visual information retrieval and browsing, visualization of program execution, visual languages, visual interface design, and spatial reasoning.
People have tremendous perceptual abilities for visual information. Visualizations rely on the fact that users can distinguish positions, colors, textures, and relationships. Relationships can be shown in such displays by proximity, by containment, by connected lines, by color-coding, etc. Fields containing hundreds or thousands of points can be scanned rapidly and efficiently for clusters, outliers, trends, and gaps. Attention can be drawn to salient items using a variety of techniques including highlighting, blinking, motion, and size. Direct manipulation of visualizations can be accomplished with a variety of methods, such as pointing to select, dragging, and zooming. Feedback is immediate and intuitive in such environments. "The eye, the hand and the mind seem to work smoothly and rapidly as users perform actions on visual displays" (Shneiderman 1996, p. 340).
Examples of and guidelines for good graphical displays are common (e.g., Tufte 1983, 1990, 1997, Bertin 1983, Cleveland 1985). These guidelines are descriptive rather than prescriptive. They focus mainly on the data and not on the tasks that the user might need to perform with the data (Casner 1991). There is evidence that task performance is sometimes superior with graphical displays but in other situations, textual displays are better. Larkin & Simon (1987) investigated the usefulness of graphical displays in human task performance. They found that there were two ways in which graphical presentations could support more efficient task performance: 1) by allowing the substitution of rapid perceptual inferences for difficult logical inferences, and 2) by reducing search for information required for task completion. This study gives some theoretical support to add to the intuitive appeal of using graphics.
Until now the discussion has centered on what visualization is and why it is interesting to consider as an alternative to textual information. The next topic pertains to documents, including why and how they can be the objects of visualization. Documents are important sources of information. A document's information is found not only in its content, but also in its metadata and its structure. Document metadata consists of elements such as author, publisher, and date of publication. Keywords comprise an intermediate form of document data; they can be viewed as both content and metadata. Metadata forms the primary type of information in library systems. The World Wide Web contains large volumes of documents that are poorly characterized in terms of both metadata and content descriptions. Automated methods for indexing documents have become an important research issue since there are not enough human resources available to index the documents that are published either in paper or in electronic formats. The methods for indexing documents are normally lexically based, but lexical information is not the only kind available in the native full-text. Issues at the data level include how to prepare adequate document representations or surrogates and how to integrate metadata and derived indexes. Section 3 will apply a taxonomy developed by Shneiderman (1996, 1998) to support visualization. Details of this taxonomy, as originally proposed, can be found in the Scope and Definitions chapter.
The second dimension of the Shneiderman taxonomy presents a set of functionalities required of any computer-human interface. Section 4 presents an overview of these interface functionalities with respect to visualization systems. Discussing the interface requires attention both to how the computer presents information to the user and to how the user communicates his needs to the computer. For this reason the discussion includes methods for mapping computer representations of documents to interface objects and interaction techniques.
Much effort is devoted to developing visualization systems but there is less attention to developing schemes for testing whether these systems are useful in meeting the needs of users. Part of the problem with setting up a testing plan for visualization systems resides in the fact that these systems are purported to be exploration, creativity or browsing environments. As such, it seems to be an enormous undertaking to define at any level what the component tasks of such environments might be.
If one intends to be able to evaluate visual presentations of data derived from documents, it is important to understand both the data source and the tasks of the users who need to accomplish document-centered goals. Section 5 describes three different ways that tasks can be categorized. The first is a low level task analysis of Wehrend & Lewis (1990) that seeks to determine, based on elemental data objects, the types of tasks that can be developed. The second approach to developing a task taxonomy comes from the library science literature and is of the naturalistic field testing variety. The final task set is a preliminary, high-level breakdown proposed by the VIRI (Visual Information Retrieval Interface) research group. The sixth section describes several current visualization systems with examples being taken from each of the main data type categories.
The basic components of a human-computer system are the data, the user, and the interface between them. This paper will address 1) how documents in a system are processed for presentation in the interface, 2) how the data are mapped to interface objects and what kinds of interactions should be supported by the interface, and 3) the tasks that characterize the user model in the system. The organization of the discussion mirrors these three major elements. Section 3 concentrates on developing the Shneiderman framework as it might be applied to documents. Section 4 reports on the methods that can be used to render overviews in computer displays and on requirements for effective interaction with displays. Section 5 presents several alternative task models that have been used to characterize the human user of information systems. Finally, Section 6 presents examples of visualizations that derive from each of the major document data types described in Section 3.
Documents and Metadata
"Ask any group of ten information scientists to define 'document' and you will get ten different answers." - Spring (1991)
According to Spring (1991, p 8), "A document is an identifiable entity, having some durable form, produced by a person or persons toward the goal of communication and may take a number of forms, but must have at least one symbolic manifestation that can be comprehended by humans." Buckland (1997) presents an interesting historical treatment of document definitions. Various viewpoints have been held in the last 150 years. Otlet (Buckland 1997) regarded not only graphic and written records to be documents, but also objects that could inform observers, e.g., natural objects, artifacts, explanatory models, educational toys, archaeological finds, and sculpture. Briet (1951) has also taken an ecumenical view of documents and presented a discussion about whether or not an antelope was a document. Her conclusion was that a free-living antelope in the wild was not a document but that a specimen in a zoo was certainly a document. For the practical purpose of presenting visual representations of documents, it is most often plain text that is processed to generate vectors, indexes, and surrogates used in systems. This paper focuses on this more constrained view of documents as plain text.
Metadata in the current context refers to information about a document rather than to the content of the document itself. Author, publisher, date of publication, and even keywords constitute metadata. Standards are being developed for use on the WWW and in traditional library systems that seek to codify metadata. Ng et al. (1997) present an overview of the metadata standardization issues, including a discussion of the similarities and differences among the competing candidate standards. They mention the Dublin Core, URC, USMARC, IAFA, and TEI header methods. Such schemes will provide some basic structure to a large class of documents that are largely unclassified. A table of metadata classifications is presented in Appendix A.
A final issue related to document spaces concerns granularity. Spring (personal communication) has suggested that a proper framework would include: document components, documents, document sets, document collections, and document analytics. Components can be defined as paragraphs, sections, or chapters of their parent document. Sets are groups of documents that are fewer in number than the full collection. Analytics are essentially metadata. Successful visualizations must clearly distinguish the grain size that they attempt to render. In this paper there is no attempt to divide applications explicitly along these lines. It should be clear in each case, however, whether a visualization is based on single documents, sets of documents or collections of documents.
The framework suggested by Shneiderman to support research in visualization is two-dimensional. The first dimension is the data-type of the objects to be represented in the interface. He lists seven types in an early paper (Shneiderman, 1996) and eight in later versions found in his textbook (Shneiderman 1998) and at the University of Maryland website. The types are linear, planar, volumetric, temporal, multidimensional, tree, network, and workspace. Workspace is the type that was added in the later versions. The second dimension is a task typology and includes: overview, zoom, filter, details-on-demand, relate, history, and extract. The scope of the Shneiderman framework is the entire domain of visualization. Table 1 provides a graphical view of the framework. It is pertinent to note that both of these dimensions are very high-level, more qualitative than quantitative. The purpose of the original framework was "to sort out the prototypes [that currently exist] and guide researchers to new opportunities" (Shneiderman 1996); the goal of the current examination is the same.
Table 1: Shneiderman Classification
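| | Interface Functionality |||||||
| Document Data Types | Overview | Zoom | Filter | Details-on-Demand | Relate | History | Extract |
| Linear | | | | | | | |
| 2-Dimensional | | | | | | | |
| 3-Dimensional | | | | | | | |
| Multidimensional | | | | | | | |
| Temporal | | | | | | | |
| Hierarchical | | | | | | | |
| Network | | | | | | | |
| Distributed | | | | | | | |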
The retrieval process in the traditional view is quite simple. Information is stored and later retrieved when it is needed. On-line retrieval systems typically consist of a large document database. Terms that describe the document contents (index) are selected from manual or automatic indexing. The index terms are descriptors of the represented document. Queries are requests to process information and a search query consists of different terms combined in a structured query language. The traditional information retrieval paradigm is a matching process according to the similarity between the keyword index entries and the search query. The problem is to find all and only the relevant documents. To evaluate the retrieval results two statistics are used: recall, the percentage of relevant documents found and precision, the percentage of the documents found that are relevant. From the evaluation of the retrieval results one can formulate a new query. The traditional model of information retrieval is shown in Figure 1. Problems can arise in this model when the number of retrieved documents is very large or when the language used to specify the query is poorly matched to the real information need of the user.
Figure 1: Information retrieval model
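The two evaluation statistics defined above can be made concrete with a minimal sketch. It assumes that the retrieved documents and the relevant documents are both available as collections of identifiers; the function name and the identifiers in the usage note are illustrative only and do not come from any particular retrieval system.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) as fractions between 0 and 1.

    recall    = relevant documents found / all relevant documents
    precision = relevant documents found / all documents found
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision
```

For example, if a query returns documents d1 to d4 and the relevant set is d2, d4 and d7, two of the three relevant documents are found (recall of two thirds) and two of the four returned documents are relevant (precision of one half).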
Using an information visualization as the user interface could help to overcome the problems of oversized result sets and poorly matched query language. As shown in Figure 2, the abstract data model represented by the index is visualized as an information space.
Figure 2: Information visualization or browsing
The user interacts directly with the visualization to express his needs. The query in this model is stated implicitly in the view, and it is filtered and refined through manipulation of the interface. Navigation inside the information space helps the user follow a context-oriented search path to find domains of interest. Relevance in the visual setting is not necessarily a predetermined characteristic of a document. The interaction of the user with the interface supports browsing, creativity and constant refinement of the original statement of the goal of the search. Since relevance is not definable in the information visualization paradigm, assessment using recall and precision is impossible. New methods need to be developed to evaluate systems based on this model.
"Information representation is multifaceted and flexible." - Gershon (1995)
Although the inspiration for this paper was through the work that has been done on developing interfaces to support information retrieval, it is much larger in scope than the view used by most workers in information retrieval, which is to view documents solely as multidimensional objects. In this subsection text will be viewed variously as streams of words, flows of topics, collections of metadata, and even by reference to its physical manifestation in paper form. Shneiderman (1996, 1998) has suggested that a data type by interaction type framework could ground work in visualization as a whole. He proposed the types adopted here and has extended them recently (http://www.umd.cs.edu/users/north/infoviz.html).
The current adaptation to a text-only environment has changed somewhat the classification scheme and many of the implementations fall in categories different from those suggested by Shneiderman. The use of this organization is not meant to imply that these are all the groupings that might be applied to text nor are the assignments the only ones possible. The structure is meant to be fluid.
Viewing documents as streams of words has a great deal of similarity to assessing speech. Techniques that are applied to spoken words can be adapted to analyzing written words. Speech is examined at several levels, including phonology, morphology, syntax, and semantics. A contextual method that has been applied to written text is discourse analysis. Hearst and Plaunt (Hearst & Plaunt 1993, Hearst 1994) have investigated using a statistical parser to segment text into topical elements. As text is scanned from top to bottom, a sliding window can be programmed to process chunks of the text. The output is analyzed to determine when a subtopic is being introduced. The method, called TextTiling, is a motivated segmentation that reflects a text's underlying subtopic structure, which can span paragraph boundaries. TextTiling is a two-step process. The first step compares adjacent blocks of text and assigns a similarity value; the blocks are usually 3-5 sentence units. The second step involves graphing the resulting similarity values and smoothing the generated curve. Peaks in the curve indicate regions of high subtopic coherence, whereas valleys indicate evidence for topic switching. Large expository documents were subjected to testing by asking volunteers to perform a topic identification task (Hearst & Plaunt 1993). The results showed that there was a high degree of correlation between the judgments of the subjects and the TextTiling algorithm.
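The two steps can be illustrated with a simplified sketch. It is not Hearst's implementation: the word-count cosine measure, the moving-average smoothing, and all function names and default parameters are assumptions chosen for brevity. The sketch only shows the general shape of the computation, i.e., compare adjacent blocks, smooth the curve, and look for valleys.

```python
import math
import re
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def block_vector(sentences):
    """Word counts for a block of sentences."""
    return Counter(w for s in sentences for w in re.findall(r"[a-z]+", s.lower()))

def gap_similarities(sentences, block_size=3):
    """Step 1: similarity of the blocks before and after each sentence gap."""
    sims = []
    for gap in range(1, len(sentences)):
        left = block_vector(sentences[max(0, gap - block_size):gap])
        right = block_vector(sentences[gap:gap + block_size])
        sims.append(cosine(left, right))
    return sims

def smooth(values, width=1):
    """Step 2a: moving average over a window of 2 * width + 1 points."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - width):i + width + 1]
        out.append(sum(window) / len(window))
    return out

def valleys(values):
    """Step 2b: indices of strict local minima, i.e., candidate topic switches."""
    return [i for i in range(1, len(values) - 1)
            if values[i] < values[i - 1] and values[i] < values[i + 1]]
```

An index i returned by valleys refers to the gap between sentence i and sentence i + 1 (counting sentences from zero), so a valley at index 3 suggests a subtopic boundary after the fourth sentence.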
This approach to analyzing documents is related to several other methods. Salton & Buckley (1991) have used author-supplied orthographic markup to segment documents into paragraphs. Whereas Hearst's motivation in performing text segmentation (Hearst, 1993) was merely to determine where topic boundaries occurred, Salton & Buckley (1991) sought to discover the content of the individual segments. Stanfill & Waltz (1992) created text segments by dividing documents into 30-word blocks. The results of their study can be compared with the variant employed by Hearst which she terms 'unmotivated' segmentation. When the two are compared, 'motivated' segments are shown to produce superior recall-precision statistics (Hearst & Plaunt 1993).
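For contrast, fixed-size blocking of the kind attributed above to Stanfill & Waltz can be sketched in a few lines. Whitespace tokenization and the function name are assumptions of this sketch; only the block size of 30 words comes from the text.

```python
def fixed_blocks(text, size=30):
    """Split text into consecutive blocks of `size` words; the last may be shorter."""
    words = text.split()
    return [words[i:i + size] for i in range(0, len(words), size)]
```

Because the boundaries depend only on word position, a block may begin or end in the middle of a subtopic.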
There are several ways that text can be viewed in two dimensions. The first way to view text as 2-D is to focus on the characteristics of the text as it appears on a page. The key feature is the formatting, such as paragraphs, headings, tables, and general use of 'white space'. The 2-dimensional view of text is especially productive of metaphors. Several visualizations have been developed that build on the tangibility of printed matter. People dog-ear pages to provide bookmarks and they underline and annotate. The printed page is familiar and provides a great deal of utility apart from its primary function; it is only reasonable that graphical interface designers would borrow from it. Rather than trying to determine semantics of the page's content, implementers only show pictures of the page. This is a very simple but versatile mechanism for conveying to a user something about a text. Zooming in on a page reveals successively more and more detail and allows a user to orient himself with respect to the organization of material.
The second way that text can be characterized as two-dimensional is not truly a function of the data source but is a derived measure. If a document can be characterized by a low-dimensional vector, then standard graphical methods, such as pie charts, histograms, scatterplots, and line graphs, can be applied to rendering the document space. Strictly speaking, the renderings are 2-dimensional but the number of attributes that can be mapped to such a space is greater than two. For instance, in a scatterplot each object exists at a particular x-y coordinate, and may have an associated shape, size, color, texture, etc. Each of these features may represent different attributes of the item being represented. These graphical representations are important since they are so common in everyday experience. People have years of training in interpreting such graphs. In addition, considerable work has been performed that allows automatic generation of graphs (Mackinlay 1986, Casner 1991, and Roth et al. 1994). Most of the work done on auto-generation has concentrated on relational data. The metadata that is usually available for documents is relational and might be amenable to viewing in such systems.
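The point that a 2-dimensional rendering can carry more than two attributes can be sketched as a mapping from document metadata to visual channels. The field names, the palette, and the function name are hypothetical; the sketch produces plain mark descriptions rather than calling any particular plotting library.

```python
def to_marks(records, x, y, color=None, size=None):
    """Map metadata fields to scatterplot channels: position, color and size."""
    palette = ["red", "blue", "green", "orange"]  # arbitrary illustrative palette
    categories = sorted({r[color] for r in records}) if color else []
    marks = []
    for r in records:
        mark = {"x": r[x], "y": r[y]}
        if color:
            # Categorical attribute: one palette entry per distinct value.
            mark["color"] = palette[categories.index(r[color]) % len(palette)]
        if size:
            # Numeric attribute: passed through as the glyph size.
            mark["size"] = r[size]
        marks.append(mark)
    return marks
```

With four channels in use, each point in the plot encodes four metadata attributes of one document, even though the display itself has only two spatial dimensions.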
Just as the planar page can give rise to a unique view of text, a view of books as 3-dimensional objects can also serve to characterize another useful view. The tangibility of books, the feel of the pages, their location in physical space, the color of the bindings are but a few of the characteristics that are part of the 3-dimensional aspect of text. This view has given rise to several popular metaphors for graphically rendering documents, including the desktop, piles, and rooms.
Mander et al. implemented the "Pile" metaphor (1992). This is analogous to a pile of documents on a desk: the documents retain the order in which they were placed in the pile and some of their appearance, e.g., color. The pile of documents is displayed as a small perspective drawing. Piles created by the user have a disheveled appearance, and those created by the system (perhaps as the result of a database query) appear neat. The design includes a gesture for spreading out the elements of the pile so they are all visible (a horizontal back and forth movement), and a gesture for starting to browse the pile elements (an up and down motion). The browsing operation uses a viewing cone, where the miniature document is displayed facing the viewer on the base of a pyramid pointing back towards the document's position in the pile.
The Rooms system (Henderson and Card 1987) exchanges the idea of a single extended virtual surface for a collection of virtual screens of normal size. The reasoning is that typical work patterns are clustered into a collection of tasks between which people switch, and these tasks are not spatially related. The system also allows a window to appear in more than one room, and even to have a different location and shape depending on what room it is being seen from. An extension of this system to three dimensions is described in Robertson et al. (1993).
The use of natural language processing to generate better document vectors has been the object of intense investigation for a long time. Methods for detecting phrases (Croft et al. 1991) and for extracting names (Rau & Jacobs) and topics (Hahn 1990) have enriched the arsenal of information retrieval (IR) researchers. The advent of full-text rather than mere surrogates has opened the question of whether the old methods, which were developed to handle short text pieces, would scale up to handle full-text. The evidence shows that there is some degradation of processing effectiveness (Blair & Maron 1985). One of the possible factors that inhibits scalability is that long pieces of text are actually strings of related and dependent ideas whose major theme emerges from their juxtaposition. In order to capture the meaning of these longer texts there has been a considerable effort to detect and encode the content of subpassages of documents.
The purpose of this section is to guide the reader to an appreciation of the difficulty in producing the requisite data for visualizing text. The starting material for text characterization is usually full-text, but in some cases only surrogate documents comprised variously of title, authors, abstract, citation list are used. Methods for processing these text pieces are generally lexical in nature. Systems that are more ambitious employ syntactic and semantic parsing. There is some evidence that detection of phrases is useful in improving the effectiveness of retrieval. Other methods rely on neural networks to detect patterns in text. There is some intriguing evidence that the purely statistical methods and neural networks produce results that are highly similar (Schütze et al. 1995). The problem with all these methods is similar to the problem of people trying to understand each other. The overriding hope is that the words that are spoken or written convey some meaning that is intended by the speaker/writer and understood by the listener/reader. To ask machines to do what people often fail to do is a big task. The goal of all the methods is to capture some core essence of a passage, document or collection. The hope is that the content being examined is sufficiently clear, long enough, redundant on topic and sparsely populated by extraneous material. Two important criteria bear investigation:
Willett (1988), Schütze et al. (1995) and Lewis & Sparck Jones (1996) have presented reviews of data generation methods. The essential thing to keep in mind when performing visualizations based on this type of data is that the data is fuzzy at best. The computer slogan 'garbage in, garbage out' serves as a warning to those who attempt making pictures of questionable data.
Text Analysis-the basic method
Regardless of whether a Boolean, extended Boolean, fuzzy Boolean, probabilistic, or vector model is used for information retrieval, the document is represented in the computer system as a vector of terms. In some cases, the vector contents are binary (0, 1) to represent the presence or absence of a term. Other systems use numeric values to indicate the strength of a relationship between a document and a term element. The permissible range of values is of little consequence; systems using values between zero and one are common, as are those that use positive integers. The first step in processing any text collection is to count the frequencies of words in the texts. Usually one or more stop lists are employed at this stage in order to speed up processing and to generate more meaningful term sets. A generic stop list contains words that are too common in the language to allow reasonable retrieval characteristics, e.g., 'the', 'a', 'an', 'of', 'that'. Additional stop lists may be employed in a particular domain to prevent inclusion of words that are prevalent in the local environment, e.g., 'rock' in a geology textbase, or 'computer' in a computer and information science collection. In addition, words are usually stemmed by any of several methods (e.g., Lovins (1968), Porter (1980)) so that the set of potential keywords is compacted.
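The counting step can be sketched as follows. The generic stop list is the one quoted above; the suffix stripper is a deliberately crude stand-in and is not the Lovins or Porter algorithm, and all function names are illustrative assumptions.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "of", "that"}  # generic stop list quoted in the text

def naive_stem(word):
    """Toy suffix stripper; a placeholder for a real stemmer such as Porter's."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

def term_counts(text, domain_stop_words=()):
    """Count stemmed terms, skipping generic and domain-specific stop words."""
    stop = STOP_WORDS | set(domain_stop_words)
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        stem = naive_stem(word)
        if word in stop or stem in stop:
            continue
        counts[stem] += 1
    return counts

def binary_vector(counts, vocabulary):
    """Presence (1) or absence (0) of each vocabulary term in a document."""
    return [1 if term in counts else 0 for term in vocabulary]
```

Replacing binary_vector with the raw counts, or with weights scaled into the range zero to one, gives the numeric variants mentioned above.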
The resulting raw count data is subjected to further processing by several methods. Depending on the domain and size of the collection, the number of terms that may be identified at this stage may be in the range of several hundred to tens of thousands or more. Among the most common methods used at this point are: normalization for document length, application of a term discrimination method, term intercorrelation determination, and thesaurus expansion.
Normalization for document length
Collections can vary widely in the size of documents that they contain. A book and an abstract might both contain the same number of occurrences of a particular term. It is clear that, in this case, the term is probably a better descriptor of the shorter document. In order to control for document length, it is customary to normalize the term counts for document length. The necessity for this correction factor depends on the similarity measure chosen for subsequent calculations. If the cosine is used to determine similarity, then no correction need be applied. The process of weighting by frequency of occurrence in the total document collection is an attempt to normalize document representatives with respect to expected frequency distributions.
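A short sketch may clarify why the cosine needs no separate length correction. It assumes documents are represented as term-to-count dictionaries and that document length is measured in words; both function names are illustrative.

```python
import math

def normalize_by_length(counts, doc_length):
    """Convert raw term counts to relative frequencies."""
    return {term: count / doc_length for term, count in counts.items()}

def cosine(a, b):
    """Cosine similarity between two term-weight dictionaries."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0
```

Dividing every count in a vector by the document length rescales the vector without changing its direction, and the cosine depends only on direction. The cosine between two documents is therefore the same before and after this normalization.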
Term discrimination value
Thus far, a list of words or stems has been produced together with a frequency of occurrence of those elements in each text of a collection. The only adjustment has been for document length. One of the major purposes of a term list is to allow a user to appreciate differences and similarities among texts. Terms that appear in nearly every document are useless for this purpose, as are terms that occur rarely. The inverse document frequency in one of several forms is applied to normalize for term set size (Harman 1992). Alternatively, a commonly applied heuristic for the lower bound is that a term should appear in over 20% of all documents. Similarly, terms that appear in over 80% of all texts can be ignored. The term discrimination value is another method for determining which terms provide the best indexing terms for a collection of documents (Salton 1989). The hundreds or thousands of terms generated during the concordance phase of text processing can be viewed as a multidimensional term space within which the documents are suspended. It is theoretically possible to determine the effect of adding or removing a term on the placement of documents in the space. If adding (or deleting) a term causes a significant change in the shape of the space, then the term is considered important. If adding (or deleting) a term produces little effect on document distribution, then it could probably be ignored. The Exact method of Willett (1985) compares each multidimensional document descriptor with each other document vector using the cosine similarity measure. Terms that produce positive cosine values indicate 'good' discriminators; terms that produce negative cosines are useful for dissecting out regions of space that indicate 'not'; intermediate and zero values are neutral for the process of discriminating.
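The frequency-based filtering mentioned at the start of this subsection can be sketched as follows. The form log(N/df) is only one of the several forms of inverse document frequency referred to above, and the default thresholds simply restate the 20% and 80% heuristic from the text; the function names are illustrative.

```python
import math

def document_frequency(docs):
    """Number of documents in which each term occurs at least once."""
    df = {}
    for terms in docs:
        for term in set(terms):
            df[term] = df.get(term, 0) + 1
    return df

def inverse_document_frequency(docs):
    """One common form of IDF: log(N / df)."""
    n = len(docs)
    return {term: math.log(n / d) for term, d in document_frequency(docs).items()}

def filter_terms(docs, low=0.2, high=0.8):
    """Keep terms found in more than `low` and at most `high` of all documents."""
    n = len(docs)
    df = document_frequency(docs)
    return sorted(term for term, d in df.items() if low < d / n <= high)
```

A term that occurs in every document receives an IDF of zero under this form, which matches the observation that such terms are useless for telling texts apart.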
The Exact method is an O(n2) process. Even though the calculation of discrimination values is not performed dynamically during a browsing or retrieval session, the number of terms can lead to processing times in the order of tens of hours even on powerful processors. The method described by Salton (1989) proposes to calculate a centroid document which is used for comparison with each document vector. This process is clearly O(n). A study by Crouch (1988) showed that the results of using this approximate method was as good as the exact method in terms of specificity of term identification with the expected huge reductions in processing time.
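The centroid approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited study: the discrimination value of a term is taken to be the density of the space without the term minus the density with it, where density is the average cosine between each document and the centroid. The function names and the term counts are invented.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def centroid(docs):
    """Average document: the mean count of every term."""
    n = len(docs)
    return [sum(column) / n for column in zip(*docs)]

def space_density(docs):
    """Average cosine between each document and the centroid (one pass, O(n))."""
    c = centroid(docs)
    return sum(cosine(d, c) for d in docs) / len(docs)

def discrimination_values(docs):
    """Density without term k minus density with all terms, for every k.

    A positive value means that removing the term packs the documents
    closer together, so the term was helping to tell them apart.
    """
    base = space_density(docs)
    values = []
    for k in range(len(docs[0])):
        reduced = [d[:k] + d[k + 1:] for d in docs]
        values.append(space_density(reduced) - base)
    return values

# Invented counts: term 0 occurs equally in every document,
# terms 1 and 2 each occur in only half of the collection.
docs = [
    [5, 4, 0],
    [5, 3, 0],
    [5, 0, 4],
    [5, 0, 3],
]
dv = discrimination_values(docs)
```

In this toy collection the ubiquitous term receives a negative value and the two selective terms receive positive values, in line with the observation that terms appearing in nearly every document do not help to separate texts.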
Intercorrelation determination
The terms identified by either a pure concordance or those filtered by calculation of term discrimination value (TDV) are likely to be intercorrelated, i.e., different terms produce exactly the same documents in response to a query. The implication is that the number of terms can be reduced without affecting the quality of the index terms. In addition, a reduction of correlating terms is indicated in the situation of vector model retrieval, in which a usual assumption is that the terms are pairwise orthogonal. Raghavan and Wong (1986) present a detailed description of the side effects of violating this assumption. They admit, however, that applications based on vectors as a notational convenience rather than a formal model of IR concepts have been successful. Clustering methods are frequently used to identify terms that co-occur. The review by Willett (1988) presents a lengthy discussion of the available methods and the advantages and disadvantages of each. Chen et al. (1995) review various methods and present results derived using several different clustering algorithms.
Thesaurus expansion
Thesauri can be applied to document collections to generate broader, narrower, synonymous and related terms. Research in this area comprises both creation of and use of thesauri. Chen et al. (1995) describe a method for creating a thesaurus using multiple sources. In addition to using the methods described thus far in this paper (term frequency, document frequency, weighting for length, and co-occurrence analysis), they subjected the term lists to one of two generative methods. In the first, they treated the terms as a single collection, regardless of source. In the other, they processed separately the terms from each of four different sources about the same topic. Their study concentrated on trying to determine if better methods could be devised for coping with the problems of information overload and language fluidity. This seems to be a major thrust of automatic thesaurus generation research: automating takes care of the 'overload' problem and creative indexing takes care of the 'fluidity' problem.
The work of Losee and Haas (1995) is a typical study in the field of thesaurus development. Their work concentrates on sublanguages, the languages used by people working in a particular field or discipline. This area is particularly concerned with language that is changing rapidly to accommodate advances in science. Although all languages undergo gradual change, the world of scientific endeavor experiences even more rapid turnover due to the introduction of new concepts that need to find expression. A related problem is the borrowing of terms from one discipline to cover the needs of another. For automatic indexing systems, it is a special problem to know what the introduction of new terms might imply.
Although keywords and vector representations are the most commonly encountered methods of representing text, especially in situations in which automatic encoding is desired, e.g., large on-line collections and/or the WWW, there are significant advantages to using different approaches to text processing. For instance, several of the projects from Xerox PARC employ citation tracing to support browsing of large information stores found in distributed sites (Mackinlay et al. 1995). The researchers undertaking these projects cite the utility of using the built-in schemes of large IR suppliers such as DIALOG. One of the side effects is the ability of such systems to use querying based on relational databases. While it would be difficult to encode in vector form the information about the year of publication, the names of the authors, or similar demographic information, systems that rely on relational databases can use this information quite effectively. Several projects are attempting to merge the two approaches to characterizing text, statistical and relational database (Blair 1988, Croft & Parenty 1985, Lynch & Stonebraker 1988, McLeod & Crawford 1983). Considerable interest exists but there is also much dissension regarding the proper methods to use (DeFazio et al. 1995). If the information sources that eventually become available include significant amounts of classical database material, then some of the methods that have been developed for visualizing databases will become immediately applicable to the visualization of document information.
The method called latent semantic indexing or LSI (Deerwester et al. 1990) seeks to leverage the correlations among terms in documents to yield superior indexing parameters. The method reduces the dimensionality required to render a document space. LSI uses a singular-value decomposition (SVD) method. A term-document association matrix is constructed using at least 100 terms. Transformation using SVD produces a series of matrices that have reduced dimensionality. In fact, this method generates orthogonal variables, which as mentioned earlier are a requirement for implementation of formally correct vector models (Raghavan and Wong 1986). Deerwester et al. (1990) showed that LSI was superior to several other methods with respect to both precision and recall. This method has been incorporated into other IR systems; e.g., Schütze et al. (1995) have found that LSI provides superior pre-processing for neural network inputs.
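In matrix terms, the decomposition described above can be written as follows. The symbols are chosen here for illustration and are not taken from the cited work: $X$ is the $t \times d$ term-document matrix, and $k$ is the number of dimensions retained.

```latex
X = U \Sigma V^{\mathsf{T}}, \qquad
X \approx X_k = U_k \Sigma_k V_k^{\mathsf{T}}, \qquad k \ll \min(t, d)
```

Here $U_k$ and $V_k$ hold the first $k$ columns of $U$ and $V$, which are orthonormal, and $\Sigma_k$ is the diagonal matrix of the $k$ largest singular values. $X_k$ is the best rank-$k$ approximation of $X$ in the least-squares sense, so each document can be described by $k$ coordinates instead of $t$ term counts.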
Kohonen maps have also been used to characterize information spaces (Lin 1991, 1992, 1997). Lin has developed displays that can show both content and structure of a document space. He provides as inputs to his algorithm N-dimensional vectors. Through a series of iterations of weight adjustments, the system converges. Sample experiments are described in which input vectors consist of a hundred to more than a thousand elements. The outputs were mapped to grids that were either 10 by 14 or 14 by 14. The mapping that is produced has large areas for concepts that are focal in the collection and smaller areas for less well-mentioned topics. In the examples shown (Lin 1997) the reduction in dimensionality was on the order of 10:1 or greater.
Temporal Text
The other forms of documents that have been considered thus far in this paper are alternative representations that can be generated with the materials at hand. Temporal data is both the same and different. Time, as a dimension, is the same as linear or low-dimensional data when one considers the content of a document, e.g., the timeline in a novel or news story. It is different when considered as metadata, e.g., creation date or date of last reading. Liddy (1995) has explored extracting temporal information from text in a system called CHESS. CHESS automatically creates a knowledge base which aggregates information about any named entity (people, places, events, organizations, companies or ideas) and organizes that knowledge into a timeline which covers the entire period of the knowledge base.
Documents are created and edited in time. In paper form text is finalized and published. Although electronic text is said to be published, it is more difficult to say that it is actually finalized. There is no guarantee that the content might not be changed, more words added, sections removed, the whole reorganized or it might even disappear. Most documents do not have a version history. Computer programs have such records if they are maintained in a version control system. Similarly, legal documents and many electronically managed documents are tracked temporally. Some of the proposals for metadata standards include expanded temporal data fields (Ng et al. 1997).
Docubase management is an issue that is becoming widely discussed. If changes to documents are to be recorded, what granularity of change should be used? Prep Editor (Neuwirth et al., 1990, 1992, 1994) is a system that has implemented variable-granularity diffing in order to present users with various levels of changes of text over time. Each view of the document can be filtered to show the desired amount of detail in the editing process. Temporal issues are very important in Groupware settings. When documents are created and/or modified by more than one person, it is important that each participant know who made a change and when it was made.
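The question of granularity can be made concrete with a small sketch that reports the same revision at two levels of detail. It uses Python's standard `difflib` module and is only an illustration; it is not the algorithm used by Prep Editor, and the function name `changed_units` and the sample sentences are invented.

```python
import difflib
import re

def changed_units(old, new, unit):
    """Return (tag, old_piece, new_piece) for every non-equal diff region.

    unit="word" compares word by word; unit="sentence" compares whole
    sentences, so adjacent edited sentences collapse into a single change.
    """
    if unit == "word":
        split = str.split
    elif unit == "sentence":
        split = lambda text: re.split(r"(?<=[.!?])\s+", text.strip())
    else:
        raise ValueError("unit must be 'word' or 'sentence'")
    a, b = split(old), split(new)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

old = "The draft is short. It cites two papers. It has no figures."
new = "The draft is long. It cites three recent papers. It has no figures."

word_changes = changed_units(old, new, "word")
sentence_changes = changed_units(old, new, "sentence")
```

At word granularity the revision appears as two separate replacements; at sentence granularity it appears as one changed region spanning the first two sentences. A reader who wants only an outline of the editing history would prefer the coarser view.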
Many types of data lend themselves to representations as trees, including structured documents, directories, and some kinds of hypertext (those that have no cyclic links). Many approaches have been developed to render these spaces. Conventional methods merely draw a tree as large as it needs to be and then render an image that is controlled with scroll bars. This process has the problem that the user is prevented from seeing the overall structure and must keep most of a large space in memory rather than in view. Although by their very nature, trees can be rendered in a plane, there is no satisfactory 2-D layout of a large tree (Lamping et al. 1995). In order to make room for the leaf nodes, the nodes near the root must be placed far apart.
Clearly trees are useful for representing large collections of documents, but single documents are also amenable to tree representations if the underlying structure of the document is hierarchical. There is a movement toward representing text structurally. SGML is a prime example of an effort to systematize document structure. Editors that are used to create SGML-compliant text maintain document structure as trees. In SGML trees, the content of a document resides in the leaf nodes of the tree.
Many views of documents can be thought of as networks. Queries, semantic networks, associative thesauri and hypertexts can all be represented as networks. Multidimensional data, discussed above, differ qualitatively from network data in that the latter have dependencies among the parts. Multidimensional scaling methods tend to drive concepts apart, i.e., to find orthogonal dimensions, while networks assume dependencies among the concepts being manipulated.
Although paper hypertexts exist (Ted Nelson's Literary Machines is probably the most famous), the importance of document networks rests on the fact that the Internet is based on hypertext. Documents are connected to other documents through links and nodes. Attempts to bring order to the potential chaos of hyperlinks run amok have come in the form of several proposed standards. The Dexter (Halasz & Schwartz 1994) and Amsterdam Models (Hardman et al. 1994) are the primary examples.
Network displays can represent more general and more complicated structures than hierarchical displays. The complexity of the information spaces when expressed as networks can be difficult for users to comprehend. A major issue then is how to simplify such displays without losing critical information. One method for reducing complexity is to reduce the dimensionality of the space. Latent semantic indexing (LSI) is a method that can be applied to reduce dimensionality. Furnas et al. (1994), however, suggest that too much information would be lost if a high-dimensional space were to be reduced to a small number of dimensions that could be rendered in two dimensions.
Working groups need various types of support but within the context of this paper the only type of information that is pertinent is the documents that these groups create or manage. Increasingly groups share documents and the management of these texts is handled by Groupware systems. All of the above views of documents are relevant in the context of group work. Each of the views can be exploited to support richer environments for groups of authors. As noted earlier, the temporal dimension is particularly important in distributed situations. Making one person aware of what changes have been made is qualitatively different from reminding a sole author of what he had done. The work of Neuwirth et al. (1990, 1992, 1994) and Greenberg et al. (1994, 1996, 1997) with Prep Editor and GroupKit, respectively, is notable in this context. The results of studies performed on these systems show that designers need to provide different support for workers depending on whether or not they are co-located and whether they work synchronously, asynchronously, or in mixtures of these conditions.
The types of data that need to be kept are related to the kinds that are kept in a document versioning system or a database. Greenberg et al. (1994), writing about GroupKit, discuss concurrency control issues and their effect on the groupware user interface. These investigators examine common strategies such as serialization and locking. Nichols et al. (1995) discuss many of the same issues in the context of the Jupiter project. Jupiter is a multi-user, multimedia virtual world which supports shared documents, shared tools, and, optionally, live audio/video communication. The success of this project was in part determined by the centralized architecture and optimistic concurrency control algorithm used to maintain common values for all instances of shared widgets. These investigations make clear that one of the greatest obstacles to distributed document sharing is a determination of the appropriate granularity for subdividing documents. Overly large fragments prevent smooth interaction; fragments that are too small can congest systems.
"The purpose of computing is insight, not numbers." - Hamming (1962)
Overview
For a visualization to be effective, it must provide the user with a sense of the overall composition and layout of the space. For complicated displays such as those that attempt to render large hierarchies using trees or any representation of a large document collection, this task is not as straightforward or obvious as it sounds.
Several issues arise when a data set is to be mapped to an interface, such as how to make the best mapping of the attributes of the data to attributes of objects in the interface. Spring & Jennings (1993) have provided a comprehensive account of the dimensions that might be mapped in an interface. They categorize each of the stimuli as to its suitability to map to data depending on whether the data is nominal, ordinal, interval or ratio. Bartram (1997) has recently raised the issue of incorporating motion as a key feature of complex displays due to its easy perceptibility. Other constraints apply when deciding how to map data since different stimuli convey different degrees of salience to users. For instance, it has been shown that tilted lines are more readily apparent than vertical lines. Similar observations were made with respect to curvature, color, line ends, movement, closure, contrast, and brightness (Treisman 1986). In addition, the relative order of noticeability of some stimuli has been determined, e.g., color > line > tilt > angle (Cleveland 1985).
Another issue that arises when addressing the overview of visualizations is how to fit large spaces on the screen and still allow some appreciation of the detail that resides there. Toward this end, the fish-eye view has been developed (Sarkar et al. 1994). The space with a fish-eye lens on it is distorted so that the view is expanded under the lens. Problems can occur if large areas of the screen are distorted; many types of tasks cannot be performed under these conditions, such as comparing two points that are of different magnifications.
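The distortion can be illustrated with one widely used one-dimensional fisheye function, G(x) = (d + 1)x / (dx + 1), where x is the normalized distance from the focus and d is a distortion factor. Whether the cited system uses exactly this function is an assumption; the sketch below only shows why positions near the focus are expanded and positions near the edge are compressed. The function names are invented.

```python
def fisheye(x, d):
    """Map a normalized distance x in [0, 1] from the focus to its distorted position.

    d >= 0 is the distortion factor; d = 0 leaves the layout unchanged.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be normalized to [0, 1]")
    if d < 0:
        raise ValueError("d must be non-negative")
    return (d + 1) * x / (d * x + 1)

def magnification(x, d):
    """Local magnification: the derivative of fisheye() with respect to x."""
    return (d + 1) / (d * x + 1) ** 2

near_focus = magnification(0.0, 3)   # region under the lens is expanded
at_edge = magnification(1.0, 3)      # region far from the lens is compressed
halfway = fisheye(0.5, 3)            # a point halfway out moves toward the edge
```

With d = 3 the magnification at the focus is sixteen times the magnification at the edge, which is the source of the difficulty in comparing two points that lie at different distances from the lens.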
Projection onto a hyperbolic surface has also been used to fit large data sources onto a single screen (Lamping et al. 1995). This method is suitable for some types of data, such as hierarchies that can be viewed as trees and some networks, but is not a general solution. Munzner (1997) has extended the work on hyperbolic trees to a virtual 3-D rendering. This method of presentation lessens the perceived distortion and enhances the user's interaction via direct manipulation.
Work on 3-dimensional and virtual reality displays is highly evident. Examples of such systems that are used in visualizing documents include VR-VIBE (Benford et al. 1995), LyberWorld (Hemmje et al. 1994), SPIRE (Wise et al. 1995), and Bead (Chalmers 1993, 1996). Whether users perform better in 3-D environments than in 2-D environments has not yet been tested.
Zooming is the technique for allowing a user to select a smaller region of the screen for display. Scrolling is an alternative to zooming but suffers greatly by comparison. Since only a portion of the display can ever be visible at one time, pieces of information that are at opposite ends of the display will never be subjected to some types of evaluations. Zooming includes any change in view from a larger portion to a smaller portion of a field or vice versa. As such, it is possible to implement zooming as a discrete number of intermediate views. Usually such views are available simultaneously to help the user to preserve his sense of place. However, smooth zooming is increasingly available.
Smooth zooming has been incorporated into many of the currently available visualizations including PAD++ (Bederson & Hollan 1994) and the Document Lens (Robertson & Mackinlay 1993). The availability of fast algorithms and state-of-the-art hardware that incorporates graphical routines has made rapid screen update rates possible. Smooth zooming helps users maintain their sense of position and context (Schaffer et al., 1996). Variations in zoom techniques include the capability to move in more than one plane. In PAD++ the user needs only to hold down the mouse button on a location and the view will be transformed to move that region to the center of focus. To the user it appears that she has walked a straight line toward the region (Bederson & Hollan, 1993).
Zooming is a method that has been widely used in virtual worlds. In fact, it is difficult to imagine a VR system that would not provide smooth movement of the virtual body through space. Although the issue of mapping using natural objects and landscapes might have fit more appropriately under the overview section, it seems mandatory to talk about it when movement is being addressed. George Robertson in a presentation at CMU (February 1997) made the observation that system designers who do not use a real-world metaphor in their interfaces are ignoring the fact that users live in a real world and know how to move there very well. He said "It is at their peril that designers will use any interface metaphor that doesn't incorporate what the user knows about moving in the real world." He included not only natural environments but virtual worlds in which the objects might be real-world correlates or abstract concepts. This point of view seems overly rigorous, but as a maxim for designers it should give an appropriate warning.
Several projects have been developing systems whose metaphors rest on the notion of using concrete objects and settings. In the document sphere, VR-VIBE (Benford et al. 1995) and Bead (Chalmers 1993, 1996) use a spatial metaphor that creates landscapes that encourage exploration. The Natural Scenes Paradigm project of P.K. Robertson (1991, 1994) gives an approach based on 1) using clearly and easily understood models such as 3-D structures or scenes, 2) representing data variables by the recognizable properties of the objects or scenes, and 3) inducing mental models in the observer's mind by using graphics scene simulation techniques.
Filtering is the activity of weeding out uninteresting elements in a collection. With databases this is accomplished quite easily. Ahlberg has developed an Alphaslider (Ahlberg & Shneiderman 1994a) which maps an alphabetically sorted list to a slider, such that repositioning the thumb causes the list to be traversed in the expected order. The Alphaslider can be found in a variety of projects including FilmFinder (Ahlberg & Shneiderman 1994b), Spotfire (Ahlberg 1996) and HomeFinder (Williamson & Shneiderman 1992). A common term for this type of filtering is dynamic query (Ioannidis, 1996, Fishkin & Stone 1995).
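The core mapping behind such a slider can be sketched in a few lines. This is only the position-to-item mapping, under the assumption that the thumb position is reported as a fraction between 0 and 1; the fine-grained control of the actual Alphaslider is not modeled, and the titles and function names are invented for the example.

```python
def alphaslider_index(thumb, n_items):
    """Map a thumb position in [0.0, 1.0] to an index into a sorted list."""
    if n_items <= 0:
        raise ValueError("the list must not be empty")
    thumb = min(max(thumb, 0.0), 1.0)          # clamp to the slider's range
    return min(int(thumb * n_items), n_items - 1)

# Hypothetical item list, sorted alphabetically as the slider expects.
titles = sorted(["Vertigo", "Alien", "Casablanca", "Metropolis", "Brazil"])

def item_at(thumb):
    """Return the list entry selected by the given thumb position."""
    return titles[alphaslider_index(thumb, len(titles))]

first = item_at(0.0)
middle = item_at(0.5)
last = item_at(1.0)
```

Moving the thumb from left to right therefore traverses the list in alphabetical order, and every position of the thumb selects exactly one entry.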
All of these projects are based on information that is stored in databases. The indexing performed on documents, especially the multidimensional vector type, produces vectors that contain hundreds or even thousands of elements. It is difficult to imagine incorporating an equal number of sliders to control whether or not a factor is to be considered in a display. As Olsen et al. (1997) have noted, querying a database generates an answer that is 100% accurate; there is no concept of recall and precision in databases because there is full recall and total precision. In the instance of docubases that are characterized by large vectors, the issue of similarity arises and how to address this in interactive situations is a question that is currently under investigation. The TREC evaluation series has recently initiated an interactive track that is attempting to answer this question (Over 1996). Using visualizations to represent document sets raises many of the same issues with respect to evaluation that text-based interactive systems do. Perhaps criteria that emerge for assessing the efficacy of text-based systems for interacting with document collections will provide information that will support evaluation of visual presentations of the same material (Newby 1996).
Another part of the problem is the special demands required to merge the data that is stored about documents. Some of the data is essentially metadata and this part is amenable to database treatment. Items such as 'author', 'date of publication', and 'publisher' are easily stored in relational form. The content of a document, the inverted file associated with it, the document vectors, and the other forms such as timeline, topic segmentation, and noun extracts are not so easily stored. The usual method is to perform an SQL search on the part of a query that is suitable (usually the metadata) and then to subject the resulting set to secondary methods, but results of this approach have not been entirely successful (DeFazio et al. 1995).
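The two-stage method can be sketched with an in-memory SQLite table for the metadata and a separate store of term vectors for the content. The schema, the sample records, the vectors, and the function names are all invented for illustration; this is a sketch of the general pattern, not of any cited system.

```python
import math
import sqlite3

def cosine(u, v):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Metadata lives in a relational table (hypothetical schema and rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, author TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, "Smith", 1994), (2, "Jones", 1996), (3, "Smith", 1996), (4, "Lee", 1990)],
)

# Content vectors are kept outside the relational store, keyed by document id.
vectors = {1: [3, 0, 1], 2: [0, 2, 2], 3: [1, 1, 0], 4: [2, 0, 0]}

def two_stage_search(min_year, query_vector):
    """Stage 1: exact SQL filter on metadata. Stage 2: rank survivors by cosine."""
    rows = conn.execute("SELECT id FROM docs WHERE year >= ?", (min_year,)).fetchall()
    candidates = [row[0] for row in rows]
    return sorted(
        candidates,
        key=lambda doc_id: cosine(vectors[doc_id], query_vector),
        reverse=True,
    )

ranked = two_stage_search(1994, [1, 0, 0])
```

In this toy data, document 4 matches the query vector perfectly but never reaches the ranking stage because the metadata filter removes it first. This shows one way in which the exact first stage can constrain the similarity-based second stage.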
At some point in interacting with a visualization system, the user may decide to take a closer look at one or more objects in the field of view. When the requested view provides the content of the object, 'detail-on-demand' has been provided. Most systems support this function and it is usually invoked by clicking on an item or group of items or by allowing the sprite (cursor) to dwell on an object. In the former case, a dialog pops up that contains detailed information. In the latter case, a lens might be provided. Lenses have a variety of appearances but as a group they provide what are commonly referred to as 'see-through tools' (Bier et al. 1994). Zooming can also provide details. When zooming magnifies a piece of a display, the view can show different information at the more detailed level.
Problems can arise in several situations, including 1) when the information that pops up occludes the original view, 2) when the smoothness of the movement from one view to another disorients the users, 3) when the information that pops up is not what the user expects. This last issue brings up the question of what actually constitutes 'detail.' In the case of documents, details can include the full-text that stands behind any representation. However, it may also be the case that details are to be found in another view of the same document. For instance, zooming to a highly clustered region of a visualization and clicking on an icon could indicate a need to see the text of the related document or it could be a request for metadata only.
The relate function seeks to make explicit the relationships between objects in a display. It can also refer to representing relationships between data in multiple associated windows. This function is implemented in a variety of ways. The idea of linking graphical representations is not new. Simple linking can be found in a wide variety of programs, e.g., Bead (Chalmers, 1993), SeeSoft (Eick, 1992, Antis et al. 1996, Baker & Eick 1995), AutoVisual (Feiner & Beshers, 1990), VisDB (Keim & Kriegel, 1994), Nested Histograms (Mihalisin & Gawlinski, 1990), The Table Lens (Rao & Card, 1994), and The Dynamic HouseFinder (Williamson & Shneiderman, 1992). The Alphasliders found in FilmFinder and Spotfire (Ahlberg 1996) are updated to current values each time an object is selected. When performing Dynamic Queries (Ahlberg & Shneiderman 1994), users are shown consistent views onto the data in a similar way. Chuah et al. (1995) have used a similar technique to integrate multiple views that are a mix of tables and visualizations. The users can easily make connections between the pertinent relationships.
Maintaining histories is important for several reasons including placekeeping and supporting the ability to undo actions. Exploration in visualizations is a creative process and involves many sequential user actions to arrive at a satisfactory solution. The ability to retrace steps on a particular path is important. Shneiderman (1998) suggests that "most prototypes fail to deal with this requirement" and attributes this fact to the novelty of such interfaces. Borrowing from classical information retrieval systems would allow users to recover and refine intermediate searches.
Animation of steps might be a useful mechanism for providing path retracing (Gonzalez 1996). A problem with maintaining an adequate history is the considerable resources that are required to maintain the various kinds of information that might be considered salient by the user. In navigating visualization spaces, it might be important to keep track of the landmarks that are visible, the granularity of the zoomable display, and the state of user-selectable options. Step-by-step unwinding is a storage-intensive endeavor.
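A minimal sketch of step-by-step unwinding shows where the storage cost comes from: every recorded step keeps a full snapshot of the view. The class names and the fields of the snapshot (zoom level, center, selected options) are assumptions chosen for illustration and do not describe any cited system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ViewState:
    """One snapshot of the display (hypothetical fields)."""
    zoom: float
    center: tuple
    options: tuple = ()

class History:
    """Keeps every visited state so that steps can be retraced one at a time."""

    def __init__(self, initial):
        self._states = [initial]

    @property
    def current(self):
        return self._states[-1]

    def record(self, state):
        """Store a full snapshot for each user action."""
        self._states.append(state)

    def undo(self):
        """Step back one action; the initial state is never discarded."""
        if len(self._states) > 1:
            self._states.pop()
        return self.current

    def __len__(self):
        return len(self._states)

history = History(ViewState(zoom=1.0, center=(0, 0)))
history.record(ViewState(zoom=2.0, center=(10, 5)))
history.record(ViewState(zoom=4.0, center=(12, 6), options=("labels",)))
restored = history.undo()
```

Because the history grows by one complete snapshot per action, long exploratory sessions with rich view states quickly become expensive to retain, which is the concern raised above.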
Once users have found regions or elements of interest in a visual display, they should be able to save the subsets. Not only might the user desire to save this collection as a new starting point for further study, but she might also like to print or mail it. An alternative to exporting a whole data set could be to save interface settings. This is the approach taken by VIBE (Olsen et al. 1992, 1993). The Visage project supports a drag-and-drop feature that allows inter-application exchange of data (Roth et al. 1997).
Several frameworks for information visualization have been proposed (Kennedy et al. 1996, Rogowitz & Treinish 1993, Wehrend 1990). Some of these structures include modeling of the user. Increasingly, user-centered design is being adopted. In this paradigm, explicit representation of the user is important. The user can be modeled in the system by assessing the user's goals and/or defining the tasks the user needs to perform.
This section will present several task models. Shneiderman labels the preceding discussion on interface functionalities as a task level model (see Appendix B for another Shneiderman view). Evaluation of visual interfaces, however, needs to be more grounded in task models from the viewpoint of the user than from the interface side. Some of the models presented here are domain-dependent and others are independent of domain. The granularity of analysis runs the gamut from very fine-grained to very high level.
A classification scheme supports the development of task sets for system evaluation and lays the groundwork for the development of automatic visualization systems. By knowing the data that exists, the requirements of the interface and the goals of the user, it becomes possible to ask how one might build visualizations automatically. The purpose of this paper is to discuss the issues that contribute to understanding how best to approach the evaluation of document visualization systems.
Wehrend -- a task level user model
The task classification of Wehrend & Lewis (1990) is a low-level, domain-independent taxonomy of tasks that users might perform in a visual environment. Domain-independence allows generalizability. Wehrend & Lewis' classification consists of the following set of user actions.
Some of these tasks are similar to those enumerated by Roth & Mattis (1990) as shown in the following table.
Table 2: Comparison of Tasks
Wehrend & Lewis (1990) | Roth & Mattis (1990)
Identify | Lookup value
Distribute | Distribute
Compare within | Compare within
Compare between | Pairwise or n-wise comparison
Rank | Index a structure by an element
Correlate | Correlate
Task Models from Library Environments
Modeling users in information retrieval situations has a long history in library science. Systems have changed from having only titles and minimal other metadata to having abstracts to the present situation in which most texts are available as full-texts. Systems have increased in capacity to accommodate the requirements of full-text storage and systems have taken advantage of increased computing power to perform searches. Where once an intermediary worked with a user to formulate a query, which would be submitted in essentially batch mode, current systems are used by the end user and searches are interactive. Only recently have visualizations been developed that might help satisfy some of the user's information needs. The models developed by library scientists have changed to accommodate evolving resources.
The following task models developed for use in library environments were chosen to show how varied the approaches are and to describe some models that might actually have some utility in evaluating visual interfaces. Reviews of the historical evolution of information retrieval can be found in Spink (1997) and Bates (1989).
Marchionini
The breakdown of the information-seeking process provided by Marchionini (1992) describes a network of tasks that are performed in various, user-defined orders until the information-seeking problem is solved. Marchionini states clearly that there are two basic forms of information needs - fact knowledge and browsing. The subtasks that he provides are not different for the two types of needs and are:
This particular task list is relevant to interface design but provides little guidance on what subtasks might be. It is also highly grounded in the traditional information retrieval paradigm in that it relies on query formation and an iterative performance of steps to arrive at a satisfactory solution.
Bates
Bates (1989) describes a 'berrypicking' model of information retrieval, which she contrasts with the classical method. Her description of browsing in a world of text seems to offer similarities to visual representations. She presents a list of 6 tasks which are:
Belkin et al. (1995) propose that information seeking can be defined with respect to four dimensions as shown in the following table.
Table 3: Information Seeking Dimensions (Belkin et al. 1995)
Searching as a method of interaction refers to trying to find some known item, while scanning refers to trying to find something interesting. The goal of interaction might be to learn something about an item or it might be to select the item. When looking for items, the user might specify what should be looked for or he might find it by recognizing it. The distinction between information and meta-information is the same distinction that has been made in this paper.
Belkin (Belkin et al. 1995) notes that there are 16 possible information-seeking strategies if each of the components is viewed as a Boolean value. For instance, traditional information retrieval might be characterized as Selecting + Specification + Meta-information + any method of interaction and information visualization would be described as Learning + Recognition + Information + any method of interaction but frequently Scanning.
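The count of 16 follows from treating each of the four dimensions as a two-valued choice. The sketch below enumerates the combinations. The values come from the description above; the labels "method of interaction" and "goal of interaction" follow the text, while "mode of retrieval" and "resource considered" are labels assumed here for the remaining two dimensions.

```python
from itertools import product

# Four two-valued dimensions; the last two dimension labels are assumptions.
DIMENSIONS = {
    "method of interaction": ("Scanning", "Searching"),
    "goal of interaction": ("Learning", "Selecting"),
    "mode of retrieval": ("Recognition", "Specification"),
    "resource considered": ("Information", "Meta-information"),
}

# Every combination of one value per dimension is one strategy.
strategies = [
    dict(zip(DIMENSIONS, combination))
    for combination in product(*DIMENSIONS.values())
]

# Traditional information retrieval as characterized in the text:
# Selecting + Specification + Meta-information + any method of interaction.
traditional_ir = [
    s for s in strategies
    if s["goal of interaction"] == "Selecting"
    and s["mode of retrieval"] == "Specification"
    and s["resource considered"] == "Meta-information"
]
```

The enumeration yields 2 x 2 x 2 x 2 = 16 strategies, and leaving the method of interaction free in the traditional retrieval characterization selects two of them.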
VIRI Research Group Tasks
The VIRI (Visual Information Retrieval Interface) research group developed a set of tasks that we term 'tool-enabled' tasks. The idea behind the name is that visualizations provide ways of doing things that might not be possible or that might be much more difficult using less visual means. The list is as follows:
Each of the task models presented above is incomplete and each is inadequate for supporting the development of an evaluation plan for assessing the usability of information visualization interfaces.
In all of these task sets, it is not clear what fraction of an information browser's tasks is covered by the list.
This section presents samples of visualizations that have been developed to handle data of the various types discussed above. The systems that are presented have been chosen either because they are unique to the category, because they are prototypical, because they are famous, or because they are of personal interest. In each section, it should be noted that the dimensionality of the display is not necessarily mapped in a one-to-one fashion with the text dimension. Obviously, multidimensional data would be impossible to render at all if this were the case. However, linear and low-dimensional data are mapped variously to one, two or three dimensions. Table 4 contains a more complete listing of visualization systems that have applicability to documents.
Table 4: List of Visualization Systems for Documents
| Data Type | System | Reference |
| Linear | TileBars | Hearst 1995 |
| 2-dimensional | Information Mural; Pad++; Perspective Wall; Document Lens | Jerding & Stasko 1995; Bederson & Hollan 1994; Mackinlay et al. 1991; Robertson & Mackinlay 1993 |
| 3-dimensional | WebBook | Card et al. 1996 |
| Multidimensional | Bead; LyberWorld; Themescape / SPIRE; VIBE; VR-VIBE | Chalmers 1993, 1996; Hemmje et al. 1994; Wise et al. 1995; Olsen et al. 1993; Benford et al. 1995 |
| Temporal | GroupKit; SeeSoft; LifeLines; EditWear/ReadWear | Greenberg & Roseman; Eick et al. 1992; Plaisant et al. 1996; Hill & Hollan 1992 |
| Hierarchical | Cone/Cam-Trees; Hyperbolic Trees; 3-D Hyperbolic Trees; TreeMaps; Elastic Windows | Robertson et al. 1991; Lamping et al. 1995; Munzer 1997; Johnson & Shneiderman 1991; Kandogan & Shneiderman 1997 |
| Network | Butterfly Citation Browser; Influence Explorer; Multi-Trees; Navigational View Builder; SemNet | Mackinlay et al. 1995; Tweedie et al. 1996; Furnas & Zacks 1994; Mukherjea & Foley 1995; Lin 1991, 1992 |
| Distributed | CASCADE; GroupKit; Web Forager | Spring et al. 1996; Greenberg & Roseman; Card et al. 1996 |
Linear Text: TileBars
Figure 3: TileBars embedded in a Scatter/Gather interface.
TileBars (Hearst 1995) is shown in Figure 3 as part of a Scatter/Gather (Pirolli et al. 1996, Hearst et al. 1995, Hearst & Pedersen 1996) interface. Each TileBar icon represents a single document and the length of the bar is proportional to the length of the document. Each grayscale block represents a segment of text as determined by the TextTiling method described previously (section 3.1 and Hearst & Plaunt 1993). Dark blocks connote segments with a high occurrence of a term or combination of terms and lighter blocks stand for pieces of text with relatively less of the topic. This particular query was composed of three sets of terms so the display for each document contains one row of blocks for each term set. This display makes some information easy to gather, e.g., relative size of documents, co-occurrence of term sets in a document, absence of a particular concept from a document.
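The mapping from term occurrences to gray levels can be illustrated with a short sketch. It assumes that the text has already been divided into segments (TextTiling itself is not implemented here), and it scales each row so that the segment with the most hits receives the darkest level. The function name, the five-level scale and the per-row scaling are assumptions made for this illustration and are not taken from Hearst (1995).

```python
from math import ceil


def tilebar_rows(segments, term_sets, levels=5):
    """Return one row of gray levels per term set, one cell per segment.

    A level of 0 means no occurrences (white); levels - 1 is the darkest.
    Scaling each row by its own maximum is an assumption of this sketch.
    """
    rows = []
    for terms in term_sets:
        wanted = {term.lower() for term in terms}
        # Count how many words of each segment belong to this term set.
        counts = [
            sum(1 for word in segment.lower().split() if word in wanted)
            for segment in segments
        ]
        peak = max(counts, default=0)
        if peak == 0:
            rows.append([0] * len(segments))
        else:
            rows.append([ceil(c * (levels - 1) / peak) for c in counts])
    return rows


# Three hypothetical text segments and a query with two term sets.
segments = ["cats and dogs", "dogs dogs dogs", "no pets here"]
term_sets = [["dogs"], ["cats"]]
rows = tilebar_rows(segments, term_sets)
for row in rows:
    print(row)
```

Reading down a column of the result shows whether the term sets co-occur in the same segment, and a row of zeros shows that a concept is absent from the document, which are two of the judgments the display is meant to make easy.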
In Pad++ (Perlin & Fox 1993), a document can be visible at any scale or at more than one scale simultaneously. The Pad project explores techniques by which spatial scaling can be integrated into applications. Techniques such as placing microscopic text in place of a footnote marker, applications such as a calendar which reveals finer hierarchical structure as the user approaches, editors for hierarchically structured text, and a multi-scale painting program using wavelets are described in the text.
Figure 4: PAD++ rendering of a hypertext
The view of PAD++ shown in Figure 4 is of a hypertext. Each node shows a text segment at a different level of detail. The developers of this system (Bederson & Hollan 1994) contend that zooming, the primary mechanism of interacting with PAD++, provides superior way-finding in many environments, including hypertext (Páez et al. 1996). The Páez et al. study showed that subjects not only answered more questions correctly and in less time but also reported greater subjective satisfaction.

Figure 5: WebBook close-up showing a book being riffled through
Card et al. (1996) developed the interface shown in Figure 5 in response to the observation that users have difficulty finding pages, get lost, have difficulty relocating pages, and have problems organizing material they manage to find on the Internet. Although the book metaphor has been used often (e.g., Yankelovich et al. 1985, Remde et al. 1987), this particular implementation is quite compelling. The WebBook is not just a static interface object but provides a variety of interactions that are typical of the way people use books in the real world. Users can riffle through pages (the view shown in the figure), can rip pages from a book, and can tack a page to a desk or wall in the 3-D room in which the book is located. Even bookmarks behave like their real-world counterparts: they are flat objects that are inserted between pages and that hang out the end of the book when it is not open to the marked page.
The WebBook represents a three-dimensional view of a virtual world that allows documents to be used at many levels. When a user scrolls in 'fontsize' mode, the display becomes a Document Lens (Robertson & Mackinlay 1993), which has many similarities to the PAD++ interface shown in the previous section. The WebBook (Card et al. 1996), Document Lens (Robertson & Mackinlay 1993), Perspective Wall (Mackinlay et al. 1991) and WebForager (Card et al. 1996) are a few of the elements of the larger Xerox (Xsoft) project collectively called the Information Visualizer (Card et al. 1991, Rao et al. 1995).

Figure 6: SPIRE Themescape showing topic distribution in a large document space
The number of potential choices for this section was larger than for any of the others. Multidimensional visualization is undoubtedly the most elusive and highly sought after of the visualization types. This is primarily due to efforts by many segments of the information community, including but not limited to database visualizers, geographical information systems and information retrievalists. Even when concentrating on information retrieval visualization, it was difficult to choose among the many systems, such as BEAD (Chalmers 1993, 1996), VIBE (Olsen 1992, 1993), VR-VIBE (Benford et al. 1995), InfoCrystal (Spoerri 1993), and LyberWorld (Hemmje et al. 1994). SPIRE (Figure 6) is a project that was developed at Battelle National Labs and has gone into production by InXight. The goal of information retrieval projects is usually to find text that the user expects to read. SPIRE (Spatial Paradigm for Information Retrieval and Exploration) stands in contrast to this view. Wise et al. (1995) state that:
"True text visualization that would overcome these time and attentional constraints must represent textual content and meaning to the analyst without them having to read it in the manner that text normally requires. These visualizations would instead result from a content abstraction and spatialization of the original text document that transforms it into a new visual representation that communicates by image instead of prose."
The figure shows a Themescape to which a large document collection has been mapped. The layout of the concept space has been accomplished with a variety of clustering and dimensionality reducing algorithms (York & Bohn 1995). The elevation of a feature in the map indicates theme strength. Time sequence animations allow a high level appraisal of the overall change in topics over a period of time.

Figure 7: SeeSoft showing overview of a software project
Figure 7 shows the status of a large software project as filtered through SeeSoft (Eick et al. 1992). The blocks in the figure represent program code modules. Color is used to profile 'hot spots' in the code. The information in this rendering is actually a frequency count: the more often a line of code is called, the closer its color is to the red end of the color spectrum shown in the legend on the left side of the image. Other modes of this tool can show the time elapsed since a module was last modified.
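The frequency-to-color encoding can be sketched as a simple hue ramp. The blue-to-red ramp, the linear scaling and the function name below are assumptions made for illustration; the actual SeeSoft palette is not described here.

```python
import colorsys


def count_to_rgb(count, max_count):
    """Map a call count to an RGB triple on a blue-to-red hue ramp.

    The least frequently called lines come out blue and the most frequently
    called lines come out red. The ramp itself is an assumption of this sketch.
    """
    if max_count <= 0:
        fraction = 0.0
    else:
        fraction = min(max(count / max_count, 0.0), 1.0)
    # Hue 2/3 is blue and hue 0 is red in the HSV color model.
    hue = (1.0 - fraction) * (2.0 / 3.0)
    red, green, blue = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(red * 255), round(green * 255), round(blue * 255))


# Hypothetical call counts for five lines of code.
call_counts = [0, 3, 12, 40, 7]
line_colors = [count_to_rgb(c, max(call_counts)) for c in call_counts]
for count, color in zip(call_counts, line_colors):
    print(count, color)
```

A mode that shows time since last modification could reuse the same ramp by passing an age in place of a call count.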
Other visualizations depict time in a variety of ways. SAGE (Roth et al. 1994, Chuah et al. 1995) has been used to code timelines of information extracted from historical documents. EditWear/ReadWear (Hill & Hollan 1992) creates a bar adjacent to the normal scroll bar when a document is being viewed. In this bar, user activity is plotted in two dimensions, showing the length of time that the document has been either read or edited. ReadWear allows the computer to collect information about the time the user has spent working with various information objects at multiple levels of resolution, such as time spent writing the different chapters of a book, time spent editing assorted lines of code in a program, or time spent reading interesting net news messages. This use is very similar to some of the functionality of SeeSoft. Clearly, temporal data covers a lot of territory. Time may be viewed as:
Temporal data is frequently overlaid with other types of information. One might think of temporal information as merely another dimension in a multidimensional space, but to do so is to risk losing the significance that people attach to this kind of data.
Tree Text: Hyperbolic Tree
The hyperbolic tree (Lamping et al. 1995) shown in Figure 8 is representative of the types of renderings that are possible for hierarchical data. Central objects are larger than more peripheral ones. Interaction with the display is accomplished with a click and drag, which causes an apparent rotation of the surface, pulling peripheral parts into more central focus.
Other implementations of the hyperbolic tree include some 3-dimensional ones (Munzer 1997). Part of the appeal of these displays is the compelling direct-manipulation that is built-in; the user may drag the surface around using the mouse. Cone Trees (Robertson et al. 1991) project a tree into a semi-transparent 3-dimensional display. Trees of modest size can be rendered using this approach but very large trees are still difficult to manage. TreeMaps (Johnson & Shneiderman 1991) are another way to maximize the use of screen territory. They are a space-filling technique that uses alternating directions, icon size and texture to render large numbers of objects in a hierarchy. These displays require a considerable learning time and the hierarchy is often lost in the remapping.
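The alternating-direction, space-filling idea behind TreeMaps can be illustrated with a minimal slice-and-dice layout. The sketch covers only the geometry; icon size and texture are not rendered, and the node layout (name, size, children) and function name are hypothetical choices made for this example.

```python
def treemap(node, x, y, width, height, depth=0, out=None):
    """Lay out a tree as nested rectangles, alternating the split direction.

    node is a (name, size, children) tuple. Returns a list of
    (name, x, y, width, height) tuples in depth-first order.
    """
    if out is None:
        out = []
    name, _size, children = node
    out.append((name, x, y, width, height))
    total = sum(child[1] for child in children)
    offset = 0.0
    for child in children:
        share = child[1] / total if total else 0.0
        if depth % 2 == 0:
            # Even depth: divide the rectangle left to right.
            treemap(child, x + offset * width, y, share * width, height, depth + 1, out)
        else:
            # Odd depth: divide the rectangle top to bottom.
            treemap(child, x, y + offset * height, width, share * height, depth + 1, out)
        offset += share
    return out


# A small hypothetical hierarchy: sizes might be document lengths.
tree = ("root", 4, [
    ("a", 2, [("a1", 1, []), ("a2", 1, [])]),
    ("b", 2, []),
])
rects = treemap(tree, 0, 0, 8, 4)
for rect in rects:
    print(rect)
```

Every pixel of the 8 by 4 area is assigned to some leaf, which is the sense in which the technique maximizes screen use; the parent-child relationship survives only as containment, which may be one reason the hierarchy is easily lost by the viewer.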
Figure 9: Navigational View Builder showing relationships among a set of documents
Figure 9 shows Navigational View Builder (Mukherjea & Foley 1995, Mukherjea et al. 1995), a tool for designing overview diagrams of hypermedia systems. The renderings produced by the tool are the result of a series of operations, including binding (mapping data attributes to visual display attributes), clustering (coalescing nearby objects into a single icon), filtering based on content, links, and structure, and hierarchization (reducing dimensionality by viewing 3-D trees instead of graphs). The authors admit that the problem is difficult and that their solution can be markedly improved. The three main outstanding issues that they identify are: 1) the system has not been subjected to usability testing, 2) the algorithms that they use have not proven to be scalable, and 3) the metadata that is currently available is too limited to allow interesting views to be built, and the content is not captured in the data that they do collect.
There are many other projects whose aim is to automate the production of network displays, including SCALIR (Rose & Belew 1991), gIBIS (Conklin & Begeman 1989) and PFNETS (Fowler et al. 1991). SemNet (Fairchild et al. 1988) is a three-dimensional version of a network display in which the nodes and links are color-coded. Several heuristics are used in SemNet to derive semi-optimal placement of nodes so that intersecting links are minimized and conceptual clustering is enhanced. Another advantage of 3-D displays of networks or trees, mentioned both by Fairchild et al. (1988) and by Munzer (1997), is the ease with which users can interactively work with the visualization to change the point of view.

Workspaces, whether they are personal or shared and, if shared, whether the work is synchronous or asynchronous, are complicated environments. CASCADE is a research testbed for investigating computer-supported co-operative work (Spring et al. 1996). Figure 10 shows a display that has a document in the main portion of the screen. For the purposes of this paper, the most salient features to note are the Mural, TileBar, and intradocument colored icons. The colored blocks in the document show the locations of interactive comments made with the CASCADE editor. These icons serve as landmarks that can provide detail-on-demand. The mural is modeled along the lines of a project by Jerding & Stasko (1995) and shows the location of all the comment landmarks in the entire document. This is an important navigational mechanism in the interface. The TileBar is an adaptation of the work of Hearst (1995).
There are many other groupware projects, and each of them has some interesting ways of notifying group members about other people's whereabouts and activities. The right mix of tools is not yet known, nor is it known how many of the right ones will be discovered in single-user systems. Whether the needs of groups require the development of more, better, or smarter visualizations is an area of considerable interest.
The components of information visualization systems discussed in this state-of-the-art review provide a springboard for a multifaceted research plan.
Broadly, there are three potential foci for research as described in this paper:
Although there are open issues in each of these areas, the following topics are of particular interest:
With respect to tasks, it should be noted that while defining what people need from document interfaces is difficult, the availability of a data type breakdown of documents allows simplification since there are a finite number of questions that can be asked given a data description. If users are interrogated in field studies about their 'information needs', they provide answers that cover the gamut from 'I need to know X' where X is a simple fact such as a person, place or object attribute to 'I need to know how the world works.' When approached from the data type (a definitely bottom-up method), it is easy to see that if only temporal metadata is available, then questions regarding anything but time are useless. Collecting the tasks that can reasonably be generated from a particular data view can serve as a basis for evaluation of an interface that presents this view to the user. The greater goal is to collect a large set of tasks that can be used to test integrated interfaces that attempt to render documents from multiple data perspectives. It is likely that the set of tasks will grow linearly as a function of data type thus forming a coherent set of potential evaluation points.
Summary
Information visualization is in its infancy. The human, computer, and interface components of such systems are only partially understood. This review attempted to present the current state of knowledge about
The use of the Shneiderman framework to delineate document types proved to have good points and bad points. In its original conception, the framework was intended to dissect the data types of objects in general. In changing the objects to actual instances, i.e., documents, it became less clear whether the fit of framework and object was valid. Indeed, the framework served its purpose -- "to sort out the prototypes [that currently exist] and guide researchers to new opportunities" (Shneiderman 1996). Novel ways of viewing documents were found and interesting questions were generated regarding the breakdown. In the absence of any other document framework for visualization, it appears that there was some utility in the choice of Shneiderman's.
On the negative side, it appears that the framework contains more categories than the underlying data can accommodate. In essence, documents are content, structure and metadata. Generally speaking, the hierarchical, network, 2-dimensional, and 3-dimensional elements of the Shneiderman framework map to structural information in documents, while linear and multidimensional data are related to document content. Collaborative documents are complex types that are essentially recursive of all the basic types. The temporal dimension, in its intradocument sense, is part of the multidimensional data; in its other form, it is clearly part of the metadata. It seems that a content, structure, metadata framework could have been equally useful.
A third alternative might have been to use the breakdown discussed in the Scope and Definitions section - document components, documents, document sets, document collections, and document analytics. Whether any of these other frameworks is better for guiding development of document visualizations and their evaluation than the one employed in this paper cannot be answered at this time.
Appendix A: Metadata
| Element | Dublin Core | URC | Semantic Header | USMARC | IAFA templates | TEI header |
| INTRINSIC | | | | | | |
| Subject | + | + | + | + | + | |
| Title | + | + | + | + | + | + |
| Author | + | + | + | + | + | + |
| Publisher | + | + | * | + | + | + |
| Publication Place | + | * | + | + | | |
| Other agent | + | + | + | + | | |
| Date | + | + | + | + | + | |
| Object type | + | * | + | | | |
| Form | + | + | + | | | |
| Identifier (URN, ISBN...) | + | + | + | + | + | + |
| Relation | + | + | * | + | * | |
| Source | + | + | + | + | | |
| Language | + | * | + | + | * | |
| Coverage | + | * | + | | | |
| Abstract | + | * | | | | |
| Version (edition) | + | * | + | + | + | |
| Notes (annotation) | + | * | * | | | |
| Signature | + | + | | | | |
| Classification | + | * | | | | |
| Classification (security level) | * | | | | | |
| Keyword | + | + | * | | | |
| EXTRINSIC | | | | | | |
| System requirement | * | + | | | | |
| Mode of Access | + | + | | | | |
| Availability | + | * | | | | |
| Cost | * | + | * | | | |
| Control | * | + | | | | |
| Extent (size) | * | + | * | | | |
| Encoding description | + | * | | | | |
| Revision description | * | + | * | | | |
+ Mandatory, * Optional
Table reproduced from Ng et al. (1997).
Appendix B: List of Tasks from Shneiderman
The following list was obtained from the Olive website at the University of Maryland (http://www.otal.umd.edu/Olive/). It has been edited to include only document-related tasks.
1-D
2-D
3-D
Multi-D
Temporal
Tree
Network
Workspace
Evaluation of Methods for Rendering POI Movement in VIBE
The VIBE interface as it currently exists provides the user with the ability to move a POI. This feature is useful for several reasons, including
disambiguating complex displays
When a POI is moved in VIBE (Figure 11), the user sees the POI icon move in real-time but the display is not updated until the drag operation is finished and the mouse button is released. In order to see the effect of the move, the user may enter a mode called Displacement. Displacement will display a track of each document with the original position at one end and the current position at the other.
WebVIBE (Figure 12) incorporates a POI movement function that is much more dynamic. The entire display is redrawn as the POI is dragged around the screen. The documents seem to be attached like clothes on a rubberband clothesline. There is no 'Displacement' feature since the user can always recreate the track by moving the POI back and forth between any positions.
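The difference between the two implementations is easier to see with a concrete sketch of what a Displacement track contains. The sketch assumes that each document is drawn at the weighted average of the POI positions, with weights given by the document's score against each POI; this placement rule, the data layout and the function names are assumptions made for illustration and are not specified in this appendix.

```python
def place_documents(pois, docs):
    """Place each document at the weighted average of the POI positions.

    pois maps a POI name to an (x, y) position; docs maps a document id to a
    dictionary of POI name to weight. Documents with no weight on any POI
    are omitted. The placement rule is an assumption of this sketch.
    """
    positions = {}
    for doc_id, weights in docs.items():
        total = sum(weights.get(name, 0.0) for name in pois)
        if total == 0:
            continue
        x = sum(weights.get(name, 0.0) * pois[name][0] for name in pois) / total
        y = sum(weights.get(name, 0.0) * pois[name][1] for name in pois) / total
        positions[doc_id] = (x, y)
    return positions


def displacement_tracks(pois, docs, moved_poi, new_xy):
    """Return (original, current) position pairs for documents that moved."""
    before = place_documents(pois, docs)
    after = place_documents({**pois, moved_poi: new_xy}, docs)
    return {
        doc_id: (before[doc_id], after[doc_id])
        for doc_id in before
        if before[doc_id] != after[doc_id]
    }


# Three hypothetical POIs and documents with made-up weights.
pois = {"dog": (0, 0), "cat": (10, 0), "bird": (0, 10)}
docs = {
    "d1": {"dog": 1, "cat": 1},
    "d2": {"bird": 2},
    "d3": {"cat": 3, "bird": 1},
}
tracks = displacement_tracks(pois, docs, "cat", (10, 10))
for doc_id, (start, end) in tracks.items():
    print(doc_id, start, end)
```

Under this reading, the Displacement mode presents the result of displacement_tracks once, after the drop, whereas the WebVIBE behavior amounts to calling place_documents on every mouse-move event and redrawing, leaving the user to infer the track from the motion.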
Whether the increased amount of direct manipulation is truly advantageous is not known. Arguments can be made that explicitly knowing how many documents moved, how far they moved, and the range of distances traveled is valuable. Counter arguments can be made that if the user feels more intimately involved in the interface, he will not need to know such facts since they come more or less for free from his having walked the path. Testing of these two implementations of the move function should provide clear evidence of which path to pursue in further interface development.
Null hypothesis: Movement of POIs using direct manipulation is not superior to movement using drag-and-drop supplemented with Displacement. Criteria for success will be determined from user satisfaction questionnaires as well as results of task performance tests. The list given above regarding reasons for moving POIs should provide a starting point for operationalizing the study. If users can spread out a display, create clusters of documents, place POIs in different positions and disambiguate cluttered displays more quickly, with fewer errors, and with a greater sense of satisfaction using one mechanism than the other, then a superior method will have been identified.
Subjects will be recruited from the University student population. They will be paid for their participation in the study, which is designed to last approximately one hour. To control for learning effects, each subject will be tested with only one of the interfaces. The primary variable will be time to completion of each subtask. Assuming that there will be a 50% variance among subjects and that performance that is 25% faster will serve as evidence of the superiority of direct manipulation (DM) over drag-and-drop with Displacement (DDL), 37 subjects will be needed to reject the null hypothesis that there is no difference between the groups (Dubin 1986).
In order to control for the effect of the interface itself, a single version of VIBE will be developed. This interface will be capable of performing both Displacement and Motion features. For any particular experiment, only one or the other feature will be enabled. Subjects will be given a short description of a defeatured, vector VIBE system, limited to the description of the appearance of the interface, the meaning of a POI icon, the meaning of a document icon, the relationship of documents to the POIs, and the meaning of a cluster of documents. They will also be shown how to move a POI using the appropriate method: either drag-and-drop followed by inspection of the displacement (DDL) or direct manipulation (DM).
The tasks that the subjects will be asked to perform will include probing of each of the above-defined tasks that are normally associated with a move operation. Subjects will repeat the series of tasks in 3-POI, 4-POI and 5-POI scenarios. The names of the POIs will be concrete (e.g., dog, cat, etc.) rather than abstract (e.g., A, B, or love, freedom, etc.). The underlying document set will be composed of a set of manually created vectors to achieve a clusterable display. Subjects will be asked to:
The dependent variables in the experiment are time to completion of each subtask, error rate for each subtask, and satisfaction, measured as the sum of the ratings on a 10-item questionnaire composed of 5-point Likert scale responses.
Data will be analyzed with an unpaired, one-tailed t-test. The form of the null hypothesis supports the use of a one-tailed test, since the hypothesis is rejected only if DM is better than DDL.
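The planned analysis can be sketched as follows. The sketch assumes the pooled-variance (Student) form of the unpaired t-test, which the design does not specify, and the completion times are invented solely to exercise the code. The function name is likewise hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(dm_times, ddl_times):
    """Return (t, degrees of freedom) for an unpaired, pooled-variance t-test.

    t is positive when the DM group is faster (smaller mean time), so a
    one-tailed test rejects the null hypothesis only for large positive t.
    """
    n_dm, n_ddl = len(dm_times), len(ddl_times)
    dof = n_dm + n_ddl - 2
    pooled_var = (
        (n_dm - 1) * variance(dm_times) + (n_ddl - 1) * variance(ddl_times)
    ) / dof
    std_error = sqrt(pooled_var * (1 / n_dm + 1 / n_ddl))
    t = (mean(ddl_times) - mean(dm_times)) / std_error
    return t, dof


# Invented subtask completion times in seconds, one value per subject.
dm_times = [10, 12, 11, 9]
ddl_times = [14, 15, 13, 16]
t_value, dof = pooled_t(dm_times, ddl_times)

# One-tailed critical value for alpha = 0.05 with 6 degrees of freedom,
# taken from a standard t table.
CRITICAL_T = 1.943
print(round(t_value, 3), dof, t_value > CRITICAL_T)
```

With real data, the critical value would be looked up for the actual degrees of freedom, and the same computation would be repeated for error rate and for the summed satisfaction score.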
Ahlberg, C. 1996. Spotfire: An information exploration environment. SIGMOD Record 25(4): 25-29.
Ahlberg, C. and B. Shneiderman. 1994a. The Alphaslider: a compact and rapid selector. Proceedings ACM CHI'94: Human Factors in Comp. Systems, 365-371.
Ahlberg, C. and B. Shneiderman. 1994b. Visual Information Seeking: Tight coupling of dynamic query filters with starfield displays, Proceedings of ACM CHI94 Conference, 313-317.
Antis, J.M., S.G. Eick and J.D. Pyrce. 1996. Visualizing the structure of large relational databases. IEEE Software. 13(1):72-79.
Baker, M.J. and S.G. Eick. 1995. Space-filling software visualization. Journal of Visual Languages and Computing 6:119-133.
Bartram, L. 1997. Can motion increase user interface bandwidth in complex systems? Proceedings of the IEEE Conference on Systems, Man and Cybernetics (October 12-15, Orlando, FL), 1686-1691.
Bates, M. 1989. A 'berrypicking' model of information retrieval. Online Review 13(5): 408-424.
Beaulieu, M., S. Robertson and E. Rasmussen. 1996. Evaluating interactive systems in TREC. JASIS 41(1): 85-94.
Bederson, B.B. and J.D. Hollan. 1994. Pad++: a zooming graphical interface for exploring alternate interface physics. Proceedings of ACM User Interface Software and Technology Conference (UIST'94), 17-26.
Belkin, N.B., C. Cool, A. Stein and U. Thiel. 1995. Cases, scripts, and information-seeking strategies: on the design of interactive information retrieval systems. Expert Systems with Applications 9(3): 379-395.
Benford, S.D., D. Snowdon, C. Greenhalgh, R. Ingram, I. Knox and C. Brown. 1995. VR-VIBE: a virtual environment for co-operative information retrieval. Eurographics '95, 30th August - 1st September, Maastricht, The Netherlands, 349-360.
Bertin, J. 1983. Semiology of Graphics. W. Berg (Translator). University of Wisconsin Press, Madison.
Bier, E.A., M.C. Stone, K. Fishkin, W. Buxton, and T. Baudel. 1994. A taxonomy of see-through tools. Proceedings of CHI '94. 358-364.
Blair, D.C. 1988. An extended relational retrieval model. Information Processing & Management 24(3): 349-371.
Blair, D.C. and M.E. Maron. 1985. An evaluation of retrieval effectiveness for a full-text document retrieval system. Communications of the ACM 20: 648-656.
Breit, S. 1951. Qu'est-ce que la documentation. Paris: EDIT.
Brodlie, K.W. 1992. Visualization Techniques. In Scientific Visualization - Techniques and Applications, K.W. Brodlie, L.A. Carpenter, R.A. Earnshaw, J.R. Gallop, R.J. Hubbold, A.M. Mumford, C.D. Osland and P. Quarendon (editors), Springer-Verlag, chapter 3, pp. 37-86, 1992.
Buckland, M.K. 1997. What is a "document"? JASIS 48(9): 804-809.
Card, S.K., G.G. Robertson, and J.D. Mackinlay. 1991. The information visualizer, an information workspace. Proceedings of ACM Human Factors in Computing Systems Conference (CHI'91), 181-188.
Card, S.K., G.G. Robertson, and W. York. 1996. The WebBook and the WebForager: an information workspace for the World Wide Web, CHI 96, ACM Conference on Human Factors in Software, ACM Press, New York. 111-117.
Casner, S. 1991. A task-analytic approach to the automated design of graphic presentations. ACM Transactions on Graphics 10(2): 111-151.
Chalmers, M. 1993. Using a Landscape to represent a corpus of documents, Springer-Verlag Proceedings of COSIT '93, Elba, pp. 377-390.
Chalmers, M. 1996. A linear iteration time layout algorithm for visualising high-dimensional data. Visualization '96, 127-132.
Chen, H., T. Yim, and D. Fye. 1995. Automatic thesaurus generation for an electronic community system. JASIS 46(3): 175-193.
Chuah, M.C., S.F. Roth, J. Kolojejchick, J. Mattis and O. Juarez. 1995. SageBook: searching data-graphics by content, Proceedings of the Conference on Human Factors in Computing Systems (SIGCHI '95), Denver, CO, May, pp. 338-345.
Cleveland, W.S. 1985. The Elements of Graphing Data. Wadsworth Advanced Books and Software, Monterey, CA.
Conklin, J. and M. Begeman. 1989. gIBIS: a tool for all reasons. JASIS 40: 200-213.
Croft, W.B. and T.J. Parenty. 1985. A comparison of a network structure and a database system used for information retrieval. Information Systems 10(4): 377-390.
Croft, W.B., H.R. Turtle, and D.D. Lewis. 1991. The use of phrases and structured queries in information retrieval. Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, Association for Computing Machinery, 32-45.
Crouch, C.J. 1988. An analysis of approximate versus exact discrimination values. Information Processing & Management 24:5-16.
Crouch, D. and R.R. Korfhage. 1990. The use of visual representations in information retrieval applications. In T. Ichikawa, E. Jungert, & R. R. Korfhage (Eds.), Visual Languages and Applications, New York, Plenum Press, 305-326.
Deerwester, S., S.T. Dumais, G.W. Furnas, and T. K. Landauer. 1990. Indexing by latent semantic analysis. JASIS 41(6): 391-407.
DeFazio, S., A. Daoud, L.A. Smith, and J. Srinivasan. 1995. Integrating IR and RDBMS using cooperative indexing. Proceedings of SIGIR '95, 84-92.
Dubin, D. 1995. Document analysis for visualization. Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, 199-204.
Dubin, S. 1986. Numberator.
Eick, S.G., J.L. Steffen and E.E. Sumner. 1992. SeeSoft - a tool for visualizing line oriented software. IEEE Transactions on Software Engineering, 11-18.
Fairchild, K.M., S.E. Poltrock, and G.W. Furnas. 1988. SemNet: three-dimensional graphic representations of large knowledge bases. In Guindon, R. (Ed.) Cognitive Science and Its Applications for Human Computer Interaction, Hillsdale, New Jersey: Lawrence Erlbaum, pp. 201-233.
Feiner, S. 1990. Authoring large hypermedia documents with IGD. Electronic Publishing 3(1): 29-46.
Feiner, S. and C. Beshers. 1990. Worlds within Worlds: metaphors for exploring n-dimensional virtual worlds, ACM Proceedings 1990 Conference on User Interface Software Design, pp. 76-83.
Fishkin, K. and M.C. Stone. 1995. Enhanced dynamic queries via movable filters. CHI '95, 415-420.
Fowler, R.H., W.A. Fowler, and B.A. Wilson. 1991. Integrating query, thesaurus, and documents through a common visual representation. Proceedings of the 14th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval. ACM: New York, 142-151.
Furnas, G.W., T.K. Landauer, L.M. Gomez, and S.T. Dumais. 1987. The vocabulary problem in human-system communication. Communications of the ACM 30(11): 964-971.
Furnas, G.W. 1994. High dimensional representations and information retrieval. In New Approaches in Classification and Data Analysis, edited by E. Diday, Y. Lechevallier, M. Schader, P. Bertrand, and R. Burtschy. Springer-Verlag, Berlin.
Furnas, G.W. and J. Zacks. 1994. Multitrees: enriching and reusing hierarchical structure. Human Factors in Computing Systems CHI '94 Conference Proceedings, Boston, MA, ACM, 330-336.
Gershon, N. 1995. Visualizing the Internet: putting the user in the driver's seat. Visualization '95, 416-417.
Gonzalez, C. 1996. Does animation in user interfaces improve decision making? CHI '96, 27-34.
Greenberg, S. and Roseman, M. (in press). Groupware Toolkits for Synchronous Work. In M. Beaudouin-Lafon, editor, Trends in CSCW, John Wiley & Sons Ltd.
Greenberg, S., C. Gutwin, R. Roseman, and A. Cockburn. 1995. From Awareness to TeamRooms, GroupWeb and TurboTurtle: Eight Snapshots of Recent Work in the GroupLab Project. Research Report 95/580/32, Department of Computer Science, University of Calgary, Calgary, Alberta, Canada T2N 1N4, December.
Greenberg, S. and D. Marwood. 1994. Real time groupware as a distributed system: Concurrency control and its effect on the interface. In Proceedings of the ACM Conference on Computer Supported Cooperative Work, pp. 207-217, Chapel Hill, North Carolina, October 22-26, ACM Press.
Hahn, U. 1990. Topic parsing: accounting for text macro structures in full-text analysis. Information Processing & Management 26:135-170.
Halasz, M. and M. Schwartz. 1994. The Dexter hypertext reference model. Communications of the ACM 37(2): 30-39.
Hamming, R.W. 1962. Numerical Methods for Scientists and Engineers. McGraw-Hill, New York.
Hardman, L., D.C.A. Bulterman, and G. van Rossum. 1994. The Amsterdam hypermedia model: adding time and context to the Dexter model. Communications of the ACM 37(2): 50-62.
Harman, D. 1992. Ranking algorithms. In Information Retrieval: Data Structures & Algorithms, W.B. Frakes & R. Baeza-Yates, editors. Prentice-Hall, Upper Saddle River, NJ, pp. 363-392.
Hearst, M.A. 1994. Multi-paragraph segmentation of expository text. In the Proceedings of the 32nd Meeting of the Association for Computational Linguistics, Las Cruces, NM, June.
Hearst, M. 1995. TileBars: visualization of term distribution information in full text information access, Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), Denver, CO. 59-66.
Hearst, M., J. Pedersen, and D. Karger. 1995. Scatter/gather as a tool for the analysis of retrieval results, Working Notes of the AAAI Fall Symposium on AI Applications in Knowledge Navigation, Cambridge, MA, November.
Hearst, M. and J. Pedersen. 1996. Reexamining the cluster hypothesis: scatter/gather on retrieval results, in the Proceedings of the 19th Annual International ACM/SIGIR Conference, Zurich, August.
Hearst, M.A. and C. Plaunt. 1993. Subtopic structuring for full-length document access. Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, Association for Computing Machinery, 59-69.
Hemmje, M., C. Kunkel, and A. Willett. 1994. LyberWorld - a visualization user interface supporting fulltext retrieval. In: Proceedings of ACM SIGIR '94, July 3-6, Dublin, 249-259.
Henderson, D.A., Jr. and S.K. Card. 1986. Rooms: the use of multiple workspaces to reduce space contention in a window-based graphical user interface. ACM Transactions on Graphics 5(3): 211-243.
Hill, W.C. and J.D. Hollan. 1992. Edit wear and read wear. In Proceedings ACM CHI'92 Conference Human Factors in Computing Systems (Monterey, CA, 3-7 May 1992), 3-9.
Ioannidis, Y.E. 1996. Dynamic information visualization. SIGMOD Record 25(4): 16-20.
Jerding, D.F. and J.T. Stasko. 1995. The Information Mural: A technique for displaying and navigating large information spaces. In Proceedings of the IEEE Visualization '95 Symposium on Information Visualization, pages 43-50, Atlanta, GA, October.
Johnson, B. and B. Shneiderman. 1991. Tree-maps: A space filling approach to the visualization of hierarchical information structures, Proceedings of IEEE Visualization '91 (October), 284-291.
Kandogan, E. and B. Shneiderman. 1997. Elastic Windows: evaluation of multi-window operations, ACM SIGCHI 97 Conference on Human Factors in Computing Systems, March.
Keim, D.A. 1996. Pixel-oriented database visualizations. SIGMOD Record 25(4): 35-39.
Keim, D.A. and H.-P. Kriegel. 1994. VisDB: database exploration using multidimensional visualization, IEEE Computer Graphics and Applications, September, pp. 40-49.
Kennedy, J.B., K.J. Mitchell and P.J. Barclay. 1996. A framework for information visualisation. SIGMOD Record, 25(4): 30-34.
Koenemann, J. and N.J. Belkin. 1996. A case for interaction: a study of interactive information retrieval behavior and effectiveness. CHI '96 205-212.
Koike, H. 1993. The role of another spatial dimension in software visualization. ACM Transactions on Information Systems 11(3): 266-286.
Korfhage, R. 1991. To see, or not to see - is that the query? In Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM SIGIR, Association for Computing Machinery, 134-141.
Korfhage, R.R. 1997. Information Storage and Retrieval. John Wiley & Sons, New York. pp. 349.
Koshman, S. 1996. VIBE Usability: an Investigation into Visualization Techniques for Information Retrieval. Dissertation. University of Pittsburgh.
Lamping, J., R. Rao, and P. Pirolli. 1995. A focus+context technique based on hyperbolic geometry for visualizing large hierarchies. CHI '95, 401-408.
Larkin, J. and H. Simon. 1987. Why a diagram is (sometimes) worth 10,000 words. Cognitive Science 11: 65-99.
Lewis, D.D. and K. Sparck Jones. 1996. Natural language processing for information retrieval. Communications of the ACM 39(1): 92-101.
Liddy, E.D., W. Paik, and M. McKenna. 1995. Development and implementation of a discourse model for newspaper texts. In Proceedings of the AAAI Symposium on Empirical Methods in Discourse Interpretation and Generation. Stanford, CA.
Lin, X. 1991. A self-organizing semantic map for information retrieval. Proceedings for the Fourteenth Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval (Oct. 13-16; Chicago, IL), 262-269.
Lin, X. 1992. Visualization for the document space. Proceedings Visualization '92, IEEE Computer Society Press, Los Alamitos, CA., 274-281.
Lin, X. 1997. Map displays for information retrieval. JASIS 48(1): 40-54.
Losee, R.M. and S.W. Haas. 1995. Sublanguage terms: dictionaries, usage, and automatic classification. JASIS 46(7): 519-519.
Lovins, J. 1968. Development of a stemming algorithm. Mechanical Translation and Computational Linguistics, 11: 22-41.
Lynch, C.A. and M. Stonebraker. 1988. Extended user-defined indexing with applications to textual databases. Proceedings VLDB, 306-317.
Mackinlay, J. 1986. Automating the design of graphical presentations of relational information. ACM Transactions on Graphics 5(2): 110-141.
Mackinlay, J.D., R. Rao, and S.K. Card. 1995. An organic user interface for searching citation links. CHI '95, 67-73.
Mackinlay, J.D., G.G. Robertson, and S.K. Card. 1991. Perspective wall: Detail and context smoothly integrated. Proceedings of ACM Human Factors in Computing Systems (CHI'91), 173-179.
Mander, R., G. Solomon, and Y.A. Wong. 1992. 'Pile' metaphor for supporting casual organization of information, CHI'92, ACM Press, Monterey, pp. 627-634.
Marchionini, G. 1992. Interfaces for end-user information seeking. Journal of the American Society for Information Science 43(2): 156-163.
McCormick, B.H., T.A. Defanti, and M.D. Brown. 1987. Visualization in scientific computing. Computer Graphics 21(6).
McLeod, I.A. and R.G. Crawford. 1983. Document retrieval as a database application. Information Technology: Research and Development 2: 43-60.
Mihalisin, T., E. Gawlinski, J. Timlin, and J. Schwegler. 1990. Visualizing a scalar field on an n-dimensional lattice. Proceedings of Visualization '90, IEEE CS Press, pp. 255-262.
Mukherjea, S. and J.D. Foley. 1995. Visualizing the world-wide web with the navigational view builder. Computer Networks and ISDN Systems, Special Issue on the Third International Conference on the World-Wide Web '95, April, Darmstadt, Germany.
Mukherjea, S., J.D. Foley and S.E. Hudson. 1995. Visualizing complex hypermedia networks through multiple hierarchical views. ACM SIGCHI 1995, May, Denver, Colorado.
Munzner, T. 1997. H3: laying out large directed graphs in 3D hyperbolic space. Proceedings of the 1997 IEEE Symposium on Information Visualization, October 20-21, Phoenix, AZ.
Neuwirth, C., D. Kaufer, R. Chandhok, and J. Morris. 1990. Issues in the design of computer support for co-authoring and commenting. Proceedings of CSCW '90, 183-195.
Neuwirth, C., D. Kaufer, R. Chandhok, and J. Morris. 1994. Computer support for distributed collaborative writing: defining parameters of interaction. Proceedings of CSCW '94, 145-152.
Neuwirth, C., D. Kaufer, R. Chandhok, and J. Morris. 1992. Flexible diff-ing in a collaborative writing system. Proceedings of CSCW '92, 147-154.
Newby, G.B. 1996. Metric multidimensional information space. In: TREC-5 Proceedings. Gaithersburg, MD: National Institute of Standards and Technology.
Ng, K.B., S. Park, and K. Burnett. 1997. Control or management: a comparison of the two approaches for establishing metadata schemes in the digital environment. Proceedings of the 60th Annual Meeting of the American Society for Information Science (Washington, DC, November 1-6), 337-346.
Nichols, D.A., P. Curtis, M. Dixon, and J. Lamping. 1995. High-latency, low-bandwidth windowing in the Jupiter collaboration system. UIST '95, Pittsburgh, PA, November 14-17, pp. 111-120.
Olsen, K.A., R.R. Korfhage, M.B. Spring, K.M. Sochats, and J.G. Williams. 1993. Visualization of a document collection: The VIBE system. Information Processing and Management. 29(1): 69-81.
Olsen, K.A., K.M. Sochats, and J.G. Williams. 1997. Full text information retrieval and information overload. Accepted by the International Information and Library Review.
Olsen, K.A., J.G. Williams, K.M. Sochats, and S.C. Hirtle. 1992. Ideation through visualization: the VIBE system. Multimedia Review 3(3): 48-59.
Over, P. 1996. TREC-5 interactive track report. In: TREC-5 Proceedings. Gaithersburg, MD: National Institute of Standards and Technology.
Páez, L.B., J.R. da Silva-Fh, and G. Marchionini. 1996. Disorientation in electronic environments: a study of hypertext and continuous zooming interfaces. Proceedings of ASIS '96, 58-66.
Perlin, K. and D. Fox. 1993. Pad: an alternative approach to the computer interface, Proceedings of 1993 ACM SIGGRAPH Conference, 57-64.
Pirolli, P., P. Schank, M. Hearst, and C. Diehl. 1996. Scatter/Gather browsing communicates the topic structure of a very large text collection. CHI '96, ACM Conference on Human Factors in Computing Systems. 213-220
Plaisant, C., B. Milash, A. Rose, S. Widoff, and B. Shneiderman. 1996. LifeLines: visualizing personal histories. CHI '96, 221-227.
Porter, M. 1980. An algorithm for suffix stripping. Program 14(3): 130-137.
Raghavan, V.V. and S.K.M. Wong. 1986. A critical analysis of vector space model for information retrieval. JASIS 37(5): 279-287.
Rao, R., J.O. Pedersen, M.A. Hearst, J.D. Mackinlay, S.K. Card, L. Masinter, P-K. Halvorsen and G.G. Robertson. 1995. Rich interaction in the digital library. Communications of the ACM 38(4): 29-39.
Rao, R. and S.K. Card. 1994. The table lens: merging graphical and symbolic representations in an interactive focus+context visualization for tabular information, Proceedings of CHI '94, Boston, ACM Press, 318-322.
Rau, L.F. and P.S. Jacobs. 1991. Creating segmented databases from free text for text retrieval. Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, Association for Computing Machinery. 337-346.
Remde, J.R., L.M. Gomez, and T.K. Landauer. 1987. Superbook: an automatic tool for information exploration. ACM Hypertext '87 Proceedings 175-188.
Ribarsky, M.W., E. Ayers, J. Eble and S. Mukherjea. 1994. Using glyphmaker to create customized visualization of complex data. IEEE Computer 27(7): 57-64.
Robertson, G.G., S.K. Card and J.D. Mackinlay. 1993. Information Visualization Using 3D Interactive Animation, Communications of the ACM 36(4): 73.
Robertson, G. and J. Mackinlay. 1993. The document lens. Proceedings of UIST '93. 101-107
Robertson, G.G., J.D. Mackinlay, and S.K. Card. 1991. Cone Trees: Animated 3D visualizations of hierarchical information. Proceedings of ACM Human Factors in Computing Systems (CHI'91), 189-194.
Robertson, P. 1991. A methodology for choosing data representations. IEEE Computer Graphics & Applications 11(3): 56-67.
Robertson, P. and L. De Ferrari. 1994. Systematic approaches to visualization: is a reference model needed? Scientific Visualization, 287-305.
Rogowitz, B.E. and L.A. Treinish. 1993. An architecture for rule-based visualization. Proceedings of IEEE Visualization '93, San Jose, CA, October 1993, IEEE Computer Society Press, Los Alamitos, CA, 236-243.
Rose, D.E. and R.K. Belew. 1991. A connectionist and symbolic hybrid for improving legal research. International Journal of Man-Machine Studies 35(1): 1-33.
Roth, S.F., M.C. Chuah, S. Kerpedjiev, J.A. Kolojejchick, and P. Lucas. 1997. Towards an information visualization workspace: combining multiple means of expression. Human-Computer Interaction Journal.
Roth, S.F., J. Kolojejchick, J. Mattis, and J. Goldstein. 1994. Interactive graphic design using automatic presentation knowledge. CHI '94 112-117.
Roth, S. and J. Mattis. 1990. Data characterization for intelligent graphics presentation. Proceedings CHI '90, April, 193-200.
Salton, G. and C. Buckley. 1991. Automatic text structuring and retrieval: experiments in automatic encyclopedia searching. Proceedings of SIGIR, 21-31.
Salton, G. 1986. Another look at automatic text-retrieval systems. Communications of the ACM 29(7): 648-656.
Salton, G. 1989. Automatic Text Processing: the Transformation, Analysis and Retrieval of Information by Computer. Addison-Wesley, Reading, MA, pp. 530.
Sarkar, M. and M.H. Brown. 1994. Graphical fish-eye views. Communications of the ACM 37(12): 73-84.
Schaffer, D., Z. Zuo, S. Greenberg, L. Bartram, J. Dill, S. Dubs, and M. Roseman. 1996. Navigating hierarchically clustered networks through fisheye and full-zoom methods. ACM Transactions on Computer-Human Interaction 3(2): 162-188.
Shneiderman, B. 1996. The eyes have it: a task by data type taxonomy for information visualizations. Proceedings of IEEE Symposium on Visual Languages, Boulder, CO, September 3-6, 336-343.
Shneiderman, B. 1998. Designing the User Interface: Strategies for Effective Human-Computer Interaction, Third edition. Addison-Wesley, Reading, MA.
Schütze, H., D.A. Hull, and J.O. Pedersen. 1995. A comparison of classifiers and document representations for the routing problem. Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM Press. 229-237.
Shipman, F.M. III, C.C. Marshall, and T.P. Moran. 1995. Finding and using implicit structure in human-organized spatial layouts of information. CHI '95, pp. 346-353.
Sparck Jones, K. 1971. Automatic Keyword Classification for Information Retrieval. Butterworth & Co., London. pp. 253.
Spink, A. 1997. Information science: a third feedback framework. Journal of the American Society for Information Science 48(8): 728-740.
Spoerri, A. 1993. Visual tools for information retrieval. Proceedings of the 1993 IEEE Symposium on Visual Languages. Bergen, Norway. Los Alamitos, CA: IEEE Computer Society Press, 160-168.
Spring, M.B. 1991. Electronic Printing and Publishing: The Document Processing Revolution. Marcel Dekker, New York.
Spring, M.B. and M.C. Jennings. 1993. Virtual reality and abstract data: virtualizing information. Virtual Reality World, Spring, 1(1), pp. c-m.
Spring, M.B., E. Morse, and M. Heo. 1996. Multi-level navigation of a document space. http://www.sis.pitt.edu/~spring/mlnds/
Stanfill, C. and D.L. Waltz. 1992. Statistical methods, artificial intelligence, and information retrieval. In Text-based intelligent systems: Current research and practice in information extraction and retrieval, ed. P.S. Jacobs, Lawrence Erlbaum, pp. 215-226.
Treisman, A. 1986. Features and objects in visual processing. Scientific American 254: 114-124.
Tufte, E.R. 1983. The Visual Display of Quantitative Information. Graphics Press, Cheshire, CT.
Tufte, E.R. 1990. Envisioning Information. Graphics Press, Cheshire, CT.
Tufte, E.R. 1997. Visual Explanations. Graphics Press, Cheshire, CT.
Tweedie, L.A., R. Spence, H. Dawkes and H. Su. 1996. Externalising abstract mathematical models, Proceedings of CHI '96, Vancouver, Canada, ACM Press.
Wehrend, S. and C. Lewis. 1990. A problem-oriented classification of visualization techniques. Proceedings IEEE Visualization '90, October, pp. 139-143, IEEE Computer Society Press.
Willett, P. 1988. Recent trends in hierarchic document clustering: a critical review. Information Processing & Management 24: 577-597.
Willett, P. 1985. An algorithm for the calculation of exact term discrimination values. Information Processing & Management 21(3): 225-232.
Williams, J.G., K.M. Sochats, and E. Morse. 1995. Visualization. In Annual Review of Information Science and Technology (ARIST) 30: 161-207.
Williamson, C. and B. Shneiderman. 1992. The Dynamic HomeFinder: Evaluating dynamic queries in a real estate information exploration system, Proceedings of ACM SIGIR '92, 339-346.
Wise, J.A., J.J. Thomas, K. Pennock, D. Lantrip, M. Pottier, A. Schur, and V. Crow. 1995. Visualizing the non-visual: spatial analysis and interaction with information from text documents. Proceedings of Information Visualization, October 20-21, 1995. IEEE Computer Society Press, Los Alamitos, CA. 51-58.
Yankelovich, N., N. Meyrowitz, and A. van Dam. 1985. Reading and writing the electronic book. IEEE Computer 18(10): 15-30.
York, J. and S. Bohn. 1995. Clustering and dimensionality reduction in SPIRE. Presented at the Automatic Intelligence Processing and Analysis Symposium, Mar 28-30, Tysons Corner, VA