Untangling the web

Casting the Net: The Development of a Resource Collection for an Internet Database

Gerry McKiernan
Coordinator, Science and Technology Section
Iowa State University, Ames


Copyright 1996, Gerry McKiernan. Used with permission.

"We need to develop services that provide
information conveniently and quickly with a
minimal investment of users' time - and to
reduce the quantity of that information to
manageable portions ..." (Dougherty 1991)

Abstract

CyberStacks(sm), a demonstration prototype world wide web (WWW) information service, was formally established in November 1995 on the home page server at Iowa State University with the intent of facilitating the identification and use of significant Internet resources in science and technology. CyberStacks(sm) was created in response to perceived deficiencies in early efforts to organize access to Net resources and the inherent inadequacies of original and current Internet directories and search services. It has adapted the Library of Congress classification scheme and 'neo-conventional' functionality (McKiernan 1996a) as mechanisms for managing access to the growing number of information sources made available over the Net in recent years.

This paper reviews the general features of CyberStacks(sm), analyzes the decision processes behind its creation, describes the conventional and innovative information management tools and techniques that were employed in developing a preview set of candidate titles for potential incorporation within its collection, and discusses the associated impact on the future enhancement of the service.


Introduction

Information Overload is not unique to Internet users. It is a condition that has plagued the Information Society for more than a generation (Klapp 1986) and has led, at least according to some authorities, to feelings of frustration, disconnectedness, boredom and anxiety (Wurman 1989). To assist users in managing the ever-increasing volume of information, librarians and others have developed or applied a variety of selection and organizational tools and techniques which have become commonplace in libraries throughout the world over the years.

A recent review article provides a concise characterization of the Information Overload phenomenon as well as succinct profiles of a number of the methods that traditionally have been used in countering this problem (Hopkins 1995). Among the conventional tools that librarians and others have created to assist users in managing information are guides, handbooks, review articles, literature reviews, abridgments and rankings, as well as indexes, digests and abstracts, among other similar services.

In Fall 1995, CyberStacks(sm) (URL: http://www.public.iastate.edu/~CYBERSTACKS/), a demonstration prototype database of selected Internet resources in science, technology and related areas, was formally established on the home page server at Iowa State University as a model for managing access to and use of an increasing number of Internet resources (McKiernan 1995). CyberStacks(sm) was created in response to perceived deficiencies in early efforts to organize access to Net resources and the inherent inadequacies in original and current Internet directories and search services, and has adapted the Library of Congress classification scheme and 'neo-conventional' functionality (McKiernan 1996a) as mechanisms for providing enhanced organization of and access to significant resources available over the Net.

Overview

CyberStacks(sm) is a centralized, integrated, and unified collection of significant world wide web (WWW) and other Internet resources categorized using the Library of Congress classification schedules. CyberStacks(sm) uses an abridged Library of Congress call number that allows users to browse through virtual library stacks to identify potentially relevant information resources. Resources are categorized first within a broad classification, then within narrower subclasses, and finally listed under a specific classification range and associated subject description that best characterize the content and coverage of the resource (McKiernan 1996b). The majority of resources incorporated within its collection are monographic or serial works, files, databases or search services. All of the selected resources in CyberStacks(sm) are full-text, hypertext, or hypermedia, and of a research or scholarly nature.
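
The three-level arrangement described above (broad class, subclass, classification range with subject description) can be pictured as a nested lookup. The following is a minimal sketch only, not the way CyberStacks(sm) itself is implemented: the names Resource, Stacks, shelve and browse are hypothetical, the example.org addresses are placeholders, and the range label "QD 1-999" is used purely for illustration.

```python
from dataclasses import dataclass, field

# Broad classes of the initial collection (Library of Congress main classes).
BROAD_CLASSES = {
    "Q": "Science",
    "R": "Medicine",
    "S": "Agriculture",
    "T": "Technology",
    "U": "Military Science",
    "V": "Naval Science",
}


@dataclass
class Resource:
    title: str
    url: str
    subclass: str      # e.g. "QD"; first letter is the broad class
    class_range: str   # abridged range label (illustrative placeholder)
    subject: str       # subject description attached to that range


@dataclass
class Stacks:
    # broad class -> subclass -> (range, subject) -> list of resources
    shelves: dict = field(default_factory=dict)

    def shelve(self, resource: Resource) -> None:
        """File a resource under its broad class, subclass and range."""
        broad = resource.subclass[0]
        if broad not in BROAD_CLASSES:
            raise ValueError(f"class {broad!r} is outside the collection")
        key = (resource.class_range, resource.subject)
        (self.shelves.setdefault(broad, {})
             .setdefault(resource.subclass, {})
             .setdefault(key, [])
             .append(resource))

    def browse(self, broad: str, subclass: str) -> list:
        """Return (range, subject, titles) rows, sorted by range label."""
        ranges = self.shelves.get(broad, {}).get(subclass, {})
        return [(rng, subj, sorted(r.title for r in items))
                for (rng, subj), items in sorted(ranges.items())]


stacks = Stacks()
stacks.shelve(Resource("Sample Chemistry Handbook", "http://example.org/chem",
                       "QD", "QD 1-999", "Chemistry"))
for row in stacks.browse("Q", "QD"):
    print(row)
```

Browsing then amounts to walking from the broad class down to a range, much as a reader would walk from a floor to a shelf.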

Background

During the early phases of the implementation of CyberStacks(sm), it became obvious that any effort to manage Information Overload could itself become easily overloaded, if it were not appropriately and adequately defined. Thus, in order to manage a collection of relevant Internet resources, it was essential that its nature be clearly defined and that the criteria used for inclusion of a resource be established (McKiernan 1996b).

After reviewing a number of existing efforts, we decided that the creation of a collection of significant resources in science and technology could serve the information needs of specialists within the Science and Technology Section of the Reference and Instructional Services Department at Iowa State University, as well as those of its clientele. Since the Section provided reference, as well as instructional service, it was decided to establish CyberStacks(sm) initially as a collection of significant world wide web (WWW) and other Net resources in selected fields of science, technology and related areas with reference value (McKiernan 1996b).

Selection Guidelines

Although we recognize that the Net offers a variety of resources of potential value to many clientele and communities for a variety of uses, we do not believe that one should suspend critical judgment in evaluating the quality or significance of sources available from this new medium. In considering the general principles which would guide the selection of world wide web (WWW) and other Internet resources for CyberStacks(sm), we decided to adopt the same philosophy and general criteria used by libraries in the selection of non-Internet Reference resources (American Library Association. Reference Collection Development and Evaluation Committee 1992). These principles, noted below, offered an operational framework in which resources would be considered as candidate titles for the collection:

  1. Authority of the source
  2. Accuracy of information
  3. Clarity of presentation
  4. Uniqueness within the total collection
  5. Recency or timeliness
  6. Favorable reviews
  7. Community needs.
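
One simple way to apply the seven principles above consistently is to record them as a checklist for each candidate title. The sketch below assumes a plain met / not met judgment per criterion; that is an assumption made for illustration, since the criteria are described here only as a general framework, and the class and field names are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class CriteriaCheck:
    """One reviewer's judgment of a candidate against the seven criteria."""
    authority: bool = False          # 1. Authority of the source
    accuracy: bool = False           # 2. Accuracy of information
    clarity: bool = False            # 3. Clarity of presentation
    uniqueness: bool = False         # 4. Uniqueness within the total collection
    recency: bool = False            # 5. Recency or timeliness
    favorable_reviews: bool = False  # 6. Favorable reviews
    community_needs: bool = False    # 7. Community needs

    def unmet(self) -> list:
        """Names of the criteria not yet judged to be satisfied, in list order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


check = CriteriaCheck(authority=True, accuracy=True, clarity=True)
print(check.unmet())
```

A record of this kind leaves the judgment itself to the selector; it only makes visible which criteria still need attention.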

Reference Works

As we wished to create a true 'virtual' reference collection - an electronic counterpart to our physical collection - only resources that were the equivalent of, or analogous to, a print or other electronic reference work (Reference and Information Services: An Introduction, 1995) were initially considered for potential inclusion within the defined collection. In the later stages of our review, we modified our concept of candidate reference works to include resources that could be considered similar in function to any of those delineated in our defined list (List of Reference Resource Types 1996).

These included compilations of acronyms, abstracting and indexing services, bibliographies, biographical sources, databases and data files, dictionaries, directories, encyclopedias, handbooks and manuals, guides to the literature, indexes, maps, standards, statistical sources, and other types of reference works which historically have served to assist users in accessing primary and secondary information sources.

First Steps

While we recognized that the limited breadth of CyberStacks(sm) could hamper its immediate practical usefulness, we believed that its defined scope provided a manageable collection of resources suitable for an experimental prototype. While we had not surveyed selected sites to determine the potential number of resources on the web that might meet the criteria for inclusion within the CyberStacks(sm) collection, from general searching we believed that the number would not exceed several dozen items. Indeed, in response to a preliminary proposal for creating an organized collection of science and technology reference resources posted to various listservs and newsgroups in Spring 1995, one respected science and technology reference specialist commented that he believed that there was not a sufficient quantity of resources to organize within our proposed scheme!

Initially, resources were identified in an ad hoc manner; any which could be considered a conventional reference work were considered acceptable, including those which would likely be of limited reference value to our local clientele or specialists. As we did not physically acquire such works, nor did such works compete for shelf space with publications of greater relevance to our local programs, all potentially relevant Net resources which met the general criteria were selected for preliminary consideration for inclusion in the CyberStacks(sm) collection. By realizing that we need not focus only on local needs, we recognized the potential of creating a model for a centralized, universally-available virtual Reference collection.

As potential candidates were identified, the home page was either directly printed, or e-mailed and then printed, for further review. As time permitted, candidate sites were revisited and re-evaluated and assigned to one of the broad classifications established for the initial CyberStacks(sm) collection (Q, R, S, T, U, V). As opportunities were presented, those resources considered of greatest potential value were assigned a more specific Library of Congress class number based on their content and format. As those assigned to the Science (Q) class included a range of subjects for which there were quality resources and as Science (Q) was the first class in the CyberStacks(sm) sequence, we decided to build the prototype beginning with resources in this broad subject area (Under Construction 1996).

We quickly realized, however, that there were more resources than originally estimated and that managing them by conventional means was time-consuming, inappropriate and paradoxical. It soon became apparent that it made little sense to attempt to organize a collection of electronic resources using a paper-based approach; available hardware and software needed to be used more effectively, or better hardware and systems needed to be adopted. After a systematic review, it became evident that the DOS and Windows-based hardware and systems available within the university Library were inherently limited for a project of this nature and scope and that a more versatile and powerful system was required.

Although the initial and current CyberStacks(sm) collection was created directly on the university's computation center UNIX home page server, and all editing was performed using a UNIX-based editing system, access to and editing of directory files were performed over the campus network on Windows-based PCs from within the Library. Not only was editing tedious, but the hardware environment prevented full multi-tasking operations that could expedite the identification, editing and subsequent incorporation of significant resources within the CyberStacks(sm) scheme. Ironically, a project that had sought to mitigate Information Overload was itself on the verge of becoming unmanageable due to the limitations of established information management practices and technologies.

Enter UNIX

With research funding made available in December 1995 to support a graduate student assistant, and scheduled holiday vacation approaching, we decided to investigate the features and functionality of DEC 3000 UNIX-based terminals located in a remote public classroom at the university's computation center. Through a series of tests and trials, and with the assistance of computation center support staff, we gradually developed a working knowledge and understanding of these units and their operating environment. The UNIX platform, with its ability to establish several separate sessions, to copy and paste text and graphics, to establish multiple Netscape sessions, and to process data more readily, would prove to be ideal for the next phase of our project.

Template

Although CyberStacks(sm) has been initially established as a collection of significant Internet resources with reference value in the fields of science, technology and related areas, other types of resources could also be appropriately incorporated within its collection to create a more complete reference collection or a more comprehensive virtual library. In anticipating the need to manage an increasing number of resources, we recognized the benefit of a workform for expediting the development of the CyberStacks(sm) collection.

While the university's server did not permit the processing of CGI scripts for forms, a basic template could be created in HTML to facilitate the selection and preliminary incorporation of candidate resources. As we seek not to analyze a resource but to characterize it in a manner that permits the user to judge its potential usefulness (McKiernan and Ames 1996), the template format was simple, consisting of three major divisions - a resource title and URL field, a Summary section and a To Search section. The Summary section consists of three duplicate HTML blockquote fields, while the To Search section consists of two. Although we have not sought to standardize the format of the data provided for each selected resource within CyberStacks(sm), an effort has been made to include excerpted information from the source itself that describes its subject coverage, scope, size and/or record structure, as well as available special features or functions (Record Format 1996).
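
The template layout described above (a title and URL field, a Summary section of three blockquotes, and a To Search section of two) can be sketched as a small function that emits a blank or partly filled record. This is an illustration under stated assumptions rather than a copy of the original workform: the heading tags, the function name render_record and its parameters are hypothetical, and only the three divisions and the blockquote counts come from the description.

```python
from html import escape

SUMMARY_SLOTS = 3  # three blockquote fields in the Summary section
SEARCH_SLOTS = 2   # two blockquote fields in the To Search section


def render_record(title="", url="", summary=(), to_search=()):
    """Return one template record as HTML; unused blockquotes stay empty."""
    if len(summary) > SUMMARY_SLOTS or len(to_search) > SEARCH_SLOTS:
        raise ValueError("more excerpts than blockquote fields")

    def blockquotes(excerpts, slots):
        # Pad with empty strings so the record always has the full set of fields.
        padded = list(excerpts) + [""] * (slots - len(excerpts))
        return "\n".join(f"<blockquote>{escape(text)}</blockquote>"
                         for text in padded)

    return "\n".join([
        "<html>",
        f"<head><title>{escape(title)}</title></head>",
        "<body>",
        # Division 1: resource title and URL (markup here is assumed).
        f'<h2><a href="{escape(url)}">{escape(title)}</a></h2>',
        f"<p>URL: {escape(url)}</p>",
        # Division 2: Summary.
        "<h3>Summary</h3>",
        blockquotes(summary, SUMMARY_SLOTS),
        # Division 3: To Search.
        "<h3>To Search</h3>",
        blockquotes(to_search, SEARCH_SLOTS),
        "</body>",
        "</html>",
    ])


blank_template = render_record()
print(blank_template)
```

Calling render_record() with no arguments yields the empty workform; passing excerpts fills the blockquotes in order and leaves the rest blank for later editing.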

Underlying our selection of resources for the CyberStacks(sm) collection is a general collection development philosophy that considers discrete Internet resources as the units to be identified and described within a virtual collection. Unlike many efforts that seek to organize web resources, CyberStacks(sm) intentionally seeks to identify and describe primarily discrete resources, be they unique or part of a larger site. Indeed, as we searched for candidate titles for possible inclusion within its collection, we realized that access to highly significant yet elusive resources within sites could become one of the more important benefits offered to CyberStacks(sm) users.

Copy and Paste

To expedite the incorporation of resources within the template, multiple copies of the template form were created in a running numerical sequence (e.g. new1.htm, new2.htm, new3.htm). With the template series established, a systematic review was undertaken of all potentially relevant Net sites which included resources in the defined fields of Science (Q) in the Library of Congress classification scheme (LC Classification Outline 1990). Candidate sites were identified through available search engines (e.g. Yahoo, Lycos, Alta Vista) and from appropriate sites (e.g. WWW Virtual Library). Where appropriate, sites within sites were visited and, although some had been reviewed previously, were revisited within the context of the current site.
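
Creating the numbered series of template copies is easily scripted. The sketch below follows the new1.htm, new2.htm, new3.htm naming mentioned above; the function name, its parameters and the directory name are hypothetical, and the choice never to overwrite an existing file is an added safeguard, not something reported in the project.

```python
from pathlib import Path


def make_template_series(template_text, directory, count,
                         prefix="new", suffix=".htm"):
    """Create new1.htm .. new<count>.htm in `directory` and return their paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for n in range(1, count + 1):
        path = directory / f"{prefix}{n}{suffix}"
        # Skip files that already exist so a filled-in record is never lost.
        if not path.exists():
            path.write_text(template_text, encoding="utf-8")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # "templates" is a placeholder directory name.
    for p in make_template_series("<html><body></body></html>", "templates", 3):
        print(p.name)
```

Because existing files are left alone, the function can be rerun with a larger count to extend the series as more candidate resources turn up.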

As a relevant resource was identified, its title was copied and pasted into the title field within a new, numbered template file along with its associated URL; any and all potentially relevant text that could be used to characterize the resource was similarly copied and pasted into appropriate blockquotes of the Summary section of the template file. If available, text that described procedures for searching or using the candidate resource was copied and pasted in the To Search division. After the content of the current template was reviewed, it was saved and the next template in the numbered series was opened. Procedures that would have required half an hour on Windows-based PCs were completed in minutes on the large-screen DEC 3000 workstations.

For much of the remainder of December 1995, and early January, any and all Internet sites that included resources in the fields of medicine, agriculture, technology, and military and naval science were visited, reviewed, and revisited, and data on each candidate resource was incorporated into a temporary template record. In the course of this intensive and comprehensive systematic survey, over 500 candidate resources were identified and saved for future review.

Title Index

Although we had previously attempted to fully incorporate resources into the CyberStacks(sm) scheme after identification, in response to preliminary user feedback, we decided to incorporate each into a newly-created Title Index (Title Index 1996) before full incorporation. In response to user expectations and desires, we considered it more important to provide some level of access to identified and relevant resources than to wait until funding and time permitted the preparation of a complete resource profile (McKiernan 1996a).
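
A title index of this kind needs only the title and URL already captured in each template record. The following sketch assumes the index is a simple alphabetical HTML list; the actual layout of the CyberStacks(sm) Title Index is not described here, and the function name and sample records are hypothetical.

```python
from html import escape


def build_title_index(records):
    """Turn (title, url) pairs into an HTML list sorted by title, ignoring case."""
    ordered = sorted(records, key=lambda rec: rec[0].casefold())
    items = [f'<li><a href="{escape(url)}">{escape(title)}</a></li>'
             for title, url in ordered]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Placeholder records; the example.org addresses are not real resources.
sample = [
    ("Sample Physics Data File", "http://example.org/physics"),
    ("sample acronym list", "http://example.org/acronyms"),
]
print(build_title_index(sample))
```

Since the index draws on fields gathered at identification time, a resource can be listed well before its full profile is written.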

Cooperative Collection Development

From the inception of CyberStacks(sm), we recognized the need for, and benefit of, user assistance in its development and refinement, and formally provided opportunities for users to participate in the development of its collection through a nominating process (Nominations 1996).

The creation of a separate Title Index has not only provided enhanced access to selected Net resources, but has also provided a mechanism by which users of the CyberStacks(sm) collection can more directly participate in its further development. At a formal level, we have established four 'virtual' advisory boards (Virtual Advisory Boards 1996) to assist in the development of CyberStacks(sm), including one to coordinate the overall selection of candidate resources for its collection. A coordinator for medical resources (R) has been appointed and she and her colleagues will assist in the selection of titles for priority description and incorporation beginning in April 1996. Likewise, a specialist with professional responsibility for identifying Internet resources in the field of military science has agreed to assist in the evaluation of candidate resources in Military Science (U) for CyberStacks(sm).

As an alternative to focused, coordinated collection development, potential users will be invited in Spring 1996, through a series of listserv and newsgroup postings, to view a subset of the Technology (T) resources in the Title Index and asked to rate these for priority incorporation within the CyberStacks(sm) collection.

Conclusion

In an effort to facilitate access to and use of Internet resources, we have adapted established practices and procedures for managing Information Overload and information resources that have served librarians for generations. While the process of adaptation was at times frustrating and distressing, each challenge presented an opportunity to rethink the efficiency and effectiveness of conventional methods and techniques, and provided an impetus to seek a more effective management approach to the task at hand.

The alternative approaches that emerged in the process have not only offered more efficient methods for developing a collection of web resources, but have also created an opportunity for users to directly assist in defining the scope and depth of this collection in later phases of the project.


References

American Library Association. Reference Collection Development and Evaluation Committee. 1992. Reference Collection Development: A Manual. American Library Association, Reference and Adult Services, Chicago, Ill.

Dougherty, R. M. 1991. Editorial: Balancing Access and Overload. Journal of Academic Librarianship 16 (6):339.

Hopkins, R. L. 1995. Countering Information Overload: The Role of the Librarian. Reference Librarian 49/50:305-333.

Klapp, O. E. 1986. Overload and Boredom: Essays on the Quality of Life in the Information Society. Greenwood Press, Westport, Conn.

LC Classification Outline. 1990. Library of Congress, Washington, D.C.

"List of Reference Resource Types." [http://www.public.iastate.edu/~CYBERSTACKS/ref_book.htm]. 17 March 1996

McKiernan, G. 1995. "CyberStacks(sm): A 'Library-Organized' Virtual Science and Technology Reference Collection." In D-Lib Magazine [http://www.dlib.org/dlib/december95/briefings/12cyber.html]. December 1995.

McKiernan, G. 1996a. Build It and They Will Come: A Case Study of the Creation, Development and Refinement of an Organized Database of Internet Resources in Science and Technology. Paper prepared for The Digital Revolution: Assessing the Impact on Business, Education and Social Structures. The 1996 Mid-Year Meeting of the American Society for Information Science, May 18-22, 1996, San Diego, California.

McKiernan, G. 1996b. The New/Old World Wide Web Order: The Application of 'Neo-Conventional' Functionality to Facilitate Access and Use of a WWW Database of Science and Technology Resources. Submitted to Journal of Internet Cataloging April 1996.

McKiernan, G. and A. Ames. 1996. "two-dimensional limitations / 3-D Possibilities - CyberStacks(sm): An Alternative Model for Selecting | Organizing | Presenting | Accessing WWW Resources: A Position Paper Prepared for the OCLC Internet Cataloging Project Colloquium." [http://www.public.iastate.edu/~CYBERSTACKS/OCLC.htm]. 2 February 1996.

"Nominations." [http://www.public.iastate.edu/~CYBERSTACKS/nominate.htm]. 17 March 1996

"Record Format." [http://www.public.iastate.edu/~CYBERSTACKS/record.htm]. 17 March 1996

Reference and Information Services: An Introduction. 1995. Edited by Richard E. Bopp and Linda C. Smith. Libraries Unlimited, Englewood, Colo.

"Title Index." [http://www.public.iastate.edu/~CYBERSTACKS/title_lst.htm]. 17 March 1996

"Under Construction." [http://www.public.iastate.edu/~CYBERSTACKS/under.htm]. 17 March 1996

"Virtual Advisory Boards." [http://www.public.iastate.edu/~CYBERSTACKS/advisory.htm]. 17 March 1996

Wurman, R. S. 1989. Information Anxiety. Doubleday, New York, N.Y.
