University of Michigan Digital Library Project
Annual Report and Program Plan
Daniel E. Atkins, Project Director
February 15, 1995

I. Introduction

This report, submitted under the terms and expectations of our Cooperative Agreement, reviews activities in Phase 1 and presents plans for Phase 2 (March 1995 through February 1996) of the University of Michigan Digital Library Project. The project is part of the Joint Digital Library Initiative of NSF, ARPA, and NASA. We have organized the report around the three primary areas of activity used to describe the scope of work in the cooperative agreement, namely: (enabling) research, testbed/collection development, and evaluation (including deployment and training). In the following we address all of the goals listed in the scope of work of the cooperative agreement. In some cases we have added goals and denoted these as goal (added).

Additional reporting on activities during Phase 1 (September 1994 through February 1995) has also been provided through the following:

II. Research Plans

Review of Progress in Budget Period 1

In this section, we review our progress during the first 6 months, and discuss plans for the next project period. We will follow the structure of the Scope of Work outline in the cooperative agreement.

Measured against the statement of work for the first 6 months, we have met our goals for this period and have already begun some of the work planned for the second budget period. Our work plan for the first six months described developments divided into three major areas: research, testbed and collection, and evaluation.

Research

In the research area, we have met our initial goals, and have substantially exceeded them in several areas. In particular, our research plans for the first six months largely called for architectural descriptions. We produced not only working documents describing the agent architecture, but also working prototype implementations in several areas, which were demonstrated at the February 14, 1995 site visit. In our statement of work for this area, we proposed to:

Goal 1.a.1 - Develop first generation conspectus for earth science collection

A generalized conspectus language has been developed. This language is now in the process of being tested against many physical collections to determine its ability to accurately describe the collections. During Period 2, we expect to extend the conspectus language as we gain practical experience with describing real collections. We have also implemented a registration agent that accepts conspectus descriptions from collection agents (CIAs) and uses them in a preliminary form to match user requests with potential collections. Evolution of this agent will also occur as we tackle the federation of additional collections.
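As an illustration only, the registration step described above might be sketched as follows. This is a minimal Python sketch under stated assumptions, not the UMDL implementation: the Conspectus fields (subjects, media types), the RegistrationAgent class, and the example collection names are all hypothetical stand-ins, and the real conspectus language is considerably richer than a pair of sets.

```python
from dataclasses import dataclass, field


@dataclass
class Conspectus:
    """Hypothetical, simplified collection description (not the UMDL conspectus language)."""
    collection: str
    subjects: set = field(default_factory=set)
    media_types: set = field(default_factory=set)


class RegistrationAgent:
    """Stores descriptions registered by collection agents and matches requests against them."""

    def __init__(self):
        self._registry = {}

    def register(self, conspectus):
        # A later registration for the same collection replaces the earlier one.
        self._registry[conspectus.collection] = conspectus

    def match(self, subject, media_type=None):
        # Return the sorted names of collections that cover the subject
        # (and the media type, when one is requested).
        hits = []
        for c in self._registry.values():
            if subject not in c.subjects:
                continue
            if media_type is not None and media_type not in c.media_types:
                continue
            hits.append(c.collection)
        return sorted(hits)


# Hypothetical example data: two collection agents register their descriptions.
registry = RegistrationAgent()
registry.register(Conspectus("journals", {"earth science", "materials science"}, {"text"}))
registry.register(Conspectus("weather", {"earth science"}, {"image", "real-time data"}))

print(registry.match("earth science", media_type="image"))
```

Keeping the match logic inside the registry, rather than in each collection agent, mirrors the idea that a central agent can route a user request to potential collections without querying every collection directly.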

Goal 1.a.2 - Develop first generation system architecture, and produce initial version of system architecture document

The core initial agent architecture is complete, and is based on KQML as the basic language for inter-agent communications. A draft architecture document has been written (Appendix A), along with a prototype implementation that provides the basic inter-agent communications systems as well as the capability for collections to register with a central agent.
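To make the style of KQML-based communication concrete, the following minimal Python sketch renders a KQML-style message as a parenthesized expression. It is an assumption-laden illustration rather than the project's protocol: the helper kqml_message, the agent names, and the content expression are hypothetical, and only the general message shape (a performative followed by :keyword parameters such as :sender, :receiver, and :content) is taken from KQML itself.

```python
def kqml_message(performative, **params):
    """Render a KQML-style message as an s-expression string.

    Keyword names become ':keyword' parameters; underscores map to hyphens,
    so reply_with becomes :reply-with.
    """
    parts = [performative]
    for key, value in params.items():
        parts.append(f":{key.replace('_', '-')} {value}")
    return "(" + " ".join(parts) + ")"


# A collection agent announcing itself to a central registry agent.
msg = kqml_message(
    "register",
    sender="blue-skies-cia",            # hypothetical agent name
    receiver="registry-agent",          # hypothetical agent name
    content="(collection blue-skies)",  # hypothetical content expression
)
print(msg)
```

Because the performative is separated from the content, a receiving agent can decide how to handle a message (register, ask, tell) before it interprets the content expression.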

Goal 1.a.3 - Add rudimentary ability to store and display enhanced media types (e.g., video, audio, real-time data)

Our work here has largely been on integrating the Blue Skies data produced by another project at Michigan and making it available under the prototype system. We have developed and tested a collection agent and conspectus for this collection, which demonstrates the ability of the conspectus to describe and represent collections other than simple text collections. During Period 2, we will work with the Blue Skies group to restructure the current standalone client to integrate into our "Mosaic" interface.

We are also integrating the multimedia McGraw-Hill Encyclopedia of Science in an SGML tagged format. We have had some difficulty in obtaining the data tapes in a usable format from McGraw-Hill but have resolved the difficulties and have a written commitment from them for delivery of the appropriate formats in the next few weeks.

Goal 1.b.4 (added) - Survey and initial evaluation of relevant research and sub-systems development to support intellectual property management and remuneration

The intellectual property and economics group (IPE) has been exploring emerging protocols such as the CMU NetBill project of Marvin Sirbu and the First Virtual protocols of Nathaniel Borenstein, et al. They have begun designing a basic infrastructure for accounting and economic transactions.

Testbed and Collection Development

In the Testbed and Collection area, our primary goal was to make available a production system (release 0) based on existing technology developed at Michigan, and to populate it with an initial collection of interest to space science users. During Period 1 we committed to:

Goal 1.b.1 - Develop "Mosaic" interface to existing DIRECT system

This interface was finished in December 1994 and has been in production for several months. We took the existing DIRECT system (developed at the University of Michigan) and converted its proprietary clients and server interface (which only ran under UNIX) to now operate under any WWW client (such as Mosaic, Netscape, etc.). This has enabled our system to be usable on any platform supported by a WWW client.

Goal 1.b.2 - Populate database with initial earth science collection (approx. 50 titles) from Elsevier, McGraw-Hill, Encyclopaedia Britannica, and UMI

We are on schedule to meet our initial goal of populating the system with a substantial set of titles. We have in hand over 1,500 CD-ROMs of digital journals from UMI, covering the back issues of 43 journals, and have already loaded about 20 titles from that collection into the on-line system. By February 14, we had approximately 54,000 journal articles online (about 30 Gbytes). We are expecting, by the end of February, the collections from McGraw-Hill and Elsevier, as well as additional titles from UMI. This will bring us past our period 1 goal.

Goal 1.b.3 - Deploy current DIRECT system with "Mosaic" interface (release 0) and enhanced collection to Ann Arbor schools testbed site and University of Michigan

As the project has evolved, the deployment, training, and evaluation activities are being handled by one sub-group. This group coordinates closely with the testbed building group, but is separate. We have therefore shifted deployment activities to the next activity area which is described below.

Deployment, Training, and Evaluation

We replaced the term "evaluation" as used in the original statement of work with the broader set of related activities: deployment, training, and evaluation (DTE). The DTE activity is well underway at the University of Michigan, and preliminary efforts have begun at the high schools. During the initial period we committed to:

Goal 1.b.3 - Deploy current DIRECT system with "Mosaic" interface (release 0) and enhanced collection to Ann Arbor schools testbed site and University of Michigan

The current system has been available on the University of Michigan campus for several months. Deploying it at the Ann Arbor Public Schools has taken somewhat longer than expected due to the need to help the schools with establishing Internet connectivity. This task was completed at Community High School at the end of January, and is scheduled to be completed at Pioneer High School at the end of February.

Goal 1.c.1 - Perform baseline evaluation of testbed sites as basis for longitudinal study (release 0)

For the production DIRECT system we have undertaken usability studies and have gathered information on how the system is used and how it changes people's work style in the context of materials science research. With support from Elsevier we are about to begin several focus groups of users of DIRECT with the materials science collection provided by Elsevier. This should produce useful data on how the system is used.

For evaluation work in the public schools we filed the various human subject study plans required and have built a working relationship with the teachers and staff at the schools. We have conducted a skills inventory of the science students at Community High School and are in the final stages of designing a survey instrument for establishing a more formal baseline.

Curriculum development and training sessions for high school science teachers are being conducted by UMDL DTE staff. These will continue in period 1.

Other

We have also received support from the Andrew W. Mellon Foundation to apply UMDL technology to an experiment in providing university libraries digital, networked access to all back issues of ten leading history and economics journals. This is called the Journal Storage (JSTOR) Project.

Plans for Budget Period 2

The plan laid out in the cooperative agreement is still the operative plan for the second budget period. We have already begun several of the investigations proposed for the second period, and are well positioned to meet our plan for period 2.

The research focus of budget period 2 will be to complete the agent architecture specification and to deploy prototypes based on this architecture. Reviewing the original work plan submitted with the final proposal, during period 2 we planned to:

Research

Goal 2.a.1 - Develop initial interview agent that assists users in constructing queries based on pre-compiled characteristics of users and limited available content of collections (restricted conspectus)

We already have a prototype interview agent up and running, which will be demonstrated during the site visit. This prototype begins to show the types of capabilities we expect in the production user interview agent. The current capabilities of the interview agent are limited, and the actual provisioning of collections to the user agent is currently relatively "hard wired." However, many of these constraints will be relaxed in this coming phase.

Goal 2.a.2 - Develop initial specifications for inter-agent protocols (internal draft protocol documents)

We are a bit ahead of schedule on this: a first prototype of a system involving 30 interacting agents was demonstrated at the February site visit. The initial KQML-based inter-agent protocols have already been specified, although there is significant work to be done in formalizing the specification. The current draft document is primarily for internal use. By the end of this year, we plan to produce a protocol document for external use that would allow others outside of the project to build their own agents that participate with our system.

Goal 2.a.3 - Design and implement prototype system to search multiple collections with collection selection based on collection agent capability (e.g., content, search engine capability, media type)

The current implementation of the agent protocol is designed to exercise the basic protocol and to test scalability. However, the ability to match user needs with a general set of collections is very limited in the current prototype. During the second year we plan on substantially generalizing this function, so that agents can match user needs, as collected by the user agents, with arbitrary collections, which are defined in the conspectus language. This goal will also follow from our agent-based architecture and increased understanding of how to create collection interface agents (CIA) for arbitrary collections.

Goal 2.a.4 - Add interface support for static, archival and real-time geophysical data sets

We plan on enhancing both the collection and our ability to store and make available non-textual datasets during year 2. We plan on working with video data such as is being made available to us by Encyclopaedia Britannica. We also plan on making available image data produced by NASA, as well as increased use of real-time data that is available through projects such as Blue Skies.

Goal 2.a.5 - Enhanced capability to store and display enhanced media types (e.g., video, audio, real-time data)

Much of this work will be done in the context of SGML.

Goal 2.a.6 (added) - Architectural extensions

Goal 2.a.7 (added) - Economic mechanisms

Design and development of mechanisms for commerce within the UMDL. This includes specification languages for goods and services (e.g., intellectual property rights, agent capabilities), transaction mechanisms (e.g., integrating a billing service), and negotiation protocols.

Testbed/Collection Development

Goal 2.b.1 - Populate database with additional earth science collection (approx. 100 titles) from Elsevier, McGraw-Hill, Encyclopaedia Britannica, and UMI

We plan to add substantially to the actual collections available under the UMDL. During the coming year we expect to easily double the size of the collection available under UMDL. We are also working on adding new publishing partners to the project, and expect to announce several additional partners during this year. We are on track to reach this goal and in addition have serious discussions underway with other content providers including American Chemical Society, Infosoft, Ebsco, Encyclopaedia Britannica (for content beyond initial commitment), Grolier's, Cambridge University Press, and University of Chicago Press.

Goal 2.a.4 - Add static, archival and real-time geophysical data sets

We are exploring acquisition of non-text collections from ERIM (remote sensing data), Bellcore (GIS data), and will work closely with Roberta Johnson and the NASA-sponsored University of Michigan Windows to the Universe project.

Goal 2.b.2 - Add SGML documents to collection, add SGML based search capabilities to search engine, integrate SGML renderer to display SGML based documents.

During the first 6 months we began the process of incorporating SGML documents into the system. We have selected an SGML renderer (Panorama) and have demonstrated collection agents for SGML collections. We expect that many of the new data sets that will be loaded into the UMDL will start to be made available in SGML during this coming year. We already have commitments from Elsevier and McGraw-Hill to deliver SGML to us starting in March. We will also begin an experiment, funded by the Mellon Foundation, to tag a major archival collection (to be determined) in SGML. This will give us significant experience with the issues and costs of tagging retrospective collections.

Deployment, Training, and Evaluation

Goal 2.c.1 - Deploy release 1 UMDL at test sites

We plan to begin the rollout of what we internally call the next generation system (NGS) to the test sites almost immediately. While the current production system will remain available for the foreseeable future, we plan on making the capabilities of the agent-based system available to the test sites very quickly, so that we can get early reaction to its benefits and weaknesses. We also plan on adding New York Public Library and Stuyvesant High School (the two other initial test sites) to the tests during the summer of 1995.

Goal 2.c.2 - Evaluate UMDL release 1 at test sites.

The evaluation and the collection of longitudinal data already begun will continue, giving us an on-going evaluation effort as new releases are deployed to the field.

III. Financial Report

These reports are unavailable.

Actual expenditures and projections for Period 1

Actual expenditures and projections by project area for Period 1

Project expenditure projections for Period 2

Explanation for unexpended funds in Period 1

Outside and institutional support

NSF Form 1030

Carryover from Period 1

Period 2

IV. Management Report

Overview

The management plan for the NSF/ARPA/NASA Digital Library has developed within the structure of the Management Plan outlined in the cooperative agreement. The researchers, staff, and students involved in the project have developed shared goals, culture, and a common "language" for the project. The project is being managed by the UMDL Operating Committee, which meets weekly, and consists of the leaders of the sub-teams together with Dan Atkins (PD), and Laurie Crum (Project Coordinator). The operating committee deals with coordination among the sub-groups, overall issues arising, and coordination of visitors, papers, and conferences. Outlines of minutes from the Operating Committee meetings are distributed to all project participants. The sub-teams and leaders are outlined below:

  Overall system architecture                            B. Birmingham
  User Interface Agents (UIA)                            K. Drabenstott
  Collection Interfaces Agents (CIA)                     A. Warner*
  Testbed construction and deployment                    R. Frank
  Deployment, Testing and Evaluation                     E. Soloway
  Collection development and Library Operations
     /User Support                                       W. Lougee
  Corporate partner Liaison                              K. Willis
  Mediation Agents (MA)                                  E. Durfee
  Intellectual Property and Economics                    M. Wellman*
(* change from original cooperative agreement)

Since the project has been underway, the myriad of intersecting issues and tasks among sub-groups have become clear (see attached Figure 1). Sub-groups meet on a regular basis. Each sub-group has one or more members that regularly attend related sub-group meetings. In addition, all project members meet monthly. All meetings are considered to be "open" and NSF/UMDL/NASA project members, as well as external partners, are encouraged to attend meetings.

In addition to coordinating the complex interactions of the project teams and members, the management plan organizes, maintains, and explores venues for increasing the visibility of the project (both within and outside the University) and "day to day" communications among project participants. Electronic mail groups for each sub-group, project partners, and all project members are rigorously maintained. Project members also subscribe to the mail groups maintained by NSF. The UMDL HomePage contains information ranging from a copy of the original proposal to drafts of "specs" and internal documentation. This document is continually updated and maintained and is available at the following URL:

http://www.sils.umich.edu/UMDL/HomePage.html

Plans are underway to coordinate a regularly published newsletter which describes the activities and events of the University-wide digital library initiatives and the NSF/ARPA/NASA Digital Library project. The first Executive Advisory Committee meeting is being planned for early March. A copy of the letter which was sent to Committee members is attached as Appendix B.

Task management and project activities are captured in a GANTT chart. This chart acts as a record of project activities and tracks the upcoming goals and milestones of the project. A copy of the current chart is attached as Appendix C.

Each sub-group engages in a process of research and prototype development that influences the production system. In addition, both the collection development and deployment, evaluation and training sub-groups aid in development of the production system. This process is illustrated by Figure 2. Therefore, each sub-group of the project undergoes both internal and external processes which inform the design and development of the production and next generation systems. The following table outlines primary activities of the project team. Since many of the activities and tasks are highly interrelated, sub-groups have been combined to illustrate the close coordination among sub-groups, tasks and activities.

Group                          Primary Activities
________________________________________________________________

Research/System
Architecture/Testbed                  Deployment of Release 0 System
(B. Birmingham)                         Population of system with content
User Interface Agents                   Conspectus spec 0.1
(K. Drabenstott)                        Architecture spec 1.0
Collection Interface Agents             UIA spec
(A. Warner)                             Installing infrastructure at Ann Arbor
Testbed construction and Deployment       high school sites
(R. Frank)                              Modify DIRECT
                                      Development of "Next Generation" 
                                        (Release 1) System
                                        Agent architecture
                                        Interagent communication
                                        Searching
                                        Examination/integration of SGML
                                        Examination of electronic commerce 
                                          mechanisms
                                      Research activities
                                         Identification and evaluation of search 
                                           engines
                                         Research on distributed databases
________________________________________________________________

Collection development
(W. Lougee)                         Committed content providers
                                      Elsevier
                                      McGraw-Hill
                                      UMI
                                    Under review/negotiation
                                      Grolier's
                                      Cambridge University Press
                                      InfoSoft
                                      Ebsco
                                      NASA
                                      ERIM
________________________________________________________________

Deployment, Testing and Evaluation
(E. Soloway)                        Changing/development of "digital culture" 
                                      at Ann Arbor High School sites
                                    Site assessment of infrastructure needs
                                    Collection of baseline data on student use
________________________________________________________________

Partnerships and Outreach 
  Activities                        Equipment/software received from
(K. Willis)                           Apple
                                      IBM
                                      Open Text Corp.
                                    Executive Advisory Committee meeting 
                                     scheduled for 4/95
                                    Building relationships with corporate 
                                     partners
                                      Bellcore
                                      Xerox PARC
                                      Apple Advanced Technology Group
                                      First Virtual
                                    Building relationships with University 
                                     partners
                                      Roberta Johnson (UM)
                                      Marvin Sirbu (CMU)
                                    Leveraging related projects
                                      JSTOR
                                      Making of America
                                      MUSE
________________________________________________________________

The accomplishments and activities of each sub-group were recently highlighted during the first NSF/ARPA/NASA site visit to the University of Michigan, which showcased the depth and scope of the tasks and activities outlined above.

This project thrives in a rich multidisciplinary environment where faculty, staff, and students from many disciplines are collaborating to create the NSF/ARPA/NASA digital library. Following is a roster of current participants. This list is clearly a "slice of time" in that the project continues to encourage collaborations and partnerships across all domains.

Faculty Members
Daniel E. Atkins, III         School of Information and Library Studies
Howard Besser                 School of Information and Library Studies
William P. Birmingham         College of Engineering
David Blair                   School of Business Administration
Karen M. Drabenstott          School of Information and Library Studies
Edmund H. Durfee              College of Engineering
Joan C. Durrance              School of Information and Library Studies
Carolyn O. Frost              School of Information and Library Studies
Joseph W. Janes               School of Information and Library Studies
Jeffrey K. MacKie-Mason       College of Literature, Science, and the Arts
Michael J. McGill             Information Technology Division
David L. Rodgers              School of Information and Library Studies 
Elke A. Rundensteiner         College of Engineering
Perry J. Samson               College of Engineering
Elliot Soloway                College of Engineering
Hal R. Varian                 College of Literature, Science, and the Arts
Amy J. Warner                 School of Information and Library Studies
Michael P. Wellman            College of Engineering
Terry E. Weymouth             College of Engineering


Senior Staff
Kenneth B. Alexander          College of Engineering
James E. Alloway              University Library
Elizabeth L. Blakely          School of Information and Library Studies
Laurie B. Crum                School of Information and Library Studies
Colin L. Day                  University Press
Nathan Eriksen                School of Information and Library Studies
Randall L. Frank              College of Engineering
R. L. Liming, Jr.             School of Information and Library Studies
Wendy P. Lougee               University Library
Douglas Orr                   School of Information and Library Studies
Gregory R. Peters, Jr.        College of Engineering
W. J. Price-Wilkin            University Library
Douglas E. Van Houweling      Information Technology Division
Katherine F. Willis           School of Information and Library Studies


Graduate Students
Frederick E. Freiheit         Computer Science and Engineering
Eric J. Glover                Computer Science and Engineering
Stephen E. Kirk               School of Information and Library Studies
Nancy Lin                     School of Information and Library Studies
Steve Markel                  School of Information and Library Studies
Tracy Mullen                  Computer Science and Engineering
Anisoara Nica                 Computer Science and Engineering
Sunju Park                    Computer Science and Engineering
Young-Pa So                   Computer Science and Engineering
Jose M. Vidal                 Computer Science and Engineering
John P. Weise                 School of Information and Library Studies


Corporate Partners
Drew Burton                   American Mathematical Society
John Seely Brown              XEROX PARC
James Corgel                  IBM
Len Redon                     Eastman-Kodak
Dennis Egan                   Bellcore
Lee Esler                     Kodak
Karen Hunter                  Elsevier
Rick LeFaivre                 Apple Computers, Inc.
Michael Lesk                  Bellcore
John Chrisstofferson          McGraw-Hill
Ann Okerson                   Association of Research Libraries (ARL)
Phil Stockton                 Encyclopaedia Britannica
James P. Romer                UMI (University Microfilm, Inc.)


University Partners
Roberta Johnson               University of Michigan
Marvin Sirbu                  Carnegie Mellon University

Figures

Figure 1. Road Map of Activities

Figure 2. Project Organization

Appendix A. UMDL System Architecture Specification

Appendix A is not included in this document.

Appendix B. Executive Advisory Committee Letter

Appendix B is not included in this document.

Appendix C. Long Term Overview and GANTT Chart

Appendix C is not included in this document.



Comments or questions may be sent to: UMDL.INFO@umich.edu