
This report, submitted under the terms and expectations of our Cooperative Agreement, reviews activities in Phase 1 and presents plans for Phase 2 (March 1995 through February 1996) of the University of Michigan Digital Library Project. The project is part of the Joint Digital Library Initiative of NSF, ARPA, and NASA. We have organized the report around the three primary areas of activity used to describe the scope of work in the cooperative agreement, namely: (enabling) research, testbed/collection development, and evaluation (including deployment and training). In the following we include all of the goals listed in the scope of work of the cooperative agreement. In some cases we have added goals and denoted these as goal (added).
Additional reporting on activities during Phase 1 (September 1994 through February 1995) has also been provided through the following:
In this section, we review our progress during the first 6 months, and discuss plans for the next project period. We will follow the structure of the Scope of Work outline in the cooperative agreement.
Reviewing the statement of work for the first 6 months, we have met our goals for this period and have already begun some of the work planned for the second budget period. Our work plan for the first six months described developments divided into three major areas: research, testbed and collection, and evaluation.
In the research area, we have met our initial goals, and have substantially exceeded them in several areas. In particular, our research plans for the first six months largely called for architectural descriptions. We not only produced working documents describing the agent architecture, but also working prototype implementations in several areas which were demonstrated at the February 14, 1995 site visit. Reviewing our proposed statement of work in this area, we proposed to:
Goal 1.a.1 - Develop first generation conspectus for earth science collection
A generalized conspectus language has been developed. This language is now in the process of being tested against many physical collections to determine its ability to accurately describe the collections. During Period 2, we expect to extend the conspectus language as we gain practical experience with describing real collections. We have also implemented a registration agent that accepts conspectus descriptions from collection agents (CIAs) and uses them in a preliminary form to match user requests with potential collections. Evolution of this agent will also occur as we tackle the federation of additional collections.
Goal 1.a.2 - Develop first generation system architecture, and produce initial version of system architecture document
The core initial agent architecture is complete, and is based on KQML as the basic language for inter-agent communications. A draft architecture document has been written (Appendix A), along with a prototype implementation that provides the basic inter-agent communications systems as well as the capability for collections to register with a central agent.
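As an illustration of what KQML-style inter-agent communication looks like, the following minimal sketch renders a registration message as a KQML performative. KQML messages are parenthesized expressions consisting of a performative name followed by keyword parameters such as :sender, :receiver, :language, and :content. The agent names, the "conspectus" language tag, and the content expression below are hypothetical; the actual UMDL message formats are defined in the architecture document, not here.

```python
def kqml_message(performative, **params):
    """Render a KQML-style performative as a parenthesized string.

    Keyword arguments become KQML parameters; underscores in the Python
    names are turned into hyphens (reply_with -> :reply-with).
    """
    parts = ["(" + performative]
    for key, value in params.items():
        parts.append(":" + key.replace("_", "-") + " " + str(value))
    return " ".join(parts) + ")"


# Hypothetical example: a collection agent registering with a central agent.
# All names and the content expression are assumptions for illustration only.
register_msg = kqml_message(
    "register",
    sender="blue-skies-cia",
    receiver="registry-agent",
    language="conspectus",
    content='(collection :name "Blue Skies" :media real-time-data)',
)
print(register_msg)
```

A helper of this kind keeps the message envelope separate from the content language, which is the main design idea behind KQML: agents agree on the performatives and parameters while the :content expression can be in any language both sides understand.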
Goal 1.a.3 - Add rudimentary ability to store and display enhanced media types (e.g., video, audio, real-time data)
Our work here has largely been on integrating the Blue Skies data produced by another project at Michigan and making it available under the prototype system. We have developed and tested a collection agent and conspectus for this collection, which demonstrates the ability of the conspectus to describe and represent collections other than simple text collections. During Period 2, we will work with the Blue Skies group to restructure the current standalone client to integrate into our "Mosaic" interface.
We are also integrating the multimedia McGraw-Hill Encyclopedia of Science in an SGML tagged format. We have had some difficulty in obtaining the data tapes in a usable format from McGraw-Hill but have resolved the difficulties and have a written commitment from them for delivery of the appropriate formats in the next few weeks.
Goal 1.b.4 (added) - Survey and initial evaluation of relevant research and sub-systems development to support intellectual property management and remuneration
The intellectual property and economics group (IPE) has been exploring emerging protocols such as the CMU NetBill project of Marvin Sirbu and the First Virtual protocols of Nathaniel Borenstein et al. The group has begun designing a basic infrastructure for accounting and economic transactions.
In the Testbed and Collection area, our primary goal was to make available a production system (release 0) based on existing technology developed at Michigan, and to populate it with an initial collection of interest to space science users. During Period 1 we committed to:
Goal 1.b.1 - Develop "Mosaic" interface to existing DIRECT system
This interface was finished in December 1994 and has been in production for several months. We took the existing DIRECT system (developed at the University of Michigan) and converted its proprietary clients and server interface (which only ran under UNIX) to now operate under any WWW client (such as Mosaic, Netscape, etc.). This has enabled our system to be usable on any platform supported by a WWW client.
Goal 1.b.2 - Populate database with initial earth science collection (approx. 50 titles) from Elsevier, McGraw-Hill, Encyclopaedia Britannica, and UMI
We are on schedule to meet our initial goal of populating the system with a substantial set of titles. We have in hand the back issues of 43 journals on over 1,500 CD-ROMs from UMI, and have already loaded about 20 titles from that collection into the on-line system. By February 14, we had approximately 54,000 journal articles online (about 30 Gbytes). We are expecting, by the end of February, the collections from McGraw-Hill and Elsevier, as well as additional titles from UMI. This will bring us past our period 1 goal.
Goal 1.b.3 - Deploy current DIRECT system with "Mosaic" interface (release 0) and enhanced collection to Ann Arbor schools testbed site and University of Michigan
As the project has evolved, the deployment, training, and evaluation activities are being handled by one sub-group. This group coordinates closely with the testbed building group, but is separate. We have therefore shifted deployment activities to the next activity area which is described below.
We replaced the term "evaluation" as used in the original statement of work with the broader set of related activities: deployment, training, and evaluation (DTE). The DTE activity is well underway at the University of Michigan, and preliminary efforts have begun at the high schools. During the initial period we committed to:
Goal 1.b.3 - Deploy current DIRECT system with "Mosaic" interface (release 0) and enhanced collection to Ann Arbor schools testbed site and University of Michigan
The current system has been available on the University of Michigan campus for several months. Deploying it at the Ann Arbor Public Schools has taken somewhat longer than expected due to the need to help the schools with establishing Internet connectivity. This task was completed at Community High School at the end of January, and is scheduled to be completed at Pioneer High School at the end of February.
Goal 1.c.1 - Perform baseline evaluation of testbed sites as basis for longitudinal study (release 0)
For the production DIRECT system we have undertaken usability studies and are gathering information on how the system is used and how it changes people's work style in the context of material science research. With support from Elsevier we are about to begin several focus groups of users of DIRECT with the material science collection provided by Elsevier. This should produce useful data on how the system is used.
For evaluation work in the public schools we filed the various human subject study plans required and have built a working relationship with the teachers and staff at the schools. We have conducted a skills inventory of the science students at Community High School and are in the final stages of designing a survey instrument for establishing a more formal baseline.
Curriculum development and training sessions for high school science teachers are being conducted by UMDL DTE staff. These will continue in period 1.
The plan laid out in the cooperative agreement is still the operative plan for the second budget period. We have already begun several of the investigations proposed for the second period, and are well positioned to meet our plan for period 2.
The research focus of budget period 2 will be to complete the agent architecture specification and to deploy prototypes based on this architecture. Reviewing the original work plan submitted with the final proposal, during period 2 we planned to:
Goal 2.a.1 - Develop initial interview agent that assists users in constructing queries based on pre-compiled characteristics of users and limited available content of collections (restricted conspectus)
We already have a prototype interview agent up and running, which will be demonstrated during the site visit. This prototype begins to show the types of capabilities we expect in the production user interview agent. The current capabilities of the interview agent are limited, and the actual provisioning of collections to the user agent is currently relatively "hard wired." However, many of these constraints will be relaxed in this coming phase.
Goal 2.a.2 - Develop initial specifications for inter-agent protocols (internal draft protocol documents)
We are a bit ahead of schedule on this: a first prototype of a system involving 30 interacting agents was demonstrated at the February site visit. The initial KQML-based inter-agent protocols have already been specified, although there is significant work to be done in formalizing the specification. The current draft document is primarily for internal use. By the end of this year, we plan to produce a protocol document for external use that would allow others outside of the project to build their own agents that participate with our system.
Goal 2.a.3 - Design and implement prototype system to search multiple collections with collection selection based on collection agent capability (e.g., content, search engine capability, media type)
The current implementation of the agent protocol is designed to exercise the basic protocol and to test scalability. However, the ability to match user needs with a general set of collections is very limited in the current prototype. During the second year we plan on substantially generalizing this function, so that agents can match user needs, as collected by the user agents, with arbitrary collections defined in the conspectus language. This goal will also follow from our agent-based architecture and increased understanding of how to create collection interface agents (CIAs) for arbitrary collections.
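The kind of capability-based selection described in this goal can be sketched as follows. This is only an illustration under stated assumptions, not the UMDL implementation: the Conspectus fields, the Registry class, and the sample collection descriptions are all hypothetical, and the real conspectus language is considerably richer than three sets of keywords.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Conspectus:
    """Hypothetical, simplified description of one collection."""
    name: str
    subjects: frozenset
    media_types: frozenset
    search_features: frozenset


@dataclass
class Registry:
    """Hypothetical central agent that stores conspectus descriptions."""
    collections: list = field(default_factory=list)

    def register(self, conspectus):
        # A collection agent (CIA) would send its description here.
        self.collections.append(conspectus)

    def select(self, subject, media_type=None, search_feature=None):
        """Return names of collections that satisfy every stated need."""
        matches = []
        for c in self.collections:
            if subject not in c.subjects:
                continue
            if media_type is not None and media_type not in c.media_types:
                continue
            if search_feature is not None and search_feature not in c.search_features:
                continue
            matches.append(c.name)
        return matches


# Sample descriptions; the attribute values are invented for illustration.
registry = Registry()
registry.register(Conspectus(
    name="journal-articles",
    subjects=frozenset({"earth science"}),
    media_types=frozenset({"text"}),
    search_features=frozenset({"full-text"}),
))
registry.register(Conspectus(
    name="weather-feed",
    subjects=frozenset({"earth science", "atmospheric science"}),
    media_types=frozenset({"real-time data", "image"}),
    search_features=frozenset({"by-location"}),
))

print(registry.select("earth science"))
print(registry.select("earth science", media_type="real-time data"))
```

Filtering on exact set membership is the simplest possible matching rule. A more general matcher would rank partial matches rather than exclude them, which is closer to the generalization planned for the second year.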
Goal 2.a.4 - Add interface support for static, archival and real-time geophysical data sets
We plan on enhancing both the collection and our ability to store and make available non-textual datasets during year 2. We plan on working with video data such as is being made available to us by Encyclopaedia Britannica. We also plan on making available image data produced by NASA, as well as increased use of real-time data that is available through projects such as Blue Skies.
Goal 2.a.5 - Enhanced capability to store and display enhanced media types (e.g., video, audio, real-time data)
Much of this work will be done in the context of SGML.
Goal 2.a.6 (added) - Architectural extensions
Goal 2.a.7 (added) - Economic mechanisms
Design and development of mechanisms for commerce within the UMDL. This includes specification languages for goods and services (e.g., intellectual property rights, agent capabilities), transaction mechanisms (e.g., integrating a billing service), and negotiation protocols.
Goal 2.b.1 - Populate database with additional earth science collection (approx. 100 titles) from Elsevier, McGraw-Hill, Encyclopaedia Britannica, and UMI
We plan to add substantially to the actual collections available under the UMDL. During the coming year we expect to easily double the size of the collection available under UMDL. We are also working on adding new publishing partners to the project, and expect to announce several additional partners during this year. We are on track to reach this goal and in addition have serious discussions underway with other content providers including American Chemical Society, Infosoft, Ebsco, Encyclopaedia Britannica (for content beyond initial commitment), Grolier's, Cambridge University Press, and University of Chicago Press.
Goal 2.a.4 - Add static, archival and real-time geophysical data sets
We are exploring acquisition of non-text collections from ERIM (remote sensing data), Bellcore (GIS data), and will work closely with Roberta Johnson and the NASA-sponsored University of Michigan Windows to the Universe project.
Goal 2.b.2 - Add SGML documents to collection, add SGML based search capabilities to search engine, integrate SGML renderer to display SGML based documents.
During the first 6 months we began the process of incorporating SGML documents into the system. We have selected an SGML renderer (Panorama) and have demonstrated collection agents for SGML collections. We expect that many of the new data sets that will be loaded into the UMDL will start to be made available in SGML during this coming year. We already have commitments from Elsevier and McGraw-Hill to deliver SGML to us starting in March. We will also begin an experiment, funded by the Mellon Foundation, to tag a major archival collection (to be determined) in SGML. This will give us significant experience with the issues and costs of tagging retrospective collections.
Goal 2.c.1 - Deploy release 1 UMDL at test sites
We plan to begin the rollout of what we internally call the next generation system (NGS) to the test sites almost immediately. While the current production system will remain available for the foreseeable future, we plan on making the capabilities of the agent-based system available to the test sites very quickly, so that we can get early reaction to its benefits and weaknesses. We also plan on adding New York Public Library and Stuyvesant High School (the two other initial test sites) to the tests during the summer of 1995.
Goal 2.c.2 - Evaluate UMDL release 1 at test sites.
The evaluation and the collection of longitudinal data already begun will continue, giving us an ongoing evaluation effort as new releases are deployed to the field.
These reports are unavailable.
Overall system architecture                                      B. Birmingham
User Interface Agents (UIA)                                      K. Drabenstott
Collection Interfaces Agents (CIA)                               A. Warner*
Testbed construction and deployment                              R. Frank
Deployment, Testing and Evaluation                               E. Soloway
Collection development and Library Operations / User Support     W. Lougee
Corporate partner Liaison                                        K. Willis
Mediation Agents (MA)                                            E. Durfee
Intellectual Property and Economics                              M. Wellman*

(* change from original cooperative agreement)
Since the project has been underway, the myriad of intersecting issues and tasks among sub-groups have become clear (see attached Figure 1). Sub-groups meet on a regular basis. Each sub-group has one or more members that regularly attend related sub-group meetings. In addition, all project members meet monthly. All meetings are considered to be "open" and NSF/UMDL/NASA project members, as well as external partners, are encouraged to attend meetings.
In addition to coordinating the complex interactions of the project teams and members, the management plan organizes, maintains, and explores venues for increasing the visibility of the project (both within and outside the University) and "day to day" communications among project participants. Electronic mail groups for each sub-group, project partners, and all project members are rigorously maintained. Project members also subscribe to the mail groups maintained by NSF. The UMDL HomePage contains information ranging from a copy of the original proposal to drafts of "specs" and internal documentation. This document is continually updated and maintained and is available at the following URL:
http://www.sils.umich.edu/UMDL/HomePage.html
Plans are underway to coordinate a regularly published newsletter which describes the activities and events of the University-wide digital library initiatives and the NSF/ARPA/NASA Digital Library project. The first Executive Advisory Committee meeting is being planned for early March. A copy of the letter which was sent to Committee members is attached as Appendix B.
Task management and project activities are captured in a GANTT chart. This chart acts as a record of project activities and tracks the upcoming goals and milestones of the project. A copy of the current chart is attached as Appendix C.
Each sub-group engages in a process of research and prototype development that influences the production system. In addition, both the collection development and deployment, evaluation and training sub-groups aid in development of the production system. This process is illustrated by Figure 2. Therefore, each sub-group of the project undergoes both internal and external processes which inform the design and development of the production and next generation systems. The following table outlines primary activities of the project team. Since many of the activities and tasks are highly interrelated, sub-groups have been combined to illustrate the close coordination among sub-groups, tasks and activities.
Group / Primary Activities
________________________________________________________________
Research/System Architecture/Testbed (B. Birmingham)
User Interface Agents (K. Drabenstott)
Collection Interface Agents (A. Warner)
Testbed construction and Deployment (R. Frank)

    Deployment of Release 0 System
        Population of system with content
        Conspectus spec 0.1
        Architecture spec 1.0
        UIA spec
        Installing infrastructure at Ann Arbor high school sites
        Modify DIRECT
    Development of "Next Generation" (Release 1) System
        Agent architecture
        Interagent communication
        Searching
        Examination/integration of SGML
        Examination of electronic commerce mechanisms
    Research activities
        Identify and evaluation of search engines
        Research on distributed databases
________________________________________________________________
Collection development (W. Lougee)

    Committed content providers
        Elsevier
        McGraw-Hill
        UMI
    Under review/negotiation
        Grolier's
        Cambridge Press
        InfoSoft
        Ebsco
        NASA
        ERIM
________________________________________________________________
Deployment, Testing and Evaluation (E. Soloway)

    Changing/development of "digital culture" at Ann Arbor High School sites
    Site assessment of infrastructure needs
    Collection of baseline data on student use
________________________________________________________________
Partnerships and Outreach Activities (K. Willis)

    Equipment/software received from
        Apple
        IBM
        Open Text Corp.
    Executive Advisory Committee meeting scheduled for 4/95
    Building relationships with corporate partners
        Bellcore
        Xerox PARC
        Apple Advanced Technology Group
        First Virtual
    Building relationships with University partners
        Roberta Johnson (UM)
        Marvin Sirbu (CMU)
    Leveraging related projects
        JSTOR
        Making of America
        MUSE
________________________________________________________________
The accomplishments and activities of each sub-group were recently highlighted during the first NSF/ARPA/NASA site visit to the University of Michigan, which showcased the depth and scope of the tasks and activities outlined above.
This project thrives in a rich multidisciplinary environment where faculty, staff, and students from many disciplines are collaborating to create the NSF/ARPA/NASA digital library. Following is a roster of current participants. This list is clearly a "slice of time" in that the project continues to encourage collaborations and partnerships across all domains.
Faculty Members
    Daniel E. Atkins, III        School of Information and Library Studies
    Howard Besser                School of Information and Library Studies
    William P. Birmingham        College of Engineering
    David Blair                  School of Business Administration
    Karen M. Drabenstott         School of Information and Library Studies
    Edmund H. Durfee             College of Engineering
    Joan C. Durrance             School of Information and Library Studies
    Carolyn O. Frost             School of Information and Library Studies
    Joseph W. Janes              School of Information and Library Studies
    Jeffrey K. MacKie-Mason      College of Literature Science and the Arts
    Michael J. McGill            Information Technology Division
    David L. Rodgers             School of Information and Library Studies
    Elke A. Rundensteiner        College of Engineering
    Perry J. Samson              College of Engineering
    Elliot Soloway               College of Engineering
    Hal R. Varian                College of Literature Science and the Arts
    Amy J. Warner                School of Information and Library Studies
    Michael P. Wellman           College of Engineering
    Terry E. Weymouth            College of Engineering

Senior Staff
    Kenneth B. Alexander         College of Engineering
    James E. Alloway             University Library
    Elizabeth L. Blakely         School of Information and Library Studies
    Laurie B. Crum               School of Information and Library Studies
    Colin L. Day                 University Press
    Nathan Eriksen               School of Information and Library Studies
    Randall L. Frank             College of Engineering
    R. L. Liming, Jr.            School of Information and Library Studies
    Wendy P. Lougee              University Library
    Douglas Orr                  School of Information and Library Studies
    Gregory R. Peters, Jr.       College of Engineering
    W. J. Price-Wilkin           University Library
    Douglas E. Van Houweling     Information Technology Division
    Katherine F. Willis          School of Information and Library Studies

Graduate Students
    Frederick E. Freiheit        Computer Science and Engineering
    Eric J. Glover               Computer Science and Engineering
    Stephen E. Kirk              School of Information and Library Studies
    Nancy Lin                    School of Information and Library Studies
    Steve Markel                 School of Information and Library Studies
    Tracy Mullen                 Computer Science and Engineering
    Anisoara Nica                Computer Science and Engineering
    Sunju Park                   Computer Science and Engineering
    Young-Pa So                  Computer Science and Engineering
    Jose M. Vidal                Computer Science and Engineering
    John P. Weise                School of Information and Library Studies

Corporate Partners
    Drew Burton                  American Mathematical Society
    John Seely Brown             XEROX PARC
    James Corgel                 IBM
    Len Redon                    Eastman-Kodak
    Dennis Egan                  Bellcore
    Lee Esler                    Kodak
    Karen Hunter                 Elsevier
    Rick LeFaivre                Apple Computers, Inc.
    Michael Lesk                 Bellcore
    John Chrisstofferson         McGraw-Hill
    Ann Okerson                  Association of Research Libraries (ARL)
    Phil Stockton                Encyclopaedia Britannica
    James P. Romer               UMI (University Microfilm, Inc.)

University Partners
    Roberta Johnson              University of Michigan
    Marvin Sirbu                 Carnegie Mellon University
Appendix A is not included in this document.
Appendix B is not included in this document.
Appendix C is not included in this document.
Comments or questions may be sent to: UMDL.INFO@umich.edu