If they differed on specifics, all agreed that the promise of digital libraries was both immense and crucially important to the national interest. They also agreed that there was a great deal of work to be done to fulfill the promise and that digital libraries research requires experimental systems with real collections and real users. Many noted that research will have to deal with the complex problem of balancing present demands and future goals.
To prompt thought on these issues, the initial plenary session included presentations by Bill Arms (CNRI) and Ron Larsen (DARPA) that attempted to move beyond conventional assumptions. Christine Borgman (UCLA) raised user-centered issues, while Margaret Hedstrom (University of Michigan) spoke on the importance of long-term access to collections.
At its most general, the problem faced is one of managing complexity--the complexity of systems, of resources, and of users. Digital libraries must work with a highly diverse range of collections of digital objects, assembled on different principles by numerous contributors and continuously changing as more content and value are added to them. Equally, they must work with users who will be as diverse as society itself, with ever-changing needs and expectations, while breaking down conventional distinctions around which existing collections were shaped--such as expert/novice or provider/user. They must be useful to different communities for different purposes, at different times.
Some participants schematically grouped issues into three areas, each with its own particular tensions and problems. This report lays out the three separately here. It is important to note, however, first, that most saw these as interdependent, not independent, aspects of digital libraries research, and, second, that views diverged over where emphasis and concentration should best fall.
Work from the original DLI, coupled with the simultaneous impact of the World Wide Web, has brought into focus the system-centered issues that will need intensive study for systems, as one participant put it, "to provide the connective tissue to bring users and collections together." Research will continue to address system architectures and their functional components, including issues of scale, interoperability, extensibility, federation, and composability.
The information infrastructure has scaled up dramatically and, driven by Moore's law, continues to obey a power law for growth in capacity and an inverse power law for cost. Yet, as one participant put it, the real problem is that "We don't know yet how to use the Internet productively and effectively." This presents digital libraries with the challenge of applying increasing computational capacity and bandwidth to manage terabytes of information that need to be accessible in full, yet be reducible to a usable, human scale.
These will require a scalable, open architecture that is
These unlimited resources will inevitably comprise multiple data sources, heterogeneous objects, and multiple schemas federated on a global scale. Moreover, they will be built on and consulted through diverse platforms by equally diverse and distributed users. Top-level issues here include cross-domain, seamless federation that allows:
It was noted in several ways that the design of a digital library should not be posed as physical vs. digital objects (atoms vs. bits) but rather as co-existence and interoperability between the two. Emerging digital repositories will co-exist with more traditional libraries for an indefinite period. In addition, users of digital repositories will be converting digital documents to paper documents (i.e., printing and faxing them), as well as converting paper documents to digital ones (i.e., performing scanning and document recognition). Facilitating the transparent interoperation of paper and digital documents poses technical and social challenges.
Few of the huge number of collections a digital library will bring together will be static. Most will grow as the platform, the collections, and the users themselves develop and grow. Such changes will need adaptable, dynamic, flexible systems able to deal with interactive use. Issues here include
Across these adaptations, digital libraries will face the challenge of preserving and presenting context as a key way to provide structure to unstructured data. Doing so will call for a better understanding and deployment of:
Furthermore, multimedia and multimodal databases will present new challenges as users look beyond simple keyword and Boolean text search for means to explore not just text but images, video, or music as well. Will we, one participant asked, be able to search by singing or by sketching as easily as by typing?
Around these issues, developing research will encounter issues of
As mentioned earlier, the stretched vision of an advanced library supports the full cycle of knowledge creation and use by individuals, teams, organizations, and communities. Special attention should be given to digital libraries that support collaboration in all four combinations of same or different time and same or different place.
A second initiative, many thought, would benefit from more symbiotic partnerships between systems developers and existing collections. This would bring to light a number of the issues faced by collection holders, including
One of the most energetically discussed questions at the conference involved issues of archiving. Preservation has, of course, been one of the major contributions of conventional libraries and will remain one for digital libraries. It was felt that more interaction between digital library research and investigations into long-term digital preservation would be particularly fruitful. In the past, preservation has mostly been addressed in practical ways and has reflected the need to rescue digital data from imminent destruction rather than to consider its long-term viability from the start. Discussion of a more principled approach returned the workshop again to issues of standards, metadata, and interoperability.
Research in digital libraries must always be motivated by the information needs of people. On-line information is breaking down the traditional separation between author, designer, publisher, librarian, user, archivist, etc. (And one person in his time plays many parts.) The rapid growth of on-line information has created a new set of research challenges that can be described as "human-centered research." Although many of the achievements of the current DLI have been user-centered, a new digital library initiative, many thought, would benefit by being even more responsive to users and use.
At the conference, there was considerable discussion about the question of information selectivity. How can all the information in the world be distilled into the small quantity most relevant to a specific individual? One approach to this question is to take digital library collections as they are and devise enhanced methods for exploring, searching, filtering, and so forth. Another approach is to consider the construction and use of collections together. Within this general theme, some of the issues raised were:
Continuing digital library work, many thought, should be responsive to the overlapping but distinct needs of individuals, communities, and institutions. It would face the challenge of simultaneously augmenting privacy and trust while underwriting seamless collaboration and collaboratories.
Steve Griffin of NSF presented some desirable distinctions between the current DLI and future programs. DLI, he noted, had involved
He suggested that future initiatives should, by contrast, have
Some participants further stressed the need for more emphasis on the applications of digital libraries in order to build user support for digital libraries, to deliver value to teachers and scholars in different contexts, to link up with the commercial publishing world, and to focus research in the most valuable directions.