BULLETIN
What You Missed at the DASER Summit
The first Digital Archives for Science and Engineering Resources (DASER) Summit was held November 21-23, 2003, at the Massachusetts Institute of Technology in Cambridge, Massachusetts. Over 90 participants from four countries and more than a dozen states attended the conference, which explored issues surrounding the creation of science- and engineering-based digital archives, including standards, metadata, software and tool development, and long-term preservation.
This summit was the first sponsored by an ASIS&T Special Interest Group – SIG/STI (Scientific and Technical Information Systems) – and an ASIS&T Chapter – NEASIS&T (New England chapter). One of the goals of this summit was to evaluate this program model for future program planning within ASIS&T. All indications are that the model has succeeded, and the Society has gained valuable experience that will allow us to further refine this program development strategy. A full report will be presented at the August 2004 Board of Directors' retreat, with subsequent publication on the ASIS&T website. Highlights will be presented at various venues at the Annual Meeting in November. The DASER Program & Logistics Committee, which consists of Julie Arnold, Margret Branschofsky, Darcy Duke, Deb Helman, Michael Leach (chair) and K.T. Vaughan, hopes that other SIGs and chapters will adopt this model to develop new summit programs in the near future.
The summit kicked off on Friday evening with a social event at the MIT Barker Engineering Library. In a relaxed setting, complete with ample food and drink, participants enjoyed networking inside this famous location – beneath the "dome" of MIT. Members of the Summit Program Committee as well as volunteers from the Simmons ASIS&T Student Chapter were on hand to greet participants. A continental breakfast the next morning in the Wiesner Building, home of the MIT Media Lab, kicked off the panel sessions in the Bartos Theater.
Panel 1 highlighted issues related to metadata and standards in the digital archives realm. Jeff Beck, National Institutes of Health and the National Library of Medicine (NLM), discussed PubMed Central and the NLM's efforts at creating and using DTDs (document or data type definitions used with mark-up languages such as XML). MacKenzie Smith of MIT followed with an overview of METS: Metadata Encoding & Transmission Standard, which provides descriptive, administrative, behavioral and structural metadata with a file inventory to "wrap" digital objects in any digital archive or institutional repository.
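As a rough illustration of the "wrapping" idea, the sketch below assembles a skeletal METS-style XML document with Python's standard library. The section names (dmdSec, amdSec, fileSec, structMap, behaviorSec) and the namespace come from the published METS schema; the function name, identifiers, title and file location are hypothetical placeholders, and the output is not validated against the schema or drawn from any speaker's implementation.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def build_mets_wrapper(object_id, title, files):
    """Return a skeletal METS-style element tree for one digital object.

    `files` is a list of (file_id, mimetype, location) tuples. All values
    are illustrative placeholders, not taken from a real repository.
    """
    ET.register_namespace("mets", METS_NS)
    ET.register_namespace("xlink", XLINK_NS)

    def q(tag):
        # Qualify a tag name with the METS namespace.
        return f"{{{METS_NS}}}{tag}"

    root = ET.Element(q("mets"), {"OBJID": object_id})

    # Descriptive metadata: a bare title stands in for a real record.
    dmd = ET.SubElement(root, q("dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, q("mdWrap"), {"MDTYPE": "OTHER"})
    xml_data = ET.SubElement(wrap, q("xmlData"))
    ET.SubElement(xml_data, "title").text = title

    # Administrative metadata: left empty in this sketch.
    ET.SubElement(root, q("amdSec"), {"ID": "AMD1"})

    # File inventory: one entry per content file.
    file_sec = ET.SubElement(root, q("fileSec"))
    group = ET.SubElement(file_sec, q("fileGrp"))
    for file_id, mimetype, location in files:
        entry = ET.SubElement(
            group, q("file"), {"ID": file_id, "MIMETYPE": mimetype}
        )
        ET.SubElement(
            entry,
            q("FLocat"),
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": location},
        )

    # Structural map: ties the files back to the described object.
    struct = ET.SubElement(root, q("structMap"))
    div = ET.SubElement(struct, q("div"), {"DMDID": "DMD1"})
    for file_id, _mimetype, _location in files:
        ET.SubElement(div, q("fptr"), {"FILEID": file_id})

    # Behavioral metadata: placeholder section only.
    ET.SubElement(root, q("behaviorSec"), {"ID": "BEH1"})
    return root


if __name__ == "__main__":
    doc = build_mets_wrapper(
        "obj-1",
        "Sample dataset",
        [("F1", "text/csv", "http://example.org/data.csv")],
    )
    print(ET.tostring(doc, encoding="unicode"))
```

The point of the sketch is only that one XML document carries the description, the administrative and behavioral placeholders, the file inventory and the structure together, so the object can move between repositories as a single package.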
Panel 2 focused on digital repository systems. Suzanne Samuel, California Digital Library, discussed the CDL eScholarship Repository project. Margret Branschofsky followed with a description of the DSpace project at MIT, while Leslie Johnston presented the Fedora repository implementation at the University of Virginia. Wrapping up this session was Kimberly Douglas, California Institute of Technology, focusing on the Caltech experience in creating a digital repository.
At this point in the program, participants and speakers took a short walk from the Wiesner Building to the MIT Faculty Club for a catered luncheon, where everyone got the opportunity to network and chat about the issues raised in the first panel sessions.
The afternoon began with Panel Session 3, titled "Beyond Words: Issues in Data Archives." Four speakers addressed a variety of issues and experiences related to their implementation of data archives. Volker Brendel, Iowa State University, discussed the development, maintenance and sharing of small-scale databases for genome research. Bob Dragoset of the National Institute of Standards and Technology followed with a description of the physical reference data of the NIST physics laboratory. Peter Knoop of the University of Michigan discussed the UARC/SPARC (Space Physics & Aeronomy Research Collaboratory) experience from 1993 to 2002. Sue Rhee of Stanford University focused on issues in managing and disseminating changing information in biology.
The last panel on Saturday was titled "Will It Last Forever: Preservation of Digital Material." Stephen Abrams of Harvard University discussed the global file format registry. The next speaker was Tom Robertson of Stanford University, who highlighted the LOCKSS model – Lots of Copies Keep Stuff Safe – of digital preservation. Nancy McGovern of Cornell University rounded off this panel by highlighting one risk management approach for Web resources – virtual remote control.
Participants joined one of several networking dinners on Saturday evening, hosted by members of the DASER Program Committee. The City of Cambridge is blessed with many excellent eating establishments, as many of the participants discovered, including several that serve a Massachusetts staple – seafood.
On Sunday morning the participants once again gathered in the Bartos Theater. The last panel session of the summit focused on the details of four specific digital archives projects. Matthew Cockerill highlighted BioMed Central's work as an open access publisher and its role in digital archiving. Guenther Eichhorn from the Harvard-Smithsonian Center for Astrophysics described the NASA Astrophysical Data System (ADS)/Astronomy Digital Library, a worldwide collaboration to provide access to the astronomical literature. Sandra McIntyre, Health Education Assets Library, followed with a discussion of multimedia assets management for health sciences education. Brandon Muramatsu of the National Engineering Education Delivery System (NEEDS) at University of California, Berkeley, rounded off this panel with a discussion of NEEDS, SMETE.ORG and educational digital libraries.
In the closing keynote address, Clifford Lynch, executive director of the Coalition for Networked Information and past president of ASIS&T, synthesized many of the summit's presentations, providing a larger frame of reference for these cutting-edge research projects and services. For more on the content of this talk, see the related article by Beatrice Pulliam.
Planning for a second DASER Summit is well underway for the spring of 2005 at the University of Maryland campus in College Park. By the time this article appears in print, or shortly thereafter, announcements on the exact days and times, as well as program details, should be available on the ASIS&T website. Stay tuned.
[Editor's Note: Beatrice Pulliam was one of a half-dozen students from the ASIS&T Student Chapter at Simmons College who volunteered at the summit. Her report of the summit, shared with fellow students at Simmons, is reprinted here.]
I organized the Simmons contingent of volunteers for the ASIS&T & NEASIS&T DASER Summit held at MIT. DASER stands for Digital Archives for Science & Engineering Resources. The ASIS&T SIG/Scientific and Technical Information Systems (SIG/STI) and the Physics-Astronomy-Mathematics Division of SLA also sponsored the summit.
This day-and-a-half conference covered everything from the nitty-gritty technical issues of metadata and standards, and what goes into building digital repository systems, to the data archiving challenges currently faced by community databases in the sciences and current digital preservation projects. On the subject of metadata, the general sense is that we have a good handle on descriptive metadata issues but not on standards dealing with provenance and other preservation issues. Interestingly, there were no archivists or records management folks in attendance at the summit. It would have been great to get that perspective. I wonder if there will be any crossover in roles as more digital library initiatives take off.
On the topic of community databases, there are over 4500 labs, for example, contributing data to The Arabidopsis Information Resource (TAIR) based at Stanford University. Arabidopsis thaliana is a small flowering plant from the mustard family and is an endless source of material for basic research in genetics and molecular biology.
In this report I would like to focus on the keynote and wrap-up talk given on the last morning by Clifford Lynch, the executive director of the Coalition for Networked Information (www.cni.org/). Lynch brought up some issues that will cut across all areas of librarianship, not just science and engineering, things definitely worth thinking about as we begin new careers as librarians or ramp up current skills for new responsibilities. He notes that we need to be more "thoughtful in thinking about what needs to be in place for digitized content in the way of management, dissemination, continuity and preservation."
Lynch led off with a general comment that most of us can agree with. These are innovative times, not only in the vast amount of technology that is available but also in the enormous quantity of digital content that is being disseminated and in our increasing reliance on its availability. A question Lynch asks us all to consider is the following: How do we keep the content viable as we deal with its fragility? The scientists at the summit spoke with excitement about the community databases that are being built, and about the stress of being unable to keep pace with the sizable datasets being added to those databases or with the problems of data integrity. The scientists seem eager to work with information scientists. Some are already working in conjunction with their own "domain" scientists, information scientists, computer scientists and, sometimes, behavioral scientists.
Lynch thinks that institutional repositories could be a possible solution because the repositories are typically associated with an academic institution, its faculty and their interests. Such a repository might also be a future staging ground for faculty research and the storage of data. MIT has already created its own open source institutional repository with DSpace (www.dspace.org/). Other institutions like the University of California and its eScholarship Repository have found success working with commercial entities that handle the access, editorial and policy issues and let the institution focus on the scholarly communication. Most of the panelists seemed to agree that standards, peer review and resource sharing are very important, and that open-source federations go a long way toward building a critical mass of content that leverages distributed expertise and promotes best practices. Working with librarians is on everyone's list. A panelist even suggested that we "market ourselves by selling persistence" – persistent links, persistent access, even persistent digital caches. Tom Robertson from LOCKSS (Lots of Copies Keep Stuff Safe; http://lockss.stanford.edu/index.html) at Stanford sees the libraries as "memory organizations." LOCKSS is working to build tools that help libraries preserve their e-content affordably.
Finally, Lynch cautions that tools will "permeate out across the literature and become infrastructure components" and that we shouldn't "underestimate how bizarre it will get." The bottom line, as Lynch says, is that we need to be thinking through the "broad scale issues of curatorial stewardship as they apply to digital content." Now.
For more information on the DASER Summit, including speaker PowerPoint presentations, please go to www.asist.org/Chapters/neasis/daser/index.html
Copyright © 2004, American Society for Information Science and Technology