This past February, librarians, archivists, developers, scientists, and researchers descended upon San Francisco’s Omni Hotel to eat, sleep, and breathe digital curation. I was fortunate enough to be one of the more than 200 attendees at this year’s International Digital Curation Conference, “Commodity, catalyst, or change-agent? Data-driven transformations in research, education, business & society,” organized by the Digital Curation Centre in the UK with the help of the University of California Curation Center and the Coalition for Networked Information.
The conference began with a full day of workshops on February 24th. The 4C (Collaboration to Clarify the Costs of Curation) Project’s program on Costing Curation immediately caught my eye. The project has been exploring ways that organizations can assess and plan for the costs of digital curation, a complex and relatively understudied problem. Of particular interest was their plan to develop a Curation Costs Exchange (CCEx), where institutions would be able to share information about the costs of their local digital curation programs, helping peers better plan and assess their own services.
The next day we jumped right into the conference program, with keynotes from Atul Butte, Associate Professor in Medicine and Pediatrics at the Stanford University School of Medicine, and Fran Berman, Edward P. Hamilton Distinguished Professor in Computer Science at Rensselaer Polytechnic Institute and Chair of the Research Data Alliance. The keynotes were followed by four very different perspectives on the conference theme. Jane Hunter from the eResearch Lab at the University of Queensland presented on a handful of “big data” curation projects underway at her institution, while Seamus Ross discussed digital curation education. Brian Hole from Ubiquity Press presented a model for data publication, and subsequently blew a few minds by stating that the article processing charge would be around $40 per article. The morning’s session concluded with a talk from Paul Lewis, Washington Correspondent for the Guardian, about the role of social networking “big data” in open journalism.
Lunch was followed by a panel discussion about the iSchool perspective on digital curator education. The panel included Margaret Hedstrom (University of Michigan), Ron Larsen (University of Pittsburgh), and Carole Palmer (University of Illinois at Urbana-Champaign), with David De Roure (Oxford e-Research Centre) and Liz Lyon (University of Pittsburgh) as responders. For me, as a recent iSchool graduate and new data curator practitioner, the panel was a bit of a letdown. I recognize that the situation, as it currently stands, typically requires that a data curator in an academic library (most of the conversation seemed to be focused on this particular subset of curators) be a jack-of-all-trades, as most institutions are working with limited staff and resources. However, I would hope that we could all agree that this is not an ideal situation; and yet much of the discussion continued as if this model will stand, unquestioned, into the future. One of the most important questions posed by the audience, which sadly went unanswered, concerned professional development for the librarians and information professionals who already work in these libraries, possibly as a way to bridge the gap between newly trained data curators and the institutions, still following traditional models, in which they work. To me, this is an area that requires much more discussion and interest within the community, and it’s where I’d like to see greater advocacy on the part of iSchools: helping, as much as possible, to shape the future of academic research data programs and services, and advocating for more sustainable and efficient staffing models.
The first full day of the conference ended with demos and the poster exhibition. I was particularly thrilled to hear about some of the new developments with Archivematica, and the developers’ recognition that the tool could work very well for processing research data in addition to archival collections. Unfortunately (or fortunately, depending on how you look at it), I spent the evening in discussion about my own poster and wasn’t able to make the rounds to see the others. Thankfully, all of the posters are available online (http://www.dcc.ac.uk/events/idcc14/posters), so you and I can both go back and see what we missed!
The second day of the conference began with a keynote presentation from Simon Hodson, Executive Director of CODATA, on current and future CODATA activities, including work on data citation principles. The rest of the day was jam-packed with concurrent sessions – far too many panels and far too little time, but the slides are available online (http://www.dcc.ac.uk/events/idcc14/programme-presentations) for review at your leisure. For me, the Data & Value panel (especially the presentations by Mark Parsons and Limor Peer) was the standout. In particular, Parsons’ assertion that curation should be about working towards “generative data,” building upon the notion of “generative technology” from Jonathan Zittrain, was very compelling. Peer’s presentation nicely complemented this notion, discussing in more detail some of the work social science data archives have been doing to ensure this type of generative data release, as opposed to the deposit and sharing of data that cannot (because of missing metadata, undocumented code, poor quality, etc.) be reused by someone else. Clifford Lynch, Executive Director of CNI, closed out the conference with a handful of provocations, including the need to protect sensitive human subjects data, the need to think more seriously about how long we plan to keep data, and the question of which data we really need to preserve.
For additional insights into this year’s IDCC, visit the DCC’s website (http://www.dcc.ac.uk/events/idcc14/blogs), where various blog posts reporting on the conference have been gathered. And be sure to keep your eyes peeled for information about next year’s conference in London!