This year, the Charleston Conference is teaming with the UNC School of Information and Library Science to offer the inaugural Charleston Seminar, Introduction to Data Curation. Earlier this week, we caught up with Jon Crabtree and Cal Lee, who are presenting this timely seminar, to ask some questions and get their take on how they hope attendees will benefit. Here is what they had to say:
ATG: Data curation is an increasing concern for academic libraries. What specifically can attendees expect to gain from this day-and-a-half-long seminar?
JC & CL: Participants in this event will gain an understanding of the challenges and opportunities related to curation of data. This will range from high-level strategic considerations to low-level issues such as integrity and manipulation of bitstreams. They’ll gain exposure to data management tools and methods that can be used at various points in the lifecycle of data.
ATG: Will you focus on particular data curation opportunities that academic libraries should be considering?
JC & CL: We’ll address a variety of opportunities, including engagement with new stakeholders, use of existing and emerging technologies, and ways of approaching data (as opposed to documents) that best facilitate their sharing and use.
ATG: What strategies and practical skills will attendees come away with?
JC & CL: They should come away with an understanding of how to interact with digital information at various levels of representation (including hex dumps and cryptographic hashes); strategies for developing and implementing data management plans; experience in using the Dataverse Network (DVN); and concrete ideas about how to formulate data curation workflows.
ATG: The seminar is billed as an interactive event. What activities are you planning?
JC & CL: In addition to the talks and discussion, we’ll have three hands-on exercises. The first will be related to working with bits (hashes, hexes and file signatures). The second will be using the DVN to curate data. Finally, we’ll have an exercise in which groups will devise proposed workflows around specific areas of digital curation activity.
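For readers unfamiliar with the terms in the first exercise, the following is a minimal sketch of what "working with bits" can look like in practice. It is not taken from the seminar materials. The function names and the three-entry signature table are hypothetical and chosen only for illustration, and real file-identification tools rely on much larger signature registries.

```python
import hashlib

# Hypothetical, deliberately tiny signature table for illustration only.
SIGNATURES = {
    b"%PDF-": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"PK\x03\x04": "ZIP container",
}


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the bytes as a hex string (a fixity value)."""
    return hashlib.sha256(data).hexdigest()


def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex values, and printable ASCII, one row per `width` bytes."""
    pad = width * 3 - 1  # room for `width` two-digit values separated by spaces
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk).ljust(pad)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{offset:08x}  {hex_part}  {text_part}")
    return "\n".join(rows)


def identify(data: bytes) -> str:
    """Guess a format by comparing the leading bytes against known signatures."""
    for magic, label in SIGNATURES.items():
        if data.startswith(magic):
            return label
    return "unknown"


if __name__ == "__main__":
    sample = b"%PDF-1.4\n"  # stand-in for the first bytes of a file
    print(sha256_hex(sample))
    print(hex_dump(sample))
    print(identify(sample))
```

Recomputing the hash later and comparing it with the stored value is one common way to check that a bitstream has not changed, while the hex dump and signature lookup show how a file can be examined independently of its name or extension.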
ATG: Libraries are drowning in data nowadays. With Institutional Repositories, Open Access publishing, eBooks, and eJournal subscriptions, how can current staffing manage this amount of data? Is cooperation with other libraries or institutions the answer?
JC & CL: Part of the answer is a shift in personnel, by hiring and training more staff to engage in data curation work. Another important element is prioritization: institutions need to decide which data and which types of use are most essential. Librarians and archivists also need to take advantage of automation, so software can perform routine activities, freeing human attention for higher-level judgments and decision making. Partnering and collaboration are certainly important. This involves social interactions but also establishment of distributed technical architectures.
ATG: You each have some specialties that should help give attendees a new approach to dealing with their data flows. Jon, you have made the field of “geospatial” factors an important part of your research. And Cal, you have done the same with “forensic” examination of data. Could you each describe a little of what attendees will learn in those areas?
JC & CL: While we won’t be focusing specifically on GIS data or forensics in this introductory workshop, both of these areas highlight the importance of layering and interdependencies in digital systems, which we will be addressing. For example, Jon will give a talk called “What makes data different from documents?” Cal will also present several processes that are essential to forensics, including generation of disk images and investigation of bitstreams.
ATG: We really appreciate your taking the time to talk with us. From everything you’ve said, it looks like attendees will come away from the seminar with a great introduction to data curation.
JC & CL: Thank you. We look forward to the event.