JISC Beginner's Guide to Digital Preservation

…creating a pragmatic guide to digital preservation for those working on JISC projects

Archive for July, 2011

Closing the Digital Curation Gap

Posted by Marieke Guy on 6th July 2011

Last week on the day before the ICE Forum (28th June 2011) I attended the Closing the Digital Curation Gap Meeting.

CDCG is an International Collaboration to Integrate Best Practice, Research & Development, and Training in Digital Curation. It has been running since October 2009 and was scheduled to finish in September this year but has just been given an extension (till September 2012). A comprehensive overview of the project is given on the Digital Curation Exchange Web site.

The Closing the Digital Curation Gap (CDCG) collaboration is designed to serve as a locus of interaction between those doing leading edge digital curation research, development, teaching, and training in academic and practitioner communities; those with a professional interest in applying viable innovations within particular organizational contexts; IMLS; JISC; the DCC, charged with disseminating such innovation and best practices; and the SCA, charged to build a common information environment where users of publicly funded e-Content can realize best value by reducing the barriers that inhibit access, use and re-use of online content.

I have come along to the project at a fairly late stage but hope I can still be of use and possibly offer a new perspective (that of not being an expert!).

The June meeting was held at the JISC offices in London and was a joint meeting of the US and UK partners. The UK was represented by members from JISC, UKOLN, ULCC, HATII, the BL and the DPC; the US had people from the Bishoff Group, Penn State University Libraries, Purdue University Libraries, University of North Carolina at Chapel Hill and University of Toronto. [Thanks to Sharon McMeekin from the DPC for sharing her notes to help jog my memory].

The aims of the meeting were to discuss the outputs of the project so far and to set objectives for the continuation of the work in 2011/12. The main work so far has been staging a number of focus groups, work on decision trees and work on best practice guides. The Digital Curation Exchange web site is the key resource that has been created. Much of the meeting involved discussion of the Digital Curation Exchange: we were encouraged to pass on constructive criticism, suggestions on process and ideas for future resources.

They have quite a lot to work on before the next meeting – good luck to them!


JISC International Curation Education (ICE) Forum

Posted by Marieke Guy on 1st July 2011

Earlier this week I was lucky enough to attend the JISC International Curation Education (ICE) Forum, which was held on Wednesday 29 June 2011 at the Roberts Building, University College London. The aim of the day was to provide an international meeting place for educators, trainers, students and practitioners of digital curation to discuss, evaluate, swap knowledge, and potentially improve practice around course design, production of advice and guidance materials and creation and use of textbooks and scholarly material. It proved to be a very informative and interactive day and I think most attendees felt that real progress was made.

Neil Grindley, programme manager at JISC, kicked off by asking us to bear a few key questions in mind:

  • What forms of digital curation education are needed?
  • What approaches are being used?
  • What skills and knowledge do people need?
  • How can we most effectively share practice and resources?

Helen Tibbo – Educating the curator

Helen Tibbo from UNC Chapel Hill began setting the scene by providing an overview of what is happening at UNC and further afield in the US. She introduced a number of important projects and programmes including Educating Stewards of Public Information in the 21st Century (ESOPI-21), Closing the Digital Curation Gap, DigCCurr I and II (pronounced dij-seeker) and various Certificates in Digital Curation.

Professor Michael Seadle – Why do we need people who can do digital curation


Professor Michael Seadle, Humboldt University, Berlin, provided the most contentious talk of the day. He asked us to consider why we need people who can do digital curation, asking for answers more constructive than “so Library and Information Service professionals have something to do in the future” ;-)

He sees digital curation as a split between 1) traditional librarianship and digital humanities (metadata, selection etc.) and 2) basic computer science (system building). He explained that we currently treat the time element of curation as a computing problem, in much the same way that we assume that reading in 2111 will be the same as in 2011.

Michael explained that we know that we perceive information differently at different time periods; this is part of the message of Marc Bloch and Michel Foucault. He feels that the goal of migration in digital curation is to make sure that future readers and users can not only open content from the past, but also understand its meaning. A number of examples were given: Oliver Twist was originally published in serial form, and the publisher then had to migrate the book culturally to maintain comprehension; Bach has been parodied by PDQ Bach and is regularly adapted to new instruments.

We should recognise that this is a form of digital cultural migration, and digital cultural migration is more complex than format migration. Some of the trigger factors include:

  • Places, names & events that are time-bound
  • Language changes
  • Changes in social mores and tastes
  • Changes in causal perceptions

In an attempt to find potential solutions Michael looked at software processing which can recognise text strings and can give clues to context. As we need to make statistical judgements, the more clues we have the better. The three solutions we are looking at are:

  • Machine based (today) – easy – links for names, links to explanation of language changes
  • Human based – hard – flagging social mores and tastes that could change, causal perceptions
  • Machine based (tomorrow) – harder – machine intelligence that recognises trigger events

So what training implications does this have? We need to look at ethnographic training (recognising circumstances that are triggers), standard computer science training, AI training (robotics – understanding of perceptions) and LIS training. Michael concluded that the problem is solvable in small quantities (e.g. updating books with contextual notes); on a larger scale the problem is solvable by designing software that recognises and addresses it.

Michael’s talk was very interesting but did get quite a few people hot under the collar. In the Q&A section people pointed out the dissonance between what Michael had said and what archivists do. Some asked: is this not the work of cultural researchers? Is it out of scope? Is it scalable? Others pointed out that the OAIS model has a commitment not only to preserving bits but also to understanding them over time, and raised the issues around just-in-time vs prophylactic processing. There were mentions of linked data as a potential way to provide the linkage and of crowd sourcing. Others offered more practical approaches such as packaging materials: adding in an index, a legend, keeping resources together. The main questions left on everyone’s lips at the end of the session were: where is it worth putting in the effort? And what does all this mean for professional identity?

Alan Bell – Knowing what we don’t know: Using a devoted teaching model to deliver professional education

After a much needed coffee break Alan Bell from the University of Dundee (who had raised many of the scalability issues in the previous session) took a look at whether our master’s degree programmes provide the skills and knowledge that students require.

Alan explained that to answer this question others need to be considered, for example: how well prepared are our educators to teach what students need? Alan then gave a whirlwind tour of the skills needed (including researching and investigative skills) and how our current programmes support them.

Steve Hitchcock, Institutional Digital Repositories: What role do they have in curation

Steve Hitchcock from the JISC Keepit project scared us all by pointing out what a huge amount of data there is out there. He then gave an overview of the repository layer, which includes institutional repositories (research papers, preprints, postprints) and more. Keepit have worked with four different repositories: ecrystals (science data); University of the Arts London (arts data); EdShare, University of Southampton (teaching data); and NECTAR, University of Northampton (research data). Steve pointed us to the DRIVER aggregate of repositories and the Data Asset Framework, which they have used on their test bed repositories.

He sees a middle route for repositories in their role in preservation. Repositories will not quickly become preservation repositories and repository managers are not archivists, but they both have a role in preservation. Steve concluded by saying that when it comes to repositories and digital curation we need to avoid creating a sense of urgency, as it paralyses people. Instead we need to create a sense of capability.

The JISC Keepit project findings have recently been released.

Gordon McKenna – Cultural Heritage Digital Collections

Gordon McKenna from Europeana, the Collections Trust and Culture Grid took a look at cultural heritage digital collections. He introduced us to Digital Curator Vocational Education Europe (DigCurV), a project funded by the European Commission to establish a curriculum framework for vocational training in digital curation. DigCurV received quite a lot of airtime during the forum.

Simon Hodson – Perspectives on Curation Education for Research Data

Simon Hodson from JISC began by looking at some definitions of research data. He used definitions from a selection of sources (projects, institutions and other). For example the definition from the Sudamih project sees it as “not just structured information on computers but the whole range of materials that researchers must assemble and analyse in order to produce their research outputs”. Quite a lot of work has gone in to clarifying meaning in this area.

Simon’s opinion was that this was all very nice, but the bigger question was: how do we persuade researchers to give a damn? His answer was: by raising awareness, by developing an understanding of appropriate sharing and by developing information management skills.

He asked us to consider whether research data curation is now seen as a part of good research practice. It may well be, but researchers want to do research. Simon highlighted some good practice in the form of the Incremental project, which offered guidance and awareness raising. He also pointed to some DCC resources (How to appraise and select research data by Angus Whyte) and to five new projects producing research data management training materials (RDMTrain).

Simon felt that the challenges in this area were not just making training accessible and relevant but continually providing useful disaster and discipline-specific examples.

The brief panel session was useful but I could sense that there was a feeling in the room that once again the digital preservation and curation community were in the ‘echo chamber’ and the ‘education’ component was being forgotten. More work to be done after lunch.

Joy Davidson, What areas can we best collaborate on and what are we doing now

Joy Davidson set the mood for the afternoon by ditching her original title, The benefits of collaboration: delivering more effective teaching and training through cooperation, and replacing it with something more forward looking. She explained that so far there had been good levels of collaboration but that there were still some people who just weren’t sitting round the table. These were people from different research backgrounds, industry, national bodies, and people who were taking the courses. Joy then shared a few recommendations from a recent trip to the National Approaches to Digital Preservation (Tallinn) and the previous day’s Closing the Digital Curation Gap meeting (which I’d also attended).

Firstly we need more metrics and benchmarks. We can then compare and contrast courses and what employers want out of these courses. This seems to tie in nicely with the recent HE white paper.

We also need to develop data management plans. One way to do this is by getting professional bodies and industry involved in endorsing data management. Joy pointed us again to the DigCurV project, APARSEN and TIMBUS. There is also a need for more use of tools for testing, such as Planets.

Joy showed a list of current courses available from DPE, DPOE, DCC, Jorum and Vitae DB. This, she said, was good, but not good enough. Joy speculated that the DPOE pyramid and the categorisation of executive, managerial and practical courses might be useful here.

Other practical ways forward include getting people to recognise their needs and helping people to get more practical, hands-on experience. Options like internships and professional exchanges were offered as another way to educate and build skills: “it’s not all about classrooms.” Joy also mentioned current work at Purdue on data curator profiles, which asks: who is becoming a data curator?

Kate Fernie, DigCurV Project – emerging survey results

Kate Fernie gave a fuller introduction to the DigCurV project, funded under the EC’s Leonardo da Vinci programme. She explained that digital preservation was now important for cultural heritage institutions all over the world. There were 82,000 related staff across Europe, and DigCurV was primarily a networking project. One of their current activities was a survey about the training opportunities available.

Kate explained that DigCurV had already identified a few online courses available, which was quite impressive. However, although there was lots of literature online, there was very little training or educational material.

ICE Forum Networking (the ‘ICE-a-FoN’)

The next session, the ICE-a-FoN (a name Neil Grindley was very proud of), was an opportunity for delegates to engage in semi-structured networking. Three zones in the coffee area designated as ‘curriculum’, ‘training’ and ‘resources’ featured posters and other relevant information, and delegates were encouraged to submit forms saying what they’d learnt – the prize was a Kindle! The session was really useful, and a great idea.

Back in the main lecture room Heather Bowden, UNC Chapel Hill, gave a quick summary of conversations overheard and opinions elicited during the ICE-a-FoN. The most memorable was the way to remember the difference between education and training: You’d like your children to get sex education but not necessarily sex training…

Cal Lee, What do you care about if you care about digital curation?

After the excitement of the ICE-a-FoN, Cal Lee from UNC Chapel Hill brought us back down to earth by considering what you care about if you care about digital curation. He explained that new professionals must care about traces and values and that there is a need to inspire those who are going to learn about digital curation.

Lightning Talks

Possibly the most enjoyable part of the day (though it was a great day generally) was the lightning talks. Anyone who had a burning desire to talk about anything related to digital curation and education was given just 3 minutes. The talks were:

Symfonie in data by verbeeldingskr8

  • Neil Beagrie – Neil introduced the JISC Digital Preservation Benefits Analysis Tools project.
  • Marina Noordegraaf – Marina used her illustration Symfonie in data (see above) to state the importance of just starting and not waiting until we think we know everything.
  • Beth Yakel – Beth asked us how we evaluate student learning. She explained that this involved learning to change expectations as well as the importance of evaluating student learning styles and preferences.
  • Patricia Sleeman – Patricia talked about the DPTP’s work with those from third world countries including Iraq. She quoted Margaret Hedstrom: “Outside institutions may have some short-term funding, which they’ll use to produce valuable resources that don’t stay in the country of origin. There is no plan for sustainability. In the long run this will create a skewed record of culture, where the culture from developed countries will be well preserved and the culture from the underdeveloped countries may be lost.”
  • Scott Brandt – Scott introduced the Data Curation Profiles toolkit.
  • Angela Dappert – Angela showed us the TIMBUS project – and introduced the new DPC staff.
  • Sharon McMeekin – Sharon, another new DPC staff member, carried on with other DPC plans including APARSEN.
  • Mike Furlough – Mike talked about the ARL eScience institute: ensuring that research libraries are not disconnected from scientists.
  • Greg Jansen – Greg showed the Curator’s Workbench from UNC.
  • William Kilbride – William reminded us that DPC gives grants to enable members to attend training courses. He also pointed out the 5 new DPC study areas: Preserving email, preserving sound and vision, digital forensics, IPR, trust regarding ejournals.
  • Heather Bowden – Heather gave a quick demo of the Digital Curation Exchange.
  • Kevin Ashley – Kevin picked up on some of the morning’s concerns, noting that we were failing to recognise that we have had many of these conversations already and that lessons have already been learned. He pointed out the Swan skills report and the Donnelly/Pryor article.
  • Sheila Corrall – Sheila questioned professional silos and suggested that professional bodies could join and discuss overlaps.

Seamus Ross, Educating and Validating the Capabilities of Emerging Digital Management Professionals

The closing plenary was given by Seamus Ross from the University of Toronto. Seamus looked at what is needed in a data curator. He asked if we need data curators who are subject specialists or data curators who are technologists. His argument was that it was harder to train someone to be a scientist and so there was a real need to educate producers, managers and users of digital content. He explained that as well as digital curation training, we need to educate data creators to make preservable and curatable data. A digital curation professional must think like a humanist scholar, behave like an engineer, have the ethical standards and have deep subject knowledge.

Like others during the day, Seamus emphasised the importance of case studies. He also called for an international professional association for digital preservation and for accreditation and certification of programmes. He concluded that digital curators need to be passionate about preservation, though a delegate suggested that it was more important that you be passionate about what you are preserving.


I really enjoyed the forum and felt that real progress was made during the day. The atmosphere was light but still focused and constructive; the digital curation community are a great bunch. My only suggestion (and slight criticism) is that it would have been good to get people along who are actually taking the courses discussed during the day. The cost of the event (£25 for students) and possibly its timing (during the student holiday period) may have been factors here. Maybe something to bear in mind for next time?

There is a TwapperKeeper archive and a Summarizr site for the #iceforum hash tag.
