JISC Beginner's Guide to Digital Preservation

…creating a pragmatic guide to digital preservation for those working on JISC projects

Free Research DM Workshop, Cambridge, 9-11 Nov

Posted by Marieke Guy on October 4th, 2011

Unfortunately the DCC Brighton Roadshow was cancelled, but the next roadshow isn’t far off and places are still available.

To give some background….

The UK Digital Curation Centre is running a series of free inter-linked regional workshops aimed at supporting institutional research data management planning and training. The DCC Roadshows are designed to allow every institution in the UK to prepare for effective research data management and understand more about how the DCC can help. The sixth DCC Roadshow is being organised in conjunction with Cambridge Library and will take place from 9th – 11th November 2011 in the Paston Brown Room at the Homerton Conference Centre, Cambridge.

The roadshow runs over three days but each workshop can be booked individually. Attendees are encouraged to select the workshops which address their own particular data management requirements. The workshops will provide advice and guidance tailored to a range of staff, including PVCs Research, University Librarians, Directors of IT/Computing Services, Repository Managers, Research Support Services and practising researchers.

Day one is an introductory day aimed at researchers, data curators, staff from library etc. It provides an introduction to the DCC and the role of the DCC in supporting research data management. Day two is a more interactive day aimed at senior managers, research PVCs/Directors, directors of Information Services etc. and looks at strategy/policy implementation. Day three is a hands-on day and consists of the Digital Curation 101 – How to manage research data: tips and tools workshop.

To find out more about the workshops take a look at the DCC Cambridge Roadshow page. Registration for the workshop is free but places are limited.

If you can’t decide if the roadshow is for you, Steve Walsh from the Interoperable Geospatial Data for Biosphere Study (IGIBS) Project, Aberystwyth University, has written a review of the most recent workshop held in Oxford.

Details of further roadshows will be announced soon on the DCC Web site.

Posted in dcc, irg, Project news | Comments Off

Web Archiving and the IIPC

Posted by Marieke Guy on September 28th, 2011

There is a great video on the importance of Web Archiving courtesy of the International Internet Preservation Consortium. This video is also available in German, Spanish, French, Japanese and Arabic.

The IIPC is the home of world-wide experts in collecting and preserving information from the Web.

Posted in Archiving | Comments Off

DCC and the Sussex Roadshow

Posted by Marieke Guy on September 26th, 2011

I’ve recently been appointed as an Institutional Support Officer for the Digital Curation Centre. In this role I will be raising awareness and building capacity for institutional research data by liaising with libraries, IT services, research support staff and others.

My first step in getting myself up to speed will be attending the DCC Roadshow to be held in Brighton from the 4th – 6th October 2011 at the University of Sussex Conference Centre. I attended one day of one of the earlier roadshows held in Bath and thoroughly enjoyed it, so I am very much looking forward to it.

The roadshow consists of three days of training in research data management. Day one is an introductory day aimed at researchers, data curators, staff from library etc. It provides an introduction to the DCC and the role of the DCC in supporting research data management. Day two is a more interactive day aimed at senior managers, research PVCs/Directors, directors of Information Services etc. and looks at strategy/policy implementation. Day three is a proper hands-on day and consists of the Digital Curation 101 – How to manage research data: tips and tools workshop.

Attendees are welcome to dip in and out of the workshops and don’t have to attend the full three days. There are still places available and I’m sure it will be a very useful couple of days. Might see you there!

Posted in dcc, Events | 2 Comments »

The Future of the Past of the Web

Posted by Marieke Guy on September 15th, 2011

Have you ever lost valuable information which was hosted on your Web site? Do you have a record of how your Web site has developed since its launch over 15 years ago?

If these questions are of interest to you, you may wish to attend a JISC workshop on “The Future of the Past of the Web”, which will take place in the British Library on 7 October 2011 from 10.30-16.30. Places are free but will need to be booked before 1200 on Friday 30th September 2011. Further information is available from the JISC Web site.

Posted in Events | Comments Off

DPC Hackathon

Posted by Marieke Guy on September 12th, 2011

The Open Planets Foundation and the Digital Preservation Coalition are inviting people to a hackathon at the DPC offices in York on 27th-29th September.

The ‘hackathon’ is designed to bridge the gap between collection owners and developers in the development of practical tools for preservation. It will provide a forum for practical problem solving. It will help collection owners to articulate their requirements in ways that developers can address, and will help developers respond more precisely to the needs of a community hungry for solutions.

This event will interest:

  • Collection owners and managers who can bring along samples of their problematic digital collections. You will be asked to give a short talk to provide an overview of the content and the known or potential issues to the group.
  • Developers / technical experts who want to gain hands-on experience of applying digital preservation techniques to digital collections. You will be asked to give a short talk about your technical experience and interests.

DPC and OPF members are invited to attend free of charge; non-members are also welcome at a cost of 200 GBP. For more details, including registration, see the DPC Web site.

Posted in Events | Comments Off

Videos from the ICE Forum

Posted by Marieke Guy on August 19th, 2011

Some vox pop videos created at the JISC International Curation Education (ICE) Forum are now available:

Stuart MacDonald from EDINA refers to selection and appraisal.

Natalie Walters from the Wellcome Library talks about the need to listen to researchers/users.

Mike Furlough from Penn State University is concerned about building capacity in the libraries to work with researchers.

Bill Veillette from the Northeast Document Conservation Center talks about how to provide effective training.

Posted in Archiving, Events | Comments Off

Closing the Digital Curation Gap

Posted by Marieke Guy on July 6th, 2011

Last week on the day before the ICE Forum (28th June 2011) I attended the Closing the Digital Curation Gap Meeting.

CDCG is an International Collaboration to Integrate Best Practice, Research & Development, and Training in Digital Curation. It has been running since October 2009 and was scheduled to finish in September this year but has just been given an extension (till September 2012). A comprehensive overview of the project is given on the Digital Curation Exchange Web site.

The Closing the Digital Curation Gap (CDCG) collaboration is designed to serve as a locus of interaction between those doing leading edge digital curation research, development, teaching, and training in academic and practitioner communities; those with a professional interest in applying viable innovations within particular organizational contexts; IMLS; JISC; the DCC, charged with disseminating such innovation and best practices; and the SCA, charged to build a common information environment where users of publicly funded e-Content can realize best value by reducing the barriers that inhibit access, use and re-use of online content.

I have come along to the project at a fairly late stage but hope I can still be of use and possibly offer a new perspective (that of not being an expert!).

The June meeting was held at the JISC offices in London and was a joint meeting of the US and UK partners. The UK was represented by members from JISC, UKOLN, ULCC, HATII, the BL and the DPC; the US had people from the Bishoff Group, Penn State University Libraries, Purdue University Libraries, University of North Carolina at Chapel Hill and University of Toronto. [Thanks to Sharon McMeekin from the DPC for sharing her notes to help jog my memory].

The aims of the meeting were to discuss the outputs of the project so far and to set objectives for the continuation of the work in 2011/12. The main work so far has been staging a number of focus groups, work on decision trees and work on best practice guides. The Digital Curation Exchange Web site is the key resource that has been created. Much of the meeting involved discussion of the Digital Curation Exchange: we were encouraged to pass on constructive criticism, suggestions on process and ideas for future resources.

They have quite a lot to work on before the next meeting – good luck to them!

Posted in dcc, Events | Comments Off

JISC International Curation Education (ICE) Forum

Posted by Marieke Guy on July 1st, 2011

Earlier this week I was lucky enough to attend the JISC International Curation Education (ICE) Forum which was held on Wednesday 29 June 2011 at the Roberts Building, University College London. The aim of the day was to provide an international meeting place for educators, trainers, students and practitioners of digital curation to discuss, evaluate, swap knowledge, and potentially improve practice around course design, production of advice and guidance materials and creation and use of textbooks and scholarly material. It proved to be a very informative and interactive day and I think most attendees felt like real progress was made.
Neil Grindley, programme manager at JISC, kicked off by asking us to bear a few key questions in mind:

  • What forms of digital curation education are needed?
  • What approaches are being used?
  • What skills and knowledge do people need?
  • How can we most effectively share practice and resources?

Helen Tibbo – Educating the curator

Helen Tibbo from UNC Chapel Hill began setting the scene by providing an overview of what is happening at UNC and further afield in the US. She introduced a number of important projects and programmes including Educating Stewards of Public Information in the 21st Century (ESOPI-21), Closing the Digital Curation Gap, DigCCurr I, II (pronounced dij-seeker) and various Certificates in Digital Curation.

Professor Michael Seadle – Why do we need people who can do digital curation

Professor Michael Seadle, Humboldt University, Berlin, provided the most contentious talk of the day. He asked us to consider why we need people who can do digital curation, asking for answers more constructive than “so Library and Information Service professionals have something to do in the future” ;-)

He sees digital curation as a split between 1) traditional librarianship and digital humanities (metadata, selection etc.) and 2) basic computer science (system building). He explained that we currently treat the time-element of curation as a computing problem, in a similar way that we assume that reading in 2111 will be the same as in 2011.

Michael explained that we know that we perceive information differently at different time periods; this is part of the message of Marc Bloch and Michel Foucault. He feels that the goal of digital curation migration is to make sure that future readers and users can not only open content from the past, but that they can understand its meaning. A number of examples were given: Oliver Twist was originally published in serial form, and the publisher then had to migrate the book culturally to maintain comprehension; Bach has been parodied by PDQ Bach and is regularly adapted to new instruments.

We should recognise that this is a form of digital cultural migration, and digital cultural migration is more complex than recognising format migration. Some of the trigger factors include:

  • Places, names & events that are time-bound
  • Language changes
  • Changes in social mores and tastes
  • Changes in causal perceptions

In an attempt to find potential solutions Michael looked at software processing which can recognise text strings and can give clues to context. As we need to make statistical judgements, the more clues we have the better. The three solutions we are looking at are:

  • Machine based (today) – easy – links for names, links to explanation of language changes
  • Human based – hard – flagging social mores and tastes that could change, causal perceptions
  • Machine based (tomorrow) – harder – machine intelligence that recognises trigger events

So what training implications does this have? We need to look at ethnographic training (recognising circumstances that are triggers), standard computer science training, AI training (robotics – understanding of perceptions) and LIS training. Michael concluded that the problem is solvable in small quantities (e.g. updating books with contextual notes); on a larger scale the problem is solvable by designing software that recognises and addresses it.

Michael’s talk was very interesting but did get quite a few people hot under the collar. The Q&A section had people point out the dissonance between what Michael had said and what archivists do. Some asked: is this not the work of cultural researchers? Is it out of scope? Is it scalable? Others pointed out that the OAIS model has a commitment not only to preserving bits but also to keeping them understandable over time, and raised the issues around just-in-time vs prophylactic processing. There were mentions of linked data as a potential way to provide the linkage and of crowd sourcing. Others offered more practical approaches such as packaging materials: adding in an index, a legend, keeping resources together. The main questions left on everyone’s lips at the end of the session were: where is it worth putting in the effort? And what does all this mean for professional identity?

Alan Bell – Knowing what we don’t know: Using a devoted teaching model to deliver professional education

After a much needed coffee break Alan Bell from the University of Dundee (who had raised many of the scalability issues in the previous session) took a look at whether our masters degree programmes provide the skills/knowledge that students require.

Alan explained that to answer this question others need to be considered, for example how well prepared are our educators to teach what students need? Alan then gave a whirlwind tour of the skills needed (including researching and investigative skills) and how our current programmes support this.

Steve Hitchcock, Institutional Digital Repositories: What role do they have in curation

Steve Hitchcock from the JISC Keepit project scared us all by pointing out what a huge amount of data there is out there. He then gave an overview of the repository layer which includes institutional repositories (research papers, preprints, postprints) and more. Keepit have worked with 4 different repositories: ecrystals (science data), University of the Arts London (arts data), EdShare, University of Southampton (teaching data) and NECTAR, University of Northampton (research data). Steve pointed us to the DRIVER aggregate of repositories and the Data Asset Framework, which they have used on their test bed repositories.

He sees there as being a middle route for repositories in their role in preservation. Repositories will not quickly become preservation repositories and repository managers are not archivists, but they both have a role in preservation. Steve concluded by saying that when it comes to repositories and digital curation we need to avoid creating a sense of urgency as it paralyses people. Instead we need to create a sense of capability.

The JISC Keepit project findings have recently been released.

Gordon McKenna – Cultural Heritage Digital Collections

Gordon McKenna from Europeana, the Collections Trust and Culture Grid took a look at cultural heritage digital collections. He introduced us to Digital Curator Vocational Education Europe (DigCurV), a project funded by the European Commission to establish a curriculum framework for vocational training in digital curation. DigCurV received quite a lot of airtime during the forum.

Simon Hodson – Perspectives on Curation Education for Research Data

Simon Hodson from JISC began by looking at some definitions of research data. He used definitions from a selection of sources (projects, institutions and other). For example the definition from the Sudamih project sees it as “not just structured information on computers but the whole range of materials that researchers must assemble and analyse in order to produce their research outputs”. Quite a lot of work has gone in to clarifying meaning in this area.

Simon’s opinion was that this was all very nice, but the bigger question was how do we persuade researchers to give a damn? His answer was: by raising awareness, by developing an understanding of appropriate sharing and by developing information management skills.

He asked us to consider if research data curation is now seen as a part of good research practice. It may well be, but researchers want to do research. Simon highlighted some good practice in the form of the Incremental project, which offered guidance and awareness raising. He also pointed to some DCC resources – How to appraise and select research data by Angus Whyte – and to 5 new projects producing research data management training materials (RDMTrain).

Simon felt that the challenges in this area were not just making training accessible and relevant but continually providing useful disasters and discipline examples.

The brief panel session was useful but I could sense that there was a feeling in the room that once again the digital preservation and curation community were in the ‘echo chamber’ and the ‘education’ component was being forgotten. More work to be done after lunch.

Joy Davidson, What areas can we best collaborate on and what are we doing now

Joy Davidson set the mood for the afternoon by ditching her original title, The benefits of collaboration: delivering more effective teaching and training through cooperation, and replacing it with something more forward looking. She explained that so far there had been good levels of collaboration but that there were still some people who just weren’t sitting round the table. These were people from different research backgrounds, industry, national bodies, and people who were taking the courses. Joy then shared a few recommendations from a recent trip to the National Approaches to Digital Preservation (Tallinn) and the previous day’s Closing the Digital Curation Gap meeting (which I’d also attended).

Firstly we need more metrics and benchmarks. We can then compare and contrast courses and what employers want out of these courses. This seems to tie in nicely with the recent HE white paper.

We also need to develop data management plans. One way to do this is by getting professional bodies and industry involved in endorsing data management. Joy pointed us again to the DigCurV project, APARSEN and TIMBUS. There is also a need for more use of tools for testing such as Planets.

Joy showed a list of current courses available from DPE, DPOE, DCC, Jorum and Vitae DB; this, she said, was good, but not good enough. Joy speculated that the DPOE pyramid and the categorisation of executive, managerial and practical courses might be useful here.

Other practical ways forward include getting people to recognise their needs and helping people to get more practical/hands-on experience. Options like internships and professional exchanges were offered as another way to educate and build skills: “it’s not all about classrooms.” Joy also mentioned current work at Purdue on data curator profiles, which ask who is becoming a data curator.

Kate Fernie, DigCurV Project – emerging survey results

Kate Fernie gave a fuller introduction to the DigCurV project, funded under the EC’s Leonardo da Vinci programme. She explained that digital preservation was now important for cultural heritage institutions all over the world. There were 82,000 related staff across Europe and DigCurV was primarily a networking project. One of their current activities was a survey about the training opportunities available.

Kate explained that DigCurV had already identified a few online courses available, which was quite impressive. However although there was lots of literature online there was very little training or educational material.

ICE Forum Networking (the ‘ICE-a-FoN’)

The next session was a networking session for delegates. The ICE-a-FoN (a name Neil Grindley was very proud of) was an opportunity for delegates to engage in a semi-structured networking session. Three zones in the coffee area designated as ‘curriculum’, ‘training’ and ‘resources’ featured posters and other relevant information, and delegates were encouraged to submit forms saying what they’d learnt – the prize was a Kindle! The session was really useful, a great idea.

Back in the main lecture room Heather Bowden, UNC Chapel Hill, gave a quick summary of conversations overheard and opinions elicited during the ICE-a-FoN. The most memorable was the way to remember the difference between education and training: You’d like your children to get sex education but not necessarily sex training…

Cal Lee, What do you care about if you care about digital curation?

After the excitement of the ICE-a-FoN, Cal Lee from UNC Chapel Hill brought us back down to earth by considering what you care about if you care about digital curation. He explained that new professionals must care about traces and values and that there is a need to inspire those who are going to learn about digital curation.

Lightning Talks

Possibly the most enjoyable part of the day (though it was a great day generally) was the lightning talks. Anyone who had a burning desire to talk about anything related to digital curation and education was given just 3 minutes. The talks were:

Symfonie in data by verbeeldingskr8

  • Neil Beagrie – Neil introduced the JISC Digital Preservation Benefits Analysis Tools project.
  • Marina Noordegraaf – Marina used her illustration Symfonie in data (see above) to state the importance of just starting and not waiting until we think we know everything.
  • Beth Yakel – Beth asked us how do we evaluate student learning? She explained that this involved learning to change expectations as well as the importance of evaluating student learning styles and preferences.
  • Patricia Sleeman – Patricia talked about the DPTP’s work with those from third world countries including Iraq. She quoted Margaret Hedstrom: “Outside institutions may have some short-term funding, which they’ll use to produce valuable resources that don’t stay in the country of origin. There is no plan for sustainability. In the long run this will create a skewed record of culture, where the culture from developed countries will be well preserved and the culture from the underdeveloped countries may be lost.”
  • Scott Brandt – Scott introduced the Data Curation Profiles toolkit.
  • Angela Dappert – Angela showed us the TIMBUS project – and introduced the new DPC staff.
  • Sharon McMeekin – Sharon, another new DPC staff member, carried on with other DPC plans including APARSEN.
  • Mike Furlough – Mike talked about the ARL eScience institute: ensuring that Research Libraries are not disconnected from scientists.
  • Greg Jansen – Greg showed the Curator’s Workbench from UNC.
  • William Kilbride – William reminded us that DPC gives grants to enable members to attend training courses. He also pointed out the 5 new DPC study areas: Preserving email, preserving sound and vision, digital forensics, IPR, trust regarding ejournals.
  • Heather Bowden – Heather gave a quick demo of the Digital Curation Exchange.
  • Kevin Ashley – Kevin picked up on some of the morning’s concerns, pointing out that we were failing to recognise that we have had many of these conversations already and that there are already lessons learnt. He pointed out the Swan skills report and the Donnelly/Pryor article.
  • Sheila Corrall – Sheila questioned professional silos and suggested that professional bodies could join and discuss overlaps.

Seamus Ross, Educating and Validating the Capabilities of Emerging Digital Management Professionals

The closing plenary was given by Seamus Ross from the University of Toronto. Seamus looked at what is needed in a data curator. He asked if we need data curators who are subject specialists or data curators who are technologists. His argument was that it was harder to train someone to be a scientist and so there was a real need to educate producers, managers and users of digital content. He explained that as well as digital curation training, we need to educate data creators to make preservable and curatable data. A digital curation professional must think like a humanist scholar, behave like an engineer, have the ethical standards and have deep subject knowledge.

Like others during the day, Seamus emphasised the importance of case studies. He also called for an international professional association for digital preservation and for accreditation and certification of programmes. He concluded that digital curators need to be passionate about preservation, though a delegate suggested that it was more important that you be passionate about what you are preserving.


I really enjoyed the forum and felt that real progress was made during the day. The atmosphere was light but still focused and constructive; the digital curation community are a great bunch. My only suggestion/slight criticism is that it would have been good to get people along who are actually taking the courses discussed during the day. The cost of the course (£25 for students) and possibly its timing (during the student holiday period) may have been factors here. Maybe something to bear in mind for next time?

There is a TwapperKeeper archive and a Summarizr site for the #iceforum hash tag.

Posted in Events | 2 Comments »

Digital Obsolescence – buzz phrase or a real issue?

Posted by Marieke Guy on June 20th, 2011

Deborah Wilson is currently undertaking a research study into digital obsolescence; the results of which she will be happy to share once the analysis is complete. To benefit from quality data she is asking that people complete an online survey. The survey is tailored to those working in the records management area but all working with digital data are welcome to fill it in.

The survey will take no longer than 10 minutes to complete and your participation is greatly appreciated.

Please note that any personal data collected as a result of this survey will be made anonymous and will not be disclosed to any third party or processed for any other purpose.

Posted in Project news | Comments Off

Digital Archaeology

Posted by Marieke Guy on June 13th, 2011

An exhibition on the history of the Web and Web design will run in New York this month.

The exhibition, Digital Archaeology, debuted at Internet Week Europe 2010 and “charts the disruptive moments of web design and celebrates the characters behind its radical evolution”.

The exhibition will showcase 28 important Web sites including: The Project – the first website, published by Tim Berners-Lee at CERN in 1991, and Word.com – one of the earliest and most influential e-zines, a true multimedia experience, incorporating games, audio, and chat.

An introduction to the exhibition explains…

“The web is just 20 years old, yet it has transformed our lives utterly, down to the bone. We do, see, hear, share, copy, sell, buy, interact, relate with authority and participate in society differently. Things will never be the same again. Over this short time, technological and communications developments have been so fast that the groundbreaking work of the early creative pioneers, produced on now defunct hardware and software, have disappeared almost as soon as they appeared, like Mayflies in spring doomed to die as the daylight fades.

Soon we will know less about these HTML blossomings than we do about the relief carvings in Mohenjo-Daro or the Yucatán. While they helped define our new culture, almost none of the websites of less than two decades ago can be seen at all. Today, when almost a quarter of the earth’s population is online, this most recent artistic, commercial and social history is being wiped from the face of earth and a hundred million hard drives lie festering in recycling yards or rusting in landfills.”

In his article on the exhibition entitled Internet history is vanishing into thin air Allan Hoffman asks:

Does your company have a working, surfable copy of its first website or the second or third versions? Probably not. But if your company published an annual report or even a newsletter in 1954 or 1999, you can bet someone saved it, and you could dig up a copy.

It seems Web preservation has at last hit the mainstream.

Posted in Project news | Comments Off