JISC Beginner's Guide to Digital Preservation

…creating a pragmatic guide to digital preservation for those working on JISC projects

Archive for June, 2010

Creating Open Training Materials

Posted by Marieke Guy on 24th June 2010

Yesterday I attended the Open University annual Learning and Technology conference: Learning in an open world.

I’ve talked more about my general feelings on the day in another blog post (Learning at an Online Conference) but here I want to focus in on the content of one particular session: Creating Open Courses, presented by Tony Hirst of the Open University (you can watch a playback of the session in Elluminate). During my time working on the JISC Beginner’s Guide to Digital Preservation I’ve been thinking quite a bit about what it means to create open training/learning materials and Tony’s approach struck a chord with me.

Tony’s slides are available on Slideshare and embedded below.

Tony’s talk focused on his creation of the T151 course, an OU module on Digital Worlds, part of the Relevant Knowledge programme.

Tony talked about how the OU is making its print content open through services like OpenLearn and its AV material through YouTube and iTunesU. However, while this is happening the mode of production is not necessarily open. He explained that it can take several years to produce a course, and that it can take 5 to 10 academics 18 months to write one.

Tony wanted to move away from this approach and write the T151 course in public and virtually in real time – 10 weeks of content in 20 weeks. He did so by writing blog posts. The course actually took about 15 weeks to write.

Tony chose WordPress primarily because of the restrictions on what you can embed; in this way it is similar to Moodle (the open source VLE that the OU and others are familiar with). This is an interesting approach. I am leaning heavily towards WordPress for the final delivery of the Beginner’s Guide (primarily due to time constraints and the fact that I already have experience using WordPress), although the restrictions are sometimes a hindrance to me rather than a benefit! Another reason he chose WordPress was that it “gives you RSS with everything”. Agreed: this can be a real bonus.

Tony then wrote blog posts on a series of topics according to a curriculum developed with other academics. He used a FreeMind mind map to get his ideas down and then each blog post was made up of 500-1000 words and took 1-4 hours to write. The end result would take students 10 minutes to 1 hour to work through. Within his posts Tony embedded YouTube movies and other external services. The end result was not a single fixed linear narrative but a set of emergent narratives. He used a GraphViz visualisation to show reverse trackbacks, where posts reference previous posts.
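To give a feel for how a reverse trackback diagram might be put together, here is a minimal sketch in Python that writes out GraphViz DOT text with one arrow from each post to the earlier posts it references. This is not Tony's actual code: the function name `trackback_dot` and the post titles are invented for illustration, and I am assuming the links between posts have already been collected into a simple dictionary.

```python
def trackback_dot(references):
    """Return GraphViz DOT text with one edge per 'post -> earlier post' link.

    `references` maps a post title to the titles of earlier posts it links to.
    """
    lines = ["digraph trackbacks {", "  rankdir=LR;"]
    for post in sorted(references):
        for earlier in sorted(references[post]):
            # Each edge points from the newer post back to the post it cites.
            lines.append(f'  "{post}" -> "{earlier}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical post titles, invented purely for illustration.
posts = {
    "Game mechanics": ["What is a game?"],
    "Designing levels": ["Game mechanics", "What is a game?"],
}

print(trackback_dot(posts))
```

The printed text could be saved to a file and passed to the GraphViz `dot` tool to draw the graph.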

The blog also contained questions, readings and links to other relevant content. The idea of this was that each area could be populated from a live feed maintained by someone else. Tony felt that the important thing was to allow students to explore and do (e.g. use GameMaker to build a game and submit it), share (using Moodle forums) and demonstrate.

Tony wanted to get away from the idea that there’s a single route through the course and that the educator is expressing the one true answer. The students were also provided with a Skunkworks area in a wiki and a FreeMind mind map of all the resources in the course. Assessment was given through short questions and a larger question: they had to write a game design document for a game. He was looking for students to have opportunities to surprise.

In the Q&A Tony talked about how he had written the course while trying to do 101 other things at the same time and how a lot of the course chunks he would write for multiple reasons – this seems to be the approach I’m currently taking. Tony concluded by saying that creating the course was a travelogue in part and was his journey through that material.

How good to hear the approach that I’m currently taking (or trying to take) being endorsed!

Tony has written more about his approach on http://blog.ouseful.info and is very vocal on Twitter as @psychemedia.

Posted in Events, trainingmaterials | Comments Off

LiWA Launch first Code

Posted by Marieke Guy on 14th June 2010

Today the LiWA (Living Web Archives) project announced the release of the first open-source components of the “liwa-technologies” project on Google Code.

The LiWA project is looking “beyond the pure ‘freezing’ of Web content snapshots for a long time, transforming pure snapshot storage into a ‘Living’ Web Archive. ‘Living’ refers to a) long term interpretability as archives evolve, b) improved archive fidelity by filtering out irrelevant noise and c) considering a wide variety of content.”

They plan to extend the current state of the art and develop the next generation of Web content capture, preservation, analysis, and enrichment services to improve fidelity, coherence, and interpretability of web archives.

This is the first release of the software so they are keen to receive feedback and comments.

Posted in Archiving, Web | Comments Off

Have you got a Case Study for Us?

Posted by Marieke Guy on 10th June 2010

If you are involved in a JISC project (or work in a similar environment) and would like to offer us a case study of your digital preservation methods please do get in touch.

Some areas that you might want to include in your case study are:

  • The background to your project.
  • A description of the digital preservation problem being addressed.
  • An explanation of the approach taken.
  • A summary of any problems experienced.
  • An explanation of the things you would do differently today, based on the experience you have gained.
  • References
  • Contact Details

Posted in Project news | 1 Comment »

Preserving and the Current Economy

Posted by Marieke Guy on 8th June 2010

Yesterday David Cameron warned the British public of what he called the “inevitably painful times that lie ahead”. His speech referred to the spending cuts that are seen to be necessary to reduce the 70 billion debt the UK currently has. A few weeks ago the Coalition government unveiled their first round of spending cuts and the budget on 22 June is likely to lower the axe again. The Department for Business Innovation & Skills (BIS) has the Higher Education budget down for £200 million in efficiencies.

It is inevitable that a number of organisations will close and many projects will come to an end. The British Educational Communications and Technology Agency (Becta), the organisation which promotes the use of technology in schools, was one of the first to go.

So what role will digital preservation and access play in the current economic and fiscal situation?

Digital preservation is more important than ever in a time when the wealth of what JISC and other government funded organisations have created could potentially slip away.

After the closure of Becta was announced there was much discussion on Twitter about what would happen to their Web site and their intellectual assets. Some of their work will be carried on by other government organisations and it’s likely that these resources will be transferred over to other sites and databases. Their Web site is currently one of those preserved by the National Archives; however, there are still questions over what else will be preserved and the processes that will take place. Will they mothball their Web site? What other Web resources will they save? They would do well to consult the JISC Preservation of Web Resources handbook.

It will be an interesting case study to watch.

However, it is not only government digital objects that are at risk. Those of commercial companies are unlikely to stand the test of time either.

In response to this the UK Web Archive have created a collection for the recession containing Web sites from high street stores that have closed down. The Credit Crunch Collection, initiated in July 2008, contains records of high-street victims of the recession including Woolworths and Zavvi.

There is also a worry that many digital records from bankrupt companies will disappear in the haste to sell off assets. On his blog a records manager explains how in the past archivists have waded in to save companies’ records in a form of “rescue” archiving. However, “When a modern business goes bust, many of its records will exist only in electronic form… The inheriting organisation will always be under pressure to take the easiest and cheapest way to dispose of a predecessor’s assets, which in practice probably means that data will be wiped and the hardware sold on.”

It seems that much is likely to be lost in the next few years if we aren’t careful.

Posted in Project news, Web | Comments Off

What is Digital Preservation?

Posted by Marieke Guy on 4th June 2010

The first question I asked myself when I began researching the JISC Beginner’s Guide to Digital Preservation was “what exactly is digital preservation?”.

The experts have put a lot of effort into clarity in this area and a good working definition for the sake of this guide is:

The series of managed activities necessary to ensure continued access to digital materials for as long as necessary.

This definition comes from the Digital Preservation Coalition (DPC) Definitions and Concepts list and I feel it works because it is clear and specific.

Let’s look at it a little closer:

  • Managed – Digital preservation is a managerial problem. All activities (the planning, resource allocation, use of technologies, etc.) need to have been thought about and take place for a reason. The term managed stresses the need for a policy.
  • Activities – The policy needs to filter down to a list of processes: tasks that can take place at specified times and in specified ways.
  • Necessary – We are looking at what needs to be done. In your policy you will have looked at how long you want to preserve the objects for. Necessary talks about the activities needed to achieve a specified level of preservation. There may be other useful activities but we want to look at the most essential ones here.
  • Continued Access – Access is the key here. Most objects in the public sphere are preserved to enable access and retrieval. How long this access is needed will have been discussed and should be defined in your policy.
  • Digital Materials – Digital materials, digital objects, call them what you will. This is the stuff you are preserving. Different objects require different processes.

Other useful definitions are available from DigitalPreservationEurope (DPE), the Digital Curation Centre (DCC), the Digital Preservation of ALCTS Preservation and Reformatting Section (Working Group on Defining Digital Preservation) and Wikipedia. Note that digital curation tends to refer more to science/research data.

Many organisations choose to qualify their definition of digital preservation with three terms of preservation:

  • Long-term preservation – Continued access to digital materials, or at least to the information contained in them, indefinitely.
  • Medium-term preservation – Continued access to digital materials beyond changes in technology for a defined period of time but not indefinitely.
  • Short-term preservation – Access to digital materials either for a defined period of time while use is predicted but which does not extend beyond the foreseeable future and/or until it becomes inaccessible because of changes in technology.

For JISC projects it will normally be required that digital objects are preserved for the medium-term or the long-term.

A really useful slideshow introduction to digital preservation was written by Michael Day, UKOLN and is available on Slideshare.

Posted in definition | Comments Off

Rolling Back the Years

Posted by Marieke Guy on 3rd June 2010

I’ve been having a little play with MementoFox, a Firefox add-in that “links resources with their previous versions automatically, so can you see the web as it was in the past”.

Once you have installed the add-in a little slider bar is added to your Firefox Web browser. When browsing any Web site you can use the slider bar to select the date for which you’d like to see the page. Memento will then look for the closest archived copy available. As you can see I have used MementoFox on the UKOLN home page.

Below is the page for around the time I started working at UKOLN – 10 years ago! The page here is taken from the Internet Archive Wayback Machine.
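The idea of looking for the closest archived copy can be sketched in a few lines of Python. This is only an illustration of the principle and not how MementoFox itself is implemented: the function name `closest_snapshot` and the capture dates are made up, and I am assuming a list of the dates on which a page was archived is already to hand.

```python
from datetime import datetime


def closest_snapshot(snapshots, wanted):
    """Return the capture date nearest to the date picked on the slider."""
    if not snapshots:
        raise ValueError("no archived copies available")
    # Compare the absolute time difference between each capture and the wanted date.
    return min(snapshots, key=lambda taken: abs(taken - wanted))


# Hypothetical capture dates for a single page, invented for illustration.
archived = [
    datetime(2000, 3, 1),
    datetime(2005, 6, 15),
    datetime(2010, 5, 30),
]

print(closest_snapshot(archived, datetime(2000, 6, 3)))
```

A real service might also have to choose between several archives and decide what to do when the nearest copy is still years away from the date requested.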

UKOLN Home page in 2000

And here is the page as it is now.

UKOLN Home page in 2010

I initially used version 0.8.6 of MementoFox and had a few problems with viewing embeds (of video, slides etc.) of blogs in Firefox. Version 0.8.7 seems to have sorted this out.

The Memento Project Web site is definitely worth taking a look at. There are various time traveling scenarios and walkthroughs and more information on where the project is going. The project “wants to make it as straightforward to access the Web of the past as it is to access the current Web.”

Memento slider bar

“At this point, there aren’t any formal technical specifications detailing the Memento framework but we will get to that. For now, the information on this site should provide quite a good insight into how Memento is trying to change the Web by adding a time dimension to its most common protocol, HTTP… If you are interested in establishing a Web with a memory, please join the Memento Development Group.”

Maybe in the future we’ll be able to switch our ‘time-versions’ of Web pages as easily as we switch our blog themes.

Posted in Project news | Comments Off