JISC Beginner's Guide to Digital Preservation

…creating a pragmatic guide to digital preservation for those working on JISC projects

DPTP Web Archiving Workshop

Posted by Marieke Guy on 1st July 2010

On Monday I attended the Digital Preservation Training Programme (DPTP) Web archiving workshop.

The Digital Preservation Training Programme

The Digital Preservation Training Programme (DPTP) was initially a project funded by the JISC under its Digital Preservation and Asset Management programme. It has been led by ULCC and had input from the Digital Preservation Coalition, Cornell University and the British Library.

The programme offers a modular training programme with content aimed at multiple levels of attendee. It builds on the foundations of Cornell’s Digital Preservation Management Workshop.

The DPTP team currently run a 3-day digital preservation course, of which Web archiving is a module. However, this was the first time they had offered the module independently of the rest of the course. I believe there are intentions to offer more one-off modules in the future. They are also planning to offer more content online, both freely under a Creative Commons licence and as part of a paid-for course. This work is still in the development stage.

For the Web archiving workshop they had squeezed the module into half a day, which made for some rushing of content and a late finish. I have a feeling they will be rethinking their timetable. There was way too much content for 3 hours, and a longer workshop would allow more time for networking and group activities.

Approach

The team (Ed Pinsent and Patricia Sleeman) started off by introducing the 3-legged stool approach (borrowed from Cornell). This approach sees understanding of digital preservation as requiring consideration of 3 main areas: technology, organisation and resources. While technology used to be seen as the silver bullet, these days achieving good digital preservation is much more about planning (the organisation and resources legs). The Web archiving module primarily considered issues relating to the technology and organisation legs.

At the start of the workshop the DPTP team were upfront about the approach they wanted to take and what they wanted from attendees. They explained that they were not there to promote ‘one right way’ but to offer an explanation of the current situation and then allow us to make the decisions. They were keen to encourage interaction and informal question asking: “there are no stupid questions”.

Content

The content of the day was really useful; the team trod a nice line between covering cultural issues and the technologies that enable archiving. I won’t go into what I learnt here (that’s the content of another blog post), but despite being fairly familiar with Web archiving I found there was lots of new information to digest. Not only did I learn from the team but I also learnt by chatting to others interested in Web archiving. This form of focused networking can be hugely beneficial. For example, the person sitting next to me had been charged with acquiring some of the Government Web sites that are for the chop as part of the (up to) 75% cut. His current big concern was what to do about domain names. We had lots to discuss.

There were also attendees from outside the public sector (such as the lady from a commercial bank). They offered a different perspective on issues and it was refreshing to spend time with them.

Late in the morning we heard more formally about Web archiving from a guest speaker, Dave Thompson from the Wellcome Trust. Dave spoke about their archive of medicine-related Web sites created through the UK Web Archive. Many of the sites they collect (such as personal accounts of experience of illness) are out of scope for normal preservation programmes. The Wellcome Trust don’t mediate the content of the Web sites collected; as Dave explained, it’s not the job of the archivist (or librarian) to do so. For example, there are books on witchcraft and quackery in the Wellcome library. It’s the job of the archive to preserve and make available these source materials; it’s the job of the historian or researcher to interpret them. The archive will provide a valuable record for our future researchers.

Dave ended with a quotation from Adrian Sanders, Liberal Democrat MP for Torbay. As part of the debate on “The future for local and regional media”, Sanders had said that he thought that “Most of what’s online is indeed tittle tattle and opinion.” Dave observed that such an opinion from a member of parliament was extremely worrying. Many still fail to understand the value of the Web and the value of preserving it. Tittle tattle and opinion is what our papers consist of (and we preserve them) and what history is ultimately made of.

Overall

I really enjoyed the workshop and would thoroughly recommend it to anyone who needs practical advice to get them started planning a Web archiving project. The speakers were excellent, both knowledgeable and receptive to the information needs of the audience. As Ed explained at the start, “I learn a lot from you too”.

My only criticism of the day is that, due to over-running, some of the slides were missed out. Also, technology problems meant that the team were unable to play the screencast of HTTrack doing its stuff. I think the screencast would make a valuable contribution to the resources offered. The resources available both online and off were excellent; we even received a great certificate for completing the course. Something to hang on my wall!

Posted in training