JISC Beginner's Guide to Digital Preservation

…creating a pragmatic guide to digital preservation for those working on JISC projects

Digital Preservation Benefits Toolset Workshop

Posted by Marieke Guy on June 10th, 2011

UKOLN have announced that registration is now open for the Workshop to disseminate the Digital Preservation Benefits Toolset and accompanying materials such as user guides and factsheets to the research community.

Workshop Details

Tuesday, 12 July 2011: 12.30 - 16.00
London South Bank University
Main Conference Room
The Keyworth Centre
Keyworth Street
London
SE1 6NG

Workshop registration is free, but please note that places are limited and early registration is advised. At least 24 hours’ notice of cancellation is required; otherwise a fee of £50 will be charged to recover costs.

The Digital Preservation Benefit Analysis Tools Project is funded by the Joint Information Systems Committee (JISC) and runs from 1 February to 31 July 2011.

The project has tested and reviewed the combined use of the Keeping Research Data Safe (KRDS) Benefits Framework and the Value Chain and Impact Analysis tool, which were first applied in the I2S2 Project for assessing the benefits and impact of digital preservation of research data. We have extended their utility to, and adoption within, the JISC community by providing user review and guidance for the tools and by creating an integrated toolset. The project consortium consists of a mix of user institutions, projects, and disciplinary data services committed to the testing and exploitation of these tools and the lead partners in their original creation.

A project Web site and the project plan are available and further outputs will be available from the Web site during the summer. The project partners are UKOLN and the Digital Curation Centre at the University of Bath, Centre for Health Informatics and Multi-professional Education (CHIME) at University College London, UK Data Archive (University of Essex), Archaeology Data Service (University of York), OCLC Research, and Charles Beagrie Limited.

Details concerning the Workshop programme, venue and registration are all available from the UKOLN Web site.

Posted in Events, Workshops | Comments Off

Update on the LOC Twitter Archive

Posted by Marieke Guy on June 3rd, 2011

It’s all been very quiet on the Twitter front at the Library of Congress since their announcement last year, so it was good to see an update written by Audrey Watters from O’Reilly Radar. The article, entitled How the Library of Congress is building the Twitter archive, is a write-up by Audrey following a conversation with Martha Anderson, the head of the LOC’s National Digital Information Infrastructure and Preservation Program (NDIIPP), and Leslie Johnston, the manager of the NDIIPP’s Technical Architecture Initiatives. It gives us a little insight into how the LOC is dealing with the challenges and opportunities of archiving digital data of this kind.

The article cites the biggest challenges as the size of the archive (we are now producing 140 million tweets per day!), the composition of a tweet (a JSON file with a lot of Twitter metadata) and the layers of complexity (e.g. dealing with all the URL links).

Dealing with these complexities efficiently is a big job.

This requires a significant technological undertaking on the part of the library in order to build the infrastructure necessary to handle inquiries, and specifically to handle the sorts of inquiries that researchers are clamoring for….Expectations also need to be set about exactly what the search parameters will be — this is a high-bandwidth, high-computing-power undertaking after all.

No decision has been made yet on which tools to use but the library is “testing the following in various combinations: Hive, ElasticSearch, Pig, Elephant-bird, HBase, and Hadoop”.

We wait with bated breath!

For those who like analogies, Martha Anderson has just written an interesting post on how saving digital information is a lot like jazz. In Digital Preservation Jazz, Martha talks about the creative, diverse, and collaborative nature of digital preservation.

Posted in Archiving | Comments Off

The Future of the Past Report

Posted by Marieke Guy on May 28th, 2011

On 4-5 May 2011, the Cultural Heritage and Technology Enhanced Learning unit hosted a workshop for invited experts in the field of digital preservation. It was attended by around 60 representatives from universities and research centres, memory institutions, industry and other organisations such as foundations dedicated to digital preservation.

The event started with a stock-take of achievements and ongoing activities funded under the ICT programme, presenting the portfolio of digital preservation projects and the research roadmaps proposed by the community so far. This presentation was based on a report commissioned for the event which can be downloaded in the documents section below.

The main part of the workshop consisted of group discussions providing input to the digital preservation research agenda within the next EU framework programme for research and innovation (Common Strategic Framework, 2013-2020). A number of reports are now available from the workshop:

Posted in Events, Reports | 1 Comment »

Digital Preservation News

Posted by Marieke Guy on May 17th, 2011

I’m aware that recently I haven’t spent a lot of time adding posts to this blog. Life and other projects have got in the way! However, I do like to keep up to date with Digital Preservation news and wanted to share some of the best news blogs and RSS feeds available:

Signing up to these resources should keep you up to date with what’s happening in the digital preservation world.

Posted in Project news | Comments Off

DCC seeks volunteers to test LIFE tool

Posted by Marieke Guy on May 10th, 2011

As part of the JISC-funded Piloting the LIFE costs tool in UK HEIs project, the Digital Curation Centre is now seeking volunteers to test the effectiveness of the LIFE 3 tool within UK HEI repositories. Testing will run from early June until the end of July and participants will receive dedicated support in employing the tool. Participants will be asked to maintain an activity journal for a one-month period and JISC has provided some financial support for institutions wishing to take part. See the DCC LIFE project page for more information.

Posted in training | Comments Off

Free AQuA Events – QA for Digital Preservation

Posted by Marieke Guy on March 28th, 2011

The JISC Automating Quality Assurance Project (AQuA) is running a series of free events in April and June for coders, technical experts, collection curators and digital preservation practitioners.

The events will be helping attendees explore a number of questions including:

  • Do you have large amounts of digital content to look after?
  • How well do you know your digital content?
  • Is your file what it says it is?
  • Do your users do your QA for you?
  • Are you intimidated by digital preservation tools?
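One of these questions, "Is your file what it says it is?", lends itself to a small automated check. The sketch below is only an illustration and is not taken from the AQuA project: it compares the first few bytes of a file against the well-known signature for its extension. The function name matches_extension and the short SIGNATURES table are made up for this example, and a real tool would cover far more formats.

```python
from pathlib import Path

# Leading "magic" bytes for a few common formats.
# This table is deliberately tiny and only for illustration.
SIGNATURES = {
    ".pdf": b"%PDF-",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".jpg": b"\xff\xd8\xff",
    ".gif": b"GIF8",
}


def matches_extension(path):
    """Compare a file's leading bytes with the signature for its extension.

    Returns True or False for known extensions, and None when the
    extension is not in the SIGNATURES table.
    """
    path = Path(path)
    expected = SIGNATURES.get(path.suffix.lower())
    if expected is None:
        return None
    with path.open("rb") as handle:
        return handle.read(len(expected)) == expected
```

A False result only says that the content and the file name disagree; it does not say what the file really is.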

The AQuA events will be held 11-13 April 2011 and 13-15 June 2011 and will bring together digital preservation practitioners, collection curators and technical experts to automate quality assurance of our digital collections.

Preservation or quality issues can occur in our digital content from many sources:

  • When we create the content via digitisation (e.g. missing pages, duplicate pages, poor focus/contrast)
  • When the collection is stored (e.g. bit rot)
  • When the collection is processed or moved from store to store (e.g. when processes run out of memory or disk space)
  • When technology changes (e.g. when our standards and file formats become obsolete)

Manually checking material for these kinds of problems is laborious, challenging and, most critically, expensive. Checking samples of material reduces the cost, but can let through problematic quality issues. Automated tools that can check every digital item in a precise way should allow us to reduce our costs and increase the overall quality of our digital collections.
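As a rough idea of what checking every item can look like for the storage problems mentioned above (bit rot, incomplete copies), here is a minimal fixity sketch. It is an assumption-laden illustration, not the AQuA tooling: it records a SHA-256 checksum for each file under a folder, then later compares the folder against that record. The function names (sha256_of, build_manifest, audit) are invented for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root, manifest):
    """Compare the files under root with a previously stored manifest."""
    current = build_manifest(root)
    return {
        # Same name, different checksum: content has changed or been damaged.
        "changed": sorted(
            n for n in manifest if n in current and current[n] != manifest[n]
        ),
        # Listed in the manifest but no longer present.
        "missing": sorted(n for n in manifest if n not in current),
        # Present now but never recorded.
        "unexpected": sorted(n for n in current if n not in manifest),
    }
```

In use, build_manifest would be run once when a collection is ingested and the result stored somewhere safe; audit can then be run on a schedule or after each move between stores. A checksum mismatch shows that something changed, but not what or why, so it is a trigger for investigation rather than a repair.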

The AQuA events will provide the opportunity to get hands-on experience of developing and applying digital preservation techniques and technology to digital collections.

  • University of Leeds, 11th – 13th April 2011: Join the team for the first Mashup retreat at the beautiful Weetwood Hall Conference Centre and Hotel
  • British Library, London, 13th – 15th June 2011: Get involved in the second AQuA Mashup in the heart of London at the UK’s National Library

Inspiring locations, cross-discipline collaboration, challenges and prizes, and evening social events. Plus it’s FREE! Accommodation and refreshments are paid for.

More info at http://wiki.opf-labs.org/display/AQuA/Home

Register at http://aquamashup.eventbrite.com

AQuA is a JISC-funded collaborative project between the University of Leeds, the University of York, the British Library and the Open Planets Foundation.

Questions – by email to digital@leeds.ac.uk

Posted in Events | 1 Comment »

NDIIPP Report: Preserving Our Digital Heritage

Posted by Marieke Guy on March 23rd, 2011

The Library of Congress has recently released the Preserving Our Digital Heritage: The National Digital Information Infrastructure and Preservation Program 2010 Report (NDIIPP) – available in PDF.

NDIIPP have now spent over a decade working to develop a national strategy to collect, preserve and make available significant digital content, especially information that is created in digital form only, for current and future generations.

The report documents the achievements of the Library of Congress and its NDIIPP partners working together to create sustainable long-term access to digital materials.

Since NDIIPP was founded in 2000 by an act of Congress, a network of over 185 partners in 44 states and 25 countries has developed a distributed technical infrastructure, preserved over 1400 at-risk collections, and made strides to support a legal environment conducive to digital preservation.

The report describes a decade of action in digital preservation and outlines the short- and long-term plans to ensure libraries, archives and other heritage institutions in the United States can collect and provide long-term access to the resources of the 21st Century, and beyond.

The full press release is available from the NDIIPP site.

Posted in Project news | Comments Off

Preserving your Emails

Posted by Marieke Guy on March 2nd, 2011

Anyone who works at the University of Bath will be having a strange week this week…Last week the University email server ‘broke’ and since Thursday afternoon a limited service has been running. We currently have email but cannot see any messages that were sent and/or stored before Thursday afternoon.

A summary of the events leading up to the email downtime and the planned course of action over the next few days is given on the University of Bath Web site:

To briefly summarise the events prior to the downtime:

  • On Monday 21 February at 2pm it was noticed there were some errors being detected on the backup mail store – at which point we raised a call to Oracle the supplier of the components.
  • Thursday 24 February pm – errors spotted this time on Main mail server.
  • Shortly after corruptions became apparent and the service came to a halt.

The loss of email has left most of us in a bit of a mess – there can’t be many of us who don’t rely heavily on email. Email is now such a core part of our business processes that not being able to refer to old messages or see those that arrived last week (many people were on holiday during the half-term break) is very disorientating.

Brian Kelly has written a thought-provoking blog post asking if the situation suggests that it is Time to Move to GMail?

He argues:

So yes there will be problems with externally-hosted systems, just as there will be problems with in-house systems (and ironically the day before the BUCS email system went down and two days before GMail suffered its problems my desktop PC died and I had to spend half a day setting up a new PC!). It may therefore be desirable to develop plans for coping with such problems – and note that a number of resources which provide advice on backing up GMail have been provided recently, including a Techspot article on “How to Backup your Gmail Account” and a Techland article on “How to backup GMail”.

But in addition to such technical problems there are also policy challenges which need to be considered. At the University of Bath email accounts are deleted when staff and students leave the institution (and for a colleague who retired recently the email account was deleted a day or so before she left). One’s GMail account, on the other hand, won’t be affected by changes in one’s place of study or employment. In light of likely redundancies due to Government cutbacks isn’t it sensible to consider migration from an institutional email service? And shouldn’t those who are working or studying for a short period avoid making use of an institutional email account which will have a limited life span?

Personally I continue to use Hotmail when out of work, but I have no backup plan and the loss of my messages would be fairly devastating. Even losing my phone contacts left me in a pickle.

The JISC Beginner’s Guide to Digital Preservation has a section on preserving email which references the DCC’s Curating e-mails paper.

It’s times like these you really wish you had a plan…

Posted in Archiving | Comments Off

Getting Started in Digital Preservation

Posted by Marieke Guy on February 21st, 2011

The Digital Preservation Coalition are running a Getting Started in Digital Preservation workshop on Wednesday 21st March 2011 at Glamorgan Archives, Cardiff. Places are limited and cost £25.00 but are free to DPC members.

The workshop follows on from the Decoding the Digital conference and is being organised in conjunction with the British Library Preservation Advisory Centre, whose Approaches to Digitisation workshop I presented at a few weeks back.

The event provides an introduction to digital preservation, builds an understanding of the risks to digital materials, includes practical sessions to help you apply digital preservation planning and tools, and features speakers sharing their own experience of putting digital preservation into practice.

The sessions are aimed at librarians, archivists and collection managers in all sectors and in all sizes of institution who want to find out more about digital preservation and the implications for their organisation of having to retain, manage and provide ongoing access to large quantities of digital material.

Have a look at the Digital Preservation Coalition Web site for more details.

Posted in Events | Comments Off

New JISC Digital Preservation Projects

Posted by Marieke Guy on February 17th, 2011

Neil Grindley, programme manager at JISC, has kindly passed on details of 5 forthcoming digital preservation projects.

Preservation Tools Projects

15/10 Programme (Infrastructure for Education and Research)
6 Month projects (February – July 2011)
Total project cost = JISC funding up to £60,000 per project + institutional contributions

Projects Summary

AQuA – Automated Quality Assurance
University of Leeds; University of York, The British Library, Open Planets Foundation
Total Project cost – £96,629

Manual quality assurance (QA) of digitised content is typically fallible and can result in collections that are marred by a variety of quality issues. Poor storage conditions can result in further damage due to bit-rot. Detecting, identifying and fixing these issues in legacy digitised collections are costly and time-consuming manual processes. The Automating Quality Assurance Project (AQuA) will apply a variety of existing tools in order to automatically detect quality issues in digitised collections.

Two AQuA events will bring together digital preservation practitioners, collection curators and technical experts to present problematic digitised collections, articulate requirements for their validation, and apply tools to automate the detection and identification of preservation and quality issues. Strong sustainability, take up and dissemination of project results will be ensured by facilitating cross pollination in events that bring together experts from a cross section of organisations and disciplines, and leveraging OPF’s existing role and expertise in preservation technology support.

EPIC – Evaluating PLATO in Cambridge
University of Cambridge
Total Project Cost – £58,707

EPIC will investigate ways of improving the preservation services currently provided with DSpace@Cambridge. A key activity of the project will be to explore the feasibility of using Plato and associated Planets tools for preservation planning and relevant preservation activities. It will identify a small number of deposited collections that are at risk and use Plato to investigate preservation options and develop plans for them.

In order to inform the planning and evaluate the results we will engage with user communities, seeking to capture their understanding of the significant properties which must be preserved. While the focus of the project is on preservation planning, we expect to carry out preservation actions on some collections. In other cases experiments will be conducted to identify potential future actions.

FIDO – Forensic Investigation of Digital Objects
Centre for e-Research, King’s College London; Archives & Information Management (AIM) Service, King’s College London
Total Project Cost – £42,301

The project aims to investigate the application of digital forensics to support the curation and preservation of digital information held on computer systems and digital media. The project will evaluate the suitability of digital forensic principles and practices to enable HE archives to meet organisational commitments and legal requirements for maintaining digital records; assess the effectiveness of using open source digital forensic tools to identify, acquire, and analyse digital information; and seek to embed digital forensics tools & techniques into the working practices of the KCL Archives & Information Management (AIM).

KRDS/I2S2 – Digital Preservation Benefit Analysis Tools
UKOLN, University of Bath; University College London; UK Data Archive; Archaeology Data Service; OCLC Research; Charles Beagrie Ltd
Total Project Cost – £69,810

The project aims to test, review and promote combined use of the Keeping Research Data Safe (KRDS) Benefits Taxonomy and the I2S2 Value Chain Analysis tools for assessing the benefits of digital preservation of research data. It will extend their utility to and adoption within the JISC community by providing user review and guidance for the tools and creating an integrated toolset. The project consortium consists of a mix of user institutions, projects, and disciplinary data services committed to the testing and exploitation of these tools and the lead partners in their original creation (Neil Beagrie of Charles Beagrie Ltd and Brian Lavoie of OCLC Research). The project will be undertaken in seven work packages that demonstrate and critique the tools, and then create and disseminate the toolset and accompanying materials such as User Guides and Factsheets to the wider community.

SWORD – Software Ontology for Resource Description
University of Manchester; The European Bioinformatics Institute
Total Project Cost – £46,944

SWORD will build on existing work to produce a Software Ontology (SWO) and the workflow for developing and applying the SWO to data. Our primary use case exists in bioinformatics. The life sciences are rapidly producing data; these primary data are then analysed and described using a variety of means. To gain scientific value from these data we need to know how they were produced and analysed; this requires software descriptions. We will seek to engage with a wide variety of organisations and disciplinary areas to validate the work. A controlled vocabulary capturing software tools, their types, tasks, versions, provenance, and so on will help preserve our important scientific data and increase its worth.

Posted in Project news | 1 Comment »