Application profiles and metadata for repositories
  • Picture This! (hack day at Dev8D+)

    Posted on March 1st, 2011 Talat Chaudhri


    The Picture This! event on image metadata was held at Dev8D+ on 15 February 2011 at the University of London Union (ULU) in Bloomsbury, London. It led into the Picture This! Developer Challenge at Dev8D on 16/17 February 2011.

    The morning began with a brief, practical introduction to application profiles for image metadata by Talat Chaudhri of the Application Profiles Support Project, aimed at getting the attendees to think about the kinds of metadata solutions required for the specific problems that face them in dealing with images within the public-facing systems that they run in their institutions. He invited the attendees to think about the sorts of metadata and the kinds of relationships between images as related web resources that might be required in these systems.

    The attendees then delivered some lightning talks, outlining the sorts of problem spaces that they are seeking to deal with in order to deliver image resources more successfully to users. Most of these centred on EXIF and IPTC, as well as ISO 19139 for geospatial metadata. Other metadata standards that were mentioned included NISO MIX and VRA. The talks focussed on a range of issues including embedding image metadata within images, extracting metadata from images on services such as Flickr using the relevant API, auto-generation and enrichment of metadata, and visual surfacing of copyright and other information for users from metadata embedded within images. There was surprisingly little interest in managing relationships between images, for example where various different types of post-processing of a particular image have resulted in multiple related images that stem from the same original. It was also quite notable that comparatively little attention was given to subject metadata by attendees at the event. Holding the session at an event for developers may explain the relative lack of interest in these areas, which have been more significant issues at meetings of the Metadata Forum.
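    To give a flavour of the kind of embedded metadata the talks dealt with, here is a minimal sketch of reading EXIF tags out of a JPEG in Python using Pillow. It is purely illustrative and not taken from any attendee's work; the filename is a placeholder.

        # Minimal sketch: read embedded EXIF metadata from a JPEG with Pillow.
        # The filename is a placeholder.
        from PIL import Image
        from PIL.ExifTags import TAGS

        img = Image.open("photo.jpg")
        exif = img.getexif()  # mapping of numeric EXIF tag IDs to values

        for tag_id, value in exif.items():
            name = TAGS.get(tag_id, tag_id)  # translate tag IDs to readable names
            print(f"{name}: {value}")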


    After lunch, the attendees formed into groups that included both metadata practitioners and developers, to address the issues raised by the lightning talks. They later delivered pitches based on these ideas, several of which fed into the Picture This! Developer Challenge at Dev8D. In addition to the attendees themselves, a number of other developers who attended Dev8D+ offered their advice and collaboration, and dropped in and out of the afternoon session. This sharing of expertise highlighted the value of the collaborative approach taken at Dev8D, as well as directly helping practitioners with the problems that they had outlined during the morning session. Particular mention should be made of Ben O’Steen and Ben Charlton for the considerable help that they gave to developers and practitioners throughout the day.

    Ben O’Steen talks about Picture This! and Dev8D

    As part of the Dev8D Developer Challenge, the Picture This! event offered first and second prizes of Amazon vouchers for those who came up with the most innovative and practical solutions to identified problems using image metadata.

    The first prize of a £50 Amazon voucher went to Robert Baker and Roger Greenhaigh for their work in extracting embedded copyright information from images and dynamically modifying the images to add a banner with a logo displaying the licence, e.g. the specific type of Creative Commons licence, and the copyright holder. The judges felt that this relatively simple but highly effective idea had enormous potential within the UK HE sector, not least as a time-saving device with instant visual impact that could be used widely by anybody wanting to know whether or not, and how, they could re-use a particular image.
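    The general approach can be sketched in a few lines of Python with Pillow; this is an illustrative example only, not the prize-winning code, and the filenames, banner size and fallback text are all placeholders.

        # Illustrative sketch (not the prize-winning entry): read an embedded
        # copyright/licence string from EXIF and stamp it onto a copy of the
        # image as a simple banner. Filenames and sizes are placeholders.
        from PIL import Image, ImageDraw

        SRC, DEST = "photo.jpg", "photo_with_banner.jpg"

        img = Image.open(SRC).convert("RGB")
        exif = img.getexif()
        licence_text = str(exif.get(0x8298, "Licence unknown"))  # 0x8298 = EXIF Copyright tag

        # Draw a black strip along the bottom and write the licence text on it.
        banner_height = 30
        draw = ImageDraw.Draw(img)
        draw.rectangle([0, img.height - banner_height, img.width, img.height], fill="black")
        draw.text((10, img.height - banner_height + 8), licence_text, fill="white")

        img.save(DEST)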

    Robert Baker and Roger Greenhaigh’s Entry for the Developer Challenge

    The second prize of a £25 Amazon voucher went to Bharti Gupta for her work in embedding geospatial image metadata within map images, for example climate data, an idea that has enormous potential for re-use within the UK HE sector. By embedding the metadata in this way, the problem of managing images and metadata separately is removed, and machine processing and transmission of map images over the web are significantly enriched without the need for metadata harvesting.
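    As a rough illustration of the idea (and not the entry itself), geospatial coordinates can be written straight into a map image as EXIF GPS tags, for example with the piexif library in Python; the filename and coordinates below are placeholders.

        # Rough sketch: embed geospatial metadata directly in a map image as
        # EXIF GPS tags using piexif. Filename and coordinates are placeholders.
        import piexif

        lat_dms = ((51, 1), (31, 1), (0, 1))   # 51 deg 31' 0" N as EXIF rationals
        lon_dms = ((0, 1), (7, 1), (0, 1))     # 0 deg 7' 0" W

        exif_dict = piexif.load("climate_map.jpg")
        exif_dict["GPS"][piexif.GPSIFD.GPSLatitudeRef] = b"N"
        exif_dict["GPS"][piexif.GPSIFD.GPSLatitude] = lat_dms
        exif_dict["GPS"][piexif.GPSIFD.GPSLongitudeRef] = b"W"
        exif_dict["GPS"][piexif.GPSIFD.GPSLongitude] = lon_dms

        # Write the updated EXIF block back into the image file.
        piexif.insert(piexif.dump(exif_dict), "climate_map.jpg")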

    “Before”: Bharti Gupta talks about Picture This!

    “After”: Bharti Gupta’s Entry for the Developer Challenge

    Ianthe Hind and Scott Renton worked on enriching image metadata using a range of techniques, the most ambitious of which was image recognition. Ianthe’s work on this challenge at Dev8D deserves special mention for the huge effort and wide range of technologies that she investigated for auto-generation of metadata. She showed that commonly available image recognition software is not yet capable of delivering the functionality that developers need: being able to reuse existing rich metadata on the web to describe new images of known objects, places or landmarks, which would avoid constant duplication and time-consuming, repetitive metadata entry.
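    By way of a toy illustration of what such matching involves (and not one of the tools investigated at Dev8D), a perceptual-hash comparison, for instance with the Python imagehash library, can only catch near-duplicate images of something already described, which is a long way short of recognising a new photograph of the same object or landmark; the filenames and threshold below are placeholders.

        # Toy illustration only: match a new image against already-described
        # images by perceptual hash so existing metadata could be reused.
        # This catches near-duplicates, not new photographs of the same object.
        # Filenames and the distance threshold are placeholders.
        from PIL import Image
        import imagehash

        known = {
            "Radcliffe Camera, Oxford": imagehash.phash(Image.open("known_radcliffe.jpg")),
            "Forth Bridge": imagehash.phash(Image.open("known_forth_bridge.jpg")),
        }

        new_hash = imagehash.phash(Image.open("new_photo.jpg"))

        for label, h in known.items():
            if new_hash - h <= 8:  # Hamming distance; smaller means more similar
                print(f"Candidate match: {label}")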

    “Before”: Ianthe Hind talks about Picture This!

    “After”: Ianthe Hind’s Entry for the Developer Challenge

    The organisers of the event intend to follow up on and document these outputs, and to ensure that they feed back into future meetings of the Metadata Forum. The day was highly successful, the attendees were enthusiastic and motivated, and the Dev8D event format was at its best in bringing practitioners together with developers to address real, tractable problems and produce immediate solutions and demonstrations.

  • Entry for the Developer Challenge OR2010

    Posted on July 16th, 2010 Talat Chaudhri

    Talat Chaudhri and Stephanie Taylor submitted an entry to the Developer Challenge at Open Repositories 2010 in Madrid, which was received with some interest because it used Open Calais to automatically create links to related content. This “quick and dirty” entry was made at the last minute, so only the main features worked. It comes out of UKOLN’s work in creating a new, interactive Drupal site (soon to be launched) as a focus for its various metadata activities, including but not limited to application profiles, and aimed at providing a hub of user-facing community documentation. The demand for such a central focus of metadata information was raised separately at the first meeting of the Metadata Forum.
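    For readers curious about the mechanics, a call to OpenCalais is essentially a matter of posting text to the service and reading back the entities it finds, each of which can then be turned into a link to related content. The sketch below is purely indicative: the endpoint URL, header names and API key are placeholders, since the service and its authentication have changed over time, so the current OpenCalais documentation should be consulted.

        # Purely indicative sketch of posting text to OpenCalais and reading
        # back extracted entities. The endpoint URL, header names and key are
        # placeholders; check the current OpenCalais documentation.
        import requests

        CALAIS_URL = "https://api.example.com/opencalais/enrich"  # placeholder endpoint
        API_KEY = "YOUR_API_KEY"                                   # placeholder credential

        text = "Open Repositories 2010 was held in Madrid."

        response = requests.post(
            CALAIS_URL,
            data=text.encode("utf-8"),
            headers={
                "x-api-key": API_KEY,          # placeholder header name
                "Content-Type": "text/raw",
                "Accept": "application/json",
            },
        )
        response.raise_for_status()

        entities = response.json()  # structure depends on the current API version
        print(entities)             # each entity could become a link to related content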

  • Linked data and Dublin Core Application Profiles in EPrints 3.2.0

    Posted on March 23rd, 2010 Talat Chaudhri

    EPrints 3.2.0 was released on 10th March 2010. It has some remarkable new features relating to linked data and, consequently, to Dublin Core Application Profiles based on multiple entity domain models such as SWAP, IAP and TBMAP (the GAP does not have a domain model). Here are the key points:

    Linked Data Support

    • Ability to establish arbitrary relations between objects or provide additional metadata in triple form.

    Semantic Web / Linked Data (RDF)

    We have made a (difficult) decision to move these features to 3.2.1 (due out soon after 3.2.0) because testing showed it caused a significant slow down.

    We’re rewriting it to do the same thing but with much less overhead!

    However, as may be seen on the EPrints wiki, the latter section read as follows until 11th March 2010:

    Semantic Web Support

    • RDF+XML Format
    • N3 Format
    • URIs for all objects, including non dataobjs. [sic] eg. Authors, Events, Locations.
    • BIBO Ontology
    • Extendable
    • URIs now use content negotiation to decide which export plugin to redirect to, based on mime-types supplied by plugins and the “accept” header.
    • Relations between eprints and documents

    If this is taken at face value, it appears that there has been significant progress in enabling features that would allow the full implementation of the JISC’s DCAPs based on the simplified FRBR model, although we must wait for some important details until the promised version 3.2.1, which is to be released “soon after 3.2.0” according to the statement above. Although objects may be described with “arbitrary relations” and “additional metadata” (additional to what?) can be described in triple form, there are not yet URIs for all entities, such as Authors and so on. Presumably, the support for BIBO would be more demanding than the support required for the cut-down version of FRBR as seen, for example, in SWAP.
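    To illustrate the kind of content negotiation described in the wiki extract above, the sketch below simply asks an EPrints item URI for RDF/XML rather than HTML. The repository URL is hypothetical, and what is actually returned depends on which 3.2.x release, and which export plugins, a given repository is running.

        # Minimal sketch: request RDF/XML from an EPrints item URI via content
        # negotiation. The repository URL is hypothetical.
        import requests

        item_uri = "http://eprints.example.ac.uk/id/eprint/1234"

        response = requests.get(item_uri, headers={"Accept": "application/rdf+xml"})
        print(response.status_code, response.headers.get("Content-Type"))
        print(response.text[:500])  # first few hundred characters of the RDF/XML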

    This is all very promising, especially in the light of the same functionality being promised in DSpace 2.0, which was not yet implemented in the recent release of DSpace 1.6.0. However, all of this must come with the caveat that, until it is tried out in practice, it is not certain which levels of implementation are possible: clearly, the actual metadata fields can easily be adopted by any repository, but what about the relationships between entities, and the relationships with other complex objects? How exactly will these be implemented in practice? For the purposes of linked data, we also have to wait until EPrints 3.2.1 for metadata in the RDF+XML format.

    To this end, although UKOLN cannot offer a publicly accessible test repository with user access, we hope wherever possible to implement and test these pieces of repository software for their usability with SWAP, IAP, TBMAP, GAP and DC-Ed in the first instance, since the majority of repositories in the UK HE sector use these platforms. Of course, we would also like to do the same with Fedora at some point in the future. However, if you have evidence of any such implementations, even for test purposes, and if you are happy for us to evaluate these, we would be very happy to hear from you.

  • Taking application profiles to the people!

    Posted on August 6th, 2009 Talat Chaudhri

    We’ve recently started trying out various methodologies for testing whether the different bits of application profiles work for the people trying to use the resources they describe. The main thing to remember is that the approach must not be too technical: anybody ought to be able to understand what the metadata terms, and the relationships between digital objects on the web, are trying to achieve. This is hands-on metadata for real people!

    So we’ve been to various meetings lately. The first, perhaps the least relevant from most people’s point of view, was the Metadata Registries Meeting at the Novotel Centre, York, 23-24 July 2009. We were seeking feedback and discussion of our methodology, as well as talking about a few technical possibilities, which was a useful thing to do. You may ask, what are registries? Well, they aren’t the subject of this blog, but in brief they are places that allow people to share their metadata schemas, application profiles and so on, as well as to find tools to help them develop, build and maintain them over time. Remember that application profiles are living structures that should change as the metadata needs of your users in dealing with the resources that you provide change over time. We (UKOLN) operate a registry called IEMSR.

    So what did we do for the people?! Well, first of all we went to the Institutional Web Managers Workshop 2009 (IWMW), 28-30 July 2009, because we felt that its attendees are people focussed on making services work for users. It may have been an advantage in some ways that they weren’t by and large repository-related people and could look on things from a fresh perspective. It’s always good to get a range of different approaches: after all, won’t the users come to a repository, VRE, VLE or other service with a whole range of points of view and things they want to do? You can see a slightly ad hoc and only mildly embarrassing interview with me, Talat Chaudhri of UKOLN, explaining in about 20 seconds of profound unreadiness over coffee what it is that application profiles (should) do. (Why on earth did the kind editor choose that particular first frame to stop the video?!) Not a bad attempt, given the lack of coffee, I hope you may agree.

    Talat Chaudhri from iwmw on Vimeo.

    What we did was to get people to think about resources, and reasons why users would want to be looking at them. We played with post-it notes (also called stick-it notes elsewhere?) that had metadata terms written on them, and arranged them in logical groups that would help a person who was trying to perform focussed searches for resources. Any resource type will do: for instance, we tested it out before the session on “beach life: what you might find on a beach and what you might want to know about it”. It doesn’t even have to be sensible! In real life application profiles, however, you obviously need to think of the whole range of things that people will want to do with your resource. The best way: ask them! Don’t engineer things that people won’t want to use. The extra complexity creates the very real danger of making your structures difficult to search, which will put off the very users that are supposed to be using the service.

    This method is called card sorting, and is quite well known. It does have some limitations, but we have already shown its usefulness in focussing attention on what users need to do. One limitation, for instance, is that it’s rather hard not to prejudice the process from the beginning. If you first ask the participants to think of the scenarios in which users might search for resources, they will come with pre-ordained ideas that will tend to undermine the fresh analysis of user requirements that we are looking for. On the other hand, if you don’t let them know the scenarios until they have already thought of the terms that they need to describe the resource, on the first try they will tend to organise the terms in ways that don’t work with those scenarios. Let us remember, though, that this is just the first iteration of a development cycle for metadata solutions. You need to take every new version back to the users and check that it does what they need it to.

    A second limitation is that paper prototyping can’t produce the complex cross-links that you’d find in a real database. A third one is that it doesn’t begin to touch the importance of interface design to usability testing of metadata terms and structures. You may (or may not) need a complex data structure. However, your user should only see what they need to see in order to accomplish what they want. Anything else will actually hinder their use of the service, be it a web page or a repository deposit interface. That complexity can be generated behind the scenes by software, so that users are asked understandable, intuitive and above all useful questions that facilitate their end user experience. We’re also working on these areas.

    We then went to the Repositories Fringe 2009 in Edinburgh, 30-31 July 2009. (You will see from the above dates that this was a bit of a marathon!) The session was broadcast live on the Web, and I hope that the recording will be made available before long. I will add a link here if/when that happens. Having learnt a little from the above session, we did more of the same. We learnt a lot about how to get user requirements, and even more about how not to do it!

    We were asked if we were running a focus group. If people want application profiles like SWAP, IAP, GAP, TBMAP and so on to be implemented, we will certainly have to consult focus groups, but we will tell people when that is what we’re doing. First, however, we are trying to raise discussion about how we can analyse user requirements on an ongoing basis and transmit that hard evidence to developers, so that they will have a reason to go to the trouble of incorporating it into their software releases. At present, we can’t show them sufficient evidence that these APs do what they are intended for, which is why repository software developers in particular have been understandably agnostic about APs. But the other thing that is crucial is to engage service providers and users. Why do they want to come? If they don’t get something out of the event that will improve their service or their knowledge, preferably both, they won’t come. This was as much an outreach and training event as a focus group.

    We’re hoping that this is a good start towards an iterative, user-driven method for analysing existing APs for various purposes, as well as for designing new APs from scratch. We’re confident that it’s going well at the moment and that we are beginning to get answers. But the task of making your metadata fit the service that you provide is ongoing, because services also change over time. It’s best not to be too prescriptive, as different institutional or web services take different approaches to achieving the same things. We are aiming at a flexible, iterative, toolkit approach that works for as many people as possible, and offers a range of tools to implement relevant parts of an overall solution that work for the services and users concerned.

    Lastly, the fact that we are reviewing APs should not be taken as a criticism of the ones that we have, even if deficiencies are found that need to be rectified, or new approaches taken. The work that was done in creating them has laid the groundwork for this new activity, which is aimed precisely at making the results of that work more useful in the community of web services that they were intended for. Change should be welcomed because needs and requirements change, along with our understanding of how best to analyse them.