Application profiles and metadata for repositories
  • Picture This! (hack day at Dev8D+)

    Posted on March 1st, 2011 Talat Chaudhri No comments

    The Picture This! event on image metadata was held at Dev8D+ on 15 February 2011 at the University of London Union (ULU) in Bloomsbury, London. It led into the Picture This! Developer Challenge at Dev8D on 16/17 February 2011.

    The morning began with a brief, practical introduction to application profiles for image metadata by Talat Chaudhri of the Application Profiles Support Project, aimed at getting the attendees to think about the kinds of metadata solutions required for the specific problems that face them in dealing with images within the public-facing systems that they run in their institutions. He invited the attendees to think about the sorts of metadata and the kinds of relationships between images as related web resources that might be required in these systems.

    The attendees then delivered some lightning talks, outlining the problem spaces that they are seeking to address in order to deliver image resources more successfully to users. Most of these centred on EXIF and IPTC, as well as ISO 19139 for geospatial metadata. Other metadata standards that were mentioned included NISO MIX and VRA. The talks focussed on a range of issues, including embedding metadata within images, extracting metadata from images on services such as Flickr using the relevant API, auto-generation and enrichment of metadata, and visual surfacing of copyright and other information for users from metadata embedded within images. There was surprisingly little interest in managing relationships between images, for example where various types of post-processing of a particular image have resulted in multiple related images that stem from the same original. It was also notable that attendees gave comparatively little attention to subject metadata. The fact that the day was held as part of an event for developers may explain the relative lack of interest in these areas, which have been more significant issues at meetings of the Metadata Forum.

    After lunch, the attendees formed into groups that included both metadata practitioners and developers, to address the issues raised by the lightning talks. They later delivered pitches based on these ideas, several of which fed into the Picture This! Developer Challenge at Dev8D. In addition to the attendees themselves, a number of additional developers who attended Dev8D+ offered their advice and collaboration, and dropped in and out of the afternoon session. This sharing of expertise highlighted the value of the collaborative approach taken at Dev8D, as well as directly helping practitioners with the problems that they had outlined during the morning session. Particular mention should be made of Ben O’Steen and Ben Charlton for the considerable help that they gave to developers and practitioners throughout the day.

    Ben O’Steen talks about Picture This! and Dev8D

    As part of the Dev8D Developer Challenge, the Picture This! event offered Amazon vouchers as first and second prizes for the most innovative and practical solutions to identified problems using image metadata.

    The first prize of a £50 Amazon voucher went to Robert Baker and Roger Greenhaigh for their work in extracting embedded copyright information from images and dynamically modifying the images to add a banner with a logo displaying the licence, e.g. the specific type of Creative Commons licence, and the copyright holder. The judges felt that this relatively simple but highly effective idea had enormous potential within the UK HE sector, not least as a time-saving device with instant visual impact that could be used widely by anybody wanting to know whether, and how, they could re-use a particular image.

    Robert Baker and Roger Greenhaigh’s Entry for the Developer Challenge
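
    The banner idea described above can be illustrated with a minimal sketch. This is not the prize-winning implementation: it assumes that the rights metadata has already been read out of the image into a plain dictionary, and the key names `copyright_holder` and `licence_url`, as well as both function names, are hypothetical. It only shows how a Creative Commons licence URL and a rights holder might be turned into the short text that such a banner would display.

```python
import re

# Matches Creative Commons licence URLs such as
# https://creativecommons.org/licenses/by-sa/3.0/
CC_URL = re.compile(r"creativecommons\.org/licenses/([a-z-]+)/(\d+\.\d+)")


def licence_label(licence_url):
    """Turn a Creative Commons licence URL into a short label, e.g. 'CC BY-SA 3.0'."""
    match = CC_URL.search(licence_url or "")
    if match is None:
        return "Licence unknown"
    terms, version = match.groups()
    return f"CC {terms.upper()} {version}"


def banner_text(metadata):
    """Build the banner text from already-extracted rights metadata.

    `metadata` is a hypothetical dictionary; the keys used here are
    assumptions made for illustration only.
    """
    holder = metadata.get("copyright_holder") or "Unknown rights holder"
    return f"{licence_label(metadata.get('licence_url'))} | © {holder}"
```

    Drawing the text onto the image itself would need an imaging library and is left out of this sketch.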

    The second prize of a £25 Amazon voucher went to Bharti Gupta for her work in embedding geospatial image metadata within map images, for example climate data, an idea that has enormous potential for re-use within the UK HE sector. By embedding the metadata in this way, the problem of managing images and metadata separately is removed, and machine processing and transmission of map images over the web are significantly enriched without the need for metadata harvesting.

    “Before”: Bharti Gupta talks about Picture This!

    “After”: Bharti Gupta’s Entry for the Developer Challenge

    Ianthe Hind and Scott Renton worked on enriching image metadata using a range of techniques, the most ambitious of which was image recognition. Ianthe’s work on this challenge at Dev8D deserves special mention for the huge effort and wide range of technologies that she investigated for auto-generation of metadata. She showed that commonly available image recognition software is not yet capable of delivering the functionality that developers need to be able to make use of existing rich metadata on the web to describe new images of known objects, places or landmarks, which would avoid the need for constant duplication and time-consuming repetitious metadata entry.

    “Before”: Ianthe Hind talks about Picture This!

    “After”: Ianthe Hind’s Entry for the Developer Challenge

    The organisers of the event intend to follow up on and document these outputs, and ensure that they feed back into future meetings of the Metadata Forum. The day was highly successful: the attendees were enthusiastic and motivated, and the Dev8D event format was at its best in bringing practitioners with practical problems together with developers to address real, tractable problems and produce immediate solutions and demonstrations.

  • Practical metadata solutions using application profiles

    Posted on September 21st, 2010 Talat Chaudhri 2 comments

    Past and present

    Up until the present, a number of application profiles have been developed by various metadata experts, with the support of the JISC, with the intention of addressing the needs of practitioners and service providers (and thus ultimately their users) across the higher education sector in the UK. The most significant of these have been aimed at particular resource types that have an impact across the sector.

    Their names indicate the approach that has been taken to date, e.g.:

    • SWAP – Scholarly Works Application Profile
    • IAP – Images Application Profile
    • GAP – Geospatial Application Profile
    • LMAP – Learning Materials Application Profile (scoping study only: also the DC Education AP)
    • SDAPSS – Scientific Data Application Profile Scoping Study
    • TBMAP – Time-Based Media Application Profile

    Problems with this approach

    However, it cannot be said that a particular resource type, set of resource types, or even general subject domain actually constitutes a real, identified problem space that faces large sections of the information community in the UK higher education sector today. Geospatial resources can be any type of resource that has location metadata attached (e.g. place of creation, location as the subject of the resource). Learning materials can be any type of resource that has been created or re-purposed for educational uses, which can include presentations, academic papers, purpose-made educational resources of many types, images, or indeed almost anything else that could be used in an educational context, to which metadata describing that particular use or re-use has been attached. Images might have all sorts of different types of metadata: for instance, images of herbs might need very different metadata from images of architecture. The same applies to time-based media: what is the purpose of these recordings and what are they used for? Why and how will people search for them? Likewise, the type of science in question, of which there are almost innumerable categories and sub-categories, will to a large extent determine the specific metadata that will be useful for describing scientific data.

    Of all of the above, only scholarly works, which might more usefully be called scholarly publications, are an entirely focussed, specific set of resource types with a common purpose. The others are loose and sometimes ill-defined collections of resources or resource types that fit into a particular conceptual category. Only in the case of scholarly publications is there an unspoken problem space: discovery and re-use in repositories and similar systems, usually but not exclusively as Open Access resources. There are other related problem spaces such as keeping accurate information about funders and projects for the purposes of auditing that is required by funding bodies and university authorities. The ability to access these resources with new technologies could be a further area of study, and is one that UKOLN is taking an active interest in. Again, the question must be “what do users want to do with these resources?”

    Current Approaches

    This is not to say that the work in creating the application profiles mentioned above has been wasted. At the same time, the above application profiles constitute general purpose solutions that do not target specific problems affecting identifiable communities of practice across the sector. There is considerable work continuing in Dublin Core Metadata Initiative (DCMI) circles on how metadata modelling should best be carried out, for instance on the Dublin Core Abstract Model (DCAM) and on the overlap between application profiles and linked data, where those application profiles contain relationships that can better enable resource discovery in a linked data world.

    New Approaches

    These approaches remain useful. However, more immediate, specific problem spaces face particular university services (not all of which are necessarily repositories) in trying to describe resources so that they can be discovered, providing copyright and other licensing information so that they can be re-used, providing funding information so that work can be audited and cases can be constructed for funding new projects, and so on. Some of these resources may be textual, but others are increasingly including images (of many types and for many purposes), music, film, audio recordings, learning objects of many types, and large scale corpora of data. Any metadata solution that is tailored to a particular purpose (and, thus, which is usually de facto an application profile) needs to address specific aspects of the Web services that practitioners and other service providers are seeking to develop for their users, not simply provide general catch-all metadata of relatively generic use.
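
    To make the idea of a purpose-specific application profile concrete, here is a minimal sketch. The profile below is invented purely for illustration and is not one of the JISC-funded profiles listed earlier: it assumes a service whose problem space is image re-use, so the licence is treated as mandatory. The property names borrow Dublin Core prefixes, while the obligation levels and the `check_record` function are hypothetical.

```python
# A hypothetical profile for an image re-use service:
# each property is mapped to an obligation level.
IMAGE_REUSE_PROFILE = {
    "dc:title": "mandatory",
    "dc:creator": "mandatory",
    "dcterms:license": "mandatory",
    "dcterms:spatial": "recommended",
}


def check_record(record, profile):
    """Return (missing mandatory, missing recommended) property names.

    A property counts as missing if it is absent from the record
    or present with an empty value.
    """
    def absent(obligation):
        return sorted(
            prop
            for prop, level in profile.items()
            if level == obligation and not record.get(prop)
        )

    return absent("mandatory"), absent("recommended")
```

    A service built around a different problem, such as auditing funded work, would keep the same checking logic but swap in a profile whose mandatory properties describe funders and projects instead.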

    Key to all this is consultation with those communities: first, to scope the most significant two or three problem spaces that face the largest number of resource providers in serving their users; second, to get those practitioners together with developers to draw up practical, workable recommendations and perhaps demonstrations; third, to provide tangible evidence to the developers of existing software platforms, and to engage with them to help solve such problems in practice. To do this, it is necessary to engage practitioners and developers in practical, hands-on activities that can bring the discussion forward and provide tangible solutions.