Application profiles and metadata for repositories
  • Practical metadata solutions using application profiles

    Posted on September 21st, 2010 Talat Chaudhri 2 comments

    Past and present

    To date, a number of application profiles have been developed by metadata experts, with the support of JISC, to address the needs of practitioners and service providers (and thus ultimately their users) across the higher education sector in the UK. The most significant of these have been aimed at particular resource types that have an impact across the sector.

    [Image: car gear lever showing the word "metadata"]
    Their names indicate the approach that has been taken to date, e.g.:

    • SWAP – Scholarly Works Application Profile
    • IAP – Images Application Profile
    • GAP – Geospatial Application Profile
    • LMAP – Learning Materials Application Profile (scoping study only: also the DC Education AP)
    • SDAPSS – Scientific Data Application Profile Scoping Study
    • TBMAP – Time-Based Media Application Profile

    Problems with this approach

    However, it cannot be said that a particular resource type, set of resource types, or even general subject domain actually constitutes a real, identified problem space that faces large sections of the information community in the UK higher education sector today. Geospatial resources can be any type of resource that has location metadata attached (e.g. place of creation, or a location as the subject of the resource). Learning materials can be any type of resource that has been created or re-purposed for educational uses, which can include presentations, academic papers, purpose-made educational resources of many types, images, or indeed almost anything else that could be used in an educational context, to which metadata describing that particular use or re-use has been attached. Images might call for all sorts of different metadata: for instance, images of herbs might need very different metadata from images of architecture. The same applies to time-based media: what is the purpose of these recordings and what are they used for? Why and how will people search for them? Likewise, the type of science in question, of which there are almost innumerable categories and sub-categories, will to a large extent determine the specific metadata that will be useful for describing scientific data.

    Of all of the above, only scholarly works, which might more usefully be called scholarly publications, are an entirely focussed, specific set of resource types with a common purpose. The others are loose and sometimes ill-defined collections of resources or resource types that fit into a particular conceptual category. Only in the case of scholarly publications is there an unspoken problem space: discovery and re-use in repositories and similar systems, usually but not exclusively as Open Access resources. There are other related problem spaces such as keeping accurate information about funders and projects for the purposes of auditing that is required by funding bodies and university authorities. The ability to access these resources with new technologies could be a further area of study, and is one that UKOLN is taking an active interest in. Again, the question must be “what do users want to do with these resources?”

    Current Approaches

    This is not to say that the work of creating the application profiles mentioned above has been wasted. At the same time, those application profiles constitute general purpose solutions that do not target specific problems affecting identifiable communities of practice across the sector. Considerable work continues in Dublin Core Metadata Initiative (DCMI) circles on how metadata modelling should best be carried out, for instance on the Dublin Core Abstract Model (DCAM) and on the overlap between application profiles and linked data, where those application profiles contain relationships that can better enable resource discovery in a linked data world.

    New Approaches

    These approaches remain useful. However, more immediate, specific problem spaces face particular university services (not all of which are necessarily repositories) in trying to describe resources so that they can be discovered, providing copyright and other licensing information so that they can be re-used, providing funding information so that work can be audited and cases can be constructed for funding new projects, and so on. Some of these resources may be textual, but others are increasingly including images (of many types and for many purposes), music, film, audio recordings, learning objects of many types, and large scale corpora of data. Any metadata solution that is tailored to a particular purpose (and, thus, which is usually de facto an application profile) needs to address specific aspects of the Web services that practitioners and other service providers are seeking to develop for their users, not simply provide general catch-all metadata of relatively generic use.
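
    As a rough illustration of what a purpose-specific profile might look like, the sketch below treats a profile as nothing more than a named set of required and optional terms, and checks a record against it. It is an assumption-laden toy rather than an implementation of any of the profiles named above: the Profile class, the "ex:funder" and "ex:projectId" terms and the sample record are all invented for this example, and only the "dcterms:" names are borrowed from DCMI Metadata Terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """A hypothetical, minimal 'application profile': a named set of terms."""
    name: str
    required: frozenset
    optional: frozenset = frozenset()

    def check(self, record: dict) -> dict:
        # Treat empty values as absent, so a blank field does not count.
        present = {k for k, v in record.items() if v not in (None, "", [])}
        return {
            "missing": sorted(self.required - present),
            "unrecognised": sorted(present - self.required - self.optional),
        }


# Two profiles for two different problem spaces (term choices are assumptions).
reuse_profile = Profile(
    name="re-use",
    required=frozenset({"dcterms:title", "dcterms:license"}),
    optional=frozenset({"dcterms:rightsHolder"}),
)
funding_profile = Profile(
    name="funding-audit",
    required=frozenset({"dcterms:title", "ex:funder", "ex:projectId"}),
)

# An invented record, used only to exercise the two profiles.
record = {
    "dcterms:title": "Sample report",
    "dcterms:license": "https://creativecommons.org/licenses/by/4.0/",
    "ex:funder": "Example Funder",
}

print(reuse_profile.check(record))
print(funding_profile.check(record))
```

    The same record satisfies the re-use profile but lacks a project identifier for the funding profile, which is the point made above: what counts as adequate metadata depends on the problem the service is trying to solve, not on the resource type alone.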

    Key to all this is consultation with those communities: first, to scope the most significant two or three problem spaces that face the largest number of resource providers in serving their users; second, to get those practitioners together with developers to draw up practical, workable recommendations and perhaps demonstrations; third, to provide tangible evidence to the developers of existing software platforms, and to engage with them to help solve such problems in practice. To do this, it is necessary to engage practitioners and developers in practical, hands-on activities that can bring the discussion forward and provide tangible solutions.

  • JISC Repositories and Preservation Programme Meeting, 6-7 May 2009

    Posted on May 8th, 2009 Talat Chaudhri 1 comment

    Application profiles received considerable attention at the two-day Repositories and Preservation Programme Meeting held by JISC at the Aston Business School, Birmingham.

    Workshop: Application Profiles in Practice, 6 May 2009

    This was an event in two parts: firstly, an introduction to the user testing methodology being developed by the AP Support project in collaboration with the IEMSR and the IE Demonstrator project; secondly, an iteration of the paper prototyping element of the user testing. On this occasion the audience consisted largely of experts rather than an especially representative group of typical users, quite understandably given the nature of the meeting. (While it is very helpful to engage repository managers in user testing, it is more difficult to involve entirely non-specialist users, so there is a need for further work in facilitating this.) The session succeeded in raising considerable interest in current developments in application profiles.

    It was always the intention to use this particular event as a platform for consulting colleagues in the repositories community about the usefulness of the approach. In this respect, the workshop was highly successful: attendees responded positively to the intention of engaging users in order to analyse and address the strengths and weaknesses of the various application profiles, raising some insightful questions and contributing to an animated debate. Rachel Bruce of JISC commended the workshop in her speech closing the Programme Meeting on the following day.

    “Working with the Repositories Community: WRAP Project” (Jenny Delasalle, Warwick University), 6 May 2009

    Jenny Delasalle referred to the difficulties faced in pioneering an implementation of SWAP in an institutional repository based on EPrints 3.0. Unlike in its successor EPrints 3.1, versioning was unsupported at the time, which to a great extent hampered the SWAP effort in WRAP at Warwick. She considered that in its present form, SWAP represents too complex a metadata model for adoption by the typical IR. But since there is not necessarily a need to employ all of the SWAP metadata terms (any more than one would necessarily need to employ all of the terms in DC Simple or Qualified DC), it must be presumed that the FRBR structure and the lack of automated means to populate fields with structural metadata represent a significant part of the problem. It would be useful to get a clarification from Jenny on this.

    The suggestion that the feasibility of complex metadata schemas could be radically improved by using text mining to autopopulate metadata fields, thus requiring far less input and/or correction from the user, was raised later in the Forum in the discussion “How can text mining support repository tasks?”, convened by James Farnhill of JISC and led principally by Brian Rea of NaCTeM, University of Manchester. This would be of obvious and immediate relevance to the likelihood of SWAP being more widely implemented, whether in its present form or following the recommendations from the user testing effort.
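
    To make the idea of autopopulation concrete, here is a deliberately simple sketch that proposes values for two fields by pattern matching on the text of a deposited item. It is far cruder than text mining proper and is not based on anything presented in the session: the suggest_fields function, the field names and the sample text (including its made-up DOI) are assumptions for illustration only. The suggestions are meant to be confirmed or corrected by the depositor, not stored unseen.

```python
import re

# A DOI-like pattern: "10.", a registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")
# A plausible four-digit publication year.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def suggest_fields(text: str) -> dict:
    """Return suggested metadata values found in the text (possibly none)."""
    suggestions = {}
    doi = DOI_PATTERN.search(text)
    if doi:
        # Drop punctuation that commonly trails a DOI at the end of a sentence.
        suggestions["identifier"] = doi.group(0).rstrip(".,;)")
    year = YEAR_PATTERN.search(text)
    if year:
        # Naive: takes the first year-like number, which may not be the right one.
        suggestions["date"] = year.group(0)
    return suggestions


sample = "Published 2009. doi: 10.1234/example.5678."
print(suggest_fields(sample))
```

    Even a toy like this shows where the effort moves: from typing values in to checking values that have been proposed, which is the reduction in user input that the discussion had in mind.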

    Repositories Roadmap Session (Rachel Heery, external consultant for JISC), 7 May 2009

    Rachel Heery gave a summary of her Digital Repositories Roadmap Review, revised from the original version by herself and Andy Powell in 2006. Recommendation 11 referred to SWAP specifically, proposing a cut-down version without the FRBR entity-relationship model and a re-analysis of the sort undertaken in the current user testing programme; Recommendation 12 made an interesting reference to OAI-ORE in the context of SWAP.

    Recommendation 11: Explore deployment of a cut down version of SWAP, possibly at the copy level, retaining the cataloguing rules to ensure a consistent approach to linking to full text. Evaluate whether use of SWAP is consistent with a Web architecture approach to repositories.

    Recommendation 12: Explore use of OAI-ORE to enable applications to handle complex objects. Demonstrate how OAI-ORE facilitates the re-use of research outputs and research data. Clarify different roles of OAI-ORE and SWAP.

    Outcomes

    There was considerable discussion of SWAP on Twitter among colleagues at Eduserv, UKOLN and elsewhere on both days of the meeting, focussing on both the structure and implementation of SWAP as it was originally intended, and in response to Rachel Heery’s recommendations. The need to solve the lack of implementation of the Dublin Core Application Profiles appears to have regained significant impetus from the interest in the series of user testing events planned by UKOLN. In particular, new impetus has been given to the SWAP implementation effort, in which expectations had previously subsided. Given Rachel Heery’s review, it is clear that SWAP may need to be considered once more as an ongoing project rather than a past product that failed to gain support, and one that may need substantial revision in future iterations. It is important to keep an open mind about the nature of those revisions, which should be conditioned by the results of the ongoing user testing effort.