Application profiles and metadata for repositories
  • Linked data and Dublin Core Application Profiles in EPrints 3.2.0

    Posted on March 23rd, 2010 by Talat Chaudhri

    EPrints 3.2.0 was released on 10th March 2010. It has some remarkable new features relating to linked data and, consequently, to Dublin Core Application Profiles based on multiple entity domain models such as SWAP, IAP and TBMAP (the GAP does not have a domain model). Here are the key points:

    Linked Data Support

    • Ability to establish arbitrary relations between objects or provide additional metadata in triple form.

    Semantic Web / Linked Data (RDF)

    We have made a (difficult) decision to move these features to 3.2.1 (due out soon after 3.2.0) because testing showed it caused a significant slow down.

    We’re rewriting it to do the same thing but with much less overhead!

    However, as may be seen on the EPrints wiki, the latter section read as follows until 11th March 2010:

    Semantic Web Support

    • RDF+XML Format
    • N3 Format
    • URIs for all objects, including non dataobjs. [sic] eg. Authors, Events, Locations.
    • BIBO Ontology
    • Extendable
    • URIs now use content negotiation to decide which export plugin to redirect to, based on mime-types supplied by plugins and the “accept” header.
    • Relations between eprints and documents
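
    The content negotiation item in the list above can be illustrated with a short Python sketch. It is not taken from the EPrints code base: the plugin registry, the plugin names and the `choose_plugin` function are hypothetical, and only the general idea (match the “accept” header against the MIME types that plugins declare) comes from the release notes.

```python
# Hypothetical registry: MIME type -> export plugin name.
# These names are illustrative only, not real EPrints plugin identifiers.
PLUGINS = {
    "application/rdf+xml": "RDFXML",
    "text/n3": "RDFN3",
    "text/html": "SummaryPage",
}


def parse_accept(header):
    """Return (mime_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        mime = fields[0].lower()
        if not mime:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        entries.append((mime, q))
    # sorted() is stable, so equal q values keep their order in the header.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_plugin(accept_header, plugins=PLUGINS, default="text/html"):
    """Pick the plugin for the client's most preferred supported MIME type."""
    for mime, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if mime in plugins:
            return plugins[mime]
        if mime == "*/*":
            return plugins[default]
    return None  # nothing acceptable: a server might answer 406 here
```

    Under these assumptions, a client sending `Accept: text/n3` would be sent to the N3 export, while a browser that prefers `text/html` would be sent to the human-readable page for the same URI.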

    If this is taken at face value, it appears that there has been significant progress in enabling features that would allow the full implementation of the JISC’s DCAPs based on the simplified FRBR model, although we must wait for some important details until the promised version 3.2.1, which is to be released “soon after 3.2.0” according to the statement above. Although objects may be described with “arbitrary relations”, and “additional metadata” (additional to what?) can be provided in triple form, there are not yet URIs for all entities, such as Authors. Presumably, support for BIBO would be more demanding than the support required for the cut-down version of FRBR seen, for example, in SWAP.
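
    To make the triple form concrete, here is a minimal Python sketch of how a FRBR-style chain of entities might be written as triples. Everything under `http://example.org/` is hypothetical, and the relation names are only loosely modelled on the kind of relationships SWAP describes; only the Dublin Core terms namespace is real. Note that the last triple needs a URI for the author, which is exactly what is not yet available for all entities.

```python
EX = "http://example.org/id/"      # hypothetical base URI for repository entities
REL = "http://example.org/terms/"  # hypothetical vocabulary for the relations
DCT = "http://purl.org/dc/terms/"  # Dublin Core terms namespace

# Each triple is (subject, predicate, object), all given as URIs.
triples = [
    (EX + "work/1", REL + "isExpressedAs", EX + "expression/1"),
    (EX + "expression/1", REL + "isManifestedAs", EX + "manifestation/1"),
    (EX + "manifestation/1", REL + "isAvailableAs", EX + "copy/1"),
    (EX + "work/1", DCT + "creator", EX + "agent/1"),  # needs a URI for the author
]


def objects(subject, predicate, graph=triples):
    """Return every object linked to subject by predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def copies_of(work, graph=triples):
    """Follow work -> expression -> manifestation -> copy."""
    nodes = [work]
    for rel in ("isExpressedAs", "isManifestedAs", "isAvailableAs"):
        nodes = [o for n in nodes for o in objects(n, REL + rel, graph)]
    return nodes


def to_ntriples(graph=triples):
    """Serialise the graph as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in graph)
```

    The metadata fields themselves are the easy part; the open question is whether a repository can store and traverse relations like these between entities, as `copies_of` does here.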

    This is all very promising, especially since the same functionality has been promised for DSpace 2.0 but was not implemented in the recent release of DSpace 1.6.0. However, all of this must come with the caveat that, until it is tried out in practice, it is not certain which levels of implementation are possible: clearly, the actual metadata fields can easily be adopted by any repository, but what about the relationships between entities, and the relationships with other complex objects? How exactly will these be implemented in practice? For the purposes of linked data, we also have to wait until EPrints 3.2.1 for metadata in the RDF+XML format.

    To this end, although UKOLN cannot offer a publicly accessible test repository with user access, we hope wherever possible to implement and test these pieces of repository software for their usability with SWAP, IAP, TBMAP, GAP and DC-Ed in the first instance, since the majority of repositories in the UK HE sector use these platforms. Of course, we would also like to do the same with Fedora at some point in the future. However, if you have evidence of any such implementations, even for test purposes, and if you are happy for us to evaluate these, we would be very happy to hear from you.


    1 response to “Linked data and Dublin Core Application Profiles in EPrints 3.2.0”

    • I have recently spoken to some DSpace developers at MIT who indicated that DSpace 2.0 is now seen as a development branch that provides code and features that will be implemented in future DSpace 1.x versions. As such, DSpace 2.0 would never be deployed. They also felt that RDF structures enabling complex relationships and entity-relationship models were unlikely to be the sort of priority that would make it into the code base any time soon.

      On the other hand, the recent announcement at Open Repositories 2010 that the DSpace and Fedora software platforms are to be merged by 2011 seems to leave a fair amount of doubt about the future of DSpace as an independent platform, as Fedora is intended to be the underlying storage layer. Since the DSpace web interface (either JSP UI or XML UI) is ageing, this leaves relatively little in the longer term except perhaps the DSpace workflow, so that the new DuraSpace platform would have the packaged product and commercial presence that DSpace currently enjoys but the underlying flexibility and functionality of Fedora.

      It remains to be seen how this will turn out in practice, but it seems a reasonable bet that development resources will not be focussed on support for complex relationships and entity-relationship models unless it is shown that there is substantial demand. However, if all goes to plan by 2011, the new DuraSpace will have this functionality anyway, because it will be inherited from Fedora, where it is a longstanding feature.

      It will be interesting to see how this development will increase the competitive presence of DSpace > DuraSpace installations compared to EPrints installations in the HE sector, and the implications of radically increasing the accessibility of Fedora to a wider market. At present, it seems as though this can only be a good thing.
