Complexity: perceived or real?
Posted on October 6th, 2009
One of the anecdotal remarks often made about SWAP in particular, and about the JISC DCAPs in general, is that they are based on domain models that are too complex. But too complex for what?
Too complex to fit with how real repositories work?
Too complex to create usable input forms?
Too complex for users to understand? Do we mean real end users, or service providers such as repository managers? Do we mean anybody who is using the web forms to input metadata about digital objects, both content providers (who may also be users of the content) and repository managers?
It seems that there is an evidential problem inherent in all of these assertions. It’s also worth remembering that not all resource types, and hence application profiles, are equal in this regard – nor are all users, content providers and repository managers. It’s also fair to say that not enough work has yet been done on interface design and usability to say for certain that a complex data model necessarily makes the input forms difficult to use. There is an aspect of back-end software design to this question as well: the input forms may very well be simplified if the software can intelligently suggest relationships for the user to accept or reject, and generate as much of the record behind the scenes as possible.
Current work at UKOLN is aimed at solving these evidential problems by providing a methodology for investigating the best way to construct metadata
I can’t unequivocally say whether the data models of the JISC DCAPs are too complex to fit the way that the most common repository platforms organise their records. It appears that DSpace 1.5 does not yet support entity-relationship models, and that EPrints has its own data model. Using DCAPs as exchange formats has nevertheless already been shown to be a fruitful alternative approach: EPrints already has a SWAP export plug-in that does this. It is generally asserted that Fedora can already handle any data model. Ultimately, it is for the repository platform developers to provide the final answer to these questions. It’s clear that a lot of work is going on to address some of these issues; for example, it has been said that DSpace 2.0 will support entity-relationship models.
What I can say, however, is that the inability to support a back-end entity-relationship model does not by any means restrict a particular software platform from making use of an application profile, although there may well be a considerable demand in terms of development time in making the necessary functionality available. This is because there is clearly another alternative, namely emulating the entity-relationship model. To begin to understand this possibility, it’s necessary to take a close look at how the JISC DCAPs have been constructed, and the different classes of metadata that you find within them:
- metadata about the digital object(s) themselves, i.e. the usual stuff in any repository
- metadata elements relating to the semantic relationships between entities, i.e. isExpressedAs, isManifestedAs, isAvailableAs and variations thereof. These exist purely for the sake of the particular model that has been chosen, here a reduced form of FRBR. It is interesting that the dc:creator field, which is “real” metadata about the object, is the only link to the Agent identity, which may be seen, from the perspective of the object, as an entity that exists to express more detail about an item of metadata describing the object itself. In fact, of course, it is an independent entity that could relate to multiple unrelated objects.
- identifiers: these are specific to the repository instance and application profile in question. Of course, all digital objects on the Web require at least one URI to identify them (in practice, nearly always a URL that also locates them). However, the entity model required by the application profile, if it isn’t emulated as described hereafter, cuts up a compound digital object in such a way that it is possible to apply further identifiers to each entity as discrete metadata records.
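To make the idea of emulation a little more concrete, here is a minimal sketch in Python of how a platform that only stores flat records might generate the other two classes of metadata behind the scenes. Everything in it is an assumption made for illustration: the field names, the entity names (ScholarlyWork, Expression, Manifestation, Copy), the fragment-style identifiers and the example URL are hypothetical, and this is not how any particular repository platform actually does it.

```python
# Hypothetical flat record, as a platform without entity support might hold it.
flat_record = {
    "title": "An example eprint",
    "creator": "Smith, J.",
    "version": "Author's final draft",
    "format": "application/pdf",
    "locator": "http://repository.example.org/123/paper.pdf",
}

# Assumed assignment of each flat field to the entity it describes.
FIELD_TO_ENTITY = {
    "title": "ScholarlyWork",
    "creator": "ScholarlyWork",
    "version": "Expression",
    "format": "Manifestation",
    "locator": "Copy",
}

# Relationship elements: each entity points at the next one down the chain.
CHAIN = [
    ("ScholarlyWork", "isExpressedAs", "Expression"),
    ("Expression", "isManifestedAs", "Manifestation"),
    ("Manifestation", "isAvailableAs", "Copy"),
]


def emulate_entities(record, base_id):
    """Split one flat record into linked per-entity records."""
    names = ["ScholarlyWork", "Expression", "Manifestation", "Copy"]
    # Identifiers: one per entity, derived here from the record's own identifier.
    entities = {
        name: {"id": f"{base_id}#{name}", "metadata": {}, "links": {}}
        for name in names
    }
    # Descriptive metadata is filed under the entity it describes.
    for field_name, value in record.items():
        entities[FIELD_TO_ENTITY[field_name]]["metadata"][field_name] = value
    # Relationship elements are generated, not typed in by the depositor.
    for source, relation, target in CHAIN:
        entities[source]["links"][relation] = entities[target]["id"]
    return entities


entities = emulate_entities(flat_record, "http://repository.example.org/123")
for name, entity in entities.items():
    print(name, entity["id"], entity["metadata"], entity["links"])
```

The point of the sketch is only that the depositor fills in the "real" metadata, while the relationship elements and the extra identifiers can be derived mechanically at export time.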
It must be said that this is NOT the only way to do relationships between digital objects. It’s perfectly possible to use, for example, OAI-ORE (or plain RDF) resource maps in place of the second of these types of “metadata” here. In fact, they are really not metadata about the object at all, because they describe the relationships of different parts of that metadata to each other: so they are really meta-metadata! It could be said that identifiers don’t describe the actual objects either, merely locate their metadata descriptions, so they are also meta-metadata. Change the way you do the modelling, and the meta-metadata may change – however, this is NOT true of the “real” metadata (title, author, image size etc) that describe the object itself.
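A rough way to see this is to describe the same object under two different modelling approaches and compare what changes. The Python sketch below is illustrative only: the sample values, the node labels and the two helper functions are hypothetical, the triples are plain tuples rather than a real RDF or OAI-ORE serialisation, and the property names are used simply as labels.

```python
# "Real" metadata about the object; the values are made up for the example.
descriptive = {
    "title": "Harbour at dawn",
    "author": "Smith, J.",
    "image_size": "1024x768",
}
files = ["thumbnail.jpg", "full.tiff"]


def with_relationship_elements(desc, parts):
    """Model the object with FRBR-style relationship elements."""
    structural = [
        ("work", "isExpressedAs", "expression"),
        ("expression", "isManifestedAs", "manifestation"),
    ]
    structural += [("manifestation", "isAvailableAs", p) for p in parts]
    return {"descriptive": dict(desc), "structural": structural}


def with_resource_map(desc, parts):
    """Model the same object as an aggregation described by a resource map."""
    structural = [("resource-map", "ore:describes", "aggregation")]
    structural += [("aggregation", "ore:aggregates", p) for p in parts]
    return {"descriptive": dict(desc), "structural": structural}


model_a = with_relationship_elements(descriptive, files)
model_b = with_resource_map(descriptive, files)

# The meta-metadata differs between the two models; the real metadata does not.
print(model_a["structural"] != model_b["structural"])
print(model_a["descriptive"] == model_b["descriptive"])
```

Both comparisons come out true: the structural statements are entirely different, while the title, author and image size are untouched.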