A first cut at a model

September 28th, 2010 by Pete Johnston

This post has permanently moved to http://archiveshub.ac.uk/locah/2010/09/28/model-a-first-cut/

Please update any links and bookmarks.

We apologise for any inconvenience.

 


6 Responses to “A first cut at a model”

  1. Douglas Campbell says:

    Wow, that’s a formidable task, and you’ve covered a lot here!

    For now, a couple of thoughts as I read through it, noting that I am considering our implementation of EAD at the National Library of NZ, rather than in your Archives Hub. We have used EAD as a way of encoding the data we have in our unpublished catalogue ‘TAPUHI’ (which is an early customised version of the Civica Spydus software). In fact we first convert it to MARC then to EAD.

    “Things” – in our case, because they come from catalogue records, I think we could consider the controlaccess values to be ‘about-ness’ – so dc:subject, dc:coverage, etc. would suffice.

    Q1 – Do you think the conceptualisation URIs will be for SKOS entities or EAC entities?

    I’m new to foaf:focus – I always thought a SKOS name authority and/or a FOAF were both identification/abstractions/conceptualisations of a person anyway, so foaf:focus looks to me like straying into reification territory. But obviously I have some more investigation to do there. If that’s the way FOAF is heading, it’s probably a good bandwagon to hitch on to?

    Q3 part 1 – Because our EAD is an encoding of our catalogue we don’t have the separate concept of a finding aid (unless you consider the catalogue a finding aid), ie. our EAD ‘document’ IS the finding aid (unless you prefer that a finding aid is an abstract concept that has an instantiation).

    Q3 part 3 – Yes, I’m assuming all the attribute code lists in EAD would be encoded as URIs, probably within SKOS?

    Q3 part 4 – Are you saying to not use standard RDF language markup? I can see that as useful as it always seemed inaccessible to me.

    Q3 part 5 – ‘Is represented by’ seems reasonable, I can’t imagine what ‘less specific’ representations might be?? Will you be using dc:hasFormat for this, as <dao>s should be “electronic representations of the described materials”?

    Q5 (multi-level description) – I agree, given ISAD(G) practices, direct inheritance is out of the question. I agree also that each unit of description is for a first-class citizen that has a dc:isPartOf relationship to its parent (and overall collection). It is entirely up to the consumer to decide whether to investigate the ancestors – the same as for a journal article (eg. whether to investigate the general subject area, political slant, etc. the enclosing journal covers). Context is vitally important in archival collections, but we can’t force it on consumers – we can only presume/hope they’ll want to know more when the description at an item level is just something like “Letter, 22 January 1954”.
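
    As a rough sketch of what "leaving it to the consumer" could look like, a consumer might walk the isPartOf links upwards only when it wants more context. The unit identifiers and the `PARENT` mapping below are hypothetical, standing in for whatever dc:isPartOf triples the data actually exposes.

```python
from typing import Dict, Iterator, Optional

# Hypothetical isPartOf links: child unit -> parent unit (None at the top level).
PARENT: Dict[str, Optional[str]] = {
    "item-1": "series-a",
    "series-a": "collection-x",
    "collection-x": None,
}


def ancestors(unit: str, parent_of: Dict[str, Optional[str]]) -> Iterator[str]:
    """Yield the enclosing units of `unit`, nearest first, by following isPartOf."""
    seen = {unit}
    current = parent_of.get(unit)
    # Stop at the top of the hierarchy, or if the data contains a loop.
    while current is not None and current not in seen:
        yield current
        seen.add(current)
        current = parent_of.get(current)


# A consumer that wants context for an item asks for it explicitly.
print(list(ancestors("item-1", PARENT)))
```

    Nothing is inherited here: the item's own description stays as it is, and the ancestors are only consulted on request.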

  2. Hi Douglas,

    Thanks for the thoughts.

    Re Q1: From a quick read, my understanding is that EAC uses “entity” to refer to the agent, the person, family or organisation.

    My suggestion here is that for each <controlaccess><persname> in the EAD doc, we generate URIs for two distinct resources, a concept and a person.

    The concepts will be instances of skos:Concept, related by a triple with foaf:focus predicate to instances of foaf:Person. So e.g.

    <http://example.org/concept/BlakeArtist> a skos:Concept;
    foaf:focus <http://example.org/person/Blake> .

    <http://example.org/person/Blake> a foaf:Person;
    foaf:name "William Blake" .
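
    A minimal sketch of how that pair of resources might be generated for each persname, assuming a hypothetical `persname_to_turtle` helper and a slug-based URI pattern under example.org. Neither is the project's actual transform, and the skos: and foaf: prefixes are assumed to be declared elsewhere.

```python
# Hypothetical base URI; the example above uses example.org purely for illustration.
BASE = "http://example.org"


def persname_to_turtle(concept_slug: str, person_slug: str, name: str) -> str:
    """Return Turtle for a skos:Concept whose foaf:focus is a foaf:Person."""
    concept = f"<{BASE}/concept/{concept_slug}>"
    person = f"<{BASE}/person/{person_slug}>"
    # Escape backslashes and double quotes so the name is a valid Turtle string.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"{concept} a skos:Concept ;\n"
        f"    foaf:focus {person} .\n"
        f"\n"
        f"{person} a foaf:Person ;\n"
        f'    foaf:name "{escaped}" .\n'
    )


print(persname_to_turtle("BlakeArtist", "Blake", "William Blake"))
```

    The point of the sketch is only that two URIs come out of one persname: one for the concept and one for the person it is a conceptualisation of.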

    I agree that the act of modelling is always a “conceptualisation”, but what the foaf:focus approach does is to explicitly introduce two distinct resources into the data, one a concept and one a person. When I first came across the discussions of what ended up as foaf:focus, I wasn’t sure I grasped what problem it was solving, but I think now I can see the usefulness. We’ll go with this approach, and see how things work out: “the proof of the pudding is in the eating”, and all that!

    Incidentally, the revised VIAF model takes what I think is a similar approach: at least it introduces both a concept and a person, though I don’t think the data currently includes a direct relation between concept and person. See e.g.

    http://outgoing.typepad.com/outgoing/2010/05/viafs-new-linked-data.html

    Re your “Q3 part n” points, oops, the structure of my post probably wasn’t very clear. That list of items wasn’t meant to be part of Q3, but I see what you are referring to.

    Re Q3/part3 and attribute code lists, yes, there are probably other things that could be mapped into lists of concepts. In the first instance at least, we’ll probably focus on those that seem most “useful” in the specific context of the Hub EAD data, rather than trying to model everything in EAD.

    Re Q3/part 4: I meant only that we’ll cite existing URIs like http://lexvo.org/id/iso639-3/eng instead of creating, say, http://archiveshub.ac.uk/id/language/eng (or similar). So, no, we don’t intend any departure from RDF.
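
    To illustrate, a small sketch of citing the existing Lexvo URI for a language code rather than minting a local one. The `language_uri` helper is hypothetical, and it only checks the shape of the code (three letters), not whether the code is actually assigned in ISO 639-3.

```python
import re

# URI prefix taken from the Lexvo example above.
LEXVO_ISO639_3 = "http://lexvo.org/id/iso639-3/"


def language_uri(code: str) -> str:
    """Build the Lexvo URI for a three-letter ISO 639-3 language code."""
    normalised = code.strip().lower()
    if not re.fullmatch(r"[a-z]{3}", normalised):
        raise ValueError(f"not a three-letter language code: {code!r}")
    return LEXVO_ISO639_3 + normalised


print(language_uri("eng"))
```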

    Re Q3/part 5: I should talk to Jane and Bethan again about the data before replying, but, IIRC, I think the issue is that there are instances in the Hub EAD data where e.g. images are referenced but they are images of things related in some way to the unit of description, rather than digitised versions of it. So I guess the problem is really in the somewhat loose/incorrect use of the EAD dao element within this dataset! If the dao element was used in accordance with the EAD documentation (“to connect the finding aid information to electronic representations of the described materials”), then, yes, I think you are right, and dcterms:hasFormat/dcterms:isFormatOf or dcterms:source would probably be OK.

    Yes, on the last point, that is “where we’re at”, I think – at least given existing archival description practice and existing data. As I tried to suggest, EAD probably isn’t the ideal starting point for trying to generate linked data.

    Taking a further step back, I think there is a broader underlying question of whether the “implicit context” sort of approach is really the most appropriate for archival description on the Web – which I think is the sort of question Mark Matienzo of Yale University is raising in e.g. this presentation, esp. from slide 106 onwards:

    http://www.slideshare.net/anarchivist/linked-data-and-archival-description-confluences-contingencies-and-conflicts

    While in this project we can point out some of the issues arising from that, I’m not sure it’s a problem we can solve.

  3. [...] up and enhancing the data before we make these available publicly, but we have made details of our Hub and Copac modeling work available on the blog. Pete Johnston has also posted about our approach to [...]

  4. [...] can also be inherent complexities in the existing data that can make the modeling [...]

  5. [...] A stricter, clearer inheritance model rather than ISAD(G)’s rule of non-repetition (Thanks to Pete again) [...]

  6. [...] The “things” in EAD: a first cut at a model, http://blogs.ukoln.ac.uk/locah/2010/09/28/ [...]