Comments on: Same Old Same Old? http://blogs.ukoln.ac.uk/adrianstevenson/2011/01/19/same-old-same-old/?utm_source=rss&utm_medium=rss&utm_campaign=same-old-same-old Adrian Stevenson's UKOLN blog Wed, 29 Jun 2011 08:28:35 +0000

By: Adrian Stevenson http://blogs.ukoln.ac.uk/adrianstevenson/2011/01/19/same-old-same-old/comment-page-1/#comment-9016 Thu, 20 Jan 2011 16:39:46 +0000

Hi Tyler. Cheers for the thumbs up. Interesting anecdote there, that the teenage sex thing came up before at the same venue. Alanas added that Linked Data is also similar in that one’s first experience of it isn’t very satisfying, but it gets better the more you do it :) I think the comment actually went down OK, though I don’t recall if I was looking round the room at the time.

By: Tyler Bell http://blogs.ukoln.ac.uk/adrianstevenson/2011/01/19/same-old-same-old/comment-page-1/#comment-9014 Thu, 20 Jan 2011 16:12:44 +0000

Great commentary Adrian. Same-old indeed: there was a similar event next door at UCL back around 2001, where we were talking about XML. Many of the informatic concerns and arguments persist unchanged; the ‘teenage sex’ simile was also employed. I suspect that it was met with as many stern faces this time around.

By: Adrian Stevenson http://blogs.ukoln.ac.uk/adrianstevenson/2011/01/19/same-old-same-old/comment-page-1/#comment-8991 Thu, 20 Jan 2011 10:59:04 +0000

Hi Pete

Re. “EAD is challenging because at heart it’s not really a “data-centric” XML format”, yeah, that is of course a good point, and one perhaps I’ve tended not to fully appreciate myself, even though I’m on the project :). I’ll have a look at those slides you mention.

I was aware when writing that section of the post that it might sound as if I’m saying we’ve found it really hard with the Locah experience, which isn’t what I meant, so I guess I should have rephrased it. The point was more that John did kind of give the impression that it’s very easy (unless I was misunderstanding him), which I think risks misleading people. Maybe within the government sector they are finding it that easy, in which case it would be great to learn more from them, hence my questions.

By: Pete Johnston http://blogs.ukoln.ac.uk/adrianstevenson/2011/01/19/same-old-same-old/comment-page-1/#comment-8988 Thu, 20 Jan 2011 10:38:07 +0000

Re the rosiness or otherwise of the Locah experience :)

I think we should be a bit careful about generalising, or at least we need to bear in mind some of the particular characteristics of the source data with which Locah is dealing when doing so.

Working with EAD is challenging because at heart it’s not really a “data-centric” XML format, for the sorts of reasons Mark Matienzo has talked about e.g. in http://www.slideshare.net/anarchivist/archives-the-semantic-web – and it is by design a fairly complex XML format that allows for a lot of structural variation.

And there’s further complexity arising from the nature of the Archives Hub collection, where data creation is decentralised, and has been carried out by multiple independent parties, using different tools and approaches, over an extended period of time, and as a consequence there’s a good deal of variation in that aggregated data.

I’d argue that that is a challenge in any processing across that dataset, even staying within the world of XML/XPath etc.

So given this background, I still – despite my occasionally tearing my hair out on Twitter on finding some new pattern in the data that I hadn’t anticipated! – think the process has been relatively straightforward, really.

From my perspective, most of the effort has gone into managing that complexity, rather than into any difficulty associated with the process of generating linked data in particular. And I think the fact that we did process a “controlled” subset of the data – one where we knew the variation was limited – quite quickly reflects that.
