W3C incubator group on LLD – Draft Report for comment

Antoine Isaac

Thanks a lot for spotting this, Adrian. We've updated the end of that section (see http://www.w3.org/2005/Incubator/lld/wiki/Draft_Vocabularies_Datasets_Section). We hope it's alright!

Ed Chamberlain

Seconded. Matching against identifiers takes time and is prone to error. Some recommendation on which ones to focus on would be great.

Ed Chamberlain

Two JISC-funded projects based around Cambridge, Open Bibliography and COMET, have made large library-related datasets available, but this only goes so far. I second Catherine's point about article- and citation-level data; there is serious value here. Furthermore, libraries could consider exposing operational data, holdings and anonymised circulation information to facilitate a richer range of interactive and recommendation-based services.

Adrian

CKAN is mentioned here for the first time. It should therefore be briefly explained, or the text should point to a section where it is explained.

Alan Danskin

This is partly a reflection of the [im]maturity of the technologies and the complexity of applying linked data to MARC data, which, as is acknowledged elsewhere in the report, libraries are currently locked into. What is needed are models and tools to enable the conversion of MARC data. The BL is looking at what would be necessary in order to release the tools we have used or created.

Alan Danskin

The British Library has recently made available a preview of the British National Bibliography dataset (http://www.bl.uk/bibliographic/datafree.html). The difficulties involved in this undertaking were considerable and go a long way to explaining the lack of published datasets. The BL chose to make the BNB the focus of its linked data work because it is a large dataset for which the BL, as the national library of the United Kingdom, is responsible, and because its scope can be reasonably clearly defined.

Alan Danskin

Maintenance and development of these deliverables is highly desirable and we welcome the steps that have been taken.

Alan Danskin

A minor point, but LCSH is not limited to the topics of books.

Alan Danskin

Thank you. These are useful resources; one of the difficulties with starting the BNB linked data project was knowing what is available and useful.

Patrick Danowski

An analysis of which datasets and vocabularies use which ontologies (metadata fields) would be helpful to get a better idea of which ontologies are widely used and which are not very common.

Jennifer Bowen

Would like to see some expansion of this topic. This is a very important consideration for the migration strategies recommendation later in the document.

Jennifer Bowen

This paragraph is unclear. Is it saying that bibliographic datasets for what would commonly be referred to as "library catalog data" have low availability (and if so, can you speculate as to why that is?) or that those datasets ARE available, but that they aren't that important? This would be an appropriate place to mention the need for software tools that help libraries to convert their bibliographic datasets to linked data.

Antoine Isaac

This is a really useful comment, thanks. I think when we used the expression "library-related resources" we were in fact thinking of data such as scientific publishers' databases. But being more explicit would be useful. In fact, the availability problem may be even more acute for these specific datasets.

Antoine Isaac

Datasets are indeed concrete data (e.g., the British National Bibliography in RDF) that re-use elements from value vocabularies (e.g., LCSH) and are structured according to the specifications of metadata element sets (e.g., Dublin Core). We got other comments along the same lines, and agree that the current wording can be improved. We'll try to make our explanations clearer.
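
As a rough illustration of how the three resource groups relate, here is a minimal sketch in plain Python (no RDF library). It assumes a dataset is a list of subject/property/value triples. The book URI, the title and the LCSH identifier `sh00000000` are hypothetical placeholders, not real records; only the two namespace prefixes are the actual Dublin Core terms and LCSH namespaces.

```python
# Namespace of a metadata element set (Dublin Core terms) and of a
# value vocabulary (LCSH as published at id.loc.gov).
DCTERMS = "http://purl.org/dc/terms/"
LCSH = "http://id.loc.gov/authorities/subjects/"

# Hypothetical identifiers, for illustration only.
book = "http://example.org/dataset/book/1"
subject_heading = LCSH + "sh00000000"  # placeholder, not a real heading

# A dataset: concrete statements about concrete resources.
dataset = [
    (book, DCTERMS + "title", "An Example Title"),
    (book, DCTERMS + "subject", subject_heading),
]


def classify(triples):
    """Split out which element-set properties and vocabulary values a dataset uses."""
    element_set_terms = sorted({p for _, p, _ in triples if p.startswith(DCTERMS)})
    vocabulary_values = sorted({o for _, _, o in triples if o.startswith(LCSH)})
    return element_set_terms, vocabulary_values


properties, values = classify(dataset)
print("Element set properties used:", properties)
print("Value vocabulary terms used:", values)
```

In this sketch the properties come from the metadata element set, the subject value comes from the value vocabulary, and the list of statements as a whole is the dataset.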

Teague Allen

I infer from two sentences within this paragraph that datasets are the implementation of value vocabulary terms, structured by a metadata element set, or sets. If true, explicitly saying this would be a helpful picture of linked data architecture. If not true, I'm still in the dark as to how the three resource groups relate.

Catherine Jones

I appreciate that the focus of this report is on "library-held" datasets; however, I don't think the opportunities for journal articles are addressed. Whilst it is unlikely that most academic libraries will be cataloguing each journal article within their own catalogue (especially when Web of Science or SCOPUS is available), end-users don't necessarily see the distinction between material types, and perhaps shouldn't have to. Nor do they see the library's decisions on ownership vs. access within its collection, which will be reflected in its catalogue but not in the service provided. Sorry if this isn't clear, but the service of a particular (academic) library is delivered through a portfolio of electronic resource discovery tools, and the library catalogue is the only one for which the library is responsible for creating the content. Limiting linked data potential to "library-held" information may therefore cover only a small part of the information landscape for a particular user, and does not build bridges to the work going on in linking research data and publications (citing data etc.).

Comments by Section