‘What is the Business Administrative Case for Linked Data?’ parallel session, JISC Conference 2011, BT Convention Centre, Liverpool, UK. 15th March 2011
One of the parallel sessions at this year’s JISC Conference in Liverpool promised to address the “business value to the institution” of linked data, being aimed at “anyone who wants a clear explanation in terms of how it applies to your institutional business strategy for saving money..”. I was one of a number of people invited to be on the panel by the session host, David Flanders, but such was the enthusiasm, I was beaten to it by the other panellists, despite replying within a few hours.
The session kicked off with a five-minute soapbox from each of the panellists before opening up to a wider discussion. First up was Hugh Glaser from Seme4. He suggested that universities have known for a long time that they need to improve data integration and fusion, but have found this a difficult problem to solve. You can get consultants in to do this, but it’s expensive and the IT solutions often end up driving the business process instead of the other way round. Every single modification has to be paid for, such that your business process often ends up being frozen. However, Linked Data offers the possibility of solving these problems at low risk and low cost, being more of an evolutionary than revolutionary solution. He cited the success of data.gov.uk, which took only seven months to release a whole series of datasets. Hugh emphasised that not only has the linked data approach been implemented quickly and cheaply here, but also it hasn’t directly impinged upon or skewed the business process.
He also talked about his work with the British Museum, the problem there being that data has been held separately in different parts of the organisation resulting in seven different databases. These have been converted into linked data form and published openly, now allowing the datasets to be integrated. Hugh mentioned that another bonus of this approach is that you don’t necessarily have to write all your applications yourself. The finance section of data.gov.uk lists five applications contributed by developers not involved with the government.
Wilbert Kraan from CETIS described an example where linked data makes it possible to do things existing technologies simply can’t. The example was based on PROD, a database of JISC projects provided as linked data. Wilbert explained that they are now able to ask questions of the dataset not possible before. They can now put information on where projects have taken place on a map, also detailing the type of institution, and its rate of uptake. The neat trick is that CETIS don’t have to collect any data themselves, as many other people are gathering data and making it available openly. As the PROD data is linked data, it can be linked in to this other data. Wilbert suggested that it’s hard to say if money is saved, because in many cases, this sort of information wouldn’t be available at all without the application of linked data principles.
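As a rough illustration of the kind of linking Wilbert described, here is a minimal Python sketch. It is not taken from PROD: the URIs, the `ex:` predicates and the coordinates are all invented, and plain tuples stand in for a real RDF store. The point is only that two independently published datasets can be combined without any coordination, because they use the same URI for the same institution.

```python
# Triples are (subject, predicate, object). Every URI, predicate and value
# below is hypothetical and exists only for this illustration.

# Dataset 1: project records, as a project database might publish them.
projects = {
    ("http://example.org/project/1", "ex:title", "Project A"),
    ("http://example.org/project/1", "ex:leadInstitution", "http://example.org/inst/alpha"),
    ("http://example.org/project/2", "ex:title", "Project B"),
    ("http://example.org/project/2", "ex:leadInstitution", "http://example.org/inst/beta"),
}

# Dataset 2: institution locations, published by someone else entirely.
institutions = {
    ("http://example.org/inst/alpha", "ex:lat", "53.4"),
    ("http://example.org/inst/alpha", "ex:long", "-2.9"),
    ("http://example.org/inst/beta", "ex:lat", "51.5"),
    ("http://example.org/inst/beta", "ex:long", "-0.1"),
}


def merge(*graphs):
    """Merging linked data is just a set union of the triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def objects(graph, subject, predicate):
    """Return every object recorded for a given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def project_locations(graph):
    """Follow project -> institution -> coordinates via the shared URIs."""
    rows = []
    for s, p, o in graph:
        if p != "ex:leadInstitution":
            continue
        title = objects(graph, s, "ex:title")
        lat = objects(graph, o, "ex:lat")
        lon = objects(graph, o, "ex:long")
        if title and lat and lon:
            rows.append((title[0], float(lat[0]), float(lon[0])))
    return sorted(rows)


if __name__ == "__main__":
    for title, lat, lon in project_locations(merge(projects, institutions)):
        print(title, lat, lon)
```

Neither dataset on its own can place a project on a map. The merged set can, and the project publisher never had to collect the coordinates.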
Bijan Parsia, Lecturer in Computer Science at the University of Manchester, talked about the notion of data disintermediation, which is the idea that linked data cuts out intermediaries in the data handling process, thereby eliminating points of friction. Applications such as visualisations can be built upon linked data without the need to climb the technical and administrative hurdles around a proprietary dataset. Many opportunities then exist to build added value over time.
The business case favoured by Graham Klyne was captured by the idea that linked data enables “uncoordinated reuse of information”, as espoused by Clark & Parsia, an example being the simplicity with which it’s possible to overlay faceted browse functionality on a dataset without needing to ask permission. Graham addressed the question of why there are still few compelling linked data apps. He believes this comes down to the disconnect between who pays and who benefits. It is all too often not the publishers themselves who benefit, so we need to do everything possible to remove the barriers to data publishing. One solution may be to find ways to give credit for dataset publication, in the same way we do for publishing papers in the academic sector.
David then asked the panel for some one-liners on the current barriers and pain points. For Wilbert, it’s simply down to lack of knowledge and understanding in the University enterprise sector, where the data publisher is also the beneficiary. Hugh felt it’s about the difficulty of extracting the essence of the benefit of linked data. Bijan suggested that linked data infrastructure is still relatively immature, and Graham felt that the realisation of benefits is too separate from the costs of publication, although he acknowledged that it is getting better and cheaper to do.
We then moved on to questions and discussion. The issue of data quality was raised. Hugh suggested that linked data doesn’t solve the problem of quality, but it can help expose quality issues, and therefore aid their correction. He pointed out that there may be trust, and therefore an expectation of quality, around a domain name, such as http://www.bl.uk/ for data from the British Library. Bijan noted that data quality is really no more of an issue than it is for the wider Web, but that it would help to have mechanisms for reporting issues back to the publisher. Hugh believes linked data can in principle help with sustainability, in that people can fairly straightforwardly pick up and re-host linked data. Wilbert noted one advantage of linked data is that you can do things iteratively, and build things up over time without having to make a significant upfront commitment. Hugh also reminded us of the value of linked data to the intranet. Much of the British Library data is closed off, but has considerable value internally. Linked data doesn’t have to be open to be useful.
The session was very energetic, being somewhat frantic and freewheeling at times. I was a little frustrated that some interesting discussion points didn’t have the opportunity to develop, but overall the session managed to cover a lot of ground for a one-hour slot. Were any IT managers convinced enough to look at linked data further? For that I think we’ll have to wait and see. For now, as Ethel Merman would say, “let’s go on with the show”.