Relevant Technologies « W3C incubator group on LLD

1 0

@@ To read the most up-to-date version of this section, in the context of the entire report, please see our wiki page

2 1

Linked Data is an emerging technology, so most tools are still developmental. Fortunately, the principles of Linked Data are not tied to any particular tool; rather, they are tied to Web standards themselves. In many situations, production and consumption of Linked Data can be layered or interwoven with existing applications without the need for massive redevelopment efforts. The following examples are not exhaustive, but are intended to illustrate a few broad categories. From a non-technical perspective, these technologies are relevant because they support the creation and use of HTTP URIs that identify and describe discrete and recognizable individuals.

3 0

Discrete and bulk access to information

4 0

The Semantic Web has been around for many years, but Linked Data gives it a major boost in the form of Cool URIs. Linked Data HTTP URIs are Cool because raw RDF can be easily and automatically negotiated and rendered into an HTML format for human (browser) consumption. The DBpedia resource http://dbpedia.org/resource/Jane_Austen is a good example. This is great for diagnosing data and for serendipitous discovery, but the atomic nature of Linked Data HTTP URIs makes them impractical for high-volume network access. Fortunately, more and more Linked Datasets are being published in bulk and consistently described using the VoID Vocabulary.
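As a rough illustration of the negotiation step described above, the following Python sketch chooses a representation from an HTTP Accept header. The function names, the list of supported media types and the simplified parsing (partial wildcards such as text/* are ignored) are assumptions made for this example; it is not a description of how DBpedia or any particular server is implemented.

```python
# Media types this hypothetical server can produce, in order of server preference.
SUPPORTED = ["text/html", "application/rdf+xml", "text/turtle"]


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so ties keep the order in which the client listed them.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_representation(accept_header, supported=SUPPORTED):
    """Pick the media type to serve, or None if nothing acceptable is supported."""
    for media_type, q in parse_accept(accept_header or "*/*"):
        if q <= 0:
            continue
        if media_type == "*/*":
            return supported[0]
        if media_type in supported:
            return media_type
    return None
```

With these assumptions, a browser-style header such as "text/html,application/xhtml+xml;q=0.9,*/*;q=0.8" selects the HTML view, while a client asking for "application/rdf+xml" receives the raw RDF for the same URI.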

5 0

Linked Data front-ends to existing data stores

6 0

Unlike information represented hierarchically in typical XML documents, resources published as Linked Data allow information to be freed from use-case-specific hierarchies and thus made available for unexpected reuse. This not only makes the information easier to mash up, it also makes tools and services easier to mash up. This is true for both producers and consumers of Linked Data. For example, an existing relational database can be exposed as Linked Data and through SPARQL by using D2R Server. Similarly, Linked Data can be produced from existing SRU databases with a few rewrite rules. If the information is already available from a SPARQL endpoint, then a Linked Data front-end like Pubby can be used to automate the URIs.

7 0

Tools for data designers

8 0

Another boost for Linked Data is the growing use of OWL for purposes of data design. Prior to OWL, domain experts could use RDFS to create domain-specific vocabularies, but there was no way to map equivalencies across vocabularies. Among other features, OWL includes an upgrade to RDFS to support ontology mapping. This allows experts to describe their domain using community idioms, while still being interoperable with related or more common idioms. A variety of tools related to OWL can be found on the W3C's RDF wiki and OWL wiki.
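To make the idea of ontology mapping more concrete, here is a minimal Python sketch that reads owl:equivalentProperty statements and uses them to rewrite data from a local vocabulary into a more common one. Triples are modelled as plain tuples, and the http://example.org/lib# vocabulary and its equivalence to the Dublin Core title property are hypothetical assertions made up for this example. A real OWL reasoner does considerably more (for instance, it treats equivalence as symmetric and transitive).

```python
OWL_EQUIVALENT_PROPERTY = "http://www.w3.org/2002/07/owl#equivalentProperty"

# Hypothetical mapping statement published by a data designer.
MAPPINGS = [
    ("http://example.org/lib#mainTitle",
     OWL_EQUIVALENT_PROPERTY,
     "http://purl.org/dc/terms/title"),
]

# Hypothetical data expressed in the local, community-specific vocabulary.
DATA = [
    ("http://example.org/books/42",
     "http://example.org/lib#mainTitle",
     "Pride and Prejudice"),
]


def equivalence_map(triples):
    """Collect {local property: target property} from equivalence statements."""
    return {s: o for s, p, o in triples if p == OWL_EQUIVALENT_PROPERTY}


def normalize(data, mapping):
    """Rewrite each triple's property to its mapped equivalent, if one exists."""
    return [(s, mapping.get(p, p), o) for s, p, o in data]
```

Calling normalize(DATA, equivalence_map(MAPPINGS)) yields the same statement expressed with the Dublin Core property, so a consumer that only understands the common vocabulary can still use the data.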

9 0

SKOS and related tools

10 0

Yet another key technology boost is being provided by SKOS, which is an OWL ontology for dealing with a broad range of concept schemes, including the management of preferred and alternate labels. Many SKOS-related tools are listed on the W3C's SKOS community wiki.

11 0

Microformats, Microdata and RDFa

12 0

Microformats, Microdata and RDFa all provide ways to embed structured data into web pages. As historically the emphasis on publishing information on the web has had to do with publishing web pages, these technologies provide ways to enhance what is already there rather than necessarily deploying separate infrastructure. RDFa supports expression of RDF data in this way and is therefore the most directly interoperable with other linked data infrastructure.

13 0

Microdata, which is defined alongside the new HTML5, provides another way of doing this. It has notably gained prominence for Search Engine Optimisation purposes with the announcement of http://schema.org/ by Google, Microsoft and Yahoo. This particular type of microdata does not appear to be intended to represent arbitrarily complex data, and the vocabulary that they have published places special emphasis on commerce and tourism. Though it is in principle extensible, it would require a lot of extension to express library information in this way, as most of the required vocabulary is lacking. There is some level of interoperability with linked data thanks to the efforts at http://schema.rdfs.org/, but at this time it seems that it would be difficult, using this approach, to cultivate the high level of interconnectedness between library and other datasets that is possible with linked data.

14 0

It should be noted that the http://schema.org/ protagonists do support harvesting of RDFa data and have pledged to continue doing so. It therefore does not appear that publishing HTML pages marked up with RDFa means missing out on the opportunities afforded by microdata. Modulo bugs in the search engines' parsers, it is even possible to do both in the same web page. If for some reason it is not possible to make use of the full expressive power of RDF with RDFa, some structured data is better than none.
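The following Python sketch suggests how an existing page template might be enhanced with RDFa rather than replaced. It assumes RDFa 1.1 style attributes (vocab, about, property) and the Dublin Core terms namespace; the function name, the example URI and the field layout are illustrative assumptions only.

```python
from html import escape

DCTERMS = "http://purl.org/dc/terms/"


def render_rdfa(subject_uri, fields):
    """Render (property, label, value) fields as an HTML fragment with RDFa.

    Each property name is resolved against the vocab declared on the
    enclosing element, so "title" stands for DCTERMS + "title".
    """
    rows = []
    for prop, label, value in fields:
        rows.append(
            f'  <p>{escape(label)}: '
            f'<span property="{escape(prop, quote=True)}">{escape(value)}</span></p>'
        )
    return (
        f'<div vocab="{DCTERMS}" about="{escape(subject_uri, quote=True)}">\n'
        + "\n".join(rows)
        + "\n</div>"
    )
```

For example, render_rdfa("http://example.org/books/42", [("title", "Title", "Pride and Prejudice")]) produces an ordinary HTML fragment for browsers in which the title text is also marked up as a machine-readable property of the book's URI.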

15 0

Web Application Frameworks

16 0

As the Web has grown in popularity, the software development community has created a variety of software libraries that make it easier to create, maintain and reuse web applications. These libraries are often referred to as web application frameworks, and typically implement the Model-View-Controller (MVC) pattern in some fashion. In addition, web application frameworks have typically encoded and encouraged best practices with respect to the REST Architectural Style and Resource Oriented Architecture, which have informed much of the standardization around web technologies.

17 0

A common component of web application frameworks is a URI routing mechanism, which allows software developers to define HTTP URI patterns and map them to controllers, which in turn generate an HTTP response using the appropriate views and models. This activity encourages best practices with respect to Cool URIs, and also forces the developer to think about the resources that she is making available on the Web. Linked Data's focus on naming resources with HTTP URIs, and delivering representations of them (HTML for humans, and RDF for machines), makes it a natural fit for web application frameworks, which already provide some of the scaffolding for these activities. The wide availability of web application frameworks in many different programming languages and operating system environments has led to them being heavily used in the cultural heritage sector.
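A URI routing mechanism of the kind described above can be sketched in a few lines of Python. This is a toy router written for illustration, not the API of any particular framework; the route patterns, controller names and response tuples are all assumptions.

```python
import re

# Registered (compiled pattern, controller) pairs, checked in order.
ROUTES = []


def route(pattern):
    """Register a controller for URI paths matching a regular expression."""
    def register(controller):
        ROUTES.append((re.compile(pattern), controller))
        return controller
    return register


@route(r"^/authors/(?P<author_id>[a-z0-9-]+)$")
def show_author(author_id):
    # A real controller would load a model and render a view here.
    return 200, f"Author page for {author_id}"


@route(r"^/books/(?P<isbn>\d{10}|\d{13})$")
def show_book(isbn):
    return 200, f"Book page for ISBN {isbn}"


def dispatch(path):
    """Find the first matching route and call its controller."""
    for pattern, controller in ROUTES:
        match = pattern.match(path)
        if match:
            return controller(**match.groupdict())
    return 404, "Not found"
```

Writing the patterns forces a decision about which resources exist and how they are named: here every author and every book gets one stable path, which is the starting point for a Cool URI.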

18 1

However, web developers are sometimes put off Semantic Web (Linked Data) technologies because they feel that they would need to throw away their current application, swapping their database for a triplestore and their database query language for SPARQL. This is simply not the case, since RDF serializations can be generated on the fly, just as web application frameworks do for HTML, XML and JSON representations. The use of HTTP URIs to identify and link together resources in RDF's data model makes it a natural choice for serializing and sharing entity state in a database-neutral way, which has traditionally been of great interest to cultural heritage organizations and the digital preservation community.
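As a minimal sketch of generating RDF on the fly, the Python below turns a record, such as a row fetched from an existing relational database, into N-Triples. The base URI, the field names and the choice of Dublin Core properties are assumptions made for the example, and all values are emitted as plain literals for simplicity.

```python
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical mapping from database column names to RDF properties.
FIELD_MAP = {
    "title": DCTERMS + "title",
    "issued": DCTERMS + "issued",
}


def escape_literal(value):
    """Escape the characters that may not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(base_uri, record, field_map=FIELD_MAP):
    """Serialize one record as N-Triples, one statement per mapped field."""
    subject = f"<{base_uri}{record['id']}>"
    lines = []
    for field, predicate in field_map.items():
        value = record.get(field)
        if value is None:
            continue  # skip columns that are empty for this record
        literal = escape_literal(str(value))
        lines.append(f'{subject} <{predicate}> "{literal}" .')
    return "\n".join(lines)
```

A controller could call record_to_ntriples("http://example.org/books/", row) when a client asks for RDF and fall back to its usual HTML template otherwise, leaving the underlying database untouched.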

19 0

Content Management Systems

20 0

Just as web application frameworks have evolved as the Web has spread, so has the class of web applications known as content management systems (CMS). CMS are often built using a web application framework, but provide out-of-the-box functionality for easily creating/editing/presenting content (text, images, video) on the Web, and for managing workflows associated with the content. Since CMS are typically built using web frameworks, the same best practices for naming resources with http URIs are naturally followed. The wide availability of content management systems has led to heavy use in the cultural heritage sector. Some content management systems such as Drupal are starting to expose structured database information to machine clients by seamlessly layering it into their HTML using RDFa. As a result, data consumers such as Google Scholar, Google Maps, Facebook, etc. are starting to leverage this structured metadata in their own service offerings. Conversely, Drupal is also starting to make plugins available to consume RDF, such as VARQL and SPARQL Views.

21 0

Web Services for Library Linked Data

22 0

Theoretically, most domain-specific Web Service API capabilities could be refactored as Linked Data URIs, OWL, SPARQL, and SPARQL/Update. But even though it should be possible to layer a Linked Data URI front-end on an existing back-end datastore, it may not be so easy for the back-end to support SPARQL and SPARQL/Update access. Security, robustness and performance considerations could also preclude supporting SPARQL in production situations. Furthermore, although SPARQL endpoints and bulk RDF downloads can greatly facilitate discovery and reuse of published Linked Data, most web developers face a steep learning curve before being able to exploit them, and for many application requirements this is too much of a burden.

23 0

Web Services for the most common uses should be offered as an alternative. Most Web Service APIs tend to be domain-specific, though, and require custom-coded agents. This means they should be well-documented. More general approaches to web service interfaces include OpenSearch (which can be documented using a Description Document), the Linked Data API and the ongoing work of the W3C RDF Web Applications Working Group on RDF and RDFa APIs. Some Linked Datasets could also benefit from syndicated access using Atom Syndication Format and/or RSS.

24 0

A few Linked Data implementations have endeavored to implement Web Services to enhance discovery and use of resources, often by providing some form of an application programming interface (API). Agrovoc and STW provide an API to discover resources based on relationships in the data, among many more web services. VIAF, Library of Congress, and STW offer autosuggest services for resources, delivering JSON responses ready for consumption in AJAX browser applications. (In principle, though, JSON could be content-negotiable via the Linked Data URI, just like HTML and RDF.) Agrovoc and STITCH/CATCH include support for RDF responses. Some services provide full-fledged SOAP APIs, while others support a RESTful approach.
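To give a sense of what an autosuggest service involves, here is a small Python sketch that matches a typed prefix against concept labels and returns a JSON response. The concept list, the URIs and the response shape are hypothetical; they do not reproduce the actual formats used by VIAF, the Library of Congress or STW.

```python
import json

# Hypothetical labelled resources; a real service would query its datastore.
CONCEPTS = [
    {"uri": "http://example.org/concept/1", "label": "Libraries"},
    {"uri": "http://example.org/concept/2", "label": "Library catalogs"},
    {"uri": "http://example.org/concept/3", "label": "Linked data"},
]


def autosuggest(prefix, concepts=CONCEPTS, limit=10):
    """Return a JSON string listing concepts whose label starts with prefix."""
    needle = prefix.strip().casefold()
    if needle:
        matches = [c for c in concepts if c["label"].casefold().startswith(needle)]
    else:
        matches = []  # an empty query suggests nothing rather than everything
    matches.sort(key=lambda c: c["label"].casefold())
    return json.dumps({"query": prefix, "results": matches[:limit]})
```

Because each suggestion carries the resource's URI alongside its label, a browser application can offer the label to the user and then store or dereference the Linked Data URI.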

25 1

By focusing on request parameters and response formats to provide enhanced discovery, Linked Data Web Services diminish, if not eliminate, the requirement that data be stored in a triplestore or be made searchable via SPARQL. And, because web service APIs are common, web services can lower the barrier to entry.

Jennifer Bowen

7/20/2011


A very important, and persuasive, point about generating RDF serializations on the fly, rather than “throwing away” current applications. I would suggest adding something general about this in the first section of the report, about Benefits of linked data. This is an important benefit, that libraries do not completely have to retool in order to implement linked data.

Same comment as for paragraph 18 – this is a compelling point that should be brought out earlier in the report. Anything related to “lowering the barrier for entry” for libraries for linked data should be emphasized as much as possible.

Alan Danskin

7/22/2011

This may be true, but it is also true that investment is needed in new skills, tools and kit.

W3C incubator group on LLD – Draft Report for comment