UKOLN Cultural Heritage Documents » Metadata
http://blogs.ukoln.ac.uk/cultural-heritage-documents
A commentable and syndicable version of UKOLN's cultural heritage briefing documents
Fri, 17 Sep 2010 09:32:22 +0000

What Makes A Good Tag?
http://blogs.ukoln.ac.uk/cultural-heritage-documents/2010/09/02/what-makes-a-good-tag/
Thu, 02 Sep 2010 13:25:53 +0000, Brian Kelly

There are No ‘Wrong’ Tags – Are There?

Although from the theoretical viewpoint there are no ‘wrong’ tags, in practice care needs to be taken when creating tags. So here are a few tips.

Tags are Single Words

Each tag takes the form of a single word. This is fine if the idea you want to convey is easily defined as a single word or doesn’t have multiple meanings. If this is not the case, tags can be extended by using a hyphen to link words together and still be treated by software and applications as a single word.
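
As a rough illustration of linking words with hyphens, the Python sketch below turns a phrase into a single tag. The function name make_tag and the decision to lower-case everything are assumptions made for this example, not rules taken from any tagging service.

```python
import re

def make_tag(phrase: str) -> str:
    """Join the words of a phrase with hyphens so it can be used as one tag.

    Lower-casing is an assumption for this sketch; some services keep case.
    """
    # Keep runs of letters and digits, dropping spaces and punctuation.
    words = re.findall(r"[A-Za-z0-9]+", phrase.lower())
    return "-".join(words)

print(make_tag("Violet Hill song"))
```

The same function collapses repeated spaces, so "graduation   party" and "graduation party" give the same tag.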

Singular or Plural

There are no rules, so you choose whether to use the singular or plural form of a word. However, the choice of ‘museum’ instead of ‘museums’ or ‘library’ instead of ‘libraries’ by either the person tagging or the person searching will affect the results of the search. Library catalogue subject headings generally use the plural form for countable things.

Words with Multiple Meanings

Some words can have multiple meanings, which could be confusing. When using the tag ‘violet’ do you mean a flower or a colour or a woman? You might need to extend the tag to make the distinction clear:

violet-flower
violet-colour
violet-UML-editor      (a piece of software)
violet-cool-gifts      (an Internet shopping site)
violet-hill-song      (a song and not a geographical feature)
violet-carson      (tv series actress)
violet-posy-blog

Tags for Events and Awards

If you want to create tags for a series of events or an award, it is advisable to think ahead and devise a consistent set of tags. Start with the name of the event (which might be a well-known acronym) and then extend it using location and/or date.

IFLA-2009	nobel-prize-biology-2000
IFLA-2010	nobel-prize-peace-1999

Note, though, that there are also advantages in having short tags, so sometimes a tag for an event such as IFLA09 may be preferred.
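
One way to keep a series of event tags consistent is to generate them from the same pattern each year. The sketch below is only an illustration: the function event_tag and its short option are hypothetical names chosen for this example.

```python
def event_tag(name: str, year: int, short: bool = False) -> str:
    """Build a tag from an event name and a year.

    short=False gives the long form (e.g. IFLA-2009);
    short=True gives the compact form (e.g. IFLA09).
    """
    if short:
        # Keep only the last two digits of the year, zero-padded.
        return f"{name}{year % 100:02d}"
    return f"{name}-{year}"

for year in (2009, 2010):
    print(event_tag("IFLA", year), event_tag("IFLA", year, short=True))
```

Whichever form is chosen, using one function (or one written rule) for every year avoids a mix of styles across the series.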

‘Meaningless’ Tags

Within social networking services, people new to tagging often create tags from a very personal viewpoint. These are often effective within a specific context, but of limited use to someone else searching for information.

An advanced search on Flickr using the tag ‘my-party’ turned up 399 hits. And while extending the tag might be expected to reduce the number of photos found, using ‘ann-party’ actually found 630 hits. Nobody seemed to have extended ‘ann-party’ with a date, but a search on the tag ‘party-2008’ found 901 items.

Even for a personal set of photos, using the tag ‘party’ may well not be enough, if you are a regular party giver or attender. You might need to tag some as ‘18th-party’, ‘eurovision-party-2008’, ‘graduation-party’, ‘millennium-party’ or ‘engagement-party’.

Multiple Tags

An advantage of tagging is that any number of tags can be assigned to a resource. Assigning multiple tags to resources may take more time but it does get round some of the problems with tagging. So, if a word could be singular or plural, you could use both terms. Similarly, you could use both formal (or specialist) and informal terms as in ‘oncology’ and ‘cancer’. Multiple tagging also helps when the tagged resource might be searched for via several routes. An image of a dress in a costume collection could be tagged not only with its designer’s name, the year, decade or century it was created, its colour, fabric, length and style features (e.g. sleeveless) but also the occasions when it has been worn and by whom.
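
The benefit of multiple tags for searching can be sketched with a small index that maps each tag to the items carrying it. The item identifiers and tags below are invented for illustration, and build_index is a hypothetical helper rather than part of any particular system.

```python
from collections import defaultdict

def build_index(items):
    """Map each tag to the set of item identifiers that carry it."""
    index = defaultdict(set)
    for item, tags in items.items():
        for tag in tags:
            index[tag].add(item)
    return index

# Invented sample collection: singular and plural forms are both assigned.
collection = {
    "dress-042": ["dress", "dresses", "silk", "sleeveless", "1920s"],
    "poster-007": ["poster", "posters", "1920s"],
}

index = build_index(collection)
print(sorted(index["dresses"]))
print(sorted(index["1920s"]))
```

Because the dress carries both ‘dress’ and ‘dresses’, a search on either form finds it, while a search on ‘1920s’ finds both items.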

A Final Tip

It is worth spending some time considering the above points before deciding on your tags. So think carefully before you tag.

An Introduction to Tags and Tagging
http://blogs.ukoln.ac.uk/cultural-heritage-documents/2010/09/02/an-introduction-to-tags-and-tagging/
Thu, 02 Sep 2010 13:24:08 +0000, Brian Kelly

What is a Tag?

Wikipedia defines a tag as “a non-hierarchical keyword or term assigned to a piece of information (such as an internet bookmark, digital image, or computer file)” [1]. Tags, which are a form of metadata, allow resources to be found more easily.

Background

In the pre-Internet era, library catalogues used keywords to help users find titles on specific topics. Later, publishers of early Web sites started to use keywords to help people to find content. Then around 2003, tagging was developed by the social bookmarking site Delicious, and subsequently used by other social software services such as Flickr, YouTube and Technorati.

Tag Features

A list of typical characteristics of tags is given below:

  • Tags are chosen by the creator and/or by the viewer of the tagged item.
  • Tags are not part of a formal subject indexing term set.
  • Tags are informal and personal.
  • An item may have multiple tags assigned to it.
  • There is no ‘wrong’ tag.

Tag Clouds

Web sites that use tags often display the tags visually as a tag cloud. These usually take the form of an alphabetical list of tags and use font size and/or colour to identify the most frequently used tags. This enables viewers to either pick from the alphabetical list or to easily spot the most popular tags.

Tag Cloud Types

A number of different types of tag clouds may be found. For example:

  • The size represents the number of times that tag has been applied to a single item.
  • The size represents the number of items to which a specific tag has been applied.
  • The size represents the number of items in a content category.
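
As an illustration of the second type (size reflects the number of items to which a tag has been applied), the sketch below counts tags across a set of items and scales each count linearly to a font size. The function tag_cloud, the size range of 10 to 30 and the sample data are all assumptions made for this example.

```python
from collections import Counter

def tag_cloud(items, min_size=10, max_size=30):
    """Return (tag, font_size) pairs in alphabetical order.

    Font size grows linearly with the number of items carrying the tag.
    """
    # Count each tag once per item, even if an item repeats it.
    counts = Counter(tag for tags in items.values() for tag in set(tags))
    if not counts:
        return []
    low, high = min(counts.values()), max(counts.values())
    spread = (high - low) or 1  # avoid dividing by zero when all counts match
    return [
        (tag, min_size + (counts[tag] - low) * (max_size - min_size) // spread)
        for tag in sorted(counts)
    ]

photos = {
    "photo1": ["museum", "bath"],
    "photo2": ["museum", "library"],
    "photo3": ["museum", "bath"],
}
print(tag_cloud(photos))
```

A real service may well use a different scale (for example logarithmic) so that a few very popular tags do not dwarf the rest.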

Folksonomies

In situations where many users add tags to lots of items, a collection of tags is built up over time. Such a collection of tags may be referred to as a folksonomy. A more formal definition of folksonomy is a set of keywords that is built up collaboratively without a pre-determined hierarchical structure.

Users of tagging systems can see the tags already applied by other people and will often, therefore, choose to use existing tags. However, they will create new tags if no existing tag is suitable or if the existing ones are not specific enough.

Hash Tags (# Tags)

Hash tags (also written as ‘hashtags’) are used in messages using services such as Twitter. The hash symbol (#) is placed before the word to be treated as a tag, as in the example below.

#goji berries are the new #superfood

This enables tweets on a specific topic to be found by searching on the hash tag.
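
A minimal sketch of how software might pick out hash tags from a message is shown below. The regular expression is a simplification assumed for this example; real services apply their own, more detailed rules.

```python
import re

# A '#' followed by one or more letters, digits or underscores.
HASHTAG = re.compile(r"#(\w+)")

def hash_tags(message: str) -> list:
    """Return the hash tags in a message, lower-cased and without the '#'."""
    return [tag.lower() for tag in HASHTAG.findall(message)]

print(hash_tags("#goji berries are the new #superfood"))
```

Note that only ‘goji’ is picked up from ‘#goji berries’: the tag ends at the first space, which is why multi-word hash tags are written as a single word.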

Adding Tags

Systems vary in how you enter tags. When a single text box is provided and you want to enter more than one tag, you will need to use a separator between the tags. The most popular separator is the space character but some systems use other separators; e.g. quotation marks. Other systems only allow one tag to be entered at a time; in these cases you will have to repeat the process to add further tags.
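
The effect of the separator can be sketched as follows. The function split_tags is hypothetical, and the comma used in the second call is simply an assumed alternative separator for illustration.

```python
def split_tags(text: str, separator=None) -> list:
    """Split the contents of a tag text box into individual tags.

    With separator=None the text is split on any run of whitespace.
    """
    parts = text.split(separator)
    # Trim stray spaces and drop empty entries.
    return [part.strip() for part in parts if part.strip()]

print(split_tags("museum library bath"))
print(split_tags("cultural heritage, museum ,bath", ","))
```

With a space separator, ‘cultural heritage’ would become two tags; with the comma it stays as one, which is why it matters to know which separator a system expects.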

‘Official’ Tags

Events and conferences are increasingly creating ‘official’ tags. These tags can then be used by participants for blog posts, photos of the event, presentation slides and other supporting materials and resources. Using a consistent tag makes it easier to find resources relating to a specific event.

References

  1. Tag (metadata), Wikipedia,
    <http://en.wikipedia.org/wiki/Tag_(metadata)>
Metadata – Fit for Purpose
http://blogs.ukoln.ac.uk/cultural-heritage-documents/2010/08/26/metadata-fit-for-purpose/
Thu, 26 Aug 2010 14:55:37 +0000, Brian Kelly

About This Document

This briefing document describes the issues to be considered when choosing and using metadata.

Why Use Metadata?

Metadata cannot solve all your resource management and discovery problems but it can play an important part in the solutions. Since time and effort are needed if metadata is to be used effectively, it is vital to look closely at the problems you wish to address.

Do you want to allow resources on your Web site to be found more easily by search engines such as Google? Or perhaps you want to improve local searching on your Web site? Do you need interoperability with other projects and services? Maybe you want to improve the maintenance of resources on your Web site.

While metadata has a role to play in all of these situations, different approaches will be needed to tackle each type of problem. And in some cases, metadata may not be the optimal solution; for example, Google makes limited use of metadata so an alternative strategy might be needed.

Identifying the Functionality to be Provided

Once you have clarified why you want to use metadata, you should identify the end-user functionality you wish to provide. This will enable you to define the metadata you need, how it should be represented, and how it should be created, managed and deployed.

Choosing The Metadata Standard

You will need to choose the metadata standard which is relevant for your purpose. In many cases this will be self-evident. For example, a project that is funded to develop resources in an OAI environment will need to use the OAI application, while for a database of collection descriptions you will need to use collection description metadata.

Off the Shelf or Custom Fit?

Some metadata can be used without further work – for example, MARC 21 format in library management system cataloguing modules or entries in the Cornucopia and MICHAEL collection description databases.

Other metadata requires decisions on your part. If you are using Dublin Core, you will need to decide whether to use qualifiers (and if so which) and which elements are mandatory and which are repeatable.

Managing Your Metadata

It is important that you think about this at an early stage. If not properly managed, metadata can become out-of-date; and since metadata is not normally displayed to end-users but processed by software, you won’t be able to check visually. Poor quality data can be a major obstacle to interoperable services.

If, for example, you embed metadata directly into a file, you may find it difficult to maintain the metadata; e.g. if the creator changes their name or contact details. A better approach may be the use of a database (sometimes referred to as a metadata repository) which provides management capabilities.

Example Of Use Of This Approach

The Exploit Interactive e-journal was developed by UKOLN with EU funding. Metadata was required in order to provide enhanced searching for the end user. The specific functionality required was the ability to search by issue, article type, author and title and by funding body. In addition metadata was needed in order to assist the project manager producing reports, such as the numbers of different types of articles. This functionality helped to identify the qualified Dublin Core elements required.

The MS SiteServer software used to provide the service provided an indexing and searching capability for processing arbitrary metadata. It was therefore decided to provide Dublin Core metadata stored in <meta> tags in HTML pages. In order to allow the metadata to be more easily converted into other formats (e.g. XHTML) the metadata was held externally and converted to HTML by server-side scripts.
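
The general idea of holding metadata externally and converting it to HTML <meta> tags can be sketched in Python as below. This is not the script used by Exploit Interactive: the function dc_meta_tags, the record layout and the sample values are all assumptions made for illustration.

```python
from html import escape

def dc_meta_tags(record: dict) -> str:
    """Render a mapping of Dublin Core element names to values as <meta> tags."""
    lines = []
    for element, value in record.items():
        # escape() protects characters such as & and " inside attribute values.
        lines.append(
            f'<meta name="DC.{escape(element)}" content="{escape(value)}">'
        )
    return "\n".join(lines)

# Invented sample record, standing in for an externally held metadata store.
article = {"Title": "Metadata & QA", "Creator": "Ann Example"}
print(dc_meta_tags(article))
```

Because the record is held separately from the page, the same data could be passed to a different function to produce another output format.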

A case study which gives further information (and describes the limitations of the metadata management approach) is available.

Managing And Using Metadata In An E-Journal, QA Focus briefing document no. 1, UKOLN, <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-01/>

Quality Assurance For Metadata
http://blogs.ukoln.ac.uk/cultural-heritage-documents/2010/08/26/quality-assurance-for-metadata/
Thu, 26 Aug 2010 14:54:35 +0000, Brian Kelly

Introduction

Once you have decided to make use of metadata in your project, you then need to agree on the functionality to be provided, the metadata standards to be used and the architecture for managing and deploying your metadata. However, this is not the end of the matter. You will also need appropriate quality assurance procedures to ensure that your metadata is fit for its purposes.

What Can Go Wrong?

There are a number of ways in which services based on metadata can go wrong, such as:

Incorrect content:
The content of the metadata may be incorrect or out-of-date. There is a danger that metadata content is even more likely to be out-of-date than normal content, as content is normally visible, unlike metadata which is not normally displayed on, say, a Web page. In addition humans can be tolerant of errors, ambiguities, etc. in ways that software tools normally aren’t.
Inconsistent content:
The metadata content may be inconsistent due to a lack of cataloguing rules and inconsistent approaches if multiple people are involved in creating metadata.
Non-interoperable content:
Even if metadata is consistent within a project, other projects may apply different cataloguing rules. For example the date 01/12/2003 could be interpreted as 1 December or 12 January if projects based in the UK and USA make assumptions about the date format.
Incorrect format:
The metadata may be stored in a non-valid format. Again, although Web browsers are normally tolerant of HTML errors, formats such as XML insist on compliance with standards.
Errors with metadata management tools:
Metadata creation and management tools could output metadata in invalid formats.
Errors with the workflow process:
Data processed by metadata or other tools could become corrupted through the workflow. As a simple example, an MS Windows character such as © could be entered into a database and then output as an invalid character in an XML file.
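
The date example under ‘Non-interoperable content’ can be demonstrated with a short Python sketch. It parses the same string under the UK and the US convention and prints each result in the unambiguous ISO 8601 form (YYYY-MM-DD); recording dates in that form is suggested here as one possible way of avoiding the problem.

```python
from datetime import datetime

raw = "01/12/2003"

# The same string read under two different conventions.
uk = datetime.strptime(raw, "%d/%m/%Y").date()  # day/month/year
us = datetime.strptime(raw, "%m/%d/%Y").date()  # month/day/year

print(uk.isoformat())  # 2003-12-01
print(us.isoformat())  # 2003-01-12
```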

QA For Metadata Content

You should have procedures to ensure that the metadata content is correct when created and is maintained as appropriate. This could involve agreeing cataloguing rules and putting mechanisms in place to ensure that the rules are applied (possibly in software when the metadata is created). You may also need systematic procedures for periodic checking of the metadata.

QA For Metadata Formats

As metadata which is to be reused by other applications is increasingly being stored in XML it is essential that the format is compliant (otherwise tools will not be able to process the metadata). XML compliance checking can be implemented fairly easily. More difficult will be to ensure that metadata makes use of appropriate XML schemas.
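
A basic compliance check of this kind can be sketched with the XML parser in Python's standard library. The function name is_well_formed is chosen for this example. Note that the sketch only tests whether the XML is well-formed; it does not check the metadata against a schema, which is the harder task mentioned above.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the text can be parsed as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

print(is_well_formed("<Title>An Introduction to Dublin Core</Title>"))
print(is_well_formed("<Title>Smith & Jones</Title>"))  # unescaped ampersand
```

The second call fails because a bare & is not allowed in XML content; it would need to be written as &amp;amp; in the file.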

QA For Metadata Tools

You should ensure that the output from metadata creation and management tools is compliant with appropriate standards. You should expect that such tools have a rich set of test suites to validate a wide range of environments. You will need to consider such issues if you develop your own metadata management system.

QA For Metadata Workflow

You should ensure that metadata does not become corrupted as it flows through a workflow system.

A Fictitious Nightmare Scenario

A multimedia e-journal project is set up. Dublin Core metadata is used for articles which are published. Unfortunately there are no documented cataloguing rules and, due to a high staff turnover (staff are on short-term contracts), there are many inconsistencies in the metadata (John Smith and Smith, J.; University of Bath and Bath University; etc.).

The metadata is managed by a home-grown tool. Unfortunately the author metadata is output in HTML as DC.Author rather than DC.Creator. In addition the tool outputs the metadata in XHTML 1.0 format which is embedded in HTML 4.0 documents.

The metadata is created by hand and is not checked. This results in a large number of typos and use of characters which are not permitted in XML without further processing (e.g. £, — and &).
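
One form the ‘further processing’ might take is sketched below: the ampersand is escaped and any non-ASCII character is replaced by a numeric character reference, so the result is safe whatever encoding the XML file declares. The function name encode_for_xml is an assumption for this example, and other approaches (such as writing the file consistently as UTF-8) are equally possible.

```python
from xml.sax.saxutils import escape

def encode_for_xml(text: str) -> str:
    """Escape &, < and >, then replace non-ASCII characters with references."""
    escaped = escape(text)
    # xmlcharrefreplace turns e.g. the pound sign into &#163;
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

# \u00a3 is the pound sign and \u2014 is a long dash.
print(encode_for_xml("\u00a35 \u2014 Smith & Jones"))
```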

Rights metadata for images, which describes which images can be published freely and which are restricted to local use, becomes separated from the images during the workflow process.

An Introduction To Dublin Core
http://blogs.ukoln.ac.uk/cultural-heritage-documents/2010/08/26/an-introduction-to-dublin-core/
Thu, 26 Aug 2010 14:53:18 +0000, Brian Kelly

About This Document

This briefing document provides an introduction to Dublin Core metadata.

What Is Dublin Core Metadata?

Identifying metadata elements in a standard way enables metadata to be processed in a consistent manner by computer software.

The Dublin Core Metadata Element Set is a standard for cross-domain information resource description. It is widely used to describe digital materials such as video, sound, image, text and composite media such as Web pages. It is the best known metadata standard in the Web environment.

Based on the Resource Description Framework, it defines a number of ‘elements’ of data that are required to find, identify, describe and access a particular resource.

Dublin Core metadata is typically recorded using Extensible Markup Language (XML).

Dublin Core is defined by ISO Standard 15836 and NISO Standard Z39.85-2007.

Simple Dublin Core

There are 15 core elements in the Dublin Core standard:

Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage and Rights.

Qualified Dublin Core

The core element set was deliberately kept to a minimum, but this sometimes proved a problem for early implementers. This led to the development of Qualified Dublin Core, which has a further 3 elements (Audience, Provenance and RightsHolder) and a set of element qualifiers, which restrict or narrow the meaning of an element.

For example, qualified Date elements are DateAccepted, DateCopyrighted and DateSubmitted.

Metadata Standards

Because so many communities now use metadata, there are a bewilderingly large number of standards and formats in existence or in development. Metadata is used for resource description and discovery; recording intellectual property rights and access data; and technical information relating to the creation, use and preservation of digital resources.

What Does It Look Like?

Dublin Core metadata is typically recorded in XML using <meta> tags. Each element has a label; this is recorded between <…> brackets and precedes the actual data, while another set of brackets containing a forward slash </…> marks the end of the data.

Some examples are:

<Creator> Ann Chapman </Creator>
<Title> An Introduction to Dublin Core </Title>
<DateSubmitted>  20080417 </DateSubmitted>
<DateAccepted>  20080611 </DateAccepted>
<Relation> Cultural Heritage Briefing Papers series </Relation>
<Subject> Metadata </Subject>
<Format> Word document Office 2003 </Format>
<Language> English </Language>
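
For readers who want to produce such a record with software, the sketch below builds a small Dublin Core record using Python's standard XML library. The namespace http://purl.org/dc/elements/1.1/ is the one defined for the Dublin Core element set, in which element names are written in lower case; the wrapping <record> element and the function dc_record are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialise elements with the dc: prefix

def dc_record(fields) -> str:
    """Build an XML string from (element name, value) pairs."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")

print(dc_record([
    ("creator", "Ann Chapman"),
    ("title", "An Introduction to Dublin Core"),
    ("language", "English"),
]))
```

Building the record through a library, rather than by joining strings, means that special characters in the values are escaped automatically.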

Application Profiles

Implementers then found that even Qualified Dublin Core had insufficient detail for use in specific communities. This lack led to the development of Application Profiles which contain further elements and element qualifiers appropriate to the community of interest.

DC-Lib
Library Application Profile. Used to describe resources by libraries and library related applications and projects.
DC-Collections
Collections Application Profile. Used to describe resources at collection level.
SWAP
Scholarly Works Application Profile. Used to describe research papers, scholarly texts, data objects and other resources created and used within scholarly communications.
DC-Education
Education Application Profile. Used to describe the educational aspects of any resource, and/or the educational context within which it has been or may be used. It is intended to be usable with other application profiles.
An Introduction To Metadata
http://blogs.ukoln.ac.uk/cultural-heritage-documents/2010/08/26/an-introduction-to-metadata/
Thu, 26 Aug 2010 14:52:01 +0000, Brian Kelly

About This Document

This briefing document provides an introduction to metadata.

What Is Metadata?

Metadata is sometimes defined literally as ‘data about data’. More usefully, the term is understood to mean structured data about resources. The fact that the data is structured – broken down into very specific pieces – enables a range of automated processes to be built around the data to provide services.

Traditional ‘Metadata’?

In one sense, metadata is not a new concept. Library catalogues, abstracting and indexing services, directories of resources and institutions, archival finding aids and museum documentation all contain structured information.

What is the Value of Metadata?

Firstly, it enables librarians, archivists and museum documentation professionals to work across institutional and sector boundaries to provide more effective resource discovery to the benefit of enquirers, students and researchers.

Secondly, it enables cultural heritage professions to communicate more effectively with other domains that also have an interest in metadata, such as publishers, the recording industry, television companies, producers of digital educational content, software developers and those concerned with geographical and satellite-based information.

Metadata Standards

Because so many communities now use metadata, there are a bewilderingly large number of standards and formats in existence or in development. Metadata is used for resource description and discovery; recording intellectual property rights and access data; and technical information relating to the creation, use and preservation of digital resources.

Metadata Encoding

Metadata is recorded in formats (e.g. MARC 21) or implementations of Mark-up Languages and Document Type Definitions (DTD). The main standards are:

SGML
Standard Generalised Mark-up Language.
XML
Extensible Mark-up Language.

Metadata for Libraries

Important metadata standards for use in a library context are:

MARC 21
A means of encoding metadata defined in bibliographic cataloguing rules.
ISBD series
International Standard for Bibliographic Description.
ONIX
A range of international standards for electronic information messages (about books, serials and licensing and rights) for the book industry.

Metadata for Archives

Important metadata standards for use in an archives context are:

EAD
Encoded Archival Description; a means of encoding metadata defined in archival cataloguing rules.
ISAD(G)
International Standard for Archival Description.

Metadata for Museums

Important metadata standards for use in a museum context are:

CIMI DTD
Computer Interchange of Museum Information.
SPECTRUM
The UK and international standard for collections management.

Metadata for the Digital World

Important metadata standards for use in a digital context are:

Dublin Core (DC)
Defines 15 metadata elements for simple resource discovery. Qualifiers for some of these elements enable more detail to be recorded. Further elements have now been defined to use in specific fields.
DC Application Profiles
A set of DC elements defined for use in the context of specific communities of practice; for example, education, libraries, collections and scholarly works.