This chapter describes how DRIVER envisions interoperability for scholarly communication: high-quality, correct metadata records based on the use of standards.
This document is largely based on the recommendations for the use of Unqualified (simple) Dublin Core metadata as described in: USING SIMPLE DUBLIN CORE TO DESCRIBE EPRINTS, by Andy Powell, Michael Day and Peter Cliff, UKOLN, University of Bath, Version 1.2
Additional information, descriptions, explanations, comments, usage instructions and best practices have been carefully provided with the aid of all DRIVER Guidelines contributors in order to create syntactic and semantic interoperability that will be appropriate for most European repositories.
"An institutional repository is a facility, consisting of hardware, software, data and procedures, that contains digital resources representing any type of scientific output..."
"digital resources = any bit stream, independent of content or format, which has been marked as scientific output by an approved person..."
Within this document we use the word "resource" to describe the instance of scientific output, and the word "object" to refer to the digital bit stream.
When "Requirement" is used we mean the following: "1 something required; a need. 2 something specified as compulsory." (Compact Oxford Dictionary of Current English, third edition)
When "Recommendation" is used we mean: "1 put forward with approval as being suitable for a purpose or role. 2 advise as a course of action. 3 make appealing or desirable."
The DRIVER Guidelines are written primarily to facilitate the exchange of metadata between DRIVER content providers and DRIVER services, in compliance with the DCMI definitions for Unqualified (simple) Dublin Core as specified in the OAI-PMH specification, which states: "For purposes of interoperability, repositories must disseminate Dublin Core, without any qualification." (http://www.openarchives.org/OAI/openarchivesprotocol.html#MetadataNamespaces) In essence, these DRIVER Guidelines describe the mapping from an internal format to Unqualified (simple) DC to support harvesting. They are not to be used as cataloguing instructions.
Repository managers have to accept that not everything can be expressed in Unqualified DC. These guidelines therefore concentrate on the information that matters most from the perspective of an end-user who is not a librarian.
- Metadata are structured as Unqualified Dublin Core (ISO 15836:2003)
- Individual elements of DC are to be used according to the guidelines as presented in this appendix
- The use of Unicode is mandatory
- The values (i.e. actual content) of the DC-elements given below must not contain any HTML (or XML) markup. They may contain LaTeX commands, but there is no mechanism for explicitly indicating that LaTeX is being used.
- Represent metadata in a more granular structure such as Qualified Dublin Core or MODS (future work; additions to the DRIVER Guidelines).
- The DRIVER metadata guidelines only refer to metadata as an exchange format. They do not hard code the recommendations made in the DRIVER Guidelines, nor do they prescribe a mapping between the locally implemented, highly granular metadata structures and the DRIVER recommendations.
- The recommended language for descriptive information is English, so that end-users can reach informative documents that are normally "locked in" a national context.
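The rule that element values must not contain HTML or XML markup can be checked before records are exposed for harvesting. The following is a minimal sketch, not part of the DRIVER Guidelines: the function name, the record layout (a dictionary of element names to value lists) and the regular expression are assumptions made for illustration, and the check is a heuristic that only looks for tag-like sequences.

```python
import re

# Heuristic: "<" optionally followed by "/", then a letter, up to the next ">".
# Plain comparisons such as "a < b" are not matched.
_TAG_LIKE = re.compile(r"</?[A-Za-z][^<>]*>")


def find_markup_violations(record):
    """Return (element, value) pairs whose value appears to contain markup.

    `record` is a hypothetical in-memory form of a simple DC record:
    a dict mapping element names to lists of string values.
    """
    violations = []
    for element, values in record.items():
        for value in values:
            if _TAG_LIKE.search(value):
                violations.append((element, value))
    return violations


record = {
    "title": ["Growth of <i>E. coli</i> at low temperature"],
    # LaTeX commands are permitted by the guidelines and are not flagged.
    "description": ["We measure $\\alpha$ decay rates."],
}
print(find_markup_violations(record))
```

Because Python strings are Unicode, values that pass this check can be serialised as UTF-8 when the record is written out.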
Editions & difference in intellectual content
Only one metadata record should be used for different manifestations of a digital object (for example a postscript and a pdf version), unless the intellectual content is different. Common practice is to create a new metadata record when the intellectual content is different. This happens for instance when a new "edition", with modifications in the intellectual content, is created. In that case the recommended best practice is to use the relation element to link the more recent version to the older one.
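As an illustration of linking a newer edition to an older one, the sketch below builds an oai_dc record whose dc:relation points at the earlier version. It is a sketch under stated assumptions: the helper name, the titles and the repository.example.org identifiers are hypothetical, and only the oai_dc and dc namespace URIs are taken from the OAI-PMH and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)


def build_record(title, identifier, previous_version=None):
    """Build a minimal oai_dc record; add dc:relation only for a new edition."""
    root = ET.Element(f"{{{OAI_DC}}}dc")
    ET.SubElement(root, f"{{{DC}}}title").text = title
    ET.SubElement(root, f"{{{DC}}}identifier").text = identifier
    if previous_version is not None:
        # The newer record points back to the record of the older edition.
        ET.SubElement(root, f"{{{DC}}}relation").text = previous_version
    return root


# Hypothetical identifiers for two editions with different intellectual content.
second = build_record(
    "An Example Paper (2nd edition)",
    "http://repository.example.org/id/123-v2",
    previous_version="http://repository.example.org/id/123-v1",
)
print(ET.tostring(second, encoding="unicode"))
```

A postscript and a pdf rendering of the same edition would not get separate records here; both would be described by the single record for that edition.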
Classification schemas & Review policies
In some cases, additional information on local review policies, or on the local classification schemes or controlled keyword vocabularies used in the dc:subject and dc:type elements, may be useful for the harvesting party and service provider. A content provider typically releases this type of information via the Identify request at the repository (IR) level, not at the metadata level. See for instance section 3, "Guidelines for Optional Containers", at http://www.openarchives.org/OAI/2.0/guidelines.htm, and http://arXiv.org/oai2?verb=Identify for best practices. At the DC-element level this can be done by adding a URI to a term. For classification schemes that do not already have a namespace, adding a sub-namespace to the info-uri namespace might be helpful (see www.info-uri.info).
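A harvester might look for such repository-level information as sketched below. This is an illustration only: the base URL and the helper names are hypothetical, and the sample response is an abbreviated, non-validated fragment written for this example rather than output from a real repository. No network request is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def identify_url(base_url):
    """Build the OAI-PMH Identify request URL for a repository base URL."""
    return f"{base_url}?{urlencode({'verb': 'Identify'})}"


def description_payloads(identify_xml):
    """Return the tag of each element found directly inside a <description>."""
    root = ET.fromstring(identify_xml)
    tags = []
    for description in root.iter(f"{OAI}description"):
        for child in description:
            tags.append(child.tag)
    return tags


# Abbreviated sample response (assumption: not schema-complete).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <description>
      <eprints xmlns="http://www.openarchives.org/OAI/1.1/eprints">
        <metadataPolicy><text>Peer-reviewed items only.</text></metadataPolicy>
      </eprints>
    </description>
  </Identify>
</OAI-PMH>"""

print(identify_url("http://repository.example.org/oai"))
print(description_payloads(SAMPLE))
```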
Dumbing down & Qualifiers
Some words on the use of refinements (qualifiers): when mapping to Unqualified DC, the content provider has to make choices wherever the internal format is "richer" than Unqualified DC. During the mapping process all refinements are simply dropped (the DCMI dumbing-down principle). The effect of the dumbing-down principle is that the simple form of the element, i.e. without the refinement, is the default one. E.g. when the internal format distinguishes between main title and subtitle this would show as follows in DC:
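The dumbing-down step can be sketched as follows. This is a minimal illustration under assumptions: the "element.refinement" key notation, the function name and the sample values are hypothetical and are not a DRIVER-defined internal format. The sketch emits main title and subtitle as two separate title values; whether to keep them separate or concatenate them is a local mapping choice.

```python
def dumb_down(qualified_record):
    """Drop refinements: map 'element.refinement' keys to plain element names.

    All values are kept; only the qualifier part of the key is discarded.
    """
    simple = {}
    for key, values in qualified_record.items():
        element = key.split(".", 1)[0]  # "title.subtitle" -> "title"
        simple.setdefault(element, []).extend(values)
    return simple


# Hypothetical internal record with refinements.
internal = {
    "title.main": ["Metadata in Practice"],
    "title.subtitle": ["A Field Guide"],
    "date.issued": ["2008-11-01"],
}
print(dumb_down(internal))
# {'title': ['Metadata in Practice', 'A Field Guide'], 'date': ['2008-11-01']}
```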
Default dc-elements interpretations
However, within DRIVER the following interpretations are selected as the defaults for oai_dc:
dc:description -> default "abstract"
dc:date -> default "published"
dc:audience -> default "education level"
Within DRIVER this means, for example, that the date element always pertains to the date of publication. Content providers are advised to supply this information to external harvesters as information about their repository (in the OAI-PMH Identify response).
Table 1: example of notifying the service provider on the default interpretation of the dc-element fields.
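As a rough illustration of such a notification, the sketch below serialises the three defaults listed above into an XML fragment that could be placed in a description container of the Identify response. The element and attribute names (defaultInterpretations, default, element) are hypothetical and are not a schema defined by DRIVER or OAI-PMH; only the three element/default pairs come from the text above.

```python
import xml.etree.ElementTree as ET

# The defaults named in this section.
DRIVER_DEFAULTS = {
    "dc:description": "abstract",
    "dc:date": "published",
    "dc:audience": "education level",
}


def defaults_description(defaults):
    """Wrap default interpretations in a <description> element.

    The child element and attribute names are hypothetical placeholders.
    """
    description = ET.Element("description")
    container = ET.SubElement(description, "defaultInterpretations")
    for element, meaning in defaults.items():
        ET.SubElement(container, "default", {"element": element}).text = meaning
    return description


fragment = defaults_description(DRIVER_DEFAULTS)
print(ET.tostring(fragment, encoding="unicode"))
```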