...

Document information

...

Title: MPEG21 DIDL Application Profile for Institutional Repositories
Subject: MPEG21

...

DIDL

...


Moderator: Thomas Place
Version:

...

3.

...

...

V 2.3.1

  1. Minor change in the schema path. ISO changed the path .../dii.xsd/dii.xsd to .../dii/dii.xsd. Nothing major, only to keep the validity check consistent.
  2. Minor change in the examples: the mimetype for humanStartPage resources changed from "application/html" to "text/html", because "application/html" is not a valid mimetype.

...

  1. The use of Persistent Identifiers is addressed in this version of the DIDL document.
  2. Added an explanation of the Persistent Identifier.
  3. Use of the URI namespace info:eu-repo: info:eu-repo/semantics/descriptiveMetadata, info:eu-repo/semantics/objectFile, info:eu-repo/semantics/humanStartPage

...

  1. Uses the official URL for the DIDL schemas stored at http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-21_schema_files
  2. Removed the comparison with the FRBR model to minimize confusion.
  3. Clearer agreements about which attributes or items are mandatory.
  4. Metadata prefix reduced to "didl"

...

  1. Uses the name "DIDL document" more consistently.
  2. The value of the OAI-PMH metadataPrefix argument has a more neutral name: "didl_document".
  3. The hbo-kennisbank.nl link has been removed, along with other text revisions made on the advice of the Koninklijke Bibliotheek, National Library of The Netherlands.
  4. For interoperability reasons we use DIDL with a minimum of exotic extensions. The only extensions are those of the MPEG21 standard itself and Dublin Core.
  5. To introduce the use of semantics of the assets contained in the DIDL Document, the urn:mpeg:mpeg21:2002:01-DIP-NS namespace is used to relate Items to an ObjectType.
  6. To determine semantic relations we use URIs (info:eu-repo) to represent the typical three ObjectTypes for Items in repositories (Metadata, Object, Jump-off-page).
  7. Uses Zulu dates (ISO 8601) with the YYYY-MM-DDThh:mm:ssZ granularity consistently in examples.
  8. Every "text/xml" has been changed to "application/xml".
  9. Every identifier, where possible, has to have a registered namespace. Example: info:doi/10.1075/123123, or info:hdl/...., or urn:nbn:nl:....
  10. The examples of the content type names have to be in lowercase, as is recommended.
  11. Date modified elements and identifier elements have to appear consistently in the Statement of every Item, not as attributes of, for example, the <didl:DIDL> tag.

...

  1. Now only one <Item> is at top level.
  2. Child elements of <Item> are <Component> elements with different DRIVERtype attributes (DescriptiveMetadata, ResourceLocation, IntermediatePageLocation). All Components are part of the Item. Therefore the semantic relationship between the two elements is a "has a" relationship. We have defined this relationship with DriverComponentTypes. We can now say, "The article has a Descriptive Metadata Component and also has a Resource Location Component".
  3. The order in which the ResourceLocation components are placed in the DIDL does not have to be in the reading order any more.

...

  1. Semantic XML description for the Items involving metadata, digital object location and intermediate pages, replacing the description/Statement/text construct. Preview: <!-- Introducing the area for metadata --> <didl:Item dit:DriverContentType="DescriptiveMetadata" xmlns:dit="http://www.darenet.nl/upload.view/" xsi:schemaLocation="http://www.darenet.nl/upload.view/ http://www.darenet.nl/upload.view/DriverContentType.xsd" >
  2. The <Container> tag had been replaced by the <Item>, due to semantic compatibility with the DIDL Mpeg-21 standard.
  3. A new schema location (2006-09) replaces the old (2004-08).
  4. The descriptor/statement/dii:identifier construct for the DIDL root element has been replaced by the DIDLDocumentId attribute within the DIDL tag.
  5. The descriptor/statement/dcterms:modified construct for the DIDL root element has been replaced by the didlext:DIDLDocumentModified attribute within the DIDL tag.
  6. The diext:DIDcreated attribute for the DIDL root element has been replaced by the didlext:DIDLDocumentCreated attribute within the DIDL tag.

...

MPEG21 DIDL Document Specifications for repositories
1 Document information
2 Versions
2.1 What's new
3 Table of Contents
4 Introduction and Goal
5 Background information
5.1 Persistent Identifier
6 OAI Response with a DIDL document
Remarks:
7 DIDL as wrapper
7.1 Item Descriptors
7.1.1 Item 'Identifier' Statement
Remarks:
7.1.2 Item 'modified' Statement
Remarks:
7.1.3 Item 'ObjectType' Statement
Remarks:
7.2 Compound Item - representation of the complex work
Remarks:
7.2.1 Metadata Item
Remarks:
7.2.2 Object Item
Remarks:
7.2.3 Jump-off-page / Human Start page Item
Remarks:
7.3 Example of full OAI-PMH record (thesis) as MPEG-21 DIDL

...

This document is an addition to the existing DIDL document specification for repositories, which is used by the Dutch universities, the Koninklijke Bibliotheek (National Library of The Netherlands) and DAREnet. The goal of this document is to make the use of DIDL unambiguous by describing:

  1. the nature of the different parts: "metadata", "objects" and "jump-off-page"
  2. what the identification is
  3. what the modification date is

When used correctly, this specification will produce a valid XML MPEG-21 DIDL record for use in OAI-PMH responses.
This specification of the DIDL document for repositories is based on decisions that were proposed early in the development of this XML format to use MPEG-21 DIDL. The proposal was a rough sketch of a wrapper format that has room for metadata, object and jump-off-page resources.
This specification is a more precise elaboration.

...

The DIDL was originally developed within the DARE programme of SURF as a first implementation of MPEG-21 DIDL.
The rationale behind this development was:

  • a solution for resource harvesting via OAI-PMH for transport of the digital resources (PDFs etc.) from the local repository to a National Library for ingest of the resources into the E-Depot system for long-term preservation
  • a solution for resource harvesting via OAI-PMH for transport of the digital resources (PDFs etc.) from the local repository system to a service provider (e.g. a search portal that indexes the full text of documents)
  • a (partial) solution for representing complex documents, at first focused on theses that consist of multiple digital resource files
  • a solution for the confusing use of dc:identifier in case of a link to a so-called jump-off page (JOP); many repositories place a link to a jump-off page in dc:identifier instead of a direct link to the digital resource file

...

Date published: 2009-04-18

Document History

Date

Version history

Owner

Changelog

PDF

18 April 2009

3.0

SURFshare

Start of version 3.0. Based on NEEO document "MPEG21 DIDL Application Profile for Institutional Repositories" version 0.4, which is based on "MPEG21 DIDL Document Specifications for repositories" version 2.3.1. See also history. Note that this is the first version of this document. The version number (3.0) indicates that it is more recent and more up-to-date than the predecessors on which it is based by having a higher number than their latest versions.
Changes:

  1. Less strict remarks on the location of the namespace declarations.
  2. The number of Descriptors is unlimited.
  3. Type of a Digital Item is expressed as an URI being the value of the rdf:resource attribute of a rdf:type element.
  4. Type statements are also allowed at the top level.
  5. Instead of using not yet registered terms from info:eu-repo/semantics/, the terms from the Eprints Access Rights Vocabulary must be used for expressing the accessibility of an object
  6. For the date of deposit of an object in a repository dcterms:dateSubmitted is to be used (was dcterms:issued)

Download

22 January 2008

2.3.1

SURFshare

Minor change in the schema path: ISO changed the path .../dii.xsd/dii.xsd to .../dii/dii.xsd.
Minor change in the examples: the mimetype for humanStartPage resources changed from "application/html" to "text/html", because "application/html" is not a valid mimetype.

Download

05 December 2007

2.3

SURFshare

Changes to stress the use of Persistent Identifiers in the DIDL document. The addition of the ORE compliant info:eu-repo namespace

 

23 May 2007

2.2.2

SURFshare

Some changes and little tweaks.

 

23 March 2007

2.2.1

SURFshare

Added comment of Peter van Huisstede, small corrections in the example XML.

 

6 March 2007

2.2

SURFshare

The Committee for Complex Objects looked at this document and came up with more elegant improvements. Thanks to: Thomas Place, Renze Brandsma, Henk Ellermann, Peter van Huisstede and Ruud Bronmans.

 

20 February 2007

2.1

SURFshare

A closer look at the recommendations of Herbert vd Sompel gave more insight into the DIDL semantics, leading to a better XML specification.

 

2 January 2007

2.0

SURFshare

Fundamental change of element and attribute use; for better representation of the semantics.
Additional texts for driver guidelines from Martin Feijen, new DIDL according to comments of Herbert vd Sompel, new DIDL schema. (http://purl.lanl.gov/STB-RL/schemas/2006-09/DIDL.xsd)

 

4 December 2006

1.1.2

SURFshare

Translated into English for DRIVER

 

11 July 2006

1.1.1

SURFshare

A few typos were removed.

 

10 July 2006

1.1

SURFshare

Extension with:

  • Version numbering and information
  • Complete namespace declaration in 'metadata'-item.
  • The three Items are not case sensitively discriminated by: metadata, objects and jump-off-page.
  • Extended explanation about the use of Namespace-declarations.

 

30 March 2006

1.0

SURFshare

Initial document

 

 

0.4

NEEO

  1. Some minor editorial changes
  2. 3.3.2 - Object File Item
    • Addition of deposit date as a <dcterms:issued>
    • Vocabulary for <dcterms:accessRights>

 

 

0.3

NEEO

Only minor changes

 

 

0.2

NEEO

  1. "One or more object files" was changed to "zero, one or more object files".
  2. Removed the example; it has to be replaced by a new example.

 

 

0.1

NEEO

Changes with respect to version 2.3.1 of "MPEG21 DIDL Document Specifications for repositories" by Maurice Vanderfeesten (SURF)

  1. Relaxed the requirement that the top Item identifier must be persistent.
  2. More strict formulation for Item identifiers not being the same as the OAI identifier and the DIDL document identifier.
  3. Replaced dip:objectType by rdf:type.
  4. The use of simple Dublin Core is not mandatory and is not even recommended.
  5. MODS is the recommended metadata scheme.
  6. Left out all references to 'work' or 'expression'.
  7. Item elements stand for Digital Items (they are Digital Item Declarations). Type statements (using rdf:type) type the Digital Items.
  8. The use of identifiers (URIs) for Digital Items is mandatory.
  9. Added new semantics: publishedVersion|authorVersion, embargo, description, access rights.

 

Abstract

The abstract describes what the application profile is about. It should contain a problem definition, the standards described by the application profile and the goal of the application profile.

Introduction

This document is an adaptation of "MPEG21 DIDL Application Profile for NEEO" version 0.4 (http://drcwww.uvt.nl/~place/neeo/didl%20application%20profile.0.4.doc). The latter was based on Maurice Vanderfeesten (2008), "MPEG21 DIDL Document Specifications for repositories" version 2.3.1 https://www.surfgroepen.nl/sites/oai/complexobjects/Shared%20Documents/DIDLdocumentSpecification_EN_v2.3.doc

This document describes the use of DIDL in the context of institutional repositories. The DIDL Document Specification was originally developed within the DARE programme of SURF as a solution for:

  • the harvesting of the digital resources (PDFs etc.) from the local repositories for ingest into the E-Depot system of the Royal Library for long term preservation
  • the harvesting of the digital resources (PDFs etc.) from the local repositories by a service provider (e.g. a search portal that indexes the full text of documents)
  • the representation of complex documents such as doctoral theses that consist of multiple digital resource files
  • the confusing use of dc:identifier; dc:identifier can be used for different types of objects; the identifier itself doesn't indicate what type of object is identified; in the context of repositories dc:identifier can point to a jump-off page (JOP) and/or to object files.


DIDL has been in use by the DARE community since the summer of 2006. One of the results is that the content of all Dutch repositories is now part of the E-Depot of the Royal Library, the national library of The Netherlands.

Compound and Digital Objects as Digital Items

The digital objects that populate institutional repositories can be seen as compound objects that consist of parts or components that are also digital objects. In the DIDL model the basic entity is a Digital Item. The compound objects and their objects play the role of Digital Items in the model that underlies DIDL. In a DIDL document the Item elements represent the Digital Items. The top Item element that is situated directly below the DIDL root element is used for the compound object. The Item elements that are the children of the top Item element represent the objects that are part of the compound object. The objects that are part of a compound object can themselves be a compound object. When a part object is also a compound object, then its parts are not described in the same DIDL document, but a separate DIDL document is used to describe this compound object with its parts. This means that in this application profile there are only two levels of Digital Items within a DIDL document. Although DIDL allows for a hierarchy of Digital Items, this profile restricts the hierarchy to two levels: the level of the top Digital Item, the compound object and the level of the Digital Items that are parts of the top Digital Item. This version of the application profile doesn't give (yet) guidelines for the case of a compound object that is part of another compound object.
This profile distinguishes three types of digital objects: descriptive metadata, object files and jump-off pages. This list is extensible; other types can be added.
The figure below is a schematic representation of a DIDL document of a compound object that consists of one or more descriptive metadata records, zero or more object files and zero or one jump-off page. Metadata that apply to the metadata records, object files and jump-off pages can be placed in Descriptor elements within the respective Item elements. In the figure the most used Descriptors are shown. The list of Descriptor elements in an Item is extensible.
A digital object can have one or more representations. A representation is the thing that can be displayed on a computer screen or that can be printed. A representation MUST have a medium type (mimetype). In DIDL, representations are handled by the Resource element. A Resource is contained in a Component element, which in its turn is a child of the Item element. There are two ways of including a representation in a Resource element. The first way is by-value: the representation as such is included as content of the Resource element. This is the usual way that metadata records formatted in XML are included. The second way is by-ref: the Resource element stays empty, but the representation is referred to by a URL that is the value of the ref attribute of the Resource element. Normally, the URL will point to a file in the repository.
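The two ways of including a representation can be sketched with Python's standard library. This is only an illustration, not part of the profile: the helper names, the sample title and the example.org URL are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET

DIDL_NS = "urn:mpeg:mpeg21:2002:02-DIDL-NS"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("didl", DIDL_NS)
ET.register_namespace("dc", DC_NS)


def didl(tag):
    """Return the qualified ElementTree name of a DIDL element."""
    return f"{{{DIDL_NS}}}{tag}"


def component_by_value(record, mime_type="application/xml"):
    """By-value: the representation itself becomes the content of Resource."""
    component = ET.Element(didl("Component"))
    resource = ET.SubElement(component, didl("Resource"), {"mimeType": mime_type})
    resource.append(record)
    return component


def component_by_ref(url, mime_type):
    """By-ref: Resource stays empty and points to the representation via ref."""
    component = ET.Element(didl("Component"))
    ET.SubElement(component, didl("Resource"), {"mimeType": mime_type, "ref": url})
    return component


# Hypothetical sample data, for illustration only.
title = ET.Element(f"{{{DC_NS}}}title")
title.text = "Example thesis"
by_value = component_by_value(title)
by_ref = component_by_ref(
    "http://repository.example.org/files/thesis.pdf", "application/pdf"
)

print(ET.tostring(by_value, encoding="unicode"))
print(ET.tostring(by_ref, encoding="unicode"))
```

In both cases the medium type travels in the mimeType attribute of the Resource element; only the by-ref variant carries a ref attribute.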
Each Digital Item MUST have an identifier, with the exception of jump-off pages, for which the identifier is optional. This identifier MUST be a URI. The URI of a Digital Item should be different from the URLs of its representations. The identifiers of the Digital Items must be persistent. The URLs of the representations and the medium types can change, while the identifier of the Digital Item stays the same. This allows, e.g., a file that can only be processed by an old-fashioned word processor to be replaced by a version with the same content that can be read on all contemporary desktops. Likewise, a file can be moved to another location; the identifier of the Digital Item stays the same, indicating that it is still the same file. If the policy of a repository is to preserve the different representations of a Digital Item, then the repository is advised to treat the representations as separate Digital Items, each with its own persistent identifier. So it is possible that in one repository the PDF, the Word and the HTML versions of a publication are combined into one Digital Item, while in another repository they are treated as separate Digital Items. Another use of Digital Item identifiers is to relate Digital Items to each other.

[Figure: schematic representation of a DIDL document of a compound object]

OAI Response with a DIDL document

The DIDL document is part of an OAI-PMH response. The DIDL document is returned within an OAI record when didl is used as the value of the metadataPrefix argument. This lets the repository generate the particular DIDL format that is described below.
Within the OAI XML structure, the DIDL resides within the metadata element. See below:

Code Block
xml
xml
<OAI-PMH ...>
...
<request ... metadataPrefix="didl">
...
<record>
<header>...</header>
<metadata>
<didl:DIDL
xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

...

A Persistent Identifier is meant to keep information retrievable for hundreds of years, regardless of the underlying future technology. When a Persistent Identifier is used to identify a digital object or a piece of information, the repository commits itself to keeping that information resolvable.
This particular DIDL document can also be used to transport the Persistent Identifier, together with the representations, to the resolver.
A national resolution mechanism is used to translate the Persistent Identifier into the current representations. In 2007 the representation might be MS Word 2007 or PDF 8.0, but in the future it might be some format XYZ that represents the same information. The resolver keeps track of the updates.
In the SURFshare programme the URN:NBN is used (the arguments are not included here). We encourage the use of urn:nbn throughout Europe.
Use a Persistent Identifier to identify the information that a representation represents, not the representation itself. In other words, do not identify the physical digital object, such as a Word file, but the information it contains at a more abstract level. The W3C URI model is as follows: a URN is a resource that contains static or dynamic content and refers to a URL. The URL provides the location of the representation.
In DIDL the Item element contains a Persistent Identifier as a URN, and the Resource element contains the URL. This is how the National Resolution Services can link the URN to a URL.

...


...

xmlns:dii="urn:mpeg:mpeg21:2002:01-DII-NS"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
urn:mpeg:mpeg21:2002:02-DIDL-NS
http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-21_schema_files/did/didl.xsd
urn:mpeg:mpeg21:2002:01-DII-NS
http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-21_schema_files/dii/dii.xsd
">
...
</didl:DIDL>
</metadata>
<about>...</about>
</record>
...
</OAI-PMH>

...


...

Remarks:

  1. Don't forget the DIDL tag in the OAI-PMH response.
  2. Declare the didl, dii, dip, dc and dcterms namespaces here, in the DIDL tag. These namespaces are needed throughout the whole DIDL document.
  3. Do not declare these namespaces in the <OAI-PMH> tag, because the rationale of a DIDL document is that it is an autonomous entity that can exist outside the context of OAI-PMH.
  4. The about element is optional in OAI-PMH.
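To obtain a record in the DIDL format, a harvester passes didl as the metadataPrefix argument of an OAI-PMH request. A minimal sketch of building such a GetRecord request URL follows; the base URL and the OAI identifier are hypothetical placeholders, not real endpoints.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def didl_request_url(base_url, identifier, metadata_prefix="didl"):
    """Build an OAI-PMH GetRecord URL that asks for the DIDL format."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Hypothetical repository endpoint and record identifier.
url = didl_request_url(
    "http://repository.example.org/oai", "oai:repository.example.org:12345"
)
print(url)
```

The same metadataPrefix value can be used with the ListRecords verb to harvest DIDL documents in bulk.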

DIDL as a wrapper

The DIDL document is a document with one top-level Item element. The Item contains several child Item elements. These child Item elements describe three different kinds of object types: descriptive metadata, object files and jump-off pages. The cardinality of the XML elements is shown between brackets:

DIDL[1]
  Item[1]
    Item[1..∞] (of type metadata)
    Item[1..∞] (of type objects)
    Item[0..1] (of type jump-off page)

<metadata>
<didl:DIDL ...>
<didl:Item>
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
</didl:Item>
</didl:DIDL>
</metadata>

...

DIDL[1]
  Item[1]
    Descriptor[2]
    Item[1..∞] (of type metadata)
      Descriptor[1..3]
    Item[1..∞] (of type objects)
      Descriptor[1..3]
    Item[0..1] (of type jump-off page)
      Descriptor[1..3]

Item descriptors provide information about the Item. This information can serve as semantic hints for automatic processing or for reusing the XML.

...

Example on level one

...

<didl:DIDL ...>
<didl:Item>
<didl:Descriptor>...</didl:Descriptor> <!-- Identification, mandatory -->
<didl:Descriptor>...</didl:Descriptor> <!-- Modification date, mandatory -->
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
...
</didl:Item>
</didl:DIDL>

...

<didl:DIDL ...>
<didl:Item>
<didl:Item>
<didl:Descriptor>...</didl:Descriptor> <!-- Identification, optional -->
<didl:Descriptor>...</didl:Descriptor> <!-- Modification date, optional -->
<didl:Descriptor>...</didl:Descriptor> <!-- Object type, mandatory -->
...
</didl:Item>
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
...
</didl:Item>
</didl:DIDL>

...

The first Descriptor contains the ID of the Item elements. This is mostly used to uniquely identify the digital object (e.g. with a urn:nbn). This ID is wrapped in a Statement with a DII Identifier element. For example:

<didl:Item>
<didl:Item>
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<dii:Identifier>urn:nbn:nl:ui:13-6748398729821</dii:Identifier>
</didl:Statement>
</didl:Descriptor>
...
</didl:Item>
...
</didl:Item>

...

1. In our case the root Item has a Persistent Identifier to represent the
For second-level Item elements, this Identifier is NOT equal to the OAI identifier or DIDL identifier in use.
2. The Identifier in the root Item element MAY be the same as the DIDL or OAI Identifier, but this is not recommended.
3. The namespace for dii MUST be declared in the DIDL tag.
4. The Identifier MUST be a URI when applicable.

...

The second Descriptor contains a modification date. When something changes inside an Item, this modification date element has to be updated. The modification date is specified by the modified element from dcterms:

<didl:Item>
<didl:Item>
...
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
</didl:Statement>
</didl:Descriptor>
...
</didl:Item>
...
</didl:Item>

...

  1. Declare the dcterms namespace in the DIDL tag.
  2. The format of the date is Zulu-time; which means that it can be sorted as text.
  3. There can be only one Statement element in a Descriptor element, which means that dii:identifier and dcterms:modified reside in separate Descriptor elements.

...

The third Descriptor contains the object type. This object type appears on the second level of Item elements; in other words, it applies only to child Item elements of the first Item.
This object type is specified by the ObjectType element from the MPEG-21 Digital Item Processing (DIP) namespace, which specifies an architecture pertaining to the dissemination of Digital Item Documents (DIDs).

<didl:Item>
<didl:Item>
...
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<dip:ObjectType>info:eu-repo/semantics/descriptiveMetadata</dip:ObjectType>
</didl:Statement>
</didl:Descriptor>
...
</didl:Item>
...
</didl:Item>

In section 7.3 this ObjectType statement will be further elaborated upon.

...

  1. Declare the dip namespace in the DIDL tag.
  2. The ObjectType MUST be expressed as a URI.
  3. For object types we use the info:eu-repo namespace. This is still in development.

...

The top Item element contains at least two mandatory child Item elements, each with its own ObjectType. These typed Items are expressions of the root Item: one for the metadata and one for the digital object file, e.g. a PDF, as described by the metadata.
Optionally there can be a third child Item element with an ObjectType for a jump-off page. The jump-off page is an intermediate HTML page that is used for human-readable presentation when an Item has more than one digital object file. This situation typically occurs with dissertations (theses) with separate object files (e.g. when the thesis consists of a set of previously published articles). It also occurs when the content provider has a PDF, an MS Word DOC and an HTML version of the same article.

<didl:DIDL ...>
<didl:Item>
<didl:Item>...</didl:Item> <!-- metadata -->
<didl:Item>...</didl:Item> <!-- objects -->
<didl:Item>...</didl:Item> <!-- jump-off-page -->
</didl:Item>
</didl:DIDL>

...

Code Block
<metadata>
<didl:DIDL ...>
<didl:Item>
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
</didl:Item>
</didl:DIDL>
</metadata>
 


 

Item Descriptors

Item Descriptors provide information about the Digital Item. A Descriptor contains a Statement with information about the Item. For each "statement" a new Descriptor is used.
The top level Item element MUST contain two Descriptor elements. One Descriptor element for the (Persistent) Identifier and one Descriptor element for the modification date.

  1. Modifications MUST be made visible by changing the modification date. When there are no modifications the modification date can be left out from the second level Items.
  2. Changes of the modification date in child Item elements MUST be propagated to the parent Item element.
  3. When a Descriptor element for the modification date is used, a Descriptor element with an identifier MUST also be used (they go in pairs). Rationale: in order to compare similar harvested Item elements with respect to modification date, an identifier must be present.
  4. For the second level Item elements:
    1. the "type" Descriptor element MUST always be used
    2. the "identifier" Descriptor element MUST be used in the metadata and object file Item elements; it is optional for the jump-off page Item element
    3. the "modification date" Descriptor element MAY be used in all of the second level Item elements.
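Some of the Descriptor rules above lend themselves to an automated check. The sketch below uses Python's standard library and covers only a subset of the rules: the two mandatory Descriptors of the top level Item, the mandatory type Descriptor of second level Items, and the pairing of modification date with identifier. The function names and the sample document are assumptions made for illustration, and the type is assumed to be expressed with rdf:type.

```python
import xml.etree.ElementTree as ET

NS = {
    "didl": "urn:mpeg:mpeg21:2002:02-DIDL-NS",
    "dii": "urn:mpeg:mpeg21:2002:01-DII-NS",
    "dcterms": "http://purl.org/dc/terms/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def has_statement(item, element_name):
    """True if a direct Descriptor of the Item holds a Statement with this element."""
    path = f"didl:Descriptor/didl:Statement/{element_name}"
    return item.find(path, NS) is not None


def check_descriptors(didl_root):
    """Return a list of rule violations found in a parsed DIDL element."""
    top = didl_root.find("didl:Item", NS)
    if top is None:
        return ["no top level Item"]
    problems = []
    if not has_statement(top, "dii:Identifier"):
        problems.append("top Item: identifier missing")
    if not has_statement(top, "dcterms:modified"):
        problems.append("top Item: modification date missing")
    for number, child in enumerate(top.findall("didl:Item", NS), start=1):
        if not has_statement(child, "rdf:type"):
            problems.append(f"Item {number}: type missing")
        if has_statement(child, "dcterms:modified") and not has_statement(
            child, "dii:Identifier"
        ):
            problems.append(f"Item {number}: modification date without identifier")
    return problems


# Deliberately incomplete sample document, for illustration only.
SAMPLE = """<didl:DIDL xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS"
    xmlns:dii="urn:mpeg:mpeg21:2002:01-DII-NS"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <didl:Item>
    <didl:Descriptor><didl:Statement mimeType="application/xml">
      <dii:Identifier>urn:nbn:nl:ui:13-6748398729821</dii:Identifier>
    </didl:Statement></didl:Descriptor>
    <didl:Item>
      <didl:Descriptor><didl:Statement mimeType="application/xml">
        <dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
      </didl:Statement></didl:Descriptor>
    </didl:Item>
  </didl:Item>
</didl:DIDL>"""

problems = check_descriptors(ET.fromstring(SAMPLE))
for problem in problems:
    print(problem)
```

A check of this kind complements schema validation: the DIDL schema accepts any number of Descriptors, so these profile-specific rules have to be verified separately.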

 

Example on level one

Code Block
xml
xml
<didl:DIDL ...>
<didl:Item>
<didl:Descriptor>...</didl:Descriptor> <!-- Identification, mandatory -->
<didl:Descriptor>...</didl:Descriptor> <!-- Modification date, mandatory -->
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
...
</didl:Item>
</didl:DIDL>
 

Example on level two

Object type added

Code Block
xml
xml
<didl:DIDL ...>
<didl:Item>
<didl:Item>
<didl:Descriptor>...</didl:Descriptor> <!-- Identification, mandatory -->
<didl:Descriptor>...</didl:Descriptor> <!-- Modification date, optional -->
<didl:Descriptor>...</didl:Descriptor> <!-- Type, mandatory -->
...
</didl:Item>
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
<didl:Item>...</didl:Item>
...
</didl:Item>
</didl:DIDL>
 


Apart from the Identifier, modified date and type, Descriptors with other semantic content can be used, see section 3.2.

Item 'Identifier' Statement

The first Descriptor contains the ID of the Item elements. This is used to uniquely identify the digital object (e.g. with an urn:nbn). This ID is wrapped in a Statement with a DII Identifier element. For example:

Code Block
xml
xml
 <didl:Item>
<didl:Item>
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<dii:Identifier>urn:nbn:nl:ui:13-6748398729821</dii:Identifier>
</didl:Statement>
</didl:Descriptor>
...
</didl:Item>
...
</didl:Item>

Remarks:

  1. In this example the root Item has a Persistent Identifier.
  2. The identifier MUST be a URI. Some repositories can make use of registered URIs, e.g., in the Netherlands the repositories can make use of National Bibliography Numbers. If a repository doesn't have access to registered URIs, UUIDs or TAGs can be used. More information about UUIDs can be found at http://en.wikipedia.org/wiki/UUID and for TAGs the best starting point is http://www.taguri.org/. TAGs for the repository of Tilburg University could be constructed in the following way. All TAGs start with 'tag:uvt.nl,<year>:<record id>'. <year> is the year that the compound object entered the repository. The URIs for MODS records can look like 'tag:uvt.nl,<year>:<record id>/mods' and the URIs for object files can look like 'tag:uvt.nl,<year>:<record id>/<file id>'.
  3. The identifiers of the root Item and the second level Item elements SHOULD NOT be equal to the OAI identifier or the DIDL document identifier.
  4. Identifiers should not change. A different identifier implies a different object.
  5. The namespace for dii SHOULD be declared in the DIDL tag.
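
As an illustration of remark 2, a repository without registered URIs could identify a compound object and its MODS record with TAGs roughly as sketched below. The year 2008 and the record id 12345 are hypothetical values chosen for this example only; the namespace URIs in the DIDL tag are the ones commonly published for DIDL and DII and should be checked against the schemas in use.

```xml
<didl:DIDL xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS"
           xmlns:dii="urn:mpeg:mpeg21:2002:01-DII-NS">
  <didl:Item>
    <didl:Descriptor>
      <didl:Statement mimeType="application/xml">
        <!-- hypothetical TAG for the compound object -->
        <dii:Identifier>tag:uvt.nl,2008:12345</dii:Identifier>
      </didl:Statement>
    </didl:Descriptor>
    <didl:Item>
      <didl:Descriptor>
        <didl:Statement mimeType="application/xml">
          <!-- hypothetical TAG for the MODS record of the same object -->
          <dii:Identifier>tag:uvt.nl,2008:12345/mods</dii:Identifier>
        </didl:Statement>
      </didl:Descriptor>
      <!-- type Descriptor and Component omitted for brevity -->
    </didl:Item>
  </didl:Item>
</didl:DIDL>
```

Because the year and record id are part of the TAG, the identifier stays stable as long as the repository never reuses a record id within the same year.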

Item 'modified' Statement

The second Descriptor contains a modification date. When something changes inside an Item, this modification date has to be updated. The modification date is mandatory in the top level Item and optional in the second level Items. It is expressed with the modified element from dcterms:

Code Block
xml
xml
 <didl:Item>
<didl:Item>
...
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
</didl:Statement>
</didl:Descriptor>
...
</didl:Item>
...
</didl:Item>

Remarks:

  1. Declare the dcterms namespace in the DIDL tag.
  2. The date is given in Zulu time (UTC), which means that it can be sorted as text.
  3. There can be only one Statement element in a Descriptor element, which means that dii:Identifier and dcterms:modified reside in separate Descriptor elements.

 

Item 'type' Statement

(In Maurice Vanderfeesten (2008), "MPEG21 DIDL Document Specification for repositories", version 2.3.1, the dip:ObjectType is used. Here, this is replaced by rdf:type as more appropriate. For compatibility with Driver and SURFshare both Descriptors can be used. In "MPEG21 DIDL Application Profile for NEEO Repositories" the URI is placed as a literal in the content of the rdf:type element. This is not in line with the use of rdf. Service providers should be aware of these different versions of expressing the type of a Digital Item.)

The third Descriptor contains the Digital Item type. This type is mainly used in the second level Item elements; however, it is also possible to type the top Digital Item.

Code Block
xml
xml
 <didl:Item>
<didl:Item>
...
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<rdf:type rdf:resource="info:eu-repo/semantics/descriptiveMetadata" />
</didl:Statement>
</didl:Descriptor>
...
</didl:Item>
...
</didl:Item>


See the next section for more information about the type statement.

Remarks:

  1. Declare the rdf namespace in the DIDL tag.
  2. The value of the rdf:resource attribute of the rdf:type element is a URI.
  3. For typing the Digital Items the info:eu-repo namespace is used.
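
The remarks in the preceding sections ask for the dii, dcterms and rdf namespaces to be declared in the DIDL tag. A minimal sketch of such a start-tag is given below. The namespace URIs shown are the ones commonly published for these vocabularies; treat them as assumptions and verify them against the schemas actually used.

```xml
<didl:DIDL
    xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS"
    xmlns:dii="urn:mpeg:mpeg21:2002:01-DII-NS"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <didl:Item>
    <!-- Descriptors (identifier, modified, type) and second level Items go here -->
  </didl:Item>
</didl:DIDL>
```

Declaring the prefixes once on the DIDL tag keeps the individual Statement elements short and avoids repeating the declarations in every Descriptor.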

The Top Item as a Compound Object

The top-Item element contains one mandatory Item sub-element that describes a Digital Item of type 'info:eu-repo/semantics/descriptiveMetadata'. There can be more Digital Items that are descriptive metadata or that are object files.
Optionally there can be an Item sub-element that describes a Digital Item of a third type: 'info:eu-repo/semantics/humanStartPage'. A Digital Item of this type is a jump-off-page, i.e., an intermediate HTML page that describes in a human-readable way which objects are involved. In this way a reader can be informed that a file is available in different formats such as PDF, MS Word or HTML, or that a dissertation consists of separate files (e.g. when the thesis consists of a set of previously published articles).

Code Block
xml
xml
 <didl:DIDL ...>
<didl:Item>
<didl:Item>...</didl:Item> <!-- metadata -->
<didl:Item>...</didl:Item> <!-- object files -->
<didl:Item>...</didl:Item> <!-- jump-off-page -->
</didl:Item>
</didl:DIDL>

The DIDL document contains at least one metadata Item element. This metadata can be in different formats: simple Dublin Core, qualified Dublin Core, MODS, MARC21, etc. The metadata can be included by-value or can be pointed to by-reference. One of the metadata Item elements MUST contain MODS, and the MODS record MUST be included by-value.

Code Block
xml
xml
 <didl:Item>

<didl:Item> <!-- one or many occurrences -->
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<rdf:type rdf:resource="info:eu-repo/semantics/descriptiveMetadata" />
</didl:Statement>
</didl:Descriptor>
...
</didl:Item>

<didl:Item> <!-- zero or many occurrences -->
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<rdf:type rdf:resource="info:eu-repo/semantics/objectFile" />
</didl:Statement>
</didl:Descriptor>
...
</didl:Item>

<didl:Item> <!-- zero or one occurrences -->
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<rdf:type rdf:resource="info:eu-repo/semantics/humanStartPage" />
</didl:Statement>
</didl:Descriptor>
...
</didl:Item>

</didl:Item>

The URIs will be processed case-insensitively. It is recommended to use camelCase. It is VERY important to use the exact combinations of characters, otherwise automatic processing will not be possible. To make this very clear, the following URIs are used:

  1. info:eu-repo/semantics/descriptiveMetadata (This Item occurs 1 or many times)
  2. info:eu-repo/semantics/objectFile (This Item occurs 0 or many times)
  3. info:eu-repo/semantics/humanStartPage (This Item occurs 0 or 1 time)

Remarks:

  1. The info:eu-repo namespace is used with the following syntax: info:eu-repo/type/identifier, like in info:lanl-repo. For more information see http://info-uri.info/registry/OAIHandler?verb=GetRecord&metadataPrefix=reg&identifier=info:lanl-repo/
  2. The semantics of the type statements is to indicate the type of the Digital Item.

Second level Items

Metadata Item

When the metadata are included by-value in an Item element, then the metadata form the content of a Resource element. The case of by-reference is described in the "Object File Item" section. The Resource element is contained by a Component element. If there are several representations of the same metadata record, e.g., a version in MODS and a version in MARCXML, it is recommended to use separate Item elements for each representation.
MODS is mandatory; the MODS records MUST be included by-value. Notice that the guidelines of Driver still mention Simple Dublin Core. To be compliant with both the present Application Profile and Driver, separate metadata Items must be included, one for MODS and the other for Dublin Core.

Code Block
xml
xml
collapsetrue
 <didl:Item>
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<rdf:type rdf:resource="info:eu-repo/semantics/descriptiveMetadata" />
</didl:Statement>
</didl:Descriptor>
<didl:Descriptor> <!-- This metadata instance has its own ID number -->
<didl:Statement mimeType="application/xml">
<dii:Identifier>urn:nbn:nl:ui:13-74836724783</dii:Identifier>
</didl:Statement>
</didl:Descriptor>
<didl:Descriptor> <!-- This record has its own Modification date -->
<didl:Statement mimeType="application/xml">
<dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
</didl:Statement>
</didl:Descriptor>
<didl:Component>
<didl:Resource mimeType="application/xml"> <!-- the MODS data -->
<mods:mods
xmlns:mods="http://www.loc.gov/mods/v3"
xsi:schemaLocation=
"http://www.loc.gov/mods/v3
http://www.loc.gov/standards/mods/v3/mods-3-3.xsd">
<mods:titleInfo>...</mods:titleInfo>
<mods:name>...</mods:name>
<mods:typeOfResource> ... </mods:typeOfResource>
...
</mods:mods>
</didl:Resource>
</didl:Component>
</didl:Item>

Remarks:

  1. If the modification date of the metadata changes, make sure that the modification date of the root-level Item is changed as well.
  2. Declare the mods namespace in the start-tag of MODS record, i.e., in the mods element.
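
To be compliant with Driver as well, a second metadata Item with Simple Dublin Core can be placed next to the MODS Item. The sketch below assumes the oai_dc container as used in OAI-PMH, and it assumes that the didl, dii and rdf prefixes are declared in the DIDL tag. The identifier and the title are hypothetical.

```xml
<didl:Item>
  <didl:Descriptor>
    <didl:Statement mimeType="application/xml">
      <rdf:type rdf:resource="info:eu-repo/semantics/descriptiveMetadata" />
    </didl:Statement>
  </didl:Descriptor>
  <didl:Descriptor> <!-- hypothetical identifier of the Dublin Core record -->
    <didl:Statement mimeType="application/xml">
      <dii:Identifier>tag:uvt.nl,2008:12345/dc</dii:Identifier>
    </didl:Statement>
  </didl:Descriptor>
  <didl:Component>
    <didl:Resource mimeType="application/xml"> <!-- the Dublin Core data, by-value -->
      <oai_dc:dc
          xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Example title</dc:title>
      </oai_dc:dc>
    </didl:Resource>
  </didl:Component>
</didl:Item>
```

As with MODS, the namespaces of the metadata format are declared on the root element of the record itself, so that the record remains usable when it is extracted from the DIDL document.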

Object File Item

An Object File Item contains a link to a digital object. This is 'by_ref', and the Item element has a type statement with an info:eu-repo/semantics/objectFile URI. An Object File Item can occur zero, one or more times.
When there are more representations of the same object file, then this can be handled in two ways:

  1. for each representation there is a separate Resource element within the same Component element. The Descriptors of the Item element apply to both representations. The representations only differ in medium type (mimetype).
  2. for each representation there is a separate Item element. The representations can differ not only in medium type but also in other respects as reflected in their respective Descriptors.
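
The first option could look roughly like the sketch below: one Item, one Component, and one Resource per representation. The second URL (report.doc) is a hypothetical location added for illustration, and the didl and rdf prefixes are assumed to be declared in the DIDL tag.

```xml
<didl:Item>
  <didl:Descriptor>
    <didl:Statement mimeType="application/xml">
      <rdf:type rdf:resource="info:eu-repo/semantics/objectFile" />
    </didl:Statement>
  </didl:Descriptor>
  <!-- further Descriptors (identifier, modified) apply to both representations -->
  <didl:Component>
    <!-- same content, two medium types -->
    <didl:Resource mimeType="application/pdf"
                   ref="http://my.server.nl/report.pdf"/>
    <didl:Resource mimeType="application/msword"
                   ref="http://my.server.nl/report.doc"/>
  </didl:Component>
</didl:Item>
```

The second option simply repeats the whole Item, which is the better choice as soon as the representations need different Descriptors, for example different identifiers.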


Additional Descriptor elements can be used to describe certain aspects of the object file:

  • To indicate whether the file is the published version (or an exact copy of it) or the author's version as accepted by the publisher for publication, a Descriptor with a type statement can be used. The proposal is to use the following URIs:
    • info:eu-repo/semantics/publishedVersion
    • info:eu-repo/semantics/authorVersion
      These proposed URIs are not yet officially registered.
  • To add descriptions like "Introduction", "Chapter1" and "Glossary", the dc:description element can be used within the Statement element of a Descriptor. The dc:description element can also be used for other (unstructured) information for which there is no specific element.
  • The date at which the object file is deposited can be expressed as a dcterms:dateSubmitted element.
  • If there is an embargo on a file, the dcterms:available element can be used with the date that the file will become available.
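
A version statement as proposed in the first bullet could be sketched as an extra Descriptor inside the Object File Item, next to the Descriptor that types the Item as an objectFile. This is only an illustration of the proposal; as noted above, the URI is not yet officially registered.

```xml
<didl:Descriptor> <!-- this file is the published version -->
  <didl:Statement mimeType="application/xml">
    <rdf:type rdf:resource="info:eu-repo/semantics/publishedVersion" />
  </didl:Statement>
</didl:Descriptor>
```
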
Code Block
xml
xml
collapsetrue
<didl:Item>
...

<!-- Below this line one can find links to one or more digital objects -->
<didl:Item> <!-- First Item for a File/Bitstream -->
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<rdf:type rdf:resource="info:eu-repo/semantics/objectFile" />
</didl:Statement>
</didl:Descriptor>
<didl:Descriptor> <!-- This Object Item has its own persistent ID -->
<didl:Statement mimeType="application/xml">
<dii:Identifier>urn:nbn:nl:ui:13-36724783</dii:Identifier>
</didl:Statement>
</didl:Descriptor>
<didl:Descriptor> <!-- This Item has its own Modification date -->
<didl:Statement mimeType="application/xml">
<dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
</didl:Statement>
</didl:Descriptor>

<didl:Component>
<didl:Resource
mimeType="application/pdf"
ref="http://my.server.nl/report.pdf"/>
</didl:Component>
</didl:Item>

<didl:Item> <!-- Second Item for a File/Bitstream -->
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<rdf:type rdf:resource="info:eu-repo/semantics/objectFile" />
</didl:Statement>
</didl:Descriptor>
<didl:Descriptor> <!-- This Object Item has its own persistent ID -->
<didl:Statement mimeType="application/xml">
<dii:Identifier>urn:nbn:nl:ui:13-36724784</dii:Identifier>
</didl:Statement>
</didl:Descriptor>
<didl:Descriptor> <!-- This Item has its own Modification date -->
<didl:Statement mimeType="application/xml">
<dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
</didl:Statement>
</didl:Descriptor>
<didl:Descriptor> <!-- this file is the appendix -->
<didl:Statement mimeType="application/xml">
<dc:description>Appendix</dc:description>
</didl:Statement>
</didl:Descriptor>
<didl:Component>
<didl:Resource
mimeType="application/pdf"
ref="http://my.server.nl/appendix.pdf"/>
</didl:Component>
</didl:Item>


<didl:Item> <!-- Third Item for a File/Bitstream -->
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<rdf:type rdf:resource="info:eu-repo/semantics/objectFile" />
</didl:Statement>
</didl:Descriptor>
<didl:Descriptor> <!-- This Object Item has its own persistent ID -->
<didl:Statement mimeType="application/xml">
<dii:Identifier>urn:nbn:nl:ui:13-36724785</dii:Identifier>
</didl:Statement>
</didl:Descriptor>
<didl:Descriptor> <!-- This Item has its own Modification date -->
<didl:Statement mimeType="application/xml">
<dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
</didl:Statement>
</didl:Descriptor>
<didl:Descriptor> <!-- deposit date -->
<didl:Statement mimeType="application/xml">
<dcterms:dateSubmitted>2010-12-01</dcterms:dateSubmitted>
</didl:Statement>
</didl:Descriptor>
<didl:Descriptor> <!-- embargo on file -->
<didl:Statement mimeType="application/xml">
<dcterms:available>2010-12-01</dcterms:available>
</didl:Statement>
</didl:Descriptor>
...
<didl:Component>
<didl:Resource
mimeType="application/vnd.ms-excel"
ref="http://my.server.nl/datasheets.xls"/>
</didl:Component>
</didl:Item>
</didl:Item>

In the above example, the Resource locations are not repeated within one Component element; instead, each Resource location is wrapped in its own Item element. The rationale behind this is that each Bitstream or file can have its own Identifier and its own Descriptors.

Remarks:

  1. The Object File Items should be in logical reading order! The Item with chapter 1 should be followed by the next sibling Item element that contains chapter 2, etc. This way the service provider can make a better presentation. Making the order explicit with sequence numbers will be specified in a future version of the specification.
  2. If there are important modifications in the Resource element or Descriptors of the Item, the modification date must be propagated to the modification date of the root level Item.
  3. Use a separate <Descriptor> <Statement> construction for each modified or Identifier element.
  4. The rule of thumb is that if a Bitstream or file has its own identifier, the wrapper is an Item element. To keep the possibility open for a Bitstream to have an Identifier, we use the Item element as default to wrap a resource location.
  5. Dates MUST be represented according to ISO 8601, more specifically in the formats defined in http://www.w3.org/TR/NOTE-datetime.

Jump-off-page / Human Start page Item

The third type of Item element contains a link to the jump-off page or intermediate page. This is done in the same way as for the Object File Item. This Item element is optional, and there should not be more than one Item of this type. The identifier and modified elements are optional.

Code Block
xml
xml
collapsetrue
<didl:Item>
...

<!-- Below this line: an Item with a link to one optional Intermediate page -->
<didl:Item>
<didl:Descriptor>
<didl:Statement mimeType="application/xml">
<rdf:type rdf:resource="info:eu-repo/semantics/humanStartPage" />
</didl:Statement>
</didl:Descriptor>
...
<didl:Component>
<didl:Resource
mimeType="text/html"
ref="http://my.server.nl/mypub.html"/>
</didl:Component>
</didl:Item>
</didl:Item>

Remarks:

  1. Using a Persistent Identifier for this HTML page means that you always need to resolve this page in the distant future. This is not recommended!

Example of a full OAI-PMH record with an MPEG-21 DIDL document

 

Note

===== TO DO =====