| Date | Version | Owner | Changelog |
|---|---|---|---|
| 18 April 2009 | 3.0 | SURFshare | Start of version 3.0. Based on the NEEO document "MPEG21 DIDL Application Profile for Institutional Repositories" version 0.4, which is based on "MPEG21 DIDL Document Specifications for repositories" version 2.3.1. See also history. Note that this is the first version of this document. The version number 3.0 was chosen to be higher than the latest versions of its predecessors, to indicate that this document is more recent and up to date. |
| 22 January 2008 | 2.3.1 | SURFshare | Minor change in the schema path. ISO changed the path .../dii.xsd/dii.xsd to .../dii/dii.xsd |
| 05 December 2007 | 2.3 | SURFshare | Changes to stress the use of Persistent Identifiers in the DIDL document. Addition of the ORE-compliant info:eu-repo namespace. |
| 23 May 2007 | 2.2.2 | SURFshare | Some changes and little tweaks. |
| 23 March 2007 | 2.2.1 | SURFshare | Added comment of Peter van Huisstede; small corrections in the example XML. |
| 6 March 2007 | 2.2 | SURFshare | The Committee for Complex Objects looked at this document and came up with more elegant improvements. Thanks to: Thomas Place, Renze Brandsma, Henk Ellermann, Peter van Huisstede and Ruud Bronmans. |
| 20 February 2007 | 2.1 | SURFshare | A closer look at the recommendations of Herbert vd Sompel gave more insight into the DIDL semantics, leading to a better XML specification. |
| 2 January 2007 | 2.0 | SURFshare | Fundamental change of element and attribute use, for better representation of the semantics. |
| 4 December 2006 | 1.1.2 | SURFshare | Translated into English for DRIVER |
| 11 July 2006 | 1.1.1 | SURFshare | A few typos were removed. |
| 10 July 2006 | 1.1 | SURFshare | Extension with: |
| 30 March 2006 | 1.0 | SURFshare | Initial document |
| | 0.4 | NEEO | |
| | 0.3 | NEEO | Only minor changes |
| | 0.2 | NEEO | |
| | 0.1 | NEEO | Changes with respect to version 2.3.1 of "MPEG21 DIDL Document Specifications for repositories" by Maurice Vanderfeesten (SURF) |

The abstract describes what the application profile is about. It should contain a problem definition, the standards described by the application profile and the goal of the application profile.
This document is an adaptation of "MPEG21 DIDL Application Profile for NEEO" version 0.4 (http://drcwww.uvt.nl/~place/neeo/didl%20application%20profile.0.4.doc). The latter was based on Maurice Vanderfeesten (2008), "MPEG21 DIDL Document Specifications for repositories" version 2.3.1 https://www.surfgroepen.nl/sites/oai/complexobjects/Shared%20Documents/DIDLdocumentSpecification_EN_v2.3.doc
This document describes the use of DIDL in the context of institutional repositories. The DIDL Document Specification was originally developed within the DARE programme of SURF as a solution for:
DIDL has been in use by the DARE community since the summer of 2006. One of the results is that the content of all Dutch repositories is now part of the E-Depot of the Royal Library, the national library of the Netherlands.
The digital objects that populate institutional repositories can be seen as compound objects that consist of parts or components that are also digital objects. In the DIDL model the basic entity is a Digital Item. The compound objects and their parts play the role of Digital Items in the model that underlies DIDL. In a DIDL document the Item elements represent the Digital Items. The top Item element, situated directly below the DIDL root element, represents the compound object. The Item elements that are children of the top Item element represent the objects that are part of the compound object. A part can itself be a compound object. In that case its parts are not described in the same DIDL document; a separate DIDL document describes that compound object with its parts. Although DIDL allows for a deeper hierarchy of Digital Items, this profile therefore restricts a DIDL document to two levels: the top Digital Item (the compound object) and the Digital Items that are its parts. This version of the application profile does not yet give guidelines for the case of a compound object that is part of another compound object.
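The two-level restriction can be checked mechanically. The following is a minimal sketch in Python, assuming the DIDL document is available as a parsed XML tree; the function names `item_depth` and `has_two_item_levels_at_most` are illustrative and not part of this profile.

```python
import xml.etree.ElementTree as ET

DIDL_NS = "urn:mpeg:mpeg21:2002:02-DIDL-NS"
ITEM = f"{{{DIDL_NS}}}Item"


def item_depth(element, depth=0):
    """Return the deepest nesting level of didl:Item elements below `element`."""
    deepest = depth
    for child in element:
        # Only Item elements add a level; other elements are passed through.
        next_depth = depth + 1 if child.tag == ITEM else depth
        deepest = max(deepest, item_depth(child, next_depth))
    return deepest


def has_two_item_levels_at_most(didl_root):
    """True when there is exactly one top Item and Items nest at most two deep."""
    top_items = didl_root.findall(ITEM)
    return len(top_items) == 1 and item_depth(didl_root) <= 2


# Hypothetical, heavily abbreviated DIDL document used only for illustration.
SAMPLE = """<didl:DIDL xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS">
  <didl:Item>
    <didl:Item/>
    <didl:Item/>
  </didl:Item>
</didl:DIDL>"""

print(has_two_item_levels_at_most(ET.fromstring(SAMPLE)))
```

A part that is itself a compound object would then be referenced from a separate DIDL document rather than nested as a third level.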
This profile distinguishes three types of digital objects: descriptive metadata, object files and jump-off pages. This list is extensible; other types can be added.
The figure below is a schematic representation of a DIDL document of a compound object that consists of one or more descriptive metadata records, zero or more object files and zero or one jump-off page. Metadata that apply to the metadata records, object files and jump-off pages can be placed in Descriptor elements within the respective Item elements. In the figure the most used Descriptors are shown. The list of Descriptor elements in an Item is extensible.
A digital object can have one or more representations. A representation is the thing that can be displayed on a computer screen or that can be printed. A representation MUST have a medium type (MIME type). In DIDL, representations are handled by the Resource element. A Resource is contained in a Component element, which in turn is a child of the Item element. There are two ways of including a representation in a Resource element. The first way is by-value: the representation itself is included as the content of the Resource element. This is the usual way to include metadata records formatted in XML. The second way is by-ref: the Resource element stays empty, and the representation is referred to by a URL that is the value of the ref attribute of the Resource element. Normally, the URL will point to a file in the repository.
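The difference between by-value and by-ref inclusion can be illustrated with a short Python sketch. It assumes an Item has already been parsed from a DIDL document; `resource_mode` and `describe_resources` are hypothetical helper names, and the sample Item is abbreviated.

```python
import xml.etree.ElementTree as ET

NS = {"didl": "urn:mpeg:mpeg21:2002:02-DIDL-NS"}


def resource_mode(resource):
    """Classify a didl:Resource element as 'by-ref', 'by-value' or 'empty'."""
    if resource.get("ref"):
        return "by-ref"
    # By-value: the representation is the element's own content.
    has_content = len(resource) > 0 or bool((resource.text or "").strip())
    return "by-value" if has_content else "empty"


def describe_resources(item):
    """Return (mimeType, mode) pairs for the Resources in the Item's Components."""
    return [
        (resource.get("mimeType"), resource_mode(resource))
        for resource in item.findall("didl:Component/didl:Resource", NS)
    ]


# Abbreviated sample Item, for illustration only.
SAMPLE = """<didl:Item xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS">
  <didl:Component>
    <didl:Resource mimeType="application/xml"><record>...</record></didl:Resource>
  </didl:Component>
  <didl:Component>
    <didl:Resource mimeType="application/pdf" ref="http://my.server.nl/report.pdf"/>
  </didl:Component>
</didl:Item>"""

for mime_type, mode in describe_resources(ET.fromstring(SAMPLE)):
    print(mime_type, mode)
```

A Resource reported as `empty`, or one with a missing `mimeType`, would not satisfy the requirements stated above.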
Each Digital Item MUST have an identifier, with the exception of jump-off pages, for which the identifier is optional. This identifier MUST be a URI. The URI of a Digital Item should be different from the URLs of its representations. The identifiers of the Digital Items must be persistent. The URLs of the representations and the medium types can change, while the identifier of the Digital Item stays the same. This allows, for example, replacing a file that can only be processed by an old-fashioned word processor with a version with the same content that can be read on all contemporary desktops. Likewise, a file can be moved to another location; the identifier of the Digital Item stays the same, indicating that it is still the same file. If the policy of a repository is to preserve the different representations of a Digital Item, then the repository is advised to treat the representations as separate Digital Items, each with its own persistent identifier. So it is possible that in one repository the PDF, the Word and the HTML versions of a publication are combined into one Digital Item, while in another repository they are treated as separate Digital Items. Another use of Digital Item identifiers is to relate Digital Items to each other.
The DIDL document is delivered as part of an OAI-PMH response. It is returned within an OAI record when didl is used as the value of the metadataPrefix argument. This allows the repository to serve the particular DIDL format that is described in this document.
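As an illustration of such a request, the sketch below builds an OAI-PMH GetRecord URL that asks for the didl format. The base URL and the OAI identifier are placeholders, and `didl_record_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def didl_record_url(base_url, oai_identifier):
    """Build an OAI-PMH GetRecord request URL with metadataPrefix=didl."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": oai_identifier,
        "metadataPrefix": "didl",
    })
    return f"{base_url}?{query}"


# Placeholder repository endpoint and record identifier.
print(didl_record_url("http://repository.example.org/oai",
                      "oai:repository.example.org:1234"))
```

The same metadataPrefix value applies to the ListRecords verb when harvesting a whole repository.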
Within the OAI XML structure, the DIDL resides within the metadata element. See below:
    <OAI-PMH ...>
      ...
      <request ... metadataPrefix="didl">
      ...
      <record>
        <header>...</header>
        <metadata>
          <didl:DIDL
              xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS"
              xmlns:dc="http://purl.org/dc/elements/1.1/"
              xmlns:dcterms="http://purl.org/dc/terms/"
              xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
              xmlns:dii="urn:mpeg:mpeg21:2002:01-DII-NS"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="
                urn:mpeg:mpeg21:2002:02-DIDL-NS
                http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-21_schema_files/did/didl.xsd
                urn:mpeg:mpeg21:2002:01-DII-NS
                http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-21_schema_files/dii/dii.xsd">
            ...
          </didl:DIDL>
        </metadata>
        <about>...</about>
      </record>
      ...
    </OAI-PMH>
Remarks:
The DIDL document has one top-level Item element, which contains several child Item elements. These child Item elements describe digital objects of three different types: descriptive metadata, object files and jump-off pages. The cardinality of each XML element is shown in brackets:
|
|
Item Descriptors provide information about the Digital Item. A Descriptor contains a Statement with information about the Item. For each "statement" a new Descriptor is used.
The top level Item element MUST contain two Descriptor elements. One Descriptor element for the (Persistent) Identifier and one Descriptor element for the modification date.
Example on level one |
|
|
Example on level two |
|
Apart from the Identifier, modified date and type, Descriptors with other semantic content can be used, see section 3.2.
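The requirement that the top-level Item carries an identifier and a modification date can be verified with a short check. This is a sketch under the assumption that both Descriptors are direct children of the top Item; `missing_top_level_descriptors` is an illustrative name.

```python
import xml.etree.ElementTree as ET

NS = {
    "didl": "urn:mpeg:mpeg21:2002:02-DIDL-NS",
    "dii": "urn:mpeg:mpeg21:2002:01-DII-NS",
    "dcterms": "http://purl.org/dc/terms/",
}


def missing_top_level_descriptors(didl_root):
    """Return the names of mandatory statements missing from the top Item."""
    top_item = didl_root.find("didl:Item", NS)
    if top_item is None:
        return ["didl:Item"]
    missing = []
    for name in ("dii:Identifier", "dcterms:modified"):
        # Only Descriptors that are direct children of the top Item count.
        found = top_item.find(f"didl:Descriptor/didl:Statement/{name}", NS)
        if found is None or not (found.text or "").strip():
            missing.append(name)
    return missing


# Abbreviated sample document, for illustration only.
SAMPLE = """<didl:DIDL xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS"
    xmlns:dii="urn:mpeg:mpeg21:2002:01-DII-NS"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <didl:Item>
    <didl:Descriptor>
      <didl:Statement mimeType="application/xml">
        <dii:Identifier>urn:nbn:nl:ui:13-6748398729821</dii:Identifier>
      </didl:Statement>
    </didl:Descriptor>
    <didl:Descriptor>
      <didl:Statement mimeType="application/xml">
        <dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
      </didl:Statement>
    </didl:Descriptor>
  </didl:Item>
</didl:DIDL>"""

print(missing_top_level_descriptors(ET.fromstring(SAMPLE)))
```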
The first Descriptor contains the identifier of the Item element. It is used to uniquely identify the digital object (e.g. with a urn:nbn). The identifier is wrapped in a Statement with a DII Identifier element. For example:
    <didl:Item>
      <didl:Item>
        <didl:Descriptor>
          <didl:Statement mimeType="application/xml">
            <dii:Identifier>urn:nbn:nl:ui:13-6748398729821</dii:Identifier>
          </didl:Statement>
        </didl:Descriptor>
        ...
      </didl:Item>
      ...
    </didl:Item>
Remarks:
The second Descriptor contains a modification date. When something changes inside an Item, this modification date has to be updated. The modification date is mandatory in the top-level Item and optional in the second-level Items. It is specified with the modified element from dcterms:
    <didl:Item>
      <didl:Item>
        ...
        <didl:Descriptor>
          <didl:Statement mimeType="application/xml">
            <dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
          </didl:Statement>
        </didl:Descriptor>
        ...
      </didl:Item>
      ...
    </didl:Item>
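A service provider could use the modification date to decide whether an Item has changed since its last harvest. The sketch below assumes the timestamp always has the UTC form used in the example (seconds precision with a trailing Z); other date forms would need extra handling. The function names are illustrative.

```python
from datetime import datetime, timezone


def parse_modified(value):
    """Parse a dcterms:modified value such as 2006-12-20T10:29:12Z (UTC)."""
    # Assumption: seconds precision and a literal 'Z' suffix, as in the example.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def changed_since(modified_value, last_harvest):
    """True when the Item was modified after the given (aware) datetime."""
    return parse_modified(modified_value) > last_harvest


last_harvest = datetime(2006, 12, 1, tzinfo=timezone.utc)
print(changed_since("2006-12-20T10:29:12Z", last_harvest))
```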
Remarks:
(In Maurice Vanderfeesten (2008), "MPEG21 DIDL Document Specification for repositories", version 2.3.1, the dip:ObjectType is used. Here, this is replaced by rdf:type as more appropriate. For compatibility with Driver and SURFshare both Descriptors can be used. In "MPEG21 DIDL Application Profile for NEEO Repositories" the URI is placed as a literal in the content of the rdf:type element. This is not in line with the use of rdf. Service providers should be aware of these different versions of expressing the type of a Digital Item.)
The third Descriptor contains the Digital Item type. This type is mainly used for the second-level Item elements; however, it is also possible to type the top Digital Item.
    <didl:Item>
      <didl:Item>
        ...
        <didl:Descriptor>
          <didl:Statement mimeType="application/xml">
            <rdf:type rdf:resource="info:eu-repo/semantics/descriptiveMetadata" />
          </didl:Statement>
        </didl:Descriptor>
        ...
      </didl:Item>
      ...
    </didl:Item>
See the next section for more information about the type statement.
Remarks:
The top Item element contains one mandatory Item sub-element that describes a Digital Item of type 'info:eu-repo/semantics/descriptiveMetadata'. There can be more Digital Items that are descriptive metadata or that are object files.
Optionally there can be an Item sub-element that describes a Digital Item of a third type: 'info:eu-repo/semantics/humanStartPage'. A Digital Item of this type is a jump-off page, i.e., an intermediate HTML page that describes in a human-readable way which objects are involved. In this way a reader can be informed that a file is available in different formats such as PDF, MS Word or HTML, or that a dissertation consists of separate files (e.g. when the thesis consists of a set of previously published articles).
    <didl:DIDL ...>
      <didl:Item>
        <didl:Item>...</didl:Item> <!-- metadata -->
        <didl:Item>...</didl:Item> <!-- object files -->
        <didl:Item>...</didl:Item> <!-- jump-off-page -->
      </didl:Item>
    </didl:DIDL>
The DIDL document contains at least one metadata Item element. This metadata can be in different formats: simple Dublin Core, qualified Dublin Core, MODS, MARC21, etc. The metadata can be included by-value or can be pointed to by-reference. One of the metadata Item elements MUST contain MODS, and the MODS record MUST be included by-value.
    <didl:Item>
      <didl:Item> <!-- one or many occurrences -->
        <didl:Descriptor>
          <didl:Statement mimeType="application/xml">
            <rdf:type rdf:resource="info:eu-repo/semantics/descriptiveMetadata" />
          </didl:Statement>
        </didl:Descriptor>
        ...
      </didl:Item>
      <didl:Item> <!-- zero or many occurrences -->
        <didl:Descriptor>
          <didl:Statement mimeType="application/xml">
            <rdf:type rdf:resource="info:eu-repo/semantics/objectFile" />
          </didl:Statement>
        </didl:Descriptor>
        ...
      </didl:Item>
      <didl:Item> <!-- zero or one occurrences -->
        <didl:Descriptor>
          <didl:Statement mimeType="application/xml">
            <rdf:type rdf:resource="info:eu-repo/semantics/humanStartPage" />
          </didl:Statement>
        </didl:Descriptor>
        ...
      </didl:Item>
    </didl:Item>
The URIs are processed case-insensitively. It is nevertheless recommended to use camelCase. It is VERY important to use exactly the characters shown, otherwise automatic processing will not be possible. To be explicit, the following URIs are used:
info:eu-repo/semantics/descriptiveMetadata
info:eu-repo/semantics/objectFile
info:eu-repo/semantics/humanStartPage
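Case-insensitive matching of the three type URIs that appear in this profile's examples, together with the cardinalities noted in the example above, can be sketched as follows. The helper names (`item_type`, `cardinality_problems`, `typed_item`, `top_item_with`) are hypothetical, and the sample Items are built only for demonstration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {
    "didl": "urn:mpeg:mpeg21:2002:02-DIDL-NS",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

# Lower-cased lookup keys, so that matching is case-insensitive.
KNOWN_TYPES = {
    "info:eu-repo/semantics/descriptivemetadata": "descriptiveMetadata",
    "info:eu-repo/semantics/objectfile": "objectFile",
    "info:eu-repo/semantics/humanstartpage": "humanStartPage",
}


def item_type(item):
    """Return the short type name of an Item, or None when it is not typed."""
    type_element = item.find("didl:Descriptor/didl:Statement/rdf:type", NS)
    if type_element is None:
        return None
    uri = type_element.get(RDF_RESOURCE, "")
    return KNOWN_TYPES.get(uri.strip().lower())


def cardinality_problems(top_item):
    """List violations of the cardinalities of the second-level Items."""
    counts = Counter(item_type(child) for child in top_item.findall("didl:Item", NS))
    problems = []
    if counts["descriptiveMetadata"] < 1:
        problems.append("no descriptiveMetadata Item")
    if counts["humanStartPage"] > 1:
        problems.append("more than one humanStartPage Item")
    return problems


def typed_item(uri):
    """Build the XML text of a second-level Item with only a type Descriptor."""
    return ('<didl:Item><didl:Descriptor><didl:Statement mimeType="application/xml">'
            f'<rdf:type rdf:resource="{uri}"/>'
            '</didl:Statement></didl:Descriptor></didl:Item>')


def top_item_with(*uris):
    """Build a small top-level Item for demonstration purposes."""
    xml = ('<didl:Item xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS" '
           'xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
           + "".join(typed_item(uri) for uri in uris) + "</didl:Item>")
    return ET.fromstring(xml)


sample = top_item_with("info:eu-repo/semantics/descriptiveMetadata",
                       "info:eu-repo/semantics/objectfile")
print(cardinality_problems(sample))
```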
Remarks:
When the metadata are included by-value in an Item element, then the metadata form the content of a Resource element. The case of by-reference is described in the "Object File Item" section. The Resource element is contained by a Component element. If there are several representations of the same metadata record, e.g., a version in MODS and a version in MARCXML, it is recommended to use separate Item elements for each representation.
MODS is mandatory; the MODS records MUST be included by-value. Notice that the guidelines of Driver still mention Simple Dublin Core. To be compliant with both the present Application Profile and Driver, separate metadata Items must be included, one for MODS and the other for Dublin Core.
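Whether a DIDL document carries a MODS record by-value can be tested with a few lines of Python. This is a minimal sketch; it only looks for a mods:mods element directly inside a Resource of a second-level Item and does not validate the MODS record itself. `has_mods_by_value` is an illustrative name.

```python
import xml.etree.ElementTree as ET

NS = {
    "didl": "urn:mpeg:mpeg21:2002:02-DIDL-NS",
    "mods": "http://www.loc.gov/mods/v3",
}


def has_mods_by_value(top_item):
    """True when a second-level Item holds a mods:mods element in a Resource."""
    return any(
        item.find("didl:Component/didl:Resource/mods:mods", NS) is not None
        for item in top_item.findall("didl:Item", NS)
    )


# Abbreviated sample, for illustration only.
SAMPLE = """<didl:Item xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS">
  <didl:Item>
    <didl:Component>
      <didl:Resource mimeType="application/xml">
        <mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
          <mods:titleInfo>...</mods:titleInfo>
        </mods:mods>
      </didl:Resource>
    </didl:Component>
  </didl:Item>
</didl:Item>"""

print(has_mods_by_value(ET.fromstring(SAMPLE)))
```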
    <didl:Item>
      <didl:Descriptor>
        <didl:Statement mimeType="application/xml">
          <rdf:type rdf:resource="info:eu-repo/semantics/descriptiveMetadata" />
        </didl:Statement>
      </didl:Descriptor>
      <didl:Descriptor> <!-- This metadata instance has its own ID number -->
        <didl:Statement mimeType="application/xml">
          <dii:Identifier>urn:nbn:nl:ui:13-74836724783</dii:Identifier>
        </didl:Statement>
      </didl:Descriptor>
      <didl:Descriptor> <!-- This record has its own Modification date -->
        <didl:Statement mimeType="application/xml">
          <dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
        </didl:Statement>
      </didl:Descriptor>
      <didl:Component>
        <didl:Resource mimeType="application/xml">
          <!-- the MODS data -->
          <mods:mods xmlns:mods="http://www.loc.gov/mods/v3"
              xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-3.xsd">
            <mods:titleInfo>...</mods:titleInfo>
            <mods:name>...</mods:name>
            <mods:typeOfResource> ... </mods:typeOfResource>
            ...
          </mods:mods>
        </didl:Resource>
      </didl:Component>
    </didl:Item>
Remarks:
An Object File Item contains a link to a digital object. The link is by-ref, and the Item element has a type statement with the info:eu-repo/semantics/objectFile URI. An Object File Item can occur zero, one or more times.
When there are more representations of the same object file, then this can be handled in two ways:
Additional Descriptor elements can be used to describe certain aspects of the object file:
    <didl:Item>
      ...
      <!-- Below this line one can find links to one or more digital objects -->
      <didl:Item> <!-- First Item for a File/Bitstream -->
        <didl:Descriptor>
          <didl:Statement mimeType="application/xml">
            <rdf:type rdf:resource="info:eu-repo/semantics/objectFile" />
          </didl:Statement>
        </didl:Descriptor>
        <didl:Descriptor> <!-- This Object Item has its own persistent ID -->
          <didl:Statement mimeType="application/xml">
            <dii:Identifier>urn:nbn:nl:ui:13-36724783</dii:Identifier>
          </didl:Statement>
        </didl:Descriptor>
        <didl:Descriptor> <!-- This Item has its own Modification date -->
          <didl:Statement mimeType="application/xml">
            <dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
          </didl:Statement>
        </didl:Descriptor>
        <didl:Component>
          <didl:Resource mimeType="application/pdf" ref="http://my.server.nl/report.pdf"/>
        </didl:Component>
      </didl:Item>
      <didl:Item> <!-- Second Item for a File/Bitstream -->
        <didl:Descriptor>
          <didl:Statement mimeType="application/xml">
            <rdf:type rdf:resource="info:eu-repo/semantics/objectFile" />
          </didl:Statement>
        </didl:Descriptor>
        <didl:Descriptor> <!-- This Object Item has its own persistent ID -->
          <didl:Statement mimeType="application/xml">
            <dii:Identifier>urn:nbn:nl:ui:13-36724784</dii:Identifier>
          </didl:Statement>
        </didl:Descriptor>
        <didl:Descriptor> <!-- This Item has its own Modification date -->
          <didl:Statement mimeType="application/xml">
            <dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
          </didl:Statement>
        </didl:Descriptor>
        <didl:Descriptor> <!-- this file is the appendix -->
          <didl:Statement mimeType="application/xml">
            <dc:description>Appendix</dc:description>
          </didl:Statement>
        </didl:Descriptor>
        ...
        <didl:Component>
          <didl:Resource mimeType="application/pdf" ref="http://my.server.nl/appendix.pdf"/>
        </didl:Component>
      </didl:Item>
      <didl:Item> <!-- Third Item for a File/Bitstream -->
        <didl:Descriptor>
          <didl:Statement mimeType="application/xml">
            <rdf:type rdf:resource="info:eu-repo/semantics/objectFile" />
          </didl:Statement>
        </didl:Descriptor>
        <didl:Descriptor> <!-- This Object Item has its own persistent ID -->
          <didl:Statement mimeType="application/xml">
            <dii:Identifier>urn:nbn:nl:ui:13-36724785</dii:Identifier>
          </didl:Statement>
        </didl:Descriptor>
        <didl:Descriptor> <!-- This Item has its own Modification date -->
          <didl:Statement mimeType="application/xml">
            <dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
          </didl:Statement>
        </didl:Descriptor>
        <didl:Descriptor> <!-- deposit date -->
          <didl:Statement mimeType="application/xml">
            <dcterms:issued>2010-12-01</dcterms:issued>
          </didl:Statement>
        </didl:Descriptor>
        <didl:Descriptor> <!-- embargo on file -->
          <didl:Statement mimeType="application/xml">
            <dcterms:available>2010-12-01</dcterms:available>
          </didl:Statement>
        </didl:Descriptor>
        ...
        <didl:Component>
          <didl:Resource mimeType="application/vnd.ms-excel" ref="http://my.server.nl/datasheets.xls"/>
        </didl:Component>
      </didl:Item>
    </didl:Item>
In the above example, the Resource locations are not repeated within a single Component element; instead, each Resource location is wrapped in its own Item element. The rationale is that each bitstream or file can then have its own Identifier and its own Descriptors.
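Because every file sits in its own Item, a service provider can pair each file location with the identifier of that Item. The following sketch assumes the structure shown in the example above; `object_files` is an illustrative helper name and the sample is abbreviated.

```python
import xml.etree.ElementTree as ET

NS = {
    "didl": "urn:mpeg:mpeg21:2002:02-DIDL-NS",
    "dii": "urn:mpeg:mpeg21:2002:01-DII-NS",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"
OBJECT_FILE = "info:eu-repo/semantics/objectfile"  # lower-cased for comparison


def object_files(top_item):
    """Return (identifier, mimeType, url) for every by-ref object file."""
    files = []
    for item in top_item.findall("didl:Item", NS):
        type_element = item.find("didl:Descriptor/didl:Statement/rdf:type", NS)
        if type_element is None:
            continue
        if type_element.get(RDF_RESOURCE, "").strip().lower() != OBJECT_FILE:
            continue
        identifier = item.findtext(
            "didl:Descriptor/didl:Statement/dii:Identifier", default=None, namespaces=NS)
        for resource in item.findall("didl:Component/didl:Resource", NS):
            files.append((identifier, resource.get("mimeType"), resource.get("ref")))
    return files


# Abbreviated sample, for illustration only.
SAMPLE = """<didl:Item xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS"
    xmlns:dii="urn:mpeg:mpeg21:2002:01-DII-NS"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <didl:Item>
    <didl:Descriptor><didl:Statement mimeType="application/xml">
      <rdf:type rdf:resource="info:eu-repo/semantics/descriptiveMetadata"/>
    </didl:Statement></didl:Descriptor>
  </didl:Item>
  <didl:Item>
    <didl:Descriptor><didl:Statement mimeType="application/xml">
      <rdf:type rdf:resource="info:eu-repo/semantics/objectFile"/>
    </didl:Statement></didl:Descriptor>
    <didl:Descriptor><didl:Statement mimeType="application/xml">
      <dii:Identifier>urn:nbn:nl:ui:13-36724783</dii:Identifier>
    </didl:Statement></didl:Descriptor>
    <didl:Component>
      <didl:Resource mimeType="application/pdf" ref="http://my.server.nl/report.pdf"/>
    </didl:Component>
  </didl:Item>
</didl:Item>"""

for identifier, mime_type, url in object_files(ET.fromstring(SAMPLE)):
    print(identifier, mime_type, url)
```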
Remarks:
The third type of Item element contains a link to the jump-off page (intermediate page). The link is made in the same way as for the Object File Item element. This Item element is optional, and there should not be more than one Item of this type. Its identifier and modified elements are optional.
    <didl:Item>
      ...
      <!-- Below this line; an Item with a link to one optional Intermediate page -->
      <didl:Item>
        <didl:Descriptor>
          <didl:Statement mimeType="application/xml">
            <rdf:type rdf:resource="info:eu-repo/semantics/humanStartPage" />
          </didl:Statement>
        </didl:Descriptor>
        ...
        <didl:Component>
          <didl:Resource mimeType="text/html" ref="http://my.server.nl/mypub.html"/>
        </didl:Component>
      </didl:Item>
    </didl:Item>
Remarks:
===== TO DO ===== |