The DIDL XML container, as defined in DRIVER, is a document with one top-level Item element. The Item contains several child Item elements. These child Item elements come in three different types. The cardinality of each XML element is shown between square brackets:
The DIDL root element has one attribute, DIDLDocumentId. This attribute identifies the DIDL wrapper as an autonomous entity. The identifier does NOT identify the intellectual work; it identifies the serialisation of the DIDL XML.
<didl:DIDL |
The DIDLDocumentId attribute contains the ID of the DIDL wrapper. This CAN be the same as the OAI identifier that is used to retrieve a record. The DIDL wrapper can be used as an autonomous entity outside the OAI-PMH context; a DIDL is therefore not the same 'thing' as an OAI record. Persistent Identifiers assigned to digital objects are expected to be in demand in the future (they are mandatory for the OAI-ORE project). For libraries it is recommended to use urn:nbn:{country code}:{isil library code}-{object id}. The ISIL library code is defined in ISO/NP 15511: International Standard Identifier for Libraries and Related Organizations (ISIL), http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=52666. The {object id} could be the database number. It is recommended to store this number in a separate field rather than generate it automatically from the database ID, because a future database update may change these numbers and persistence would then be lost.
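The recommendation to keep the object id in its own stored field can be illustrated with a minimal Python sketch. The function name, the record layout and the values `xx`, `LIB01` and `12345` are hypothetical; only the pattern urn:nbn:{country code}:{isil library code}-{object id} comes from the text.

```python
def build_didl_document_id(country_code: str, isil_code: str, object_id: str) -> str:
    """Assemble urn:nbn:{country code}:{isil library code}-{object id}."""
    parts = {"country_code": country_code, "isil_code": isil_code, "object_id": object_id}
    for name, value in parts.items():
        if not value or not value.strip():
            raise ValueError(f"{name} must not be empty")
    return f"urn:nbn:{country_code.strip()}:{isil_code.strip()}-{object_id.strip()}"


# Hypothetical record: the object id is read from a dedicated, stored field,
# not derived from the auto-generated database key, so it survives a migration.
record = {"db_key": 9001, "persistent_object_id": "12345"}
print(build_didl_document_id("xx", "LIB01", record["persistent_object_id"]))
```

Because the identifier is built only from stored values, re-running the function after a database update yields the same string.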
The Item elements can OPTIONALLY contain two or three Descriptor elements. One Descriptor element gives the modification date of the Item element. To compare harvested copies of the same Item element by modification date, an identifier must also be present.
Example on level one:
<didl:DIDL ...> |
Example on level two, with the object type added:
<didl:Item> <!-- Level 1 Root Item --> |
The first Descriptor contains the ID of the Item element. This is mostly used to uniquely identify the digital object (e.g. with a DOI). The ID is wrapped in a Statement with a DII Identifier element. For example:
<didl:Item> |
The second Descriptor contains a modification date. Whenever something changes inside an Item, this modification date has to be updated. The modification date is specified by the modified element from the dcterms namespace:
<didl:Item> |
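A harvester that receives two copies of an Item could use the Identifier and modified Descriptors to decide which copy is newer. The following is a minimal sketch, not a prescribed procedure. The namespace URIs, the function names and the second timestamp are assumptions; the identifier and the first timestamp are sample values that appear elsewhere in this section.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace URIs are assumptions; the text above does not quote them.
NS = {
    "didl": "urn:mpeg:mpeg21:2002:02-DIDL-NS",
    "dii": "urn:mpeg:mpeg21:2002:01-DII-NS",
    "dcterms": "http://purl.org/dc/terms/",
}

ITEM_TEMPLATE = """
<didl:Item xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS"
           xmlns:dii="urn:mpeg:mpeg21:2002:01-DII-NS"
           xmlns:dcterms="http://purl.org/dc/terms/">
  <didl:Descriptor>
    <didl:Statement mimeType="application/xml">
      <dii:Identifier>{identifier}</dii:Identifier>
    </didl:Statement>
  </didl:Descriptor>
  <didl:Descriptor>
    <didl:Statement mimeType="application/xml">
      <dcterms:modified>{modified}</dcterms:modified>
    </didl:Statement>
  </didl:Descriptor>
</didl:Item>
"""


def read_id_and_modified(item_xml):
    """Return (identifier, modification datetime in UTC) for one Item."""
    item = ET.fromstring(item_xml.strip())
    path = "didl:Descriptor/didl:Statement/"
    identifier = item.findtext(path + "dii:Identifier", namespaces=NS)
    modified = item.findtext(path + "dcterms:modified", namespaces=NS)
    if identifier is None or modified is None:
        raise ValueError("Item lacks an Identifier or a modified Descriptor")
    stamp = datetime.strptime(modified.strip(), "%Y-%m-%dT%H:%M:%SZ")
    return identifier.strip(), stamp.replace(tzinfo=timezone.utc)


def newer_copy(first_xml, second_xml):
    """Return whichever copy of the same Item was modified most recently."""
    first_id, first_stamp = read_id_and_modified(first_xml)
    second_id, second_stamp = read_id_and_modified(second_xml)
    if first_id != second_id:
        raise ValueError("Items have different identifiers and cannot be compared")
    return first_xml if first_stamp >= second_stamp else second_xml


old = ITEM_TEMPLATE.format(identifier="info:doi/10.1705/74836724783",
                           modified="2006-12-20T10:29:12Z")
new = ITEM_TEMPLATE.format(identifier="info:doi/10.1705/74836724783",
                           modified="2007-01-15T08:00:00Z")  # hypothetical later edit
print(read_id_and_modified(newer_copy(old, new))[1].isoformat())
```

The identifier check comes first because a date comparison is only meaningful between copies of the same Item.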
The third Descriptor contains the object type. The object type appears on the second level of Item elements. In other words, it applies only to child Item elements of the root Item. The object type is specified by the ObjectType element from the MPEG-21 Digital Item Processing (DIP) namespace, which specifies an architecture pertaining to the dissemination of Digital Item Documents (DIDs).
<didl:Item> |
The ObjectType statement is elaborated further in the section "Compound Element: representation of the complex work".
The top-level Item element contains at least two mandatory child Item elements, distinguished by their ObjectType. These typed Items are expressions of the root Item: one for the metadata and one for the digital object file (e.g. a PDF) described by the metadata. Optionally there can be a third type of Item for a jump-off page. The jump-off page is an intermediate HTML page used for human-readable presentation when an Item has more than one digital object file. This situation typically occurs with theses that have separate object files (for example, when the thesis consists of a set of previously published articles). It also occurs when the content provider has PDF, MS Word DOC and HTML versions of the same article.
<didl:DIDL ...> |
The first Item contains the metadata as Unqualified Dublin Core (DC). This is mandatory, and it is normally supplied in the OAI_DC format according to the DRIVER metadata guidelines that belong to a Digital Item Processing architecture. The second Item (or Items) contains links to the digital objects, and the third Item contains a link to a jump-off page.
<didl:Item> |
The URIs are processed case-insensitively. It is recommended to write them in camelCase. It is VERY important to use the exact combinations of characters, otherwise automatic processing will not be possible. To be explicit, the following URIs are used: info:eu-repo/semantics/descriptiveMetadata, info:eu-repo/semantics/objectFile and info:eu-repo/semantics/humanStartPage. See also:
http://info-uri.info/registry/OAIHandler?verb=GetRecord&metadataPrefix=reg&identifier=info:eu-repo/
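A service provider might normalise incoming ObjectType values before comparing them. The sketch below assumes the three ObjectType URIs that appear in the examples of this section, and it assumes that surrounding whitespace may be ignored; the function name is hypothetical.

```python
# The three ObjectType URIs used in the examples of this section,
# written in the recommended camelCase form.
CANONICAL_OBJECT_TYPES = (
    "info:eu-repo/semantics/descriptiveMetadata",
    "info:eu-repo/semantics/objectFile",
    "info:eu-repo/semantics/humanStartPage",
)

# Lower-cased lookup table, so that matching is case-insensitive.
_LOOKUP = {uri.lower(): uri for uri in CANONICAL_OBJECT_TYPES}


def canonical_object_type(raw: str):
    """Return the camelCase URI for a known ObjectType, or None if unknown.

    Stripping surrounding whitespace is an assumption of this sketch.
    """
    return _LOOKUP.get(raw.strip().lower())


print(canonical_object_type(" info:eu-repo/semantics/DESCRIPTIVEMETADATA "))
print(canonical_object_type("info:eu-repo/semantics/objectFiles"))
```

A value that differs by anything other than letter case, such as the misspelt second input, is not recognised, which matches the warning about exact character combinations.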
The first Item ObjectType element contains the metadata. The metadata is placed in a Resource element. Every Resource element carries the namespace of the metadata format that has been used, so that service providers can recognise the format. According to the OAI protocol, 'oai_dc' is mandatory. For ease of implementation one can therefore use OAI_DC as the metadata, since OAI_DC is a basic requirement of OAI-PMH. Every metadata Item can optionally have its own Identifier and modified element, each in a Descriptor element:
<didl:Item>
  <didl:Descriptor>
    <didl:Statement mimeType="application/xml">
      <dip:ObjectType>info:eu-repo/semantics/descriptiveMetadata</dip:ObjectType>
    </didl:Statement>
  </didl:Descriptor>
  <!-- (1) -->
  <didl:Descriptor> <!-- This metadata instance has its own ID number -->
    <didl:Statement mimeType="application/xml">
      <dii:Identifier>info:doi/10.1705/74836724783</dii:Identifier>
    </didl:Statement>
  </didl:Descriptor>
  <!-- (2) -->
  <didl:Descriptor> <!-- This record has its own Modification date -->
    <didl:Statement mimeType="application/xml">
      <dcterms:modified>2006-12-20T10:29:12Z</dcterms:modified>
    </didl:Statement>
  </didl:Descriptor>
  <didl:Component>
    <!-- (3) -->
    <didl:Resource mimeType="application/xml"> <!-- the DC data -->
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
        <dc:creator>...</dc:creator>
        <dc:creator>...</dc:creator>
        <dc:title> ... </dc:title>
        ...
      </oai_dc:dc>
    </didl:Resource>
  </didl:Component>
</didl:Item>
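On the consuming side, a service provider could read the Dublin Core fields out of a descriptiveMetadata Item roughly as follows. This is a sketch under assumptions: the DIDL and DIP namespace URIs are not quoted in the text, the function name is hypothetical, and the creator and title values are placeholders.

```python
import xml.etree.ElementTree as ET

# The didl and dip namespace URIs are assumptions; oai_dc and dc are taken
# from the Resource example in this section.
NS = {
    "didl": "urn:mpeg:mpeg21:2002:02-DIDL-NS",
    "dip": "urn:mpeg:mpeg21:2005:01-DIP-NS",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
DESCRIPTIVE_METADATA = "info:eu-repo/semantics/descriptiveMetadata"

SAMPLE_ITEM = """
<didl:Item xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS"
           xmlns:dip="urn:mpeg:mpeg21:2005:01-DIP-NS">
  <didl:Descriptor>
    <didl:Statement mimeType="application/xml">
      <dip:ObjectType>info:eu-repo/semantics/descriptiveMetadata</dip:ObjectType>
    </didl:Statement>
  </didl:Descriptor>
  <didl:Component>
    <didl:Resource mimeType="application/xml">
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:creator>Example, A.</dc:creator>
        <dc:creator>Sample, B.</dc:creator>
        <dc:title>A placeholder title</dc:title>
      </oai_dc:dc>
    </didl:Resource>
  </didl:Component>
</didl:Item>
"""


def read_descriptive_metadata(item_xml):
    """Return the dc:title and dc:creator values of a descriptiveMetadata Item."""
    item = ET.fromstring(item_xml.strip())
    object_type = item.findtext(
        "didl:Descriptor/didl:Statement/dip:ObjectType", default="", namespaces=NS
    )
    if object_type.strip().lower() != DESCRIPTIVE_METADATA.lower():
        raise ValueError("not a descriptiveMetadata Item")
    dc_root = item.find("didl:Component/didl:Resource/oai_dc:dc", NS)
    if dc_root is None:
        raise ValueError("no oai_dc:dc element inside the Resource")
    return {
        "titles": [e.text.strip() for e in dc_root.findall("dc:title", NS) if e.text],
        "creators": [e.text.strip() for e in dc_root.findall("dc:creator", NS) if e.text],
    }


print(read_descriptive_metadata(SAMPLE_ITEM))
```

The metadata format is recognised here purely by the namespace of the element inside the Resource, which is why the namespace declaration on `oai_dc:dc` matters.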
The second Item ObjectType contains a link to one digital object. The object is always included "by-reference" to limit the file size when the DIDL is used for metadata transfer purposes. ("By-value" inclusion with base64 encoding is possible, but it increases the file size and touches on the issue of ownership; it is not shown here.) The Item element has an ObjectType statement with an info:eu-repo/semantics/objectFile URI. An objectFile Item can occur more than once. See the following:
<didl:Item>
  ...
  <!-- Below this line one can find links to one or more digital objects -->
  <didl:Item> <!-- First Item for a File/Bitstream -->
    <didl:Descriptor>
      <didl:Statement mimeType="application/xml">
        <dip:ObjectType>info:eu-repo/semantics/objectFile</dip:ObjectType>
      </didl:Statement>
    </didl:Descriptor>
    ...
    <didl:Component>
      <didl:Resource mimeType="application/pdf" ref="http://my.server.nl/report.pdf"/>
    </didl:Component>
  </didl:Item>
  <didl:Item> <!-- Second Item for a File/Bitstream -->
    <didl:Descriptor>
      <didl:Statement mimeType="application/xml">
        <dip:ObjectType>info:eu-repo/semantics/objectFile</dip:ObjectType>
      </didl:Statement>
    </didl:Descriptor>
    ...
    <didl:Component>
      <didl:Resource mimeType="application/pdf" ref="http://my.server.nl/appendix.pdf"/>
    </didl:Component>
  </didl:Item>
  <didl:Item> <!-- Third Item for a File/Bitstream -->
    <didl:Descriptor>
      <didl:Statement mimeType="application/xml">
        <dip:ObjectType>info:eu-repo/semantics/objectFile</dip:ObjectType>
      </didl:Statement>
    </didl:Descriptor>
    ...
    <didl:Component>
      <didl:Resource mimeType="application/vnd.ms-excel" ref="http://my.server.nl/datasheets.xls"/>
    </didl:Component>
  </didl:Item>
</didl:Item>
As the above example shows, the Resource locations are not placed in several Components within one Item; instead, each Resource location is wrapped in its own Item element. The rationale is that each bitstream or file can then have its own Identifier. At the three dots "..." in the examples one may place the Identifier and modified Descriptors, in the same way as for the metadata Item.
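Because every file sits in its own Item, a harvester can collect the file links by walking the child Items of the root Item. The sketch below assumes the DIDL and DIP namespace URIs, which are not quoted in the text; `file_item` and `list_object_files` are hypothetical helper names, and the two URLs are the sample locations used in this section.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions; the text above does not quote them.
NS = {"didl": "urn:mpeg:mpeg21:2002:02-DIDL-NS", "dip": "urn:mpeg:mpeg21:2005:01-DIP-NS"}
OBJECT_FILE = "info:eu-repo/semantics/objectFile"


def file_item(ref, mime_type):
    """Return one objectFile Item as an XML string (sample data helper)."""
    return (
        "<didl:Item><didl:Descriptor>"
        '<didl:Statement mimeType="application/xml">'
        f"<dip:ObjectType>{OBJECT_FILE}</dip:ObjectType>"
        "</didl:Statement></didl:Descriptor>"
        f'<didl:Component><didl:Resource mimeType="{mime_type}" ref="{ref}"/>'
        "</didl:Component></didl:Item>"
    )


SAMPLE = (
    '<didl:Item xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS" '
    'xmlns:dip="urn:mpeg:mpeg21:2005:01-DIP-NS">'
    + file_item("http://my.server.nl/report.pdf", "application/pdf")
    + file_item("http://my.server.nl/appendix.pdf", "application/pdf")
    + "</didl:Item>"
)


def list_object_files(root_item_xml):
    """Return (ref, mimeType) for every Resource in every objectFile child Item."""
    root = ET.fromstring(root_item_xml)
    files = []
    for child in root.findall("didl:Item", NS):
        object_type = child.findtext(
            "didl:Descriptor/didl:Statement/dip:ObjectType", default="", namespaces=NS
        )
        # ObjectType URIs are compared case-insensitively.
        if object_type.strip().lower() != OBJECT_FILE.lower():
            continue
        for resource in child.findall("didl:Component/didl:Resource", NS):
            files.append((resource.get("ref"), resource.get("mimeType")))
    return files


for ref, mime_type in list_object_files(SAMPLE):
    print(mime_type, ref)
```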
The third ObjectType Item element contains a link to the jump-off page or intermediate page. This is done in the same way as for the objectFile Item element. Currently this is restricted to one Item of this type, and no identifier or modification date elements are present. This Item element is optional:
<didl:Item>
  ...
  <!-- Below this line: an Item with a link to one optional intermediate page -->
  <didl:Item>
    <didl:Descriptor>
      <didl:Statement mimeType="application/xml">
        <dip:ObjectType>info:eu-repo/semantics/humanStartPage</dip:ObjectType>
      </didl:Statement>
    </didl:Descriptor>
    ...
    <didl:Component>
      <didl:Resource mimeType="text/html" ref="http://my.server.nl/mypub.html"/>
    </didl:Component>
  </didl:Item>
</didl:Item>
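The restriction to a single jump-off page can be checked while reading the link. The following sketch assumes the DIDL and DIP namespace URIs, and the function name is hypothetical. The sample uses text/html, the registered media type for HTML pages; the function itself does not depend on the mimeType value.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions; the text above does not quote them.
NS = {"didl": "urn:mpeg:mpeg21:2002:02-DIDL-NS", "dip": "urn:mpeg:mpeg21:2005:01-DIP-NS"}
START_PAGE = "info:eu-repo/semantics/humanStartPage"

SAMPLE = """
<didl:Item xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS"
           xmlns:dip="urn:mpeg:mpeg21:2005:01-DIP-NS">
  <didl:Item>
    <didl:Descriptor>
      <didl:Statement mimeType="application/xml">
        <dip:ObjectType>info:eu-repo/semantics/humanStartPage</dip:ObjectType>
      </didl:Statement>
    </didl:Descriptor>
    <didl:Component>
      <didl:Resource mimeType="text/html" ref="http://my.server.nl/mypub.html"/>
    </didl:Component>
  </didl:Item>
</didl:Item>
"""


def find_human_start_page(root_item_xml):
    """Return the jump-off page URL, or None when the optional Item is absent."""
    root = ET.fromstring(root_item_xml.strip())
    refs = []
    for child in root.findall("didl:Item", NS):
        object_type = child.findtext(
            "didl:Descriptor/didl:Statement/dip:ObjectType", default="", namespaces=NS
        )
        if object_type.strip().lower() == START_PAGE.lower():
            resource = child.find("didl:Component/didl:Resource", NS)
            if resource is not None:
                refs.append(resource.get("ref"))
    if len(refs) > 1:
        # The text restricts this type to one Item per root Item.
        raise ValueError("more than one humanStartPage Item found")
    return refs[0] if refs else None


print(find_human_start_page(SAMPLE))
```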