Also, the choice of identifiers poses problems: according to the OAI-PMH specification, the identifier within the DC metadata set must link to the described document. Understood as metadata, the data contained in one <context-object> or in a <context-objects> aggregation is best described as metadata of the usage events in a given time frame. Those usage events, however, usually do not have identifiers of their own yet. To comply with DC requirements as well, identifiers would therefore have to be generated for those usage events (ID2 in the excerpt above). So far, however, there seems to be no immediate use case for such identifiers. Therefore, in the context of these guidelines, offering DC metadata is not required.
Usage of Sets (see OAI-PMH, 2.7.2)
OAI-PMH optionally allows for structuring the offered data in "sets" to support selective harvesting of the data. Currently, this possibility is not further specified in these guidelines. Future refinements may use this feature, e. g. for selecting usage data for certain services. Provenance information is already included in the Context Objects.
Datestamps, Granularity (see OAI-PMH, 2.7.1; also compare the notes about datestamps in the OAI-PMH record header versus datestamps within the Context Objects)
The OAI-PMH specification allows for either exact-to-the-second or exact-to-the-day granularity for record header datestamps. Data providers may choose one of these possibilities. The service provider will most certainly rely on overlapping harvesting, i. e. the most recent datestamp of the harvested data is used as the "from" parameter for the next OAI-PMH query. Thus, the data provider will deliver some records that have been harvested before. Duplicate records are matched by their identifiers (those in the OAI-PMH record header) and are silently discarded if their datestamp has not been renewed (see notes below on deletion tracking). It is strongly recommended to implement exact-to-the-second datestamps to keep the redundancy of the transferred data as low as possible.
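The overlapping harvesting described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function names and the in-memory dictionary used as a data store are assumptions made for the example, and record payloads are left out.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple


def parse_datestamp(value: str) -> datetime:
    """Parse an OAI-PMH UTC datestamp in day or second granularity."""
    fmt = "%Y-%m-%dT%H:%M:%SZ" if "T" in value else "%Y-%m-%d"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


def merge_harvest(store: Dict[str, str],
                  harvested: List[Tuple[str, str]]) -> List[str]:
    """Merge (identifier, datestamp) pairs into the store.

    Returns the identifiers that were accepted. Records already known
    under the same identifier are dropped unless their datestamp is newer.
    """
    accepted = []
    for identifier, datestamp in harvested:
        known = store.get(identifier)
        if known is not None and parse_datestamp(datestamp) <= parse_datestamp(known):
            continue  # duplicate from the overlap, silently discarded
        store[identifier] = datestamp
        accepted.append(identifier)
    return accepted


def next_from(store: Dict[str, str]) -> Optional[str]:
    """Most recent datestamp seen so far, to be used as the next "from" value."""
    return max(store.values(), key=parse_datestamp, default=None)
```

With exact-to-the-day datestamps, every record of the most recent day is delivered again on each harvest, which is why the finer granularity keeps the overlap small.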
Deletion tracking (OAI-PMH, 2.5.1)
The OAI-PMH provides functionality for tracking the deletion of records. Compared to the classic use case of OAI-PMH (metadata of documents), the use case presented here concerns a category of data which is not subject to long-term storage. Thus, the tracking of deletion events does not seem critical, particularly since the data needed to track deletions would add up to a significant amount.
However, the service provider will accept information about deleted records and will eventually delete the referenced information in its own data store. This way, data providers can correct wrongly issued data (e. g. in case of technical problems).
It is important to note that old data which rotates out of the data offered by the data provider due to its age will not be marked as deleted, for storage reasons. This kind of data is still valid usage data, but is no longer visible.
Whether a data provider uses deletion tracking has to be stated in the response to the "Identify" OAI-PMH query, within the <deletedRecord> field. Currently, the only options are "transient" (when a data provider marks deleted records or reserves the possibility to do so) or "no".
The possible cases are:
- Incorrect data which has already been offered by the data provider is to be corrected. There are two possibilities:
  - Re-issuing a corrected set of data carrying the same identifier in the OAI-PMH record header as the set of data to be corrected, with an updated OAI-PMH record header datestamp.
  - When the correction is a full deletion of the incorrectly issued data, the OAI-PMH record has to be re-issued without a Context Object payload, with the deleted status set (status="deleted") and an updated datestamp in the OAI-PMH record header.
- Records fall out of the time frame for which the data provider offers data: these records are silently dropped, i. e. no longer offered via the OAI-PMH interface, without using the deletion tracking features of OAI-PMH.
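How a service provider might apply these cases to its own data store can be sketched as follows. The function name, the return labels and the dictionary-based store are assumptions made for illustration. Datestamps are compared as strings, which only works if all of them are UTC values of the same granularity.

```python
from typing import Dict, Optional, Tuple


def apply_record(store: Dict[str, Tuple[str, Optional[str]]],
                 identifier: str,
                 datestamp: str,
                 payload: Optional[str],
                 deleted: bool = False) -> str:
    """Apply one harvested record to the store and report what happened."""
    if deleted:
        # Full deletion: the record was re-issued without a payload.
        store.pop(identifier, None)
        return "deleted"
    known = store.get(identifier)
    if known is not None and datestamp <= known[0]:
        # Same identifier, datestamp not renewed: duplicate, ignore it.
        return "ignored"
    # New record, or a correction re-issued with an updated datestamp.
    store[identifier] = (datestamp, payload)
    return "added" if known is None else "corrected"
```

Records that are simply no longer offered by the data provider never reach this function, so they stay in the store as valid usage data.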
Metadata formats (see OAI-PMH, 3.
All data providers have to provide support for <context-object> documents or <context-objects> aggregations, respectively. This choice also has to be announced by the data provider in the response to the "ListMetadataFormats" query (OAI-PMH, 4.4). While a specific "metadataPrefix" is not required, the information about "metadataNamespace" and "schema" is fixed for implementations:
Using OAI-PMH, the mandatory metadataPrefix for OpenURL Context Objects will be: "ctxo"
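As an illustration, a harvester could build its "ListRecords" request with this prefix as sketched below. The helper name is an assumption, and the base URL used in the usage note is a placeholder, not a real endpoint.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     from_datestamp: Optional[str] = None,
                     metadata_prefix: str = "ctxo") -> str:
    """Build an OAI-PMH ListRecords request URL for Context Objects."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_datestamp:
        # Selective harvesting: only records changed since this datestamp.
        params["from"] = from_datestamp
    return f"{base_url}?{urlencode(params)}"
```

Calling list_records_url("https://repository.example.org/oai", "2009-06-02T14:10:02Z") yields a URL whose "from" value is percent-encoded, as required for the colons in a second-granularity datestamp.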
Inclusion of Context Objects in OAI-PMH records
Since XML-encoded Context Objects are defined as the format of the data exchanged via OAI-PMH, they are to be embedded in OAI-PMH records in conformance with the protocol:
<record>
  <header>
    <identifier>urn:uuid:fd23522e-c447-4801-9be4-c93c60a2d550</identifier>
    <datestamp>2009-06-02T14:10:02Z</datestamp>
  </header>
  <metadata>
    <context-objects xmlns="info:ofi/fmt:xml:xsd:ctx">
      <context-object> ... </context-object>
      <context-object> ... </context-object>
    </context-objects>
  </metadata>
</record>
In the aforementioned example, the OAI-PMH record is identified by a UUID (in the form of a URI, see RFC 4122). When offering single <context-object> documents rather than an aggregation using <context-objects> containers as above, a conforming OAI-PMH record may look like the following:
<record>
  <header>
    <identifier>urn:uuid:fd23522e-c447-4801-9be4-c93c60a2d550</identifier>
    <datestamp>2009-06-02T14:10:02Z</datestamp>
  </header>
  <metadata>
    <context-object xmlns="info:ofi/fmt:xml:xsd:ctx" datestamp="2009-06-01T19:20:57Z"> ... </context-object>
  </metadata>
</record>
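A harvester could read such records as sketched below. This is only an illustration: the function name and the returned dictionary layout are assumptions, and the sample assumes that the record appears inside a full OAI-PMH response, where it inherits the OAI-PMH 2.0 namespace (the excerpts above omit it). The elided Context Object content is replaced by an empty element.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
CTX = "{info:ofi/fmt:xml:xsd:ctx}"


def read_record(record_xml: str) -> dict:
    """Extract header fields and Context Object datestamps from one record."""
    record = ET.fromstring(record_xml)
    header = record.find(OAI + "header")
    metadata = record.find(OAI + "metadata")  # missing for deleted records
    # Works for a single <context-object> and for a <context-objects> container.
    context_objects = [] if metadata is None else list(metadata.iter(CTX + "context-object"))
    return {
        "identifier": header.findtext(OAI + "identifier").strip(),
        "datestamp": header.findtext(OAI + "datestamp").strip(),
        "context_object_datestamps": [co.get("datestamp") for co in context_objects],
    }


SAMPLE = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>urn:uuid:fd23522e-c447-4801-9be4-c93c60a2d550</identifier>
    <datestamp>2009-06-02T14:10:02Z</datestamp>
  </header>
  <metadata>
    <context-object xmlns="info:ofi/fmt:xml:xsd:ctx" datestamp="2009-06-01T19:20:57Z"/>
  </metadata>
</record>"""

info = read_record(SAMPLE)
```

The sketch keeps the record header datestamp and the datestamps within the Context Objects apart, since the two serve different purposes.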