...
Description | An identification of a specific usage event. |
XPath | ctx:context-object/@identifier |
Usage | Optional |
Format | No requirements are given for the format of the identifier. If this optional identifier is used, it must be (1) opaque and (2) unique for a specific usage event. |
Example | b06c0444f37249a0a8f748d3b823ef2a |
Warning |
---|
This must be mandatory, at least to identify the repository. This can be done, for example, by using the repository name as a prefix before the opaque identifier. This could be categorised as provenance information. -jochen |
Note |
---|
Usage event identifiers are useful as a control mechanism that lets the aggregation database check whether duplicate events have been harvested. The provenance information of the repository can be found in the <resolver> element, see section 4.1.6. - Peter |
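A minimal sketch of how such an identifier could be generated. The guidelines give no format requirement, so the use of a random UUID is an assumption; it is chosen only because it yields an opaque 32-digit hexadecimal value of the same shape as the example. The function name `new_usage_event_id` is hypothetical.

```python
import uuid


def new_usage_event_id() -> str:
    """Return an opaque 32-character hexadecimal identifier for one usage event."""
    # Assumption: a random (version 4) UUID is used. It encodes nothing about
    # the event itself (opaque), and collisions are improbable enough to treat
    # the value as unique in practice.
    return uuid.uuid4().hex


event_id = new_usage_event_id()
print(event_id)
```

An aggregator can then use the identifier as a key to detect events that were harvested twice.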
Occurrences of child elements in <context-object>
...
Element name | minOccurs | maxOccurs |
Referent | 1 | 1 |
ReferringEntity | 0 | 1 |
Requester | 1 | 1 |
ServiceType | 1 | 1 |
Resolver | 1 | 1 |
Referrer | 0 | 1 |
Note |
---|
Just a note: If we make this schema more restrictive, we diverge from the original schema. - Jochen |
Note |
---|
This is explained at the beginning of section 4, Data format. |
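The occurrence table above can be checked mechanically. The following is a sketch only: it assumes that the table's element names map to the hyphenated XML element names used in the appendix example (for instance ReferringEntity to `referring-entity`), and the function name `check_occurrences` is hypothetical.

```python
import xml.etree.ElementTree as ET

CTX_NS = "info:ofi/fmt:xml:xsd:ctx"

# (minOccurs, maxOccurs) per child element, taken from the table above.
# Assumption: the hyphenated XML names correspond to the names in the table.
OCCURRENCES = {
    "referent": (1, 1),
    "referring-entity": (0, 1),
    "requester": (1, 1),
    "service-type": (1, 1),
    "resolver": (1, 1),
    "referrer": (0, 1),
}


def check_occurrences(context_object: ET.Element) -> list[str]:
    """Return a list of human-readable violations (empty if there are none)."""
    problems = []
    for name, (lo, hi) in OCCURRENCES.items():
        count = len(context_object.findall(f"{{{CTX_NS}}}{name}"))
        if not lo <= count <= hi:
            problems.append(f"<{name}> occurs {count} times, expected {lo}..{hi}")
    return problems


# A context object that lacks the mandatory <resolver> element.
sample = ET.fromstring(
    f'<context-object xmlns="{CTX_NS}">'
    "<referent/><requester/><service-type/>"
    "</context-object>"
)
print(check_occurrences(sample))
# → ['<resolver> occurs 0 times, expected 1..1']
```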
4.1.2. <referent>
The <referent> element must provide information on the document that is requested. More specifically, it must record the following data elements.
...
Description | The user can be identified by providing the IP address. Including the full IP address in the description of a usage event is not permitted by international privacy laws. For this reason, the IP address needs to be obfuscated: it must be hashed using MD5. MD5 hashes of IP addresses can easily be reversed, for instance by hashing every possible address and comparing the results. Whether MD5 hashing protects privacy sufficiently warrants further research by legal advisors. |
XPath | ctx:context-object/ctx:requester/ctx:identifier |
Usage | Mandatory |
Format | A data-URI, consisting of the prefix "data:,", followed by a 32-digit hexadecimal number. |
Example | data:,c06f0464f37249a0a9f848d4b823ef2a |
Note |
---|
The IP address of the requester is pseudonymised by hashing before it is exchanged and taken outside the web server to another location. Therefore, individual users can still be recognised when data from distributed repositories is aggregated, but cannot be traced back to a 'natural person'. This method may seem consistent with the European Act for Protection of Personal Data. A summary can be found here: http://europa.eu/legislation_summaries/information_society/l14012_en.htm. Further legal research is needed to determine whether this method is sufficient to protect the personal data of a 'natural person', in order to operate within the boundaries of the law. |
Note |
---|
More information about the data-URI scheme can be found on: http://en.wikipedia.org/wiki/Data_URI_scheme. And don't forget the comma, it's not a typo. |
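The requester identifier described above could be produced along the following lines. This is a sketch under stated assumptions: the guidelines do not say which textual form of the IP address is hashed, so the code assumes the plain address string encoded as ASCII, without a salt. The function name `requester_identifier` is hypothetical.

```python
import hashlib


def requester_identifier(ip_address: str) -> str:
    """Return the data-URI form of the MD5 hash of an IP address."""
    # Assumption: the address is hashed as plain ASCII text, with no salt.
    digest = hashlib.md5(ip_address.encode("ascii")).hexdigest()  # 32 hex digits
    # "data:," is the prefix required by the Format row; the comma is intentional.
    return "data:," + digest


# 192.0.2.1 is an address reserved for documentation purposes.
print(requester_identifier("192.0.2.1"))
```

Because the hash is deterministic, the same address yields the same identifier in every repository. That is what allows aggregation across repositories, and it is also why such hashes are easy to reverse.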
<requested/identifier> | C-class Subnet
...
Description | The request type specifies if the request is for an object file or a metadata record. |
XPath | ctx:context-object/ctx:service-type/ctx:metadata-by-val/ctx:metadata/dcterms:type |
Inclusion | Mandatory |
Format | Two values are allowed; one of them must be used:
|
Example | info:eu-repo/semantics/objectFile |
...
Description | The identifier of the context from within which the user triggered the usage of the target resource. |
XPath | ctx:context-object/ctx:referrer/ctx:identifier |
Usage | Optional |
Format | URL |
Example | info:sid/dlib.org:dlib |
4.2. Extensions
4.2.1. <requester>
<requester/.../dini:classification> | Classification of the requester
Classification |
|
Description | The user may be categorised, using a list of descriptive terms. If no classification is possible, it must be omitted. |
XPath | ctx:context-object/ctx:requester/ctx:metadata/dini:requesterinfo/dini:classification. If this element is used, the <metadata> element must be preceded by |
Usage | Optional |
Format | Three values are allowed:
|
Example | institutional |
<requester/.../dini:hashed-session> | Hashed session of the requester
Session ID |
|
Description | The identifier of the complete usage session of a given user. |
XPath | ctx:context-object/ctx:requester/ctx:metadata/dini:requesterinfo/dini:hashed-session. If session |
Usage | Optional |
Format | If the session ID is a hash itself, it must be hashed. Otherwise, provide an MD5 hash of the session ID. |
Example | 660b14056f5346d0 |
<requester/.../dini:user-agent> | Full user agent string of the requester
User Agent |
|
Description | The full HTTP user agent string |
XPath | ctx:context-object/ctx:requester/ctx:metadata/dini:requesterinfo/dini:user-agent. If agent |
Usage | Optional |
Format | String |
Example | Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6 (.NET CLR 3.5.30729) |
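The optional requester extensions described above can be read back from a Context Object as sketched here. The element nesting, including the `metadata-by-val` wrapper, follows the example in the Appendix and not the shorter XPath rows; treat that choice, and the function name `requester_extensions`, as assumptions.

```python
import xml.etree.ElementTree as ET

NS = {
    "ctx": "info:ofi/fmt:xml:xsd:ctx",
    "dini": "http://dini.de/namespace/oas-requesterinfo",
}

# Reduced version of the Appendix example; <user-agent> is left out on purpose.
SAMPLE = """\
<context-object xmlns="info:ofi/fmt:xml:xsd:ctx">
  <requester>
    <metadata-by-val>
      <format>http://dini.de/namespace/oas-requesterinfo</format>
      <metadata>
        <requesterinfo xmlns="http://dini.de/namespace/oas-requesterinfo">
          <classification>institutional</classification>
          <hashed-session>660b14056f5346d0</hashed-session>
        </requesterinfo>
      </metadata>
    </metadata-by-val>
  </requester>
</context-object>
"""


def requester_extensions(context_object: ET.Element) -> dict:
    """Return the optional DINI requester fields that are present."""
    info = context_object.find(
        "ctx:requester/ctx:metadata-by-val/ctx:metadata/dini:requesterinfo", NS
    )
    if info is None:
        return {}
    fields = {}
    for name in ("classification", "hashed-session", "user-agent"):
        value = info.findtext(f"dini:{name}", namespaces=NS)
        if value is not None:  # all three elements are optional
            fields[name] = value
    return fields


print(requester_extensions(ET.fromstring(SAMPLE)))
# → {'classification': 'institutional', 'hashed-session': '660b14056f5346d0'}
```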
...
- Usage of Sets (see OAI-PMH, 2.7.2)
  OAI-PMH optionally allows for structuring the offered data in "sets" to support selective harvesting of the data. Currently, this possibility is not further specified in these guidelines. Future refinements may use this feature, e. g. for selecting usage data for certain services. Provenance information is already included in the Context Objects.
- Datestamps, Granularity (see OAI-PMH, 2.7.1; also compare the notes about datestamps in the OAI-PMH record header versus datestamps within the Context Objects)
  The OAI-PMH specification allows either exact-to-the-second or exact-to-the-day granularity for record header datestamps. Data providers may choose either of these. The service provider will most likely rely on overlapping harvesting, i. e. the most recent datestamp of the harvested data is used as the "from" parameter of the next OAI-PMH query. Thus, the data provider will deliver some records that have been harvested before. Duplicate records are matched by their identifiers (those in the OAI-PMH record header) and are silently discarded if their datestamp has not been renewed (see the notes below on deletion tracking).
  It is strongly recommended to implement exact-to-the-second datestamps to keep the redundancy of the transferred data as low as possible.
- Deletion tracking (see OAI-PMH, 2.5.1)
  The OAI-PMH provides functionality for tracking the deletion of records. Compared to the classic use case of OAI-PMH (metadata of documents), the use case presented here concerns data that is not subject to long-term storage. Thus, tracking deletion events does not seem critical, since the data needed to track deletions would add up to a significant amount of data.
  However, the service provider will accept information about deleted records and will eventually delete the referenced information in its own data store. This makes it possible for data providers to correct wrongly issued data (e. g. in case of technical problems).
  It is important to note that old data which rotates out of the data offered by the data provider due to its age will not be marked as deleted, for storage reasons. This kind of data is still valid usage data, but is no longer visible.
  Whether a data provider uses deletion tracking has to be stated in the response to the "Identify" OAI-PMH query, within the <deletedRecord> field. Currently, the only options are "transient" (when a data provider applies, or reserves the possibility of, marking deleted records) or "no".
  The possible cases are:
  - Incorrect data which has already been offered by the data provider shall be corrected. There are two possibilities:
    - Re-issuing a corrected set of data that carries the same identifier in the OAI-PMH record header as the set of data to be corrected, with an updated OAI-PMH record header datestamp.
    - When the correction is a full deletion of the incorrectly issued data, the OAI-PMH record has to be re-issued without a Context Object payload, with the "deleted" flag set (status="deleted" in the OAI-PMH record header) and an updated datestamp in the OAI-PMH record header.
  - Records that fall out of the time frame for which the data provider offers data: these records are silently dropped, i. e. no longer offered via the OAI-PMH interface, without using the deletion tracking features of OAI-PMH.
- Metadata formats (see OAI-PMH, 3.4)
  All data providers have to provide support for <context-object> documents or <context-objects> aggregations, respectively. This choice also has to be announced by the data provider in the response to the "ListMetadataFormats" query (OAI-PMH, 4.4). While a specific "metadataPrefix" is not required, the information about "metadataNamespace" and "schema" is fixed for implementations:
...
Info |
---|
Using OAI-PMH, the mandatory metadataPrefix for OpenURL Context Objects will be: "ctxo" |
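The overlapping harvesting and deletion rules above can be sketched from the service provider's side as follows. This is an illustration, not a prescribed implementation: the record layout (a dict with "identifier", "datestamp", "deleted" and "payload"), the function names and the base URL are hypothetical, and datestamps are assumed to be UTC strings of equal granularity so that string comparison matches chronological order.

```python
from urllib.parse import urlencode


def next_list_records_url(base_url, last_datestamp=None):
    """Build a ListRecords query that overlaps with the previous harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": "ctxo"}
    if last_datestamp is not None:
        # The most recent datestamp already harvested becomes the "from" value.
        params["from"] = last_datestamp
    return base_url + "?" + urlencode(params)


def merge_harvest(store, records):
    """Apply harvested records to `store`, keyed by OAI-PMH header identifier."""
    for rec in records:
        known = store.get(rec["identifier"])
        if known is not None and rec["datestamp"] <= known["datestamp"]:
            continue  # duplicate from the overlap: datestamp not renewed
        if rec["deleted"]:
            store.pop(rec["identifier"], None)  # withdrawn by the data provider
        else:
            store[rec["identifier"]] = rec  # new record or corrected re-issue
    return store


store = {}
first = {"identifier": "oai:example.org:1", "datestamp": "2009-07-29T08:15:46Z",
         "deleted": False, "payload": "<context-object/>"}
merge_harvest(store, [first])
merge_harvest(store, [first])  # harvested again in the overlap, ignored
withdrawn = {"identifier": "oai:example.org:1", "datestamp": "2009-07-30T10:00:00Z",
             "deleted": True, "payload": None}
merge_harvest(store, [withdrawn])
print(store)  # → {}
print(next_list_records_url("http://repo.example.org/oai", "2009-07-30T10:00:00Z"))
```

Records that simply age out of the data provider's time frame never appear as deleted, so the store keeps them unchanged.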
- Inclusion of Context Objects in OAI-PMH records
  Corresponding to the definition of XML-encoded Context Objects as the format of the data exchanged via the OAI-PMH, the embedding is to be done in conformance with the OAI-PMH:
...
Note |
---|
It would be nice to have some reference. - jochen |
Note |
---|
Maybe Sune Karlsson can help with this section, considering his expertise. |
6.2.2. Strategy
It has been decided to distinguish between two 'layers' of robot filtering:
...
Info |
---|
To be done: find a web location; create a "cool" URI, implement the above mechanism |
Note |
---|
Knowledge Exchange is offering a web-location; the PURL has been requested by OCLC. The next steps are to put the list online, and make a PURL reference. |
Appendix
Code Block |
---|
<?xml version="1.0" encoding="UTF-8"?>
<context-objects xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xmlns:dcterms="http://dublincore.org/documents/2008/01/14/dcmi-terms/"
                 xmlns:sv="info:ofi/fmt:xml:xsd:sch_svc"
                 xsi:schemaLocation="info:ofi/fmt:xml:xsd:ctx http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:ctx"
                 xmlns="info:ofi/fmt:xml:xsd:ctx">
  <context-object timestamp="2009-07-29T08:15:46+01:00" identifier="b06c0444f37249a0a8f748d3b823ef2a">
    <referent>
      <identifier>https://openaccess.leidenuniv.nl/bitstream/1887/12100/1/Thesis.pdf</identifier>
      <identifier>http://hdl.handle.net/1887/12100</identifier>
    </referent>
    <referring-entity>
      <identifier>http://www.google.nl/search?hl=nl&amp;q=beleidsregels+artikel+4%3A84&amp;meta=</identifier>
      <identifier>info:sid/google</identifier>
    </referring-entity>
    <requester>
      <metadata-by-val>
        <format>http://dini.de/namespace/oas-requesterinfo</format>
        <metadata>
          <requesterinfo xmlns="http://dini.de/namespace/oas-requesterinfo">
            <hashed-ip>b505e629c508bdcfbf2a774df596123dd001cee172dae5519660b6014056f53a</hashed-ip>
            <hashed-c>d001cee172dae5519660b6014056f5346d05e629c508bdcfbf2a774df596123d</hashed-c>
            <hostname>uni-saarland.de</hostname>
            <classification>institutional</classification>
            <hashed-session>660b14056f5346d0</hashed-session>
            <user-agent>mozilla/5.0 (windows; u; windows nt 5.1; de; rv:1.8.1.1) gecko/20061204</user-agent>
          </requesterinfo>
        </metadata>
      </metadata-by-val>
    </requester>
    <service-type>
      <metadata-by-val>
        <format>http://dublincore.org/documents/2008/01/14/dcmi-terms/</format>
        <metadata>
          <dcterms:format>objectFile</dcterms:format>
        </metadata>
      </metadata-by-val>
    </service-type>
    <resolver>
      <identifier>http://www.worldcat.org/libraries/53238</identifier>
    </resolver>
    <referrer>
      <identifier>info:sid/dlib.org:dlib</identifier>
    </referrer>
  </context-object>
</context-objects>
...