...
To be able to compare usage data from different repositories, the data needs to be available in a uniform format. This section will provide specifications for the aspects of the usage event that need to be recorded. In addition, guidelines need to be developed for the format in which this information can be expressed. Following recommendations from MESUR and the JISC Usage Statistics Project, it will be stipulated that usage events need to be serialized in XML using the data format that is specified in the OpenURL Context Objects schema. The XML Schema for XML Context Objects can be accessed at http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:ctx
A distinction will be made between the core set and extensions. Data in the core set can be recorded using standard elements or attributes that are defined in the OpenURL Context Object schema. The extensions are created to record aspects of usage events which cannot be captured using the official schema. They have usually been defined in the context of individual projects to meet very specific demands. Nevertheless, some of the extensions may be relevant for other projects as well. They are included here to inform the usage statistics community what additional information could be made available. Naturally, the implementation of all the extension elements is optional.
4.1. Core set
4.1.1. <context-object>
The OpenURL Framework Initiative recommends that each community that adopts the OpenURL Context Objects Schema should define its application profile. Such a profile must consist of specifications for the use of namespaces, character encodings, serialisation, constraint languages, formats, metadata formats, and transport protocols. This section will attempt to provide such a profile, based on the experiences from the projects NEEO, SURE and OA-Statistics.
The root element of the XML-document must be <context-objects>. It must contain a reference to the official schema and declare the following namespace:
...
Element name | minOccurs | maxOccurs |
Referent | 1 | 1 |
ReferringEntity | 0 | 1 |
Requester | 1 | 1 |
ServiceType | 1 | 1 |
Resolver | 1 | 1 |
Referrer | 0 | 1 |
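The occurrence constraints in the table above can be checked mechanically. The following is a minimal sketch in Python, not part of the guidelines themselves. It assumes that the element names in the table correspond to the lower-case, hyphenated element names of the Context Object schema (for example `referring-entity`), and the function name `check_context_object` is hypothetical.

```python
import xml.etree.ElementTree as ET

CTX_NS = "info:ofi/fmt:xml:xsd:ctx"

# (minOccurs, maxOccurs) per child element, copied from the table above.
# Assumption: table names map onto the hyphenated schema element names.
CARDINALITY = {
    "referent": (1, 1),
    "referring-entity": (0, 1),
    "requester": (1, 1),
    "service-type": (1, 1),
    "resolver": (1, 1),
    "referrer": (0, 1),
}


def check_context_object(ctx_obj):
    """Return a list of human-readable cardinality violations."""
    problems = []
    for name, (lo, hi) in CARDINALITY.items():
        count = len(ctx_obj.findall(f"{{{CTX_NS}}}{name}"))
        if not lo <= count <= hi:
            problems.append(f"<{name}> occurs {count} times, expected {lo}..{hi}")
    return problems


# A deliberately incomplete context object: <resolver> is missing.
sample = f"""<context-objects xmlns="{CTX_NS}">
  <context-object>
    <referent/><requester/><service-type/>
  </context-object>
</context-objects>"""

root = ET.fromstring(sample)
for ctx_obj in root.findall(f"{{{CTX_NS}}}context-object"):
    for problem in check_context_object(ctx_obj):
        print(problem)
```

A check of this kind only covers the cardinalities; validation against the official XML Schema is still needed for the content of each element.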
4.1.2. <referent>
The <referent> element must provide information on the document that is requested. More specifically, it must record the following data elements.
...
Other identifier of the requested document |
|
Description | A globally unique identification of the resource that is requested must be provided if there is one that is applicable to the document. Identifiers should be 'communication protocol'-independent as much as possible. In the case of a request for an object file, the identifier should enable the aggregator to obtain the object's associated metadata file. |
XPath | ctx:context-object/ctx:referent/ctx:identifier |
Usage | Mandatory if applicable |
Format | URI |
Example |
4.1.3. <referring-entity>
The <referring-entity> element provides information about the environment that has forwarded the user to the document that was requested. This referrer can be expressed in two ways.
...
Referrer Name |
|
Description | The referrer may be categorised on the basis of a limited list of known referrers. All permitted values will be registered in the OpenURL registry. |
XPath | ctx:context-object/ctx:referring-entity/ctx:identifier |
Usage | Optional |
Format | A URI that is registered in http://info-uri.info/registry/OAIHandler?verb=GetRecord&metadataPrefix=reg&identifier=info:sid/ |
Example | info:sid/google |
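As an illustration of how a repository might derive the referrer name, the sketch below maps the host name of a referring URL onto an `info:sid/` identifier. The lookup table `KNOWN_REFERRERS` and the function `referrer_identifiers` are hypothetical; the permitted values would have to be taken from the registry mentioned above, and matching on a host name label is only one possible strategy.

```python
from urllib.parse import urlparse

# Hypothetical lookup table; real values would come from the info:sid registry.
KNOWN_REFERRERS = {
    "google": "info:sid/google",
}


def referrer_identifiers(referrer_url):
    """Return the identifiers to record for a referring entity.

    The raw URL always comes first; a registered referrer name is
    appended only when one of the host name labels is a known referrer.
    """
    identifiers = [referrer_url]
    host = urlparse(referrer_url).hostname or ""
    labels = host.split(".")
    for key, sid in KNOWN_REFERRERS.items():
        if key in labels:
            identifiers.append(sid)
            break
    return identifiers


print(referrer_identifiers("http://www.google.nl/search?hl=nl&q=test"))
```

Each returned value would be written to its own <identifier> element inside <referring-entity>.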
4.1.4. <requester>
The user who has sent the request for the file is identified in the <requester> element.
...
Geographic location |
|
Description | The country from which the request originated may also be provided explicitly. |
XPath | ctx:context-object/ctx:requester/ctx:metadata-by-val/ctx:metadata/dcterms:spatial |
Usage | Optional |
Format | A two-letter code in lower case, following the ISO 3166-1-alpha-2 standard. http://www.iso.org/iso/english_country_names_and_code_elements |
Example | ne |
4.1.5. <service-type>
Request Type |
|
Description | The request type specifies if the request is for an object file or a metadata record. |
XPath | ctx:context-object/ctx:service-type/ctx:metadata-by-val/ctx:metadata/dcterms:format |
Usage | Mandatory |
Format | Two values are allowed: "objectFile" or "metadataView" |
Example | objectFile |
4.1.6. <resolver> and <referrer>
Host name |
|
Description | An identification of the institution that is responsible for the repository in which the requested document is stored. |
XPath | ctx:context-object/ctx:resolver/ctx:identifier |
Usage | Optional |
Format | A unique global identifier taken from the WorldCat registry of institutions, catalogues and OpenURL resolvers. |
Example |
...
Link resolver Context Identifier |
|
Description | The identifier of the context from within which the user triggered the usage of the target resource. |
XPath | ctx:context-object/ctx:referrer/ctx:identifier |
Usage | Optional |
Format | URL |
Example | info:sid/dlib.org:dlib |
4.2. Extensions
4.2.1. <requester>
Classification |
|
Description | The user may be categorised, using a list of descriptive terms. If no classification is possible, the element must be omitted. |
XPath | ctx:context-object/ctx:requester/ctx:metadata/dini:requesterinfo/dini:classification; if this element is used, the <metadata> element must be preceded by |
Usage | Optional |
Format | Three values are allowed:
|
Example | institutional |
...
Code Block | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ReportResponse xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
        xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
        xmlns="http://www.niso.org/schemas/sushi"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Requestor>
        <ID>www.logaggregator.nl</ID>
        <Name>Log Aggregator</Name>
        <Email>logaggregator@surf.nl</Email>
      </Requestor>
      <CustomerReference>
        <ID>www.leiden.edu</ID>
        <Name>Leiden University</Name>
      </CustomerReference>
      <ReportDefinition Release="urn:DRv1" Name="Daily Report v1">
        <Filters>
          <UsageDateRange>
            <Begin>2009-12-22</Begin>
            <End>2009-12-23</End>
          </UsageDateRange>
        </Filters>
      </ReportDefinition>
      <Exception>
        <Number>2</Number>
        <Message>The file describing the internet robots is not accessible.</Message>
      </Exception>
    </ReportResponse>
  </soap:Body>
</soap:Envelope>
|
When the repository is in the course of producing the requested report, a response will be sent that is very similar to listing 6. The estimated time of completion will be provided in the <Data> element. According to the documentation of the SUSHI XML schema, this element may be used for any other optional data.
Code Block | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ReportResponse xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
        xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
        xmlns="http://www.niso.org/schemas/sushi"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Requestor>
        <ID>www.logaggregator.nl</ID>
        <Name>Log Aggregator</Name>
        <Email>logaggregator@surf.nl</Email>
      </Requestor>
      <CustomerReference>
        <ID>www.leiden.edu</ID>
        <Name>Leiden University</Name>
      </CustomerReference>
      <ReportDefinition Release="urn:DRv1" Name="Daily Report v1">
        <Filters>
          <UsageDateRange>
            <Begin>2009-12-22</Begin>
            <End>2009-12-23</End>
          </UsageDateRange>
        </Filters>
      </ReportDefinition>
      <Exception>
        <Number>3</Number>
        <Message>The report is not yet available. The estimated time of completion is provided under "Data".</Message>
        <Data>2010-01-08T12:13:00+01:00</Data>
      </Exception>
    </ReportResponse>
  </soap:Body>
</soap:Envelope>
|
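On the side of the log aggregator, a response of this kind has to be inspected for an <Exception> element before any report data is processed. The sketch below shows one way this could be done with the Python standard library. The function name `read_exception` is hypothetical, and the assumption that <Data> holds an ISO 8601 timestamp is based only on the example above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

SUSHI_NS = {"s": "http://www.niso.org/schemas/sushi"}


def read_exception(response_xml):
    """Return number, message and data of the first SUSHI exception, or None."""
    root = ET.fromstring(response_xml)
    exc = root.find(".//s:Exception", SUSHI_NS)
    if exc is None:
        return None
    return {
        "number": int(exc.findtext("s:Number", namespaces=SUSHI_NS)),
        "message": exc.findtext("s:Message", namespaces=SUSHI_NS),
        "data": exc.findtext("s:Data", namespaces=SUSHI_NS),
    }


# Abbreviated version of the response shown above.
response = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ReportResponse xmlns="http://www.niso.org/schemas/sushi">
      <Exception>
        <Number>3</Number>
        <Message>The report is not yet available.</Message>
        <Data>2010-01-08T12:13:00+01:00</Data>
      </Exception>
    </ReportResponse>
  </soap:Body>
</soap:Envelope>"""

info = read_exception(response)
if info is not None and info["number"] == 3 and info["data"]:
    # Assumption: for exception 3, <Data> carries the estimated completion time.
    retry_at = datetime.fromisoformat(info["data"])
    print("Retry after", retry_at.isoformat())
```

An aggregator could use the parsed time to schedule the next request instead of polling at a fixed interval.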
Error numbers and the corresponding Error messages are also provided in the table below.
...
[To be done: find a web location; create a "cool" URI, implement the above mechanism]
Appendix A: Sample OpenURL Context Object File
Code Block | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||
<?xml version="1.0" encoding="UTF-8"?>
<context-objects xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:dcterms="http://dublincore.org/documents/2008/01/14/dcmi-terms/"
    xmlns:sv="info:ofi/fmt:xml:xsd:sch_svc"
    xsi:schemaLocation="info:ofi/fmt:xml:xsd:ctx http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:ctx"
    xmlns="info:ofi/fmt:xml:xsd:ctx">
  <context-object timestamp="2009-07-29T08:15:46+01:00" identifier="b06c0444f37249a0a8f748d3b823ef2a">
    <referent>
      <identifier>https://openaccess.leidenuniv.nl/bitstream/1887/12100/1/Thesis.pdf</identifier>
      <identifier>http://hdl.handle.net/1887/12100</identifier>
    </referent>
    <referring-entity>
      <identifier>http://www.google.nl/search?hl=nl&amp;q=beleidsregels+artikel+4%3A84&amp;meta=</identifier>
      <identifier>info:sid/google</identifier>
    </referring-entity>
    <requester>
      <metadata-by-val>
        <format>http://dini.de/namespace/oas-requesterinfo</format>
        <metadata>
          <requesterinfo xmlns="http://dini.de/namespace/oas-requesterinfo">
            <hashed-ip>b505e629c508bdcfbf2a774df596123dd001cee172dae5519660b6014056f53a</hashed-ip>
            <hashed-c>d001cee172dae5519660b6014056f5346d05e629c508bdcfbf2a774df596123d</hashed-c>
            <hostname>uni-saarland.de</hostname>
            <classification>institutional</classification>
            <hashed-session>660b14056f5346d0</hashed-session>
            <user-agent>mozilla/5.0 (windows; u; windows nt 5.1; de; rv:1.8.1.1) gecko/20061204</user-agent>
          </requesterinfo>
        </metadata>
      </metadata-by-val>
    </requester>
    <service-type>
      <metadata-by-val>
        <format>http://dublincore.org/documents/2008/01/14/dcmi-terms/</format>
        <metadata>
          <dcterms:format>objectFile</dcterms:format>
        </metadata>
      </metadata-by-val>
    </service-type>
    <resolver>
      <identifier>http://www.worldcat.org/libraries/53238</identifier>
    </resolver>
    <referrer>
      <identifier>info:sid/dlib.org:dlib</identifier>
    </referrer>
  </context-object>
</context-objects>
|
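To show how an aggregator might read a file of this kind, the sketch below extracts a few core-set values from each <context-object> using the XPath expressions given in section 4.1. It is only an illustration: the function name `summarise` is hypothetical, and the embedded sample is an abbreviated copy of the file above.

```python
import xml.etree.ElementTree as ET

NS = {
    "ctx": "info:ofi/fmt:xml:xsd:ctx",
    "dcterms": "http://dublincore.org/documents/2008/01/14/dcmi-terms/",
}


def summarise(ctx_obj):
    """Collect the timestamp, referent identifiers and request type of one usage event."""
    return {
        "timestamp": ctx_obj.get("timestamp"),
        "referent_ids": [
            e.text for e in ctx_obj.findall("ctx:referent/ctx:identifier", NS)
        ],
        "request_type": ctx_obj.findtext(
            "ctx:service-type/ctx:metadata-by-val/ctx:metadata/dcterms:format",
            namespaces=NS,
        ),
    }


# Abbreviated copy of the sample file in this appendix.
SAMPLE = """<context-objects xmlns="info:ofi/fmt:xml:xsd:ctx"
    xmlns:dcterms="http://dublincore.org/documents/2008/01/14/dcmi-terms/">
  <context-object timestamp="2009-07-29T08:15:46+01:00"
      identifier="b06c0444f37249a0a8f748d3b823ef2a">
    <referent>
      <identifier>https://openaccess.leidenuniv.nl/bitstream/1887/12100/1/Thesis.pdf</identifier>
      <identifier>http://hdl.handle.net/1887/12100</identifier>
    </referent>
    <service-type>
      <metadata-by-val>
        <metadata><dcterms:format>objectFile</dcterms:format></metadata>
      </metadata-by-val>
    </service-type>
  </context-object>
</context-objects>"""

root = ET.fromstring(SAMPLE)
events = [summarise(obj) for obj in root.findall("ctx:context-object", NS)]
for event in events:
    print(event["timestamp"], event["request_type"], event["referent_ids"])
```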
Code Block | ||
---|---|---|
|
...
|
...
| |||||||
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="exclusions">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="sources"/>
        <xs:element ref="robot-list"/>
      </xs:sequence>
      <xs:attributeGroup ref="attlist.exclusions"/>
    </xs:complexType>
  </xs:element>
  <xs:attributeGroup name="attlist.exclusions">
    <xs:attribute name="version" type="xs:string"/>
    <xs:attribute name="datestamp" type="xs:date"/>
  </xs:attributeGroup>
  <xs:element name="sources">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="source" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="source">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="id" type="xs:ID" use="required"/>
          <xs:attribute name="name" type="xs:string"/>
          <xs:attribute name="version" type="xs:string"/>
          <xs:attribute name="datestamp" type="xs:date"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
  <xs:element name="sourceRef">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="id" type="xs:IDREF" use="required"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
  <xs:element name="robot-list">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="useragent" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="useragent">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="regEx"/>
        <xs:element ref="sourceRef" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="regEx" type="xs:string"/>
</xs:schema>
|
Code Block | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||
<?xml version="1.0" encoding="UTF-8"?>
<exclusions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="robotlist.xsd"
    version="1.0"
    datestamp="2010-04-10">
  <sources>
    <source id="l1" name="COUNTER" version="R3" datestamp="2010-04-01">COUNTER list of internet robots</source>
    <source id="l2" name="PLOS">PLOS list of internet robots</source>
  </sources>
  <robot-list>
    <useragent>
      <regEx>[^a]fish</regEx>
      <sourceRef id="l2"/>
    </useragent>
    <useragent>
      <regEx>[+:,\.\;\/-]bot</regEx>
      <sourceRef id="l2"/>
    </useragent>
    <useragent>
      <regEx>acme\.spider</regEx>
      <sourceRef id="l2"/>
    </useragent>
    <useragent>
      <regEx>Brutus\/AET</regEx>
      <sourceRef id="l1"/>
      <sourceRef id="l2"/>
    </useragent>
    <useragent>
      <regEx>Code\sSample\sWeb\sClient</regEx>
      <sourceRef id="l1"/>
    </useragent>
  </robot-list>
</exclusions>
|
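A robot list in this format could be applied to incoming user agent strings roughly as follows. This is a sketch under stated assumptions rather than a prescribed implementation: the function names `load_patterns` and `is_robot` are hypothetical, the expressions are assumed to be compatible with Python's `re` syntax, and case-insensitive matching is an assumption made here because the sample context object records its user agent in lower case.

```python
import re
import xml.etree.ElementTree as ET

# Abbreviated copy of the robot list shown above.
ROBOT_LIST = r"""<exclusions version="1.0" datestamp="2010-04-10">
  <sources>
    <source id="l1" name="COUNTER"/>
    <source id="l2" name="PLOS"/>
  </sources>
  <robot-list>
    <useragent><regEx>[^a]fish</regEx><sourceRef id="l2"/></useragent>
    <useragent><regEx>acme\.spider</regEx><sourceRef id="l2"/></useragent>
    <useragent><regEx>Brutus\/AET</regEx><sourceRef id="l1"/><sourceRef id="l2"/></useragent>
  </robot-list>
</exclusions>"""


def load_patterns(xml_text):
    """Compile every <regEx> in the robot list (case-insensitive by assumption)."""
    root = ET.fromstring(xml_text)
    return [re.compile(node.text, re.IGNORECASE) for node in root.iter("regEx")]


def is_robot(user_agent, patterns):
    """Return True if any robot expression matches somewhere in the user agent."""
    return any(p.search(user_agent) for p in patterns)


patterns = load_patterns(ROBOT_LIST)
for agent in ("Brutus/AET 1.0", "mozilla/5.0 (windows; u) gecko/20061204"):
    print(agent, "->", "robot" if is_robot(agent, patterns) else "regular user")
```

Usage events whose user agent is classified as a robot would then be excluded before the statistics are compiled.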