...

To compare usage data from different repositories, the data needs to be available in a uniform format. This section provides specifications for the aspects of the usage event that need to be recorded. In addition, guidelines need to be developed for the format in which this information can be expressed. Following recommendations from MESUR and the JISC Usage Statistics Project, usage events must be serialized in XML using the data format that is specified in the OpenURL Context Objects schema. The XML Schema for XML Context Objects can be accessed at http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:ctx
A distinction is made between the core set and extensions. Data in the core set can be recorded using standard elements or attributes that are defined in the OpenURL Context Object schema. The extensions record aspects of usage events that cannot be captured using the official schema. They have usually been defined in the context of individual projects to meet very specific demands. Nevertheless, some of the extensions may be relevant for other projects as well. They are included here to inform the usage statistics community what additional information could be made available. The implementation of all extension elements is optional.

4.1. Core set

4.1.1. <context-object>

The OpenURL Framework Initiative recommends that each community that adopts the OpenURL Context Objects Schema should define its application profile. Such a profile must consist of specifications for the use of namespaces, character encodings, serialisation, constraint languages, formats, metadata formats, and transport protocols. This section will attempt to provide such a profile, based on the experiences from the projects NEEO, SURE and OA-Statistics.
The root element of the XML document must be <context-objects>. It must contain a reference to the official schema and declare the following namespace:

...

Element name       minOccurs   maxOccurs
Referent           1           1
ReferringEntity    0           1
Requester          1           1
ServiceType        1           1
Resolver           1           1
Referrer           0           1
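
As a rough illustration of the occurrence rules above, the skeleton below shows how the six child elements might be arranged inside a single <context-object>. The element names, the namespace and the attribute values are taken from the sample file in Appendix A. The comments are placeholders for content that sections 4.1.2 to 4.1.6 specify, so this is a sketch of the structure rather than a complete, valid record.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Skeleton only: the comments stand in for the content described in 4.1.2 to 4.1.6 -->
<context-objects xmlns="info:ofi/fmt:xml:xsd:ctx">
	<context-object timestamp="2009-07-29T08:15:46+01:00" identifier="b06c0444f37249a0a8f748d3b823ef2a">
		<referent><!-- exactly one: the document that is requested --></referent>
		<referring-entity><!-- zero or one: the environment that forwarded the user --></referring-entity>
		<requester><!-- exactly one: the user who sent the request --></requester>
		<service-type><!-- exactly one: the request type --></service-type>
		<resolver><!-- exactly one: the institution responsible for the repository --></resolver>
		<referrer><!-- zero or one: the link resolver context --></referrer>
	</context-object>
</context-objects>
```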

4.1.2. <referent>

The <referent> element must provide information on the document that is requested. More specifically, it must record the following data elements.

...

Other identifier of the requested document

 

Description

A globally unique identifier of the requested resource must be provided if one is applicable to the document. Identifiers should be independent of the communication protocol as much as possible. In the case of a request for an object file, the identifier should enable the aggregator to obtain the object's associated metadata file.

XPath

ctx:context-object/ctx:referent/ctx:identifier

Usage

Mandatory if applicable

Format

URI

Example

http://hdl.handle.net/1887/12100
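
A minimal sketch of a <referent> element, using the two identifiers from the sample file in Appendix A: the location of the requested file, followed by the other, protocol-independent identifier of the same document. The namespace declaration is repeated here only so that the fragment is well-formed on its own; in a full file it is inherited from the root element.

```xml
<referent xmlns="info:ofi/fmt:xml:xsd:ctx">
	<!-- Location of the requested file, as in Appendix A -->
	<identifier>https://openaccess.leidenuniv.nl/bitstream/1887/12100/1/Thesis.pdf</identifier>
	<!-- Other identifier of the requested document (here a handle) -->
	<identifier>http://hdl.handle.net/1887/12100</identifier>
</referent>
```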

4.1.3. <referring-entity>

The <referring-entity> element provides information about the environment that has forwarded the user to the document that was requested. This referrer can be expressed in two ways.

...

Referrer Name

 

Description

The referrer may be categorised on the basis of a limited list of known referrers. All permitted values will be registered in the OpenURL registry.

XPath

ctx:context-object/ctx:referring-entity/ctx:identifier

Usage

Optional

Format

A URI that is registered in http://info-uri.info/registry/OAIHandler?verb=GetRecord&metadataPrefix=reg&identifier=info:sid/

Example

info:sid/google
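
The two ways of expressing the referrer can appear side by side, as in the sample file in Appendix A. The fragment below is a sketch based on that sample: the first identifier is the referring URL, the second is the referrer name. The namespace declaration is repeated only to keep the fragment well-formed on its own.

```xml
<referring-entity xmlns="info:ofi/fmt:xml:xsd:ctx">
	<!-- Referring URL; ampersands must be escaped as &amp; in XML -->
	<identifier>http://www.google.nl/search?hl=nl&amp;q=beleidsregels+artikel+4%3A84&amp;meta=</identifier>
	<!-- Referrer name, taken from the list of known referrers -->
	<identifier>info:sid/google</identifier>
</referring-entity>
```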

4.1.4. <requester>

The user who has sent the request for the file is identified in the <requester> element.

...

Geographic location

 

Description

The country from which the request originated may also be provided explicitly.

XPath

ctx:context-object/ctx:requester/ctx:metadata-by-val/ctx:metadata/dcterms:spatial

If this element is used, the <metadata> element must be preceded by
ctx:requester/ctx:metadata-by-val/ctx:format
with value
"http://dublincore.org/documents/2008/01/14/dcmi-terms/"

Usage

Optional

Format

A two-letter code in lower case, following the ISO 3166-1 alpha-2 standard. http://www.iso.org/iso/english_country_names_and_code_elements

Example

ne
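
The sample file in Appendix A does not include a geographic location, so the following fragment is only a sketch of how the XPath above might translate into XML. It assumes that the dcterms prefix is bound to the same namespace as in Appendix A, and it uses the value from the Example row.

```xml
<requester xmlns="info:ofi/fmt:xml:xsd:ctx"
		xmlns:dcterms="http://dublincore.org/documents/2008/01/14/dcmi-terms/">
	<metadata-by-val>
		<!-- The format element precedes the metadata element -->
		<format>http://dublincore.org/documents/2008/01/14/dcmi-terms/</format>
		<metadata>
			<!-- Two-letter country code in lower case -->
			<dcterms:spatial>ne</dcterms:spatial>
		</metadata>
	</metadata-by-val>
</requester>
```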

4.1.5. <service-type>

Request Type

 

Description

The request type specifies if the request is for an object file or a metadata record.

XPath

ctx:context-object/ctx:service-type/ctx:metadata-by-val/ctx:metadata/dcterms:format

If this element is used, the <metadata> element must be preceded by
ctx:service-type/ctx:metadata-by-val/ctx:format
with value
"http://dublincore.org/documents/2008/01/14/dcmi-terms/"

Usage

Mandatory

Format

Two values are allowed: "objectFile" or "metadataView"

Example

objectFile
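
Appendix A shows a request for an object file. For comparison, the sketch below shows how a request for a metadata record might be recorded, using the second permitted value. The structure is copied from Appendix A; only the value differs. The namespace declarations are repeated so that the fragment is well-formed on its own.

```xml
<service-type xmlns="info:ofi/fmt:xml:xsd:ctx"
		xmlns:dcterms="http://dublincore.org/documents/2008/01/14/dcmi-terms/">
	<metadata-by-val>
		<format>http://dublincore.org/documents/2008/01/14/dcmi-terms/</format>
		<metadata>
			<!-- "objectFile" or "metadataView" -->
			<dcterms:format>metadataView</dcterms:format>
		</metadata>
	</metadata-by-val>
</service-type>
```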

4.1.6. <resolver> and <referrer>

Host name

 

Description

An identification of the institution that is responsible for the repository in which the requested document is stored.

XPath

ctx:context-object/ctx:resolver/ctx:identifier

Usage

Optional

Format

A unique global identifier taken from the WorldCat registry of institutions, catalogues and OpenURL resolvers.

Example

http://www.worldcat.org/libraries/53238

...

Link resolver Context Identifier

 

Description

The identifier of the context from within which the user triggered the usage of the target resource.

XPath

ctx:context-object/ctx:referrer/ctx:identifier

Usage

Optional

Format

URL

Example

info:sid/dlib.org:dlib

4.2. Extensions

4.2.1. <requester>

Classification

 

Description

The user may be categorised, using a list of descriptive terms. If no classification is possible, it must be omitted.

XPath

ctx:context-object/ctx:requester/ctx:metadata-by-val/ctx:metadata/dini:requesterinfo/dini:classification

If this element is used, the <metadata> element must be preceded by
ctx:requester/ctx:metadata-by-val/ctx:format
with value
"http://dini.de/namespace/oas-requesterinfo"

Usage

Optional

Format

Three values are allowed:

  • "internal": classification for technical, system-internal accesses. Examples would be automated availability and consistency checks, cron jobs, keep-alive queries etc.
  • "administrative": classification for accesses that are being made due to human decision but are for administrative reasons only. Examples would be manual quality assurance, manual check for failures, test runs etc.
  • "institutional": classification for accesses that are made from within the institution running the service in question, regardless of whether they are for administrative reasons.

Example

institutional
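
The fragment below sketches how a classification might be recorded, following the structure of the sample file in Appendix A. The dini prefix is used here to mirror the XPath above; Appendix A declares the same namespace as a default namespace instead. The other <requesterinfo> children shown in Appendix A are left out for brevity, and whether they may be omitted in practice is not stated in this section.

```xml
<requester xmlns="info:ofi/fmt:xml:xsd:ctx">
	<metadata-by-val>
		<!-- The format element precedes the metadata element -->
		<format>http://dini.de/namespace/oas-requesterinfo</format>
		<metadata>
			<dini:requesterinfo xmlns:dini="http://dini.de/namespace/oas-requesterinfo">
				<!-- "internal", "administrative" or "institutional" -->
				<dini:classification>administrative</dini:classification>
			</dini:requesterinfo>
		</metadata>
	</metadata-by-val>
</requester>
```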

...

Code Block
Listing 5
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
				xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
				xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
	<soap:Body>
		<ReportResponse xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
						xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
						xmlns="http://www.niso.org/schemas/sushi"
						xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
			<Requestor>
				<ID>www.logaggregator.nl</ID>
				<Name>Log Aggregator</Name>
				<Email>logaggregator@surf.nl</Email>
			</Requestor>
			<CustomerReference>
				<ID>www.leiden.edu</ID>
				<Name>Leiden University</Name>
			</CustomerReference>
			<ReportDefinition Release="urn:DRv1" Name="Daily Report v1">
				<Filters>
					<UsageDateRange>
						<Begin>2009-12-22</Begin>
						<End>2009-12-23</End>
					</UsageDateRange>
				</Filters>
			</ReportDefinition>
			<Exception>
				<Number>2</Number>
				<Message>The file describing the internet robots is not accessible.</Message>
			</Exception> 
		</ReportResponse>
	</soap:Body>
</soap:Envelope>


While the repository is still producing the requested report, it sends a response that is very similar to Listing 6. The estimated time of completion is provided in the <Data> element. According to the documentation of the SUSHI XML schema, this element may be used for any other optional data.

Code Block
Listing 6
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
				xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
				xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
	<soap:Body>
		<ReportResponse xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
						xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
						xmlns="http://www.niso.org/schemas/sushi"
						xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
			<Requestor>
				<ID>www.logaggregator.nl</ID>
				<Name>Log Aggregator</Name>
				<Email>logaggregator@surf.nl</Email>
			</Requestor>
			<CustomerReference>
				<ID>www.leiden.edu</ID>
				<Name>Leiden University</Name>
			</CustomerReference>
			<ReportDefinition Release="urn:DRv1" Name="Daily Report v1">
				<Filters>
					<UsageDateRange>
						<Begin>2009-12-22</Begin>
						<End>2009-12-23</End>
					</UsageDateRange>
				</Filters>
			</ReportDefinition>
			<Exception>
				<Number>3</Number>
				<Message>The report is not yet available. The estimated time of completion is
				provided under "Data".</Message>
				<Data>2010-01-08T12:13:00+01:00</Data>
			</Exception>
		</ReportResponse>
	</soap:Body>
</soap:Envelope>

...


Error numbers and the corresponding Error messages are also provided in the table below.

...

[To be done: find a web location; create a "cool" URI; implement the above mechanism]
Appendix

Code Block
Appendix A: Sample OpenURL Context Object File
<?xml version="1.0" encoding="UTF-8"?>
<context-objects xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
				xmlns:dcterms="http://dublincore.org/documents/2008/01/14/dcmi-terms/"
				xmlns:sv="info:ofi/fmt:xml:xsd:sch_svc"
				xsi:schemaLocation="info:ofi/fmt:xml:xsd:ctx http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:ctx"
				xmlns="info:ofi/fmt:xml:xsd:ctx">
	<context-object timestamp="2009-07-29T08:15:46+01:00" identifier="b06c0444f37249a0a8f748d3b823ef2a">

		<referent>
			<identifier>https://openaccess.leidenuniv.nl/bitstream/1887/12100/1/Thesis.pdf</identifier>
			<identifier>http://hdl.handle.net/1887/12100</identifier>
		</referent>

		<referring-entity>
			<identifier>http://www.google.nl/search?hl=nl&amp;q=beleidsregels+artikel+4%3A84&amp;meta=</identifier>
			<identifier>info:sid/google</identifier>
		</referring-entity>

		<requester>
			<metadata-by-val>
				<format>http://dini.de/namespace/oas-requesterinfo</format>
				<metadata>
					<requesterinfo xmlns="http://dini.de/namespace/oas-requesterinfo">
						<hashed-ip>b505e629c508bdcfbf2a774df596123dd001cee172dae5519660b6014056f53a</hashed-ip>
						<hashed-c>d001cee172dae5519660b6014056f5346d05e629c508bdcfbf2a774df596123d</hashed-c>
						<hostname>uni-saarland.de</hostname>
						<classification>institutional</classification>
						<hashed-session>660b14056f5346d0</hashed-session>
						<user-agent>mozilla/5.0 (windows; u; windows nt 5.1; de; rv:1.8.1.1) gecko/20061204</user-agent>
					</requesterinfo>
				</metadata>
			</metadata-by-val>
		</requester>

		<service-type>
			<metadata-by-val>
				<format>http://dublincore.org/documents/2008/01/14/dcmi-terms/</format>
				<metadata>
					<dcterms:format>objectFile</dcterms:format>
				</metadata>
			</metadata-by-val>
		</service-type>

		<resolver>
			<identifier>http://www.worldcat.org/libraries/53238</identifier>
		</resolver>

		<referrer>
			<identifier>info:sid/dlib.org:dlib</identifier>
		</referrer>

	</context-object>
</context-objects>

Code Block
Appendix B: Schema for Robot filter List
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">

	<xs:element name="exclusions">
		<xs:complexType>
			<xs:sequence>
				<xs:element ref="sources"/>
				<xs:element ref="robot-list"/>
			</xs:sequence>
			<xs:attributeGroup ref="attlist.exclusions"/>
		</xs:complexType>
	</xs:element>

	<xs:attributeGroup name="attlist.exclusions">
		<xs:attribute name="version" type="xs:string"/>
		<xs:attribute name="datestamp" type="xs:date"/>
	</xs:attributeGroup>

	<xs:element name="sources">
		<xs:complexType>
			<xs:sequence>
				<xs:element ref="source" minOccurs="0" maxOccurs="unbounded"/>
			</xs:sequence>
		</xs:complexType>
	</xs:element>

	<xs:element name="source">
		<xs:complexType>
			<xs:simpleContent>
				<xs:extension base="xs:string">
					<xs:attribute name="id" type="xs:ID" use="required"/>
					<xs:attribute name="name" type="xs:string"/>
					<xs:attribute name="version" type="xs:string"/>
					<xs:attribute name="datestamp" type="xs:date"/>
				</xs:extension>
			</xs:simpleContent>
		</xs:complexType>
	</xs:element>

	<xs:element name="sourceRef">
		<xs:complexType>
			<xs:simpleContent>
				<xs:extension base="xs:string">
					<xs:attribute name="id" type="xs:IDREF" use="required"/>
				</xs:extension>
			</xs:simpleContent>
		</xs:complexType>
	</xs:element>

	<xs:element name="robot-list">
		<xs:complexType>
			<xs:sequence>
				<xs:element ref="useragent" minOccurs="0" maxOccurs="unbounded"/>
			</xs:sequence>
		</xs:complexType>
	</xs:element>

	<xs:element name="useragent">
		<xs:complexType>
			<xs:sequence>
				<xs:element ref="regEx"/>
				<xs:element ref="sourceRef" minOccurs="0" maxOccurs="unbounded"/>
			</xs:sequence>
		</xs:complexType>
	</xs:element>

	<xs:element name="regEx" type="xs:string"/>

</xs:schema>


Code Block
Appendix C: Sample Robot filter list
<?xml version="1.0" encoding="UTF-8"?>

<exclusions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
			xsi:noNamespaceSchemaLocation="robotlist.xsd"
			version="1.0"
			datestamp="2010-04-10">

	<sources>
		<source id="l1" name="COUNTER" version="R3" datestamp="2010-04-01">COUNTER list of internet robots</source>
		<source id="l2" name="PLOS">PLOS list of internet robots</source>
	</sources>

	<robot-list>
		<useragent>
			<regEx>[^a]fish</regEx>
			<sourceRef id="l2"/>
		</useragent>
		<useragent>
			<regEx>[+:,\.\;\/-]bot</regEx>
			<sourceRef id="l2"/>
		</useragent>
		<useragent>
			<regEx>acme\.spider</regEx>
			<sourceRef id="l2"/>
		</useragent>
		<useragent>
			<regEx>Brutus\/AET</regEx>
			<sourceRef id="l1"/>
			<sourceRef id="l2"/>
		</useragent>
		<useragent>
			<regEx>Code\sSample\sWeb\sClient</regEx>
			<sourceRef id="l1"/>
		</useragent>

	</robot-list>

</exclusions>