...

OAI-PMH is a relatively lightweight protocol that does not allow for bidirectional traffic. If more reliable error handling is required, the Standardised Usage Statistics Harvesting Initiative (SUSHI) must be used. SUSHI (http://www.niso.org/schemas/sushi/) was developed by NISO (National Information Standards Organization) in cooperation with COUNTER. This document assumes that the communication between the aggregator and the usage data provider takes place as shown in figure 1.
Figure 1.
The interaction commences when the log aggregator sends a request for a report about the daily usage of a certain repository. Two parameters must be sent as part of this request: (1) the date of the report and (2) the file name of the most recent robot filter. The file name mentioned in the request is compared to the local file name. The repository can return four possible responses.
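To make the two request parameters concrete, the following is a minimal sketch of how a log aggregator might assemble the request with Python's standard library. It assumes, as Listing 2 suggests, that the robot filter file name travels in the Release attribute of <ReportDefinition> and that the report date is expressed as a one-day <UsageDateRange>. The function name build_report_request is hypothetical, and the <Requestor> and <CustomerReference> elements are omitted for brevity.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SUSHI_NS = "http://www.niso.org/schemas/sushi"

# Serialise SOAP elements with the "soap" prefix and SUSHI elements
# in the default namespace, as in the listings of this document.
ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("", SUSHI_NS)


def build_report_request(report_date: date, robot_filter: str) -> str:
    """Return a SOAP-wrapped ReportRequest for one day of usage (sketch)."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{SUSHI_NS}}}ReportRequest")

    # Assumption: the robot filter file name is carried in Release.
    definition = ET.SubElement(
        request,
        f"{{{SUSHI_NS}}}ReportDefinition",
        {"Release": robot_filter, "Name": "Daily Report v1"},
    )
    filters = ET.SubElement(definition, f"{{{SUSHI_NS}}}Filters")
    date_range = ET.SubElement(filters, f"{{{SUSHI_NS}}}UsageDateRange")

    # A daily report runs from the report date to the following day.
    begin = ET.SubElement(date_range, f"{{{SUSHI_NS}}}Begin")
    begin.text = report_date.isoformat()
    end = ET.SubElement(date_range, f"{{{SUSHI_NS}}}End")
    end.text = (report_date + timedelta(days=1)).isoformat()

    return ET.tostring(envelope, encoding="unicode")
```

Calling build_report_request(date(2009, 12, 21), "urn:robots-v1.xml") yields a request for the usage events of 21 December 2009.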

...

<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ReportRequest
        xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
        xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
        xmlns="http://www.niso.org/schemas/sushi"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Requestor>
        <ID>www.logaggregator.nl</ID>
        <Name>Log Aggregator</Name>
        <Email>logaggregator@surf.nl</Email>
      </Requestor>
      <CustomerReference>
        <ID>www.leiden.edu</ID>
        <Name>Leiden University</Name>
      </CustomerReference>
      <ReportDefinition Release="urn:robots-v1.xml" Name="Daily Report v1">
        <Filters>
          <UsageDateRange>
            <Begin>2009-12-21</Begin>
            <End>2009-12-22</End>
          </UsageDateRange>
        </Filters>
      </ReportDefinition>
    </ReportRequest>
  </soap:Body>
</soap:Envelope>



Listing 2.
Note that the intent of the SUSHI request above is to see all the usage events that occurred on 21 December 2009. The SUSHI schema was originally developed for the exchange of COUNTER-compliant reports. The documentation of the SUSHI XML schema explains that COUNTER usage is only reported at the month level. In SURE, only daily reports can be provided. Therefore, the implied time on each date mentioned is assumed to be 0:00. The request in the example thus covers all the usage events that occurred between 2009-12-21T00:00:00 and 2009-12-22T00:00:00.
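The implied-time rule can be illustrated with a short sketch. It maps the Begin and End dates to timestamps at 0:00 and tests whether a usage event falls inside the reported day. Treating the interval as half-open (start included, end excluded) is an assumption made here so that consecutive daily reports do not overlap; the function names are hypothetical.

```python
from datetime import date, datetime, time
from typing import Tuple


def usage_interval(begin: str, end: str) -> Tuple[datetime, datetime]:
    """Map Begin/End dates to datetimes at the implied time 0:00."""
    start = datetime.combine(date.fromisoformat(begin), time.min)
    stop = datetime.combine(date.fromisoformat(end), time.min)
    return start, stop


def in_report(event_time: datetime, begin: str, end: str) -> bool:
    """Return True if the event falls inside the requested period.

    Assumption: the period is half-open, so an event at exactly
    0:00 on the End date belongs to the next daily report.
    """
    start, stop = usage_interval(begin, end)
    return start <= event_time < stop
```

With the dates from Listing 2, an event at 2009-12-21T23:59:59 is part of the report, while an event at 2009-12-22T00:00:00 is not.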
As explained previously, the repository can respond in four different ways. If the parameters of the request are valid, and if the requested report is available, the OpenURL ContextObjects will be sent immediately. The OpenURL ContextObjects will be wrapped in a <Report> element, as can be seen in listing 3.


<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ReportResponse xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
        xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
        xmlns="http://www.niso.org/schemas/sushi"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Requestor>
        <ID>www.logaggregator.nl</ID>
        <Name>Log Aggregator</Name>
        <Email>logaggregator@surf.nl</Email>
      </Requestor>
      <CustomerReference>
        <ID>www.leiden.edu</ID>
        <Name>Leiden University</Name>
      </CustomerReference>
      <ReportDefinition Release="urn:DRv1" Name="Daily Report v1">
        <Filters>
          <UsageDateRange>
            <Begin>2009-12-22</Begin>
            <End>2009-12-23</End>
          </UsageDateRange>
        </Filters>
      </ReportDefinition>
      <Exception>
        <Number>1</Number>
        <Message>The range of dates that was provided is not valid. Only daily reports are available.</Message>
      </Exception>
    </ReportResponse>
  </soap:Body>
</soap:Envelope>

Listing 3.
If the begin date and the end date in the request of the log aggregator form a period that exceeds one day, an error message must be sent. In the SUSHI schema, such messages may be sent in an <Exception> element. Three types of errors can be distinguished, and each error type is given its own number. A human-readable error message is provided under <Message>.


<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ReportResponse xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
        xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
        xmlns="http://www.niso.org/schemas/sushi"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Requestor>
        <ID>www.logaggregator.nl</ID>
        <Name>Log Aggregator</Name>
        <Email>logaggregator@surf.nl</Email>
      </Requestor>
      <CustomerReference>
        <ID>www.leiden.edu</ID>
        <Name>Leiden University</Name>
      </CustomerReference>
      <ReportDefinition Release="urn:DRv1" Name="Daily Report v1">
        <Filters>
          <UsageDateRange>
            <Begin>2009-12-22</Begin>
            <End>2009-12-23</End>
          </UsageDateRange>
        </Filters>
      </ReportDefinition>
      <Exception>
        <Number>1</Number>
        <Message>The range of dates that was provided is not valid. Only daily reports are available.</Message>
      </Exception>
    </ReportResponse>
  </soap:Body>
</soap:Envelope>



Listing 4.
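The date-range check that leads to exception number 1 could be sketched as follows. This is an illustration only: it assumes, following the request in Listing 2, that a valid daily request has an End date exactly one day after its Begin date. The function name check_date_range is hypothetical.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

DATE_RANGE_ERROR = (
    1,
    "The range of dates that was provided is not valid. "
    "Only daily reports are available.",
)


def check_date_range(begin: str, end: str) -> Optional[Tuple[int, str]]:
    """Return (number, message) for an invalid range, else None.

    Assumption: a daily report spans exactly one day, so End must
    be the day after Begin.
    """
    span = date.fromisoformat(end) - date.fromisoformat(begin)
    if span != timedelta(days=1):
        return DATE_RANGE_ERROR
    return None
```

A repository could run this check before producing a report and, when a tuple is returned, place its two values in the <Number> and <Message> children of <Exception>.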
A second type of error occurs when the file mentioned in the request cannot be accessed. In this situation, the response will look as follows:


<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ReportResponse xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
        xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
        xmlns="http://www.niso.org/schemas/sushi"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Requestor>
        <ID>www.logaggregator.nl</ID>
        <Name>Log Aggregator</Name>
        <Email>logaggregator@surf.nl</Email>
      </Requestor>
      <CustomerReference>
        <ID>www.leiden.edu</ID>
        <Name>Leiden University</Name>
      </CustomerReference>
      <ReportDefinition Release="urn:DRv1" Name="Daily Report v1">
        <Filters>
          <UsageDateRange>
            <Begin>2009-12-22</Begin>
            <End>2009-12-23</End>
          </UsageDateRange>
        </Filters>
      </ReportDefinition>
      <Exception>
        <Number>2</Number>
        <Message>The file describing the internet robots is not accessible.</Message>
      </Exception>
    </ReportResponse>
  </soap:Body>
</soap:Envelope>
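
On the aggregator side, the error number and message can be read from such a response with a few lines of code. The sketch below uses Python's standard xml.etree.ElementTree module; the function name read_exception is hypothetical, and the sketch assumes that a response carries at most one <Exception> element.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

SUSHI = {"sushi": "http://www.niso.org/schemas/sushi"}


def read_exception(response_xml: str) -> Optional[Tuple[int, str]]:
    """Return (number, message) from a ReportResponse, or None.

    Assumption: at most one <Exception> is present in the response.
    """
    root = ET.fromstring(response_xml)
    exception = root.find(".//sushi:Exception", SUSHI)
    if exception is None:
        return None
    number = int(exception.findtext("sushi:Number", namespaces=SUSHI))
    raw = exception.findtext("sushi:Message", default="", namespaces=SUSHI)
    # Collapse line breaks and indentation inside the message text.
    message = " ".join(raw.split())
    return number, message
```

For the response above, read_exception would return the number 2 together with the message about the inaccessible robot file.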

...



When the repository is in the course of producing the requested report, a response will be sent that is very similar to listing 6. The estimated time of completion will be provided in the <Data> element. According to the documentation of the SUSHI XML schema, this element may be used for any other optional data.

...

<?xml version="1.0" encoding="UTF-8"?>
<exclusions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="robotlist.xsd"
    version="1.0"
    datestamp="2010-04-10">
  <sources>
    <source id="l1" name="COUNTER" version="R3" datestamp="2010-04-01">COUNTER list of internet robots</source>
    <source id="l2" name="PLOS">PLOS list of internet robots</source>
  </sources>
  <robot-list>
    <useragent>
      <regEx>[^a]fish</regEx>
      <sourceRef id="l2"/>
    </useragent>
    <useragent>
      <regEx>[+:,\.\;\/-]bot</regEx>
      <sourceRef id="l2"/>
    </useragent>
    <useragent>
      <regEx>acme\.spider</regEx>
      <sourceRef id="l2"/>
    </useragent>
    <useragent>
      <regEx>Brutus\/AET</regEx>
      <sourceRef id="l1"/>
      <sourceRef id="l2"/>
    </useragent>
    <useragent>
      <regEx>Code\sSample\sWeb\sClient</regEx>
      <sourceRef id="l1"/>
    </useragent>
  </robot-list>
</exclusions>
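
A robot list in this format could be applied to incoming user-agent strings roughly as follows. This is a sketch under stated assumptions: the patterns are treated as Python-compatible regular expressions, matched case-sensitively anywhere in the user-agent string. The source does not prescribe these matching rules, and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Pattern


def load_robot_patterns(exclusions_xml: str) -> List[Pattern[str]]:
    """Compile every non-empty <regEx> found in an exclusions document."""
    root = ET.fromstring(exclusions_xml)
    return [re.compile(el.text) for el in root.iter("regEx") if el.text]


def is_robot(user_agent: str, patterns: List[Pattern[str]]) -> bool:
    """Return True if any pattern occurs in the user-agent string.

    Assumption: an unanchored, case-sensitive search is intended.
    """
    return any(p.search(user_agent) for p in patterns)
```

Usage events whose user agent is flagged by is_robot would then be left out of the daily report.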

...