...

Listing 2
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
	<soap:Body>
		<ReportRequest xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
 xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
 xmlns="http://www.niso.org/schemas/sushi"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
			<Requestor>
				<ID>www.logaggregator.nl</ID>
				<Name>Log Aggregator</Name>
				<Email>logaggregator@surf.nl</Email>
			</Requestor>
			<CustomerReference>
				<ID>www.leiden.edu</ID>
				<Name>Leiden University</Name>
			</CustomerReference>
			<ReportDefinition Release="urn:robots-v1.xml" Name="Daily Report v1">
				<Filters>
					<UsageDateRange>
						<Begin>2009-12-21</Begin>
						<End>2009-12-22</End>
					</UsageDateRange>
				</Filters>
			</ReportDefinition>
		</ReportRequest>
	</soap:Body>
</soap:Envelope>
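
A request like the one in Listing 2 could be assembled programmatically along the following lines. This is only a sketch using Python's standard library: the function name `build_report_request` and the dictionary keys are hypothetical, and the output uses an explicit `sushi:` prefix instead of the default namespace shown in the listing, which is equivalent for a namespace-aware parser.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SUSHI_NS = "http://www.niso.org/schemas/sushi"

# Use readable prefixes when serialising (otherwise ns0, ns1, ... are generated).
ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("sushi", SUSHI_NS)


def _child(parent, name, text=None):
    """Append a child element in the SUSHI namespace and return it."""
    element = ET.SubElement(parent, f"{{{SUSHI_NS}}}{name}")
    if text is not None:
        element.text = text
    return element


def build_report_request(requestor, customer, release, report_name, begin, end):
    """Return a SOAP envelope (as a string) holding a SUSHI ReportRequest."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = _child(body, "ReportRequest")

    req = _child(request, "Requestor")
    _child(req, "ID", requestor["id"])
    _child(req, "Name", requestor["name"])
    _child(req, "Email", requestor["email"])

    cust = _child(request, "CustomerReference")
    _child(cust, "ID", customer["id"])
    _child(cust, "Name", customer["name"])

    definition = _child(request, "ReportDefinition")
    definition.set("Release", release)
    definition.set("Name", report_name)
    date_range = _child(_child(definition, "Filters"), "UsageDateRange")
    _child(date_range, "Begin", begin)
    _child(date_range, "End", end)

    return ET.tostring(envelope, encoding="unicode")
```

Called with the values from Listing 2, the function yields an envelope whose `Begin` and `End` elements carry `2009-12-21` and `2009-12-22`.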

...

Listing 3
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
	<soap:Body>
		<ReportResponse xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
 xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
 xmlns="http://www.niso.org/schemas/sushi"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
			<Requestor>
				<ID>www.logaggregator.nl</ID>
				<Name>Log Aggregator</Name>
				<Email>logaggregator@surf.nl</Email>
			</Requestor>
			<CustomerReference>
				<ID>www.leiden.edu</ID>
				<Name>Leiden University</Name>
			</CustomerReference>
			<ReportDefinition Release="urn:DRv1" Name="Daily Report v1">
				<Filters>
					<UsageDateRange>
						<Begin>2009-12-22</Begin>
						<End>2009-12-23</End>
					</UsageDateRange>
				</Filters>
			</ReportDefinition>
			<Report>
				<ctx:context-objects xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns:dcterms="http://dublincore.org/documents/2008/01/14/dcmi-terms/"
 xmlns:ctx="info:ofi/fmt:xml:xsd:ctx">
					<ctx:context-object timestamp="2009-11-09T05:56:18+01:00">
...
</ctx:context-object>
				</ctx:context-objects>
			</Report>
		</ReportResponse>
	</soap:Body>
</soap:Envelope>

If the begin date and the end date in the log aggregator's request span more than one day, an error message must be sent. In the SUSHI schema, such messages are sent in an <Exception> element. Three types of errors are distinguished, each with its own number. A human-readable error message is provided in <Message>.

Listing 4
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
	<soap:Body>
		<ReportResponse xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
 xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
 xmlns="http://www.niso.org/schemas/sushi"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
			<Requestor>
				<ID>www.logaggregator.nl</ID>
				<Name>Log Aggregator</Name>
				<Email>logaggregator@surf.nl</Email>
			</Requestor>
			<CustomerReference>
				<ID>www.leiden.edu</ID>
				<Name>Leiden University</Name>
			</CustomerReference>
			<ReportDefinition Release="urn:DRv1" Name="Daily Report v1">
				<Filters>
					<UsageDateRange>
						<Begin>2009-12-22</Begin>
						<End>2009-12-23</End>
					</UsageDateRange>
				</Filters>
			</ReportDefinition>
			<Exception>
				<Number>1</Number>
				<Message>The range of dates that was provided is not valid. Only daily reports are
available.</Message>
			</Exception> 
		</ReportResponse>
	</soap:Body>
</soap:Envelope>
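
The date-range check that leads to the response in Listing 4 might look roughly like the sketch below. The function name `check_daily_range` is hypothetical, and the rule applied here is an assumption: a range is accepted when the end date is on or at most one day after the begin date. The text only states that the period must not exceed one day.

```python
from datetime import date, timedelta

# Exception number and message as used in Listing 4.
INVALID_RANGE = (
    1,
    "The range of dates that was provided is not valid. "
    "Only daily reports are available.",
)


def check_daily_range(begin, end):
    """Return None for an acceptable range, else (number, message).

    Assumption: acceptable means 0 <= end - begin <= 1 day.
    """
    try:
        start = date.fromisoformat(begin)
        stop = date.fromisoformat(end)
    except ValueError:
        # Dates that cannot be parsed are treated as an invalid range.
        return INVALID_RANGE

    span = stop - start
    if span < timedelta(0) or span > timedelta(days=1):
        return INVALID_RANGE
    return None
```

A repository could call this before generating the report and, when a tuple is returned, copy its two values into the <Number> and <Message> elements of the <Exception>.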

A second type of error occurs when the file mentioned in the request cannot be accessed. In this situation, the response looks as follows:

Listing 5
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
	<soap:Body>
		<ReportResponse xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
 xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
 xmlns="http://www.niso.org/schemas/sushi"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
			<Requestor>
				<ID>www.logaggregator.nl</ID>
				<Name>Log Aggregator</Name>
				<Email>logaggregator@surf.nl</Email>
			</Requestor>
			<CustomerReference>
				<ID>www.leiden.edu</ID>
				<Name>Leiden University</Name>
			</CustomerReference>
			<ReportDefinition Release="urn:DRv1" Name="Daily Report v1">
				<Filters>
					<UsageDateRange>
						<Begin>2009-12-22</Begin>
						<End>2009-12-23</End>
					</UsageDateRange>
				</Filters>
			</ReportDefinition>
			<Exception>
				<Number>2</Number>
				<Message>The file describing the internet robots is not accessible.</Message>
			</Exception> 
		</ReportResponse>
	</soap:Body>
</soap:Envelope>

When the repository is still producing the requested report, it sends a response like the one in Listing 6. The estimated time of completion is provided in the <Data> element. According to the documentation of the SUSHI XML schema, this element may be used for any other optional data.

Listing 6
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ http://schemas.xmlsoap.org/soap/envelope/">
	<soap:Body>
		<ReportResponse xmlns:ctr="http://www.niso.org/schemas/sushi/counter"
 xsi:schemaLocation="http://www.niso.org/schemas/sushi/counter http://www.niso.org/schemas/sushi/counter_sushi3_0.xsd"
 xmlns="http://www.niso.org/schemas/sushi"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
			<Requestor>
				<ID>www.logaggregator.nl</ID>
				<Name>Log Aggregator</Name>
				<Email>logaggregator@surf.nl</Email>
			</Requestor>
			<CustomerReference>
				<ID>www.leiden.edu</ID>
				<Name>Leiden University</Name>
			</CustomerReference>
			<ReportDefinition Release="urn:DRv1" Name="Daily Report v1">
				<Filters>
					<UsageDateRange>
						<Begin>2009-12-22</Begin>
						<End>2009-12-23</End>
					</UsageDateRange>
				</Filters>
			</ReportDefinition>
			<Exception>
				<Number>3</Number>
				<Message>The report is not yet available. The estimated time of completion is
provided under "Data".</Message>
				<Data>2010-01-08T12:13:00+01:00</Data>
			</Exception>
		</ReportResponse>
	</soap:Body>
</soap:Envelope>
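
On the receiving side, the log aggregator has to recognise the three exception numbers shown in Listings 4 to 6. The sketch below illustrates one way to do that with Python's standard library. The function name `read_exception` and the shape of the returned dictionary are hypothetical, and parsing <Data> as an ISO 8601 timestamp for error number 3 is an assumption based on Listing 6.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "sushi": "http://www.niso.org/schemas/sushi",
}


def read_exception(response_xml):
    """Return a dict describing the <Exception>, or None if there is none."""
    root = ET.fromstring(response_xml)
    exc = root.find("./soap:Body/sushi:ReportResponse/sushi:Exception", NS)
    if exc is None:
        return None

    number = int(exc.findtext("sushi:Number", namespaces=NS))
    # Collapse line breaks and indentation inside the message text.
    raw_message = exc.findtext("sushi:Message", default="", namespaces=NS)
    message = " ".join(raw_message.split())

    # Assumption: for error 3, <Data> holds the estimated completion time.
    data = exc.findtext("sushi:Data", namespaces=NS)
    eta = None
    if number == 3 and data:
        eta = datetime.fromisoformat(data.strip())

    return {"number": number, "message": message, "eta": eta}
```

For a response such as Listing 6, the caller could use the `eta` value to decide when to repeat the request.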

error numbers

Error numbers and the corresponding error messages are also provided in the table below.

...

PloS

COUNTER

NEEO

AWstats

Description

[^a]fish


 

 

 

[+:,\.\;\/-]bot


 

 

 

acme\.spider


 

 

 

alexa


 

 

 

Alexandria(\s|\+)prototype(\s|\+)project

Alexandria prototype project

 

 

 

AllenTrack


 

 

 

almaden


 

 

 

appie


 

 

 

Arachmo

Arachmo

 

 

 

archive\.org_bot


 

 

 

arks


 

 

 

asterias


 

 

 

atomz


 

 

 

autoemailspider


 

 

 

awbot


 

 

 

baiduspider


 

 

 

bbot


 

 

 

biadu


 

 

 

biglotron


 

 

 

bloglines


 

 

 

blogpulse


 

 

 

boitho\.com-dc


 

 

 

bookmark-manager


 

 

 

bot[+:,\.\;\/-]


 

 

 

Brutus\/AET

Brutus/AET

 

 

 

bspider


 

 

 

bwh3_user_agent


 

 

 

cfnetwork| checkbot


 

 

 

China\sLocal\sBrowse\s2\.6


 

 

 


Code Sample Web Client

 

 

 

combine


 

 

 

commons-httpclient


 

 

 

ContentSmartz


 

 

 

core


 

 

 

crawl


 

 

 

cursor


 

 

 

custo


 

 

 

DataCha0s\/2\.0


 

 

 

Demo\sBot


 

 

 

docomo


 

 

 

DSurf


 

 

 

dtSearchSpider

dtSearchSpider

 

 

 

dumbot


 

 

 

easydl


 

 

 

EmailSiphon


 

 

 

EmailWolf


 

 

 

exabot


 

 

 

fast-webcrawler


 

 

 

favorg


 

 

 

FDM(\s|\+)1

FDM 1

 

 

 

feedburner


 

 

 

feedfetcher-google


 

 

 

Fetch(\s|\+)API(\s|\+)Request

Fetch API Request

 

 

 

findlinks


 

 

 

gaisbot


 

 

 

GetRight

GetRight

 

 

 

geturl


 

 

 

gigabot


 

 

 

girafabot


 

 

 

gnodspider


 

 

 

Goldfire(\s|\+)Server

Goldfire Server

 

 

 

Googlebot

Googlebot

 

 

 

grub


 

 

 

heritrix


 

 

 

hl_ftien_spider


 

 

 

holmes


 

 

 

htdig


 

 

 

htmlparser


 

 

 

httpget-5\.2\.2

httpget-5.2.2

 

 

 

httrack


 

 

 

HTTrack

HTTrack

 

 

 

ia_archiver


 

 

 

ichiro


 

 

 

iktomi


 

 

 

ilse


 

 

 

internetseer


 

 

 

iSiloX

iSiloX

 

 

 

java


 

 

 

jeeves


 

 

 

jobo


 

 

 

larbin


 

 

 

libwww-perl

libwww-perl

 

 

 

linkbot


 

 

 

linkchecker


 

 

 

linkscan


 

 

 

linkwalker


 

 

 

livejournal\.com


 

 

 

lmspider


 

 

 

LOCKSS


 

 

 

LWP\:\:Simple

LWP::Simple

 

 

 

lwp-request


 

 

 

lwp-tivial


 

 

 

lwp-trivial

lwp-trivial

 

 

 

lycos


 

 

 

mediapartners-google


 

 

 

megite


 

 

 

Microsoft(\s|\+)URL(\s|\+)Control

Microsoft URL Control

 

 

 

milbot

Milbot

 

 

 

mj12bot


 

 

 

mnogosearch


 

 

 

mojeekbot


 

 

 

momspider


 

 

 

motor


 

 

 

msiecrawler


 

 

 

msnbot


 

 

 


MSNBot

 

 

 

MuscatFerre


 

 

 

myweb


 

 

 

NABOT


 

 

 

nagios


 

 

 

NaverBot

NaverBot

 

 

 

netcraft


 

 

 

netluchs


 

 

 

ng\/2\.


 

 

 

no_user_agent


 

 

 

nutch


 

 

 

ocelli


 

 

 

Offline(\s|\+)Navigator

Offline Navigator

 

 

 

OurBrowser


 

 

 

perman


 

 

 

pioneer


 

 

 

playmusic\.com


 

 

 


playstarmusic.com

 

 

 

powermarks


 

 

 

psbot


 

 

 

python


 

 

 


Python-urllib

 

 

 

qihoobot


 

 

 

rambler


 

 

 

Readpaper

Readpaper

 

 

 

redalert| robozilla


 

 

 

robot


 

 

 

scan4mail


 

 

 

scooter


 

 

 

seekbot


 

 

 

seznambot


 

 

 

shoutcast


 

 

 

slurp


 

 

 

sogou


 

 

 

speedy


 

 

 

spider


 

 

 

spider


 

 

 

spiderman


 

 

 

spiderview


 

 

 

Strider

Strider

 

 

 

sunrise


 

 

 

superbot


 

 

 

surveybot


 

 

 

T-H-U-N-D-E-R-S-T-O-N-E

T-H-U-N-D-E-R-S-T-O-N-E

 

 

 

tailrank


 

 

 

technoratibot


 

 

 

Teleport(\s|\+)Pro

Teleport Pro

 

 

 

Teoma

Teoma

 

 

 

titan


 

 

 

turnitinbot


 

 

 

twiceler


 

 

 

ucsd


 

 

 

ultraseek


 

 

 

urlaliasbuilder


 

 

 

voila


 

 

 

w3c-checklink


 

 

 

Wanadoo


 

 

 

Web(\s|\+)Downloader

Web Downloader

 

 

 

WebCloner

WebCloner

 

 

 

webcollage


 

 

 

WebCopier

WebCopier

 

 

 

Webinator


 

 

 

Webmetrics


 

 

 

webmirror


 

 

 

WebReaper

WebReaper

 

 

 

WebStripper

WebStripper

 

 

 

WebZIP

WebZIP

 

 

 

Wget

Wget

 

 

 

wordpress


 

 

 

worm


 

 

 

Xenu(\s|\+)Link(\s|\+)Sleuth

Xenu Link Sleuth

 

 

 

y!j


 

 

 

yacy


 

 

 

yahoo-mmcrawler


 

 

 

yahoofeedseeker


 

 

 

yahooseeker


 

 

 

yandex


 

 

 

yodaobot


 

 

 

zealbot


 

 

 

zeus


 

 

 

zyborg


 

 

 

...
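
The expressions in the first column of the table above are regular expressions that can be tested against the User-Agent string of each logged request. The following sketch shows one possible way to apply them. The names `compile_robot_filter` and `is_robot` are hypothetical, only a handful of patterns from the table are included, and case-insensitive matching is an assumption that the table itself does not confirm.

```python
import re

# A small sample of patterns copied from the table; a real filter
# would load the complete list.
SAMPLE_PATTERNS = [
    r"[^a]fish",
    r"[+:,\.\;\/-]bot",
    r"acme\.spider",
    r"Googlebot",
    r"libwww-perl",
    r"Wget",
]


def compile_robot_filter(patterns, ignore_case=True):
    """Combine all patterns into a single compiled alternation."""
    flags = re.IGNORECASE if ignore_case else 0
    combined = "|".join(f"(?:{pattern})" for pattern in patterns)
    return re.compile(combined, flags)


def is_robot(user_agent, robot_filter):
    """True if any pattern occurs anywhere in the User-Agent string."""
    return robot_filter.search(user_agent) is not None
```

Wrapping each pattern in a non-capturing group keeps alternations inside an individual entry from interfering with its neighbours when the list is joined.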