Guidelines for the aggregation and exchange of Usage Data
First draft, based on technical specifications from the OA-Statistics project (written by Daniel Metje and Hans-Werner Hilse), the NEEO project (written by Benoit Pauwels) and the SURE project (written by Peter Verhaar and Lucas van Schaik)
Added the sections based on the Knowledge Exchange meeting in Berlin, and filled in some additional information in these sections.
Revised version, in which comments made by Benoit Pauwels, Hans-Werner Hilse, Thobias Schäfer, Daniel Metje and Paul Needham have been incorporated.
The user who has sent the request for the file is identified in the <requester> element.
<requested/identifier> | IP-address of
The user can be identified by providing the IP address. Including the full IP address in the description of a usage event is not permitted by international privacy laws. For this reason, the IP address needs to be obfuscated: it must be hashed using MD5. Note that MD5 is a one-way hash function rather than encryption, and that MD5 hashes of IP addresses can easily be reversed, since, for example, the IPv4 address space is small enough to enumerate. Whether MD5 hashing protects privacy sufficiently warrants further research by legal advisors.
A data-URI, consisting of the prefix "data:", followed by a 32-digit hexadecimal number.
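The construction of this identifier can be sketched as follows. This is a minimal illustration, not a normative implementation: the function name `pseudonymise_ip` is hypothetical, and the sketch assumes that the hash is computed over the textual form of the IP address encoded as UTF-8, which the guidelines do not state explicitly.

```python
import hashlib


def pseudonymise_ip(ip_address: str) -> str:
    """Return a data-URI holding the MD5 hash of an IP address.

    Assumption: the hash input is the textual IP address encoded as
    UTF-8; the guidelines do not specify the exact byte representation.
    """
    # hexdigest() yields the 32-digit hexadecimal form of the 128-bit hash.
    digest = hashlib.md5(ip_address.encode("utf-8")).hexdigest()
    # Prefix with "data:" to form the identifier described above.
    return "data:" + digest


if __name__ == "__main__":
    # 192.0.2.1 is an address reserved for documentation purposes.
    print(pseudonymise_ip("192.0.2.1"))
```

Because the function is deterministic, the same IP address yields the same identifier at every repository, which is what allows requests to be matched after aggregation. It is also the reason the hash can be reversed by enumerating candidate addresses.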
The IP address of the requester is pseudonymised by hashing before it is exchanged and taken outside the web server to another location. Individual users can therefore still be recognised when usage data are aggregated from distributed repositories, but they cannot be traced back to a 'natural person'. This method appears to be consistent with the European Act for Protection of Personal data. The summary can be found here: http://europa.eu/legislation_summaries/information_society/l14012_en.htm. Further legal research is needed to determine whether this method is sufficient to protect the personal data of a 'natural person', in order to operate within the boundaries of the law.
<requested/identifier> | C-class Subnet