
Abstract

The Digital Author Identifier (DAI) is a unique national number assigned to every author who has been appointed to a position at a Dutch university or research institute, or who has some other relevant connection with one of these organisations. The DAI brings together the different ways of writing an author's name and distinguishes between authors with the same name.

In September 2011 JISC sent out a survey about author identifier systems. The Dutch response can be found at https://docs.google.com/document/pub?id=1bitF0Ylh6GUEaWYGlFE8OQTgHnNhiESk-ktT_FRUWQo or as a PDF at http://wiki.surffoundation.nl/download/attachments/3473692/JISCNameIdentifierSurvey.pdf . The response answers many questions related to the DAI.

What is a Digital Author Identifier?

The Digital Author Identifier (DAI) is a unique national number for every author who has an appointment at, or another relevant connection with, a Dutch university, university of applied sciences (hogeschool) or research institute. The use of the DAI is registered with the Dutch Data Protection Authority (College Bescherming Persoonsgegevens, CBP); see http://www.cbpweb.nl/asp/ORDetail.asp?moid=808d858982 . In addition, the DAI has been prepared for the international ISO standard "ISNI" (International Standard Name Identifier).

By design, a DAI is a promoted Pica Production Number (PPN): it results from a positive match between a person identified by a record in the National Thesaurus for Author names (NTA) and an appointment registered in the local Current Research Information System (CRIS). As a result, a DAI can only be assigned to researchers appointed at a university.

Background

The Digital Author Identifier (DAI) was developed in the Netherlands to prevent ambiguity when retrieving the work of a single author, and to identify each author uniquely by a numeric identifier, thereby overcoming the problems caused by name variants.

systems, parts and components

The Author Identification System consists of a central component, the National Thesaurus for Author names (NTA), which is part of the Shared Cataloguing System (GGC), and decentralised components, the local Current Research Information Systems (CRIS) located at each university. Currently, all universities use METIS as their CRIS.

responsibilities

OCLC is responsible for the technical application management (BiSL: Application management) of the GGC and thereby of the NTA. Which partner will be responsible for functional management (BiSL: Business information management) is still under discussion at the UKB meetings (UKB is the collaboration between the research libraries and the National Library). The National Library is a candidate. The CRIS systems are the responsibility of the universities and research institutes.

legal documents

The National Thesaurus for Author names is protected by law. Only members who share the GGC can access and use the NTA. The exact wording can be read on the website of the Dutch Data Protection Authority (CBP; College Bescherming Persoonsgegevens): http://www.cbpweb.nl/asp/ORDetail.asp?moid=808d858982

data range

The NTA is part of the GGC, a generic bibliographic cataloguing database, so it contains a broad spectrum of authors, from novelists to scientific authors.
Because a DAI results from a match between an NTA record and an appointment registered in the local CRIS, a researcher must in theory have published before in order to be included in the NTA. However, when no NTA record is found while trying to match a local CRIS record, an NTA record may be created from the CRIS record so that a DAI can still be minted.

database quality

The NTA contains authors registered by public libraries and research libraries that are members of OCLC (previously OCLC-PICA).
The data is entered by qualified cataloguers at libraries or specialist departments, and by the employees who administer METIS, the CRIS currently used by all Dutch universities.

database moderation

Changes in the central system may be made by cataloguers at university libraries and research information departments who have access to WinIBW or WebGGC, and by METIS administrators through a special version of the WebGGC. Note that each research institute is only allowed to change the part of a record that is associated with its own institution.

check, match, mint & assign

As described above, the identifier is minted upon a positive match between a person record in the NTA and an appointment record in the local CRIS.

From within the METIS application, the administrator uses a special version of the WebGGC for the NTA that preserves the context of the person selected in METIS. The administrator then searches the NTA for a positive match. If no match can be found, the administrator may create a new NTA record. The administrator then registers some metadata for the person, taken from the local METIS (e.g. name, date of birth, employee id, appointment date). Upon completion of this process, the resulting DAI is sent back to METIS.
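
The check, match, mint and assign steps described above can be illustrated with a small sketch. This is not how METIS or the WebGGC are implemented: in practice an administrator performs the search by hand. The record classes, the naive matching rule (same normalised name and a compatible year of birth) and the `new_ppn` parameter standing in for a number issued by the GGC are all hypothetical and serve only to show the order of the steps.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class NtaRecord:
    """Hypothetical, simplified person record in the NTA."""
    ppn: str
    name: str
    birth_year: Optional[int] = None
    appointments: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class CrisPerson:
    """Hypothetical, simplified person record in a local CRIS."""
    employee_id: str
    name: str
    birth_year: int
    institution: str


def _normalise(name: str) -> str:
    # Collapse whitespace and ignore case (assumed matching rule).
    return " ".join(name.split()).casefold()


def find_match(nta: List[NtaRecord], person: CrisPerson) -> Optional[NtaRecord]:
    """Check/match: return the first NTA record that plausibly is this person."""
    for record in nta:
        same_name = _normalise(record.name) == _normalise(person.name)
        year_ok = record.birth_year in (None, person.birth_year)
        if same_name and year_ok:
            return record
    return None


def assign_dai(nta: List[NtaRecord], person: CrisPerson, new_ppn: str) -> str:
    """Mint/assign: reuse the matched PPN, or create an NTA record first."""
    record = find_match(nta, person)
    if record is None:
        # No match: create an NTA record from the CRIS data.
        record = NtaRecord(ppn=new_ppn, name=person.name,
                           birth_year=person.birth_year)
        nta.append(record)
    # Register the appointment on the record; the PPN is returned as the DAI.
    record.appointments.append((person.institution, person.employee_id))
    return record.ppn
```

In this sketch the value returned by `assign_dai` plays the role of the DAI that is sent back to the CRIS.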

Institutions that do not use the METIS application can use either the WinIBW application (a generic application used by libraries that are already cataloguing in the GGC) or the WebGGC web interface.

Syntax

The DAI (or PPN) consists of 9 or 10 characters. The first 8 or 9 characters are digits. The last (9th or 10th) character is a check character: a modulus 11 check digit which, as in the ISBN, can be an X. More on the MOD11 algorithm can be found on, for example, Wikipedia.
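
A minimal sketch of such a check follows. The page does not specify the weights, so the sketch assumes ISBN-10 style weighting: the digits before the check character are weighted from left to right with n+1 down to 2 (n being the number of digits), and a check value of 10 is written as X. Under this assumption the Maastricht example from the comment at the end of this page, 30820137X, validates. The function names are illustrative only.

```python
def dai_check_char(body: str) -> str:
    """Compute the MOD11 check character for the digits of a DAI/PPN.

    Assumption: ISBN-10 style weights (n+1 down to 2, left to right).
    """
    if not (body.isascii() and body.isdigit()):
        raise ValueError("body must consist of ASCII digits only")
    n = len(body)
    total = sum(int(d) * (n + 1 - i) for i, d in enumerate(body))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_dai(dai: str) -> bool:
    """Return True if a 9 or 10 character DAI has a correct check character."""
    dai = dai.strip().upper()  # treat 'x' and 'X' alike (assumption)
    if len(dai) not in (9, 10):
        return False
    body, check = dai[:-1], dai[-1]
    if not (body.isascii() and body.isdigit()):
        return False
    return dai_check_char(body) == check


print(is_valid_dai("30820137X"))  # True
```

Note that, under the same assumption, the illustrative number 123456785 used elsewhere on this page does not validate (the computed check character would be 9), which suggests it is a placeholder rather than a real DAI.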

At the moment the INFO-URI namespace is used as an authority namespace. The DAI is URI-fied under the EU-REPO sub-namespace. This namespace defines components for compound objects in the Institutional Repositories.

What does a DAI look like

URI-fied a DAI looks like this: info:eu-repo/dai/nl/123456785

The DAI is the number after the string info:eu-repo/dai/nl/ . A DAI is a number like 123456785. The last character is a MOD11 check-digit.

The string: info:eu-repo/dai/nl/ is just an authority namespace, telling the user or machine that the number is a DAI originating from the Netherlands.
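
As a small illustration, the namespace can be added to or stripped from a DAI with plain string handling. The helper names are hypothetical, and upper-casing the identifier (so that a trailing x becomes X) is an assumption rather than an agreed convention.

```python
DAI_PREFIX = "info:eu-repo/dai/nl/"


def dai_to_uri(dai: str) -> str:
    """Prefix a bare DAI with the Dutch DAI authority namespace."""
    return DAI_PREFIX + dai.strip().upper()


def dai_from_uri(uri: str) -> str:
    """Extract the bare DAI from a URI-fied DAI."""
    uri = uri.strip()
    if not uri.startswith(DAI_PREFIX):
        raise ValueError("not a Dutch DAI URI: " + uri)
    dai = uri[len(DAI_PREFIX):]
    if not dai:
        raise ValueError("URI contains no DAI")
    return dai.upper()


print(dai_to_uri("123456785"))                        # info:eu-repo/dai/nl/123456785
print(dai_from_uri("info:eu-repo/dai/nl/123456785"))  # 123456785
```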


In practice the DAI can be used to unambiguously retrieve the publications of one author across different systems, for example in NARCIS.

How does the DAI fit into the ISNI? It is said that OCLC can do this, but how? The ISNI is a 16-digit number. Does this subset of Dutch authors get a prefix? What happens to the final check character?

XML schema

(better explanation needed!)

An extension has been made for MODS metadata. The XML schema of the extension can be found here: http://purl.org/REP/standards/dai-extension.xsd

For the use of this extension in MODS see: Use of MODS

OWL

The DAI OWL file can be found at http://purl.org/info:eu-repo/dai#

Identifier workflow

(better explanation needed!)

Overview of the Technical DAI Infrastructure

Organisational overview

Systems linked to the DAI

See the document at https://docs.google.com/document/pub?id=1bitF0Ylh6GUEaWYGlFE8OQTgHnNhiESk-ktT_FRUWQo

1 Comment

  1. Unknown User (j.odekerken@maastrichtuniversity.nl)

    I see a small error in the DAI description:

    Syntax

    (better explanation needed!)

    The DAI is a 7-digit number.

    at the moment the info namaspace is used.

    (warning)

    info:eu-repo/dai/nl/1234567

     The DAI (as we use it from OCLC) is in fact a string of 9 characters. The first 8 characters are digits; the 9th character is a check character and can be an X. ISBN/ISSN work in a similar way. Example of a Maastricht DAI: info:eu-repo/dai/nl/30820137X. OCLC systems are not case-sensitive (an x or an X). We may need to make agreements about this within WISH, but I cannot assess that exactly. There is also a typo in "namaspace". Jos