DRIVER Set naming

Added information to answer questions about recommended set names for "Open Access" and "Embargoed/Delayed Access" subcollections.
See: DRIVER Set naming page.
Explanation: Hybrid repositories, which hold a mixture of metadata-only records and records with full text, are recommended to use a DRIVER set containing only the records whose full text is openly available. The DRIVER set should also not contain Delayed Access records, since these only confuse end users who expect to find Open Access material.
There should not be separate DRIVER recommendations on sets for eTheses.
Explanation: The DRIVER Guidelines serve a wider community. Harvested eTheses should be recognised through the terms used in the Publication type vocabulary.
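As an illustration of how a service provider could restrict a harvest to the DRIVER set, the following minimal sketch builds an OAI-PMH ListRecords request URL. The base URL is hypothetical, and the set specification "driver" is an assumption made for the example; the DRIVER Set naming page gives the recommended values.

```python
from urllib.parse import urlencode


def list_records_url(base_url: str, set_spec: str, metadata_prefix: str = "oai_dc") -> str:
    """Build a selective ListRecords request URL for one OAI-PMH set."""
    query = urlencode(
        {
            "verb": "ListRecords",
            "metadataPrefix": metadata_prefix,
            "set": set_spec,
        }
    )
    return f"{base_url}?{query}"


# Hypothetical repository endpoint; "driver" is an assumed set specification.
url = list_records_url("https://repository.example.org/oai", "driver")
print(url)
```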


Harvest batch size

Increase the recommended batch size from 100-200 records per batch to 100-500 records per batch. See: Harvest batch size page.
Explanation: Experience shows that breaks in an OAI ListRecords exchange are quite rare. The highest number of records per response observed so far was around 6500. A large batch size makes harvesting very quick, so those repositories achieve a high throughput.
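The effect of the batch size on the number of round trips can be sketched with simple arithmetic. The repository size of 10,000 records is a hypothetical figure chosen for the example, not a measured value.

```python
import math


def requests_needed(total_records: int, batch_size: int) -> int:
    """Number of ListRecords requests needed to harvest all records."""
    if total_records < 0 or batch_size <= 0:
        raise ValueError("total_records must be >= 0 and batch_size must be > 0")
    return math.ceil(total_records / batch_size)


# Hypothetical repository with 10,000 records.
for size in (100, 200, 500):
    print(size, requests_needed(10_000, size))
```

With these assumed figures, raising the batch size from 100 to 500 reduces the harvest from 100 requests to 20.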

Resumption token lifespan

Better explanation of why the recommendation on the resumption token lifespan is needed. See: Resumption token page.
Explanation: The lifespan, batch size and throughput are related. If the throughput is slow and the batch size is small, the lifespan of the resumption token should be increased. Otherwise the token expires before the harvester can use it, and the harvester keeps receiving only the first batch over and over again.
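A harvester can check how much time it has left before a token expires when the repository supplies the optional expirationDate attribute on the resumptionToken element. The following is a minimal sketch; the sample response, the token value and the function name are made up for the example, and it assumes the timestamp is given in UTC with seconds granularity.

```python
from datetime import datetime, timezone
from typing import Optional
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Made-up response fragment for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken expirationDate="2009-01-01T12:30:00Z" cursor="0">abc123</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def token_seconds_left(response_xml: str, now: datetime) -> Optional[float]:
    """Return seconds until the resumption token expires, or None if unknown."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not token.get("expirationDate"):
        return None  # no token, or the repository did not state a lifespan
    expires = datetime.strptime(
        token.get("expirationDate"), "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return (expires - now).total_seconds()


now = datetime(2009, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(token_seconds_left(SAMPLE_RESPONSE, now))
```

If the remaining time is shorter than the time the harvester needs to process a batch, the next request is likely to fail and the harvest has to start again.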

Deleted records strategy

The DRIVER Guidelines text now explains more clearly why a persistent/transient strategy is valuable for both the repository and the service provider.
Explanation: When the repository keeps track of deletions, a service provider will not display records that are no longer available in the repository. This strategy also lets harvesters avoid reloading the full repository each time, which makes harvesting more efficient.
See: Deleted records page.
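The benefit for incremental harvesting can be sketched as follows: a service provider applies one incremental response to its local index and drops the records whose header carries status="deleted". The sample response, the identifiers and the dictionary-based index are hypothetical simplifications.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"

# Made-up incremental response: one deleted record, one new record.
INCREMENTAL_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:1</identifier>
        <datestamp>2009-03-01</datestamp>
      </header>
    </record>
    <record>
      <header>
        <identifier>oai:example.org:2</identifier>
        <datestamp>2009-03-01</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""


def apply_harvest(index: dict, response_xml: str) -> dict:
    """Update a local index (identifier -> datestamp) from one response."""
    root = ET.fromstring(response_xml)
    for record in root.iter(f"{OAI}record"):
        header = record.find(f"{OAI}header")
        identifier = header.findtext(f"{OAI}identifier")
        if header.get("status") == "deleted":
            index.pop(identifier, None)  # stop displaying the deleted record
        else:
            index[identifier] = header.findtext(f"{OAI}datestamp")
    return index


index = {"oai:example.org:1": "2008-01-01", "oai:example.org:3": "2008-02-01"}
print(apply_harvest(index, INCREMENTAL_RESPONSE))
```

Only the changed records travel over the wire; the untouched record oai:example.org:3 stays in the index without being harvested again.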

