Digital Author Identification
UKSG 17 – 18 April 2007
Daniel van Spanje

DAI in DARE
• DARE: Digital Academic REpositories
  – Universities + KNAW + NWO + KB
  – Infrastructure for linking the IR
  – Stimulate production of digital scientific output
  – 2003 – 2006
• 2007 – 2010: SURFshare

Main issues in DAI
• Unique identifying number for researchers / authors
• National scale
• Benefits:
  – Improve searching for electronic publications
  – Integrate searching for electronic and non-electronic publications
  – Link Library (Catalogue) and research environment (Metis)

Two projects
• Pilot in 2005 – 2006
  – one university: Groningen
• Roll-out 2006 – 2007
  – 13 academic research organizations
• Project leader: Anneloes Degenaar
• DAI website at University of Groningen:
  – http://dai.weblog.ub.rug.nl/
  – http://dai-uitrol.ub.rug.nl/

Organizations involved in DAI
• 13 universities + CWI + KNAW
• SURF
• UCI
• OCLC PICA

Systems involved
• Institutional repository / DAREnet
• Metis
• Dutch Union Catalogue (NCC/PiCarta)

Institutional Repositories / DAREnet

METIS

National Union Catalogue

Shared Cataloguing System (GGC)

Names and other issues
• Authors with the same name
• Use of one or more initials
• Changing names
• Spelling variants
• Diacritics
• Pseudonyms
• Name in religion
• Nicknames
• Collective names
• Different structure of names in other languages and cultures
• …
• Discussions on standardization and unification started in the Netherlands in the Orion project (2003-2004)

Proposed solution
• Need established
• “External” requirements:
  – use existing mechanisms
  – local management
  – national function
• Solution: use “collocation” mechanism of libraries and Metis as source

Cataloguing and Metis
• Diagram components: GGC, NTA, Cataloguing, Repository, Metis

Use authority records (NTA) in Metis
• Diagram components: NTA, Repository, GGC, Cataloguing, Metis, CWI

How did we link
• Mechanisms
  – Initial load per organization
  – Online input buttons (web templates)
  – XML output
  – Synchronization mechanisms
• Requirements
  – No overwrite of library data!
  – Deduplication (matching/merging)

Data model developed
• Data model copied from bibliographic model: three levels
• Metis name information added to library data; no overwrite
• Affiliations and other fields added

Structure of bibliographic data
• General: bibliographic metadata (YoP / LoP / Title / Author / Imprint / LCSH / DDC)
• Linked authority record
• Local: Groningen bibdat (subject headings), Amsterdam bibdat (subject headings)
• Copy: under each local record, copy levels with location, holding, shelf number

Structure of authority data
• Library record: thesaurus record (linked authority record)
  – Name of author
  – Variant names
• Metis: Groningen data (Metis name)
  – Affiliations, each with begin and end
• Metis: Amsterdam data (Metis name)
  – Affiliations, each with begin and end

Example authority record + added fields
• Library data
• Metis: researcher name, affiliation data

Data model: fields
Authority file
• Nationality
• Language
• Name (best known)
• Name (most complete)
• Maiden name
• Name variants
• Date of birth
• Date of death
• Profession / subject
• Link to pseudonyms
• Notes
• Entry date
• Update date
• Note: proper name field includes subfields for first name, middle name,
last name, prefix, suffix
Added fields
• Local researcher number
• Metis name (preferred)
• Metis name
• Sex
• Code organisation
• Name organisation
• Start date employment
• End date employment
• Code function
• Description of function
• Code of employment
• Notes
• Entry date
• Update date

Initial load (flowchart boxes)
• Metis makes list of names
• Format conversion
• Load DAI in Metis
• Match names with auth file
• Manual dedup of list
• Merge names with names found
• Dedup in Metis
• Load new names (not found)
• Load B-records (? Duplicates?)
• Make Metis export
• Manual dedup by library staff
• Export DAIs to Metis

Initial load
• Data enrichment in Metis
• Export from Metis
• Conversion to cataloguing system
• Matching
• Merging: merge / new / B-record
• Results depend on quality of metadata
  – 95% automatic / 5% manual
  – 70% automatic / 30% manual
  – 50% automatic / 50% manual

Online process
• DAI-button in Metis to create DAI-number
• Export DAI-button in NTA/cataloguing system to Metis
• DAI-button in IR to create DAI-number
• Separate DAI-http-request for online input
• Online input via current cataloguing tool
• + Offline synchronization mechanisms between Metis and NTA

DAI-button in Metis

URL link instead of button
• http://www.pica.nl/dai/dai_redirect.php?action=maak_dai&user=<usernumber>&metis_export_url=http://oras.service.rug.nl:1111/metisdad&p_onderzoekernummer=00033&p_naam_medewerker=Rotteveel&p_voorletter=R&p_voorvoegsel=&p_titulatuur=&p_voorkeur=J&p_geslacht=M&p_geboortedatum=01-07-1974&p_code_functie=20&p_functie=Universitair%20hoofddocent&p_code_organisatie=22020200&p_organisatie_a=Medical%20Microbiology&p_begin_aanstelling=01-01-2005&p_einde_aanstelling=01-01-2006

Input form for Metis fields

Results of the DAI project
• Now:
  – 50% of the researchers have a DAI
  – Procedure for initial load in place
  – Start with online procedure
  – Privacy statement
• Autumn 2007
  – Online procedure in place
  – Procedure for synchronization in place
  – 100% of the researchers will
have a DAI in 2007 (ca. 40,000)

Things to do
• Finalize the roll-out, develop services (passport …) and implement a user group
• Add DAI in metadata standards (DCX, MODS)
• International standardisation: ISPI
• Involve authors for control and updating

Concluding remark

• Thanks
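
Appendix sketch: DAI request link
The "URL link instead of button" slide shows a DAI creation request as a plain HTTP link carrying the Metis researcher fields as p_* parameters. The sketch below only illustrates how such a link could be assembled and read back. The endpoint and parameter names are copied from the slide; the helper name `build_dai_request_url` and the user number "12345" are assumptions made for the example and are not part of Metis or the OCLC PICA service.

```python
from urllib.parse import urlencode, quote, urlsplit, parse_qs

# Endpoint and parameter names are copied from the example link on the slide.
DAI_ENDPOINT = "http://www.pica.nl/dai/dai_redirect.php"


def build_dai_request_url(user, metis_export_url, researcher):
    """Assemble a DAI creation link (hypothetical helper, for illustration)."""
    params = {
        "action": "maak_dai",  # Dutch for "create DAI"
        "user": user,
        "metis_export_url": metis_export_url,
    }
    # Researcher fields are appended as p_* parameters in the order given.
    params.update(researcher)
    # quote_via=quote encodes spaces as %20, as on the slide; ":" and "/"
    # are left unescaped so the export URL appears as it does there.
    query = urlencode(params, safe=":/", quote_via=quote)
    return f"{DAI_ENDPOINT}?{query}"


# Field values taken from the slide; "12345" is a made-up user number.
researcher = {
    "p_onderzoekernummer": "00033",
    "p_naam_medewerker": "Rotteveel",
    "p_voorletter": "R",
    "p_voorvoegsel": "",
    "p_titulatuur": "",
    "p_voorkeur": "J",
    "p_geslacht": "M",
    "p_geboortedatum": "01-07-1974",
    "p_code_functie": "20",
    "p_functie": "Universitair hoofddocent",
    "p_code_organisatie": "22020200",
    "p_organisatie_a": "Medical Microbiology",
    "p_begin_aanstelling": "01-01-2005",
    "p_einde_aanstelling": "01-01-2006",
}

url = build_dai_request_url(
    "12345", "http://oras.service.rug.nl:1111/metisdad", researcher
)

# Reading the link back; empty fields (prefix, title) are kept as blanks.
fields = parse_qs(urlsplit(url).query, keep_blank_values=True)

print(url)
print(fields["p_functie"])
```

Keeping the researcher fields in a plain dictionary mirrors the flat list of "added fields" in the data model, so the same record could feed either the button or the link.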