
ORCID Technical Report
May 18, 2011
Development Approach
• Completed Spring 2010
• Self-claim oriented
• Limited light integration with a few participant services
• Currently in use for
Phase 1
• In progress 2011
• Will be developed by ORCID,
• Will provide core for future production service
• Will focus on currently active researchers
• Development headed by Geoff Bilder
Phase 2
• Development 2012+
• Will address assertions by a wide group of third parties
• Will extend capabilities for alternate roles and other types of contributions
• Will provide mechanisms for automatic de-duplication of third-party donated records
ORCID Alpha Available for Demo
…This Alpha provides a test environment for illustrating use cases and gathering
feedback from the community for services ORCID may provide. The Alpha does
not represent ORCID's live system…
ORCID Alpha Features
• Links to Published Work via DOI
• Management and Privacy Controls
• Crossref Lookup for Publications
Phase 1 Scope
• ORCID will build a central registry of unique identifiers for researchers and scholars
with the following scope:
• ORCID will focus on currently active researchers.
• Data will come from individuals and universities.
• ORCID will be a hybrid system of self- and organization-asserted identity.
• Data collected will be those needed for disambiguation; extra data for
optionally creating full CV-like profiles might be added in the future.
• The system will provide basic matching and disambiguation of names.
• The ORCID system will, from the start, enable third parties to build value-added
services using ORCID infrastructure.
• ORCID services will be developed based on the needs of the ORCID
Phase 1 Functionality
The development of the ORCID “alpha” and subsequent discussions with
stakeholders have identified a number of additional changes that would need to be
made to the system in order to meet common requirements. These include:
• Incorporate OAuth2 & profile exchange
• Privacy mechanism to support three-level control (private/protected/public) at field
level (needed to support profile exchange)
• Authentication/authorization mechanism to support “delegated” management of
profiles (e.g. a researcher can grant permission to a departmental secretary or
librarian to edit a profile on their behalf)
• Include production-level publication lookup feature from CrossRef
• Expose minimal provenance information for metadata records
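The field-level privacy and delegated-management requirements above can be illustrated with a minimal sketch. This is not ORCID's design: the names `Visibility`, `Viewer`, `visible_fields`, and `can_edit`, and the mapping of viewer types to privacy levels, are assumptions made only for illustration.

```python
from enum import IntEnum


class Visibility(IntEnum):
    """Hypothetical three-level privacy setting attached to each profile field."""
    PUBLIC = 0
    PROTECTED = 1
    PRIVATE = 2


class Viewer(IntEnum):
    """Hypothetical viewer classes; a higher value sees more restricted fields."""
    ANONYMOUS = 0  # sees PUBLIC fields only
    TRUSTED = 1    # sees PUBLIC and PROTECTED fields
    OWNER = 2      # the researcher or a delegate; sees everything


def visible_fields(profile, viewer):
    """Return the field values the viewer may see.

    profile maps a field name to a (value, Visibility) pair.
    """
    return {
        name: value
        for name, (value, visibility) in profile.items()
        if visibility <= viewer
    }


def can_edit(owner, delegates, user):
    """A user may edit a profile if they own it or were granted delegation."""
    return user == owner or user in delegates
```

For example, a profile whose email field is marked `PRIVATE` would be returned without that field to an `ANONYMOUS` viewer, while a librarian listed in `delegates` would pass the `can_edit` check.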
Phase 2: Issues of Assertion
• Consider disambiguation as a
collection of “claims” by different
• Evaluating the duplication,
contradiction, and uniqueness can
indicate the credibility of a record
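One way to picture the idea of evaluating claims is the sketch below. It is only an illustration under stated assumptions: the `(source, field, value)` claim shape, the function name `summarize_claims`, and the simple agreement ratio are all hypothetical and are not taken from any ORCID specification.

```python
from collections import defaultdict


def summarize_claims(claims):
    """Summarize claims made about one record.

    claims is an iterable of (source, field, value) tuples. For each field
    the summary reports whether the most common value is duplicated across
    sources, whether sources contradict each other, whether the claim is
    unique (one source, one value), and the share of sources that agree
    with the most common value.
    """
    sources_by_value = defaultdict(lambda: defaultdict(set))
    for source, field, value in claims:
        sources_by_value[field][value].add(source)

    summary = {}
    for field, values in sources_by_value.items():
        counts = {value: len(sources) for value, sources in values.items()}
        top_count = max(counts.values())
        summary[field] = {
            "duplicated": top_count > 1,
            "contradicted": len(counts) > 1,
            "unique": len(counts) == 1 and top_count == 1,
            "agreement": top_count / sum(counts.values()),
        }
    return summary
```

A field asserted identically by several independent sources would score high agreement, while a field with conflicting values would be flagged as contradicted. How such signals should be weighted into a credibility measure is left open here.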
Ongoing Research: Profile Exchange Group
• The work of the Profile Exchange sub-group:
• Recommend a technological approach to reliably and efficiently merge
researcher profiles from different databases
• Approach
• Create a Gold Standard of test data comprising a set of records with
a number of known True Positive matches
• In terms of creating the test set, a True Positive is determined by matching the
md5 hash of lowercase email addresses where the email address contains a
partial match with the name of the author (this is to avoid the known problem of
generic email addresses). We are not at all
proposing that email/email hash would be used in a final system.
• The test data will be loaded into a system from which contributors with matching
technology can pull down the records and apply their methods.
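The True Positive rule described above could be sketched roughly as follows. This is an illustrative reading of the rule, not the group's actual tooling: the record layout (`id`, `name`, `email`), the function names, and the three-character minimum for a name token are assumptions.

```python
import hashlib
import re


def email_hash(email):
    """md5 hex digest of the trimmed, lowercased email address."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def email_matches_name(email, author_name, min_len=3):
    """True if part of the author's name appears in the email's local part.

    This screens out generic mailboxes that are not tied to one person.
    Assumption: only ASCII name tokens of at least min_len letters count.
    """
    local_part = email.split("@", 1)[0].lower()
    tokens = [
        token
        for token in re.split(r"[^a-z]+", author_name.lower())
        if len(token) >= min_len
    ]
    return any(token in local_part for token in tokens)


def true_positive_pairs(records_a, records_b):
    """Pair record ids from two datasets whose qualifying email hashes match."""
    index = {}
    for rec in records_a:
        if email_matches_name(rec["email"], rec["name"]):
            index.setdefault(email_hash(rec["email"]), []).append(rec["id"])

    pairs = []
    for rec in records_b:
        if email_matches_name(rec["email"], rec["name"]):
            for a_id in index.get(email_hash(rec["email"]), []):
                pairs.append((a_id, rec["id"]))
    return pairs
```

Comparing hashes rather than raw addresses means the datasets can be matched without exchanging the email addresses themselves, which fits the note that email would not be used in a final system.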
Profile Exchange Group Progress
• Mike Taylor is leading the R&D work in this space
• Research Specialist at Elsevier Labs
• Was responsible for ORCID Alpha Scopus integration
• Workspace created for Profile Exchange R&D
• Elsevier has provided a server to host data
• Access Innovations / Data Harmony donated an instance of their XIS XML
document repository
• Datasets
• Focus on High Energy Physics and Computer
• Data contributed from Elsevier (Scopus) and
• Published DTD to ORCID working group wiki
• Normalizing data onto common DTD to
ensure like-to-like comparisons
• Current Activities
• Research on datasets to develop clustering
algorithms and relationships
