University of St Andrews Scotland

Report
CERIF for Datasets:
Background and Key Findings
Workshop, London
26th July 2013
CERIF slides reproduced from presentations by euroCRIS members :
Keith Jeffery, Brigitte Joerg, Anna Clements
C4D Summary




JISC MRD Programme
Consortium : Sunderland, Glasgow, St Andrews,
NERC, EPSRC, DCC and euroCRIS
“CERIFication” of the metadata about research
datasets
Focus on MEDIN* standard : NERC requirement
for http://www.bodc.ac.uk/
* http://www.oceannet.org/
C4D workshop, Glasgow & London. July 2013
Datasets & metadata
Datasets have sparked interest in metadata
standards that support their:
• Discoverability
• Description
• Usability
• Re-use
For example …

CKAN : www.ckan.org
– Software platform; default schema is DC

eGMS
– UK e-Government metadata standard; based on DC
– ‘flat’ model; single entity (a resource or dataset); keep adding attributes

DCAT
– RDF schema vocabulary for PSI (public sector info)
– Some normalisation; can’t capture different roles/semantics in relationships
Houssos, N., Joerg, B., Matthews, B. A multi-level metadata approach for a Public Sector Information data infrastructure. CRIS2012, Prague, 06-09 June 2012
http://www.engage-project.eu/engage/wp/
… so what about CERIF?




Common European Research Information Format
A conceptual model for describing the complete
research domain
A standard for the development, implementation and interoperability of current research information systems (CRIS) and their various applications
Est. 1991; maintained by www.euroCRIS.org
… and euroCRIS?

Not for profit organisation of experts
– Research organisations; funders; publishers; systems
providers; standards organisations



109 institutional, 38 personal & 20 affiliate members (euroCRIS annual report 2012)
41 countries; not just Europe
Main activity is the development, maintenance and implementation of CERIF
euroCRIS : Strategic Partners
In the UK : The CERIF landscape
UK CERIF adoption



1/3 of UK HEIs have a CERIF-compliant CRIS*
Driven by desire to better support research
management at the institutional level
… and streamline reporting to funders
* Source: UKOLN (R. Russell), Adoption of CERIF in Higher Education Institutions in the UK: A Landscape Study, March 2012
http://www.ukoln.ac.uk/isc/reports/cerif-landscape-study-2012/CERIF-UK-landscape-report-v1.1.pdf
CERIF timeline (figure; the layout is not recoverable from the slide text)
Versions: CERIF 91, CERIF 2000 Model, CERIF 2006 / 2008 Model, CERIF 1.3, CERIF 1.4 (XML), CERIF 1.5, CERIF 1.6
Year markers: 1991, 2000, 2002, 2006, 2012, 2013
EC Recommendation to Member States: 1991 and 2000
Entity labels in capitals: PERSON, PROJECT, RESULTS, EQUIPMENT, EXPERTISE, CLASSIFICATION
Other entity labels: Person, Project, OrgUnit, Organisation, Publication, Patent, Product, Event, Funding Programme, Skills, CV, Equipment, Service, Roles, Classification (Semantics)
Model layers: Base, Link, Semantics, Language, 2ndLevel
Feature labels: Data Model; Multilinguality; Controlled Vocabulary; Roles / Types; User-driven; Model Normalization; Robust/Consistent Structure; Extensible Structure; Semantic Layer; XML Exchange Specification; Elaboration on Publication; CERIF Core Semantics (2008 1.2); Infrastructure (Facility, Equipment, Service); Measurement & Indicator; Geographic Bounding Box; CERIF 1.3 Vocabulary; UUIDs; Terms; Schemes; CERIF 1.4 new XML format; CERIF 1.5 Federated Identifiers; Entities and Link Tables; C4D datasets + Linked Data
Project callout: Acronym: ERGO; Participants: Keith Jeffery, Anne Asserson, Rutherford Appleton Lab, Univ Bergen, many more; Networking of DBs; Exchange of Records
CERIF Entity Types
• Base Entities
• Result Entities
• Infrastructure Entities
• 2nd Level Entities
• Link Entities
CERIF Features
• Multiple Language
• Semantics
• Measures & Indicators
• Geographic Bounding Box
Base Entities
Project: ID, URI, Acronym, StartDate, EndDate, Title, Abstract, Keywords
Person: ID, URI, Gender, FirstNames, OtherNames, FamilyNames, NameVariants, ResearchInterest, Keywords
OrganisationUnit: ID, URI, Acronym, Name, HeadCount, CurrencyCode, Turnover, ResearchActivity, Keywords
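As a rough illustration, the three base entities and the attributes listed for them can be written as plain record types. This is a minimal Python sketch, not the CERIF schema itself: the class and field names simply follow the slide labels, the field types are assumptions, and the identifiers in the example are placeholders.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Project:
    """Base entity; fields follow the attribute labels on the slide."""
    id: str
    uri: Optional[str] = None
    acronym: Optional[str] = None
    start_date: Optional[date] = None
    end_date: Optional[date] = None
    title: Optional[str] = None
    abstract: Optional[str] = None
    keywords: List[str] = field(default_factory=list)


@dataclass
class Person:
    id: str
    uri: Optional[str] = None
    gender: Optional[str] = None
    first_names: Optional[str] = None
    other_names: Optional[str] = None
    family_names: Optional[str] = None
    name_variants: List[str] = field(default_factory=list)
    research_interest: Optional[str] = None
    keywords: List[str] = field(default_factory=list)


@dataclass
class OrganisationUnit:
    id: str
    uri: Optional[str] = None
    acronym: Optional[str] = None
    name: Optional[str] = None
    head_count: Optional[int] = None      # type is an assumption
    currency_code: Optional[str] = None
    turnover: Optional[float] = None      # type is an assumption
    research_activity: Optional[str] = None
    keywords: List[str] = field(default_factory=list)


# Placeholder identifiers; the acronym, title and name appear in this report.
project = Project(id="proj-1", acronym="C4D", title="CERIF for Datasets")
unit = OrganisationUnit(id="org-1", name="University of St Andrews")
```

The records deliberately hold no references to one another: in CERIF, relationships between entities are kept in separate link entities.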
Base Entities with CERIF (cf) attribute names
cfProject: cfID, cfURI, cfAcronym, cfStartDate, cfEndDate, cfTitle, cfAbstract, cfKeywords
cfPerson: cfID, cfURI, cfGender, cfBirthdate, cfKeywords, cfDescription
cfOrganisationUnit: cfID, cfURI, cfAcronym, cfHeadCount, cfCurrencyCode, cfTurnover
Result Entities
ResultPublication: ID, URI, Title, Subtitle, Abstract, Bibl. Note, PublicationDate, TotalPages, StartPage, EndPage, Keywords
ResultPatent: ID, URI, PatentNumber, Title, CountryCode, RegistrationDate, ApprovalDate, Description, Keywords
ResultProduct: ID, URI
Result Entities with CERIF (cf) attribute names
cfResultPublication: cfID, cfURI, cfNumber, PublicationDate, cfBibliographicNote, cfStartPage, cfEndPage, cfTotalPages, cfEdition, cfSeries, cfIssue, cfVolume, cfISBN, cfISSN
cfResultPatent: cfID, cfURI, cfPatentNumber, cfCountryCode, cfRegistrationDate, cfApprovalDate
cfResultProduct: cfID, cfURI
Text attributes also shown (entity assignment not recoverable from the slide text): cfTitle, cfSubtitle, cfAbstract, cfKeywords, cfName, cfAbbreviation, cfDescription, cfVersionInfo
Advantages of CERIF

CERIF has many advantages as the canonical model (the research
information entities, attributes, associations and semantics) for contextual
metadata for datasets:
– Covers all aspects of research information: researchers, projects, organisations,
funding, outputs, equipment, services, and so on;
– An optimal (relational) architecture allowing the expression of any kind of relation
between entities/attributes with every relation “time-stamped” and semantically
defined;
– Very fine-grained structure, allowing output of the metadata to virtually any format;
– A separated “semantic layer” allowing the use of multiple (any) controlled
vocabularies (classifications, typologies) as well as their cross-linking and
mapping;
– Ability to cope with multiple languages
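The points about time-stamped, semantically defined relations and multiple languages can be made concrete with a small sketch. It assumes a simplified link record carrying a role term from a named vocabulary plus a validity period; the class names, the `example-roles` scheme, the role terms and all identifiers are hypothetical and are not taken from the CERIF specification.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Classification:
    scheme: str  # controlled vocabulary the term belongs to
    term: str    # the role or type within that vocabulary


@dataclass(frozen=True)
class Link:
    """A relation between two entities, qualified by role and period."""
    source_id: str
    target_id: str
    role: Classification
    start: date
    end: Optional[date] = None  # None means the relation is still open

    def active_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


def roles_on(links: List[Link], source_id: str, target_id: str, day: date) -> List[str]:
    """Role terms that hold between two entities on a given day."""
    return [
        link.role.term
        for link in links
        if link.source_id == source_id
        and link.target_id == target_id
        and link.active_on(day)
    ]


def localised(texts: Dict[str, str], lang: str, fallback: str = "en") -> Optional[str]:
    """Pick the text for a language code, falling back to another language."""
    return texts.get(lang, texts.get(fallback))


# Hypothetical data: one person holds two successive roles on one dataset.
links = [
    Link("pers-1", "dataset-1", Classification("example-roles", "creator"),
         date(2012, 1, 1), date(2012, 12, 31)),
    Link("pers-1", "dataset-1", Classification("example-roles", "curator"),
         date(2013, 1, 1)),
]
titles = {"en": "Example dataset", "de": "Beispieldatensatz"}
```

Because the role and the period sit on the link rather than on either entity, the same person can hold different roles on the same dataset at different times, which a flat, single-entity record cannot express.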
Mapping to CERIF
24 of 30 MEDIN elements mapped to CERIF
DataCite version 3.0
Mandatory
• Identifier
• Creator
• Title
• Publisher
• Publication Year
Recommended
• Subject
• Contributor
• Dates relevant to work
• Resource Type
• Related Identifier
• Relation Type
• Description
• GeoLocation
Optional
• Scheme URI
• Title Type
• Subject Scheme
• Language of Resource
• Alternate Identifier
• Related Metadata
• Size
• Data Format
• Version
• Rights
• Geolocation Place
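A simple completeness check shows how the obligation levels might be used in practice. The sketch below assumes the five properties that the DataCite 3.0 schema marks as mandatory (Identifier, Creator, Title, Publisher, PublicationYear) and a flat dictionary as the record format; the dictionary keys and the example values are hypothetical, not DataCite XML.

```python
from typing import Dict, List

# Mandatory properties in DataCite 3.0, as flat (hypothetical) dictionary keys.
MANDATORY = ("identifier", "creator", "title", "publisher", "publicationYear")


def missing_mandatory(record: Dict[str, object]) -> List[str]:
    """Return the mandatory properties that are absent or empty."""
    return [name for name in MANDATORY if not record.get(name)]


# Placeholder record; none of these values describe a real dataset.
record = {
    "identifier": "10.1234/example",
    "creator": "Example, A.",
    "title": "Example dataset",
    "publicationYear": 2013,
}

gaps = missing_mandatory(record)  # the publisher is not set in this record
```

A record with an empty list of gaps meets the mandatory level only; recommended and optional properties would need their own checks.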
More work required?
CERIF 1.6 released for testing 25th July 2013
http://www.cerifsupport.org/2013/07/24/cerif-1-6-formal-models-released-for-testing/
Mapping to other schemata
C4D vs RE3Data vs DCI vs DataCite
Key Findings

CERIF metadata model
• can be used to record rich metadata about datasets
• can be related to other pieces of the research landscape
• can evolve / extend within formal euroCRIS governance structure
BUT …
• Needs testing in production environments
• Is cfResProd appropriate? Not just a research result?
• Ongoing need for agreed vocabularies
  – CASRAI
  – RCUK harmonisation
Case Study: DaMaRo



Have used C4D as a basis for checking whether DataFinder is rich and detailed enough
Once the C4D profile has been finalised, DaMaRo will embark on implementation of C4D-compliant outputs
Most fields map to C4D
Next Steps





Further consultation with euroCRIS/CERIF TG on the best approach
Aiming to achieve the most comprehensive set of metadata (incorporating RE3Data, DataCite, etc.)
Move new Pure model to production (after REF)
Exporting and importing CERIF-XML from systems; exploring this with http://ckan.org
Aggregation of data into national data register model
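To give a feel for the export step, here is a minimal sketch that serialises one dataset record to XML with the Python standard library. Only `cfResProd` is named in this report; the `CERIF` root, `cfResProdId`, `cfName` and the `cfLangCode` attribute are assumptions modelled on CERIF naming and have not been checked against the CERIF XML schema, so the output should not be expected to validate.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def dataset_to_xml(dataset_id: str, names: Dict[str, str]) -> str:
    """Serialise a dataset id and its names per language to an XML string."""
    root = ET.Element("CERIF")                      # assumed root element
    product = ET.SubElement(root, "cfResProd")
    ET.SubElement(product, "cfResProdId").text = dataset_id
    # One name element per language, in a stable order.
    for lang, name in sorted(names.items()):
        element = ET.SubElement(product, "cfName", {"cfLangCode": lang})
        element.text = name
    return ET.tostring(root, encoding="unicode")


def xml_to_dataset_id(xml_text: str) -> str:
    """Read the dataset id back, as an importing system might."""
    root = ET.fromstring(xml_text)
    return root.findtext("cfResProd/cfResProdId", default="")


# Placeholder values for illustration only.
xml_text = dataset_to_xml("dataset-1", {"en": "Example dataset"})
round_trip_id = xml_to_dataset_id(xml_text)
```

A real exchange would also need the namespace, schema version and link entities required by the CERIF XML specification.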