CLARIN Licensing Schemes

Anje Müller Gjesdal & Gunn Inger Lyse
University of Bergen
Paper outline
1. Organisational structure
2. Types of data
3. The CLARIN licensing scheme
   1. Organisational model
   2. Basic concepts
   3. Sub-categories
• Distributed data infrastructure with sites all over
Europe (universities, libraries, archives etc.)
• Provides access to digital language data
collections, tools to work with them and
expertise for researchers
• Members: Austria, Bulgaria, Czech Republic,
Germany, Denmark, Estonia, the Netherlands,
Poland. Norway has observer status.
• ERIC since 2012
General assembly
Board of directors
National consortia, e.g. CLARINO in Norway
Expert committees on standards and legal
CLARIN centres
• Type A centres offer services that are relevant for
the infrastructure as a whole and that need to be
offered at a high level of commitment (stability,
availability, persistence)
• Type B centres offer services that include the
access to the resources stored by them and tools
deployed at the centre via specified and CLARIN
compliant interfaces in a stable and persistent manner
• Quality assurance: Service Level Agreements,
Data Seal of Approval
Types of data
• Spoken data (e.g. spontaneous speech,
recorded conversations)
• Web pages
• Copyrighted text (e.g. novels, research
Types of data - example
• The Norwegian Spanish Parallel Corpus (NSPC)
• a parallel, unidirectional translation corpus of
contemporary Norwegian written texts translated
into Spanish, published between 2000 and 2009
• contains fiction and non-fiction, and each text is
classified according to genre, the author's gender
and the gender and mother tongue of the translator
• Available under the CLARIN ACA license
Figure: Search interface NSPC
Figure: Search results for the word ‘Hollywood’ in NSPC
CLARIN licensing schemes – organisational model
CLARIN licensing schemes
• Challenge: how to get linguists to license their
research output?
• Solution: A classification system (‘laundry
tags’) for licensing
– Categorizes existing fixed licensing schemes (e.g.
Creative Commons, Open Data Commons Open
Database License)
– Provides license agreement templates for new
CLARIN licensing scheme sub-categories
– A requirement for strictly Non-Commercial use (NC)
– A requirement to inform the copyright holder regarding
the usage of the tools and/or the resources in published
articles (INF)
– An option to redeposit modified versions of the tools and
resources with the Service Provider (ReD)
– A requirement that the resource has to be kept in the
CLARIN Secure Server environment (LOC)
– A requirement that the Resource may need to be handled
with care in order to respect the privacy of the personal
data it contains and if samples of the data are published,
they must be anonymized according to best practices (PD)
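The sub-categories listed above work like tags attached to a main license category (e.g. ACA, the category NSPC is available under). As a minimal sketch of how such a "laundry tag" label might be represented in software, the Python below models a label as a category plus a tuple of tags. The class name `LicenseLabel`, the `conditions` method and the printed label format are assumptions made for illustration, not part of any official CLARIN tool or identifier scheme.

```python
from dataclasses import dataclass

# Sub-category tags and short paraphrases of their meaning, taken from the list above.
SUBCATEGORIES = {
    "NC": "strictly non-commercial use",
    "INF": "inform the copyright holder about usage in published articles",
    "ReD": "modified versions may be redeposited with the Service Provider",
    "LOC": "resource must be kept in the CLARIN Secure Server environment",
    "PD": "personal data: handle with care, anonymize published samples",
}


@dataclass(frozen=True)
class LicenseLabel:
    """Hypothetical container for one license label (illustration only)."""
    category: str        # main category, e.g. "ACA"
    tags: tuple = ()     # sub-category tags, e.g. ("NC", "INF")

    def __post_init__(self):
        # Reject tags that are not in the sub-category list.
        unknown = [t for t in self.tags if t not in SUBCATEGORIES]
        if unknown:
            raise ValueError(f"unknown sub-category tags: {unknown}")

    def __str__(self):
        # Assumed display format; not an official CLARIN identifier.
        return "-".join(("CLARIN " + self.category,) + tuple(self.tags))

    def conditions(self):
        # Human-readable conditions implied by the tags, in tag order.
        return [SUBCATEGORIES[t] for t in self.tags]


label = LicenseLabel("ACA", ("NC", "INF"))
print(label)
for condition in label.conditions():
    print(" -", condition)
```

Keeping the tags as a closed vocabulary means a mistyped tag fails at construction time rather than producing a label nobody can interpret.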
Example: LINDAT-CLARIN, Centre for Language
Research Infrastructure in the Czech Republic
CLARIN services
• Offers templates
– CLARIN Deposition License Agreements (DELA)
– CLARIN End User Agreement (EULA)
– CLARIN Terms of Service (TOS)
• Legal issues committee
