The JCiCM Platform
Improving the quality and efficiency
of legal translation in criminal
matters via corpus-based tools
Łucja Biel
Institute of Applied Linguistics
University of Warsaw
[email protected]
Quality of legal translation
• Fuzziness of the concept of translation quality
Two fundamental relations of translations
1. The relation of equivalence (relation to the source
text) – faithfulness, accuracy, precision,
information transfer
2. The relation of textual fit (relation to
non-translated target-language texts of a
comparable genre) → communicative dimension
of translations, naturalness of translation
Chesterman (2004)
Efficiency of legal translation
• Automation of certain choices (the cognitive
level), cognitive routines established
• If not automated
• Speed of information retrieval (information
mining competence plus available resources)
- terminology mining: fast retrieval of
terminological equivalents
- phraseological environment of terms
- genre formats & conventions
Efficiency of legal translation
Quality of the information mining process
• Precision and calibration of searches
• Contextualisation of information
• Data provenance (metadata)
• Integration of resources in a single place
Traditional and new resources
for LITs
• Discussion fora
• Googling
• Electronic termbases, e.g. IATE
• Translation memories, e.g. DGT’s TM
• Translation-driven corpora
Translation-driven corpora
Corpus: a body of language representative of a
particular variety of language or genre which is
collected and stored in electronic form (for the
purpose of linguistic analysis) (McEnery et al. CASS)
• Monolingual genre-based corpora
• DIY / ad hoc corpora (Bernardini & Zanettin 2000,
Beeby et al. 2009, Gallego-Hernandez 2013, Scott
2013), disposable corpora (Varantola 2000)
• Monolingual comparable corpora
• Parallel multilingual corpora
• Multilingually comparable corpora
Translation-driven corpora
• Monolingual comparable corpora contain:
(1) a corpus of translations
(2) a corpus of texts created spontaneously in the same language
• A parallel corpus is a translation corpus in the strictest sense
─ it contains source texts aligned with their target texts; it
may be bilingual or multilingual and bi-directional.
• Multilingually comparable corpora (cf. Hansen-Schirra and
Teich, 2009: 1162) – a combination of comparable and
parallel corpora due to questions raised as to the legitimacy
of research into translations without any reference to
underlying source texts.
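As a rough illustration of the parallel-corpus definition above, the Python sketch below stores source segments aligned with their target segments and retrieves the pairs that contain a search term. The class name, field names and the two EN-PL sample pairs are hypothetical and invented for illustration; they are not taken from any corpus described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One source segment aligned with its translation (hypothetical structure)."""
    source: str   # source-language segment, here English
    target: str   # aligned target-language segment, here Polish
    doc_id: str   # provenance metadata: the document the pair comes from


def find_pairs(corpus, term):
    """Return aligned pairs whose source segment contains `term` (case-insensitive)."""
    needle = term.lower()
    return [pair for pair in corpus if needle in pair.source.lower()]


# Invented sample segments for illustration only; not real corpus data.
SAMPLE_CORPUS = [
    AlignedPair("The arrest warrant was issued by the court.",
                "Nakaz aresztowania został wydany przez sąd.", "doc-1"),
    AlignedPair("The witness was heard yesterday.",
                "Świadek został przesłuchany wczoraj.", "doc-2"),
]

if __name__ == "__main__":
    for pair in find_pairs(SAMPLE_CORPUS, "arrest warrant"):
        print(pair.doc_id, "|", pair.source, "|", pair.target)
```

Keeping a document identifier on every pair is one simple way to preserve data provenance when a hit is shown to the translator.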
Corpora in translation practice
Advantages of working with corpora
• corpora provide a different type
of information than traditional
• corpora enable more nuanced
terminology work with
contextualised information
• they raise translators’ awareness
of SL/TL genre conventions
• increase the textual fit of
translations by producing more
natural translations which use
patterns typical of TL
Shortcomings / problem areas
• time-consuming nature of the corpus
building process
• ‘primitive’ parallel corpus
• need to train translators how to
work with corpora
• lack of integration of corpus tools
with CAT tools
Availability of corpus tools for
• Despite the growing popularity of language
corpora and corpus tools in various areas of
linguistics and translation studies, relatively little
effort has been devoted to the development
of corpus-based tools intended to assist legal
translators in their work
• limited corpus resources, in particular for
lesser-used languages
• limited accessibility of corpus resources
• confidentiality of legal texts and related corpus
design limitations → legicentrism
Use of corpus tools by translators
• As a result, translators are not familiar with methods
of working with corpora
• Bowker (2004: 13): “the uptake of corpora in the world
of professional translators appears to have been
considerably slower”
• Varela-Vila (2009), Pastor & Alcina (2009: 13) →
corpora are increasingly popular
• Depends on the popularity of corpora in
academia in a given country
• Growing use of DIY/ad hoc corpora → may increase
familiarity with corpus tools among translators
• DIY/ad hoc corpora are a powerful tool, but more
structured, representative and integrated resources
are needed
More advanced translation
• JudGENTT, a web platform with documentary, textual and
terminological resources for the translation of court
documents, developed by the GENTT (Textual Genres for
Translation) Research Group, based at the Department of
Translation and Communication at the Universitat Jaume I in
Spain (
• More information: Anabel Borja Albi, “A genre analysis approach to the
study of the translation of court documents”, in Łucja Biel & Jan
Engberg (eds.), Research Models and Methods in Legal Translation, LANS-TTS
Judicial Cooperation in Criminal Matters
JCiCM Platform
• Content: resources related to judicial cooperation
in criminal matters (JCiCM)
• Language pair: English – Polish
• Corpus type: bilingually comparable & parallel
• Objective:
• to create a more targeted aid which integrates
resources in the form of searchable corpora
• to provide contextualised dynamic information on
available terminological equivalents (their
frequency and collocations)
• to ensure open access to the platform
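The kind of contextualised information mentioned above (frequency and collocations of a candidate equivalent) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of how the platform is implemented: the function names, the two-token window and the sample sentence are all hypothetical, and only single-word terms are handled.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def term_frequency(tokens, term):
    """Count occurrences of a single-word term in the token list."""
    return tokens.count(term.lower())


def collocates(tokens, term, window=2):
    """Count words found within `window` tokens on either side of `term`."""
    term = term.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == term:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts


# Invented sample text for illustration only; not real corpus data.
SAMPLE_TEXT = ("The court issued a warrant. The warrant was executed. "
               "The court refused the warrant.")

if __name__ == "__main__":
    tokens = tokenize(SAMPLE_TEXT)
    print(term_frequency(tokens, "warrant"))
    print(collocates(tokens, "warrant").most_common(3))
```

Raw counts like these are only comparable across corpora of different sizes once they are normalised, for example per million tokens.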
Judicial cooperation in criminal matters
• Introduced by the Maastricht Treaty in 1993 and
regulated under Title V (Area of freedom, security and
justice) of the Treaty on the Functioning of the European
Union
• Combating (transnational) crime
• Enhancing cooperation of judicial authorities in the
Member States (Eurojust, European Judicial Network)
• Based on the principle of mutual recognition of
judgments by Member States & approximation of
national law
JCiCM: corpus design
• A broad definition of JCiCM adopted for the purposes of
corpus design (cooperation & combating of crime)
• To facilitate translations under Directive 2010/64/EU of
the European Parliament and of the Council of 20
October 2010 on the right to interpretation and
translation in criminal proceedings → translations for
“suspected or accused persons who do not understand
the language of the criminal proceedings”
Corpus design: identification of
core corpus components
JCiCM corpus
• EU component
• PL national component
• UK national component
• + general legal reference corpora
General legal reference corpora
[Table: corpus size in words (tokens), in millions]
• PL R-Acquis: Regulations
• PL L-Acquis: Directives
• Polish Law Corpus (PLC)
• EN R-Acquis: Regulations
• EN L-Acquis: Directives
Corpus design: EU component
• EU component: as a comparable and parallel aligned
corpus; EN-PL bilingual documents
• Legislative instruments - EUR-Lex, Heading 19 Area of
freedom, security and justice --> 19.30.20 Judicial
cooperation in criminal matters
• 746 documents of which:
• 121 secondary legislation
• 12 international agreements
• Case law of the Court of Justice: Curia - documents classified
under the subject matter Area of freedom, security and
justice, cooperation in criminal matters (50 cases)
• Essential documents, e.g. the European Arrest Warrant (EAW)
Corpus design: identification of
core EU documents
Concordances of trafficking
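A concordance of this kind lists every occurrence of a search word together with its immediate context (keyword in context, KWIC). The Python sketch below shows the basic idea only; the function name, the 30-character context width and the sample text are assumptions made for illustration and do not reproduce the tool shown on the slide.

```python
import re


def concordance(text, keyword, width=30):
    """Return KWIC lines: up to `width` characters either side of each hit."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Pad both contexts so the keyword lines up in a single column.
        lines.append(f"{left:>{width}} [{match.group()}] {right:<{width}}")
    return lines


# Invented sample text for illustration only; not real corpus data.
SAMPLE = ("Trafficking in human beings is a serious crime. "
          "Member States shall combat trafficking in drugs.")

if __name__ == "__main__":
    for line in concordance(SAMPLE, "trafficking"):
        print(line)
```

Aligning the keyword in one column makes recurrent patterns to its left and right, such as typical prepositions or modifiers, easier to spot at a glance.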
Corpus design: identification of
core national documents PL
• Criminal law: Polish legislation related to criminal law and
criminal procedure, e.g. Kodeks karny (Criminal Code), Kodeks
postępowania karnego (Code of Criminal Procedure), ustawa o
przeciwdziałaniu praniu pieniędzy oraz finansowaniu terroryzmu
(Act on Counteracting Money Laundering and Terrorism Financing)
• Supreme Court judgments (the Criminal Law Chamber, an
annotated corpus compiled by R. Górski): 11.6 million
words, comprising 11,595 decisions, such as
postanowienie (decision), uchwała (resolution), wyrok (judgment),
zarządzenie (order), ranging from
• Essential documents (Article 3.2: any decision depriving a
person of his liberty, any charge or indictment, and any
judgment) plus instructions for suspected persons and
• Monolingual
• Bilingual aligned documents wherever possible
Corpus design: identification of
core national documents UK
• Criminal law: no codification – bigger corpus
• Judgments
• Court & related documents: e.g. instructions/guides
for suspected persons and witnesses, judgments, bills
of indictment
• → the scale of the corpus will depend on (1) funding
and (2) involvement of translators and courts
Contextualisation: PL Supreme
Court Judgments
PL criminal law: akt oskarżenia (bill of indictment)
Research objectives
• the analysis of the interplay between EU and national
law terminology related to criminal matters in view of
the prevalent recommendation to avoid functional
equivalents (focus on terminology related to
transnational crime);
• the assessment of equivalents used in EU official
translations and their applicability in national contexts
('terminological fit');
• the analysis of translation practice → ‘established’
equivalents (Newmark 1981: 73; Molina and Hurtado
Albir 2002: 510) / recognised translation (cf. Newmark
1981: 76);
• the identification of key generic features of documents
used in JCiCM
• Need to persuade funders that language corpora are
worth investing public money in (applied rather than
basic research)
• Need to prepare more user-friendly interfaces and more
resources (economy of scale)
• Need to persuade judicial authorities to provide access
to bilingual documents
• Need to persuade and train translators to use corpus-based tools
Thank you
for your attention
[email protected]
