Abstracts
Dr. Vassiliki Foufi (School of Modern Greek), Eleni
Kogitsidou (Université de Grenoble), Dr. Athanasios
Mavropoulos (Centre for the Greek Language), Dr. Olympia
Tsaknaki (Aristotle University of Thessaloniki)
Compilation of a Literary Text Corpus
The ultimate goal of our research, carried out in the
context of the project Compilation of a parallel corpus of
French fiction translated into Greek, led by Prof. Titika
Dimitroulia and financed by the AUTH Research Committee, is
the construction of a literary bitext, with the aim of
enriching Greek digital content and studying key issues
of Translation and Translation Studies, such as the
contribution of translation to shaping the language of the
time, the imprint of the time on translation, the
translator's style, etc.
Parallel texts are considered very important for natural
language processing applications and linguistic research, as
they help to resolve semantic ambiguities and contribute to
terminology extraction and contrastive corpus studies.
We will present in detail the steps we followed to construct
the parallel corpus, based on electronic texts and using
multiple tools. First, we converted the files into editable
formats by means of ABBYY FineReader, which provides optical
character recognition (http://www.abbyy.com.gr/). Then, we
processed and corrected the texts in order to remove the
problems that arose during file conversion (unrecognized
accented characters, typographical errors, etc.). Finally,
text alignment at sentence level was performed with the
open-source LF Aligner
(http://sourceforge.net/projects/aligner/). We will present
mismatches between the source text and the translation, such
as different formatting, improper text segmentation in the
source language or the target language, mismatches between
the textual units, etc.
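The kind of mismatch check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the project: it assumes the aligned text has been exported as tab-separated lines (source segment, then target segment), and the function name flag_suspect_pairs and the length-ratio thresholds are hypothetical choices.

```python
def flag_suspect_pairs(lines, low=0.5, high=2.0):
    """Return (index, reason) for aligned pairs that look mismatched.

    Assumes each line is "source<TAB>target". The thresholds are
    illustrative defaults, not values taken from the project.
    """
    suspects = []
    for i, line in enumerate(lines):
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            suspects.append((i, "missing column"))
            continue
        src, tgt = parts[0].strip(), parts[1].strip()
        if not src or not tgt:
            suspects.append((i, "empty segment"))
            continue
        # A very short or very long target relative to the source
        # often signals a segmentation or alignment problem.
        ratio = len(tgt) / len(src)
        if ratio < low or ratio > high:
            suspects.append((i, "length ratio"))
    return suspects


sample = [
    "Il pleut.\tΒρέχει.",
    "Bonjour.\t",
    "Oui.\tΝαι, βέβαια, φυσικά και θα έρθω αύριο.",
]
suspect_pairs = flag_suspect_pairs(sample)
```

A character-length ratio is a crude heuristic; flagged pairs would still need manual inspection.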
Prof. Dionysis Goutsos (National
and Kapodistrian University of Athens)
Greek corpus building and analysis: The story so far and
what is to follow
The paper offers a state-of-the-art account of corpus
research on Greek, focusing on both corpus compilation and
analysis. It outlines the main phases of development of
Modern Greek corpora and presents the most important
findings on the description of the Greek language deriving
from corpora, with specific examples. The main focus of the
paper is on the relevance of these findings for the study of
translation in Greek, as well as their implications for
translation theory and practice. Finally, the perspectives
of corpus-related research on Greek are outlined and some
translation hypotheses for further exploration are pointed
out.
Prof. Titika Dimitroulia
(Aristotle University of Thessaloniki)
Design and compilation of a literary parallel corpus: aims
and applications
The compilation of a parallel corpus of French literary
fiction translated into Greek is situated in the context of
Corpus-based Translation Studies (CTS) in the
Greek-speaking world and its applications to translation
didactics. At the same time, the corpus, containing works
published after 1974, constitutes a first sample of
contemporary translated Greek literary discourse, and we
therefore believe that it can and indeed must be
incorporated, along with all similar works in progress, into
monolingual Greek corpora (HNC, SEK, Diachronic Greek
Corpora) and comparable corpora.
The choice of works in designing this small-scale
corpus, financed by the AUTH Research Committee, complies
with the criterion of representativeness/balance: an
effort was made to include works covering three centuries
(18th-20th) and different genres, so that the interference
of the source language and the genre can be studied better.
For the same reason, we tried to include the different
categories of translators (ordinary mediators and
established writers, academics, professional translators).
The texts are released with the permission of their
publishers and remain under copyright. This is why the
corpus, which will be online and open access, will allow
users to access and download the full texts.
Based on the corpus and taking into account extra-textual
parameters, we plan to study the style of individual
translators, collective stylistic features, authorship
attribution, translationese and translation universals
(explicitation, simplification, normalization etc.), among
others. In the field of translation didactics, we examine
the translation of culturèmes and intertextuality, and
discuss the concept of quality in literary translation.
Finally, the corpus will be used in the context of the
debate on digital literature and humanities today.
Prof. Fryni Kakoyianni-Doa & Dr. Eleni
Tziafa (University of Cyprus)
The SOURCe Project
The SOURCe project was developed in three parts and includes
(a) the search engine for the Searchable Online French-Greek
Parallel Corpus for the University of Cyprus (SOURCe), (b)
the Pencil and (c) the Library tool. These are designed as
freely available resources for language processing, along
with the data to be processed, in usable formats for
teachers, learners and translators. Our aim is to describe
the design principles and the properties of the SOURCe
Project and to outline its future perspectives and
applications. This project is led by Fryni Kakoyianni-Doa
and is fully funded by the University of Cyprus. The core of
the project is a collection of parallel corpora: original
and translated texts in French and Greek, aligned at
sentence level. In order to spare teachers and translators
long preparation and complex corpus-building, we propose the
construction of simple, online corpora with basic
text-searching facilities, avoiding machine-annotated,
tagged or parsed corpora, which are more appropriate for
detailed linguistic research. We designed a simple
interface, through which the user may search existing
corpora, upload texts, and see them online. Moreover, we
enabled different 'viewpoints' so that different types of
users can see different views of the same underlying
datasets.
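To illustrate what a basic text-searching facility over sentence-aligned texts involves, here is a minimal Python sketch. It is not the SOURCe implementation: the function name search_pairs, the in-memory list of (French, Greek) pairs and the sample sentences are all assumptions made for the example.

```python
def search_pairs(pairs, query, side="source"):
    """Return the aligned pairs whose chosen side contains the query.

    pairs: list of (source_sentence, target_sentence) tuples.
    side:  "source" searches the first element, anything else the second.
    Matching is case-insensitive; no tagging or parsing is involved.
    """
    idx = 0 if side == "source" else 1
    needle = query.casefold()
    return [pair for pair in pairs if needle in pair[idx].casefold()]


# Hypothetical sample data, for illustration only.
pairs = [
    ("La maison est grande.", "Το σπίτι είναι μεγάλο."),
    ("Il pleut.", "Βρέχει."),
]
hits_fr = search_pairs(pairs, "MAISON")
hits_el = search_pairs(pairs, "βρέχει", side="target")
```

Because each hit is returned as a pair, a match on one side always shows its counterpart in the other language.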
Prof. Rudy Loock (Université Lille 3)
Intra-language differences and translation quality
The aim of this presentation is to raise the question
whether the measurement of intra-language differences
between original language and translated language can be
used as a tool for translation quality assessment.
To ask such a question is to enter the thorny debate on the
interpretation of intra-language differences: should we
consider translated language as variation comparable to
dialectal variation or should we consider that the
over-representation or under-representation of a given
linguistic construction means that the quality of the
translation should be improved? From an even more general
perspective, should we consider that translated language is
intrinsically different and represents what researchers have
called a third code or should we consider that “the utopian
goal is to make it virtually impossible to tell the
translation from an original text in that language” (Teubert
1996: 241)?
Through the analysis of a learner corpus (translation tasks
from English to French performed by first-year and master’s
students) for two case studies (derived adverbs and
existential constructions), we try to see whether some
correlation can be found between the observed intra-language
differences and the overall quality of the translation
tasks.
Prof. Sofia Malamatidou (University of Birmingham)
Translation and Language Change: The Interplay of Diachronic
and Synchronic Corpus-Based Studies
Corpus-based research has yielded important insights into
translation in recent years, but most studies in the field
have focused on synchronic analyses, thus neglecting the
potential for diachronic analysis to enhance our
understanding of how translation might contribute to
important phenomena such as language change. Recently, a
number of scholars have adopted a corpus-based approach in
the investigation of translation as a form of language
contact and its impact on the target language. However, no
diachronic corpus-based study of translation involving
Modern Greek has so far been attempted. Similarly,
comparable and parallel corpora have not been efficiently
used by linguists for the analysis of diachronic phenomena.
This study aims to combine synchronic and diachronic
corpus-based approaches, as well as parallel and comparable
corpora for the analysis of linguistic features of
translated texts and their impact on non-translated texts.
Unlike most studies employing comparable corpora, which
focus on revealing recurrent features of translated language
independently of the SL and TL, this study approaches texts
with the intention of revealing features that are dependent
on the specific language pair involved in the translation
process, i.e. English and Modern Greek.
The study involves the diachronic analysis of the TROY
Corpus: a corpus of Modern Greek non-translated and
translated popular science articles, along with their
English source texts, covering a 20-year period (1990-2010)
and consisting of approximately half a million words. The
corpus is divided into three sections. The first subcorpus
consists of non-translated Modern Greek popular science
articles published in 1990-1991. The second subcorpus
consists of non-translated and translated Modern Greek
popular science articles published in 2003-2004, as well as
the source texts of the translations. The third subcorpus
includes non-translated as well as translated texts and
their source texts, all published in 2010-2011. The
linguistic feature analysed for the purposes of this study
is the frequency of passive-voice reporting verbs.
Spyridon Pilos (Head of Sector “Language
Applications”, Informatics Unit, Resources
Directorate, Directorate General for Translation, European
Commission, Luxembourg)
The public translation memories and corpora of DG
Translation of the European Commission
The Directorate General for Translation (DGT) has made
available two data sets relevant for translation: the DGT-TM
and the DGT-Acquis. The DGT-TM, i.e. "DGT Translation
Memory", is a collection of compressed multilingual files in
the TMX format, an XML-based standard used for the exchange
of Translation Memory data. The present update covers the
entire body of EU law as published in the L-Series of the
Official Journal between 1972 and 2012, in 23 official
languages of the EU. Updates are scheduled to be released on
a yearly basis. The DGT-Acquis is a paragraph-aligned
parallel corpus consisting of full text documents with added
meta-information on which paragraphs are aligned with which
others in the other languages. In this corpus, one can thus
see each sentence in its context, while in translation
memories, each sentence is in isolation, i.e. out of
context. DGT-Acquis contains not only the L-series of the
Official Journal but also the LM, C, CA and CE collections.
Both resources are available for download from the Joint
Research Centre's website on language technology resources.
DGT-TM is also accessible through the EU Open Data Portal.
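Since TMX is XML-based, translation units can be read with a standard XML parser. The Python sketch below extracts source/target pairs from a tiny TMX-like string; the function name extract_pairs, the sample content and the language codes "FR-FR" and "EL-GR" are assumptions for illustration and may not match the actual DGT-TM files.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def extract_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) segment pairs from a TMX document string."""
    src_lang, tgt_lang = src_lang.upper(), tgt_lang.upper()
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):          # one translation unit
        segs = {}
        for tuv in tu.findall("tuv"):   # one language variant
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").upper()
            seg = tuv.find("seg")
            if seg is not None:
                segs[lang] = "".join(seg.itertext()).strip()
        # Keep only units that have both requested languages.
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs


# Hypothetical miniature sample, not taken from DGT-TM.
SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="FR-FR"><seg>Article premier</seg></tuv>
<tuv xml:lang="EL-GR"><seg>Άρθρο 1</seg></tuv></tu>
<tu><tuv xml:lang="FR-FR"><seg>Sans traduction</seg></tuv></tu>
</body></tmx>"""

tmx_pairs = extract_pairs(SAMPLE, "fr-fr", "el-gr")
```

For large files, streaming the document (for example with ElementTree's iterparse) would likely be preferable to loading it whole.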
Prof. Mojca Schlamberger Brezar (University of Ljubljana)
Arguing for or against: connectives in translation from
French into Slovene through a parallel journalistic corpus
Since the 1990s, corpus linguistics has made revolutionary
progress in data processing. Translation studies is no
exception. We will discuss the rise of corpus linguistics
for Slovene, a language with just over 2 million speakers.
After a brief overview of the state of corpus linguistics
for Slovene, we will present research on French
argumentative and counter-argumentative connectives such as
parce que, puisque, mais, pourtant, etc. in the two parts,
journalistic and literary, of the French-Slovene parallel
corpus FraSloK, and will collect their equivalents in
Slovene translation. We will examine the argumentative
strategies used in the source text and their impact on the
choice of connectives in translation. We will analyse the
translation strategies for these connectives and study how
they depend on the text type, on the different language
registers present in the corpus and on the translators'
personal choices.
Prof. Federico Zanettin (Università di Perugia)
Corpora and literary translation research: Issues and
challenges
In this presentation, I consider applications of corpus
linguistics tools and methodologies to descriptive
translation studies. More specifically, I discuss
ways in which corpora of different types can help
investigate literary translation, both from a quantitative
and a qualitative perspective. I provide an overview of the
main research lines, namely research on so-called
translation universals, translation norms and translator
style. Finally, I consider the main stages of corpus
compilation and use, from corpus design and annotation to
search and visualization techniques, with a focus on
parallel corpora.