
Automatic annotation of similes in literary texts PDF

233 Pages·2017·2.93 MB·English
by Suzanne Patience Mpouli Njanga Seh

Preview Automatic annotation of similes in literary texts

Université Pierre et Marie Curie
École Doctorale ED 130
Laboratoire d’Informatique de Paris VI, ACASA team

Automatic Annotation of Similes in Literary Texts

By Suzanne Patience Mpouli Njanga Seh
Doctoral thesis in Computer Science
Supervised by Jean-Gabriel Ganascia
Publicly presented and defended on 3 October 2016

Before a jury composed of:
- Mr Walter Daelemans, Professor, Universiteit Antwerpen: Reviewer
- Mr Stéphane Ferrari, Associate Professor [HDR], Université de Caen: Reviewer
- Ms Catherine Fuchs, Research Director, LATTICE-CNRS: Examiner
- Mr Jean-Gabriel Ganascia, Professor, UPMC: Thesis supervisor
- Mr Dominique Legallois, Professor, Université Sorbonne Nouvelle: Examiner
- Ms Vanda Luengo, Professor, UPMC: Examiner

To all the unsung heroes of the digital era. This thesis owes a lot to your hard work and selfless dedication.

Word Template by Friedman & Morgan 2014

ABSTRACT

This thesis tackles the problem of the automatic recognition of similes in literary texts written in English or in French and proposes a framework to describe them from a stylistic perspective. In the first part of this work, we are mainly interested in circumscribing the notion of simile and giving an overview of previous work and of existing annotated corpora of similes and comparisons. For the purpose of this study, a simile is defined as a syntactic structure that draws a parallel between at least two entities, lacks compositionality and is able to create an image in the receiver’s mind. In the second and last part, we present the designed method, its evaluation, and three of its possible applications in a literary context.

Three main points differentiate the proposed approach from existing ones: it is strongly influenced by cognitive and linguistic theories on similes and comparisons, it takes into consideration a wide range of markers and it can adapt to diverse syntactic scenarios. Concretely speaking, it relies on three interconnected modules:
- a syntactic module, which extracts potential simile candidates and identifies their components using grammatical roles and a set of handcrafted rules;
- a semantic module, which separates creative similes from both idiomatic similes and literal comparisons based on the salience of the ground and on semantic similarity computed from data automatically retrieved from machine-readable dictionaries;
- and an annotation module, which makes use of the XML format and provides, among other things, information on the type of comparison (idiomatic, perceptual…) and on the semantic categories used.

Finally, the two annotation tasks we designed show that the automatic detection of figuration in similes must take into consideration a series of features, among which salience, categorisation and the sentence syntax.

RÉSUMÉ

This thesis addresses the problem of automatically detecting figurative comparisons in literary prose texts written in French as well as in English and proposes a framework for describing these comparisons from a stylistic point of view. To this end, in the first part of this work, we set out to circumscribe the notion of figurative comparison and to present an overview of previous work in the field as well as of the heterogeneous practices in annotating comparisons in text corpora. In this study, a figurative comparison is understood as any syntactic structure that draws a parallel between at least two entities, departs from the principle of compositionality and creates a mental image in the mind of those to whom it is addressed. In the second part of this thesis, we present our method, some evaluation results and three of its possible applications to literary questions.

Three main elements distinguish our approach from previous work: its grounding in linguistic and cognitive theories on literal and figurative comparisons, its ability to handle markers belonging to different grammatical categories and its flexibility, which allows it to consider different syntactic scenarios. More concretely, we propose a method built around three complementary modules:
- a syntactic module, which uses syntactic structure and handcrafted rules to identify potential comparisons and their components;
- a semantic module, which measures the salience of the detected grounds and the semantic similarity of the compared terms, based on data automatically collected from electronic dictionaries;
- and an annotation module, which relies on the XML format and provides, among other things, information on the type of comparison (idiomatic, sensory…) and on the semantic categories used.

Finally, in light of the data collected during the two annotation campaigns we conducted, it seems clear that the automatic detection of figurative comparisons must take into account several factors, among which the salience of the ground, the semantic category of the compared terms and the sentence syntax.

ACKNOWLEDGEMENTS

Doing a PhD thesis is generally described as a nerve-racking experience, but it is also fulfilling in its own way, as it enables one to meet new people, to learn new things and to discover one’s own strengths and weaknesses as an academic. For this mind-shaping experience, I would like to thank my supervisor, Prof. Jean-Gabriel Ganascia, for the enthusiasm he has always shown for my thesis subject and for giving me free rein to choose in which directions to steer my research.

I am, of course, very grateful to the LabEx OBVIL and its Director, Prof. Didier Alexandre, for funding my PhD.

I would particularly like to convey my sincere appreciation to Prof. Walter Daelemans, Mr Stéphane Ferrari, Prof. Dominique Legallois, Prof. Vanda Luengo and Ms Catherine Fuchs, who readily agreed to be members of my jury and endeavoured to fit my defense into their schedules, in some cases despite prior engagements.

I would also like to thank my two annotators, Emmanuelle Kaas and Pauline Bruley, for their work and for sharing their insightful remarks with me. I was also positively influenced this last year by the constructive criticism of the two members of my midterm defense, Prof. Milad Doueihi and Mr Eric de la Clergerie.

For these three years full of ups and downs, laughter, food, drinks, and productive and less productive brainstorming, I would like to thank all my colleagues, past and present, of the LIP6-ACASA and the LabEx OBVIL: Alexandre, Amal, Amine, Bin, Carmen, Chiara, Elodie, Fiona, Francesca, Gauvain, Marine, Marianne, and Marissa. I am tremendously indebted to Mihnea Tufis, who poured more than his soul into the Dissimilitudes project. Words cannot express how much I admire his dedication and his quest for perfection.

On a more personal note, I would like to thank my uncle and his partner for their silent support and for putting up with my odd schedule. I am also indebted to all my friends who inquired about the progress of my thesis and patiently listened to me each time I needed it. I would especially like to express my deepest gratitude to Albert and Christelle, who took time to read parts of my thesis.

Last but not least, I know that I could never have done it without my wonderful cheerleading team: my parents, Marthe and Martin, to whom I owe everything and who have always encouraged me; my brothers, Guillaume and Thierry, who are never afraid to give me advice and to point out my mistakes; my in-laws, distant relatives and nephews, Raul, Emmanuella, Céline, Anatole, Melvin, Yann-Karel & Jaycee, who were a breath of fresh air. My heartfelt thanks go to my sister, Prisca, who has always believed in me and helps me every day to become a better version of myself.

Suzanne Mpouli - November 2016

Description:
Automatic Annotation of Similes in Literary Texts [...] similes and comparisons, it takes into consideration a wide range of markers and it can [...] Comparisons: How Similes Are Understood” (Gargani, 2014), the title immediately [...]