SENTIMENT ANALYSIS FOR MICRO-BLOGGING PLATFORMS IN ARABIC

by
Eshrag Ali Ahmad Refaee

Submitted for the degree of Doctor of Philosophy

Department of Computer Science
School of Mathematical and Computer Sciences
Heriot-Watt University

July 2016

The copyright in this thesis is owned by the author. Any quotation from the report or use of any of the information contained in it must acknowledge this report as the source of the quotation or information.

Abstract

Sentiment Analysis (SA) concerns the automatic extraction and classification of sentiments conveyed in a given text, i.e. labelling a text instance as positive, negative or neutral. SA research has attracted increasing interest in the past few years due to its numerous real-world applications. The recent interest in SA is also fuelled by the growing popularity of social media platforms (e.g. Twitter), as they provide large amounts of freely available and highly subjective content that can be readily crawled. Most previous SA work has focused on English with considerable success. In this work, we focus on studying SA in Arabic, as a less-resourced language.

This work reports on a wide set of investigations for SA in Arabic tweets, systematically comparing three existing approaches that have been shown to be successful in English. Specifically, we report experiments evaluating fully-supervised-based (SL), distant-supervision-based (DS), and machine-translation-based (MT) approaches for SA. The investigations cover training SA models on manually-labelled (i.e. in SL methods) and automatically-labelled (i.e. in DS methods) data-sets. In addition, we explored an MT-based approach that utilises existing off-the-shelf SA systems for English with no need for training data, assessing the impact of translation errors on the performance of SA models, which has not been previously addressed for Arabic tweets.
Unlike previous work, we benchmark the trained models against an independent test-set of >3.5k instances collected at different points in time to account for topic-shift issues in the Twitter stream. Despite the challenging noisy medium of Twitter and the mixed use of Dialectal and Standard forms of Arabic, we show that our SA systems are able to attain performance scores on Arabic tweets that are comparable to the state-of-the-art SA systems for English tweets.

The thesis also investigates the role of a wide set of features, including syntactic, semantic, morphological, language-style and Twitter-specific features. We introduce a set of affective-cues/social-signals features that capture information about the presence of contextual cues (e.g. prayers, laughter, etc.) to correlate them with the sentiment conveyed in an instance. Our investigations reveal a generally positive impact of utilising these features for SA in Arabic. Specifically, we show that a rich set of morphological features, which has not been previously used, extracted using a publicly-available morphological analyser for Arabic, can significantly improve the performance of SA classifiers. We also demonstrate the usefulness of language-independent features (e.g. Twitter-specific) for SA. Our feature-sets outperform results reported in previous work on a previously built data-set.

Dedication

To the soul of my father.

Acknowledgements

I would like to thank my supervisor, Verena Rieser, for her patient guidance, endless support and encouragement throughout my Ph.D. Verena has been a supervisor and a best friend. Her good advice and acts of kindness make her what it means to be the best person one can aim to become and to know, learn from and work with. I am especially thankful to Verena for her support and genuine empathy during the difficult times when I lost my father halfway through my Ph.D. I would never have finished this thesis without her support.
I would like to thank Helen Hastie and Rob Pooley for the helpful comments and discussions during Verena’s maternity leave. I thank members of the Interaction Lab, from whom I have come to learn new things. I would also like to thank the IT Helpdesk staff, especially Iain Mccrone. Admin staff in the MACS department have been very helpful, especially Sandra McArthur, Claire Porter and Christine McBride. I am very grateful to my mother and siblings for their love and support. I would like to express my deep gratitude for the generous scholarship to pursue my postgraduate studies granted by the Royal Embassy of Saudi Arabia and the Cultural Bureau in London. Finally, I would like to extend thanks to Jazan University in Saudi Arabia for offering me a job as a Research Associate at the School of Computer Sciences.

[Arabic text]
Social Media is the elephant in the room - no decision management system will escape the impact of social media. Social Media Monitoring, including sentiment analysis, will become more and more a commodity and focus will be on integration with decision-based systems.
-- Olivier Jouve, IBM

Contents

1 Introduction
  1.1 Motivation
    1.1.1 Thesis Goals
  1.2 Contributions
  1.3 Thesis Outline
  1.4 Thesis Publications
2 Background
  2.1 The Problem of Sentiment Analysis
    2.1.1 Research on Sentiment Analysis
      2.1.1.1 Sentiment Analysis SubTasks
      2.1.1.2 Sentiment Analysis Domains
  2.2 Mining Social Media for Sentiments
    2.2.1 Challenges of Social Media Data
    2.2.2 Sentiment Analysis in Twitter
  2.3 The Arabic Language and its Presence in Social Media
    2.3.1 Why an Arabic Corpus of Social Media Content?
    2.3.2 Current Efforts to Develop NLP Tools and Resources for Arabic and its Dialects
    2.3.3 What are the challenges of SA in Arabic?
  2.4 Sentiment Analysis: Prominent Approaches
    2.4.1 Lexicon-based Approach
    2.4.2 Machine Learning Approaches
    2.4.3 Distant-Supervision Approaches
      2.4.3.1 Conventional Markers + Machine Learning
      2.4.3.2 Lexicon-based + Machine Learning
    2.4.4 Sentiment Analysis on Arabic Tweets: Issues Identified
  2.5 Sentiment Analysis of Arabic Social Media: A Framework
  2.6 Summary
3 Experimental Setup
  3.1 Data Collection and Annotation
    3.1.1 Gold-Standard Training Data-sets: Manual Annotation
      3.1.1.1 Sentiment Annotation
    3.1.2 Distant Supervision Training Data-sets: Automatic Annotation with Twitter’s Conventional Markers
      3.1.2.1 Sentiment Annotation
    3.1.3 Distant Supervision Training Data-sets: Automatic Annotation with Lexicon-based Methods
    3.1.4 Test Data-set
  3.2 Data Pre-processing
    3.2.1 Stemming Experiments
  3.3 Features Extraction
  3.4 Levels of Sentiment Classification
  3.5 Machine Learning Schemes
    3.5.1 Baselines
  3.6 Performance Evaluation
    3.6.1 Evaluation Metrics
    3.6.2 Evaluation Methods
      3.6.2.1 Cross-Validation (CV)
      3.6.2.2 Independent Test-set
    3.6.3 Statistical Tests
  3.7 Experimental Setting Optimisation
  3.8 Summary
4 Supervised Learning Approach
  4.1 Related Work
  4.2 Experiments on M&D Data-set
  4.3 Experiments on GS1 Data-set
    4.3.1 Binary classification: Polar vs. Neutral
    4.3.2 Binary classification: Positive vs. Negative
    4.3.3 Three-way classification: Positive vs. Negative vs. Neutral
    4.3.4 Summary of GS1 Results
  4.4 Experiments on GS2 Data-set
    4.4.1 Binary classification: Polar vs. Neutral
    4.4.2 Binary classification: Positive vs. Negative
    4.4.3 Three-way classification: Positive vs. Negative vs. Neutral
    4.4.4 Summary of GS2 Results
  4.5 Experiments on GS1+GS2 Data-set
    4.5.1 Binary classification: Polar vs. Neutral
    4.5.2 Binary classification: Positive vs. Negative
    4.5.3 Three-way classification: Positive vs. Negative vs. Neutral
  4.6 Conclusions
  4.7 Summary
5 Distant Supervision Approaches
  5.1 Introduction
    5.1.1 Why Distant Supervision?
    5.1.2 What Are the Alternatives?
  5.2 Related Work
    5.2.1 Conventional-Markers-based DS Approach
    5.2.2 Lexicon-based DS Approach
      5.2.2.1 A Lexicon-based Approach for SA
      5.2.2.2 A Combined Approach for SA: Lexicon-based + Machine-Learning
  5.3 DS Experiments: Part One
    5.3.1 Experiments on the Emoticon-based (Emo1) Data-set
    5.3.2 Experiments on the Lexicon-presence-based (Lex-Pres1) Data-set
    5.3.3 Experiments on the Lexicon-aggregation-based (Lex-Aggreg1) Data-set
    5.3.4 Summary of Part One Results
  5.4 DS Experiments: Part Two
    5.4.1 Experiments on Emoticon-based (Emo2) and Hashtag-based (Hash) Data-sets
      5.4.1.1 Error Analysis for Emoticon-Based DS Data-set
      5.4.1.2 Learning Curves on Emoticon-based data-sets: Arabic vs. English
    5.4.2 Experiments on Extended Lexicon-based Data-sets
      5.4.2.1 Error Analysis for Lexicon-Based DS Data-set
    5.4.3 Summary of DS Experiments Part 2
  5.5 Discussion of DS Results
    5.5.1 Comparison with Previous Work
    5.5.2 Other Factors Influencing Performance in DS methods
  5.6 Summary
6 Machine Translation Based Approaches
  6.1 Related Work
  6.2 Approach
    6.2.1 Generating English Translation
    6.2.2 Experiments on MT-based Approach: Using the Stanford Sentiment Classifier
      6.2.2.1 Sentiment Annotation
      6.2.2.2 Experiment Results
      6.2.2.3 Error Analysis
    6.2.3 Experiments on MT-based Approach: Using the Emoticon English Data-set (Emo-Eng)