Provided by the author(s) and NUI Galway in accordance with publisher policies. Please cite the published version when available.

Title: Augmenting Social Media Items with Metadata using Related Web Content
Author(s): Kinsella, Sheila
Publication Date: 2012-01-23
Item record: http://hdl.handle.net/10379/2674
Downloaded: 2022-12-31T01:04:40Z

Some rights reserved. For more information, please see the item record link above.

Augmenting Social Media Items with Metadata using Related Web Content

Sheila Kinsella

Submitted in fulfillment of the requirements for the degree of Doctor of Philosophy

Supervisor: Dr. John Breslin
Internal Examiner: Prof. Dr. Stefan Decker
External Examiner: Dr. Fabien Gandon

Digital Enterprise Research Institute (DERI),
National University of Ireland, Galway (NUIG)

January 2012

Abstract

The Web has shifted from a read-only medium where most users were solely consumers of information, to an interactive medium where collaborative technologies allow anyone to publish or edit content. In this environment, social media such as social network sites, blogs, wikis, and content-sharing websites have flourished and now masses of users are contributing to the pool of human knowledge that is the Web. This large-scale user participation means that the content-creation capacity of the Web has exploded and there is now wide coverage of news, niche interests and hyperlocal content, all available in real-time. In short, Web 2.0 services have successfully harnessed collective intelligence and a huge and diverse information source has emerged.

The downside of social media as an information source is that often the individual items are very short, informal and lacking in metadata. Despite the wealth of information available in online communities, locating objects of interest can still be challenging.
The search and navigation of social media could be greatly improved by augmenting the content of social media items with annotations to provide additional context or descriptors.

This thesis investigates the potential of using related data from the Web to enrich social media items with metadata and thus make it easier to find or browse information in social media. We provide three methods by which social media items can be augmented with novel metadata, specifically tags, locations and categories. Our approaches make use of existing Web data retrieved from HTML documents, APIs and Linked Data. We describe how Semantic Web technologies can be used to represent social media posts and their metadata in a uniform way and thus allow enhanced search and browsing over online community data integrated from heterogeneous sources.

Declaration

I declare that this thesis is composed by myself, that the work contained herein is my own except where explicitly stated otherwise in the text, and that this work has not been submitted for any other degree or professional qualification except as specified.

The research presented in this thesis was supported by Science Foundation Ireland under Grant No. SFI/02/CE1/I131 (Lion) and Grant No. SFI/08/CE/I1380 (Lion-2), and by the European Commission under contract 215032 (OKKAM) and contract FP7-248984 (GLOCAL).

Sheila Kinsella
January 18, 2012

Acknowledgements

I would like to thank my advisor John for his guidance and encouragement over the last few years, as well as my examiners Stefan and Fabien for their valuable feedback and discussions.

I am grateful to all my friends and colleagues at DERI for their help and inspiration. Thanks especially to my collaborators Alexandre, Andreas, Conor, Mengjiao and Uldis. Special thanks also to all the current and past members of the Social Software Unit, who provided lots of much-appreciated support and a great working environment. I am also grateful to my colleagues at EPFL and Yahoo!
Research for the valuable internships, especially Adriana, Gleb, Sebastian and Karl in Lausanne, and Vanessa and Neil in Barcelona.

Thanks to Richard for always being there for me. Thanks to my friends, especially Fiona, for lending an ear and taking my mind off things when needed. Most importantly, thanks to my family for the many years of support and encouragement for my study, through school and my undergraduate years as well as through this PhD.

Core Publications

Papers in Conference Proceedings

• Sheila Kinsella, Mengjiao Wang, John Breslin, Conor Hayes: Improving Categorisation in Social Media using Hyperlinks to Structured Data Sources. The 8th Extended Semantic Web Conference (ESWC 2011), Springer, 2011

• Sheila Kinsella, Alexandre Passant, John Breslin: Topic Classification in Social Media using Metadata from Hyperlinked Objects. The 33rd European Conference on Information Retrieval (ECIR 2011), Springer, 2011

• Sheila Kinsella, Alexandre Passant, John G. Breslin: Using Hyperlinks to Enrich Message Board Content with Linked Data. The 6th International Conference on Semantic Systems (I-SEMANTICS 2010), ACM, 2010

• Sheila Kinsella, Uldis Bojars, Andreas Harth, John G. Breslin, Stefan Decker: An Interactive Map of Semantic Web Ontology Usage. The 12th International Conference on Information Visualisation (IV08), IEEE Computer Society, 2008

Papers in Workshop Proceedings

• Sheila Kinsella, Vanessa Murdock, Neil O’Hare: “I’m Eating a Sandwich in Glasgow”: Modeling Locations with Tweets. The 3rd International Workshop on Search and Mining User-generated Contents (SMUC 2011) at the 20th International Conference on Information and Knowledge Management (CIKM 2011), ACM, 2011

• Sheila Kinsella, Adriana Budura, Gleb Skobeltsyn, Sebastian Michel, John G. Breslin, Karl Aberer: From Web 1.0 to Web 2.0 and Back – How did your Grandma use to Tag? The 10th International Workshop on Web Information and Data Management (WIDM 2008) at the 17th International Conference on Information and Knowledge Management (CIKM 2008), ACM, 2008