
Web Mining Applications in E-commerce and E-services

187 Pages·2009·8.85 MB·English


I-Hsien Ting and Hui-Ju Wu (Eds.)
Web Mining Applications in E-Commerce and E-Services

Studies in Computational Intelligence, Volume 172

Editor-in-Chief
Prof. Janusz Kacprzyk
Systems Research Institute
Polish Academy of Sciences
ul. Newelska 6
01-447 Warsaw
Poland
E-mail: [email protected]

Further volumes of this series can be found on our homepage: springer.com

Vol. 150. Roger Lee (Ed.), Software Engineering Research, Management and Applications, 2008. ISBN 978-3-540-70774-5
Vol. 151. Tomasz G. Smolinski, Mariofanna G. Milanova and Aboul-Ella Hassanien (Eds.), Computational Intelligence in Biomedicine and Bioinformatics, 2008. ISBN 978-3-540-70776-9
Vol. 152. Jarosław Stepaniuk, Rough–Granular Computing in Knowledge Discovery and Data Mining, 2008. ISBN 978-3-540-70800-1
Vol. 153. Carlos Cotta and Jano van Hemert (Eds.), Recent Advances in Evolutionary Computation for Combinatorial Optimization, 2008. ISBN 978-3-540-70806-3
Vol. 154. Oscar Castillo, Patricia Melin, Janusz Kacprzyk and Witold Pedrycz (Eds.), Soft Computing for Hybrid Intelligent Systems, 2008. ISBN 978-3-540-70811-7
Vol. 155. Hamid R. Tizhoosh and M. Ventresca (Eds.), Oppositional Concepts in Computational Intelligence, 2008. ISBN 978-3-540-70826-1
Vol. 156. Dawn E. Holmes and Lakhmi C. Jain (Eds.), Innovations in Bayesian Networks, 2008. ISBN 978-3-540-85065-6
Vol. 157. Ying-ping Chen and Meng-Hiot Lim (Eds.), Linkage in Evolutionary Computation, 2008. ISBN 978-3-540-85067-0
Vol. 158. Marina Gavrilova (Ed.), Generalized Voronoi Diagram: A Geometry-Based Approach to Computational Intelligence, 2009. ISBN 978-3-540-85125-7
Vol. 159. Dimitri Plemenos and Georgios Miaoulis (Eds.), Artificial Intelligence Techniques for Computer Graphics, 2009. ISBN 978-3-540-85127-1
Vol. 160. P. Rajasekaran and Vasantha Kalyani David, Pattern Recognition using Neural and Functional Networks, 2009. ISBN 978-3-540-85129-5
Vol. 161. Francisco Baptista Pereira and Jorge Tavares (Eds.), Bio-inspired Algorithms for the Vehicle Routing Problem, 2009. ISBN 978-3-540-85151-6
Vol. 162. Costin Badica, Giuseppe Mangioni, Vincenza Carchiolo and Dumitru Dan Burdescu (Eds.), Intelligent Distributed Computing, Systems and Applications, 2008. ISBN 978-3-540-85256-8
Vol. 163. Pawel Delimata, Mikhail Ju. Moshkov, Andrzej Skowron and Zbigniew Suraj, Inhibitory Rules in Data Analysis, 2009. ISBN 978-3-540-85637-5
Vol. 164. Nadia Nedjah, Luiza de Macedo Mourelle, Janusz Kacprzyk, Felipe M.G. França and Alberto Ferreira de Souza (Eds.), Intelligent Text Categorization and Clustering, 2009. ISBN 978-3-540-85643-6
Vol. 165. Djamel A. Zighed, Shusaku Tsumoto, Zbigniew W. Ras and Hakim Hacid (Eds.), Mining Complex Data, 2009. ISBN 978-3-540-88066-0
Vol. 166. Constantinos Koutsojannis and Spiros Sirmakessis (Eds.), Tools and Applications with Artificial Intelligence, 2009. ISBN 978-3-540-88068-4
Vol. 167. Ngoc Thanh Nguyen and Lakhmi C. Jain (Eds.), Intelligent Agents in the Evolution of Web and Applications, 2009. ISBN 978-3-540-88070-7
Vol. 168. Andreas Tolk and Lakhmi C. Jain (Eds.), Complex Systems in Knowledge-based Environments: Theory, Models and Applications, 2009. ISBN 978-3-540-88074-5
Vol. 169. Nadia Nedjah, Luiza de Macedo Mourelle and Janusz Kacprzyk (Eds.), Innovative Applications in Data Mining, 2009. ISBN 978-3-540-88044-8
Vol. 170. Lakhmi C. Jain and Ngoc Thanh Nguyen (Eds.), Knowledge Processing and Decision Making in Agent-Based Systems, 2009. ISBN 978-3-540-88048-6
Vol. 171. Chi-Keong Goh, Yew-Soon Ong and Kay Chen Tan (Eds.), Multi-Objective Memetic Algorithms, 2009. ISBN 978-3-540-88050-9
Vol. 172. I-Hsien Ting and Hui-Ju Wu (Eds.), Web Mining Applications in E-Commerce and E-Services, 2009. ISBN 978-3-540-88080-6

I-Hsien Ting, Hui-Ju Wu (Eds.)
Web Mining Applications in E-Commerce and E-Services

Dr. I-Hsien Ting
Department of Information Management
National University of Kaohsiung
No. 700, Kaohsiung University Road
Kaohsiung City, 811
Taiwan
Email: [email protected]

Dr. Hui-Ju Wu
Institute of Human Resource Management
National Changhua University of Education
No. 2, Shi-Da Road
Changhua City, 500
Taiwan
Email: [email protected]

ISBN 978-3-540-88080-6
e-ISBN 978-3-540-88081-3
DOI 10.1007/978-3-540-88081-3
Studies in Computational Intelligence, ISSN 1860-949X
Library of Congress Control Number: 2008935505

© 2009 Springer-Verlag Berlin Heidelberg

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typeset & Cover Design: Scientific Publishing Services Pvt. Ltd., Chennai, India.
Printed in acid-free paper
springer.com

Preface

Web mining has become a popular area of research, integrating the different research areas of data mining and the World Wide Web. According to the taxonomy of Web mining, there are three sub-fields of Web-mining research: Web usage mining, Web content mining and Web structure mining. These three research fields cover most content and activities on the Web.
With the rapid growth of the World Wide Web, Web mining has become a hot topic and is now part of the mainstream of Web research, such as Web information systems and Web intelligence. Among all of the possible applications in Web research, e-commerce and e-services have been identified as important domains for Web-mining techniques. Web-mining techniques also play an important role in e-commerce and e-services, proving to be useful tools for understanding how e-commerce and e-service Web sites and services are used, enabling the provision of better services for customers and users. Thus, this book focuses on Web-mining applications in e-commerce and e-services.

Some chapters in this book are extended from papers presented at WMEE 2008 (the 2nd International Workshop for E-commerce and E-services). In addition, we sent invitations to well-known researchers in this area to contribute to this book. The chapters of this book are introduced as follows:

In chapter 1, Peter I. Hofgesang presents an introduction to online web usage mining and provides background information followed by a comprehensive overview of the related work. In addition, the chapter outlines the major, and yet mostly unsolved, challenges in the field.

In chapter 2, Gulden Uchyigit presents an overview of some of the techniques, algorithms and methodologies, along with the challenges, of using semantic information in the representation of domain knowledge, user needs and the recommendation algorithms.

In chapter 3, Bettina Berendt and Daniel Trümper describe a novel method for analyzing large corpora. Using an ontology created with methods of global analysis, a corpus is divided into groups of documents sharing similar topics. The introduced local analysis allows the user to examine the relationships between documents in more detail.

In chapter 4, Jean-Pierre Norguet et al. propose a method based on output page mining and present a solution that answers the need for summarized and conceptual audience metrics in Web analytics. The authors describe several methods for collecting the Web pages output by Web servers and explain how aggregating the occurrences of taxonomy terms in these pages can provide audience metrics for the Web site topics.

In chapter 5, Leszek Borzemski presents empirical experience gained from Web performance mining research, in particular from the development of a predictive model describing Web performance behavior from the perspective of end-users. The author evaluates Web performance from the perspective of Web clients; Web performance is therefore considered in the sense of Web server-to-browser throughput or Web resource download speed.

In chapter 6, Ali Mroue and Jean Caussanel describe an approach for automatically finding the prototypical browsing behavior of web users. User access logs are examined in order to extract the most significant user navigation access patterns. Such an approach gives an efficient way to better understand how users act, and leads to improvements in the structure of websites that ease navigation.

In chapter 7, Istvan K. Nagy and Csaba Gaspar-Papanek investigate the time spent on web pages (TSP) as a disregarded indicator of the quality of online content. The authors present the factors that influence the TSP measure and give a TSP data pre-processing methodology that eliminates the effects of these factors. In addition, the authors introduce the concepts of sequential browsing and revisitation to restore users' navigation patterns more exactly, based on TSP and the restored browser stack.

In chapter 8, Yingzi Jin et al. describe an attempt to learn the ranking of companies from a social network that has been mined from the web.
The authors conduct an experiment using the social network among 312 Japanese companies related to the electrical products industry to learn and predict the ranking of companies according to their market capitalization. This study specifically examines a new approach to using web information for advanced analysis by integrating multiple relations among named entities.

In chapter 9, Jun Shen and Shuai Yuan propose a modelling-based approach to design and develop a P2P-based service coordination system and its components. The peer profiles are described with the WSMO (Web Service Modelling Ontology) standard, mainly for quality of service and geographic features of the e-services, which would be invoked by various peers. To fully explore the usability of service categorization and mining, the authors implement an ontology-driven unified algorithm to select the most appropriate peers. The UOW-SWS prototype also shows that the enhanced peer coordination is more adaptive and effective in dynamic business processes.

In chapter 10, I-Hsien Ting and Hui-Ju Wu study the issues of using web mining techniques for on-line social networks analysis. The chapter introduces and reviews the techniques and concepts of web mining and social networks analysis, and discusses how to use web mining techniques for on-line social networks analysis. Moreover, it proposes a process for using web mining for on-line social networks analysis, which can be treated as a general process in this research area. Discussions of the challenges and future research are also included.

In summary, this book's content sets out to highlight the trends in theory and practice that are likely to influence e-commerce and e-services practices in web mining research.
Through applying Web-mining techniques to e-commerce and e-services, value is enhanced and the research fields of Web mining, e-commerce and e-services can be expanded.

I-Hsien Ting
Hui-Ju Wu

Contents

Online Mining of Web Usage Data: An Overview
Peter I. Hofgesang

Semantically Enhanced Web Personalization
Gulden Uchyigit

Semantics-Based Analysis and Navigation of Heterogeneous Text Corpora: The Porpoise News and Blogs Engine
Bettina Berendt, Daniel Trümper

Semantic Analysis of Web Site Audience by Integrating Web Usage Mining and Web Content Mining
Jean-Pierre Norguet, Esteban Zimányi, Ralf Steinberger

Towards Web Performance Mining
Leszek Borzemski

Anticipate Site Browsing to Anticipate the Need
Ali Mroue, Jean Caussanel

User Behaviour Analysis Based on Time Spent on Web Pages
Istvan K. Nagy, Csaba Gaspar-Papanek

Ranking Companies on the Web Using Social Network Mining
Yingzi Jin, Yutaka Matsuo, Mitsuru Ishizuka

Adaptive E-Services Selection in P2P-Based Workflow with Multiple Property Specifications
Jun Shen, Shuai Yuan

Web Mining Techniques for On-Line Social Networks Analysis: An Overview
I-Hsien Ting, Hui-Ju Wu

Author Index

Online Mining of Web Usage Data: An Overview

Peter I. Hofgesang
VU University Amsterdam, Department of Computer Science
De Boelelaan 1081A, 1081 HV Amsterdam, The Netherlands
[email protected]

Abstract.
In recent years, web usage mining techniques have helped online service providers to enhance their services, and restructure and redesign their websites in line with the insights gained. The application of these techniques is essential in building intelligent, personalised online services. More recently, it has been recognised that the shift from traditional to online services – and so the growing numbers of online customers and the increasing traffic generated by them – brings new challenges to the field. Highly demanding real-world E-commerce and E-services applications, where the rapid, and possibly changing, large volume data streams do not allow offline processing, motivate the development of new, highly efficient real-time web usage mining techniques. This chapter provides an introduction to online web usage mining and presents an overview of the latest developments. In addition, it outlines the major, and yet mostly unsolved, challenges in the field.

Keywords: Online web usage mining, survey, incremental algorithms, data stream mining.

1 Introduction

In the case of traditional, "offline" web usage mining (WUM), usage and other user-related data are analysed and modelled offline. The mining process is not time-limited, the entire process typically takes days or weeks, and the entire data set is available upfront, prior to the analysis. Algorithms may perform several iterations on the entire data set and thus data instances can be read more than once.

However, as the number of online users – and the traffic generated by them – greatly increases, these techniques become inapplicable. Services with more than a critical amount of user access traffic need to apply highly efficient, real-time processing techniques that are constrained both computationally and in terms of memory requirements. Real-time, or online, WUM techniques (as we refer to them throughout this chapter) that provide solutions to these problems have received great attention recently, both from academics and the industry.

Figure 1 provides a schematic overview of the online WUM process.
User interactions with the web server are presented as a continuous flow of usage data; the data are pre-processed – including being filtered and sessionised – on-the-fly; models are incrementally updated when new data instances arrive and refreshed models are applied, e.g. to update (personalised) websites, to instantly alert on detected changes in user behaviour, and to report on performance analysis or on results of monitoring user behaviour to support online decision making.

Fig. 1. An overview of online WUM. User interactions with a web server are pre-processed continuously and fed into online WUM systems that process the data and update the models in real-time. The outputs of these models are used to, e.g. monitor user behaviour in real-time, to support online decision making, and to update personalised services on-the-fly.

I.-H. Ting, H.-J. Wu (Eds.): Web Mining Appl. in E-Commerce & E-Services, SCI 172, pp. 1–23. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

This book chapter is intended to be an introduction to online WUM and it aims to provide an overview of the latest developments in the field and so, in this respect, it is – to the best of our knowledge – the first survey on the topic.

The remainder of this chapter is organised as follows. In Section 2, we provide a brief general introduction to WUM and the new online challenges. We survey the literature related to online WUM in three sections (Sections 3, 4, and 5). Section 3 overviews the efficient and compact structures used in (or even developed for) online WUM. Section 4 overviews online algorithms for WUM, while Section 5 presents the work related to real-time monitoring systems. The most important (open) challenges are described in Section 6. Finally, the last section provides a discussion.
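As a rough illustration of the online WUM loop outlined in this introduction (on-the-fly sessionising plus incremental model updates), the following Python sketch maintains page-to-page transition counts while reading each click exactly once. It is a minimal sketch under stated assumptions and not a method taken from the chapter: the class name `OnlineTransitionModel`, the 30-minute inactivity timeout, and the sample stream are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Assumed inactivity threshold in seconds; the chapter does not prescribe one.
SESSION_TIMEOUT = 30 * 60


@dataclass
class OnlineTransitionModel:
    """Hypothetical model: counts page-to-page transitions, one click at a time."""
    timeout: float = SESSION_TIMEOUT
    # user id -> (timestamp, page) of that user's most recent click
    last_click: dict = field(default_factory=dict)
    # (from_page, to_page) -> number of observed transitions
    counts: defaultdict = field(default_factory=lambda: defaultdict(int))

    def update(self, user: str, timestamp: float, page: str) -> None:
        previous = self.last_click.get(user)
        if previous is not None:
            prev_time, prev_page = previous
            # Sessionise on the fly: a long gap starts a new session,
            # so no transition is recorded across it.
            if timestamp - prev_time <= self.timeout:
                self.counts[(prev_page, page)] += 1
        self.last_click[user] = (timestamp, page)


model = OnlineTransitionModel()
# Made-up click stream: (user id, seconds since start, requested page).
stream = [
    ("u1", 0, "/home"),
    ("u2", 5, "/home"),
    ("u1", 20, "/products"),
    ("u2", 60, "/products"),
    ("u1", 4000, "/home"),  # gap longer than the timeout: new session for u1
]
for user, ts, page in stream:  # single pass; no click is read twice
    model.update(user, ts, page)
```

The per-user state here grows with the number of distinct users, so a deployment under the memory constraints mentioned above would presumably also need to expire idle users; that step is omitted to keep the sketch short.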
2 Background

This section provides a background to traditional WUM; describes incremental learning to efficiently update WUM models in a single pass over the clickstream; and, finally, it motivates the need for highly efficient real-time, change-aware algorithms for high volume, streaming web usage data through the description of web dynamics, characterising changing websites and usage data.

2.1 Web Usage Mining

Web or application servers log all relevant information available on user–server interaction. These log data, also known as web user access or clickstream data, can be used to explore, model, and predict user behaviour. WUM is the application of data mining techniques to perform these steps, to discover and analyse patterns automatically in (enriched) clickstream data. Its applications include customer profiling, personalisation of online services, product and content recommendations, and various other applications in E-commerce and web marketing. There are three major stages in the WUM process (see Figure 2): (I) data collection and pre-processing, (II) pattern discovery, and (III) pattern analysis (see, for example, [18, 51, 67]).

Web Usage Data Sources. The clickstream data contain information on each user click, such as the date and time of the clicks, the URI of visited web sources, and some sort of user identifier (IP, browser type and, in the case of authentication-required sites, login names). An example of (artificially designed) user access log data can be seen in Table 1.

In addition to server-side log data, some applications allow the installation of special software on the client side (see, for example, [3]) to collect various other information (e.g. scrolling activity, active window), and, in some cases, more reliable information (e.g. actual page view time). Web access information can be further enriched by, for example, user registration information, search queries, and geographic and demographic information.
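To make the clickstream fields listed above concrete (date and time, URI, and a user identifier built from IP, browser type and login name), the following Python sketch parses a single server-side log record. It assumes the widely used Combined Log Format; the chapter's own Table 1 layout may differ. The names `LOG_PATTERN`, `Click` and `parse_click`, as well as the sample record, are hypothetical and only serve the illustration.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Assumed layout: Combined Log Format (IP, identity, login, time, request,
# status, size, referrer, user agent). Not taken from the chapter.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<login>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


class Click(NamedTuple):
    timestamp: datetime    # date and time of the click
    uri: str               # URI of the visited web resource
    ip: str                # user identifier: client IP address
    agent: str             # user identifier: browser type
    login: Optional[str]   # login name, only on authentication-required sites


def parse_click(line: str) -> Optional[Click]:
    """Return the parsed click, or None if the line does not match the layout."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    login = match.group("login")
    return Click(
        timestamp=datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        uri=match.group("uri"),
        ip=match.group("ip"),
        agent=match.group("agent"),
        login=None if login == "-" else login,
    )


# Artificially designed sample record (not real data).
sample = (
    '192.0.2.10 - alice [10/Oct/2008:13:55:36 +0200] '
    '"GET /products/42 HTTP/1.1" 200 2326 '
    '"http://example.org/home" "Mozilla/5.0"'
)
click = parse_click(sample)
```

Returning `None` for lines that do not match keeps malformed records out of later stages, which is one simple way to treat them as noise.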
Pre-processing. Raw log data need to be pre-processed; first, by filtering all irrelevant data and possible noise, then by identifying unique visitors, and by recovering

Fig. 2. An overview of the web usage mining process
