MODELING USERS FOR ONLINE ADVERTISING

BY QIANG MA

A dissertation submitted to the
Graduate School–New Brunswick
Rutgers, The State University of New Jersey
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
Graduate Program in Computer Science

Written under the direction of
S. Muthukrishnan
and approved by

New Brunswick, New Jersey
October, 2016

ABSTRACT OF THE DISSERTATION

Modeling Users for Online Advertising

by Qiang Ma

Dissertation Director: S. Muthukrishnan

Online advertising is able to target users at a fine level of granularity. To do this effectively, models are required to represent users and their behavior. In this thesis, we studied several problems related to models of online users. In online advertising, advertisers and ad platforms use user profiles as the language to target users, composed of user information from demographics, location, and interests. We implemented a user-profile-driven ad crawling framework and empirically investigated the relationship between user profiles and the ads to which they are exposed. We observed user profiles to play a greater role in display ads than in video ads. Furthermore, the main mode of accessing online content has been shifting from website browsing to mobile application usage. Mobile apps have become the building blocks to model user behavior on mobile devices. We designed a neural network model (app2vec) to vectorize mobile apps by studying how users employ these apps. We analyzed the learned app vectors qualitatively and quantitatively and used them to extract user app usage profiles for app-install advertising. Finally, advertisers are faced with the challenge of finding the optimal user profile properties to target. We designed a look-alike audience extension system, where advertisers provide a list of past converters as "seed users" and our system determines users similar to the seed. Rather than assuming linear separability of look-alike and non-look-alike users, as in prior work, we propose a new approach with nearest-neighbor filtering.
Our system works efficiently for billions of users and improves the ad campaign conversion rate in practice at Yahoo!.

Acknowledgements

This thesis could never have been completed without the hard work and support of several people. I am forever grateful to my advisor Prof. S. Muthukrishnan. Through our numerous conversations and joint projects, I was fortunate enough to be influenced by his system of questioning, reasoning and presenting. In addition to providing academic instruction, Prof. Muthukrishnan also helped me to become a more positive and healthy person. His active support played an essential role in ensuring the survival of me and my family in our new home.

I would like to thank my committee members – Prof. Muthukrishnan, Prof. Nath, Prof. Imielinski and Dr. Kale – for taking the time to serve on my committee and for providing me with valuable feedback on my thesis work.

Many thanks to my former advisor, Prof. Danfeng Yao, for your guidance at the start of my graduate study and for introducing me to the research world. My sincere thanks to my advisors in my master program: Prof. Frank Hsu, Prof. Gary Weiss and Prof. Damian Lyons. Without your guidance and assistance with my research work, I would never have had an opportunity to get into a PhD program.

Along my journey, I have been more than fortunate to have many supportive mentors who have helped me see beyond the ivory tower. My first internship mentor at MLP, Michael Kurtis, benefited me with his professionalism and his sharp mind. More importantly, he gave me the courage to pursue a PhD course. My research internship mentor at Narus, Han Hee Song, trained me for the R&D work environment. I would also like to thank my research internship mentor at Flurry, Wil Simpson, whose elegant way of breaking down complex practical problems and whose organized working style have greatly influenced my approach to work.

My fondest memories of graduate life are of working with my friends at the MassDAL lab, including Darja Krushevskaja, Priya Govindan, Vasisht Gopalan, Edan Harel, Brian Thompson, and John Robert Yaros.
Thank you for the skills you taught me and for the courage and support you gave me during hard times. I am grateful to have worked closely with Darja Krushevskaja on many projects. She has always been an honest friend and helped me to overcome several difficulties.

I would not have been able to make so much progress without close collaboration with my talented and diligent colleagues at Yahoo!. My sincere thanks go out to Datong Chen, Zhen Xia, Musen Wen, Peiji Chen, Liang Wang, Jialing Liu, Jiayi Wen, Eeshan Wagh, and Robert Ormandi. I am indebted to Stratis Ioannidis for helping me to start breathing industrial scale big data. It has been my great honor to work with my co-authors and collaborators: Graham Cormode, Brian Thompson, Paul Barford, Igor Canadi, Mark Sandler, Ye Tian, Liang Wang, Kui Xu, Danfeng Yao, Alexander Crowell, Swati Rallapalli, Mario Baldi, Lili Qiu, and Antonio Nucci.

I can never thank my family enough for their continued love, support, patience and forgiveness. My father and mother have given everything they had to support my growth, including bravery to explore the world. In addition, my in-laws have always been supportive and I am very grateful.

Without the love and understanding of my dear loving wife, Qing Zhang, I would not be where I am today. She has given me the strength and encouragement necessary to face the world. I will try to get home early from work, my love.

Dedication

To my parents, my wife and my son.

Table of Contents

Abstract
Acknowledgements
Dedication
List of Tables
List of Figures
1. Introduction
2. Overview of User Online Ad Targeting
    2.1. Online Advertising
    2.2. User Targeting
    2.3. Research Directions
3. Observe User Online Advertising Profile and Ad Targeting
    3.1. Online Ad Landscape
        3.1.1. The Anatomy of Online Advertising
        3.1.2. Ad Formats
    3.2. Measurement Methodology
        3.2.1. Synthetic User Profiles
        3.2.2. Controlled Webpage Crawling
        3.2.3. Controlled Video Watching
    3.3. Experiments and Analysis
        3.3.1. Display Ad Targeting
        3.3.2. Video Ad Targeting
        3.3.3. Polymorphic Videos and User Interests
    3.4. Discussion
    3.5. Related work
    3.6. Conclusion
4. User Modeling on Mobile
    4.1. The App Similarity Problem
    4.2. Mobile App Modeling – app2vec
        4.2.1. word2vec Background
        4.2.2. App Vectorization – app2vec
    4.3. app2vec in Diverse Ad Consideration Set Selection
        4.3.1. Determinantal Point Process (DPP) Background
        4.3.2. Our Approach
        4.3.3. Experiments and Results
    4.4. app2vec in App Recommendation
        4.4.1. DPP-based App Clustering
        4.4.2. Experiments and Results
    4.5. app2vec in App-install Ad Conversion Prediction
    4.6. Discussion
    4.7. Related work
    4.8. Conclusion
5. Large Scale Look-alike Audience Modeling
    5.1. Look-alike Audience Targeting
    5.2. Existing Look-alike Methods & Systems
        5.2.1. Simple Similarity-based Look-alike System
        5.2.2. Regression-based Look-alike System
        5.2.3. Segment Approximation-based Look-alike System
    5.3. Graph-Constraint Look-alike
        5.3.1. Phase I: Global Graph Construction
        5.3.2. Phase II: Campaign Specific Modeling
        5.3.3. System Design and Pipeline
    5.4. Experiments and Results
    5.5. Discussion
    5.6. Related Work
    5.7. Conclusion
6. Future Work
References

List of Tables

3.1. Ratio of targeted ads for example websites.
3.2. Sample category mappings for Alexa and WebPulse.
4.1. Example of top similar app with data preprocessing.
4.2. Examples of relevant apps from app2vec model.
4.3. App similarity quantitative evaluation criteria.
4.4. App similarity quantitative evaluations.
4.5. Examples of app analogies.
4.6. App recommendation performance comparison.

List of Figures

2.1. Trackers on New York Times homepage.
2.2. Trackers on a science article page from the New York Times.
2.3. User profile example from Google.
2.4. Interactions among user profile, content and ads.
3.1. Publishers' common ad types.
3.2. Ad format examples.
3.3. Example of different ad formats on YouTube.
3.4. Dynamics of the number of profile interests as websites are visited.
3.5. Ad Crawler Components.
3.6. Average traffic rank of advertisers.
3.7. Examples of ad creatives with different targeting strategies.
3.8. Distribution of impressions over categories.
3.9. Collected ad amount distribution by profile.
3.10. Video ads distribution by top level categories.
3.11. Interests added by marked activity.
3.12. Interests removed.
3.13. Added interests that are explained by video verticals.
3.14. Impact of YouTube preroll ads to the profile.
3.15. Impact of YouTube sponsored ads to the profile.
3.16. Impact of YouTube ads to the profile.
3.17. Similarities of profiles for intra and inter polymorphic video sets.
3.18. Profile similarity comparison for 15 sets of polymorphic videos.
4.1. Popularity of app categories.
4.2. Performance comparison on ad click predictions for sampled users.