Google AdSense
User Modeling and Recommender Systems: Case exercise

Authors: Marin Desnica, Marie Sayadchi, Khawja Umair, Peter Szabó

Most internet users are probably familiar with the ads that appear all over the internet with a little triangle in the top or bottom right corner. That is AdSense, a service from Google which, together with AdWords, generates a significant share of the company's income [3]. Google AdSense is an advertising platform that uses various information about the user viewing the ads to show him the most relevant ones, and thereby to maximize its own income. The most commonly used information about the user is his geolocation, the languages he speaks, his interests, and the content of the website he is currently browsing.

1 The users of AdSense

AdSense and AdWords are closely related. AdWords is a service that helps companies and individuals who want to promote their products to create ads. AdSense, on the other hand, is used by online publishers (owners of websites) to deliver these ads to the end users (visitors of their websites) and to earn money for doing so. [8]

Connection between Google's AdWords and AdSense [9]

Ads created in AdWords can be displayed as "Google search ads", which are shown on Google's search page along with the generic search results. They can also appear in the Google Display Network, which is in fact Google's AdSense service (see the image above). Google AdSense is currently available on around 4 million websites [10] (see Figure 3) maintained by more than 2 million publishers [11]. Unfortunately, there are no official statistics on how many users come into contact with AdSense per day, month, or year, but a rough estimate can easily be made using YouTube's usage statistics.
Consider the following facts:
● YouTube has the second highest number of unique visitors, over 187,000,000 per month (first place belongs to Google, which does not use AdSense) [10],
● an AdSense ad is displayed under almost every popular video on YouTube,
● the majority of YouTube users very likely watch at least one popular video with an ad during a month.

From these we can estimate that AdSense reaches roughly 187,000,000 users per month through YouTube alone. If we also consider that there are almost 3 billion Internet users worldwide [12], these visitors make up only about 6% of them, so about 94% of all Internet users do not use YouTube (for example, it is not very popular in Asian countries). But they must visit other websites, and around 6% of the top 1 million websites (those with the most visitors per month) have Google AdSense placed on them [10]. So even if we do not count the 15,000,000 people using AdBlock (see chapter 5.3), the total number of users may still be several times larger.

2 Ad targeting in AdSense

Google AdSense is a great example of a recommender system that serves content tailored to each user individually. It recommends advertisements by taking into account the user's age, gender, language, location, and interests. AdSense displays various types of ads: basic text ads, images, video, and more (see chapter 3.2). [2]

When a user is searching for goods or services, the displayed ad may be exactly the solution he is looking for. Even with contextual targeting on blogs and websites, relevant ads can add to the depth of information the user gets from visiting the page. For example, when a user is reading a blog post on "how to fix your bike's dynamo", the ads can, in addition to the tips on the actual process, give him an idea of the prices and availability of the necessary parts and tools, or he might in the end prefer to use a repair service.
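
The targeting signals named in this section (language, location, interests, and the content of the page) can be combined in a simple eligibility filter. The sketch below is only an illustration under our own assumptions: the classes `UserProfile` and `Ad`, the function `eligible_ads`, and the scoring weights are hypothetical and do not describe how AdSense is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Hypothetical view of what an ad system might know about a visitor."""
    languages: set[str]
    country: str
    interests: set[str] = field(default_factory=set)


@dataclass
class Ad:
    """Hypothetical ad with the targeting options an advertiser might set."""
    name: str
    language: str
    countries: set[str]
    keywords: set[str]


def eligible_ads(user: UserProfile, page_keywords: set[str], ads: list[Ad]) -> list[Ad]:
    """Keep ads that match language and country, ranked by keyword overlap."""

    def score(ad: Ad) -> int:
        # Assumption: overlap with the page content counts double compared
        # to overlap with the user's interests. The weights are arbitrary.
        return 2 * len(ad.keywords & page_keywords) + len(ad.keywords & user.interests)

    # Hard filters: the ad's language and target countries must fit the user.
    matching = [
        ad for ad in ads
        if ad.language in user.languages and user.country in ad.countries
    ]
    # Drop ads with no keyword overlap at all, then put the best match first.
    relevant = [ad for ad in matching if score(ad) > 0]
    return sorted(relevant, key=score, reverse=True)


# Made-up example: an Estonian user reading a bike repair blog post.
user = UserProfile(languages={"en", "et"}, country="EE", interests={"cycling"})
ads = [
    Ad("bike-parts", "en", {"EE", "FI"}, {"bike", "dynamo", "cycling"}),
    Ad("heaters", "en", {"EE"}, {"heater"}),
    Ad("velo-shop", "fr", {"FR"}, {"bike"}),
]
page = {"bike", "dynamo", "repair"}
print([ad.name for ad in eligible_ads(user, page, ads)])
```

In this made-up example only the bike parts ad survives: the heater ad matches the user's language and country but shares no keyword with the page or the user's interests, and the French ad fails the language and country filters.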
2.1 Targeting by language

The advertiser using Google AdWords can specify the language settings for his advertisements. AdSense determines the primary language of the website showing the ads and displays there the ads intended for that language. [1]

To make language targeting even more efficient, AdSense looks not only at the language of the website where the advertisement is served, but also at the languages of the websites the user has recently visited. As a result, an English website may show, for example, Estonian ads if the user was previously browsing websites with Estonian content. [1]

2.2 Targeting by geolocation

The geographical location of the user is determined from his IP address. The precision of the location detection varies and depends mostly on how much information the Internet service provider (ISP) shares about its clients. For some users AdSense knows only the country, but some ISPs share even the user's exact address, so the advertisement can be targeted with very high precision. AdSense lets the advertiser target whole countries, regions within these countries, or even specific cities and a manually specified radius around them (see Figure 4). [6, 7]

2.3 Targeting by keywords

After an online publisher signs up to use AdSense on his website, Google's bots start to crawl the website and analyze its content, so that ads relevant to that content can be displayed there. This is called contextual targeting, and it analyzes keywords, word frequency, font size, and the overall link structure of the website (see Figure 2). [5, 8]

2.4 Targeting by interests

The user's interests are determined mainly from his browsing habits on websites that use AdSense and from his search history on Google. If the user starts searching for and visiting websites with one specific kind of content (e.g. sports equipment), it is highly probable that he will start to see ads related to this content on other (even unrelated) websites as well. [4] The user can also manually adjust these preferences, or even completely disable tracking of his interests, by visiting the Ads Settings page (see Figure 1). [5]

Advertisers can also use another targeting option, remarketing: they can target their ads at users who previously visited their website, to lure them back (see Figure 5). [5]

2.5 Storing user relevant data

User relevant data is stored via cookies. A "cookie" is a small file containing a string of characters that is sent to the user's computer when he visits a website. Google uses cookies to improve the quality of its service and to better understand how people interact with it. [16]

As users browse Google's partner websites, Google stores an advertising cookie in the user's browser to understand the types of pages the user is visiting. This information is used to show ads that might appeal to users based on their inferred interest and demographic categories. [14]

The user has some control over the information stored in cookies: he can edit his Ads Settings (see Fig. 1) and define or change his inferred interest and demographic categories there. Regardless of the user's interests, Google will never show ads based on sensitive information or interest categories, such as those based on race, religion, sexual orientation, health, or sensitive financial categories. [14]

3 The algorithm behind AdSense

The exact way the algorithm works is not known to the general public, but an educated guess would describe its steps like this:
1. Attempt to determine the page context and serve a contextually relevant ad.
2. Use clickstream data to determine what the user might be interested in and serve an ad that may not be contextually relevant.
3. Use basic demographic data (e.g. geolocation) to attempt to target ad relevance to the user.

The premise is simple: the context of the page is a strong indication of what the user will click on, and it is the first priority of the algorithm. You may know that the user was interested in other, potentially more profitable subjects, but the fact that the user is on that page now is a fairly good indication of what he is interested in at that particular moment. [23]

The biggest advantage of the Google AdSense algorithm is that it recognizes the context in which the ad is displayed, so it does not "shower" the user with the same ads every time he comes online. An example would be showing the user ads for a travel agency while he is browsing for a plane ticket. At the same time, it is not restricted solely to the context of the page the user is browsing. The algorithm is familiar with the user and has information about his interests and demographics, which it can use to serve him the best possible ad in every situation.

One of the most noticeable disadvantages is also related to recognizing the context. Some users complain that after purchasing a certain product they cannot get rid of ads for the product they have already bought and are no longer interested in. [24] For example, a person who purchased an electric heater online would be shown an ad with a discount on electric heaters every time he visited a shopping site.

3.1 Choosing the proper place to display the ad

For ads displayed on a blog or website (in the Google Display Network) there is an interesting mechanism of bidding and competing for a potential ad plot. The ad auction is used to select the ads that will appear on your pages and to determine how much you will earn from those ads. First, placement targeting narrows the ads down to a set of candidates that compete to appear on the target website.
Two main points are considered in the selection of candidate ads: [13]
● relevance: ads that are relevant to the content or users of your site, and
● format of the ad: you, as the website owner, can choose to allow or ban ads in text or image form.

Here the advertiser is the bidder. The bidding is not done manually for each plot. Instead, the advertiser (using AdWords) defines a set of limits on how much he is willing to spend on his ad, and based on those numbers Google determines the outcome of the bidding. An example of such settings is shown below:
● Advertiser's daily budget: $10
● Advertiser's maximum cost-per-click bid: $0.50
● Advertiser's average actual cost-per-click: $0.40
● Approximate number of clicks per day: 25

In this example, a daily budget of $10 at an average of $0.40 per click buys about 25 clicks. The bidding process then matches ads to plots, trying to optimize the relevancy of each ad-plot pair. It takes into account the maximum CPC (cost per click), the set of potential plots that Google has already narrowed down, the quality of the ad itself and its clickability, and the page's traffic and popularity. A good ad-plot pair should yield a good click rate, which increases Google's and the publisher's income while giving the advertiser more visibility and marketing opportunities. [13]

3.2 Types of ads shown in AdSense

AdSense can display ads of various types. The most commonly used are basic text ads, images, videos, and Flash banners. There are several types of video advertisements:
● Pre-roll ads appear on the screen before the video playback starts;
● Mid-roll ads appear in the middle of the video, e.g. the playback stops in the middle and an ad starts to play; ads played between videos in a playlist are also considered mid-roll ads;
● Post-roll ads appear after the whole video has ended; this type of ad is usually not used (at least not by AdSense).

Text banners may appear on websites and at the bottom of a video player. They can be positioned horizontally or vertically.
They are usually very small and can have different background colors.

Example of text banners

Image banners are cheap and effective. Users can create them with the AdSense application or upload their own images. These ads appear in dedicated places on websites or in videos (such as the bottom or other selected places).

Example of image banners

4 Earning money with AdSense

Google AdSense is designed specifically for advertising; the service has no other main objective. Unlike Facebook or YouTube, which have their recommender systems as an additional service, AdSense is a recommender system by nature. The goal of the service is to link advertisers to online publishers while keeping the displayed ad as relevant and helpful to the end user as possible. Even in the case of ads shown on Google's search page, Google is using its popularity as a source of income by selling its ad spots to advertisers.

4.1 Calculation of the price for the advertiser

There is no fixed price for a publishing spot. The advertiser decides how much he is willing to spend per day and the maximum amount he wants to pay per click (or per visit, …). Based on these settings and the competition at a given time, the price for a spot is determined. As a result, the price of the same spot for the same ad may vary considerably.

4.2 Calculation of the payment for the publisher

In the beginning there was just one way of getting paid with AdSense: on a per-click basis. This means that you earn money each time a visitor of your website clicks on an ad. It does not matter what he does afterwards on the target website, whether he buys a product or leaves immediately. Your income is 68 percent of the amount an advertiser pays per click on their ads on your website. The rest is Google's commission. [25]

In addition to the cost-per-click (CPC) model, there are two other bid types:
● Cost-per-thousand-impressions (CPM) is an income model in which advertisers pay you a fixed price per thousand ad impressions (an impression is counted each time the ad appears on a website). No click is necessary for you to earn something from AdSense.
● Cost-per-engagement (CPE): in this case the advertiser defines an action which the visitor needs to complete (expanding the ad, watching a video ad, finishing a poll, etc.).

For each of these bid types, Google calculates which may work best on your website, maximizing your earnings and Google's revenue. [25]

The income of different websites can vary widely depending on their content and type. According to Thomas Maier [25], general content websites may earn even less than $3 per 1000 impressions, content-rich sites such as blogs around $10, and product-related pages even more.

5 Problems and negatives of AdSense

AdSense and AdWords are in contact with three different groups of people: advertisers, publishers, and viewers. Each of these groups may face difficulties and problems when interacting with the services, and because the groups are closely interconnected, a problem for one of them can affect the others.

5.1 Fluctuation in ad prices

Many publishers use ads as their main source of income and spend a significant amount of time updating and editing their websites. The biggest challenge for this group is that there is no minimum price for an ad spot. The price depends only on the competition between the advertisers, and that makes AdSense a highly fluctuating source of income for publishers. Since being a publisher is a time-consuming job, they generally complain that "one has so little influence over anything Google does". Reports on the prices and statistics of one's page are produced by Google, and many publishers consider Google's transparency about its income statistics insufficient.
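
To make the payment models of chapter 4.2 and the fluctuation described here more concrete, the following sketch computes a publisher's income under the per-click and per-impression models. It is a simplified illustration, not Google's actual accounting: the function names are our own, the 68% share is the figure quoted in chapter 4.2, and the CPM rate is assumed to be the amount the publisher receives, as in the earnings figures quoted from [25].

```python
# Share of each click's price that goes to the publisher (chapter 4.2).
PUBLISHER_SHARE = 0.68


def cpc_income(clicks: int, avg_cpc: float, share: float = PUBLISHER_SHARE) -> float:
    """Publisher income from clicks: clicks * price per click * publisher share."""
    return clicks * avg_cpc * share


def cpm_income(impressions: int, rate_per_thousand: float) -> float:
    """Publisher income from impressions, given the rate earned per 1000 impressions.

    Assumption: the rate is already the publisher's earnings, so no share is applied.
    """
    return impressions / 1000 * rate_per_thousand


# The numbers from the example in chapter 3.1: 25 clicks at $0.40 on average.
print(f"25 clicks at $0.40: ${cpc_income(25, 0.40):.2f}")
# The same traffic when weaker competition halves the average click price.
print(f"25 clicks at $0.20: ${cpc_income(25, 0.20):.2f}")
# 10,000 impressions on a general content site versus a content-rich site.
print(f"10,000 impressions at $3 CPM:  ${cpm_income(10_000, 3.0):.2f}")
print(f"10,000 impressions at $10 CPM: ${cpm_income(10_000, 10.0):.2f}")
```

With identical traffic, halving the average click price from $0.40 to $0.20 halves the publisher's income from about $6.80 to about $3.40, which is why publishers experience AdSense as an unstable source of income even when their visitor numbers stay constant.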
5.2 Irrelevant ads or same ads shown repeatedly

Judging from the AdSense forums, it is common for a website to get unrelated ads or the same ad repeated. This can annoy the publisher's visitors, decreasing the traffic to his pages and leading to low income for several months. The problem comes from the fact that the publisher has no direct control over the ads shown on his website. When an irrelevant ad shows up, the publisher has no way to rate it, change it, or influence its presence in any way. [26, 27]

5.3 Avoiding advertisements on the user side

Google announced on Dec. 8, 2009 that Chrome would accept extensions (little programs that improve or customize the browser's performance) as a way of harnessing the creativity of an outside community of programmers who would work for free and agree to share what they make with others. [17] This decision made it possible for programmers all around the world to fight against advertisement targeting by making extensions that block all sorts of ads, including Google ads in a Google product, Chrome. With AdBlock, most ads are not even downloaded, so users can focus on enjoying the content they want and spend less time waiting for it. [18] AdBlock is the most popular Google Chrome extension (with over 15,000,000 users [19]) and also the most popular Safari extension [20]. There is an even older extension called AdBlock Plus, which is primarily used on Firefox and is also among the most popular extensions for that browser. [21]

There are many ways to look at this topic, but ad blocking is essentially hurting the ad business and forcing it to change its approach to users. The founder of Ars Technica, a site that relies heavily on ads, described users of AdBlock as "people who came and ate and didn't pay".

6 Conclusion

AdSense is definitely one of the most successful products of Google.
It is widespread, and it is a key part of Google's income and of the marketing strategy of many companies as well. AdSense changed the way we look at advertisements in online media and made a large step forward in modern advertising by targeting users personally.

Of course, things are not always bright and shiny, and AdSense also has its problems. The number of websites using AdSense is decreasing (see Fig. 3), and many publishers are not satisfied with the income they make from AdSense, which in reality is often lower than the estimated amount. There are many other services offering similar or almost the same functionality as Google AdSense (see [28] for details).

Figures

Fig. 1: Screenshot of the user preferences configuration page of AdSense
Fig. 2: User targeting by keywords
Fig. 3: Number of websites using Google AdSense over the past year (Image source: [10])
Fig. 4: User targeting by geolocation