The role of data for competition in online advertising
Data being the lubricant of any interest-based advertising, control over access to data has become a central competitive factor in the advertising business and a focal point of several antitrust investigations. How much a publisher, an advertiser or an ad tech intermediary depends on data varies significantly with the advertising format in question. This article outlines the relevance of data for competition on the various sub-markets of online advertising and their respective significance for a thriving digital ecosystem. It concludes that because the most sustainable positive effects emanate overall from behavior-based advertising, competition authorities must pay particular attention to any measures by dominant companies to artificially restrict access to data required for behavioral advertising.
I. Debates around data for online advertising
The European Commission and the UK Competition and Markets Authority (“CMA”) are currently investigating whether Google’s removal of third-party cookies anti-competitively deprives publishers and advertisers of access to data that is required for effective advertising. Similarly, the competition authorities in France, Germany, Poland, and Italy have opened probes into whether Apple’s App Tracking Transparency Framework (“ATTF”) restricts competition by making it more difficult for third parties to collect advertising-relevant data within Apple’s ecosystem. Conversely, some data protection authorities have advocated further restricting third-party access to data for advertising purposes. All these proceedings ultimately revolve around a central question: how relevant is data for an effective online advertising ecosystem?
II. Importance of data
A. Data for search-based advertising
Search-based advertising is the most profitable form of advertising. One of the reasons is that the format works without the provider needing any further data regarding consumers or their devices. A user's own search query provides the most relevant and personalized data. The more a search query implies that the searcher would currently be receptive to a particular product advertisement, the more advertisers are willing to bid for an ad space.
A search service can further increase the relevance of ads by using additional data regarding the searcher or his or her device beyond the search query. A user's previous search and click history allows the search service provider to draw significant conclusions about the user’s likely future behavior. For example, if in parallel to entering a search query for “Bosch,” a user watches a YouTube cooking video, ads for Bosch cooking appliances are likely to be more relevant than ads for Bosch washing machines. Geodata regarding a user's current location or general place of residence can also help to select more relevant ads from the pool of available ads, for example by giving preference to local providers. While such data can make advertising even more relevant, search-based advertising also works very well entirely without such data. This can be seen, for example, in the revenues of search engines which advertise that they do not collect or use personal data at all.
B. Data for context-based advertising
The relevance of (personal) data for context-based advertising is comparatively low. In the case of context-based advertising, the targeting is based on the content that is published on the website or app that a user visits, rather than on the person or the device of the recipient. Accordingly, in theory, no further (personal) data is required for such targeting.
In practice, however, purely context-based online advertising, without any information regarding the user, has been shown to perform significantly worse in terms of all key performance indicators (“KPIs”) that matter to advertisers as compared to solutions that use additional data. Regardless of how specific the content visited by the user is, each publisher can display significantly more relevant ads in the proximity of such content if important parameters regarding the user are known (age, gender and location in particular). The more additional data the publisher, advertiser or their respective advertising intermediaries have regarding the user, the more relevant context-based ads can be served to them.
The bottom line is that, today, context-based advertising is enriched with data wherever possible to achieve a level of personalization. In principle, the data used for such a combination of received content and information about the user is the same data that is used for purely behavior-based targeting.
However, the use of user data in context-based advertising is not only important for the display of more relevant ads. As with all online advertising formats, advertisers attach great importance to measuring and optimizing the performance of their ads. At least for this purpose, market participants will once again need access to relevant user data (see below at E.). This means that even (effective) context-based advertising cannot do without access to data.
C. Data for behavior-based advertising
Finally, when it comes to behavior-based advertising, access to data is indispensable. If an advertiser cannot guess a user’s current or at least general interest from a search query or from the context of the website or app visited by the user, advertising inefficiencies due to wastage (i.e. scattering losses) can only be avoided by using data regarding the user, that is, data that allows conclusions to be drawn about his or her likely interests. Such conclusions can be drawn primarily from previous behavior of the user, such as his or her browsing, clicking and engagement history. Such behavior-based targeting stands and falls with access to such data.
D. Data for programmatic advertising in the open display market
The importance of data for programmatic advertising in the open display market is particularly obvious. To effectively market its inventory via programmatic advertising, a publisher must enable advertisers (and the intermediaries they use) to assess the value of an ad slot on a particular website or app that a user visits. For this purpose, information regarding the context (the content) on the medium and/or information regarding the user, which allows conclusions to be drawn about probable interests, must be provided. The more such information can be provided, the more accurately the algorithms of the companies involved can predict whether a user will be amenable to a particular advertising message or will find such a message annoying. The more precise the calculation and the higher the probability of a positive response to an ad, the more advertisers will bid for the ad slots. Without relevant data, on the other hand, the bids and thus the prices for ads are significantly lower, as the risk of wasted coverage (scattering losses) increases. Beyond purely context-based advertising, the programmatic distribution of advertising inventory without relevant user data promises no success. Without data, ads cannot be reasonably priced because advertisers do not bid high “out of the blue.”
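The link between prediction and price described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that an advertiser bids the expected value of a single impression, i.e. the estimated probability of a positive response multiplied by the value of that response. The function name and all figures are hypothetical and are not taken from any actual bidding system.

```python
def expected_value_bid(p_response: float, value_per_response: float) -> float:
    """Return the expected value of one impression (illustrative assumption:
    the advertiser bids exactly this amount)."""
    if not 0.0 <= p_response <= 1.0:
        raise ValueError("p_response must be a probability between 0 and 1")
    return p_response * value_per_response


# Hypothetical figures: without user data the advertiser can only assume a
# low average response rate; with data the estimate for a well-matched user
# is higher, and so is the bid.
bid_without_data = expected_value_bid(0.001, 20.0)
bid_with_data = expected_value_bid(0.010, 20.0)
```

Under these assumed numbers the bid for the well-matched user is ten times the bid placed without any information about the user, which is the mechanism by which missing data depresses prices for ad slots.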
E. Data for online advertising support services
Access to user data is not only essential for the delivery of personalized advertising, particularly in the open display market, but many technical support services that are essential components of successful advertising models also depend on access to data. In particular, important functions such as attribution, frequency capping and ad fraud prevention cannot work without access to data:
- Data for attribution of advertising budgets: After the display of (personalized) ads, advertisers want to know how the user reacted to the ad or what actions were subsequently taken. This is the only way to determine, for example, how many transactions (for instance, app downloads) came about organically and how many were based on the placement of an ad. Such attribution requires tracking a user's behavior across multiple websites and/or apps. When a user purchases a product on a merchant's website, the merchant wants to know how that purchase came about – whether the user came to the website directly through the browser or through a generic search result, or whether an ad on another website led him or her to the merchant. This can only be tracked if it can be determined which website a user visited and which ads they actually saw before ending up at the retailer and carrying out a transaction there. It becomes even more complex when the user may have been exposed to multiple advertising campaigns from the same merchant. In order to allocate advertising budgets appropriately, it is then necessary to track which medium was used to acquire the user. To do this, the data regarding the publishers visited by a user and the ads there must be combined and analyzed programmatically.
- Data for frequency capping: Consumers perceive seeing the same ad continuously – across multiple websites and/or apps – as a nuisance. Regardless of the relevance of the ad, they may feel downright “stalked” by a frequent insertion. The feeling of being followed by advertisements is one of the main reasons for using ad blockers. The frustration triggered can damage the advertiser's brand. This would make an advertising campaign counterproductive. The most effective remedy is frequency capping, the technical limitation of how often a particular ad is shown to a user. In order to limit the frequency with which the same ads are displayed, advertisers must be able to track which user has already been shown the ad, how often and over what period of time. This is an extremely important step, above all in the programmatic display of ads across many intermediaries, in protecting consumers from an intrusive overload of ads. Technically, this requires that a user's devices can be identified and that the perception of certain ads can be tracked across multiple publishers.
- Data to prevent ad fraud: Access to and the combination of data is also necessary to combat ad fraud. Ad fraud causes millions of dollars in damage to advertisers every year. This is because with display advertising, publishers are usually paid according to the number of impressions, but sometimes also according to the number of clicks on an ad or the subsequent conversion (for example, installation of an advertised app). This creates incentives and opportunities to technically manipulate such numbers. In particular, fake accounts can be used to artificially generate impressions, clicks or installs. Advertisers then have to pay for ads that did not reach real consumers. Verification services provide the most effective means against this. They check whether actions have been triggered by different, real users or whether there is a high probability that, behind an action, there is, for example, a fraudulent bot network with fake accounts. However, this verification requires access to usage data in connection with ads that have been placed. The better verification services can match IP addresses and track which IP addresses or devices visited which publishers, the more effectively they can combat ad fraud.
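Of the support services listed above, frequency capping lends itself most easily to a short illustration. The following is a minimal sketch, not a description of any actual ad tech product: the class name, the parameters and the `user_id` (a stand-in for whatever cross-publisher identifier is available) are assumptions made for this example.

```python
from collections import defaultdict, deque


class FrequencyCap:
    """Allow at most `max_impressions` of one ad per user within `window_seconds`."""

    def __init__(self, max_impressions: int, window_seconds: float) -> None:
        self.max_impressions = max_impressions
        self.window_seconds = window_seconds
        # (user_id, ad_id) -> timestamps of past impressions, oldest first
        self._seen = defaultdict(deque)

    def should_show(self, user_id: str, ad_id: str, now: float) -> bool:
        """Return True and record the impression if the cap is not yet reached."""
        history = self._seen[(user_id, ad_id)]
        # Forget impressions that have fallen out of the time window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_impressions:
            return False
        history.append(now)
        return True


# Hypothetical usage: at most two impressions of the same ad per hour.
cap = FrequencyCap(max_impressions=2, window_seconds=3600)
first = cap.should_show("device-123", "ad-A", now=0)
second = cap.should_show("device-123", "ad-A", now=10)
third = cap.should_show("device-123", "ad-A", now=20)
```

The sketch only works because the lookup is keyed on an identifier that stays the same across publishers. If each website or app sees the user as a new, unknown visitor, the history is always empty and the cap never takes effect.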
III. Online advertising without access to data?
As online advertising has grown, particularly the share of programmatic display advertising, the depth and scope of access to advertising-related data has also increased. This has led to concerns regarding data sovereignty and consumers' right to informational self-determination. Such concerns have led to a significant tightening of data protection law for advertising-related activities, in particular in Europe, which now has one of the strictest frameworks in the world for online advertising. Nevertheless, some believe that the statutory requirements are still insufficient, and call for further restrictions on online advertising and the use of data in general. Demands range all the way to a ban on all targeted advertising, all data collection for advertising or even all real-time bidding for programmatic display advertising.
On closer inspection, however, many demands are unfounded, interest-driven, and counterproductive. They overlook the enormous benefits of using data for digital business models in general and online advertising in particular; not only for the economy as a whole, but also for each individual market participant, above all for consumers. Instead of only seeing risks and possible misuse of data and placing entire industries under general suspicion, the debate must focus more on balancing the interests involved and also appreciating the benefits of data.
A. Does online advertising need any data at all?
When assessing the role of data for advertising, one cannot consider particular interests in isolation. Everyone would feel more comfortable if service providers stored less personal data. Ideally, consumers would like to be able to use all the services in the world for free and without any advertising, without having to share a single piece of information about themselves. Actually, they would like to have everything in the world for free, at any time. Yet, for obvious reasons, such a land of milk and honey is just a utopia. Someone needs to cover the costs.
1. Alternatives for consumers
When it comes to using digital services in particular, an enormous mentality of “free” has established itself. Nobody wants to give anything in return. It is not surprising that consumers are fundamentally critical of the collection of their data. All else being equal, consumers would also prefer to avoid advertising altogether. Yet, an economic system cannot function in this way. The same consumers who initially complain about the collection of data and advertising are likely to complain even louder when they suddenly receive fewer digital offers or those become more and more expensive, for example because paywalls are introduced or subscription fees are increased.
Critics of data-based advertising argue that any refusal by users to consent to data use for advertising purposes on end devices (for example, via Apple’s ATTF) implies a rejection of behavior-based advertising and a preference for subscriptions or other pay-for-performance financing. The same conclusions are drawn from studies in which users view personalized advertising as “annoying” or “predominantly critical.” Yet, these comparisons are flawed as they are based on incomplete choices. On closer inspection, consumers have exactly four alternatives: (i) free service with a great deal of non-personalized advertising without data access, (ii) free service with less personalized advertising thanks to data access, (iii) paid service without advertising and without data access, (iv) no service. When consumers are presented with this choice, for most digital offerings the majority chooses option (ii) – free services with personalized advertising thanks to access to data. Only a few are willing and able to pay for services that one can also have for free without a great deal of annoying advertising.
2. Advantages of online advertising
When weighing the pros and cons of accessing data, the economic benefits of online advertising in general and data-driven, behavior-based advertising, in particular, must also be considered.
Online advertising is the lifeblood of the exploding digital economy. It has enabled tremendous economic growth through the expansion of niche players, including Internet-native companies that rely entirely on marketing their offerings on the Internet. The core factor for this was and is access to user data. The most successful, and at the same time most significant and characteristic, form of online advertising is not branding ads, which aim at building a strong brand among a large share of the population and, because of the intended broad distribution, can largely do without any user data for personalization. Rather, online advertising became big through direct response ads, that is, targeted advertising that compresses the entire customer journey of a user (from creating awareness for a product to completing the transaction) in such a way that a single ad accompanies the consumer to the desired conversion. It is the direct response ads whose KPIs, including the return on investment, are much easier to measure and optimize than traditional advertising. And it is direct response ads that allow niche sellers to bid exactly as much for an ad as they can afford.
Taking a closer look at direct response ads, there are only two ad formats that are suitable for this – (i) search-based advertising, where users express their current transactional intent directly through the search query (or prompt) they enter, or (ii) behavior-based advertising, where such intent can be inferred from preceding user behavior. Purely context-based targeting, that is, advertising that is geared to the content of the publisher visited, is only suitable for this in very rare cases (particularly with retail media). This is because without a robust data set, it is completely unclear whether a user is only interested in the respective content by chance or in general, or whether it is based on a specific and current commercial interest. Beyond the special case of retail media, context-based advertising also primarily aims at the transfer of an image of the publisher or the presented content to the advertised product or company; not at the immediate initiation of a transaction. In any case, the need to create media content for an extremely homogeneous group for effective contextual targeting significantly limits the ability of publishers to generate direct response ads via this form of advertising. Thus, context-based advertising represents a market separate from behavior-based advertising.
Now, if we compare search-based advertising with behavior-based advertising (as options for direct response ads), there are clear differences in terms of the need for data. Search-based advertising requires even less user data than context-based advertising. However, as the following aspects will show, this does not allow the conclusion that access to user data could be reduced in general without having to fear significant economic disadvantages.
B. Isn’t search-based advertising without data sufficient?
- Search-based advertising kicks in at the bottom of the marketing funnel. Search advertising often only delivers (paid) results that the consumer would have found anyway (in the generic results of the search engine or marketplace). However, search advertising is poorly suited to cater for the preceding discovery process. Niche providers in particular need, at the very least, targeted advertising that starts at the top of the marketing funnel and creates initial awareness of a product among a target group. Specifically, they need forms of advertising that consumers can discover on websites and apps even before they have formed any particular purchase decision or even inclination to buy (that could be expressed in a search query). Only behavior-based targeting, and thus access to data, enables the presentation of products and services to users with a high probability of a user responding to the ad, regardless of the context in which an ad appears, even though the user has previously never heard of the products or services. If a publisher knows what the user has liked or bought in the past, there is little risk that ads for similar products will be met with rejection. It is just as likely that a user with a similar interest and usage profile will be interested in similar products. This avoids inefficiencies due to wastage and opens up advertising options for even the smallest companies. The growth of many Internet-native companies is based on this. Some of them would not exist if there were only pure search advertising, since such advertising requires the entry of a search term and thus a certain pre-knowledge of the offer and a general propensity to buy. For competition in the digital sector and the expansion of niche providers, behavior-based advertising is thus more relevant than search-based advertising.
- Due to several economic factors, markets for search-based advertising are highly concentrated. For the most significant channel for search-based advertising, general internet search, Google has a near monopoly. Google can almost do as it pleases in this area. To succeed with an ad, advertisers often need to bid to the limit of their profitability. Their margins are being sucked up by Google. Given the inflated prices, for many advertisers, search advertising on Google is not an alternative to behavior-based advertising. At present, despite the rise of AI chatbots and their integration into rival search engines, the only serious alternatives in the search-based advertising space are Amazon Ads for merchants in the Amazon Marketplace and Apple Search Ads for app developers in the App Store. Yet these ads are open only to Amazon merchants and to app developers on iOS devices, respectively. Moreover, Amazon and Apple themselves are the leaders in their respective markets and control robust ecosystems around them. Therefore, such providers are only a limited alternative to search advertising on Google. In terms of competition policy, it would be a mistake to further strengthen the dominant positions of Google, Amazon, and Apple in search advertising on their closed platforms by removing the technical basis for effective (direct response) advertising from the only realistic alternative to search advertising, specifically access to relevant data for behavior-based advertising.
- As a whole, the markets for online advertising are highly concentrated. Google dominates search advertising, video display advertising (with YouTube) and the various levels of the placement of display advertising. Meta/Facebook dominates advertising on social networks, Amazon advertising on its Marketplace and Apple advertising on its iOS App Store. All four tech giants exclude third parties from placing ads on their inventory (walled gardens). What “remains” is competition for advertising space and its placement in the open display market. In particular, the growth of programmatic display advertising there is creating scope for competition from thousands of publishers and their intermediaries. However, programmatic advertising in the open display market in particular is primarily behavior-based and thus dependent on access to data. To be sure, context-based advertising is also conveyed in the open display market. However, the advertising model is not suitable for the majority of publishers – whose content is not product-related or does not allow the transfer of images.
C. Isn't context-based advertising without data sufficient?
Even beyond direct response ads, there are no real alternatives to behavior-based targeting. Search-based, context-based, and behavior-based advertising differ technologically and functionally to such an extent that they can be assigned to different advertising markets. For advertisers, but also for publishers, they are only substitutable in marginal areas.
For most publishers, financing their offerings by displaying search-based advertising is not an option from the outset. Today, advertising finances far more than just search services. Context-based advertising is also not a viable advertising model for many publishers, namely those with content that does not lend itself to commercial ads. This is the case, for example, for most news portals as the reading of general daily news does not allow any conclusions about the reader’s commercial interests. Therefore, many publishers can only make competitive advertising offers via behavior-based targeting. If the technical basis for this advertising model is removed by means that restrict access to relevant data, the business basis of publishers that rely on such advertising will also be lost.
D. Aren’t payment models sufficient for publishers?
Restricting access to data for behavior-based targeting is not mitigated by the fact that publishers could switch to direct payment for content by consumers. If free content is available at the same time, pure online subscription models have no chance of success. Even Netflix, a traditional subscription business, switched to a hybrid model with advertising. In any event, the choice of a business model should remain with the publishers and should not be dictated unilaterally by state authorities and certainly not by private gatekeepers that impose their business models on publishers.
E. Win-win situation of behavior (data)-based advertising for market participants
It has been shown that there is no alternative to behavior-based advertising that realizes the same macroeconomic benefits. However, behavior-based advertising requires access to usage data. Thus, the overall economic benefits of the advertising model also depend on such data access. All market participants benefit from such access (except ad blockers and dominant providers of search-based advertising and their revenue share agreement partners):
- Data as a tool to generate positive (indirect) network effects. The better the match between consumers and advertisers, the stronger the positive (indirect) network effects that a publisher creates for its two user groups. If the publisher passes such size efficiencies on to its user groups, particularly by investing in high-quality content and lower costs for ads, the symbiosis represents a classic win-win-win situation. However, the quality of the intermediation of consumers and advertisers through behavior-based advertising now depends directly on the amount of data regarding the user that is available. This is because such data provides the only point of contact, the only means, for matching the user groups. As such, the strength of the positive network effects that a publisher can generate through behavior-based advertising also depends directly on the scope of access to data.
- Consumers benefit from access to their data by, among other things, (i) receiving an increasingly large and diverse range of content and services for less money, (ii) having advertising that is more relevant and less annoying to them owing to personalization and (iii) publishers having to display less advertising overall to finance their services.
- Advertisers benefit from a robust data set, among other things, because their advertising campaigns become more effective since (i) they achieve better KPIs in particular, (ii) they are more measurable and budgetable, (iii) consumers are less bothered owing to higher relevance and therefore (iv) fewer ad blockers, which render advertisers completely invisible, are used.
- Publishers benefit from a robust data set, among other things, because (i) their inventory achieves better KPIs, (ii) their inventory achieves higher ad rates in particular, (iii) fewer consumers switch media due to disruptive ads, (iv) consumers use fewer ad blockers overall and (v) more advertisers shift their budgets to online advertising.
- Advertising intermediaries and providers of support services benefit from a robust data set in part because data is essential for (i) programmatic display advertising, (ii) verification, (iii) attribution and (iv) measurement. Since only the placement of non-search-based advertising in the open display market is currently viable as a business model due to Google’s walled garden, the intermediaries are left with few alternatives.
Behavior-based advertising creates a win-win-win situation for consumers, publishers and advertisers. Any ban on behavior-based advertising or the collection of personal data to this end would unravel this win-win-win situation, to the sole benefit of ad blockers and market-dominant search engines, which would be delighted with more traffic and a higher share of the overall marketing budget being invested into search ads. The same is true for any measures by digital gatekeepers to artificially restrict the access to data for behavioral advertising beyond the limits imposed by privacy laws. The big winners of any such limitations would be Google, Amazon and Apple. Everyone else would lose.
*Thomas Höppner is a Partner and Philipp Westerhoff is a Senior Associate with Hausfeld Rechtsanwälte LLP in Berlin. Hausfeld represents claimants in competition proceedings against Apple, Google and Amazon concerning online advertising. This article was first published in CPI Antitrust Chronicle, Spring 2023, Volume 2(2).
 Open display market refers to a sub-set of the display market, where (in contrast to owned-and-operated platforms) publishers do not sell their ad inventory through their own ad tech interfaces but a complex chain of third-party ad tech intermediaries.
 German Federal Cartel Office, Sector Inquiry into Online Advertising. Discussion Report (2022), at 240, 253, 319.
 See OVK, OVK Trend Study Paid Content (2022) https://www.ovk.de/app/uploads/2022/03/OVK-Trendstudie_Paid-Content_202200309.pdf, p. 13; AdLucent, 71% of Consumers Prefer Personalized Ads, (2022) https://www.adlucent.com/resources/blog/71-of-consumers-prefer-personalized-ads/, UID 2.0, Global Consumer Survey (2021) (February 23, 2021), https://www.thetradedesk.com/us/news/consumers-say-their-internet-experience-is-broken-heres-how-we-can-fix-it; Blockthrough, The Rise of Content-based Advertising: 2021 PageFair Adblock Report, (2021) p. 4.
 Benjamin Thompson, Digital Advertising in 2022, Stratechery (August 8, 2022), https://stratechery.com/2022/digital-advertising-in-2022/.
 Such use of customer data to identify customers with similar attributes is also called look-alike modeling. This opens up new target group segments that are similar to an existing customer base.