Bot Filtering: AdPlugg Takes on Bot Traffic

What do the Terminator, the Star Wars films, the Matrix, and AdPlugg have in common? They all feature epic battles against robots :-).

AdPlugg’s battle is against bot traffic polluting ad statistics. Bot traffic can skew results by registering false impressions and, even worse, false clicks. This is a major concern and one that we at AdPlugg take very seriously.

So what is a “bot”? Bots are automated programs that browse the web. They browse for all sorts of reasons, but the most common is to build search indices. Bots that crawl the web are known as crawl bots or “spiders”. Of the crawl bots, the best known is Google’s Googlebot. A crawl bot reads a page and then follows all of the links on that page. It then repeats the process on the next set of pages, creating a spiderweb of linked pages that make up the known web.

This can be problematic for online advertising because online ads aren’t the same as other links. Online ads are designed to be viewed and followed by humans only, for two reasons. The most important is that advertisers buy ad space to get impressions and clicks from humans, not bots. The second is that paying another site to link to yours is a “black hat” practice that can get you penalized or even banned from search engine results.

To combat the issue, ad links are expected to include a rel="nofollow" attribute. This attribute tells bots, “hey, this is an ad, don’t follow it.” Well-behaved bots, the same ones that honor conventions such as the REP (Robots Exclusion Protocol), won’t follow the link. This attribute is included automatically on all ad links that AdPlugg serves.
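To make the idea concrete, here is a minimal sketch of how a well-behaved crawler might decide which links to follow. It is an illustration only, not the code of any real search engine, and the class name, sample page and URLs are all made up for the example.

```python
from html.parser import HTMLParser


class FollowableLinkParser(HTMLParser):
    """Collects the links a polite crawler would follow, skipping nofollow links."""

    def __init__(self):
        super().__init__()
        self.followable = []  # links the crawler may follow
        self.skipped = []     # links marked rel="nofollow"

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # rel holds a space-separated list of tokens, e.g. rel="nofollow noopener"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.skipped.append(href)
        else:
            self.followable.append(href)


# Hypothetical page with one ordinary link and one ad link.
page = (
    '<p><a href="/about">About us</a> '
    '<a href="https://ads.example.com/click/123" rel="nofollow">Sponsored</a></p>'
)

parser = FollowableLinkParser()
parser.feed(page)
print(parser.followable)
print(parser.skipped)
```

The ad link ends up in the skipped list, so a crawler built this way would never request it. Nothing forces a bot to perform this check, though, which is where the trouble starts.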

However, rel="nofollow" doesn’t stop all bots. Even amongst the largest search engines, there is inconsistent support for the standard. Wikipedia’s Nofollow article reports that while Google and Ask.com fully respect the attribute, Yahoo and Bing both follow the link anyway (but exclude the link from their search rankings).

Other offenders are much less well intentioned. Email harvesters, spambots, malware and bots that scan for security vulnerabilities are unlikely to respect the nofollow attribute.

This leaves it up to the ad servers and ad trackers to identify and filter bot traffic. It’s tricky, however, because the links can’t appear broken. If a bot such as Bingbot tries to follow the link (even though nofollow is set), the link should still work. Otherwise, it’s possible that Bingbot will demote the page in the Bing search rankings for having broken links.

AdPlugg handles bot traffic by allowing it to pass through but filtering it out of the statistics. This keeps all links working but stops bot traffic from being counted in the ad’s impression and click statistics.
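The pass-through approach can be sketched in a few lines. This is not AdPlugg’s actual code; the AdStats class, the handle_click function and the placeholder bot check are all hypothetical names chosen for the example. The point is simply that every visitor gets the redirect, while only non-bot visitors are counted.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AdStats:
    clicks: int = 0               # clicks reported to the advertiser
    filtered_bot_clicks: int = 0  # clicks kept out of the reported stats


def handle_click(user_agent: str, target_url: str, stats: AdStats,
                 is_bot: Callable[[str], bool]) -> str:
    """Always return the redirect target; count the click only for non-bots."""
    if is_bot(user_agent):
        stats.filtered_bot_clicks += 1
    else:
        stats.clicks += 1
    # The link works for everyone, so it never looks broken to a crawler.
    return target_url


def looks_like_bot(user_agent: str) -> bool:
    # Placeholder check for the demo; a real filter would be far more thorough.
    return "bot" in user_agent.lower()


target = "https://advertiser.example.com/landing"
stats = AdStats()

human_redirect = handle_click(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)", target, stats, looks_like_bot)
crawler_redirect = handle_click(
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    target, stats, looks_like_bot)

print(human_redirect == crawler_redirect)
print(stats)
```

Both requests receive the same redirect, but only the first one adds to the click count.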

The trick is identifying the bots. AdPlugg now scans for and filters out over 400 known bots. We have systems in place to update our bot list as we become aware of new bots. Similar to the plight of CAPTCHA technology, identifying bot traffic for the purpose of ad stats is a difficult battle. Bots that don’t want to be identified are remarkably good at avoiding detection.
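One common way to spot known bots is to match the User-Agent header against a list of signatures. The sketch below assumes that approach; it is not a description of how AdPlugg’s filter works, and the handful of signatures shown is an illustrative sample rather than a real list of 400 bots. The function names are hypothetical as well.

```python
import re

# Illustrative sample only; a production list would be much longer.
BOT_SIGNATURES = ["googlebot", "bingbot", "slurp", "baiduspider", "crawler", "spider"]


def compile_signatures(signatures):
    """Build one case-insensitive pattern from a list of plain substrings."""
    return re.compile("|".join(re.escape(s) for s in signatures), re.IGNORECASE)


def is_known_bot(user_agent, pattern):
    # Treating a missing User-Agent as a bot is a policy assumption for this sketch.
    if not user_agent:
        return True
    return pattern.search(user_agent) is not None


pattern = compile_signatures(BOT_SIGNATURES)

googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser_ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
new_bot_ua = "ExampleBot/1.0 (+https://bots.example.com/info)"

print(is_known_bot(googlebot_ua, pattern))
print(is_known_bot(browser_ua, pattern))

# A newly discovered bot slips through until the list is updated.
missed_before_update = not is_known_bot(new_bot_ua, pattern)
pattern = compile_signatures(BOT_SIGNATURES + ["examplebot"])
caught_after_update = is_known_bot(new_bot_ua, pattern)
print(missed_before_update, caught_after_update)
```

The weakness is easy to see: a bot that sends a browser-style User-Agent looks just like the browser in this check, which is why a signature list needs constant upkeep and can only ever be part of the answer.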

AdPlugg’s battle against the bots is an ongoing effort and our systems are continually being worked on and improved. We often get asked, “what differences make AdPlugg better than other options such as stand-alone plugins?” Well, bot filtering is a big one.
