May 24, 2016

Adblocking Detection, My Approach

It occurred to me that there may be other people out there interested in detecting the presence of adblockers on their website. Here’s my approach.

First, a little primer. When you sign up for AdSense, it has you add a small code block to your website. The code block contains an ins element with the class name “adsbygoogle”. When the AdSense JavaScript runs, it finds this element in your DOM and inserts the ad into it.

Most adblockers work by preventing the AdSense JavaScript file from downloading to the user’s browser and executing. Therefore, the simplest and most straightforward way to detect an adblocker is to see if any content was inserted into the ins “adsbygoogle” element. If you’re using JavaScript with the jQuery library, that might look like:

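$(window).on("load", function() {
    setTimeout(function() {
        // Look for the AdSense placeholder element
        var ad = $("ins.adsbygoogle");
        // Present but empty (ignoring whitespace) means no ad was inserted
        if (ad.length > 0 && ad.html().replace(/\s/g, "").length === 0) {
            console.log("Ad blocked");
        }
    }, 2000);
});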

The code sets a 2 second (2000 millisecond) timeout to give the AdSense script time to load and execute. After two seconds have elapsed, it uses the jQuery selector to find the ins element with the adsbygoogle class. If the ins element is present, it checks whether the tag contents (from the .html() call) have zero length once whitespace is stripped, which means the element was never populated.
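
For pages that don’t load jQuery, the same check can be sketched with the standard DOM API. This is only an illustration: the function name isAdSlotEmpty and its doc parameter are made up for this example. Passing the document in as an argument makes the logic easy to exercise outside a browser.

```javascript
// Sketch: the emptiness check without jQuery.
// `doc` is any object exposing querySelector (normally window.document).
// The function name is hypothetical, chosen for this example.
function isAdSlotEmpty(doc) {
    var ad = doc.querySelector("ins.adsbygoogle");
    // No ins element at all: nothing to judge on this page
    if (ad === null) {
        return false;
    }
    // Whitespace-only content counts as empty
    return ad.innerHTML.replace(/\s/g, "").length === 0;
}
```

In a page you would call isAdSlotEmpty(document) from inside the same two second timeout.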

There are a couple of drawbacks to this approach. While most adblockers work by preventing the ins element from being populated, some remove the ins element from the DOM altogether. Others modify the CSS properties so the ins element is no longer visible. On top of that, some of my pages, such as the terms of service, are ad-less by design. I wanted to detect a modified DOM, which meant signaling to my detection script whether an ins element was expected to be present in the first place.

The first thing I did was create a div container, with a randomly generated id, around my Google AdSense ins tag. Most adblockers give you the ability to hide arbitrary elements on a page, so you can hide any ad the blocker may have missed. Blockers store information about the element (usually the id) so they can continue to block it if you leave the page and come back, or if they see the element on other pages in the same domain. Because each id is a fresh random string, every container looks different to the adblocker, so there is no way for it to hide all my container divs (see the caveats below).

This is done in PHP and might look like:

<?php
$id = getRandomID();
echo "<div id=\"$id\"><ins class=\"adsbygoogle\"…
echo "<script>addAdsenseID('$id');</script>";
?>

The function getRandomID() is a user-created function I left out for brevity. Insert your favorite random string generator. The last step, addAdsenseID, is a JavaScript call that adds the id to an array.
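
The body of addAdsenseID isn’t shown here. A minimal sketch, assuming the ids are kept in a global array called adsenseIds (the name the detection script reads from), could be as small as this:

```javascript
// Assumed global registry of container ids, one entry per ad on the page.
var adsenseIds = [];

// Sketch of addAdsenseID: remember the container id so the detection
// script knows an ins element is expected inside it.
function addAdsenseID(id) {
    adsenseIds.push(id);
}
```

Both definitions need to be loaded before the first inline script tag that calls addAdsenseID, for example from a script in the page head.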

The updated javascript now looks like:

$(window).on("load", function() {
    setTimeout(function() {
        for (var i = 0; i < adsenseIds.length; i++) {
            var id = adsenseIds[i];
            var ad = $("#" + id + " ins.adsbygoogle");
            if (ad.length === 0) {
                // An ad was expected here, but the ins element is gone
                console.log("DOM modified");
            } else if (ad.html().replace(/\s/g, "").length === 0) {
                console.log("Ad Content Empty");
            } else if (ad.css("display") === "none") {
                console.log("Ad Display None");
            }
        }
    }, 2000);
});

No more overcounting pages without ads, no more missing the case where the DOM has been modified. It happens rarely, but as a data scientist “rarely” is just not specific enough. I opted to inspect each AdSense element, but if you wanted one check per page you could drop the for loop and look only at the first id, with something like “var id = adsenseIds[0];”
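
If you would rather count outcomes than read console output, the decision logic can be pulled out into a plain function. The sketch below is an assumption-laden illustration: classifyAdSlot, the shape of the slot object, and the “OK” label are all invented for this example. The three other labels match the log messages above.

```javascript
// Sketch: the same three checks as a pure function.
// `slot` is a hypothetical snapshot of one ad container:
//   present: was the ins element found inside the container?
//   html:    its inner HTML (ignored when present is false)
//   display: its computed CSS display value
function classifyAdSlot(slot) {
    if (!slot.present) {
        return "DOM modified";
    }
    if (slot.html.replace(/\s/g, "").length === 0) {
        return "Ad Content Empty";
    }
    if (slot.display === "none") {
        return "Ad Display None";
    }
    return "OK"; // label invented for this sketch
}
```

The jQuery code would fill in the snapshot from ad.length, ad.html() and ad.css("display"), then pass the returned label to whatever analytics call you use.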


Caveats: While my approach is robust enough for my purposes, it’s not foolproof. At present I see two possible workarounds.

1.) A user can disable JavaScript altogether. With JavaScript disabled, my detection script won’t run. I doubt anyone would do that on my website since that would also render all the apps useless. I’m not losing any sleep over this one.

2.) The ids I generated for my container div elements were completely random, but the ids for the elements on the rest of the page are not. Certain characters (‘q’, ‘x’, ‘z’, etc.) are more likely to occur in random strings than in strings based on English. One could create a statistical model to predict whether an id was a random string and thus a container div id. It’s a lot of work, but it’s possible.
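
To give a feel for the second workaround, here is a deliberately crude sketch of such a model. It only measures the share of letters that are q, x or z. In uniformly random lowercase letters those three make up 3 of 26, about 11.5%, while in ordinary English words they are far rarer. The function names and the 0.05 threshold are arbitrary choices for this example, not anything a real adblocker is known to use.

```javascript
// Sketch: fraction of letters in an id that are q, x or z.
function rareLetterRatio(id) {
    // Keep letters only; digits, dashes and underscores are ignored
    var letters = id.toLowerCase().replace(/[^a-z]/g, "");
    if (letters.length === 0) {
        return 0;
    }
    var rare = letters.replace(/[^qxz]/g, "").length;
    return rare / letters.length;
}

// Hypothetical classifier: the default threshold is an arbitrary guess.
function looksRandom(id, threshold) {
    if (threshold === undefined) {
        threshold = 0.05;
    }
    return rareLetterRatio(id) >= threshold;
}
```

On short ids this is very noisy: a hand-written id like “box” scores one in three, and a random id can easily contain none of the three letters. A serious attempt would need longer ids or a richer model.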

Posted in Work Life