In Part 1, I touched on the global narrative of online privacy and the increasing ingenuity with which consumers are evading state and corporate-level tracking.

In this follow-up post, I will argue that growing consumer scepticism of online tracking is not the only issue facing big data. Perhaps more importantly, assumptions being made by enterprise-level data management platforms and their algorithms are inherently incomplete. Big data has changed the marketing industry beyond recognition, and delivers competitive advantages to those that use it well. But it’s no panacea.

Let me begin with some terminology. ‘Passive’ or ‘implicit’ data collection, most commonly enabled by placing a few lines of code in websites, apps and other hosted digital media, is the practice of tracking consumer behaviour on connected platforms and accruing terabytes of data in the process. ‘Active’ or ‘explicit’ data collection is more transparent, with insight companies openly asking respondents for answers.

The accepted wisdom in the debate between active and passive methodologies is that the former puts the emphasis on ‘claimed’ behaviour, while the latter tells a story of ‘actual’ behaviour. But is this really true?

Let’s take China as an example. Usage of VPNs and proxy servers is rife there, and GlobalWebIndex data has shown repeatedly that the Great Firewall of China is not as secure as the government would hope.

This flies in the face of passively collected data on Facebook and Twitter. Many sources will tell you that both struggle for adoption in China, since passive analytics systems will often look to server locations as a proxy for geographic usage. Clearly this couldn’t be further from the truth, with a huge number of Chinese users explicitly telling us that they are active users of western social networks.
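To make the location problem concrete, here is a minimal sketch in Python. The records are entirely hypothetical (they are not GlobalWebIndex data), and the field names `ip_country` and `stated_country` are my own labels for "where the tracker thinks you are" and "where you say you are".

```python
from collections import Counter

# Hypothetical records, invented for illustration only.
# ip_country: the country a passive tracker infers from the IP address it sees.
# stated_country: the country the same person reports when asked in a survey.
sessions = [
    {"user": "a", "ip_country": "US", "stated_country": "CN"},  # browsing via a VPN
    {"user": "b", "ip_country": "DE", "stated_country": "CN"},  # browsing via a proxy
    {"user": "c", "ip_country": "CN", "stated_country": "CN"},  # direct connection
    {"user": "d", "ip_country": "US", "stated_country": "US"},  # direct connection
]


def count_by(records, field):
    """Count how many records fall under each value of the given field."""
    return Counter(record[field] for record in records)


passive_view = count_by(sessions, "ip_country")     # what the tracker reports
active_view = count_by(sessions, "stated_country")  # what respondents report

print("Passive:", dict(passive_view))
print("Active: ", dict(active_view))
```

In this toy data the passive view credits China with one user, while the active view finds three. Real traffic is far messier, but the mechanism is the same.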

There are many metrics other than location that throw up potential red herrings too. Web analytics packages that report on unique users, pageviews, conversions and revenues often show significant discrepancies between one another. This goes for web analytics, mobile analytics, telemetry systems and other big data services. There’s no single view of the truth, since there are multiple points of access to the web, many of which have both multiple web browsers and multiple end users.
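The "unique users" problem can be sketched the same way. The example below assumes a tracker that identifies visitors by a per-browser cookie; the log, the cookie IDs and the `person` field are all hypothetical, and the `person` column is precisely the thing a passive system cannot see.

```python
# Hypothetical page-view log, invented for illustration only.
# cookie: the per-browser ID a passive tracker uses as its "unique user".
# person: the human actually behind the screen, invisible to the tracker.
hits = [
    {"cookie": "c1", "person": "alice"},  # Alice on her laptop
    {"cookie": "c2", "person": "alice"},  # Alice on her phone
    {"cookie": "c4", "person": "alice"},  # Alice on a work machine
    {"cookie": "c3", "person": "bob"},    # Bob on the family PC
    {"cookie": "c3", "person": "carol"},  # Carol on the same family PC
]

# What the tracker reports as "unique users": distinct cookies.
unique_cookies = len({hit["cookie"] for hit in hits})

# What a survey would uncover: distinct people.
unique_people = len({hit["person"] for hit in hits})

print("Unique users by cookie:", unique_cookies)
print("Actual people:", unique_people)
```

Here one person is counted three times and two people are counted once, so the cookie count (four) and the head count (three) disagree. Two analytics tools that set their cookies differently would disagree with each other as well.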

Take the BlueKai Registry as an example. Using the tool, you can view the attributes that programmatic media buyers are using to target you with advertising, drawn from the third-party data housed within the BlueKai exchange. The registry lists demographics, interests, expected income and more. If you have a look, you will notice that while some of the segments you have been placed in are broadly accurate, many of them are anything but. Suffice it to say I wish I earned as much money as BlueKai thinks I do!

Source:, April 2013

If you need further convincing that passive data has inherent blind spots and inaccuracies, consider the fact that even the largest web companies enrich their data with surveys. Facebook routinely surveys its audience, despite the fact that its users arguably share personal information more extensively on Facebook than on any other platform in the world.

Source:, March 2013

Active data collection (i.e. survey-based methodologies that collect explicit consumer answers) is becoming more relevant than ever for anyone who wants a clear picture. Indeed, it’s becoming increasingly hard to deny that technology-based tracking will never be the ‘silver bullet’ that many digital marketers were hoping for. Even simple reporting such as user location is impossible to depend on, with traffic being routed around the world (often via hubs in the US and central Europe) through VPNs and proxy servers.

In a world where Do Not Track and VPNs are ever more common, a complete picture can only be built from a suite of tools that offers both active and passive insight, so as to approach a clearer version of the truth.

There are two implications for marketers. First, be responsible about your data collection – and even more diligent about who you share data with. Consumers are rightly becoming more critical of user-level tracking, and abuses of data sharing and monetisation will ruin the party for everyone. Second, combine a passive approach with active methodologies, so that you might fully understand how your consumers are using digital media.
