The Facebook data breach raises urgent questions that our industry must answer responsibly, given the breach's terrifying scale and impact. In today's world, data is a form of soft power, and it is essential that those who wield it use it responsibly so that consumer confidence isn't compromised.
The challenge is that, at the idea-generation stage, it can be difficult to draw a bright line between using data for optimisation and using it for manipulation.
Take, for instance, the Obama and Trump campaigns in the US. The former used digital platforms to optimise communication, build voter confidence and disseminate information. British political consulting firm Cambridge Analytica, on the other hand, used the same platforms but with mala fide intent -- to manipulate the views and preferences of voters.
As investigations continue, it is increasingly clear that data was stolen, that the models were not authorised for the purpose for which they were used, and that the messages were, in many cases, outright lies.
So the whole operation was questionable from the get-go. It is, therefore, critical to demarcate this difference -- are the final consumers of a data-driven model being actively manipulated, or is data merely being used to optimise a communications strategy?
It is also essential to clearly define the parties involved in the data "lifecycle" and their roles and responsibilities with regard to how data is used. There are usually three parties in this lifecycle, each requiring a different kind of oversight and different norms.
First are the data originators, those that capture and store the data. And I'm not only talking about Facebook and Google, but also a wide range of other originators -- for example, Equifax (which holds extremely sensitive consumer credit information), banks (which store individual-centric financial information) and telecom organisations (which hold a treasure trove of communications and browsing information).
Two safeguards are critical here.
One, data security safeguards to ensure the privacy (external parties shouldn't be able to see it) and integrity (external parties shouldn't be able to change it) of the data. This can be improved by ring-fencing the data sources and deploying advanced security measures.
And two, ensuring explicit consumer consent for sharing and using this data. This can be done by introducing easy-to-understand language around fair use that tells consumers where their data could be used and for what purposes.
These two interventions -- securing data and informing users where it could be shared -- are key and will go a long way in winning back consumer trust in these platforms.
The second type of entity involved is the data processing company, which applies intelligent algorithms to the data to extract insights. This includes companies like Cambridge Analytica.
Given that data processing companies also have access to large volumes of data entrusted to them by clients, it is imperative that their systems are subject to similar security, compliance and governance norms.
This can be achieved through globally agreed standards of security, enforced through regular third-party audits. We need to be held to the same standards as the data sources themselves when it comes to the security of the data, so that we aren't the weak link in the event of a data leak.
There may also be value in exploring how we can expressly declare the nature of the algorithms employed, and their source (in cases where an algorithm is patented), to an unaffiliated third-party regulator. This will ensure better transparency around what the data is being used for.
Finally, we have the third party in the data lifecycle: the buyers of the data -- organisations that pay for the data and the algorithm-driven insights around it. In this case, they are the political organisations that are beneficiaries of the analysis work by the processing companies.
Here, let's go back to my earlier point about drawing the line between optimisation and manipulation.
Are the data-buying organisations sponsoring an ad because they feel consumers genuinely stand to benefit from the content, or are they using the data to manipulate users into actions that are not in their best interest?
More importantly, does the ad-sponsoring organisation have the authority to display that ad, or is it a geopolitical adversary? This can be cleared up by implementing fair-usage policies around what the extracted data is being used for, who is using it, and what the implications of that use are -- all of which need to be made more transparent and, in certain cases, subject to governance norms.
Obviously, the three parties interact with each other. For instance, Facebook and Google each play two of these roles -- the source and the processor. They must, therefore, be held accountable to both sets of norms.
It is imperative that all parties in the data lifecycle take seriously the trust with which users share data with them -- for their own good. The way things stand right now, nibbling around the edges of this debate is not going to win back lost consumer confidence in our industry, and we will all be the losers in the long term.
(Sameer Dhanrajani is Chief Strategy Officer at analytics service provider Fractal Analytics. The views expressed are personal. He can be contacted at firstname.lastname@example.org)