In an era when the collective voice of customers and citizens, empowered through social channels, has become a primary agent-of-change that is transforming governments, societies, industries, brands and products, most consumer-centric organizations have now adopted some form of social listening.  Use cases include intelligence to improve customer service, understand customer experience, design new products, measure brand health, meet the unmet needs of communities and customers, and, in most cases, become more socially responsible and responsive.

But the sheer volume and complexity of this data has made understanding human language at scale a serious challenge.  Slang, sarcasm, implicit meaning and simply separating the meaningful signals from the noise have required massive resources to filter, clean and glean actionable insights in a timely fashion.  Forrester Research reports that most companies currently process only 21 percent of their unstructured data.  And for good reason: many technologies today still perform quite poorly at accuracy (as compared to the human-level “gold standard”) and capture and analyze, on average, less than half of the opinions expressed.  This can lead to a dangerous result where companies are acting on data that is far from precise and lacks rigor and transparency.

Ineffective or garbled translation of this data does a disservice to both brands and consumers.  For brands, bad data leads to bad insights and poor business outcomes.   For consumers, misinterpreting their expressed opinions means that they’re not being heard, and likely not being served well.

“Getting it right” matters when it comes to this kind of language analysis at scale, and indeed there is a growing groundswell for accountability.  DARPA, for example, has initiated a project designed to help eliminate human bias in AI by keeping humans involved in oversight of the models.  And while GDPR doesn’t specifically address social and voice of customer analysis, the spirit and goal of the policies are quite clear:  we must be accurate and explainable and provide recourse if there appears to be inadvertent bias.

So how exactly do we do this right?

Technology can make a significant impact.  With the maturation of machine learning, today we can create models that can accurately process affective expressions like sentiment and emotion at about the level of humans – and in some cases exceed that performance (individual humans can be pretty inconsistent).  This has enabled processing of massive amounts of unstructured voice of customer data in near real time with greater precision than ever before.

But just applying AI and machine learning is no panacea either.  In fact, without applying it thoughtfully, strategically and with a well-defined process, it can lead to a whole set of other unintended consequences that can put organizations at risk.  AI learns from observing people, and if people are inadvertently sloppy or biased, the algorithms will be too.  Examples of inadvertently discriminatory models abound.  Some of the more public examples include Google’s Word2Vec showing sexist tendencies (such as one case where men were associated with computer programmers while women were associated with homemakers).  In one criminal justice model, Black defendants were almost twice as likely as white ones to be falsely labeled future criminals.

Without diligence, as AI becomes integrated more deeply into organizations and insights functions, inadvertently discriminatory classifiers will ultimately lead to inadvertently discriminatory actions (or inaction when an action is needed).  The World Economic Forum’s Global Future Council on Human Rights says it well: “The concern around discriminatory outcomes in machine learning is not just about upholding human rights, but also about maintaining trust and protecting the social contract founded on the idea that a person’s best interests are being served by the technology they are using or that is being used on them.”

The good news is that this is an issue savvy organizations are waking up to and beginning to take action on.  Over the last twelve months, we’ve seen conversations about AI and machine learning shift from “should we be using it for text analysis?” (the answer: a resounding yes) to “how do we make sure we do this right?”

In response to the latter question, we offer these seven guidelines.

1. Get Senior Executive Buy-in and Guidance: Successful adoption of these technologies should not be left to data scientists and model builders alone.  Executives need to lay out policies and guidelines that make it clear that avoiding inadvertent bias is not just a key value, but also a measurable KPI.

2. Employ Active Inclusion: Machine learning algorithms learn by observing human behavior.  Perhaps not surprisingly, humans often don’t agree (our research finds that humans agree on emotion and sentiment only about 65-80 percent of the time).  To overcome the potential bias of a single human’s input, it’s critical that multiple individuals representing your target audience provide input, and if there’s disagreement, there must be an adjudication process.

3. Process Even the Hard Stuff: Yes, language is hard.  And on social media, sarcasm, slang and new expressions and shorthand dominate.  But that doesn’t excuse us from processing and interpreting it.  Many standard text analytics approaches are ineffective with this kind of data because there is an almost infinite number of ways opinions can be expressed, and systems can often break when confronting that level of complexity.  Effective systems need to be optimized to perform not just on “The Queen’s English” but also on all variations, dialects, and expression styles.

4. Measure and Validate: Before unleashing a classifier in the wild, it’s critically important to understand exactly how well it performs.  Far too often today, companies have little to no specific understanding of how accurate their models are or how well they are capturing and processing required data.  They have to try to “eyeball” the data to see if it looks right.  We advocate for an F1 score (the harmonic mean of precision and recall) as a starting point, although more effective evaluation can also include techniques like Area Under the Curve (AUC).  The key is that performance be measured against diverse new data the model hasn’t seen before so that you can know how well it will “generalize.”

5. Think Bottom-Up: Inadvertent bias can also emerge from one of the key challenges of most research:  you only find what you look for.  Techniques like “organic topic discovery” are critically important to identify trends, concepts and opinions that bubble up naturally in conversations and might otherwise fly under the radar.

6. Keep a Human in the Loop: Human language is intrinsically human.  Technologies in this space should not aim to fully replace humans, but rather augment their intelligence and judgment at vast scale.  In active machine learning approaches, every processed record can be considered in light of a confidence score, allowing analysts and domain experts to review low-confidence records and intervene if needed.

7. Take Control of Your Classifiers: Given the growing risks associated with poor models, it’s becoming increasingly incumbent upon organizations to take control of their data directly and not simply cede the responsibilities to third parties without fully understanding their processing and technologies.  New “Machine Learning as a Service” solutions that integrate with social listening, engagement and business intelligence platforms are increasingly allowing even general users to access these technologies directly for maximum impact and to ensure adherence to company guidelines, taxonomies and policies.
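
For readers who want to see what guidelines 2, 4 and 6 might look like in practice, here is a minimal Python sketch.  It is an illustration only, not a description of any particular product: the function names (adjudicate, f1_score, needs_review), the simple majority-vote rule and the 0.7 confidence threshold are all assumptions chosen for the example.

```python
from collections import Counter


def adjudicate(labels):
    """Guideline 2 (assumed rule): strict majority vote across annotators.

    Returns the winning label, or None when no label has a strict
    majority, meaning a human adjudicator should settle the record.
    """
    if not labels:
        return None
    top_label, count = Counter(labels).most_common(1)[0]
    return top_label if count > len(labels) / 2 else None


def f1_score(gold, predicted, positive):
    """Guideline 4: F1 for one class, the harmonic mean of precision and recall.

    `gold` should come from held-out data the model has not seen before.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def needs_review(records, threshold=0.7):
    """Guideline 6: pick out low-confidence predictions for human review.

    The 0.7 threshold is a hypothetical default, not a recommendation.
    """
    return [r for r in records if r["confidence"] < threshold]


if __name__ == "__main__":
    # Three annotators per record; the second record has no majority.
    print(adjudicate(["positive", "positive", "negative"]))
    print(adjudicate(["negative", "positive", "neutral"]))

    # Held-out gold labels versus model predictions.
    gold = ["positive", "positive", "negative", "negative"]
    predicted = ["positive", "negative", "positive", "negative"]
    print(f1_score(gold, predicted, positive="positive"))

    # Route uncertain predictions to an analyst.
    scored = [
        {"text": "great, just great", "label": "positive", "confidence": 0.55},
        {"text": "love this product", "label": "positive", "confidence": 0.97},
    ]
    print(needs_review(scored))
```

In a real deployment the adjudication rule, the evaluation metric and the review threshold would each be set by the policies described in guideline 1 rather than hard-coded as they are here.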

Machine learning applied to linguistic analysis truly is a game-changer and offers immense value to society, consumers, and companies alike.  But with great power comes an even greater responsibility to use the optimal tools, technologies, processes and procedures.   For anyone involved in analyzing this powerful unstructured data and interpreting it for their organizations, there is perhaps no greater obligation today than to adopt these new approaches to ensure that they do indeed “get it right” for the benefit of consumers, society and brands alike.


*This article first appeared in the GreenBook Blog