The Impact and Role of ChatGPT (and LLMs) on Social, Media, and Conversation Data
Beginning in 2008, Converseon pioneered the application of machine learning and natural language processing to unstructured social (and related) data. Since that time, we’ve gained extensive experience in building, testing, and deploying ML models using a wide range of algorithms, from decision trees to neural networks.
Our approach has been to remain, in some sense, algorithm agnostic – we use whatever best solves the problem at hand. We incorporate new innovations as they demonstrate productive impact on our work and solutions. Critical to this is our ability to apply human-in-the-loop integration, validation, and subject matter expertise to solve specific challenges. Our focus on model performance and practical application is one reason Converseon was recognized by Twitter as its first “recommended” NLP partner in 2022.
The emergence of LLMs (large language models), the underlying technology for applications such as ChatGPT, is the most recent and most significant example of innovation that we have evaluated and incorporated into our solutions. We do so with a sharp eye toward their overall effectiveness, and perhaps more importantly, their challenges, risks, opportunities and, of course, anticipated evolution.
The media frenzy over ChatGPT has many in the consumer, market, and social intelligence space, and beyond, seeing these developments as a suddenly disruptive force. But this technology, and our application of it, has been around for some years. More importantly, we must keep in mind that, from our perspective, the key challenge facing organizations today is true predictive intelligence. The ability to see ahead and peer around the corner at what is likely to come – while quantifying the value and impact of actions before you take them – is a key imperative, and provides a context for how we are exploiting this technology.
And it is through such a lens that we evaluate this and all new applicable technologies.
The summary of our view, which we detail here, is that LLMs are indeed a powerful technology that can play an important, but specific, role in the “intelligence” industries. They are integrated into our broader solution where they provide the most value while mitigating their risks.
LLMs provide significant power as foundation models upon which we build reliable task-specific solutions incorporating subject matter expertise and human oversight. By contrast, in isolation, LLMs have broad but shallow knowledge, are prone to error, sometimes “make up” facts, and are retrospective (not predictive) in nature.
In short, they must be approached with the same level of diligence, focus, expertise, validation, and human-in-the-loop intelligence that has been consistently required for other AI and machine learning approaches. In fact, in some ways even more diligence is required, given the intoxicatingly convincing yet false output these models can produce. They are an important component of the ecosystem, but only when paired with processes and technologies that augment, complement, and transform their power into the predictive, quantifiable, trusted, and actionable decision intelligence demanded by the organizations we work with.
ChatGPT has created quite a stir recently, drawing pronouncements such as "everything is going to be different going forward" from Box CEO Aaron Levie. Hearing what some people say about it, or even chatting with it yourself, can leave you awestruck or even intimidated.
Will everything really change? Or, at the least, will everything in the digital world change, and with it how we acquire and use data to make informed decisions?
This is the key question for many looking to use this new technology. First, let's address the sentience question and make clear, at least for GPT version 3.5 and all current, similar models, that ChatGPT (and its rivals, such as Google's Bard) is neither alive nor intelligent. It does not think, and it cannot reason. It is a large mathematical model of inputs, weights, memory, and outputs based on those inputs and weights, which means it is a tool, not a being or person. It has been trained on large corpora of text to predict what should reasonably come next, and only a very small percentage of that training involved a human in the loop. It makes mistakes, and it can reflect the biases inherent in the texts on which it was trained.
This means that, like any tool, it has strengths and weaknesses, and with that, right and wrong applications.
This is where things get interesting. If we adjust our thinking, ground our amazement and see ChatGPT as a tool instead of some mystical AI, we begin to see how and when it can be useful in helping our businesses make better decisions.
Let’s look at the strengths and weaknesses first and then we will look at some applications in business. We will use two practical examples to illustrate and demystify.
Example 1: The India Times asked ChatGPT “Can you tell me which is India’s biggest IT company?“ ChatGPT answered “Tata Consultancy Services (TCS) is one of the largest software companies in India. TCS is a multinational corporation that provides a range of IT services, including consulting, software development, and business process outsourcing. TCS is headquartered in Mumbai, India, and has operations in more than 50 countries around the world. In 2021, TCS was ranked as the third largest IT services company in the world by revenue, and it is one of the most valuable companies in India,” (bold added for some fact checking later).
Now that’s an astonishing answer. Or is it? It is a seemingly good reply in very natural-sounding language. Has an infinite brain just been tapped into? Well, getting past the amazingly well-constructed reply, we can see that although the question was localized as a comparison within India, the answer described a company registered in India but in a global, not local, context. So this isn’t actually a good answer. If your CEO asked “which product performs best in the USA” and you said “our salt and vinegar flavor sells the most worldwide,” they would not be happy, since you did not answer the question.
The next point to note is that the answer is short on facts. ChatGPT is great at combining data sources to produce a natural-sounding summary. Primarily, the system has learned to construct language correctly, whether that language is code or a human one. But its answers don’t include many facts, because facts are not represented in language in the way the model was trained. Remember, it is trained to predict upcoming words; unless the entire training corpus were question-and-answer style, it wouldn’t learn facts as such. Another note on facts: given that these models have been trained on texts from many sources, applying probabilistic weights to inputs, the resulting “facts” are often wrong.
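The “predict the upcoming words” idea is easy to make concrete. The sketch below is a toy bigram model over an invented four-sentence corpus (nothing remotely like GPT’s scale or architecture); it shows how a next-word model can produce fluent-sounding text purely from continuation frequencies, without storing any facts at all:

```python
from collections import Counter, defaultdict

# An invented mini-corpus. A next-word model learns word order from text
# like this; it stores continuation frequencies, not a table of facts.
corpus = (
    "tcs is a large it company . "
    "tcs is headquartered in mumbai . "
    "infosys is a large it company . "
    "tcs provides consulting services . "
).split()

# For each word, count which words follow it and how often.
follow = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow[prev][nxt] += 1

def generate(start, n=6):
    """Walk the bigram table, always taking the most frequent next word."""
    words = [start]
    for _ in range(n):
        options = follow[words[-1]]
        if not options:
            break
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

print(generate("tcs"))  # prints: tcs is a large it company .
```

The output reads like a sentence, but the model has no notion of whether it is true; it is simply the most frequent path through the training text, which is exactly why fluency and factual accuracy are two different things.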
Let’s illustrate by looking at a simple Wiki search for TCS which has: “Tata Consultancy Services (TCS) is an Indian multinational information technology (IT) services and consulting company with its headquarters in Mumbai. It is a part of the Tata Group and operates in 150 locations across 46 countries.[8-2017] In July 2022, it was reported that TCS had over 600,000 employees worldwide.” (bold mine)
This brings us to the next problem with ChatGPT when using the public model (i.e., not training a GPT 3.5 model yourself): you get no citations, and you get only old data from 2021 or earlier. You can follow up and ask for the sources it used, but as some have demonstrated, this leads ChatGPT to change its answers. Yes, the system will actually change its answer as you go down the route of finding the primary source document. This means that, no matter what the public ChatGPT model says, you always need to fact-check any assertions it makes.
This isn’t good for decision making. You wouldn’t take at face value glib responses from a casual (human) acquaintance, nor should you accept the same from ChatGPT.
Driving home this point about facts and sources: which would you prefer as an answer to the question of how many countries TCS operates in?
- ChatGPT: over 50 (pre-2021 data)
- Wiki: 46 (2017 investor relations data)
- TCS website: 55 (Jan 2023)
Now, some reading this might be thinking “you are being too hard on the tech.” But remember, the context is business and making critical business decisions. If the data you have is wrong, the decision you make is wrong.
Now let’s explore how facts get commingled within ChatGPT. This is a common problem across NLP models. It is actually very hard to provide enough data for minor sub-categories to carry the right facts. This is because, at its heart, all current AI is probabilistic: it infers “answers” via weightings, optimized during training, to produce the most likely response to a specific prompt.
To illustrate this commingling, our next example comes from MKBHD’s channel. He is a tech reviewer who had ChatGPT write a script for a video. It was super interesting, but as he points out, when ChatGPT shared the technical specifications for the iPhone 14, it indicated a 12mp rear camera, which is true of some previous models. But the iPhone 14 has a 48mp rear camera.
This is not surprising and, in truth, is what we would expect to see. The model learns from many different sources. It sees “12mp” and “iPhone” together 10x or even 100x more frequently than “iPhone” alongside a “48mp” rear camera, because the iPhone 7 through the iPhone 13 all had a 12mp rear camera. What the model couldn’t do was “understand” the adjustment that the “14” would make to the iPhone’s technical specs.
This exposes both a weakness and a benefit.
If you are interested in the most popular opinion, then clearly a model which gives higher weight to the most frequently expressed opinion is going to work well. If you are interested in the hard facts, regardless of the “hype,” then this isn’t going to work for you. ChatGPT summarizes in a style which focuses not on the facts but rather, loosely put, on the co-occurrence frequency of words across many documents. (This is inherited from training techniques that predict the missing words in, or the next words of, a sentence. The more frequently certain word combinations appear together, the more likely they are to be predicted when one of those words comes up in a question.)
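The iPhone camera mix-up can be sketched in a few lines: count how often each spec co-occurs with the product name across a corpus, and answer with the majority. The document counts below are invented purely for illustration; they just mirror the “many old reviews, few new ones” situation described above.

```python
from collections import Counter

# Invented mini-corpus: many documents describe older iPhones (12mp rear
# camera), only a few describe the iPhone 14 (48mp). Counts illustrative.
documents = (
    ["iphone has a 12mp rear camera"] * 20
    + ["iphone 14 has a 48mp rear camera"] * 2
)

# Tally which camera spec co-occurs with the product across the corpus.
spec_counts = Counter()
for doc in documents:
    for token in doc.split():
        if token.endswith("mp"):
            spec_counts[token] += 1

# A frequency-weighted "answer" picks the most common spec, which here
# is the outdated one: exactly the failure mode described above.
answer = spec_counts.most_common(1)[0][0]
print(answer)  # prints: 12mp
```

For popular-opinion questions this majority behavior is a feature; for hard, current facts it is precisely the problem.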
Now onto the benefits.
How can one best use a model like this? The answer is to take the underlying public model and then train it further with your own data. ChatGPT itself is an instance of the public model with some additional training added. At Converseon, we have achieved excellent results by applying additional training (fine-tuning) to public models with very carefully curated data, along with careful analysis of the results.
From a general public pre-trained model, you can get a summary of various topics, the current winds of public opinion (as of 2021), so to speak, as published on the internet (including Twitter and some other social platforms, pre-2021). It is high level, but it can help you break out of your own way of seeing the world. This use of the tool is much like bringing in experts from other disciplines, or from outside your team or organization, for brainstorming. It will likely not be of the same quality, but it fulfills that function.
Another use case is getting well-written text, in your language (but not your culture; it ignores non-English cultures), from your own ideas. So if you have a particular idea and want to expand upon it, you can enter and guide the public model with your idea to produce something more verbose or complete. When it comes to making decisions, though, none of this is useful. It is old data, filtered through unknown biases, curated to some degree by unknown people. This can produce a different or aggregate point of view, but is that really what you want?
By contrast, with a public model fine-tuned on your own text data, you can build powerful derivatives, such as what Converseon has created with advanced sentiment, tone-of-voice, and topic classification. This means you can use the GPT 3.5 architecture to create more advanced tools that better handle those repetitive classification tasks. Sure, you could also have it respond in natural language, but if you are making decisions, it is often the hard facts, not the fuzzy colors of language nuance, that are most needed.
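Converseon’s actual fine-tuned GPT models are, of course, far more sophisticated, but the underlying principle, task-specific training on carefully curated, labeled data, can be illustrated with a toy. The sketch below swaps the LLM for a plain word-count naive Bayes classifier; the training examples and labels are entirely invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Invented, hand-curated training examples (the "carefully curated data").
# A real fine-tuning set would be far larger and labeled by experts.
training = [
    ("the support team was helpful and fast", "positive"),
    ("love the new update it works great", "positive"),
    ("the app keeps crashing very frustrating", "negative"),
    ("terrible service and slow response times", "negative"),
]

# Count word frequencies per label (a standard naive Bayes setup).
word_counts = defaultdict(Counter)
label_counts = Counter()
for text, label in training:
    label_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {word for counts in word_counts.values() for word in counts}

def classify(text):
    """Score each label by log-probability with add-one smoothing."""
    scores = {}
    for label in label_counts:
        total = sum(word_counts[label].values())
        score = math.log(label_counts[label] / len(training))
        for word in text.split():
            score += math.log(
                (word_counts[label][word] + 1) / (total + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("the update is great"))       # prints: positive
print(classify("crashing and frustrating"))  # prints: negative
```

The point of the analogy: once trained for one narrow, repetitive task on data you curated, the system returns consistent, checkable labels rather than fluent but unverifiable prose, and that is what decision-making pipelines need.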
What I would suggest you don’t do is have it write your next VP report or customer insights summary. The weaknesses around facts and context will appear at random and could catch you off guard. You would be guilty of sharing an unknown or unsourced opinion, instead of facts, within your organization.
To conclude, as OpenAI, the creators of ChatGPT, put it:
“this is for entertainment purposes only”.