By now, you’ve heard of ChatGPT and its text generation capabilities. It has passed a business school exam, confounded teachers looking to spot cheaters and helped people craft emails to their co-workers and loved ones.
That it has accomplished those tasks is notable, because exams, essays and emails require correct answers. But being correct isn’t really the point of ChatGPT — it’s more of a byproduct of its objective: producing natural-sounding text.
So how do artificial intelligence chatbots work, and why do they get some answers right and some answers really, really wrong? Here’s a look inside the box.
The technology behind large language models like ChatGPT is similar to the predictive text feature you see when you compose a message on your phone. Your phone will evaluate what has been typed in and calculate probabilities of what’s most likely to follow, based on its model and what it has observed from your past behavior.
Anyone familiar with the process knows how many different directions a string of text can branch into.
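To make the phone comparison concrete, here is a minimal sketch of predictive text, assuming a made-up message history and a rule that looks only at the single previous word. The `history` list and the `suggest` function are illustrative names, not anything a real phone or ChatGPT uses.

```python
from collections import Counter, defaultdict

# Hypothetical message history standing in for "past behavior".
history = [
    "see you at the office",
    "see you at the gym",
    "see you at the office tomorrow",
    "running late see you soon",
]

# Count which word follows each word across the history.
following = defaultdict(Counter)
for message in history:
    words = message.split()
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1

def suggest(word, k=3):
    """Return up to k (next_word, probability) pairs, most likely first."""
    counts = following[word]
    total = sum(counts.values())
    return [(w, c / total) for w, c in counts.most_common(k)]

print(suggest("the"))  # "office" ranks above "gym"
print(suggest("you"))  # "at" ranks above "soon"
```

Even with four short messages, the word "the" already branches two ways. A real keyboard draws on far more text and far more context than one previous word.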
Unlike the phone’s predictive text feature, ChatGPT is said to be generative (the G in GPT). It isn’t making one-off predictions; instead it’s meant to create text strings that make sense across multiple sentences and paragraphs. The output should read as though a person wrote it, and it should match up with the prompt.
So what helps it pick a good next word, and then another word after that, and on and on?
The internal reference
There is no database of facts or a dictionary inside the machine to help it “understand” words. Instead, the system treats words mathematically, as a collection of values. You can think of these values as representing some quality the word might have. For example, is the word complimentary or critical? Sweet or sour? Low or high?
In theory, you could set these values wherever you like and find that you have come close to a word. Here is a fictional example to demonstrate the idea: Imagine a generator designed to return a different fruit based on three qualities. Changing any of the qualities changes the output.
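The fruit idea can be sketched in a few lines of code. The three qualities (sweetness, sourness, size) and every score below are invented for illustration; the function simply returns whichever fruit sits closest to the values you ask for.

```python
import math

# Made-up scores between 0 and 1 for (sweetness, sourness, size).
fruits = {
    "lemon":      (0.1, 0.9, 0.3),
    "banana":     (0.8, 0.1, 0.4),
    "watermelon": (0.7, 0.1, 1.0),
    "grape":      (0.7, 0.3, 0.05),
}

def closest_fruit(sweetness, sourness, size):
    """Return the fruit whose values lie nearest to the requested ones."""
    target = (sweetness, sourness, size)
    return min(fruits, key=lambda name: math.dist(fruits[name], target))

print(closest_fruit(0.2, 0.8, 0.3))  # lemon
print(closest_fruit(0.9, 0.0, 0.9))  # watermelon
```

Nudging a single value, such as lowering the size from 0.9 to 0.3 in the second call, is enough to move the answer from watermelon to banana.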
That technique is called word embedding, and it isn’t new. It originated in the field of linguistics in the 1950s. While the example above uses just three “qualities,” in a large language model, the number of “qualities” for every word would be in the hundreds, allowing a very precise way to identify words.
Learning to make sense
When the model is new, the qualities associated with each word are set randomly, which isn’t very useful, because its ability to predict depends on their being very finely tuned. To get there, it needs to be trained on a lot of content. That is the large part of the large language model.
A system like ChatGPT might be fed millions of webpages and digital documents. (Think about the entirety of Wikipedia, big news websites, blogs and digitized books.) The machine cycles through the training data one stretch at a time, blocking out a word in a sequence and calculating a “guess” at what values most closely represent what should go in the blank. When the right answer is revealed, the machine can use the difference between what it guessed and the actual word to improve.
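The guess-and-correct cycle can be illustrated with a toy model. This is a simplified sketch, not OpenAI's method: it assumes an 11-word training text, looks only at the word just before the blank, and keeps one row of adjustable values per word. The names `weights`, `guess` and `train` are hypothetical.

```python
import math
import random

random.seed(0)

# Tiny stand-in for the training text.
text = "the cat sat on the mat the cat ate the fish".split()
vocab = sorted(set(text))
index = {word: i for i, word in enumerate(vocab)}

# One row of adjustable values per context word, set randomly at the start.
weights = [[random.uniform(-0.1, 0.1) for _ in vocab] for _ in vocab]

def guess(context_word):
    """Turn the context word's row of values into a probability per word."""
    row = weights[index[context_word]]
    exps = [math.exp(v) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def train(epochs=200, learning_rate=0.5):
    for _ in range(epochs):
        for context_word, hidden_word in zip(text, text[1:]):
            probs = guess(context_word)
            row = weights[index[context_word]]
            for j in range(len(vocab)):
                # Difference between the guess and the revealed answer.
                actual = 1.0 if j == index[hidden_word] else 0.0
                row[j] -= learning_rate * (probs[j] - actual)

before = guess("the")[index["cat"]]
train()
after = guess("the")[index["cat"]]
print(f"P(cat | the) before training: {before:.2f}, after: {after:.2f}")
```

Before training, the random values give "cat" roughly the same chance as any other word. After many passes, the probability of "cat" following "the" should rise well above its starting value, because "cat" follows "the" in half of the examples.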
It’s a lengthy process. OpenAI, the company behind ChatGPT, hasn’t published details about how much training data went into ChatGPT or the computing power used to train it. But researchers from Nvidia, Stanford University and Microsoft estimate that, using 1,024 graphics processing units, it would have taken 34 days to train GPT-3, ChatGPT’s predecessor. One analyst estimated that the cost of computational resources to train and run large language models could stretch into the millions.
ChatGPT also has an extra layer of training, referred to as reinforcement learning from human feedback. While previous training is about getting the model to fill in missing text, this phase is about getting it to put out strings that are coherent, accurate and conversational.
During this stage, people rate the machine’s response, flagging output that is incorrect, unhelpful or even downright nonsensical. Using the feedback, the machine learns to predict whether humans will find its responses useful. OpenAI says this training makes the output of its model safer, more relevant and less likely to “hallucinate” facts. And researchers have said it is what aligns ChatGPT’s responses better with human expectations.
At the end of the process, there is no record of the original training data inside the model. It doesn’t contain facts or quotes that can be referred to — just how related or unrelated words were to one another in action.
Putting the training to use
This set of data turns out to be surprisingly powerful. When you type your query into ChatGPT, it translates everything into numbers using what it learned during training. Then it does the same series of calculations from above to predict the next word in its response. This time, there’s no hidden word to reveal; it just predicts.
Thanks to its ability to refer to earlier parts of the conversation, it can keep this up for page after page, producing realistic, human-sounding text that is sometimes, but not always, correct.
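The predict-append-repeat loop can be sketched as follows. This is a rough illustration under heavy assumptions: a three-sentence training text and a lookup based on only the last two words, where a real model weighs far more of the conversation. The names `next_counts` and `generate` are made up for the example.

```python
import random
from collections import Counter, defaultdict

# Tiny stand-in for what a model absorbs during training.
training_text = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Record which word followed each pair of words.
next_counts = defaultdict(Counter)
for a, b, c in zip(training_text, training_text[1:], training_text[2:]):
    next_counts[(a, b)][c] += 1

def generate(prompt, max_words=8, seed=0):
    """Extend the prompt one word at a time, feeding each pick back in."""
    rng = random.Random(seed)
    words = prompt.split()
    for _ in range(max_words):
        counts = next_counts.get(tuple(words[-2:]))
        if not counts:
            break  # nothing learned about this context
        choices, weights = zip(*counts.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the cat"))
```

Because each word is chosen only by how likely it is to follow the previous ones, the loop can stitch together a sentence such as "the cat sat on the rug" that never appeared in the training text. It reads naturally, but nothing in the training text said it.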
At this point, there are plenty of disagreements about what AI is or will be capable of, but one thing is pretty well agreed upon — and prominently featured on the interfaces of ChatGPT, Google Bard and Microsoft Bing: These tools shouldn’t be relied on when accuracy is required.
Large language models are able to identify text patterns, not facts. And a number of models, including ChatGPT, have knowledge cutoff dates: their training data ends at a fixed point, and they can’t connect to the internet to learn new information. That’s in contrast to Microsoft’s Bing chatbot, which can query online resources.
A large language model is also only as good as the material that was used to train it. Because models identify patterns between words, feeding an AI text that is dangerous or racist means the AI will learn text patterns that are dangerous or racist.
OpenAI says it has created some guardrails to prevent ChatGPT from serving up such content, and ChatGPT says it is “trained to decline inappropriate requests,” as we discovered when it refused to write an angry email demanding a raise. But the company also admits that ChatGPT will still sometimes “respond to harmful instructions or exhibit biased behavior.”
There are many useful ways to take advantage of the technology now, such as drafting cover letters, summarizing meetings or planning meals. The big question is whether improvements in the technology can push past some of its flaws, enabling it to create truly reliable text.
Graphics by JoElla Carman. In the “Pride and Prejudice” graphic, Google Bard, OpenAI GPT-1 and ChatGPT were given the prompt “Please summarize Pride and Prejudice by Jane Austen in one sentence.” BigScience Bloom was asked to finish the sentence “In the novel Pride and Prejudice, Jane Austen.” All responses collected May 11, 2023. In the email graphic, OpenAI ChatGPT was given the prompts: “Write a positive email asking for a raise,” “Write a neutral email asking for a raise,” “Write an agitated email asking for a raise,” “Write an angry email asking for a raise.” All responses collected May 8, 2023.