What is a Large Language Model?

You've seen what LLMs can do in tools like ChatGPT. Now learn the surprisingly simple idea behind how they "think."

Think of it as a super-powered autocomplete.

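An LLM's core job is not to "understand" or "know" things in the human sense. Its primary function is to predict the next word in a sequence (more precisely, the next token, which may be a word or a word fragment). It does this by assigning a probability to each candidate, based on patterns in the text it was trained on.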
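To make "predicting the next word" concrete, here is a minimal sketch in Python. The candidate words and their scores are made-up numbers for illustration, not output from a real model; the softmax function is a standard way to turn raw scores into probabilities.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(s - highest) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

# Hypothetical scores a model might assign to candidates for
# "The cat sat on the ..." (assumed values, for illustration only).
scores = {"mat": 3.0, "sofa": 2.0, "moon": 0.5}

probs = softmax(scores)
next_word = max(probs, key=probs.get)  # pick the most likely candidate
print(next_word)  # prints "mat"
```

A real model scores tens of thousands of tokens at every step, but the principle is the same: scores in, probabilities out, one token chosen.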

It learns by reading the internet.

An LLM is trained on a colossal amount of text data from books, articles, websites, and code. During this training, it is not building a lookup table of facts; it learns the complex statistical relationships between words, phrases, and concepts.
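A drastically simplified way to picture "learning statistical relationships" is to count which word follows which. The sketch below does that on a tiny made-up corpus. Real LLMs use neural networks and far more context than one preceding word, so treat this as an analogy, not the actual training method.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus standing in for the training text.
corpus = "the cat sat on the mat . the cat ate the fish ."

def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def next_word_probs(counts, word):
    """Convert the follow-up counts for one word into probabilities."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

counts = train_bigrams(corpus)
print(next_word_probs(counts, "the"))
# {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```

Nothing here stores a sentence or a fact. The "model" is only a table of how often words appear together.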

This one simple skill unlocks powerful abilities:

Writing & Content

It can write essays, emails, and poems by predicting word after word.

Conversation

It can "chat" by predicting a likely response to your question.

Coding

It can write code by predicting the next symbol in a programming language.
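All three abilities come from the same loop: predict one word, append it, and predict again. The sketch below shows that loop with a hypothetical next-word table. The words and probabilities are invented for illustration and do not come from a real model.

```python
import random

# Hypothetical next-word table (made-up probabilities).
NEXT = {
    "<start>": {"the": 1.0},
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sleeps": 0.7, "runs": 0.3},
    "dog": {"runs": 0.8, "sleeps": 0.2},
    "sleeps": {"<end>": 1.0},
    "runs": {"<end>": 1.0},
}

def generate(table, greedy=True, seed=0, max_words=10):
    """Build a sentence one word at a time until "<end>" is predicted."""
    rng = random.Random(seed)
    word, output = "<start>", []
    for _ in range(max_words):
        options = table[word]
        if greedy:
            # Always take the single most likely next word.
            word = max(options, key=options.get)
        else:
            # Sample in proportion to probability, so output can vary.
            word = rng.choices(list(options), weights=list(options.values()))[0]
        if word == "<end>":
            break
        output.append(word)
    return " ".join(output)

print(generate(NEXT))  # prints "the cat sleeps"
```

Calling generate(NEXT, greedy=False) samples instead of always taking the top choice, which is roughly why a chatbot can give different answers to the same question.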

But there's a catch...

Because it's only predicting, an LLM doesn't have a concept of "truth." This can lead to "hallucinations" (confidently stated falsehoods), and it can reflect the biases present in its training data. It's a powerful tool, not an all-knowing oracle.
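A toy example shows how a falsehood can fall out of pure prediction. The word-pair model below is trained only on two true sentences (a made-up training set), yet it rates a false sentence as exactly as likely as a true one. This is a simplified sketch: real LLMs look at much more context, which reduces the effect but does not remove it.

```python
from collections import Counter, defaultdict

# Every sentence in this made-up training text is true.
sentences = [
    "paris is the capital of france",
    "rome is the capital of italy",
]

# Count which word follows which, with start and end markers.
counts = defaultdict(Counter)
for sentence in sentences:
    words = ["<start>"] + sentence.split() + ["<end>"]
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1

def sentence_probability(sentence):
    """Multiply the word-to-word probabilities along the sentence."""
    words = ["<start>"] + sentence.split() + ["<end>"]
    prob = 1.0
    for current, following in zip(words, words[1:]):
        total = sum(counts[current].values())
        prob *= counts[current][following] / total if total else 0.0
    return prob

print(sentence_probability("paris is the capital of france"))  # 0.25
print(sentence_probability("paris is the capital of italy"))   # 0.25
```

The model has no way to tell the two sentences apart: both are built from word sequences it has seen, so both look equally plausible.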

It's All About Probability.

A Large Language Model is a marvel of scale and statistics. By mastering the simple task of predicting the next word, it unlocks a world of complex capabilities that are changing how we work and create.


[Figure: LLM Prediction. Input tokens → Transformer → probability distribution → output (⚠️ may be a hallucination).]