AI | Ages 14-18 | Intermediate | 11 min read

How Chatbots and Language Models Work

A teen guide to how chatbots and large language models work: tokens, next-word prediction, training, fine-tuning, context windows, hallucinations and limits.

Key takeaways

  • Large language models (LLMs) work by predicting the next token over and over, not by looking up stored answers
  • They are trained on huge text datasets, then fine-tuned with human feedback to be more helpful and safe
  • A context window limits how much text a model can consider at once
  • LLMs can hallucinate confident but false information, so claims must be verified

What a chatbot really is

When you type to an AI chatbot, it feels like talking to someone who understands you. Under the hood, something more mechanical is happening. The chatbot is a large language model (LLM), and its one core skill is predicting the next piece of text.

If you have not yet read What Is Machine Learning?, start there, since an LLM is a very large neural network.

Tokens and next-token prediction

Models do not read text word by word the way we do. They split it into tokens, chunks that are often a whole word or part of a word. "Unbelievable" might become "un", "believ" and "able".
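To make the idea of tokens concrete, here is a minimal sketch of a tokenizer. It assumes a tiny, made-up vocabulary (`VOCAB`) and a simple "take the longest matching piece" rule. Real tokenizers learn much larger vocabularies from data and use different rules, so treat this as an illustration only.

```python
# Toy subword tokenizer: greedy longest match against a tiny, made-up vocabulary.
# VOCAB is an assumption for this example; real vocabularies are learned from data.
VOCAB = {"un", "believ", "able", "chat", "bot", "s"}


def tokenize(word: str) -> list[str]:
    """Split a word into the longest vocabulary pieces, left to right."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest possible piece first, then shrink it.
        for end in range(len(word), i, -1):
            piece = word[i:end]
            if piece in VOCAB:
                tokens.append(piece)
                i = end
                break
        else:
            # No vocabulary piece matches here: fall back to a single character.
            tokens.append(word[i])
            i += 1
    return tokens


print(tokenize("unbelievable"))  # ['un', 'believ', 'able']
print(tokenize("chatbots"))      # ['chat', 'bot', 's']
```

Notice that a word the vocabulary has never seen still gets tokenized, just into smaller pieces. That is one reason subword tokens are useful.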

Given the tokens so far, the model outputs a probability for every possible next token. It picks one (often with a bit of randomness so replies are not identical), adds it to the text, and repeats. A whole essay is built one token at a time. That is the entire trick, scaled up enormously.
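The predict, pick, repeat loop can be sketched in a few lines. In this example `toy_model` is a hypothetical stand-in for a real LLM: it is just a hand-written table of scores that looks only at the last token. The `temperature` setting and the `<end>` token are also assumptions chosen for illustration.

```python
import math
import random


def softmax(scores: dict[str, float], temperature: float = 1.0) -> dict[str, float]:
    """Turn raw scores into probabilities that sum to 1."""
    scaled = {tok: s / temperature for tok, s in scores.items()}
    biggest = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - biggest) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def toy_model(tokens: list[str]) -> dict[str, float]:
    """Hypothetical stand-in for an LLM: scores depend only on the last token."""
    table = {
        "the": {"cat": 2.0, "dog": 1.5, "<end>": -1.0},
        "cat": {"sat": 2.0, "ran": 1.0, "<end>": 0.0},
        "dog": {"ran": 2.0, "sat": 1.0, "<end>": 0.0},
        "sat": {"<end>": 3.0},
        "ran": {"<end>": 3.0},
    }
    return table[tokens[-1]]


def generate(prompt: list[str], max_new_tokens: int = 10, seed: int = 0) -> list[str]:
    """Build text one token at a time: score, sample, append, repeat."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = softmax(toy_model(tokens), temperature=0.8)
        choices, weights = zip(*probs.items())
        # Sample instead of always taking the top token, so replies can vary.
        next_token = rng.choices(choices, weights=weights)[0]
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens


print(generate(["the"]))
```

Changing `seed` can change the output, which mirrors why a chatbot may answer the same question differently each time. A lower temperature makes the most likely token win more often.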

How it gets trained

Training happens in stages:

  1. Pretraining. The model reads a vast amount of text (books, websites, code) and learns to predict the next token. By doing this billions of times, it absorbs grammar, facts, reasoning patterns and writing styles. This stage needs huge computing power.
  2. Fine-tuning and human feedback. A raw pretrained model is not naturally helpful or safe. Developers fine-tune it on examples of good responses and use techniques like reinforcement learning from human feedback (RLHF), where humans rate answers and the model learns to prefer better ones. This makes it better at following instructions and more likely to refuse harmful requests.

The result is a model that has no database of answers to look up. Its "knowledge" is baked into billions of weights, the adjustable numbers inside the network.
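A heavily simplified way to see "learning to predict the next token" is to count which token follows which in some training text. This is not how real LLMs are trained (they adjust neural network weights with gradient descent), but the counts below play a similar role to weights: the model keeps numbers, not the original sentences. The three-sentence `corpus` is made up for this sketch.

```python
from collections import Counter, defaultdict


def train_bigram(corpus: list[str]) -> dict[str, Counter]:
    """Count, for each token, which tokens followed it in the training text."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts: dict[str, Counter], token: str) -> dict[str, float]:
    """Turn the stored counts into next-token probabilities."""
    followers = counts.get(token, Counter())
    total = sum(followers.values())
    return {tok: n / total for tok, n in followers.items()}


# Made-up training text for illustration.
corpus = [
    "the cat sat on the mat",
    "the cat ran",
    "the dog sat",
]
model = train_bigram(corpus)
print(predict_next(model, "the"))  # {'cat': 0.5, 'mat': 0.25, 'dog': 0.25}
```

After training, the sentences themselves are gone. Only the numbers remain, and predictions come from those numbers.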

The context window

A model can only consider a limited amount of text at once, called the context window, measured in tokens. Everything in the current conversation, your messages and its replies, must fit inside it.

If a chat gets very long, the earliest parts can fall outside the window, and the model effectively forgets them. This is also why a fresh chat does not remember a previous one unless the product deliberately stores memory.
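Here is a rough sketch of how early messages can fall out of the window. It assumes one token per word and a simple "drop the oldest messages first" rule. Real products count tokens with a proper tokenizer and may use other strategies, so the function name `fit_to_window` and the window size of 16 are illustrative only.

```python
def fit_to_window(messages: list[str], window_size: int) -> list[str]:
    """Keep the most recent messages whose total token count fits the window."""
    kept = []
    used = 0
    # Walk backwards from the newest message, stopping when the window is full.
    for message in reversed(messages):
        cost = len(message.split())  # crude assumption: one token per word
        if used + cost > window_size:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))


chat = [
    "my name is Sam",
    "nice to meet you Sam",
    "what is a token",
    "a token is a chunk of text",
    "what is my name",
]
print(fit_to_window(chat, window_size=16))
# ['what is a token', 'a token is a chunk of text', 'what is my name']
```

The message containing the name no longer fits, so a model that only sees the kept messages has nothing to base the answer on.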

Why chatbots make confident mistakes

Because an LLM optimises for plausible-sounding text, not truth, it can produce a hallucination: a fluent, confident statement that is simply wrong. It may invent a fake book title, a wrong date, or a non-existent quote, all stated as if certain.

This is not lying; the model has no intent. It is generating the most likely-looking continuation. That is why you must verify any important fact, especially numbers, citations, names and recent events.
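A toy example shows how "likely-looking" and "true" can come apart. The sketch below assumes a tiny model that only remembers which token was seen after which in two true training sentences. Real LLMs are far more sophisticated, but the same basic gap applies: nothing in the generation step checks facts.

```python
from collections import defaultdict

# Two true sentences as made-up training text.
training = [
    "paris is the capital of france",
    "rome is the capital of italy",
]

# "Train": record which tokens were seen following each token.
followers = defaultdict(set)
for sentence in training:
    tokens = ["<start>"] + sentence.split() + ["<end>"]
    for current, following in zip(tokens, tokens[1:]):
        followers[current].add(following)


def all_sentences(token="<start>", so_far=()):
    """Yield every sentence the model could produce by chaining seen pairs."""
    for nxt in sorted(followers[token]):
        if nxt == "<end>":
            yield " ".join(so_far)
        else:
            yield from all_sentences(nxt, so_far + (nxt,))


possible = list(all_sentences())
invented = [s for s in possible if s not in training]
print(invented)
# ['paris is the capital of italy', 'rome is the capital of france']
```

Every step in the invented sentences was seen in training, so each one looks perfectly plausible to the model. Both are still false.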

Other real limits:

  • Training cutoff. The base model only knows patterns up to when it was trained, so it can be out of date unless connected to live search.
  • Bias. Like any ML system, it reflects biases in its training text.
  • No guarantee of correct reasoning. It often reasons well, but it can also make basic logical or arithmetic slips.

Using them well

LLMs are excellent for drafting, brainstorming, explaining ideas, summarising and helping with code. Treat the output as a smart first draft, not a final authority. Check facts, keep your private data private, and form your own judgement.

To learn how to use these tools responsibly, read Using AI Safely and Responsibly. If you want to experiment with building on top of language models, grow your Coding skills.

Quick quiz

At its core, what does a large language model actually do?

What is a 'token'?

What is the purpose of fine-tuning with human feedback?

What does the 'context window' limit?

Why might a chatbot state something false with total confidence?

FAQ

Does a chatbot really understand what I type?

Not in the human sense. It models statistical patterns in language extremely well, which often looks like understanding, but it has no beliefs, intentions or lived experience.

Can a chatbot look up current information?

It depends on the product. Some tools can search the web; the base model only knows patterns from its training data, which has a cutoff date.