Classification vs Prediction
Understand the two main jobs of supervised learning: classification sorts things into categories, regression predicts a number. Clear examples and where each fits.
Key takeaways
- Classification sorts an input into one of a set of categories, like 'cat' or 'dog'
- Regression predicts a number on a sliding scale, like a temperature or a price
- The key question is: is the answer a label from a list, or a number?
- Both are forms of supervised learning that need labelled examples to train on
- Choosing the wrong type for a problem makes the AI's answers far less useful
Two main jobs for a learning machine
When an AI learns from labelled examples, it is doing supervised learning. (If that term is new, the lesson on Supervised vs Unsupervised Learning explains it.) But supervised learning is not one single task. It splits into two main jobs, and many of the practical AI systems you meet are doing one of them:
- Classification sorts an input into a category.
- Regression, often called prediction in everyday speech, predicts a number.
Telling these two apart is one of the first decisions an AI engineer makes, because it changes how the whole system is built. Let us look at each, then at the one simple question that tells them apart.
Classification: choosing from a list
In classification, the answer is a category picked from a fixed list of choices. The model looks at an input and decides which box it belongs in.
Here are real examples:
- An email arrives. Is it spam or safe? Two categories.
- A photo is uploaded. Is it a cat, a dog, or a rabbit? Three categories.
- A doctor's scan is checked. Does it look healthy or concerning? Two categories.
- A handwritten digit is scanned. Is it 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9? Ten categories.
Notice the pattern: in every case, the possible answers form a short, named list. The model cannot invent a new category. A cat-or-dog classifier shown a photo of a horse will still answer "cat" or "dog", because those are its only options. This is a real and important limit. The model does not know what it does not know; it can only choose from the boxes it was given.
Classification often comes with a confidence score too. Instead of just saying "cat", a model might say "85% cat, 15% dog". That number is useful: a low confidence is a hint that the model is unsure and a human should take a closer look.
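As a rough illustration of the two ideas above (a fixed list of boxes, plus a confidence score), here is a minimal Python sketch. The raw scores are made up, the `classify` helper and its `review_below` cut-off are hypothetical, and softmax is just one common way to turn scores into confidences. A real model is not guaranteed to be unsure about an unfamiliar input; this sketch only shows what the check looks like when it is.

```python
import math

LABELS = ["cat", "dog"]  # the only boxes this toy model has


def softmax(scores):
    """Turn raw scores into confidences that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, review_below=0.70):
    """Return (label, confidence, needs_review) for one input.

    `scores` holds one raw score per label; how a model produces them is
    outside this sketch. `review_below` is a hypothetical cut-off.
    """
    confidences = softmax(scores)
    best = max(range(len(LABELS)), key=lambda i: confidences[i])
    confidence = confidences[best]
    return LABELS[best], confidence, confidence < review_below


# Made-up scores: a clear cat photo, then a horse photo the model never saw.
print(classify([2.0, 0.3]))  # "cat" at roughly 85%, no review needed
print(classify([0.1, 0.2]))  # still one of the two labels, flagged for review
```

Whatever scores go in, the label that comes out is always "cat" or "dog"; the confidence is the only signal that something may be off.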
Regression: landing on a number
In regression, the answer is a number that can slide smoothly across a range. There is no list of boxes; the answer could be almost any value.
Real examples:
- Predict tomorrow's temperature in degrees. It could be 12, or 18.5, or 23.
- Predict the price of a house from its size, location and age.
- Predict how many minutes a food delivery will take.
- Predict how many viewers a video will get this week.
The giveaway is that the answer lives on a sliding scale. A temperature is not "hot" or "cold" as a category here; it is an actual number like 17.3 degrees. Because the answer can be slightly off rather than simply right or wrong, regression is judged differently. We do not just ask "Was it correct?" We ask "How close was it?" A house-price model that guesses $305,000 when the true price was $300,000 is doing well, even though it is not exactly right.
The one question that tells them apart
Forget the technical names for a moment. To decide which job you are facing, ask yourself a single question:
Is the answer a label chosen from a list, or a number on a sliding scale?
If the answer is a label from a list, like "spam" or "dog" or "healthy", it is classification. If the answer is a number that can take many values, like a price or a temperature, it is regression (prediction).
Try it on these:
| Task | Answer is... | Type |
|---|---|---|
| Is this review positive or negative? | A label (positive / negative) | Classification |
| What will this stock be worth tomorrow? | A number (a price) | Regression |
| Which language is this text written in? | A label (English, Spanish, ...) | Classification |
| How old is the person in this photo? | A number (years) | Regression |
Why choosing right matters
Picking the wrong type makes an AI clumsy. Suppose you want to predict a child's exact height. If you treat it as classification with boxes like "short", "average", and "tall", you throw away all the detail. You can never tell two "tall" children apart, even if one is much taller. Regression would give you a precise number instead.
The reverse happens too. If a task only has a few sensible answers, like "yes" or "no", forcing a model to output a precise number is awkward and confusing. The shape of the answer should match the shape of the question.
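The height example can be sketched in a few lines of Python. The cut-offs of 120 cm and 140 cm and the sample heights are invented for illustration only.

```python
def height_band(height_cm, short_below=120.0, tall_from=140.0):
    """Map a height in cm to one of three labels (hypothetical cut-offs)."""
    if height_cm < short_below:
        return "short"
    if height_cm < tall_from:
        return "average"
    return "tall"


heights_cm = [118.0, 131.5, 141.0, 158.0]  # made-up children
bands = [height_band(h) for h in heights_cm]
print(bands)  # ['short', 'average', 'tall', 'tall']
```

The last two children differ by 17 cm, yet both come out as "tall". Once the number has been replaced by a label, that difference cannot be recovered.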
The choice also changes how you judge the model. A classifier is scored by how often it picks the right category: did it correctly say "spam" or "dog"? You can count clean hits and misses. A regression model is judged by how close its number is to the truth, often by measuring the average size of its errors. Guessing $305,000 instead of $300,000 is a small error; guessing $600,000 is a big one. Because the two tasks are scored so differently, you have to decide which job you are doing before you can even tell whether your model is any good.
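The two scoring styles can be compared side by side. This is a minimal sketch with made-up labels and prices; accuracy and mean absolute error are common choices, not the only ways to score a model.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true label exactly."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def mean_absolute_error(true_values, predicted_values):
    """Average distance between each prediction and the true number."""
    errors = [abs(t - p) for t, p in zip(true_values, predicted_values)]
    return sum(errors) / len(errors)


# Classification: count clean hits and misses.
true_labels = ["spam", "safe", "spam", "safe"]
predicted_labels = ["spam", "safe", "safe", "safe"]
print(accuracy(true_labels, predicted_labels))  # 0.75

# Regression: measure how far off each guess is.
true_prices = [300_000, 300_000]
close_guesses = [305_000, 295_000]
far_guesses = [600_000, 300_000]
print(mean_absolute_error(true_prices, close_guesses))  # 5000.0
print(mean_absolute_error(true_prices, far_guesses))    # 150000.0
```

Note that `far_guesses` gets one price exactly right and still scores much worse, because the size of the miss counts, not just the number of misses.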
A worked example: the same data, two questions
Imagine a school has records for many students: hours studied, hours slept, and past test scores. From the same data you could ask two completely different questions, and each leads to a different type of model.
- "Will this student pass or fail the next exam?" The answer is one of two labels, so this is classification.
- "What score out of 100 will this student get?" The answer is a number on a scale, so this is regression.
Notice that the second question is harder but more informative. The classifier only tells you pass or fail; the regression model tells you how well, which lets a teacher spot a student heading for a borderline result. This shows that the choice between classification and prediction is not just technical. It shapes how useful and how detailed the model's answers will be.
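Here is one way the school example might look in Python. Everything in it is an assumption made for illustration: the five student records, the pass mark of 50, the choice to use only hours studied as the input, and the two deliberately simple models (a straight-line fit for the score, and a copy-the-nearest-student rule for pass or fail).

```python
# Made-up records: (hours_studied, hours_slept, past_score, exam_score)
records = [
    (2.0, 6.0, 45, 42),
    (4.0, 7.0, 55, 51),
    (6.0, 8.0, 60, 63),
    (8.0, 7.5, 72, 74),
    (10.0, 8.0, 85, 88),
]
PASS_MARK = 50  # hypothetical pass mark

hours = [r[0] for r in records]
scores = [r[3] for r in records]                                      # regression target
labels = ["pass" if s >= PASS_MARK else "fail" for s in scores]       # classification target


def fit_line(xs, ys):
    """Least-squares straight line: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_label(xs, known_labels, x_new):
    """Copy the label of the training example closest to x_new."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x_new))
    return known_labels[nearest]


slope, intercept = fit_line(hours, scores)
new_hours = 4.5
predicted_score = slope * new_hours + intercept

print(nearest_label(hours, labels, new_hours))  # classification answer: a label
print(round(predicted_score, 1))                # regression answer: a number
```

For a student who studied 4.5 hours, the classifier answers "pass" and stops there. The regression model lands at about 55, only a few points above the pass mark, which is the kind of borderline result the label alone would hide.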
What both share, and where they fail
Despite their differences, classification and regression are siblings. Both are supervised learning, so both need labelled examples to train on: emails marked spam or safe, houses tagged with their real sold price. Both learn by comparing their guess to the true answer and nudging themselves to do better, a process you can see in action in How Does AI Learn?.
And both share the same honest limits. A model is only as good as its examples. If a price model only ever saw cheap houses, it will guess badly on a mansion. If a classifier only saw cats and dogs, it cannot recognise a horse. Neither type truly understands the world; they find patterns in the examples they were given. Knowing which job you are doing, and respecting the limits of the examples behind it, is what turns a flashy demo into a tool you can actually trust.
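The mansion problem is easy to reproduce with a toy model. The sketch below uses invented sizes and prices and a nearest-neighbour rule (guess the price of the most similar house seen in training). That rule is chosen only because its failure is easy to see; other kinds of model fail in other ways.

```python
# Made-up training data: (size_m2, sold_price_usd), cheap houses only
training = [(50, 120_000), (70, 150_000), (90, 185_000), (110, 210_000)]


def predict_price(size_m2):
    """Guess the price of the training house closest in size."""
    nearest = min(training, key=lambda house: abs(house[0] - size_m2))
    return nearest[1]


print(predict_price(85))   # 185000: similar to houses it has seen
print(predict_price(900))  # 210000: a mansion, priced like the largest cheap house
```

The model gives an answer for the mansion without any complaint. Nothing in the output signals that 900 square metres lies far outside everything it was trained on.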
Quick quiz
What does a classification model produce?
Classification picks an answer from a set of categories, such as 'spam' or 'not spam', or 'cat', 'dog', or 'rabbit'.
Which task is regression?
Predicting a temperature gives a number that can slide up or down, so it is regression, not classification.
What single question best tells the two apart?
Ask whether the output is one of a few named categories (classification) or a number that can take many values (regression).
Why do both need labelled examples?
Both are supervised tasks: the model improves by checking its guess against the true answer attached to each training example.
A model predicts a house price in dollars. What type is it?
A price is a number that can take a wide range of values, so predicting it is regression.
FAQ
In everyday talk, 'prediction' just means the model's answer, whether it is a category or a number. In machine learning, the precise word for predicting a number is 'regression', while sorting into categories is 'classification'. This lesson uses 'prediction' in the narrower sense of guessing a number, which is the regression task.
A regression problem can be turned into a classification problem by grouping the numbers into bands. Instead of predicting an exact temperature, you could classify the day as 'cold', 'mild', or 'hot'. This loses detail but can be easier and is sometimes all you need.