Loading Data into Python with Lists and Dictionaries
Get real data into Python: build it as lists, dictionaries and lists-of-dicts, then loop, filter, count and summarise it. Runnable code, a worked example and a hands-on challenge.
Key takeaways
- A list holds many values in order; a dictionary holds labelled values as key: value pairs
- A list of dictionaries is the everyday shape of a data table: one dict per row
- Loop over the rows with for to read every record, and use if to filter the ones you want
- sum(), len(), max() and min() turn a list of numbers into a quick summary
- Getting data into clean Python structures is the first step of every data project
Data is just organised values
Every data project starts the same way: you need to get your information into the program in a tidy shape you can work with. In Python, two simple structures do almost all of that job: the list and the dictionary. In this lesson you'll build a small dataset by hand, then loop over it, filter it and summarise it. These are the exact moves you'll repeat on much bigger data later.
If lists and dictionaries are new to you, peek at lists and arrays and dictionaries in Python first. Ready? Let's load some data.
A list: many values in a row
A list holds several values in order, inside square brackets:
temperatures = [21, 23, 19, 25, 22]
print(temperatures[0]) # 21 (counting starts at 0)
print(len(temperatures)) # 5 (how many values)
Lists are perfect for a single column of data: five temperatures, ten scores, a hundred names. You can already summarise a list of numbers with built-in tools:
temperatures = [21, 23, 19, 25, 22]
print("Highest:", max(temperatures))
print("Lowest: ", min(temperatures))
print("Total: ", sum(temperatures))
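The same built-ins combine to give an average. Here is a minimal sketch that reuses the temperatures list from above; the variable name average is just our choice:

```python
temperatures = [21, 23, 19, 25, 22]

# The mean is the total divided by the number of values
average = sum(temperatures) / len(temperatures)
print("Average:", average)  # Average: 22.0
```

Dividing with / always gives a decimal number in Python 3, which is why the result prints as 22.0 rather than 22.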
A dictionary: one labelled record
A list is great until each "thing" has several pieces of information. A weather reading isn't just a temperature: it has a day, a temperature and maybe rainfall. For that we use a dictionary, which stores labelled values as key: value pairs:
monday = {"day": "Monday", "temp": 21, "rain": 4}
print(monday["temp"]) # 21
print(monday["day"]) # Monday
You read a value by its label in square brackets. One dictionary describes one record: one row of data.
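Square brackets also let you add or change a value. The sketch below shows that, plus a safe way to read a label that might be missing. The "wind" and "humidity" fields are made up for illustration and are not part of the lesson's dataset:

```python
monday = {"day": "Monday", "temp": 21, "rain": 4}

# Assigning to a key adds it if it is new, or overwrites it if it exists
monday["wind"] = 12   # "wind" is a hypothetical extra field
monday["temp"] = 22   # update an existing value

# Reading a missing key with [] raises a KeyError;
# .get() returns a fallback value instead
humidity = monday.get("humidity", "not recorded")
print(humidity)        # not recorded
print(monday["temp"])  # 22
```

Reaching for .get() is handy when you cannot be sure every record has every field.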
A list of dictionaries: a whole table
Now combine them. Put one dictionary per row inside a list, and you have the everyday shape of a data table:
week = [
    {"day": "Monday", "temp": 21, "rain": 4},
    {"day": "Tuesday", "temp": 23, "rain": 0},
    {"day": "Wednesday", "temp": 19, "rain": 9},
    {"day": "Thursday", "temp": 25, "rain": 0},
    {"day": "Friday", "temp": 22, "rain": 2},
]
Each item in week is a dictionary (a row); each dictionary has the same keys (the columns). This list of dictionaries is one of the most useful patterns in all of programming.
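To pick out a single "cell" of the table, combine the two kinds of square brackets: a position for the row, then a label for the column. A short sketch using the same week data:

```python
week = [
    {"day": "Monday", "temp": 21, "rain": 4},
    {"day": "Tuesday", "temp": 23, "rain": 0},
    {"day": "Wednesday", "temp": 19, "rain": 9},
    {"day": "Thursday", "temp": 25, "rain": 0},
    {"day": "Friday", "temp": 22, "rain": 2},
]

# week[1] is the second row (counting starts at 0); ["temp"] picks one column
print(week[1]["temp"])  # 23

# Negative positions count from the end of the list
print(week[-1]["day"])  # Friday

# len() of the outer list is the number of rows
print(len(week))        # 5
```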
Looping over the rows
To read every record, loop over the list with for. Each time round, row is one dictionary:
for row in week:
    print(row["day"], "was", row["temp"], "degrees")
That prints one tidy line per day. Inside the loop you can do anything you like with row["temp"], row["rain"] and so on.
Filtering: keeping only what you want
Filtering means choosing the rows that match a condition. Add an if inside the loop:
print("Dry days:")
for row in week:
    if row["rain"] == 0:
        print(" ", row["day"])
This prints only the days with zero rainfall. Change the condition to ask different questions: row["temp"] > 22 finds the warm days. Filtering is just looping plus an if decision.
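Printing is fine for a quick look, but often you want to keep the matching rows so you can use them again. One way to do that, sketched here with an illustrative list name warm_days, is to append each match to a new list:

```python
week = [
    {"day": "Monday", "temp": 21, "rain": 4},
    {"day": "Tuesday", "temp": 23, "rain": 0},
    {"day": "Wednesday", "temp": 19, "rain": 9},
    {"day": "Thursday", "temp": 25, "rain": 0},
    {"day": "Friday", "temp": 22, "rain": 2},
]

# Start with an empty list and add every row that passes the test
warm_days = []
for row in week:
    if row["temp"] > 22:
        warm_days.append(row)

# warm_days is itself a list of dictionaries, so it loops the same way
for row in warm_days:
    print(row["day"], row["temp"])
# Tuesday 23
# Thursday 25
```

The original week list is untouched, so you can run as many different filters over it as you like.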
Summarising: turning rows into numbers
Often you don't want the rows β you want a single answer, like the average temperature. Collect the values you care about into a list, then summarise:
temps = []
for row in week:
    temps.append(row["temp"])
average = sum(temps) / len(temps)
print(f"Average temperature: {average:.1f} degrees")
print("Warmest day temp: ", max(temps))
We build a fresh list of just the temperatures, then sum() / len() gives the mean and max() the warmest value.
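Filtering and summarising combine naturally. The sketch below averages the temperature on rainy days only. The names rainy_temps and rainy_average are ours, and the check before dividing is there because sum() / len() on an empty list would stop the program with a ZeroDivisionError:

```python
week = [
    {"day": "Monday", "temp": 21, "rain": 4},
    {"day": "Tuesday", "temp": 23, "rain": 0},
    {"day": "Wednesday", "temp": 19, "rain": 9},
    {"day": "Thursday", "temp": 25, "rain": 0},
    {"day": "Friday", "temp": 22, "rain": 2},
]

# Filter and collect in one pass: temperatures of rainy days only
rainy_temps = []
for row in week:
    if row["rain"] > 0:
        rainy_temps.append(row["temp"])

# Only divide if at least one row matched the filter
if len(rainy_temps) > 0:
    rainy_average = sum(rainy_temps) / len(rainy_temps)
    print(f"Average on rainy days: {rainy_average:.1f} degrees")
else:
    print("No rainy days this week")
# Average on rainy days: 20.7 degrees
```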
Worked example: a mini data report
Let's put it all together: a program that loads the week, then prints a short report with the average temperature, the warmest day, and how many days were dry.
week = [
    {"day": "Monday", "temp": 21, "rain": 4},
    {"day": "Tuesday", "temp": 23, "rain": 0},
    {"day": "Wednesday", "temp": 19, "rain": 9},
    {"day": "Thursday", "temp": 25, "rain": 0},
    {"day": "Friday", "temp": 22, "rain": 2},
]

# 1. Collect the temperatures
temps = [row["temp"] for row in week]

# 2. Average
average = sum(temps) / len(temps)

# 3. Find the warmest day (its whole record)
warmest = week[0]
for row in week:
    if row["temp"] > warmest["temp"]:
        warmest = row

# 4. Count the dry days
dry_days = 0
for row in week:
    if row["rain"] == 0:
        dry_days += 1

print(f"Average temperature: {average:.1f} degrees")
print(f"Warmest day: {warmest['day']} ({warmest['temp']} degrees)")
print(f"Dry days: {dry_days} out of {len(week)}")
Output:
Average temperature: 22.0 degrees
Warmest day: Thursday (25 degrees)
Dry days: 2 out of 5
How it works, step by step:
- [row["temp"] for row in week] is a list comprehension, a short way to build the list of temperatures.
- sum(temps) / len(temps) is the average.
- To find the warmest day we keep a warmest record and replace it whenever we meet a hotter one. We keep the whole row so we still know its name.
- We count dry days by adding 1 each time the rain is 0.
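Once the loops feel comfortable, Python's built-ins can do the same two jobs in fewer lines. This is an alternative sketch, not a replacement for understanding the loops; temp_of is a small helper function we made up to tell max() which field to compare:

```python
week = [
    {"day": "Monday", "temp": 21, "rain": 4},
    {"day": "Tuesday", "temp": 23, "rain": 0},
    {"day": "Wednesday", "temp": 19, "rain": 9},
    {"day": "Thursday", "temp": 25, "rain": 0},
    {"day": "Friday", "temp": 22, "rain": 2},
]

def temp_of(row):
    """Return the value max() should compare for each row."""
    return row["temp"]

# max() with key= returns the whole row with the largest temperature
warmest = max(week, key=temp_of)

# A list comprehension with an if keeps only the dry rows; len() counts them
dry_days = len([row for row in week if row["rain"] == 0])

print(warmest["day"], warmest["temp"])  # Thursday 25
print(dry_days)                         # 2
```

If two days tie for warmest, max() returns the first one it meets, which matches what the loop with > does.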
Load, loop, filter, summarise: that's the heartbeat of data work.
Try it yourself
Extend the program:
- Add a rainy_days count for days where rain > 0, and print it.
- Find the coldest day's record (copy the warmest-day loop and flip the comparison).
- Add a sixth and seventh day to make a full week, and check your report still works without any other changes: proof that your code scales with the data.
- Challenge: print the days sorted from warmest to coolest. Hint: week.sort(key=lambda r: r["temp"], reverse=True) then loop and print.
Once your data lives in a file rather than your code, the next steps are reading CSV data and JSON data, but the lists and dicts you mastered here stay exactly the same. And if you're curious what computers eventually do with all this organised data, see what is data? over in the AI section.
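As a small preview, here is a sketch of how CSV text can become the same list of dictionaries using Python's standard csv module. To keep it runnable without a real file, the CSV lives in a string called csv_text (our stand-in, with made-up sample rows) and io.StringIO makes that string behave like an open file:

```python
import csv
import io

# Stand-in for a file on disk: a header line, then one line per row
csv_text = """day,temp,rain
Monday,21,4
Tuesday,23,0
Wednesday,19,9
"""

week = []
for row in csv.DictReader(io.StringIO(csv_text)):
    # Every CSV value arrives as a string, so convert the numeric columns
    week.append({
        "day": row["day"],
        "temp": int(row["temp"]),
        "rain": int(row["rain"]),
    })

print(week[0])  # {'day': 'Monday', 'temp': 21, 'rain': 4}
```

From here on, week is an ordinary list of dicts, so every loop, filter and summary from this lesson works on it unchanged.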
Quick quiz
Which structure best represents ONE record with labelled fields like name and age?
A dictionary stores labelled values, e.g. {"name": "Maya", "age": 12}, which is perfect for one record.
What is the usual shape of a small data table in Python?
Each row becomes a dictionary, and the rows are collected in a list: a list of dicts.
How do you read the score field from a row dictionary called row?
You access a dictionary value by its key in square brackets: row["score"].
Which built-in finds the largest number in a list called scores?
max(scores) returns the largest value; min() returns the smallest.
What does len(people) tell you if people is a list of dicts?
len() counts the items in the list, which is the number of rows.
FAQ
Not at the start. Plain lists and dictionaries can hold and process surprisingly large amounts of data, and learning them well makes everything else easier. Bigger projects later reach for libraries that are built on these same ideas, but the thinking is identical: rows, fields, loops and filters.
Often from a file such as a CSV or JSON file, or from the internet. In this lesson we type the data straight into the code so we can focus on the structures, but the loops and filters you learn here work exactly the same once the data is loaded from a file.