Should I just split each line on commas instead of using the csv module?

It is tempting, but splitting on commas breaks the moment a value itself contains a comma, like "Smith, John" or an address. Real CSV files wrap such values in quotes, and the csv module understands those quoting rules correctly. Using import csv saves you from a whole class of subtle bugs, so prefer it over line.split(',').

Can Python open CSV files made in Excel or Google Sheets?

Yes. Spreadsheet programs can save (export) a sheet as a .csv file, which is plain text. Python's csv module reads that directly. Going the other way, a CSV your program writes will open straight back up in Excel or Sheets as a table. CSV is the universal bridge between code and spreadsheets.

Working with CSV Data in Python | Coding

Data in rows and columns

Spreadsheets are everywhere — class grades, sports stats, science measurements, sales figures. Behind a huge number of them sits one of the simplest, most useful file formats ever invented: CSV, short for Comma-Separated Values. In this lesson you'll learn to read CSV data into Python, analyse it, filter it, and write results back out. This is the doorway to real data work.

You'll get the most from this if you already know about reading and writing files in Python and feel comfortable with dictionaries. Let's open some data.

What a CSV file actually is

A CSV file is just plain text. Each line is a row; within a row, commas separate the columns. Usually the first row holds the headers (the column names). Here's a tiny file, scores.csv:

name,subject,score
Amara,Maths,88
Ben,Maths,72
Chloe,Science,95
Dev,Science,61

You could open that in Excel and see a neat table, or in a text editor and see exactly the characters above. That dual nature — readable by humans and by every program — is why CSV is so popular.

Reading with csv.reader

Python has a built-in csv module for this. The simplest tool, csv.reader, hands you each row as a list of strings:

import csv

with open("scores.csv", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

This prints lists like ['name', 'subject', 'score'], then ['Amara', 'Maths', '88'], and so on. Two details matter:

We open the file inside a with block (so it closes itself) and add newline="", which the csv module needs to handle line endings correctly across operating systems.
Every value is a string, even '88'. You can't add strings as numbers, so we'll convert when needed.

To skip the header row and total the scores, we'd index into each list with row[2] and convert:

import csv

total = 0
with open("scores.csv", newline="") as f:
    reader = csv.reader(f)
    next(reader)              # skip the header row
    for row in reader:
        total += int(row[2])  # row[2] is the score, as text
print("Total of all scores:", total)

next(reader) reads and discards the first row. int(row[2]) turns the score text into a number so we can add it.

A nicer way: csv.DictReader

Remembering that the score is row[2] is fragile — add a column and everything shifts. csv.DictReader fixes this by using the header row as keys, so each row becomes a dictionary:

import csv

with open("scores.csv", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row["name"], "got", row["score"])

Now row["score"] always means the score, no matter the column order. This is much clearer and is the approach you'll usually want. (DictReader skips the header automatically, since it turns it into the keys.)

Doing real analysis

With DictReader, calculations read almost like English. Let's find the average score and the top student:

import csv

names = []
scores = []
with open("scores.csv", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        names.append(row["name"])
        scores.append(int(row["score"]))   # convert to number

average = sum(scores) / len(scores)
print(f"Average score: {average:.1f}")

best_index = scores.index(max(scores))
print("Top student:", names[best_index])

We collect names and (converted) scores into two lists. sum(scores) / len(scores) gives the mean, formatted to one decimal place. max(scores) finds the highest value, and scores.index(...) tells us where it is so we can look up the matching name.

Writing CSV back out

After processing data you often want to save the result. csv.writer writes lists of values as rows:

import csv

rows = [
    ["name", "score", "result"],
    ["Amara", 88, "Pass"],
    ["Ben", 72, "Pass"],
    ["Dev", 61, "Pass"],
]

with open("results.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)

writerows writes every list as one CSV line. Note the "w" mode (write) and again newline="". There's also csv.DictWriter, which writes dictionaries — you give it the column names up front:

import csv

people = [
    {"name": "Amara", "score": 88},
    {"name": "Chloe", "score": 95},
]

with open("out.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "score"])
    writer.writeheader()             # writes name,score
    writer.writerows(people)

writeheader() writes the column titles, then writerows writes each dict, matching its keys to the fieldnames.

Worked example: pass/fail report

Let's build a complete program that reads scores.csv, decides who passed (60 or more), prints a summary, and writes a new CSV with a result column.

import csv

PASS_MARK = 60

def load_scores(path):
    """Return a list of {name, subject, score(int)} dicts."""
    records = []
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        for row in reader:
            records.append({
                "name": row["name"],
                "subject": row["subject"],
                "score": int(row["score"]),
            })
    return records

def add_results(records):
    """Add a 'result' key of Pass or Fail to each record."""
    for r in records:
        r["result"] = "Pass" if r["score"] >= PASS_MARK else "Fail"
    return records

def save_report(records, path):
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(
            f, fieldnames=["name", "subject", "score", "result"]
        )
        writer.writeheader()
        writer.writerows(records)

# Run the pipeline
data = load_scores("scores.csv")
data = add_results(data)

passed = sum(1 for r in data if r["result"] == "Pass")
print(f"{passed} of {len(data)} students passed.")

save_report(data, "report.csv")
print("Saved report.csv")

Walking through it:

load_scores reads the file with DictReader and converts each score to an int as it builds a clean list of dictionaries — a tidy in-memory copy of the table.
add_results loops through and tags each record "Pass" or "Fail" using a conditional expression ("Pass" if ... else "Fail").
save_report uses DictWriter to write the enriched data — now with four columns — back to a new file.
The middle section counts the passes with a generator expression (sum(1 for r in data if ...)) and prints a summary before saving.

Each function does one stage of the work: load, transform, save. That clean separation is exactly how professional data pipelines are structured.

Try it yourself

Extend the report program:

Add a function subject_average(records, subject) that returns the average score for one subject only (e.g. just "Maths"). Filter the records, convert nothing extra (scores are already ints), and divide the sum by the count.
Sort the students from highest to lowest score before saving. Hint: data.sort(key=lambda r: r["score"], reverse=True).
Add a grade column: 90+ is "A", 70+ is "B", 60+ is "C", otherwise "F". Use if / elif / else inside a loop.
Challenge: handle a missing file by catching FileNotFoundError in load_scores and returning an empty list with a friendly message.

CSV is your gateway to data analysis. When your files grow large and your calculations get heavier, the next step is dedicated libraries — see using Python libraries (modules) to discover tools built for exactly this kind of work.

Working with CSV Data in Python

Key takeaways