Python for Data Analysis
Learn how to use Python for data analysis, manipulation, and visualization. This hands-on course takes you from Python basics through NumPy, pandas, Matplotlib, and Seaborn so you can load, clean, transform, and explore real datasets.
About This Course
This Python for Data Analysis course is built for beginners who want to use code to make sense of data. You'll start with the parts of Python that matter most for working with data, set up a reproducible workflow in Jupyter, and learn the core libraries of the scientific Python stack.
From there you'll move into the practical work of analysis: creating and manipulating NumPy arrays, structuring data with pandas DataFrames, loading data from CSV, Excel, SQL, and web APIs, cleaning messy values, and grouping and aggregating to answer questions. The course finishes with a guided end-to-end project where you analyze a real dataset and communicate your findings with charts built in Matplotlib and Seaborn.
What You'll Learn
- Write clean Python using the data types and control flow needed for analysis
- Run reproducible analyses in Jupyter notebooks with the scientific Python stack
- Perform fast, vectorized numerical computation with NumPy arrays
- Structure, index, and reshape tabular data using pandas Series and DataFrames
- Load data from CSV, Excel, SQL databases, and REST APIs into pandas
- Clean real-world data by handling missing values, duplicates, and inconsistent types
- Group, aggregate, and pivot data to summarize and answer analytical questions
- Create clear, publication-quality charts with Matplotlib and Seaborn
Requirements
- A computer with internet access (Windows, macOS, or Linux)
- No prior programming experience required; Python basics are taught from scratch
- Basic computer literacy: installing software and managing files and folders
- The free Anaconda distribution or Python 3 with Jupyter (installation is covered in the course)
- No mathematics beyond basic arithmetic is needed to get started
Who This Course Is For
- Complete beginners who want to learn data analysis with Python from scratch
- Analysts and spreadsheet users ready to move beyond Excel into code
- Students and researchers who need to clean and explore datasets
- Professionals preparing for a career in data analysis or data science
- Developers in other fields who want to add data skills to their toolkit
- Anyone who wants to turn raw data into clear, evidence-based insights
Build the Python foundation you need for analysis. This module focuses on the language features that come up constantly when working with data: core data types, collections such as list and dict, control flow, and writing reusable functions.
- 1.1 Variables, Numbers, Strings, and Booleans
- 1.2 Lists, Tuples, Sets, and Dictionaries
- 1.3 Conditionals and Loops
- 1.4 Functions and Comprehensions
- 1.5 Working with Files and Modules
- 1.6 Practice: Summarizing a List of Records
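To give a flavor of what this module builds toward, here is a minimal sketch that summarizes a list of records using only a dict, a function, and a comprehension. The sample records and the helper name `total_by_key` are made up for illustration and are not taken from the course materials.

```python
# Hypothetical sample data: each record is a plain dict.
records = [
    {"name": "Ana", "city": "Lisbon", "amount": 120.0},
    {"name": "Ben", "city": "Oslo", "amount": 80.0},
    {"name": "Cleo", "city": "Lisbon", "amount": 45.5},
]


def total_by_key(rows, key, value):
    """Sum `value` for each distinct `key` found in a list of dicts."""
    totals = {}
    for row in rows:
        # dict.get supplies 0 the first time a key is seen.
        totals[row[key]] = totals.get(row[key], 0) + row[value]
    return totals


totals = total_by_key(records, "city", "amount")

# A list comprehension that keeps only the names of larger records.
large = [r["name"] for r in records if r["amount"] >= 80]

print(totals)  # {'Lisbon': 165.5, 'Oslo': 80.0}
print(large)   # ['Ana', 'Ben']
```

Later modules do this same kind of summary in one line with pandas, but writing it by hand first makes it clearer what the library is doing for you.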
Set up a professional, reproducible workflow. You'll install the stack with Anaconda, learn to work efficiently in Jupyter notebooks, manage environments, and understand how NumPy, pandas, Matplotlib, and Seaborn fit together.
- 2.1 Installing Anaconda and Managing Environments
- 2.2 Working in Jupyter Notebooks and JupyterLab
- 2.3 Cells, Markdown, and Magic Commands
- 2.4 Tour of the Stack: NumPy, pandas, Matplotlib, Seaborn
- 2.5 Practice: Your First Analysis Notebook
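Once the stack is installed, a quick check like the one below can confirm which libraries your environment can see. This is only a sketch: the `STACK` list is an assumption based on the libraries named in this course, and the check reports whether a package is importable without actually importing it.

```python
import importlib.util
import sys

# Libraries named in the course; find_spec returns None when one is missing.
STACK = ["numpy", "pandas", "matplotlib", "seaborn"]

installed = {name: importlib.util.find_spec(name) is not None for name in STACK}

print("Python", sys.version.split()[0])
for name, ok in installed.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

If a library shows as missing, it usually means the notebook is running in a different environment from the one where you installed it.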
Learn the array, the building block beneath pandas. You'll create and reshape ndarray objects, slice and index them, and replace slow Python loops with fast vectorized operations and broadcasting.
- 3.1 Creating and Inspecting ndarrays
- 3.2 Indexing, Slicing, and Reshaping
- 3.3 Vectorized Operations and Broadcasting
- 3.4 Aggregations and Boolean Masking
- 3.5 Random Numbers and Simulation Basics
- 3.6 Practice: Vectorizing a Calculation
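As a rough illustration of the ideas in this module, the sketch below shows a vectorized calculation, a broadcast across two shapes, and a boolean mask. The prices, quantities, and discounts are invented sample values.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([3, 1, 2])

# Vectorized: one expression replaces an explicit Python loop.
revenue = prices * quantities

# Broadcasting: discounts of shape (2, 1) against prices of shape (3,)
# produce a (2, 3) table, one row per discount level.
discounts = np.array([[0.0], [0.5]])
discounted = prices * (1 - discounts)

# Boolean masking keeps only the elements where the condition is True.
big = revenue[revenue > 25]

print(revenue.sum())     # 110.0
print(discounted.shape)  # (2, 3)
print(big)               # [30. 60.]
```

The same multiplication written as a Python `for` loop would give identical numbers; the vectorized form is shorter and, on large arrays, typically much faster.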
Meet the workhorse of Python data analysis. You'll build Series and DataFrame objects, select rows and columns with loc and iloc, filter with boolean conditions, and add or rename columns.
- 4.1 Series, DataFrames, and the Index
- 4.2 Selecting Data with loc and iloc
- 4.3 Boolean Filtering and Conditional Selection
- 4.4 Adding, Renaming, and Dropping Columns
- 4.5 Sorting and Summary Statistics
- 4.6 Practice: Exploring a DataFrame
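The difference between `loc` and `iloc` is easiest to see on a small table. The following sketch uses a made-up three-row DataFrame with string index labels; the column names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "product": ["pen", "book", "lamp"],
        "price": [1.5, 12.0, 30.0],
        "qty": [10, 2, 1],
    },
    index=["a", "b", "c"],
)

by_label = df.loc["b", "price"]   # label-based: row "b", column "price"
by_position = df.iloc[0, 0]       # position-based: first row, first column
cheap = df[df["price"] < 20]      # boolean filter keeps rows "a" and "b"

# Add a derived column, then rename an existing one.
df["total"] = df["price"] * df["qty"]
df = df.rename(columns={"qty": "quantity"})

print(by_label)     # 12.0
print(by_position)  # pen
print(list(cheap.index))  # ['a', 'b']
```

Note that `rename` returns a new DataFrame, so the result is assigned back to `df`.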
Get data into pandas from wherever it lives and make it usable. You'll read CSV and Excel files, query SQL databases, pull JSON from REST APIs, and then handle missing values, duplicates, and inconsistent types with fillna, dropna, and type conversion.
- 5.1 Reading CSV and Excel Files
- 5.2 Querying SQL Databases with pandas
- 5.3 Fetching Data from REST APIs (JSON)
- 5.4 Handling Missing Values and Duplicates
- 5.5 Fixing Data Types and Text Cleanup
- 5.6 Practice: Cleaning a Messy Dataset
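A typical load-and-clean pass might look something like the sketch below. To keep it self-contained, the CSV is an in-memory string and the SQL database is an in-memory SQLite database; the data, the `orders` table name, and the choice of "Unknown" as a fill value are all assumptions made for illustration.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical messy CSV: a duplicate row, blank cells, and a "?" placeholder.
raw = io.StringIO(
    "id,city,amount\n"
    "1,Lisbon,120\n"
    "2,,80\n"
    "2,,80\n"
    "3,Oslo,\n"
    "4,Rome,?\n"
)

df = pd.read_csv(raw)
df = df.drop_duplicates()

# "?" makes the column non-numeric; coerce turns unparseable values into NaN.
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")

df["city"] = df["city"].fillna("Unknown")  # fill missing text values
df = df.dropna(subset=["amount"])          # drop rows with no usable amount
df["amount"] = df["amount"].astype(int)    # safe now that no NaN remains

# Round-trip through SQL: write the clean table, then query it back.
conn = sqlite3.connect(":memory:")
df.to_sql("orders", conn, index=False)
from_sql = pd.read_sql_query(
    "SELECT city, amount FROM orders WHERE amount > 100", conn
)
conn.close()

print(df)
print(from_sql)
```

Whether to fill or drop a missing value depends on the question being asked; here the city is filled because the row is still useful, while a row without an amount is not.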
Reshape and summarize data to answer questions. You'll apply functions across columns, split-apply-combine with groupby, build pivot tables, and join multiple datasets with merge and concat.
- 6.1 Transforming Columns with apply and map
- 6.2 Grouping and Aggregating with groupby
- 6.3 Pivot Tables and Crosstabs
- 6.4 Joining Data with merge and concat
- 6.5 Practice: Summarizing Sales by Category
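The three main tools in this module can be sketched on a tiny sales table. The figures, category names, and the `managers` lookup table below are invented for the example.

```python
import pandas as pd

sales = pd.DataFrame(
    {
        "category": ["toys", "toys", "books", "books"],
        "region": ["north", "south", "north", "south"],
        "revenue": [100, 150, 80, 120],
    }
)
managers = pd.DataFrame(
    {"category": ["toys", "books"], "manager": ["Kim", "Lee"]}
)

# Split-apply-combine: split by category, sum revenue, combine into a Series.
by_category = sales.groupby("category")["revenue"].sum()

# Pivot table: categories as rows, regions as columns, summed revenue as cells.
pivot = sales.pivot_table(
    index="category", columns="region", values="revenue", aggfunc="sum"
)

# Left join: keep every sales row and attach the matching manager.
merged = sales.merge(managers, on="category", how="left")

print(by_category)
print(pivot)
print(merged)
```

A left join is used here so that no sales rows are lost if a category has no manager listed; an inner join would silently drop those rows.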
Turn analysis into clear visuals. You'll build line, bar, scatter, and histogram plots with matplotlib.pyplot, create statistical charts with Seaborn, and learn to label, style, and export figures that communicate findings.
- 7.1 Matplotlib Basics: Figures and Axes
- 7.2 Line, Bar, Scatter, and Histogram Plots
- 7.3 Statistical Plots with Seaborn
- 7.4 Styling, Labeling, and Exporting Figures
- 7.5 Practice: Visualizing a Dataset
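As a small example of the figure-and-axes workflow, the sketch below draws a labeled line chart and exports it as PNG. The monthly numbers are made up, and the figure is written to an in-memory buffer rather than a file so the snippet leaves nothing on disk. The `Agg` backend line is only there so the script runs without a display; in a Jupyter notebook you would normally leave it out.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
revenue = [120, 135, 128, 160]

# One figure containing one axes; all drawing calls go through `ax`.
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(months, revenue, marker="o", label="Revenue")
ax.set_title("Monthly revenue")
ax.set_xlabel("Month")
ax.set_ylabel("Revenue (units)")
ax.legend()

# Export as PNG; pass a filename instead of a buffer to save to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
plt.close(fig)
```

Working through the `ax` object rather than the global `plt` functions makes it easier to manage several charts in one notebook.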
Bring it all together in a guided, end-to-end analysis. You'll frame a question, load and clean a real dataset, explore it with grouping and visualization, and present your findings in a polished notebook you can add to your portfolio.
- 8.1 Framing the Question and Getting the Data
- 8.2 Cleaning and Preparing the Dataset
- 8.3 Exploratory Analysis and Visualization
- 8.4 Communicating Findings in a Notebook
- 8.5 Final Project Submission and Next Steps
Dr. Elena Petrova
Data Scientist & Python Educator
About the Instructor
Dr. Elena Petrova is a data scientist with over 9 years of experience turning raw data into decisions for companies in finance, e-commerce, and healthcare analytics. She works daily with the Python data stack and has built production pipelines and dashboards on top of pandas, NumPy, and SQL.
She holds a PhD in Statistics and previously worked as a lead data analyst, where she mentored junior analysts moving from spreadsheets to Python. Over the past 5 years she has taught more than 80,000 students online, with a teaching style focused on practical, dataset-driven examples rather than abstract theory.
Elena is a regular contributor to open-source data tooling and speaks at PyData community events. Her goal in this course is simple: get you comfortable analyzing real data with Python as quickly as possible.
Priya Nair
I came in knowing only spreadsheets and finished able to clean and analyze data in pandas with confidence. The module on loading data from CSV, SQL, and APIs was exactly what my job needed. Elena explains every line of code clearly. Highly recommend for anyone moving from Excel to Python.
Marcus Lee
Solid beginner course. The NumPy and pandas sections finally made vectorization and groupby click for me. I gave 4 stars because the API section moves a little fast, but the downloadable notebooks made it easy to go back and practice. The final project is genuinely portfolio-worthy.
Sofia Alvarez
As a researcher who was drowning in messy CSV files, this course was a lifesaver. The data cleaning module alone was worth the price. I now do all my exploratory analysis in Jupyter with pandas and Seaborn instead of fighting with spreadsheets. The visualization section produced charts I actually used in a paper.
No prior programming experience is needed. The course starts with Python essentials in Module 1 and teaches everything you need to know about the language for data work before moving on to the data libraries. Basic computer literacy (installing software and managing files) is all that is required.
You'll use Python 3 with Jupyter notebooks and the libraries NumPy, pandas, Matplotlib, and Seaborn. The easiest way to get everything at once is the free Anaconda distribution, and Module 2 walks through installation step by step on Windows, macOS, and Linux. All tools used in the course are free and open source.
The course is structured over 8 weeks at roughly 5 hours per week, or about 40 hours in total across video, exercises, and the final project. Most students finish in 8 to 10 weeks. Because you have lifetime access, you can move faster or slower and revisit any module whenever you need a refresher.
No advanced math is required. Data analysis in this course relies mostly on basic arithmetic and summary statistics such as averages, counts, and percentages, all of which are explained as they come up. The focus is on using Python to load, clean, summarize, and visualize data rather than on mathematical theory.
Yes, you will receive a certificate of completion once you finish all the lessons, exercises, and the final project. You can add this certificate to your LinkedIn profile or resume to showcase your Python data analysis skills to potential employers.
Absolutely. The course includes a Q&A discussion board where you can ask questions and get help from the instructor and fellow students. Dr. Petrova typically responds within 24 to 48 hours. There is also a community Discord server where students share datasets, notebooks, and tips.
Module 8 is a guided, end-to-end analysis of a real dataset. You will:
- Frame an analytical question and obtain the data
- Load and clean the dataset with pandas
- Explore it using grouping, aggregation, and pivot tables
- Visualize key findings with Matplotlib and Seaborn
- Present your conclusions in a polished Jupyter notebook
The finished notebook makes a strong portfolio piece you can show to employers or clients.
Ready to Start Analyzing Data with Python?
Join more than 9,000 students already enrolled in this course