Pandas
Introduction
You’ll need to import pandas to get started:
import pandas as pd
Creating DataFrames
- From a dictionary
- From a list of dictionaries
- From a CSV file
- From an Excel file
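
The ways of creating a DataFrame listed above can be sketched as follows. This is a minimal illustration, not the original cheat sheet's code: the column names, values, and the in-memory CSV text are all assumptions made for the example.

```python
import io

import pandas as pd

# From a dictionary: keys become column names, values become the columns.
df_from_dict = pd.DataFrame({"col1": [1, 2, 3], "col2": ["a", "b", "c"]})

# From a list of dictionaries: each dictionary becomes one row.
df_from_records = pd.DataFrame([
    {"col1": 1, "col2": "a"},
    {"col1": 2, "col2": "b"},
])

# From CSV: read_csv accepts a path or a file-like object.
# An in-memory buffer (hypothetical data) stands in for a real file here.
csv_text = "col1,col2\n1,a\n2,b\n"
df_from_csv = pd.read_csv(io.StringIO(csv_text))
```

Reading an Excel file follows the same pattern with `pd.read_excel`, which additionally needs an Excel engine such as openpyxl to be installed.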
Inspecting Data
- First 5 rows
- Last 5 rows
- Number of rows and columns
- Info on DataFrame
- Summary statistics
- Column names
- Index
- Data types of columns
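
A short sketch of the inspection calls that usually correspond to the items above. The sample DataFrame and the variable names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data: ten rows, two columns.
df = pd.DataFrame({"col1": range(10), "col2": [x * 0.5 for x in range(10)]})

first_rows = df.head()       # first 5 rows (pass n for a different count)
last_rows = df.tail()        # last 5 rows
n_rows, n_cols = df.shape    # (number of rows, number of columns)
df.info()                    # prints index range, dtypes, and non-null counts
stats = df.describe()        # count, mean, std, min, quartiles, max per numeric column
column_names = df.columns    # Index of column labels
row_index = df.index         # row labels
column_types = df.dtypes     # dtype of each column
```

Note that `shape`, `columns`, `index`, and `dtypes` are attributes, while `head`, `tail`, `info`, and `describe` are methods.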
Selecting Data
- Select column
- Select multiple columns
- Select row by index
- Select all rows for ‘col1’
- Select row by position
- Select specific value
- Select rows based on condition
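
The selection patterns above can be illustrated roughly as follows. The index labels `x`, `y`, `z`, the data, and the threshold in the condition are assumptions chosen for the example; the original page may have used different accessors for some items.

```python
import pandas as pd

# Hypothetical data with string index labels so label- and position-based
# access are easy to tell apart.
df = pd.DataFrame(
    {"col1": [10, 20, 30], "col2": ["a", "b", "c"]},
    index=["x", "y", "z"],
)

one_column = df["col1"]              # single column, returned as a Series
two_columns = df[["col1", "col2"]]   # list of names, returned as a DataFrame
row_by_label = df.loc["y"]           # row selected by index label
col1_all_rows = df.loc[:, "col1"]    # all rows for 'col1'
row_by_position = df.iloc[0]         # row selected by integer position
single_value = df.loc["y", "col1"]   # one cell: row label, column label
filtered = df[df["col1"] > 15]       # rows where the boolean mask is True
```

`loc` is label-based and `iloc` is position-based; mixing them up is a common source of off-by-one surprises when the index is not the default `0..n-1` range.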
Data Cleaning
- Drop rows with any missing values
- Drop columns with any missing values
- Replace missing values with 0
- Drop duplicate rows
- Rename columns
- Change data type
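
One possible sketch of the cleaning steps above. The data, the new column name `value`, and the target type `int64` are assumptions; here the fill is restricted to the numeric column so that the text column is left untouched.

```python
import pandas as pd

# Hypothetical data: one missing value and one duplicated row.
df = pd.DataFrame({
    "col1": [1.0, float("nan"), 3.0, 3.0],
    "col2": ["a", "b", "c", "c"],
})

no_missing_rows = df.dropna()                   # drop rows with any missing value
no_missing_cols = df.dropna(axis=1)             # drop columns with any missing value
filled = df.fillna({"col1": 0})                 # replace missing values in col1 with 0
deduplicated = df.drop_duplicates()             # keep the first of each duplicate row
renamed = df.rename(columns={"col1": "value"})  # old name -> new name
as_int = filled.astype({"col1": "int64"})       # change dtype after filling the gaps
```

Each of these calls returns a new DataFrame and leaves `df` unchanged, so the results have to be assigned to be kept.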
Adding/Removing Data
- Add new column
- Drop column
- Add new row
- Insert new column at position 2
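
A minimal sketch of the operations above, with made-up column names and values. Since `DataFrame.append` was removed in pandas 2.0, the row is added with `pd.concat` here; the original page may have shown a different call.

```python
import pandas as pd

# Hypothetical starting data.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# Add new column: assign to a new label.
df["col3"] = df["col1"] + df["col2"]

# Drop column: returns a new DataFrame, df itself is unchanged.
without_col2 = df.drop(columns=["col2"])

# Add new row: build a one-row DataFrame and concatenate it.
new_row = pd.DataFrame([{"col1": 5, "col2": 6, "col3": 11}])
with_row = pd.concat([df, new_row], ignore_index=True)

# Insert new column at position 2 (zero-based); this modifies df in place.
df.insert(2, "inserted", [0, 0])
```

After the `insert` call the column order of `df` is `col1`, `col2`, `inserted`, `col3`.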
Combining Data
- Concatenate rows
- Concatenate columns
- Merge DataFrames on key
- Merge on different keys
- Join DataFrames
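
The combining operations above might look like this in practice. The frames `left` and `right`, the key column names `key` and `id`, and the data are all hypothetical.

```python
import pandas as pd

# Hypothetical frames that share some, but not all, key values.
left = pd.DataFrame({"key": [1, 2, 3], "a": ["x", "y", "z"]})
right = pd.DataFrame({"key": [2, 3, 4], "b": [20, 30, 40]})
right_renamed = right.rename(columns={"key": "id"})

# Concatenate rows: stack frames vertically and renumber the index.
rows = pd.concat([left, left], ignore_index=True)

# Concatenate columns: place frames side by side, aligned on the index.
cols = pd.concat([left, right], axis=1)

# Merge on a shared key column (inner join by default).
merged = pd.merge(left, right, on="key")

# Merge when the key columns have different names in each frame.
merged_diff = pd.merge(left, right_renamed, left_on="key", right_on="id")

# Join: combines on the index (left join by default).
joined = left.set_index("key").join(right.set_index("key"))
```

`merge` matches on column values while `join` matches on the index, which is why the frames are re-indexed by `key` before joining.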
Aggregating Data
- Sum of values in column
- Mean of values in column
- Count of values in column
- Minimum value in column
- Maximum value in column
- Standard deviation
- Variance
- Group by and sum
- Group by and mean
- Group by multiple columns
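
A sketch of the aggregations above on a small made-up dataset. The column names `group`, `kind`, and `value` are assumptions; the grouped examples select the numeric column explicitly so that text columns are not aggregated.

```python
import pandas as pd

# Hypothetical data: two grouping columns and one numeric column.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "kind": ["x", "y", "x", "x"],
    "value": [1, 2, 3, 4],
})

total = df["value"].sum()
average = df["value"].mean()
count = df["value"].count()     # number of non-missing values
smallest = df["value"].min()
largest = df["value"].max()
std_dev = df["value"].std()     # sample standard deviation (ddof=1 by default)
variance = df["value"].var()    # sample variance (ddof=1 by default)

# Group by one column, then aggregate the numeric column.
group_sum = df.groupby("group")["value"].sum()
group_mean = df.groupby("group")["value"].mean()

# Group by several columns: the result has a MultiIndex.
multi_sum = df.groupby(["group", "kind"])["value"].sum()
```

pandas uses the sample definitions for `std` and `var`; pass `ddof=0` for the population versions.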
Applying Functions
- Apply function to all values
- Apply function to column
- Apply function to DataFrame elements
- Map values
- Replace values
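
One way the function-application items above could be written. The lambdas, the mapping dictionary, and the data are illustrative assumptions. Element-wise application is `DataFrame.map` from pandas 2.1 onward and `DataFrame.applymap` in older releases, so the sketch picks whichever is available.

```python
import pandas as pd

# Hypothetical numeric data.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]})

# DataFrame.apply: the function receives each column as a Series.
col_range = df.apply(lambda col: col.max() - col.min())

# Series.apply: the function receives each value of one column.
doubled_col = df["col1"].apply(lambda x: x * 2)

# Element-wise over the whole DataFrame.
# DataFrame.map exists in pandas >= 2.1; applymap is the older name.
elementwise = df.map if hasattr(df, "map") else df.applymap
squared = elementwise(lambda x: x ** 2)

# Series.map with a dictionary: look up each value.
labels = df["col1"].map({1: "one", 2: "two", 3: "three"})

# replace: swap specific values wherever they occur.
replaced = df.replace({1: 100})
```

Values missing from the dictionary passed to `Series.map` become `NaN`, whereas `replace` leaves unmatched values as they are.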
Handling Dates
- Convert to datetime
- Extract year
- Extract month
- Extract day
- Set date as index
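
The date-handling steps above, sketched with a hypothetical `date` column holding ISO-formatted strings. The derived column names `year`, `month`, and `day` are assumptions.

```python
import pandas as pd

# Hypothetical data: dates stored as strings.
df = pd.DataFrame({
    "date": ["2023-01-15", "2023-02-20", "2024-03-25"],
    "value": [1, 2, 3],
})

# Convert to datetime so the .dt accessor becomes available.
df["date"] = pd.to_datetime(df["date"])

# Extract components through the .dt accessor.
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["day"] = df["date"].dt.day

# Set the date column as the index (returns a new DataFrame).
indexed = df.set_index("date")
```

With a `DatetimeIndex` in place, time-based operations such as resampling and date-string slicing become available on `indexed`.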
Input/Output
- Save DataFrame to CSV
- Load DataFrame from CSV
- Save DataFrame to Excel
- Load DataFrame from Excel
- Import SQLAlchemy for SQL operations
- Create SQL engine
- Save to SQL table
- Load from SQL table
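
A round-trip sketch for the CSV and SQL items above. To stay runnable without extra dependencies, it swaps the SQLAlchemy engine for a standard-library `sqlite3` connection, which pandas also accepts. The file name `data.csv`, the table name `my_table`, and the data are hypothetical.

```python
import os
import sqlite3
import tempfile

import pandas as pd

# Hypothetical data to write and read back.
df = pd.DataFrame({"col1": [1, 2], "col2": ["a", "b"]})

# CSV round trip inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "data.csv")
    df.to_csv(csv_path, index=False)   # index=False avoids writing the row labels
    from_csv = pd.read_csv(csv_path)

# SQL round trip. An in-memory SQLite connection stands in for a
# SQLAlchemy engine here (an assumption made to avoid the extra dependency).
conn = sqlite3.connect(":memory:")
df.to_sql("my_table", conn, index=False, if_exists="replace")
from_sql = pd.read_sql("SELECT * FROM my_table", conn)
conn.close()
```

With SQLAlchemy installed, an engine created by `sqlalchemy.create_engine` can be passed in place of `conn`. `df.to_excel` and `pd.read_excel` follow the same pattern as the CSV calls but need an Excel engine such as openpyxl.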