Python Foundations for Analytics — Interactive Guide

What Is Python Actually Doing?

Understanding the language behind your analytics

Why This Matters

You've used Python all semester for real analytics work
But what happens between import pandas as pd and your output?
Today: strip away the magic, see how the language works
Goal: read code confidently and debug without panic

Variables: Naming Things

A variable is a label attached to a value
revenue = 50000 stores the number and names it "revenue"
company = "Acme Corp" stores text and names it "company"
You already do this: df = pd.read_csv("data.csv")
df is just a name — you could call it my_data or potato
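
A minimal sketch of the "label" idea, assuming nothing beyond the slide's own examples. The second name, potato, is only there to show that a name carries no meaning for Python.

```python
# A name is a label: two names can point at the same value.
revenue = 50000
potato = revenue      # a second label for the same number

# Re-pointing one label does not change the other.
revenue = 60000

print(revenue)  # 60000
print(potato)   # 50000
```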

Types: Not All Data Is the Same

int — whole numbers: units_sold = 142
float — decimals: tax_rate = 0.065
str — text in quotes: account = "Cash"
bool — True or False: is_debit = True
The type determines what operations are allowed
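
One way to check this yourself is the built-in type(). The sketch below reuses the four example values from this slide; type_names is a name invented for the illustration.

```python
units_sold = 142
tax_rate = 0.065
account = "Cash"
is_debit = True

# type() reports which type a value has.
type_names = [type(v).__name__ for v in (units_sold, tax_rate, account, is_debit)]
print(type_names)  # ['int', 'float', 'str', 'bool']

# The same operator behaves differently depending on the type.
print(units_sold + units_sold)  # 284 (arithmetic)
print(account + account)        # CashCash (text joined end to end)
```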

Why Types Matter

python

# This crashes — you can't add text to a number
price = "29.99"
total = price + 10
# TypeError: can only concatenate str (not "int") to str

# Fix: convert the string to a number first
price = float("29.99")
total = price + 10  # Works: about 39.99 (floats carry tiny rounding error)
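
Spreadsheet exports often store numbers as text with extra symbols. The sketch below assumes a hypothetical helper, to_number, that strips a dollar sign and commas before converting. It also shows that float() raises ValueError when the text is not a number at all.

```python
def to_number(text):
    """Hypothetical helper: turn text such as '$1,250.50' into a float."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return float(cleaned)

price = to_number("29.99")
total = price + 10
print(round(total, 2))  # 39.99

# float() refuses text that is not a number.
try:
    float("twenty")
except ValueError as err:
    print(f"Could not convert: {err}")
```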

Lists: Ordered Collections

A list holds multiple values in order
accounts = ["Cash", "AR", "Revenue", "COGS"]
Access by position starting at zero:
accounts[0] returns "Cash"
accounts[2] returns "Revenue"
Zero-indexing is why pandas column positions start at 0
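
A short sketch of position-based access on the same list. Negative positions and len() are standard Python. The names first, last, and count are only for illustration.

```python
accounts = ["Cash", "AR", "Revenue", "COGS"]

first = accounts[0]     # "Cash"
last = accounts[-1]     # "COGS" (negative positions count from the end)
count = len(accounts)   # 4

# Valid positions run from 0 to len(accounts) - 1, so position 4 does not exist.
try:
    accounts[4]
except IndexError as err:
    print(f"IndexError: {err}")  # IndexError: list index out of range
```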

Dictionaries: Key-Value Lookups

A dict maps labels to values — like a chart of accounts
Use the label (key) to retrieve the value

python

account = {
    "number": 1010,
    "name": "Cash",
    "balance": 50000,
    "type": "Asset"
}
account["name"]     # "Cash"
account["balance"]  # 50000
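
Beyond lookups, a dict can be updated and scanned. In this sketch the "currency" and "owner" keys are hypothetical additions, not part of the original example.

```python
account = {"number": 1010, "name": "Cash", "balance": 50000, "type": "Asset"}

# Assigning to a key updates it, or adds it if it is new.
account["balance"] = 52500
account["currency"] = "USD"   # hypothetical extra field

# .get() returns a fallback instead of raising KeyError for a missing key.
owner = account.get("owner", "unknown")

for key, value in account.items():
    print(f"{key}: {value}")
```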

DataFrames Are Just Fancy Dicts

python

import pandas as pd

# A DataFrame can be built from a dict of lists: one key per column
data = {
    "Account": ["Cash", "AR", "Revenue"],
    "Balance": [50000, 12000, 75000]
}
df = pd.DataFrame(data)

# df["Account"] is dict-style key lookup
# That's why column names must be exact
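
The same idea can be tried without pandas. This is a plain-Python sketch of the analogy, not how pandas stores data internally. The names balances and row_1 are illustrative.

```python
data = {
    "Account": ["Cash", "AR", "Revenue"],
    "Balance": [50000, 12000, 75000],
}

# Column lookup: the key selects a whole list, much like df["Balance"].
balances = data["Balance"]

# Row lookup: take the same position from every column.
row_1 = {column: values[1] for column, values in data.items()}
print(row_1)  # {'Account': 'AR', 'Balance': 12000}

# Keys are case-sensitive, so "balance" is not "Balance".
print("balance" in data)  # False
```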

For Loops: Repeating Actions

"For each item in this collection, do something"

python

accounts = ["Cash", "AR", "Revenue"]

for account in accounts:
    print(f"Processing: {account}")

# Output:
# Processing: Cash
# Processing: AR
# Processing: Revenue

.iterrows() follows the same pattern: it hands you each DataFrame row, one at a time, as an (index, row) pair
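
Loops become useful when each pass updates something. The sketch below keeps a running total over two parallel lists. The balances are the sample figures used elsewhere in this deck, and zip() pairs the lists up position by position.

```python
accounts = ["Cash", "AR", "Revenue"]
balances = [50000, 12000, 75000]

total = 0
for account, balance in zip(accounts, balances):
    total += balance
    print(f"{account}: {balance} (running total {total})")

print(total)  # 137000
```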

If/Else: Making Decisions

python

balance = -15000

if balance > 0:
    entry_type = "Debit"
elif balance < 0:
    entry_type = "Credit"
else:
    entry_type = "Zero"

print(entry_type)  # "Credit"

You use this logic when you filter DataFrames
df[df["Balance"] > 0] is an if-check on every row
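
A plain-Python sketch of what that filter does conceptually. pandas does this far more efficiently, and the sample balances plus the names mask and positive are made up for the illustration.

```python
balances = [50000, -15000, 0, 12000]

# Step 1: one True/False answer per value, like df["Balance"] > 0.
mask = [balance > 0 for balance in balances]
print(mask)  # [True, False, False, True]

# Step 2: keep only the values whose answer is True, like df[mask].
positive = [balance for balance, keep in zip(balances, mask) if keep]
print(positive)  # [50000, 12000]
```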

Functions: Reusable Recipes

A function is a named set of instructions you can reuse

python

def calculate_tax(amount, rate=0.065):
    """Calculate sales tax for a given amount."""
    return amount * rate

tax1 = calculate_tax(1000)        # 65.0
tax2 = calculate_tax(5000, 0.08)  # 400.0

rate=0.065 is a default — used when you don't specify one
pd.read_csv() is just a function someone else wrote
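
Functions can call other functions, and arguments can be passed by name. In the sketch below, total_with_tax is a hypothetical helper built on the slide's calculate_tax.

```python
def calculate_tax(amount, rate=0.065):
    """Calculate sales tax for a given amount."""
    return amount * rate

def total_with_tax(amount, rate=0.065):
    """Hypothetical helper: amount plus its sales tax, rounded to cents."""
    return round(amount + calculate_tax(amount, rate), 2)

print(total_with_tax(1000))             # 1065.0 (default rate)
print(total_with_tax(5000, rate=0.08))  # 5400.0 (rate passed by name)
```

Passing rate=0.08 by name is the same mechanism as pd.read_csv("data.csv", sep=";").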

What import Actually Does

import pandas as pd means three things:
Find a library called pandas on this computer
Load all its functions into memory
Let me use the shortcut pd instead of pandas
pd.read_csv() calls the read_csv function from pandas
Libraries are just collections of functions someone published
You could write everything pandas does yourself — it would just take years
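
The same import patterns work with libraries that ship with Python. This sketch uses the standard statistics and math modules in place of pandas, and st is an arbitrary shortcut chosen for the example.

```python
import statistics as st   # same pattern as: import pandas as pd
from math import sqrt     # import a single function by name

balances = [50000, 12000, 75000]

average = st.mean(balances)   # call mean() through the shortcut st
root = sqrt(144)              # call sqrt() directly, no prefix needed

print(average)
print(root)  # 12.0
```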

Errors Are Your Friends

NameError: The Typo Detector

python

revenue = 75000
profit_margin = 0.15
profit = revnue * profit_margin
# NameError: name 'revnue' is not defined

# Python won't guess what you meant
# Check your spelling — that's the fix
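
To see the message without stopping the script, the error can be caught. This is a sketch for exploring errors, not a recommended way to handle typos. Recent Python versions may also print a "Did you mean" hint in the full traceback.

```python
revenue = 75000
profit_margin = 0.15

try:
    profit = revnue * profit_margin   # typo on purpose
except NameError as err:
    message = str(err)
    print(f"NameError: {message}")

profit = revenue * profit_margin      # corrected spelling
print(round(profit, 2))  # 11250.0
```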

TypeError: Mismatched Types

python

quantity = input("Enter quantity: ")  # returns "10", a string
price = 5.99
total = quantity * price
# TypeError: can't multiply sequence by non-int of type 'float'

# input() always returns a string, so convert it first
total = int(quantity) * price  # about 59.9
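
The sketch below replaces input() with a fixed string so it runs without typing anything. It contrasts the silent case (text times a whole number repeats the text) with the case that raises TypeError.

```python
quantity = "10"   # what input() would return if the user typed 10
price = 5.99

repeated = quantity * 3   # "101010": str * int repeats the text, no error
print(repeated)

try:
    quantity * price      # str * float is not allowed
except TypeError as err:
    print(f"TypeError: {err}")

total = round(int(quantity) * price, 2)
print(total)  # 59.9
```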

KeyError: Column Not Found

python

df["Revnue"]
# KeyError: 'Revnue'

# Step 1: Check what columns actually exist
print(df.columns.tolist())
# ['Revenue', 'Expenses', 'Net Income']

# Step 2: Fix the typo
df["Revenue"]  # Works
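
Step 1 can be partly automated. This sketch uses the standard difflib module to suggest the closest existing name. The columns list stands in for df.columns.tolist(), and wanted is a name invented for the example.

```python
from difflib import get_close_matches

columns = ["Revenue", "Expenses", "Net Income"]  # stand-in for df.columns.tolist()
wanted = "Revnue"

suggestions = []
if wanted not in columns:
    # Returns the most similar names, best match first.
    suggestions = get_close_matches(wanted, columns, n=1)
    print(f"{wanted!r} not found. Closest match: {suggestions}")
```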

What You Now Know

Variables name your data so you can reuse it
Types determine what operations are allowed
Lists and dicts organize multiple values
Loops repeat actions, if/else makes decisions
Functions package reusable logic
Import loads someone else's functions
Errors are messages — read them bottom-up
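
A small sketch of why bottom-up reading works: the last line of a traceback names the error type and the detail. The function load_balance is hypothetical, and the standard traceback module is used only to capture the text that Python would normally print.

```python
import traceback

def load_balance(account):
    return account["balance"]   # fails if the key is missing

try:
    load_balance({"name": "Cash"})
except KeyError:
    report = traceback.format_exc()

# The final line is the one to read first.
last_line = report.strip().splitlines()[-1]
print(last_line)  # KeyError: 'balance'
```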

Next Wednesday: Debug Lab

You will receive a broken Python script
Your job: fix it using an LLM as your assistant
For each bug you find, document:
What broke and what error you saw
What the LLM suggested
Whether the suggestion was correct
Come prepared with access to ChatGPT or Claude
