You've been using Python all semester, but a lot of it has felt like following a recipe. This workshop strips away the pandas magic and lets you work with the language directly — writing real code from scratch, making mistakes, and fixing them.
There are two parts. Part 1 walks you through the fundamentals by typing and running small snippets. Part 2 is a debugging challenge where you fix a broken script against real data. Start with Part 1 and move to Part 2 when you're ready.
Open a Python terminal (type python in your command prompt) or create a new .py file and run these examples. Type them yourself — don't copy and paste. Typing builds muscle memory.
A variable is just a name you attach to a value. You pick the name. Python remembers the value.
Type this:

    revenue = 75000
    company = "Acme Corp"
    tax_rate = 0.065

    print(revenue)             # 75000
    print(company)             # Acme Corp
    print(revenue * tax_rate)  # 4875.0
Create three variables: units_sold, price_per_unit, and product_name. Print the total revenue (units times price) with a message like "Total revenue for Widget: $5000".
Every value in Python has a type. The type determines what you can do with it.
Type this:

    print(type(142))     # <class 'int'>    whole number
    print(type(0.065))   # <class 'float'>  decimal
    print(type("Cash"))  # <class 'str'>    text
    print(type(True))    # <class 'bool'>   true/false

    # This breaks:
    price = "29.99"
    total = price + 10         # TypeError! (in a .py file, delete this line to continue)

    # Fix: convert the type
    total = float(price) + 10
    print(total)               # about 39.99 (floats may print 39.989999999999995)
The variable quantity = "12" is a string. Multiply it by price = 9.99 and print the result. What happens? Now fix it so you get the correct total.
A list is an ordered collection. You access items by their position, starting at zero.
Type this:

    accounts = ["Cash", "AR", "Revenue", "COGS"]

    print(accounts[0])    # Cash (first item)
    print(accounts[2])    # Revenue (third item)
    print(len(accounts))  # 4

    # Add an item
    accounts.append("Utilities")
    print(accounts)       # [..., 'Utilities']
Create a list of 5 expense categories your company might have. Print the third one. Then add a sixth and print the full list.
A dictionary maps labels (keys) to values. Think of it as a lookup table — like a chart of accounts.
Type this:

    account = {
        "number": 1010,
        "name": "Cash",
        "balance": 50000,
        "type": "Asset"
    }

    print(account["name"])     # Cash
    print(account["balance"])  # 50000

    # df["column"] in pandas uses the same square-bracket lookup by label
Create a dictionary for an invoice with keys: invoice_number, client, amount, and paid (True/False). Print a sentence like "Invoice 1041 to Acme Corp for $8500 - Paid: True".
A for loop takes a collection and runs a block of code once for each item.
Type this:

    expenses = ["Rent", "Salaries", "Utilities", "Supplies"]

    for expense in expenses:
        print(f"Processing: {expense}")

    # Output:
    # Processing: Rent
    # Processing: Salaries
    # Processing: Utilities
    # Processing: Supplies
Create a list of amounts: [3000, 12000, 450, 1200]. Write a for loop that prints each amount and keeps a running total. After the loop, print the total.
Hint: Set total = 0 before the loop. Inside the loop, add each amount to it with total = total + amount.
Python checks the conditions from top to bottom and runs only the first block whose condition is true. The else block runs when none of them are.
Type this:

    balance = -15000

    if balance > 0:
        print("Debit balance")
    elif balance < 0:
        print("Credit balance")
    else:
        print("Zero balance")

    # Output: Credit balance
Combine a loop with if/else: loop through [5000, -3000, 0, 12000, -800] and print whether each amount is a debit, credit, or zero entry.
A function is a named recipe. You define it once and call it whenever you need it. This is what pd.read_csv() is — just a function someone else wrote.
Type this:

    def calculate_tax(amount, rate=0.065):
        """Calculate sales tax for a given amount."""
        return amount * rate

    # Use the default rate
    print(calculate_tax(1000))        # 65.0

    # Override the rate
    print(calculate_tax(5000, 0.08))  # 400.0
Write a function called classify_account that takes a balance and returns "Debit", "Credit", or "Zero". Then call it on three different balances and print the results.
Here's a small program that uses everything above. Type it into a file called practice.py and run it.
practice.py

    # A mini journal entry checker

    entries = [
        {"account": "Cash", "debit": 5000, "credit": 0},
        {"account": "Revenue", "debit": 0, "credit": 5000},
        {"account": "Rent", "debit": 3000, "credit": 0},
        {"account": "Cash", "debit": 0, "credit": 3000},
    ]

    def check_balance(entries):
        total_debits = 0
        total_credits = 0

        for entry in entries:
            total_debits = total_debits + entry["debit"]
            total_credits = total_credits + entry["credit"]
            print(f" {entry['account']:<15} DR: {entry['debit']:>8,} CR: {entry['credit']:>8,}")

        print()
        print(f" Total Debits: {total_debits:>8,}")
        print(f" Total Credits: {total_credits:>8,}")

        if total_debits == total_credits:
            print(" Balanced!")
        else:
            print(f" DISCREPANCY: {abs(total_debits - total_credits):,}")

    check_balance(entries)
Add two more entries to the list (a matching debit and credit pair). Run it again to verify it still balances. Then deliberately make the debits and credits not match and see what happens.
Now that you've written Python from scratch, let's use those skills. You have a script that validates journal entries from CSV files. It's broken. Your job is to fix it, then run it against five data files. Each file introduces a different real-world data problem.
Use an LLM if you want, Google it, ask a neighbor, or just stare at it until it clicks. There's no wrong way to work through this.
Download everything into the same folder. Then run:
python validate_journal_entries.py entries_jan.csv
It will break immediately. Read the error. That's your starting point.
The script has one bug that prevents it from loading at all. Python will tell you exactly which line the problem is on. Read the error message — the answer is in it.
Hint: Python cares about whitespace. A lot.
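If you have never seen a whitespace error before, this small sketch shows what one looks like. It is separate from the workshop script: the greet function and the example.py file name are made up for illustration. It hands Python a deliberately mis-indented function and prints the line number the error reports.

```python
# A deliberately broken snippet: the body of greet() is not indented.
# (greet and "example.py" are hypothetical, not part of the workshop script.)
broken_source = "def greet():\nprint('hello')\n"

error_line = None
try:
    # compile() parses the text without running it
    compile(broken_source, "example.py", "exec")
except IndentationError as err:
    # The exception records which line Python was unhappy with
    error_line = err.lineno
    print(f"IndentationError on line {err.lineno}: {err.msg}")
```

The line number in the message is where to start looking. The real problem is usually on that line or the one just above it.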
Once you fix the script, January should run cleanly. All transactions balance, trial balance verified. Take a breath — you just debugged your first Python error.
The script crashes again. The error message mentions a column name. But the file clearly has data in it. What's different about this file compared to January?
Hint: Open both CSV files side by side. Look at the header row.
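One way to compare header rows without opening a spreadsheet is to read just the first row of each file. The sketch below uses two tiny stand-in files held in strings, and the column names are invented for illustration, so the real files may differ in some other way.

```python
import csv
import io

# Stand-in CSV text. These column names are assumptions for this example;
# check the real header rows in your own files.
january_text = "Date,Account,Debit,Credit\n"
other_text = "Date,Account Name,Debit,Credit\n"

def read_header(text):
    """Return the first row of CSV text as a list of column names."""
    reader = csv.reader(io.StringIO(text))
    return next(reader)

january_columns = read_header(january_text)
other_columns = read_header(other_text)

# Columns that appear in one header but not in the other
missing = [name for name in january_columns if name not in other_columns]
unexpected = [name for name in other_columns if name not in january_columns]

print("Missing:", missing)        # Missing: ['Account']
print("Unexpected:", unexpected)  # Unexpected: ['Account Name']
```

With real files you would pass an open file to csv.reader instead of io.StringIO.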
This one fails differently. The error message talks about types, not column names. The columns exist — the problem is what's inside them.
Hint: Open the CSV. Look at how the dollar amounts are formatted. How would Python read "$25,000.00" — as a number or as text?
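To see why formatted dollar amounts cause trouble, try the conversion on a single value in plain Python. This is only a sketch of the idea. The workshop script may raise a different error and may need the fix applied to whole columns instead of one value.

```python
raw_amount = "$25,000.00"   # text, the way it appears in the CSV cell

# Python will not guess that this text is a number
try:
    float(raw_amount)
except ValueError as err:
    print(f"Cannot convert directly: {err}")

# Remove the dollar sign and the thousands separators, then convert
cleaned = raw_amount.replace("$", "").replace(",", "")
amount = float(cleaned)

print(amount)        # 25000.0
print(type(amount))  # <class 'float'>
```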
The error says a column doesn't exist, but you can see it right there in the file. This one is tricky. The problem is something you literally cannot see by looking at the CSV normally.
Hint: Spaces can hide at the end of text. Try printing the column names with repr() to reveal hidden characters.
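Here is a short illustration of how repr() exposes characters that print() hides. The column names are hypothetical stand-ins, and the trailing space is added on purpose.

```python
# Hypothetical column names; the second one ends with a space
columns = ["Debit", "Credit "]

for name in columns:
    # print(name) looks normal; repr(name) shows the quotes and the stray space
    print(name, "->", repr(name))

# Looking up the name you expect fails until the whitespace is stripped
found_before = "Credit" in columns
cleaned = [name.strip() for name in columns]
found_after = "Credit" in cleaned

print(found_before, found_after)  # False True
```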
This file is valid accounting data, but it's structured completely differently. There are no Debit and Credit columns at all. Instead, there's a single Amount column where positive numbers are debits and negative numbers are credits.
Fixing this one requires more than a one-line patch. You need to understand what the script is doing and what the data is telling you.
Hint: You can create new columns from existing ones. Positive amounts are debits. Negative amounts (as absolute values) are credits.
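The idea in the hint can be sketched in plain Python using the same list-of-dictionaries shape as practice.py. The rows below are made-up sample data, and the sketch does not show how the workshop script itself should be changed.

```python
# Made-up rows with a single signed Amount value
rows = [
    {"Account": "Cash", "Amount": 5000},
    {"Account": "Revenue", "Amount": -5000},
]

for row in rows:
    amount = row["Amount"]
    if amount > 0:
        # Positive amounts are debits
        row["Debit"] = amount
        row["Credit"] = 0
    elif amount < 0:
        # Negative amounts are credits, stored as absolute values
        row["Debit"] = 0
        row["Credit"] = abs(amount)
    else:
        row["Debit"] = 0
        row["Credit"] = 0

total_debits = 0
total_credits = 0
for row in rows:
    total_debits = total_debits + row["Debit"]
    total_credits = total_credits + row["Credit"]

print(total_debits, total_credits)  # 5000 5000
```

Once every row has Debit and Credit values, the rest of the balancing logic can treat the data like the earlier files.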