Unit 3 · Classes, Modules & Files

File I/O: how Python reads from and writes to disk.

Reading a config file, writing a log line, streaming a million rows through a script. open() and the with statement are the foundation. Here are the four file modes, the right pattern for large files, and the encoding mistake every beginner hits exactly once.

Section · 01

open() and the four modes

open() returns a file object. The first argument is the path; the second is the mode:

open("notes.txt", "r")     # read   — fails if missing
open("notes.txt", "w")     # write  — TRUNCATES if it exists, creates if not
open("notes.txt", "a")     # append — creates if missing, never overwrites
open("notes.txt", "x")     # exclusive create — FAILS if file already exists

The big gotcha is "w". Open a file in write mode and Python immediately empties it — before you write anything. Want to add to a file? Use "a" (append). Want to be sure you don’t accidentally clobber an existing file? Use "x".
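To see the four modes side by side, here is a minimal sketch that runs inside a temporary directory, so no real file is touched. The name demo.txt and the variable names are placeholders chosen for this illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.txt"   # hypothetical file, lives only in the temp dir

    with open(path, "w") as f:      # "w" creates the file
        f.write("first\n")

    with open(path, "a") as f:      # "a" adds to the end and keeps "first"
        f.write("second\n")

    with open(path, "r") as f:
        after_append = f.read()

    with open(path, "w") as f:      # "w" again: emptied on open, before any write
        pass

    with open(path, "r") as f:
        after_truncate = f.read()

    try:
        with open(path, "x") as f:  # "x" refuses because the file already exists
            pass
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

print(repr(after_append))     # 'first\nsecond\n'
print(repr(after_truncate))   # ''
print(exclusive_failed)       # True
```

Note that the second "w" wiped the file even though the block wrote nothing.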

Text vs binary

open("notes.txt", "r")     # text mode (default) — returns strings
open("photo.jpg", "rb")    # binary mode — returns bytes
open("photo.jpg", "wb")    # binary write

Use text mode (the default) for anything you’d read in an editor — code, config, CSV, JSON, logs. Use binary mode for everything else — images, archives, audio, compiled formats.
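The difference is easiest to see by reading the same file both ways. This is a small sketch, assuming a throwaway file named greeting.txt (a made-up name) and an explicit UTF-8 encoding so the result is the same on every machine.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "greeting.txt"   # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        f.write("h\u00e9llo")           # "hello" with an accented e

    with open(path, "r", encoding="utf-8") as f:
        as_text = f.read()              # str: decoded characters

    with open(path, "rb") as f:
        as_bytes = f.read()             # bytes: the raw content on disk

print(type(as_text).__name__, len(as_text))     # str 5
print(type(as_bytes).__name__, len(as_bytes))   # bytes 6
```

Five characters, six bytes: the accented letter takes two bytes in UTF-8, which is why text mode has to decode rather than hand you the raw data.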

Section · 02

Always use the with statement

You could manage the file handle yourself:

# DON'T do this:
f = open("notes.txt", "r")
text = f.read()
f.close()

The problem: if anything raises an exception between open and close, the file handle leaks. On Windows that can lock the file. On every OS, you can blow through the limit on open handles in a long-running program.

The fix is the with statement, which guarantees the file is closed when the block exits — even on exception:

with open("notes.txt", "r") as f:
    text = f.read()

# At this point, f is already closed. No leak, no remembering to .close().

Always use with for files. Always. There’s no real downside; it’s shorter and safer.
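You can check the "even on exception" guarantee yourself. The sketch below raises an error on purpose inside the block and then inspects the file object's closed attribute; the ValueError and the variable names are only there for the demonstration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"
    path.write_text("some notes\n", encoding="utf-8")

    try:
        with open(path, "r", encoding="utf-8") as f:
            handle = f                                   # keep a reference to inspect later
            raise ValueError("simulated failure mid-read")
    except ValueError:
        pass                                             # the error still propagated out of the block

    closed_after_error = handle.closed

print(closed_after_error)   # True
```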

Section · 03

Three ways to read

1. The whole thing at once

with open("config.txt", "r") as f:
    content = f.read()        # one big string

Fine when the file is small (a few MB at most). For multi-GB files, this will eat your memory.

2. Into a list of lines

with open("config.txt", "r") as f:
    lines = f.readlines()     # list of strings, each keeping its trailing "\n" (the last line may lack one)

Same memory issue as read() — all lines are loaded at once. Convenient for small/medium files where you want random access by line index.
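If the trailing newlines get in the way, one common alternative is read().splitlines(), which drops them. A short sketch comparing the two, using made-up config lines:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.txt"
    # Sample content; note there is no newline after the last line.
    path.write_text("host=localhost\nport=8080\ndebug=true", encoding="utf-8")

    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()            # newlines kept

    with open(path, "r", encoding="utf-8") as f:
        clean = f.read().splitlines()    # newlines removed

print(lines)      # ['host=localhost\n', 'port=8080\n', 'debug=true']
print(clean[1])   # port=8080
```

Both versions still load the whole file, so the memory caveat applies to each.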

3. Stream line by line — the best default

with open("log.txt", "r") as f:
    for line in f:                   # one line at a time, never all in memory
        if "ERROR" in line:
            print(line.rstrip())     # rstrip() removes trailing whitespace, including the newline

This works on a 100-byte file or a 100-GB file, because only the current line is held in memory (Python reads the file in small buffered chunks behind the scenes). It’s also the most Pythonic approach: when in doubt, iterate over the file.
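The same pattern combines well with a generator expression when you only need a summary of the file. Here is a sketch of a small helper, count_matching_lines (a name invented for this example), that counts matching lines without ever building a list:

```python
import tempfile
from pathlib import Path

def count_matching_lines(path, needle):
    """Count lines containing `needle`, keeping only one line in memory at a time."""
    with open(path, "r", encoding="utf-8") as f:
        return sum(1 for line in f if needle in line)

with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "log.txt"
    # Sample log content for the demonstration.
    log_path.write_text(
        "INFO start\nERROR disk full\nINFO retry\nERROR timeout\n",
        encoding="utf-8",
    )
    error_count = count_matching_lines(log_path, "ERROR")

print(error_count)   # 2
```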

Section · 04

Writing

from datetime import datetime   # needed for the log-line example at the bottom

# One string per write() call
with open("output.txt", "w") as f:
    f.write("First line\n")
    f.write("Second line\n")

# Many lines at once — note: writelines does NOT add newlines for you
with open("output.txt", "w") as f:
    lines = ["one\n", "two\n", "three\n"]
    f.writelines(lines)

# Append a single log line
with open("log.txt", "a") as f:
    f.write(f"{datetime.now().isoformat()} - request handled\n")

Two things to remember:

1. Python doesn't add newlines automatically. If you want each call to
   .write() to land on a new line, include \n yourself.
2. writelines() takes an iterable of strings. It doesn't add separators
   between them either — same rule.
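When your strings don't already end in a newline, a generator expression can add one as each item is written. A minimal sketch with sample data, showing what happens with and without it:

```python
import tempfile
from pathlib import Path

names = ["one", "two", "three"]   # sample data without trailing newlines

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "output.txt"

    with open(path, "w", encoding="utf-8") as f:
        f.writelines(names)                           # nothing separates the items
    joined = path.read_text(encoding="utf-8")

    with open(path, "w", encoding="utf-8") as f:
        f.writelines(f"{name}\n" for name in names)   # append "\n" to each item
    separated = path.read_text(encoding="utf-8")

print(repr(joined))      # 'onetwothree'
print(repr(separated))   # 'one\ntwo\nthree\n'
```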

Section · 05

JSON and CSV — the standard library has you covered

You almost never want to parse JSON or CSV by hand. Python ships with both.

JSON

import json

# Read
with open("config.json", "r") as f:
    config = json.load(f)         # dict, list, etc. — whatever the file is

# Write
data = {"event": "login", "user_id": 42, "ok": True}
with open("event.json", "w") as f:
    json.dump(data, f, indent=2)  # indent makes it human-readable

# Or work with strings instead of files
text = json.dumps(data)           # dict -> JSON string
data = json.loads(text)           # JSON string -> dict

CSV

import csv

# Read — each row is a list of strings
with open("orders.csv", "r", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

# Read with column names — each row is a dict
with open("orders.csv", "r", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row["customer"], row["total"])

# Write
rows = [
    ["id", "customer", "total"],
    [1, "Ada", "49.99"],
    [2, "Ben", "120.00"],
]
with open("export.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)

The newline="" argument tells open to leave line endings alone so the csv module can handle them itself. Without it, files written on Windows get an extra blank line after every row, and newlines inside quoted fields can be misread. Include it whenever you open a file for the csv module.
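For dict-shaped rows, csv.DictWriter is the writing counterpart to csv.DictReader. The sketch below does a full round trip with newline="" on both sides; the order data, including the note column, is invented for the example.

```python
import csv
import tempfile
from pathlib import Path

# Sample rows. Every value is a string, because that is what csv reads back.
orders = [
    {"id": "1", "customer": "Ada", "note": "gift, wrap it"},
    {"id": "2", "customer": "Ben", "note": ""},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "orders.csv"

    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "customer", "note"])
        writer.writeheader()          # first row: the column names
        writer.writerows(orders)      # the comma inside "note" gets quoted automatically

    with open(path, "r", newline="", encoding="utf-8") as f:
        loaded = list(csv.DictReader(f))

print(loaded == orders)   # True
```

The comma inside the note field survives the trip, which is exactly the kind of detail that hand-rolled split(",") parsing gets wrong.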

Section · 06

The encoding mistake that bites everyone once

Text files store characters as bytes. The mapping between the two — encoding — matters. Most files today are UTF-8; some legacy files are Latin-1 or Windows-1252. Open with the wrong one and you get a UnicodeDecodeError or, worse, silently mangled characters.

# Default — relies on the system default, which differs across machines
with open("data.txt", "r") as f:
    text = f.read()

# Better — be explicit
with open("data.txt", "r", encoding="utf-8") as f:
    text = f.read()

Always specify encoding="utf-8" for text files unless you have a specific reason to use something else. UTF-8 can represent every Unicode character, so it covers text in any language; it’s the web’s default for a reason.
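Both failure modes are easy to reproduce. This sketch writes one accented word as UTF-8 and then reads it back with two wrong encodings; the sample word and variable names are just for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.txt"

    with open(path, "w", encoding="utf-8") as f:
        f.write("caf\u00e9")                 # "cafe" with an accented e

    # Wrong encoding, but no error: the two UTF-8 bytes become two wrong characters.
    with open(path, "r", encoding="latin-1") as f:
        mangled = f.read()

    # Wrong encoding that cannot represent the bytes at all: this one raises.
    try:
        with open(path, "r", encoding="ascii") as f:
            f.read()
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

print(len(mangled))     # 5, one character more than was written
print(ascii_failed)     # True
```

The Latin-1 case is the dangerous one, since the program keeps running with corrupted text.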

Handling missing files cleanly

import json
from pathlib import Path

config_path = Path("config.json")

if not config_path.exists():
    print(f"No config at {config_path}, using defaults.")
    config = {"page_size": 25, "theme": "dark"}
else:
    with config_path.open("r", encoding="utf-8") as f:
        config = json.load(f)

pathlib.Path is the modern way to work with file paths. It handles the differences between Windows and Mac/Linux (backslash vs forward slash) for you, has .exists(), .read_text(), .write_text(), and more. Get comfortable with it early — string-based paths are a legacy habit.
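As an alternative to checking .exists() first, you can attempt the read and catch FileNotFoundError, which also covers the case where the file disappears between the check and the open. A sketch using .read_text() and .write_text(); load_config and DEFAULTS are hypothetical names for this example, and the default values are the ones from the snippet above.

```python
import json
import tempfile
from pathlib import Path

DEFAULTS = {"page_size": 25, "theme": "dark"}

def load_config(path):
    """Return the parsed config, or a copy of DEFAULTS if the file is missing."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return dict(DEFAULTS)

with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "settings" / "config.json"   # "/" joins path parts

    missing = load_config(config_path)                     # nothing on disk yet

    config_path.parent.mkdir(parents=True)                 # create the settings folder
    config_path.write_text(
        json.dumps({"page_size": 50, "theme": "light"}),
        encoding="utf-8",
    )
    found = load_config(config_path)

print(missing)   # {'page_size': 25, 'theme': 'dark'}
print(found)     # {'page_size': 50, 'theme': 'light'}
```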

Curriculum source

Lesson content is original to YorkSims. Topic structure aligns with Python for Everybody by Dr. Charles R. Severance (py4e.com), licensed under Creative Commons Attribution 3.0 Unported.