Dataclasses
Learn how to use Python's @dataclass decorator to write clean, concise classes without boilerplate.
Every class you have written so far needed an __init__ to set up instance variables — even when the setup was just storing values:
class BankAccount:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

For simple data-holding classes, this is repetitive. You write owner three times just to store one value. The @dataclass decorator fixes this: it auto-generates __init__, __repr__, __eq__, and more from your field definitions.
Without @dataclass — the repetition problem
Say you want a class to represent a bank transaction:
class Transaction:
    def __init__(self, transaction_id, owner, amount, transaction_type, timestamp):
        self.transaction_id = transaction_id
        self.owner = owner
        self.amount = amount
        self.transaction_type = transaction_type
        self.timestamp = timestamp

    def __repr__(self):
        return (
            f"Transaction(id={self.transaction_id!r}, "
            f"owner={self.owner!r}, "
            f"amount={self.amount!r}, "
            f"type={self.transaction_type!r})"
        )

    def __eq__(self, other):
        if not isinstance(other, Transaction):
            return NotImplemented
        return self.transaction_id == other.transaction_id

Every field appears multiple times. This is just boilerplate; you are not adding any real logic.
With @dataclass — clean and concise
from dataclasses import dataclass
from datetime import datetime
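@dataclass
class Transaction:
    transaction_id: str
    owner: str
    amount: float
    transaction_type: str
    timestamp: datetime | None = None  # optional field; the union syntax needs Python 3.10+

That is it. Python generates __init__, __repr__, and __eq__ automatically, in a fraction of the code. One difference from the hand-written version: the generated __repr__ shows every field, and the generated __eq__ compares every field rather than only transaction_id.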
t1 = Transaction("TXN-001", "Ahmad", 5000.0, "deposit")
t2 = Transaction("TXN-002", "Sara", 3000.0, "withdrawal")
print(t1)
# Transaction(transaction_id='TXN-001', owner='Ahmad', amount=5000.0, transaction_type='deposit', timestamp=None)
print(t1 == t2)  # False: the field values differ

What @dataclass generates
By default @dataclass automatically creates:
| Method | What it does |
|---|---|
| __init__ | Sets all fields as instance variables |
| __repr__ | Shows all field names and values |
| __eq__ | Compares all fields for equality |
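The three generated methods in the table can be seen in action with a minimal sketch. The `Point` class below is a hypothetical example introduced only for illustration; it is not part of the banking code in this lesson.

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical example class
    x: int
    y: int


p1 = Point(1, 2)       # generated __init__ takes the fields in declaration order
p2 = Point(x=1, y=2)   # keyword arguments work as well

print(p1)                 # generated __repr__: Point(x=1, y=2)
print(p1 == p2)           # generated __eq__ compares every field: True
print(p1 == Point(1, 3))  # one field differs: False
print(p1 == (1, 2))       # an object of another type is never equal: False
```

Note that equality is based on field values, not identity: p1 and p2 are two separate objects that still compare equal.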
Default values
Fields with defaults must come after fields without:
from dataclasses import dataclass
from datetime import datetime
@dataclass
class Transaction:
    transaction_id: str
    owner: str
    amount: float
    transaction_type: str
    timestamp: datetime | None = None  # union syntax needs Python 3.10+
    is_verified: bool = False
    notes: str = ""

t = Transaction("TXN-001", "Ahmad", 5000.0, "deposit")
print(t)
# Transaction(transaction_id='TXN-001', owner='Ahmad', amount=5000.0,
#             transaction_type='deposit', timestamp=None, is_verified=False, notes='')
t2 = Transaction("TXN-002", "Sara", 3000.0, "withdrawal",
                 is_verified=True, notes="salary payment")
print(t2)

field(): more control over fields
Use field() from dataclasses when you need more control — default factories, excluding from repr, and more:
from dataclasses import dataclass, field
from datetime import datetime
@dataclass
class BankAccount:
    owner: str
    balance: float
    bank_name: str = "HBL Bank"
    transactions: list = field(default_factory=list)  # fresh list per instance
    created_at: datetime = field(default_factory=datetime.now)
    _account_id: str = field(init=False, repr=False)  # not in __init__ or __repr__

    def __post_init__(self):
        self._account_id = f"HBL-{id(self) % 100000:05d}"

Never use a mutable default like transactions: list = [] in a dataclass. In a regular class, that same list would be shared across all instances, which is a classic Python bug. @dataclass guards against it: it refuses to build the class and raises ValueError when a default is a list, dict, or set. Always use field(default_factory=list) for mutable defaults like lists and dicts.
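To make the mutable-default problem concrete, here is a minimal sketch. The class names `SharedLog`, `BadAccount`, and `GoodAccount` are hypothetical and exist only for this illustration. It shows the shared-list bug in a plain class, the error that @dataclass raises for the same pattern, and the default_factory fix.

```python
from dataclasses import dataclass, field


# A plain class with a class-level list: every instance shares the same list.
class SharedLog:  # hypothetical name
    entries = []


a, b = SharedLog(), SharedLog()
a.entries.append("deposit")
print(b.entries)  # ['deposit'], although only a was modified

# @dataclass rejects a list default when the class is defined.
try:
    @dataclass
    class BadAccount:  # hypothetical name
        owner: str
        transactions: list = []
except ValueError as exc:
    error_message = str(exc)
    print(error_message)  # the message suggests using default_factory


# default_factory calls list() once per instance, so nothing is shared.
@dataclass
class GoodAccount:  # hypothetical name
    owner: str
    transactions: list = field(default_factory=list)


g1, g2 = GoodAccount("Ahmad"), GoodAccount("Sara")
g1.transactions.append("deposit")
print(g2.transactions)  # []
```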
field() options:
| Option | Meaning |
|---|---|
| default | A fixed default value |
| default_factory | A function called to create the default, for example list or dict |
| init=False | Exclude from __init__ |
| repr=False | Exclude from __repr__ |
| compare=False | Exclude from __eq__ comparisons |
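The repr=False and compare=False options from the table can be illustrated with a short sketch. The `Transfer` class and its field names are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Transfer:  # hypothetical example class
    transfer_id: str
    amount: float
    note: str = field(default="", compare=False)  # ignored by __eq__
    pin: str = field(default="", repr=False)      # hidden from __repr__


a = Transfer("TRF-1", 100.0, note="rent", pin="1234")
b = Transfer("TRF-1", 100.0, note="groceries", pin="1234")

print(a == b)  # True: note differs, but it is excluded from comparison
print(a)       # Transfer(transfer_id='TRF-1', amount=100.0, note='rent')
```

Hiding a field from __repr__ is handy for values that should not end up in logs; the field is still a normal attribute and still takes part in comparisons unless compare=False is also set.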
__post_init__: run code after init
__post_init__ runs automatically after the generated __init__. Use it for validation or computed fields:
from dataclasses import dataclass, field
from datetime import datetime
@dataclass
class Transaction:
    transaction_id: str
    owner: str
    amount: float
    transaction_type: str
    timestamp: datetime = field(default_factory=datetime.now)
    is_verified: bool = False

    def __post_init__(self):
        # validation
        if self.amount <= 0:
            raise ValueError(f"Transaction amount must be positive. Got {self.amount}")
        if self.transaction_type not in ("deposit", "withdrawal", "transfer"):
            raise ValueError(f"Invalid transaction type: {self.transaction_type}")
        # normalize
        self.owner = self.owner.strip().title()
        self.transaction_id = self.transaction_id.upper()

# valid transaction
t1 = Transaction("txn-001", " ahmad ", 5000.0, "deposit")
print(t1.owner)           # Ahmad (normalized)
print(t1.transaction_id)  # TXN-001 (normalized)
# invalid: raises ValueError
t2 = Transaction("txn-002", "Sara", -500.0, "deposit")
# ValueError: Transaction amount must be positive. Got -500.0

frozen=True: immutable dataclass
Set frozen=True to make all fields read-only after creation. Perfect for data that should never change — like a completed transaction record:
from dataclasses import dataclass
from datetime import datetime
@dataclass(frozen=True)
class TransactionRecord:
    transaction_id: str
    owner: str
    amount: float
    transaction_type: str
    timestamp: datetime

record = TransactionRecord(
    "TXN-001", "Ahmad", 5000.0, "deposit", datetime.now()
)
print(record.amount)  # 5000.0
# trying to change it raises FrozenInstanceError
record.amount = 9999  # FrozenInstanceError: cannot assign to field 'amount'

Frozen dataclasses are also hashable, as long as every field value is itself hashable, so you can use them as dictionary keys or put them in sets:
records = {record}  # works: a frozen dataclass is hashable

order=True: sorting support
Set order=True to generate comparison methods — __lt__, __le__, __gt__, __ge__ — based on field order:
from dataclasses import dataclass
@dataclass(order=True)
class Transaction:
    amount: float  # first field, so it is compared first
    owner: str
    transaction_type: str

t1 = Transaction(5000.0, "Ahmad", "deposit")
t2 = Transaction(3000.0, "Sara", "withdrawal")
t3 = Transaction(8000.0, "Omar", "deposit")
transactions = [t1, t2, t3]
transactions.sort()
for t in transactions:
    print(f"{t.owner}: {t.amount:,.0f} PKR")  # .0f drops the trailing .0 of the float

Output:
Sara: 3,000 PKR
Ahmad: 5,000 PKR
Omar: 8,000 PKR

Comparison is based on field order: the first field is compared first, then the second if the first is equal, and so on. If you want to sort by a specific field, use sorted(transactions, key=lambda t: t.amount) instead of relying on order=True.
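The difference between field-order comparison and an explicit sort key can be sketched as follows. The example reuses the Transaction shape from above with made-up sample data, and uses operator.attrgetter from the standard library as an alternative to a lambda.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass(order=True)
class Transaction:
    amount: float
    owner: str
    transaction_type: str


transactions = [
    Transaction(5000.0, "Sara", "deposit"),
    Transaction(5000.0, "Ahmad", "deposit"),
    Transaction(3000.0, "Omar", "withdrawal"),
]

# order=True: amount is compared first, then owner breaks the tie.
by_fields = sorted(transactions)
print([t.owner for t in by_fields])  # ['Omar', 'Ahmad', 'Sara']

# Explicit key: sort by owner only, regardless of field order.
by_owner = sorted(transactions, key=attrgetter("owner"))
print([t.owner for t in by_owner])   # ['Ahmad', 'Omar', 'Sara']

# Largest amount first; equal amounts keep their original order (stable sort).
by_amount_desc = sorted(transactions, key=attrgetter("amount"), reverse=True)
print([t.owner for t in by_amount_desc])  # ['Sara', 'Ahmad', 'Omar']
```

An explicit key also works on dataclasses that do not set order=True, so it is often the simpler choice when sorting is the only reason you would enable ordering.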
A complete BankAccount using dataclasses
Let us rebuild the BankAccount and Transaction classes using dataclasses — clean, modern Python:
from dataclasses import dataclass, field
from datetime import datetime
from typing import Literal
@dataclass(frozen=True)
class Transaction:
    """An immutable record of a single transaction."""
    transaction_id: str
    owner: str
    amount: float
    transaction_type: Literal["deposit", "withdrawal"]
    timestamp: datetime = field(default_factory=datetime.now)

    def __post_init__(self):
        if self.amount <= 0:
            raise ValueError("Transaction amount must be positive.")

    def __str__(self):
        symbol = "+" if self.transaction_type == "deposit" else "-"
        time_str = self.timestamp.strftime("%Y-%m-%d %H:%M")
        return f"[{time_str}] {symbol} {self.amount:,.0f} PKR ({self.transaction_type})"


@dataclass
class BankAccount:
    """A bank account with full transaction history."""
    owner: str
    balance: float = 0.0
    bank_name: str = "HBL Bank"
    transactions: list[Transaction] = field(default_factory=list)
    created_at: datetime = field(default_factory=datetime.now)
    _account_number: str = field(init=False, repr=False, compare=False)
    _transaction_counter: int = field(init=False, repr=False, compare=False, default=0)

    def __post_init__(self):
        if self.balance < 0:
            raise ValueError("Opening balance cannot be negative.")
        self.owner = self.owner.strip().title()
        self._account_number = f"HBL-{id(self) % 100000:05d}"

    def _new_transaction_id(self) -> str:
        self._transaction_counter += 1
        return f"TXN-{self._transaction_counter:04d}"

    def deposit(self, amount: float) -> None:
        if amount <= 0:
            print("Deposit amount must be positive.")
            return
        self.balance += amount
        txn = Transaction(
            transaction_id=self._new_transaction_id(),
            owner=self.owner,
            amount=amount,
            transaction_type="deposit"
        )
        self.transactions.append(txn)
        print(f"Deposited {amount:,.0f} PKR → Balance: {self.balance:,.0f} PKR")

    def withdraw(self, amount: float) -> None:
        if amount <= 0:
            print("Withdrawal amount must be positive.")
            return
        if amount > self.balance:
            print(f"Insufficient balance. Available: {self.balance:,.0f} PKR")
            return
        self.balance -= amount
        txn = Transaction(
            transaction_id=self._new_transaction_id(),
            owner=self.owner,
            amount=amount,
            transaction_type="withdrawal"
        )
        self.transactions.append(txn)
        print(f"Withdrawn {amount:,.0f} PKR → Balance: {self.balance:,.0f} PKR")

    def statement(self) -> None:
        print(f"\n{'=' * 45}")
        print(f" {self.bank_name} — Account Statement")
        print(f"{'=' * 45}")
        print(f" Account: {self._account_number}")
        print(f" Owner: {self.owner}")
        print(f" Opened: {self.created_at.strftime('%Y-%m-%d')}")
        print(f"{'─' * 45}")
        if not self.transactions:
            print(" No transactions yet.")
        else:
            for txn in self.transactions:
                print(f" {txn}")
        print(f"{'─' * 45}")
        print(f" Balance: {self.balance:,.0f} PKR")
        print(f"{'=' * 45}\n")

Using it:
account1 = BankAccount(" ahmad ", balance=10000.0)  # the field is named balance
account2 = BankAccount("Sara", balance=25000.0)
account1.deposit(5000)
account1.deposit(3000)
account1.withdraw(2000)
account1.withdraw(99999)  # blocked
account1.statement()
# auto-generated __repr__
print(repr(account1))
# auto-generated __eq__
print(account1 == account2)  # False: different owners and balances
# dataclass fields are still normal attributes
print(account1.owner)    # Ahmad, normalized in __post_init__
print(account1.balance)  # 16000.0
print(len(account1.transactions))  # 3

Output (the account number and timestamps will differ on your machine):
Deposited 5,000 PKR → Balance: 15,000 PKR
Deposited 3,000 PKR → Balance: 18,000 PKR
Withdrawn 2,000 PKR → Balance: 16,000 PKR
Insufficient balance. Available: 16,000 PKR
=============================================
HBL Bank — Account Statement
=============================================
Account: HBL-12345
Owner: Ahmad
Opened: 2026-06-02
─────────────────────────────────────────────
[2026-06-02 14:35] + 5,000 PKR (deposit)
[2026-06-02 14:35] + 3,000 PKR (deposit)
[2026-06-02 14:35] - 2,000 PKR (withdrawal)
─────────────────────────────────────────────
Balance: 16,000 PKR
=============================================
BankAccount(owner='Ahmad', balance=16000.0, bank_name='HBL Bank',
transactions=[...], created_at=datetime.datetime(...))
False
Ahmad
16000.0
3

@dataclass vs regular class: when to use which
| Situation | Use |
|---|---|
| Storing data with little logic | @dataclass |
| Immutable record — transaction, event | @dataclass(frozen=True) |
| Need sorting by fields | @dataclass(order=True) |
| Complex logic, many methods, inheritance | Regular class |
| Need full control over __init__ | Regular class |
| Simple config or settings object | @dataclass |
@dataclass is not a replacement for all classes — it is a tool for data-holding classes. If your class is mostly logic with a few fields, a regular class is fine. If your class is mostly fields with a little logic, @dataclass will save you a lot of boilerplate.
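As one illustration of the "simple config or settings object" row, a settings class could look like the sketch below. The `BankSettings` class, its field names, and its default values are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BankSettings:  # hypothetical settings object
    bank_name: str = "HBL Bank"
    currency: str = "PKR"
    daily_withdrawal_limit: float = 50000.0  # made-up value for illustration


settings = BankSettings()                                # all defaults
custom = BankSettings(daily_withdrawal_limit=100000.0)   # override one value

print(settings)
# BankSettings(bank_name='HBL Bank', currency='PKR', daily_withdrawal_limit=50000.0)
print(custom.daily_withdrawal_limit)  # 100000.0
print(settings == custom)             # False: one field differs
```

The class is almost entirely fields, so @dataclass removes nearly all of the code you would otherwise write by hand, and frozen=True keeps the settings from being changed by accident.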
Summary
| Feature | How |
|---|---|
| Basic dataclass | @dataclass |
| Default value | field: str = "default" |
| Default factory | field(default_factory=list) |
| Exclude from __init__ | field(init=False) |
| Exclude from __repr__ | field(repr=False) |
| Post-init logic | def __post_init__(self): |
| Immutable | @dataclass(frozen=True) |
| Sorting support | @dataclass(order=True) |