Dataclasses

Learn how to use Python's @dataclass decorator to write clean, concise classes without boilerplate.

Every class you have written so far needed an __init__ to set up instance variables — even when the setup was just storing values:

class BankAccount:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

For simple data-holding classes, this is repetitive. You write owner three times just to store one value. The @dataclass decorator fixes this — it auto-generates __init__, __repr__, __eq__, and more from your field definitions.


Without @dataclass — the repetition problem

Say you want a class to represent a bank transaction:

class Transaction:
    def __init__(self, transaction_id, owner, amount, transaction_type, timestamp):
        self.transaction_id = transaction_id
        self.owner = owner
        self.amount = amount
        self.transaction_type = transaction_type
        self.timestamp = timestamp

    def __repr__(self):
        return (
            f"Transaction(id={self.transaction_id!r}, "
            f"owner={self.owner!r}, "
            f"amount={self.amount!r}, "
            f"type={self.transaction_type!r})"
        )

    def __eq__(self, other):
        if not isinstance(other, Transaction):
            return NotImplemented
        return self.transaction_id == other.transaction_id

Every field appears multiple times. This is just boilerplate — you are not adding any real logic.


With @dataclass — clean and concise

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Transaction:
    transaction_id: str
    owner: str
    amount: float
    transaction_type: str
    timestamp: Optional[datetime] = None   # None until a timestamp is supplied

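That is it. Python generates __init__, __repr__, and __eq__ automatically, in a fraction of the code. One difference from the handwritten version: the generated __repr__ shows every field, and the generated __eq__ compares every field, not just transaction_id.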

t1 = Transaction("TXN-001", "Ahmad", 5000.0, "deposit")
t2 = Transaction("TXN-002", "Sara",  3000.0, "withdrawal")

print(t1)
# Transaction(transaction_id='TXN-001', owner='Ahmad', amount=5000.0, transaction_type='deposit', timestamp=None)

print(t1 == t2)   # False: the field values differ

What @dataclass generates

By default @dataclass automatically creates:

Method      What it does
__init__    Sets all fields as instance variables
__repr__    Shows all field names and values
__eq__      Compares all fields for equality
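
One way to see the generated methods for yourself is to inspect a small dataclass. This is a minimal sketch; the Point class is a hypothetical example and is not part of the banking code in this lesson.

```python
from dataclasses import dataclass, fields


@dataclass
class Point:  # hypothetical example class
    x: int
    y: int = 0


# The decorator attaches the generated methods directly to the class.
generated = [name for name in ("__init__", "__repr__", "__eq__") if name in vars(Point)]
print(generated)  # ['__init__', '__repr__', '__eq__']

# fields() lists the declared fields in definition order.
field_names = [f.name for f in fields(Point)]
print(field_names)  # ['x', 'y']

# The generated __eq__ compares every field, not just the first one.
print(Point(1, 2) == Point(1, 2))  # True
print(Point(1, 2) == Point(1, 3))  # False
```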

Default values

Fields with defaults must come after fields without:

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Transaction:
    transaction_id: str
    owner: str
    amount: float
    transaction_type: str
    timestamp: Optional[datetime] = None   # None until a timestamp is supplied
    is_verified: bool = False
    notes: str = ""


t = Transaction("TXN-001", "Ahmad", 5000.0, "deposit")
print(t)
# Transaction(transaction_id='TXN-001', owner='Ahmad', amount=5000.0,
# transaction_type='deposit', timestamp=None, is_verified=False, notes='')

t2 = Transaction("TXN-002", "Sara", 3000.0, "withdrawal",
                 is_verified=True, notes="salary payment")
print(t2)
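
To see why the ordering rule matters, the sketch below deliberately breaks it. BadTransaction is a hypothetical class written only to trigger the error; the exact wording of the message may vary between Python versions, so it is printed rather than quoted here.

```python
from dataclasses import dataclass

error_message = ""

try:
    @dataclass
    class BadTransaction:  # hypothetical class, intentionally wrong
        transaction_id: str = "TXN-000"
        owner: str  # no default, but it follows a field that has one
except TypeError as exc:
    # The error is raised while the class is being defined,
    # before any instance is created.
    error_message = str(exc)
    print(f"TypeError: {error_message}")
```

The rule mirrors ordinary function signatures: the generated __init__ cannot place a required parameter after an optional one.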

field() — more control over fields

Use field() from dataclasses when you need more control — default factories, excluding from repr, and more:

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class BankAccount:
    owner: str
    balance: float
    bank_name: str = "HBL Bank"
    transactions: list = field(default_factory=list)   # fresh list per instance
    created_at: datetime = field(default_factory=datetime.now)
    _account_id: str = field(init=False, repr=False)   # not in __init__ or __repr__

    def __post_init__(self):
        self._account_id = f"HBL-{id(self) % 100000:05d}"

Never use a mutable default like transactions: list = [] in a dataclass. As a plain class attribute, that same list would be shared across all instances, a classic Python bug; @dataclass guards against it by raising ValueError when the class is defined. Always use field(default_factory=list) for mutable defaults like lists and dicts.
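
The following sketch shows both halves of that advice: a list default being refused, and default_factory giving each instance its own list. SharedList and Account are hypothetical, trimmed-down classes used only for illustration.

```python
from dataclasses import dataclass, field

try:
    @dataclass
    class SharedList:  # hypothetical class, intentionally wrong
        transactions: list = []
except ValueError as exc:
    rejected = True
    print(f"Rejected: {exc}")
else:
    rejected = False


@dataclass
class Account:  # hypothetical, trimmed-down account
    owner: str
    transactions: list = field(default_factory=list)


a = Account("Ahmad")
b = Account("Sara")
a.transactions.append("deposit")

print(a.transactions)  # ['deposit']
print(b.transactions)  # []
```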

field() options:

Option             Meaning
default            A fixed default value
default_factory    A zero-argument function called to create the default; use it for lists and dicts
init=False         Exclude from __init__
repr=False         Exclude from __repr__
compare=False      Exclude from __eq__ comparisons
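
The compare=False option is the only one in this list not demonstrated above, so here is a small sketch of it. The Customer class and its last_login field are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:  # hypothetical class for illustration
    customer_id: str
    name: str
    # Bookkeeping data: ignored by __eq__ and hidden from __repr__.
    last_login: str = field(default="never", compare=False, repr=False)


c1 = Customer("C-001", "Ahmad", last_login="2024-01-01")
c2 = Customer("C-001", "Ahmad", last_login="2024-06-30")

print(c1 == c2)  # True: last_login is not part of the comparison
print(c1)        # Customer(customer_id='C-001', name='Ahmad')
```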

__post_init__ — run code after init

__post_init__ runs automatically after the generated __init__. Use it for validation or computed fields:

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Transaction:
    transaction_id: str
    owner: str
    amount: float
    transaction_type: str
    timestamp: datetime = field(default_factory=datetime.now)
    is_verified: bool = False

    def __post_init__(self):
        # validation
        if self.amount <= 0:
            raise ValueError(f"Transaction amount must be positive. Got {self.amount}")
        if self.transaction_type not in ("deposit", "withdrawal", "transfer"):
            raise ValueError(f"Invalid transaction type: {self.transaction_type}")
        # normalize
        self.owner = self.owner.strip().title()
        self.transaction_id = self.transaction_id.upper()


# valid transaction
t1 = Transaction("txn-001", "  ahmad  ", 5000.0, "deposit")
print(t1.owner)            # Ahmad — normalized
print(t1.transaction_id)   # TXN-001 — normalized

# invalid — raises ValueError
t2 = Transaction("txn-002", "Sara", -500.0, "deposit")
# ValueError: Transaction amount must be positive. Got -500.0
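
The example above covers validation and normalization. For the computed-field case, a field can be declared with init=False and filled in by __post_init__. This is a minimal sketch; the Transfer class and its flat fee are hypothetical and chosen only to keep the arithmetic obvious.

```python
from dataclasses import dataclass, field


@dataclass
class Transfer:  # hypothetical class for illustration
    amount: float
    fee: float = 25.0                 # assumed flat fee, purely illustrative
    total: float = field(init=False)  # computed, so not an __init__ parameter

    def __post_init__(self):
        if self.amount <= 0:
            raise ValueError(f"Transfer amount must be positive. Got {self.amount}")
        # Derive the field from the values __init__ has already stored.
        self.total = self.amount + self.fee


transfer = Transfer(1000.0)
print(transfer.total)  # 1025.0

try:
    Transfer(-5.0)
except ValueError as exc:
    print(f"Rejected: {exc}")
```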

frozen=True — immutable dataclass

Set frozen=True to make all fields read-only after creation. Perfect for data that should never change — like a completed transaction record:

from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class TransactionRecord:
    transaction_id: str
    owner: str
    amount: float
    transaction_type: str
    timestamp: datetime


record = TransactionRecord(
    "TXN-001", "Ahmad", 5000.0, "deposit", datetime.now()
)

print(record.amount)   # 5000.0

# trying to change it raises FrozenInstanceError
record.amount = 9999   # FrozenInstanceError: cannot assign to field 'amount'

Frozen dataclasses are also hashable, provided every field holds a hashable value, so you can use them as dictionary keys or put them in sets:

records = {record}   # works — frozen dataclass is hashable
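
When a frozen record does need a correction, one option is to build a new instance with dataclasses.replace() rather than mutating the old one. The sketch below uses a shortened, hypothetical Record class in place of TransactionRecord and catches the error so the script keeps running.

```python
from dataclasses import dataclass, FrozenInstanceError, replace


@dataclass(frozen=True)
class Record:  # hypothetical, shortened stand-in for TransactionRecord
    transaction_id: str
    amount: float


record = Record("TXN-001", 5000.0)

try:
    record.amount = 9999
except FrozenInstanceError as exc:
    print(f"Blocked: {exc}")

# replace() returns a new instance; the original is left untouched.
corrected = replace(record, amount=5500.0)
print(record.amount)     # 5000.0
print(corrected.amount)  # 5500.0

# Equal frozen instances hash the same, so the set keeps only one of them.
unique = {record, Record("TXN-001", 5000.0), corrected}
print(len(unique))  # 2
```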

order=True — sorting support

Set order=True to generate comparison methods — __lt__, __le__, __gt__, __ge__ — based on field order:

from dataclasses import dataclass

@dataclass(order=True)
class Transaction:
    amount: float    # first field, so it is compared first
    owner: str
    transaction_type: str


t1 = Transaction(5000.0, "Ahmad", "deposit")
t2 = Transaction(3000.0, "Sara",  "withdrawal")
t3 = Transaction(8000.0, "Omar",  "deposit")

transactions = [t1, t2, t3]
transactions.sort()

for t in transactions:
    print(f"{t.owner}: {t.amount:,.0f} PKR")

Output:

Sara: 3,000 PKR
Ahmad: 5,000 PKR
Omar: 8,000 PKR

Comparison is based on field order — the first field is compared first, then the second if the first is equal, and so on. If you want to sort by a specific field, use sorted(transactions, key=lambda t: t.amount) instead of relying on order=True.
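
The tie-breaking behaviour is easiest to see with two transactions that share an amount. This sketch reuses the Transaction class from above with different sample data, and uses operator.attrgetter as an alternative to the lambda key.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass(order=True)
class Transaction:
    amount: float
    owner: str
    transaction_type: str


t1 = Transaction(5000.0, "Sara", "deposit")
t2 = Transaction(5000.0, "Ahmad", "withdrawal")
t3 = Transaction(3000.0, "Omar", "deposit")

# Equal amounts: the tie is broken by the next field, owner.
print(t2 < t1)  # True, because "Ahmad" < "Sara"

by_fields = [t.owner for t in sorted([t1, t2, t3])]
print(by_fields)  # ['Omar', 'Ahmad', 'Sara']

# An explicit key sorts by one field only and does not need order=True.
by_owner = [t.owner for t in sorted([t1, t2, t3], key=attrgetter("owner"))]
print(by_owner)  # ['Ahmad', 'Omar', 'Sara']
```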


A complete BankAccount using dataclasses

Let us rebuild the BankAccount and Transaction classes using dataclasses — clean, modern Python:

from dataclasses import dataclass, field
from datetime import datetime
from typing import Literal


@dataclass(frozen=True)
class Transaction:
    """An immutable record of a single transaction."""
    transaction_id: str
    owner: str
    amount: float
    transaction_type: Literal["deposit", "withdrawal"]
    timestamp: datetime = field(default_factory=datetime.now)

    def __post_init__(self):
        if self.amount <= 0:
            raise ValueError("Transaction amount must be positive.")

    def __str__(self):
        symbol = "+" if self.transaction_type == "deposit" else "-"
        time_str = self.timestamp.strftime("%Y-%m-%d %H:%M")
        return f"[{time_str}] {symbol} {self.amount:,.0f} PKR  ({self.transaction_type})"


@dataclass
class BankAccount:
    """A bank account with full transaction history."""
    owner: str
    balance: float = 0.0
    bank_name: str = "HBL Bank"
    transactions: list[Transaction] = field(default_factory=list)
    created_at: datetime = field(default_factory=datetime.now)
    _account_number: str = field(init=False, repr=False, compare=False)
    _transaction_counter: int = field(init=False, repr=False, compare=False, default=0)

    def __post_init__(self):
        if self.balance < 0:
            raise ValueError("Opening balance cannot be negative.")
        self.owner = self.owner.strip().title()
        self._account_number = f"HBL-{id(self) % 100000:05d}"

    def _new_transaction_id(self) -> str:
        self._transaction_counter += 1
        return f"TXN-{self._transaction_counter:04d}"

    def deposit(self, amount: float) -> None:
        if amount <= 0:
            print("Deposit amount must be positive.")
            return
        self.balance += amount
        txn = Transaction(
            transaction_id=self._new_transaction_id(),
            owner=self.owner,
            amount=amount,
            transaction_type="deposit"
        )
        self.transactions.append(txn)
        print(f"Deposited {amount:,.0f} PKR → Balance: {self.balance:,.0f} PKR")

    def withdraw(self, amount: float) -> None:
        if amount <= 0:
            print("Withdrawal amount must be positive.")
            return
        if amount > self.balance:
            print(f"Insufficient balance. Available: {self.balance:,.0f} PKR")
            return
        self.balance -= amount
        txn = Transaction(
            transaction_id=self._new_transaction_id(),
            owner=self.owner,
            amount=amount,
            transaction_type="withdrawal"
        )
        self.transactions.append(txn)
        print(f"Withdrawn {amount:,.0f} PKR → Balance: {self.balance:,.0f} PKR")

    def statement(self) -> None:
        print(f"\n{'=' * 45}")
        print(f"  {self.bank_name} — Account Statement")
        print(f"{'=' * 45}")
        print(f"  Account:  {self._account_number}")
        print(f"  Owner:    {self.owner}")
        print(f"  Opened:   {self.created_at.strftime('%Y-%m-%d')}")
        print(f"{'─' * 45}")
        if not self.transactions:
            print("  No transactions yet.")
        else:
            for txn in self.transactions:
                print(f"  {txn}")
        print(f"{'─' * 45}")
        print(f"  Balance:  {self.balance:,.0f} PKR")
        print(f"{'=' * 45}\n")

Using it:

account1 = BankAccount("  ahmad  ", balance=10000.0)
account2 = BankAccount("Sara", balance=25000)

account1.deposit(5000)
account1.deposit(3000)
account1.withdraw(2000)
account1.withdraw(99999)   # blocked

account1.statement()

# auto-generated __repr__
print(repr(account1))

# auto-generated __eq__
print(account1 == account2)   # False — different owners and balances

# dataclass fields are still normal attributes
print(account1.owner)    # Ahmad — normalized in __post_init__
print(account1.balance)  # 16000.0
print(len(account1.transactions))  # 3

Output (the account number and timestamps will differ from run to run):

Deposited 5,000 PKR → Balance: 15,000 PKR
Deposited 3,000 PKR → Balance: 18,000 PKR
Withdrawn 2,000 PKR → Balance: 16,000 PKR
Insufficient balance. Available: 16,000 PKR

=============================================
  HBL Bank — Account Statement
=============================================
  Account:  HBL-12345
  Owner:    Ahmad
  Opened:   2026-06-02
─────────────────────────────────────────────
  [2026-06-02 14:35] + 5,000 PKR  (deposit)
  [2026-06-02 14:35] + 3,000 PKR  (deposit)
  [2026-06-02 14:35] - 2,000 PKR  (withdrawal)
─────────────────────────────────────────────
  Balance:  16,000 PKR
=============================================

BankAccount(owner='Ahmad', balance=16000.0, bank_name='HBL Bank',
transactions=[...], created_at=datetime.datetime(...))
False
Ahmad
16000.0
3

@dataclass vs regular class — when to use which

Situation                                   Use
Storing data with little logic              @dataclass
Immutable record (transaction, event)       @dataclass(frozen=True)
Need sorting by fields                      @dataclass(order=True)
Complex logic, many methods, inheritance    Regular class
Need full control over __init__             Regular class
Simple config or settings object            @dataclass

@dataclass is not a replacement for all classes — it is a tool for data-holding classes. If your class is mostly logic with a few fields, a regular class is fine. If your class is mostly fields with a little logic, @dataclass will save you a lot of boilerplate.


Summary

Feature                  How
Basic dataclass          @dataclass
Default value            name: str = "default"
Default factory          field(default_factory=list)
Exclude from __init__    field(init=False)
Exclude from __repr__    field(repr=False)
Post-init logic          def __post_init__(self):
Immutable                @dataclass(frozen=True)
Sorting support          @dataclass(order=True)
