SQL & SQLAlchemy Deep Dive
One consolidated reference for relational design, advanced querying, and SQLAlchemy 2.0 patterns. Combines SQL DDL/Advanced Queries and SQLAlchemy CRUD/Relationships/Advanced Patterns content.
Table of Contents
- Relational Foundations (DDL)
- Querying Patterns (SQL)
- SQLAlchemy 2.0 Model Design
- CRUD Patterns
- Relationships & Loading Strategies
- Advanced ORM Patterns
- Testing Database Code
- Performance & Troubleshooting
1) Relational Foundations (DDL)
- Core objects: databases, schemas, tables, views, indexes, sequences.
- Column design: types (INT, TEXT, UUID, JSONB), nullability, defaults, check constraints.
- Keys: primary keys (single or surrogate), unique constraints, foreign keys with
ON DELETE/UPDATEbehavior. - Migrations: forward-only scripts; avoid destructive changes without backups; add columns with defaults carefully; manage enum changes explicitly.
2) Querying Patterns (SQL)
- Filtering:
WHERE,IN,BETWEEN,ILIKE, null-safe comparisons (IS NULL). - Aggregation:
GROUP BY,HAVING, window functions (ROW_NUMBER,LAG,SUM OVER). - Joins: INNER/LEFT/RIGHT/FULL; prefer explicit join conditions; ensure selective predicates to avoid blowups.
- Pagination:
LIMIT/OFFSETfor basics; keyset pagination for large data. - Common patterns:
EXISTSvsIN;CASE WHEN; CTEs for readability; materialized views for heavy joins.
3) SQLAlchemy 2.0 Model Design
- Typed models:
Mapped[T]+mapped_column; dataclasses optional;__tablename__explicit. - Defaults: server_default vs python default; timestamps via
func.now();Enumwithvalidate_strings=True. - Metadata: naming conventions for constraints to keep Alembic diffs stable.
- Base patterns: declarative base in
models/__init__.py; import models inmodels/__all__.pyto register for migrations.
4) CRUD Patterns
- Create: instantiate model, add to session, flush to get PK, commit in service layer.
- Read: use
select(Model).where(...); preferscalars(); handleNoResultFound/MultipleResultsFound. - Update: load row, mutate fields, commit; for bulk, use
update()with care (skips ORM events). - Delete: soft-delete column for business-critical data; otherwise
session.delete(obj)with cascade awareness. - Transactions: context managers or explicit
session.begin(); avoid nested commits in request handlers.
5) Relationships & Loading Strategies
- Relationships:
relationshipwithback_populates; specifycascadeexplicitly;secondarytables for many-to-many. - Loading: default to
selectinloadto avoid N+1;joinedloadfor small/1-1;lazy="raise"to catch accidental lazy loads in APIs. - Cardinality patterns:
one-to-manyvia FK + backref;one-to-onevia unique FK;many-to-manyvia association table or association object with extra columns. - Ordering: set
order_byon relationships when deterministic ordering matters.
6) Advanced ORM Patterns
- Domain services: keep business logic out of models; use services for workflows.
- Repositories (optional): wrap session queries when you need abstraction; otherwise, pass session explicitly.
- Soft deletes:
deleted_atfilters via query property or helper functions; ensure unique constraints account for soft deletes. - Versioning/audit: history tables or event listeners; include actor/context columns.
- Bulk operations:
session.execute(insert(Model).values(...))for large imports; be aware of bypassed ORM events. - Schema evolution: feature flags around new columns; backfill jobs for non-nullable additions.
7) Testing Database Code
- Fixtures: session fixture with rollback per test; use nested transactions or
sessionmaker(bind=engine, expire_on_commit=False). - Factories: build test data with factory functions; avoid random data that hides determinism.
- Isolation: mark tests that hit DB; avoid sharing sessions across tests; seed reference data per test module.
- Assertions: verify both data and constraints (unique violations, FK errors); assert eager loading to avoid lazy access in APIs.
- Contract with API tests: reuse the same factories in API layer tests for realistic payloads.
8) Performance & Troubleshooting
- Explain plans: run
EXPLAIN ANALYZEon heavy queries; add indexes for filter/join columns; composite indexes ordered by selectivity. - Connection management: pool sizing (gunicorn workers × expected concurrency); timeouts for long-running queries.
- N+1 detection: enable
lazy="raise"in tests; log SQL to spot chatter; useselectinload/joinedload. - Locks & contention: keep transactions short; avoid long-running migrations during peak.
- Monitoring: metrics for query latency, errors, pool saturation; logs with query params stripped of PII.