210 lines
12 KiB
Markdown
210 lines
12 KiB
Markdown
# Architecture improvement proposals
|
||
|
||
This document compares **this repository** (Initiative / “Profyt” style stack: `fe0` + `be0` + PostgreSQL + MinIO) against the **layered patterns and RBAC/DB practices** described in the reference write-up (Cookiecutter-style Full Stack template: *Route → Service → Repository → Database*).
|
||
|
||
It proposes concrete improvements for **frontend**, **backend**, and **database**—not to copy the template verbatim, but to **reduce risk and operational cost** as the product grows.
|
||
|
||
---
|
||
|
||
## 1. How your app is structured today (summary)
|
||
|
||
| Area | What you have | Reference template pattern |
|
||
|------|----------------|----------------------------|
|
||
| **Backend** | A large `be0/main.py` FastAPI app with many endpoints, workflow glue, and file/YAML fallbacks in one module | Small route modules, thin handlers, `services/` + `repositories/`, `schemas/`, `deps.py` |
|
||
| **Domain DB** | `be0/src/initiative_db/` (async engine, `get_session` with commit/rollback, SQLAlchemy models) | Same async SQLAlchemy idea; stronger separation of data access from HTTP |
|
||
| **Auth** | `be0/src/auth_api.py` — Argon2, JWT (HS256), roles embedded in `user_roles` + JWT payload, `decode_access_token_user_id` per route | Reusable `get_current_user`, `RoleChecker`, optional claim checks (e.g. access vs refresh) |
|
||
| **Frontend** | `fe0` React + Vite, centralized `apiClient`, `AuthContext`, permission map in `fe0/src/lib/permissions.ts` | UI gates only; real enforcement on server (you partially do this) |
|
||
| **Storage** | MinIO via `be0/src/minio/storage.py`, evidence endpoints in `main.py`, `application_artifacts` in PostgreSQL | Similar object-store + DB metadata split |
|
||
| **Migrations** | SQL files under `be0/migrations/` applied via Docker `initdb.d` | Alembic revisions + env imports for all models |
|
||
|
||
**Bottom line:** You already have strong **pieces** (async sessions, MinIO abstraction, initiative schema, evidence pipeline). The main gap is **structural**—keeping business rules, HTTP, and storage concerns **separated and testable**, and making **role and ownership rules** easy to review in one place.
|
||
|
||
---
|
||
|
||
## 2. Database improvements
|
||
|
||
### 2.1 Single source of truth for schema evolution
|
||
|
||
**Today:** `be0/migrations/*.sql` are mounted into Postgres at container init, while SQLAlchemy models in `be0/src/initiative_db/models.py` are the app’s view of the world. That can **drift** (manual SQL + model mismatch).
|
||
|
||
**Proposals:**
|
||
|
||
- Introduce **Alembic** (or another migration tool) for `be0`, targeting the same `DATABASE_URL` / `INITIATIVE_DATABASE_URL`, with:
|
||
- `alembic/env.py` importing **all** model modules so autogenerate sees every table.
|
||
- A one-time **baseline** migration that matches the current SQL, then all future changes via revision files only.
|
||
- Keep Docker `initdb.d` for **local bootstrap** or replace it with `alembic upgrade head` in an entrypoint—avoid maintaining two parallel migration paths in production.
|
||
|
||
### 2.2 Naming convention and metadata hygiene
|
||
|
||
**Reference lesson:** A `Base.metadata` **naming convention** makes autogenerated diffs stable across environments.
|
||
|
||
**Proposal:** Add a SQLAlchemy `MetaData(naming_convention=…)` on your `Base` in `be0/src/initiative_db/models.py` *before* the next non-trivial migration, so new constraints don’t get database-default names that differ by environment.
|
||
|
||
### 2.3 Indexes and hot paths (review pass)
|
||
|
||
**Already good:** `initiatives.case_code` is unique; ownership is by `owner_id`. Ensure **all foreign keys** used in list/detail filters have indexes (Postgres does not always index FK columns automatically in every setup—verify in migrations).
|
||
|
||
**Proposal:** One focused pass over queries in `drafts`, `submissions`, and `application_artifacts` (evidence) — add composite or partial indexes only where proven by slow query logs (avoid over-indexing).
|
||
|
||
### 2.4 Align “roles” in DB vs product language
|
||
|
||
**Observation:** `001_initiative_schema.sql` defines `user_role` enum including `applicant`, `council_member`, etc. The **frontend** `fe0/src/lib/permissions.ts` uses `admin` | `editor` | `viewer` with business descriptions (“Người nộp đơn” tied to *viewer*). This **semantic mapping** is easy to get wrong in new endpoints.
|
||
|
||
**Proposals:**
|
||
|
||
- Add a **short internal doc** (table): DB role(s) → JWT/claims → UI permission set.
|
||
- In API design, **avoid duplicating** role names in ad hoc strings; prefer **one** enumeration shared between auth issuance and checks (or a small `const` module on the server used by all routers).
|
||
|
||
### 2.5 Transaction boundaries (you’re on the right track)
|
||
|
||
`get_session` commits on success and rolls back on exception—aligned with the reference pattern. **Keep** service-level logic from calling `commit()` except where a deliberate long-lived transaction is required.
|
||
|
||
**Proposal:** As you split code out of `main.py`, ensure **one unit of work per request** unless you document exceptions (e.g. multi-step background handoff).
|
||
|
||
---
|
||
|
||
## 3. Backend improvements
|
||
|
||
### 3.1 Move from monolithic `main.py` to layered packages
|
||
|
||
**Gap:** A large `main.py` mixing workflow state, file exports, application drafts, MinIO, and compliance tools makes it hard to:
|
||
|
||
- test ownership and RBAC in isolation;
|
||
- reason about what runs on every deploy;
|
||
- onboard new contributors.
|
||
|
||
**Target shape (incremental, not big-bang):**
|
||
|
||
```
|
||
be0/src/
|
||
api/ # or routers/
|
||
deps.py # get_current_user, require_role, get_initiative_scoped, etc.
|
||
routes/
|
||
auth.py
|
||
application_drafts.py
|
||
applications.py
|
||
evidence.py
|
||
admin.py
|
||
services/ # business rules: "can this user add evidence to this case?"
|
||
repositories/ # SQL only: initiative by case_code, artifact by role, …
|
||
schemas/ # Pydantic request/response DTOs
|
||
initiative_db/ # keep models + engine; repositories call into models
|
||
```
|
||
|
||
**Rule (from the reference):** **Routes never execute raw SQL**; they call **services** that call **repositories**. This matches your evidence flow conceptually (you’re halfway there with `application_storage`).
|
||
|
||
### 3.2 RBAC and IDOR: centralize “who can touch this row?”
|
||
|
||
**Reference patterns worth adopting:**
|
||
|
||
- **`RoleChecker`-style** dependencies (even if not a class) so route signatures read: “requires admin” vs “any authenticated user” without re-reading 30 lines of checks.
|
||
- **Ownership (IDOR):** For every `case_id` / `initiative_id` / `application_id`, a single function e.g. `assert_initiative_access(user_id, case_code, need_write=True)` that:
|
||
- returns **404** for cross-tenant access when appropriate (no existence leak);
|
||
- is reused by evidence upload, draft save, and submit.
|
||
|
||
**Today** you check `owner_id` in several places; factor that into **one** service method and reuse.
|
||
|
||
### 3.3 JWT and token hygiene
|
||
|
||
**Reference:** Distinguish **access** vs **refresh** token **claims** (e.g. `type`) so a refresh token cannot be used as a Bearer on API routes.
|
||
|
||
**Proposal:** If you only issue access tokens today, document that. If you add refresh, enforce `type == "access"` in the dependency that guards APIs.
|
||
|
||
**Already good:** `JWT_SECRET` length check in dev/prod, Argon2 for passwords.
|
||
|
||
### 3.4 Reduce dual persistence in production
|
||
|
||
**Today:** Some flows can fall back to **YAML on disk** or in-memory stores when Postgres is off. That’s great for demos; in production it **doubles** test matrices.
|
||
|
||
**Proposals:**
|
||
|
||
- **Feature flag** or environment switch: `INITIATIVES_PERSISTENCE=postgres` only in prod; fail fast if misconfigured.
|
||
- **Deprecate** or strictly scope file-based application draft storage to “local dev without Docker”.
|
||
|
||
### 3.5 MinIO and evidence
|
||
|
||
**Already good:** `S3Storage`, size limits, presigned download URLs, `application_artifacts` for metadata.
|
||
|
||
**Proposals:**
|
||
|
||
- Centralize **all** MinIO key conventions and `role` names (`evidence_research`, `evidence_textbook`, `full_pdf`) in **one** server module to avoid string drift.
|
||
- Add **periodic job** (cron/worker) for orphaned objects if uploads fail mid-way—your `cleanup` module in `minio` is a start; ensure it’s scheduled in deployed environments.
|
||
|
||
### 3.6 Errors and API contract
|
||
|
||
**Reference:** Map domain errors to HTTP status in one `exception_handlers` module.
|
||
|
||
**Proposal:** Use consistent JSON error bodies (`{ "detail": "...", "code": "EVIDENCE_TOO_LARGE" }`) so the **frontend** can branch without parsing English/Vietnamese strings.
|
||
|
||
---
|
||
|
||
## 4. Frontend improvements
|
||
|
||
### 4.1 Treat server as the only authority
|
||
|
||
**Today:** `fe0/src/lib/permissions.ts` encodes a rich matrix (`ROLE_PERMISSIONS`). That’s good UX, but **must** mirror server checks. Any mismatch is a **security** issue (UI hides a button, API still open—or the reverse).
|
||
|
||
**Proposals:**
|
||
|
||
- For sensitive actions, **prefer** “call API → 403/404” over purely client `hasPermission` (e.g. hide buttons but always handle 403 on mutation).
|
||
- Optionally expose a **`GET /api/v1/auth/me/permissions` or** embed resolved permissions in JWT claims—**if** you can keep them small and non-stale. Otherwise, document that **fe** is best-effort and **be** is truth.
|
||
|
||
### 4.2 Align “applicant” / “viewer” vocabulary
|
||
|
||
The UI comments describe **applicant** behavior while the `Role` type is **`viewer`**. New contributors will be confused.
|
||
|
||
**Proposals:**
|
||
|
||
- Rename in UI copy only (`getRoleDisplayName`) or add alias `type ApplicantRole = 'viewer'` with a comment, and keep **one** canonical role in code.
|
||
- Reconcile with **SQL** enum names in documentation (see §2.4).
|
||
|
||
### 4.3 Generated types from OpenAPI
|
||
|
||
**Gap:** Hand-maintained DTOs and ad hoc `as unknown as` casts in draft save/load are brittle.
|
||
|
||
**Proposal:** Export OpenAPI from FastAPI (`/openapi.json`) and generate TypeScript types (or a thin client) in CI. At minimum, generate **enums and response shapes** for `application-drafts`, `applications`, and `evidence`.
|
||
|
||
### 4.4 Draft and file state machine
|
||
|
||
You combine **local IndexedDB**, **server JSON tabs**, and **MinIO evidence**—users need clear **feedback** (you’ve started with toasts and size limits).
|
||
|
||
**Proposals:**
|
||
|
||
- A visible **per-field status** for evidence: `local only` / `synced` / `error` (small badge next to the filename).
|
||
- **Idempotent** retry: if upload fails, one-click “retry sync” without re-selecting the file (blob still in IDB).
|
||
|
||
### 4.5 Performance and DX
|
||
|
||
- **Code-split** very large forms (`InitiativeApplicationForm`, `InitiativeReportForm`) by section for faster first paint.
|
||
- **React Query** (you use it in places) for server-backed lists with stable keys; align cache invalidation on submit/delete application.
|
||
|
||
---
|
||
|
||
## 5. Suggested prioritization (practical order)
|
||
|
||
1. **DB:** Alembic + stop dual SQL/model drift; baseline migration.
|
||
2. **Backend:** Extract `deps.py` + initiative ownership helper; use it in evidence + draft + submit routes.
|
||
3. **Backend:** Split `main.py` into `routes/` by domain (even if services are thin at first).
|
||
4. **API:** Standard error JSON + OpenAPI export + generated TS types.
|
||
5. **Frontend:** Permission doc + 403 handling + evidence sync UX (status badge / retry).
|
||
6. **Ops:** One persistence mode in prod; optional cleanup job for MinIO/DB consistency.
|
||
|
||
---
|
||
|
||
## 6. Mapping to the reference “lessons” list
|
||
|
||
| Reference lesson | Your project |
|
||
|------------------|--------------|
|
||
| Four-layer separation | **Partial** — move HTTP out of `main.py` and add services. |
|
||
| `has_role` on user model / single RBAC place | **Partial** — centralize in `deps` + one service. |
|
||
| IDOR: always scope by `user_id` in queries | **Good direction** — consolidate checks. |
|
||
| 404 on cross-owner access | **Adopt** where you still return 403 for existence leaks. |
|
||
| DB timestamps / naming / Alembic imports | **Adopt** naming + Alembic hygiene. |
|
||
| One transaction per request | **Yes** in `get_session`. |
|
||
| NullPool for workers | **N/A** until you have forked workers; then apply. |
|
||
| Don’t create admin over HTTP | **OK** if admin bootstrap is CLI or SQL—keep that. |
|
||
|
||
---
|
||
|
||
*Generated from a comparison of this codebase with the structure and ideas in the supplied `architecture-analysis.md` (vstorm / full-stack template style). Adjust priorities to your release timeline and team size.*
|