sciagent/docs/architecture-improvement-proposals.md

# Architecture improvement proposals

This document compares **this repository** (Initiative / “Profyt” style stack: `fe0` + `be0` + PostgreSQL + MinIO) against the **layered patterns and RBAC/DB practices** described in the reference write-up (Cookiecutter-style Full Stack template: *Route → Service → Repository → Database*).

It proposes concrete improvements for **frontend**, **backend**, and **database**—not to copy the template verbatim, but to **reduce risk and operational cost** as the product grows.

---

## 1. How your app is structured today (summary)

| Area | What you have | Reference template pattern |
|------|----------------|----------------------------|
| **Backend** | A large `be0/main.py` FastAPI app with many endpoints, workflow glue, and file/YAML fallbacks in one module | Small route modules, thin handlers, `services/` + `repositories/`, `schemas/`, `deps.py` |
| **Domain DB** | `be0/src/initiative_db/` (async engine, `get_session` with commit/rollback, SQLAlchemy models) | Same async SQLAlchemy idea; stronger separation of data access from HTTP |
| **Auth** | `be0/src/auth_api.py` — Argon2, JWT (HS256), roles embedded in `user_roles` + JWT payload, `decode_access_token_user_id` per route | Reusable `get_current_user`, `RoleChecker`, optional claim checks (e.g. access vs refresh) |
| **Frontend** | `fe0` React + Vite, centralized `apiClient`, `AuthContext`, permission map in `fe0/src/lib/permissions.ts` | UI gates only; real enforcement on server (you partially do this) |
| **Storage** | MinIO via `be0/src/minio/storage.py`, evidence endpoints in `main.py`, `application_artifacts` in PostgreSQL | Similar object-store + DB metadata split |
| **Migrations** | SQL files under `be0/migrations/` applied via Docker `initdb.d` | Alembic revisions + env imports for all models |

**Bottom line:** You already have strong **pieces** (async sessions, MinIO abstraction, initiative schema, evidence pipeline). The main gap is **structural**—keeping business rules, HTTP, and storage concerns **separated and testable**, and making **role and ownership rules** easy to review in one place.

---

## 2. Database improvements

### 2.1 Single source of truth for schema evolution

**Today:** `be0/migrations/*.sql` are mounted into Postgres at container init, while SQLAlchemy models in `be0/src/initiative_db/models.py` are the app’s view of the world. That can **drift** (manual SQL + model mismatch).

**Proposals:**

- Introduce **Alembic** (or another migration tool) for `be0`, targeting the same `DATABASE_URL` / `INITIATIVE_DATABASE_URL`, with:
  - `alembic/env.py` importing **all** model modules so autogenerate sees every table.
  - A one-time **baseline** migration that matches the current SQL, then all future changes via revision files only.
- Keep Docker `initdb.d` for **local bootstrap** or replace it with `alembic upgrade head` in an entrypoint—avoid maintaining two parallel migration paths in production.

### 2.2 Naming convention and metadata hygiene

**Reference lesson:** A `Base.metadata` **naming convention** makes autogenerated diffs stable across environments.

**Proposal:** Add a SQLAlchemy `MetaData(naming_convention=…)` on your `Base` in `be0/src/initiative_db/models.py` *before* the next non-trivial migration, so new constraints don’t get database-default names that differ by environment.

### 2.3 Indexes and hot paths (review pass)

**Already good:** `initiatives.case_code` is unique; ownership is by `owner_id`. Ensure **all foreign keys** used in list/detail filters have indexes (Postgres does not always index FK columns automatically in every setup—verify in migrations).

**Proposal:** One focused pass over queries in `drafts`, `submissions`, and `application_artifacts` (evidence) — add composite or partial indexes only where proven by slow query logs (avoid over-indexing).

### 2.4 Align “roles” in DB vs product language

**Observation:** `001_initiative_schema.sql` defines `user_role` enum including `applicant`, `council_member`, etc. The **frontend** `fe0/src/lib/permissions.ts` uses `admin` | `editor` | `viewer` with business descriptions (“Người nộp đơn” tied to *viewer*). This **semantic mapping** is easy to get wrong in new endpoints.

**Proposals:**

- Add a **short internal doc** (table): DB role(s) → JWT/claims → UI permission set.
- In API design, **avoid duplicating** role names in ad hoc strings; prefer **one** enumeration shared between auth issuance and checks (or a small `const` module on the server used by all routers).

### 2.5 Transaction boundaries (you’re on the right track)

`get_session` commits on success and rolls back on exception—aligned with the reference pattern. **Keep** service-level logic from calling `commit()` except where a deliberate long-lived transaction is required.

**Proposal:** As you split code out of `main.py`, ensure **one unit of work per request** unless you document exceptions (e.g. multi-step background handoff).

---

## 3. Backend improvements

### 3.1 Move from monolithic `main.py` to layered packages

**Gap:** A large `main.py` mixing workflow state, file exports, application drafts, MinIO, and compliance tools makes it hard to:

- test ownership and RBAC in isolation;
- reason about what runs on every deploy;
- onboard new contributors.

**Target shape (incremental, not big-bang):**

```
be0/src/
  api/                 # or routers/
    deps.py            # get_current_user, require_role, get_initiative_scoped, etc.
    routes/
      auth.py
      application_drafts.py
      applications.py
      evidence.py
      admin.py
  services/            # business rules: "can this user add evidence to this case?"
  repositories/        # SQL only: initiative by case_code, artifact by role, …
  schemas/             # Pydantic request/response DTOs
  initiative_db/       # keep models + engine; repositories call into models
```

**Rule (from the reference):** **Routes never execute raw SQL**; they call **services** that call **repositories**. This matches your evidence flow conceptually (you’re halfway there with `application_storage`).

### 3.2 RBAC and IDOR: centralize “who can touch this row?”

**Reference patterns worth adopting:**

- **`RoleChecker`-style** dependencies (even if not a class) so route signatures read: “requires admin” vs “any authenticated user” without re-reading 30 lines of checks.
- **Ownership (IDOR):** For every `case_id` / `initiative_id` / `application_id`, a single function e.g. `assert_initiative_access(user_id, case_code, need_write=True)` that:
  - returns **404** for cross-tenant access when appropriate (no existence leak);
  - is reused by evidence upload, draft save, and submit.

**Today** you check `owner_id` in several places; factor that into **one** service method and reuse.

### 3.3 JWT and token hygiene

**Reference:** Distinguish **access** vs **refresh** token **claims** (e.g. `type`) so a refresh token cannot be used as a Bearer on API routes.

**Proposal:** If you only issue access tokens today, document that. If you add refresh, enforce `type == "access"` in the dependency that guards APIs.

**Already good:** `JWT_SECRET` length check in dev/prod, Argon2 for passwords.

### 3.4 Reduce dual persistence in production

**Today:** Some flows can fall back to **YAML on disk** or in-memory stores when Postgres is off. That’s great for demos; in production it **doubles** test matrices.

**Proposals:**

- **Feature flag** or environment switch: `INITIATIVES_PERSISTENCE=postgres` only in prod; fail fast if misconfigured.
- **Deprecate** or strictly scope file-based application draft storage to “local dev without Docker”.

### 3.5 MinIO and evidence

**Already good:** `S3Storage`, size limits, presigned download URLs, `application_artifacts` for metadata.

**Proposals:**

- Centralize **all** MinIO key conventions and `role` names (`evidence_research`, `evidence_textbook`, `full_pdf`) in **one** server module to avoid string drift.
- Add **periodic job** (cron/worker) for orphaned objects if uploads fail mid-way—your `cleanup` module in `minio` is a start; ensure it’s scheduled in deployed environments.

### 3.6 Errors and API contract

**Reference:** Map domain errors to HTTP status in one `exception_handlers` module.

**Proposal:** Use consistent JSON error bodies (`{ "detail": "...", "code": "EVIDENCE_TOO_LARGE" }`) so the **frontend** can branch without parsing English/Vietnamese strings.

---

## 4. Frontend improvements

### 4.1 Treat server as the only authority

**Today:** `fe0/src/lib/permissions.ts` encodes a rich matrix (`ROLE_PERMISSIONS`). That’s good UX, but **must** mirror server checks. Any mismatch is a **security** issue (UI hides a button, API still open—or the reverse).

**Proposals:**

- For sensitive actions, **prefer** “call API → 403/404” over purely client `hasPermission` (e.g. hide buttons but always handle 403 on mutation).
- Optionally expose a **`GET /api/v1/auth/me/permissions` or** embed resolved permissions in JWT claims—**if** you can keep them small and non-stale. Otherwise, document that **fe** is best-effort and **be** is truth.

### 4.2 Align “applicant” / “viewer” vocabulary

The UI comments describe **applicant** behavior while the `Role` type is **`viewer`**. New contributors will be confused.

**Proposals:**

- Rename in UI copy only (`getRoleDisplayName`) or add alias `type ApplicantRole = 'viewer'` with a comment, and keep **one** canonical role in code.
- Reconcile with **SQL** enum names in documentation (see §2.4).

### 4.3 Generated types from OpenAPI

**Gap:** Hand-maintained DTOs and ad hoc `as unknown as` casts in draft save/load are brittle.

**Proposal:** Export OpenAPI from FastAPI (`/openapi.json`) and generate TypeScript types (or a thin client) in CI. At minimum, generate **enums and response shapes** for `application-drafts`, `applications`, and `evidence`.

### 4.4 Draft and file state machine

You combine **local IndexedDB**, **server JSON tabs**, and **MinIO evidence**—users need clear **feedback** (you’ve started with toasts and size limits).

**Proposals:**

- A visible **per-field status** for evidence: `local only` / `synced` / `error` (small badge next to the filename).
- **Idempotent** retry: if upload fails, one-click “retry sync” without re-selecting the file (blob still in IDB).

### 4.5 Performance and DX

- **Code-split** very large forms (`InitiativeApplicationForm`, `InitiativeReportForm`) by section for faster first paint.
- **React Query** (you use it in places) for server-backed lists with stable keys; align cache invalidation on submit/delete application.

---

## 5. Suggested prioritization (practical order)

1. **DB:** Alembic + stop dual SQL/model drift; baseline migration.
2. **Backend:** Extract `deps.py` + initiative ownership helper; use it in evidence + draft + submit routes.
3. **Backend:** Split `main.py` into `routes/` by domain (even if services are thin at first).
4. **API:** Standard error JSON + OpenAPI export + generated TS types.
5. **Frontend:** Permission doc + 403 handling + evidence sync UX (status badge / retry).
6. **Ops:** One persistence mode in prod; optional cleanup job for MinIO/DB consistency.

---

## 6. Mapping to the reference “lessons” list

| Reference lesson | Your project |
|------------------|--------------|
| Four-layer separation | **Partial** — move HTTP out of `main.py` and add services. |
| `has_role` on user model / single RBAC place | **Partial** — centralize in `deps` + one service. |
| IDOR: always scope by `user_id` in queries | **Good direction** — consolidate checks. |
| 404 on cross-owner access | **Adopt** where you still return 403 for existence leaks. |
| DB timestamps / naming / Alembic imports | **Adopt** naming + Alembic hygiene. |
| One transaction per request | **Yes** in `get_session`. |
| NullPool for workers | **N/A** until you have forked workers; then apply. |
| Don’t create admin over HTTP | **OK** if admin bootstrap is CLI or SQL—keep that. |

---

*Generated from a comparison of this codebase with the structure and ideas in the supplied `architecture-analysis.md` (vstorm / full-stack template style). Adjust priorities to your release timeline and team size.*