Files
sciagent/docs/architecture-improvement-proposals.md
Thinh Lam 688fac73e9
CI/CD / backend (push) Failing after 2m8s
CI/CD / frontend (push) Failing after 1m40s
CI/CD / deploy (push) Has been skipped
sciagent code + Gitea Actions CI/CD
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 09:38:30 +07:00

210 lines
12 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Architecture improvement proposals
This document compares **this repository** (Initiative / “Profyt” style stack: `fe0` + `be0` + PostgreSQL + MinIO) against the **layered patterns and RBAC/DB practices** described in the reference write-up (Cookiecutter-style Full Stack template: *Route → Service → Repository → Database*).
It proposes concrete improvements for **frontend**, **backend**, and **database**—not to copy the template verbatim, but to **reduce risk and operational cost** as the product grows.
---
## 1. How your app is structured today (summary)
| Area | What you have | Reference template pattern |
|------|----------------|----------------------------|
| **Backend** | A large `be0/main.py` FastAPI app with many endpoints, workflow glue, and file/YAML fallbacks in one module | Small route modules, thin handlers, `services/` + `repositories/`, `schemas/`, `deps.py` |
| **Domain DB** | `be0/src/initiative_db/` (async engine, `get_session` with commit/rollback, SQLAlchemy models) | Same async SQLAlchemy idea; stronger separation of data access from HTTP |
| **Auth** | `be0/src/auth_api.py` — Argon2, JWT (HS256), roles embedded in `user_roles` + JWT payload, `decode_access_token_user_id` per route | Reusable `get_current_user`, `RoleChecker`, optional claim checks (e.g. access vs refresh) |
| **Frontend** | `fe0` React + Vite, centralized `apiClient`, `AuthContext`, permission map in `fe0/src/lib/permissions.ts` | UI gates only; real enforcement on server (you partially do this) |
| **Storage** | MinIO via `be0/src/minio/storage.py`, evidence endpoints in `main.py`, `application_artifacts` in PostgreSQL | Similar object-store + DB metadata split |
| **Migrations** | SQL files under `be0/migrations/` applied via Docker `initdb.d` | Alembic revisions + env imports for all models |
**Bottom line:** You already have strong **pieces** (async sessions, MinIO abstraction, initiative schema, evidence pipeline). The main gap is **structural**—keeping business rules, HTTP, and storage concerns **separated and testable**, and making **role and ownership rules** easy to review in one place.
---
## 2. Database improvements
### 2.1 Single source of truth for schema evolution
**Today:** `be0/migrations/*.sql` are mounted into Postgres at container init, while SQLAlchemy models in `be0/src/initiative_db/models.py` are the apps view of the world. That can **drift** (manual SQL + model mismatch).
**Proposals:**
- Introduce **Alembic** (or another migration tool) for `be0`, targeting the same `DATABASE_URL` / `INITIATIVE_DATABASE_URL`, with:
- `alembic/env.py` importing **all** model modules so autogenerate sees every table.
- A one-time **baseline** migration that matches the current SQL, then all future changes via revision files only.
- Keep Docker `initdb.d` for **local bootstrap** or replace it with `alembic upgrade head` in an entrypoint—avoid maintaining two parallel migration paths in production.
### 2.2 Naming convention and metadata hygiene
**Reference lesson:** A `Base.metadata` **naming convention** makes autogenerated diffs stable across environments.
**Proposal:** Add a SQLAlchemy `MetaData(naming_convention=…)` on your `Base` in `be0/src/initiative_db/models.py` *before* the next non-trivial migration, so new constraints dont get database-default names that differ by environment.
### 2.3 Indexes and hot paths (review pass)
**Already good:** `initiatives.case_code` is unique; ownership is by `owner_id`. Ensure **all foreign keys** used in list/detail filters have indexes (Postgres does not always index FK columns automatically in every setup—verify in migrations).
**Proposal:** One focused pass over queries in `drafts`, `submissions`, and `application_artifacts` (evidence) — add composite or partial indexes only where proven by slow query logs (avoid over-indexing).
### 2.4 Align “roles” in DB vs product language
**Observation:** `001_initiative_schema.sql` defines `user_role` enum including `applicant`, `council_member`, etc. The **frontend** `fe0/src/lib/permissions.ts` uses `admin` | `editor` | `viewer` with business descriptions (“Người nộp đơn” tied to *viewer*). This **semantic mapping** is easy to get wrong in new endpoints.
**Proposals:**
- Add a **short internal doc** (table): DB role(s) → JWT/claims → UI permission set.
- In API design, **avoid duplicating** role names in ad hoc strings; prefer **one** enumeration shared between auth issuance and checks (or a small `const` module on the server used by all routers).
### 2.5 Transaction boundaries (youre on the right track)
`get_session` commits on success and rolls back on exception—aligned with the reference pattern. **Keep** service-level logic from calling `commit()` except where a deliberate long-lived transaction is required.
**Proposal:** As you split code out of `main.py`, ensure **one unit of work per request** unless you document exceptions (e.g. multi-step background handoff).
---
## 3. Backend improvements
### 3.1 Move from monolithic `main.py` to layered packages
**Gap:** A large `main.py` mixing workflow state, file exports, application drafts, MinIO, and compliance tools makes it hard to:
- test ownership and RBAC in isolation;
- reason about what runs on every deploy;
- onboard new contributors.
**Target shape (incremental, not big-bang):**
```
be0/src/
api/ # or routers/
deps.py # get_current_user, require_role, get_initiative_scoped, etc.
routes/
auth.py
application_drafts.py
applications.py
evidence.py
admin.py
services/ # business rules: "can this user add evidence to this case?"
repositories/ # SQL only: initiative by case_code, artifact by role, …
schemas/ # Pydantic request/response DTOs
initiative_db/ # keep models + engine; repositories call into models
```
**Rule (from the reference):** **Routes never execute raw SQL**; they call **services** that call **repositories**. This matches your evidence flow conceptually (youre halfway there with `application_storage`).
### 3.2 RBAC and IDOR: centralize “who can touch this row?”
**Reference patterns worth adopting:**
- **`RoleChecker`-style** dependencies (even if not a class) so route signatures read: “requires admin” vs “any authenticated user” without re-reading 30 lines of checks.
- **Ownership (IDOR):** For every `case_id` / `initiative_id` / `application_id`, a single function e.g. `assert_initiative_access(user_id, case_code, need_write=True)` that:
- returns **404** for cross-tenant access when appropriate (no existence leak);
- is reused by evidence upload, draft save, and submit.
**Today** you check `owner_id` in several places; factor that into **one** service method and reuse.
### 3.3 JWT and token hygiene
**Reference:** Distinguish **access** vs **refresh** token **claims** (e.g. `type`) so a refresh token cannot be used as a Bearer on API routes.
**Proposal:** If you only issue access tokens today, document that. If you add refresh, enforce `type == "access"` in the dependency that guards APIs.
**Already good:** `JWT_SECRET` length check in dev/prod, Argon2 for passwords.
### 3.4 Reduce dual persistence in production
**Today:** Some flows can fall back to **YAML on disk** or in-memory stores when Postgres is off. Thats great for demos; in production it **doubles** test matrices.
**Proposals:**
- **Feature flag** or environment switch: `INITIATIVES_PERSISTENCE=postgres` only in prod; fail fast if misconfigured.
- **Deprecate** or strictly scope file-based application draft storage to “local dev without Docker”.
### 3.5 MinIO and evidence
**Already good:** `S3Storage`, size limits, presigned download URLs, `application_artifacts` for metadata.
**Proposals:**
- Centralize **all** MinIO key conventions and `role` names (`evidence_research`, `evidence_textbook`, `full_pdf`) in **one** server module to avoid string drift.
- Add **periodic job** (cron/worker) for orphaned objects if uploads fail mid-way—your `cleanup` module in `minio` is a start; ensure its scheduled in deployed environments.
### 3.6 Errors and API contract
**Reference:** Map domain errors to HTTP status in one `exception_handlers` module.
**Proposal:** Use consistent JSON error bodies (`{ "detail": "...", "code": "EVIDENCE_TOO_LARGE" }`) so the **frontend** can branch without parsing English/Vietnamese strings.
---
## 4. Frontend improvements
### 4.1 Treat server as the only authority
**Today:** `fe0/src/lib/permissions.ts` encodes a rich matrix (`ROLE_PERMISSIONS`). Thats good UX, but **must** mirror server checks. Any mismatch is a **security** issue (UI hides a button, API still open—or the reverse).
**Proposals:**
- For sensitive actions, **prefer** “call API → 403/404” over purely client `hasPermission` (e.g. hide buttons but always handle 403 on mutation).
- Optionally expose a **`GET /api/v1/auth/me/permissions` or** embed resolved permissions in JWT claims—**if** you can keep them small and non-stale. Otherwise, document that **fe** is best-effort and **be** is truth.
### 4.2 Align “applicant” / “viewer” vocabulary
The UI comments describe **applicant** behavior while the `Role` type is **`viewer`**. New contributors will be confused.
**Proposals:**
- Rename in UI copy only (`getRoleDisplayName`) or add alias `type ApplicantRole = 'viewer'` with a comment, and keep **one** canonical role in code.
- Reconcile with **SQL** enum names in documentation (see §2.4).
### 4.3 Generated types from OpenAPI
**Gap:** Hand-maintained DTOs and ad hoc `as unknown as` casts in draft save/load are brittle.
**Proposal:** Export OpenAPI from FastAPI (`/openapi.json`) and generate TypeScript types (or a thin client) in CI. At minimum, generate **enums and response shapes** for `application-drafts`, `applications`, and `evidence`.
### 4.4 Draft and file state machine
You combine **local IndexedDB**, **server JSON tabs**, and **MinIO evidence**—users need clear **feedback** (youve started with toasts and size limits).
**Proposals:**
- A visible **per-field status** for evidence: `local only` / `synced` / `error` (small badge next to the filename).
- **Idempotent** retry: if upload fails, one-click “retry sync” without re-selecting the file (blob still in IDB).
### 4.5 Performance and DX
- **Code-split** very large forms (`InitiativeApplicationForm`, `InitiativeReportForm`) by section for faster first paint.
- **React Query** (you use it in places) for server-backed lists with stable keys; align cache invalidation on submit/delete application.
---
## 5. Suggested prioritization (practical order)
1. **DB:** Alembic + stop dual SQL/model drift; baseline migration.
2. **Backend:** Extract `deps.py` + initiative ownership helper; use it in evidence + draft + submit routes.
3. **Backend:** Split `main.py` into `routes/` by domain (even if services are thin at first).
4. **API:** Standard error JSON + OpenAPI export + generated TS types.
5. **Frontend:** Permission doc + 403 handling + evidence sync UX (status badge / retry).
6. **Ops:** One persistence mode in prod; optional cleanup job for MinIO/DB consistency.
---
## 6. Mapping to the reference “lessons” list
| Reference lesson | Your project |
|------------------|--------------|
| Four-layer separation | **Partial** — move HTTP out of `main.py` and add services. |
| `has_role` on user model / single RBAC place | **Partial** — centralize in `deps` + one service. |
| IDOR: always scope by `user_id` in queries | **Good direction** — consolidate checks. |
| 404 on cross-owner access | **Adopt** where you still return 403 for existence leaks. |
| DB timestamps / naming / Alembic imports | **Adopt** naming + Alembic hygiene. |
| One transaction per request | **Yes** in `get_session`. |
| NullPool for workers | **N/A** until you have forked workers; then apply. |
| Dont create admin over HTTP | **OK** if admin bootstrap is CLI or SQL—keep that. |
---
*Generated from a comparison of this codebase with the structure and ideas in the supplied `architecture-analysis.md` (vstorm / full-stack template style). Adjust priorities to your release timeline and team size.*