12 KiB
Architecture improvement proposals
This document compares this repository (Initiative / “Profyt” style stack: fe0 + be0 + PostgreSQL + MinIO) against the layered patterns and RBAC/DB practices described in the reference write-up (Cookiecutter-style Full Stack template: Route → Service → Repository → Database).
It proposes concrete improvements for frontend, backend, and database—not to copy the template verbatim, but to reduce risk and operational cost as the product grows.
1. How your app is structured today (summary)
| Area | What you have | Reference template pattern |
|---|---|---|
| Backend | A large be0/main.py FastAPI app with many endpoints, workflow glue, and file/YAML fallbacks in one module |
Small route modules, thin handlers, services/ + repositories/, schemas/, deps.py |
| Domain DB | be0/src/initiative_db/ (async engine, get_session with commit/rollback, SQLAlchemy models) |
Same async SQLAlchemy idea; stronger separation of data access from HTTP |
| Auth | be0/src/auth_api.py — Argon2, JWT (HS256), roles embedded in user_roles + JWT payload, decode_access_token_user_id per route |
Reusable get_current_user, RoleChecker, optional claim checks (e.g. access vs refresh) |
| Frontend | fe0 React + Vite, centralized apiClient, AuthContext, permission map in fe0/src/lib/permissions.ts |
UI gates only; real enforcement on server (you partially do this) |
| Storage | MinIO via be0/src/minio/storage.py, evidence endpoints in main.py, application_artifacts in PostgreSQL |
Similar object-store + DB metadata split |
| Migrations | SQL files under be0/migrations/ applied via Docker initdb.d |
Alembic revisions + env imports for all models |
Bottom line: You already have strong pieces (async sessions, MinIO abstraction, initiative schema, evidence pipeline). The main gap is structural—keeping business rules, HTTP, and storage concerns separated and testable, and making role and ownership rules easy to review in one place.
2. Database improvements
2.1 Single source of truth for schema evolution
Today: be0/migrations/*.sql are mounted into Postgres at container init, while SQLAlchemy models in be0/src/initiative_db/models.py are the app’s view of the world. That can drift (manual SQL + model mismatch).
Proposals:
- Introduce Alembic (or another migration tool) for
be0, targeting the sameDATABASE_URL/INITIATIVE_DATABASE_URL, with:alembic/env.pyimporting all model modules so autogenerate sees every table.- A one-time baseline migration that matches the current SQL, then all future changes via revision files only.
- Keep Docker
initdb.dfor local bootstrap or replace it withalembic upgrade headin an entrypoint—avoid maintaining two parallel migration paths in production.
2.2 Naming convention and metadata hygiene
Reference lesson: A Base.metadata naming convention makes autogenerated diffs stable across environments.
Proposal: Add a SQLAlchemy MetaData(naming_convention=…) on your Base in be0/src/initiative_db/models.py before the next non-trivial migration, so new constraints don’t get database-default names that differ by environment.
2.3 Indexes and hot paths (review pass)
Already good: initiatives.case_code is unique; ownership is by owner_id. Ensure all foreign keys used in list/detail filters have indexes (Postgres does not always index FK columns automatically in every setup—verify in migrations).
Proposal: One focused pass over queries in drafts, submissions, and application_artifacts (evidence) — add composite or partial indexes only where proven by slow query logs (avoid over-indexing).
2.4 Align “roles” in DB vs product language
Observation: 001_initiative_schema.sql defines user_role enum including applicant, council_member, etc. The frontend fe0/src/lib/permissions.ts uses admin | editor | viewer with business descriptions (“Người nộp đơn” tied to viewer). This semantic mapping is easy to get wrong in new endpoints.
Proposals:
- Add a short internal doc (table): DB role(s) → JWT/claims → UI permission set.
- In API design, avoid duplicating role names in ad hoc strings; prefer one enumeration shared between auth issuance and checks (or a small
constmodule on the server used by all routers).
2.5 Transaction boundaries (you’re on the right track)
get_session commits on success and rolls back on exception—aligned with the reference pattern. Keep service-level logic from calling commit() except where a deliberate long-lived transaction is required.
Proposal: As you split code out of main.py, ensure one unit of work per request unless you document exceptions (e.g. multi-step background handoff).
3. Backend improvements
3.1 Move from monolithic main.py to layered packages
Gap: A large main.py mixing workflow state, file exports, application drafts, MinIO, and compliance tools makes it hard to:
- test ownership and RBAC in isolation;
- reason about what runs on every deploy;
- onboard new contributors.
Target shape (incremental, not big-bang):
be0/src/
api/ # or routers/
deps.py # get_current_user, require_role, get_initiative_scoped, etc.
routes/
auth.py
application_drafts.py
applications.py
evidence.py
admin.py
services/ # business rules: "can this user add evidence to this case?"
repositories/ # SQL only: initiative by case_code, artifact by role, …
schemas/ # Pydantic request/response DTOs
initiative_db/ # keep models + engine; repositories call into models
Rule (from the reference): Routes never execute raw SQL; they call services that call repositories. This matches your evidence flow conceptually (you’re halfway there with application_storage).
3.2 RBAC and IDOR: centralize “who can touch this row?”
Reference patterns worth adopting:
RoleChecker-style dependencies (even if not a class) so route signatures read: “requires admin” vs “any authenticated user” without re-reading 30 lines of checks.- Ownership (IDOR): For every
case_id/initiative_id/application_id, a single function e.g.assert_initiative_access(user_id, case_code, need_write=True)that:- returns 404 for cross-tenant access when appropriate (no existence leak);
- is reused by evidence upload, draft save, and submit.
Today you check owner_id in several places; factor that into one service method and reuse.
3.3 JWT and token hygiene
Reference: Distinguish access vs refresh token claims (e.g. type) so a refresh token cannot be used as a Bearer on API routes.
Proposal: If you only issue access tokens today, document that. If you add refresh, enforce type == "access" in the dependency that guards APIs.
Already good: JWT_SECRET length check in dev/prod, Argon2 for passwords.
3.4 Reduce dual persistence in production
Today: Some flows can fall back to YAML on disk or in-memory stores when Postgres is off. That’s great for demos; in production it doubles test matrices.
Proposals:
- Feature flag or environment switch:
INITIATIVES_PERSISTENCE=postgresonly in prod; fail fast if misconfigured. - Deprecate or strictly scope file-based application draft storage to “local dev without Docker”.
3.5 MinIO and evidence
Already good: S3Storage, size limits, presigned download URLs, application_artifacts for metadata.
Proposals:
- Centralize all MinIO key conventions and
rolenames (evidence_research,evidence_textbook,full_pdf) in one server module to avoid string drift. - Add periodic job (cron/worker) for orphaned objects if uploads fail mid-way—your
cleanupmodule inminiois a start; ensure it’s scheduled in deployed environments.
3.6 Errors and API contract
Reference: Map domain errors to HTTP status in one exception_handlers module.
Proposal: Use consistent JSON error bodies ({ "detail": "...", "code": "EVIDENCE_TOO_LARGE" }) so the frontend can branch without parsing English/Vietnamese strings.
4. Frontend improvements
4.1 Treat server as the only authority
Today: fe0/src/lib/permissions.ts encodes a rich matrix (ROLE_PERMISSIONS). That’s good UX, but must mirror server checks. Any mismatch is a security issue (UI hides a button, API still open—or the reverse).
Proposals:
- For sensitive actions, prefer “call API → 403/404” over purely client
hasPermission(e.g. hide buttons but always handle 403 on mutation). - Optionally expose a
GET /api/v1/auth/me/permissionsor embed resolved permissions in JWT claims—if you can keep them small and non-stale. Otherwise, document that fe is best-effort and be is truth.
4.2 Align “applicant” / “viewer” vocabulary
The UI comments describe applicant behavior while the Role type is viewer. New contributors will be confused.
Proposals:
- Rename in UI copy only (
getRoleDisplayName) or add aliastype ApplicantRole = 'viewer'with a comment, and keep one canonical role in code. - Reconcile with SQL enum names in documentation (see §2.4).
4.3 Generated types from OpenAPI
Gap: Hand-maintained DTOs and ad hoc as unknown as casts in draft save/load are brittle.
Proposal: Export OpenAPI from FastAPI (/openapi.json) and generate TypeScript types (or a thin client) in CI. At minimum, generate enums and response shapes for application-drafts, applications, and evidence.
4.4 Draft and file state machine
You combine local IndexedDB, server JSON tabs, and MinIO evidence—users need clear feedback (you’ve started with toasts and size limits).
Proposals:
- A visible per-field status for evidence:
local only/synced/error(small badge next to the filename). - Idempotent retry: if upload fails, one-click “retry sync” without re-selecting the file (blob still in IDB).
4.5 Performance and DX
- Code-split very large forms (
InitiativeApplicationForm,InitiativeReportForm) by section for faster first paint. - React Query (you use it in places) for server-backed lists with stable keys; align cache invalidation on submit/delete application.
5. Suggested prioritization (practical order)
- DB: Alembic + stop dual SQL/model drift; baseline migration.
- Backend: Extract
deps.py+ initiative ownership helper; use it in evidence + draft + submit routes. - Backend: Split
main.pyintoroutes/by domain (even if services are thin at first). - API: Standard error JSON + OpenAPI export + generated TS types.
- Frontend: Permission doc + 403 handling + evidence sync UX (status badge / retry).
- Ops: One persistence mode in prod; optional cleanup job for MinIO/DB consistency.
6. Mapping to the reference “lessons” list
| Reference lesson | Your project |
|---|---|
| Four-layer separation | Partial — move HTTP out of main.py and add services. |
has_role on user model / single RBAC place |
Partial — centralize in deps + one service. |
IDOR: always scope by user_id in queries |
Good direction — consolidate checks. |
| 404 on cross-owner access | Adopt where you still return 403 for existence leaks. |
| DB timestamps / naming / Alembic imports | Adopt naming + Alembic hygiene. |
| One transaction per request | Yes in get_session. |
| NullPool for workers | N/A until you have forked workers; then apply. |
| Don’t create admin over HTTP | OK if admin bootstrap is CLI or SQL—keep that. |
Generated from a comparison of this codebase with the structure and ideas in the supplied architecture-analysis.md (vstorm / full-stack template style). Adjust priorities to your release timeline and team size.