Files
sciagent/docs/architecture-improvement-proposals.md
T
Thinh Lam 688fac73e9
CI/CD / backend (push) Failing after 2m8s
CI/CD / frontend (push) Failing after 1m40s
CI/CD / deploy (push) Has been skipped
sciagent code + Gitea Actions CI/CD
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 09:38:30 +07:00

12 KiB
Raw Blame History

Architecture improvement proposals

This document compares this repository (Initiative / “Profyt” style stack: fe0 + be0 + PostgreSQL + MinIO) against the layered patterns and RBAC/DB practices described in the reference write-up (Cookiecutter-style Full Stack template: Route → Service → Repository → Database).

It proposes concrete improvements for frontend, backend, and database—not to copy the template verbatim, but to reduce risk and operational cost as the product grows.


1. How your app is structured today (summary)

Area What you have Reference template pattern
Backend A large be0/main.py FastAPI app with many endpoints, workflow glue, and file/YAML fallbacks in one module Small route modules, thin handlers, services/ + repositories/, schemas/, deps.py
Domain DB be0/src/initiative_db/ (async engine, get_session with commit/rollback, SQLAlchemy models) Same async SQLAlchemy idea; stronger separation of data access from HTTP
Auth be0/src/auth_api.py — Argon2, JWT (HS256), roles embedded in user_roles + JWT payload, decode_access_token_user_id per route Reusable get_current_user, RoleChecker, optional claim checks (e.g. access vs refresh)
Frontend fe0 React + Vite, centralized apiClient, AuthContext, permission map in fe0/src/lib/permissions.ts UI gates only; real enforcement on server (you partially do this)
Storage MinIO via be0/src/minio/storage.py, evidence endpoints in main.py, application_artifacts in PostgreSQL Similar object-store + DB metadata split
Migrations SQL files under be0/migrations/ applied via Docker initdb.d Alembic revisions + env imports for all models

Bottom line: You already have strong pieces (async sessions, MinIO abstraction, initiative schema, evidence pipeline). The main gap is structural—keeping business rules, HTTP, and storage concerns separated and testable, and making role and ownership rules easy to review in one place.


2. Database improvements

2.1 Single source of truth for schema evolution

Today: be0/migrations/*.sql are mounted into Postgres at container init, while SQLAlchemy models in be0/src/initiative_db/models.py are the apps view of the world. That can drift (manual SQL + model mismatch).

Proposals:

  • Introduce Alembic (or another migration tool) for be0, targeting the same DATABASE_URL / INITIATIVE_DATABASE_URL, with:
    • alembic/env.py importing all model modules so autogenerate sees every table.
    • A one-time baseline migration that matches the current SQL, then all future changes via revision files only.
  • Keep Docker initdb.d for local bootstrap or replace it with alembic upgrade head in an entrypoint—avoid maintaining two parallel migration paths in production.

2.2 Naming convention and metadata hygiene

Reference lesson: A Base.metadata naming convention makes autogenerated diffs stable across environments.

Proposal: Add a SQLAlchemy MetaData(naming_convention=…) on your Base in be0/src/initiative_db/models.py before the next non-trivial migration, so new constraints dont get database-default names that differ by environment.

2.3 Indexes and hot paths (review pass)

Already good: initiatives.case_code is unique; ownership is by owner_id. Ensure all foreign keys used in list/detail filters have indexes (Postgres does not always index FK columns automatically in every setup—verify in migrations).

Proposal: One focused pass over queries in drafts, submissions, and application_artifacts (evidence) — add composite or partial indexes only where proven by slow query logs (avoid over-indexing).

2.4 Align “roles” in DB vs product language

Observation: 001_initiative_schema.sql defines user_role enum including applicant, council_member, etc. The frontend fe0/src/lib/permissions.ts uses admin | editor | viewer with business descriptions (“Người nộp đơn” tied to viewer). This semantic mapping is easy to get wrong in new endpoints.

Proposals:

  • Add a short internal doc (table): DB role(s) → JWT/claims → UI permission set.
  • In API design, avoid duplicating role names in ad hoc strings; prefer one enumeration shared between auth issuance and checks (or a small const module on the server used by all routers).

2.5 Transaction boundaries (youre on the right track)

get_session commits on success and rolls back on exception—aligned with the reference pattern. Keep service-level logic from calling commit() except where a deliberate long-lived transaction is required.

Proposal: As you split code out of main.py, ensure one unit of work per request unless you document exceptions (e.g. multi-step background handoff).


3. Backend improvements

3.1 Move from monolithic main.py to layered packages

Gap: A large main.py mixing workflow state, file exports, application drafts, MinIO, and compliance tools makes it hard to:

  • test ownership and RBAC in isolation;
  • reason about what runs on every deploy;
  • onboard new contributors.

Target shape (incremental, not big-bang):

be0/src/
  api/                 # or routers/
    deps.py            # get_current_user, require_role, get_initiative_scoped, etc.
    routes/
      auth.py
      application_drafts.py
      applications.py
      evidence.py
      admin.py
  services/            # business rules: "can this user add evidence to this case?"
  repositories/        # SQL only: initiative by case_code, artifact by role, …
  schemas/             # Pydantic request/response DTOs
  initiative_db/       # keep models + engine; repositories call into models

Rule (from the reference): Routes never execute raw SQL; they call services that call repositories. This matches your evidence flow conceptually (youre halfway there with application_storage).

3.2 RBAC and IDOR: centralize “who can touch this row?”

Reference patterns worth adopting:

  • RoleChecker-style dependencies (even if not a class) so route signatures read: “requires admin” vs “any authenticated user” without re-reading 30 lines of checks.
  • Ownership (IDOR): For every case_id / initiative_id / application_id, a single function e.g. assert_initiative_access(user_id, case_code, need_write=True) that:
    • returns 404 for cross-tenant access when appropriate (no existence leak);
    • is reused by evidence upload, draft save, and submit.

Today you check owner_id in several places; factor that into one service method and reuse.

3.3 JWT and token hygiene

Reference: Distinguish access vs refresh token claims (e.g. type) so a refresh token cannot be used as a Bearer on API routes.

Proposal: If you only issue access tokens today, document that. If you add refresh, enforce type == "access" in the dependency that guards APIs.

Already good: JWT_SECRET length check in dev/prod, Argon2 for passwords.

3.4 Reduce dual persistence in production

Today: Some flows can fall back to YAML on disk or in-memory stores when Postgres is off. Thats great for demos; in production it doubles test matrices.

Proposals:

  • Feature flag or environment switch: INITIATIVES_PERSISTENCE=postgres only in prod; fail fast if misconfigured.
  • Deprecate or strictly scope file-based application draft storage to “local dev without Docker”.

3.5 MinIO and evidence

Already good: S3Storage, size limits, presigned download URLs, application_artifacts for metadata.

Proposals:

  • Centralize all MinIO key conventions and role names (evidence_research, evidence_textbook, full_pdf) in one server module to avoid string drift.
  • Add periodic job (cron/worker) for orphaned objects if uploads fail mid-way—your cleanup module in minio is a start; ensure its scheduled in deployed environments.

3.6 Errors and API contract

Reference: Map domain errors to HTTP status in one exception_handlers module.

Proposal: Use consistent JSON error bodies ({ "detail": "...", "code": "EVIDENCE_TOO_LARGE" }) so the frontend can branch without parsing English/Vietnamese strings.


4. Frontend improvements

4.1 Treat server as the only authority

Today: fe0/src/lib/permissions.ts encodes a rich matrix (ROLE_PERMISSIONS). Thats good UX, but must mirror server checks. Any mismatch is a security issue (UI hides a button, API still open—or the reverse).

Proposals:

  • For sensitive actions, prefer “call API → 403/404” over purely client hasPermission (e.g. hide buttons but always handle 403 on mutation).
  • Optionally expose a GET /api/v1/auth/me/permissions or embed resolved permissions in JWT claims—if you can keep them small and non-stale. Otherwise, document that fe is best-effort and be is truth.

4.2 Align “applicant” / “viewer” vocabulary

The UI comments describe applicant behavior while the Role type is viewer. New contributors will be confused.

Proposals:

  • Rename in UI copy only (getRoleDisplayName) or add alias type ApplicantRole = 'viewer' with a comment, and keep one canonical role in code.
  • Reconcile with SQL enum names in documentation (see §2.4).

4.3 Generated types from OpenAPI

Gap: Hand-maintained DTOs and ad hoc as unknown as casts in draft save/load are brittle.

Proposal: Export OpenAPI from FastAPI (/openapi.json) and generate TypeScript types (or a thin client) in CI. At minimum, generate enums and response shapes for application-drafts, applications, and evidence.

4.4 Draft and file state machine

You combine local IndexedDB, server JSON tabs, and MinIO evidence—users need clear feedback (youve started with toasts and size limits).

Proposals:

  • A visible per-field status for evidence: local only / synced / error (small badge next to the filename).
  • Idempotent retry: if upload fails, one-click “retry sync” without re-selecting the file (blob still in IDB).

4.5 Performance and DX

  • Code-split very large forms (InitiativeApplicationForm, InitiativeReportForm) by section for faster first paint.
  • React Query (you use it in places) for server-backed lists with stable keys; align cache invalidation on submit/delete application.

5. Suggested prioritization (practical order)

  1. DB: Alembic + stop dual SQL/model drift; baseline migration.
  2. Backend: Extract deps.py + initiative ownership helper; use it in evidence + draft + submit routes.
  3. Backend: Split main.py into routes/ by domain (even if services are thin at first).
  4. API: Standard error JSON + OpenAPI export + generated TS types.
  5. Frontend: Permission doc + 403 handling + evidence sync UX (status badge / retry).
  6. Ops: One persistence mode in prod; optional cleanup job for MinIO/DB consistency.

6. Mapping to the reference “lessons” list

Reference lesson Your project
Four-layer separation Partial — move HTTP out of main.py and add services.
has_role on user model / single RBAC place Partial — centralize in deps + one service.
IDOR: always scope by user_id in queries Good direction — consolidate checks.
404 on cross-owner access Adopt where you still return 403 for existence leaks.
DB timestamps / naming / Alembic imports Adopt naming + Alembic hygiene.
One transaction per request Yes in get_session.
NullPool for workers N/A until you have forked workers; then apply.
Dont create admin over HTTP OK if admin bootstrap is CLI or SQL—keep that.

Generated from a comparison of this codebase with the structure and ideas in the supplied architecture-analysis.md (vstorm / full-stack template style). Adjust priorities to your release timeline and team size.