sciagent code + Gitea Actions CI/CD

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Thinh Lam
2026-06-30 09:38:30 +07:00
commit 688fac73e9
1167 changed files with 158244 additions and 0 deletions
@@ -0,0 +1,221 @@
# Analysis: notification system for admin → applicant (status & feedback)
This document describes **how the stack works today**, **what is missing** for true notifications, and a **concrete v1 path** aligned with **current repo complexity** (`fe0` / `be0` / PostgreSQL). It incorporates a **review** of the refined draft (`ADMIN_APPLICANT_NOTIFICATION_SYSTEM_ANALYSIS.md` from review) and **adjusts** a few points for this codebase.
It complements `assets/APPLICANT_STATUS_NOTIFICATIONS_PLAN.md` (council + broader product) by anchoring on **`application_admin_results`** and **`PUT /api/applications/{applicationId}/admin-result`**.
---
## 0. Evaluation of the reviewed draft (summary)
The reviewed version improves the original repo doc in several ways; **adopt these**:
| Theme | Verdict |
|--------|---------|
| **Locked v1 scope** | In-app inbox only; ~60s polling + `refetchOnWindowFocus`; **no** email, MinIO PDF, WebSocket, or council unification in v1. Reduces scope and matches current team capacity. |
| **Append-on-every-save** | Explicit product choice: new row per admin save; optional **UI-only** collapsing of consecutive rows. Clear and simple. |
| **Schema pragmatism** | v1 `type` with `TEXT + CHECK`; **omit `JSONB` until a second notification type** exists. Fewer columns to maintain. |
| **Indexes** | `created_at DESC` inbox index + **partial index** on `(recipient_user_id) WHERE read_at IS NULL` for unread count. Appropriate. |
| **Security** | `PATCH .../read` returns **404** for foreign rows (same as missing id) to avoid user enumeration. Good default. |
| **Helper surface** | `notification_service.create_admin_decision_notification(...)` keeps the admin route thin and future council hook consistent. |
**Adjust for this repository (critical):**
1. **`application_id` type** — Public application identifiers in this project are **strings** (e.g. `sub-{hex}`, case codes), not UUIDs. The notification table should use **`application_id TEXT NOT NULL`** (or `VARCHAR`) for deep links, matching `ApplicationItem.id` and API paths—not `UUID`.
2. **“try/except without rolling back the decision”** — With **`get_session()`**, everything runs in **one transaction** that **commits once** at context exit. If the notification `INSERT` fails **after** the upsert has flushed, the **entire** transaction—including the decision—rolls back on exception, unless you:
- use a **`begin_nested()` savepoint** around the notification insert (rollback to savepoint on failure, then continue), or
- perform the notification insert in a **separate short session after** the admin-result transaction **commits** (best-effort second transaction).
The reviewed draft's intent (“decision is sacred; notification best-effort”) is right; the implementation must use one of the patterns above, not a bare `try/except` in the same flat transaction.
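The savepoint option can be sketched as follows. This is a minimal illustration that uses the standard-library `sqlite3` module as a stand-in for PostgreSQL and for SQLAlchemy's `begin_nested()`; the simplified tables and the names `make_db` and `save_decision` are assumptions for the example, not code from `be0`.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory stand-in for the real schema (hypothetical and simplified)."""
    # isolation_level=None lets us issue BEGIN / SAVEPOINT / COMMIT explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE admin_results ("
        "application_id TEXT PRIMARY KEY, decision TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE user_notifications ("
        "id INTEGER PRIMARY KEY, application_id TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def save_decision(conn, application_id, decision, body):
    """Upsert the decision, then insert the notification on a best-effort basis.

    Returns True if the notification row was written, False if it was skipped.
    """
    conn.execute("BEGIN")
    try:
        conn.execute(
            "INSERT OR REPLACE INTO admin_results (application_id, decision) "
            "VALUES (?, ?)",
            (application_id, decision),
        )
        conn.execute("SAVEPOINT notify")
        try:
            conn.execute(
                "INSERT INTO user_notifications (application_id, body) VALUES (?, ?)",
                (application_id, body),
            )
            notified = True
        except sqlite3.Error:
            # Undo only the notification work; the decision stays in the transaction.
            conn.execute("ROLLBACK TO SAVEPOINT notify")
            notified = False
        conn.execute("RELEASE SAVEPOINT notify")
        conn.execute("COMMIT")
        return notified
    except Exception:
        # Anything that fails outside the savepoint aborts the whole transaction.
        conn.execute("ROLLBACK")
        raise
```

Passing `body=None` violates the `NOT NULL` constraint and forces the notification insert to fail, which is a convenient way to check that the decision row still commits.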
---
## 1. v1 scope (locked)
- **In-app notifications only:** PostgreSQL table + applicant `GET` / `PATCH` + `/dashboard/notifications` + optional header bell.
- **Delivery:** React Query `refetchInterval: 60_000` and `refetchOnWindowFocus: true` (no SSE/WebSocket in v1).
- **Write trigger:** successful **admin-result** upsert (same product semantics as today's `AdminStaffReadonlyReviewDialog` / `ResultManager`).
- **Deferred to v2:** email + outbox worker, MinIO/PDF letter, council `localStorage` → API unification, notification preferences, i18n beyond Vietnamese, retention jobs.
---
## 2. Current state (baseline)
### 2.1 Data and decisions
| Layer | What exists |
|--------|-------------|
| **PostgreSQL** | `initiatives.status` and **`application_admin_results`**, both written on **`PUT …/admin-result`** (idempotent upsert). |
| **Applicant reads** | `GET /api/applications/mine`, `GET /api/applications/{id}` — status and **`nhan_xet`** can reflect admin feedback after submission enrichment. |
| **Admin** | Decided list `lifecycle=decided`; React Query invalidation on save. |
| **Council** | Some flows still use **`localStorage`**; not applicant-visible until server-backed (v2; see assets plan). |
**Gap:** No **`user_notifications`** (or equivalent); applicants only discover changes by refetching lists/detail.
### 2.2 Frontend
- **Sonner:** admin-only affordance at save time.
- **React Query:** `applications`, `applications-mine`, `application-detail`, etc.—no notification queries yet.
- **`/dashboard/notifications`:** linked from sidebars; **no page implementation** observed.
### 2.3 Backend
- FastAPI + transactional `get_session()`; natural hook: after admin-result body returns, or inside handler with savepoint / post-commit insert (see §0).
### 2.4 MinIO (v2 only for notifications)
Evidence/exports buckets are **not** the source of truth for notification text. Optional v2 PDF letter remains separate from v1.
---
## 3. Target architecture (v1)
```mermaid
flowchart LR
subgraph admin [Admin UI]
A[Confirm / ResultManager]
end
subgraph be [be0]
B[PUT admin-result]
C[Notification helper]
end
subgraph db [PostgreSQL]
D[application_admin_results]
E[initiatives]
F[user_notifications]
end
subgraph fe [Applicant FE]
H[Polling + onFocus]
I[Inbox + bell]
end
A --> B
B --> D
B --> E
B --> C
C --> F
F --> H
H --> I
```
---
## 4. Database design (v1)
### 4.1 Principles
- **`application_admin_results`** remains canonical for full feedback/rationale.
- Notification rows are **summaries + pointer** to the public **`application_id` string** and optional FKs for audit.
### 4.2 Table: `user_notifications`
| Column | Type | Notes |
|--------|------|------|
| `id` | UUID PK | |
| `recipient_user_id` | UUID FK → `users.id` (`ON DELETE CASCADE`) | From `initiatives.owner_id` at insert. |
| `type` | `TEXT NOT NULL` | v1: `CHECK (type IN ('admin_application_decision'))`. |
| `title` | `TEXT NOT NULL` | e.g. “Kết quả duyệt hồ sơ”. |
| `body` | `TEXT NOT NULL` | Decision label + ~280 chars feedback (newline-stripped). |
| `application_id` | `TEXT NOT NULL` | **Public** id (`sub-…` / case-shaped), matches API list/detail. |
| `related_initiative_id` | UUID FK → `initiatives.id` (`ON DELETE SET NULL`) | |
| `source_admin_result_id` | UUID FK → `application_admin_results.id` (`ON DELETE SET NULL`) | |
| `read_at` | `TIMESTAMPTZ` nullable | |
| `created_at` | `TIMESTAMPTZ NOT NULL DEFAULT now()` | |
**v1:** no `payload JSONB`; add when a second `type` needs extra fields.
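The `body` rule from the table (decision label plus a newline-stripped feedback excerpt of about 280 characters) could look roughly like this. The function name `build_notification_body`, the `": "` separator, and the trailing ellipsis are assumptions for illustration; the real labels would be the Vietnamese strings used by the product.

```python
import re
from typing import Optional

# Taken from the "~280 chars" note above; the exact limit is a product choice.
MAX_FEEDBACK_CHARS = 280


def build_notification_body(decision_label: str, feedback: Optional[str]) -> str:
    """Return '<decision label>: <feedback excerpt>' as a single line."""
    # Collapse newlines and runs of whitespace into single spaces.
    excerpt = re.sub(r"\s+", " ", feedback or "").strip()
    if len(excerpt) > MAX_FEEDBACK_CHARS:
        # Reserve one character for the ellipsis so the excerpt stays within the limit.
        excerpt = excerpt[: MAX_FEEDBACK_CHARS - 1].rstrip() + "…"
    return f"{decision_label}: {excerpt}" if excerpt else decision_label
```

The full feedback stays in `application_admin_results`; the notification only carries the excerpt.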
### 4.3 Indexes
```sql
CREATE INDEX user_notifications_inbox_idx
ON user_notifications (recipient_user_id, created_at DESC);
CREATE INDEX user_notifications_unread_idx
ON user_notifications (recipient_user_id)
WHERE read_at IS NULL;
```
### 4.4 Insertion semantics
- **Product:** append **one row per successful admin-result save** (including typo fixes).
- **Technical:** implement **best-effort** notification with **savepoint** (`session.begin_nested()`) or **post-commit** insert so a notification failure **never** rolls back the adjudication. Document the chosen pattern in code comments next to the handler.
---
## 5. Backend API (`be0`)
### 5.1 Write path
After **`upsert_admin_result`** succeeds, resolve `owner_id`, build title/body, insert `user_notifications` via helper using the patterns in §4.4.
Sketch:
```text
notification_service.create_admin_decision_notification(
session, *, initiative, admin_result_row, application_id_public: str, decision_label: str
) -> UserNotification | None
```
### 5.2 Read paths (applicant-only in v1)
| Method | Purpose |
|--------|---------|
| `GET /api/notifications` | Paginated; `recipient_user_id = current user`; fields: `id`, `type`, `title`, `body`, `read_at`, `created_at`, `application_id`, `related_initiative_id`. |
| `GET /api/notifications/unread-count` | Count unread; benefits from partial index. |
| `PATCH /api/notifications/{id}/read` | Set `read_at = now()` only if row belongs to user. |
**Authorization:** for `PATCH`, return **404** if row missing **or** not owned (same body as missing).
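The ownership rule can be modelled without any web framework. The sketch below uses an in-memory dict in place of the table; `Notification`, `mark_read`, and the response shape are hypothetical names chosen for the example. Keeping the first `read_at` timestamp on repeated calls is also an assumption, not something the plan specifies.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple

# One shared response for "missing" and "not yours", so callers cannot tell them apart.
NOT_FOUND = (404, {"detail": "Notification not found"})


@dataclass
class Notification:
    id: str
    recipient_user_id: str
    read_at: Optional[datetime] = None


def mark_read(
    store: Dict[str, Notification], notification_id: str, current_user_id: str
) -> Tuple[int, dict]:
    """Mark a notification as read only if it belongs to the current user."""
    row = store.get(notification_id)
    if row is None or row.recipient_user_id != current_user_id:
        return NOT_FOUND
    if row.read_at is None:
        row.read_at = datetime.now(timezone.utc)
    return 200, {"id": row.id, "read_at": row.read_at.isoformat()}
```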
### 5.3 Relationship to applications API
Notifications complement **`GET /api/applications/{id}`** (status + feedback); they do not replace it.
---
## 6. Frontend (`fe0`)
- **Page:** implement **`/dashboard/notifications`**.
- **React Query:** `["notifications", { page }]` and `["notifications-unread-count"]`.
- **Polling:** `refetchInterval: 60_000`, `refetchOnWindowFocus: true`.
- **UX:** row click → `PATCH .../read` (optimistic unread decrement optional) → navigate using **`application_id`** string to existing applicant routes.
- **Bell:** subscribe to unread-count query; same polling cadence.
- **Admin UI:** no change required for v1 unless product adds a “don't notify” toggle later.
---
## 7. Security and privacy
- Scope all reads/patch to authenticated **recipient**.
- Do not log full notification bodies in verbose HTTP logs in production.
- `recipient_user_id` snapshot at insert time: historical rows stay with original recipient if ownership changes.
---
## 8. Rollout order (v1)
1. Migration: table + indexes.
2. SQLAlchemy model + `notification_service` helper (with savepoint or post-commit).
3. Wire **admin-result** handler.
4. Applicant `GET` list, `GET` unread-count, `PATCH` read.
5. FE: inbox page + bell.
6. Tests: notification appears after PUT (applicant token); PATCH read; PATCH foreign id → 404; **notification failure does not undo admin-result** (integration test with forced insert error).
---
## 9. v2 candidates (out of scope for v1)
| Item | Notes |
|------|------|
| Email | Outbox + worker; `user_notifications` stays canonical. |
| MinIO PDF | Generate on save; store artifact key; optional `payload` JSONB for metadata. |
| Council | New `type` + `CHECK` extension + second writer from council API. |
| Preferences / retention | Add when volume or compliance requires. |
---
## 10. Relation to other docs
- **`assets/APPLICANT_STATUS_NOTIFICATIONS_PLAN.md`** — council outcomes, workflow, broader audit. This file is the **admin-first, v1-scoped** implementation companion.
---
*Update when migrations land, when the transaction strategy (savepoint vs post-commit) is fixed in code, or when v2 scope is agreed.*
@@ -0,0 +1,175 @@
# Radical fix plan: admin applications, admin results, and dashboard consistency
This document captures how the current stack is wired, what went wrong with “save decision → see it in **Kết quả đăng ký**”, and a **phased, radical** plan to make the behavior **reliable by design**—not only by patching one screen.
---
## 1. Problem statement (what users experience)
1. An admin uses **Mẫu hồ sơ và Minh chứng** → **Duyệt (xem trước)** → **Xác nhận** (`AdminStaffReadonlyReviewDialog` + `upsertAdminApplicationResult`).
2. They expect the submission to appear under **Kết quả đăng ký** (`ConsideredInitiativesList` → `ApprovedApplicationsList` with `lifecycle=decided`).
3. Sometimes nothing changes, or behaviour is confusing, because **several independent subsystems** must stay aligned: HTTP client semantics, API routes, PostgreSQL `initiatives.status`, optional file fallback, React Query keys, and optional MinIO (evidence—not the same as admin result).
---
## 2. Implementation map (how it fits together)
### 2.1 Frontend
| Piece | Role |
|--------|------|
| `ConsideredInitiativesList` | Renders `ApprovedApplicationsList` with `lifecycle="decided"` — only rows whose **list `status`** is `approved` or `rejected`. |
| `ApprovedApplicationsList` | `useQuery(["applications", filters])` → `GET /api/applications` with filters including `lifecycle`. |
| `AdminDocxTemplatePreview` | Opens `AdminStaffReadonlyReviewDialog`; confirm calls `upsertAdminApplicationResult`. |
| `applicationAdminResultApi` | `fetch` + `create` / `update`; `upsert` = **GET then POST or PUT**. |
| `shared/api/client.ts` | Axios `validateStatus: (s) => s < 500` → **4xx responses do not throw** unless callers pass an override. |
### 2.2 Backend
| Piece | Role |
|--------|------|
| `POST/PUT/GET/DELETE …/admin-result` | Persists `application_admin_results` and sets **`initiatives.status`** to `approved` / `rejected` (PostgreSQL). |
| `GET /api/applications` | **Primary:** `list_submitted_applications` from Postgres. **Fallback:** `_load_submitted_items()` file index if Postgres fails or is disabled. |
| `GET /api/applications/{id}` | Same pattern: Postgres first, then file index. |
| MinIO (S3-compatible) | Evidence / attachments buckets; **not** where admin “decision rows” live. Decisions are **Postgres**; MinIO only matters for evidence flows. |
### 2.3 Data flow (intended)
```mermaid
sequenceDiagram
participant UI as Admin UI
participant API as be0 FastAPI
participant PG as Postgres
participant List as GET /api/applications
UI->>API: upsert admin-result (POST or PUT)
API->>PG: application_admin_results + initiatives.status
UI->>API: invalidate + refetch applications
List->>PG: list initiatives + drafts → status approved/rejected
List-->>UI: decided list
```
Breakage happens when **any** step returns “success” without real data, reads from a **different backend** than writes, or treats **HTTP 404** as a valid JSON body.
---
## 3. Root causes (systemic, not one-line bugs)
### A. Global Axios: 4xx treated as success (`validateStatus < 500`)
- For `GET /api/.../admin-result` with **no row**, the server returns **404** + `{ detail: "…" }`.
- With the default client, that **resolves** instead of rejects.
- Any code that checks `if (!data)` **fails** → truthy object `{ detail }` looks like “existing result”.
- **`upsert`** then chose **PUT** instead of **POST** on first save → no row created, **silent wrong success** possible.
**Partial fix already applied:** pass `axiosSuccessStatusOnly` (2xx-only) on all `applicationAdminResultApi` calls.
**Radical fix:** see §5.1 — stop relying on opt-in overrides.
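The failure mode is easy to reproduce in isolation. The real client is TypeScript/Axios; the Python model below only mirrors the logic, and `lenient_get`, `strict_get`, and `choose_method` are hypothetical names introduced for the illustration.

```python
from typing import Any, Optional, Tuple

Response = Tuple[int, Any]  # (HTTP status, parsed JSON body)


def lenient_get(response: Response) -> Any:
    """Mimics validateStatus < 500: every non-5xx response resolves with its body."""
    status, body = response
    if status >= 500:
        raise RuntimeError(f"HTTP {status}")
    return body


def strict_get(response: Response) -> Optional[Any]:
    """2xx-only semantics: 404 means 'no row yet'; any other error raises."""
    status, body = response
    if 200 <= status < 300:
        return body
    if status == 404:
        return None
    raise RuntimeError(f"HTTP {status}")


def choose_method(existing: Any) -> str:
    """The client-side upsert rule: PUT when a row exists, otherwise POST."""
    return "PUT" if existing else "POST"
```

With a `(404, {"detail": "..."})` response, the lenient path yields a truthy body and therefore `PUT`, while the strict path yields `None` and therefore `POST`.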
### B. Client upsert = GET + mutate (race + footguns)
- Two round trips; duplicated server rules; easy to get wrong if GET semantics change.
- Prefer **idempotent server upsert** (`PUT` with “create or replace” semantics) **or** `POST`-only with clear 409 handling.
### C. Dual source of truth for application lists (Postgres vs file index)
- Listing can **silently fall back** to `_load_submitted_items()` when Postgres throws.
- Admin-result writes **only** hit Postgres.
- Result: UI can show **stale or empty** “decided” data while DB was actually updated (or the reverse in dev).
### D. React Query key `["applications", filters]`
- Invalidating `["applications"]` is correct under TanStack Query's partial key matching, but cache and subscription edge cases should still be covered by tests.
### E. MinIO vs admin decision (scope confusion)
- **Fixing “Kết quả đăng ký”** does not require MinIO updates.
- Evidence upload paths are separate; do not conflate in testing or plans.
---
## 4. Verification already performed (baseline)
- **Docker:** `postgres`, `minio`, `be0` healthy; `GET /api/v1/test` OK.
- **Postgres + `create_admin_result`:** `initiatives.status` and `application_admin_results` stay in sync when using the Python layer directly.
- **Integration test `test_applications_db_integration`:** one failing test (`get_application_by_id` fallback `sub-…` without `submissionRecord.id`) — suggests **ID-resolution** edge cases still risky; align with list/GET contract in the same plan.
- **Host Python** may lack `boto3`; validate MinIO from **`be0` container** or install dev deps locally for S3 tests.
---
## 5. Radical fixes (options, from smaller to larger)
### 5.1 Frontend: default to real HTTP semantics (highest leverage)
**Goal:** No API call “succeeds” with a 404/422 body unless explicitly handled.
**Options:**
1. **Change default `validateStatus` to 2xx-only** in `ApiClient`, then **globally fix** call sites that depended on reading `{ detail }` from a resolved 4xx response (likely few; grep for patterns).
2. **Two clients:** `apiClientStrict` (2xx-only) for CRUD and `apiClient` legacy only where needed—migrate modules incrementally.
3. **Response interceptor:** if `status >= 400`, reject with unified `ApiError` (preserves current “no throw on 4xx” idea but **never** returns `res.data` as success to `.then()`).
**Acceptance:** ESLint rule or CI script: forbid `apiClient.get/post/put/delete` without explicit `validateStatus` or wrapper.
### 5.2 Backend: atomic admin-result upsert
**Goal:** Single request, no client-side GET-before-POST.
- Expose **`PUT /api/applications/{id}/admin-result`** as **idempotent upsert** (create or update in one transaction), or document **`POST`** as upsert with unique constraint handling.
- Optionally return **updated application row** snippet (`status`, `applicationId`) so the client can patch cache without listing.
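A single-transaction upsert might look like the sketch below. It uses `sqlite3` as a stand-in (PostgreSQL accepts the same `INSERT ... ON CONFLICT ... DO UPDATE` form); the simplified tables, the function name `upsert_admin_result`, and the returned snippet shape are assumptions for the example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory stand-in with one submitted application (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE initiatives (
            application_id TEXT PRIMARY KEY, status TEXT NOT NULL
        );
        CREATE TABLE application_admin_results (
            application_id TEXT PRIMARY KEY, decision TEXT NOT NULL, feedback TEXT
        );
        INSERT INTO initiatives VALUES ('sub-1a2b', 'submitted');
        """
    )
    return conn


def upsert_admin_result(conn, application_id, decision, feedback):
    """Create or replace the admin result and sync initiatives.status atomically."""
    with conn:  # commits on success, rolls back if anything inside raises
        cur = conn.execute(
            "UPDATE initiatives SET status = ? WHERE application_id = ?",
            (decision, application_id),
        )
        if cur.rowcount == 0:
            raise LookupError(f"unknown application {application_id}")
        conn.execute(
            "INSERT INTO application_admin_results (application_id, decision, feedback) "
            "VALUES (?, ?, ?) "
            "ON CONFLICT(application_id) DO UPDATE SET "
            "decision = excluded.decision, feedback = excluded.feedback",
            (application_id, decision, feedback),
        )
    # Snippet the client could use to patch its cache without refetching the list.
    return {"applicationId": application_id, "status": decision}
```

Calling the function twice with the same `application_id` leaves exactly one result row, which is the property the client-side GET-then-POST-or-PUT sequence cannot guarantee.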
### 5.3 Backend: single listing source in production
**Goal:** No silent list fallback when `INITIATIVE_DATABASE_URL` is set.
- On Postgres enabled: **fail listing with 503** and a clear JSON error instead of falling back to files; or log + metrics + feature flag.
- Deprecate file index for `/api/applications` in environments where submissions are always in PG.
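The "fail loudly instead of falling back" rule can be sketched as a small pure function. The loader callables, the `postgres_configured` flag, and the response shape are hypothetical; in `be0` the flag would presumably derive from `INITIATIVE_DATABASE_URL`.

```python
from typing import Callable, List, Tuple


def list_applications(
    load_from_postgres: Callable[[], List[dict]],
    load_from_files: Callable[[], List[dict]],
    postgres_configured: bool,
) -> Tuple[int, dict]:
    """Return (HTTP status, JSON body) for the application list."""
    if not postgres_configured:
        # File index is the only source here, so reading it is not a silent fallback.
        return 200, {"items": load_from_files(), "source": "file_index"}
    try:
        items = load_from_postgres()
    except Exception:
        # Postgres is the source of truth for admin results: surface the outage.
        return 503, {"detail": "Application store unavailable", "source": "postgres"}
    return 200, {"items": items, "source": "postgres"}
```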
### 5.4 Contract tests (API + FE)
- **pytest/httpx:** `POST admin-result` → `GET /api/applications?lifecycle=decided` contains that `id`.
- **Playwright or MSW:** Admin flow confirm → list row appears (requires auth fixture).
### 5.5 Observability
- Structured logs: `application_id`, `initiative_id`, `decision`, `source=postgres|file_fallback`.
- Metrics: `applications_list_fallback_total`, `admin_result_upsert_duration_ms`.
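One possible shape for these log lines and counters, using only the standard library. The logger name, the `event` field, and the in-process `Counter` are assumptions; a real deployment would likely export metrics through whatever metrics client the team already uses.

```python
import json
import logging
from collections import Counter

logger = logging.getLogger("be0.applications")  # hypothetical logger name
metrics: Counter = Counter()  # in-process stand-in for a real metrics backend


def log_list_source(application_count: int, source: str) -> str:
    """Emit one JSON log line per list call and count file fallbacks."""
    if source == "file_fallback":
        metrics["applications_list_fallback_total"] += 1
    line = json.dumps(
        {"event": "applications_list", "count": application_count, "source": source},
        sort_keys=True,
    )
    logger.info(line)
    return line


def log_admin_result(application_id: str, initiative_id: str, decision: str) -> str:
    """Emit one JSON log line per admin-result upsert."""
    line = json.dumps(
        {
            "event": "admin_result_upsert",
            "application_id": application_id,
            "initiative_id": initiative_id,
            "decision": decision,
            "source": "postgres",
        },
        sort_keys=True,
    )
    logger.info(line)
    return line
```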
---
## 6. Recommended phases (practical rollout)
| Phase | Scope | Outcome |
|-------|--------|---------|
| **P0** | Keep `axiosSuccessStatusOnly` on admin-result API; add **one** e2e/API test: upsert → decided list. | Regression guard. |
| **P1** | Introduce **strict HTTP** default or interceptor (§5.1); fix broken call sites. | Class of bugs eliminated. |
| **P2** | Backend **idempotent PUT** upsert; simplify client to single call. | Fewer races, simpler mental model. |
| **P3** | Remove or gate **file fallback** for `/api/applications` when PG is configured. | Align list with admin writes. |
| **P4** | Fix failing DB test for `submissionRecord.id` omission; document canonical `applicationId` rules. | Predictable IDs end-to-end. |
---
## 7. Out of scope / non-goals
- **MinIO** consistency for “admin approve” — wrong layer unless the feature explicitly writes objects on decision.
- **Council** flow (`saveCouncilReviewOutcome` / local storage) — separate product path; only mention if merging with admin outcomes.
---
## 8. Decision log (fill in as you implement)
| Date | Decision | Rationale |
|------|----------|-----------|
| | | |
---
## References (code)
- `fe0/src/shared/api/client.ts`: global `validateStatus`.
- `fe0/src/lib/applicationReviewApi.ts`: `axiosSuccessStatusOnly`.
- `fe0/src/lib/applicationAdminResultApi.ts`: admin-result CRUD + upsert.
- `fe0/src/components/admin/result/ConsideredInitiativesList.tsx`: decided list entry point.
- `be0/src/initiative_db/application_admin_results.py`: DB writes + `initiative.status`.
- `be0/main.py`: `list_applications` Postgres vs file fallback.
- `be0/tests/test_applications_db_integration.py`: Postgres integration tests.
@@ -0,0 +1,871 @@
# Architecture Redesign Proposal
## Overview
This document outlines a comprehensive architectural redesign for the ProfytAI Compliance Management Platform, addressing critical issues identified in the current implementation.
## Design Principles
1. **Separation of Concerns**: Clear boundaries between layers
2. **Dependency Injection**: Loose coupling, easy testing
3. **Domain-Driven Design**: Business logic in domain layer
4. **Security First**: Authentication, authorization, input validation
5. **Testability**: All components should be easily testable
6. **Scalability**: Support for horizontal scaling
7. **Maintainability**: Clear structure, minimal complexity
---
## Proposed Architecture: Layered Architecture with Clean Architecture Principles
```
┌─────────────────────────────────────────────────────────┐
│ Presentation Layer │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ API Routes │ │ Middleware │ │ WebSocket │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Application Layer │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Services │ │ Use Cases │ │ DTOs │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Domain Layer │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Entities │ │ Interfaces │ │ Value Obj. │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Infrastructure Layer │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Repositories │ │ External │ │ Config │ │
│ │ │ │ Services │ │ │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────┘
```
---
## New Directory Structure
```
be0/
├── src/
│ ├── api/ # API Layer
│ │ ├── __init__.py
│ │ ├── dependencies.py # Dependency injection
│ │ ├── middleware/
│ │ │ ├── __init__.py
│ │ │ ├── auth.py # Authentication middleware
│ │ │ ├── cors.py # CORS configuration
│ │ │ ├── rate_limit.py # Rate limiting
│ │ │ └── error_handler.py # Global error handling
│ │ ├── routes/
│ │ │ ├── __init__.py
│ │ │ ├── workflows.py # Workflow endpoints
│ │ │ ├── documents.py # Document endpoints
│ │ │ ├── compliance.py # Compliance endpoints
│ │ │ ├── health.py # Health check
│ │ │ └── auth.py # Authentication endpoints
│ │ └── schemas/ # Request/Response schemas
│ │ ├── __init__.py
│ │ ├── workflow.py
│ │ ├── document.py
│ │ └── compliance.py
│ │
│ ├── application/ # Application Layer
│ │ ├── __init__.py
│ │ ├── services/
│ │ │ ├── __init__.py
│ │ │ ├── workflow_service.py
│ │ │ ├── document_service.py
│ │ │ ├── compliance_service.py
│ │ │ └── ai_service.py
│ │ ├── use_cases/
│ │ │ ├── __init__.py
│ │ │ ├── create_workflow.py
│ │ │ ├── update_workflow_item.py
│ │ │ ├── analyze_compliance.py
│ │ │ └── process_document.py
│ │ └── dto/ # Data Transfer Objects
│ │ ├── __init__.py
│ │ ├── workflow_dto.py
│ │ └── compliance_dto.py
│ │
│ ├── domain/ # Domain Layer
│ │ ├── __init__.py
│ │ ├── entities/
│ │ │ ├── __init__.py
│ │ │ ├── workflow.py
│ │ │ ├── workflow_item.py
│ │ │ ├── document.py
│ │ │ └── compliance_rule.py
│ │ ├── value_objects/
│ │ │ ├── __init__.py
│ │ │ ├── task_status.py
│ │ │ └── workflow_phase.py
│ │ ├── interfaces/ # Repository interfaces
│ │ │ ├── __init__.py
│ │ │ ├── workflow_repository.py
│ │ │ ├── document_repository.py
│ │ │ └── compliance_repository.py
│ │ └── exceptions/
│ │ ├── __init__.py
│ │ ├── domain_exceptions.py
│ │ └── service_exceptions.py
│ │
│ ├── infrastructure/ # Infrastructure Layer
│ │ ├── __init__.py
│ │ ├── database/
│ │ │ ├── __init__.py
│ │ │ ├── connection.py # DB connection pool
│ │ │ ├── repositories/
│ │ │ │ ├── __init__.py
│ │ │ │ ├── workflow_repository_impl.py
│ │ │ │ ├── document_repository_impl.py
│ │ │ │ └── neo4j_repository.py
│ │ │ └── migrations/
│ │ ├── external/
│ │ │ ├── __init__.py
│ │ │ ├── ollama_client.py # Ollama service client
│ │ │ └── storage/
│ │ │ ├── __init__.py
│ │ │ └── file_storage.py # File storage abstraction
│ │ ├── config/
│ │ │ ├── __init__.py
│ │ │ ├── settings.py # Pydantic settings
│ │ │ └── logging_config.py
│ │ └── security/
│ │ ├── __init__.py
│ │ ├── auth.py # JWT, password hashing
│ │ └── permissions.py
│ │
│ ├── core/ # Core utilities
│ │ ├── __init__.py
│ │ ├── logging.py
│ │ ├── exceptions.py
│ │ └── constants.py
│ │
│ └── main.py # Application entry point
├── tests/ # Test suite
│ ├── __init__.py
│ ├── unit/
│ │ ├── domain/
│ │ ├── application/
│ │ └── infrastructure/
│ ├── integration/
│ │ ├── api/
│ │ └── database/
│ ├── fixtures/
│ └── conftest.py
├── alembic/ # Database migrations
│ ├── versions/
│ └── env.py
├── requirements.txt
├── requirements-dev.txt
├── .env.example
└── Dockerfile
```
---
## Key Architectural Components
### 1. API Layer (Presentation)
**Purpose**: Handle HTTP requests, validate input, return responses
**Responsibilities**:
- Route definitions
- Request/Response serialization
- Input validation
- Authentication/Authorization checks
- Error handling
### 2. Application Layer
**Purpose**: Orchestrate business logic, coordinate between domain and infrastructure
**Responsibilities**:
- Use case implementation
- Service orchestration
- DTO transformation
- Transaction management
### 3. Domain Layer
**Purpose**: Core business logic, entities, and business rules
**Responsibilities**:
- Domain entities
- Business rules
- Value objects
- Domain events
- Repository interfaces (abstractions)
### 4. Infrastructure Layer
**Purpose**: External concerns - database, file system, external APIs
**Responsibilities**:
- Database access
- External API clients
- File storage
- Configuration
- Security implementation
---
## Implementation Examples
### Example 1: Configuration Management
```python
# infrastructure/config/settings.py
from typing import List

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Read values from .env; environment variable names are case-insensitive.
    model_config = SettingsConfigDict(env_file=".env", case_sensitive=False)

    # Application
    app_name: str = "ProfytAI Compliance Platform"
    app_version: str = "1.0.0"
    debug: bool = False

    # Server
    host: str = "0.0.0.0"
    port: int = 4402

    # Database (required: no defaults, must come from the environment)
    neo4j_uri: str
    neo4j_user: str
    neo4j_password: str

    # Security
    secret_key: str
    algorithm: str = "HS256"
    access_token_expire_minutes: int = 30
    cors_origins: List[str] = []

    # AI/ML
    ollama_base_url: str = "http://localhost:11434"
    ollama_model: str = "gemma3:27b"
    embedding_model: str = "embeddinggemma:300m"

    # Storage
    upload_dir: str = "./assets/data/uploads"
    max_upload_size: int = 10 * 1024 * 1024  # 10MB

    # Rate Limiting
    rate_limit_per_minute: int = 60


settings = Settings()
```
### Example 2: Domain Entity
```python
# domain/entities/workflow.py
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional
from uuid import UUID, uuid4

from domain.value_objects.task_status import TaskStatus
from domain.value_objects.workflow_phase import WorkflowPhase


def _utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


@dataclass
class WorkflowItem:
    id: int
    task: str
    status: TaskStatus
    requires_approval: bool
    approver: Optional[str] = None
    comment: Optional[str] = None
    updated_by: Optional[str] = None
    updated_at: Optional[datetime] = None


@dataclass
class Workflow:
    id: UUID
    project_name: str
    project_description: Optional[str]
    records_officer_email: Optional[str]
    current_phase: WorkflowPhase
    checklist_items: List[WorkflowItem] = field(default_factory=list)
    completed_items: List[int] = field(default_factory=list)
    pending_approvals: List[str] = field(default_factory=list)
    comments: dict = field(default_factory=dict)
    validation_results: dict = field(default_factory=dict)
    created_at: datetime = field(default_factory=_utc_now)
    updated_at: datetime = field(default_factory=_utc_now)

    def add_item(self, item: WorkflowItem) -> None:
        """Add a checklist item to the workflow."""
        self.checklist_items.append(item)
        self.updated_at = _utc_now()

    def update_item_status(
        self,
        item_id: int,
        status: TaskStatus,
        updated_by: str,
        comment: Optional[str] = None,
    ) -> None:
        """Update the status of a workflow item."""
        item = next((i for i in self.checklist_items if i.id == item_id), None)
        if not item:
            raise ValueError(f"Item {item_id} not found")

        item.status = status
        item.updated_by = updated_by
        item.updated_at = _utc_now()
        if comment:
            item.comment = comment

        # Track completed item ids once, even if the status is set repeatedly.
        if status == TaskStatus.COMPLETED and item_id not in self.completed_items:
            self.completed_items.append(item_id)
        self.updated_at = _utc_now()

    def can_advance_phase(self) -> bool:
        """Check if workflow can advance to next phase."""
        all_completed = all(
            item.status == TaskStatus.COMPLETED
            for item in self.checklist_items
        )
        no_pending_approvals = len(self.pending_approvals) == 0
        return all_completed and no_pending_approvals

    @property
    def completion_percentage(self) -> float:
        """Calculate completion percentage."""
        if not self.checklist_items:
            return 0.0
        completed = len(self.completed_items)
        total = len(self.checklist_items)
        return (completed / total) * 100
```
### Example 3: Repository Interface (Domain)
```python
# domain/interfaces/workflow_repository.py
from abc import ABC, abstractmethod
from typing import List, Optional
from uuid import UUID

from domain.entities.workflow import Workflow


class IWorkflowRepository(ABC):
    """Repository interface for workflow persistence."""

    @abstractmethod
    async def create(self, workflow: Workflow) -> Workflow:
        """Create a new workflow."""
        pass

    @abstractmethod
    async def get_by_id(self, workflow_id: UUID) -> Optional[Workflow]:
        """Get workflow by ID."""
        pass

    @abstractmethod
    async def get_all(self, skip: int = 0, limit: int = 100) -> List[Workflow]:
        """Get all workflows with pagination."""
        pass

    @abstractmethod
    async def update(self, workflow: Workflow) -> Workflow:
        """Update an existing workflow."""
        pass

    @abstractmethod
    async def delete(self, workflow_id: UUID) -> bool:
        """Delete a workflow."""
        pass
```
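Because services depend only on this interface, a dict-backed fake is enough for unit tests. The sketch below is self-contained, so it uses a simplified `Workflow` stand-in rather than the real entity; `InMemoryWorkflowRepository` and `demo` are hypothetical names, and in the real tree the fake would subclass `IWorkflowRepository`.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List, Optional
from uuid import UUID, uuid4


@dataclass
class Workflow:
    """Simplified stand-in for domain.entities.workflow.Workflow."""
    id: UUID
    project_name: str


class InMemoryWorkflowRepository:
    """Dict-backed fake exposing the same methods as IWorkflowRepository."""

    def __init__(self) -> None:
        self._rows: Dict[UUID, Workflow] = {}

    async def create(self, workflow: Workflow) -> Workflow:
        self._rows[workflow.id] = workflow
        return workflow

    async def get_by_id(self, workflow_id: UUID) -> Optional[Workflow]:
        return self._rows.get(workflow_id)

    async def get_all(self, skip: int = 0, limit: int = 100) -> List[Workflow]:
        return list(self._rows.values())[skip : skip + limit]

    async def update(self, workflow: Workflow) -> Workflow:
        if workflow.id not in self._rows:
            raise KeyError(workflow.id)
        self._rows[workflow.id] = workflow
        return workflow

    async def delete(self, workflow_id: UUID) -> bool:
        # True only if a row was actually removed.
        return self._rows.pop(workflow_id, None) is not None


async def demo() -> bool:
    """Create, fetch, and delete one workflow through the fake."""
    repo = InMemoryWorkflowRepository()
    created = await repo.create(Workflow(id=uuid4(), project_name="Demo"))
    found = await repo.get_by_id(created.id)
    return found is created and await repo.delete(created.id)
```

Running `asyncio.run(demo())` exercises the round trip without a database.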
### Example 4: Repository Implementation (Infrastructure)
```python
# infrastructure/database/repositories/workflow_repository_impl.py
from typing import List, Optional
from uuid import UUID

# The queries below are Cypher, so the session comes from the Neo4j driver
# rather than SQLAlchemy.
from neo4j import AsyncSession

from domain.entities.workflow import Workflow
from domain.interfaces.workflow_repository import IWorkflowRepository
from infrastructure.database.connection import get_db_session


class WorkflowRepository(IWorkflowRepository):
    """Neo4j implementation of workflow repository."""

    def __init__(self, session: AsyncSession):
        self.session = session

    async def create(self, workflow: Workflow) -> Workflow:
        """Create workflow in Neo4j."""
        query = """
        CREATE (w:Workflow {
            id: $id,
            project_name: $project_name,
            project_description: $project_description,
            records_officer_email: $records_officer_email,
            current_phase: $current_phase,
            created_at: $created_at,
            updated_at: $updated_at
        })
        RETURN w
        """
        # Implementation details...
        return workflow

    async def get_by_id(self, workflow_id: UUID) -> Optional[Workflow]:
        """Get workflow by ID from Neo4j."""
        query = """
        MATCH (w:Workflow {id: $workflow_id})
        OPTIONAL MATCH (w)-[:HAS_ITEM]->(i:WorkflowItem)
        RETURN w, collect(i) as items
        """
        # Implementation details...
        pass

    # ... other methods
```
### Example 5: Service Layer
```python
# application/services/workflow_service.py
from typing import List, Optional
from uuid import UUID, uuid4

from domain.entities.workflow import Workflow, WorkflowItem
from domain.interfaces.workflow_repository import IWorkflowRepository
from domain.value_objects.workflow_phase import WorkflowPhase
from domain.value_objects.task_status import TaskStatus
from domain.exceptions.domain_exceptions import WorkflowNotFoundError


class WorkflowService:
    """Service for workflow business logic."""

    def __init__(self, workflow_repository: IWorkflowRepository):
        self.workflow_repository = workflow_repository

    async def create_workflow(
        self,
        project_name: str,
        project_description: Optional[str],
        records_officer_email: Optional[str],
    ) -> Workflow:
        """Create a new workflow with initial phase."""
        workflow = Workflow(
            id=uuid4(),  # UUID() without arguments raises TypeError
            project_name=project_name,
            project_description=project_description,
            records_officer_email=records_officer_email,
            current_phase=WorkflowPhase.CONCEPT_DEVELOPMENT,
        )
        # Initialize Phase 1 items
        phase1_items = self._get_phase1_items()
        for item in phase1_items:
            workflow.add_item(item)
        return await self.workflow_repository.create(workflow)

    async def get_workflow(self, workflow_id: UUID) -> Workflow:
        """Get workflow by ID."""
        workflow = await self.workflow_repository.get_by_id(workflow_id)
        if not workflow:
            raise WorkflowNotFoundError(f"Workflow {workflow_id} not found")
        return workflow

    async def update_workflow_item(
        self,
        workflow_id: UUID,
        item_id: int,
        status: TaskStatus,
        updated_by: str,
        comment: Optional[str] = None,
    ) -> Workflow:
        """Update a workflow item."""
        workflow = await self.get_workflow(workflow_id)
        workflow.update_item_status(item_id, status, updated_by, comment)
        return await self.workflow_repository.update(workflow)

    async def advance_workflow(self, workflow_id: UUID) -> Workflow:
        """Advance workflow to next phase."""
        workflow = await self.get_workflow(workflow_id)
        if not workflow.can_advance_phase():
            raise ValueError("Cannot advance: Phase requirements not met")
        # Advance to next phase logic...
        return await self.workflow_repository.update(workflow)

    def _get_phase1_items(self) -> List[WorkflowItem]:
        """Get Phase 1 checklist items."""
        return [
            WorkflowItem(
                id=1,
                task="Include Records Officer in system design process",
                status=TaskStatus.PENDING,
                requires_approval=True,
                approver="Records Officer",
            ),
            # ... more items
        ]
```
### Example 6: API Route with Dependency Injection
```python
# api/routes/workflows.py
from fastapi import APIRouter, Depends, HTTPException, status
from uuid import UUID
from typing import List

from api.schemas.workflow import (
    WorkflowCreateRequest,
    WorkflowResponse,
    WorkflowItemUpdateRequest
)
from application.services.workflow_service import WorkflowService
from api.dependencies import get_workflow_service, get_current_user
from domain.value_objects.task_status import TaskStatus
# Needed by the `except WorkflowNotFoundError` handlers below
from domain.exceptions.domain_exceptions import WorkflowNotFoundError

router = APIRouter(prefix="/workflows", tags=["workflows"])


@router.post("", response_model=WorkflowResponse, status_code=status.HTTP_201_CREATED)
async def create_workflow(
    request: WorkflowCreateRequest,
    workflow_service: WorkflowService = Depends(get_workflow_service),
    current_user = Depends(get_current_user)
):
    """Create a new workflow."""
    try:
        workflow = await workflow_service.create_workflow(
            project_name=request.project_name,
            project_description=request.project_description,
            records_officer_email=request.records_officer_email
        )
        return WorkflowResponse.from_entity(workflow)
    except Exception as e:
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail=str(e)
        )


@router.get("/{workflow_id}", response_model=WorkflowResponse)
async def get_workflow(
    workflow_id: UUID,
    workflow_service: WorkflowService = Depends(get_workflow_service),
    current_user = Depends(get_current_user)
):
    """Get workflow by ID."""
    try:
        workflow = await workflow_service.get_workflow(workflow_id)
        return WorkflowResponse.from_entity(workflow)
    except WorkflowNotFoundError:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Workflow not found"
        )


@router.put("/{workflow_id}/items", response_model=WorkflowResponse)
async def update_workflow_item(
    workflow_id: UUID,
    request: WorkflowItemUpdateRequest,
    workflow_service: WorkflowService = Depends(get_workflow_service),
    current_user = Depends(get_current_user)
):
    """Update a workflow item."""
    try:
        workflow = await workflow_service.update_workflow_item(
            workflow_id=workflow_id,
            item_id=request.item_id,
            status=TaskStatus(request.status),
            updated_by=current_user.email,
            comment=request.comment
        )
        return WorkflowResponse.from_entity(workflow)
    except WorkflowNotFoundError:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Workflow not found"
        )
```
### Example 7: Dependency Injection Setup
```python
# api/dependencies.py
from fastapi import Depends
from fastapi.security import OAuth2PasswordBearer

from infrastructure.database.connection import get_db_session
from infrastructure.database.repositories.workflow_repository_impl import WorkflowRepository
from application.services.workflow_service import WorkflowService
from infrastructure.external.ollama_client import OllamaClient
from application.services.compliance_service import ComplianceService
from infrastructure.config.settings import settings

# The token URL is an assumption; point it at the actual login route.
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="auth/token")


# Repository dependencies
async def get_workflow_repository():
    async for session in get_db_session():
        yield WorkflowRepository(session)


# Service dependencies
def get_workflow_service(
    workflow_repo: WorkflowRepository = Depends(get_workflow_repository)
) -> WorkflowService:
    return WorkflowService(workflow_repo)


def get_compliance_service() -> ComplianceService:
    ollama_client = OllamaClient(
        base_url=settings.ollama_base_url,
        model=settings.ollama_model
    )
    return ComplianceService(ollama_client)


# Auth dependencies
async def get_current_user(
    token: str = Depends(oauth2_scheme)
):
    # JWT validation logic
    pass
```
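As a minimal sketch of why the service is handed a repository through dependency injection, the same service logic can run against an in-memory stand-in instead of Neo4j. The trimmed-down `Workflow` dataclass, `WorkflowRepositoryProtocol`, and `InMemoryWorkflowRepository` below are hypothetical illustrations, not code from this repo.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Dict, Optional, Protocol
from uuid import UUID, uuid4


@dataclass
class Workflow:
    """Trimmed-down stand-in for domain.entities.workflow.Workflow."""
    project_name: str
    id: UUID = field(default_factory=uuid4)


class WorkflowRepositoryProtocol(Protocol):
    """The only surface the service needs from any repository."""
    async def create(self, workflow: Workflow) -> Workflow: ...
    async def get_by_id(self, workflow_id: UUID) -> Optional[Workflow]: ...


class InMemoryWorkflowRepository:
    """Hypothetical dict-backed repository, e.g. for unit tests."""

    def __init__(self) -> None:
        self._rows: Dict[UUID, Workflow] = {}

    async def create(self, workflow: Workflow) -> Workflow:
        self._rows[workflow.id] = workflow
        return workflow

    async def get_by_id(self, workflow_id: UUID) -> Optional[Workflow]:
        return self._rows.get(workflow_id)


class WorkflowService:
    """Depends on the protocol, so a Neo4j or dict-backed class both fit."""

    def __init__(self, repo: WorkflowRepositoryProtocol) -> None:
        self.repo = repo

    async def create_workflow(self, project_name: str) -> Workflow:
        return await self.repo.create(Workflow(project_name=project_name))


async def demo() -> bool:
    """Create through the service, then read back through the repository."""
    repo = InMemoryWorkflowRepository()
    service = WorkflowService(repo)
    created = await service.create_workflow("Test Project")
    fetched = await repo.get_by_id(created.id)
    return fetched is created
```

Swapping the storage backend then only changes which class the dependency provider constructs; the service code stays untouched.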
### Example 8: Main Application Setup
```python
# main.py
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from infrastructure.config.settings import settings
from infrastructure.config.logging_config import setup_logging
from api.middleware.error_handler import setup_exception_handlers
from api.middleware.cors import setup_cors
from api.routes import workflows, documents, compliance, health, auth

# Setup logging
setup_logging()

# Create FastAPI app
app = FastAPI(
    title=settings.app_name,
    version=settings.app_version,
    debug=settings.debug
)

# Setup middleware
setup_cors(app, settings.cors_origins)
setup_exception_handlers(app)

# Include routers
app.include_router(auth.router)
app.include_router(workflows.router)
app.include_router(documents.router)
app.include_router(compliance.router)
app.include_router(health.router)


@app.on_event("startup")
async def startup_event():
    """Initialize services on startup."""
    # Initialize database connections
    # Initialize external services
    pass


@app.on_event("shutdown")
async def shutdown_event():
    """Cleanup on shutdown."""
    # Close database connections
    # Cleanup resources
    pass
```
---
## Security Improvements
### 1. Authentication & Authorization
```python
# infrastructure/security/auth.py
from datetime import datetime, timedelta, timezone
from typing import Optional

from jose import JWTError, jwt
from passlib.context import CryptContext

from infrastructure.config.settings import settings

pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")


def verify_password(plain_password: str, hashed_password: str) -> bool:
    """Verify a password against a hash."""
    return pwd_context.verify(plain_password, hashed_password)


def get_password_hash(password: str) -> str:
    """Hash a password."""
    return pwd_context.hash(password)


def create_access_token(data: dict, expires_delta: Optional[timedelta] = None) -> str:
    """Create JWT access token."""
    to_encode = data.copy()
    # Timezone-aware UTC; datetime.utcnow() is deprecated since Python 3.12.
    now = datetime.now(timezone.utc)
    if expires_delta:
        expire = now + expires_delta
    else:
        expire = now + timedelta(
            minutes=settings.access_token_expire_minutes
        )
    to_encode.update({"exp": expire})
    encoded_jwt = jwt.encode(
        to_encode,
        settings.secret_key,
        algorithm=settings.algorithm
    )
    return encoded_jwt
```
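The counterpart of `create_access_token` is the check that `get_current_user` has to perform. With python-jose that would presumably be a `jwt.decode(token, settings.secret_key, algorithms=[settings.algorithm])` call. The sketch below is not that library; it is a stdlib-only illustration of what an HS256 signature and `exp` check involve, and the helper names `sign_hs256` and `verify_hs256` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str, secret: str) -> bytes:
    return hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()


def sign_hs256(payload: dict, secret: str) -> str:
    """Build header.payload.signature for an HS256 token."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_signature(signing_input, secret))}"


def verify_hs256(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the payload, or raise ValueError on tampering or expiry."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, sig_b64 = parts
    # Pin the algorithm instead of trusting whatever the header claims.
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = _signature(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    # "exp" is seconds since the epoch; reject tokens at or past it.
    if "exp" in payload and current >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

In the route layer, a failed verification would typically be translated into an HTTP 401 response rather than surfacing the `ValueError` directly.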
### 2. CORS Configuration
```python
# api/middleware/cors.py
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from typing import List


def setup_cors(app: FastAPI, allowed_origins: List[str]):
    """Configure CORS middleware."""
    app.add_middleware(
        CORSMiddleware,
        allow_origins=allowed_origins,  # Specific origins, not "*"
        allow_credentials=True,
        allow_methods=["GET", "POST", "PUT", "DELETE", "PATCH"],
        allow_headers=["Content-Type", "Authorization"],
    )
```
### 3. Rate Limiting
```python
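# api/middleware/rate_limit.py
from fastapi import APIRouter, FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded

limiter = Limiter(key_func=get_remote_address)
router = APIRouter(prefix="/workflows", tags=["workflows"])


def setup_rate_limiting(app: FastAPI) -> None:
    """Attach the limiter and its 429 handler to the app."""
    app.state.limiter = limiter
    app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)


@router.post("")
@limiter.limit("10/minute")  # 10 requests per minute per client IP
async def create_workflow(request: Request):
    # slowapi needs the `request: Request` parameter on limited routes.
    # Implementation
    pass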
```
---
## Testing Structure
```python
# tests/conftest.py
import pytest
from fastapi.testclient import TestClient
from main import app
from infrastructure.database.connection import get_test_db


@pytest.fixture
def client():
    return TestClient(app)


@pytest.fixture
def test_db():
    # Setup test database
    yield
    # Teardown


# tests/unit/application/services/test_workflow_service.py
import pytest
from uuid import UUID
from application.services.workflow_service import WorkflowService
from domain.entities.workflow import Workflow
from domain.value_objects.workflow_phase import WorkflowPhase


class MockWorkflowRepository:
    """Minimal in-memory stand-in; only the method this test exercises."""

    async def create(self, workflow: Workflow) -> Workflow:
        return workflow


@pytest.mark.asyncio
async def test_create_workflow():
    # Mock repository
    mock_repo = MockWorkflowRepository()
    service = WorkflowService(mock_repo)
    workflow = await service.create_workflow(
        project_name="Test Project",
        project_description="Test Description",
        records_officer_email="test@example.com"
    )
    assert workflow.project_name == "Test Project"
    assert workflow.current_phase == WorkflowPhase.CONCEPT_DEVELOPMENT
```
---
## Migration Strategy
### Phase 1: Foundation (Week 1-2)
1. Create new directory structure
2. Set up configuration management
3. Implement dependency injection
4. Set up database connection
### Phase 2: Domain Layer (Week 3)
1. Create domain entities
2. Define repository interfaces
3. Implement value objects
### Phase 3: Infrastructure (Week 4)
1. Implement repository classes
2. Set up external service clients
3. Configure security
### Phase 4: Application Layer (Week 5)
1. Create service classes
2. Implement use cases
3. Create DTOs
### Phase 5: API Layer (Week 6)
1. Create route modules
2. Implement middleware
3. Set up error handling
### Phase 6: Testing & Migration (Week 7-8)
1. Write unit tests
2. Write integration tests
3. Migrate existing endpoints
4. Deploy and monitor
---
## Benefits of This Architecture
1. **Testability**: Each layer can be tested independently
2. **Maintainability**: Clear separation of concerns
3. **Scalability**: Easy to add new features
4. **Security**: Built-in security at every layer
5. **Flexibility**: Easy to swap implementations (e.g., different databases)
6. **Team Collaboration**: Different teams can work on different layers
---
## Next Steps
1. Review and approve this architecture
2. Create detailed implementation plan
3. Set up project structure
4. Begin Phase 1 implementation
5. Establish coding standards and review process
@@ -0,0 +1,94 @@
# Architecture Redesign Summary
## Quick Overview
This document provides a quick reference for the architectural improvements proposed for the ProfytAI Compliance Management Platform.
## Key Improvements
### 1. **Layered Architecture**
- **API Layer**: HTTP request handling, validation, serialization
- **Application Layer**: Business logic orchestration, use cases
- **Domain Layer**: Core entities, business rules, interfaces
- **Infrastructure Layer**: Database, external services, configuration
### 2. **Dependency Injection**
- Services depend on interfaces, not implementations
- Easy to test with mocks
- Flexible to swap implementations
### 3. **Configuration Management**
- Type-safe settings with Pydantic
- Environment variable support
- Centralized configuration
### 4. **Security**
- JWT authentication
- CORS with specific origins
- Rate limiting
- Input validation at every layer
### 5. **Database Integration**
- Repository pattern
- Neo4j integration ready
- Migration support
## File Structure Comparison
### Before (Current)
```
be0/
├── main.py (545 lines - everything in one file)
├── src/
│ ├── compliance_verifier.py
│ └── utils.py
```
### After (Proposed)
```
be0/
├── main.py (clean entry point)
├── src/
│ ├── api/ (routes, middleware, schemas)
│ ├── application/ (services, use cases)
│ ├── domain/ (entities, interfaces)
│ ├── infrastructure/ (database, external, config)
│ └── core/ (utilities)
```
## Migration Checklist
- [ ] Create new directory structure
- [ ] Set up configuration management
- [ ] Implement domain entities
- [ ] Create repository interfaces
- [ ] Implement repository classes
- [ ] Create service layer
- [ ] Split routes into modules
- [ ] Add authentication/authorization
- [ ] Implement error handling
- [ ] Add tests
- [ ] Update documentation
## Benefits
1. **Maintainability**: Clear structure, easy to find code
2. **Testability**: Each layer can be tested independently
3. **Scalability**: Easy to add new features
4. **Security**: Built-in at every layer
5. **Team Collaboration**: Different teams can work on different layers
## Next Steps
1. Review `ARCHITECTURE_REDESIGN.md` for detailed design
2. Review code examples in `be0/src/`
3. Plan migration timeline
4. Start with Phase 1 (Foundation)
## Questions?
Refer to the detailed `ARCHITECTURE_REDESIGN.md` document for:
- Complete architecture explanation
- Code examples
- Migration strategy
- Best practices
@@ -0,0 +1,139 @@
# Fix Chat Assistant 500 Error
## Issue
Getting 500 Internal Server Error when calling `/api/v1/chat` endpoint.
## Root Causes
1. **Model Name Mismatch** ✅ FIXED
- Code was using `gemma3:27b` but entrypoint pulls `gemma3:270M`
- **Fixed**: Updated code to use `gemma3:270M`
2. **Ollama Not Running**
- Ollama service might not be started in the container
- Network connectivity issues
3. **Model Not Available**
- Model might not be pulled yet
- Model name incorrect
## Solutions
### Solution 1: Restart the Container
```bash
# Stop and restart the backend container
docker-compose down
docker-compose up -d be0
# Wait for Ollama to start (check logs)
docker-compose logs -f be0
```
### Solution 2: Check Ollama Status
```bash
# Check if container is running
docker ps | grep be0
# Check Ollama inside container
docker exec be0 ollama list
# If Ollama is not running, start it
docker exec -d be0 ollama serve
```
### Solution 3: Pull the Model
```bash
# Pull the required model
docker exec be0 ollama pull gemma3:270M
# Verify it's available
docker exec be0 ollama list | grep gemma3
```
### Solution 4: Test the Health Endpoint
```bash
# Check health endpoint (includes Ollama status)
curl http://localhost:4402/health
# Should show:
# {
# "status": "healthy",
# "ollama": {
# "status": "connected",
# "available_models": ["gemma3:270M", ...]
# }
# }
```
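For scripted checks, the same response can be inspected programmatically. This is a minimal sketch that assumes the `/health` payload has exactly the shape shown above; `diagnose_health` is a hypothetical helper, not part of the backend.

```python
from typing import Any, Dict, List


def diagnose_health(
    payload: Dict[str, Any], required_model: str = "gemma3:270M"
) -> List[str]:
    """Return human-readable problems found in a /health response."""
    problems: List[str] = []
    if payload.get("status") != "healthy":
        problems.append(f"backend status is {payload.get('status')!r}")
    # Treat a missing "ollama" section the same as a disconnected one.
    ollama = payload.get("ollama") or {}
    if ollama.get("status") != "connected":
        problems.append("Ollama is not connected")
    models = ollama.get("available_models") or []
    if required_model not in models:
        problems.append(f"model {required_model} is not pulled")
    return problems
```

An empty list means the three root causes listed earlier are ruled out; each remaining entry maps onto one of the solutions in this section.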
### Solution 5: Check Backend Logs
```bash
# View recent logs
docker-compose logs be0 | tail -50
# View ChatAssistant specific logs
tail -f be0/logs/ChatAssistant.log
```
## Quick Fix Commands
```bash
# 1. Restart the backend container
docker-compose restart be0
# 2. Check Ollama
docker exec be0 ollama list
# 3. Test health
curl http://localhost:4402/health
# 4. Test chat endpoint
curl -X POST http://localhost:4402/api/v1/chat \
-H "Content-Type: application/json" \
-d '{"message": "Hello"}'
```
## What Was Fixed
1. ✅ Model name changed from `gemma3:27b` to `gemma3:270M`
2. ✅ Added better error handling with specific error messages
3. ✅ Added Ollama connection check on initialization
4. ✅ Added health endpoint with Ollama status
5. ✅ Improved logging for debugging
## Expected Behavior After Fix
1. Container starts and Ollama service runs
2. Model `gemma3:270M` is available
3. Health endpoint shows Ollama as "connected"
4. Chat endpoint returns 200 with AI response
## If Still Not Working
1. **Check container logs:**
```bash
docker-compose logs be0
```
2. **Check if Ollama is accessible:**
```bash
docker exec be0 curl http://localhost:11434/api/tags
```
3. **Manually start Ollama:**
```bash
docker exec -d be0 ollama serve
sleep 2
docker exec be0 ollama list
```
4. **Rebuild container:**
```bash
docker-compose down
docker-compose build be0
docker-compose up be0
```
@@ -0,0 +1,41 @@
# HANDOFF — SciAgent / ImageHub
_Updated: 2026-06-29 (session-end — Gitea Actions CI/CD pipeline) · branch `main` · **40 commits LOCAL/unpushed** · 🟢 HMW-mode OFF_
## TL;DR
- Stood up the repo's **first CI/CD**: **Gitea Actions** on the self-hosted box `103.149.170.102:3000` (Gitea 1.26.2). Previously deploy was manual Docker Compose, **no CI**.
- Pipeline `.gitea/workflows/ci-cd.yml` = **backend** (per-file pytest + throwaway Postgres) · **frontend** (typecheck/build/vitest across workspaces) · **deploy** (host-mode `docker compose up -d` on push to main). Local commit `c2e869b`.
- **One hard gate left: NO act_runner is installed** → all runs queue, nothing executes/deploys. User must run `scripts/setup-gitea-runner.sh` on the box (I have no SSH there).
## Shipped this session — commit `c2e869b` (local only)
- **`.gitea/workflows/ci-cd.yml`** — 3 jobs. backend: `pip install be0/requirements-dev.txt` then **pytest PER FILE** (loop) vs a `postgres:16-alpine` service (per-file avoids asyncpg cross-module event-loop contamination, [[be0-test-harness-reality]]). frontend: node 20, `npm ci`, `npm run typecheck` + `build`, `npm test --workspaces --if-present` (vitest in shared/investigator/publisher). deploy (`runs-on: deploy`, host): clone/reset **persistent `/srv/sciagent`** (NOT ephemeral — prod compose bind-mounts `./assets/minio-data`+`./be0`), write `.env` from secret `PROD_ENV`, `deploy-prod.sh --no-pull` + `check-prod-stack.sh`.
- **`be0/requirements-dev.txt`** — pytest + pytest-asyncio (neither was pinned).
- **`scripts/setup-gitea-runner.sh`** — act_runner 0.2.11 bootstrap (Docker+compose+node+systemd, labels `ci:docker://catthehacker/ubuntu:act-22.04,deploy:host`). ⚠️ runner registration token baked in (already public on Gitea mirror; rotatable).
- **Done via Gitea admin API (keychain user `oneness`, is_admin):** enabled Actions unit · stored secret `PROD_ENV` (valid prod `.env`, `PUBLIC_HOST=103.149.170.102`, fresh hex PG/MinIO pw + b64 JWT, `AUTH_MAIL_LOG_ONLY=1` placeholder) · minted runner token · pushed workflow+reqs to Gitea (workflow `state: active`).
- **Mirror refreshed** to current code: Gitea `main` now a **1212-file clean snapshot** (was 2026-06-14 / 965 files; now incl. all 4 monorepo FEs + the workflow). Leak-checked clean. Detail: [[gitea-cicd-pipeline]], [[gitea-mirror-and-tracked-secrets]].
## Current state
- Migrations 001…027 · 6 be0 routers · monorepo 4 FEs (`fe0` legacy standalone) + `@ump/shared`.
- Gitea workflow active; **runners online: 0**. PROD_ENV set; SMTP unfilled.
- Verify this session = artifact-level only (bash -n, pip syntax, YAML parse) — **no app code changed**, so BE/FE suites not re-run.
## Next — P1 (start here)
1. **Install the runner** (user, needs root on the box — I have no SSH): `curl -fsSL http://103.149.170.102:3000/tlam89/sciagent/raw/branch/main/scripts/setup-gitea-runner.sh | sudo bash`. Then ping me → I verify it's Online (API) + watch the first run (backend→frontend→deploy), report PASS/FAIL with logs.
2. **Fill SMTP** in `PROD_ENV` secret (else OTP/reset mail only logs). Give me `SMTP_*` → I update the secret via API.
3. (Decision) fe0 vs frontend_user port role — deferred this session (fe0 NOT deployed; user confirmed it was a slip).
## Open threads / risks
- 🔴 **NO runner = pipeline does nothing.** This is the blocker for all execution/deploy.
- 🔴 **40 commits LOCAL/unpushed to origin** — push to GitHub origin BLOCKED (history has `.env` secrets + 1.8 GB PII `assets/` → rotate + `git filter-repo` first). Gitea mirror is current; origin is not. Do NOT `git push origin`.
- First deploy = **fresh empty stack** (new Postgres via initdb migrations, empty MinIO) — no dev data carried over (assets/ excluded by design).
- Caught near-miss (documented): `git add -A` + `:(exclude)assets` did NOT exclude → leak-check stopped it pre-push. Reliable mirror method now in [[gitea-mirror-and-tracked-secrets]].
- CLAUDE.md still STALE (says "no CI"; says migr 014 / 3 routers / `fe0`).
## Quick commands
- Gitea API (admin): `CRED=$(printf 'protocol=http\nhost=103.149.170.102:3000\n\n'|git credential fill); U=…;P=…` then `curl -u $U:$P http://103.149.170.102:3000/api/v1/repos/tlam89/sciagent/actions/runners` (check online) / `…/actions/tasks` (runs).
- Runner install (on box, root): see P1 #1.
- Re-mint runner token: `curl -s -X POST -u $U:$P http://103.149.170.102:3000/api/v1/repos/tlam89/sciagent/actions/runners/registration-token`.
## Reality flags
- CI lives on **Gitea** (`103.149.170.102:3000`), NOT GitHub. Push to Gitea = clean orphan snapshot convention (excl `.env`/`assets`/`.claude`/`CLAUDE.md`). Origin (GitHub) push stays blocked.
- **Push ≠ deploy.** Even with the runner up, deploy only fires on push to Gitea `main`. This session = local commit only; nothing deployed, nothing pushed to origin.
- 🟢 HMW-mode OFF. No sub-agents spawned this session (main-agent + API + git only).
@@ -0,0 +1,985 @@
# Implementation Guide — `sang-kien-pdf`
A step-by-step walkthrough of how the Sáng kiến PDF + DOCX template generators are built. Read this if you want to understand **why** each piece exists, **how** to modify the layout, or **how** to port the same approach to a different government form.
---
## Table of contents
1. [The problem we're solving](#1-the-problem-were-solving)
2. [Architecture overview](#2-architecture-overview)
3. [Tech stack and rationale](#3-tech-stack-and-rationale)
4. [Project setup](#4-project-setup-from-scratch)
5. [Implementing the PDF generator](#5-implementing-the-pdf-generator)
- 5.1 [TypeScript data types](#51-typescript-data-types)
- 5.2 [Font registration](#52-font-registration)
- 5.3 [Shared styles](#53-shared-styles)
- 5.4 [Reusable components](#54-reusable-components)
- 5.5 [Page components](#55-page-components)
- 5.6 [Top-level Document](#56-top-level-document)
- 5.7 [Server-side render helper](#57-server-side-render-helper)
6. [Implementing the DOCX template generator](#6-implementing-the-docx-template-generator)
- 6.1 [The Jinja-in-DOCX strategy](#61-the-jinja-in-docx-strategy)
- 6.2 [The 3-row table loop trick](#62-the-3-row-table-loop-trick)
- 6.3 [Multi-section layout](#63-multi-section-layout)
- 6.4 [Building paragraphs and tables](#64-building-paragraphs-and-tables)
7. [Layout calibration](#7-layout-calibration-matching-the-standard)
8. [Verification workflow](#8-verification-workflow)
9. [Common modifications](#9-common-modifications)
10. [Troubleshooting](#10-troubleshooting)
11. [Porting to a different form](#11-porting-to-a-different-form)
---
## 1. The problem we're solving
The "Sáng kiến" application is a Vietnamese government form (Đại học Y Dược TP.HCM) with six sections: a cover page (Trang bìa), Mẫu số 01 through 04, and Bản cam kết. Every applicant fills out the same skeleton with their own data.
Two real-world workflows need to be supported:
1. **Programmatic PDF generation** — a web service receives JSON, returns a printable PDF. No human edits the file before printing.
2. **Word-based filling** — an admin opens a `.docx` template in Word, types into it (or uses `docxtpl`/`Carbone`/etc. to merge JSON), and prints.
Both outputs must look identical to the official reference document (`Sang_kien_SOP_dong_vat`). The data shape (`data_blank.json`) is fixed by an existing system upstream and must not change.
The trick is keeping the two generators in sync — same layout, same data fields — while staying within each format's idioms.
---
## 2. Architecture overview
```
                 ┌────────────────────┐
                 │     data.json      │  ← source of truth (data_blank.json shape)
                 └──────────┬─────────┘
           ┌────────────────┴────────────────┐
           ▼                                 ▼
┌──────────────────────┐        ┌─────────────────────────┐
│  React-PDF pipeline  │        │  docx + docxtpl path    │
│                      │        │                         │
│  data → React tree   │        │  build-docx-template.ts │
│       → PDF buffer   │        │  generates .docx with   │
│                      │        │  {{ }} placeholders     │
│                      │        │         ↓               │
│                      │        │  docxtpl.render(data)   │
│                      │        │  → filled .docx         │
└──────────┬───────────┘        └────────────┬────────────┘
           │                                 │
           ▼                                 ▼
      filled.pdf                        filled.docx
```
The PDF path uses **runtime composition** — a React component receives data as props and returns a tree of `<Page>`/`<View>`/`<Text>` elements. The renderer turns that into a PDF buffer.
The DOCX path uses **template-based composition** — a build script (`build-docx-template.ts`) produces a `.docx` file *once*, with placeholder strings like `{{ mau_01.mo_dau }}` baked into the document body. At runtime, `docxtpl` (Python) or any other Jinja-aware OOXML tool reads that `.docx`, finds the placeholders, and replaces them with values from the JSON.
Both pipelines read **the same TypeScript types and JSON files**, so adding a new field requires touching both sides — but the field name lives in exactly one place: `src/types.ts`.
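To make the runtime fill step of the DOCX path concrete, the sketch below shows dotted-path placeholder substitution on a plain string. It is a stdlib stand-in for illustration only: `docxtpl` itself works on the XML inside the `.docx` and supports full Jinja syntax (typically via `DocxTemplate(path)`, then `.render(context)` and `.save(out_path)`). The helper names `lookup` and `fill` are hypothetical.

```python
import re
from typing import Any, Mapping

# Matches {{ some.dotted.path }} with optional inner whitespace.
_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def lookup(data: Mapping[str, Any], dotted: str) -> Any:
    """Resolve 'mau_01.mo_dau' against nested dicts; KeyError if absent."""
    value: Any = data
    for key in dotted.split("."):
        value = value[key]
    return value


def fill(text: str, data: Mapping[str, Any]) -> str:
    """Replace every {{ path }} in text with the matching value from data."""
    return _PLACEHOLDER.sub(lambda m: str(lookup(data, m.group(1))), text)
```

Because the placeholder name is just a path into the JSON, renaming a field in the data shape means updating the template string as well, which is the synchronization cost noted above.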
---
## 3. Tech stack and rationale
| Concern | Choice | Why |
|---|---|---|
| PDF rendering | `@react-pdf/renderer` v4 | Component-based, server- and browser-compatible. Uses Yoga for flexbox layout. Same API as React, so layouts compose like UI code. |
| Vietnamese font | `@expo-google-fonts/tinos` | Tinos is a metric-equivalent of Times New Roman (Apache 2.0) with the full Latin Extended Additional range — needed for `ư ơ ầ ậ ọ ặ` etc. The `@expo-google-fonts/*` packages ship actual `.ttf` files (most other font packages ship `.woff/.woff2`, which `@react-pdf/renderer` can't read). |
| DOCX generation | `docx` v9 (npm) | Object-model API: build paragraphs, tables, sections in TypeScript, then `Packer.toBuffer()` produces a valid `.docx`. Maintained, typed, stable. |
| Templating engine | `docxtpl` (Python) | The most popular Jinja-style DOCX templater. Recognizes `{{ var }}`, `{% if %}`, and crucially `{%tr for %}` for table-row loops. Compatible templates work in `docx-templates` (JS) and Carbone too. |
| TypeScript | 5.4 | Catches type errors at build time and gives autocompletion across all the data fields. |
| Test rendering | LibreOffice (`soffice`) | Used to convert `.docx` to `.pdf` so we can visually diff against the reference document. |
**Why not a pure HTML-to-PDF approach (Puppeteer)?** It works, but bundle size is huge and rendering is non-deterministic across machines. React-PDF gives byte-stable output.
**Why not just generate the DOCX and convert it to PDF?** That would solve the layout-sync problem but couples PDF generation to a heavy toolchain (LibreOffice). React-PDF runs in pure Node.js and works inside serverless environments.
---
## 4. Project setup from scratch
```bash
mkdir sang-kien-pdf && cd sang-kien-pdf
npm init -y
# Runtime dependencies
npm install @react-pdf/renderer react @expo-google-fonts/tinos docx
# Dev dependencies
npm install -D typescript ts-node @types/react @types/node
```
Create `tsconfig.json`:
```json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "lib": ["ES2020", "DOM"],
    "jsx": "react",
    "outDir": "./dist",
    "rootDir": "./",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true,
    "declaration": true,
    "declarationMap": true,
    "sourceMap": true,
    "resolveJsonModule": true,
    "moduleResolution": "node"
  },
  "include": ["src/**/*", "example/**/*", "tools/**/*"],
  "exclude": ["node_modules", "dist"]
}
```
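The `jsx: "react"` setting matters: it selects the classic JSX transform, which compiles JSX to `React.createElement` calls, so each `.tsx` file needs `React` in scope. The examples here assume that setup rather than the newer automatic transform.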
Add scripts to `package.json`:
```json
{
  "scripts": {
    "build": "tsc",
    "generate": "ts-node example/generate-example.ts",
    "generate:blank": "ts-node example/generate-example.ts --blank",
    "build:docx": "ts-node tools/build-docx-template.ts"
  }
}
```
---
## 5. Implementing the PDF generator
### 5.1 TypeScript data types
Start with the data shape. Every field in the JSON gets a strict TypeScript interface in `src/types.ts`. This is the single source of truth — every page component reads it, every change ripples out through the type system.
```ts
// src/types.ts
export interface NgayKy {
  ngay: string;
  thang: string;
  nam: string;
}

export interface TrangBia {
  ten_sang_kien: string;
  tac_gia: string;
  don_vi: string;
  thong_tin_lien_he: string;
  nam: string;
}

export interface Mau01ApplyRow {
  tt: string;
  ten_to_chuc: string;
  dia_chi: string;
  linh_vuc: string;
}

export interface Mau01HieuQua {
  loi_ich_kinh_te: string;
  hieu_qua_giang_day: string;
  // … 8 more fields
}

export interface Mau01 {
  mo_dau: string;
  ten_sang_kien: string;
  // …
  danh_sach_ap_dung: Mau01ApplyRow[];
  tinh_hieu_qua: Mau01HieuQua;
  ngay_ky: NgayKy;
  // …
}

// … repeat for Mau02, Mau03, Mau04, BanCamKet

export interface SangKienData {
  trang_bia: TrangBia;
  mau_01: Mau01;
  mau_02: Mau02;
  mau_03: Mau03;
  mau_04: Mau04;
  ban_cam_ket: BanCamKet;
}
```
Two design choices worth calling out:
**All fields are strings (or string arrays).** Even numbers like "Tỷ lệ %" are strings. The form is for humans, not databases — values get rendered verbatim, and string-only types let users write `"15%"` or `"khoảng 15"` without coercion errors.
**Array-shaped tables.** `danh_sach_tac_gia` is `Mau02AuthorRow[]`, not a fixed-size tuple. The page components iterate with `.map()`, and the DOCX template uses a `{%tr for %}` loop. Both handle 0, 1, or 100 rows.
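Since the Python side (`docxtpl`) receives the same JSON without the TypeScript compiler's help, the strings-only convention can be checked before rendering. This is a minimal sketch under that assumption; `non_string_leaves` is a hypothetical helper, and the sample keys are taken from the interfaces above.

```python
from typing import Any, List


def non_string_leaves(value: Any, path: str = "") -> List[str]:
    """Return the paths of all leaf values that are not strings."""
    if isinstance(value, str):
        return []
    if isinstance(value, dict):
        found: List[str] = []
        for key, child in value.items():
            found += non_string_leaves(child, f"{path}.{key}" if path else key)
        return found
    if isinstance(value, list):
        found = []
        for index, child in enumerate(value):
            found += non_string_leaves(child, f"{path}[{index}]")
        return found
    # Numbers, booleans, and None all violate the strings-only rule.
    return [path]


sample = {
    "mau_01": {
        "mo_dau": "x",
        "danh_sach_ap_dung": [{"tt": "1"}, {"tt": 2}],
    },
    "trang_bia": {"nam": 2026},
}
violations = non_string_leaves(sample)
```

Here `violations` lists the two numeric fields, while an empty table (`[]`) passes, matching the "0, 1, or 100 rows" behavior described above.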
### 5.2 Font registration
`@react-pdf/renderer` ships with three fonts (Helvetica, Times-Roman, Courier) and **none of them include Vietnamese glyphs**. If you skip this step, characters like `ư ơ ầ ậ` will render as blank space.
```ts
// src/fonts.ts
import { Font } from "@react-pdf/renderer";

let registered = false;

export function registerFonts(): void {
  if (registered) return;

  const regular = require.resolve(
    "@expo-google-fonts/tinos/400Regular/Tinos_400Regular.ttf"
  );
  const italic = require.resolve(
    "@expo-google-fonts/tinos/400Regular_Italic/Tinos_400Regular_Italic.ttf"
  );
  const bold = require.resolve(
    "@expo-google-fonts/tinos/700Bold/Tinos_700Bold.ttf"
  );
  const boldItalic = require.resolve(
    "@expo-google-fonts/tinos/700Bold_Italic/Tinos_700Bold_Italic.ttf"
  );

  Font.register({
    family: "TimesVN",
    fonts: [
      { src: regular },
      { src: italic, fontStyle: "italic" },
      { src: bold, fontWeight: "bold" },
      { src: boldItalic, fontWeight: "bold", fontStyle: "italic" },
    ],
  });

  Font.registerHyphenationCallback((word) => [word]);
  registered = true;
}
```
Three things happen here:
1. **`require.resolve()` finds the TTF on disk**: this works in Node. `require.resolve()` is a Node API, so a browser bundle (Webpack/Vite) generally needs the font supplied as an imported asset or a URL instead.
2. **One family, four variants**: the `fontWeight` and `fontStyle` keys let `<Text style={{ fontWeight: "bold" }}>` resolve to the bold TTF.
3. **Hyphenation callback returns `[word]`**: this disables React-PDF's default English hyphenator, which would chop Vietnamese words at random points.
The `registered` boolean guards against re-registration if `registerFonts()` is called from multiple entry points.
### 5.3 Shared styles
`StyleSheet.create()` in `src/styles.ts` defines reusable style objects. Three categories matter:
**Page-level constants.** A4 with ~2.5 cm margins:
```ts
page: {
  fontFamily: FONT,   // "TimesVN"
  fontSize: 13,       // 13pt body
  paddingTop: 71,     // ~2.5cm = 71pt
  paddingBottom: 71,
  paddingLeft: 71,
  paddingRight: 71,
  lineHeight: 1.25,
},
```
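The 71pt figure follows from the usual definition of 72 points per inch and 2.54 cm per inch. A quick sketch of the arithmetic (the function name `cm_to_pt` is just for illustration):

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def cm_to_pt(cm: float) -> float:
    """Convert centimetres to typographic points."""
    return cm / CM_PER_INCH * POINTS_PER_INCH


margin_pt = cm_to_pt(2.5)                 # about 70.87, rounded up to 71
content_width_pt = cm_to_pt(21.0) - 2 * 71  # A4 is 21 cm wide
```

With 71pt padding on both sides, roughly 453pt of an A4 page's width remains for content, which is the space the percentage-based table columns divide up.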
**Paragraph variants** for the three contexts that come up:
```ts
// Indented body text (justified, first-line indent ~1cm)
paragraph: { textAlign: "justify", textIndent: 28, marginBottom: 0 },
// Flush-left lines (section labels, inline list items)
paragraphFlush: { textAlign: "justify", marginBottom: 0 },
// Section headings (flush-left, with breathing room above)
sectionHead: { textAlign: "justify", marginBottom: 0, marginTop: 4 },
```
The `marginBottom: 0` is deliberate — Vietnamese government documents are visually dense, so paragraphs only get spacing between sections, not between adjacent lines.
**Component primitives** (table, checkbox, signature columns):
```ts
table: {
  flexDirection: "column",
  borderWidth: 1, borderColor: "#000",
  borderRightWidth: 0, borderBottomWidth: 0,  // we draw R+B per-cell
  marginVertical: 4,
},
tableCell: {
  borderRightWidth: 1, borderBottomWidth: 1, borderColor: "#000",
  padding: 4,
},
```
The "outer border drawn on the table, inner borders drawn per-cell" pattern avoids double-thickness lines where cells meet.
**Cover-specific styles** are isolated in their own group because the cover page has unique requirements (page border via `position: absolute`, "Mẫu số 01" badge in the top corner).
### 5.4 Reusable components
`src/components.tsx` factors out the patterns that show up on multiple pages:
**`<Checkbox checked={boolean}>label</Checkbox>`** — a horizontal row with a bordered square. When `checked`, an inner filled `<View>` appears inside it. We don't use the Unicode `☑` character because Tinos doesn't include it; drawing geometry is font-independent.
```tsx
export const Checkbox: React.FC<CheckboxProps> = ({ checked, children }) => (
<View style={styles.checkboxRow}>
<View style={styles.checkboxBox}>
{checked ? <View style={styles.checkboxFill} /> : null}
</View>
<Text style={styles.checkboxLabel}>{children}</Text>
</View>
);
```
**Header variants** — three different two-column header patterns appear in the document:
- `<TopHeaderBoYTe />` — "BỘ Y TẾ / ĐẠI HỌC Y DƯỢC" left, "CỘNG HÒA…" right (Mẫu 03/04)
- `<TopHeaderDonVi donVi="..." />` — drops "BỘ Y TẾ", shows the unit name in bold (Mẫu 02)
- `<TopHeaderCongHoa />` — only the right column (Bản cam kết)
Each one uses the same `flexDirection: "row"` layout with two equal columns. The differences are which lines appear.
**Table primitives.**
```tsx
<Table columns={[6, 22, 14, 16, 14, 14, 14]}>
<Row>
<Cell width={6} header align="center">STT</Cell>
<Cell width={22} header align="center">Họ tên</Cell>
{/* … */}
</Row>
{data.danh_sach_tac_gia.map((row, i) => (
<Row key={i}>
<Cell width={6} align="center">{row.stt}</Cell>
<Cell width={22}>{row.ho_ten}</Cell>
{/* … */}
</Row>
))}
</Table>
```
The `width` prop is a **percentage** (the cell renders with `width: ${width}%`). Column widths must sum to 100. The `Cell` component automatically wraps string children in `<Text>` so callers can pass either plain text or nested elements.
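Because a width array that does not sum to 100 only shows up as a visual defect, a small guard can catch it earlier. The following is a minimal sketch, not part of the original `components.tsx`; the function name `assertWidthsSumTo100` is hypothetical.

```ts
// Hypothetical helper (not in the original codebase): fail fast when a
// column-width array does not add up to exactly 100 percent.
export function assertWidthsSumTo100(
  widths: readonly number[],
  label: string = "table"
): void {
  const total = widths.reduce((sum, w) => sum + w, 0);
  if (total !== 100) {
    throw new Error(`${label}: column widths sum to ${total}, expected 100`);
  }
}

// The Mẫu 02 author-table widths used in this guide pass the check.
assertWidthsSumTo100([6, 22, 14, 16, 14, 14, 14], "Mau02 authors");
```

Calling it once next to each width constant keeps the check close to the data it protects.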
**`<DateLine ngay thang nam />`** renders the recurring "TP. Hồ Chí Minh, ngày … tháng … năm …" line, with sensible blank-data placeholders (`.....`).
**`<SignatureBlock title subtitle name>`** renders one column of a two-column signature block (centered title, italic subtitle, then a 50pt vertical gap before the bold signer's name).
### 5.5 Page components
Each section of the form gets its own component file in `src/pages/`. They all follow the same shape:
```tsx
// src/pages/Mau01.tsx
import { Page, View, Text } from "@react-pdf/renderer";
import { styles } from "../styles";
import { Mau01 } from "../types";
import { Table, Row, Cell, DateLine } from "../components";
interface Props {
data: Mau01;
donVi: string; // pulled from mau_02.don_vi by the parent
}
export const Mau01Page: React.FC<Props> = ({ data, donVi }) => (
<Page size="A4" style={styles.page}>
<Text style={styles.centerTitleLarge}>BÁO CÁO MÔ TẢ SÁNG KIẾN</Text>
<Text style={styles.paragraphFlush}>
1. Mở đầu{" "}
<Text style={styles.italic}>
(Giới thiệu về những vấn đề liên quan đến sáng kiến):
</Text>
</Text>
<Text style={styles.paragraph}>{data.mo_dau}</Text>
{/* … rest of the page */}
</Page>
);
```
Three patterns recur in every page:
1. **Static + dynamic mixed in the same `<Text>`.** Section labels like "1. Mở đầu" are fixed, but the italic instructional helper text and the data value next to them aren't. We use nested `<Text>` to apply different styles to different runs in one paragraph (because `<Text>` in React-PDF can contain other `<Text>` nodes, like `<span>` in HTML).
2. **`{" "}` for explicit whitespace.** JSX collapses whitespace between elements. To preserve a space between a label and an italic helper, we explicitly insert `{" "}`.
3. **Default-empty rows for tables.** When `data.danh_sach_ap_dung` is empty, we still want one blank row to render so the printed form has a place to write. The pattern:
```tsx
{(data.danh_sach_ap_dung && data.danh_sach_ap_dung.length > 0
? data.danh_sach_ap_dung
: [{ tt: "", ten_to_chuc: "", dia_chi: "", linh_vuc: "" }]
).map((row, i) => /* ... */)}
```
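The default-empty-row pattern can be factored into one generic helper so each table does not repeat the ternary. This is a sketch under assumptions: `rowsOrBlank`, `ApDungRow`, and `BLANK_AP_DUNG` are illustrative names, and the row shape is taken from the blank-row literal above rather than from the real `types.ts`.

```ts
// Row shape assumed from the blank-row literal shown above.
interface ApDungRow {
  tt: string;
  ten_to_chuc: string;
  dia_chi: string;
  linh_vuc: string;
}

// Hypothetical helper: return the rows as-is when there is at least one,
// otherwise a single blank row so the printed form keeps a writable line.
export function rowsOrBlank<T>(
  rows: readonly T[] | null | undefined,
  blank: T
): readonly T[] {
  return rows && rows.length > 0 ? rows : [blank];
}

const BLANK_AP_DUNG: ApDungRow = { tt: "", ten_to_chuc: "", dia_chi: "", linh_vuc: "" };
```

In a page component this would read `rowsOrBlank(data.danh_sach_ap_dung, BLANK_AP_DUNG).map(...)`.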
**Signature block on Mẫu 01 takes `donVi` as a prop**, not from `data` directly. The reason: the standard layout uses the unit name from Mẫu 02 (`mau_02.don_vi`) on Mẫu 01's signature line. Rather than duplicate the value in the JSON, the parent component (`SangKienDocument`) reads it from `mau_02` and passes it down.
**Cover page is special.** It uses absolute positioning to put the page border around the entire content area:
```tsx
<Page size="A4" style={styles.page}>
<Text style={styles.formNumberOnCover}>Mẫu số 01</Text>
<View style={styles.coverBorder} fixed />
<View style={styles.coverContent}>
{/* header, title, fields, footer */}
</View>
</Page>
```
`<View fixed>` tells React-PDF to render the border on every page in this section (irrelevant here since the cover is one page, but harmless), and `position: absolute` (set in `styles.coverBorder`) makes it overlay the whole page.
### 5.6 Top-level Document
`src/SangKienDocument.tsx` composes all six pages:
```tsx
export const SangKienDocument: React.FC<{ data: SangKienData }> = ({ data }) => {
registerFonts();
const donVi = data.mau_02.don_vi || data.trang_bia.don_vi;
return (
<Document
title={data.trang_bia.ten_sang_kien || "Báo cáo mô tả sáng kiến"}
author={data.trang_bia.tac_gia}
>
<CoverPage data={data.trang_bia} />
<Mau01Page data={data.mau_01} donVi={donVi} />
<Mau02Page data={data.mau_02} />
<Mau03Page data={data.mau_03} />
<Mau04Page data={data.mau_04} />
<BanCamKetPage data={data.ban_cam_ket} />
</Document>
);
};
```
`registerFonts()` is idempotent (the internal `registered` flag guards against duplicate registration), so calling it from the top-level component is safe.
The `<Document>` element accepts metadata that shows up in the PDF's title bar — `title`, `author`, `subject`, `creator`, `producer`, `keywords`. These don't affect rendering, just file properties.
### 5.7 Server-side render helper
`src/generate.tsx` wraps the React rendering in a Node-friendly Promise:
```tsx
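import * as fs from "fs";
import * as path from "path";
import React from "react";
import { pdf } from "@react-pdf/renderer";
import { SangKienDocument } from "./SangKienDocument";
import type { SangKienData } from "./types";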
export async function renderSangKienPdf(data: SangKienData): Promise<Buffer> {
const instance = pdf(<SangKienDocument data={data} />);
const blob = await instance.toBlob();
const arrayBuffer = await blob.arrayBuffer();
return Buffer.from(arrayBuffer);
}
export async function renderSangKienPdfFromFile(
inputJsonPath: string,
outputPdfPath: string
): Promise<void> {
const data = JSON.parse(fs.readFileSync(inputJsonPath, "utf-8")) as SangKienData;
const buffer = await renderSangKienPdf(data);
fs.mkdirSync(path.dirname(outputPdfPath), { recursive: true });
fs.writeFileSync(outputPdfPath, buffer);
}
```
`pdf(...).toBlob()` is the cleanest async API even on the server — the `Buffer.from(await blob.arrayBuffer())` conversion is one line.
`example/generate-example.ts` is a thin CLI on top:
```ts
const useBlank = process.argv.includes("--blank");
const inputPath = useBlank
? path.join(__dirname, "data-blank.json")
: path.join(__dirname, "sample-data.json");
const outputPath = path.join(__dirname, "..", "out", `sang-kien-${useBlank ? "blank" : "filled"}.pdf`);
await renderSangKienPdfFromFile(inputPath, outputPath);
```
---
## 6. Implementing the DOCX template generator
### 6.1 The Jinja-in-DOCX strategy
`docxtpl` works by storing Jinja-style strings *as ordinary text* inside the DOCX, then doing template expansion at render time. The build script's job is to produce a `.docx` whose visible text reads:
> **Tên sáng kiến (Tiếng Việt):** {{ trang_bia.ten_sang_kien }}
When you open this in Word, you literally see those curly braces. When `docxtpl` opens it, it walks the OOXML tree, finds runs containing `{{ ... }}`, and replaces them.
**The catch: text runs split across formatting changes.** If you write `Tên sáng kiến (Tiếng Việt): {{ trang_bia.ten_sang_kien }}` in one run, that's fine. But if you bold "Tên sáng kiến" and leave `{{ … }}` regular, Word stores them as **two separate runs**. A naive search for `{{` in the second run works — but if you split a placeholder *inside* the curly braces (`{{ trang_bia.` in one run, `ten_sang_kien }}` in another), `docxtpl` will fail silently. So:
> **Rule:** every placeholder must live entirely inside one continuous run with one set of formatting.
The `docx` library makes this easy — when you write `r("{{ mau_01.mo_dau }}")`, that's exactly one `<w:r>` element with one `<w:t>` inside.
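The one-run rule can also be checked mechanically. The sketch below assumes you already have the text of each run in document order (how you extract it is out of scope here); `findSplitPlaceholders` is a hypothetical name, not a `docx` or `docxtpl` API. It flags any run that opens a Jinja delimiter without closing it, or closes one it never opened.

```ts
// Hypothetical checker: given the text of consecutive runs, return the
// indexes of runs whose {{ }} or {% %} delimiters are not balanced
// within that single run.
export function findSplitPlaceholders(runs: readonly string[]): number[] {
  const flagged: number[] = [];
  runs.forEach((text, index) => {
    const delimiter = /\{\{|\{%|\}\}|%\}/g;
    let open: string | null = null; // delimiter currently open in this run
    let broken = false;
    let match: RegExpExecArray | null;
    while ((match = delimiter.exec(text)) !== null) {
      const token = match[0];
      if (token === "{{" || token === "{%") {
        if (open !== null) broken = true; // opened again before closing
        open = token;
      } else {
        const expected = token === "}}" ? "{{" : "{%";
        if (open !== expected) broken = true; // close without matching open
        open = null;
      }
    }
    if (open !== null) broken = true; // still open at the end of the run
    if (broken) flagged.push(index);
  });
  return flagged;
}
```

An empty result means every placeholder sits inside one run; a non-empty result points at the runs to rebuild with a single `r()` call.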
### 6.2 The 3-row table loop trick
For repeating table rows, `docxtpl` uses a special syntax: `{%tr for item in collection %}` and `{%tr endfor %}`. The `tr` prefix tells the engine "remove the entire `<w:tr>` row containing this tag and use the rows between `for` and `endfor` as the loop body."
A naive single-row pattern doesn't work:
```
[ {%tr for x in items %} {{ x.id }} | {{ x.name }} {%tr endfor %} ]
```
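This fails because each `{%tr … %}` tag stands in for the **entire row** that contains it. With `{%tr for %}` and `{%tr endfor %}` in the same row, the row is stripped as a single unit, so Jinja never sees a well-formed `for`/`endfor` pair wrapped around a data row.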
The reliable pattern is **three rows**:
```
Row 1: | {%tr for item in collection %} | (empty cells) |
Row 2: | {{ item.id }} | {{ item.name }} | ← duplicated per item
Row 3: | {%tr endfor %} | (empty cells) |
```
Row 1 and Row 3 get stripped. Row 2 gets repeated for each item. The data row carries the actual `{{ }}` fields.
In code:
```ts
const aw = [6, 22, 14, 16, 14, 14, 14]; // column widths
const emptyRow_aw = (firstText: string) => {
const cells: TableCell[] = [];
for (let i = 0; i < aw.length; i++) {
cells.push(new TableCell({
borders: allThinBorders,
width: { size: aw[i] * 100, type: WidthType.PERCENTAGE },
children: [new Paragraph({ children: [r(i === 0 ? firstText : " ")] })],
}));
}
return cells;
};
new Table({
rows: [
new TableRow({ children: [/* header cells */] }),
new TableRow({ children: emptyRow_aw("{%tr for item in mau_02.danh_sach_tac_gia %}") }),
new TableRow({ children: [
dataCell("{{ item.stt }}", aw[0], AlignmentType.CENTER),
dataCell("{{ item.ho_ten }}", aw[1]),
// … 5 more
]}),
new TableRow({ children: emptyRow_aw("{%tr endfor %}") }),
],
});
```
The `emptyRow_aw` helper builds a row where the first cell contains the loop tag and the rest are blanks (just `" "`). After `docxtpl` strips it, the visible table has one header row plus one data row per item.
### 6.3 Multi-section layout
Word documents are split into **sections**, each with its own page settings — margins, orientation, page borders, headers, footers. The cover page needs:
- A **page border** (rounded rectangle around the content area)
- A **header** containing "Mẫu số 01" at the top right *outside* the border
The rest of the document needs:
- **No** page border
- **No** "Mẫu số 01" header (it's only on the cover)
In `docx` v9, this is two sections in the same document:
```ts
new Document({
sections: [
{
properties: {
page: {
size: { width: 11906, height: 16838, orientation: PageOrientation.PORTRAIT },
margin: { top: 1440, right: 1440, bottom: 1440, left: 1440 },
borders: {
pageBorderTop: { style: BorderStyle.SINGLE, size: 12, color: "000000", space: 24 },
pageBorderBottom: { style: BorderStyle.SINGLE, size: 12, color: "000000", space: 24 },
pageBorderLeft: { style: BorderStyle.SINGLE, size: 12, color: "000000", space: 24 },
pageBorderRight: { style: BorderStyle.SINGLE, size: 12, color: "000000", space: 24 },
},
},
},
headers: { default: coverHeader }, // contains "Mẫu số 01"
children: buildCoverPage(),
},
{
properties: {
page: { size: {/*…*/}, margin: {/*…*/} /* no borders */ },
},
// Explicit empty header so the cover header doesn't leak onto subsequent pages
headers: { default: new Header({ children: [new Paragraph({ children: [r("")] })] }) },
children: [
...buildMau01(),
...buildMau02(),
...buildMau03(),
...buildMau04(),
...buildBanCamKet(),
],
},
],
});
```
Two gotchas worth noting:
**Twips, not points.** `docx` uses twips (1/1440 inch). Multiply pt by 20 to get twips:
- A4 = 11906 × 16838 twips
- 1 inch margin = 1440 twips
- 1 cm = 567 twips
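The conversions above are easy to get wrong by hand, so a few tiny helpers can keep the PDF side (points) and the DOCX side (twips and half-points) derived from the same physical measurements. This is a sketch; the helper names are hypothetical and do not exist in the `docx` library.

```ts
// Unit facts: 1 inch = 72 pt = 1440 twips = 2.54 cm.
const TWIPS_PER_INCH = 1440;
const POINTS_PER_INCH = 72;
const CM_PER_INCH = 2.54;

// Hypothetical helpers, rounded to whole units because both libraries
// are used with integer values in this guide.
export const ptToTwips = (pt: number): number => Math.round(pt * 20);
export const cmToTwips = (cm: number): number =>
  Math.round((cm / CM_PER_INCH) * TWIPS_PER_INCH);
export const cmToPt = (cm: number): number =>
  Math.round((cm / CM_PER_INCH) * POINTS_PER_INCH);
export const ptToHalfPoints = (pt: number): number => Math.round(pt * 2);

cmToTwips(21);      // 11906 (A4 width)
cmToTwips(29.7);    // 16838 (A4 height)
cmToTwips(1);       // 567   (first-line indent in DOCX)
cmToPt(2.5);        // 71    (page padding in the PDF styles)
ptToHalfPoints(13); // 26    (body font size for TextRun)
```

Note that `cmToTwips(2.5)` gives 1417, slightly less than the 1440 (one inch) used for the DOCX margins in the section definition.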
**Headers leak across sections.** If section 2 doesn't define `headers`, it inherits section 1's. We have to provide an explicit empty `Header` to prevent the "Mẫu số 01" text from showing up on every page of the document.
### 6.4 Building paragraphs and tables
The build script defines small helper functions to keep the body code readable:
```ts
const FONT = "Times New Roman";
const SIZE = 26; // 13pt (docx-js uses half-points)
const SIZE_HEADING = 28; // 14pt
function r(text: string, opts: { bold?: boolean; italic?: boolean; underline?: boolean; size?: number } = {}) {
return new TextRun({
text,
font: FONT,
size: opts.size ?? SIZE,
bold: opts.bold,
italics: opts.italic,
underline: opts.underline ? { type: UnderlineType.SINGLE } : undefined,
});
}
function bodyP(children: TextRun[], opts: { indent?: boolean } = {}) {
return new Paragraph({
children,
alignment: AlignmentType.JUSTIFIED,
indent: opts.indent ? { firstLine: 567 } : undefined,
spacing: { before: 0, after: 0, line: 300 },
});
}
function flushP(children: TextRun[], opts: { spaceBefore?: number } = {}) {
return new Paragraph({
children,
alignment: AlignmentType.JUSTIFIED,
spacing: { before: opts.spaceBefore ?? 0, after: 0, line: 300 },
});
}
function centerP(children: TextRun[], opts: { spaceBefore?: number; spaceAfter?: number } = {}) {
return new Paragraph({
children,
alignment: AlignmentType.CENTER,
spacing: { before: opts.spaceBefore ?? 0, after: opts.spaceAfter ?? 0, line: 300 },
});
}
```
A typical section then reads naturally:
```ts
out.push(centerP([r("BÁO CÁO MÔ TẢ SÁNG KIẾN", { bold: true, size: SIZE_HEADING })]));
out.push(flushP([
r("1. Mở đầu "),
r("(Giới thiệu về những vấn đề liên quan…):", { italic: true }),
]));
out.push(bodyP([r("{{ mau_01.mo_dau }}")], { indent: true }));
```
For checkboxes, since the templating engine has to choose which character to render, we embed the choice in the placeholder itself:
```ts
const checkbox = (cond: string, label: string) =>
flushP([
r(`{% if ${cond} %}`),
r("☑"),
r("{% else %}"),
r("☐"),
r("{% endif %} "),
r(label),
]);
out.push(checkbox(
"mau_02.phan_loai.giai_phap_ky_thuat",
"Giải pháp kỹ thuật, quản lý, tác nghiệp, ứng dụng tiến bộ kỹ thuật áp dụng cho Đại học Y Dược TP.HCM"
));
```
After `docxtpl` runs, this paragraph reduces to `☑ Giải pháp kỹ thuật…` or `☐ Giải pháp kỹ thuật…` depending on the boolean. (For DOCX rendering in Word, the `☑/☐` characters work fine because Word falls back to a Unicode-capable font automatically — unlike React-PDF.)
---
## 7. Layout calibration (matching the standard)
The "Sang_kien_SOP_dong_vat" reference document defines a specific visual style. Here's a checklist of the calibrations applied to both generators:
| Aspect | Rule | Where it lives |
|---|---|---|
| Body font | Times New Roman (or Tinos) 13pt | `styles.page.fontSize`, `r()` `SIZE = 26` |
| Page margins | ~2.5 cm all around | `padding: 71` (PDF, 71pt ≈ 2.5 cm), `margin: 1440` (DOCX, 1440 twips = 1 inch ≈ 2.54 cm) |
| Body line height | 1.25 | `lineHeight: 1.25` (PDF), `line: 300` (DOCX, 240 = single, 300 ≈ 1.25) |
| First-line indent | ~1 cm on body paragraphs | `textIndent: 28` (PDF), `firstLine: 567` (DOCX) |
| Section numbers (`1.`, `2.`, `4.1`) | **NOT bold**; italic instructions in parens | Use `paragraphFlush` not bold |
| Inter-paragraph spacing | None within a section, small gap before new section | `marginBottom: 0`, `sectionHead.marginTop: 4` |
| Cover page | Page border (rounded rect), "Mẫu số 01" outside top-right | Cover-specific styles, dedicated section in DOCX |
| Cover divider | `=====***=====` (literal) | Hardcoded string |
| Cover info fields | Left-aligned, **bold label**, regular value | `coverField` style |
| Two-column header | "ĐƠN VỊ" or "BỘ Y TẾ" left, "CỘNG HÒA" right | `TopHeaderBoYTe`, `TopHeaderDonVi`, `TopHeaderCongHoa` |
| "Độc lập Tự do Hạnh phúc" | Underlined, bold | `underline: true` flag in `r()`/styles |
| Tables | Single thin black border, no shaded header | `borderWidth: 1`, no `backgroundColor` on `tableHeaderCell` |
| Mẫu 02 author table column 7 | Header includes parenthetical italic instruction | Custom `TableCell` with two centered paragraphs |
| Signature block | Two columns: "Xác nhận của lãnh đạo / [đơn vị]" left, "Đại diện nhóm tác giả sáng kiến" right | `<View style={signatureRow}>` (PDF), borderless 2-cell table (DOCX) |
| Mẫu 03 totals row | TỔNG (cols 1-3 merged) ‖ 100 ‖ blank | `columnSpan: 3` in DOCX, manual width sum in PDF |
| Mẫu 04 evaluation rubric | Two scoring rows + total row at bottom | Static text + `{{ … }}` for nhận xét/điểm |
When in doubt about a layout decision, open the reference DOCX in Word, click into the relevant element, and read its formatting from the ribbon. Mirror those settings in code.
---
## 8. Verification workflow
Visual diff against the reference is the only reliable way to know you got it right. The flow:
```bash
# 1. Generate the candidate PDF
npm run generate
# 2. Convert each page to JPEG
pdftoppm -jpeg -r 100 out/sang-kien-filled.pdf out/page
# 3. Convert the reference DOCX to PDF and JPEGs the same way
soffice --headless --convert-to pdf reference.docx --outdir ref/
pdftoppm -jpeg -r 100 ref/reference.pdf ref/ref-page
# 4. Open them side by side
```
For the DOCX generator, add one more step:
```bash
# Build the template
npm run build:docx
# Render placeholders WITHOUT filling them — does the layout look right?
soffice --headless --convert-to pdf out/template_application_form.docx --outdir out/
# Fill it with sample data and render
python tools/fill-docx.py example/sample-data.json out/sang-kien-filled.docx
soffice --headless --convert-to pdf out/sang-kien-filled.docx --outdir out/
```
Smoke test the DOCX template in Python before declaring victory:
```python
# tools/test-docx-fill.py
from docxtpl import DocxTemplate
import json
with open("example/sample-data.json", encoding="utf-8") as f:
data = json.load(f)
doc = DocxTemplate("out/template_application_form.docx")
doc.render(data)
doc.save("out/template-filled-test.docx")
```
If `docxtpl` raises `TemplateSyntaxError: Encountered unknown tag 'endfor'`, you've put a `{%tr for %}` and `{%tr endfor %}` in the same row instead of separate rows. Go re-read [§6.2](#62-the-3-row-table-loop-trick).
If a `{{ field }}` doesn't get replaced and you can still see the curly braces in the filled output, the placeholder got split across runs by Word's auto-formatting. Build the placeholder with one `r("{{ x }}")` call, not three.
---
## 9. Common modifications
### Adding a new field
Say you need to add `mau_01.tong_kinh_phi` (total budget).
1. **Update `src/types.ts`:**
```ts
export interface Mau01 {
// …
tong_kinh_phi: string; // new
}
```
2. **Update `example/data-blank.json`** and **`example/sample-data.json`** with the new field.
3. **Render it in `src/pages/Mau01.tsx`:**
```tsx
<Text style={styles.paragraphFlush}>
7. Tổng kinh phí: {data.tong_kinh_phi}
</Text>
```
4. **Add it to the DOCX template generator** in `tools/build-docx-template.ts`:
```ts
out.push(flushP([r("7. Tổng kinh phí: {{ mau_01.tong_kinh_phi }}")]));
```
5. **Regenerate:**
```bash
npm run generate
npm run build:docx
```
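The TypeScript compiler will flag any typed code that builds a `Mau01` object without the new field. It cannot check the JSON files, because they are loaded at runtime with `JSON.parse(...) as SangKienData`, and it will not notice a field that the page component never renders, so confirm both by regenerating and inspecting the output.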
### Changing a column width
Column widths are kept as small integer arrays in the page component (PDF) and the build script (DOCX). They must always sum to 100.
To widen the "Họ và tên" column on the Mẫu 02 author table from 22% to 28% (and shrink "Nơi công tác" from 16% to 10%):
In `src/pages/Mau02.tsx`:
```ts
const AUTHOR_WIDTHS = [6, 28, 14, 10, 14, 14, 14] as const; // was [6, 22, 14, 16, …]
```
In `tools/build-docx-template.ts` (inside `buildMau02()`):
```ts
const aw = [6, 28, 14, 10, 14, 14, 14];
```
Both arrays must match. There is no shared constant: the PDF widths and the DOCX widths use the same percentage convention (summing to 100) but go through different code paths, so keeping them in sync is a manual discipline.
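One way to make that discipline cheaper is a small comparison that a unit test or pre-commit script could run against copies of the two arrays. This is only a sketch: `sameWidths` is a hypothetical helper, and how the two arrays are exported from `Mau02.tsx` and `build-docx-template.ts` is left as an assumption.

```ts
// Hypothetical sync check: true only when both generators use the same
// number of columns with identical percentage widths.
export function sameWidths(
  pdfWidths: readonly number[],
  docxWidths: readonly number[]
): boolean {
  return (
    pdfWidths.length === docxWidths.length &&
    pdfWidths.every((w, i) => w === docxWidths[i])
  );
}

// Values from the example above: PDF already widened, DOCX still on the old widths.
const pdfAuthorWidths = [6, 28, 14, 10, 14, 14, 14];
const staleDocxWidths = [6, 22, 14, 16, 14, 14, 14];
const inSync = sameWidths(pdfAuthorWidths, staleDocxWidths); // false: DOCX not updated yet
```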
### Adding a new repeating table
The data shape, the page component, and the DOCX template all need updates:
1. **Type:** add `Mau01NewRow[]` to `Mau01`, define `interface Mau01NewRow { … }`.
2. **PDF page:** mirror the existing pattern in `src/pages/Mau01.tsx`:
```tsx
<Table columns={[10, 30, 30, 30]}>
<Row>
<Cell width={10} header align="center">TT</Cell>
{/* … */}
</Row>
{(data.danh_sach_moi && data.danh_sach_moi.length > 0
? data.danh_sach_moi
: [{ tt: "", ... }]
).map((row, i) => (
<Row key={i}>
<Cell width={10} align="center">{row.tt}</Cell>
{/* … */}
</Row>
))}
</Table>
```
3. **DOCX template:** use the 3-row pattern from [§6.2](#62-the-3-row-table-loop-trick):
```ts
const w = [10, 30, 30, 30];
const emptyRow = (firstText: string) => /* same helper pattern */;
new Table({
rows: [
new TableRow({ children: [headerCell("TT", w[0]), /* … */] }),
new TableRow({ children: emptyRow("{%tr for item in mau_01.danh_sach_moi %}") }),
new TableRow({ children: [dataCell("{{ item.tt }}", w[0], AlignmentType.CENTER), /* … */] }),
new TableRow({ children: emptyRow("{%tr endfor %}") }),
],
});
```
### Switching to your organization's font
Replace the four TTF paths in `src/fonts.ts`:
```ts
Font.register({
family: "TimesVN",
fonts: [
{ src: "/path/to/your/Regular.ttf" },
{ src: "/path/to/your/Italic.ttf", fontStyle: "italic" },
{ src: "/path/to/your/Bold.ttf", fontWeight: "bold" },
{ src: "/path/to/your/BoldItalic.ttf", fontWeight: "bold", fontStyle: "italic" },
],
});
```
For the DOCX side, change `const FONT = "Times New Roman"` in `tools/build-docx-template.ts` to the name of the font you want the document to use. The build script as shown only references the font by name, and Word will fall back to a system font if the named font isn't installed on the reader's machine, so prefer common names (Times New Roman, Arial, Calibri).
---
## 10. Troubleshooting
**PDF renders blank squares where Vietnamese characters should be.**
The font isn't registered or the registered font lacks Vietnamese glyphs. Check that `registerFonts()` is called and that the TTFs at the resolved paths are actually loaded (not 404 / missing). Tinos has the right glyph coverage; many "Times New Roman clones" don't.
**`Error: Failed to fetch font from https://…`**
You're hitting `@react-pdf/renderer`'s URL-based font loading and your environment can't reach the URL. Switch to local TTFs via `require.resolve()` (already what `src/fonts.ts` does).
**`docxtpl` raises `TemplateSyntaxError: Encountered unknown tag 'endfor'`.**
You put the `{%tr for %}` and `{%tr endfor %}` tags in the *same* table row. Re-read [§6.2](#62-the-3-row-table-loop-trick) — they have to be on separate rows.
**Some `{{ field }}` placeholders aren't being replaced.**
Word split your text run mid-placeholder. Make sure each placeholder is constructed with a single `r("{{ x }}")` call, not split across multiple `r()` calls or assembled from concatenated strings.
**The DOCX has "Mẫu số 01" appearing on every page, not just the cover.**
The cover-section header is leaking into the next section. Add an explicit empty header to the second section:
```ts
headers: { default: new Header({ children: [new Paragraph({ children: [r("")] })] }) },
```
**Tables overflow the right margin.**
Column width percentages don't sum to exactly 100, or a single cell has too much wide content with no wrap point. Either fix the widths or add `wordBreak: "break-word"` to the cell style.
**`textIndent` doesn't seem to work in `<Text>`.**
React-PDF's `textIndent` only takes effect when the `<Text>` *itself* has `display: "block"`-like behavior — i.e. it's a top-level paragraph, not nested inside another `<Text>`. If you're nesting, wrap the inner content in a parent `<Text>` that has the indent style.
**The DOCX page border doesn't appear.**
Page borders are a Word feature configured in section properties. Check that you've set all four (`pageBorderTop/Bottom/Left/Right`), with non-zero `size` and a `space` value (24 puts them ~1.7cm from the edge in our setup). LibreOffice and Word may render them slightly differently — Word is the canonical view.
**Filled DOCX has weird extra empty rows above each table.**
Those are the `{%tr for %}`/`{%tr endfor %}` rows that didn't get stripped — meaning the loop tags ended up in paragraphs *inside* a cell, not as standalone row text. Make sure the `firstText` in your `emptyRow_*()` helper is the entire cell content, not appended to other text.
---
## 11. Porting to a different form
The same pattern works for any structured government form. The migration steps:
1. **Extract the data model.** Open the reference DOCX, list every blank line and every table column. Each becomes a field in `types.ts`. Repeating sections (lists of authors, lists of attachments) become arrays.
2. **Identify the sections.** Most forms have a cover page plus N body sections. Each body section becomes a `<Page>` component plus a `buildSectionN()` function in the DOCX builder.
3. **Catalog the visual primitives.** Headers, signature blocks, tables, checkboxes, date lines — write them once in `components.tsx` (PDF) and as helper functions (DOCX), then reuse.
4. **Calibrate the styles.** Open the reference, measure margins, font, line spacing, and indent. Set them as constants. See [§7](#7-layout-calibration-matching-the-standard).
5. **Render and diff.** Generate, convert to JPEG, line up against the reference. Iterate until they match.
6. **Smoke-test the DOCX template** with `docxtpl`. If a placeholder doesn't fill, it's almost always run-splitting — fix by collapsing into one `r()` call.
The most labor-intensive part is the visual calibration (steps 4 and 5). Everything else is mechanical translation from "what the form looks like" to "code that produces the same thing."
---
## Appendix: file-by-file inventory
| File | Lines | Purpose |
|---|---:|---|
| `src/types.ts` | 177 | TypeScript interfaces matching `data-blank.json` |
| `src/fonts.ts` | 56 | Tinos font registration |
| `src/styles.ts` | 239 | Shared `StyleSheet.create()` styles |
| `src/components.tsx` | 156 | Reusable `<Checkbox>`, `<Table>`, `<DateLine>`, header variants |
| `src/pages/CoverPage.tsx` | 64 | Trang bìa with page border |
| `src/pages/Mau01.tsx` | 172 | Báo cáo mô tả sáng kiến |
| `src/pages/Mau02.tsx` | 206 | Đơn đề nghị công nhận sáng kiến |
| `src/pages/Mau03.tsx` | 82 | Bản xác nhận tỷ lệ đóng góp |
| `src/pages/Mau04.tsx` | 94 | Phiếu đánh giá sáng kiến |
| `src/pages/BanCamKet.tsx` | 119 | Bản cam kết |
| `src/SangKienDocument.tsx` | 43 | Top-level `<Document>` composing all pages |
| `src/generate.tsx` | 37 | `renderSangKienPdf(data)` server-side helper |
| `src/index.ts` | 5 | Public API barrel |
| `tools/build-docx-template.ts` | 1301 | Generates the Jinja-style DOCX template |
| `tools/fill-docx.py` | ~30 | CLI to fill a template with JSON data via `docxtpl` |
| `tools/test-docx-fill.py` | ~25 | Smoke test script |
| `example/generate-example.ts` | ~35 | CLI for the PDF pipeline |
| `example/sample-data.json` | — | Realistic filled-in example |
| `example/data-blank.json` | — | All-empty template instance |
Total: about **2750 lines** of TypeScript + ~50 lines of Python. The DOCX generator is the largest single file because every static line of body text is a `out.push(flushP([r("…")]))` call, but the pattern is repetitive and easy to skim.
# Specification: Browser-Based DOCX-to-PDF Converter
**Status:** Ready for implementation
**Audience:** Frontend engineer (React + TypeScript)
**Estimated effort:** 1 to 2 days for a working component, +1 day for polish and tests
---
## 1. Overview
This document specifies a React component, `DocxToPdfViewer`, that accepts a `.docx` file in the browser, renders it on screen with layout fidelity equivalent to Microsoft Word, and produces a downloadable PDF that matches the rendering page-for-page. The component runs entirely in the browser; no document content ever leaves the user's machine.
The component is intended for use cases where users need to view a Word document and obtain a PDF copy without installing Word, opening a desktop converter, or trusting a third-party cloud service. Typical scenarios include legal forms, application packets, internal templates, and document submission flows where PDF is the required output format.
## 2. Goals and Non-Goals
### 2.1 Goals
The component must preserve the document's page size, margins, fonts (where embedded or system-available), paragraph alignment, tables, inline and floating images, headers, footers, footnotes, bullet and numbered lists, and basic text formatting (bold, italic, underline, color, size). It must correctly render documents containing non-Latin scripts, with Vietnamese diacritics, CJK characters, and right-to-left scripts as concrete test cases. It must work on the current versions of Chromium-based browsers, Firefox, and Safari without server assistance. It must expose a clear TypeScript API and emit lifecycle events suitable for integration into larger applications.
### 2.2 Non-Goals
The output PDF is **rasterised**: each page is a JPEG image embedded in a PDF page of matching dimensions. Text in the output is therefore not selectable or searchable. If selectable text is required, the implementer should use a server-side converter (LibreOffice headless, Aspose, or a paid API) instead — this is documented in Section 12.
The component does not edit, sign, redact, fill forms in, or otherwise modify the source document. It does not support `.doc` (legacy binary format); callers must convert to `.docx` upstream. It does not attempt to be a general-purpose Word viewer with comments, track changes, or revision history rendering; only the final accepted state is rendered.
## 3. System Context
The pipeline has three stages, executed in order:
```
┌─────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ .docx file │ -> │ docx-preview │ -> │ html2canvas │ -> │ jsPDF │ -> Blob
│ (Blob) │ │ (HTML) │ │ (Canvas[]) │ │ (PDF Blob) │
└─────────────┘ └──────────────┘ └──────────────┘ └──────────────┘
└─> visible to user as the on-screen preview
```
The rendered HTML serves a dual purpose: it is both the on-screen preview shown to the user *and* the source material from which the PDF is rasterised. There is no separate hidden render pass. This is a deliberate architectural choice; see Rule 3 in Section 7.
## 4. Dependencies
The implementation requires three runtime dependencies and their type definitions:
| Package | Version | Purpose |
|---|---|---|
| `docx-preview` | `^0.3.5` | Parses `.docx` and renders to HTML with high layout fidelity. |
| `html2canvas` | `^1.4.1` | Rasterises a DOM subtree to an HTMLCanvasElement. |
| `jspdf` | `^2.5.1` | Assembles canvas images into a multi-page PDF. |
`docx-preview` has a transitive runtime dependency on `jszip`. When bundling with npm, it is installed automatically and no direct install is required. When loading via CDN, `jszip` must be loaded as a separate `<script>` tag *before* `docx-preview`.
The React peer dependency is React 18 or later. Install:
```bash
npm install docx-preview jspdf html2canvas
```
TypeScript definitions ship with `jspdf` and `docx-preview`. `html2canvas` includes its own declarations in recent versions.
## 5. Public API
The component is the default export of a single file, `DocxToPdfViewer.tsx`. Its props are:
```ts
type ConverterStatus = "idle" | "rendering" | "capturing" | "ready" | "error";
interface DocxToPdfViewerProps {
/** Pre-supplied .docx file. If omitted, the built-in file picker is shown. */
file?: File | null;
/** Hide the built-in file picker. Use when `file` is controlled externally. */
hideFilePicker?: boolean;
/** Hide the inline HTML preview. Use when only the PDF blob is needed. */
hidePreview?: boolean;
/** Called when the PDF blob is ready. */
onPdfReady?: (pdfBlob: Blob, sourceFile: File) => void;
/** Called on every stage of the conversion lifecycle. */
onStatusChange?: (status: ConverterStatus) => void;
/** Rendering scale passed to html2canvas. Default 2. Range 1 to 4. */
renderScale?: number;
/** JPEG quality 0 to 1 for embedded page images. Default 0.95. */
imageQuality?: number;
/** Use PNG (lossless, larger files) instead of JPEG. Default false. */
losslessImages?: boolean;
className?: string;
style?: React.CSSProperties;
}
```
The component is fully self-contained: with no props, dropping it into a tree produces a working drag-and-drop converter. With `file` supplied externally, conversion starts automatically whenever the prop changes. `onPdfReady` is the integration seam for callers who need to upload, store, or further process the PDF.
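The numeric props need defaults and bounds before they reach the capture stage. The sketch below shows one way to resolve them, assuming the documented defaults (scale 2, quality 0.95) and reading the ranges as 1 to 4 and 0 to 1. `resolveCaptureOptions` and `CaptureOptions` are hypothetical names, not part of the specified API.

```ts
// Hypothetical normalised form of the image-related props.
interface CaptureOptions {
  scale: number; // passed to html2canvas
  quality: number; // JPEG quality; irrelevant when mimeType is PNG
  mimeType: "image/jpeg" | "image/png";
}

// Keep a value inside [lo, hi].
const clamp = (value: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, value));

// Apply defaults and clamp out-of-range values instead of throwing.
export function resolveCaptureOptions(props: {
  renderScale?: number;
  imageQuality?: number;
  losslessImages?: boolean;
}): CaptureOptions {
  return {
    scale: clamp(props.renderScale ?? 2, 1, 4),
    quality: clamp(props.imageQuality ?? 0.95, 0, 1),
    mimeType: props.losslessImages ? "image/png" : "image/jpeg",
  };
}
```

Clamping rather than throwing is a design choice for this sketch: a caller passing `renderScale={8}` still gets a PDF, just at the maximum supported scale.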
## 6. Implementation Guide
### 6.1 Project Structure
A single-file component is sufficient. Place `DocxToPdfViewer.tsx` in your component directory. No additional CSS files, context providers, or build configuration are required beyond what a standard Vite/Next/CRA React project already provides.
### 6.2 The Rendering Stage
The component holds a `ref` to a single visible `<div>`. When a file is received, the implementation calls `docx-preview`'s `renderAsync` with that ref as the body container. The library injects a `<div class="docx-wrapper">` containing one `<section class="docx">` element per page, plus a `<style>` block of derived CSS at the top of the container.
```ts
await renderAsync(source, container, undefined, {
inWrapper: true,
breakPages: true,
ignoreLastRenderedPageBreak: false,
useBase64URL: true,
experimental: true,
renderHeaders: true,
renderFooters: true,
renderFootnotes: true,
});
```
`breakPages: true` is essential — it causes the library to emit one section per page rather than a single continuous flow, which is what makes per-page capture possible later. `useBase64URL: true` inlines images and fonts as data URLs, which avoids cross-origin issues during canvas capture (see Section 8). `experimental: true` enables tab-stop calculation; the option name is misleading but the feature is stable in practice.
After `renderAsync` resolves, wait one animation frame plus a short `setTimeout` before measuring page dimensions. Browsers do not guarantee that injected styles have been applied and font metrics finalised by the time the promise resolves; measuring too early produces zero-width pages.
```ts
await new Promise<void>(r => requestAnimationFrame(() => r()));
await new Promise<void>(r => setTimeout(r, 50));
```
### 6.3 The Capture Stage
Once rendered, locate the page elements:
```ts
let pages = Array.from(
container.querySelectorAll<HTMLElement>("section.docx")
);
if (pages.length === 0) {
pages = Array.from(container.querySelectorAll<HTMLElement>("section"));
}
if (pages.length === 0) {
throw new Error("docx-preview produced no page sections.");
}
```
The fallback selector exists to defend against future `docx-preview` versions that might change the section classname; it has no cost when the primary selector succeeds.
For each page, call `html2canvas` with the page element as the target. The recommended configuration:
```ts
const canvas = await html2canvas(page, {
scale: renderScale, // 2 for crisp output
useCORS: true, // honour CORS headers on any external images
backgroundColor: "#ffffff",// avoid transparent pages
logging: false,
windowWidth: page.offsetWidth,
windowHeight: page.offsetHeight,
});
```
`scale: 2` is the sweet spot. `scale: 1` produces visibly blurry text. Because pixel count grows with the square of the scale, `scale: 3` needs 2.25× the memory of `scale: 2` per page and `scale: 4` needs 4×, with diminishing visual return except for print output.
### 6.4 The PDF Assembly Stage
Initialise `jsPDF` once, using the first page's dimensions. Convert CSS pixels to millimetres using the constant `25.4 / 96` (millimetres per inch divided by CSS pixels per inch at the standard 96 DPI):
```ts
const PX_TO_MM = 25.4 / 96;
const firstPage = pages[0]; // `pages` is the array located in Section 6.3
const widthMm = firstPage.offsetWidth * PX_TO_MM;
const heightMm = firstPage.offsetHeight * PX_TO_MM;
const pdf = new jsPDF({
orientation: widthMm > heightMm ? "landscape" : "portrait",
unit: "mm",
format: [widthMm, heightMm],
compress: true,
});
```
For each captured canvas, derive that page's own dimensions (a document may mix portrait and landscape sections) and add it. The first page is implicit; subsequent pages require explicit `addPage`:
```ts
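for (let i = 0; i < pages.length; i++) {
  const page = pages[i];
  // Capture this page as in Section 6.3; `captureOptions` stands for that options object.
  const canvas = await html2canvas(page, captureOptions);
  const pwMm = page.offsetWidth * PX_TO_MM;
  const phMm = page.offsetHeight * PX_TO_MM;
  const imgData = canvas.toDataURL("image/jpeg", 0.95);
  if (i > 0) {
    // The first page already exists: the jsPDF constructor created it.
    pdf.addPage([pwMm, phMm], pwMm > phMm ? "landscape" : "portrait");
  }
  pdf.addImage(imgData, "JPEG", 0, 0, pwMm, phMm, undefined, "FAST");
}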
const blob = pdf.output("blob");
```
The `"FAST"` compression mode is the correct choice for embedded JPEGs. The image is already compressed; asking jsPDF to re-compress with `"SLOW"` or `"MEDIUM"` adds significant CPU time and no file-size benefit. For the lossless variant (`losslessImages: true`), substitute `"image/png"` and `"PNG"`; expect 5–10× larger output.
### 6.5 The Component Shell
The UI surface comprises four elements: a file picker that doubles as a drop zone, a status line, a download button that appears when the PDF is ready, and the preview container that `docx-preview` renders into. Detailed visual design is out of scope for this spec — the component should accept `className` and `style` props and ship with neutral default styles that integrate into any application without requiring a CSS reset.
The drop zone must accept both click-to-browse and drag-and-drop. On drag-over, prevent the default to enable drop. On drop, validate that the file has a `.docx` extension before passing it to the conversion pipeline.
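The extension check described above can be sketched as a small pure function. This is a minimal illustration, not the component's actual code: the helper name `isDocxFile` is hypothetical, and it assumes the check runs on `File.name`.

```typescript
// Hypothetical helper: returns true only when the name ends in ".docx"
// (case-insensitive). Validation is by extension only, per Section 8.
function isDocxFile(fileName: string): boolean {
  return /\.docx$/i.test(fileName.trim());
}
```

In a drop handler this would typically guard the pipeline, for example `if (!isDocxFile(file.name)) { /* show inline error */ return; }`.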
### 6.6 Lifecycle Management
The PDF blob is held in a ref rather than React state, because re-renders triggered by other state changes (progress updates, status changes) should not re-create the URL or re-trigger downstream consumers. A separate boolean state (`pdfReady`) controls the visibility of the Download button.
Object URLs are created lazily, at the moment the user clicks Download, and revoked after a short delay sufficient for the browser to initiate the download (4 seconds is a conservative value):
```ts
const url = URL.createObjectURL(blob);
const a = document.createElement("a");
a.href = url;
a.download = sourceFile.name.replace(/\.docx$/i, "") + ".pdf";
document.body.appendChild(a);
a.click();
a.remove();
setTimeout(() => URL.revokeObjectURL(url), 4000);
```
Creating the URL only at click time avoids holding a long-lived blob URL in memory for users who never download.
When the `file` prop changes or the user selects a new file via the picker, the conversion pipeline restarts and the previous blob is discarded. The previous preview DOM is cleared by setting `container.innerHTML = ""` before the next `renderAsync` call.
## 7. Critical Implementation Rules
The following four rules each correspond to a non-obvious failure mode that has cost real engineering time. They are not stylistic preferences — they will cause the component to fail or produce blank output if violated.
### Rule 1: Do not override `docx-preview`'s `className` option
The option is documented as "class name/prefix for default and document style classes". In practice, it controls the **literal class name applied to each page section**. If `className: "my-pages"` is passed, the sections come out as `<section class="my-pages">`, not `<section class="docx">`. Any selector that looks for `section.docx` will return zero pages, and the implementation will throw "no page sections" despite a successful render.
Leave the option at its default. If the page selector needs to be defensive against future library changes, query both `section.docx` and `section` as fallbacks, but do not solve the problem by changing `className`.
### Rule 2: Do not hide the capture target with CSS `visibility`, `display`, or `opacity`
It is tempting to render `docx-preview`'s output into a hidden off-screen container and only show the resulting PDF. This does not work. `html2canvas` respects computed CSS visibility: an element with `visibility: hidden`, `display: none`, or `opacity: 0` (or any ancestor with those properties) will be rasterised as blank or transparent pixels. The capture stage will complete without error, and the resulting PDF will have the correct page count and dimensions but be entirely empty.
If the rendered HTML must not be visible to the user, position it off-screen with `position: fixed; left: -100000px;` *without* applying any visibility, display, or opacity rules. Mark it `aria-hidden="true"` and `inert` for accessibility. In practice, however, see Rule 3 — the rendered HTML should usually be the visible preview.
### Rule 3: Do not preview the generated PDF in an `<iframe>`
Browsers' built-in PDF viewer is unreliable inside sandboxed iframes, embedded extension contexts, and certain CSP-restricted hosts. A `blob:` URL pointing to a valid PDF will load into a top-level tab without issue but stay blank in `<iframe src="blob:...">` inside a sandbox. The conversion will succeed, the blob will be valid, the download will work, but the inline preview will be empty.
The architectural fix is to recognise that `docx-preview` is already producing a high-fidelity, paginated, **selectable** HTML rendering of the document. That rendering is the preview. The PDF is a derivative artefact that only needs to materialise at download time. The implementation should render `docx-preview` directly into the visible preview container — never into a hidden stage that is then mirrored into an iframe. This is also better UX outside sandboxed contexts: the HTML preview has selectable text, is scrollable, and renders faster than asking the browser to display a PDF.
### Rule 4: Choose CDN sources deliberately when loading without a bundler
`docx-preview` is **not** published on cdnjs. It is available on npm, jsDelivr, and unpkg. Hosts that enforce a strict CSP allowing only cdnjs (such as Claude's artifact iframe, Chrome extension contexts, and some enterprise application shells) will block loading from unpkg with a `script-src` violation. The library script never executes, the global `window.docx` is undefined, and the first call into the pipeline throws `TypeError: Cannot read properties of undefined (reading 'renderAsync')`.
For browser-only HTML deployments, use jsDelivr's `/npm/` path:
```html
<script src="https://cdn.jsdelivr.net/npm/docx-preview@0.3.5/dist/docx-preview.min.js"></script>
```
For bundled React applications, install via npm; CDN choice does not apply. Verify each library's presence on `window` before the first conversion call and surface a clear error to the user if any failed to load.
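A minimal sketch of the presence check follows. The global names are assumptions based on the usual UMD builds (`docx` for `docx-preview`, `html2canvas`, and `jspdf` for jsPDF 2.x); confirm them against the builds actually loaded. The function names are hypothetical.

```typescript
// Assumed UMD global names; adjust if your builds expose different ones.
const REQUIRED_GLOBALS = ["docx", "html2canvas", "jspdf"] as const;

// Returns the names of required libraries that are absent from `scope`.
function findMissingLibraries(scope: Record<string, unknown>): string[] {
  return REQUIRED_GLOBALS.filter((name) => scope[name] === undefined);
}

// Throws one readable error instead of a later "reading 'renderAsync'" TypeError.
function assertLibrariesLoaded(scope: Record<string, unknown>): void {
  const missing = findMissingLibraries(scope);
  if (missing.length > 0) {
    throw new Error(`Required libraries failed to load: ${missing.join(", ")}`);
  }
}
```

In a browser the scope would be the window object, for example `assertLibrariesLoaded(window as unknown as Record<string, unknown>)`, called before the first conversion.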
## 8. Error Handling and Edge Cases
The implementation must handle the following scenarios gracefully:
**Wrong file type.** When the user drops a `.pdf`, `.txt`, `.doc`, or any non-`.docx` file, the component shows an inline error message and does not enter the rendering stage. Validation is by file extension; MIME-type sniffing is unreliable across browsers.
**Corrupted or malformed `.docx`.** `docx-preview` will throw during `renderAsync` if the file is not a valid OOXML package or contains unparseable XML. The error must be caught, the status set to `"error"`, and the error message surfaced to the user. The component must remain in a state where another file can be selected.
**Empty document.** A valid `.docx` containing no content will produce an empty wrapper with no `<section>` elements. The implementation throws an explicit error rather than producing an empty PDF.
**Images with restrictive CORS.** With `useBase64URL: true`, `docx-preview` inlines embedded images as data URLs and CORS does not apply. If the option is changed to `false`, externally hosted images will taint the canvas and cause `toDataURL` to throw a `SecurityError`. Do not change this option.
**Very large documents.** Documents with more than ~50 pages may exhaust memory at `scale: 2` because each captured canvas is held in memory before being added to the PDF. For documents this large, the implementation should release each canvas (by setting its reference to null) immediately after `addImage` returns, and consider lowering `renderScale` to 1.5 when page count exceeds a threshold.
**Mixed page orientations.** Documents that switch from portrait to landscape mid-flow are handled by the per-page dimension calculation in Section 6.4. Do not assume all pages share the first page's dimensions.
**Rapid file changes.** If the user drops a second file while the first is still converting, the in-flight conversion must be cancelled or its results discarded. The simplest approach is to track an incrementing conversion ID; results from a non-current ID are ignored on completion. This is not strictly required for correctness — the second call will overwrite the first — but it prevents stale progress updates from confusing the status display.
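The incrementing conversion ID can be sketched as follows. `ConversionTracker` and `runConversion` are hypothetical names, and `convert` stands in for the full render, capture, and assembly pipeline.

```typescript
// Hands out increasing IDs; only the most recent one counts as current.
class ConversionTracker {
  private currentId = 0;

  begin(): number {
    this.currentId += 1;
    return this.currentId;
  }

  isCurrent(id: number): boolean {
    return id === this.currentId;
  }
}

// Runs `convert` and delivers its result only if no newer conversion
// started in the meantime; stale results are dropped silently.
async function runConversion<T>(
  tracker: ConversionTracker,
  convert: () => Promise<T>,
  onResult: (result: T) => void,
): Promise<void> {
  const id = tracker.begin();
  const result = await convert();
  if (tracker.isCurrent(id)) {
    onResult(result);
  }
}
```

The same `isCurrent` check can gate progress and status updates inside the pipeline, which is where stale output is most visible to the user.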
## 9. Performance Considerations
For a typical 5-page A4 document, end-to-end conversion on mid-range 2024 hardware takes 1.5–3 seconds. The dominant cost is `html2canvas` capture, which scales roughly linearly with page count and quadratically with `renderScale`. The `docx-preview` rendering stage typically takes 100–300 ms regardless of page count. PDF assembly is negligible.
Memory peaks during the capture loop, holding one canvas worth of pixels per page until added to the PDF. At `scale: 2`, a US Letter page (816 × 1056 CSS px at 96 DPI) becomes a 1632 × 2112 canvas, which is approximately 13.8 MB of RGBA data at 4 bytes per pixel. A 20-page document can therefore briefly hold ~275 MB before garbage collection.
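A rough estimator for canvas memory can make this budget explicit. It is a sketch under two assumptions: each canvas stores 4 bytes per pixel (RGBA), and its pixel size is the page's CSS size multiplied by the scale. The function names are hypothetical.

```typescript
const BYTES_PER_PIXEL = 4; // RGBA, one byte per channel

// Approximate raw pixel memory of one captured page.
function canvasBytes(cssWidthPx: number, cssHeightPx: number, scale: number): number {
  const widthPx = Math.round(cssWidthPx * scale);
  const heightPx = Math.round(cssHeightPx * scale);
  return widthPx * heightPx * BYTES_PER_PIXEL;
}

// Worst case for a document: every page canvas alive at the same time.
function captureBytes(
  pageSizes: Array<{ width: number; height: number }>,
  scale: number,
): number {
  return pageSizes.reduce(
    (total, page) => total + canvasBytes(page.width, page.height, scale),
    0,
  );
}
```

Doubling `scale` multiplies the result by four, which is the quadratic growth noted above; releasing each canvas after `addImage` keeps the real peak closer to a single `canvasBytes` value.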
Output PDF file sizes for a 5-page document at default settings are approximately 1.5–3 MB. Lowering `imageQuality` from 0.95 to 0.85 typically reduces output by 30% with no visible degradation; lowering below 0.80 introduces visible JPEG artifacts on text edges.
## 10. Browser Support
The component targets the current and one prior major version of Chrome, Edge, Firefox, and Safari. Internet Explorer is not supported. The relevant browser features are:
- `File` and `FileReader` APIs (universal since 2014)
- `Blob` and `URL.createObjectURL` (universal since 2014)
- Canvas `toDataURL` with JPEG support (universal since 2012)
- ES2020 syntax (set `target` accordingly in `tsconfig.json`)
`html2canvas` has known limitations rendering certain CSS features — `mix-blend-mode`, `backdrop-filter`, complex `clip-path` — that may affect documents using heavy graphical design. For Word documents this is rarely relevant; standard business documents do not invoke these features.
## 11. Testing
Implementations should be verified against the following test corpus:
| Test document | Asserts |
|---|---|
| Plain prose, 3 pages, A4 | Basic flow; page count and dimensions match |
| Document with one table per page | Tables render with borders and cell shading |
| Mixed portrait and landscape sections | Each PDF page matches its source orientation |
| Document with embedded PNG and JPEG images | Images appear in correct positions |
| Vietnamese-language document with diacritics | All characters render; no missing glyphs |
| Document with header and footer including page numbers | Headers/footers appear on every page |
| Document with bulleted and numbered lists | List markers render with correct indentation |
| 30-page document | Memory does not exceed 500 MB during capture |
| Corrupted .docx (truncated zip) | Component shows error and remains usable |
Beyond visual diffing of the rendered preview against the source `.docx` opened in Word, the captured PDF should be opened in a separate PDF reader (Acrobat, Preview, or Firefox's built-in viewer) to confirm that page dimensions, count, and rendered content match. Programmatic visual regression testing of the PDF output is beyond the scope of this spec but can be implemented by rasterising each page (for example with PDF.js) and diffing the result with `pixelmatch` if needed; `pdf-parse` alone extracts text only, and the PDF produced here contains none.
## 12. Known Limitations and Alternatives
The text in the output PDF is rasterised and therefore not selectable, searchable, copyable, or screen-readable. Users who need any of these properties — particularly accessibility for visually impaired users — must use a server-side converter that emits real PDF text objects. Recommended alternatives in decreasing order of fidelity and increasing order of cost:
1. **LibreOffice headless** (`soffice --convert-to pdf`): free, self-hosted, very high fidelity, requires a Linux server with LibreOffice installed. ~1–3 seconds per document.
2. **Aspose.Words Cloud or self-hosted**: paid, very high fidelity, native PDF text output, requires license.
3. **CloudConvert, ConvertAPI, or similar SaaS**: paid per-document, simple HTTP API, sends document contents to a third party.
The HTML preview produced by `docx-preview` *is* accessible — screen readers can navigate it, text is selectable, and users can zoom — so the component's accessibility story is intact for users who don't need the PDF artifact itself.
This component cannot edit, sign, redact, or annotate documents. For those features, evaluate `pdf-lib` (PDF mutation) or `docx` (DOCX generation, which is a different package than `docx-preview`).
## 13. Appendix: Algorithm Pseudocode
For reference, the complete conversion algorithm in 20 lines:
```
function convert(file, container):
    clear container
    await renderAsync(file, container, undefined, {
        inWrapper: true,
        breakPages: true,
        useBase64URL: true,
        experimental: true,
        renderHeaders: true, renderFooters: true, renderFootnotes: true,
    })
    await rAF; await sleep(50)
    pages = container.querySelectorAll("section.docx")
    if pages is empty: pages = container.querySelectorAll("section")
    if pages is empty: throw
    pdf = new jsPDF using pages[0] dimensions in mm
    for each page in pages:
        canvas = await html2canvas(page, scale=2, useCORS=true, bg=white)
        if not first page: pdf.addPage(page dimensions)
        pdf.addImage(canvas.toDataURL("image/jpeg", 0.95), "JPEG", 0, 0, w_mm, h_mm)
    return pdf.output("blob")
```
The pseudocode omits error handling, lifecycle management, and progress reporting, all of which are required in the production implementation per Sections 6.6 and 8.
---
*End of specification.*
@@ -0,0 +1,313 @@
# Applicant PDF / report preview — re-implementation guide
This document describes how PDF and “draft” preview work in the DYD frontend and backend so you can reproduce behavior in another codebase or refactor safely.
---
## 1. Two different things called “PDF”
| Path | What it is | Layout fidelity |
|------|------------|-------------------|
| **Server PDF** | `GET /api/reports/{reportId}/export/pdf` | Same as **Xuất Word**, then LibreOffice **DOCX → PDF**. Official template layout. |
| **Client “draft” preview** | `PdfExportDialog` + `PrintableReport` + `html2canvas` + `jspdf` | HTML recreation; **not** pixel-equal to Word. Used when server PDF fails (typical: LibreOffice missing). |
**Canonical document for official exports:** the filled `.docx`. The PDF is a **conversion**, not a separately maintained template.
Reference implementation:
- Dialog: [`fe0/src/components/PdfExportDialog.tsx`](../fe0/src/components/PdfExportDialog.tsx)
- Layout: [`fe0/src/components/PrintableReport.tsx`](../fe0/src/components/PrintableReport.tsx)
- Hooks: [`fe0/src/api/hooks.ts`](../fe0/src/api/hooks.ts)
- PDF handler: [`src/Backend/DYD.Application/Features/Reports/ExportReportPdf.cs`](../src/Backend/DYD.Application/Features/Reports/ExportReportPdf.cs)
- LibreOffice wrapper: [`src/Backend/DYD.Application/Common/Export/SofficeConverter.cs`](../src/Backend/DYD.Application/Common/Export/SofficeConverter.cs)
---
## 2. Why server PDF matches DOCX (margins, text, spacing)
[`ExportReportPdfHandler`](../src/Backend/DYD.Application/Features/Reports/ExportReportPdf.cs) runs:
1. `ExportReportDocxQuery` → bytes of the filled report `.docx` (identical pipeline to **Xuất Word**).
2. `SofficeConverter.WordToPdfAsync(wordBytes)` → writes temp `.docx`, runs **LibreOffice headless** `--convert-to pdf`, reads `input.pdf`.
So the PDF is **one layout engine pass** over the merged OpenXML file. Field text and structural spacing come from that document; you are not duplicating merge logic for PDF.
**Caveats:** Font availability on the server, LibreOffice vs Microsoft Word subtle differences, and very long unbroken strings may change wrapping. For strict WYSIWYG with Word desktop, compare in both viewers.
**Contrast:** The React `PrintableReport` path **does not** use the Word template—it renders its own HTML. Use it only as a **fallback preview**, not as a legal duplicate of the official form.
---
## 3. User flows (where the preview opens)
### 3.1 My Reports (`fe0/src/pages/MyReports.tsx`)
1. User clicks **Xuất PDF** on a row.
2. App calls `GET /api/reports/{id}/export/pdf` and downloads the blob on success.
3. If the error message contains `LibreOffice` or `soffice`, it alerts and opens `PdfExportDialog` with `{ reportId, initiativeId, reportCode }` from the row (initiative id is set correctly).
### 3.2 Dashboard Overview — Panel 2 (`fe0/src/pages/DashboardOverview.tsx`)
1. **Xuất PDF** calls the backend download path only (`handleExportPdfBackend`).
2. On LibreOffice failure, fallback opens `PdfExportDialog` but currently passes **`initiativeId: ''`**. That disables `useInitiative`, so **Section I** of `PrintableReport` may show placeholders. **Fix when re-implementing:** pass `selectedReport.initiativeId`.
### 3.3 Component names
There is no symbol `ApplicantPreviewPanel`. The modal title is **“Xem trước PDF”** (`PdfExportDialog`).
---
## 4. Data loading for the client preview
`PdfExportDialog` renders `PrintableReport` and passes data from three React Query hooks (all under authenticated `apiFetch`):
| Hook | Endpoint | Type (TS) |
|------|----------|-----------|
| `useInitiative(id)` | `GET /api/initiatives/{id}` | `InitiativeDetail` |
| `useReport(id)` | `GET /api/reports/{id}` | `Report` |
| `useDocumentsByReport(reportId)` | `GET /api/documents/by-report/{reportId}` | `DocumentListItem[]` |
Hooks use `enabled: !!id`. Empty `initiativeId` skips initiative fetch.
`DocumentListItem` includes optional **`content`** (JSON string) and **`summary`**. Each recognition document (Mẫu 01–04) stores form state as JSON in `content`.
`DocumentType` enum: `Description = 1`, `Application = 2`, `ContributionRatio = 3`, `Evaluation = 4`.
---
## 5. `PrintableReport` structure (fallback HTML)
### 5.1 Root layout
- `ref` on a single root `div` captured by `html2canvas`.
- Width **794px** (A4 at 96dpi), padding **40px**, `box-sizing: border-box`.
- Font: **Times New Roman**, 12px, line-height 1.5, black on white.
- Optional prop `schoolName` (default: `Đại học Y Dược TP. HCM`).
### 5.2 Sections and data sources
| Section | Source | Notes |
|---------|--------|------|
| Header (ministry, school, “BÁO CÁO SÁNG KIẾN”, export date) | Static + `new Date()` vi-VN | |
| Metadata grid (mã BC, mã SK, tiêu đề, đơn vị, trạng thái BC, dates) | `report`, `initiative` | Status via numeric map → Vietnamese label |
| **I.** Nội dung Sáng kiến | `initiative` | `shortSummary`, `description`, `objectives`, `scopeOfApplication`, `expectedOutcomes`, `startDate`/`endDate`, `estimatedBudget` |
| **II.** Kết quả thực tế | `report` | `actualOutcomes`, `actualBudget`, `implementationNotes`, `challenges`, `lessonsLearned` |
| **III.** Checklist 4 tài liệu | `documents` | Fixed order of four `DocumentType` values; codes, `DOCUMENT_STATUS_LABELS`, `approvalDate`; optional bullet list from `summary` |
| **IV.** Mẫu 01 | Parse `content` for `DocumentType.Description` → `DescriptionData` | Only if parsed object is truthy |
| **V.** Mẫu 02 | `DocumentType.Application` → `ApplicationData` | Authors/support staff tables, classification labels |
| **VI.** Mẫu 03 | `DocumentType.ContributionRatio` → `ContributionData` | Participants + total % row |
| **VII.** Mẫu 04 | `DocumentType.Evaluation` → `EvaluationData` | Scores table + total /100 |
| Footer | Static + date | |
Empty string fields use helper `Paragraph` → gray italic “(chưa có nội dung)”.
`pageBreakBefore` on IV–VII does **not** create real breaks in the **downloaded** PDF from html2canvas; see section 6.
---
## 6. Download pipeline (`PdfExportDialog` → file)
1. `await document.fonts.ready` if available (helps Vietnamese glyphs).
2. `html2canvas(ref, { scale: 2, backgroundColor: '#ffffff', useCORS: true, logging: false, windowWidth/Height: element scroll size })`.
3. `canvas.toDataURL('image/png')`.
4. `jsPDF` A4 portrait; image width = page width; height from aspect ratio.
5. **Multi-page:** repeatedly `addPage()` and `addImage` with a **negative Y offset** to slice one tall image across pages (`position = heightLeft - imgHeight`). **Limitation:** content can be cut mid-line or mid-table.
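The negative-offset slicing in step 5 can be sketched as a pure function that returns the Y position for each page. This is an illustration of the general technique, not a copy of `PdfExportDialog`: the name `sliceOffsets` is hypothetical, and the loop condition (`heightLeft > 0`) is an assumption.

```typescript
// Returns the Y offset (same unit as the inputs) at which the full-height
// image is drawn on each successive page. Every offset after the first is
// negative, so each page shows the next window of the tall image.
function sliceOffsets(imgHeight: number, pageHeight: number): number[] {
  if (imgHeight <= 0 || pageHeight <= 0) {
    throw new RangeError("imgHeight and pageHeight must be positive");
  }
  const offsets: number[] = [0]; // first page: image drawn from the top
  let heightLeft = imgHeight - pageHeight;
  while (heightLeft > 0) {
    offsets.push(heightLeft - imgHeight); // position = heightLeft - imgHeight
    heightLeft -= pageHeight;
  }
  return offsets;
}
```

Each returned offset corresponds to one `addImage` call, with `addPage()` before every call except the first. Because the windows are cut at fixed multiples of the page height, nothing prevents a cut from landing in the middle of a line or table row.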
**Output filename:** `{reportCode}_{YYYYMMDD}.pdf` (fallback codes: `report?.code` or `BaoCao`).
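A minimal sketch of the filename rule, assuming the date part uses the browser's local calendar date (the source does not say whether local or UTC is used). The function name and parameter names are hypothetical.

```typescript
// Builds `{reportCode}_{YYYYMMDD}.pdf`, falling back to the loaded report's
// code and then to the literal "BaoCao" when no code is available.
function pdfFileName(
  reportCode: string | undefined,
  loadedReportCode: string | undefined,
  date: Date,
): string {
  const code = reportCode || loadedReportCode || "BaoCao";
  const year = String(date.getFullYear());
  const month = String(date.getMonth() + 1).padStart(2, "0");
  const day = String(date.getDate()).padStart(2, "0");
  return `${code}_${year}${month}${day}.pdf`;
}
```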
**Dependencies:** `html2canvas`, `jspdf` (see `fe0/package.json`).
---
## 7. JSON shapes (`document.content`) — examples
Parse with `JSON.parse`; invalid JSON → section omitted (or empty). Shapes are defined in [`PrintableReport.tsx`](../fe0/src/components/PrintableReport.tsx) and should stay aligned with the four tab forms that persist `PUT /api/documents/{id}`.
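The "invalid JSON means the section is omitted" rule can be sketched as a tolerant parser that returns `null` instead of throwing. The helper name is hypothetical, and the cast to `T` is an assumption: no runtime schema validation is performed.

```typescript
// Returns the parsed object, or null when content is missing, is not valid
// JSON, or does not parse to an object. Callers skip the section on null.
function parseDocumentContent<T>(content: string | null | undefined): T | null {
  if (!content) {
    return null;
  }
  try {
    const parsed: unknown = JSON.parse(content);
    return typeof parsed === "object" && parsed !== null ? (parsed as T) : null;
  } catch {
    return null;
  }
}
```

A caller would use it as `const description = parseDocumentContent<DescriptionData>(doc.content)` and render Section IV only when the result is not `null`.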
### 7.1 Mẫu 01 — `DescriptionData` (`DocumentType.Description`)
```json
{
"introduction": "Mở đầu...",
"initiativeName": "Tên sáng kiến",
"applicationField": "Lĩnh vực",
"currentStatus": "Tình trạng giải pháp đã biết",
"purpose": "Mục đích",
"solutionContent": "Nội dung giải pháp",
"implementationSteps": "Các bước thực hiện",
"conditions": "Điều kiện áp dụng",
"trialUnits": [
{ "id": 1, "name": "Đơn vị A", "address": "Địa chỉ", "field": "Lĩnh vực áp dụng" }
],
"novelty": "Tính mới",
"effectiveness": {
"economic": "...",
"social": "...",
"teaching": "...",
"productivity": "...",
"quality": "...",
"environment": "...",
"safety": "..."
},
"confidentialInfo": "...",
"submissionDate": "2026-01-15",
"authorName": "..."
}
```
### 7.2 Mẫu 02 — `ApplicationData` (`DocumentType.Application`)
```json
{
"unitName": "Đơn vị chủ quản",
"initiativeName": "Tên SK đề nghị",
"investorName": "Chủ đầu tư",
"applicationField": "Lĩnh vực",
"firstApplyDate": "2025-06-01",
"authors": [
{
"id": 1,
"name": "Nguyễn Văn A",
"dob": "01/01/1980",
"workplace": "Khoa X",
"title": "PGS",
"qualification": "TS",
"contributionPercent": 60
}
],
"initiativeClassification": "technical",
"contentSummary": "Tóm tắt nội dung",
"confidentialInfo": "",
"conditions": "",
"authorEvaluation": "",
"trialEvaluation": "",
"supportStaff": [
{
"id": 1,
"name": "Trợ lý",
"dob": "",
"workplace": "",
"title": "",
"qualification": "",
"supportContent": "Hỗ trợ hành chính"
}
],
"submissionDay": 10,
"submissionMonth": 5,
"submissionYear": "2026"
}
```
Classification values: `technical` | `research` | `textbook` (mapped to long Vietnamese labels in UI).
### 7.3 Mẫu 03 — `ContributionData` (`DocumentType.ContributionRatio`)
```json
{
"initiativeName": "Tên SK",
"mainAuthor": "Tác giả chính",
"position": "Trưởng khoa — Khoa X",
"representativePercent": 40,
"submissionDate": "2026-05-01",
"participants": [
{ "id": 1, "fullName": "Nguyễn B", "workUnit": "Khoa Y", "contributionPercent": 40 }
],
"digitalSignatureConfirmed": true
}
```
Total row sums `participants[].contributionPercent` (display only; not validated here).
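The total row can be sketched as a plain sum over `participants`. The `Participant` interface mirrors the JSON example above; treating a missing or non-numeric percentage as 0 is an assumption, since the source only says the value is not validated.

```typescript
interface Participant {
  id: number;
  fullName: string;
  workUnit: string;
  contributionPercent: number;
}

// Display-only total: no check that the result equals 100.
function totalContributionPercent(participants: Participant[]): number {
  return participants.reduce(
    (sum, p) => sum + (Number(p.contributionPercent) || 0),
    0,
  );
}
```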
### 7.4 Mẫu 04 — `EvaluationData` (`DocumentType.Evaluation`)
```json
{
"initiativeName": "Tên SK",
"authorName": "Tác giả",
"position": "Chức vụ",
"evaluationDate": "2026-05-11",
"noveltyLevel": "high",
"noveltyScore": 35,
"noveltyComment": "Nhận xét tính mới",
"effectivenessLevel": "medium",
"effectivenessScore": 45,
"effectivenessComment": "Nhận xét hiệu quả",
"conclusion": "Kết luận"
}
```
Levels: `high` | `medium` | `low`. Printed table shows shortened level text (split on ` (`).
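The shortening step can be sketched as cutting the label at the first ` (`. The function name is hypothetical, and the actual label strings live in `PrintableReport.tsx`; they are not reproduced here.

```typescript
// Keeps the text before the first " (" and returns the label unchanged
// when it has no parenthesised suffix.
function shortLevelLabel(fullLabel: string): string {
  const cut = fullLabel.indexOf(" (");
  return cut === -1 ? fullLabel : fullLabel.slice(0, cut);
}
```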
---
## 8. Field inventory (PrintableReport) — quick reference
### 8.1 `InitiativeDetail` → Section I and header
| UI label (approx.) | Field on `InitiativeDetail` |
|--------------------|----------------------------|
| Mã Sáng kiến | `code` |
| Tiêu đề SK | `title` |
| Đơn vị chủ trì | `owningUnitName` |
| Mô tả tóm tắt | `shortSummary` |
| Mô tả chi tiết | `description` |
| Mục tiêu | `objectives` |
| Phạm vi áp dụng | `scopeOfApplication` |
| Kết quả dự kiến | `expectedOutcomes` |
| Thời gian áp dụng | `startDate`, `endDate` (ISO) |
| Kinh phí dự toán | `estimatedBudget` |
### 8.2 `Report` → Section II and header
| UI label | Field on `Report` |
|----------|-------------------|
| Mã Báo cáo | `code` |
| Trạng thái BC | `status` (numeric → label map) |
| Ngày nộp BC | `submissionDate` |
| Ngày duyệt BC | `approvalDate` |
| Kết quả đạt được | `actualOutcomes` |
| Kinh phí thực tế | `actualBudget` |
| Ghi chú triển khai | `implementationNotes` |
| Khó khăn | `challenges` |
| Bài học | `lessonsLearned` |
### 8.3 `DocumentListItem` → Section III
| UI | Field |
|----|--------|
| Loại | `type` + `DOCUMENT_TYPE_LABELS` |
| Mã | `code` |
| Trạng thái | `status` + `DOCUMENT_STATUS_LABELS` |
| Ngày duyệt | `approvalDate` |
| Tóm tắt (bullets) | `summary` |
### 8.4 JSON document sections
See section 7 for keys. Every field in `DescriptionData`, `ApplicationData`, `ContributionData`, `EvaluationData` maps 1:1 to labels inside [`PrintableReport.tsx`](../fe0/src/components/PrintableReport.tsx) (search `Paragraph label=` and table headers).
---
## 9. Re-implementation checklist
- [ ] **Official PDF/Word:** Call the same APIs (`export/docx`, `export/pdf`) so template fidelity stays server-side.
- [ ] **Fallback modal:** Fetch initiative + report + documents; render a single scrollable column layout; optional: fix Dashboard Overview `initiativeId` on fallback.
- [ ] **Download:** Match `html2canvas` + `jspdf` multi-page slicing or replace with a service that returns vector PDF (e.g. server-only preview URL).
- [ ] **i18n:** Labels are hardcoded Vietnamese in `PrintableReport`.
- [ ] **Tests:** Fixture objects for `InitiativeDetail`, `Report`, four `DocumentListItem` records with sample `content` JSON; snapshot or visual regression on the root `div` dimensions.
---
## 10. Embedded PDF preview (optional UX)
If you need an in-app preview that **matches** the official PDF (as in a browser PDF viewer with page count):
1. `GET /api/reports/{id}/export/pdf` → `blob` → `URL.createObjectURL(blob)`.
2. Use `<iframe src={url} />` or `react-pdf` / PDF.js on that URL.
3. For side-by-side Word, same pattern with `GET /api/reports/{id}/export/docx` (browser may download; for in-browser preview consider a docx preview library or server-rendered HTML).
This path keeps **one** generation pipeline and avoids duplicating the Word layout in React.
---
*Generated to match the DYD codebase layout at the time of writing; update file paths if the repo structure changes.*
@@ -0,0 +1,343 @@
# Simplified Tech Stack for Local Governance Layer
## Analysis & Simplification Strategy
### Key Observations
1. **Local Application Context**: Single-server deployment, not distributed
2. **Existing Stack**: Already using FastAPI + PostgreSQL
3. **Complexity Overkill**: Enterprise tools (Kafka, Camunda, Elasticsearch) are unnecessary for local deployment
4. **Core Needs**: State machine, rules engine, document storage, audit logging
---
## Simplified Tech Stack Recommendation
### ✅ **Core Stack (Keep These)**
| Component | Technology | Rationale |
|-----------|-----------|-----------|
| **Database** | PostgreSQL 15+ | ✅ Already in use, supports JSONB, excellent for local deployment |
| **API Framework** | FastAPI (Python) | ✅ Already in use, fast, async, great for this use case |
| **Document Storage** | Local filesystem + PostgreSQL (metadata) | ✅ Simple, no external service needed |
| **Business Rules** | Custom Python classes/functions | ✅ Lightweight, maintainable, no external engine needed |
### 🔄 **Replace Complex Components**
| Original Suggestion | Simplified Alternative | Why |
|-------------------|----------------------|-----|
| **Camunda/Temporal** | Custom state machine (Python) | Simple workflow states, no need for enterprise orchestration |
| **Elasticsearch + ML** | PostgreSQL full-text search + `pg_trgm` (trigram similarity) | Built-in, sufficient for duplicate detection |
| **Apache Kafka/RabbitMQ** | PostgreSQL NOTIFY/LISTEN or in-memory event queue | Simple pub/sub, no separate service |
| **AWS S3/MinIO** | Local filesystem with organized folders | Direct file storage, simpler for local |
| **Drools** | Python rule functions/classes | More maintainable, easier to debug |
---
## Recommended Simplified Architecture
### 1. **Database Layer**
```python
# Single PostgreSQL database with:
# - Core tables (initiatives, authors, reviews, etc.)
# - JSONB columns for flexible metadata
# - Full-text search indexes (GIN indexes on text fields)
# - pg_trgm extension for similarity matching
```
**Benefits:**
- No additional services
- ACID compliance
- Built-in full-text search
- Trigram similarity for duplicate detection
### 2. **Business Rules Engine**
```python
# Custom Python classes (signatures only; bodies are elided)
from __future__ import annotations  # lets the type names below stay unresolved


class NoveltyChecker:
    def check(self, initiative: Initiative) -> ValidationResult: ...


class ScoringEngine:
    def calculate_score(self, reviews: list[Review]) -> Score: ...


class WorkflowStateMachine:
    def transition(self, initiative: Initiative, action: str) -> State: ...
```
**Benefits:**
- Easy to test and debug
- No external dependencies
- Version control friendly
- Can be extended incrementally
### 3. **Workflow Engine**
```python
# Simple state machine
class InitiativeWorkflow:
    STATES = ['DRAFT', 'SUBMITTED', 'UNIT_REVIEW', ...]
    TRANSITIONS = {
        'DRAFT': ['SUBMITTED'],
        'SUBMITTED': ['UNIT_REVIEW', 'REJECTED'],
        # ... (remaining transitions elided)
    }

    def can_transition(self, from_state, to_state, user_role):
        # Check permissions and business rules
        pass
```
**Benefits:**
- No external workflow engine
- Easy to understand and modify
- Can store state in database
- Lightweight
### 4. **Document Storage**
```text
# Local filesystem structure
/initiatives/
  /{initiative_id}/
    /forms/
      form_01_v1.pdf
      form_03_v1.pdf
    /reviews/
      review_001.pdf
    /attachments/
      evidence_001.pdf
```
```sql
-- Metadata in PostgreSQL
CREATE TABLE document_metadata (
    id UUID PRIMARY KEY,
    initiative_id UUID REFERENCES initiatives(id),
    file_path TEXT,
    form_type VARCHAR(50),
    version INT,
    uploaded_by UUID,
    uploaded_at TIMESTAMP,
    checksum VARCHAR(64)
);
```
**Benefits:**
- No object storage service needed
- Easy backup (just copy folder)
- Direct file access
- Simple versioning
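
The folder layout and the `checksum VARCHAR(64)` column above can be tied together with a small helper. This is a minimal sketch, not code from the repo: the function name `store_document`, the category whitelist, and the choice of SHA-256 (whose hex digest is 64 characters) are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Sub-folders taken from the layout above.
CATEGORIES = {"forms", "reviews", "attachments"}


def store_document(root: Path, initiative_id: str, category: str,
                   filename: str, data: bytes) -> dict:
    """Write `data` under root/initiatives/{initiative_id}/{category}/ and
    return values suitable for the file_path and checksum columns."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    # Keep only the last path component so a crafted name cannot leave the folder.
    safe_name = Path(filename).name
    if not safe_name:
        raise ValueError("filename is empty")
    target_dir = root / "initiatives" / initiative_id / category
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / safe_name
    target.write_bytes(data)
    return {
        "file_path": target.relative_to(root).as_posix(),
        "checksum": hashlib.sha256(data).hexdigest(),  # 64 hex characters
    }
```

Storing the path relative to the storage root keeps the metadata valid if the folder is moved or restored from a backup copy.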
### 5. **Duplicate Detection**
```sql
-- Use PostgreSQL trigram similarity
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Similarity query: compare each pair of initiatives once
SELECT
    i1.id,
    i1.title,
    i2.id AS similar_id,
    similarity(i1.description, i2.description) AS sim_score
FROM initiatives i1
JOIN initiatives i2 ON i1.id < i2.id  -- skips self-pairs and mirrored duplicates
WHERE similarity(i1.description, i2.description) > 0.7
ORDER BY sim_score DESC;
```
**Benefits:**
- Built into PostgreSQL
- No ML model training needed
- Fast enough for local scale
- Can be enhanced with custom logic
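
To make the trigram idea concrete, here is a rough pure-Python approximation of what `similarity()` computes. It is a sketch for intuition and unit tests only: the real `pg_trgm` implementation lives in PostgreSQL and may differ in details such as character handling, and the function names are hypothetical.

```python
import re


def trigrams(text: str) -> set[str]:
    """Lowercase the text, split it into alphanumeric words, pad each word
    with two leading spaces and one trailing space, then collect 3-grams."""
    grams: set[str] = set()
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

Because the score depends only on shared character triples, it tolerates small spelling differences but knows nothing about meaning, which is the trade-off accepted by dropping the ML model.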
### 6. **Event System**
```python
# Simple in-memory event dispatcher
class EventDispatcher:
    def __init__(self):
        self.listeners = {}

    def subscribe(self, event_type, callback):
        if event_type not in self.listeners:
            self.listeners[event_type] = []
        self.listeners[event_type].append(callback)

    def emit(self, event_type, data):
        for callback in self.listeners.get(event_type, []):
            callback(data)

# Or use PostgreSQL NOTIFY/LISTEN for cross-process delivery
# (NOTIFY does not store messages; write events to a table if they must survive restarts)
```
**Benefits:**
- No message broker needed
- Simple pub/sub pattern
- Can persist events to database if needed
- Easy to add email notifications
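
One practical concern with an in-process dispatcher is that a failing listener (for example an email sender) should not stop the others. The variant below is a hedged sketch of that idea; `SafeEventDispatcher`, the `status_changed` event name, and the payload keys are hypothetical and not part of the existing code.

```python
import logging
from collections import defaultdict
from typing import Any, Callable

logger = logging.getLogger(__name__)
Listener = Callable[[dict[str, Any]], None]


class SafeEventDispatcher:
    """In-memory pub/sub where one failing listener does not block the rest."""

    def __init__(self) -> None:
        self._listeners: dict[str, list[Listener]] = defaultdict(list)

    def subscribe(self, event_type: str, callback: Listener) -> None:
        self._listeners[event_type].append(callback)

    def emit(self, event_type: str, data: dict[str, Any]) -> int:
        """Call every listener; return how many finished without raising."""
        delivered = 0
        for callback in list(self._listeners.get(event_type, [])):
            try:
                callback(data)
                delivered += 1
            except Exception:
                logger.exception("listener failed for %s", event_type)
        return delivered


# Usage: queue a notification whenever an initiative changes state.
outbox: list[str] = []
dispatcher = SafeEventDispatcher()
dispatcher.subscribe(
    "status_changed",
    lambda event: outbox.append(f"notify owner of {event['initiative_id']}"),
)
dispatcher.emit("status_changed", {"initiative_id": "demo-1", "new_state": "SUBMITTED"})
```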
### 7. **Audit Logging**
```sql
-- Simple append-only table
CREATE TABLE audit_log (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    initiative_id UUID,
    actor_id UUID,
    action VARCHAR(100),
    timestamp TIMESTAMP DEFAULT NOW(),
    previous_state JSONB,
    new_state JSONB,
    metadata JSONB
);

CREATE INDEX idx_audit_initiative ON audit_log(initiative_id);
CREATE INDEX idx_audit_timestamp ON audit_log(timestamp);
```
**Benefits:**
- No separate audit system
- Queryable with SQL
- Can export for compliance
- Simple to implement
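
Writing to this table from Python mostly means serialising the three JSONB columns consistently. The sketch below prepares the parameters for an insert; the `$1` placeholder style assumes an asyncpg-like driver, and `AUDIT_INSERT` and `audit_params` are hypothetical names.

```python
import json
from typing import Any, Optional

# Placeholder style depends on the driver; $1..$6 is an assumption here.
AUDIT_INSERT = (
    "INSERT INTO audit_log "
    "(initiative_id, actor_id, action, previous_state, new_state, metadata) "
    "VALUES ($1, $2, $3, $4::jsonb, $5::jsonb, $6::jsonb)"
)


def audit_params(initiative_id: str, actor_id: str, action: str,
                 previous_state: Optional[dict[str, Any]],
                 new_state: Optional[dict[str, Any]],
                 metadata: Optional[dict[str, Any]] = None) -> tuple:
    """Return the positional parameters for AUDIT_INSERT."""
    if not action or len(action) > 100:
        raise ValueError("action must be 1-100 characters to fit VARCHAR(100)")

    def as_json(value: Optional[dict[str, Any]]) -> Optional[str]:
        # None stays NULL; dicts become stable JSON text.
        return None if value is None else json.dumps(value, sort_keys=True)

    return (initiative_id, actor_id, action,
            as_json(previous_state), as_json(new_state), as_json(metadata))
```

The table stays append-only as long as application code only ever issues this insert and never updates or deletes rows.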
---
## Complete Simplified Stack
### **Backend**
```
FastAPI (Python)
├── Database: PostgreSQL 15+
│   ├── Core tables (initiatives, authors, reviews, etc.)
│   ├── JSONB for flexible data
│   ├── Full-text search (GIN indexes)
│   ├── Trigram similarity (pg_trgm)
│   └── Audit log table
├── Business Logic: Custom Python classes
│   ├── NoveltyChecker
│   ├── ScoringEngine
│   ├── WorkflowStateMachine
│   └── DuplicateDetector
├── Document Storage: Local filesystem
│   └── Organized folder structure
├── Event System: In-memory dispatcher + PostgreSQL NOTIFY
└── API: FastAPI REST endpoints
```
### **Frontend** (Already in place)
```
React + TypeScript
├── Feature-based architecture
├── React Query for data fetching
└── Existing UI components
```
---
## Implementation Priority
### **Phase 1: Core Foundation** (Week 1-2)
1. ✅ Database schema (PostgreSQL)
2. ✅ Basic CRUD APIs (FastAPI)
3. ✅ Document upload/storage (local filesystem)
4. ✅ Basic state machine (Python class)
### **Phase 2: Business Rules** (Week 3-4)
1. ✅ Novelty checking (PostgreSQL similarity)
2. ✅ Author contribution validation
3. ✅ Scoring algorithm (Group 01)
4. ✅ Auto-classification (Group 02)
### **Phase 3: Workflow & Notifications** (Week 5-6)
1. ✅ Complete state machine transitions
2. ✅ Deadline tracking & alerts
3. ✅ Email notifications (SMTP)
4. ✅ Duplicate detection & mediation
### **Phase 4: Advanced Features** (Week 7-8)
1. ✅ Reporting & analytics
2. ✅ Audit trail queries
3. ✅ Role-based permissions
4. ✅ Appeal workflow
---
## Technology Comparison
### **Original Stack Complexity**
- 8+ services to manage
- External dependencies (Kafka, Elasticsearch, S3)
- Complex deployment
- Higher resource usage
- Steeper learning curve
### **Simplified Stack**
- 2 services (FastAPI + PostgreSQL)
- Minimal external dependencies
- Simple deployment
- Lower resource usage
- Easier to maintain
---
## When to Scale Up
Consider adding complexity only if:
- **>10,000 initiatives/year**: Add Elasticsearch for search
- **>100 concurrent users**: Add Redis for caching
- **Multi-server deployment**: Add message queue (RabbitMQ)
- **Advanced ML needed**: Add dedicated ML service
- **Cloud deployment**: Use S3 for documents
For a local application with <5,000 initiatives/year, the simplified stack is sufficient.
---
## Code Structure Example
```
be0/
├── src/
│   ├── domain/
│   │   ├── entities/
│   │   │   ├── initiative.py
│   │   │   ├── author.py
│   │   │   └── review.py
│   │   └── rules/
│   │       ├── novelty_checker.py
│   │       ├── scoring_engine.py
│   │       └── duplicate_detector.py
│   ├── application/
│   │   ├── services/
│   │   │   ├── workflow_service.py
│   │   │   └── notification_service.py
│   │   └── state_machine.py
│   ├── infrastructure/
│   │   ├── database/
│   │   │   └── models.py
│   │   ├── storage/
│   │   │   └── file_storage.py
│   │   └── events/
│   │       └── dispatcher.py
│   └── api/
│       └── routes/
│           └── initiatives.py
└── storage/
    └── documents/
        └── initiatives/
```
---
## Summary
**Simplified Stack:**
- ✅ PostgreSQL (database + search + similarity)
- ✅ FastAPI (API framework)
- ✅ Python (business rules + workflow)
- ✅ Local filesystem (document storage)
- ✅ In-memory events (or PostgreSQL NOTIFY)
**Removed:**
- ❌ Camunda/Temporal (use custom state machine)
- ❌ Elasticsearch (use PostgreSQL full-text search)
- ❌ Kafka/RabbitMQ (use simple event dispatcher)
- ❌ S3/MinIO (use local filesystem)
- ❌ Drools (use Python functions)
**Result:** Simpler, easier to maintain, sufficient for local deployment, can scale up later if needed.
@@ -0,0 +1,113 @@
# Phase 1 Signup page — how it works
This document describes [`Signup.tsx`](./Signup.tsx): a **standalone registration screen** for the `fe01` “phase1” demo flow. Use it as a spec when re-implementing the same behavior elsewhere.
## Where it lives in the app
- **Component:** `fe01/src/phase1/pages/Signup.tsx` (default export `Signup`)
- **Route:** `/phase1/signup` (see `fe01/src/App.tsx`)
This is **not** the same as the main app's tabbed login/signup at `/signup`, which uses `SignupForm` (`fe01/src/components/auth/SignupForm.tsx`) with extra fields and optional OTP. The phase1 page is intentionally simpler.
## Dependencies
| Area | What |
|------|------|
| **Router** | `react-router-dom`: `Link`, `useNavigate` — success screen uses `navigate('/login')` (root login, not phase1). Link at bottom goes to `/phase1/login`. |
| **API** | `authService` from `@/phase1/lib/auth-service`: `register()` and `resendVerify()`. |
| **UI** | shadcn-style: `Button`, `Input`, `Label`, `Card*`, `Alert`, `Select`; icons from `lucide-react`. |
**Base URL:** `authService` uses `import.meta.env.VITE_API_URL` or falls back to `http://localhost:3000` (see `auth-service.ts`). Ensure your re-implementation uses the same env variable or your chosen API base.
## User-facing flow (two screens)
1. **Form screen** — collect name, school email, password, role → submit.
2. **Success screen** — shown when registration succeeds: explains that a verification email was sent, offers **resend verification** and **back to login**.
There is **no** auto-redirect after signup; the user must verify their email via the link in the message (handled by the backend plus a separate verify route elsewhere, e.g. `/verify-email?userId=...&token=...`).
## State model
| State | Type | Role |
|-------|------|------|
| `email`, `password`, `fullName` | `string` | Controlled form fields |
| `role` | `'Admin' \| 'Editor' \| 'Viewer'` | Default `'Viewer'` |
| `error` | `string` | Inline validation / API error message |
| `loading` | `boolean` | Disables inputs and shows spinner on submit/resend |
| `successMsg` | `string \| null` | Message from API after successful register or resend |
| `registeredEmail` | `string \| null` | Email used for resend; together with `successMsg` gates the success UI |
**Success UI condition:** `successMsg && registeredEmail` — both must be set after a successful `register()`.
## Role picker
- **Canonical values** sent to API: `Admin`, `Editor`, `Viewer` (must match backend).
- **Display:** Vietnamese labels + icons + a one-line **description** under the select (`ROLE_OPTIONS` array in the component).
Re-implementing: keep the **API values** exactly as the backend expects; only labels/descriptions are presentation.
## Client-side validation (before `register`)
Order in `handleSubmit`:
1. Clear `error`.
2. `fullName` — non-empty after trim.
3. `email` — non-empty after trim.
4. **Domain check** via `validateDomain()`:
- Must contain `@`.
- Part after `@` (lower case, trimmed) must equal one of **`ump.edu.vn`** or **`umc.edu.vn`** (hardcoded hint list `ALLOWED_DOMAINS_HINT`).
5. `password` — length ≥ 8.
**Important:** This list is a **UX hint**. The file comment states the server re-validates using `SystemSettings` / `Auth.AllowedEmailDomains`. Your copy can change the hint list, but must stay aligned with server policy.
**On submit:** email is sent as `email.trim().toLowerCase()`; `fullName` is `fullName.trim()`.
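
The validation order above can be restated as a short reference sketch. It is written in Python purely for illustration (the page itself is TSX), for example as a starting point for a server-side mirror or a test oracle. The error strings and function names are hypothetical, and splitting on the last `@` is an assumption about how the "part after `@`" is taken.

```python
from typing import Optional

# Hint list quoted in this document; the server remains the source of truth.
ALLOWED_DOMAINS_HINT = ("ump.edu.vn", "umc.edu.vn")


def validate_domain(email: str) -> bool:
    """True when the part after '@' (trimmed, lower-cased) is in the hint list."""
    if "@" not in email:
        return False
    domain = email.rsplit("@", 1)[1].strip().lower()
    return domain in ALLOWED_DOMAINS_HINT


def validate_signup(full_name: str, email: str,
                    password: str) -> tuple[Optional[str], Optional[dict]]:
    """Run the checks in the documented order; return (error, payload)."""
    if not full_name.strip():
        return "full name is required", None
    if not email.strip():
        return "email is required", None
    if not validate_domain(email):
        return "email domain is not allowed", None
    if len(password) < 8:
        return "password must be at least 8 characters", None
    # Normalisation applied on submit: trimmed name, trimmed lower-case email.
    payload = {
        "email": email.strip().lower(),
        "fullName": full_name.strip(),
        "password": password,
    }
    return None, payload
```

Returning the first failure only matches the page's single inline `error` string; a re-implementation that shows per-field errors would collect all failures instead.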
## API: `register`
**Call:** `authService.register({ email, password, fullName, role })`
**HTTP:** `POST {API_URL}/auth/register`
**Headers:** `Content-Type: application/json`, `X-Client-Origin: window.location.origin`
**Body:** JSON `{ email, password, fullName, role }` with `role` one of `Admin` | `Editor` | `Viewer`.
**On failure (`!res.ok`):** `authService` returns `{ success: false, error: string }` — page sets `error` and stops.
**On success:** `{ success: true, userId, email, role, emailSent, message }` — page sets `registeredEmail` from `res.email`, `successMsg` from `res.message`, does **not** set tokens (registration does not log the user in).
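
For a re-implementation outside the browser (a script or an integration test), the request described above could be assembled as follows. This is a sketch under assumptions: `build_register_request` is a hypothetical helper, the origin is passed in explicitly because there is no `window.location`, and nothing is sent over the network here.

```python
import json
from urllib.request import Request

ROLES = {"Admin", "Editor", "Viewer"}  # canonical API values from this document


def build_register_request(api_url: str, client_origin: str, *, email: str,
                           password: str, full_name: str, role: str) -> Request:
    """Build (but do not send) the POST {API_URL}/auth/register request."""
    if role not in ROLES:
        raise ValueError(f"unsupported role: {role}")
    body = json.dumps({
        "email": email,
        "password": password,
        "fullName": full_name,
        "role": role,
    }).encode("utf-8")
    return Request(
        f"{api_url.rstrip('/')}/auth/register",
        data=body,
        headers={
            "Content-Type": "application/json",
            "X-Client-Origin": client_origin,
        },
        method="POST",
    )
```

Keeping the role check next to the request builder makes a mismatch with the backend enum fail early rather than as an opaque API error.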
## API: resend verification
**When:** User is on success screen and clicks “Gửi lại email xác nhận”.
**Call:** `authService.resendVerify(registeredEmail)`
**HTTP:** `POST {API_URL}/auth/resend-verify` with JSON `{ email }`, same `X-Client-Origin` pattern.
**On response:** Page updates `successMsg` with `res.message` (does not clear the success layout).
## Error display
- Errors render in a destructive `Alert` above the submit button.
- Success screen uses a non-destructive `Alert` for `successMsg`.
## Styling / layout notes
- **Form:** centered card on `bg-muted/20`, max width `max-w-md`.
- **Success:** full-height centered layout with gradient background and slightly richer card styling (backdrop blur, green mail icon).
- **Accessibility:** labels use `htmlFor` matching input `id`; password field has `minLength={8}` and `required` in addition to JS checks.
## Checklist for a faithful re-implementation
- [ ] Same route or intentionally different — document if you move it.
- [ ] `register` + `resendVerify` endpoints, payloads, and headers (`X-Client-Origin`, `credentials: 'include'` if cookies matter for your stack).
- [ ] Two-step UI: form → “check your email” with resend + login navigation.
- [ ] Client validation order and domain hint list (or replace hint with server-driven list if you add an endpoint).
- [ ] Role enum matches backend; Vietnamese copy optional.
- [ ] Distinguish this simple flow from `SignupForm` (OTP/extra fields) if both exist in the product.
## Related files
- `fe01/src/phase1/lib/auth-service.ts`: `register`, `resendVerify`, `verifyEmail`
- `fe01/src/App.tsx` — route wiring
- `fe01/src/components/auth/SignupForm.tsx` — richer signup used on main `/login` + `/signup` (different product path)
@@ -0,0 +1,259 @@
# Tech Stack Comparison: Original vs Simplified
## Quick Reference
### Original Suggestions → Simplified Alternatives
| Requirement | Original Tech | Simplified Tech | Complexity Reduction (rough estimate) |
|------------|--------------|----------------|---------------------|
| **Workflow Engine** | Camunda / Temporal | Custom Python state machine | 90% simpler |
| **Document Storage** | AWS S3 / MinIO | Local filesystem + PostgreSQL metadata | 80% simpler |
| **Search & Duplicate Detection** | Elasticsearch + ML (Sentence-BERT) | PostgreSQL full-text + pg_trgm | 85% simpler |
| **Event Bus** | Apache Kafka / RabbitMQ | PostgreSQL NOTIFY/LISTEN or in-memory | 90% simpler |
| **Business Rules** | Drools | Custom Python classes/functions | 70% simpler |
| **Audit Log** | Separate WORM storage | PostgreSQL append-only table | 60% simpler |
---
## Detailed Simplifications
### 1. Workflow Engine
**Original:** Camunda or Temporal
- Separate service to run
- Complex BPMN diagrams
- Additional database
- Learning curve
**Simplified:** Custom Python State Machine
```python
# ~100 lines of code
class InitiativeWorkflow:
    # Maps each state to the states it may move to
    STATES = {
        'DRAFT': ['SUBMITTED'],
        'SUBMITTED': ['UNIT_REVIEW', 'REJECTED'],
        'UNIT_REVIEW': ['COUNCIL_REVIEW', 'REJECTED'],
        'COUNCIL_REVIEW': ['APPROVED', 'REJECTED'],
        'APPROVED': ['FINALIZED', 'APPEAL'],
        'REJECTED': ['APPEAL'],
        'APPEAL': ['APPROVED', 'REJECTED', 'FINALIZED'],
        'FINALIZED': []
    }

    def can_transition(self, from_state, to_state, user_role):
        # user_role is accepted but not checked yet
        return to_state in self.STATES.get(from_state, [])
```
**Savings:**
- No separate service
- No BPMN learning
- Easier to debug
- Version controlled
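
The `user_role` argument in the state machine above is accepted but never consulted. One possible way to add a permission check is sketched below. The role names and the `ROLE_FOR_TARGET` mapping are invented for illustration and would need to be replaced with the real roles.

```python
# Same transition table as above.
TRANSITIONS = {
    "DRAFT": ["SUBMITTED"],
    "SUBMITTED": ["UNIT_REVIEW", "REJECTED"],
    "UNIT_REVIEW": ["COUNCIL_REVIEW", "REJECTED"],
    "COUNCIL_REVIEW": ["APPROVED", "REJECTED"],
    "APPROVED": ["FINALIZED", "APPEAL"],
    "REJECTED": ["APPEAL"],
    "APPEAL": ["APPROVED", "REJECTED", "FINALIZED"],
    "FINALIZED": [],
}

# Hypothetical: which roles may move an initiative INTO each state.
ROLE_FOR_TARGET = {
    "SUBMITTED": {"applicant"},
    "UNIT_REVIEW": {"unit_reviewer"},
    "COUNCIL_REVIEW": {"unit_reviewer"},
    "APPROVED": {"council"},
    "REJECTED": {"unit_reviewer", "council"},
    "APPEAL": {"applicant"},
    "FINALIZED": {"council"},
}


def can_transition(from_state: str, to_state: str, user_role: str) -> bool:
    """Allowed only if the edge exists AND the role may enter the target state."""
    edge_exists = to_state in TRANSITIONS.get(from_state, [])
    role_allowed = user_role in ROLE_FOR_TARGET.get(to_state, set())
    return edge_exists and role_allowed
```

Keeping both tables as plain data means a new rule is a one-line diff that shows up clearly in code review.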
---
### 2. Document Storage
**Original:** AWS S3 / MinIO
- Separate service
- API calls for every operation
- Network latency
- Additional configuration
**Simplified:** Local Filesystem
```
/initiatives/
  /{initiative_id}/
    /forms/
    /reviews/
    /attachments/
```
**Savings:**
- Direct file access
- No API calls
- Simpler backup (copy folder)
- No network dependency
---
### 3. Search & Duplicate Detection
**Original:** Elasticsearch + ML Model (Sentence-BERT)
- Separate service
- Model training required
- Complex deployment
- Resource intensive
**Simplified:** PostgreSQL Full-Text + Trigram Similarity
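```sql
-- Enable extension
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Full-text index (serves to_tsvector / @@ queries, not similarity())
CREATE INDEX idx_initiative_description_gin
    ON initiatives USING gin (to_tsvector('english', description));

-- Trigram index so the % operator below can use an index
CREATE INDEX idx_initiative_description_trgm
    ON initiatives USING gin (description gin_trgm_ops);

-- Similarity search; % applies pg_trgm.similarity_threshold (default 0.3)
SELECT id, title,
       similarity(description, 'search text') AS score
FROM initiatives
WHERE description % 'search text'
ORDER BY score DESC;
```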
**Savings:**
- Built into PostgreSQL
- No model training
- No separate service
- Good enough for local scale
---
### 4. Event Bus
**Original:** Apache Kafka / RabbitMQ
- Separate service
- Complex configuration
- Message persistence
- Consumer groups
**Simplified:** PostgreSQL NOTIFY/LISTEN
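```python
import json

import asyncpg


# Publisher (assumes `conn` is an asyncpg connection, which uses $1 placeholders)
async def notify_event(conn, event_type, data):
    await conn.execute(
        "SELECT pg_notify('initiative_events', $1)",
        json.dumps({'type': event_type, 'data': data}),
    )


# Listener: asyncpg calls the callback with (connection, pid, channel, payload)
def handle_event(connection, pid, channel, payload):
    event = json.loads(payload)


async def listen_events():
    conn = await asyncpg.connect(...)
    await conn.add_listener('initiative_events', handle_event)
```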
**Savings:**
- No separate service
- Built into database
- Durable only if events are also written to a table (NOTIFY itself does not store messages)
- Simple pub/sub
---
### 5. Business Rules Engine
**Original:** Drools
- Java-based
- Separate rule files
- Complex syntax
- Additional dependency
**Simplified:** Python Functions/Classes
```python
class NoveltyChecker:
    def check(self, initiative):
        # Check similarity with existing initiatives
        similar = self.find_similar(initiative)
        if similar:
            return ValidationResult(valid=False, reason="Duplicate found")
        return ValidationResult(valid=True)


class ScoringEngine:
    def calculate(self, reviews):
        # Average only the scores that reviewers actually provided
        scores = [r.score for r in reviews if r.score is not None]
        if not scores:
            return None
        return sum(scores) / len(scores)
```
**Savings:**
- Native Python
- Easy to test
- Version controlled
- No external engine
---
## Resource Usage Comparison
### Original Stack
- PostgreSQL: ~200MB RAM
- FastAPI: ~100MB RAM
- Elasticsearch: ~1GB RAM
- Kafka: ~500MB RAM
- MinIO: ~200MB RAM
- **Total: ~2GB RAM minimum**
### Simplified Stack
- PostgreSQL: ~200MB RAM
- FastAPI: ~100MB RAM
- **Total: ~300MB RAM**
**Savings: 85% less memory**
---
## Deployment Complexity
### Original Stack
```
docker-compose.yml:
- postgres
- fastapi
- elasticsearch
- kafka
- minio
- zookeeper (for Kafka)
Total: 6+ containers
```
### Simplified Stack
```
docker-compose.yml:
- postgres
- fastapi
Total: 2 containers
```
**Savings: 67% fewer services**
---
## Maintenance Effort
| Task | Original | Simplified | Time Saved |
|------|----------|------------|------------|
| Setup | 2-3 days | 2-3 hours | 90% |
| Debugging | Complex (multiple services) | Simple (2 services) | 70% |
| Updates | Multiple services | 2 services | 80% |
| Monitoring | Multiple dashboards | Single dashboard | 75% |
---
## When to Upgrade
Upgrade to original stack only if:
1. **Scale:** >10,000 initiatives/year
2. **Users:** >100 concurrent users
3. **Performance:** Response time >2s
4. **Distribution:** Multi-server deployment
5. **Advanced ML:** Need sophisticated NLP
For a local application with a typical load (<5,000 initiatives/year), the simplified stack is optimal.
---
## Migration Path
If you need to scale later:
1. **Add Redis** for caching (if queries are slow)
2. **Add Elasticsearch** for advanced search (if PostgreSQL search is insufficient)
3. **Add RabbitMQ** for async processing (if you need background jobs)
4. **Move to S3** for documents (if you need cloud storage)
But start simple, scale when needed.
@@ -0,0 +1,336 @@
# Application files: persistence, retrieval by `applicationId`, and backup notes
This document describes how the running **initiative** stack stores and loads:
- Evidence attachments (minh chứng 2.1 / 2.2 / kỹ thuật)
- The **submitted full-package PDF** (đơn + báo cáo from the « Xem lại » flow)
- The **filled DOCX / official PDF** derived from the Word template
It focuses on what **PostgreSQL** and **MinIO** hold. The root file [`database/schema.sql`](../database/schema.sql) describes a separate **integer `applications`** domain (attachments table with `application_id` INT); that schema is **not** wired into `be0` today. Production behavior is driven by **`be0/migrations/*.sql`** and **`INITIATIVE_DATABASE_URL`**.
**Implementation planning:** The phased backup and storage-hardening plan below is **refined against** the review in [`feedback-data-management.md`](feedback-data-management.md) (canonical bytes, `storage_kind`, SHA verification on pack, streaming ZIP + manifest, indexed IDs, evidence versioning, and sequencing).
---
## Identifiers: what “applicationId” means
The UI and APIs expose a **public submission id** shaped like `sub-{16 hex chars}` (see `save_submitted_application` in `be0/src/initiative_db/submissions.py`). Internally, persistence is keyed by:
| Concept | Example | Where |
|--------|---------|--------|
| **Public `applicationId`** (list/detail) | `sub-abc123def4567890` | `drafts.payload.submissionRecord.id`, API responses |
| **Draft / case code** | `CASE-…` or `SUB-…` | `initiatives.case_code`, `draft_case_id` on API rows |
| **Initiative primary key** | UUID | `initiatives.id`, MinIO key prefix, `application_artifacts.initiative_id` |
**Resolving a row:** `get_application_by_id` (`be0/src/initiative_db/submissions.py`) scans submitted initiatives and matches when either:
- `_submission_display_id(initiative, submissionRecord) == applicationId`, or
- `initiative.case_code == applicationId`.
So admins can deep-link with **`sub-…`** or sometimes **`CASE-…`**. For backups, always persist **`initiatives.id`**, **`case_code`**, and **`sub-…`** together.
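
The matching rule can be illustrated with plain dictionaries. This is not the repo's `get_application_by_id`: the row shape, the precomputed `submission_id` field (standing in for `_submission_display_id`), and the lowercase-hex assumption in the pattern are all hypothetical.

```python
import re
from typing import Iterable, Optional

# Shape of the public id described above; lowercase hex is an assumption.
SUB_ID_PATTERN = re.compile(r"^sub-[0-9a-f]{16}$")


def resolve_application(application_id: str,
                        rows: Iterable[dict]) -> Optional[dict]:
    """Scan submitted rows; match on the public sub-id or on case_code."""
    for row in rows:
        if row.get("status") == "draft":
            continue  # only submitted initiatives are considered
        if application_id in (row.get("submission_id"), row.get("case_code")):
            return row
    return None


def backup_identity(row: dict) -> dict:
    """The three identifiers that should travel together in a backup."""
    return {
        "initiative_id": row["id"],
        "case_code": row["case_code"],
        "submission_id": row["submission_id"],
    }
```

A linear scan like this is fine for small data sets; an indexed column for the public id would turn the lookup into a single keyed query.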
---
## MinIO
Configured in Docker via `S3_*` env vars (`docker-compose.yml`):
| Bucket (env) | Purpose |
|----------------|---------|
| **`initiative-attachments`** (`S3_BUCKET_ATTACHMENTS`) | Evidence uploads for Đơn (research / textbook / technical) |
| **`initiative-exports`** (`S3_BUCKET_EXPORTS`) | Optional copy of the **submitted full PDF** after successful submit |
| **`initiative-quarantine`** (`S3_BUCKET_QUARANTINE`) | Reserved for quarantine flows (not detailed here) |
**Object key layout** (`be0/src/minio/storage.py`):
- Evidence and export artifacts use **`build_key_for_initiative`**:
`initiatives/{initiative_uuid_no_hyphens}/{yyyy}/{mm}/{uuid}-{safe_filename}`
The API uses the **internal endpoint** for the server (`S3_ENDPOINT_URL`, e.g. `http://minio:9000`) and **`S3_PUBLIC_ENDPOINT_URL`** for presigned URLs the browser can open (e.g. `http://localhost:19000`).
**Integrity:** uploads compute SHA-256 and store it in object metadata and/or Postgres (`application_artifacts.sha256`).
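
The object key layout described above can be reproduced with a few lines of Python. This is an illustrative sketch, not the contents of `be0/src/minio/storage.py`: the function name, the filename sanitisation rule, and the hyphenated form of the per-object UUID are assumptions.

```python
import re
import uuid
from datetime import datetime, timezone
from typing import Optional


def sketch_key_for_initiative(initiative_id: uuid.UUID, filename: str,
                              now: Optional[datetime] = None,
                              object_id: Optional[uuid.UUID] = None) -> str:
    """Build initiatives/{uuid_no_hyphens}/{yyyy}/{mm}/{uuid}-{safe_filename}."""
    now = now or datetime.now(timezone.utc)
    object_id = object_id or uuid.uuid4()
    base_name = filename.rsplit("/", 1)[-1]
    # Assumed rule: keep letters, digits, dot, dash and underscore only.
    safe_name = re.sub(r"[^A-Za-z0-9._-]+", "_", base_name) or "file"
    return (
        f"initiatives/{initiative_id.hex}/"  # .hex is the UUID without hyphens
        f"{now:%Y}/{now:%m}/{object_id}-{safe_name}"
    )
```

The random per-object UUID means two uploads with the same filename never overwrite each other, and the initiative prefix keeps all objects for one case listable together.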
---
## PostgreSQL (initiative database)
Core tables (`be0/migrations/001_initiative_schema.sql`, `002_application_storage_extensions.sql`, plus review-doc extensions):
### `initiatives`
- `id` (UUID), `case_code` (unique text), `owner_id`, `status`, `submitted_at`, etc.
- Submitted applications have `status != 'draft'` (e.g. `submitted`).
### `drafts`
- `payload` JSONB holds the live bundle: tab data, `submissionRecord`, `submissionFile`, etc.
After submit, important keys include:
- `payload.submissionRecord` — metadata including public `id` (`sub-…`)
- `payload.submissionFile` — e.g. `{ "url": "/submitted-initiatives/sub-….pdf", "type": "pdf" }`
### `application_artifacts`
One row per **`(initiative_id, role)`** (`002_application_storage_extensions.sql`). **Planned (Phase 1):** add roles for the **printable application form** binaries (e.g. **`official_form_docx`**, **`official_form_pdf`**) — distinct from **`full_pdf`** (the **client-uploaded** full hồ sơ PDF).
| `role` | Meaning |
|--------|---------|
| `full_pdf` | Submitted package PDF — **`storage_uri`** is either a **MinIO key** (under exports bucket) or a **relative URL** to static files |
| `research_evidence` | Minh chứng 2.1 (nghiên cứu) |
| `textbook_evidence` | Minh chứng 2.2 (giáo trình) |
| `technical_evidence` | Minh chứng kỹ thuật (nhóm 1) |
Columns: `storage_uri`, `original_name`, `mime_type`, `byte_size`, `sha256`, `uploaded_by`, `uploaded_at`, plus review fields for evidence.
### `application_submit_snapshots`
Append-only rows: merged tabs, submit metadata, and **`full_pdf_uri`** (today this records the **URL passed at submit time**, typically `/submitted-initiatives/...`, not necessarily the MinIO key).
Treat this table as **historical audit** of the submit request, not as the driver for backup byte locations: **`application_artifacts`** (and `storage_kind` once added) is the operational source of truth ([`feedback-data-management.md`](feedback-data-management.md) §8).
### `application_review_documents`
Versioned JSON used to regenerate the Word template output:
- `official_bieu_mau`, `template_data`, `full_bundle` (JSONB)
- Tied to **`initiative_id`** and `case_id`
**Today:** the binary filled DOCX is **not** stored in MinIO; this table is the only server-side input to regeneration. **Target (for a trustworthy admin backup):** treat this JSON as **supporting data** (re-render, analytics, diffing). The **canonical bytes** for “what the applicant signed off on” for the printable mẫu should be **immutable objects in MinIO** plus rows in `application_artifacts` (see [Implementation plan — Phase 1](#phase-1-canonical-bytes-for-printable-docx--pdf-before-backup-ships)).
### Other useful tables
- `draft_tab_snapshots` — history of tab JSON (`report` / `application` / `contribution`)
---
## Backend flows
### Evidence upload & download
- **POST** `/api/v1/application-drafts/{case_id}/evidence` — multipart upload; stores object in **`initiative-attachments`**; upserts `application_artifacts` with role `research_evidence` | `textbook_evidence` | `technical_evidence` (`be0/main.py`).
- **GET** `/api/v1/application-drafts/{case_id}/evidence` — returns metadata plus **presigned** `downloadUrl` / `viewUrl` for staff or owner.
`case_id` is normalized to the initiative's **`case_code`** (e.g. `CASE-…`).
### Submit full PDF
- **POST** `/api/applications/submit` — receives PDF + JSON `metadata` (`be0/main.py`).
- Always writes the file to **`SUBMITTED_INITIATIVES_DIR`** (default: repo `assets/submitted-initiatives` or `fe0/public/submitted-initiatives` in dev), served under **`/submitted-initiatives/{sub-….pdf}`**.
- If PostgreSQL is enabled: **`save_submitted_application`** updates `initiatives` / `drafts`, writes **`application_submit_snapshots`**, **`application_taxonomy`**, **`application_workflow`**, and **`upsert_artifact_full_pdf`**.
- **MinIO copy:** `_maybe_upload_submitted_pdf_to_exports_minio` uploads the same bytes to **`initiative-exports`** and, on success, sets **`application_artifacts.full_pdf.storage_uri`** to the **object key** (not the `/submitted-initiatives/...` URL). If MinIO fails, the artifact still points at the **filesystem URL** only — **this is slated to become a hard failure** once canonical storage is enforced ([Phase 2](#phase-2-canonical-storage-for-submitted-full-pdf)).
### Filled DOCX / official PDF (preview; persistence plan)
- **POST** `/api/v1/docx/preview-application-form` — renders `template_application_form.docx` with **docxtpl**; returns bytes (**no DB/MinIO write** today).
- **POST** `/api/v1/docx/preview-application-form-pdf` — same merge, then **LibreOffice** conversion to PDF; returns bytes.
The client builds `officialBieuMau` from draft state; **`persistReviewDocumentBundle`** (**POST** `/api/v1/review-documents`) saves the JSON bundle to **`application_review_documents`**.
**Preview endpoints remain useful** for staff “what-if” and for regenerating with newer templates. **They must not** be the only path that feeds the admin backup ZIP once Phase 1 is done — backups should stream **stored** printable DOCX/PDF bytes unless a legacy row has no stored object (then document explicit fallback or backfill).
### Admin detail: presigned full PDF
For **GET** `/api/applications/{application_id}`, when `full_pdf.storage_uri` looks like a **MinIO key** (not `/submitted-initiatives` or `http`), **`_enrich_application_detail_full_pdf_presign`** adds `files.fullText.viewUrl` (presigned GET on **`initiative-exports`**).
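
The "looks like a MinIO key" test is a string-prefix heuristic. A minimal sketch of that check follows; it is not the code of `_enrich_application_detail_full_pdf_presign`, and the function names and returned labels are hypothetical.

```python
def looks_like_minio_key(storage_uri: str) -> bool:
    """True when the value is neither a static /submitted-initiatives path
    nor an http(s) URL, so it is treated as an object key."""
    uri = (storage_uri or "").strip()
    if not uri:
        return False
    return not (uri.startswith("/submitted-initiatives") or uri.startswith("http"))


def classify_full_pdf_location(storage_uri: str) -> str:
    """Label the storage location; the label names are illustrative only."""
    if not (storage_uri or "").strip():
        return "missing"
    return "object_key" if looks_like_minio_key(storage_uri) else "static_url"
```

Guessing from prefixes works while only two shapes exist, but it is fragile; an explicit column recording where the bytes live removes the guesswork.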
---
## Frontend
| Concern | Location |
|---------|----------|
| Submit PDF | `fe0/src/components/applicant/submitInitiativePdf.ts` → **POST** `/api/applications/submit` with `FormData` + JWT; metadata includes **`initiativeCaseId`** (must match Postgres `case_code`). |
| Draft load/save | `fe0/src/components/applicant/applicationDrafts.ts` → **GET/POST** `/api/v1/application-drafts/...`. |
| DOCX/PDF from template | `fe0/src/lib/applicationFormDocxApi.ts` → preview endpoints; `ApplicationFormDocxPreview.tsx` orchestrates save + review bundle persistence. |
| Evidence UI | e.g. `ApplicationEvidenceManagePage.tsx` — uses **GET** `/api/v1/application-drafts/{caseId}/evidence` with presigned URLs. |
| Admin list/detail | Uses **GET** `/api/applications` (list) and **GET** `/api/applications/{application_id}` (detail); detail exposes `draft_case_id` for loading drafts/evidence. |
Important: **`sub-…`** is the list id; **draft/evidence APIs use `case_code` (`CASE-…`)**. The API surfaces `draft_case_id` on submission rows to bridge the two.
---
## Applicant honesty checkboxes, complete tabs & PDF minh chứng (engineering guide)
Goal: applicants cannot tick the **cam kết trung thực** checkboxes at the end of **Báo cáo**, **Đơn**, and **Xác nhận đóng góp** until the workflow rules below are satisfied; the UI shows a **Sonner** toast listing missing items. **PDF minh chứng** means the classification-specific evidence file for Đơn (research / textbook / technical), stored in **MinIO** via `POST /api/v1/application-drafts/{case_id}/evidence` (see [Evidence upload & download](#evidence-upload--download)).
### Intended behaviour (product)
| Control | When it may be ticked |
|--------|------------------------|
| **Báo cáo** (`InitiativeReportForm`) | All required fields on the report tab are non-empty (§1–§6 narrative + hiệu quả fields exposed in the UI). |
| **Đơn** (`InitiativeApplicationForm`) | All required Đơn fields are complete **and** the correct **PDF minh chứng** slot is filled for the chosen classification (local `File`, or `FileHandle` with `serverStorageKey` after MinIO upload). Sub-forms (bản cam kết / biểu xác nhận) must match the selected nhóm. |
| **Xác nhận đóng góp** (`ContributionConfirmationForm`) | Same checks as Đơn **and** Báo cáo, **and** the applicant has already ticked honesty on **Báo cáo** and **Đơn**. |
| **Xem lại — Gửi** (`ApplicationFormDocxPreview`) | Same as contribution gate **plus** `contribution.digitalSignatureConfirmed` in the persisted contribution JSON. |
Implementation reference:
- Shared validators + messages: `fe0/src/lib/applicantHonestyPrerequisites.ts` (`collectReportTabHonestyGaps`, `collectApplicationTabHonestyGaps`, `collectContributionDigitalSignaturePrerequisiteGaps`, `collectApplicantSubmitToAdminPrerequisiteGaps`, `formatApplicantPrerequisiteToastDescription`).
- Checkbox handlers toast with `toast.error(..., { description })` and **do not** flip state when prerequisites fail.
Staff / council flows without `DraftProvider` skip the contribution-tab signature gate (no full draft in context); fields stay **`readOnly`** as today.
### Frontend (detailed)
1. **Single source of truth for messages** — Keep gap strings in `applicantHonestyPrerequisites.ts` so DOCX preview and forms stay aligned.
2. **Evidence PDF** — Treat as present if `applicantEvidencePdfPresent(file)` is true: `File` with non-zero size, or `FileHandle` with `serverStorageKey` (MinIO) or positive `size` (IndexedDB). Matches hydration in `DraftContext` after `getApplicationEvidence(caseId)`.
3. **Contribution tab** — Uses `draft.report` and `draft.application` from `DraftContext`; authors/% totals are validated on Đơn; contribution UI mirrors `authors` when connected to Postgres drafts.
4. **Review submit** — Besides tab JSON, enforce contribution signature flag on the object passed into `ApplicationFormDocxPreview` (from `draftTabs.contribution`).
### Backend (recommended)
Today, gates are **client-side** only. For integrity:
- **`POST /api/applications/submit`** — Implemented in `be0/src/initiative_db/submission_readiness.py`, invoked from `save_submitted_application` **before** the initiative is marked submitted. Loads merged `drafts.payload.tabs` (with snapshot fallback), reads **`application_artifacts`** for `research_evidence` / `textbook_evidence` / `technical_evidence` (non-empty `storage_uri`), and validates tab JSON + honesty flags to match the applicant UI. On failure: **400** with `detail: { "message": "…", "missing": ["…", …] }` (see `ApplicationSubmissionNotReadyError` handling in `be0/main.py`). The client maps this in `fe0/src/components/applicant/submitInitiativePdf.ts`. Partial PDF written on disk is removed when Postgres validation fails.
- **`POST /api/v1/application-drafts/{case_id}/evidence`** — Already the canonical upload path; reject non-PDF or oversize files (existing behaviour).
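
The shape of a server-side readiness check can be sketched as below. This is not the code in `submission_readiness.py`: it covers only the three honesty flags and one required evidence role, and the function names, message text, and input shapes are assumptions.

```python
from typing import Optional

# (tab, flag) pairs named in this document.
REQUIRED_FLAGS = [
    ("report", "honestyConfirmed"),
    ("application", "honestyConfirmed"),
    ("contribution", "digitalSignatureConfirmed"),
]


def collect_missing(tabs: dict, artifact_uris: dict,
                    required_role: str) -> list[str]:
    """List unmet prerequisites; an empty list means ready to submit."""
    missing = []
    for tab, flag in REQUIRED_FLAGS:
        if not (tabs.get(tab) or {}).get(flag):
            missing.append(f"{tab}.{flag}")
    # The evidence artifact must have a non-empty storage_uri.
    if not (artifact_uris.get(required_role) or "").strip():
        missing.append(required_role)
    return missing


def not_ready_response(missing: list[str]) -> Optional[tuple[int, dict]]:
    """Return (status, body) in the documented 400 shape, or None if ready."""
    if not missing:
        return None
    body = {"detail": {"message": "Application is not ready to submit",
                       "missing": missing}}
    return 400, body
```

Returning every gap at once, rather than the first one, lets the client show the same complete list the toast shows on the applicant side.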
### PostgreSQL
- Tab JSON lives under **`drafts.payload`** (and/or tab snapshots). Honesty flags are plain booleans: `report.honestyConfirmed`, `application.honestyConfirmed`, `contribution.digitalSignatureConfirmed`. No migration is required for gating unless you add a **server-side** “submission readiness” snapshot column.
### MinIO
- Required PDF for Đơn is stored under **`initiative-attachments`** with keys from `build_key_for_initiative`; metadata is reflected in **`application_artifacts`** (`research_evidence` | `textbook_evidence` | `technical_evidence`). Frontend readiness should agree with **either** the draft file handle (`serverStorageKey`) **or** a fresh **`GET .../evidence`** bundle (see `collectDocxTemplateCompletenessGaps` in admin review for a related pattern).
---
## Retrieving everything for one submission (interim checklist)
Until Phases 1–2 are done, a reader resolving **`applicationId`** (`sub-…`) should:
1. **Postgres:** Resolve `initiatives` + latest `drafts` (today: `get_application_by_id` scan; target: indexed `submission_public_id` — [Phase 4](#phase-4-identifiers--schema-hygiene)).
2. **Submitted full-package PDF (`full_pdf` artifact):** Read `application_artifacts` with `role = 'full_pdf'`. Dispatch on **`storage_kind`** once added; until then, avoid relying only on string-prefix heuristics for production backups.
3. **Evidence:** Roles `research_evidence`, `textbook_evidence`, `technical_evidence` → keys in **`initiative-attachments`**.
4. **Printable mẫu DOCX/PDF:** After Phase 1, stream from MinIO using new artifact roles; until then see **legacy** note in Phase 3.
Optional ZIP extras: latest `application_review_documents` JSON, `draft_tab_snapshots`, read-only copies of `application_submit_snapshots` for audit.
**Related rationale and risks** (regeneration vs backup, polymorphic `storage_uri`, integrity): [`feedback-data-management.md`](feedback-data-management.md).
---
## Implementation plan: admin backup (database + document management)
Goal: **admin downloads one ZIP** containing **all evidence attachments**, the **submitted full-package PDF**, and the **printable application DOCX + PDF** (mẫu), with **verifiable integrity** and **no reliance on regenerating** printable documents at download time (after prerequisites).
Phasing follows the sequencing in [`feedback-data-management.md`](feedback-data-management.md) §“Suggested order of work”, expanded into concrete schema and API work.
### Phase 0 — Decisions & prerequisites
| Item | Action |
|------|--------|
| **Canonical bytes for printable mẫu** | Store immutable DOCX + PDF in MinIO at submit (or immediately pre-submit in the same transaction as finalize), not only JSON. |
| **Evidence versioning** | Decide: append-only evidence history vs “latest only”. For approvals, prefer **versioned or append-only** so backup matches what was reviewed ([`feedback-data-management.md`](feedback-data-management.md) §7). |
| **Quarantine bucket** | Define behavior if objects exist in **`initiative-quarantine`**: include/exclude/fail backup ([`feedback-data-management.md`](feedback-data-management.md) §11). |
| **MinIO operations** | Document versioning, lifecycle, retention, DR (suggested spin-off: `MINIO_OPERATIONS.md` per feedback §9). |
| **Dead schema** | Move or clearly label [`database/schema.sql`](../database/schema.sql) so tooling does not confuse INT `application_id` with `sub-…` ([`feedback-data-management.md`](feedback-data-management.md) §6). |
### Phase 1 — Canonical bytes for printable DOCX + PDF (before backup ships)
**Problem:** Regenerating DOCX/PDF at backup time uses **current** template, docxtpl, LibreOffice, and fonts — not provably what the applicant saw ([`feedback-data-management.md`](feedback-data-management.md) §1).
**Database**
- Extend `application_artifacts.role` CHECK (new migration) with two roles, e.g. **`official_form_docx`** and **`official_form_pdf`** (names TBD; must be distinct from **`full_pdf`**, which is the **client-uploaded full hồ sơ** PDF).
- On successful submit (or single “finalize” step server-side): compute SHA-256 for each file; **`INSERT`/upsert** rows with `storage_uri` = MinIO key, `sha256`, `byte_size`, `mime_type`, `original_name`, **`storage_kind = 'minio_exports'`** (once column exists).
**Application logic**
- Server: build `officialBieuMau` from the same snapshot used for submission (bundle already available in draft + review document path), call existing **`fill_application_form_docx`** → bytes; call **`convert_docx_bytes_to_pdf`** → bytes; upload both to **`initiative-exports`** using `build_key_for_initiative`.
- **Do not** put LibreOffice on the admin **download** path after this; optional background **verify-only** job may re-read objects.
**JSON**
- Keep saving **`application_review_documents`** for re-render/diff; it is **not** the sole legal snapshot of the printable files once binaries exist.
**Gate:** Do **not** release the admin backup endpoint that promises “printable DOCX/PDF” until this phase is done for **new** submits; for **legacy** rows without these artifacts, define policy (backfill job vs manifest flag `missing_official_form: true`).
### Phase 2 — Canonical storage for submitted full-package PDF
**Problem:** `full_pdf` may point at filesystem-only, MinIO-only, or both; best-effort upload risks silent loss ([`feedback-data-management.md`](feedback-data-management.md) §2).
**Database**
- Add **`storage_kind`** on **`application_artifacts`** (enum/text): e.g. `minio_exports`, `minio_attachments`, `filesystem`, `external_url`. Backfill from existing `storage_uri` shape; default new rows explicitly.
- Optionally add **`content_sha256_verified_at`** or rely on manifest at backup time only.
**Application logic**
- Make **MinIO upload of `full_pdf` synchronous and required** when persistence is enabled: if upload fails, **fail submit** with retryable error.
- Treat filesystem write as **cache** for dev/static serving if desired, not sole store.
- **Backfill job:** filesystem-only historical PDFs → **`initiative-exports`**, then update artifact row + `storage_kind`.
**Infrastructure**
- Ensure **`SUBMITTED_INITIATIVES_DIR`** is on a **persistent volume** in every environment, or stop relying on it for production.
### Phase 3 — Admin backup endpoint + ZIP contract
**Authorization:** admin-only; **audit** every request: actor, `applicationId`, timestamp, outcome, bytes streamed ([`feedback-data-management.md`](feedback-data-management.md) §10).
**Resolution:** load initiative by **`submission_public_id`** or **`case_code`** (indexed) after Phase 4; until then use existing lookup with awareness of scan cost for **bulk** exports.
**Integrity**
- While streaming each file into the ZIP, **compute SHA-256** and **compare** to `application_artifacts.sha256`. On mismatch: **fail entire export**, log at high severity ([`feedback-data-management.md`](feedback-data-management.md) §4).
- Optional **`POST /admin/…/backup/verify`** (verify-only, no ZIP) for periodic audits.
**ZIP layout** (suggested; ASCII-safe entry names, original names in manifest):
```text
manifest.json
submitted/full-package.pdf
submitted/official-form.docx
submitted/official-form.pdf
evidence/research/{safe-name-or-id}
evidence/textbook/…
evidence/technical/…
metadata/application_review_documents.json # optional
```
**`manifest.json`** (minimum fields): `applicationId`, `case_code`, `initiative_id`, submitted timestamps, owner id, **list of files** with `role`, `original_name`, `mime_type`, `byte_size`, **stored** `sha256`, **verified** `sha256` (computed during ZIP build), `storage_kind`.
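
As a sketch of how a consumer might re-check a downloaded manifest, the types and helper below compare the stored and verified digests per file. The JSON key names `stored_sha256` and `verified_sha256`, the optional `submitted_at` / `owner_id` keys, and the helper `findIntegrityFailures` are assumptions for illustration; the plan above names the fields but does not fix their spelling.

```typescript
interface ManifestFile {
  role: string;
  original_name: string;
  mime_type: string;
  byte_size: number;
  stored_sha256: string;   // assumed key: digest recorded in application_artifacts
  verified_sha256: string; // assumed key: digest computed while building the ZIP
  storage_kind: string;
}

interface BackupManifest {
  applicationId: string;
  case_code: string;
  initiative_id: string | number;
  submitted_at?: string; // assumed key for the submitted timestamp
  owner_id?: string;     // assumed key for the owner id
  files: ManifestFile[];
}

// Returns the roles whose stored and verified digests disagree.
function findIntegrityFailures(manifest: BackupManifest): string[] {
  return manifest.files
    .filter((f) => f.stored_sha256.toLowerCase() !== f.verified_sha256.toLowerCase())
    .map((f) => f.role);
}
```

Because the export is meant to fail on any mismatch, a non-empty result on a delivered ZIP would indicate tampering or a packaging bug rather than a normal condition.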
**Transport**
- **Stream ZIP** with a streaming library (e.g. `zipstream-ng`); **do not** buffer whole archives in memory.
- Single-initiative: synchronous response acceptable.
- **Bulk** (date range, many rows): **async job** → write ZIP to **`initiative-exports`** or **`initiative-backups`** → presigned URL when ready (avoids proxy timeouts).
**Sources for each ZIP entry**
| Content | Source |
|--------|--------|
| Full hồ sơ PDF | `application_artifacts.full_pdf` → MinIO **`initiative-exports`** (after Phase 2) |
| Printable DOCX / PDF | `official_form_docx` / `official_form_pdf` → **`initiative-exports`** |
| Evidence | `research_*`, `textbook_*`, `technical_*` → **`initiative-attachments`** |
| Structured snapshot | Optional: latest `application_review_documents` JSON |
**Legacy:** If `official_form_*` missing, either skip with manifest flags or run **one-time backfill** using frozen template policy — **document** that backfilled bytes are “as-of backfill date” not original submit date.
### Phase 4 — Identifiers & schema hygiene
- Add **`submission_public_id`** (unique, indexed) on **`initiatives`**, set once at submit; replace linear scan in `get_application_by_id` with indexed lookup ([`feedback-data-management.md`](feedback-data-management.md) §5).
- Document resolution: **`sub-…`** vs **`CASE-…`** explicitly (remove “sometimes” from ops docs).
### Phase 5 — Hardening (ongoing)
- MinIO **versioning** / **object lock** if compliance requires; off-cluster backup of MinIO; periodic verify-only sweeps ([`feedback-data-management.md`](feedback-data-management.md) §9, §10, quarter roadmap).
---
### Frontend (admin)
- New **“Tải bản sao lưu”** ("Download backup", or similar) button on application detail: call the backup endpoint and handle long downloads (show progress if async + poll).
- For async pattern: show job id, link when presigned URL ready.
- Ensure **admin audit** expectations match backend logging.
---
### Summary
| Layer | Current summary | After plan |
|--------|-----------------|------------|
| **Postgres** | Artifacts + polymorphic `storage_uri` | Explicit `storage_kind`, optional `submission_public_id`, new artifact roles for official DOCX/PDF |
| **MinIO** | Evidence + best-effort full PDF | Required `full_pdf` + official form binaries on **`initiative-exports`**; evidence on **`initiative-attachments`** |
| **Admin backup** | Would require regeneration / fragile dispatch | Streaming ZIP + manifest + verified SHA + audit; optional async for bulk |
This aligns the **database and document management system** with a backup that **admins can trust**: **stored bytes**, **verified at pack time**, and **operationally grounded** in explicit storage metadata.
@@ -0,0 +1,209 @@
# Architecture improvement proposals
This document compares **this repository** (Initiative / “Profyt” style stack: `fe0` + `be0` + PostgreSQL + MinIO) against the **layered patterns and RBAC/DB practices** described in the reference write-up (Cookiecutter-style Full Stack template: *Route → Service → Repository → Database*).
It proposes concrete improvements for **frontend**, **backend**, and **database**—not to copy the template verbatim, but to **reduce risk and operational cost** as the product grows.
---
## 1. How your app is structured today (summary)
| Area | What you have | Reference template pattern |
|------|----------------|----------------------------|
| **Backend** | A large `be0/main.py` FastAPI app with many endpoints, workflow glue, and file/YAML fallbacks in one module | Small route modules, thin handlers, `services/` + `repositories/`, `schemas/`, `deps.py` |
| **Domain DB** | `be0/src/initiative_db/` (async engine, `get_session` with commit/rollback, SQLAlchemy models) | Same async SQLAlchemy idea; stronger separation of data access from HTTP |
| **Auth** | `be0/src/auth_api.py` — Argon2, JWT (HS256), roles embedded in `user_roles` + JWT payload, `decode_access_token_user_id` per route | Reusable `get_current_user`, `RoleChecker`, optional claim checks (e.g. access vs refresh) |
| **Frontend** | `fe0` React + Vite, centralized `apiClient`, `AuthContext`, permission map in `fe0/src/lib/permissions.ts` | UI gates only; real enforcement on server (you partially do this) |
| **Storage** | MinIO via `be0/src/minio/storage.py`, evidence endpoints in `main.py`, `application_artifacts` in PostgreSQL | Similar object-store + DB metadata split |
| **Migrations** | SQL files under `be0/migrations/` applied via Docker `initdb.d` | Alembic revisions + env imports for all models |
**Bottom line:** You already have strong **pieces** (async sessions, MinIO abstraction, initiative schema, evidence pipeline). The main gap is **structural**—keeping business rules, HTTP, and storage concerns **separated and testable**, and making **role and ownership rules** easy to review in one place.
---
## 2. Database improvements
### 2.1 Single source of truth for schema evolution
**Today:** `be0/migrations/*.sql` are mounted into Postgres at container init, while SQLAlchemy models in `be0/src/initiative_db/models.py` are the app's view of the world. The two can **drift** (hand-written SQL and models falling out of sync).
**Proposals:**
- Introduce **Alembic** (or another migration tool) for `be0`, targeting the same `DATABASE_URL` / `INITIATIVE_DATABASE_URL`, with:
- `alembic/env.py` importing **all** model modules so autogenerate sees every table.
- A one-time **baseline** migration that matches the current SQL, then all future changes via revision files only.
- Keep Docker `initdb.d` for **local bootstrap** or replace it with `alembic upgrade head` in an entrypoint—avoid maintaining two parallel migration paths in production.
### 2.2 Naming convention and metadata hygiene
**Reference lesson:** A `Base.metadata` **naming convention** makes autogenerated diffs stable across environments.
**Proposal:** Add a SQLAlchemy `MetaData(naming_convention=…)` on your `Base` in `be0/src/initiative_db/models.py` *before* the next non-trivial migration, so new constraints don't get database-default names that differ by environment.
### 2.3 Indexes and hot paths (review pass)
**Already good:** `initiatives.case_code` is unique; ownership is by `owner_id`. Ensure **all foreign keys** used in list/detail filters have indexes: PostgreSQL indexes the referenced primary or unique key, but it does not create an index on the referencing FK column automatically, so verify each one in the migrations.
**Proposal:** One focused pass over queries in `drafts`, `submissions`, and `application_artifacts` (evidence) — add composite or partial indexes only where proven by slow query logs (avoid over-indexing).
### 2.4 Align “roles” in DB vs product language
**Observation:** `001_initiative_schema.sql` defines `user_role` enum including `applicant`, `council_member`, etc. The **frontend** `fe0/src/lib/permissions.ts` uses `admin` | `editor` | `viewer` with business descriptions (“Người nộp đơn” tied to *viewer*). This **semantic mapping** is easy to get wrong in new endpoints.
**Proposals:**
- Add a **short internal doc** (table): DB role(s) → JWT/claims → UI permission set.
- In API design, **avoid duplicating** role names in ad hoc strings; prefer **one** enumeration shared between auth issuance and checks (or a small `const` module on the server used by all routers).
### 2.5 Transaction boundaries (you're on the right track)
`get_session` commits on success and rolls back on exception—aligned with the reference pattern. **Keep** service-level logic from calling `commit()` except where a deliberate long-lived transaction is required.
**Proposal:** As you split code out of `main.py`, ensure **one unit of work per request** unless you document exceptions (e.g. multi-step background handoff).
---
## 3. Backend improvements
### 3.1 Move from monolithic `main.py` to layered packages
**Gap:** A large `main.py` mixing workflow state, file exports, application drafts, MinIO, and compliance tools makes it hard to:
- test ownership and RBAC in isolation;
- reason about what runs on every deploy;
- onboard new contributors.
**Target shape (incremental, not big-bang):**
```text
be0/src/
api/ # or routers/
deps.py # get_current_user, require_role, get_initiative_scoped, etc.
routes/
auth.py
application_drafts.py
applications.py
evidence.py
admin.py
services/ # business rules: "can this user add evidence to this case?"
repositories/ # SQL only: initiative by case_code, artifact by role, …
schemas/ # Pydantic request/response DTOs
initiative_db/ # keep models + engine; repositories call into models
```
**Rule (from the reference):** **Routes never execute raw SQL**; they call **services** that call **repositories**. This matches your evidence flow conceptually (you're halfway there with `application_storage`).
### 3.2 RBAC and IDOR: centralize “who can touch this row?”
**Reference patterns worth adopting:**
- **`RoleChecker`-style** dependencies (even if not a class) so route signatures read: “requires admin” vs “any authenticated user” without re-reading 30 lines of checks.
- **Ownership (IDOR):** For every `case_id` / `initiative_id` / `application_id`, a single function e.g. `assert_initiative_access(user_id, case_code, need_write=True)` that:
- returns **404** for cross-tenant access when appropriate (no existence leak);
- is reused by evidence upload, draft save, and submit.
**Today** you check `owner_id` in several places; factor that into **one** service method and reuse.
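
The decision logic of such a helper is small enough to sketch. The backend is Python, so the TypeScript below only illustrates the rule (it is written in the same language as the other examples in this commit's docs); `InitiativeRow`, `AccessResult`, and `checkInitiativeAccess` are hypothetical names, and the owner-only policy without a `need_write` distinction is an assumption.

```typescript
interface InitiativeRow {
  caseCode: string;
  ownerId: string;
}

type AccessResult =
  | { ok: true; initiative: InitiativeRow }
  | { ok: false; status: 404 };

// A missing row and a row owned by someone else are indistinguishable to the
// caller: both yield 404, so the response never confirms that a case exists.
function checkInitiativeAccess(
  userId: string,
  initiative: InitiativeRow | undefined,
): AccessResult {
  if (!initiative || initiative.ownerId !== userId) {
    return { ok: false, status: 404 };
  }
  return { ok: true, initiative };
}
```

Evidence upload, draft save, and submit would all call the same function, so a change in policy (for example, admin read access) happens in one place.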
### 3.3 JWT and token hygiene
**Reference:** Distinguish **access** vs **refresh** token **claims** (e.g. `type`) so a refresh token cannot be used as a Bearer on API routes.
**Proposal:** If you only issue access tokens today, document that. If you add refresh, enforce `type == "access"` in the dependency that guards APIs.
**Already good:** `JWT_SECRET` length check in dev/prod, Argon2 for passwords.
### 3.4 Reduce dual persistence in production
**Today:** Some flows can fall back to **YAML on disk** or in-memory stores when Postgres is off. That's great for demos; in production it **doubles** the test matrix.
**Proposals:**
- **Feature flag** or environment switch: `INITIATIVES_PERSISTENCE=postgres` only in prod; fail fast if misconfigured.
- **Deprecate** or strictly scope file-based application draft storage to “local dev without Docker”.
### 3.5 MinIO and evidence
**Already good:** `S3Storage`, size limits, presigned download URLs, `application_artifacts` for metadata.
**Proposals:**
- Centralize **all** MinIO key conventions and `role` names (`research_evidence`, `textbook_evidence`, `technical_evidence`, `full_pdf`) in **one** server module to avoid string drift.
- Add a **periodic job** (cron/worker) that removes orphaned objects left behind when uploads fail midway. Your `cleanup` module in `minio` is a start; ensure it's scheduled in deployed environments.
### 3.6 Errors and API contract
**Reference:** Map domain errors to HTTP status in one `exception_handlers` module.
**Proposal:** Use consistent JSON error bodies (`{ "detail": "...", "code": "EVIDENCE_TOO_LARGE" }`) so the **frontend** can branch without parsing English/Vietnamese strings.
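
On the frontend, a stable `code` lets the UI choose its own copy and fall back to `detail` for codes it does not know. A minimal sketch, assuming the body shape from the proposal; `userMessageFor`, the `KNOWN_MESSAGES` table, and the message text are hypothetical.

```typescript
interface ApiErrorBody {
  detail: string;
  code?: string;
}

// Hypothetical lookup table; only EVIDENCE_TOO_LARGE appears in the proposal.
const KNOWN_MESSAGES = new Map<string, string>([
  ["EVIDENCE_TOO_LARGE", "The evidence PDF exceeds the upload size limit."],
]);

// Prefer the UI's own message for a known code; otherwise show the server detail.
function userMessageFor(body: ApiErrorBody): string {
  const known = body.code !== undefined ? KNOWN_MESSAGES.get(body.code) : undefined;
  return known ?? body.detail;
}
```

A `Map` is used rather than a plain object so that codes such as `constructor` cannot accidentally resolve to inherited properties.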
---
## 4. Frontend improvements
### 4.1 Treat server as the only authority
**Today:** `fe0/src/lib/permissions.ts` encodes a rich matrix (`ROLE_PERMISSIONS`). That's good UX, but it **must** mirror server checks. Any mismatch is a **security** issue (the UI hides a button while the API is still open, or the reverse).
**Proposals:**
- For sensitive actions, **prefer** “call API → 403/404” over purely client `hasPermission` (e.g. hide buttons but always handle 403 on mutation).
- Optionally expose **`GET /api/v1/auth/me/permissions`**, or embed resolved permissions in JWT claims **if** you can keep them small and non-stale. Otherwise, document that **fe** is best-effort and **be** is truth.
### 4.2 Align “applicant” / “viewer” vocabulary
The UI comments describe **applicant** behavior while the `Role` type is **`viewer`**. New contributors will be confused.
**Proposals:**
- Rename in UI copy only (`getRoleDisplayName`) or add alias `type ApplicantRole = 'viewer'` with a comment, and keep **one** canonical role in code.
- Reconcile with **SQL** enum names in documentation (see §2.4).
### 4.3 Generated types from OpenAPI
**Gap:** Hand-maintained DTOs and ad hoc `as unknown as` casts in draft save/load are brittle.
**Proposal:** Export OpenAPI from FastAPI (`/openapi.json`) and generate TypeScript types (or a thin client) in CI. At minimum, generate **enums and response shapes** for `application-drafts`, `applications`, and `evidence`.
### 4.4 Draft and file state machine
You combine **local IndexedDB**, **server JSON tabs**, and **MinIO evidence**, so users need clear **feedback** (you've started with toasts and size limits).
**Proposals:**
- A visible **per-field status** for evidence: `local only` / `synced` / `error` (small badge next to the filename).
- **Idempotent** retry: if upload fails, one-click “retry sync” without re-selecting the file (blob still in IDB).
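
The three badge states and the retry action can be modeled as a tiny transition function. This is a sketch under assumptions: the event names and `nextSyncStatus` are invented for illustration, and "retry" is taken to mean returning to `local only` so the normal upload path runs again with the blob already in IndexedDB.

```typescript
type SyncStatus = "local only" | "synced" | "error";

type SyncEvent =
  | { type: "upload_succeeded" }
  | { type: "upload_failed" }
  | { type: "retry_requested" }
  | { type: "file_replaced" };

// Pure transition function: given the current badge and an event, return the next badge.
function nextSyncStatus(current: SyncStatus, event: SyncEvent): SyncStatus {
  switch (event.type) {
    case "upload_succeeded":
      return "synced";
    case "upload_failed":
      return "error";
    case "retry_requested":
      // Retrying only makes sense after a failure; otherwise it is a no-op.
      return current === "error" ? "local only" : current;
    case "file_replaced":
      // A newly selected file has not reached the server yet.
      return "local only";
  }
}
```

Keeping the transitions pure makes the retry idempotent: pressing "retry sync" twice while already `local only` or `synced` changes nothing.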
### 4.5 Performance and DX
- **Code-split** very large forms (`InitiativeApplicationForm`, `InitiativeReportForm`) by section for faster first paint.
- **React Query** (you use it in places) for server-backed lists with stable keys; align cache invalidation on submit/delete application.
---
## 5. Suggested prioritization (practical order)
1. **DB:** Alembic + stop dual SQL/model drift; baseline migration.
2. **Backend:** Extract `deps.py` + initiative ownership helper; use it in evidence + draft + submit routes.
3. **Backend:** Split `main.py` into `routes/` by domain (even if services are thin at first).
4. **API:** Standard error JSON + OpenAPI export + generated TS types.
5. **Frontend:** Permission doc + 403 handling + evidence sync UX (status badge / retry).
6. **Ops:** One persistence mode in prod; optional cleanup job for MinIO/DB consistency.
---
## 6. Mapping to the reference “lessons” list
| Reference lesson | Your project |
|------------------|--------------|
| Four-layer separation | **Partial** — move HTTP out of `main.py` and add services. |
| `has_role` on user model / single RBAC place | **Partial** — centralize in `deps` + one service. |
| IDOR: always scope by `user_id` in queries | **Good direction** — consolidate checks. |
| 404 on cross-owner access | **Adopt** wherever you still return 403, which leaks that the row exists. |
| DB timestamps / naming / Alembic imports | **Adopt** naming + Alembic hygiene. |
| One transaction per request | **Yes** in `get_session`. |
| NullPool for workers | **N/A** until you have forked workers; then apply. |
| Don't create admin over HTTP | **OK** if admin bootstrap is CLI or SQL; keep that. |
---
*Generated from a comparison of this codebase with the structure and ideas in the supplied `architecture-analysis.md` (vstorm / full-stack template style). Adjust priorities to your release timeline and team size.*
@@ -0,0 +1,500 @@
# Medical-Imaging 3D Viewer — Reconstruction Spec
A self-contained spec for the **VTK.js quad-view DICOM/NIfTI viewer** + the **organ-mask
overlay** system, written so it can be rebuilt from scratch in another React app. It
captures the architecture, the public API, every VTK module + magic number, the
interaction model, and the hard-won gotchas (each marked **⚠️**).
> Source of truth: `shared/src/components/viewer/` in this repo. This doc is a snapshot
> (2026-06-20) — if it disagrees with the code, the code wins.
---
## 1. What it is
A 4-pane ("quad view") medical image viewer rendered with **VTK.js into a single WebGL
canvas**:
```
┌─────────────┬─────────────┐
│ AXIAL │ CORONAL │ 3 panes = orthogonal 2D slices (reslice of the volume)
│ (2D slice) │ (2D slice) │ 1 pane = 3D volume rendering
├─────────────┼─────────────┤
│ SAGITTAL │ 3D │ + per-organ mask overlays (3D surface + 2D fills)
│ (2D slice) │ (volume) │ + a 5-tool annotation layer on the 2D panes
└─────────────┴─────────────┘
```
It loads a single `.nii`/`.nii.gz` (NIfTI) or a set of `.dcm` (DICOM) files, shows three
orthogonal slices + a 3D volume, supports window/level + opacity, slice-scroll, zoom,
double-click-to-expand, client-side annotations, and colored organ-segmentation overlays.
---
## 2. Tech stack & exact dependencies
| Package | Version | Role |
|---|---|---|
| `@kitware/vtk.js` | `^34.16.2` (built/ran on 34.18) | All rendering. ~960 KB — code-split it (see §4). |
| `nifti-reader-js` | `^0.8.0` | NIfTI header/image parse + gzip decompress. |
| `react` / `react-dom` | `^18.3.1` | Component shell + hooks. **React 18 auto-batches** (matters for the overlay toggle). |
| `@radix-ui/react-dropdown-menu` | `^2.1.15` | Only used by `ViewRotationControls` (optional). |
| `lucide-react` | `^0.462.0` | Icons (host UI only — not core). |
DICOM parsing additionally uses a DICOM lib inside `useDicomData` (e.g. `dicom-parser` /
`cornerstone`-style decode) — out of scope here; the NIfTI path is the reference.
**VTK.js profile imports (MUST be imported once, before any vtk instance):**
```ts
import "@kitware/vtk.js/Rendering/Profiles/Volume"; // volume rendering
import "@kitware/vtk.js/Rendering/Profiles/Geometry"; // surfaces (organ overlays), slices
import "@kitware/vtk.js/Rendering/Misc/RenderingAPIs"; // OpenGL backend
```
---
## 3. File inventory (`shared/src/components/viewer/`)
| File | LOC | Responsibility |
|---|---|---|
| `index.ts` | 14 | The `@ump/shared/viewer` barrel — the **only** public entry. |
| `types.ts` | 92 | `NiftiData`, `OrganMaskData`, `Annotation*` contracts. |
| `niftiLoader.ts` | 121 | `parseNiftiBuffer()` + non-hook `loadNiftiImageData()`. |
| `useNiftiData.ts` | 48 | React hook wrapping `parseNiftiBuffer` for the main image. |
| `useDicomData.ts` | 219 | DICOM equivalent of `useNiftiData`. |
| `UnifiedQuadViewRenderer.tsx` | 189 | Format dispatch (NIfTI vs DICOM) + prop pass-through. |
| `NiftiQuadViewRenderer.tsx` | 1249 | **The core** — quad view, slices, volume, interaction, overlays. |
| `QuadViewRenderer.tsx` | 820 | DICOM equivalent (same structure, DICOM input). |
| `AnnotationOverlay.tsx` | 254 | Per-pane 2D drawing surface (5 tools). |
| `ViewRotationControls.tsx` | 128 | Optional 3D-orientation dropdown. |
The host (not in `shared`): a full-screen dialog that mounts `UnifiedQuadViewRenderer`,
owns the window/level/opacity sliders, the annotation toolbar, and the organ panel.
---
## 4. Packaging / bundle strategy ⚠️
VTK is heavy (~960 KB). Expose the viewer as a **lazy subpath**, never from the main
barrel, so VTK lands in its own async chunk and never bloats a page's initial bundle.
`shared/package.json`:
```json
{ "name": "@ump/shared",
"exports": { ".": "./src/index.ts", "./viewer": "./src/components/viewer/index.ts" } }
```
`viewer/index.ts` (the entire public API):
```ts
export { UnifiedQuadViewRenderer } from './UnifiedQuadViewRenderer';
export type { UnifiedQuadViewRendererProps, FileFormat } from './UnifiedQuadViewRenderer';
export type { Annotation, AnnotationTool, AnnotationPoint } from './types';
export { loadNiftiImageData, parseNiftiBuffer } from './niftiLoader';
export type { OrganMaskData, OrganName } from './types';
```
**Consume it via `React.lazy`:**
```ts
const ViewerDialog = lazy(() => import('./DatasetFileViewerDialog')); // statically imports @ump/shared/viewer
```
**⚠️ Vite/TS alias ORDER:** when `@ump/shared` is aliased to source, the subpath
`@ump/shared/viewer` needs its **own** alias entry listed **before** the bare one (array
form, so prefix-matching picks the longer key first):
```ts
// vite.config.ts
resolve: { alias: [
{ find: '@ump/shared/viewer', replacement: path.resolve(__dirname, '../shared/src/components/viewer/index.ts') },
{ find: '@ump/shared', replacement: path.resolve(__dirname, '../shared/src/index.ts') },
]}
```
```jsonc
// tsconfig.json
"paths": {
"@ump/shared/viewer": ["../shared/src/components/viewer/index.ts"],
"@ump/shared": ["../shared/src/index.ts"]
}
```
**⚠️ Keep the NIfTI loader (`loadNiftiImageData`) and `OrganMaskData` in the `/viewer`
subpath** — importing them from the main barrel would pull VTK into the page's initial bundle.
---
## 5. Data model (`types.ts`, verbatim)
```ts
export interface NiftiData {
header: { dims: number[]; pixDims: number[]; datatype: number; littleEndian: boolean;
voxOffset: number; affine: number[][]; description: string };
imageData: any; // vtkImageData
rawData: ArrayBuffer;
dimensions: [number, number, number]; // [nx, ny, nz]
spacing: [number, number, number]; // [sx, sy, sz]
}
export type OrganName = string;
export interface OrganMaskData {
id?: string; // ⚠️ STABLE unique key (the mask file id). Renderer keys by this.
organName: OrganName; // display label
imageData: any; // vtkImageData of the binary mask (0 = bg, >0 = organ)
color: [number, number, number]; // 0-255 RGB
}
export type AnnotationTool = 'none' | 'bbox' | 'points' | 'pen' | 'brush' | 'polygon';
export interface AnnotationPoint { x: number; y: number } // normalized [0..1] to the pane
export interface Annotation {
id: string; view: 'axial' | 'coronal' | 'sagittal'; sliceIndex: number;
tool: Exclude<AnnotationTool, 'none'>; points: AnnotationPoint[];
color: string; strokeWidth?: number; label?: string;
}
```
---
## 6. Public API — `UnifiedQuadViewRendererProps`
```ts
interface UnifiedQuadViewRendererProps {
files: File[]; // 1 NIfTI File, or N DICOM Files
windowWidth: number; // CT window width (e.g. 400)
windowLevel: number; // CT window level (e.g. 40)
opacity: number; // 3D volume opacity 0..1 (e.g. 0.8)
isLoading?: boolean;
onRotate3D?: (o: ViewOrientation) => void;
// segmentation/MedSAM props (optional, unused unless you wire a backend):
segmentationEnabled?: boolean; boundingBox?; onBoundingBoxChange?; segmentationMask?;
currentSliceIndex?: number; onSliceIndexChange?;
// organ-mask overlays:
organMasks?: OrganMaskData[]; // ← selected organs to overlay (3D + 2D)
// annotations:
annotationTool?: AnnotationTool;
annotations?: Annotation[];
onAnnotationsChange?: (a: Annotation[]) => void;
}
```
`UnifiedQuadViewRenderer` detects format from file extension (`.nii`/`.nii.gz` → NIfTI,
`.dcm`/`.dicom` → DICOM), runs the matching hook (`useNiftiData`/`useDicomData`), and
renders `NiftiQuadViewRenderer` or `QuadViewRenderer`, forwarding all props.
**⚠️ `organMasks` is forwarded only on the NIfTI path** — DICOM ignores it.
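
The extension-based dispatch can be sketched as a pure function over file names. The return type `DetectedFormat` and the function `detectFormat` are hypothetical (the real `FileFormat` type is exported by the component and its values are not shown here), and treating mixed or empty selections as `unknown` is an assumption.

```typescript
type DetectedFormat = "nifti" | "dicom" | "unknown";

const isNiftiName = (name: string): boolean =>
  name.endsWith(".nii") || name.endsWith(".nii.gz");

const isDicomName = (name: string): boolean =>
  name.endsWith(".dcm") || name.endsWith(".dicom");

// One NIfTI file, or one or more DICOM files; anything else is "unknown".
function detectFormat(fileNames: string[]): DetectedFormat {
  const lower = fileNames.map((n) => n.toLowerCase());
  if (lower.length === 1 && lower.every(isNiftiName)) return "nifti";
  if (lower.length > 0 && lower.every(isDicomName)) return "dicom";
  return "unknown";
}
```

Working on names rather than `File` objects keeps the check testable outside the browser; a caller would pass `files.map(f => f.name)`.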
---
## 7. NIfTI loading pipeline (`niftiLoader.ts`)
Pure, non-hook, throws on bad input. Reused by both the main-image hook and the
organ-mask loader (DRY — one parser).
```ts
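// Imports assumed by this excerpt:
import * as nifti from 'nifti-reader-js';
import vtkImageData from '@kitware/vtk.js/Common/DataModel/ImageData';
import vtkDataArray from '@kitware/vtk.js/Common/Core/DataArray';
export function parseNiftiBuffer(input: ArrayBuffer): NiftiData {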
let buf = input;
if (nifti.isCompressed(buf)) buf = nifti.decompress(buf) as ArrayBuffer; // .nii.gz
if (!nifti.isNIFTI(buf)) throw new Error('Not a valid NIfTI file');
const header = nifti.readHeader(buf);
const image = nifti.readImage(header, buf);
const [ , nx, ny, nz ] = header.dims;
const sx = Math.abs(header.pixDims[1]) || 1, sy = Math.abs(header.pixDims[2]) || 1,
sz = Math.abs(header.pixDims[3]) || 1;
// datatype → typed array (UINT8/INT8/UINT16/INT16/FLOAT32 direct; FLOAT64/INT32 → Float32; default Float32)
let typed = /* switch(header.datatypeCode) … */;
// scl_slope / scl_inter scaling (skip when slope===1 && inter===0)
const slope = header.scl_slope || 1, inter = header.scl_inter || 0;
const scaled = (slope !== 1 || inter !== 0)
? Float32Array.from(typed, v => v * slope + inter)
: (typed instanceof Float32Array ? typed : new Float32Array(typed));
const imageData = vtkImageData.newInstance();
imageData.setDimensions([nx, ny, nz]);
imageData.setSpacing([sx, sy, sz]); // ⚠️ origin left at (0,0,0)
imageData.getPointData().setScalars(
vtkDataArray.newInstance({ name: 'Scalars', numberOfComponents: 1, values: scaled }));
return { header: {}, imageData, rawData: buf, dimensions: [nx,ny,nz], spacing: [sx,sy,sz] };
}
export async function loadNiftiImageData(file: File) { // for organ masks
return parseNiftiBuffer(await file.arrayBuffer()).imageData;
}
```
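
The elided datatype switch could look roughly like the sketch below, which follows the mapping stated in the comment (direct views for UINT8/INT8/UINT16/INT16/FLOAT32, conversion to `Float32Array` for FLOAT64/INT32, `Float32Array` by default). It is not the code from `niftiLoader.ts`: `toTypedArray` is a hypothetical name, the numeric codes are the standard NIfTI-1 datatype codes, and host byte order is assumed to match the file.

```typescript
// Standard NIfTI-1 datatype codes.
const DT_UINT8 = 2;
const DT_INT16 = 4;
const DT_INT32 = 8;
const DT_FLOAT32 = 16;
const DT_FLOAT64 = 64;
const DT_INT8 = 256;
const DT_UINT16 = 512;

type VoxelArray = Uint8Array | Int8Array | Uint16Array | Int16Array | Float32Array;

// Wraps (or converts) the raw voxel buffer according to the header's datatype code.
function toTypedArray(datatypeCode: number, image: ArrayBuffer): VoxelArray {
  switch (datatypeCode) {
    case DT_UINT8:
      return new Uint8Array(image);
    case DT_INT8:
      return new Int8Array(image);
    case DT_UINT16:
      return new Uint16Array(image);
    case DT_INT16:
      return new Int16Array(image);
    case DT_FLOAT32:
      return new Float32Array(image);
    case DT_FLOAT64:
      // Narrowed to 32-bit floats; precision loss is accepted for display.
      return Float32Array.from(new Float64Array(image));
    case DT_INT32:
      return Float32Array.from(new Int32Array(image));
    default:
      // Fallback named in the spec; throws if byteLength is not a multiple of 4.
      return new Float32Array(image);
  }
}
```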
`useNiftiData(file)` just wraps this in a `useEffect` with a `lastFileRef` dedupe (key =
`name-size-lastModified`) and `{ niftiData, isLoading, error }` state.
**⚠️ No affine/world transform** is applied — only dims + spacing, origin (0,0,0). Two
volumes (image + mask) only co-register if they share the same grid. See §11.
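
Since no affine is applied, a cheap guard before overlaying a mask is to confirm that both volumes share dimensions and spacing. The helper below is a sketch, not part of the viewer: `GridLike`, `sharesGrid`, and the `1e-4` spacing tolerance are assumptions.

```typescript
interface GridLike {
  dimensions: [number, number, number]; // [nx, ny, nz]
  spacing: [number, number, number];    // [sx, sy, sz]
}

// True when two volumes have identical dimensions and near-identical spacing,
// which is the only condition under which they line up without a world transform.
function sharesGrid(a: GridLike, b: GridLike, spacingTolerance = 1e-4): boolean {
  const sameDims =
    a.dimensions[0] === b.dimensions[0] &&
    a.dimensions[1] === b.dimensions[1] &&
    a.dimensions[2] === b.dimensions[2];
  const close = (x: number, y: number): boolean => Math.abs(x - y) <= spacingTolerance;
  const sameSpacing =
    close(a.spacing[0], b.spacing[0]) &&
    close(a.spacing[1], b.spacing[1]) &&
    close(a.spacing[2], b.spacing[2]);
  return sameDims && sameSpacing;
}
```

`NiftiData` already exposes `dimensions` and `spacing`, so the parsed image and a parsed mask can be passed in directly; a `false` result is a signal to warn rather than draw a misaligned overlay.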
---
## 8. The quad-view rendering core (`NiftiQuadViewRenderer.tsx`)
### 8.1 Scene graph (built once, in an init `useEffect`)
- **1** `vtkRenderWindow` + **1** `vtkOpenGLRenderWindow` (`.setContainer(containerDiv)`,
`.setSize(rect.w, rect.h)` in CSS px).
- **1** `vtkRenderWindowInteractor` (`.setView`, `.initialize`, default style = trackball).
- **4** `vtkRenderer`, each `.setViewport(...)` into a quadrant; backgrounds: 2D panes
`[0,0,0]`, 3D pane `[0.1,0.1,0.15]`. `renderWindow.addRenderer(ren)` for each.
- Per renderer, an absolutely-positioned **HTML `<div>` overlay** (the visible border +
label), appended to the container (see §8.3).
Store everything in a `useRef` "context": `{ renderWindow, renderWindowView, interactor,
renderers[4], containers[4], imageSliceActors[3], slicePlanes[3], volumeActor, ctf, pf,
iStyle, tStyle, organMaskActors: Map }`.
### 8.2 Viewport layouts (normalized `[xmin, ymin, xmax, ymax]`, GL origin bottom-left)
```ts
// 2×2 grid (default): note the 0.01 margin + 0.02 gutter between panes
axial = [0.01, 0.51, 0.49, 0.99] // top-left
coronal = [0.51, 0.51, 0.99, 0.99] // top-right
sagittal = [0.01, 0.01, 0.49, 0.49] // bottom-left
threeD = [0.51, 0.01, 0.99, 0.49] // bottom-right
// Expanded (double-click a pane): main left, 3 stacked right
main = [0.01, 0.01, 0.74, 0.99]
side = [ [0.75,0.67,0.99,0.99], [0.75,0.34,0.99,0.66], [0.75,0.01,0.99,0.33] ]
```
### 8.3 HTML border overlays ⚠️ (use percentages, not px)
Each pane has a transparent `<div>` (2px border + a corner label) over the canvas. Position
it as **percentages of the container**, matching the renderer's normalized viewport:
```ts
el.style.position = 'absolute';
el.style.left = `${vp[0] * 100}%`;
el.style.bottom = `${vp[1] * 100}%`;
el.style.width = `${(vp[2]-vp[0]) * 100}%`;
el.style.height = `${(vp[3]-vp[1]) * 100}%`;
el.style.boxSizing = 'border-box'; // set at creation
el.style.border = 'solid 2px hsl(var(--border))';
```
**⚠️ Do NOT compute px from `getBoundingClientRect()`** — the viewer often opens inside a
dialog that animates with `transform: scale(.95→1)`, and `ResizeObserver` does NOT fire on
transform changes, so a rect captured mid-animation freezes ~5% small/offset while the
canvas (`width:100%`) stretches to fill → content spills past the frame. Percentages track
the canvas under any transform/resize.
### 8.4 Slices (the 3 2D panes)
```ts
axialPlane.setNormal(0,0,1); coronalPlane.setNormal(0,1,0); sagittalPlane.setNormal(1,0,0);
// per view i: mapper = vtkImageResliceMapper({ slicePlane: planes[i] }); actor = vtkImageSlice(mapper)
// camera = parallel projection; positioned per medical convention (§8.6)
```
On data load, set every plane's origin to the volume center. `imageSliceActors[i] = { actor, mapper, ctf }`.
### 8.5 3D volume (the 3D pane)
```ts
volumeMapper = vtkVolumeMapper({ sampleDistance: 1.0 });
volumeActor = vtkVolume(); volumeActor.setMapper(volumeMapper);
// shade on, ambient .2 / diffuse .7 / specular .3 / specularPower 8
// on data load: mapper.setInputData(im); renderer3D.removeAllVolumes(); renderer3D.addVolume(volumeActor)
// setScalarOpacityUnitDistance(0, diagonal / max(dims)); gradientOpacity min 0 / max (range*0.05)
```
### 8.6 Cameras (medical orientations, set once on data load) — verbatim
```ts
// center = volume bounds center; d = boundingBox.diagonalLength * 1.5
axial: pos(cx, cy, cz-1) focal(center) viewUp(0,-1,0) then renderer.resetCamera()
coronal: pos(cx, cy-1, cz) focal(center) viewUp(0, 0,1) then resetCamera()
sagittal: pos(cx-1, cy, cz) focal(center) viewUp(0, 0,1) then resetCamera()
3D: rotate3DView('anterior')
// rotate3DView(o): focal = center; viewUp/pos per orientation:
// anterior pos(cx, cy-d, cz) up(0,0,1) posterior pos(cx, cy+d, cz) up(0,0,1)
// left-lat pos(cx-d, cy, cz) up(0,0,1) right-lat pos(cx+d, cy, cz) up(0,0,1)
// superior pos(cx, cy, cz+d) up(0,1,0) inferior pos(cx, cy, cz-d) up(0,-1,0)
```
### 8.7 Window/level + opacity transfer functions (on every slider change)
```ts
const low = level - width/2, high = level + width/2;
// 2D grayscale (per slice ctf): points (low-1,0,0,0)(low,0,0,0)(high,1,1,1)(high+1,1,1,1)
// + actor.getProperty().setColorWindow(width); setColorLevel(level);
// 3D volume color (bone/soft-tissue ramp):
ctf: (low-200, 0,0,0)(low, .4,.2,.1)(low+.3Δ, .8,.6,.5)(low+.5Δ, .9,.8,.7)(high, 1,1,.9)(high+200, 1,1,1)
pf : (low-200,0)(low,0)(low+.2Δ, op*.2)(low+.5Δ, op*.5)(high, op) // Δ = high-low
```
### 8.8 Slice scrolling + zoom (wheel listener per 2D pane)
```ts
// plain wheel → slice nav: axisIndex = view===axial?2 : view===coronal?1 : 0
// plane.origin[axisIndex] += spacing[axisIndex] * (deltaY>0?1:-1); clamp to bounds; render
// Ctrl/⌘ wheel → camera.zoom(deltaY<0 ? 1.1 : 0.9)
```
Attach with `{ passive: false }` and `preventDefault()`.
### 8.9 Resize
`ResizeObserver` on the container → `renderWindowView.setSize(rect.w, rect.h)` + reposition
the % border divs + `render()`.
---
## 9. Interaction model ⚠️ (the subtle part)
One interactor, bound to the **full canvas container** (never per-pane). Two styles:
`vtkInteractorStyleImage` (2D: pan/zoom/window-level) and
`vtkInteractorStyleTrackballCamera` (3D: rotate). Each pane `<div>` carries
`dataset.viewId = "0..3"`.
- **Style swap by pane:** on `pointerenter`/`mousedown`, `interactor.setInteractorStyle(viewId==='3' ? trackball : image)`, and bind events to the full container (once).
- **⚠️ Confine a drag to its origin pane:** VTK re-resolves the "poked" renderer on **every
mouse-move** (`findPokedRenderer`), and trackball/image act on that renderer — so a drag
begun in the 3D pane that wanders into a 2D pane retargets the 2D camera. `findPokedRenderer`
skips renderers whose `getInteractive()` is false, so **on `mousedown` set every OTHER
renderer `setInteractive(false)`, restore all on `mouseup` + a global `mouseup`.**
- **⚠️ Double-click to expand:** VTK takes **pointer capture** on press, so the native
`dblclick`/`mouseup`/`click` fire on the parent container, **not** the per-pane div — a
`dblclick` listener on the div never fires. Detect it on the div's `mousedown` via
`e.detail === 2` (the 2nd press of a double-click), then toggle `expandedView`.
- **⚠️ `resetCamera()` on expand:** when the layout changes, re-run
`renderers.forEach(r => r.resetCamera())` after `setViewport`, else the enlarged pane keeps
the tiny framing it had as a quadrant (content stuck in a corner). `resetCamera` preserves
direction + view-up (orientation/rotation kept).
- **⚠️ VTK + Vite HMR:** Fast Refresh leaves stale inputless mappers → console floods
`No input!` + black panes. **Always verify on a FULL reload**, not HMR.
---
## 10. Annotation overlay (`AnnotationOverlay.tsx`)
One overlay `<div>` per 2D pane, `absolute inset-0`, rendering an SVG of the annotations.
Geometry is **normalized [0..1]** to the pane and tagged with `{view, sliceIndex}` so an
annotation only shows on the slice it was drawn on. Tools: `bbox` (2-pt drag), `points`
(click), `pen` (polyline w=2), `brush` (polyline w=16, 0.55 opacity), `polygon` (click
vertices, double-click to close ≥3 pts).
- **Pointer-transparent when tool === 'none'** (`pointerEvents: none`) so VTK keeps
scroll/zoom/rotate; `pointerEvents: auto` + `cursor: crosshair` when a tool is active.
- **⚠️ Capture with NATIVE listeners that `stopPropagation()` + `setPointerCapture`** — and
**also `interactor.disable()` while a tool is active** in the renderer. `stopPropagation()`
ALONE is insufficient: VTK's native canvas listener fires before any React handler can stop it.
- Latest `tool`/callbacks kept in refs so the once-bound native listeners always see current
values without re-binding mid-drag.
- `dblclick` while a polygon draft has ≥3 pts → close it; otherwise → forward to
`onRequestExpand()` (expand the pane).
- `wheel` while a tool is active → `stopPropagation` + forward `{deltaY, ctrlKey, metaKey}`
to the host, which re-dispatches a synthetic `wheel` on the underlying pane so slice-scroll
keeps working under the overlay.
---
## 11. Organ-mask overlays ⚠️ (3D surface + 2D fills)
Driven by the `organMasks: OrganMaskData[]` prop. A `useEffect([organMasks])` diffs them
against an `organMaskActors: Map<key, {...}>` and adds/removes per organ. **⚠️ Key by
`maskData.id ?? organName`, NOT the display label** — two organs sharing a label otherwise
collapse into one (and the UI lies "visible" with nothing rendered).
### 11.1 3D overlay — render a SURFACE, not a second volume
**⚠️ vtk.js does NOT composite two overlapping *volumes* in one renderer** — an overlay
volume silently fails to show even with valid data. Render the binary mask as a colored
**iso-surface** instead (also reads better as a solid shell):
```ts
const mc = vtkImageMarchingCubes.newInstance({ contourValue: 0.5, computeNormals: true, mergePoints: true });
mc.setInputData(maskImageData);
const mapper = vtkMapper.newInstance(); mapper.setInputConnection(mc.getOutputPort()); mapper.setScalarVisibility(false);
const actor = vtkActor.newInstance(); actor.setMapper(mapper);
actor.getProperty().setColor(r/255, g/255, b/255); actor.getProperty().setOpacity(0.7);
renderer3D.addActor(actor);
```
**⚠️ After adding overlay geometry call `renderer3D.resetCameraClippingRange()`** before
`render()`. The camera's far-clip was set for the main volume (e.g. far=1000) while the
surface sits ~1018–1283 away → it is entirely **culled** → empty 3D pane. This (not the
multi-volume issue) is the usual "nothing renders" cause; diagnose by logging
`actor.getBounds()` vs `camera.getClippingRange()`.
**⚠️ `ImageMarchingCubes` ships no `.d.ts`** — `// @ts-expect-error` the import.
**⚠️ PERF:** marching cubes on a 512³ mask takes ≈ 9 s on the main thread (no worker), and it runs in
the effect *after* the host's loading spinner clears → a frozen UI. Mitigate by cropping to
the mask's non-zero bbox before MC, or a Web Worker, or precomputing during load.
### 11.2 2D overlay — reslice onto the slice panes, sharing the main slice planes
For each view `vi ∈ {0,1,2}`, reslice the **same** mask with the **same `slicePlanes[vi]`**
the main image uses (so it tracks slice-scroll for free — the wheel handler already mutates
that plane + renders):
```ts
const m = vtkImageResliceMapper.newInstance(); m.setSlicePlane(slicePlanes[vi]); m.setInputData(maskImageData);
const ctf = vtkColorTransferFunction.newInstance(); ctf.addRGBPoint(0,0,0,0); ctf.addRGBPoint(1, r/255,g/255,b/255);
const pwf = vtkPiecewiseFunction.newInstance(); pwf.addPoint(0,0); pwf.addPoint(0.5,0); pwf.addPoint(1, 0.6); // opacity
const slice = vtkImageSlice.newInstance(); slice.setMapper(m);
slice.getProperty().setRGBTransferFunction(0, ctf);
slice.getProperty().setPiecewiseFunction(0, pwf);
slice.getProperty().setColorWindow(1); slice.getProperty().setColorLevel(0.5); // ⚠️ see below
slice.getProperty().setInterpolationTypeToNearest();
renderers[vi].addActor(slice);
```
**⚠️ `setColorWindow(1)` + `setColorLevel(0.5)` are mandatory.** The default color window
(255) squashes the binary value `1` to ~0 on the transfer function → the mask renders
**near-black** on the slice instead of the organ color. Window 1 / level 0.5 maps data
`0→0`, `1→1`. No z-fighting in practice (coplanar with the main slice, added after).
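The window/level arithmetic behind this gotcha is language-neutral, so here is a minimal sketch in Python. It assumes a plain linear ramp from `level - window/2` to `level + window/2`, clamped to `[0, 1]`, and a default level of 127.5 for the default window of 255. Both are simplifying assumptions about how the slice mapper positions a scalar on the transfer function, not a transcription of vtk.js internals.

```python
def window_level_to_unit(value, window, level):
    """Map a scalar to [0, 1] with a linear window/level ramp, clamped at both ends."""
    low = level - window / 2.0
    t = (value - low) / window
    return min(1.0, max(0.0, t))


# Assumed defaults (window 255, level 127.5): a mask voxel of 1 lands at 1/255,
# i.e. almost at the black end of the transfer function.
default_pos = window_level_to_unit(1, window=255, level=127.5)

# Settings from the snippet above (window 1, level 0.5): 0 stays 0, 1 becomes 1.
fixed_lo = window_level_to_unit(0, window=1, level=0.5)
fixed_hi = window_level_to_unit(1, window=1, level=0.5)

print(default_pos, fixed_lo, fixed_hi)
```

Under this model the binary value `1` only reaches the organ color once the window spans exactly the mask's `0..1` range.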
### 11.3 Lifecycle
Store `{ actor(surface), mapper, ctf: mcFilter, pf: null, sliceActors: [{actor,mapper,ctf,pf}×3] }`
in the Map. On toggle-off: `removeActor` the surface from `renderer3D` + each slice from
`renderers[vi]`, then `.delete()` all. **⚠️ Also free them in the component's unmount
cleanup** (iterate the Map) — they leak otherwise when the viewer closes with organs selected.
---
## 12. Host integration (the dialog)
The viewer is mounted full-screen and fed by a host that owns the controls. Minimal shape:
```tsx
<UnifiedQuadViewRenderer
files={[file]} windowWidth={ww} windowLevel={wl} opacity={op}
organMasks={organMasks} // ← selected organs
annotationTool={tool} annotations={annos} onAnnotationsChange={setAnnos} />
```
- **Control bar:** range sliders → `windowWidth` (1..4000), `windowLevel` (-1000..3000),
`opacity` (0..1).
- **Organ panel:** lists the available masks (id, label, color swatch). Toggling an organ
**lazily** loads its mask: `presignURL → fetch as File → loadNiftiImageData(file) →
OrganMaskData{ id, organName, imageData, color }`, cache by id, and derive
`organMasks = selectedIds.map(id => cache[id])`. Assign a stable color per organ by list
index from a fixed palette.
- **⚠️ Loader lives behind the lazy boundary** — the dialog (already VTK-heavy) imports
`loadNiftiImageData` from `@ump/shared/viewer`; the host *page* only uses VTK-free helpers
(presign/fetch) so the page bundle stays clean.
- Full-screen dialog: use `inset-0` (not `w-screen/h-screen`, which overflows by the
scrollbar width); a `flex-1` child in a non-definite-height parent collapses to 0 → give a
definite height (`h-screen` flex-col).
---
## 13. Consolidated gotcha checklist ⚠️
1. **% border divs**, never px from `getBoundingClientRect()` (transform-stale-rect).
2. **Confine drags** to the origin pane via `setInteractive(false)` on the others (VTK
re-pokes every move).
3. **Expand = `e.detail===2` on mousedown** (pointer-capture eats `dblclick`) **+
`resetCamera()`** on layout change.
4. **Organ 3D overlay = marching-cubes SURFACE**, not a 2nd volume **+
`resetCameraClippingRange()`** (stale far-clip culls it).
5. **Organ 2D overlay**: reslice on the **shared** slice planes + **`colorWindow(1)/
colorLevel(0.5)`** (else near-black) + piecewise opacity.
6. **Key overlays by stable id**, not label.
7. **Free organ actors** on toggle-off AND unmount.
8. **Annotation overlay**: pointer-transparent when idle; native listeners +
`stopPropagation` + `interactor.disable()` while drawing.
9. **Co-registration**: masks must share the image's grid (dims+spacing, origin 0,0,0) — no
resampling is applied.
10. **Lazy subpath + alias order**; verify VTK/visual/interaction changes on a **FULL reload**.
---
## 14. Reconstruction build order
1. Install deps (§2); import the 3 VTK profiles once.
2. `parseNiftiBuffer` + `loadNiftiImageData` + `useNiftiData` (§7).
3. The init effect: 1 canvas / 1 renderWindow / 1 OpenGL view / 4 renderers + viewports
(§8.1–8.2) + the % border divs (§8.3).
4. Data-load effect: planes + reslice slice actors (§8.4), volume actor (§8.5), cameras
(§8.6) — verify the 3 slices + 3D volume render.
5. Window/level/opacity effect (§8.7) + wheel scroll/zoom (§8.8) + resize (§8.9).
6. Interaction: interactor + style swap + confine + expand (§9) — verify drag isolation +
double-click expand on a FULL reload.
7. Annotation overlay (§10).
8. Organ overlays: 3D surface (§11.1) then 2D reslice (§11.2) + lifecycle (§11.3).
9. Host dialog + organ panel (§12).
10. Walk the gotcha checklist (§13).
> Verify every visual/interaction change live with **real mouse input on a full reload** —
> `tsc`-clean ≠ works, and synthetic event-dispatch can mask the pointer-capture/clip bugs.
</content>
@@ -0,0 +1,356 @@
# ImageHub — Architecture
> **"GitHub for medical-imaging research datasets."** A self-hosted platform for
> versioning, viewing, de-identifying, and collaborating on imaging datasets
> (DICOM / NIfTI / WSI), modeled on Gitea's architecture but rebuilt on a
> Python-centric stack suited to the imaging + ML ecosystem.
>
> *"ImageHub" is a placeholder name — rename freely.*
This document describes (1) the Gitea patterns we are reproducing, (2) how each
maps to the imaging domain, (3) the recommended stack, (4) the subsystems and
data model, and (5) an MVP-first roadmap.
---
## 1. Design philosophy (inherited from Gitea)
Gitea is worth copying for five structural decisions. We keep all five:
1. **Modular monolith, not microservices.** One deployable core app with clear
internal layers. You can scale the heavy parts out later (we do — the worker
tier) without paying distributed-systems tax up front.
2. **Strict downward layering.** `cli → api → services → models → core`.
Dependencies only point down. Business logic lives in `services`, never in
models or HTTP handlers.
3. **Server-rendered UI + progressive enhancement, not a SPA.** Pages are
rendered server-side; rich client behavior (the image viewer) is embedded as
self-contained widgets. Faster to build, easy to deep-link, SEO/printable.
4. **Pluggable infrastructure behind interfaces.** Storage, queue, search,
cache, and auth are interfaces with swappable drivers (local disk ↔ S3,
in-proc ↔ Redis, Postgres FTS ↔ OpenSearch). Same idea as Gitea's
`modules/storage`, `modules/queue`, `modules/indexer`.
5. **The domain engine is a first-class subsystem.** For Gitea that engine is
Git. For us it is the **Dataset Versioning Engine** — a content-addressed,
Merkle-DAG version control system specialized for large imaging files. This is
the single most important component and the heart of the product.
What we deliberately change from Gitea:
- **Workers are externalized.** Gitea runs background jobs in-process. Imaging
jobs (de-identification, format conversion, thumbnailing, ML) are heavy,
Python-bound, and sometimes need GPUs — so they run in a separate, scalable
worker tier driven by a real queue.
- **All "files" are large binaries.** Gitea bolts on Git-LFS for large files; for
us large-file handling is the *default and only* path — every blob is
content-addressed and stored in object storage.
- **De-identification & audit are core**, not afterthoughts (domain requirement).
---
## 2. Concept mapping: Gitea → ImageHub
| Gitea concept | ImageHub equivalent | Notes |
|---|---|---|
| Repository | **Dataset** | A versioned collection of imaging studies/series + metadata + labels. |
| Git commit | **Version** (commit) | Immutable snapshot = a content-addressed manifest + parent links. |
| Branch / tag | **Branch / tag** | e.g. `raw`, `deidentified`, `train-split-v3`; tags for citable releases. |
| Blob / tree | **Blob / manifest** | Blob = one file (DICOM instance, NIfTI, label). Manifest = the tree of a version. |
| Git-LFS | *(native)* | Every blob is large; content-addressed object store is the only path. |
| Git transport (SSH/HTTP) | **Transport API + CLI/SDK** | Resumable chunked upload/download; "have/want" blob negotiation like LFS batch. |
| Pull Request | **Change Proposal** | Review added/changed/relabeled data before merging into a branch. |
| Diff / code review | **Dataset diff + image diff** | Added/removed/changed series and label diffs, viewed side-by-side. |
| Issues | **Issues / annotation tasks** | QC findings, labeling tasks, discussions. |
| Releases | **Dataset releases** | Frozen, citable snapshots (DOI-friendly) — key for research reproducibility. |
| Wiki | **Datasheet / data dictionary** | Dataset documentation, "Datasheets for Datasets". |
| Actions / act_runner | **Pipelines / runners** | Event-driven compute: de-id, QC, train/eval; pins exact data version. |
| Webhooks | **Webhooks** | Same. |
| Code search indexer | **Metadata + tag search** | Faceted search over modality/body-part/labels; optional image-embedding search. |
| Org / Team / User / RBAC | **Org / Team / User / RBAC** | Nearly identical; plus dataset access requests / data-use agreements. |
| `app.ini` + `modules/setting` | **Config system** | Typed config from file + env. |
| XORM migrations | **Alembic migrations** | Ordered, append-only schema migrations. |
| Storage (local/minio/s3) | **Object storage** | Same abstraction; blobs live here. |
| *(minimal in Gitea)* | **Audit & compliance log** | First-class, append-only PHI-access trail. |
| *(none)* | **De-identification engine** | Domain-specific; no Gitea analogue. |
---
## 3. Recommended stack ("own stack", Python-centric)
Rationale: the medical-imaging and ML ecosystems (pydicom, SimpleITK, nibabel,
dcm2niix, highdicom, MONAI, the de-id tooling) are overwhelmingly Python. A
single-language core + worker stack removes the model-duplication friction you'd
get from a Go core calling Python workers.
| Layer | Choice | Gitea analogue |
|---|---|---|
| Core web/API | **Python 3.12 + FastAPI** (uvicorn/gunicorn) | `routers/` (chi) |
| Templating | **Jinja2 + HTMX** for progressive enhancement | `templates/` |
| Frontend build | **Vite + TypeScript** | `web_src/` + Vite |
| DICOM viewer | **Cornerstone3D** (DICOM), **NiiVue** (NIfTI) | embedded widgets |
| ORM / migrations | **SQLAlchemy 2.0 + Alembic** | XORM + migrations |
| Primary DB | **PostgreSQL** (single target) | multi-DB → standardize on PG |
| Queue / workers | **Redis + Arq** (async) or Celery | `modules/queue` + workers |
| Object storage | **S3 / MinIO** (self-host) | `modules/storage` |
| Search | **OpenSearch** (or Postgres FTS to start) | `modules/indexer` |
| Cache / pubsub / sessions | **Redis** | `modules/cache`, eventsource |
| Auth | **Authlib** (OIDC/OAuth2) + sessions + API tokens | `services/auth` |
| Imaging libs | pydicom, highdicom, SimpleITK, nibabel, dcm2niix, Pillow; OpenSlide for WSI | — |
| ML integration | MONAI / PyTorch dataset adapters via the SDK | — |
| De-id | pydicom + `deid` (CTP rules) + Presidio (text) + OCR (burned-in PHI) | — |
| Client | **Python SDK + CLI** (`imagehub clone/pull/push/commit`) | the `git` client |
> **Alternative if you want Gitea-grade transport performance:** keep a **Go
> core** for the API/transport/auth layer and use **Python only in the worker
> tier**. Faithful to Gitea, but you maintain two languages and duplicate the
> dataset/manifest types across the boundary. Recommended only if the upload/
> download path is your dominant bottleneck. Default to all-Python.
---
## 4. Layered architecture
```
cli/                     Admin & ops commands (Typer): serve, migrate, doctor, deid-batch, user-admin
└─ api/                  FastAPI routers: UI pages + REST API + transport endpoints (thin: parse → service → render)
   └─ services/          Business logic: dataset ops, versioning workflows, review, pipelines, de-id orchestration
      └─ models/         SQLAlchemy entities + queries (one module per domain: user, dataset, version, annotation…)
         └─ core/        Leaf infra & domain engines; MUST NOT import the layers above
            ├─ vcs/      ← the Dataset Versioning Engine (the "Git")
            ├─ storage/  ← content-addressed blob store over S3/MinIO
            ├─ imaging/  ← DICOM/NIfTI parsing, metadata, thumbnails, conversion
            ├─ deid/     ← de-identification pipeline stages
            ├─ queue/    ← Redis/Arq job abstraction
            ├─ index/    ← search abstraction (OpenSearch / PG FTS)
            ├─ audit/    ← append-only audit log
            ├─ config/   ← typed settings
            └─ auth/     ← tokens, sessions, OIDC, permissions
```
**Layer rules (enforce with import-linter, the analogue of Gitea's depguard):**
- `core/` is the foundation; it may not import `models/`, `services/`, or `api/`.
- Cross-entity business logic goes in `services/`, never in `models/`.
- `api/` handlers stay thin — no business logic, no direct DB-engine access.
- Every DB query takes a `session`/context so it enlists in the request transaction.
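
To make the "dependencies only point down" rule concrete, here is a toy checker in Python. It is only an illustration of the rule, not a replacement for import-linter; the module names in the example (`api.datasets`, `services.review`, …) are hypothetical.

```python
# Layers from top to bottom, as in the tree above.
LAYERS = ["cli", "api", "services", "models", "core"]


def layer_of(module: str) -> int:
    """Index of the layer a dotted module name belongs to (0 = top)."""
    return LAYERS.index(module.split(".")[0])


def violations(imports):
    """Return the (importer, imported) pairs that point upward.

    An import is allowed when the imported module sits in the same layer
    or in a lower one; anything pointing at a higher layer is a violation.
    """
    return [(src, dst) for src, dst in imports if layer_of(dst) < layer_of(src)]


edges = [
    ("api.datasets", "services.dataset"),    # down: fine
    ("services.dataset", "core.vcs"),        # down, skipping a layer: fine
    ("core.vcs", "models.commit"),           # up: violation
    ("models.dataset", "services.review"),   # up: violation
]
bad = violations(edges)
print(bad)
```

A real setup would derive the edges from the source tree and fail CI when the list is non-empty.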
---
## 5. Core subsystems
### 5.1 Dataset Versioning Engine (`core/vcs`) — the heart
A content-addressed Merkle DAG, like Git, specialized for large imaging files.
- **Blob store.** Every file is hashed (SHA-256) and stored once in object
storage at `blobs/<aa>/<bb>/<hash>`. Identical files across versions/datasets
dedupe for free (huge win — imaging datasets share many instances).
- **Manifest (tree).** A version's manifest lists `logical_path → {blob_hash,
size, media_type, imaging_meta}`. The manifest is itself content-addressed.
- **Commit.** `{manifest_hash, parents[], author, timestamp, message}`. The
parent chain is the history DAG.
- **Refs.** Branches/tags map `name → commit_id`, stored in **Postgres** (not in
object storage) so they're transactional and queryable.
- **Transport / negotiation.** On push, the client hashes locally and asks the
server which blobs are missing ("have/want", like the LFS batch API), uploads
only those (resumable, chunked), then posts the commit. Pull is the reverse.
- **Diff.** Compare two manifests → added / removed / modified entries; surfaced
in the UI as a dataset diff and, per-image, as a viewer side-by-side.
- **Merge.** Three-way path-level merge of manifests; conflicts when the same
path changed on both sides. Label/annotation merges can be semantic.
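
The blob, manifest, diff, and have/want ideas above can be sketched in a few lines of stdlib Python. This is a minimal illustration under stated assumptions: `<aa>/<bb>` is taken to mean the first two and next two hex characters of the hash, manifests are hashed as canonical JSON, and all function names are hypothetical.

```python
import hashlib
import json


def blob_hash(data: bytes) -> str:
    """SHA-256 of the file content, hex-encoded."""
    return hashlib.sha256(data).hexdigest()


def blob_key(digest: str) -> str:
    """Object-store key blobs/<aa>/<bb>/<hash> (fan-out on the hash prefix, assumed)."""
    return f"blobs/{digest[:2]}/{digest[2:4]}/{digest}"


def manifest_hash(manifest: dict) -> str:
    """Hash a manifest via canonical JSON so equal manifests hash equally."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def diff_manifests(old: dict, new: dict) -> dict:
    """Path-level diff: added / removed / modified (blob hash changed)."""
    return {
        "added": sorted(p for p in new if p not in old),
        "removed": sorted(p for p in old if p not in new),
        "modified": sorted(
            p for p in new if p in old and old[p]["blob_hash"] != new[p]["blob_hash"]
        ),
    }


def missing_blobs(client_hashes, server_has):
    """Have/want negotiation: hashes the client holds that the server lacks."""
    return sorted(set(client_hashes) - set(server_has))


v1 = {"ct/0001.dcm": {"blob_hash": blob_hash(b"slice-1"), "size": 7}}
v2 = {
    "ct/0001.dcm": {"blob_hash": blob_hash(b"slice-1 relabeled"), "size": 17},
    "ct/0002.dcm": {"blob_hash": blob_hash(b"slice-2"), "size": 7},
}
changes = diff_manifests(v1, v2)
to_upload = missing_blobs(
    [entry["blob_hash"] for entry in v2.values()],
    server_has=[entry["blob_hash"] for entry in v1.values()],
)
print(changes, len(to_upload))
```

Because keys derive from content alone, re-uploading an unchanged file is a no-op: its hash is already in `server_has`, so it never appears in `to_upload`.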
**Build vs. buy:** building this custom gives full control and the cleanest
domain fit (recommended). If you need to move faster, back it with **lakeFS**
(git-like branches/commits/merge over S3) or **DVC**, and keep your manifest API
as the stable interface so you can swap the backend later.
### 5.2 Object storage (`core/storage`)
Driver interface (`put/get/stat/delete/presign`) with `local` and `s3/minio`
implementations — exactly Gitea's `modules/storage` pattern. Stores blobs,
manifests, thumbnails, pipeline artifacts. Presigned URLs let clients up/download
directly to S3 for large transfers, bypassing the app.
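
One way to express the driver interface is a `typing.Protocol`. The sketch below is an assumption-laden illustration: the method names follow the `put/get/stat/delete/presign` list above, but the signatures, the `ObjectStat` type, and the in-memory driver (a stand-in for the `local` driver, with a fake `memory://` URL) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class ObjectStat:
    key: str
    size: int


class StorageDriver(Protocol):
    """Hypothetical driver contract; services depend on this, not on S3."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def stat(self, key: str) -> ObjectStat: ...
    def delete(self, key: str) -> None: ...
    def presign(self, key: str, expires_s: int) -> str: ...


class InMemoryDriver:
    """Test double that satisfies StorageDriver structurally."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]  # raises KeyError if the key is absent

    def stat(self, key: str) -> ObjectStat:
        return ObjectStat(key=key, size=len(self._objects[key]))

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)

    def presign(self, key: str, expires_s: int) -> str:
        # Not a real signed URL; a real driver would ask S3/MinIO for one.
        return f"memory://{key}?expires={expires_s}"


def store_blob(driver: StorageDriver, key: str, data: bytes) -> ObjectStat:
    """Service-side helper that only knows the interface."""
    driver.put(key, data)
    return driver.stat(key)


driver = InMemoryDriver()
info = store_blob(driver, "blobs/aa/bb/example", b"abc")
print(info)
```

Swapping `InMemoryDriver` for an S3-backed class leaves `store_blob` untouched, which is the point of the abstraction.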
### 5.3 Ingestion & processing pipeline (`core/queue` + workers)
On upload, enqueue jobs; workers (Arq) process them:
1. Verify checksums, store blobs (dedup).
2. **Extract metadata** (pydicom/nibabel): modality, body part, study/series UIDs,
dimensions, acquisition params → indexed + linked to blobs.
3. **Thumbnails / previews** for the browse UI.
4. **De-identification** (§5.4).
5. **Format normalization** (optional: DICOM→NIfTI via dcm2niix for ML).
6. Commit the resulting version; update search index; write audit entries.
Workers scale independently; GPU nodes handle ML jobs.
### 5.4 De-identification engine (`core/deid`) — compliance must-have
A configurable, multi-stage pipeline producing a `deidentified` branch from a
`raw`/PHI version:
- **Tag de-id** per **DICOM PS3.15 Annex E** confidentiality profiles: remove/
replace PHI tags, regenerate UIDs *consistently* (so series stay linked),
handle private tags.
- **Date shifting**: consistent per-patient offset to preserve intervals.
- **Burned-in pixel PHI**: OCR (Tesseract/EasyOCR) to detect text in pixels,
redact, and flag for human review.
- **Free-text / report de-id**: Presidio NER over any text fields/reports.
- **Re-identification map** (only if policy allows): the original↔pseudonym
mapping is encrypted, access-restricted, and fully audited; otherwise the PHI
source is dropped.
- **Verification stage** emits a report of exactly what changed.
Tooling: pydicom, Stanford `deid` / MIRC CTP rule sets, Presidio, an OCR engine.
Profiles are configurable per org/dataset.
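
Two of the stages above, consistent date shifting and consistent UID regeneration, can be sketched with keyed hashing. This is a minimal illustration, not the pipeline's actual method: the HMAC construction, the 1 to 365 day range, and the function names are assumptions. The `2.25.` prefix is the standard OID root for UUID-derived identifiers, used here so the output has a valid UID shape.

```python
import hashlib
import hmac
from datetime import date, timedelta


def patient_offset_days(secret: bytes, patient_id: str, max_days: int = 365) -> int:
    """Deterministic per-patient offset in 1..max_days, derived from a secret key."""
    digest = hmac.new(secret, patient_id.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") % max_days + 1


def shift_date(d: date, offset_days: int) -> date:
    """Shift a date into the past by the patient's offset."""
    return d - timedelta(days=offset_days)


def pseudonymous_uid(secret: bytes, original_uid: str) -> str:
    """Same input UID always maps to the same replacement, so series stay linked."""
    digest = hmac.new(secret, original_uid.encode(), hashlib.sha256).digest()
    return "2.25." + str(int.from_bytes(digest[:16], "big"))


secret = b"per-dataset-secret"  # hypothetical; would live in a key store
offset = patient_offset_days(secret, "PATIENT-001")
baseline = date(2021, 3, 1)
follow_up = date(2021, 6, 15)
shifted_gap = shift_date(follow_up, offset) - shift_date(baseline, offset)
new_uid = pseudonymous_uid(secret, "1.2.840.113619.2.55.3.1")
print(offset, shifted_gap.days, new_uid)
```

Because every date of one patient moves by the same offset, the interval between baseline and follow-up is unchanged after shifting.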
### 5.5 Web viewer (`api` + embedded TS widgets)
Progressive-enhancement widgets (not a separate SPA), true to Gitea:
- **Cornerstone3D** for DICOM (multi-frame, MPR, windowing, measurements,
segmentation overlays).
- **NiiVue** for NIfTI volumes (great for neuro/research).
- **OpenSlide**-backed deep-zoom tiles for whole-slide pathology (optional).
The server exposes a frame/tile API (a WADO-RS-like read path even without full
DICOMweb). Annotations are structured objects (DICOM SR or JSON), **versioned
with the dataset**.
### 5.6 Search & discovery (`core/index`)
Index extracted metadata + labels → faceted search ("brain MRI, T1, age<40, has
tumor label"). Start on **Postgres FTS**; graduate to **OpenSearch** for scale.
Optional later: compute image embeddings (a foundation model) → **pgvector** for
"find similar studies/lesions".
### 5.7 Collaboration (`services`)
Change Proposals (PRs), reviews, issues, comments, annotation tasks, releases,
datasheets — the GitHub social layer, mapped to datasets. A reviewer of a Change
Proposal sees the dataset diff and can open the viewer on changed series.
### 5.8 Pipelines & runners (Actions analogue, optional/advanced)
Event-driven compute (`on: push | proposal | tag | schedule`) executed by
**runners** (containers that poll for jobs, à la `act_runner`). Use cases: auto
de-id, QC/validation, dataset statistics, **training/eval** with MONAI. Each run
**pins the dataset version hash**, giving reproducible ML by construction.
### 5.9 Auth, permissions, audit (`core/auth`, `core/audit`)
- OIDC/OAuth2 login, sessions, scoped API tokens.
- Org → Team → permission model; dataset visibility `private | internal | public`;
dataset-level access requests / data-use agreements.
- **Audit log**: append-only Postgres table (actor, action, object, dataset,
version, IP, purpose-of-use, timestamp). Every PHI-bearing access (view
original, download) is logged; optional hash-chaining for tamper-evidence;
retention + legal-hold support.
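
The optional hash-chaining could work roughly as follows. This is a sketch under assumptions: each row stores the hash of the previous row, the entry is serialized as canonical JSON, and the in-memory list stands in for the Postgres table. Field and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # prev_hash of the very first row


def entry_hash(prev_hash: str, entry: dict) -> str:
    """Hash of this row's content chained to the previous row's hash."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def append_entry(log: list, entry: dict) -> None:
    """Append-only write: the new row commits to everything before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"entry": entry, "prev_hash": prev, "hash": entry_hash(prev, entry)})


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited or removed row breaks the chain."""
    prev = GENESIS
    for row in log:
        if row["prev_hash"] != prev or row["hash"] != entry_hash(prev, row["entry"]):
            return False
        prev = row["hash"]
    return True


log = []
append_entry(log, {"actor": 7, "action": "view_original", "dataset": 12})
append_entry(log, {"actor": 7, "action": "download", "dataset": 12})
intact = verify_chain(log)

log[0]["entry"]["action"] = "list"  # simulate tampering with an old row
tampered_ok = verify_chain(log)
print(intact, tampered_ok)
```

The chain only provides tamper-evidence if the latest hash is also anchored somewhere the database administrator cannot rewrite.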
### 5.10 API, SDK, CLI
- **REST API** (FastAPI, OpenAPI-documented — the swagger analogue).
- **Python SDK** (the most important client for ML users): pull a pinned version
straight into a `torch`/MONAI `Dataset`.
- **CLI** (`imagehub clone/pull/push/checkout/commit/diff`) — the `git`/`dvc`
analogue for data engineers.
---
## 6. Data model (core tables)
```
user, organization, team, team_membership, team_access
dataset(id, owner_id, name, visibility, default_branch, description)
ref(dataset_id, name, type[branch|tag], commit_id) -- transactional refs
commit(id, dataset_id, manifest_hash, parent_ids[], author_id, message, created_at)
blob(hash PK, size, storage_key, media_type, refcount) -- content-addressed, deduped
manifest(hash PK, storage_key) -- stored in object store, hash in DB
instance_meta(blob_hash, dataset_id, study_uid, series_uid, modality, body_part, dims, params…)
annotation(id, dataset_id, commit_id, target, type, payload, author_id)
label_schema(id, dataset_id, spec) label(id, schema_id, value)
change_proposal(id, dataset_id, src_ref, dst_ref, status) review, comment
issue(id, dataset_id, …) issue_comment
release(id, dataset_id, tag, notes, doi?)
pipeline(id, dataset_id, spec) pipeline_run(id, pipeline_id, commit_id, status, artifacts) runner
webhook webhook_delivery
audit_log(id, actor_id, action, object_type, object_id, dataset_id, ip, purpose, created_at) -- append-only
access_request, data_use_agreement
phi_map(dataset_id, original_ref, pseudonym, …) -- encrypted, restricted, audited
```
---
## 7. Key flows
1. **Ingest & de-identify:** upload → blobs stored (deduped) → metadata extracted
→ de-id pipeline → new commit on `deidentified` branch → indexed → audited.
2. **Browse & view:** datasets list → dataset → series list → Cornerstone3D/NiiVue
streams frames → annotation overlays.
3. **Curate an ML subset (zero-copy):** faceted query → new branch/dataset whose
manifest *references existing blobs* (no data copied) → commit → tag a release
→ `sdk.pull(tag)` in training.
4. **Propose a change (PR):** push new/relabeled data to a branch → open Change
Proposal → reviewer sees dataset diff + image diff → approve → merge.
5. **Reproducible training:** tag triggers a pipeline that pins the version hash,
runs MONAI train/eval, and links metrics + model artifact to that exact data
version.
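
The merge step in flow 4 can be illustrated with the three-way path-level rule from §5.1. This is a simplified sketch: manifests are reduced to `path → blob_hash` maps, deletions are modeled as absent keys, and semantic label merging is out of scope.

```python
def merge_manifests(base: dict, ours: dict, theirs: dict):
    """Three-way merge of path → blob_hash maps.

    Returns (merged, conflicts). A path conflicts when both sides changed it
    relative to base and ended up with different results.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            winner = o          # both sides agree (including both deleted)
        elif o == b:
            winner = t          # only theirs changed this path
        elif t == b:
            winner = o          # only ours changed this path
        else:
            conflicts.append(path)
            continue
        if winner is not None:  # None means the path was deleted
            merged[path] = winner
    return merged, conflicts


base = {"a.dcm": "h1", "b.dcm": "h2", "c.dcm": "h3"}
ours = {"a.dcm": "h1-relabeled", "b.dcm": "h2", "c.dcm": "h3-ours"}
theirs = {"a.dcm": "h1", "c.dcm": "h3-theirs", "d.dcm": "h4"}  # b.dcm deleted

merged, conflicts = merge_manifests(base, ours, theirs)
print(merged, conflicts)
```

Here `a.dcm` takes our relabel, `b.dcm` is dropped because only their side deleted it, `d.dcm` is added from their side, and `c.dcm` is reported as a conflict for the reviewer to resolve.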
---
## 8. Deployment topology
```
┌──────────── reverse proxy (Caddy/Traefik) + TLS ────────────┐
│ │
┌────────▼────────┐ ┌──────────────────┐ ┌───────────────────────▼─┐
│ Core app (N×) │ │ Worker tier (M×) │ │ Runners (K×, GPU opt.) │
│ FastAPI/uvicorn│ │ Arq + imaging/ML │ │ pipelines (train/eval) │
└───┬─────┬───┬───┘ └───┬─────────┬─────┘ └────────────┬────────────┘
│ │ │ │ │ │
┌───▼─┐ ┌─▼─┐ │ ┌───▼───┐ ┌───▼────────┐ ┌────▼────┐
│ PG │ │Redis│ └──────▶│ Redis │ │ Object store│◀────────┤Object st.│
│(state│ │queue│ │ queue │ │ S3 / MinIO │ │ (blobs) │
│ refs)│ │cache│ └───────┘ │ (blobs) │ └─────────┘
└─────┘ └────┘ └─────────────┘
┌───────────────┐
│ OpenSearch │ (metadata/label search)
└───────────────┘
```
- **Core app**: stateless, horizontally scalable.
- **Worker tier**: scales independently; CPU for de-id/convert, GPU for ML.
- **Postgres**: state, refs, metadata, audit. **Redis**: queue, cache, sessions,
server-sent events. **Object storage**: all blobs. **OpenSearch**: search.
- **Dev / small self-host**: a single `docker-compose` (app + worker + PG + Redis
+ MinIO + OpenSearch). **Scale**: Kubernetes with separate node pools.
- Contrast with Gitea (one binary, in-proc workers): we externalize workers and
object storage because imaging/ML work is heavy, Python-bound, and GPU-hungry.
---
## 9. Build-vs-buy summary
| Component | Recommendation |
|---|---|
| Versioning engine | **Build** the manifest/commit model (custom) — or back it with **lakeFS/DVC** behind your API to ship faster. |
| Viewer | **Adopt** Cornerstone3D + NiiVue (+ OpenSlide for WSI). Don't build. |
| De-identification | **Assemble** from pydicom + `deid`/CTP rules + Presidio + OCR. Don't build from scratch. |
| Search | **Postgres FTS** first → **OpenSearch** at scale. |
| Auth | **Authlib** (OIDC). |
| Queue | **Arq** (async) or **Celery**. |
| Object storage | **MinIO** self-host / **S3** cloud. |
---
## 10. MVP-first roadmap
Ordered for the chosen must-haves (versioning + viewer + de-id + audit):
- **Phase 0 — Skeleton.** Layered project structure, config, Postgres + Alembic,
object-storage driver, auth (user/org/team), dataset CRUD.
- **Phase 1 — Versioning engine.** Blobs, manifests, commits, branches; push/pull
via CLI + SDK; dataset diff. *(This is the product's spine — invest here.)*
- **Phase 2 — Ingestion + de-id + audit.** Worker tier, metadata extraction,
de-identification pipeline, append-only audit log. *(The compliance core.)*
- **Phase 3 — Viewer + search.** Cornerstone3D/NiiVue widgets, thumbnails,
faceted metadata search, browse UI.
- **Phase 4 — Collaboration.** Change Proposals, reviews, issues, annotations,
citable releases, datasheets.
- **Phase 5 — Pipelines.** Runners, event triggers, reproducible MONAI train/eval,
webhooks.
- **Later / optional.** DICOMweb + PACS adapter (QIDO/WADO/STOW), image-embedding
similarity search (pgvector), whole-slide pathology.
---
## Appendix — naming parallels for orientation
`git clone` → `imagehub clone` · repository → dataset · commit → version ·
push/pull → push/pull · PR → change proposal · `.git/objects` → content-addressed
blob store · act_runner → pipeline runner · `app.ini` → config · XORM → SQLAlchemy.
@@ -0,0 +1,101 @@
# Audit Log Manager — Implementation notes (review & debug)
This document describes the **Audit Log Manager** delivered in this repo: database schema, backend recording, admin API, frontend layout, and how to verify end-to-end (Postgres, API, MinIO).
## 1. Database
- **Migration:** `be0/migrations/008_audit_events.sql`
- Creates PostgreSQL enum **`audit_action`** and table **`audit_events`** (append-only by convention).
- Indexes: actor+time, entity+time, action+time, GIN on `metadata`.
**Apply** (example against local Docker Postgres — adjust connection):
```bash
psql "$INITIATIVE_DATABASE_URL" -f be0/migrations/008_audit_events.sql
```
(`INITIATIVE_DATABASE_URL` is typically `postgresql://…` for `psql`; the app uses SQLAlchemy async URL like `postgresql+asyncpg://…`.)
Docker Compose mounts `008_audit_events.sql` into `docker-entrypoint-initdb.d` for **new** databases only. **Existing** `initiative_pg_data` volumes still need the `psql -f …/008_audit_events.sql` step once.
## 2. Backend model & helpers
| Piece | Path |
|-------|------|
| ORM model | `be0/src/initiative_db/models.py` → `AuditEvent` |
| Helpers | `be0/src/audit.py` → `record_audit`, `persist_audit_standalone`, `resolve_actor_fields` |
- **`await record_audit(session, …)`** — insert via a **SAVEPOINT**: if `audit_events` is missing (migration not applied), logs a warning and **does not roll back** the parent transaction (so login/register still succeed).
- **`await persist_audit_standalone(…)`** — own session + `commit`; same missing-table handling without raising.
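The SAVEPOINT behaviour described above can be sketched as follows. This is a minimal illustration that assumes the standard-library `sqlite3` driver instead of the project's async SQLAlchemy session, so the function bodies and the `users` table are hypothetical, not the code in `be0/src/audit.py`. In PostgreSQL the savepoint matters more than it does here, because a failed statement aborts the whole transaction until it is rolled back to a savepoint.

```python
import logging
import sqlite3

log = logging.getLogger("audit")


def record_audit(conn: sqlite3.Connection, action: str, entity_type: str) -> bool:
    """Insert an audit row inside a SAVEPOINT; never break the caller's transaction."""
    conn.execute("SAVEPOINT audit_insert")
    try:
        conn.execute(
            "INSERT INTO audit_events (action, entity_type) VALUES (?, ?)",
            (action, entity_type),
        )
    except sqlite3.OperationalError as exc:  # e.g. "no such table: audit_events"
        # Undo only the audit attempt; the parent transaction stays usable.
        conn.execute("ROLLBACK TO SAVEPOINT audit_insert")
        conn.execute("RELEASE SAVEPOINT audit_insert")
        log.warning("audit insert skipped: %s", exc)
        return False
    conn.execute("RELEASE SAVEPOINT audit_insert")
    return True


def register_user(conn: sqlite3.Connection, email: str) -> bool:
    """Hypothetical caller: the user row commits whether or not the audit row did."""
    conn.execute("BEGIN")
    conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    audited = record_audit(conn, "create", "user")
    conn.execute("COMMIT")
    return audited
```

The connection is expected to be opened with `isolation_level=None` so that `BEGIN`, `SAVEPOINT`, and `COMMIT` are issued explicitly. With SQLAlchemy the equivalent construct would most likely be `session.begin_nested()`.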
## 3. Where events are written
| Area | Actions / entity_type | Notes |
|------|----------------------|--------|
| Auth | `register` → `create` / `user` | Same transaction as user + roles |
| Auth | OK login → `login` / `auth` | |
| Auth | Bad login → `login_failed` / `auth` | **Standalone** insert after failed credentials |
| Auth | Valid Bearer logout → `logout` / `auth` | **Standalone**; skipped if JWT does not decode |
| Auth | Profile patch → `update` / `user` | Before/after: `fullName`, `phone` |
| Auth | Password change → `update` / `user` | Snapshots `{ password: "[redacted|changed]" }` only |
| Drafts autosave | `update` / `application_draft` | When `owner_id` is known (authenticated saves) |
| MinIO evidence | `create`/`update`/`delete` / `application_evidence` | After DB + object write/delete in `main.py` |
| Staff evidence review | `update` / `application_evidence_review` | |
Admin adjudication (`application_admin_results.py`):
- `create` / `application_admin_result`
- `update` / …
- `upsert` → `create` or `update` depending on prior row
- `delete` — requires **`actor_user_id`**; `delete_admin_result(..., actor_user_id=…)` called from `main.py`
Legacy table **`audit_log`** (draft telemetry) remains unchanged.
## 4. Admin HTTP API
- **JWT decode** for admin routes (`decode_bearer_token`, `decode_access_token_user_id`) lives in **`be0/src/auth_jwt.py`** so audit routes do not pull in Argon2 at import time.
- **Router:** `be0/src/admin_audit_routes.py`, mounted in `main.py` as **`/api/v1/admin`**.
- **`GET /api/v1/admin/audit`** — list (admin JWT only).
- Default window: **now − 7 days** → **now** if `from` / `to` omitted.
- Query params: `from`, `to`, `actor_user_id`, `actor_email` **(exact, lowercased)**, `entity_type`, `entity_id`, `action` (comma-separated), `request_id`, `page`, `page_size` (≤ 100), `sort` (`occurred_at:asc|desc`).
- **`GET /api/v1/admin/audit/{id}`** — full row incl. `before` / `after` JSON.
Non-admin receives **403**.
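The parameter handling for `GET /api/v1/admin/audit` can be sketched as below. This is a hedged illustration, not the code in `admin_audit_routes.py`: `AuditQuery`, `build_audit_query`, and the default page size of 50 are hypothetical, and it assumes that out-of-range `page_size` values are clamped rather than rejected and that a missing `from` defaults to 7 days before the resolved `to`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Mapping, Optional, Tuple

MAX_PAGE_SIZE = 100
DEFAULT_PAGE_SIZE = 50  # assumption: the source only states the cap of 100
DEFAULT_WINDOW = timedelta(days=7)


@dataclass(frozen=True)
class AuditQuery:
    start: datetime
    end: datetime
    actions: Tuple[str, ...]
    page: int
    page_size: int
    descending: bool


def _parse_ts(value: str) -> datetime:
    # Note: before Python 3.11, fromisoformat() does not accept a trailing "Z".
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        raise ValueError("timestamp must carry a UTC offset")
    return parsed


def build_audit_query(params: Mapping[str, str], now: Optional[datetime] = None) -> AuditQuery:
    now = now or datetime.now(timezone.utc)
    end = _parse_ts(params["to"]) if params.get("to") else now
    start = _parse_ts(params["from"]) if params.get("from") else end - DEFAULT_WINDOW
    if start > end:
        raise ValueError("`from` must not be later than `to`")

    # `action` is comma-separated; empty fragments are ignored.
    actions = tuple(a.strip() for a in params.get("action", "").split(",") if a.strip())

    field, _, direction = params.get("sort", "occurred_at:desc").partition(":")
    if field != "occurred_at" or direction not in ("asc", "desc"):
        raise ValueError("sort must be occurred_at:asc or occurred_at:desc")

    page = max(int(params.get("page", 1)), 1)
    page_size = min(max(int(params.get("page_size", DEFAULT_PAGE_SIZE)), 1), MAX_PAGE_SIZE)
    return AuditQuery(start, end, actions, page, page_size, direction == "desc")
```

Passing `now` explicitly keeps the default window testable; a route handler would typically omit it.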
## 5. Frontend
| Layer | Path |
|-------|------|
| Shared types + API client | `fe0/src/audit/` |
| Admin UI | `fe0/src/admin/audit/` → `AuditLogManagerPage`, filters, table, detail **Sheet** |
| Applicant-side copy (reuse) | `fe0/src/applicant/audit/actionLabels.ts` — Vietnamese labels for actions |
| Nav | **`fe0/src/components/admin/DashboardSidebar.tsx`** (+ duplicate `DashboardSidebar.tsx`) — **« Nhật ký Audit »** → `/dashboard/admin/audit` |
| Routes | `fe0/src/App.tsx` → **`admin/audit` before `admin`** nested under dashboard |
Detail panel loads **`/api/v1/admin/audit/{id}`** and runs **`microdiff`** on the client only (see `audit/jsonMicrodiffLines.ts`), per architecture doc.
Dependency: **`microdiff`** (added to `fe0/package.json`).
## 6. Debugging checklist
1. **500 on `/auth/login` after adding audit** — Postgres log / SQLAlchemy showed `relation "audit_events" does not exist`: apply migration `008`. Until then, **`record_audit`** skips the insert and auth still works.
2. **`ECONNREFUSED` from `fe0` to `:4402`** — API not listening yet (`be0` still on NLTK pip, or old entrypoint failed on `ollama: command not found`). Rebuild `be0`; entrypoint now **skips Ollama** when the binary is absent. Wait for `Uvicorn running on http://0.0.0.0:4402` before logging in.
3. **Table missing / Admin audit 503** — run migration `008_audit_events.sql`.
4. **Empty list but events expected** — check `from`/`to`; default is last **7 days** on the API.
5. **403 on `/api/v1/admin/audit`** — JWT must include **`admin`** in `roles` (same as other admin APIs).
6. **Failed-login rows missing** — must use **`persist_audit_standalone`** path; if Postgres URL wrong, standalone insert is skipped (`is_postgres_enabled()`).
7. **MinIO uploads without audit rows** — evidence handler must reach `upsert_evidence_artifact` **after** successful `s3.upload`; failures before that do not audit.
8. **Comment typo broke build once:** `formatAuditTime.ts` JSDoc used invalid `**/”` sequence; stick to **` */`** closing block comments.
## 7. Suggested verification flow
1. Apply migration → restart API.
2. Register / login → see `create`/`login` rows (filter `action=login,create`).
3. Save applicant draft tab → `application_draft`.
4. Upload evidence with MinIO configured → `application_evidence` with `metadata.minioBucket`.
5. Open **`/dashboard/admin/audit`** as admin → pagination + row click opens JSON + microdiff.
## 8. References
- Spec: `assets/docs/audit-log-implementation.md`
- Product plan: `docs/audit-log-manager-plan.md`
@@ -0,0 +1,143 @@
# Audit Log Manager — Admin Planning Document
This plan describes the **Audit Log Manager**: an admin-only surface for tracing and monitoring **per-user activity** in the web application, with **time** as the primary navigation axis. It is derived from and must stay consistent with [`assets/docs/audit-log-implementation.md`](../assets/docs/audit-log-implementation.md) (schema, `record_audit`, role-based logging rules, and Step 6 admin viewer).
---
## 1. Goals
| Goal | Description |
|------|-------------|
| **Forensic traceability** | Answer: “What did user *U* do, and when?” and “What happened to entity *E* between *T₁* and *T₂*?” |
| **Time-first exploration** | Default and sort order anchored on **`occurred_at`** (UTC stored as `TIMESTAMPTZ`; display in the admin's locale or a configured org timezone). |
| **Tamper-aware source** | Queries read from append-only `audit_events` (app role: `INSERT`/`SELECT` only per implementation guide). |
| **No scope creep** | v1 is **query + view**, not anomaly detection, not primary shipping to external log stacks (see implementation guide §9). |
---
## 2. Relationship to backend data model
The implementation guide defines a unified **`audit_events`** table keyed by time:
- **`occurred_at`** — canonical **timestamp** for all list, filter, and timeline views; indexed with `actor_user_id` and other dimensions.
- **Actor columns** — `actor_user_id`, `actor_email`, `actor_role` (denormalized at event time).
- **Target** — `action`, `entity_type`, `entity_id`, optional `before` / `after` JSONB snapshots, `metadata`, `request_id`.
**Note:** The repository may also contain legacy `audit_log` / trigger-based patterns. The Audit Log Manager should target the **new** `audit_events` model from the implementation guide once migrated; until then, scope API field mapping explicitly so the UI contract does not depend on legacy tables.
---
## 3. Admin API design
### 3.1 Endpoint
- **`GET /api/v1/admin/audit`** — admin role check on router **and** service layer (mirror implementation guide §8).
### 3.2 Query parameters (time + filters)
| Parameter | Purpose |
|-----------|---------|
| **`from`** | Inclusive lower bound on `occurred_at` (ISO-8601 / RFC3339, timezone-aware). |
| **`to`** | Exclusive or inclusive upper bound (pick one, document in OpenAPI; default: *inclusive* end-of-day if date-only strings are allowed). |
| **`actor_user_id`** | Filter by user UUID (strong identifier for “this user's timeline”). |
| **`actor_email`** | Partial or exact match per product choice; prefer **exact** for predictable forensic use. |
| **`entity_type`**, **`entity_id`** | Narrow to one record's history. |
| **`action`** | One or many `audit_action` values (`create`, `read`, `update`, `delete`, `login`, `logout`, `login_failed`). |
| **`request_id`** | Optional: correlate all events from one HTTP request. |
| **`page`**, **`page_size`** | Paginated results; cap `page_size` (e.g. 100) for performance. |
| **`sort`** | Default **`occurred_at,desc`**; optional `asc` for chronological “playback” within a window. |
### 3.3 Response shape
Each item should expose at minimum:
- `id`, `occurred_at`, `actor_user_id`, `actor_email`, `actor_role`, `action`, `entity_type`, `entity_id`, `metadata`, `request_id`, and flags or URLs indicating presence of `before` / `after` (full JSON may be omitted from list rows and loaded on detail fetch if payloads are large).
Optional follow-up:
- **`GET /api/v1/admin/audit/{id}`** — full row including `before` / `after` for the detail panel.
### 3.4 Query strategy
- Always constrain by **`occurred_at`** when the admin does not pass `from`/`to` — e.g. default **last 24 hours** or **last 7 days** to avoid full table scans.
- Use existing indexes: `(actor_user_id, occurred_at DESC)`, `(entity_type, entity_id, occurred_at DESC)`, `(action, occurred_at DESC)`.
---
## 4. Admin UI — Audit Log Manager
### 4.1 Placement and access
- Route: **`/dashboard/admin/audit`** (or equivalent admin layout path).
- Wrapped in **`ProtectedRoute`** (or equivalent) **admin-only**; no feature flags that downgrade the check in production.
### 4.2 Layout
1. **Filter bar** — time range, user (email or ID), entity type/id, action multiselect, optional free-text on `metadata` paths (defer to v2 unless trivial; GIN on `metadata` supports it later).
2. **Results table** — columns: **Time** (`occurred_at`), **Actor** (email + role), **Action**, **Entity** (type + id), **Summary** (one-line from metadata or action), **Request** (link/filter by `request_id` if present).
3. **Detail drawer / panel** — on row click: show full metadata, **`before` / `after`** side-by-side; **diff in the browser** with a JSON diff library (per implementation guide — do not compute diff on server).
### 4.3 Time-centric UX presets
Presets speed up monitoring:
- Last 24 hours, last 7 days, last 30 days, custom range (date-time pickers with timezone label).
- **“This user's activity”** deep link: `/dashboard/admin/audit?actor_user_id=…&from=…&to=…`.
### 4.4 Monitoring-oriented views (v1 vs later)
| View | v1 | Later |
|------|----|--------|
| Paginated event list with time filters | Yes | — |
| Sort by `occurred_at` | Yes | — |
| Per-user timeline (same data, fixed `actor_user_id`) | Yes | Optional density / grouping by day |
| Per-entity history | Yes | — |
| Export (CSV/JSON) for compliance | Optional | Regulated customers often need it |
| Live / websocket tail | No | Only if product requires real-time |
---
## 5. Timestamp semantics
- **Storage:** `TIMESTAMPTZ` in PostgreSQL; all APIs use ISO-8601 with offset or `Z`.
- **Display:** Convert to admin-visible timezone; show UTC in tooltip or secondary row for disputes.
- **Day boundaries:** If the UI sends date-only `from`/`to`, define explicitly (e.g. start of day / end of day in chosen timezone, then convert to UTC for the query).
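The day-boundary rule can be made concrete with a short sketch. It assumes a fixed UTC+7 organisation timezone and a half-open `[start, end)` interval, which is only one of the options the plan leaves open; `ORG_TZ` and `day_range_to_utc` are hypothetical names.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import Tuple

# Assumption: a fixed UTC+7 offset stands in for the configured org timezone.
ORG_TZ = timezone(timedelta(hours=7))


def day_range_to_utc(from_day: date, to_day: date, tz: timezone = ORG_TZ) -> Tuple[datetime, datetime]:
    """Map date-only bounds to a half-open UTC interval [start, end).

    `to_day` is treated as inclusive, so the exclusive end is local midnight
    at the start of the following day.
    """
    start_local = datetime.combine(from_day, time.min, tzinfo=tz)
    end_local = datetime.combine(to_day + timedelta(days=1), time.min, tzinfo=tz)
    return start_local.astimezone(timezone.utc), end_local.astimezone(timezone.utc)
```

A query would then filter with `occurred_at >= start AND occurred_at < end`, which avoids guessing at an "end of day" timestamp such as 23:59:59.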
---
## 6. Privacy, security, and correctness
- **Admin-only** on UI and API; verify with automated tests.
- **Do not** log secrets in `before`/`after` (enforce `to_dict()` hygiene per implementation guide §5.1 and §8).
- **Failed login** events: `actor_user_id` null, `actor_email` = attempted address; UI must not imply existence of accounts beyond what policy allows (align with auth UX).
- **Read volume:** Applicant read-audit is mostly suppressed by design; admins see **all** reads they are allowed to see — train admins that “quiet” applicant timelines are expected.
---
## 7. Implementation phases (aligned with implementation guide §7)
| Phase | Deliverable for Audit Log Manager |
|-------|-----------------------------------|
| **After Step 1** | Stub UI + API returns empty or mock; contract freeze. |
| **After Steps 2–4** | Real data for auth, user admin, initiative writes; filters and table usable for forensic queries. |
| **After Step 5** | Read events appear per role rules; admin sees broad read coverage. |
| **Step 6 (shipping target)** | Full **Audit Log Manager** as specified here: filters, pagination, detail panel, time presets. |
| **Step 7** | Retention/partitioning — Manager may add UI notice “data older than X may be purged per policy.” |
---
## 8. Acceptance criteria (Audit Log Manager)
- [ ] A non-admin cannot access `/api/v1/admin/audit` (403).
- [ ] Listing is **ordered by `occurred_at`** descending by default; ascending works for a bounded window.
- [ ] Filtering by **`from` / `to`** reduces rows correctly and uses indexed columns.
- [ ] Filtering by **`actor_user_id`** or **`actor_email`** returns only that actor's events.
- [ ] Row detail shows **`before` / `after`** without server-side diff; client-side diff renders correctly for large JSON (smoke test with sample initiative snapshot).
- [ ] Time preset “last 7 days” matches server interpretation (document timezone).
- [ ] No secrets appear in exported or displayed JSON (spot-check against `to_dict()` tests).
---
## 9. References
- Primary spec: [`assets/docs/audit-log-implementation.md`](../assets/docs/audit-log-implementation.md) — sections 2 (schema), 4 (`record_audit`), 5 (where writes happen), 6 (what to log per role), 7 Step 6 (admin viewer), 8 (verification).
@@ -0,0 +1,62 @@
# Evaluation: Auth, Registration, Roles, and User Management
## The most important finding first
The current registration endpoint trusts a client-supplied `role`. This isn't a gap; it's a privilege escalation. Anyone who can hit `/api/v1/auth/register` can `POST` `role: "admin"` and become a Quản trị viên. The only constraint is that the email must pass the `@ump.edu.vn` regex. Fix this before anything else; the rest of the cleanup is hygiene by comparison.
## Current implementation
**Email policy is wrong on both ends.** `_normalize_ump_email` and the client `validateUmpEmail` both reject `@umc.edu.vn`. Two places to fix, and the server is authoritative — the client check is UX only.
**The login role picker is conceptually inverted.** The spec says role flows from server to client based on identity. Today, `Login.tsx` makes the user pick `loginRole` and `buildUserWithSelectedRole` then *validates* the pick against DB roles, failing with a Vietnamese error if it doesn't match. That's not "no role picker" — it's a worse version of one, because legitimate users get an unhelpful error when they pick the wrong row in a Select they shouldn't see at all.
**`buildUserWithSelectedRole` after registration is circular.** The client picks a role, the server stores it (it shouldn't), and then the client confirms the role it just sent. The only reason this "works" is that the registration vulnerability lets the client decide. Once registration is fixed to derive the role server-side, this call needs to disappear or become "trust the server's `roles[]`."
**Enum drift in PostgreSQL.** `user_role` defines `applicant`, `council_member`, `editor`, `admin`, `viewer` — five values — but the auth code only ever writes/reads three. The doc itself notes that `applicant` maps to `viewer` "in the app." That's a design smell: either the DB has two pairs of synonyms (in which case drop two), or the Vietnamese taxonomy actually has finer distinctions the code is collapsing. Whichever it is, decide and migrate. Living with both is how you end up with users you can't query consistently.
**Sidebar dead link.** `/dashboard/users` is linked from `DashboardSidebar.tsx` but no route handles it, so it falls through to the 404. Admins clicking this today get nothing.
**JWT staleness.** Roles are baked into the JWT at issue time. When admin promotes a viewer to Hội đồng (once that flow exists), the change won't take effect until refresh. Acceptable, but the new admin UI should make this visible — either auto-refresh affected sessions or document the lag.
## The proposed plan
The five suggested directions are pointed in the right direction but underspecified in places that will matter during implementation.
**(1) Shared allow-list — fine, but the server is the only one that has to be right.** The frontend regex is a UX nicety; treat it as such. Don't overengineer "sharing" — duplicating a small list in two places is cheaper than building a config delivery mechanism for two strings.
**(2) Derive role on register — yes, and also reconcile on login.** The plan says "optionally" reconcile on login. Make it non-optional. If the admin email list changes (someone added, someone removed), you don't want stale `user_roles` rows to outlive the policy. A simple rule: on every login and refresh, if the email is in the admin list and the user lacks `admin`, add it; if the email is *not* in the list and the user has `admin` *granted by the email rule*, remove it. That last clause matters — you don't want to wipe admin grants made through future UI tooling.
**Hardcoding the five emails in source is brittle.** People leave institutions. Put them in environment config or a seeded DB table (`admin_emails`) so ops can change them without a deploy. A DB table also gives you an audit trail.
**(3) Remove the role selectors — yes, but resolve the multi-role question.** The DB supports multiple `user_roles` rows per user; the UI has been pretending one is "active" via `localStorage['auth-active-role']`. Once registration assigns exactly one role and login stops asking, can a user ever have more than one role? If yes (e.g., a Hội đồng member who's also an admin), you need a deterministic rule for `currentRole` — highest-privilege wins is the usual answer. If no, simplify the data model and stop reading `user_roles` as a list.
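A "highest-privilege wins" rule is small enough to sketch. The ordering `admin` > `editor` > `viewer` is an assumption, and `ROLE_RANK` and `primary_role` are hypothetical names. The sketch is in Python for consistency with the backend; the same ordering would apply wherever `currentRole` is resolved in `fe0`.

```python
from typing import Iterable

# Assumption: privilege order is viewer < editor < admin.
ROLE_RANK = {"viewer": 0, "editor": 1, "admin": 2}


def primary_role(roles: Iterable[str]) -> str:
    """Pick one deterministic active role from a user's role list."""
    known = [role for role in roles if role in ROLE_RANK]
    if not known:
        raise ValueError("user has no recognised role")
    return max(known, key=ROLE_RANK.__getitem__)
```

Unknown values such as the legacy `applicant` and `council_member` enum members are ignored here, which is itself a decision that the enum cleanup would need to confirm.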
**(4) Admin API — the plan stops at read.** "List users + roles" satisfies the literal text of the requirement, but the requirement is incoherent without writes. The product says admins should see who is Người nộp đơn vs Hội đồng. How does anyone *become* Hội đồng? Not through registration (everyone defaults to viewer), not through the email list (that's admin only). You need at least:
- `POST /api/v1/admin/users/:id/roles` to grant Hội đồng
- `DELETE /api/v1/admin/users/:id/roles/:role` to revoke it
- Probably `PATCH /api/v1/admin/users/:id` for `is_active` (deactivate without deletion)
Without these, the UserManagement page is a read-only museum and there's no path for a viewer to ever become a council member.
**(5) UserManagement page — same issue.** Plan it as a CRUD-capable surface from day one. Listing without ability to act means admins will ask for "edit" two weeks after launch and you'll rebuild half of it.
## What the plan doesn't mention but should
**Data migration.** Right now there are presumably users in `user_roles` whose `role` was set by the broken registration flow. After fix, you need a one-time migration: for each user, if their email is in the admin list set `admin`; otherwise set `viewer` (and decide what to do with existing `editor` rows — preserve as Hội đồng, or wipe and require re-grant). Document this; don't let it get discovered in production.
**The `applicant` / `council_member` enum values.** Decide their fate as part of this work. If the answer is "they're aliases for viewer/editor," write a migration that consolidates and drops them from the enum. If the answer is "they're the real names and viewer/editor were a mistake," do the inverse. Don't ship the auth refactor while leaving five enum values where three are real.
**Rate limiting on register.** Once role assignment is server-derived, registration becomes lower-risk, but it's still an unauthenticated write endpoint. If you don't already, add basic rate limiting per IP — easy to forget when you're focused on the role logic.
**Tests for the email allow-list and admin derivation.** These are exactly the rules that will silently regress when someone touches the regex six months from now. Worth a small table-driven test suite: each of the five admin emails → admin; a non-admin `@ump.edu.vn` → viewer; a non-admin `@umc.edu.vn` → viewer; a `@gmail.com` → 400; case and whitespace variants → normalized.
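A table-driven check along these lines might look like the sketch below. The regex, the local-part rules, the single placeholder admin address, and the function names are all assumptions; the real five admin emails and `_normalize_ump_email` are not reproduced here, and `ValueError` stands in for the HTTP 400.

```python
import re

# Assumption: a simple local-part pattern; the real rules live in be0/src/auth_api.py.
ALLOWED_EMAIL_RE = re.compile(r"[a-z0-9._%+-]+@(ump|umc)\.edu\.vn")

# Hypothetical placeholder; the real list should come from config or a DB table.
ADMIN_EMAILS = frozenset({"admin.one@ump.edu.vn"})


def normalize_institutional_email(raw: str) -> str:
    """Trim, lowercase, and reject anything outside the two allowed domains."""
    email = raw.strip().lower()
    if not ALLOWED_EMAIL_RE.fullmatch(email):
        raise ValueError("email must end in @ump.edu.vn or @umc.edu.vn")
    return email


def derive_role(raw_email: str, admin_emails: frozenset = ADMIN_EMAILS) -> str:
    """Server-side role derivation: the client never supplies a role."""
    email = normalize_institutional_email(raw_email)
    return "admin" if email in admin_emails else "viewer"


# (input, expected role) pairs; None means "rejected".
CASES = [
    ("admin.one@ump.edu.vn", "admin"),
    ("  Admin.One@UMP.edu.vn ", "admin"),       # case and whitespace variants
    ("student@ump.edu.vn", "viewer"),
    ("student@umc.edu.vn", "viewer"),
    ("someone@gmail.com", None),
    ("student@ump.edu.vn.evil.com", None),      # suffix trick must not pass
]


def run_cases() -> None:
    for raw, expected in CASES:
        try:
            got = derive_role(raw)
        except ValueError:
            got = None
        assert got == expected, (raw, got, expected)
```

Using `fullmatch` rather than `search` is what rejects the suffix-trick case; that row is worth keeping in the table for exactly the regression described above.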
## Suggested order of work
1. Fix registration to ignore client role and derive from email. Ship this alone if you have to — it closes the privilege escalation.
2. Add `@umc.edu.vn` to the allow-list (server first, client second).
3. Remove role selectors from Login/Register UI; simplify `buildUserWithSelectedRole`.
4. Build the admin list/grant/revoke API with proper authz.
5. Build the UserManagement page against it.
6. Migration + enum cleanup.
Steps 1–3 are mostly deletion and should be small. Steps 4–5 are where the real new code lives.
@@ -0,0 +1,286 @@
# Refactor guide: auth, registration, login UI, and user management
This document is the **implementation spec** for refactoring authentication. It replaces an earlier “as-is only” description with **ordered instructions** informed by [`auth-implementation-feedback.md`](./auth-implementation-feedback.md). Use it when changing **`fe0` login/registration UI** (`Login.tsx`, `AuthContext`, `auth-service`) and the supporting **`be0`** + **PostgreSQL** behavior.
---
## 1. Non-negotiable: security
**Blocker:** The registration endpoint **`POST /api/v1/auth/register`** currently persists **`UserRoleRow(role=body.role)`** where `role` is **client-supplied**. That is **privilege escalation**: anyone who can call the API with a valid `@ump.edu.vn`-style address can send `"admin"` and become **Quản trị viên**. Email validation is not a substitute for server-side role authority.
**Instruction:** Before any cosmetic UI work, **stop trusting `role` from the client on register**. Derive roles **only** on the server (see §5). Treat closing this hole as **P0**; remaining items are hygiene by comparison.
---
## 2. Product rules (target behavior)
1. **Email domains:** `{name}@umc.edu.vn` **or** `{name}@ump.edu.vn` only (normalize: trim, lowercase; **server is authoritative**; client validation is UX only).
2. **Roles from identity (no user-facing role picker on login or register):**
- A configurable set of emails maps to **Quản trị viên** (`admin` in `fe0/src/lib/permissions.ts`).
- Every other allowed email gets **Người nộp đơn** (`viewer`) on **first registration**.
- **Hội đồng** (`editor`) is **not** self-service: grant/revoke via **admin UserManagement** (API), not registration.
3. **User management:** Admins can **list** users and roles, **grant** Hội đồng, **revoke** it, and **deactivate** accounts as needed. The page must be **CRUD-capable from day one** (read-only listing alone is insufficient).
---
## 3. Current implementation (baseline to remove or replace)
### 3.1 Backend (`be0/src/auth_api.py`)
- **`_normalize_ump_email` / `UMP_EMAIL_RE`:** only `@ump.edu.vn`; **reject** `@umc.edu.vn` today.
- **Register:** inserts `User` + `UserRoleRow(role=body.role)`: **trusted client role (bug).**
- **Login:** returns `roles` from DB; **no** role in HTTP body — correct for API shape, but the **frontend** then forces a **role Select** and `buildUserWithSelectedRole(user, selectedRole)`, which **fails** if the user picks the wrong row. That is **not** “no role picker”; it is an **inverted** UX (identity should drive role; users should not see a selector).
### 3.2 Frontend
- **`fe0/src/pages/Login.tsx`:** `validateUmpEmail` — UMP only; **`loginRole` / `regRole`** Selects; register sends **`role`** in JSON.
- **`fe0/src/contexts/AuthContext.tsx`:** After login/register, **`buildUserWithSelectedRole(..., selectedRole)`** (and after register, **`payload.role`**). This is **circular** with today's register bug: client chooses → server stores → client “confirms” the same choice. After server-derived roles, this must become **trust `user.roles` from the API** (see §6).
- **`fe0/src/lib/auth-service.ts`:** Register payload includes **`role`** — remove once API ignores it.
- **`/dashboard/users`:** linked from `DashboardSidebar.tsx` / `admin/DashboardSidebar.tsx` but **no route** in `fe0/src/app/router/routes.tsx` → **404**.
### 3.3 Database
- **`user_role` enum** (`001_initiative_schema.sql`): `applicant`, `council_member`, `editor`, `admin`, `viewer` (**five values**); auth code only reads/writes **three**. Collapsing `applicant` → `viewer` and `council_member` → `editor` (or standardizing on one pair) is a **design decision**; do not leave five values with only three “real” forever (see §8).
### 3.4 JWT
- Roles are **embedded in the JWT** at issue time. If an admin **changes** `user_roles` in DB, the user sees new permissions only after **refresh** (or re-login). **Acceptable**, but: surface this in **UserManagement** or docs (e.g. “changes apply on next refresh”) and optionally trigger refresh after admin actions on the **same** browser session if you add that flow later.
---
## 4. Configuration: admin emails (avoid brittle hardcoding)
**Feedback:** Hardcoding the five institutional emails in source is brittle.
**Instruction:** Load the admin allow-list from **environment** (e.g. comma-separated `AUTH_ADMIN_EMAILS`) **and/or** a **seeded DB table** (e.g. `admin_emails` with audit columns). The server's derivation logic must use **one** resolved list (env merged with DB, or DB only after migration; pick one approach and document it).
Frontend must **not** duplicate the list for **authorization**; at most duplicate **domain regex** for UX (§5.1).
---
## 5. Backend refactor instructions (`be0`)
### 5.1 Email allow-list
- Replace “UMP-only” normalization with a function that accepts **`@ump.edu.vn`** **or** **`@umc.edu.vn`** (same local-part rules as today unless product says otherwise).
- **Reject** everything else with **400** and a clear message.
- **Ship server change before or with** client regex update; server must stay correct if the client is bypassed.
### 5.2 Registration: derive role; ignore client `role`
- **Remove** `role` from the public contract **or** accept but **ignore** it (document deprecation); **never** insert `UserRoleRow` from the client.
- On register: if normalized email ∈ admin list → ensure **`admin`** in `user_roles` (and **not** `viewer`-only self-service for those accounts — product: admins are allow-listed); else → insert **`viewer`** only.
- **Do not** assign **`editor`** at registration.
### 5.3 Login and refresh: **mandatory** role reconciliation
**Feedback:** “Optionally reconcile on login” is too weak; **make it mandatory**.
On **every** `login` and **`refresh`** (and ideally before issuing JWT in those handlers):
1. **Admin by email rule:** If email ∈ admin list and user **lacks** `admin` → **add** `admin` row in `user_roles`.
2. **Removal:** If email **∉** admin list and the user's `admin` was **only** from the email rule → **remove** `admin`.
**Critical:** If you later add **`admin` grants via UserManagement** (not email-derived), you **must not** strip those when reconciling. Implement **one** of:
- **Two sources of truth:** e.g. column `users.is_admin_by_policy BOOLEAN` vs `admin` from grants table; or
- **Tag rows:** e.g. separate table `user_role_grants(source='email_policy'|'admin_ui')` before mutating; or
- **Rule:** only auto-remove `admin` if it was created by the policy sync marker you define.
Document the chosen rule in code comments and in this doc.
3. **Default applicant:** If user has **no** `viewer` and is not only admin-only (decide product: admins also `viewer` or not), apply your migration policy — usually new users get `viewer`; existing users need a **data migration** (§8).
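The reconciliation rules above can be sketched with the "tag rows" option, where every grant records its source. This is an in-memory illustration only: `UserRoles`, `reconcile_admin`, and the source labels are hypothetical stand-ins for whatever table or column the implementation chooses.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set

EMAIL_POLICY = "email_policy"  # grant created by the admin email allow-list
ADMIN_UI = "admin_ui"          # grant created by an admin through UserManagement


@dataclass
class UserRoles:
    email: str
    # role name -> set of sources that granted it
    grants: Dict[str, Set[str]] = field(default_factory=dict)


def reconcile_admin(user: UserRoles, admin_emails: FrozenSet[str]) -> List[str]:
    """Run on every login/refresh, before the JWT is issued; returns current roles."""
    if user.email in admin_emails:
        # Rule 1: allow-listed email always carries a policy-sourced admin grant.
        user.grants.setdefault("admin", set()).add(EMAIL_POLICY)
    else:
        sources = user.grants.get("admin", set())
        if EMAIL_POLICY in sources:
            # Rule 2: drop only the policy-sourced grant; UI grants survive.
            sources.discard(EMAIL_POLICY)
            if not sources:
                del user.grants["admin"]
    return sorted(user.grants)
```

Because removal touches only the `email_policy` source, an admin grant made through the UI is never stripped by the sync, which is the property the **Critical** note asks for.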
### 5.4 Rate limiting
- Add **basic rate limiting** on **`/auth/register`** (per IP or per email) once role assignment is fixed — still an unauthenticated write.
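A basic limiter for `/auth/register` could be sketched as a per-key sliding window. This is an assumption-laden illustration, not the contents of `be0/src/auth_rate_limit.py`: the class name and limits are hypothetical, and the in-memory state only works for a single process.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_hits` calls per key within the last `window_seconds`."""

    def __init__(self, max_hits: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_hits = max_hits
        self.window = window_seconds
        self.clock = clock  # injectable so tests do not need to sleep
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_hits:
            return False
        hits.append(now)
        return True
```

A handler would call something like `limiter.allow(client_ip)` and return 429 on `False`. With several API workers the counters would need a shared store such as the Redis instance already in the stack.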
### 5.5 Admin API (read **and** write)
Minimum surface for a real UserManagement product:
| Method | Path (example) | Purpose |
|--------|----------------|---------|
| `GET` | `/api/v1/admin/users` | List users: id, email, full_name, roles[], is_active, … |
| `POST` | `/api/v1/admin/users/:id/roles` | Grant **Hội đồng** (`editor`) — body may specify role |
| `DELETE` | `/api/v1/admin/users/:id/roles/:role` | Revoke `editor` (and optionally other roles per policy) |
| `PATCH` | `/api/v1/admin/users/:id` | e.g. `is_active` — deactivate without deleting |
Protect with existing admin checks (e.g. JWT must include `admin`, `_require_admin_user` pattern in `be0/main.py`).
**Why writes matter:** Viewers never become Hội đồng via registration or the email list; without grant/revoke APIs, the UserManagement page is a **read-only museum**.
### 5.6 Password reset (self-service)
- **`POST /api/v1/auth/forgot-password`** — body `{ "email" }`. Email is normalized with the same institutional rules as register. For valid domains, the JSON response is **always** the same generic success message whether or not the account exists (**enumeration-safe**). Bodies **must not** carry `role` or any privilege field (models use `extra="ignore"`).
- **`POST /api/v1/auth/reset-password`** — `{ "token", "newPassword", "newPasswordConfirm" }`. One-time token stored only as a **hash** in `password_reset_tokens`, short TTL (see `be0/src/auth_api.py`).
- **Rate limiting:** forgot (per normalized email + per client IP) and reset (per IP), implemented in `be0/src/auth_rate_limit.py`.
- **Outbound mail:** configure **`SMTP_HOST`**, **`SMTP_PORT`** (default 587), **`SMTP_USER`**, **`SMTP_PASSWORD`**, **`AUTH_MAIL_FROM`**, **`SMTP_USE_TLS`** (default on), or use **`AUTH_MAIL_LOG_ONLY=1`** to log reset links (development). Set **`AUTH_PUBLIC_WEB_ORIGIN`** or **`PUBLIC_WEB_ORIGIN`** so email links point at the SPA (default `http://localhost:8081`).
- **JWT `cv` (credential version):** column `users.credential_version` increments on every successful **`/change-password`** and **`/reset-password`**; middleware in `be0/main.py` plus **`/auth/refresh`** reject tokens whose `cv` no longer matches. Apply **`be0/migrations/012_password_reset.sql`**.
Admins **do not** set plaintext passwords; a future “Gửi email đặt lại mật khẩu” ("Send password reset email") action in UserManagement should call the same forgot-password logic server-side.
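The hash-only storage and short TTL described above can be sketched as follows. This is a minimal illustration, not the code in `be0/src/auth_api.py`: the function names, the 30-minute TTL, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Assumed value for illustration; the real TTL is defined in be0/src/auth_api.py.
RESET_TOKEN_TTL = timedelta(minutes=30)


def hash_token(raw_token: str) -> str:
    """Return the hex SHA-256 digest that would be stored instead of the raw token."""
    return hashlib.sha256(raw_token.encode("utf-8")).hexdigest()


def issue_reset_token(now: datetime) -> Tuple[str, str, datetime]:
    """Create a token: (raw value for the email link, hash for the DB, expiry time)."""
    raw = secrets.token_urlsafe(32)
    return raw, hash_token(raw), now + RESET_TOKEN_TTL


def verify_reset_token(raw_token: str, stored_hash: str,
                       expires_at: datetime, now: datetime) -> bool:
    """Accept only an unexpired token whose hash matches, compared in constant time."""
    if now >= expires_at:
        return False
    return hmac.compare_digest(hash_token(raw_token), stored_hash)
```

Only the hash would be written to `password_reset_tokens`, so a leaked table does not yield usable links. One-time use (deleting or marking the row after a successful reset) is not shown here and would still be needed.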
---
## 6. Frontend refactor instructions (`fe0`) — login components
### 6.1 `Login.tsx`
**Remove:**
- All **`<Vai trò đăng nhập>`** ("login role") / **`loginRole`** state and `Select`.
- All **`regRole`** state and registration **role** `Select`.
- Imports only used for role pickers (`ROLE_DISPLAY_NAMES`, role icons, extra `Select` pieces) — prune dead imports.
**Keep / adjust:**
- Email + password fields; copy should say **UMP or UMC** once regex allows both.
- Replace `validateUmpEmail` with something like `validateInstitutionalEmail` matching **both** domains (still **UX**; server validates for real).
**Behavior:**
- **`handleLogin`:** call `login(email, password)` **without** a role argument (update `AuthContext` signature — §6.2).
- **`handleRegister`:** call `register({ fullName, email, password, passwordConfirm })` **without** `role` (update types and `auth-service`).
**Post-login navigation:** Keep using `resolvePostLoginPath`, but pass the **resolved active role** from context (single derived role — §6.2), e.g. `resolvePostLoginPath(user.roles[0], fromPathname)` or a small helper `getPrimaryRole(user)`.
### 6.2 `AuthContext.tsx`
**`login`:** Change to `login(email, password)` only. After `authService.login`, build session user from **`result.user.roles` returned by the server** — **no** second argument from the UI.
**`register`:** Same: **no** `role` in payload; after success, **trust** `result.user.roles` from API.
**`buildUserWithSelectedRole`:** Refactor or replace:
- **Preferred:** `buildUserFromAuthPayload(authUser)` that sets **one** active role using a **deterministic rule**:
- If multiple roles exist (e.g. `admin` + `editor`), use **highest privilege** (e.g. `admin` > `editor` > `viewer`) for `user.roles` / permissions in the shell, **or**
- Keep `availableRoles` for a future **internal** switcher only if product requires it — but **not** on the login screen.
- **Remove** reliance on `localStorage['auth-active-role']` for **login/register** flows unless you keep it **only** for intentional in-app role switching between **already granted** roles (optional follow-up).
- Eliminate error paths like “wrong role selected” for login — users never select.
**Session restore (`refreshSession`):** Same as login: **no** client-selected role; apply the same deterministic mapping from API `roles`.
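The deterministic multi-role rule can be illustrated with a short sketch. It is written in Python for consistency with the backend examples in this document; the real helper would live in `AuthContext.tsx`. The priority order follows the `admin` > `editor` > `viewer` example above, and the fallback to `viewer` for an empty role list is an assumption.

```python
from typing import List

# Highest privilege first, following the admin > editor > viewer example above.
ROLE_PRIORITY = ("admin", "editor", "viewer")


def get_primary_role(roles: List[str]) -> str:
    """Return the highest-privilege role present; unknown role names are ignored."""
    for candidate in ROLE_PRIORITY:
        if candidate in roles:
            return candidate
    # Assumption: an account with no recognized role falls back to least privilege.
    return "viewer"
```

Because the result depends only on the server-provided `roles`, login, register, and `refreshSession` would all resolve to the same active role without consulting `localStorage`.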
### 6.3 `auth-service.ts`
- **`register`:** Omit `role` from JSON body once API ignores it.
- Types: `AuthUser` unchanged if API still returns `roles: Role[]`.
### 6.4 Routes and UserManagement page
- Add a real route for **`/dashboard/users`** (or move link to **`/dashboard/admin/users`** and update sidebars consistently).
- Implement **`UserManagement`** with `ProtectedRoute` requiring **`admin.users`** (or equivalent): **table + actions** calling the admin API from §5.5.
- Plan the UI as **list + grant editor + revoke editor + deactivate** from **day one**.
### 6.5 Callers of `login` / `register`
- Grep for `login(` and `register(` across `fe0` (e.g. tests, `SignUpModal.tsx`) and update signatures.
---
## 7. Target data flow (after refactor)
```mermaid
sequenceDiagram
participant UI as Login.tsx
participant Ctx as AuthContext
participant API as auth-service.ts
participant BE as be0 auth_api
participant DB as PostgreSQL
Note over UI,DB: Register (no client role)
UI->>Ctx: register({ fullName, email, passwords })
Ctx->>API: POST /api/v1/auth/register (no role)
API->>BE: body without trusted role
BE->>BE: derive admin vs viewer from email list
BE->>DB: INSERT users + user_roles (server only)
BE-->>API: JWT + user.roles
Ctx->>Ctx: buildUserFromAuthPayload(user)
Note over UI,DB: Login (no role picker)
UI->>Ctx: login(email, password)
Ctx->>API: POST /api/v1/auth/login
API->>BE: email + password
BE->>DB: load user; reconcile policy roles
BE-->>API: JWT + user.roles
Ctx->>Ctx: buildUserFromAuthPayload(user)
```
---
## 8. Data migration and enum cleanup
**One-time migration** after fixing register:
- For each `users` row: if email ∈ admin list → ensure `admin` (per reconciliation rules); else if no explicit **editor** grant from admin tooling → normalize to **`viewer`** only as per product.
- **Existing `editor` rows:** **Preserve** as Hội đồng unless product says to wipe and re-grant.
- **Users who became `admin` via the old bug:** Migration should **align** with email policy: if not in admin list, **remove** spurious `admin` (with the same caution as §5.3 if you introduce UI-granted admin later).
**Enum (`applicant` / `council_member`):**
- Decide: **aliases** of `viewer` / `editor` vs **canonical** names.
- Ship a migration that **consolidates** rows and **narrows** the enum or renames consistently. Do **not** finish the auth refactor while five enum values exist but only three are meaningful.
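The per-row rules above might look like the following sketch. It assumes the legacy enum values are treated as aliases (`applicant` as `viewer`, `council_member` as `editor`), which is one of the two options still to be decided, and `migrate_roles` is a hypothetical helper name rather than existing code.

```python
from typing import Set

# Assumed alias mapping; the document leaves "aliases vs canonical names" undecided.
LEGACY_ALIASES = {"applicant": "viewer", "council_member": "editor"}


def migrate_roles(email: str, current_roles: Set[str], admin_emails: Set[str]) -> Set[str]:
    """Return the post-migration role set for one `users` row."""
    # Fold legacy enum values into their canonical names first.
    roles = {LEGACY_ALIASES.get(role, role) for role in current_roles}
    if email.strip().lower() in admin_emails:
        roles.add("admin")
    else:
        # Spurious admin left behind by the old client-supplied role bug.
        roles.discard("admin")
    # Existing editor grants are preserved; everyone else ends up as viewer only.
    if not roles & {"admin", "editor"}:
        roles = {"viewer"}
    return roles
```

If UI-granted admin is introduced later, the unconditional `discard("admin")` would need the same caution as §5.3.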
---
## 9. Tests (mandatory for rules that regress silently)
Table-driven tests (Python **or** TS — preferably backend for authority):
- Each admin-configured email → effective roles include **`admin`** (after register and after login).
- `user@ump.edu.vn` (not admin) → **`viewer`** only.
- `user@umc.edu.vn` (not admin) → **`viewer`** only; invalid domain → **400**.
- Case and whitespace on email → **normalized** to same key as policy list.
- Register **ignores** injected `role: "admin"` in JSON for non-admin email (or rejects body field entirely).
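A table-driven shape for these cases could look like the sketch below. `normalize_email`, `derive_roles`, and `InvalidEmailDomain` are hypothetical stand-ins for whatever the backend exposes, and the admin address is made up for the example; the point is the structure of the table, not the implementation under test.

```python
import re
from typing import FrozenSet, List, Optional

ALLOWED_DOMAINS = ("ump.edu.vn", "umc.edu.vn")
EMAIL_RE = re.compile(r"[^@\s]+@([^@\s]+)")


class InvalidEmailDomain(ValueError):
    """Raised for addresses outside the institutional domains (would map to HTTP 400)."""


def normalize_email(raw: str) -> str:
    """Trim and lowercase, then enforce the institutional domain list."""
    email = raw.strip().lower()
    match = EMAIL_RE.fullmatch(email)
    if match is None or match.group(1) not in ALLOWED_DOMAINS:
        raise InvalidEmailDomain(raw)
    return email


def derive_roles(raw_email: str, admin_emails: FrozenSet[str],
                 requested_role: Optional[str] = None) -> List[str]:
    """Derive roles from policy only; `requested_role` is accepted and ignored."""
    email = normalize_email(raw_email)
    return ["admin"] if email in admin_emails else ["viewer"]


ADMINS = frozenset({"head@ump.edu.vn"})  # made-up address for the example

# (raw email, role injected by the client, expected roles)
CASES = [
    ("head@ump.edu.vn", None, ["admin"]),
    ("  Head@UMP.edu.vn ", None, ["admin"]),
    ("user@ump.edu.vn", None, ["viewer"]),
    ("user@umc.edu.vn", None, ["viewer"]),
    ("user@ump.edu.vn", "admin", ["viewer"]),
]


def run_cases() -> None:
    for raw, requested, expected in CASES:
        assert derive_roles(raw, ADMINS, requested) == expected, raw
```

Adding a new policy rule then means adding a row to `CASES` rather than a new test function, which keeps silent regressions visible in review.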
---
## 10. Suggested order of work
Aligned with [`auth-implementation-feedback.md`](./auth-implementation-feedback.md):
1. **Fix register:** ignore client `role`; derive roles server-side (**closes privilege escalation**). Ship alone if needed.
2. **Allow `@umc.edu.vn`** on server, then update client regex/copy.
3. **Remove** login/register role selectors; simplify **`AuthContext`** / **`buildUserWithSelectedRole`** → **`buildUserFromAuthPayload`** with a deterministic multi-role rule.
4. **Implement admin API** (list + grant/revoke + patch `is_active`) with authz.
5. **Build UserManagement** page and **add** `/dashboard/users` route; fix sidebar 404.
6. **Data migration + enum cleanup.**
7. **Rate limiting** on register; **tests** from §9.
Steps **1–3** are mostly **deletion and server logic** on the login path. Steps **4–5** are the bulk of **new** code.
---
## 11. File index
| Area | File(s) |
|------|---------|
| Login / register UI | `fe0/src/pages/Login.tsx` |
| Session + role resolution | `fe0/src/contexts/AuthContext.tsx` |
| HTTP client | `fe0/src/lib/auth-service.ts` |
| Permissions / labels | `fe0/src/lib/permissions.ts` |
| Post-login paths | `fe0/src/lib/dashboardNavigation.ts` |
| Routes | `fe0/src/app/router/routes.tsx` |
| Sidebars (dead link) | `fe0/src/components/DashboardSidebar.tsx`, `fe0/src/components/admin/DashboardSidebar.tsx` |
| Auth API | `be0/src/auth_api.py` |
| Password reset mail + rate limits + JWT cv middleware | `be0/src/auth_mail.py`, `be0/src/auth_rate_limit.py`, `be0/src/auth_credential_middleware.py` |
| ORM / enum | `be0/src/initiative_db/models.py`, `be0/migrations/001_initiative_schema.sql` |
| Admin guard patterns | `be0/main.py` (`_require_admin_user`, etc.) |
---
## 12. Evaluation notes incorporated from feedback
| Topic | Handling in this refactor |
|-------|---------------------------|
| Client `role` on register | **Privilege escalation — fix first** |
| Login role Select | **Inverted spec — remove** |
| `buildUserWithSelectedRole` after register | **Circular — trust server `roles`** |
| Email allow-list “sharing” | **Server canonical**; duplicate small regex on client if needed |
| Reconcile on login | **Mandatory**, with **safe rule** for non-email admin grants |
| Admin emails in source | **Env or DB table** |
| Admin API | **Read + write** (grant/revoke/deactivate) |
| UserManagement UI | **CRUD from day one** |
| JWT staleness | **Document**; optional future refresh UX |
| Migration / enum | **Explicit steps** §8 |
| Rate limit register | **§5.4** |
| Tests | **§9** |
# Backend — Clean Architecture + DDD (`be0`)
Incremental re-layering of the FastAPI monolith (`be0/main.py`, ~3.7k LOC) into
DDD bounded contexts. **Strangler-fig**: the monolith keeps running; each context is
peeled out and cut over one endpoint at a time. No big-bang rewrite.
## Layers — the dependency rule points INWARD
```
api ─────────────► application ─────────────► domain ◄───────── infrastructure
(FastAPI, Pydantic) (use cases, ports) (pure model) (adapters: SQLAlchemy,
▲ argon2, jwt, mail, S3…)
shared_kernel
```
- **`domain/<context>/`** — entities, value objects, domain services, repository **ports**, errors. **Pure Python**: no FastAPI, SQLAlchemy, jwt, argon2, aioboto3, or `os.getenv`.
- **`application/<context>/`** — use cases orchestrating the domain via **ports** (Protocols); DTOs. No framework imports. One module per use case.
- **`infrastructure/<context>/`** — adapters that *implement* the ports (SQLAlchemy repositories, Argon2 hasher, JWT issuer, SMTP mailer, rate limiter, audit sink) + `persistence/` (engine/session) + existing `vector_db/` (Qdrant).
- **`api/<context>/`** — FastAPI routers + Pydantic schemas + dependencies. The **only** layer that imports FastAPI and maps domain errors to HTTP.
- **`composition/`** — wires use cases from concrete adapters (constructor injection; no DI framework).
- **`shared_kernel/`** — `Entity`/`AggregateRoot`, `ValueObject`, and the `DomainError` hierarchy.
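To make the layering concrete, here is a minimal sketch of a use case that depends only on ports and is exercised with in-memory fakes. The class and method names (`AuthenticateUser`, `UserRepository`, `PasswordVerifier`) are illustrative assumptions, not the names used in the extracted Identity context.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class User:
    id: int
    email: str
    password_hash: str
    is_active: bool = True


class AuthenticationError(Exception):
    """Domain error; the api layer would map it to HTTP 401."""


class UserRepository(Protocol):
    """Port: how the use case loads users, with no SQLAlchemy in sight."""
    def find_by_email(self, email: str) -> Optional[User]: ...


class PasswordVerifier(Protocol):
    """Port: password checking, implemented by an Argon2 adapter in infrastructure."""
    def verify(self, password_hash: str, password: str) -> bool: ...


class AuthenticateUser:
    """Use case wired by constructor injection, as `composition/` would do."""

    def __init__(self, users: UserRepository, hasher: PasswordVerifier) -> None:
        self._users = users
        self._hasher = hasher

    def execute(self, email: str, password: str) -> User:
        user = self._users.find_by_email(email.strip().lower())
        if (user is None or not user.is_active
                or not self._hasher.verify(user.password_hash, password)):
            raise AuthenticationError("invalid credentials")
        return user


class FakeUsers:
    """In-memory fake satisfying UserRepository for DB-free tests."""
    def __init__(self, *users: User) -> None:
        self._by_email = {u.email: u for u in users}

    def find_by_email(self, email: str) -> Optional[User]:
        return self._by_email.get(email)


class FakeHasher:
    """Fake verifier; NOT a real hash, only a marker for tests."""
    def verify(self, password_hash: str, password: str) -> bool:
        return password_hash == f"hashed:{password}"
```

Because `Protocol` uses structural typing, the fakes do not inherit from the ports; any object with matching methods is accepted, which is what lets the unit tests run with no database.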
### DomainError → HTTP (mapped only in the api layer)
`ValidationError` → 400 · `AuthenticationError` → 401 · `AuthorizationError` → 403 · `NotFoundError` → 404 · `ConflictError` → 409 · `RateLimited` → 429. Inner layers raise these, never `HTTPException`.
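The mapping above could be expressed as a small lookup that the api layer consults in its exception handler. This sketch is framework-free so it stays self-contained; the `http_status_for` helper and the 500 fallback for unmapped errors are assumptions.

```python
class DomainError(Exception):
    """Base class for errors raised by the domain and application layers."""


class ValidationError(DomainError): ...
class AuthenticationError(DomainError): ...
class AuthorizationError(DomainError): ...
class NotFoundError(DomainError): ...
class ConflictError(DomainError): ...
class RateLimited(DomainError): ...


STATUS_BY_ERROR = {
    ValidationError: 400,
    AuthenticationError: 401,
    AuthorizationError: 403,
    NotFoundError: 404,
    ConflictError: 409,
    RateLimited: 429,
}


def http_status_for(error: DomainError) -> int:
    """Walk the class hierarchy so subclasses inherit their parent's status code."""
    for cls in type(error).__mro__:
        if cls in STATUS_BY_ERROR:
            return STATUS_BY_ERROR[cls]
    # Assumption: an unmapped DomainError is treated as a server error.
    return 500
```

Walking the MRO means a more specific error, for example a hypothetical `EmailAlreadyRegistered(ConflictError)`, needs no extra table entry to map to 409.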
## Bounded contexts & extraction order (strangler-fig)
**Identity (1st)** → Admin → AI → Evidence/Files → **Initiative** (resolve the dual-submission-model decision: `initiatives/drafts` vs `application_workflow/application_artifacts`) → **Review** (last — most globals-coupled).
## Status
| Context | domain | application | infrastructure | api | live cut-over |
|---|---|---|---|---|---|
| **Identity** | ✅ tested | ✅ Login tested | ⏳ | ⏳ | ⏳ (needs DB) |
| Admin · AI · Files · Initiative · Review | ⚪ | ⚪ | ⚪ | ⚪ | ⚪ |
Identity domain + application are extracted **verbatim** (behavior-preserving) from `src/auth_api.py` and covered by 32 unit tests (`tests/test_identity_domain.py`, `tests/test_authenticate_user.py`) that run with **no DB**.
## Per-endpoint cut-over procedure
1. Extract pure rules → `domain` + write unit tests. ✅ (pattern established)
2. Write the use case → `application` (ports + fakes-tested). ✅ (Login)
3. Implement adapters → `infrastructure` (wrap the existing battle-tested primitives — `auth_jwt`, `auth_mail`, `auth_rate_limit`, `PasswordHasher`; do **not** rewrite security code).
4. Add the FastAPI router → `api`, wired via `composition`. Mount under a parallel prefix first.
5. **With the stack up** (`docker compose up`), run the DB-backed auth tests against the new router and confirm byte-parity with the old handler.
6. Replace the old route in `auth_api.py` with the new router in `main.py`; re-run tests. Repeat until `auth_api.py` is empty → delete it.
## Why infrastructure/api are deferred this chunk
The auth tests are DB-backed (`INITIATIVE_DATABASE_URL` + Postgres). With the stack down, the DB-touching adapters/router can't be *verified*, and an unverified swap of live auth is unsafe. So this chunk ships the **verified** pure layers (domain + application) + the dead-scaffold cleanup; steps 3–6 happen when the stack is up.
# Production Docker deployment (`docker-compose.prod.yml`)
This guide walks through **common failures** when running the prod-style stack locally or on a VPS, in a fixed order: validate environment, reconcile Postgres credentials with the Docker volume, then confirm frontend wiring.
**Stack topology (frontend → backend → DB → MinIO):** [deploy-stack-overview.md](./deploy-stack-overview.md)
Related files: `.env.example` (copy to `.env`), `scripts/deploy-prod.sh`, `scripts/verify-prod-env.sh`.
---
## 1. `.env` in the repo root (cloud / VPS)
Docker Compose substitutes `${PUBLIC_HOST}`, `${POSTGRES_USER}`, etc. from a file named `.env` in the **same directory** as `docker-compose.prod.yml` (or from `--env-file` when you use the deploy script).
### It may already be there: plain `ls` hides it
Unix `ls` does **not** list dotfiles. A file named `.env` will **not** show up unless you:
```bash
ls -a # lists .env alongside . ..
test -f .env && echo ok # exits 0 if the file exists
```
### Create it when it is missing
From the repo root on the server:
```bash
cp .env.example .env
nano .env # or vim / your editor — set PUBLIC_HOST, secrets, Postgres identifiers (see section **3** below)
chmod 600 .env # optional: restrict reads to your user/root
```
`./scripts/deploy-prod.sh` refuses to run if `.env` is absent. If you start Compose by hand **without** a `.env` file, `${POSTGRES_*}` interpolates empty and Postgres health checks / connections can misbehave — always keep a populated `.env` next to the compose file.
---
## 2. Run validation before compose
Always fix script failures before restarting containers.
```bash
./scripts/verify-prod-env.sh
```
`verify-prod-env.sh` rejects:
- Empty `PUBLIC_HOST`, ports, MinIO or Postgres variables.
- `POSTGRES_USER` / `POSTGRES_DB` that are not plain SQL identifiers (letters, digits, underscore only — no `!`, spaces, unicode).
- `POSTGRES_PASSWORD` containing `@`, `:`, `/`, or `%`, which breaks `INITIATIVE_DATABASE_URL` in Compose (assembled without URL-encoding).
If `deploy-prod.sh` exits early, rerun `verify-prod-env.sh` and edit `.env` until it prints `OK`.
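The Postgres-related rules can be restated as a short Python sketch. This is not the content of `scripts/verify-prod-env.sh` (a shell script); `check_postgres_env` is a hypothetical function that mirrors only the identifier and password checks listed above.

```python
import re
from typing import Dict, List

# Letters, digits, underscore only: no "!", spaces, or unicode.
IDENTIFIER_RE = re.compile(r"[A-Za-z0-9_]+")
# Characters that break a connection URL assembled without URL-encoding.
UNSAFE_PASSWORD_CHARS = set("@:/%")


def check_postgres_env(env: Dict[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the values pass."""
    problems = []
    for name in ("POSTGRES_USER", "POSTGRES_DB"):
        if not IDENTIFIER_RE.fullmatch(env.get(name, "")):
            problems.append(f"{name} must use only letters, digits, and underscore")
    password = env.get("POSTGRES_PASSWORD", "")
    if not password:
        problems.append("POSTGRES_PASSWORD is empty")
    elif UNSAFE_PASSWORD_CHARS & set(password):
        problems.append("POSTGRES_PASSWORD contains one of @ : / %")
    return problems
```

The password restriction exists because the URL is built by plain string interpolation. If the URL were assembled in code instead, percent-encoding the password with `urllib.parse.quote(password, safe="")` would lift that restriction, but that is not how the Compose file works today.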
---
## 3. Postgres — `FATAL: role "<name>" does not exist`
### Why it happens
The official Postgres image **creates `POSTGRES_USER` and `POSTGRES_DB` only when the data directory is empty** (first start of the named volume). After that, changing `.env` does **not** rename or recreate roles inside the volume.
Typical triggers:
| Situation | Result |
|-----------|--------|
| Volume was initialized with `POSTGRES_USER=initiative`; `.env` now uses a different username | Existing DB has role `initiative`, not your new name. |
| Username with special characters (`user_pkhcn2025!`) | The role may never have been created cleanly on first init. Use plain identifiers (see validation above). |
### Fix (pick one track)
**A. Keep existing data — align `.env` with the roles that already exist**
1. Discover the logical volume name Compose uses:
```bash
docker compose --env-file .env -f docker-compose.prod.yml down
docker volume ls | grep initiative_pg_data
```
The name looks like `<project>_initiative_pg_data` (Compose names the volume from your project directory).
2. Start only Postgres temporarily with `.env` that matches credentials you **know** worked on first bootstrap (often your dev values from `docker-compose.yml`: user `initiative`, DB `initiatives`):
```bash
docker compose --env-file .env -f docker-compose.prod.yml up -d postgres
```
3. List roles inside the cluster (substitute `-U`/`-d`/`PGPASSWORD` to match credentials that succeed):
```bash
docker compose --env-file .env -f docker-compose.prod.yml exec postgres \
psql -U initiative -d initiatives -c '\du'
```
Set `POSTGRES_USER` / `POSTGRES_DB` / `POSTGRES_PASSWORD` in `.env` to match an existing role and database. Do **not** change only the username without aligning to an existing login.
**Password-only mismatch:** If the role and database names are already correct but someone changed `POSTGRES_PASSWORD` in `.env` after the volume was first created, run from the repo root (with `postgres` running):
```bash
./scripts/sync-postgres-app-password.sh
```
That executes `ALTER ROLE … PASSWORD` to match `.env` when `psql` inside the container can connect without the old password (typical with the official image's local socket rules). If it fails, use the `psql` steps above with credentials that still work, or re-init the volume (**B**). Optionally, set `POSTGRES_SUPERUSER` in `.env` if you must connect as another superuser (e.g. `postgres`).
**B. You can afford to lose Postgres data — re-init the volume**
1. Stop stack; remove volume (this **deletes** all DB data):
```bash
docker compose --env-file .env -f docker-compose.prod.yml down
docker volume rm <project>_initiative_pg_data # exact name from `docker volume ls`
```
2. Ensure `./scripts/verify-prod-env.sh` passes.
3. Bring stack up fresh so scripts in `docker-entrypoint-initdb.d/` run:
```bash
./scripts/deploy-prod.sh
```
**C. Rename or add roles without wiping data (advanced)**
Connect as your **currently working** database superuser, then:
- `ALTER ROLE initiative RENAME TO new_name;`
- Create a parallel role/password with matching grants if your app expects a dedicated user only.
Operational details vary with your retention and backup policy; involve your DBA playbook if applicable.
---
## 4. Frontend (`fe0`) — port mismatch (host cannot reach UI)
Compose maps **`${FE_PORT}:8080`**: traffic to the container must hit **port 8080** inside `fe0`.
Vite defaults to **5173** if nothing overrides it. Previously that meant the mapped port forwarded to nothing or the wrong listener.
### Required state
[Vite](../fe0/vite.config.ts) must set:
- `server.port: 8080`
- host `0.0.0.0` and **port 8080** (Compose/Dockerfile pass `npm run dev -- --host 0.0.0.0 --port 8080` so bind-mounted trees without an updated `vite.config.ts` still match `${FE_PORT}:8080`)
If logs show:
```text
Local: http://localhost:5173/
```
fix `vite.config.ts` so the dev server uses **8080**, then recreate or restart `fe0`.
After that, browsers use:
```text
http://${PUBLIC_HOST}:${FE_PORT}
```
---
## 5. Different IPs in logs (`fe0` vs MinIO)
This is usually **correct**, not contradictory:
| Log line | Meaning |
|----------|---------|
| `fe0` “Network”: `http://10.5.0.x:…` | **Static container IP** on Compose bridge `profyt-net` (`docker-compose.prod.yml` `ipv4_address`). |
| MinIO banner: `http://<PUBLIC_HOST>:19000` | **Public/browser URL**, from `MINIO_SERVER_URL` using `PUBLIC_HOST` and `MINIO_API_PORT`. |
`be0` still talks to MinIO as `http://minio:9000` internally; browsers use `${PUBLIC_HOST}` unless you override presign with **`S3_PUBLIC_ENDPOINT_URL`**.
When the UI is **`https://`**, embedding plain **`http://…:${MINIO_API_PORT}`** presigned URLs is blocked (**mixed content**). In-app PDF preview can use **`GET …/evidence/content`**; for direct presigned links in the browser, terminate TLS on the MinIO API host and set **`S3_PUBLIC_ENDPOINT_URL`** / **`MINIO_SERVER_URL`** to that **`https://…`** base — see **[minio-behind-https.md](./minio-behind-https.md)** and **`deploy/nginx/minio-s3-proxy.conf.example`**.
---
## 6. Operational checklist after changes
```bash
./scripts/verify-prod-env.sh
docker compose --env-file .env -f docker-compose.prod.yml config >/dev/null
./scripts/deploy-prod.sh # or: up without -d for foreground logs
docker compose --env-file .env -f docker-compose.prod.yml ps
```
For Postgres persistence issues, skim **section 3** before editing `.env` again.
---
## 7. Postgres — `relation "audit_events" does not exist`
### Why it happens
`docker-entrypoint-initdb.d` on the Postgres image runs **only when the data volume is empty**. If the volume was created **before** `008_audit_events.sql` existed in compose, that migration never ran. **`be0`** then fails when it tries to write audit rows.
### Fix
**After pulling a current `be0` image / repo:** restart **`be0`**. On startup, `scripts/apply_initiative_migrations.py` applies **`008_audit_events.sql`** automatically if `public.audit_events` is missing (same pattern as migration 009).
Or apply by hand from the repo root on the server (adjust user/db to match `.env`):
```bash
docker compose --env-file .env -f docker-compose.prod.yml exec -T postgres \
psql -U "${POSTGRES_USER}" -d "${POSTGRES_DB}" \
< be0/migrations/008_audit_events.sql
```
---
## 8. Large uploads — `413 Request Entity Too Large` (evidence PDF, etc.)
The app allows evidence up to **50 MB** end-to-end, but **HTTPS reverse proxies** (nginx in front of `www.rcc-ump.com`) often default to **`client_max_body_size 1m`**, which rejects a **multi-megabyte** PDF **before** Docker sees the request. The browser console may show an HTML nginx error page (comment about “friendly error page”).
### Fix (nginx)
In the `server { }` (or the `location` that proxies to your `fe0` port), set at least:
```nginx
client_max_body_size 64m;
```
Reload nginx after editing. If uploads are slow, you may also need longer timeouts on the same `location`:
```nginx
proxy_read_timeout 300s;
proxy_connect_timeout 300s;
proxy_send_timeout 300s;
```
If **Cloudflare** (or another CDN) sits in front of the origin, confirm it does not impose a smaller upload limit than nginx.
**Note:** Browsers hit **`fe0`** (Vite proxy `/api` → `be0`). The body limit must allow the full multipart upload on the **first** hop (usually nginx → origin), not only inside Docker.
# Stack trace for `docker-compose.prod.yml` deployment
Use this doc to see **how the frontend, backend, Postgres, and MinIO connect**, in what **order Compose starts services**, and how to **verify** each tier after deploy. For Postgres volume errors, HTTP vs HTTPS on Vite, and `.env` dotfiles, see [deploy-production-docker.md](./deploy-production-docker.md).
---
## 1. High-level dependency graph
Compose service names (`postgres`, `minio`, etc.) resolve on the **`profyt-net`** bridge (`10.5.0.0/16`). Static IPs are defined in `docker-compose.prod.yml` for readability; traffic still uses DNS names **`be0`**, **`minio`**, **`postgres`**.
```mermaid
flowchart TB
subgraph browser [Browser on the Internet]
U[User]
end
subgraph host [Docker host ports]
FEPORT["HOST:FE_PORT ~ Vite HTTP"]
MINAPI["HOST:MINIO_API_PORT"]
MINUI["HOST:MINIO_CONSOLE_PORT"]
end
subgraph net [Compose network profyt-net]
fe0["fe0 :8080<br/>nginx (prod)"]
be0["be0 :4402<br/>FastAPI"]
pg[("postgres :5432")]
s3["minio :9000 API<br/>:9001 console"]
end
U -->|"http://PUBLIC_HOST:FE_PORT (not https)"| FEPORT --> fe0
U -->|"presigned GET bytes"| MINAPI --> s3
fe0 -->|"nginx proxy /api, /submitted-initiatives…"| be0
be0 -->|"INITIATIVE_DATABASE_URL"| pg
be0 -->|"S3_ENDPOINT_URL presign + server ops"| s3
subgraph oneoff [runs after MinIO healthy]
cors["minio-cors:<br/>ensure buckets"]
end
cors --> s3
subgraph localhost_only [Reachable only on the host VM]
dbmap["127.0.0.1:15432 → postgres"]
bemap["127.0.0.1:4402 → be0"]
end
```
---
## 2. Responsibility matrix
| Piece | Compose service | Listen (container) | Published to Internet? | Env / config that ties it together |
|--------|----------------|----------------------|-------------------------|-----------------------------------|
| **Frontend** | `fe0` | `0.0.0.0:8080` | **`http://${PUBLIC_HOST}:${FE_PORT}`** maps host → container 8080 | **Production:** `Dockerfile.prod` → nginx static + proxy `/api` → `be0`. **Dev:** Vite (`Dockerfile` + `npm run dev`) |
| **Backend API** | `be0` | `0.0.0.0:4402` | **No**; `127.0.0.1:4402` on host only, and browsers use the **`fe0` proxy** for `/api`, etc. | `INITIATIVE_DATABASE_URL=postgresql+asyncpg://…@postgres:5432/…`, `S3_ENDPOINT_URL=http://minio:9000`, **`S3_PUBLIC_ENDPOINT_URL`** defaults to **`http://${PUBLIC_HOST}:${MINIO_API_PORT}`**; set **`https://…`** when MinIO sits behind HTTPS for presigned URLs ([minio-behind-https.md](./minio-behind-https.md)) |
| **Database** | `postgres` | `:5432` | **No**; `127.0.0.1:15432` on host for admin tools only | `POSTGRES_*` apply on first init only (`initiative_pg_data` volume) |
| **Object storage** | `minio` | API `:9000`, console `:9001` | **`http://${PUBLIC_HOST}:${MINIO_API_PORT}`** (often proxied via HTTPS in production; see linked doc above), plus the console on **`MINIO_CONSOLE_PORT`** | **`MINIO_SERVER_URL`** / **`MINIO_BROWSER_REDIRECT_URL`** are built from **`PUBLIC_HOST`** by default, or use `.env` HTTPS overrides together with **`S3_PUBLIC_ENDPOINT_URL`** |
**`minio-cors`** is a one-shot job: waits for healthy MinIO and creates **`initiative-attachments`**, **`initiative-exports`**, and **`initiative-quarantine`**. **Community MinIO** does not implement S3 per-bucket CORS (`mc cors set`); browsers rely on **`MINIO_API_CORS_ALLOW_ORIGIN`** on the **`minio`** service (defaults to `*` in Compose) for presigned GETs.
---
## 3. Request paths (mental trace)
1. **SPA + API calls**
User opens **`http://PUBLIC_HOST:FE_PORT`**. The browser loads the SPA assets from **`fe0`** (nginx in the production image, Vite in dev). Calls to **`/api/...`** (and similar proxied paths) go to **`fe0`**, which forwards to **`http://be0:4402`** inside the network.
2. **Presigned S3 / MinIO from the browser**
**`be0`** builds URLs using **`S3_PUBLIC_ENDPOINT_URL`** (must be reachable from the user's browser, usually **`http://PUBLIC_HOST:MINIO_API_PORT`**). The browser downloads objects **directly from MinIO** on the host-published port, not through **`be0`**.
3. **Backend → Postgres**
Only **`be0`** uses **`INITIATIVE_DATABASE_URL`**; host `127.0.0.1:15432` is optional for **`psql`** / dumps from the VPS shell.
4. **Backend → MinIO (server-side)**
**`be0`** uses **`S3_ENDPOINT_URL=http://minio:9000`** for signing and internal API traffic; **`minio`** is the Compose DNS name, not **`PUBLIC_HOST`**.
---
## 4. Startup order Compose enforces
| Order | Service | Blocking condition |
|------|---------|--------------------|
| 1 | `postgres`, `minio` | (none in compose—they start in parallel.) |
| 2 | `minio-cors` | `minio` **healthy** |
| 3 | `be0` | `postgres` **healthy** AND `minio` **healthy** |
| 4 | `fe0` | `be0` **started** |
If Postgres never becomes healthy (**bad `POSTGRES_*` vs existing volume**, etc.), **`be0` never attaches** cleanly and **`fe0`** may misbehave or appear “up” while API calls fail.
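The ordering in the table is a dependency graph, and it can be reproduced with the standard library's `graphlib`. This sketch models only the "depends on" edges from the table; it does not distinguish `healthy` from `started` conditions, and `startup_waves` is a hypothetical helper.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# service -> services it waits for, taken from the table above
DEPENDS_ON: Dict[str, Set[str]] = {
    "postgres": set(),
    "minio": set(),
    "minio-cors": {"minio"},
    "be0": {"postgres", "minio"},
    "fe0": {"be0"},
}


def startup_waves(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group services into waves; everything in a wave can start in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

With these edges, `minio-cors` and `be0` land in the same wave because neither waits on the other, so the four rows of the table collapse to three waves. A failure in the first wave (for example Postgres never turning healthy) blocks every later wave that depends on it.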
---
## 5. Deploy checklist (recommended)
From the **repository root on the VPS** (same folder as `docker-compose.prod.yml`):
1. **`.env`** present (`ls -a`), values filled from `.env.example`.
2. **`PUBLIC_HOST`** = the hostname or IP users type in the browser (must match how you open the UI and how MinIO URLs are generated).
3. **`./scripts/verify-prod-env.sh`** exits `0`.
4. Start the stack (pick one):
- **Script (pull, build, detached):** `./scripts/deploy-prod.sh`
- **Manual compose:** see **subsection 5.1** below.
5. Open app with **`http://`**, not **`https://`**, unless you put a reverse proxy in front.
### 5.1 Manual `docker compose -f docker-compose.prod.yml up`
This is valid as long as you stay in the **repo root** and a **`.env`** file exists there.
- **Variable substitution:** Compose automatically reads a file named **`.env`** in the **project directory** (normally your current working directory) and uses it to expand `${PUBLIC_HOST}`, `${FE_PORT}`, etc. in `docker-compose.prod.yml`. You do **not** have to pass `--env-file .env` for that to work, but being explicit avoids surprises:
```bash
docker compose --env-file .env -f docker-compose.prod.yml up -d --build
```
- **Foreground vs daemon:** plain `up` streams logs in the terminal; pressing Ctrl+C stops the containers. For a long-running server, prefer **`up -d`** (detached).
- **Rebuild after Dockerfile or dependency changes:** add **`--build`** (the deploy script always builds). Without it, Compose may reuse old images.
- **No pre-checks:** the script runs `verify-prod-env.sh` and `compose config` for you; if you use only `up`, run **`./scripts/verify-prod-env.sh`** yourself first so bad `POSTGRES_USER` / empty ports fail fast.
Example minimal manual flow:
```bash
cd /path/to/remix-of-my-perspective-lifestyle-32
./scripts/verify-prod-env.sh
docker compose --env-file .env -f docker-compose.prod.yml up -d --build
docker compose --env-file .env -f docker-compose.prod.yml ps
```
---
## 6. Quick verification commands
Run on the host with the same `--env-file` you use for deploy:
```bash
docker compose --env-file .env -f docker-compose.prod.yml ps
docker compose --env-file .env -f docker-compose.prod.yml logs --tail=80 postgres
docker compose --env-file .env -f docker-compose.prod.yml logs --tail=80 be0
docker compose --env-file .env -f docker-compose.prod.yml logs --tail=40 fe0
docker compose --env-file .env -f docker-compose.prod.yml logs --tail=40 minio
```
Smoke checks:
- **Postgres**: `docker compose ... exec postgres pg_isready -U "$POSTGRES_USER" -d "$POSTGRES_DB"` (substitute real values from `.env` when using shell snippets).
- **Backend** (from host): `curl -sS http://127.0.0.1:4402/docs` — expect FastAPI Swagger HTML (or `/openapi.json`).
- **MinIO** (from host or laptop if firewall allows): `curl -sS -o /dev/null -w "%{http_code}" http://${PUBLIC_HOST}:${MINIO_API_PORT}/minio/health/live`.
---
## 7. Firewall hint
Typically you must allow **inbound TCP**: **`FE_PORT`**, **`MINIO_API_PORT`**, **`MINIO_CONSOLE_PORT`** (and **`22`** for SSH). Postgres and **`be0`** intentionally stay on **localhost-only** binds in this compose file.
# Document Templates — admin-managed forms (feature)
A from-scratch subsystem (2026-06-14): admins upload `.docx` templates with Jinja
`{{ placeholders }}`; applicants fill a form **generated from each template's fields** and download a
server-rendered PDF. Independent of the hardcoded *sáng kiến* form pipeline (which stays as-is).
## Data + storage
- Table **`document_templates`** (migration `015_document_templates.sql`, model `DocumentTemplate`):
`id, name, description, storage_key, original_filename, content_sha256, fields JSONB ([{key,label,type}]),
is_active, created_by, created_at, updated_at`.
- MinIO bucket **`initiative-templates`** (`S3Settings.s3_bucket_templates`, has a default; created by
`ensure_buckets_exist`). Server-side only — no browser CORS / presign needed.
## Backend — `be0/src/template_routes.py` (mounted `/api/v1/templates`)
- `POST /templates` (admin) — multipart `.docx` → extract `{{placeholders}}` → MinIO + row.
- `GET /templates` (authed; admin sees inactive too) · `GET /{id}` · `GET /{id}/file` (admin, raw docx).
- `PUT /{id}` (admin) metadata · `DELETE /{id}` (admin) soft-delete, `?hard=true` removes row + object.
- `POST /{id}/render` (authed) — fill with `values` → DOCX (docxtpl) or PDF (reuses
`src.be01.docx_to_pdf.convert_docx_bytes_to_pdf`, LibreOffice).
- **Placeholder extraction:** docxtpl `get_undeclared_template_variables()` with a regex-on-stripped-XML
fallback — DOCX often splits `{{ }}` across `<w:r>` runs, so stripping tags first recovers them.
- Auth mirrors the extracted admin routers (`decode_access_token_user_id`/`decode_bearer_token`; admin = `"admin"` in JWT roles).
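The regex-on-stripped-XML fallback mentioned above can be sketched in a few lines. This is an illustration of the idea, not the code in `template_routes.py`: `extract_placeholders` is a hypothetical name, and the pattern only recognizes plain identifiers, not Jinja filters or attribute access.

```python
import re
from typing import List

TAG_RE = re.compile(r"<[^>]+>")
# Matches {{ name }} with optional inner whitespace; identifiers only.
PLACEHOLDER_RE = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def extract_placeholders(document_xml: str) -> List[str]:
    """Strip XML tags so placeholders split across <w:r> runs rejoin, then scan."""
    text = TAG_RE.sub("", document_xml)
    seen: List[str] = []
    for name in PLACEHOLDER_RE.findall(text):
        if name not in seen:  # keep first-seen order, drop duplicates
            seen.append(name)
    return seen
```

Stripping tags before matching is what recovers a placeholder such as `{{ full_name }}` when Word has stored `{{ full` and `_name }}` in separate runs.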
## Frontend
- **`@ump/shared/lib/templateApi.ts`** — `DocumentTemplate`/`TemplateField` types + list/get/create
(FormData)/update/delete/render (`postArrayBuffer`)/downloadFile + `saveArrayBufferAs` / `arrayBufferToObjectUrl`.
- **frontend_admin** `pages/TemplatesPage.tsx` (+ `layouts/AdminLayout.tsx` shell; route `/templates`,
admin-gated) — upload / list / edit meta / activate-deactivate / download / hard-delete (alert-dialog). Its first real feature.
- **frontend_user** `pages/TemplatesFillPage.tsx` (route `/dashboard/templates`, sidebar « Mẫu tài liệu »)
— pick a template → form generated from its fields → server PDF → iframe preview + download.
## Verified
Backend **e2e live** against the dev stack: create → extract fields → render (filled values) → delete.
`@ump/shared` + frontend_user + frontend_admin `tsc --noEmit` + `vite build` all clean. Commits
`c6d003c` (BE) + `4f1cb3e` (FE).
## Deferred (v1 limitations)
- **No file-replace endpoint** — delete + re-create to change a template's `.docx` (metadata edit works).
- **All fields render as single-line text** — no type/date/number/long-text per field.
- **No audit-log entries** on template create/update/delete (the rest of the app audits admin mutations).
@@ -0,0 +1,298 @@
# fe0: Dashboard data refresh and API polling
This document explains **why the browser repeatedly calls `/api/applications` and `/api/notifications/unread-count`** on the dashboard, how that fits the **React + TanStack Query + Axios** stack, and **design tradeoffs** for tuning behavior.
It also encodes a **stabilization plan** for frontend, backend pressure, and predictable data loading—refined from a stability review (`assets/docs/feedback.md`) focused on removing implicit globals, polite polling, and consistent refresh semantics.
## 1. High-level flow
```mermaid
flowchart LR
subgraph ui [Dashboard UI]
D[Dashboard.tsx]
AAL[Admin ApprovedApplicationsList]
CAL[Council ApprovedApplicationsList]
NB[NotificationBell]
end
subgraph rq [TanStack Query]
QApps["useQuery applications"]
QNotif["useQuery unread count"]
end
subgraph http [HTTP]
AC[apiClient axios]
BE["Backend APIs"]
end
D -->|admin role| AAL
D -->|editor role| CAL
D -.-> NB
AAL --> QApps
NB --> QNotif
QApps --> AC
QNotif --> AC
AC --> BE
```
- **`Dashboard.tsx`** chooses which shell to render by role: **admin** sees the admin applications list (inbox), **editor** (council) sees a different list implementation, **applicant** sees the registration workspace (no inbox polling for applications in the same way).
- **`apiClient`** (`fe0/src/shared/api/client.ts`) is the shared Axios instance used by queries and mutations.
- **TanStack Query** caches by `queryKey`, runs `queryFn` on mount, and can **refetch on an interval** or **when the window regains focus**, depending on per-query options and **explicit** `QueryClient.defaultOptions` (see §7).
## 2. What triggers repeated `/api/applications` (admin inbox)
The route **`/dashboard`** for users with the **admin** role renders:
`fe0/src/pages/Dashboard.tsx` → `AdminApprovedApplicationsList` with `lifecycle="inbox"`.
The list loads data with `useQuery` in `fe0/src/components/admin/ApprovedApplicationsList.tsx`.
### Current behavior (as implemented today)
| Option | Value | Effect |
|--------|--------|--------|
| `queryKey` | `["applications", filters]` | Separate cache per filter set; **must** be a stable key—see §11. |
| `refetchInterval` | `10 * 1000` (10 seconds) | **Automatic polling** while mounted. **Target:** visibility-aware + optional jitter (§8, §12). |
| `refetchOnWindowFocus` | `"always"` (today) | Refetch on every focus regardless of staleness—**high load**; **target** is `true` + sensible `staleTime` (§8). |
| `refetchOnReconnect` | `true` | Refetch when the browser regains network after offline. |
| `placeholderData` | `(previous) => previous` | Keeps showing the last page while a refetch runs (less table flicker). **Keep this.** |
So the “every few seconds” pattern you see in DevTools is **intentional polling**, not a runaway bug—but the combination of **10s polling + `"always"` focus** multiplies traffic when admins tab frequently (§8).
### Same component, other lifecycles
`ApprovedApplicationsList` is also reused for the **decided** list (`lifecycle="decided"`) from `DecidedApplicationsPanel`. The **same** `refetchInterval: 10s` applies there as well—polling is tied to the component, not only the inbox title.
## 3. Council dashboard: different refresh strategy (target: unify)
**Editors** (`hasRole("editor")`) get `CouncilApprovedApplicationsList` (`fe0/src/components/council/ApprovedApplicationsList.tsx`).
That file's `applicationsQuery` **does not set `refetchInterval`** today. Updates are driven more by:
- Normal Query behavior (mount, default focus rules, etc.).
- **`reportSyncQuery`**: when its `dataUpdatedAt` changes, an effect runs `queryClient.invalidateQueries({ queryKey: ["applications"] })`, which pulls a fresh `/api/applications` without a fixed timer.
**Problem:** admin (time-based polling) and council (event-driven invalidation) are two mental models for similar surfaces, in different files—cognitive load, bug asymmetry, and drift (fixes in one place may not land in the other).
**Target architecture (single strategy everywhere):**
1. **Primary:** invalidation on mutations (`approve`, `reject`, `submit`, `assign`, etc.) plus invalidation on lightweight **report sync** / version signals where applicable.
2. **Secondary:** a **slow safety-net poll** (e.g. **60–120s**, visibility-aware, optionally jittered) so a missed invalidation does not leave the UI stale forever.
3. **Later (product-driven only):** **SSE** behind `apiClient` if true realtime is required—one long-lived connection per tab scales better than many short polls; WebSockets only if the server must push high-frequency updates.
Until unified, treat **both** admin and council lists as in scope for **`isFetching` audits** and query-key stability (§10, §11).
## 4. Notifications unread count
`fe0/src/components/notifications/NotificationBell.tsx`:
- `queryKey`: `["notifications-unread-count"]`
- `queryFn`: `fetchNotificationsUnreadCount` → `GET /api/notifications/unread-count`
- `refetchInterval`: **60_000 ms** (once per minute)
- `refetchOnWindowFocus`: `true`
- `staleTime`: **30_000 ms**
`NotificationManager.tsx` uses a similar **60s** interval for the list and calls `queryClient.invalidateQueries({ queryKey: ["notifications-unread-count"] })` after mutations so the bell can update sooner than the next minute tick. **This invalidation pattern is the model** for other features (§3).
## 5. Other polling in the admin area
These are separate from the inbox but follow the same idea (“keep dashboards somewhat fresh”):
| Location | Interval | Purpose |
|----------|----------|----------|
| `OverviewTab.tsx` | 30s | Health/status style data |
| `AIManagementTab.tsx` | 30s | AI service health |
| `NotificationBell` / `NotificationManager` | 60s | Notifications |
**Target:** centralize intervals in one module (e.g. `fe0/src/shared/config/polling.ts`) so ops and load tests can tune without hunting magic numbers across files (§12).
## 6. `client.ts` dev logging (stability and privacy)
In `fe0/src/shared/api/client.ts`, the Axios **response interceptor** logs successful responses when `import.meta.env.DEV` is true.
**Risks:** full `data` payloads on large lists **flood the console**; a misconfigured deploy that runs “dev-like” builds could **leak user data** to browser consoles.
**Target:**
1. **Sample or summarize** responses in dev; prefer `console.debug` over `console.log` for high-volume paths so DevTools defaults stay readable.
2. **Guard production** with a build-time assertion (or strict env contract), not `import.meta.env.DEV` alone.
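
As a sketch of the "summarize, do not dump" target above, the interceptor could log one short line per response instead of the full payload. `summarizeResponse` and `ResponseLike` are hypothetical names; the shape only mirrors the fields an Axios response exposes (`status`, `config`, `data`).

```ts
// Hypothetical helper for the dev-only response interceptor:
// describe the payload's shape instead of printing the payload itself.
interface ResponseLike {
  status: number;
  config: { method?: string; url?: string };
  data: unknown;
}

function summarizeResponse(res: ResponseLike): string {
  const method = (res.config.method ?? "get").toUpperCase();
  const url = res.config.url ?? "(unknown url)";
  let shape: string;
  if (res.data === null) {
    shape = "null";
  } else if (Array.isArray(res.data)) {
    shape = `array(${res.data.length})`;
  } else if (typeof res.data === "object") {
    shape = `object(${Object.keys(res.data).length} keys)`;
  } else {
    shape = typeof res.data;
  }
  return `${method} ${url} -> ${res.status} ${shape}`;
}
```

Inside the interceptor this would be called as `console.debug(summarizeResponse(response))` behind the dev guard, so a 500-row list costs one console line and no user data is printed.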
## 7. QueryClient defaults (critical: one entrypoint, explicit defaults)
Having **two** `App.tsx` files with **different** `QueryClient` configuration is a **silent global switch**: a refactor, import cleanup, or rebase can change refetch behavior app-wide without touching feature code—and per-query `refetchInterval` would still “look” correct in review.
**Required actions (do first):**
1. **Pick one entrypoint.** Remove the duplicate in the **same** change set; do not leave a long-lived TODO.
2. **Prevent regression:** CI or ESLint `no-restricted-imports` forbidding the removed path if it could be revived.
3. **Set explicit `defaultOptions`** on the surviving `QueryClient`, even when values match library defaults—**implicit defaults are a major-version upgrade hazard** for TanStack Query.
Illustrative shape (adjust `staleTime` / `gcTime` / retry helpers to match product decisions):
```ts
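import { QueryClient } from "@tanstack/react-query";
// `isAuthError` is a project helper (not shown here): true for 401/403 responses.

const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      refetchOnWindowFocus: false,
      refetchOnReconnect: true,
      staleTime: 30_000,
      gcTime: 5 * 60_000,
      // Never retry auth failures; otherwise allow at most two retries.
      retry: (failureCount, err) => {
        if (isAuthError(err)) return false;
        return failureCount < 2;
      },
      // Exponential backoff, capped at 8 seconds.
      retryDelay: (attempt) => Math.min(1000 * 2 ** attempt, 8000),
    },
    mutations: { retry: false },
  },
});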
```
Then **`refetchInterval`**, `refetchOnWindowFocus: true`, and other overrides become **deliberate opt-ins** at the query level.
**Today's split (legacy):** `fe0/src/main.tsx` imports `fe0/src/App.tsx` (`new QueryClient()` with no defaults). `fe0/src/app/App.tsx` uses different defaults and is **not** wired from `main.tsx` until consolidated.
## 8. Polling and focus: polite defaults (frontend + backend load)
### Why `"always"` on focus is wrong for an inbox
`refetchOnWindowFocus: "always"` refetches **on every focus event regardless of staleness**. With **10s** polling, admins who tab in and out can drive **12–20+ requests/minute per tab**; many admins at start of shift create **synchronized bursts** the backend cannot absorb gracefully.
**Target:** use **`true`** (refetch only when stale) with a **sensible `staleTime`** for that query. Approvals inboxes are not trading screens; the UX difference is negligible; server load is not.
### Visibility-aware polling (default pattern, not optional)
Background tabs still run timers (throttling varies). Dashboards left open all day waste work that scales with headcount.
**Default for every polling query:** pause when the document is hidden.
```ts
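// Returns a `refetchInterval` callback: poll every `ms` while the tab is
// visible, and return `false` (no polling) while it is hidden.
function useVisibilityAwareInterval(ms: number) {
  return () => (document.visibilityState === "visible" ? ms : false);
}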
```
Use the function form of `refetchInterval` supported by TanStack Query so engineers do not re-implement this ad hoc.
### Jitter (optional smoothing)
Fixed intervals from **mount** align across users (start of shift). **±10–20% jitter** on poll delays spreads load with negligible UX impact, so it is worth adopting once concurrent admin count grows.
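
A minimal sketch of the jitter idea, assuming a uniform offset around the base interval. The helper name `withJitter`, the 15% default, and the injectable `random` parameter (there so the function can be tested deterministically) are all assumptions.

```ts
// Hypothetical helper: spread a poll delay uniformly within ±fraction of baseMs.
function withJitter(
  baseMs: number,
  fraction: number = 0.15,
  random: () => number = Math.random,
): number {
  // random() is in [0, 1), so offset is in [-fraction, +fraction).
  const offset = (random() * 2 - 1) * fraction;
  return Math.round(baseMs * (1 + offset));
}
```

Computing the delay once per mount (rather than on every tick) is usually enough to de-synchronize tabs that were all opened at the start of a shift.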
## 9. HTTP, timeouts, retries, and auth (document + implement gaps)
The happy path is documented elsewhere; **stability** requires explicit policy—**even when nothing fails in tests**.
| Concern | Risk if ignored | Target |
|--------|------------------|--------|
| **No Axios timeout** | Hung requests pile up; **10s polling** stacks in-flight work; **per-host concurrency** pins the tab; UI looks frozen. | Set **explicit timeouts** on `apiClient` (or per-route overrides for long operations). |
| **Default Query retries** | TanStack Query retries **3×** by default; a bad poll tick can **amplify load** during an outage (4 quick failures per cycle). | Align `retry` / `retryDelay` with **`defaultOptions`** (§7); cap retries on read-heavy queries. |
| **401 / 403** | **Silent loops:** auth expired → poll → 401 → retry → poll again; “dashboard broken” reports. | **Never retry** auth failures; interceptor should **logout / redirect / refresh** in one documented path—**no** infinite poll on unauthenticated sessions. |
| **Offline** | `refetchOnReconnect: true` helps, but users may see **blank** data and assume loss. | **Surface offline / reconnect** in UI where lists are empty or stale. |
Add or link implementation details in `fe0/src/shared/api/client.ts` and auth helpers as these behaviors are codified.
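
The "never retry auth failures" rule in the table can be sketched as two pure functions that a `QueryClient` could use for `retry` and `retryDelay`. The error shape (`response.status`, as Axios errors expose it) and the names `isAuthError` and `shouldRetry` are assumptions about how the project might structure this, not its actual code.

```ts
// Assumed error shape: Axios-style errors carry the HTTP status on `response`.
interface HttpErrorLike {
  response?: { status?: number };
}

// True for 401/403, which should trigger the logout/refresh path, not a retry.
function isAuthError(err: unknown): boolean {
  const status = (err as HttpErrorLike | null | undefined)?.response?.status;
  return status === 401 || status === 403;
}

// Retry policy: no retries for auth failures, at most two retries otherwise.
function shouldRetry(failureCount: number, err: unknown): boolean {
  if (isAuthError(err)) return false;
  return failureCount < 2;
}

// Exponential backoff capped at 8 seconds.
function retryDelay(attempt: number): number {
  return Math.min(1000 * 2 ** attempt, 8000);
}
```

Keeping the policy in plain functions makes it unit-testable without mounting any component, and lets the same `isAuthError` check drive the interceptor's redirect path.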
## 10. `isLoading` vs `isFetching` (UI coupling)
**Pattern problem:** wiring **`isFetching`** from a **list query** into controls that are **conceptually independent** (export, filters, “new application”, pagination) causes bugs that **localhost hides** (fast requests → flicker too quick to see) and **cloud exposes** (slow polls keep `isFetching` true → controls look “stuck refreshing”).
**Rules of thumb:**
- **`isLoading`** (no cached data yet) is usually safe for gating skeletons or first-load UI.
- **`isFetching`** should **almost never** disable user-initiated actions; use a **subtle indicator** or **local** loading only for that action (e.g. export-only state).
**Action:** audit every consumer of `["applications", ...]` (and similar list keys) for `isFetching` / `isLoading`. Consider a lint rule or review checklist: *if a button is disabled on `isFetching`, require an inline justification.*
## 11. Query key stability
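If `filters` is an **object literal created in render** (`{ status, page, q }`), its **reference changes every render**. TanStack Query hashes keys by value, so a new object with identical contents still maps to the same cache entry; the risk is contents that actually differ between renders (un-debounced search text, freshly created non-primitive values, fields derived from unrelated state). Any such change makes the key new → **extra requests**, refetch on **keystrokes**, refetch on **unrelated state** updates.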
**Mitigations:**
- **`useMemo`** for the filters object keyed by primitive fields, **or**
- **Prefer primitive keys:** `["applications", status, page, q, ...]`—verbose but **serializable** and easy to debug.
Encode the chosen rule in team TanStack Query conventions.
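
One way to encode the primitive-key rule is a single key builder that every applications query goes through. This is a sketch: the filter fields follow the `{ status, page, q }` example above, while the function name and the default values (`"all"`, page `1`, trimmed search text) are assumptions.

```ts
// Filter fields mirror the { status, page, q } example; defaults are assumed.
type ApplicationFilters = { status?: string; page?: number; q?: string };

// Build a flat, primitive-only key so equivalent filters always yield the same key.
function applicationsKey(
  f: ApplicationFilters,
): readonly [string, string, number, string] {
  return ["applications", f.status ?? "all", f.page ?? 1, (f.q ?? "").trim()];
}
```

Because missing fields are normalized to explicit defaults, `{ status: "inbox" }` and `{ status: "inbox", page: 1, q: "  " }` produce the same key, which also makes cache entries easy to read in the devtools.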
## 12. Centralize polling constants
Intervals such as `10s`, `30s`, `60s` scattered across files are hard to tune for load tests or incidents.
**Target module** (example):
```ts
// fe0/src/shared/config/polling.ts
export const POLL_INTERVALS = {
  adminInbox: 10_000,
  notificationsCount: 60_000,
  notificationsList: 60_000,
  adminOverview: 30_000,
  aiHealth: 30_000,
} as const;
```
Optionally drive values from **env** later without touching every callsite.
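
If intervals are later driven from env, a small parser with a fallback and a floor keeps a typo in configuration from turning into a hot poll loop. This is a sketch under assumptions: the helper name, the 5-second floor, and any env variable name are hypothetical.

```ts
// Hypothetical helper: read a poll interval override, falling back to the
// compiled-in default when the value is missing or not a finite number,
// and clamping to a floor so a typo cannot produce sub-second polling.
function pollIntervalFromEnv(
  raw: string | undefined,
  fallbackMs: number,
  minMs: number = 5_000,
): number {
  if (raw === undefined || raw.trim() === "") return fallbackMs;
  const parsed = Number(raw);
  if (!isFinite(parsed)) return fallbackMs;
  return Math.max(minMs, Math.round(parsed));
}
```

A call site might look like `pollIntervalFromEnv(import.meta.env.VITE_POLL_ADMIN_INBOX_MS, 10_000)`, where the variable name is an assumption for illustration only.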
## 13. Phased implementation order
Pragmatic sequencing when work must land incrementally (from stability review):
1. **First:** **One `App.tsx`**, explicit **`QueryClient.defaultOptions`**, CI/ESLint guard against the removed path (**§7**).
2. **Next:** **`isFetching` audit**; **visibility-aware** polling helper; replace admin inbox **`"always"`** with **`true` + `staleTime`** (**§8, §10**).
3. **Then:** **Centralize `POLL_INTERVALS`**; **document and implement** timeout / retry / auth behavior (**§9, §12**); verify **query key stability** (**§11**).
4. **Horizon:** **Unify admin + council** refresh: invalidation primary, **slow safety-net poll** (**§3**); **SSE** only if realtime becomes a product requirement.
## 14. Quick file map
| File | Role |
|------|------|
| `fe0/src/pages/Dashboard.tsx` | Role-based dashboard shell; wires admin inbox list. |
| `fe0/src/components/admin/ApprovedApplicationsList.tsx` | Admin `/api/applications` query; **10s** poll, focus **"always"** today—**targets in §8, §10, §11**. |
| `fe0/src/components/council/ApprovedApplicationsList.tsx` | Council list; invalidates on report sync—**unify with §3**. |
| `fe0/src/components/notifications/NotificationBell.tsx` | Unread count; **60s** polling. |
| `fe0/src/components/notifications/NotificationManager.tsx` | Notification list + invalidates unread count query. |
| `fe0/src/lib/userNotificationsApi.ts` | HTTP helper for unread count. |
| `fe0/src/shared/api/client.ts` | Axios instance; dev logging—**§6, §9**. |
| `fe0/src/App.tsx` | `QueryClientProvider` + router (**actual** entry today). |
| `fe0/src/app/App.tsx` | Alternate shell—**remove as part of §7**. |
## 15. Local machine vs cloud server (why behavior can *look* different)
**The admin inbox polling interval is not environment-specific** in code: `refetchInterval: 10s` runs the same in dev, local production builds, and cloud deploys. If the admin dashboard is open and focused, you should see the same *intent* (repeated `GET /api/applications`) everywhere.
What often *differs* is how **noticeable** that is.
### Higher latency on the cloud
On a remote host, each poll typically spends **longer** in flight. While a query is in progress, TanStack Query sets **`isFetching === true`** for that query.
- **Localhost**: UI tied to `isFetching` may **flicker too fast to see**.
- **Cloud**: the same coupling looks like a steady “refresh” problem (**§10**).
Stabilizing export used **export-only loading state** so the button does not follow list refetch; slow networks still poll the same, but the control stays calm.
### Dev vs production logging
- **Local (`vite dev`)**: success logs per response can make the **console look very busy**—often **logging**, not extra requests vs prod with the same code paths.
- **Cloud (typical production build)**: those success logs are off; use **Network** in DevTools to see polling.
### Deployment or asset skew
If the server serves an **older bundle** (cached `index.html`/`assets`, wrong image, or different branch), behavior can diverge from your laptop until deploys and caches align.
### Tab visibility and throttling
Browsers may **throttle timers** for background tabs. Testing with the dashboard tab **in the background** locally can make polls appear rarer than when the tab is **focused**. **Visibility-aware polling (§8)** makes behavior match operator expectations and reduces waste.
### How to verify locally
Open the **admin inbox**, keep the tab **focused**, wait **15–20 seconds**, and watch **Network** for repeating `GET /api/applications` (same pattern as cloud).
---
## What to preserve from the current design
- **`placeholderData: (previous) => previous`** to limit table flicker.
- **Invalidating `notifications-unread-count` after mutations** rather than waiting for the next poll.
- **A single shared `apiClient`**—work above layers policy on top of it, not a replacement.
- **Documenting local-vs-cloud differences** (latency, logging, `isFetching`) as institutional knowledge.
---
*Update this doc when `refetchInterval` / focus policies change, `App` entrypoints are consolidated, or admin/council refresh strategies are unified.*
@@ -0,0 +1,200 @@
# Data Management Review — Admin Backup Feature
Reviewing `application-files-persistence.md` from the perspective of building an admin endpoint that produces a **trustworthy, complete archive** of an initiative: all evidence attachments, the submitted PDF, and the submitted DOCX.
The persistence layer is **functional** and the documentation is unusually candid about its rough edges (best-effort MinIO upload, polymorphic `storage_uri`, three different identifiers, an unwired schema file). That candor is the right starting point. But several of those rough edges become **load-bearing** the moment you add a backup feature, because a backup is essentially a contract that says *"these bytes are what was submitted"* — and right now the system cannot honestly make that claim for one of the three artifact types you want to back up.
The notes below are ordered by impact on the backup feature, not by where they appear in the document.
---
## 1. Critical: the DOCX is not stored — a "backup" that regenerates it is not a backup
This is the single biggest issue and it blocks the feature you want to ship.
> *"The binary DOCX is not stored as a file in MinIO."* … *"For backup, … call server-side preview endpoints for each version to regenerate DOCX/PDF if you need binaries in the archive."*
Regenerating the DOCX at backup time means the bytes you hand to an admin are produced by:
- **The current template** (`template_application_form.docx`) — which will change.
- **The current `docxtpl` version** — which will change.
- **The current LibreOffice version** — which has known rendering drift across releases.
- **The current font set installed in the container** — which will change every base-image upgrade.
A backup taken in 2026 of a 2024 submission may not be byte-identical, or even visually identical, to what the user actually submitted in 2024. For an approvals workflow that may have **legal or audit weight**, this is a real problem: you cannot prove what the applicant signed off on. If a dispute arises ("I never agreed to that section"), your backup is evidence of what the system *would produce today*, not what was submitted.
**Recommendation — do this before building the backup endpoint:**
1. At submit time, render the official DOCX **and** the official PDF, hash both, and write them immutably to MinIO with their SHA-256 in object metadata and in `application_artifacts`. Treat them the same way you (mostly) treat `full_pdf` today.
2. Keep the `application_review_documents` JSON bundle. It's still useful — for re-rendering with newer templates, for diffing, for analytics. But it stops being the source of truth for what was submitted.
3. The backup endpoint then *just streams stored bytes*, never invokes LibreOffice. This also removes a slow, fragile dependency from the admin request path.
Without this fix, anything else you build is a backup of *some* of the artifacts plus a regeneration of the rest, which is a different and weaker product than what you described.
---
## 2. Critical: the submitted PDF lives in two places, and MinIO is best-effort
> *"If MinIO fails, the artifact still points at the filesystem URL only."*
This means at any moment, for any given submission, the canonical bytes of the submitted PDF live in **one of three states**:
- Filesystem only (MinIO upload failed, or feature was off).
- MinIO only (would happen if filesystem cleanup ever ran — does it?).
- Both (happy path, but with no guarantee the bytes match if anything ever rewrote one side).
A backup endpoint must handle all three, *and* it must know which to trust when both exist. The string-prefix logic in `_enrich_application_detail_full_pdf_presign` ("if it looks like a MinIO key, presign it; otherwise treat as filesystem URL") is too fragile to be the answer here.
There's also a deployment hazard hiding in the filesystem path. `SUBMITTED_INITIATIVES_DIR` defaults to `assets/submitted-initiatives` or `fe0/public/submitted-initiatives`. If that path is **not on a persistent volume** in the cloud deployment (easy to miss in a Docker setup), then container restarts silently lose data — and the artifact row still points at a now-404 URL. Your backup endpoint would happily produce a ZIP with a missing file and an admin would not know until they tried to open it.
**Recommendations:**
1. **Make MinIO upload synchronous and required at submit time.** If MinIO is down, fail the submission and let the user retry — don't silently degrade to a single-host filesystem copy. The current "best effort" pattern trades a visible error today for an invisible data-loss event later.
2. **Treat the filesystem location as a cache, not a store.** It's fine to keep it for the dev-mode static-file flow, but it should never be the only copy.
3. **Audit `SUBMITTED_INITIATIVES_DIR` mounting in every environment** before backup ships. If it's not on a persistent volume, fix it.
4. **Backfill** existing filesystem-only PDFs into MinIO with a one-time job. After that, every artifact row should resolve to a MinIO key.
---
## 3. High: `storage_uri` is polymorphic and parsed by string prefix
> *"`storage_uri` is either a **MinIO key** (under exports bucket) or a **relative URL** to static files"* … *"when `full_pdf.storage_uri` looks like a **MinIO key** (not `/submitted-initiatives` or `http`), …"*
Detecting storage type by string-shape is the kind of code that works for a year and then breaks the day someone changes a URL prefix, deploys behind a CDN, or introduces a second exports bucket. It also makes the backup logic harder to reason about: every code path that reads `storage_uri` re-implements the same brittle dispatch.
**Recommendation:** add an explicit `storage_kind` column (enum: `minio_exports`, `minio_attachments`, `filesystem`, `external_url`) to `application_artifacts`, populate it on every write, and dispatch on it. The migration is small. The clarity is permanent. Once #2 above is done, you'd expect almost all rows to be `minio_exports`, but the column lets you migrate confidently and lets the backup endpoint fail loudly when it sees something it doesn't know how to handle.
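
The dispatch-on-column idea can be sketched as an exhaustive switch. The backend is Python, so this TypeScript version only illustrates the shape. The `storage_kind` values come from the recommendation above; the type names, the `initiative-attachments` bucket name, and the returned location shape are assumptions.

```ts
// storage_kind values as proposed above; everything else here is hypothetical.
type StorageKind = "minio_exports" | "minio_attachments" | "filesystem" | "external_url";

interface ArtifactRow {
  storage_kind: StorageKind;
  storage_uri: string;
}

type ArtifactLocation =
  | { via: "minio"; bucket: string; key: string }
  | { via: "file"; path: string }
  | { via: "http"; url: string };

// Dispatch on the explicit column instead of guessing from the URI's shape.
function resolveLocation(row: ArtifactRow): ArtifactLocation {
  switch (row.storage_kind) {
    case "minio_exports":
      return { via: "minio", bucket: "initiative-exports", key: row.storage_uri };
    case "minio_attachments":
      // Bucket name assumed for illustration.
      return { via: "minio", bucket: "initiative-attachments", key: row.storage_uri };
    case "filesystem":
      return { via: "file", path: row.storage_uri };
    case "external_url":
      return { via: "http", url: row.storage_uri };
    default:
      // Fail loudly on a value this code does not know how to handle.
      throw new Error(`Unknown storage_kind: ${String(row.storage_kind)}`);
  }
}
```

The `default` branch is what lets a backup fail loudly on an unexpected row rather than silently treating it as a filesystem URL.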
---
## 4. High: integrity is recorded but apparently not verified on read
`application_artifacts.sha256` is stored. Nothing in the document mentions verifying it when the file is read back. For a backup endpoint, this needs to be **mandatory**:
- Compute SHA-256 while streaming bytes into the ZIP.
- Compare against the recorded value.
- If it mismatches, fail the entire backup loudly and log it as a P1 — silent corruption is worse than a missing file.
- Include the verified SHA in the manifest (see #10).
This is cheap to implement and turns a passive integrity field into an active guarantee. It also catches MinIO storage corruption, accidental object overwrites, and bugs that double-encode bytes — all of which have precedent in real systems.
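
The verify-while-streaming steps above can be sketched with Node's built-in `crypto` module. The backend is Python, so this is an illustration of the technique rather than the project's code; `verifyChunks` and its return shape are hypothetical, and an in-memory array of chunks stands in for the MinIO object stream.

```ts
import { createHash } from "node:crypto";

// Hash chunks as they pass through and compare against the recorded digest.
// In a real endpoint each chunk would also be written to the ZIP stream here.
function verifyChunks(
  chunks: Uint8Array[],
  recordedSha256: string,
): { ok: boolean; actual: string; bytes: number } {
  const hash = createHash("sha256");
  let bytes = 0;
  for (let i = 0; i < chunks.length; i++) {
    const chunk = chunks[i];
    if (chunk === undefined) continue;
    hash.update(chunk);
    bytes += chunk.byteLength;
  }
  const actual = hash.digest("hex");
  // Hex digests are compared case-insensitively.
  return { ok: actual === recordedSha256.toLowerCase(), actual, bytes };
}
```

Returning `actual` alongside `ok` gives the caller the verified digest to record in the manifest, and the mismatching value to log when the check fails.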
---
## 5. High: three identifiers and scan-based resolution
> *"`get_application_by_id` … scans submitted initiatives and matches when either `_submission_display_id(...) == applicationId`, or `initiative.case_code == applicationId`."*
A linear scan is fine at hundreds of records, fragile at thousands, and broken at tens of thousands. It's also racy — `_submission_display_id` is a derived value, so two submissions with subtle metadata differences could in principle collide.
For the backup feature this matters more than usual because:
- Admins often want **bulk** backups (a date range, a status, an owner). Bulk × scan = N².
- The endpoint is admin-facing, so slowness is less likely to be reported as "the app is broken" and more likely to silently get worse.
**Recommendations:**
1. Add a `submission_public_id` column (the `sub-...` value) on `initiatives`, indexed and unique. Compute it once at submit time, store it, never re-derive.
2. Change `get_application_by_id` to a single indexed lookup against either `submission_public_id` or `case_code`.
3. Document the resolution order explicitly. Right now the contract — "admins can deep-link with `sub-…` or sometimes `CASE-…`" — has a "sometimes" in it, which is exactly the kind of language that produces support tickets.
---
## 6. High: dead schema in `database/schema.sql`
> *"The root file `database/schema.sql` describes a separate **integer `applications`** domain (attachments table with `application_id` INT); that schema is **not** wired into `be0` today."*
This file will trick at least one future engineer into writing code against the wrong schema. It will also confuse any backup-related tooling, since "applicationId" in that file means an integer and in the running system means a `sub-...` string.
**Recommendation:** delete it, or move it to `database/unused/` with a `README.md` explaining why it's there. Don't leave authoritative-looking schema files next to the real ones. If there's a reason it can't be deleted (historical reference, planned migration), say so in a header comment at the top of the file.
---
## 7. Medium: evidence files have no apparent versioning
> *"One row per **`(initiative_id, role)`** … `research_evidence` | `textbook_evidence` | `technical_evidence`"*
If a user uploads research evidence, then re-uploads a corrected version before submitting, what happens to the old file? Three possibilities, all bad if not addressed:
- The MinIO object is overwritten — old bytes gone, no audit trail of the change.
- A new MinIO object is written but the old one is orphaned — storage grows forever, and nothing references the old bytes.
- The old `application_artifacts` row is updated in place — Postgres has no record that a previous version existed.
For backup integrity, this matters because **what gets reviewed and what gets archived may not be the same thing**. A reviewer might approve based on version 2 of an evidence file, but if version 3 was uploaded post-review, your backup would archive version 3.
**Recommendation:** decide explicitly. Either:
- Make evidence uploads append-only (new row per upload, old rows marked superseded), and have the backup capture the version that was current at submit time, *or*
- Document clearly that only the latest evidence is archived and that re-uploads after review are not tracked. Then add a UI guardrail preventing re-upload after review-locked status.
The first option is more work and almost certainly the right call for an approvals system.
---
## 8. Medium: `application_submit_snapshots.full_pdf_uri` can disagree with the artifact
> *"`full_pdf_uri` (today this records the **URL passed at submit time**, typically `/submitted-initiatives/...`, not necessarily the MinIO key)."*
So the snapshot table holds one URL form and the artifacts table can hold another. Which is canonical for the backup? The doc implies artifacts wins, but this isn't enforced anywhere. If the two ever drift, debugging "why does the snapshot say one thing and the backup contains another" will be painful.
**Recommendation:** treat `application_artifacts` as the single source of truth for "where the bytes live" and treat `application_submit_snapshots` as an immutable audit log of "what the request looked like at submit time." Document this distinction. Don't read `full_pdf_uri` from the snapshot for any operational purpose, including backup — it's history, not state.
---
## 9. Medium: MinIO bucket policy is not described
The document covers what's stored where, but not:
- **Versioning**: are buckets MinIO-versioned? If not, an accidental overwrite is unrecoverable.
- **Object lock / WORM**: for an approvals system with audit requirements, write-once on `initiative-exports` would protect against silent tampering, including from an admin with bucket credentials.
- **Lifecycle**: does anything age out? If retention rules apply (GDPR-style, contractual), the backup endpoint is exactly where they'll be tested.
- **Encryption at rest**: SSE config?
- **Backup of MinIO itself**: who backs up the backups?
These are not blockers for shipping the feature, but they're questions an auditor will ask the day after you ship it. Better to have answers in writing now. A short `MINIO_OPERATIONS.md` covering bucket policies, retention, and disaster recovery would close most of these in an afternoon.
---
## 10. Medium: backup endpoint design — stream, manifest, audit
A few concrete design points for the endpoint itself, since the doc's outline is sparse:
- **Stream the ZIP, never buffer it.** A single submission might have a few hundred MB of evidence. Buffering in memory will OOM the API container under modest concurrency. Use `zipstream-ng` (Python) or equivalent and write directly to the response.
- **Include a `manifest.json` at the root** of the ZIP, containing: `applicationId`, `case_code`, `initiative_id`, submitted-at timestamp, owner, status at backup time, list of files with their roles, original filenames, MIME types, byte sizes, recorded SHA-256, and *verified* SHA-256 (computed during streaming). The manifest is what makes the ZIP a self-describing archive rather than a folder of mystery bytes.
- **Use a clear directory structure inside the ZIP**, e.g. `submitted/full.pdf`, `submitted/full.docx`, `evidence/research/...`, `evidence/textbook/...`, `evidence/technical/...`, `manifest.json`. Avoid Vietnamese filenames at the top level — preserve them inside `manifest.json` and use ASCII-safe `{role}-{n}-{sha-prefix}.{ext}` on disk so Windows and older zip tools don't choke on UTF-8.
- **For long downloads or bulk exports, switch to an async job pattern.** Admin requests a backup → server creates a job → job streams the ZIP into `initiative-exports` (or a dedicated `initiative-backups` bucket) → admin gets a presigned URL when ready. This isolates long-running work from the request lifecycle and survives reverse-proxy timeouts. For single-initiative backups a synchronous endpoint is fine; for "back up everything from Q1" it isn't.
- **Audit log every backup download.** Who, when, which `applicationId`, IP, user-agent, bytes streamed, success/failure. This is admin access to user-submitted content — it should be at least as well-logged as any other privileged action.
- **Consider a "verify-only" mode** that re-downloads from MinIO, recomputes SHAs, and reports discrepancies without producing a ZIP. Cheap to implement once #4 is in place, very useful for periodic data-integrity audits.
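
The manifest fields and the ASCII-safe `{role}-{n}-{sha-prefix}.{ext}` naming listed above could be sketched as follows. The interface and function names, the 8-character SHA prefix, and the `bin` fallback extension are assumptions; the field list follows the manifest bullet.

```ts
// One manifest entry per archived file; field names are illustrative.
interface ManifestFile {
  role: string;
  archivePath: string;       // ASCII-safe name used inside the ZIP
  originalFilename: string;  // preserved verbatim, may contain non-ASCII text
  mimeType: string;
  byteSize: number;
  recordedSha256: string;
  verifiedSha256: string;    // computed while streaming
}

// Build the in-archive name: {role}-{n}-{sha-prefix}.{ext}, ASCII only.
function archiveName(role: string, n: number, sha256: string, originalFilename: string): string {
  const dot = originalFilename.lastIndexOf(".");
  const rawExt = dot >= 0 ? originalFilename.slice(dot + 1).toLowerCase() : "";
  // Fall back to "bin" when the extension is missing or not plain ASCII.
  const ext = /^[a-z0-9]{1,8}$/.test(rawExt) ? rawExt : "bin";
  const safeRole = role.toLowerCase().replace(/[^a-z0-9]+/g, "-");
  return `${safeRole}-${n}-${sha256.slice(0, 8)}.${ext}`;
}
```

Including the SHA prefix in the name means two evidence files with the same role and index can never be confused after extraction, while the original Vietnamese filename stays recoverable from `manifest.json`.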
---
## 11. Low: the quarantine bucket is undocumented
> *"`initiative-quarantine` (`S3_BUCKET_QUARANTINE`) — Reserved for quarantine flows (not detailed here)"*
If files can land in quarantine (presumably for AV scanning or content review), the backup endpoint needs a defined behavior for them: include and label, exclude entirely, or fail the backup until they're cleared. Pick one and document it. Otherwise this becomes the kind of edge case discovered by a real incident.
---
## What's already good
To be fair to the existing design — several decisions in this document are correct and worth preserving:
- **SHA-256 captured at upload time** is the right foundation; it just needs to be actively verified on read.
- **Append-only `application_submit_snapshots`** is exactly the right shape for an audit table. Don't ever let it become mutable.
- **Separate buckets per concern** (`attachments`, `exports`, `quarantine`) makes lifecycle and access policies straightforward.
- **`application_review_documents` storing JSON** is genuinely useful for re-rendering and analytics — the issue isn't that it exists, it's that it's currently being asked to also serve as the source of truth for "what was submitted," which is a job for stored bytes.
- **Public/internal endpoint split for MinIO** is the right pattern for presigned URLs. Most teams get this wrong on the first try.
- **Documenting the dual-identifier confusion explicitly** is rare and valuable. Don't lose this institutional knowledge.
---
## Suggested order of work
A pragmatic sequencing if the team can only do this incrementally:
1. **Week 1 — unblock the backup feature's core promise.** Persist the rendered DOCX and official PDF as immutable bytes in MinIO at submit time, with SHA-256. This is what makes "backup" actually mean backup. Until this lands, do not ship the backup endpoint — you'll have to break the contract later when you fix it.
2. **Week 2 — make storage canonical.** Make MinIO upload synchronous and required for the submitted PDF. Add `storage_kind` to `application_artifacts`. Backfill any filesystem-only rows. Verify `SUBMITTED_INITIATIVES_DIR` is on a persistent volume in every environment, or stop relying on it.
3. **Week 3 — build the endpoint.** Streaming ZIP, manifest with verified SHAs, audit log, async job pattern for bulk. Single-initiative download first; bulk later.
4. **Following sprint — clean up the foundations.** Delete or quarantine `database/schema.sql`. Add `submission_public_id` indexed column and remove the scan-based lookup. Decide and document evidence versioning. Write `MINIO_OPERATIONS.md`.
5. **Quarter horizon — harden.** Periodic verify-only sweeps. MinIO bucket versioning and object lock if compliance requires. Backup of the MinIO itself (off-cluster).
The order matters: shipping the endpoint before #1 produces a backup that lies about what it contains, and that's worse than not having a backup at all — admins will rely on it, and you'll find out it's wrong only when something goes wrong elsewhere.
@@ -0,0 +1,207 @@
# Frontend Stability Review — Dashboard Data Refresh
Reviewing `fe0-dashboard-data-refresh-architecture.md` from the perspective of stability, predictability, and operability. The current design is **reasonable in spirit** (polling for "good enough" freshness, invalidation where it's natural) but has several **latent footguns** that will eventually cause incidents or flakiness as the user base and feature set grow.
The notes below are ordered by stability impact, not by where they appear in the document.
---
## 1. Critical: two `App.tsx` entrypoints with divergent QueryClient defaults
This is the single most dangerous thing in the document.
> *"There is also `fe0/src/app/App.tsx`, which configures **different** defaults (`refetchOnWindowFocus: false`, `staleTime: 5 minutes`)... If you ever switch entrypoints, global refetch behavior would change without touching feature code."*
This is a **silent global config switch** waiting to be flipped. A junior engineer cleaning up imports, a refactor that "consolidates the App shell," or a bad rebase can change every cache and refetch policy in the app without a single line of feature code being modified. Worse, the per-query `refetchInterval: 10s` would still appear correct, masking the change. PR review will not catch this — diff readers don't reason about which `App.tsx` `main.tsx` resolves to.
**Action — do this first, before anything else on this list:**
1. Pick one entrypoint. Delete the other in the same PR. Do not leave a TODO.
2. Add a CI check or ESLint rule (`no-restricted-imports`) that forbids importing the other path, in case anyone restores it.
3. The surviving file should set **explicit** `defaultOptions` on the `QueryClient`, even if the values match library defaults. Implicit defaults are a version-upgrade hazard — TanStack Query has changed defaults across major versions before. You do not want a `pnpm up` to silently change refetch semantics.
```ts
const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      refetchOnWindowFocus: false,
      refetchOnReconnect: true,
      staleTime: 30_000,
      gcTime: 5 * 60_000,
      retry: (failureCount, err) => {
        if (isAuthError(err)) return false; // never retry 401/403
        return failureCount < 2;
      },
      retryDelay: (attempt) => Math.min(1000 * 2 ** attempt, 8000),
    },
    mutations: { retry: false },
  },
});
```
Then per-query options (`refetchInterval`, `refetchOnWindowFocus: "always"`, etc.) become **deliberate opt-ins**, which is what they should be.
---
## 2. Critical: `refetchOnWindowFocus: "always"` on the admin inbox
`"always"` means *refetch on every focus event regardless of staleness*. Combined with 10s polling, the practical effect is:
- An admin with the tab focused: ~6 requests/min from polling.
- An admin who tabs in/out a few times per minute: polling **plus** every focus refetch, easily 12 to 20 requests/min per tab.
- N admins doing this at 9am Monday: a synchronized burst that the backend has no way to absorb gracefully.
`"always"` is rarely the right answer. It exists for cases where stale data is genuinely dangerous (e.g. a trading screen). An approvals inbox is not that. Use `true` (refetch only if stale) and pair it with a sensible `staleTime`. The user experience difference will be imperceptible; the server load difference will not.
---
## 3. High: polling is not visibility-aware
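A background tab can still poll: browsers throttle timers but do not stop them, and some don't throttle aggressively when audio/SSE/etc. is active. Whether the timer actually triggers a request depends on configuration. TanStack Query's `refetchIntervalInBackground` defaults to `false`, which skips interval refetches while the page is hidden, so first confirm that no polling query sets it to `true` and that no polling runs outside TanStack Query. Where background polling does happen on a dashboard people leave open all day in a second tab, it is pure waste, and waste that scales linearly with headcount.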
The document already mentions the fix as a "common direction":
```ts
refetchInterval: (query) =>
document.visibilityState === "visible" ? 10_000 : false,
```
This should not be in the "if you want fewer calls" section. It should be the **default pattern** for every polling query in the app, captured in a small wrapper hook so engineers don't have to remember:
```ts
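// Despite the name this uses no React hooks; it only builds a refetchInterval callback.
// The callback ignores its query argument, so no Query type import is needed.
function useVisibilityAwareInterval(ms: number) {
  return () => (document.visibilityState === "visible" ? ms : false);
}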
```
---
## 4. High: `isFetching` is leaking into unrelated UI
> *"Stabilizing export used export-only loading state so the button does not follow list refetch"*
The fact that this fix was needed tells me the codebase has a **pattern problem**, not just one buggy button. Anywhere `isLoading` / `isFetching` from a list query is wired into a control that is conceptually independent (export, filters, "new application" button, pagination), the same bug exists — it just hasn't been noticed because dev latency hides it.
**Recommendation:** audit every consumer of the `["applications", ...]` query for `isFetching` / `isLoading` usage. As a rule:
- `isLoading` (true only on first load, no cached data) is *usually* fine to gate UI on.
- `isFetching` (true on every refetch including background polls) should **almost never** disable user-initiated controls. It is for showing a subtle indicator at most.
Consider a lint rule or a code review checklist item: "If you're disabling a button on `isFetching`, justify it in a comment."
This bug profile — "works locally, broken in cloud" — is exactly the kind of thing that erodes trust in the frontend. Slow networks shouldn't reveal latent bugs; they should reveal honest slowness.
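The rule above can be made concrete with a small pure function. This is a sketch under assumptions: `ListQueryFlags`, `ControlState`, and `deriveControlState` are hypothetical names, and `exportInFlight` stands for whatever export-only loading state the export button already tracks.

```typescript
// Subset of the flags a list query result exposes; only what this sketch needs.
interface ListQueryFlags {
  isLoading: boolean;  // first load, no cached data yet
  isFetching: boolean; // any request in flight, including background polls
}

interface ControlState {
  showSkeleton: boolean;
  showRefreshHint: boolean;
  exportDisabled: boolean;
}

// exportInFlight is the export action's own state, tracked separately from the list query.
function deriveControlState(list: ListQueryFlags, exportInFlight: boolean): ControlState {
  return {
    // Gate the initial render on the first load only.
    showSkeleton: list.isLoading,
    // A background refetch gets a subtle indicator and nothing more.
    showRefreshHint: list.isFetching && !list.isLoading,
    // The export button depends only on the export itself, never on list refetches.
    exportDisabled: exportInFlight,
  };
}
```

Keeping the mapping in one function gives code review a single place to check that no user-initiated control is wired to `isFetching`.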
---
## 5. High: query key stability for `["applications", filters]`
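Not in the document, but worth checking. TanStack Query hashes query keys by value, so an object literal rebuilt in render (`{ status, page, q }`) with unchanged contents does not by itself trigger a refetch. The risk is a key whose *contents* change more often than intended: an undebounced search string, a timestamp or other per-render value, or a non-serializable field inside `filters`. Symptoms: requests fire on keystrokes, on unrelated state changes, etc.
Mitigations:
- Build the filters object in one place (debounced text, explicit defaults) and memoize it with `useMemo` so the same values feed both the key and the request, *or*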
- Pass primitive fields directly into the key: `["applications", status, page, q]`. This is more verbose but eliminates a whole class of bugs and makes the key serializable for debugging.
If this isn't already enforced, add it to the team's TanStack Query conventions doc.
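One way to enforce the primitive-key convention is a single key builder that every applications query goes through. A minimal sketch, assuming a hypothetical `ApplicationFilters` shape and hypothetical defaults (`"all"`, page `1`); the real filter fields may differ.

```typescript
// Hypothetical filter shape; adjust to the fields the list endpoint really accepts.
interface ApplicationFilters {
  status?: string;
  page?: number;
  q?: string;
}

// Flatten filters into primitives with explicit defaults, so equivalent filters
// always produce the same key and the key is easy to read while debugging.
function applicationsKey(filters: ApplicationFilters) {
  return [
    "applications",
    filters.status ?? "all",
    filters.page ?? 1,
    (filters.q ?? "").trim(),
  ] as const;
}
```

A call such as `applicationsKey({ status, page, q: debouncedQ })` then becomes the only way a key for this list is built.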
---
## 6. Medium: inconsistent refresh strategy between admin and council
Admin uses **time-based polling**. Council uses **event-driven invalidation via `reportSyncQuery`**. These are two different mental models for the same surface (an "approved applications list"), maintained in two different files.
This costs the team in three ways:
1. **Cognitive load** — every engineer has to learn both patterns and remember which file uses which.
2. **Bug asymmetry** — a bug fixed in one file will not be fixed in the other. Already visible: the `isFetching` export issue likely needs to be checked in both.
3. **Drift** — over time the two implementations diverge in subtle ways (column order, filter semantics, error handling) and the union of behaviors becomes the de facto contract, which nobody actually documented.
Pick one strategy and apply it everywhere applications are listed. My recommendation:
- **Primary**: invalidation on mutations (`approve`, `reject`, `submit`, `assign`) plus invalidation on a lightweight "version" or "report sync" signal.
- **Secondary**: a slow safety-net poll (60 to 120s, visibility-aware) so a missed invalidation doesn't leave the screen permanently stale.
This gets you near-instant updates after user actions (which is what people actually notice) and removes the 10s drumbeat against the backend.
If true realtime ever becomes a product requirement, SSE is the natural next step — *one* long-lived connection per tab is dramatically cheaper than 6 polls/minute, and it integrates cleanly behind the existing `apiClient`. WebSockets are overkill unless the server needs to push high-frequency updates.
---
## 7. Medium: no documented retry, timeout, or auth-failure policy
The document covers the happy path thoroughly but says nothing about:
- **Timeouts** — Axios has no default timeout. A hung backend will leave requests pending indefinitely, which compounds with 10s polling: stuck requests stack up, the browser hits its per-host concurrency limit, and the UI appears frozen.
- **Retries** — TanStack Query retries 3× by default. For a 10s poll, that means a transient 500 produces 4 requests in quick succession before the next poll, amplifying load exactly when the backend is already struggling.
- **401/403** — does the response interceptor force logout, redirect, or attempt refresh? Silent 401 loops (auth expired → poll → 401 → poll → 401) are a classic source of "the dashboard is broken" reports that turn out to be expired sessions.
- **Network offline** — `refetchOnReconnect: true` is good, but is there UI feedback during the offline window? Otherwise users see "blank inbox" and assume data loss.
Add an explicit section to the architecture doc covering these. They are stability features even when they don't fire.
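As a starting point for that section, the retry and auth-failure bullets can be written down as one decision function. This is a sketch, not the existing interceptor's behavior: `FailureAction`, `onRequestFailure`, and `retryDelayMs` are hypothetical names, and the choice to log out on 401 and to give up on other 4xx responses is an assumption the team would need to confirm.

```typescript
type FailureAction = "logout" | "retry" | "give-up";

// Decide what to do after a failed request.
// status is the HTTP status, or undefined for network errors and timeouts.
function onRequestFailure(status: number | undefined, failureCount: number): FailureAction {
  // Expired session: stop polling and re-authenticate instead of looping on 401.
  if (status === 401) return "logout";
  // Other client errors (403, 404, 422, ...) will not fix themselves on retry.
  if (status !== undefined && status >= 400 && status < 500) return "give-up";
  // 5xx, network error, or timeout: retry at most twice.
  return failureCount < 2 ? "retry" : "give-up";
}

// Exponential backoff capped at 8 seconds: 1s, 2s, 4s, 8s, 8s, ...
function retryDelayMs(attempt: number): number {
  return Math.min(1000 * 2 ** attempt, 8000);
}
```

For the timeout bullet, Axios accepts a `timeout` option in milliseconds on the instance or per request, which bounds how long a hung request can stay pending.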
---
## 8. Medium: dev logging interceptor
Logging every successful response in dev is fine, but two small upgrades:
1. **Sample or summarize** rather than logging full `data` payloads. Large list responses make the console unusable and hide the actual interesting events. `console.debug` (silenced by default in DevTools) is often a better choice than `console.log`.
2. **Guard against accidental production leakage** with a build-time assertion, not just `import.meta.env.DEV`. A misconfigured deploy that ships dev mode to production would log user data into browser consoles — minor but real privacy/compliance concern.
---
## 9. Low: polling intervals are scattered magic numbers
`10s`, `30s`, `30s`, `60s`, `60s` across five files. None are configurable. None share a constant. If product asks "can we double polling intervals across the board for a load test?", the answer requires touching five files and hoping you found them all.
Centralize:
```ts
// fe0/src/shared/config/polling.ts
export const POLL_INTERVALS = {
  adminInbox: 10_000,
  notificationsCount: 60_000,
  notificationsList: 60_000,
  adminOverview: 30_000,
  aiHealth: 30_000,
} as const;
```
Bonus: makes it trivial to make them env-configurable later, and gives ops a single place to tune under load.
---
## 10. Low: thundering herd / no jitter
All polls fire at fixed intervals from mount time. With many admins logging in around the same wall-clock window (start of shift, morning standup), poll cycles tend to align. Adding ±10 to 20% jitter to intervals smooths backend load with zero UX cost:
```ts
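// ±10% jitter; pick the value once per mount so re-renders don't restart the poll timer.
const jitter = (ms: number) => ms * (0.9 + Math.random() * 0.2);
const adminInboxInterval = useMemo(() => jitter(10_000), []); // useMemo from "react"
refetchInterval: adminInboxInterval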
```
Worth doing once you have more than a handful of concurrent admins.
---
## Suggested order of work
A pragmatic sequencing if the team can only do this incrementally:
1. **This week** — Delete one of the two `App.tsx` files. Set explicit `QueryClient` defaults. (Item 1.)
2. **Next sprint** — Audit `isFetching` consumers; introduce visibility-aware polling helper; replace `"always"` with `true`. (Items 2, 3, 4.)
3. **Following sprint** — Centralize polling intervals; document retry/timeout/auth policy and implement gaps; verify query key stability. (Items 5, 7, 9.)
4. **Quarter horizon** — Converge admin and council on one refresh strategy (invalidation primary, slow poll as safety net). Consider SSE if/when realtime becomes a product ask. (Item 6.)
---
## What's already good
To be fair to the existing design — the document describes several decisions that are correct and worth preserving:
- `placeholderData: (previous) => previous` to avoid table flicker is the right call.
- Invalidating `notifications-unread-count` after mutations rather than waiting for the next poll tick is exactly the pattern more of the app should follow.
- A single shared `apiClient` is a good foundation; the issues above are about what to put on top of it, not about replacing it.
- Documenting local-vs-cloud behavioral differences (latency hiding `isFetching` bugs) is the kind of institutional knowledge that usually only lives in people's heads. Keep doing this.
The bones are fine. The work is mostly about removing implicit behavior, consolidating duplicated patterns, and making polling polite.
@@ -0,0 +1,43 @@
# HTTPS for MinIO (presigned URLs and mixed content)
If the SPA is served over **`https://`**, the browser blocks embedding or opening **`http://…:19000`** presigned MinIO URLs (mixed active content).
The built-in evidence viewer proxies through **`GET /api/v1/application-drafts/…/evidence/content`** so previews work even when MinIO itself is not served over HTTPS.
To restore **working direct presigned URLs** (new-tab open, integrations, downloads that bypass the API):
1. **Terminate TLS on a hostname that points at MinIO's S3 port** (`MINIO_API_PORT`, default **19000**), e.g. `https://minio-api.example.com` → nginx → `127.0.0.1:19000`.
2. **Set the same public base URL** in `.env` (no trailing slash), then restart Compose so **`be0`** and **`minio`** pick it up:
| Variable | Role |
|---------|------|
| **`S3_PUBLIC_ENDPOINT_URL`** | Host used when **`be0`** signs presigned GET/PUT URLs (must match what the browser uses). |
| **`MINIO_SERVER_URL`** | MinIO server URL advertised to clients (console / redirects). Should match **`S3_PUBLIC_ENDPOINT_URL`** for the S3 API host. |
| **`MINIO_BROWSER_REDIRECT_URL`** | Optional HTTPS URL for the **console** if you terminate TLS separately (default remains `http://${PUBLIC_HOST}:${MINIO_CONSOLE_PORT}`). |
`docker-compose.prod.yml` wires:
- **`S3_PUBLIC_ENDPOINT_URL=${S3_PUBLIC_ENDPOINT_URL:-http://${PUBLIC_HOST}:${MINIO_API_PORT}}`**
- **`MINIO_SERVER_URL=${MINIO_SERVER_URL:-http://${PUBLIC_HOST}:${MINIO_API_PORT}}`**
Example `.env` after nginx + certificate:
```bash
S3_PUBLIC_ENDPOINT_URL=https://minio-api.example.com
MINIO_SERVER_URL=https://minio-api.example.com
```
3. **`proxy_set_header Host $http_host`** on nginx must preserve the **`Host`** the client sent — AWS Signature V4 on presigned URLs is bound to host + path.
4. **Operational hardening**: after nginx fronts MinIO publicly, bind the Docker publish to **`127.0.0.1:${MINIO_API_PORT}:9000`** so only nginx can reach bare HTTP on that port from outside.
5. **CORS**: on **community MinIO**, configure **`MINIO_API_CORS_ALLOW_ORIGIN`** on the **`minio`** service (comma-separated origins, or `*` for dev). Per-bucket **`mc cors set`** is **AiStor-only** and will fail with “not implemented” on the OSS image.
## Example nginx config
See **[deploy/nginx/minio-s3-proxy.conf.example](../deploy/nginx/minio-s3-proxy.conf.example)**.
## Stack diagram (prod)
- **Browser** → `https://minio-api…` → **nginx (TLS)** → `http://127.0.0.1:19000` → **MinIO**
- **be0** → `http://minio:9000` (Compose network) unchanged for server-side uploads and streaming.
@@ -0,0 +1,185 @@
# OTP-verified registration: workflow & state machine (engineering guide)
This document describes **email OTP verification during signup** end-to-end: **frontend**, **backend**, **database**, and **when object storage (e.g. MinIO) is relevant**. It is written so another team can implement the same pattern in a **new** application; it aligns with the reference implementation in this repo (`be0/src/auth_api.py`, `be0/src/auth_mail.py`, `fe0/src/auth/registration/`, migrations `013_*` / `014_*`).
---
## 1. Goals & threat model (what we optimize for)
| Goal | How |
|------|-----|
| Prove control of the inbox before full account use | `users.email_verified = false` until OTP succeeds; **login denied** until verified. |
| Avoid storing OTP plaintext | Persist **only a hash** of the OTP (same idea as password storage). |
| Limit brute-force on OTP | Cap **failed attempts** per pending code; use **constant-time** compare on hash. |
| Limit abuse of “resend” | Rate-limit **per email** and **per IP** (see §7). |
| Avoid email enumeration on resend | Return **same JSON envelope** whether the address exists or needs OTP (reference pattern). |
**MinIO / S3:** OTP delivery does **not** use object storage. Include MinIO in your design doc **only** if registration itself uploads files (avatars, documents); keep OTP and mail independent of buckets.
---
## 2. Account-level state machine (`users`)
After successful `POST /auth/register`, the server creates an **active** user with **`email_verified = false`**. Verification flips the flag; login is allowed only when **`email_verified = true`**.
```mermaid
stateDiagram-v2
[*] --> NoAccount: visitor
NoAccount --> RegisteredUnverified: POST /register OK\n(password hashed, user row created)
RegisteredUnverified --> Verified: POST /verify-otp OK\n(email_verified := true)
RegisteredUnverified --> RegisteredUnverified: POST /resend-otp\n(new OTP row, old pending superseded)
RegisteredUnverified --> NoAccount: optional admin deactivate/\ndelete user (app-specific)
Verified --> [*]: normal user lifecycle\n(login, JWT, etc.)
note right of RegisteredUnverified
Login with correct password
returns 403 until verified
(do not leak OTP validity)
end note
```
**Engineering rules:**
- **Register** must be **atomic**: create user + staff/profile as needed + insert OTP row + commit before sending email (or you risk orphan users without OTP).
- **Login** when password OK but `email_verified` is false: return **403** with a message that tells the user to complete OTP / resend (reference copy mentions OTP resend on registration flow).
---
## 3. OTP row lifecycle (`registration_otp_codes`)
Each issued OTP is one row. Only **one logical “pending”** code per user is typical: **issue** deletes previous unused rows for that user, then inserts a new row.
```mermaid
stateDiagram-v2
[*] --> Pending: issue OTP\n(plaintext only in memory/email)
Pending --> Consumed: verify OK\nused_at := now,\nuser.email_verified := true
Pending --> InvalidWrongCode: wrong OTP\nfailed_attempts += 1
InvalidWrongCode --> InvalidWrongCap
InvalidWrongCap --> Pending: failed_attempts < max
InvalidWrongCap --> [*]: failed_attempts >= max\n(row ignored for verify)
Pending --> Superseded: resend or re-register issue\n(DB deletes unused rows first)
Pending --> Expired: now > expires_at\n(row ignored for verify)
Consumed --> [*]
state InvalidWrongCap <<choice>>
```
**Reference schema** (`014_registration_otp.sql`):
- `user_id` → `users(id)`
- `otp_hash` — never store plaintext OTP
- `expires_at` — server-side TTL (e.g. env-configurable minutes, clamped to a sane max)
- `failed_attempts` — increment on mismatch; **reject verify** when ≥ max (e.g. 5)
- `used_at` — non-null means consumed
**Verify query logic (conceptual):**
1. Resolve user by **normalized email**, active, **`email_verified` still false** — otherwise reject with **generic** error (same message as bad OTP if you want strict anti-enumeration on verify; reference uses a uniform rejection detail).
2. Select **latest** pending row where `used_at IS NULL`, `expires_at > now()`, `failed_attempts < max`.
3. Compare `hash(submitted_otp)` to `otp_hash` with **constant-time** comparison (`hmac.compare_digest` in Python).
4. On match: set `user.email_verified = true`, `row.used_at = now()`, audit log, commit.
5. On mismatch: increment `failed_attempts`, commit, reject.
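The five steps above can be sketched in TypeScript for teams porting the pattern to a Node stack. This is an illustration, not the reference implementation in `be0/src/auth_api.py`: `OtpRow`, `hashOtp`, and `verifyOtp` are hypothetical names, the row lives in memory instead of a database, and plain SHA-256 is an assumption made for brevity (the reference hash scheme is not shown here). Node's `timingSafeEqual` plays the role of Python's `hmac.compare_digest`.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// In-memory stand-in for one registration_otp_codes row.
interface OtpRow {
  otpHash: string;        // hex SHA-256 of the OTP; plaintext is never stored
  expiresAt: number;      // epoch milliseconds
  failedAttempts: number;
  usedAt: number | null;
}

const MAX_FAILED_ATTEMPTS = 5; // example cap, matching the "e.g. 5" above

function hashOtp(otp: string): string {
  return createHash("sha256").update(otp, "utf8").digest("hex");
}

// Returns true and consumes the row on a match; otherwise counts the failure.
function verifyOtp(row: OtpRow, submitted: string, now: number): boolean {
  const pending =
    row.usedAt === null &&
    row.expiresAt > now &&
    row.failedAttempts < MAX_FAILED_ATTEMPTS;
  if (!pending) return false; // used, expired, or capped rows are ignored

  const expected = Buffer.from(row.otpHash, "hex");
  const actual = Buffer.from(hashOtp(submitted), "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  const match = expected.length === actual.length && timingSafeEqual(expected, actual);

  if (match) {
    row.usedAt = now; // a real handler also sets email_verified in the same transaction
  } else {
    row.failedAttempts += 1;
  }
  return match;
}
```

A 6-digit code has only one million possible values, so a production port would likely prefer a keyed hash (HMAC with a server secret) over bare SHA-256; the control flow stays the same.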
---
## 4. Frontend state machine (wizard UX)
The SPA typically uses **local UI states**, not the server's DB state diagram:
```mermaid
stateDiagram-v2
[*] --> Form: land on registration
Form --> Form: client validation\n(email domain, password policy, etc.)
Form --> Otp: POST /register 2xx\nemailVerificationRequired === true
Form --> Form: 4xx/5xx show rejection
Otp --> Success: POST /verify-otp OK
Otp --> Otp: verify 4xx clear inputs/\nshow generic error
Otp --> Otp: POST /resend-otp\n(+ cooldown timer UX)
Success --> Login: navigate / user signs in
```
**Contract hints for engineers:**
- Read **`otpTtlSeconds`** from register response and drive countdown (stay in sync with server TTL; avoid hardcoding if backend can change TTL via env).
- **`otpDeliveryChannel`** (or equivalent): `smtp` | `log_only` | `none` | `smtp_failed` — drives toasts/help text (SMTP missing, dev log-only, SMTP error).
- **Resend UX cooldown** on the client is **supplementary**; server **must** enforce rate limits independently.
- OTP input: **exactly 6 digits** if matching this API; paste-friendly fields improve UX.
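The wizard diagram above maps onto a small transition function. This is a sketch and not the code in `RegistrationWithOtp.tsx`: the state and event names are hypothetical, and staying on the form when `emailVerificationRequired` is false is an assumption, since the diagram only covers the `true` case.

```typescript
type WizardState = "form" | "otp" | "success";

type WizardEvent =
  | { type: "REGISTER_OK"; emailVerificationRequired: boolean }
  | { type: "REGISTER_FAILED" }
  | { type: "VERIFY_OK" }
  | { type: "VERIFY_FAILED" }
  | { type: "RESEND" };

// Pure transition function: the same inputs always yield the same next state.
function nextState(state: WizardState, event: WizardEvent): WizardState {
  if (state === "form") {
    // Only a successful register that requires verification advances to the OTP step.
    return event.type === "REGISTER_OK" && event.emailVerificationRequired ? "otp" : "form";
  }
  if (state === "otp") {
    // Failed verification and resend both stay on the OTP step.
    return event.type === "VERIFY_OK" ? "success" : "otp";
  }
  return "success";
}
```

Keeping the transitions pure makes them easy to unit-test separately from the countdown timer and resend cooldown, which are UI concerns layered on top.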
---
## 5. Backend API surface (minimal)
| Method | Path | Purpose |
|--------|------|---------|
| `POST` | `/auth/register` | Create user with `email_verified=false`; issue OTP; attempt email delivery; return `{ otpTtlSeconds, otpDeliveryChannel, emailVerificationRequired, ... }`. |
| `POST` | `/auth/verify-otp` | Body: `{ email, otp }` — normalized email + `\d{6}` OTP; on success set verified. |
| `POST` | `/auth/resend-otp` | Body: `{ email }` — enumeration-safe constant response; **429** if rate-limited. |
| `POST` | `/auth/login` | Reject **403** when password OK but email not verified (reference behavior). |
**Optional legacy path in this repo:** `POST /auth/verify-email` with **magic-link token** (`email_verification_tokens`). New apps should prefer **either** OTP **or** magic link consistently to reduce confusion.
---
## 6. Email delivery (no MinIO)
Outbound mail is **SMTP** or **dev log**:
- **Production:** `SMTP_HOST`, `SMTP_USER`, `SMTP_PASSWORD`, `SMTP_PORT`, `SMTP_USE_TLS`, `AUTH_MAIL_FROM`, optional `AUTH_PUBLIC_WEB_ORIGIN` for links in other mails.
- **Dev:** `AUTH_MAIL_LOG_ONLY=1` — log plaintext OTP (**never** enable in production).
If SMTP fails **after** user creation, return a distinct delivery channel (e.g. `smtp_failed`) so the client can explain “account created but email failed”; ops fix SMTP and user uses **resend**.
---
## 7. Rate limiting & scaling caveats
Reference implementation uses **in-process** buckets (`auth_rate_limit.py`): e.g. **5 resends per email per hour** and **30 per IP per hour** (rolling window).
**Greenfield checklist:**
- Single replica: in-memory limits are simple.
- Multiple API workers / pods: use **Redis** (or API gateway limits) for shared counters; tune limits per product policy.
- Always rate-limit **resend** harder than **verify** if verify is already capped by `failed_attempts`.
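A single-process rolling-window limiter of the kind described above can be sketched as follows. This is a TypeScript illustration, not a port of `auth_rate_limit.py`: `RollingWindowLimiter` and `allowResend` are hypothetical names, and the limits simply reuse the example figures of 5 per email and 30 per IP per hour.

```typescript
// In-memory rolling-window limiter; suitable for a single process only.
class RollingWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly hits = new Map<string, number[]>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the hit if the key is under its limit; false means respond 429.
  tryAcquire(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have left the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}

const HOUR_MS = 60 * 60 * 1000;
const perEmail = new RollingWindowLimiter(5, HOUR_MS);
const perIp = new RollingWindowLimiter(30, HOUR_MS);

// The IP bucket is checked first; the email bucket is only touched if the IP passes.
function allowResend(email: string, ip: string, now: number = Date.now()): boolean {
  return perIp.tryAcquire(ip, now) && perEmail.tryAcquire(email.trim().toLowerCase(), now);
}
```

With several workers or pods each process would hold its own `Map`, which is exactly why the checklist moves the counters to Redis or a gateway in that case.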
---
## 8. Database prerequisites
1. **`users.email_verified`** — `BOOLEAN NOT NULL DEFAULT FALSE` (migration `013_email_verification.sql` pattern).
2. **`registration_otp_codes`** — table as in §3 (`014_registration_otp.sql` pattern).
3. **Indexes** — partial index on pending OTP by `user_id` speeds lookup.
**MinIO:** no migration needed for OTP. Add bucket policies only if registration uploads binaries.
---
## 9. Observability & security checklist
- **Audit:** log verification success and resend with actor metadata (no OTP plaintext in audit).
- **Logs:** mail failures at `register` / `resend` with exception detail for ops; avoid logging hashed OTP.
- **HTTPS:** registration and OTP over TLS; HttpOnly cookies if using cookie-based sessions.
- **CORS:** allow credentials only for trusted front-end origins.
- **JWT:** issue tokens **after** login; do not put OTP or `email_verified=false` users into a “logged-in” state unless your product explicitly wants partial tokens (reference does not).
---
## 10. Reference file map (this repository)
| Layer | Location |
|-------|----------|
| Register / verify / resend / login gate | `be0/src/auth_api.py` |
| SMTP + `AUTH_MAIL_LOG_ONLY` | `be0/src/auth_mail.py` |
| Resend rate limits | `be0/src/auth_rate_limit.py` |
| OTP table migration | `be0/migrations/014_registration_otp.sql` |
| `email_verified` + magic-link tokens | `be0/migrations/013_email_verification.sql` |
| Registration UI + delivery UX | `fe0/src/auth/registration/RegistrationWithOtp.tsx` |
| TTL/cooldown constants (keep aligned) | `fe0/src/auth/registration/constants.ts` |
| API client | `fe0/src/lib/auth-service.ts` |
---
## 11. Summary one-liner for architects
**OTP registration is a small state machine on top of `users.email_verified` plus short-lived hashed OTP rows, sent over SMTP (not object storage), with gated login and abuse limits on resend—implement the same invariants even if frameworks differ.**
@@ -0,0 +1,93 @@
# Research-project proposals + PI cockpit (frontend_investigator)
_Added 2026-06-14. A new parallel domain (independent of the sáng-kiến / initiative flow): a Principal
Investigator submits a research-project proposal (Thuyết minh đề tài, Mẫu III.06-TM.ĐTUD), an admin
approves it, then the PI manages the project via a "cockpit" (members / datasets / models / assets /
tiến độ + audit)._
## Shape (one-line)
PI app `frontend_investigator` → fill proposal → submit → **admin approves in `frontend_admin`** →
cockpit unlocks (auto-seeded from the proposal) → PI manages entities. Owner + admin authz; every
mutation writes an append-only audit row.
## Backend (`be0/`)
- **Migration `016_research_projects.sql`** — 7 tables:
- `research_projects` — the aggregate root. **The proposal row IS the project** across its lifecycle
(`status`: `draft → submitted → approved | rejected`). `content JSONB` holds the whole proposal
form blob; a few **extracted scalars** (`title/level/pi_name/period_months/budget_total`) are columns
for listing/overview. `code` (mã số) is null until approved. Review fields: `reviewed_by/at/note`.
- `research_project_{members,datasets,models,assets,milestones}` — normalized child entity tables
(FK → project, `ON DELETE CASCADE`), columns mirror the cockpit artifact's fields. Milestones use
`start_period`/`end_period` (the JSON keys `start`/`end` map to them).
- `research_project_audit` — append-only (BIGSERIAL; actor + action + subject + detail).
- Registered in **4 places** (the stage-together rule): both compose files' `docker-entrypoint-initdb.d`
+ `apply_initiative_migrations.py` (`_needs_research_projects_migration` + guarded block) + the `.sql`.
- **`src/research_routes.py`** — router mounted at `/api/v1/research/*` (now the **5th** extracted router:
auth · admin-audit · admin-user-profiles · templates · **research**). Endpoints:
- Proposals: `POST/GET /projects`, `GET/PUT/DELETE /projects/{id}`, `POST /projects/{id}/submit`,
`POST /projects/{id}/approve` (admin), `POST /projects/{id}/reject` (admin), `GET /projects/{id}/audit`.
- Cockpit entities: `GET/POST /projects/{id}/entities/{entity}`, `PUT/DELETE …/{item_id}`,
`GET /projects/{id}/cockpit` (one-shot bundle: project + 5 entity lists + recent audit).
- **Generic config-driven CRUD** (`_ENTITY_CONFIG`): one whitelist `_apply_fields` maps json_key→column
for all 5 entities (no column injection — sensitive cols absent from configs).
- **Authz (v1) = owner OR platform admin.** Read hides others' rows with 404; submit/update are
owner-only + draft-only; approve/reject are admin-only + submitted-only; entity mutations require the
project to be **approved** (409 otherwise = "cockpit unlocks on approval"). `credential_version` is
enforced by the global `main.py` middleware (routers use bare `decode_access_token_user_id`).
- **Seed-on-approve** — `_seed_cockpit_from_proposal` populates members (chủ nhiệm + thư ký +
thành viên) + milestones (tiến độ) from `content` so the cockpit opens "according to the proposal".
Best-effort + idempotent.
- Auth model: a **PI = any authenticated user who owns a project** (no new system role). Admin via the
existing allow-list mechanism (`AUTH_ADMIN_EMAILS` env, else the built-in default list in `auth_api.py`).
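The whitelist idea behind the config-driven CRUD can be illustrated in a few lines. This is a TypeScript sketch of the technique only, not the Python `_apply_fields` in `research_routes.py`: `MILESTONE_FIELDS` and `applyFields` are hypothetical, and `title` is an assumed field; only the `start`/`end` to `start_period`/`end_period` mapping comes from the notes above.

```typescript
// Hypothetical whitelist for one entity: JSON key -> column name.
const MILESTONE_FIELDS: Record<string, string> = {
  title: "title",
  start: "start_period",
  end: "end_period",
};

// Copy only whitelisted keys from the request payload into a column map.
// Iterating the whitelist (not the payload) means unknown keys can never become columns.
function applyFields(
  payload: Record<string, unknown>,
  allowed: Record<string, string>,
): Record<string, unknown> {
  const columns: Record<string, unknown> = {};
  for (const [jsonKey, column] of Object.entries(allowed)) {
    if (Object.prototype.hasOwnProperty.call(payload, jsonKey)) {
      columns[column] = payload[jsonKey];
    }
  }
  return columns;
}
```

A payload that tries to set `project_id` or any other sensitive column is silently ignored, because that key is absent from the entity's config.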
## Frontend
- **`frontend_investigator/`** — new Vite/React/TS app cloned from `frontend_user` (same design system by
construction: `index.css` + tailwind tokens, Inter/Merriweather, shared shadcn primitives, auth flow,
dashboard shell). Dev `:8083`→container `5175` (static IP `10.5.0.7`); prod `FE_INV_PORT`. Pages:
`ProjectsListPage` (Đề tài của tôi), `ProposalFormPage` (schema-driven form), `CockpitPage`.
Monorepo wiring: root `workspaces` += `frontend_investigator`; **all 4 Dockerfiles** COPY the new
manifest (npm ci/install resolves); both compose files mount it into the FE services.
- **`shared/src/lib/researchApi.ts`** — typed client for `/api/v1/research/*` (+ barrel export); used by
both `frontend_investigator` and `frontend_admin`.
- **Proposal form** (`components/proposal/`): `proposalSchema.ts` (the Mẫu III.06-TM.ĐTUD schema +
buildInitial/shouldShow/collectMissing) + `ProposalFormFields.tsx` (renderer on shared shadcn) +
`ProposalFormPage.tsx` (load/save-draft/submit; read-only for non-draft; approved → cockpit).
- **Cockpit** (`components/cockpit/`): `cockpitConfig.ts` (ENTITIES field defs + status tones) +
`CockpitWidgets.tsx` (Badge/Stat/Bar/EntityCard/EntityDrawer) + `CockpitPage.tsx` (bundle via TanStack
Query, tabbed Overview/entities/audit, CRUD mutations + cache invalidation, owner/admin-gated).
- **`frontend_admin`** — `ResearchReviewPage.tsx` (route `/research`, nav "Thẩm định đề tài", admin-gated):
submitted-proposals queue + detail dialog (content read-only) + approve (assign mã số) / reject (note).
## Tests
- `be0/tests/test_research_routes.py`: **7 tests** (3 pure-unit scalar extraction + 4 DB integration:
lifecycle, reject/authz, entity CRUD+coercion+seeding+approved-gate+audit+bundle, malformed-content
seeding). Run: `docker exec be0 sh -lc 'cd /app && python -m unittest tests.test_research_routes'`.
- FE: `npm run typecheck` (×4 workspaces clean) + `npm run build -w frontend_investigator|frontend_admin`.
## Gotchas hit this session (blameless RCA)
1. **Migration COMMENT with semicolons** (016 v1) → the naive SQL splitter in
`apply_initiative_migrations.py` splits on `;` even inside string literals → `unterminated quoted
string`, the COMMENT failed (tables still created, since they commit before it). _5-why:_ the COMMENT
body contained `; ` separators. _Fix:_ rewrote without semicolons (periods) + stripped accents.
_Guard:_ already a documented rule (CLAUDE.md + reviewer memory) — keep COMMENT bodies semicolon/accent-free.
2. **Seeder crash on malformed content** (P2, reviewer-caught P1) — `for x in c.get(key) or []` only
guards falsy; a truthy non-list (PI-controlled JSONB, e.g. `{"tienDoThucHien": 5}`) → `TypeError`
500 on approve. _Fix (double-fault: code + test):_ `v = c.get(key); rows = v if isinstance(v, list) else []`
+ regression test `test_seeding_survives_malformed_content`.
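
To make gotcha 1 concrete, here is a minimal sketch of why splitting on every `;` breaks a COMMENT body, next to a quote-aware alternative. Both function names are hypothetical and this is not the code in `apply_initiative_migrations.py`; the sketch only tracks single-quoted literals and ignores dollar-quoting, SQL comments, and backslash escapes.

```python
def split_sql_naive(script: str) -> list[str]:
    """Split on every semicolon, including those inside string literals."""
    return [part.strip() for part in script.split(";") if part.strip()]


def split_sql_quote_aware(script: str) -> list[str]:
    """Split only on semicolons that sit outside single-quoted literals."""
    statements: list[str] = []
    current: list[str] = []
    in_string = False
    for ch in script:
        if ch == "'":
            # A doubled quote ('') toggles twice, so the state stays correct.
            in_string = not in_string
        if ch == ";" and not in_string:
            statement = "".join(current).strip()
            if statement:
                statements.append(statement)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


SCRIPT = "CREATE TABLE t (id int); COMMENT ON TABLE t IS 'step one; step two';"
naive_parts = split_sql_naive(SCRIPT)        # the COMMENT is cut inside its literal
safe_parts = split_sql_quote_aware(SCRIPT)   # the COMMENT stays in one piece
```

The fix recorded above (keeping COMMENT bodies semicolon-free) avoids touching the splitter at all; a quote-aware splitter would be the alternative if that rule ever becomes impractical.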
## Commits (local `main`)
`63e8bec` P1 schema+lifecycle · `c10ce1b` P2 entity APIs+seeding · `b561db4` P3 scaffold ·
`d3e7daf` P4 form · `8d186a6` P5 cockpit · `93cf6bf` P6 admin review · `b80cb64` admin allow-list.
## Eval / run
- Admin account provisioned for E2E: `ththinh@ump.edu.vn` (added to the default policy-admin list).
- Dev: PI app `localhost:5175`, admin `localhost:5174` (host vite servers, proxy `/api`→be0:4402).
- **Not deployed** (push ≠ deploy; no CI). Prod needs `scripts/deploy-prod.sh` + `FE_INV_PORT` set, and
a deploy of the new `frontend_investigator` service.
## Follow-ups (not done)
- Bundle code-split (frontend_investigator ~1.6 MB — shared-deps bloat, same as frontend_user).
- Richer admin proposal view (currently a humanized key:value dump — moving `proposalSchema` into
`@ump/shared` would let the admin reuse the labeled renderer).
- Per-member RBAC (v2) — link cockpit members to real accounts + enforce the 5 project roles server-side.
- A `file-replace`/attachments story if proposals need evidence files (none yet).
@@ -0,0 +1,262 @@
# INITIATIVE DESCRIPTION REPORT
> **Form No. 01** (Mẫu số 01)
---
**MINISTRY OF HEALTH**
**UNIVERSITY OF MEDICINE AND PHARMACY AT HO CHI MINH CITY**
---
| Field | Content |
|--------|----------|
| **Initiative title (Vietnamese)** | Thành lập Tổ thẩm định cấp đơn vị trong công tác xét sáng kiến cải tiến kỹ thuật năm 2026 của Đại học Y Dược TP.HCM (in English: establishing a unit-level appraisal team for the 2026 technical-improvement initiative review at the University of Medicine and Pharmacy at Ho Chi Minh City) |
| **Author / author group** | Phòng Khoa học Công nghệ (Science and Technology Office), Đại học Y Dược TP.HCM |
| **Unit** | Phòng Khoa học Công nghệ |
| **Contact information** | Phòng Khoa học Công nghệ, Đại học Y Dược TP.HCM |
| **Year** | 2026 |
---
## 1. Introduction
Every year, the University of Medicine and Pharmacy at Ho Chi Minh City (ĐHYD TP.HCM) receives hundreds of technical-improvement initiative applications from staff and employees across the university. The traditional review process, built on **paper dossiers** and **email exchanges**, has shown serious limitations:
- **Inconsistency:** Each unit uses a different version of the Word form, which leads to formatting errors, wrong years, and broken table structures on submission.
- **Poor traceability:** Paper dossiers and email attachments are scattered; there is no central system for looking up submission history, review status, or document versions.
- **Lack of transparency:** Scoring, council feedback, and recognition decisions are not recorded systematically, which complicates audits and makes fairness harder to guarantee.
- **Time cost:** Consolidating dossiers, checking completeness, and converting formats consumes hundreds of hours of manual work in each review round.
- **Risk of data loss:** USB files, unencrypted email, and the absence of a backup mechanism leave evidence at risk of loss or forgery.
Given this situation, building a **digital initiative management system** with a fully digitized **unit-level appraisal team** is necessary to improve the quality, speed, and transparency of initiative review at ĐHYD TP.HCM.
---
## 2. Initiative name (name of the process, solution, or method)
**Online technical-improvement initiative management system (SKI Platform)**: a platform that digitizes the end-to-end process of registering, appraising, and recognizing unit-level initiatives at Đại học Y Dược TP.HCM.
---
## 3. Field of application
Administrative reform, education management, and the application of information technology in university administration.
---
## 4. Description of the initiative
### 4.1. State of known solutions, or current practice before the initiative
Before SKI Platform, initiative review at ĐHYD TP.HCM was entirely manual:
**Advantages of the old process:**
- Familiar to staff; no technology training required.
- Flexible direct exchange between applicants and the council.
**Serious drawbacks:**
| Problem | Description |
|--------|-------|
| **Inconsistent forms** | In each review round, staff edit the original Word file themselves, which distorts tables, fonts, and the table-of-contents structure. |
| **No preview** | Applicants do not know what the printed dossier looks like until they submit the hard copy. |
| **Manual consolidation** | Phòng KHCN must open each Word file, check completeness, and compile the list by hand in Excel. |
| **No version control** | Drafts cannot be told apart from official submissions; applicants email multiple versions, which causes confusion. |
| **Scattered evidence** | Evidence PDFs and images arrive via USB, Zalo, and email; there is no central repository. |
| **No audit log** | Nothing records who viewed, edited, or approved which dossier, or when. |
| **Security risk** | Evidence files are not virus-scanned, carry no integrity hash, and have no quarantine mechanism. |
### 4.2. Content of the solution proposed for recognition as an initiative
#### Purpose of the initiative
Build and deploy the **SKI Platform** software to:
- Digitize the whole process, from drafting and submission through appraisal to recognition.
- Guarantee strict consistency of forms with the prescribed templates (Forms No. 01-04 and the Commitment Statement).
- Provide a transparent, traceable channel for the full lifecycle of an initiative dossier.
#### Content of the initiative
##### Implementation steps
**Step 1: Build the dynamic form system (Dynamic Template Engine)**
The system uses three cooperating modules:
- `build_template.py`: generates the DOCX template containing Jinja2 placeholders (`{{ trang_bia.ten_sang_kien }}`, `{{ mau_01.mo_dau }}`...) with the `python-docx` library, with precise control over each table cell, the Times New Roman font, margins, and paragraph spacing.
- `fill_application_form.py`: takes JSON data from the online form, fills the DOCX template through `docxtpl`, and produces the finished Word file in memory.
- `docx_normalize.py` (1,300+ lines of source): an in-depth OOXML normalizer that handles more than 15 anomalies that appear when DOCX is rendered in the browser, including: forcing Times New Roman throughout, aligning the "BỘ Y TẾ / ĐẠI HỌC Y DƯỢC" heading, soft line breaks in the letterhead, shrinking text to avoid column overflow, handling justified paragraphs with soft breaks, removing Form No. 04 from the applicant's submission copy, and normalizing the signature table.
**Step 2: Dual PDF export pipeline (Dual PDF Pipeline)**
The system offers two parallel ways to produce a PDF:
- *Server-side:* converts DOCX → PDF with headless LibreOffice, which gives the highest fidelity to the printed Word document. The API endpoint `POST /api/v1/docx/convert-pdf` serves the official PDF download.
- *Client-side fallback:* uses the chain `docx-preview` → paginated HTML `<section>` → `html2canvas` → `jsPDF` to build the PDF in the browser when the server is unavailable. A dedicated set of CSS modules (`applicationFormDocxPreview.css`, `docxTableReflow.ts`, `docxJustifyMitigationCss.ts`) handles browser-specific rendering issues.
**Step 3: In-browser document preview (Real-time DOCX Preview)**
Applicants preview the Word dossier in the web interface without installing Microsoft Word. The system renders DOCX directly with `docx-preview`, with custom CSS patches to:
- Keep Times New Roman throughout the document.
- Fix stretched text in justified paragraphs that contain soft breaks.
- Correct table row heights (`trHeight`) that `docx-preview` handles incorrectly.
- Color user-entered content to distinguish it from template text.
**Step 4: Digital evidence management (Evidence Management via MinIO)**
All evidence files (PDFs, images, documents) are stored in MinIO object storage (S3-compatible):
- Three buckets separated by function: `initiative-attachments` (primary evidence), `initiative-exports` (exports), `initiative-quarantine` (quarantine for suspicious files).
- A SHA-256 hash per file ensures data integrity.
- Presigned URLs let the browser download or view files directly without proxying through the backend.
- An allow-list of MIME types and a file-size limit.
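
The per-file SHA-256 integrity check in Step 4 can be sketched as below. This is an illustration under assumptions, not the platform's code: the helper names `sha256_of_stream` and `matches_recorded_hash` are hypothetical, and an in-memory stream stands in for an object fetched from MinIO.

```python
import hashlib
import hmac
import io
from typing import BinaryIO


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file-like object in chunks so large uploads never sit fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_hash(stream: BinaryIO, recorded_hex: str) -> bool:
    """Compare a fresh hash with the one stored at upload time."""
    return hmac.compare_digest(sha256_of_stream(stream), recorded_hex)


# At upload time: record the hash next to the object key.
recorded = sha256_of_stream(io.BytesIO(b"example evidence bytes"))
# Later: re-hash the stored object and compare.
still_intact = matches_recorded_hash(io.BytesIO(b"example evidence bytes"), recorded)
```

Hashing in chunks matters mostly for large PDFs and scans; the comparison itself is a plain equality check on two hex strings.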
**Step 5: Role-based authorization and multi-layer authentication**
The system assigns permissions by clearly defined role:
| Role | Permissions |
|---------|-----------|
| **Applicant (viewer)** | Draft dossiers, upload evidence, preview PDFs, use the AI assistant, submit dossiers. |
| **Council (editor)** | View submitted dossiers, evaluate against Form 04, comment. |
| **Administrator (admin)** | Manage users, export lists, view the audit log, back up and restore data. |
Multi-layer security:
- Passwords hashed with Argon2id.
- JWT HS256 authentication tokens, invalidated after a password change (`credential_version`).
- Organizational email verification (only `@ump.edu.vn` addresses are accepted).
- A 6-digit OTP sent by email for new registrations.
- Login rate limiting to counter brute-force attacks.
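
The `credential_version` mechanism can be pictured with the sketch below. It assumes the token's signature has already been verified elsewhere and that the version travels in a claim called `cv`; that claim name, the `User` class, and both function names are hypothetical, not taken from the backend.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    credential_version: int


def token_is_current(claims: dict, user: User) -> bool:
    """Accept a (signature-verified) token only if it carries the user's current version."""
    return claims.get("sub") == str(user.id) and claims.get("cv") == user.credential_version


def change_password(user: User) -> None:
    """Bumping the version invalidates every token issued before the change."""
    user.credential_version += 1


user = User(id=42, credential_version=3)
claims = {"sub": "42", "cv": 3}          # token issued before the password change
valid_before = token_is_current(claims, user)
change_password(user)
valid_after = token_is_current(claims, user)
```

The appeal of this design is that no token blacklist is needed: one integer comparison per request is enough to reject stale sessions.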
**Step 6: Comprehensive audit trail (Audit Trail)**
Every action in the system is recorded in the `audit_events` table (append-only, no deletes or updates):
- Action types: `create`, `read`, `update`, `delete`, `login`, `logout`, `login_failed`.
- Before/after payloads are recorded so that changes can be traced in full.
- Nested savepoints in PostgreSQL ensure that an audit failure does not affect the main transaction.
- The admin interface supports filtering, sorting, and viewing each event in detail.
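
The savepoint idea in Step 6 can be sketched as follows. SQLite from the Python standard library stands in for PostgreSQL here, and the table layout, the `applications` table, and the function name are assumptions for illustration; the real backend uses its own schema and driver.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE applications (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE audit_events (id INTEGER PRIMARY KEY, "
    "action TEXT CHECK (action IN ('create', 'read', 'update', 'delete')))"
)
conn.execute("INSERT INTO applications VALUES (1, 'Old title')")


def update_title(conn: sqlite3.Connection, title: str, audit_action: str) -> bool:
    """Apply the main change; return whether the audit row was also written."""
    conn.execute("BEGIN")
    conn.execute("UPDATE applications SET title = ? WHERE id = 1", (title,))
    conn.execute("SAVEPOINT audit_write")
    try:
        conn.execute("INSERT INTO audit_events (action) VALUES (?)", (audit_action,))
        audited = True
    except sqlite3.IntegrityError:
        # Undo only the audit attempt; the UPDATE above stays pending.
        conn.execute("ROLLBACK TO SAVEPOINT audit_write")
        audited = False
    conn.execute("RELEASE SAVEPOINT audit_write")
    conn.execute("COMMIT")
    return audited
```

The trade-off is deliberate: the user's action succeeds even when the audit write fails, so a failed audit insert should at least be logged or alerted on elsewhere.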
**Step 7: AI compliance assistant (AI Compliance Assistant)**
A large language model (LLM) is integrated through Ollama to:
- Answer questions about initiative procedures and regulations through a chat interface.
- Check a dossier's level of compliance using embedding analysis and keyword matching (RAKE + cosine similarity).
- Help applicants complete their dossiers without direct consultation from the legal department.
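
The cosine-similarity part of the compliance check can be illustrated with the small sketch below. It is an assumption-laden stand-in: simple term-frequency vectors replace the real embeddings, no RAKE keyword extraction is performed, and the function names are hypothetical.

```python
import math
from collections import Counter


def term_vector(text: str) -> Counter:
    """Lower-cased whitespace tokens as a sparse term-frequency vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 when either vector is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


requirement = term_vector("describe novelty and effectiveness of the initiative")
section = term_vector("the initiative improves effectiveness of dossier review")
score = cosine_similarity(requirement, section)  # between 0.0 and 1.0 for these inputs
```

With real embeddings the vectors are dense floats rather than counters, but the formula is the same; a threshold on the score would then decide whether a dossier section covers a requirement.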
**Step 8: Systematic storage and backup**
- PostgreSQL 16 stores all structured data: drafts, official submissions, council feedback, and personnel records.
- JSONB for `officialBieuMau` preserves the original form structure, so the DOCX can be regenerated at any time.
- Version snapshots: a clear distinction between draft and submission (official filing).
- Backup/restore API for administrators.
##### Prerequisites for applying the solution
- A Linux server with Docker and Docker Compose (PostgreSQL 16, MinIO, Python 3.11+, Node.js 18+).
- An internal or public domain name with an SSL certificate for HTTPS access.
- An organizational SMTP email account for sending OTPs and notifications.
- (Optional) Headless LibreOffice on the server for high-quality PDF export.
- (Optional) An Ollama server for the AI assistant feature.
##### Scope of application
All units under Đại học Y Dược TP.HCM: faculties, offices, centers, institutes, and teaching hospitals.
##### Results obtained
- Successfully deployed at Phòng Khoa học Công nghệ, ĐHYD TP.HCM, for the 2026 initiative review round.
- 100% of dossiers submitted through the system follow the standard format and the prescribed forms.
- Online PDF preview lets applicants check their own dossier before submitting, which reduces the share of returned dossiers.
- Time to compile the applicant list dropped from several days to a few minutes thanks to the admin interface.
- All evidence is stored centrally, is retrievable, and carries an integrity hash.
##### Units/individuals that took part in the trial or first application
| No. | Organization/individual | Address | Field of application |
|----|----------------------|---------|----------------------------|
| 1 | Phòng Khoa học Công nghệ | ĐHYD TP.HCM | Initiative dossier management |
| 2 | Staff submitting initiatives | Units under ĐHYD TP.HCM | Registering and tracking initiative dossiers |
#### Novelty of the initiative
SKI Platform introduces several new elements compared with existing solutions at ĐHYD TP.HCM and at comparable educational institutions:
1. **In-depth OOXML normalizer.** This is the core differentiator. Rather than only filling data into a Word template (which many systems can do), SKI Platform handles more than 15 XML-level anomalies in the DOCX file so that it renders accurately in the browser. No commercial solution is designed specifically for ĐHYD TP.HCM's set of initiative forms.
2. **Dual PDF pipeline with automatic fallback.** It combines server-side LibreOffice with a client-side pipeline (docx-preview + html2canvas + jsPDF), so applicants can always produce a PDF regardless of server status. Automatic fallback is a feature not yet seen in other university administrative management systems.
3. **Integrated compliance AI.** Using an LLM (through Ollama) and embedding techniques to help applicants check dossier compliance against regulations is a pioneering feature for initiative management at medical and pharmaceutical institutions.
4. **Tiered storage architecture.** A clear separation between structured data (PostgreSQL JSONB), evidence files (MinIO S3), and exports (PDF/DOCX), with SHA-256 hashing and a quarantine bucket, is far superior to sending files by email and USB.
5. **Organization-bound authentication.** The email domain gate (`@ump.edu.vn`), OTP, and credential versioning ensure that only ĐHYD TP.HCM staff can access the system, and old login sessions are invalidated automatically when a password changes.
#### Effectiveness
Effectiveness gained by applying the initiative, compared with the previous manual process:
**+ Economic benefit:**
Lower printing costs for paper dossiers (on average 50-100 pages per dossier × hundreds of dossiers per round). Saves manual labor equivalent to 2-3 dedicated staff in each review round (consolidation, checking, format conversion).
**+ Benefit to teaching:**
Frees up time for teaching staff: instead of spending hours drafting Word forms by hand, lecturers concentrate on the content of the initiative. The guided interface and the AI assistant support first-time applicants.
**+ Higher labor productivity:**
Time to complete a dossier falls significantly thanks to auto-filled forms, instant PDF preview, and automatic completeness checks. Phòng KHCN compiles lists and dossier status in minutes rather than days.
**+ More efficient work:**
The submission → appraisal → recognition process is digitized end to end on a single platform, which eliminates file transfer by email and USB as well as manual consolidation.
**+ Higher quality of work and service:**
100% of exported dossiers comply with the prescribed forms (Forms No. 01-04 and the Commitment Statement) thanks to the automatic OOXML normalizer. This eliminates the "wrong year", "wrong font", and "broken table" errors that are common when Word files are edited by hand.
**+ Lower costs:**
Eliminates printing, photocopying, and physical storage costs. Reduces administrative staffing costs for checking and consolidating dossiers.
**+ Better environment and conditions for study, work, and living:**
Less paper use contributes to the university's "green office" goal. Applicants can submit and track dossiers anytime, anywhere through a web browser without visiting Phòng KHCN in person.
**+ Health protection:**
Minimizes contact with physical paperwork, in line with the post-pandemic context and the need for remote work.
**+ Occupational safety and fire prevention (PCCC):**
A smaller volume of physically stored paper dossiers lowers the fire risk from document archives. Digital data is backed up automatically and does not depend on the physical condition of the archive.
**+ Improved capability, skills, awareness, and responsibility:**
The audit log raises accountability: every action is recorded, which fosters a culture of transparency in initiative review. The AI assistant helps staff improve their understanding of initiative regulations without depending on direct consultation.
---
## 6. Information to be kept confidential (if any)
- System source code and security configuration (JWT key, SMTP details, MinIO/PostgreSQL credentials).
- Applicants' personal data (email, national ID (CCCD), contact details) is stored encrypted and accessed only according to role permissions.
---
## Signatures
| | |
|---|---|
| **HEAD OF UNIT** | **Ho Chi Minh City, day .... month .... 2026** |
| *(Signature and full name)* | **Lead author / Representative of the author group** |
| | *(Signature and full name)* |
| | |
| ...................................... | ...................................... |
---
## Technical appendix: System architecture
| Layer | Technology |
|------|-----------|
| **Frontend** | React 18, TypeScript, Vite, Tailwind CSS, shadcn/ui, TanStack Query |
| **In-browser DOCX/PDF** | docx-preview, html2canvas, jsPDF |
| **Backend** | Python 3.11+, FastAPI, Pydantic, SQLAlchemy 2 (async + asyncpg) |
| **Authentication** | Argon2-cffi, PyJWT (HS256), SMTP OTP |
| **Object storage** | MinIO (S3-compatible), aioboto3 |
| **Database** | PostgreSQL 16 (citext, JSONB, enum) |
| **AI** | Ollama, LangChain, NLTK/RAKE, scikit-learn |
| **Office conversion** | LibreOffice headless (DOCX → PDF) |
| **Infrastructure** | Docker Compose: fe0, be0, postgres, minio |
@@ -0,0 +1,173 @@
# Security incident — rcc-ump.com (2026-05-27)
**Status:** Remediation in progress (code fixes tracked below)
**Scope:** Production exposure at `https://www.rcc-ump.com` / `https://rcc-ump.com`
**Related prior audit:** [assets/docs/2026-05-21-security-review.md](../assets/docs/2026-05-21-security-review.md)
---
## Executive summary
Public screenshots and `curl` tests show the production site was serving the **Vite development server**, not a built SPA. That exposes full TypeScript source, `import.meta.env` values (including internal Docker hostnames), stack traces, and HMR internals. Combined with backend misconfiguration (default JWT secret, unauthenticated API routes), this created a path from **reconnaissance → data theft → admin takeover → arbitrary file write**.
Treat the VPS as **compromised until forensics prove otherwise**. Rotate credentials and redeploy with the fixes in this document before bringing the site back.
---
## Evidence (what attackers saw)
| Observation | Confirms |
|---|---|
| DevTools Sources tree shows `@vite/client`, `@react-refresh`, `node_modules`, `/src/**` | Vite **dev** server on the public internet |
| `curl …/src/shared/api/client.ts` returns source with `DEV: true`, `VITE_DEV_PROXY_TARGET: http://be0:4402` | Env + internal service names leaked |
| `curl …/vite.config.ts` returns HTML error with full config + stack trace | Verbose dev error handling |
| `lovable-tagger` in plugin list | Dev-only tooling active |
**Root cause in repo:** `docker-compose.prod.yml` and `fe0/Dockerfile` run `npm run dev -- --host 0.0.0.0`.
---
## Findings → fixes (checklist)
Track implementation in git; check off after deploy to production.
### Step 0 — Incident response (ops, not code)
- [ ] Restrict public access (maintenance page / firewall) during remediation
- [ ] Rotate **Postgres**, **MinIO**, **SMTP**, and generate new **`JWT_SECRET`** (`openssl rand -base64 48`)
- [ ] Bump every user's `credential_version` in Postgres (invalidates old JWTs)
- [ ] Review `audit_events`, unknown admin users, MinIO objects, modified files under `./be0` / `./fe0`
- [ ] Bind MinIO console to localhost; do not expose `:9001` to the internet
- [ ] Purge `.env` from git history if the repo was ever shared (`git filter-repo`)
### Step 1 — JWT and production mode ✅ (code)
| ID | Finding | Fix | Status |
|---|---|---|---|
| C-1 | `JWT_SECRET` unset → dev fallback signs tokens | `JWT_SECRET` + `ENVIRONMENT=production` in `docker-compose.prod.yml`; `verify-prod-env.sh` | **Done** |
| — | `ENVIRONMENT` never set in prod | Pass `ENVIRONMENT=production` to `be0` | **Done** |
### Step 2 — Production frontend (stop source leak) ✅ (code)
| ID | Finding | Fix | Status |
|---|---|---|---|
| H-3 | Vite dev in production | `fe0/Dockerfile.prod` + `fe0/nginx/default.conf`; updated `docker-compose.prod.yml` | **Done** |
| — | Prod API URL pointed at localhost | Same-origin `/api` via nginx when `VITE_API_URL` unset | **Done** |
### Step 3 — Remove broken / dangerous endpoints ✅ (code)
| ID | Finding | Fix | Status |
|---|---|---|---|
| C-3 | `POST /upload_document` | **Removed** | **Done** |
| M-9 | `POST /get_page` | **Removed** | **Done** |
### Step 4 — Authenticate sensitive API routes ✅ (code)
| ID | Finding | Fix | Status |
|---|---|---|---|
| C-4 | `/api/v1/review-documents` CRUD | Login + owner/staff | **Done** |
| C-5 | `GET /api/applications` | Staff-only | **Done** |
| C-5 | `GET /api/applications/{id}` | Owner or staff | **Done** |
| H-1 | LLM / chat / ideas endpoints | Auth on all listed routes | **Done** |
| H-4 | `DELETE …/admin-result` auth order | Auth first | **Done** |
### Step 5 — Hardening ✅ (code)
| ID | Finding | Fix | Status |
|---|---|---|---|
| H-5 | MinIO CORS `*` default | Required `MINIO_API_CORS_ALLOW_ORIGIN`; console on localhost | **Done** |
| H-7 | No login rate limit | `allow_login()` | **Done** |
| M-2 | No security headers | Middleware + nginx | **Done** |
| M-3 | CORS `*` risk | Fail startup if `*` in origins | **Done** |
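
As an illustration of the M-3 fix (refusing to start when `*` appears in the allowed origins), a startup check could look roughly like this. The function name `parse_cors_origins` and the comma-separated input format are assumptions for the sketch, not the code that landed in `be0`.

```python
def parse_cors_origins(raw: str) -> list[str]:
    """Parse a comma-separated origin list and refuse wildcard entries."""
    origins = [item.strip() for item in raw.split(",") if item.strip()]
    if "*" in origins:
        # Failing at startup is louder than silently serving a permissive policy.
        raise RuntimeError("CORS origins must not contain '*' in production")
    return origins


allowed = parse_cors_origins("https://www.rcc-ump.com, https://rcc-ump.com")
```

Raising during startup means a misconfigured `.env` shows up as a container that will not boot, which is much easier to notice than a wildcard policy discovered in a later audit.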
**Deploy:** Rebuild `fe0` with `Dockerfile.prod`. Confirm DevTools no longer shows `@vite/client` or `/src/`. Replace placeholder `fe0/public/logo.svg` with your institution logo if needed.
**Deploy (Steps 1-5):** Update `.env` (see below), run `./scripts/verify-prod-env.sh`, then:
```bash
docker compose --env-file .env -f docker-compose.prod.yml up -d --build
```
Recreate `be0` after setting `JWT_SECRET` and bump all users' `credential_version` in Postgres.
- Non-root Docker users; remove prod bind-mounts of `./be0` / `./fe0` source
- HttpOnly refresh tokens; shorten JWT TTL
- Upgrade `xlsx`; pin `pip install` at image build time
- Auth audit test: every mutating route must have auth dependency
- Add `SECURITY.md` disclosure policy
---
## Production architecture (target)
```
Browser (HTTPS, external nginx/Caddy on VPS)
├─► fe0 container (nginx :8080) ── static files from dist/
│ proxy /api/* ──► be0:4402 (Docker network)
│ proxy /submitted-initiatives/ ──► be0:4402
├─► be0 bound 127.0.0.1:4402 on host (not public)
├─► postgres bound 127.0.0.1:15432
└─► MinIO API (TLS via reverse proxy); console localhost-only
```
External TLS termination (Certbot/Caddy on the VPS) sits in front of `${FE_PORT}`. See [deploy-production-docker.md](./deploy-production-docker.md) and [minio-behind-https.md](./minio-behind-https.md).
---
## Verification after deploy
### Automated tests (run before deploy)
```bash
# Backend — includes 17 security regression tests
cd be0 && python -m pytest tests/ -q
# Frontend unit tests + env config
cd fe0 && npm test && npm run build
# Production .env validation script
./scripts/test-verify-prod-env.sh
```
### Production smoke checks
```bash
# 1. Env validation
./scripts/verify-prod-env.sh
# 2. Stack healthy
docker compose --env-file .env -f docker-compose.prod.yml ps
# 3. No Vite dev artifacts (expect 404, not TS source)
curl -sS -o /dev/null -w "%{http_code}\n" https://www.rcc-ump.com/src/main.tsx
# 4. Unauthenticated PII blocked (expect 401)
curl -sS -o /dev/null -w "%{http_code}\n" https://www.rcc-ump.com/api/applications
# 5. JWT not forgeable — login with real user; admin routes reject unsigned tokens
curl -sS -o /dev/null -w "%{http_code}\n" \
-H "Authorization: Bearer invalid" \
https://www.rcc-ump.com/api/v1/admin/audit-events
```
---
## `.env` additions required for production
```bash
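# Generate once in a shell with `openssl rand -base64 48` and paste the literal value;
# Docker Compose does not run $(...) command substitution inside .env files:
JWT_SECRET=<output of openssl rand -base64 48>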
# Restrict MinIO browser CORS to your SPA origin (scheme + host, no trailing slash):
MINIO_API_CORS_ALLOW_ORIGIN=https://www.rcc-ump.com
# Public app URL (emails, CORS extras):
AUTH_PUBLIC_WEB_ORIGIN=https://www.rcc-ump.com
CORS_ORIGINS_EXTRA=https://www.rcc-ump.com
```
After first deploy with `JWT_SECRET`, run SQL (or admin script) to increment `credential_version` for all users.
---
*Last updated: 2026-05-27 — update checkboxes as fixes land on `main` and production.*
@@ -0,0 +1,296 @@
# Database Engineering Review — User Profile Manager State Machine
**Reviewer role:** Senior database engineer
**Document under review:** *User profile manager — integration analysis and implementation state machine*
**Audience:** Backend / platform engineering team
**Status:** Constructive feedback before implementation kickoff
---
## 1. Overall assessment
The document is well-organized and shows good systems thinking. In particular, three things are already right and should not be relitigated during implementation:
- The **MinIO boundary** is correctly drawn. Profile scalars belong in PostgreSQL; binary artifacts (proof documents, exports) are the only thing that should ever touch object storage. Keep that line bright.
- The **naming separation** between RBAC `roles` (`admin`, `viewer`, …) and the institutional `job_title` / `position_title` is essential and easy to get wrong. Holding to it prevents a class of authorization bugs.
- Treating `is_active` (account disabled) and `profile_verification_status` (HR data trusted) as **orthogonal** axes is correct. They answer different questions and must not collapse into one column.
That said, the document is largely an interface-level plan. Several decisions that look like product-policy knobs are actually **database design decisions** with long-term cost — schema shape, constraint placement, transition enforcement, index strategy, and migration mechanics. The rest of this review focuses there.
---
## 2. Schema shape: pick the 1:1 child table, not extra columns on `users`
The document offers two options and leans neutral. The team should pick **`user_staff_profiles` as a 1:1 child table keyed by `user_id`**, and commit to it. Reasons:
1. **Hot-row contention.** The `users` row is read on every authenticated request (token validation, `/me`, role checks). Every column added to that table widens the row, increases TOAST pressure for nullable text columns, and pulls more bytes through the cache for traffic that doesn't care about HR data. Profile fields are read on profile pages and admin queues — different access pattern, different table.
2. **Clear ownership.** Auth code owns `users` + `user_roles`. HR/profile code owns `user_staff_profiles`. The boundary lines up with the API boundary you're already drawing (`/auth/*` vs `/admin/users/*`).
3. **Migration safety.** Adding columns to `users` in a live system means coordinating with every read path that does `SELECT *` (hopefully none, but in practice some). A new table is additive and isolated.
4. **Permissions.** Postgres-level GRANTs and row-level security policies, if you ever want them, are far easier to write per-table than per-column.
The one tradeoff is an extra LEFT JOIN on profile reads. That is cheap and indexable; it is not a reason to denormalize.
**Recommendation:** create `user_staff_profiles` with `user_id PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE`. The 1:1 is enforced by the PK.
---
## 3. Field-by-field schema critique
### 3.1 `employee_id`
The document says "uniqueness policy TBD". This must be decided before the migration is written, because it determines the constraint shape:
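- If **globally unique when present**, use a **partial unique index**: `CREATE UNIQUE INDEX ... ON user_staff_profiles (employee_id) WHERE employee_id IS NOT NULL;` PostgreSQL's plain `UNIQUE` already admits multiple NULLs by default, but that changes if the constraint is declared `NULLS NOT DISTINCT` (PostgreSQL 15+); the partial index states the intent explicitly and keeps NULL rows out of the index.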
- If **required at registration**, make it `NOT NULL` and use a regular `UNIQUE`.
- Decide format validation (length, charset, leading zeros) and enforce with a `CHECK` constraint, not just in Pydantic. The DB is the last line of defense when a future script writes directly.
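
The "unique when present" behavior can be checked quickly with the sketch below. SQLite (which also supports partial indexes) stands in for PostgreSQL so the example runs with the standard library alone; the index name and the `try_insert` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_staff_profiles (user_id INTEGER PRIMARY KEY, employee_id TEXT)"
)
# Only rows that actually carry an employee_id participate in the uniqueness check.
conn.execute(
    "CREATE UNIQUE INDEX uq_staff_employee_id "
    "ON user_staff_profiles (employee_id) WHERE employee_id IS NOT NULL"
)


def try_insert(user_id: int, employee_id) -> bool:
    """Return True if the row was accepted, False if the unique index rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO user_staff_profiles VALUES (?, ?)", (user_id, employee_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Two profiles without an `employee_id` coexist, while a second profile claiming an already-used `employee_id` is rejected at write time, whatever the application layer did or did not validate.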
### 3.2 `academic_title` + `academic_title_other`
The `enum + free-text "other"` pattern works but has two failure modes worth pre-empting:
- **Drift between code and DB.** If `academic_title` is a Postgres `ENUM`, adding a value requires `ALTER TYPE`, which is fine but easy to forget in code review. Prefer a **lookup table** (`academic_titles(code, label_vi, label_en, sort_order, active)`) referenced by FK. Adding a value is an INSERT, not a DDL change. Translations and ordering live next to the value.
- **`other` invariant.** Enforce at the DB level: `CHECK ((academic_title = 'other') = (academic_title_other IS NOT NULL AND length(trim(academic_title_other)) > 0))`. Without this, you will eventually see rows where `academic_title='professor'` and `academic_title_other='Something'` from a buggy client.
### 3.3 `department` vs the existing `users.unit_id`
This is the most important inconsistency in the proposal. `User` already has an optional `unit_id`. The new design proposes `department` as free text. Pick one:
- If `units` is a real catalog, **drop `department` from the new design** and reuse `unit_id`. Free text on top of a catalog guarantees data quality decay.
- If the catalog is incomplete and free text is a transitional necessity, model it as **`unit_id NULLABLE` + `unit_name_freetext NULLABLE`** with a `CHECK` that exactly one is set. Plan a backfill story.
Don't ship two parallel notions of "where this person works."
### 3.4 `job_title`
Free text is fine here, but:
- Length cap (e.g. `VARCHAR(120)` or `TEXT` with a `CHECK length()`). Without one, this is a future abuse vector in admin UI.
- Keep it nullable. Some accounts (system, integration) won't have one.
### 3.5 Verification fields
`profile_verification_status`, `verification_submitted_at`, `verified_at`, `verified_by_user_id`, `rejection_reason` — fine, but enforce the **invariants between them** in the DB, not just in Python. See §4.
### 3.6 Timestamps
Every `_at` column should be `TIMESTAMPTZ`, never `TIMESTAMP`. You will eventually have admins in different time zones, and "naive timestamps interpreted as UTC by some code paths and as local time by others" is a recurring bug class. Make the DB authoritative.
---
## 4. Enforce the state machine in the database, not only in the application
The Mermaid diagram is clear, but right now nothing prevents a buggy code path or a one-off SQL fix from putting a row in an impossible state — e.g. `status='verified'` with `verified_at IS NULL`, or `status='rejected'` with no `rejection_reason`. Three layers of defense, in order of cost:
### 4.1 Status as a constrained type
Use either a Postgres `ENUM` or a `CHECK (profile_verification_status IN ('draft','pending','verified','rejected'))`. Lookup table is also acceptable. Whichever you pick, **don't store free-text statuses** — they always rot.
### 4.2 Cross-column invariants
Express the state machine's *value* invariants as `CHECK` constraints. Concretely:
```sql
CONSTRAINT verified_requires_metadata CHECK (
profile_verification_status <> 'verified'
OR (verified_at IS NOT NULL AND verified_by_user_id IS NOT NULL)
),
CONSTRAINT rejected_requires_reason CHECK (
profile_verification_status <> 'rejected'
OR (rejection_reason IS NOT NULL AND length(trim(rejection_reason)) > 0)
),
CONSTRAINT pending_clears_verification CHECK (
profile_verification_status NOT IN ('draft','pending')
OR (verified_at IS NULL AND verified_by_user_id IS NULL)
)
```
These are cheap, write-time-only, and they turn an entire category of "how did we get here" production incidents into impossible-to-commit transactions.
### 4.3 Transition legality
The richer rule — *"`verified` cannot transition to `draft` directly"* — is harder to express as a column constraint. Two options:
- **Application-level only**, but always wrap transitions in `UPDATE … WHERE id = $1 AND profile_verification_status = $expected_prior_state` and check the affected-row count. This gives you optimistic concurrency for free and rejects illegal transitions implicitly because the WHERE won't match.
- **Trigger-based** transition table. Heavier; only worth it if multiple services write to the table. For a single backend, the conditional UPDATE pattern is enough.
Whichever you pick, document and test the legal transitions explicitly. The diagram is the spec; the test suite should encode it directly (one test per transition, including illegal ones expecting failure).
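
A minimal sketch of the conditional-UPDATE pattern follows, using SQLite from the standard library as a stand-in for PostgreSQL. The `transition` helper is hypothetical and the table is reduced to the two columns that matter here.

```python
import sqlite3


def transition(conn: sqlite3.Connection, user_id: int, expected: str, new: str) -> bool:
    """Move a profile from `expected` to `new`; False means the prior state did not match."""
    cur = conn.execute(
        "UPDATE user_staff_profiles SET profile_verification_status = ? "
        "WHERE user_id = ? AND profile_verification_status = ?",
        (new, user_id, expected),
    )
    conn.commit()
    # rowcount is the number of rows the UPDATE actually changed.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_staff_profiles ("
    "user_id INTEGER PRIMARY KEY, profile_verification_status TEXT NOT NULL)"
)
conn.execute("INSERT INTO user_staff_profiles VALUES (1, 'pending')")
conn.commit()
```

Calling `transition(conn, 1, "pending", "verified")` twice succeeds the first time and returns `False` the second, which is the signal the handler can turn into a `409 Conflict` or an idempotent no-op.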
---
## 5. Concurrency — name the races now
Two scenarios will happen in production. Decide the answer before writing code:
1. **Two admins click "Verify" on the same pending profile within the same second.** With the conditional-UPDATE pattern above, the second one's UPDATE affects 0 rows. The handler must return a clean `409 Conflict` (or treat as idempotent no-op — pick one and document it). Don't return 200 with a stale read; that's how audit logs end up with two "verified" entries for one transition.
2. **Applicant PATCHes their profile while an admin is reviewing it.** If the policy is "edits after verification revert to pending", this resolves itself. If the policy is "freeze after verified", you need a check at PATCH time. In both cases, the admin's verification UPDATE should be conditional on the *content* the admin reviewed, not just the prior state — consider an `etag` / `version` integer column bumped on every applicant edit, and require admins to send back the version they reviewed.
The document mentions idempotency as a "pick one behavior" — agreed, but the team should write down which one and link to it from the admin endpoint code.
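A minimal sketch of the version-bound verify described in scenario 2, choosing the `409` behavior for illustration only (the idempotent alternative is equally valid). `sqlite3` stands in for PostgreSQL, and `applicant_edit` / `admin_verify` are hypothetical names.

```python
import sqlite3

def applicant_edit(conn, user_id, job_title):
    """Applicant PATCH: every edit bumps the version."""
    with conn:
        conn.execute(
            "UPDATE user_staff_profiles "
            "SET job_title = ?, version = version + 1 WHERE user_id = ?",
            (job_title, user_id),
        )

def admin_verify(conn, user_id, admin_id, expected_version):
    """Return 200 if this call performed the transition, else 409."""
    with conn:
        cur = conn.execute(
            "UPDATE user_staff_profiles "
            "SET profile_verification_status = 'verified', "
            "    verified_at = datetime('now'), verified_by_user_id = ?, "
            "    version = version + 1 "
            "WHERE user_id = ? AND profile_verification_status = 'pending' "
            "  AND version = ?",
            (admin_id, user_id, expected_version),
        )
    return 200 if cur.rowcount == 1 else 409

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_staff_profiles ("
    " user_id INTEGER PRIMARY KEY, job_title TEXT,"
    " profile_verification_status TEXT NOT NULL DEFAULT 'draft',"
    " verified_at TEXT, verified_by_user_id INTEGER,"
    " version INTEGER NOT NULL DEFAULT 1)"
)
conn.execute(
    "INSERT INTO user_staff_profiles (user_id, job_title, profile_verification_status) "
    "VALUES (1, 'Lecturer', 'pending')"
)
conn.commit()
```

If the applicant edits after the admin loaded version 1, the admin's verify with `expected_version=1` matches no row and returns `409`; the admin reloads, reviews version 2, and retries.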
---
## 6. Audit logging — beyond "extend `audit_log`"
§3.5 of the source doc says "extend `audit_log` with `entity_type='user_profile'`, before/after snapshot." Some sharper guidance:
- Store snapshots as **`JSONB`**, not stringified JSON. JSONB lets you query later (e.g. "show me all profiles where the verifier was admin X" or "find profiles whose `job_title` changed in Q3") without parsing.
- **Whitelist** fields included in snapshots, don't blacklist. The whitelist won't accidentally include `password_hash` or future PII columns the auditor didn't think about.
- Add a **monotonic event ordering** column (`created_at TIMESTAMPTZ DEFAULT now()` plus an `event_id BIGSERIAL`) and index by `(entity_type, entity_id, event_id)`. Querying "the history of this profile in order" is the most common audit access pattern; make it fast.
- Audit rows must be written **in the same transaction** as the state change. If the audit insert fails, the state change must roll back. Otherwise the log is not trustworthy.
- Plan for growth. Audit logs are append-only and grow without bound. Decide retention (e.g. 7 years for HR-relevant entries, 2 years for cosmetic edits) and put a partitioning strategy on the table from day one — `PARTITION BY RANGE (created_at)` monthly is a low-effort default and means you can drop old partitions cheaply later.
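The whitelist and same-transaction rules above can be sketched like this. Assumptions: `sqlite3` stands in for PostgreSQL (snapshots are JSON text here, `JSONB` in the real schema), and the `AUDIT_FIELDS` list, the `hr_private_note` column, and the `audit_log` column names are hypothetical.

```python
import json
import sqlite3

# Hypothetical whitelist: only these columns ever reach a snapshot.
AUDIT_FIELDS = ("employee_id", "job_title", "profile_verification_status")

def snapshot(row):
    return json.dumps({name: row[name] for name in AUDIT_FIELDS})

def reject_profile(conn, user_id, admin_id, reason):
    """Reject a pending profile and log it; both writes commit or neither."""
    select = "SELECT * FROM user_staff_profiles WHERE user_id = ?"
    with conn:  # one transaction; any exception rolls everything back
        before = conn.execute(select, (user_id,)).fetchone()
        cur = conn.execute(
            "UPDATE user_staff_profiles "
            "SET profile_verification_status = 'rejected', rejection_reason = ? "
            "WHERE user_id = ? AND profile_verification_status = 'pending'",
            (reason, user_id),
        )
        if cur.rowcount != 1:
            return False  # nothing changed, so nothing to audit
        after = conn.execute(select, (user_id,)).fetchone()
        conn.execute(
            "INSERT INTO audit_log "
            "(entity_type, entity_id, actor_user_id, before_json, after_json) "
            "VALUES ('user_profile', ?, ?, ?, ?)",
            (user_id, admin_id, snapshot(before), snapshot(after)),
        )
    return True

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE user_staff_profiles (
    user_id INTEGER PRIMARY KEY, employee_id TEXT, job_title TEXT,
    hr_private_note TEXT,  -- deliberately NOT in the whitelist
    profile_verification_status TEXT NOT NULL, rejection_reason TEXT);
CREATE TABLE audit_log (
    event_id INTEGER PRIMARY KEY AUTOINCREMENT,
    entity_type TEXT NOT NULL, entity_id INTEGER NOT NULL,
    actor_user_id INTEGER NOT NULL,
    before_json TEXT NOT NULL, after_json TEXT NOT NULL);
INSERT INTO user_staff_profiles
    VALUES (1, 'AB-123', 'Lecturer', 'secret', 'pending', NULL);
""")
```

If the audit insert fails (for example a missing actor), the status change is rolled back with it, so the log never disagrees with the table.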
---
## 7. Indexing strategy for the admin queue
The admin "Users" page will have predictable hot queries. Plan indexes for them up front, not after the page is slow:
- **Pending queue**: `CREATE INDEX ix_usp_pending ON user_staff_profiles (verification_submitted_at) WHERE profile_verification_status = 'pending';` Partial indexes are tiny and exactly match the query.
- **Filter by department/unit**: composite index covering the typical filter + sort, e.g. `(unit_id, profile_verification_status, verification_submitted_at)`.
- **Lookup by employee_id**: covered by the unique index above.
- **Admin activity**: `(verified_by_user_id, verified_at DESC)` for "what did admin X approve recently".
Skip speculative indexes. Add them when the access pattern is real. But the four above will be real on day one.
---
## 8. Migration mechanics
The document says "Alembic/SQL migration consistent with repo conventions" — fine, but a few points the team should agree on before writing the migration:
1. **Backfill plan.** New columns on `users` (or new `user_staff_profiles` rows for existing accounts) need a defined initial state. Recommended: for every existing user, INSERT a `user_staff_profiles` row with `profile_verification_status = 'draft'` and all institutional fields NULL. Do this in the same migration that creates the table, in a single transaction if the user count is small (<100k); otherwise batch it.
2. **NOT NULL columns must be added in three steps** for live tables: add nullable → backfill → set NOT NULL. The Alembic migration should do all three, and the backfill step should be batched.
3. **Reversibility.** Every migration needs a working `downgrade()`. Verification status data will be lost on downgrade — that's fine, but make it explicit (`DROP TABLE user_staff_profiles`) rather than half-reversible.
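4. **Lock duration.** `ALTER TABLE users ADD COLUMN` takes an `ACCESS EXCLUSIVE` lock. Since PostgreSQL 11 a constant default no longer rewrites the table, so the lock is brief, but a volatile default still forces a full rewrite, and even a brief lock request queues behind long-running transactions and blocks the queries that arrive after it; on a busy auth table that is user-visible downtime. Adding a separate table avoids altering `users` at all, which is another reason to prefer `user_staff_profiles`.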
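A minimal sketch of the batched backfill from point 1, assuming `sqlite3` as a stand-in for PostgreSQL and integer ids instead of UUIDs. `backfill_profiles` is a hypothetical helper; in the real migration the same loop would sit inside the Alembic `upgrade()`.

```python
import sqlite3

def backfill_profiles(conn, batch_size=1000):
    """Insert a draft profile for every user lacking one, one commit per batch."""
    total = 0
    while True:
        with conn:  # short transaction per batch keeps locks brief
            cur = conn.execute(
                "INSERT INTO user_staff_profiles (user_id) "
                "SELECT u.id FROM users u "
                "LEFT JOIN user_staff_profiles p ON p.user_id = u.id "
                "WHERE p.user_id IS NULL "
                "ORDER BY u.id LIMIT ?",
                (batch_size,),
            )
        if cur.rowcount == 0:
            return total  # nothing left to backfill
        total += cur.rowcount

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
CREATE TABLE user_staff_profiles (
    user_id INTEGER PRIMARY KEY REFERENCES users(id),
    profile_verification_status TEXT NOT NULL DEFAULT 'draft');
""")
with conn:
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [(f"user{i}@ump.edu.vn",) for i in range(5)],
    )
```

The anti-join makes the backfill safe to re-run: a second call finds no users without a profile and inserts nothing.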
---
## 9. Suggested DDL skeleton
Provided as a starting point for the migration author, not as the final word. Field list mirrors §5 of the source doc.
```sql
CREATE TYPE profile_verification_status AS ENUM
('draft', 'pending', 'verified', 'rejected');
CREATE TABLE academic_titles (
code TEXT PRIMARY KEY,
label_vi TEXT NOT NULL,
label_en TEXT NOT NULL,
sort_order INTEGER NOT NULL DEFAULT 0,
active BOOLEAN NOT NULL DEFAULT TRUE
);
CREATE TABLE user_staff_profiles (
user_id UUID PRIMARY KEY
REFERENCES users(id) ON DELETE CASCADE,
employee_id TEXT,
academic_title_code TEXT REFERENCES academic_titles(code),
academic_title_other TEXT,
unit_name_freetext TEXT, -- only if catalog is incomplete; see §3.3
job_title TEXT,
profile_verification_status profile_verification_status
NOT NULL DEFAULT 'draft',
verification_submitted_at TIMESTAMPTZ,
verified_at TIMESTAMPTZ,
verified_by_user_id UUID REFERENCES users(id),
rejection_reason TEXT,
version INTEGER NOT NULL DEFAULT 1,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
CONSTRAINT employee_id_shape
CHECK (employee_id IS NULL OR employee_id ~ '^[A-Z0-9-]{3,32}$'),
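CONSTRAINT academic_title_other_invariant CHECK (
-- IS NOT DISTINCT FROM: a NULL code must compare as FALSE, not NULL;
-- otherwise the CHECK evaluates to NULL and passes
(academic_title_code IS NOT DISTINCT FROM 'other')
= (academic_title_other IS NOT NULL
AND length(trim(academic_title_other)) > 0)
),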
CONSTRAINT verified_requires_metadata CHECK (
profile_verification_status <> 'verified'
OR (verified_at IS NOT NULL AND verified_by_user_id IS NOT NULL)
),
CONSTRAINT rejected_requires_reason CHECK (
profile_verification_status <> 'rejected'
OR (rejection_reason IS NOT NULL
AND length(trim(rejection_reason)) > 0)
),
CONSTRAINT non_terminal_clears_verification CHECK (
profile_verification_status NOT IN ('draft','pending')
OR (verified_at IS NULL AND verified_by_user_id IS NULL)
),
CONSTRAINT job_title_length CHECK (
job_title IS NULL OR length(job_title) <= 120
)
);
CREATE UNIQUE INDEX ix_usp_employee_id_unique
ON user_staff_profiles (employee_id)
WHERE employee_id IS NOT NULL;
CREATE INDEX ix_usp_pending_queue
ON user_staff_profiles (verification_submitted_at)
WHERE profile_verification_status = 'pending';
CREATE INDEX ix_usp_verifier_activity
ON user_staff_profiles (verified_by_user_id, verified_at DESC)
WHERE verified_by_user_id IS NOT NULL;
```
The `version` column supports the optimistic concurrency pattern in §5. Bump it in every UPDATE; require it in admin verify/reject calls.
---
## 10. Testing the database layer specifically
API tests (mentioned in source §5) are necessary but not sufficient. Add:
- **Constraint tests** that attempt to INSERT impossible rows and assert the `CHECK` rejects them. These guard the DDL itself.
- **Transition tests** that walk every legal edge in the state machine and assert success, plus every illegal edge and assert rejection. The Mermaid diagram is the source of truth — generate the test matrix from it.
- **Concurrency tests** using two real connections (not mocks) attempting the same verify simultaneously. Pytest can do this with threading + a proper test DB.
- **Migration round-trip tests**: `upgrade → downgrade → upgrade` on a populated test DB to catch reversibility bugs early.
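A rough sketch of the two-connection concurrency test. Assumptions: a temporary SQLite file stands in for the PostgreSQL test database, `BEGIN IMMEDIATE` plays the role that row locking plays in PostgreSQL, and `verify_worker` / `race_verify` are hypothetical names.

```python
import os
import sqlite3
import tempfile
import threading

VERIFY_SQL = (
    "UPDATE user_staff_profiles "
    "SET profile_verification_status = 'verified', verified_by_user_id = ? "
    "WHERE user_id = 1 AND profile_verification_status = 'pending'"
)

def verify_worker(db_path, admin_id, barrier, results):
    # Each thread opens its own connection, like two separate API workers.
    conn = sqlite3.connect(db_path, timeout=10, isolation_level=None)
    try:
        barrier.wait()  # release both threads at the same moment
        conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
        cur = conn.execute(VERIFY_SQL, (admin_id,))
        conn.execute("COMMIT")
        results[admin_id] = cur.rowcount
    finally:
        conn.close()

def race_verify():
    """Race two admins on one pending row; return their sorted row counts."""
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        setup = sqlite3.connect(db_path)
        setup.execute(
            "CREATE TABLE user_staff_profiles ("
            " user_id INTEGER PRIMARY KEY,"
            " profile_verification_status TEXT NOT NULL,"
            " verified_by_user_id INTEGER)"
        )
        setup.execute("INSERT INTO user_staff_profiles VALUES (1, 'pending', NULL)")
        setup.commit()
        setup.close()
        barrier = threading.Barrier(2)
        results = {}
        threads = [
            threading.Thread(target=verify_worker, args=(db_path, admin_id, barrier, results))
            for admin_id in (101, 102)
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        return sorted(results.values())
    finally:
        os.remove(db_path)
```

Whichever admin wins, exactly one UPDATE reports one affected row and the other reports zero; the test should then assert that the losing handler returns whatever the team documented (409 or idempotent 200).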
---
## 11. Open questions the team must answer before merge
These are not blockers for design, but they *are* blockers for "is the migration done":
1. Is `employee_id` required at registration, optional, or required-before-pending? (Determines NULL policy and `CHECK` shape.)
2. Strict or pragmatic re-verification on edit? (§4.1.B in source doc.) Pick one; encode it in tests.
3. Verifying an already-verified profile: idempotent 200, or 409? (§3.4 in source doc.)
4. Catalog vs free-text department: is `units` populated and trustworthy? If yes, drop `unit_name_freetext`. If no, what is the backfill plan?
5. Audit log retention period and partitioning cadence?
6. Does verification ever expire (e.g. annual re-verification for HR compliance)? If yes, schema needs `verified_until` and a scheduled job; better to design for it now than retrofit.
---
## 12. Summary checklist
| Topic | Recommendation |
| --- | --- |
| Schema shape | Separate `user_staff_profiles` 1:1 table; do not widen `users`. |
| `employee_id` uniqueness | Partial unique index, NULL policy decided before migration. |
| Academic title | Lookup table over Postgres ENUM; `CHECK` enforces the `other` invariant. |
| Department | Reconcile with existing `unit_id`; do not ship parallel notions. |
| State enforcement | Postgres ENUM/CHECK for status; cross-column CHECKs for invariants. |
| Transitions | Conditional UPDATE on prior state; reject illegal transitions implicitly. |
| Concurrency | `version` column for optimistic locking; admin verify carries expected version. |
| Audit | JSONB snapshots, whitelist fields, same-transaction writes, partitioned by month. |
| Indexes | Partial index on pending queue; composite on admin activity; unique on employee_id. |
| Timestamps | `TIMESTAMPTZ` everywhere. |
| Migration | Two-phase NOT NULL; backfill in same migration; reversible downgrade. |
| Tests | Constraint tests, transition matrix, concurrency tests, migration round-trip. |
The plan in the source document is sound. The notes above are the difference between a design that survives one release and one that survives five.
---
*Prepared as a database engineering review, complementary to the original integration analysis. Any path references mirror the source document and should be re-verified against the current tree before code changes land.*
@@ -0,0 +1,285 @@
# User profile manager — integration analysis and implementation state machine
This document analyzes how **`fe0/src/auth/LoginRegisterCard.tsx`** connects to the rest of the stack, clarifies boundaries with **PostgreSQL** and **MinIO**, and specifies a **state-machine-driven** plan for institutional profile fields, **admin verification**, and applicant **self-service** edits.
**Companion document (canonical DB engineering guidance):** [`user-profile-manager-db-review.md`](./user-profile-manager-db-review.md) — schema shape, DDL skeleton, indexes, concurrency, audit, and constraint-level enforcement. This plan **inherits** those decisions below so implementation does not fork.
Related: [`auth-registration-and-user-management.md`](./auth-registration-and-user-management.md) (broader auth; some paths may lag the tree).
---
## 1. How `LoginRegisterCard` connects today
### 1.1 Frontend wiring
| Layer | Role |
| ----- | ----- |
| `fe0/src/pages/Login.tsx` | Renders `LoginRegisterCard` only. |
| `fe0/src/auth/LoginRegisterCard.tsx` | Tabs (login/register); validates institutional email (`fe0/src/auth/institutionalEmail`); collects **full name**, **email**, **password** (+ confirm); calls `useAuth()` from **`fe0/src/contexts/AuthContext.tsx`**. |
| `AuthContext` | `login` → `authService.login`; `register` → `authService.register`; on success stores JWT via `authService`, builds session user via `fe0/src/auth/sessionUser.ts` (`buildUserFromAuthPayload`). |
| `fe0/src/lib/auth-service.ts` | HTTP to **`POST {API_URL}/api/v1/auth/register`** with JSON `{ fullName, email, password, passwordConfirm }`; stores `accessToken` in localStorage on success. |
| Routing | Successful login/register navigates with `resolvePostLoginPath` (`fe0/src/lib/dashboardNavigation.ts`) using the resolved role. |
Session user shape (`AuthSessionUser`): `id`, `email`, `name`, `phone`, `roles` (effective + available), computed `permissions`.
### 1.2 Backend wiring
| Layer | Role |
| ----- | ----- |
| `be0/src/auth_api.py` | **`POST /auth/register`** (mounted under `/api/v1`): normalizes `@ump.edu.vn` / `@umc.edu.vn`, enforces password policy, creates **`users`** row + **`user_roles`** row (**server-derived** `admin` vs `viewer`; client `role` ignored). Issues JWT embedding roles. Writes **`audit_log`** for registration. |
| `be0/src/initiative_db/models.py` **`User`** | Persists **`email`**, **`password_hash`**, **`full_name`**, optional **`phone`**, optional **`unit_id`**, **`is_active`**, timestamps — **no** institutional extended profile tables yet. |
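For orientation, a rough sketch of what server-side institutional email normalization could look like. This is an assumption, not a copy of `be0/src/auth_api.py`: the function name is hypothetical, and lowercasing the local part is a guess at the normalization rule.

```python
# Domains taken from the table above; everything else is illustrative.
ALLOWED_DOMAINS = ("ump.edu.vn", "umc.edu.vn")

def normalize_institutional_email(raw: str) -> str:
    """Trim and lowercase the address, then require an allowed domain."""
    email = raw.strip().lower()
    local, sep, domain = email.rpartition("@")
    if not sep or not local or domain not in ALLOWED_DOMAINS:
        raise ValueError("email must be an @ump.edu.vn or @umc.edu.vn address")
    return email
```

Normalizing before the uniqueness check keeps `An.Nguyen@UMP.edu.vn` and `an.nguyen@ump.edu.vn` from becoming two accounts.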
### 1.3 Database today
- **`users`** + **`user_roles`:** credentials, display identity (`full_name`, `phone`), **`unit_id`** (catalog hook — see §2), `is_active`, RBAC-like roles (`admin`, `editor`, `viewer`, …).
### 1.4 MinIO — explicit boundary
**Registration and profile HTTP APIs must not persist profile scalars in MinIO.** Object storage (**attachments**, **exports**, **quarantine**) lives in **`be0/src/minio/storage.py`** — **binary blobs** only.
| Concern | Store |
| ------- | ----- |
| Profile fields, verification status, timestamps | **PostgreSQL** — preferably **`user_staff_profiles`** + catalog tables (§5), not widening hot `users` rows for HR text (see DB review §2). |
| Optional future HR **proof uploads** | MinIO keys + metadata rows referencing `user_staff_profiles`; out of scope until product demands it |
---
## 2. Gap analysis vs desired product
### 2.1 Registration (`LoginRegisterCard`)
| Field | UI | Persistence / naming |
| ----- | --- | ----- |
| Họ và tên | `<input>` | Stays **`users.full_name`** (already collected as `fullName`). |
| Mã số nhân sự | `<input>` | **`user_staff_profiles.employee_id`** — format + uniqueness at DB (**partial unique index** when non-NULL — see DB review §3.1 / §9); Pydantic + **`CHECK`** in migration. **`NULL`/required policy decided before DDL** ([§9 open questions](#9-open-questions-before-merge)). |
| Học hàm, học vị | dropdown + “Khác” `<input>` | **`academic_title_code`** FK → **`academic_titles` lookup table** (not raw Postgres ENUM for title list — avoids DDL drift); **`academic_title_other`** mandatory iff `code='other'` — enforce with **`CHECK`** ([DB review §3.2](user-profile-manager-db-review.md)). |
| Đơn vị công tác | `<input>` or catalog | **Must reconcile `users.unit_id`:** either **catalog only** (`unit_id` FK, drop parallel free-text) **or** staged **`unit_id` XOR `unit_name_freetext`** with `CHECK`; **never** ship orphan `department` text alongside a real **`units`** catalog ([DB review §3.3](user-profile-manager-db-review.md)). |
| Chức vụ (job) | `<input>` | **`job_title`** in DB/API — never overload JWT **`roles`**. Nullable for non-person accounts; **length capped** (`CHECK` or varchar). |
| Email / SĐT | `<input>` | Email unchanged (authoritative normalization on server); phone aligns with **`auth_api`** PATCH rules. |
**Backend:** **`INSERT users` + `INSERT user_staff_profiles`** + **`user_roles`** in **one transaction** — profile row created at registration (possibly all NULL + `draft`) or seeded in a follow-on migration backfill ([DB review §8](user-profile-manager-db-review.md)).
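The single-transaction registration can be sketched as follows. Assumptions: `sqlite3` stands in for PostgreSQL, integer ids replace UUIDs, the column set is trimmed to the minimum, and `register` is a hypothetical helper rather than the existing handler.

```python
import sqlite3

def register(conn, email, full_name, password_hash, role):
    """Create user, empty draft profile, and role row atomically."""
    with conn:  # all three inserts commit together or not at all
        cur = conn.execute(
            "INSERT INTO users (email, full_name, password_hash) VALUES (?, ?, ?)",
            (email, full_name, password_hash),
        )
        user_id = cur.lastrowid
        # Profile starts with all institutional fields NULL and status 'draft'.
        conn.execute("INSERT INTO user_staff_profiles (user_id) VALUES (?)", (user_id,))
        conn.execute("INSERT INTO user_roles (user_id, role) VALUES (?, ?)", (user_id, role))
    return user_id

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE,
    full_name TEXT NOT NULL, password_hash TEXT NOT NULL);
CREATE TABLE user_staff_profiles (
    user_id INTEGER PRIMARY KEY REFERENCES users(id),
    profile_verification_status TEXT NOT NULL DEFAULT 'draft');
CREATE TABLE user_roles (
    user_id INTEGER NOT NULL REFERENCES users(id),
    role TEXT NOT NULL CHECK (role IN ('admin', 'editor', 'viewer')));
""")
```

If any insert fails, no half-registered account is left behind: there is never a `users` row without its profile and role.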
### 2.2 Applicant profile (`ApplicantProfileView`)
**Today:** **`GET .../auth/me`**, **`PATCH .../auth/profile`** — **`fullName`**, **`phone`** only.
**Target:** `/me` reads join **`User`** ↔ **`UserStaffProfile`** (or use a dedicated profile projection). A PATCH to applicant-owned staff fields bumps **`user_staff_profiles.version`** (optimistic concurrency for admin verify, §4.4). Policy: pick one of **freeze after `verified`** or **material edits → `pending`** ([§9](#9-open-questions-before-merge)).
### 2.3 Admin “Users” (`DashboardSidebar`)
`fe0/src/components/admin/DashboardSidebar.tsx` links **`/dashboard/users`** via management menu items (~`MenuItem` + `Link`).
**Routing gap:** `fe0/src/App.tsx` has **no** `users` route → **404** until **User Profile Manager** ships.
---
## 3. Cross-cutting correctness (security, concurrency, audit)
1. **Server validation:** domain, enums, lengths, **`employee_id`** shape — mirror in **`CHECK`** where feasible (DB review §3).
2. **RBAC vs job title:** API names **`job_title`**; JWT / `user_roles` = **`admin`** | **`viewer`** | … only.
3. **Authz:** Applicants **read/update own** staff profile subset; admins **list / verify / reject** via **`/admin/...`**; same patterns as existing audit actors.
4. **Two axes:** **`users.is_active`** (login disabled) **orthogonal** to **`profile_verification_status`** (HR trust) — do not collapse.
5. **`TIMESTAMPTZ`** everywhere for `_at` columns (DB review §3.6).
6. **Admin idempotency:** **Verify twice** → choose **409 Conflict** vs **204/200 idempotent no-op**; document next to handler; conditional `UPDATE` makes **lost races** observable ([§4.4](#44-concurrency-two-admins-patch-while-reviewing)).
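Item 1 (server validation mirrored in `CHECK`) can be sketched as an application-side validator. The regex and the 120-character cap are copied from the DDL skeleton in the DB review; the `StaffFields` dataclass and `validate_staff_fields` are hypothetical names, and the real code would likely use Pydantic instead.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Mirrors CHECK (employee_id ~ '^[A-Z0-9-]{3,32}$'); fullmatch avoids
# Python's '$' accepting a trailing newline.
EMPLOYEE_ID_RE = re.compile(r"[A-Z0-9-]{3,32}")
JOB_TITLE_MAX = 120

@dataclass
class StaffFields:
    employee_id: Optional[str] = None
    academic_title_code: Optional[str] = None
    academic_title_other: Optional[str] = None
    job_title: Optional[str] = None

def validate_staff_fields(fields: StaffFields) -> List[str]:
    """Return human-readable errors; an empty list means the input is valid."""
    errors = []
    if fields.employee_id is not None and not EMPLOYEE_ID_RE.fullmatch(fields.employee_id):
        errors.append("employee_id: expected 3-32 characters from A-Z, 0-9, '-'")
    has_other_text = bool(fields.academic_title_other and fields.academic_title_other.strip())
    if (fields.academic_title_code == "other") != has_other_text:
        errors.append("academic_title_other: required if and only if code is 'other'")
    if fields.job_title is not None and len(fields.job_title) > JOB_TITLE_MAX:
        errors.append(f"job_title: at most {JOB_TITLE_MAX} characters")
    return errors
```

Validating in the API gives applicants readable messages; the matching `CHECK` constraints remain the final guard for any writer that bypasses the API.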
**Audit ([DB review §6](user-profile-manager-db-review.md)):**
- **`JSONB`** before/after snapshots with a **whitelist** of fields — never blacklist (avoids leaking `password_hash` or future secrets).
- **Same transaction** as profile state transitions; rollback state if audit insert fails.
- **Ordering:** predictable sort key (`BIGSERIAL`/monotonic **`event_id`** + `created_at`); index **`(entity_type, entity_id, event_id)`** for history timelines.
- **Growth:** retention policy + partitioning (e.g. monthly `RANGE` on `created_at`) for append-only volumes.
---
## 4. State machines
Treat the **diagram** as the **logical spec**. The database **implements invariants** (status type + cross-column `CHECK`s); illegal rows must **fail to commit** ([DB review §4](user-profile-manager-db-review.md)).
### 4.1 Profile verification lifecycle
| State | Meaning |
| ----- | ------- |
| `draft` | Account exists; profile incomplete or not submitted for verification. |
| `pending` | Submitted queue; awaits admin decision. |
| `verified` | Admin approved → **`verified_at`** + **`verified_by_user_id`** required (DB **`CHECK`**). |
| `rejected` | Admin declined → **`rejection_reason`** non-empty required (DB **`CHECK`**). |
**Cross-column constraints (PostgreSQL)** — align code with DDL in [DB review §4.2 / §9](user-profile-manager-db-review.md), e.g.:
- Verified ⇒ `verified_at` + `verified_by_user_id` non-NULL
- Rejected ⇒ trimmed `rejection_reason`
- Draft/pending ⇒ verification metadata cleared
**Transitions hard to encode as CHECK** (e.g. no direct `verified → draft`): use **`UPDATE … WHERE user_id=$1 AND profile_verification_status=$expected AND version=$etag`**; **zero rows** ⇒ illegal transition **or stale read** → **409** (DB review §4.3).
#### Mermaid — verification lifecycle
```mermaid
stateDiagram-v2
[*] --> draft: Registration + empty staff profile / backfill
draft --> pending: Submit for review meets completeness rules
pending --> verified: Admin approves conditional UPDATE succeeds
pending --> rejected: Admin rejects with reason
rejected --> pending: Applicant resubmits clears reason per policy
verified --> pending: Material edit triggers re-verification strict policy optional
verified --> verified: Cosmetic-only edit under pragmatic policy optional
note right of verified
DB VERIFY CHECK verified_at verifier NOT NULL rejection cleared
Applicant PATCH increments version optimistic lock
end note
```
**Product/policy encoded in tests** (mirror diagram + illegal edges): see [§8](#8-testing-matrix).
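One way to derive the test matrix from the diagram, as a sketch: the edge set below is hand-copied from the Mermaid lifecycle and assumes the strict policy (`verified → pending` is legal); the `verified → verified` self-loop is left out because it does not change state. `transition_matrix` is a hypothetical helper.

```python
from itertools import product

STATES = ("draft", "pending", "verified", "rejected")

# Assumed legal edges, copied by hand from the lifecycle diagram.
LEGAL_EDGES = {
    ("draft", "pending"),
    ("pending", "verified"),
    ("pending", "rejected"),
    ("rejected", "pending"),
    ("verified", "pending"),
}

def transition_matrix():
    """Yield (prior, next, expected_ok) for every ordered pair of distinct states."""
    for prior, nxt in product(STATES, repeat=2):
        if prior != nxt:
            yield prior, nxt, (prior, nxt) in LEGAL_EDGES

cases = list(transition_matrix())
```

Feeding `cases` to a parametrized test gives one test per edge, so adding a state or changing the policy means editing `LEGAL_EDGES` in one place rather than hunting through individual tests.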
### 4.2 Request / ownership (actors)
```mermaid
stateDiagram-v2
[*] --> Anonymous
Anonymous --> AuthenticatedApplicant: Register or Login JWT
state AuthenticatedApplicant {
[*] --> SelfRead
SelfRead --> SelfPatch: PATCH own profile increments version optional
SelfPatch --> SelfRead
}
AuthenticatedApplicant --> pending: Submit transitions draft→pending
state AdminAuthenticated {
[*] --> Listing
Listing --> Detail: Open profile sends version seen
Detail --> Listing: VERIFY REJECT conditional on prior status + version
}
```
### 4.3 Data flow (target shape)
```mermaid
flowchart LR
subgraph fe0
LR[LoginRegisterCard]
AS[auth-service]
LR --> AS
end
subgraph be0
API[auth_api register/profile]
U[(users + user_roles)]
P[user_staff_profiles]
AU[audit same txn]
API --> U
API --> P
API --> AU
end
AS -->|HTTPS JSON| API
subgraph minio_optional
S3[(MinIO blobs)]
end
S3 -.->|not profile scalars| API
```
### 4.4 Concurrency: two admins, PATCH while reviewing
| Scenario | Required behavior |
| -------- | ----------------- |
| Two admins verify same `pending` row | Conditional `UPDATE`: the second admin gets **0 rows** → **`409 Conflict`** **or** an explicit, documented idempotent **200**; **never** report a fresh success for a write that changed nothing ([DB review §5](user-profile-manager-db-review.md)). |
| Applicant PATCH while admin reads | If verify must bind to reviewed snapshot, admin submits **`expected_version`** (etag from detail API); mismatch → **409** reload. Bump **`version`** on every applicant mutating PATCH. |
---
## 5. Canonical database shape (summary)
**Decision:** **`user_staff_profiles`** as a **1:1 child** of `users`: **`user_id`** is the **`PRIMARY KEY`** and **`REFERENCES users(id) ON DELETE CASCADE`**. **Do not widen `users`** for HR fields ([DB review §2](user-profile-manager-db-review.md)).
| Concern | Approach |
| ------- | -------- |
| Academic titles | Table **`academic_titles`** (`code`, `label_vi`, …); FK **`academic_title_code`**; **`other`** invariant `CHECK`. |
| `employee_id` | Partial **`UNIQUE WHERE employee_id IS NOT NULL`**; shape `CHECK`; NULL/required/product rule in [§9](#9-open-questions-before-merge). |
| Department | **`users.unit_id`** vs **`unit_name_freetext`** — single story; XOR `CHECK` during transition ([DB review §3.3](user-profile-manager-db-review.md)). |
| Status typing | Postgres **`ENUM`** or `CHECK (... IN (...))` — no free-text status. |
| Optimistic lock | **`version INTEGER NOT NULL DEFAULT 1`** on `user_staff_profiles`; bump every applicant-facing mutating UPDATE. |
**Indexing (day-one hot paths)** — detail in [DB review §7](user-profile-manager-db-review.md):
- Partial index: **`WHERE profile_verification_status = 'pending'`** + `verification_submitted_at`.
- Composite filters (unit/status/submitted_at) when query shape is confirmed.
- **`(verified_by_user_id, verified_at DESC)`** for verifier dashboards.
**DDL skeleton:** [DB review §9](user-profile-manager-db-review.md) — use as migration author starter; reconcile column names (`academic_title_code`, `unit_name_freetext`) with API DTO naming in code.
---
## 6. Migration mechanics ([DB review §8](user-profile-manager-db-review.md))
1. **`user_staff_profiles` migration is additive** — avoids long **`ACCESS EXCLUSIVE`** on **`users`** for nullable-widen patterns.
2. **Backfill:** `INSERT … SELECT FROM users` with defaults **`draft`**, institutional columns NULL — same migration if row count modest; batched otherwise.
3. **NOT NULL rollout:** nullable column → **backfill** → `SET NOT NULL` in steps.
4. **`downgrade()`:** explicit (e.g. `DROP TABLE user_staff_profiles`), accepting HR data loss on rollback.
---
## 7. Implementation backlog (ordered)
1. **DDL + seed `academic_titles`** (+ migration conventions + **`TIMESTAMPTZ`** audit columns if partitioning introduced).
2. **SQLAlchemy models** — `UserStaffProfile`, relationship from `User`; avoid `SELECT *` antipatterns on wide auth paths.
3. **Transactional registration** — `User` + `UserStaffProfile` + `UserRoleRow` + registration audit branch.
4. **Applicant APIs** — extend **`RegisterBody`**, **`/me`**, **`PATCH /profile`**; join profile; **`version`**/`etag` semantics for PATCH.
5. **Admin APIs** — queue list indexes; verify/reject with **conditional UPDATE** + **409**/`200` policy; JSONB whitelist audit **in same txn**.
6. **Frontend** — `LoginRegisterCard`, `ApplicantProfileView`, **`/dashboard/users`**, sidebar permission alignment.
7. **Observability** — document idempotency and conflict responses in OpenAPI or internal README.
**MinIO:** unchanged unless attaching proof blobs later.
---
## 8. Testing matrix
Beyond API/integration tests originally listed:
| Class | Goal |
| ----- | ----- |
| **Constraint violations** | `INSERT`/`UPDATE` rows breaking `CHECK` / FK / partial unique → expect the DB error to surface as a mapped **4xx** response. |
| **Transition matrix** | Every legal Mermaid transition succeeds; illegal transitions (**0-row UPDATE**) fail as specified. |
| **Concurrency** | Two DB connections / threads race verify → one wins; second behavior matches documented idempotency. |
| **Migration round-trip** | `upgrade → downgrade → upgrade` on seeded DB ([DB review §10](user-profile-manager-db-review.md)). |
---
## 9. Open questions (before merge)
Blocking “migration done”; align product then freeze DDL ([DB review §11](user-profile-manager-db-review.md)):
1. **`employee_id`:** required at register, optional, or required-before-`pending`?
2. **Re-verification:** strict (any PATCH from `verified` → `pending`) vs pragmatic (field whitelist).
3. **Duplicate verify:** idempotent **200** vs **409** — document in handler + client.
4. **`units` catalog:** authoritative? If yes, drop free-text arm; else XOR + backfill strategy.
5. **Audit:** retention horizon + partition cadence.
6. **`verified_until` / expiry:** periodic re-verification for HR compliance — schema + job vs defer.
---
## 10. File reference map
| Area | Path |
| ---- | ---- |
| Register UI | `fe0/src/auth/LoginRegisterCard.tsx` |
| Auth state | `fe0/src/contexts/AuthContext.tsx` |
| HTTP client | `fe0/src/lib/auth-service.ts` |
| Applicant profile UI | `fe0/src/components/applicant/profile/ApplicantProfileView.tsx` |
| Admin sidebar (Users link) | `fe0/src/components/admin/DashboardSidebar.tsx` |
| Routes | `fe0/src/App.tsx` |
| Register / login / profile API | `be0/src/auth_api.py` |
| User ORM | `be0/src/initiative_db/models.py` |
| MinIO abstraction | `be0/src/minio/storage.py` |
| DB engineering review | [`docs/user-profile-manager-db-review.md`](./user-profile-manager-db-review.md) |
---
*Refined against [`user-profile-manager-db-review.md`](./user-profile-manager-db-review.md): committed schema (`user_staff_profiles`), DB-level state invariants, concurrency (`version`/conditional UPDATE), audit JSONB whitelist + same-transaction writes, indexing and migration posture, expanded testing and open questions.*