sciagent code + Gitea Actions CI/CD
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
# Application files: persistence, retrieval by `applicationId`, and backup notes

This document describes how the running **initiative** stack stores and loads:

- Evidence attachments (minh chứng 2.1 / 2.2 / kỹ thuật, i.e. evidence items 2.1, 2.2, and technical)
- The **submitted full-package PDF** (đơn + báo cáo, the application form plus report, from the « Xem lại » (Review) flow)
- The **filled DOCX / official PDF** derived from the Word template

It focuses on what **PostgreSQL** and **MinIO** hold. The root file [`database/schema.sql`](../database/schema.sql) describes a separate **integer `applications`** domain (an attachments table with an INT `application_id`); that schema is **not** wired into `be0` today. Production behavior is driven by **`be0/migrations/*.sql`** and **`INITIATIVE_DATABASE_URL`**.

**Implementation planning:** The phased backup and storage-hardening plan below is **refined against** the review in [`feedback-data-management.md`](feedback-data-management.md) (canonical bytes, `storage_kind`, SHA verification on pack, streaming ZIP + manifest, indexed IDs, evidence versioning, and sequencing).

---

## Identifiers: what “applicationId” means

The UI and APIs expose a **public submission id** shaped like `sub-{16 hex chars}` (see `save_submitted_application` in `be0/src/initiative_db/submissions.py`). Internally, persistence is keyed by:

| Concept | Example | Where |
|--------|---------|--------|
| **Public `applicationId`** (list/detail) | `sub-abc123def4567890` | `drafts.payload.submissionRecord.id`, API responses |
| **Draft / case code** | `CASE-…` or `SUB-…` | `initiatives.case_code`, `draft_case_id` on API rows |
| **Initiative primary key** | UUID | `initiatives.id`, MinIO key prefix, `application_artifacts.initiative_id` |

**Resolving a row:** `get_application_by_id` (`be0/src/initiative_db/submissions.py`) scans submitted initiatives and matches when either:

- `_submission_display_id(initiative, submissionRecord) == applicationId`, or
- `initiative.case_code == applicationId`.

So admins can deep-link with **`sub-…`**, or with **`CASE-…`** when that value equals the initiative's `case_code`. For backups, always persist **`initiatives.id`**, **`case_code`**, and **`sub-…`** together.
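
The matching rule above can be sketched in a few lines. This is only an illustration, not the code in `submissions.py`: `SubmittedRow`, `is_public_submission_id`, and `resolve_application` are hypothetical names, the sketch assumes the display id equals `submissionRecord.id`, and it assumes the 16 hex characters are lowercase.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional

# Documented shape: "sub-" followed by 16 hex characters.
# Assumption: lowercase hex only; widen the class if uppercase ids exist.
PUBLIC_ID_RE = re.compile(r"sub-[0-9a-f]{16}")


@dataclass(frozen=True)
class SubmittedRow:
    """Hypothetical stand-in for one submitted initiative plus its draft."""
    initiative_id: str  # initiatives.id (UUID as text)
    case_code: str      # initiatives.case_code
    public_id: str      # drafts.payload.submissionRecord.id


def is_public_submission_id(value: str) -> bool:
    """True when the value has the documented sub-{16 hex} shape."""
    return PUBLIC_ID_RE.fullmatch(value) is not None


def resolve_application(
    rows: Iterable[SubmittedRow], application_id: str
) -> Optional[SubmittedRow]:
    """Linear scan: match on the public id first, else on case_code."""
    for row in rows:
        if row.public_id == application_id or row.case_code == application_id:
            return row
    return None
```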

---

## MinIO

Configured in Docker via `S3_*` env vars (`docker-compose.yml`):

| Bucket (env) | Purpose |
|----------------|---------|
| **`initiative-attachments`** (`S3_BUCKET_ATTACHMENTS`) | Evidence uploads for Đơn (research / textbook / technical) |
| **`initiative-exports`** (`S3_BUCKET_EXPORTS`) | Optional copy of the **submitted full PDF** after successful submit |
| **`initiative-quarantine`** (`S3_BUCKET_QUARANTINE`) | Reserved for quarantine flows (not detailed here) |

**Object key layout** (`be0/src/minio/storage.py`):

- Evidence and export artifacts use **`build_key_for_initiative`**:
  `initiatives/{initiative_uuid_no_hyphens}/{yyyy}/{mm}/{uuid}-{safe_filename}`

The API uses the **internal endpoint** for the server (`S3_ENDPOINT_URL`, e.g. `http://minio:9000`) and **`S3_PUBLIC_ENDPOINT_URL`** for presigned URLs the browser can open (e.g. `http://localhost:19000`).

**Integrity:** uploads compute SHA-256 and store it in object metadata and/or Postgres (`application_artifacts.sha256`).
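
As a rough illustration of the key layout and the SHA-256 step, the sketch below builds a key of the documented shape and hashes the payload. It is not the code in `storage.py`: `safe_filename`, `build_key`, and `sha256_hex` are hypothetical helpers, and both the sanitizing rule and the hyphenated random `{uuid}` segment are assumptions.

```python
import hashlib
import re
import uuid
from datetime import datetime, timezone
from typing import Optional


def safe_filename(name: str) -> str:
    """Assumed rule: keep ASCII letters, digits, dot, dash, underscore."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", name).strip("._")
    return cleaned or "file"


def build_key(
    initiative_id: uuid.UUID, filename: str, now: Optional[datetime] = None
) -> str:
    """initiatives/{uuid_no_hyphens}/{yyyy}/{mm}/{uuid}-{safe_filename}"""
    now = now or datetime.now(timezone.utc)
    return (
        f"initiatives/{initiative_id.hex}/{now:%Y}/{now:%m}/"
        f"{uuid.uuid4()}-{safe_filename(filename)}"
    )


def sha256_hex(data: bytes) -> str:
    """Hex digest to store alongside the object (metadata and/or Postgres)."""
    return hashlib.sha256(data).hexdigest()
```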

---

## PostgreSQL (initiative database)

Core tables (`be0/migrations/001_initiative_schema.sql`, `002_application_storage_extensions.sql`, plus review-doc extensions):

### `initiatives`

- `id` (UUID), `case_code` (unique text), `owner_id`, `status`, `submitted_at`, etc.
- Submitted applications have `status != 'draft'` (e.g. `submitted`).

### `drafts`

- `payload` JSONB holds the live bundle: tab data, `submissionRecord`, `submissionFile`, etc.

After submit, important keys include:

- `payload.submissionRecord`: metadata including public `id` (`sub-…`)
- `payload.submissionFile`: e.g. `{ "url": "/submitted-initiatives/sub-….pdf", "type": "pdf" }`

### `application_artifacts`

One row per **`(initiative_id, role)`** (`002_application_storage_extensions.sql`). **Planned (Phase 1):** add roles for the **printable application form** binaries (e.g. **`official_form_docx`**, **`official_form_pdf`**), distinct from **`full_pdf`** (the **client-uploaded** full hồ sơ (dossier) PDF).

| `role` | Meaning |
|--------|---------|
| `full_pdf` | Submitted package PDF; **`storage_uri`** is either a **MinIO key** (under the exports bucket) or a **relative URL** to static files |
| `research_evidence` | Minh chứng 2.1 (nghiên cứu, research) |
| `textbook_evidence` | Minh chứng 2.2 (giáo trình, textbook) |
| `technical_evidence` | Minh chứng kỹ thuật (technical evidence, nhóm 1 / group 1) |

Columns: `storage_uri`, `original_name`, `mime_type`, `byte_size`, `sha256`, `uploaded_by`, `uploaded_at`, plus review fields for evidence.

### `application_submit_snapshots`

Append-only rows: merged tabs, submit metadata, and **`full_pdf_uri`** (today this records the **URL passed at submit time**, typically `/submitted-initiatives/...`, not necessarily the MinIO key).

Treat this table as a **historical audit** of the submit request, not as the driver for backup byte locations: **`application_artifacts`** (and `storage_kind` once added) is the operational source of truth ([`feedback-data-management.md`](feedback-data-management.md) §8).

### `application_review_documents`

Versioned JSON used to regenerate the Word template output:

- `official_bieu_mau`, `template_data`, `full_bundle` (JSONB)
- Tied to **`initiative_id`** and `case_id`

**Today:** the binary filled DOCX is **not** stored in MinIO; this table is the only server-side input to regeneration. **Target (for a trustworthy admin backup):** treat this JSON as **supporting data** (re-render, analytics, diffing). The **canonical bytes** for “what the applicant signed off on” for the printable mẫu (official form) should be **immutable objects in MinIO** plus rows in `application_artifacts` (see [Implementation plan, Phase 1](#phase-1-canonical-bytes-for-printable-docx--pdf-before-backup-ships)).

### Other useful tables

- `draft_tab_snapshots`: history of tab JSON (`report` / `application` / `contribution`)

---

## Backend flows

### Evidence upload & download

- **POST** `/api/v1/application-drafts/{case_id}/evidence`: multipart upload; stores the object in **`initiative-attachments`**; upserts `application_artifacts` with role `research_evidence` | `textbook_evidence` | `technical_evidence` (`be0/main.py`).
- **GET** `/api/v1/application-drafts/{case_id}/evidence`: returns metadata plus **presigned** `downloadUrl` / `viewUrl` for staff or owner.

`case_id` is normalized to the initiative’s **`case_code`** (e.g. `CASE-…`).

### Submit full PDF

- **POST** `/api/applications/submit`: receives PDF + JSON `metadata` (`be0/main.py`).
- Always writes the file to **`SUBMITTED_INITIATIVES_DIR`** (default: repo `assets/submitted-initiatives` or `fe0/public/submitted-initiatives` in dev), served under **`/submitted-initiatives/{sub-….pdf}`**.
- If PostgreSQL is enabled: **`save_submitted_application`** updates `initiatives` / `drafts`, writes **`application_submit_snapshots`**, **`application_taxonomy`**, **`application_workflow`**, and calls **`upsert_artifact_full_pdf`**.
- **MinIO copy:** `_maybe_upload_submitted_pdf_to_exports_minio` uploads the same bytes to **`initiative-exports`** and, on success, sets **`application_artifacts.full_pdf.storage_uri`** to the **object key** (not the `/submitted-initiatives/...` URL). If MinIO fails, the artifact still points at the **filesystem URL** only; **this is slated to become a hard failure** once canonical storage is enforced ([Phase 2](#phase-2-canonical-storage-for-submitted-full-package-pdf)).

### Filled DOCX / official PDF (preview; persistence plan)

- **POST** `/api/v1/docx/preview-application-form`: renders `template_application_form.docx` with **docxtpl**; returns bytes (**no DB/MinIO write** today).
- **POST** `/api/v1/docx/preview-application-form-pdf`: same merge, then **LibreOffice** conversion to PDF; returns bytes.

The client builds `officialBieuMau` from draft state; **`persistReviewDocumentBundle`** (**POST** `/api/v1/review-documents`) saves the JSON bundle to **`application_review_documents`**.

**Preview endpoints remain useful** for staff “what-if” checks and for regenerating with newer templates. **They must not** be the only path that feeds the admin backup ZIP once Phase 1 is done: backups should stream **stored** printable DOCX/PDF bytes unless a legacy row has no stored object (then document an explicit fallback or backfill).

### Admin detail: presigned full PDF

For **GET** `/api/applications/{application_id}`, when `full_pdf.storage_uri` looks like a **MinIO key** (not `/submitted-initiatives` or `http`), **`_enrich_application_detail_full_pdf_presign`** adds `files.fullText.viewUrl` (presigned GET on **`initiative-exports`**).

---

## Frontend

| Concern | Location |
|---------|----------|
| Submit PDF | `fe0/src/components/applicant/submitInitiativePdf.ts` → **POST** `/api/applications/submit` with `FormData` + JWT; metadata includes **`initiativeCaseId`** (must match Postgres `case_code`). |
| Draft load/save | `fe0/src/components/applicant/applicationDrafts.ts`: **GET/POST** `/api/v1/application-drafts/...`. |
| DOCX/PDF from template | `fe0/src/lib/applicationFormDocxApi.ts` → preview endpoints; `ApplicationFormDocxPreview.tsx` orchestrates save + review bundle persistence. |
| Evidence UI | e.g. `ApplicationEvidenceManagePage.tsx`, which uses **GET** `/api/v1/application-drafts/{caseId}/evidence` with presigned URLs. |
| Admin list/detail | Uses **GET** `/api/applications` (list) and **GET** `/api/applications/{application_id}` (detail, keyed by `applicationId`); detail exposes `draft_case_id` for loading drafts/evidence. |

Important: **`sub-…`** is the list id; **draft/evidence APIs use `case_code` (`CASE-…`)**. The API surfaces `draft_case_id` on submission rows to bridge the two.

---

## Applicant honesty checkboxes, complete tabs & PDF minh chứng (engineering guide)

Goal: applicants cannot tick the **cam kết trung thực** (honesty commitment) checkboxes at the end of **Báo cáo** (Report), **Đơn** (Application), and **Xác nhận đóng góp** (Contribution confirmation) until the workflow rules below are satisfied; the UI shows a **Sonner** toast listing missing items. **PDF minh chứng** means the classification-specific evidence file for Đơn (research / textbook / technical), stored in **MinIO** via `POST /api/v1/application-drafts/{case_id}/evidence` (see [Evidence upload & download](#evidence-upload--download)).

### Intended behaviour (product)

| Control | When it may be ticked |
|--------|------------------------|
| **Báo cáo** (`InitiativeReportForm`) | All required fields on the report tab are non-empty (§1–§6 narrative + hiệu quả (effectiveness) fields exposed in the UI). |
| **Đơn** (`InitiativeApplicationForm`) | All required Đơn fields are complete **and** the correct **PDF minh chứng** slot is filled for the chosen classification (local `File`, or `FileHandle` with `serverStorageKey` after MinIO upload). Sub-forms (bản cam kết / biểu xác nhận, the commitment and confirmation forms) must match the selected nhóm (group). |
| **Xác nhận đóng góp** (`ContributionConfirmationForm`) | Same checks as Đơn **and** Báo cáo, **and** the applicant has already ticked honesty on **Báo cáo** and **Đơn**. |
| **Xem lại: Gửi** (Review: Submit; `ApplicationFormDocxPreview`) | Same as the contribution gate **plus** `contribution.digitalSignatureConfirmed` in the persisted contribution JSON. |

Implementation reference:

- Shared validators + messages: `fe0/src/lib/applicantHonestyPrerequisites.ts` (`collectReportTabHonestyGaps`, `collectApplicationTabHonestyGaps`, `collectContributionDigitalSignaturePrerequisiteGaps`, `collectApplicantSubmitToAdminPrerequisiteGaps`, `formatApplicantPrerequisiteToastDescription`).
- Checkbox handlers toast with `toast.error(..., { description })` and **do not** flip state when prerequisites fail.

Staff / council flows without `DraftProvider` skip the contribution-tab signature gate (no full draft in context); fields stay **`readOnly`** as today.

### Frontend (detailed)

1. **Single source of truth for messages:** Keep gap strings in `applicantHonestyPrerequisites.ts` so DOCX preview and forms stay aligned.
2. **Evidence PDF:** Treat as present if `applicantEvidencePdfPresent(file)` is true: `File` with non-zero size, or `FileHandle` with `serverStorageKey` (MinIO) or positive `size` (IndexedDB). This matches hydration in `DraftContext` after `getApplicationEvidence(caseId)`.
3. **Contribution tab:** Uses `draft.report` and `draft.application` from `DraftContext`; authors/% totals are validated on Đơn; the contribution UI mirrors `authors` when connected to Postgres drafts.
4. **Review submit:** Besides tab JSON, enforce the contribution signature flag on the object passed into `ApplicationFormDocxPreview` (from `draftTabs.contribution`).

### Backend (recommended)

The checkbox gates themselves run **client-side**. For integrity, the server repeats the checks:

- **`POST /api/applications/submit`**: implemented in `be0/src/initiative_db/submission_readiness.py`, invoked from `save_submitted_application` **before** the initiative is marked submitted. It loads merged `drafts.payload.tabs` (with snapshot fallback), reads **`application_artifacts`** for `research_evidence` / `textbook_evidence` / `technical_evidence` (non-empty `storage_uri`), and validates tab JSON + honesty flags to match the applicant UI. On failure it returns **400** with `detail: { "message": "…", "missing": ["…", …] }` (see `ApplicationSubmissionNotReadyError` handling in `be0/main.py`). The client maps this in `fe0/src/components/applicant/submitInitiativePdf.ts`. Any partial PDF written to disk is removed when Postgres validation fails.
- **`POST /api/v1/application-drafts/{case_id}/evidence`**: already the canonical upload path; rejects non-PDF or oversize files (existing behaviour).
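
To make the shape of the server-side check concrete, here is a minimal sketch of collecting missing items and building the documented 400 `detail` payload. It is not the code in `submission_readiness.py`: `collect_missing` and `not_ready_detail` are hypothetical names, the strings placed in `missing` are invented labels, and the evidence rule is simplified to "any evidence role has a non-empty `storage_uri`" rather than the classification-specific slot.

```python
from typing import Any, Dict, List

EVIDENCE_ROLES = ("research_evidence", "textbook_evidence", "technical_evidence")

# (tab name, boolean flag) pairs stored as plain booleans in the tab JSON.
REQUIRED_FLAGS = (
    ("report", "honestyConfirmed"),
    ("application", "honestyConfirmed"),
    ("contribution", "digitalSignatureConfirmed"),
)


def collect_missing(
    tabs: Dict[str, Dict[str, Any]], artifact_uris: Dict[str, str]
) -> List[str]:
    """Return labels for every unmet prerequisite (empty list means ready)."""
    missing = [
        f"{tab}.{flag}"
        for tab, flag in REQUIRED_FLAGS
        if not tabs.get(tab, {}).get(flag)
    ]
    # Simplification: accept any evidence role with a non-empty storage_uri.
    if not any(artifact_uris.get(role) for role in EVIDENCE_ROLES):
        missing.append("evidence_pdf")
    return missing


def not_ready_detail(missing: List[str]) -> Dict[str, Any]:
    """Body for the 400 response: detail = {"message": ..., "missing": [...]}."""
    return {"message": "Application is not ready to submit", "missing": missing}
```

A caller would raise the not-ready error only when `collect_missing` returns a non-empty list, before the initiative row is marked submitted.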

### PostgreSQL

- Tab JSON lives under **`drafts.payload`** (and/or tab snapshots). Honesty flags are plain booleans: `report.honestyConfirmed`, `application.honestyConfirmed`, `contribution.digitalSignatureConfirmed`. No migration is required for gating unless you add a **server-side** “submission readiness” snapshot column.

### MinIO

- The required PDF for Đơn is stored under **`initiative-attachments`** with keys from `build_key_for_initiative`; metadata is reflected in **`application_artifacts`** (`research_evidence` | `textbook_evidence` | `technical_evidence`). Frontend readiness should agree with **either** the draft file handle (`serverStorageKey`) **or** a fresh **`GET .../evidence`** bundle (see `collectDocxTemplateCompletenessGaps` in admin review for a related pattern).

---

## Retrieving everything for one submission (interim checklist)

Until Phases 1–2 are done, a reader resolving **`applicationId`** (`sub-…`) should:

1. **Postgres:** Resolve `initiatives` + latest `drafts` (today: `get_application_by_id` scan; target: indexed `submission_public_id`, see [Phase 4](#phase-4-identifiers--schema-hygiene)).
2. **Submitted full-package PDF (`full_pdf` artifact):** Read `application_artifacts` with `role = 'full_pdf'`. Dispatch on **`storage_kind`** once added; until then, avoid relying only on string-prefix heuristics for production backups.
3. **Evidence:** Roles `research_evidence`, `textbook_evidence`, `technical_evidence` → keys in **`initiative-attachments`**.
4. **Printable mẫu DOCX/PDF:** After Phase 1, stream from MinIO using the new artifact roles; until then see the **legacy** note in Phase 3.

Optional ZIP extras: latest `application_review_documents` JSON, `draft_tab_snapshots`, read-only copies of `application_submit_snapshots` for audit.

**Related rationale and risks** (regeneration vs backup, polymorphic `storage_uri`, integrity): [`feedback-data-management.md`](feedback-data-management.md).

---

## Implementation plan: admin backup (database + document management)

Goal: **admin downloads one ZIP** containing **all evidence attachments**, the **submitted full-package PDF**, and the **printable application DOCX + PDF** (mẫu), with **verifiable integrity** and **no reliance on regenerating** printable documents at download time (after prerequisites).

Phasing follows the sequencing in [`feedback-data-management.md`](feedback-data-management.md) §“Suggested order of work”, expanded into concrete schema and API work.

### Phase 0: Decisions & prerequisites

| Item | Action |
|------|--------|
| **Canonical bytes for printable mẫu** | Store immutable DOCX + PDF in MinIO at submit (or immediately pre-submit in the same transaction as finalize), not only JSON. |
| **Evidence versioning** | Decide: append-only evidence history vs “latest only”. For approvals, prefer **versioned or append-only** so the backup matches what was reviewed ([`feedback-data-management.md`](feedback-data-management.md) §7). |
| **Quarantine bucket** | Define behavior if objects exist in **`initiative-quarantine`**: include/exclude/fail backup ([`feedback-data-management.md`](feedback-data-management.md) §11). |
| **MinIO operations** | Document versioning, lifecycle, retention, DR (suggested spin-off: `MINIO_OPERATIONS.md` per feedback §9). |
| **Dead schema** | Move or clearly label [`database/schema.sql`](../database/schema.sql) so tooling does not confuse INT `application_id` with `sub-…` ([`feedback-data-management.md`](feedback-data-management.md) §6). |

### Phase 1: Canonical bytes for printable DOCX + PDF (before backup ships)

**Problem:** Regenerating DOCX/PDF at backup time uses the **current** template, docxtpl, LibreOffice, and fonts, so the result is not provably what the applicant saw ([`feedback-data-management.md`](feedback-data-management.md) §1).

**Database**

- Extend the `application_artifacts.role` CHECK (new migration) with two roles, e.g. **`official_form_docx`** and **`official_form_pdf`** (names TBD; must be distinct from **`full_pdf`**, which is the **client-uploaded full hồ sơ** PDF).
- On successful submit (or a single server-side “finalize” step): compute SHA-256 for each file; **`INSERT`/upsert** rows with `storage_uri` = MinIO key, `sha256`, `byte_size`, `mime_type`, `original_name`, **`storage_kind = 'minio_exports'`** (once the column exists).

**Application logic**

- Server: build `officialBieuMau` from the same snapshot used for submission (bundle already available in draft + review document path), call existing **`fill_application_form_docx`** → bytes; call **`convert_docx_bytes_to_pdf`** → bytes; upload both to **`initiative-exports`** using `build_key_for_initiative`.
- **Do not** put LibreOffice on the admin **download** path after this; an optional background **verify-only** job may re-read objects.
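
The finalize step described above could be organized roughly as follows. This is a sketch under stated assumptions, not the planned implementation: `persist_official_form` and `ArtifactRow` are hypothetical, the three callables are injected stand-ins for `fill_application_form_docx`, `convert_docx_bytes_to_pdf`, and an upload helper that returns the MinIO key, and the role names are the tentative ones from this plan.

```python
import hashlib
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
PDF_MIME = "application/pdf"


@dataclass(frozen=True)
class ArtifactRow:
    """Hypothetical in-memory image of one application_artifacts row."""
    role: str
    storage_uri: str  # MinIO object key
    sha256: str
    byte_size: int
    mime_type: str
    original_name: str
    storage_kind: str = "minio_exports"


def persist_official_form(
    bundle: Dict[str, Any],
    fill_docx: Callable[[Dict[str, Any]], bytes],
    docx_to_pdf: Callable[[bytes], bytes],
    upload: Callable[[bytes, str], str],  # (data, filename) -> object key
) -> List[ArtifactRow]:
    """Render once at submit time, store both binaries, return rows to upsert."""
    docx = fill_docx(bundle)
    pdf = docx_to_pdf(docx)
    rows = []
    for role, data, name, mime in (
        ("official_form_docx", docx, "official-form.docx", DOCX_MIME),
        ("official_form_pdf", pdf, "official-form.pdf", PDF_MIME),
    ):
        key = upload(data, name)
        digest = hashlib.sha256(data).hexdigest()
        rows.append(ArtifactRow(role, key, digest, len(data), mime, name))
    return rows
```

Injecting the renderers keeps the sketch testable without docxtpl or LibreOffice; a real version would also need to decide what happens when the second upload fails after the first succeeded.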

**JSON**

- Keep saving **`application_review_documents`** for re-render/diff; it is **not** the sole legal snapshot of the printable files once binaries exist.

**Gate:** Do **not** release the admin backup endpoint that promises “printable DOCX/PDF” until this phase is done for **new** submits; for **legacy** rows without these artifacts, define a policy (backfill job vs manifest flag `missing_official_form: true`).

### Phase 2: Canonical storage for submitted full-package PDF

**Problem:** `full_pdf` may point at filesystem-only, MinIO-only, or both; best-effort upload risks silent loss ([`feedback-data-management.md`](feedback-data-management.md) §2).

**Database**

- Add **`storage_kind`** on **`application_artifacts`** (enum/text): e.g. `minio_exports`, `minio_attachments`, `filesystem`, `external_url`. Backfill from the existing `storage_uri` shape; default new rows explicitly.
- Optionally add **`content_sha256_verified_at`** or rely on the manifest at backup time only.
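
A backfill from the `storage_uri` shape might classify rows along these lines. The rules are assumptions drawn from this document (keys start with `initiatives/`, static files are served from a leading-slash path, evidence roles live in the attachments bucket); `infer_storage_kind` is a hypothetical helper, and anything it cannot classify is raised for manual review rather than guessed.

```python
def infer_storage_kind(storage_uri: str, role: str) -> str:
    """Map an existing storage_uri to one of the proposed storage_kind values."""
    uri = storage_uri.strip()
    if uri.startswith(("http://", "https://")):
        return "external_url"
    if uri.startswith("/"):
        # e.g. /submitted-initiatives/sub-….pdf served from the filesystem
        return "filesystem"
    if uri.startswith("initiatives/"):
        # Object key: the bucket follows from the role (assumed convention).
        if role.endswith("_evidence"):
            return "minio_attachments"
        return "minio_exports"
    raise ValueError(f"cannot classify storage_uri for role {role!r}: {uri!r}")
```

Raising on unknown shapes keeps the backfill from silently writing a wrong `storage_kind`, which would defeat the point of making the column explicit.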

**Application logic**

- Make the **MinIO upload of `full_pdf` synchronous and required** when persistence is enabled: if the upload fails, **fail submit** with a retryable error.
- Treat the filesystem write as a **cache** for dev/static serving if desired, not the sole store.
- **Backfill job:** filesystem-only historical PDFs → **`initiative-exports`**, then update the artifact row + `storage_kind`.

**Infrastructure**

- Ensure **`SUBMITTED_INITIATIVES_DIR`** is on a **persistent volume** in every environment, or stop relying on it for production.

### Phase 3: Admin backup endpoint + ZIP contract

**Authorization:** admin-only; **audit** every request: actor, `applicationId`, timestamp, outcome, bytes streamed ([`feedback-data-management.md`](feedback-data-management.md) §10).

**Resolution:** load the initiative by **`submission_public_id`** or **`case_code`** (indexed) after Phase 4; until then use the existing lookup with awareness of scan cost for **bulk** exports.

**Integrity**

- While streaming each file into the ZIP, **compute SHA-256** and **compare** to `application_artifacts.sha256`. On mismatch: **fail the entire export** and log at high severity ([`feedback-data-management.md`](feedback-data-management.md) §4).
- Optional **`POST /admin/…/backup/verify`** (verify-only, no ZIP) for periodic audits.
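
The hash-while-streaming rule can be illustrated with the standard library alone. This sketch uses `zipfile.ZipFile.open(..., mode="w")` to copy an artifact into a ZIP entry in chunks while feeding the same chunks to SHA-256; `add_verified_entry` and `IntegrityError` are hypothetical names, and a production endpoint would more likely use a streaming ZIP library and a MinIO object stream as the source.

```python
import hashlib
import io
import zipfile
from typing import BinaryIO, Tuple

CHUNK_SIZE = 64 * 1024


class IntegrityError(Exception):
    """Raised when streamed bytes do not match the stored sha256."""


def add_verified_entry(
    zf: zipfile.ZipFile, arcname: str, source: BinaryIO, expected_sha256: str
) -> Tuple[str, int]:
    """Copy source into the archive chunk by chunk and verify its digest."""
    digest = hashlib.sha256()
    size = 0
    with zf.open(arcname, mode="w") as entry:
        while True:
            chunk = source.read(CHUNK_SIZE)
            if not chunk:
                break
            digest.update(chunk)
            entry.write(chunk)
            size += len(chunk)
    actual = digest.hexdigest()
    if actual != expected_sha256.lower():
        # The caller must abort the export and discard the partial archive.
        raise IntegrityError(f"{arcname}: stored {expected_sha256}, computed {actual}")
    return actual, size


if __name__ == "__main__":
    payload = b"%PDF-1.7 demo bytes"
    out = io.BytesIO()  # stand-in for the HTTP response stream
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as archive:
        add_verified_entry(
            archive,
            "submitted/full-package.pdf",
            io.BytesIO(payload),
            hashlib.sha256(payload).hexdigest(),
        )
```

Because the digest is only known after the last chunk, a mismatch is detected once the entry has already been written; that is why the rule is to fail the whole export rather than skip one file.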

**ZIP layout** (suggested; ASCII-safe entry names, original names in manifest):

```text
manifest.json
submitted/full-package.pdf
submitted/official-form.docx
submitted/official-form.pdf
evidence/research/{safe-name-or-id}
evidence/textbook/…
evidence/technical/…
metadata/application_review_documents.json   # optional
```

**`manifest.json`** (minimum fields): `applicationId`, `case_code`, `initiative_id`, submitted timestamps, owner id, **list of files** with `role`, `original_name`, `mime_type`, `byte_size`, **stored** `sha256`, **verified** `sha256` (computed during ZIP build), `storage_kind`.
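
One possible shape for the manifest, as a sketch only: the key names `stored_sha256`, `verified_sha256`, `submitted_at`, and `owner_id` are assumptions (the list above names the fields but not their JSON keys), and `zip_path` is an extra hypothetical field that ties each record to its ASCII-safe entry name.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class ManifestFile:
    zip_path: str         # assumed extra field: entry name inside the ZIP
    role: str
    original_name: str    # may be non-ASCII; the entry name stays ASCII-safe
    mime_type: str
    byte_size: int
    stored_sha256: str    # value read from application_artifacts.sha256
    verified_sha256: str  # value computed while building the ZIP
    storage_kind: str


def build_manifest(
    application_id: str,
    case_code: str,
    initiative_id: str,
    submitted_at: str,
    owner_id: str,
    files: List[ManifestFile],
) -> str:
    """Serialize the manifest; ensure_ascii=False keeps original names readable."""
    doc = {
        "applicationId": application_id,
        "case_code": case_code,
        "initiative_id": initiative_id,
        "submitted_at": submitted_at,
        "owner_id": owner_id,
        "files": [asdict(f) for f in files],
    }
    return json.dumps(doc, ensure_ascii=False, indent=2)
```

Recording both digests, even though they must be equal for a successful export, lets an auditor confirm from the manifest alone that verification actually ran.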

**Transport**

- **Stream the ZIP** with a streaming library (e.g. `zipstream-ng`); **do not** buffer whole archives in memory.
- Single-initiative: a synchronous response is acceptable.
- **Bulk** (date range, many rows): **async job** → write the ZIP to **`initiative-exports`** or **`initiative-backups`** → presigned URL when ready (avoids proxy timeouts).

**Sources for each ZIP entry**

| Content | Source |
|--------|--------|
| Full hồ sơ PDF | `application_artifacts.full_pdf` → MinIO **`initiative-exports`** (after Phase 2) |
| Printable DOCX / PDF | `official_form_docx` / `official_form_pdf` → **`initiative-exports`** |
| Evidence | `research_*`, `textbook_*`, `technical_*` → **`initiative-attachments`** |
| Structured snapshot | Optional: latest `application_review_documents` JSON |

**Legacy:** If `official_form_*` is missing, either skip with manifest flags or run a **one-time backfill** using a frozen template policy; **document** that backfilled bytes are “as-of backfill date”, not the original submit date.

### Phase 4: Identifiers & schema hygiene

- Add **`submission_public_id`** (unique, indexed) on **`initiatives`**, set once at submit; replace the linear scan in `get_application_by_id` with an indexed lookup ([`feedback-data-management.md`](feedback-data-management.md) §5).
- Document resolution: **`sub-…`** vs **`CASE-…`** explicitly (remove “sometimes” from ops docs).

### Phase 5: Hardening (ongoing)

- MinIO **versioning** / **object lock** if compliance requires; off-cluster backup of MinIO; periodic verify-only sweeps ([`feedback-data-management.md`](feedback-data-management.md) §9, §10, quarter roadmap).

---

### Frontend (admin)

- New **“Tải bản sao lưu”** (“Download backup”, or similar) action on application detail: call the backup endpoint and handle long downloads (progress if async + poll).
- For the async pattern: show the job id, then the link when the presigned URL is ready.
- Ensure **admin audit** expectations match backend logging.

---

### Summary

| Layer | Current | After plan |
|--------|-----------------|------------|
| **Postgres** | Artifacts + polymorphic `storage_uri` | Explicit `storage_kind`, optional `submission_public_id`, new artifact roles for official DOCX/PDF |
| **MinIO** | Evidence + best-effort full PDF | Required `full_pdf` + official form binaries on **`initiative-exports`**; evidence on **`initiative-attachments`** |
| **Admin backup** | Would require regeneration / fragile dispatch | Streaming ZIP + manifest + verified SHA + audit; optional async for bulk |

This aligns the **database and document management system** with a backup that **admins can trust**: **stored bytes**, **verified at pack time**, and **operationally grounded** in explicit storage metadata.