sciagent code + Gitea Actions CI/CD
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
# Implementation Guide: `sang-kien-pdf`

A step-by-step walkthrough of how the Sáng kiến PDF and DOCX template generators are built. Read this if you want to understand **why** each piece exists, **how** to modify the layout, or **how** to port the same approach to a different government form.

---

## Table of contents

1. [The problem we're solving](#1-the-problem-were-solving)
2. [Architecture overview](#2-architecture-overview)
3. [Tech stack and rationale](#3-tech-stack-and-rationale)
4. [Project setup](#4-project-setup-from-scratch)
5. [Implementing the PDF generator](#5-implementing-the-pdf-generator)
   - 5.1 [TypeScript data types](#51-typescript-data-types)
   - 5.2 [Font registration](#52-font-registration)
   - 5.3 [Shared styles](#53-shared-styles)
   - 5.4 [Reusable components](#54-reusable-components)
   - 5.5 [Page components](#55-page-components)
   - 5.6 [Top-level Document](#56-top-level-document)
   - 5.7 [Server-side render helper](#57-server-side-render-helper)
6. [Implementing the DOCX template generator](#6-implementing-the-docx-template-generator)
   - 6.1 [The Jinja-in-DOCX strategy](#61-the-jinja-in-docx-strategy)
   - 6.2 [The 3-row table loop trick](#62-the-3-row-table-loop-trick)
   - 6.3 [Multi-section layout](#63-multi-section-layout)
   - 6.4 [Building paragraphs and tables](#64-building-paragraphs-and-tables)
7. [Layout calibration](#7-layout-calibration-matching-the-standard)
8. [Verification workflow](#8-verification-workflow)
9. [Common modifications](#9-common-modifications)
10. [Troubleshooting](#10-troubleshooting)
11. [Porting to a different form](#11-porting-to-a-different-form)

---

## 1. The problem we're solving

The "Sáng kiến" (initiative) application is a Vietnamese government form (Đại học Y Dược TP.HCM) with six sections: a cover page (Trang bìa), Mẫu số 01–04, and Bản cam kết. Every applicant fills out the same skeleton with their own data.

Two real-world workflows need to be supported:

1. **Programmatic PDF generation.** A web service receives JSON and returns a printable PDF. No human edits the file before printing.
2. **Word-based filling.** An admin opens a `.docx` template in Word and types into it (or uses `docxtpl`, Carbone, or a similar tool to merge JSON), then prints.

Both outputs must look identical to the official reference document (`Sang_kien_SOP_dong_vat`). The data shape (`data_blank.json`) is fixed by an existing upstream system and must not change.

The challenge is keeping the two generators in sync (same layout, same data fields) while staying within each format's idioms.

---

## 2. Architecture overview

```
                 ┌────────────────────┐
                 │     data.json      │  ← source of truth (data_blank.json shape)
                 └──────────┬─────────┘
                            │
           ┌────────────────┴────────────────┐
           ▼                                 ▼
┌──────────────────────┐        ┌─────────────────────────┐
│  React-PDF pipeline  │        │  docx + docxtpl path    │
│                      │        │                         │
│  data → React tree   │        │  build-docx-template.ts │
│  → PDF buffer        │        │  generates .docx with   │
│                      │        │  {{ }} placeholders     │
│                      │        │          ↓              │
│                      │        │  docxtpl.render(data)   │
│                      │        │  → filled .docx         │
└──────────┬───────────┘        └────────────┬────────────┘
           │                                 │
           ▼                                 ▼
      filled.pdf                        filled.docx
```

The PDF path uses **runtime composition**: a React component receives data as props and returns a tree of `<Page>`/`<View>`/`<Text>` elements, which the renderer turns into a PDF buffer.

The DOCX path uses **template-based composition**: a build script (`build-docx-template.ts`) produces a `.docx` file *once*, with placeholder strings like `{{ mau_01.mo_dau }}` baked into the document body. At runtime, `docxtpl` (Python) or another Jinja-aware OOXML tool reads that `.docx`, finds the placeholders, and replaces them with values from the JSON.

Both pipelines read **the same TypeScript types and JSON files**. Adding a new field still requires touching both sides (the page component that reads it and the placeholder string in the build script), but the canonical field name is defined in exactly one place: `src/types.ts`.

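One way to keep the two sides honest is a small check that every `{{ … }}` placeholder used in the template resolves to a real path in the JSON. The following is a minimal sketch under stated assumptions, not part of the project as described: `extractPlaceholders`, `resolvePath`, and `missingPlaceholders` are hypothetical helper names, and the regular expression only handles plain dotted paths (no Jinja filters, and loop variables such as `item.stt` would need separate handling).

```ts
// Hypothetical helpers (not in the original project): report {{ a.b.c }}
// placeholders that do not exist in a data object.

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Collect the dotted paths of all {{ path }} placeholders in a piece of text.
function extractPlaceholders(text: string): string[] {
  const re = /\{\{\s*([A-Za-z_][\w.]*)\s*\}\}/g;
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    found.push(m[1]);
  }
  return found;
}

// Walk the object along the dotted path; undefined means a segment is missing.
function resolvePath(data: Json, path: string): Json | undefined {
  let current: Json | undefined = data;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object" || Array.isArray(current)) {
      return undefined;
    }
    current = current[key];
  }
  return current;
}

// Placeholders in `text` that have no matching field in `data`.
function missingPlaceholders(text: string, data: Json): string[] {
  return extractPlaceholders(text).filter((p) => resolvePath(data, p) === undefined);
}

const sample: Json = { mau_01: { mo_dau: "", ten_sang_kien: "" } };
const templateText = "{{ mau_01.mo_dau }} and {{ mau_01.khong_ton_tai }}";
console.log(missingPlaceholders(templateText, sample)); // [ 'mau_01.khong_ton_tai' ]
```

An empty string still counts as present, which matches the blank-form case where every field exists but holds no value.
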
---

## 3. Tech stack and rationale

| Concern | Choice | Why |
|---|---|---|
| PDF rendering | `@react-pdf/renderer` v4 | Component-based, server- and browser-compatible. Uses Yoga for flexbox layout. Layouts are written as React components, so they compose like UI code. |
| Vietnamese font | `@expo-google-fonts/tinos` | Tinos is metric-compatible with Times New Roman (Apache 2.0) and covers the Latin Extended Additional range needed for `ư ơ ầ ậ ọ ặ` etc. The `@expo-google-fonts/*` packages ship actual `.ttf` files. Many other font packages ship `.woff`/`.woff2`; React-PDF documents support for TTF and WOFF but not WOFF2, so TTF is the safest choice. |
| DOCX generation | `docx` v9 (npm) | Object-model API: build paragraphs, tables, sections in TypeScript, then `Packer.toBuffer()` produces a valid `.docx`. Maintained, typed, stable. |
| Templating engine | `docxtpl` (Python) | A widely used Jinja-style DOCX templater. Recognizes `{{ var }}`, `{% if %}`, and crucially `{%tr for %}` for table-row loops. `docx-templates` (JS) and Carbone follow a similar template-in-DOCX approach, but their tag syntax may differ, so check compatibility before reusing the same template. |
| TypeScript | 5.4 | Catches type errors at build time and gives autocompletion across all the data fields. |
| Test rendering | LibreOffice (`soffice`) | Used to convert `.docx` → `.pdf` so we can visually diff against the reference document. |

**Why not a pure HTML-to-PDF approach (Puppeteer)?** It works, but it pulls in a full headless Chromium, and output can vary across machines (installed fonts, browser version). React-PDF embeds the registered fonts and uses its own layout engine, so output is far more reproducible.

**Why not just generate the DOCX and convert it to PDF?** That would solve the layout-sync problem but couples PDF generation to a heavy toolchain (LibreOffice). React-PDF runs in pure Node.js and works inside serverless environments.

---

## 4. Project setup from scratch

```bash
mkdir sang-kien-pdf && cd sang-kien-pdf
npm init -y

# Runtime dependencies
npm install @react-pdf/renderer react @expo-google-fonts/tinos docx

# Dev dependencies
npm install -D typescript ts-node @types/react @types/node
```

Create `tsconfig.json`:

```json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "lib": ["ES2020", "DOM"],
    "jsx": "react",
    "outDir": "./dist",
    "rootDir": "./",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true,
    "declaration": true,
    "declarationMap": true,
    "sourceMap": true,
    "resolveJsonModule": true,
    "moduleResolution": "node"
  },
  "include": ["src/**/*", "example/**/*", "tools/**/*"],
  "exclude": ["node_modules", "dist"]
}
```

The `jsx: "react"` setting matters: it selects the classic JSX transform, which compiles JSX to `React.createElement` calls rather than the newer automatic runtime. Every `.tsx` file therefore needs `React` in scope (`import React from "react"`).

Add scripts to `package.json`:

```json
{
  "scripts": {
    "build": "tsc",
    "generate": "ts-node example/generate-example.ts",
    "generate:blank": "ts-node example/generate-example.ts --blank",
    "build:docx": "ts-node tools/build-docx-template.ts"
  }
}
```

---

## 5. Implementing the PDF generator

### 5.1 TypeScript data types

Start with the data shape. Every field in the JSON gets a strict TypeScript interface in `src/types.ts`. This is the single source of truth: every page component reads it, and every change ripples out through the type system.

```ts
// src/types.ts
export interface NgayKy {
  ngay: string;
  thang: string;
  nam: string;
}

export interface TrangBia {
  ten_sang_kien: string;
  tac_gia: string;
  don_vi: string;
  thong_tin_lien_he: string;
  nam: string;
}

export interface Mau01ApplyRow {
  tt: string;
  ten_to_chuc: string;
  dia_chi: string;
  linh_vuc: string;
}

export interface Mau01HieuQua {
  loi_ich_kinh_te: string;
  hieu_qua_giang_day: string;
  // … 8 more fields
}

export interface Mau01 {
  mo_dau: string;
  ten_sang_kien: string;
  // …
  danh_sach_ap_dung: Mau01ApplyRow[];
  tinh_hieu_qua: Mau01HieuQua;
  ngay_ky: NgayKy;
  // …
}

// … repeat for Mau02, Mau03, Mau04, BanCamKet

export interface SangKienData {
  trang_bia: TrangBia;
  mau_01: Mau01;
  mau_02: Mau02;
  mau_03: Mau03;
  mau_04: Mau04;
  ban_cam_ket: BanCamKet;
}
```

Two design choices worth calling out:

**All fields are strings (or string arrays).** Even numbers like "Tỷ lệ %" are strings. The form is for humans, not databases: values are rendered verbatim, and string-only types let users write `"15%"` or `"khoảng 15"` without coercion errors.

**Array-shaped tables.** `danh_sach_tac_gia` is `Mau02AuthorRow[]`, not a fixed-size tuple. The page components iterate with `.map()`, and the DOCX template uses a `{%tr for %}` loop. Both handle 0, 1, or 100 rows.

### 5.2 Font registration

`@react-pdf/renderer` ships with three fonts (Helvetica, Times-Roman, Courier) and **none of them include Vietnamese glyphs**. If you skip this step, characters like `ư ơ ầ ậ` will render as blank space.

```ts
// src/fonts.ts
import { Font } from "@react-pdf/renderer";

// Guards against registering the same family twice.
let registered = false;

export function registerFonts(): void {
  if (registered) return;

  // Resolve the absolute paths of the four Tinos TTF files inside node_modules.
  const regular = require.resolve(
    "@expo-google-fonts/tinos/400Regular/Tinos_400Regular.ttf"
  );
  const italic = require.resolve(
    "@expo-google-fonts/tinos/400Regular_Italic/Tinos_400Regular_Italic.ttf"
  );
  const bold = require.resolve(
    "@expo-google-fonts/tinos/700Bold/Tinos_700Bold.ttf"
  );
  const boldItalic = require.resolve(
    "@expo-google-fonts/tinos/700Bold_Italic/Tinos_700Bold_Italic.ttf"
  );

  // One family name, four weight/style variants.
  Font.register({
    family: "TimesVN",
    fonts: [
      { src: regular },
      { src: italic, fontStyle: "italic" },
      { src: bold, fontWeight: "bold" },
      { src: boldItalic, fontWeight: "bold", fontStyle: "italic" },
    ],
  });

  // Return each word whole so it is never hyphenated.
  Font.registerHyphenationCallback((word) => [word]);
  registered = true;
}
```

Three things happen here:

1. **`require.resolve()` finds the TTF on disk.** This works in Node. In a browser bundle (Webpack, Vite) you would typically import the font file as an asset URL instead; the exact mechanism depends on the bundler configuration.
2. **One family, four variants.** The `fontWeight` and `fontStyle` keys let `<Text style={{ fontWeight: "bold" }}>` resolve to the bold TTF.
3. **The hyphenation callback returns `[word]`.** This disables React-PDF's default hyphenation, which would otherwise break Vietnamese words at arbitrary points.

The `registered` boolean guards against re-registration if `registerFonts()` is called from multiple entry points.

### 5.3 Shared styles

`StyleSheet.create()` in `src/styles.ts` defines reusable style objects. Four groups matter:

**Page-level constants.** A4 with ~2.5 cm margins:

```ts
page: {
  fontFamily: FONT,   // "TimesVN"
  fontSize: 13,       // 13pt body
  paddingTop: 71,     // ~2.5cm = 71pt
  paddingBottom: 71,
  paddingLeft: 71,
  paddingRight: 71,
  lineHeight: 1.25,
},
```

**Paragraph variants** for the three contexts that come up:

```ts
// Indented body text (justified, first-line indent ~1cm)
paragraph: { textAlign: "justify", textIndent: 28, marginBottom: 0 },

// Flush-left lines (section labels, inline list items)
paragraphFlush: { textAlign: "justify", marginBottom: 0 },

// Section headings (flush-left, with breathing room above)
sectionHead: { textAlign: "justify", marginBottom: 0, marginTop: 4 },
```

The `marginBottom: 0` is deliberate: Vietnamese government documents are visually dense, so spacing is added only between sections, not between adjacent paragraphs.

**Component primitives** (table, checkbox, signature columns):

```ts
table: {
  flexDirection: "column",
  borderWidth: 1, borderColor: "#000",
  borderRightWidth: 0, borderBottomWidth: 0, // we draw R+B per-cell
  marginVertical: 4,
},
tableCell: {
  borderRightWidth: 1, borderBottomWidth: 1, borderColor: "#000",
  padding: 4,
},
```

The table draws only its top and left edges, and each cell draws its own right and bottom edges. This avoids double-thickness lines where cells meet.

**Cover-specific styles** are isolated in their own group because the cover page has unique requirements (page border via `position: absolute`, "Mẫu số 01" badge in the top corner).

### 5.4 Reusable components

`src/components.tsx` factors out the patterns that show up on multiple pages:

**`<Checkbox checked={boolean}>label</Checkbox>`**: a horizontal row with a bordered square. When `checked`, an inner filled `<View>` appears inside it. We don't use the Unicode `☑` character because Tinos doesn't include it; drawing geometry is font-independent.

```tsx
export const Checkbox: React.FC<CheckboxProps> = ({ checked, children }) => (
  <View style={styles.checkboxRow}>
    <View style={styles.checkboxBox}>
      {checked ? <View style={styles.checkboxFill} /> : null}
    </View>
    <Text style={styles.checkboxLabel}>{children}</Text>
  </View>
);
```

**Header variants.** Three different header patterns appear in the document:

- `<TopHeaderBoYTe />`: "BỘ Y TẾ / ĐẠI HỌC Y DƯỢC" left, "CỘNG HÒA…" right (Mẫu 03/04)
- `<TopHeaderDonVi donVi="..." />`: drops "BỘ Y TẾ", shows the unit name in bold (Mẫu 02)
- `<TopHeaderCongHoa />`: only the right column (Bản cam kết)

Each one uses the same `flexDirection: "row"` layout with two equal columns; they differ only in which lines appear.

**Table primitives.**

```tsx
<Table columns={[6, 22, 14, 16, 14, 14, 14]}>
  <Row>
    <Cell width={6} header align="center">STT</Cell>
    <Cell width={22} header align="center">Họ và tên</Cell>
    {/* … */}
  </Row>
  {data.danh_sach_tac_gia.map((row, i) => (
    <Row key={i}>
      <Cell width={6} align="center">{row.stt}</Cell>
      <Cell width={22}>{row.ho_ten}</Cell>
      {/* … */}
    </Row>
  ))}
</Table>
```

The `width` prop is a **percentage** (the cell renders with `width: ${width}%`). Column widths must sum to 100. The `Cell` component automatically wraps string children in `<Text>` so callers can pass either plain text or nested elements.

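Because a wrong total only shows up as a subtly misaligned table, it can help to check the sum before rendering. A minimal sketch, assuming a hypothetical `assertColumnsSumTo100` helper that is not part of the components described above:

```ts
// Hypothetical guard: fail early when percentage column widths do not total 100.
function assertColumnsSumTo100(columns: number[]): void {
  const total = columns.reduce((sum, w) => sum + w, 0);
  // Tolerate tiny floating-point error for fractional widths such as 33.3.
  if (Math.abs(total - 100) > 0.01) {
    throw new Error(`Column widths sum to ${total}, expected 100`);
  }
}

// The author-table widths used above: 6 + 22 + 14 + 16 + 14 + 14 + 14 = 100.
assertColumnsSumTo100([6, 22, 14, 16, 14, 14, 14]);
```

Such a check could run inside `Table` during development, or in a unit test over every `columns` array.
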
**`<DateLine ngay thang nam />`** renders the recurring "TP. Hồ Chí Minh, ngày … tháng … năm …" line, falling back to dotted placeholders (`.....`) when a value is blank.

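The string logic behind such a component might look like the sketch below. `formatDateLine` and `BLANK` are hypothetical names used for illustration; the real component returns React-PDF elements rather than a plain string, and its exact blank-handling rules are an assumption here.

```ts
// Same shape as NgayKy in src/types.ts.
interface NgayKy {
  ngay: string;
  thang: string;
  nam: string;
}

// Dotted placeholder printed when a value is missing (assumed rule: empty or whitespace-only).
const BLANK = ".....";

function formatDateLine(d: NgayKy, place: string = "TP. Hồ Chí Minh"): string {
  const orBlank = (s: string): string => (s.trim() === "" ? BLANK : s);
  return `${place}, ngày ${orBlank(d.ngay)} tháng ${orBlank(d.thang)} năm ${orBlank(d.nam)}`;
}

console.log(formatDateLine({ ngay: "", thang: "", nam: "" }));
// TP. Hồ Chí Minh, ngày ..... tháng ..... năm .....
```
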
**`<SignatureBlock title subtitle name>`** renders one column of a two-column signature block (centered title, italic subtitle, then a 50pt vertical gap before the bold signer's name).

### 5.5 Page components

Each section of the form gets its own component file in `src/pages/`. They all follow the same shape:

```tsx
// src/pages/Mau01.tsx
import React from "react"; // required by the classic JSX transform and React.FC
import { Page, View, Text } from "@react-pdf/renderer";
import { styles } from "../styles";
import { Mau01 } from "../types";
import { Table, Row, Cell, DateLine } from "../components";

interface Props {
  data: Mau01;
  donVi: string; // pulled from mau_02.don_vi by the parent
}

export const Mau01Page: React.FC<Props> = ({ data, donVi }) => (
  <Page size="A4" style={styles.page}>
    <Text style={styles.centerTitleLarge}>BÁO CÁO MÔ TẢ SÁNG KIẾN</Text>

    <Text style={styles.paragraphFlush}>
      1. Mở đầu{" "}
      <Text style={styles.italic}>
        (Giới thiệu về những vấn đề liên quan đến sáng kiến…):
      </Text>
    </Text>
    <Text style={styles.paragraph}>{data.mo_dau}</Text>

    {/* … rest of the page */}
  </Page>
);
```

Three patterns recur in every page:

1. **Differently styled runs in the same `<Text>`.** A section label like "1. Mở đầu" and the italic instructional helper after it are both static, but they need different styles on the same line; the data value follows in its own paragraph. We use nested `<Text>` to style runs separately within one paragraph (a `<Text>` in React-PDF can contain other `<Text>` nodes, like `<span>` in HTML).

2. **`{" "}` for explicit whitespace.** JSX drops whitespace that spans a line break between text and an element. To preserve a space between a label and an italic helper, we explicitly insert `{" "}`.

3. **Default-empty rows for tables.** When `data.danh_sach_ap_dung` is empty, we still want one blank row to render so the printed form has a place to write. The pattern:

   ```tsx
   {(data.danh_sach_ap_dung && data.danh_sach_ap_dung.length > 0
     ? data.danh_sach_ap_dung
     : [{ tt: "", ten_to_chuc: "", dia_chi: "", linh_vuc: "" }]
   ).map((row, i) => /* ... */)}
   ```

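If the same fallback is needed for several tables, the expression above can be factored into a small generic helper. This is a sketch only; `rowsOrBlank` is a hypothetical name, not something the project is known to define.

```ts
// Same shape as Mau01ApplyRow in src/types.ts.
interface Mau01ApplyRow {
  tt: string;
  ten_to_chuc: string;
  dia_chi: string;
  linh_vuc: string;
}

// Return the rows unchanged, or a single blank row when there are none,
// so the printed table always has at least one line to write on.
function rowsOrBlank<T>(rows: T[] | null | undefined, blank: T): T[] {
  return rows && rows.length > 0 ? rows : [blank];
}

const blankApplyRow: Mau01ApplyRow = { tt: "", ten_to_chuc: "", dia_chi: "", linh_vuc: "" };

console.log(rowsOrBlank<Mau01ApplyRow>([], blankApplyRow).length); // 1
```

In a page component this would read `rowsOrBlank(data.danh_sach_ap_dung, blankApplyRow).map(...)`.
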
**Signature block on Mẫu 01 takes `donVi` as a prop**, not from `data` directly. The reason: the standard layout uses the unit name from Mẫu 02 (`mau_02.don_vi`) on Mẫu 01's signature line. Rather than duplicate the value in the JSON, the parent component (`SangKienDocument`) reads it from `mau_02` and passes it down.

**Cover page is special.** It uses absolute positioning to put the page border around the entire content area:

```tsx
<Page size="A4" style={styles.page}>
  <Text style={styles.formNumberOnCover}>Mẫu số 01</Text>
  <View style={styles.coverBorder} fixed />
  <View style={styles.coverContent}>
    {/* header, title, fields, footer */}
  </View>
</Page>
```

The `fixed` prop tells React-PDF to repeat the border on every physical page this `<Page>` produces if its content wraps (irrelevant here since the cover is one page, but harmless), and `position: absolute` (set in `styles.coverBorder`) makes it overlay the whole page.

### 5.6 Top-level Document

`src/SangKienDocument.tsx` composes all six pages:

```tsx
export const SangKienDocument: React.FC<{ data: SangKienData }> = ({ data }) => {
  registerFonts();
  // Mẫu 01's signature line uses the unit name from Mẫu 02, falling back to the cover.
  const donVi = data.mau_02.don_vi || data.trang_bia.don_vi;

  return (
    <Document
      title={data.trang_bia.ten_sang_kien || "Báo cáo mô tả sáng kiến"}
      author={data.trang_bia.tac_gia}
    >
      <CoverPage data={data.trang_bia} />
      <Mau01Page data={data.mau_01} donVi={donVi} />
      <Mau02Page data={data.mau_02} />
      <Mau03Page data={data.mau_03} />
      <Mau04Page data={data.mau_04} />
      <BanCamKetPage data={data.ban_cam_ket} />
    </Document>
  );
};
```

`registerFonts()` is idempotent (the internal `registered` flag guards against duplicate registration), so calling it from the top-level component is safe.

The `<Document>` element accepts metadata (`title`, `author`, `subject`, `creator`, `producer`, `keywords`) that shows up in the PDF viewer's title bar and document properties. These don't affect rendering, just file properties.

### 5.7 Server-side render helper

`src/generate.tsx` wraps the React rendering in a Node-friendly Promise:

```tsx
import React from "react";
import * as fs from "fs";
import * as path from "path";
import { pdf } from "@react-pdf/renderer";
import { SangKienDocument } from "./SangKienDocument";
import { SangKienData } from "./types";

// Render the whole document to an in-memory PDF buffer.
export async function renderSangKienPdf(data: SangKienData): Promise<Buffer> {
  const instance = pdf(<SangKienDocument data={data} />);
  const blob = await instance.toBlob();
  const arrayBuffer = await blob.arrayBuffer();
  return Buffer.from(arrayBuffer);
}

// Read a JSON file, render it, and write the PDF (creating the output folder if needed).
export async function renderSangKienPdfFromFile(
  inputJsonPath: string,
  outputPdfPath: string
): Promise<void> {
  const data = JSON.parse(fs.readFileSync(inputJsonPath, "utf-8")) as SangKienData;
  const buffer = await renderSangKienPdf(data);
  fs.mkdirSync(path.dirname(outputPdfPath), { recursive: true });
  fs.writeFileSync(outputPdfPath, buffer);
}
```

`pdf(...).toBlob()` is a convenient async API on the server as well as in the browser (it relies on a global `Blob`, which Node provides from version 18), and the `Buffer.from(await blob.arrayBuffer())` conversion is one line.

`example/generate-example.ts` is a thin CLI on top:

```ts
import * as path from "path";
import { renderSangKienPdfFromFile } from "../src/generate";

// Wrapped in main() because top-level await is not allowed with "module": "commonjs".
async function main(): Promise<void> {
  const useBlank = process.argv.includes("--blank");
  const inputPath = useBlank
    ? path.join(__dirname, "data-blank.json")
    : path.join(__dirname, "sample-data.json");
  const outputPath = path.join(__dirname, "..", "out", `sang-kien-${useBlank ? "blank" : "filled"}.pdf`);
  await renderSangKienPdfFromFile(inputPath, outputPath);
}

main().catch((err) => { console.error(err); process.exit(1); });
```

---

## 6. Implementing the DOCX template generator

### 6.1 The Jinja-in-DOCX strategy

`docxtpl` works by storing Jinja-style strings *as ordinary text* inside the DOCX, then doing template expansion at render time. The build script's job is to produce a `.docx` whose visible text reads:

> **Tên sáng kiến (Tiếng Việt):** {{ trang_bia.ten_sang_kien }}

When you open this in Word, you literally see those curly braces. When `docxtpl` opens it, it walks the OOXML tree, finds runs containing `{{ ... }}`, and replaces them.

**The catch: text runs split across formatting changes.** If you write `Tên sáng kiến (Tiếng Việt): {{ trang_bia.ten_sang_kien }}` in one run, that's fine. If you bold "Tên sáng kiến" and leave `{{ … }}` regular, Word stores them as **two separate runs**, which is still fine because the whole placeholder sits in the second run. But if a placeholder is split *inside* the curly braces (`{{ trang_bia.` in one run, `ten_sang_kien }}` in another), `docxtpl` may fail silently. So:

> **Rule:** every placeholder must live entirely inside one continuous run with one set of formatting.

The `docx` library makes this easy: when you write `r("{{ mau_01.mo_dau }}")`, that's exactly one `<w:r>` element with one `<w:t>` inside.

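The rule can be checked mechanically on the text of each run before the template is packed. The sketch below assumes you can list the run texts of a paragraph as plain strings; `findBrokenRuns` is a hypothetical helper, not part of `docx` or `docxtpl`.

```ts
// Hypothetical check: return the indexes of runs that contain only part of a
// Jinja tag, i.e. an opening or closing delimiter without its partner.
function findBrokenRuns(runs: string[]): number[] {
  const broken: number[] = [];
  runs.forEach((text, i) => {
    // Remove every complete {{ … }} or {% … %} tag, then look for stray delimiters.
    const leftover = text.replace(/\{\{.*?\}\}|\{%.*?%\}/g, "");
    if (/\{\{|\}\}|\{%|%\}/.test(leftover)) {
      broken.push(i);
    }
  });
  return broken;
}

// Label and placeholder in separate runs: fine, the placeholder is whole.
console.log(findBrokenRuns(["Tên sáng kiến (Tiếng Việt): ", "{{ trang_bia.ten_sang_kien }}"])); // []

// Placeholder split inside the braces: both runs are flagged.
console.log(findBrokenRuns(["{{ trang_bia.", "ten_sang_kien }}"])); // [ 0, 1 ]
```
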
### 6.2 The 3-row table loop trick

For repeating table rows, `docxtpl` uses a special syntax: `{%tr for item in collection %}` and `{%tr endfor %}`. The `tr` prefix tells the engine "remove the entire `<w:tr>` row containing this tag and use the rows between `for` and `endfor` as the loop body."

A naive single-row pattern doesn't work:

```
[ {%tr for x in items %} {{ x.id }} | {{ x.name }} {%tr endfor %} ]
```

It fails because each `{%tr … %}` tag takes the **entire row** it sits in with it. With both tags in the same row as the data cells, the cells are stripped together with the tags, and the loop is left with no row to repeat.

The reliable pattern is **three rows**:

```
Row 1: | {%tr for item in collection %} | (empty cells)     |
Row 2: | {{ item.id }}                  | {{ item.name }}   |  ← duplicated per item
Row 3: | {%tr endfor %}                 | (empty cells)     |
```

Row 1 and Row 3 get stripped. Row 2 gets repeated for each item. The data row carries the actual `{{ }}` fields.

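To make the row-level behaviour concrete, here is a toy model of the expansion. It is emphatically not how `docxtpl` is implemented (that works on OOXML and hands the result to Jinja); `expandLoop` is a hypothetical function that only mimics the visible effect on a table represented as arrays of cell strings.

```ts
type Row = string[];

// Toy model: rows whose first cell holds a {%tr for … %} or {%tr endfor %} tag
// are dropped; the rows between them are emitted once per item, with
// {{ item.key }} replaced by the item's value.
function expandLoop(rows: Row[], items: Array<Record<string, string>>): Row[] {
  const out: Row[] = [];
  let body: Row[] | null = null; // non-null while inside a loop
  for (const row of rows) {
    const first = row[0] ?? "";
    if (/\{%tr\s+for\b/.test(first)) {
      body = [];
    } else if (body !== null && /\{%tr\s+endfor\s*%\}/.test(first)) {
      for (const item of items) {
        for (const tpl of body) {
          out.push(
            tpl.map((cell) =>
              cell.replace(/\{\{\s*item\.(\w+)\s*\}\}/g, (_m, key: string) => item[key] ?? "")
            )
          );
        }
      }
      body = null;
    } else if (body !== null) {
      body.push(row);
    } else {
      out.push(row); // ordinary row outside any loop, e.g. the header
    }
  }
  return out;
}

const template: Row[] = [
  ["STT", "Họ và tên"],
  ["{%tr for item in collection %}", " "],
  ["{{ item.stt }}", "{{ item.ho_ten }}"],
  ["{%tr endfor %}", " "],
];
const result = expandLoop(template, [
  { stt: "1", ho_ten: "A" },
  { stt: "2", ho_ten: "B" },
]);
console.log(result.length); // 3: one header row plus one data row per item
```
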
In code:

```ts
const aw = [6, 22, 14, 16, 14, 14, 14]; // column widths

// Builds the cells of a "tag row": loop tag in the first cell, blanks elsewhere.
// (allThinBorders, dataCell and r are helpers defined elsewhere in the build script.)
const emptyRow_aw = (firstText: string) => {
  const cells: TableCell[] = [];
  for (let i = 0; i < aw.length; i++) {
    cells.push(new TableCell({
      borders: allThinBorders,
      width: { size: aw[i] * 100, type: WidthType.PERCENTAGE },
      children: [new Paragraph({ children: [r(i === 0 ? firstText : " ")] })],
    }));
  }
  return cells;
};

new Table({
  rows: [
    new TableRow({ children: [/* header cells */] }),
    new TableRow({ children: emptyRow_aw("{%tr for item in mau_02.danh_sach_tac_gia %}") }),
    new TableRow({ children: [
      dataCell("{{ item.stt }}", aw[0], AlignmentType.CENTER),
      dataCell("{{ item.ho_ten }}", aw[1]),
      // … 5 more
    ]}),
    new TableRow({ children: emptyRow_aw("{%tr endfor %}") }),
  ],
});
```

The `emptyRow_aw` helper builds a row where the first cell contains the loop tag and the rest are blanks (just `" "`). After `docxtpl` strips the two tag rows, the visible table has one header row plus one data row per item.

### 6.3 Multi-section layout

Word documents are split into **sections**, each with its own page settings: margins, orientation, page borders, headers, footers. The cover page needs:

- A **page border** (rounded rectangle around the content area)
- A **header** containing "Mẫu số 01" at the top right *outside* the border

The rest of the document needs:

- **No** page border
- **No** "Mẫu số 01" header (it's only on the cover)

In `docx` v9, this is two sections in the same document:

```ts
new Document({
  sections: [
    {
      properties: {
        page: {
          size: { width: 11906, height: 16838, orientation: PageOrientation.PORTRAIT },
          margin: { top: 1440, right: 1440, bottom: 1440, left: 1440 },
          borders: {
            pageBorderTop: { style: BorderStyle.SINGLE, size: 12, color: "000000", space: 24 },
            pageBorderBottom: { style: BorderStyle.SINGLE, size: 12, color: "000000", space: 24 },
            pageBorderLeft: { style: BorderStyle.SINGLE, size: 12, color: "000000", space: 24 },
            pageBorderRight: { style: BorderStyle.SINGLE, size: 12, color: "000000", space: 24 },
          },
        },
      },
      headers: { default: coverHeader }, // contains "Mẫu số 01"
      children: buildCoverPage(),
    },
    {
      properties: {
        page: { size: {/*…*/}, margin: {/*…*/} /* no borders */ },
      },
      // Explicit empty header so the cover header doesn't leak onto subsequent pages
      headers: { default: new Header({ children: [new Paragraph({ children: [r("")] })] }) },
      children: [
        ...buildMau01(),
        ...buildMau02(),
        ...buildMau03(),
        ...buildMau04(),
        ...buildBanCamKet(),
      ],
    },
  ],
});
```

Two gotchas worth noting:

**Twips, not points.** `docx` uses twips (1/1440 inch). Multiply pt by 20 to get twips:

- A4 = 11906 × 16838 twips
- 1 inch margin = 1440 twips
- 1 cm = 567 twips

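The conversions in the list above follow from 72 pt per inch and 2.54 cm per inch. A few small helpers avoid hard-coded magic numbers; the function names are illustrative and not taken from the build script.

```ts
const TWIPS_PER_POINT = 20;
const POINTS_PER_INCH = 72;
const CM_PER_INCH = 2.54;

// Points to twips (1 pt = 20 twips).
const ptToTwips = (pt: number): number => Math.round(pt * TWIPS_PER_POINT);

// Centimetres to points (not rounded, so callers choose the precision).
const cmToPt = (cm: number): number => (cm / CM_PER_INCH) * POINTS_PER_INCH;

// Centimetres to twips, rounded to the nearest whole twip.
const cmToTwips = (cm: number): number => Math.round(cmToPt(cm) * TWIPS_PER_POINT);

console.log(ptToTwips(72));                  // 1440 (one inch)
console.log(cmToTwips(1));                   // 567
console.log(cmToTwips(21), cmToTwips(29.7)); // 11906 16838 (A4)
```
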
**Headers leak across sections.** If section 2 doesn't define `headers`, it inherits section 1's. We have to provide an explicit empty `Header` to prevent the "Mẫu số 01" text from showing up on every page of the document.

### 6.4 Building paragraphs and tables

The build script defines small helper functions to keep the body code readable:

```ts
import { AlignmentType, Paragraph, TextRun, UnderlineType } from "docx";

const FONT = "Times New Roman";
const SIZE = 26;          // 13pt (docx-js uses half-points)
const SIZE_HEADING = 28;  // 14pt

// One text run with the document font; every placeholder goes through here.
function r(text: string, opts: { bold?: boolean; italic?: boolean; underline?: boolean; size?: number } = {}) {
  return new TextRun({
    text,
    font: FONT,
    size: opts.size ?? SIZE,
    bold: opts.bold,
    italics: opts.italic,
    underline: opts.underline ? { type: UnderlineType.SINGLE } : undefined,
  });
}

// Justified body paragraph, optionally with a 1 cm first-line indent.
function bodyP(children: TextRun[], opts: { indent?: boolean } = {}) {
  return new Paragraph({
    children,
    alignment: AlignmentType.JUSTIFIED,
    indent: opts.indent ? { firstLine: 567 } : undefined,
    spacing: { before: 0, after: 0, line: 300 },
  });
}

// Justified paragraph without indent (section labels, list-like lines).
function flushP(children: TextRun[], opts: { spaceBefore?: number } = {}) {
  return new Paragraph({
    children,
    alignment: AlignmentType.JUSTIFIED,
    spacing: { before: opts.spaceBefore ?? 0, after: 0, line: 300 },
  });
}

// Centered paragraph (titles, signature lines).
function centerP(children: TextRun[], opts: { spaceBefore?: number; spaceAfter?: number } = {}) {
  return new Paragraph({
    children,
    alignment: AlignmentType.CENTER,
    spacing: { before: opts.spaceBefore ?? 0, after: opts.spaceAfter ?? 0, line: 300 },
  });
}
```

A typical section then reads naturally:

```ts
out.push(centerP([r("BÁO CÁO MÔ TẢ SÁNG KIẾN", { bold: true, size: SIZE_HEADING })]));

out.push(flushP([
  r("1. Mở đầu "),
  r("(Giới thiệu về những vấn đề liên quan…):", { italic: true }),
]));
out.push(bodyP([r("{{ mau_01.mo_dau }}")], { indent: true }));
```

For checkboxes, since the templating engine has to choose which character to render, we embed the choice in the placeholder itself:

```ts
// Each Jinja tag sits in its own run, so no tag is ever split across runs.
const checkbox = (cond: string, label: string) =>
  flushP([
    r(`{% if ${cond} %}`),
    r("☑"),
    r("{% else %}"),
    r("☐"),
    r("{% endif %} "),
    r(label),
  ]);

out.push(checkbox(
  "mau_02.phan_loai.giai_phap_ky_thuat",
  "Giải pháp kỹ thuật, quản lý, tác nghiệp, ứng dụng tiến bộ kỹ thuật áp dụng cho Đại học Y Dược TP.HCM"
));
```

After `docxtpl` runs, this paragraph reduces to `☑ Giải pháp kỹ thuật…` or `☐ Giải pháp kỹ thuật…` depending on the boolean. (For DOCX rendering in Word, the `☑/☐` characters work fine because Word falls back to a Unicode-capable font automatically, unlike React-PDF.)

---

## 7. Layout calibration (matching the standard)
|
||||
|
||||
The "Sang_kien_SOP_dong_vat" reference document defines a specific visual style. Here's a checklist of the calibrations applied to both generators:
|
||||
|
||||
| Aspect | Rule | Where it lives |
|---|---|---|
| Body font | Times New Roman (or Tinos) 13pt | `styles.page.fontSize`, `r()` `SIZE = 26` (half-points) |
| Page margins | ~2.5 cm all around | `padding: 71` (PDF, 71 pt ≈ 2.5 cm), `margin: 1440` (DOCX, 1440 twips = 1 inch = 2.54 cm) |
| Body line height | 1.25 | `lineHeight: 1.25` (PDF), `line: 300` (DOCX; 240 = single, so 300 = 1.25×) |
| First-line indent | ~1 cm on body paragraphs | `textIndent: 28` (PDF), `firstLine: 567` (DOCX) |
| Section numbers (`1.`, `2.`, `4.1`) | **NOT bold**; italic instructions in parens | Use `paragraphFlush` not bold |
| Inter-paragraph spacing | None within a section, small gap before new section | `marginBottom: 0`, `sectionHead.marginTop: 4` |
| Cover page | Page border (rounded rect), "Mẫu số 01" outside top-right | Cover-specific styles, dedicated section in DOCX |
| Cover divider | `=====***=====` (literal) | Hardcoded string |
| Cover info fields | Left-aligned, **bold label**, regular value | `coverField` style |
| Two-column header | "ĐƠN VỊ" or "BỘ Y TẾ" left, "CỘNG HÒA" right | `TopHeaderBoYTe`, `TopHeaderDonVi`, `TopHeaderCongHoa` |
| "Độc lập – Tự do – Hạnh phúc" | Underlined, bold | `underline: true` flag in `r()`/styles |
| Tables | Single thin black border, no shaded header | `borderWidth: 1`, no `backgroundColor` on `tableHeaderCell` |
| Mẫu 02 author table column 7 | Header includes parenthetical italic instruction | Custom `TableCell` with two centered paragraphs |
| Signature block | Two columns: "Xác nhận của lãnh đạo / [đơn vị]" left, "Đại diện nhóm tác giả sáng kiến" right | `<View style={signatureRow}>` (PDF), borderless 2-cell table (DOCX) |
| Mẫu 03 totals row | TỔNG (cols 1–3 merged) ‖ 100 ‖ blank | `columnSpan: 3` in DOCX, manual width sum in PDF |
| Mẫu 04 evaluation rubric | Two scoring rows + total row at bottom | Static text + `{{ … }}` for nhận xét/điểm |
|
||||
|
||||
When in doubt about a layout decision, open the reference DOCX in Word, click into the relevant element, and read its formatting from the ribbon. Mirror those settings in code.
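The two generators use different units for the same measurement: points on the PDF side, and twips (twentieths of a point) and half-points on the DOCX side. A minimal sketch of the conversions follows; the helper names are illustrative and are not part of the existing code base.

```typescript
// Unit helpers for mirroring one measurement in both generators.
// All names here are hypothetical; adapt them to your own module layout.

const PT_PER_CM = 72 / 2.54; // PDF points per centimetre (72 pt = 1 inch = 2.54 cm)
const TWIPS_PER_PT = 20;     // DOCX twips per point

/** Centimetres to PDF points, rounded to a whole point. */
const cmToPt = (cm: number): number => Math.round(cm * PT_PER_CM);

/** Centimetres to DOCX twips, rounded to a whole twip. */
const cmToTwips = (cm: number): number =>
  Math.round(cm * PT_PER_CM * TWIPS_PER_PT);

/** Font size in points to DOCX half-points (13 pt becomes 26). */
const ptToHalfPoints = (pt: number): number => pt * 2;

/** Line-height multiple to the DOCX `line` value, where 240 means single spacing. */
const lineMultipleToDocx = (multiple: number): number =>
  Math.round(multiple * 240);
```

With these, `cmToPt(2.5)` gives 71 and `cmToTwips(1)` gives 567, matching the values in the table. `cmToTwips(2.5)` gives 1417, slightly less than the 1440 used for the DOCX margin, because 1440 twips is exactly one inch (2.54 cm).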
|
||||
|
||||
---
|
||||
|
||||
## 8. Verification workflow
|
||||
|
||||
Visual diff against the reference is the only reliable way to know you got it right. The flow:
|
||||
|
||||
```bash
# 1. Generate the candidate PDF
npm run generate

# 2. Convert each page to JPEG
pdftoppm -jpeg -r 100 out/sang-kien-filled.pdf out/page

# 3. Convert the reference DOCX to PDF and JPEGs the same way
soffice --headless --convert-to pdf reference.docx --outdir ref/
pdftoppm -jpeg -r 100 ref/reference.pdf ref/ref-page

# 4. Open them side by side
```
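For step 4 it can help to pair candidate and reference images by page number before opening them. The sketch below is one hypothetical way to do that; it assumes `pdftoppm` names its output `<prefix>-<number>.jpg` (possibly zero-padded), and the function names are not part of the project.

```typescript
// Pair candidate and reference page images by the page number in the file name.
// Assumption: files end in "-<digits>.jpg", as in out/page-1.jpg or ref/ref-page-01.jpg.

interface PagePair {
  page: number;
  candidate: string;
  reference: string | null; // null when the reference has no such page
}

/** Extracts the trailing page number, or null if the name does not match. */
function pageNumber(file: string): number | null {
  const m = /-(\d+)\.jpg$/.exec(file);
  return m ? Number(m[1]) : null;
}

/** Returns one entry per candidate page, sorted by page number. */
function pairPages(
  candidate: readonly string[],
  reference: readonly string[],
): PagePair[] {
  const refByPage = new Map<number, string>();
  for (const file of reference) {
    const n = pageNumber(file);
    if (n !== null) refByPage.set(n, file);
  }
  const pairs: PagePair[] = [];
  for (const file of candidate) {
    const n = pageNumber(file);
    if (n !== null) {
      pairs.push({ page: n, candidate: file, reference: refByPage.get(n) ?? null });
    }
  }
  return pairs.sort((a, b) => a.page - b.page);
}
```

A pair whose `reference` is `null` means the candidate has more pages than the reference, which is usually the first thing worth investigating.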
|
||||
|
||||
For the DOCX generator, add one more step:
|
||||
|
||||
```bash
# Build the template
npm run build:docx

# Render placeholders WITHOUT filling them — does the layout look right?
soffice --headless --convert-to pdf out/template_application_form.docx --outdir out/

# Fill it with sample data and render
python tools/fill-docx.py example/sample-data.json out/sang-kien-filled.docx
soffice --headless --convert-to pdf out/sang-kien-filled.docx --outdir out/
```
|
||||
|
||||
Smoke test the DOCX template in Python before declaring victory:
|
||||
|
||||
```python
# tools/test-docx-fill.py
from docxtpl import DocxTemplate
import json

# Load the sample data that should fill every placeholder
with open("example/sample-data.json", encoding="utf-8") as f:
    data = json.load(f)

# Render the template and save the filled copy for inspection
doc = DocxTemplate("out/template_application_form.docx")
doc.render(data)
doc.save("out/template-filled-test.docx")
```
|
||||
|
||||
If `docxtpl` raises `TemplateSyntaxError: Encountered unknown tag 'endfor'`, you've put a `{%tr for %}` and `{%tr endfor %}` in the same row instead of separate rows. Go re-read [§6.2](#62-the-3-row-table-loop-trick).
|
||||
|
||||
If a `{{ field }}` doesn't get replaced and you can still see the curly braces in the filled output, the placeholder got split across runs by Word's auto-formatting. Build the placeholder with one `r("{{ x }}")` call, not three.
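Spotting leftover braces by eye is easy to miss in a long form. As a rough sketch, assuming you already have the text of the filled document as a string (for example from the converted PDF, or from the unzipped `word/document.xml`), a scan like the following lists every tag that survived rendering. The function name is hypothetical.

```typescript
/**
 * Returns every Jinja-style tag still present in rendered text:
 * {{ ... }} expressions and {% ... %} statements (including {%tr ... %}).
 */
function findLeftoverTags(text: string): string[] {
  // [\s\S] matches across line breaks and any XML that sits between split runs.
  const pattern = /\{\{[\s\S]*?\}\}|\{%[\s\S]*?%\}/g;
  return text.match(pattern) ?? [];
}
```

An empty result means no placeholder syntax is left in the output; anything else points at the field to rebuild as a single `r()` call.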
|
||||
|
||||
---
|
||||
|
||||
## 9. Common modifications
|
||||
|
||||
### Adding a new field
|
||||
|
||||
Say you need to add `mau_01.tong_kinh_phi` (total budget).
|
||||
|
||||
1. **Update `src/types.ts`:**
   ```ts
   export interface Mau01 {
     // …
     tong_kinh_phi: string; // new
   }
   ```

2. **Update `example/data-blank.json`** and **`example/sample-data.json`** with the new field.

3. **Render it in `src/pages/Mau01.tsx`:**
   ```tsx
   <Text style={styles.paragraphFlush}>
     7. Tổng kinh phí: {data.tong_kinh_phi}
   </Text>
   ```

4. **Add it to the DOCX template generator** in `tools/build-docx-template.ts`:
   ```ts
   out.push(flushP([r("7. Tổng kinh phí: {{ mau_01.tong_kinh_phi }}")]));
   ```

5. **Regenerate:**
   ```bash
   npm run generate
   npm run build:docx
   ```
|
||||
|
||||
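The TypeScript compiler will flag any object typed as `Mau01` that is missing the new field, but only where the data is actually type-checked (JSON read at runtime is not). It cannot tell you that a page component never renders the field, so confirm that the new line appears in the generated output.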
|
||||
|
||||
### Changing a column width
|
||||
|
||||
Column widths are kept as small integer arrays in the page component (PDF) and the build script (DOCX). They must always sum to 100.
|
||||
|
||||
To widen the "Họ và tên" column on the Mẫu 02 author table from 22% to 28% (and shrink "Nơi công tác" from 16% to 10%):
|
||||
|
||||
In `src/pages/Mau02.tsx`:
```ts
const AUTHOR_WIDTHS = [6, 28, 14, 10, 14, 14, 14] as const; // was [6, 22, 14, 16, …]
```

In `tools/build-docx-template.ts` (inside `buildMau02()`):
```ts
const aw = [6, 28, 14, 10, 14, 14, 14];
```
|
||||
|
||||
The two arrays must match. There is no shared constant: both use the same percentage convention (summing to 100), but the PDF and DOCX builders consume them through different code paths, so keeping them in sync is a manual discipline.
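Since nothing enforces the sum-to-100 rule or the match between the two files, a small guard can catch a bad edit early. This is a sketch under the assumption that you are willing to call it next to each width array; the helper names are hypothetical and do not exist in the project.

```typescript
/** Throws if the column widths do not add up to exactly 100. */
function assertWidthsSumTo100(label: string, widths: readonly number[]): void {
  const total = widths.reduce((sum, w) => sum + w, 0);
  if (total !== 100) {
    throw new Error(`${label}: column widths sum to ${total}, expected 100`);
  }
}

/** True when two width arrays have the same length and the same values in order. */
function sameWidths(a: readonly number[], b: readonly number[]): boolean {
  return a.length === b.length && a.every((w, i) => w === b[i]);
}

// Example with the Mẫu 02 author table widths from above.
const AUTHOR_WIDTHS = [6, 28, 14, 10, 14, 14, 14] as const;
assertWidthsSumTo100("AUTHOR_WIDTHS", AUTHOR_WIDTHS);
```

`sameWidths` is only useful if both arrays are visible in one place, for example in a test that imports them from both modules.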
|
||||
|
||||
### Adding a new repeating table
|
||||
|
||||
The data shape, the page component, and the DOCX template all need updates:
|
||||
|
||||
1. **Type:** define `interface Mau01NewRow { … }` and add a `danh_sach_moi: Mau01NewRow[]` field to `Mau01`.

2. **PDF page:** mirror the existing pattern in `src/pages/Mau01.tsx`:
   ```tsx
   <Table columns={[10, 30, 30, 30]}>
     <Row>
       <Cell width={10} header align="center">TT</Cell>
       {/* … */}
     </Row>
     {(data.danh_sach_moi && data.danh_sach_moi.length > 0
       ? data.danh_sach_moi
       : [{ tt: "" /* … */ }]
     ).map((row, i) => (
       <Row key={i}>
         <Cell width={10} align="center">{row.tt}</Cell>
         {/* … */}
       </Row>
     ))}
   </Table>
   ```

3. **DOCX template:** use the 3-row pattern from [§6.2](#62-the-3-row-table-loop-trick):
   ```ts
   const w = [10, 30, 30, 30];
   const emptyRow = (firstText: string) => /* same helper pattern */;

   new Table({
     rows: [
       new TableRow({ children: [headerCell("TT", w[0]), /* … */] }),
       new TableRow({ children: emptyRow("{%tr for item in mau_01.danh_sach_moi %}") }),
       new TableRow({ children: [dataCell("{{ item.tt }}", w[0], AlignmentType.CENTER), /* … */] }),
       new TableRow({ children: emptyRow("{%tr endfor %}") }),
     ],
   });
   ```
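The PDF snippet in step 2 falls back to a single blank row when the array is missing or empty, presumably so the printed form still shows a row to fill in by hand. If several tables need the same fallback, it could be factored out roughly as follows. Both `rowsOrBlank` and the `noi_dung` field are hypothetical names used only for illustration.

```typescript
// Hypothetical row shape; only `tt` appears in the guide, `noi_dung` is made up.
interface Mau01NewRow {
  tt: string;
  noi_dung: string;
}

/** Returns the rows unchanged, or a single blank row when there are none. */
function rowsOrBlank<T>(rows: readonly T[] | undefined, blank: T): readonly T[] {
  return rows && rows.length > 0 ? rows : [blank];
}

const BLANK_NEW_ROW: Mau01NewRow = { tt: "", noi_dung: "" };
```

In the page component the expression would then read `rowsOrBlank(data.danh_sach_moi, BLANK_NEW_ROW).map(…)`.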
|
||||
|
||||
### Switching to your organization's font
|
||||
|
||||
Replace the four TTF paths in `src/fonts.ts`:
|
||||
|
||||
```ts
Font.register({
  family: "TimesVN",
  fonts: [
    { src: "/path/to/your/Regular.ttf" },
    { src: "/path/to/your/Italic.ttf", fontStyle: "italic" },
    { src: "/path/to/your/Bold.ttf", fontWeight: "bold" },
    { src: "/path/to/your/BoldItalic.ttf", fontWeight: "bold", fontStyle: "italic" },
  ],
});
```
|
||||
|
||||
For the DOCX side, change `const FONT = "Times New Roman"` in `tools/build-docx-template.ts` to the name of the font you want the document to request. Word will fall back to a system font if the named font isn't installed on the reader's machine, so prefer common names (Times New Roman, Arial, Calibri).
|
||||
|
||||
---
|
||||
|
||||
## 10. Troubleshooting
|
||||
|
||||
**PDF renders blank squares where Vietnamese characters should be.**
|
||||
The font isn't registered or the registered font lacks Vietnamese glyphs. Check that `registerFonts()` is called and that the TTFs at the resolved paths are actually loaded (not 404 / missing). Tinos has the right glyph coverage; many "Times New Roman clones" don't.
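One way to rule out the missing-file case is to check every font path before registering. The sketch below takes the existence check as a parameter so it stays independent of the runtime; in Node you would pass `fs.existsSync`. The type and function names are hypothetical.

```typescript
// Hypothetical shape mirroring the entries passed to Font.register().
type FontSource = {
  src: string;
  fontWeight?: "bold";
  fontStyle?: "italic";
};

/** Returns the `src` paths for which the supplied existence check fails. */
function missingFontFiles(
  fonts: readonly FontSource[],
  exists: (path: string) => boolean,
): string[] {
  return fonts.map((font) => font.src).filter((src) => !exists(src));
}
```

Calling `missingFontFiles(fonts, fs.existsSync)` before `Font.register()` and throwing when the result is non-empty turns a page of blank squares into an explicit error. It does not detect a font that loads but lacks Vietnamese glyphs.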
|
||||
|
||||
**`Error: Failed to fetch font from https://…`**
|
||||
You're hitting `@react-pdf/renderer`'s URL-based font loading and your environment can't reach the URL. Switch to local TTFs via `require.resolve()` (already what `src/fonts.ts` does).
|
||||
|
||||
**`docxtpl` raises `TemplateSyntaxError: Encountered unknown tag 'endfor'`.**
|
||||
You put the `{%tr for %}` and `{%tr endfor %}` tags in the *same* table row. Re-read [§6.2](#62-the-3-row-table-loop-trick) — they have to be on separate rows.
|
||||
|
||||
**Some `{{ field }}` placeholders aren't being replaced.**
|
||||
Word split your text run mid-placeholder. Make sure each placeholder is constructed with a single `r("{{ x }}")` call, not split across multiple `r()` calls or assembled from concatenated strings.
|
||||
|
||||
**The DOCX has "Mẫu số 01" appearing on every page, not just the cover.**
|
||||
The cover-section header is leaking into the next section. Add an explicit empty header to the second section:
|
||||
```ts
headers: { default: new Header({ children: [new Paragraph({ children: [r("")] })] }) },
```
|
||||
|
||||
**Tables overflow the right margin.**
|
||||
Column width percentages don't sum to exactly 100, or a single cell has too much wide content with no wrap point. Either fix the widths or add `wordBreak: "break-word"` to the cell style.
|
||||
|
||||
**`textIndent` doesn't seem to work in `<Text>`.**
|
||||
React-PDF's `textIndent` only takes effect when the `<Text>` *itself* behaves like a block, that is, when it is a top-level paragraph and not nested inside another `<Text>`. If you're nesting, put the indent style on the outermost `<Text>` rather than on a nested one.
|
||||
|
||||
**The DOCX page border doesn't appear.**
|
||||
Page borders are a Word feature configured in section properties. Check that you've set all four (`pageBorderTop/Bottom/Left/Right`), with non-zero `size` and a `space` value (24 puts them ~1.7cm from the edge in our setup). LibreOffice and Word may render them slightly differently — Word is the canonical view.
|
||||
|
||||
**Filled DOCX has weird extra empty rows above each table.**
|
||||
Those are the `{%tr for %}`/`{%tr endfor %}` rows that didn't get stripped — meaning the loop tags ended up in paragraphs *inside* a cell, not as standalone row text. Make sure the `firstText` in your `emptyRow_*()` helper is the entire cell content, not appended to other text.
|
||||
|
||||
---
|
||||
|
||||
## 11. Porting to a different form
|
||||
|
||||
The same pattern works for any structured government form. The migration steps:
|
||||
|
||||
1. **Extract the data model.** Open the reference DOCX, list every blank line and every table column. Each becomes a field in `types.ts`. Repeating sections (lists of authors, lists of attachments) become arrays.
|
||||
|
||||
2. **Identify the sections.** Most forms have a cover page plus N body sections. Each body section becomes a `<Page>` component plus a `buildSectionN()` function in the DOCX builder.
|
||||
|
||||
3. **Catalog the visual primitives.** Headers, signature blocks, tables, checkboxes, date lines — write them once in `components.tsx` (PDF) and as helper functions (DOCX), then reuse.
|
||||
|
||||
4. **Calibrate the styles.** Open the reference, measure margins, font, line spacing, and indent. Set them as constants. See [§7](#7-layout-calibration-matching-the-standard).
|
||||
|
||||
5. **Render and diff.** Generate, convert to JPEG, line up against the reference. Iterate until they match.
|
||||
|
||||
6. **Smoke-test the DOCX template** with `docxtpl`. If a placeholder doesn't fill, it's almost always run-splitting — fix by collapsing into one `r()` call.
|
||||
|
||||
The most labor-intensive part is the visual calibration (steps 4–5). Everything else is mechanical translation from "what the form looks like" to "code that produces the same thing."
|
||||
|
||||
---
|
||||
|
||||
## Appendix: file-by-file inventory
|
||||
|
||||
| File | Lines | Purpose |
|---|---:|---|
| `src/types.ts` | 177 | TypeScript interfaces matching `data-blank.json` |
| `src/fonts.ts` | 56 | Tinos font registration |
| `src/styles.ts` | 239 | Shared `StyleSheet.create()` styles |
| `src/components.tsx` | 156 | Reusable `<Checkbox>`, `<Table>`, `<DateLine>`, header variants |
| `src/pages/CoverPage.tsx` | 64 | Trang bìa with page border |
| `src/pages/Mau01.tsx` | 172 | Báo cáo mô tả sáng kiến |
| `src/pages/Mau02.tsx` | 206 | Đơn đề nghị công nhận sáng kiến |
| `src/pages/Mau03.tsx` | 82 | Bản xác nhận tỷ lệ đóng góp |
| `src/pages/Mau04.tsx` | 94 | Phiếu đánh giá sáng kiến |
| `src/pages/BanCamKet.tsx` | 119 | Bản cam kết |
| `src/SangKienDocument.tsx` | 43 | Top-level `<Document>` composing all pages |
| `src/generate.tsx` | 37 | `renderSangKienPdf(data)` server-side helper |
| `src/index.ts` | 5 | Public API barrel |
| `tools/build-docx-template.ts` | 1301 | Generates the Jinja-style DOCX template |
| `tools/fill-docx.py` | ~30 | CLI to fill a template with JSON data via `docxtpl` |
| `tools/test-docx-fill.py` | ~25 | Smoke test script |
| `example/generate-example.ts` | ~35 | CLI for the PDF pipeline |
| `example/sample-data.json` | — | Realistic filled-in example |
| `example/data-blank.json` | — | All-empty template instance |
|
||||
|
||||
Total: about **2750 lines** of TypeScript + ~50 lines of Python. The DOCX generator is the largest single file because every static line of body text is an `out.push(flushP([r("…")]))` call, but the pattern is repetitive and easy to skim.
|
||||