sciagent code + Gitea Actions CI/CD

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Thinh Lam
2026-06-30 09:38:30 +07:00
commit 688fac73e9
1167 changed files with 158244 additions and 0 deletions
@@ -0,0 +1,500 @@
# Medical-Imaging 3D Viewer — Reconstruction Spec
A self-contained spec for the **VTK.js quad-view DICOM/NIfTI viewer** + the **organ-mask
overlay** system, written so it can be rebuilt from scratch in another React app. It
captures the architecture, the public API, every VTK module + magic number, the
interaction model, and the hard-won gotchas (each marked **⚠️**).
> Source of truth: `shared/src/components/viewer/` in this repo. This doc is a snapshot
> (2026-06-20) — if it disagrees with the code, the code wins.
---
## 1. What it is
A 4-pane ("quad view") medical image viewer rendered with **VTK.js into a single WebGL
canvas**:
```
┌─────────────┬─────────────┐
│ AXIAL │ CORONAL │ 3 panes = orthogonal 2D slices (reslice of the volume)
│ (2D slice) │ (2D slice) │ 1 pane = 3D volume rendering
├─────────────┼─────────────┤
│ SAGITTAL │ 3D │ + per-organ mask overlays (3D surface + 2D fills)
│ (2D slice) │ (volume) │ + a 5-tool annotation layer on the 2D panes
└─────────────┴─────────────┘
```
It loads a single `.nii`/`.nii.gz` (NIfTI) or a set of `.dcm` (DICOM) files, shows three
orthogonal slices + a 3D volume, supports window/level + opacity, slice-scroll, zoom,
double-click-to-expand, client-side annotations, and colored organ-segmentation overlays.
---
## 2. Tech stack & exact dependencies
| Package | Version | Role |
|---|---|---|
| `@kitware/vtk.js` | `^34.16.2` (built/ran on 34.18) | All rendering. ~960 KB — code-split it (see §4). |
| `nifti-reader-js` | `^0.8.0` | NIfTI header/image parse + gzip decompress. |
| `react` / `react-dom` | `^18.3.1` | Component shell + hooks. **React 18 auto-batches** (matters for the overlay toggle). |
| `@radix-ui/react-dropdown-menu` | `^2.1.15` | Only used by `ViewRotationControls` (optional). |
| `lucide-react` | `^0.462.0` | Icons (host UI only — not core). |
DICOM parsing additionally uses a DICOM lib inside `useDicomData` (e.g. `dicom-parser` /
`cornerstone`-style decode) — out of scope here; the NIfTI path is the reference.
**VTK.js profile imports (MUST be imported once, before any vtk instance):**
```ts
import "@kitware/vtk.js/Rendering/Profiles/Volume"; // volume rendering
import "@kitware/vtk.js/Rendering/Profiles/Geometry"; // surfaces (organ overlays), slices
import "@kitware/vtk.js/Rendering/Misc/RenderingAPIs"; // OpenGL backend
```
---
## 3. File inventory (`shared/src/components/viewer/`)
| File | LOC | Responsibility |
|---|---|---|
| `index.ts` | 14 | The `@ump/shared/viewer` barrel — the **only** public entry. |
| `types.ts` | 92 | `NiftiData`, `OrganMaskData`, `Annotation*` contracts. |
| `niftiLoader.ts` | 121 | `parseNiftiBuffer()` + non-hook `loadNiftiImageData()`. |
| `useNiftiData.ts` | 48 | React hook wrapping `parseNiftiBuffer` for the main image. |
| `useDicomData.ts` | 219 | DICOM equivalent of `useNiftiData`. |
| `UnifiedQuadViewRenderer.tsx` | 189 | Format dispatch (NIfTI vs DICOM) + prop pass-through. |
| `NiftiQuadViewRenderer.tsx` | 1249 | **The core** — quad view, slices, volume, interaction, overlays. |
| `QuadViewRenderer.tsx` | 820 | DICOM equivalent (same structure, DICOM input). |
| `AnnotationOverlay.tsx` | 254 | Per-pane 2D drawing surface (5 tools). |
| `ViewRotationControls.tsx` | 128 | Optional 3D-orientation dropdown. |
The host (not in `shared`): a full-screen dialog that mounts `UnifiedQuadViewRenderer`,
owns the window/level/opacity sliders, the annotation toolbar, and the organ panel.
---
## 4. Packaging / bundle strategy ⚠️
VTK is heavy (~960 KB). Expose the viewer as a **lazy subpath**, never from the main
barrel, so VTK lands in its own async chunk and never bloats a page's initial bundle.
`shared/package.json`:
```json
{ "name": "@ump/shared",
"exports": { ".": "./src/index.ts", "./viewer": "./src/components/viewer/index.ts" } }
```
`viewer/index.ts` (the entire public API):
```ts
export { UnifiedQuadViewRenderer } from './UnifiedQuadViewRenderer';
export type { UnifiedQuadViewRendererProps, FileFormat } from './UnifiedQuadViewRenderer';
export type { Annotation, AnnotationTool, AnnotationPoint } from './types';
export { loadNiftiImageData, parseNiftiBuffer } from './niftiLoader';
export type { OrganMaskData, OrganName } from './types';
```
**Consume it via `React.lazy`:**
```ts
const ViewerDialog = lazy(() => import('./DatasetFileViewerDialog')); // statically imports @ump/shared/viewer
```
**⚠️ Vite/TS alias ORDER:** when `@ump/shared` is aliased to source, the subpath
`@ump/shared/viewer` needs its **own** alias entry listed **before** the bare one (array
form, so prefix-matching picks the longer key first):
```ts
// vite.config.ts
resolve: { alias: [
{ find: '@ump/shared/viewer', replacement: path.resolve(__dirname, '../shared/src/components/viewer/index.ts') },
{ find: '@ump/shared', replacement: path.resolve(__dirname, '../shared/src/index.ts') },
]}
```
```jsonc
// tsconfig.json
"paths": {
"@ump/shared/viewer": ["../shared/src/components/viewer/index.ts"],
"@ump/shared": ["../shared/src/index.ts"]
}
```
**⚠️ Keep the NIfTI loader (`loadNiftiImageData`) and `OrganMaskData` in the `/viewer`
subpath** — importing them from the main barrel would pull VTK into the page's initial bundle.
---
## 5. Data model (`types.ts`, verbatim)
```ts
export interface NiftiData {
header: { dims: number[]; pixDims: number[]; datatype: number; littleEndian: boolean;
voxOffset: number; affine: number[][]; description: string };
imageData: any; // vtkImageData
rawData: ArrayBuffer;
dimensions: [number, number, number]; // [nx, ny, nz]
spacing: [number, number, number]; // [sx, sy, sz]
}
export type OrganName = string;
export interface OrganMaskData {
id?: string; // ⚠️ STABLE unique key (the mask file id). Renderer keys by this.
organName: OrganName; // display label
imageData: any; // vtkImageData of the binary mask (0 = bg, >0 = organ)
color: [number, number, number]; // 0-255 RGB
}
export type AnnotationTool = 'none' | 'bbox' | 'points' | 'pen' | 'brush' | 'polygon';
export interface AnnotationPoint { x: number; y: number } // normalized [0..1] to the pane
export interface Annotation {
id: string; view: 'axial' | 'coronal' | 'sagittal'; sliceIndex: number;
tool: Exclude<AnnotationTool, 'none'>; points: AnnotationPoint[];
color: string; strokeWidth?: number; label?: string;
}
```
---
## 6. Public API — `UnifiedQuadViewRendererProps`
```ts
interface UnifiedQuadViewRendererProps {
files: File[]; // 1 NIfTI File, or N DICOM Files
windowWidth: number; // CT window width (e.g. 400)
windowLevel: number; // CT window level (e.g. 40)
opacity: number; // 3D volume opacity 0..1 (e.g. 0.8)
isLoading?: boolean;
onRotate3D?: (o: ViewOrientation) => void;
// segmentation/MedSAM props (optional, unused unless you wire a backend):
segmentationEnabled?: boolean; boundingBox?; onBoundingBoxChange?; segmentationMask?;
currentSliceIndex?: number; onSliceIndexChange?;
// organ-mask overlays:
organMasks?: OrganMaskData[]; // ← selected organs to overlay (3D + 2D)
// annotations:
annotationTool?: AnnotationTool;
annotations?: Annotation[];
onAnnotationsChange?: (a: Annotation[]) => void;
}
```
`UnifiedQuadViewRenderer` detects format from file extension (`.nii`/`.nii.gz` → NIfTI,
`.dcm`/`.dicom` → DICOM), runs the matching hook (`useNiftiData`/`useDicomData`), and
renders `NiftiQuadViewRenderer` or `QuadViewRenderer`, forwarding all props.
**⚠️ `organMasks` is forwarded only on the NIfTI path** — DICOM ignores it.
---
## 7. NIfTI loading pipeline (`niftiLoader.ts`)
Pure, non-hook, throws on bad input. Reused by both the main-image hook and the
organ-mask loader (DRY — one parser).
```ts
export function parseNiftiBuffer(input: ArrayBuffer): NiftiData {
let buf = input;
if (nifti.isCompressed(buf)) buf = nifti.decompress(buf) as ArrayBuffer; // .nii.gz
if (!nifti.isNIFTI(buf)) throw new Error('Not a valid NIfTI file');
const header = nifti.readHeader(buf);
const image = nifti.readImage(header, buf);
const [ , nx, ny, nz ] = header.dims;
const sx = Math.abs(header.pixDims[1]) || 1, sy = Math.abs(header.pixDims[2]) || 1,
sz = Math.abs(header.pixDims[3]) || 1;
// datatype → typed array (UINT8/INT8/UINT16/INT16/FLOAT32 direct; FLOAT64/INT32 → Float32; default Float32)
let typed = /* switch(header.datatypeCode) … */;
// scl_slope / scl_inter scaling (skip when slope===1 && inter===0)
const slope = header.scl_slope || 1, inter = header.scl_inter || 0;
const scaled = (slope !== 1 || inter !== 0)
? Float32Array.from(typed, v => v * slope + inter)
: (typed instanceof Float32Array ? typed : new Float32Array(typed));
const imageData = vtkImageData.newInstance();
imageData.setDimensions([nx, ny, nz]);
imageData.setSpacing([sx, sy, sz]); // ⚠️ origin left at (0,0,0)
imageData.getPointData().setScalars(
vtkDataArray.newInstance({ name: 'Scalars', numberOfComponents: 1, values: scaled }));
return { header: {}, imageData, rawData: buf, dimensions: [nx,ny,nz], spacing: [sx,sy,sz] };
}
export async function loadNiftiImageData(file: File) { // for organ masks
return parseNiftiBuffer(await file.arrayBuffer()).imageData;
}
```
`useNiftiData(file)` just wraps this in a `useEffect` with a `lastFileRef` dedupe (key =
`name-size-lastModified`) and `{ niftiData, isLoading, error }` state.
**⚠️ No affine/world transform** is applied — only dims + spacing, origin (0,0,0). Two
volumes (image + mask) only co-register if they share the same grid. See §11.
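The elided datatype switch and the slope/intercept step in `parseNiftiBuffer` can be sketched as two small helpers. This is a minimal sketch, not the repo's code: `toTypedArray` and `applyScaling` are hypothetical names, the numeric codes are the NIfTI-1 datatype codes, and the typed-array views assume the file's byte order matches the host (little-endian on typical hardware).

```ts
// NIfTI-1 datatype codes for the types named in the parser comment above.
const NIFTI_DT = { UINT8: 2, INT16: 4, INT32: 8, FLOAT32: 16, FLOAT64: 64, INT8: 256, UINT16: 512 } as const;

type VoxelArray = Uint8Array | Int8Array | Uint16Array | Int16Array | Float32Array;

// Hypothetical helper: view the raw image bytes as the matching typed array.
// FLOAT64 and INT32 are converted to Float32; unknown codes fall back to Float32.
function toTypedArray(datatypeCode: number, image: ArrayBufferLike): VoxelArray {
  switch (datatypeCode) {
    case NIFTI_DT.UINT8: return new Uint8Array(image);
    case NIFTI_DT.INT8: return new Int8Array(image);
    case NIFTI_DT.UINT16: return new Uint16Array(image);
    case NIFTI_DT.INT16: return new Int16Array(image);
    case NIFTI_DT.FLOAT32: return new Float32Array(image);
    case NIFTI_DT.FLOAT64: return Float32Array.from(new Float64Array(image));
    case NIFTI_DT.INT32: return Float32Array.from(new Int32Array(image));
    default: return new Float32Array(image);
  }
}

// Hypothetical helper: apply scl_slope / scl_inter, skipping the copy when it is a no-op.
// A slope of 0 (or NaN) is treated as "no scaling", matching `header.scl_slope || 1`.
function applyScaling(typed: VoxelArray, sclSlope: number, sclInter: number): Float32Array {
  const slope = sclSlope || 1;
  const inter = sclInter || 0;
  if (slope === 1 && inter === 0) {
    return typed instanceof Float32Array ? typed : new Float32Array(typed);
  }
  return Float32Array.from(typed, (v) => v * slope + inter);
}
```

With a CT-style header (`scl_slope = 1`, `scl_inter = -1024`), stored value `0` becomes `-1024` in the scalars handed to `vtkDataArray`.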
---
## 8. The quad-view rendering core (`NiftiQuadViewRenderer.tsx`)
### 8.1 Scene graph (built once, in an init `useEffect`)
- **1** `vtkRenderWindow` + **1** `vtkOpenGLRenderWindow` (`.setContainer(containerDiv)`,
`.setSize(rect.w, rect.h)` in CSS px).
- **1** `vtkRenderWindowInteractor` (`.setView`, `.initialize`, default style = trackball).
- **4** `vtkRenderer`, each `.setViewport(...)` into a quadrant; backgrounds: 2D panes
`[0,0,0]`, 3D pane `[0.1,0.1,0.15]`. `renderWindow.addRenderer(ren)` for each.
- Per renderer, an absolutely-positioned **HTML `<div>` overlay** (the visible border +
label), appended to the container (see §8.3).
Store everything in a `useRef` "context": `{ renderWindow, renderWindowView, interactor,
renderers[4], containers[4], imageSliceActors[3], slicePlanes[3], volumeActor, ctf, pf,
iStyle, tStyle, organMaskActors: Map }`.
### 8.2 Viewport layouts (normalized `[xmin, ymin, xmax, ymax]`, GL origin bottom-left)
```ts
// 2×2 grid (default): note the 0.01 margin + 0.02 gutter between panes
axial = [0.01, 0.51, 0.49, 0.99] // top-left
coronal = [0.51, 0.51, 0.99, 0.99] // top-right
sagittal = [0.01, 0.01, 0.49, 0.49] // bottom-left
threeD = [0.51, 0.01, 0.99, 0.49] // bottom-right
// Expanded (double-click a pane): main left, 3 stacked right
main = [0.01, 0.01, 0.74, 0.99]
side = [ [0.75,0.67,0.99,0.99], [0.75,0.34,0.99,0.66], [0.75,0.01,0.99,0.33] ]
```
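One way to switch between the two layouts above is a pure function that returns one viewport per renderer. This is a sketch under assumptions: `layoutViewports` is a hypothetical name, panes are indexed 0..3 (axial, coronal, sagittal, 3D), and the three non-expanded panes are assumed to fill the side column top to bottom in index order, which the spec does not state.

```ts
type Viewport = [number, number, number, number]; // [xmin, ymin, xmax, ymax]

// 2x2 grid, indexed 0=axial, 1=coronal, 2=sagittal, 3=3D (values from the spec above).
const GRID_LAYOUT: Viewport[] = [
  [0.01, 0.51, 0.49, 0.99],
  [0.51, 0.51, 0.99, 0.99],
  [0.01, 0.01, 0.49, 0.49],
  [0.51, 0.01, 0.99, 0.49],
];
const EXPANDED_MAIN: Viewport = [0.01, 0.01, 0.74, 0.99];
const EXPANDED_SIDE: Viewport[] = [
  [0.75, 0.67, 0.99, 0.99],
  [0.75, 0.34, 0.99, 0.66],
  [0.75, 0.01, 0.99, 0.33],
];

// Hypothetical helper: viewport per pane index; `null` means the default grid.
function layoutViewports(expandedView: number | null): Viewport[] {
  if (expandedView === null) return GRID_LAYOUT.map((vp) => [...vp] as Viewport);
  let nextSide = 0; // assumed ordering: remaining panes stack top to bottom by index
  return GRID_LAYOUT.map((_, i) =>
    i === expandedView ? ([...EXPANDED_MAIN] as Viewport) : ([...EXPANDED_SIDE[nextSide++]] as Viewport));
}
```

The result can be applied with `renderers[i].setViewport(...vp)` and reused for the border divs, so the canvas panes and their HTML frames are driven by the same numbers.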
### 8.3 HTML border overlays ⚠️ (use percentages, not px)
Each pane has a transparent `<div>` (2px border + a corner label) over the canvas. Position
it as **percentages of the container**, matching the renderer's normalized viewport:
```ts
el.style.position = 'absolute';
el.style.left = `${vp[0] * 100}%`;
el.style.bottom = `${vp[1] * 100}%`;
el.style.width = `${(vp[2]-vp[0]) * 100}%`;
el.style.height = `${(vp[3]-vp[1]) * 100}%`;
el.style.boxSizing = 'border-box'; // set at creation
el.style.border = 'solid 2px hsl(var(--border))';
```
**⚠️ Do NOT compute px from `getBoundingClientRect()`** — the viewer often opens inside a
dialog that animates with `transform: scale(.95→1)`, and `ResizeObserver` does NOT fire on
transform changes, so a rect captured mid-animation freezes ~5% small/offset while the
canvas (`width:100%`) stretches to fill → content spills past the frame. Percentages track
the canvas under any transform/resize.
### 8.4 Slices (the 3 2D panes)
```ts
axialPlane.setNormal(0,0,1); coronalPlane.setNormal(0,1,0); sagittalPlane.setNormal(1,0,0);
// per view i: mapper = vtkImageResliceMapper({ slicePlane: planes[i] }); actor = vtkImageSlice(mapper)
// camera = parallel projection; positioned per medical convention (§8.6)
```
On data load, set every plane's origin to the volume center. `imageSliceActors[i] = { actor, mapper, ctf }`.
### 8.5 3D volume (the 3D pane)
```ts
volumeMapper = vtkVolumeMapper({ sampleDistance: 1.0 });
volumeActor = vtkVolume(); volumeActor.setMapper(volumeMapper);
// shade on, ambient .2 / diffuse .7 / specular .3 / specularPower 8
// on data load: mapper.setInputData(im); renderer3D.removeAllVolumes(); renderer3D.addVolume(volumeActor)
// setScalarOpacityUnitDistance(0, diagonal / max(dims)); gradientOpacity min 0 / max (range*0.05)
```
### 8.6 Cameras (medical orientations, set once on data load) — verbatim
```ts
// center = volume bounds center; d = boundingBox.diagonalLength * 1.5
axial: pos(cx, cy, cz-1) focal(center) viewUp(0,-1,0) then renderer.resetCamera()
coronal: pos(cx, cy-1, cz) focal(center) viewUp(0, 0,1) then resetCamera()
sagittal: pos(cx-1, cy, cz) focal(center) viewUp(0, 0,1) then resetCamera()
3D: rotate3DView('anterior')
// rotate3DView(o): focal = center; viewUp/pos per orientation:
// anterior pos(cx, cy-d, cz) up(0,0,1) posterior pos(cx, cy+d, cz) up(0,0,1)
// left-lat pos(cx-d, cy, cz) up(0,0,1) right-lat pos(cx+d, cy, cz) up(0,0,1)
// superior pos(cx, cy, cz+d) up(0,1,0) inferior pos(cx, cy, cz-d) up(0,-1,0)
```
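The orientation table for `rotate3DView` can be expressed as a lookup that returns plain numbers, which keeps it testable without a WebGL context. This is a sketch: `cameraPoseFor` and the `OrientationName` string literals are assumptions (the real `ViewOrientation` type is not shown in this spec); the positions and view-up vectors are copied from the table above.

```ts
type Vec3 = [number, number, number];
// Assumed literal names; the real ViewOrientation union may differ.
type OrientationName = 'anterior' | 'posterior' | 'left-lat' | 'right-lat' | 'superior' | 'inferior';
interface CameraPose { position: Vec3; focalPoint: Vec3; viewUp: Vec3 }

// center = volume bounds center; d = bounding-box diagonal * 1.5 (per the spec above).
function cameraPoseFor(o: OrientationName, center: Vec3, d: number): CameraPose {
  const [cx, cy, cz] = center;
  const table: Record<OrientationName, { position: Vec3; viewUp: Vec3 }> = {
    'anterior':  { position: [cx, cy - d, cz], viewUp: [0, 0, 1] },
    'posterior': { position: [cx, cy + d, cz], viewUp: [0, 0, 1] },
    'left-lat':  { position: [cx - d, cy, cz], viewUp: [0, 0, 1] },
    'right-lat': { position: [cx + d, cy, cz], viewUp: [0, 0, 1] },
    'superior':  { position: [cx, cy, cz + d], viewUp: [0, 1, 0] },
    'inferior':  { position: [cx, cy, cz - d], viewUp: [0, -1, 0] },
  };
  return { ...table[o], focalPoint: [cx, cy, cz] };
}
```

The caller would then pass the three vectors to the 3D renderer's active camera (`setPosition`, `setFocalPoint`, `setViewUp`) and reset the clipping range before rendering.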
### 8.7 Window/level + opacity transfer functions (on every slider change)
```ts
const low = level - width/2, high = level + width/2;
// 2D grayscale (per slice ctf): points (low-1,0,0,0)(low,0,0,0)(high,1,1,1)(high+1,1,1,1)
// + actor.getProperty().setColorWindow(width); setColorLevel(level);
// 3D volume color (bone/soft-tissue ramp):
ctf: (low-200, 0,0,0)(low, .4,.2,.1)(low+.3Δ, .8,.6,.5)(low+.5Δ, .9,.8,.7)(high, 1,1,.9)(high+200, 1,1,1)
pf : (low-200,0)(low,0)(low+.2Δ, op*.2)(low+.5Δ, op*.5)(high, op) // Δ = high-low
```
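The control points above are compact, so a worked version may help when porting. The sketch below only computes the point lists; `transferPoints` is a hypothetical helper, and feeding the results into `addRGBPoint` / `addPoint` on the actual transfer functions is left to the caller.

```ts
interface TransferPoints {
  gray: number[][];    // [x, r, g, b] for each 2D slice ctf
  color: number[][];   // [x, r, g, b] for the 3D volume ctf
  opacity: number[][]; // [x, alpha] for the 3D volume pf
}

// Hypothetical helper: derive all control points from window width/level and the opacity slider.
function transferPoints(width: number, level: number, op: number): TransferPoints {
  const low = level - width / 2;
  const high = level + width / 2;
  const delta = high - low; // the spec's Δ
  return {
    gray: [[low - 1, 0, 0, 0], [low, 0, 0, 0], [high, 1, 1, 1], [high + 1, 1, 1, 1]],
    color: [
      [low - 200, 0, 0, 0], [low, 0.4, 0.2, 0.1], [low + 0.3 * delta, 0.8, 0.6, 0.5],
      [low + 0.5 * delta, 0.9, 0.8, 0.7], [high, 1, 1, 0.9], [high + 200, 1, 1, 1],
    ],
    opacity: [
      [low - 200, 0], [low, 0], [low + 0.2 * delta, op * 0.2],
      [low + 0.5 * delta, op * 0.5], [high, op],
    ],
  };
}
```

For the example values in §6 (width 400, level 40, opacity 0.8) this gives `low = -160` and `high = 240`, so the volume is fully transparent below -160 and reaches opacity 0.8 at 240.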
### 8.8 Slice scrolling + zoom (wheel listener per 2D pane)
```ts
// plain wheel → slice nav: axisIndex = view===axial?2 : view===coronal?1 : 0
// plane.origin[axisIndex] += spacing[axisIndex] * (deltaY>0?1:-1); clamp to bounds; render
// Ctrl/⌘ wheel → camera.zoom(deltaY<0 ? 1.1 : 0.9)
```
Attach with `{ passive: false }` and `preventDefault()`.
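The wheel logic reduces to two pure functions, sketched here so the clamping can be checked without a DOM. `stepSliceOrigin` and `wheelZoomFactor` are hypothetical names, and `bounds` is assumed to use the VTK order `[xmin, xmax, ymin, ymax, zmin, zmax]`.

```ts
type Vec3 = [number, number, number];
type SliceView = 'axial' | 'coronal' | 'sagittal';
type Bounds = [number, number, number, number, number, number];

// Plain wheel: move the slice plane origin one voxel along the view's axis, clamped to the volume.
function stepSliceOrigin(origin: Vec3, view: SliceView, deltaY: number, spacing: Vec3, bounds: Bounds): Vec3 {
  const axis = view === 'axial' ? 2 : view === 'coronal' ? 1 : 0;
  const next: Vec3 = [origin[0], origin[1], origin[2]];
  const stepped = origin[axis] + spacing[axis] * (deltaY > 0 ? 1 : -1);
  next[axis] = Math.min(bounds[2 * axis + 1], Math.max(bounds[2 * axis], stepped));
  return next;
}

// Ctrl/Cmd + wheel: the factor passed to camera.zoom().
function wheelZoomFactor(deltaY: number): number {
  return deltaY < 0 ? 1.1 : 0.9;
}
```

In the wheel handler, the returned origin would be written back with `plane.setOrigin(...)` before calling `render()`.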
### 8.9 Resize
`ResizeObserver` on the container → `renderWindowView.setSize(rect.w, rect.h)` + reposition
the % border divs + `render()`.
---
## 9. Interaction model ⚠️ (the subtle part)
One interactor, bound to the **full canvas container** (never per-pane). Two styles:
`vtkInteractorStyleImage` (2D: pan/zoom/window-level) and
`vtkInteractorStyleTrackballCamera` (3D: rotate). Each pane `<div>` carries
`dataset.viewId = "0..3"`.
- **Style swap by pane:** on `pointerenter`/`mousedown`, `interactor.setInteractorStyle(viewId==='3' ? trackball : image)`, and bind events to the full container (once).
- **⚠️ Confine a drag to its origin pane:** VTK re-resolves the "poked" renderer on **every
mouse-move** (`findPokedRenderer`), and trackball/image act on that renderer — so a drag
begun in the 3D pane that wanders into a 2D pane retargets the 2D camera. `findPokedRenderer`
skips renderers whose `getInteractive()` is false, so **on `mousedown` set every OTHER
renderer `setInteractive(false)`, restore all on `mouseup` + a global `mouseup`.**
- **⚠️ Double-click to expand:** VTK takes **pointer capture** on press, so the native
`dblclick`/`mouseup`/`click` fire on the parent container, **not** the per-pane div — a
`dblclick` listener on the div never fires. Detect it on the div's `mousedown` via
`e.detail === 2` (the 2nd press of a double-click), then toggle `expandedView`.
- **⚠️ `resetCamera()` on expand:** when the layout changes, re-run
`renderers.forEach(r => r.resetCamera())` after `setViewport`, else the enlarged pane keeps
the tiny framing it had as a quadrant (content stuck in a corner). `resetCamera` preserves
direction + view-up (orientation/rotation kept).
- **⚠️ VTK + Vite HMR:** Fast Refresh leaves stale inputless mappers → console floods
`No input!` + black panes. **Always verify on a FULL reload**, not HMR.
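The drag-confinement and double-click rules above can be sketched against a minimal structural interface, so the logic is testable without VTK. `confineDragTo` and `isSecondPress` are hypothetical names; the only renderer method relied on is `setInteractive`, as described in the bullets above.

```ts
// The only renderer capability this sketch needs.
interface InteractiveRenderer { setInteractive(on: boolean): void }

// Call on mousedown in pane `originIndex`: every other renderer becomes non-interactive,
// so findPokedRenderer cannot retarget the drag. Returns a release function that is safe
// to call from both the pane's mouseup and a global mouseup.
function confineDragTo(renderers: InteractiveRenderer[], originIndex: number): () => void {
  renderers.forEach((r, i) => r.setInteractive(i === originIndex));
  let released = false;
  return () => {
    if (released) return; // idempotent: the second mouseup is a no-op
    released = true;
    renderers.forEach((r) => r.setInteractive(true));
  };
}

// Double-click detection on mousedown: `detail` is 2 on the second press of a double-click.
function isSecondPress(e: { detail: number }): boolean {
  return e.detail === 2;
}
```

A typical wiring would store the release function in a ref on `mousedown` and invoke it from both `mouseup` listeners.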
---
## 10. Annotation overlay (`AnnotationOverlay.tsx`)
One overlay `<div>` per 2D pane, `absolute inset-0`, rendering an SVG of the annotations.
Geometry is **normalized [0..1]** to the pane and tagged with `{view, sliceIndex}` so an
annotation only shows on the slice it was drawn on. Tools: `bbox` (2-pt drag), `points`
(click), `pen` (polyline w=2), `brush` (polyline w=16, 0.55 opacity), `polygon` (click
vertices, double-click to close ≥3 pts).
- **Pointer-transparent when tool === 'none'** (`pointerEvents: none`) so VTK keeps
scroll/zoom/rotate; `pointerEvents: auto` + `cursor: crosshair` when a tool is active.
- **⚠️ Capture with NATIVE listeners that `stopPropagation()` + `setPointerCapture`** — and
**also `interactor.disable()` while a tool is active** in the renderer. `stopPropagation()`
ALONE is insufficient: VTK's native canvas listener fires before any React handler can stop it.
- Latest `tool`/callbacks kept in refs so the once-bound native listeners always see current
values without re-binding mid-drag.
- `dblclick` while a polygon draft has ≥3 pts → close it; otherwise → forward to
`onRequestExpand()` (expand the pane).
- `wheel` while a tool is active → `stopPropagation` + forward `{deltaY, ctrlKey, metaKey}`
to the host, which re-dispatches a synthetic `wheel` on the underlying pane so slice-scroll
keeps working under the overlay.
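The normalized geometry and the per-slice filtering can be sketched as two small helpers. Both names are hypothetical, and the sketch assumes a top-left origin with y increasing downward (DOM convention) and a pane rect read at event time; the spec does not state which corner is the origin.

```ts
interface NormPoint { x: number; y: number }          // normalized [0..1] to the pane
interface PaneRect { left: number; top: number; width: number; height: number }

const clamp01 = (v: number): number => Math.min(1, Math.max(0, v));

// Convert a pointer position (client px) into pane-normalized coordinates.
function toNormalizedPoint(clientX: number, clientY: number, rect: PaneRect): NormPoint {
  return {
    x: clamp01((clientX - rect.left) / rect.width),
    y: clamp01((clientY - rect.top) / rect.height),
  };
}

// Only annotations drawn on this view AND this slice are rendered.
function visibleOnSlice<T extends { view: string; sliceIndex: number }>(
  all: T[], view: string, sliceIndex: number): T[] {
  return all.filter((a) => a.view === view && a.sliceIndex === sliceIndex);
}
```

Because points are stored normalized, the SVG can use a `0 0 1 1` style coordinate system (or multiply by the pane size) and annotations stay aligned when the pane is expanded or resized.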
---
## 11. Organ-mask overlays ⚠️ (3D surface + 2D fills)
Driven by the `organMasks: OrganMaskData[]` prop. A `useEffect([organMasks])` diffs them
against an `organMaskActors: Map<key, {...}>` and adds/removes per organ. **⚠️ Key by
`maskData.id ?? organName`, NOT the display label** — two organs sharing a label otherwise
collapse into one (and the UI shows the organ as visible while nothing is rendered).
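The add/remove diff can be sketched as a pure function over keys. `maskKey` and `diffOrganMasks` are hypothetical names; the key rule (`id ?? organName`) is the one stated above, and actor creation and disposal are left to the caller.

```ts
interface MaskLike { id?: string; organName: string }

// Stable key: the mask file id when present, falling back to the label.
function maskKey(m: MaskLike): string {
  return m.id ?? m.organName;
}

// Compare the keys currently in the actor Map with the next `organMasks` prop.
function diffOrganMasks<T extends MaskLike>(activeKeys: Iterable<string>, next: T[]): { toAdd: T[]; toRemove: string[] } {
  const wanted = new Map<string, T>();
  for (const m of next) wanted.set(maskKey(m), m);
  const active = new Set(activeKeys);
  const toAdd = Array.from(wanted.entries()).filter(([k]) => !active.has(k)).map(([, m]) => m);
  const toRemove = Array.from(active).filter((k) => !wanted.has(k));
  return { toAdd, toRemove };
}
```

Inside the effect this would be called as `diffOrganMasks(ctx.organMaskActors.keys(), organMasks ?? [])`, building actors for `toAdd` and tearing down those in `toRemove`.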
### 11.1 3D overlay — render a SURFACE, not a second volume
**⚠️ vtk.js does NOT composite two overlapping *volumes* in one renderer** — an overlay
volume silently fails to show even with valid data. Render the binary mask as a colored
**iso-surface** instead (also reads better as a solid shell):
```ts
const mc = vtkImageMarchingCubes.newInstance({ contourValue: 0.5, computeNormals: true, mergePoints: true });
mc.setInputData(maskImageData);
const mapper = vtkMapper.newInstance(); mapper.setInputConnection(mc.getOutputPort()); mapper.setScalarVisibility(false);
const actor = vtkActor.newInstance(); actor.setMapper(mapper);
actor.getProperty().setColor(r/255, g/255, b/255); actor.getProperty().setOpacity(0.7);
renderer3D.addActor(actor);
```
**⚠️ After adding overlay geometry call `renderer3D.resetCameraClippingRange()`** before
`render()`. The camera's far-clip was set for the main volume (e.g. far=1000) while the
surface sits ~1018–1283 away → it is entirely **culled** → empty 3D pane. This (not the
multi-volume issue) is the usual "nothing renders" cause; diagnose by logging
`actor.getBounds()` vs `camera.getClippingRange()`.
**⚠️ `ImageMarchingCubes` ships no `.d.ts`** — `// @ts-expect-error` the import.
**⚠️ PERF:** marching cubes on a 512³ mask ≈ ~9 s main-thread (no worker), and it runs in
the effect *after* the host's loading spinner clears → a frozen UI. Mitigate by cropping to
the mask's non-zero bbox before MC, or a Web Worker, or precomputing during load.
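For the cropping mitigation, the first step is finding the mask's non-zero extent. The sketch below is one way to do that on the flat scalar array; `nonZeroExtent` is a hypothetical helper and it assumes x-fastest voxel ordering (`index = x + nx * (y + ny * z)`), which is how the loader above fills `vtkImageData`.

```ts
type Extent = [number, number, number, number, number, number]; // [x0, x1, y0, y1, z0, z1], inclusive

// Scan the mask once and return the inclusive voxel extent of all values > 0, or null if empty.
function nonZeroExtent(mask: ArrayLike<number>, dims: [number, number, number]): Extent | null {
  const [nx, ny, nz] = dims;
  let x0 = nx, x1 = -1, y0 = ny, y1 = -1, z0 = nz, z1 = -1;
  for (let z = 0; z < nz; z++) {
    for (let y = 0; y < ny; y++) {
      for (let x = 0; x < nx; x++) {
        if (mask[x + nx * (y + ny * z)] > 0) {
          if (x < x0) x0 = x;
          if (x > x1) x1 = x;
          if (y < y0) y0 = y;
          if (y > y1) y1 = y;
          if (z < z0) z0 = z;
          if (z > z1) z1 = z;
        }
      }
    }
  }
  return x1 < 0 ? null : [x0, x1, y0, y1, z0, z1];
}
```

If the cropped sub-volume is then fed to marching cubes as its own image, its origin needs to be offset by `(x0 * sx, y0 * sy, z0 * sz)` so the surface stays aligned with the main volume; padding the extent by one voxel on each side also keeps the surface closed at the crop boundary.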
### 11.2 2D overlay — reslice onto the slice panes, sharing the main slice planes
For each view `vi ∈ {0,1,2}`, reslice the **same** mask with the **same `slicePlanes[vi]`**
the main image uses (so it tracks slice-scroll for free — the wheel handler already mutates
that plane + renders):
```ts
const m = vtkImageResliceMapper.newInstance(); m.setSlicePlane(slicePlanes[vi]); m.setInputData(maskImageData);
const ctf = vtkColorTransferFunction.newInstance(); ctf.addRGBPoint(0,0,0,0); ctf.addRGBPoint(1, r/255,g/255,b/255);
const pwf = vtkPiecewiseFunction.newInstance(); pwf.addPoint(0,0); pwf.addPoint(0.5,0); pwf.addPoint(1, 0.6); // opacity
const slice = vtkImageSlice.newInstance(); slice.setMapper(m);
slice.getProperty().setRGBTransferFunction(0, ctf);
slice.getProperty().setPiecewiseFunction(0, pwf);
slice.getProperty().setColorWindow(1); slice.getProperty().setColorLevel(0.5); // ⚠️ see below
slice.getProperty().setInterpolationTypeToNearest();
renderers[vi].addActor(slice);
```
**⚠️ `setColorWindow(1)` + `setColorLevel(0.5)` are mandatory.** The default color window
(255) squashes the binary value `1` to ~0 on the transfer function → the mask renders
**near-black** on the slice instead of the organ color. Window 1 / level 0.5 maps data
`0→0`, `1→1`. No z-fighting in practice (coplanar with the main slice, added after).
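The arithmetic behind this gotcha can be checked with a simplified model of linear window/level mapping. This is an illustration, not vtk.js's shader code: `normalizedIntensity` is a hypothetical helper, and the default level of 127.5 paired with window 255 is an assumption.

```ts
// Map a scalar into [0, 1] across the window [level - window/2, level + window/2].
function normalizedIntensity(value: number, colorWindow: number, colorLevel: number): number {
  const t = (value - (colorLevel - colorWindow / 2)) / colorWindow;
  return Math.min(1, Math.max(0, t));
}

// Binary mask value 1 under the assumed defaults: 1/255, about 0.004, so it lands at the dark end.
const underDefaults = normalizedIntensity(1, 255, 127.5);
// Same value with window 1 / level 0.5: exactly 1, the organ colour.
const underFix = normalizedIntensity(1, 1, 0.5);
```

With the defaults, the mask value lands almost on the `(0, 0,0,0)` control point of the colour transfer function, which is why the overlay looks near-black rather than missing.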
### 11.3 Lifecycle
Store `{ actor(surface), mapper, ctf: mcFilter, pf: null, sliceActors: [{actor,mapper,ctf,pf}×3] }`
in the Map. On toggle-off: `removeActor` the surface from `renderer3D` + each slice from
`renderers[vi]`, then `.delete()` all. **⚠️ Also free them in the component's unmount
cleanup** (iterate the Map) — they leak otherwise when the viewer closes with organs selected.
---
## 12. Host integration (the dialog)
The viewer is mounted full-screen and fed by a host that owns the controls. Minimal shape:
```tsx
<UnifiedQuadViewRenderer
files={[file]} windowWidth={ww} windowLevel={wl} opacity={op}
organMasks={organMasks} // ← selected organs
annotationTool={tool} annotations={annos} onAnnotationsChange={setAnnos} />
```
- **Control bar:** range sliders → `windowWidth` (1..4000), `windowLevel` (-1000..3000),
`opacity` (0..1).
- **Organ panel:** lists the available masks (id, label, color swatch). Toggling an organ
**lazily** loads its mask: `presignURL → fetch as File → loadNiftiImageData(file) →
OrganMaskData{ id, organName, imageData, color }`, cache by id, and derive
`organMasks = selectedIds.map(id => cache[id])`. Assign a stable color per organ by list
index from a fixed palette.
- **⚠️ Loader lives behind the lazy boundary** — the dialog (already VTK-heavy) imports
`loadNiftiImageData` from `@ump/shared/viewer`; the host *page* only uses VTK-free helpers
(presign/fetch) so the page bundle stays clean.
- Full-screen dialog: use `inset-0` (not `w-screen/h-screen`, which overflows by the
scrollbar width); a `flex-1` child in a non-definite-height parent collapses to 0 → give a
definite height (`h-screen` flex-col).
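The organ panel's derived state can be sketched as below. Names and palette values are hypothetical (the spec only says "a fixed palette"), and unlike the plain `selectedIds.map(id => cache[id])` above, this version skips ids whose mask has not finished loading so the renderer never receives `undefined` entries.

```ts
type RGB = [number, number, number]; // 0-255

// Hypothetical palette; any fixed list works as long as the index → colour mapping is stable.
const ORGAN_PALETTE: RGB[] = [[230, 25, 75], [60, 180, 75], [255, 225, 25], [0, 130, 200]];

// Stable colour per organ, by its index in the panel's list (wraps around).
function colorForIndex(listIndex: number): RGB {
  return ORGAN_PALETTE[listIndex % ORGAN_PALETTE.length];
}

// Derive the `organMasks` prop from the selection, preserving selection order.
function deriveOrganMasks<T>(selectedIds: string[], cache: Map<string, T>): T[] {
  const out: T[] = [];
  for (const id of selectedIds) {
    const loaded = cache.get(id);
    if (loaded !== undefined) out.push(loaded); // still loading: leave it out for now
  }
  return out;
}
```

Memoizing the derived array (for example with `useMemo` keyed on the selection and cache version) avoids re-running the overlay effect on unrelated host re-renders.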
---
## 13. Consolidated gotcha checklist ⚠️
1. **% border divs**, never px from `getBoundingClientRect()` (transform-stale-rect).
2. **Confine drags** to the origin pane via `setInteractive(false)` on the others (VTK
re-pokes every move).
3. **Expand = `e.detail===2` on mousedown** (pointer-capture eats `dblclick`) **+
`resetCamera()`** on layout change.
4. **Organ 3D overlay = marching-cubes SURFACE**, not a 2nd volume **+
`resetCameraClippingRange()`** (stale far-clip culls it).
5. **Organ 2D overlay**: reslice on the **shared** slice planes + **`colorWindow(1)/
colorLevel(0.5)`** (else near-black) + piecewise opacity.
6. **Key overlays by stable id**, not label.
7. **Free organ actors** on toggle-off AND unmount.
8. **Annotation overlay**: pointer-transparent when idle; native listeners +
`stopPropagation` + `interactor.disable()` while drawing.
9. **Co-registration**: masks must share the image's grid (dims+spacing, origin 0,0,0) — no
resampling is applied.
10. **Lazy subpath + alias order**; verify VTK/visual/interaction changes on a **FULL reload**.
---
## 14. Reconstruction build order
1. Install deps (§2); import the 3 VTK profiles once.
2. `parseNiftiBuffer` + `loadNiftiImageData` + `useNiftiData` (§7).
3. The init effect: 1 canvas / 1 renderWindow / 1 OpenGL view / 4 renderers + viewports
(§8.1–8.2) + the % border divs (§8.3).
4. Data-load effect: planes + reslice slice actors (§8.4), volume actor (§8.5), cameras
(§8.6) — verify the 3 slices + 3D volume render.
5. Window/level/opacity effect (§8.7) + wheel scroll/zoom (§8.8) + resize (§8.9).
6. Interaction: interactor + style swap + confine + expand (§9) — verify drag isolation +
double-click expand on a FULL reload.
7. Annotation overlay (§10).
8. Organ overlays: 3D surface (§11.1) then 2D reslice (§11.2) + lifecycle (§11.3).
9. Host dialog + organ panel (§12).
10. Walk the gotcha checklist (§13).
> Verify every visual/interaction change live with **real mouse input on a full reload** —
> `tsc`-clean ≠ works, and synthetic event-dispatch can mask the pointer-capture/clip bugs.
@@ -0,0 +1,356 @@
# ImageHub — Architecture
> **"GitHub for medical-imaging research datasets."** A self-hosted platform for
> versioning, viewing, de-identifying, and collaborating on imaging datasets
> (DICOM / NIfTI / WSI), modeled on Gitea's architecture but rebuilt on a
> Python-centric stack suited to the imaging + ML ecosystem.
>
> *"ImageHub" is a placeholder name — rename freely.*
This document describes (1) the Gitea patterns we are reproducing, (2) how each
maps to the imaging domain, (3) the recommended stack, (4) the subsystems and
data model, and (5) an MVP-first roadmap.
---
## 1. Design philosophy (inherited from Gitea)
Gitea is worth copying for five structural decisions. We keep all five:
1. **Modular monolith, not microservices.** One deployable core app with clear
internal layers. You can scale the heavy parts out later (we do — the worker
tier) without paying distributed-systems tax up front.
2. **Strict downward layering.** `cli → api → services → models → core`.
Dependencies only point down. Business logic lives in `services`, never in
models or HTTP handlers.
3. **Server-rendered UI + progressive enhancement, not a SPA.** Pages are
rendered server-side; rich client behavior (the image viewer) is embedded as
self-contained widgets. Faster to build, easy to deep-link, SEO/printable.
4. **Pluggable infrastructure behind interfaces.** Storage, queue, search,
cache, and auth are interfaces with swappable drivers (local disk ↔ S3,
in-proc ↔ Redis, Postgres FTS ↔ OpenSearch). Same idea as Gitea's
`modules/storage`, `modules/queue`, `modules/indexer`.
5. **The domain engine is a first-class subsystem.** For Gitea that engine is
Git. For us it is the **Dataset Versioning Engine** — a content-addressed,
Merkle-DAG version control system specialized for large imaging files. This is
the single most important component and the heart of the product.
What we deliberately change from Gitea:
- **Workers are externalized.** Gitea runs background jobs in-process. Imaging
jobs (de-identification, format conversion, thumbnailing, ML) are heavy,
Python-bound, and sometimes need GPUs — so they run in a separate, scalable
worker tier driven by a real queue.
- **All "files" are large binaries.** Gitea bolts on Git-LFS for large files; for
us large-file handling is the *default and only* path — every blob is
content-addressed and stored in object storage.
- **De-identification & audit are core**, not afterthoughts (domain requirement).
---
## 2. Concept mapping: Gitea → ImageHub
| Gitea concept | ImageHub equivalent | Notes |
|---|---|---|
| Repository | **Dataset** | A versioned collection of imaging studies/series + metadata + labels. |
| Git commit | **Version** (commit) | Immutable snapshot = a content-addressed manifest + parent links. |
| Branch / tag | **Branch / tag** | e.g. `raw`, `deidentified`, `train-split-v3`; tags for citable releases. |
| Blob / tree | **Blob / manifest** | Blob = one file (DICOM instance, NIfTI, label). Manifest = the tree of a version. |
| Git-LFS | *(native)* | Every blob is large; content-addressed object store is the only path. |
| Git transport (SSH/HTTP) | **Transport API + CLI/SDK** | Resumable chunked upload/download; "have/want" blob negotiation like LFS batch. |
| Pull Request | **Change Proposal** | Review added/changed/relabeled data before merging into a branch. |
| Diff / code review | **Dataset diff + image diff** | Added/removed/changed series and label diffs, viewed side-by-side. |
| Issues | **Issues / annotation tasks** | QC findings, labeling tasks, discussions. |
| Releases | **Dataset releases** | Frozen, citable snapshots (DOI-friendly) — key for research reproducibility. |
| Wiki | **Datasheet / data dictionary** | Dataset documentation, "Datasheets for Datasets". |
| Actions / act_runner | **Pipelines / runners** | Event-driven compute: de-id, QC, train/eval; pins exact data version. |
| Webhooks | **Webhooks** | Same. |
| Code search indexer | **Metadata + tag search** | Faceted search over modality/body-part/labels; optional image-embedding search. |
| Org / Team / User / RBAC | **Org / Team / User / RBAC** | Nearly identical; plus dataset access requests / data-use agreements. |
| `app.ini` + `modules/setting` | **Config system** | Typed config from file + env. |
| XORM migrations | **Alembic migrations** | Ordered, append-only schema migrations. |
| Storage (local/minio/s3) | **Object storage** | Same abstraction; blobs live here. |
| *(minimal in Gitea)* | **Audit & compliance log** | First-class, append-only PHI-access trail. |
| *(none)* | **De-identification engine** | Domain-specific; no Gitea analogue. |
---
## 3. Recommended stack ("own stack", Python-centric)
Rationale: the medical-imaging and ML ecosystems (pydicom, SimpleITK, nibabel,
dcm2niix, highdicom, MONAI, the de-id tooling) are overwhelmingly Python. A
single-language core + worker stack removes the model-duplication friction you'd
get from a Go core calling Python workers.
| Layer | Choice | Gitea analogue |
|---|---|---|
| Core web/API | **Python 3.12 + FastAPI** (uvicorn/gunicorn) | `routers/` (chi) |
| Templating | **Jinja2 + HTMX** for progressive enhancement | `templates/` |
| Frontend build | **Vite + TypeScript** | `web_src/` + Vite |
| DICOM viewer | **Cornerstone3D** (DICOM), **NiiVue** (NIfTI) | embedded widgets |
| ORM / migrations | **SQLAlchemy 2.0 + Alembic** | XORM + migrations |
| Primary DB | **PostgreSQL** (single target) | multi-DB → standardize on PG |
| Queue / workers | **Redis + Arq** (async) or Celery | `modules/queue` + workers |
| Object storage | **S3 / MinIO** (self-host) | `modules/storage` |
| Search | **OpenSearch** (or Postgres FTS to start) | `modules/indexer` |
| Cache / pubsub / sessions | **Redis** | `modules/cache`, eventsource |
| Auth | **Authlib** (OIDC/OAuth2) + sessions + API tokens | `services/auth` |
| Imaging libs | pydicom, highdicom, SimpleITK, nibabel, dcm2niix, Pillow; OpenSlide for WSI | — |
| ML integration | MONAI / PyTorch dataset adapters via the SDK | — |
| De-id | pydicom + `deid` (CTP rules) + Presidio (text) + OCR (burned-in PHI) | — |
| Client | **Python SDK + CLI** (`imagehub clone/pull/push/commit`) | the `git` client |
> **Alternative if you want Gitea-grade transport performance:** keep a **Go
> core** for the API/transport/auth layer and use **Python only in the worker
> tier**. Faithful to Gitea, but you maintain two languages and duplicate the
> dataset/manifest types across the boundary. Recommended only if the upload/
> download path is your dominant bottleneck. Default to all-Python.
---
## 4. Layered architecture
```
cli/ Admin & ops commands (Typer): serve, migrate, doctor, deid-batch, user-admin
└─ api/ FastAPI routers — UI pages + REST API + transport endpoints (thin: parse → service → render)
└─ services/ Business logic: dataset ops, versioning workflows, review, pipelines, de-id orchestration
└─ models/ SQLAlchemy entities + queries (one module per domain: user, dataset, version, annotation…)
└─ core/ Leaf infra & domain engines — MUST NOT import the layers above
├─ vcs/ ← the Dataset Versioning Engine (the "Git")
├─ storage/ ← content-addressed blob store over S3/MinIO
├─ imaging/ ← DICOM/NIfTI parsing, metadata, thumbnails, conversion
├─ deid/ ← de-identification pipeline stages
├─ queue/ ← Redis/Arq job abstraction
├─ index/ ← search abstraction (OpenSearch / PG FTS)
├─ audit/ ← append-only audit log
├─ config/ ← typed settings
└─ auth/ ← tokens, sessions, OIDC, permissions
```
**Layer rules (enforce with import-linter, the analogue of Gitea's depguard):**
- `core/` is the foundation; it may not import `models/`, `services/`, or `api/`.
- Cross-entity business logic goes in `services/`, never in `models/`.
- `api/` handlers stay thin — no business logic, no direct DB-engine access.
- Every DB query takes a `session`/context so it enlists in the request transaction.
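The layer rules are meant to be enforced by import-linter; purely to illustrate what such a check does, here is a minimal standard-library sketch that flags upward imports. The root package name `imagehub` and the helper names are assumptions for illustration, and this is not a substitute for the real tool.

```python
import ast
from typing import Optional

# Layers from outermost to innermost; a module may only import its own
# layer or the layers listed after it.
LAYERS = ["cli", "api", "services", "models", "core"]
ROOT = "imagehub"  # hypothetical root package name


def layer_of(module: str) -> Optional[str]:
    """Return the layer a dotted module path belongs to, or None if outside ROOT."""
    parts = module.split(".")
    if len(parts) >= 2 and parts[0] == ROOT and parts[1] in LAYERS:
        return parts[1]
    return None


def violations(module: str, source: str) -> list:
    """List absolute imports in `source` that reach into a layer above `module`."""
    own = layer_of(module)
    if own is None:
        return []
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            targets = [node.module]  # relative imports are ignored in this sketch
        else:
            continue
        for target in targets:
            other = layer_of(target)
            if other is not None and LAYERS.index(other) < LAYERS.index(own):
                found.append(f"{module} imports {target}")
    return found
```

Running `violations("imagehub.core.vcs.manifest", source_text)` over each file in CI would surface any `core/` module that imports from `services/` or `api/`.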
---
## 5. Core subsystems
### 5.1 Dataset Versioning Engine (`core/vcs`) — the heart
A content-addressed Merkle DAG, like Git, specialized for large imaging files.
- **Blob store.** Every file is hashed (SHA-256) and stored once in object
storage at `blobs/<aa>/<bb>/<hash>`. Identical files across versions/datasets
dedupe for free (huge win — imaging datasets share many instances).
- **Manifest (tree).** A version's manifest lists `logical_path → {blob_hash,
size, media_type, imaging_meta}`. The manifest is itself content-addressed.
- **Commit.** `{manifest_hash, parents[], author, timestamp, message}`. The
parent chain is the history DAG.
- **Refs.** Branches/tags map `name → commit_id`, stored in **Postgres** (not in
object storage) so they're transactional and queryable.
- **Transport / negotiation.** On push, the client hashes locally and asks the
server which blobs are missing ("have/want", like the LFS batch API), uploads
only those (resumable, chunked), then posts the commit. Pull is the reverse.
- **Diff.** Compare two manifests → added / removed / modified entries; surfaced
in the UI as a dataset diff and, per-image, as a viewer side-by-side.
- **Merge.** Three-way, path-level merge of manifests; a conflict arises when the
same path changed differently on both sides. Label/annotation merges can be
semantic (for example, resolved per annotation object rather than per file).
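The primitives in the list above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the engine's actual code: manifests are plain dicts of `logical_path → entry`, `<aa>/<bb>` is taken to mean the first two byte pairs of the hex digest, and the canonical JSON encoding and all function names are hypothetical.

```python
import hashlib
import json


def blob_key(data: bytes) -> str:
    """Storage key blobs/<aa>/<bb>/<hash>, fanned out by hash prefix (assumed)."""
    h = hashlib.sha256(data).hexdigest()
    return f"blobs/{h[:2]}/{h[2:4]}/{h}"


def object_hash(obj: dict) -> str:
    """Content-address a manifest or commit via canonical JSON (sorted keys)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()


def missing_blobs(manifest: dict, server_has: set) -> set:
    """Have/want negotiation: blob hashes the server still needs uploaded."""
    return {entry["blob_hash"] for entry in manifest.values()} - server_has


def diff_manifests(old: dict, new: dict) -> dict:
    """Classify logical paths as added, removed, or modified between versions."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(
            p for p in old.keys() & new.keys()
            if old[p]["blob_hash"] != new[p]["blob_hash"]
        ),
    }


def merge_manifests(base: dict, ours: dict, theirs: dict):
    """Three-way path-level merge; returns (merged_manifest, conflicting_paths)."""
    merged, conflicts = {}, []
    for path in sorted(base.keys() | ours.keys() | theirs.keys()):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (including both deleted)
            winner = o
        elif o == b:      # only theirs changed
            winner = t
        elif t == b:      # only ours changed
            winner = o
        else:             # both sides changed the path, differently
            conflicts.append(path)
            continue
        if winner is not None:
            merged[path] = winner
    return merged, conflicts
```

A commit would then be a dict such as `{"manifest_hash": ..., "parents": [...], ...}` passed through `object_hash`, so that the commit id covers the manifest and the whole parent chain.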
**Build vs. buy:** a custom build gives full control and the cleanest
domain fit (recommended). To move faster, back it with **lakeFS**
(git-like branches/commits/merge over S3) or **DVC**, and keep your manifest API
as the stable interface so you can swap the backend later.
### 5.2 Object storage (`core/storage`)
Driver interface (`put/get/stat/delete/presign`) with `local` and `s3/minio`
implementations — exactly Gitea's `modules/storage` pattern. Stores blobs,
manifests, thumbnails, and pipeline artifacts. Presigned URLs let clients upload
and download large files directly against S3/MinIO, bypassing the app.
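One possible shape for the driver interface, sketched as a `typing.Protocol` with a `local` implementation. The class names, the size-only `stat`, and the HMAC-signed URL format are assumptions for illustration; a real S3/MinIO driver would delegate presigning to the object store's SDK.

```python
import hashlib
import hmac
import time
from pathlib import Path
from typing import Protocol


class StorageDriver(Protocol):
    """The five operations named in the driver interface."""
    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def stat(self, key: str) -> int: ...
    def delete(self, key: str) -> None: ...
    def presign(self, key: str, expires_in: int = 3600) -> str: ...


class LocalStorage:
    """Filesystem-backed driver for development; keys map to paths under root."""

    def __init__(self, root: Path, secret: bytes) -> None:
        self.root = root.resolve()
        self.secret = secret

    def _path(self, key: str) -> Path:
        path = (self.root / key).resolve()
        if not path.is_relative_to(self.root):  # reject "../" style keys
            raise ValueError(f"key escapes storage root: {key!r}")
        return path

    def put(self, key: str, data: bytes) -> None:
        path = self._path(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return self._path(key).read_bytes()

    def stat(self, key: str) -> int:
        # Simplification: only the size in bytes is reported.
        return self._path(key).stat().st_size

    def delete(self, key: str) -> None:
        self._path(key).unlink(missing_ok=True)

    def presign(self, key: str, expires_in: int = 3600) -> str:
        # Local stand-in for S3 presigning: an expiring, HMAC-signed URL
        # (hypothetical route) that the app itself would have to verify.
        expires = int(time.time()) + expires_in
        message = f"{key}:{expires}".encode()
        sig = hmac.new(self.secret, message, hashlib.sha256).hexdigest()
        return f"/blobs/{key}?expires={expires}&sig={sig}"
```

Because callers depend only on `StorageDriver`, the versioning engine and workers stay unaware of whether blobs live on disk or in a bucket.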
### 5.3 Ingestion & processing pipeline (`core/queue` + workers)
On upload, enqueue jobs; workers (Arq) process them:
1. Verify checksums, store blobs (dedup).
2. **Extract metadata** (pydicom/nibabel): modality, body part, study/series UIDs,
dimensions, acquisition params → indexed + linked to blobs.
3. **Thumbnails / previews** for the browse UI.
4. **De-identification** (§5.4).
5. **Format normalization** (optional: DICOM→NIfTI via dcm2niix for ML).
6. Commit the resulting version; update search index; write audit entries.
Workers scale independently; GPU nodes handle ML jobs.
### 5.4 De-identification engine (`core/deid`) — compliance must-have
A configurable, multi-stage pipeline producing a `deidentified` branch from a
`raw`/PHI version:
- **Tag de-id** per **DICOM PS3.15 Annex E** confidentiality profiles: remove/
replace PHI tags, regenerate UIDs *consistently* (so series stay linked),
handle private tags.
- **Date shifting**: consistent per-patient offset to preserve intervals.
- **Burned-in pixel PHI**: OCR (Tesseract/EasyOCR) to detect text in pixels,
redact, and flag for human review.
- **Free-text / report de-id**: Presidio NER over any text fields/reports.
- **Re-identification map** (only if policy allows): the original↔pseudonym
mapping is encrypted, access-restricted, and fully audited; otherwise the PHI
source is dropped.
- **Verification stage** emits a report of exactly what changed.
Tooling: pydicom, Stanford `deid` / MIRC CTP rule sets, Presidio, an OCR engine.
Profiles are configurable per org/dataset.
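Two of the stages above, consistent UID regeneration and per-patient date shifting, hinge on a deterministic keyed mapping. A minimal sketch, assuming a per-dataset secret key and an HMAC-based derivation (both are illustrative choices, not what pydicom or `deid` do internally):

```python
import hashlib
import hmac
from datetime import date, timedelta


def pseudonymous_uid(original_uid: str, key: bytes) -> str:
    """Map a DICOM UID to a stable replacement under the 2.25 root.

    The same input always yields the same output, so instances that shared
    a Study/Series UID before de-identification still share one afterwards.
    """
    digest = hmac.new(key, original_uid.encode(), hashlib.sha256).digest()
    # 128 bits gives at most 39 digits, well inside the 64-character UID limit.
    return "2.25." + str(int.from_bytes(digest[:16], "big"))


def date_offset_days(patient_id: str, key: bytes, max_days: int = 365) -> int:
    """Derive one fixed offset in [-max_days, -1] per patient."""
    digest = hmac.new(key, b"date:" + patient_id.encode(), hashlib.sha256).digest()
    return -(int.from_bytes(digest[:4], "big") % max_days) - 1


def shift_dicom_date(da: str, offset_days: int) -> str:
    """Shift a DICOM DA value (YYYYMMDD) by a number of days."""
    d = date(int(da[:4]), int(da[4:6]), int(da[6:8])) + timedelta(days=offset_days)
    return f"{d.year:04d}{d.month:02d}{d.day:02d}"
```

Since every date for a patient moves by the same offset, the interval between, say, a baseline and a follow-up scan is unchanged, while the absolute dates no longer match the record.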
### 5.5 Web viewer (`api` + embedded TS widgets)
Progressive-enhancement widgets (not a separate SPA), true to Gitea:
- **Cornerstone3D** for DICOM (multi-frame, MPR, windowing, measurements,
segmentation overlays).
- **NiiVue** for NIfTI volumes (great for neuro/research).
- **OpenSlide**-backed deep-zoom tiles for whole-slide pathology (optional).
The server exposes a frame/tile API (a WADO-RS-like read path even without full
DICOMweb). Annotations are structured objects (DICOM SR or JSON), **versioned
with the dataset**.
### 5.6 Search & discovery (`core/index`)
Index extracted metadata + labels → faceted search ("brain MRI, T1, age<40, has
tumor label"). Start on **Postgres FTS**; graduate to **OpenSearch** for scale.
Optional later: compute image embeddings with a foundation model and store them
in **pgvector** to support "find similar studies/lesions" queries.
### 5.7 Collaboration (`services`)
Change Proposals (PRs), reviews, issues, comments, annotation tasks, releases,
datasheets — the GitHub social layer, mapped to datasets. A reviewer of a Change
Proposal sees the dataset diff and can open the viewer on changed series.
### 5.8 Pipelines & runners (Actions analogue, optional/advanced)
Event-driven compute (`on: push | proposal | tag | schedule`) executed by
**runners** (containers that poll for jobs, à la `act_runner`). Use cases: auto
de-id, QC/validation, dataset statistics, **training/eval** with MONAI. Each run
**pins the dataset version hash**, giving reproducible ML by construction.
### 5.9 Auth, permissions, audit (`core/auth`, `core/audit`)
- OIDC/OAuth2 login, sessions, scoped API tokens.
- Org → Team → permission model; dataset visibility `private | internal | public`;
dataset-level access requests / data-use agreements.
- **Audit log**: append-only Postgres table (actor, action, object, dataset,
version, IP, purpose-of-use, timestamp). Every PHI-bearing access (view
original, download) is logged; optional hash-chaining for tamper-evidence;
retention + legal-hold support.
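The optional hash-chaining can be sketched as follows: each row stores the hash of its predecessor, so editing or removing an earlier row breaks every later hash. The in-memory list stands in for the Postgres table, and the field names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # prev_hash used for the first row


def entry_hash(prev_hash: str, entry: dict) -> str:
    """Hash an audit entry together with the hash of the row before it."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def append_entry(log: list, entry: dict) -> None:
    """Append a row whose hash commits to the entire preceding chain."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"entry": entry, "prev_hash": prev, "hash": entry_hash(prev, entry)})


def verify_chain(log: list) -> bool:
    """Recompute every hash from the start; False on any mismatch."""
    prev = GENESIS
    for row in log:
        if row["prev_hash"] != prev or row["hash"] != entry_hash(prev, row["entry"]):
            return False
        prev = row["hash"]
    return True
```

Tamper-evidence only holds if the latest hash is also anchored somewhere the database administrator cannot rewrite, for example by periodically exporting it to separate storage.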
### 5.10 API, SDK, CLI
- **REST API** (FastAPI, OpenAPI-documented — the swagger analogue).
- **Python SDK** (the most important client for ML users): pull a pinned version
straight into a `torch`/MONAI `Dataset`.
- **CLI** (`imagehub clone/pull/push/checkout/commit/diff`) — the `git`/`dvc`
analogue for data engineers.
---
## 6. Data model (core tables)
```
user, organization, team, team_membership, team_access
dataset(id, owner_id, name, visibility, default_branch, description)
ref(dataset_id, name, type[branch|tag], commit_id) -- transactional refs
commit(id, dataset_id, manifest_hash, parent_ids[], author_id, message, created_at)
blob(hash PK, size, storage_key, media_type, refcount) -- content-addressed, deduped
manifest(hash PK, storage_key) -- stored in object store, hash in DB
instance_meta(blob_hash, dataset_id, study_uid, series_uid, modality, body_part, dims, params…)
annotation(id, dataset_id, commit_id, target, type, payload, author_id)
label_schema(id, dataset_id, spec) label(id, schema_id, value)
change_proposal(id, dataset_id, src_ref, dst_ref, status) review, comment
issue(id, dataset_id, …) issue_comment
release(id, dataset_id, tag, notes, doi?)
pipeline(id, dataset_id, spec) pipeline_run(id, pipeline_id, commit_id, status, artifacts) runner
webhook webhook_delivery
audit_log(id, actor_id, action, object_type, object_id, dataset_id, ip, purpose, created_at) -- append-only
access_request, data_use_agreement
phi_map(dataset_id, original_ref, pseudonym, …) -- encrypted, restricted, audited
```
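Keeping `ref` in a relational table makes a branch update a single conditional statement. The sketch below shows one way to do that as a compare-and-swap, so that two concurrent pushes cannot both move the same branch. It uses SQLite only to stay self-contained (the design targets Postgres), and the function names and the compare-and-swap policy itself are assumptions.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a simplified ref table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ref ("
        " dataset_id INTEGER, name TEXT, type TEXT, commit_id TEXT,"
        " PRIMARY KEY (dataset_id, name))"
    )
    conn.execute("INSERT INTO ref VALUES (1, 'main', 'branch', 'c1')")
    conn.commit()
    return conn


def update_ref(conn: sqlite3.Connection, dataset_id: int, name: str,
               expected_commit: str, new_commit: str) -> bool:
    """Move a ref only if it still points at expected_commit."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE ref SET commit_id = ? "
            "WHERE dataset_id = ? AND name = ? AND commit_id = ?",
            (new_commit, dataset_id, name, expected_commit),
        )
        # Zero rows updated means someone else moved the branch first.
        return cur.rowcount == 1
```

A push that receives `False` would fetch the new head, merge or rebase its manifest, and retry, much like a rejected non-fast-forward push in Git.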
---
## 7. Key flows
1. **Ingest & de-identify:** upload → blobs stored (deduped) → metadata extracted
→ de-id pipeline → new commit on `deidentified` branch → indexed → audited.
2. **Browse & view:** datasets list → dataset → series list → Cornerstone3D/NiiVue
streams frames → annotation overlays.
3. **Curate an ML subset (zero-copy):** faceted query → new branch/dataset whose
manifest *references existing blobs* (no data copied) → commit → tag a release
→ `sdk.pull(tag)` in training.
4. **Propose a change (PR):** push new/relabeled data to a branch → open Change
Proposal → reviewer sees dataset diff + image diff → approve → merge.
5. **Reproducible training:** tag triggers a pipeline that pins the version hash,
runs MONAI train/eval, and links metrics + model artifact to that exact data
version.
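Flow 3 is cheap because a curated subset is only a new manifest whose entries point at blobs that already exist. A minimal sketch, assuming manifests are dicts and each entry carries an `imaging_meta` dict (the function name and sample values are hypothetical):

```python
def curate_subset(manifest: dict, predicate) -> dict:
    """Build a new manifest from entries whose imaging_meta passes `predicate`.

    Entries are reused as-is, so the subset references the same blob hashes
    and no image data is copied.
    """
    return {
        path: entry
        for path, entry in manifest.items()
        if predicate(entry.get("imaging_meta", {}))
    }


source = {
    "mr/s1/0001.dcm": {"blob_hash": "aa11", "imaging_meta": {"modality": "MR"}},
    "ct/s2/0001.dcm": {"blob_hash": "bb22", "imaging_meta": {"modality": "CT"}},
}
subset = curate_subset(source, lambda meta: meta.get("modality") == "MR")
```

Committing `subset` on a new branch and tagging it yields a release that training code can pull by tag, while storage usage grows only by the size of the manifest.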
---
## 8. Deployment topology
```
┌──────────── reverse proxy (Caddy/Traefik) + TLS ────────────┐
│ │
┌────────▼────────┐ ┌──────────────────┐ ┌───────────────────────▼─┐
│ Core app (N×) │ │ Worker tier (M×) │ │ Runners (K×, GPU opt.) │
│ FastAPI/uvicorn│ │ Arq + imaging/ML │ │ pipelines (train/eval) │
└───┬─────┬───┬───┘ └───┬─────────┬─────┘ └────────────┬────────────┘
│ │ │ │ │ │
┌───▼─┐ ┌─▼─┐ │ ┌───▼───┐ ┌───▼────────┐ ┌────▼────┐
│ PG │ │Redis│ └──────▶│ Redis │ │ Object store│◀────────┤Object st.│
│(state│ │queue│ │ queue │ │ S3 / MinIO │ │ (blobs) │
│ refs)│ │cache│ └───────┘ │ (blobs) │ └─────────┘
└─────┘ └────┘ └─────────────┘
┌───────────────┐
│ OpenSearch │ (metadata/label search)
└───────────────┘
```
- **Core app**: stateless, horizontally scalable.
- **Worker tier**: scales independently; CPU for de-id/convert, GPU for ML.
- **Postgres**: state, refs, metadata, audit. **Redis**: queue, cache, sessions,
server-sent events. **Object storage**: all blobs. **OpenSearch**: search.
- **Dev / small self-host**: a single `docker-compose` (app + worker + PG + Redis
+ MinIO + OpenSearch). **Scale**: Kubernetes with separate node pools.
- Contrast with Gitea (one binary, in-proc workers): we externalize workers and
object storage because imaging/ML work is heavy, Python-bound, and GPU-hungry.
---
## 9. Build-vs-buy summary
| Component | Recommendation |
|---|---|
| Versioning engine | **Build** the manifest/commit model (custom) — or back it with **lakeFS/DVC** behind your API to ship faster. |
| Viewer | **Adopt** Cornerstone3D + NiiVue (+ OpenSlide for WSI). Don't build. |
| De-identification | **Assemble** from pydicom + `deid`/CTP rules + Presidio + OCR. Don't build from scratch. |
| Search | **Postgres FTS** first → **OpenSearch** at scale. |
| Auth | **Authlib** (OIDC). |
| Queue | **Arq** (async) or **Celery**. |
| Object storage | **MinIO** self-host / **S3** cloud. |
---
## 10. MVP-first roadmap
Ordered for the chosen must-haves (versioning + viewer + de-id + audit):
- **Phase 0 — Skeleton.** Layered project structure, config, Postgres + Alembic,
object-storage driver, auth (user/org/team), dataset CRUD.
- **Phase 1 — Versioning engine.** Blobs, manifests, commits, branches; push/pull
via CLI + SDK; dataset diff. *(This is the product's spine — invest here.)*
- **Phase 2 — Ingestion + de-id + audit.** Worker tier, metadata extraction,
de-identification pipeline, append-only audit log. *(The compliance core.)*
- **Phase 3 — Viewer + search.** Cornerstone3D/NiiVue widgets, thumbnails,
faceted metadata search, browse UI.
- **Phase 4 — Collaboration.** Change Proposals, reviews, issues, annotations,
citable releases, datasheets.
- **Phase 5 — Pipelines.** Runners, event triggers, reproducible MONAI train/eval,
webhooks.
- **Later / optional.** DICOMweb + PACS adapter (QIDO-RS/WADO-RS/STOW-RS), image-embedding
similarity search (pgvector), whole-slide pathology.
---
## Appendix — naming parallels for orientation
`git clone` → `imagehub clone` · repository → dataset · commit → version ·
push/pull → push/pull · PR → change proposal · `.git/objects` → content-addressed
blob store · act_runner → pipeline runner · `app.ini` → config · XORM → SQLAlchemy.