sciagent/docs/security-incident-rcc-ump-2026-05-27.md
Thinh Lam 688fac73e9
sciagent code + Gitea Actions CI/CD
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 09:38:30 +07:00

# Security incident — rcc-ump.com (2026-05-27)
**Status:** Remediation in progress (code fixes tracked below)
**Scope:** Production exposure at `https://www.rcc-ump.com` / `https://rcc-ump.com`
**Related prior audit:** [assets/docs/2026-05-21-security-review.md](../assets/docs/2026-05-21-security-review.md)
---
## Executive summary
Public screenshots and `curl` tests show the production site was serving the **Vite development server**, not a built SPA. That exposes full TypeScript source, `import.meta.env` values (including internal Docker hostnames), stack traces, and HMR internals. Combined with backend misconfiguration (default JWT secret, unauthenticated API routes), this created a path from **reconnaissance → data theft → admin takeover → arbitrary file write**.
Treat the VPS as **compromised until forensics prove otherwise**. Rotate credentials and redeploy with the fixes in this document before bringing the site back.
---
## Evidence (what attackers saw)
| Observation | Confirms |
|---|---|
| DevTools Sources tree shows `@vite/client`, `@react-refresh`, `node_modules`, `/src/**` | Vite **dev** server on the public internet |
| `curl …/src/shared/api/client.ts` returns source with `DEV: true`, `VITE_DEV_PROXY_TARGET: http://be0:4402` | Env + internal service names leaked |
| `curl …/vite.config.ts` returns HTML error with full config + stack trace | Verbose dev error handling |
| `lovable-tagger` in plugin list | Dev-only tooling active |
**Root cause in repo:** `docker-compose.prod.yml` and `fe0/Dockerfile` run `npm run dev -- --host 0.0.0.0`.
---
## Findings → fixes (checklist)
Track implementation in git; check off after deploy to production.
### Step 0 — Incident response (ops, not code)
- [ ] Restrict public access (maintenance page / firewall) during remediation
- [ ] Rotate **Postgres**, **MinIO**, **SMTP**, and generate new **`JWT_SECRET`** (`openssl rand -base64 48`)
- [ ] Bump every user's `credential_version` in Postgres (invalidates old JWTs)
- [ ] Review `audit_events`, unknown admin users, MinIO objects, modified files under `./be0` / `./fe0`
- [ ] Bind MinIO console to localhost; do not expose `:9001` to the internet
- [ ] Purge `.env` from git history if the repo was ever shared (`git filter-repo`)
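Bumping `credential_version` only invalidates old JWTs if every token records the version it was issued under and the backend compares that value on each request. The sketch below illustrates the idea with the Python standard library only. The claim name `cv`, the helper names, and the in-memory version table are hypothetical and are not taken from the sciagent code; expiry handling is omitted for brevity.

```python
import base64
import hashlib
import hmac
import json


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    """Decode URL-safe base64, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(secret: bytes, signing_input: str) -> str:
    return _b64(hmac.new(secret, signing_input.encode(), hashlib.sha256).digest())


def sign_token(secret: bytes, user_id: str, credential_version: int) -> str:
    """Issue an HS256-style token that embeds the user's credential version."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps({"sub": user_id, "cv": credential_version}).encode())
    return f"{header}.{payload}.{_signature(secret, f'{header}.{payload}')}"


def verify_token(secret: bytes, token: str, current_versions: dict) -> bool:
    """Accept a token only if the signature is valid AND its version is current."""
    try:
        header, payload, sig = token.split(".")
        expected = _signature(secret, f"{header}.{payload}")
        if not hmac.compare_digest(sig.encode(), expected.encode()):
            return False
        claims = json.loads(_unb64(payload))
    except ValueError:  # wrong segment count, bad base64, or bad JSON
        return False
    if not isinstance(claims, dict):
        return False
    sub, cv = claims.get("sub"), claims.get("cv")
    if not isinstance(sub, str) or not isinstance(cv, int):
        return False
    # A bumped version in the database makes every older token stale.
    return current_versions.get(sub) == cv
```

With this shape, rotating `JWT_SECRET` rejects tokens signed with the old secret, and the version bump additionally rejects any token issued before the bump, even one signed with the new secret.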
### Step 1 — JWT and production mode ✅ (code)
| ID | Finding | Fix | Status |
|---|---|---|---|
| C-1 | `JWT_SECRET` unset → dev fallback signs tokens | `JWT_SECRET` + `ENVIRONMENT=production` in `docker-compose.prod.yml`; `verify-prod-env.sh` | **Done** |
| — | `ENVIRONMENT` never set in prod | Pass `ENVIRONMENT=production` to `be0` | **Done** |
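Finding C-1 came down to a silent fallback: with `JWT_SECRET` unset, a development secret signed production tokens. One way to rule that out is to fail closed at startup. The following is a minimal sketch, not the project's actual loader; the function name, the weak-secret list, and the minimum length are assumptions chosen for illustration.

```python
import os

# Hypothetical placeholders: the real dev fallback string is not shown in this document.
KNOWN_WEAK_SECRETS = {"", "changeme", "dev-secret"}
MIN_SECRET_LENGTH = 32  # assumed threshold, not a project setting


def load_jwt_secret(env=None) -> str:
    """Return the JWT signing secret, refusing weak values outside development."""
    env = os.environ if env is None else env
    # Fail closed: an unset ENVIRONMENT is treated as production.
    environment = env.get("ENVIRONMENT", "production")
    secret = env.get("JWT_SECRET", "")
    if environment != "development":
        if secret in KNOWN_WEAK_SECRETS or len(secret) < MIN_SECRET_LENGTH:
            raise RuntimeError("JWT_SECRET must be set to a strong value")
        return secret
    # Only an explicit ENVIRONMENT=development may use the insecure fallback.
    return secret or "dev-only-insecure-secret"
```

Defaulting to production when `ENVIRONMENT` is missing is a deliberate choice here: the second row of the table shows that the variable was never set in prod, so a default of "development" would have reproduced the incident.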
### Step 2 — Production frontend (stop source leak) ✅ (code)
| ID | Finding | Fix | Status |
|---|---|---|---|
| H-3 | Vite dev in production | `fe0/Dockerfile.prod` + `fe0/nginx/default.conf`; updated `docker-compose.prod.yml` | **Done** |
| — | Prod API URL pointed at localhost | Same-origin `/api` via nginx when `VITE_API_URL` unset | **Done** |
### Step 3 — Remove broken / dangerous endpoints ✅ (code)
| ID | Finding | Fix | Status |
|---|---|---|---|
| C-3 | `POST /upload_document` | **Removed** | **Done** |
| M-9 | `POST /get_page` | **Removed** | **Done** |
### Step 4 — Authenticate sensitive API routes ✅ (code)
| ID | Finding | Fix | Status |
|---|---|---|---|
| C-4 | `/api/v1/review-documents` CRUD | Login + owner/staff | **Done** |
| C-5 | `GET /api/applications` | Staff-only | **Done** |
| C-5 | `GET /api/applications/{id}` | Owner or staff | **Done** |
| H-1 | LLM / chat / ideas endpoints | Auth on all listed routes | **Done** |
| H-4 | `DELETE …/admin-result` auth order | Auth first | **Done** |
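The "Owner or staff" rule for `GET /api/applications/{id}` and the "Auth first" ordering from H-4 can be expressed as one small decision function. This is an illustrative sketch under assumed types: `User`, `Application`, and the returned status codes are hypothetical, and the real backend may answer a non-owner differently (for example with 404 to avoid confirming that the record exists).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    id: int
    is_staff: bool = False


@dataclass(frozen=True)
class Application:
    id: int
    owner_id: int


def authorize_application_read(user: Optional[User], application: Application) -> int:
    """Return the HTTP status an application read should produce."""
    # Authentication is checked first, before any property of the record is used.
    if user is None:
        return 401
    if user.is_staff or user.id == application.owner_id:
        return 200
    return 403
```

Checking authentication before touching the record keeps anonymous callers from learning anything about the resource from the response.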
### Step 5 — Hardening ✅ (code)
| ID | Finding | Fix | Status |
|---|---|---|---|
| H-5 | MinIO CORS `*` default | Required `MINIO_API_CORS_ALLOW_ORIGIN`; console on localhost | **Done** |
| H-7 | No login rate limit | `allow_login()` | **Done** |
| M-2 | No security headers | Middleware + nginx | **Done** |
| M-3 | CORS `*` risk | Fail startup if `*` in origins | **Done** |
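For H-7, the table names `allow_login()` but does not describe how it works. A sliding-window limiter is one common way to implement such a check. The class below is a sketch under that assumption; the class name, the limits, and the in-memory storage are hypothetical and would not survive a restart or span multiple backend processes.

```python
import time
from collections import defaultdict, deque


class LoginRateLimiter:
    """Allow at most `max_attempts` login attempts per key in a sliding window."""

    def __init__(self, max_attempts: int = 5, window_seconds: float = 60.0):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # key -> timestamps of recent attempts

    def allow_login(self, key: str, now: float = None) -> bool:
        """Record an attempt for `key` (e.g. IP or username) if it is within the limit."""
        now = time.monotonic() if now is None else now
        attempts = self._attempts[key]
        # Drop attempts that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True
```

Rejected attempts are not recorded in this sketch, so a blocked client regains access as soon as its oldest counted attempt leaves the window.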
**Deploy:** Rebuild `fe0` with `Dockerfile.prod`. Confirm DevTools no longer shows `@vite/client` or `/src/`. Replace placeholder `fe0/public/logo.svg` with your institution logo if needed.
**Deploy (Steps 1–5):** Update `.env` (see below), run `./scripts/verify-prod-env.sh`, then:
```bash
docker compose --env-file .env -f docker-compose.prod.yml up -d --build
```
Recreate `be0` after setting `JWT_SECRET` and bump all users' `credential_version` in Postgres.
Further hardening not covered by Steps 1–5:
- Non-root Docker users; remove prod bind-mounts of `./be0` / `./fe0` source
- HttpOnly refresh tokens; shorten JWT TTL
- Upgrade `xlsx`; pin `pip install` at image build time
- Auth audit test: every mutating route must have auth dependency
- Add `SECURITY.md` disclosure policy
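The auth audit test in the list above can be sketched independently of any web framework: collect the registered routes, then fail if a mutating route has no auth dependency and is not on an explicit public allowlist. The `Route` type, the dependency names, and the `/api/auth/login` path are hypothetical; a real test would build the route list from whatever the backend framework exposes.

```python
from dataclasses import dataclass

MUTATING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


@dataclass(frozen=True)
class Route:
    method: str
    path: str
    dependencies: tuple = ()  # names of the dependencies attached to the route


def find_unprotected_routes(routes, auth_dependencies, public_allowlist=frozenset()):
    """Return 'METHOD path' for each mutating route without an auth dependency."""
    required = set(auth_dependencies)
    missing = []
    for route in routes:
        method = route.method.upper()
        if method not in MUTATING_METHODS:
            continue
        if (method, route.path) in public_allowlist:
            continue  # intentionally public, e.g. the login endpoint itself
        if required.isdisjoint(route.dependencies):
            missing.append(f"{method} {route.path}")
    return missing
```

Asserting that the returned list is empty turns a finding such as C-4 into a regression test: a newly added mutating route fails CI until it either gains an auth dependency or is added to the allowlist on purpose.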
---
## Production architecture (target)
```
Browser (HTTPS, external nginx/Caddy on VPS)
  ├─► fe0 container (nginx :8080) ── static files from dist/
  │     proxy /api/* ──► be0:4402 (Docker network)
  │     proxy /submitted-initiatives/ ──► be0:4402
  ├─► be0 bound 127.0.0.1:4402 on host (not public)
  ├─► postgres bound 127.0.0.1:15432
  └─► MinIO API (TLS via reverse proxy); console localhost-only
```
External TLS termination (Certbot/Caddy on the VPS) sits in front of `${FE_PORT}`. See [deploy-production-docker.md](./deploy-production-docker.md) and [minio-behind-https.md](./minio-behind-https.md).
---
## Verification after deploy
### Automated tests (run before deploy)
```bash
# Backend — includes 17 security regression tests
cd be0 && python -m pytest tests/ -q
# Frontend unit tests + env config
cd fe0 && npm test && npm run build
# Production .env validation script
./scripts/test-verify-prod-env.sh
```
### Production smoke checks
```bash
# 1. Env validation
./scripts/verify-prod-env.sh
# 2. Stack healthy
docker compose --env-file .env -f docker-compose.prod.yml ps
# 3. No Vite dev artifacts (expect 404, not TS source)
curl -sS -o /dev/null -w "%{http_code}\n" https://www.rcc-ump.com/src/main.tsx
# 4. Unauthenticated PII blocked (expect 401)
curl -sS -o /dev/null -w "%{http_code}\n" https://www.rcc-ump.com/api/applications
# 5. JWT not forgeable: log in with a real user to confirm valid tokens still work,
#    then check that admin routes reject an invalid token (expect 401 or 403, not 200)
curl -sS -o /dev/null -w "%{http_code}\n" \
-H "Authorization: Bearer invalid" \
https://www.rcc-ump.com/api/v1/admin/audit-events
```
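The status-code checks above can also be scripted so that a deploy pipeline fails automatically. The sketch below assumes the same three URLs as the `curl` commands; the function names are hypothetical, and the accepted codes for the admin route (401 or 403) are an assumption, since the document does not state the expected value for that check.

```python
import urllib.error
import urllib.request

BASE_URL = "https://www.rcc-ump.com"

# Path -> set of acceptable status codes.
EXPECTED = {
    "/src/main.tsx": {404},                     # no Vite dev source
    "/api/applications": {401},                 # unauthenticated PII blocked
    "/api/v1/admin/audit-events": {401, 403},   # assumption: either rejection is fine
}


def fetch_status(url: str, token: str = None, timeout: float = 10.0) -> int:
    """Return the HTTP status for a GET request, including error statuses."""
    request = urllib.request.Request(url)
    if token is not None:
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx/5xx responses arrive as exceptions


def evaluate_smoke_checks(observed: dict) -> list:
    """Compare observed statuses with EXPECTED and return failure messages."""
    failures = []
    for path, allowed in EXPECTED.items():
        status = observed.get(path)
        if status not in allowed:
            failures.append(f"{path}: got {status}, expected one of {sorted(allowed)}")
    return failures
```

A caller would build `observed` by calling `fetch_status(BASE_URL + path)` for each path (passing `token="invalid"` for the admin route) and exit non-zero if `evaluate_smoke_checks` returns any failures.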
---
## `.env` additions required for production
```bash
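# Generate once with `openssl rand -base64 48` and paste the output below.
# An env file is not evaluated by a shell, so $(...) would be stored as a literal string.
JWT_SECRET=<paste-generated-value>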
# Restrict MinIO browser CORS to your SPA origin (scheme + host, no trailing slash):
MINIO_API_CORS_ALLOW_ORIGIN=https://www.rcc-ump.com
# Public app URL (emails, CORS extras):
AUTH_PUBLIC_WEB_ORIGIN=https://www.rcc-ump.com
CORS_ORIGINS_EXTRA=https://www.rcc-ump.com
```
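The origin values above must be exact origins (scheme plus host, no trailing slash), and M-3 requires startup to fail if `*` appears. A validation sketch along those lines follows. The function names are hypothetical, and treating `CORS_ORIGINS_EXTRA` as a comma-separated list is an assumption about the backend's parsing.

```python
from urllib.parse import urlsplit


def validate_origin(origin: str) -> str:
    """Return the origin unchanged if it is an exact scheme://host[:port] value."""
    origin = origin.strip()
    if "*" in origin:
        raise ValueError(f"wildcard origin not allowed: {origin!r}")
    parts = urlsplit(origin)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"origin needs a scheme and host: {origin!r}")
    # A trailing slash shows up as path "/", which browsers never send in Origin.
    if parts.path or parts.query or parts.fragment:
        raise ValueError(f"origin must not include a path or trailing slash: {origin!r}")
    return origin


def parse_origins(raw: str) -> list:
    """Split a comma-separated origin list and validate every entry."""
    return [validate_origin(item) for item in raw.split(",") if item.strip()]
```

Browsers compare the `Origin` header as an exact string, so an entry such as `https://www.rcc-ump.com/` would never match; rejecting it at startup surfaces the mistake before users hit CORS errors.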
After first deploy with `JWT_SECRET`, run SQL (or admin script) to increment `credential_version` for all users.
---
*Last updated: 2026-05-27 — update checkboxes as fixes land on `main` and production.*