# Production Docker deployment (`docker-compose.prod.yml`)

This guide walks through **common failures** when running the prod-style stack locally or on a VPS, in a fixed order: validate environment, reconcile Postgres credentials with the Docker volume, then confirm frontend wiring.

**Stack topology (frontend → backend → DB → MinIO):** [deploy-stack-overview.md](./deploy-stack-overview.md)

Related files: `.env.example` (copy to `.env`), `scripts/deploy-prod.sh`, `scripts/verify-prod-env.sh`.

---

## 1. `.env` in the repo root (cloud / VPS)

Docker Compose substitutes `${PUBLIC_HOST}`, `${POSTGRES_USER}`, etc. from a file named `.env` in the **same directory** as `docker-compose.prod.yml` (or from `--env-file` when you use the deploy script).

### It may already be there: plain `ls` hides it

By default, `ls` does **not** list dotfiles, so a file named `.env` will **not** show up unless you ask for hidden files or test for it directly:

```bash
ls -a                    # lists .env alongside . and ..
test -f .env && echo ok  # prints "ok" only if the file exists
```

### Create it when it is missing

From the repo root on the server:

```bash
cp .env.example .env
nano .env        # or vim / your editor: set PUBLIC_HOST, secrets, Postgres identifiers (see section 3 below)
chmod 600 .env   # optional: allow only the file's owner to read or write it
```
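
Before starting Compose by hand, it can help to confirm that `.env` exists and that the variables Compose interpolates are populated. The following is a minimal sketch of such a pre-flight check, not the contents of `scripts/deploy-prod.sh`. The function name `require_env_file` is hypothetical, and the check only understands simple `KEY=value` lines.

```bash
# Hypothetical pre-flight helper (an illustration, not part of the repo).
# Usage: require_env_file [path] [VAR...]
require_env_file() {
  local env_file="${1:-.env}"
  shift || true
  if [ ! -f "$env_file" ]; then
    echo "missing $env_file: copy .env.example and fill it in" >&2
    return 1
  fi
  local name pattern missing=0
  for name in "$@"; do
    # KEY=value with an optional opening quote and at least one value character.
    pattern="^${name}=[\"']?[^\"'[:space:]]"
    if ! grep -Eq "$pattern" "$env_file"; then
      echo "empty or unset in $env_file: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (from the repo root):
#   require_env_file .env PUBLIC_HOST POSTGRES_USER POSTGRES_DB POSTGRES_PASSWORD
```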

`./scripts/deploy-prod.sh` refuses to run if `.env` is absent. If you start Compose by hand **without** a `.env` file, the `${POSTGRES_*}` variables interpolate to empty strings, and Postgres health checks and connections can misbehave. Always keep a populated `.env` next to the compose file.

---

## 2. Run validation before compose

Fix every failure the script reports before restarting containers.

```bash
./scripts/verify-prod-env.sh
```

`verify-prod-env.sh` rejects:

- Empty `PUBLIC_HOST`, ports, MinIO or Postgres variables.
- `POSTGRES_USER` / `POSTGRES_DB` that are not plain SQL identifiers (letters, digits, and underscores only; no `!`, spaces, or Unicode).
- `POSTGRES_PASSWORD` containing `@`, `:`, `/`, or `%`, any of which breaks `INITIATIVE_DATABASE_URL` because Compose assembles it without URL-encoding.

If `deploy-prod.sh` exits early, rerun `verify-prod-env.sh` and edit `.env` until it prints `OK`.
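
The identifier and password rules above can be sketched as two small Bash predicates. This is an illustration under assumptions, not the contents of `scripts/verify-prod-env.sh`: the function names are hypothetical, and rejecting a leading digit is an extra assumption based on how PostgreSQL treats unquoted identifiers.

```bash
# Hypothetical helpers mirroring the rules above; the real checks live in
# scripts/verify-prod-env.sh and may differ.

# True when the value is a plain SQL identifier: a letter or underscore,
# followed by letters, digits, or underscores (ASCII only).
is_plain_identifier() {
  local LC_ALL=C
  [[ "$1" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]
}

# True when the password is non-empty and contains none of the characters
# that change the meaning of an unencoded postgresql:// URL.
is_url_safe_password() {
  [[ -n "$1" && "$1" != *[@:/%]* ]]
}

# Example:
#   is_plain_identifier "$POSTGRES_USER" || echo "bad POSTGRES_USER" >&2
#   is_url_safe_password "$POSTGRES_PASSWORD" || echo "bad POSTGRES_PASSWORD" >&2
```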

---

## 3. Postgres — `FATAL: role "<name>" does not exist`

### Why it happens

The official Postgres image **creates `POSTGRES_USER` and `POSTGRES_DB` only when the data directory is empty** (first start of the named volume). After that, changing `.env` does **not** rename or recreate roles inside the volume.
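
That first-start behaviour amounts to a single check on the data directory. The sketch below models it. It is not the image's entrypoint, the function name `first_start_action` is hypothetical, and keying on a non-empty `PG_VERSION` file is an assumption about how an existing cluster is detected.

```bash
# Simplified model of the first-start decision (NOT the image's entrypoint).
# Prints "init" when the data directory looks uninitialized, else "skip".
first_start_action() {
  local pgdata="$1"
  if [ -s "$pgdata/PG_VERSION" ]; then
    echo "skip"   # existing cluster: POSTGRES_* values and init scripts are ignored
  else
    echo "init"   # empty volume: create POSTGRES_USER / POSTGRES_DB, run init scripts
  fi
}
```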

Typical triggers:

| Situation | Result |
|-----------|--------|
| Volume was initialized with `POSTGRES_USER=initiative`; `.env` now uses a different username | The existing cluster has role `initiative`, not your new name. |
| Username with special characters (`user_pkhcn2025!`) | Some setups never created the role cleanly. Use plain identifiers (see the validation rules in section 2). |

### Fix (pick one track)

**A. Keep existing data: align `.env` with the roles that already exist**

1. Stop the stack, then find the volume name Compose uses:

```bash
docker compose --env-file .env -f docker-compose.prod.yml down
docker volume ls | grep initiative_pg_data
```

The name looks like `<project>_initiative_pg_data`: Compose prefixes the volume with the project name, which defaults to the name of your project directory.

2. Start only Postgres temporarily with a `.env` that matches credentials you **know** worked on first bootstrap (often your dev values from `docker-compose.yml`: user `initiative`, DB `initiatives`):

```bash
docker compose --env-file .env -f docker-compose.prod.yml up -d postgres
```

3. List roles inside the cluster (substitute `-U`/`-d`/`PGPASSWORD` to match credentials that succeed):

```bash
docker compose --env-file .env -f docker-compose.prod.yml exec postgres \
  psql -U initiative -d initiatives -c '\du'
```

Set `POSTGRES_USER` / `POSTGRES_DB` / `POSTGRES_PASSWORD` in `.env` to match an existing role and database. Do **not** change the username alone: it must name a login role that already exists.

**Password-only mismatch:** If the role and database names are already correct but someone changed `POSTGRES_PASSWORD` in `.env` after the volume was first created, run from the repo root (with `postgres` running):

```bash
./scripts/sync-postgres-app-password.sh
```

The script runs `ALTER ROLE … PASSWORD` to match `.env` when `psql` inside the container can connect without the old password (typical with the official image’s local socket rules). If it fails, use the `psql` steps above with credentials that still work, or re-init the volume (**B**). Optionally, set `POSTGRES_SUPERUSER` in `.env` if you must connect as another superuser (e.g. `postgres`).

**B. You can afford to lose Postgres data: re-init the volume**

1. Stop the stack and remove the volume (this **deletes** all DB data):

```bash
docker compose --env-file .env -f docker-compose.prod.yml down
docker volume rm <project>_initiative_pg_data   # exact name from `docker volume ls`
```

2. Ensure `./scripts/verify-prod-env.sh` passes.

3. Bring the stack up fresh so the scripts in `docker-entrypoint-initdb.d/` run:

```bash
./scripts/deploy-prod.sh
```

**C. Rename or add roles without wiping data (advanced)**

Connect as your **currently working** database superuser, then do one of the following:

- `ALTER ROLE initiative RENAME TO new_name;`
- Create a parallel role/password with matching grants if your app expects a dedicated user only.

Operational details vary with your retention and backup policy; follow your DBA playbook if you have one.

---

## 4. Frontend (`fe0`) — port mismatch (host cannot reach UI)

Compose maps **`${FE_PORT}:8080`**, so the dev server must listen on **port 8080** inside `fe0`.

Vite defaults to **5173** if nothing overrides it. Previously, that meant the mapped port forwarded to nothing or to the wrong listener, so the UI was unreachable from the host.

### Required state

[Vite](../fe0/vite.config.ts) must set:

- `server.port: 8080`
- `server.host: '0.0.0.0'`

Compose and the Dockerfile also pass `npm run dev -- --host 0.0.0.0 --port 8080`, so bind-mounted trees without an updated `vite.config.ts` still match `${FE_PORT}:8080`.

If logs show:

```text
Local: http://localhost:5173/
```

fix `vite.config.ts` so the dev server uses **8080**, then recreate or restart `fe0`.

After that, browsers use:

```text
http://${PUBLIC_HOST}:${FE_PORT}
```
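
The log check described in this section can be automated by pulling the port out of Vite's `Local:` banner and comparing it with 8080. A minimal sketch, assuming plain-text logs without colour codes; `vite_local_port` is a hypothetical helper name.

```bash
# Hypothetical helper: read log text on stdin and print the port from the
# first "Local: http://host:port/" line, or nothing if no such line exists.
vite_local_port() {
  sed -nE 's#.*Local:[[:space:]]+https?://[^:/]+:([0-9]+).*#\1#p' | head -n 1
}

# Example:
#   port="$(docker compose --env-file .env -f docker-compose.prod.yml logs fe0 | vite_local_port)"
#   [ "$port" = "8080" ] || echo "fe0 listens on ${port:-unknown}, expected 8080" >&2
```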

---

## 5. Different IPs in logs (`fe0` vs MinIO)

This is usually **correct**, not contradictory:

| Log line | Meaning |
|----------|---------|
| `fe0` “Network”: `http://10.5.0.x:…` | **Static container IP** on Compose bridge `profyt-net` (`docker-compose.prod.yml` `ipv4_address`). |
| MinIO banner: `http://<PUBLIC_HOST>:19000` | **Public/browser URL**, from `MINIO_SERVER_URL` using `PUBLIC_HOST` and `MINIO_API_PORT`. |

`be0` still talks to MinIO as `http://minio:9000` internally; browsers use `${PUBLIC_HOST}` unless you override the presign endpoint with **`S3_PUBLIC_ENDPOINT_URL`**.

When the UI is served over **`https://`**, browsers block embedded plain **`http://…:${MINIO_API_PORT}`** presigned URLs as **mixed content**. In-app PDF preview can use **`GET …/evidence/content`**; for direct presigned links in the browser, terminate TLS on the MinIO API host and set **`S3_PUBLIC_ENDPOINT_URL`** / **`MINIO_SERVER_URL`** to that **`https://…`** base. See **[minio-behind-https.md](./minio-behind-https.md)** and **`deploy/nginx/minio-s3-proxy.conf.example`**.

---

## 6. Operational checklist after changes

```bash
./scripts/verify-prod-env.sh
docker compose --env-file .env -f docker-compose.prod.yml config >/dev/null
./scripts/deploy-prod.sh   # or run Compose `up` without -d for foreground logs
docker compose --env-file .env -f docker-compose.prod.yml ps
```

For Postgres persistence issues, skim **section 3** before editing `.env` again.

---

## 7. Postgres — `relation "audit_events" does not exist`

### Why it happens

The Postgres image runs the scripts in `docker-entrypoint-initdb.d` **only when the data volume is empty**. If the volume was created **before** `008_audit_events.sql` existed in compose, that migration never ran. **`be0`** then fails when it tries to write audit rows.

### Fix

**After pulling a current `be0` image / repo:** restart **`be0`**. On startup, `scripts/apply_initiative_migrations.py` applies **`008_audit_events.sql`** automatically if `public.audit_events` is missing (same pattern as migration 009).

Or apply it by hand from the repo root on the server (adjust user/db to match `.env`):

```bash
# ${POSTGRES_USER} and ${POSTGRES_DB} are expanded by your host shell, not read
# from .env: export them first or replace them with the literal values.
docker compose --env-file .env -f docker-compose.prod.yml exec -T postgres \
  psql -U "${POSTGRES_USER}" -d "${POSTGRES_DB}" \
  < be0/migrations/008_audit_events.sql
```
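
To check whether the table is present before or after applying the migration, one option is to ask PostgreSQL for it with `to_regclass`, which returns NULL (an empty line under `psql -tA`) when the relation does not exist. A minimal sketch: `audit_events_exists` is a hypothetical helper that takes the `psql` invocation as its arguments, so the user and database still have to match `.env`.

```bash
# Hypothetical helper: run the given psql command and succeed only if
# public.audit_events exists. Returns 2 if the psql command itself fails.
audit_events_exists() {
  local out
  out="$("$@" -tAc "SELECT to_regclass('public.audit_events')")" || return 2
  [ -n "$out" ]
}

# Example (replace initiative / initiatives with the values from .env):
#   audit_events_exists docker compose --env-file .env -f docker-compose.prod.yml \
#     exec -T postgres psql -U initiative -d initiatives \
#     && echo "audit_events present" || echo "audit_events missing or psql failed"
```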

---

## 8. Large uploads — `413 Request Entity Too Large` (evidence PDF, etc.)

The app allows evidence up to **50 MB** end-to-end, but **HTTPS reverse proxies** (such as nginx in front of `www.rcc-ump.com`) often default to **`client_max_body_size 1m`**, which rejects a **multi-megabyte** PDF **before** Docker sees the request. The browser console may show nginx's HTML error page, which contains a comment about a “friendly error page”.

### Fix (nginx)

In the `server { }` block (or the `location` that proxies to your `fe0` port), set at least:

```nginx
client_max_body_size 64m;
```

Reload nginx after editing. If uploads are slow, you may also need longer timeouts on the same `location`:

```nginx
proxy_read_timeout 300s;
proxy_connect_timeout 300s;
proxy_send_timeout 300s;
```

If **Cloudflare** (or another CDN) sits in front of the origin, confirm it does not impose a smaller upload limit than nginx.

**Note:** Browsers hit **`fe0`** (Vite proxy `/api` → `be0`). The body limit must allow the full multipart upload on the **first** hop (usually nginx → origin), not only inside Docker.
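
The arithmetic behind the limit is easy to check: `1m` is 1,048,576 bytes, far below a 50 MB upload, while `64m` leaves headroom. The sketch below converts an nginx size value to bytes and compares it with an upload size. The function names are hypothetical, only the `k` and `m` suffixes are handled, 50 MB is taken as 50 × 1024 × 1024 bytes, and `0` is treated as unlimited, as nginx does for `client_max_body_size`.

```bash
# Hypothetical helper: convert an nginx size such as 512k, 1m, or 64m to bytes.
nginx_size_to_bytes() {
  local value="$1"
  case "$value" in
    *[kK]) echo $(( ${value%[kK]} * 1024 )) ;;
    *[mM]) echo $(( ${value%[mM]} * 1024 * 1024 )) ;;
    *)     echo $(( value )) ;;
  esac
}

# Succeeds when an upload of the given size (bytes) fits under the limit.
body_limit_allows() {
  local limit
  limit="$(nginx_size_to_bytes "$1")"
  [ "$limit" -eq 0 ] || [ "$2" -le "$limit" ]
}

# Example: is a 50 MiB upload accepted?
#   body_limit_allows 1m  $(( 50 * 1024 * 1024 )) || echo "1m rejects it"
#   body_limit_allows 64m $(( 50 * 1024 * 1024 )) && echo "64m accepts it"
```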