sciagent code + Gitea Actions CI/CD

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 09:38:30 +07:00
commit 688fac73e9
1167 changed files with 158244 additions and 0 deletions
@@ -0,0 +1,185 @@
+# OTP-verified registration: workflow & state machine (engineering guide)
+
+This document describes **email OTP verification during signup** end-to-end: **frontend**, **backend**, **database**, and **when object storage (e.g. MinIO) is relevant**. It is written so another team can implement the same pattern in a **new** application; it aligns with the reference implementation in this repo (`be0/src/auth_api.py`, `be0/src/auth_mail.py`, `fe0/src/auth/registration/`, migrations `013_*` / `014_*`).
+
+---
+
+## 1. Goals & threat model (what we optimize for)
+
+| Goal | How |
+|------|-----|
+| Prove control of the inbox before full account use | `users.email_verified = false` until OTP succeeds; **login denied** until verified. |
+| Avoid storing OTP plaintext | Persist **only a hash** of the OTP (same idea as password storage). |
+| Limit brute-force on OTP | Cap **failed attempts** per pending code; use **constant-time** compare on hash. |
+| Limit abuse of “resend” | Rate-limit **per email** and **per IP** (see §7). |
+| Avoid email enumeration on resend | Return **same JSON envelope** whether the address exists or needs OTP (reference pattern). |
+
+**MinIO / S3:** OTP delivery does **not** use object storage. Include MinIO in your design doc **only** if registration itself uploads files (avatars, documents); keep OTP and mail independent of buckets.
+
+---
+
+## 2. Account-level state machine (`users`)
+
+After successful `POST /auth/register`, the server creates an **active** user with **`email_verified = false`**. Verification flips the flag; login is allowed only when **`email_verified = true`**.
+
+```mermaid
+stateDiagram-v2
+  [*] --> NoAccount: visitor
+  NoAccount --> RegisteredUnverified: POST /register OK\n(password hashed, user row created)
+  RegisteredUnverified --> Verified: POST /verify-otp OK\n(email_verified := true)
+  RegisteredUnverified --> RegisteredUnverified: POST /resend-otp\n(new OTP row, old pending superseded)
+  RegisteredUnverified --> NoAccount: optional admin deactivate/\ndelete user (app-specific)
+
+  Verified --> [*]: normal user lifecycle\n(login, JWT, etc.)
+
+  note right of RegisteredUnverified
+    Login with correct password
+    returns 403 until verified
+    (do not leak OTP validity)
+  end note
+```
+
+**Engineering rules:**
+
+- **Register** must be **atomic**: create user + staff/profile as needed + insert OTP row + commit before sending email (or you risk orphan users without OTP).
+- **Login** when password OK but `email_verified` is false: return **403** with a message that tells the user to complete OTP / resend (reference copy mentions OTP resend on registration flow).
+
+---
+
+## 3. OTP row lifecycle (`registration_otp_codes`)
+
+Each issued OTP is one row. Only **one logical “pending”** code per user is typical: **issue** deletes previous unused rows for that user, then inserts a new row.
+
+```mermaid
+stateDiagram-v2
+  [*] --> Pending: issue OTP\n(plaintext only in memory/email)
+  Pending --> Consumed: verify OK\nused_at := now,\nuser.email_verified := true
+  Pending --> InvalidWrongCode: wrong OTP\nfailed_attempts += 1
+  InvalidWrongCode --> Pending: failed_attempts < max
+  InvalidWrongCap --> [*]: failed_attempts >= max\n(row ignored for verify)
+  Pending --> Superseded: resend or re-register issue\n(DB deletes unused rows first)
+  Pending --> Expired: now > expires_at\n(row ignored for verify)
+  Consumed --> [*]
+
+  state InvalidWrongCap <<choice>>
+```
+
+**Reference schema** (`014_registration_otp.sql`):
+
+- `user_id` → `users(id)`
+- `otp_hash` — never store plaintext OTP
+- `expires_at` — server-side TTL (e.g. env-configurable minutes, clamped to a sane max)
+- `failed_attempts` — increment on mismatch; **reject verify** when ≥ max (e.g. 5)
+- `used_at` — non-null means consumed
+
+**Verify query logic (conceptual):**
+
+1. Resolve user by **normalized email**, active, **`email_verified` still false** — otherwise reject with **generic** error (same message as bad OTP if you want strict anti-enumeration on verify; reference uses a uniform rejection detail).
+2. Select **latest** pending row where `used_at IS NULL`, `expires_at > now()`, `failed_attempts < max`.
+3. Compare `hash(submitted_otp)` to `otp_hash` with **constant-time** comparison (`hmac.compare_digest` in Python).
+4. On match: set `user.email_verified = true`, `row.used_at = now()`, audit log, commit.
+5. On mismatch: increment `failed_attempts`, commit, reject.
+
+---
+
+## 4. Frontend state machine (wizard UX)
+
+The SPA typically uses **local UI states**, not the server’s DB state diagram:
+
+```mermaid
+stateDiagram-v2
+  [*] --> Form: land on registration
+  Form --> Form: client validation\n(email domain, password policy, etc.)
+  Form --> Otp: POST /register 2xx\nemailVerificationRequired === true
+  Form --> Form: 4xx/5xx show rejection
+
+  Otp --> Success: POST /verify-otp OK
+  Otp --> Otp: verify 4xx clear inputs/\nshow generic error
+  Otp --> Otp: POST /resend-otp\n(+ cooldown timer UX)
+
+  Success --> Login: navigate / user signs in
+```
+
+**Contract hints for engineers:**
+
+- Read **`otpTtlSeconds`** from register response and drive countdown (stay in sync with server TTL; avoid hardcoding if backend can change TTL via env).
+- **`otpDeliveryChannel`** (or equivalent): `smtp` | `log_only` | `none` | `smtp_failed` — drives toasts/help text (SMTP missing, dev log-only, SMTP error).
+- **Resend UX cooldown** on the client is **supplementary**; server **must** enforce rate limits independently.
+- OTP input: **exactly 6 digits** if matching this API; paste-friendly fields improve UX.
+
+---
+
+## 5. Backend API surface (minimal)
+
+| Method | Path | Purpose |
+|--------|------|---------|
+| `POST` | `/auth/register` | Create user with `email_verified=false`; issue OTP; attempt email delivery; return `{ otpTtlSeconds, otpDeliveryChannel, emailVerificationRequired, ... }`. |
+| `POST` | `/auth/verify-otp` | Body: `{ email, otp }` — normalized email + `\d{6}` OTP; on success set verified. |
+| `POST` | `/auth/resend-otp` | Body: `{ email }` — enumeration-safe constant response; **429** if rate-limited. |
+| `POST` | `/auth/login` | Reject **403** when password OK but email not verified (reference behavior). |
+
+**Optional legacy path in this repo:** `POST /auth/verify-email` with **magic-link token** (`email_verification_tokens`). New apps should prefer **either** OTP **or** magic link consistently to reduce confusion.
+
+---
+
+## 6. Email delivery (no MinIO)
+
+Outbound mail is **SMTP** or **dev log**:
+
+- **Production:** `SMTP_HOST`, `SMTP_USER`, `SMTP_PASSWORD`, `SMTP_PORT`, `SMTP_USE_TLS`, `AUTH_MAIL_FROM`, optional `AUTH_PUBLIC_WEB_ORIGIN` for links in other mails.
+- **Dev:** `AUTH_MAIL_LOG_ONLY=1` — log plaintext OTP (**never** enable in production).
+
+If SMTP fails **after** user creation, return a distinct delivery channel (e.g. `smtp_failed`) so the client can explain “account created but email failed”; ops fix SMTP and user uses **resend**.
+
+---
+
+## 7. Rate limiting & scaling caveats
+
+Reference implementation uses **in-process** buckets (`auth_rate_limit.py`): e.g. **5 resends per email per hour** and **30 per IP per hour** (rolling window).
+
+**Greenfield checklist:**
+
+- Single replica: in-memory limits are simple.
+- Multiple API workers / pods: use **Redis** (or API gateway limits) for shared counters; tune limits per product policy.
+- Always rate-limit **resend** harder than **verify** if verify is already capped by `failed_attempts`.
+
+---
+
+## 8. Database prerequisites
+
+1. **`users.email_verified`** — `BOOLEAN NOT NULL DEFAULT FALSE` (migration `013_email_verification.sql` pattern).
+2. **`registration_otp_codes`** — table as in §3 (`014_registration_otp.sql` pattern).
+3. **Indexes** — partial index on pending OTP by `user_id` speeds lookup.
+
+**MinIO:** no migration needed for OTP. Add bucket policies only if registration uploads binaries.
+
+---
+
+## 9. Observability & security checklist
+
+- **Audit:** log verification success and resend with actor metadata (no OTP plaintext in audit).
+- **Logs:** mail failures at `register` / `resend` with exception detail for ops; avoid logging hashed OTP.
+- **HTTPS:** registration and OTP over TLS; HttpOnly cookies if using cookie-based sessions.
+- **CORS:** allow credentials only for trusted front-end origins.
+- **JWT:** issue tokens **after** login; do not put OTP or `email_verified=false` users into a “logged-in” state unless your product explicitly wants partial tokens (reference does not).
+
+---
+
+## 10. Reference file map (this repository)
+
+| Layer | Location |
+|-------|----------|
+| Register / verify / resend / login gate | `be0/src/auth_api.py` |
+| SMTP + `AUTH_MAIL_LOG_ONLY` | `be0/src/auth_mail.py` |
+| Resend rate limits | `be0/src/auth_rate_limit.py` |
+| OTP table migration | `be0/migrations/014_registration_otp.sql` |
+| `email_verified` + magic-link tokens | `be0/migrations/013_email_verification.sql` |
+| Registration UI + delivery UX | `fe0/src/auth/registration/RegistrationWithOtp.tsx` |
+| TTL/cooldown constants (keep aligned) | `fe0/src/auth/registration/constants.ts` |
+| API client | `fe0/src/lib/auth-service.ts` |
+
+---
+
+## 11. Summary one-liner for architects
+
+**OTP registration is a small state machine on top of `users.email_verified` plus short-lived hashed OTP rows, sent over SMTP (not object storage), with gated login and abuse limits on resend—implement the same invariants even if frameworks differ.**