Simplified Tech Stack for Local Governance Layer
Analysis & Simplification Strategy
Key Observations
- Local Application Context: Single-server deployment, not distributed
- Existing Stack: Already using FastAPI + PostgreSQL
- Complexity Overkill: Enterprise tools (Kafka, Camunda, Elasticsearch) are unnecessary for local deployment
- Core Needs: State machine, rules engine, document storage, audit logging
Simplified Tech Stack Recommendation
✅ Core Stack (Keep These)
| Component | Technology | Rationale |
|---|---|---|
| Database | PostgreSQL 15+ | ✅ Already in use, supports JSONB, excellent for local deployment |
| API Framework | FastAPI (Python) | ✅ Already in use, fast, async, great for this use case |
| Document Storage | Local filesystem + PostgreSQL (metadata) | ✅ Simple, no external service needed |
| Business Rules | Custom Python classes/functions | ✅ Lightweight, maintainable, no external engine needed |
🔄 Replace Complex Components
| Original Suggestion | Simplified Alternative | Why |
|---|---|---|
| Camunda/Temporal | Custom state machine (Python) | Simple workflow states, no need for enterprise orchestration |
| Elasticsearch + ML | PostgreSQL full-text search + pg_trgm (trigram similarity) | Built-in, sufficient for duplicate detection |
| Apache Kafka/RabbitMQ | PostgreSQL NOTIFY/LISTEN or in-memory event queue | Simple pub/sub, no separate service |
| AWS S3/MinIO | Local filesystem with organized folders | Direct file storage, simpler for local |
| Drools | Python rule functions/classes | More maintainable, easier to debug |
Recommended Simplified Architecture
1. Database Layer
# Single PostgreSQL database with:
- Core tables (initiatives, authors, reviews, etc.)
- JSONB columns for flexible metadata
- Full-text search indexes (GIN indexes on text fields)
- pg_trgm extension for similarity matching
Benefits:
- No additional services
- ACID compliance
- Built-in full-text search
- Trigram similarity for duplicate detection
2. Business Rules Engine
# Custom Python classes (interfaces only; bodies omitted)
class NoveltyChecker:
    def check(self, initiative: Initiative) -> ValidationResult: ...

class ScoringEngine:
    def calculate_score(self, reviews: List[Review]) -> Score: ...

class WorkflowStateMachine:
    def transition(self, initiative: Initiative, action: str) -> State: ...
Benefits:
- Easy to test and debug
- No external dependencies
- Version control friendly
- Can be extended incrementally
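One way to keep such rules testable is to write each rule as a small function that returns error messages and to combine them with a runner. The sketch below is illustrative only: the `Initiative` fields, the two rules, and the `validate` helper are assumptions, not the project's actual entities or rules.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ValidationResult:
    ok: bool
    errors: List[str] = field(default_factory=list)


@dataclass
class Initiative:
    # Hypothetical fields; the real entity is not defined in this document.
    title: str
    author_shares: Dict[str, float]  # author name -> contribution percentage


# A rule takes an initiative and returns a list of error messages (empty if it passes).
Rule = Callable[[Initiative], List[str]]


def title_present(initiative: Initiative) -> List[str]:
    return [] if initiative.title.strip() else ["title must not be empty"]


def shares_total_100(initiative: Initiative) -> List[str]:
    # Hypothetical rule: contribution percentages must add up to 100.
    total = sum(initiative.author_shares.values())
    if abs(total - 100.0) < 1e-6:  # tolerate floating-point rounding
        return []
    return [f"author shares total {total}, expected 100"]


def validate(initiative: Initiative, rules: List[Rule]) -> ValidationResult:
    # Run every rule and collect all messages instead of stopping at the first failure.
    errors = [msg for rule in rules for msg in rule(initiative)]
    return ValidationResult(ok=not errors, errors=errors)
```

Because each rule is a plain function, a new rule can be added to the list without touching the runner, and each one can be unit-tested in isolation.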
3. Workflow Engine
# Simple state machine
class InitiativeWorkflow:
    STATES = ['DRAFT', 'SUBMITTED', 'UNIT_REVIEW', ...]
    TRANSITIONS = {
        'DRAFT': ['SUBMITTED'],
        'SUBMITTED': ['UNIT_REVIEW', 'REJECTED'],
        # ...
    }

    def can_transition(self, from_state, to_state, user_role):
        # Check permissions and business rules
        pass
Benefits:
- No external workflow engine
- Easy to understand and modify
- Can store state in database
- Lightweight
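The permission check can be sketched as a lookup in two tables: one for allowed state changes and one for the roles permitted to make them. Only the transitions listed above are included here; the role names, the `ALLOWED_ROLES` table, and the `InvalidTransition` exception are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Only the transitions shown in this document; the elided states are not filled in.
TRANSITIONS: Dict[str, List[str]] = {
    "DRAFT": ["SUBMITTED"],
    "SUBMITTED": ["UNIT_REVIEW", "REJECTED"],
}

# Hypothetical role permissions, keyed by (from_state, to_state).
ALLOWED_ROLES: Dict[Tuple[str, str], Set[str]] = {
    ("DRAFT", "SUBMITTED"): {"author"},
    ("SUBMITTED", "UNIT_REVIEW"): {"unit_reviewer"},
    ("SUBMITTED", "REJECTED"): {"unit_reviewer"},
}


class InvalidTransition(Exception):
    """Raised when a state change is not allowed for the given role."""


def can_transition(from_state: str, to_state: str, user_role: str) -> bool:
    # The target state must be reachable from the current state ...
    if to_state not in TRANSITIONS.get(from_state, []):
        return False
    # ... and the role must be permitted to perform that specific change.
    return user_role in ALLOWED_ROLES.get((from_state, to_state), set())


def transition(from_state: str, to_state: str, user_role: str) -> str:
    if not can_transition(from_state, to_state, user_role):
        raise InvalidTransition(f"{user_role} cannot move {from_state} -> {to_state}")
    return to_state
```

Keeping both tables as data means they could later be loaded from the database without changing the checking logic.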
4. Document Storage
# Local filesystem structure
/initiatives/
    /{initiative_id}/
        /forms/
            form_01_v1.pdf
            form_03_v1.pdf
        /reviews/
            review_001.pdf
        /attachments/
            evidence_001.pdf
-- Metadata in PostgreSQL
CREATE TABLE document_metadata (
    id UUID PRIMARY KEY,
    initiative_id UUID REFERENCES initiatives(id),
    file_path TEXT,
    form_type VARCHAR(50),
    version INT,
    uploaded_by UUID,
    uploaded_at TIMESTAMP,
    checksum VARCHAR(64)
);
Benefits:
- No object storage service needed
- Easy backup (just copy folder)
- Direct file access
- Simple versioning
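A minimal sketch of the write path, assuming the folder layout above and a SHA-256 hex digest for the `checksum` column (64 characters, which fits `VARCHAR(64)`). The function name `store_document` and the returned dictionary shape are hypothetical; inserting the metadata row is left out.

```python
import hashlib
import uuid
from pathlib import Path

CATEGORIES = {"forms", "reviews", "attachments"}


def store_document(root: Path, initiative_id: str, category: str,
                   filename: str, content: bytes) -> dict:
    """Write content under root/initiatives/<id>/<category>/ and return metadata."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    # Normalise the id through uuid.UUID so it cannot contain path separators.
    folder_id = str(uuid.UUID(initiative_id))
    # Keep only the final path component so a crafted name cannot escape the folder.
    safe_name = Path(filename).name
    if safe_name in {"", ".", ".."}:
        raise ValueError(f"invalid filename: {filename!r}")
    target_dir = root / "initiatives" / folder_id / category
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / safe_name
    target.write_bytes(content)
    return {
        # Store a path relative to the storage root so the root can be moved later.
        "file_path": target.relative_to(root).as_posix(),
        "checksum": hashlib.sha256(content).hexdigest(),
    }
```

Recomputing the digest when a file is read back and comparing it with the stored `checksum` gives a cheap integrity check after backups or restores.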
5. Duplicate Detection
-- Use PostgreSQL trigram similarity
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- Similarity query (i1.id < i2.id reports each pair once instead of twice)
SELECT
    i1.id,
    i1.title,
    i2.id AS duplicate_id,
    similarity(i1.description, i2.description) AS sim_score
FROM initiatives i1
JOIN initiatives i2 ON i1.id < i2.id
WHERE similarity(i1.description, i2.description) > 0.7
ORDER BY sim_score DESC;
Benefits:
- Built into PostgreSQL
- No ML model training needed
- Fast enough for local scale
- Can be enhanced with custom logic
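To build intuition for what the similarity score measures, the following pure-Python sketch approximates the trigram comparison: each word is lowercased, padded with two leading spaces and one trailing space, split into three-character windows, and the score is the share of trigrams the two texts have in common. It is an approximation for experimenting with thresholds, not a replacement for `pg_trgm`, and the function names are hypothetical.

```python
import re
from typing import Set


def trigrams(text: str) -> Set[str]:
    """Approximate trigram extraction: one padded set of 3-character windows per word."""
    result: Set[str] = set()
    # Lowercase, then treat each run of letters/digits as a word.
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both texts (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

For example, `trigram_similarity("word", "two words")` shares 4 of 11 distinct trigrams, which shows why short descriptions that differ by a few words can fall well below a 0.7 threshold.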
6. Event System
# Simple in-memory event dispatcher
class EventDispatcher:
    def __init__(self):
        self.listeners = {}

    def subscribe(self, event_type, callback):
        if event_type not in self.listeners:
            self.listeners[event_type] = []
        self.listeners[event_type].append(callback)

    def emit(self, event_type, data):
        for callback in self.listeners.get(event_type, []):
            callback(data)

# Or use PostgreSQL NOTIFY/LISTEN to deliver events across processes.
# NOTIFY does not store messages: only sessions listening at that moment
# receive them, so write events to a table if they must persist.
Benefits:
- No message broker needed
- Simple pub/sub pattern
- Can persist events to database if needed
- Easy to add email notifications
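One design question the dispatcher above leaves open is what happens when a listener raises: as written, the exception stops the remaining listeners. The variant below is a sketch of one possible answer, logging the failure and continuing. The event name `initiative.submitted` and the return value of `emit` are assumptions for illustration.

```python
import logging
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

logger = logging.getLogger(__name__)

Listener = Callable[[Any], None]


class EventDispatcher:
    def __init__(self) -> None:
        self.listeners: DefaultDict[str, List[Listener]] = defaultdict(list)

    def subscribe(self, event_type: str, callback: Listener) -> None:
        self.listeners[event_type].append(callback)

    def emit(self, event_type: str, data: Any) -> int:
        """Call every listener; return how many completed without raising."""
        delivered = 0
        # Iterate over a copy so listeners may subscribe during dispatch.
        for callback in list(self.listeners.get(event_type, [])):
            try:
                callback(data)
                delivered += 1
            except Exception:
                # One failing listener (e.g. an SMTP error) should not block the rest.
                logger.exception("listener failed for %s", event_type)
        return delivered


if __name__ == "__main__":
    dispatcher = EventDispatcher()
    dispatcher.subscribe("initiative.submitted", lambda d: print("notify", d["id"]))
    dispatcher.emit("initiative.submitted", {"id": "abc"})
```

Whether to swallow listener errors or let them propagate depends on the event: a failed email can be logged and retried, while a failed audit write probably should abort the request.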
7. Audit Logging
-- Simple append-only table
CREATE TABLE audit_log (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    initiative_id UUID,
    actor_id UUID,
    action VARCHAR(100),
    timestamp TIMESTAMP DEFAULT NOW(),
    previous_state JSONB,
    new_state JSONB,
    metadata JSONB
);
CREATE INDEX idx_audit_initiative ON audit_log(initiative_id);
CREATE INDEX idx_audit_timestamp ON audit_log(timestamp);
Benefits:
- No separate audit system
- Queryable with SQL
- Can export for compliance
- Simple to implement
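Writing to this table from Python can be sketched as a single parameterized insert. The helper below assumes a DB-API cursor that uses `%s` placeholders (as psycopg does) and serializes the state dictionaries to JSON text with an explicit `::jsonb` cast; the function name `write_audit_entry` is hypothetical, and `id` and `timestamp` are left to the column defaults.

```python
import json
from typing import Any, Optional

INSERT_AUDIT = """
    INSERT INTO audit_log
        (initiative_id, actor_id, action, previous_state, new_state, metadata)
    VALUES (%s, %s, %s, %s::jsonb, %s::jsonb, %s::jsonb)
"""


def write_audit_entry(cursor: Any, initiative_id: str, actor_id: str, action: str,
                      previous_state: dict, new_state: dict,
                      metadata: Optional[dict] = None) -> None:
    """Append one row to audit_log using the caller's cursor and transaction."""
    cursor.execute(
        INSERT_AUDIT,
        (
            initiative_id,
            actor_id,
            action,
            json.dumps(previous_state),
            json.dumps(new_state),
            json.dumps(metadata or {}),
        ),
    )
```

Passing the cursor in, rather than opening a connection inside the helper, lets the audit row commit or roll back together with the state change it describes.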
Complete Simplified Stack
Backend
FastAPI (Python)
├── Database: PostgreSQL 15+
│   ├── Core tables (initiatives, authors, reviews, etc.)
│   ├── JSONB for flexible data
│   ├── Full-text search (GIN indexes)
│   ├── Trigram similarity (pg_trgm)
│   └── Audit log table
├── Business Logic: Custom Python classes
│   ├── NoveltyChecker
│   ├── ScoringEngine
│   ├── WorkflowStateMachine
│   └── DuplicateDetector
├── Document Storage: Local filesystem
│   └── Organized folder structure
├── Event System: In-memory dispatcher + PostgreSQL NOTIFY
└── API: FastAPI REST endpoints
Frontend (Already in place)
React + TypeScript
├── Feature-based architecture
├── React Query for data fetching
└── Existing UI components
Implementation Priority
Phase 1: Core Foundation (Week 1-2)
- ✅ Database schema (PostgreSQL)
- ✅ Basic CRUD APIs (FastAPI)
- ✅ Document upload/storage (local filesystem)
- ✅ Basic state machine (Python class)
Phase 2: Business Rules (Week 3-4)
- ✅ Novelty checking (PostgreSQL similarity)
- ✅ Author contribution validation
- ✅ Scoring algorithm (Group 01)
- ✅ Auto-classification (Group 02)
Phase 3: Workflow & Notifications (Week 5-6)
- ✅ Complete state machine transitions
- ✅ Deadline tracking & alerts
- ✅ Email notifications (SMTP)
- ✅ Duplicate detection & mediation
Phase 4: Advanced Features (Week 7-8)
- ✅ Reporting & analytics
- ✅ Audit trail queries
- ✅ Role-based permissions
- ✅ Appeal workflow
Technology Comparison
Original Stack Complexity
- 8+ services to manage
- External dependencies (Kafka, Elasticsearch, S3)
- Complex deployment
- Higher resource usage
- Steeper learning curve
Simplified Stack
- 2 services (FastAPI + PostgreSQL)
- Minimal external dependencies
- Simple deployment
- Lower resource usage
- Easier to maintain
When to Scale Up
Consider adding complexity only if:
- >10,000 initiatives/year: Add Elasticsearch for search
- >100 concurrent users: Add Redis for caching
- Multi-server deployment: Add message queue (RabbitMQ)
- Advanced ML needed: Add dedicated ML service
- Cloud deployment: Use S3 for documents
For a local application with <5,000 initiatives/year, the simplified stack is sufficient.
Code Structure Example
be0/
├── src/
│   ├── domain/
│   │   ├── entities/
│   │   │   ├── initiative.py
│   │   │   ├── author.py
│   │   │   └── review.py
│   │   └── rules/
│   │       ├── novelty_checker.py
│   │       ├── scoring_engine.py
│   │       └── duplicate_detector.py
│   ├── application/
│   │   ├── services/
│   │   │   ├── workflow_service.py
│   │   │   └── notification_service.py
│   │   └── state_machine.py
│   ├── infrastructure/
│   │   ├── database/
│   │   │   └── models.py
│   │   ├── storage/
│   │   │   └── file_storage.py
│   │   └── events/
│   │       └── dispatcher.py
│   └── api/
│       └── routes/
│           └── initiatives.py
└── storage/
    └── documents/
        └── initiatives/
Summary
Simplified Stack:
- ✅ PostgreSQL (database + search + similarity)
- ✅ FastAPI (API framework)
- ✅ Python (business rules + workflow)
- ✅ Local filesystem (document storage)
- ✅ In-memory events (or PostgreSQL NOTIFY)
Removed:
- ❌ Camunda/Temporal (use custom state machine)
- ❌ Elasticsearch (use PostgreSQL full-text search)
- ❌ Kafka/RabbitMQ (use simple event dispatcher)
- ❌ S3/MinIO (use local filesystem)
- ❌ Drools (use Python functions)
Result: Simpler, easier to maintain, sufficient for local deployment, can scale up later if needed.