# Simplified Tech Stack for Local Governance Layer

## Analysis & Simplification Strategy

### Key Observations

1. **Local Application Context**: Single-server deployment, not distributed
2. **Existing Stack**: Already using FastAPI + PostgreSQL
3. **Complexity Overkill**: Enterprise tools (Kafka, Camunda, Elasticsearch) are unnecessary for local deployment
4. **Core Needs**: State machine, rules engine, document storage, audit logging

---

## Simplified Tech Stack Recommendation

### ✅ **Core Stack (Keep These)**

| Component | Technology | Rationale |
|-----------|-----------|-----------|
| **Database** | PostgreSQL 15+ | ✅ Already in use, supports JSONB, excellent for local deployment |
| **API Framework** | FastAPI (Python) | ✅ Already in use, fast, async, great for this use case |
| **Document Storage** | Local filesystem + PostgreSQL (metadata) | ✅ Simple, no external service needed |
| **Business Rules** | Custom Python classes/functions | ✅ Lightweight, maintainable, no external engine needed |

### 🔄 **Replace Complex Components**

| Original Suggestion | Simplified Alternative | Why |
|-------------------|----------------------|-----|
| **Camunda/Temporal** | Custom state machine (Python) | Simple workflow states, no need for enterprise orchestration |
| **Elasticsearch + ML** | PostgreSQL full-text search + `pg_trgm` (trigram similarity) | Built-in, sufficient for duplicate detection |
| **Apache Kafka/RabbitMQ** | PostgreSQL NOTIFY/LISTEN or in-memory event queue | Simple pub/sub, no separate service |
| **AWS S3/MinIO** | Local filesystem with organized folders | Direct file storage, simpler for local deployment |
| **Drools** | Python rule functions/classes | More maintainable, easier to debug |

---

## Recommended Simplified Architecture

### 1. **Database Layer**

```text
# Single PostgreSQL database with:
- Core tables (initiatives, authors, reviews, etc.)
- JSONB columns for flexible metadata
- Full-text search indexes (GIN indexes on tsvector expressions over text fields)
- pg_trgm extension for similarity matching
```

**Benefits:**
- No additional services
- ACID compliance
- Built-in full-text search
- Trigram similarity for duplicate detection

### 2. **Business Rules Engine**

```python
from __future__ import annotations  # lets the stubs name types defined elsewhere

# Custom Python classes (interface stubs). Initiative, Review, ValidationResult,
# Score and State are domain types assumed to be defined elsewhere in the project.
class NoveltyChecker:
    def check(self, initiative: Initiative) -> ValidationResult:
        raise NotImplementedError

class ScoringEngine:
    def calculate_score(self, reviews: list[Review]) -> Score:
        raise NotImplementedError

class WorkflowStateMachine:
    def transition(self, initiative: Initiative, action: str) -> State:
        raise NotImplementedError
```

**Benefits:**
- Easy to test and debug
- No external dependencies
- Version control friendly
- Can be extended incrementally

### 3. **Workflow Engine**

```python
# Simple state machine
class InitiativeWorkflow:
    STATES = ['DRAFT', 'SUBMITTED', 'UNIT_REVIEW']  # ... (remaining states elided)
    TRANSITIONS = {
        'DRAFT': ['SUBMITTED'],
        'SUBMITTED': ['UNIT_REVIEW', 'REJECTED'],
        # ... (remaining transitions elided)
    }

    def can_transition(self, from_state, to_state, user_role):
        # Structural check only: is to_state reachable from from_state?
        # Permission checks on user_role and business rules still belong here.
        return to_state in self.TRANSITIONS.get(from_state, [])
```

**Benefits:**
- No external workflow engine
- Easy to understand and modify
- Can store state in database
- Lightweight

### 4. **Document Storage**

```
# Local filesystem structure
/initiatives/
  /{initiative_id}/
    /forms/
      form_01_v1.pdf
      form_03_v1.pdf
    /reviews/
      review_001.pdf
    /attachments/
      evidence_001.pdf
```

```sql
-- Metadata in PostgreSQL
CREATE TABLE document_metadata (
    id UUID PRIMARY KEY,
    initiative_id UUID REFERENCES initiatives(id),
    file_path TEXT,
    form_type VARCHAR(50),
    version INT,
    uploaded_by UUID,
    uploaded_at TIMESTAMP,
    checksum VARCHAR(64)  -- 64 characters fits a hex-encoded SHA-256 digest
);
```

**Benefits:**
- No object storage service needed
- Easy backup (copy the folder, plus a database dump for the metadata)
- Direct file access
- Simple versioning

### 5. **Duplicate Detection**

```sql
-- Use PostgreSQL trigram similarity
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Similarity query: each unordered pair is compared once (i1.id < i2.id),
-- and both sides of the pair are returned so the match can be identified.
SELECT i1.id, i1.title,
       i2.id AS similar_id, i2.title AS similar_title,
       similarity(i1.description, i2.description) AS sim_score
FROM initiatives i1
JOIN initiatives i2 ON i1.id < i2.id
WHERE similarity(i1.description, i2.description) > 0.7
ORDER BY sim_score DESC;
```

**Benefits:**
- Built into PostgreSQL
- No ML model training needed
- Fast enough for local scale (the pairwise comparison grows quadratically with the number of initiatives)
- Can be enhanced with custom logic

### 6. **Event System**

```python
# Simple in-memory event dispatcher
class EventDispatcher:
    def __init__(self):
        self.listeners = {}  # event_type -> list of callbacks

    def subscribe(self, event_type, callback):
        if event_type not in self.listeners:
            self.listeners[event_type] = []
        self.listeners[event_type].append(callback)

    def emit(self, event_type, data):
        # Callbacks run synchronously, in subscription order
        for callback in self.listeners.get(event_type, []):
            callback(data)

# Or use PostgreSQL NOTIFY/LISTEN to reach other processes. Notifications are
# not stored, so write events to a table if they must survive a restart.
```

**Benefits:**
- No message broker needed
- Simple pub/sub pattern
- Can persist events to database if needed
- Easy to add email notifications

### 7. **Audit Logging**

```sql
-- Simple append-only table (append-only by convention; revoke UPDATE and
-- DELETE from the application role to enforce it)
CREATE TABLE audit_log (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    initiative_id UUID,
    actor_id UUID,
    action VARCHAR(100),
    timestamp TIMESTAMP DEFAULT NOW(),
    previous_state JSONB,
    new_state JSONB,
    metadata JSONB
);

CREATE INDEX idx_audit_initiative ON audit_log(initiative_id);
CREATE INDEX idx_audit_timestamp ON audit_log(timestamp);
```

**Benefits:**
- No separate audit system
- Queryable with SQL
- Can export for compliance
- Simple to implement

---

## Complete Simplified Stack

### **Backend**

```
FastAPI (Python)
├── Database: PostgreSQL 15+
│   ├── Core tables (initiatives, authors, reviews, etc.)
│   ├── JSONB for flexible data
│   ├── Full-text search (GIN indexes)
│   ├── Trigram similarity (pg_trgm)
│   └── Audit log table
├── Business Logic: Custom Python classes
│   ├── NoveltyChecker
│   ├── ScoringEngine
│   ├── WorkflowStateMachine
│   └── DuplicateDetector
├── Document Storage: Local filesystem
│   └── Organized folder structure
├── Event System: In-memory dispatcher + PostgreSQL NOTIFY
└── API: FastAPI REST endpoints
```

### **Frontend** (Already in place)

```
React + TypeScript
├── Feature-based architecture
├── React Query for data fetching
└── Existing UI components
```

---

## Implementation Priority

### **Phase 1: Core Foundation** (Week 1-2)

1. ✅ Database schema (PostgreSQL)
2. ✅ Basic CRUD APIs (FastAPI)
3. ✅ Document upload/storage (local filesystem)
4. ✅ Basic state machine (Python class)

### **Phase 2: Business Rules** (Week 3-4)

1. ✅ Novelty checking (PostgreSQL similarity)
2. ✅ Author contribution validation
3. ✅ Scoring algorithm (Group 01)
4. ✅ Auto-classification (Group 02)

### **Phase 3: Workflow & Notifications** (Week 5-6)

1. ✅ Complete state machine transitions
2. ✅ Deadline tracking & alerts
3. ✅ Email notifications (SMTP)
4. ✅ Duplicate detection & mediation

### **Phase 4: Advanced Features** (Week 7-8)

1. ✅ Reporting & analytics
2. ✅ Audit trail queries
3. ✅ Role-based permissions
4. ✅ Appeal workflow

---

## Technology Comparison

### **Original Stack Complexity**

- 8+ services to manage
- External dependencies (Kafka, Elasticsearch, S3)
- Complex deployment
- Higher resource usage
- Steeper learning curve

### **Simplified Stack**

- 2 services (FastAPI + PostgreSQL)
- Minimal external dependencies
- Simple deployment
- Lower resource usage
- Easier to maintain

---

## When to Scale Up

Consider adding complexity only if:

- **>10,000 initiatives/year**: Add Elasticsearch for search
- **>100 concurrent users**: Add Redis for caching
- **Multi-server deployment**: Add message queue (RabbitMQ)
- **Advanced ML needed**: Add dedicated ML service
- **Cloud deployment**: Use S3 for documents

For a local application with <5,000 initiatives/year, the simplified stack is sufficient.

---

## Code Structure Example

```
be0/
├── src/
│   ├── domain/
│   │   ├── entities/
│   │   │   ├── initiative.py
│   │   │   ├── author.py
│   │   │   └── review.py
│   │   └── rules/
│   │       ├── novelty_checker.py
│   │       ├── scoring_engine.py
│   │       └── duplicate_detector.py
│   ├── application/
│   │   ├── services/
│   │   │   ├── workflow_service.py
│   │   │   └── notification_service.py
│   │   └── state_machine.py
│   ├── infrastructure/
│   │   ├── database/
│   │   │   └── models.py
│   │   ├── storage/
│   │   │   └── file_storage.py
│   │   └── events/
│   │       └── dispatcher.py
│   └── api/
│       └── routes/
│           └── initiatives.py
└── storage/
    └── documents/
        └── initiatives/
```

---

## Summary

**Simplified Stack:**
- ✅ PostgreSQL (database + search + similarity)
- ✅ FastAPI (API framework)
- ✅ Python (business rules + workflow)
- ✅ Local filesystem (document storage)
- ✅ In-memory events (or PostgreSQL NOTIFY)

**Removed:**
- ❌ Camunda/Temporal (use custom state machine)
- ❌ Elasticsearch (use PostgreSQL full-text search)
- ❌ Kafka/RabbitMQ (use simple event dispatcher)
- ❌ S3/MinIO (use local filesystem)
- ❌ Drools (use Python functions)

**Result:** A simpler stack that is easier to maintain, sufficient for local deployment, and able to scale up later if needed.
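
---

## Sketch: Storing a Document with Its Checksum

The document storage approach described above (files under `/initiatives/{initiative_id}/...`, metadata in `document_metadata`) could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual `file_storage.py`: the names `StoredDocument`, `store_document`, and `verify_document` are hypothetical, the metadata row is held in a dataclass instead of being inserted into PostgreSQL, and SHA-256 is assumed as the checksum algorithm only because its hex digest fits the `VARCHAR(64)` column.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class StoredDocument:
    """Hypothetical in-memory mirror of a document_metadata row."""
    initiative_id: str
    file_path: str  # path relative to the storage root
    form_type: str
    version: int
    checksum: str   # SHA-256 hex digest (assumed algorithm, 64 characters)


def store_document(root: Path, initiative_id: str, category: str,
                   filename: str, data: bytes,
                   form_type: str, version: int) -> StoredDocument:
    """Write data under root/initiatives/{initiative_id}/{category}/ and return its metadata."""
    # Reject names that could escape the target folder (e.g. "../x.pdf").
    # initiative_id and category are assumed to come from trusted code.
    if filename in {"", ".", ".."} or Path(filename).name != filename:
        raise ValueError(f"unsafe filename: {filename!r}")
    target_dir = root / "initiatives" / initiative_id / category
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_bytes(data)
    return StoredDocument(
        initiative_id=initiative_id,
        file_path=target.relative_to(root).as_posix(),
        form_type=form_type,
        version=version,
        checksum=hashlib.sha256(data).hexdigest(),
    )


def verify_document(root: Path, doc: StoredDocument) -> bool:
    """Recompute the checksum from disk and compare it with the stored one."""
    on_disk = (root / doc.file_path).read_bytes()
    return hashlib.sha256(on_disk).hexdigest() == doc.checksum


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        doc = store_document(root, "1234", "forms", "form_01_v1.pdf",
                             b"%PDF-1.4 example", "form_01", 1)
        print(doc.file_path)               # initiatives/1234/forms/form_01_v1.pdf
        print(verify_document(root, doc))  # True
```

Storing the path relative to the storage root keeps the metadata valid if the `storage/documents/` folder is moved or restored from a backup, and the checksum comparison gives a cheap way to detect files that were modified outside the application.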