# Tech Stack Comparison: Original vs Simplified

## Quick Reference

### Original Suggestions → Simplified Alternatives

| Requirement | Original Tech | Simplified Tech | Estimated Complexity Reduction |
|------------|--------------|----------------|---------------------|
| **Workflow Engine** | Camunda / Temporal | Custom Python state machine | 90% simpler |
| **Document Storage** | AWS S3 / MinIO | Local filesystem + PostgreSQL metadata | 80% simpler |
| **Search & Duplicate Detection** | Elasticsearch + ML (Sentence-BERT) | PostgreSQL full-text + pg_trgm | 85% simpler |
| **Event Bus** | Apache Kafka / RabbitMQ | PostgreSQL NOTIFY/LISTEN or in-memory | 90% simpler |
| **Business Rules** | Drools | Custom Python classes/functions | 70% simpler |
| **Audit Log** | Separate WORM storage | PostgreSQL append-only table | 60% simpler |

---

## Detailed Simplifications

### 1. Workflow Engine

**Original:** Camunda or Temporal
- Separate service to run
- Complex BPMN diagrams
- Additional database
- Learning curve

**Simplified:** Custom Python State Machine

```python
# ~100 lines of code
class InitiativeWorkflow:
    # Allowed transitions: current state -> states it may move to
    STATES = {
        'DRAFT': ['SUBMITTED'],
        'SUBMITTED': ['UNIT_REVIEW', 'REJECTED'],
        'UNIT_REVIEW': ['COUNCIL_REVIEW', 'REJECTED'],
        'COUNCIL_REVIEW': ['APPROVED', 'REJECTED'],
        'APPROVED': ['FINALIZED', 'APPEAL'],
        'REJECTED': ['APPEAL'],
        'APPEAL': ['APPROVED', 'REJECTED', 'FINALIZED'],
        'FINALIZED': []
    }

    def can_transition(self, from_state, to_state, user_role):
        # user_role is accepted for role-based checks, which are not applied here
        return to_state in self.STATES.get(from_state, [])
```

**Savings:**
- No separate service
- No BPMN learning
- Easier to debug
- Version controlled

---

### 2. Document Storage

**Original:** AWS S3 / MinIO
- Separate service
- API calls for every operation
- Network latency
- Additional configuration

**Simplified:** Local Filesystem

```
/initiatives/
  /{initiative_id}/
    /forms/
    /reviews/
    /attachments/
```

**Savings:**
- Direct file access
- No API calls
- Simpler backup (copy folder)
- No network dependency

---

### 3. Search & Duplicate Detection

**Original:** Elasticsearch + ML Model (Sentence-BERT)
- Separate service
- Model training required
- Complex deployment
- Resource intensive

**Simplified:** PostgreSQL Full-Text + Trigram Similarity

```sql
-- Enable trigram support (full-text search is built in)
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Full-text index for keyword search
CREATE INDEX idx_initiative_description_gin ON initiatives
    USING gin(to_tsvector('english', description));

-- Trigram index so similarity lookups can avoid a full table scan
CREATE INDEX idx_initiative_description_trgm ON initiatives
    USING gin(description gin_trgm_ops);

-- Similarity search: the % operator can use the trigram index and
-- applies pg_trgm.similarity_threshold (default 0.3)
SELECT id, title, similarity(description, 'search text') AS score
FROM initiatives
WHERE description % 'search text'
ORDER BY score DESC;
```

**Savings:**
- Built into PostgreSQL
- No model training
- No separate service
- Good enough for local scale

---

### 4. Event Bus

**Original:** Apache Kafka / RabbitMQ
- Separate service
- Complex configuration
- Message persistence
- Consumer groups

**Simplified:** PostgreSQL NOTIFY/LISTEN

```python
import json

import asyncpg

# Publisher: db is an open asyncpg connection or pool
async def notify_event(event_type, data):
    # asyncpg uses $1-style placeholders; the payload is sent as a JSON string
    await db.execute(
        "SELECT pg_notify('initiative_events', $1)",
        json.dumps({'type': event_type, 'data': data})
    )

# Listener: the connection must stay open to keep receiving notifications
async def listen_events():
    conn = await asyncpg.connect(...)
    # handle_event(connection, pid, channel, payload) runs for each notification
    await conn.add_listener('initiative_events', handle_event)
```

**Savings:**
- No separate service
- Built into database
- Durable only if events are also written to a table (NOTIFY reaches only currently connected listeners)
- Simple pub/sub

---

### 5. Business Rules Engine

**Original:** Drools
- Java-based
- Separate rule files
- Complex syntax
- Additional dependency

**Simplified:** Python Functions/Classes

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ValidationResult:
    valid: bool
    reason: Optional[str] = None

class NoveltyChecker:
    def check(self, initiative):
        # Reject the initiative if a similar one already exists.
        # find_similar is assumed to be implemented elsewhere.
        similar = self.find_similar(initiative)
        if similar:
            return ValidationResult(valid=False, reason="Duplicate found")
        return ValidationResult(valid=True)

class ScoringEngine:
    def calculate(self, reviews):
        # Average the scores of reviews that have one; None if there are none
        scores = [r.score for r in reviews if r.score is not None]
        if not scores:
            return None
        return sum(scores) / len(scores)
```

**Savings:**
- Native Python
- Easy to test
- Version controlled
- No external engine

---

## Resource Usage Comparison

### Original Stack
- PostgreSQL: ~200MB RAM
- FastAPI: ~100MB RAM
- Elasticsearch: ~1GB RAM
- Kafka: ~500MB RAM
- MinIO: ~200MB RAM
- **Total: ~2GB RAM minimum**

### Simplified Stack
- PostgreSQL: ~200MB RAM
- FastAPI: ~100MB RAM
- **Total: ~300MB RAM**

**Savings: 85% less memory**

---

## Deployment Complexity

### Original Stack

```
docker-compose.yml:
  - postgres
  - fastapi
  - elasticsearch
  - kafka
  - minio
  - zookeeper (for Kafka)

Total: 6+ containers
```

### Simplified Stack

```
docker-compose.yml:
  - postgres
  - fastapi

Total: 2 containers
```

**Savings: 67% fewer services**

---

## Maintenance Effort

| Task | Original | Simplified | Time Saved |
|------|----------|------------|------------|
| Setup | 2-3 days | 2-3 hours | 90% |
| Debugging | Complex (multiple services) | Simple (2 services) | 70% |
| Updates | Multiple services | 2 services | 80% |
| Monitoring | Multiple dashboards | Single dashboard | 75% |

---

## When to Upgrade

Upgrade to the original stack only if:

1. **Scale:** >10,000 initiatives/year
2. **Users:** >100 concurrent users
3. **Performance:** Response time >2s
4. **Distribution:** Multi-server deployment
5. **Advanced ML:** Need sophisticated NLP

For a local application with typical load (<5,000 initiatives/year), the simplified stack is the better choice.

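
The upgrade criteria above can be turned into a small check that runs against observed load. This is a minimal sketch under stated assumptions: `LoadMetrics`, its field names, and `upgrade_reasons` are hypothetical, response time is taken as a 95th-percentile figure (the list only says ">2s"), and any single criterion is treated as a reason to revisit the stack.

```python
from dataclasses import dataclass


@dataclass
class LoadMetrics:
    """Hypothetical snapshot of observed load; field names are illustrative."""
    initiatives_per_year: int
    concurrent_users: int
    p95_response_seconds: float  # assumption: "response time" measured at p95
    multi_server: bool = False
    needs_advanced_nlp: bool = False


def upgrade_reasons(m: LoadMetrics) -> list[str]:
    """Return the upgrade criteria that the given metrics exceed."""
    reasons = []
    if m.initiatives_per_year > 10_000:
        reasons.append("scale")
    if m.concurrent_users > 100:
        reasons.append("users")
    if m.p95_response_seconds > 2.0:
        reasons.append("performance")
    if m.multi_server:
        reasons.append("distribution")
    if m.needs_advanced_nlp:
        reasons.append("advanced ML")
    return reasons


if __name__ == "__main__":
    # A typical local deployment: no criterion is exceeded
    print(upgrade_reasons(LoadMetrics(4_000, 20, 0.4)))
    # A busier deployment: two criteria are exceeded
    print(upgrade_reasons(LoadMetrics(12_000, 150, 0.4)))
```

An empty list means the simplified stack still fits; a non-empty list names which component to look at first rather than forcing a full migration.
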

---

## Migration Path

If you need to scale later:

1. **Add Redis** for caching (if queries become slow)
2. **Add Elasticsearch** for advanced search (if PostgreSQL search is insufficient)
3. **Add RabbitMQ** for async processing (if you need background jobs)
4. **Move to S3** for documents (if you need cloud storage)

But start simple and scale when needed.
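
One way to keep the move to S3 cheap is to hide document storage behind a small interface from the start. The sketch below is an assumption, not part of the original design: `DocumentStore`, `LocalDocumentStore`, and their method names are hypothetical, and the on-disk layout follows the `/initiatives/{initiative_id}/...` tree from the Document Storage section.

```python
from pathlib import Path
from typing import Protocol


class DocumentStore(Protocol):
    """Hypothetical storage interface; an S3-backed class could implement it later."""

    def save(self, initiative_id: str, category: str, name: str, data: bytes) -> str: ...

    def load(self, initiative_id: str, category: str, name: str) -> bytes: ...


class LocalDocumentStore:
    """Stores files under <root>/initiatives/<initiative_id>/<category>/<name>."""

    CATEGORIES = {"forms", "reviews", "attachments"}

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def _path(self, initiative_id: str, category: str, name: str) -> Path:
        if category not in self.CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        for part in (initiative_id, name):
            # Reject empty names and anything that could escape the target folder
            if part in {"", ".", ".."} or "/" in part or "\\" in part:
                raise ValueError(f"invalid path component: {part!r}")
        return self.root / "initiatives" / initiative_id / category / name

    def save(self, initiative_id: str, category: str, name: str, data: bytes) -> str:
        path = self._path(initiative_id, category, name)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return str(path)

    def load(self, initiative_id: str, category: str, name: str) -> bytes:
        return self._path(initiative_id, category, name).read_bytes()
```

Application code that depends only on `DocumentStore` would not need to change if a cloud-backed implementation replaced `LocalDocumentStore` later.
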