# Simplified Tech Stack for Local Governance Layer

## Analysis & Simplification Strategy

### Key Observations

1. **Local Application Context**: Single-server deployment, not distributed
2. **Existing Stack**: Already using FastAPI + PostgreSQL
3. **Complexity Overkill**: Enterprise tools (Kafka, Camunda, Elasticsearch) are unnecessary for local deployment
4. **Core Needs**: State machine, rules engine, document storage, audit logging

---

## Simplified Tech Stack Recommendation

### ✅ **Core Stack (Keep These)**

| Component | Technology | Rationale |
|-----------|-----------|-----------|
| **Database** | PostgreSQL 15+ | ✅ Already in use, supports JSONB, excellent for local deployment |
| **API Framework** | FastAPI (Python) | ✅ Already in use, fast, async, great for this use case |
| **Document Storage** | Local filesystem + PostgreSQL (metadata) | ✅ Simple, no external service needed |
| **Business Rules** | Custom Python classes/functions | ✅ Lightweight, maintainable, no external engine needed |

### 🔄 **Replace Complex Components**

| Original Suggestion | Simplified Alternative | Why |
|-------------------|----------------------|-----|
| **Camunda/Temporal** | Custom state machine (Python) | Simple workflow states, no need for enterprise orchestration |
| **Elasticsearch + ML** | PostgreSQL full-text search + `pg_trgm` (trigram similarity) | Built-in, sufficient for duplicate detection |
| **Apache Kafka/RabbitMQ** | PostgreSQL NOTIFY/LISTEN or in-memory event queue | Simple pub/sub, no separate service |
| **AWS S3/MinIO** | Local filesystem with organized folders | Direct file storage, simpler for local |
| **Drools** | Python rule functions/classes | More maintainable, easier to debug |

---

## Recommended Simplified Architecture

### 1. **Database Layer**

```
Single PostgreSQL database with:
- Core tables (initiatives, authors, reviews, etc.)
- JSONB columns for flexible metadata
- Full-text search indexes (GIN indexes on text fields)
- pg_trgm extension for similarity matching
```

**Benefits:**

- No additional services
- ACID compliance
- Built-in full-text search
- Trigram similarity for duplicate detection

### 2. **Business Rules Engine**

```python
# Custom Python classes (interfaces only; method bodies are omitted).
# Initiative, Review, ValidationResult, Score and State are domain types defined elsewhere.
from typing import List


class NoveltyChecker:
    def check(self, initiative: "Initiative") -> "ValidationResult":
        ...


class ScoringEngine:
    def calculate_score(self, reviews: List["Review"]) -> "Score":
        ...


class WorkflowStateMachine:
    def transition(self, initiative: "Initiative", action: str) -> "State":
        ...
```

**Benefits:**

- Easy to test and debug
- No external dependencies
- Version control friendly
- Can be extended incrementally

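As an illustration of rules written as plain Python functions, here is a minimal sketch. The `Initiative` fields, the two rules (`title_present`, `contributions_sum_to_100`) and the `validate` helper are hypothetical names chosen for the example; the actual rule set is not specified in this document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Initiative:
    title: str
    # Hypothetical field: author id -> contribution percentage
    contributions: Dict[str, float] = field(default_factory=dict)


@dataclass
class ValidationResult:
    ok: bool
    errors: List[str] = field(default_factory=list)


# A rule takes an initiative and returns a list of error messages (empty = pass)
Rule = Callable[[Initiative], List[str]]


def title_present(initiative: Initiative) -> List[str]:
    return [] if initiative.title.strip() else ["title is empty"]


def contributions_sum_to_100(initiative: Initiative) -> List[str]:
    # Assumed rule for illustration only
    total = sum(initiative.contributions.values())
    # Tolerance avoids false failures from float rounding
    if abs(total - 100.0) > 1e-6:
        return [f"contributions sum to {total}, expected 100"]
    return []


def validate(initiative: Initiative, rules: List[Rule]) -> ValidationResult:
    """Run every rule and collect all error messages."""
    errors = [msg for rule in rules for msg in rule(initiative)]
    return ValidationResult(ok=not errors, errors=errors)
```

Because each rule is an ordinary function, it can be unit-tested in isolation, and new rules are added by appending to the list passed to `validate`.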
### 3. **Workflow Engine**

```python
# Simple state machine
class InitiativeWorkflow:
    STATES = ['DRAFT', 'SUBMITTED', 'UNIT_REVIEW']  # ... further states elided
    TRANSITIONS = {
        'DRAFT': ['SUBMITTED'],
        'SUBMITTED': ['UNIT_REVIEW', 'REJECTED'],
        # ... further transitions elided
    }

    def can_transition(self, from_state, to_state, user_role):
        # Structural check only; permission checks on user_role and
        # business rules would be added here
        return to_state in self.TRANSITIONS.get(from_state, [])
```

**Benefits:**

- No external workflow engine
- Easy to understand and modify
- Can store state in database
- Lightweight

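The transition table above can be combined with a per-transition role check. The following is a minimal sketch, not the project's implementation: the role names (`author`, `unit_reviewer`), the `ROLE_PERMISSIONS` mapping, the `Initiative` dataclass and `TransitionError` are all assumptions made for illustration, and only the two transitions listed above are covered.

```python
from dataclasses import dataclass, field

# Transitions taken from the table above (remaining ones are elided in the source)
TRANSITIONS = {
    "DRAFT": ["SUBMITTED"],
    "SUBMITTED": ["UNIT_REVIEW", "REJECTED"],
}

# Hypothetical role names; the source does not define the actual roles
ROLE_PERMISSIONS = {
    ("DRAFT", "SUBMITTED"): {"author"},
    ("SUBMITTED", "UNIT_REVIEW"): {"unit_reviewer"},
    ("SUBMITTED", "REJECTED"): {"unit_reviewer"},
}


class TransitionError(Exception):
    """Raised when a transition is not allowed."""


@dataclass
class Initiative:
    id: str
    state: str = "DRAFT"
    history: list = field(default_factory=list)


def can_transition(from_state, to_state, user_role):
    """True only if the edge exists and the role is allowed to take it."""
    if to_state not in TRANSITIONS.get(from_state, []):
        return False
    return user_role in ROLE_PERMISSIONS.get((from_state, to_state), set())


def transition(initiative, to_state, user_role):
    """Apply a transition in place, recording it in the initiative's history."""
    if not can_transition(initiative.state, to_state, user_role):
        raise TransitionError(
            f"{user_role} may not move {initiative.state} -> {to_state}"
        )
    initiative.history.append((initiative.state, to_state, user_role))
    initiative.state = to_state
    return initiative.state
```

Keeping the transition table and the permission table as plain data means both could later be loaded from the database without changing the two functions.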
### 4. **Document Storage**

```
# Local filesystem structure
/initiatives/
  /{initiative_id}/
    /forms/
      form_01_v1.pdf
      form_03_v1.pdf
    /reviews/
      review_001.pdf
    /attachments/
      evidence_001.pdf
```

```sql
-- Metadata in PostgreSQL
CREATE TABLE document_metadata (
    id UUID PRIMARY KEY,
    initiative_id UUID REFERENCES initiatives(id),
    file_path TEXT,
    form_type VARCHAR(50),
    version INT,
    uploaded_by UUID,
    uploaded_at TIMESTAMP,
    checksum VARCHAR(64)
);
```

**Benefits:**

- No object storage service needed
- Easy backup (copy the folder together with a database dump so files and metadata stay in sync)
- Direct file access
- Simple versioning

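A minimal sketch of writing a file into this layout and producing the values for `file_path` and `checksum`. It assumes the checksum is a SHA-256 hex digest (which fits `VARCHAR(64)`) and that initiative IDs are UUID strings; `store_document` and its parameters are hypothetical names, not an existing API in the project.

```python
import hashlib
import uuid
from pathlib import Path

ALLOWED_CATEGORIES = {"forms", "reviews", "attachments"}


def store_document(root: Path, initiative_id: str, category: str,
                   filename: str, content: bytes) -> dict:
    """Write content under root/initiatives/<id>/<category>/ and return metadata."""
    uuid.UUID(initiative_id)  # raises ValueError if the id is not a UUID
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unknown category: {category}")

    # Keep only the final path component to block traversal such as "../../x"
    safe_name = Path(filename).name
    if safe_name in ("", ".."):
        raise ValueError(f"invalid filename: {filename!r}")

    target_dir = root / "initiatives" / initiative_id / category
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / safe_name
    target.write_bytes(content)

    return {
        # Stored relative to the storage root so the root can be moved later
        "file_path": target.relative_to(root).as_posix(),
        # Assumption: SHA-256 hex digest, 64 characters
        "checksum": hashlib.sha256(content).hexdigest(),
    }
```

The returned dictionary maps directly onto the `file_path` and `checksum` columns of `document_metadata`; the remaining columns (`form_type`, `version`, `uploaded_by`) would come from the request context.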
### 5. **Duplicate Detection**

```sql
-- Use PostgreSQL trigram similarity
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Similarity query: each pair is reported once (i1.id < i2.id)
SELECT
    i1.id,
    i1.title,
    i2.id AS similar_id,
    similarity(i1.description, i2.description) AS sim_score
FROM initiatives i1
JOIN initiatives i2 ON i1.id < i2.id
WHERE similarity(i1.description, i2.description) > 0.7
ORDER BY sim_score DESC;
```

**Benefits:**

- Built into PostgreSQL
- No ML model training needed
- Fast enough for local scale
- Can be enhanced with custom logic

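To make the score easier to reason about, here is a pure-Python approximation of trigram similarity: each word is lowercased and padded with two leading spaces and one trailing space, and the score is the number of shared trigrams divided by the number of distinct trigrams in either string. This is a sketch for intuition and unit tests, not a replacement for `pg_trgm`; it handles ASCII letters and digits only, and `find_duplicates` is a hypothetical helper.

```python
import re


def trigrams(text: str) -> set:
    """Set of trigrams per word, with two-space prefix and one-space suffix padding."""
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def find_duplicates(new_description: str, existing: dict, threshold: float = 0.7):
    """existing maps id -> description; returns [(id, score)] above threshold, best first."""
    scored = ((key, similarity(new_description, text)) for key, text in existing.items())
    matches = [(key, score) for key, score in scored if score > threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)
```

Checking a new submission against existing rows, as `find_duplicates` does, keeps the work linear in the number of stored initiatives, whereas the pairwise SQL query above compares every row with every other row.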
### 6. **Event System**

```python
# Simple in-memory event dispatcher
class EventDispatcher:
    def __init__(self):
        self.listeners = {}

    def subscribe(self, event_type, callback):
        if event_type not in self.listeners:
            self.listeners[event_type] = []
        self.listeners[event_type].append(callback)

    def emit(self, event_type, data):
        for callback in self.listeners.get(event_type, []):
            callback(data)

# PostgreSQL LISTEN/NOTIFY is an option for cross-process delivery.
# Notifications are not stored, so events that must survive a restart
# should also be written to a table.
```

**Benefits:**

- No message broker needed
- Simple pub/sub pattern
- Can persist events to database if needed
- Easy to add email notifications

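One practical concern with a synchronous dispatcher is that a failing listener (for example an email handler when SMTP is unreachable) would stop later listeners from running. A possible variant that isolates listener failures is sketched below; `SafeEventDispatcher` and the event name in the usage note are hypothetical.

```python
import logging
from collections import defaultdict

logger = logging.getLogger("events")


class SafeEventDispatcher:
    """In-memory dispatcher that keeps going when a listener raises."""

    def __init__(self):
        self.listeners = defaultdict(list)

    def subscribe(self, event_type, callback):
        self.listeners[event_type].append(callback)

    def emit(self, event_type, data):
        """Call every listener; return how many completed without error."""
        delivered = 0
        # Copy the list so listeners may subscribe during dispatch
        for callback in list(self.listeners.get(event_type, [])):
            try:
                callback(data)
                delivered += 1
            except Exception:
                # Log with traceback and continue with the next listener
                logger.exception("listener failed for event %s", event_type)
        return delivered
```

For example, subscribing both an audit-log writer and an email sender to an `initiative.submitted` event means the audit entry is still written when the email sender raises.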
### 7. **Audit Logging**

```sql
-- Simple append-only table
CREATE TABLE audit_log (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    initiative_id UUID,
    actor_id UUID,
    action VARCHAR(100),
    timestamp TIMESTAMP DEFAULT NOW(),
    previous_state JSONB,
    new_state JSONB,
    metadata JSONB
);

CREATE INDEX idx_audit_initiative ON audit_log(initiative_id);
CREATE INDEX idx_audit_timestamp ON audit_log(timestamp);
```

**Benefits:**

- No separate audit system
- Queryable with SQL
- Can export for compliance
- Simple to implement

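Writing to this table from Python could look like the sketch below. It only builds the statement and parameters, assuming a database driver that uses `%s` placeholders (psycopg is one such driver); `INSERT_AUDIT_SQL` and `build_audit_params` are hypothetical names. The `id` and `timestamp` columns are left to their defaults.

```python
import json

INSERT_AUDIT_SQL = (
    "INSERT INTO audit_log "
    "(initiative_id, actor_id, action, previous_state, new_state, metadata) "
    "VALUES (%s, %s, %s, %s::jsonb, %s::jsonb, %s::jsonb)"
)


def build_audit_params(initiative_id, actor_id, action,
                       previous_state, new_state, metadata=None):
    """Return the parameter tuple for INSERT_AUDIT_SQL."""
    if len(action) > 100:
        # Matches the VARCHAR(100) limit on the action column
        raise ValueError("action exceeds 100 characters")
    return (
        initiative_id,
        actor_id,
        action,
        # sort_keys gives stable output, which makes diffs and tests easier
        json.dumps(previous_state, sort_keys=True),
        json.dumps(new_state, sort_keys=True),
        json.dumps(metadata or {}, sort_keys=True),
    )
```

With a DB-API cursor this would be executed as `cursor.execute(INSERT_AUDIT_SQL, params)`, ideally in the same transaction as the state change so the log cannot drift from the data.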
---

## Complete Simplified Stack

### **Backend**

```
FastAPI (Python)
├── Database: PostgreSQL 15+
│   ├── Core tables (initiatives, authors, reviews, etc.)
│   ├── JSONB for flexible data
│   ├── Full-text search (GIN indexes)
│   ├── Trigram similarity (pg_trgm)
│   └── Audit log table
├── Business Logic: Custom Python classes
│   ├── NoveltyChecker
│   ├── ScoringEngine
│   ├── WorkflowStateMachine
│   └── DuplicateDetector
├── Document Storage: Local filesystem
│   └── Organized folder structure
├── Event System: In-memory dispatcher + PostgreSQL NOTIFY
└── API: FastAPI REST endpoints
```

### **Frontend** (Already in place)

```
React + TypeScript
├── Feature-based architecture
├── React Query for data fetching
└── Existing UI components
```

---

## Implementation Priority

### **Phase 1: Core Foundation** (Weeks 1-2)

1. ✅ Database schema (PostgreSQL)
2. ✅ Basic CRUD APIs (FastAPI)
3. ✅ Document upload/storage (local filesystem)
4. ✅ Basic state machine (Python class)

### **Phase 2: Business Rules** (Weeks 3-4)

1. ✅ Novelty checking (PostgreSQL similarity)
2. ✅ Author contribution validation
3. ✅ Scoring algorithm (Group 01)
4. ✅ Auto-classification (Group 02)

### **Phase 3: Workflow & Notifications** (Weeks 5-6)

1. ✅ Complete state machine transitions
2. ✅ Deadline tracking & alerts
3. ✅ Email notifications (SMTP)
4. ✅ Duplicate detection & mediation

### **Phase 4: Advanced Features** (Weeks 7-8)

1. ✅ Reporting & analytics
2. ✅ Audit trail queries
3. ✅ Role-based permissions
4. ✅ Appeal workflow

---

## Technology Comparison

### **Original Stack Complexity**

- 8+ services to manage
- External dependencies (Kafka, Elasticsearch, S3)
- Complex deployment
- Higher resource usage
- Steeper learning curve

### **Simplified Stack**

- 2 services (FastAPI + PostgreSQL)
- Minimal external dependencies
- Simple deployment
- Lower resource usage
- Easier to maintain

---

## When to Scale Up

Consider adding complexity only if:

- **>10,000 initiatives/year**: Add Elasticsearch for search
- **>100 concurrent users**: Add Redis for caching
- **Multi-server deployment**: Add message queue (RabbitMQ)
- **Advanced ML needed**: Add dedicated ML service
- **Cloud deployment**: Use S3 for documents

For a local application with fewer than 5,000 initiatives per year, the simplified stack is sufficient.

---

## Code Structure Example

```
be0/
├── src/
│   ├── domain/
│   │   ├── entities/
│   │   │   ├── initiative.py
│   │   │   ├── author.py
│   │   │   └── review.py
│   │   └── rules/
│   │       ├── novelty_checker.py
│   │       ├── scoring_engine.py
│   │       └── duplicate_detector.py
│   ├── application/
│   │   ├── services/
│   │   │   ├── workflow_service.py
│   │   │   └── notification_service.py
│   │   └── state_machine.py
│   ├── infrastructure/
│   │   ├── database/
│   │   │   └── models.py
│   │   ├── storage/
│   │   │   └── file_storage.py
│   │   └── events/
│   │       └── dispatcher.py
│   └── api/
│       └── routes/
│           └── initiatives.py
└── storage/
    └── documents/
        └── initiatives/
```

---

## Summary

**Simplified Stack:**

- ✅ PostgreSQL (database + search + similarity)
- ✅ FastAPI (API framework)
- ✅ Python (business rules + workflow)
- ✅ Local filesystem (document storage)
- ✅ In-memory events (or PostgreSQL NOTIFY)

**Removed:**

- ❌ Camunda/Temporal (use custom state machine)
- ❌ Elasticsearch (use PostgreSQL full-text search)
- ❌ Kafka/RabbitMQ (use simple event dispatcher)
- ❌ S3/MinIO (use local filesystem)
- ❌ Drools (use Python functions)

**Result:** A stack that is simpler, easier to maintain, and sufficient for local deployment, with room to scale up later if needed.