sciagent code + Gitea Actions CI/CD
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -0,0 +1,259 @@
# Tech Stack Comparison: Original vs Simplified

## Quick Reference

### Original Suggestions → Simplified Alternatives

| Requirement | Original Tech | Simplified Tech | Complexity Reduction (rough estimate) |
|-------------|---------------|-----------------|---------------------------------------|
| **Workflow Engine** | Camunda / Temporal | Custom Python state machine | 90% simpler |
| **Document Storage** | AWS S3 / MinIO | Local filesystem + PostgreSQL metadata | 80% simpler |
| **Search & Duplicate Detection** | Elasticsearch + ML (Sentence-BERT) | PostgreSQL full-text + pg_trgm | 85% simpler |
| **Event Bus** | Apache Kafka / RabbitMQ | PostgreSQL NOTIFY/LISTEN or in-memory | 90% simpler |
| **Business Rules** | Drools | Custom Python classes/functions | 70% simpler |
| **Audit Log** | Separate WORM storage | PostgreSQL append-only table | 60% simpler |

---

## Detailed Simplifications

### 1. Workflow Engine

**Original:** Camunda or Temporal
- Separate service to run
- Complex BPMN diagrams (Camunda)
- Additional database
- Learning curve

**Simplified:** Custom Python State Machine
```python
# ~100 lines of code in full; the core transition table is shown here
class InitiativeWorkflow:
    # Maps each state to the states it may move to next
    STATES = {
        'DRAFT': ['SUBMITTED'],
        'SUBMITTED': ['UNIT_REVIEW', 'REJECTED'],
        'UNIT_REVIEW': ['COUNCIL_REVIEW', 'REJECTED'],
        'COUNCIL_REVIEW': ['APPROVED', 'REJECTED'],
        'APPROVED': ['FINALIZED', 'APPEAL'],
        'REJECTED': ['APPEAL'],
        'APPEAL': ['APPROVED', 'REJECTED', 'FINALIZED'],
        'FINALIZED': [],  # terminal state
    }

    def can_transition(self, from_state, to_state, user_role):
        # user_role is accepted but not checked yet; role-based
        # restrictions would be added here
        return to_state in self.STATES.get(from_state, [])
```

**Savings:**
- No separate service
- No BPMN to learn
- Easier to debug
- Version controlled

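
The transition table above only answers whether a move is allowed. A minimal sketch of how it might be applied to a record follows, assuming a per-role permission map. The names `Initiative`, `InvalidTransition`, `ROLE_TARGETS`, and `transition`, as well as the two roles, are hypothetical and do not come from the original code; the state table is copied from `InitiativeWorkflow.STATES`.

```python
from dataclasses import dataclass, field

# Transition table copied from InitiativeWorkflow.STATES.
STATES = {
    'DRAFT': ['SUBMITTED'],
    'SUBMITTED': ['UNIT_REVIEW', 'REJECTED'],
    'UNIT_REVIEW': ['COUNCIL_REVIEW', 'REJECTED'],
    'COUNCIL_REVIEW': ['APPROVED', 'REJECTED'],
    'APPROVED': ['FINALIZED', 'APPEAL'],
    'REJECTED': ['APPEAL'],
    'APPEAL': ['APPROVED', 'REJECTED', 'FINALIZED'],
    'FINALIZED': [],
}

# Hypothetical mapping: which target states each role may move a record into.
ROLE_TARGETS = {
    'author': {'SUBMITTED', 'APPEAL'},
    'reviewer': {'UNIT_REVIEW', 'COUNCIL_REVIEW', 'APPROVED',
                 'REJECTED', 'FINALIZED'},
}


class InvalidTransition(Exception):
    """Raised when the requested move is not in the transition table."""


@dataclass
class Initiative:
    state: str = 'DRAFT'
    history: list = field(default_factory=list)


def transition(initiative, to_state, user_role):
    """Move `initiative` to `to_state`, or raise if the move is not allowed."""
    if to_state not in STATES.get(initiative.state, []):
        raise InvalidTransition(f"{initiative.state} -> {to_state}")
    if to_state not in ROLE_TARGETS.get(user_role, set()):
        raise PermissionError(f"{user_role} may not set {to_state}")
    # Record the move so the history can double as a simple audit trail.
    initiative.history.append((initiative.state, to_state, user_role))
    initiative.state = to_state
    return initiative
```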
---

### 2. Document Storage

**Original:** AWS S3 / MinIO
- Separate service
- API calls for every operation
- Network latency
- Additional configuration

**Simplified:** Local Filesystem (metadata in PostgreSQL)
```
/initiatives/
  /{initiative_id}/
    /forms/
    /reviews/
    /attachments/
```

**Savings:**
- Direct file access
- No API calls
- Simpler backup (copy the folder)
- No network dependency

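
To illustrate the directory layout above, here is a minimal sketch of a helper that writes a file into the matching subfolder and returns the values one might store in a PostgreSQL metadata row. The function name `store_document` and the returned field names are assumptions made for this example, not part of the original design.

```python
import hashlib
from pathlib import Path

# Subfolders taken from the layout above.
CATEGORIES = {"forms", "reviews", "attachments"}


def store_document(root, initiative_id, category, filename, data: bytes) -> dict:
    """Write `data` under root/{initiative_id}/{category}/ and describe it."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    # Keep only the final path component so callers cannot escape the folder.
    safe_name = Path(filename).name
    if safe_name in {"", ".", ".."}:
        raise ValueError(f"invalid filename: {filename!r}")

    root = Path(root)
    target_dir = root / str(initiative_id) / category
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / safe_name
    target.write_bytes(data)

    # Hypothetical metadata fields for the accompanying database row.
    return {
        "initiative_id": initiative_id,
        "category": category,
        "path": target.relative_to(root).as_posix(),
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }
```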
---

### 3. Search & Duplicate Detection

**Original:** Elasticsearch + ML Model (Sentence-BERT)
- Separate service
- Model hosting and possible fine-tuning
- Complex deployment
- Resource intensive

**Simplified:** PostgreSQL Full-Text + Trigram Similarity
```sql
-- Enable the trigram extension (full-text search is built in)
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Full-text index, used by to_tsvector(...) @@ to_tsquery(...) searches
CREATE INDEX idx_initiative_description_gin
ON initiatives USING gin(to_tsvector('english', description));

-- Trigram index, used by the % similarity operator below
CREATE INDEX idx_initiative_description_trgm
ON initiatives USING gin(description gin_trgm_ops);

-- Similarity search; % applies pg_trgm.similarity_threshold (default 0.3)
SELECT id, title,
       similarity(description, 'search text') AS score
FROM initiatives
WHERE description % 'search text'
ORDER BY score DESC;
```

**Savings:**
- Built into PostgreSQL
- No ML model to manage
- No separate service
- Good enough for local scale (matches wording overlap, not semantic similarity)

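
To make the `similarity()` score less of a black box, the sketch below approximates how trigram similarity is computed: each word is lowercased, padded with two leading spaces and one trailing space, split into three-character windows, and the score is the number of shared trigrams divided by the number of distinct trigrams in both strings. This is an illustrative Python approximation, not the extension's implementation; it handles ASCII letters and digits only, and the function names are chosen for this example.

```python
import re


def trigrams(text: str) -> set:
    """Return the set of padded trigrams for every word in `text`."""
    grams = set()
    # Simplification: only ASCII letters and digits count as word characters.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two spaces before, one after
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    # "word" has 5 trigrams, "two words" has 10, and 4 are shared: 4 / 11.
    print(round(similarity("word", "two words"), 3))
```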
---

### 4. Event Bus

**Original:** Apache Kafka / RabbitMQ
- Separate service
- Complex configuration
- Message persistence
- Consumer groups

**Simplified:** PostgreSQL NOTIFY/LISTEN
```python
import json

import asyncpg

# Publisher (db: an asyncpg connection or pool created elsewhere)
async def notify_event(event_type, data):
    await db.execute(
        "SELECT pg_notify($1, $2)",
        'initiative_events',
        json.dumps({'type': event_type, 'data': data}),
    )

# asyncpg calls listeners with (connection, pid, channel, payload)
def handle_event(connection, pid, channel, payload):
    event = json.loads(payload)
    print(event['type'], event['data'])

# Listener: the connection must stay open to keep receiving events
async def listen_events():
    conn = await asyncpg.connect(...)
    await conn.add_listener('initiative_events', handle_event)
```

**Savings:**
- No separate service
- Built into database
- Can be made durable by also writing events to a table (NOTIFY itself only reaches listeners that are connected at the time)
- Simple pub/sub

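
The Quick Reference table also lists an in-memory option for the event bus. A minimal sketch of what that could look like inside a single process is shown below, using only `asyncio`. The class name `InMemoryEventBus` and its methods are hypothetical, and events published this way are lost on restart and are not shared between processes.

```python
import asyncio
from collections import defaultdict


class InMemoryEventBus:
    """Single-process pub/sub: handlers are async callables taking `data`."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        """Register an async handler for one event type."""
        self._handlers[event_type].append(handler)

    async def publish(self, event_type, data):
        """Run every handler registered for `event_type` concurrently."""
        handlers = self._handlers.get(event_type, [])
        await asyncio.gather(*(handler(data) for handler in handlers))


async def demo():
    bus = InMemoryEventBus()
    seen = []

    async def on_submitted(data):
        seen.append(data["id"])

    bus.subscribe("initiative.submitted", on_submitted)
    await bus.publish("initiative.submitted", {"id": 7})
    # No handler is registered for this type, so nothing happens.
    await bus.publish("initiative.rejected", {"id": 8})
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```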
---

### 5. Business Rules Engine

**Original:** Drools
- Java-based
- Separate rule files
- Complex syntax
- Additional dependency

**Simplified:** Python Functions/Classes
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidationResult:
    valid: bool
    reason: Optional[str] = None


class NoveltyChecker:
    def check(self, initiative):
        # Check similarity with existing initiatives;
        # find_similar is assumed to be implemented elsewhere
        similar = self.find_similar(initiative)
        if similar:
            return ValidationResult(valid=False, reason="Duplicate found")
        return ValidationResult(valid=True)


class ScoringEngine:
    def calculate(self, reviews):
        # Average the scores, ignoring reviews without one
        scores = [r.score for r in reviews if r.score is not None]
        if not scores:
            return None
        return sum(scores) / len(scores)
```

**Savings:**
- Native Python
- Easy to test
- Version controlled
- No external engine

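
As a rough illustration of the "easy to test" point, the rules can be exercised with plain assertions and no engine to start up. The sketch below repeats `ScoringEngine` so that it runs on its own; the `Review` tuple is a hypothetical stand-in for whatever review object the application uses, and the test function names follow the pytest convention but work as ordinary functions too.

```python
from collections import namedtuple

# Hypothetical stand-in for the application's review object.
Review = namedtuple("Review", "score")


class ScoringEngine:
    def calculate(self, reviews):
        # Average the scores, ignoring reviews without one
        scores = [r.score for r in reviews if r.score is not None]
        if not scores:
            return None
        return sum(scores) / len(scores)


def test_average_ignores_missing_scores():
    reviews = [Review(8), Review(None), Review(6)]
    assert ScoringEngine().calculate(reviews) == 7.0


def test_no_scores_returns_none():
    assert ScoringEngine().calculate([Review(None)]) is None
    assert ScoringEngine().calculate([]) is None
```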
---

## Resource Usage Comparison

### Original Stack
- PostgreSQL: ~200MB RAM
- FastAPI: ~100MB RAM
- Elasticsearch: ~1GB RAM
- Kafka: ~500MB RAM
- MinIO: ~200MB RAM
- **Total: ~2GB RAM minimum**

### Simplified Stack
- PostgreSQL: ~200MB RAM
- FastAPI: ~100MB RAM
- **Total: ~300MB RAM**

**Savings: 85% less memory**

---

## Deployment Complexity

### Original Stack
```
docker-compose.yml:
  - postgres
  - fastapi
  - elasticsearch
  - kafka
  - minio
  - zookeeper (for Kafka)

Total: 6+ containers
```

### Simplified Stack
```
docker-compose.yml:
  - postgres
  - fastapi

Total: 2 containers
```

**Savings: 67% fewer services**

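
For readers who want to check the two savings figures, the arithmetic can be reproduced in a few lines. This sketch uses the approximate RAM numbers and container counts listed above, treating 1GB as 1000MB to match the ~2GB total; the helper name `savings_percent` is chosen for this example.

```python
def savings_percent(before, after):
    """Percentage reduction going from `before` to `after`, rounded."""
    return round(100 * (before - after) / before)


# Approximate RAM figures (MB) from the Resource Usage Comparison.
original_ram_mb = {
    "PostgreSQL": 200,
    "FastAPI": 100,
    "Elasticsearch": 1000,
    "Kafka": 500,
    "MinIO": 200,
}
simplified_ram_mb = {"PostgreSQL": 200, "FastAPI": 100}

ram_saving = savings_percent(
    sum(original_ram_mb.values()), sum(simplified_ram_mb.values())
)
# Container counts from the two docker-compose listings.
container_saving = savings_percent(6, 2)

if __name__ == "__main__":
    print(f"memory: {ram_saving}% less, containers: {container_saving}% fewer")
```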
---

## Maintenance Effort

| Task | Original | Simplified | Time Saved |
|------|----------|------------|------------|
| Setup | 2-3 days | 2-3 hours | 90% |
| Debugging | Complex (multiple services) | Simple (2 services) | 70% |
| Updates | Multiple services | 2 services | 80% |
| Monitoring | Multiple dashboards | Single dashboard | 75% |

---

## When to Upgrade

Consider moving to the original stack only if at least one of the following applies:

1. **Scale:** >10,000 initiatives/year
2. **Users:** >100 concurrent users
3. **Performance:** Response times consistently above 2s
4. **Distribution:** Multi-server deployment is required
5. **Advanced ML:** Sophisticated NLP is needed (e.g. semantic duplicate detection)

For a local application with a typical load (<5,000 initiatives/year), the simplified stack is the better fit.

---

## Migration Path

If you need to scale later, add components one at a time:

1. **Add Redis** for caching (if queries become slow)
2. **Add Elasticsearch** for advanced search (if PostgreSQL search is insufficient)
3. **Add RabbitMQ** for async processing (if you need background jobs)
4. **Move to S3** for documents (if you need cloud storage)

Start simple and scale only when needed.