Create Agent-Context wiki page for chrysopedia

xpltd_admin 2026-04-03 23:15:54 -06:00
parent 4f3de1b1c7
commit 0bb2098654

Agent-Context.-.md Normal file

# Agent Context
| Meta | Value |
|------|-------|
| **Repo** | `xpltdco/chrysopedia` |
| **Language** | Python 3.11+ (backend), TypeScript (frontend) |
| **Framework** | FastAPI + Celery + SQLAlchemy (async) + React 18 + Vite |
| **Entry Point** | `backend/main.py` |
| **Test Command** | `cd backend && python -m pytest tests/ -v` |
| **Build Command** | `docker compose build && docker compose up -d` |
| **Compose Services** | 11 (PostgreSQL, Redis, Qdrant, Ollama, LightRAG, API, Worker, Watcher, nginx, init, healthcheck) |
| **Database** | PostgreSQL 16 (async via asyncpg) |
| **Cache/Queue** | Redis 7 (Celery broker, classification cache, review toggle) |
| **Vector Store** | Qdrant 1.13.2 (nomic-embed-text embeddings via Ollama) |
| **LLM** | OpenAI-compatible API (DGX Sparks Qwen primary, Ollama fallback) |
| **Last Updated** | 2026-04-04 |
## File Index (Read These First)
| Priority | File | Purpose |
|----------|------|---------|
| 1 | `backend/main.py` | FastAPI app setup, router registration |
| 2 | `backend/models.py` | 18 SQLAlchemy ORM models |
| 3 | `backend/config.py` | Settings class (env vars, LRU cached) |
| 4 | `backend/database.py` | Async + sync engine factories |
| 5 | `backend/schemas.py` | Pydantic request/response schemas |
| 6 | `backend/pipeline/stages.py` | 6-stage Celery LLM pipeline |
| 7 | `backend/services/search_service.py` | Qdrant + keyword search |
| 8 | `backend/watcher.py` | Filesystem transcript ingestion |
| 9 | `backend/redis_client.py` | Async Redis client |
| 10 | `backend/routers/` | 11 FastAPI routers (auth, consent, admin, etc.) |
| 11 | `frontend/src/App.tsx` | React router, layout, main structure |
| 12 | `frontend/src/App.css` | All 5,820 lines of styling (monolithic BEM) |
| 13 | `frontend/src/api/public-client.ts` | API client (~600 lines, all TS interfaces) |
| 14 | `frontend/src/pages/` | 11 page components |
| 15 | `docker-compose.yml` | 11-service orchestration |
| 16 | `Dockerfile.api` | Backend/worker image |
| 17 | `Dockerfile.web` | Frontend/nginx image |
| 18 | `prompts/` | LLM prompt templates (XML-style, disk-loaded) |
## Common Modification Patterns
### To add an API endpoint:
1. Create/edit router in `backend/routers/`
2. Add Pydantic schemas in `backend/schemas.py`
3. Register router in `backend/main.py`
4. Add frontend API call in `frontend/src/api/public-client.ts`
### To modify the database schema:
1. Edit models in `backend/models.py`
2. Apply DDL manually (no Alembic migrations currently)
3. Rebuild API + worker: `docker compose build chrysopedia-api && docker compose up -d chrysopedia-api chrysopedia-worker`
### To add a pipeline stage:
1. Add stage function in `backend/pipeline/stages.py`
2. Create prompt template in `prompts/`
3. Register in the Celery task chain
4. Prompts are disk-loaded at runtime (SHA-256 tracked for reproducibility)
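As a rough illustration of step 4, the sketch below reads a template from disk at call time and records a SHA-256 digest of the bytes it read. `LoadedPrompt`, `load_prompt`, and the file name `stage7_example.txt` are hypothetical names chosen for this example; the loader in `backend/pipeline/stages.py` may be structured differently.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import NamedTuple


class LoadedPrompt(NamedTuple):
    """A prompt template plus the digest of the exact bytes that were read."""
    name: str
    text: str
    sha256: str


def load_prompt(prompts_dir: Path, name: str) -> LoadedPrompt:
    """Read a template from disk on every call and fingerprint its contents."""
    raw = (prompts_dir / name).read_bytes()
    return LoadedPrompt(name, raw.decode("utf-8"), hashlib.sha256(raw).hexdigest())


# Demo against a throwaway directory standing in for prompts/.
with tempfile.TemporaryDirectory() as tmp:
    prompts_dir = Path(tmp)
    # "stage7_example.txt" is a made-up file name, not one from the repo.
    template = prompts_dir / "stage7_example.txt"
    template.write_text("<task>Summarize</task>", encoding="utf-8")
    first = load_prompt(prompts_dir, "stage7_example.txt")

    # Editing the template changes the digest, so a stored digest identifies
    # which prompt version produced a given pipeline result.
    template.write_text("<task>Summarize briefly</task>", encoding="utf-8")
    second = load_prompt(prompts_dir, "stage7_example.txt")

print(first.sha256 != second.sha256)  # True
```

Storing the digest next to each stage's output is one way to tell later which version of a prompt produced a given result, which is presumably what "SHA-256 tracked" refers to.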
### To add a frontend page:
1. Create component in `frontend/src/pages/`
2. Add route in `frontend/src/App.tsx`
3. Style in `frontend/src/App.css` (monolithic, BEM conventions)
## Gotchas
1. **Dual SQLAlchemy engines (D004)** — FastAPI uses async engine (asyncpg). Celery uses sync engine (psycopg2). Never use the async session in Celery tasks.
2. **asyncpg timezone trap** — Store naive UTC timestamps: `datetime.now(timezone.utc).replace(tzinfo=None)`. asyncpg rejects timezone-aware datetimes for `TIMESTAMP WITHOUT TIME ZONE` columns.
3. **SQLAlchemy reserved names** — Never use `relationship`, `query`, `metadata` as column names — they shadow SQLAlchemy internals.
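4. **Vite build constants** — Values passed to Vite's `define` option in `vite.config.ts` must be wrapped with `JSON.stringify()` (e.g., `JSON.stringify(version)`); otherwise the raw value is substituted into the bundle as an unquoted expression.
5. **Docker ARG/ENV ordering** — In Dockerfile.web: `ARG` → `ENV` → `RUN npm run build`. ENV must precede the build step.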
6. **ZFS file watching** — Use `watchdog.observers.polling.PollingObserver`, not inotify (doesn't work on ZFS).
7. **Nginx stale DNS** — After API container rebuild, restart nginx: `docker compose restart chrysopedia-web-8096`
8. **Port 8000 conflict** — ub01:8000 may be used by another service; internal API runs on 8000 inside Docker network, exposed as 8096.
9. **Non-blocking embeddings (D005)** — Stage 6 (embedding/Qdrant indexing) failures don't block pipeline output. A technique page can exist without embeddings.
10. **Redis classification cache** — Stage 4 classification results stored in Redis with 24h TTL, not in the database.
11. **Monolithic CSS** — All 5,820 lines in one file (`App.css`). No CSS modules, no preprocessor. BEM naming with 77 custom properties.
12. **LightRAG service** — Runs on port 9621 (localhost only). Data stored in `/vmPool/r/services/chrysopedia_lightrag/`. Separate `.env.lightrag` config.
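Gotcha 2 is the easiest one to hit by accident, so here is a minimal sketch of two helpers that keep every timestamp naive and in UTC before it reaches asyncpg. The names `utcnow_naive` and `to_naive_utc` are hypothetical and do not come from the repo. The sketch also assumes that an incoming naive value is already in UTC.

```python
from datetime import datetime, timedelta, timezone


def utcnow_naive() -> datetime:
    """Current UTC time with tzinfo stripped, as in gotcha 2."""
    return datetime.now(timezone.utc).replace(tzinfo=None)


def to_naive_utc(value: datetime) -> datetime:
    """Convert an aware datetime to UTC, then drop tzinfo.

    Naive inputs are returned unchanged on the assumption that they are
    already UTC.
    """
    if value.tzinfo is None:
        return value
    return value.astimezone(timezone.utc).replace(tzinfo=None)


# An aware timestamp at UTC-6 lands on the next calendar day in UTC.
aware = datetime(2026, 4, 3, 23, 15, 54, tzinfo=timezone(timedelta(hours=-6)))
print(to_naive_utc(aware))  # 2026-04-04 05:15:54
```

Converting with `astimezone` before stripping matters: calling `.replace(tzinfo=None)` directly on a non-UTC value would silently keep the local wall-clock time.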
## Pipeline Stages
| Stage | Purpose | LLM Config |
|-------|---------|------------|
| 1 | Segmentation | Chat model |
| 2 | Key Moments extraction | Chat model |
| 3 | (Reserved) | — |
| 4 | Classification | Chat model (cached in Redis) |
| 5 | Synthesis (page generation) | Thinking model |
| 6 | Embedding + Qdrant indexing | Ollama (nomic-embed-text) |
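The non-blocking behaviour of stage 6 (D005) can be illustrated in plain Python, without Celery. This is only a sketch of the control flow under stated assumptions: `run_pipeline`, `synthesize`, `embed`, and the `failed_stages` key are hypothetical names, and the real task chain in `backend/pipeline/stages.py` may handle failures differently.

```python
import logging
from typing import Callable

logger = logging.getLogger("pipeline_sketch")

# A stage takes the accumulated payload and returns an updated payload.
Stage = Callable[[dict], dict]


def run_pipeline(payload: dict, blocking: list[Stage], best_effort: list[Stage]) -> dict:
    """Run blocking stages in order, then best-effort stages.

    An exception in a blocking stage propagates and aborts the run. An
    exception in a best-effort stage is logged and recorded, and the output
    produced so far is kept.
    """
    for stage in blocking:
        payload = stage(payload)
    for stage in best_effort:
        try:
            payload = stage(payload)
        except Exception:
            logger.exception("best-effort stage %s failed", stage.__name__)
            payload.setdefault("failed_stages", []).append(stage.__name__)
    return payload


def synthesize(payload: dict) -> dict:
    # Stand-in for stage 5 (page generation).
    return {**payload, "page": f"Technique page for {payload['video_id']}"}


def embed(payload: dict) -> dict:
    # Stand-in for stage 6; simulates the vector store being unreachable.
    raise ConnectionError("vector store unreachable")


result = run_pipeline({"video_id": "demo"}, [synthesize], [embed])
print(result["page"])           # Technique page for demo
print(result["failed_stages"])  # ['embed']
```

Recording the failed stage name in the payload is a design choice of this sketch. It makes it possible to re-run only the embedding step later, which fits the note that a technique page can exist without embeddings.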
## Verification Commands
```bash
# SSH to ub01
ssh ub01
cd /vmPool/r/repos/xpltdco/chrysopedia
# Check running services
docker ps --filter name=chrysopedia
# View API logs
docker logs -f chrysopedia-api
# View worker logs
docker logs -f chrysopedia-worker
# Health check
curl http://ub01:8096/health
# Database shell
psql -h ub01 -p 5433 -U chrysopedia
# Full rebuild
docker compose build && docker compose up -d
```
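After a full rebuild the API may need a few seconds before `/health` responds. The following stdlib-only sketch polls the endpoint until it answers. It assumes only that a healthy API returns HTTP 200; the response body is not inspected because its format is not documented here. `is_healthy` and `wait_until_healthy` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if `url` answers with HTTP 200 within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib.error.URLError and HTTPError subclass OSError, so refused
        # connections, timeouts, and non-2xx responses all end up here.
        return False


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 2.0) -> bool:
    """Poll `url` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if is_healthy(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example call (not executed here because it needs network access to ub01):
#     wait_until_healthy("http://ub01:8096/health")
```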