4 Architecture
jlightner edited this page 2026-04-04 10:31:50 -05:00

Architecture

System Overview

Chrysopedia is a self-hosted music production knowledge base that synthesizes technique articles from video transcripts using a 7-stage LLM pipeline. It runs as a Docker Compose stack on ub01 with 11 containers.

┌─────────────────────────────────────────────────────────────────┐
│                         ub01 (10.0.0.10)                        │
│  Docker Compose: xpltd_chrysopedia  Subnet: 172.32.0.0/24      │
│                                                                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────┐   │
│  │ nginx    │  │ FastAPI  │  │ Celery   │  │ Watcher      │   │
│  │ :8096    │─▶│ :8000    │  │ Worker   │  │ (PollingObs) │   │
│  └──────────┘  └────┬─────┘  └────┬─────┘  └──────┬───────┘   │
│                     │             │                │            │
│        ┌────────────┼─────────────┼────────────────┘            │
│        ▼            ▼             ▼                             │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────┐   │
│  │ Postgres │  │ Redis    │  │ Qdrant   │  │ Ollama       │   │
│  │ :5433    │  │ :6379    │  │ :6333    │  │ :11434       │   │
│  └──────────┘  └──────────┘  └──────────┘  └──────────────┘   │
│                                                                 │
│                    ┌──────────────┐                              │
│                    │ LightRAG     │                              │
│                    │ :9621        │                              │
│                    └──────────────┘                              │
└─────────────────────────────────────────────────────────────────┘
         ▲
         │ nginx reverse proxy
┌────────┴────────┐
│ nuc01 (10.0.0.9)│
│ chrysopedia.com │
│ :443 → :8096    │
└─────────────────┘

Key Architectural Characteristics

  • Zero external frontend dependencies beyond React, react-router-dom, and Vite
  • Monolithic CSS — 5,820 lines, single file, BEM naming, 77 custom properties
  • JWT authentication — invite-code registration, bcrypt passwords, HS256 tokens, role-based access control (admin, creator)
  • Dual SQLAlchemy strategy — async engine for FastAPI request handlers, sync engine for Celery pipeline tasks (D004)
  • Non-blocking pipeline side effects — embedding/Qdrant failures don't block page synthesis (D005)
  • LightRAG — graph-based RAG knowledge base backed by Qdrant + Ollama, running as a standalone service
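
The non-blocking side-effect rule (D005) can be illustrated with a small wrapper. This is a minimal sketch, not the project's actual task code: `run_side_effect`, `synthesize_page`, and the `embed` callable are hypothetical names chosen for illustration.

```python
import logging
from typing import Callable

logger = logging.getLogger("pipeline")


def run_side_effect(name: str, action: Callable[[], None]) -> bool:
    """Run a best-effort step; log and swallow any failure.

    Returns True on success and False on failure, so the caller can record
    the outcome without aborting the main task.
    """
    try:
        action()
        return True
    except Exception:
        logger.exception("side effect %r failed; continuing", name)
        return False


def synthesize_page(video_id: str, embed: Callable[[str], None]) -> dict:
    """Hypothetical stage: build the page, then index it on a best-effort basis."""
    page = {"video_id": video_id, "status": "synthesized"}
    # An embedding/Qdrant outage is recorded on the page but never raised.
    page["embedded"] = run_side_effect("embed", lambda: embed(video_id))
    return page
```

Recording the outcome as a flag rather than raising means a later job could retry the embedding step without re-running synthesis; whether the real pipeline does this is not stated here.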

Docker Services

Service          Image                          Container             Port              Volume
PostgreSQL 16    postgres:16-alpine             chrysopedia-db        5433:5432         chrysopedia_postgres_data
Redis 7          redis:7-alpine                 chrysopedia-redis     6379 (internal)   -
Qdrant 1.13.2    qdrant/qdrant:v1.13.2          chrysopedia-qdrant    6333 (internal)   chrysopedia_qdrant_data
Ollama           ollama/ollama:latest           chrysopedia-ollama    11434 (internal)  chrysopedia_ollama_data
LightRAG         ghcr.io/hkuds/lightrag:latest  chrysopedia-lightrag  9621 (internal)   chrysopedia_lightrag_data
API (FastAPI)    Dockerfile.api                 chrysopedia-api       8000 (internal)   Bind: backend/, prompts/
Worker (Celery)  Dockerfile.api                 chrysopedia-worker    -                 Bind: backend/, prompts/
Watcher          Dockerfile.api                 chrysopedia-watcher   -                 Bind: watch dir
Web (nginx)      Dockerfile.web                 chrysopedia-web-8096  8096:80           -

Note: The table above lists 9 services. The 11-container count also includes two infrastructure containers not listed there (nginx-internal-healthcheck and db-init). The exact running count may vary because one-shot init containers exit once they finish.

Network Topology

  • Compose subnet: 172.32.0.0/24 (D015)
  • External access: nginx on nuc01 (10.0.0.9) reverse-proxies to ub01:8096
  • DNS: AdGuard Home rewrites chrysopedia.com → 10.0.0.9
  • Internal services (Redis, Qdrant, Ollama, LightRAG) are not exposed outside the Docker network
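
When reading logs, it can help to tell Compose-network addresses apart from LAN hosts. The sketch below uses only the subnet given above; the helper name `in_compose_subnet` and the sample addresses are illustrative assumptions.

```python
import ipaddress

# Compose subnet from D015.
COMPOSE_SUBNET = ipaddress.ip_network("172.32.0.0/24")


def in_compose_subnet(address: str) -> bool:
    """Return True if the address belongs to the Compose network."""
    return ipaddress.ip_address(address) in COMPOSE_SUBNET
```

For example, `in_compose_subnet("172.32.0.7")` is true for a container address, while the ub01 host address `10.0.0.10` is on the LAN and returns false.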

Tech Stack

Layer            Technology
Frontend         React 18 + TypeScript + Vite
Backend          FastAPI + Celery + SQLAlchemy (async engine for the API, sync engine for Celery tasks)
Database         PostgreSQL 16
Cache/Broker     Redis 7 (Celery broker + Beat scheduler + review mode toggle + classification cache + rate limit counters)
Vector Store     Qdrant 1.13.2
Knowledge Graph  LightRAG (graph-based RAG, port 9621)
Embeddings       Ollama (nomic-embed-text) via OpenAI-compatible /v1/embeddings
LLM              OpenAI-compatible API: DGX Sparks Qwen primary, local Ollama fallback
Authentication   JWT (HS256, 24h expiry) + bcrypt + invite codes
Deployment       Docker Compose on ub01, nginx reverse proxy on nuc01
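
To make the Authentication row concrete, here is a standard-library sketch of issuing and verifying an HS256 token with a 24-hour expiry. It is illustrative only: the claim names (`sub`, `role`) and the functions `issue_token` and `verify_token` are assumptions, and a real deployment would normally rely on a maintained JWT library rather than hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

TOKEN_TTL_SECONDS = 24 * 60 * 60  # 24h expiry, as listed in the table


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, subject: str, role: str,
                now: Optional[int] = None) -> str:
    """Build header.claims.signature, signed with HMAC-SHA256."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": subject, "role": role,
              "iat": now, "exp": now + TOKEN_TTL_SECONDS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(secret: bytes, token: str,
                 now: Optional[int] = None) -> dict:
    """Check signature, algorithm, and expiry; return the claims."""
    now = int(time.time()) if now is None else now
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, claims_b64, sig_b64 = parts
    expected = hmac.new(secret, f"{header_b64}.{claims_b64}".encode("ascii"),
                        hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(claims_b64))
    if now >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

Passing `now` explicitly keeps the expiry check deterministic in tests. Role-based access control would then compare the verified `role` claim against the role an endpoint requires.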

Data Flow

  1. Ingestion: Video files → Whisper transcription (desktop, RTX 4090) → JSON transcript
  2. Upload: Transcript JSON dropped into watch folder or POSTed to /api/v1/ingest
  3. Pipeline: 6 Celery stages process each video (see Pipeline)
  4. Storage: Technique pages + key moments → PostgreSQL, embeddings → Qdrant
  5. Serving: React SPA fetches from FastAPI; search queries hit Qdrant first, then fall back to PostgreSQL
  6. Auth: JWT-protected endpoints for creator consent management and admin features (see Authentication)
  7. Notifications: Celery Beat runs a daily email digest task (09:00 UTC) that queries new content per followed creator, composes HTML, and sends it via SMTP. Sent digests are deduplicated via the EmailDigestLog table (M025/S01)
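
The search path in step 5 can be sketched as a simple fallback chain. This is an assumption-laden illustration: the text does not say whether the fallback triggers on errors, on empty results, or both, so the sketch treats both as a miss. `search_with_fallback` and the two search callables are hypothetical stand-ins for the real Qdrant and PostgreSQL queries.

```python
from typing import Callable, List, Sequence, Tuple

# A search function takes (query, limit) and returns a sequence of hits.
SearchFn = Callable[[str, int], Sequence[dict]]


def search_with_fallback(
    query: str,
    vector_search: SearchFn,
    keyword_search: SearchFn,
    limit: int = 10,
) -> Tuple[str, List[dict]]:
    """Try the vector store first; fall back to the database on a miss.

    Returns the name of the backend that answered plus its hits.
    """
    try:
        hits = list(vector_search(query, limit))
    except Exception:
        # Assumption: an unreachable vector store counts as a miss.
        hits = []
    if hits:
        return "qdrant", hits
    return "postgres", list(keyword_search(query, limit))
```

Returning the backend name alongside the hits makes it easy to log how often the fallback path is taken.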

See also: Deployment, Pipeline, Data-Model, Authentication

