chrysopedia/PROJECT_CONTEXT.md

Chrysopedia — Project Context Document

Auto-generated: 2026-04-01 | Assessed Stage: Integration/Stabilization | Root: /home/aux/projects/content-to-kb-automator

Overview

Chrysopedia is a self-hosted knowledge extraction and retrieval system for electronic music production content. It takes raw video files (tutorials, livestreams, track breakdowns) from 50+ electronic music producers, transcribes them via Whisper, runs them through a multi-stage LLM pipeline to extract structured knowledge, and serves the results through a search-first web UI designed for mid-session retrieval — a producer Alt+Tabs from their DAW, searches for a technique, absorbs the answer, and gets back to work in under 30 seconds.

Audience: Electronic music producers, primarily one power user (the project owner) with a personal library of 100-500 video files. Single-admin tool, not multi-tenant.

Project type: Full-stack web application with an LLM-powered data pipeline. Monorepo with backend (Python/FastAPI), frontend (React/TypeScript), Whisper transcription script, Docker Compose deployment, and prompt engineering toolkit.

Evidence for purpose: Extensive 37-page spec (chrysopedia-spec.md), README with architecture diagrams, detailed PROJECT.md in GSD artifacts, 23 decisions logged, 32 requirements tracked (28 validated, 1 active, 4 out-of-scope). Etymology: chrysopoeia (alchemical transmutation) + encyclopedia.

Canonical development directory: This is not the active development location. Per CLAUDE.md, all future development happens on ub01 at /vmPool/r/repos/xpltdco/chrysopedia. This directory was the initial workspace. GitHub: github.com/xpltdco/chrysopedia (private, xpltdco org).


Architecture & Stack

Technology Stack

| Layer | Technology | Version/Notes |
|---|---|---|
| Backend | Python 3.12, FastAPI, SQLAlchemy (async), Pydantic Settings | API + business logic |
| Task Queue | Celery + Redis (broker + result backend) | Sync tasks, concurrency=1 |
| Database | PostgreSQL 16 (asyncpg driver) | Primary data store |
| Vector DB | Qdrant v1.13.2 | Semantic search embeddings |
| Embeddings | Ollama (nomic-embed-text, 768-dim) | Local CPU inference |
| LLM | OpenAI-compatible API (DGX Sparks Qwen primary, Ollama fallback) | Per-stage model routing (chat vs thinking) |
| Frontend | React 18.3, TypeScript 5.6, Vite 6, React Router 6.28 | Zero UI libraries — all custom CSS |
| Web Server | nginx 1.27 (Alpine) | SPA routing + API proxy |
| Containerization | Docker Compose | 8 services, dedicated bridge network |
| Deployment | ub01 (on-premises server) | Bind mounts to /vmPool/r/services/chrysopedia_* |
| Reverse Proxy | nginx on nuc01 (separate machine) | Routes chrysopedia.xpltd.co → ub01:8096 |

System Architecture

Desktop (GPU workstation — hal0022)
  └── whisper/transcribe.py → JSON transcripts → SCP/rsync to /watch folder

Docker Compose on ub01 (8 services on 172.32.0.0/24):
  ┌──────────────┐  ┌───────┐  ┌────────┐  ┌────────┐
  │ PostgreSQL   │  │ Redis │  │ Qdrant │  │ Ollama │
  │ :5433→5432   │  │ broker│  │ vector │  │ embed  │
  └──────┬───────┘  └───┬───┘  └───┬────┘  └───┬────┘
         └──────┬───────┴──────────┴───────────┘
                │
  ┌─────────────┼─────────────────────────────────┐
  │ FastAPI API  │  Celery Worker  │  Watcher     │
  │ REST + admin │  LLM pipeline   │  /watch→POST │
  └──────────────┴─────────────────┴──────────────┘
                │
  ┌─────────────┴──────┐
  │ nginx (React SPA)  │
  │ :8096→80           │
  └────────────────────┘

Data flow: Video → Whisper transcript JSON → Watcher POSTs to /api/v1/ingest → Celery pipeline (4 LLM stages: segment → extract → classify → synthesize) → KeyMoments + TechniquePages in PostgreSQL → Embeddings in Qdrant → Search-first web UI.
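The four LLM stages above can be sketched as a simple function chain. This is a minimal illustration only: the stub bodies are stand-ins, not the project's actual logic, which lives in backend/pipeline/stages.py and calls an LLM at each step.

```python
# Illustrative stand-ins for the real LLM-backed stages.
def segment(transcript: list[str]) -> list[dict]:
    return [{"text": t} for t in transcript]            # stage 2: topical segmentation

def extract(segments: list[dict]) -> list[dict]:
    return [{"title": s["text"]} for s in segments]     # stage 3: key-moment extraction

def classify(moments: list[dict]) -> list[dict]:
    return [{**m, "topic_category": "sound-design"} for m in moments]  # stage 4: taxonomy

def synthesize(moments: list[dict]) -> dict:
    return {"title": moments[0]["title"], "moments": moments}          # stage 5: page synthesis

def run_pipeline(transcript: list[str]) -> dict:
    """Stage 2→5 chain: segment → extract → classify → synthesize."""
    return synthesize(classify(extract(segment(transcript))))
```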

External integrations:

  • OpenWebUI at chat.forgetyour.name (DGX Sparks Qwen models for LLM inference)
  • AdGuard DNS on ub01 for internal domain resolution
  • nginx on nuc01 for external HTTPS termination (via Certbot)

Data Model

11 entities across 11 tables:

| Entity | Purpose | Key Fields |
|---|---|---|
| Creator | Artists/producers | name, slug, genres[], folder_name, hidden |
| SourceVideo | Processed video files | filename, content_hash (dedup), processing_status, classification_data (JSONB) |
| TranscriptSegment | Whisper output rows | start_time, end_time, text, segment_index, topic_label |
| KeyMoment | LLM-extracted insights | title, summary, start_time, end_time, content_type, plugins[] |
| TechniquePage | Synthesized knowledge (primary output) | title, slug, topic_category, topic_tags[], body_sections (JSONB), signal_chains (JSONB), plugins[] |
| TechniquePageVersion | Pre-overwrite snapshots | content_snapshot (JSONB), pipeline_metadata (JSONB), version_number |
| RelatedTechniqueLink | Cross-references | source→target, relationship type |
| Tag | Topic taxonomy | name, category, aliases[] |
| ContentReport | User-reported issues | report_type, status, admin_notes |
| PipelineRun | Pipeline execution record | video_id, run_number, trigger, status, total_tokens |
| PipelineEvent | Per-stage execution log | stage, event_type, token counts, payload (JSONB), debug I/O columns |

Relationships: Creator → SourceVideo → TranscriptSegment, KeyMoment; Creator → TechniquePage → KeyMoment, TechniquePageVersion, RelatedTechniqueLink; SourceVideo → PipelineRun → PipelineEvent.

Migrations: 11 Alembic migrations (001 through 011), covering initial schema through pipeline runs and classification cache additions.
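The content_hash dedup on SourceVideo can be illustrated with a minimal sketch. What exactly gets hashed (raw video file vs. transcript JSON) is an assumption here; only the pattern is the point.

```python
import hashlib

# Hypothetical sketch of SourceVideo.content_hash dedup: hash the ingested
# payload so re-ingesting identical content is detected before any pipeline
# work runs. The real hashing inputs are an assumption.
def content_hash(payload: bytes) -> str:
    """Stable SHA-256 hex digest used as a dedup key."""
    return hashlib.sha256(payload).hexdigest()

def is_duplicate(payload: bytes, known_hashes: set[str]) -> bool:
    return content_hash(payload) in known_hashes
```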


Project Structure

chrysopedia/
├── backend/                    # FastAPI application (10,209 LOC Python)
│   ├── main.py                 # App entry, middleware, router mounting
│   ├── config.py               # Pydantic Settings (all env vars)
│   ├── database.py             # Async engine + session factory
│   ├── models.py               # 11 SQLAlchemy ORM models
│   ├── schemas.py              # Pydantic request/response schemas (422 lines)
│   ├── worker.py               # Celery app config
│   ├── watcher.py              # Folder monitor → auto-ingest service
│   ├── search_service.py       # Async semantic + keyword search (603 lines)
│   ├── redis_client.py         # Redis client for feature flags
│   ├── routers/                # 9 API router modules
│   │   ├── health.py, ingest.py, search.py, techniques.py
│   │   ├── creators.py, topics.py, videos.py
│   │   ├── pipeline.py (admin), reports.py
│   ├── pipeline/               # LLM pipeline core (2,908 LOC)
│   │   ├── stages.py           # 4 LLM stages + orchestrator (2,102 lines — largest file)
│   │   ├── llm_client.py       # OpenAI-compatible sync client with fallback
│   │   ├── embedding_client.py # Sync embedding client for Celery
│   │   ├── qdrant_client.py    # Qdrant upsert + collection management
│   │   ├── schemas.py          # Pipeline data schemas
│   │   └── quality/            # Prompt optimization toolkit (2,507 LOC)
│   │       ├── fitness.py      # LLM fitness test suite (9 tests)
│   │       ├── scorer.py       # 5-dimension LLM-as-judge scoring
│   │       ├── optimizer.py    # Automated prompt A/B optimization
│   │       ├── variant_generator.py  # LLM-powered prompt mutation
│   │       └── voice_dial.py   # Voice preservation dial
│   └── tests/                  # Integration tests (2,754 LOC, 65 tests)
├── frontend/                   # React SPA (9,975 LOC TypeScript + CSS)
│   └── src/
│       ├── pages/              # 10 page components
│       ├── components/         # 9 shared components
│       ├── hooks/              # 2 custom hooks
│       ├── api/                # Typed API client
│       └── App.css             # 4,871 lines — all styles (no CSS framework)
├── whisper/                    # Desktop transcription scripts
├── prompts/                    # 3 active prompt templates + 100 stage5 variants
├── alembic/                    # 11 database migrations
├── config/                     # canonical_tags.yaml (7-category topic taxonomy)
├── docker/                     # Dockerfile.api, Dockerfile.web, nginx.conf
├── docker-compose.yml          # 8-service stack definition
├── generate_stage5_variants.py # Stage 5 prompt variant generator (874 lines — one-off tool)
├── .gsd/                       # GSD project management artifacts
│   ├── PROJECT.md, REQUIREMENTS.md, DECISIONS.md, KNOWLEDGE.md
│   └── milestones/             # 13 completed milestone artifacts
└── .env.example                # Environment variable template

Entry points:

  • backend/main.py → FastAPI app (uvicorn main:app)
  • backend/worker.py → Celery worker (celery -A worker worker)
  • backend/watcher.py → Folder watcher service (python watcher.py)
  • frontend/src/main.tsx → React app (Vite dev server or nginx-served build)
  • whisper/transcribe.py → Desktop transcription CLI
  • backend/pipeline/quality/__main__.py → Prompt quality toolkit CLI

Configuration & Environment

Environment Variables

| Variable | Purpose | Default |
|---|---|---|
| POSTGRES_USER | Database user | chrysopedia |
| POSTGRES_PASSWORD | Database password | changeme |
| POSTGRES_DB | Database name | chrysopedia |
| DATABASE_URL | Full async connection string | Composed from above |
| REDIS_URL | Redis broker URL | redis://chrysopedia-redis:6379/0 |
| LLM_API_URL | Primary LLM endpoint | OpenWebUI on DGX |
| LLM_API_KEY | LLM authentication | Required |
| LLM_MODEL | Default LLM model name | fyn-llm-agent-chat |
| LLM_FALLBACK_URL / _MODEL | Fallback LLM endpoint | Same as primary |
| LLM_STAGE{2-5}_MODEL | Per-stage model override | chat for 2/4, think for 3/5 |
| LLM_STAGE{2-5}_MODALITY | chat or thinking per stage | See above |
| LLM_MAX_TOKENS | LLM response token limit | 32768 |
| LLM_TEMPERATURE | LLM temperature | 0.0 (deterministic) |
| SYNTHESIS_CHUNK_SIZE | Max moments per synthesis call | 30 |
| EMBEDDING_API_URL | Ollama embedding endpoint | Container-internal |
| EMBEDDING_MODEL | Embedding model name | nomic-embed-text |
| EMBEDDING_DIMENSIONS | Vector dimensionality | 768 |
| QDRANT_URL | Qdrant endpoint | Container-internal |
| QDRANT_COLLECTION | Qdrant collection name | chrysopedia |
| APP_ENV | Environment name | development |
| APP_LOG_LEVEL | Log level | info |
| APP_SECRET_KEY | Application secret | changeme-generate-a-real-secret |
| CORS_ORIGINS | Allowed CORS origins | ["*"] |
| REVIEW_MODE | Require admin review of moments | true |
| DEBUG_MODE | Capture full LLM I/O in events | false |
| TRANSCRIPT_STORAGE_PATH | Transcript file storage | /data/transcripts |
| VIDEO_METADATA_PATH | Video metadata storage | /data/video_meta |
| PROMPTS_PATH | Prompt template directory | ./prompts |
| GIT_COMMIT_SHA | Build-time commit hash | unknown |
| WATCH_FOLDER | Watcher monitored directory | /watch |
| WATCHER_API_URL | Ingest endpoint for watcher | Container-internal |
| WATCHER_STABILITY_SECONDS | File stability wait time | 2 |
| WATCHER_POLL_INTERVAL | Filesystem poll interval | 5 |
| GIT_COMMIT_SHA (build arg) | Passed at Docker build time for footer | dev |
| VITE_GIT_COMMIT (build arg) | Frontend build-time constant | dev |

Environments

  • Production: Docker Compose on ub01, .env file with real credentials
  • Local dev: Backend runs locally with docker compose up -d chrysopedia-db chrysopedia-redis, .env in backend/
  • Test: Uses real PostgreSQL (test database), configured in backend/tests/conftest.py
  • No staging environment exists.

Secrets Management

Environment variables via .env file (gitignored). No vault, KMS, or sealed secrets. The .env.example contains placeholders. backend/.env exists locally (not tracked in git) and contains a real API key — this is expected for local dev but the key should be rotated if this directory is ever shared.


Development Workflow

Getting Started

# 1. Clone the repo
git clone git@github.com:xpltdco/chrysopedia.git
cd chrysopedia

# 2. Configure environment
cp .env.example .env
# Edit .env with real LLM_API_KEY and POSTGRES_PASSWORD

# 3. Start infrastructure
docker compose up -d

# 4. Run migrations
docker exec chrysopedia-api alembic upgrade head

# 5. Pull embedding model (first time)
docker exec chrysopedia-ollama ollama pull nomic-embed-text

# 6. Verify
curl http://localhost:8096/health

For local backend development (outside Docker):

python -m venv .venv && source .venv/bin/activate
pip install -r backend/requirements.txt
docker compose up -d chrysopedia-db chrysopedia-redis  # just infra
alembic upgrade head
cd backend && uvicorn main:app --reload --host 0.0.0.0 --port 8001  # 8001 to avoid kerf-engine conflict on 8000

For frontend development:

cd frontend && npm ci && npm run dev

Key Commands

| Task | Command |
|---|---|
| Start full stack | docker compose up -d |
| Rebuild after code changes | docker compose build && docker compose up -d |
| Run migrations | docker exec chrysopedia-api alembic upgrade head |
| Create migration | alembic revision --autogenerate -m "description" |
| View API logs | docker logs -f chrysopedia-api |
| View worker logs | docker logs -f chrysopedia-worker |
| Run tests | cd backend && pytest |
| Frontend dev server | cd frontend && npm run dev |
| Frontend build | cd frontend && npm run build |
| Prompt quality CLI | cd backend && python -m pipeline.quality |
| Deploy to ub01 | ssh ub01, then cd /vmPool/r/repos/xpltdco/chrysopedia && git pull && docker compose build && docker compose up -d |

CI/CD Pipeline

None. No .github/workflows/, no CI config files. Deployment is manual: git pull && docker compose build && docker compose up -d on ub01. [inferred — high confidence based on absence of any CI configuration]

Code Conventions

  • Python: No linter config (no ruff, black, flake8 config files found). Code follows PEP 8 by convention. Type hints used throughout (Python 3.12 features like X | None).
  • TypeScript: No ESLint config. TypeScript strict mode via tsconfig. Zero-dependency UI (no UI libraries, no Tailwind).
  • CSS: Single monolithic App.css (4,871 lines). 77 CSS custom properties for theming. Dark theme with cyan accent (#22d3ee).
  • Naming: Slugified URLs, snake_case Python, camelCase TypeScript. SQLAlchemy models use Mapped annotations. Pydantic schemas use model_config = {"from_attributes": True}.
  • No pre-commit hooks, no .editorconfig, no formatter configs.

Current State Assessment

Stage: Integration/Stabilization — All 13 milestones complete. 28 of 32 requirements validated. 171 commits over 3 days (March 29–April 1, 2026) by a single contributor. The system is deployed and running. However, it was built rapidly by AI agents (GSD workflow), the pipeline now runs its stages inline rather than via the originally designed Celery chain (recent commit 29f6e74), and there are no CI/CD guardrails. The codebase is functional but hasn't been through the hardening that comes from sustained multi-user operation.

Recent Activity

  • 171 commits from 2026-03-29 to 2026-04-01 (3 days of intense development)
  • Single contributor: jlightner
  • Last commit: 29f6e74 — "pipeline: run stages inline instead of Celery chain dispatch"
  • Most recent work: Stage 5 prompt optimization (100 variant prompts generated), inline pipeline execution, prompt quality toolkit (M013)

Active Branches

Only main exists. All development has been on a single branch. No feature branches, no release branches.

What's Working

  • Full 6-stage pipeline (transcription → ingestion → LLM extraction → review → synthesis → search)
  • Docker Compose deployment with 8 services, healthchecks on all containers
  • Search (semantic via Qdrant + keyword fallback with multi-token AND matching)
  • Admin review queue with approve/edit/reject workflows
  • Pipeline admin dashboard with event logs, token usage, retrigger controls
  • 10-page React SPA with responsive design, topic taxonomy, creator browse, technique detail
  • Folder watcher for auto-ingestion of new transcripts
  • Article versioning with pipeline metadata snapshots
  • 65 integration tests covering all major API paths
  • Prompt quality toolkit (fitness tests, scoring, automated optimization)
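The keyword fallback's multi-token AND matching can be sketched in a few lines. The real version builds ILIKE conditions in search_service.py; this is a simplified in-memory analogue showing only the matching rule (every query token must appear in the text).

```python
# Simplified analogue of multi-token AND matching: a document matches only
# if every token of the query appears somewhere in it, case-insensitively.
def keyword_match(query: str, text: str) -> bool:
    haystack = text.lower()
    return all(token in haystack for token in query.lower().split())
```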

What's In Progress

  • Stage 5 prompt optimization: 100 variant prompts generated (prompts/stage5_variants/), active A/B testing with the quality toolkit. The most recent commits are all prompt refinement.
  • Inline pipeline execution: The latest commit switches from Celery chain dispatch to inline stage execution, suggesting the Celery chaining had issues.
  • generate_stage5_variants.py (874 lines) is a one-off script at project root — should likely be absorbed into the quality toolkit or removed.

Technical Debt Inventory

Zero TODOs/FIXMEs/HACKs in source code. All annotations found were in node_modules/ (third-party). This is notable — either debt was addressed as it arose, or code annotations weren't used as a practice.

Implicit debt captured in KNOWLEDGE.md:

  • QdrantManager uses random UUIDs for point IDs, causing duplicates on re-index (noted as deferred fix — use deterministic UUIDs)
  • LLM-generated topic categories have inconsistent casing (deferred)
  • Stage 4 classification data stored in Redis with 24h TTL instead of DB columns (expedient but fragile)
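The first deferred fix (deterministic Qdrant point IDs) is small. A sketch using uuid5 over stable identifiers follows; the exact key fields (slug plus chunk index) are an assumption, not the project's actual scheme.

```python
import uuid

# Hypothetical fix for the duplicate-points issue: derive the Qdrant point ID
# from stable identifiers instead of uuid4(), so re-indexing the same content
# overwrites the existing point via upsert.
def deterministic_point_id(technique_slug: str, chunk_index: int) -> str:
    """Same (slug, chunk) pair always maps to the same point ID."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"{technique_slug}:{chunk_index}"))
```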

Structural debt:

  • frontend/src/App.css — 4,871-line monolithic stylesheet. No CSS modules, no component-scoped styles.
  • backend/pipeline/stages.py — 2,102 lines. All 4 LLM stages + orchestrator in one file.
  • generate_stage5_variants.py — 874-line one-off script at project root.
  • prompts/stage5_variants/.v016.txt.swp — vim swap file committed (harmless but untidy).
  • No authentication on any endpoint (admin or public). Single-admin tool by design, but the admin endpoints are exposed to anyone on the network.
  • CORS allows all origins ("*").

Test Coverage

  • Framework: pytest + pytest-asyncio
  • Test count: 65 tests across 4 files (ingest: 6, pipeline: 11, public API: 26, search: 22)
  • Test LOC: 2,754 (27% of backend source LOC)
  • Approach: Integration tests against real PostgreSQL with NullPool. Mock LLM responses via fixtures. httpx.AsyncClient with ASGI transport for API tests.
  • Missing: No frontend tests. No unit tests for pipeline stages in isolation. No load/performance tests. No test for the watcher service. No test for the quality toolkit.
  • No CI: Tests are run manually (cd backend && pytest).

Documentation Status

  • README.md: Comprehensive (19KB) — architecture diagrams, quick start, full API reference, environment variables, deployment instructions. High quality.
  • chrysopedia-spec.md: Detailed 37-page product specification. Thorough and thoughtful.
  • CLAUDE.md: Development reference with deployment info and quick commands.
  • GSD artifacts: 13 milestone summaries, 23 decisions, 32 requirements, extensive KNOWLEDGE.md with 30+ lessons learned. Unusually thorough project history.
  • prompts/README.md: Exists (not inspected in detail).
  • whisper/README.md: Exists for transcription docs.
  • Missing: No API documentation generation (no OpenAPI spec export, though FastAPI auto-generates one at /docs). No architecture decision records beyond GSD decisions. No runbook for operations/debugging.

Red Flags & Observations

Security

  1. No authentication on any endpoint. Admin endpoints (pipeline control, review queue, debug mode toggle) are accessible to anyone who can reach the server. Acceptable for a single-user tool on a private network, but risky if the port is ever exposed.
  2. CORS allows all origins (cors_origins: ["*"]). No restriction on which domains can call the API.
  3. backend/.env contains a real API key (sk-dcdd...). Not tracked in git (correctly gitignored), but present on disk. Standard for local dev.
  4. APP_SECRET_KEY defaults to changeme-generate-a-real-secret in config.py. If the .env doesn't override this, it's a predictable secret (though it's unclear if anything actually uses it — no session/JWT middleware found).

Architectural Concerns

  1. Monolithic CSS file (4,871 lines). Any style change requires searching through a single massive file. No component isolation.
  2. stages.py god file (2,102 lines). Four LLM stages + orchestrator + helpers all in one module. Each stage is a complex function with JSON parsing, error recovery, and DB writes.
  3. Pipeline switched from Celery chains to inline execution (latest commit). This suggests Celery task chaining had reliability issues. Inline execution means all LLM stages run synchronously in a single task, so one pipeline run can block the worker (concurrency=1) for 10+ minutes.
  4. Qdrant duplicate points on re-index (documented in KNOWLEDGE.md, unfixed). Random UUIDs mean every re-embed creates duplicates instead of upserts.
  5. No retry/backoff on LLM API calls beyond the primary→fallback pattern. If both endpoints are down, the pipeline fails immediately.

Fragile Areas

  1. Classification data in Redis with 24h TTL. If Redis restarts between stage 4 and stage 5, classification data is lost and stage 5 fails or produces degraded output.
  2. Frontend has zero type-safe API layer. The public-client.ts uses fetch() directly. No generated types from the backend schema. API contract drift is possible.
  3. Single-branch development. All 171 commits on main. No protection against broken deploys.

Inconsistencies

  1. FastAPI version in app = FastAPI(version="0.1.0") vs package.json version "0.8.0". No single source of truth for the project version.

Trajectory & Opportunities

Where It's Heading

The most recent work is prompt quality optimization — generating 100 stage 5 variants and building automated A/B testing infrastructure. The project owner is clearly focused on improving the LLM output quality now that the infrastructure is stable.

The inline pipeline execution change suggests the next phase may involve processing real video content at scale and encountering reliability issues with the current architecture.

Partially Built / Stubbed Features

  • Content reports — Model and API exist (ContentReport, /api/v1/reports), admin reports page exists, but unclear if actively used.
  • View counts — view_count field on Creator and TechniquePage models, but no increment logic found. Fields default to 0.
  • Creator hidden flag — hidden boolean on Creator model (migration 009), but no admin UI to toggle it.
  • Genre filtering on Creators page — Spec mentions it, UI has it, but genre data depends on pipeline classification which may not populate genres consistently.

Capability Gaps

  • No authentication/authorization. Adding a simple API key or basic auth for admin endpoints would be a quick security win.
  • No WebSocket/SSE for pipeline progress. The admin UI polls for pipeline status. Real-time updates would improve the pipeline monitoring experience.
  • No full-text search index. Keyword search uses ILIKE which doesn't scale. PostgreSQL tsvector/GIN index would be significantly faster.
  • No backup strategy documented. PostgreSQL data and Qdrant vectors are on bind mounts but no backup cron or strategy is mentioned.
  • No content analytics. No view tracking, no search query logging, no usage metrics beyond pipeline token counts.
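The full-text search gap could be closed with a generated tsvector column and a GIN index. The DDL below is a hedged sketch: the table and column names (technique_pages, title) are assumed from the data model, and no such migration exists in the project yet.

```python
# Hypothetical Postgres FTS setup, kept as SQL strings as one might place
# them in an Alembic migration's op.execute() call.
FTS_DDL = """
ALTER TABLE technique_pages
  ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (to_tsvector('english', coalesce(title, ''))) STORED;
CREATE INDEX ix_technique_pages_search
  ON technique_pages USING GIN (search_vector);
"""

# Ranked query using websearch_to_tsquery, which accepts plain user input.
FTS_QUERY = """
SELECT slug, ts_rank(search_vector, websearch_to_tsquery('english', :q)) AS rank
FROM technique_pages
WHERE search_vector @@ websearch_to_tsquery('english', :q)
ORDER BY rank DESC
LIMIT 20;
"""
```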

Low-Hanging Fruit

  1. Fix Qdrant duplicate points — Switch to deterministic UUIDs based on content hash. Small change, big data quality impact.
  2. Add basic auth to admin endpoints — A single API key middleware for /admin/* and /review/* routes.
  3. Split stages.py — Extract each stage into its own module. The file is already structured with clear stage boundaries.
  4. Normalize topic category casing — .lower() or .title() in stage 4 output. One-line fix for data consistency.
  5. Delete generate_stage5_variants.py from project root (or move into quality toolkit).
  6. Add a Makefile with common commands (build, test, deploy, migrate) to replace the manual command documentation.

Logical Next Features

Based on the trajectory and spec:

  1. Batch processing pipeline — Process the full video library (100-500 files). Will stress-test pipeline reliability.
  2. Content analytics — View tracking, popular searches, usage patterns.
  3. Improved search — Full-text search index, search result ranking improvements, faceted filtering.
  4. Multi-user support — Authentication, user-specific bookmarks/notes on techniques.
  5. Video timestamp deep links — If videos are accessible on the network, link directly to the timestamp in a player.

Key Files Reference

| File | Purpose |
|---|---|
| chrysopedia-spec.md | Full product specification (37 pages) — read first for product understanding |
| README.md | Architecture, setup, API reference, deployment guide |
| CLAUDE.md | Development context and canonical directory warning |
| backend/main.py | FastAPI app entry point, middleware, router mounting |
| backend/config.py | All environment variables with defaults (Pydantic Settings) |
| backend/models.py | All 11 SQLAlchemy ORM models — the data model source of truth |
| backend/schemas.py | Pydantic request/response schemas |
| backend/pipeline/stages.py | LLM pipeline — all 4 stages and orchestrator (the most complex file) |
| backend/pipeline/llm_client.py | LLM API client with primary/fallback and thinking mode support |
| backend/search_service.py | Semantic + keyword search implementation |
| backend/watcher.py | Transcript folder watcher service |
| frontend/src/App.tsx | React app root with routing |
| frontend/src/App.css | All styles (4,871 lines) |
| frontend/src/api/public-client.ts | Typed API client |
| config/canonical_tags.yaml | 7-category topic taxonomy definition |
| docker-compose.yml | Full 8-service stack definition |
| .env.example | Environment variable template |
| .gsd/PROJECT.md | Living project state document with milestone history |
| .gsd/KNOWLEDGE.md | Lessons learned and patterns (30+ entries) — invaluable for newcomers |
| .gsd/DECISIONS.md | 23 architectural decisions with rationale |
| .gsd/REQUIREMENTS.md | 32 requirements with validation status |

Uncertainties & Open Questions

  1. Is the pipeline actually processing real content? The system is deployed, but it's unclear how many videos have been processed through the pipeline. The test fixtures use sample data, and the prompt optimization work suggests the pipeline output quality isn't yet satisfactory. [inferred — medium confidence]

  2. Why did Celery chain dispatch get replaced with inline execution? The latest commit (29f6e74) switches to inline, but no commit message explains the issue. Was it a Celery reliability problem, a debugging convenience, or a permanent architectural change? [unknown — needs project owner input]

  3. Is the domain chrysopedia.xpltd.co actually configured? M003 mentions domain + DNS setup, KNOWLEDGE.md documents the XPLTD domain flow, but the nginx config uses server_name _ (catch-all). [inferred — likely configured on nuc01's nginx, not in this codebase]

  4. What's the actual LLM infrastructure? References to "DGX Sparks Qwen" and "FYN" suggest a private GPU cluster. The API endpoint is chat.forgetyour.name which appears to be an OpenWebUI instance. The relationship between these systems and their reliability characteristics would matter for pipeline scaling. [low confidence — outside codebase]

  5. Are there plans for multi-user access? The spec says "single-admin tool" but the architecture (separate frontend, API, PostgreSQL) could support multiple users. No authentication means this is purely a trust-boundary question. [inferred — currently single-user by design]

  6. What is the CHRYSOPEDIA-ASSESSMENT.md (42KB)? Not read in detail — appears to be a UI/UX assessment that fed into M011 decisions. [low confidence on contents]