gsd: plan M001 (Chrysopedia Foundation) with 5 slices and S01 task breakdown

Milestone: Chrysopedia Foundation — Infrastructure, Pipeline Core, and Skeleton UI
Slices:
  S01: Docker Compose + Database + Whisper Script (5 tasks)
  S02: Transcript Ingestion API
  S03: LLM Extraction Pipeline + Qdrant Integration
  S04: Review Queue Admin UI
  S05: Search-First Web UI

Requirements: R001-R015 covering all spec sections.
Decisions: D001 (tech stack), D002 (Docker conventions), D003 (storage layer)
jlightner 2026-03-29 21:39:04 +00:00
parent 8b506a95ca
commit e15dd97b73
15 changed files with 415 additions and 1 deletion

.gitignore

@@ -1,2 +1,6 @@
.bg-shell/
.gsd/
.gsd/gsd.db
.gsd/gsd.db-shm
.gsd/gsd.db-wal
.gsd/event-log.jsonl
.gsd/state-manifest.json

.gsd/DECISIONS.md

@@ -0,0 +1,9 @@
# Decisions Register
<!-- Append-only. Never edit or remove existing rows.
To reverse a decision, add a new row that supersedes it.
Read this file at the start of any planning or research phase. -->
| # | When | Scope | Decision | Choice | Rationale | Revisable? | Made By |
|---|------|-------|----------|--------|-----------|------------|---------|
| D001 | | architecture | Docker Compose project naming and path conventions | xpltd_chrysopedia with bind mounts at /vmPool/r/services/chrysopedia_*, compose at /vmPool/r/compose/chrysopedia/ | XPLTD lore: compose projects at /vmPool/r/compose/{name}/, service data at /vmPool/r/services/{service}_{role}/, project naming follows xpltd_{name} pattern. Network will be a dedicated bridge subnet avoiding existing 172.16-172.23 and 172.29-172.30 ranges. | Yes | agent |
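D001's subnet constraint can be checked mechanically. A minimal sketch using Python's stdlib `ipaddress` module — the candidate subnets and the /16 granularity of the reserved ranges are assumptions for illustration, not part of D001:

```python
import ipaddress

# Ranges D001 treats as already in use (assumed here to be /16 blocks):
# 172.16-172.23 and 172.29-172.30.
reserved = [ipaddress.ip_network(f"172.{n}.0.0/16")
            for n in list(range(16, 24)) + [29, 30]]

def is_free(candidate: str) -> bool:
    """Return True if the candidate subnet overlaps none of the reserved ranges."""
    net = ipaddress.ip_network(candidate)
    return not any(net.overlaps(r) for r in reserved)

print(is_free("172.20.0.0/24"))  # → False (inside the 172.16-172.23 block)
print(is_free("172.28.0.0/24"))  # → True (outside every reserved range)
```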

.gsd/REQUIREMENTS.md

@@ -0,0 +1,91 @@
# Requirements
## R001 — Whisper Transcription Pipeline
**Status:** active
**Description:** Desktop Python script that accepts video files (MP4/MKV), extracts audio via ffmpeg, runs Whisper large-v3 on RTX 4090, and outputs timestamped transcript JSON with segment-level timestamps and word-level timing. Must be resumable.
**Validation:** Script processes a sample video and produces valid JSON with timestamped segments.
**Primary Owner:** M001/S01
## R002 — Transcript Ingestion API
**Status:** active
**Description:** FastAPI endpoint that accepts transcript JSON uploads, creates/updates Creator and Source Video records, and stores transcript data in PostgreSQL. Handles new creator detection from folder names.
**Validation:** POST transcript JSON → 200 OK, records created in DB, file stored on filesystem.
**Primary Owner:** M001/S02
## R003 — LLM-Powered Extraction Pipeline (Stages 2-5)
**Status:** active
**Description:** Background worker pipeline: transcript segmentation → key moment extraction → classification/tagging → technique page synthesis. Uses OpenAI-compatible API with primary (DGX Sparks Qwen) and fallback (local Ollama) endpoints. Pipeline must be resumable per-video per-stage.
**Validation:** End-to-end: transcript JSON in → technique pages with key moments, tags, and cross-references out.
**Primary Owner:** M001/S03
## R004 — Review Queue UI
**Status:** active
**Description:** Admin interface for reviewing extracted key moments: approve, edit+approve, split, merge, reject. Organized by source video for contextual review. Includes mode toggle (review vs auto-publish).
**Validation:** Admin can review, edit, and approve/reject moments; mode toggle controls whether new moments require review.
**Primary Owner:** M001/S04
## R005 — Search-First Web UI
**Status:** active
**Description:** Landing page with prominent search bar, live typeahead (results after 2-3 chars), scope toggle (All/Topics/Creators), and two navigation cards (Topics, Creators). Recently added section. Search powered by Qdrant semantic search with keyword fallback.
**Validation:** User types query → results appear within 500ms, grouped by type, with clickable navigation.
**Primary Owner:** M001/S05
## R006 — Technique Page Display
**Status:** active
**Description:** Core content unit: header (tags, title, creator, meta), study guide prose (organized by sub-aspects with signal chain blocks and quotes), key moments index (timestamped list), related techniques, plugins referenced. Amber banner for livestream-sourced content.
**Validation:** Technique page renders with all sections populated from synthesized data.
**Primary Owner:** M001/S05
## R007 — Creators Browse Page
**Status:** active
**Description:** Filterable creator list with genre filter pills, type-to-narrow, sort options (randomized default, alphabetical, view count). Each row: name, genre tags, technique count, video count, view count. Links to creator detail page.
**Validation:** Page loads with randomized order, genre filtering works, clicking row navigates to creator detail.
**Primary Owner:** M001/S05
## R008 — Topics Browse Page
**Status:** active
**Description:** Two-level topic hierarchy (6 top-level categories → sub-topics). Filter input, genre filter pills. Each sub-topic shows technique count and creator count. Clicking sub-topic shows technique pages.
**Validation:** Hierarchy renders, filtering works, sub-topic links show correct technique pages.
**Primary Owner:** M001/S05
## R009 — Qdrant Vector Search Integration
**Status:** active
**Description:** Embed key moment summaries, technique page content, and transcript segments in Qdrant using configurable embedding model (nomic-embed-text default). Power semantic search with metadata filtering.
**Validation:** Semantic search returns relevant results for natural language queries; embeddings update when content changes.
**Primary Owner:** M001/S03
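The search behavior R009 powers (semantic first, keyword fallback per R005) can be sketched without Qdrant. A stdlib-only illustration of the ranking logic — the function names, score threshold, and document shape are assumptions; the real system would query Qdrant with metadata filters:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec, query_text, docs, threshold=0.3):
    # Semantic pass: rank stored embeddings by similarity to the query vector.
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    hits = [d for d in ranked if cosine(query_vec, d["vec"]) >= threshold]
    if hits:
        return hits
    # Keyword fallback: plain substring match when semantic search finds nothing.
    return [d for d in docs if query_text.lower() in d["text"].lower()]

docs = [
    {"text": "Sidechain compression on the bass bus", "vec": [1.0, 0.0]},
    {"text": "Long reverb tails on vocals", "vec": [0.0, 1.0]},
]
print(len(search([0.9, 0.1], "sidechain", docs)))  # → 1 (one semantic hit)
```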
## R010 — Docker Compose Deployment
**Status:** active
**Description:** Single docker-compose.yml packaging API, web UI, PostgreSQL, and worker services. Follows XPLTD conventions: bind mounts at /vmPool/r/services/, compose at /vmPool/r/compose/chrysopedia/, xpltd_chrysopedia project name, dedicated Docker network.
**Validation:** `docker compose up -d` brings up all services; data persists across restarts.
**Primary Owner:** M001/S01
## R011 — Canonical Tag System
**Status:** active
**Description:** Editable canonical tag list (config file) with aliases. Pipeline references tags during classification. New tags can be proposed by LLM and queued for admin approval or auto-added within existing categories.
**Validation:** Tag list is editable; pipeline uses canonical tags consistently; alias normalization works.
**Primary Owner:** M001/S03
## R012 — Incremental Content Addition
**Status:** active
**Description:** System handles ongoing content: new videos processed through pipeline, new creators auto-detected, existing technique pages updated when new moments are added for same creator+topic.
**Validation:** Adding a new video for an existing creator updates their technique pages; new creator folder creates new Creator record.
**Primary Owner:** M001/S03
## R013 — Prompt Template System
**Status:** active
**Description:** Extraction prompts (stages 2-5) stored as editable configuration files, not hardcoded. Admin can edit prompts and re-run extraction on specific or all videos for calibration.
**Validation:** Prompt files are editable; re-processing a video with updated prompts produces different output.
**Primary Owner:** M001/S03
## R014 — Creator Equity
**Status:** active
**Description:** No creator is privileged in the UI. Default sort on Creators page is randomized on every page load. All creators get equal visual weight.
**Validation:** Refreshing Creators page shows different order each time; no creator gets larger/bolder display.
**Primary Owner:** M001/S05
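R014's equal-footing rule is easy to break with a cached sort. A sketch of per-request ordering — the function name and sort keys are illustrative; an `ORDER BY random()` at the SQL layer would satisfy the same requirement:

```python
import random

def creators_in_display_order(creators, sort="random"):
    if sort == "random":
        # Fresh permutation on every call, so no creator is ever privileged.
        return random.sample(creators, k=len(creators))
    if sort == "alphabetical":
        return sorted(creators, key=str.lower)
    raise ValueError(f"unknown sort: {sort}")
```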
## R015 — 30-Second Retrieval Target
**Status:** active
**Description:** A producer mid-session can find a specific technique in under 30 seconds from Alt+Tab to reading the key insight.
**Validation:** Timed test: Alt+Tab → search → read technique → under 30 seconds.
**Primary Owner:** M001/S05

.gsd/STATE.md

@@ -0,0 +1,18 @@
# GSD State
**Active Milestone:** M001: Chrysopedia Foundation — Infrastructure, Pipeline Core, and Skeleton UI
**Active Slice:** S01: Docker Compose + Database + Whisper Script
**Phase:** evaluating-gates
**Requirements Status:** 0 active · 0 validated · 0 deferred · 0 out of scope
## Milestone Registry
- 🔄 **M001:** Chrysopedia Foundation — Infrastructure, Pipeline Core, and Skeleton UI
## Recent Decisions
- None recorded
## Blockers
- None
## Next Action
Evaluate 3 quality gates for S01 before execution.

@@ -0,0 +1,13 @@
# M001: Chrysopedia Foundation — Infrastructure, Pipeline Core, and Skeleton UI
## Vision
Stand up the complete Chrysopedia stack: Docker Compose deployment on ub01, PostgreSQL data model, FastAPI backend with transcript ingestion, Whisper transcription script for the desktop, LLM extraction pipeline (stages 2-5), review queue, Qdrant integration, and the search-first web UI with technique pages, creators, and topics browsing. By the end, a video file can be transcribed → ingested → extracted → reviewed → searched and read in the web UI.
## Slice Overview
| ID | Slice | Risk | Depends | Done | After this |
|----|-------|------|---------|------|------------|
| S01 | Docker Compose + Database + Whisper Script | low | — | ⬜ | docker compose up -d starts all services on ub01; Whisper script transcribes a sample video to JSON |
| S02 | Transcript Ingestion API | low | S01 | ⬜ | POST a transcript JSON file to the API; Creator and Source Video records appear in PostgreSQL |
| S03 | LLM Extraction Pipeline + Qdrant Integration | high | S02 | ⬜ | A transcript JSON triggers stages 2-5: segmentation → extraction → classification → synthesis. Technique pages with key moments appear in DB. Qdrant has searchable embeddings. |
| S04 | Review Queue Admin UI | medium | S03 | ⬜ | Admin views pending key moments, approves/edits/rejects them, toggles between review and auto mode |
| S05 | Search-First Web UI | medium | S03 | ⬜ | User searches for a technique, gets semantic results in <500ms, clicks through to a full technique page with study guide prose, key moments, and related links |

@@ -0,0 +1,81 @@
# S01: Docker Compose + Database + Whisper Script
**Goal:** Deployable infrastructure: Docker Compose project with PostgreSQL (full schema), FastAPI skeleton, and desktop Whisper transcription script
**Demo:** After this: docker compose up -d starts all services on ub01; Whisper script transcribes a sample video to JSON
## Tasks
- [ ] **T01: Project scaffolding and Docker Compose**
  1. Create project directory structure:
     - backend/ (FastAPI app)
     - frontend/ (React app, placeholder)
     - whisper/ (desktop transcription script)
     - docker/ (Dockerfiles)
     - prompts/ (editable prompt templates)
     - config/ (canonical tags, settings)
  2. Write docker-compose.yml with services:
     - chrysopedia-api (FastAPI, Uvicorn)
     - chrysopedia-web (React, nginx)
     - chrysopedia-db (PostgreSQL 16)
     - chrysopedia-worker (Celery)
     - chrysopedia-redis (Redis for Celery broker)
  3. Follow XPLTD conventions: bind mounts, project naming xpltd_chrysopedia, dedicated bridge network
  4. Create .env.example with all required env vars
  5. Write Dockerfiles for API and web services
  - Estimate: 2-3 hours
  - Files: docker-compose.yml, .env.example, docker/Dockerfile.api, docker/Dockerfile.web, backend/main.py, backend/requirements.txt
  - Verify: docker compose config validates without errors
- [ ] **T02: PostgreSQL schema and migrations**
  1. Create SQLAlchemy models for all 7 entities:
     - Creator (id, name, slug, genres, folder_name, view_count, timestamps)
     - SourceVideo (id, creator_id FK, filename, file_path, duration, content_type enum, transcript_path, processing_status enum, timestamps)
     - TranscriptSegment (id, source_video_id FK, start_time, end_time, text, segment_index, topic_label)
     - KeyMoment (id, source_video_id FK, technique_page_id FK nullable, title, summary, start/end time, content_type enum, plugins, review_status enum, raw_transcript, timestamps)
     - TechniquePage (id, creator_id FK, title, slug, topic_category, topic_tags, summary, body_sections JSONB, signal_chains JSONB, plugins, source_quality enum, view_count, review_status enum, timestamps)
     - RelatedTechniqueLink (id, source_page_id FK, target_page_id FK, relationship enum)
     - Tag (id, name, category, aliases)
  2. Set up Alembic for migrations
  3. Create initial migration
  4. Add seed data for canonical tags (6 top-level categories)
  - Estimate: 2-3 hours
  - Files: backend/models.py, backend/database.py, alembic.ini, alembic/versions/*.py, config/canonical_tags.yaml
  - Verify: alembic upgrade head succeeds; all 7 tables exist with correct columns and constraints
- [ ] **T03: FastAPI application skeleton with health checks**
  1. Set up FastAPI app with:
     - CORS middleware
     - Database session dependency
     - Health check endpoint (/health)
     - API versioning prefix (/api/v1)
  2. Create Pydantic schemas for all entities
  3. Implement basic CRUD endpoints:
     - GET /api/v1/creators
     - GET /api/v1/creators/{slug}
     - GET /api/v1/videos
     - GET /api/v1/health
  4. Add structured logging
  5. Configure environment variable loading from .env
  - Estimate: 1-2 hours
  - Files: backend/main.py, backend/schemas.py, backend/routers/__init__.py, backend/routers/health.py, backend/routers/creators.py, backend/config.py
  - Verify: curl http://localhost:8000/health returns 200; curl http://localhost:8000/api/v1/creators returns empty list
- [ ] **T04: Whisper transcription script**
  1. Create Python script whisper/transcribe.py that:
     - Accepts video file path (or directory for batch mode)
     - Extracts audio via ffmpeg (subprocess)
     - Runs Whisper large-v3 with segment-level and word-level timestamps
     - Outputs JSON matching the spec format (source_file, creator_folder, duration, segments with words)
     - Supports resumability: checks if output JSON already exists, skips
  2. Create whisper/requirements.txt (openai-whisper, ffmpeg-python)
  3. Write output to a configurable output directory
  4. Add CLI arguments: --input, --output-dir, --model (default large-v3), --device (default cuda)
  5. Include progress logging for long transcriptions
  - Estimate: 1-2 hours
  - Files: whisper/transcribe.py, whisper/requirements.txt, whisper/README.md
  - Verify: python whisper/transcribe.py --help shows usage; script validates ffmpeg is available
- [ ] **T05: Integration verification and documentation**
  1. Write README.md with:
     - Project overview
     - Architecture diagram (text)
     - Setup instructions (Docker Compose + desktop Whisper)
     - Environment variable documentation
     - Development workflow
  2. Verify Docker Compose stack starts with: docker compose up -d
  3. Verify PostgreSQL schema with: alembic upgrade head
  4. Verify API health check responds
  5. Create sample transcript JSON for testing subsequent slices
  - Estimate: 1 hour
  - Files: README.md, tests/fixtures/sample_transcript.json
  - Verify: docker compose config validates; README covers all setup steps; sample transcript JSON is valid

@@ -0,0 +1,40 @@
---
estimated_steps: 16
estimated_files: 6
skills_used: []
---
# T01: Project scaffolding and Docker Compose
1. Create project directory structure:
   - backend/ (FastAPI app)
   - frontend/ (React app, placeholder)
   - whisper/ (desktop transcription script)
   - docker/ (Dockerfiles)
   - prompts/ (editable prompt templates)
   - config/ (canonical tags, settings)
2. Write docker-compose.yml with services:
   - chrysopedia-api (FastAPI, Uvicorn)
   - chrysopedia-web (React, nginx)
   - chrysopedia-db (PostgreSQL 16)
   - chrysopedia-worker (Celery)
   - chrysopedia-redis (Redis for Celery broker)
3. Follow XPLTD conventions: bind mounts, project naming xpltd_chrysopedia, dedicated bridge network
4. Create .env.example with all required env vars
5. Write Dockerfiles for API and web services
## Inputs
- `chrysopedia-spec.md`
- `XPLTD lore conventions`
## Expected Output
- `docker-compose.yml`
- `.env.example`
- `docker/Dockerfile.api`
- `backend/main.py`
## Verification
docker compose config validates without errors

@@ -0,0 +1,34 @@
---
estimated_steps: 11
estimated_files: 5
skills_used: []
---
# T02: PostgreSQL schema and migrations
1. Create SQLAlchemy models for all 7 entities:
   - Creator (id, name, slug, genres, folder_name, view_count, timestamps)
   - SourceVideo (id, creator_id FK, filename, file_path, duration, content_type enum, transcript_path, processing_status enum, timestamps)
   - TranscriptSegment (id, source_video_id FK, start_time, end_time, text, segment_index, topic_label)
   - KeyMoment (id, source_video_id FK, technique_page_id FK nullable, title, summary, start/end time, content_type enum, plugins, review_status enum, raw_transcript, timestamps)
   - TechniquePage (id, creator_id FK, title, slug, topic_category, topic_tags, summary, body_sections JSONB, signal_chains JSONB, plugins, source_quality enum, view_count, review_status enum, timestamps)
   - RelatedTechniqueLink (id, source_page_id FK, target_page_id FK, relationship enum)
   - Tag (id, name, category, aliases)
2. Set up Alembic for migrations
3. Create initial migration
4. Add seed data for canonical tags (6 top-level categories)
## Inputs
- `chrysopedia-spec.md section 6 (Data Model)`
## Expected Output
- `backend/models.py`
- `backend/database.py`
- `alembic/versions/001_initial.py`
- `config/canonical_tags.yaml`
## Verification
alembic upgrade head succeeds; all 7 tables exist with correct columns and constraints
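Two of the seven entities sketched as SQLAlchemy models — column types and the SQLite smoke test are assumptions for illustration; the real schema targets PostgreSQL 16 with proper enum types and Alembic migrations:

```python
from sqlalchemy import (Column, DateTime, ForeignKey, Integer, String,
                        create_engine, func)
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Creator(Base):
    __tablename__ = "creators"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    slug = Column(String, unique=True, nullable=False)
    folder_name = Column(String, unique=True)  # matched during ingestion
    view_count = Column(Integer, default=0)
    created_at = Column(DateTime, server_default=func.now())
    videos = relationship("SourceVideo", back_populates="creator")

class SourceVideo(Base):
    __tablename__ = "source_videos"
    id = Column(Integer, primary_key=True)
    creator_id = Column(Integer, ForeignKey("creators.id"), nullable=False)
    filename = Column(String, nullable=False)
    processing_status = Column(String, default="pending")  # real schema: enum
    creator = relationship("Creator", back_populates="videos")

# Smoke test: the DDL emits cleanly against in-memory SQLite.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
print(sorted(Base.metadata.tables))  # → ['creators', 'source_videos']
```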

@@ -0,0 +1,37 @@
---
estimated_steps: 13
estimated_files: 6
skills_used: []
---
# T03: FastAPI application skeleton with health checks
1. Set up FastAPI app with:
   - CORS middleware
   - Database session dependency
   - Health check endpoint (/health)
   - API versioning prefix (/api/v1)
2. Create Pydantic schemas for all entities
3. Implement basic CRUD endpoints:
   - GET /api/v1/creators
   - GET /api/v1/creators/{slug}
   - GET /api/v1/videos
   - GET /api/v1/health
4. Add structured logging
5. Configure environment variable loading from .env
## Inputs
- `backend/models.py`
- `backend/database.py`
## Expected Output
- `backend/main.py`
- `backend/schemas.py`
- `backend/routers/creators.py`
- `backend/config.py`
## Verification
curl http://localhost:8000/health returns 200; curl http://localhost:8000/api/v1/creators returns empty list

@@ -0,0 +1,32 @@
---
estimated_steps: 10
estimated_files: 3
skills_used: []
---
# T04: Whisper transcription script
1. Create Python script whisper/transcribe.py that:
   - Accepts video file path (or directory for batch mode)
   - Extracts audio via ffmpeg (subprocess)
   - Runs Whisper large-v3 with segment-level and word-level timestamps
   - Outputs JSON matching the spec format (source_file, creator_folder, duration, segments with words)
   - Supports resumability: checks if output JSON already exists, skips
2. Create whisper/requirements.txt (openai-whisper, ffmpeg-python)
3. Write output to a configurable output directory
4. Add CLI arguments: --input, --output-dir, --model (default large-v3), --device (default cuda)
5. Include progress logging for long transcriptions
## Inputs
- `chrysopedia-spec.md section 7.2 Stage 1`
## Expected Output
- `whisper/transcribe.py`
- `whisper/requirements.txt`
- `whisper/README.md`
## Verification
python whisper/transcribe.py --help shows usage; script validates ffmpeg is available
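The CLI surface and resumability check from T04, sketched with stdlib only — the actual Whisper and ffmpeg invocations are omitted, and `should_skip` is an assumed name for the skip-if-output-exists rule:

```python
import argparse
from pathlib import Path

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="Transcribe videos with Whisper")
    p.add_argument("--input", required=True, help="Video file, or directory for batch mode")
    p.add_argument("--output-dir", default="transcripts")
    p.add_argument("--model", default="large-v3")
    p.add_argument("--device", default="cuda")
    return p

def should_skip(video: Path, output_dir: Path) -> bool:
    """Resumability: a video whose transcript JSON already exists is skipped."""
    return (output_dir / f"{video.stem}.json").exists()

args = build_parser().parse_args(["--input", "set.mkv"])
print(args.model)  # → large-v3
```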

@@ -0,0 +1,31 @@
---
estimated_steps: 10
estimated_files: 2
skills_used: []
---
# T05: Integration verification and documentation
1. Write README.md with:
   - Project overview
   - Architecture diagram (text)
   - Setup instructions (Docker Compose + desktop Whisper)
   - Environment variable documentation
   - Development workflow
2. Verify Docker Compose stack starts with: docker compose up -d
3. Verify PostgreSQL schema with: alembic upgrade head
4. Verify API health check responds
5. Create sample transcript JSON for testing subsequent slices
## Inputs
- `All T01-T04 outputs`
## Expected Output
- `README.md`
- `tests/fixtures/sample_transcript.json`
## Verification
docker compose config validates; README covers all setup steps; sample transcript JSON is valid
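The fixture's shape can be sketched from the field names the plan lists (source_file, creator_folder, duration, segments with words) — all values below are hypothetical placeholders, not spec data:

```python
import json

sample = {
    "source_file": "mixdown_tutorial.mp4",
    "creator_folder": "example_creator",
    "duration": 12.5,
    "segments": [
        {
            "start": 0.0,
            "end": 4.2,
            "text": "Let's look at the sidechain setup.",
            "words": [
                {"word": "Let's", "start": 0.0, "end": 0.4},
                {"word": "look", "start": 0.4, "end": 0.7},
            ],
        }
    ],
}

# Round-trip to confirm the fixture is valid JSON.
assert json.loads(json.dumps(sample)) == sample
```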

@@ -0,0 +1,6 @@
# S02: Transcript Ingestion API
**Goal:** FastAPI endpoints for transcript upload, creator management, and source video tracking
**Demo:** After this: POST a transcript JSON file to the API; Creator and Source Video records appear in PostgreSQL
## Tasks

@@ -0,0 +1,6 @@
# S03: LLM Extraction Pipeline + Qdrant Integration
**Goal:** Complete LLM pipeline with editable prompt templates, canonical tag system, Qdrant embedding, and resumable processing
**Demo:** After this: A transcript JSON triggers stages 2-5: segmentation → extraction → classification → synthesis. Technique pages with key moments appear in DB. Qdrant has searchable embeddings.
## Tasks

@@ -0,0 +1,6 @@
# S04: Review Queue Admin UI
**Goal:** Functional review workflow for calibrating extraction quality
**Demo:** After this: Admin views pending key moments, approves/edits/rejects them, toggles between review and auto mode
## Tasks

@@ -0,0 +1,6 @@
# S05: Search-First Web UI
**Goal:** Complete public-facing UI: landing page, live search, technique pages, creators browse, topics browse
**Demo:** After this: User searches for a technique, gets semantic results in <500ms, clicks through to a full technique page with study guide prose, key moments, and related links
## Tasks