diff --git a/.gitignore b/.gitignore index 782c74a..617259b 100644 --- a/.gitignore +++ b/.gitignore @@ -1,2 +1,6 @@ .bg-shell/ -.gsd/ +.gsd/gsd.db +.gsd/gsd.db-shm +.gsd/gsd.db-wal +.gsd/event-log.jsonl +.gsd/state-manifest.json diff --git a/.gsd/DECISIONS.md b/.gsd/DECISIONS.md new file mode 100644 index 0000000..58e90f5 --- /dev/null +++ b/.gsd/DECISIONS.md @@ -0,0 +1,9 @@ +# Decisions Register + + + +| # | When | Scope | Decision | Choice | Rationale | Revisable? | Made By | +|---|------|-------|----------|--------|-----------|------------|---------| +| D001 | | architecture | Docker Compose project naming and path conventions | xpltd_chrysopedia with bind mounts at /vmPool/r/services/chrysopedia_*, compose at /vmPool/r/compose/chrysopedia/ | XPLTD lore: compose projects at /vmPool/r/compose/{name}/, service data at /vmPool/r/services/{service}_{role}/, project naming follows xpltd_{name} pattern. Network will be a dedicated bridge subnet avoiding existing 172.16-172.23 and 172.29-172.30 ranges. | Yes | agent | diff --git a/.gsd/REQUIREMENTS.md b/.gsd/REQUIREMENTS.md new file mode 100644 index 0000000..4c3d99e --- /dev/null +++ b/.gsd/REQUIREMENTS.md @@ -0,0 +1,91 @@ +# Requirements + +## R001 — Whisper Transcription Pipeline +**Status:** active +**Description:** Desktop Python script that accepts video files (MP4/MKV), extracts audio via ffmpeg, runs Whisper large-v3 on RTX 4090, and outputs timestamped transcript JSON with segment-level timestamps and word-level timing. Must be resumable. +**Validation:** Script processes a sample video and produces valid JSON with timestamped segments. +**Primary Owner:** M001/S01 + +## R002 — Transcript Ingestion API +**Status:** active +**Description:** FastAPI endpoint that accepts transcript JSON uploads, creates/updates Creator and Source Video records, and stores transcript data in PostgreSQL. Handles new creator detection from folder names. 
+**Validation:** POST transcript JSON → 200 OK, records created in DB, file stored on filesystem. +**Primary Owner:** M001/S02 + +## R003 — LLM-Powered Extraction Pipeline (Stages 2-5) +**Status:** active +**Description:** Background worker pipeline: transcript segmentation → key moment extraction → classification/tagging → technique page synthesis. Uses OpenAI-compatible API with primary (DGX Sparks Qwen) and fallback (local Ollama) endpoints. Pipeline must be resumable per-video per-stage. +**Validation:** End-to-end: transcript JSON in → technique pages with key moments, tags, and cross-references out. +**Primary Owner:** M001/S03 + +## R004 — Review Queue UI +**Status:** active +**Description:** Admin interface for reviewing extracted key moments: approve, edit+approve, split, merge, reject. Organized by source video for contextual review. Includes mode toggle (review vs auto-publish). +**Validation:** Admin can review, edit, and approve/reject moments; mode toggle controls whether new moments require review. +**Primary Owner:** M001/S04 + +## R005 — Search-First Web UI +**Status:** active +**Description:** Landing page with prominent search bar, live typeahead (results after 2-3 chars), scope toggle (All/Topics/Creators), and two navigation cards (Topics, Creators). Recently added section. Search powered by Qdrant semantic search with keyword fallback. +**Validation:** User types query → results appear within 500ms, grouped by type, with clickable navigation. +**Primary Owner:** M001/S05 + +## R006 — Technique Page Display +**Status:** active +**Description:** Core content unit: header (tags, title, creator, meta), study guide prose (organized by sub-aspects with signal chain blocks and quotes), key moments index (timestamped list), related techniques, plugins referenced. Amber banner for livestream-sourced content. +**Validation:** Technique page renders with all sections populated from synthesized data. 
+**Primary Owner:** M001/S05 + +## R007 — Creators Browse Page +**Status:** active +**Description:** Filterable creator list with genre filter pills, type-to-narrow, sort options (randomized default, alphabetical, view count). Each row: name, genre tags, technique count, video count, view count. Links to creator detail page. +**Validation:** Page loads with randomized order, genre filtering works, clicking row navigates to creator detail. +**Primary Owner:** M001/S05 + +## R008 — Topics Browse Page +**Status:** active +**Description:** Two-level topic hierarchy (6 top-level categories → sub-topics). Filter input, genre filter pills. Each sub-topic shows technique count and creator count. Clicking sub-topic shows technique pages. +**Validation:** Hierarchy renders, filtering works, sub-topic links show correct technique pages. +**Primary Owner:** M001/S05 + +## R009 — Qdrant Vector Search Integration +**Status:** active +**Description:** Embed key moment summaries, technique page content, and transcript segments in Qdrant using configurable embedding model (nomic-embed-text default). Power semantic search with metadata filtering. +**Validation:** Semantic search returns relevant results for natural language queries; embeddings update when content changes. +**Primary Owner:** M001/S03 + +## R010 — Docker Compose Deployment +**Status:** active +**Description:** Single docker-compose.yml packaging API, web UI, PostgreSQL, and worker services. Follows XPLTD conventions: bind mounts at /vmPool/r/services/, compose at /vmPool/r/compose/chrysopedia/, xpltd_chrysopedia project name, dedicated Docker network. +**Validation:** `docker compose up -d` brings up all services; data persists across restarts. +**Primary Owner:** M001/S01 + +## R011 — Canonical Tag System +**Status:** active +**Description:** Editable canonical tag list (config file) with aliases. Pipeline references tags during classification. 
New tags can be proposed by LLM and queued for admin approval or auto-added within existing categories. +**Validation:** Tag list is editable; pipeline uses canonical tags consistently; alias normalization works. +**Primary Owner:** M001/S03 + +## R012 — Incremental Content Addition +**Status:** active +**Description:** System handles ongoing content: new videos processed through pipeline, new creators auto-detected, existing technique pages updated when new moments are added for same creator+topic. +**Validation:** Adding a new video for an existing creator updates their technique pages; new creator folder creates new Creator record. +**Primary Owner:** M001/S03 + +## R013 — Prompt Template System +**Status:** active +**Description:** Extraction prompts (stages 2-5) stored as editable configuration files, not hardcoded. Admin can edit prompts and re-run extraction on specific or all videos for calibration. +**Validation:** Prompt files are editable; re-processing a video with updated prompts produces different output. +**Primary Owner:** M001/S03 + +## R014 — Creator Equity +**Status:** active +**Description:** No creator is privileged in the UI. Default sort on Creators page is randomized on every page load. All creators get equal visual weight. +**Validation:** Refreshing Creators page shows different order each time; no creator gets larger/bolder display. +**Primary Owner:** M001/S05 + +## R015 — 30-Second Retrieval Target +**Status:** active +**Description:** A producer mid-session can find a specific technique in under 30 seconds from Alt+Tab to reading the key insight. +**Validation:** Timed test: Alt+Tab → search → read technique → under 30 seconds. 
+**Primary Owner:** M001/S05 diff --git a/.gsd/STATE.md b/.gsd/STATE.md new file mode 100644 index 0000000..af6bbb9 --- /dev/null +++ b/.gsd/STATE.md @@ -0,0 +1,18 @@ +# GSD State + +**Active Milestone:** M001: Chrysopedia Foundation — Infrastructure, Pipeline Core, and Skeleton UI +**Active Slice:** S01: Docker Compose + Database + Whisper Script +**Phase:** evaluating-gates +**Requirements Status:** 15 active · 0 validated · 0 deferred · 0 out of scope + +## Milestone Registry +- 🔄 **M001:** Chrysopedia Foundation — Infrastructure, Pipeline Core, and Skeleton UI + +## Recent Decisions +- D001: Docker Compose project naming and path conventions (xpltd_chrysopedia) + +## Blockers +- None + +## Next Action +Evaluate 3 quality gate(s) for S01 before execution. diff --git a/.gsd/milestones/M001/M001-ROADMAP.md b/.gsd/milestones/M001/M001-ROADMAP.md new file mode 100644 index 0000000..5f50896 --- /dev/null +++ b/.gsd/milestones/M001/M001-ROADMAP.md @@ -0,0 +1,13 @@ +# M001: Chrysopedia Foundation — Infrastructure, Pipeline Core, and Skeleton UI + +## Vision +Stand up the complete Chrysopedia stack: Docker Compose deployment on ub01, PostgreSQL data model, FastAPI backend with transcript ingestion, Whisper transcription script for the desktop, LLM extraction pipeline (stages 2-5), review queue, Qdrant integration, and the search-first web UI with technique pages, creators, and topics browsing. By the end, a video file can be transcribed → ingested → extracted → reviewed → searched and read in the web UI.
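The compose deployment this vision names can be sketched from the D001/T01 conventions (xpltd_chrysopedia project name, bind mounts under /vmPool/r/services/chrysopedia_*, dedicated bridge subnet avoiding 172.16-172.23 and 172.29-172.30). This is a config sketch only: image tags, the exact subnet, and the Postgres data path are assumptions, not decisions recorded in the register.

```yaml
# Sketch only — service names from S01/T01; tags, paths, and subnet assumed.
name: xpltd_chrysopedia

services:
  chrysopedia-db:
    image: postgres:16
    env_file: .env
    volumes:
      - /vmPool/r/services/chrysopedia_db:/var/lib/postgresql/data
    networks: [chrysopedia]

  chrysopedia-redis:
    image: redis:7
    networks: [chrysopedia]

  chrysopedia-api:
    build:
      context: .
      dockerfile: docker/Dockerfile.api
    depends_on: [chrysopedia-db, chrysopedia-redis]
    networks: [chrysopedia]

networks:
  chrysopedia:
    driver: bridge
    ipam:
      config:
        - subnet: 172.24.0.0/24   # assumed; D001 only rules out 172.16-23 and 172.29-30
```

The chrysopedia-web and chrysopedia-worker services follow the same pattern; `docker compose config` (the T01 verification step) will catch structural mistakes in the full file.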
+ +## Slice Overview +| ID | Slice | Risk | Depends | Done | After this | +|----|-------|------|---------|------|------------| +| S01 | Docker Compose + Database + Whisper Script | low | — | ⬜ | docker compose up -d starts all services on ub01; Whisper script transcribes a sample video to JSON | +| S02 | Transcript Ingestion API | low | S01 | ⬜ | POST a transcript JSON file to the API; Creator and Source Video records appear in PostgreSQL | +| S03 | LLM Extraction Pipeline + Qdrant Integration | high | S02 | ⬜ | A transcript JSON triggers stages 2-5: segmentation → extraction → classification → synthesis. Technique pages with key moments appear in DB. Qdrant has searchable embeddings. | +| S04 | Review Queue Admin UI | medium | S03 | ⬜ | Admin views pending key moments, approves/edits/rejects them, toggles between review and auto mode | +| S05 | Search-First Web UI | medium | S03 | ⬜ | User searches for a technique, gets semantic results in <500ms, clicks through to a full technique page with study guide prose, key moments, and related links | diff --git a/.gsd/milestones/M001/slices/S01/S01-PLAN.md b/.gsd/milestones/M001/slices/S01/S01-PLAN.md new file mode 100644 index 0000000..34c89bf --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/S01-PLAN.md @@ -0,0 +1,81 @@ +# S01: Docker Compose + Database + Whisper Script + +**Goal:** Deployable infrastructure: Docker Compose project with PostgreSQL (full schema), FastAPI skeleton, and desktop Whisper transcription script +**Demo:** After this: docker compose up -d starts all services on ub01; Whisper script transcribes a sample video to JSON + +## Tasks +- [ ] **T01: Project scaffolding and Docker Compose** — 1. Create project directory structure: + - backend/ (FastAPI app) + - frontend/ (React app, placeholder) + - whisper/ (desktop transcription script) + - docker/ (Dockerfiles) + - prompts/ (editable prompt templates) + - config/ (canonical tags, settings) +2. 
Write docker-compose.yml with services: + - chrysopedia-api (FastAPI, Uvicorn) + - chrysopedia-web (React, nginx) + - chrysopedia-db (PostgreSQL 16) + - chrysopedia-worker (Celery) + - chrysopedia-redis (Redis for Celery broker) +3. Follow XPLTD conventions: bind mounts, project naming xpltd_chrysopedia, dedicated bridge network +4. Create .env.example with all required env vars +5. Write Dockerfiles for API and web services + - Estimate: 2-3 hours + - Files: docker-compose.yml, .env.example, docker/Dockerfile.api, docker/Dockerfile.web, backend/main.py, backend/requirements.txt + - Verify: docker compose config validates without errors +- [ ] **T02: PostgreSQL schema and migrations** — 1. Create SQLAlchemy models for all 7 entities: + - Creator (id, name, slug, genres, folder_name, view_count, timestamps) + - SourceVideo (id, creator_id FK, filename, file_path, duration, content_type enum, transcript_path, processing_status enum, timestamps) + - TranscriptSegment (id, source_video_id FK, start_time, end_time, text, segment_index, topic_label) + - KeyMoment (id, source_video_id FK, technique_page_id FK nullable, title, summary, start/end time, content_type enum, plugins, review_status enum, raw_transcript, timestamps) + - TechniquePage (id, creator_id FK, title, slug, topic_category, topic_tags, summary, body_sections JSONB, signal_chains JSONB, plugins, source_quality enum, view_count, review_status enum, timestamps) + - RelatedTechniqueLink (id, source_page_id FK, target_page_id FK, relationship enum) + - Tag (id, name, category, aliases) +2. Set up Alembic for migrations +3. Create initial migration +4. Add seed data for canonical tags (6 top-level categories) + - Estimate: 2-3 hours + - Files: backend/models.py, backend/database.py, alembic.ini, alembic/versions/*.py, config/canonical_tags.yaml + - Verify: alembic upgrade head succeeds; all 7 tables exist with correct columns and constraints +- [ ] **T03: FastAPI application skeleton with health checks** — 1. 
Set up FastAPI app with: + - CORS middleware + - Database session dependency + - Health check endpoint (/health) + - API versioning prefix (/api/v1) +2. Create Pydantic schemas for all entities +3. Implement basic CRUD endpoints: + - GET /api/v1/creators + - GET /api/v1/creators/{slug} + - GET /api/v1/videos + - GET /api/v1/health +4. Add structured logging +5. Configure environment variable loading from .env + - Estimate: 1-2 hours + - Files: backend/main.py, backend/schemas.py, backend/routers/__init__.py, backend/routers/health.py, backend/routers/creators.py, backend/config.py + - Verify: curl http://localhost:8000/health returns 200; curl http://localhost:8000/api/v1/creators returns empty list +- [ ] **T04: Whisper transcription script** — 1. Create Python script whisper/transcribe.py that: + - Accepts video file path (or directory for batch mode) + - Extracts audio via ffmpeg (subprocess) + - Runs Whisper large-v3 with segment-level and word-level timestamps + - Outputs JSON matching the spec format (source_file, creator_folder, duration, segments with words) + - Supports resumability: checks if output JSON already exists, skips +2. Create whisper/requirements.txt (openai-whisper, ffmpeg-python) +3. Write output to a configurable output directory +4. Add CLI arguments: --input, --output-dir, --model (default large-v3), --device (default cuda) +5. Include progress logging for long transcriptions + - Estimate: 1-2 hours + - Files: whisper/transcribe.py, whisper/requirements.txt, whisper/README.md + - Verify: python whisper/transcribe.py --help shows usage; script validates ffmpeg is available +- [ ] **T05: Integration verification and documentation** — 1. Write README.md with: + - Project overview + - Architecture diagram (text) + - Setup instructions (Docker Compose + desktop Whisper) + - Environment variable documentation + - Development workflow +2. Verify Docker Compose stack starts with: docker compose up -d +3. 
Verify PostgreSQL schema with: alembic upgrade head +4. Verify API health check responds +5. Create sample transcript JSON for testing subsequent slices + - Estimate: 1 hour + - Files: README.md, tests/fixtures/sample_transcript.json + - Verify: docker compose config validates; README covers all setup steps; sample transcript JSON is valid diff --git a/.gsd/milestones/M001/slices/S01/tasks/T01-PLAN.md b/.gsd/milestones/M001/slices/S01/tasks/T01-PLAN.md new file mode 100644 index 0000000..f27335c --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T01-PLAN.md @@ -0,0 +1,40 @@ +--- +estimated_steps: 16 +estimated_files: 6 +skills_used: [] +--- + +# T01: Project scaffolding and Docker Compose + +1. Create project directory structure: + - backend/ (FastAPI app) + - frontend/ (React app, placeholder) + - whisper/ (desktop transcription script) + - docker/ (Dockerfiles) + - prompts/ (editable prompt templates) + - config/ (canonical tags, settings) +2. Write docker-compose.yml with services: + - chrysopedia-api (FastAPI, Uvicorn) + - chrysopedia-web (React, nginx) + - chrysopedia-db (PostgreSQL 16) + - chrysopedia-worker (Celery) + - chrysopedia-redis (Redis for Celery broker) +3. Follow XPLTD conventions: bind mounts, project naming xpltd_chrysopedia, dedicated bridge network +4. Create .env.example with all required env vars +5. 
Write Dockerfiles for API and web services + +## Inputs + +- `chrysopedia-spec.md` +- `XPLTD lore conventions` + +## Expected Output + +- `docker-compose.yml` +- `.env.example` +- `docker/Dockerfile.api` +- `backend/main.py` + +## Verification + +docker compose config validates without errors diff --git a/.gsd/milestones/M001/slices/S01/tasks/T02-PLAN.md b/.gsd/milestones/M001/slices/S01/tasks/T02-PLAN.md new file mode 100644 index 0000000..b3c4910 --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T02-PLAN.md @@ -0,0 +1,34 @@ +--- +estimated_steps: 11 +estimated_files: 5 +skills_used: [] +--- + +# T02: PostgreSQL schema and migrations + +1. Create SQLAlchemy models for all 7 entities: + - Creator (id, name, slug, genres, folder_name, view_count, timestamps) + - SourceVideo (id, creator_id FK, filename, file_path, duration, content_type enum, transcript_path, processing_status enum, timestamps) + - TranscriptSegment (id, source_video_id FK, start_time, end_time, text, segment_index, topic_label) + - KeyMoment (id, source_video_id FK, technique_page_id FK nullable, title, summary, start/end time, content_type enum, plugins, review_status enum, raw_transcript, timestamps) + - TechniquePage (id, creator_id FK, title, slug, topic_category, topic_tags, summary, body_sections JSONB, signal_chains JSONB, plugins, source_quality enum, view_count, review_status enum, timestamps) + - RelatedTechniqueLink (id, source_page_id FK, target_page_id FK, relationship enum) + - Tag (id, name, category, aliases) +2. Set up Alembic for migrations +3. Create initial migration +4. 
Add seed data for canonical tags (6 top-level categories) + +## Inputs + +- `chrysopedia-spec.md section 6 (Data Model)` + +## Expected Output + +- `backend/models.py` +- `backend/database.py` +- `alembic/versions/001_initial.py` +- `config/canonical_tags.yaml` + +## Verification + +alembic upgrade head succeeds; all 7 tables exist with correct columns and constraints diff --git a/.gsd/milestones/M001/slices/S01/tasks/T03-PLAN.md b/.gsd/milestones/M001/slices/S01/tasks/T03-PLAN.md new file mode 100644 index 0000000..6812c15 --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T03-PLAN.md @@ -0,0 +1,37 @@ +--- +estimated_steps: 13 +estimated_files: 6 +skills_used: [] +--- + +# T03: FastAPI application skeleton with health checks + +1. Set up FastAPI app with: + - CORS middleware + - Database session dependency + - Health check endpoint (/health) + - API versioning prefix (/api/v1) +2. Create Pydantic schemas for all entities +3. Implement basic CRUD endpoints: + - GET /api/v1/creators + - GET /api/v1/creators/{slug} + - GET /api/v1/videos + - GET /api/v1/health +4. Add structured logging +5. Configure environment variable loading from .env + +## Inputs + +- `backend/models.py` +- `backend/database.py` + +## Expected Output + +- `backend/main.py` +- `backend/schemas.py` +- `backend/routers/creators.py` +- `backend/config.py` + +## Verification + +curl http://localhost:8000/health returns 200; curl http://localhost:8000/api/v1/creators returns empty list diff --git a/.gsd/milestones/M001/slices/S01/tasks/T04-PLAN.md b/.gsd/milestones/M001/slices/S01/tasks/T04-PLAN.md new file mode 100644 index 0000000..2e9b5ff --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T04-PLAN.md @@ -0,0 +1,32 @@ +--- +estimated_steps: 10 +estimated_files: 3 +skills_used: [] +--- + +# T04: Whisper transcription script + +1. 
Create Python script whisper/transcribe.py that: + - Accepts video file path (or directory for batch mode) + - Extracts audio via ffmpeg (subprocess) + - Runs Whisper large-v3 with segment-level and word-level timestamps + - Outputs JSON matching the spec format (source_file, creator_folder, duration, segments with words) + - Supports resumability: checks if output JSON already exists, skips +2. Create whisper/requirements.txt (openai-whisper, ffmpeg-python) +3. Write output to a configurable output directory +4. Add CLI arguments: --input, --output-dir, --model (default large-v3), --device (default cuda) +5. Include progress logging for long transcriptions + +## Inputs + +- `chrysopedia-spec.md section 7.2 Stage 1` + +## Expected Output + +- `whisper/transcribe.py` +- `whisper/requirements.txt` +- `whisper/README.md` + +## Verification + +python whisper/transcribe.py --help shows usage; script validates ffmpeg is available diff --git a/.gsd/milestones/M001/slices/S01/tasks/T05-PLAN.md b/.gsd/milestones/M001/slices/S01/tasks/T05-PLAN.md new file mode 100644 index 0000000..4c4fdb4 --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T05-PLAN.md @@ -0,0 +1,31 @@ +--- +estimated_steps: 10 +estimated_files: 2 +skills_used: [] +--- + +# T05: Integration verification and documentation + +1. Write README.md with: + - Project overview + - Architecture diagram (text) + - Setup instructions (Docker Compose + desktop Whisper) + - Environment variable documentation + - Development workflow +2. Verify Docker Compose stack starts with: docker compose up -d +3. Verify PostgreSQL schema with: alembic upgrade head +4. Verify API health check responds +5. 
Create sample transcript JSON for testing subsequent slices + +## Inputs + +- `All T01-T04 outputs` + +## Expected Output + +- `README.md` +- `tests/fixtures/sample_transcript.json` + +## Verification + +docker compose config validates; README covers all setup steps; sample transcript JSON is valid diff --git a/.gsd/milestones/M001/slices/S02/S02-PLAN.md b/.gsd/milestones/M001/slices/S02/S02-PLAN.md new file mode 100644 index 0000000..85a9bb2 --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/S02-PLAN.md @@ -0,0 +1,6 @@ +# S02: Transcript Ingestion API + +**Goal:** FastAPI endpoints for transcript upload, creator management, and source video tracking +**Demo:** After this: POST a transcript JSON file to the API; Creator and Source Video records appear in PostgreSQL + +## Tasks diff --git a/.gsd/milestones/M001/slices/S03/S03-PLAN.md b/.gsd/milestones/M001/slices/S03/S03-PLAN.md new file mode 100644 index 0000000..e8839d0 --- /dev/null +++ b/.gsd/milestones/M001/slices/S03/S03-PLAN.md @@ -0,0 +1,6 @@ +# S03: LLM Extraction Pipeline + Qdrant Integration + +**Goal:** Complete LLM pipeline with editable prompt templates, canonical tag system, Qdrant embedding, and resumable processing +**Demo:** After this: A transcript JSON triggers stages 2-5: segmentation → extraction → classification → synthesis. Technique pages with key moments appear in DB. Qdrant has searchable embeddings. 
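The per-video, per-stage resumability R003 requires can be reduced to a small control loop. A minimal sketch, assuming completed stages are recorded per video (the function and the way completion is persisted are hypothetical — a real worker would read and write this set from PostgreSQL):

```python
# Hypothetical resumability sketch for stages 2-5 (R003): each stage is
# recorded per video when it finishes, so a restarted worker skips what is
# already done instead of re-running the whole pipeline.
STAGES = ["segmentation", "extraction", "classification", "synthesis"]

def run_pipeline(video_id: str, completed: set[str], run_stage) -> list[str]:
    """Run the remaining stages for one video; return the stages executed."""
    executed = []
    for stage in STAGES:
        if stage in completed:
            continue  # finished in a previous run — resume past it
        run_stage(video_id, stage)
        completed.add(stage)  # a real worker would persist this to the DB
        executed.append(stage)
    return executed
```

Because stage order is fixed and completion is checked per stage, a crash mid-synthesis costs only the synthesis stage on the next run, which is what "resumable per-video per-stage" demands.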
+ +## Tasks diff --git a/.gsd/milestones/M001/slices/S04/S04-PLAN.md b/.gsd/milestones/M001/slices/S04/S04-PLAN.md new file mode 100644 index 0000000..09da954 --- /dev/null +++ b/.gsd/milestones/M001/slices/S04/S04-PLAN.md @@ -0,0 +1,6 @@ +# S04: Review Queue Admin UI + +**Goal:** Functional review workflow for calibrating extraction quality +**Demo:** After this: Admin views pending key moments, approves/edits/rejects them, toggles between review and auto mode + +## Tasks diff --git a/.gsd/milestones/M001/slices/S05/S05-PLAN.md b/.gsd/milestones/M001/slices/S05/S05-PLAN.md new file mode 100644 index 0000000..c6557f9 --- /dev/null +++ b/.gsd/milestones/M001/slices/S05/S05-PLAN.md @@ -0,0 +1,6 @@ +# S05: Search-First Web UI + +**Goal:** Complete public-facing UI: landing page, live search, technique pages, creators browse, topics browse +**Demo:** After this: User searches for a technique, gets semantic results in <500ms, clicks through to a full technique page with study guide prose, key moments, and related links + +## Tasks
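The transcript JSON that T04 emits and S02 ingests is the contract between the desktop script and the API, so it is worth pinning down early. A minimal sketch of that shape, using the four field names the plans state (source_file, creator_folder, duration, segments with words) — any key beyond those four is an assumption until chrysopedia-spec.md is consulted:

```python
# Sketch of the transcript JSON shape described in T04. Segment keys beyond
# start/end/text/words are assumptions; "words" carries the word-level timing
# Whisper produces with word_timestamps=True.
import json

def build_transcript(source_file: str, creator_folder: str,
                     duration: float, whisper_segments: list[dict]) -> dict:
    return {
        "source_file": source_file,
        "creator_folder": creator_folder,
        "duration": duration,
        "segments": [
            {
                "start": seg["start"],
                "end": seg["end"],
                "text": seg["text"].strip(),
                "words": seg.get("words", []),
            }
            for seg in whisper_segments
        ],
    }
```

The T05 fixture (tests/fixtures/sample_transcript.json) could then be a single `json.dumps(build_transcript(...), indent=2)` call, keeping the test fixture and the script's real output in lockstep.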