chrysopedia

Author	SHA1	Message	Date
jlightner	80ac367e23	feat: Stage tab view for pipeline runs, rename stale→orphaned pages - Expanded runs now show horizontal stage tabs (Segment→Extract→Classify→Synthesize→Embed) - Each tab has status indicator dot (idle/running/done/error) with pulse animation - Clicking a tab shows that stage's events with summary stats (LLM calls, tokens, duration) - Error events auto-expanded with monospace error detail block - Auto-selects the error stage or latest active stage on expand - Renamed 'stale pages' to 'orphaned pages' in admin header	2026-04-03 03:24:43 +00:00
jlightner	47fe10f3df	fix: MCP server API URL patterns — path params not JSON body, stage name mapping	2026-04-03 03:07:39 +00:00
jlightner	6f3d5b27f9	fix: MCP server SQL uses correct column names (video_id, not pipeline_run_id)	2026-04-03 03:05:59 +00:00
jlightner	df93f2655a	fix: MCP server port 8097→8101 (8097 already allocated on ub01)	2026-04-03 02:58:57 +00:00
jlightner	ff0d40a466	feat: Chrysopedia MCP server — 25 tools for pipeline, infra, content, observability, embeddings, prompts Runs as chrysopedia-mcp container in Docker Compose with direct DB, Redis, Docker socket, and API access. Streamable HTTP transport on port 8097. Clients connect via http://ub01:8097/mcp	2026-04-03 02:57:27 +00:00
jlightner	9a8d2ea5c9	feat: Show article + creator count stats on admin techniques page	2026-04-03 02:38:09 +00:00
jlightner	df1d6af84e	style: Admin technique pages — full CSS styling, description text	2026-04-03 02:33:23 +00:00
jlightner	7bdba76d50	feat: Added technique_section result rendering with Section badge, deep… - "frontend/src/api/public-client.ts" - "frontend/src/pages/TechniquePage.tsx" - "frontend/src/pages/SearchResults.tsx" - "frontend/src/components/SearchAutocomplete.tsx" GSD-Task: S07/T02	2026-04-03 02:15:07 +00:00
jlightner	fd683e8266	feat: Added per-section embedding to stage 6 for v2 technique pages wit… - "backend/schemas.py" - "backend/pipeline/stages.py" - "backend/pipeline/qdrant_client.py" - "backend/search_service.py" - "backend/pipeline/test_section_embedding.py" GSD-Task: S07/T01	2026-04-03 02:12:56 +00:00
jlightner	edfabb037a	feat: Built AdminTechniquePages page at /admin/techniques with table, e… - "frontend/src/pages/AdminTechniquePages.tsx" - "frontend/src/api/public-client.ts" - "frontend/src/App.tsx" - "frontend/src/components/AdminDropdown.tsx" GSD-Task: S06/T02	2026-04-03 01:59:49 +00:00
jlightner	bd8a928c95	feat: Added paginated GET /admin/pipeline/technique-pages endpoint with… - "backend/routers/pipeline.py" - "backend/schemas.py" GSD-Task: S06/T01	2026-04-03 01:55:35 +00:00
jlightner	304f3bc069	feat: Added format-aware v2 body_sections rendering with nested TOC, ci… - "frontend/src/api/public-client.ts" - "frontend/src/pages/TechniquePage.tsx" - "frontend/src/components/TableOfContents.tsx" - "frontend/src/utils/citations.tsx" - "frontend/src/App.css" GSD-Task: S05/T01	2026-04-03 01:42:56 +00:00
jlightner	dbf3643662	test: Added 12 unit tests covering compose prompt construction, branchi… - "backend/pipeline/test_compose_pipeline.py" GSD-Task: S04/T02	2026-04-03 01:33:16 +00:00
jlightner	943a5102fe	feat: Added _build_compose_user_prompt(), _compose_into_existing(), and… - "backend/pipeline/stages.py" GSD-Task: S04/T01	2026-04-03 01:29:21 +00:00
jlightner	66b02dd94e	feat: Wired source_videos and body_sections_format into technique detai… - "backend/routers/techniques.py" GSD-Task: S03/T02	2026-04-03 01:19:32 +00:00
jlightner	ae98e4e30e	feat: Added body_sections_format column, technique_page_videos associat… - "alembic/versions/012_multi_source_format.py" - "backend/models.py" - "backend/schemas.py" GSD-Task: S03/T01	2026-04-03 01:16:31 +00:00
jlightner	cd2d842477	test: 16 unit tests covering compose prompt XML structure, citation off… - "backend/pipeline/test_harness_compose.py" - ".gsd/milestones/M014/slices/S02/tasks/T03-SUMMARY.md" GSD-Task: S02/T03	2026-04-03 01:08:41 +00:00
jlightner	9ee9b01af5	test: Added compose subcommand with build_compose_prompt(), run_compose… - "backend/pipeline/test_harness.py" GSD-Task: S02/T02	2026-04-03 01:05:25 +00:00
jlightner	3433c48681	feat: Created composition prompt with merge rules, citation re-indexing… - "prompts/stage5_compose.txt" - ".gsd/milestones/M014/slices/S02/tasks/T01-SUMMARY.md" GSD-Task: S02/T01	2026-04-03 01:03:01 +00:00
jlightner	3cf993c019	test: Updated test_harness.py word-count/section-count logic for list[B… - "backend/pipeline/test_harness.py" - "backend/pipeline/test_harness_v2_format.py" GSD-Task: S01/T03	2026-04-03 00:54:27 +00:00
jlightner	4c952ed96c	feat: Rewrote stage5_synthesis.txt with v2 body_sections (list-of-objec… - "prompts/stage5_synthesis.txt" - "prompts/stage5_synthesis.20260403_005044.bak" GSD-Task: S01/T02	2026-04-03 00:52:48 +00:00
jlightner	f320b08e0b	test: Added BodySection/BodySubSection schema models, changed Synthesiz… - "backend/pipeline/schemas.py" - "backend/pipeline/citation_utils.py" - "backend/pipeline/test_citation_utils.py" GSD-Task: S01/T01	2026-04-03 00:50:30 +00:00
jlightner	d04b810289	feat: add wipe-all-output admin endpoint and UI button Deletes all technique pages, versions, links, key moments, pipeline events/runs, Qdrant vectors, and Redis cache while preserving creators, videos, and transcript segments. Resets all video status to not_started. Double-confirm dialog in the UI prevents accidental use. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 22:17:48 +00:00
jlightner	14d3663567	prompt: stage5 synthesis v4 — instructive voice, name discipline, merge thresholds - Rewrote voice from third-person narrative ("Keota does X") to instructive ("Route the effect at 100% wet"). Body prose now reads like a lesson book. - Hard rule: creator name appears in title/summary only, max once in body (for quote attribution). Fixed JSON example that modeled heavy name usage. - Added orientation-first section rhythm: brief definition before diving into method, prevents run-on feel. - Page minimum thresholds: 3+ sections, 400+ words, 3+ moments. Prevents stub pages from thin categories. - Strengthened merge guidance: prefer fewer rich pages over many stubs. - Updated all examples to model instructive phrasing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 22:14:43 +00:00
jlightner	41eeb69c2d	fix: shorten alembic revision ID to fit varchar(32) column Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 16:59:52 +00:00
jlightner	da0b4b5fd6	feat: add pipeline iteration tooling — offline test harness, stage re-runs, chunking inspector Drops prompt iteration cycles from 20-30 min to under 5 min by enabling stage-isolated re-runs and offline prompt testing against exported fixtures. Phase 1: Offline prompt test harness - export_fixture.py: export stage 5 inputs from DB to reusable JSON fixtures - test_harness.py: run synthesis offline with any prompt, no Docker needed - promote subcommand: deploy winning prompts with backup and optional git commit Phase 2: Classification data persistence - Dual-write classification to PostgreSQL + Redis (fixes 24hr TTL data loss) - Clean retrigger now clears Redis cache keys (fixes stale data bug) - Alembic migration 011: classification_data JSONB column + stage_rerun enum Phase 3: Stage-isolated re-run - run_single_stage Celery task with prerequisite validation and prompt overrides - _load_prompt supports per-video Redis overrides for testing custom prompts - POST /admin/pipeline/rerun-stage/{video_id}/{stage_name} endpoint - Frontend: Re-run Stage modal with stage selector and prompt override textarea Phase 4: Chunking inspector - GET /admin/pipeline/chunking/{video_id} returns topic boundaries, classifications, and synthesis group breakdowns - Frontend: collapsible Chunking Inspector panel per video Phase 5: Prompt deployment & stale data cleanup - GET /admin/pipeline/stale-pages detects pages from older prompts - POST /admin/pipeline/bulk-resynthesize re-runs a stage on all completed videos - Frontend: stale pages indicator badge with one-click bulk re-synth Phase 6: Automated iteration foundation - Quality CLI --video-id flag auto-exports fixture from DB - POST /admin/pipeline/optimize-prompt/{stage} dispatches optimization as Celery task Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 15:47:46 +00:00
jlightner	37308fd185	pipeline: run stages inline instead of Celery chain dispatch Each video now completes all stages (2→6) before the worker picks up the next queued video. Previously, dispatching celery_chain for multiple videos caused interleaved execution — nothing finished until everything went through all stages. Now run_pipeline calls each stage function synchronously within the same worker task, so videos complete linearly and efficiently.	2026-04-01 11:39:21 +00:00
jlightner	7cc07d1b2d	stage5 prompt: make section ordering a hard constraint with explicit wrong/correct examples Validation/quality-check sections can NEVER precede construction sections. Added concrete wrong vs correct ordering example using the exact snare design case that failed. Elevated from 'typically' guidance to non-negotiable rule.	2026-04-01 11:35:59 +00:00
jlightner	1ed60f712a	stage5 prompt: reduce creator name repetition — use pronouns after establishing attribution	2026-04-01 11:25:45 +00:00
jlightner	1e31af9c30	stage5 prompt: add explicit section ordering guidance — follow the workflow Sections should mirror the actual production workflow: foundations before finishing, construction before glue, sound sources before processing before mix-bus treatment. Includes the test: 'would a producer follow these steps in this sequence?' and a natural flow template (framework → construction → combining/refining → quality checks).	2026-04-01 11:23:20 +00:00
jlightner	7311f25509	stage5: replace synthesis prompt with v016 (masterclass-recap) + add 100 variant prompts New prompt combines: embedded documentarian role, distilled-knowledge framing, conversational authority voice, problem-solution section structure, context-wrapped specifics, problem-driven teaching rhythm, any-skill-level reader model, insight-first summary, and engagement emphasis. 100 variant prompts generated across 9 dimensions of variation for future A/B testing. Generator script included for reproducibility.	2026-04-01 10:49:16 +00:00
jlightner	1b54b51922	optimize: Stage 5 synthesis prompt — round 0 winner (0.95→1.0 composite) Applied first optimization result: tighter voice preservation instructions, improved section flow guidance, trimmed redundant metadata instructions. 13382→11123 chars (-17%).	2026-04-01 10:15:24 +00:00
jlightner	4f4126e0ce	feat: Generalized OptimizationLoop to stages 2-5 with per-stage fixture… - "backend/pipeline/quality/optimizer.py" - "backend/pipeline/quality/__main__.py" - "backend/pipeline/quality/scorer.py" - "backend/pipeline/quality/fixtures/sample_segments.json" - "backend/pipeline/quality/fixtures/sample_topic_group.json" - "backend/pipeline/quality/fixtures/sample_classifications.json" GSD-Task: S04/T02	2026-04-01 09:24:42 +00:00
jlightner	1be0deeb76	feat: Added STAGE_CONFIGS registry (stages 2-5) with per-stage rubrics,… - "backend/pipeline/quality/scorer.py" - "backend/pipeline/quality/variant_generator.py" GSD-Task: S04/T01	2026-04-01 09:20:24 +00:00
jlightner	03373f263d	perf: Added optimize CLI subcommand with leaderboard table, ASCII traje… - "backend/pipeline/quality/__main__.py" - "backend/pipeline/quality/results/.gitkeep" GSD-Task: S03/T02	2026-04-01 09:10:42 +00:00
jlightner	0d82b2b409	feat: Created PromptVariantGenerator (LLM-powered prompt mutation) and… - "backend/pipeline/quality/variant_generator.py" - "backend/pipeline/quality/optimizer.py" GSD-Task: S03/T01	2026-04-01 09:08:01 +00:00
jlightner	0086573af5	feat: Added VoiceDial class with 3-band prompt modification and ScoreRu… - "backend/pipeline/quality/voice_dial.py" - "backend/pipeline/quality/scorer.py" - "backend/pipeline/quality/__main__.py" GSD-Task: S02/T02	2026-04-01 08:57:07 +00:00
jlightner	91cae921a4	feat: Built ScoreRunner with 5-dimension LLM-as-judge scoring rubric, C… - "backend/pipeline/quality/scorer.py" - "backend/pipeline/quality/__main__.py" - "backend/pipeline/quality/fixtures/sample_moments.json" - "backend/pipeline/quality/fixtures/__init__.py" GSD-Task: S02/T01	2026-04-01 08:53:40 +00:00
jlightner	b1b02a9633	test: Built pipeline.quality package with FitnessRunner (9 tests, 4 cat… - "backend/pipeline/quality/__init__.py" - "backend/pipeline/quality/__main__.py" - "backend/pipeline/quality/fitness.py" GSD-Task: S01/T01	2026-04-01 08:45:05 +00:00
jlightner	2cf9ae9bd6	fix: Retrigger button now uses clean-retrigger (wipes events + re-runs from scratch) The plain trigger endpoint short-circuits on status=complete — 'nothing to do'. Retrigger must use clean-retrigger to reset pipeline state first.	2026-04-01 07:34:01 +00:00
jlightner	18de2b3065	fix: Rename 'Trigger (debug)' button to 'Retrigger'	2026-04-01 07:25:20 +00:00
jlightner	c62c6eb644	fix: Pipeline LLM audit — temperature=0, realistic token ratios, structured request_params Audit findings & fixes: - temperature was never set (API defaulted to 1.0) → now explicit 0.0 for deterministic JSON - llm_max_tokens=65536 exceeded hard_limit=32768 → aligned to 32768 - Output ratio estimates were 5-30x too high (based on actual pipeline data): stage2: 0.6→0.05, stage3: 2.0→0.3, stage4: 0.5→0.3, stage5: 2.5→0.8 - request_params now structured as api_params (what's sent to LLM) vs pipeline_config (internal estimator settings) — no more ambiguous 'hard_limit' in request params - temperature=0.0 sent on both primary and fallback endpoints	2026-04-01 07:20:09 +00:00
jlightner	c7ac4be860	feat: Store LLM request params (max_tokens, model, modality) in pipeline events - _make_llm_callback now accepts request_params dict - All 6 LLM call sites pass max_tokens, model_override, modality, response_model, hard_limit - request_params stored in payload JSONB on every llm_call event (always, not just debug mode) - Frontend JSON export includes full payload + request_params at top level - DebugPayloadViewer shows 'Request Params' section even with debug mode off - Answers whether max_tokens is actually being sent on pipeline requests	2026-04-01 07:01:57 +00:00
jlightner	5f608b8889	fix: Parallel search with match_context, deterministic Qdrant IDs, raised embedding timeout - Search now runs semantic + keyword in parallel, merges and deduplicates - Keyword results always included with match_context explaining WHY matched - Semantic results filtered by minimum score threshold (0.45) - match_context shows 'Creator: X', 'Tag: Y', 'Title match', 'Content: ...' - Qdrant points use deterministic uuid5 IDs (no more duplicates on reindex) - Embedding timeout raised from 300ms to 2s (Ollama needs it) - _enrich_qdrant_results reads creator_name from payload before DB fallback - Frontend displays match_context as highlighted bar on search result cards	2026-04-01 06:54:34 +00:00
jlightner	94da19c05d	fix: Variable ordering bug and stage 5 truncation recovery Two fixes: 1. page_moment_indices was referenced before assignment in the page persist loop — moved assignment to top of loop body. This caused "cannot access local variable" errors on every stage 5 run. 2. Stage 5 now catches LLMTruncationError and splits the chunk in half for retry, instead of blindly retrying the same oversized prompt. This handles categories where synthesis output exceeds the model context window. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 01:51:28 -05:00
jlightner	630d3fa477	feat: Created SortDropdown component and useSortPreference hook, integr… - "frontend/src/components/SortDropdown.tsx" - "frontend/src/hooks/useSortPreference.ts" - "frontend/src/pages/SearchResults.tsx" - "frontend/src/pages/SubTopicPage.tsx" - "frontend/src/pages/CreatorDetail.tsx" - "frontend/src/api/public-client.ts" - "frontend/src/App.css" GSD-Task: S02/T02	2026-04-01 06:41:52 +00:00
jlightner	250d7315af	feat: Added sort query parameter (relevance/newest/oldest/alpha/creator… - "backend/routers/search.py" - "backend/routers/topics.py" - "backend/routers/techniques.py" - "backend/search_service.py" GSD-Task: S02/T01	2026-04-01 06:41:52 +00:00
jlightner	c1cdba14f2	feat: Added partial_matches fallback UI to search results — shows muted… - "frontend/src/api/public-client.ts" - "frontend/src/pages/SearchResults.tsx" - "frontend/src/App.css" GSD-Task: S01/T03	2026-04-01 06:41:52 +00:00
jlightner	5a484fb27a	feat: Enriched Qdrant embedding text with creator_name/tags and added r… - "backend/pipeline/stages.py" - "backend/pipeline/qdrant_client.py" - "backend/routers/pipeline.py" GSD-Task: S01/T02	2026-04-01 06:41:52 +00:00
jlightner	9c0247c830	feat: Refactored keyword_search to multi-token AND with cross-field mat… - "backend/search_service.py" - "backend/schemas.py" - "backend/routers/search.py" - "backend/tests/test_search.py" GSD-Task: S01/T01	2026-04-01 06:41:52 +00:00
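
Commit 94da19c05d describes stage 5 catching LLMTruncationError and splitting the chunk in half for retry. The following is a minimal sketch of that idea, not the repository's code: the `synthesize_with_split` helper, the `fake_synthesize` stub, and the local `LLMTruncationError` class are hypothetical stand-ins, and the recursive halving is an assumption about how repeated truncation might be handled.

```python
from typing import Callable, List, Sequence


class LLMTruncationError(Exception):
    """Local stand-in for the pipeline's truncation error (assumed shape)."""


def synthesize_with_split(
    moments: Sequence[str],
    synthesize: Callable[[Sequence[str]], str],
) -> List[str]:
    """Run synthesize on the chunk; on truncation, halve the chunk and retry each half."""
    try:
        return [synthesize(moments)]
    except LLMTruncationError:
        if len(moments) <= 1:
            raise  # a single item cannot be split any further
        mid = len(moments) // 2
        return (
            synthesize_with_split(moments[:mid], synthesize)
            + synthesize_with_split(moments[mid:], synthesize)
        )


def fake_synthesize(chunk: Sequence[str]) -> str:
    """Hypothetical LLM call that 'truncates' whenever it gets more than two items."""
    if len(chunk) > 2:
        raise LLMTruncationError("output exceeded the token limit")
    return "+".join(chunk)


pages = synthesize_with_split(["a", "b", "c", "d", "e"], fake_synthesize)
print(pages)
```

Halving changes the prompt size on every retry, so the loop terminates either with results or with an error on a single item that is too large by itself.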

277 commits
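
Commit 5f608b8889 mentions that Qdrant points use deterministic uuid5 IDs so that reindexing no longer creates duplicates. A minimal sketch of how such IDs could be derived with Python's standard `uuid` module follows; the namespace URL, the `point_id` helper, and the `kind:source_id` key format are assumptions for illustration, not the project's actual scheme.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/chrysopedia")


def point_id(kind: str, source_id: str) -> str:
    """Derive a stable point ID from the record's kind and its primary key."""
    return str(uuid.uuid5(NAMESPACE, f"{kind}:{source_id}"))


first = point_id("technique_page", "42")
second = point_id("technique_page", "42")   # same input, same ID
other = point_id("key_moment", "42")        # different kind, different ID
print(first == second, first == other)
```

Because the ID is a pure function of the record identity, upserting the same record again overwrites the existing point instead of adding a new one.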