chrysopedia

Author	SHA1	Message	Date
jlightner	da0b4b5fd6	feat: add pipeline iteration tooling — offline test harness, stage re-runs, chunking inspector Drops prompt iteration cycles from 20-30 min to under 5 min by enabling stage-isolated re-runs and offline prompt testing against exported fixtures. Phase 1: Offline prompt test harness - export_fixture.py: export stage 5 inputs from DB to reusable JSON fixtures - test_harness.py: run synthesis offline with any prompt, no Docker needed - promote subcommand: deploy winning prompts with backup and optional git commit Phase 2: Classification data persistence - Dual-write classification to PostgreSQL + Redis (fixes 24hr TTL data loss) - Clean retrigger now clears Redis cache keys (fixes stale data bug) - Alembic migration 011: classification_data JSONB column + stage_rerun enum Phase 3: Stage-isolated re-run - run_single_stage Celery task with prerequisite validation and prompt overrides - _load_prompt supports per-video Redis overrides for testing custom prompts - POST /admin/pipeline/rerun-stage/{video_id}/{stage_name} endpoint - Frontend: Re-run Stage modal with stage selector and prompt override textarea Phase 4: Chunking inspector - GET /admin/pipeline/chunking/{video_id} returns topic boundaries, classifications, and synthesis group breakdowns - Frontend: collapsible Chunking Inspector panel per video Phase 5: Prompt deployment & stale data cleanup - GET /admin/pipeline/stale-pages detects pages from older prompts - POST /admin/pipeline/bulk-resynthesize re-runs a stage on all completed videos - Frontend: stale pages indicator badge with one-click bulk re-synth Phase 6: Automated iteration foundation - Quality CLI --video-id flag auto-exports fixture from DB - POST /admin/pipeline/optimize-prompt/{stage} dispatches optimization as Celery task Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 15:47:46 +00:00
jlightner	37308fd185	pipeline: run stages inline instead of Celery chain dispatch Each video now completes all stages (2→6) before the worker picks up the next queued video. Previously, dispatching celery_chain for multiple videos caused interleaved execution — nothing finished until everything went through all stages. Now run_pipeline calls each stage function synchronously within the same worker task, so videos complete linearly and efficiently.	2026-04-01 11:39:21 +00:00
jlightner	7cc07d1b2d	stage5 prompt: make section ordering a hard constraint with explicit wrong/correct examples Validation/quality-check sections can NEVER precede construction sections. Added concrete wrong vs correct ordering example using the exact snare design case that failed. Elevated from 'typically' guidance to non-negotiable rule.	2026-04-01 11:35:59 +00:00
jlightner	1ed60f712a	stage5 prompt: reduce creator name repetition — use pronouns after establishing attribution	2026-04-01 11:25:45 +00:00
jlightner	1e31af9c30	stage5 prompt: add explicit section ordering guidance — follow the workflow Sections should mirror the actual production workflow: foundations before finishing, construction before glue, sound sources before processing before mix-bus treatment. Includes the test: 'would a producer follow these steps in this sequence?' and a natural flow template (framework → construction → combining/refining → quality checks).	2026-04-01 11:23:20 +00:00
jlightner	7311f25509	stage5: replace synthesis prompt with v016 (masterclass-recap) + add 100 variant prompts New prompt combines: embedded documentarian role, distilled-knowledge framing, conversational authority voice, problem-solution section structure, context-wrapped specifics, problem-driven teaching rhythm, any-skill-level reader model, insight-first summary, and engagement emphasis. 100 variant prompts generated across 9 dimensions of variation for future A/B testing. Generator script included for reproducibility.	2026-04-01 10:49:16 +00:00
jlightner	1b54b51922	optimize: Stage 5 synthesis prompt — round 0 winner (0.95→1.0 composite) Applied first optimization result: tighter voice preservation instructions, improved section flow guidance, trimmed redundant metadata instructions. 13382→11123 chars (-17%).	2026-04-01 10:15:24 +00:00
jlightner	4f4126e0ce	feat: Generalized OptimizationLoop to stages 2-5 with per-stage fixture… - "backend/pipeline/quality/optimizer.py" - "backend/pipeline/quality/__main__.py" - "backend/pipeline/quality/scorer.py" - "backend/pipeline/quality/fixtures/sample_segments.json" - "backend/pipeline/quality/fixtures/sample_topic_group.json" - "backend/pipeline/quality/fixtures/sample_classifications.json" GSD-Task: S04/T02	2026-04-01 09:24:42 +00:00
jlightner	1be0deeb76	feat: Added STAGE_CONFIGS registry (stages 2-5) with per-stage rubrics,… - "backend/pipeline/quality/scorer.py" - "backend/pipeline/quality/variant_generator.py" GSD-Task: S04/T01	2026-04-01 09:20:24 +00:00
jlightner	03373f263d	perf: Added optimize CLI subcommand with leaderboard table, ASCII traje… - "backend/pipeline/quality/__main__.py" - "backend/pipeline/quality/results/.gitkeep" GSD-Task: S03/T02	2026-04-01 09:10:42 +00:00
jlightner	0d82b2b409	feat: Created PromptVariantGenerator (LLM-powered prompt mutation) and… - "backend/pipeline/quality/variant_generator.py" - "backend/pipeline/quality/optimizer.py" GSD-Task: S03/T01	2026-04-01 09:08:01 +00:00
jlightner	0086573af5	feat: Added VoiceDial class with 3-band prompt modification and ScoreRu… - "backend/pipeline/quality/voice_dial.py" - "backend/pipeline/quality/scorer.py" - "backend/pipeline/quality/__main__.py" GSD-Task: S02/T02	2026-04-01 08:57:07 +00:00
jlightner	91cae921a4	feat: Built ScoreRunner with 5-dimension LLM-as-judge scoring rubric, C… - "backend/pipeline/quality/scorer.py" - "backend/pipeline/quality/__main__.py" - "backend/pipeline/quality/fixtures/sample_moments.json" - "backend/pipeline/quality/fixtures/__init__.py" GSD-Task: S02/T01	2026-04-01 08:53:40 +00:00
jlightner	b1b02a9633	test: Built pipeline.quality package with FitnessRunner (9 tests, 4 cat… - "backend/pipeline/quality/__init__.py" - "backend/pipeline/quality/__main__.py" - "backend/pipeline/quality/fitness.py" GSD-Task: S01/T01	2026-04-01 08:45:05 +00:00
jlightner	2cf9ae9bd6	fix: Retrigger button now uses clean-retrigger (wipes events + re-runs from scratch) The plain trigger endpoint short-circuits on status=complete — 'nothing to do'. Retrigger must use clean-retrigger to reset pipeline state first.	2026-04-01 07:34:01 +00:00
jlightner	18de2b3065	fix: Rename 'Trigger (debug)' button to 'Retrigger'	2026-04-01 07:25:20 +00:00
jlightner	c62c6eb644	fix: Pipeline LLM audit — temperature=0, realistic token ratios, structured request_params Audit findings & fixes: - temperature was never set (API defaulted to 1.0) → now explicit 0.0 for deterministic JSON - llm_max_tokens=65536 exceeded hard_limit=32768 → aligned to 32768 - Output ratio estimates were 5-30x too high (based on actual pipeline data): stage2: 0.6→0.05, stage3: 2.0→0.3, stage4: 0.5→0.3, stage5: 2.5→0.8 - request_params now structured as api_params (what's sent to LLM) vs pipeline_config (internal estimator settings) — no more ambiguous 'hard_limit' in request params - temperature=0.0 sent on both primary and fallback endpoints	2026-04-01 07:20:09 +00:00
jlightner	c7ac4be860	feat: Store LLM request params (max_tokens, model, modality) in pipeline events - _make_llm_callback now accepts request_params dict - All 6 LLM call sites pass max_tokens, model_override, modality, response_model, hard_limit - request_params stored in payload JSONB on every llm_call event (always, not just debug mode) - Frontend JSON export includes full payload + request_params at top level - DebugPayloadViewer shows 'Request Params' section even with debug mode off - Answers whether max_tokens is actually being sent on pipeline requests	2026-04-01 07:01:57 +00:00
jlightner	5f608b8889	fix: Parallel search with match_context, deterministic Qdrant IDs, raised embedding timeout - Search now runs semantic + keyword in parallel, merges and deduplicates - Keyword results always included with match_context explaining WHY matched - Semantic results filtered by minimum score threshold (0.45) - match_context shows 'Creator: X', 'Tag: Y', 'Title match', 'Content: ...' - Qdrant points use deterministic uuid5 IDs (no more duplicates on reindex) - Embedding timeout raised from 300ms to 2s (Ollama needs it) - _enrich_qdrant_results reads creator_name from payload before DB fallback - Frontend displays match_context as highlighted bar on search result cards	2026-04-01 06:54:34 +00:00
jlightner	94da19c05d	fix: Variable ordering bug and stage 5 truncation recovery Two fixes: 1. page_moment_indices was referenced before assignment in the page persist loop — moved assignment to top of loop body. This caused "cannot access local variable" errors on every stage 5 run. 2. Stage 5 now catches LLMTruncationError and splits the chunk in half for retry, instead of blindly retrying the same oversized prompt. This handles categories where synthesis output exceeds the model context window. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 01:51:28 -05:00
jlightner	630d3fa477	feat: Created SortDropdown component and useSortPreference hook, integr… - "frontend/src/components/SortDropdown.tsx" - "frontend/src/hooks/useSortPreference.ts" - "frontend/src/pages/SearchResults.tsx" - "frontend/src/pages/SubTopicPage.tsx" - "frontend/src/pages/CreatorDetail.tsx" - "frontend/src/api/public-client.ts" - "frontend/src/App.css" GSD-Task: S02/T02	2026-04-01 06:41:52 +00:00
jlightner	250d7315af	feat: Added sort query parameter (relevance/newest/oldest/alpha/creator… - "backend/routers/search.py" - "backend/routers/topics.py" - "backend/routers/techniques.py" - "backend/search_service.py" GSD-Task: S02/T01	2026-04-01 06:41:52 +00:00
jlightner	c1cdba14f2	feat: Added partial_matches fallback UI to search results — shows muted… - "frontend/src/api/public-client.ts" - "frontend/src/pages/SearchResults.tsx" - "frontend/src/App.css" GSD-Task: S01/T03	2026-04-01 06:41:52 +00:00
jlightner	5a484fb27a	feat: Enriched Qdrant embedding text with creator_name/tags and added r… - "backend/pipeline/stages.py" - "backend/pipeline/qdrant_client.py" - "backend/routers/pipeline.py" GSD-Task: S01/T02	2026-04-01 06:41:52 +00:00
jlightner	9c0247c830	feat: Refactored keyword_search to multi-token AND with cross-field mat… - "backend/search_service.py" - "backend/schemas.py" - "backend/routers/search.py" - "backend/tests/test_search.py" GSD-Task: S01/T01	2026-04-01 06:41:52 +00:00
jlightner	0d538238a6	fix: Moment-to-page linking via moment_indices in stage 5 synthesis When the LLM splits a category group into multiple technique pages, moments were blanket-linked to the last page in the loop, leaving all other pages as orphans with 0 key moments (48 out of 204 pages affected). Added moment_indices field to SynthesizedPage schema and synthesis prompt so the LLM explicitly declares which input moments each page covers. Stage 5 now uses these indices for targeted linking instead of the broken blanket approach. Tags are also computed per-page from linked moments only, fixing cross-contamination (e.g. "stereo imaging" tag appearing on gain staging pages). Deleted 48 orphan technique pages from the database. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 00:34:37 -05:00
jlightner	89c3f4fcc4	feat: Enrich in-progress stage display and memoize pipeline page In-progress stages now show: - Live elapsed time (ticks every second) next to the active stage dot - Run-level token count so far Performance: wrapped StageTimeline, StatusFilter, WorkerStatus, and RecentActivityFeed with React.memo. Memoized filteredVideos with useMemo. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 22:10:07 -05:00
jlightner	2ea720af5c	feat: Truncation detection, batched classification, and pipeline auto-resume Three resilience improvements to the pipeline: 1. LLMResponse(str) subclass carries finish_reason metadata from the LLM. _safe_parse_llm_response detects truncation (finish=length) and raises LLMTruncationError instead of wastefully retrying with a JSON nudge that makes the prompt even longer. 2. Stage 4 classification now batches moments (20 per call) instead of sending all moments in a single LLM call. Prevents context window overflow for videos with many moments. Batch results are merged with reindexed moment_index values. 3. run_pipeline auto-resumes from the last completed stage on error/retry instead of always restarting from stage 2. Queries pipeline_events for the most recent run to find completed stages. clean_reprocess trigger still forces a full restart. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 17:48:19 -05:00
jlightner	2a128a3804	fix: Inflate LLM token estimates and forward max_tokens on retry Stage 4 classification was truncating (finish=length) because the 0.15x output ratio underestimated token needs. Inflated all stage ratios, bumped the buffer from 20% to 50%, raised the floor from 2048 to 4096, and fixed _safe_parse_llm_response to forward max_tokens on retry instead of falling back to the 65k default. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 17:28:58 -05:00
jlightner	a216f60093	style: Custom dark-theme checkboxes on pipeline admin page Replace default browser checkboxes with custom styled versions that blend with the dark UI — transparent background, muted border, cyan accent on check. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 17:23:33 -05:00
jlightner	0abd11299e	feat: Add real-time text search filter on pipeline admin page Filters video list by filename or creator name as you type. Works alongside the existing status and creator dropdown filters. Includes a clear button when text is entered.	2026-03-31 17:38:31 +00:00
jlightner	564eb2a636	feat: Change global search shortcut from Cmd+K to Ctrl+Shift+F	2026-03-31 17:33:09 +00:00
jlightner	604e9711d1	feat: Add tooltips to stage timeline dots showing stage name on hover	2026-03-31 17:30:45 +00:00
jlightner	d7de26b86b	feat: Add context labels to multi-call pipeline stages Stage 3 (extraction) LLM calls now show the topic group label (e.g., 'Sound Design Basics') and Stage 5 (synthesis) calls show the category name. Displayed as a cyan italic label in the event row between the event type badge and model name. Helps admins understand why there are multiple LLM calls per stage.	2026-03-31 17:27:40 +00:00
jlightner	af3f1a6663	feat: Pipeline runs — per-execution tracking with run-scoped events Data model: - New pipeline_runs table (id, video_id, run_number, trigger, status, started_at, finished_at, error_stage, total_tokens) - pipeline_events gains run_id FK (nullable for backward compat) - Alembic migration 010_add_pipeline_runs Backend: - run_pipeline() creates a PipelineRun, threads run_id through all stages - _emit_event() and _make_llm_callback() accept and store run_id - Stage 6 (final) calls _finish_run() to mark complete with token totals - mark_pipeline_error marks run as error - Revoke marks running runs as cancelled - Trigger endpoints pass trigger type (manual, clean_reprocess) - New GET /admin/pipeline/runs/{video_id} — lists runs with event counts - GET /admin/pipeline/events supports ?run_id= filter Frontend: - Expanded video detail now shows RunList instead of flat EventLog - Each run is a collapsible card showing: run number, trigger type, status badge, timestamps, token count, event count - Latest run auto-expands, older runs collapsed - Legacy events (pre-run-tracking) shown as separate collapsible section - Run cards color-coded: cyan border for running, red for error, gray for cancelled - EventLog accepts optional runId prop to scope events to a single run	2026-03-31 17:13:41 +00:00
jlightner	a6f4f36a46	fix: Clean retrigger preserves transcript_segments (pipeline input data) Deleting transcript_segments left the pipeline with nothing to process — all stages would skip immediately. Segments come from the ingest step, not from pipeline stages 2-6. Only pipeline_events and key_moments (pipeline output) are deleted during clean reprocess.	2026-03-31 16:32:25 +00:00
jlightner	63e350a882	fix: Auto-refresh EventLog every 10s for processing/queued videos Previously the event log only loaded once when the row was expanded, so mid-pipeline videos only showed start events. Now the EventLog component accepts a status prop and polls every 10s when the video is processing or queued, silently updating without showing a loading spinner.	2026-03-31 16:23:33 +00:00
jlightner	9497d8f0e4	feat: Add real-time pipeline visibility — auto-refresh, stage timeline, activity feed, bulk log - Backend: Video list now includes active_stage, active_stage_status, and stage_started_at fields via DISTINCT ON subquery - Backend: New GET /admin/pipeline/recent-activity endpoint returns latest stage completions/errors with video context - Frontend: 15-second auto-refresh with change detection — video rows flash when status changes - Frontend: Stage timeline dots on processing/complete/error videos showing progress through stages 2-5, active stage pulses - Frontend: Collapsible Recent Activity feed at top showing last 8 stage completions/errors with duration and creator - Frontend: Bulk operation scrollable log showing per-video results as they complete - Frontend: Auto-refresh checkbox toggle in header	2026-03-31 16:12:57 +00:00
jlightner	04ae6d0703	feat: Add bulk pipeline reprocessing — creator filter, multi-select, clean retrigger - Backend: POST /admin/pipeline/clean-retrigger/{video_id} endpoint that deletes pipeline_events, key_moments, transcript_segments, and Qdrant vectors before retriggering the pipeline - Backend: QdrantManager.delete_by_video_id() for vector cleanup - Frontend: Creator filter dropdown on pipeline admin page - Frontend: Checkbox selection column with select-all - Frontend: Bulk toolbar with Retrigger Selected and Clean Reprocess actions, sequential dispatch with progress bar, cancel support - Bulk dispatch uses 500ms delay between requests to avoid slamming API	2026-03-31 15:24:59 +00:00
jlightner	f3e6a9c885	feat: Created useDocumentTitle hook and wired descriptive, route-specif… - "frontend/src/hooks/useDocumentTitle.ts" - "frontend/src/pages/Home.tsx" - "frontend/src/pages/TopicsBrowse.tsx" - "frontend/src/pages/SubTopicPage.tsx" - "frontend/src/pages/CreatorsBrowse.tsx" - "frontend/src/pages/CreatorDetail.tsx" - "frontend/src/pages/TechniquePage.tsx" - "frontend/src/pages/SearchResults.tsx" GSD-Task: S04/T02	2026-03-31 08:56:16 +00:00
jlightner	6845f5c349	feat: Demoted nav brand to span, promoted page headings to h1, added sk… - "frontend/src/App.tsx" - "frontend/src/App.css" - "frontend/src/pages/Home.tsx" - "frontend/src/pages/SearchResults.tsx" - "frontend/src/pages/TopicsBrowse.tsx" - "frontend/src/pages/CreatorsBrowse.tsx" - "frontend/src/pages/SubTopicPage.tsx" - "frontend/src/pages/AdminReports.tsx" GSD-Task: S04/T01	2026-03-31 08:52:48 +00:00
jlightner	85712c15eb	feat: Added mobile hamburger menu with 44px touch targets, Escape/outsi… - "frontend/src/App.tsx" - "frontend/src/App.css" GSD-Task: S03/T02	2026-03-31 08:45:33 +00:00
jlightner	fea0afdec0	feat: Refactored SearchAutocomplete from heroSize boolean to variant st… - "frontend/src/components/SearchAutocomplete.tsx" - "frontend/src/App.tsx" - "frontend/src/App.css" - "frontend/src/pages/Home.tsx" - "frontend/src/pages/SearchResults.tsx" GSD-Task: S03/T01	2026-03-31 08:42:15 +00:00
jlightner	fa1fc82d5a	feat: Created shared TagList component with max-4 overflow, applied acr… - "frontend/src/components/TagList.tsx" - "frontend/src/pages/Home.tsx" - "frontend/src/pages/SearchResults.tsx" - "frontend/src/pages/SubTopicPage.tsx" - "frontend/src/pages/CreatorDetail.tsx" - "frontend/src/pages/TopicsBrowse.tsx" - "frontend/src/App.css" GSD-Task: S02/T03	2026-03-31 08:35:07 +00:00
jlightner	b01e5949b6	feat: Replaced run-on dot-separated topic stats on CreatorDetail with c… - "frontend/src/pages/CreatorDetail.tsx" - "frontend/src/App.css" GSD-Task: S02/T02	2026-03-31 08:32:09 +00:00
jlightner	a9b65fcea9	feat: Topics page loads with all categories collapsed; expand/collapse… - "frontend/src/pages/TopicsBrowse.tsx" - "frontend/src/App.css" GSD-Task: S02/T01	2026-03-31 08:30:55 +00:00
jlightner	df559bbca0	feat: Added GET /api/v1/techniques/random endpoint returning {slug}, fe… - "backend/routers/techniques.py" - "frontend/src/api/public-client.ts" - "frontend/src/pages/Home.tsx" - "frontend/src/App.css" GSD-Task: S01/T02	2026-03-31 08:24:38 +00:00
jlightner	9e4f10b0af	feat: Added scale(1.02) hover to all 6 card types, cardEnter stagger an… - "frontend/src/App.css" - "frontend/src/pages/Home.tsx" - "frontend/src/pages/TopicsBrowse.tsx" - "frontend/src/pages/CreatorDetail.tsx" - "frontend/src/pages/SubTopicPage.tsx" - "frontend/src/pages/SearchResults.tsx" GSD-Task: S01/T01	2026-03-31 08:22:37 +00:00
jlightner	d0bdc6f516	feat: Extracted inline typeahead from Home.tsx into shared SearchAutoco… - "frontend/src/components/SearchAutocomplete.tsx" - "frontend/src/api/public-client.ts" - "frontend/src/pages/Home.tsx" - "frontend/src/pages/SearchResults.tsx" - "frontend/src/App.css" GSD-Task: S04/T02	2026-03-31 06:39:01 +00:00
jlightner	9107323a66	test: Added GET /api/v1/search/suggestions endpoint returning popular t… - "backend/schemas.py" - "backend/routers/search.py" - "backend/tests/test_search.py" GSD-Task: S04/T01	2026-03-31 06:35:37 +00:00
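
Commit 5f608b8889 above mentions that Qdrant points now use deterministic uuid5 IDs so that reindexing no longer creates duplicates. A minimal sketch of that idea follows. The namespace string, the `point_id` helper, and the `kind:key` format are assumptions made for illustration; the log does not show the repository's actual key scheme.

```python
import uuid

# Hypothetical namespace: any fixed UUID works, as long as it never changes
# between indexing runs.
POINT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "chrysopedia/qdrant-points")


def point_id(kind: str, key: str) -> str:
    """Return the same UUID string for the same (kind, key) on every run.

    Because the ID is derived from the content key rather than generated
    randomly, upserting the same item twice overwrites the existing point
    instead of adding a second one.
    """
    return str(uuid.uuid5(POINT_NAMESPACE, f"{kind}:{key}"))
```

With random `uuid4` IDs every reindex would produce fresh points; deriving the ID from a stable key makes the upsert idempotent.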

152 commits
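
Commit 94da19c05d describes stage 5 catching `LLMTruncationError` and splitting the chunk in half for retry instead of resending the same oversized prompt. The sketch below shows one way such a split-and-retry could look. The `synthesize_with_split` helper and the `call` parameter are hypothetical, and the exception class here is a local stand-in for the pipeline's own.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class LLMTruncationError(Exception):
    """Local stand-in for the pipeline exception of the same name."""


def synthesize_with_split(
    items: Sequence[T], call: Callable[[Sequence[T]], R]
) -> List[R]:
    """Run call(items); on truncation, halve the input and retry each half.

    Returns one result per successful call, in input order.
    """
    try:
        return [call(items)]
    except LLMTruncationError:
        if len(items) <= 1:
            # A single item cannot be split further, so surface the error.
            raise
        mid = len(items) // 2
        left = synthesize_with_split(items[:mid], call)
        right = synthesize_with_split(items[mid:], call)
        return left + right
```

Each retry sends a strictly smaller input, so the recursion stops either when every piece fits or when a single item still truncates and the error propagates to the caller.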