Deletes all technique pages, versions, links, key moments, pipeline
events/runs, Qdrant vectors, and Redis cache while preserving creators,
videos, and transcript segments. Resets every video's status to not_started.
Double-confirm dialog in the UI prevents accidental use.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rewrote voice from third-person narrative ("Keota does X") to instructive
("Route the effect at 100% wet"). Body prose now reads like a lesson book.
- Hard rule: creator name appears in title/summary only, max once in body
(for quote attribution). Fixed JSON example that modeled heavy name usage.
- Added orientation-first section rhythm: each section opens with a brief
definition before the method, which prevents a run-on feel.
- Page minimum thresholds: 3+ sections, 400+ words, 3+ moments. Prevents
stub pages from thin categories.
- Strengthened merge guidance: prefer fewer rich pages over many stubs.
- Updated all examples to model instructive phrasing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Drops prompt iteration cycles from 20-30 min to under 5 min by enabling
stage-isolated re-runs and offline prompt testing against exported fixtures.
Phase 1: Offline prompt test harness
- export_fixture.py: export stage 5 inputs from DB to reusable JSON fixtures
- test_harness.py: run synthesis offline with any prompt, no Docker needed
- promote subcommand: deploy winning prompts with backup and optional git commit
Phase 2: Classification data persistence
- Dual-write classification to PostgreSQL + Redis (fixes 24hr TTL data loss)
- Clean retrigger now clears Redis cache keys (fixes stale data bug)
- Alembic migration 011: classification_data JSONB column + stage_rerun enum
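A minimal sketch of the dual-write idea in Phase 2, using in-memory stand-ins rather than real PostgreSQL and Redis clients. The names FakeRedis, cache_key, save_classification, load_classification, and clear_cache_for_retrigger, along with the key format, are hypothetical and chosen only for illustration.

```python
import json
import time


class FakeRedis:
    """In-memory stand-in for a Redis client with per-key expiry (hypothetical)."""

    def __init__(self):
        self._store = {}

    def setex(self, key, ttl_seconds, value):
        # Store the value together with its absolute expiry time.
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[1] <= time.monotonic():
            self._store.pop(key, None)
            return None
        return entry[0]

    def delete(self, key):
        self._store.pop(key, None)


CACHE_TTL_SECONDS = 24 * 60 * 60  # the 24hr TTL mentioned above


def cache_key(video_id):
    # Key format is an assumption made for this sketch.
    return f"classification:{video_id}"


def save_classification(db_rows, cache, video_id, data):
    """Write the durable copy first, then the cache copy."""
    db_rows[video_id] = data  # stands in for the classification_data JSONB column
    cache.setex(cache_key(video_id), CACHE_TTL_SECONDS, json.dumps(data))


def load_classification(db_rows, cache, video_id):
    """Prefer the cache; fall back to the durable copy once the cache entry is gone."""
    cached = cache.get(cache_key(video_id))
    if cached is not None:
        return json.loads(cached)
    return db_rows.get(video_id)


def clear_cache_for_retrigger(cache, video_id):
    """Drop the cached copy so a clean retrigger cannot read stale data."""
    cache.delete(cache_key(video_id))
```

With both copies written, an expired or cleared cache entry no longer loses the classification: the read path falls through to the durable row.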
Phase 3: Stage-isolated re-run
- run_single_stage Celery task with prerequisite validation and prompt overrides
- _load_prompt supports per-video Redis overrides for testing custom prompts
- POST /admin/pipeline/rerun-stage/{video_id}/{stage_name} endpoint
- Frontend: Re-run Stage modal with stage selector and prompt override textarea
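The per-video override lookup in Phase 3 could look roughly like the sketch below. It is not the project's _load_prompt: the function name load_prompt, the override_key format, the ".txt" file layout, and the plain dict standing in for Redis are all assumptions.

```python
from pathlib import Path


def override_key(video_id, stage_name):
    # Hypothetical key format for a per-video prompt override.
    return f"prompt_override:{video_id}:{stage_name}"


def load_prompt(stage_name, prompt_dir, overrides, video_id=None):
    """Return the override for this video and stage if one exists, else the file prompt.

    `overrides` is any mapping with .get(); a dict stands in for Redis here.
    """
    if video_id is not None:
        override = overrides.get(override_key(video_id, stage_name))
        if override:
            return override
    # Assumed layout: one "<stage_name>.txt" file per stage in prompt_dir.
    return (Path(prompt_dir) / f"{stage_name}.txt").read_text(encoding="utf-8")
```

Because the override is keyed by video and stage, a custom prompt tested on one video leaves every other video on the deployed prompt.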
Phase 4: Chunking inspector
- GET /admin/pipeline/chunking/{video_id} returns topic boundaries,
classifications, and synthesis group breakdowns
- Frontend: collapsible Chunking Inspector panel per video
Phase 5: Prompt deployment & stale data cleanup
- GET /admin/pipeline/stale-pages detects pages from older prompts
- POST /admin/pipeline/bulk-resynthesize re-runs a stage on all completed videos
- Frontend: stale pages indicator badge with one-click bulk re-synth
Phase 6: Automated iteration foundation
- Quality CLI --video-id flag auto-exports fixture from DB
- POST /admin/pipeline/optimize-prompt/{stage} dispatches optimization as Celery task
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each video now completes all stages (2→6) before the worker picks up the
next queued video. Previously, dispatching celery_chain for multiple videos
caused interleaved execution: no video finished until every video had gone
through all stages. Now run_pipeline calls each stage function synchronously
within the same worker task, so videos complete one at a time.
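As a rough illustration of the synchronous ordering described above, the sketch below runs no-op stage functions in sequence and records the order. The make_stages helper and the log list are hypothetical; the real stage functions and Celery task wiring are not shown.

```python
def run_pipeline(video_id, stages, log):
    """Run every stage for one video, in order, inside a single call."""
    for name, stage_fn in stages:
        stage_fn(video_id)
        log.append((video_id, name))


def make_stages():
    # Placeholder stage functions for stages 2 through 6.
    return [(f"stage{n}", lambda video_id: None) for n in range(2, 7)]


log = []
for video_id in ["video-a", "video-b"]:
    run_pipeline(video_id, make_stages(), log)

# All five entries for video-a are recorded before any entry for video-b.
```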
Validation/quality-check sections can NEVER precede construction sections.
Added concrete wrong vs correct ordering example using the exact snare design
case that failed. Elevated from 'typically' guidance to non-negotiable rule.
Sections should mirror the actual production workflow: foundations before
finishing, construction before glue, sound sources before processing before
mix-bus treatment. Includes the test: 'would a producer follow these steps
in this sequence?' and a natural flow template (framework → construction →
combining/refining → quality checks).
Audit findings & fixes:
- temperature was never set (API defaulted to 1.0) → now explicit 0.0 for deterministic JSON
- llm_max_tokens=65536 exceeded hard_limit=32768 → aligned to 32768
- Output ratio estimates were 5-30x too high; lowered based on actual pipeline data:
stage2: 0.6→0.05, stage3: 2.0→0.3, stage4: 0.5→0.3, stage5: 2.5→0.8
- request_params now structured as api_params (what's sent to LLM) vs pipeline_config
(internal estimator settings) — no more ambiguous 'hard_limit' in request params
- temperature=0.0 sent on both primary and fallback endpoints
- _make_llm_callback now accepts request_params dict
- All 6 LLM call sites pass max_tokens, model_override, modality, response_model, hard_limit
- request_params stored in payload JSONB on every llm_call event (always, not just debug mode)
- Frontend JSON export includes full payload + request_params at top level
- DebugPayloadViewer shows 'Request Params' section even with debug mode off
- Makes it possible to verify whether max_tokens is actually sent on pipeline requests
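One possible shape for the api_params / pipeline_config split is sketched below. The function name build_request_params, the exact keys placed in each group, and the clamp of max_tokens to hard_limit are assumptions made for illustration, not the stored payload format.

```python
def build_request_params(model, max_tokens, hard_limit, output_ratio, temperature=0.0):
    """Separate what is sent to the LLM from internal estimator settings."""
    return {
        # Sent to the LLM endpoint (primary and fallback alike).
        "api_params": {
            "model": model,
            "max_tokens": min(max_tokens, hard_limit),  # assumed clamp
            "temperature": temperature,  # explicit 0.0 for deterministic JSON
        },
        # Internal settings only; never sent on the request.
        "pipeline_config": {
            "hard_limit": hard_limit,
            "output_ratio": output_ratio,
        },
    }
```

Keeping hard_limit out of api_params removes the ambiguity about whether it is an API field or a pipeline setting.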
- Search now runs semantic + keyword in parallel, merges and deduplicates
- Keyword results always included with match_context explaining WHY matched
- Semantic results filtered by minimum score threshold (0.45)
- match_context shows 'Creator: X', 'Tag: Y', 'Title match', 'Content: ...'
- Qdrant points use deterministic uuid5 IDs (no more duplicates on reindex)
- Embedding timeout raised from 300ms to 2s (Ollama needs it)
- _enrich_qdrant_results reads creator_name from payload before DB fallback
- Frontend displays match_context as highlighted bar on search result cards
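A minimal sketch of two items above: merging keyword and semantic hits with the 0.45 score threshold, and deriving deterministic point IDs with uuid5. The hit dictionaries, the merge_results and point_id names, and the "technique-page:..." name string are assumptions; running the two searches in parallel is omitted.

```python
import uuid

MIN_SEMANTIC_SCORE = 0.45  # threshold stated above


def point_id(page_id, chunk_index=0):
    """Same inputs always give the same ID, so reindexing overwrites instead of duplicating."""
    # The name string format is hypothetical.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"technique-page:{page_id}:{chunk_index}"))


def merge_results(keyword_hits, semantic_hits, min_score=MIN_SEMANTIC_SCORE):
    """Keyword hits are always kept; semantic hits must pass the score threshold."""
    merged = {}
    for hit in keyword_hits:
        merged[hit["id"]] = dict(hit)
    for hit in semantic_hits:
        if hit["score"] < min_score:
            continue
        # setdefault keeps the keyword version (with its match_context) on overlap.
        merged.setdefault(hit["id"], dict(hit))
    return list(merged.values())
```

Inserting keyword hits first means a page found by both searches keeps its match_context explanation.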
Two fixes:
1. page_moment_indices was referenced before assignment in the page
persist loop, which caused "cannot access local variable" errors on
every stage 5 run. Moved the assignment to the top of the loop body.
2. Stage 5 now catches LLMTruncationError and splits the chunk in
half for retry, instead of blindly retrying the same oversized
prompt. This handles categories where synthesis output exceeds
the model context window.
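The split-in-half retry from fix 2 can be sketched as below. This is an illustration under assumptions: the LLMTruncationError class is redefined locally, synthesize_with_split is a hypothetical name, and recursing further when a half still truncates is a choice made for this sketch.

```python
class LLMTruncationError(Exception):
    """Raised when the LLM stops because it ran out of output tokens."""


def synthesize_with_split(moments, synthesize):
    """Call synthesize(moments); on truncation, retry each half separately."""
    try:
        return synthesize(moments)
    except LLMTruncationError:
        if len(moments) <= 1:
            raise  # a single moment cannot be split further
        mid = len(moments) // 2
        # Each half yields its own list of pages; concatenate them in order.
        return (synthesize_with_split(moments[:mid], synthesize)
                + synthesize_with_split(moments[mid:], synthesize))
```

Here `synthesize` is any callable that returns a list of pages for a list of moments and raises LLMTruncationError when the output is cut off.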
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the LLM splits a category group into multiple technique pages,
moments were blanket-linked to the last page in the loop, leaving all
other pages as orphans with 0 key moments (48 out of 204 pages affected).
Added moment_indices field to SynthesizedPage schema and synthesis prompt
so the LLM explicitly declares which input moments each page covers.
Stage 5 now uses these indices for targeted linking instead of the broken
blanket approach. Tags are also computed per-page from linked moments
only, fixing cross-contamination (e.g. "stereo imaging" tag appearing
on gain staging pages).
Deleted 48 orphan technique pages from the database.
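A rough sketch of index-based linking and per-page tags, assuming pages and moments are plain dictionaries. The link_moments name and the "title", "id", and "tags" keys are hypothetical; only moment_indices comes from the schema change described above.

```python
def link_moments(pages, moments):
    """Link each page only to the moments it declares, and derive its tags from them."""
    links = {}
    for page in pages:
        # Ignore out-of-range indices rather than failing the whole run (assumption).
        valid = [i for i in page["moment_indices"] if 0 <= i < len(moments)]
        linked = [moments[i] for i in valid]
        tags = sorted({tag for moment in linked for tag in moment["tags"]})
        links[page["title"]] = {
            "moment_ids": [moment["id"] for moment in linked],
            "tags": tags,
        }
    return links
```

Because tags are collected from the linked moments only, a tag from one page's moments cannot leak onto a sibling page from the same category group.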
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
In-progress stages now show:
- Live elapsed time (ticks every second) next to the active stage dot
- Run-level token count so far
Performance: wrapped StageTimeline, StatusFilter, WorkerStatus, and
RecentActivityFeed with React.memo. Memoized filteredVideos with useMemo.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three resilience improvements to the pipeline:
1. LLMResponse(str) subclass carries finish_reason metadata from the LLM.
_safe_parse_llm_response detects truncation (finish=length) and raises
LLMTruncationError instead of wastefully retrying with a JSON nudge
that makes the prompt even longer.
2. Stage 4 classification now batches moments (20 per call) instead of
sending all moments in a single LLM call. Prevents context window
overflow for videos with many moments. Batch results are merged with
reindexed moment_index values.
3. run_pipeline auto-resumes from the last completed stage on error/retry
instead of always restarting from stage 2. Queries pipeline_events for
the most recent run to find completed stages. clean_reprocess trigger
still forces a full restart.
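The three changes above can be sketched together as follows. Everything here is illustrative: ensure_complete, classify_in_batches, and resume_stage are hypothetical names, the classify_batch callable is assumed to return batch-local moment_index values, and the completed-stage set stands in for the pipeline_events query.

```python
class LLMTruncationError(Exception):
    """Raised when the response was cut off (finish_reason == 'length')."""


class LLMResponse(str):
    """A str that also carries the finish_reason reported by the LLM."""

    def __new__(cls, text, finish_reason=None):
        obj = super().__new__(cls, text)
        obj.finish_reason = finish_reason
        return obj


def ensure_complete(response):
    """Fail fast on truncation instead of retrying with a longer prompt."""
    if getattr(response, "finish_reason", None) == "length":
        raise LLMTruncationError("response truncated (finish=length)")
    return response


def classify_in_batches(moments, classify_batch, batch_size=20):
    """Classify moments 20 at a time and shift indices back to global positions."""
    merged = []
    for offset in range(0, len(moments), batch_size):
        batch = moments[offset:offset + batch_size]
        for item in classify_batch(batch):
            merged.append({**item, "moment_index": item["moment_index"] + offset})
    return merged


def resume_stage(stage_order, completed, force_restart=False):
    """Return the first stage not yet completed, or the first stage on a forced restart."""
    if force_restart:
        return stage_order[0]
    for stage in stage_order:
        if stage not in completed:
            return stage
    return None  # every stage already completed
```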
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stage 4 classification was truncating (finish=length) because the 0.15x
output ratio underestimated token needs. Inflated all stage ratios,
bumped the buffer from 20% to 50%, raised the floor from 2048 to 4096,
and fixed _safe_parse_llm_response to forward max_tokens on retry
instead of falling back to the 65k default.
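The estimate described above amounts to roughly the arithmetic below. The function name estimate_max_tokens and its parameter names are hypothetical, and the optional hard_limit cap is an assumption added for the sketch.

```python
import math


def estimate_max_tokens(input_tokens, output_ratio, buffer=0.5, floor=4096, hard_limit=None):
    """Estimate output tokens as input * ratio, add a buffer, then apply the floor."""
    estimate = math.ceil(input_tokens * output_ratio * (1 + buffer))
    estimate = max(estimate, floor)  # never request fewer than the floor
    if hard_limit is not None:
        estimate = min(estimate, hard_limit)  # assumed cap, not stated above
    return estimate
```

Small inputs land on the floor, so short transcripts still get 4096 tokens of headroom; the buffer only matters once the ratio-based estimate exceeds it.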
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace default browser checkboxes with custom styled versions that blend
with the dark UI — transparent background, muted border, cyan accent on check.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Filters video list by filename or creator name as you type. Works
alongside the existing status and creator dropdown filters. Includes
a clear button when text is entered.
Stage 3 (extraction) LLM calls now show the topic group label (e.g.,
'Sound Design Basics') and Stage 5 (synthesis) calls show the category
name. Displayed as a cyan italic label in the event row between the
event type badge and model name. Helps admins understand why there are
multiple LLM calls per stage.
Data model:
- New pipeline_runs table (id, video_id, run_number, trigger, status,
started_at, finished_at, error_stage, total_tokens)
- pipeline_events gains run_id FK (nullable for backward compat)
- Alembic migration 010_add_pipeline_runs
Backend:
- run_pipeline() creates a PipelineRun, threads run_id through all stages
- _emit_event() and _make_llm_callback() accept and store run_id
- Stage 6 (final) calls _finish_run() to mark complete with token totals
- mark_pipeline_error marks run as error
- Revoke marks running runs as cancelled
- Trigger endpoints pass trigger type (manual, clean_reprocess)
- New GET /admin/pipeline/runs/{video_id} — lists runs with event counts
- GET /admin/pipeline/events supports ?run_id= filter
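A minimal in-memory sketch of the run lifecycle described above, using a dataclass in place of the pipeline_runs table. The field names mirror the columns listed under Data model; the RunStore class, its method names, and the event dictionaries are hypothetical stand-ins for the database layer.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _now():
    return datetime.now(timezone.utc)


@dataclass
class PipelineRun:
    id: int
    video_id: str
    run_number: int
    trigger: str  # e.g. "manual" or "clean_reprocess"
    status: str = "running"
    started_at: datetime = field(default_factory=_now)
    finished_at: Optional[datetime] = None
    error_stage: Optional[str] = None
    total_tokens: int = 0


class RunStore:
    """Stand-in for the pipeline_runs and pipeline_events tables."""

    def __init__(self):
        self.runs = []
        self.events = []

    def start_run(self, video_id, trigger):
        # run_number counts runs per video, starting at 1.
        run_number = 1 + sum(1 for r in self.runs if r.video_id == video_id)
        run = PipelineRun(id=len(self.runs) + 1, video_id=video_id,
                          run_number=run_number, trigger=trigger)
        self.runs.append(run)
        return run

    def emit_event(self, run, stage, event_type, tokens=0):
        # Every event carries run_id so it can be filtered per run.
        self.events.append({"run_id": run.id, "stage": stage,
                            "event_type": event_type, "tokens": tokens})

    def finish_run(self, run):
        run.total_tokens = sum(e["tokens"] for e in self.events if e["run_id"] == run.id)
        run.status = "complete"
        run.finished_at = _now()

    def mark_error(self, run, stage):
        run.status = "error"
        run.error_stage = stage
        run.finished_at = _now()
```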
Frontend:
- Expanded video detail now shows RunList instead of flat EventLog
- Each run is a collapsible card showing: run number, trigger type,
status badge, timestamps, token count, event count
- Latest run auto-expands, older runs collapsed
- Legacy events (pre-run-tracking) shown as separate collapsible section
- Run cards color-coded: cyan border for running, red for error,
gray for cancelled
- EventLog accepts optional runId prop to scope events to a single run
Deleting transcript_segments left the pipeline with nothing to process —
all stages would skip immediately. Segments come from the ingest step,
not from pipeline stages 2-6. Only pipeline_events and key_moments
(pipeline output) are deleted during clean reprocess.
Previously the event log only loaded once when the row was expanded,
so mid-pipeline videos only showed start events. Now the EventLog
component accepts a status prop and polls every 10s when the video is
processing or queued, silently updating without showing a loading spinner.
- Backend: Video list now includes active_stage, active_stage_status, and
stage_started_at fields via DISTINCT ON subquery
- Backend: New GET /admin/pipeline/recent-activity endpoint returns
latest stage completions/errors with video context
- Frontend: 15-second auto-refresh with change detection — video rows
flash when status changes
- Frontend: Stage timeline dots on processing/complete/error videos
showing progress through stages 2-5, active stage pulses
- Frontend: Collapsible Recent Activity feed at top showing last 8
stage completions/errors with duration and creator
- Frontend: Bulk operation scrollable log showing per-video results
as they complete
- Frontend: Auto-refresh checkbox toggle in header
- Backend: POST /admin/pipeline/clean-retrigger/{video_id} endpoint that
deletes pipeline_events, key_moments, transcript_segments, and Qdrant
vectors before retriggering the pipeline
- Backend: QdrantManager.delete_by_video_id() for vector cleanup
- Frontend: Creator filter dropdown on pipeline admin page
- Frontend: Checkbox selection column with select-all
- Frontend: Bulk toolbar with Retrigger Selected and Clean Reprocess
actions, sequential dispatch with progress bar, cancel support
- Bulk dispatch waits 500ms between requests to avoid overloading the API