- Add content_hash (SHA-256 of transcript text) to source_videos (migration 005)
- 3-tier duplicate detection at ingest: exact filename, content hash,
then normalized filename + duration (handles yt-dlp re-downloads)
- Snapshot prior technique_page_ids to Redis before pipeline dispatch
- Stage 5 matches prior pages by creator+category before slug fallback,
enabling version snapshots on reprocessing even when LLM generates
different slugs
- Expose content_hash in API responses and admin pipeline dashboard
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>