chrysopedia/alembic/versions
jlightner c6f69019cf feat: Content hash dedup and prior-page versioning
- Add content_hash (SHA-256 of transcript text) to source_videos (migration 005)
- 3-tier duplicate detection at ingest: exact filename, content hash,
  then normalized filename + duration (handles yt-dlp re-downloads)
- Snapshot prior technique_page_ids to Redis before pipeline dispatch
- Stage 5 matches prior pages by creator+category before slug fallback,
  enabling version snapshots on reprocessing even when LLM generates
  different slugs
- Expose content_hash in API responses and admin pipeline dashboard
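
The three-tier duplicate check described above could look roughly like the sketch below. This is a minimal illustration and not the repository's code: the `Video` dataclass, `transcript_hash`, `normalize_filename`, `find_duplicate`, the one-second duration tolerance, and the specific normalization rules (dropping the extension and a trailing bracketed ID) are all assumptions made for the example. Only the order of the tiers and the use of SHA-256 over the transcript text come from the commit message.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


def transcript_hash(text: str) -> str:
    """SHA-256 hex digest of the transcript text (64 hex characters)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Video:
    # Hypothetical stand-in for a source_videos row.
    filename: str
    transcript: str
    duration_s: float
    content_hash: str = ""

    def __post_init__(self) -> None:
        if not self.content_hash:
            self.content_hash = transcript_hash(self.transcript)


def normalize_filename(name: str) -> str:
    """Assumed normalization: strip the extension and a trailing "[id]" tag,
    lowercase, and collapse runs of non-alphanumerics to single spaces."""
    stem = re.sub(r"\.[A-Za-z0-9]+$", "", name)
    stem = re.sub(r"\s*\[[^\]]+\]\s*$", "", stem)
    return re.sub(r"[^a-z0-9]+", " ", stem.lower()).strip()


def find_duplicate(
    new: Video, existing: Iterable[Video], tolerance_s: float = 1.0
) -> Optional[Tuple[str, Video]]:
    """Return (tier_name, matching_video) for the first tier that hits."""
    rows = list(existing)
    # Tier 1: exact filename.
    for v in rows:
        if v.filename == new.filename:
            return ("filename", v)
    # Tier 2: identical transcript content hash.
    for v in rows:
        if v.content_hash == new.content_hash:
            return ("content_hash", v)
    # Tier 3: normalized filename plus near-equal duration.
    key = normalize_filename(new.filename)
    for v in rows:
        same_name = normalize_filename(v.filename) == key
        if same_name and abs(v.duration_s - new.duration_s) <= tolerance_s:
            return ("normalized_filename_duration", v)
    return None
```

Checking the cheapest and most specific signal first keeps the common case fast, while the third tier is the only one that tolerates both a renamed file and a slightly different transcript.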

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 05:55:27 -05:00
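
The Stage 5 matching order from the commit message (creator and category first, slug only as a fallback) can be sketched as follows. The `Page` dataclass, its field names, and `match_prior_page` are hypothetical, and the Redis snapshot of prior page IDs is represented here by a plain list of already-loaded pages; how the real pipeline reads that snapshot is not shown in the source.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Page:
    # Hypothetical minimal view of a technique page.
    id: int
    creator_id: int
    category: str
    slug: str


def match_prior_page(new_page: Page, prior_pages: Sequence[Page]) -> Optional[Page]:
    """Find the prior page that a regenerated page should version against."""
    # First pass: same creator and category, regardless of slug. This still
    # matches when the LLM produced a different slug on reprocessing.
    for p in prior_pages:
        if p.creator_id == new_page.creator_id and p.category == new_page.category:
            return p
    # Fallback: identical slug.
    for p in prior_pages:
        if p.slug == new_page.slug:
            return p
    return None


prior = [
    Page(1, 7, "mixing", "old-slug"),
    Page(2, 7, "sound-design", "bass-basics"),
]
# Slug changed, but creator + category still identify the prior page.
matched = match_prior_page(Page(0, 7, "mixing", "new-slug"), prior)
```

A matched prior page is what would allow a version snapshot to be taken before the new content overwrites it. This sketch returns the first creator and category hit, which assumes at most one prior page per creator and category pair.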
001_initial.py fix: Created SQLAlchemy models for all 7 entities, Alembic async migrat… 2026-03-29 21:48:36 +00:00
002_technique_page_versions.py feat: Added TechniquePageVersion model, Alembic migration 002, pipeline… 2026-03-30 07:27:40 +00:00
003_content_reports.py feat: Content issue reporting — submit from technique pages, manage in admin reports page 2026-03-30 02:53:56 -05:00
004_pipeline_events.py feat: Pipeline events, admin dashboard, and version switcher UI 2026-03-30 05:55:07 -05:00
005_content_hash.py feat: Content hash dedup and prior-page versioning 2026-03-30 05:55:27 -05:00