chrysopedia/alembic
jlightner c6f69019cf feat: Content hash dedup and prior-page versioning
- Add content_hash (SHA-256 of transcript text) to source_videos (migration 005)
- 3-tier duplicate detection at ingest: exact filename, content hash,
  then normalized filename + duration (handles yt-dlp re-downloads)
- Snapshot prior technique_page_ids to Redis before pipeline dispatch
- Stage 5 matches prior pages by creator+category before slug fallback,
  enabling version snapshots on reprocessing even when the LLM generates
  different slugs
- Expose content_hash in API responses and admin pipeline dashboard
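
The 3-tier check described above could be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `Video` record, `normalize_filename`, `find_duplicate`, the normalization rules, and the 1-second duration tolerance are all assumptions made for the example. Only the SHA-256-of-transcript hash and the tier order come from the commit message.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Video:
    # Hypothetical stand-in for a source_videos row.
    filename: str
    content_hash: str
    duration_s: float


def hash_transcript(text: str) -> str:
    """SHA-256 hex digest of the transcript text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def normalize_filename(name: str) -> str:
    # Assumed normalization: drop the extension, lowercase, strip
    # bracketed IDs such as "[abc123]", keep only letters and digits.
    stem = name.rsplit(".", 1)[0].lower()
    stem = re.sub(r"\[[^\]]*\]", "", stem)
    return re.sub(r"[^a-z0-9]+", "", stem)


def find_duplicate(
    new: Video,
    existing: List[Video],
    duration_tolerance_s: float = 1.0,  # assumed tolerance
) -> Optional[Tuple[Video, str]]:
    """Return (match, tier) for the first tier that hits, else None."""
    # Tier 1: exact filename.
    for v in existing:
        if v.filename == new.filename:
            return v, "filename"
    # Tier 2: identical transcript hash.
    for v in existing:
        if v.content_hash == new.content_hash:
            return v, "content_hash"
    # Tier 3: normalized filename plus near-equal duration.
    norm = normalize_filename(new.filename)
    for v in existing:
        same_name = normalize_filename(v.filename) == norm
        close = abs(v.duration_s - new.duration_s) <= duration_tolerance_s
        if same_name and close:
            return v, "normalized_filename_duration"
    return None
```

Checking the cheap exact-filename tier first and the fuzzy tier last means a looser match is only attempted when the stricter ones have already failed.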

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 05:55:27 -05:00
versions feat: Content hash dedup and prior-page versioning 2026-03-30 05:55:27 -05:00
env.py fix: alembic env.py sys.path includes parent dir for Docker compatibility 2026-03-30 01:22:30 +00:00
script.py.mako fix: Created SQLAlchemy models for all 7 entities, Alembic async migrat… 2026-03-29 21:48:36 +00:00