feat: Created ASS subtitle generator with karaoke word-by-word highligh…
- "backend/pipeline/caption_generator.py" - "backend/pipeline/shorts_generator.py" - "backend/pipeline/stages.py" - "backend/models.py" - "alembic/versions/027_add_captions_enabled.py" - "backend/pipeline/test_caption_generator.py" GSD-Task: S04/T01
Parent: 18e9a4dce1
Commit: 125983588d

16 changed files with 997 additions and 4 deletions
@@ -8,7 +8,7 @@ Shorts pipeline goes end-to-end with captioning and templates. Player gets key m

| ID | Slice | Effort | Depends On | Status | Outcome |
|----|-------|--------|------------|--------|---------|
| S01 | [A] Shorts Publishing Flow | medium | — | ✅ | Creator approves a short → it renders → gets a shareable URL and embed code |
| S02 | [A] Key Moment Pins on Player Timeline | low | — | ✅ | Key technique moments appear as clickable pins on the player timeline |
| S03 | [A] Embed Support (iframe Snippet) | low | — | ⬜ → ✅ | Creators can copy an iframe embed snippet to put the player on their own site |
| S04 | [B] Auto-Captioning + Template System | medium | — | ⬜ | Shorts have Whisper-generated animated subtitles and creator-configurable intro/outro cards |
| S05 | [B] Citation UX Improvements | low | — | ⬜ | Chat citations show timestamp links that seek the player and source cards with video thumbnails |
| S06 | Forgejo KB Update — Shorts, Embed, Citations | low | S01, S02, S03, S04, S05 | ⬜ | Forgejo wiki updated with shorts pipeline, embed system, citation architecture |
.gsd/milestones/M024/slices/S03/S03-SUMMARY.md (new file, 95 lines)

@@ -0,0 +1,95 @@
---
id: S03
parent: M024
milestone: M024
provides:
  - EmbedPlayer page at /embed/:videoId
  - Shared copyToClipboard utility
  - Copy Embed Code button on WatchPage
requires: []
affects:
  - S06
key_files:
  - frontend/src/utils/clipboard.ts
  - frontend/src/pages/EmbedPlayer.tsx
  - frontend/src/pages/EmbedPlayer.module.css
  - frontend/src/pages/ShortPlayer.tsx
  - frontend/src/App.tsx
  - frontend/src/pages/WatchPage.tsx
  - frontend/src/App.css
key_decisions:
  - Embed route rendered at top-level Routes before AppShell fallback for chrome-free iframe rendering
  - Audio-only embeds use height 120 vs 405 for video in generated snippet
  - Branding link opens origin in new tab with noopener for iframe safety
  - copyToClipboard extracted to shared utility for reuse across ShortPlayer and WatchPage
patterns_established:
  - Top-level Routes in App.tsx for chrome-free pages that skip AppShell (header/nav/footer)
observability_surfaces:
  - none
drill_down_paths:
  - .gsd/milestones/M024/slices/S03/tasks/T01-SUMMARY.md
  - .gsd/milestones/M024/slices/S03/tasks/T02-SUMMARY.md
duration: ""
verification_result: passed
completed_at: 2026-04-04T11:00:25.948Z
blocker_discovered: false
---
# S03: [A] Embed Support (iframe Snippet)

**Creators can copy an iframe embed snippet from WatchPage, and /embed/:videoId renders a chrome-free player suitable for iframe embedding.**

## What Happened

Built the embed support feature in two tasks. T01 extracted copyToClipboard from ShortPlayer into a shared utility at `frontend/src/utils/clipboard.ts`, then created the EmbedPlayer page (`EmbedPlayer.tsx` + `EmbedPlayer.module.css`) — a full-viewport dark-background player that fetches video detail and renders either VideoPlayer or AudioWaveform based on content type, with a small "Powered by Chrysopedia" branding link.

T02 wired the `/embed/:videoId` route at the top level of App.tsx's Routes (before the AppShell catch-all), so embed pages render without header/nav/footer. Added a "Copy Embed Code" button to WatchPage's header that generates an iframe snippet with audio-aware height (120px for audio-only, 405px for video) and shows 2-second "Copied!" feedback. The EmbedPlayer chunk is code-split via React.lazy.

Both `tsc --noEmit` and `npm run build` pass cleanly. The embed route is isolated from the app shell, the clipboard utility is shared, and the iframe snippet includes correct dimensions per content type.
## Verification

- `cd frontend && npx tsc --noEmit` — exit 0, zero type errors
- `cd frontend && npm run build` — exit 0, EmbedPlayer code-split into its own chunk, 190 modules transformed
- Files confirmed: clipboard.ts, EmbedPlayer.tsx, EmbedPlayer.module.css all present
- App.tsx has the /embed/:videoId route before the AppShell catch-all (line 233)
- WatchPage.tsx generates the iframe snippet with audio-aware height (line 35)
## Requirements Advanced

None.

## Requirements Validated

None.

## New Requirements Surfaced

None.

## Requirements Invalidated or Re-scoped

None.

## Deviations

Added .watch-page__header-top flex container in WatchPage for title/button layout — minor structural addition not in plan.
## Known Limitations

None.

## Follow-ups

None.

## Files Created/Modified

- `frontend/src/utils/clipboard.ts` — New shared copyToClipboard utility extracted from ShortPlayer
- `frontend/src/pages/EmbedPlayer.tsx` — New chrome-free embed player page with video/audio support
- `frontend/src/pages/EmbedPlayer.module.css` — Full-viewport dark layout styles for embed player
- `frontend/src/pages/ShortPlayer.tsx` — Updated to import copyToClipboard from shared utility
- `frontend/src/App.tsx` — Added /embed/:videoId route before AppShell catch-all
- `frontend/src/pages/WatchPage.tsx` — Added Copy Embed Code button with audio-aware iframe snippet
- `frontend/src/App.css` — Added styles for embed copy button
.gsd/milestones/M024/slices/S03/S03-UAT.md (new file, 52 lines)

@@ -0,0 +1,52 @@
# S03: [A] Embed Support (iframe Snippet) — UAT

**Milestone:** M024
**Written:** 2026-04-04T11:00:25.948Z

## UAT: Embed Support (iframe Snippet)

### Preconditions

- Chrysopedia frontend running (ub01:8096 or local dev server)
- At least one video-type and one audio-only source video in the database
### Test 1: Copy Embed Code — Video Content

1. Navigate to a WatchPage for a video that has `video_url` (e.g., `/watch/{videoId}`)
2. Locate the "Copy Embed Code" button in the page header
3. Click the button
4. **Expected:** Button text changes to "Copied!" for ~2 seconds, then reverts to "Copy Embed Code"
5. Paste clipboard contents into a text editor
6. **Expected:** Clipboard contains `<iframe src="http://{origin}/embed/{videoId}" width="720" height="405" frameborder="0" allowfullscreen></iframe>`

### Test 2: Copy Embed Code — Audio-Only Content

1. Navigate to a WatchPage for an audio-only source video (no `video_url`)
2. Click "Copy Embed Code"
3. Paste clipboard contents
4. **Expected:** iframe height is `120` (not 405): `<iframe src="..." width="720" height="120" ...></iframe>`
### Test 3: Embed Route — Chrome-Free Video

1. Navigate directly to `/embed/{videoId}` for a video source
2. **Expected:** Full-viewport dark background, video player fills the space, no site header/nav/footer
3. **Expected:** Small "Powered by Chrysopedia" link at the bottom
4. **Expected:** Player controls (play/pause, seek, volume) are functional

### Test 4: Embed Route — Chrome-Free Audio

1. Navigate to `/embed/{videoId}` for an audio-only source
2. **Expected:** Audio waveform or audio player renders instead of the video player
3. **Expected:** Same chrome-free layout with branding link
### Test 5: Embed Route with Start Time

1. Navigate to `/embed/{videoId}?t=30`
2. **Expected:** Player starts at or seeks to the 30-second mark

### Test 6: Embed Route — Invalid Video ID

1. Navigate to `/embed/nonexistent-id`
2. **Expected:** Error state displayed (not a blank page or crash)
### Test 7: iframe Integration

1. Create a local HTML file with the copied iframe snippet
2. Open it in a browser
3. **Expected:** Chrysopedia player loads inside the iframe, video plays, no app chrome visible

### Edge Cases

- **Rapid clicks on Copy Embed Code:** Should not stack timeouts or cause flickering — button stays "Copied!" and the timer resets
- **Narrow viewport in iframe:** Embed player should be responsive, scaling to container width
.gsd/milestones/M024/slices/S03/tasks/T02-VERIFY.json (new file, 16 lines)

@@ -0,0 +1,16 @@
{
    "schemaVersion": 1,
    "taskId": "T02",
    "unitId": "M024/S03/T02",
    "timestamp": 1775300354664,
    "passed": true,
    "discoverySource": "task-plan",
    "checks": [
        {
            "command": "cd frontend",
            "exitCode": 0,
            "durationMs": 14,
            "verdict": "pass"
        }
    ]
}
@@ -1,6 +1,75 @@

# S04: [B] Auto-Captioning + Template System

- **Goal:** Add auto-captioning and template system to shorts pipeline
+ **Goal:** Shorts have Whisper-generated animated subtitles and creator-configurable intro/outro cards

**Demo:** After this: Shorts have Whisper-generated animated subtitles and creator-configurable intro/outro cards

## Tasks
- [x] **T01: Created ASS subtitle generator with karaoke word-by-word highlighting and wired it into the shorts generation stage with non-blocking caption enrichment** — Create `caption_generator.py` that converts word-level timings into ASS (Advanced SubStation Alpha) subtitle format with word-by-word karaoke highlighting. Modify `shorts_generator.py` to accept an optional ASS file path and chain the `ass=` filter into the ffmpeg `-vf` string. Wire transcript loading and caption generation into `stage_generate_shorts` in `stages.py`. Add `captions_enabled` boolean column to `GeneratedShort` model. Write unit tests for caption generation.

  Steps:
  1. Read `backend/pipeline/highlight_scorer.py` for the `extract_word_timings` signature and output format. Read `backend/pipeline/shorts_generator.py` for `extract_clip` and `PRESETS`. Read `backend/pipeline/stages.py:2869-2990` for the `stage_generate_shorts` flow.
  2. Create `backend/pipeline/caption_generator.py`:
     - `generate_ass_captions(word_timings: list[dict], clip_start: float, style_config: dict | None = None) -> str` — returns ASS file content as a string. Each word gets a `Dialogue` line. Use `{\k}` karaoke tags for word-by-word highlight timing. Style: bold white text, centered bottom 15%, black outline. Offset all word times by `-clip_start` to make them clip-relative.
     - `write_ass_file(ass_content: str, output_path: Path) -> Path` — writes to disk, returns the path.
  3. Modify `extract_clip()` in `shorts_generator.py`: add an optional `ass_path: Path | None = None` parameter. When provided, append `,ass={ass_path}` to the `vf_filter` string before passing it to ffmpeg. Ensure the ASS filter comes after the scale/pad filters.
  4. Add `captions_enabled: Mapped[bool] = mapped_column(default=False, server_default='false')` to `GeneratedShort` in `models.py`.
  5. Create Alembic migration `027_add_captions_enabled.py` for the new column.
  6. Modify `stage_generate_shorts` in `stages.py`:
     - After loading the highlight, load `source_video.transcript_path` and parse the transcript JSON (reuse the pattern from line ~2465).
     - Call `extract_word_timings(transcript_data, clip_start, clip_end)` to get word timings for the clip window.
     - If word timings are non-empty, call `generate_ass_captions()` and `write_ass_file()` to a temp path.
     - Pass the ASS path to `extract_clip()`. Set `short.captions_enabled = True`.
     - If word timings are empty, log a warning and proceed without captions.
  7. Create `backend/pipeline/test_caption_generator.py` with tests:
     - Valid word timings → correct ASS output with proper timing math
     - Empty word timings → empty ASS (or raise, depending on design)
     - Clip offset applied correctly (a word at t=10.5 with clip_start=10.0 becomes t=0.5)
     - ASS format structure (header, style block, dialogue lines)
  8. Run tests: `cd backend && python -m pytest pipeline/test_caption_generator.py -v`

  - Estimate: 2h
  - Files: backend/pipeline/caption_generator.py, backend/pipeline/shorts_generator.py, backend/pipeline/stages.py, backend/models.py, alembic/versions/027_add_captions_enabled.py, backend/pipeline/test_caption_generator.py
  - Verify: `cd backend && python -m pytest pipeline/test_caption_generator.py -v && python -c "from pipeline.caption_generator import generate_ass_captions; print('import ok')"`
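A minimal sketch of what step 2's two functions might look like. This is an illustration, not the final implementation: the header values (resolution, font, margin), the one-Dialogue-line-per-word layout, and the omission of `style_config` handling are all assumptions here.

```python
from pathlib import Path

# Illustrative style header: 1080x1920 canvas, bold white text near the
# bottom of the frame. Real values would come from style_config.
ASS_HEADER = """[Script Info]
ScriptType: v4.00+
PlayResX: 1080
PlayResY: 1920

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, OutlineColour, Bold, Alignment, MarginV
Style: Karaoke,Arial,72,&H00FFFFFF,&H00000000,-1,2,288

[Events]
Format: Layer, Start, End, Style, Text
"""


def _ass_time(secs: float) -> str:
    # ASS timestamps are H:MM:SS.cc (centisecond precision).
    cs = max(0, round(secs * 100))
    h, rem = divmod(cs, 360_000)
    m, rem = divmod(rem, 6_000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"


def generate_ass_captions(word_timings: list[dict], clip_start: float) -> str:
    # Offset absolute Whisper times by -clip_start so they are clip-relative,
    # then emit one Dialogue line per word carrying a {\k} duration tag
    # ({\k} durations are measured in centiseconds).
    lines = [ASS_HEADER]
    for w in word_timings:
        start = w["start"] - clip_start
        end = w["end"] - clip_start
        dur_cs = max(1, round((end - start) * 100))
        lines.append(
            f"Dialogue: 0,{_ass_time(start)},{_ass_time(end)},Karaoke,"
            f"{{\\k{dur_cs}}}{w['word'].strip()}"
        )
    return "\n".join(lines) + "\n"


def write_ass_file(ass_content: str, output_path: Path) -> Path:
    output_path.write_text(ass_content, encoding="utf-8")
    return output_path
```

The timing math matches the test case in step 7: a word at t=10.5 with clip_start=10.0 lands at 0:00:00.50 in the clip.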
- [ ] **T02: Build card renderer and concat pipeline for intro/outro templates** — Create `card_renderer.py` that generates intro/outro card video segments using ffmpeg lavfi (color + drawtext). Add `shorts_template` JSONB column to Creator model. Implement ffmpeg concat demuxer logic to assemble intro + main clip + outro into the final short. Wire into `stage_generate_shorts`. Write unit tests for the card renderer.

  Steps:
  1. Read T01 outputs: `backend/pipeline/caption_generator.py`, modified `shorts_generator.py` and `stages.py`.
  2. Add `shorts_template: Mapped[dict | None] = mapped_column(JSONB, nullable=True)` to the `Creator` model in `models.py`. Create Alembic migration `028_add_shorts_template.py`.
  3. Create `backend/pipeline/card_renderer.py`:
     - `render_card(text: str, duration_secs: float, width: int, height: int, accent_color: str = '#22d3ee', font_family: str = 'Inter') -> list[str]` — returns ffmpeg command args that generate a card mp4 from lavfi input (`color=c=black:s={w}x{h}:d={dur}` with `drawtext` for centered text, accent color underline/glow).
     - `render_card_to_file(text: str, duration_secs: float, width: int, height: int, output_path: Path, accent_color: str = '#22d3ee', font_family: str = 'Inter') -> Path` — executes the ffmpeg command, returns the output path.
     - `concat_segments(segments: list[Path], output_path: Path) -> Path` — writes a concat demuxer list file, runs `ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp4`, returns the output path. All segments must share codec settings.
  4. Modify `shorts_generator.py`: add `extract_clip_with_template(input_path, output_path, start_secs, end_secs, vf_filter, ass_path=None, intro_path=None, outro_path=None) -> None` that extracts the main clip (with optional captions), then, if intro/outro paths are provided, concats them via `concat_segments()`.
  5. Modify `stage_generate_shorts` in `stages.py`:
     - After loading the highlight, also load `highlight.source_video.creator` to access `creator.shorts_template`.
     - If `shorts_template` exists and `show_intro` is true, call `render_card_to_file()` for the intro. Same for the outro.
     - Pass intro/outro paths to the clip extraction. Use codec-compatible settings (libx264, aac, same resolution from the preset spec).
     - If no template, proceed without cards (existing behavior preserved).
  6. Create `backend/pipeline/test_card_renderer.py` with tests:
     - `render_card()` returns a valid ffmpeg command with correct dimensions and duration
     - `concat_segments()` generates correct concat list file content
     - Template config parsing handles missing/partial fields with defaults
  7. Run tests: `cd backend && python -m pytest pipeline/test_card_renderer.py -v`

  - Estimate: 2h
  - Files: backend/pipeline/card_renderer.py, backend/pipeline/shorts_generator.py, backend/pipeline/stages.py, backend/models.py, alembic/versions/028_add_shorts_template.py, backend/pipeline/test_card_renderer.py
  - Verify: `cd backend && python -m pytest pipeline/test_card_renderer.py -v && python -c "from pipeline.card_renderer import render_card, concat_segments; print('import ok')"`
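The concat logic in step 3 can be sketched as below. `build_concat_list` is a hypothetical helper split out so the demuxer list content stays unit-testable without invoking ffmpeg, which also covers the "correct concat list file content" test in step 6.

```python
import subprocess
from pathlib import Path


def build_concat_list(segments: list[Path]) -> str:
    # One "file '<path>'" entry per segment, in playback order; this is
    # the format the concat demuxer's list file expects.
    return "".join(f"file '{p.as_posix()}'\n" for p in segments)


def concat_segments(segments: list[Path], output_path: Path) -> Path:
    # -c copy stream-copies without re-encoding, which is why all segments
    # must already share codec, resolution, and timebase settings.
    list_file = output_path.with_suffix(".txt")
    list_file.write_text(build_concat_list(segments), encoding="utf-8")
    subprocess.run(
        ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
         "-i", str(list_file), "-c", "copy", str(output_path)],
        check=True,
    )
    return output_path
```

`-safe 0` is needed because the list file references absolute paths, which the demuxer otherwise rejects.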
- [ ] **T03: Template API endpoints and frontend template config UI** — Add REST endpoints for reading and updating creator shorts template config. Add template configuration UI to the HighlightQueue page — color picker, text inputs, duration controls, and intro/outro toggles. Add a caption toggle to the short generation flow.

  Steps:
  1. Read T02 outputs to understand the shorts_template schema on the Creator model.
  2. Create or extend `backend/routers/creators.py` with two endpoints:
     - `GET /api/v1/admin/creators/{creator_id}/shorts-template` — returns the current `shorts_template` JSONB, or the default config if null.
     - `PUT /api/v1/admin/creators/{creator_id}/shorts-template` — validates and saves template config. Pydantic schema: `ShortsTemplateUpdate` with fields: intro_text (str, max 100), outro_text (str, max 100), accent_color (str, hex pattern), font_family (str), intro_duration_secs (float, 1.0-5.0), outro_duration_secs (float, 1.0-5.0), show_intro (bool), show_outro (bool).
     - Both endpoints require admin auth.
  3. Add `ShortsTemplateConfig` and `ShortsTemplateUpdate` Pydantic schemas to `backend/schemas.py`.
  4. Create `frontend/src/api/templates.ts` — API client functions: `fetchShortsTemplate(creatorId)`, `updateShortsTemplate(creatorId, config)`.
  5. Add template config UI to `frontend/src/pages/HighlightQueue.tsx`:
     - A collapsible "Shorts Template" section in the sidebar or above the queue.
     - Fields: intro text, outro text, accent color (HTML color input), intro/outro duration sliders (1-5s), show intro/outro toggles.
     - Save button that calls `updateShortsTemplate()`.
     - Load the current template on mount when a creator is selected.
  6. Add a "Captions" toggle checkbox to the short generation trigger in HighlightQueue — when unchecked, pass a `captions=false` query param to the generate endpoint. Update the `POST /api/v1/admin/highlights/{id}/generate-shorts` handler (in `backend/routers/creator_highlights.py` or similar) to accept and forward an optional `captions` param.
  7. Verify frontend builds: `cd frontend && npm run build`
  8. Verify API imports: `cd backend && python -c "from routers.creators import router; print('ok')"`

  - Estimate: 2h
  - Files: backend/routers/creators.py, backend/schemas.py, frontend/src/api/templates.ts, frontend/src/pages/HighlightQueue.tsx, frontend/src/pages/HighlightQueue.module.css, backend/routers/creator_highlights.py
  - Verify: `cd frontend && npm run build 2>&1 | tail -5 && cd ../backend && python -c "from routers.creators import router; print('ok')"`
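The validation rules that T03's `ShortsTemplateUpdate` Pydantic schema would enforce can be prototyped as a plain merge-and-check helper. This is a dependency-free stand-in sketch, not the planned schema itself: field names and bounds follow the plan, while the defaults (text, show flags) are illustrative.

```python
import re

# Defaults loosely mirror the template schema from the research notes;
# the empty texts and False show flags are placeholder choices.
TEMPLATE_DEFAULTS = {
    "intro_text": "",
    "outro_text": "",
    "accent_color": "#22d3ee",
    "font_family": "Inter",
    "intro_duration_secs": 2.0,
    "outro_duration_secs": 3.0,
    "show_intro": False,
    "show_outro": False,
}

_HEX = re.compile(r"^#[0-9a-fA-F]{6}$")


def validate_template(payload: dict) -> dict:
    # Merge partial payloads over defaults, then enforce the plan's bounds:
    # texts <= 100 chars, hex accent colour, durations within 1.0-5.0s.
    cfg = {**TEMPLATE_DEFAULTS, **payload}
    for key in ("intro_text", "outro_text"):
        if len(cfg[key]) > 100:
            raise ValueError(f"{key} exceeds 100 characters")
    if not _HEX.match(cfg["accent_color"]):
        raise ValueError("accent_color must be a #rrggbb hex value")
    for key in ("intro_duration_secs", "outro_duration_secs"):
        if not 1.0 <= float(cfg[key]) <= 5.0:
            raise ValueError(f"{key} must be between 1.0 and 5.0 seconds")
    return cfg
```

Merging over defaults also gives the GET endpoint its "default config if null" behavior for free.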
.gsd/milestones/M024/slices/S04/S04-RESEARCH.md (new file, 108 lines)

@@ -0,0 +1,108 @@
# S04 Research: Auto-Captioning + Template System

## Summary

Add Whisper-derived animated subtitles to generated shorts and a creator-configurable intro/outro card system. Medium complexity — captioning uses the established ASS format burned via ffmpeg (a well-documented pattern), the template system needs schema + UI + ffmpeg concat.

## Recommendation

Build captioning first (riskiest, validates the word-timing → ASS → ffmpeg pipeline), then the template system (straightforward concat). Keep both features optional per-short — existing shorts should still generate without captions/templates if data is missing.
## Implementation Landscape

### Current State

1. **Shorts generation pipeline** (`backend/pipeline/stages.py:2869–3055`): the `stage_generate_shorts` Celery task extracts clips via `extract_clip()` from `shorts_generator.py`, uploads to MinIO, and creates `GeneratedShort` DB rows. Currently uses a simple `-vf` filter for scaling/padding only.
2. **Word-level timing data**: Whisper transcripts stored on disk at `settings.transcript_storage_path` contain per-word `{word, start, end}` dicts inside each segment. The function `extract_word_timings()` in `highlight_scorer.py:186` already extracts words for a `[start_time, end_time]` window — this is the exact input needed for subtitle generation.
3. **Transcript loading pattern** (`stages.py:2465–2485`): the stage loads transcript JSON from `source_video.transcript_path` and parses segments. The same pattern is reusable in `stage_generate_shorts`.
4. **Creator model** (`models.py:122`): has `personality_profile` JSONB but no template/branding fields. No intro/outro configuration exists.
5. **GeneratedShort model** (`models.py:836`): has format_preset, minio_object_key, dimensions, status. No caption or template metadata columns.
6. **Frontend**: `ShortPlayer.tsx` is a basic `<video>` player. `HighlightQueue.tsx` has the short generation trigger UI. Neither has caption/template controls.
### What Needs Building

#### Part 1: Auto-Captioning

- **ASS subtitle generator** (`backend/pipeline/caption_generator.py` — new): Takes word timings + clip offset → generates an ASS (Advanced SubStation Alpha) format file with animated word-by-word highlighting. ASS chosen over SRT because it supports the styling (font, color, position, animation) needed for social-media-style captions.
- **ffmpeg filter chain update** (`shorts_generator.py`): Add the `ass=` filter to the existing `-vf` chain. The current filter is e.g. `scale=1080:-2,pad=1080:1920:(ow-iw)/2:(oh-ih)/2:black` → becomes `scale=1080:-2,pad=1080:1920:(ow-iw)/2:(oh-ih)/2:black,ass=/tmp/captions.ass`.
- **Transcript loading in stage_generate_shorts**: Reuse the transcript loading pattern from the highlight detection stage. Load the transcript JSON, call `extract_word_timings(data, clip_start, clip_end)` to get word-level timing for the clip window. Offset all word times by `-clip_start` so they're relative to clip start.
- **No new DB columns needed for basic captioning** — captions are burned into the video file. Optional: add `captions_enabled: bool` to GeneratedShort for tracking.
#### Part 2: Template System (Intro/Outro Cards)

- **DB schema**: Add a `shorts_template` JSONB column to the `Creator` model (new Alembic migration). Schema:

  ```json
  {
    "intro_text": "CreatorName presents",
    "outro_text": "Follow for more",
    "accent_color": "#22d3ee",
    "font_family": "Inter",
    "intro_duration_secs": 2.0,
    "outro_duration_secs": 3.0,
    "show_intro": true,
    "show_outro": true
  }
  ```

- **Card renderer** (`backend/pipeline/card_renderer.py` — new): Generate intro/outro card video segments using ffmpeg's `lavfi` input (color + drawtext filter). Each card is a short standalone mp4 clip. Then concat all three pieces (intro + main clip + outro) using the ffmpeg concat demuxer.
- **API endpoint**: `PUT /api/v1/admin/creators/{creator_id}/shorts-template` to save template config. `GET` to retrieve the current config.
- **Frontend**: Template config UI on the creator dashboard / highlight queue. Color picker, text inputs, toggles for intro/outro.
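A sketch of the card renderer's ffmpeg invocation, under stated assumptions: the helper name is hypothetical, the silent `anullsrc` audio track is an addition so the card's streams line up with the main clip for concat, and real text would need drawtext escaping (quotes, colons).

```python
def render_card_args(text: str, duration_secs: float, width: int, height: int,
                     output_path: str) -> list[str]:
    # Solid-colour lavfi video source with centred drawtext, plus a silent
    # audio track, encoded with the same codecs as the main clip so the
    # concat demuxer can stream-copy all segments.
    drawtext = (
        f"drawtext=text='{text}':fontcolor=white:fontsize=64:"
        "x=(w-text_w)/2:y=(h-text_h)/2"
    )
    return [
        "ffmpeg", "-y",
        "-f", "lavfi", "-i", f"color=c=black:s={width}x{height}:d={duration_secs}",
        "-f", "lavfi", "-i", "anullsrc=r=48000:cl=stereo",
        "-t", str(duration_secs),
        "-vf", drawtext,
        "-c:v", "libx264", "-c:a", "aac", "-shortest",
        output_path,
    ]
```

Returning the arg list rather than running ffmpeg keeps the command itself unit-testable, which matches the verification strategy below.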
### Key Technical Details

**ASS subtitle format for word-by-word animation**: Each word gets its own dialogue line with `{\kN}` karaoke timing tags. The `\k` tag specifies the duration in centiseconds before the next word highlights. The style block defines the font (bold, white, centered at bottom 15%), with `\3c` for outline color. ffmpeg's `ass` filter (part of libass) renders this natively.

**Time offset for captions**: Word timings from Whisper are absolute (relative to video start). For a clip starting at `clip_start`, subtract `clip_start` from each word's start/end to get clip-relative times. If an intro card is prepended, further offset by `intro_duration_secs`.
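The two offsets above compose into one small transform (a sketch; the dict keys assume the Whisper per-word `{word, start, end}` shape described earlier):

```python
def offset_word_timings(
    words: list[dict], clip_start: float, intro_secs: float = 0.0
) -> list[dict]:
    # Subtract clip_start to convert absolute Whisper times into
    # clip-relative times, then shift forward by the intro card duration
    # when an intro is prepended before the main clip.
    return [
        {
            "word": w["word"],
            "start": w["start"] - clip_start + intro_secs,
            "end": w["end"] - clip_start + intro_secs,
        }
        for w in words
    ]
```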
**ffmpeg concat approach for templates**: Use the concat demuxer (`-f concat -safe 0 -i list.txt`) with a text file listing intro.mp4, main.mp4, outro.mp4. All three must share codec settings (h264, aac, same resolution). The card renderer must output clips matching the preset dimensions.

**Fallback behavior**: If the transcript has no word-level timings for the clip window (empty `words` arrays), skip captioning and generate without subtitles. If the creator has no `shorts_template`, skip intro/outro. Both features are additive — the base clip extraction path is unchanged.
### File Inventory

| File | Role / Changes |
|------|----------------|
| `backend/pipeline/caption_generator.py` | NEW — ASS subtitle generation from word timings |
| `backend/pipeline/card_renderer.py` | NEW — ffmpeg-based intro/outro card generation |
| `backend/pipeline/shorts_generator.py` | Modify `extract_clip()` or add `extract_clip_with_captions()` to chain the ASS filter |
| `backend/pipeline/stages.py` | Modify `stage_generate_shorts` to load the transcript, generate captions, handle templates |
| `backend/models.py` | Add `shorts_template` JSONB to Creator, optionally `captions_enabled` to GeneratedShort |
| `alembic/versions/027_*.py` | NEW — migration for the shorts_template column |
| `backend/routers/creators.py` or new router | Template config GET/PUT endpoints |
| `frontend/src/pages/HighlightQueue.tsx` | Caption toggle, template preview |
| `frontend/src/api/creators.ts` or similar | API client for template config |
### Natural Task Seams

1. **Caption generator + ffmpeg integration** (riskiest — validate that ASS generation and ffmpeg burning work): Build `caption_generator.py`, modify `shorts_generator.py` to accept an optional ASS path, wire into `stage_generate_shorts` with transcript loading. Test with a real clip.
2. **Template schema + card renderer** (medium risk — ffmpeg concat is well-documented but needs resolution matching): DB migration, `card_renderer.py`, concat logic in the stage.
3. **Template API + frontend** (low risk — standard CRUD + form UI): REST endpoints, creator dashboard UI for template configuration.
### Verification Strategy

- Unit test `caption_generator.py`: given word timings + offset → produces valid ASS content with correct timing math
- Unit test `card_renderer.py`: given template config + preset → produces the correct ffmpeg command
- Integration: `stage_generate_shorts` with mock transcript data → verify the ASS file is created and ffmpeg is called with the subtitle filter
- Frontend build passes with zero TypeScript errors
- Manual: generate a short for a highlight with word-level timing data → verify captions appear in the output video
### Risks

1. **ffmpeg ASS filter requires libass**: The Docker image must have libass compiled into ffmpeg. Standard `ffmpeg` packages on Debian/Ubuntu include it, but verify with `ffmpeg -filters | grep ass` inside the container.
2. **Word timing gaps**: Some Whisper transcripts may have segments without word-level timings (older transcriptions). Fall back to segment-level timing (less granular but still useful) or skip captions entirely.
3. **Concat codec mismatch**: Intro/outro cards must exactly match the main clip's codec settings. Use identical ffmpeg encoding params for all three segments.
### Don't Hand-Roll

- **ASS format**: Use the standard ASS spec — don't invent a custom subtitle format. ASS is the native format for ffmpeg's `ass` filter.
- **Video concatenation**: Use ffmpeg's concat demuxer, not manual frame-by-frame stitching or Python video libraries.
### Skill Discovery

No additional professional agent skills needed. The work is Python + ffmpeg (both well-understood in this codebase) and standard React form UI. The `whisper` skill in available_skills is for transcription, not subtitle generation — not relevant here since we're consuming existing transcript data.
.gsd/milestones/M024/slices/S04/tasks/T01-PLAN.md (new file, 50 lines)

@@ -0,0 +1,50 @@
|
---
estimated_steps: 21
estimated_files: 6
skills_used: []
---

# T01: Build caption generator and wire into shorts pipeline

Create `caption_generator.py` that converts word-level timings into ASS (Advanced SubStation Alpha) subtitle format with word-by-word karaoke highlighting. Modify `shorts_generator.py` to accept an optional ASS file path and chain the `ass=` filter into the ffmpeg `-vf` string. Wire transcript loading and caption generation into `stage_generate_shorts` in `stages.py`. Add a `captions_enabled` boolean column to the `GeneratedShort` model. Write unit tests for caption generation.

Steps:
1. Read `backend/pipeline/highlight_scorer.py` for the `extract_word_timings` signature and output format. Read `backend/pipeline/shorts_generator.py` for `extract_clip` and `PRESETS`. Read `backend/pipeline/stages.py:2869-2990` for the `stage_generate_shorts` flow.
2. Create `backend/pipeline/caption_generator.py`:
   - `generate_ass_captions(word_timings: list[dict], clip_start: float, style_config: dict | None = None) -> str` — returns ASS file content as a string. Each word gets a `Dialogue` line. Use `{\k}` karaoke tags for word-by-word highlight timing. Style: bold white text, centered bottom 15%, black outline. Offset all word times by `-clip_start` to make them clip-relative.
   - `write_ass_file(ass_content: str, output_path: Path) -> Path` — writes to disk, returns the path.
3. Modify `extract_clip()` in `shorts_generator.py`: add an optional `ass_path: Path | None = None` parameter. When provided, append `,ass={ass_path}` to the `vf_filter` string before passing to ffmpeg. Ensure the ASS filter comes after scale/pad filters.
4. Add `captions_enabled: Mapped[bool] = mapped_column(default=False, server_default='false')` to `GeneratedShort` in `models.py`.
5. Create Alembic migration `027_add_captions_enabled.py` for the new column.
6. Modify `stage_generate_shorts` in `stages.py`:
   - After loading the highlight, load `source_video.transcript_path` and parse the transcript JSON (reuse the pattern from line ~2465).
   - Call `extract_word_timings(transcript_data, clip_start, clip_end)` to get word timings for the clip window.
   - If word timings are non-empty, call `generate_ass_captions()` and `write_ass_file()` to a temp path.
   - Pass the ASS path to `extract_clip()`. Set `short.captions_enabled = True`.
   - If word timings are empty, log a warning and proceed without captions.
7. Create `backend/pipeline/test_caption_generator.py` with tests:
   - Valid word timings → correct ASS output with proper timing math
   - Empty word timings → empty ASS (or raise, depending on design)
   - Clip offset applied correctly (word at t=10.5 with clip_start=10.0 becomes t=0.5)
   - ASS format structure (header, style block, dialogue lines)
8. Run tests: `cd backend && python -m pytest pipeline/test_caption_generator.py -v`
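The offset and karaoke math described in step 2 can be sketched as follows (a minimal illustration of the timing rules, not the final module):

```python
def karaoke_dialogue(word: str, start: float, end: float, clip_start: float) -> str:
    """Build one ASS Dialogue line with a {\\k} karaoke tag.

    Times are made clip-relative by subtracting clip_start; the karaoke
    duration is expressed in centiseconds, as the ASS spec requires.
    """
    rel_start = max(0.0, start - clip_start)
    rel_end = max(rel_start, end - clip_start)
    k_cs = max(1, round((rel_end - rel_start) * 100))  # centiseconds

    def fmt(t: float) -> str:  # ASS timestamp: H:MM:SS.cc
        h = int(t // 3600)
        m = int((t % 3600) // 60)
        return f"{h}:{m:02d}:{t % 60:05.2f}"

    return f"Dialogue: 0,{fmt(rel_start)},{fmt(rel_end)},Default,,0,0,0,,{{\\k{k_cs}}}{word}"
```

This is exactly the arithmetic the step-7 tests exercise: a word at t=10.6–11.0 with clip_start=10.0 becomes a dialogue at 0.60–1.00 with a 40-centisecond karaoke tag.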
## Inputs

- `backend/pipeline/highlight_scorer.py` — extract_word_timings function signature and output format
- `backend/pipeline/shorts_generator.py` — extract_clip function to modify
- `backend/pipeline/stages.py` — stage_generate_shorts function to wire caption generation into
- `backend/models.py` — GeneratedShort model to add captions_enabled column

## Expected Output

- `backend/pipeline/caption_generator.py` — new module: ASS subtitle generation from word timings
- `backend/pipeline/shorts_generator.py` — modified: extract_clip accepts optional ass_path parameter
- `backend/pipeline/stages.py` — modified: stage_generate_shorts loads transcript, generates captions
- `backend/models.py` — modified: GeneratedShort has captions_enabled boolean
- `alembic/versions/027_add_captions_enabled.py` — new migration for captions_enabled column
- `backend/pipeline/test_caption_generator.py` — new: unit tests for caption generator

## Verification

`cd backend && python -m pytest pipeline/test_caption_generator.py -v && python -c "from pipeline.caption_generator import generate_ass_captions; print('import ok')"`
**.gsd/milestones/M024/slices/S04/tasks/T01-SUMMARY.md** (new file, 87 lines)
---
id: T01
parent: S04
milestone: M024
provides: []
requires: []
affects: []
key_files: ["backend/pipeline/caption_generator.py", "backend/pipeline/shorts_generator.py", "backend/pipeline/stages.py", "backend/models.py", "alembic/versions/027_add_captions_enabled.py", "backend/pipeline/test_caption_generator.py"]
key_decisions: ["ASS karaoke format with per-word Dialogue lines and \k tags", "Caption generation failures are non-blocking — shorts proceed without captions", "Single ASS file shared across all format presets"]
patterns_established: []
drill_down_paths: []
observability_surfaces: []
duration: ""
verification_result: "17 unit tests pass covering time formatting, ASS structure, clip offset math, karaoke duration calculation, empty/whitespace word handling, custom style overrides, negative time clamping, and file I/O. Import verification confirms module loads correctly."
completed_at: 2026-04-04T11:12:15.208Z
blocker_discovered: false
---

# T01: Created ASS subtitle generator with karaoke word-by-word highlighting and wired it into the shorts generation stage with non-blocking caption enrichment

> Created ASS subtitle generator with karaoke word-by-word highlighting and wired it into the shorts generation stage with non-blocking caption enrichment
## What Happened

Built caption_generator.py with generate_ass_captions() and write_ass_file() for ASS subtitle generation with \k karaoke tags. Modified extract_clip() to accept an optional ass_path parameter for subtitle burn-in via ffmpeg. Added a captions_enabled boolean to the GeneratedShort model with migration 027. Wired transcript loading and caption generation into stage_generate_shorts with non-blocking error handling — caption failures log WARNING but don't fail the stage.

## Verification

17 unit tests pass covering time formatting, ASS structure, clip offset math, karaoke duration calculation, empty/whitespace word handling, custom style overrides, negative time clamping, and file I/O. Import verification confirms the module loads correctly.

## Verification Evidence

| # | Command | Exit Code | Verdict | Duration |
|---|---------|-----------|---------|----------|
| 1 | `cd backend && python -m pytest pipeline/test_caption_generator.py -v` | 0 | ✅ pass | 20ms |
| 2 | `cd backend && python -c "from pipeline.caption_generator import generate_ass_captions; print('import ok')"` | 0 | ✅ pass | 100ms |

## Deviations

None.

## Known Issues

None.

## Files Created/Modified

- `backend/pipeline/caption_generator.py`
- `backend/pipeline/shorts_generator.py`
- `backend/pipeline/stages.py`
- `backend/models.py`
- `alembic/versions/027_add_captions_enabled.py`
- `backend/pipeline/test_caption_generator.py`
**.gsd/milestones/M024/slices/S04/tasks/T02-PLAN.md** (new file, 48 lines)
---
estimated_steps: 19
estimated_files: 6
skills_used: []
---

# T02: Build card renderer and concat pipeline for intro/outro templates

Create `card_renderer.py` that generates intro/outro card video segments using ffmpeg lavfi (color + drawtext). Add a `shorts_template` JSONB column to the Creator model. Implement ffmpeg concat demuxer logic to assemble intro + main clip + outro into the final short. Wire into `stage_generate_shorts`. Write unit tests for the card renderer.

Steps:
1. Read T01 outputs: `backend/pipeline/caption_generator.py`, modified `shorts_generator.py` and `stages.py`.
2. Add `shorts_template: Mapped[dict | None] = mapped_column(JSONB, nullable=True)` to the `Creator` model in `models.py`. Create Alembic migration `028_add_shorts_template.py`.
3. Create `backend/pipeline/card_renderer.py`:
   - `render_card(text: str, duration_secs: float, width: int, height: int, accent_color: str = '#22d3ee', font_family: str = 'Inter') -> list[str]` — returns ffmpeg command args that generate a card mp4 from lavfi input (`color=c=black:s={w}x{h}:d={dur}` with `drawtext` for centered text, accent color underline/glow).
   - `render_card_to_file(text: str, duration_secs: float, width: int, height: int, output_path: Path, accent_color: str = '#22d3ee', font_family: str = 'Inter') -> Path` — executes the ffmpeg command, returns the output path.
   - `concat_segments(segments: list[Path], output_path: Path) -> Path` — writes a concat demuxer list file, runs `ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp4`, returns the output path. All segments must share codec settings.
4. Modify `shorts_generator.py`: add `extract_clip_with_template(input_path, output_path, start_secs, end_secs, vf_filter, ass_path=None, intro_path=None, outro_path=None) -> None` that extracts the main clip (with optional captions), then, if intro/outro paths are provided, concats them via `concat_segments()`.
5. Modify `stage_generate_shorts` in `stages.py`:
   - After loading the highlight, also load `highlight.source_video.creator` to access `creator.shorts_template`.
   - If `shorts_template` exists and `show_intro` is true, call `render_card_to_file()` for the intro. Same for the outro.
   - Pass intro/outro paths to the clip extraction. Use codec-compatible settings (libx264, aac, same resolution from the preset spec).
   - If no template, proceed without cards (existing behavior preserved).
6. Create `backend/pipeline/test_card_renderer.py` with tests:
   - `render_card()` returns a valid ffmpeg command with correct dimensions and duration
   - `concat_segments()` generates correct concat list file content
   - Template config parsing handles missing/partial fields with defaults
7. Run tests: `cd backend && python -m pytest pipeline/test_card_renderer.py -v`
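A minimal sketch of what step 3's `render_card()` arg builder might look like (the drawtext options and arg order here are assumptions, not the final implementation):

```python
def render_card_args(text: str, duration_secs: float, width: int, height: int) -> list[str]:
    """Sketch: ffmpeg args that render a solid-color card with centered text.

    The caller appends the output path. Font size scales with card height.
    """
    lavfi = f"color=c=black:s={width}x{height}:d={duration_secs}"
    drawtext = (
        f"drawtext=text='{text}':fontcolor=white:fontsize={height // 16}"
        ":x=(w-text_w)/2:y=(h-text_h)/2"
    )
    return [
        "ffmpeg", "-y",
        "-f", "lavfi", "-i", lavfi,   # synthetic color source as input
        "-vf", drawtext,              # burn the card text in
        "-c:v", "libx264", "-preset", "fast", "-crf", "23",
        "-t", str(duration_secs),
    ]
```

Keeping the same `libx264`/`crf` settings as `extract_clip` is what makes the later concat (`-c copy`) safe, per the codec-mismatch risk above.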
## Inputs

- `backend/pipeline/caption_generator.py` — T01 output, needed for understanding the full pipeline flow
- `backend/pipeline/shorts_generator.py` — T01 output with ass_path support, to add template support
- `backend/pipeline/stages.py` — T01 output with transcript loading, to add template loading
- `backend/models.py` — T01 output with captions_enabled, to add shorts_template column

## Expected Output

- `backend/pipeline/card_renderer.py` — new module: ffmpeg-based intro/outro card generation + concat
- `backend/pipeline/shorts_generator.py` — modified: extract_clip_with_template function
- `backend/pipeline/stages.py` — modified: stage_generate_shorts loads creator template, renders cards
- `backend/models.py` — modified: Creator has shorts_template JSONB column
- `alembic/versions/028_add_shorts_template.py` — new migration for shorts_template column
- `backend/pipeline/test_card_renderer.py` — new: unit tests for card renderer and concat

## Verification

`cd backend && python -m pytest pipeline/test_card_renderer.py -v && python -c "from pipeline.card_renderer import render_card, concat_segments; print('import ok')"`
**.gsd/milestones/M024/slices/S04/tasks/T03-PLAN.md** (new file, 47 lines)
---
estimated_steps: 17
estimated_files: 6
skills_used: []
---

# T03: Template API endpoints and frontend template config UI

Add REST endpoints for reading and updating the creator shorts template config. Add template configuration UI to the HighlightQueue page — color picker, text inputs, duration controls, and intro/outro toggles. Add a caption toggle to the short generation flow.

Steps:
1. Read T02 outputs to understand the shorts_template schema on the Creator model.
2. Create or extend `backend/routers/creators.py` with two endpoints:
   - `GET /api/v1/admin/creators/{creator_id}/shorts-template` — returns the current `shorts_template` JSONB, or a default config if null.
   - `PUT /api/v1/admin/creators/{creator_id}/shorts-template` — validates and saves the template config. Pydantic schema: `ShortsTemplateUpdate` with fields: intro_text (str, max 100), outro_text (str, max 100), accent_color (str, hex pattern), font_family (str), intro_duration_secs (float, 1.0-5.0), outro_duration_secs (float, 1.0-5.0), show_intro (bool), show_outro (bool).
   - Both endpoints require admin auth.
3. Add `ShortsTemplateConfig` and `ShortsTemplateUpdate` Pydantic schemas to `backend/schemas.py`.
4. Create `frontend/src/api/templates.ts` — API client functions: `fetchShortsTemplate(creatorId)`, `updateShortsTemplate(creatorId, config)`.
5. Add template config UI to `frontend/src/pages/HighlightQueue.tsx`:
   - A collapsible "Shorts Template" section in the sidebar or above the queue.
   - Fields: intro text, outro text, accent color (HTML color input), intro/outro duration sliders (1-5s), show intro/outro toggles.
   - Save button that calls `updateShortsTemplate()`.
   - Load the current template on mount when a creator is selected.
6. Add a "Captions" toggle checkbox to the short generation trigger in HighlightQueue — when unchecked, pass a `captions=false` query param to the generate endpoint. Update the `POST /api/v1/admin/highlights/{id}/generate-shorts` handler (in `backend/routers/creator_highlights.py` or similar) to accept and forward an optional `captions` param.
7. Verify the frontend builds: `cd frontend && npm run build`
8. Verify API imports: `cd backend && python -c "from routers.creators import router; print('ok')"`
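The validation rules in step 2 can be sketched without Pydantic. The field names come from the plan; the concrete checks (regex, bounds) are an illustration of the stated constraints, not the final schema:

```python
import re
from dataclasses import dataclass

HEX_COLOR = re.compile(r"^#[0-9a-fA-F]{6}$")


@dataclass
class ShortsTemplateUpdateSketch:
    """Stdlib sketch of the ShortsTemplateUpdate constraints."""
    intro_text: str = ""
    outro_text: str = ""
    accent_color: str = "#22d3ee"
    font_family: str = "Inter"
    intro_duration_secs: float = 2.0
    outro_duration_secs: float = 2.0
    show_intro: bool = True
    show_outro: bool = True

    def __post_init__(self) -> None:
        if len(self.intro_text) > 100 or len(self.outro_text) > 100:
            raise ValueError("intro/outro text limited to 100 characters")
        if not HEX_COLOR.match(self.accent_color):
            raise ValueError(f"accent_color must be a hex color: {self.accent_color!r}")
        for d in (self.intro_duration_secs, self.outro_duration_secs):
            if not 1.0 <= d <= 5.0:
                raise ValueError("durations must be between 1.0 and 5.0 seconds")
```

In the real implementation these become Pydantic field constraints on `ShortsTemplateUpdate` in `backend/schemas.py`.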
## Inputs

- `backend/models.py` — T02 output with shorts_template JSONB on Creator model
- `backend/pipeline/card_renderer.py` — T02 output, understanding template schema
- `frontend/src/pages/HighlightQueue.tsx` — existing page to add template config UI
- `frontend/src/api/shorts.ts` — existing shorts API client for reference
- `backend/routers/creators.py` — existing router to add template endpoints

## Expected Output

- `backend/routers/creators.py` — modified: GET/PUT shorts-template endpoints
- `backend/schemas.py` — modified: ShortsTemplateConfig and ShortsTemplateUpdate schemas
- `frontend/src/api/templates.ts` — new: API client for template config
- `frontend/src/pages/HighlightQueue.tsx` — modified: template config UI section + caption toggle
- `frontend/src/pages/HighlightQueue.module.css` — modified: styles for template config UI
- `backend/routers/creator_highlights.py` — modified: captions param on generate-shorts endpoint

## Verification

`cd frontend && npm run build 2>&1 | tail -5 && cd ../backend && python -c "from routers.creators import router; print('ok')"`
**alembic/versions/027_add_captions_enabled.py** (new file, 30 lines)
"""Add captions_enabled boolean to generated_shorts.
|
||||||
|
|
||||||
|
Revision ID: 027_add_captions_enabled
|
||||||
|
Revises: 026_add_share_token
|
||||||
|
"""
|
||||||
|
|
||||||
|
import sqlalchemy as sa
|
||||||
|
|
||||||
|
from alembic import op
|
||||||
|
|
||||||
|
revision = "027_add_captions_enabled"
|
||||||
|
down_revision = "026_add_share_token"
|
||||||
|
branch_labels = None
|
||||||
|
depends_on = None
|
||||||
|
|
||||||
|
|
||||||
|
def upgrade() -> None:
|
||||||
|
op.add_column(
|
||||||
|
"generated_shorts",
|
||||||
|
sa.Column(
|
||||||
|
"captions_enabled",
|
||||||
|
sa.Boolean(),
|
||||||
|
nullable=False,
|
||||||
|
server_default=sa.text("false"),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def downgrade() -> None:
|
||||||
|
op.drop_column("generated_shorts", "captions_enabled")
|
||||||
|
|
```diff
@@ -867,5 +867,9 @@ class GeneratedShort(Base):
         default=_now, server_default=func.now(), onupdate=_now
     )

+    captions_enabled: Mapped[bool] = mapped_column(
+        Boolean, default=False, server_default=text("'false'"),
+    )
+
     # relationships
     highlight_candidate: Mapped[HighlightCandidate] = sa_relationship()
```
**backend/pipeline/caption_generator.py** (new file, 155 lines)
r"""ASS (Advanced SubStation Alpha) caption generator for shorts.
|
||||||
|
|
||||||
|
Converts word-level timings from Whisper transcripts into ASS subtitle
|
||||||
|
files with word-by-word karaoke highlighting. Each word gets its own
|
||||||
|
Dialogue line with {\k} tags that control highlight duration.
|
||||||
|
|
||||||
|
Pure functions — no DB access, no Celery dependency.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
# ── Default style configuration ──────────────────────────────────────────────
|
||||||
|
|
||||||
|
DEFAULT_STYLE: dict[str, Any] = {
|
||||||
|
"font_name": "Arial",
|
||||||
|
"font_size": 48,
|
||||||
|
"primary_colour": "&H00FFFFFF", # white (BGR + alpha)
|
||||||
|
"secondary_colour": "&H0000FFFF", # yellow highlight
|
||||||
|
"outline_colour": "&H00000000", # black outline
|
||||||
|
"back_colour": "&H80000000", # semi-transparent black shadow
|
||||||
|
"bold": -1, # bold
|
||||||
|
"outline": 3,
|
||||||
|
"shadow": 1,
|
||||||
|
"alignment": 2, # bottom-center
|
||||||
|
"margin_v": 60, # 60px from bottom (~15% on 1920h)
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _format_ass_time(seconds: float) -> str:
|
||||||
|
"""Convert seconds to ASS timestamp format: H:MM:SS.cc (centiseconds).
|
||||||
|
|
||||||
|
>>> _format_ass_time(65.5)
|
||||||
|
'0:01:05.50'
|
||||||
|
>>> _format_ass_time(0.0)
|
||||||
|
'0:00:00.00'
|
||||||
|
"""
|
||||||
|
if seconds < 0:
|
||||||
|
seconds = 0.0
|
||||||
|
h = int(seconds // 3600)
|
||||||
|
m = int((seconds % 3600) // 60)
|
||||||
|
s = seconds % 60
|
||||||
|
return f"{h}:{m:02d}:{s:05.2f}"
|
||||||
|
|
||||||
|
|
||||||
|
def _build_ass_header(style_config: dict[str, Any]) -> str:
|
||||||
|
"""Build ASS file header with script info and style definition."""
|
||||||
|
cfg = {**DEFAULT_STYLE, **(style_config or {})}
|
||||||
|
|
||||||
|
header = (
|
||||||
|
"[Script Info]\n"
|
||||||
|
"Title: Chrysopedia Auto-Captions\n"
|
||||||
|
"ScriptType: v4.00+\n"
|
||||||
|
"PlayResX: 1080\n"
|
||||||
|
"PlayResY: 1920\n"
|
||||||
|
"WrapStyle: 0\n"
|
||||||
|
"ScaledBorderAndShadow: yes\n"
|
||||||
|
"\n"
|
||||||
|
"[V4+ Styles]\n"
|
||||||
|
"Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, "
|
||||||
|
"OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, "
|
||||||
|
"ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, "
|
||||||
|
"Alignment, MarginL, MarginR, MarginV, Encoding\n"
|
||||||
|
f"Style: Default,{cfg['font_name']},{cfg['font_size']},"
|
||||||
|
f"{cfg['primary_colour']},{cfg['secondary_colour']},"
|
||||||
|
f"{cfg['outline_colour']},{cfg['back_colour']},"
|
||||||
|
f"{cfg['bold']},0,0,0,"
|
||||||
|
f"100,100,0,0,1,{cfg['outline']},{cfg['shadow']},"
|
||||||
|
f"{cfg['alignment']},20,20,{cfg['margin_v']},1\n"
|
||||||
|
"\n"
|
||||||
|
"[Events]\n"
|
||||||
|
"Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text\n"
|
||||||
|
)
|
||||||
|
return header
|
||||||
|
|
||||||
|
|
||||||
|
def generate_ass_captions(
|
||||||
|
word_timings: list[dict[str, Any]],
|
||||||
|
clip_start: float,
|
||||||
|
style_config: dict[str, Any] | None = None,
|
||||||
|
) -> str:
|
||||||
|
"""Generate ASS subtitle content from word-level timings.
|
||||||
|
|
||||||
|
Each word is emitted as a separate Dialogue line with karaoke timing
|
||||||
|
(``{\\k<centiseconds>}``) so that words highlight one-by-one.
|
||||||
|
|
||||||
|
All word timestamps are offset by ``-clip_start`` to make them
|
||||||
|
clip-relative (i.e. the first frame of the clip is t=0).
|
||||||
|
|
||||||
|
Parameters
|
||||||
|
----------
|
||||||
|
word_timings : list[dict]
|
||||||
|
Word-timing dicts with ``word``, ``start``, ``end`` keys.
|
||||||
|
``start`` and ``end`` are absolute times in seconds.
|
||||||
|
clip_start : float
|
||||||
|
Absolute start time of the clip in seconds. Subtracted from
|
||||||
|
all word timestamps.
|
||||||
|
style_config : dict | None
|
||||||
|
Override style parameters (merged onto DEFAULT_STYLE).
|
||||||
|
|
||||||
|
Returns
|
||||||
|
-------
|
||||||
|
str — Full ASS file content. Empty dialogue section if no timings.
|
||||||
|
"""
|
||||||
|
header = _build_ass_header(style_config)
|
||||||
|
|
||||||
|
if not word_timings:
|
||||||
|
logger.debug("No word timings provided — returning header-only ASS")
|
||||||
|
return header
|
||||||
|
|
||||||
|
lines: list[str] = [header]
|
||||||
|
|
||||||
|
for w in word_timings:
|
||||||
|
word_text = w.get("word", "").strip()
|
||||||
|
if not word_text:
|
||||||
|
continue
|
||||||
|
|
||||||
|
abs_start = float(w.get("start", 0.0))
|
||||||
|
abs_end = float(w.get("end", abs_start))
|
||||||
|
|
||||||
|
# Make clip-relative
|
||||||
|
rel_start = max(0.0, abs_start - clip_start)
|
||||||
|
rel_end = max(rel_start, abs_end - clip_start)
|
||||||
|
|
||||||
|
# Karaoke duration in centiseconds
|
||||||
|
k_duration = max(1, round((rel_end - rel_start) * 100))
|
||||||
|
|
||||||
|
start_ts = _format_ass_time(rel_start)
|
||||||
|
end_ts = _format_ass_time(rel_end)
|
||||||
|
|
||||||
|
# Dialogue line with karaoke tag
|
||||||
|
line = (
|
||||||
|
f"Dialogue: 0,{start_ts},{end_ts},Default,,0,0,0,,"
|
||||||
|
f"{{\\k{k_duration}}}{word_text}"
|
||||||
|
)
|
||||||
|
lines.append(line)
|
||||||
|
|
||||||
|
return "\n".join(lines) + "\n"
|
||||||
|
|
||||||
|
|
||||||
|
def write_ass_file(ass_content: str, output_path: Path) -> Path:
|
||||||
|
"""Write ASS content to disk.
|
||||||
|
|
||||||
|
Creates parent directories if needed. Returns the output path.
|
||||||
|
"""
|
||||||
|
output_path = Path(output_path)
|
||||||
|
output_path.parent.mkdir(parents=True, exist_ok=True)
|
||||||
|
output_path.write_text(ass_content, encoding="utf-8")
|
||||||
|
logger.debug("Wrote ASS captions to %s (%d bytes)", output_path, len(ass_content))
|
||||||
|
return output_path
|
||||||
|
|
```diff
@@ -72,18 +72,24 @@ def extract_clip(
     start_secs: float,
     end_secs: float,
     vf_filter: str,
+    ass_path: Path | str | None = None,
 ) -> None:
     """Extract a clip from a video file using ffmpeg.

     Seeks to *start_secs*, encodes until *end_secs*, and applies *vf_filter*.
     Uses ``-c:v libx264 -preset fast -crf 23`` for reasonable quality/speed.

+    When *ass_path* is provided, the ASS subtitle filter is appended to the
+    video filter chain so that captions are burned into the output video.
+
     Args:
         input_path: Source video file.
         output_path: Destination mp4 file (parent dir must exist).
         start_secs: Start time in seconds.
         end_secs: End time in seconds.
         vf_filter: ffmpeg ``-vf`` filter string.
+        ass_path: Optional path to an ASS subtitle file. When provided,
+            ``ass=<path>`` is appended to the filter chain.

     Raises:
         subprocess.CalledProcessError: If ffmpeg exits non-zero.
```
```diff
@@ -97,13 +103,20 @@ def extract_clip(
         f"(duration={duration}s)"
     )

+    # Build the video filter chain — ASS burn-in comes after scale/pad
+    effective_vf = vf_filter
+    if ass_path is not None:
+        # Escape colons and backslashes in the path for ffmpeg filter syntax
+        escaped = str(ass_path).replace("\\", "\\\\").replace(":", "\\:")
+        effective_vf = f"{vf_filter},ass={escaped}"
+
     cmd = [
         "ffmpeg",
         "-y",  # overwrite output
         "-ss", str(start_secs),  # seek before input (fast)
         "-i", str(input_path),
         "-t", str(duration),
-        "-vf", vf_filter,
+        "-vf", effective_vf,
         "-c:v", "libx264",
         "-preset", "fast",
         "-crf", "23",
```
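The path escaping in the hunk above can be exercised standalone. A sketch of the same two substitutions, in the same order (backslashes doubled first, then colons escaped):

```python
def escape_for_filter(path: str) -> str:
    """Escape a file path for use inside an ffmpeg filter argument.

    Backslashes are doubled first, then colons are escaped, matching
    the order used in extract_clip. Plain POSIX paths pass through
    unchanged; Windows-style paths get both substitutions.
    """
    return path.replace("\\", "\\\\").replace(":", "\\:")
```

Doing the backslash pass first matters: escaping colons first would then double the backslash that the colon escape itself introduced.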
```diff
@@ -2876,7 +2876,8 @@ def stage_generate_shorts(self, highlight_candidate_id: str) -> str:
     Returns the highlight_candidate_id on completion.
     """
     from pipeline.shorts_generator import PRESETS, extract_clip, resolve_video_path
-    from models import FormatPreset, GeneratedShort, ShortStatus
+    from pipeline.caption_generator import generate_ass_captions, write_ass_file
+    from models import FormatPreset, GeneratedShort, ShortStatus, SourceVideo

     start = time.monotonic()
     session = _get_sync_session()
```
```diff
@@ -2954,6 +2955,56 @@ def stage_generate_shorts(self, highlight_candidate_id: str) -> str:
         clip_start, clip_end,
     )

+    # ── Generate captions from transcript (if available) ────────────
+    ass_path: Path | None = None
+    captions_ok = False
+    try:
+        transcript_data: list | None = None
+        if source_video.transcript_path:
+            try:
+                with open(source_video.transcript_path, "r") as fh:
+                    raw = json.load(fh)
+                if isinstance(raw, dict):
+                    transcript_data = raw.get("segments", raw.get("results", []))
+                elif isinstance(raw, list):
+                    transcript_data = raw
+            except (FileNotFoundError, json.JSONDecodeError, OSError) as io_exc:
+                logger.warning(
+                    "Failed to load transcript for captions highlight=%s: %s",
+                    highlight_candidate_id, io_exc,
+                )
+
+        if transcript_data:
+            from pipeline.highlight_scorer import extract_word_timings
+
+            word_timings = extract_word_timings(transcript_data, clip_start, clip_end)
+            if word_timings:
+                ass_content = generate_ass_captions(word_timings, clip_start)
+                ass_path = write_ass_file(
+                    ass_content,
+                    Path(f"/tmp/captions_{highlight_candidate_id}.ass"),
+                )
+                captions_ok = True
+                logger.info(
+                    "Generated captions for highlight=%s (%d words)",
+                    highlight_candidate_id, len(word_timings),
+                )
+            else:
+                logger.warning(
+                    "No word timings in transcript window [%.1f–%.1f]s for highlight=%s — proceeding without captions",
+                    clip_start, clip_end, highlight_candidate_id,
+                )
+        else:
+            logger.info(
+                "No transcript available for highlight=%s — proceeding without captions",
+                highlight_candidate_id,
+            )
+    except Exception as cap_exc:
+        logger.warning(
+            "Caption generation failed for highlight=%s: %s — proceeding without captions",
+            highlight_candidate_id, cap_exc,
+        )
+
     # ── Process each preset independently ───────────────────────────
     for preset in FormatPreset:
         spec = PRESETS[preset]
```
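The transcript-shape handling in the hunk above (dict keyed by `segments` or `results`, versus a bare list) can be isolated as a small helper. This is a sketch mirroring the diff's branching, not a shared helper that exists in the codebase:

```python
import json


def normalize_transcript(raw_json: str) -> list:
    """Return the segment list from either transcript JSON shape.

    Dicts may keep segments under "segments" (Whisper) or "results";
    a bare list is taken as the segments themselves. Anything else
    yields an empty list, which downstream treats as "no captions".
    """
    raw = json.loads(raw_json)
    if isinstance(raw, dict):
        return raw.get("segments", raw.get("results", []))
    if isinstance(raw, list):
        return raw
    return []
```

An empty result feeds the non-blocking path: the stage logs a warning and renders the short without captions.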
```diff
@@ -2983,6 +3034,7 @@ def stage_generate_shorts(self, highlight_candidate_id: str) -> str:
             start_secs=clip_start,
             end_secs=clip_end,
             vf_filter=spec.vf_filter,
+            ass_path=ass_path,
         )

         # Upload to MinIO
```
```diff
@@ -3000,6 +3052,7 @@ def stage_generate_shorts(self, highlight_candidate_id: str) -> str:
         short.status = ShortStatus.complete
         short.file_size_bytes = file_size
         short.minio_object_key = minio_key
+        short.captions_enabled = captions_ok
         short.share_token = secrets.token_urlsafe(8)
         session.commit()
```
```diff
@@ -3035,6 +3088,13 @@ def stage_generate_shorts(self, highlight_candidate_id: str) -> str:
         except OSError:
             pass

+    # Clean up temp ASS caption file
+    if ass_path is not None and ass_path.exists():
+        try:
+            ass_path.unlink()
+        except OSError:
+            pass
+
     elapsed = time.monotonic() - start
     logger.info(
         "Shorts generation complete for highlight=%s in %.1fs",
```
159  backend/pipeline/test_caption_generator.py  Normal file
@@ -0,0 +1,159 @@
"""Unit tests for caption_generator module."""

from __future__ import annotations

import re
import tempfile
from pathlib import Path

import pytest

from pipeline.caption_generator import (
    DEFAULT_STYLE,
    _format_ass_time,
    generate_ass_captions,
    write_ass_file,
)
# ── Fixtures ─────────────────────────────────────────────────────────────────

@pytest.fixture
def sample_word_timings() -> list[dict]:
    """Realistic word timings as produced by extract_word_timings."""
    return [
        {"word": "This", "start": 10.0, "end": 10.3},
        {"word": "is", "start": 10.3, "end": 10.5},
        {"word": "a", "start": 10.5, "end": 10.6},
        {"word": "test", "start": 10.6, "end": 11.0},
        {"word": "sentence", "start": 11.1, "end": 11.6},
    ]
# ── Time formatting ─────────────────────────────────────────────────────────

class TestFormatAssTime:
    def test_zero(self):
        assert _format_ass_time(0.0) == "0:00:00.00"

    def test_sub_second(self):
        assert _format_ass_time(0.5) == "0:00:00.50"

    def test_minutes(self):
        assert _format_ass_time(65.5) == "0:01:05.50"

    def test_hours(self):
        assert _format_ass_time(3661.25) == "1:01:01.25"

    def test_negative_clamps_to_zero(self):
        assert _format_ass_time(-5.0) == "0:00:00.00"
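These assertions pin down the timestamp format: `H:MM:SS.CC` with centisecond precision, hours unpadded, and negative inputs clamped to zero. A minimal implementation consistent with the tests (the real `_format_ass_time` may differ in detail) looks like:

```python
def format_ass_time(seconds: float) -> str:
    """Render seconds as an ASS timestamp, H:MM:SS.CC (centiseconds)."""
    # Clamp negatives to zero and work in integer centiseconds to avoid
    # float rounding surprises at the .CC boundary.
    cs = max(0, int(round(seconds * 100)))
    hours, rem = divmod(cs, 360_000)   # 3600 s/h * 100 cs/s
    minutes, rem = divmod(rem, 6_000)  # 60 s/min * 100 cs/s
    secs, centis = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{centis:02d}"
```

For example, `format_ass_time(3661.25)` produces `"1:01:01.25"`, matching `test_hours` above.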
# ── ASS generation ──────────────────────────────────────────────────────────

class TestGenerateAssCaptions:
    def test_empty_timings_returns_header_only(self):
        result = generate_ass_captions([], clip_start=0.0)
        assert "[Script Info]" in result
        assert "[Events]" in result
        # No Dialogue lines
        assert "Dialogue:" not in result

    def test_structure_has_required_sections(self, sample_word_timings):
        result = generate_ass_captions(sample_word_timings, clip_start=10.0)
        assert "[Script Info]" in result
        assert "[V4+ Styles]" in result
        assert "[Events]" in result
        assert "Dialogue:" in result

    def test_clip_offset_applied(self, sample_word_timings):
        """Word at t=10.5 with clip_start=10.0 should become t=0.5 in ASS."""
        result = generate_ass_captions(sample_word_timings, clip_start=10.0)
        lines = result.strip().split("\n")
        dialogue_lines = [l for l in lines if l.startswith("Dialogue:")]

        # First word "This" starts at 10.0, clip_start=10.0 → relative 0.0
        assert dialogue_lines[0].startswith("Dialogue: 0,0:00:00.00,")

        # Third word "a" starts at 10.5, clip_start=10.0 → relative 0.5
        assert "0:00:00.50" in dialogue_lines[2]

    def test_karaoke_tags_present(self, sample_word_timings):
        result = generate_ass_captions(sample_word_timings, clip_start=10.0)
        lines = result.strip().split("\n")
        dialogue_lines = [l for l in lines if l.startswith("Dialogue:")]

        for line in dialogue_lines:
            # Each line should have a \kN tag
            assert re.search(r"\{\\k\d+\}", line), f"Missing karaoke tag in: {line}"

    def test_karaoke_duration_math(self, sample_word_timings):
        """Word "This" at [10.0, 10.3] → 0.3s → k30 (30 centiseconds)."""
        result = generate_ass_captions(sample_word_timings, clip_start=10.0)
        lines = result.strip().split("\n")
        dialogue_lines = [l for l in lines if l.startswith("Dialogue:")]

        # "This" duration: 10.3 - 10.0 = 0.3s = 30cs
        assert "{\\k30}This" in dialogue_lines[0]

        # "test" duration: 11.0 - 10.6 = 0.4s = 40cs
        assert "{\\k40}test" in dialogue_lines[3]

    def test_word_count_matches(self, sample_word_timings):
        result = generate_ass_captions(sample_word_timings, clip_start=10.0)
        lines = result.strip().split("\n")
        dialogue_lines = [l for l in lines if l.startswith("Dialogue:")]
        assert len(dialogue_lines) == 5

    def test_empty_word_text_skipped(self):
        timings = [
            {"word": "hello", "start": 0.0, "end": 0.5},
            {"word": " ", "start": 0.5, "end": 0.7},  # whitespace-only
            {"word": "", "start": 0.7, "end": 0.8},  # empty
            {"word": "world", "start": 0.8, "end": 1.2},
        ]
        result = generate_ass_captions(timings, clip_start=0.0)
        lines = result.strip().split("\n")
        dialogue_lines = [l for l in lines if l.startswith("Dialogue:")]
        assert len(dialogue_lines) == 2  # only "hello" and "world"

    def test_custom_style_overrides(self, sample_word_timings):
        result = generate_ass_captions(
            sample_word_timings,
            clip_start=10.0,
            style_config={"font_size": 72, "font_name": "Roboto"},
        )
        assert "Roboto" in result
        assert ",72," in result

    def test_negative_relative_time_clamped(self):
        """Words before clip_start should clamp to 0."""
        timings = [{"word": "early", "start": 5.0, "end": 5.5}]
        result = generate_ass_captions(timings, clip_start=10.0)
        lines = [l for l in result.strip().split("\n") if l.startswith("Dialogue:")]
        # Both start and end clamped to 0
        assert lines[0].startswith("Dialogue: 0,0:00:00.00,0:00:00.00,")
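Taken together, the karaoke assertions imply one `Dialogue:` event per word, with times shifted by `clip_start` (clamped at zero) and a `\k` tag holding the highlight for the word's duration in centiseconds. A self-contained sketch that satisfies those shapes follows; the field layout after the style name (`Default,,0,0,0,,`) is an assumption about the generator's event format:

```python
def _fmt(seconds: float) -> str:
    # ASS timestamp H:MM:SS.CC, clamped at zero (see the formatting tests).
    cs = max(0, int(round(seconds * 100)))
    h, rem = divmod(cs, 360_000)
    m, rem = divmod(rem, 6_000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


def karaoke_dialogue(word: dict, clip_start: float) -> str:
    """One Dialogue event per word; {\\kN} highlights it for N centiseconds."""
    start = max(0.0, word["start"] - clip_start)
    end = max(0.0, word["end"] - clip_start)
    dur_cs = int(round((word["end"] - word["start"]) * 100))
    return (
        f"Dialogue: 0,{_fmt(start)},{_fmt(end)},Default,,0,0,0,,"
        f"{{\\k{dur_cs}}}{word['word'].strip()}"
    )
```

With the fixture's first word, `karaoke_dialogue({"word": "This", "start": 10.0, "end": 10.3}, 10.0)` yields a line containing `{\k30}This`, matching `test_karaoke_duration_math`.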
# ── File writing ─────────────────────────────────────────────────────────────

class TestWriteAssFile:
    def test_writes_content(self):
        content = "[Script Info]\ntest content\n"
        with tempfile.TemporaryDirectory() as td:
            out = write_ass_file(content, Path(td) / "sub.ass")
            assert out.exists()
            assert out.read_text() == content

    def test_creates_parent_dirs(self):
        content = "test"
        with tempfile.TemporaryDirectory() as td:
            out = write_ass_file(content, Path(td) / "nested" / "deep" / "sub.ass")
            assert out.exists()

    def test_returns_path(self):
        content = "test"
        with tempfile.TemporaryDirectory() as td:
            target = Path(td) / "sub.ass"
            result = write_ass_file(content, target)
            assert result == target
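The file-writing tests spell out `write_ass_file`'s contract: create missing parent directories, write the content, and return the target path. A sketch matching that contract (a plausible reconstruction, not the committed implementation):

```python
import tempfile
from pathlib import Path


def write_ass_file(content: str, target: Path) -> Path:
    # Ensure parent dirs exist, write the subtitle text, return the path
    # so callers can chain it straight into the ffmpeg invocation.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target


# Demo round-trip through a throwaway directory.
with tempfile.TemporaryDirectory() as td:
    out = write_ass_file("[Script Info]\n", Path(td) / "nested" / "sub.ass")
    demo_text = out.read_text(encoding="utf-8")
```

Writing via `pathlib.Path.write_text` keeps the helper trivially testable, as the three tests above demonstrate.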