feat: Added media streaming endpoint and chapters endpoint to videos ro…

- "backend/routers/videos.py"
- "backend/schemas.py"
- "frontend/src/api/videos.ts"

GSD-Task: S05/T01
jlightner 2026-04-04 05:47:16 +00:00
parent 6f12d5a240
commit e44ec1d1d5
15 changed files with 816 additions and 3 deletions


@ -308,3 +308,15 @@
**Context:** LightRAG's `/query/data` endpoint accepts `ll_keywords` (list of strings) that bias retrieval toward matching content without hard filtering. For creator-scoped search, pass the creator's name as a keyword; for domain-scoped, pass the topic category. Combine with post-filtering for strict creator scoping (request 3x results, filter locally by creator_id).
**Where:** `backend/search_service.py` (`_creator_scoped_search()`, `_domain_scoped_search()`)
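The over-fetch-and-filter combination can be sketched as follows. This is a minimal sketch: `query_fn` stands in for the real LightRAG `/query/data` call, and the result-dict shape (`creator_id` key) is an assumption.

```python
from typing import Callable

def creator_scoped_search(query_fn: Callable[..., list],
                          query: str,
                          creator_name: str,
                          creator_id: str,
                          limit: int = 10) -> list:
    # Soft bias: ll_keywords steers LightRAG toward the creator's content
    # without hard-filtering anything out.
    raw = query_fn(query, ll_keywords=[creator_name], top_k=limit * 3)
    # Hard filter: request 3x results, then enforce strict creator scoping locally.
    scoped = [r for r in raw if r.get("creator_id") == creator_id]
    return scoped[:limit]
```

The 3x over-fetch compensates for results the local filter discards; if fewer than `limit` survive, the caller falls back to the next cascade tier.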
## Named unique constraints for Celery upsert targeting
**Context:** When a Celery task needs idempotent writes (re-running on same input updates rather than duplicates), use a named unique constraint on the natural key column and target it with `INSERT ... ON CONFLICT ON CONSTRAINT <name> DO UPDATE`. The named constraint approach is more explicit than targeting the column directly and works reliably with SQLAlchemy's `insert().on_conflict_do_update(constraint=...)`.
**Where:** `backend/pipeline/stages.py` (`stage_highlight_detection`), constraint `uq_highlight_candidate_moment` on `key_moment_id`
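A minimal sketch of the named-constraint upsert with SQLAlchemy. The table and columns are simplified stand-ins for the real `highlight_candidates` model; only the table name, column name, and constraint name come from the text.

```python
import uuid

import sqlalchemy as sa
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import insert

metadata = sa.MetaData()
# Simplified stand-in for the real highlight_candidates table.
candidates = sa.Table(
    "highlight_candidates",
    metadata,
    sa.Column("key_moment_id", postgresql.UUID(as_uuid=True), nullable=False),
    sa.Column("score", sa.Float, nullable=False),
    sa.UniqueConstraint("key_moment_id", name="uq_highlight_candidate_moment"),
)

stmt = insert(candidates).values(key_moment_id=uuid.uuid4(), score=0.8)
stmt = stmt.on_conflict_do_update(
    # Target the named constraint rather than a column list.
    constraint="uq_highlight_candidate_moment",
    # On re-run, update the score instead of inserting a duplicate row.
    set_={"score": stmt.excluded.score},
)

# Compiling against the PostgreSQL dialect shows the ON CONFLICT clause.
sql = str(stmt.compile(dialect=postgresql.dialect()))
```

The `excluded` namespace refers to the row that would have been inserted, so a re-run with fresh data overwrites the stale score.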
## Pure-function scoring + Celery task separation
**Context:** Keep scoring logic as a pure function (no DB, no side effects) in a separate module from the Celery task that calls it. This enables unit testing with 28 tests running in 0.03s (no DB fixtures needed). The Celery task handles DB reads, calls the pure function, and writes results. Use lazy imports inside the Celery task function body to avoid circular imports at module load time.
**Where:** `backend/pipeline/highlight_scorer.py` (pure), `backend/pipeline/stages.py` (Celery wiring)
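The separation can be sketched structurally like this; the scoring body is a trivial stand-in, not the real 7-dimension logic, and `run_highlight_detection` is a hypothetical name for the task body.

```python
# --- highlight_scorer.py (pure: no DB, no side effects, trivially testable) ---
def score_moment(moment: dict) -> float:
    # Trivial stand-in for the real 7-dimension weighted scorer.
    return 1.0 if moment.get("duration_secs", 0) > 0 else 0.0

# --- stages.py (task wiring: reads input, calls the pure function, writes output) ---
def run_highlight_detection(moments: list) -> list:
    # In the real Celery task this import happens lazily inside the task body,
    # e.g. `from backend.pipeline.highlight_scorer import score_moment`,
    # so nothing circular is triggered at module load time.
    return [score_moment(m) for m in moments]
```

Because `score_moment` touches no database, its tests need no fixtures at all; only the thin task wrapper needs integration coverage.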


@ -60,6 +60,8 @@ Nineteen milestones complete. Phase 2 foundations are in place. M019 delivered c
- **Creator dashboard shell** — Protected /creator/* routes with sidebar nav (Dashboard, Settings). Profile edit and password change forms. Code-split with React.lazy.
- **Consent infrastructure** — Per-video consent toggles (allow_embed, allow_search, allow_kb, allow_download, allow_remix) with versioned audit trail. VideoConsent and ConsentAuditLog models with Alembic migration 017. 5 API endpoints with ownership verification and admin bypass.
- **Highlight detection v1** — Heuristic scoring engine with 7 weighted dimensions (duration fitness, content type, specificity density, plugin richness, transcript energy, source quality, video type) scores KeyMoment data into ranked highlight candidates stored in `highlight_candidates` table. Celery task for batch processing, 4 admin API endpoints for triggering detection and listing/inspecting candidates. 28 unit tests.
- **Web media player** — Custom video player page at `/watch/:videoId` with HLS playback (lazy-loaded hls.js), speed controls (0.5–2x), volume, seek, fullscreen, keyboard shortcuts, and synchronized transcript sidebar with binary search active segment detection and auto-scroll. Technique page key moment timestamps link directly to the watch page. Video + transcript API endpoints with creator info.
- **LightRAG graph-enhanced retrieval** — Running as chrysopedia-lightrag service on port 9621. Uses DGX Sparks for LLM (entity extraction, summarization), Ollama nomic-embed-text for embeddings, Qdrant for vector storage, NetworkX for graph storage. 12 music production entity types configured. Exposed via REST API at /documents/text (ingest) and /query (retrieval with local/global/mix/hybrid modes).
@ -99,3 +101,4 @@ Nineteen milestones complete. Phase 2 foundations are in place. M019 delivered c
| M018 | Phase 2 Research & Documentation — Site Audit and Forgejo Wiki Bootstrap | ✅ Complete |
| M019 | Foundations — Auth, Consent & LightRAG | ✅ Complete |
| M020 | Core Experiences — Player, Impersonation & Knowledge Routing | 🔄 Active |
| M021 | Intelligence Online — Chat, Chapters & Search Cutover | 🔄 Active |


@ -9,7 +9,7 @@ LightRAG becomes the primary search engine. Chat engine goes live (encyclopedic
| S01 | [B] LightRAG Search Cutover | high | — | ✅ | Primary search backed by LightRAG. Old system remains as automatic fallback. |
| S02 | [B] Creator-Scoped Retrieval Cascade | medium | S01 | ✅ | Question on Keota's profile first checks Keota's content, then sound design domain, then full KB, then graceful fallback |
| S03 | [B] Chat Engine MVP | high | S02 | ✅ | User asks a question, receives a streamed response with citations linking to source videos and technique pages |
| S04 | [B] Highlight Detection v1 | medium | — | ✅ | Scored highlight candidates generated from existing pipeline data for a sample of videos |
| S05 | [A] Audio Mode + Chapter Markers | medium | — | ⬜ | Media player with waveform visualization in audio mode and chapter markers on the timeline |
| S06 | [A] Auto-Chapters Review UI | low | — | ⬜ | Creator reviews detected chapters: drag boundaries, rename, reorder, approve for publication |
| S07 | [A] Impersonation Polish + Write Mode | low | — | ⬜ | Impersonation write mode with confirmation modal. Audit log admin view shows all sessions. |


@ -0,0 +1,108 @@
---
id: S04
parent: M021
milestone: M021
provides:
- highlight_candidates table with scored KeyMoment data
- Admin API for listing/inspecting highlight candidates
- Celery task stage_highlight_detection for batch scoring
- HighlightCandidateResponse Pydantic schema for downstream consumers
requires:
[]
affects:
- S08
key_files:
- backend/models.py
- alembic/versions/019_add_highlight_candidates.py
- backend/pipeline/highlight_schemas.py
- backend/pipeline/highlight_scorer.py
- backend/pipeline/test_highlight_scorer.py
- backend/pipeline/stages.py
- backend/routers/highlights.py
- backend/main.py
key_decisions:
- UNIQUE constraint on key_moment_id enforced at both ORM and named constraint level for upsert targeting
- Duration fitness uses piecewise linear rather than Gaussian bell curve for predictability
- Lazy import of score_moment inside Celery task to avoid circular imports at module load
- Upsert uses named constraint uq_highlight_candidate_moment for ON CONFLICT targeting
- 7 scoring dimensions mapped to HighlightScoreBreakdown schema fields for API/DB consistency
patterns_established:
- Heuristic scoring as pure function (no DB, no side effects) with separate Celery task for DB integration — enables easy unit testing
- Named unique constraint for upsert targeting pattern (uq_highlight_candidate_moment) — reusable for future pipeline stages that need idempotent writes
observability_surfaces:
- pipeline_events rows for highlight_detection stage (start/complete/error with candidate count in payload)
- GET /api/v1/admin/highlights/candidates — paginated list sorted by score desc
- GET /api/v1/admin/highlights/candidates/{id} — detail with full score_breakdown
drill_down_paths:
- .gsd/milestones/M021/slices/S04/tasks/T01-SUMMARY.md
- .gsd/milestones/M021/slices/S04/tasks/T02-SUMMARY.md
- .gsd/milestones/M021/slices/S04/tasks/T03-SUMMARY.md
duration: ""
verification_result: passed
completed_at: 2026-04-04T05:37:31.104Z
blocker_discovered: false
---
# S04: [B] Highlight Detection v1
**Heuristic scoring engine scores KeyMoment data into ranked highlight candidates via 7 weighted dimensions, stored in a new highlight_candidates table, exposed through 4 admin API endpoints, and triggerable via Celery task.**
## What Happened
Built the complete highlight detection pipeline in three tasks:
**T01 — Data Foundation.** Added `HighlightStatus` enum (candidate/approved/rejected) and `HighlightCandidate` ORM model to models.py with UUID PK, unique FK to key_moments, score (float 0-1), score_breakdown (JSONB), duration_secs, status, and timestamps. Alembic migration 019 creates the table with indexes on source_video_id, score DESC, and status. Created Pydantic schemas: `HighlightScoreBreakdown` (7 float fields), `HighlightCandidateResponse`, and `HighlightBatchResult`.
**T02 — Scoring Engine.** Implemented `score_moment()` pure function in highlight_scorer.py with 7 weighted dimensions: duration_fitness (0.25, piecewise-linear curve peaking at 30-60s), content_type_weight (0.20), specificity_density (0.20, regex unit/ratio counting), plugin_richness (0.10), transcript_energy (0.10, teaching-phrase detection), source_quality_weight (0.10), video_type_weight (0.05). Weights sum to 1.0. All 28 unit tests pass, covering ideal/mediocre/poor ordering, edge cases (None/empty fields), and per-dimension behavior.
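The weighted combination described above can be sketched as below. The weights and dimension names are taken from the text; the per-dimension scores are assumed to arrive precomputed as floats in [0, 1], and `combine` is a hypothetical helper name.

```python
# Weights from the text above; they sum to 1.0.
WEIGHTS = {
    "duration_fitness": 0.25,
    "content_type_weight": 0.20,
    "specificity_density": 0.20,
    "plugin_richness": 0.10,
    "transcript_energy": 0.10,
    "source_quality_weight": 0.10,
    "video_type_weight": 0.05,
}

def combine(breakdown: dict) -> float:
    # Weighted sum of the 7 dimension scores; missing dimensions count as 0.
    total = sum(w * breakdown.get(name, 0.0) for name, w in WEIGHTS.items())
    # Clamp defensively so the stored score stays in [0, 1].
    return max(0.0, min(1.0, total))
```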
**T03 — Runtime Wiring.** Added `stage_highlight_detection` Celery task following existing patterns (bind=True, max_retries=3, _get_sync_session, _emit_event start/complete/error). Task loads KeyMoments for a video, scores each, and bulk-upserts via INSERT ON CONFLICT on the named constraint. Created highlights router with 4 endpoints: POST detect/{video_id}, POST detect-all, GET candidates (paginated, score desc), GET candidates/{id}. Router registered in main.py.
## Verification
All 7 slice-level verification checks pass:
1. Model import (HighlightCandidate, HighlightStatus) — OK
2. Schema import (HighlightCandidateResponse, HighlightScoreBreakdown, HighlightBatchResult) — OK
3. Migration revision resolves to 019_add_highlight_candidates — OK
4. 28/28 scorer unit tests pass in 0.03s
5. Celery task import (stage_highlight_detection) — OK
6. Router import (highlights.router) — OK
7. Router registration confirmed in main.py app routes
## Requirements Advanced
None.
## Requirements Validated
None.
## New Requirements Surfaced
None.
## Requirements Invalidated or Re-scoped
None.
## Deviations
None.
## Known Limitations
Scoring is heuristic-only — no ML model or user feedback loop yet. Duration fitness uses a piecewise-linear curve (not a Gaussian) for predictability. No integration tests run against a live database (the unit tests exercise pure functions only).
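For illustration, a piecewise-linear duration fitness could look like the following; only the shape (linear segments, full score around 30-60s) comes from the text, while the ramp breakpoints are invented for this sketch.

```python
def duration_fitness(secs: float) -> float:
    # Hypothetical breakpoints: 0 below 10s and above 180s,
    # full score on the 30-60s plateau, linear ramps in between.
    if secs <= 10 or secs >= 180:
        return 0.0
    if 30 <= secs <= 60:
        return 1.0
    if secs < 30:
        return (secs - 10) / 20      # ramp up from 10s to 30s
    return (180 - secs) / 120        # ramp down from 60s to 180s
```

Unlike a Gaussian, every breakpoint is an explicit constant, so a reviewer can predict the score of any duration by eye.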
## Follow-ups
Run migration 019 on ub01 production database. Trigger detect-all endpoint on existing videos to populate initial candidates. Consider adding feedback loop (approved/rejected status) to tune weights in a future milestone.
## Files Created/Modified
- `backend/models.py` — Added HighlightStatus enum and HighlightCandidate ORM model
- `alembic/versions/019_add_highlight_candidates.py` — Migration 019: highlight_candidates table with indexes
- `backend/pipeline/highlight_schemas.py` — Pydantic schemas for scoring breakdown, API response, batch result
- `backend/pipeline/highlight_scorer.py` — Pure-function scoring engine with 7 weighted dimensions
- `backend/pipeline/test_highlight_scorer.py` — 28 unit tests for scoring engine
- `backend/pipeline/stages.py` — Added stage_highlight_detection Celery task
- `backend/routers/highlights.py` — 4 admin API endpoints for highlight detection
- `backend/main.py` — Registered highlights router


@ -0,0 +1,64 @@
# S04: [B] Highlight Detection v1 — UAT
**Milestone:** M021
**Written:** 2026-04-04T05:37:31.104Z
## UAT: Highlight Detection v1
### Preconditions
- Chrysopedia API running on ub01:8096
- Migration 019 applied (`docker exec chrysopedia-api alembic upgrade head`)
- At least one source video with extracted KeyMoments in the database
### Test 1: Model & Schema Imports
1. Run: `python -c "from backend.models import HighlightCandidate, HighlightStatus; print(HighlightStatus.candidate.value, HighlightStatus.approved.value, HighlightStatus.rejected.value)"`
2. **Expected:** Prints `candidate approved rejected`
3. Run: `python -c "from backend.pipeline.highlight_schemas import HighlightScoreBreakdown; print(list(HighlightScoreBreakdown.model_fields.keys()))"`
4. **Expected:** 7 field names printed
### Test 2: Scoring Engine Ordering
1. Run: `python -m pytest backend/pipeline/test_highlight_scorer.py::TestScoreMoment::test_ordering_is_sensible -v`
2. **Expected:** PASSED — ideal (45s technique, 3 plugins) > mediocre > poor (300s reasoning, 0 plugins)
### Test 3: Scoring Edge Cases
1. Run: `python -m pytest backend/pipeline/test_highlight_scorer.py::TestScoreMoment::test_missing_optional_fields -v`
2. **Expected:** PASSED — None transcript and None plugins don't crash, score in [0,1]
### Test 4: Full Test Suite
1. Run: `python -m pytest backend/pipeline/test_highlight_scorer.py -v`
2. **Expected:** 28/28 tests pass
### Test 5: Trigger Detection for Single Video
1. Pick a video_id from the database: `curl "http://ub01:8096/api/v1/admin/pipeline/videos?limit=1"`
2. POST: `curl -X POST http://ub01:8096/api/v1/admin/highlights/detect/{video_id}`
3. **Expected:** 200 with `{"task_id": "..."}` response
4. Wait 10s for worker to process
5. GET: `curl "http://ub01:8096/api/v1/admin/highlights/candidates?limit=5"`
6. **Expected:** Array of candidates with scores in [0,1], score_breakdown with 7 dimensions
### Test 6: Trigger Detection for All Videos
1. POST: `curl -X POST http://ub01:8096/api/v1/admin/highlights/detect-all`
2. **Expected:** 200 with count of dispatched tasks
3. Wait 30s, then GET candidates endpoint
4. **Expected:** Candidates from multiple videos, sorted by score desc
### Test 7: Candidate Detail
1. From Test 5/6 results, pick a candidate_id
2. GET: `curl http://ub01:8096/api/v1/admin/highlights/candidates/{candidate_id}`
3. **Expected:** Full candidate with score_breakdown showing all 7 dimension scores
### Test 8: Idempotent Re-run
1. Re-trigger detection for the same video_id as Test 5
2. Wait for completion
3. GET candidates for that video
4. **Expected:** Same number of candidates (upsert, not duplicate). Scores may differ only if data changed.
### Test 9: 404 on Missing Candidate
1. GET: `curl http://ub01:8096/api/v1/admin/highlights/candidates/00000000-0000-0000-0000-000000000000`
2. **Expected:** 404 response
### Test 10: Pagination
1. GET: `curl "http://ub01:8096/api/v1/admin/highlights/candidates?skip=0&limit=2"`
2. **Expected:** At most 2 candidates returned
3. GET with skip=2: `curl "http://ub01:8096/api/v1/admin/highlights/candidates?skip=2&limit=2"`
4. **Expected:** Next page of candidates (different from first page if enough exist)


@ -0,0 +1,9 @@
{
"schemaVersion": 1,
"taskId": "T03",
"unitId": "M021/S04/T03",
"timestamp": 1775280970845,
"passed": true,
"discoverySource": "none",
"checks": []
}


@ -1,6 +1,126 @@
# S05: [A] Audio Mode + Chapter Markers
**Goal:** Media player renders an audio waveform (via wavesurfer.js) when no video URL is available, and chapter markers derived from KeyMoment data appear on the seek bar timeline.
**Demo:** After this: Media player with waveform visualization in audio mode and chapter markers on the timeline
## Tasks
- [x] **T01: Added media streaming endpoint and chapters endpoint to videos router, plus fetchChapters frontend API client** — Add two new endpoints to `backend/routers/videos.py`:
1. **`GET /videos/{video_id}/stream`** — Serves the media file at `SourceVideo.file_path` via `FileResponse`. Validates the video exists and `file_path` is set. Returns 404 if video not found or no file. Guesses media type from file extension (audio/wav, audio/mpeg, video/mp4, etc.).
2. **`GET /videos/{video_id}/chapters`** — Returns KeyMoment records for the video as chapter markers, sorted by `start_time`. Uses a new `ChapterMarkerRead` schema with fields: `id`, `title`, `start_time`, `end_time`, `content_type`.
Also adds `fetchChapters()` to the frontend API client so downstream tasks can consume it.
## Steps
1. In `backend/schemas.py`, add `ChapterMarkerRead` Pydantic model (id: UUID, title: str, start_time: float, end_time: float, content_type: str) with `model_config = ConfigDict(from_attributes=True)`. Add `ChaptersResponse` with `video_id: UUID` and `chapters: list[ChapterMarkerRead]`.
2. In `backend/routers/videos.py`, add `GET /videos/{video_id}/stream` endpoint: query `SourceVideo` by id, check `file_path` exists and is a real file on disk, return `FileResponse(video.file_path, media_type=guessed_type)`. Import `os.path` and `mimetypes`. Return 404 with detail if video not found or file missing.
3. In `backend/routers/videos.py`, add `GET /videos/{video_id}/chapters` endpoint: query `KeyMoment` records where `source_video_id == video_id`, order by `start_time`. Verify video exists first (404 if not). Return `ChaptersResponse`.
4. In `frontend/src/api/videos.ts`, add `Chapter` interface (id, title, start_time, end_time, content_type) and `ChaptersResponse` interface. Add `fetchChapters(videoId: string)` function following the `fetchTranscript` pattern.
5. Verify: run `python -c "from routers.videos import router"` in backend dir to confirm imports compile.
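The decision logic in step 2 boils down to a small pure helper; this is a sketch, `resolve_stream` is a hypothetical name, and the real endpoint wraps this logic in FastAPI's `HTTPException` (404 case) and `FileResponse` (200 case).

```python
import mimetypes
import os
from typing import Optional, Tuple

def resolve_stream(file_path: Optional[str]) -> Tuple[int, Optional[str]]:
    """Decide the stream endpoint's response: (status_code, media_type)."""
    # 404 when the video row has no file_path or the file is gone from disk.
    if not file_path or not os.path.isfile(file_path):
        return 404, None
    # Guess the content type from the extension (audio/mpeg, video/mp4, ...).
    media_type, _ = mimetypes.guess_type(file_path)
    return 200, media_type or "application/octet-stream"
```

Keeping the checks in one helper makes the 404 paths trivially unit-testable without spinning up the app.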
## Must-Haves
- [ ] `ChapterMarkerRead` schema in `backend/schemas.py`
- [ ] Stream endpoint serves file from `file_path` with correct content-type
- [ ] Stream endpoint returns 404 when video not found or file_path missing/invalid
- [ ] Chapters endpoint returns KeyMoments sorted by start_time
- [ ] `fetchChapters()` added to frontend API client
## Verification
- `cd backend && python -c "from routers.videos import router; print('ok')"` exits 0
- `grep -q 'def get_video_chapters' backend/routers/videos.py` confirms endpoint exists
- `grep -q 'def stream_video' backend/routers/videos.py` confirms stream endpoint exists
- `grep -q 'fetchChapters' frontend/src/api/videos.ts` confirms API client function exists
- Estimate: 45m
- Files: backend/routers/videos.py, backend/schemas.py, frontend/src/api/videos.ts
- Verify: cd /home/aux/projects/content-to-kb-automator/backend && python -c "from routers.videos import router; print('ok')" && grep -q 'fetchChapters' /home/aux/projects/content-to-kb-automator/frontend/src/api/videos.ts
- [ ] **T02: Audio waveform component with wavesurfer.js + WatchPage integration** — Install wavesurfer.js, create the AudioWaveform component, widen useMediaSync to support HTMLMediaElement, and wire the waveform into WatchPage as a replacement for VideoPlayer when no video URL is available.
## Steps
1. Install wavesurfer.js: `cd frontend && npm install wavesurfer.js`
2. In `frontend/src/hooks/useMediaSync.ts`, widen the ref type from `HTMLVideoElement` to `HTMLMediaElement`. Change `useRef<HTMLVideoElement | null>` to `useRef<HTMLMediaElement | null>`. Update the `MediaSyncState` interface's `videoRef` type to `React.RefObject<HTMLMediaElement | null>`. All HTMLMediaElement APIs (play, pause, currentTime, volume, etc.) are identical — no behavioral changes needed.
3. In `frontend/src/components/VideoPlayer.tsx`, no change is needed: the file explicitly casts `videoRef as React.RefObject<HTMLVideoElement>` (around line 108), and that cast stays valid after the widening because `HTMLVideoElement` extends `HTMLMediaElement`.
4. Create `frontend/src/components/AudioWaveform.tsx`:
- Props: `mediaSync: MediaSyncState`, `src: string` (the stream URL)
- Render a hidden `<audio ref={mediaSync.videoRef} src={src} preload="metadata" />` element — this lets useMediaSync own the audio element
- Create a container div ref for wavesurfer
- In a useEffect, create `WaveSurfer.create({ container, media: audioEl, height: 128, waveColor: 'rgba(0, 255, 209, 0.4)', progressColor: 'rgba(0, 255, 209, 0.8)', cursorColor: '#00ffd1', barWidth: 2, barGap: 1, barRadius: 2 })`, where `audioEl` is the DOM element behind the ref (`mediaSync.videoRef.current`). The `media` option must point to the same `<audio>` element the ref points to; note the v6-era `backend: 'MediaElement'` option was removed in wavesurfer v7, where passing `media` is how an external element is attached.
- Clean up wavesurfer instance on unmount with `wavesurfer.destroy()`
- Style the container with class `audio-waveform`
5. In `frontend/src/pages/WatchPage.tsx`, import AudioWaveform. Compute `streamUrl` as `` `${BASE}/videos/${videoId}/stream` ``. Conditionally render: if `video.video_url` is null/undefined, render `<AudioWaveform src={streamUrl} mediaSync={mediaSync} />`, else render existing `<VideoPlayer>`.
6. Add CSS to `frontend/src/App.css` for `.audio-waveform` container: dark background matching `.video-player`, border-radius, padding, min-height.
7. Verify: `cd frontend && npx tsc --noEmit` passes with no errors.
## Must-Haves
- [ ] wavesurfer.js installed as dependency
- [ ] `useMediaSync` ref type widened to `HTMLMediaElement`
- [ ] `AudioWaveform` component renders wavesurfer waveform using stream URL
- [ ] Hidden `<audio>` element shared between useMediaSync and wavesurfer
- [ ] WatchPage renders AudioWaveform when `video_url` is null
- [ ] Dark-themed CSS for `.audio-waveform` container
## Verification
- `cd frontend && npx tsc --noEmit` exits 0
- `grep -q 'HTMLMediaElement' frontend/src/hooks/useMediaSync.ts`
- `grep -q 'wavesurfer' frontend/src/components/AudioWaveform.tsx`
- `grep -q 'AudioWaveform' frontend/src/pages/WatchPage.tsx`
- Estimate: 1h30m
- Files: frontend/src/components/AudioWaveform.tsx, frontend/src/hooks/useMediaSync.ts, frontend/src/pages/WatchPage.tsx, frontend/src/components/VideoPlayer.tsx, frontend/src/App.css, frontend/package.json
- Verify: cd /home/aux/projects/content-to-kb-automator/frontend && npx tsc --noEmit && grep -q 'HTMLMediaElement' src/hooks/useMediaSync.ts && grep -q 'AudioWaveform' src/pages/WatchPage.tsx
- [ ] **T03: Chapter markers on seek bar + waveform regions + integration CSS** — Create a ChapterMarkers overlay component for the seek bar, add chapter region display in the waveform, load chapter data in WatchPage, and polish the integration CSS.
## Steps
1. Create `frontend/src/components/ChapterMarkers.tsx`:
- Props: `chapters: Chapter[]`, `duration: number`, `onSeek: (time: number) => void`
- Renders an absolutely-positioned overlay div (`.chapter-markers`) containing tick marks
- Each chapter renders a `.chapter-marker__tick` at `left: (chapter.start_time / duration) * 100%`
- Each tick has a hover tooltip (`.chapter-marker__tooltip`) showing the chapter title
- Clicking a tick calls `onSeek(chapter.start_time)`
- Guard: if duration is 0 or chapters is empty, render nothing
2. In `frontend/src/components/PlayerControls.tsx`:
- Add optional `chapters` prop: `chapters?: Chapter[]`
- Import `ChapterMarkers` and `Chapter` type
- Wrap the seek `<input>` in a `.player-controls__seek-container` div (position: relative)
- Render `<ChapterMarkers>` inside that container, passing `chapters`, `duration`, and `seekTo`
3. In `frontend/src/components/AudioWaveform.tsx`:
- Add optional `chapters` prop: `chapters?: Chapter[]`
- After wavesurfer is created, if chapters are provided, use wavesurfer's `RegionsPlugin` to add labeled read-only regions for each chapter. Import `RegionsPlugin` from `wavesurfer.js/dist/plugins/regions.esm.js` (or similar path). Create regions with `{ start, end, content: title, color: 'rgba(0, 255, 209, 0.1)', drag: false, resize: false }`.
4. In `frontend/src/pages/WatchPage.tsx`:
- Import `fetchChapters` and `Chapter` type from `../api/videos`
- Add `chapters` state: `useState<Chapter[]>([])`
- In the existing fetch useEffect, after fetching transcript, also call `fetchChapters(videoId)` and set chapters state (catch errors silently — chapters are non-critical)
- Pass `chapters` to both `<AudioWaveform chapters={chapters}>` and `<PlayerControls chapters={chapters}>`
5. Add CSS to `frontend/src/App.css`:
- `.player-controls__seek-container` — position: relative, flex: 1
- `.chapter-markers` — position: absolute, top: 0, left: 0, right: 0, bottom: 0, pointer-events: none
- `.chapter-marker__tick` — position: absolute, width: 3px, height: 100%, background: var(--accent), opacity: 0.6, pointer-events: all, cursor: pointer, transform: translateX(-50%)
- `.chapter-marker__tick:hover` — opacity: 1
- `.chapter-marker__tooltip` — position: absolute, bottom: 100%, left: 50%, transform: translateX(-50%), background: var(--color-bg-surface), color: var(--text-primary), padding: 4px 8px, border-radius: 4px, font-size: 0.75rem, white-space: nowrap, opacity: 0, pointer-events: none, transition: opacity 150ms
- `.chapter-marker__tick:hover .chapter-marker__tooltip` — opacity: 1
6. Verify: `cd frontend && npx tsc --noEmit` passes. Visual inspection: chapter ticks appear on seek bar.
## Must-Haves
- [ ] ChapterMarkers component renders positioned ticks on the seek bar
- [ ] Tick hover shows chapter title tooltip
- [ ] Tick click seeks to chapter start_time
- [ ] AudioWaveform displays chapter regions via RegionsPlugin
- [ ] WatchPage fetches and distributes chapter data to both components
- [ ] CSS consistent with dark theme
## Verification
- `cd frontend && npx tsc --noEmit` exits 0
- `grep -q 'ChapterMarkers' frontend/src/components/PlayerControls.tsx`
- `grep -q 'fetchChapters' frontend/src/pages/WatchPage.tsx`
- `grep -q 'chapter-marker' frontend/src/App.css`
- Estimate: 1h30m
- Files: frontend/src/components/ChapterMarkers.tsx, frontend/src/components/PlayerControls.tsx, frontend/src/components/AudioWaveform.tsx, frontend/src/pages/WatchPage.tsx, frontend/src/App.css
- Verify: cd /home/aux/projects/content-to-kb-automator/frontend && npx tsc --noEmit && grep -q 'ChapterMarkers' src/components/PlayerControls.tsx && grep -q 'fetchChapters' src/pages/WatchPage.tsx && grep -q 'chapter-marker' src/App.css


@ -0,0 +1,120 @@
# S05 Research: Audio Mode + Chapter Markers
## Summary
This slice adds two features to the existing WatchPage media player: (1) a waveform visualization mode for audio-only playback, and (2) chapter markers on the seek bar timeline derived from existing KeyMoment data. The codebase already has a solid `VideoPlayer` + `PlayerControls` + `useMediaSync` stack. No chapters/waveform infrastructure exists yet — this is greenfield UI work on established patterns.
**Depth: Targeted** — known tech (wavesurfer.js), clear codebase patterns, moderate integration with existing player.
## Requirements Targeted
No active requirements explicitly own this slice. The slice is defined by the roadmap: "Media player with waveform visualization in audio mode and chapter markers on the timeline." S06 (Auto-Chapters Review UI) depends on chapters being displayable, so S05 provides the read-only chapter display foundation.
## Recommendation
**Use wavesurfer.js + @wavesurfer/react** for audio waveform rendering. It's the dominant library (8.2 trust score), has a React hook (`useWavesurfer`), and its RegionsPlugin provides chapter marker display out of the box (labeled markers at time points, clickable, styled). The existing `useMediaSync` hook can sync playback state between the wavesurfer instance and the existing controls.
For chapter data, use existing **KeyMoment** records as chapters. Each KeyMoment already has `start_time`, `end_time`, `title`, and `content_type` — perfect for chapter markers. A new API endpoint `GET /videos/{video_id}/chapters` returns KeyMoments for a video sorted by start_time.
## Implementation Landscape
### Existing Components (What's There)
| File | Purpose |
|------|---------|
| `frontend/src/pages/WatchPage.tsx` | Main page, fetches video + transcript, renders player + sidebar |
| `frontend/src/components/VideoPlayer.tsx` | HLS/native video, accepts `mediaSync` + `src` + `startTime` |
| `frontend/src/components/PlayerControls.tsx` | Play/pause, seek bar, speed, volume, fullscreen |
| `frontend/src/hooks/useMediaSync.ts` | Shared playback state hook: `videoRef`, `currentTime`, `duration`, controls |
| `frontend/src/components/TranscriptSidebar.tsx` | Scrollable transcript synced to playback |
| `frontend/src/api/videos.ts` | `fetchVideo()`, `fetchTranscript()` |
| `backend/routers/videos.py` | `GET /videos/{id}`, `GET /videos/{id}/transcript` |
| `backend/models.py` | `SourceVideo`, `KeyMoment` (with `start_time`, `end_time`, `title`) |
| `backend/schemas.py` | `SourceVideoDetail` (has `video_url: str | None`), `KeyMomentRead` |
| `frontend/src/App.css` (lines ~5866+) | `.video-player`, `.player-controls` CSS |
### Key Observations
1. **`video_url` is always `None`** — the `SourceVideoDetail` schema has it but it's never populated (test confirms `video_url is always None for now`). The `VideoPlayer` component shows "Video not available" placeholder when `src` is null. This means **all current videos will show in audio mode** since there's no video URL. Audio mode is the primary mode, not a fallback.
2. **No audio URL exists either** — `SourceVideo` model has no `audio_url` or `audio_path` field. The whisper pipeline extracts audio to WAV locally for transcription but doesn't persist a served audio URL. We need either: (a) an audio extraction step that creates a served file, or (b) serve audio from the existing video file_path. Given `video_url` is also null, we likely need to address media serving first.
3. **`ContentType` enum is about content category** (tutorial, livestream, breakdown, short_form) — NOT audio vs video media type. There's no field distinguishing audio-only from video sources.
4. **KeyMoment → Chapter mapping is natural**: each KeyMoment has `start_time`, `end_time`, `title`, `content_type` (technique/settings/reasoning/workflow). The `GET /videos/{video_id}/transcript` pattern can be replicated for `GET /videos/{video_id}/chapters`.
5. **`useMediaSync` needs extension**: wavesurfer.js manages its own internal audio element. The hook currently owns a `videoRef`. For audio mode, we'd either: (a) let wavesurfer own the audio element and sync `useMediaSync` state from wavesurfer events, or (b) feed an external audio element to wavesurfer via its `media` option. Option (b) is better — lets `useMediaSync` remain the source of truth.
### What Needs Building
#### Backend (1 endpoint + 1 schema)
- **New endpoint**: `GET /videos/{video_id}/chapters` → returns KeyMoments for the video as chapter markers. Schema: `ChapterMarker { id, title, start_time, end_time, content_type }`. Sort by `start_time`.
- Add to `backend/routers/videos.py` and `backend/schemas.py`.
#### Frontend — Audio Waveform Mode (new component)
- **New dependency**: `wavesurfer.js` + `@wavesurfer/react` (npm install)
- **New component**: `AudioWaveform.tsx` — renders wavesurfer waveform with:
- Uses `useWavesurfer` hook or raw WaveSurfer.create()
- WaveSurfer configured with `media: videoRef.current` (external element) so `useMediaSync` stays the source of truth
- RegionsPlugin for chapter markers (read-only, non-draggable)
- TimelinePlugin for time ruler
- HoverPlugin for timestamp on hover
- Styled to match dark theme (`--color-bg-surface`, `--color-accent`)
- **WatchPage changes**: conditionally render `AudioWaveform` instead of `VideoPlayer` when `video_url` is null (or always, since video_url is always null currently)
#### Frontend — Chapter Markers on Seek Bar
- **New component**: `ChapterMarkers.tsx` — thin overlay on the seek bar showing marker ticks
- Positioned absolutely over `.player-controls__seek` range input
- Each chapter = a small tick mark at `(start_time / duration) * 100%`
- Tooltip on hover showing chapter title
- Click seeks to chapter start
- **API function**: `fetchChapters(videoId)` in `frontend/src/api/videos.ts`
- **PlayerControls changes**: accept optional `chapters` prop, render `ChapterMarkers` overlay
#### CSS additions in `App.css`
- `.audio-waveform` container styles
- `.chapter-markers` overlay styles
- `.chapter-marker__tick` + tooltip styles
### Natural Task Decomposition
1. **Backend: chapters endpoint** — independent, can be built first. `GET /videos/{video_id}/chapters` returning KeyMoments. Quick — follows exact pattern of `GET /videos/{video_id}/transcript`.
2. **Frontend: install wavesurfer.js + AudioWaveform component** — the core waveform rendering. Needs wavesurfer.js installed, new component, wired into WatchPage. Biggest risk is syncing wavesurfer with the existing `useMediaSync` hook.
3. **Frontend: chapter markers on timeline** — the seek bar chapter ticks. Needs the chapters API client + a ChapterMarkers overlay component + PlayerControls integration. Independent of waveform work.
4. **Integration + CSS polish** — wire everything together in WatchPage, style for dark theme, verify responsive behavior.
### Media Serving Constraint
**Critical finding**: There is no served media URL. `video_url` is always null, and no audio URL exists. The waveform component needs an audio/video URL to render. Two options:
- **Option A (recommended for this slice)**: Add a media serving endpoint `GET /videos/{video_id}/stream` that serves the file at `SourceVideo.file_path` via `FileResponse`/`StreamingResponse`. This unblocks both audio waveform and future video playback.
- **Option B**: Use a pre-extracted audio file path. Requires a migration or pipeline change.
Option A is simpler and more useful. It's a single FastAPI endpoint. The waveform component and VideoPlayer can both use the same URL pattern.
### Wavesurfer + useMediaSync Sync Strategy
wavesurfer.js accepts a `media` option — an existing HTMLMediaElement. This is the key:
```ts
WaveSurfer.create({
  container: containerRef.current, // the div that hosts the waveform
  media: audioElement,             // existing <audio> element; wavesurfer only visualizes it
  // ...theme options (waveColor, progressColor, height, etc.)
})
```
For audio mode: create a hidden `<audio>` element, pass it to both `useMediaSync` (via `videoRef` which accepts HTMLMediaElement) and wavesurfer. The hook stays the single source of truth for play/pause/seek. Wavesurfer just visualizes.
### Skill Suggestions
No directly relevant professional skills found in `<available_skills>`. The `frontend-design` skill may be useful for polishing the waveform + chapter UI.
### Risks
1. **No media URL served** — must add a streaming endpoint or the waveform has nothing to render. This is the #1 blocker.
2. **useMediaSync typed as `HTMLVideoElement`** — the ref is `useRef<HTMLVideoElement | null>`. For audio mode it wraps an `<audio>` element. Both extend `HTMLMediaElement` so the API is identical, but the TypeScript type needs widening to `HTMLMediaElement`.
3. **wavesurfer.js bundle size** — ~60KB gzipped. Acceptable for a lazy-loaded page. Use dynamic import.


@@ -0,0 +1,55 @@
---
estimated_steps: 21
estimated_files: 3
skills_used: []
---
# T01: Backend: media streaming endpoint + chapters endpoint
Add two new endpoints to `backend/routers/videos.py`:
1. **`GET /videos/{video_id}/stream`** — Serves the media file at `SourceVideo.file_path` via `FileResponse`. Validates the video exists and `file_path` is set. Returns 404 if video not found or no file. Guesses media type from file extension (audio/wav, audio/mpeg, video/mp4, etc.).
2. **`GET /videos/{video_id}/chapters`** — Returns KeyMoment records for the video as chapter markers, sorted by `start_time`. Uses a new `ChapterMarkerRead` schema with fields: `id`, `title`, `start_time`, `end_time`, `content_type`.
Also adds `fetchChapters()` to the frontend API client so downstream tasks can consume it.
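The extension-based content-type guessing described for the stream endpoint can be sketched with the standard library (the helper name and filenames are illustrative, not from the codebase):

```python
import mimetypes


def guess_media_type(path: str) -> str:
    """Guess a MIME type from the file extension, falling back to a
    generic binary type when the extension is unknown."""
    media_type, _encoding = mimetypes.guess_type(path)
    return media_type or "application/octet-stream"


# guess_media_type("talk.mp3") yields "audio/mpeg";
# an unrecognized extension yields "application/octet-stream".
```

The same two-step pattern (guess, then fall back) is what keeps `FileResponse` from ever being handed `media_type=None`.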
## Steps
1. In `backend/schemas.py`, add `ChapterMarkerRead` Pydantic model (id: UUID, title: str, start_time: float, end_time: float, content_type: str) with `model_config = ConfigDict(from_attributes=True)`. Add `ChaptersResponse` with `video_id: UUID` and `chapters: list[ChapterMarkerRead]`.
2. In `backend/routers/videos.py`, add `GET /videos/{video_id}/stream` endpoint: query `SourceVideo` by id, check `file_path` exists and is a real file on disk, return `FileResponse(video.file_path, media_type=guessed_type)`. Import `os.path` and `mimetypes`. Return 404 with detail if video not found or file missing.
3. In `backend/routers/videos.py`, add `GET /videos/{video_id}/chapters` endpoint: query `KeyMoment` records where `source_video_id == video_id`, order by `start_time`. Verify video exists first (404 if not). Return `ChaptersResponse`.
4. In `frontend/src/api/videos.ts`, add `Chapter` interface (id, title, start_time, end_time, content_type) and `ChaptersResponse` interface. Add `fetchChapters(videoId: string)` function following the `fetchTranscript` pattern.
5. Verify: run `python -c "from routers.videos import router"` in backend dir to confirm imports compile.
## Must-Haves
- [ ] `ChapterMarkerRead` schema in `backend/schemas.py`
- [ ] Stream endpoint serves file from `file_path` with correct content-type
- [ ] Stream endpoint returns 404 when video not found or file_path missing/invalid
- [ ] Chapters endpoint returns KeyMoments sorted by start_time
- [ ] `fetchChapters()` added to frontend API client
## Verification
- `cd backend && python -c "from routers.videos import router; print('ok')"` exits 0
- `grep -q 'def get_video_chapters' backend/routers/videos.py` confirms endpoint exists
- `grep -q 'def stream_video' backend/routers/videos.py` confirms stream endpoint exists
- `grep -q 'fetchChapters' frontend/src/api/videos.ts` confirms API client function exists
## Inputs
- `backend/routers/videos.py` — existing video endpoints to extend
- `backend/schemas.py` — existing schemas, including the `KeyMomentRead` pattern
- `backend/models.py` — `KeyMoment` and `SourceVideo` models
- `frontend/src/api/videos.ts` — existing API client to extend
## Expected Output
- `backend/routers/videos.py` — two new endpoints: `stream_video` and `get_video_chapters`
- `backend/schemas.py` — `ChapterMarkerRead` and `ChaptersResponse` schemas
- `frontend/src/api/videos.ts` — `Chapter` type + `fetchChapters` function
## Verification Command

```shell
cd /home/aux/projects/content-to-kb-automator/backend && python -c "from routers.videos import router; print('ok')" && grep -q 'fetchChapters' /home/aux/projects/content-to-kb-automator/frontend/src/api/videos.ts
```


@@ -0,0 +1,83 @@
---
id: T01
parent: S05
milestone: M021
provides: []
requires: []
affects: []
key_files: ["backend/routers/videos.py", "backend/schemas.py", "frontend/src/api/videos.ts"]
key_decisions: ["Stream endpoint uses FileResponse with mimetypes.guess_type for content-type detection", "Chapters endpoint maps KeyMoment records directly to ChapterMarkerRead schema"]
patterns_established: []
drill_down_paths: []
observability_surfaces: []
duration: ""
verification_result: "All 5 verification checks passed: router import compiles, get_video_chapters endpoint exists, stream_video endpoint exists, fetchChapters exists in frontend API client, ChapterMarkerRead schema exists."
completed_at: 2026-04-04T05:47:12.024Z
blocker_discovered: false
---
# T01: Added media streaming endpoint and chapters endpoint to videos router, plus fetchChapters frontend API client
## What Happened
Added ChapterMarkerRead and ChaptersResponse Pydantic schemas to backend/schemas.py. Added two new endpoints to backend/routers/videos.py: GET /videos/{video_id}/stream serves the media file via FileResponse with MIME type guessing, returning 404 when video or file is missing; GET /videos/{video_id}/chapters returns KeyMoment records sorted by start_time. Added Chapter/ChaptersResponse TypeScript interfaces and fetchChapters() to the frontend API client.
## Verification
All 5 verification checks passed: router import compiles, get_video_chapters endpoint exists, stream_video endpoint exists, fetchChapters exists in frontend API client, ChapterMarkerRead schema exists.
## Verification Evidence
| # | Command | Exit Code | Verdict | Duration |
|---|---------|-----------|---------|----------|
| 1 | `cd backend && python -c "from routers.videos import router; print('ok')"` | 0 | ✅ pass | 500ms |
| 2 | `grep -q 'def get_video_chapters' backend/routers/videos.py` | 0 | ✅ pass | 50ms |
| 3 | `grep -q 'def stream_video' backend/routers/videos.py` | 0 | ✅ pass | 50ms |
| 4 | `grep -q 'fetchChapters' frontend/src/api/videos.ts` | 0 | ✅ pass | 50ms |
| 5 | `grep -q 'ChapterMarkerRead' backend/schemas.py` | 0 | ✅ pass | 50ms |
## Deviations
None.
## Known Issues
None.
## Files Created/Modified
- `backend/routers/videos.py`
- `backend/schemas.py`
- `frontend/src/api/videos.ts`


@@ -0,0 +1,61 @@
---
estimated_steps: 27
estimated_files: 6
skills_used: []
---
# T02: Audio waveform component with wavesurfer.js + WatchPage integration
Install wavesurfer.js, create the AudioWaveform component, widen useMediaSync to support HTMLMediaElement, and wire the waveform into WatchPage as a replacement for VideoPlayer when no video URL is available.
## Steps
1. Install wavesurfer.js: `cd frontend && npm install wavesurfer.js`
2. In `frontend/src/hooks/useMediaSync.ts`, widen the ref type from `HTMLVideoElement` to `HTMLMediaElement`. Change `useRef<HTMLVideoElement | null>` to `useRef<HTMLMediaElement | null>`. Update the `MediaSyncState` interface's `videoRef` type to `React.RefObject<HTMLMediaElement | null>`. All HTMLMediaElement APIs (play, pause, currentTime, volume, etc.) are identical — no behavioral changes needed.
3. In `frontend/src/components/VideoPlayer.tsx`, no change is required: the existing cast `videoRef as React.RefObject<HTMLVideoElement>` (around line ~108) stays valid after the widening, since `HTMLVideoElement` extends `HTMLMediaElement`.
4. Create `frontend/src/components/AudioWaveform.tsx`:
- Props: `mediaSync: MediaSyncState`, `src: string` (the stream URL)
- Render a hidden `<audio ref={mediaSync.videoRef} src={src} preload="metadata" />` element — this lets useMediaSync own the audio element
- Create a container div ref for wavesurfer
- In a useEffect, create `WaveSurfer.create({ container, media: audioRef, height: 128, waveColor: 'rgba(0, 255, 209, 0.4)', progressColor: 'rgba(0, 255, 209, 0.8)', cursorColor: '#00ffd1', barWidth: 2, barGap: 1, barRadius: 2, backend: 'MediaElement' })`. The `media` option must point to the same `<audio>` element the ref points to.
- Clean up wavesurfer instance on unmount with `wavesurfer.destroy()`
- Style the container with class `audio-waveform`
5. In `frontend/src/pages/WatchPage.tsx`, import AudioWaveform. Compute `streamUrl` as `` `${BASE}/videos/${videoId}/stream` ``. Conditionally render: if `video.video_url` is null/undefined, render `<AudioWaveform src={streamUrl} mediaSync={mediaSync} />`, else render existing `<VideoPlayer>`.
6. Add CSS to `frontend/src/App.css` for `.audio-waveform` container: dark background matching `.video-player`, border-radius, padding, min-height.
7. Verify: `cd frontend && npx tsc --noEmit` passes with no errors.
## Must-Haves
- [ ] wavesurfer.js installed as dependency
- [ ] `useMediaSync` ref type widened to `HTMLMediaElement`
- [ ] `AudioWaveform` component renders wavesurfer waveform using stream URL
- [ ] Hidden `<audio>` element shared between useMediaSync and wavesurfer
- [ ] WatchPage renders AudioWaveform when `video_url` is null
- [ ] Dark-themed CSS for `.audio-waveform` container
## Verification
- `cd frontend && npx tsc --noEmit` exits 0
- `grep -q 'HTMLMediaElement' frontend/src/hooks/useMediaSync.ts`
- `grep -q 'wavesurfer' frontend/src/components/AudioWaveform.tsx`
- `grep -q 'AudioWaveform' frontend/src/pages/WatchPage.tsx`
## Inputs
- `frontend/src/hooks/useMediaSync.ts` — hook to widen ref type
- `frontend/src/components/VideoPlayer.tsx` — existing player, for reference and cast check
- `frontend/src/pages/WatchPage.tsx` — page to integrate AudioWaveform into
- `frontend/src/api/videos.ts` — `BASE` import for stream URL construction
- `frontend/src/App.css` — stylesheet to add waveform CSS
## Expected Output
- `frontend/src/components/AudioWaveform.tsx` — new wavesurfer-based audio waveform component
- `frontend/src/hooks/useMediaSync.ts` — ref type widened to `HTMLMediaElement`
- `frontend/src/pages/WatchPage.tsx` — conditional AudioWaveform rendering
- `frontend/src/App.css` — `.audio-waveform` CSS styles
- `frontend/package.json` — wavesurfer.js dependency added
## Verification Command

```shell
cd /home/aux/projects/content-to-kb-automator/frontend && npx tsc --noEmit && grep -q 'HTMLMediaElement' src/hooks/useMediaSync.ts && grep -q 'AudioWaveform' src/pages/WatchPage.tsx
```


@@ -0,0 +1,76 @@
---
estimated_steps: 42
estimated_files: 5
skills_used: []
---
# T03: Chapter markers on seek bar + waveform regions + integration CSS
Create a ChapterMarkers overlay component for the seek bar, add chapter region display in the waveform, load chapter data in WatchPage, and polish the integration CSS.
## Steps
1. Create `frontend/src/components/ChapterMarkers.tsx`:
- Props: `chapters: Chapter[]`, `duration: number`, `onSeek: (time: number) => void`
- Renders an absolutely-positioned overlay div (`.chapter-markers`) containing tick marks
- Each chapter renders a `.chapter-marker__tick` at `left: (chapter.start_time / duration) * 100%`
- Each tick has a hover tooltip (`.chapter-marker__tooltip`) showing the chapter title
- Clicking a tick calls `onSeek(chapter.start_time)`
- Guard: if duration is 0 or chapters is empty, render nothing
2. In `frontend/src/components/PlayerControls.tsx`:
- Add optional `chapters` prop: `chapters?: Chapter[]`
- Import `ChapterMarkers` and `Chapter` type
- Wrap the seek `<input>` in a `.player-controls__seek-container` div (position: relative)
- Render `<ChapterMarkers>` inside that container, passing `chapters`, `duration`, and `seekTo`
3. In `frontend/src/components/AudioWaveform.tsx`:
- Add optional `chapters` prop: `chapters?: Chapter[]`
- After wavesurfer is created, if chapters are provided, use wavesurfer's `RegionsPlugin` to add labeled read-only regions for each chapter. Import `RegionsPlugin` from `wavesurfer.js/dist/plugins/regions.esm.js` (or similar path). Create regions with `{ start, end, content: title, color: 'rgba(0, 255, 209, 0.1)', drag: false, resize: false }`.
4. In `frontend/src/pages/WatchPage.tsx`:
- Import `fetchChapters` and `Chapter` type from `../api/videos`
- Add `chapters` state: `useState<Chapter[]>([])`
- In the existing fetch useEffect, after fetching transcript, also call `fetchChapters(videoId)` and set chapters state (catch errors silently — chapters are non-critical)
- Pass `chapters` to both `<AudioWaveform chapters={chapters}>` and `<PlayerControls chapters={chapters}>`
5. Add CSS to `frontend/src/App.css`:
- `.player-controls__seek-container` — position: relative, flex: 1
- `.chapter-markers` — position: absolute, top: 0, left: 0, right: 0, bottom: 0, pointer-events: none
- `.chapter-marker__tick` — position: absolute, width: 3px, height: 100%, background: var(--accent), opacity: 0.6, pointer-events: all, cursor: pointer, transform: translateX(-50%)
- `.chapter-marker__tick:hover` — opacity: 1
- `.chapter-marker__tooltip` — position: absolute, bottom: 100%, left: 50%, transform: translateX(-50%), background: var(--color-bg-surface), color: var(--text-primary), padding: 4px 8px, border-radius: 4px, font-size: 0.75rem, white-space: nowrap, opacity: 0, pointer-events: none, transition: opacity 150ms
- `.chapter-marker__tick:hover .chapter-marker__tooltip` — opacity: 1
6. Verify: `cd frontend && npx tsc --noEmit` passes. Visual inspection: chapter ticks appear on seek bar.
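The tick-placement math from step 1, including the zero-duration guard, can be sketched as a pure helper (the function name is hypothetical, not from the codebase):

```typescript
// Hypothetical helper: percent offset of a chapter tick along the seek bar.
// Clamps the result so a malformed start_time can never place a tick
// outside the bar.
function tickLeftPercent(startTime: number, duration: number): number {
  if (duration <= 0) return 0; // mirrors the render-nothing guard in step 1
  return Math.min(100, Math.max(0, (startTime / duration) * 100));
}
```

A chapter starting at 30s in a 120s file would sit at `left: 25%`; keeping this as a pure function makes the overlay trivial to unit-test without a DOM.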
## Must-Haves
- [ ] ChapterMarkers component renders positioned ticks on the seek bar
- [ ] Tick hover shows chapter title tooltip
- [ ] Tick click seeks to chapter start_time
- [ ] AudioWaveform displays chapter regions via RegionsPlugin
- [ ] WatchPage fetches and distributes chapter data to both components
- [ ] CSS consistent with dark theme
## Verification
- `cd frontend && npx tsc --noEmit` exits 0
- `grep -q 'ChapterMarkers' frontend/src/components/PlayerControls.tsx`
- `grep -q 'fetchChapters' frontend/src/pages/WatchPage.tsx`
- `grep -q 'chapter-marker' frontend/src/App.css`
## Inputs
- `frontend/src/api/videos.ts` — `Chapter` type and `fetchChapters` function from T01
- `frontend/src/components/AudioWaveform.tsx` — waveform component from T02, to add regions to
- `frontend/src/components/PlayerControls.tsx` — seek bar to overlay chapter markers on
- `frontend/src/pages/WatchPage.tsx` — page composition from T02, to add chapter loading
- `frontend/src/App.css` — stylesheet from T02, to add chapter CSS
## Expected Output
- `frontend/src/components/ChapterMarkers.tsx` — new chapter marker overlay component
- `frontend/src/components/PlayerControls.tsx` — chapters prop + seek container wrapper
- `frontend/src/components/AudioWaveform.tsx` — RegionsPlugin chapter display added
- `frontend/src/pages/WatchPage.tsx` — chapter data loading and distribution
- `frontend/src/App.css` — chapter marker CSS styles
## Verification Command

```shell
cd /home/aux/projects/content-to-kb-automator/frontend && npx tsc --noEmit && grep -q 'ChapterMarkers' src/components/PlayerControls.tsx && grep -q 'fetchChapters' src/pages/WatchPage.tsx && grep -q 'chapter-marker' src/App.css
```


@@ -1,17 +1,22 @@
"""Source video endpoints for Chrysopedia API."""
import logging
import mimetypes
import os.path
import uuid
from typing import Annotated
from fastapi import APIRouter, Depends, HTTPException, Query
from fastapi.responses import FileResponse
from sqlalchemy import func, select
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import selectinload
from database import get_session
from models import SourceVideo, TranscriptSegment
from models import KeyMoment, SourceVideo, TranscriptSegment
from schemas import (
ChapterMarkerRead,
ChaptersResponse,
SourceVideoDetail,
SourceVideoRead,
TranscriptForPlayerResponse,
@@ -102,3 +107,60 @@ async def get_video_transcript(
segments=[TranscriptSegmentRead.model_validate(s) for s in segments],
total=len(segments),
)
@router.get("/{video_id}/stream")
async def stream_video(
video_id: uuid.UUID,
db: AsyncSession = Depends(get_session),
) -> FileResponse:
"""Serve the media file at SourceVideo.file_path.
Returns 404 if the video record is missing, file_path is unset,
or the file does not exist on disk.
"""
stmt = select(SourceVideo).where(SourceVideo.id == video_id)
result = await db.execute(stmt)
video = result.scalar_one_or_none()
if video is None:
raise HTTPException(status_code=404, detail="Video not found")
if not video.file_path or not os.path.isfile(video.file_path):
raise HTTPException(status_code=404, detail="Media file not found on disk")
media_type, _ = mimetypes.guess_type(video.file_path)
if media_type is None:
media_type = "application/octet-stream"
logger.debug("Streaming %s (%s) for video %s", video.file_path, media_type, video_id)
return FileResponse(
video.file_path,
media_type=media_type,
filename=video.filename,
)
@router.get("/{video_id}/chapters", response_model=ChaptersResponse)
async def get_video_chapters(
video_id: uuid.UUID,
db: AsyncSession = Depends(get_session),
) -> ChaptersResponse:
"""Return KeyMoment records for a video as chapter markers, sorted by start_time."""
# Verify video exists
video_stmt = select(SourceVideo.id).where(SourceVideo.id == video_id)
video_result = await db.execute(video_stmt)
if video_result.scalar_one_or_none() is None:
raise HTTPException(status_code=404, detail="Video not found")
stmt = (
select(KeyMoment)
.where(KeyMoment.source_video_id == video_id)
.order_by(KeyMoment.start_time)
)
result = await db.execute(stmt)
moments = result.scalars().all()
logger.debug("Chapters for %s: %d key moments", video_id, len(moments))
return ChaptersResponse(
video_id=video_id,
chapters=[ChapterMarkerRead.model_validate(m) for m in moments],
)


@@ -659,3 +659,22 @@ class CreatorDashboardResponse(BaseModel):
search_impressions: int = 0
techniques: list[CreatorDashboardTechnique] = Field(default_factory=list)
videos: list[CreatorDashboardVideo] = Field(default_factory=list)
# ── Chapter Markers (for media player timeline) ─────────────────────────────
class ChapterMarkerRead(BaseModel):
"""A chapter marker derived from a KeyMoment for the player timeline."""
model_config = ConfigDict(from_attributes=True)
id: uuid.UUID
title: str
start_time: float
end_time: float
content_type: str
class ChaptersResponse(BaseModel):
"""Chapters (KeyMoments) for a video, sorted by start_time."""
video_id: uuid.UUID
chapters: list[ChapterMarkerRead] = Field(default_factory=list)


@@ -44,3 +44,24 @@ export function fetchTranscript(videoId: string): Promise<TranscriptResponse> {
`${BASE}/videos/${encodeURIComponent(videoId)}/transcript`,
);
}
// ── Chapters (KeyMoments as timeline markers) ────────────────────────────────
export interface Chapter {
id: string;
title: string;
start_time: number;
end_time: number;
content_type: string;
}
export interface ChaptersResponse {
video_id: string;
chapters: Chapter[];
}
export function fetchChapters(videoId: string): Promise<ChaptersResponse> {
return request<ChaptersResponse>(
`${BASE}/videos/${encodeURIComponent(videoId)}/chapters`,
);
}