docs: M024 feature documentation — shorts publishing, embed support, timeline pins, auto-captioning, templates, citation UX

Updated 7 wiki pages:
- Home.md: M024 feature sections for shorts/publishing and player/UX
- Data-Model.md: share_token, captions_enabled, shorts_template columns + migrations 026-028
- API-Surface.md: public shorts endpoint, template admin endpoints, captions param
- Frontend.md: ShortPlayer, EmbedPlayer, shared utilities, template config panel
- Pipeline.md: caption_generator, card_renderer, shorts_generator updates
- Player.md: key moment pins, inline technique player, embed player
- Chat-Engine.md: citation metadata propagation, shared parseChatCitations
jlightner 2026-04-04 06:55:06 -05:00
parent d1bd67a9eb
commit 0451abf62d
7 changed files with 170 additions and 11 deletions

@ -1,8 +1,8 @@
# API Surface
71 API endpoints grouped by domain. All served by FastAPI under `/api/v1/`.
74 API endpoints grouped by domain. All served by FastAPI under `/api/v1/`.
## Public Endpoints (10)
## Public Endpoints (11)
| Method | Path | Response Shape | Notes |
|--------|------|---------------|-------|
@ -16,6 +16,7 @@
| GET | `/api/v1/techniques/{slug}` | 22-field object | Full technique detail with relations |
| GET | `/api/v1/techniques/{slug}/versions` | `{items, total}` | Version history |
| GET | `/api/v1/techniques/{slug}/versions/{n}` | Version detail | Single version |
| GET | `/api/v1/public/shorts/{share_token}` | `{format, dimensions, duration, creator_name, highlight_title, download_url}` | Unauthenticated. Returns metadata + presigned MinIO URL. 404 for missing/non-complete shorts (M024/S01) |
### Technique Detail Fields (22)
@ -161,10 +162,17 @@ All under prefix `/api/v1/admin/pipeline/`.
| Method | Path | Purpose |
|--------|------|---------|
| POST | `/api/v1/admin/shorts/{highlight_id}/generate` | Queue short generation for approved highlight (3 presets) — 202 response |
| POST | `/api/v1/admin/shorts/{highlight_id}/generate` | Queue short generation for approved highlight (3 presets). Accepts `captions` boolean param (M024/S04) — 202 response |
| GET | `/api/v1/admin/shorts/{highlight_id}` | List generated shorts for a highlight with per-preset status |
| GET | `/api/v1/admin/shorts/{short_id}/download` | Presigned MinIO download URL for completed short |
### Shorts Template Admin (2) — M024/S04
| Method | Path | Purpose |
|--------|------|---------|
| GET | `/api/v1/admin/creators/{id}/shorts-template` | Returns JSONB template config for creator |
| PUT | `/api/v1/admin/creators/{id}/shorts-template` | Updates template config with validation (hex color, duration 1.05.0) |
## Other Endpoints (2)
| Method | Path | Notes |

@ -36,7 +36,7 @@ The chat endpoint returns a `text/event-stream` response with four event types i
| Event | Payload | When |
|-------|---------|------|
| `sources` | `[{title, slug, creator_name, summary}]` | First — citation metadata for link rendering |
| `sources` | `[{title, slug, creator_name, summary, source_video_id, start_time, end_time, video_filename}]` | First — citation metadata for link rendering. Video fields added in M024/S05 for timestamp badges. |
| `token` | `string` (text chunk) | Repeated — streamed LLM completion tokens |
| `done` | `{cascade_tier, conversation_id}` | Once — signals completion, includes retrieval tier and conversation ID |
| `error` | `{message: string}` | On failure — emitted if LLM errors mid-stream |
@ -67,7 +67,9 @@ The LLM is instructed to reference sources using numbered citations `[N]` in its
- `[1]` → links to `/techniques/:slug` for the corresponding source
- Multiple citations supported: `[1][3]` or `[1,3]`
- Citation regex: `/\[(\d+)\]/g` parsed locally in both ChatPage and ChatWidget
- Citation regex: `/\[(\d+)\]/g` parsed by shared `parseChatCitations()` utility (M024/S05)
- **Timestamp badges** — when `start_time` is defined, source cards show a badge linking to `/watch/:id?t=N` (M024/S05)
- **Video filename** — displayed as subtle metadata on source cards (M024/S05)
## API Endpoint
@ -207,6 +209,22 @@ Linear: `temperature = 0.3 + weight * 0.2`
If the creator has no `personality_profile` (null JSONB), the system falls back to pure encyclopedic mode regardless of weight value. DB errors during profile fetch are caught and logged — never crash the stream.
## Citation Metadata Propagation (M024/S05)
End-to-end video metadata flow from search results through SSE to frontend source cards.
### Backend
- `search_service.py``_enrich_qdrant_results()` and `_keyword_search_and()` now batch-fetch `SourceVideo` filenames and include `source_video_id`, `start_time`, `end_time`, `video_filename` in result dicts. Non-key_moment types get empty/None values for uniform dict shape.
- `chat_service.py``_build_sources()` passes all four video fields through to SSE source events.
### Frontend
- `ChatSource` interface (in `api/chat.ts`) extended with `source_video_id`, `start_time`, `end_time`, `video_filename` fields.
- `utils/chatCitations.tsx` — shared `parseChatCitations()` replaces duplicate implementations in ChatPage and ChatWidget. Accepts CSS module styles as `Record<string, string>`.
- `utils/formatTime.ts` — shared hour-aware time formatter used across timestamp badges and player controls.
- Source cards now show: timestamp badge (links to `/watch/:id?t=N` when `start_time` defined) and video filename metadata.
## Key Files
- `backend/chat_service.py` — ChatService with history load/save, retrieve-prompt-stream pipeline
@ -217,6 +235,8 @@ If the creator has no `personality_profile` (null JSONB), the system falls back
- `frontend/src/pages/ChatPage.module.css` — Conversation bubble layout styles
- `frontend/src/components/ChatWidget.tsx` — Floating chat widget component
- `frontend/src/components/ChatWidget.module.css` — Widget styles (38 custom property refs)
- `frontend/src/utils/chatCitations.tsx` — Shared citation parser (M024/S05)
- `frontend/src/utils/formatTime.ts` — Shared time formatter (M024/S05)
## Design Decisions
@ -226,7 +246,7 @@ If the creator has no `personality_profile` (null JSONB), the system falls back
- **Client-side suggested questions** — Generated from technique titles/categories without API call
- **5-tier interpolation** — Progressive field inclusion replaces 3-tier step function (D044 supersedes D043)
- **5-tier interpolation** — Progressive field inclusion replaces 3-tier step function (D044 supersedes D043)
- **Citation parsing duplicated** — ChatPage and ChatWidget each parse citations independently (extracted utility deferred)
- **Shared citation parsing** — `parseChatCitations()` in `utils/chatCitations.tsx` replaces duplicate implementations (M024/S05)
- **Standalone ASGI test client** — Tests use mocked DB to avoid PostgreSQL dependency
---

@ -41,6 +41,7 @@ HighlightCandidate (1) ──→ (N) GeneratedShort
| social_links | JSONB | Platform → URL mapping |
| featured | Boolean | For homepage spotlight |
| personality_profile | JSONB | LLM-extracted personality data (M022/S06). See [[Personality-Profiles]] |
| shorts_template | JSONB | Nullable — intro/outro card config with `parse_template_config` normalizer (M024/S04) |
### SourceVideo
@ -238,6 +239,9 @@ Append-only versioned record of per-field consent changes.
| 023 | Add personality_profile JSONB to creators (M022/S06) |
| 024 | Add posts and post_attachments tables (M023/S01) |
| 025 | Add generated_shorts table with FormatPreset and ShortStatus enums (M023/S03) |
| 026 | Add share_token to generated_shorts + backfill existing complete shorts + unique index (M024/S01) |
| 027 | Add captions_enabled boolean to generated_shorts (M024/S04) |
| 028 | Add shorts_template JSONB to creators (M024/S04) |
## Schema Notes
@ -288,5 +292,7 @@ Append-only versioned record of per-field consent changes.
| file_size | Integer | Nullable — bytes |
| duration_secs | Float | Nullable |
| error_message | Text | Nullable — on failure |
| share_token | String(16) | Nullable, unique-indexed — generated via `secrets.token_urlsafe(8)` at pipeline completion (M024/S01) |
| captions_enabled | Boolean | Default false — indicates successful ASS subtitle generation (M024/S04) |
| created_at | Timestamp | |
| updated_at | Timestamp | |

@ -23,6 +23,8 @@ React 18 + TypeScript + Vite SPA. No UI library, no state management library, no
| `/creator/posts` | PostsList | Creator JWT | Creator post management — status badges, edit/delete (M023/S01) |
| `/creator/posts/new` | PostEditor | Creator JWT | Tiptap rich text editor with file attachments (M023/S01) |
| `/creator/posts/:postId/edit` | PostEditor | Creator JWT | Edit existing post (M023/S01) |
| `/shorts/:token` | ShortPlayer | Public | Public short player outside ProtectedRoute — fetches via unauthenticated API (M024/S01) |
| `/embed/:videoId` | EmbedPlayer | Public | Chrome-free embed player, registered before AppShell catch-all (M024/S03) |
| `*` | → Redirect `/` | — | SPA fallback |
*Admin routes have no authentication gate.
@ -111,7 +113,7 @@ Creator post management at `/creator/posts`.
- **SidebarNav integration** — accessible from creator dashboard
- **Files:** `PostsList.tsx`, `PostsList.module.css`
### Shorts UI (M023/S03)
### Shorts UI (M023/S03, M024/S01, S04)
Shorts generation controls in HighlightQueue.
@ -119,8 +121,32 @@ Shorts generation controls in HighlightQueue.
- **Per-preset status badges** — color-coded pending/processing/complete/failed with pulsing animation
- **Download links** — open presigned MinIO URLs in new tab
- **5s polling** — while any shorts are processing, auto-stops when all settle
- **Share link + embed code copy buttons** — 🔗 and 📋 icons on completed shorts with `share_token` (M024/S01)
- **Collapsible template config panel** — intro/outro text, duration sliders, show/hide toggles, color picker, font selection (M024/S04)
- **Per-highlight captions toggle** — checkbox to enable ASS subtitle generation (M024/S04)
- **Files:** `HighlightQueue.tsx`, `HighlightQueue.module.css` (updated)
### ShortPlayer (M024/S01)
Public short video player at `/shorts/:token`.
- **Unauthenticated** — fetches via `fetchPublicShort()` (raw `fetch()`, no auth token injection)
- **Video rendering**`<video>` element with presigned MinIO URL
- **Creator/highlight metadata** — format, dimensions, duration, creator name, title
- **Share/embed buttons** — copy share URL and iframe embed snippet to clipboard
- **Only renders share/embed** when `share_token` is non-null (graceful for pre-migration shorts)
- **Files:** `ShortPlayer.tsx`, `ShortPlayer.module.css`
### EmbedPlayer (M024/S03)
Chrome-free embed player at `/embed/:videoId`.
- **No header/nav/footer** — registered at top-level Routes before AppShell catch-all
- **Content-type-aware height** — 120px for audio-only, 405px for video
- **"Powered by Chrysopedia" branding** — link opens origin in new tab with `noopener`
- **Code-split** — lazy-loaded via `React.lazy` + `Suspense`
- **Files:** `EmbedPlayer.tsx`, `EmbedPlayer.module.css`
### Personality Slider (M023/S02, S04)
ChatWidget personality control.
@ -131,6 +157,14 @@ ChatWidget personality control.
- **Labels match backend 5-tier interpolation boundaries** (0.2/0.4/0.6/0.8)
- **Files:** `ChatWidget.tsx`, `ChatWidget.module.css` (updated)
## Shared Utilities (M024)
| Utility | Path | Purpose |
|---------|------|---------|
| copyToClipboard | `utils/clipboard.ts` | Shared clipboard utility — `navigator.clipboard` with `document.execCommand` fallback (M024/S03) |
| parseChatCitations | `utils/chatCitations.tsx` | Shared citation parser replacing duplicate implementations in ChatPage and ChatWidget (M024/S05) |
| formatTime | `utils/formatTime.ts` | Shared hour-aware time formatter replacing duplicated implementations across 4+ files (M024/S05) |
## Hooks
| Hook | Purpose |
@ -152,7 +186,8 @@ API modules:
- `auth.ts` — authentication + impersonation functions
- `highlights.ts`
- `posts.ts` — post CRUD and file upload/download functions (M023/S01)
- `shorts.ts` — short generation, listing, and download functions (M023/S03) — creator highlight review functions (M022/S01)
- `shorts.ts` — short generation, listing, and download functions (M023/S03); `fetchPublicShort` for unauthenticated access (M024/S01) — creator highlight review functions (M022/S01)
- `templates.ts` — shorts template CRUD API client (M024/S04)
- `follows.ts` — follow/unfollow/status/list functions (M022/S02)
- `creators.ts` — creator detail with personality_profile and follower_count types (M022/S02, S06)

15
Home.md

@ -25,7 +25,7 @@ Producers can search for specific techniques and find timestamped key moments, s
- **Technique Pages** — LLM-synthesized study guides with v2 body sections, signal chains, citations
- **Search** — LightRAG primary + Qdrant fallback with 4-tier creator-scoped cascade
- **Pipeline** — 6-stage LLM extraction (transcripts → key moments → classification → synthesis → embedding)
- **Player** — Audio player with chapter markers
- **Player** — Audio/video player with chapter markers, key moment timeline pins, and inline technique page player
### Creator Tools
- **Follow System** — User-to-creator follows with follower counts (M022)
@ -39,6 +39,17 @@ Producers can search for specific techniques and find timestamped key moments, s
- **Multi-Turn Chat Memory** — Redis-backed conversation history with conversation_id threading (M022)
- **Creator Dashboard** — Video management, chapter editing, consent controls
### Shorts & Publishing (M024)
- **Shorts Publishing Flow** — Public shareable URLs via `/shorts/{token}`, share_token generated at pipeline completion (M024/S01)
- **Embed Support** — Chrome-free `/embed/:videoId` player, iframe snippet with audio-aware height (120px audio, 405px video) (M024/S03)
- **Auto-Captioning** — ASS karaoke subtitles from Whisper word-level timings with `\k` tags for word-by-word highlighting (M024/S04)
- **Template System** — Creator-configurable intro/outro cards via admin API, ffmpeg lavfi card rendering, concat demuxer assembly (M024/S04)
### Player & UX (M024)
- **Key Moment Timeline Pins** — 12px color-coded circle pins (technique=cyan, settings=amber, reasoning=purple, workflow=green) with active-state 1.3x highlighting and touch-friendly hit areas (M024/S02)
- **Inline Collapsible Player** — On technique pages between summary and body, with bibliography seek wiring and multi-source-video selector (M024/S02)
- **Citation UX** — Timestamp badge links to `/watch/:id?t=N`, video filename on source cards, shared `chatCitations.tsx` + `formatTime.ts` utilities (M024/S05)
### Platform
- **Authentication** — JWT with invite codes, admin/creator roles
- **Consent System** — Per-video granular consent with audit trail
@ -67,4 +78,4 @@ Producers can search for specific techniques and find timestamped key moments, s
---
*Last updated: 2026-04-04 — M023 post editor, file sharing, shorts pipeline, personality interpolation
*Last updated: 2026-04-04 — M024 shorts publishing, embed support, timeline pins, auto-captioning, templates, citation UX

@ -96,6 +96,36 @@ CLI tool (`python -m pipeline.quality`) with:
- **Automated prompt A/B optimization loop** — LLM-powered variant generation, iterative scoring, leaderboard
- **Multi-stage support** for pipeline stages 2-5 with per-stage rubrics and fixtures
## Shorts Pipeline Modules (M024/S04)
### caption_generator.py
Converts Whisper word-level timings to ASS (Advanced SubStation Alpha) subtitles with karaoke highlighting.
- `generate_ass_captions()` — main entry point, produces ASS file with `\k` tags for word-by-word highlighting
- Clip-relative timing — offsets word timestamps by clip_start for accurate subtitle sync
- **Non-blocking** — failures log WARNING, never fail the parent short generation stage
- 17 unit tests covering time formatting, ASS structure, clip offset math, karaoke duration, empty/whitespace handling, custom styles, negative time clamping
### card_renderer.py
ffmpeg-based intro/outro card generation and concatenation pipeline.
- `render_card()` — builds lavfi command using `color` + `drawtext` filters
- `render_card_to_file()` — executes ffmpeg to produce card video segment
- `build_concat_list()` — writes file manifest for ffmpeg concat demuxer
- `concat_segments()` — runs ffmpeg concat demuxer with `-c copy`
- `parse_template_config()` — JSONB normalizer with defaults for missing/null fields
- Cards include silent audio track via `anullsrc` for codec-compatible concat with audio main clips
- **Non-blocking** — card render failures log WARNING, shorts proceed without intro/outro
- 28 unit tests
### shorts_generator.py Updates (M024/S01, S04)
- `extract_clip()` accepts optional `ass_path` for subtitle burn-in via ffmpeg `-vf ass=` filter
- `extract_clip_with_template()` — orchestrates intro/main/outro concatenation using card_renderer
- `stage_generate_shorts` now: loads transcripts for captions, loads creator templates for cards, generates `share_token` on completion via `secrets.token_urlsafe(8)`
## Key Design Decisions
- **Sync clients in Celery** (D004): openai.OpenAI, QdrantClient, sync SQLAlchemy. Avoids nested event loop errors.

@ -103,12 +103,61 @@ with chapter title
- `wavesurfer.js` — waveform rendering (~200KB, loaded only in audio mode)
- `useMediaSync` hook widened from `HTMLVideoElement` to `HTMLMediaElement` for audio/video polymorphism
## Key Moment Timeline Pins (M024/S02)
ChapterMarkers upgraded from thin 3px line ticks to 12px color-coded circle pins.
### Pin Colors by Content Type
| Content Type | CSS Custom Property | Color |
|-------------|-------------------|-------|
| technique | `--color-pin-technique` | Cyan |
| settings | `--color-pin-settings` | Amber |
| reasoning | `--color-pin-reasoning` | Purple |
| workflow | `--color-pin-workflow` | Green |
### Active State
When playback is within a key moment's time range, the corresponding pin scales to 1.3x. `PlayerControls` passes `currentTime` to `ChapterMarkers` for active detection.
### Touch Targets
`::before` pseudo-element with `inset:-6px` provides touch-friendly hit areas without enlarging the visual pin.
### Tooltips
Enriched: title + formatted time range + content type label.
## Inline Player on Technique Pages (M024/S02)
Collapsible video player on `TechniquePage`, positioned between summary and body sections.
- **Collapse/expand animation** — CSS `grid-template-rows: 0fr/1fr` transition
- **Multi-source-video selector** — dropdown for technique pages sourced from multiple videos
- **Chapter pins** — key moments rendered as pin markers on the inline seek bar
- **Bibliography seek wiring** — time links render as seek buttons (calling `mediaSync.seekTo`) when the inline player is active for the matching video, or as `<Link>` to WatchPage otherwise
- Uses existing `useMediaSync` hook, `VideoPlayer`, and `PlayerControls` components
## Embed Player (M024/S03)
Chrome-free player at `/embed/:videoId` for iframe embedding.
- **No header/nav/footer** — route registered at top-level `Routes` in `App.tsx` before AppShell catch-all
- **Content-type-aware height** — video: 405px, audio-only: 120px
- **"Powered by Chrysopedia" branding** — link opens origin in new tab with `rel="noopener"`
- **Dark full-viewport layout** — matches existing player theme
- **Files:** `EmbedPlayer.tsx`, `EmbedPlayer.module.css`
## Key Files
- `frontend/src/pages/WatchPage.tsx` — Page component
- `frontend/src/components/VideoPlayer.tsx` — Video element + HLS setup
- `frontend/src/components/PlayerControls.tsx` — Play/pause, speed, volume, seek bar
- `frontend/src/components/PlayerControls.tsx` — Play/pause, speed, volume, seek bar, currentTime pass-through to ChapterMarkers
- `frontend/src/components/ChapterMarkers.tsx` — 12px color-coded circle pins with active state and touch targets (M024/S02)
- `frontend/src/components/TranscriptSidebar.tsx` — Synchronized transcript display
- `frontend/src/components/AudioWaveform.tsx` — Waveform visualization for audio content (M021/S05)
- `frontend/src/pages/EmbedPlayer.tsx` — Chrome-free embed player (M024/S03)
- `frontend/src/pages/TechniquePage.tsx` — Inline collapsible player with bibliography seek (M024/S02)
- `frontend/src/hooks/useMediaSync.ts` — Shared playback state hook
- `backend/routers/videos.py` — Video detail + transcript API