M021: Chat engine, retrieval cascade, highlights, audio mode, chapters, impersonation write mode docs

2026-04-04 06:40:26 +00:00 · 2026-04-04 06:40:26 +00:00 · eec99b6c7d
commit eec99b6c7d
parent be05f5edf2
13 changed files with 533 additions and 7 deletions
--- a/API-Surface.md
+++ b/API-Surface.md
@ -8,7 +8,7 @@
 |--------|------|---------------|-------|
 | GET | `/health` | `{status, service, version, database}` | Health check |
 | GET | `/api/v1/stats` | `{technique_count, creator_count}` | Homepage stats |
-| GET | `/api/v1/search?q=` | `{items, partial_matches, total, query, fallback_used}` | Semantic + keyword fallback (D009) |
+| GET | `/api/v1/search?q=&creator=` | `{items, partial_matches, total, query, fallback_used, cascade_tier}` | LightRAG primary + Qdrant fallback, optional creator cascade (D039, D040) |
 | GET | `/api/v1/search/suggestions?q=` | `{suggestions: [{text, type}]}` | Typeahead autocomplete |
 | GET | `/api/v1/search/popular` | `{items: [{query, count}]}` | Popular searches (D025) |
 | GET | `/api/v1/techniques?limit=&offset=` | `{items, total, offset, limit}` | Paginated technique list |
@ -139,3 +139,50 @@ JWT-based authentication added in M019. See [[Authentication]] for full details.
 ---
 *See also: [[Architecture]], [[Data-Model]], [[Frontend]], [[Authentication]]*
 utput` | Delete all pipeline output |
 | POST | `/admin/pipeline/optimize-prompt` | Trigger prompt optimization |
 | POST | `/admin/pipeline/reindex-all` | Rebuild Qdrant index |
 | GET | `/admin/pipeline/worker-status` | Celery worker health |
 | GET | `/admin/pipeline/recent-activity` | Recent pipeline events |
 | POST | `/admin/pipeline/creator-profile/{creator_id}` | Update creator profile |
 | POST | `/admin/pipeline/avatar-fetch/{creator_id}` | Fetch creator avatar |
 ## Other Endpoints (2)
 | Method | Path | Notes |
 |--------|------|-------|
 | POST | `/api/v1/ingest` | Transcript upload |
 | GET | `/api/v1/videos` | ⚠️ Bare list (not paginated) |
 ## Response Conventions
 **Standard paginated response:**
 ```json
 {
  "items": [...],
  "total": 83,
  "offset": 0,
  "limit": 20
 }
 ```
 **Known inconsistencies:**
 - `GET /topics` returns bare list instead of paginated dict
 - `GET /videos` returns bare list instead of paginated dict
 - Search uses `items` key (not `results`)
 - `/techniques/random` returns JSON `{slug}` (not HTTP redirect)
 **New endpoints should follow the `{items, total, offset, limit}` paginated pattern.**
 ## Authentication
 JWT-based authentication added in M019. See [[Authentication]] for full details.
 - **Public endpoints** (search, browse, techniques) require no auth
 - **Auth endpoints** (`/auth/register`, `/auth/login`) are open; `/auth/me` requires Bearer JWT
 - **Consent endpoints** require Bearer JWT with ownership verification (creator must own the video, or be admin)
 - **Admin endpoints** (`/admin/*`) are accessible to anyone with network access (auth planned for future milestone)
 ---
 *See also: [[Architecture]], [[Data-Model]], [[Frontend]], [[Authentication]]*
--- a/Architecture.md
+++ b/Architecture.md
@ -2,7 +2,7 @@
 ## System Overview
-Chrysopedia is a self-hosted music production knowledge base that synthesizes technique articles from video transcripts using a 6-stage LLM pipeline. It runs as a Docker Compose stack on `ub01` with 11 containers.
+Chrysopedia is a self-hosted music production knowledge base that synthesizes technique articles from video transcripts using a 7-stage LLM pipeline. It runs as a Docker Compose stack on `ub01` with 11 containers.
 ```
 ┌─────────────────────────────────────────────────────────────────┐
@ -94,3 +94,11 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
 ---
 *See also: [[Deployment]], [[Pipeline]], [[Data-Model]], [[Authentication]]*
 → PostgreSQL, embeddings → Qdrant
 5. **Serving:** React SPA fetches from FastAPI, search queries hit Qdrant then PostgreSQL fallback
 6. **Auth:** JWT-protected endpoints for creator consent management and admin features (see [[Authentication]])
 ---
 *See also: [[Deployment]], [[Pipeline]], [[Data-Model]], [[Authentication]]*
--- a/Chat-Engine.md
+++ b/Chat-Engine.md
@ -0,0 +1,115 @@
 # Chat Engine
 Streaming question-answering interface backed by LightRAG retrieval and LLM completion. Added in M021/S03.
 ## Architecture
 ```
 User types question in ChatPage
        │
        ▼
 POST /api/v1/chat  { query: "...", creator?: "..." }
        │
        ▼
 ChatService.stream(query, creator?)
        │
        ├─ 1. Retrieve: SearchService.search(query, creator)
        │     └─ Uses 4-tier cascade if creator provided (see [[Search-Retrieval]])
        │
        ├─ 2. Prompt: Assemble numbered context block into encyclopedic system prompt
        │     └─ Sources formatted as [1] Title — Summary for citation mapping
        │
        ├─ 3. Stream: openai.AsyncOpenAI with stream=True
        │     └─ Tokens streamed as SSE events in real-time
        │
        ▼
 SSE response → ChatPage renders tokens + citation links
 ```
 ## SSE Protocol
 The chat endpoint returns a `text/event-stream` response with four event types. On the success path `sources`, `token`, and `done` arrive in that order; `error` is emitted only on failure:
 | Event | Payload | When |
 |-------|---------|------|
 | `sources` | `[{title, slug, creator_name, summary}]` | First — citation metadata for link rendering |
 | `token` | `string` (text chunk) | Repeated — streamed LLM completion tokens |
 | `done` | `{cascade_tier: "creator"\|"domain"\|"global"\|"none"\|""}` | Once — signals completion, includes which retrieval tier answered |
 | `error` | `{message: string}` | On failure — emitted if LLM errors mid-stream |
 The `cascade_tier` in the `done` event reveals which tier of the retrieval cascade served the context (see [[Search-Retrieval]]).
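A minimal sketch of how the four event types could be framed and read back, assuming standard `event:` / `data:` SSE framing with JSON payloads. The helper names `format_sse` and `parse_sse` are illustrative and are not taken from the codebase.

```python
import json
from typing import Any, Iterator, Tuple


def format_sse(event: str, data: Any) -> str:
    """Frame one event as an SSE message (JSON payloads are an assumption)."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def parse_sse(stream: str) -> Iterator[Tuple[str, Any]]:
    """Yield (event, payload) pairs from a complete SSE text stream."""
    for block in stream.split("\n\n"):
        event, data_lines = "message", []
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            yield event, json.loads("\n".join(data_lines))


# Illustrative stream in the documented success-path order.
wire = (
    format_sse("sources", [{"title": "Sidechain Basics", "slug": "sidechain-basics"}])
    + format_sse("token", "Sidechain ")
    + format_sse("token", "compression [1]")
    + format_sse("done", {"cascade_tier": "global"})
)
events = list(parse_sse(wire))
```

A client built this way can dispatch on the event name, which is roughly what the typed callbacks in `streamChat` do on the frontend.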
 ## Citation Format
 The LLM is instructed to reference sources using numbered citations `[N]` in its response. The frontend parses these into superscript links:
 - `[1]` → links to `/techniques/:slug` for the corresponding source
 - Multiple citations supported: `[1][3]` or `[1,3]`
 - Citation regex: `/\[(\d+)\]/g` parsed locally in ChatPage
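The citation mapping can be sketched in Python as follows. This is a transcription of the documented regex for illustration only: the real parser lives in the TypeScript ChatPage, and the `<sup><a>` markup and the `link_citations` name are assumptions. Note that the pattern as documented matches adjacent markers such as `[1][3]`; a comma form like `[1,3]` would need a broader pattern.

```python
import re
from typing import List

# Python form of the documented /\[(\d+)\]/g pattern.
CITATION_RE = re.compile(r"\[(\d+)\]")


def link_citations(text: str, slugs: List[str]) -> str:
    """Replace [N] markers with superscript links; leave out-of-range markers as-is."""

    def repl(match: "re.Match[str]") -> str:
        n = int(match.group(1))
        if 1 <= n <= len(slugs):
            # Citations are 1-based; the sources list is 0-based.
            return f'<sup><a href="/techniques/{slugs[n - 1]}">{n}</a></sup>'
        return match.group(0)

    return CITATION_RE.sub(repl, text)


html = link_citations(
    "Use a fast attack [1][2] and see [9].",
    ["fast-attack", "ratio-basics"],
)
```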
 ## API Endpoint
 ### POST /api/v1/chat
 | Field | Type | Required | Validation |
 |-------|------|----------|------------|
 | `query` | string | Yes | 1–1000 characters |
 | `creator` | string | No | Creator UUID or slug for scoped retrieval |
 **Response:** `text/event-stream` (SSE)
 **Error responses:**
 - `422` — Empty or missing query, query exceeds 1000 chars
 ## Backend: ChatService
 Located in `backend/chat_service.py`. The retrieve-prompt-stream pipeline:
 1. **Retrieve** — Calls `SearchService.search()` with the query and optional creator parameter. Gets back ranked technique page results with the cascade_tier.
 2. **Prompt** — Builds a numbered context block from search results. System prompt instructs the LLM to act as a music production encyclopedia, cite sources with `[N]` notation, and stay grounded in the provided context.
 3. **Stream** — Opens an async streaming completion via `openai.AsyncOpenAI` (configured to point at DGX Sparks Qwen or local Ollama). Yields SSE events as tokens arrive.
 Error handling: If the LLM fails mid-stream (after some tokens have been sent), an `error` event is emitted so the frontend can display a failure message rather than leaving the response hanging.
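The event ordering and the mid-stream error path can be sketched with an async generator. This is an assumption-laden illustration: `fake_llm_tokens` stands in for the `openai.AsyncOpenAI` streaming call, `stream_chat` is a hypothetical name, and whether the real service suppresses `done` after an `error` is not stated in the source.

```python
import asyncio
from typing import AsyncIterator, List, Optional


async def fake_llm_tokens(fail_after: Optional[int] = None) -> AsyncIterator[str]:
    """Stand-in for a streaming completion; raises mid-stream when asked to."""
    for i, tok in enumerate(["Parallel ", "compression ", "[1]"]):
        if fail_after is not None and i == fail_after:
            raise RuntimeError("upstream LLM error")
        yield tok


async def stream_chat(sources: list, tier: str, fail_after: Optional[int] = None):
    """Yield (event, payload) pairs: sources first, then tokens, then done."""
    yield "sources", sources
    try:
        async for tok in fake_llm_tokens(fail_after):
            yield "token", tok
    except Exception as exc:
        # A failure after some tokens were sent becomes an error event.
        yield "error", {"message": str(exc)}
        return  # assumed: no done event after an error
    yield "done", {"cascade_tier": tier}


async def collect(**kwargs) -> List[str]:
    return [event async for event, _ in stream_chat(**kwargs)]


ok = asyncio.run(collect(sources=[], tier="creator"))
failed = asyncio.run(collect(sources=[], tier="creator", fail_after=1))
```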
 ## Frontend: ChatPage
 Route: `/chat` (lazy-loaded, code-split)
 ### Components
 - **Text input + submit button** — Query entry with Enter-to-submit
 - **Streaming message display** — Accumulates tokens with blinking cursor animation during streaming
 - **Citation markers** — `[N]` parsed to superscript links targeting `/techniques/:slug`
 - **Source list** — Numbered sources with creator attribution displayed below the response
 - **States:** Loading (streaming indicator), error (message display), empty (placeholder prompt)
 ### SSE Client
 Located in `frontend/src/api/chat.ts`. Uses `fetch()` + `ReadableStream` with typed callbacks:
 ```typescript
 streamChat(query, creator?, {
  onSources: (sources) => void,
  onToken: (token) => void,
  onDone: (data) => void,
  onError: (error) => void,
 })
 ```
 ## Key Files
 - `backend/chat_service.py` — ChatService retrieve-prompt-stream pipeline
 - `backend/routers/chat.py` — POST /api/v1/chat endpoint
 - `frontend/src/api/chat.ts` — SSE client utility
 - `frontend/src/pages/ChatPage.tsx` — Chat UI page component
 - `frontend/src/pages/ChatPage.module.css` — Chat page styles
 ## Design Decisions
 - **Standalone ASGI test client pattern** — Tests use mocked DB to avoid PostgreSQL dependency, enabling fast CI runs
 - **Patch `openai.AsyncOpenAI` constructor** rather than instance attribute for reliable test mocking
 - **Local citation regex** in ChatPage rather than importing from utils — link targets differ from technique page citations
 ---
 *See also: [[Search-Retrieval]], [[API-Surface]], [[Frontend]]*
--- a/Data-Model.md
+++ b/Data-Model.md
@ -71,6 +71,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
 | topic_tags | ARRAY(String) | |
 | content_type | Enum | tutorial / tip / exploration / walkthrough |
 | review_status | String | pending / approved / rejected |
 | sort_order | Integer | Display ordering within video (M021/S06) |
 ### TechniquePage
@ -188,6 +189,8 @@ Append-only versioned record of per-field consent changes.
 | PipelineRunTrigger | auto, manual, retrigger, clean_retrigger |
 | **UserRole** | admin, creator |
 | **ConsentField** | kb_inclusion, training_usage, public_display |
 | **HighlightStatus** | candidate, approved, rejected (M021/S04) |
 | **ChapterStatus** | draft, approved, hidden (M021/S06) |
 ## Schema Notes
@ -201,4 +204,15 @@ Append-only versioned record of per-field consent changes.
 ---
 *See also: [[Architecture]], [[API-Surface]], [[Pipeline]], [[Authentication]]*
 changes currently require manual DDL
 - **body_sections_format** discriminator enables v1/v2 format coexistence (D024)
 - **topic_category casing** is inconsistent across records (e.g., "Sound design" vs "Sound Design") — known data quality issue
 - **Stage 4 classification data** (per-moment topic_tags) stored in Redis with 24h TTL, not DB columns
 - **Timestamp convention:** `datetime.now(timezone.utc).replace(tzinfo=None)` — asyncpg rejects timezone-aware datetimes for TIMESTAMP WITHOUT TIME ZONE columns (D002)
 - **User passwords** are stored as bcrypt hashes via `bcrypt.hashpw()`
 - **Consent audit** uses version numbers assigned in application code (`max(version) + 1` per video_consent_id)
 ---
 *See also: [[Architecture]], [[API-Surface]], [[Pipeline]], [[Authentication]]*
--- a/Decisions.md
+++ b/Decisions.md
@ -31,6 +31,13 @@ Architectural and pattern decisions made during Chrysopedia development. Append-
 | D034 | Documentation strategy | Forgejo wiki, KB slice at end of every milestone | Incremental docs stay current; final pass in M025 |
 | D035 | File/object storage | MinIO (S3-compatible) self-hosted | Docker-native, signed URLs, fits existing infrastructure |
 ## M021 Decisions
 | # | When | Decision | Choice | Rationale |
 |---|------|----------|--------|-----------|
 | D039 | M021/S01 | LightRAG scoring strategy | Position-based (1.0 → 0.5 descending), sequential Qdrant fallback | `/query/data` has no numeric relevance score; retrieval order is the only signal |
 | D040 | M021/S02 | Creator-scoped retrieval strategy | 4-tier cascade: creator → domain → global → none | Progressive widening ensures results while preferring creator context; `ll_keywords` for soft scoping; 3x oversampling for post-filter survival |
 ## UI/UX Decisions
 | # | Decision | Choice |
--- a/Frontend.md
+++ b/Frontend.md
@ -53,7 +53,13 @@ Local component state only (`useState`/`useEffect`). No Redux, Zustand, Context
 ## API Client
-Single module `public-client.ts` (~600 lines) with typed `request<T>` helper. Relative `/api/v1` base URL (nginx proxies to API container). All response TypeScript interfaces defined in the same file.
+Two API modules:
 - `public-client.ts` (~600 lines) — typed `request<T>` helper for REST endpoints
 - `chat.ts` — SSE streaming client for POST /api/v1/chat using `fetch()` + `ReadableStream`
 - `videos.ts` — chapter management functions (fetchChapters, fetchCreatorChapters, updateChapter, reorderChapters, approveChapters)
 - `auth.ts` — authentication + impersonation functions including `fetchImpersonationLog()`
 Relative `/api/v1` base URL (nginx proxies to API container).
 ## CSS Architecture
@ -108,3 +114,10 @@ Single module `public-client.ts` (~600 lines) with typed `request<T>` helper. Re
 ---
 *See also: [[Architecture]], [[API-Surface]], [[Development-Guide]]*
 ocalhost:8001`
 - **Production:** nginx serves static `dist/` bundle, proxies `/api` to FastAPI container
 ---
 *See also: [[Architecture]], [[API-Surface]], [[Development-Guide]]*
--- a/Highlights.md
+++ b/Highlights.md
@ -0,0 +1,143 @@
 # Highlight Detection
 Heuristic scoring engine that ranks KeyMoment records into highlight candidates using 7 weighted dimensions. Added in M021/S04.
 ## Overview
 Highlight detection scores every KeyMoment in a video to identify the most "highlightable" segments — moments that would work well as standalone clips or featured content. The scoring is a pure function (no ML model, no external API) based on 7 dimensions derived from existing KeyMoment metadata.
 ## Scoring Dimensions
 Total weight sums to 1.0. Each dimension produces a 0.0–1.0 score.
 | Dimension | Weight | What It Measures |
 |-----------|--------|-----------------|
 | `duration_fitness` | 0.25 | Piecewise linear curve peaking at 30–60 seconds (ideal clip length) |
 | `content_type` | 0.20 | Content type favorability: tutorial > tip > walkthrough > exploration |
 | `specificity_density` | 0.20 | Regex-based counting of specific units, ratios, and named parameters in summary text |
 | `plugin_richness` | 0.10 | Number of plugins/VSTs referenced (more = more actionable) |
 | `transcript_energy` | 0.10 | Teaching-phrase detection in transcript text (e.g., "the trick is", "key thing") |
 | `source_quality` | 0.10 | Source quality rating: high=1.0, medium=0.6, low=0.3 |
 | `video_type` | 0.05 | Video type favorability mapping |
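The weighted combination in the table above can be sketched as a plain weighted sum. The weights are taken from the table; the function name `composite_score` and the strict missing-key check are assumptions, and the real `score_moment()` may combine dimensions differently.

```python
from typing import Dict

# Weights as listed in the Scoring Dimensions table.
WEIGHTS: Dict[str, float] = {
    "duration_fitness": 0.25,
    "content_type": 0.20,
    "specificity_density": 0.20,
    "plugin_richness": 0.10,
    "transcript_energy": 0.10,
    "source_quality": 0.10,
    "video_type": 0.05,
}


def composite_score(breakdown: Dict[str, float]) -> float:
    """Weighted sum of per-dimension scores, each expected in [0.0, 1.0]."""
    missing = WEIGHTS.keys() - breakdown.keys()
    if missing:
        raise KeyError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[name] * breakdown[name] for name in WEIGHTS)


# Two boundary cases that are easy to check by hand.
all_ones = composite_score({name: 1.0 for name in WEIGHTS})
duration_only = composite_score(
    {**{name: 0.0 for name in WEIGHTS}, "duration_fitness": 1.0}
)
```

Because the weights sum to 1.0, a moment scoring 1.0 on every dimension gets a composite of 1.0, and a moment that is perfect only on duration gets 0.25.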
 ### Duration Fitness Curve
 Uses piecewise linear (not Gaussian) for predictability:
 - 0–10s → low score (too short)
 - 10–30s → ramp up
 - 30–60s → peak score (1.0)
 - 60–120s → gradual decline
 - 120s+ → low score (too long for a highlight)
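A piecewise linear curve with these breakpoints might look like the sketch below. Only the breakpoints and the 1.0 peak come from the list above; the 0.2 floor for "low score" and the straight-line decline to that floor at 120 seconds are assumed values chosen for illustration.

```python
LOW = 0.2  # assumed value for "low score"; not specified in the source

# (seconds, score) breakpoints; linear interpolation between neighbours.
BREAKPOINTS = [(0.0, LOW), (10.0, LOW), (30.0, 1.0), (60.0, 1.0), (120.0, LOW)]


def duration_fitness(seconds: float) -> float:
    """Piecewise linear fitness: flat low, ramp up, plateau, decline, flat low."""
    if seconds <= BREAKPOINTS[0][0]:
        return BREAKPOINTS[0][1]
    for (x0, y0), (x1, y1) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if seconds <= x1:
            return y0 + (y1 - y0) * (seconds - x0) / (x1 - x0)
    return LOW  # 120s and beyond


samples = {s: duration_fitness(s) for s in (5, 20, 45, 90, 300)}
```

Unlike a Gaussian, every segment is a straight line, so the score at any duration can be verified by hand from the two neighbouring breakpoints.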
 ## Data Model
 ### HighlightCandidate
 | Field | Type | Notes |
 |-------|------|-------|
 | id | UUID PK | |
 | key_moment_id | FK → KeyMoment | Unique constraint (`uq_highlight_candidate_moment`) |
 | source_video_id | FK → SourceVideo | Indexed |
 | score | Float | Composite score 0.0–1.0 |
 | score_breakdown | JSONB | Per-dimension scores (7 fields) |
 | duration_secs | Float | Cached from KeyMoment for display |
 | status | Enum(HighlightStatus) | candidate / approved / rejected |
 | created_at | Timestamp | |
 | updated_at | Timestamp | |
 ### HighlightStatus Enum
 | Value | Meaning |
 |-------|---------|
 | `candidate` | Scored but not reviewed |
 | `approved` | Admin-approved as a highlight |
 | `rejected` | Admin-rejected |
 ### Database Indexes
 - `source_video_id` — filter by video
 - `score` DESC — rank ordering
 - `status` — filter by review state
 ### Migration
 Alembic migration `019_add_highlight_candidates.py` creates the table with all indexes and the named unique constraint.
 ## API Endpoints
 All under `/api/v1/admin/highlights/`. Admin access.
 | Method | Path | Purpose |
 |--------|------|---------|
 | POST | `/admin/highlights/detect/{video_id}` | Score all KeyMoments for a video, upsert candidates |
 | POST | `/admin/highlights/detect-all` | Score all videos (triggers Celery tasks) |
 | GET | `/admin/highlights/candidates` | Paginated candidate list, sorted by score DESC |
 | GET | `/admin/highlights/candidates/{id}` | Single candidate with full `score_breakdown` |
 ### Detect Response
 ```json
 {
  "video_id": "uuid",
  "candidates_created": 12,
  "candidates_updated": 0
 }
 ```
 ### Candidate Response
 ```json
 {
  "id": "uuid",
  "key_moment_id": "uuid",
  "source_video_id": "uuid",
  "score": 0.847,
  "score_breakdown": {
    "duration_fitness": 0.95,
    "content_type_weight": 0.80,
    "specificity_density": 0.72,
    "plugin_richness": 0.60,
    "transcript_energy": 0.85,
    "source_quality_weight": 1.00,
    "video_type_weight": 0.50
  },
  "duration_secs": 45.0,
  "status": "candidate",
  "created_at": "...",
  "updated_at": "..."
 }
 ```
 ## Pipeline Integration
 ### Celery Task: `stage_highlight_detection`
 - **Binding:** `bind=True, max_retries=3`
 - **Session:** Uses `_get_sync_session` (sync SQLAlchemy, per D004)
 - **Flow:** Load KeyMoments for video → score each via `score_moment()` → bulk upsert via `INSERT ON CONFLICT` on named constraint `uq_highlight_candidate_moment`
 - **Events:** Emits `pipeline_events` rows for start/complete/error with candidate count in payload
 ### Scoring Function
 `score_moment()` in `backend/pipeline/highlight_scorer.py` is a **pure function** — no DB access, no side effects. Takes a KeyMoment-like dict, returns `(score, breakdown_dict)`. This separation enables easy unit testing (28 tests, runs in 0.03s).
 ## Design Decisions
 - **Pure function scoring** — No DB or side effects in `score_moment()`, enabling fast unit tests
 - **Piecewise linear duration** — Predictable behavior vs. Gaussian bell curve
 - **Named unique constraint** — `uq_highlight_candidate_moment` enables idempotent upserts via `ON CONFLICT`
 - **Lazy import** — `score_moment` imported inside Celery task to avoid circular imports at module load
 ## Key Files
 - `backend/pipeline/highlight_scorer.py` — Pure scoring function with 7 dimensions
 - `backend/pipeline/highlight_schemas.py` — Pydantic schemas (HighlightScoreBreakdown, HighlightCandidateResponse, HighlightBatchResult)
 - `backend/pipeline/stages.py` — `stage_highlight_detection` Celery task
 - `backend/routers/highlights.py` — 4 admin API endpoints
 - `backend/models.py` — HighlightCandidate model, HighlightStatus enum
 - `alembic/versions/019_add_highlight_candidates.py` — Migration
 - `backend/pipeline/test_highlight_scorer.py` — 28 unit tests
 ---
 *See also: [[Pipeline]], [[Data-Model]], [[API-Surface]]*
--- a/Home.md
+++ b/Home.md
@ -37,4 +37,10 @@ Producers can search for specific techniques and find timestamped key moments, s
 ---
 *Last updated: 2026-04-04 — M021 chat engine, retrieval cascade, highlights, audio mode, chapters, impersonation write mode*
 inx reverse proxy on nuc01 |
 ---
 *Last updated: 2026-04-03 — M018/S02 initial bootstrap*
--- a/Impersonation.md
+++ b/Impersonation.md
@ -86,17 +86,39 @@ The `original_user_id` claim is what `reject_impersonation` checks.
 ### AuthContext Extensions
- `startImpersonation(userId)` — Calls impersonate API, saves current admin token to `sessionStorage`, swaps to impersonation token
+- `startImpersonation(userId, writeMode?)` — Calls impersonate API (with optional write_mode body), saves current admin token to `sessionStorage`, swaps to impersonation token
 - `exitImpersonation()` — Calls stop API, restores admin token from `sessionStorage`
 - `user.impersonating` (boolean) — True when viewing as another user
 - `isWriteMode` (boolean) — True when impersonation session has write access (M021/S07)
 ### ImpersonationBanner
-Fixed amber bar at page top when impersonating. Shows "Viewing as {name}" with Exit button. Rendered in `AppShell` when `user.impersonating` is true.
+Fixed bar at page top when impersonating. Two visual states (M021/S07):
 - **Amber** "👁 Viewing as {name}" — read-only mode
 - **Red** "✏️ Editing as {name}" — write mode (adds `body.impersonating-write` class)
 Uses `data-write-mode` attribute for CSS color switching.
 ### ConfirmModal (M021/S07)
 Reusable confirmation dialog component used for "Edit As" write-mode impersonation:
 - Backdrop + Escape/backdrop-click dismiss
 - `variant` prop: `warning` (amber) or `danger` (red confirm button)
 - Uses `data-variant` attribute for CSS variant styling
 ### AdminUsers Page
-Route: `/admin/users`. Table of all users with "View As" buttons for creator-role users. Code-split with `React.lazy`.
+Route: `/admin/users`. Table of all users with two action buttons per creator:
 - "View As" — starts read-only impersonation (no confirmation modal)
 - "Edit As" — opens ConfirmModal with danger variant before starting write-mode impersonation
 ### AdminAuditLog Page (M021/S07)
 Route: `/admin/audit-log`. Six-column table:
 - Date/Time, Admin, Target User, Action, Write Mode, IP Address
 - Badge styling via `data-variant` / `data-action` / `data-write-mode` attributes
 - Loading/error/empty states, Previous/Next pagination
 - Linked from AdminDropdown after "Users"
 ## Key Files
--- a/Pipeline.md
+++ b/Pipeline.md
@ -55,6 +55,14 @@ Stage 6: Embed & Index — generate embeddings, upsert to Qdrant (non-blocking)
 - **Non-blocking:** Failures log WARNING but don't fail the pipeline (D005)
 - Can be re-triggered independently via `/admin/pipeline/reindex-all`
 ### Stage 7: Highlight Detection (M021/S04)
 - Scores every KeyMoment in a video using 7 weighted heuristic dimensions
 - Pure function scoring: duration_fitness (0.25), content_type (0.20), specificity_density (0.20), plugin_richness (0.10), transcript_energy (0.10), source_quality (0.10), video_type (0.05)
 - Celery task `stage_highlight_detection` with `bind=True, max_retries=3`
 - Bulk upserts via `INSERT ON CONFLICT` on named constraint `uq_highlight_candidate_moment`
 - Output: HighlightCandidate records in PostgreSQL with composite score and per-dimension breakdown
 - See [[Highlights]] for full scoring details and API endpoints
 ## LLM Configuration
 | Setting | Value |
--- a/Player.md
+++ b/Player.md
@ -74,6 +74,37 @@ Playback rate options: 0.5x, 0.75x, 1x, 1.25x, 1.5x, 2x.
 ## Key Files
 - `frontend/src/pages/WatchPage.tsx` — Page component
 - `frontend/src/components/VideoPlayer.tsx` — Video element + HLS setup
 - `frontend/src/components/PlayerControls.tsx` — Play/pause, speed, volume, seek bar
 - `frontend/src/components/TranscriptSidebar.tsx` — Synchronized transcript display
 - `frontend/src/components/AudioWaveform.tsx` — Waveform visualization for audio content (M021/S05)
 - `frontend/src/components/ChapterMarkers.tsx` — Seek bar chapter overlay (M021/S05)
 - `frontend/src/hooks/useMediaSync.ts` — Shared playback state hook
 - `backend/routers/videos.py` — Video detail + transcript API
 ---
 *See also: [[Architecture]], [[API-Surface]], [[Frontend]]*
 with chapter title
 - Click seeks playback to chapter start time
 - Integrated into `PlayerControls` via wrapper container div
 ## Audio Waveform (M021/S05)
 `AudioWaveform` component renders when content is audio-only (no video_url):
 - Hidden `<audio>` element shared between `useMediaSync` and WaveSurfer
 - wavesurfer.js with MediaElement backend — playback controlled identically to video mode
 - Dark-themed CSS matching the video player area
 - RegionsPlugin for labeled chapter regions with drag/resize support
 ### Dependencies
 - `wavesurfer.js` — waveform rendering (~200KB, loaded only in audio mode)
 - `useMediaSync` hook widened from `HTMLVideoElement` to `HTMLMediaElement` for audio/video polymorphism
 ## Key Files
 - `frontend/src/pages/WatchPage.tsx` — Page component
 - `frontend/src/components/VideoPlayer.tsx` — Video element + HLS setup
 - `frontend/src/components/PlayerControls.tsx` — Play/pause, speed, volume, seek bar
--- a/Search-Retrieval.md
+++ b/Search-Retrieval.md
@ -0,0 +1,107 @@
 # Search & Retrieval
 LightRAG-first search with automatic Qdrant fallback, plus a 4-tier creator-scoped retrieval cascade. Added in M021/S01–S02.
 ## Overview
 Search went through a major upgrade in M021: LightRAG replaced Qdrant as the primary search engine, with Qdrant retained as an automatic fallback. A 4-tier creator-scoped retrieval cascade was added for context-aware search when querying within a creator's content.
 ## LightRAG Integration
 LightRAG is a graph-based RAG engine running as a standalone service on port 9621. It replaced Qdrant as the primary search path for `GET /api/v1/search`.
 ### How It Works
 1. **Query** — `SearchService._lightrag_search()` POSTs to LightRAG `/query/data` with `mode: "hybrid"`
 2. **Parse** — Response contains `chunks` (text passages with file_source metadata) and `entities` (graph nodes)
 3. **Extract** — Technique slugs parsed from `file_source` paths using format `technique:{slug}:creator:{uuid}`
 4. **Lookup** — Batch PostgreSQL query maps slugs to full TechniquePage records
 5. **Score** — Position-based scoring (1.0 → 0.5 descending) since `/query/data` has no numeric relevance score (D039)
 6. **Supplement** — Entity names matched against technique page titles as supplementary results
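The extract and score steps above can be sketched as follows. The `file_source` format is taken from step 3; the exact regex, the de-duplication, and the linear interpolation between 1.0 and 0.5 are assumptions, since the source states only the endpoints of the score range.

```python
import re
from typing import List, Optional, Tuple

# Matches the documented `technique:{slug}:creator:{uuid}` format anywhere in the path.
FILE_SOURCE_RE = re.compile(
    r"technique:(?P<slug>[^:]+):creator:(?P<creator_id>[0-9a-fA-F-]+)"
)


def parse_file_source(file_source: str) -> Optional[Tuple[str, str]]:
    """Return (slug, creator_id), or None when the path has another shape."""
    m = FILE_SOURCE_RE.search(file_source)
    return (m["slug"], m["creator_id"]) if m else None


def position_scores(n: int, top: float = 1.0, bottom: float = 0.5) -> List[float]:
    """Scores descending linearly from top to bottom over n ranked results."""
    if n <= 0:
        return []
    if n == 1:
        return [top]
    step = (top - bottom) / (n - 1)
    return [top - i * step for i in range(n)]


def rank_chunks(chunks: List[dict]) -> List[Tuple[str, float]]:
    """Keep the first occurrence of each slug, then score by retrieval order."""
    slugs: List[str] = []
    for chunk in chunks:
        parsed = parse_file_source(chunk.get("file_source", ""))
        if parsed and parsed[0] not in slugs:
            slugs.append(parsed[0])
    return list(zip(slugs, position_scores(len(slugs))))


CREATOR = "9b2c1d3e-0000-4000-8000-000000000001"  # made-up UUID for the example
ranked = rank_chunks([
    {"file_source": f"technique:parallel-compression:creator:{CREATOR}"},
    {"file_source": "notes.txt"},
    {"file_source": f"technique:sidechain-basics:creator:{CREATOR}"},
    {"file_source": f"technique:parallel-compression:creator:{CREATOR}"},
    {"file_source": f"technique:bus-glue:creator:{CREATOR}"},
])
```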
 ### Configuration
 | Field | Default | Purpose |
 |-------|---------|---------|
 | `lightrag_url` | `http://chrysopedia-lightrag:9621` | LightRAG service URL |
 | `lightrag_search_timeout` | `2.0` (seconds) | Request timeout |
 | `lightrag_min_query_length` | `3` (characters) | Queries shorter than this skip LightRAG |
 ### Fallback Behavior
 LightRAG failures trigger automatic fallback to the existing Qdrant + keyword search path:
 - **Timeout** → fallback + WARNING log with `reason=timeout`
 - **Connection error** → fallback + WARNING log with `reason=connection_error`
 - **HTTP error (e.g. 500)** → fallback + WARNING log with `reason=http_error`
 - **Empty results** → fallback + WARNING log with `reason=empty_results`
 - **Parse error** → fallback + WARNING log with `reason=parse_error`
 - **Short query (<3 chars)** → skips LightRAG entirely, uses Qdrant directly
 The `fallback_used` field in the search response indicates which engine served results.
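The fallback rules can be sketched as a single wrapper that maps each failure to its `reason=` tag. The two search backends are passed in as plain callables, `LightRAGHTTPError` is a hypothetical stand-in for a non-2xx response, and reporting `fallback_used=True` for short queries is an assumption.

```python
import logging
from typing import Callable, List, Tuple

logger = logging.getLogger("search_sketch")


class LightRAGHTTPError(Exception):
    """Hypothetical stand-in for a non-2xx response from LightRAG."""


def search_with_fallback(
    query: str,
    lightrag: Callable[[str], List],
    qdrant: Callable[[str], List],
    min_query_length: int = 3,
) -> Tuple[List, bool]:
    """Return (items, fallback_used), preferring LightRAG when it yields results."""
    if len(query) < min_query_length:
        return qdrant(query), True  # short queries skip LightRAG entirely
    reason = None
    items: List = []
    try:
        items = lightrag(query)
        if not items:
            reason = "empty_results"
    except TimeoutError:
        reason = "timeout"
    except ConnectionError:
        reason = "connection_error"
    except LightRAGHTTPError:
        reason = "http_error"
    except (KeyError, ValueError):
        reason = "parse_error"
    if reason is None:
        return items, False
    logger.warning("lightrag fallback reason=%s query=%r", reason, query)
    return qdrant(query), True


def _timing_out(query: str) -> List:
    raise TimeoutError()


served = search_with_fallback("compression", lambda q: ["lightrag-hit"], lambda q: ["qdrant-hit"])
timed_out = search_with_fallback("compression", _timing_out, lambda q: ["qdrant-hit"])
```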
 ## 4-Tier Creator-Scoped Cascade
 When a `?creator=` parameter is provided (e.g., from a creator profile page or the chat engine), search runs a progressive cascade that widens scope until results are found. Added in M021/S02 (D040).
 ```
 Tier 1: Creator-scoped
  └─ LightRAG with ll_keywords=[creator_name], post-filter by creator_id (3× oversampling)
        │ empty?
        ▼
 Tier 2: Domain-scoped
  └─ LightRAG with ll_keywords=[dominant_category] (requires ≥2 pages in category)
        │ empty?
        ▼
 Tier 3: Global
  └─ Standard LightRAG search (no scoping)
        │ empty?
        ▼
 Tier 4: None
  └─ cascade_tier="none" — no results from any tier
 ```
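The progressive widening in the diagram can be sketched as a loop over tiers that stops at the first non-empty result. Each tier is passed in as a zero-argument callable; `run_cascade` and the convention of passing `None` for an unavailable domain tier are assumptions for illustration.

```python
from typing import Callable, List, Optional, Tuple

Tier = Optional[Callable[[], List]]


def run_cascade(creator: Tier, domain: Tier, global_: Tier) -> Tuple[List, str]:
    """Try creator, domain, then global; return (items, cascade_tier)."""
    for name, search in (("creator", creator), ("domain", domain), ("global", global_)):
        if search is None:
            continue  # e.g. no dominant category was detected for this creator
        items = search()
        if items:
            return items, name
    return [], "none"


creator_hit = run_cascade(lambda: ["a"], lambda: ["b"], lambda: ["c"])
widened = run_cascade(lambda: [], None, lambda: ["c"])
nothing = run_cascade(lambda: [], lambda: [], lambda: [])
```

The returned tier name is what surfaces as `cascade_tier` in the search response and in the chat `done` event.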
 ### Cascade Details
 | Tier | Method | Scoping | Post-Filter |
 |------|--------|---------|-------------|
 | Creator | `_creator_scoped_search()` | `ll_keywords: [creator_name]` | Yes — filter by `creator_id`, request 3× `top_k` |
 | Domain | `_domain_scoped_search()` | `ll_keywords: [domain]` | No — any creator in domain qualifies |
 | Global | `_lightrag_search()` | None | No |
 | None | — | — | — (empty result) |
 **Domain detection:** SQL aggregation finds the dominant `topic_category` across a creator's technique pages. Requires ≥2 pages in the category to declare a domain — fewer means insufficient signal.
 **Post-filtering with oversampling:** Creator tier requests 3× the desired result count from LightRAG, then filters locally by `creator_id`. This compensates for LightRAG not supporting native creator filtering.
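Both rules can be illustrated in memory. The source describes domain detection as a SQL aggregation, so `dominant_category` below is only an equivalent over a plain list; `creator_filtered`, its `fetch` callable, and the sample data are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Optional


def dominant_category(categories: List[str], min_pages: int = 2) -> Optional[str]:
    """Most common topic_category, or None when it has fewer than min_pages pages."""
    counts = Counter(c for c in categories if c)
    if not counts:
        return None
    category, n = counts.most_common(1)[0]
    return category if n >= min_pages else None


def creator_filtered(
    fetch: Callable[[int], List[dict]],
    creator_id: str,
    top_k: int,
    oversample: int = 3,
) -> List[dict]:
    """Request oversample * top_k candidates, then keep this creator's first top_k."""
    candidates = fetch(top_k * oversample)
    return [c for c in candidates if c["creator_id"] == creator_id][:top_k]


# Six candidates alternating between two creators.
pool = [{"slug": f"t{i}", "creator_id": "a" if i % 2 else "b"} for i in range(6)]
kept = creator_filtered(lambda n: pool[:n], creator_id="a", top_k=2)
```

With `top_k=2` the sketch fetches six candidates, of which three belong to creator `a`, so two survive the filter. Without oversampling only the first two candidates would be fetched and a single match would remain.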
 ### Response Fields
 | Field | Type | Description |
 |-------|------|-------------|
 | `cascade_tier` | string | Which tier served: `"creator"`, `"domain"`, `"global"`, `"none"`, or `""` (no cascade) |
 | `fallback_used` | boolean | `true` if Qdrant fallback was used instead of LightRAG |
 ## Observability
 - `logger.info` per LightRAG search: query, latency_ms, result_count
 - `logger.info` per cascade tier: query, creator, tier, latency_ms, result_count
 - `logger.warning` on any failure path with structured `reason=` tag
 - `cascade_tier` and `fallback_used` in API response for downstream consumers
 ## Key Decisions
 | # | Decision | Choice | Rationale |
 |---|----------|--------|-----------|
 | D039 | LightRAG scoring | Position-based (1.0 → 0.5) | `/query/data` has no numeric relevance score; sequential fallback to Qdrant |
 | D040 | Creator-scoped strategy | 4-tier cascade (creator → domain → global → none) | Progressive widening ensures results while preferring creator context |
 ## Key Files
 - `backend/search_service.py` — SearchService with LightRAG integration and cascade methods
 - `backend/config.py` — LightRAG configuration fields
 - `backend/schemas.py` — `cascade_tier` in SearchResponse
 - `backend/routers/search.py` — `?creator=` query parameter
 ---
 *See also: [[Chat-Engine]], [[Architecture]], [[API-Surface]], [[Decisions]]*
--- a/_Sidebar.md
+++ b/_Sidebar.md
@ -10,6 +10,11 @@
 - [[Pipeline]]
 - [[Player]]
 **Features**
 - [[Chat-Engine]]
 - [[Search-Retrieval]]
 - [[Highlights]]
 **Reference**
 - [[API-Surface]]
 - [[Frontend]]