M021: Chat engine, retrieval cascade, highlights, audio mode, chapters, impersonation write mode docs

2026-04-04 06:40:26 +00:00 · 2026-04-04 06:40:26 +00:00 · eec99b6c7d
commit eec99b6c7d
parent be05f5edf2
13 changed files with 533 additions and 7 deletions
--- a/API-Surface.md
+++ b/API-Surface.md
@ -8,7 +8,7 @@
 |--------|------|---------------|-------|
 | GET | `/health` | `{status, service, version, database}` | Health check |
 | GET | `/api/v1/stats` | `{technique_count, creator_count}` | Homepage stats |
-| GET | `/api/v1/search?q=` | `{items, partial_matches, total, query, fallback_used}` | Semantic + keyword fallback (D009) |
+| GET | `/api/v1/search?q=&creator=` | `{items, partial_matches, total, query, fallback_used, cascade_tier}` | LightRAG primary + Qdrant fallback, optional creator cascade (D039, D040) |
 | GET | `/api/v1/search/suggestions?q=` | `{suggestions: [{text, type}]}` | Typeahead autocomplete |
 | GET | `/api/v1/search/popular` | `{items: [{query, count}]}` | Popular searches (D025) |
 | GET | `/api/v1/techniques?limit=&offset=` | `{items, total, offset, limit}` | Paginated technique list |
@ -139,3 +139,50 @@ JWT-based authentication added in M019. See [[Authentication]] for full details.
 ---

 *See also: [[Architecture]], [[Data-Model]], [[Frontend]], [[Authentication]]*
+utput` | Delete all pipeline output |
+| POST | `/admin/pipeline/optimize-prompt` | Trigger prompt optimization |
+| POST | `/admin/pipeline/reindex-all` | Rebuild Qdrant index |
+| GET | `/admin/pipeline/worker-status` | Celery worker health |
+| GET | `/admin/pipeline/recent-activity` | Recent pipeline events |
+| POST | `/admin/pipeline/creator-profile/{creator_id}` | Update creator profile |
+| POST | `/admin/pipeline/avatar-fetch/{creator_id}` | Fetch creator avatar |
+
+## Other Endpoints (2)
+
+| Method | Path | Notes |
+|--------|------|-------|
+| POST | `/api/v1/ingest` | Transcript upload |
+| GET | `/api/v1/videos` | ⚠️ Bare list (not paginated) |
+
+## Response Conventions
+
+**Standard paginated response:**
+```json
+{
+  "items": [...],
+  "total": 83,
+  "offset": 0,
+  "limit": 20
+}
+```
+
+**Known inconsistencies:**
+- `GET /topics` returns bare list instead of paginated dict
+- `GET /videos` returns bare list instead of paginated dict
+- Search uses `items` key (not `results`)
+- `/techniques/random` returns JSON `{slug}` (not HTTP redirect)
+
+**New endpoints should follow the `{items, total, offset, limit}` paginated pattern.**
+
+## Authentication
+
+JWT-based authentication added in M019. See [[Authentication]] for full details.
+
+- **Public endpoints** (search, browse, techniques) require no auth
+- **Auth endpoints** (`/auth/register`, `/auth/login`) are open; `/auth/me` requires Bearer JWT
+- **Consent endpoints** require Bearer JWT with ownership verification (creator must own the video, or be admin)
+- **Admin endpoints** (`/admin/*`) are accessible to anyone with network access (auth planned for future milestone)
+
+---
+
+*See also: [[Architecture]], [[Data-Model]], [[Frontend]], [[Authentication]]*
--- a/Architecture.md
+++ b/Architecture.md
@ -2,7 +2,7 @@

 ## System Overview

-Chrysopedia is a self-hosted music production knowledge base that synthesizes technique articles from video transcripts using a 6-stage LLM pipeline. It runs as a Docker Compose stack on `ub01` with 11 containers.
+Chrysopedia is a self-hosted music production knowledge base that synthesizes technique articles from video transcripts using a 7-stage LLM pipeline. It runs as a Docker Compose stack on `ub01` with 11 containers.

 ```
 ┌─────────────────────────────────────────────────────────────────┐
@ -94,3 +94,11 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
 ---

 *See also: [[Deployment]], [[Pipeline]], [[Data-Model]], [[Authentication]]*
+→ PostgreSQL, embeddings → Qdrant
+5. **Serving:** React SPA fetches from FastAPI, search queries hit Qdrant then PostgreSQL fallback
+6. **Auth:** JWT-protected endpoints for creator consent management and admin features (see [[Authentication]])
+
+---
+
+*See also: [[Deployment]], [[Pipeline]], [[Data-Model]], [[Authentication]]*
+], [[Authentication]]*
--- a/Chat-Engine.md
+++ b/Chat-Engine.md
@ -0,0 +1,115 @@
+# Chat Engine
+
+Streaming question-answering interface backed by LightRAG retrieval and LLM completion. Added in M021/S03.
+
+## Architecture
+
+```
+User types question in ChatPage
+        │
+        ▼
+POST /api/v1/chat  { query: "...", creator?: "..." }
+        │
+        ▼
+ChatService.stream(query, creator?)
+        │
+        ├─ 1. Retrieve: SearchService.search(query, creator)
+        │     └─ Uses 4-tier cascade if creator provided (see [[Search-Retrieval]])
+        │
+        ├─ 2. Prompt: Assemble numbered context block into encyclopedic system prompt
+        │     └─ Sources formatted as [1] Title — Summary for citation mapping
+        │
+        ├─ 3. Stream: openai.AsyncOpenAI with stream=True
+        │     └─ Tokens streamed as SSE events in real-time
+        │
+        ▼
+SSE response → ChatPage renders tokens + citation links
+```
+
+## SSE Protocol
+
+The chat endpoint returns a `text/event-stream` response with four event types in strict order:
+
+| Event | Payload | When |
+|-------|---------|------|
+| `sources` | `[{title, slug, creator_name, summary}]` | First — citation metadata for link rendering |
+| `token` | `string` (text chunk) | Repeated — streamed LLM completion tokens |
+| `done` | `{cascade_tier: "creator"\|"domain"\|"global"\|"none"\|""}` | Once — signals completion, includes which retrieval tier answered |
+| `error` | `{message: string}` | On failure — emitted if LLM errors mid-stream |
+
+The `cascade_tier` in the `done` event reveals which tier of the retrieval cascade served the context (see [[Search-Retrieval]]).
+
+## Citation Format
+
+The LLM is instructed to reference sources using numbered citations `[N]` in its response. The frontend parses these into superscript links:
+
+- `[1]` → links to `/techniques/:slug` for the corresponding source
+- Multiple citations supported: `[1][3]` or `[1,3]`
+- Citation regex: `/\[(\d+)\]/g` parsed locally in ChatPage
+
+## API Endpoint
+
+### POST /api/v1/chat
+
+| Field | Type | Required | Validation |
+|-------|------|----------|------------|
+| `query` | string | Yes | 1–1000 characters |
+| `creator` | string | No | Creator UUID or slug for scoped retrieval |
+
+**Response:** `text/event-stream` (SSE)
+
+**Error responses:**
+- `422` — Empty or missing query, query exceeds 1000 chars
+
+## Backend: ChatService
+
+Located in `backend/chat_service.py`. The retrieve-prompt-stream pipeline:
+
+1. **Retrieve** — Calls `SearchService.search()` with the query and optional creator parameter. Gets back ranked technique page results with the cascade_tier.
+2. **Prompt** — Builds a numbered context block from search results. System prompt instructs the LLM to act as a music production encyclopedia, cite sources with `[N]` notation, and stay grounded in the provided context.
+3. **Stream** — Opens an async streaming completion via `openai.AsyncOpenAI` (configured to point at DGX Sparks Qwen or local Ollama). Yields SSE events as tokens arrive.
+
+Error handling: If the LLM fails mid-stream (after some tokens have been sent), an `error` event is emitted so the frontend can display a failure message rather than leaving the response hanging.
+
+## Frontend: ChatPage
+
+Route: `/chat` (lazy-loaded, code-split)
+
+### Components
+
+- **Text input + submit button** — Query entry with Enter-to-submit
+- **Streaming message display** — Accumulates tokens with blinking cursor animation during streaming
+- **Citation markers** — `[N]` parsed to superscript links targeting `/techniques/:slug`
+- **Source list** — Numbered sources with creator attribution displayed below the response
+- **States:** Loading (streaming indicator), error (message display), empty (placeholder prompt)
+
+### SSE Client
+
+Located in `frontend/src/api/chat.ts`. Uses `fetch()` + `ReadableStream` with typed callbacks:
+
+```typescript
+streamChat(query, creator?, {
+  onSources: (sources) => void,
+  onToken: (token) => void,
+  onDone: (data) => void,
+  onError: (error) => void,
+})
+```
+
+## Key Files
+
+- `backend/chat_service.py` — ChatService retrieve-prompt-stream pipeline
+- `backend/routers/chat.py` — POST /api/v1/chat endpoint
+- `frontend/src/api/chat.ts` — SSE client utility
+- `frontend/src/pages/ChatPage.tsx` — Chat UI page component
+- `frontend/src/pages/ChatPage.module.css` — Chat page styles
+
+## Design Decisions
+
+- **Standalone ASGI test client pattern** — Tests use mocked DB to avoid PostgreSQL dependency, enabling fast CI runs
+- **Patch `openai.AsyncOpenAI` constructor** rather than instance attribute for reliable test mocking
+- **Local citation regex** in ChatPage rather than importing from utils — link targets differ from technique page citations
+
+---
+
+*See also: [[Search-Retrieval]], [[API-Surface]], [[Frontend]]*
--- a/Data-Model.md
+++ b/Data-Model.md
@ -71,6 +71,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
 | topic_tags | ARRAY(String) | |
 | content_type | Enum | tutorial / tip / exploration / walkthrough |
 | review_status | String | pending / approved / rejected |
+| sort_order | Integer | Display ordering within video (M021/S06) |

 ### TechniquePage

@ -188,6 +189,8 @@ Append-only versioned record of per-field consent changes.
 | PipelineRunTrigger | auto, manual, retrigger, clean_retrigger |
 | **UserRole** | admin, creator |
 | **ConsentField** | kb_inclusion, training_usage, public_display |
+| **HighlightStatus** | candidate, approved, rejected (M021/S04) |
+| **ChapterStatus** | draft, approved, hidden (M021/S06) |

 ## Schema Notes

@ -201,4 +204,15 @@ Append-only versioned record of per-field consent changes.

 ---

+*See also: [[Architecture]], [[API-Surface]], [[Pipeline]], [[Authentication]]*
+ changes currently require manual DDL
+- **body_sections_format** discriminator enables v1/v2 format coexistence (D024)
+- **topic_category casing** is inconsistent across records (e.g., "Sound design" vs "Sound Design") — known data quality issue
+- **Stage 4 classification data** (per-moment topic_tags) stored in Redis with 24h TTL, not DB columns
+- **Timestamp convention:** `datetime.now(timezone.utc).replace(tzinfo=None)` — asyncpg rejects timezone-aware datetimes for TIMESTAMP WITHOUT TIME ZONE columns (D002)
+- **User passwords** are stored as bcrypt hashes via `bcrypt.hashpw()`
+- **Consent audit** uses version numbers assigned in application code (`max(version) + 1` per video_consent_id)
+
+---
+
 *See also: [[Architecture]], [[API-Surface]], [[Pipeline]], [[Authentication]]*
--- a/Decisions.md
+++ b/Decisions.md
@ -31,6 +31,13 @@ Architectural and pattern decisions made during Chrysopedia development. Append-
 | D034 | Documentation strategy | Forgejo wiki, KB slice at end of every milestone | Incremental docs stay current; final pass in M025 |
 | D035 | File/object storage | MinIO (S3-compatible) self-hosted | Docker-native, signed URLs, fits existing infrastructure |

+## M021 Decisions
+
+| # | When | Decision | Choice | Rationale |
+|---|------|----------|--------|-----------|
+| D039 | M021/S01 | LightRAG scoring strategy | Position-based (1.0 → 0.5 descending), sequential Qdrant fallback | `/query/data` has no numeric relevance score; retrieval order is the only signal |
+| D040 | M021/S02 | Creator-scoped retrieval strategy | 4-tier cascade: creator → domain → global → none | Progressive widening ensures results while preferring creator context; `ll_keywords` for soft scoping; 3x oversampling for post-filter survival |
+
 ## UI/UX Decisions

 | # | Decision | Choice |
--- a/Frontend.md
+++ b/Frontend.md
@ -53,7 +53,13 @@ Local component state only (`useState`/`useEffect`). No Redux, Zustand, Context

 ## API Client

-Single module `public-client.ts` (~600 lines) with typed `request<T>` helper. Relative `/api/v1` base URL (nginx proxies to API container). All response TypeScript interfaces defined in the same file.
+Two API modules:
+- `public-client.ts` (~600 lines) — typed `request<T>` helper for REST endpoints
+- `chat.ts` — SSE streaming client for POST /api/v1/chat using `fetch()` + `ReadableStream`
+- `videos.ts` — chapter management functions (fetchChapters, fetchCreatorChapters, updateChapter, reorderChapters, approveChapters)
+- `auth.ts` — authentication + impersonation functions including `fetchImpersonationLog()`
+
+Relative `/api/v1` base URL (nginx proxies to API container).

 ## CSS Architecture

@ -108,3 +114,10 @@ Single module `public-client.ts` (~600 lines) with typed `request<T>` helper. Re
 ---

 *See also: [[Architecture]], [[API-Surface]], [[Development-Guide]]*
+*See also: [[Architecture]], [[API-Surface]], [[Development-Guide]]*
+ocalhost:8001`
+- **Production:** nginx serves static `dist/` bundle, proxies `/api` to FastAPI container
+
+---
+
+*See also: [[Architecture]], [[API-Surface]], [[Development-Guide]]*
--- a/Highlights.md
+++ b/Highlights.md
@ -0,0 +1,143 @@
+# Highlight Detection
+
+Heuristic scoring engine that ranks KeyMoment records into highlight candidates using 7 weighted dimensions. Added in M021/S04.
+
+## Overview
+
+Highlight detection scores every KeyMoment in a video to identify the most "highlightable" segments — moments that would work well as standalone clips or featured content. The scoring is a pure function (no ML model, no external API) based on 7 dimensions derived from existing KeyMoment metadata.
+
+## Scoring Dimensions
+
+Total weight sums to 1.0. Each dimension produces a 0.0–1.0 score.
+
+| Dimension | Weight | What It Measures |
+|-----------|--------|-----------------|
+| `duration_fitness` | 0.25 | Piecewise linear curve peaking at 30–60 seconds (ideal clip length) |
+| `content_type` | 0.20 | Content type favorability: tutorial > tip > walkthrough > exploration |
+| `specificity_density` | 0.20 | Regex-based counting of specific units, ratios, and named parameters in summary text |
+| `plugin_richness` | 0.10 | Number of plugins/VSTs referenced (more = more actionable) |
+| `transcript_energy` | 0.10 | Teaching-phrase detection in transcript text (e.g., "the trick is", "key thing") |
+| `source_quality` | 0.10 | Source quality rating: high=1.0, medium=0.6, low=0.3 |
+| `video_type` | 0.05 | Video type favorability mapping |
+
+### Duration Fitness Curve
+
+Uses piecewise linear (not Gaussian) for predictability:
+- 0–10s → low score (too short)
+- 10–30s → ramp up
+- 30–60s → peak score (1.0)
+- 60–120s → gradual decline
+- 120s+ → low score (too long for a highlight)
+
+## Data Model
+
+### HighlightCandidate
+
+| Field | Type | Notes |
+|-------|------|-------|
+| id | UUID PK | |
+| key_moment_id | FK → KeyMoment | Unique constraint (`uq_highlight_candidate_moment`) |
+| source_video_id | FK → SourceVideo | Indexed |
+| score | Float | Composite score 0.0–1.0 |
+| score_breakdown | JSONB | Per-dimension scores (7 fields) |
+| duration_secs | Float | Cached from KeyMoment for display |
+| status | Enum(HighlightStatus) | candidate / approved / rejected |
+| created_at | Timestamp | |
+| updated_at | Timestamp | |
+
+### HighlightStatus Enum
+
+| Value | Meaning |
+|-------|---------|
+| `candidate` | Scored but not reviewed |
+| `approved` | Admin-approved as a highlight |
+| `rejected` | Admin-rejected |
+
+### Database Indexes
+
+- `source_video_id` — filter by video
+- `score` DESC — rank ordering
+- `status` — filter by review state
+
+### Migration
+
+Alembic migration `019_add_highlight_candidates.py` creates the table with all indexes and the named unique constraint.
+
+## API Endpoints
+
+All under `/api/v1/admin/highlights/`. Admin access.
+
+| Method | Path | Purpose |
+|--------|------|---------|
+| POST | `/admin/highlights/detect/{video_id}` | Score all KeyMoments for a video, upsert candidates |
+| POST | `/admin/highlights/detect-all` | Score all videos (triggers Celery tasks) |
+| GET | `/admin/highlights/candidates` | Paginated candidate list, sorted by score DESC |
+| GET | `/admin/highlights/candidates/{id}` | Single candidate with full `score_breakdown` |
+
+### Detect Response
+
+```json
+{
+  "video_id": "uuid",
+  "candidates_created": 12,
+  "candidates_updated": 0
+}
+```
+
+### Candidate Response
+
+```json
+{
+  "id": "uuid",
+  "key_moment_id": "uuid",
+  "source_video_id": "uuid",
+  "score": 0.847,
+  "score_breakdown": {
+    "duration_fitness": 0.95,
+    "content_type_weight": 0.80,
+    "specificity_density": 0.72,
+    "plugin_richness": 0.60,
+    "transcript_energy": 0.85,
+    "source_quality_weight": 1.00,
+    "video_type_weight": 0.50
+  },
+  "duration_secs": 45.0,
+  "status": "candidate",
+  "created_at": "...",
+  "updated_at": "..."
+}
+```
+
+## Pipeline Integration
+
+### Celery Task: `stage_highlight_detection`
+
+- **Binding:** `bind=True, max_retries=3`
+- **Session:** Uses `_get_sync_session` (sync SQLAlchemy, per D004)
+- **Flow:** Load KeyMoments for video → score each via `score_moment()` → bulk upsert via `INSERT ON CONFLICT` on named constraint `uq_highlight_candidate_moment`
+- **Events:** Emits `pipeline_events` rows for start/complete/error with candidate count in payload
+
+### Scoring Function
+
+`score_moment()` in `backend/pipeline/highlight_scorer.py` is a **pure function** — no DB access, no side effects. Takes a KeyMoment-like dict, returns `(score, breakdown_dict)`. This separation enables easy unit testing (28 tests, runs in 0.03s).
+
+## Design Decisions
+
+- **Pure function scoring** — No DB or side effects in `score_moment()`, enabling fast unit tests
+- **Piecewise linear duration** — Predictable behavior vs. Gaussian bell curve
+- **Named unique constraint** — `uq_highlight_candidate_moment` enables idempotent upserts via `ON CONFLICT`
+- **Lazy import** — `score_moment` imported inside Celery task to avoid circular imports at module load
+
+## Key Files
+
+- `backend/pipeline/highlight_scorer.py` — Pure scoring function with 7 dimensions
+- `backend/pipeline/highlight_schemas.py` — Pydantic schemas (HighlightScoreBreakdown, HighlightCandidateResponse, HighlightBatchResult)
+- `backend/pipeline/stages.py` — `stage_highlight_detection` Celery task
+- `backend/routers/highlights.py` — 4 admin API endpoints
+- `backend/models.py` — HighlightCandidate model, HighlightStatus enum
+- `alembic/versions/019_add_highlight_candidates.py` — Migration
+- `backend/pipeline/test_highlight_scorer.py` — 28 unit tests
+
+---
+
+*See also: [[Pipeline]], [[Data-Model]], [[API-Surface]]*
--- a/Home.md
+++ b/Home.md
@ -37,4 +37,10 @@ Producers can search for specific techniques and find timestamped key moments, s

 ---

+*Last updated: 2026-04-04 â€” M021 chat engine, retrieval cascade, highlights, audio mode, chapters, impersonation write mode*
+inx reverse proxy on nuc01 |
+
+---
+
 *Last updated: 2026-04-03 — M018/S02 initial bootstrap*
+” M018/S02 initial bootstrap*
--- a/Impersonation.md
+++ b/Impersonation.md
@ -86,17 +86,39 @@ The `original_user_id` claim is what `reject_impersonation` checks.

 ### AuthContext Extensions

- `startImpersonation(userId)` — Calls impersonate API, saves current admin token to `sessionStorage`, swaps to impersonation token
+- `startImpersonation(userId, writeMode?)` — Calls impersonate API (with optional write_mode body), saves current admin token to `sessionStorage`, swaps to impersonation token
 - `exitImpersonation()` — Calls stop API, restores admin token from `sessionStorage`
 - `user.impersonating` (boolean) — True when viewing as another user
+- `isWriteMode` (boolean) — True when impersonation session has write access (M021/S07)

 ### ImpersonationBanner

-Fixed amber bar at page top when impersonating. Shows "Viewing as {name}" with Exit button. Rendered in `AppShell` when `user.impersonating` is true.
+Fixed bar at page top when impersonating. Two visual states (M021/S07):
+- **Amber** "👁 Viewing as {name}" — read-only mode
+- **Red** "✏️ Editing as {name}" — write mode (adds `body.impersonating-write` class)
+
+Uses `data-write-mode` attribute for CSS color switching.
+
+### ConfirmModal (M021/S07)
+
+Reusable confirmation dialog component used for "Edit As" write-mode impersonation:
+- Backdrop + Escape/backdrop-click dismiss
+- `variant` prop: `warning` (amber) or `danger` (red confirm button)
+- Uses `data-variant` attribute for CSS variant styling

 ### AdminUsers Page

-Route: `/admin/users`. Table of all users with "View As" buttons for creator-role users. Code-split with `React.lazy`.
+Route: `/admin/users`. Table of all users with two action buttons per creator:
+- "View As" — starts read-only impersonation (no confirmation modal)
+- "Edit As" — opens ConfirmModal with danger variant before starting write-mode impersonation
+
+### AdminAuditLog Page (M021/S07)
+
+Route: `/admin/audit-log`. Six-column table:
+- Date/Time, Admin, Target User, Action, Write Mode, IP Address
+- Badge styling via `data-variant` / `data-action` / `data-write-mode` attributes
+- Loading/error/empty states, Previous/Next pagination
+- Linked from AdminDropdown after "Users"

 ## Key Files

--- a/Pipeline.md
+++ b/Pipeline.md
@ -55,6 +55,14 @@ Stage 6: Embed & Index — generate embeddings, upsert to Qdrant (non-blocking)
 - **Non-blocking:** Failures log WARNING but don't fail the pipeline (D005)
 - Can be re-triggered independently via `/admin/pipeline/reindex-all`

+### Stage 7: Highlight Detection (M021/S04)
+- Scores every KeyMoment in a video using 7 weighted heuristic dimensions
+- Pure function scoring: duration_fitness (0.25), content_type (0.20), specificity_density (0.20), plugin_richness (0.10), transcript_energy (0.10), source_quality (0.10), video_type (0.05)
+- Celery task `stage_highlight_detection` with `bind=True, max_retries=3`
+- Bulk upserts via `INSERT ON CONFLICT` on named constraint `uq_highlight_candidate_moment`
+- Output: HighlightCandidate records in PostgreSQL with composite score and per-dimension breakdown
+- See [[Highlights]] for full scoring details and API endpoints
+
 ## LLM Configuration

 | Setting | Value |
--- a/Player.md
+++ b/Player.md
@ -74,6 +74,37 @@ Playback rate options: 0.5x, 0.75x, 1x, 1.25x, 1.5x, 2x.

 ## Key Files

+- `frontend/src/pages/WatchPage.tsx` — Page component
+- `frontend/src/components/VideoPlayer.tsx` — Video element + HLS setup
+- `frontend/src/components/PlayerControls.tsx` — Play/pause, speed, volume, seek bar
+- `frontend/src/components/TranscriptSidebar.tsx` — Synchronized transcript display
+- `frontend/src/components/AudioWaveform.tsx` — Waveform visualization for audio content (M021/S05)
+- `frontend/src/components/ChapterMarkers.tsx` — Seek bar chapter overlay (M021/S05)
+- `frontend/src/hooks/useMediaSync.ts` — Shared playback state hook
+- `backend/routers/videos.py` — Video detail + transcript API
+
+---
+
+*See also: [[Architecture]], [[API-Surface]], [[Frontend]]*
+with chapter title
+- Click seeks playback to chapter start time
+- Integrated into `PlayerControls` via wrapper container div
+
+## Audio Waveform (M021/S05)
+
+`AudioWaveform` component renders when content is audio-only (no video_url):
+- Hidden `<audio>` element shared between `useMediaSync` and WaveSurfer
+- wavesurfer.js with MediaElement backend — playback controlled identically to video mode
+- Dark-themed CSS matching the video player area
+- RegionsPlugin for labeled chapter regions with drag/resize support
+
+### Dependencies
+
+- `wavesurfer.js` — waveform rendering (~200KB, loaded only in audio mode)
+- `useMediaSync` hook widened from `HTMLVideoElement` to `HTMLMediaElement` for audio/video polymorphism
+
+## Key Files
+
 - `frontend/src/pages/WatchPage.tsx` — Page component
 - `frontend/src/components/VideoPlayer.tsx` — Video element + HLS setup
 - `frontend/src/components/PlayerControls.tsx` — Play/pause, speed, volume, seek bar
--- a/Search-Retrieval.md
+++ b/Search-Retrieval.md
@ -0,0 +1,107 @@
+# Search & Retrieval
+
+LightRAG-first search with automatic Qdrant fallback, plus a 4-tier creator-scoped retrieval cascade. Added in M021/S01–S02.
+
+## Overview
+
+Search went through a major upgrade in M021: LightRAG replaced Qdrant as the primary search engine, with Qdrant retained as an automatic fallback. A 4-tier creator-scoped retrieval cascade was added for context-aware search when querying within a creator's content.
+
+## LightRAG Integration
+
+LightRAG is a graph-based RAG engine running as a standalone service on port 9621. It replaced Qdrant as the primary search path for `GET /api/v1/search`.
+
+### How It Works
+
+1. **Query** — `SearchService._lightrag_search()` POSTs to LightRAG `/query/data` with `mode: "hybrid"`
+2. **Parse** — Response contains `chunks` (text passages with file_source metadata) and `entities` (graph nodes)
+3. **Extract** — Technique slugs parsed from `file_source` paths using format `technique:{slug}:creator:{uuid}`
+4. **Lookup** — Batch PostgreSQL query maps slugs to full TechniquePage records
+5. **Score** — Position-based scoring (1.0 → 0.5 descending) since `/query/data` has no numeric relevance score (D039)
+6. **Supplement** — Entity names matched against technique page titles as supplementary results
+
+### Configuration
+
+| Field | Default | Purpose |
+|-------|---------|---------|
+| `lightrag_url` | `http://chrysopedia-lightrag:9621` | LightRAG service URL |
+| `lightrag_search_timeout` | `2.0` (seconds) | Request timeout |
+| `lightrag_min_query_length` | `3` (characters) | Queries shorter than this skip LightRAG |
+
+### Fallback Behavior
+
+LightRAG failures trigger automatic fallback to the existing Qdrant + keyword search path:
+
+- **Timeout** → fallback + WARNING log with `reason=timeout`
+- **Connection error** → fallback + WARNING log with `reason=connection_error`
+- **HTTP error (e.g. 500)** → fallback + WARNING log with `reason=http_error`
+- **Empty results** → fallback + WARNING log with `reason=empty_results`
+- **Parse error** → fallback + WARNING log with `reason=parse_error`
+- **Short query (<3 chars)** → skips LightRAG entirely, uses Qdrant directly
+
+The `fallback_used` field in the search response indicates which engine served results.
+
+## 4-Tier Creator-Scoped Cascade
+
+When a `?creator=` parameter is provided (e.g., from a creator profile page or the chat engine), search runs a progressive cascade that widens scope until results are found. Added in M021/S02 (D040).
+
+```
+Tier 1: Creator-scoped
+  └─ LightRAG with ll_keywords=[creator_name], post-filter by creator_id (3× oversampling)
+        │ empty?
+        ▼
+Tier 2: Domain-scoped
+  └─ LightRAG with ll_keywords=[dominant_category] (requires ≥2 pages in category)
+        │ empty?
+        ▼
+Tier 3: Global
+  └─ Standard LightRAG search (no scoping)
+        │ empty?
+        ▼
+Tier 4: None
+  └─ cascade_tier="none" — no results from any tier
+```
+
+### Cascade Details
+
+| Tier | Method | Scoping | Post-Filter |
+|------|--------|---------|-------------|
+| Creator | `_creator_scoped_search()` | `ll_keywords: [creator_name]` | Yes — filter by `creator_id`, request 3× `top_k` |
+| Domain | `_domain_scoped_search()` | `ll_keywords: [domain]` | No — any creator in domain qualifies |
+| Global | `_lightrag_search()` | None | No |
+| None | — | — | — (empty result) |
+
+**Domain detection:** SQL aggregation finds the dominant `topic_category` across a creator's technique pages. Requires ≥2 pages in the category to declare a domain — fewer means insufficient signal.
+
+**Post-filtering with oversampling:** Creator tier requests 3× the desired result count from LightRAG, then filters locally by `creator_id`. This compensates for LightRAG not supporting native creator filtering.
+
+### Response Fields
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `cascade_tier` | string | Which tier served: `"creator"`, `"domain"`, `"global"`, `"none"`, or `""` (no cascade) |
+| `fallback_used` | boolean | `true` if Qdrant fallback was used instead of LightRAG |
+
+## Observability
+
+- `logger.info` per LightRAG search: query, latency_ms, result_count
+- `logger.info` per cascade tier: query, creator, tier, latency_ms, result_count
+- `logger.warning` on any failure path with structured `reason=` tag
+- `cascade_tier` and `fallback_used` in API response for downstream consumers
+
+## Key Decisions
+
+| # | Decision | Choice | Rationale |
+|---|----------|--------|-----------|
+| D039 | LightRAG scoring | Position-based (1.0 → 0.5) | `/query/data` has no numeric relevance score; sequential fallback to Qdrant |
+| D040 | Creator-scoped strategy | 4-tier cascade (creator → domain → global → none) | Progressive widening ensures results while preferring creator context |
+
+## Key Files
+
+- `backend/search_service.py` — SearchService with LightRAG integration and cascade methods
+- `backend/config.py` — LightRAG configuration fields
+- `backend/schemas.py` — `cascade_tier` in SearchResponse
+- `backend/routers/search.py` — `?creator=` query parameter
+
+---
+
+*See also: [[Chat-Engine]], [[Architecture]], [[API-Surface]], [[Decisions]]*
--- a/_Sidebar.md
+++ b/_Sidebar.md
@ -10,6 +10,11 @@
 - [[Pipeline]]
 - [[Player]]

+**Features**
+- [[Chat-Engine]]
+- [[Search-Retrieval]]
+- [[Highlights]]
+
 **Reference**
 - [[API-Surface]]
 - [[Frontend]]
@ -20,4 +25,4 @@
 - [[Deployment]]
 - [[Monitoring]]
 - [[Development-Guide]]
- [[Agent-Context]]
+- [[Agent-Context]]