M021: Chat engine, retrieval cascade, highlights, audio mode, chapters, impersonation write mode docs

jlightner 2026-04-04 06:40:26 +00:00
parent be05f5edf2
commit eec99b6c7d
13 changed files with 533 additions and 7 deletions

@ -8,7 +8,7 @@
|--------|------|---------------|-------|
| GET | `/health` | `{status, service, version, database}` | Health check |
| GET | `/api/v1/stats` | `{technique_count, creator_count}` | Homepage stats |
| GET | `/api/v1/search?q=` | `{items, partial_matches, total, query, fallback_used}` | Semantic + keyword fallback (D009) |
| GET | `/api/v1/search?q=&creator=` | `{items, partial_matches, total, query, fallback_used, cascade_tier}` | LightRAG primary + Qdrant fallback, optional creator cascade (D039, D040) |
| GET | `/api/v1/search/suggestions?q=` | `{suggestions: [{text, type}]}` | Typeahead autocomplete |
| GET | `/api/v1/search/popular` | `{items: [{query, count}]}` | Popular searches (D025) |
| GET | `/api/v1/techniques?limit=&offset=` | `{items, total, offset, limit}` | Paginated technique list |
@ -139,3 +139,50 @@ JWT-based authentication added in M019. See [[Authentication]] for full details.
---
*See also: [[Architecture]], [[Data-Model]], [[Frontend]], [[Authentication]]*
utput` | Delete all pipeline output |
| POST | `/admin/pipeline/optimize-prompt` | Trigger prompt optimization |
| POST | `/admin/pipeline/reindex-all` | Rebuild Qdrant index |
| GET | `/admin/pipeline/worker-status` | Celery worker health |
| GET | `/admin/pipeline/recent-activity` | Recent pipeline events |
| POST | `/admin/pipeline/creator-profile/{creator_id}` | Update creator profile |
| POST | `/admin/pipeline/avatar-fetch/{creator_id}` | Fetch creator avatar |
## Other Endpoints (2)
| Method | Path | Notes |
|--------|------|-------|
| POST | `/api/v1/ingest` | Transcript upload |
| GET | `/api/v1/videos` | ⚠️ Bare list (not paginated) |
## Response Conventions
**Standard paginated response:**
```json
{
"items": [...],
"total": 83,
"offset": 0,
"limit": 20
}
```
**Known inconsistencies:**
- `GET /topics` returns bare list instead of paginated dict
- `GET /videos` returns bare list instead of paginated dict
- Search uses `items` key (not `results`)
- `/techniques/random` returns JSON `{slug}` (not HTTP redirect)
**New endpoints should follow the `{items, total, offset, limit}` paginated pattern.**
## Authentication
JWT-based authentication added in M019. See [[Authentication]] for full details.
- **Public endpoints** (search, browse, techniques) require no auth
- **Auth endpoints** (`/auth/register`, `/auth/login`) are open; `/auth/me` requires Bearer JWT
- **Consent endpoints** require Bearer JWT with ownership verification (creator must own the video, or be admin)
- **Admin endpoints** (`/admin/*`) are accessible to anyone with network access (auth planned for future milestone)
---
*See also: [[Architecture]], [[Data-Model]], [[Frontend]], [[Authentication]]*

@ -2,7 +2,7 @@
## System Overview
Chrysopedia is a self-hosted music production knowledge base that synthesizes technique articles from video transcripts using a 6-stage LLM pipeline. It runs as a Docker Compose stack on `ub01` with 11 containers.
Chrysopedia is a self-hosted music production knowledge base that synthesizes technique articles from video transcripts using a 7-stage LLM pipeline. It runs as a Docker Compose stack on `ub01` with 11 containers.
```
┌─────────────────────────────────────────────────────────────────┐
@ -94,3 +94,11 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
---
*See also: [[Deployment]], [[Pipeline]], [[Data-Model]], [[Authentication]]*
→ PostgreSQL, embeddings → Qdrant
5. **Serving:** React SPA fetches from FastAPI, search queries hit Qdrant then PostgreSQL fallback
6. **Auth:** JWT-protected endpoints for creator consent management and admin features (see [[Authentication]])
---
*See also: [[Deployment]], [[Pipeline]], [[Data-Model]], [[Authentication]]*
], [[Authentication]]*

115
Chat-Engine.md Normal file

@ -0,0 +1,115 @@
# Chat Engine
Streaming question-answering interface backed by LightRAG retrieval and LLM completion. Added in M021/S03.
## Architecture
```
User types question in ChatPage
POST /api/v1/chat { query: "...", creator?: "..." }
ChatService.stream(query, creator?)
├─ 1. Retrieve: SearchService.search(query, creator)
│ └─ Uses 4-tier cascade if creator provided (see [[Search-Retrieval]])
├─ 2. Prompt: Assemble numbered context block into encyclopedic system prompt
│ └─ Sources formatted as [1] Title — Summary for citation mapping
├─ 3. Stream: openai.AsyncOpenAI with stream=True
│ └─ Tokens streamed as SSE events in real-time
SSE response → ChatPage renders tokens + citation links
```
## SSE Protocol
The chat endpoint returns a `text/event-stream` response with four event types in strict order:
| Event | Payload | When |
|-------|---------|------|
| `sources` | `[{title, slug, creator_name, summary}]` | First — citation metadata for link rendering |
| `token` | `string` (text chunk) | Repeated — streamed LLM completion tokens |
| `done` | `{cascade_tier: "creator"\|"domain"\|"global"\|"none"\|""}` | Once — signals completion, includes which retrieval tier answered |
| `error` | `{message: string}` | On failure — emitted if LLM errors mid-stream |
The `cascade_tier` in the `done` event reveals which tier of the retrieval cascade served the context (see [[Search-Retrieval]]).
## Citation Format
The LLM is instructed to reference sources using numbered citations `[N]` in its response. The frontend parses these into superscript links:
- `[1]` → links to `/techniques/:slug` for the corresponding source
- Multiple citations supported: `[1][3]` or `[1,3]`
- Citation regex: `/\[(\d+)\]/g` parsed locally in ChatPage
## API Endpoint
### POST /api/v1/chat
| Field | Type | Required | Validation |
|-------|------|----------|------------|
| `query` | string | Yes | 11000 characters |
| `creator` | string | No | Creator UUID or slug for scoped retrieval |
**Response:** `text/event-stream` (SSE)
**Error responses:**
- `422` — Empty or missing query, query exceeds 1000 chars
## Backend: ChatService
Located in `backend/chat_service.py`. The retrieve-prompt-stream pipeline:
1. **Retrieve** — Calls `SearchService.search()` with the query and optional creator parameter. Gets back ranked technique page results with the cascade_tier.
2. **Prompt** — Builds a numbered context block from search results. System prompt instructs the LLM to act as a music production encyclopedia, cite sources with `[N]` notation, and stay grounded in the provided context.
3. **Stream** — Opens an async streaming completion via `openai.AsyncOpenAI` (configured to point at DGX Sparks Qwen or local Ollama). Yields SSE events as tokens arrive.
Error handling: If the LLM fails mid-stream (after some tokens have been sent), an `error` event is emitted so the frontend can display a failure message rather than leaving the response hanging.
## Frontend: ChatPage
Route: `/chat` (lazy-loaded, code-split)
### Components
- **Text input + submit button** — Query entry with Enter-to-submit
- **Streaming message display** — Accumulates tokens with blinking cursor animation during streaming
- **Citation markers**`[N]` parsed to superscript links targeting `/techniques/:slug`
- **Source list** — Numbered sources with creator attribution displayed below the response
- **States:** Loading (streaming indicator), error (message display), empty (placeholder prompt)
### SSE Client
Located in `frontend/src/api/chat.ts`. Uses `fetch()` + `ReadableStream` with typed callbacks:
```typescript
streamChat(query, creator?, {
onSources: (sources) => void,
onToken: (token) => void,
onDone: (data) => void,
onError: (error) => void,
})
```
## Key Files
- `backend/chat_service.py` — ChatService retrieve-prompt-stream pipeline
- `backend/routers/chat.py` — POST /api/v1/chat endpoint
- `frontend/src/api/chat.ts` — SSE client utility
- `frontend/src/pages/ChatPage.tsx` — Chat UI page component
- `frontend/src/pages/ChatPage.module.css` — Chat page styles
## Design Decisions
- **Standalone ASGI test client pattern** — Tests use mocked DB to avoid PostgreSQL dependency, enabling fast CI runs
- **Patch `openai.AsyncOpenAI` constructor** rather than instance attribute for reliable test mocking
- **Local citation regex** in ChatPage rather than importing from utils — link targets differ from technique page citations
---
*See also: [[Search-Retrieval]], [[API-Surface]], [[Frontend]]*

@ -71,6 +71,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
| topic_tags | ARRAY(String) | |
| content_type | Enum | tutorial / tip / exploration / walkthrough |
| review_status | String | pending / approved / rejected |
| sort_order | Integer | Display ordering within video (M021/S06) |
### TechniquePage
@ -188,6 +189,8 @@ Append-only versioned record of per-field consent changes.
| PipelineRunTrigger | auto, manual, retrigger, clean_retrigger |
| **UserRole** | admin, creator |
| **ConsentField** | kb_inclusion, training_usage, public_display |
| **HighlightStatus** | candidate, approved, rejected (M021/S04) |
| **ChapterStatus** | draft, approved, hidden (M021/S06) |
## Schema Notes
@ -201,4 +204,15 @@ Append-only versioned record of per-field consent changes.
---
*See also: [[Architecture]], [[API-Surface]], [[Pipeline]], [[Authentication]]*
changes currently require manual DDL
- **body_sections_format** discriminator enables v1/v2 format coexistence (D024)
- **topic_category casing** is inconsistent across records (e.g., "Sound design" vs "Sound Design") — known data quality issue
- **Stage 4 classification data** (per-moment topic_tags) stored in Redis with 24h TTL, not DB columns
- **Timestamp convention:** `datetime.now(timezone.utc).replace(tzinfo=None)` — asyncpg rejects timezone-aware datetimes for TIMESTAMP WITHOUT TIME ZONE columns (D002)
- **User passwords** are stored as bcrypt hashes via `bcrypt.hashpw()`
- **Consent audit** uses version numbers assigned in application code (`max(version) + 1` per video_consent_id)
---
*See also: [[Architecture]], [[API-Surface]], [[Pipeline]], [[Authentication]]*

@ -31,6 +31,13 @@ Architectural and pattern decisions made during Chrysopedia development. Append-
| D034 | Documentation strategy | Forgejo wiki, KB slice at end of every milestone | Incremental docs stay current; final pass in M025 |
| D035 | File/object storage | MinIO (S3-compatible) self-hosted | Docker-native, signed URLs, fits existing infrastructure |
## M021 Decisions
| # | When | Decision | Choice | Rationale |
|---|------|----------|--------|-----------|
| D039 | M021/S01 | LightRAG scoring strategy | Position-based (1.0 → 0.5 descending), sequential Qdrant fallback | `/query/data` has no numeric relevance score; retrieval order is the only signal |
| D040 | M021/S02 | Creator-scoped retrieval strategy | 4-tier cascade: creator → domain → global → none | Progressive widening ensures results while preferring creator context; `ll_keywords` for soft scoping; 3x oversampling for post-filter survival |
## UI/UX Decisions
| # | Decision | Choice |

@ -53,7 +53,13 @@ Local component state only (`useState`/`useEffect`). No Redux, Zustand, Context
## API Client
Single module `public-client.ts` (~600 lines) with typed `request<T>` helper. Relative `/api/v1` base URL (nginx proxies to API container). All response TypeScript interfaces defined in the same file.
Two API modules:
- `public-client.ts` (~600 lines) — typed `request<T>` helper for REST endpoints
- `chat.ts` — SSE streaming client for POST /api/v1/chat using `fetch()` + `ReadableStream`
- `videos.ts` — chapter management functions (fetchChapters, fetchCreatorChapters, updateChapter, reorderChapters, approveChapters)
- `auth.ts` — authentication + impersonation functions including `fetchImpersonationLog()`
Relative `/api/v1` base URL (nginx proxies to API container).
## CSS Architecture
@ -108,3 +114,10 @@ Single module `public-client.ts` (~600 lines) with typed `request<T>` helper. Re
---
*See also: [[Architecture]], [[API-Surface]], [[Development-Guide]]*
*See also: [[Architecture]], [[API-Surface]], [[Development-Guide]]*
ocalhost:8001`
- **Production:** nginx serves static `dist/` bundle, proxies `/api` to FastAPI container
---
*See also: [[Architecture]], [[API-Surface]], [[Development-Guide]]*

143
Highlights.md Normal file

@ -0,0 +1,143 @@
# Highlight Detection
Heuristic scoring engine that ranks KeyMoment records into highlight candidates using 7 weighted dimensions. Added in M021/S04.
## Overview
Highlight detection scores every KeyMoment in a video to identify the most "highlightable" segments — moments that would work well as standalone clips or featured content. The scoring is a pure function (no ML model, no external API) based on 7 dimensions derived from existing KeyMoment metadata.
## Scoring Dimensions
Total weight sums to 1.0. Each dimension produces a 0.01.0 score.
| Dimension | Weight | What It Measures |
|-----------|--------|-----------------|
| `duration_fitness` | 0.25 | Piecewise linear curve peaking at 3060 seconds (ideal clip length) |
| `content_type` | 0.20 | Content type favorability: tutorial > tip > walkthrough > exploration |
| `specificity_density` | 0.20 | Regex-based counting of specific units, ratios, and named parameters in summary text |
| `plugin_richness` | 0.10 | Number of plugins/VSTs referenced (more = more actionable) |
| `transcript_energy` | 0.10 | Teaching-phrase detection in transcript text (e.g., "the trick is", "key thing") |
| `source_quality` | 0.10 | Source quality rating: high=1.0, medium=0.6, low=0.3 |
| `video_type` | 0.05 | Video type favorability mapping |
### Duration Fitness Curve
Uses piecewise linear (not Gaussian) for predictability:
- 010s → low score (too short)
- 1030s → ramp up
- 3060s → peak score (1.0)
- 60120s → gradual decline
- 120s+ → low score (too long for a highlight)
## Data Model
### HighlightCandidate
| Field | Type | Notes |
|-------|------|-------|
| id | UUID PK | |
| key_moment_id | FK → KeyMoment | Unique constraint (`uq_highlight_candidate_moment`) |
| source_video_id | FK → SourceVideo | Indexed |
| score | Float | Composite score 0.01.0 |
| score_breakdown | JSONB | Per-dimension scores (7 fields) |
| duration_secs | Float | Cached from KeyMoment for display |
| status | Enum(HighlightStatus) | candidate / approved / rejected |
| created_at | Timestamp | |
| updated_at | Timestamp | |
### HighlightStatus Enum
| Value | Meaning |
|-------|---------|
| `candidate` | Scored but not reviewed |
| `approved` | Admin-approved as a highlight |
| `rejected` | Admin-rejected |
### Database Indexes
- `source_video_id` — filter by video
- `score` DESC — rank ordering
- `status` — filter by review state
### Migration
Alembic migration `019_add_highlight_candidates.py` creates the table with all indexes and the named unique constraint.
## API Endpoints
All under `/api/v1/admin/highlights/`. Admin access.
| Method | Path | Purpose |
|--------|------|---------|
| POST | `/admin/highlights/detect/{video_id}` | Score all KeyMoments for a video, upsert candidates |
| POST | `/admin/highlights/detect-all` | Score all videos (triggers Celery tasks) |
| GET | `/admin/highlights/candidates` | Paginated candidate list, sorted by score DESC |
| GET | `/admin/highlights/candidates/{id}` | Single candidate with full `score_breakdown` |
### Detect Response
```json
{
"video_id": "uuid",
"candidates_created": 12,
"candidates_updated": 0
}
```
### Candidate Response
```json
{
"id": "uuid",
"key_moment_id": "uuid",
"source_video_id": "uuid",
"score": 0.847,
"score_breakdown": {
"duration_fitness": 0.95,
"content_type_weight": 0.80,
"specificity_density": 0.72,
"plugin_richness": 0.60,
"transcript_energy": 0.85,
"source_quality_weight": 1.00,
"video_type_weight": 0.50
},
"duration_secs": 45.0,
"status": "candidate",
"created_at": "...",
"updated_at": "..."
}
```
## Pipeline Integration
### Celery Task: `stage_highlight_detection`
- **Binding:** `bind=True, max_retries=3`
- **Session:** Uses `_get_sync_session` (sync SQLAlchemy, per D004)
- **Flow:** Load KeyMoments for video → score each via `score_moment()` → bulk upsert via `INSERT ON CONFLICT` on named constraint `uq_highlight_candidate_moment`
- **Events:** Emits `pipeline_events` rows for start/complete/error with candidate count in payload
### Scoring Function
`score_moment()` in `backend/pipeline/highlight_scorer.py` is a **pure function** — no DB access, no side effects. Takes a KeyMoment-like dict, returns `(score, breakdown_dict)`. This separation enables easy unit testing (28 tests, runs in 0.03s).
## Design Decisions
- **Pure function scoring** — No DB or side effects in `score_moment()`, enabling fast unit tests
- **Piecewise linear duration** — Predictable behavior vs. Gaussian bell curve
- **Named unique constraint**`uq_highlight_candidate_moment` enables idempotent upserts via `ON CONFLICT`
- **Lazy import**`score_moment` imported inside Celery task to avoid circular imports at module load
## Key Files
- `backend/pipeline/highlight_scorer.py` — Pure scoring function with 7 dimensions
- `backend/pipeline/highlight_schemas.py` — Pydantic schemas (HighlightScoreBreakdown, HighlightCandidateResponse, HighlightBatchResult)
- `backend/pipeline/stages.py``stage_highlight_detection` Celery task
- `backend/routers/highlights.py` — 4 admin API endpoints
- `backend/models.py` — HighlightCandidate model, HighlightStatus enum
- `alembic/versions/019_add_highlight_candidates.py` — Migration
- `backend/pipeline/test_highlight_scorer.py` — 28 unit tests
---
*See also: [[Pipeline]], [[Data-Model]], [[API-Surface]]*

@ -37,4 +37,10 @@ Producers can search for specific techniques and find timestamped key moments, s
---
*Last updated: 2026-04-04 — M021 chat engine, retrieval cascade, highlights, audio mode, chapters, impersonation write mode*
inx reverse proxy on nuc01 |
---
*Last updated: 2026-04-03 — M018/S02 initial bootstrap*
” M018/S02 initial bootstrap*

@ -86,17 +86,39 @@ The `original_user_id` claim is what `reject_impersonation` checks.
### AuthContext Extensions
- `startImpersonation(userId)` — Calls impersonate API, saves current admin token to `sessionStorage`, swaps to impersonation token
- `startImpersonation(userId, writeMode?)` — Calls impersonate API (with optional write_mode body), saves current admin token to `sessionStorage`, swaps to impersonation token
- `exitImpersonation()` — Calls stop API, restores admin token from `sessionStorage`
- `user.impersonating` (boolean) — True when viewing as another user
- `isWriteMode` (boolean) — True when impersonation session has write access (M021/S07)
### ImpersonationBanner
Fixed amber bar at page top when impersonating. Shows "Viewing as {name}" with Exit button. Rendered in `AppShell` when `user.impersonating` is true.
Fixed bar at page top when impersonating. Two visual states (M021/S07):
- **Amber** "👁 Viewing as {name}" — read-only mode
- **Red** "✏️ Editing as {name}" — write mode (adds `body.impersonating-write` class)
Uses `data-write-mode` attribute for CSS color switching.
### ConfirmModal (M021/S07)
Reusable confirmation dialog component used for "Edit As" write-mode impersonation:
- Backdrop + Escape/backdrop-click dismiss
- `variant` prop: `warning` (amber) or `danger` (red confirm button)
- Uses `data-variant` attribute for CSS variant styling
### AdminUsers Page
Route: `/admin/users`. Table of all users with "View As" buttons for creator-role users. Code-split with `React.lazy`.
Route: `/admin/users`. Table of all users with two action buttons per creator:
- "View As" — starts read-only impersonation (no confirmation modal)
- "Edit As" — opens ConfirmModal with danger variant before starting write-mode impersonation
### AdminAuditLog Page (M021/S07)
Route: `/admin/audit-log`. Six-column table:
- Date/Time, Admin, Target User, Action, Write Mode, IP Address
- Badge styling via `data-variant` / `data-action` / `data-write-mode` attributes
- Loading/error/empty states, Previous/Next pagination
- Linked from AdminDropdown after "Users"
## Key Files

@ -55,6 +55,14 @@ Stage 6: Embed & Index — generate embeddings, upsert to Qdrant (non-blocking)
- **Non-blocking:** Failures log WARNING but don't fail the pipeline (D005)
- Can be re-triggered independently via `/admin/pipeline/reindex-all`
### Stage 7: Highlight Detection (M021/S04)
- Scores every KeyMoment in a video using 7 weighted heuristic dimensions
- Pure function scoring: duration_fitness (0.25), content_type (0.20), specificity_density (0.20), plugin_richness (0.10), transcript_energy (0.10), source_quality (0.10), video_type (0.05)
- Celery task `stage_highlight_detection` with `bind=True, max_retries=3`
- Bulk upserts via `INSERT ON CONFLICT` on named constraint `uq_highlight_candidate_moment`
- Output: HighlightCandidate records in PostgreSQL with composite score and per-dimension breakdown
- See [[Highlights]] for full scoring details and API endpoints
## LLM Configuration
| Setting | Value |

@ -74,6 +74,37 @@ Playback rate options: 0.5x, 0.75x, 1x, 1.25x, 1.5x, 2x.
## Key Files
- `frontend/src/pages/WatchPage.tsx` — Page component
- `frontend/src/components/VideoPlayer.tsx` — Video element + HLS setup
- `frontend/src/components/PlayerControls.tsx` — Play/pause, speed, volume, seek bar
- `frontend/src/components/TranscriptSidebar.tsx` — Synchronized transcript display
- `frontend/src/components/AudioWaveform.tsx` — Waveform visualization for audio content (M021/S05)
- `frontend/src/components/ChapterMarkers.tsx` — Seek bar chapter overlay (M021/S05)
- `frontend/src/hooks/useMediaSync.ts` — Shared playback state hook
- `backend/routers/videos.py` — Video detail + transcript API
---
*See also: [[Architecture]], [[API-Surface]], [[Frontend]]*
with chapter title
- Click seeks playback to chapter start time
- Integrated into `PlayerControls` via wrapper container div
## Audio Waveform (M021/S05)
`AudioWaveform` component renders when content is audio-only (no video_url):
- Hidden `<audio>` element shared between `useMediaSync` and WaveSurfer
- wavesurfer.js with MediaElement backend — playback controlled identically to video mode
- Dark-themed CSS matching the video player area
- RegionsPlugin for labeled chapter regions with drag/resize support
### Dependencies
- `wavesurfer.js` — waveform rendering (~200KB, loaded only in audio mode)
- `useMediaSync` hook widened from `HTMLVideoElement` to `HTMLMediaElement` for audio/video polymorphism
## Key Files
- `frontend/src/pages/WatchPage.tsx` — Page component
- `frontend/src/components/VideoPlayer.tsx` — Video element + HLS setup
- `frontend/src/components/PlayerControls.tsx` — Play/pause, speed, volume, seek bar

107
Search-Retrieval.md Normal file

@ -0,0 +1,107 @@
# Search & Retrieval
LightRAG-first search with automatic Qdrant fallback, plus a 4-tier creator-scoped retrieval cascade. Added in M021/S01S02.
## Overview
Search went through a major upgrade in M021: LightRAG replaced Qdrant as the primary search engine, with Qdrant retained as an automatic fallback. A 4-tier creator-scoped retrieval cascade was added for context-aware search when querying within a creator's content.
## LightRAG Integration
LightRAG is a graph-based RAG engine running as a standalone service on port 9621. It replaced Qdrant as the primary search path for `GET /api/v1/search`.
### How It Works
1. **Query**`SearchService._lightrag_search()` POSTs to LightRAG `/query/data` with `mode: "hybrid"`
2. **Parse** — Response contains `chunks` (text passages with file_source metadata) and `entities` (graph nodes)
3. **Extract** — Technique slugs parsed from `file_source` paths using format `technique:{slug}:creator:{uuid}`
4. **Lookup** — Batch PostgreSQL query maps slugs to full TechniquePage records
5. **Score** — Position-based scoring (1.0 → 0.5 descending) since `/query/data` has no numeric relevance score (D039)
6. **Supplement** — Entity names matched against technique page titles as supplementary results
### Configuration
| Field | Default | Purpose |
|-------|---------|---------|
| `lightrag_url` | `http://chrysopedia-lightrag:9621` | LightRAG service URL |
| `lightrag_search_timeout` | `2.0` (seconds) | Request timeout |
| `lightrag_min_query_length` | `3` (characters) | Queries shorter than this skip LightRAG |
### Fallback Behavior
LightRAG failures trigger automatic fallback to the existing Qdrant + keyword search path:
- **Timeout** → fallback + WARNING log with `reason=timeout`
- **Connection error** → fallback + WARNING log with `reason=connection_error`
- **HTTP error (e.g. 500)** → fallback + WARNING log with `reason=http_error`
- **Empty results** → fallback + WARNING log with `reason=empty_results`
- **Parse error** → fallback + WARNING log with `reason=parse_error`
- **Short query (<3 chars)** → skips LightRAG entirely, uses Qdrant directly
The `fallback_used` field in the search response indicates which engine served results.
## 4-Tier Creator-Scoped Cascade
When a `?creator=` parameter is provided (e.g., from a creator profile page or the chat engine), search runs a progressive cascade that widens scope until results are found. Added in M021/S02 (D040).
```
Tier 1: Creator-scoped
└─ LightRAG with ll_keywords=[creator_name], post-filter by creator_id (3× oversampling)
│ empty?
Tier 2: Domain-scoped
└─ LightRAG with ll_keywords=[dominant_category] (requires ≥2 pages in category)
│ empty?
Tier 3: Global
└─ Standard LightRAG search (no scoping)
│ empty?
Tier 4: None
└─ cascade_tier="none" — no results from any tier
```
### Cascade Details
| Tier | Method | Scoping | Post-Filter |
|------|--------|---------|-------------|
| Creator | `_creator_scoped_search()` | `ll_keywords: [creator_name]` | Yes — filter by `creator_id`, request 3× `top_k` |
| Domain | `_domain_scoped_search()` | `ll_keywords: [domain]` | No — any creator in domain qualifies |
| Global | `_lightrag_search()` | None | No |
| None | — | — | — (empty result) |
**Domain detection:** SQL aggregation finds the dominant `topic_category` across a creator's technique pages. Requires ≥2 pages in the category to declare a domain — fewer means insufficient signal.
**Post-filtering with oversampling:** Creator tier requests 3× the desired result count from LightRAG, then filters locally by `creator_id`. This compensates for LightRAG not supporting native creator filtering.
### Response Fields
| Field | Type | Description |
|-------|------|-------------|
| `cascade_tier` | string | Which tier served: `"creator"`, `"domain"`, `"global"`, `"none"`, or `""` (no cascade) |
| `fallback_used` | boolean | `true` if Qdrant fallback was used instead of LightRAG |
## Observability
- `logger.info` per LightRAG search: query, latency_ms, result_count
- `logger.info` per cascade tier: query, creator, tier, latency_ms, result_count
- `logger.warning` on any failure path with structured `reason=` tag
- `cascade_tier` and `fallback_used` in API response for downstream consumers
## Key Decisions
| # | Decision | Choice | Rationale |
|---|----------|--------|-----------|
| D039 | LightRAG scoring | Position-based (1.0 → 0.5) | `/query/data` has no numeric relevance score; sequential fallback to Qdrant |
| D040 | Creator-scoped strategy | 4-tier cascade (creator → domain → global → none) | Progressive widening ensures results while preferring creator context |
## Key Files
- `backend/search_service.py` — SearchService with LightRAG integration and cascade methods
- `backend/config.py` — LightRAG configuration fields
- `backend/schemas.py``cascade_tier` in SearchResponse
- `backend/routers/search.py``?creator=` query parameter
---
*See also: [[Chat-Engine]], [[Architecture]], [[API-Surface]], [[Decisions]]*

@ -10,6 +10,11 @@
- [[Pipeline]]
- [[Player]]
**Features**
- [[Chat-Engine]]
- [[Search-Retrieval]]
- [[Highlights]]
**Reference**
- [[API-Surface]]
- [[Frontend]]
@ -20,4 +25,4 @@
- [[Deployment]]
- [[Monitoring]]
- [[Development-Guide]]
- [[Agent-Context]]
- [[Agent-Context]]