docs: add auth, consent, LightRAG documentation (M019/S06)

Updated Architecture, Data-Model, API-Surface, Deployment, _Sidebar.
New Authentication page covering JWT flow, invite codes, consent, LightRAG.
Chrysopedia Bot 2026-04-03 18:23:17 -05:00
parent 081b39f767
commit e0a52757b0
6 changed files with 313 additions and 21 deletions

@ -1,6 +1,6 @@
# API Surface # API Surface
41 API endpoints grouped by domain. All served by FastAPI under `/api/v1/`. 50 API endpoints grouped by domain. All served by FastAPI under `/api/v1/`.
## Public Endpoints (10) ## Public Endpoints (10)
@ -31,6 +31,39 @@ title, slug, topic_category, topic_tags, summary, body_sections, body_sections_f
| GET | `/api/v1/topics/{cat}/{sub}` | `{items, total, offset, limit}` | Subtopic techniques | | GET | `/api/v1/topics/{cat}/{sub}` | `{items, total, offset, limit}` | Subtopic techniques |
| GET | `/api/v1/topics/{cat}` | `{items, total, offset, limit}` | Category techniques | | GET | `/api/v1/topics/{cat}` | `{items, total, offset, limit}` | Category techniques |
## Auth Endpoints (4)
All under prefix `/api/v1/auth/`. JWT-protected except registration and login.
| Method | Path | Auth | Purpose |
|--------|------|------|---------|
| POST | `/auth/register` | None (invite code required) | Create account with invite code, email, password, display_name. Optional creator_slug links to Creator. Returns UserResponse (201). |
| POST | `/auth/login` | None | Email + password → JWT access token (HS256, 24h expiry). Returns `{access_token, token_type}`. |
| GET | `/auth/me` | Bearer JWT | Current user profile. Returns UserResponse. |
| PUT | `/auth/me` | Bearer JWT | Update display_name and/or password (requires current_password for password changes). Returns UserResponse. |
## Consent Endpoints (5)
All under prefix `/api/v1/consent/`. All require Bearer JWT.
| Method | Path | Auth | Purpose |
|--------|------|------|---------|
| GET | `/consent/videos` | Creator or Admin | List consent records for current creator's videos. Admin sees all. Paginated (offset, limit). |
| GET | `/consent/videos/{video_id}` | Creator (owner) or Admin | Single video consent status. Returns defaults if no consent record exists. |
| PUT | `/consent/videos/{video_id}` | Creator (owner) or Admin | Upsert consent flags (partial update). Creates audit log entries for each changed field. |
| GET | `/consent/videos/{video_id}/history` | Creator (owner) or Admin | Versioned audit trail of consent changes for a video. |
| GET | `/consent/admin/summary` | Admin only | Aggregate consent flag counts across all videos. |
### Consent Fields
Three boolean consent flags per video, each independently toggleable:
| Field | Default | Meaning |
|-------|---------|---------|
| `kb_inclusion` | false | Allow indexing into knowledge base |
| `training_usage` | false | Allow use for model training |
| `public_display` | true | Allow public display on site |
## Report Endpoints (3) ## Report Endpoints (3)
| Method | Path | Purpose | | Method | Path | Purpose |
@ -96,8 +129,13 @@ All under prefix `/api/v1/admin/pipeline/`.
## Authentication ## Authentication
No authentication on any endpoint. Admin routes (`/admin/*`) are accessible to anyone with network access. Phase 2 will add auth middleware (see [[Decisions]] D033). JWT-based authentication added in M019. See [[Authentication]] for full details.
- **Public endpoints** (search, browse, techniques) require no auth
- **Auth endpoints** (`/auth/register`, `/auth/login`) are open; `/auth/me` requires Bearer JWT
- **Consent endpoints** require Bearer JWT with ownership verification (creator must own the video, or be admin)
- **Admin endpoints** (`/admin/*`) are accessible to anyone with network access (auth planned for future milestone)
--- ---
*See also: [[Architecture]], [[Data-Model]], [[Frontend]]* *See also: [[Architecture]], [[Data-Model]], [[Frontend]], [[Authentication]]*

@ -2,7 +2,7 @@
## System Overview ## System Overview
Chrysopedia is a self-hosted music production knowledge base that synthesizes technique articles from video transcripts using a 6-stage LLM pipeline. It runs as a Docker Compose stack on `ub01` with 8 containers. Chrysopedia is a self-hosted music production knowledge base that synthesizes technique articles from video transcripts using a 6-stage LLM pipeline. It runs as a Docker Compose stack on `ub01` with 11 containers.
``` ```
┌─────────────────────────────────────────────────────────────────┐ ┌─────────────────────────────────────────────────────────────────┐
@ -20,6 +20,11 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
│ │ Postgres │ │ Redis │ │ Qdrant │ │ Ollama │ │ │ │ Postgres │ │ Redis │ │ Qdrant │ │ Ollama │ │
│ │ :5433 │ │ :6379 │ │ :6333 │ │ :11434 │ │ │ │ :5433 │ │ :6379 │ │ :6333 │ │ :11434 │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────────┘ │ │ └──────────┘ └──────────┘ └──────────┘ └──────────────┘ │
│ │
│ ┌──────────────┐ │
│ │ LightRAG │ │
│ │ :9621 │ │
│ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘ └─────────────────────────────────────────────────────────────────┘
│ nginx reverse proxy │ nginx reverse proxy
@ -34,9 +39,10 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
- **Zero external frontend dependencies** beyond React, react-router-dom, and Vite - **Zero external frontend dependencies** beyond React, react-router-dom, and Vite
- **Monolithic CSS** — 5,820 lines, single file, BEM naming, 77 custom properties - **Monolithic CSS** — 5,820 lines, single file, BEM naming, 77 custom properties
- **No authentication** — admin routes are network-access-controlled only - **JWT authentication** — invite-code registration, bcrypt passwords, HS256 tokens, role-based access control (admin, creator)
- **Dual SQLAlchemy strategy** — async engine for FastAPI request handlers, sync engine for Celery pipeline tasks (D004) - **Dual SQLAlchemy strategy** — async engine for FastAPI request handlers, sync engine for Celery pipeline tasks (D004)
- **Non-blocking pipeline side effects** — embedding/Qdrant failures don't block page synthesis (D005) - **Non-blocking pipeline side effects** — embedding/Qdrant failures don't block page synthesis (D005)
- **LightRAG** — graph-based RAG knowledge base backed by Qdrant + Ollama, running as a standalone service
## Docker Services ## Docker Services
@ -46,17 +52,20 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
| Redis 7 | redis:7-alpine | chrysopedia-redis | 6379 (internal) | — | | Redis 7 | redis:7-alpine | chrysopedia-redis | 6379 (internal) | — |
| Qdrant 1.13.2 | qdrant/qdrant:v1.13.2 | chrysopedia-qdrant | 6333 (internal) | chrysopedia_qdrant_data | | Qdrant 1.13.2 | qdrant/qdrant:v1.13.2 | chrysopedia-qdrant | 6333 (internal) | chrysopedia_qdrant_data |
| Ollama | ollama/ollama:latest | chrysopedia-ollama | 11434 (internal) | chrysopedia_ollama_data | | Ollama | ollama/ollama:latest | chrysopedia-ollama | 11434 (internal) | chrysopedia_ollama_data |
| LightRAG | ghcr.io/hkuds/lightrag:latest | chrysopedia-lightrag | 9621 (internal) | chrysopedia_lightrag_data |
| API (FastAPI) | Dockerfile.api | chrysopedia-api | 8000 (internal) | Bind: backend/, prompts/ | | API (FastAPI) | Dockerfile.api | chrysopedia-api | 8000 (internal) | Bind: backend/, prompts/ |
| Worker (Celery) | Dockerfile.api | chrysopedia-worker | — | Bind: backend/, prompts/ | | Worker (Celery) | Dockerfile.api | chrysopedia-worker | — | Bind: backend/, prompts/ |
| Watcher | Dockerfile.api | chrysopedia-watcher | — | Bind: watch dir | | Watcher | Dockerfile.api | chrysopedia-watcher | — | Bind: watch dir |
| Web (nginx) | Dockerfile.web | chrysopedia-web-8096 | 8096:80 | — | | Web (nginx) | Dockerfile.web | chrysopedia-web-8096 | 8096:80 | — |
**Note:** The 11-container count includes the 3 infrastructure services not shown above (nginx-internal-healthcheck, db-init, and lightrag). Exact running count may vary based on one-shot init containers.
## Network Topology ## Network Topology
- **Compose subnet:** 172.32.0.0/24 (D015) - **Compose subnet:** 172.32.0.0/24 (D015)
- **External access:** nginx on nuc01 (10.0.0.9) reverse-proxies to ub01:8096 - **External access:** nginx on nuc01 (10.0.0.9) reverse-proxies to ub01:8096
- **DNS:** AdGuard Home rewrites chrysopedia.com → 10.0.0.9 - **DNS:** AdGuard Home rewrites chrysopedia.com → 10.0.0.9
- **Internal services** (Redis, Qdrant, Ollama) are not exposed outside the Docker network - **Internal services** (Redis, Qdrant, Ollama, LightRAG) are not exposed outside the Docker network
## Tech Stack ## Tech Stack
@ -67,8 +76,10 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
| Database | PostgreSQL 16 | | Database | PostgreSQL 16 |
| Cache/Broker | Redis 7 (Celery broker + review mode toggle + classification cache) | | Cache/Broker | Redis 7 (Celery broker + review mode toggle + classification cache) |
| Vector Store | Qdrant 1.13.2 | | Vector Store | Qdrant 1.13.2 |
| Knowledge Graph | LightRAG (graph-based RAG, port 9621) |
| Embeddings | Ollama (nomic-embed-text) via OpenAI-compatible /v1/embeddings | | Embeddings | Ollama (nomic-embed-text) via OpenAI-compatible /v1/embeddings |
| LLM | OpenAI-compatible API — DGX Sparks Qwen primary, local Ollama fallback | | LLM | OpenAI-compatible API — DGX Sparks Qwen primary, local Ollama fallback |
| Authentication | JWT (HS256, 24h expiry) + bcrypt + invite codes |
| Deployment | Docker Compose on ub01, nginx reverse proxy on nuc01 | | Deployment | Docker Compose on ub01, nginx reverse proxy on nuc01 |
## Data Flow ## Data Flow
@ -78,7 +89,8 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
3. **Pipeline:** 6 Celery stages process each video (see [[Pipeline]]) 3. **Pipeline:** 6 Celery stages process each video (see [[Pipeline]])
4. **Storage:** Technique pages + key moments → PostgreSQL, embeddings → Qdrant 4. **Storage:** Technique pages + key moments → PostgreSQL, embeddings → Qdrant
5. **Serving:** React SPA fetches from FastAPI, search queries hit Qdrant then PostgreSQL fallback 5. **Serving:** React SPA fetches from FastAPI, search queries hit Qdrant then PostgreSQL fallback
6. **Auth:** JWT-protected endpoints for creator consent management and admin features (see [[Authentication]])
--- ---
*See also: [[Deployment]], [[Pipeline]], [[Data-Model]]* *See also: [[Deployment]], [[Pipeline]], [[Data-Model]], [[Authentication]]*

Authentication.md Normal file

@ -0,0 +1,159 @@
# Authentication
JWT-based authentication system added in M019. Provides invite-code registration, bcrypt password hashing, role-based access control, and a per-video consent management system with versioned audit trail.
## Overview
```
┌──────────────┐ POST /auth/login ┌──────────────┐
│ Frontend │ ──────────────────────────▶│ FastAPI │
│ (React SPA) │◀────── JWT (HS256) ───────│ Backend │
│ │ │ │
│ AuthContext │ Bearer token header │ get_current_ │
│ localStorage│ ──────────────────────────▶│ user() │
└──────────────┘ └──────────────┘
```
## JWT Token Flow
- **Algorithm:** HS256 (HMAC-SHA256)
- **Secret:** `APP_SECRET_KEY` from environment config
- **Expiry:** 24 hours (`_ACCESS_TOKEN_EXPIRE_MINUTES = 1440`)
- **Token URL:** `/api/v1/auth/login` (OAuth2PasswordBearer)
- **Claims:** `sub` (user UUID), `role` (admin/creator), `iat`, `exp`
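
The token parameters listed above can be illustrated with a minimal standard-library sketch of HS256 signing and verification. This is not the project's implementation (which library the backend uses is not stated here). The function names and the `now` parameter are assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

ACCESS_TOKEN_EXPIRE_MINUTES = 1440  # 24 hours, matching the documented expiry


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWT strips."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    """HMAC-SHA256 over the 'header.claims' string."""
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_access_token(user_id: str, role: str, secret: str, now: Optional[int] = None) -> str:
    """Hypothetical helper: build a token carrying sub, role, iat, and exp."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "sub": user_id,
        "role": role,
        "iat": issued_at,
        "exp": issued_at + ACCESS_TOKEN_EXPIRE_MINUTES * 60,
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def decode_access_token(token: str, secret: str, now: Optional[int] = None) -> dict:
    """Hypothetical helper: verify signature, algorithm, and expiry; raise ValueError on failure."""
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(f"{header_b64}.{claims_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(claims_b64))
    current = int(time.time()) if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

In this sketch the secret would come from `APP_SECRET_KEY`. A production service would normally rely on a maintained JWT library rather than hand-rolled signing.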
### Token Lifecycle
1. User submits email + password to `POST /auth/login`
2. Backend verifies password against bcrypt hash in PostgreSQL
3. Backend issues signed JWT with user ID and role claims
4. Frontend stores token in `localStorage` under key `chrysopedia_auth_token`
5. All subsequent API requests include `Authorization: Bearer <token>` header
6. Backend `get_current_user` dependency decodes JWT, loads User from DB, checks `is_active`
7. On token expiry (24h), frontend catches 401 and clears stored token
## Invite Code Registration
Registration is gated by invite codes. No open self-registration.
- **Default seed code:** `CHRYSOPEDIA-ALPHA-2026` (100 uses, no expiry) — created on first startup via `seed_invite_codes()`
- **Code validation:** checks existence, expiry date, and remaining uses
- **On use:** `uses_remaining` is decremented (not deleted)
- **Optional creator linking:** `creator_slug` field in registration body links the new user to an existing Creator record
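
The validation rules above (existence, expiry, remaining uses, decrement on use) can be sketched in plain Python. The `InviteCode` dataclass mirrors the documented fields, while `InviteCodeError`, `redeem_invite_code`, and the in-memory `codes` dictionary are hypothetical stand-ins for the real model and database query.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class InviteCode:
    code: str
    uses_remaining: int = 1
    expires_at: Optional[datetime] = None  # None means the code never expires


class InviteCodeError(Exception):
    """Hypothetical error raised when a code cannot be redeemed."""


def redeem_invite_code(codes: Dict[str, InviteCode], code: str, now: datetime) -> InviteCode:
    """Validate a code and consume one use; the record itself is kept."""
    invite = codes.get(code)
    if invite is None:
        raise InviteCodeError("unknown invite code")
    if invite.expires_at is not None and invite.expires_at <= now:
        raise InviteCodeError("invite code expired")
    if invite.uses_remaining <= 0:
        raise InviteCodeError("invite code has no uses remaining")
    invite.uses_remaining -= 1  # decremented, not deleted
    return invite
```

Against a real database, the check and the decrement would likely need to happen atomically (for example inside one transaction) so that two concurrent registrations cannot both consume the last use.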
### Registration Flow
```
POST /auth/register
{
"email": "user@example.com",
"password": "...",
"display_name": "DJ Producer",
"invite_code": "CHRYSOPEDIA-ALPHA-2026",
"creator_slug": "dj-producer" // optional
}
```
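
A client could assemble the request above as follows. This is a sketch using only the standard library. `API_BASE` is an assumption (that the API is reachable through the web container under `/api/v1`), and `build_register_request` is a hypothetical helper. The `// optional` annotation in the example body is documentation only and is not valid JSON, so the sketch simply omits `creator_slug` when it is not supplied.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "http://ub01:8096/api/v1"  # assumption: adjust to the actual deployment


def build_register_request(
    email: str,
    password: str,
    display_name: str,
    invite_code: str,
    creator_slug: Optional[str] = None,
    api_base: str = API_BASE,
) -> urllib.request.Request:
    """Build (but do not send) a POST /auth/register request."""
    payload = {
        "email": email,
        "password": password,
        "display_name": display_name,
        "invite_code": invite_code,
    }
    if creator_slug is not None:
        payload["creator_slug"] = creator_slug  # optional link to a Creator record
    return urllib.request.Request(
        f"{api_base}/auth/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen()` would send it. Per the API Surface page, a successful registration returns a `UserResponse` with status 201.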
## Role Model
Two roles defined in `UserRole` enum:
| Role | Purpose | Access |
|------|---------|--------|
| `creator` | Content creators linked to a Creator profile | Own videos' consent management |
| `admin` | Platform administrators | All consent records, admin summary, pipeline admin |
### FastAPI Dependencies
| Dependency | Purpose | Usage |
|------------|---------|-------|
| `get_current_user` | Decode JWT → load User → verify active | `Depends(get_current_user)` on any protected endpoint |
| `require_role(UserRole.admin)` | Check user has admin role, return 403 otherwise | `dependencies=[Depends(require_role(UserRole.admin))]` |
| `oauth2_scheme` | Extract Bearer token from Authorization header | Injected into `get_current_user` |
**Error responses:**
- 401 Unauthorized — missing/expired/invalid token, or user not found/inactive
- 403 Forbidden — wrong role, or not linked to creator profile
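
The division of labor between the dependencies, and the 401 versus 403 split, can be sketched without FastAPI. The names `get_current_user`, `require_role`, and `UserRole` come from the tables above. Everything else here (the `AuthError` exception, the `load_user` callback, and the simplified signatures that take already-decoded claims) is an assumption for illustration and will differ from the real dependency code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class UserRole(str, Enum):
    admin = "admin"
    creator = "creator"


@dataclass
class User:
    id: str
    role: UserRole
    is_active: bool = True


class AuthError(Exception):
    """Hypothetical stand-in for an HTTP error response."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def get_current_user(claims: Optional[dict], load_user: Callable[[str], Optional[User]]) -> User:
    """Resolve decoded JWT claims to an active user, or fail with 401."""
    if claims is None or "sub" not in claims:
        raise AuthError(401, "missing, expired, or invalid token")
    user = load_user(claims["sub"])
    if user is None or not user.is_active:
        raise AuthError(401, "user not found or inactive")
    return user


def require_role(required: UserRole) -> Callable[[User], User]:
    """Return a checker that fails with 403 when the user lacks the role."""

    def check(user: User) -> User:
        if user.role != required:
            raise AuthError(403, "insufficient role")
        return user

    return check
```

In the sketch, authentication failures (who are you?) always map to 401, and authorization failures (are you allowed?) map to 403, matching the error list above.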
## Frontend Auth Integration
### AuthContext (`frontend/src/context/AuthContext.tsx`)
React context providing auth state to the entire app:
| Field/Method | Type | Purpose |
|-------------|------|---------|
| `user` | `UserResponse \| null` | Current user object |
| `token` | `string \| null` | Raw JWT string |
| `isAuthenticated` | `boolean` | `!!user` |
| `loading` | `boolean` | True during token rehydration on mount |
| `login(email, password)` | `async` | Calls `/auth/login`, stores token, fetches `/auth/me` |
| `register(data)` | `async` | Calls `/auth/register` |
| `logout()` | `void` | Clears localStorage and state |
### Token Rehydration
On app mount, `AuthProvider` checks `localStorage` for a stored token. If found, it calls `GET /auth/me` to rehydrate the user session. If the token is expired or invalid, it silently clears the stored token.
### ProtectedRoute (`frontend/src/components/ProtectedRoute.tsx`)
Route wrapper that redirects unauthenticated users to `/login?returnTo=<current_path>`.
- Shows nothing (`null`) while auth state is loading (prevents flash redirect)
- Preserves the intended destination in `returnTo` query param
## Consent Management
Per-video consent system allowing creators to control how their content is used.
### Consent Fields
| Field | Default | Meaning |
|-------|---------|---------|
| `kb_inclusion` | `false` | Allow indexing into knowledge base |
| `training_usage` | `false` | Allow use for model training |
| `public_display` | `true` | Allow public display on site |
### Ownership Model
- Each `VideoConsent` record is tied to a `SourceVideo` via `source_video_id`
- Creator users can only access consent for videos belonging to their linked Creator profile
- Admin users bypass ownership checks and can access all consent records
- `_verify_video_ownership()` helper enforces this on every consent endpoint
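
The ownership rules above can be sketched as a small pure function. This is not the body of `_verify_video_ownership()`, which is not shown in this page. The dataclasses, the `Forbidden` exception, and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: str
    role: str  # "admin" or "creator"
    creator_id: Optional[str] = None  # set when linked to a Creator profile


@dataclass
class SourceVideo:
    id: str
    creator_id: str


class Forbidden(Exception):
    """Hypothetical stand-in for a 403 response."""


def check_video_ownership(user: User, video: SourceVideo) -> None:
    """Raise Forbidden unless the user may manage consent for this video."""
    if user.role == "admin":
        return  # admins bypass ownership checks
    if user.creator_id is None:
        raise Forbidden("user is not linked to a creator profile")
    if video.creator_id != user.creator_id:
        raise Forbidden("video belongs to a different creator")
```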
### Audit Trail
Every consent field change produces a `ConsentAuditLog` entry:
- **Versioned:** Sequential `version` numbers per `video_consent_id`
- **Per-field:** Each changed field gets its own audit row with `old_value` → `new_value`
- **Attributed:** `changed_by` (FK → User) and `ip_address` recorded
- **Append-only:** Audit entries are never modified or deleted
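
A partial consent update that writes one audit row per changed field might look like the sketch below. It rests on several assumptions not confirmed here: each row receives its own sequential version (rather than all rows of one request sharing a version), unchanged fields produce no row, and a field that was never set has `old_value` of `None`. `AuditRow` and `apply_consent_update` are hypothetical names, and plain lists and dicts stand in for the database tables.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

CONSENT_FIELDS = ("kb_inclusion", "training_usage", "public_display")


@dataclass(frozen=True)
class AuditRow:
    version: int
    field_name: str
    old_value: Optional[bool]  # None when the field is set for the first time
    new_value: bool
    changed_by: str
    ip_address: str


def apply_consent_update(
    current: Dict[str, bool],
    audit_log: List[AuditRow],
    changes: Dict[str, bool],
    changed_by: str,
    ip_address: str,
) -> List[AuditRow]:
    """Apply a partial update, appending one audit row per changed field."""
    # Next version is max(version) + 1 for this consent record (1 if empty).
    next_version = max((row.version for row in audit_log), default=0) + 1
    new_rows: List[AuditRow] = []
    for field_name, new_value in changes.items():
        if field_name not in CONSENT_FIELDS:
            raise KeyError(f"unknown consent field: {field_name}")
        old_value = current.get(field_name)
        if old_value == new_value:
            continue  # unchanged fields leave no audit trace
        new_rows.append(
            AuditRow(next_version, field_name, old_value, new_value, changed_by, ip_address)
        )
        next_version += 1
        current[field_name] = new_value
    audit_log.extend(new_rows)  # append-only: existing rows are never touched
    return new_rows
```

Because the version is computed in application code, concurrent writers to the same consent record could compute the same number. A real implementation would likely need a transaction or a uniqueness constraint to guard against that.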
### Consent API Endpoints
| Method | Path | Purpose |
|--------|------|---------|
| GET | `/consent/videos` | List consent for creator's videos (admin sees all) |
| GET | `/consent/videos/{id}` | Single video consent (returns defaults if no record) |
| PUT | `/consent/videos/{id}` | Upsert consent (partial update, audit logged) |
| GET | `/consent/videos/{id}/history` | Full audit trail for a video |
| GET | `/consent/admin/summary` | Aggregate flag counts (admin only) |
## LightRAG Integration
LightRAG is a graph-based RAG (Retrieval-Augmented Generation) service added to the stack in M019:
- **Image:** `ghcr.io/hkuds/lightrag:latest`
- **Container:** `chrysopedia-lightrag`
- **Port:** 9621 (bound to 127.0.0.1 only)
- **Data volume:** `/vmPool/r/services/chrysopedia_lightrag`
- **Dependencies:** Qdrant (healthy) + Ollama (healthy)
- **Config:** `.env.lightrag` (LLM model, embedding model, Qdrant connection, working directory)
- **Healthcheck:** Python `urllib.request.urlopen('http://127.0.0.1:9621/health')`
LightRAG provides graph-structured knowledge retrieval as a complement to Qdrant's vector similarity search. It runs as a standalone service within the Docker Compose stack, sharing Qdrant and Ollama with the main application.
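
The healthcheck listed above can be expanded into a reusable probe. This is a sketch: `lightrag_is_healthy` is a hypothetical helper, and it deliberately bypasses any configured HTTP proxy because the target is a loopback address. It must run on the host itself (or inside the container), since the port is bound to 127.0.0.1.

```python
import http.client
import urllib.request


def lightrag_is_healthy(url: str = "http://127.0.0.1:9621/health", timeout: float = 3.0) -> bool:
    """Return True when the health endpoint answers with a 2xx status."""
    # An empty ProxyHandler ignores proxy environment variables for this probe.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and non-2xx HTTP errors.
        return False
```

The Compose healthcheck itself only needs the process exit code, so the one-line `urlopen` form is sufficient there. A boolean helper like this one is more convenient in scripts or monitoring code.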
---
*See also: [[Architecture]], [[API-Surface]], [[Data-Model]]*

@ -1,19 +1,23 @@
# Data Model # Data Model
13 SQLAlchemy models in `backend/models.py`. 18 SQLAlchemy models in `backend/models.py`.
## Entity Relationship Overview ## Entity Relationship Overview
``` ```
Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
│ │ │ │
│ └──→ (N) KeyMoment │ ├──→ (N) KeyMoment
│ │
│ └──→ (0..1) VideoConsent ──→ (N) ConsentAuditLog
└──→ (N) TechniquePage (M) ←──→ (N) Tag ├──→ (N) TechniquePage (M) ←──→ (N) Tag
│ │
├──→ (N) TechniquePageVersion │ ├──→ (N) TechniquePageVersion
├──→ (N) RelatedTechniqueLink │ ├──→ (N) RelatedTechniqueLink
└──→ (M:N) SourceVideo (via TechniquePageVideo) │ └──→ (M:N) SourceVideo (via TechniquePageVideo)
└──→ (0..1) User ──→ (N) InviteCode (created_by)
``` ```
## Core Content Models ## Core Content Models
@ -22,7 +26,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
| Field | Type | Notes | | Field | Type | Notes |
|-------|------|-------| |-------|------|-------|
| id | Integer PK | | | id | UUID PK | |
| name | String | Unique, from folder name | | name | String | Unique, from folder name |
| slug | String | URL-safe, unique | | slug | String | URL-safe, unique |
| genres | ARRAY(String) | e.g. ["dubstep", "sound design"] | | genres | ARRAY(String) | e.g. ["dubstep", "sound design"] |
@ -35,7 +39,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
| Field | Type | Notes | | Field | Type | Notes |
|-------|------|-------| |-------|------|-------|
| id | Integer PK | | | id | UUID PK | |
| creator_id | FK → Creator | | | creator_id | FK → Creator | |
| filename | String | Original video filename | | filename | String | Original video filename |
| youtube_url | String | Optional | | youtube_url | String | Optional |
@ -47,7 +51,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
| Field | Type | Notes | | Field | Type | Notes |
|-------|------|-------| |-------|------|-------|
| id | Integer PK | | | id | UUID PK | |
| source_video_id | FK → SourceVideo | | | source_video_id | FK → SourceVideo | |
| start_time | Float | Seconds | | start_time | Float | Seconds |
| end_time | Float | Seconds | | end_time | Float | Seconds |
@ -57,7 +61,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
| Field | Type | Notes | | Field | Type | Notes |
|-------|------|-------| |-------|------|-------|
| id | Integer PK | | | id | UUID PK | |
| source_video_id | FK → SourceVideo | | | source_video_id | FK → SourceVideo | |
| title | String | | | title | String | |
| summary | Text | | | summary | Text | |
@ -72,7 +76,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
| Field | Type | Notes | | Field | Type | Notes |
|-------|------|-------| |-------|------|-------|
| id | Integer PK | | | id | UUID PK | |
| creator_id | FK → Creator | | | creator_id | FK → Creator | |
| title | String | | | title | String | |
| slug | String | Unique, URL-safe | | slug | String | Unique, URL-safe |
@ -90,12 +94,73 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
| Field | Type | Notes | | Field | Type | Notes |
|-------|------|-------| |-------|------|-------|
| id | Integer PK | | | id | UUID PK | |
| technique_page_id | FK → TechniquePage | | | technique_page_id | FK → TechniquePage | |
| version_number | Integer | Sequential | | version_number | Integer | Sequential |
| content_snapshot | JSONB | Full page state at version time | | content_snapshot | JSONB | Full page state at version time |
| pipeline_metadata | JSONB | Prompt SHA-256 hashes, model config | | pipeline_metadata | JSONB | Prompt SHA-256 hashes, model config |
## Authentication & User Models
### User
| Field | Type | Notes |
|-------|------|-------|
| id | UUID PK | |
| email | String(255) | Unique |
| hashed_password | String(255) | bcrypt hash |
| display_name | String(255) | |
| role | Enum(UserRole) | admin / creator (default: creator) |
| creator_id | FK → Creator | Optional — links user to a creator profile |
| is_active | Boolean | Default true |
| created_at | Timestamp | |
| updated_at | Timestamp | |
### InviteCode
| Field | Type | Notes |
|-------|------|-------|
| id | UUID PK | |
| code | String(100) | Unique |
| uses_remaining | Integer | Default 1 — decremented on each registration |
| created_by | FK → User | Optional — admin who created the code |
| expires_at | Timestamp | Optional — null means no expiry |
| created_at | Timestamp | |
## Consent Models
### VideoConsent
Per-video consent state. One row per video, mutable. Full change history in ConsentAuditLog.
| Field | Type | Notes |
|-------|------|-------|
| id | UUID PK | |
| source_video_id | FK → SourceVideo | Unique constraint |
| creator_id | FK → Creator | |
| kb_inclusion | Boolean | Default false — allow KB indexing |
| training_usage | Boolean | Default false — allow training use |
| public_display | Boolean | Default true — allow public display |
| updated_by | FK → User | Last user to modify |
| created_at | Timestamp | |
| updated_at | Timestamp | |
### ConsentAuditLog
Append-only versioned record of per-field consent changes.
| Field | Type | Notes |
|-------|------|-------|
| id | UUID PK | |
| video_consent_id | FK → VideoConsent | Indexed |
| version | Integer | Sequential per video_consent_id |
| field_name | String(50) | ConsentField enum value |
| old_value | Boolean | Nullable (null on first set) |
| new_value | Boolean | |
| changed_by | FK → User | |
| ip_address | String(45) | Client IP at time of change |
| created_at | Timestamp | |
## Supporting Models ## Supporting Models
| Model | Purpose | | Model | Purpose |
@ -121,6 +186,8 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
| ReportStatus | open, acknowledged, resolved, dismissed | | ReportStatus | open, acknowledged, resolved, dismissed |
| PipelineRunStatus | pending, running, completed, failed, revoked | | PipelineRunStatus | pending, running, completed, failed, revoked |
| PipelineRunTrigger | auto, manual, retrigger, clean_retrigger | | PipelineRunTrigger | auto, manual, retrigger, clean_retrigger |
| **UserRole** | admin, creator |
| **ConsentField** | kb_inclusion, training_usage, public_display |
## Schema Notes ## Schema Notes
@ -129,7 +196,9 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
- **topic_category casing** is inconsistent across records (e.g., "Sound design" vs "Sound Design") — known data quality issue - **topic_category casing** is inconsistent across records (e.g., "Sound design" vs "Sound Design") — known data quality issue
- **Stage 4 classification data** (per-moment topic_tags) stored in Redis with 24h TTL, not DB columns - **Stage 4 classification data** (per-moment topic_tags) stored in Redis with 24h TTL, not DB columns
- **Timestamp convention:** `datetime.now(timezone.utc).replace(tzinfo=None)` — asyncpg rejects timezone-aware datetimes for TIMESTAMP WITHOUT TIME ZONE columns (D002) - **Timestamp convention:** `datetime.now(timezone.utc).replace(tzinfo=None)` — asyncpg rejects timezone-aware datetimes for TIMESTAMP WITHOUT TIME ZONE columns (D002)
- **User passwords** are stored as bcrypt hashes via `bcrypt.hashpw()`
- **Consent audit** uses version numbers assigned in application code (`max(version) + 1` per video_consent_id)
--- ---
*See also: [[Architecture]], [[API-Surface]], [[Pipeline]]* *See also: [[Architecture]], [[API-Surface]], [[Pipeline]], [[Authentication]]*

@ -18,6 +18,7 @@ docker exec chrysopedia-api alembic upgrade head
docker logs -f chrysopedia-api docker logs -f chrysopedia-api
docker logs -f chrysopedia-worker docker logs -f chrysopedia-worker
docker logs -f chrysopedia-watcher docker logs -f chrysopedia-watcher
docker logs -f chrysopedia-lightrag
# Check status # Check status
docker ps --filter name=chrysopedia docker ps --filter name=chrysopedia
@ -33,6 +34,7 @@ docker ps --filter name=chrysopedia
│ ├── chrysopedia_postgres_data/ # PostgreSQL data │ ├── chrysopedia_postgres_data/ # PostgreSQL data
│ ├── chrysopedia_qdrant_data/ # Qdrant vector data │ ├── chrysopedia_qdrant_data/ # Qdrant vector data
│ ├── chrysopedia_ollama_data/ # Ollama model cache │ ├── chrysopedia_ollama_data/ # Ollama model cache
│ ├── chrysopedia_lightrag/ # LightRAG data + tiktoken cache
│ └── chrysopedia_watch/ # Watcher input directory │ └── chrysopedia_watch/ # Watcher input directory
│ ├── processed/ # Successfully ingested transcripts │ ├── processed/ # Successfully ingested transcripts
│ └── failed/ # Failed transcripts + .error sidecars │ └── failed/ # Failed transcripts + .error sidecars
@ -44,6 +46,13 @@ docker ps --filter name=chrysopedia
- **Network:** `chrysopedia-net` (172.32.0.0/24) - **Network:** `chrysopedia-net` (172.32.0.0/24)
- **Compose file:** `/vmPool/r/repos/xpltdco/chrysopedia/docker-compose.yml` - **Compose file:** `/vmPool/r/repos/xpltdco/chrysopedia/docker-compose.yml`
### Config Files
| File | Purpose |
|------|---------|
| `.env` | Core environment variables (DB credentials, API keys, APP_SECRET_KEY) |
| `.env.lightrag` | LightRAG-specific config (LLM/embedding model, Qdrant connection, working dir) |
### Build Args / Environment ### Build Args / Environment
Frontend build-time constants are injected via Docker build args: Frontend build-time constants are injected via Docker build args:
@ -63,6 +72,7 @@ build:
chrysopedia-web-8096 → chrysopedia-api → chrysopedia-db, chrysopedia-redis chrysopedia-web-8096 → chrysopedia-api → chrysopedia-db, chrysopedia-redis
chrysopedia-worker → chrysopedia-db, chrysopedia-redis, chrysopedia-qdrant, chrysopedia-ollama chrysopedia-worker → chrysopedia-db, chrysopedia-redis, chrysopedia-qdrant, chrysopedia-ollama
chrysopedia-watcher → chrysopedia-api chrysopedia-watcher → chrysopedia-api
chrysopedia-lightrag → chrysopedia-qdrant, chrysopedia-ollama
``` ```
## Healthchecks ## Healthchecks
@ -76,6 +86,7 @@ chrysopedia-watcher → chrysopedia-api
| API | `curl -f http://localhost:8000/health` | | | API | `curl -f http://localhost:8000/health` | |
| Worker | `celery -A worker inspect ping` | Not HTTP | | Worker | `celery -A worker inspect ping` | Not HTTP |
| Watcher | `python -c "import os; os.kill(1, 0)"` | Slim image, no pgrep | | Watcher | `python -c "import os; os.kill(1, 0)"` | Slim image, no pgrep |
| LightRAG | `python -c "import urllib.request; urllib.request.urlopen('http://127.0.0.1:9621/health')"` | Python urllib (no curl in image) |
## nginx Reverse Proxy ## nginx Reverse Proxy
@ -113,6 +124,7 @@ docker compose restart chrysopedia-api chrysopedia-worker
|---------|---------------|-----------|---------| |---------|---------------|-----------|---------|
| PostgreSQL | 5432 | 5433 | 0.0.0.0 | | PostgreSQL | 5432 | 5433 | 0.0.0.0 |
| Web (nginx) | 80 | 8096 | 0.0.0.0 | | Web (nginx) | 80 | 8096 | 0.0.0.0 |
| LightRAG | 9621 | 9621 | 127.0.0.1 |
| SSH (Forgejo) | 22 | 2222 | 0.0.0.0 | | SSH (Forgejo) | 22 | 2222 | 0.0.0.0 |
All other services (Redis, Qdrant, Ollama, API, Worker) are internal-only. All other services (Redis, Qdrant, Ollama, API, Worker) are internal-only.
@ -121,6 +133,7 @@ All other services (Redis, Qdrant, Ollama, API, Worker) are internal-only.
- **Web UI:** http://ub01:8096 - **Web UI:** http://ub01:8096
- **API Health:** http://ub01:8096/health - **API Health:** http://ub01:8096/health
- **LightRAG Health:** http://127.0.0.1:9621/health (reachable only from ub01 itself; the port is bound to 127.0.0.1)
- **Pipeline Admin:** http://ub01:8096/admin/pipeline - **Pipeline Admin:** http://ub01:8096/admin/pipeline
- **Worker Status:** http://ub01:8096/admin/pipeline (shows Celery worker count) - **Worker Status:** http://ub01:8096/admin/pipeline (shows Celery worker count)
- **PostgreSQL:** Connect via `psql -h ub01 -p 5433 -U chrysopedia` - **PostgreSQL:** Connect via `psql -h ub01 -p 5433 -U chrysopedia`

@ -4,6 +4,7 @@
**Architecture** **Architecture**
- [[Architecture]] - [[Architecture]]
- [[Authentication]]
- [[Data-Model]] - [[Data-Model]]
- [[Pipeline]] - [[Pipeline]]