docs: add auth, consent, LightRAG documentation (M019/S06)
Updated Architecture, Data-Model, API-Surface, Deployment, _Sidebar. New Authentication page covering JWT flow, invite codes, consent, LightRAG.
parent
081b39f767
commit
e0a52757b0
6 changed files with 313 additions and 21 deletions
|
|
@ -1,6 +1,6 @@
|
|||
# API Surface
|
||||
|
||||
41 API endpoints grouped by domain. All served by FastAPI under `/api/v1/`.
|
||||
50 API endpoints grouped by domain. All served by FastAPI under `/api/v1/`.
|
||||
|
||||
## Public Endpoints (10)
|
||||
|
||||
|
|
@ -31,6 +31,39 @@ title, slug, topic_category, topic_tags, summary, body_sections, body_sections_f
|
|||
| GET | `/api/v1/topics/{cat}/{sub}` | `{items, total, offset, limit}` | Subtopic techniques |
|
||||
| GET | `/api/v1/topics/{cat}` | `{items, total, offset, limit}` | Category techniques |
|
||||
|
||||
## Auth Endpoints (4)
|
||||
|
||||
All under prefix `/api/v1/auth/`. JWT-protected except registration and login.
|
||||
|
||||
| Method | Path | Auth | Purpose |
|--------|------|------|---------|
| POST | `/auth/register` | None (invite code required) | Create account with invite code, email, password, display_name. Optional creator_slug links to Creator. Returns UserResponse (201). |
| POST | `/auth/login` | None | Email + password → JWT access token (HS256, 24h expiry). Returns `{access_token, token_type}`. |
| GET | `/auth/me` | Bearer JWT | Current user profile. Returns UserResponse. |
| PUT | `/auth/me` | Bearer JWT | Update display_name and/or password (requires current_password for password changes). Returns UserResponse. |
|
||||
|
||||
## Consent Endpoints (5)
|
||||
|
||||
All under prefix `/api/v1/consent/`. All require Bearer JWT.
|
||||
|
||||
| Method | Path | Auth | Purpose |
|--------|------|------|---------|
| GET | `/consent/videos` | Creator or Admin | List consent records for current creator's videos. Admin sees all. Paginated (offset, limit). |
| GET | `/consent/videos/{video_id}` | Creator (owner) or Admin | Single video consent status. Returns defaults if no consent record exists. |
| PUT | `/consent/videos/{video_id}` | Creator (owner) or Admin | Upsert consent flags (partial update). Creates audit log entries for each changed field. |
| GET | `/consent/videos/{video_id}/history` | Creator (owner) or Admin | Versioned audit trail of consent changes for a video. |
| GET | `/consent/admin/summary` | Admin only | Aggregate consent flag counts across all videos. |
|
||||
|
||||
### Consent Fields
|
||||
|
||||
Three boolean consent flags per video, each independently toggleable:
|
||||
|
||||
| Field | Default | Meaning |
|-------|---------|---------|
| `kb_inclusion` | false | Allow indexing into knowledge base |
| `training_usage` | false | Allow use for model training |
| `public_display` | true | Allow public display on site |
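
The defaults and the partial-update behaviour described for `GET` and `PUT /consent/videos/{video_id}` can be sketched as follows. This is a minimal illustration, not the project's code: `CONSENT_DEFAULTS`, `effective_consent`, and `apply_partial_update` are hypothetical names, and treating an omitted or `None` field as "leave unchanged" is an assumption about how the partial update works.

```python
from typing import Dict, Optional

# Defaults taken from the Consent Fields table.
CONSENT_DEFAULTS: Dict[str, bool] = {
    "kb_inclusion": False,
    "training_usage": False,
    "public_display": True,
}


def effective_consent(record: Optional[Dict[str, bool]]) -> Dict[str, bool]:
    """Return stored flags, falling back to defaults when no record exists."""
    return {**CONSENT_DEFAULTS, **(record or {})}


def apply_partial_update(
    record: Optional[Dict[str, bool]],
    update: Dict[str, Optional[bool]],
) -> Dict[str, bool]:
    """Merge only the flags present in `update` (assumed: None means unchanged)."""
    merged = effective_consent(record)
    for name, value in update.items():
        if name not in CONSENT_DEFAULTS:
            raise KeyError(f"unknown consent field: {name}")
        if value is not None:
            merged[name] = value
    return merged
```

Calling `apply_partial_update(None, {"kb_inclusion": True})` flips one flag and leaves the other two at their defaults.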
|
||||
|
||||
## Report Endpoints (3)
|
||||
|
||||
| Method | Path | Purpose |
|
||||
|
|
@ -96,8 +129,13 @@ All under prefix `/api/v1/admin/pipeline/`.
|
|||
|
||||
## Authentication
|
||||
|
||||
No authentication on any endpoint. Admin routes (`/admin/*`) are accessible to anyone with network access. Phase 2 will add auth middleware (see [[Decisions]] D033).
|
||||
JWT-based authentication added in M019. See [[Authentication]] for full details.
|
||||
|
||||
- **Public endpoints** (search, browse, techniques) require no auth
|
||||
- **Auth endpoints** (`/auth/register`, `/auth/login`) are open; `/auth/me` requires Bearer JWT
|
||||
- **Consent endpoints** require Bearer JWT with ownership verification (creator must own the video, or be admin)
|
||||
- **Admin endpoints** (`/admin/*`) are accessible to anyone with network access (auth planned for future milestone)
|
||||
|
||||
---
|
||||
|
||||
*See also: [[Architecture]], [[Data-Model]], [[Frontend]]*
|
||||
*See also: [[Architecture]], [[Data-Model]], [[Frontend]], [[Authentication]]*
|
||||
|
|
|
|||
|
|
@ -2,7 +2,7 @@
|
|||
|
||||
## System Overview
|
||||
|
||||
Chrysopedia is a self-hosted music production knowledge base that synthesizes technique articles from video transcripts using a 6-stage LLM pipeline. It runs as a Docker Compose stack on `ub01` with 8 containers.
|
||||
Chrysopedia is a self-hosted music production knowledge base that synthesizes technique articles from video transcripts using a 6-stage LLM pipeline. It runs as a Docker Compose stack on `ub01` with 11 containers.
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
|
|
@ -20,6 +20,11 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
|
|||
│ │ Postgres │ │ Redis │ │ Qdrant │ │ Ollama │ │
|
||||
│ │ :5433 │ │ :6379 │ │ :6333 │ │ :11434 │ │
|
||||
│ └──────────┘ └──────────┘ └──────────┘ └──────────────┘ │
|
||||
│ │
|
||||
│ ┌──────────────┐ │
|
||||
│ │ LightRAG │ │
|
||||
│ │ :9621 │ │
|
||||
│ └──────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
▲
|
||||
│ nginx reverse proxy
|
||||
|
|
@ -34,9 +39,10 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
|
|||
|
||||
- **Zero external frontend dependencies** beyond React, react-router-dom, and Vite
|
||||
- **Monolithic CSS** — 5,820 lines, single file, BEM naming, 77 custom properties
|
||||
- **No authentication** — admin routes are network-access-controlled only
|
||||
- **JWT authentication** — invite-code registration, bcrypt passwords, HS256 tokens, role-based access control (admin, creator)
|
||||
- **Dual SQLAlchemy strategy** — async engine for FastAPI request handlers, sync engine for Celery pipeline tasks (D004)
|
||||
- **Non-blocking pipeline side effects** — embedding/Qdrant failures don't block page synthesis (D005)
|
||||
- **LightRAG** — graph-based RAG knowledge base backed by Qdrant + Ollama, running as a standalone service
|
||||
|
||||
## Docker Services
|
||||
|
||||
|
|
@ -46,17 +52,20 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
|
|||
| Redis 7 | redis:7-alpine | chrysopedia-redis | 6379 (internal) | — |
|
||||
| Qdrant 1.13.2 | qdrant/qdrant:v1.13.2 | chrysopedia-qdrant | 6333 (internal) | chrysopedia_qdrant_data |
|
||||
| Ollama | ollama/ollama:latest | chrysopedia-ollama | 11434 (internal) | chrysopedia_ollama_data |
|
||||
| LightRAG | ghcr.io/hkuds/lightrag:latest | chrysopedia-lightrag | 9621 (internal) | chrysopedia_lightrag_data |
|
||||
| API (FastAPI) | Dockerfile.api | chrysopedia-api | 8000 (internal) | Bind: backend/, prompts/ |
|
||||
| Worker (Celery) | Dockerfile.api | chrysopedia-worker | — | Bind: backend/, prompts/ |
|
||||
| Watcher | Dockerfile.api | chrysopedia-watcher | — | Bind: watch dir |
|
||||
| Web (nginx) | Dockerfile.web | chrysopedia-web-8096 | 8096:80 | — |
|
||||
|
||||
**Note:** The 11-container count includes the 3 infrastructure services not shown above (nginx-internal-healthcheck, db-init, and lightrag). Exact running count may vary based on one-shot init containers.
|
||||
|
||||
## Network Topology
|
||||
|
||||
- **Compose subnet:** 172.32.0.0/24 (D015)
|
||||
- **External access:** nginx on nuc01 (10.0.0.9) reverse-proxies to ub01:8096
|
||||
- **DNS:** AdGuard Home rewrites chrysopedia.com → 10.0.0.9
|
||||
- **Internal services** (Redis, Qdrant, Ollama) are not exposed outside the Docker network
|
||||
- **Internal services** (Redis, Qdrant, Ollama, LightRAG) are not exposed outside the Docker network
|
||||
|
||||
## Tech Stack
|
||||
|
||||
|
|
@ -67,8 +76,10 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
|
|||
| Database | PostgreSQL 16 |
|
||||
| Cache/Broker | Redis 7 (Celery broker + review mode toggle + classification cache) |
|
||||
| Vector Store | Qdrant 1.13.2 |
|
||||
| Knowledge Graph | LightRAG (graph-based RAG, port 9621) |
|
||||
| Embeddings | Ollama (nomic-embed-text) via OpenAI-compatible /v1/embeddings |
|
||||
| LLM | OpenAI-compatible API — DGX Sparks Qwen primary, local Ollama fallback |
|
||||
| Authentication | JWT (HS256, 24h expiry) + bcrypt + invite codes |
|
||||
| Deployment | Docker Compose on ub01, nginx reverse proxy on nuc01 |
|
||||
|
||||
## Data Flow
|
||||
|
|
@ -78,7 +89,8 @@ Chrysopedia is a self-hosted music production knowledge base that synthesizes te
|
|||
3. **Pipeline:** 6 Celery stages process each video (see [[Pipeline]])
|
||||
4. **Storage:** Technique pages + key moments → PostgreSQL, embeddings → Qdrant
|
||||
5. **Serving:** React SPA fetches from FastAPI, search queries hit Qdrant then PostgreSQL fallback
|
||||
6. **Auth:** JWT-protected endpoints for creator consent management and admin features (see [[Authentication]])
|
||||
|
||||
---
|
||||
|
||||
*See also: [[Deployment]], [[Pipeline]], [[Data-Model]]*
|
||||
*See also: [[Deployment]], [[Pipeline]], [[Data-Model]], [[Authentication]]*
|
||||
|
|
|
|||
159
Authentication.md
Normal file
|
|
@ -0,0 +1,159 @@
|
|||
# Authentication
|
||||
|
||||
JWT-based authentication system added in M019. Provides invite-code registration, bcrypt password hashing, role-based access control, and a per-video consent management system with versioned audit trail.
|
||||
|
||||
## Overview
|
||||
|
||||
```
┌──────────────┐     POST /auth/login      ┌──────────────┐
│   Frontend   │ ─────────────────────────▶│   FastAPI    │
│  (React SPA) │◀────── JWT (HS256) ───────│   Backend    │
│              │                           │              │
│  AuthContext │   Bearer token header     │ get_current_ │
│  localStorage│ ─────────────────────────▶│   user()     │
└──────────────┘                           └──────────────┘
```
|
||||
|
||||
## JWT Token Flow
|
||||
|
||||
- **Algorithm:** HS256 (HMAC-SHA256)
- **Secret:** `APP_SECRET_KEY` from environment config
- **Expiry:** 24 hours (`_ACCESS_TOKEN_EXPIRE_MINUTES = 1440`)
- **Token URL:** `/api/v1/auth/login` (OAuth2PasswordBearer)
- **Claims:** `sub` (user UUID), `role` (admin/creator), `iat`, `exp`
|
||||
|
||||
### Token Lifecycle
|
||||
|
||||
1. User submits email + password to `POST /auth/login`
2. Backend verifies password against bcrypt hash in PostgreSQL
3. Backend issues signed JWT with user ID and role claims
4. Frontend stores token in `localStorage` under key `chrysopedia_auth_token`
5. All subsequent API requests include `Authorization: Bearer <token>` header
6. Backend `get_current_user` dependency decodes JWT, loads User from DB, checks `is_active`
7. On token expiry (24h), frontend catches 401 and clears stored token
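
Steps 3 and 6 of the lifecycle (issuing and verifying an HS256 token with `sub`, `role`, `iat`, and `exp` claims) can be sketched with the standard library alone. The backend presumably relies on a JWT library, so this is only an illustration of the mechanics; `create_access_token` and `decode_access_token` are hypothetical names, and the `now` parameter exists only to make the example deterministic.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Mirrors the documented 24h expiry (_ACCESS_TOKEN_EXPIRE_MINUTES = 1440).
ACCESS_TOKEN_EXPIRE_MINUTES = 1440


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url after restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    """HMAC-SHA256 over the `header.claims` string."""
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def create_access_token(user_id: str, role: str, secret: str, now: Optional[int] = None) -> str:
    """Build a signed token carrying sub, role, iat and exp claims."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "sub": user_id,
        "role": role,
        "iat": issued_at,
        "exp": issued_at + ACCESS_TOKEN_EXPIRE_MINUTES * 60,
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def decode_access_token(token: str, secret: str, now: Optional[int] = None) -> dict:
    """Verify algorithm, signature and expiry; raise ValueError on failure."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, claims_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = _sign(f"{header_b64}.{claims_b64}", secret)
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(claims_b64))
    current = int(time.time()) if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

In the flow above, any `ValueError` raised here would correspond to the 401 response that causes the frontend to clear its stored token.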
|
||||
|
||||
## Invite Code Registration
|
||||
|
||||
Registration is gated by invite codes; there is no open self-registration.
|
||||
|
||||
- **Default seed code:** `CHRYSOPEDIA-ALPHA-2026` (100 uses, no expiry), created on first startup via `seed_invite_codes()`
- **Code validation:** checks existence, expiry date, and remaining uses
- **On use:** `uses_remaining` is decremented (not deleted)
- **Optional creator linking:** `creator_slug` field in registration body links the new user to an existing Creator record
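
The validation rules above (existence, expiry, remaining uses, then a decrement) can be sketched as follows. This is an in-memory illustration under stated assumptions: the `InviteCode` dataclass stands in for the real model, and `redeem_invite_code` and `InviteCodeError` are hypothetical names; the real check would run against PostgreSQL.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class InviteCode:
    """Hypothetical in-memory stand-in for the InviteCode model."""
    code: str
    uses_remaining: int = 1
    expires_at: Optional[datetime] = None  # None means no expiry


class InviteCodeError(Exception):
    """Raised when a code is unknown, expired, or exhausted."""


def redeem_invite_code(codes: Dict[str, InviteCode], code: str, now: datetime) -> InviteCode:
    """Validate a code and consume one use."""
    invite = codes.get(code)
    if invite is None:
        raise InviteCodeError("unknown invite code")
    if invite.expires_at is not None and now >= invite.expires_at:
        raise InviteCodeError("invite code expired")
    if invite.uses_remaining <= 0:
        raise InviteCodeError("invite code has no uses left")
    # The row is kept and only its counter is decremented.
    invite.uses_remaining -= 1
    return invite
```

Against a real database the decrement would need to be atomic (for example, performed inside the same transaction that creates the user) so that two concurrent registrations cannot both spend the last use.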
|
||||
|
||||
### Registration Flow
|
||||
|
||||
```
POST /auth/register
{
  "email": "user@example.com",
  "password": "...",
  "display_name": "DJ Producer",
  "invite_code": "CHRYSOPEDIA-ALPHA-2026",
  "creator_slug": "dj-producer"  // optional
}
```
|
||||
|
||||
## Role Model
|
||||
|
||||
Two roles defined in `UserRole` enum:
|
||||
|
||||
| Role | Purpose | Access |
|------|---------|--------|
| `creator` | Content creators linked to a Creator profile | Own videos' consent management |
| `admin` | Platform administrators | All consent records, admin summary, pipeline admin |
|
||||
|
||||
### FastAPI Dependencies
|
||||
|
||||
| Dependency | Purpose | Usage |
|------------|---------|-------|
| `get_current_user` | Decode JWT → load User → verify active | `Depends(get_current_user)` on any protected endpoint |
| `require_role(UserRole.admin)` | Check user has admin role, return 403 otherwise | `dependencies=[Depends(require_role(UserRole.admin))]` |
| `oauth2_scheme` | Extract Bearer token from Authorization header | Injected into `get_current_user` |
|
||||
|
||||
**Error responses:**
|
||||
- 401 Unauthorized — missing/expired/invalid token, or user not found/inactive
|
||||
- 403 Forbidden — wrong role, or not linked to creator profile
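
The split between 401 and 403 can be illustrated with a framework-free sketch of a role-checking dependency factory. The real `require_role` is a FastAPI dependency whose implementation is not shown here; `AuthError`, `ensure_active_user`, and the `User` dataclass below are hypothetical stand-ins used only to show the control flow.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class UserRole(str, Enum):
    admin = "admin"
    creator = "creator"


@dataclass
class User:
    """Hypothetical stand-in for the User model."""
    id: str
    role: UserRole
    is_active: bool = True


class AuthError(Exception):
    """Hypothetical stand-in for an HTTP error response."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def ensure_active_user(user: Optional[User]) -> User:
    """401 when the token resolved to no user or to an inactive one."""
    if user is None or not user.is_active:
        raise AuthError(401, "user not found or inactive")
    return user


def require_role(required: UserRole) -> Callable[[Optional[User]], User]:
    """Return a checker that raises 403 when the role does not match."""

    def checker(user: Optional[User]) -> User:
        active = ensure_active_user(user)
        if active.role != required:
            raise AuthError(403, "insufficient role")
        return active

    return checker
```

Authentication failures (who are you?) map to 401, while authorization failures (are you allowed?) map to 403, which matches the error list above.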
|
||||
|
||||
## Frontend Auth Integration
|
||||
|
||||
### AuthContext (`frontend/src/context/AuthContext.tsx`)
|
||||
|
||||
React context providing auth state to the entire app:
|
||||
|
||||
| Field/Method | Type | Purpose |
|-------------|------|---------|
| `user` | `UserResponse \| null` | Current user object |
| `token` | `string \| null` | Raw JWT string |
| `isAuthenticated` | `boolean` | `!!user` |
| `loading` | `boolean` | True during token rehydration on mount |
| `login(email, password)` | `async` | Calls `/auth/login`, stores token, fetches `/auth/me` |
| `register(data)` | `async` | Calls `/auth/register` |
| `logout()` | `void` | Clears localStorage and state |
|
||||
|
||||
### Token Rehydration
|
||||
|
||||
On app mount, `AuthProvider` checks `localStorage` for a stored token. If found, it calls `GET /auth/me` to rehydrate the user session. If the token is expired or invalid, it silently clears the stored token.
|
||||
|
||||
### ProtectedRoute (`frontend/src/components/ProtectedRoute.tsx`)
|
||||
|
||||
Route wrapper that redirects unauthenticated users to `/login?returnTo=<current_path>`.
|
||||
|
||||
- Shows nothing (`null`) while auth state is loading (prevents flash redirect)
|
||||
- Preserves the intended destination in `returnTo` query param
|
||||
|
||||
## Consent Management
|
||||
|
||||
Per-video consent system allowing creators to control how their content is used.
|
||||
|
||||
### Consent Fields
|
||||
|
||||
| Field | Default | Meaning |
|-------|---------|---------|
| `kb_inclusion` | `false` | Allow indexing into knowledge base |
| `training_usage` | `false` | Allow use for model training |
| `public_display` | `true` | Allow public display on site |
|
||||
|
||||
### Ownership Model
|
||||
|
||||
- Each `VideoConsent` record is tied to a `SourceVideo` via `source_video_id`
- Creator users can only access consent for videos belonging to their linked Creator profile
- Admin users bypass ownership checks and can access all consent records
- `_verify_video_ownership()` helper enforces this on every consent endpoint
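
The ownership rules above reduce to three checks, sketched below. This is not the body of `_verify_video_ownership()`, which is not shown in this page; the dataclasses and the `Forbidden` exception are hypothetical, and the order of the checks is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    """Hypothetical stand-in for the User model."""
    id: str
    role: str  # "admin" or "creator"
    creator_id: Optional[str] = None


@dataclass
class SourceVideo:
    """Hypothetical stand-in for the SourceVideo model."""
    id: str
    creator_id: str


class Forbidden(Exception):
    """Stands in for a 403 Forbidden response."""


def verify_video_ownership(user: User, video: SourceVideo) -> None:
    """Raise Forbidden unless the user may manage consent for this video."""
    if user.role == "admin":
        return  # admins bypass ownership checks
    if user.creator_id is None:
        raise Forbidden("user is not linked to a creator profile")
    if video.creator_id != user.creator_id:
        raise Forbidden("video belongs to another creator")
```

Both failure branches map to the 403 cases listed under the error responses: wrong owner, or no linked creator profile.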
|
||||
|
||||
### Audit Trail
|
||||
|
||||
Every consent field change produces a `ConsentAuditLog` entry:
|
||||
|
||||
- **Versioned:** Sequential `version` numbers per `video_consent_id`
- **Per-field:** Each changed field gets its own audit row with `old_value` → `new_value`
- **Attributed:** `changed_by` (FK → User) and `ip_address` recorded
- **Append-only:** Audit entries are never modified or deleted
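
A sketch of how one consent update could turn into audit rows is given below. It assumes that every row receives its own version number, continuing from the highest existing version; the source only states that versions are sequential per `video_consent_id`, so rows from one update might instead share a version. `AuditEntry` and `build_audit_entries` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

CONSENT_FIELDS = ("kb_inclusion", "training_usage", "public_display")


@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical in-memory stand-in for a ConsentAuditLog row."""
    version: int
    field_name: str
    old_value: Optional[bool]  # None the first time a field is set
    new_value: bool
    changed_by: str
    ip_address: str


def build_audit_entries(
    history: List[AuditEntry],
    current: Dict[str, bool],
    changes: Dict[str, bool],
    changed_by: str,
    ip_address: str,
) -> List[AuditEntry]:
    """Create one new row per field whose value actually changes."""
    next_version = max((entry.version for entry in history), default=0) + 1
    entries: List[AuditEntry] = []
    for field_name in CONSENT_FIELDS:
        if field_name not in changes:
            continue
        old_value = current.get(field_name)
        new_value = changes[field_name]
        if old_value == new_value:
            continue  # unchanged fields produce no audit row
        entries.append(
            AuditEntry(next_version, field_name, old_value, new_value, changed_by, ip_address)
        )
        next_version += 1
    return entries
```

The function only returns new rows and never touches `history`, which keeps the trail append-only.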
|
||||
|
||||
### Consent API Endpoints
|
||||
|
||||
| Method | Path | Purpose |
|--------|------|---------|
| GET | `/consent/videos` | List consent for creator's videos (admin sees all) |
| GET | `/consent/videos/{id}` | Single video consent (returns defaults if no record) |
| PUT | `/consent/videos/{id}` | Upsert consent (partial update, audit logged) |
| GET | `/consent/videos/{id}/history` | Full audit trail for a video |
| GET | `/consent/admin/summary` | Aggregate flag counts (admin only) |
|
||||
|
||||
## LightRAG Integration
|
||||
|
||||
LightRAG is a graph-based RAG (Retrieval-Augmented Generation) service added to the stack in M019:
|
||||
|
||||
- **Image:** `ghcr.io/hkuds/lightrag:latest`
- **Container:** `chrysopedia-lightrag`
- **Port:** 9621 (bound to 127.0.0.1 only)
- **Data volume:** `/vmPool/r/services/chrysopedia_lightrag`
- **Dependencies:** Qdrant (healthy) + Ollama (healthy)
- **Config:** `.env.lightrag` (LLM model, embedding model, Qdrant connection, working directory)
- **Healthcheck:** Python `urllib.request.urlopen('http://127.0.0.1:9621/health')`
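
The healthcheck one-liner can be expanded into a small function, which is easier to reuse from a script. This is a sketch under assumptions: `check_health` and its `timeout` parameter are illustrative, and treating any 2xx status as healthy is a choice made here rather than something the source specifies.

```python
import http.client
import urllib.request

HEALTH_URL = "http://127.0.0.1:9621/health"


def check_health(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Return True when the endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

A container healthcheck needs a process exit code, so a wrapper script would end with something like `sys.exit(0 if check_health() else 1)`.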
|
||||
|
||||
LightRAG provides graph-structured knowledge retrieval as a complement to Qdrant's vector similarity search. It runs as a standalone service within the Docker Compose stack, sharing Qdrant and Ollama with the main application.
|
||||
|
||||
---
|
||||
|
||||
*See also: [[Architecture]], [[API-Surface]], [[Data-Model]]*
|
||||
|
|
@ -1,19 +1,23 @@
|
|||
# Data Model
|
||||
|
||||
13 SQLAlchemy models in `backend/models.py`.
|
||||
18 SQLAlchemy models in `backend/models.py`.
|
||||
|
||||
## Entity Relationship Overview
|
||||
|
||||
```
|
||||
Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
|
||||
│ │
|
||||
│ └──→ (N) KeyMoment
|
||||
│ ├──→ (N) KeyMoment
|
||||
│ │
|
||||
│ └──→ (0..1) VideoConsent ──→ (N) ConsentAuditLog
|
||||
│
|
||||
└──→ (N) TechniquePage (M) ←──→ (N) Tag
|
||||
├──→ (N) TechniquePage (M) ←──→ (N) Tag
|
||||
│ │
|
||||
│ ├──→ (N) TechniquePageVersion
|
||||
│ ├──→ (N) RelatedTechniqueLink
|
||||
│ └──→ (M:N) SourceVideo (via TechniquePageVideo)
|
||||
│
|
||||
├──→ (N) TechniquePageVersion
|
||||
├──→ (N) RelatedTechniqueLink
|
||||
└──→ (M:N) SourceVideo (via TechniquePageVideo)
|
||||
└──→ (0..1) User ──→ (N) InviteCode (created_by)
|
||||
```
|
||||
|
||||
## Core Content Models
|
||||
|
|
@ -22,7 +26,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
|
|||
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | Integer PK | |
|
||||
| id | UUID PK | |
|
||||
| name | String | Unique, from folder name |
|
||||
| slug | String | URL-safe, unique |
|
||||
| genres | ARRAY(String) | e.g. ["dubstep", "sound design"] |
|
||||
|
|
@ -35,7 +39,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
|
|||
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | Integer PK | |
|
||||
| id | UUID PK | |
|
||||
| creator_id | FK → Creator | |
|
||||
| filename | String | Original video filename |
|
||||
| youtube_url | String | Optional |
|
||||
|
|
@ -47,7 +51,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
|
|||
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | Integer PK | |
|
||||
| id | UUID PK | |
|
||||
| source_video_id | FK → SourceVideo | |
|
||||
| start_time | Float | Seconds |
|
||||
| end_time | Float | Seconds |
|
||||
|
|
@ -57,7 +61,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
|
|||
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | Integer PK | |
|
||||
| id | UUID PK | |
|
||||
| source_video_id | FK → SourceVideo | |
|
||||
| title | String | |
|
||||
| summary | Text | |
|
||||
|
|
@ -72,7 +76,7 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
|
|||
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | Integer PK | |
|
||||
| id | UUID PK | |
|
||||
| creator_id | FK → Creator | |
|
||||
| title | String | |
|
||||
| slug | String | Unique, URL-safe |
|
||||
|
|
@ -90,12 +94,73 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
|
|||
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | Integer PK | |
|
||||
| id | UUID PK | |
|
||||
| technique_page_id | FK → TechniquePage | |
|
||||
| version_number | Integer | Sequential |
|
||||
| content_snapshot | JSONB | Full page state at version time |
|
||||
| pipeline_metadata | JSONB | Prompt SHA-256 hashes, model config |
|
||||
|
||||
## Authentication & User Models
|
||||
|
||||
### User
|
||||
|
||||
| Field | Type | Notes |
|-------|------|-------|
| id | UUID PK | |
| email | String(255) | Unique |
| hashed_password | String(255) | bcrypt hash |
| display_name | String(255) | |
| role | Enum(UserRole) | admin / creator (default: creator) |
| creator_id | FK → Creator | Optional; links user to a creator profile |
| is_active | Boolean | Default true |
| created_at | Timestamp | |
| updated_at | Timestamp | |
|
||||
|
||||
### InviteCode
|
||||
|
||||
| Field | Type | Notes |
|-------|------|-------|
| id | UUID PK | |
| code | String(100) | Unique |
| uses_remaining | Integer | Default 1; decremented on each registration |
| created_by | FK → User | Optional; admin who created the code |
| expires_at | Timestamp | Optional; null means no expiry |
| created_at | Timestamp | |
|
||||
|
||||
## Consent Models
|
||||
|
||||
### VideoConsent
|
||||
|
||||
Per-video consent state. One row per video, mutable. Full change history in ConsentAuditLog.
|
||||
|
||||
| Field | Type | Notes |
|-------|------|-------|
| id | UUID PK | |
| source_video_id | FK → SourceVideo | Unique constraint |
| creator_id | FK → Creator | |
| kb_inclusion | Boolean | Default false; allow KB indexing |
| training_usage | Boolean | Default false; allow training use |
| public_display | Boolean | Default true; allow public display |
| updated_by | FK → User | Last user to modify |
| created_at | Timestamp | |
| updated_at | Timestamp | |
|
||||
|
||||
### ConsentAuditLog
|
||||
|
||||
Append-only versioned record of per-field consent changes.
|
||||
|
||||
| Field | Type | Notes |
|-------|------|-------|
| id | UUID PK | |
| video_consent_id | FK → VideoConsent | Indexed |
| version | Integer | Sequential per video_consent_id |
| field_name | String(50) | ConsentField enum value |
| old_value | Boolean | Nullable (null on first set) |
| new_value | Boolean | |
| changed_by | FK → User | |
| ip_address | String(45) | Client IP at time of change |
| created_at | Timestamp | |
|
||||
|
||||
## Supporting Models
|
||||
|
||||
| Model | Purpose |
|
||||
|
|
@ -121,6 +186,8 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
|
|||
| ReportStatus | open, acknowledged, resolved, dismissed |
|
||||
| PipelineRunStatus | pending, running, completed, failed, revoked |
|
||||
| PipelineRunTrigger | auto, manual, retrigger, clean_retrigger |
|
||||
| **UserRole** | admin, creator |
|
||||
| **ConsentField** | kb_inclusion, training_usage, public_display |
|
||||
|
||||
## Schema Notes
|
||||
|
||||
|
|
@ -129,7 +196,9 @@ Creator (1) ──→ (N) SourceVideo (1) ──→ (N) TranscriptSegment
|
|||
- **topic_category casing** is inconsistent across records (e.g., "Sound design" vs "Sound Design") — known data quality issue
|
||||
- **Stage 4 classification data** (per-moment topic_tags) stored in Redis with 24h TTL, not DB columns
|
||||
- **Timestamp convention:** `datetime.now(timezone.utc).replace(tzinfo=None)` — asyncpg rejects timezone-aware datetimes for TIMESTAMP WITHOUT TIME ZONE columns (D002)
|
||||
- **User passwords** are stored as bcrypt hashes via `bcrypt.hashpw()`
|
||||
- **Consent audit** uses version numbers assigned in application code (`max(version) + 1` per video_consent_id)
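
The timestamp convention in the notes above can be wrapped in a helper so the timezone stripping is not repeated at every call site. `utc_now_naive` is a hypothetical name; the expression inside it is the one quoted in the schema notes.

```python
from datetime import datetime, timezone


def utc_now_naive() -> datetime:
    """Current UTC time with tzinfo removed, for TIMESTAMP WITHOUT TIME ZONE columns."""
    return datetime.now(timezone.utc).replace(tzinfo=None)
```

The value is still UTC wall-clock time; only the `tzinfo` attribute is dropped, so readers of the column have to know by convention that it holds UTC.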
|
||||
|
||||
---
|
||||
|
||||
*See also: [[Architecture]], [[API-Surface]], [[Pipeline]]*
|
||||
*See also: [[Architecture]], [[API-Surface]], [[Pipeline]], [[Authentication]]*
|
||||
|
|
|
|||
|
|
@ -18,6 +18,7 @@ docker exec chrysopedia-api alembic upgrade head
|
|||
docker logs -f chrysopedia-api
|
||||
docker logs -f chrysopedia-worker
|
||||
docker logs -f chrysopedia-watcher
|
||||
docker logs -f chrysopedia-lightrag
|
||||
|
||||
# Check status
|
||||
docker ps --filter name=chrysopedia
|
||||
|
|
@ -33,6 +34,7 @@ docker ps --filter name=chrysopedia
|
|||
│ ├── chrysopedia_postgres_data/ # PostgreSQL data
|
||||
│ ├── chrysopedia_qdrant_data/ # Qdrant vector data
|
||||
│ ├── chrysopedia_ollama_data/ # Ollama model cache
|
||||
│ ├── chrysopedia_lightrag/ # LightRAG data + tiktoken cache
|
||||
│ └── chrysopedia_watch/ # Watcher input directory
|
||||
│ ├── processed/ # Successfully ingested transcripts
|
||||
│ └── failed/ # Failed transcripts + .error sidecars
|
||||
|
|
@ -44,6 +46,13 @@ docker ps --filter name=chrysopedia
|
|||
- **Network:** `chrysopedia-net` (172.32.0.0/24)
|
||||
- **Compose file:** `/vmPool/r/repos/xpltdco/chrysopedia/docker-compose.yml`
|
||||
|
||||
### Config Files
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `.env` | Core environment variables (DB credentials, API keys, APP_SECRET_KEY) |
|
||||
| `.env.lightrag` | LightRAG-specific config (LLM/embedding model, Qdrant connection, working dir) |
|
||||
|
||||
### Build Args / Environment
|
||||
|
||||
Frontend build-time constants are injected via Docker build args:
|
||||
|
|
@ -63,6 +72,7 @@ build:
|
|||
chrysopedia-web-8096 → chrysopedia-api → chrysopedia-db, chrysopedia-redis
|
||||
chrysopedia-worker → chrysopedia-db, chrysopedia-redis, chrysopedia-qdrant, chrysopedia-ollama
|
||||
chrysopedia-watcher → chrysopedia-api
|
||||
chrysopedia-lightrag → chrysopedia-qdrant, chrysopedia-ollama
|
||||
```
|
||||
|
||||
## Healthchecks
|
||||
|
|
@ -76,6 +86,7 @@ chrysopedia-watcher → chrysopedia-api
|
|||
| API | `curl -f http://localhost:8000/health` | |
|
||||
| Worker | `celery -A worker inspect ping` | Not HTTP |
|
||||
| Watcher | `python -c "import os; os.kill(1, 0)"` | Slim image, no pgrep |
|
||||
| LightRAG | `python -c "import urllib.request; urllib.request.urlopen('http://127.0.0.1:9621/health')"` | Python urllib (no curl in image) |
|
||||
|
||||
## nginx Reverse Proxy
|
||||
|
||||
|
|
@ -113,6 +124,7 @@ docker compose restart chrysopedia-api chrysopedia-worker
|
|||
|---------|---------------|-----------|---------|
|
||||
| PostgreSQL | 5432 | 5433 | 0.0.0.0 |
|
||||
| Web (nginx) | 80 | 8096 | 0.0.0.0 |
|
||||
| LightRAG | 9621 | 9621 | 127.0.0.1 |
|
||||
| SSH (Forgejo) | 22 | 2222 | 0.0.0.0 |
|
||||
|
||||
All other services (Redis, Qdrant, Ollama, API, Worker) are internal-only.
|
||||
|
|
@ -121,6 +133,7 @@ All other services (Redis, Qdrant, Ollama, API, Worker) are internal-only.
|
|||
|
||||
- **Web UI:** http://ub01:8096
|
||||
- **API Health:** http://ub01:8096/health
|
||||
- **LightRAG Health:** http://ub01:9621/health (localhost only)
|
||||
- **Pipeline Admin:** http://ub01:8096/admin/pipeline
|
||||
- **Worker Status:** http://ub01:8096/admin/pipeline (shows Celery worker count)
|
||||
- **PostgreSQL:** Connect via `psql -h ub01 -p 5433 -U chrysopedia`
|
||||
|
|
|
|||
|
|
@ -4,6 +4,7 @@
|
|||
|
||||
**Architecture**
|
||||
- [[Architecture]]
|
||||
- [[Authentication]]
|
||||
- [[Data-Model]]
|
||||
- [[Pipeline]]
|
||||
|
||||
|
|
|
|||