Chrysopedia Bot edited this page 2026-04-03 22:41:16 -05:00

Authentication

JWT-based authentication system added in M019. It provides invite-code registration, bcrypt password hashing, role-based access control, and a per-video consent management system with a versioned audit trail.

Overview

┌──────────────┐     POST /auth/login       ┌──────────────┐
│   Frontend   │ ──────────────────────────▶│   FastAPI    │
│  (React SPA) │◀─────── JWT (HS256) ───────│   Backend    │
│              │                            │              │
│  AuthContext │     Bearer token header    │ get_current_ │
│  localStorage│ ──────────────────────────▶│ user()       │
└──────────────┘                            └──────────────┘

JWT Token Flow

  • Algorithm: HS256 (HMAC-SHA256)
  • Secret: APP_SECRET_KEY from environment config
  • Expiry: 24 hours (_ACCESS_TOKEN_EXPIRE_MINUTES = 1440)
  • Token URL: /api/v1/auth/login (OAuth2PasswordBearer)
  • Claims: sub (user UUID), role (admin/creator), iat, exp
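
The token parameters above can be illustrated with a standard-library-only sketch of HS256 signing and verification. This is not the backend's code: the function names and the explicit `now` parameter are hypothetical, and the real service presumably uses a JWT library rather than hand-rolled encoding. Only the algorithm, the claim names, and the 1440-minute expiry come from this page.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

ACCESS_TOKEN_EXPIRE_MINUTES = 1440  # 24 hours, as documented above


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    """HMAC-SHA256 over 'header.payload' with the shared secret."""
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_access_token(user_id: str, role: str, secret: str, now: Optional[int] = None) -> str:
    """Build a signed token carrying the sub, role, iat, and exp claims."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "sub": user_id,
        "role": role,
        "iat": issued_at,
        "exp": issued_at + ACCESS_TOKEN_EXPIRE_MINUTES * 60,
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def decode_access_token(token: str, secret: str, now: Optional[int] = None) -> dict:
    """Verify signature and expiry; raise ValueError on any failure."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, claims_b64, signature_b64 = parts
    expected = _sign(f"{header_b64}.{claims_b64}", secret)
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(claims_b64))
    current = int(time.time()) if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

In this sketch any failure (malformed token, wrong secret, expiry) surfaces as a ValueError, which a caller would map to the 401 response described under Error responses.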

Token Lifecycle

  1. User submits email + password to POST /auth/login
  2. Backend verifies password against bcrypt hash in PostgreSQL
  3. Backend issues signed JWT with user ID and role claims
  4. Frontend stores token in localStorage under key chrysopedia_auth_token
  5. All subsequent API requests include Authorization: Bearer <token> header
  6. Backend get_current_user dependency decodes JWT, loads User from DB, checks is_active
  7. On token expiry (24h), frontend catches 401 and clears stored token

Invite Code Registration

Registration is gated by invite codes. No open self-registration.

  • Default seed code: CHRYSOPEDIA-ALPHA-2026 (100 uses, no expiry) — created on first startup via seed_invite_codes()
  • Code validation: checks existence, expiry date, and remaining uses
  • On use: uses_remaining is decremented (not deleted)
  • Optional creator linking: creator_slug field in registration body links the new user to an existing Creator record
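
The validation rules above (existence, expiry, remaining uses, decrement on use) can be sketched in memory as follows. The `InviteCode` dataclass, `InviteCodeError`, and `redeem_invite_code` are hypothetical names for illustration; the real implementation works against PostgreSQL and would need to make the check and the decrement atomic.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class InviteCode:
    code: str
    uses_remaining: int
    expires_at: Optional[datetime] = None  # None means the code never expires


class InviteCodeError(Exception):
    """Raised when a code cannot be redeemed (hypothetical error type)."""


def redeem_invite_code(
    codes: Dict[str, InviteCode], code: str, now: Optional[datetime] = None
) -> InviteCode:
    """Validate a code and consume one use, mirroring the documented checks."""
    now = now or datetime.now(timezone.utc)
    invite = codes.get(code)
    if invite is None:
        raise InviteCodeError("invite code not found")
    if invite.expires_at is not None and invite.expires_at <= now:
        raise InviteCodeError("invite code expired")
    if invite.uses_remaining <= 0:
        raise InviteCodeError("invite code has no uses remaining")
    invite.uses_remaining -= 1  # decremented; the record itself is kept
    return invite


# Usage: the default seed code described above (100 uses, no expiry).
seed = {"CHRYSOPEDIA-ALPHA-2026": InviteCode("CHRYSOPEDIA-ALPHA-2026", uses_remaining=100)}
redeem_invite_code(seed, "CHRYSOPEDIA-ALPHA-2026")
```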

Registration Flow

POST /auth/register
{
  "email": "user@example.com",
  "password": "...",
  "display_name": "DJ Producer",
  "invite_code": "CHRYSOPEDIA-ALPHA-2026",
  "creator_slug": "dj-producer"          // optional
}

Role Model

Two roles defined in UserRole enum:

Role     Purpose                                       Access
creator  Content creators linked to a Creator profile  Own videos' consent management
admin    Platform administrators                       All consent records, admin summary, pipeline admin

FastAPI Dependencies

Dependency                    Purpose                                          Usage
get_current_user              Decode JWT → load User → verify active           Depends(get_current_user) on any protected endpoint
require_role(UserRole.admin)  Check user has admin role, return 403 otherwise  dependencies=[Depends(require_role(UserRole.admin))]
oauth2_scheme                 Extract Bearer token from Authorization header   Injected into get_current_user

Error responses:

  • 401 Unauthorized — missing/expired/invalid token, or user not found/inactive
  • 403 Forbidden — wrong role, or not linked to creator profile
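
The decision logic behind these two status codes can be sketched without FastAPI. Everything below is an illustrative assumption: `AuthError`, `resolve_current_user`, and `check_role` are hypothetical stand-ins for what `get_current_user` and `require_role` do, and users are looked up in a plain dict rather than the database.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class UserRole(str, Enum):
    admin = "admin"
    creator = "creator"


@dataclass
class User:
    id: str
    role: UserRole
    is_active: bool = True


class AuthError(Exception):
    """Carries the HTTP status a framework handler would return."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def resolve_current_user(claims: Optional[dict], users: Dict[str, User]) -> User:
    """Map decoded JWT claims to an active user, or fail with 401.

    `claims` is None when the token was missing, expired, or invalid.
    """
    if claims is None:
        raise AuthError(401, "missing, expired, or invalid token")
    user = users.get(claims.get("sub", ""))
    if user is None or not user.is_active:
        raise AuthError(401, "user not found or inactive")
    return user


def check_role(user: User, required: UserRole) -> User:
    """Fail with 403 when an authenticated user lacks the required role."""
    if user.role != required:
        raise AuthError(403, "insufficient role")
    return user
```

The split matters for clients: a 401 means the frontend should discard its token and re-authenticate, while a 403 means the token is fine but the action is not permitted.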

Frontend Auth Integration

AuthContext (frontend/src/context/AuthContext.tsx)

React context providing auth state to the entire app:

Field/Method            Type                 Purpose
user                    UserResponse | null  Current user object
token                   string | null        Raw JWT string
isAuthenticated         boolean              !!user
loading                 boolean              True during token rehydration on mount
login(email, password)  async                Calls /auth/login, stores token, fetches /auth/me
register(data)          async                Calls /auth/register
logout()                void                 Clears localStorage and state
Token Rehydration

On app mount, AuthProvider checks localStorage for a stored token. If found, it calls GET /auth/me to rehydrate the user session. If the token is expired or invalid, it silently clears the stored token.

ProtectedRoute (frontend/src/components/ProtectedRoute.tsx)

Route wrapper that redirects unauthenticated users to /login?returnTo=<current_path>.

  • Shows nothing (null) while auth state is loading (prevents flash redirect)
  • Preserves the intended destination in returnTo query param

Consent Management

Per-video consent system that lets creators control how their content is used.

Field           Default  Meaning
kb_inclusion    false    Allow indexing into knowledge base
training_usage  false    Allow use for model training
public_display  true     Allow public display on site

Ownership Model

  • Each VideoConsent record is tied to a SourceVideo via source_video_id
  • Creator users can only access consent for videos belonging to their linked Creator profile
  • Admin users bypass ownership checks and can access all consent records
  • _verify_video_ownership() helper enforces this on every consent endpoint
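
A minimal sketch of the ownership rule, assuming simplified `User` and `SourceVideo` shapes with a `creator_id` link. The real `_verify_video_ownership()` signature is not shown on this page, so the function below is a hypothetical equivalent, and the choice to raise the same error for a video owned by another creator is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: str
    role: str  # "admin" or "creator"
    creator_id: Optional[str] = None  # set when linked to a Creator profile


@dataclass
class SourceVideo:
    id: str
    creator_id: str


class Forbidden(Exception):
    """Stand-in for a 403 response."""


def verify_video_ownership(user: User, video: SourceVideo) -> None:
    """Allow admins unconditionally; creators only for their own videos."""
    if user.role == "admin":
        return  # admins bypass ownership checks
    if user.creator_id is None:
        raise Forbidden("user is not linked to a creator profile")
    if video.creator_id != user.creator_id:
        # Assumption: another creator's video is rejected the same way.
        raise Forbidden("video belongs to a different creator")
```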

Audit Trail

Every consent field change produces a ConsentAuditLog entry:

  • Versioned: Sequential version numbers per video_consent_id
  • Per-field: Each changed field gets its own audit row with old_value and new_value
  • Attributed: changed_by (FK → User) and ip_address recorded
  • Append-only: Audit entries are never modified or deleted

Consent Endpoints

Method  Path                          Purpose
GET     /consent/videos               List consent for creator's videos (admin sees all)
GET     /consent/videos/{id}          Single video consent (returns defaults if no record)
PUT     /consent/videos/{id}          Upsert consent (partial update, audit logged)
GET     /consent/videos/{id}/history  Full audit trail for a video
GET     /consent/admin/summary        Aggregate flag counts (admin only)
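
How a partial-update upsert and the per-field audit trail fit together can be sketched in memory. The dataclasses and `upsert_consent` are hypothetical; the default values come from the consent field table. One point is an explicit assumption: this page does not say whether rows written by the same request share a version number, so the sketch gives every audit row its own sequential version.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

CONSENT_DEFAULTS = {"kb_inclusion": False, "training_usage": False, "public_display": True}


@dataclass
class AuditRow:
    version: int
    field_name: str
    old_value: bool
    new_value: bool
    changed_by: str
    ip_address: Optional[str]


@dataclass
class VideoConsent:
    source_video_id: str
    flags: Dict[str, bool] = field(default_factory=lambda: dict(CONSENT_DEFAULTS))
    audit_log: List[AuditRow] = field(default_factory=list)  # append-only


def upsert_consent(
    store: Dict[str, VideoConsent],
    video_id: str,
    changes: Dict[str, bool],
    changed_by: str,
    ip_address: Optional[str] = None,
) -> VideoConsent:
    """Create the record if missing, apply only the given fields, log each change."""
    unknown = set(changes) - set(CONSENT_DEFAULTS)
    if unknown:
        # Validate up front so a bad field name never leaves a half-applied update.
        raise KeyError(f"unknown consent fields: {sorted(unknown)}")
    consent = store.setdefault(video_id, VideoConsent(video_id))
    for name, new_value in changes.items():
        old_value = consent.flags[name]
        if old_value == new_value:
            continue  # unchanged fields produce no audit row
        # Assumption: one sequential version per audit row.
        version = len(consent.audit_log) + 1
        consent.audit_log.append(
            AuditRow(version, name, old_value, new_value, changed_by, ip_address)
        )
        consent.flags[name] = new_value
    return consent
```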

LightRAG Integration

LightRAG is a graph-based RAG (Retrieval-Augmented Generation) service added to the stack in M019:

  • Image: ghcr.io/hkuds/lightrag:latest
  • Container: chrysopedia-lightrag
  • Port: 9621 (bound to 127.0.0.1 only)
  • Data volume: /vmPool/r/services/chrysopedia_lightrag
  • Dependencies: Qdrant (healthy) + Ollama (healthy)
  • Config: .env.lightrag (LLM model, embedding model, Qdrant connection, working directory)
  • Healthcheck: Python urllib.request.urlopen('http://127.0.0.1:9621/health')

LightRAG provides graph-structured knowledge retrieval as a complement to Qdrant's vector similarity search. It runs as a standalone service within the Docker Compose stack, sharing Qdrant and Ollama with the main application.
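
The healthcheck listed above amounts to a single HTTP probe. The sketch below wraps the same `urllib.request.urlopen` call in a function that returns a boolean; the function name, the timeout, and the parameterised URL are assumptions added for illustration, and the actual Compose healthcheck command may be a one-liner.

```python
import http.client
import urllib.request

LIGHTRAG_HEALTH_URL = "http://127.0.0.1:9621/health"


def lightrag_is_healthy(url: str = LIGHTRAG_HEALTH_URL, timeout: float = 5.0) -> bool:
    """Return True only when the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, refused connections, and timeouts are all
        # OSError subclasses; any of them means the service is not healthy.
        return False


if __name__ == "__main__":
    # Exit non-zero on failure so a container runtime can mark it unhealthy.
    raise SystemExit(0 if lightrag_is_healthy() else 1)
```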

Admin Impersonation

Admins can view the site as any creator. See the dedicated Impersonation page for full details.

  • Mechanism: Impersonation JWT with original_user_id claim and 1h expiry
  • Safety: Write endpoints blocked via reject_impersonation dependency
  • Audit: All start/stop actions logged to impersonation_log table with admin ID, target ID, IP, timestamp
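
The write-blocking rule can be sketched as a check on the decoded token claims. Only the `original_user_id` claim and the `reject_impersonation` name come from this page; the exception type and the claims-dict signature are hypothetical, and the real dependency is wired into FastAPI rather than called directly.

```python
class ImpersonationWriteBlocked(Exception):
    """Stand-in for the error a write endpoint returns during impersonation."""


def is_impersonation_token(claims: dict) -> bool:
    """An impersonation JWT is recognised by its original_user_id claim."""
    return claims.get("original_user_id") is not None


def reject_impersonation(claims: dict) -> None:
    """Guard for write endpoints: refuse requests made while impersonating."""
    if is_impersonation_token(claims):
        raise ImpersonationWriteBlocked(
            "write endpoints are blocked during impersonation"
        )
```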

Evaluation Results (M020/S05)

A/B comparison of 25 queries (13 real user queries, 12 curated) against Qdrant search and LightRAG hybrid mode:

  • LightRAG wins: 23/25 queries on relevance scoring
  • Qdrant wins: 2/25 (short query rejected by LightRAG, ambiguous incomplete query)
  • Avg relevance: Qdrant 2.09/5, LightRAG 4.52/5
  • Avg latency: Qdrant 99ms, LightRAG 86s

Key finding: LightRAG is a RAG system producing synthesized multi-paragraph answers, not a search engine. The two serve different interaction patterns.

  • Autocomplete / typeahead → Qdrant (sub-100ms required)
  • Search results list → Qdrant (users expect instant ranked results)
  • "Ask a question" / chat → LightRAG (synthesized answers are the core value)
  • Deep-dive / explore → Both (LightRAG answer + Qdrant related pages sidebar)
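
The mapping above can be expressed as a small routing table. This is only an illustration of the recommendation; the `Interaction` enum and `backends_for` function are hypothetical and do not describe existing application code.

```python
from enum import Enum
from typing import Dict, Tuple


class Interaction(Enum):
    AUTOCOMPLETE = "autocomplete"
    SEARCH_RESULTS = "search_results"
    ASK = "ask"
    DEEP_DIVE = "deep_dive"


# Backends per interaction pattern, following the list above.
BACKENDS: Dict[Interaction, Tuple[str, ...]] = {
    Interaction.AUTOCOMPLETE: ("qdrant",),        # latency-bound
    Interaction.SEARCH_RESULTS: ("qdrant",),      # instant ranked results
    Interaction.ASK: ("lightrag",),               # synthesized answer
    Interaction.DEEP_DIVE: ("lightrag", "qdrant"),  # answer plus related pages
}


def backends_for(interaction: Interaction) -> Tuple[str, ...]:
    """Return the retrieval backends to call for an interaction pattern."""
    return BACKENDS[interaction]
```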

Creator-Scoped Retrieval

LightRAG has no metadata-based filtering. Creator scoping uses ll_keywords to bias retrieval toward a specific creator name (soft bias, not hard filter). Each document's text includes structured provenance: Creator name, Creator ID, Source Videos, and per-key-moment source attribution.

Query utility: backend/scripts/lightrag_query.py --query "snare design" --creator "COPYCATT"
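
As a rough sketch of what creator scoping could look like at the request level, the function below builds a query payload that passes the creator name as a low-level keyword. The payload field names (`query`, `mode`, `ll_keywords`) follow LightRAG's query parameters as understood here and should be checked against the deployed version; `build_lightrag_query` is a hypothetical helper, not the contents of lightrag_query.py.

```python
from typing import Optional


def build_lightrag_query(query: str, creator: Optional[str] = None, mode: str = "hybrid") -> dict:
    """Build a query payload, optionally biased toward one creator.

    The creator name is a soft bias via ll_keywords, not a hard filter,
    so results from other creators can still appear.
    """
    payload: dict = {"query": query, "mode": mode}
    if creator:
        payload["ll_keywords"] = [creator]
    return payload
```

For the command shown above, this would produce a hybrid-mode payload for "snare design" with "COPYCATT" as the only low-level keyword.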

Data Coverage

LightRAG reindex in progress. Entity types configured: Creator, Technique, Plugin, Synthesizer, Effect, Genre, DAW, SamplePack, SignalChain, Concept, Frequency, SoundDesignElement.


See also: Architecture, API-Surface, Data-Model, Impersonation, Player