Create Data-Model wiki page for fractafrag

xpltd_admin 2026-04-03 22:50:48 -06:00
parent ae771c6889
commit 1c4977a166

244
Data-Model.-.md Normal file

@ -0,0 +1,244 @@
# Data Model
| Meta | Value |
|------|-------|
| **Repo** | `xpltdco/fractafrag` |
| **Page** | `Data-Model` |
| **Audience** | developers, agents |
| **Last Updated** | 2026-04-04 |
| **Status** | current |
## Overview
Fractafrag uses PostgreSQL 16 with the **pgvector** extension for vector similarity search, **uuid-ossp** for UUID generation, and **pg_trgm** for trigram text search. The schema is bootstrapped from `db/init.sql` on first startup. ORM models are defined in `services/api/app/models/models.py` using SQLAlchemy 2 (async).
## Entity Relationships
```mermaid
erDiagram
users ||--o{ shaders : "authors"
users ||--o{ votes : "casts"
users ||--o{ engagement_events : "generates"
users ||--o{ desires : "creates"
users ||--o{ api_keys : "owns"
users ||--o{ bounty_tips : "tips"
users ||--o{ comments : "writes"
shaders ||--o{ shader_versions : "has versions"
shaders ||--o{ votes : "receives"
shaders ||--o{ engagement_events : "tracked"
shaders ||--o{ comments : "has"
shaders ||--|| shaders : "forked_from"
desires ||--o{ desire_clusters : "clustered"
desires ||--o{ bounty_tips : "receives tips"
desires }o--o| shaders : "fulfilled by"
```
## Tables
### `users`
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | UUID | No | gen_random_uuid() | Primary key |
| `username` | TEXT | No | — | Unique display name |
| `email` | TEXT | No | — | Unique email address |
| `password_hash` | TEXT | No | — | bcrypt hash (12 rounds) |
| `role` | TEXT | No | `user` | `user`, `moderator`, `admin` |
| `trust_tier` | TEXT | No | `standard` | `standard`, `creator`, `trusted_api` |
| `is_system` | BOOLEAN | No | false | Platform system account flag |
| `stripe_customer_id` | TEXT | Yes | NULL | Stripe customer ID |
| `subscription_tier` | TEXT | No | `free` | `free`, `pro`, `studio` |
| `ai_credits_remaining` | INTEGER | No | 0 | AI generation credits |
| `taste_vector` | vector(512) | Yes | NULL | Learned taste embedding (pgvector) |
| `is_verified_creator` | BOOLEAN | No | false | Verified creator status |
| `verified_creator_at` | TIMESTAMPTZ | Yes | NULL | When creator was verified |
| `stripe_connect_account_id` | TEXT | Yes | NULL | Stripe Connect for payouts |
| `created_at` | TIMESTAMPTZ | No | now() | Registration time |
| `last_active_at` | TIMESTAMPTZ | Yes | NULL | Last activity timestamp |
**System account:** UUID `00000000-0000-0000-0000-000000000001`, username `fractafrag`, role `admin`
### `shaders`
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | UUID | No | gen_random_uuid() | Primary key |
| `author_id` | UUID FK | Yes | — | FK → users (ON DELETE SET NULL) |
| `title` | TEXT | No | — | Shader title |
| `description` | TEXT | Yes | NULL | Shader description |
| `glsl_code` | TEXT | No | — | Full GLSL fragment shader source |
| `status` | TEXT | No | `published` | `draft`, `published`, `archived` |
| `is_public` | BOOLEAN | No | true | Visibility flag |
| `is_ai_generated` | BOOLEAN | No | false | Was this generated by AI? |
| `is_system` | BOOLEAN | No | false | Platform-curated shader |
| `ai_provider` | TEXT | Yes | NULL | `anthropic`, `openai`, `ollama` |
| `system_label` | TEXT | Yes | NULL | e.g., `fractafrag-curated` |
| `thumbnail_url` | TEXT | Yes | NULL | Rendered thumbnail path |
| `preview_url` | TEXT | Yes | NULL | Rendered preview path |
| `render_status` | TEXT | No | `pending` | `pending`, `rendering`, `ready`, `failed` |
| `style_vector` | vector(512) | Yes | NULL | Visual style embedding |
| `style_metadata` | JSONB | Yes | NULL | `{chaos_level, color_temp, motion_type, ...}` |
| `tags` | TEXT[] | No | `{}` | Array of tag strings |
| `shader_type` | TEXT | No | `2d` | `2d`, `3d`, `audio-reactive` |
| `forked_from` | UUID FK | Yes | NULL | Self-FK → shaders (fork origin) |
| `current_version` | INTEGER | No | 1 | Latest version number |
| `view_count` | INTEGER | No | 0 | View counter |
| `score` | FLOAT | No | 0 | Cached hot score for feed ranking |
| `access_tier` | TEXT | No | `open` | Content access level |
| `source_unlock_price_cents` | INTEGER | Yes | NULL | Price to view source (creator economy) |
| `commercial_license_price_cents` | INTEGER | Yes | NULL | Commercial license price |
| `verified_creator_shader` | BOOLEAN | No | false | By a verified creator |
| `created_at` | TIMESTAMPTZ | No | now() | Creation time |
| `updated_at` | TIMESTAMPTZ | No | now() | Last update time |
### `shader_versions`
Immutable snapshots created on every shader update.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | UUID | No | gen_random_uuid() | Primary key |
| `shader_id` | UUID FK | No | — | FK → shaders (ON DELETE CASCADE) |
| `version_number` | INTEGER | No | — | Sequential version number |
| `glsl_code` | TEXT | No | — | GLSL source at this version |
| `title` | TEXT | Yes | NULL | Title at this version |
| `description` | TEXT | Yes | NULL | Description at this version |
| `tags` | TEXT[] | Yes | NULL | Tags at this version |
| `style_metadata` | JSONB | Yes | NULL | Style data at this version |
| `change_note` | TEXT | Yes | NULL | What changed (e.g., "fixed color bleeding") |
| `thumbnail_url` | TEXT | Yes | NULL | Thumbnail at this version |
| `created_at` | TIMESTAMPTZ | No | now() | Version creation time |
**Unique constraint:** `(shader_id, version_number)`
### `votes`
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | UUID | No | gen_random_uuid() | Primary key |
| `user_id` | UUID FK | No | — | FK → users (CASCADE) |
| `shader_id` | UUID FK | No | — | FK → shaders (CASCADE) |
| `value` | SMALLINT | No | — | `-1` (downvote) or `+1` (upvote) |
| `created_at` | TIMESTAMPTZ | No | now() | Vote time |
**Unique constraint:** `(user_id, shader_id)`
### `engagement_events`
Tracks user interactions for feed personalization.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | UUID | No | gen_random_uuid() | Primary key |
| `user_id` | UUID FK | Yes | NULL | FK → users (SET NULL) |
| `session_id` | TEXT | Yes | NULL | Anonymous session tracker |
| `shader_id` | UUID FK | No | — | FK → shaders (CASCADE) |
| `event_type` | TEXT | No | — | `view`, `dwell`, `replay`, `share` |
| `dwell_secs` | FLOAT | Yes | NULL | Time spent viewing |
| `metadata` | JSONB | Yes | NULL | Event-specific data |
| `created_at` | TIMESTAMPTZ | No | now() | Event time |
### `desires`
Community bounty requests.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | UUID | No | gen_random_uuid() | Primary key |
| `author_id` | UUID FK | Yes | NULL | FK → users (SET NULL) |
| `prompt_text` | TEXT | No | — | Description of desired shader |
| `prompt_embedding` | vector(512) | Yes | NULL | TF-IDF + SVD embedding |
| `style_hints` | JSONB | Yes | NULL | Visual style preferences |
| `tip_amount_cents` | INTEGER | No | 0 | Bounty amount in cents |
| `status` | TEXT | No | `open` | `open`, `fulfilled`, `expired` |
| `heat_score` | FLOAT | No | 1 | Demand indicator (cluster size) |
| `fulfilled_by_shader` | UUID FK | Yes | NULL | FK → shaders |
| `fulfilled_at` | TIMESTAMPTZ | Yes | NULL | Fulfillment time |
| `expires_at` | TIMESTAMPTZ | Yes | NULL | Expiration date |
| `created_at` | TIMESTAMPTZ | No | now() | Creation time |
### `desire_clusters`
Groups similar desires by embedding proximity.
| Column | Type | Description |
|--------|------|-------------|
| `cluster_id` | UUID | Cluster identifier |
| `desire_id` | UUID FK | FK → desires (CASCADE) |
| `similarity` | FLOAT | Cosine similarity to cluster representative |
**Primary key:** composite `(cluster_id, desire_id)`
### `bounty_tips`
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | UUID | No | gen_random_uuid() | Primary key |
| `desire_id` | UUID FK | No | — | FK → desires (CASCADE) |
| `tipper_id` | UUID FK | Yes | NULL | FK → users (SET NULL) |
| `amount_cents` | INTEGER | No | — | Tip amount |
| `stripe_payment_intent_id` | TEXT | Yes | NULL | Stripe payment reference |
| `status` | TEXT | No | `held` | `held`, `paid`, `refunded` |
| `created_at` | TIMESTAMPTZ | No | now() | Tip time |
### `api_keys`
MCP client authentication keys.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | UUID | No | gen_random_uuid() | Primary key |
| `user_id` | UUID FK | No | — | FK → users (CASCADE) |
| `key_hash` | TEXT | No | — | bcrypt hash (unique) |
| `key_prefix` | TEXT | No | — | First 16 chars for identification |
| `name` | TEXT | Yes | NULL | User-given name |
| `trust_tier` | TEXT | No | `probation` | `probation`, `standard`, `verified` |
| `submissions_approved` | INTEGER | No | 0 | Approved submission count |
| `rate_limit_per_hour` | INTEGER | No | 10 | Allowed requests/hour |
| `last_used_at` | TIMESTAMPTZ | Yes | NULL | Last usage time |
| `created_at` | TIMESTAMPTZ | No | now() | Creation time |
| `revoked_at` | TIMESTAMPTZ | Yes | NULL | Revocation time (soft delete) |
### `generation_log`, `comments`, `source_unlocks`, `creator_payouts`, `creator_engagement_snapshots`
These tables exist in the schema but are stubs for future milestones (M4-M5). See `db/init.sql` for full definitions.
## Indexes
### Performance Indexes
- `shaders(score DESC)` — Feed ranking
- `shaders(created_at DESC)` — New feed
- `shaders(tags)` — GIN index for tag filtering
- `shaders(render_status)` — Filter rendered shaders
- `shaders(status, is_public)` — Published shader queries
- `shaders(title)` — Trigram text search (pg_trgm)
### Vector Indexes (HNSW)
- `shaders(style_vector)` — Visual similarity (cosine ops)
- `users(taste_vector)` — Taste-based recommendations (cosine ops)
- `desires(prompt_embedding)` — Desire clustering (cosine ops)
### Relationship Indexes
- `votes(user_id, shader_id)` — Quick vote lookup
- `engagement_events(user_id, shader_id, session_id)` — Engagement queries
- `desires(status, heat_score)` — Bounty board ranking
- `api_keys(user_id, key_prefix)` — Key lookup
## Migration Strategy
The initial schema is bootstrapped from `db/init.sql` which runs as a Docker entrypoint script on first PostgreSQL startup. Subsequent migrations use **Alembic** (configured in the API service). The ORM models in `app/models/models.py` mirror the init.sql schema.
## Hot Score Calculation
The `score` column on shaders is recalculated on each vote change using Wilson score lower bound with time decay:
```
wilson_lower = (p + z²/2n - z√(p(1-p)/n + z²/4n²)) / (1 + z²/n)
where p = upvotes/total, z = 1.96 (95% CI), n = total votes
time_decay = 1 / (1 + age_hours/48)
score = wilson_lower * time_decay
```
This gives shaders a ~3-day half-life while still rewarding high vote ratios.