From 5ce2131802c6e84e3e9c67ecba5f1495b3e849c1 Mon Sep 17 00:00:00 2001
From: xpltd_admin
Date: Fri, 3 Apr 2026 22:49:23 -0600
Subject: [PATCH] Create Architecture wiki page for fractafrag

---
 Architecture.md | 220 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 220 insertions(+)
 create mode 100644 Architecture.md

diff --git a/Architecture.md b/Architecture.md
new file mode 100644
index 0000000..247da17
--- /dev/null
+++ b/Architecture.md
@@ -0,0 +1,220 @@
+# Architecture
+
+| Meta | Value |
+|------|-------|
+| **Repo** | `xpltdco/fractafrag` |
+| **Page** | `Architecture` |
+| **Audience** | developers, agents, newcomers |
+| **Last Updated** | 2026-04-04 |
+| **Status** | current |
+
+## System Overview
+
+Fractafrag is a multi-service application orchestrated by Docker Compose. Eight containers work together: nginx routes traffic, the FastAPI backend handles business logic, a React SPA provides the UI, Celery workers process async jobs, a headless Chromium renderer captures shader thumbnails, an MCP server exposes tools for AI agents, and PostgreSQL + Redis provide persistence and caching.
+
+```mermaid
+graph TB
+    Internet["Internet / Browser"]
+    Nginx["nginx :80/443"]
+    Frontend["Frontend :5173<br/>(React + Vite)"]
+    API["API :8000<br/>(FastAPI)"]
+    MCP["MCP Server :3200<br/>(FastMCP)"]
+    Worker["Celery Worker"]
+    Renderer["Renderer :3100<br/>(Puppeteer + Chromium)"]
+    Postgres["PostgreSQL :5432<br/>(pgvector)"]
+    Redis["Redis :6379"]
+    Renders["Render Output<br/>(/renders volume)"]
+
+    Internet -->|"HTTP/WS"| Nginx
+    Nginx -->|"/"| Frontend
+    Nginx -->|"/api/*"| API
+    Nginx -->|"/mcp/*"| MCP
+    Nginx -->|"/renders/*"| Renders
+
+    API --> Postgres
+    API --> Redis
+    Worker --> Postgres
+    Worker --> Redis
+    Worker -->|"POST /render"| Renderer
+    Renderer --> Renders
+    MCP -->|"internal API"| API
+```
+
+## Service Topology
+
+| Service | Image/Base | Port | Purpose |
+|---------|-----------|------|---------|
+| **nginx** | nginx:alpine | 80 | Reverse proxy, static render serving |
+| **frontend** | node:20-alpine | 5173 | React SPA (Vite dev/static prod) |
+| **api** | python:3.12-slim | 8000 | FastAPI REST API |
+| **mcp** | python:3.12-slim | 3200 | AI agent MCP interface (HTTP+SSE) |
+| **renderer** | node:20-slim + Chromium | 3100 | Headless shader rendering |
+| **worker** | python:3.12-slim (reuses api image) | — | Celery async task processing |
+| **postgres** | pgvector/pgvector:pg16 | 5432 | Primary database with vector search |
+| **redis** | redis:7-alpine | 6379 | Cache, job queue, token blocklist |
+
+## Tech Stack
+
+| Layer | Technology | Purpose |
+|-------|-----------|---------|
+| HTTP Proxy | nginx | Routing, TLS termination, static files |
+| Frontend | React 18, Vite, Three.js | SPA with WebGL shader preview |
+| Styling | Tailwind CSS | Utility-first CSS framework |
+| Client State | Zustand | Lightweight state management |
+| Server State | TanStack Query 5 | API caching and background refetch |
+| Backend | FastAPI + Uvicorn | Async ASGI web framework |
+| ORM | SQLAlchemy 2 (async) | Type-safe database access |
+| Database | PostgreSQL 16 + pgvector | Relational + vector similarity search |
+| Migrations | Alembic | Database schema versioning |
+| Task Queue | Celery + Redis | Distributed async job processing |
+| Auth | JWT (python-jose) + bcrypt | Token-based authentication |
+| Payments | Stripe SDK | Subscriptions and payouts (planned) |
+| AI Interface | FastMCP | MCP server for external AI agents |
+| Rendering | Puppeteer Core + Chromium | Headless shader screenshot capture |
+| Embeddings | scikit-learn (TF-IDF + SVD) | Text-to-vector for desire clustering |
+| Vector Search | pgvector (HNSW) | Cosine similarity for recommendations |
+
+## Directory Structure
+
+```
+fractafrag/
+├── db/
+│   └── init.sql  # PostgreSQL bootstrap (schema + extensions + indexes)
+├── scripts/
+│   └── seed.py  # Sample data seeding (WIP)
+├── services/
+│   ├── api/  # FastAPI backend
+│   │   ├── Dockerfile
+│   │   ├── pyproject.toml  # Python dependencies
+│   │   └── app/
+│   │       ├── main.py  # FastAPI app setup, lifespan, CORS
+│   │       ├── config.py  # Pydantic Settings (env vars)
+│   │       ├── database.py  # Async SQLAlchemy engine + sessions
+│   │       ├── redis.py  # Async Redis client singleton
+│   │       ├── models/
+│   │       │   └── models.py  # All SQLAlchemy ORM models
+│   │       ├── schemas/
+│   │       │   └── schemas.py  # Pydantic request/response schemas
+│   │       ├── middleware/
+│   │       │   ├── auth.py  # JWT auth, password hashing, dependencies
+│   │       │   └── rate_limit.py  # Redis-based rate limiting
+│   │       ├── routers/
+│   │       │   ├── auth.py  # Register, login, refresh, logout
+│   │       │   ├── shaders.py  # CRUD, versioning, fork, search
+│   │       │   ├── feed.py  # Personalized feed, trending, similar
+│   │       │   ├── votes.py  # Upvote/downvote, hot score
+│   │       │   ├── desires.py  # Bounty board
+│   │       │   ├── generate.py  # AI generation (stub)
+│   │       │   ├── users.py  # Profile, BYOK keys
+│   │       │   ├── payments.py  # Stripe integration (stub)
+│   │       │   ├── mcp_keys.py  # API key management
+│   │       │   └── health.py  # Liveness check
+│   │       ├── services/
+│   │       │   ├── embedding.py  # TF-IDF + SVD vectorizer (512-dim)
+│   │       │   ├── clustering.py  # Desire clustering via pgvector
+│   │       │   ├── glsl_validator.py  # Static GLSL syntax validation
+│   │       │   ├── renderer_client.py  # HTTP client to renderer service
+│   │       │   └── byok.py  # BYOK key encryption
+│   │       └── worker/
+│   │           └── __init__.py  # Celery app + task definitions
+│   ├── frontend/  # React SPA
+│   │   ├── Dockerfile
+│   │   ├── package.json
+│   │   ├── vite.config.ts
+│   │   ├── tailwind.config.js
+│   │   └── src/
+│   │       ├── main.tsx  # React entry point
+│   │       ├── stores/  # Zustand state (auth)
+│   │       ├── pages/  # Route pages (Feed, Editor, Explore, etc.)
+│   │       ├── components/  # ShaderCanvas, Navbar, Layout
+│   │       └── ...
+│   ├── mcp/  # MCP server
+│   │   ├── Dockerfile
+│   │   ├── requirements.txt
+│   │   └── server.py  # FastMCP tools + resources
+│   ├── renderer/  # Headless rendering
+│   │   ├── Dockerfile
+│   │   ├── package.json
+│   │   └── server.js  # Express + Puppeteer rendering pipeline
+│   └── nginx/
+│       └── conf/
+│           └── default.conf  # Proxy routing rules
+├── docker-compose.yml  # Production compose
+├── docker-compose.override.yml  # Dev overrides (volume mounts, hot reload)
+├── docker-compose.dev.yml  # Data stores only (for local dev outside Docker)
+├── Makefile  # Developer convenience commands
+├── .env.example  # Environment template
+└── .forgejo/workflows/ci.yml  # CI pipeline
+```
+
+## Key Design Decisions
+
+### Microservices in a Monorepo
+All services live in one repo under `services/`. Docker Compose orchestrates them. This gives microservice isolation (separate runtimes, independent scaling) with monorepo convenience (single clone, shared CI, atomic changes).
+
+### pgvector for Similarity Search
+Instead of a dedicated vector database (Pinecone, Weaviate), Fractafrag uses pgvector as a PostgreSQL extension. This keeps everything in one database, simplifies backups, and avoids an additional service. HNSW indexes provide fast approximate nearest-neighbor search for taste vectors, style vectors, and desire embeddings.
+
+### Headless Chromium for Rendering
+Shader thumbnails are generated by a real browser (Chromium via Puppeteer) rather than a lightweight WebGL library. This guarantees pixel-perfect rendering matching what users see in their browsers. The tradeoff is higher resource usage (512MB shared memory) and slower rendering.
+
+### Celery over In-Process Jobs
+Unlike Tubearr's in-process queue, Fractafrag uses Celery with Redis as a proper distributed task queue. This allows the API to remain responsive while rendering, embedding, and AI generation happen asynchronously. The worker runs as a separate container with independent scaling.
+
+### JWT with Redis Blocklist
+Refresh tokens are stateless JWTs but validated against a Redis blocklist on each use. This provides immediate token revocation without a database round-trip, while keeping the auth flow mostly stateless.
+
+### Immutable Shader Versions
+Every shader update creates a new version snapshot in the `shader_versions` table. This provides full history without git-level complexity, enables version restoration, and supports future diff/compare features.
+
+### TF-IDF + SVD Embeddings
+Desire prompts are embedded using a custom TF-IDF + TruncatedSVD pipeline trained on a shader/visual-art domain corpus. This avoids external embedding API calls while providing domain-relevant 512-dimensional vectors for cosine similarity clustering.
+
+## Request Lifecycle
+
+### API Request
+```
+Browser → nginx:80 → /api/* → api:8000
+  → CORS middleware
+  → Auth dependency (JWT validation)
+  → Router handler → SQLAlchemy async → PostgreSQL
+  → Pydantic serialization → JSON response
+```
+
+### Shader Publish Flow
+```
+User submits GLSL → POST /api/v1/shaders
+  → GLSL validator (static analysis)
+  → Free-tier rate limit check (5/month)
+  → Create Shader + ShaderVersion v1 in DB
+  → Enqueue render_shader Celery task
+  → Return shader (render_status=pending)
+
+Worker picks up task:
+  → POST /render to renderer:3100 with GLSL
+  → Chromium renders shader, captures screenshots
+  → Store thumbnail_url + preview_url
+  → Update shader render_status → "ready"
+```
+
+### Feed Personalization
+```
+GET /api/v1/feed (authenticated)
+  → Over-fetch 2-3x candidates from DB (published, rendered)
+  → Build tag affinity from user's votes + dwell events
+  → Score each candidate: 0.5*score + 0.2*recency + 0.2*tag_affinity + 0.1*random
+  → Sort, slice top N
+  → Return ShaderFeedItem list
+```
+
+### Desire Processing
+```
+POST /api/v1/desires → Create desire row → Enqueue process_desire task
+
+Worker:
+  → Embed prompt text (TF-IDF + SVD → 512-dim vector)
+  → pgvector cosine search: find nearest cluster (threshold 0.82)
+  → If match: join cluster, recalculate heat_score = cluster_size
+  → If no match: create new cluster with this desire
+  → Update desire row with embedding + cluster info
+```
\ No newline at end of file
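
Reviewer note on the Desire Processing flow: the join-or-create step can be sketched in plain Python as below. This is a minimal illustration, not the code in `clustering.py`. It assumes the 0.82 threshold is a minimum cosine similarity and that each cluster is compared through a single representative vector; the patch states neither. `Cluster`, `cosine_similarity`, and `assign_to_cluster` are hypothetical names, and the in-memory list stands in for the pgvector query.

```python
import math
from dataclasses import dataclass, field

# Threshold taken from the flow above; treating it as a minimum cosine
# similarity (rather than a distance) is an assumption.
SIMILARITY_THRESHOLD = 0.82


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for a desire cluster row."""
    centroid: list[float]
    members: list[str] = field(default_factory=list)

    @property
    def heat_score(self) -> int:
        # The flow above defines heat_score as the cluster size.
        return len(self.members)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two dense vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_to_cluster(desire_id: str, embedding: list[float],
                      clusters: list[Cluster]) -> Cluster:
    """Join the nearest cluster if it is similar enough, else start a new one."""
    best = max(clusters,
               key=lambda c: cosine_similarity(embedding, c.centroid),
               default=None)
    if best is not None and cosine_similarity(embedding, best.centroid) >= SIMILARITY_THRESHOLD:
        best.members.append(desire_id)
        return best
    new_cluster = Cluster(centroid=list(embedding), members=[desire_id])
    clusters.append(new_cluster)
    return new_cluster
```

With toy 2-dimensional vectors in place of the 512-dimensional embeddings, `[1.0, 0.0]` and `[0.9, 0.1]` land in the same cluster, while `[0.0, 1.0]` starts a new one.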