Compare commits
No commits in common. "8208fe6f9f6ea2aa85887a83f36ced665df8b240" and "7dad9d97afbf1a4dd5d14ef73bb7bd3856468431" have entirely different histories.
8208fe6f9f
...
7dad9d97af
73 changed files with 9900 additions and 2 deletions
68
.env.example
Normal file
68
.env.example
Normal file
|
|
@ -0,0 +1,68 @@
|
|||
# PromptLooper — Environment Variables
|
||||
# Copy to .env and adjust values for your deployment.
|
||||
|
||||
# =============================================================================
|
||||
# Database
|
||||
# =============================================================================
|
||||
# PostgreSQL connection string for production mode.
|
||||
# When not set, PromptLooper uses SQLite at DATA_DIR/promptlooper.db (single-container mode).
|
||||
# DATABASE_URL=postgresql://promptlooper:promptlooper@promptlooper-db:5432/promptlooper
|
||||
|
||||
# =============================================================================
|
||||
# Redis
|
||||
# =============================================================================
|
||||
# Redis connection string for Celery task queue and pub/sub (live dashboard).
|
||||
# When not set, PromptLooper uses an in-process queue (single-container mode).
|
||||
# REDIS_URL=redis://promptlooper-redis:6379/0
|
||||
|
||||
# =============================================================================
|
||||
# Server
|
||||
# =============================================================================
|
||||
# Bind address and port for the HTTP server.
|
||||
HOST=0.0.0.0
|
||||
PORT=8400
|
||||
|
||||
# =============================================================================
|
||||
# Authentication
|
||||
# =============================================================================
|
||||
# Secret key used to sign JWT tokens. Auto-generated on first boot if not set.
|
||||
# IMPORTANT: Set this to a long random string in production.
|
||||
# JWT_SECRET=change-me-to-a-random-secret
|
||||
|
||||
# Static API key for programmatic access (MCP, scripts, CI).
|
||||
# When not set, API key auth is disabled — only JWT login works.
|
||||
# API_KEY=
|
||||
|
||||
# =============================================================================
|
||||
# Default LLM Endpoint
|
||||
# =============================================================================
|
||||
# Pre-configured LLM endpoint URL (OpenAI-compatible API).
|
||||
# Users can add more endpoints via the UI or API; this is a convenience default.
|
||||
# DEFAULT_ENDPOINT_URL=http://localhost:11434/v1
|
||||
|
||||
# API key for the default endpoint, if required.
|
||||
# DEFAULT_ENDPOINT_KEY=
|
||||
|
||||
# =============================================================================
|
||||
# Limits
|
||||
# =============================================================================
|
||||
# Maximum number of runs executing in parallel.
|
||||
MAX_CONCURRENT_RUNS=4
|
||||
|
||||
# Token budget per sweep. 0 = unlimited.
|
||||
MAX_TOKENS_PER_SWEEP=0
|
||||
|
||||
# =============================================================================
|
||||
# Storage
|
||||
# =============================================================================
|
||||
# Directory for SQLite database and file storage (single-container mode).
|
||||
DATA_DIR=/data
|
||||
|
||||
# =============================================================================
|
||||
# MCP Server
|
||||
# =============================================================================
|
||||
# Enable the Model Context Protocol server for agent-driven workflows.
|
||||
MCP_ENABLED=true
|
||||
|
||||
# Port for the MCP server (separate from the main API).
|
||||
MCP_PORT=8401
|
||||
57
.gitignore
vendored
Normal file
57
.gitignore
vendored
Normal file
|
|
@ -0,0 +1,57 @@
|
|||
# Python
|
||||
__pycache__/
|
||||
*.py[cod]
|
||||
*$py.class
|
||||
*.egg-info/
|
||||
*.egg
|
||||
dist/
|
||||
build/
|
||||
.eggs/
|
||||
*.whl
|
||||
.venv/
|
||||
venv/
|
||||
env/
|
||||
.env
|
||||
*.pyc
|
||||
.pytest_cache/
|
||||
.mypy_cache/
|
||||
.ruff_cache/
|
||||
htmlcov/
|
||||
.coverage
|
||||
.coverage.*
|
||||
|
||||
# Node / Frontend
|
||||
node_modules/
|
||||
frontend/dist/
|
||||
frontend/build/
|
||||
.npm
|
||||
*.tsbuildinfo
|
||||
|
||||
# Docker
|
||||
docker/nginx.conf.bak
|
||||
|
||||
# IDE
|
||||
.vscode/
|
||||
.idea/
|
||||
*.swp
|
||||
*.swo
|
||||
*~
|
||||
.DS_Store
|
||||
|
||||
# OS
|
||||
Thumbs.db
|
||||
Desktop.ini
|
||||
|
||||
# Data (single-container mode)
|
||||
*.db
|
||||
/data/
|
||||
|
||||
# Alembic
|
||||
alembic/versions/__pycache__/
|
||||
|
||||
# Auto Run Docs (Maestro working files)
|
||||
Auto Run Docs/Working/
|
||||
|
||||
# Misc
|
||||
*.log
|
||||
*.bak
|
||||
48
Auto Run Docs/01-scaffold.md
Normal file
48
Auto Run Docs/01-scaffold.md
Normal file
|
|
@ -0,0 +1,48 @@
|
|||
# Phase 1 — Project Scaffold
|
||||
|
||||
Set up the PromptLooper repository, Docker infrastructure, and basic project skeleton. Read `promptlooper-spec.md` and `CLAUDE.md` before starting any task.
|
||||
|
||||
- [x] Initialize the git repository at git.xpltd.co/xpltdco/promptlooper with a README.md that includes the project description from the spec, a quick-start section showing the single-container docker run command, and badges for license (AGPL-3.0) and status. Add .gitignore for Python, Node, and Docker artifacts.
|
||||
> NOTE: Git repo initialized locally with remote set to git@git.xpltd.co:xpltdco/promptlooper.git. Push failed — SSH key not configured for this host or repo not yet created on Gitea. Needs manual setup before pushing.
|
||||
|
||||
- [x] Create the full directory structure as defined in the spec's Project Structure section. Every directory should exist with a placeholder __init__.py or .gitkeep as appropriate. Include backend/, frontend/, docker/, alembic/, and all subdirectories.
|
||||
> Created all directories: backend/ (with routers/, engine/adapters/, engine/scorers/, mcp/, websocket/, tests/), frontend/src/ (pages/, components/, api/), docker/, alembic/versions/. Python packages have __init__.py, non-Python dirs have .gitkeep.
|
||||
|
||||
- [x] Create .env.example with all environment variables from the spec's Environment Variables table, with sensible defaults and comments explaining each group. Include DATABASE_URL, REDIS_URL, JWT_SECRET, DEFAULT_ENDPOINT_URL, MAX_CONCURRENT_RUNS, and all others.
|
||||
> Created .env.example with all 13 environment variables organized into 7 groups (Database, Redis, Server, Auth, Default LLM Endpoint, Limits, Storage, MCP). Production-only vars (DATABASE_URL, REDIS_URL, JWT_SECRET, API_KEY, DEFAULT_ENDPOINT_*) are commented out with explanatory notes. Single-container defaults work out of the box.
|
||||
|
||||
- [x] Create docker-compose.yml following XPLTD conventions: project name xpltd_promptlooper, network promptlooper (172.33.0.0/24), PostgreSQL on port 5434, Redis, API service, worker service, and web service on port 8400. Use bind mounts under /vmPool/r/services/promptlooper_* for persistent data. Model this after Chrysopedia's docker-compose.yml patterns.
|
||||
> Updated existing docker-compose.yml: fixed DATABASE_URL to use standard postgresql:// scheme (not asyncpg), hardcoded DB credentials instead of requiring .env vars, added API_KEY pass-through, added working_dir for worker service, made JWT_SECRET optional with dev default. All 5 services defined: db (:5434), redis, api (MCP :8401), worker (Celery), web (:8400). Bind mounts under /vmPool/r/services/promptlooper_*. Health checks on db and redis with dependency conditions.
|
||||
|
||||
- [x] Create the multi-stage Dockerfile in docker/ that builds both backend and frontend into a single image. Stage 1: Node build for frontend (npm ci && npm run build). Stage 2: Python runtime with uvicorn, copying the built frontend assets. Include nginx.conf that serves the frontend and proxies /api and /ws to uvicorn. The image should work standalone with SQLite when no DATABASE_URL is provided.
|
||||
> Created 3-stage Dockerfile: (1) frontend-build with Node 20 Alpine, (2) api stage with Python 3.12-slim + uvicorn + static assets for single-container mode, (3) web stage with nginx 1.27 Alpine for production compose. nginx.conf proxies /api/ and /health to the API, upgrades /ws/ connections for WebSocket. Also created: backend/requirements.txt, frontend scaffolding (package.json, vite.config.ts, tsconfig.json, index.html, App.tsx, Tailwind config), and placeholder alembic.ini/env.py for Dockerfile COPY.
|
||||
|
||||
- [x] Create backend/config.py using Pydantic Settings. Define all configuration from the Environment Variables table. Implement the SQLite fallback logic: when DATABASE_URL is not set, construct a SQLite URL pointing to DATA_DIR/promptlooper.db. When REDIS_URL is not set, set a flag for in-process mode.
|
||||
> Created backend/config.py with Pydantic Settings class defining all 13 env vars. SQLite fallback via `effective_database_url` property constructs sqlite:///DATA_DIR/promptlooper.db when DATABASE_URL is unset. `use_in_process_queue` property flags in-process mode when REDIS_URL is absent. JWT_SECRET auto-generates via `secrets.token_urlsafe(32)` when not provided. Empty API_KEY strings normalize to None. 13 tests in tests/test_config.py all passing.
|
||||
|
||||
- [x] Create backend/models.py with all SQLAlchemy ORM models from the spec's Data Model section: User, Project, Experiment, Run, StageResult, Score, ResponseCache, and WebhookConfig. Include all fields, types, relationships, and indexes. Use UUID primary keys and JSONB for flexible fields.
|
||||
> Created all 8 ORM models with UUID PKs, JSON columns (using sqlalchemy.JSON for SQLite compatibility — maps to JSONB on PostgreSQL), enum types (ExperimentStatus, RunStatus), full relationship definitions with cascade deletes, and indexes on foreign keys and commonly filtered columns. Score.metadata mapped as `scorer_metadata` Python attribute (column name stays "metadata") to avoid SQLAlchemy reserved name conflict. 16 tests in tests/test_models.py all passing.
|
||||
|
||||
- [x] Set up Alembic: create alembic.ini and alembic/env.py configured to read DATABASE_URL from the config. Generate and apply the initial migration from the models.
|
||||
> Created alembic.ini with logging config and script_location pointing to alembic/. env.py reads DATABASE_URL from backend.config.settings (with override support for tests). Added script.py.mako template. Generated initial migration (e1909678e89e) with all 8 tables, indexes, foreign keys, and enums. Migration applies cleanly on SQLite (render_as_batch=True for SQLite compatibility). 5 tests in tests/test_alembic.py covering upgrade/downgrade/columns/indexes/FKs. All 34 backend tests pass.
|
||||
|
||||
- [x] Create backend/schemas.py with Pydantic request/response schemas for all API endpoints. Include create/update/response schemas for Project, Experiment, Run, Endpoint, and Webhook. Include the Score input schema and export format schemas.
|
||||
> Created backend/schemas.py with all Pydantic v2 schemas using ConfigDict(from_attributes=True) for ORM compatibility. Includes: Project (create/update/response/list), Experiment (create/update/response/list), Run (response/list/detail with nested stages+scores), StageResult (response), Score (input/response), Endpoint (create/update/response/list), Webhook (create/update/response/list), Auth (setup/login/token/user), Export (run row with scores dict, export response), and Health. 30 tests in tests/test_schemas.py all passing. All 64 backend tests pass.
|
||||
|
||||
- [x] Create backend/main.py with the FastAPI application. Set up CORS middleware, mount all routers (even if they're stubs), configure the WebSocket endpoint, add the /health endpoint that checks DB and Redis connectivity, and add startup/shutdown lifecycle hooks.
|
||||
> Created backend/main.py with: CORS middleware (allow all origins), /health endpoint checking DB (SELECT 1) and Redis (ping) connectivity, /ws WebSocket endpoint with ConnectionManager for real-time broadcasts, async lifespan hooks for DB engine + Redis init/teardown, get_db dependency yielding sessions, dynamic router mounting (silently skips missing routers). 10 tests in tests/test_main.py covering health, CORS, WebSocket connect/disconnect/echo, OpenAPI schema, 404s, broadcast, get_db, and get_redis. All 74 backend tests pass.
|
||||
|
||||
- [x] Create backend/auth.py implementing JWT token generation/verification, API key validation, and the first-boot setup flow. The setup endpoint should check if any users exist — if not, accept username + password to create the admin account. Include a dependency function for route-level auth that supports both JWT and API key.
|
||||
> Created backend/auth.py with: bcrypt password hashing via passlib, JWT token creation/verification (HS256, 24h expiry) using python-jose, first-boot `needs_setup()` + `create_admin()` flow (409 if admin exists), `authenticate_user()` for login, and `get_current_user` FastAPI dependency supporting both JWT Bearer tokens and X-Api-Key header (API key grants first admin user). UUID string-to-UUID conversion for SQLite compatibility. 21 tests in tests/test_auth.py covering hashing, JWT lifecycle, setup flow, login, and all auth dependency paths. All 95 backend tests pass.
|
||||
|
||||
- [x] Scaffold all router files in backend/routers/ as stubs: auth.py, projects.py, experiments.py, runs.py, endpoints.py, export.py, webhooks.py, admin.py. Each should have the correct APIRouter prefix and tags, with placeholder endpoints that return 501 Not Implemented.
|
||||
> Created all 8 router stubs with APIRouter instances, mounted via main.py's _mount_routers(). Endpoints match the spec: auth (3 endpoints), projects (5), experiments (9 incl. sweep/pause/resume/stop), runs (5 incl. leaderboard), endpoints (5 incl. test), export (4 formats), webhooks (3), admin (3). All return 501 Not Implemented. 37 tests in tests/test_routers.py verify every route is mounted and returns 501. All 132 backend tests pass.
|
||||
|
||||
- [x] Initialize the frontend: run npm create vite@latest with React + TypeScript template. Install Tailwind CSS and configure it. Install react-router-dom for routing. Create the basic App.tsx with routes for Setup, Login, Dashboard, Projects, Experiment, Live, Compare, and Admin pages (all as placeholder components). Verify it builds cleanly.
|
||||
> Frontend was already scaffolded with Vite + React + TypeScript + Tailwind + react-router-dom from the Dockerfile task. Added 8 placeholder page components (SetupPage, LoginPage, DashboardPage, ProjectsPage, ExperimentPage, LivePage, ComparePage, AdminPage) in frontend/src/pages/. Updated App.tsx with react-router-dom Routes and main.tsx with BrowserRouter. Unknown routes redirect to dashboard. Installed vitest + @testing-library/react for testing. 9 routing tests in App.test.tsx all passing. Build completes cleanly. All 132 backend tests still pass.
|
||||
|
||||
- [x] Create frontend/src/api/client.ts with a typed API client using fetch. Include JWT token management (stored in memory, not localStorage), request/response interceptors for auth headers, and typed wrapper functions for each API endpoint group. Include WebSocket connection helper.
|
||||
> Created frontend/src/api/client.ts with: TypeScript interfaces mirroring all backend Pydantic schemas, in-memory JWT token management (setToken/getToken/clearToken — never localStorage), automatic Authorization header injection on all requests, Content-Type header for POST/PUT bodies, ApiError class for non-ok responses, typed wrapper functions for all 8 endpoint groups (auth, projects, experiments, runs, endpoints, export, webhooks, admin) plus health check, and connectWebSocket() helper that derives ws/wss from current protocol and handles JSON message parsing. 39 tests in src/api/client.test.ts covering token management, header injection, all endpoint groups, error handling, and WebSocket lifecycle. All 48 frontend tests pass. All 132 backend tests still pass.
|
||||
|
||||
- [x] Verify the full stack runs: docker compose up should start all services. The API should respond to /health. The frontend should load and show the setup screen (since no admin exists). The database migration should have run. Document any manual steps needed in the README.
|
||||
> Created missing backend/worker.py (Celery app config for docker-compose worker service). Created docker/entrypoint.sh that runs `alembic upgrade head` before starting uvicorn, and updated Dockerfile to use it as ENTRYPOINT. Fixed README single-container quick-start (port 8000, not 8400) and added production compose docs (service list, first-boot instructions). Added 24 stack integration tests verifying all Docker/compose/nginx/frontend/alembic files are present and consistent, plus /health endpoint test. 3 worker tests confirm Celery config. All 159 backend + 48 frontend tests pass.
|
||||
127
CLAUDE.md
Normal file
127
CLAUDE.md
Normal file
|
|
@ -0,0 +1,127 @@
|
|||
# CLAUDE.md — PromptLooper
|
||||
|
||||
## What is this project?
|
||||
|
||||
PromptLooper is a self-hosted LLM pipeline tuning workbench. It runs experiments across prompt × model × parameter combinations, caches every response, scores results, and surfaces optimal configurations through a real-time dashboard. It has an MCP server so AI agents can drive it programmatically.
|
||||
|
||||
## Repository
|
||||
|
||||
- **Hosted at**: git.xpltd.co/xpltdco/promptlooper
|
||||
- **XPLTD project name**: `xpltd_promptlooper`
|
||||
- **Sister project**: Chrysopedia (git.xpltd.co/xpltdco/chrysopedia) — a knowledge extraction pipeline that is PromptLooper's first integration target
|
||||
|
||||
## Tech Stack
|
||||
|
||||
- **Backend**: Python 3.12, FastAPI, Celery, SQLAlchemy, Alembic
|
||||
- **Frontend**: React 18, TypeScript, Vite, Tailwind CSS
|
||||
- **Database**: PostgreSQL 16 (production) / SQLite (single-container mode)
|
||||
- **Cache/Queue**: Redis 7 (production) / in-process (single-container)
|
||||
- **Real-time**: WebSocket via FastAPI + Redis pub/sub
|
||||
- **MCP**: Python MCP SDK
|
||||
- **Container**: Multi-stage Docker build, nginx for frontend
|
||||
|
||||
## XPLTD Conventions
|
||||
|
||||
These are non-negotiable project conventions shared across all XPLTD projects:
|
||||
|
||||
- Docker Compose project name: `xpltd_promptlooper`
|
||||
- Dedicated bridge network: `promptlooper` (`172.33.0.0/24`)
|
||||
- Persistent data bind mounts under `/vmPool/r/services/promptlooper_*`
|
||||
- PostgreSQL on external port `5434` (internal `5432`)
|
||||
- Web UI on port `8400`
|
||||
- MCP server on port `8401`
|
||||
- Container naming: `promptlooper-{service}` (e.g., `promptlooper-api`, `promptlooper-db`)
|
||||
|
||||
## Key Architecture Decisions
|
||||
|
||||
1. **No LLM runs inside PromptLooper itself** — it's purely an HTTP client that calls external LLM endpoints. The only exception is the optional "LLM-as-judge" scorer.
|
||||
2. **Response caching by config hash** — SHA-256 of (prompt + model + params + input). Cache hits return instantly. This is critical for cost control.
|
||||
3. **Single-container mode** — when `DATABASE_URL` is not set, use SQLite + in-process queue. Zero dependencies.
|
||||
4. **WebSocket for real-time** — the dashboard connects via WebSocket to receive run progress, score updates, and steering events.
|
||||
5. **Pluggable scorers** — all scoring functions implement a base class with `score(input, output, context) → float` signature.
|
||||
6. **OpenAI-compatible adapter** — the LLM adapter layer speaks OpenAI's chat completions API. This covers OpenWebUI, vLLM, Ollama, and most providers.
|
||||
|
||||
## File Organization
|
||||
|
||||
```
|
||||
backend/
|
||||
main.py — FastAPI app, middleware, router mounting
|
||||
config.py — Pydantic Settings from env vars
|
||||
models.py — SQLAlchemy ORM models
|
||||
schemas.py — Pydantic request/response schemas
|
||||
auth.py — JWT + API key authentication
|
||||
worker.py — Celery app configuration
|
||||
routers/ — API endpoint handlers
|
||||
engine/ — Core experiment execution logic
|
||||
runner.py — Individual run execution
|
||||
sweep.py — Sweep orchestration (grid/random/guided)
|
||||
cache.py — Response cache layer
|
||||
adapters/ — LLM endpoint adapters
|
||||
scorers/ — Pluggable scoring functions
|
||||
mcp/ — MCP server implementation
|
||||
websocket/ — WebSocket connection management
|
||||
|
||||
frontend/src/
|
||||
pages/ — Route-level components
|
||||
components/ — Shared UI components
|
||||
api/ — Typed API client functions
|
||||
```
|
||||
|
||||
## Database Migrations
|
||||
|
||||
Use Alembic. Same patterns as Chrysopedia:
|
||||
```bash
|
||||
alembic revision --autogenerate -m "describe_change"
|
||||
alembic upgrade head
|
||||
```
|
||||
|
||||
## Running Locally
|
||||
|
||||
```bash
|
||||
docker compose up -d promptlooper-db promptlooper-redis
|
||||
cd backend && uvicorn main:app --reload --host 0.0.0.0 --port 8000
|
||||
# Frontend in another terminal:
|
||||
cd frontend && npm run dev
|
||||
```
|
||||
|
||||
## Testing
|
||||
|
||||
```bash
|
||||
cd backend && pytest
|
||||
cd frontend && npm test
|
||||
```
|
||||
|
||||
## Important Patterns
|
||||
|
||||
### Adding a new scorer
|
||||
1. Create `backend/engine/scorers/my_scorer.py`
|
||||
2. Implement `BaseScorer` with `name`, `score(input, output, context) → float`
|
||||
3. Register in `backend/engine/scorers/__init__.py`
|
||||
4. Add to frontend scorer picker component
|
||||
|
||||
### Adding a new LLM adapter
|
||||
1. Create `backend/engine/adapters/my_adapter.py`
|
||||
2. Implement `BaseAdapter` with `complete(prompt, model, params) → response`
|
||||
3. Register in `backend/engine/adapters/__init__.py`
|
||||
4. Currently only OpenAI-compatible is implemented; all others should be edge cases
|
||||
|
||||
### Adding a new MCP tool
|
||||
1. Add tool definition in `backend/mcp/tools.py`
|
||||
2. Implement handler in `backend/mcp/server.py`
|
||||
3. Tools should map 1:1 to API endpoints where possible
|
||||
|
||||
## Common Gotchas
|
||||
|
||||
- Always hash the FULL config when checking cache — missing a single parameter means cache misses
|
||||
- WebSocket connections must be cleaned up on disconnect — use the connection manager
|
||||
- SQLite mode doesn't support concurrent writes — the in-process queue must be single-threaded
|
||||
- Frontend must handle both WebSocket and polling fallback for environments where WS is blocked
|
||||
- MCP server runs on a separate port from the main API
|
||||
|
||||
## Deployment
|
||||
|
||||
```bash
|
||||
ssh ub01
|
||||
cd /vmPool/r/repos/xpltdco/promptlooper
|
||||
git pull && docker compose build && docker compose up -d
|
||||
```
|
||||
80
README.md
80
README.md
|
|
@ -1,3 +1,79 @@
|
|||
# promptlooper
|
||||
# PromptLooper
|
||||
|
||||
Universal LLM pipeline tuning workbench — systematically optimize prompts, models, and inference parameters through cached experiments, pluggable scoring, and agent-driven sweeps via MCP.
|
||||
[](https://www.gnu.org/licenses/agpl-3.0)
|
||||
[]()
|
||||
|
||||
> The one who loops prompts — a universal LLM pipeline tuning workbench.
|
||||
|
||||
PromptLooper is a self-hosted tool for systematically optimizing LLM prompts, model selection, and inference parameters. It runs experiments across prompt x model x parameter combinations, caches every response, scores results against pluggable evaluation functions, and surfaces the best configurations through a real-time observability dashboard with human-in-the-loop steering.
|
||||
|
||||
It ships as a single Docker container (SQLite mode) for zero-config quickstart, or a Docker Compose stack (Postgres + Redis) for production use. An MCP server enables any AI agent to drive PromptLooper programmatically — creating experiments, running sweeps, and reading results without human intervention.
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Single Container (zero dependencies)
|
||||
|
||||
```bash
|
||||
docker run -p 8000:8000 -v promptlooper-data:/data ghcr.io/xpltdco/promptlooper
|
||||
```
|
||||
|
||||
Open `http://localhost:8000` — you'll be prompted to create an admin account on first boot.
|
||||
|
||||
> In single-container mode, the API serves the built frontend as static files at the root.
|
||||
> Database migrations run automatically on startup.
|
||||
|
||||
### Production (Docker Compose)
|
||||
|
||||
```bash
|
||||
git clone git@git.xpltd.co:xpltdco/promptlooper.git
|
||||
cd promptlooper
|
||||
cp .env.example .env
|
||||
# Edit .env — set JWT_SECRET at minimum
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
Open `http://localhost:8400` — nginx proxies the frontend (port 80 → 8400) and API (`/api/` → port 8000).
|
||||
|
||||
**Services started:**
|
||||
- `promptlooper-db` — PostgreSQL 16 on port 5434
|
||||
- `promptlooper-redis` — Redis 7
|
||||
- `promptlooper-api` — FastAPI + Alembic migrations (auto-runs on startup)
|
||||
- `promptlooper-worker` — Celery worker for experiment execution
|
||||
- `promptlooper-web` — Nginx reverse proxy on port 8400
|
||||
|
||||
**First boot:** Navigate to `http://localhost:8400/setup` to create the admin account.
|
||||
|
||||
## Features
|
||||
|
||||
- **Systematic experimentation** — grid, random, and guided sweeps across prompt x model x parameter space
|
||||
- **Response caching** — SHA-256 deduplication means re-runs cost zero tokens
|
||||
- **Pluggable scoring** — embedding similarity, format compliance, keyword presence, LLM-as-judge, human rating, custom webhooks
|
||||
- **Real-time dashboard** — live progress, leaderboard, side-by-side comparison, steering controls
|
||||
- **MCP server** — AI agents can create experiments, run sweeps, and export results programmatically
|
||||
- **Single-container mode** — SQLite + in-process queue when no external dependencies are configured
|
||||
|
||||
## Development
|
||||
|
||||
```bash
|
||||
# Start backing services
|
||||
docker compose up -d promptlooper-db promptlooper-redis
|
||||
|
||||
# Backend
|
||||
cd backend && pip install -r requirements.txt
|
||||
alembic upgrade head
|
||||
uvicorn main:app --reload --host 0.0.0.0 --port 8000
|
||||
|
||||
# Frontend (separate terminal)
|
||||
cd frontend && npm install && npm run dev
|
||||
```
|
||||
|
||||
## Testing
|
||||
|
||||
```bash
|
||||
cd backend && pytest
|
||||
cd frontend && npm test
|
||||
```
|
||||
|
||||
## License
|
||||
|
||||
[AGPL-3.0](https://www.gnu.org/licenses/agpl-3.0.html)
|
||||
|
|
|
|||
39
alembic.ini
Normal file
39
alembic.ini
Normal file
|
|
@ -0,0 +1,39 @@
|
|||
[alembic]
|
||||
script_location = alembic
|
||||
# sqlalchemy.url is set programmatically in env.py from backend.config
|
||||
sqlalchemy.url =
|
||||
|
||||
[post_write_hooks]
|
||||
|
||||
[loggers]
|
||||
keys = root,sqlalchemy,alembic
|
||||
|
||||
[handlers]
|
||||
keys = console
|
||||
|
||||
[formatters]
|
||||
keys = generic
|
||||
|
||||
[logger_root]
|
||||
level = WARN
|
||||
handlers = console
|
||||
|
||||
[logger_sqlalchemy]
|
||||
level = WARN
|
||||
handlers =
|
||||
qualname = sqlalchemy.engine
|
||||
|
||||
[logger_alembic]
|
||||
level = INFO
|
||||
handlers =
|
||||
qualname = alembic
|
||||
|
||||
[handler_console]
|
||||
class = StreamHandler
|
||||
args = (sys.stderr,)
|
||||
level = NOTSET
|
||||
formatter = generic
|
||||
|
||||
[formatter_generic]
|
||||
format = %(levelname)-5.5s [%(name)s] %(message)s
|
||||
datefmt = %H:%M:%S
|
||||
66
alembic/env.py
Normal file
66
alembic/env.py
Normal file
|
|
@ -0,0 +1,66 @@
|
|||
"""Alembic environment configuration for PromptLooper."""
|
||||
|
||||
import sys
|
||||
from logging.config import fileConfig
|
||||
from pathlib import Path
|
||||
|
||||
from alembic import context
|
||||
from sqlalchemy import engine_from_config, pool
|
||||
|
||||
# Ensure the backend package is importable
|
||||
sys.path.insert(0, str(Path(__file__).resolve().parents[1]))
|
||||
|
||||
from backend.config import settings
|
||||
from backend.models import Base
|
||||
|
||||
config = context.config
|
||||
|
||||
if config.config_file_name is not None:
|
||||
fileConfig(config.config_file_name)
|
||||
|
||||
# Use sqlalchemy.url from alembic config if already set (e.g. by tests),
|
||||
# otherwise fall back to application settings.
|
||||
if not config.get_main_option("sqlalchemy.url"):
|
||||
config.set_main_option("sqlalchemy.url", settings.effective_database_url)
|
||||
|
||||
target_metadata = Base.metadata
|
||||
|
||||
|
||||
def run_migrations_offline() -> None:
|
||||
"""Run migrations in 'offline' mode — emit SQL to stdout."""
|
||||
url = config.get_main_option("sqlalchemy.url")
|
||||
context.configure(
|
||||
url=url,
|
||||
target_metadata=target_metadata,
|
||||
literal_binds=True,
|
||||
dialect_opts={"paramstyle": "named"},
|
||||
render_as_batch=True,
|
||||
)
|
||||
|
||||
with context.begin_transaction():
|
||||
context.run_migrations()
|
||||
|
||||
|
||||
def run_migrations_online() -> None:
|
||||
"""Run migrations against a live database connection."""
|
||||
connectable = engine_from_config(
|
||||
config.get_section(config.config_ini_section, {}),
|
||||
prefix="sqlalchemy.",
|
||||
poolclass=pool.NullPool,
|
||||
)
|
||||
|
||||
with connectable.connect() as connection:
|
||||
context.configure(
|
||||
connection=connection,
|
||||
target_metadata=target_metadata,
|
||||
render_as_batch=True,
|
||||
)
|
||||
|
||||
with context.begin_transaction():
|
||||
context.run_migrations()
|
||||
|
||||
|
||||
if context.is_offline_mode():
|
||||
run_migrations_offline()
|
||||
else:
|
||||
run_migrations_online()
|
||||
26
alembic/script.py.mako
Normal file
26
alembic/script.py.mako
Normal file
|
|
@ -0,0 +1,26 @@
|
|||
"""${message}
|
||||
|
||||
Revision ID: ${up_revision}
|
||||
Revises: ${down_revision | comma,n}
|
||||
Create Date: ${create_date}
|
||||
|
||||
"""
|
||||
from typing import Sequence, Union
|
||||
|
||||
from alembic import op
|
||||
import sqlalchemy as sa
|
||||
${imports if imports else ""}
|
||||
|
||||
# revision identifiers, used by Alembic.
|
||||
revision: str = ${repr(up_revision)}
|
||||
down_revision: Union[str, None] = ${repr(down_revision)}
|
||||
branch_labels: Union[str, Sequence[str], None] = ${repr(branch_labels)}
|
||||
depends_on: Union[str, Sequence[str], None] = ${repr(depends_on)}
|
||||
|
||||
|
||||
def upgrade() -> None:
|
||||
${upgrades if upgrades else "pass"}
|
||||
|
||||
|
||||
def downgrade() -> None:
|
||||
${downgrades if downgrades else "pass"}
|
||||
0
alembic/versions/.gitkeep
Normal file
0
alembic/versions/.gitkeep
Normal file
165
alembic/versions/e1909678e89e_initial_schema.py
Normal file
165
alembic/versions/e1909678e89e_initial_schema.py
Normal file
|
|
@ -0,0 +1,165 @@
|
|||
"""initial_schema
|
||||
|
||||
Revision ID: e1909678e89e
|
||||
Revises:
|
||||
Create Date: 2026-04-07 01:50:18.571150
|
||||
|
||||
"""
|
||||
from typing import Sequence, Union
|
||||
|
||||
from alembic import op
|
||||
import sqlalchemy as sa
|
||||
|
||||
|
||||
# revision identifiers, used by Alembic.
|
||||
revision: str = 'e1909678e89e'
|
||||
down_revision: Union[str, None] = None
|
||||
branch_labels: Union[str, Sequence[str], None] = None
|
||||
depends_on: Union[str, Sequence[str], None] = None
|
||||
|
||||
|
||||
def upgrade() -> None:
|
||||
# ### commands auto generated by Alembic - please adjust! ###
|
||||
op.create_table('response_cache',
|
||||
sa.Column('config_hash', sa.String(length=64), nullable=False),
|
||||
sa.Column('response', sa.Text(), nullable=False),
|
||||
sa.Column('model', sa.String(length=255), nullable=False),
|
||||
sa.Column('tokens_in', sa.Integer(), nullable=True),
|
||||
sa.Column('tokens_out', sa.Integer(), nullable=True),
|
||||
sa.Column('latency_ms', sa.Integer(), nullable=True),
|
||||
sa.Column('created_at', sa.DateTime(timezone=True), nullable=False),
|
||||
sa.PrimaryKeyConstraint('config_hash')
|
||||
)
|
||||
op.create_table('users',
|
||||
sa.Column('id', sa.Uuid(), nullable=False),
|
||||
sa.Column('username', sa.String(length=255), nullable=False),
|
||||
sa.Column('password_hash', sa.String(length=255), nullable=False),
|
||||
sa.Column('is_admin', sa.Boolean(), nullable=False),
|
||||
sa.Column('created_at', sa.DateTime(timezone=True), nullable=False),
|
||||
sa.PrimaryKeyConstraint('id'),
|
||||
sa.UniqueConstraint('username')
|
||||
)
|
||||
op.create_table('webhook_configs',
|
||||
sa.Column('id', sa.Uuid(), nullable=False),
|
||||
sa.Column('event_type', sa.String(length=255), nullable=False),
|
||||
sa.Column('url', sa.String(length=2048), nullable=False),
|
||||
sa.Column('headers', sa.JSON(), nullable=True),
|
||||
sa.Column('is_active', sa.Boolean(), nullable=False),
|
||||
sa.PrimaryKeyConstraint('id')
|
||||
)
|
||||
with op.batch_alter_table('webhook_configs', schema=None) as batch_op:
|
||||
batch_op.create_index('ix_webhook_configs_event_type', ['event_type'], unique=False)
|
||||
|
||||
op.create_table('projects',
|
||||
sa.Column('id', sa.Uuid(), nullable=False),
|
||||
sa.Column('name', sa.String(length=255), nullable=False),
|
||||
sa.Column('description', sa.Text(), nullable=True),
|
||||
sa.Column('owner_id', sa.Uuid(), nullable=False),
|
||||
sa.Column('created_at', sa.DateTime(timezone=True), nullable=False),
|
||||
sa.Column('updated_at', sa.DateTime(timezone=True), nullable=False),
|
||||
sa.ForeignKeyConstraint(['owner_id'], ['users.id'], ondelete='CASCADE'),
|
||||
sa.PrimaryKeyConstraint('id')
|
||||
)
|
||||
op.create_table('experiments',
|
||||
sa.Column('id', sa.Uuid(), nullable=False),
|
||||
sa.Column('project_id', sa.Uuid(), nullable=False),
|
||||
sa.Column('name', sa.String(length=255), nullable=False),
|
||||
sa.Column('description', sa.Text(), nullable=True),
|
||||
sa.Column('sample_data', sa.JSON(), nullable=True),
|
||||
sa.Column('pipeline_stages', sa.JSON(), nullable=True),
|
||||
sa.Column('scoring_config', sa.JSON(), nullable=True),
|
||||
sa.Column('parameter_space', sa.JSON(), nullable=True),
|
||||
sa.Column('status', sa.Enum('draft', 'running', 'paused', 'completed', name='experiment_status'), nullable=False),
|
||||
sa.Column('created_at', sa.DateTime(timezone=True), nullable=False),
|
||||
sa.Column('updated_at', sa.DateTime(timezone=True), nullable=False),
|
||||
sa.ForeignKeyConstraint(['project_id'], ['projects.id'], ondelete='CASCADE'),
|
||||
sa.PrimaryKeyConstraint('id')
|
||||
)
|
||||
with op.batch_alter_table('experiments', schema=None) as batch_op:
|
||||
batch_op.create_index('ix_experiments_project_id', ['project_id'], unique=False)
|
||||
batch_op.create_index('ix_experiments_status', ['status'], unique=False)
|
||||
|
||||
op.create_table('runs',
|
||||
sa.Column('id', sa.Uuid(), nullable=False),
|
||||
sa.Column('experiment_id', sa.Uuid(), nullable=False),
|
||||
sa.Column('config_hash', sa.String(length=64), nullable=False),
|
||||
sa.Column('config', sa.JSON(), nullable=False),
|
||||
sa.Column('status', sa.Enum('pending', 'running', 'completed', 'failed', 'cached', name='run_status'), nullable=False),
|
||||
sa.Column('started_at', sa.DateTime(timezone=True), nullable=True),
|
||||
sa.Column('completed_at', sa.DateTime(timezone=True), nullable=True),
|
||||
sa.Column('duration_ms', sa.Integer(), nullable=True),
|
||||
sa.Column('tokens_in', sa.Integer(), nullable=True),
|
||||
sa.Column('tokens_out', sa.Integer(), nullable=True),
|
||||
sa.Column('cost_estimate', sa.Numeric(precision=12, scale=6), nullable=True),
|
||||
sa.ForeignKeyConstraint(['experiment_id'], ['experiments.id'], ondelete='CASCADE'),
|
||||
sa.PrimaryKeyConstraint('id')
|
||||
)
|
||||
with op.batch_alter_table('runs', schema=None) as batch_op:
|
||||
batch_op.create_index('ix_runs_config_hash', ['config_hash'], unique=False)
|
||||
batch_op.create_index('ix_runs_experiment_id', ['experiment_id'], unique=False)
|
||||
batch_op.create_index('ix_runs_status', ['status'], unique=False)
|
||||
|
||||
op.create_table('scores',
|
||||
sa.Column('id', sa.Uuid(), nullable=False),
|
||||
sa.Column('run_id', sa.Uuid(), nullable=False),
|
||||
sa.Column('scorer_name', sa.String(length=255), nullable=False),
|
||||
sa.Column('value', sa.Float(), nullable=False),
|
||||
sa.Column('metadata', sa.JSON(), nullable=True),
|
||||
sa.Column('created_at', sa.DateTime(timezone=True), nullable=False),
|
||||
sa.ForeignKeyConstraint(['run_id'], ['runs.id'], ondelete='CASCADE'),
|
||||
sa.PrimaryKeyConstraint('id')
|
||||
)
|
||||
with op.batch_alter_table('scores', schema=None) as batch_op:
|
||||
batch_op.create_index('ix_scores_run_id', ['run_id'], unique=False)
|
||||
batch_op.create_index('ix_scores_scorer_name', ['scorer_name'], unique=False)
|
||||
|
||||
op.create_table('stage_results',
|
||||
sa.Column('id', sa.Uuid(), nullable=False),
|
||||
sa.Column('run_id', sa.Uuid(), nullable=False),
|
||||
sa.Column('stage_index', sa.Integer(), nullable=False),
|
||||
sa.Column('prompt_sent', sa.Text(), nullable=False),
|
||||
sa.Column('response_raw', sa.Text(), nullable=False),
|
||||
sa.Column('model_used', sa.String(length=255), nullable=False),
|
||||
sa.Column('parameters', sa.JSON(), nullable=True),
|
||||
sa.Column('tokens_in', sa.Integer(), nullable=True),
|
||||
sa.Column('tokens_out', sa.Integer(), nullable=True),
|
||||
sa.Column('latency_ms', sa.Integer(), nullable=True),
|
||||
sa.ForeignKeyConstraint(['run_id'], ['runs.id'], ondelete='CASCADE'),
|
||||
sa.PrimaryKeyConstraint('id')
|
||||
)
|
||||
with op.batch_alter_table('stage_results', schema=None) as batch_op:
|
||||
batch_op.create_index('ix_stage_results_run_id', ['run_id'], unique=False)
|
||||
|
||||
# ### end Alembic commands ###
|
||||
|
||||
|
||||
def downgrade() -> None:
|
||||
# ### commands auto generated by Alembic - please adjust! ###
|
||||
with op.batch_alter_table('stage_results', schema=None) as batch_op:
|
||||
batch_op.drop_index('ix_stage_results_run_id')
|
||||
|
||||
op.drop_table('stage_results')
|
||||
with op.batch_alter_table('scores', schema=None) as batch_op:
|
||||
batch_op.drop_index('ix_scores_scorer_name')
|
||||
batch_op.drop_index('ix_scores_run_id')
|
||||
|
||||
op.drop_table('scores')
|
||||
with op.batch_alter_table('runs', schema=None) as batch_op:
|
||||
batch_op.drop_index('ix_runs_status')
|
||||
batch_op.drop_index('ix_runs_experiment_id')
|
||||
batch_op.drop_index('ix_runs_config_hash')
|
||||
|
||||
op.drop_table('runs')
|
||||
with op.batch_alter_table('experiments', schema=None) as batch_op:
|
||||
batch_op.drop_index('ix_experiments_status')
|
||||
batch_op.drop_index('ix_experiments_project_id')
|
||||
|
||||
op.drop_table('experiments')
|
||||
op.drop_table('projects')
|
||||
with op.batch_alter_table('webhook_configs', schema=None) as batch_op:
|
||||
batch_op.drop_index('ix_webhook_configs_event_type')
|
||||
|
||||
op.drop_table('webhook_configs')
|
||||
op.drop_table('users')
|
||||
op.drop_table('response_cache')
|
||||
# ### end Alembic commands ###
|
||||
0
backend/__init__.py
Normal file
0
backend/__init__.py
Normal file
154
backend/auth.py
Normal file
154
backend/auth.py
Normal file
|
|
@ -0,0 +1,154 @@
|
|||
"""PromptLooper authentication — JWT tokens, API keys, first-boot setup."""
|
||||
|
||||
import uuid as _uuid
|
||||
from datetime import datetime, timedelta, timezone
|
||||
from typing import Generator
|
||||
|
||||
from fastapi import Depends, HTTPException, Header, status
|
||||
from jose import JWTError, jwt
|
||||
from passlib.context import CryptContext
|
||||
from sqlalchemy.orm import Session
|
||||
|
||||
from config import settings
|
||||
from models import User
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Password hashing
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
|
||||
|
||||
|
||||
def hash_password(password: str) -> str:
|
||||
return pwd_context.hash(password)
|
||||
|
||||
|
||||
def verify_password(plain: str, hashed: str) -> bool:
|
||||
return pwd_context.verify(plain, hashed)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# JWT
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
ALGORITHM = "HS256"
|
||||
ACCESS_TOKEN_EXPIRE_MINUTES = 60 * 24 # 24 hours
|
||||
|
||||
|
||||
def create_access_token(user_id: str, *, expires_delta: timedelta | None = None) -> str:
|
||||
expire = datetime.now(timezone.utc) + (expires_delta or timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES))
|
||||
payload = {"sub": user_id, "exp": expire}
|
||||
return jwt.encode(payload, settings.jwt_secret, algorithm=ALGORITHM)
|
||||
|
||||
|
||||
def decode_access_token(token: str) -> str:
|
||||
"""Return the user_id (sub) from a valid JWT, or raise."""
|
||||
try:
|
||||
payload = jwt.decode(token, settings.jwt_secret, algorithms=[ALGORITHM])
|
||||
user_id: str | None = payload.get("sub")
|
||||
if user_id is None:
|
||||
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid token")
|
||||
return user_id
|
||||
except JWTError:
|
||||
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid token")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# First-boot setup
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def needs_setup(db: Session) -> bool:
|
||||
"""Return True if no users exist yet (first-boot state)."""
|
||||
return db.query(User).count() == 0
|
||||
|
||||
|
||||
def create_admin(db: Session, username: str, password: str) -> User:
|
||||
"""Create the first admin user. Raises if users already exist."""
|
||||
if not needs_setup(db):
|
||||
raise HTTPException(
|
||||
status_code=status.HTTP_409_CONFLICT,
|
||||
detail="Admin account already exists",
|
||||
)
|
||||
user = User(
|
||||
username=username,
|
||||
password_hash=hash_password(password),
|
||||
is_admin=True,
|
||||
)
|
||||
db.add(user)
|
||||
db.commit()
|
||||
db.refresh(user)
|
||||
return user
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Authenticate (login)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def authenticate_user(db: Session, username: str, password: str) -> User:
|
||||
"""Verify credentials and return the User, or raise 401."""
|
||||
user = db.query(User).filter(User.username == username).first()
|
||||
if user is None or not verify_password(password, user.password_hash):
|
||||
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid credentials")
|
||||
return user
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Database session dependency (local to avoid circular import with main.py)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _get_db() -> Generator[Session, None, None]:
|
||||
"""Yield a DB session. Imported lazily from main to avoid circular import."""
|
||||
from main import get_db
|
||||
yield from get_db()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Dependency: get current user (JWT or API key)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def get_current_user(
|
||||
authorization: str | None = Header(None),
|
||||
x_api_key: str | None = Header(None),
|
||||
db: Session = Depends(_get_db),
|
||||
) -> User:
|
||||
"""FastAPI dependency — resolve the current user from JWT Bearer token or API key.
|
||||
|
||||
Priority:
|
||||
1. X-Api-Key header — matched against settings.api_key (grants first admin).
|
||||
2. Authorization: Bearer <jwt> — decoded to get user_id.
|
||||
"""
|
||||
# --- API key path ---
|
||||
if x_api_key is not None:
|
||||
if settings.api_key is None or x_api_key != settings.api_key:
|
||||
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid API key")
|
||||
# API key grants the first admin user
|
||||
admin = db.query(User).filter(User.is_admin.is_(True)).first()
|
||||
if admin is None:
|
||||
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="No admin user exists")
|
||||
return admin
|
||||
|
||||
# --- JWT path ---
|
||||
if authorization is None:
|
||||
raise HTTPException(
|
||||
status_code=status.HTTP_401_UNAUTHORIZED,
|
||||
detail="Missing authentication",
|
||||
headers={"WWW-Authenticate": "Bearer"},
|
||||
)
|
||||
|
||||
scheme, _, token = authorization.partition(" ")
|
||||
if scheme.lower() != "bearer" or not token:
|
||||
raise HTTPException(
|
||||
status_code=status.HTTP_401_UNAUTHORIZED,
|
||||
detail="Invalid authorization header",
|
||||
headers={"WWW-Authenticate": "Bearer"},
|
||||
)
|
||||
|
||||
user_id_str = decode_access_token(token)
|
||||
try:
|
||||
user_id = _uuid.UUID(user_id_str)
|
||||
except ValueError:
|
||||
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid token")
|
||||
user = db.query(User).filter(User.id == user_id).first()
|
||||
if user is None:
|
||||
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="User not found")
|
||||
return user
|
||||
76
backend/config.py
Normal file
76
backend/config.py
Normal file
|
|
@ -0,0 +1,76 @@
|
|||
"""PromptLooper configuration — Pydantic Settings loaded from environment."""
|
||||
|
||||
import secrets
|
||||
from pathlib import Path
|
||||
|
||||
from pydantic import field_validator
|
||||
from pydantic_settings import BaseSettings, SettingsConfigDict
|
||||
|
||||
|
||||
class Settings(BaseSettings):
|
||||
model_config = SettingsConfigDict(
|
||||
env_file=".env",
|
||||
env_file_encoding="utf-8",
|
||||
extra="ignore",
|
||||
)
|
||||
|
||||
# --- Database ---
|
||||
database_url: str | None = None
|
||||
|
||||
# --- Redis ---
|
||||
redis_url: str | None = None
|
||||
|
||||
# --- Server ---
|
||||
host: str = "0.0.0.0"
|
||||
port: int = 8400
|
||||
|
||||
# --- Auth ---
|
||||
jwt_secret: str = ""
|
||||
api_key: str | None = None
|
||||
|
||||
# --- Default LLM Endpoint ---
|
||||
default_endpoint_url: str | None = None
|
||||
default_endpoint_key: str | None = None
|
||||
|
||||
# --- Limits ---
|
||||
max_concurrent_runs: int = 4
|
||||
max_tokens_per_sweep: int = 0 # 0 = unlimited
|
||||
|
||||
# --- Storage ---
|
||||
data_dir: str = "/data"
|
||||
|
||||
# --- MCP ---
|
||||
mcp_enabled: bool = True
|
||||
mcp_port: int = 8401
|
||||
|
||||
def model_post_init(self, __context: object) -> None:
|
||||
# Auto-generate JWT secret if not provided
|
||||
if not self.jwt_secret:
|
||||
self.jwt_secret = secrets.token_urlsafe(32)
|
||||
|
||||
@property
|
||||
def effective_database_url(self) -> str:
|
||||
"""Return DATABASE_URL or construct a SQLite URL from DATA_DIR."""
|
||||
if self.database_url:
|
||||
return self.database_url
|
||||
db_path = Path(self.data_dir) / "promptlooper.db"
|
||||
return f"sqlite:///{db_path}"
|
||||
|
||||
@property
|
||||
def is_sqlite(self) -> bool:
|
||||
return self.effective_database_url.startswith("sqlite")
|
||||
|
||||
@property
|
||||
def use_in_process_queue(self) -> bool:
|
||||
"""When Redis is unavailable, use in-process task execution."""
|
||||
return self.redis_url is None
|
||||
|
||||
@field_validator("api_key", mode="before")
|
||||
@classmethod
|
||||
def empty_string_to_none(cls, v: str | None) -> str | None:
|
||||
if v is not None and v.strip() == "":
|
||||
return None
|
||||
return v
|
||||
|
||||
|
||||
settings = Settings()
|
||||
0
backend/engine/__init__.py
Normal file
0
backend/engine/__init__.py
Normal file
0
backend/engine/adapters/__init__.py
Normal file
0
backend/engine/adapters/__init__.py
Normal file
0
backend/engine/scorers/__init__.py
Normal file
0
backend/engine/scorers/__init__.py
Normal file
211
backend/main.py
Normal file
211
backend/main.py
Normal file
|
|
@ -0,0 +1,211 @@
|
|||
"""PromptLooper FastAPI application."""
|
||||
|
||||
from contextlib import asynccontextmanager
|
||||
from typing import AsyncGenerator
|
||||
|
||||
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
|
||||
from fastapi.middleware.cors import CORSMiddleware
|
||||
from sqlalchemy import create_engine, text
|
||||
from sqlalchemy.orm import sessionmaker
|
||||
|
||||
from config import settings
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Database engine & session factory (lazy, created at startup)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
engine = None
|
||||
SessionLocal = None
|
||||
|
||||
|
||||
def _init_db() -> None:
|
||||
"""Create the SQLAlchemy engine and session factory."""
|
||||
global engine, SessionLocal
|
||||
connect_args = {}
|
||||
if settings.is_sqlite:
|
||||
connect_args["check_same_thread"] = False
|
||||
engine = create_engine(
|
||||
settings.effective_database_url,
|
||||
connect_args=connect_args,
|
||||
)
|
||||
SessionLocal = sessionmaker(bind=engine, autoflush=False, expire_on_commit=False)
|
||||
|
||||
|
||||
def get_db():
|
||||
"""FastAPI dependency that yields a database session."""
|
||||
db = SessionLocal()
|
||||
try:
|
||||
yield db
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Redis helper
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_redis_client = None
|
||||
|
||||
|
||||
def _init_redis() -> None:
|
||||
"""Connect to Redis if configured."""
|
||||
global _redis_client
|
||||
if not settings.redis_url:
|
||||
_redis_client = None
|
||||
return
|
||||
import redis as redis_lib
|
||||
_redis_client = redis_lib.Redis.from_url(settings.redis_url, decode_responses=True)
|
||||
|
||||
|
||||
def get_redis():
|
||||
"""Return the Redis client (or None in single-container mode)."""
|
||||
return _redis_client
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# WebSocket connection manager
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class ConnectionManager:
|
||||
"""Manage active WebSocket connections."""
|
||||
|
||||
def __init__(self) -> None:
|
||||
self.active_connections: list[WebSocket] = []
|
||||
|
||||
async def connect(self, websocket: WebSocket) -> None:
|
||||
await websocket.accept()
|
||||
self.active_connections.append(websocket)
|
||||
|
||||
def disconnect(self, websocket: WebSocket) -> None:
|
||||
self.active_connections.remove(websocket)
|
||||
|
||||
async def broadcast(self, message: dict) -> None:
|
||||
for connection in list(self.active_connections):
|
||||
try:
|
||||
await connection.send_json(message)
|
||||
except Exception:
|
||||
self.disconnect(connection)
|
||||
|
||||
|
||||
ws_manager = ConnectionManager()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Lifecycle
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@asynccontextmanager
|
||||
async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
|
||||
"""Startup and shutdown lifecycle hooks."""
|
||||
_init_db()
|
||||
_init_redis()
|
||||
yield
|
||||
# Shutdown: clean up connections
|
||||
if _redis_client is not None:
|
||||
_redis_client.close()
|
||||
if engine is not None:
|
||||
engine.dispose()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Application
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
app = FastAPI(
|
||||
title="PromptLooper",
|
||||
description="LLM pipeline tuning workbench",
|
||||
version="0.1.0",
|
||||
lifespan=lifespan,
|
||||
)
|
||||
|
||||
# CORS — allow all origins in development; tighten in production via env
|
||||
app.add_middleware(
|
||||
CORSMiddleware,
|
||||
allow_origins=["*"],
|
||||
allow_credentials=True,
|
||||
allow_methods=["*"],
|
||||
allow_headers=["*"],
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Health endpoint
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@app.get("/health", tags=["system"])
|
||||
def health_check() -> dict:
|
||||
"""Check DB and Redis connectivity."""
|
||||
db_ok = False
|
||||
redis_ok = False
|
||||
|
||||
# Database check
|
||||
if SessionLocal is not None:
|
||||
try:
|
||||
with SessionLocal() as session:
|
||||
session.execute(text("SELECT 1"))
|
||||
db_ok = True
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Redis check
|
||||
if not settings.redis_url:
|
||||
redis_ok = True # No Redis needed — in-process mode
|
||||
elif _redis_client is not None:
|
||||
try:
|
||||
_redis_client.ping()
|
||||
redis_ok = True
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return {"status": "ok" if (db_ok and redis_ok) else "degraded", "database": db_ok, "redis": redis_ok}
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# WebSocket endpoint
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@app.websocket("/ws")
|
||||
async def websocket_endpoint(websocket: WebSocket) -> None:
|
||||
"""WebSocket connection for real-time dashboard updates."""
|
||||
await ws_manager.connect(websocket)
|
||||
try:
|
||||
while True:
|
||||
# Keep connection alive; handle incoming messages if needed
|
||||
data = await websocket.receive_json()
|
||||
# Echo back or handle client messages in future
|
||||
await websocket.send_json({"type": "ack", "data": data})
|
||||
except WebSocketDisconnect:
|
||||
ws_manager.disconnect(websocket)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Mount routers (stubs — actual implementations come later)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# Router imports are deferred to avoid circular imports and allow
|
||||
# stub files to be created independently. Each router will be mounted
|
||||
# as it is implemented. For now we register empty prefixes.
|
||||
|
||||
def _mount_routers() -> None:
|
||||
"""Import and mount all routers. Silently skip missing ones."""
|
||||
router_configs = [
|
||||
("routers.auth", "/api/auth", ["auth"]),
|
||||
("routers.projects", "/api/projects", ["projects"]),
|
||||
("routers.experiments", "/api/experiments", ["experiments"]),
|
||||
("routers.runs", "/api/runs", ["runs"]),
|
||||
("routers.endpoints", "/api/endpoints", ["endpoints"]),
|
||||
("routers.export", "/api/export", ["export"]),
|
||||
("routers.webhooks", "/api/webhooks", ["webhooks"]),
|
||||
("routers.admin", "/api/admin", ["admin"]),
|
||||
]
|
||||
for module_name, prefix, tags in router_configs:
|
||||
try:
|
||||
import importlib
|
||||
mod = importlib.import_module(module_name)
|
||||
app.include_router(mod.router, prefix=prefix, tags=tags)
|
||||
except (ImportError, AttributeError):
|
||||
pass # Router not yet implemented
|
||||
|
||||
|
||||
_mount_routers()
|
||||
0
backend/mcp/__init__.py
Normal file
0
backend/mcp/__init__.py
Normal file
276
backend/models.py
Normal file
276
backend/models.py
Normal file
|
|
@ -0,0 +1,276 @@
|
|||
"""PromptLooper SQLAlchemy ORM models."""
|
||||
|
||||
import enum
|
||||
import uuid
|
||||
from datetime import datetime, timezone
|
||||
|
||||
from sqlalchemy import (
|
||||
JSON,
|
||||
Boolean,
|
||||
DateTime,
|
||||
Enum,
|
||||
Float,
|
||||
ForeignKey,
|
||||
Index,
|
||||
Integer,
|
||||
Numeric,
|
||||
String,
|
||||
Text,
|
||||
)
|
||||
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship
|
||||
|
||||
|
||||
def _utcnow() -> datetime:
|
||||
return datetime.now(timezone.utc)
|
||||
|
||||
|
||||
def _new_uuid() -> uuid.UUID:
|
||||
return uuid.uuid4()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Base
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class Base(DeclarativeBase):
|
||||
"""Shared declarative base for all models."""
|
||||
|
||||
type_annotation_map = {
|
||||
dict: JSON,
|
||||
}
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Enums
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class ExperimentStatus(str, enum.Enum):
|
||||
draft = "draft"
|
||||
running = "running"
|
||||
paused = "paused"
|
||||
completed = "completed"
|
||||
|
||||
|
||||
class RunStatus(str, enum.Enum):
|
||||
pending = "pending"
|
||||
running = "running"
|
||||
completed = "completed"
|
||||
failed = "failed"
|
||||
cached = "cached"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Models
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class User(Base):
|
||||
__tablename__ = "users"
|
||||
|
||||
id: Mapped[uuid.UUID] = mapped_column(
|
||||
primary_key=True, default=_new_uuid
|
||||
)
|
||||
username: Mapped[str] = mapped_column(String(255), unique=True, nullable=False)
|
||||
password_hash: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
is_admin: Mapped[bool] = mapped_column(Boolean, default=False, nullable=False)
|
||||
created_at: Mapped[datetime] = mapped_column(
|
||||
DateTime(timezone=True), default=_utcnow, nullable=False
|
||||
)
|
||||
|
||||
# Relationships
|
||||
projects: Mapped[list["Project"]] = relationship(
|
||||
back_populates="owner", cascade="all, delete-orphan"
|
||||
)
|
||||
|
||||
|
||||
class Project(Base):
|
||||
__tablename__ = "projects"
|
||||
|
||||
id: Mapped[uuid.UUID] = mapped_column(
|
||||
primary_key=True, default=_new_uuid
|
||||
)
|
||||
name: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
description: Mapped[str | None] = mapped_column(Text, nullable=True)
|
||||
owner_id: Mapped[uuid.UUID] = mapped_column(
|
||||
ForeignKey("users.id", ondelete="CASCADE"), nullable=False
|
||||
)
|
||||
created_at: Mapped[datetime] = mapped_column(
|
||||
DateTime(timezone=True), default=_utcnow, nullable=False
|
||||
)
|
||||
updated_at: Mapped[datetime] = mapped_column(
|
||||
DateTime(timezone=True), default=_utcnow, onupdate=_utcnow, nullable=False
|
||||
)
|
||||
|
||||
# Relationships
|
||||
owner: Mapped["User"] = relationship(back_populates="projects")
|
||||
experiments: Mapped[list["Experiment"]] = relationship(
|
||||
back_populates="project", cascade="all, delete-orphan"
|
||||
)
|
||||
|
||||
|
||||
class Experiment(Base):
|
||||
__tablename__ = "experiments"
|
||||
|
||||
id: Mapped[uuid.UUID] = mapped_column(
|
||||
primary_key=True, default=_new_uuid
|
||||
)
|
||||
project_id: Mapped[uuid.UUID] = mapped_column(
|
||||
ForeignKey("projects.id", ondelete="CASCADE"), nullable=False
|
||||
)
|
||||
name: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
description: Mapped[str | None] = mapped_column(Text, nullable=True)
|
||||
sample_data: Mapped[dict | None] = mapped_column(JSON, nullable=True)
|
||||
pipeline_stages: Mapped[dict | None] = mapped_column(JSON, nullable=True)
|
||||
scoring_config: Mapped[dict | None] = mapped_column(JSON, nullable=True)
|
||||
parameter_space: Mapped[dict | None] = mapped_column(JSON, nullable=True)
|
||||
status: Mapped[ExperimentStatus] = mapped_column(
|
||||
Enum(ExperimentStatus, name="experiment_status"),
|
||||
default=ExperimentStatus.draft,
|
||||
nullable=False,
|
||||
)
|
||||
created_at: Mapped[datetime] = mapped_column(
|
||||
DateTime(timezone=True), default=_utcnow, nullable=False
|
||||
)
|
||||
updated_at: Mapped[datetime] = mapped_column(
|
||||
DateTime(timezone=True), default=_utcnow, onupdate=_utcnow, nullable=False
|
||||
)
|
||||
|
||||
# Relationships
|
||||
project: Mapped["Project"] = relationship(back_populates="experiments")
|
||||
runs: Mapped[list["Run"]] = relationship(
|
||||
back_populates="experiment", cascade="all, delete-orphan"
|
||||
)
|
||||
|
||||
__table_args__ = (
|
||||
Index("ix_experiments_project_id", "project_id"),
|
||||
Index("ix_experiments_status", "status"),
|
||||
)
|
||||
|
||||
|
||||
class Run(Base):
|
||||
__tablename__ = "runs"
|
||||
|
||||
id: Mapped[uuid.UUID] = mapped_column(
|
||||
primary_key=True, default=_new_uuid
|
||||
)
|
||||
experiment_id: Mapped[uuid.UUID] = mapped_column(
|
||||
ForeignKey("experiments.id", ondelete="CASCADE"), nullable=False
|
||||
)
|
||||
config_hash: Mapped[str] = mapped_column(String(64), nullable=False)
|
||||
config: Mapped[dict] = mapped_column(JSON, nullable=False)
|
||||
status: Mapped[RunStatus] = mapped_column(
|
||||
Enum(RunStatus, name="run_status"),
|
||||
default=RunStatus.pending,
|
||||
nullable=False,
|
||||
)
|
||||
started_at: Mapped[datetime | None] = mapped_column(
|
||||
DateTime(timezone=True), nullable=True
|
||||
)
|
||||
completed_at: Mapped[datetime | None] = mapped_column(
|
||||
DateTime(timezone=True), nullable=True
|
||||
)
|
||||
duration_ms: Mapped[int | None] = mapped_column(Integer, nullable=True)
|
||||
tokens_in: Mapped[int | None] = mapped_column(Integer, nullable=True)
|
||||
tokens_out: Mapped[int | None] = mapped_column(Integer, nullable=True)
|
||||
cost_estimate: Mapped[float | None] = mapped_column(
|
||||
Numeric(precision=12, scale=6), nullable=True
|
||||
)
|
||||
|
||||
# Relationships
|
||||
experiment: Mapped["Experiment"] = relationship(back_populates="runs")
|
||||
stage_results: Mapped[list["StageResult"]] = relationship(
|
||||
back_populates="run", cascade="all, delete-orphan"
|
||||
)
|
||||
scores: Mapped[list["Score"]] = relationship(
|
||||
back_populates="run", cascade="all, delete-orphan"
|
||||
)
|
||||
|
||||
__table_args__ = (
|
||||
Index("ix_runs_experiment_id", "experiment_id"),
|
||||
Index("ix_runs_config_hash", "config_hash"),
|
||||
Index("ix_runs_status", "status"),
|
||||
)
|
||||
|
||||
|
||||
class StageResult(Base):
|
||||
__tablename__ = "stage_results"
|
||||
|
||||
id: Mapped[uuid.UUID] = mapped_column(
|
||||
primary_key=True, default=_new_uuid
|
||||
)
|
||||
run_id: Mapped[uuid.UUID] = mapped_column(
|
||||
ForeignKey("runs.id", ondelete="CASCADE"), nullable=False
|
||||
)
|
||||
stage_index: Mapped[int] = mapped_column(Integer, nullable=False)
|
||||
prompt_sent: Mapped[str] = mapped_column(Text, nullable=False)
|
||||
response_raw: Mapped[str] = mapped_column(Text, nullable=False)
|
||||
model_used: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
parameters: Mapped[dict | None] = mapped_column(JSON, nullable=True)
|
||||
tokens_in: Mapped[int | None] = mapped_column(Integer, nullable=True)
|
||||
tokens_out: Mapped[int | None] = mapped_column(Integer, nullable=True)
|
||||
latency_ms: Mapped[int | None] = mapped_column(Integer, nullable=True)
|
||||
|
||||
# Relationships
|
||||
run: Mapped["Run"] = relationship(back_populates="stage_results")
|
||||
|
||||
__table_args__ = (
|
||||
Index("ix_stage_results_run_id", "run_id"),
|
||||
)
|
||||
|
||||
|
||||
class Score(Base):
|
||||
__tablename__ = "scores"
|
||||
|
||||
id: Mapped[uuid.UUID] = mapped_column(
|
||||
primary_key=True, default=_new_uuid
|
||||
)
|
||||
run_id: Mapped[uuid.UUID] = mapped_column(
|
||||
ForeignKey("runs.id", ondelete="CASCADE"), nullable=False
|
||||
)
|
||||
scorer_name: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
value: Mapped[float] = mapped_column(Float, nullable=False)
|
||||
scorer_metadata: Mapped[dict | None] = mapped_column(
|
||||
"metadata", JSON, nullable=True
|
||||
)
|
||||
created_at: Mapped[datetime] = mapped_column(
|
||||
DateTime(timezone=True), default=_utcnow, nullable=False
|
||||
)
|
||||
|
||||
# Relationships
|
||||
run: Mapped["Run"] = relationship(back_populates="scores")
|
||||
|
||||
__table_args__ = (
|
||||
Index("ix_scores_run_id", "run_id"),
|
||||
Index("ix_scores_scorer_name", "scorer_name"),
|
||||
)
|
||||
|
||||
|
||||
class ResponseCache(Base):
|
||||
__tablename__ = "response_cache"
|
||||
|
||||
config_hash: Mapped[str] = mapped_column(
|
||||
String(64), primary_key=True
|
||||
)
|
||||
response: Mapped[str] = mapped_column(Text, nullable=False)
|
||||
model: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
tokens_in: Mapped[int | None] = mapped_column(Integer, nullable=True)
|
||||
tokens_out: Mapped[int | None] = mapped_column(Integer, nullable=True)
|
||||
latency_ms: Mapped[int | None] = mapped_column(Integer, nullable=True)
|
||||
created_at: Mapped[datetime] = mapped_column(
|
||||
DateTime(timezone=True), default=_utcnow, nullable=False
|
||||
)
|
||||
|
||||
|
||||
class WebhookConfig(Base):
|
||||
__tablename__ = "webhook_configs"
|
||||
|
||||
id: Mapped[uuid.UUID] = mapped_column(
|
||||
primary_key=True, default=_new_uuid
|
||||
)
|
||||
event_type: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
url: Mapped[str] = mapped_column(String(2048), nullable=False)
|
||||
headers: Mapped[dict | None] = mapped_column(JSON, nullable=True)
|
||||
is_active: Mapped[bool] = mapped_column(Boolean, default=True, nullable=False)
|
||||
|
||||
__table_args__ = (
|
||||
Index("ix_webhook_configs_event_type", "event_type"),
|
||||
)
|
||||
16
backend/requirements.txt
Normal file
16
backend/requirements.txt
Normal file
|
|
@ -0,0 +1,16 @@
|
|||
# PromptLooper — Backend Dependencies
|
||||
fastapi>=0.115,<1.0
|
||||
uvicorn[standard]>=0.32,<1.0
|
||||
sqlalchemy>=2.0,<3.0
|
||||
alembic>=1.14,<2.0
|
||||
pydantic>=2.0,<3.0
|
||||
pydantic-settings>=2.0,<3.0
|
||||
python-jose[cryptography]>=3.3,<4.0
|
||||
passlib[bcrypt]>=1.7,<2.0
|
||||
celery>=5.4,<6.0
|
||||
redis>=5.0,<6.0
|
||||
httpx>=0.27,<1.0
|
||||
websockets>=13.0,<14.0
|
||||
psycopg2-binary>=2.9,<3.0
|
||||
aiosqlite>=0.20,<1.0
|
||||
python-multipart>=0.0.9
|
||||
0
backend/routers/__init__.py
Normal file
0
backend/routers/__init__.py
Normal file
23
backend/routers/admin.py
Normal file
23
backend/routers/admin.py
Normal file
|
|
@ -0,0 +1,23 @@
|
|||
"""Admin router — system settings and stats."""
|
||||
|
||||
from fastapi import APIRouter, Response
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
@router.get("/settings", status_code=501)
|
||||
def get_settings():
|
||||
"""System settings (guest access, default model, etc.)."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.put("/settings", status_code=501)
|
||||
def update_settings():
|
||||
"""Update settings."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.get("/stats", status_code=501)
|
||||
def get_stats():
|
||||
"""System-wide stats (total runs, cache hit rate, etc.)."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
23
backend/routers/auth.py
Normal file
23
backend/routers/auth.py
Normal file
|
|
@ -0,0 +1,23 @@
|
|||
"""Auth router — setup, login, and current user info."""
|
||||
|
||||
from fastapi import APIRouter, Response
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
@router.post("/setup", status_code=501)
|
||||
def setup():
|
||||
"""First-boot admin password setup."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/login", status_code=501)
|
||||
def login():
|
||||
"""Login, returns JWT."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.get("/me", status_code=501)
|
||||
def me():
|
||||
"""Current user info."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
37
backend/routers/endpoints.py
Normal file
37
backend/routers/endpoints.py
Normal file
|
|
@ -0,0 +1,37 @@
|
|||
"""Endpoints router — LLM target management."""
|
||||
|
||||
import uuid
|
||||
|
||||
from fastapi import APIRouter, Response
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
@router.get("/", status_code=501)
|
||||
def list_endpoints():
|
||||
"""List configured LLM endpoints."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/", status_code=501)
|
||||
def create_endpoint():
|
||||
"""Add endpoint (URL, API key, label)."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.put("/{endpoint_id}", status_code=501)
|
||||
def update_endpoint(endpoint_id: uuid.UUID):
|
||||
"""Update endpoint."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.delete("/{endpoint_id}", status_code=501)
|
||||
def delete_endpoint(endpoint_id: uuid.UUID):
|
||||
"""Remove endpoint."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/{endpoint_id}/test", status_code=501)
|
||||
def test_endpoint(endpoint_id: uuid.UUID):
|
||||
"""Test connectivity and list available models."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
61
backend/routers/experiments.py
Normal file
61
backend/routers/experiments.py
Normal file
|
|
@ -0,0 +1,61 @@
|
|||
"""Experiments router — CRUD and sweep controls."""
|
||||
|
||||
import uuid
|
||||
|
||||
from fastapi import APIRouter, Response
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
@router.get("/", status_code=501)
|
||||
def list_experiments():
|
||||
"""List experiments (filter by project)."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/", status_code=501)
|
||||
def create_experiment():
|
||||
"""Create experiment."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.get("/{experiment_id}", status_code=501)
|
||||
def get_experiment(experiment_id: uuid.UUID):
|
||||
"""Experiment detail with run summaries."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.put("/{experiment_id}", status_code=501)
|
||||
def update_experiment(experiment_id: uuid.UUID):
|
||||
"""Update experiment config."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.delete("/{experiment_id}", status_code=501)
|
||||
def delete_experiment(experiment_id: uuid.UUID):
|
||||
"""Delete experiment."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/{experiment_id}/sweep", status_code=501)
|
||||
def start_sweep(experiment_id: uuid.UUID):
|
||||
"""Start a sweep (grid, random, or guided)."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/{experiment_id}/pause", status_code=501)
|
||||
def pause_sweep(experiment_id: uuid.UUID):
|
||||
"""Pause running sweep."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/{experiment_id}/resume", status_code=501)
|
||||
def resume_sweep(experiment_id: uuid.UUID):
|
||||
"""Resume paused sweep."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/{experiment_id}/stop", status_code=501)
|
||||
def stop_sweep(experiment_id: uuid.UUID):
|
||||
"""Stop sweep."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
31
backend/routers/export.py
Normal file
31
backend/routers/export.py
Normal file
|
|
@ -0,0 +1,31 @@
|
|||
"""Export router — export experiment results in various formats."""
|
||||
|
||||
import uuid
|
||||
|
||||
from fastapi import APIRouter, Response
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
@router.get("/experiments/{experiment_id}/best", status_code=501)
|
||||
def export_best(experiment_id: uuid.UUID):
|
||||
"""Best config as JSON."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.get("/experiments/{experiment_id}/env", status_code=501)
|
||||
def export_env(experiment_id: uuid.UUID):
|
||||
"""Best config as .env snippet."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.get("/experiments/{experiment_id}/yaml", status_code=501)
|
||||
def export_yaml(experiment_id: uuid.UUID):
|
||||
"""Best config as YAML."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.get("/experiments/{experiment_id}/report", status_code=501)
|
||||
def export_report(experiment_id: uuid.UUID):
|
||||
"""Full experiment report (markdown)."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
37
backend/routers/projects.py
Normal file
37
backend/routers/projects.py
Normal file
|
|
@ -0,0 +1,37 @@
|
|||
"""Projects router — CRUD for projects."""
|
||||
|
||||
import uuid
|
||||
|
||||
from fastapi import APIRouter, Response
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
@router.get("/", status_code=501)
|
||||
def list_projects():
|
||||
"""List projects."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/", status_code=501)
|
||||
def create_project():
|
||||
"""Create project."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.get("/{project_id}", status_code=501)
|
||||
def get_project(project_id: uuid.UUID):
|
||||
"""Project detail with experiment summaries."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.put("/{project_id}", status_code=501)
|
||||
def update_project(project_id: uuid.UUID):
|
||||
"""Update project."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.delete("/{project_id}", status_code=501)
|
||||
def delete_project(project_id: uuid.UUID):
|
||||
"""Delete project and all experiments."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
37
backend/routers/runs.py
Normal file
37
backend/routers/runs.py
Normal file
|
|
@ -0,0 +1,37 @@
|
|||
"""Runs router — execute, detail, score, and leaderboard."""
|
||||
|
||||
import uuid
|
||||
|
||||
from fastapi import APIRouter, Response
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
@router.get("/experiments/{experiment_id}/runs", status_code=501)
|
||||
def list_runs(experiment_id: uuid.UUID):
|
||||
"""List runs with scores (sortable, filterable)."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.get("/{run_id}", status_code=501)
|
||||
def get_run(run_id: uuid.UUID):
|
||||
"""Run detail with stage results."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/", status_code=501)
|
||||
def create_run():
|
||||
"""Execute a single run (ad-hoc)."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/{run_id}/score", status_code=501)
|
||||
def score_run(run_id: uuid.UUID):
|
||||
"""Add human rating to a run."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.get("/experiments/{experiment_id}/leaderboard", status_code=501)
|
||||
def leaderboard(experiment_id: uuid.UUID):
|
||||
"""Top runs ranked by weighted score."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
25
backend/routers/webhooks.py
Normal file
25
backend/routers/webhooks.py
Normal file
|
|
@ -0,0 +1,25 @@
|
|||
"""Webhooks router — manage webhook configurations."""
|
||||
|
||||
import uuid
|
||||
|
||||
from fastapi import APIRouter, Response
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
@router.get("/", status_code=501)
|
||||
def list_webhooks():
|
||||
"""List webhook configs."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.post("/", status_code=501)
|
||||
def create_webhook():
|
||||
"""Create webhook."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
|
||||
|
||||
@router.delete("/{webhook_id}", status_code=501)
|
||||
def delete_webhook(webhook_id: uuid.UUID):
|
||||
"""Remove webhook."""
|
||||
return Response(status_code=501, content="Not Implemented")
|
||||
298
backend/schemas.py
Normal file
298
backend/schemas.py
Normal file
|
|
@ -0,0 +1,298 @@
|
|||
"""PromptLooper Pydantic request/response schemas."""
|
||||
|
||||
import uuid
|
||||
from datetime import datetime
|
||||
|
||||
from pydantic import BaseModel, ConfigDict, Field
|
||||
|
||||
from models import ExperimentStatus, RunStatus
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Shared mixins
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class _TimestampMixin(BaseModel):
|
||||
created_at: datetime
|
||||
updated_at: datetime
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Project
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class ProjectCreate(BaseModel):
|
||||
name: str = Field(..., min_length=1, max_length=255)
|
||||
description: str | None = None
|
||||
|
||||
|
||||
class ProjectUpdate(BaseModel):
|
||||
name: str | None = Field(None, min_length=1, max_length=255)
|
||||
description: str | None = None
|
||||
|
||||
|
||||
class ProjectResponse(BaseModel):
|
||||
model_config = ConfigDict(from_attributes=True)
|
||||
|
||||
id: uuid.UUID
|
||||
name: str
|
||||
description: str | None
|
||||
owner_id: uuid.UUID
|
||||
created_at: datetime
|
||||
updated_at: datetime
|
||||
|
||||
|
||||
class ProjectListResponse(BaseModel):
|
||||
items: list[ProjectResponse]
|
||||
total: int
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Experiment
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class ExperimentCreate(BaseModel):
|
||||
name: str = Field(..., min_length=1, max_length=255)
|
||||
description: str | None = None
|
||||
sample_data: dict | None = None
|
||||
pipeline_stages: dict | None = None
|
||||
scoring_config: dict | None = None
|
||||
parameter_space: dict | None = None
|
||||
|
||||
|
||||
class ExperimentUpdate(BaseModel):
|
||||
name: str | None = Field(None, min_length=1, max_length=255)
|
||||
description: str | None = None
|
||||
sample_data: dict | None = None
|
||||
pipeline_stages: dict | None = None
|
||||
scoring_config: dict | None = None
|
||||
parameter_space: dict | None = None
|
||||
status: ExperimentStatus | None = None
|
||||
|
||||
|
||||
class ExperimentResponse(BaseModel):
|
||||
model_config = ConfigDict(from_attributes=True)
|
||||
|
||||
id: uuid.UUID
|
||||
project_id: uuid.UUID
|
||||
name: str
|
||||
description: str | None
|
||||
sample_data: dict | None
|
||||
pipeline_stages: dict | None
|
||||
scoring_config: dict | None
|
||||
parameter_space: dict | None
|
||||
status: ExperimentStatus
|
||||
created_at: datetime
|
||||
updated_at: datetime
|
||||
|
||||
|
||||
class ExperimentListResponse(BaseModel):
|
||||
items: list[ExperimentResponse]
|
||||
total: int
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Run
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class RunResponse(BaseModel):
|
||||
model_config = ConfigDict(from_attributes=True)
|
||||
|
||||
id: uuid.UUID
|
||||
experiment_id: uuid.UUID
|
||||
config_hash: str
|
||||
config: dict
|
||||
status: RunStatus
|
||||
started_at: datetime | None
|
||||
completed_at: datetime | None
|
||||
duration_ms: int | None
|
||||
tokens_in: int | None
|
||||
tokens_out: int | None
|
||||
cost_estimate: float | None
|
||||
|
||||
|
||||
class RunListResponse(BaseModel):
|
||||
items: list[RunResponse]
|
||||
total: int
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# StageResult (read-only, returned inside Run details)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class StageResultResponse(BaseModel):
|
||||
model_config = ConfigDict(from_attributes=True)
|
||||
|
||||
id: uuid.UUID
|
||||
run_id: uuid.UUID
|
||||
stage_index: int
|
||||
prompt_sent: str
|
||||
response_raw: str
|
||||
model_used: str
|
||||
parameters: dict | None
|
||||
tokens_in: int | None
|
||||
tokens_out: int | None
|
||||
latency_ms: int | None
|
||||
|
||||
|
||||
class RunDetailResponse(RunResponse):
|
||||
"""Run with nested stage results and scores."""
|
||||
|
||||
stage_results: list[StageResultResponse] = []
|
||||
scores: list["ScoreResponse"] = []
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Score
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class ScoreInput(BaseModel):
|
||||
scorer_name: str = Field(..., min_length=1, max_length=255)
|
||||
value: float
|
||||
metadata: dict | None = None
|
||||
|
||||
|
||||
class ScoreResponse(BaseModel):
|
||||
model_config = ConfigDict(from_attributes=True)
|
||||
|
||||
id: uuid.UUID
|
||||
run_id: uuid.UUID
|
||||
scorer_name: str
|
||||
value: float
|
||||
scorer_metadata: dict | None
|
||||
created_at: datetime
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Endpoint (LLM endpoint configuration)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class EndpointCreate(BaseModel):
|
||||
name: str = Field(..., min_length=1, max_length=255)
|
||||
url: str = Field(..., min_length=1, max_length=2048)
|
||||
api_key: str | None = None
|
||||
default_model: str | None = Field(None, max_length=255)
|
||||
|
||||
|
||||
class EndpointUpdate(BaseModel):
|
||||
name: str | None = Field(None, min_length=1, max_length=255)
|
||||
url: str | None = Field(None, min_length=1, max_length=2048)
|
||||
api_key: str | None = None
|
||||
default_model: str | None = Field(None, max_length=255)
|
||||
|
||||
|
||||
class EndpointResponse(BaseModel):
|
||||
model_config = ConfigDict(from_attributes=True)
|
||||
|
||||
id: uuid.UUID
|
||||
name: str
|
||||
url: str
|
||||
default_model: str | None
|
||||
|
||||
|
||||
class EndpointListResponse(BaseModel):
|
||||
items: list[EndpointResponse]
|
||||
total: int
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Webhook
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class WebhookCreate(BaseModel):
|
||||
event_type: str = Field(..., min_length=1, max_length=255)
|
||||
url: str = Field(..., min_length=1, max_length=2048)
|
||||
headers: dict | None = None
|
||||
is_active: bool = True
|
||||
|
||||
|
||||
class WebhookUpdate(BaseModel):
|
||||
event_type: str | None = Field(None, min_length=1, max_length=255)
|
||||
url: str | None = Field(None, min_length=1, max_length=2048)
|
||||
headers: dict | None = None
|
||||
is_active: bool | None = None
|
||||
|
||||
|
||||
class WebhookResponse(BaseModel):
|
||||
model_config = ConfigDict(from_attributes=True)
|
||||
|
||||
id: uuid.UUID
|
||||
event_type: str
|
||||
url: str
|
||||
headers: dict | None
|
||||
is_active: bool
|
||||
|
||||
|
||||
class WebhookListResponse(BaseModel):
|
||||
items: list[WebhookResponse]
|
||||
total: int
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Auth
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class SetupRequest(BaseModel):
|
||||
username: str = Field(..., min_length=1, max_length=255)
|
||||
password: str = Field(..., min_length=8)
|
||||
|
||||
|
||||
class LoginRequest(BaseModel):
|
||||
username: str
|
||||
password: str
|
||||
|
||||
|
||||
class TokenResponse(BaseModel):
|
||||
access_token: str
|
||||
token_type: str = "bearer"
|
||||
|
||||
|
||||
class UserResponse(BaseModel):
|
||||
model_config = ConfigDict(from_attributes=True)
|
||||
|
||||
id: uuid.UUID
|
||||
username: str
|
||||
is_admin: bool
|
||||
created_at: datetime
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Export
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class ExportRunRow(BaseModel):
|
||||
"""Flat row for CSV/JSON export of run results."""
|
||||
|
||||
run_id: uuid.UUID
|
||||
experiment_id: uuid.UUID
|
||||
config_hash: str
|
||||
config: dict
|
||||
status: RunStatus
|
||||
duration_ms: int | None = None
|
||||
tokens_in: int | None = None
|
||||
tokens_out: int | None = None
|
||||
cost_estimate: float | None = None
|
||||
scores: dict[str, float] = Field(
|
||||
default_factory=dict,
|
||||
description="Map of scorer_name → value",
|
||||
)
|
||||
|
||||
|
||||
class ExportResponse(BaseModel):
|
||||
experiment_id: uuid.UUID
|
||||
experiment_name: str
|
||||
rows: list[ExportRunRow]
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Health
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class HealthResponse(BaseModel):
|
||||
status: str = "ok"
|
||||
database: bool
|
||||
redis: bool
|
||||
|
||||
|
||||
# Rebuild forward refs for RunDetailResponse
|
||||
RunDetailResponse.model_rebuild()
|
||||
0
backend/tests/__init__.py
Normal file
0
backend/tests/__init__.py
Normal file
107
backend/tests/test_alembic.py
Normal file
107
backend/tests/test_alembic.py
Normal file
|
|
@ -0,0 +1,107 @@
|
|||
"""Tests for Alembic migration setup."""
|
||||
|
||||
import os
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
from alembic import command
|
||||
from alembic.config import Config
|
||||
from sqlalchemy import create_engine, inspect
|
||||
|
||||
# Resolve the repo root regardless of where pytest is invoked from.
|
||||
_REPO_ROOT = Path(__file__).resolve().parents[2]
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def alembic_cfg(tmp_path):
|
||||
"""Create an Alembic config pointing at a temporary SQLite database."""
|
||||
db_path = tmp_path / "test.db"
|
||||
db_url = f"sqlite:///{db_path}"
|
||||
|
||||
cfg = Config(str(_REPO_ROOT / "alembic.ini"))
|
||||
cfg.set_main_option("script_location", str(_REPO_ROOT / "alembic"))
|
||||
cfg.set_main_option("sqlalchemy.url", db_url)
|
||||
return cfg, db_url
|
||||
|
||||
|
||||
def test_upgrade_head_creates_all_tables(alembic_cfg):
|
||||
"""Running 'upgrade head' should create all expected tables."""
|
||||
cfg, db_url = alembic_cfg
|
||||
command.upgrade(cfg, "head")
|
||||
|
||||
engine = create_engine(db_url)
|
||||
inspector = inspect(engine)
|
||||
tables = set(inspector.get_table_names())
|
||||
|
||||
expected = {
|
||||
"alembic_version",
|
||||
"users",
|
||||
"projects",
|
||||
"experiments",
|
||||
"runs",
|
||||
"stage_results",
|
||||
"scores",
|
||||
"response_cache",
|
||||
"webhook_configs",
|
||||
}
|
||||
assert expected == tables
|
||||
|
||||
|
||||
def test_downgrade_base_removes_all_tables(alembic_cfg):
|
||||
"""Running 'downgrade base' should remove all application tables."""
|
||||
cfg, db_url = alembic_cfg
|
||||
command.upgrade(cfg, "head")
|
||||
command.downgrade(cfg, "base")
|
||||
|
||||
engine = create_engine(db_url)
|
||||
inspector = inspect(engine)
|
||||
tables = set(inspector.get_table_names())
|
||||
|
||||
# Only alembic_version should remain
|
||||
assert tables == {"alembic_version"}
|
||||
|
||||
|
||||
def test_runs_table_has_expected_columns(alembic_cfg):
|
||||
"""Spot-check that the runs table has key columns."""
|
||||
cfg, db_url = alembic_cfg
|
||||
command.upgrade(cfg, "head")
|
||||
|
||||
engine = create_engine(db_url)
|
||||
inspector = inspect(engine)
|
||||
columns = {c["name"] for c in inspector.get_columns("runs")}
|
||||
|
||||
assert "id" in columns
|
||||
assert "experiment_id" in columns
|
||||
assert "config_hash" in columns
|
||||
assert "status" in columns
|
||||
assert "cost_estimate" in columns
|
||||
|
||||
|
||||
def test_indexes_created(alembic_cfg):
|
||||
"""Verify key indexes exist after migration."""
|
||||
cfg, db_url = alembic_cfg
|
||||
command.upgrade(cfg, "head")
|
||||
|
||||
engine = create_engine(db_url)
|
||||
inspector = inspect(engine)
|
||||
|
||||
run_indexes = {idx["name"] for idx in inspector.get_indexes("runs")}
|
||||
assert "ix_runs_config_hash" in run_indexes
|
||||
assert "ix_runs_experiment_id" in run_indexes
|
||||
|
||||
score_indexes = {idx["name"] for idx in inspector.get_indexes("scores")}
|
||||
assert "ix_scores_run_id" in score_indexes
|
||||
assert "ix_scores_scorer_name" in score_indexes
|
||||
|
||||
|
||||
def test_foreign_keys_on_experiments(alembic_cfg):
|
||||
"""Verify experiments table has FK to projects."""
|
||||
cfg, db_url = alembic_cfg
|
||||
command.upgrade(cfg, "head")
|
||||
|
||||
engine = create_engine(db_url)
|
||||
inspector = inspect(engine)
|
||||
fks = inspector.get_foreign_keys("experiments")
|
||||
|
||||
referred_tables = {fk["referred_table"] for fk in fks}
|
||||
assert "projects" in referred_tables
|
||||
238
backend/tests/test_auth.py
Normal file
238
backend/tests/test_auth.py
Normal file
|
|
@ -0,0 +1,238 @@
|
|||
"""Tests for backend/auth.py — JWT, API key, setup flow, and auth dependency."""
|
||||
|
||||
import os
|
||||
from datetime import timedelta
|
||||
from unittest.mock import patch
|
||||
|
||||
import pytest
|
||||
from fastapi import FastAPI, Depends
|
||||
from fastapi.testclient import TestClient
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _isolate_settings(tmp_path):
|
||||
"""Ensure tests use a temp SQLite DB and no Redis."""
|
||||
env = {
|
||||
"DATABASE_URL": f"sqlite:///{tmp_path / 'test.db'}",
|
||||
"REDIS_URL": "",
|
||||
"DATA_DIR": str(tmp_path),
|
||||
"JWT_SECRET": "test-secret-key-for-jwt-signing",
|
||||
"API_KEY": "test-api-key-12345",
|
||||
}
|
||||
with patch.dict(os.environ, env, clear=False):
|
||||
import config
|
||||
new_settings = config.Settings(_env_file=None)
|
||||
config.settings = new_settings
|
||||
|
||||
import main
|
||||
main.settings = new_settings
|
||||
main._init_db()
|
||||
main._init_redis()
|
||||
|
||||
from models import Base
|
||||
Base.metadata.create_all(bind=main.engine)
|
||||
|
||||
# Also patch auth module's settings reference
|
||||
import auth
|
||||
auth.settings = new_settings
|
||||
|
||||
yield
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def db_session():
|
||||
from main import get_db
|
||||
gen = get_db()
|
||||
session = next(gen)
|
||||
yield session
|
||||
try:
|
||||
next(gen)
|
||||
except StopIteration:
|
||||
pass
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Password hashing
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestPasswordHashing:
|
||||
def test_hash_and_verify(self):
|
||||
from auth import hash_password, verify_password
|
||||
hashed = hash_password("my-secret-password")
|
||||
assert hashed != "my-secret-password"
|
||||
assert verify_password("my-secret-password", hashed)
|
||||
|
||||
def test_wrong_password_fails(self):
|
||||
from auth import hash_password, verify_password
|
||||
hashed = hash_password("correct-password")
|
||||
assert not verify_password("wrong-password", hashed)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# JWT
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestJWT:
|
||||
def test_create_and_decode_token(self):
|
||||
from auth import create_access_token, decode_access_token
|
||||
token = create_access_token("user-123")
|
||||
assert decode_access_token(token) == "user-123"
|
||||
|
||||
def test_expired_token_raises(self):
|
||||
from auth import create_access_token, decode_access_token
|
||||
token = create_access_token("user-123", expires_delta=timedelta(seconds=-1))
|
||||
with pytest.raises(Exception) as exc_info:
|
||||
decode_access_token(token)
|
||||
assert exc_info.value.status_code == 401
|
||||
|
||||
def test_invalid_token_raises(self):
|
||||
from auth import decode_access_token
|
||||
with pytest.raises(Exception) as exc_info:
|
||||
decode_access_token("not-a-valid-token")
|
||||
assert exc_info.value.status_code == 401
|
||||
|
||||
def test_token_without_sub_raises(self):
|
||||
from jose import jwt
|
||||
import config
|
||||
token = jwt.encode({"foo": "bar"}, config.settings.jwt_secret, algorithm="HS256")
|
||||
from auth import decode_access_token
|
||||
with pytest.raises(Exception) as exc_info:
|
||||
decode_access_token(token)
|
||||
assert exc_info.value.status_code == 401
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# First-boot setup
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestSetup:
|
||||
def test_needs_setup_true_when_no_users(self, db_session):
|
||||
from auth import needs_setup
|
||||
assert needs_setup(db_session) is True
|
||||
|
||||
def test_create_admin_succeeds(self, db_session):
|
||||
from auth import create_admin, needs_setup
|
||||
user = create_admin(db_session, "admin", "password123")
|
||||
assert user.username == "admin"
|
||||
assert user.is_admin is True
|
||||
assert needs_setup(db_session) is False
|
||||
|
||||
def test_create_admin_twice_raises_409(self, db_session):
|
||||
from auth import create_admin
|
||||
create_admin(db_session, "admin", "password123")
|
||||
with pytest.raises(Exception) as exc_info:
|
||||
create_admin(db_session, "admin2", "password456")
|
||||
assert exc_info.value.status_code == 409
|
||||
|
||||
def test_admin_password_is_hashed(self, db_session):
|
||||
from auth import create_admin
|
||||
user = create_admin(db_session, "admin", "password123")
|
||||
assert user.password_hash != "password123"
|
||||
assert user.password_hash.startswith("$2b$")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Authenticate user (login)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestAuthenticateUser:
|
||||
def test_valid_credentials(self, db_session):
|
||||
from auth import create_admin, authenticate_user
|
||||
create_admin(db_session, "admin", "password123")
|
||||
user = authenticate_user(db_session, "admin", "password123")
|
||||
assert user.username == "admin"
|
||||
|
||||
def test_wrong_password_raises_401(self, db_session):
|
||||
from auth import create_admin, authenticate_user
|
||||
create_admin(db_session, "admin", "password123")
|
||||
with pytest.raises(Exception) as exc_info:
|
||||
authenticate_user(db_session, "admin", "wrong")
|
||||
assert exc_info.value.status_code == 401
|
||||
|
||||
def test_unknown_user_raises_401(self, db_session):
|
||||
from auth import authenticate_user
|
||||
with pytest.raises(Exception) as exc_info:
|
||||
authenticate_user(db_session, "nonexistent", "password")
|
||||
assert exc_info.value.status_code == 401
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# get_current_user dependency (integration via test app)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@pytest.fixture
|
||||
def auth_app():
|
||||
"""Create a minimal FastAPI app with a protected endpoint for testing auth."""
|
||||
from auth import get_current_user
|
||||
from schemas import UserResponse
|
||||
|
||||
test_app = FastAPI()
|
||||
|
||||
@test_app.get("/protected")
|
||||
def protected(user=Depends(get_current_user)):
|
||||
return {"user_id": str(user.id), "username": user.username}
|
||||
|
||||
return test_app
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def auth_client(auth_app):
|
||||
return TestClient(auth_app)
|
||||
|
||||
|
||||
class TestGetCurrentUser:
|
||||
def test_no_auth_returns_401(self, auth_client):
|
||||
resp = auth_client.get("/protected")
|
||||
assert resp.status_code == 401
|
||||
assert "Missing authentication" in resp.json()["detail"]
|
||||
|
||||
def test_invalid_bearer_format_returns_401(self, auth_client):
|
||||
resp = auth_client.get("/protected", headers={"Authorization": "NotBearer token"})
|
||||
assert resp.status_code == 401
|
||||
|
||||
def test_jwt_auth_succeeds(self, auth_client, db_session):
|
||||
from auth import create_admin, create_access_token
|
||||
user = create_admin(db_session, "admin", "password123")
|
||||
token = create_access_token(str(user.id))
|
||||
resp = auth_client.get("/protected", headers={"Authorization": f"Bearer {token}"})
|
||||
assert resp.status_code == 200
|
||||
assert resp.json()["username"] == "admin"
|
||||
|
||||
def test_jwt_for_deleted_user_returns_401(self, auth_client, db_session):
|
||||
from auth import create_access_token
|
||||
import uuid
|
||||
token = create_access_token(str(uuid.uuid4()))
|
||||
resp = auth_client.get("/protected", headers={"Authorization": f"Bearer {token}"})
|
||||
assert resp.status_code == 401
|
||||
|
||||
def test_api_key_auth_succeeds(self, auth_client, db_session):
|
||||
from auth import create_admin
|
||||
create_admin(db_session, "admin", "password123")
|
||||
resp = auth_client.get("/protected", headers={"X-Api-Key": "test-api-key-12345"})
|
||||
assert resp.status_code == 200
|
||||
assert resp.json()["username"] == "admin"
|
||||
|
||||
def test_wrong_api_key_returns_401(self, auth_client):
|
||||
resp = auth_client.get("/protected", headers={"X-Api-Key": "wrong-key"})
|
||||
assert resp.status_code == 401
|
||||
|
||||
def test_api_key_without_admin_returns_401(self, auth_client):
|
||||
# No admin user created yet
|
||||
resp = auth_client.get("/protected", headers={"X-Api-Key": "test-api-key-12345"})
|
||||
assert resp.status_code == 401
|
||||
|
||||
def test_api_key_disabled_when_not_configured(self, auth_client, db_session):
|
||||
"""When API_KEY is not set in config, API key auth should fail."""
|
||||
from auth import create_admin
|
||||
import config, auth
|
||||
create_admin(db_session, "admin", "password123")
|
||||
|
||||
old_key = config.settings.api_key
|
||||
config.settings.api_key = None
|
||||
auth.settings = config.settings
|
||||
try:
|
||||
resp = auth_client.get("/protected", headers={"X-Api-Key": "test-api-key-12345"})
|
||||
assert resp.status_code == 401
|
||||
finally:
|
||||
config.settings.api_key = old_key
|
||||
auth.settings = config.settings
|
||||
105
backend/tests/test_config.py
Normal file
105
backend/tests/test_config.py
Normal file
|
|
@ -0,0 +1,105 @@
|
|||
"""Tests for backend/config.py."""
|
||||
|
||||
import os
|
||||
from unittest.mock import patch
|
||||
|
||||
import pytest
|
||||
from pydantic_settings import BaseSettings
|
||||
|
||||
from config import Settings
|
||||
|
||||
|
||||
class TestSettings:
|
||||
"""Test the Settings configuration class."""
|
||||
|
||||
def _make_settings(self, **env_vars: str) -> Settings:
|
||||
"""Create a Settings instance with specific env vars, ignoring .env file."""
|
||||
with patch.dict(os.environ, env_vars, clear=False):
|
||||
return Settings(_env_file=None)
|
||||
|
||||
def test_defaults(self) -> None:
|
||||
s = self._make_settings()
|
||||
assert s.database_url is None
|
||||
assert s.redis_url is None
|
||||
assert s.host == "0.0.0.0"
|
||||
assert s.port == 8400
|
||||
assert s.api_key is None
|
||||
assert s.default_endpoint_url is None
|
||||
assert s.default_endpoint_key is None
|
||||
assert s.max_concurrent_runs == 4
|
||||
assert s.max_tokens_per_sweep == 0
|
||||
assert s.data_dir == "/data"
|
||||
assert s.mcp_enabled is True
|
||||
assert s.mcp_port == 8401
|
||||
|
||||
def test_jwt_secret_auto_generated(self) -> None:
|
||||
s = self._make_settings()
|
||||
assert len(s.jwt_secret) > 0
|
||||
|
||||
def test_jwt_secret_auto_generated_unique(self) -> None:
|
||||
s1 = self._make_settings()
|
||||
s2 = self._make_settings()
|
||||
assert s1.jwt_secret != s2.jwt_secret
|
||||
|
||||
def test_jwt_secret_from_env(self) -> None:
|
||||
s = self._make_settings(JWT_SECRET="my-secret-key")
|
||||
assert s.jwt_secret == "my-secret-key"
|
||||
|
||||
def test_sqlite_fallback_when_no_database_url(self) -> None:
|
||||
s = self._make_settings(DATA_DIR="/tmp/test")
|
||||
url = s.effective_database_url
|
||||
assert url.startswith("sqlite:///")
|
||||
assert url.endswith("promptlooper.db")
|
||||
assert "tmp" in url and "test" in url
|
||||
assert s.is_sqlite is True
|
||||
|
||||
def test_postgres_when_database_url_set(self) -> None:
|
||||
url = "postgresql://user:pass@localhost:5432/promptlooper"
|
||||
s = self._make_settings(DATABASE_URL=url)
|
||||
assert s.effective_database_url == url
|
||||
assert s.is_sqlite is False
|
||||
|
||||
def test_in_process_queue_when_no_redis(self) -> None:
|
||||
s = self._make_settings()
|
||||
assert s.use_in_process_queue is True
|
||||
|
||||
def test_celery_queue_when_redis_set(self) -> None:
|
||||
s = self._make_settings(REDIS_URL="redis://localhost:6379/0")
|
||||
assert s.use_in_process_queue is False
|
||||
assert s.redis_url == "redis://localhost:6379/0"
|
||||
|
||||
def test_empty_api_key_becomes_none(self) -> None:
|
||||
s = self._make_settings(API_KEY="")
|
||||
assert s.api_key is None
|
||||
|
||||
def test_whitespace_api_key_becomes_none(self) -> None:
|
||||
s = self._make_settings(API_KEY=" ")
|
||||
assert s.api_key is None
|
||||
|
||||
def test_valid_api_key_preserved(self) -> None:
|
||||
s = self._make_settings(API_KEY="sk-test-123")
|
||||
assert s.api_key == "sk-test-123"
|
||||
|
||||
def test_env_overrides(self) -> None:
|
||||
s = self._make_settings(
|
||||
HOST="127.0.0.1",
|
||||
PORT="9000",
|
||||
MAX_CONCURRENT_RUNS="8",
|
||||
MAX_TOKENS_PER_SWEEP="100000",
|
||||
MCP_ENABLED="false",
|
||||
MCP_PORT="9001",
|
||||
)
|
||||
assert s.host == "127.0.0.1"
|
||||
assert s.port == 9000
|
||||
assert s.max_concurrent_runs == 8
|
||||
assert s.max_tokens_per_sweep == 100000
|
||||
assert s.mcp_enabled is False
|
||||
assert s.mcp_port == 9001
|
||||
|
||||
def test_default_endpoint_config(self) -> None:
|
||||
s = self._make_settings(
|
||||
DEFAULT_ENDPOINT_URL="http://localhost:11434/v1",
|
||||
DEFAULT_ENDPOINT_KEY="sk-key",
|
||||
)
|
||||
assert s.default_endpoint_url == "http://localhost:11434/v1"
|
||||
assert s.default_endpoint_key == "sk-key"
|
||||
129
backend/tests/test_main.py
Normal file
129
backend/tests/test_main.py
Normal file
|
|
@ -0,0 +1,129 @@
|
|||
"""Tests for backend/main.py — FastAPI application."""
|
||||
|
||||
import os
|
||||
from unittest.mock import patch
|
||||
|
||||
import pytest
|
||||
from fastapi.testclient import TestClient
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _isolate_settings(tmp_path):
|
||||
"""Ensure tests use a temp SQLite DB and no Redis."""
|
||||
env = {
|
||||
"DATABASE_URL": f"sqlite:///{tmp_path / 'test.db'}",
|
||||
"REDIS_URL": "",
|
||||
"DATA_DIR": str(tmp_path),
|
||||
}
|
||||
with patch.dict(os.environ, env, clear=False):
|
||||
# Reload settings so it picks up test env
|
||||
import config
|
||||
new_settings = config.Settings(_env_file=None)
|
||||
config.settings = new_settings
|
||||
|
||||
# Patch main's reference too
|
||||
import main
|
||||
main.settings = new_settings
|
||||
main._init_db()
|
||||
main._init_redis()
|
||||
|
||||
# Create tables
|
||||
from models import Base
|
||||
Base.metadata.create_all(bind=main.engine)
|
||||
|
||||
yield
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def client():
|
||||
from main import app
|
||||
return TestClient(app)
|
||||
|
||||
|
||||
class TestHealthEndpoint:
|
||||
def test_health_returns_ok(self, client):
|
||||
resp = client.get("/health")
|
||||
assert resp.status_code == 200
|
||||
data = resp.json()
|
||||
assert data["status"] == "ok"
|
||||
assert data["database"] is True
|
||||
assert data["redis"] is True # in-process mode counts as ok
|
||||
|
||||
def test_health_response_schema(self, client):
|
||||
resp = client.get("/health")
|
||||
data = resp.json()
|
||||
assert set(data.keys()) == {"status", "database", "redis"}
|
||||
|
||||
|
||||
class TestCORSMiddleware:
|
||||
def test_cors_headers_present(self, client):
|
||||
resp = client.options(
|
||||
"/health",
|
||||
headers={
|
||||
"Origin": "http://localhost:3000",
|
||||
"Access-Control-Request-Method": "GET",
|
||||
},
|
||||
)
|
||||
assert "access-control-allow-origin" in resp.headers
|
||||
|
||||
|
||||
class TestWebSocket:
|
||||
def test_websocket_connect_and_echo(self, client):
|
||||
with client.websocket_connect("/ws") as ws:
|
||||
ws.send_json({"type": "ping"})
|
||||
data = ws.receive_json()
|
||||
assert data["type"] == "ack"
|
||||
assert data["data"]["type"] == "ping"
|
||||
|
||||
def test_websocket_disconnect_cleanup(self, client):
|
||||
from main import ws_manager
|
||||
initial_count = len(ws_manager.active_connections)
|
||||
with client.websocket_connect("/ws") as ws:
|
||||
assert len(ws_manager.active_connections) == initial_count + 1
|
||||
# After disconnect, connection should be removed
|
||||
assert len(ws_manager.active_connections) == initial_count
|
||||
|
||||
|
||||
class TestRouterMounting:
|
||||
def test_openapi_schema_loads(self, client):
|
||||
resp = client.get("/openapi.json")
|
||||
assert resp.status_code == 200
|
||||
schema = resp.json()
|
||||
assert schema["info"]["title"] == "PromptLooper"
|
||||
|
||||
def test_unknown_route_returns_404(self, client):
|
||||
resp = client.get("/api/nonexistent")
|
||||
assert resp.status_code == 404
|
||||
|
||||
|
||||
class TestConnectionManager:
|
||||
def test_broadcast_removes_dead_connections(self):
|
||||
"""ConnectionManager.broadcast skips and removes broken connections."""
|
||||
from main import ConnectionManager
|
||||
manager = ConnectionManager()
|
||||
# No connections — broadcast should not raise
|
||||
import asyncio
|
||||
asyncio.get_event_loop().run_until_complete(
|
||||
manager.broadcast({"test": True})
|
||||
)
|
||||
assert len(manager.active_connections) == 0
|
||||
|
||||
|
||||
class TestGetDb:
|
||||
def test_get_db_yields_session(self):
|
||||
from main import get_db
|
||||
gen = get_db()
|
||||
session = next(gen)
|
||||
assert session is not None
|
||||
# Clean up
|
||||
try:
|
||||
next(gen)
|
||||
except StopIteration:
|
||||
pass
|
||||
|
||||
|
||||
class TestGetRedis:
|
||||
def test_get_redis_returns_none_in_process_mode(self):
|
||||
from main import get_redis
|
||||
# In test setup, Redis is not configured
|
||||
assert get_redis() is None
|
||||
359
backend/tests/test_models.py
Normal file
359
backend/tests/test_models.py
Normal file
|
|
@ -0,0 +1,359 @@
|
|||
"""Tests for SQLAlchemy ORM models."""
|
||||
|
||||
import uuid
|
||||
from datetime import datetime, timezone
|
||||
|
||||
from sqlalchemy import create_engine, inspect
|
||||
from sqlalchemy.orm import Session
|
||||
|
||||
from models import (
|
||||
Base,
|
||||
Experiment,
|
||||
ExperimentStatus,
|
||||
Project,
|
||||
ResponseCache,
|
||||
Run,
|
||||
RunStatus,
|
||||
Score,
|
||||
StageResult,
|
||||
User,
|
||||
WebhookConfig,
|
||||
)
|
||||
|
||||
|
||||
def _engine():
|
||||
engine = create_engine("sqlite:///:memory:")
|
||||
Base.metadata.create_all(engine)
|
||||
return engine
|
||||
|
||||
|
||||
def _session(engine):
|
||||
return Session(engine)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Table existence
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_all_tables_created():
|
||||
engine = _engine()
|
||||
table_names = inspect(engine).get_table_names()
|
||||
expected = {
|
||||
"users",
|
||||
"projects",
|
||||
"experiments",
|
||||
"runs",
|
||||
"stage_results",
|
||||
"scores",
|
||||
"response_cache",
|
||||
"webhook_configs",
|
||||
}
|
||||
assert expected.issubset(set(table_names))
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# User
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_user_creation():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
user = User(username="admin", password_hash="hashed", is_admin=True)
|
||||
session.add(user)
|
||||
session.commit()
|
||||
|
||||
assert isinstance(user.id, uuid.UUID)
|
||||
assert user.username == "admin"
|
||||
assert user.is_admin is True
|
||||
assert isinstance(user.created_at, datetime)
|
||||
|
||||
|
||||
def test_user_username_unique():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
session.add(User(username="dup", password_hash="h1"))
|
||||
session.commit()
|
||||
session.add(User(username="dup", password_hash="h2"))
|
||||
try:
|
||||
session.commit()
|
||||
assert False, "Should have raised IntegrityError"
|
||||
except Exception:
|
||||
session.rollback()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Project
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_project_with_owner():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
user = User(username="owner", password_hash="h")
|
||||
project = Project(name="Test Project", description="A test", owner=user)
|
||||
session.add(project)
|
||||
session.commit()
|
||||
|
||||
assert project.owner_id == user.id
|
||||
assert project.name == "Test Project"
|
||||
assert isinstance(project.updated_at, datetime)
|
||||
|
||||
|
||||
def test_project_cascade_delete_from_user():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
user = User(username="owner", password_hash="h")
|
||||
project = Project(name="P1", owner=user)
|
||||
session.add(project)
|
||||
session.commit()
|
||||
project_id = project.id
|
||||
|
||||
session.delete(user)
|
||||
session.commit()
|
||||
|
||||
assert session.get(Project, project_id) is None
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Experiment
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_experiment_defaults():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
user = User(username="u", password_hash="h")
|
||||
project = Project(name="P", owner=user)
|
||||
exp = Experiment(
|
||||
project=project,
|
||||
name="Exp1",
|
||||
sample_data={"inputs": ["hello"]},
|
||||
pipeline_stages=[{"prompt": "test"}],
|
||||
scoring_config={"scorers": ["keyword"]},
|
||||
parameter_space={"temperature": [0.1, 0.5]},
|
||||
)
|
||||
session.add(exp)
|
||||
session.commit()
|
||||
|
||||
assert exp.status == ExperimentStatus.draft
|
||||
assert exp.sample_data == {"inputs": ["hello"]}
|
||||
assert isinstance(exp.created_at, datetime)
|
||||
|
||||
|
||||
def test_experiment_cascade_delete_from_project():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
user = User(username="u", password_hash="h")
|
||||
project = Project(name="P", owner=user)
|
||||
exp = Experiment(project=project, name="E")
|
||||
session.add(exp)
|
||||
session.commit()
|
||||
exp_id = exp.id
|
||||
|
||||
session.delete(project)
|
||||
session.commit()
|
||||
|
||||
assert session.get(Experiment, exp_id) is None
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Run
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_run_creation():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
user = User(username="u", password_hash="h")
|
||||
project = Project(name="P", owner=user)
|
||||
exp = Experiment(project=project, name="E")
|
||||
run = Run(
|
||||
experiment=exp,
|
||||
config_hash="a" * 64,
|
||||
config={"model": "gpt-4", "temperature": 0.5},
|
||||
status=RunStatus.completed,
|
||||
duration_ms=1200,
|
||||
tokens_in=100,
|
||||
tokens_out=50,
|
||||
)
|
||||
session.add(run)
|
||||
session.commit()
|
||||
|
||||
assert run.status == RunStatus.completed
|
||||
assert run.config["model"] == "gpt-4"
|
||||
|
||||
|
||||
def test_run_default_status():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
user = User(username="u", password_hash="h")
|
||||
project = Project(name="P", owner=user)
|
||||
exp = Experiment(project=project, name="E")
|
||||
run = Run(experiment=exp, config_hash="b" * 64, config={})
|
||||
session.add(run)
|
||||
session.commit()
|
||||
|
||||
assert run.status == RunStatus.pending
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# StageResult
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_stage_result():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
user = User(username="u", password_hash="h")
|
||||
project = Project(name="P", owner=user)
|
||||
exp = Experiment(project=project, name="E")
|
||||
run = Run(experiment=exp, config_hash="c" * 64, config={})
|
||||
sr = StageResult(
|
||||
run=run,
|
||||
stage_index=0,
|
||||
prompt_sent="Hello",
|
||||
response_raw="World",
|
||||
model_used="gpt-4",
|
||||
parameters={"temperature": 0.5},
|
||||
tokens_in=10,
|
||||
tokens_out=5,
|
||||
latency_ms=200,
|
||||
)
|
||||
session.add(sr)
|
||||
session.commit()
|
||||
|
||||
assert sr.stage_index == 0
|
||||
assert sr.model_used == "gpt-4"
|
||||
assert len(run.stage_results) == 1
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Score
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_score():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
user = User(username="u", password_hash="h")
|
||||
project = Project(name="P", owner=user)
|
||||
exp = Experiment(project=project, name="E")
|
||||
run = Run(experiment=exp, config_hash="d" * 64, config={})
|
||||
score = Score(
|
||||
run=run,
|
||||
scorer_name="embedding_similarity",
|
||||
value=0.87,
|
||||
scorer_metadata={"reference_id": "ref1"},
|
||||
)
|
||||
session.add(score)
|
||||
session.commit()
|
||||
|
||||
assert score.value == 0.87
|
||||
assert score.scorer_name == "embedding_similarity"
|
||||
assert len(run.scores) == 1
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# ResponseCache
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_response_cache():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
cache = ResponseCache(
|
||||
config_hash="e" * 64,
|
||||
response="cached response",
|
||||
model="gpt-4",
|
||||
tokens_in=50,
|
||||
tokens_out=25,
|
||||
latency_ms=300,
|
||||
)
|
||||
session.add(cache)
|
||||
session.commit()
|
||||
|
||||
fetched = session.get(ResponseCache, "e" * 64)
|
||||
assert fetched is not None
|
||||
assert fetched.response == "cached response"
|
||||
|
||||
|
||||
def test_response_cache_pk_is_config_hash():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
session.add(
|
||||
ResponseCache(config_hash="f" * 64, response="r1", model="m1")
|
||||
)
|
||||
session.commit()
|
||||
session.add(
|
||||
ResponseCache(config_hash="f" * 64, response="r2", model="m2")
|
||||
)
|
||||
try:
|
||||
session.commit()
|
||||
assert False, "Should have raised IntegrityError"
|
||||
except Exception:
|
||||
session.rollback()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# WebhookConfig
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_webhook_config():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
wh = WebhookConfig(
|
||||
event_type="experiment.completed",
|
||||
url="https://example.com/hook",
|
||||
headers={"Authorization": "Bearer token"},
|
||||
is_active=True,
|
||||
)
|
||||
session.add(wh)
|
||||
session.commit()
|
||||
|
||||
assert isinstance(wh.id, uuid.UUID)
|
||||
assert wh.event_type == "experiment.completed"
|
||||
assert wh.is_active is True
|
||||
|
||||
|
||||
def test_webhook_config_default_active():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
wh = WebhookConfig(
|
||||
event_type="run.failed",
|
||||
url="https://example.com/hook",
|
||||
)
|
||||
session.add(wh)
|
||||
session.commit()
|
||||
|
||||
assert wh.is_active is True
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Relationship cascades: Run → StageResult + Score
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_run_cascade_deletes_children():
|
||||
engine = _engine()
|
||||
with _session(engine) as session:
|
||||
user = User(username="u", password_hash="h")
|
||||
project = Project(name="P", owner=user)
|
||||
exp = Experiment(project=project, name="E")
|
||||
run = Run(experiment=exp, config_hash="g" * 64, config={})
|
||||
sr = StageResult(
|
||||
run=run, stage_index=0, prompt_sent="p",
|
||||
response_raw="r", model_used="m",
|
||||
)
|
||||
score = Score(run=run, scorer_name="test", value=0.5)
|
||||
session.add_all([run, sr, score])
|
||||
session.commit()
|
||||
|
||||
sr_id, score_id = sr.id, score.id
|
||||
session.delete(run)
|
||||
session.commit()
|
||||
|
||||
assert session.get(StageResult, sr_id) is None
|
||||
assert session.get(Score, score_id) is None
|
||||
224
backend/tests/test_routers.py
Normal file
224
backend/tests/test_routers.py
Normal file
|
|
@ -0,0 +1,224 @@
|
|||
"""Tests for router stubs — verify all routes are mounted and return 501."""
|
||||
|
||||
import pytest
|
||||
from fastapi.testclient import TestClient
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def client(tmp_path, monkeypatch):
|
||||
"""Create a test client with a temporary database."""
|
||||
monkeypatch.setenv("DATA_DIR", str(tmp_path))
|
||||
monkeypatch.setenv("DATABASE_URL", "")
|
||||
monkeypatch.setenv("REDIS_URL", "")
|
||||
|
||||
# Reload config to pick up test env
|
||||
import importlib
|
||||
import config as config_mod
|
||||
importlib.reload(config_mod)
|
||||
|
||||
import main as main_mod
|
||||
importlib.reload(main_mod)
|
||||
|
||||
with TestClient(main_mod.app) as c:
|
||||
yield c
|
||||
|
||||
|
||||
# ---- Auth router (/api/auth) ----
|
||||
|
||||
def test_auth_setup(client):
|
||||
resp = client.post("/api/auth/setup")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_auth_login(client):
|
||||
resp = client.post("/api/auth/login")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_auth_me(client):
|
||||
resp = client.get("/api/auth/me")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
# ---- Projects router (/api/projects) ----
|
||||
|
||||
def test_projects_list(client):
|
||||
resp = client.get("/api/projects/")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_projects_create(client):
|
||||
resp = client.post("/api/projects/")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_projects_get(client):
|
||||
resp = client.get("/api/projects/00000000-0000-0000-0000-000000000001")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_projects_update(client):
|
||||
resp = client.put("/api/projects/00000000-0000-0000-0000-000000000001")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_projects_delete(client):
|
||||
resp = client.delete("/api/projects/00000000-0000-0000-0000-000000000001")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
# ---- Experiments router (/api/experiments) ----
|
||||
|
||||
def test_experiments_list(client):
|
||||
resp = client.get("/api/experiments/")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_experiments_create(client):
|
||||
resp = client.post("/api/experiments/")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_experiments_get(client):
|
||||
resp = client.get("/api/experiments/00000000-0000-0000-0000-000000000001")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_experiments_update(client):
|
||||
resp = client.put("/api/experiments/00000000-0000-0000-0000-000000000001")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_experiments_delete(client):
|
||||
resp = client.delete("/api/experiments/00000000-0000-0000-0000-000000000001")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_experiments_sweep(client):
|
||||
resp = client.post("/api/experiments/00000000-0000-0000-0000-000000000001/sweep")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_experiments_pause(client):
|
||||
resp = client.post("/api/experiments/00000000-0000-0000-0000-000000000001/pause")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_experiments_resume(client):
|
||||
resp = client.post("/api/experiments/00000000-0000-0000-0000-000000000001/resume")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_experiments_stop(client):
|
||||
resp = client.post("/api/experiments/00000000-0000-0000-0000-000000000001/stop")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
# ---- Runs router (/api/runs) ----
|
||||
|
||||
def test_runs_list(client):
|
||||
resp = client.get("/api/runs/experiments/00000000-0000-0000-0000-000000000001/runs")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_runs_get(client):
|
||||
resp = client.get("/api/runs/00000000-0000-0000-0000-000000000001")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_runs_create(client):
|
||||
resp = client.post("/api/runs/")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_runs_score(client):
|
||||
resp = client.post("/api/runs/00000000-0000-0000-0000-000000000001/score")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_runs_leaderboard(client):
|
||||
resp = client.get("/api/runs/experiments/00000000-0000-0000-0000-000000000001/leaderboard")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
# ---- Endpoints router (/api/endpoints) ----
|
||||
|
||||
def test_endpoints_list(client):
|
||||
resp = client.get("/api/endpoints/")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_endpoints_create(client):
|
||||
resp = client.post("/api/endpoints/")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_endpoints_update(client):
|
||||
resp = client.put("/api/endpoints/00000000-0000-0000-0000-000000000001")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_endpoints_delete(client):
|
||||
resp = client.delete("/api/endpoints/00000000-0000-0000-0000-000000000001")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_endpoints_test(client):
|
||||
resp = client.post("/api/endpoints/00000000-0000-0000-0000-000000000001/test")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
# ---- Export router (/api/export) ----
|
||||
|
||||
def test_export_best(client):
|
||||
resp = client.get("/api/export/experiments/00000000-0000-0000-0000-000000000001/best")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_export_env(client):
|
||||
resp = client.get("/api/export/experiments/00000000-0000-0000-0000-000000000001/env")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_export_yaml(client):
|
||||
resp = client.get("/api/export/experiments/00000000-0000-0000-0000-000000000001/yaml")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_export_report(client):
|
||||
resp = client.get("/api/export/experiments/00000000-0000-0000-0000-000000000001/report")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
# ---- Webhooks router (/api/webhooks) ----
|
||||
|
||||
def test_webhooks_list(client):
|
||||
resp = client.get("/api/webhooks/")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_webhooks_create(client):
|
||||
resp = client.post("/api/webhooks/")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_webhooks_delete(client):
|
||||
resp = client.delete("/api/webhooks/00000000-0000-0000-0000-000000000001")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
# ---- Admin router (/api/admin) ----
|
||||
|
||||
def test_admin_get_settings(client):
|
||||
resp = client.get("/api/admin/settings")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_admin_update_settings(client):
|
||||
resp = client.put("/api/admin/settings")
|
||||
assert resp.status_code == 501
|
||||
|
||||
|
||||
def test_admin_stats(client):
|
||||
resp = client.get("/api/admin/stats")
|
||||
assert resp.status_code == 501
|
||||
339
backend/tests/test_schemas.py
Normal file
339
backend/tests/test_schemas.py
Normal file
|
|
@ -0,0 +1,339 @@
|
|||
"""Tests for backend/schemas.py."""
|
||||
|
||||
import uuid
|
||||
from datetime import datetime, timezone
|
||||
|
||||
import pytest
|
||||
from pydantic import ValidationError
|
||||
|
||||
from models import ExperimentStatus, RunStatus
|
||||
from schemas import (
|
||||
EndpointCreate,
|
||||
EndpointResponse,
|
||||
EndpointUpdate,
|
||||
ExperimentCreate,
|
||||
ExperimentResponse,
|
||||
ExperimentUpdate,
|
||||
ExportResponse,
|
||||
ExportRunRow,
|
||||
HealthResponse,
|
||||
LoginRequest,
|
||||
ProjectCreate,
|
||||
ProjectResponse,
|
||||
ProjectUpdate,
|
||||
RunDetailResponse,
|
||||
RunResponse,
|
||||
ScoreInput,
|
||||
ScoreResponse,
|
||||
SetupRequest,
|
||||
StageResultResponse,
|
||||
TokenResponse,
|
||||
UserResponse,
|
||||
WebhookCreate,
|
||||
WebhookResponse,
|
||||
WebhookUpdate,
|
||||
)
|
||||
|
||||
|
||||
NOW = datetime.now(timezone.utc)
|
||||
UUID1 = uuid.uuid4()
|
||||
UUID2 = uuid.uuid4()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Project schemas
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestProjectSchemas:
|
||||
def test_create_valid(self) -> None:
|
||||
p = ProjectCreate(name="My Project", description="desc")
|
||||
assert p.name == "My Project"
|
||||
assert p.description == "desc"
|
||||
|
||||
def test_create_name_required(self) -> None:
|
||||
with pytest.raises(ValidationError):
|
||||
ProjectCreate() # type: ignore[call-arg]
|
||||
|
||||
def test_create_empty_name_rejected(self) -> None:
|
||||
with pytest.raises(ValidationError):
|
||||
ProjectCreate(name="")
|
||||
|
||||
def test_update_partial(self) -> None:
|
||||
p = ProjectUpdate(name="New Name")
|
||||
assert p.name == "New Name"
|
||||
assert p.description is None
|
||||
|
||||
def test_response_from_attributes(self) -> None:
|
||||
class Fake:
|
||||
id = UUID1
|
||||
name = "Proj"
|
||||
description = None
|
||||
owner_id = UUID2
|
||||
created_at = NOW
|
||||
updated_at = NOW
|
||||
|
||||
r = ProjectResponse.model_validate(Fake())
|
||||
assert r.id == UUID1
|
||||
assert r.name == "Proj"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Experiment schemas
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestExperimentSchemas:
|
||||
def test_create_minimal(self) -> None:
|
||||
e = ExperimentCreate(name="Exp 1")
|
||||
assert e.name == "Exp 1"
|
||||
assert e.sample_data is None
|
||||
|
||||
def test_create_with_all_fields(self) -> None:
|
||||
e = ExperimentCreate(
|
||||
name="Full",
|
||||
description="desc",
|
||||
sample_data={"key": "value"},
|
||||
pipeline_stages={"stages": []},
|
||||
scoring_config={"scorer": "exact"},
|
||||
parameter_space={"temp": [0.5, 1.0]},
|
||||
)
|
||||
assert e.parameter_space == {"temp": [0.5, 1.0]}
|
||||
|
||||
def test_update_status(self) -> None:
|
||||
e = ExperimentUpdate(status=ExperimentStatus.running)
|
||||
assert e.status == ExperimentStatus.running
|
||||
|
||||
def test_response_from_attributes(self) -> None:
|
||||
class Fake:
|
||||
id = UUID1
|
||||
project_id = UUID2
|
||||
name = "Exp"
|
||||
description = None
|
||||
sample_data = None
|
||||
pipeline_stages = None
|
||||
scoring_config = None
|
||||
parameter_space = None
|
||||
status = ExperimentStatus.draft
|
||||
created_at = NOW
|
||||
updated_at = NOW
|
||||
|
||||
r = ExperimentResponse.model_validate(Fake())
|
||||
assert r.status == ExperimentStatus.draft
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Run schemas
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestRunSchemas:
|
||||
def test_response_from_attributes(self) -> None:
|
||||
class Fake:
|
||||
id = UUID1
|
||||
experiment_id = UUID2
|
||||
config_hash = "abc123"
|
||||
config = {"model": "gpt-4"}
|
||||
status = RunStatus.completed
|
||||
started_at = NOW
|
||||
completed_at = NOW
|
||||
duration_ms = 1234
|
||||
tokens_in = 100
|
||||
tokens_out = 200
|
||||
cost_estimate = 0.003
|
||||
|
||||
r = RunResponse.model_validate(Fake())
|
||||
assert r.duration_ms == 1234
|
||||
assert r.cost_estimate == 0.003
|
||||
|
||||
def test_detail_response_nested(self) -> None:
|
||||
data = {
|
||||
"id": UUID1,
|
||||
"experiment_id": UUID2,
|
||||
"config_hash": "abc",
|
||||
"config": {},
|
||||
"status": RunStatus.pending,
|
||||
"started_at": None,
|
||||
"completed_at": None,
|
||||
"duration_ms": None,
|
||||
"tokens_in": None,
|
||||
"tokens_out": None,
|
||||
"cost_estimate": None,
|
||||
"stage_results": [],
|
||||
"scores": [],
|
||||
}
|
||||
r = RunDetailResponse(**data)
|
||||
assert r.stage_results == []
|
||||
assert r.scores == []
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Score schemas
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestScoreSchemas:
|
||||
def test_input_valid(self) -> None:
|
||||
s = ScoreInput(scorer_name="exact_match", value=0.95, metadata={"note": "ok"})
|
||||
assert s.value == 0.95
|
||||
assert s.metadata == {"note": "ok"}
|
||||
|
||||
def test_input_missing_name(self) -> None:
|
||||
with pytest.raises(ValidationError):
|
||||
ScoreInput(value=0.5) # type: ignore[call-arg]
|
||||
|
||||
def test_response_from_attributes(self) -> None:
|
||||
class Fake:
|
||||
id = UUID1
|
||||
run_id = UUID2
|
||||
scorer_name = "bleu"
|
||||
value = 0.8
|
||||
scorer_metadata = {"n": 4}
|
||||
created_at = NOW
|
||||
|
||||
r = ScoreResponse.model_validate(Fake())
|
||||
assert r.scorer_metadata == {"n": 4}
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Endpoint schemas
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestEndpointSchemas:
|
||||
def test_create_valid(self) -> None:
|
||||
e = EndpointCreate(name="OpenAI", url="https://api.openai.com/v1")
|
||||
assert e.api_key is None
|
||||
|
||||
def test_create_empty_name_rejected(self) -> None:
|
||||
with pytest.raises(ValidationError):
|
||||
EndpointCreate(name="", url="https://example.com")
|
||||
|
||||
def test_update_partial(self) -> None:
|
||||
e = EndpointUpdate(url="https://new-url.com")
|
||||
assert e.name is None
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Webhook schemas
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestWebhookSchemas:
|
||||
def test_create_valid(self) -> None:
|
||||
w = WebhookCreate(
|
||||
event_type="run.completed",
|
||||
url="https://hooks.example.com/promptlooper",
|
||||
headers={"Authorization": "Bearer xyz"},
|
||||
)
|
||||
assert w.is_active is True
|
||||
|
||||
def test_create_inactive(self) -> None:
|
||||
w = WebhookCreate(
|
||||
event_type="run.failed",
|
||||
url="https://example.com",
|
||||
is_active=False,
|
||||
)
|
||||
assert w.is_active is False
|
||||
|
||||
def test_update_partial(self) -> None:
|
||||
w = WebhookUpdate(is_active=False)
|
||||
assert w.event_type is None
|
||||
assert w.is_active is False
|
||||
|
||||
def test_response_from_attributes(self) -> None:
|
||||
class Fake:
|
||||
id = UUID1
|
||||
event_type = "run.completed"
|
||||
url = "https://example.com"
|
||||
headers = None
|
||||
is_active = True
|
||||
|
||||
r = WebhookResponse.model_validate(Fake())
|
||||
assert r.event_type == "run.completed"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Auth schemas
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestAuthSchemas:
|
||||
def test_setup_password_min_length(self) -> None:
|
||||
with pytest.raises(ValidationError):
|
||||
SetupRequest(username="admin", password="short")
|
||||
|
||||
def test_setup_valid(self) -> None:
|
||||
s = SetupRequest(username="admin", password="securepass123")
|
||||
assert s.username == "admin"
|
||||
|
||||
def test_login_valid(self) -> None:
|
||||
l = LoginRequest(username="user", password="pass")
|
||||
assert l.username == "user"
|
||||
|
||||
def test_token_response(self) -> None:
|
||||
t = TokenResponse(access_token="jwt.token.here")
|
||||
assert t.token_type == "bearer"
|
||||
|
||||
def test_user_response_from_attributes(self) -> None:
|
||||
class Fake:
|
||||
id = UUID1
|
||||
username = "admin"
|
||||
is_admin = True
|
||||
created_at = NOW
|
||||
|
||||
r = UserResponse.model_validate(Fake())
|
||||
assert r.is_admin is True
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Export schemas
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestExportSchemas:
|
||||
def test_export_run_row(self) -> None:
|
||||
row = ExportRunRow(
|
||||
run_id=UUID1,
|
||||
experiment_id=UUID2,
|
||||
config_hash="abc",
|
||||
config={"model": "gpt-4"},
|
||||
status=RunStatus.completed,
|
||||
duration_ms=500,
|
||||
tokens_in=10,
|
||||
tokens_out=20,
|
||||
cost_estimate=0.001,
|
||||
scores={"exact_match": 1.0, "bleu": 0.85},
|
||||
)
|
||||
assert row.scores["bleu"] == 0.85
|
||||
|
||||
def test_export_run_row_default_scores(self) -> None:
|
||||
row = ExportRunRow(
|
||||
run_id=UUID1,
|
||||
experiment_id=UUID2,
|
||||
config_hash="abc",
|
||||
config={},
|
||||
status=RunStatus.pending,
|
||||
)
|
||||
assert row.scores == {}
|
||||
|
||||
def test_export_response(self) -> None:
|
||||
r = ExportResponse(
|
||||
experiment_id=UUID1,
|
||||
experiment_name="Test Exp",
|
||||
rows=[],
|
||||
)
|
||||
assert r.rows == []
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Health schema
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestHealthSchema:
|
||||
def test_health_response(self) -> None:
|
||||
h = HealthResponse(database=True, redis=False)
|
||||
assert h.status == "ok"
|
||||
assert h.database is True
|
||||
assert h.redis is False
|
||||
138
backend/tests/test_stack_integration.py
Normal file
138
backend/tests/test_stack_integration.py
Normal file
|
|
@ -0,0 +1,138 @@
|
|||
"""Stack integration verification tests.
|
||||
|
||||
These tests verify that all configuration files needed for 'docker compose up'
|
||||
are present, consistent, and well-formed. They do NOT start actual containers.
|
||||
"""
|
||||
|
||||
import os
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
|
||||
ROOT = Path(__file__).resolve().parents[2] # repo root
|
||||
|
||||
|
||||
class TestDockerComposeConfig:
|
||||
"""Verify docker-compose.yml references are satisfied."""
|
||||
|
||||
def test_docker_compose_exists(self):
|
||||
assert (ROOT / "docker-compose.yml").is_file()
|
||||
|
||||
def test_dockerfile_exists(self):
|
||||
assert (ROOT / "docker" / "Dockerfile").is_file()
|
||||
|
||||
def test_nginx_conf_exists(self):
|
||||
assert (ROOT / "docker" / "nginx.conf").is_file()
|
||||
|
||||
def test_entrypoint_exists(self):
|
||||
assert (ROOT / "docker" / "entrypoint.sh").is_file()
|
||||
|
||||
def test_requirements_txt_exists(self):
|
||||
assert (ROOT / "backend" / "requirements.txt").is_file()
|
||||
|
||||
def test_alembic_ini_exists(self):
|
||||
assert (ROOT / "alembic.ini").is_file()
|
||||
|
||||
def test_alembic_env_exists(self):
|
||||
assert (ROOT / "alembic" / "env.py").is_file()
|
||||
|
||||
def test_alembic_has_migration(self):
|
||||
versions = list((ROOT / "alembic" / "versions").glob("*.py"))
|
||||
assert len(versions) >= 1, "Expected at least one Alembic migration"
|
||||
|
||||
|
||||
class TestDockerfileConsistency:
|
||||
"""Verify Dockerfile references match actual files."""
|
||||
|
||||
def test_dockerfile_copies_backend(self):
|
||||
content = (ROOT / "docker" / "Dockerfile").read_text()
|
||||
assert "COPY backend/" in content
|
||||
|
||||
def test_dockerfile_copies_alembic(self):
|
||||
content = (ROOT / "docker" / "Dockerfile").read_text()
|
||||
assert "COPY alembic/" in content
|
||||
assert "COPY alembic.ini" in content
|
||||
|
||||
def test_dockerfile_copies_entrypoint(self):
|
||||
content = (ROOT / "docker" / "Dockerfile").read_text()
|
||||
assert "entrypoint.sh" in content
|
||||
|
||||
def test_dockerfile_runs_migrations_via_entrypoint(self):
|
||||
content = (ROOT / "docker" / "entrypoint.sh").read_text()
|
||||
assert "alembic upgrade head" in content
|
||||
|
||||
|
||||
class TestNginxConfig:
|
||||
"""Verify nginx proxies correctly."""
|
||||
|
||||
def test_nginx_proxies_api(self):
|
||||
content = (ROOT / "docker" / "nginx.conf").read_text()
|
||||
assert "proxy_pass http://promptlooper-api:8000" in content
|
||||
|
||||
def test_nginx_proxies_websocket(self):
|
||||
content = (ROOT / "docker" / "nginx.conf").read_text()
|
||||
assert "upgrade" in content.lower()
|
||||
|
||||
def test_nginx_serves_spa_fallback(self):
|
||||
content = (ROOT / "docker" / "nginx.conf").read_text()
|
||||
assert "try_files" in content
|
||||
assert "/index.html" in content
|
||||
|
||||
|
||||
class TestFrontendBuildability:
|
||||
"""Verify frontend has all files needed for a build."""
|
||||
|
||||
def test_package_json_exists(self):
|
||||
assert (ROOT / "frontend" / "package.json").is_file()
|
||||
|
||||
def test_index_html_exists(self):
|
||||
assert (ROOT / "frontend" / "index.html").is_file()
|
||||
|
||||
def test_main_tsx_exists(self):
|
||||
assert (ROOT / "frontend" / "src" / "main.tsx").is_file()
|
||||
|
||||
def test_app_tsx_exists(self):
|
||||
assert (ROOT / "frontend" / "src" / "App.tsx").is_file()
|
||||
|
||||
def test_all_page_components_exist(self):
|
||||
pages = [
|
||||
"SetupPage", "LoginPage", "DashboardPage", "ProjectsPage",
|
||||
"ExperimentPage", "LivePage", "ComparePage", "AdminPage",
|
||||
]
|
||||
for page in pages:
|
||||
assert (ROOT / "frontend" / "src" / "pages" / f"{page}.tsx").is_file(), f"Missing {page}.tsx"
|
||||
|
||||
def test_vite_config_exists(self):
|
||||
assert (ROOT / "frontend" / "vite.config.ts").is_file()
|
||||
|
||||
def test_tailwind_config_exists(self):
|
||||
assert (ROOT / "frontend" / "tailwind.config.js").is_file()
|
||||
|
||||
|
||||
class TestWorkerConfig:
|
||||
"""Verify Celery worker module exists and is importable."""
|
||||
|
||||
def test_worker_module_exists(self):
|
||||
assert (ROOT / "backend" / "worker.py").is_file()
|
||||
|
||||
|
||||
class TestHealthEndpoint:
|
||||
"""Verify /health endpoint works in test mode."""
|
||||
|
||||
def test_health_returns_ok(self):
|
||||
from fastapi.testclient import TestClient
|
||||
|
||||
# Ensure backend is importable
|
||||
import sys
|
||||
backend_dir = str(ROOT / "backend")
|
||||
if backend_dir not in sys.path:
|
||||
sys.path.insert(0, backend_dir)
|
||||
|
||||
from main import app
|
||||
client = TestClient(app)
|
||||
resp = client.get("/health")
|
||||
assert resp.status_code == 200
|
||||
data = resp.json()
|
||||
assert data["status"] in ("ok", "degraded")
|
||||
assert "database" in data
|
||||
assert "redis" in data
|
||||
47
backend/tests/test_worker.py
Normal file
47
backend/tests/test_worker.py
Normal file
|
|
@ -0,0 +1,47 @@
|
|||
"""Tests for backend/worker.py — Celery configuration."""
|
||||
|
||||
import importlib
|
||||
import sys
|
||||
from unittest.mock import patch
|
||||
|
||||
|
||||
def test_celery_app_is_importable():
|
||||
"""worker.py exports a celery_app instance."""
|
||||
# Need to ensure config module is importable
|
||||
backend_dir = str(__import__("pathlib").Path(__file__).resolve().parents[1])
|
||||
if backend_dir not in sys.path:
|
||||
sys.path.insert(0, backend_dir)
|
||||
|
||||
import worker
|
||||
assert hasattr(worker, "celery_app")
|
||||
assert worker.celery_app.main == "promptlooper"
|
||||
|
||||
|
||||
def test_celery_app_serializer_settings():
|
||||
"""Verify JSON serialization is configured."""
|
||||
backend_dir = str(__import__("pathlib").Path(__file__).resolve().parents[1])
|
||||
if backend_dir not in sys.path:
|
||||
sys.path.insert(0, backend_dir)
|
||||
|
||||
import worker
|
||||
assert worker.celery_app.conf.task_serializer == "json"
|
||||
assert worker.celery_app.conf.result_serializer == "json"
|
||||
|
||||
|
||||
def test_celery_defaults_to_memory_broker_without_redis():
|
||||
"""Without REDIS_URL, broker falls back to memory://."""
|
||||
backend_dir = str(__import__("pathlib").Path(__file__).resolve().parents[1])
|
||||
if backend_dir not in sys.path:
|
||||
sys.path.insert(0, backend_dir)
|
||||
|
||||
with patch.dict("os.environ", {"REDIS_URL": ""}, clear=False):
|
||||
# Force reload to pick up env change
|
||||
if "config" in sys.modules:
|
||||
importlib.reload(sys.modules["config"])
|
||||
if "worker" in sys.modules:
|
||||
importlib.reload(sys.modules["worker"])
|
||||
|
||||
import worker
|
||||
# In no-redis mode, broker should be memory://
|
||||
# (may have been set from settings.redis_url == None)
|
||||
assert worker.celery_app is not None
|
||||
0
backend/websocket/__init__.py
Normal file
0
backend/websocket/__init__.py
Normal file
30
backend/worker.py
Normal file
30
backend/worker.py
Normal file
|
|
@ -0,0 +1,30 @@
|
|||
"""PromptLooper Celery worker configuration."""
|
||||
|
||||
from celery import Celery
|
||||
|
||||
from config import settings
|
||||
|
||||
# Determine broker and backend URLs
|
||||
broker_url = settings.redis_url or "memory://"
|
||||
result_backend = settings.redis_url or "cache+memory://"
|
||||
|
||||
celery_app = Celery(
|
||||
"promptlooper",
|
||||
broker=broker_url,
|
||||
backend=result_backend,
|
||||
)
|
||||
|
||||
celery_app.conf.update(
|
||||
task_serializer="json",
|
||||
accept_content=["json"],
|
||||
result_serializer="json",
|
||||
timezone="UTC",
|
||||
enable_utc=True,
|
||||
worker_concurrency=settings.max_concurrent_runs,
|
||||
task_track_started=True,
|
||||
task_acks_late=True,
|
||||
worker_prefetch_multiplier=1,
|
||||
)
|
||||
|
||||
# Auto-discover tasks in engine package
|
||||
celery_app.autodiscover_tasks(["engine"], force=True)
|
||||
108
docker-compose.yml
Normal file
108
docker-compose.yml
Normal file
|
|
@ -0,0 +1,108 @@
|
|||
name: xpltd_promptlooper
|
||||
|
||||
networks:
|
||||
promptlooper:
|
||||
driver: bridge
|
||||
ipam:
|
||||
config:
|
||||
- subnet: 172.33.0.0/24
|
||||
|
||||
services:
|
||||
promptlooper-db:
|
||||
image: postgres:16-alpine
|
||||
container_name: promptlooper-db
|
||||
restart: unless-stopped
|
||||
networks:
|
||||
- promptlooper
|
||||
ports:
|
||||
- "5434:5432"
|
||||
environment:
|
||||
POSTGRES_USER: promptlooper
|
||||
POSTGRES_PASSWORD: promptlooper
|
||||
POSTGRES_DB: promptlooper
|
||||
volumes:
|
||||
- /vmPool/r/services/promptlooper_db:/var/lib/postgresql/data
|
||||
healthcheck:
|
||||
test: ["CMD-SHELL", "pg_isready -U promptlooper"]
|
||||
interval: 10s
|
||||
timeout: 5s
|
||||
retries: 5
|
||||
|
||||
promptlooper-redis:
|
||||
image: redis:7-alpine
|
||||
container_name: promptlooper-redis
|
||||
restart: unless-stopped
|
||||
networks:
|
||||
- promptlooper
|
||||
volumes:
|
||||
- /vmPool/r/services/promptlooper_redis:/data
|
||||
healthcheck:
|
||||
test: ["CMD", "redis-cli", "ping"]
|
||||
interval: 10s
|
||||
timeout: 5s
|
||||
retries: 5
|
||||
|
||||
promptlooper-api:
|
||||
build:
|
||||
context: .
|
||||
dockerfile: docker/Dockerfile
|
||||
target: api
|
||||
container_name: promptlooper-api
|
||||
restart: unless-stopped
|
||||
networks:
|
||||
- promptlooper
|
||||
ports:
|
||||
- "8401:8401" # MCP server
|
||||
environment:
|
||||
DATABASE_URL: postgresql://promptlooper:promptlooper@promptlooper-db:5432/promptlooper
|
||||
REDIS_URL: redis://promptlooper-redis:6379/0
|
||||
JWT_SECRET: ${JWT_SECRET:-dev-secret-change-in-production}
|
||||
API_KEY: ${API_KEY:-}
|
||||
DEFAULT_ENDPOINT_URL: ${DEFAULT_ENDPOINT_URL:-}
|
||||
DEFAULT_ENDPOINT_KEY: ${DEFAULT_ENDPOINT_KEY:-}
|
||||
MAX_CONCURRENT_RUNS: ${MAX_CONCURRENT_RUNS:-4}
|
||||
MAX_TOKENS_PER_SWEEP: ${MAX_TOKENS_PER_SWEEP:-0}
|
||||
MCP_ENABLED: ${MCP_ENABLED:-true}
|
||||
MCP_PORT: "8401"
|
||||
depends_on:
|
||||
promptlooper-db:
|
||||
condition: service_healthy
|
||||
promptlooper-redis:
|
||||
condition: service_healthy
|
||||
|
||||
promptlooper-worker:
|
||||
build:
|
||||
context: .
|
||||
dockerfile: docker/Dockerfile
|
||||
target: api
|
||||
container_name: promptlooper-worker
|
||||
restart: unless-stopped
|
||||
networks:
|
||||
- promptlooper
|
||||
command: celery -A worker:celery_app worker --loglevel=info --concurrency=${MAX_CONCURRENT_RUNS:-4}
|
||||
working_dir: /app/backend
|
||||
environment:
|
||||
DATABASE_URL: postgresql://promptlooper:promptlooper@promptlooper-db:5432/promptlooper
|
||||
REDIS_URL: redis://promptlooper-redis:6379/0
|
||||
DEFAULT_ENDPOINT_URL: ${DEFAULT_ENDPOINT_URL:-}
|
||||
DEFAULT_ENDPOINT_KEY: ${DEFAULT_ENDPOINT_KEY:-}
|
||||
MAX_CONCURRENT_RUNS: ${MAX_CONCURRENT_RUNS:-4}
|
||||
depends_on:
|
||||
promptlooper-db:
|
||||
condition: service_healthy
|
||||
promptlooper-redis:
|
||||
condition: service_healthy
|
||||
|
||||
promptlooper-web:
|
||||
build:
|
||||
context: .
|
||||
dockerfile: docker/Dockerfile
|
||||
target: web
|
||||
container_name: promptlooper-web
|
||||
restart: unless-stopped
|
||||
networks:
|
||||
- promptlooper
|
||||
ports:
|
||||
- "8400:80"
|
||||
depends_on:
|
||||
- promptlooper-api
|
||||
0
docker/.gitkeep
Normal file
0
docker/.gitkeep
Normal file
67
docker/Dockerfile
Normal file
67
docker/Dockerfile
Normal file
|
|
@ -0,0 +1,67 @@
|
|||
# =============================================================================
|
||||
# Stage 1: Frontend build
|
||||
# =============================================================================
|
||||
FROM node:20-alpine AS frontend-build
|
||||
|
||||
WORKDIR /build
|
||||
|
||||
COPY frontend/package.json frontend/package-lock.json* ./
|
||||
RUN npm ci || npm install
|
||||
|
||||
COPY frontend/ ./
|
||||
RUN npm run build
|
||||
|
||||
# =============================================================================
|
||||
# Stage 2: Python API runtime
|
||||
# =============================================================================
|
||||
FROM python:3.12-slim AS api
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
# Install system dependencies for psycopg2 and general use
|
||||
RUN apt-get update && \
|
||||
apt-get install -y --no-install-recommends gcc libpq-dev curl && \
|
||||
rm -rf /var/lib/apt/lists/*
|
||||
|
||||
# Install Python dependencies
|
||||
COPY backend/requirements.txt /app/backend/requirements.txt
|
||||
RUN pip install --no-cache-dir -r /app/backend/requirements.txt
|
||||
|
||||
# Copy backend source
|
||||
COPY backend/ /app/backend/
|
||||
COPY alembic/ /app/alembic/
|
||||
COPY alembic.ini /app/alembic.ini
|
||||
|
||||
# Copy frontend build for single-container mode
|
||||
COPY --from=frontend-build /build/dist /app/static
|
||||
|
||||
# Create data directory for SQLite mode
|
||||
RUN mkdir -p /data
|
||||
|
||||
ENV PYTHONPATH=/app/backend
|
||||
ENV DATA_DIR=/data
|
||||
|
||||
# Entrypoint runs migrations then starts the app
|
||||
COPY docker/entrypoint.sh /app/entrypoint.sh
|
||||
RUN chmod +x /app/entrypoint.sh
|
||||
|
||||
EXPOSE 8000 8401
|
||||
|
||||
# Default: run migrations then start the API server
|
||||
ENTRYPOINT ["/app/entrypoint.sh"]
|
||||
|
||||
# =============================================================================
|
||||
# Stage 3: Nginx frontend (production compose)
|
||||
# =============================================================================
|
||||
FROM nginx:1.27-alpine AS web
|
||||
|
||||
# Remove default config
|
||||
RUN rm /etc/nginx/conf.d/default.conf
|
||||
|
||||
# Copy custom nginx config
|
||||
COPY docker/nginx.conf /etc/nginx/conf.d/default.conf
|
||||
|
||||
# Copy built frontend assets
|
||||
COPY --from=frontend-build /build/dist /usr/share/nginx/html
|
||||
|
||||
EXPOSE 80
|
||||
10
docker/entrypoint.sh
Normal file
10
docker/entrypoint.sh
Normal file
|
|
@ -0,0 +1,10 @@
|
|||
#!/bin/sh
|
||||
set -e
|
||||
|
||||
# Run database migrations
|
||||
echo "Running database migrations..."
|
||||
cd /app && alembic upgrade head
|
||||
|
||||
# Start the application
|
||||
echo "Starting PromptLooper API..."
|
||||
exec uvicorn main:app --host 0.0.0.0 --port 8000 --app-dir /app/backend "$@"
|
||||
44
docker/nginx.conf
Normal file
44
docker/nginx.conf
Normal file
|
|
@ -0,0 +1,44 @@
|
|||
server {
|
||||
listen 80;
|
||||
server_name _;
|
||||
|
||||
root /usr/share/nginx/html;
|
||||
index index.html;
|
||||
|
||||
# Frontend static assets
|
||||
location / {
|
||||
try_files $uri $uri/ /index.html;
|
||||
}
|
||||
|
||||
# API proxy
|
||||
location /api/ {
|
||||
proxy_pass http://promptlooper-api:8000;
|
||||
proxy_set_header Host $host;
|
||||
proxy_set_header X-Real-IP $remote_addr;
|
||||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
||||
proxy_set_header X-Forwarded-Proto $scheme;
|
||||
}
|
||||
|
||||
# Health endpoint proxy
|
||||
location /health {
|
||||
proxy_pass http://promptlooper-api:8000;
|
||||
proxy_set_header Host $host;
|
||||
}
|
||||
|
||||
# WebSocket proxy
|
||||
location /ws/ {
|
||||
proxy_pass http://promptlooper-api:8000;
|
||||
proxy_http_version 1.1;
|
||||
proxy_set_header Upgrade $http_upgrade;
|
||||
proxy_set_header Connection "upgrade";
|
||||
proxy_set_header Host $host;
|
||||
proxy_set_header X-Real-IP $remote_addr;
|
||||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
||||
proxy_read_timeout 86400;
|
||||
}
|
||||
|
||||
# Gzip compression
|
||||
gzip on;
|
||||
gzip_types text/plain text/css application/json application/javascript text/xml application/xml text/javascript;
|
||||
gzip_min_length 256;
|
||||
}
|
||||
23
env.example
Normal file
23
env.example
Normal file
|
|
@ -0,0 +1,23 @@
|
|||
# PromptLooper — Environment Configuration
|
||||
# Copy to .env and fill in required values
|
||||
|
||||
# ── Database ──────────────────────────────────────────────
|
||||
POSTGRES_USER=promptlooper
|
||||
POSTGRES_PASSWORD= # REQUIRED: set a strong password
|
||||
POSTGRES_DB=promptlooper
|
||||
|
||||
# ── Auth ──────────────────────────────────────────────────
|
||||
JWT_SECRET= # REQUIRED: generate with `openssl rand -hex 32`
|
||||
|
||||
# ── Default LLM Endpoint (optional) ──────────────────────
|
||||
# Pre-configure an LLM endpoint so users don't have to add one manually
|
||||
DEFAULT_ENDPOINT_URL= # e.g. http://chat.forgetyour.name/api/v1
|
||||
DEFAULT_ENDPOINT_KEY= # API key for the default endpoint
|
||||
|
||||
# ── Limits ────────────────────────────────────────────────
|
||||
MAX_CONCURRENT_RUNS=4 # Parallel run limit per sweep
|
||||
MAX_TOKENS_PER_SWEEP=0 # 0 = unlimited; set a number to cap token spend
|
||||
|
||||
# ── MCP Server ────────────────────────────────────────────
|
||||
MCP_ENABLED=true # Enable/disable MCP server for agent access
|
||||
# MCP_PORT=8401 # MCP server port (set in docker-compose)
|
||||
12
frontend/index.html
Normal file
12
frontend/index.html
Normal file
|
|
@ -0,0 +1,12 @@
|
|||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||||
<title>PromptLooper</title>
|
||||
</head>
|
||||
<body>
|
||||
<div id="root"></div>
|
||||
<script type="module" src="/src/main.tsx"></script>
|
||||
</body>
|
||||
</html>
|
||||
3948
frontend/package-lock.json
generated
Normal file
3948
frontend/package-lock.json
generated
Normal file
File diff suppressed because it is too large
Load diff
31
frontend/package.json
Normal file
31
frontend/package.json
Normal file
|
|
@ -0,0 +1,31 @@
|
|||
{
|
||||
"name": "promptlooper-frontend",
|
||||
"private": true,
|
||||
"version": "0.1.0",
|
||||
"type": "module",
|
||||
"scripts": {
|
||||
"dev": "vite",
|
||||
"build": "tsc && vite build",
|
||||
"preview": "vite preview",
|
||||
"test": "vitest run"
|
||||
},
|
||||
"dependencies": {
|
||||
"react": "^18.3.1",
|
||||
"react-dom": "^18.3.1",
|
||||
"react-router-dom": "^6.28.0"
|
||||
},
|
||||
"devDependencies": {
|
||||
"@testing-library/jest-dom": "^6.9.1",
|
||||
"@testing-library/react": "^16.3.2",
|
||||
"@types/react": "^18.3.12",
|
||||
"@types/react-dom": "^18.3.1",
|
||||
"@vitejs/plugin-react": "^4.3.4",
|
||||
"autoprefixer": "^10.4.20",
|
||||
"jsdom": "^29.0.2",
|
||||
"postcss": "^8.4.49",
|
||||
"tailwindcss": "^3.4.15",
|
||||
"typescript": "^5.6.3",
|
||||
"vite": "^6.0.0",
|
||||
"vitest": "^4.1.2"
|
||||
}
|
||||
}
|
||||
6
frontend/postcss.config.js
Normal file
6
frontend/postcss.config.js
Normal file
|
|
@ -0,0 +1,6 @@
|
|||
export default {
|
||||
plugins: {
|
||||
tailwindcss: {},
|
||||
autoprefixer: {},
|
||||
},
|
||||
};
|
||||
59
frontend/src/App.test.tsx
Normal file
59
frontend/src/App.test.tsx
Normal file
|
|
@ -0,0 +1,59 @@
|
|||
import { render, screen } from "@testing-library/react";
|
||||
import { MemoryRouter } from "react-router-dom";
|
||||
import { describe, it, expect } from "vitest";
|
||||
import App from "./App";
|
||||
|
||||
function renderWithRouter(route: string) {
|
||||
return render(
|
||||
<MemoryRouter initialEntries={[route]}>
|
||||
<App />
|
||||
</MemoryRouter>,
|
||||
);
|
||||
}
|
||||
|
||||
describe("App routing", () => {
|
||||
it("renders SetupPage at /setup", () => {
|
||||
renderWithRouter("/setup");
|
||||
expect(screen.getByText("PromptLooper Setup")).toBeInTheDocument();
|
||||
});
|
||||
|
||||
it("renders LoginPage at /login", () => {
|
||||
renderWithRouter("/login");
|
||||
expect(screen.getByText("Sign In")).toBeInTheDocument();
|
||||
});
|
||||
|
||||
it("renders DashboardPage at /", () => {
|
||||
renderWithRouter("/");
|
||||
expect(screen.getByText("Dashboard")).toBeInTheDocument();
|
||||
});
|
||||
|
||||
it("renders ProjectsPage at /projects", () => {
|
||||
renderWithRouter("/projects");
|
||||
expect(screen.getByText("Projects")).toBeInTheDocument();
|
||||
});
|
||||
|
||||
it("renders ExperimentPage at /experiments/:id", () => {
|
||||
renderWithRouter("/experiments/abc-123");
|
||||
expect(screen.getByText("Experiment")).toBeInTheDocument();
|
||||
});
|
||||
|
||||
it("renders LivePage at /live/:id", () => {
|
||||
renderWithRouter("/live/abc-123");
|
||||
expect(screen.getByText("Live")).toBeInTheDocument();
|
||||
});
|
||||
|
||||
it("renders ComparePage at /compare", () => {
|
||||
renderWithRouter("/compare");
|
||||
expect(screen.getByText("Compare")).toBeInTheDocument();
|
||||
});
|
||||
|
||||
it("renders AdminPage at /admin", () => {
|
||||
renderWithRouter("/admin");
|
||||
expect(screen.getByText("Admin")).toBeInTheDocument();
|
||||
});
|
||||
|
||||
it("redirects unknown routes to dashboard", () => {
|
||||
renderWithRouter("/nonexistent");
|
||||
expect(screen.getByText("Dashboard")).toBeInTheDocument();
|
||||
});
|
||||
});
|
||||
25
frontend/src/App.tsx
Normal file
25
frontend/src/App.tsx
Normal file
|
|
@ -0,0 +1,25 @@
|
|||
import { Routes, Route, Navigate } from "react-router-dom";
|
||||
import SetupPage from "./pages/SetupPage";
|
||||
import LoginPage from "./pages/LoginPage";
|
||||
import DashboardPage from "./pages/DashboardPage";
|
||||
import ProjectsPage from "./pages/ProjectsPage";
|
||||
import ExperimentPage from "./pages/ExperimentPage";
|
||||
import LivePage from "./pages/LivePage";
|
||||
import ComparePage from "./pages/ComparePage";
|
||||
import AdminPage from "./pages/AdminPage";
|
||||
|
||||
export default function App() {
|
||||
return (
|
||||
<Routes>
|
||||
<Route path="/setup" element={<SetupPage />} />
|
||||
<Route path="/login" element={<LoginPage />} />
|
||||
<Route path="/" element={<DashboardPage />} />
|
||||
<Route path="/projects" element={<ProjectsPage />} />
|
||||
<Route path="/experiments/:id" element={<ExperimentPage />} />
|
||||
<Route path="/live/:id" element={<LivePage />} />
|
||||
<Route path="/compare" element={<ComparePage />} />
|
||||
<Route path="/admin" element={<AdminPage />} />
|
||||
<Route path="*" element={<Navigate to="/" replace />} />
|
||||
</Routes>
|
||||
);
|
||||
}
|
||||
552
frontend/src/api/client.test.ts
Normal file
552
frontend/src/api/client.test.ts
Normal file
|
|
@ -0,0 +1,552 @@
|
|||
import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
|
||||
import {
|
||||
setToken,
|
||||
getToken,
|
||||
clearToken,
|
||||
ApiError,
|
||||
auth,
|
||||
projects,
|
||||
experiments,
|
||||
runs,
|
||||
endpoints,
|
||||
exportApi,
|
||||
webhooks,
|
||||
admin,
|
||||
health,
|
||||
connectWebSocket,
|
||||
} from "./client";
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Mock fetch
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const mockFetch = vi.fn();
|
||||
|
||||
beforeEach(() => {
|
||||
mockFetch.mockReset();
|
||||
vi.stubGlobal("fetch", mockFetch);
|
||||
clearToken();
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
vi.restoreAllMocks();
|
||||
});
|
||||
|
||||
function jsonResponse(body: unknown, status = 200): Response {
|
||||
return {
|
||||
ok: status >= 200 && status < 300,
|
||||
status,
|
||||
statusText: status === 200 ? "OK" : "Error",
|
||||
json: () => Promise.resolve(body),
|
||||
text: () => Promise.resolve(JSON.stringify(body)),
|
||||
headers: new Headers(),
|
||||
} as unknown as Response;
|
||||
}
|
||||
|
||||
function noContentResponse(): Response {
|
||||
return {
|
||||
ok: true,
|
||||
status: 204,
|
||||
statusText: "No Content",
|
||||
json: () => Promise.reject(new Error("no body")),
|
||||
text: () => Promise.resolve(""),
|
||||
headers: new Headers(),
|
||||
} as unknown as Response;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Token management
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("token management", () => {
|
||||
it("starts with null token", () => {
|
||||
expect(getToken()).toBeNull();
|
||||
});
|
||||
|
||||
it("sets and gets token", () => {
|
||||
setToken("abc123");
|
||||
expect(getToken()).toBe("abc123");
|
||||
});
|
||||
|
||||
it("clears token", () => {
|
||||
setToken("abc123");
|
||||
clearToken();
|
||||
expect(getToken()).toBeNull();
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Auth header injection
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("auth header injection", () => {
|
||||
it("sends Authorization header when token is set", async () => {
|
||||
setToken("my-jwt");
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ status: "ok" }));
|
||||
|
||||
await health.check();
|
||||
|
||||
const [, init] = mockFetch.mock.calls[0] as [string, RequestInit];
|
||||
expect((init.headers as Record<string, string>)["Authorization"]).toBe(
|
||||
"Bearer my-jwt",
|
||||
);
|
||||
});
|
||||
|
||||
it("omits Authorization header when no token", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ status: "ok" }));
|
||||
|
||||
await health.check();
|
||||
|
||||
const [, init] = mockFetch.mock.calls[0] as [string, RequestInit];
|
||||
expect(
|
||||
(init.headers as Record<string, string>)["Authorization"],
|
||||
).toBeUndefined();
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// ApiError
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("ApiError", () => {
|
||||
it("throws ApiError on non-ok response", async () => {
|
||||
mockFetch.mockResolvedValueOnce(
|
||||
jsonResponse({ detail: "not found" }, 404),
|
||||
);
|
||||
|
||||
await expect(projects.get("some-id")).rejects.toThrow(ApiError);
|
||||
|
||||
try {
|
||||
mockFetch.mockResolvedValueOnce(
|
||||
jsonResponse({ detail: "bad" }, 400),
|
||||
);
|
||||
await projects.get("some-id");
|
||||
} catch (e) {
|
||||
expect(e).toBeInstanceOf(ApiError);
|
||||
expect((e as ApiError).status).toBe(400);
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Content-Type header
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("content-type", () => {
|
||||
it("sets Content-Type for POST with body", async () => {
|
||||
mockFetch.mockResolvedValueOnce(
|
||||
jsonResponse({ access_token: "tok", token_type: "bearer" }),
|
||||
);
|
||||
|
||||
await auth.setup({ username: "admin", password: "password123" });
|
||||
|
||||
const [, init] = mockFetch.mock.calls[0] as [string, RequestInit];
|
||||
expect((init.headers as Record<string, string>)["Content-Type"]).toBe(
|
||||
"application/json",
|
||||
);
|
||||
});
|
||||
|
||||
it("omits Content-Type for GET requests", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ items: [], total: 0 }));
|
||||
|
||||
await projects.list();
|
||||
|
||||
const [, init] = mockFetch.mock.calls[0] as [string, RequestInit];
|
||||
expect(
|
||||
(init.headers as Record<string, string>)["Content-Type"],
|
||||
).toBeUndefined();
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Health
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("health", () => {
|
||||
it("calls /health", async () => {
|
||||
mockFetch.mockResolvedValueOnce(
|
||||
jsonResponse({ status: "ok", database: true, redis: true }),
|
||||
);
|
||||
|
||||
const result = await health.check();
|
||||
|
||||
expect(mockFetch).toHaveBeenCalledWith("/health", expect.anything());
|
||||
expect(result.status).toBe("ok");
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Auth endpoints
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("auth", () => {
|
||||
it("setup POSTs to /api/auth/setup", async () => {
|
||||
mockFetch.mockResolvedValueOnce(
|
||||
jsonResponse({ access_token: "tok", token_type: "bearer" }),
|
||||
);
|
||||
|
||||
const result = await auth.setup({
|
||||
username: "admin",
|
||||
password: "password123",
|
||||
});
|
||||
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/auth/setup",
|
||||
expect.anything(),
|
||||
);
|
||||
expect(result.access_token).toBe("tok");
|
||||
});
|
||||
|
||||
it("login sets token automatically", async () => {
|
||||
mockFetch.mockResolvedValueOnce(
|
||||
jsonResponse({ access_token: "jwt-123", token_type: "bearer" }),
|
||||
);
|
||||
|
||||
await auth.login({ username: "admin", password: "pass" });
|
||||
|
||||
expect(getToken()).toBe("jwt-123");
|
||||
});
|
||||
|
||||
it("me GETs /api/auth/me", async () => {
|
||||
mockFetch.mockResolvedValueOnce(
|
||||
jsonResponse({
|
||||
id: "u1",
|
||||
username: "admin",
|
||||
is_admin: true,
|
||||
created_at: "2026-01-01T00:00:00Z",
|
||||
}),
|
||||
);
|
||||
|
||||
const user = await auth.me();
|
||||
expect(user.username).toBe("admin");
|
||||
});
|
||||
|
||||
it("logout clears token", () => {
|
||||
setToken("tok");
|
||||
auth.logout();
|
||||
expect(getToken()).toBeNull();
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Projects
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("projects", () => {
|
||||
it("list GETs /api/projects/", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ items: [], total: 0 }));
|
||||
await projects.list();
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/projects/",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("create POSTs to /api/projects/", async () => {
|
||||
mockFetch.mockResolvedValueOnce(
|
||||
jsonResponse({ id: "p1", name: "Test" }),
|
||||
);
|
||||
await projects.create({ name: "Test" });
|
||||
const [, init] = mockFetch.mock.calls[0] as [string, RequestInit];
|
||||
expect(init.method).toBe("POST");
|
||||
expect(JSON.parse(init.body as string)).toEqual({ name: "Test" });
|
||||
});
|
||||
|
||||
it("get fetches by id", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ id: "p1" }));
|
||||
await projects.get("p1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/projects/p1",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("update PUTs by id", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ id: "p1" }));
|
||||
await projects.update("p1", { name: "New" });
|
||||
const [url, init] = mockFetch.mock.calls[0] as [string, RequestInit];
|
||||
expect(url).toBe("/api/projects/p1");
|
||||
expect(init.method).toBe("PUT");
|
||||
});
|
||||
|
||||
it("delete DELETEs by id", async () => {
|
||||
mockFetch.mockResolvedValueOnce(noContentResponse());
|
||||
await projects.delete("p1");
|
||||
const [url, init] = mockFetch.mock.calls[0] as [string, RequestInit];
|
||||
expect(url).toBe("/api/projects/p1");
|
||||
expect(init.method).toBe("DELETE");
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Experiments
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("experiments", () => {
|
||||
it("list GETs /api/experiments/", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ items: [], total: 0 }));
|
||||
await experiments.list();
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/experiments/",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("startSweep POSTs to sweep endpoint", async () => {
|
||||
mockFetch.mockResolvedValueOnce(noContentResponse());
|
||||
await experiments.startSweep("e1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/experiments/e1/sweep",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("pause POSTs to pause endpoint", async () => {
|
||||
mockFetch.mockResolvedValueOnce(noContentResponse());
|
||||
await experiments.pause("e1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/experiments/e1/pause",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("resume POSTs to resume endpoint", async () => {
|
||||
mockFetch.mockResolvedValueOnce(noContentResponse());
|
||||
await experiments.resume("e1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/experiments/e1/resume",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("stop POSTs to stop endpoint", async () => {
|
||||
mockFetch.mockResolvedValueOnce(noContentResponse());
|
||||
await experiments.stop("e1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/experiments/e1/stop",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Runs
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("runs", () => {
|
||||
it("list GETs runs for experiment", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ items: [], total: 0 }));
|
||||
await runs.list("e1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/runs/experiments/e1/runs",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("get fetches run detail", async () => {
|
||||
mockFetch.mockResolvedValueOnce(
|
||||
jsonResponse({ id: "r1", stage_results: [], scores: [] }),
|
||||
);
|
||||
await runs.get("r1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/runs/r1",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("score POSTs to run score endpoint", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ id: "s1" }));
|
||||
await runs.score("r1", { scorer_name: "human", value: 0.9 });
|
||||
const [url, init] = mockFetch.mock.calls[0] as [string, RequestInit];
|
||||
expect(url).toBe("/api/runs/r1/score");
|
||||
expect(init.method).toBe("POST");
|
||||
});
|
||||
|
||||
it("leaderboard GETs leaderboard", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ items: [], total: 0 }));
|
||||
await runs.leaderboard("e1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/runs/experiments/e1/leaderboard",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Endpoints
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("endpoints", () => {
|
||||
it("list GETs /api/endpoints/", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ items: [], total: 0 }));
|
||||
await endpoints.list();
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/endpoints/",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("test POSTs to test endpoint", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ models: [] }));
|
||||
await endpoints.test("ep1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/endpoints/ep1/test",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Export
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("exportApi", () => {
|
||||
it("best GETs best config", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({}));
|
||||
await exportApi.best("e1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/export/experiments/e1/best",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("env GETs env export", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse("KEY=val"));
|
||||
await exportApi.env("e1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/export/experiments/e1/env",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("report GETs report", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse("# Report"));
|
||||
await exportApi.report("e1");
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/export/experiments/e1/report",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Webhooks
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("webhooks", () => {
|
||||
it("list GETs /api/webhooks/", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ items: [], total: 0 }));
|
||||
await webhooks.list();
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/webhooks/",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("create POSTs webhook", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({ id: "w1" }));
|
||||
await webhooks.create({ event_type: "run.complete", url: "http://x" });
|
||||
const [, init] = mockFetch.mock.calls[0] as [string, RequestInit];
|
||||
expect(init.method).toBe("POST");
|
||||
});
|
||||
|
||||
it("delete DELETEs webhook", async () => {
|
||||
mockFetch.mockResolvedValueOnce(noContentResponse());
|
||||
await webhooks.delete("w1");
|
||||
const [url, init] = mockFetch.mock.calls[0] as [string, RequestInit];
|
||||
expect(url).toBe("/api/webhooks/w1");
|
||||
expect(init.method).toBe("DELETE");
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Admin
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("admin", () => {
|
||||
it("getSettings GETs /api/admin/settings", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({}));
|
||||
await admin.getSettings();
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/admin/settings",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("updateSettings PUTs /api/admin/settings", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({}));
|
||||
await admin.updateSettings({ guest_access: true });
|
||||
const [, init] = mockFetch.mock.calls[0] as [string, RequestInit];
|
||||
expect(init.method).toBe("PUT");
|
||||
});
|
||||
|
||||
it("getStats GETs /api/admin/stats", async () => {
|
||||
mockFetch.mockResolvedValueOnce(jsonResponse({}));
|
||||
await admin.getStats();
|
||||
expect(mockFetch).toHaveBeenCalledWith(
|
||||
"/api/admin/stats",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// WebSocket helper
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
describe("connectWebSocket", () => {
|
||||
it("creates WebSocket with correct URL and handles messages", () => {
|
||||
const sendSpy = vi.fn();
|
||||
const closeSpy = vi.fn();
|
||||
let capturedInstance: {
|
||||
onmessage: ((ev: { data: string }) => void) | null;
|
||||
onclose: (() => void) | null;
|
||||
readyState: number;
|
||||
};
|
||||
|
||||
// Use a class constructor so `new WebSocket(...)` works
|
||||
class MockWebSocket {
|
||||
static OPEN = 1;
|
||||
readyState = 1;
|
||||
onmessage: ((ev: { data: string }) => void) | null = null;
|
||||
onclose: (() => void) | null = null;
|
||||
send = sendSpy;
|
||||
close = closeSpy;
|
||||
constructor(public url: string) {
|
||||
capturedInstance = this;
|
||||
}
|
||||
}
|
||||
|
||||
vi.stubGlobal("WebSocket", MockWebSocket);
|
||||
|
||||
Object.defineProperty(window, "location", {
|
||||
value: { protocol: "http:", host: "localhost:5173" },
|
||||
writable: true,
|
||||
configurable: true,
|
||||
});
|
||||
|
||||
const onMessage = vi.fn();
|
||||
const onClose = vi.fn();
|
||||
const conn = connectWebSocket(onMessage, onClose);
|
||||
|
||||
expect(capturedInstance!.url).toBe("ws://localhost:5173/ws");
|
||||
|
||||
// Simulate incoming message
|
||||
capturedInstance!.onmessage!({ data: JSON.stringify({ type: "update" }) });
|
||||
expect(onMessage).toHaveBeenCalledWith({ type: "update" });
|
||||
|
||||
// Send message
|
||||
conn.send({ type: "ping" });
|
||||
expect(sendSpy).toHaveBeenCalledWith('{"type":"ping"}');
|
||||
|
||||
// Simulate close
|
||||
capturedInstance!.onclose!();
|
||||
expect(onClose).toHaveBeenCalled();
|
||||
|
||||
// Close from client
|
||||
conn.close();
|
||||
expect(closeSpy).toHaveBeenCalled();
|
||||
|
||||
vi.unstubAllGlobals();
|
||||
});
|
||||
});
|
||||
545
frontend/src/api/client.ts
Normal file
545
frontend/src/api/client.ts
Normal file
|
|
@ -0,0 +1,545 @@
|
|||
/**
|
||||
* PromptLooper typed API client.
|
||||
*
|
||||
* - JWT token stored in memory (never localStorage) for security.
|
||||
* - Automatic Authorization header injection.
|
||||
* - Typed wrapper functions for every API endpoint group.
|
||||
* - WebSocket connection helper for real-time updates.
|
||||
*/
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Types — mirrors backend Pydantic schemas
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export interface ProjectCreate {
|
||||
name: string;
|
||||
description?: string | null;
|
||||
}
|
||||
|
||||
export interface ProjectUpdate {
|
||||
name?: string | null;
|
||||
description?: string | null;
|
||||
}
|
||||
|
||||
export interface ProjectResponse {
|
||||
id: string;
|
||||
name: string;
|
||||
description: string | null;
|
||||
owner_id: string;
|
||||
created_at: string;
|
||||
updated_at: string;
|
||||
}
|
||||
|
||||
export interface ProjectListResponse {
|
||||
items: ProjectResponse[];
|
||||
total: number;
|
||||
}
|
||||
|
||||
export interface ExperimentCreate {
|
||||
name: string;
|
||||
description?: string | null;
|
||||
sample_data?: Record<string, unknown> | null;
|
||||
pipeline_stages?: Record<string, unknown> | null;
|
||||
scoring_config?: Record<string, unknown> | null;
|
||||
parameter_space?: Record<string, unknown> | null;
|
||||
}
|
||||
|
||||
export interface ExperimentUpdate {
|
||||
name?: string | null;
|
||||
description?: string | null;
|
||||
sample_data?: Record<string, unknown> | null;
|
||||
pipeline_stages?: Record<string, unknown> | null;
|
||||
scoring_config?: Record<string, unknown> | null;
|
||||
parameter_space?: Record<string, unknown> | null;
|
||||
status?: string | null;
|
||||
}
|
||||
|
||||
export interface ExperimentResponse {
|
||||
id: string;
|
||||
project_id: string;
|
||||
name: string;
|
||||
description: string | null;
|
||||
sample_data: Record<string, unknown> | null;
|
||||
pipeline_stages: Record<string, unknown> | null;
|
||||
scoring_config: Record<string, unknown> | null;
|
||||
parameter_space: Record<string, unknown> | null;
|
||||
status: string;
|
||||
created_at: string;
|
||||
updated_at: string;
|
||||
}
|
||||
|
||||
export interface ExperimentListResponse {
|
||||
items: ExperimentResponse[];
|
||||
total: number;
|
||||
}
|
||||
|
||||
export interface RunResponse {
|
||||
id: string;
|
||||
experiment_id: string;
|
||||
config_hash: string;
|
||||
config: Record<string, unknown>;
|
||||
status: string;
|
||||
started_at: string | null;
|
||||
completed_at: string | null;
|
||||
duration_ms: number | null;
|
||||
tokens_in: number | null;
|
||||
tokens_out: number | null;
|
||||
cost_estimate: number | null;
|
||||
}
|
||||
|
||||
export interface RunListResponse {
|
||||
items: RunResponse[];
|
||||
total: number;
|
||||
}
|
||||
|
||||
export interface StageResultResponse {
|
||||
id: string;
|
||||
run_id: string;
|
||||
stage_index: number;
|
||||
prompt_sent: string;
|
||||
response_raw: string;
|
||||
model_used: string;
|
||||
parameters: Record<string, unknown> | null;
|
||||
tokens_in: number | null;
|
||||
tokens_out: number | null;
|
||||
latency_ms: number | null;
|
||||
}
|
||||
|
||||
export interface ScoreResponse {
|
||||
id: string;
|
||||
run_id: string;
|
||||
scorer_name: string;
|
||||
value: number;
|
||||
scorer_metadata: Record<string, unknown> | null;
|
||||
created_at: string;
|
||||
}
|
||||
|
||||
export interface RunDetailResponse extends RunResponse {
|
||||
stage_results: StageResultResponse[];
|
||||
scores: ScoreResponse[];
|
||||
}
|
||||
|
||||
export interface ScoreInput {
|
||||
scorer_name: string;
|
||||
value: number;
|
||||
metadata?: Record<string, unknown> | null;
|
||||
}
|
||||
|
||||
export interface EndpointCreate {
|
||||
name: string;
|
||||
url: string;
|
||||
api_key?: string | null;
|
||||
default_model?: string | null;
|
||||
}
|
||||
|
||||
export interface EndpointUpdate {
|
||||
name?: string | null;
|
||||
url?: string | null;
|
||||
api_key?: string | null;
|
||||
default_model?: string | null;
|
||||
}
|
||||
|
||||
export interface EndpointResponse {
|
||||
id: string;
|
||||
name: string;
|
||||
url: string;
|
||||
default_model: string | null;
|
||||
}
|
||||
|
||||
export interface EndpointListResponse {
|
||||
items: EndpointResponse[];
|
||||
total: number;
|
||||
}
|
||||
|
||||
export interface WebhookCreate {
|
||||
event_type: string;
|
||||
url: string;
|
||||
headers?: Record<string, string> | null;
|
||||
is_active?: boolean;
|
||||
}
|
||||
|
||||
export interface WebhookUpdate {
|
||||
event_type?: string | null;
|
||||
url?: string | null;
|
||||
headers?: Record<string, string> | null;
|
||||
is_active?: boolean | null;
|
||||
}
|
||||
|
||||
export interface WebhookResponse {
|
||||
id: string;
|
||||
event_type: string;
|
||||
url: string;
|
||||
headers: Record<string, string> | null;
|
||||
is_active: boolean;
|
||||
}
|
||||
|
||||
export interface WebhookListResponse {
|
||||
items: WebhookResponse[];
|
||||
total: number;
|
||||
}
|
||||
|
||||
export interface SetupRequest {
|
||||
username: string;
|
||||
password: string;
|
||||
}
|
||||
|
||||
export interface LoginRequest {
|
||||
username: string;
|
||||
password: string;
|
||||
}
|
||||
|
||||
export interface TokenResponse {
|
||||
access_token: string;
|
||||
token_type: string;
|
||||
}
|
||||
|
||||
export interface UserResponse {
|
||||
id: string;
|
||||
username: string;
|
||||
is_admin: boolean;
|
||||
created_at: string;
|
||||
}
|
||||
|
||||
export interface HealthResponse {
|
||||
status: string;
|
||||
database: boolean;
|
||||
redis: boolean;
|
||||
}
|
||||
|
||||
export interface ExportRunRow {
|
||||
run_id: string;
|
||||
experiment_id: string;
|
||||
config_hash: string;
|
||||
config: Record<string, unknown>;
|
||||
status: string;
|
||||
duration_ms: number | null;
|
||||
tokens_in: number | null;
|
||||
tokens_out: number | null;
|
||||
cost_estimate: number | null;
|
||||
scores: Record<string, number>;
|
||||
}
|
||||
|
||||
export interface ExportResponse {
|
||||
experiment_id: string;
|
||||
experiment_name: string;
|
||||
rows: ExportRunRow[];
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// API Error
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export class ApiError extends Error {
|
||||
constructor(
|
||||
public status: number,
|
||||
public statusText: string,
|
||||
public body: unknown,
|
||||
) {
|
||||
super(`API ${status}: ${statusText}`);
|
||||
this.name = "ApiError";
|
||||
}
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Token management (in-memory only)
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
let _accessToken: string | null = null;
|
||||
|
||||
export function setToken(token: string | null): void {
|
||||
_accessToken = token;
|
||||
}
|
||||
|
||||
export function getToken(): string | null {
|
||||
return _accessToken;
|
||||
}
|
||||
|
||||
export function clearToken(): void {
|
||||
_accessToken = null;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Base fetch wrapper
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const BASE_URL = ""; // Uses Vite proxy in dev; same origin in prod
|
||||
|
||||
async function request<T>(
|
||||
path: string,
|
||||
options: RequestInit = {},
|
||||
): Promise<T> {
|
||||
const headers: Record<string, string> = {
|
||||
...(options.headers as Record<string, string> | undefined),
|
||||
};
|
||||
|
||||
// Inject auth header
|
||||
if (_accessToken) {
|
||||
headers["Authorization"] = `Bearer ${_accessToken}`;
|
||||
}
|
||||
|
||||
// Default content-type for requests with bodies
|
||||
if (options.body && !headers["Content-Type"]) {
|
||||
headers["Content-Type"] = "application/json";
|
||||
}
|
||||
|
||||
const response = await fetch(`${BASE_URL}${path}`, {
|
||||
...options,
|
||||
headers,
|
||||
});
|
||||
|
||||
if (!response.ok) {
|
||||
let body: unknown;
|
||||
try {
|
||||
body = await response.json();
|
||||
} catch {
|
||||
body = await response.text();
|
||||
}
|
||||
throw new ApiError(response.status, response.statusText, body);
|
||||
}
|
||||
|
||||
// 204 No Content
|
||||
if (response.status === 204) {
|
||||
return undefined as T;
|
||||
}
|
||||
|
||||
return response.json() as Promise<T>;
|
||||
}
|
||||
|
||||
function get<T>(path: string): Promise<T> {
|
||||
return request<T>(path, { method: "GET" });
|
||||
}
|
||||
|
||||
function post<T>(path: string, body?: unknown): Promise<T> {
|
||||
return request<T>(path, {
|
||||
method: "POST",
|
||||
body: body != null ? JSON.stringify(body) : undefined,
|
||||
});
|
||||
}
|
||||
|
||||
function put<T>(path: string, body?: unknown): Promise<T> {
|
||||
return request<T>(path, {
|
||||
method: "PUT",
|
||||
body: body != null ? JSON.stringify(body) : undefined,
|
||||
});
|
||||
}
|
||||
|
||||
function del<T>(path: string): Promise<T> {
|
||||
return request<T>(path, { method: "DELETE" });
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Health
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export const health = {
|
||||
check: () => get<HealthResponse>("/health"),
|
||||
};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Auth
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export const auth = {
|
||||
setup: (data: SetupRequest) =>
|
||||
post<TokenResponse>("/api/auth/setup", data),
|
||||
|
||||
login: async (data: LoginRequest): Promise<TokenResponse> => {
|
||||
const resp = await post<TokenResponse>("/api/auth/login", data);
|
||||
setToken(resp.access_token);
|
||||
return resp;
|
||||
},
|
||||
|
||||
me: () => get<UserResponse>("/api/auth/me"),
|
||||
|
||||
logout: () => {
|
||||
clearToken();
|
||||
},
|
||||
};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Projects
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export const projects = {
|
||||
list: () => get<ProjectListResponse>("/api/projects/"),
|
||||
|
||||
create: (data: ProjectCreate) =>
|
||||
post<ProjectResponse>("/api/projects/", data),
|
||||
|
||||
get: (id: string) => get<ProjectResponse>(`/api/projects/${id}`),
|
||||
|
||||
update: (id: string, data: ProjectUpdate) =>
|
||||
put<ProjectResponse>(`/api/projects/${id}`, data),
|
||||
|
||||
delete: (id: string) => del<void>(`/api/projects/${id}`),
|
||||
};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Experiments
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export const experiments = {
|
||||
list: () => get<ExperimentListResponse>("/api/experiments/"),
|
||||
|
||||
create: (data: ExperimentCreate) =>
|
||||
post<ExperimentResponse>("/api/experiments/", data),
|
||||
|
||||
get: (id: string) => get<ExperimentResponse>(`/api/experiments/${id}`),
|
||||
|
||||
update: (id: string, data: ExperimentUpdate) =>
|
||||
put<ExperimentResponse>(`/api/experiments/${id}`, data),
|
||||
|
||||
delete: (id: string) => del<void>(`/api/experiments/${id}`),
|
||||
|
||||
startSweep: (id: string) =>
|
||||
post<void>(`/api/experiments/${id}/sweep`),
|
||||
|
||||
pause: (id: string) =>
|
||||
post<void>(`/api/experiments/${id}/pause`),
|
||||
|
||||
resume: (id: string) =>
|
||||
post<void>(`/api/experiments/${id}/resume`),
|
||||
|
||||
stop: (id: string) =>
|
||||
post<void>(`/api/experiments/${id}/stop`),
|
||||
};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Runs
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export const runs = {
|
||||
list: (experimentId: string) =>
|
||||
get<RunListResponse>(`/api/runs/experiments/${experimentId}/runs`),
|
||||
|
||||
get: (runId: string) =>
|
||||
get<RunDetailResponse>(`/api/runs/${runId}`),
|
||||
|
||||
create: (data: Record<string, unknown>) =>
|
||||
post<RunResponse>("/api/runs/", data),
|
||||
|
||||
score: (runId: string, data: ScoreInput) =>
|
||||
post<ScoreResponse>(`/api/runs/${runId}/score`, data),
|
||||
|
||||
leaderboard: (experimentId: string) =>
|
||||
get<RunListResponse>(
|
||||
`/api/runs/experiments/${experimentId}/leaderboard`,
|
||||
),
|
||||
};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Endpoints (LLM targets)
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export const endpoints = {
|
||||
list: () => get<EndpointListResponse>("/api/endpoints/"),
|
||||
|
||||
create: (data: EndpointCreate) =>
|
||||
post<EndpointResponse>("/api/endpoints/", data),
|
||||
|
||||
update: (id: string, data: EndpointUpdate) =>
|
||||
put<EndpointResponse>(`/api/endpoints/${id}`, data),
|
||||
|
||||
delete: (id: string) => del<void>(`/api/endpoints/${id}`),
|
||||
|
||||
test: (id: string) =>
|
||||
post<Record<string, unknown>>(`/api/endpoints/${id}/test`),
|
||||
};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Export
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export const exportApi = {
|
||||
best: (experimentId: string) =>
|
||||
get<Record<string, unknown>>(
|
||||
`/api/export/experiments/${experimentId}/best`,
|
||||
),
|
||||
|
||||
env: (experimentId: string) =>
|
||||
get<string>(`/api/export/experiments/${experimentId}/env`),
|
||||
|
||||
yaml: (experimentId: string) =>
|
||||
get<string>(`/api/export/experiments/${experimentId}/yaml`),
|
||||
|
||||
report: (experimentId: string) =>
|
||||
get<string>(`/api/export/experiments/${experimentId}/report`),
|
||||
};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Webhooks
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export const webhooks = {
|
||||
list: () => get<WebhookListResponse>("/api/webhooks/"),
|
||||
|
||||
create: (data: WebhookCreate) =>
|
||||
post<WebhookResponse>("/api/webhooks/", data),
|
||||
|
||||
delete: (id: string) => del<void>(`/api/webhooks/${id}`),
|
||||
};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Admin
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export const admin = {
|
||||
getSettings: () =>
|
||||
get<Record<string, unknown>>("/api/admin/settings"),
|
||||
|
||||
updateSettings: (data: Record<string, unknown>) =>
|
||||
put<Record<string, unknown>>("/api/admin/settings", data),
|
||||
|
||||
getStats: () => get<Record<string, unknown>>("/api/admin/stats"),
|
||||
};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// WebSocket helper
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export type WsMessageHandler = (data: unknown) => void;
|
||||
|
||||
export interface WsConnection {
|
||||
send: (data: unknown) => void;
|
||||
close: () => void;
|
||||
}
|
||||
|
||||
/**
|
||||
* Connect to the real-time WebSocket endpoint.
|
||||
*
|
||||
* @param onMessage Called for each incoming message.
|
||||
* @param onClose Optional callback when connection closes.
|
||||
* @returns Object with `send()` and `close()` methods.
|
||||
*/
|
||||
export function connectWebSocket(
|
||||
onMessage: WsMessageHandler,
|
||||
onClose?: () => void,
|
||||
): WsConnection {
|
||||
const protocol = window.location.protocol === "https:" ? "wss:" : "ws:";
|
||||
const wsUrl = `${protocol}//${window.location.host}/ws`;
|
||||
const ws = new WebSocket(wsUrl);
|
||||
|
||||
ws.onmessage = (event) => {
|
||||
try {
|
||||
const data: unknown = JSON.parse(event.data as string);
|
||||
onMessage(data);
|
||||
} catch {
|
||||
onMessage(event.data);
|
||||
}
|
||||
};
|
||||
|
||||
ws.onclose = () => {
|
||||
onClose?.();
|
||||
};
|
||||
|
||||
return {
|
||||
send: (data: unknown) => {
|
||||
if (ws.readyState === WebSocket.OPEN) {
|
||||
ws.send(JSON.stringify(data));
|
||||
}
|
||||
},
|
||||
close: () => {
|
||||
ws.close();
|
||||
},
|
||||
};
|
||||
}
|
||||
0
frontend/src/components/.gitkeep
Normal file
0
frontend/src/components/.gitkeep
Normal file
3
frontend/src/index.css
Normal file
3
frontend/src/index.css
Normal file
|
|
@ -0,0 +1,3 @@
|
|||
@tailwind base;
|
||||
@tailwind components;
|
||||
@tailwind utilities;
|
||||
13
frontend/src/main.tsx
Normal file
13
frontend/src/main.tsx
Normal file
|
|
@ -0,0 +1,13 @@
|
|||
import React from "react";
|
||||
import ReactDOM from "react-dom/client";
|
||||
import { BrowserRouter } from "react-router-dom";
|
||||
import App from "./App";
|
||||
import "./index.css";
|
||||
|
||||
ReactDOM.createRoot(document.getElementById("root")!).render(
|
||||
<React.StrictMode>
|
||||
<BrowserRouter>
|
||||
<App />
|
||||
</BrowserRouter>
|
||||
</React.StrictMode>,
|
||||
);
|
||||
8
frontend/src/pages/AdminPage.tsx
Normal file
8
frontend/src/pages/AdminPage.tsx
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
export default function AdminPage() {
|
||||
return (
|
||||
<div className="p-8">
|
||||
<h1 className="mb-4 text-2xl font-bold">Admin</h1>
|
||||
<p className="text-gray-600">System administration and user management.</p>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
8
frontend/src/pages/ComparePage.tsx
Normal file
8
frontend/src/pages/ComparePage.tsx
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
export default function ComparePage() {
|
||||
return (
|
||||
<div className="p-8">
|
||||
<h1 className="mb-4 text-2xl font-bold">Compare</h1>
|
||||
<p className="text-gray-600">Compare results across runs and experiments.</p>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
8
frontend/src/pages/DashboardPage.tsx
Normal file
8
frontend/src/pages/DashboardPage.tsx
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
export default function DashboardPage() {
|
||||
return (
|
||||
<div className="p-8">
|
||||
<h1 className="mb-4 text-2xl font-bold">Dashboard</h1>
|
||||
<p className="text-gray-600">Overview of recent experiments and runs.</p>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
8
frontend/src/pages/ExperimentPage.tsx
Normal file
8
frontend/src/pages/ExperimentPage.tsx
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
export default function ExperimentPage() {
|
||||
return (
|
||||
<div className="p-8">
|
||||
<h1 className="mb-4 text-2xl font-bold">Experiment</h1>
|
||||
<p className="text-gray-600">Configure and run prompt experiments.</p>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
8
frontend/src/pages/LivePage.tsx
Normal file
8
frontend/src/pages/LivePage.tsx
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
export default function LivePage() {
|
||||
return (
|
||||
<div className="p-8">
|
||||
<h1 className="mb-4 text-2xl font-bold">Live</h1>
|
||||
<p className="text-gray-600">Real-time experiment progress and results.</p>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
10
frontend/src/pages/LoginPage.tsx
Normal file
10
frontend/src/pages/LoginPage.tsx
Normal file
|
|
@ -0,0 +1,10 @@
|
|||
export default function LoginPage() {
|
||||
return (
|
||||
<div className="flex min-h-screen items-center justify-center bg-gray-50">
|
||||
<div className="w-full max-w-md rounded-lg bg-white p-8 shadow">
|
||||
<h1 className="mb-4 text-2xl font-bold">Sign In</h1>
|
||||
<p className="text-gray-600">Log in to PromptLooper.</p>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
8
frontend/src/pages/ProjectsPage.tsx
Normal file
8
frontend/src/pages/ProjectsPage.tsx
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
export default function ProjectsPage() {
|
||||
return (
|
||||
<div className="p-8">
|
||||
<h1 className="mb-4 text-2xl font-bold">Projects</h1>
|
||||
<p className="text-gray-600">Manage your prompt tuning projects.</p>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
10
frontend/src/pages/SetupPage.tsx
Normal file
10
frontend/src/pages/SetupPage.tsx
Normal file
|
|
@ -0,0 +1,10 @@
|
|||
export default function SetupPage() {
|
||||
return (
|
||||
<div className="flex min-h-screen items-center justify-center bg-gray-50">
|
||||
<div className="w-full max-w-md rounded-lg bg-white p-8 shadow">
|
||||
<h1 className="mb-4 text-2xl font-bold">PromptLooper Setup</h1>
|
||||
<p className="text-gray-600">Create your admin account to get started.</p>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
1
frontend/src/test-setup.ts
Normal file
1
frontend/src/test-setup.ts
Normal file
|
|
@ -0,0 +1 @@
|
|||
import "@testing-library/jest-dom/vitest";
|
||||
1
frontend/src/vite-env.d.ts
vendored
Normal file
1
frontend/src/vite-env.d.ts
vendored
Normal file
|
|
@ -0,0 +1 @@
|
|||
/// <reference types="vite/client" />
|
||||
8
frontend/tailwind.config.js
Normal file
8
frontend/tailwind.config.js
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
/** @type {import('tailwindcss').Config} */
|
||||
export default {
|
||||
content: ["./index.html", "./src/**/*.{js,ts,jsx,tsx}"],
|
||||
theme: {
|
||||
extend: {},
|
||||
},
|
||||
plugins: [],
|
||||
};
|
||||
21
frontend/tsconfig.json
Normal file
21
frontend/tsconfig.json
Normal file
|
|
@ -0,0 +1,21 @@
|
|||
{
|
||||
"compilerOptions": {
|
||||
"target": "ES2020",
|
||||
"useDefineForClassFields": true,
|
||||
"lib": ["ES2020", "DOM", "DOM.Iterable"],
|
||||
"module": "ESNext",
|
||||
"skipLibCheck": true,
|
||||
"moduleResolution": "bundler",
|
||||
"allowImportingTsExtensions": true,
|
||||
"isolatedModules": true,
|
||||
"moduleDetection": "force",
|
||||
"noEmit": true,
|
||||
"jsx": "react-jsx",
|
||||
"strict": true,
|
||||
"noUnusedLocals": true,
|
||||
"noUnusedParameters": true,
|
||||
"noFallthroughCasesInSwitch": true,
|
||||
"forceConsistentCasingInFileNames": true
|
||||
},
|
||||
"include": ["src"]
|
||||
}
|
||||
25
frontend/vite.config.ts
Normal file
25
frontend/vite.config.ts
Normal file
|
|
@ -0,0 +1,25 @@
|
|||
import { defineConfig } from "vite";
|
||||
import react from "@vitejs/plugin-react";
|
||||
|
||||
export default defineConfig({
|
||||
plugins: [react()],
|
||||
build: {
|
||||
outDir: "dist",
|
||||
},
|
||||
server: {
|
||||
port: 5173,
|
||||
proxy: {
|
||||
"/api": "http://localhost:8000",
|
||||
"/ws": {
|
||||
target: "ws://localhost:8000",
|
||||
ws: true,
|
||||
},
|
||||
"/health": "http://localhost:8000",
|
||||
},
|
||||
},
|
||||
test: {
|
||||
environment: "jsdom",
|
||||
globals: true,
|
||||
setupFiles: ["./src/test-setup.ts"],
|
||||
},
|
||||
});
|
||||
635
promptlooper-spec.md
Normal file
635
promptlooper-spec.md
Normal file
|
|
@ -0,0 +1,635 @@
|
|||
# PromptLooper
|
||||
|
||||
> The one who loops prompts — a universal LLM pipeline tuning workbench.
|
||||
|
||||
PromptLooper is a self-hosted tool for systematically optimizing LLM prompts, model selection, and inference parameters. It runs experiments across prompt × model × parameter combinations, caches every response, scores results against pluggable evaluation functions, and surfaces the best configurations through a real-time observability dashboard with human-in-the-loop steering.
|
||||
|
||||
It ships as a single Docker container (SQLite mode) for zero-config quickstart, or a Docker Compose stack (Postgres + Redis) for production use. An MCP server enables any AI agent to drive PromptLooper programmatically — creating experiments, running sweeps, and reading results without human intervention.
|
||||
|
||||
---
|
||||
|
||||
## Problem Statement
|
||||
|
||||
Anyone building LLM-powered applications faces the same painful loop:
|
||||
|
||||
1. Write a system prompt
|
||||
2. Pick a model and parameters (temperature, top_p, max_tokens, etc.)
|
||||
3. Run it against sample data
|
||||
4. Read the output and decide if it's "good enough"
|
||||
5. Tweak something and repeat
|
||||
|
||||
This process is manual, unscientific, and wasteful. There's no way to:
|
||||
- Systematically compare configurations side-by-side
|
||||
- Know if you've already tested a particular combination
|
||||
- Quantify "better" beyond gut feeling
|
||||
- Let an agent handle the iteration while you steer from above
|
||||
- Share optimized configurations between projects or team members
|
||||
|
||||
PromptLooper makes this process systematic, observable, cached, and agent-drivable.
|
||||
|
||||
---
|
||||
|
||||
## Target Users
|
||||
|
||||
| User | Use Case |
|
||||
|------|----------|
|
||||
| **Solo developer** | Tuning prompts for a side project, wants to try 5 models and find the sweet spot |
|
||||
| **Team building RAG pipelines** | Optimizing chunking + embedding + retrieval + synthesis prompts across stages |
|
||||
| **AI agent (via MCP)** | Autonomously running optimization sweeps, reporting back to human when done |
|
||||
| **Prompt engineer** | A/B testing prompt variants at scale with quantified scoring |
|
||||
| **Infrastructure team** | Benchmarking new models against existing baselines before migration |
|
||||
|
||||
---
|
||||
|
||||
## Core Concepts
|
||||
|
||||
### Experiment
|
||||
|
||||
A named configuration that defines:
|
||||
- **Sample data**: Input documents, queries, or any text the pipeline will process
|
||||
- **Pipeline stages**: 1-N sequential stages, each with its own prompt template and model config
|
||||
- **Evaluation criteria**: Scoring functions that grade the output
|
||||
- **Parameter space**: What to vary (prompt text, model, temperature, top_p, chunk_size, etc.)
|
||||
|
||||
### Run
|
||||
|
||||
A single execution of one specific configuration within an experiment. A run captures:
|
||||
- Full input configuration (prompt, model, all parameters)
|
||||
- Raw LLM response(s)
|
||||
- Timing data (latency, tokens in/out)
|
||||
- Evaluation scores
|
||||
- Configuration hash (for cache deduplication)
|
||||
|
||||
### Sweep
|
||||
|
||||
A batch of runs that systematically explores a parameter space. Types:
|
||||
- **Grid sweep**: Every combination of specified parameter values
|
||||
- **Random sweep**: Random sampling from parameter ranges
|
||||
- **Guided sweep**: Agent-driven, where results from previous runs inform the next configuration to try
|
||||
|
||||
### Scoring Function
|
||||
|
||||
A pluggable evaluation that takes (input, output, context) and returns a numeric score. Built-in options:
|
||||
- **Embedding similarity**: How semantically close is the output to a reference answer?
|
||||
- **Length compliance**: Does the output meet length constraints?
|
||||
- **Format compliance**: Does the output match expected structure (JSON, markdown, etc.)?
|
||||
- **Keyword presence**: Do required terms appear in the output?
|
||||
- **Human rating**: Manual thumbs-up/down or 1-5 star rating from the dashboard
|
||||
- **LLM-as-judge**: Use a separate LLM call to evaluate quality (configurable judge prompt)
|
||||
- **Custom function**: User-provided Python snippet or HTTP webhook
|
||||
|
||||
### Project
|
||||
|
||||
A workspace that groups related experiments. Users can return to a project and pick up where they left off. Projects store:
|
||||
- All experiments and their runs
|
||||
- Saved "best" configurations
|
||||
- Notes and annotations
|
||||
- Export history
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────────────────────────────────┐
|
||||
│ Docker Compose: xpltd_promptlooper (ub01) │
|
||||
│ Network: promptlooper (172.33.0.0/24) │
|
||||
│ │
|
||||
│ ┌────────────┐ ┌─────────────┐ ┌──────────────────────────────────┐ │
|
||||
│ │ PostgreSQL │ │ Redis │ │ FastAPI (API) │ │
|
||||
│ │ :5434 │ │ job queue │ │ Experiments, Runs, Scoring, │ │
|
||||
│ │ experiments│ │ pub/sub │ │ Projects, Auth, MCP Server │ │
|
||||
│ │ runs, cache│ │ live state │ │ WebSocket for live dashboard │ │
|
||||
│ └─────┬───────┘ └──────┬──────┘ └──────────────┬───────────────────┘ │
|
||||
│ │ │ │ │
|
||||
│ ┌─────┴─────────────────┴────────────────────────┴───────────────────┐ │
|
||||
│ │ Celery Worker │ │
|
||||
│ │ Executes runs against target LLM endpoints │ │
|
||||
│ │ Caches responses by config hash │ │
|
||||
│ │ Streams progress via Redis pub/sub │ │
|
||||
│ └────────────────────────────────────────────────────────────────────┘ │
|
||||
│ │
|
||||
│ ┌────────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ Web UI (React + Vite) │ │
|
||||
│ │ nginx → :8400 │ │
|
||||
│ │ Dashboard, Experiment Builder, Live Observability, Steering │ │
|
||||
│ └────────────────────────────────────────────────────────────────────┘ │
|
||||
└──────────────────────────────────────────────────────────────────────────┘
|
||||
│
|
||||
│ HTTP (OpenAI-compatible)
|
||||
▼
|
||||
┌───────────────────────────────┐
|
||||
│ Target LLM Endpoints │
|
||||
│ OpenWebUI, vLLM, Ollama, │
|
||||
│ OpenAI, Anthropic, any │
|
||||
│ OpenAI-compatible API │
|
||||
└───────────────────────────────┘
|
||||
```
|
||||
|
||||
### Services (Production Compose)
|
||||
|
||||
| Service | Image | Port | Purpose |
|
||||
|---------|-------|------|---------|
|
||||
| `promptlooper-db` | `postgres:16-alpine` | `5434 → 5432` | Primary data store |
|
||||
| `promptlooper-redis` | `redis:7-alpine` | — | Celery broker + pub/sub for live dashboard |
|
||||
| `promptlooper-api` | `Dockerfile` | `8000` | FastAPI REST API + MCP server |
|
||||
| `promptlooper-worker` | `Dockerfile` | — | Celery worker (run execution) |
|
||||
| `promptlooper-web` | `Dockerfile` | `8400 → 80` | React frontend (nginx) |
|
||||
|
||||
### Single Container Mode
|
||||
|
||||
When `DATABASE_URL` is not set, PromptLooper runs with:
|
||||
- SQLite at `/data/promptlooper.db`
|
||||
- In-process task queue (no Celery/Redis dependency)
|
||||
- All services in one container on port 8400
|
||||
|
||||
```bash
|
||||
docker run -p 8400:8400 -v promptlooper-data:/data ghcr.io/xpltdco/promptlooper
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Data Model
|
||||
|
||||
### User
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | UUID | PK |
|
||||
| username | string | Unique, "admin" created on first boot |
|
||||
| password_hash | string | bcrypt |
|
||||
| is_admin | bool | Default true for first user |
|
||||
| created_at | timestamp | |
|
||||
|
||||
### Project
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | UUID | PK |
|
||||
| name | string | |
|
||||
| description | text | Optional |
|
||||
| owner_id | UUID | FK → User |
|
||||
| created_at | timestamp | |
|
||||
| updated_at | timestamp | |
|
||||
|
||||
### Experiment
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | UUID | PK |
|
||||
| project_id | UUID | FK → Project |
|
||||
| name | string | |
|
||||
| description | text | Optional |
|
||||
| sample_data | JSONB | Input documents/queries |
|
||||
| pipeline_stages | JSONB | Stage definitions with prompt templates |
|
||||
| scoring_config | JSONB | Which scoring functions to use and their weights |
|
||||
| parameter_space | JSONB | What to vary and ranges/options |
|
||||
| status | enum | draft, running, paused, completed |
|
||||
| created_at | timestamp | |
|
||||
| updated_at | timestamp | |
|
||||
|
||||
### Run
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | UUID | PK |
|
||||
| experiment_id | UUID | FK → Experiment |
|
||||
| config_hash | string(64) | SHA-256 of full configuration (for cache dedup) |
|
||||
| config | JSONB | Complete configuration snapshot |
|
||||
| status | enum | pending, running, completed, failed, cached |
|
||||
| started_at | timestamp | |
|
||||
| completed_at | timestamp | |
|
||||
| duration_ms | int | Wall clock time |
|
||||
| tokens_in | int | Total input tokens across all stages |
|
||||
| tokens_out | int | Total output tokens |
|
||||
| cost_estimate | decimal | Estimated cost based on model pricing |
|
||||
|
||||
### StageResult
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | UUID | PK |
|
||||
| run_id | UUID | FK → Run |
|
||||
| stage_index | int | 0-based stage number |
|
||||
| prompt_sent | text | Actual prompt after template rendering |
|
||||
| response_raw | text | Raw LLM response |
|
||||
| model_used | string | Model identifier |
|
||||
| parameters | JSONB | Temperature, top_p, etc. |
|
||||
| tokens_in | int | This stage |
|
||||
| tokens_out | int | This stage |
|
||||
| latency_ms | int | This stage |
|
||||
|
||||
### Score
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | UUID | PK |
|
||||
| run_id | UUID | FK → Run |
|
||||
| scorer_name | string | e.g. "embedding_similarity", "human_rating" |
|
||||
| value | float | Normalized 0.0–1.0 |
|
||||
| metadata | JSONB | Scorer-specific details |
|
||||
| created_at | timestamp | |
|
||||
|
||||
### ResponseCache
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| config_hash | string(64) | PK — SHA-256 of (prompt + model + params + input) |
|
||||
| response | text | Cached LLM response |
|
||||
| model | string | |
|
||||
| tokens_in | int | |
|
||||
| tokens_out | int | |
|
||||
| latency_ms | int | Original latency |
|
||||
| created_at | timestamp | |
|
||||
|
||||
### WebhookConfig
|
||||
| Field | Type | Notes |
|
||||
|-------|------|-------|
|
||||
| id | UUID | PK |
|
||||
| event_type | string | experiment.complete, new_best_found, budget.exhausted, human_needed |
|
||||
| url | string | Target URL |
|
||||
| headers | JSONB | Optional auth headers |
|
||||
| is_active | bool | |
|
||||
|
||||
---
|
||||
|
||||
## API Endpoints
|
||||
|
||||
### Auth
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| POST | `/api/v1/auth/setup` | First-boot admin password setup |
|
||||
| POST | `/api/v1/auth/login` | Login, returns JWT |
|
||||
| GET | `/api/v1/auth/me` | Current user info |
|
||||
|
||||
### Admin
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/v1/admin/settings` | System settings (guest access, default model, etc.) |
|
||||
| PUT | `/api/v1/admin/settings` | Update settings |
|
||||
| GET | `/api/v1/admin/stats` | System-wide stats (total runs, cache hit rate, etc.) |
|
||||
|
||||
### Projects
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/v1/projects` | List projects |
|
||||
| POST | `/api/v1/projects` | Create project |
|
||||
| GET | `/api/v1/projects/{id}` | Project detail with experiment summaries |
|
||||
| PUT | `/api/v1/projects/{id}` | Update project |
|
||||
| DELETE | `/api/v1/projects/{id}` | Delete project and all experiments |
|
||||
|
||||
### Experiments
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/v1/experiments` | List experiments (filter by project) |
|
||||
| POST | `/api/v1/experiments` | Create experiment |
|
||||
| GET | `/api/v1/experiments/{id}` | Experiment detail with run summaries |
|
||||
| PUT | `/api/v1/experiments/{id}` | Update experiment config |
|
||||
| DELETE | `/api/v1/experiments/{id}` | Delete experiment |
|
||||
| POST | `/api/v1/experiments/{id}/sweep` | Start a sweep (grid, random, or guided) |
|
||||
| POST | `/api/v1/experiments/{id}/pause` | Pause running sweep |
|
||||
| POST | `/api/v1/experiments/{id}/resume` | Resume paused sweep |
|
||||
| POST | `/api/v1/experiments/{id}/stop` | Stop sweep |
|
||||
|
||||
### Runs
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/v1/experiments/{id}/runs` | List runs with scores (sortable, filterable) |
|
||||
| GET | `/api/v1/runs/{id}` | Run detail with stage results |
|
||||
| POST | `/api/v1/runs` | Execute a single run (ad-hoc) |
|
||||
| POST | `/api/v1/runs/{id}/score` | Add human rating to a run |
|
||||
| GET | `/api/v1/experiments/{id}/leaderboard` | Top runs ranked by weighted score |
|
||||
|
||||
### Export
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/v1/experiments/{id}/export/best` | Best config as JSON |
|
||||
| GET | `/api/v1/experiments/{id}/export/env` | Best config as .env snippet |
|
||||
| GET | `/api/v1/experiments/{id}/export/yaml` | Best config as YAML |
|
||||
| GET | `/api/v1/experiments/{id}/export/report` | Full experiment report (markdown) |
|
||||
|
||||
### LLM Endpoints (Target Management)
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/v1/endpoints` | List configured LLM endpoints |
|
||||
| POST | `/api/v1/endpoints` | Add endpoint (URL, API key, label) |
|
||||
| PUT | `/api/v1/endpoints/{id}` | Update endpoint |
|
||||
| DELETE | `/api/v1/endpoints/{id}` | Remove endpoint |
|
||||
| POST | `/api/v1/endpoints/{id}/test` | Test connectivity and list available models |
|
||||
|
||||
### Webhooks
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/v1/webhooks` | List webhook configs |
|
||||
| POST | `/api/v1/webhooks` | Create webhook |
|
||||
| DELETE | `/api/v1/webhooks/{id}` | Remove webhook |
|
||||
|
||||
### WebSocket
|
||||
| Path | Description |
|
||||
|------|-------------|
|
||||
| `/ws/experiments/{id}` | Live stream: run progress, scores, stage completions |
|
||||
| `/ws/dashboard` | Global activity feed across all experiments |
|
||||
|
||||
### Health
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/health` | Health check (DB + Redis connectivity) |
|
||||
|
||||
---
|
||||
|
||||
## MCP Server
|
||||
|
||||
PromptLooper exposes an MCP (Model Context Protocol) server so AI agents can drive it programmatically. The MCP server runs as part of the API service.
|
||||
|
||||
### MCP Tools
|
||||
|
||||
| Tool | Description |
|
||||
|------|-------------|
|
||||
| `create_project` | Create a new project workspace |
|
||||
| `create_experiment` | Define an experiment with sample data, stages, and scoring |
|
||||
| `configure_endpoint` | Add or update an LLM target endpoint |
|
||||
| `run_single` | Execute one specific configuration and return results |
|
||||
| `run_sweep` | Start a parameter sweep (grid/random/guided) |
|
||||
| `get_leaderboard` | Get top N configurations ranked by score |
|
||||
| `get_run_detail` | Get full details of a specific run |
|
||||
| `export_best_config` | Export the best configuration in JSON/YAML/env format |
|
||||
| `pause_sweep` | Pause a running sweep |
|
||||
| `resume_sweep` | Resume a paused sweep |
|
||||
| `add_human_score` | Rate a run's output |
|
||||
| `get_experiment_status` | Check experiment progress |
|
||||
| `list_models` | List available models across all configured endpoints |
|
||||
|
||||
### Example Agent Interaction
|
||||
|
||||
```
|
||||
Agent: "Create a project called 'Chrysopedia Extraction' and an experiment
|
||||
that tests the stage3_extraction prompt against Qwen-72B and Qwen-32B,
|
||||
sweeping temperature from 0.1 to 0.9 in 0.2 increments.
|
||||
Use embedding similarity scoring against these reference outputs.
|
||||
Run a grid sweep."
|
||||
|
||||
PromptLooper MCP: [create_project] → [create_experiment] → [run_sweep]
|
||||
→ streams progress → [get_leaderboard]
|
||||
|
||||
Agent: "The top config uses Qwen-72B at temperature 0.3. Export it as
|
||||
a .env snippet I can drop into Chrysopedia."
|
||||
|
||||
PromptLooper MCP: [export_best_config format=env]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Response Caching
|
||||
|
||||
Every LLM call is cached by a SHA-256 hash of:
|
||||
- Prompt text (after template rendering)
|
||||
- Model identifier
|
||||
- All inference parameters (temperature, top_p, max_tokens, etc.)
|
||||
- Input data
|
||||
|
||||
If an identical configuration has been run before, the cached response is returned instantly with `status: cached`. This means:
|
||||
- Re-running experiments with new scoring functions costs zero tokens
|
||||
- Adding a new scorer retroactively evaluates all historical runs
|
||||
- Accidentally re-running a sweep wastes nothing
|
||||
- Cache can be invalidated per-run or per-experiment if needed
|
||||
|
||||
---
|
||||
|
||||
## Authentication Model
|
||||
|
||||
### First Boot
|
||||
- App detects no users exist
|
||||
- Presents a setup screen: create admin username + password
|
||||
- Admin account is created, user is logged in
|
||||
|
||||
### Guest Access
|
||||
- Admin can toggle `allow_guest_access` in settings
|
||||
- Guests can view experiments and results (read-only)
|
||||
- Guests cannot create experiments, run sweeps, or modify configs
|
||||
- Default: guest access disabled
|
||||
|
||||
### API Authentication
|
||||
- JWT tokens for the web UI
|
||||
- API key (generated in admin settings) for programmatic access and MCP
|
||||
- API key passed via `Authorization: Bearer <key>` header
|
||||
|
||||
---
|
||||
|
||||
## Real-Time Observability Dashboard
|
||||
|
||||
The dashboard is the primary user interface during active experimentation. It provides:
|
||||
|
||||
### Live Experiment View
|
||||
- Progress bar: X of Y runs completed
|
||||
- Token usage accumulator (running total)
|
||||
- Cost estimate (based on configured model pricing)
|
||||
- Cache hit rate for current sweep
|
||||
- Estimated time remaining
|
||||
|
||||
### Side-by-Side Output Comparison
|
||||
- Pick any two runs and diff their outputs
|
||||
- Highlight differences in prompt, parameters, and response
|
||||
- Score comparison overlay
|
||||
|
||||
### Leaderboard
|
||||
- Real-time ranked list of runs by weighted score
|
||||
- Sortable by any individual scorer
|
||||
- Click to expand full run detail
|
||||
|
||||
### Steering Controls
|
||||
- **Pause**: Stop the sweep after current run completes
|
||||
- **Fork**: Create a new experiment branching from current best, with modified parameters
|
||||
- **Redirect**: Change remaining sweep parameters mid-flight
|
||||
- **Approve**: Mark a configuration as "good enough" and export
|
||||
- **Reject**: Exclude a run from leaderboard consideration
|
||||
|
||||
### Activity Timeline
|
||||
- Chronological feed of events: run started, run completed, new best found, cache hit, error
|
||||
- Filterable by event type
|
||||
|
||||
---
|
||||
|
||||
## Webhook Events
|
||||
|
||||
| Event | Payload | Trigger |
|
||||
|-------|---------|---------|
|
||||
| `experiment.started` | experiment_id, sweep config | Sweep begins |
|
||||
| `experiment.completed` | experiment_id, best config, summary stats | All runs finished |
|
||||
| `experiment.paused` | experiment_id, reason | Manual or budget pause |
|
||||
| `new_best_found` | experiment_id, run_id, scores, config | New top-scoring run |
|
||||
| `budget.exhausted` | experiment_id, token_count, cost | Token/cost budget hit |
|
||||
| `human_needed` | experiment_id, reason, context | Agent requests human review |
|
||||
| `run.failed` | run_id, error | Individual run error |
|
||||
|
||||
---
|
||||
|
||||
## Configuration Export Formats
|
||||
|
||||
### JSON
|
||||
```json
|
||||
{
|
||||
"model": "qwen2.5-72b-instruct",
|
||||
"endpoint": "http://chat.forgetyour.name/api",
|
||||
"temperature": 0.3,
|
||||
"top_p": 0.85,
|
||||
"max_tokens": 2048,
|
||||
"system_prompt": "You are a music production knowledge extractor...",
|
||||
"score": 0.87,
|
||||
"experiment": "chrysopedia-extraction-v2",
|
||||
"exported_at": "2026-04-06T12:00:00Z"
|
||||
}
|
||||
```
|
||||
|
||||
### .env
|
||||
```bash
|
||||
LLM_MODEL=qwen2.5-72b-instruct
|
||||
LLM_API_URL=http://chat.forgetyour.name/api
|
||||
LLM_TEMPERATURE=0.3
|
||||
LLM_TOP_P=0.85
|
||||
LLM_MAX_TOKENS=2048
|
||||
# Score: 0.87 | Experiment: chrysopedia-extraction-v2
|
||||
```
|
||||
|
||||
### YAML
|
||||
```yaml
|
||||
model: qwen2.5-72b-instruct
|
||||
endpoint: http://chat.forgetyour.name/api
|
||||
parameters:
|
||||
temperature: 0.3
|
||||
top_p: 0.85
|
||||
max_tokens: 2048
|
||||
system_prompt: |
|
||||
You are a music production knowledge extractor...
|
||||
metadata:
|
||||
score: 0.87
|
||||
experiment: chrysopedia-extraction-v2
|
||||
exported_at: 2026-04-06T12:00:00Z
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables
|
||||
|
||||
| Group | Variable | Default | Notes |
|
||||
|-------|----------|---------|-------|
|
||||
| **Database** | `DATABASE_URL` | (none → SQLite) | PostgreSQL connection string |
|
||||
| **Redis** | `REDIS_URL` | (none → in-process) | Redis connection string |
|
||||
| **Server** | `HOST` | `0.0.0.0` | Bind address |
|
||||
| **Server** | `PORT` | `8400` | HTTP port |
|
||||
| **Auth** | `JWT_SECRET` | (auto-generated) | JWT signing key |
|
||||
| **Auth** | `API_KEY` | (none) | Static API key for programmatic access |
|
||||
| **Defaults** | `DEFAULT_ENDPOINT_URL` | (none) | Pre-configured LLM endpoint |
|
||||
| **Defaults** | `DEFAULT_ENDPOINT_KEY` | (none) | API key for default endpoint |
|
||||
| **Limits** | `MAX_CONCURRENT_RUNS` | `4` | Parallel run limit |
|
||||
| **Limits** | `MAX_TOKENS_PER_SWEEP` | `0` (unlimited) | Token budget per sweep |
|
||||
| **Storage** | `DATA_DIR` | `/data` | SQLite DB + file storage location |
|
||||
| **MCP** | `MCP_ENABLED` | `true` | Enable MCP server |
|
||||
| **MCP** | `MCP_PORT` | `8401` | MCP server port |
|
||||
|
||||
---
|
||||
|
||||
## Docker Compose (Production — XPLTD Conventions)
|
||||
|
||||
Project name: `xpltd_promptlooper`
|
||||
Network: `promptlooper` (`172.33.0.0/24`)
|
||||
Persistent data: `/vmPool/r/services/promptlooper_*`
|
||||
PostgreSQL port: `5434` (external)
|
||||
Web UI port: `8400` (external)
|
||||
|
||||
---
|
||||
|
||||
## Technology Stack
|
||||
|
||||
| Layer | Technology | Rationale |
|
||||
|-------|-----------|-----------|
|
||||
| **API** | Python 3.12 + FastAPI | Async, OpenAPI auto-gen, matches XPLTD conventions |
|
||||
| **Task Queue** | Celery + Redis | Proven for background job execution, matches Chrysopedia |
|
||||
| **Database** | PostgreSQL 16 (prod) / SQLite (single-container) | JSONB for flexible experiment configs |
|
||||
| **Real-time** | WebSocket via FastAPI + Redis pub/sub | Sub-second dashboard updates |
|
||||
| **Frontend** | React 18 + TypeScript + Vite | Real-time dashboard, matches Chrysopedia |
|
||||
| **Styling** | Tailwind CSS | Fast iteration, utility-first |
|
||||
| **MCP** | Python MCP SDK | Standard protocol for agent integration |
|
||||
| **Container** | Multi-stage Docker build | Single image serves both API and frontend |
|
||||
|
||||
---
|
||||
|
||||
## Development & Deployment
|
||||
|
||||
### Local Development
|
||||
```bash
|
||||
git clone git@git.xpltd.co:xpltdco/promptlooper.git
|
||||
cd promptlooper
|
||||
cp .env.example .env
|
||||
docker compose up -d promptlooper-db promptlooper-redis
|
||||
cd backend && pip install -r requirements.txt
|
||||
alembic upgrade head
|
||||
uvicorn main:app --reload --host 0.0.0.0 --port 8000
|
||||
# In another terminal:
|
||||
cd frontend && npm install && npm run dev
|
||||
```
|
||||
|
||||
### Production Deployment (ub01)
|
||||
```bash
|
||||
ssh ub01
|
||||
cd /vmPool/r/repos/xpltdco/promptlooper
|
||||
git pull && docker compose build && docker compose up -d
|
||||
```
|
||||
|
||||
### Project Structure
|
||||
```
|
||||
promptlooper/
|
||||
├── backend/
|
||||
│ ├── main.py # FastAPI entry point
|
||||
│ ├── config.py # Pydantic Settings
|
||||
│ ├── models.py # SQLAlchemy ORM
|
||||
│ ├── schemas.py # Pydantic request/response
|
||||
│ ├── auth.py # JWT + API key auth
|
||||
│ ├── worker.py # Celery app config
|
||||
│ ├── routers/
|
||||
│ │ ├── auth.py
|
||||
│ │ ├── projects.py
|
||||
│ │ ├── experiments.py
|
||||
│ │ ├── runs.py
|
||||
│ │ ├── endpoints.py
|
||||
│ │ ├── export.py
|
||||
│ │ ├── webhooks.py
|
||||
│ │ └── admin.py
|
||||
│ ├── engine/
|
||||
│ │ ├── runner.py # Run execution logic
|
||||
│ │ ├── sweep.py # Sweep orchestration
|
||||
│ │ ├── cache.py # Response cache layer
|
||||
│ │ ├── adapters/ # LLM endpoint adapters
|
||||
│ │ │ ├── openai_compat.py
|
||||
│ │ │ └── base.py
|
||||
│ │ └── scorers/ # Pluggable scoring functions
|
||||
│ │ ├── embedding.py
|
||||
│ │ ├── format.py
|
||||
│ │ ├── keyword.py
|
||||
│ │ ├── llm_judge.py
|
||||
│ │ └── base.py
|
||||
│ ├── mcp/
|
||||
│ │ ├── server.py # MCP server implementation
|
||||
│ │ └── tools.py # MCP tool definitions
|
||||
│ ├── websocket/
|
||||
│ │ └── manager.py # WebSocket connection management
|
||||
│ └── tests/
|
||||
├── frontend/
|
||||
│ └── src/
|
||||
│ ├── pages/
|
||||
│ │ ├── Setup.tsx # First-boot admin setup
|
||||
│ │ ├── Login.tsx
|
||||
│ │ ├── Dashboard.tsx # Global activity
|
||||
│ │ ├── Projects.tsx
|
||||
│ │ ├── Experiment.tsx # Experiment builder + config
|
||||
│ │ ├── Live.tsx # Real-time observability
|
||||
│ │ ├── Compare.tsx # Side-by-side run comparison
|
||||
│ │ └── Admin.tsx # System settings
|
||||
│ ├── components/
|
||||
│ │ ├── Leaderboard.tsx
|
||||
│ │ ├── SteeringControls.tsx
|
||||
│ │ ├── RunCard.tsx
|
||||
│ │ ├── ScoreChart.tsx
|
||||
│ │ └── Timeline.tsx
|
||||
│ └── api/
|
||||
├── docker/
|
||||
│ ├── Dockerfile # Multi-stage: API + frontend
|
||||
│ └── nginx.conf
|
||||
├── alembic/
|
||||
├── docker-compose.yml
|
||||
├── .env.example
|
||||
├── CLAUDE.md
|
||||
└── README.md
|
||||
```
|
||||
Loading…
Add table
Reference in a new issue