promptlooper

Author	SHA1	Message	Date
John Lightner	5a1d029b9b	MAESTRO: Add comprehensive engine test suite achieving 90% coverage Created tests/test_engine_core.py with 52 tests covering webhook dispatch engine (sync+async delivery, retries, dispatch), format scorer structure/length edge cases, cache hash determinism with nested/special chars, adapter mock call tracking, grid sweep combo verification, scorer integration with known inputs, and EventBus. Engine coverage improved from 83% to 90%, webhooks.py from 27% to 99%.	2026-04-07 03:45:24 -05:00
John Lightner	ad6b6ffb49	MAESTRO: Build useExperimentWS hook with typed events, exponential backoff reconnect, and refactor LivePage to use it Extract WebSocket connection management from LivePage into a reusable custom hook. Supports connect/disconnect/reconnect/send, experiment event filtering, configurable backoff, and enabled flag. 20 tests added.	2026-04-07 03:43:49 -05:00
John Lightner	0f64dfbb02	MAESTRO: Implement webhook CRUD router, async dispatch with retry logic, and delivery logging Full webhook system: CRUD endpoints (list/filter/get/create/update/delete), WebhookDelivery model for delivery audit trail, dispatch engine with 3-attempt retry and exponential backoff, Celery task integration with sync fallback, and webhook firing hooks in runner.py and sweep.py event paths.	2026-04-07 03:41:04 -05:00
John Lightner	3c78f874fb	MAESTRO: Implement Dashboard page with stats, active sweeps, recent projects, and quick actions	2026-04-07 03:37:31 -05:00
John Lightner	74ccc1a8ed	MAESTRO: Implement Admin page with settings, API keys, stats, and webhook management	2026-04-07 03:34:32 -05:00
John Lightner	30fd15ec7a	MAESTRO: Implement WebSocket connection manager with per-experiment routing, Redis pub/sub bridge, and message replay - WebSocketManager in backend/websocket/manager.py with per-experiment and global subscriptions - Redis pub/sub bridge (sync + async) broadcasting events to relevant WebSocket clients - Deque-based replay buffers with since_ts/limit filtering for reconnection support - Runtime subscribe/unsubscribe and stats API - Enhanced /ws endpoint in main.py with subscribe/unsubscribe/replay actions - 35 tests in test_ws_manager.py, all passing	2026-04-07 03:34:21 -05:00
John Lightner	e42117c8ee	MAESTRO: Implement export router with JSON, .env, YAML, and markdown report endpoints Four fully authenticated endpoints at /api/export/experiments/{id}/: - /best: Returns best config as JSON with weighted score and metadata - /env: Flattened KEY=VALUE format with metadata comments - /yaml: Simple YAML serialization (no external dependency) - /report: Full markdown report with config space, top N configs, score distributions, token usage, and timing stats 34 tests in test_export.py covering all endpoints, auth, 404s, and helpers. Updated test_routers.py to expect 401 (auth required) instead of 501 (stub).	2026-04-07 03:30:45 -05:00
John Lightner	32535a92ea	MAESTRO: Build ScoreChart component with scatter, bar, and line chart types Custom SVG-based charts with no external dependencies. Scatter plot for score vs parameter value, bar chart for top N configs comparison, line chart for score progression over time. Interactive tooltips, click callbacks, chart type switching, dark mode support. 30 tests added.	2026-04-07 03:29:17 -05:00
John Lightner	1d3917a44e	MAESTRO: Implement Compare page with side-by-side run comparison, config/response diffs, and score overlay - Two-column run selectors with experiment→run cascading dropdowns and URL state sync - Config diff with color-coded same/changed/added/removed entries using key-level comparison - Line-level LCS response diff with added/removed/same highlighting - Score comparison with overlaid indigo/emerald bars per scorer - Pick Winner buttons submit human_preference score via API - Full RunCard detail view for each run side by side - 15 tests added (5 diff helper unit tests + 10 component integration tests) - App.test.tsx updated to mock experiments.list for ComparePage route	2026-04-07 03:25:37 -05:00
John Lightner	b3fb8e3063	MAESTRO: Implement runs router with full CRUD, filtering, scoring, and leaderboard - List runs with filtering by experiment, status, and score range plus pagination - Get run detail with eager-loaded stage results and scores - Ad-hoc single run creation with Celery/sync dispatch - Human scoring endpoint (POST /{id}/score) - Leaderboard endpoint with configurable weighted scoring from experiment scoring_config - Added AdHocRunCreate, LeaderboardEntry, LeaderboardResponse schemas - 25 tests in test_runs.py, all passing (503 total tests passing)	2026-04-07 03:24:56 -05:00
John Lightner	e6c344d554	MAESTRO: Build RunCard expandable component with scores, prompts, responses, and stage timing Implements RunCard.tsx with expandable card showing config summary, cache status badge, score bars, config JSON, per-stage timing breakdown, collapsible prompt/response sections (with copy button), and metadata footer. 26 tests added, all 310 tests pass.	2026-04-07 03:20:38 -05:00
John Lightner	82e97e9dba	MAESTRO: Implement experiments router with full CRUD and sweep control endpoints Add complete experiments API: list (with project filter), get, create, update, delete, plus sweep lifecycle (start/pause/resume/stop/status). Adds SweepRequest and SweepStatusResponse schemas. Sweep dispatch routes through Celery with synchronous fallback for single-container mode. Redis flags control pause/resume/stop; direct DB updates used when Redis unavailable. 34 tests.	2026-04-07 03:19:43 -05:00
John Lightner	59f18a11c3	MAESTRO: Extract SteeringControls into standalone component with Fork, Export, and ETA Extracted inline SteeringControls from LivePage into standalone component. Added Fork button (modal to clone experiment config), Export Best dropdown (JSON/YAML/.env download), and estimated time remaining stat. LivePage updated to import the new component. 33 tests added, all 284 tests pass.	2026-04-07 03:17:47 -05:00
John Lightner	35d72e7fa8	MAESTRO: Implement LLM endpoints router with CRUD, test_connection, and Fernet-encrypted API key storage - Add LLMEndpoint model to models.py with encrypted api_key field - Create encryption.py with Fernet symmetric encryption (key derived from JWT_SECRET via PBKDF2) - Implement full endpoints router: list, get, create, update, delete + test_connection - Test endpoint calls adapter.test_connection() and list_models() - API keys never exposed in responses; has_api_key boolean flag added - 25 tests in test_endpoints.py, all 444 tests passing	2026-04-07 03:13:52 -05:00
John Lightner	1253994c9e	MAESTRO: Extract Activity Timeline into standalone component with filter, auto-scroll, and color-coded events	2026-04-07 03:13:31 -05:00
John Lightner	cf49e9c888	MAESTRO: Extract Leaderboard into standalone component with expand, sort, and animation Extract the inline LeaderboardTable from LivePage into a standalone Leaderboard component with click-to-expand detail rows, sortable columns, smooth slide-in animation for new entries, and a subtle glow effect on the best run. 29 tests added.	2026-04-07 03:10:08 -05:00
John Lightner	b16454994e	MAESTRO: Implement Celery tasks (execute_run, execute_sweep) with synchronous fallback for single-container mode Created engine/tasks.py with: - execute_run and execute_sweep Celery tasks registered via autodiscover - SyncTaskResult class mimicking Celery AsyncResult for in-process mode - dispatch_run/dispatch_sweep helpers that route to Celery or sync based on config - Proper async-to-sync bridging for the async engine functions - 17 tests covering task execution, sync fallback, error handling, and Celery dispatch	2026-04-07 03:08:41 -05:00
John Lightner	16c56b13f2	MAESTRO: Implement Live Observability page with real-time WebSocket dashboard Full LivePage implementation with 60/40 split layout: - Left column: Activity Timeline with color-coded event cards (run.started, run.completed, new_best_found, cache_hit, run.failed), event type filtering, and auto-scroll toggle - Right column: Leaderboard table with sortable columns, best-run highlighting, and status badges; Steering Controls with pause/resume/stop (with confirmation dialogs), progress bar, token counter, cost estimate, and cache hit rate - WebSocket integration with exponential backoff reconnect, connection status indicator, and experiment subscription - 35 tests covering loading/error states, WebSocket events, timeline filtering, leaderboard updates, progress tracking, and steering control interactions	2026-04-07 03:06:16 -05:00
John Lightner	fb78eac1b0	MAESTRO: Implement LLMJudgeScorer with configurable judge prompt, rating parsing, and response caching	2026-04-07 03:05:00 -05:00
John Lightner	0d5a6169c5	MAESTRO: Implement KeywordScorer with presence/absence keyword checking and ratio scoring	2026-04-07 03:02:40 -05:00
John Lightner	bc1d41e3a6	MAESTRO: Implement FormatScorer with json, markdown, length, and structure format checks Adds format.py scorer supporting four validation modes: - json: validates parseable JSON - markdown: checks for headers (0.5) and lists (0.5) - length: proportional scoring against min/max token bounds - structure: JSON schema validation via jsonschema library Includes 38 passing tests covering all format types, edge cases, and async delegation.	2026-04-07 03:00:56 -05:00
John Lightner	7fc2a2b8c3	MAESTRO: Implement ModelSelector component with endpoint grouping, refresh, and connectivity indicators	2026-04-07 03:00:10 -05:00
John Lightner	3cc1e22e3f	MAESTRO: Implement EmbeddingScorer with cosine similarity scoring via OpenAI-compatible embedding API	2026-04-07 02:58:00 -05:00
John Lightner	f2e6baa56f	MAESTRO: Implement PromptEditor component with Jinja2 syntax highlighting, variable sidebar, and preview Built standalone PromptEditor with transparent-textarea overlay for syntax highlighting of Jinja2 expressions, statements, and comments. Includes clickable variable sidebar for insertion and preview panel with sample data substitution. Integrated into ExperimentPage PipelineStageCard. 27 tests added.	2026-04-07 02:56:48 -05:00
John Lightner	405bbf8206	MAESTRO: Implement BaseScorer abstract class with sync/async scoring interface Adds backend/engine/scorers/base.py with abstract name property, score() method, and score_async() default implementation. Updates scorers __init__.py to export BaseScorer. Includes 9 tests covering instantiation guards, sync/async dispatch, context dict usage, and partial implementation rejection.	2026-04-07 02:55:05 -05:00
John Lightner	ba8cb7e2c6	MAESTRO: Implement sweep orchestration engine with grid, random, and guided sweep types Adds backend/engine/sweep.py with three sweep strategies: - GridSweep: exhaustive enumeration of all parameter combinations - RandomSweep: N random samples from parameter ranges (list, min/max, step) - GuidedSweep: top-K exploitation + random exploration from previous results Features: bounded parallelism via asyncio.Semaphore, token budget enforcement, Redis-based pause/resume/stop control flags, sweep-level event publishing. 36 tests in test_sweep.py covering config generation, helpers, and full sweep execution.	2026-04-07 02:53:30 -05:00
John Lightner	e8ce2f016b	MAESTRO: Implement Experiment Builder page with all six sections and comprehensive tests Build the full Experiment Builder (ExperimentPage.tsx) with: basic info form, sample data input (text/JSON/file upload), pipeline stage builder with template variables and preview, scoring configuration with enable toggles and weight sliders, parameter space definition (fixed/range/options types), and action buttons (Save Draft, Run Single, Start Sweep). Supports both creating new experiments and editing existing ones. 20 tests added.	2026-04-07 02:52:52 -05:00
John Lightner	d607970f0c	MAESTRO: Implement run execution engine with Jinja2 templating, caching, scoring, and event bus Adds backend/engine/runner.py with run_single() that iterates pipeline stages, renders Jinja2 prompt templates with stage history context, checks/stores response cache, calls LLM adapters, runs configured scorers, creates StageResult and Score records, and publishes progress events via Redis pub/sub or in-process EventBus. Includes 21 passing tests covering all execution paths.	2026-04-07 02:48:20 -05:00
John Lightner	04a96f3dc3	MAESTRO: Implement Projects page with card grid, creation modal, and comprehensive tests	2026-04-07 02:47:24 -05:00
John Lightner	0e6ae49b3c	MAESTRO: Implement AuthContext provider with JWT management, session validation, and protected route redirects	2026-04-07 02:38:23 -05:00
John Lightner	f60128604f	MAESTRO: Implement ResponseCache layer with SHA-256 config hashing and hit-rate tracking	2026-04-07 02:37:58 -05:00
John Lightner	bf1e9d1c84	MAESTRO: Implement OpenAI-compatible LLM adapter with streaming, retries, and tests Add OpenAICompatAdapter that works with any OpenAI-compatible API endpoint (OpenWebUI, vLLM, Ollama, OpenAI, Anthropic via proxy). Features: - Async HTTP calls via httpx with configurable timeout - Chat completions format with system + user messages - Token usage parsing from responses - Exponential backoff retries (configurable, default 3 attempts) - Both streaming (SSE) and non-streaming modes - Model listing and connection testing - 21 tests covering construction, request building, response parsing, retry logic, and error handling	2026-04-07 02:35:52 -05:00
John Lightner	060f399789	MAESTRO: Implement Login page with form validation, error handling, and guest access link	2026-04-07 02:35:34 -05:00
John Lightner	1050109777	MAESTRO: Implement Setup page with first-boot admin creation flow - Full setup form with username, password, confirm password - Auth detection on mount (redirects if already authenticated) - Client-side validation (empty username, short password, mismatch) - Server error handling (409 conflict, network errors) - Welcoming UI with gradient background, dark mode support - 9 new tests covering all states and error paths - Updated App.test.tsx to handle async SetupPage rendering - Added @testing-library/user-event dependency	2026-04-07 02:34:00 -05:00
John Lightner	9e0dc4e9fe	MAESTRO: Implement BaseAdapter abstract class and AdapterResponse dataclass Define the LLM adapter interface in backend/engine/adapters/base.py with async methods complete(), list_models(), and test_connection(). The AdapterResponse dataclass holds response text, token counts, latency, model name, and raw metadata. Includes 11 tests covering instantiation guards, concrete subclass behavior, and dataclass semantics.	2026-04-07 02:32:57 -05:00
John Lightner	7dad9d97af	MAESTRO: Add entrypoint migrations, worker config, and stack integration tests Create docker/entrypoint.sh to run alembic migrations on API startup. Create backend/worker.py with Celery app config for the compose worker service. Fix README single-container port (8000) and add production compose documentation. Add 27 tests (stack integration + worker) verifying all Docker/compose artifacts are present, consistent, and the /health endpoint responds correctly.	2026-04-07 02:09:56 -05:00
John Lightner	43d2aafbbe	MAESTRO: Create typed API client with in-memory JWT auth, fetch wrappers, and WebSocket helper	2026-04-07 02:07:03 -05:00
John Lightner	4cd0b8a1c8	MAESTRO: Initialize frontend routing with 8 placeholder page components and vitest test suite Add SetupPage, LoginPage, DashboardPage, ProjectsPage, ExperimentPage, LivePage, ComparePage, and AdminPage as placeholder components. Wire up react-router-dom routing in App.tsx with BrowserRouter in main.tsx. Unknown routes redirect to dashboard. Install vitest + @testing-library/react and add 9 routing tests. Build passes cleanly.	2026-04-07 02:03:48 -05:00
John Lightner	267091bbce	MAESTRO: Scaffold all 8 router stubs in backend/routers/ with 501 placeholder endpoints	2026-04-07 02:01:11 -05:00
John Lightner	848fb06407	MAESTRO: Create backend/auth.py with JWT, API key auth, and first-boot setup flow	2026-04-07 01:59:24 -05:00
John Lightner	15ca2c922a	MAESTRO: Create backend/main.py with FastAPI app, CORS, health check, WebSocket, and router mounting FastAPI application with: - CORS middleware (permissive for dev) - /health endpoint checking DB and Redis connectivity - /ws WebSocket endpoint with ConnectionManager for real-time updates - Async lifespan hooks for DB engine and Redis init/teardown - get_db dependency for session management - Dynamic router mounting that silently skips missing router modules - 10 tests covering all endpoints and utilities	2026-04-07 01:56:40 -05:00
John Lightner	42668eeeb1	MAESTRO: Create backend/schemas.py with all Pydantic request/response schemas Create/update/response schemas for Project, Experiment, Run, Endpoint, Webhook, Score, Auth (setup/login/token), Export, and Health. All use Pydantic v2 ConfigDict(from_attributes=True) for ORM compatibility. RunDetailResponse nests StageResults and Scores. ExportRunRow provides flat scorer_name→value dict for CSV/JSON export. 30 tests added.	2026-04-07 01:54:02 -05:00
John Lightner	0ec75ab617	MAESTRO: Set up Alembic with initial migration for all 8 ORM models	2026-04-07 01:52:03 -05:00
John Lightner	7ef116e2f9	MAESTRO: Create backend/models.py with all 8 SQLAlchemy ORM models from spec Define User, Project, Experiment, Run, StageResult, Score, ResponseCache, and WebhookConfig with UUID primary keys, JSON columns, enum types (ExperimentStatus, RunStatus), full relationship cascades, and indexes. Uses sqlalchemy.JSON (not JSONB) for SQLite compatibility in single-container mode. 16 tests added covering table creation, CRUD, uniqueness constraints, default values, and cascade deletes — all passing.	2026-04-07 01:49:10 -05:00
John Lightner	309bbacb5d	MAESTRO: Create backend/config.py with Pydantic Settings and SQLite/in-process fallback All 13 environment variables from the spec defined with proper defaults. SQLite fallback when DATABASE_URL is unset, in-process queue flag when REDIS_URL is unset, JWT_SECRET auto-generation, empty API_KEY normalization. 13 unit tests covering all configuration paths.	2026-04-07 01:46:30 -05:00
John Lightner	9e2961d648	MAESTRO: Create multi-stage Dockerfile, nginx.conf, and frontend/backend scaffolding Three-stage Dockerfile: frontend-build (Node 20), api (Python 3.12 + uvicorn), web (nginx 1.27). nginx.conf proxies /api and /ws to the API service with WebSocket upgrade support. Includes backend/requirements.txt with all Python deps, frontend scaffolding (Vite + React + TypeScript + Tailwind), and placeholder alembic files for Docker COPY compatibility.	2026-04-07 01:44:52 -05:00
John Lightner	3c5fdace31	MAESTRO: Update docker-compose.yml with corrected XPLTD conventions Fixed DATABASE_URL to use standard postgresql:// scheme, hardcoded DB credentials for dev simplicity, added API_KEY pass-through, set worker working_dir, and made JWT_SECRET optional with dev default. All 5 services: db (:5434), redis, api (MCP :8401), worker (Celery), web (:8400).	2026-04-07 01:42:58 -05:00
John Lightner	4a0e4b6c65	MAESTRO: Add .env.example with all environment variables from spec Includes all 13 env vars organized into 8 groups: Database, Redis, Server, Auth, Default LLM Endpoint, Limits, Storage, and MCP. Production-only variables are commented out; single-container defaults work out of the box.	2026-04-07 01:41:22 -05:00
John Lightner	cb4af5f707	MAESTRO: Create full directory structure with placeholder files Set up all directories from the spec's Project Structure section: - backend/ with routers/, engine/adapters/, engine/scorers/, mcp/, websocket/, tests/ (all with __init__.py) - frontend/src/ with pages/, components/, api/ (.gitkeep) - docker/ (.gitkeep) - alembic/versions/ (.gitkeep)	2026-04-07 01:40:27 -05:00
John Lightner	fc2e4cd7d1	MAESTRO: Initialize repository with README, .gitignore, and project files Add README.md with project description, quick-start instructions, and AGPL-3.0 license badge. Add .gitignore for Python, Node, and Docker artifacts. Include existing CLAUDE.md, spec, docker-compose.yml, and env.example.	2026-04-07 01:39:18 -05:00

50 commits