promptlooper

Author	SHA1	Message	Date
John Lightner	30fd15ec7a	MAESTRO: Implement WebSocket connection manager with per-experiment routing, Redis pub/sub bridge, and message replay - WebSocketManager in backend/websocket/manager.py with per-experiment and global subscriptions - Redis pub/sub bridge (sync + async) broadcasting events to relevant WebSocket clients - Deque-based replay buffers with since_ts/limit filtering for reconnection support - Runtime subscribe/unsubscribe and stats API - Enhanced /ws endpoint in main.py with subscribe/unsubscribe/replay actions - 35 tests in test_ws_manager.py, all passing	2026-04-07 03:34:21 -05:00
John Lightner	e42117c8ee	MAESTRO: Implement export router with JSON, .env, YAML, and markdown report endpoints. Four fully authenticated endpoints at /api/export/experiments/{id}/: - /best: Returns best config as JSON with weighted score and metadata - /env: Flattened KEY=VALUE format with metadata comments - /yaml: Simple YAML serialization (no external dependency) - /report: Full markdown report with config space, top N configs, score distributions, token usage, and timing stats. 34 tests in test_export.py covering all endpoints, auth, 404s, and helpers. Updated test_routers.py to expect 401 (auth required) instead of 501 (stub).	2026-04-07 03:30:45 -05:00
John Lightner	b3fb8e3063	MAESTRO: Implement runs router with full CRUD, filtering, scoring, and leaderboard - List runs with filtering by experiment, status, and score range plus pagination - Get run detail with eager-loaded stage results and scores - Ad-hoc single run creation with Celery/sync dispatch - Human scoring endpoint (POST /{id}/score) - Leaderboard endpoint with configurable weighted scoring from experiment scoring_config - Added AdHocRunCreate, LeaderboardEntry, LeaderboardResponse schemas - 25 tests in test_runs.py, all passing (503 total tests passing)	2026-04-07 03:24:56 -05:00
John Lightner	82e97e9dba	MAESTRO: Implement experiments router with full CRUD and sweep control endpoints. Add complete experiments API: list (with project filter), get, create, update, delete, plus sweep lifecycle (start/pause/resume/stop/status). Adds SweepRequest and SweepStatusResponse schemas. Sweep dispatch routes through Celery with synchronous fallback for single-container mode. Redis flags control pause/resume/stop; direct DB updates used when Redis unavailable. 34 tests.	2026-04-07 03:19:43 -05:00
John Lightner	35d72e7fa8	MAESTRO: Implement LLM endpoints router with CRUD, test_connection, and Fernet-encrypted API key storage - Add LLMEndpoint model to models.py with encrypted api_key field - Create encryption.py with Fernet symmetric encryption (key derived from JWT_SECRET via PBKDF2) - Implement full endpoints router: list, get, create, update, delete + test_connection - Test endpoint calls adapter.test_connection() and list_models() - API keys never exposed in responses; has_api_key boolean flag added - 25 tests in test_endpoints.py, all 444 tests passing	2026-04-07 03:13:52 -05:00
John Lightner	b16454994e	MAESTRO: Implement Celery tasks (execute_run, execute_sweep) with synchronous fallback for single-container mode. Created engine/tasks.py with: - execute_run and execute_sweep Celery tasks registered via autodiscover - SyncTaskResult class mimicking Celery AsyncResult for in-process mode - dispatch_run/dispatch_sweep helpers that route to Celery or sync based on config - Proper async-to-sync bridging for the async engine functions - 17 tests covering task execution, sync fallback, error handling, and Celery dispatch	2026-04-07 03:08:41 -05:00
John Lightner	fb78eac1b0	MAESTRO: Implement LLMJudgeScorer with configurable judge prompt, rating parsing, and response caching	2026-04-07 03:05:00 -05:00
John Lightner	0d5a6169c5	MAESTRO: Implement KeywordScorer with presence/absence keyword checking and ratio scoring	2026-04-07 03:02:40 -05:00
John Lightner	bc1d41e3a6	MAESTRO: Implement FormatScorer with json, markdown, length, and structure format checks. Adds format.py scorer supporting four validation modes: - json: validates parseable JSON - markdown: checks for headers (0.5) and lists (0.5) - length: proportional scoring against min/max token bounds - structure: JSON schema validation via jsonschema library. Includes 38 passing tests covering all format types, edge cases, and async delegation.	2026-04-07 03:00:56 -05:00
John Lightner	3cc1e22e3f	MAESTRO: Implement EmbeddingScorer with cosine similarity scoring via OpenAI-compatible embedding API	2026-04-07 02:58:00 -05:00
John Lightner	405bbf8206	MAESTRO: Implement BaseScorer abstract class with sync/async scoring interface. Adds backend/engine/scorers/base.py with abstract name property, score() method, and score_async() default implementation. Updates scorers __init__.py to export BaseScorer. Includes 9 tests covering instantiation guards, sync/async dispatch, context dict usage, and partial implementation rejection.	2026-04-07 02:55:05 -05:00
John Lightner	ba8cb7e2c6	MAESTRO: Implement sweep orchestration engine with grid, random, and guided sweep types. Adds backend/engine/sweep.py with three sweep strategies: - GridSweep: exhaustive enumeration of all parameter combinations - RandomSweep: N random samples from parameter ranges (list, min/max, step) - GuidedSweep: top-K exploitation + random exploration from previous results. Features: bounded parallelism via asyncio.Semaphore, token budget enforcement, Redis-based pause/resume/stop control flags, sweep-level event publishing. 36 tests in test_sweep.py covering config generation, helpers, and full sweep execution.	2026-04-07 02:53:30 -05:00
John Lightner	d607970f0c	MAESTRO: Implement run execution engine with Jinja2 templating, caching, scoring, and event bus. Adds backend/engine/runner.py with run_single() that iterates pipeline stages, renders Jinja2 prompt templates with stage history context, checks/stores response cache, calls LLM adapters, runs configured scorers, creates StageResult and Score records, and publishes progress events via Redis pub/sub or in-process EventBus. Includes 21 passing tests covering all execution paths.	2026-04-07 02:48:20 -05:00
John Lightner	f60128604f	MAESTRO: Implement ResponseCache layer with SHA-256 config hashing and hit-rate tracking	2026-04-07 02:37:58 -05:00
John Lightner	bf1e9d1c84	MAESTRO: Implement OpenAI-compatible LLM adapter with streaming, retries, and tests. Add OpenAICompatAdapter that works with any OpenAI-compatible API endpoint (OpenWebUI, vLLM, Ollama, OpenAI, Anthropic via proxy). Features: - Async HTTP calls via httpx with configurable timeout - Chat completions format with system + user messages - Token usage parsing from responses - Exponential backoff retries (configurable, default 3 attempts) - Both streaming (SSE) and non-streaming modes - Model listing and connection testing - 21 tests covering construction, request building, response parsing, retry logic, and error handling	2026-04-07 02:35:52 -05:00
John Lightner	9e0dc4e9fe	MAESTRO: Implement BaseAdapter abstract class and AdapterResponse dataclass. Define the LLM adapter interface in backend/engine/adapters/base.py with async methods complete(), list_models(), and test_connection(). The AdapterResponse dataclass holds response text, token counts, latency, model name, and raw metadata. Includes 11 tests covering instantiation guards, concrete subclass behavior, and dataclass semantics.	2026-04-07 02:32:57 -05:00
John Lightner	7dad9d97af	MAESTRO: Add entrypoint migrations, worker config, and stack integration tests. Create docker/entrypoint.sh to run alembic migrations on API startup. Create backend/worker.py with Celery app config for the compose worker service. Fix README single-container port (8000) and add production compose documentation. Add 27 tests (stack integration + worker) verifying all Docker/compose artifacts are present, consistent, and the /health endpoint responds correctly.	2026-04-07 02:09:56 -05:00
John Lightner	267091bbce	MAESTRO: Scaffold all 8 router stubs in backend/routers/ with 501 placeholder endpoints	2026-04-07 02:01:11 -05:00
John Lightner	848fb06407	MAESTRO: Create backend/auth.py with JWT, API key auth, and first-boot setup flow	2026-04-07 01:59:24 -05:00
John Lightner	15ca2c922a	MAESTRO: Create backend/main.py with FastAPI app, CORS, health check, WebSocket, and router mounting. FastAPI application with: - CORS middleware (permissive for dev) - /health endpoint checking DB and Redis connectivity - /ws WebSocket endpoint with ConnectionManager for real-time updates - Async lifespan hooks for DB engine and Redis init/teardown - get_db dependency for session management - Dynamic router mounting that silently skips missing router modules - 10 tests covering all endpoints and utilities	2026-04-07 01:56:40 -05:00
John Lightner	42668eeeb1	MAESTRO: Create backend/schemas.py with all Pydantic request/response schemas. Create/update/response schemas for Project, Experiment, Run, Endpoint, Webhook, Score, Auth (setup/login/token), Export, and Health. All use Pydantic v2 ConfigDict(from_attributes=True) for ORM compatibility. RunDetailResponse nests StageResults and Scores. ExportRunRow provides flat scorer_name→value dict for CSV/JSON export. 30 tests added.	2026-04-07 01:54:02 -05:00
John Lightner	0ec75ab617	MAESTRO: Set up Alembic with initial migration for all 8 ORM models	2026-04-07 01:52:03 -05:00
John Lightner	7ef116e2f9	MAESTRO: Create backend/models.py with all 8 SQLAlchemy ORM models from spec. Define User, Project, Experiment, Run, StageResult, Score, ResponseCache, and WebhookConfig with UUID primary keys, JSON columns, enum types (ExperimentStatus, RunStatus), full relationship cascades, and indexes. Uses sqlalchemy.JSON (not JSONB) for SQLite compatibility in single-container mode. 16 tests added covering table creation, CRUD, uniqueness constraints, default values, and cascade deletes, all passing.	2026-04-07 01:49:10 -05:00
John Lightner	309bbacb5d	MAESTRO: Create backend/config.py with Pydantic Settings and SQLite/in-process fallback. All 13 environment variables from the spec defined with proper defaults. SQLite fallback when DATABASE_URL is unset, in-process queue flag when REDIS_URL is unset, JWT_SECRET auto-generation, empty API_KEY normalization. 13 unit tests covering all configuration paths.	2026-04-07 01:46:30 -05:00
John Lightner	cb4af5f707	MAESTRO: Create full directory structure with placeholder files. Set up all directories from the spec's Project Structure section: - backend/ with routers/, engine/adapters/, engine/scorers/, mcp/, websocket/, tests/ (all with __init__.py) - frontend/src/ with pages/, components/, api/ (.gitkeep) - docker/ (.gitkeep) - alembic/versions/ (.gitkeep)	2026-04-07 01:40:27 -05:00
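
Note on ba8cb7e2c6 and f60128604f: the log describes exhaustive grid enumeration and SHA-256 config hashing but does not show the code. The following is only a minimal sketch of how those two ideas could look, not the repository's implementation. The function names grid_configs and config_hash, the canonical-JSON hashing scheme, and the example parameter names are all assumptions made for illustration.

```python
import hashlib
import itertools
import json
from typing import Any, Dict, Iterator, List


def grid_configs(space: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every combination of parameter values (hypothetical helper).

    Keys are sorted so the enumeration order is deterministic.
    """
    keys = sorted(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def config_hash(config: Dict[str, Any]) -> str:
    """Return a SHA-256 hex digest of a config (assumed scheme).

    The config is serialized as canonical JSON (sorted keys, compact
    separators) so that key order does not change the digest.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Example parameter space; the names are illustrative only.
    space = {"temperature": [0.0, 0.7], "top_p": [0.9, 1.0]}
    for cfg in grid_configs(space):
        print(config_hash(cfg)[:12], cfg)
```

Hashing a canonical serialization means two configs with the same values map to the same cache key regardless of dict ordering, which is presumably what a response cache keyed on config needs.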

25 commits
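
Note on 30fd15ec7a: the commit mentions deque-based replay buffers with since_ts/limit filtering for reconnecting WebSocket clients. The sketch below shows one way such a buffer might behave. It is an assumption-laden illustration rather than the code in backend/websocket/manager.py: the ReplayBuffer class, its method names, the event dict shape, and the choice to return the most recent events when limit is set are all hypothetical.

```python
from collections import deque
from typing import Any, Deque, Dict, List, Optional


class ReplayBuffer:
    """Hypothetical bounded buffer of timestamped events."""

    def __init__(self, maxlen: int = 100) -> None:
        # deque(maxlen=...) silently drops the oldest event when full.
        self._events: Deque[Dict[str, Any]] = deque(maxlen=maxlen)

    def append(self, ts: float, payload: Any) -> None:
        """Record an event with its timestamp."""
        self._events.append({"ts": ts, "payload": payload})

    def replay(
        self,
        since_ts: Optional[float] = None,
        limit: Optional[int] = None,
    ) -> List[Dict[str, Any]]:
        """Return buffered events newer than since_ts, oldest first.

        When limit is given, only the most recent `limit` events are kept
        (an assumed policy; the real manager may choose differently).
        """
        events = [
            e for e in self._events
            if since_ts is None or e["ts"] > since_ts
        ]
        if limit is not None:
            # Guard limit == 0, because events[-0:] would return everything.
            events = events[-limit:] if limit > 0 else []
        return events
```

A reconnecting client would pass the timestamp of the last event it saw as since_ts, so it receives only what it missed, up to the buffer's capacity.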