promptlooper

Author	SHA1	Message	Date
John Lightner	0f64dfbb02	MAESTRO: Implement webhook CRUD router, async dispatch with retry logic, and delivery logging. Full webhook system: CRUD endpoints (list/filter/get/create/update/delete), WebhookDelivery model for delivery audit trail, dispatch engine with 3-attempt retry and exponential backoff, Celery task integration with sync fallback, and webhook firing hooks in runner.py and sweep.py event paths.	2026-04-07 03:41:04 -05:00
John Lightner	30fd15ec7a	MAESTRO: Implement WebSocket connection manager with per-experiment routing, Redis pub/sub bridge, and message replay - WebSocketManager in backend/websocket/manager.py with per-experiment and global subscriptions - Redis pub/sub bridge (sync + async) broadcasting events to relevant WebSocket clients - Deque-based replay buffers with since_ts/limit filtering for reconnection support - Runtime subscribe/unsubscribe and stats API - Enhanced /ws endpoint in main.py with subscribe/unsubscribe/replay actions - 35 tests in test_ws_manager.py, all passing	2026-04-07 03:34:21 -05:00
John Lightner	e42117c8ee	MAESTRO: Implement export router with JSON, .env, YAML, and markdown report endpoints. Four fully authenticated endpoints at /api/export/experiments/{id}/: - /best: Returns best config as JSON with weighted score and metadata - /env: Flattened KEY=VALUE format with metadata comments - /yaml: Simple YAML serialization (no external dependency) - /report: Full markdown report with config space, top N configs, score distributions, token usage, and timing stats. 34 tests in test_export.py covering all endpoints, auth, 404s, and helpers. Updated test_routers.py to expect 401 (auth required) instead of 501 (stub).	2026-04-07 03:30:45 -05:00
John Lightner	b3fb8e3063	MAESTRO: Implement runs router with full CRUD, filtering, scoring, and leaderboard - List runs with filtering by experiment, status, and score range plus pagination - Get run detail with eager-loaded stage results and scores - Ad-hoc single run creation with Celery/sync dispatch - Human scoring endpoint (POST /{id}/score) - Leaderboard endpoint with configurable weighted scoring from experiment scoring_config - Added AdHocRunCreate, LeaderboardEntry, LeaderboardResponse schemas - 25 tests in test_runs.py, all passing (503 total tests passing)	2026-04-07 03:24:56 -05:00
John Lightner	82e97e9dba	MAESTRO: Implement experiments router with full CRUD and sweep control endpoints. Add complete experiments API: list (with project filter), get, create, update, delete, plus sweep lifecycle (start/pause/resume/stop/status). Adds SweepRequest and SweepStatusResponse schemas. Sweep dispatch routes through Celery with synchronous fallback for single-container mode. Redis flags control pause/resume/stop; direct DB updates used when Redis unavailable. 34 tests.	2026-04-07 03:19:43 -05:00
John Lightner	35d72e7fa8	MAESTRO: Implement LLM endpoints router with CRUD, test_connection, and Fernet-encrypted API key storage - Add LLMEndpoint model to models.py with encrypted api_key field - Create encryption.py with Fernet symmetric encryption (key derived from JWT_SECRET via PBKDF2) - Implement full endpoints router: list, get, create, update, delete + test_connection - Test endpoint calls adapter.test_connection() and list_models() - API keys never exposed in responses; has_api_key boolean flag added - 25 tests in test_endpoints.py, all 444 tests passing	2026-04-07 03:13:52 -05:00
John Lightner	b16454994e	MAESTRO: Implement Celery tasks (execute_run, execute_sweep) with synchronous fallback for single-container mode. Created engine/tasks.py with: - execute_run and execute_sweep Celery tasks registered via autodiscover - SyncTaskResult class mimicking Celery AsyncResult for in-process mode - dispatch_run/dispatch_sweep helpers that route to Celery or sync based on config - Proper async-to-sync bridging for the async engine functions - 17 tests covering task execution, sync fallback, error handling, and Celery dispatch	2026-04-07 03:08:41 -05:00
John Lightner	fb78eac1b0	MAESTRO: Implement LLMJudgeScorer with configurable judge prompt, rating parsing, and response caching	2026-04-07 03:05:00 -05:00
John Lightner	0d5a6169c5	MAESTRO: Implement KeywordScorer with presence/absence keyword checking and ratio scoring	2026-04-07 03:02:40 -05:00
John Lightner	bc1d41e3a6	MAESTRO: Implement FormatScorer with json, markdown, length, and structure format checks. Adds format.py scorer supporting four validation modes: - json: validates parseable JSON - markdown: checks for headers (0.5) and lists (0.5) - length: proportional scoring against min/max token bounds - structure: JSON schema validation via jsonschema library. Includes 38 passing tests covering all format types, edge cases, and async delegation.	2026-04-07 03:00:56 -05:00
John Lightner	3cc1e22e3f	MAESTRO: Implement EmbeddingScorer with cosine similarity scoring via OpenAI-compatible embedding API	2026-04-07 02:58:00 -05:00
John Lightner	405bbf8206	MAESTRO: Implement BaseScorer abstract class with sync/async scoring interface. Adds backend/engine/scorers/base.py with abstract name property, score() method, and score_async() default implementation. Updates scorers __init__.py to export BaseScorer. Includes 9 tests covering instantiation guards, sync/async dispatch, context dict usage, and partial implementation rejection.	2026-04-07 02:55:05 -05:00
John Lightner	ba8cb7e2c6	MAESTRO: Implement sweep orchestration engine with grid, random, and guided sweep types. Adds backend/engine/sweep.py with three sweep strategies: - GridSweep: exhaustive enumeration of all parameter combinations - RandomSweep: N random samples from parameter ranges (list, min/max, step) - GuidedSweep: top-K exploitation + random exploration from previous results. Features: bounded parallelism via asyncio.Semaphore, token budget enforcement, Redis-based pause/resume/stop control flags, sweep-level event publishing. 36 tests in test_sweep.py covering config generation, helpers, and full sweep execution.	2026-04-07 02:53:30 -05:00
John Lightner	d607970f0c	MAESTRO: Implement run execution engine with Jinja2 templating, caching, scoring, and event bus. Adds backend/engine/runner.py with run_single() that iterates pipeline stages, renders Jinja2 prompt templates with stage history context, checks/stores response cache, calls LLM adapters, runs configured scorers, creates StageResult and Score records, and publishes progress events via Redis pub/sub or in-process EventBus. Includes 21 passing tests covering all execution paths.	2026-04-07 02:48:20 -05:00
John Lightner	f60128604f	MAESTRO: Implement ResponseCache layer with SHA-256 config hashing and hit-rate tracking	2026-04-07 02:37:58 -05:00
John Lightner	bf1e9d1c84	MAESTRO: Implement OpenAI-compatible LLM adapter with streaming, retries, and tests. Add OpenAICompatAdapter that works with any OpenAI-compatible API endpoint (OpenWebUI, vLLM, Ollama, OpenAI, Anthropic via proxy). Features: - Async HTTP calls via httpx with configurable timeout - Chat completions format with system + user messages - Token usage parsing from responses - Exponential backoff retries (configurable, default 3 attempts) - Both streaming (SSE) and non-streaming modes - Model listing and connection testing - 21 tests covering construction, request building, response parsing, retry logic, and error handling	2026-04-07 02:35:52 -05:00
John Lightner	9e0dc4e9fe	MAESTRO: Implement BaseAdapter abstract class and AdapterResponse dataclass. Define the LLM adapter interface in backend/engine/adapters/base.py with async methods complete(), list_models(), and test_connection(). The AdapterResponse dataclass holds response text, token counts, latency, model name, and raw metadata. Includes 11 tests covering instantiation guards, concrete subclass behavior, and dataclass semantics.	2026-04-07 02:32:57 -05:00

17 commits
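
Note on commit f60128604f: the message mentions SHA-256 config hashing and hit-rate tracking but the log does not show how they work. The sketch below is one plausible reading, not the repository's code. It assumes the cache key is a SHA-256 digest of a canonical JSON encoding of the run config, and that the hit rate is hits divided by total lookups. The names `config_hash` and `InMemoryResponseCache` are hypothetical and do not come from the commits.

```python
import hashlib
import json
from typing import Any, Optional


def config_hash(config: dict[str, Any]) -> str:
    """Return a SHA-256 hex digest of a canonical JSON encoding of config.

    Sorting keys makes the digest independent of dict insertion order.
    (Assumption: the real project may canonicalize differently.)
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class InMemoryResponseCache:
    """Hypothetical in-memory cache keyed by config hash, counting hits and misses."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, config: dict[str, Any]) -> Optional[str]:
        """Look up a cached response and record the lookup as a hit or a miss."""
        key = config_hash(config)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        return None

    def put(self, config: dict[str, Any], response: str) -> None:
        """Store a response under the hash of its config."""
        self._store[config_hash(config)] = response

    @property
    def hit_rate(self) -> float:
        """Fraction of lookups served from the cache (0.0 before any lookup)."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Hashing a canonical encoding rather than the raw dict means two configs with the same keys and values in a different order map to the same cache entry, which is usually what a sweep over repeated parameter combinations wants.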