8 commits

Author SHA1 Message Date
John Lightner
bc1d41e3a6 MAESTRO: Implement FormatScorer with json, markdown, length, and structure format checks
Adds format.py scorer supporting four validation modes:
- json: validates parseable JSON
- markdown: checks for headers (0.5) and lists (0.5)
- length: proportional scoring against min/max token bounds
- structure: JSON schema validation via jsonschema library

Includes 38 passing tests covering all format types, edge cases, and async delegation.
2026-04-07 03:00:56 -05:00
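
The json, markdown, and length modes described above could be sketched roughly as follows. This is an illustration only, not the contents of format.py: the function names, the regular expressions, and the exact fall-off rule for out-of-range lengths are assumptions. The structure mode is omitted because it depends on the jsonschema library.

```python
import json
import re


def score_json(text: str) -> float:
    """Return 1.0 if the text parses as JSON, else 0.0."""
    try:
        json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return 0.0
    return 1.0


def score_markdown(text: str) -> float:
    """Award 0.5 for at least one header and 0.5 for at least one list item."""
    # Assumed patterns: ATX headers ("# Title") and "-", "*", "+", or "1." items.
    has_header = re.search(r"^#{1,6}\s+\S", text, flags=re.MULTILINE) is not None
    has_list = (
        re.search(r"^\s*(?:[-*+]|\d+\.)\s+\S", text, flags=re.MULTILINE) is not None
    )
    return 0.5 * has_header + 0.5 * has_list


def score_length(token_count: int, min_tokens: int, max_tokens: int) -> float:
    """Return 1.0 inside [min_tokens, max_tokens], a proportional score outside.

    The proportional rule here (ratio to the nearest bound) is an assumption.
    """
    if token_count < min_tokens:
        return token_count / min_tokens if min_tokens > 0 else 1.0
    if token_count > max_tokens:
        return max_tokens / token_count
    return 1.0
```

Under these assumptions, a response with a header but no list scores 0.5 in markdown mode, and a response half as long as the minimum scores 0.5 in length mode.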
John Lightner
3cc1e22e3f MAESTRO: Implement EmbeddingScorer with cosine similarity scoring via OpenAI-compatible embedding API
2026-04-07 02:58:00 -05:00
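
For reference, cosine similarity between two embedding vectors can be computed as in the sketch below. The function name and the zero-vector handling are assumptions, and the call to the embedding API is left out; the commit does not show how EmbeddingScorer obtains or compares its vectors.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumed convention: a zero vector has no direction, so report 0.0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

The result lies in [-1, 1]; a scorer that needs a value in [0, 1] would have to clamp or rescale it, which is a design choice not stated in the commit.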
John Lightner
405bbf8206 MAESTRO: Implement BaseScorer abstract class with sync/async scoring interface
Adds backend/engine/scorers/base.py with abstract name property, score() method,
and score_async() default implementation. Updates scorers __init__.py to export
BaseScorer. Includes 9 tests covering instantiation guards, sync/async dispatch,
context dict usage, and partial implementation rejection.
2026-04-07 02:55:05 -05:00
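
A scorer interface of the shape described above might look like this sketch. The method signatures, the context keys, and the ExactMatchScorer subclass are hypothetical; only the abstract name property, score(), and a default score_async() come from the commit message.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class BaseScorer(ABC):
    """Abstract scorer: subclasses supply a name and a synchronous score()."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Identifier for this scorer."""

    @abstractmethod
    def score(self, response: str, context: Optional[Dict[str, Any]] = None) -> float:
        """Return a score for the response."""

    async def score_async(
        self, response: str, context: Optional[Dict[str, Any]] = None
    ) -> float:
        # Default implementation: delegate to the synchronous method.
        return self.score(response, context)


class ExactMatchScorer(BaseScorer):
    """Hypothetical concrete scorer used only to exercise the interface."""

    @property
    def name(self) -> str:
        return "exact_match"

    def score(self, response: str, context: Optional[Dict[str, Any]] = None) -> float:
        expected = (context or {}).get("expected")
        return 1.0 if response == expected else 0.0
```

Instantiating BaseScorer directly, or a subclass that implements only one of the abstract members, raises TypeError, which is the kind of instantiation guard the tests mention.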
John Lightner
ba8cb7e2c6 MAESTRO: Implement sweep orchestration engine with grid, random, and guided sweep types
Adds backend/engine/sweep.py with three sweep strategies:
- GridSweep: exhaustive enumeration of all parameter combinations
- RandomSweep: N random samples from parameter ranges (list, min/max, step)
- GuidedSweep: top-K exploitation + random exploration from previous results
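
The three strategies can be illustrated with the sketch below. It is not the code in sweep.py: the function names and the (config, score) result shape are assumptions, random sampling handles only list-valued parameters (not min/max or step ranges), and the guided variant simply reuses the top-K configurations unchanged.

```python
import itertools
import random
from typing import Any, Dict, List, Optional, Tuple

Config = Dict[str, Any]


def grid_configs(params: Dict[str, List[Any]]) -> List[Config]:
    """Enumerate every combination of parameter values."""
    keys = list(params)
    combos = itertools.product(*(params[k] for k in keys))
    return [dict(zip(keys, combo)) for combo in combos]


def random_configs(
    params: Dict[str, List[Any]], n: int, seed: Optional[int] = None
) -> List[Config]:
    """Draw n configurations, sampling each parameter independently."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in params.items()} for _ in range(n)]


def guided_configs(
    results: List[Tuple[Config, float]],
    params: Dict[str, List[Any]],
    top_k: int,
    n_explore: int,
    seed: Optional[int] = None,
) -> List[Config]:
    """Keep the top_k best-scoring configs, then add n_explore random ones."""
    ranked = sorted(results, key=lambda r: r[1], reverse=True)
    exploit = [cfg for cfg, _ in ranked[:top_k]]
    return exploit + random_configs(params, n_explore, seed)
```

With two parameters of two values each, grid_configs yields four configurations; the grid grows multiplicatively with each added parameter, which is the usual reason to fall back to random or guided sampling.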

Features: bounded parallelism via asyncio.Semaphore, token budget enforcement,
Redis-based pause/resume/stop control flags, sweep-level event publishing.
36 tests in test_sweep.py covering config generation, helpers, and full sweep execution.
2026-04-07 02:53:30 -05:00
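
Bounded parallelism via asyncio.Semaphore generally follows the pattern sketched here. The run_bounded helper and its parameters are hypothetical; token budget enforcement and the Redis control flags are not modeled.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


async def run_bounded(
    jobs: List[Callable[[], Awaitable[T]]], max_parallel: int
) -> List[T]:
    """Run all jobs, allowing at most max_parallel to execute at once."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(job: Callable[[], Awaitable[T]]) -> T:
        # Each job waits here until one of the max_parallel slots is free.
        async with semaphore:
            return await job()

    # gather() preserves the order of the input jobs in its results.
    return list(await asyncio.gather(*(guarded(job) for job in jobs)))
```

Passing zero-argument callables rather than coroutine objects keeps each run from being created before its slot is available.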
John Lightner
d607970f0c MAESTRO: Implement run execution engine with Jinja2 templating, caching, scoring, and event bus
Adds backend/engine/runner.py with run_single() that iterates pipeline stages,
renders Jinja2 prompt templates with stage history context, checks/stores response
cache, calls LLM adapters, runs configured scorers, creates StageResult and Score
records, and publishes progress events via Redis pub/sub or in-process EventBus.
Includes 21 passing tests covering all execution paths.
2026-04-07 02:48:20 -05:00
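
The in-process fallback for progress events could be as small as the sketch below. This EventBus class, its method names, and the topic string are assumptions made for illustration; the commit does not show the real class or the Redis pub/sub path.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class EventBus:
    """Minimal synchronous publish/subscribe bus keyed by topic name."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: Dict[str, Any]) -> int:
        """Deliver the event to every subscriber; return how many received it."""
        # Copy the list so handlers may subscribe during delivery.
        handlers = list(self._handlers.get(topic, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


# Usage: collect stage-completion events for a hypothetical "run.progress" topic.
received: List[Dict[str, Any]] = []
bus = EventBus()
bus.subscribe("run.progress", received.append)
bus.publish("run.progress", {"stage": 1, "status": "done"})
```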
John Lightner
f60128604f MAESTRO: Implement ResponseCache layer with SHA-256 config hashing and hit-rate tracking
2026-04-07 02:37:58 -05:00
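
One way to combine SHA-256 config hashing with hit-rate tracking is sketched below. The in-memory dict store, the method names, and the choice of canonical JSON as the hash input are assumptions; the actual ResponseCache may persist entries elsewhere and hash different fields.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def config_hash(config: Dict[str, Any]) -> str:
    """Hash a config so that key order does not affect the result."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ResponseCache:
    """In-memory cache keyed by config hash, counting hits and misses."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, config: Dict[str, Any]) -> Optional[str]:
        value = self._store.get(config_hash(config))
        if value is None:
            self.misses += 1
        else:
            self.hits += 1
        return value

    def put(self, config: Dict[str, Any], response: str) -> None:
        self._store[config_hash(config)] = response

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Serializing with sort_keys=True matters here: two configs that differ only in key order would otherwise hash differently and miss the cache.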
John Lightner
bf1e9d1c84 MAESTRO: Implement OpenAI-compatible LLM adapter with streaming, retries, and tests
Add OpenAICompatAdapter that works with any OpenAI-compatible API endpoint
(OpenWebUI, vLLM, Ollama, OpenAI, Anthropic via proxy). Features:
- Async HTTP calls via httpx with configurable timeout
- Chat completions format with system + user messages
- Token usage parsing from responses
- Exponential backoff retries (configurable, default 3 attempts)
- Both streaming (SSE) and non-streaming modes
- Model listing and connection testing
- 21 tests covering construction, request building, response parsing,
  retry logic, and error handling
2026-04-07 02:35:52 -05:00
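
The retry behaviour listed above can be sketched independently of httpx. The helper name, the delay values, and the set of retryable exceptions are assumptions; only the exponential backoff and the default of 3 attempts come from the commit message.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def with_retries(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    factor: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Await call(), retrying on retry_on errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return await call()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            # Delays grow as base_delay, base_delay * factor, ...
            await asyncio.sleep(base_delay * factor ** (attempt - 1))
            attempt += 1
```

With the defaults, a call that keeps failing waits 0.5 s and then 1.0 s before the third and final attempt raises. Errors outside retry_on propagate immediately.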
John Lightner
9e0dc4e9fe MAESTRO: Implement BaseAdapter abstract class and AdapterResponse dataclass
Define the LLM adapter interface in backend/engine/adapters/base.py with
async methods complete(), list_models(), and test_connection(). The
AdapterResponse dataclass holds response text, token counts, latency,
model name, and raw metadata. Includes 11 tests covering instantiation
guards, concrete subclass behavior, and dataclass semantics.
2026-04-07 02:32:57 -05:00
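
An adapter interface along these lines might be declared as in the following sketch. The field names, the method signatures, and the EchoAdapter subclass are hypothetical; the commit only states which methods exist and what kinds of data AdapterResponse holds.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AdapterResponse:
    """Result of one completion call (field names are assumed)."""

    text: str
    prompt_tokens: int = 0
    completion_tokens: int = 0
    latency_ms: float = 0.0
    model: str = ""
    raw: Dict[str, Any] = field(default_factory=dict)


class BaseAdapter(ABC):
    """Abstract LLM adapter; all three methods must be implemented."""

    @abstractmethod
    async def complete(self, prompt: str, **kwargs: Any) -> AdapterResponse:
        """Send a prompt and return the model's response."""

    @abstractmethod
    async def list_models(self) -> List[str]:
        """Return the model names the backend offers."""

    @abstractmethod
    async def test_connection(self) -> bool:
        """Return True if the backend is reachable."""


class EchoAdapter(BaseAdapter):
    """Hypothetical adapter that echoes the prompt, for illustration only."""

    async def complete(self, prompt: str, **kwargs: Any) -> AdapterResponse:
        return AdapterResponse(text=prompt, model="echo")

    async def list_models(self) -> List[str]:
        return ["echo"]

    async def test_connection(self) -> bool:
        return True
```

Using field(default_factory=dict) for the raw metadata avoids sharing one mutable dict across all responses.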