promptlooper/CLAUDE.md

# CLAUDE.md — PromptLooper

## What is this project?

PromptLooper is a self-hosted LLM pipeline tuning workbench. It runs experiments across prompt × model × parameter combinations, caches every response, scores results, and surfaces optimal configurations through a real-time dashboard. It has an MCP server so AI agents can drive it programmatically.

## Repository

- **Hosted at**: git.xpltd.co/xpltdco/promptlooper
- **XPLTD project name**: `xpltd_promptlooper`
- **Sister project**: Chrysopedia (git.xpltd.co/xpltdco/chrysopedia) — a knowledge extraction pipeline that is PromptLooper's first integration target

## Tech Stack

- **Backend**: Python 3.12, FastAPI, Celery, SQLAlchemy, Alembic
- **Frontend**: React 18, TypeScript, Vite, Tailwind CSS
- **Database**: PostgreSQL 16 (production) / SQLite (single-container mode)
- **Cache/Queue**: Redis 7 (production) / in-process (single-container)
- **Real-time**: WebSocket via FastAPI + Redis pub/sub
- **MCP**: Python MCP SDK
- **Container**: Multi-stage Docker build, nginx for frontend

## XPLTD Conventions

These are non-negotiable project conventions shared across all XPLTD projects:

- Docker Compose project name: `xpltd_promptlooper`
- Dedicated bridge network: `promptlooper` (`172.33.0.0/24`)
- Persistent data bind mounts under `/vmPool/r/services/promptlooper_*`
- PostgreSQL on external port `5434` (internal `5432`)
- Web UI on port `8400`
- MCP server on port `8401`
- Container naming: `promptlooper-{service}` (e.g., `promptlooper-api`, `promptlooper-db`)

## Key Architecture Decisions

1. **No LLM runs inside PromptLooper itself** — it's purely an HTTP client that calls external LLM endpoints. The only exception is the optional "LLM-as-judge" scorer.
2. **Response caching by config hash** — SHA-256 of (prompt + model + params + input). Cache hits return instantly. This is critical for cost control.
3. **Single-container mode** — when `DATABASE_URL` is not set, use SQLite + in-process queue. Zero dependencies.
4. **WebSocket for real-time** — the dashboard connects via WebSocket to receive run progress, score updates, and steering events.
5. **Pluggable scorers** — all scoring functions implement a base class with `score(input, output, context) → float` signature.
6. **OpenAI-compatible adapter** — the LLM adapter layer speaks OpenAI's chat completions API. This covers OpenWebUI, vLLM, Ollama, and most providers.

## File Organization

```
backend/
  main.py              — FastAPI app, middleware, router mounting
  config.py            — Pydantic Settings from env vars
  models.py            — SQLAlchemy ORM models
  schemas.py           — Pydantic request/response schemas
  auth.py              — JWT + API key authentication
  worker.py            — Celery app configuration
  routers/             — API endpoint handlers
  engine/              — Core experiment execution logic
    runner.py          — Individual run execution
    sweep.py           — Sweep orchestration (grid/random/guided)
    cache.py           — Response cache layer
    adapters/          — LLM endpoint adapters
    scorers/           — Pluggable scoring functions
  mcp/                 — MCP server implementation
  websocket/           — WebSocket connection management

frontend/src/
  pages/               — Route-level components
  components/          — Shared UI components
  api/                 — Typed API client functions
```

## Database Migrations

Use Alembic. Same patterns as Chrysopedia:
```bash
alembic revision --autogenerate -m "describe_change"
alembic upgrade head
```

## Running Locally

```bash
docker compose up -d promptlooper-db promptlooper-redis
cd backend && uvicorn main:app --reload --host 0.0.0.0 --port 8000
# Frontend in another terminal:
cd frontend && npm run dev
```

## Testing

```bash
cd backend && pytest
cd frontend && npm test
```

## Important Patterns

### Adding a new scorer
1. Create `backend/engine/scorers/my_scorer.py`
2. Implement `BaseScorer` with `name`, `score(input, output, context) → float`
3. Register in `backend/engine/scorers/__init__.py`
4. Add to frontend scorer picker component

### Adding a new LLM adapter
1. Create `backend/engine/adapters/my_adapter.py`
2. Implement `BaseAdapter` with `complete(prompt, model, params) → response`
3. Register in `backend/engine/adapters/__init__.py`
4. Currently only OpenAI-compatible is implemented; all others should be edge cases

### Adding a new MCP tool
1. Add tool definition in `backend/mcp/tools.py`
2. Implement handler in `backend/mcp/server.py`
3. Tools should map 1:1 to API endpoints where possible

## Common Gotchas

- Always hash the FULL config when checking cache — missing a single parameter means cache misses
- WebSocket connections must be cleaned up on disconnect — use the connection manager
- SQLite mode doesn't support concurrent writes — the in-process queue must be single-threaded
- Frontend must handle both WebSocket and polling fallback for environments where WS is blocked
- MCP server runs on a separate port from the main API

## Deployment

```bash
ssh ub01
cd /vmPool/r/repos/xpltdco/promptlooper
git pull && docker compose build && docker compose up -d
```