chrysopedia/backend/pipeline
jlightner c6c15defee feat: Dynamic token estimation for per-stage max_tokens
- Add estimate_tokens() and estimate_max_tokens() to llm_client with
  stage-specific output ratios (0.3x segmentation, 1.2x extraction,
  0.15x classification, 1.5x synthesis)
- Add max_tokens override parameter to LLMClient.complete()
- Wire all 4 pipeline stages to estimate max_tokens from actual prompt
  content with 20% buffer and 2048 floor
- Add LLM_MAX_TOKENS_HARD_LIMIT=32768 config (dynamic estimator ceiling)
- Log token estimates alongside every LLM request

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 05:55:17 -05:00
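The estimation logic described in the commit message can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual contents of llm_client.py: the ratios, buffer, floor, and hard limit come from the commit message, while the chars-per-token heuristic, constant names, and function signatures are assumptions.

```python
import math

# Values taken from the commit message; names other than
# LLM_MAX_TOKENS_HARD_LIMIT are assumed for illustration.
LLM_MAX_TOKENS_HARD_LIMIT = 32768  # config ceiling for the dynamic estimator
MAX_TOKENS_FLOOR = 2048            # 2048 floor from the commit message

# Stage-specific output ratios (output size relative to prompt size)
STAGE_OUTPUT_RATIOS = {
    "segmentation": 0.3,
    "extraction": 1.2,
    "classification": 0.15,
    "synthesis": 1.5,
}


def estimate_tokens(text: str) -> int:
    """Rough token count: ~4 characters per token (common heuristic, assumed)."""
    return math.ceil(len(text) / 4)


def estimate_max_tokens(prompt: str, stage: str) -> int:
    """Estimate a per-request max_tokens from the actual prompt content:
    prompt tokens x stage ratio, plus a 20% buffer, clamped to
    [MAX_TOKENS_FLOOR, LLM_MAX_TOKENS_HARD_LIMIT]."""
    prompt_tokens = estimate_tokens(prompt)
    ratio = STAGE_OUTPUT_RATIOS[stage]
    estimate = math.ceil(prompt_tokens * ratio * 1.2)  # 20% buffer
    return min(max(estimate, MAX_TOKENS_FLOOR), LLM_MAX_TOKENS_HARD_LIMIT)
```

Under these assumptions, short classification prompts hit the 2048 floor, while very large synthesis prompts are capped at the 32768 hard limit; each stage would pass the result as the max_tokens override on LLMClient.complete().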
__init__.py chore: Extended Settings with 12 LLM/embedding/Qdrant config fields, cr… 2026-03-29 22:30:31 +00:00
embedding_client.py feat: Created sync EmbeddingClient, QdrantManager with idempotent colle… 2026-03-29 22:39:04 +00:00
llm_client.py feat: Dynamic token estimation for per-stage max_tokens 2026-03-30 05:55:17 -05:00
qdrant_client.py feat: Created sync EmbeddingClient, QdrantManager with idempotent colle… 2026-03-29 22:39:04 +00:00
schemas.py chore: Extended Settings with 12 LLM/embedding/Qdrant config fields, cr… 2026-03-29 22:30:31 +00:00
stages.py feat: Dynamic token estimation for per-stage max_tokens 2026-03-30 05:55:17 -05:00