jlightner
6fb497d03a
chore: Bump LLM max_tokens to 32768, commit M002/M003 GSD artifacts
...
- max_tokens bumped from 16384 to 32768 (extraction responses still hitting limits)
- All GSD planning/completion artifacts for M002 (deployment) and M003 (DNS + LLM routing)
- KNOWLEDGE.md updated with XPLTD domain setup flow and container healthcheck patterns
- DECISIONS.md updated with D015 (subnet) and D016 (Ollama for embeddings)
2026-03-30 04:22:45 +00:00
jlightner
cf759f3739
fix: Add max_tokens=16384 to LLM requests (OpenWebUI defaults to 1000, truncating pipeline JSON)
2026-03-30 04:08:29 +00:00
jlightner
4aa4b08a7f
feat: Per-stage LLM model routing with thinking modality and think-tag stripping
...
- Added 8 per-stage config fields: llm_stage{2-5}_model and llm_stage{2-5}_modality
- LLMClient.complete() accepts modality ('chat'/'thinking') and model_override
- Thinking modality: appends JSON instructions to system prompt, strips <think> tags
- strip_think_tags() handles multiline, multiple blocks, and edge cases
- Pipeline stages 2-5 read per-stage config and pass to LLM client
- Updated .env.example with per-stage model/modality documentation
- All 59 tests pass including new think-tag stripping test
2026-03-30 02:12:14 +00:00
jlightner
12cc86aef9
chore: Extended Settings with 12 LLM/embedding/Qdrant config fields, cr…
...
- "backend/config.py"
- "backend/worker.py"
- "backend/pipeline/schemas.py"
- "backend/pipeline/llm_client.py"
- "backend/requirements.txt"
- "backend/pipeline/__init__.py"
- "backend/pipeline/stages.py"
GSD-Task: S03/T01
2026-03-29 22:30:31 +00:00
jlightner
07126138b5
chore: Built FastAPI app with DB-connected health check, Pydantic schem…
...
- "backend/main.py"
- "backend/config.py"
- "backend/schemas.py"
- "backend/routers/__init__.py"
- "backend/routers/health.py"
- "backend/routers/creators.py"
- "backend/routers/videos.py"
GSD-Task: S01/T03
2026-03-29 21:54:57 +00:00