feat: Replaced 3-tier step function with 5-tier continuous interpolatio…

- "backend/chat_service.py"
- "backend/tests/test_chat.py"

GSD-Task: S04/T01
This commit is contained in:
jlightner 2026-04-04 10:04:47 +00:00
parent 84d8dc4455
commit 1062e003bf
12 changed files with 712 additions and 24 deletions


@ -49,3 +49,4 @@
| D041 | M022/S05 | architecture | Highlight scorer weight distribution for 10-dimension model | Original 7 dimensions reduced proportionally, new 3 audio proxy dimensions (speech_rate_variance, pause_density, speaking_pace) allocated 0.22 total weight. Audio dims default to 0.5 (neutral) when word_timings unavailable for backward compatibility. | Audio proxy signals derived from word-level timing data provide meaningful highlight quality indicators without requiring raw audio analysis (librosa). Neutral fallback ensures existing scoring paths are unaffected. | Yes | agent |
| D042 | M023/S01 | architecture | Rich text editor for creator posts | Tiptap (headless, React) with StarterKit + Link + Placeholder extensions. Store Tiptap JSON as canonical format in JSONB column, render client-side via @tiptap/html. | Headless architecture fits dark theme customization. Large ecosystem, well-maintained. JSON storage is lossless and enables future server-side rendering. No HTML sanitization needed since canonical format is structured JSON. | Yes | agent |
| D043 | M023/S02 | architecture | Personality weight → system prompt modulation strategy | 3-tier intensity (<0.4 subtle reference, 0.4-0.8 adopt voice, 0.8+ fully embody) with temperature scaling 0.3-0.5 linear on weight | Stepped intensity prevents jarring persona at low weights while allowing full creator voice at high values. Temperature stays in 0.3-0.5 range to keep responses factually grounded even at maximum personality; wider ranges risk hallucination in a knowledge-base context. | Yes | agent |
| D044 | M023/S04 | architecture | Personality weight → system prompt modulation strategy (revision) | 5-tier continuous interpolation replacing 3-tier step function. Progressive field inclusion: weight < 0.2 = no personality block; 0.2+ adds basic tone; 0.4+ adds descriptors/explanation approach; 0.6+ adds signature phrases (count scaled with weight); 0.8+ adds full vocabulary/style markers; 0.9+ adds summary paragraph. Temperature scaling unchanged (0.3 + weight * 0.2). | 3-tier step function had jarring transitions at 0.4 and 0.8 boundaries. Continuous interpolation with progressive field inclusion gives finer control: encyclopedic responses stay clean at low weights while high weights pull in the full personality profile gradually. The 0.0-0.19 dead zone ensures purely encyclopedic mode remains truly encyclopedic with zero personality artifacts. | Yes | agent |
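Both D043 and D044 keep the same linear temperature map; as a sanity check, a minimal sketch of that formula (the function name here is illustrative, not the project's actual symbol):

```python
def scale_temperature(weight: float) -> float:
    """Map personality weight (0.0-1.0) linearly onto the 0.3-0.5
    sampling-temperature band kept by D043/D044."""
    return 0.3 + weight * 0.2
```

Weight 0.0 yields 0.3 and weight 1.0 yields 0.5 (up to float rounding), so even full embodiment stays inside the factually grounded band.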


@ -8,6 +8,6 @@ The demo MVP comes together. Chat widget wires to the intelligence layer (INT-1)
|----|-------|------|---------|------|------------|
| S01 | [A] Post Editor + File Sharing | high | — | ✅ | Creator writes rich text posts with file attachments (presets, sample packs). Followers see posts in feed. Files downloadable via signed URLs. |
| S02 | [A] Chat Widget ↔ Chat Engine Wiring (INT-1) | high | — | ✅ | Chat widget on creator profile wired to chat engine. Personality slider adjusts response style. Citations link to sources. |
| S03 | [B] Shorts Generation Pipeline v1 | medium | — | | Shorts pipeline extracts clips from highlight boundaries in 3 format presets (vertical, square, horizontal) |
| S04 | [B] Personality Slider (Full Interpolation) | medium | — | ⬜ | Personality slider at 0.0 gives encyclopedic response. At 1.0 gives creator-voiced response with their speech patterns. |
| S05 | Forgejo KB Update — Demo Build Docs | low | S01, S02, S03, S04 | ⬜ | Forgejo wiki updated with post editor, MinIO, chat integration, shorts pipeline, personality system |


@ -0,0 +1,122 @@
---
id: S03
parent: M023
milestone: M023
provides:
- GeneratedShort model with FormatPreset/ShortStatus enums
- stage_generate_shorts Celery task
- Shorts API endpoints (generate/list/download)
- Frontend shorts UI in HighlightQueue
- ffmpeg in Docker image + /videos volume mount
requires: []
affects:
- S05
key_files:
- backend/models.py
- backend/config.py
- backend/pipeline/shorts_generator.py
- backend/pipeline/stages.py
- backend/routers/shorts.py
- backend/main.py
- frontend/src/api/shorts.ts
- frontend/src/pages/HighlightQueue.tsx
- frontend/src/pages/HighlightQueue.module.css
- docker/Dockerfile.api
- docker-compose.yml
- alembic/versions/025_add_generated_shorts.py
key_decisions:
- Used explicit enum creation in migration for clean up/down lifecycle
- Lazy imports inside Celery task for shorts_generator and model types to avoid circular imports
- Per-preset independent processing with isolated error handling — one failure doesn't block others
- Show generate button only on approved highlights with no in-progress shorts (or all-failed)
- Poll every 5s only for highlights with pending/processing shorts, stop when all settle
- Download opens presigned MinIO URL in new tab
patterns_established:
- Celery task with per-item independent error handling (generate all 3 presets, each in its own try/catch)
- ffmpeg subprocess wrapper with timeout and stderr capture for diagnostics
- Frontend polling pattern: 5s interval while processing, auto-stop when all settle
observability_surfaces:
- generated_shorts table: status and error_message columns per preset per highlight
- Celery worker logs: per-preset structured log lines with highlight_id, preset, status, duration_ms, file_size or error
- API endpoints: GET /admin/shorts/{highlight_id} returns full status for all presets
drill_down_paths:
- .gsd/milestones/M023/slices/S03/tasks/T01-SUMMARY.md
- .gsd/milestones/M023/slices/S03/tasks/T02-SUMMARY.md
- .gsd/milestones/M023/slices/S03/tasks/T03-SUMMARY.md
duration: ""
verification_result: passed
completed_at: 2026-04-04T09:54:34.807Z
blocker_discovered: false
---
# S03: [B] Shorts Generation Pipeline v1
**Shorts pipeline extracts video clips from approved highlights in 3 format presets (vertical, square, horizontal), stores in MinIO, and exposes generate/list/download through API and HighlightQueue UI.**
## What Happened
Three tasks delivered the full shorts generation pipeline end-to-end.
T01 laid the infrastructure: GeneratedShort model with FormatPreset (vertical/square/horizontal) and ShortStatus (pending/processing/complete/failed) enums, Alembic migration 025, video_source_path config setting, ffmpeg in the Docker image, and /videos volume mount on both API and worker services.
T02 built the generation engine: shorts_generator.py with PRESETS dict defining ffmpeg video filter chains for each format (vertical 1080×1920, square 1080×1080, horizontal 1920×1080), extract_clip() with 300s subprocess timeout, and resolve_video_path() with file existence validation. The stage_generate_shorts Celery task loads an approved HighlightCandidate, resolves the source video file, and processes each preset independently — creating GeneratedShort rows, extracting clips to /tmp, uploading to MinIO under shorts/{highlight_id}/{preset}.mp4, and updating status. Each preset failure is isolated so one bad encode doesn't block others. Temp files are cleaned in finally blocks.
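The subprocess wrapper pattern T02 describes can be sketched as follows; the filter strings and function signatures here are assumptions for illustration, not the actual contents of shorts_generator.py:

```python
import subprocess

# Illustrative filter chains; the real PRESETS dict in shorts_generator.py
# defines the project's actual ffmpeg filter graphs per format.
PRESETS = {
    "vertical": "scale=1080:1920:force_original_aspect_ratio=increase,crop=1080:1920",
    "square": "scale=1080:1080:force_original_aspect_ratio=increase,crop=1080:1080",
    "horizontal": "scale=1920:1080:force_original_aspect_ratio=increase,crop=1920:1080",
}

def build_ffmpeg_cmd(src: str, dst: str, start: float, end: float, preset: str) -> list[str]:
    """Assemble the ffmpeg invocation for one preset."""
    return [
        "ffmpeg", "-y",
        "-ss", str(start), "-to", str(end),  # clip boundaries from the highlight
        "-i", src,
        "-vf", PRESETS[preset],
        dst,
    ]

def extract_clip(src: str, dst: str, start: float, end: float, preset: str) -> None:
    """Run ffmpeg with a 300s timeout; capture stderr for diagnostics."""
    result = subprocess.run(
        build_ffmpeg_cmd(src, dst, start, end, preset),
        capture_output=True, text=True, timeout=300,
    )
    if result.returncode != 0:
        raise RuntimeError(f"ffmpeg failed ({preset}): {result.stderr[-500:]}")
```

Capturing stderr and surfacing only its tail keeps the error_message column compact while preserving the diagnostic detail ffmpeg emits.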
T03 wired the API and frontend: three endpoints on /api/v1/admin/shorts (POST generate trigger with 202 response, GET list per highlight, GET download returning presigned MinIO URL). Frontend API client with TypeScript types. HighlightQueue.tsx updated with generate button (visible only on approved highlights with no in-progress shorts), per-preset status badges (color-coded pending/processing/complete/failed with pulsing animation for processing), download links opening presigned URLs in new tabs, and 5s polling while any shorts are processing.
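The per-preset isolation described above boils down to one try/except per iteration. A minimal sketch, where the injectable `encode` callable is a hypothetical stand-in for the real extract-and-upload step:

```python
from typing import Callable

def generate_all_presets(
    presets: list[str],
    encode: Callable[[str], None],
) -> dict[str, str]:
    """Run the encode step for each preset independently.

    Failures are caught and recorded per preset rather than aborting the
    whole task, so one bad encode never blocks the other formats.
    """
    results: dict[str, str] = {}
    for preset in presets:
        try:
            encode(preset)
            results[preset] = "complete"
        except Exception as exc:  # isolate the failure to this preset
            results[preset] = f"failed: {exc}"
    return results

# A failing "square" encode leaves the other presets untouched:
def fake_encode(preset: str) -> None:
    if preset == "square":
        raise RuntimeError("corrupt frame")

statuses = generate_all_presets(["vertical", "square", "horizontal"], fake_encode)
```

The same loop shape also makes the retry story simple: re-triggering generation only needs to reprocess presets whose status is failed.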
## Verification
All slice-level verification checks pass:
- Model imports (GeneratedShort, FormatPreset, ShortStatus): exit 0
- Generator module imports (extract_clip, PRESETS, resolve_video_path): exit 0
- Celery task imports (stage_generate_shorts): exit 0
- Router imports (routers.shorts.router): exit 0
- Router registered in main.py: confirmed via grep
- ffmpeg in Dockerfile.api: confirmed via grep
- video_source_path in config.py: confirmed via grep
- chrysopedia_videos volume mount in docker-compose.yml: confirmed via grep
- TypeScript compilation (npx tsc --noEmit): exit 0
- Frontend production build (npm run build): exit 0
## Requirements Advanced
None.
## Requirements Validated
None.
## New Requirements Surfaced
None.
## Requirements Invalidated or Re-scoped
None.
## Deviations
None.
## Known Limitations
Video source files must exist at the configured video_source_path (/videos mount). No retry mechanism on ffmpeg failures — preset is marked failed and requires manual re-trigger. Single Celery worker concurrency means generation jobs queue sequentially.
## Follow-ups
None.
## Files Created/Modified
- `backend/models.py` — Added FormatPreset, ShortStatus enums and GeneratedShort model with FK to highlight_candidates
- `backend/config.py` — Added video_source_path setting (default /videos)
- `docker/Dockerfile.api` — Added ffmpeg to apt-get install
- `docker-compose.yml` — Added /vmPool/r/services/chrysopedia_videos:/videos:ro volume mount to API and worker services
- `alembic/versions/025_add_generated_shorts.py` — Migration creating generated_shorts table with formatpreset and shortstatus enums
- `backend/pipeline/shorts_generator.py` — New ffmpeg wrapper: PRESETS dict, extract_clip(), resolve_video_path()
- `backend/pipeline/stages.py` — Added stage_generate_shorts Celery task with per-preset processing
- `backend/routers/shorts.py` — New router: POST generate, GET list, GET download endpoints
- `backend/main.py` — Registered shorts router
- `frontend/src/api/shorts.ts` — New API client with typed generateShorts, fetchShorts, getShortDownloadUrl
- `frontend/src/pages/HighlightQueue.tsx` — Added generate button, per-preset status badges, download links, 5s polling
- `frontend/src/pages/HighlightQueue.module.css` — Styles for shorts UI: badges, buttons, pulsing processing animation


@ -0,0 +1,84 @@
# S03: [B] Shorts Generation Pipeline v1 — UAT
**Milestone:** M023
**Written:** 2026-04-04T09:54:34.807Z
# S03 UAT: Shorts Generation Pipeline v1
## Preconditions
- Chrysopedia stack running on ub01 (docker compose up -d)
- Migration 025 applied (docker exec chrysopedia-api alembic upgrade head)
- At least one approved highlight candidate exists in the review queue
- Video source files present at /vmPool/r/services/chrysopedia_videos/
- MinIO running and accessible
## Test Cases
### TC-01: Model and Infrastructure Verification
1. SSH to ub01, exec into API container
2. Run: `python -c "from models import GeneratedShort, FormatPreset, ShortStatus; print(FormatPreset.vertical.value, FormatPreset.square.value, FormatPreset.horizontal.value)"`
- **Expected:** Prints "vertical square horizontal"
3. Run: `which ffmpeg`
- **Expected:** Returns path (e.g., /usr/bin/ffmpeg)
4. Check volume mount: `ls /videos/`
- **Expected:** Lists video files from the host mount
### TC-02: Generate Shorts — Happy Path
1. Navigate to http://ub01:8096, log in as admin
2. Open Highlight Queue from admin menu
3. Find an approved highlight candidate
4. **Expected:** "Generate Shorts" button is visible
5. Click "Generate Shorts"
6. **Expected:** Button disappears, three preset badges appear (vertical, square, horizontal) showing "pending" or "processing" state
7. Wait for processing (badges should pulse during processing)
8. **Expected:** All three badges transition to "complete" with green color
9. **Expected:** Download links appear next to each completed preset
### TC-03: Download Generated Short
1. After TC-02 completes, click a download link for any completed preset
2. **Expected:** New tab opens with the video file (presigned MinIO URL)
3. Video should play and match the expected format (e.g., vertical = portrait orientation)
### TC-04: Generate Button Visibility Rules
1. Find a highlight that is NOT approved (pending/rejected)
2. **Expected:** No "Generate Shorts" button visible
3. Find an approved highlight that already has completed shorts
4. **Expected:** No "Generate Shorts" button (shorts already exist); status badges and download links shown instead
### TC-05: Re-generate After All Failed
1. If a highlight has all three presets in "failed" state (e.g., due to missing video file)
2. **Expected:** "Generate Shorts" button reappears, allowing retry
### TC-06: Missing Video File
1. Trigger generation for a highlight whose source video file doesn't exist at the /videos mount path
2. **Expected:** All three presets show "failed" status with error messages
3. Check API: `GET /api/v1/admin/shorts/{highlight_id}`
4. **Expected:** Response includes error_message for each preset indicating file not found
### TC-07: API Endpoint Validation
1. `POST /api/v1/admin/shorts/generate/{highlight_id}` with an approved highlight
- **Expected:** 202 Accepted with status message
2. `POST /api/v1/admin/shorts/generate/{highlight_id}` with a non-approved highlight
- **Expected:** 400 or 404 error
3. `GET /api/v1/admin/shorts/{highlight_id}`
- **Expected:** JSON array of GeneratedShort objects with all fields (id, format_preset, status, etc.)
4. `GET /api/v1/admin/shorts/download/{short_id}` for a completed short
- **Expected:** Presigned URL returned
5. `GET /api/v1/admin/shorts/download/{short_id}` for a non-complete short
- **Expected:** Error response
### TC-08: Polling Behavior
1. Trigger shorts generation, observe network tab in browser devtools
2. **Expected:** Polling requests every ~5 seconds to GET /admin/shorts/{highlight_id}
3. Once all presets reach terminal state (complete/failed), polling should stop
4. **Expected:** No further network requests after all presets settle
### TC-09: Partial Failure Independence
1. Set up a scenario where one preset's ffmpeg command would fail (e.g., corrupt source at specific timestamp)
2. **Expected:** Failed preset shows "failed" badge with error, other presets still process to completion
3. **Expected:** Completed presets have download links; failed preset shows error message
### Edge Cases
- Highlight with 0-second duration: generation should either fail gracefully or produce minimal clip
- Concurrent generation attempts on same highlight: second request should be rejected (no duplicate processing)
- Very long highlight (>5 min): ffmpeg should still complete within 300s timeout for most presets


@ -0,0 +1,36 @@
{
"schemaVersion": 1,
"taskId": "T03",
"unitId": "M023/S03/T03",
"timestamp": 1775296321161,
"passed": false,
"discoverySource": "task-plan",
"checks": [
{
"command": "grep -q 'shorts' backend/main.py",
"exitCode": 0,
"durationMs": 7,
"verdict": "pass"
},
{
"command": "cd backend",
"exitCode": 0,
"durationMs": 6,
"verdict": "pass"
},
{
"command": "cd ../frontend",
"exitCode": 2,
"durationMs": 4,
"verdict": "fail"
},
{
"command": "npx tsc --noEmit",
"exitCode": 1,
"durationMs": 788,
"verdict": "fail"
}
],
"retryAttempt": 1,
"maxRetries": 2
}


@ -1,6 +1,54 @@
# S04: [B] Personality Slider (Full Interpolation)
**Goal:** Build full personality interpolation from extracted profiles into chat system prompts
**Goal:** Personality slider produces continuous interpolation from encyclopedic (0.0) to fully creator-voiced (1.0), with progressive profile field inclusion and enhanced slider UX feedback.
**Demo:** After this: Personality slider at 0.0 gives encyclopedic response. At 1.0 gives creator-voiced response with their speech patterns.
## Tasks
- [x] **T01: Replaced 3-tier step function with 5-tier continuous interpolation in _build_personality_block(), adding progressive field inclusion and scaled phrase counts** — Replace the 3-tier step function in `_build_personality_block()` with continuous interpolation. The new function progressively includes more personality profile fields as weight increases:
- weight < 0.2: return empty string (no personality block)
- weight 0.2-0.39: basic tone — teaching_style, formality, energy + subtle instruction
- weight 0.4-0.59: + descriptors, explanation_approach + adopt-voice instruction
- weight 0.6-0.79: + signature_phrases (count scaled: `max(2, round(weight * len(phrases)))`) + creator-voice instruction
- weight 0.8-0.89: + distinctive_terms, sound_descriptions, sound_words, self_references, pacing + fully-embody instruction
- weight >= 0.9: + full summary paragraph
Instruction text per tier:
- 0.2-0.39: "When relevant, subtly reference {name}'s communication style"
- 0.4-0.59: "Adopt {name}'s tone and communication style"
- 0.6-0.79: "Respond as {name} would, using their voice and manner"
- 0.8+: "Fully embody {name} — use their exact phrases, energy, and teaching approach"
Keep existing temperature scaling (already linear, no changes needed). Keep uses_analogies and audience_engagement at weight >= 0.4.
Update existing test `test_personality_prompt_injected_when_weight_and_profile` (uses weight=0.7, so now in the 0.6-0.79 tier — assert signature phrases present, but NOT distinctive_terms/sound_descriptions). Add new parametrized test covering weights 0.1, 0.3, 0.5, 0.7, 0.9 asserting correct progressive field presence/absence at each tier. Add test that weight=0.0 and weight=0.15 produce no personality block.
- Estimate: 45m
- Files: backend/chat_service.py, backend/tests/test_chat.py
- Verify: cd backend && python -m pytest tests/test_chat.py -v -k personality
- [ ] **T02: Enhance slider UX with dynamic tier label, value indicator, and gradient track** — Enhance the personality slider in ChatWidget.tsx with three visual feedback features:
1. **Dynamic tier label**: Below the slider, show a centered label that changes based on current weight value:
- 0.0-0.19: "Encyclopedic"
- 0.2-0.39: "Subtle Reference"
- 0.4-0.59: "Creator Tone"
- 0.6-0.79: "Creator Voice"
- 0.8-1.0: "Full Embodiment"
2. **Value indicator**: Show the numeric value (e.g., "0.7") next to or integrated with the tier label. Small, secondary text style.
3. **Gradient track fill**: Use an inline style to set a CSS custom property `--slider-fill` based on the current value (percentage). In CSS, use `background: linear-gradient(to right, var(--color-accent) var(--slider-fill), var(--color-border) var(--slider-fill))` on the slider track.
Implementation in ChatWidget.tsx:
- Add a `getTierLabel(weight: number): string` helper function
- Add ``style={{ '--slider-fill': `${personalityWeight * 100}%` } as React.CSSProperties}`` to the slider input
- Add a new div below the slider row with the tier label + value
CSS changes in ChatWidget.module.css:
- `.slider` background becomes `linear-gradient(to right, var(--color-accent) var(--slider-fill, 0%), var(--color-border) var(--slider-fill, 0%))`
- Add `.tierLabel` class: centered, small font, color transitions
- Add `.tierValue` class: numeric value, even smaller, secondary color
Keep existing 'Encyclopedic' and 'Creator Voice' endpoint labels as-is — they frame the slider range.
- Estimate: 30m
- Files: frontend/src/components/ChatWidget.tsx, frontend/src/components/ChatWidget.module.css
- Verify: cd frontend && npm run build


@ -0,0 +1,94 @@
# S04 Research: Personality Slider (Full Interpolation)
## Summary
This is **light research** — the infrastructure is fully built. S02 delivered the complete personality pipeline: API field, service-layer modulation, prompt injection, temperature scaling, frontend slider UI, and 22 tests. S04's job is to refine the 3-tier step function into continuous interpolation and enhance the slider UI feedback.
The S02 summary explicitly states: "S04 will refine the prompt modulation with continuous interpolation rather than the current 3-tier step function."
## Recommendation
Three clean tasks:
1. **Backend**: Replace `_build_personality_block()` step function with continuous interpolation — progressively include more profile fields as weight increases, use linear blending for instruction intensity.
2. **Frontend slider UX**: Add visual feedback (current weight value indicator, gradient track fill, tier label that changes dynamically).
3. **Test updates**: Update existing personality tests and add new ones for continuous interpolation behavior at several weight points.
## Implementation Landscape
### What Exists (built in S02)
**Backend** (`backend/chat_service.py`):
- `ChatRequest.personality_weight`: float 0.0-1.0, Pydantic validated
- `ChatService._inject_personality()`: queries `Creator.personality_profile` JSONB, calls `_build_personality_block()`
- `_build_personality_block()`: **current 3-tier step function** at lines ~200-240
- `weight >= 0.8`: "Fully embody {name}'s voice and style"
- `weight >= 0.4`: "Respond in {name}'s voice"
- `weight < 0.4`: "Subtly adopt {name}'s communication style"
- Always includes: teaching_style, descriptors (max 5), phrases (max 6), formality+energy, analogies, audience_engagement
- Temperature scaling: `0.3 + (weight * 0.2)` — already linear, good as-is
**Profile fields available** (from `personality_extraction.txt` prompt schema):
- Currently used: `vocabulary.signature_phrases`, `tone.descriptors`, `tone.teaching_style`, `tone.energy`, `tone.formality`, `style_markers.uses_analogies`, `style_markers.audience_engagement`
- **Unused but available for higher weights**: `vocabulary.distinctive_terms`, `vocabulary.sound_descriptions`, `vocabulary.jargon_level`, `style_markers.explanation_approach`, `style_markers.sound_words`, `style_markers.pacing`, `style_markers.self_references`, `summary` (full personality summary paragraph)
**Frontend** (`frontend/src/components/ChatWidget.tsx`):
- `useState(0)` for personalityWeight
- Range input: min=0, max=1, step=0.1
- Labels: "Encyclopedic" (left), "Creator Voice" (right)
- CSS: `ChatWidget.module.css` has `.sliderRow`, `.sliderLabel`, `.slider` with custom webkit/moz thumb styling in cyan accent
**Frontend** (`frontend/src/pages/ChatPage.tsx`):
- **No personality slider** — this is the global chat page (no creator context), so no slider makes sense. No work needed here.
**Tests** (`backend/tests/test_chat.py`):
- 22 tests total, 9 personality-specific
- Tests cover: weight forwarding, prompt injection content check, null/missing creator fallback, weight=0 skips query, temperature scaling at 0.0 and 1.0, validation boundaries (>1.0, <0.0, string)
### Continuous Interpolation Design
Replace the 3-tier if/elif/else with a continuous approach:
**Intensity instruction** (the opening sentence):
- Use `weight` to linearly interpolate between instruction strengths
- 0.0-0.2: no personality block at all (purely encyclopedic)
- 0.2-0.4: "When relevant, subtly reference {name}'s communication style"
- 0.4-0.6: "Adopt {name}'s tone and communication style"
- 0.6-0.8: "Respond as {name} would, using their voice and manner"
- 0.8-1.0: "Fully embody {name} — use their exact phrases, energy, and teaching approach"
**Progressive field inclusion** (more profile data at higher weights):
- weight ≥ 0.2: teaching_style + formality + energy (basic tone info)
- weight ≥ 0.4: + descriptors + explanation_approach (how they teach)
- weight ≥ 0.6: + signature_phrases (limited count scaling: `min(round(weight * 8), len(phrases))`)
- weight ≥ 0.8: + distinctive_terms, sound_descriptions, sound_words, self_references, pacing
- weight ≥ 0.9: + full summary paragraph as context
**Phrase count scaling**: Instead of a fixed max of 6, scale with weight: `max(2, round(weight * len(phrases)))`.
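The proposed formula keeps a floor of two phrases once the tier activates and approaches the full list at weight 1.0. A quick illustration with a hypothetical 8-phrase profile:

```python
def scaled_phrase_count(weight: float, phrases: list[str]) -> int:
    """Scale how many signature phrases are injected: a floor of two,
    growing toward the full list as weight approaches 1.0."""
    return max(2, round(weight * len(phrases)))

phrases = [f"phrase-{i}" for i in range(8)]
# weight 0.6 -> 5 of 8 phrases; weight 1.0 -> all 8; weight 0.25 -> the floor of 2
```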
### Slider UX Enhancement
Current slider is functional but gives no feedback about what each position means. Enhancement:
- **Dynamic label** below the slider showing the current tier ("Encyclopedic", "Subtle Reference", "Creator Tone", "Creator Voice", "Full Embodiment")
- **Track fill gradient** — CSS `background: linear-gradient()` using `--slider-fill` variable set via inline style based on value
- **Value indicator** — small bubble or text showing current numeric value (0.0-1.0) or percentage
### File Change Map
| File | Change | Risk |
|------|--------|------|
| `backend/chat_service.py` | Rewrite `_build_personality_block()` for continuous interpolation | Low — pure function, well-tested |
| `backend/tests/test_chat.py` | Update `test_personality_prompt_injected_when_weight_and_profile` assertions, add tests for intermediate weights (0.2, 0.5, 0.9) checking progressive field inclusion | Low |
| `frontend/src/components/ChatWidget.tsx` | Add dynamic label, enhance slider visual feedback | Low |
| `frontend/src/components/ChatWidget.module.css` | Slider track fill gradient, value indicator styling | Low |
### Verification
- `cd backend && python -m pytest tests/test_chat.py -v` — all tests pass including new interpolation tests
- `cd frontend && npm run build` — 0 errors
- Manual: at weight 0.2, system prompt should have minimal personality cues; at weight 1.0 should have full profile dump including summary
### Natural Task Boundaries
1. **T01: Backend continuous interpolation** — rewrite `_build_personality_block()`, update and add tests
2. **T02: Frontend slider UX** — dynamic label, gradient track, visual feedback
3. Both tasks are independent and could run in parallel.


@ -0,0 +1,40 @@
---
estimated_steps: 14
estimated_files: 2
skills_used: []
---
# T01: Rewrite _build_personality_block() for continuous interpolation + update tests
Replace the 3-tier step function in `_build_personality_block()` with continuous interpolation. The new function progressively includes more personality profile fields as weight increases:
- weight < 0.2: return empty string (no personality block)
- weight 0.2-0.39: basic tone — teaching_style, formality, energy + subtle instruction
- weight 0.4-0.59: + descriptors, explanation_approach + adopt-voice instruction
- weight 0.6-0.79: + signature_phrases (count scaled: `max(2, round(weight * len(phrases)))`) + creator-voice instruction
- weight 0.8-0.89: + distinctive_terms, sound_descriptions, sound_words, self_references, pacing + fully-embody instruction
- weight >= 0.9: + full summary paragraph
Instruction text per tier:
- 0.2-0.39: "When relevant, subtly reference {name}'s communication style"
- 0.4-0.59: "Adopt {name}'s tone and communication style"
- 0.6-0.79: "Respond as {name} would, using their voice and manner"
- 0.8+: "Fully embody {name} — use their exact phrases, energy, and teaching approach"
Keep existing temperature scaling (already linear, no changes needed). Keep uses_analogies and audience_engagement at weight >= 0.4.
Update existing test `test_personality_prompt_injected_when_weight_and_profile` (uses weight=0.7, so now in the 0.6-0.79 tier — assert signature phrases present, but NOT distinctive_terms/sound_descriptions). Add new parametrized test covering weights 0.1, 0.3, 0.5, 0.7, 0.9 asserting correct progressive field presence/absence at each tier. Add test that weight=0.0 and weight=0.15 produce no personality block.
## Inputs
- `backend/chat_service.py` — existing `_build_personality_block()` at line 292
- `backend/tests/test_chat.py` — existing personality tests starting at line 617, `_FAKE_PERSONALITY_PROFILE` fixture
## Expected Output
- `backend/chat_service.py` — rewritten `_build_personality_block()` with 5-tier continuous interpolation
- `backend/tests/test_chat.py` — updated and new personality interpolation tests
## Verification
cd backend && python -m pytest tests/test_chat.py -v -k personality


@ -0,0 +1,77 @@
---
id: T01
parent: S04
milestone: M023
provides: []
requires: []
affects: []
key_files: ["backend/chat_service.py", "backend/tests/test_chat.py"]
key_decisions: ["5-tier boundaries at 0.2/0.4/0.6/0.8/0.9 with distinct instruction text per tier", "Phrase count scaling: max(2, round(weight * len(phrases)))"]
patterns_established: []
drill_down_paths: []
observability_surfaces: []
duration: ""
verification_result: "Ran `cd backend && python -m pytest tests/test_chat.py -v -k personality` — all 11 personality tests pass (9 existing + 2 new)."
completed_at: 2026-04-04T10:04:37.100Z
blocker_discovered: false
---
# T01: Replaced 3-tier step function with 5-tier continuous interpolation in _build_personality_block(), adding progressive field inclusion and scaled phrase counts
**Replaced 3-tier step function with 5-tier continuous interpolation in _build_personality_block(), adding progressive field inclusion and scaled phrase counts**
## What Happened
Rewrote _build_personality_block() from a coarse 3-tier step function to 5-tier continuous interpolation. Weight < 0.2 returns empty string. Tiers progressively add profile fields: basic tone → descriptors/explanation_approach → signature phrases (count scaled by weight) → distinctive_terms/sound_descriptions/sound_words/self_references/pacing → full summary paragraph. Instruction text escalates per tier. Updated existing test for the weight=0.7 tier and added 2 new test functions covering all tiers and phrase count scaling.
## Verification
Ran `cd backend && python -m pytest tests/test_chat.py -v -k personality` — all 11 personality tests pass (9 existing + 2 new).
## Verification Evidence
| # | Command | Exit Code | Verdict | Duration |
|---|---------|-----------|---------|----------|
| 1 | `cd backend && python -m pytest tests/test_chat.py -v -k personality` | 0 | ✅ pass | 5800ms |
## Deviations
Moved uses_analogies and audience_engagement from unconditional to weight >= 0.4 gate, and gated descriptors at 0.4+, matching the task plan's tier design.
## Known Issues
None.
## Files Created/Modified
- `backend/chat_service.py`
- `backend/tests/test_chat.py`


@ -0,0 +1,46 @@
---
estimated_steps: 18
estimated_files: 2
skills_used: []
---
# T02: Enhance slider UX with dynamic tier label, value indicator, and gradient track
Enhance the personality slider in ChatWidget.tsx with three visual feedback features:
1. **Dynamic tier label**: Below the slider, show a centered label that changes based on current weight value:
- 0.0-0.19: "Encyclopedic"
- 0.2-0.39: "Subtle Reference"
- 0.4-0.59: "Creator Tone"
- 0.6-0.79: "Creator Voice"
- 0.8-1.0: "Full Embodiment"
2. **Value indicator**: Show the numeric value (e.g., "0.7") next to or integrated with the tier label. Small, secondary text style.
3. **Gradient track fill**: Use an inline style to set a CSS custom property `--slider-fill` based on the current value (percentage). In CSS, use `background: linear-gradient(to right, var(--color-accent) var(--slider-fill), var(--color-border) var(--slider-fill))` on the slider track.
Implementation in ChatWidget.tsx:
- Add a `getTierLabel(weight: number): string` helper function
- Add ``style={{ '--slider-fill': `${personalityWeight * 100}%` } as React.CSSProperties}`` to the slider input
- Add a new div below the slider row with the tier label + value
CSS changes in ChatWidget.module.css:
- `.slider` background becomes `linear-gradient(to right, var(--color-accent) var(--slider-fill, 0%), var(--color-border) var(--slider-fill, 0%))`
- Add `.tierLabel` class: centered, small font, color transitions
- Add `.tierValue` class: numeric value, even smaller, secondary color
Keep existing 'Encyclopedic' and 'Creator Voice' endpoint labels as-is — they frame the slider range.
## Inputs
- `frontend/src/components/ChatWidget.tsx` — existing slider at lines 277–291
- `frontend/src/components/ChatWidget.module.css` — existing slider styles at lines 70–135
## Expected Output
- `frontend/src/components/ChatWidget.tsx` — enhanced slider with tier label, value indicator, gradient track
- `frontend/src/components/ChatWidget.module.css` — new tierLabel/tierValue classes, gradient track background
## Verification
cd frontend && npm run build
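The tier-label mapping and gradient fill described above can be sketched as simple threshold lookups. This is a Python sketch of the planned logic only — the actual implementation is the TypeScript `getTierLabel` helper in ChatWidget.tsx, and the function names here are illustrative:

```python
def get_tier_label(weight: float) -> str:
    """Map a personality weight (0.0-1.0) to its display tier label.

    Thresholds mirror the five tiers listed in the task plan above.
    """
    if weight < 0.2:
        return "Encyclopedic"
    if weight < 0.4:
        return "Subtle Reference"
    if weight < 0.6:
        return "Creator Tone"
    if weight < 0.8:
        return "Creator Voice"
    return "Full Embodiment"


def slider_fill(weight: float) -> str:
    """Compute the --slider-fill CSS custom property: the weight as a percentage."""
    return f"{weight * 100:.0f}%"
```

In the real component, `slider_fill`'s output is what the plan sets via `style={{ '--slider-fill': ... }}` so the track gradient boundary tracks the thumb position.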


@ -292,42 +292,91 @@ def _build_context_block(items: list[dict[str, Any]]) -> str:
def _build_personality_block(creator_name: str, profile: dict[str, Any], weight: float) -> str:
"""Build a personality voice injection block from a creator's personality_profile JSONB.
    The ``weight`` (0.0–1.0) controls progressive inclusion of personality
    fields via 5 tiers of continuous interpolation:
    - < 0.2: no personality block (empty string)
    - 0.2–0.39: basic tone: teaching_style, formality, energy + subtle hint
    - 0.4–0.59: + descriptors, explanation_approach + adopt-voice instruction
    - 0.6–0.79: + signature_phrases (count scaled by weight) + creator-voice
    - 0.8–0.89: + distinctive_terms, sound_descriptions, sound_words,
      self_references, pacing + fully-embody instruction
    - >= 0.9: + full summary paragraph
"""
if weight < 0.2:
return ""
vocab = profile.get("vocabulary", {})
tone = profile.get("tone", {})
style = profile.get("style_markers", {})
teaching_style = tone.get("teaching_style", "")
energy = tone.get("energy", "moderate")
formality = tone.get("formality", "conversational")
descriptors = tone.get("descriptors", [])
phrases = vocab.get("signature_phrases", [])
parts: list[str] = []
    # --- Tier 1 (0.2–0.39): basic tone ---
if weight < 0.4:
parts.append(
f"When relevant, subtly reference {creator_name}'s communication style."
)
elif weight < 0.6:
parts.append(f"Adopt {creator_name}'s tone and communication style.")
elif weight < 0.8:
parts.append(
f"Respond as {creator_name} would, using their voice and manner."
)
else:
parts.append(
f"Fully embody {creator_name} — use their exact phrases, energy, and teaching approach."
)
if teaching_style:
parts.append(f"Teaching style: {teaching_style}.")
parts.append(f"Match their {formality} {energy} tone.")
# --- Tier 2 (0.4+): descriptors, explanation_approach, uses_analogies, audience_engagement ---
if weight >= 0.4:
if descriptors:
parts.append(f"Tone: {', '.join(descriptors[:5])}.")
explanation = style.get("explanation_approach", "")
if explanation:
parts.append(f"Explanation approach: {explanation}.")
if style.get("uses_analogies"):
parts.append("Use analogies when helpful.")
if style.get("audience_engagement"):
parts.append(f"Audience engagement: {style['audience_engagement']}.")
# --- Tier 3 (0.6+): signature phrases (count scaled by weight) ---
if weight >= 0.6 and phrases:
count = max(2, round(weight * len(phrases)))
parts.append(f"Use their signature phrases: {', '.join(phrases[:count])}.")
# --- Tier 4 (0.8+): distinctive_terms, sound_descriptions, sound_words, self_references, pacing ---
if weight >= 0.8:
distinctive = vocab.get("distinctive_terms", [])
if distinctive:
parts.append(f"Distinctive terms: {', '.join(distinctive)}.")
sound_desc = vocab.get("sound_descriptions", [])
if sound_desc:
parts.append(f"Sound descriptions: {', '.join(sound_desc)}.")
sound_words = style.get("sound_words", [])
if sound_words:
parts.append(f"Sound words: {', '.join(sound_words)}.")
self_refs = style.get("self_references", "")
if self_refs:
parts.append(f"Self-references: {self_refs}.")
pacing = style.get("pacing", "")
if pacing:
parts.append(f"Pacing: {pacing}.")
# --- Tier 5 (0.9+): full summary paragraph ---
if weight >= 0.9:
summary = profile.get("summary", "")
if summary:
parts.append(summary)
return " ".join(parts)
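The Tier 3 phrase scaling above can be illustrated in isolation. A minimal sketch of the `max(2, round(weight * len(phrases)))` rule — `scaled_phrase_count` is a hypothetical standalone helper; the production code computes the count inline:

```python
def scaled_phrase_count(weight: float, phrases: list[str]) -> list[str]:
    """Return the leading signature phrases to inject at a given weight.

    The count grows linearly with weight, with a floor of 2 so the
    creator voice is never represented by a single phrase.
    """
    count = max(2, round(weight * len(phrases)))
    return phrases[:count]


phrases = ["p1", "p2", "p3", "p4", "p5", "p6"]
# weight 0.6 with 6 phrases: round(3.6) = 4, so the first 4 are used;
# weight 1.0 includes all 6; very low weights still get the floor of 2.
```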


@ -681,12 +681,103 @@ async def test_personality_prompt_injected_when_weight_and_profile(chat_client):
assert len(captured_messages) >= 2
system_prompt = captured_messages[0]["content"]
# Personality block should be appended
# weight=0.7 → tier 3: signature phrases YES, distinctive_terms NO
assert "Keota" in system_prompt
assert "Respond as Keota would" in system_prompt
assert "hands-on demo-driven" in system_prompt
assert "casual" in system_prompt
assert "high" in system_prompt
assert "let's gooo" in system_prompt # signature phrases included at 0.6+
assert "enthusiastic" in system_prompt # descriptors at 0.4+
assert "example-first" in system_prompt # explanation_approach at 0.4+
# Tier 4 fields (0.8+) should NOT be present
assert "sauce" not in system_prompt # distinctive_terms
assert "crispy" not in system_prompt # sound_descriptions
assert "brrr" not in system_prompt # sound_words
def test_personality_block_continuous_interpolation_tiers():
"""Progressive field inclusion across 5 interpolation tiers."""
from chat_service import _build_personality_block
profile = _FAKE_PERSONALITY_PROFILE
# weight < 0.2: empty
for w in (0.0, 0.1, 0.15, 0.19):
result = _build_personality_block("Keota", profile, w)
assert result == "", f"weight={w} should produce empty block"
    # weight 0.2–0.39: basic tone only
for w in (0.2, 0.3, 0.39):
result = _build_personality_block("Keota", profile, w)
assert "subtly reference Keota" in result
assert "hands-on demo-driven" in result
assert "casual" in result and "high" in result
# Should NOT include descriptors, explanation_approach, phrases
assert "enthusiastic" not in result
assert "example-first" not in result
assert "let's gooo" not in result
    # weight 0.4–0.59: + descriptors, explanation_approach
for w in (0.4, 0.5, 0.59):
result = _build_personality_block("Keota", profile, w)
assert "Adopt Keota" in result
assert "enthusiastic" in result # descriptors
assert "example-first" in result # explanation_approach
assert "analogies" in result # uses_analogies
# Should NOT include phrases or tier-4 fields
assert "let's gooo" not in result
assert "sauce" not in result
    # weight 0.6–0.79: + signature phrases
for w in (0.6, 0.7, 0.79):
result = _build_personality_block("Keota", profile, w)
assert "Respond as Keota would" in result
assert "let's gooo" in result # signature phrases
assert "enthusiastic" in result # still has descriptors
# Should NOT include tier-4 fields
assert "sauce" not in result
assert "crispy" not in result
    # weight 0.8–0.89: + distinctive_terms, sound_descriptions, etc.
for w in (0.8, 0.85, 0.89):
result = _build_personality_block("Keota", profile, w)
assert "Fully embody Keota" in result
assert "sauce" in result # distinctive_terms
assert "crispy" in result # sound_descriptions
assert "brrr" in result # sound_words
assert "I always" in result # self_references
assert "fast" in result # pacing
# Should NOT include summary
assert "High-energy producer" not in result
# weight >= 0.9: + summary
for w in (0.9, 0.95, 1.0):
result = _build_personality_block("Keota", profile, w)
assert "Fully embody Keota" in result
assert "High-energy producer" in result # summary
def test_personality_block_phrase_count_scales_with_weight():
"""Signature phrase count = max(2, round(weight * len(phrases)))."""
from chat_service import _build_personality_block
# Profile with 6 phrases to make scaling visible
profile = {
"vocabulary": {
"signature_phrases": ["p1", "p2", "p3", "p4", "p5", "p6"],
},
"tone": {},
"style_markers": {},
}
# weight=0.6: max(2, round(0.6*6)) = max(2,4) = 4 → first 4 phrases
result = _build_personality_block("Test", profile, 0.6)
assert "p4" in result
assert "p5" not in result
# weight=1.0: max(2, round(1.0*6)) = 6 → all phrases
result = _build_personality_block("Test", profile, 1.0)
assert "p6" in result
@pytest.mark.asyncio