- "prompts/stage2_segmentation.txt" - "prompts/stage3_extraction.txt" - "prompts/stage4_classification.txt" - "prompts/stage5_synthesis.txt" - "backend/pipeline/stages.py" - "backend/requirements.txt" GSD-Task: S03/T02
You are a transcript analysis expert. Your task is to analyze a music production tutorial transcript and identify distinct topic boundaries — contiguous groups of segments that discuss the same subject.

## Instructions

1. Read the transcript segments provided inside the <transcript> tags.
2. Each segment has an index, start time, end time, and text.
3. Group consecutive segments that discuss the same topic together.
4. Assign a short, descriptive topic_label to each group (e.g., "kick drum layering", "reverb bus setup", "arrangement intro section").
5. Write a brief summary (1-2 sentences) for each topic group.
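
For reference, the input described in steps 1 and 2 could be assembled roughly as follows. This is only a sketch: the `Segment` class, the `render_transcript` helper, and the exact per-line layout are assumptions made for illustration and are not taken from `backend/pipeline/stages.py`.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript segment (hypothetical shape: index, times in seconds, text)."""
    index: int
    start: float
    end: float
    text: str


def render_transcript(segments):
    """Wrap segments in <transcript> tags, one segment per line (assumed layout)."""
    lines = [
        f"[{s.index}] ({s.start:.2f}s - {s.end:.2f}s) {s.text}"
        for s in segments
    ]
    return "<transcript>\n" + "\n".join(lines) + "\n</transcript>"


example = [
    Segment(0, 0.0, 4.5, "Let's start with the kick drum."),
    Segment(1, 4.5, 9.0, "I usually layer two samples here."),
]
print(render_transcript(example))
```

Putting the index at the start of each line keeps it easy to refer back to segments by number when grouping them.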

## Output Format

Return a JSON object with a single key "segments" containing a list of topic groups:

```json
{
  "segments": [
    {
      "start_index": 0,
      "end_index": 5,
      "topic_label": "Short descriptive label",
      "summary": "Brief summary of what is discussed in these segments."
    }
  ]
}
```
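
A consumer of this format might load the response along these lines. This is a minimal sketch using only the standard library; the `TopicGroup` class and the `parse_topic_groups` function are hypothetical names, not part of the existing pipeline code.

```python
import json
from dataclasses import dataclass


@dataclass
class TopicGroup:
    """One entry of the "segments" list (hypothetical container type)."""
    start_index: int
    end_index: int
    topic_label: str
    summary: str


def parse_topic_groups(raw: str) -> list[TopicGroup]:
    """Parse the JSON object and return its topic groups.

    Raises ValueError if "segments" is not a list, and KeyError if a
    required key is missing from the object or from any group.
    """
    data = json.loads(raw)
    groups = data["segments"]
    if not isinstance(groups, list):
        raise ValueError('"segments" must be a list')
    return [
        TopicGroup(
            start_index=int(g["start_index"]),
            end_index=int(g["end_index"]),
            topic_label=str(g["topic_label"]),
            summary=str(g["summary"]),
        )
        for g in groups
    ]
```

Failing loudly on a missing key is a deliberate choice in this sketch: a malformed response is easier to retry than to repair downstream.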

## Rules

- Every segment index must be covered exactly once (no gaps, no overlaps).
- start_index and end_index are inclusive.
- topic_label should be concise (3-6 words).
- Output ONLY the JSON object, no other text.
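
The coverage and label rules above can be checked mechanically. The sketch below assumes segment indices are zero-based and contiguous (the example output starts at 0, but the prompt does not state this explicitly); `validate_coverage` is a hypothetical helper, not an existing function in the pipeline.

```python
def validate_coverage(groups, segment_count):
    """Return a list of rule violations for the given topic groups.

    groups: list of dicts with "start_index", "end_index", "topic_label".
    segment_count: number of transcript segments, assumed indexed 0..n-1.
    """
    errors = []
    expected = 0  # next segment index that still needs to be covered
    for g in sorted(groups, key=lambda g: g["start_index"]):
        start, end = g["start_index"], g["end_index"]
        if end < start:
            errors.append(f"group {start}-{end}: end_index is before start_index")
            continue
        if start > expected:
            errors.append(f"gap: segments {expected}-{start - 1} are not covered")
        elif start < expected:
            errors.append(f"overlap: segment {start} is covered more than once")
        # Both indices are inclusive, so the next uncovered index is end + 1.
        expected = max(expected, end + 1)
        words = len(g["topic_label"].split())
        if not 3 <= words <= 6:
            errors.append(f"group {start}-{end}: topic_label has {words} words")
    if expected < segment_count:
        errors.append(f"gap: segments {expected}-{segment_count - 1} are not covered")
    elif expected > segment_count:
        errors.append(f"out of range: end_index exceeds {segment_count - 1}")
    return errors
```

An empty list means every index from 0 to `segment_count - 1` is covered exactly once and every label has 3 to 6 words.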