docs: create roadmap (4 phases)

John Lightner 2026-04-11 01:45:42 -05:00
parent 93bd57d386
commit b7260bbd26
3 changed files with 173 additions and 21 deletions


@@ -84,30 +84,30 @@ Deferred to future release. Tracked but not in current roadmap.
| Requirement | Phase | Status |
|-------------|-------|--------|
-| MEL-01 | | Pending |
-| MEL-02 | | Pending |
-| MEL-03 | | Pending |
-| MEL-04 | | Pending |
-| INST-01 | | Pending |
-| INST-02 | | Pending |
-| INST-03 | | Pending |
-| INP-01 | | Pending |
-| INP-02 | | Pending |
-| INP-03 | | Pending |
-| OUT-01 | | Pending |
-| OUT-02 | | Pending |
-| OUT-03 | | Pending |
-| REPR-01 | | Pending |
-| REPR-02 | | Pending |
-| PIPE-01 | | Pending |
-| PIPE-02 | | Pending |
-| PIPE-03 | | Pending |
+| MEL-01 | Phase 1 | Pending |
+| MEL-02 | Phase 1 | Pending |
+| MEL-03 | Phase 2 | Pending |
+| MEL-04 | Phase 1 | Pending |
+| INST-01 | Phase 1 | Pending |
+| INST-02 | Phase 2 | Pending |
+| INST-03 | Phase 2 | Pending |
+| INP-01 | Phase 1 | Pending |
+| INP-02 | Phase 3 | Pending |
+| INP-03 | Phase 3 | Pending |
+| OUT-01 | Phase 3 | Pending |
+| OUT-02 | Phase 1 | Pending |
+| OUT-03 | Phase 3 | Pending |
+| REPR-01 | Phase 4 | Pending |
+| REPR-02 | Phase 4 | Pending |
+| PIPE-01 | Phase 1 | Pending |
+| PIPE-02 | Phase 4 | Pending |
+| PIPE-03 | Phase 3 | Pending |
**Coverage:**
- v1 requirements: 18 total
-- Mapped to phases: 0
-- Unmapped: 18
+- Mapped to phases: 18
+- Unmapped: 0
---
*Requirements defined: 2026-04-11*
-*Last updated: 2026-04-11 after initial definition*
+*Last updated: 2026-04-11 after roadmap creation*
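
The coverage figures in this hunk can be re-derived from the traceability table rather than maintained by hand. The following is a minimal sketch, assuming the table is available as Markdown text with the three columns shown above. The function name `coverage` is hypothetical and not part of the repository.

```python
def coverage(table_md: str) -> dict:
    """Count requirements with and without a phase in a Markdown traceability table."""
    mapped, unmapped = [], []
    for line in table_md.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip anything that is not a three-column data row (header, separator, prose).
        if len(cells) != 3 or cells[0] in ("", "Requirement") or set(cells[0]) <= {"-"}:
            continue
        req_id, phase, _status = cells
        # An empty Phase cell means the requirement is not yet mapped.
        (mapped if phase else unmapped).append(req_id)
    return {
        "total": len(mapped) + len(unmapped),
        "mapped": len(mapped),
        "unmapped": unmapped,
    }
```

Run against the updated table, a check like this should report 18 total, 18 mapped, and an empty unmapped list, matching the Coverage block.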

92
.planning/ROADMAP.md Normal file

@@ -0,0 +1,92 @@
# Roadmap: AI Music Pipeline
## Overview
This roadmap delivers a voice-to-instrument pipeline built on ACE-Step 1.5 XL-SFT cover mode. Phase 1 establishes the core end-to-end flow (hum in, instrument out), Phase 2 validates instrument variety and exposes fidelity control, Phase 3 hardens input/output handling, and Phase 4 adds configuration file support and reproducibility via seed control. The result is a single CLI tool that takes a humming WAV and produces high-quality instrument renditions that faithfully follow the input melody.
## Phases
**Phase Numbering:**
- Integer phases (1, 2, 3, 4): Planned milestone work
- Decimal phases (e.g., 2.1): Urgent insertions (marked with INSERTED)
- [ ] **Phase 1: Core Pipeline** - End-to-end humming WAV to instrument output via ACE-Step cover mode
- [ ] **Phase 2: Instrument Variety & Fidelity Control** - Multiple distinct instruments and cover_strength tuning
- [ ] **Phase 3: Input & Output Robustness** - Sample rate handling, duration detection, CD-quality output, error messages
- [ ] **Phase 4: Configuration & Reproducibility** - TOML config support and seed control for reproducible outputs
## Phase Details
### Phase 1: Core Pipeline
**Goal**: User can hum a melody, run one command, and get an instrument rendition that audibly follows the melody
**Depends on**: Nothing (first phase)
**Requirements**: MEL-01, MEL-02, MEL-04, INST-01, INP-01, OUT-02, PIPE-01
**Success Criteria** (what must be TRUE):
1. User can run a single script/command with a humming WAV file and get instrument audio output
2. Output audio audibly follows the pitch contour of the input humming
3. Output audio preserves the rhythmic timing of the input humming
4. Output sounds like a coherent instrument performance, not garbled noise
5. User can specify the target instrument (e.g., piano, guitar) and the output reflects that instrument
**Plans**: TBD
Plans:
- [ ] 01-01: TBD
- [ ] 01-02: TBD
- [ ] 01-03: TBD
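
The single-command entry point described in criterion 1 could take roughly the following shape. This is a sketch only: the flag name `--instrument`, the default instrument, and the prompt wording are assumptions, and the ACE-Step call is deliberately left as a placeholder because this roadmap does not specify its API.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Render a hummed melody as an instrument.")
    parser.add_argument("input_wav", type=Path, help="path to the humming WAV file")
    # Assumed flag name and default; the roadmap only says the instrument is selectable.
    parser.add_argument("--instrument", default="piano", help="target instrument")
    return parser


def build_prompt(instrument: str) -> str:
    # Hypothetical prompt wording, to be settled during Phase 1 planning.
    return f"solo {instrument} performance"


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    prompt = build_prompt(args.instrument)
    # Placeholder: pass args.input_wav and prompt to the generation engine here.
    print(f"input={args.input_wav} prompt={prompt!r}")
    return 0
```

Calling `main(["hum.wav", "--instrument", "guitar"])` exercises the argument handling without touching the model, which keeps the CLI surface testable on its own.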
### Phase 2: Instrument Variety & Fidelity Control
**Goal**: User can choose from multiple instruments that sound distinctly different, and control how closely the output follows the input melody
**Depends on**: Phase 1
**Requirements**: INST-02, INST-03, MEL-03
**Success Criteria** (what must be TRUE):
1. Different instrument prompts (piano, guitar, saxophone, violin, flute) produce audibly different timbres from the same input
2. At least 5 distinct instrument types produce usable output
3. User can adjust the cover_strength parameter and hear the difference: higher values follow the melody more closely, lower values allow more creative interpretation
**Plans**: TBD
Plans:
- [ ] 02-01: TBD
- [ ] 02-02: TBD
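
One way to check these criteria is to render the same input across the instrument list and a few cover_strength values, then compare the results by ear. The sketch below only enumerates the runs. The specific strength values are assumptions, since this roadmap does not state the parameter's valid range, and the `Run` and `sweep` names are hypothetical.

```python
from dataclasses import dataclass
from itertools import product

INSTRUMENTS = ["piano", "guitar", "saxophone", "violin", "flute"]  # from criterion 1
COVER_STRENGTHS = [0.3, 0.6, 0.9]  # assumed values; valid range not stated in the roadmap


@dataclass(frozen=True)
class Run:
    instrument: str
    cover_strength: float


def sweep(instruments=INSTRUMENTS, strengths=COVER_STRENGTHS) -> list[Run]:
    """Enumerate every instrument/cover_strength pair for a listening comparison."""
    return [Run(i, s) for i, s in product(instruments, strengths)]
```

Holding the input and seed fixed across the sweep would make any audible difference attributable to the instrument prompt or the strength value alone.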
### Phase 3: Input & Output Robustness
**Goal**: Pipeline handles real-world input files gracefully and produces properly named CD-quality output
**Depends on**: Phase 1
**Requirements**: INP-02, INP-03, OUT-01, OUT-03, PIPE-03
**Success Criteria** (what must be TRUE):
1. Input WAV files at 16 kHz, 44.1 kHz, and 48 kHz sample rates all work without errors
2. Pipeline auto-detects input audio duration and configures generation duration appropriately
3. Output audio has a sample rate of at least 44.1 kHz
4. Output filenames include the instrument name and a timestamp (e.g., piano_20260411_143022.wav)
5. A clear error message is shown when the input file is missing, corrupted, or in an unsupported format
**Plans**: TBD
Plans:
- [ ] 03-01: TBD
- [ ] 03-02: TBD
- [ ] 03-03: TBD
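
Criteria 2, 4, and 5 can be illustrated with the Python standard library alone. This is a sketch under stated assumptions: it uses the `wave` module, which reads only uncompressed PCM WAV files, and the names `InputError`, `inspect_wav`, and `output_name` are hypothetical. Resampling and the actual generation step are out of scope here.

```python
import wave
from datetime import datetime
from pathlib import Path


class InputError(Exception):
    """Raised with a user-facing message when the input file cannot be used."""


def inspect_wav(path: Path) -> tuple[int, float]:
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    if not path.is_file():
        raise InputError(f"Input file not found: {path}")
    try:
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
            frames = wav.getnframes()
    except (wave.Error, EOFError) as exc:
        # wave.Error covers bad headers and unsupported encodings; EOFError covers truncation.
        raise InputError(f"Not a readable PCM WAV file: {path} ({exc})") from exc
    if rate <= 0:
        raise InputError(f"Invalid sample rate in WAV header: {path}")
    return rate, frames / rate


def output_name(instrument: str, now: datetime) -> str:
    """Build a name in the instrument_YYYYMMDD_HHMMSS.wav pattern from criterion 4."""
    return f"{instrument}_{now:%Y%m%d_%H%M%S}.wav"
```

Passing the timestamp in as an argument, rather than reading the clock inside `output_name`, keeps the naming rule deterministic under test.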
### Phase 4: Configuration & Reproducibility
**Goal**: User can configure the pipeline via TOML file and reproduce or vary outputs using seed control
**Depends on**: Phase 1
**Requirements**: PIPE-02, REPR-01, REPR-02
**Success Criteria** (what must be TRUE):
1. User can specify instrument, cover_strength, duration, and seed via a TOML config file instead of CLI arguments
2. Running the pipeline twice with the same seed, input, and prompt produces identical output
3. Running with different seeds produces meaningfully different outputs from the same input and prompt
**Plans**: TBD
Plans:
- [ ] 04-01: TBD
- [ ] 04-02: TBD
## Progress
**Execution Order:**
Phases execute in numeric order by default. Phases 2, 3, and 4 each depend only on Phase 1 and are independent of one another, so they can be reordered once Phase 1 is complete.
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Core Pipeline | 0/3 | Not started | - |
| 2. Instrument Variety & Fidelity Control | 0/2 | Not started | - |
| 3. Input & Output Robustness | 0/3 | Not started | - |
| 4. Configuration & Reproducibility | 0/2 | Not started | - |
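
The dependency structure stated in the Phase Details sections can be written down as a small graph, which makes the valid execution orders explicit. This sketch uses the standard-library `graphlib` module (Python 3.9+); the `DEPENDS_ON` mapping is transcribed from the "Depends on" lines, while the helper name `ready_after` is hypothetical.

```python
from graphlib import TopologicalSorter

# Phase -> set of phases it depends on, taken from the "Depends on" lines.
DEPENDS_ON = {1: set(), 2: {1}, 3: {1}, 4: {1}}


def ready_after(done: set[int]) -> list[int]:
    """Phases not yet done whose dependencies are all complete."""
    return sorted(p for p, deps in DEPENDS_ON.items() if p not in done and deps <= done)


# One valid order; Phase 1 always comes first, the rest may appear in any order.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
```

With nothing complete, only Phase 1 is ready. Once Phase 1 is done, Phases 2, 3, and 4 all become ready at the same time.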

60
.planning/STATE.md Normal file

@@ -0,0 +1,60 @@
# Project State
## Project Reference
See: .planning/PROJECT.md (updated 2026-04-11)
**Core value:** A hummed melody input must produce instrument-specific output that audibly follows the melody's contour and rhythm
**Current focus:** Phase 1: Core Pipeline
## Current Position
Phase: 1 of 4 (Core Pipeline)
Plan: 0 of 3 in current phase
Status: Ready to plan
Last activity: 2026-04-11 -- Roadmap created
Progress: [..........] 0%
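
The progress line can be generated from plan counts instead of edited by hand. This is a minimal sketch, assuming progress is measured as completed plans over total plans (10 across the four phases in the roadmap table) and that `#` marks a filled cell. The fill character and the function name are assumptions, since only the empty state appears in this file.

```python
def progress_bar(done: int, total: int, width: int = 10) -> str:
    """Render a fixed-width text bar followed by an integer percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    done = max(0, min(done, total))  # clamp to the valid range
    filled = done * width // total
    percent = done * 100 // total
    return "[" + "#" * filled + "." * (width - filled) + f"] {percent}%"
```

With zero of ten plans complete, this reproduces the bar shown above.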
## Performance Metrics
**Velocity:**
- Total plans completed: 0
- Average duration: -
- Total execution time: 0 hours
**By Phase:**
| Phase | Plans | Total | Avg/Plan |
|-------|-------|-------|----------|
| - | - | - | - |
**Recent Trend:**
- Last 5 plans: -
- Trend: -
*Updated after each plan completion*
## Accumulated Context
### Decisions
Decisions are logged in the Key Decisions table of PROJECT.md.
Recent decisions affecting current work:
- [Roadmap]: ACE-Step 1.5 XL-SFT cover mode is the sole generation engine for v1. No MusicGen/AudioCraft.
- [Roadmap]: Phases 2-4 depend only on Phase 1 and are independent of each other; they can be executed in any order once Phase 1 is complete.
### Pending Todos
None yet.
### Blockers/Concerns
None yet.
## Session Continuity
Last session: 2026-04-11
Stopped at: Roadmap created, ready to plan Phase 1
Resume file: None