docs(01): capture phase context
parent b7260bbd26
commit 5a04fc3498
1 changed file with 67 additions and 0 deletions
.planning/phases/01-core-pipeline/01-CONTEXT.md (Normal file)
@@ -0,0 +1,67 @@
# Phase 1: Core Pipeline - Context

**Gathered:** 2026-04-11
**Status:** Ready for planning

<domain>
## Phase Boundary

A single CLI command takes a humming WAV file and an instrument name and produces an instrument rendition via ACE-Step XL-SFT cover mode. The output audibly follows the pitch contour and rhythmic timing of the input. Out of scope: multi-instrument batch, config files, and advanced error handling. Phase 1 covers only the core end-to-end flow.

</domain>

<decisions>
## Implementation Decisions

### CLI invocation design
- Invoked as `python hum2inst.py input.wav --instrument piano`
- Python script directly, no installation step
- `--instrument` as a named CLI flag (not positional)
- `--output` flag optional, defaults to `./output/` directory
- Use Python argparse for argument parsing (gives --help for free)
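
The CLI decisions above could be sketched with `argparse` roughly as follows. This is a minimal sketch, not the final interface: the helper name `build_parser`, the help strings, and marking `--instrument` as required are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the Phase 1 argument parser (hypothetical helper name)."""
    parser = argparse.ArgumentParser(
        prog="hum2inst.py",
        description="Render a hummed melody as an instrument.",
    )
    # Positional input: the humming WAV file.
    parser.add_argument("input", help="path to the humming WAV file")
    # Named flag rather than positional; required=True is an assumption.
    parser.add_argument("--instrument", required=True,
                        help="target instrument name, e.g. piano")
    # Optional output location, defaulting to ./output/ as decided above.
    parser.add_argument("--output", default="./output/",
                        help="output directory (default: ./output/)")
    return parser


# Parse an explicit argument list, mirroring the documented invocation.
args = build_parser().parse_args(["input.wav", "--instrument", "piano"])
print(args.input, args.instrument, args.output)
```

Passing no list to `parse_args()` reads `sys.argv` instead, which is what the real script would do; `--help` is generated by `argparse` without extra code.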

### ACE-Step generation parameters
- Default `cover_strength` in the high-fidelity range (0.8-1.0)
- `--strength` flag exposed in Phase 1 so users can experiment immediately
- Duration matches input WAV length by default; `--duration` flag to override
- Caption auto-built from instrument name (e.g., "piano cover of a melody") — no custom prompt flag in Phase 1
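
One way to derive these parameters, using only the standard library, is sketched below. The function names, the `DEFAULT_STRENGTH` value of 0.9, and the lowercasing of the instrument name are assumptions; the document only fixes the 0.8-1.0 range and the example caption. The `wave` module reads uncompressed PCM WAV files, so other encodings would need a different reader.

```python
import wave
from typing import Optional

# Assumption: any value in the 0.8-1.0 range satisfies the decision above.
DEFAULT_STRENGTH = 0.9


def build_caption(instrument: str) -> str:
    """Auto-build the caption from the instrument name."""
    return f"{instrument.strip().lower()} cover of a melody"


def wav_duration_seconds(path: str) -> float:
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def resolve_duration(path: str, override: Optional[float] = None) -> float:
    """Use the --duration override if given, else match the input length."""
    if override is not None:
        return override
    return wav_duration_seconds(path)
```

How the caption, strength, and duration are then passed to ACE-Step depends on the interface it exposes, which this sketch does not assume.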

### Output behavior
- Output filename includes instrument and timestamp (exact format at Claude's discretion)
- On generation failure or silence: print a clear error message and exit with a non-zero code
- No auto-play; just save and print the output path
- No silent failures
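
The output rules above might look like the following sketch. The filename pattern, the silence threshold of `1e-4`, the assumption that samples are floats in the range -1.0 to 1.0, and all helper names are illustrative choices, since the exact format is left open.

```python
import sys
from datetime import datetime
from pathlib import Path
from typing import Optional, Sequence


def output_path(output_dir: str, instrument: str,
                now: Optional[datetime] = None) -> Path:
    """Build an output path containing the instrument and a timestamp."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return Path(output_dir) / f"{instrument}_{stamp}.wav"


def is_silent(samples: Sequence[float], threshold: float = 1e-4) -> bool:
    """Treat empty audio, or audio whose peak is below threshold, as silence."""
    return len(samples) == 0 or max(abs(s) for s in samples) < threshold


def fail(message: str) -> None:
    """Print a clear error to stderr and exit with a non-zero code."""
    print(f"error: {message}", file=sys.stderr)
    sys.exit(1)
```

On success the script would save the file and print the resulting path; on an empty or silent result it would call `fail(...)` rather than write a file.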

### Pipeline architecture
- Single `hum2inst.py` script; no module splitting in Phase 1
- Assume a CUDA GPU is available; fail with a clear message if no GPU is detected
- Move existing experimental scripts (midi_to_audio.py, musicgen_melody.py) to an `/archive` folder
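
A fail-fast GPU check could be sketched as below. It assumes PyTorch is the runtime that the ACE-Step code uses and relies on `torch.cuda.is_available()`; the function names, the injectable `probe` parameter, and the error wording are hypothetical.

```python
from typing import Callable


def cuda_available() -> bool:
    """Report whether PyTorch can see a CUDA device."""
    try:
        import torch  # assumption: PyTorch is installed alongside ACE-Step
    except ImportError:
        return False
    return torch.cuda.is_available()


def require_gpu(probe: Callable[[], bool] = cuda_available) -> None:
    """Exit with a clear message when no CUDA GPU is detected."""
    if not probe():
        # SystemExit with a string prints it to stderr and exits with code 1.
        raise SystemExit("error: no CUDA GPU detected; hum2inst.py requires one")
```

Calling `require_gpu()` before loading the model keeps the failure early and readable; the `probe` argument exists only so the check can be exercised on machines without a GPU.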

### Claude's Discretion
- ACE-Step invocation method (import Python API vs subprocess call; choose based on what ACE-Step exposes)
- Progress/feedback during generation (print statements, progress bar, or similar; pick what's appropriate)
- Exact output filename format (instrument + timestamp pattern)
- Exact `cover_strength` default value within the 0.8-1.0 range

</decisions>

<specifics>
## Specific Ideas

- ACE-Step XL-SFT cover mode is the generation backend; this was established by prior experimentation
- The `ace-step/` directory already exists in the project root with the model code
- User wants high melodic fidelity as the default: the pipeline should prioritize faithful melody reproduction over creative interpretation

</specifics>

<deferred>
## Deferred Ideas

None. Discussion stayed within phase scope.

</deferred>

---

*Phase: 01-core-pipeline*
*Context gathered: 2026-04-11*