docs(01): capture phase context

This commit is contained in:
John Lightner 2026-04-11 01:58:51 -05:00
parent b7260bbd26
commit 5a04fc3498

View file

@ -0,0 +1,67 @@
# Phase 1: Core Pipeline - Context
**Gathered:** 2026-04-11
**Status:** Ready for planning
<domain>
## Phase Boundary
Single CLI command that takes a humming WAV file and an instrument name, and produces an instrument rendition via ACE-Step XL-SFT cover mode. Output audibly follows the pitch contour and rhythmic timing of the input. No multi-instrument batch, no config files, no advanced error handling — just the core end-to-end flow.
</domain>
<decisions>
## Implementation Decisions
### CLI invocation design
- Invoked as `python hum2inst.py input.wav --instrument piano`
- Python script directly, no installation step
- `--instrument` as a named CLI flag (not positional)
- `--output` flag optional, defaults to `./output/` directory
- Use Python argparse for argument parsing (gives --help for free)
### ACE-Step generation parameters
- Default cover_strength in the high fidelity range (0.8-1.0)
- `--strength` flag exposed in Phase 1 so users can experiment immediately
- Duration matches input WAV length by default; `--duration` flag to override
- Caption auto-built from instrument name (e.g., "piano cover of a melody") — no custom prompt flag in Phase 1
### Output behavior
- Output filename includes instrument and timestamp (exact format at Claude's discretion)
- On generation failure or silence: print clear error message, exit with non-zero code
- No auto-play — just save and print the output path
- No silent failures
### Pipeline architecture
- Single `hum2inst.py` script — no module splitting in Phase 1
- Assume CUDA GPU is available; fail with clear message if no GPU detected
- Move existing experimental scripts (midi_to_audio.py, musicgen_melody.py) to an `/archive` folder
### Claude's Discretion
- ACE-Step invocation method (import Python API vs subprocess call — choose based on what ACE-Step exposes)
- Progress/feedback during generation (print statements, progress bar, or similar — pick what's appropriate)
- Exact output filename format (instrument + timestamp pattern)
- Exact cover_strength default value within the 0.8-1.0 range
</decisions>
<specifics>
## Specific Ideas
- ACE-Step XL-SFT cover mode is the generation backend — this is established from prior experimentation
- The `ace-step/` directory already exists in the project root with the model code
- User wants high melodic fidelity as the default — the pipeline should prioritize faithful melody reproduction over creative interpretation
</specifics>
<deferred>
## Deferred Ideas
None — discussion stayed within phase scope
</deferred>
---
*Phase: 01-core-pipeline*
*Context gathered: 2026-04-11*