docs(M001): context, requirements, and roadmap

xpltd 2026-03-17 22:08:56 -05:00
parent 9a94de7455
commit 57365e7af8
5 changed files with 812 additions and 0 deletions

.gsd/DECISIONS.md Normal file

@@ -0,0 +1,21 @@
# Decisions Register
<!-- Append-only. Never edit or remove existing rows.
To reverse a decision, add a new row that supersedes it.
Read this file at the start of any planning or research phase. -->
| # | When | Scope | Decision | Choice | Rationale | Revisable? |
|---|------|-------|----------|--------|-----------|------------|
| D001 | M001 | arch | Backend framework | Python 3.12 + FastAPI | Async-first, Pydantic v2, SSE support, well-documented yt-dlp integration patterns | No |
| D002 | M001 | arch | Frontend framework | Vue 3 + TypeScript + Pinia + Vite | Composition API, `<script setup>`, Pinia 3 (Vuex dead for Vue 3), Vite 8 with Rolldown | No |
| D003 | M001 | arch | Real-time transport | SSE via sse-starlette (not WebSocket) | Server-push only needed; SSE is simpler, HTTP-native, auto-reconnecting. sse-starlette has better disconnect handling than FastAPI native SSE | No |
| D004 | M001 | arch | Database | SQLite via aiosqlite with WAL mode | Single-file, zero external deps, sufficient for single-instance self-hosted tool. WAL required for concurrent download writes | No |
| D005 | M001 | arch | yt-dlp integration | Library import, not subprocess | Structured progress hooks, no shell injection surface, typed error info | No |
| D006 | M001 | arch | Sync-to-async bridge | ThreadPoolExecutor + loop.call_soon_threadsafe | YoutubeDL not picklable (rules out ProcessPoolExecutor). asyncio.Queue is not thread-safe, so worker threads hand events to the event loop via call_soon_threadsafe | No |
| D007 | M001 | arch | Session identity | Opaque UUID in httpOnly cookie, all state in SQLite | Starlette SessionMiddleware signs entire session dict into cookie — grows unboundedly and can be decoded. Opaque ID is simpler and safer | No |
| D008 | M001 | arch | Admin authentication | HTTPBasic + bcrypt 5.0.0 (direct, not passlib) | passlib is unmaintained, breaks on Python 3.13. bcrypt direct is simple and correct. timing-safe comparison via secrets.compare_digest | No |
| D009 | M001 | arch | Config hierarchy | Defaults → config.yaml → env vars → SQLite admin writes | Operators need both infra-as-code (YAML, env) AND live UI config. YAML seeds DB on first boot, then SQLite wins | No |
| D010 | M001 | arch | Scheduler | APScheduler 3.x AsyncIOScheduler (not 4.x alpha) | 3.x is stable and well-documented. 4.x is alpha with breaking changes | Yes — when 4.x ships stable |
| D011 | M001 | convention | TLS handling | Reverse proxy responsibility, not in-container | Standard self-hosted pattern. App provides startup warning when admin enabled without TLS. Secure deployment example with reverse proxy sidecar | No |
| D012 | M001 | convention | Commit strategy | Branch-per-slice with squash merge to main | Clean main history, one commit per slice, individually revertable | No |
| D013 | M001 | scope | Anti-features | OAuth/SSO, WebSocket, user accounts, embedded player, auto-update yt-dlp, subscription monitoring, FlareSolverr — all explicitly out of scope | Each would massively increase scope or conflict with anonymous-first, zero-telemetry positioning | No |
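
D006 can be illustrated with a stdlib-only sketch. It assumes a hypothetical `blocking_download` function standing in for a synchronous yt-dlp call with a progress hook; the names `run_job`, `report`, and `demo` are illustrative and not part of the project.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_download(report):
    """Hypothetical stand-in for a sync yt-dlp download; report() plays the progress hook."""
    for percent in (25, 50, 75, 100):
        time.sleep(0.01)  # simulate blocking network I/O
        report({"status": "downloading", "percent": percent})
    report({"status": "finished", "percent": 100})


async def run_job(executor):
    loop = asyncio.get_running_loop()  # capture the loop on the async side
    queue = asyncio.Queue()

    def report(event):
        # Runs in the worker thread: schedule the queue write on the loop
        # instead of touching the queue directly.
        loop.call_soon_threadsafe(queue.put_nowait, event)

    future = loop.run_in_executor(executor, blocking_download, report)

    events = []
    while True:
        event = await queue.get()
        events.append(event)
        if event["status"] == "finished":
            break
    await future  # surface any exception raised in the worker
    return events


def demo():
    with ThreadPoolExecutor(max_workers=2) as executor:
        return asyncio.run(run_job(executor))
```

The key detail is that the loop reference is captured inside the coroutine, before the worker starts, so the hook never has to look up a loop from the worker thread.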

.gsd/PROJECT.md Normal file

@@ -0,0 +1,34 @@
# media.rip()
## What This Is
A self-hostable, redistributable Docker container — a web-based yt-dlp frontend that anyone can run on their own infrastructure. Users paste a URL, pick quality, and download media without creating an account, sending data anywhere, or knowing what a terminal is. Ships with a cyberpunk default theme, session isolation, and ephemeral downloads. Fully configurable via mounted config file for personal, family, team, or public use.
Ground-up build. Not a MeTube fork. Treats theming, session behavior, purge policy, and operator experience as first-class concerns.
## Core Value
A user can paste any yt-dlp-supported URL, see exactly what they're about to download, and get it — without creating an account, without sending data anywhere, and without knowing what a terminal is.
## Current State
Greenfield. Spec complete (see `/PROJECT.md`). Architecture, feature, stack, and pitfall research complete (see `.planning/research/`). No code written yet.
## Architecture / Key Patterns
- **Backend:** Python 3.12 + FastAPI, yt-dlp as library (not subprocess), aiosqlite for SQLite, sse-starlette for SSE, APScheduler 3.x for cron, bcrypt for admin auth
- **Frontend:** Vue 3 + TypeScript + Pinia + Vite
- **Transport:** SSE (server-push only, no WebSocket)
- **Persistence:** SQLite with WAL mode
- **Critical pattern:** `ThreadPoolExecutor` + `loop.call_soon_threadsafe` bridges sync yt-dlp into async FastAPI — the load-bearing architectural seam
- **Session isolation:** Per-browser cookie-scoped queues (isolated/shared/open modes)
- **Config hierarchy:** Hardcoded defaults → config.yaml → env var overrides → SQLite admin writes
- **Distribution:** Single multi-stage Docker image, GHCR + Docker Hub, amd64 + arm64
## Capability Contract
See `.gsd/REQUIREMENTS.md` for the explicit capability contract, requirement status, and coverage mapping.
## Milestone Sequence
- [ ] M001: media.rip() v1.0 — Full-featured self-hosted yt-dlp web frontend, Docker-distributed

.gsd/REQUIREMENTS.md Normal file

@@ -0,0 +1,455 @@
# Requirements
This file is the explicit capability and coverage contract for the project.
Use it to track what is actively in scope, what has been validated by completed work, what is intentionally deferred, and what is explicitly out of scope.
## Active
### R001 — URL submission + download for any yt-dlp-supported site
- Class: core-capability
- Status: active
- Description: User pastes any URL supported by yt-dlp and the system downloads it to the configured output directory
- Why it matters: The fundamental product primitive — everything else depends on this working
- Source: user
- Primary owning slice: M001/S01
- Supporting slices: none
- Validation: unmapped
- Notes: Jobs keyed by UUID4 (R024), not URL — concurrent same-URL downloads are supported
### R002 — Live format/quality extraction and selection
- Class: core-capability
- Status: active
- Description: GET /api/formats?url= calls yt-dlp extract_info to return available formats; user picks resolution, codec, ext before downloading
- Why it matters: Power users won't use a tool that hides quality choice. Competitors use presets — live extraction is a step up
- Source: user
- Primary owning slice: M001/S01
- Supporting slices: M001/S03
- Validation: unmapped
- Notes: Extraction can take 3-10s for some sites — UI must show loading state. filesize is frequently null
### R003 — Real-time SSE progress
- Class: core-capability
- Status: active
- Description: Server-sent events stream delivers job status transitions (queued→extracting→downloading→completed/failed) with download progress (percent, speed, ETA) per session
- Why it matters: No progress = no trust. Users need to see something is happening
- Source: user
- Primary owning slice: M001/S02
- Supporting slices: M001/S03
- Validation: unmapped
- Notes: SSE via sse-starlette, not WebSocket. Events: init, job_update, job_removed, error, purge_complete
### R004 — SSE init replay on reconnect
- Class: continuity
- Status: active
- Description: When a client reconnects to the SSE endpoint, the server replays current job states from the DB as synthetic events before entering the live queue
- Why it matters: Without this, page refresh clears the queue view even though downloads are running. Breaks session isolation's value proposition entirely
- Source: user
- Primary owning slice: M001/S02
- Supporting slices: none
- Validation: unmapped
- Notes: Eliminates "spinner forever after refresh" bugs. The DB is source of truth, not frontend memory
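
A minimal sketch of the replay-then-live pattern described in R004, assuming a hypothetical `jobs` table and using stdlib `sqlite3` in place of aiosqlite. The function names and the `subscribers` set are illustrative only; the real endpoint would be served through sse-starlette.

```python
import asyncio
import json
import sqlite3


def format_sse(event, data):
    """Serialize one event in the text/event-stream wire format."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


async def event_stream(db, session_id, live_queue, subscribers):
    """Replay current job rows as an init event, then relay live events."""
    subscribers.add(live_queue)  # subscribe before reading the snapshot
    try:
        rows = db.execute(
            "SELECT id, status, percent FROM jobs WHERE session_id = ? ORDER BY id",
            (session_id,),
        ).fetchall()
        jobs = [{"id": r[0], "status": r[1], "percent": r[2]} for r in rows]
        yield format_sse("init", {"jobs": jobs})
        while True:
            event, data = await live_queue.get()
            yield format_sse(event, data)
    finally:
        subscribers.discard(live_queue)  # runs on disconnect or cancellation


async def demo():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE jobs (id TEXT, session_id TEXT, status TEXT, percent REAL)")
    db.execute("INSERT INTO jobs VALUES ('j1', 's1', 'downloading', 40.0)")
    db.execute("INSERT INTO jobs VALUES ('j2', 's2', 'queued', 0.0)")
    queue, subscribers = asyncio.Queue(), set()
    stream = event_stream(db, "s1", queue, subscribers)
    first = await stream.__anext__()
    queue.put_nowait(("job_update", {"id": "j1", "status": "completed", "percent": 100.0}))
    second = await stream.__anext__()
    await stream.aclose()  # simulates the client disconnecting
    return first, second, subscribers
```

The `try/finally` is what keeps a closed connection from leaving a stale subscriber behind.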
### R005 — Download queue: view, cancel, filter, sort
- Class: primary-user-loop
- Status: active
- Description: Users see all their downloads in a unified queue with status, progress, and can cancel or remove entries. Filter by status, sort by date/name
- Why it matters: Table stakes for any download manager UX
- Source: user
- Primary owning slice: M001/S03
- Supporting slices: none
- Validation: unmapped
- Notes: Queue is a projection of SQLite state replayed via SSE
### R006 — Playlist support: parent + collapsible child jobs
- Class: core-capability
- Status: active
- Description: Playlist URLs create a parent job with collapsible child video rows. Parent status reflects aggregate child progress. Mixed success/failure shown per child
- Why it matters: Playlists are a primary use case for self-hosters. MeTube treats them as flat — collapsible parent/child is a step up
- Source: user
- Primary owning slice: M001/S03
- Supporting slices: M001/S01
- Validation: unmapped
- Notes: A 200-video playlist = 201 rows — must be collapsed by default. Parent completes when all children reach completed or failed
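
One way the parent row could be derived from its children is sketched below. The status names come from R003; the function name, the returned fields, and the rule for the in-progress label are assumptions made for illustration.

```python
TERMINAL = {"completed", "failed"}


def summarize_playlist(child_statuses):
    """Aggregate child job statuses into a parent summary (illustrative rules)."""
    total = len(child_statuses)
    finished = sum(1 for s in child_statuses if s in TERMINAL)
    failed = sum(1 for s in child_statuses if s == "failed")
    if total and finished == total:
        status = "completed"  # every child reached completed or failed
    elif finished or any(s in ("extracting", "downloading") for s in child_statuses):
        status = "downloading"  # assumption: any activity marks the parent in progress
    else:
        status = "queued"
    return {"status": status, "finished": finished, "failed": failed, "total": total}
```

Keeping `failed` as a separate count lets the UI show mixed results on a completed parent without expanding it.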
### R007 — Session isolation: isolated (default) / shared / open modes
- Class: differentiator
- Status: active
- Description: Operator selects session mode server-wide. Isolated: each browser sees only its own downloads via httpOnly UUID cookie. Shared: all sessions see all downloads. Open: no session tracking
- Why it matters: The primary differentiator from MeTube (issue #591 closed as "won't fix"). The feature that created demand for forks
- Source: user
- Primary owning slice: M001/S02
- Supporting slices: M001/S03
- Validation: unmapped
- Notes: isolated is the zero-config safe default. Mode switching mid-deployment: isolated rows remain scoped, shared queries all rows
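
The three modes can be read as a query-scoping rule. The sketch below assumes a hypothetical `jobs` table with a `session_id` column and treats open mode as "all rows visible"; it uses stdlib `sqlite3` rather than aiosqlite.

```python
import sqlite3


def visible_jobs(db, mode, session_id=None):
    """Return the job ids a request may see under the given session mode."""
    if mode == "isolated":
        if session_id is None:
            raise ValueError("isolated mode requires a session id")
        cur = db.execute(
            "SELECT id FROM jobs WHERE session_id = ? ORDER BY id", (session_id,)
        )
    elif mode in ("shared", "open"):
        # Assumption: both modes list every row; open mode simply stores no session.
        cur = db.execute("SELECT id FROM jobs ORDER BY id")
    else:
        raise ValueError(f"unknown session mode: {mode!r}")
    return [row[0] for row in cur]
```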
### R008 — Cookie auth: per-session cookies.txt upload
- Class: core-capability
- Status: active
- Description: Users upload a Netscape-format cookies.txt file scoped to their session. Enables downloading paywalled/private content. Files purged on session clear
- Why it matters: The practical reason people move off MeTube. Enables authenticated downloads without embedding credentials in the app
- Source: research
- Primary owning slice: M001/S04
- Supporting slices: none
- Validation: unmapped
- Notes: CVE-2023-35934 — pin yt-dlp >= 2023.07.06. Store per-session at data/sessions/{id}/cookies.txt. Never log contents. Normalize CRLF→LF. Chrome cookie extraction broken since July 2024 — surface Firefox recommendation in UI
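
A sketch of the storage step under the notes above. Validating the session id as a UUID and restricting file permissions are assumptions added for illustration; `store_cookies` is a hypothetical helper name.

```python
import uuid
from pathlib import Path


def store_cookies(data_dir, session_id, raw):
    """Write an uploaded cookies.txt under data_dir/sessions/{id}/ with LF line endings."""
    # Parsing as a UUID rejects path-traversal attempts such as "../../etc".
    sid = str(uuid.UUID(session_id))
    text = raw.decode("utf-8", errors="replace")
    text = text.replace("\r\n", "\n").replace("\r", "\n")  # normalize CRLF and bare CR
    target = Path(data_dir) / "sessions" / sid / "cookies.txt"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(text.encode("utf-8"))  # bytes write avoids newline translation
    target.chmod(0o600)  # assumption: owner-only access for sensitive files
    return target
```

Nothing in the helper logs or returns the file contents, in line with the "never log" note.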
### R009 — Purge system: scheduled/manual/never, independent file + log TTL
- Class: operability
- Status: active
- Description: Operator configures purge mode (scheduled cron, manual-only, never). File TTL and log TTL are independent values. Purge activity written to audit log. Purge must skip active downloads
- Why it matters: Ephemeral storage is the contract with users. Operators need control over disk lifecycle
- Source: user
- Primary owning slice: M001/S04
- Supporting slices: none
- Validation: unmapped
- Notes: Purge must filter status IN (completed, failed, cancelled) — never delete files for active downloads. Handle already-deleted files gracefully
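
The purge filter in the notes might look like the following sketch. The `jobs` columns (`filepath`, `finished_at`) are hypothetical, and stdlib `sqlite3` stands in for aiosqlite.

```python
import sqlite3
import time
from pathlib import Path

PURGEABLE = ("completed", "failed", "cancelled")


def purge_files(db, file_ttl_seconds, now=None):
    """Delete files of finished jobs older than the TTL; never touch active jobs."""
    now = time.time() if now is None else now
    cutoff = now - file_ttl_seconds
    rows = db.execute(
        "SELECT id, filepath FROM jobs "
        "WHERE status IN (?, ?, ?) AND finished_at < ? AND filepath IS NOT NULL "
        "ORDER BY id",
        (*PURGEABLE, cutoff),
    ).fetchall()
    purged = []
    for job_id, filepath in rows:
        Path(filepath).unlink(missing_ok=True)  # tolerate already-deleted files
        db.execute("UPDATE jobs SET filepath = NULL WHERE id = ?", (job_id,))
        purged.append(job_id)
    db.commit()
    return purged
```

A separate log TTL would follow the same shape against the log table, which is why the two values can stay independent.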
### R010 — Three built-in themes: cyberpunk (default), dark, light
- Class: differentiator
- Status: active
- Description: Three themes baked into the Docker image. Cyberpunk is default: #00a8ff/#ff6b2b, JetBrains Mono, scanlines, grid overlay. Dark and light are clean alternatives
- Why it matters: Visual identity differentiator — every other tool ships with plain material/tailwind defaults. Cyberpunk makes first impressions memorable
- Source: user
- Primary owning slice: M001/S05
- Supporting slices: none
- Validation: unmapped
- Notes: Built-in themes compiled into frontend bundle. Heavily commented as drop-in documentation for custom theme authors
### R011 — Drop-in custom theme system via volume mount
- Class: differentiator
- Status: active
- Description: Operators drop a theme folder into /themes volume mount. Theme pack: theme.css (CSS variable overrides) + metadata.json + optional preview.png + optional assets/. Appears in picker without recompile
- Why it matters: The feature MeTube refuses to build. Lowers theming floor to "edit a CSS file"
- Source: user
- Primary owning slice: M001/S05
- Supporting slices: none
- Validation: unmapped
- Notes: Theme directory scanned at startup + on-demand re-scan. No file watchers needed
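
A startup scan of the theme volume could be sketched as follows. The `name` field in metadata.json and the shape of the returned entries are assumptions; only the file names come from the requirement.

```python
import json
from pathlib import Path


def scan_themes(themes_dir):
    """List theme packs that contain both theme.css and a valid metadata.json."""
    root = Path(themes_dir)
    themes = []
    if not root.is_dir():
        return themes
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        css = folder / "theme.css"
        meta_path = folder / "metadata.json"
        if not (css.is_file() and meta_path.is_file()):
            continue  # incomplete pack: skip rather than fail startup
        try:
            meta = json.loads(meta_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed metadata
        if not isinstance(meta, dict):
            continue
        themes.append({
            "id": folder.name,
            "name": meta.get("name", folder.name),  # "name" is a hypothetical field
            "has_preview": (folder / "preview.png").is_file(),
        })
    return themes
```

Calling the same function from an admin-triggered endpoint would cover the on-demand re-scan without file watchers.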
### R012 — CSS variable contract (base.css) as stable theme API
- Class: constraint
- Status: active
- Description: A documented, stable set of CSS custom properties (--color-bg, --color-accent-primary, --font-ui, --radius-sm, --effect-overlay, etc.) that all themes override. Token names cannot change after v1.0 ships — they are the public API for custom themes
- Why it matters: Changing token names after operators write custom themes breaks those themes. This is a one-way door
- Source: user
- Primary owning slice: M001/S05
- Supporting slices: M001/S03
- Validation: unmapped
- Notes: Must be designed before component work references token names. Establish early in S05, referenced by S03 components
### R013 — Mobile-responsive layout
- Class: primary-user-loop
- Status: active
- Description: <768px breakpoint: bottom tab bar (Submit/Queue/Settings), full-width URL input, card list for queue (swipe-to-cancel), bottom sheet for format options. All tap targets minimum 44px
- Why it matters: >50% of self-hoster interactions happen on phone or tablet. No existing yt-dlp web UI does mobile well
- Source: user
- Primary owning slice: M001/S03
- Supporting slices: none
- Validation: unmapped
- Notes: Desktop (≥768px): top header bar, left sidebar (collapsible), full download table
### R014 — Admin panel with secure auth
- Class: operability
- Status: active
- Description: Admin panel with username/password login (HTTPBasic + bcrypt). First-boot credential setup with forced change prompt. Session list, storage view, manual purge trigger, live config editor, unsupported URL log download. Security posture: timing-safe comparison (secrets.compare_digest), Secure/HttpOnly/SameSite=Strict cookies behind TLS, security headers on admin routes (HSTS, X-Content-Type-Options, X-Frame-Options), startup warning when admin enabled without TLS detected
- Why it matters: Shipping an admin panel with weak auth undermines the trust proposition of the entire product. Operators deserve qBittorrent/Sonarr-level login UX, not raw tokens
- Source: user
- Primary owning slice: M001/S04
- Supporting slices: none
- Validation: unmapped
- Notes: If no X-Forwarded-Proto: https detected, log warning. Admin routes hidden from nav unless credentials configured
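
Parts of the security posture can be sketched with the standard library alone. The helper names, the warning text, and the exact header values below are assumptions; password verification with bcrypt is omitted because it needs the third-party package.

```python
import secrets


def usernames_match(supplied, expected):
    """Timing-safe username comparison; bytes avoid compare_digest's ASCII-only str rule."""
    return secrets.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))


def tls_warning(headers):
    """Return a warning string when the request did not arrive via an HTTPS proxy."""
    lowered = {k.lower(): v for k, v in headers.items()}
    proto = lowered.get("x-forwarded-proto", "").split(",")[0].strip().lower()
    if proto == "https":
        return None
    return "Admin panel enabled without TLS: credentials may travel in cleartext"


def security_headers():
    """Headers for admin routes; the HSTS max-age is an illustrative value."""
    return {
        "Strict-Transport-Security": "max-age=63072000; includeSubDomains",
        "X-Content-Type-Options": "nosniff",
        "X-Frame-Options": "DENY",
    }
```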
### R015 — Unsupported URL reporting with audit log
- Class: failure-visibility
- Status: active
- Description: When yt-dlp fails with extraction error, job shows failed badge + "Report unsupported site" button. Click appends to log (domain-only by default, full URL opt-in). Admin downloads log. Zero automatic outbound reporting
- Why it matters: Users see exactly what gets logged. Trust feature — transparency in failure handling
- Source: user
- Primary owning slice: M001/S04
- Supporting slices: none
- Validation: unmapped
- Notes: User-triggered only. Config report_full_url controls privacy level
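
The privacy switch can be illustrated with a small sketch. `report_entry` is a hypothetical helper; only the `report_full_url` option name comes from the requirement.

```python
from urllib.parse import urlsplit


def report_entry(url, report_full_url=False):
    """Return what gets appended to the unsupported-site log for one report."""
    if report_full_url:
        return url  # operator opted in to full URLs
    host = urlsplit(url).hostname  # drops scheme, credentials, port, path, query
    return host or "unknown"
```

With the default setting, a report for `https://video.example.com/watch?v=abc` would log only the host.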
### R016 — Health endpoint
- Class: operability
- Status: active
- Description: GET /api/health returns status, version, yt_dlp_version, uptime
- Why it matters: Uptime Kuma and similar monitoring tools are table stakes for self-hosters
- Source: user
- Primary owning slice: M001/S02
- Supporting slices: none
- Validation: unmapped
- Notes: Extend with disk space and queue depth if practical
### R017 — Session export/import
- Class: continuity
- Status: active
- Description: Export session as JSON archive (download history + queue state + preferences). Import restores history into a new session. Does not require sign-in, stays anonymous-first
- Why it matters: Enables identity continuity on persistent instances without a real account system. No competitor offers this
- Source: research
- Primary owning slice: M001/S04
- Supporting slices: none
- Validation: unmapped
- Notes: Meaningless in open mode — UI should hide export button when session mode is open
### R018 — Link sharing (completed file shareable URL)
- Class: primary-user-loop
- Status: active
- Description: Completed downloads are served at predictable URLs. Users can copy a direct download link to share with others
- Why it matters: Removes the "now what?" question after downloading — users share a ripped file with a friend via URL
- Source: research
- Primary owning slice: M001/S04
- Supporting slices: none
- Validation: unmapped
- Notes: Requires knowing the output filename. Files served via FastAPI StaticFiles or explicit route on /downloads
### R019 — Source-aware output templates
- Class: core-capability
- Status: active
- Description: Per-site default output templates (YouTube: uploader/title, SoundCloud: uploader/title, generic: title). Configurable via config.yaml source_templates map
- Why it matters: Sensible defaults per-site are a step up from MeTube's single global template. Organizes downloads without user effort
- Source: user
- Primary owning slice: M001/S01
- Supporting slices: none
- Validation: unmapped
- Notes: Per-download override also supported (R025)
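
A sketch of template resolution, assuming the `source_templates` map is keyed by domain with a `"*"` fallback (the key scheme is an assumption). The template strings use yt-dlp's `%(field)s` output-template syntax.

```python
from urllib.parse import urlsplit

# Hypothetical defaults mirroring the description above.
SOURCE_TEMPLATES = {
    "youtube.com": "%(uploader)s/%(title)s.%(ext)s",
    "soundcloud.com": "%(uploader)s/%(title)s.%(ext)s",
    "*": "%(title)s.%(ext)s",
}


def resolve_template(url, templates=SOURCE_TEMPLATES, override=None):
    """Pick the output template: per-download override, then site match, then generic."""
    if override:
        return override
    host = (urlsplit(url).hostname or "").lower()
    for domain, template in templates.items():
        if domain != "*" and (host == domain or host.endswith("." + domain)):
            return template
    return templates["*"]
```

Short-link hosts such as `youtu.be` would need their own entries under this matching rule.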
### R020 — Zero automatic outbound telemetry
- Class: constraint
- Status: active
- Description: The container makes zero automatic outbound network requests. No CDN calls, no Google Fonts, no update checks, no analytics. All fonts and assets bundled or self-hosted
- Why it matters: Trust is the core proposition. Competing tools have subtle external requests. This is an explicit design constraint, not an afterthought
- Source: user
- Primary owning slice: M001/S06
- Supporting slices: all
- Validation: unmapped
- Notes: Verified by checking zero outbound network requests from container during normal operation
### R021 — Docker: single multi-stage image, GHCR + Docker Hub, amd64 + arm64
- Class: launchability
- Status: active
- Description: Single Dockerfile, multi-stage build (Node frontend builder → Python deps → slim runtime with ffmpeg). Published to ghcr.io/xpltd/media-rip and docker.io/xpltd/media-rip. Both amd64 and arm64 architectures
- Why it matters: Docker is the distribution mechanism for self-hosted tools. arm64 users (Raspberry Pi, Apple Silicon NAS) are a significant audience
- Source: user
- Primary owning slice: M001/S06
- Supporting slices: none
- Validation: unmapped
- Notes: Target <400MB compressed. ffmpeg from Debian apt supports arm64 natively
### R022 — CI/CD: lint + test on PR, build + push on tag
- Class: launchability
- Status: active
- Description: GitHub Actions: ci.yml runs ruff + pytest + eslint + vue-tsc + vitest + Docker smoke on PRs. publish.yml builds multi-platform image and pushes to both registries on v*.*.* tags. Generates GitHub Release with changelog
- Why it matters: Ensures the image stays functional as yt-dlp extractors evolve. Automated quality gate
- Source: user
- Primary owning slice: M001/S06
- Supporting slices: none
- Validation: unmapped
- Notes: CI smoke-tests downloads from 2+ sites to catch extractor breakage
### R023 — Config system: config.yaml + env var overrides + admin live writes
- Class: operability
- Status: active
- Description: Four-layer config: hardcoded defaults → config.yaml (read-only at start) → env var overrides (MEDIARIP__SECTION__KEY) → SQLite admin writes (live, no restart). All fields optional — zero-config works out of the box
- Why it matters: Operators need infrastructure-as-code (YAML, env vars) AND live UI config without restart
- Source: user
- Primary owning slice: M001/S01
- Supporting slices: M001/S04
- Validation: unmapped
- Notes: YAML seeds DB on first boot, then SQLite wins. YAML never reflects admin UI changes — document this clearly
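
The precedence order can be sketched as a chain of merges, lowest priority first. The section and key names in the usage example are hypothetical, YAML parsing is left out (the sketch takes an already-parsed dict), and env values stay strings.

```python
def deep_merge(base, override):
    """Return a new dict where override wins, merging nested sections key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def env_overrides(environ, prefix="MEDIARIP__"):
    """Turn MEDIARIP__SECTION__KEY=value variables into a nested dict."""
    result = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        parts = [p.lower() for p in name[len(prefix):].split("__") if p]
        if not parts:
            continue
        node = result
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value  # values are left as strings in this sketch
    return result


def effective_config(defaults, yaml_cfg, environ, db_cfg):
    """Apply layers in order: defaults, config.yaml, env vars, SQLite admin writes."""
    merged = deep_merge(defaults, yaml_cfg)
    merged = deep_merge(merged, env_overrides(environ))
    return deep_merge(merged, db_cfg)
```

For example, a default of `purge.mode = scheduled` overridden to `manual` in YAML and then to `never` by an admin write resolves to `never`.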
### R024 — Concurrent same-URL support
- Class: core-capability
- Status: active
- Description: Jobs keyed by UUID4, not URL. Submitting the same URL twice at different qualities creates two independent jobs
- Why it matters: Users legitimately want the same video in different formats. URL-keyed dedup would prevent this
- Source: user
- Primary owning slice: M001/S01
- Supporting slices: none
- Validation: unmapped
- Notes: Intentional design per PROJECT.md
### R025 — Per-download output template override
- Class: core-capability
- Status: active
- Description: Users can override the output template on a per-download basis, in addition to the source-aware defaults (R019)
- Why it matters: Power users want control over file naming for specific downloads
- Source: user
- Primary owning slice: M001/S03
- Supporting slices: none
- Validation: unmapped
- Notes: UI field in "More options" area
### R026 — Secure deployment example
- Class: launchability
- Status: active
- Description: docker-compose.example.yml ships with a reverse proxy + TLS configuration as the default documented deployment path, not an afterthought
- Why it matters: Making the secure path the default path prevents operators from accidentally running admin auth over cleartext
- Source: user
- Primary owning slice: M001/S06
- Supporting slices: none
- Validation: unmapped
- Notes: Caddy or Traefik sidecar — decision deferred to slice planning
## Deferred
### R027 — Per-format download presets (saved quality profiles)
- Class: primary-user-loop
- Status: deferred
- Description: Save "my 720p MP3 preset" for reuse across downloads
- Why it matters: Convenience feature for repeat users
- Source: research
- Primary owning slice: none
- Supporting slices: none
- Validation: unmapped
- Notes: Deferred — v1 needs live format selection working first. Add when session system is stable
### R028 — GitHub issue prefill for unsupported URL reporting
- Class: failure-visibility
- Status: deferred
- Description: Config option reporting.github_issues: true opens pre-filled GitHub issue for unsupported URLs
- Why it matters: Streamlines community reporting of extractor gaps
- Source: research
- Primary owning slice: none
- Supporting slices: none
- Validation: unmapped
- Notes: Deferred — enable only after log download (R015) is validated
### R029 — Queue filter/sort persistence in localStorage
- Class: primary-user-loop
- Status: deferred
- Description: Store last sort/filter state in localStorage so it persists across page loads
- Why it matters: Minor convenience — avoids resetting sort every refresh
- Source: research
- Primary owning slice: none
- Supporting slices: none
- Validation: unmapped
- Notes: Trivial to add post-v1
## Out of Scope
### R030 — OAuth / SSO integration
- Class: anti-feature
- Status: out-of-scope
- Description: Centralized auth via OAuth/SSO providers
- Why it matters: Prevents massive scope increase. Reverse proxy handles AuthN; media.rip handles AuthZ via session mode + admin auth
- Source: research
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: Authentik, Authelia, Traefik ForwardAuth are the operator's tools for this
### R031 — WebSocket transport
- Class: anti-feature
- Status: out-of-scope
- Description: WebSocket for real-time communication
- Why it matters: SSE covers 100% of actual needs (server-push only). WebSocket adds complexity without benefit
- Source: research
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: SSE is simpler, HTTP-native, auto-reconnecting via browser EventSource
### R032 — User accounts / registration
- Class: anti-feature
- Status: out-of-scope
- Description: User registration, login, password reset
- Why it matters: Anonymous-first identity model. Session isolation provides multi-user support without accounts
- Source: research
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: Would fundamentally change the product shape
### R033 — Automatic yt-dlp update at runtime
- Class: anti-feature
- Status: out-of-scope
- Description: Auto-update yt-dlp extractors inside running container
- Why it matters: Breaks immutable containers and reproducible builds. Version drift between deployments
- Source: research
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: Pin version in requirements; publish new image on yt-dlp releases via CI
### R034 — Embedded video player
- Class: anti-feature
- Status: out-of-scope
- Description: Play downloaded media within the web UI
- Why it matters: Adds significant frontend complexity, licensing surface for codecs, scope creep. Files go to Jellyfin/Plex anyway
- Source: research
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: Serve files at predictable paths; users open in their preferred player
### R035 — Subscription / channel monitoring
- Class: anti-feature
- Status: out-of-scope
- Description: "Set it and forget it" channel archiving
- Why it matters: Fundamentally different product — a scheduler/archiver vs a download UI. Tools like Pinchflat, TubeArchivist do this better
- Source: research
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: Architecture should not block adding it later. APScheduler already present for purge
### R036 — FlareSolverr / Cloudflare bypass
- Class: anti-feature
- Status: out-of-scope
- Description: Cloudflare bypass via external FlareSolverr service
- Why it matters: Introduces external service dependency, legal gray area, niche use case
- Source: research
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: cookies.txt upload (R008) solves authenticated content for most users
## Traceability
| ID | Class | Status | Primary owner | Supporting | Proof |
|---|---|---|---|---|---|
| R001 | core-capability | active | M001/S01 | none | unmapped |
| R002 | core-capability | active | M001/S01 | M001/S03 | unmapped |
| R003 | core-capability | active | M001/S02 | M001/S03 | unmapped |
| R004 | continuity | active | M001/S02 | none | unmapped |
| R005 | primary-user-loop | active | M001/S03 | none | unmapped |
| R006 | core-capability | active | M001/S03 | M001/S01 | unmapped |
| R007 | differentiator | active | M001/S02 | M001/S03 | unmapped |
| R008 | core-capability | active | M001/S04 | none | unmapped |
| R009 | operability | active | M001/S04 | none | unmapped |
| R010 | differentiator | active | M001/S05 | none | unmapped |
| R011 | differentiator | active | M001/S05 | none | unmapped |
| R012 | constraint | active | M001/S05 | M001/S03 | unmapped |
| R013 | primary-user-loop | active | M001/S03 | none | unmapped |
| R014 | operability | active | M001/S04 | none | unmapped |
| R015 | failure-visibility | active | M001/S04 | none | unmapped |
| R016 | operability | active | M001/S02 | none | unmapped |
| R017 | continuity | active | M001/S04 | none | unmapped |
| R018 | primary-user-loop | active | M001/S04 | none | unmapped |
| R019 | core-capability | active | M001/S01 | none | unmapped |
| R020 | constraint | active | M001/S06 | all | unmapped |
| R021 | launchability | active | M001/S06 | none | unmapped |
| R022 | launchability | active | M001/S06 | none | unmapped |
| R023 | operability | active | M001/S01 | M001/S04 | unmapped |
| R024 | core-capability | active | M001/S01 | none | unmapped |
| R025 | core-capability | active | M001/S03 | none | unmapped |
| R026 | launchability | active | M001/S06 | none | unmapped |
| R027 | primary-user-loop | deferred | none | none | unmapped |
| R028 | failure-visibility | deferred | none | none | unmapped |
| R029 | primary-user-loop | deferred | none | none | unmapped |
| R030 | anti-feature | out-of-scope | none | none | n/a |
| R031 | anti-feature | out-of-scope | none | none | n/a |
| R032 | anti-feature | out-of-scope | none | none | n/a |
| R033 | anti-feature | out-of-scope | none | none | n/a |
| R034 | anti-feature | out-of-scope | none | none | n/a |
| R035 | anti-feature | out-of-scope | none | none | n/a |
| R036 | anti-feature | out-of-scope | none | none | n/a |
## Coverage Summary
- Active requirements: 26
- Mapped to slices: 26
- Validated: 0
- Unmapped active requirements: 0


@@ -0,0 +1,126 @@
# M001: media.rip() v1.0 — Context
**Gathered:** 2026-03-17
**Status:** Ready for planning
## Project Description
media.rip() is a self-hostable web-based yt-dlp frontend distributed as a Docker container. Users paste any yt-dlp-supported URL, select format/quality from live extraction, and download media — no account, no telemetry, no terminal. Ground-up build targeting the gaps every competitor (MeTube, yt-dlp-web-ui, ytptube) leaves open: session isolation, real theming, mobile UX, and operator-first configuration.
## Why This Milestone
This is the only milestone. M001 delivers the complete v1.0 product — from first line of code through Docker distribution. The product cannot ship partially; a download tool without real-time progress, or with progress but no session isolation, or with isolation but no admin panel, would be an incomplete product that fails to differentiate from existing tools.
## User-Visible Outcome
### When this milestone is complete, the user can:
- Run `docker compose up` and access a fully functional download UI at :8080 with cyberpunk theme, zero configuration
- Paste any yt-dlp-supported URL, pick format/quality from live extraction, and download to /downloads
- See real-time progress (percent, speed, ETA) via SSE, surviving page refreshes
- Use isolated session mode (default) so two browsers see only their own downloads
- Upload cookies.txt for paywalled content, scoped to their session
- Switch between cyberpunk, dark, and light themes — or drop a custom theme into /themes
- Access admin panel via username/password login to manage sessions, storage, purge, and config
- Deploy securely using the provided reverse-proxy + TLS compose example
### Entry point / environment
- Entry point: `docker compose up` → http://localhost:8080 (dev), https://media.example.com (prod behind reverse proxy)
- Environment: Docker container, browser-accessed
- Live dependencies involved: yt-dlp (bundled library), ffmpeg (bundled binary), SQLite (embedded)
## Completion Class
- Contract complete means: all API endpoints respond correctly, yt-dlp downloads succeed, SSE streams deliver events, session isolation works, admin auth rejects unauthorized requests, purge deletes correct files, themes apply correctly
- Integration complete means: frontend ↔ backend SSE flow works end-to-end, yt-dlp progress hooks bridge to browser progress bars, admin config changes take effect live, theme volume mount → picker → apply chain works
- Operational complete means: Docker image builds for both architectures, CI runs on PR, CD publishes on tag, health endpoint responds, startup TLS warning fires when appropriate
## Final Integrated Acceptance
To call this milestone complete, we must prove:
- Paste a YouTube URL in the browser → pick quality → see real-time progress → file appears in /downloads (the full primary loop)
- Open two different browsers → each sees only its own downloads (session isolation)
- Admin login → change a config value → effect visible without container restart
- Drop a custom theme folder into /themes volume → restart → appears in theme picker → applies correctly
- `docker compose up` with zero config → everything works at :8080 with cyberpunk theme and isolated mode
- Tag v0.1.0 → GitHub Actions builds and pushes amd64 + arm64 images to both registries
## Risks and Unknowns
- **Sync-to-async bridge correctness** — yt-dlp is synchronous, FastAPI is async. ThreadPoolExecutor + `call_soon_threadsafe` is the known-correct pattern, but getting the event loop capture and progress hook wiring wrong produces silent event loss or blocked loops. Must be proven in S01
- **SSE disconnect handling** — CancelledError swallowing creates zombie connections. sse-starlette handles this but the generator must use try/finally correctly. Must be proven in S02
- **SQLite write contention** — WAL mode + busy_timeout handles this for the expected load, but must be enabled at DB init before any schema work. Addressed in S01
- **CSS variable contract is a one-way door** — Token names cannot change after operators write custom themes. Must be designed deliberately in S05, not evolved by accident
- **cookies.txt security** — CVE-2023-35934 is fixed in yt-dlp 2023.07.06, so pin yt-dlp >= 2023.07.06. Cookie files are sensitive: never log them, store them per-session, and delete them on purge
- **Admin auth over cleartext** — If the operator doesn't use TLS, admin credentials are sent in cleartext (HTTP Basic only base64-encodes them). Mitigated by a startup warning + secure deployment docs, but it can't be prevented from the app side
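The sync-to-async bridge risk above can be illustrated with a minimal sketch that uses only the standard library. `blocking_job` is a hypothetical stand-in for a synchronous yt-dlp download, and the event dict shape is an assumption, not the project's `ProgressEvent` model. The two points it tries to show are that the loop is captured on the event-loop thread before the handoff, and that the worker thread never touches the `asyncio.Queue` directly.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def blocking_job(report):
    """Hypothetical stand-in for a synchronous download running in a worker thread."""
    for percent in (25, 50, 75, 100):
        report({"status": "downloading", "percent": percent})
    report({"status": "finished", "percent": 100})


async def run_job():
    # Capture the running loop here, on the event-loop thread, before the handoff.
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()

    def report(event):
        # asyncio.Queue is not thread-safe, so schedule put_nowait on the loop
        # instead of calling it from the worker thread.
        loop.call_soon_threadsafe(queue.put_nowait, event)

    received = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = loop.run_in_executor(pool, blocking_job, report)
        while True:
            event = await queue.get()
            received.append(event)
            if event["status"] == "finished":
                break
        await future  # surface any exception raised inside the worker
    return received


if __name__ == "__main__":
    for item in asyncio.run(run_job()):
        print(item)
```

Calling `queue.put_nowait` straight from the worker thread would often appear to work, but it does not wake the event loop, which is one way the silent event loss described above can arise.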
## Existing Codebase / Prior Art
- `PROJECT.md` — comprehensive product spec with data models, API surface, SSE schema, config schema, Dockerfile sketch, CI/CD outline
- `.planning/research/ARCHITECTURE.md` — system diagram, component boundaries, data flow paths, anti-patterns, Docker layering strategy
- `.planning/research/FEATURES.md` — feature landscape, competitor analysis, dependency graph, edge cases, MVP definition
- `.planning/research/STACK.md` — pinned versions for all dependencies, integration patterns, known pitfalls per library
- `.planning/research/PITFALLS.md` — critical pitfalls with prevention strategies and warning signs
- `.planning/research/SUMMARY.md` — executive summary of all research with confidence assessments
> See `.gsd/DECISIONS.md` for all architectural and pattern decisions — it is an append-only register; read it during planning, append to it during execution.
## Relevant Requirements
- R001-R006 — Core download loop (URL → format → progress → queue → playlist)
- R007 — Session isolation (the primary differentiator)
- R003, R004 — SSE transport + replay (the technical enabler for isolation)
- R014 — Admin panel with secure auth (trust proposition)
- R010-R012 — Theme system (visual identity + operator customization)
- R021-R022 — Docker distribution + CI/CD (the delivery mechanism)
- R020 — Zero telemetry (hard constraint on all slices)
## Scope
### In Scope
- Complete backend: FastAPI app with all API endpoints, yt-dlp integration, SSE, sessions, admin, purge, config, health
- Complete frontend: Vue 3 SPA with download queue, format picker, progress, playlist UI, mobile layout, admin panel, theme picker
- Three built-in themes + drop-in custom theme system
- Cookie auth (cookies.txt per-session)
- Session export/import
- Unsupported URL reporting
- Docker packaging + CI/CD
- Secure deployment documentation
### Out of Scope / Non-Goals
- OAuth/SSO, user accounts, WebSocket, embedded player, auto-update yt-dlp, subscription monitoring, FlareSolverr (see R030-R036)
- TLS termination inside the container (reverse proxy responsibility)
- Telegram/Discord bot (v2+ extension point)
- Arr-stack API integration (v2+)
## Technical Constraints
- Python 3.12 (not 3.13 — passlib breakage)
- yt-dlp as library, not subprocess (structured progress hooks, no shell injection)
- YoutubeDL instance created fresh per job — never shared across threads
- ThreadPoolExecutor only (not ProcessPoolExecutor — YoutubeDL not picklable)
- SQLite with WAL mode, synchronous=NORMAL, busy_timeout=5000 — enabled before any schema work
- SSE via sse-starlette (not FastAPI native — better disconnect handling)
- APScheduler 3.x (not 4.x alpha)
- bcrypt 5.0.0 direct (not passlib — unmaintained, Python 3.13 breakage)
- All fonts/assets bundled — zero external CDN requests
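As a rough illustration of the SQLite constraint above, the sketch below applies the three pragmas with the standard-library `sqlite3` module. This is an assumption made to keep the example self-contained: the project plans to use aiosqlite, where the same `PRAGMA` statements would be issued through its async connection. The names `open_db`, `read_pragmas`, and `jobs.db` are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_db(path):
    """Open a connection and apply the pragmas before any schema work."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute("PRAGMA busy_timeout=5000")
    return conn


def read_pragmas(conn):
    """Read the settings back so a test can assert on them."""
    names = ("journal_mode", "synchronous", "busy_timeout")
    return {name: conn.execute(f"PRAGMA {name}").fetchone()[0] for name in names}


def demo():
    # WAL needs a file-backed database, so use a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_db(Path(tmp) / "jobs.db")
        try:
            return read_pragmas(conn)
        finally:
            conn.close()


if __name__ == "__main__":
    print(demo())
```

One detail worth keeping in mind: `journal_mode=WAL` is stored in the database file, while `synchronous` and `busy_timeout` apply per connection, so every new connection needs to set them again.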
## Integration Points
- **yt-dlp** — library import, ThreadPoolExecutor workers, progress hooks via call_soon_threadsafe
- **ffmpeg** — installed in Docker image, found by yt-dlp via PATH for muxing
- **sse-starlette** — EventSourceResponse wrapping async generators
- **APScheduler AsyncIOScheduler** — started in FastAPI lifespan, shares event loop
- **aiosqlite** — connection pool via FastAPI Depends, WAL mode
- **GitHub Actions** — CI (lint/test on PR) + CD (build/push on tag)
- **GHCR + Docker Hub** — image registry targets
## Open Questions
- **Reverse proxy for deployment example** — Caddy vs Traefik. Leaning Caddy for simplicity (one-liner TLS). Decide during S06 planning
- **First-boot admin UX** — How pushy should the forced credential change prompt be? Decide during S04 planning
- **HTTP/2 for SSE connection limit** — Browsers allow only about six concurrent HTTP/1.1 connections per domain, and every open SSE stream holds one of them. Caddy serves HTTP/2 automatically if chosen as the reverse proxy. Confirm approach during S06

@ -0,0 +1,176 @@
# M001: media.rip() v1.0 — Ship It
**Vision:** Deliver a complete self-hostable yt-dlp web frontend as a Docker container. Paste a URL, pick quality, download — with session isolation, real-time progress, a cyberpunk default theme, secure admin panel, and zero telemetry. Distributed via GHCR + Docker Hub for amd64 + arm64.
## Success Criteria
- User can `docker compose up` with zero config and get a working download UI at :8080 with cyberpunk theme and isolated session mode
- User can paste any yt-dlp-supported URL, select format/quality from live extraction, and download to /downloads with real-time progress
- Two different browsers see only their own downloads (session isolation works)
- Page refresh preserves queue state via SSE replay
- Admin can log in with username/password, manage sessions/storage/config, trigger manual purge
- Custom theme dropped into /themes volume appears in picker and applies correctly
- Mobile layout (375px) uses bottom tabs, card list, ≥44px touch targets
- Tag v0.1.0 triggers CI/CD pipeline that pushes multi-arch images to both registries
- Container makes zero automatic outbound network requests
## Key Risks / Unknowns
- **Sync-to-async bridge** — yt-dlp is synchronous; FastAPI is async. The ThreadPoolExecutor + `call_soon_threadsafe` pattern is well-documented but must be wired correctly or progress events are silently lost
- **SSE zombie connections** — CancelledError swallowing in SSE generators creates memory leaks. Must use try/finally and explicitly handle cancellation
- **CSS variable contract lock-in** — Token names are a one-way door once custom themes exist. Must be designed deliberately before components reference them
- **Admin auth over cleartext** — Can't prevent operators from skipping TLS, but can warn loudly at startup
## Proof Strategy
- Sync-to-async bridge → retire in S01 by proving yt-dlp progress events arrive in an asyncio.Queue via call_soon_threadsafe, with a test that runs a real download and asserts events were received
- SSE zombie connections → retire in S02 by proving SSE endpoint cleanup works on client disconnect (generator finally block fires, queue removed from broker)
- CSS variable contract → retire in S05 by establishing the token set before any component references it, with documentation freeze
- Admin auth security → retire in S04 by proving bcrypt comparison, timing-safe check, security headers, and TLS detection warning all function correctly
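The SSE cleanup proof described above could look roughly like the following sketch, written against plain asyncio rather than sse-starlette. `Broker`, `event_stream`, and `simulate_disconnect` are hypothetical names, and cancelling the consuming task is used here as an approximation of a client disconnect.

```python
import asyncio


class Broker:
    """Hypothetical per-session queue registry."""

    def __init__(self):
        self.queues = {}

    def subscribe(self, session_id):
        queue = asyncio.Queue()
        self.queues.setdefault(session_id, []).append(queue)
        return queue

    def unsubscribe(self, session_id, queue):
        subscribers = self.queues.get(session_id, [])
        if queue in subscribers:
            subscribers.remove(queue)
        if not subscribers:
            self.queues.pop(session_id, None)


async def event_stream(broker, session_id):
    queue = broker.subscribe(session_id)
    try:
        while True:
            yield await queue.get()
    finally:
        # Runs on normal exit, on aclose(), and on cancellation.
        # CancelledError is not caught here, so it keeps propagating.
        broker.unsubscribe(session_id, queue)


async def consume(stream):
    async for _ in stream:
        pass


async def simulate_disconnect():
    broker = Broker()
    task = asyncio.create_task(consume(event_stream(broker, "session-a")))
    await asyncio.sleep(0)  # let the generator start and subscribe
    subscribed = "session-a" in broker.queues
    task.cancel()  # approximates the client going away
    try:
        await task
    except asyncio.CancelledError:
        pass
    return subscribed, "session-a" in broker.queues


if __name__ == "__main__":
    print(asyncio.run(simulate_disconnect()))
```

The assertion a real test would make is the second value: after cancellation, the session no longer has a queue registered with the broker.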
## Verification Classes
- Contract verification: pytest for backend (API, services, models), vitest for frontend (stores, components), ruff + eslint + vue-tsc for lint/type-check
- Integration verification: real yt-dlp download producing a file, SSE events flowing from progress hook to browser EventSource, admin config write taking effect without restart
- Operational verification: Docker image builds for both architectures, health endpoint responds, startup TLS warning fires when appropriate
- UAT / human verification: visual theme check, mobile layout feel, admin panel UX flow, first-boot credential setup
## Milestone Definition of Done
This milestone is complete only when all are true:
- All six slices are complete with passing verification
- The full primary loop works end-to-end: URL → format picker → real-time progress → completed file
- Session isolation proven with two independent browsers
- Admin panel accessible only via authenticated login with bcrypt-hashed credentials
- Three built-in themes render correctly; drop-in custom theme chain works
- Mobile layout functions at 375px with correct breakpoint behavior
- Docker image builds and runs for amd64 + arm64
- CI/CD pipeline triggers correctly on PR and tag
- Zero outbound network requests from container verified
- Secure deployment example (reverse proxy + TLS) documented and functional
## Requirement Coverage
- Covers: R001-R026 (all 26 active requirements)
- Partially covers: none
- Leaves for later: R027 (presets), R028 (GitHub issue prefill), R029 (filter persistence)
- Orphan risks: none
## Slices
- [ ] **S01: Foundation + Download Engine** `risk:high` `depends:[]`
> After this: POST a URL to the API → yt-dlp downloads it to /downloads with progress events arriving in an asyncio.Queue. Format probe returns available qualities. Config loads from YAML + env vars. SQLite with WAL mode stores jobs. Proven via API tests and a real yt-dlp download.
- [ ] **S02: SSE Transport + Session System** `risk:high` `depends:[S01]`
> After this: Open two browser tabs → each gets its own SSE stream scoped to their session cookie. Live progress events flow from yt-dlp worker threads through SSEBroker to the correct session's EventSource. Refresh a tab → SSE replays current state. Health endpoint responds. Proven via real SSE connections and session isolation test.
- [ ] **S03: Frontend Core** `risk:medium` `depends:[S02]`
> After this: Full Vue 3 SPA in the browser: paste URL, pick format from live extraction, watch progress bar fill, see completed files in queue. Playlists show as collapsible parent/child rows. Mobile layout (375px) uses bottom tabs, card list, ≥44px targets. Desktop uses sidebar + table. Proven by loading the SPA and completing a download flow.
- [ ] **S04: Admin, Auth + Supporting Features** `risk:medium` `depends:[S02]`
> After this: Admin panel requires username/password login (bcrypt). Session list, storage view, manual purge, live config editor, unsupported URL log download all functional. Cookie auth upload works per-session. Session export/import produces valid archive. File link sharing serves completed downloads. Security headers present on admin routes. Startup warns if TLS not detected. Proven via auth tests + admin flow verification.
- [ ] **S05: Theme System** `risk:low` `depends:[S03]`
> After this: Cyberpunk theme renders with scanlines/grid overlay, JetBrains Mono, #00a8ff/#ff6b2b. Dark and light themes are clean alternatives. CSS variable contract documented in base.css. Drop a custom theme folder into /themes volume → restart → appears in picker → applies correctly. Built-in themes heavily commented as documentation. Proven by theme switching and custom theme load.
- [ ] **S06: Docker + CI/CD** `risk:low` `depends:[S01,S02,S03,S04,S05]`
> After this: `docker compose up` → app works at :8080 with zero config. `docker-compose.example.yml` includes Caddy/Traefik sidecar for TLS. Tag v0.1.0 → GitHub Actions builds multi-arch image → pushes to GHCR + Docker Hub → creates GitHub Release. PR triggers lint + test + Docker smoke. Zero outbound telemetry verified. Proven by running the published image and completing a full download flow.
## Boundary Map
### S01 → S02
Produces:
- `app/core/database.py` → aiosqlite connection pool with WAL mode, job CRUD operations
- `app/core/config.py` → ConfigManager: YAML + env var merge, typed config access
- `app/models/job.py` → Job Pydantic model, JobStatus enum, ProgressEvent model
- `app/models/session.py` → Session Pydantic model
- `app/services/download.py` → DownloadService: ThreadPoolExecutor, enqueue(), progress hook producing ProgressEvent into a callback
- `app/core/sse_broker.py` → SSEBroker: per-session Queue map, put_nowait(), subscribe()/unsubscribe()
Consumes:
- nothing (first slice)
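For the "YAML + env var merge" that `ConfigManager` is described as performing, here is one possible shape for the merge step. Everything specific in it is an assumption for illustration: the `MEDIARIP__` prefix, the double-underscore nesting convention, and the config keys are not taken from the project. YAML parsing is left out, so `file_config` is assumed to be the already-parsed dict.

```python
import copy


def merge_env(file_config, environ, prefix="MEDIARIP__"):
    """Overlay env vars such as MEDIARIP__SERVER__PORT=9000 onto a nested dict.

    The prefix and the "__" nesting separator are hypothetical conventions.
    """
    merged = copy.deepcopy(file_config)
    for name, raw in environ.items():
        if not name.startswith(prefix):
            continue
        path = name[len(prefix):].lower().split("__")
        node = merged
        for key in path[:-1]:
            node = node.setdefault(key, {})
        current = node.get(path[-1])
        # Coerce to the type of the value being replaced, when there is one.
        # bool is checked first because bool is a subclass of int.
        if isinstance(current, bool):
            node[path[-1]] = raw.strip().lower() in ("1", "true", "yes", "on")
        elif isinstance(current, int):
            node[path[-1]] = int(raw)
        else:
            node[path[-1]] = raw
    return merged


if __name__ == "__main__":
    base = {"server": {"port": 8080}, "session": {"mode": "isolated"}}
    print(merge_env(base, {"MEDIARIP__SERVER__PORT": "9000"}))
```

Working on a deep copy keeps the file-derived defaults intact, which may matter if admin config writes later need to distinguish file values from overrides.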
### S01 → S03
Produces:
- `app/routers/downloads.py` → POST /api/downloads, GET /api/downloads, DELETE /api/downloads/{id}
- `app/routers/formats.py` → GET /api/formats?url= (live yt-dlp extraction)
- `app/models/job.py` → Job, ProgressEvent (JSON schema for frontend TypeScript types)
### S01 → S04
Produces:
- `app/core/database.py` → job/session/config table access
- `app/core/config.py` → ConfigManager (admin writes extend this)
- `app/services/download.py` → DownloadService.cancel()
### S02 → S03
Produces:
- `app/routers/sse.py` → GET /api/events (EventSourceResponse per session)
- `app/middleware/session.py` → SessionMiddleware: auto-creates mrip_session httpOnly cookie, populates request.state.session_id
- `app/routers/health.py` → GET /api/health
- `app/routers/system.py` → GET /api/config/public (sanitized config for frontend)
- SSE event contract: init, job_update, job_removed, error event types with typed payloads
Consumes from S01:
- `app/core/sse_broker.py` → SSEBroker.subscribe(), SSEBroker.put_nowait()
- `app/core/database.py` → job queries for SSE replay
- `app/models/job.py` → Job, ProgressEvent models
- `app/models/session.py` → Session model
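To make the SSE event contract concrete, the sketch below shows what one event looks like on the wire under the Server-Sent Events text format. The `job_update` event name comes from the contract above; the payload fields (`id`, `percent`) are assumptions, and in the real app the framing would normally be handled by sse-starlette rather than by hand.

```python
import json


def format_sse(event, payload, event_id=None):
    """Serialize one Server-Sent Event frame as text."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    # json.dumps escapes newlines, so the payload always fits on one data line.
    lines.append(f"data: {json.dumps(payload, separators=(',', ':'))}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(format_sse("job_update", {"id": "j1", "percent": 42.5}), end="")
```

On the browser side, a named event like this is delivered to listeners registered with `addEventListener("job_update", ...)` on the `EventSource`, not to the default `onmessage` handler.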
### S02 → S04
Produces:
- `app/middleware/session.py` → SessionMiddleware (session identity for admin to list)
- `app/core/database.py` → session table queries
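The session identity that `SessionMiddleware` provides can be sketched as a pure function, independent of Starlette. The cookie name `mrip_session` and the httpOnly flag come from the boundary map; the `SameSite=Lax` and `Path=/` attributes and the function name `resolve_session` are assumptions for illustration.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "mrip_session"


def resolve_session(cookie_header):
    """Return (session_id, set_cookie_value).

    set_cookie_value is None when the request already carries a valid ID.
    """
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    morsel = jar.get(COOKIE_NAME)
    if morsel is not None:
        try:
            return str(uuid.UUID(morsel.value)), None
        except ValueError:
            pass  # malformed value: fall through and issue a fresh ID

    session_id = str(uuid.uuid4())
    out = SimpleCookie()
    out[COOKIE_NAME] = session_id
    out[COOKIE_NAME]["httponly"] = True
    out[COOKIE_NAME]["samesite"] = "Lax"  # assumed attribute
    out[COOKIE_NAME]["path"] = "/"  # assumed attribute
    return session_id, out[COOKIE_NAME].OutputString()


if __name__ == "__main__":
    print(resolve_session(None))
```

Because the cookie holds only an opaque ID, anything the admin panel lists about a session has to come from the session table, never from the cookie itself.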
### S03 → S05
Produces:
- Vue component structure referencing CSS custom properties (--color-bg, --color-accent-primary, etc.)
- `frontend/src/stores/theme.ts` → theme store with setTheme(), availableThemes
- Component DOM structure that themes must style correctly
Consumes from S02:
- SSE event contract (EventSource integration in Pinia sse store)
- GET /api/config/public (session mode, default theme)
- Session cookie (auto-set by middleware)
### S04 → S06
Produces:
- `app/routers/admin.py` → all admin API endpoints
- Admin auth middleware (HTTPBasic + bcrypt)
- `app/services/purge.py` → PurgeService
- Test suite for admin routes
Consumes from S02:
- Session middleware, session queries
- SSEBroker (for purge_complete event)
Consumes from S01:
- Database, ConfigManager, DownloadService
### S05 → S06
Produces:
- `frontend/src/themes/` → cyberpunk.css, dark.css, light.css (baked into build)
- `app/core/theme_loader.py` → ThemeLoader scanning /themes volume
- `app/routers/themes.py` → GET /api/themes manifest
- CSS variable contract in base.css (the stable theme API)
Consumes from S03:
- Vue component structure (components reference CSS custom properties)
- Theme store (setTheme, availableThemes)
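A minimal sketch of what `ThemeLoader` scanning the /themes volume might do is shown below. The folder layout (one subfolder per theme containing a `theme.css`) and the manifest fields `id` and `css` are assumptions; the source does not specify the manifest format returned by GET /api/themes.

```python
from pathlib import Path


def scan_themes(root, stylesheet="theme.css"):
    """Return a manifest entry for each subfolder that contains a stylesheet.

    The stylesheet file name and the manifest fields are hypothetical.
    """
    root = Path(root)
    if not root.is_dir():
        return []  # volume not mounted: no custom themes
    manifest = []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        if (folder / stylesheet).is_file():
            manifest.append({"id": folder.name, "css": f"{folder.name}/{stylesheet}"})
    return manifest


if __name__ == "__main__":
    print(scan_themes("/themes"))
```

Scanning once at startup would match the "drop folder, restart, appears in picker" flow described in the acceptance criteria.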
### All → S06
S06 consumes the complete application from S01-S05:
- All backend source under `backend/app/`
- All frontend source under `frontend/src/`
- All test suites
- All theme assets
- docker-compose.yml, Dockerfile, GitHub Actions workflows