docs: complete project research

Adds STACK, FEATURES, ARCHITECTURE, PITFALLS, and SUMMARY research files
for media.rip() v1.0 (self-hosted yt-dlp web frontend).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
xpltd 2026-03-17 21:36:25 -05:00
parent bc4f90f3fa
commit 476e4a4cb5
6 changed files with 1894 additions and 1 deletions


@ -9,4 +9,4 @@
"plan_check": true,
"verifier": true
}
}
}


@ -0,0 +1,662 @@
# Architecture Research
**Domain:** Self-hosted yt-dlp web frontend (Python/FastAPI + Vue 3)
**Researched:** 2026-03-17
**Confidence:** HIGH (core integration patterns) / MEDIUM (schema shape, theme system)
---
## Standard Architecture
### System Overview
```
┌─────────────────────────────────────────────────────────────────────┐
│                         BROWSER (Vue 3 SPA)                         │
│                                                                     │
│  ┌──────────────┐  ┌──────────────┐   ┌──────────────┐              │
│  │  DownloadQ   │  │  AdminPanel  │   │ ThemePicker  │              │
│  │  (Vue comp)  │  │  (Vue comp)  │   │  (Vue comp)  │              │
│  └──────┬───────┘  └──────┬───────┘   └──────┬───────┘              │
│         │                 │                  │                      │
│  ┌──────┴─────────────────┴──────────────────┴──────────────────┐   │
│  │                         Pinia Stores                         │   │
│  │     downloads | session | admin | theme | sse-connection     │   │
│  └──────┬───────────────────────────────────────────────────────┘   │
│         │  REST (fetch) + SSE (EventSource)                         │
└─────────┼───────────────────────────────────────────────────────────┘
          │ HTTP (behind nginx in prod)
┌─────────────────────────────────────────────────────────────────────┐
│                        FastAPI (Python 3.12)                        │
│                                                                     │
│  ┌──────────────┐  ┌──────────────┐   ┌──────────────┐              │
│  │  /api/dl     │  │  /api/admin  │   │  /api/sse    │              │
│  │ /api/session │  │ (basic auth) │   │ /api/health  │              │
│  └──────┬───────┘  └──────┬───────┘   └──────┬───────┘              │
│         │                 │                  │                      │
│  ┌──────┴─────────────────┴──────────────────┴──────────────────┐   │
│  │                        Service Layer                         │   │
│  │ DownloadService | SessionService | AdminService | SSEBroker  │   │
│  └──────┬─────────────────────────────────────────┬─────────────┘   │
│         │                                         │                 │
│  ┌──────┴──────────────┐              ┌───────────┴──────────────┐  │
│  │  ThreadPool         │              │  APScheduler             │  │
│  │  (yt-dlp workers)   │              │  (purge cron)            │  │
│  └──────┬──────────────┘              └──────────────────────────┘  │
│         │  progress_hook → asyncio.Queue → SSEBroker                │
└─────────┼───────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│                          Persistence Layer                          │
│  ┌──────────────────────┐  ┌───────────────────────────────────┐    │
│  │  SQLite (aiosqlite)  │  │ Filesystem                        │    │
│  │  jobs, sessions,     │  │ /data/downloads/ (output)         │    │
│  │  config, logs        │  │ /data/cookies/ (per-session)      │    │
│  └──────────────────────┘  │ /data/unsupported_urls.log        │    │
│                            │ /themes/ (custom)                 │    │
│                            │ config.yaml (override)            │    │
│                            └───────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────┘
```
### Component Responsibilities
| Component | Responsibility | Notes |
|-----------|----------------|-------|
| Vue SPA | All user interaction, queue visualization, SSE state sync | Built to `/app/static/` at image build time, served by FastAPI StaticFiles |
| Pinia `downloads` store | Download job state, optimistic updates, SSE-driven mutations | SSE events are the source of truth; REST is for initial hydration and commands |
| Pinia `sse-connection` store | Manages EventSource lifecycle, reconnect, missed-event replay | Separate store so reconnect logic doesn't pollute download logic |
| FastAPI routers | Route validation, auth middleware, response shaping | Thin — delegates to services |
| `DownloadService` | Orchestrates yt-dlp jobs, manages queue, dispatches progress to SSEBroker | One service, not per-request; holds job registry |
| `SSEBroker` | Per-session asyncio.Queue map; fan-out to all active SSE connections for a session | Singleton; isolates sessions by `session_id` key |
| `SessionService` | Cookie creation/validation, session CRUD, export/import packaging | Owns session identity; no auth — identity only |
| `AdminService` | Config read/write, live reload, session listing, manual purge | Protected by HTTP Basic auth middleware |
| ThreadPoolExecutor | Runs yt-dlp synchronously; progress hooks bridge back to async via `call_soon_threadsafe` | yt-dlp is synchronous and cannot be awaited directly |
| APScheduler `AsyncIOScheduler` | Purge cron job (file TTL, session TTL, log rotation) | Shares event loop with FastAPI; started in lifespan |
| SQLite (aiosqlite) | Job state, session records, config overrides, unsupported URL log | Single file at `/data/mrip.db` |
---
## Key Integration: yt-dlp Progress → SSE
This is the most architecturally significant path in the system. Getting it wrong either blocks the event loop or loses progress events.
### The Problem
yt-dlp's `download()` method is **synchronous and blocking**. It calls `progress_hook` callbacks from inside that synchronous thread. FastAPI runs on asyncio. These two worlds must be bridged without:
- Blocking the event loop (which would stall all SSE streams and API requests)
- Using ProcessPoolExecutor (yt-dlp `YoutubeDL` objects contain file handles — not picklable)
### The Solution: ThreadPoolExecutor + `call_soon_threadsafe`
```
yt-dlp thread (sync)                  asyncio event loop (async)
─────────────────────                 ───────────────────────────
run_in_executor(pool, fn)      →→→    awaited by DownloadService
progress_hook(d) fires
  loop.call_soon_threadsafe(
    queue.put_nowait, event    →→→    asyncio.Queue receives event
  )                                       ↓
                                      SSEBroker.publish(session_id, event)
                                      EventSourceResponse yields to browser
```
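**Rule:** Never call `asyncio.Queue.put()` or `put_nowait()` directly from the yt-dlp thread; `asyncio.Queue` is not thread-safe. Always go through `loop.call_soon_threadsafe(queue.put_nowait, event)`, which schedules the call on the event loop thread. (`asyncio.run_coroutine_threadsafe` is the coroutine-based equivalent; either way, the hand-off must go through the loop.)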
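
The rule above can be sketched with the standard library alone. This is a minimal illustration, not the project's `DownloadService`: `blocking_download` is a hypothetical stand-in for `YoutubeDL.download()`, and the event dicts are made up for the example.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def blocking_download(loop: asyncio.AbstractEventLoop, queue: asyncio.Queue) -> None:
    """Hypothetical stand-in for YoutubeDL.download(); runs in a worker thread."""
    for pct in (25, 50, 100):
        # Hand each event to the loop thread; never touch the queue directly here.
        loop.call_soon_threadsafe(queue.put_nowait, {"status": "downloading", "pct": pct})
    loop.call_soon_threadsafe(queue.put_nowait, {"status": "finished", "pct": 100})


async def run_job() -> list:
    loop = asyncio.get_running_loop()  # captured on the event loop thread
    queue: asyncio.Queue = asyncio.Queue()
    received = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        worker = loop.run_in_executor(pool, blocking_download, loop, queue)
        while True:
            event = await queue.get()  # the loop stays free while waiting
            received.append(event)
            if event["status"] == "finished":
                break
        await worker  # surface any exception raised in the thread
    return received
```

Callbacks scheduled with `call_soon_threadsafe` run in the order they were submitted, so the consumer sees progress events in sequence.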
### Progress Hook Payload
yt-dlp calls each function registered under the `progress_hooks` option as `progress_hook(d)`, where `d` is a dict with these fields:
```python
from typing import Literal, TypedDict


class YtDlpProgress(TypedDict, total=False):
    """Dict passed to each progress hook (illustrative name; keys may be absent)."""
    status: Literal["downloading", "finished", "error"]
    filename: str
    downloaded_bytes: int
    total_bytes: int | None            # None if unknown
    total_bytes_estimate: int | None
    speed: float | None                # bytes/sec
    eta: int | None                    # seconds
    elapsed: float
    tmpfilename: str | None
    # "fragment_index" and "fragment_count" are also present for HLS/DASH
```
Normalize this into a typed `ProgressEvent` before putting it on the queue — never send raw yt-dlp dicts to the browser.
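
One possible shape for that normalization is sketched below. It uses a plain dataclass to stay dependency-free (the project structure lists `ProgressEvent` among the Pydantic models), and the field names other than `job_id` are assumptions borrowed from the `jobs` table columns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProgressEvent:
    """Browser-facing progress event (sketch; field names are assumptions)."""
    job_id: str
    status: str
    percent: Optional[float]
    speed_bps: Optional[float]
    eta_secs: Optional[int]

    @classmethod
    def from_yt_dlp(cls, job_id: str, d: dict) -> "ProgressEvent":
        # Prefer the exact size; fall back to the estimate when it is unknown.
        total = d.get("total_bytes") or d.get("total_bytes_estimate")
        done = d.get("downloaded_bytes")
        percent = round(done / total * 100, 1) if total and done is not None else None
        return cls(
            job_id=job_id,
            status=d["status"],
            percent=percent,
            speed_bps=d.get("speed"),
            eta_secs=d.get("eta"),
        )
```

Filenames and temp paths are deliberately left out of the event, so server filesystem details never reach the browser.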
---
## Component Boundaries
### New Components Required (not pre-existing libraries)
| Component | File | Why It's Its Own Thing |
|-----------|------|------------------------|
| `SSEBroker` | `app/core/sse_broker.py` | Singleton managing per-session queues; must be referenced from both the download worker thread and the SSE endpoint. Lives outside any request lifecycle. |
| `DownloadService` | `app/services/download.py` | Long-lived, holds job registry (`job_id → job_state`), manages ThreadPoolExecutor lifecycle. Not per-request. |
| `SessionMiddleware` (custom) | `app/middleware/session.py` | Auto-creates `mrip_session` UUID cookie on first request; validates on subsequent. Lighter than Starlette's full SessionMiddleware, which signs the entire session dict into the cookie. We only want an opaque ID. |
| `ConfigManager` | `app/core/config.py` | Merges `config.yaml` overrides onto defaults; exposes live-reload API for admin. SQLite holds the mutable copy; `config.yaml` is read once at startup and is never written back. |
| `ThemeLoader` | `app/core/theme_loader.py` | Scans `/themes/` volume directory at startup and on admin request; returns manifest of available themes. Does not compile anything — themes are served as static CSS variable files. |
| `PurgeService` | `app/services/purge.py` | Encapsulates purge logic (file TTL, session TTL, log trim). Called by APScheduler cron and by admin manual-trigger endpoint. |
| `SessionExporter` | `app/services/session_export.py` | Serializes session + job history to JSON archive; validates and imports the reverse. |
### Modified / Wrapped Components
| Component | Modification |
|-----------|-------------|
| `sse-starlette` `EventSourceResponse` | Used directly; no modification needed |
| `APScheduler` `AsyncIOScheduler` | Wrapped in lifespan startup/shutdown; no subclassing |
| `aiosqlite` | Wrapped in a thin `Database` context manager for connection reuse across requests via FastAPI dependency injection |
---
## Database Schema Shape
Single SQLite file at `/data/mrip.db`. All tables use `TEXT` UUIDs as primary keys for portability in exports.
```sql
-- Sessions: cookie identity
CREATE TABLE sessions (
id TEXT PRIMARY KEY, -- UUID, matches mrip_session cookie value
created_at INTEGER NOT NULL, -- unix timestamp
last_seen INTEGER NOT NULL,
mode TEXT NOT NULL DEFAULT 'isolated',
preferences TEXT NOT NULL DEFAULT '{}' -- JSON blob (theme selection, etc.)
);
-- Jobs: one row per download task
CREATE TABLE jobs (
id TEXT PRIMARY KEY, -- UUID
session_id TEXT NOT NULL REFERENCES sessions(id),
url TEXT NOT NULL,
title TEXT,
format_id TEXT,
status TEXT NOT NULL, -- queued|downloading|finished|error|cancelled
progress_pct REAL DEFAULT 0,
speed_bps REAL,
eta_secs INTEGER,
error_msg TEXT,
output_path TEXT, -- relative to /data/downloads/
file_size INTEGER,
created_at INTEGER NOT NULL,
started_at INTEGER,
finished_at INTEGER
);
-- Config: mutable settings (admin UI writes here; config.yaml seeds it)
CREATE TABLE config (
key TEXT PRIMARY KEY,
value TEXT NOT NULL -- JSON-serialized scalar or object
);
-- Unsupported URL log (append-only)
CREATE TABLE unsupported_urls (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT,
domain TEXT NOT NULL, -- logged domain only (default)
full_url TEXT, -- NULL unless report_full_url=true
error_msg TEXT,
created_at INTEGER NOT NULL
);
```
**Indexes needed:**
- `jobs(session_id, status)` — SSE reconnect replay, queue filtering
- `jobs(finished_at)` — purge queries
- `sessions(last_seen)` — session TTL purge
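
A minimal sketch of creating those three indexes, using the standard-library `sqlite3` module rather than `aiosqlite` so it runs without dependencies. The index names are hypothetical; only the indexed columns come from the list above.

```python
import sqlite3

# Index names are illustrative; the columns match the list above.
INDEX_DDL = """
CREATE INDEX IF NOT EXISTS idx_jobs_session_status ON jobs(session_id, status);
CREATE INDEX IF NOT EXISTS idx_jobs_finished_at ON jobs(finished_at);
CREATE INDEX IF NOT EXISTS idx_sessions_last_seen ON sessions(last_seen);
"""


def create_indexes(conn: sqlite3.Connection) -> list:
    """Create the indexes (idempotent) and return their names, sorted."""
    conn.executescript(INDEX_DDL)
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND name LIKE 'idx_%' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]
```

Because every statement uses `IF NOT EXISTS`, the function is safe to call on every startup.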
---
## Recommended Project Structure
```
media-rip/
├── backend/
│   ├── app/
│   │   ├── main.py  # FastAPI app factory, lifespan, middleware
│   │   ├── core/
│   │   │   ├── config.py  # ConfigManager (yaml merge + SQLite live config)
│   │   │   ├── database.py  # aiosqlite connection pool + migration runner
│   │   │   ├── sse_broker.py  # SSEBroker singleton
│   │   │   └── theme_loader.py  # /themes/ scanner
│   │   ├── middleware/
│   │   │   └── session.py  # mrip_session cookie auto-create/validate
│   │   ├── routers/
│   │   │   ├── downloads.py  # POST /api/dl, GET /api/dl/{id}, DELETE
│   │   │   ├── sessions.py  # GET/DELETE /api/session, export/import
│   │   │   ├── sse.py  # GET /api/sse (EventSourceResponse)
│   │   │   ├── admin.py  # /api/admin/* (basic auth protected)
│   │   │   ├── health.py  # GET /api/health
│   │   │   └── themes.py  # GET /api/themes (manifest)
│   │   ├── services/
│   │   │   ├── download.py  # DownloadService (ThreadPool + job registry)
│   │   │   ├── purge.py  # PurgeService
│   │   │   └── session_export.py  # SessionExporter
│   │   └── models/
│   │       ├── job.py  # Pydantic models: JobCreate, JobStatus, ProgressEvent
│   │       ├── session.py  # SessionRecord, SessionExport
│   │       └── config.py  # ConfigSchema
│   ├── tests/
│   │   ├── test_sse_broker.py
│   │   ├── test_download_service.py
│   │   └── test_session.py
│   ├── alembic/  # DB migrations (keep even for SQLite; the schema evolves)
│   └── pyproject.toml
├── frontend/
│   ├── src/
│   │   ├── main.ts
│   │   ├── App.vue
│   │   ├── stores/
│   │   │   ├── downloads.ts  # Job state, queue ops
│   │   │   ├── session.ts  # Session identity, export/import
│   │   │   ├── sse.ts  # EventSource lifecycle + reconnect
│   │   │   ├── admin.ts  # Admin state, config editor
│   │   │   └── theme.ts  # Active theme, available themes
│   │   ├── components/
│   │   │   ├── DownloadQueue/
│   │   │   ├── FormatPicker/
│   │   │   ├── ProgressBar/
│   │   │   ├── PlaylistRow/
│   │   │   └── AdminPanel/
│   │   ├── composables/
│   │   │   └── useSSE.ts  # Thin wrapper over sse store
│   │   └── themes/  # Built-in theme CSS variable files (embedded in build)
│   │       ├── cyberpunk.css
│   │       ├── dark.css
│   │       └── light.css
│   ├── public/
│   └── vite.config.ts
├── themes/  # Volume-mounted custom themes (operator drop-in)
│   └── .gitkeep
├── data/  # Volume-mounted runtime data
│   └── .gitkeep
├── Dockerfile
├── docker-compose.yml  # For local dev and reference deploy
└── config.yaml.example
```
### Structure Rationale
- **`backend/app/core/`:** Things that live for the full application lifetime (broker, config, DB pool) vs. `services/` which own business logic and can be unit-tested in isolation.
- **`backend/app/middleware/`:** Session cookie logic in middleware means every request gets `request.state.session_id` populated before it hits any router. No per-route cookie reading.
- **`frontend/src/stores/sse.ts`:** SSE lifecycle is isolated from business stores. Downloads store subscribes to SSE store events. This means reconnect logic doesn't leak into job state logic.
- **`themes/` at repo root:** Separate from `frontend/src/themes/` — built-in themes are compiled into the frontend bundle; operator themes are volume-mounted and served dynamically at runtime.
---
## Data Flow: Key Paths
### Path 1: URL → Download → SSE Progress → Completion
```
1. User pastes URL
   Browser: URL field onChange → format-probe fetch (GET /api/dl/probe?url=...)
   Backend: yt-dlp.extract_info(url, download=False) in ThreadPool → returns formats
   Browser: FormatPicker shows options

2. User selects format, clicks Download
   Browser: POST /api/dl {url, format_id, session_id (from cookie)}
   Backend: DownloadService.enqueue(job) → creates DB row (status=queued)
            returns {job_id}

3. SSE stream delivers state
   Browser: EventSource on /api/sse (session_id from cookie)
            SSEBroker has a queue keyed by session_id
   Backend: GET /api/sse → EventSourceResponse(async_generator)
            generator: while True: event = await queue.get(); yield event

4. Download worker executes
   Backend: ThreadPoolExecutor.submit(run_download, job_id, url, format_id, opts)
   Inside thread:
     YoutubeDL(opts).download([url])
     progress_hook fires with {status, downloaded_bytes, ...}
       → loop.call_soon_threadsafe(
           sse_broker.put_nowait,
           session_id,
           ProgressEvent(job_id, ...)
         )
     On finish: DB update (status=finished, output_path=...)
       → call_soon_threadsafe sends "finished" event

5. Browser receives progress events
   SSE store receives raw event → dispatches to downloads store
   downloads store: jobs[job_id].progress = event.pct

6. SSE reconnect (browser drop/refresh)
   Browser: EventSource auto-reconnects (built-in)
   Backend: GET /api/sse → queries DB for all active/recent jobs for this session
            Replays current state as synthetic SSE events before entering live queue
```
### Path 2: Admin Config Change (live reload)
```
Admin UI → POST /api/admin/config {key, value}
  → AdminService.set(key, value) → writes to config table in SQLite
  → ConfigManager.invalidate_cache()
  → next request picks up new value
(No restart required: config is read from DB on each use, not at startup)
```
### Path 3: Drop-in Theme Load
```
Operator: docker volume mount ./my-theme/ → /themes/my-theme/
          /themes/my-theme/theme.css   (CSS custom properties)
          /themes/my-theme/meta.json   {name, author, preview_color}

Backend startup: ThemeLoader.scan() → reads /themes/*/meta.json
  GET /api/themes → returns [{id, name, author, preview_color, is_builtin}]
  GET /themes/{id}/theme.css → FileResponse (volume-served, not compiled)

Browser: ThemePicker calls /api/themes, shows list
  User selects custom theme → <link rel="stylesheet"> swapped to /themes/id/theme.css
  (Built-in themes are already in the bundle as CSS files)
```
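
The scan step in Path 3 could look roughly like the sketch below. `scan_themes` is a hypothetical free function standing in for `ThemeLoader.scan()`; the directory layout and manifest keys are taken from the path above, and the skip-on-error behavior is an assumption.

```python
import json
from pathlib import Path


def scan_themes(themes_dir: Path) -> list:
    """Build the custom-theme manifest from <themes_dir>/<id>/meta.json files."""
    manifest = []
    for meta_path in sorted(themes_dir.glob("*/meta.json")):
        theme_dir = meta_path.parent
        if not (theme_dir / "theme.css").is_file():
            continue  # incomplete theme directory: skip it
        try:
            meta = json.loads(meta_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or malformed metadata: skip it
        if not isinstance(meta, dict):
            continue
        manifest.append({
            "id": theme_dir.name,
            "name": meta.get("name", theme_dir.name),
            "author": meta.get("author"),
            "preview_color": meta.get("preview_color"),
            "is_builtin": False,
        })
    return manifest
```

Skipping broken theme directories instead of raising keeps one bad drop-in from hiding every other theme.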
### Path 4: Session Export/Import
```
Export:
  GET /api/session/export
    → SessionExporter.export(session_id)
    → queries: session row + all jobs for session
    → zips: export.json + any cookies.txt for this session
    → returns StreamingResponse (zip file download)

Import:
  POST /api/session/import (multipart, zip file)
    → unzip, validate schema version
    → create new session (new UUID, import grants new identity)
    → insert jobs (status "finished" only; don't replay active downloads)
    → return new session cookie (Set-Cookie: mrip_session=new_uuid)
```
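
The archive round trip in Path 4 can be sketched with `zipfile` and `json`. The `schema_version` key and its value are assumptions (the source only says a schema version is validated), and cookie files are omitted for brevity.

```python
import io
import json
import zipfile

SCHEMA_VERSION = 1  # hypothetical; the source does not specify a version scheme


def export_session(session: dict, jobs: list) -> bytes:
    """Pack the session row and its jobs into an in-memory zip archive."""
    payload = {"schema_version": SCHEMA_VERSION, "session": session, "jobs": jobs}
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("export.json", json.dumps(payload, indent=2))
    return buf.getvalue()


def import_session(archive: bytes) -> list:
    """Validate the archive and return only the jobs that are safe to insert."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        payload = json.loads(zf.read("export.json"))
    if payload.get("schema_version") != SCHEMA_VERSION:
        raise ValueError("unsupported export schema version")
    # Only finished jobs are imported; active downloads are not replayed.
    return [job for job in payload["jobs"] if job.get("status") == "finished"]
```

The caller would then create a fresh session UUID and insert the returned jobs under it, as the import path describes.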
---
## Architectural Patterns
### Pattern 1: Sync-to-Async Bridge via `call_soon_threadsafe`
**What:** yt-dlp progress hooks fire synchronously inside a thread. The running event loop must be captured at app startup and used to safely enqueue events without blocking the thread or corrupting the loop.
**When to use:** Any time synchronous library code in a worker thread needs to communicate back to the asyncio world.
**Trade-offs:** Simple and correct. The only alternative (running yt-dlp in a subprocess and parsing stdout) is fragile and loses structured error info.
**Key snippet shape:**
```python
# In the lifespan startup hook: capture the running loop once
loop = asyncio.get_running_loop()

# In the progress hook (called from the sync worker thread);
# job_id and session_id are bound by the enclosing per-job closure
def progress_hook(d: dict) -> None:
    event = ProgressEvent.from_yt_dlp(job_id, d)
    loop.call_soon_threadsafe(sse_broker.put_nowait, session_id, event)
```
### Pattern 2: Per-Session SSE Queue Fan-Out
**What:** One `asyncio.Queue` per connected SSE client (not per session). Multiple browser tabs from the same session each get their own queue. SSEBroker maintains `session_id → List[Queue]` and fans out to all queues on `publish()`.
**When to use:** Always. A single global queue would leak events across sessions — a privacy violation that defeats session isolation.
**Trade-offs:** Queue cleanup requires detecting client disconnect. `sse-starlette`'s `EventSourceResponse` handles this — the generator raises `asyncio.CancelledError` or `GeneratorExit` when the client disconnects, allowing cleanup in a `finally` block.
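
A minimal sketch of the fan-out broker described above. The `subscribe` and `unsubscribe` method names are assumptions; only `publish()` and the `session_id → List[Queue]` mapping come from the text.

```python
import asyncio
from collections import defaultdict


class SSEBroker:
    """Maps each session_id to one queue per live SSE connection (sketch)."""

    def __init__(self) -> None:
        self._queues = defaultdict(list)  # session_id -> list of asyncio.Queue

    def subscribe(self, session_id: str) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._queues[session_id].append(queue)
        return queue

    def unsubscribe(self, session_id: str, queue: asyncio.Queue) -> None:
        queues = self._queues.get(session_id, [])
        if queue in queues:
            queues.remove(queue)
        if not queues:
            self._queues.pop(session_id, None)  # drop empty sessions

    def publish(self, session_id: str, event: dict) -> None:
        # Must run on the event loop thread (e.g. via call_soon_threadsafe).
        for queue in list(self._queues.get(session_id, [])):
            queue.put_nowait(event)
```

An SSE endpoint would call `subscribe()` before streaming and `unsubscribe()` in its `finally` block, so a disconnected tab does not leave an orphaned queue behind.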
### Pattern 3: SSE Replay on Reconnect
**What:** When a client reconnects to `/api/sse`, the endpoint first emits synthetic events for all current job states from the DB before entering the live queue. This ensures the UI is fully hydrated on reconnect without requiring a separate REST fetch.
**When to use:** Any SSE endpoint where the client might have missed events during a disconnect.
**Trade-offs:** Slightly more complex endpoint logic, but eliminates an entire class of "spinner forever after refresh" bugs.
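
The replay-then-stream shape can be sketched as an async generator. The event envelope (`event`, `data`, `replayed`) is hypothetical; in the real endpoint `snapshot` would be rows read from the `jobs` table for the session.

```python
import asyncio
from typing import AsyncIterator, Iterable


async def sse_events(snapshot: Iterable[dict], live: asyncio.Queue) -> AsyncIterator[dict]:
    """Yield one synthetic event per known job, then stream live events."""
    for job in snapshot:
        yield {"event": "job_state", "data": job, "replayed": True}
    while True:
        event = await live.get()
        yield {"event": "job_state", "data": event, "replayed": False}
```

One design point worth noting: subscribing to the live queue before reading the snapshot avoids a gap in which an event is published after the DB read but before the subscription exists. The cost is that the client may see a state twice, which is harmless if events are idempotent.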
### Pattern 4: Config Hierarchy (Defaults → YAML → SQLite)
**What:** Settings have three layers. Built-in defaults are hardcoded in Python. `config.yaml` overrides them at startup (read-only after that). Admin UI writes to the `config` SQLite table, which is the live source of truth at runtime.
**When to use:** Operator-facing applications that need both infra-as-code (YAML) and live UI config without restart.
**Trade-offs:** Two sources of truth during initial startup (YAML seeds SQLite on first boot, then SQLite wins). Must document precedence clearly. YAML never reflects what admin UI has changed.
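
The precedence can be expressed in a few lines with `collections.ChainMap`. The two keys shown are taken from elsewhere in this document (`max_concurrent_downloads`, `report_full_url`); the function name is hypothetical.

```python
from collections import ChainMap

# Built-in defaults (values as stated elsewhere in this document).
DEFAULTS = {"max_concurrent_downloads": 3, "report_full_url": False}


def effective_config(yaml_overrides: dict, db_overrides: dict) -> dict:
    """Later layers win: SQLite (admin UI) > config.yaml > built-in defaults."""
    return dict(ChainMap(db_overrides, yaml_overrides, DEFAULTS))
```

For example, `effective_config({"max_concurrent_downloads": 5}, {"max_concurrent_downloads": 1})` resolves to 1, because the admin-written value shadows the YAML value.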
---
## Anti-Patterns
### Anti-Pattern 1: Running yt-dlp directly in an async def route
**What people do:** `await asyncio.to_thread(ydl.download, [url])` inside a route handler.
**Why it's wrong:** `asyncio.to_thread` uses the default executor, which shares a pool with all other blocking calls. More critically, the progress hook fires from inside that thread and has no safe way to reach the SSE queue without a stored event loop reference. This pattern leads to either lost events or `RuntimeError: no running event loop`.
**Do this instead:** Use `DownloadService` (a singleton with its own dedicated `ThreadPoolExecutor`), capture the loop with `asyncio.get_running_loop()` during app startup (for example in the lifespan handler), and use `call_soon_threadsafe` in the hook.
### Anti-Pattern 2: Storing session content in the cookie
**What people do:** Use Starlette's `SessionMiddleware` which signs the entire session dict into the cookie.
**Why it's wrong:** Session content (job IDs, preferences) grows unboundedly. Signed cookies can be decoded (just not tampered with). Violates the principle that the browser should hold only an opaque identity token.
**Do this instead:** Store only a UUID in the `mrip_session` cookie. All session state lives in SQLite keyed by that UUID.
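
The cookie-handling decision can be isolated in a small pure function, sketched below. `resolve_session_id` is a hypothetical helper that the session middleware might call; treating any non-canonical value as a new session is an assumption.

```python
import uuid
from typing import Optional, Tuple


def resolve_session_id(cookie_value: Optional[str]) -> Tuple[str, bool]:
    """Return (session_id, is_new). Anything but a canonical UUID4 is replaced."""
    if cookie_value:
        try:
            parsed = uuid.UUID(cookie_value, version=4)
            # version=4 rewrites the version bits, so a non-v4 input no longer
            # round-trips to the same string and is rejected here.
            if str(parsed) == cookie_value.lower():
                return str(parsed), False
        except ValueError:
            pass  # not a UUID at all
    return str(uuid.uuid4()), True
```

When `is_new` is true, the middleware would set the `mrip_session` cookie on the response and create the matching `sessions` row.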
### Anti-Pattern 3: Single global SSE queue for all sessions
**What people do:** One `asyncio.Queue` app-wide; all SSE consumers read from it.
**Why it's wrong:** Every client sees every other client's download events. Violates session isolation (the core privacy promise). Also creates thundering-herd wake-ups for unrelated events.
**Do this instead:** `SSEBroker` maps `session_id → List[asyncio.Queue]`, one queue per live connection.
### Anti-Pattern 4: Polling the DB for progress updates from SSE endpoint
**What people do:** SSE endpoint loops with `await asyncio.sleep(0.5)` and queries the DB for job state changes.
**Why it's wrong:** Generates constant DB load proportional to active connections × poll frequency. Introduces 0-500ms latency on progress events. Doesn't scale.
**Do this instead:** DownloadService pushes events directly into the SSE queues via `call_soon_threadsafe`. DB is only written for persistence — SSE reads from the queue.
### Anti-Pattern 5: Volume-mounting themes into the frontend build directory
**What people do:** Mount custom themes into `/app/static/themes/` and expect Vue to pick them up.
**Why it's wrong:** The built-in themes are baked into the static bundle at image build time. A volume mount on the same directory would shadow built-in themes and create confusion.
**Do this instead:** Built-in themes live at `/app/static/builtin-themes/` (baked in). Custom themes live at `/themes/` (volume-mounted). Frontend fetches the manifest from `/api/themes` to know what's available. `GET /themes/{id}/theme.css` is served by FastAPI's `StaticFiles` mount on the volume directory.
---
## Docker Layering Strategy
### Multi-Stage Build: 3 Stages
```dockerfile
# Stage 1: Frontend builder (Node)
FROM node:22-alpine AS frontend-builder
WORKDIR /frontend
COPY frontend/package*.json ./
RUN npm ci
COPY frontend/ .
RUN npm run build
# Output: /frontend/dist/
# Stage 2: Python dependency builder
FROM python:3.12-slim AS python-builder
WORKDIR /build
RUN pip install uv
COPY backend/pyproject.toml backend/uv.lock ./
RUN uv pip install --system --no-cache -r pyproject.toml
# Installs: fastapi, uvicorn, yt-dlp, sse-starlette, aiosqlite, apscheduler, pyyaml, etc.
# Stage 3: Final runtime image
FROM python:3.12-slim AS runtime
# Install ffmpeg (required by yt-dlp for muxing)
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg && rm -rf /var/lib/apt/lists/*
# Copy Python packages from builder
COPY --from=python-builder /usr/local/lib/python3.12 /usr/local/lib/python3.12
COPY --from=python-builder /usr/local/bin /usr/local/bin
# Copy backend source
COPY backend/app /app/app
# Copy built frontend assets into location FastAPI StaticFiles will serve
COPY --from=frontend-builder /frontend/dist /app/static
# Runtime config
WORKDIR /app
ENV MRIP_DATA_DIR=/data
VOLUME ["/data", "/themes"]
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
### Layer Cache Optimization
The stage order matters for cache hit rates during development:
1. **Frontend builder first:** Node dependencies are the most stable. `package-lock.json` changes rarely. `npm ci` layer is cache-friendly.
2. **Python deps before source:** `pyproject.toml` changes less often than `app/` code. Source copy is always last within each stage.
3. **ffmpeg in a single RUN:** Combine `apt-get update`, install, and `rm -rf /var/lib/apt/lists/*` in one layer to avoid caching a stale package index.
### Multi-Platform Build (amd64 + arm64)
```bash
# CI pipeline (GitHub Actions)
docker buildx build \
--platform linux/amd64,linux/arm64 \
--tag ghcr.io/xpltd/media-rip:$VERSION \
--push \
.
```
**Arm64 consideration:** `ffmpeg` from Debian apt supports arm64 natively, so no cross-compile is needed. yt-dlp is pure Python, so there is no binary concern. The only risk is a Python package with C extensions or native dependencies (e.g., `aiosqlite` → `sqlite3` → system SQLite library). `python:3.12-slim` includes `libsqlite3` for both platforms.
**QEMU vs. native:** GitHub Actions standard runners are amd64. QEMU emulation for arm64 is slow but correct for this stack (no complex native compilation). If build times become painful, use ARM runners (e.g., Blacksmith or self-hosted).
### FastAPI Serving Static Files (no nginx needed in single container)
FastAPI's `StaticFiles` mount is sufficient for this use case (single-instance self-hosted tool, not a CDN-scale app):
```python
import os

from fastapi.responses import FileResponse
from fastapi.staticfiles import StaticFiles

# Built frontend assets
app.mount("/assets", StaticFiles(directory="/app/static/assets"), name="assets")
# Volume-mounted custom themes
app.mount("/themes", StaticFiles(directory=os.environ.get("MRIP_THEMES_DIR", "/themes")), name="themes")

# SPA fallback: any unmatched path returns index.html.
# Register this after the /api routers so it does not shadow them.
@app.get("/{full_path:path}")
async def spa_fallback(full_path: str):
    return FileResponse("/app/static/index.html")
```
If an operator wants to put nginx in front (for TLS termination, caching), the container works unchanged behind a reverse proxy.
---
## Build Order (Dependency-Respecting)
Build phases in this order to avoid blocking work:
```
Phase 1: Foundation (no dependencies)
  ├── Database schema + migrations (aiosqlite, alembic init)
  ├── ConfigManager (pure Python, no DB dependency)
  ├── SessionMiddleware (cookie only; no DB needed to write it)
  └── SSEBroker (pure asyncio.Queue; no yt-dlp, no DB)

Phase 2: Core Services (depends on Phase 1)
  ├── DownloadService skeleton (ThreadPool, queue intake, DB writes)
  │   └── yt-dlp integration + progress hook bridge to SSEBroker
  ├── SSE endpoint (depends on SSEBroker from Phase 1)
  │   └── With reconnect/replay from DB
  └── Session CRUD endpoints (depends on DB + SessionMiddleware)

Phase 3: Frontend Core (can start after Phase 2 API shape is stable)
  ├── Pinia sse store + EventSource lifecycle
  ├── Pinia downloads store (consumes SSE events)
  ├── DownloadQueue component (URL input → probe → format picker → enqueue)
  └── ProgressBar (driven by downloads store)

Phase 4: Admin + Auth (depends on Phase 2)
  ├── AdminService (config read/write)
  ├── Basic auth middleware on /api/admin/*
  ├── Admin router (sessions, storage, purge trigger, config editor)
  └── Admin UI (Vue components)

Phase 5: Supporting Features (depends on Phases 2-4)
  ├── Theme system (ThemeLoader + /api/themes + volume serving)
  ├── PurgeService + APScheduler integration
  ├── Session export/import
  ├── cookies.txt upload (per-session)
  └── Unsupported URL logging + admin download

Phase 6: Distribution
  ├── Dockerfile (multi-stage)
  ├── docker-compose.yml
  ├── GitHub Actions CI (lint, type-check, test, Docker smoke)
  └── GitHub Actions CD (tag → build + push + release)
```
**Critical path:** Phase 1 → Phase 2 (SSEBroker + yt-dlp bridge) → Phase 3 (SSE consumer). The SSE transport must exist before meaningful frontend progress work can be validated end-to-end.
---
## Integration Points
### External Dependencies
| Dependency | Integration Pattern | Critical Notes |
|------------|---------------------|----------------|
| yt-dlp | `import yt_dlp` as library, not subprocess | `YoutubeDL` instance created fresh per job inside worker thread. Not shared. Not passed across process boundary. |
| ffmpeg | Installed in Docker image; yt-dlp finds it via `PATH` | Required for muxing video+audio streams. Not directly called by app code. |
| `sse-starlette` (v3.3.3) | `EventSourceResponse(async_generator)` | Handles ping/heartbeat, client disconnect detection. No subclassing needed. |
| `APScheduler` `AsyncIOScheduler` | Started in FastAPI `lifespan` context manager | Use `AsyncIOScheduler` (not `BackgroundScheduler`) to share the event loop. One instance globally. |
| `aiosqlite` | Thin wrapper for connection reuse via FastAPI `Depends` | One connection pool, not per-request connections. WAL mode for concurrent reads. |
### Internal Boundaries
| Boundary | Communication | Notes |
|----------|---------------|-------|
| Worker Thread ↔ SSEBroker | `loop.call_soon_threadsafe(broker.put_nowait, ...)` | Only safe async bridge from sync thread |
| SSEBroker ↔ SSE endpoint | `await queue.get()` in async generator | SSEBroker holds the queue; endpoint holds a reference |
| DownloadService ↔ DB | Direct `aiosqlite` calls | Service owns all job table writes |
| Middleware ↔ Routers | `request.state.session_id` | Middleware populates state; routers read it |
| ConfigManager ↔ All Services | Singleton read via dependency injection | No global variable — injected via `Depends(get_config)` |
| ThemeLoader ↔ Volume | Filesystem scan at startup + on-demand re-scan | No file watchers — re-scan is triggered by API call |
---
## Scaling Considerations
This is a single-instance self-hosted tool. The relevant scaling axis is concurrent downloads per instance, not users.
| Concern | Practical Limit | Mitigation |
|---------|-----------------|------------|
| Concurrent downloads | `ThreadPoolExecutor` `max_workers` (minimum 1, configurable) | Expose `max_concurrent_downloads` in config and size the pool from it. Default 3 is safe for home use. |
| SQLite write contention | WAL mode handles concurrent reads + single writer fine | Enable `PRAGMA journal_mode=WAL` at DB init. No further action needed for this use case. |
| SSE connection count | asyncio handles hundreds of idle connections trivially | Not a practical concern for self-hosted tool |
| Disk space | operator concern | PurgeService + health endpoint disk-free flag address this |
| yt-dlp blocking | Handled by ThreadPool | GIL is released during I/O-heavy yt-dlp work; threads are effective here |
The architecture should not block a future "external API" milestone. The service layer is already the right boundary: a future v2 API consumer calls `DownloadService.enqueue()` just like the REST endpoint does — no architectural change required.
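
The WAL mitigation from the table above amounts to a single pragma at DB init. The sketch below uses the standard-library `sqlite3` module rather than `aiosqlite`, and `open_db` is a hypothetical helper name.

```python
import sqlite3


def open_db(path: str) -> sqlite3.Connection:
    """Open the database and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL (journal_mode={mode})")
    return conn
```

WAL mode is stored in the database file, so it only needs to be set once, but re-issuing the pragma on every startup is cheap and self-healing.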
---
## Sources
- yt-dlp asyncio + ProcessPoolExecutor issue: https://github.com/yt-dlp/yt-dlp/issues/9487
- sse-starlette PyPI (v3.3.3, 2026-03-17): https://pypi.org/project/sse-starlette/
- FastAPI SSE official docs: https://fastapi.tiangolo.com/tutorial/server-sent-events/
- FastAPI async/threading patterns: https://fastapi.tiangolo.com/async/
- Docker multi-platform builds: https://docs.docker.com/build/building/multi-platform/
- Multi-arch GitHub Actions: https://www.blacksmith.sh/blog/building-multi-platform-docker-images-for-arm64-in-github-actions
- FastAPI + aiosqlite pattern: https://sqlspec.dev/examples/frameworks/fastapi/aiosqlite_app.html
- APScheduler + FastAPI lifespan: https://rajansahu713.medium.com/implementing-background-job-scheduling-in-fastapi-with-apscheduler-6f5fdabf3186
- FastAPI ThreadPool vs run_in_executor: https://sentry.io/answers/fastapi-difference-between-run-in-executor-and-run-in-threadpool/
---
*Architecture research for: media.rip() v1.0 — Python/FastAPI + Vue 3 + yt-dlp + SSE + SQLite + Docker*
*Researched: 2026-03-17*


@ -0,0 +1,273 @@
# Feature Research
**Domain:** yt-dlp web frontend / self-hosted media downloader
**Researched:** 2026-03-17
**Confidence:** HIGH (core features), MEDIUM (UX patterns), HIGH (competitor gaps)
## Feature Landscape
### Table Stakes (Users Expect These)
Features users assume exist. Missing these = product feels incomplete.
| Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------|
| URL paste + download | The core primitive — every tool has this | LOW | Must support all yt-dlp-supported sites, not just YouTube |
| Real-time download progress | Users need feedback; "Processing..." with no indicator is dead UX | MEDIUM | MeTube uses WebSocket; we use SSE — both solve this. SSE is simpler and HTTP-native with auto-reconnect |
| Queue view (active + completed) | Users submit multiple URLs; need to track all of them | LOW | MeTube separates active/done lists; unified queue with status is cleaner |
| Format/quality selection | Power users always want control over resolution, codec, ext | MEDIUM | Must show resolution, codec, ext, filesize estimate. yt-dlp returns all fields: height, vcodec, acodec, ext, filesize, fps |
| Playlist support | Playlists are a primary use case for self-hosters | HIGH | Parent + child job model. MeTube treats playlists as flat — collapsible parent/child is a step up |
| Cancel / remove a download | Users make mistakes | LOW | DELETE /api/downloads/{id}; must handle mid-stream cancellation gracefully |
| Persistent queue across refresh | Losing the queue on page refresh is unacceptable | MEDIUM | Requires SSE `init` event replaying state on connect. MeTube uses state file; our SQLite-backed SSE replay is equivalent |
| Mobile-accessible UI | >50% of self-hoster interactions happen on phone or tablet | HIGH | No existing yt-dlp web UI does mobile well. All competitors are desktop-first. 44px touch targets, bottom nav required |
| Docker distribution | The self-hosted audience expects Docker | LOW | Single image, both registries, amd64 + arm64 |
| Health endpoint | Ops audiences rely on this for monitoring integrations (Uptime Kuma, etc.) | LOW | `GET /api/health` with version, uptime, disk space, queue depth |
### Differentiators (Competitive Advantage)
Features that set the product apart. Not required, but valued.
| Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------|
| Session isolation (isolated / shared / open modes) | MeTube Issue #591 closed as "won't fix" — maintainer dismisses multi-user isolation as bloat; community forked it to add this | HIGH | Cookie-based httpOnly UUID4; operator chooses mode; addresses the exact pain point that created demand for forks |
| Cookie auth (cookies.txt upload per-session) | Enables paywalled/private content without embedding credentials in the app; yt-dlp Netscape format is well-documented | MEDIUM | Files must be scoped per-session, purged on session clear. Security note: cookie files are sensitive — never log, never expose via API, delete on purge |
| Drop-in custom themes via volume mount | No competitor offers this. MeTube has light/dark/auto only via env var. yt-dlp-web-ui has no theming | HIGH | CSS variable contract required first. Theme directory: theme.css + metadata.json + optional preview.png. Hot-loaded at startup |
| Heavily commented built-in themes as documentation | Lowers floor for customization to near-zero — anyone with a text editor or AI can retheme | LOW | No runtime cost. Every CSS token documented inline. Built-in themes serve as learning examples |
| Admin UI with username/password login (not raw token) | yt-dlp-web-ui uses JWT tokens in headers/query params — not user-friendly. MeTube has no admin UI at all. qBittorrent/Sonarr-style login is the expected self-hosted pattern | MEDIUM | First-boot credential setup with forced change prompt. Config-via-UI means no docker restarts for settings changes |
| Session export/import | No competitor offers portable session state. Enables identity continuity on persistent instances without a real account system | MEDIUM | JSON export of download history + queue state + preferences. Import restores history. Does not require sign-in, stays anonymous-first |
| Unsupported URL reporting with audit log | No competitor surfaces extraction errors with actionable reporting. MeTube just shows "error" | LOW | User-triggered only. Logs domain by default. Admin downloads log. Optional GitHub issue prefill |
| Source-aware output templates | Sensible per-site defaults (YouTube: uploader/title, SoundCloud: uploader/title, generic: title). MeTube uses one global template | LOW | Config-driven. Per-download override also supported |
| Link sharing (completed file URL) | Users want to share a ripped file with a friend — a direct download URL removes the "now what?" question | LOW | Serve completed files under predictable path. Requires knowing the output filename |
| Zero automatic outbound telemetry | Competing tools have subtle CDN calls, Google Fonts, or update checks. Trust is the core proposition | LOW | No external requests from container. All fonts/assets bundled or self-hosted |
| Cyberpunk default theme | Visual identity differentiator. Every other tool ships with plain material/tailwind defaults | MEDIUM | #00a8ff/#ff6b2b, JetBrains Mono, scanlines, grid overlay. Makes first impressions memorable |
### Anti-Features (Commonly Requested, Often Problematic)
Features that seem good but create problems.
| Feature | Why Requested | Why Problematic | Alternative |
|---------|---------------|-----------------|-------------|
| OAuth / SSO integration | Multi-user deployments want centralized auth | Massive scope increase; introduces external runtime dependency; anonymous-first identity model conflicts with account-based auth | Reverse proxy handles AuthN (Authentik, Authelia, Traefik ForwardAuth); media.rip handles AuthZ via session mode + admin token |
| Real-time everything via WebSocket | Seems more capable than SSE | WebSockets require persistent bidirectional connections, more complex infra, harder to load-balance; SSE covers 100% of the UI's actual needs (server-push only) | SSE — simpler, HTTP-native, auto-reconnecting via browser EventSource |
| User accounts / registration | Makes multi-user feel "proper" | Adds password hashing, email, account management, password reset flow — massive scope for a download tool; users expect anonymous operation | Session isolation mode: each browser gets its own cookie-scoped queue without any account |
| Automatic yt-dlp update on startup | Ensures latest extractor support | Breaks immutable containers and reproducible builds; version drift between deployments; network dependency at boot time | Pin yt-dlp version in requirements.txt; publish new image on yt-dlp releases via CI |
| Embedded video player | Looks impressive in demos | Adds significant frontend complexity, licensing surface for codecs, and scope creep for a downloader tool; most files need to go to Jellyfin/Plex anyway | Serve files at predictable paths; let users open in their preferred player |
| Telegram / Discord bot integration | Power users want remote submission | Separate runtime concern; adds credentials management, API rate limits, message parsing complexity; not what v1 needs to prove | Documented as v2+ extension point; clean API surface makes it straightforward to add later |
| Subscription / channel monitoring | "Set it and forget it" appeal | Fundamentally different product — a scheduler/archiver vs a download UI; scope would double; tools like Pinchflat, TubeArchivist do this better | Out of scope — architecture should not block adding it; APScheduler is already present for purge |
| Per-format download presets | Advanced users want "my 720p MP3 preset" saved | Medium complexity, but defers well to v1.x — v1 needs live format selection working first before persisting preferences | Implement after session system is stable; presets can be stored per-session in config |
| FlareSolverr / Cloudflare bypass | Some sites block yt-dlp | Introduces external service dependency, legal gray area, maintenance surface; YTPTube does this but it's an edge case | cookies.txt upload solves the authenticated content problem for most users; FlareSolverr is too niche for v1 |
## Feature Dependencies
```
[SQLite Job Store]
    └──required-by──> [Download Queue View]
    └──required-by──> [Real-Time SSE Progress]
    └──required-by──> [Playlist Parent/Child Jobs]

[Session System (cookie-based)]
    └──required-by──> [Session Isolation Mode]
    └──required-by──> [Cookie Auth (cookies.txt per-session)]
    └──required-by──> [Session Export/Import]
    └──required-by──> [SSE per-session stream]

[SSE Bus (per-session)]
    └──required-by──> [Real-Time Progress Updates]
    └──required-by──> [Init replay on reconnect]
    └──required-by──> [purge_complete event]

[yt-dlp Integration (library mode)]
    └──required-by──> [Format/Quality Selection (GET /api/formats)]
    └──required-by──> [Download execution]
    └──required-by──> [Playlist resolution → child jobs]
    └──required-by──> [Error detection → unsupported URL reporting]

[Admin Auth (username/password)]
    └──required-by──> [Admin Panel UI]
    └──required-by──> [Purge API endpoint]
    └──required-by──> [Session list / storage endpoints]
    └──required-by──> [Unsupported URL log download]

[CSS Variable Contract (base.css)]
    └──required-by──> [Built-in themes (cyberpunk, dark, light)]
    └──required-by──> [Drop-in custom themes]
    └──required-by──> [Theme picker UI]

[Theme Picker UI]
    └──enhances──> [Drop-in custom themes]

[Completed Download File Serving]
    └──required-by──> [Link sharing (shareable download URL)]

[Purge Scheduler (APScheduler)]
    └──enhances──> [Session TTL expiry]
    └──enhances──> [File and log TTL purge]

[Format/Quality Selection]
    └──enhances──> [Per-download output template override]

[Session Export]
    └──requires──> [Session System]
    └──conflicts-with~~> [open mode] (no session = nothing to export)
```
### Dependency Notes
- **Session system required before session export/import:** No session state to serialize without it. Export is meaningless in `open` mode.
- **SSE bus must exist before progress updates:** Progress hooks from yt-dlp thread pool need a dispatcher to push events to the correct session's queue.
- **yt-dlp integration required before format selection:** `GET /api/formats?url=` calls `YoutubeDL.extract_info(url, download=False, process=False)`, so the format list is live-extracted, not pre-cached.
- **CSS variable contract required before any theming:** All three built-in themes and the drop-in theme system depend on the base.css token contract being stable. Changing token names later breaks all custom themes operators have written.
- **Job store required before queue view:** The frontend queue is a projection of SQLite state replayed via SSE `init` events — the DB is the source of truth, not frontend memory.
- **Admin auth required before admin panel:** Admin routes must be protected before the panel is built, otherwise the panel ships with no auth and operators have no safe path to production.
- **File serving endpoint required before link sharing:** Shareable URLs point to a served file path. This is a FastAPI `StaticFiles` or explicit route serving `/downloads`.
## MVP Definition
### Launch With (v1.0)
Minimum viable product — the full target feature set per PROJECT.md.
- [x] URL submission + auto-detection triggers format scraping — core primitive
- [x] Format/quality selector (populated live from yt-dlp info extraction) — power users won't use a tool that hides quality choice
- [x] Real-time progress via SSE (queued → extracting → downloading → completed/failed) — no progress = no trust
- [x] Download queue: filter, sort, cancel, playlist collapsible parent/child — queue management is table stakes
- [x] Session system: isolated (default) / shared / open — the primary differentiation from MeTube; isolated mode is the zero-config safe default
- [x] SSE init replay on reconnect — required for page refresh resilience; without this isolated mode is useless
- [x] Cookie auth (cookies.txt upload per-session, Netscape format) — enables paywalled content; the practical reason people move off MeTube
- [x] Purge system: scheduled / manual / never; independent file + log TTL — ephemeral storage is the contract with users
- [x] Three built-in themes: cyberpunk (default), dark, light — visual identity and immediate differentiation
- [x] Drop-in custom theme system (volume mount) — the feature request MeTube refuses to build
- [x] Mobile-responsive layout (bottom tabs + card list at <768px, 44px touch targets); no competitor does mobile well
- [x] Admin panel: username/password login, session list, storage, manual purge, unsupported URL log, live config — operators need a UI, not raw config
- [x] Unsupported URL reporting (user-triggered, domain-only by default) — trust feature; users see exactly what gets logged
- [x] Health endpoint (`GET /api/health`) — Uptime Kuma and similar monitoring tools are table stakes for self-hosters
- [x] Session export/import — enables identity continuity on persistent instances
- [x] Link sharing (source URL clipboard + completed file shareable URL) — reduces friction for the "share with a friend" use case
- [x] Zero automatic outbound telemetry — non-negotiable privacy baseline
- [x] Docker: single image, GHCR + Docker Hub, amd64 + arm64 — distribution is a feature
### Add After Validation (v1.x)
Features to add once core is working and v1.0 is shipped.
- [ ] Per-format/quality download presets — add when session system is stable and users ask for it
- [ ] Branding polish pass — tune cyberpunk defaults, tighten out-of-box experience, ensure built-in theme comments are comprehensive
- [ ] `reporting.github_issues: true` — pre-filled GitHub issue opening; disabled by default, enable only after log download is validated
- [ ] Queue filter/sort persistence — store last sort state in localStorage
### Future Consideration (v2+)
Features to defer until product-market fit is established.
- [ ] External arr-stack API (Radarr/Sonarr programmatic integration) — architecture designed not to block this; clean API surface ready
- [ ] Download presets / saved quality profiles — needs session stability first
- [ ] Subscription / channel monitoring — fundamentally different product scope; defer to TubeArchivist/Pinchflat integration or separate milestone
- [ ] Telegram/Discord bot — documented extension point; clean REST API makes it straightforward
## Feature Prioritization Matrix
| Feature | User Value | Implementation Cost | Priority |
|---------|------------|---------------------|----------|
| URL submission + download | HIGH | LOW | P1 |
| Real-time SSE progress | HIGH | MEDIUM | P1 |
| Format/quality selector | HIGH | MEDIUM | P1 |
| Job queue (view + cancel) | HIGH | LOW | P1 |
| Playlist parent/child jobs | HIGH | HIGH | P1 |
| Session isolation (cookie-based) | HIGH | HIGH | P1 |
| SSE init replay on reconnect | HIGH | MEDIUM | P1 |
| Three built-in themes | HIGH | MEDIUM | P1 |
| Mobile-responsive layout | HIGH | HIGH | P1 |
| Docker distribution | HIGH | LOW | P1 |
| Health endpoint | MEDIUM | LOW | P1 |
| Cookie auth (cookies.txt upload) | HIGH | MEDIUM | P1 |
| Purge system (scheduled/manual/never) | MEDIUM | MEDIUM | P1 |
| Admin panel (username/password) | MEDIUM | HIGH | P1 |
| Drop-in custom themes (volume mount) | MEDIUM | HIGH | P1 |
| Session export/import | MEDIUM | MEDIUM | P1 |
| Unsupported URL reporting | LOW | LOW | P1 |
| Link sharing | LOW | LOW | P1 |
| Zero outbound telemetry | HIGH | LOW | P1 (constraint, not feature) |
| Source-aware output templates | MEDIUM | LOW | P1 |
| Per-format download presets | MEDIUM | MEDIUM | P2 |
| GitHub issue prefill for reporting | LOW | LOW | P2 |
| Subscription/channel monitoring | MEDIUM | HIGH | P3 |
| Arr-stack API integration | MEDIUM | HIGH | P3 |
**Priority key:**
- P1: Must have for v1.0 launch
- P2: Should have in v1.x
- P3: Future milestone
## Competitor Feature Analysis
| Feature | MeTube | yt-dlp-web-ui | ytptube | media.rip() |
|---------|--------|---------------|---------|-------------|
| URL submission | Yes | Yes | Yes | Yes |
| Real-time progress | WebSocket | WebSocket/RPC | WebSocket | SSE (simpler, auto-reconnect) |
| Format selection | Quality presets (no live extraction) | Yes | Yes (presets) | Live extraction via `GET /api/formats` |
| Playlist support | Yes (flat) | Yes | Yes | Yes (collapsible parent/child) |
| Session isolation | No — all sessions see all downloads (closed as won't fix) | No | Basic auth only | Yes — isolated/shared/open modes |
| Cookie auth | Yes (global, not per-session) | No | Yes | Yes (per-session, purge-scoped) |
| Theming | light/dark/auto env var | None | None | 3 built-ins + drop-in custom themes |
| Mobile-first UI | No (desktop-first) | No | No | Yes (bottom tabs, card list, 44px targets) |
| Admin panel | No | Basic auth header | Basic auth | Username/password login UI, config editor |
| Session export/import | No | No | No | Yes |
| Purge policy | `CLEAR_COMPLETED_AFTER` only | No | No | scheduled/manual/never, independent TTLs |
| Unsupported URL reporting | Error shown only | Error shown only | Error shown only | User-triggered log + admin download |
| Health endpoint | No | No | No | Yes — version, uptime, disk space, queue depth |
| Link sharing | Base URL config only | No | No | Clipboard + direct file download URL |
| Zero telemetry | Yes | Yes | Yes | Yes (explicit design constraint) |
| Docker distribution | Yes (amd64 only) | Yes | Yes | Yes (amd64 + arm64) |
## Edge Cases and Expected Behaviors
### Format Selection
- **Slow info extraction:** `GET /api/formats?url=` calls `extract_info(process=False)` — for some sites this takes 3-10 seconds. UI must show a loading state on the format picker immediately after URL is pasted.
- **No formats returned:** Some sites return a direct URL without format list. UI should fall back to "Best available" option gracefully.
- **Audio-only formats:** Some formats have `vcodec: none` — these should be labeled clearly (e.g., "Audio only — MP3 128kbps").
- **Format IDs are extractor-specific:** `format_id` values are not portable across sites; always pass them as opaque strings to yt-dlp.
- **filesize field is frequently null:** Many formats don't report filesize in the info_dict. Show "~estimate" or "unknown" — never show 0.
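
The labeling rules above can be sketched as a small pure function. This is a minimal illustration, not the project's implementation: `human_size`, `format_label`, and `picker_options` are hypothetical names, and the input is assumed to be one entry of yt-dlp's `info_dict["formats"]` using the `ext`, `vcodec`, `abr`, `height`, `filesize`, and `filesize_approx` keys.

```python
from typing import Any, Optional


def human_size(num_bytes: Optional[int]) -> str:
    """Render a byte count; None or 0 becomes 'unknown', never '0'."""
    if not num_bytes:
        return "unknown"
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} GiB"


def format_label(fmt: dict[str, Any]) -> str:
    """Build a picker label from one format entry (hypothetical helper)."""
    ext = fmt.get("ext") or "?"
    # yt-dlp marks a missing video stream with the string "none".
    if fmt.get("vcodec") == "none":
        abr = fmt.get("abr")
        quality = f"{abr:.0f}kbps" if abr else "unknown bitrate"
        head = f"Audio only, {ext.upper()} {quality}"
    else:
        height = fmt.get("height")
        head = f"{height}p {ext}" if height else f"{ext} (resolution unknown)"
    # Prefer the exact size, fall back to the estimate, else say "unknown".
    if fmt.get("filesize"):
        size = human_size(fmt["filesize"])
    elif fmt.get("filesize_approx"):
        size = "~" + human_size(fmt["filesize_approx"])
    else:
        size = "unknown"
    return f"{head} ({size})"


def picker_options(formats: Optional[list[dict[str, Any]]]) -> list[str]:
    """Fall back to a single 'Best available' entry when no formats exist."""
    labels = [format_label(f) for f in formats or []]
    return labels or ["Best available"]
```

The `format_id` is deliberately left out of the label: it stays an opaque value attached to the option and is passed back to yt-dlp unchanged.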
### Cookie Auth
- **Cookie expiry:** Exported cookies are short-lived; they often expire within about two weeks of export, depending on the site. After expiry yt-dlp fails with an auth error, so the job should show `failed` with a "cookies may be expired" hint.
- **Cookie scope:** cookies.txt contains all site cookies from the browser export. Users should understand this is sensitive. Never log cookie file contents; purge on session clear.
- **Chrome cookie extraction broken since July 2024:** Chrome's App-Bound Encryption makes external extraction impossible. Firefox is the recommended browser for cookie export. UI should surface this note in the cookie upload flow.
- **CRLF vs LF:** Windows-generated cookies.txt files may use CRLF line endings, causing yt-dlp parse errors. Backend should normalize to LF on upload.
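
A minimal sketch of the upload-time normalization described above, assuming the upload arrives as raw bytes. `normalize_cookies_upload` and `count_cookie_rows` are hypothetical helper names; the only format fact relied on is that a Netscape cookie row has seven tab-separated fields.

```python
def normalize_cookies_upload(raw: bytes) -> bytes:
    """Normalize an uploaded cookies.txt to LF line endings (hypothetical helper)."""
    text = raw.decode("utf-8", errors="replace")
    # Some Windows editors prepend a byte-order mark; drop it.
    if text.startswith("\ufeff"):
        text = text[1:]
    # CRLF first, then any stray CR, so "\r\n" never becomes two newlines.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    if not text.endswith("\n"):
        text += "\n"
    return text.encode("utf-8")


def count_cookie_rows(normalized: bytes) -> int:
    """Count lines that look like cookie rows: seven tab-separated fields."""
    rows = 0
    for line in normalized.decode("utf-8").split("\n"):
        # "#HttpOnly_" prefixes a real cookie row; other "#" lines are comments.
        if not line or (line.startswith("#") and not line.startswith("#HttpOnly_")):
            continue
        if len(line.split("\t")) == 7:
            rows += 1
    return rows
```

Rejecting an upload where `count_cookie_rows` returns 0 gives the user an immediate error instead of a failed job later. Neither function logs or returns cookie values beyond the normalized file itself.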
### Playlist Downloads
- **Large playlists:** A 200-video playlist creates 201 rows in the queue (1 parent + 200 children). UI must handle this gracefully — collapsed by default, with count shown on parent row.
- **Mixed success/failure in playlists:** Some child videos in a playlist may be geo-blocked or removed. Parent job should complete with a `partial` status or show child failure counts.
- **Playlist URL re-extraction:** If a user submits the same playlist URL twice, they get two independent parent jobs (keyed by UUID, not URL). This is intentional per PROJECT.md.
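
One way to derive the parent row's status from its children, as a sketch only. The status names follow the ones used in this document; `parent_status` is a hypothetical function, and treating `cancelled` as terminal is an assumption.

```python
from collections import Counter

# Assumption: these are the statuses after which a child job never changes again.
TERMINAL = {"completed", "failed", "cancelled"}


def parent_status(child_statuses: list[str]) -> tuple[str, dict[str, int]]:
    """Return (status, per-status counts) for a playlist parent job."""
    counts = Counter(child_statuses)
    if not child_statuses:
        return "queued", {}
    still_running = sum(n for status, n in counts.items() if status not in TERMINAL)
    if still_running:
        status = "downloading"
    elif counts["completed"] == len(child_statuses):
        status = "completed"
    elif counts["completed"] == 0:
        status = "failed"
    else:
        # Some children succeeded and some did not.
        status = "partial"
    return status, dict(counts)
```

Returning the counts alongside the status lets the collapsed parent row show something like "198 of 200 completed" without expanding the children.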
### Session System
- **SSE reconnect race:** If the user refreshes while a download is mid-progress, the SSE `init` event must replay the current job state. Without this, the queue appears empty after refresh even though downloads are running.
- **Session mode changes by operator:** If an operator switches from `isolated` to `shared` mid-deployment, existing per-session rows remain scoped to their session IDs. `shared` mode queries all rows regardless of session_id. This is a data model concern — no migration needed, but operator docs should explain the behavior.
- **`open` mode + session export conflict:** In `open` mode, no session is assigned (session_id = null). Session export has nothing to export. UI should hide the export button in `open` mode.
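
The mode behavior above reduces to how the job query is scoped. The sketch below assumes a hypothetical `jobs(id, session_id)` table and hypothetical helper names; it only illustrates that switching modes changes the `WHERE` clause, not the stored rows.

```python
import sqlite3
from typing import Optional


def job_scope(mode: str, session_id: Optional[str]) -> tuple[str, tuple]:
    """Return a WHERE clause and its parameters for the given session mode."""
    if mode == "isolated":
        if session_id is None:
            raise ValueError("isolated mode requires a session_id")
        return "WHERE session_id = ?", (session_id,)
    if mode in ("shared", "open"):
        # Every row is visible, whatever session_id it carries (including NULL).
        return "", ()
    raise ValueError(f"unknown session mode: {mode!r}")


def visible_job_ids(conn: sqlite3.Connection, mode: str,
                    session_id: Optional[str]) -> list[str]:
    """List the job IDs a request may see under the current mode."""
    where, params = job_scope(mode, session_id)
    rows = conn.execute(f"SELECT id FROM jobs {where} ORDER BY id", params)
    return [row[0] for row in rows]


def can_export_session(mode: str) -> bool:
    """In open mode there is no session to export, so the UI hides the button."""
    return mode != "open"
```

Because the clause is chosen from fixed strings and the session ID is always bound as a parameter, the f-string never interpolates user input.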
### Purge
- **Purge while download is active:** Purge must skip jobs with status `downloading` or `queued`. Only `completed`, `failed`, and `expired` jobs are eligible.
- **File already deleted manually:** If a user deletes a file from `/downloads` outside the app, purge should handle the missing file gracefully (log it, continue).
- **Log TTL vs file TTL independence:** The design intentionally allows keeping logs longer than files (e.g., files_ttl_hours: 24, logs_ttl_hours: 168). The purge.scope config controls what gets deleted.
## Sources
- [MeTube GitHub — alexta69/metube](https://github.com/alexta69/metube)
- [MeTube Issue #591 — User management / per-user isolation request](https://github.com/alexta69/metube/issues/591)
- [MeTube Issue #535 — Optional login page request](https://github.com/alexta69/metube/issues/535)
- [yt-dlp-web-ui — marcopiovanello/yt-dlp-web-ui](https://github.com/marcopiovanello/yt-dlp-web-ui)
- [yt-dlp-web-ui Authentication methods wiki](https://github.com/marcopiovanello/yt-dlp-web-ui/wiki/Authentication-methods)
- [ytptube — arabcoders/ytptube](https://github.com/arabcoders/ytptube)
- [yt-dlp Information Extraction Pipeline — DeepWiki](https://deepwiki.com/yt-dlp/yt-dlp/2.2-information-extraction-pipeline)
- [yt-dlp cookie system — DeepWiki](https://deepwiki.com/yt-dlp/yt-dlp/5.5-browser-integration-and-cookie-system)
- [The Ultimate Guide to GUI Front-Ends for yt-dlp 2025 — BrightCoding](https://www.blog.brightcoding.dev/2025/12/06/the-ultimate-guide-to-gui-front-ends-for-youtube-dl-yt-dlp-download-videos-like-a-pro-2025-edition/)
- [6 Ways to Get YouTube Cookies for yt-dlp in 2026 — DEV Community](https://dev.to/osovsky/6-ways-to-get-youtube-cookies-for-yt-dlp-in-2026-only-1-works-2cnb)
- [MeTube on Hacker News — user discussion of limitations](https://news.ycombinator.com/item?id=41098974)
---
*Feature research for: yt-dlp web frontend / self-hosted media downloader*
*Researched: 2026-03-17*

View file

@ -0,0 +1,358 @@
# Pitfalls Research
**Domain:** yt-dlp web frontend — FastAPI + Vue 3 + SSE + SQLite + Docker
**Researched:** 2026-03-17
**Confidence:** HIGH (critical pitfalls verified via official yt-dlp issues, sse-starlette docs, CVE advisories; MEDIUM for performance traps and Docker sizing which rely on community sources)
---
## Critical Pitfalls
### Pitfall 1: Using a Single YoutubeDL Instance for Concurrent Downloads
**What goes wrong:**
Multiple in-flight downloads share one `YoutubeDL` instance. Instance state (cookies, temp files, internal logger, download archive state) is mutated per-download, causing downloads to corrupt each other's progress data, swap cookies, or raise `TypeError` on `None` fields when hooks fire out of order.
**Why it happens:**
yt-dlp is documented as a library by example (`with YoutubeDL(opts) as ydl: ydl.download([url])`), which looks reusable. There is no explicit "not thread-safe" warning in the README. Developers assume the object is stateless between calls.
**How to avoid:**
Create a fresh `YoutubeDL` instance per download job, inside the worker function. Never share an instance across concurrent threads or tasks:
```python
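from yt_dlp import YoutubeDL

def _run_download(job_id: str, url: str, opts: dict) -> None:
    # A fresh instance per job: no cookies, hooks, or temp state are shared.
    with YoutubeDL({**opts, "progress_hooks": [make_hook(job_id)]}) as ydl:
        ydl.download([url])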
```
Run this inside `loop.run_in_executor(thread_pool, _run_download, ...)` so the FastAPI event loop is not blocked. The YoutubeDL object never crosses the thread boundary.
**Warning signs:**
- Progress percentages jump between unrelated jobs
- Two downloads finish at the same time and one reports 0% or corrupted size
- `TypeError: '>' not supported between instances of 'NoneType' and 'int'` in a progress hook (a known issue when the hook receives a stale `None` from another job's state)
**Phase to address:**
Core download engine (Phase 1 / foundation). This is the fundamental architecture decision — get it right before building progress reporting on top of it.
---
### Pitfall 2: Calling asyncio Primitives from a yt-dlp Progress Hook
**What goes wrong:**
The progress hook fires inside the `ThreadPoolExecutor` worker thread, not on the asyncio event loop. Calling `asyncio.Queue.put()`, `asyncio.Event.set()`, or any awaitable directly from the hook raises `RuntimeError: no running event loop` or silently does nothing.
**Why it happens:**
Progress hooks feel like callbacks, and callbacks in async Python code are usually called on the event loop. But yt-dlp is synchronous — its hooks fire on whichever OS thread is running the download. `loop.run_in_executor` moves the whole call to a thread pool; the hook fires inside that thread.
**How to avoid:**
Use `loop.call_soon_threadsafe()` to bridge the thread back to the event loop:
```python
import asyncio


def make_hook(job_id: str, loop: asyncio.AbstractEventLoop, queue: asyncio.Queue):
    def hook(d: dict) -> None:
        # Runs on the worker thread: must not await or touch asyncio objects directly.
        loop.call_soon_threadsafe(queue.put_nowait, {
            "job_id": job_id,
            "status": d.get("status"),
            "downloaded": d.get("downloaded_bytes"),
            # total_bytes is absent for some downloads; fall back to the estimate.
            "total": d.get("total_bytes") or d.get("total_bytes_estimate"),
        })
    return hook
```
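Capture the loop with `asyncio.get_running_loop()` inside the FastAPI startup/lifespan context (before executor threads start) and pass it into the hook factory. Calling `asyncio.get_event_loop()` from the worker thread itself does not return the server's loop.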
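
The bridge can be exercised without yt-dlp. In the sketch below, `fake_download` is a stand-in (an assumption, not yt-dlp code) for a blocking download that calls its hook from the worker thread, and `run_job` is a hypothetical driver that collects what arrives on the event-loop side.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def make_hook(job_id: str, loop: asyncio.AbstractEventLoop, queue: asyncio.Queue):
    def hook(d: dict) -> None:
        # Worker thread: hand the event to the loop instead of touching the queue.
        loop.call_soon_threadsafe(queue.put_nowait, {"job_id": job_id, **d})
    return hook


def fake_download(hook: Callable[[dict], None]) -> None:
    """Stand-in for a blocking download; fires the hook from its own thread."""
    for done in (25, 50, 100):
        hook({"status": "downloading", "downloaded_bytes": done, "total_bytes": 100})
    hook({"status": "finished", "downloaded_bytes": 100, "total_bytes": 100})


async def run_job(job_id: str) -> list[dict]:
    loop = asyncio.get_running_loop()  # captured on the loop thread
    queue: asyncio.Queue = asyncio.Queue()
    events: list[dict] = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        worker = loop.run_in_executor(pool, fake_download, make_hook(job_id, loop, queue))
        while True:
            event = await queue.get()
            events.append(event)
            if event["status"] == "finished":
                break
        await worker  # surface any exception raised in the thread
    return events
```

`asyncio.run(run_job("job-1"))` returns the four events in the order the hook fired them, since `call_soon_threadsafe` schedules callbacks first-in, first-out.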
**Warning signs:**
- SSE stream connects but never receives progress updates
- `RuntimeError: no running event loop` in thread worker logs
- Progress updates arrive in large batches rather than incrementally (queued but not flushed)
**Phase to address:**
Core download engine (Phase 1). The hook bridging must be wired before SSE progress streaming is built.
---
### Pitfall 3: SSE Connection Leak from Swallowed CancelledError
**What goes wrong:**
When a client disconnects, `sse-starlette` raises `asyncio.CancelledError` in the generator coroutine. If the generator catches it without re-raising (common in `try/except Exception` blocks), the task group never terminates: the ping task, the disconnect listener, and the downstream SSE write loop all become zombie tasks. Over time, the server accumulates connection handles, event queues, and memory.
**Why it happens:**
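`except Exception` catches `CancelledError` on Python 3.7 and earlier; since 3.8 it inherits from `BaseException`, but code written for 3.7 patterns is still common. On Python 3.8+ (this project targets 3.12) the same leak comes from a bare `except:`, an `except BaseException`, or an `except asyncio.CancelledError` handler that does not re-raise. Developers add broad exception handlers to "safely" clean up resources, not realizing they're suppressing the cancellation signal.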
**How to avoid:**
Always use `try/finally` for cleanup, and never wrap SSE generator bodies in a broad handler that swallows cancellation. If `CancelledError` is caught at all, re-raise it:
```python
import asyncio

from fastapi import Request


async def event_generator(request: Request, session_id: str):
    try:
        async for event in _stream_events(session_id):
            if await request.is_disconnected():
                break
            yield event
    except asyncio.CancelledError:
        # Client disconnected: nothing to handle here, cleanup lives in `finally`.
        raise  # ALWAYS re-raise
    finally:
        # Runs on normal exit, break, and cancellation alike, exactly once:
        # clean up queues and unsubscribe the session.
        _cleanup_session_stream(session_id)
```
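
The cancellation path can be checked in isolation with plain asyncio. This is a sketch under stated assumptions: `event_stream`, `consume`, and `demo` are hypothetical names, and the `cleaned` list stands in for real cleanup such as unsubscribing a session queue.

```python
import asyncio


async def event_stream(session_id: str, queue: asyncio.Queue, cleaned: list[str]):
    """Yield queued events forever; record cleanup when the stream ends."""
    try:
        while True:
            yield await queue.get()
    finally:
        # Reached on cancellation too, because nothing above swallows it.
        cleaned.append(session_id)


async def consume(session_id: str, queue: asyncio.Queue,
                  received: list[str], cleaned: list[str]) -> None:
    async for event in event_stream(session_id, queue, cleaned):
        received.append(event)


async def demo() -> tuple[list[str], bool, list[str]]:
    queue: asyncio.Queue = asyncio.Queue()
    received: list[str] = []
    cleaned: list[str] = []
    task = asyncio.create_task(consume("s1", queue, received, cleaned))
    await queue.put("progress:10")
    await asyncio.sleep(0.01)  # let the consumer pick up the event
    task.cancel()              # simulates the client disconnecting
    cancelled = False
    try:
        await task
    except asyncio.CancelledError:
        cancelled = True       # the cancellation propagated, as it should
    return received, cancelled, cleaned
```

If the `finally` block were replaced with a handler that catches the cancellation and keeps looping, `await task` would never finish, which is the zombie-task behavior described above.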
**Warning signs:**
- Server memory grows slowly over time even with low active user count
- `asyncio.all_tasks()` shows growing number of `sse_starlette` tasks
- CPU spikes at idle as zombie ping tasks fire continuously
**Phase to address:**
SSE streaming (Phase 2). Must be enforced before load testing; the leak is invisible at low connection counts and only surfaces under sustained use.
---
### Pitfall 4: Purge Job Deleting Files for Active Downloads
**What goes wrong:**
The APScheduler purge job queries jobs older than TTL and deletes their files. If a download is actively writing to disk when the purge runs, the file is deleted mid-write. The download worker then fails with `FileNotFoundError` or produces a zero-byte file. The job status in SQLite may be stuck in `downloading` forever.
**Why it happens:**
Purge logic typically queries by `created_at < now() - TTL` or `completed_at < now() - TTL`. If `completed_at` is NULL for an active download, range logic can accidentally include it depending on NULL handling in the SQL query. Additionally, "complete" status transitions may lag: a job is marked `completed` in the DB a moment after the file is fully written, leaving a window.
**How to avoid:**
Restrict every purge query to an explicit allow-list of terminal statuses, so `queued` and `downloading` rows can never match; never rely on timestamps alone:
```sql
DELETE FROM jobs
WHERE status IN ('completed', 'failed', 'cancelled')
AND completed_at < :cutoff_ts
```
Also: before deleting a file path, verify the corresponding job row has a terminal status. Write a test that starts a slow download (sleep in a test hook) and triggers purge mid-download — verify the file is not touched.
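
The status guard is easy to pin down with a database-level test before involving real downloads. The sketch below uses the standard-library `sqlite3` module and a hypothetical minimal `jobs` schema; it checks only the query, not file deletion.

```python
import sqlite3

PURGE_SQL = """
DELETE FROM jobs
WHERE status IN ('completed', 'failed', 'cancelled')
  AND completed_at < :cutoff_ts
"""


def purge(conn: sqlite3.Connection, cutoff_ts: float) -> int:
    """Delete terminal jobs finished before cutoff_ts; return rows removed."""
    cur = conn.execute(PURGE_SQL, {"cutoff_ts": cutoff_ts})
    conn.commit()
    return cur.rowcount


def make_demo_db() -> sqlite3.Connection:
    """In-memory database with one old finished, one old active, one recent job."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id TEXT PRIMARY KEY, status TEXT NOT NULL,"
        " created_at REAL NOT NULL, completed_at REAL)"
    )
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?, ?)",
        [
            ("old-done", "completed", 100.0, 200.0),
            ("old-active", "downloading", 100.0, None),  # must survive the purge
            ("new-done", "completed", 900.0, 950.0),
        ],
    )
    conn.commit()
    return conn
```

With a cutoff of `500.0`, only `old-done` is removed; `old-active` survives even though it was created long before the cutoff. In a real purge the file paths would be selected with the same `WHERE` clause and removed before the rows are deleted.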
**Warning signs:**
- Downloads succeed in tests but randomly fail in production under load
- Jobs stuck in `downloading` status in DB with no active worker
- Zero-byte files in the download directory
**Phase to address:**
Purge/session management (Phase 3). Write the status-guard test as part of the purge implementation, not after.
---
### Pitfall 5: SSE Reconnect Storm on Page Reload
**What goes wrong:**
When `EventSource` loses connection (server restart, tab backgrounded, network blip), the browser retries automatically after a short delay (roughly 3 seconds by default in most browsers). If the server does not emit event IDs (so the browser has no `Last-Event-ID` to send back) and does not replay recent events, every reconnect gets a blank slate: the UI shows empty progress or "unknown" status for all in-progress downloads. Users refresh repeatedly, multiplying connections. Multiple tabs from the same session each open their own SSE connection, which can exhaust the 6-connection-per-domain HTTP/1.1 limit.
**Why it happens:**
SSE reconnect is automatic and invisible — developers build the happy path but don't test what happens after a reconnect. `Last-Event-ID` support requires the server to track sent event IDs and replay them, which is non-trivial to implement late.
**How to avoid:**
- Assign an incrementing `event_id` to every SSE message from day one (can be a job-scoped counter or a global sequence).
- On reconnect, read the `Last-Event-ID` header and replay what the session missed after that ID.
- Replay a current-state snapshot (the latest status per job) rather than the full event log; this bounds replay size and prevents replay storms.
- Set `retry: 5000` in the SSE stream to slow down reconnect attempts.
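- Serve over HTTP/2 to lift the 6-connection limit. Uvicorn does not implement HTTP/2, so terminate HTTP/2 at a reverse proxy (nginx or Caddy) in front of the container, or use an ASGI server that supports it, such as Hypercorn.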
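
The event-ID and snapshot-replay bullets above can be sketched with a small in-memory structure. `SessionEventLog` and `sse_frame` are hypothetical names, and a real deployment would rebuild this state from SQLite after a restart; the frame layout (`retry:`, `id:`, `data:` fields ending in a blank line) follows the SSE wire format.

```python
import itertools
import json
from typing import Optional


class SessionEventLog:
    """Keeps only the newest event per job, tagged with an increasing ID."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._latest: dict[str, tuple[int, dict]] = {}

    def record(self, job_id: str, payload: dict) -> int:
        """Store the payload as the job's current state; return its event ID."""
        event_id = next(self._ids)
        self._latest[job_id] = (event_id, payload)
        return event_id

    def replay_after(self, last_event_id: int) -> list[tuple[int, dict]]:
        """Latest state of every job that changed after last_event_id."""
        newer = [item for item in self._latest.values() if item[0] > last_event_id]
        return sorted(newer, key=lambda item: item[0])


def sse_frame(event_id: int, payload: dict, retry_ms: Optional[int] = None) -> str:
    """Serialize one SSE message; retry_ms sets the client's reconnect delay."""
    lines = []
    if retry_ms is not None:
        lines.append(f"retry: {retry_ms}")
    lines.append(f"id: {event_id}")
    lines.append(f"data: {json.dumps(payload)}")
    return "\n".join(lines) + "\n\n"
```

A client that reconnects with `Last-Event-ID: 0` (or none at all) receives one event per job, so replay size is bounded by the number of jobs rather than the number of progress ticks.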
**Warning signs:**
- After page reload, download cards show "Unknown" or empty progress
- Browser devtools Network tab shows rapid repeated connections to `/api/events`
- Multiple tabs stop receiving updates (one tab's connection blocks others on HTTP/1.1)
**Phase to address:**
SSE streaming (Phase 2). Must be designed in from the start — adding `Last-Event-ID` replay retroactively requires event log storage.
---
### Pitfall 6: cookies.txt File Leakage via Redirect Attack (CVE-2023-35934)
**What goes wrong:**
yt-dlp passes uploaded cookies as a `Cookie` header to the file downloader for every request, including redirects. A malicious URL can redirect to an attacker-controlled host, leaking the user's session cookies for the original site. In a multi-user deployment, one user's cookies for YouTube, Vimeo, or Patreon are sent to any host that redirects the download.
**Why it happens:**
yt-dlp versions before 2023.07.06 do not scope cookies to the origin domain at the file download stage. The CVE affects youtube-dl (all versions) and all yt-dlp versions before the fix. The attack requires no exploit: it is the normal redirect behavior, abused.
**How to avoid:**
- Require yt-dlp >= 2023.07.06 (the patched release). Verify the constraint in `requirements.txt` and in the Docker build.
- Store cookies.txt files with per-session isolation: `data/sessions/{session_id}/cookies.txt` — never share files across sessions.
- Delete cookies.txt after the download job completes (or on session purge) so they do not persist on disk.
- Never log the cookies.txt path in any publicly readable log.
- In the security model: treat uploaded cookies as highly sensitive credentials, equivalent to a login token.
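
A sketch of the per-session storage rule, assuming the `data/sessions/{session_id}/cookies.txt` layout named above and UUID session IDs. The helper names are hypothetical. Validating the ID before building the path is an added precaution, so a crafted session value cannot point outside the data root.

```python
import uuid
from pathlib import Path


def cookie_path(data_root: Path, session_id: str) -> Path:
    """Resolve the cookies.txt path for one session; reject non-UUID IDs."""
    # uuid.UUID raises ValueError for anything that is not a UUID.
    canonical = str(uuid.UUID(session_id))
    if canonical != session_id.lower():
        raise ValueError("session_id must be a canonical UUID string")
    return data_root / "sessions" / canonical / "cookies.txt"


def store_cookies(data_root: Path, session_id: str, content: bytes) -> Path:
    """Write the upload into the session's own directory, owner-readable only."""
    path = cookie_path(data_root, session_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)
    path.chmod(0o600)
    return path


def delete_cookies(data_root: Path, session_id: str) -> bool:
    """Remove the session's cookies; return False if there was nothing to delete."""
    try:
        cookie_path(data_root, session_id).unlink()
        return True
    except FileNotFoundError:
        return False
```

`delete_cookies` is meant to be called both when a job finishes and from the session purge, which is why a missing file is treated as a normal outcome rather than an error.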
**Warning signs:**
- yt-dlp version pinned to a release older than 2023.07.06
- cookies.txt stored in a shared directory (e.g., `/data/cookies.txt` instead of per-session paths)
- cookies.txt files not cleaned up after job completion
**Phase to address:**
Cookie auth feature (Phase 2 or whenever cookies.txt upload is implemented). Pin the version constraint immediately in Phase 1 setup.
---
### Pitfall 7: SQLite Write Contention Without WAL Mode
**What goes wrong:**
Multiple concurrent download workers write job status updates (progress %, `downloaded_bytes`, status transitions) to SQLite through aiosqlite. Without WAL mode, SQLite uses a database-level exclusive lock for every write: writer 1 locks, and writers 2 through N receive `SQLITE_BUSY` and fail (or retry until timeout). Under 3+ simultaneous downloads, status updates are dropped, progress bars freeze, and failed retries surface as 500 errors.
**Why it happens:**
The default SQLite journal mode (`DELETE`) serializes all writers. aiosqlite runs all operations in a background thread, but the locking is at the database layer, not the Python layer. Developers test with one download at a time and never see contention.
**How to avoid:**
Enable WAL mode at application startup before any writes:
```python
async def setup_db(conn):
    await conn.execute("PRAGMA journal_mode=WAL")
    await conn.execute("PRAGMA synchronous=NORMAL")
    await conn.execute("PRAGMA busy_timeout=5000")
    await conn.commit()
```
`busy_timeout=5000` gives waiting writers up to 5 seconds to retry before failing, absorbing brief contention spikes. WAL allows concurrent readers alongside a single writer, which is exactly the access pattern for a download queue.
**Warning signs:**
- `sqlite3.OperationalError: database is locked` in logs under concurrent downloads
- Progress bars stall on multiple simultaneous jobs but work fine one at a time
- aiosqlite 0.22.0+ connection behavior change (the connection is no longer a thread) causing hangs; ensure connections are properly closed with `async with`
**Phase to address:**
Core database setup (Phase 1). Set WAL mode in the database initialization function before any other schema work.
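
A startup check along these lines can confirm that WAL actually took effect. This is a sketch under stated assumptions: it uses the standard-library `sqlite3` module so it runs on its own (the application would issue the same PRAGMAs through aiosqlite), and `open_db` is a hypothetical helper name.

```python
import sqlite3


def open_db(path: str) -> sqlite3.Connection:
    """Open a connection, apply the pragmas, and fail fast if WAL is not active."""
    conn = sqlite3.connect(path)
    # The journal_mode pragma returns the mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute("PRAGMA busy_timeout=5000")
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"expected WAL journal mode, got {mode!r}")
    return conn
```

The journal mode is stored in the database file, while `synchronous` and `busy_timeout` apply only to the connection that sets them, so every new connection needs those two pragmas again. An in-memory database cannot use WAL, so tests of this check need a file-backed database.
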
---
## Technical Debt Patterns
| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
|----------|-------------------|----------------|-----------------|
| Single shared aiosqlite connection | Simpler code | Write serialization; connection-level lock defeats WAL concurrency | Never — use a connection pool or per-request connections |
| Hardcoded yt-dlp version (`yt-dlp==2024.x.x`) | Reproducibility | Site extractors break as YouTube/Vimeo update APIs; users report "can't download X" | Acceptable for initial release; add update strategy in v1.1 |
| Storing cookies.txt in a shared `/data/cookies/` directory | Simpler path management | Session A can access session B's cookies if path logic bugs; CVE-2023-35934 surface increases | Never — always per-session isolation |
| Running yt-dlp in the FastAPI process thread pool | No IPC complexity | One hanging download blocks a thread pool slot; OOM in one download can take down the whole process | Acceptable for v1.0 at self-hosted scale; document limit |
| Not implementing `Last-Event-ID` replay at launch | Simpler SSE handler | Every reconnect shows stale/blank UI; impossible to add replay cleanly without event log | Acceptable only if SSE is designed with event IDs from day one so replay can be added later without schema migration |
| `except Exception: pass` in SSE generators | Prevents crashes | Swallows `CancelledError`, creating zombie connections | Never |
| No busy_timeout on SQLite | Fewer config lines | Silent dropped writes under concurrent downloads | Never — always set busy_timeout |
---
## Integration Gotchas
| Integration | Common Mistake | Correct Approach |
|-------------|----------------|------------------|
| yt-dlp + asyncio | `await loop.run_in_executor(None, ydl.download, [url])` — blocks on `ydl` shared instance | Create `YoutubeDL` inside the worker function; pass only plain data (job_id, url, opts dict) across thread boundary |
| yt-dlp progress hook + event loop | `asyncio.Queue.put_nowait(data)` directly in hook | `loop.call_soon_threadsafe(queue.put_nowait, data)` — capture loop reference before entering executor |
| yt-dlp + ProcessPoolExecutor | Pass `YoutubeDL` instance to process pool | `YoutubeDL` is not picklable (contains file handles); use `ThreadPoolExecutor` only, or create instance inside worker |
| yt-dlp info extraction + download | Call `extract_info` and `download` in same executor call | Fine for ThreadPoolExecutor; `sanitize_info()` required if result crosses process boundary |
| sse-starlette + cleanup | `except Exception as e: cleanup(); pass` | `except asyncio.CancelledError: cleanup(); raise` — never swallow CancelledError |
| aiosqlite 0.22.0+ | `connection.daemon = True` (the connection is no longer a thread) | Use `async with aiosqlite.connect()` context manager; verify connection lifecycle in migration from older versions |
| cookies.txt + yt-dlp | Global cookies file path in `YDL_OPTS` shared across requests | Per-session path: `opts["cookiefile"] = f"data/sessions/{session_id}/cookies.txt"` |
| APScheduler + FastAPI lifespan | Starting scheduler outside `@asynccontextmanager lifespan` | Initialize and start scheduler inside the lifespan context manager to ensure clean shutdown |
| Vue 3 EventSource + HTTP/1.1 | Multiple browser tabs each open SSE connection | Serve over HTTP/2 (nginx/caddy in front of uvicorn) to lift 6-connection-per-domain limit |
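
The progress-hook row in the table above is the easiest one to get wrong, so here is a self-contained sketch of the thread-to-loop handoff. `fake_download` is a stand-in for the blocking yt-dlp call (an assumption so the example runs without yt-dlp), and `make_hook` and `run_job` are hypothetical names.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def make_hook(loop: asyncio.AbstractEventLoop, queue: asyncio.Queue, job_id: str):
    """Build a progress hook that is safe to call from a worker thread."""
    def hook(d: dict) -> None:
        event = {
            "id": job_id,
            "status": d.get("status"),
            "downloaded_bytes": d.get("downloaded_bytes", 0),
        }
        # Never touch the queue from the worker thread; hand the put to the loop.
        loop.call_soon_threadsafe(queue.put_nowait, event)
    return hook


def fake_download(hook) -> None:
    """Stand-in for the blocking download; emits three progress dicts."""
    for n in (100, 200):
        hook({"status": "downloading", "downloaded_bytes": n})
    hook({"status": "finished", "downloaded_bytes": 200})


async def run_job(job_id: str) -> list:
    loop = asyncio.get_running_loop()  # captured before entering the executor
    queue: asyncio.Queue = asyncio.Queue()
    events = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        await loop.run_in_executor(pool, fake_download, make_hook(loop, queue, job_id))
        # Drain until the terminal event arrives.
        while True:
            event = await queue.get()
            events.append(event)
            if event["status"] == "finished":
                return events
```

Running `asyncio.run(run_job("job-1"))` returns the three events in the order the worker emitted them.
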
---
## Performance Traps
| Trap | Symptoms | Prevention | When It Breaks |
|------|----------|------------|----------------|
| Progress hook writing to DB on every hook call | DB write rate exceeds 10/sec per download; downloads slow down | Throttle DB writes: update DB only when `downloaded_bytes` changes by >1MB or status changes | 3+ simultaneous downloads with fast connections |
| SSE endpoint holding open connection per download per session | Memory grows linearly with active sessions × downloads | One SSE connection per session (multiplexed events), not one per job | 10+ concurrent sessions |
| yt-dlp `extract_info` for URL auto-detection on every keystroke | Rapid URL paste triggers multiple concurrent `extract_info` calls; thread pool saturates | Debounce URL input (500ms) before triggering extraction; cancel in-flight extraction on new input | Immediately, if users paste multi-word text before settling on a URL |
| Docker COPY of entire project directory before pip install | Every code change invalidates pip cache layer | Order Dockerfile: copy `requirements.txt` first → `pip install` → copy app code | Every build during active development |
| aiosqlite without connection pool | Each request opens/closes its own connection; overhead accumulates | Use a single long-lived connection with WAL mode, or `aiosqlitepool` for high throughput | 50+ req/sec (well above self-hosted target, but good practice) |
| Purge scanning entire jobs table without index | Admin-triggered purge takes seconds to complete, blocks event loop if not offloaded | Index `(session_id, status, completed_at)` from the start | 10,000+ job rows |
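
The first row of the table (throttling progress writes) can be sketched as a small gate in front of the database update. `ProgressThrottle` is a hypothetical name, and the 1 MB default simply mirrors the threshold suggested above.

```python
class ProgressThrottle:
    """Decide whether a progress update is worth persisting to the database."""

    def __init__(self, min_bytes: int = 1_000_000):
        self.min_bytes = min_bytes
        self._last_bytes = 0
        self._last_status = None

    def should_write(self, status: str, downloaded_bytes: int) -> bool:
        status_changed = status != self._last_status
        bytes_advanced = downloaded_bytes - self._last_bytes >= self.min_bytes
        if not (status_changed or bytes_advanced):
            return False
        # Remember what was persisted so the next call compares against it.
        self._last_status = status
        self._last_bytes = downloaded_bytes
        return True
```

One instance per job keeps the state isolated. SSE events can still be published on every hook call, since only the database write needs throttling.
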
---
## Security Mistakes
| Mistake | Risk | Prevention |
|---------|------|------------|
| Cookies.txt stored beyond job lifetime | User's site credentials persist on disk; accessible if container is compromised or volume is shared | Delete on job completion; delete on session purge; include in purge scope always |
| Admin password transmitted without HTTPS | Credentials intercepted on network | Enforce HTTPS in Docker deployment docs; add `SECURE_COOKIES=true` check in startup that warns loudly if running over HTTP |
| Session cookie without `HttpOnly` + `SameSite=Lax` | Cookie accessible via XSS; CSRF possible against download endpoints | Set `response.set_cookie("mrip_session", ..., httponly=True, samesite="lax", secure=False)` (secure=True in prod) |
| Session ID that doesn't rotate after login/admin-auth | Session fixation — attacker sets a known session ID before user authenticates | Regenerate session ID on any privilege change (session creation, admin login) |
| Admin credentials stored in plaintext in `config.yaml` | Credential leak if config volume is readable | Store bcrypt hash of admin password, not plaintext; generate a random default on first boot with forced change prompt |
| yt-dlp version < 2023-07-06 | CVE-2023-35934: cookie leak via redirect | Pin `yt-dlp>=2023.07.06` in `requirements.txt`; verify in Docker health check |
| No rate limiting on download submission | Unauthenticated user floods server with download jobs | Session-scoped queue depth limit (e.g., max 5 active jobs per session); configurable by operator |
| Shareable file URLs that expose internal paths | Directory traversal if filename is user-controlled | Serve files via a controlled endpoint (`/api/files/{job_id}/{filename}`) that resolves to an absolute path; never expose filesystem paths |
| Unsupported URL log with `report_full_url: true` default | Full URLs containing tokens/keys logged and downloadable | Default `report_full_url: false`; document clearly in config reference |
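
For the file-serving row above, a path guard might look like the following sketch. `resolve_download` is a hypothetical helper, and the `base_dir / job_id / filename` layout is an assumption about how downloads are stored.

```python
from pathlib import Path


def resolve_download(base_dir: Path, job_id: str, filename: str) -> Path:
    """Map (job_id, filename) to a path under base_dir, or raise ValueError."""
    base = base_dir.resolve()
    candidate = (base / job_id / filename).resolve()
    # resolve() collapses ".." segments and symlinks; the result must stay under base.
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes download directory")
    return candidate
```

The endpoint would turn the `ValueError` into a 404 rather than echoing the rejected path back to the client.
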
---
## UX Pitfalls
| Pitfall | User Impact | Better Approach |
|---------|-------------|-----------------|
| "Download failed" with raw yt-dlp error message | Non-technical users see Python tracebacks or opaque errors | Map common yt-dlp errors to human-readable messages: "This site requires login — upload a cookies.txt file" |
| Progress bar resets to 0% on SSE reconnect | User thinks download restarted; anxiety and confusion | Restore last known progress from DB on SSE reconnect; show "Reconnecting..." state briefly |
| Session expiry with no warning | User returns after 24h to find all downloads gone | Show session TTL countdown in UI; warn at 1h remaining; extend TTL on activity |
| Format picker with raw yt-dlp format strings | "bestvideo+bestaudio/best" meaningless to non-technical users | Translate to "Best quality (auto)", "1080p MP4", "Audio only (MP3)"; show file size estimate |
| Playlist shows all items but provides no bulk action | User has to click "start" 40 times for a 40-item playlist | Bulk start at playlist level is required, not optional; implement before any UX testing |
| No feedback when URL auto-detection starts | User pastes URL, nothing visible happens for 2-3 seconds | Show spinner/skeleton immediately on valid URL detection; don't wait for `extract_info` to complete |
| Theme picker that resets on page reload | Users re-select theme every visit | Persist to `localStorage` on selection; read on mount before first render to avoid flash |
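
The error-mapping row above could start as a simple substring lookup. The substrings below are illustrative assumptions, not a complete or guaranteed list of yt-dlp error texts. They should be checked against real failures before being relied on. `friendly_error` is a hypothetical name.

```python
# Assumption: example substrings only; verify against actual yt-dlp error output.
FRIENDLY_ERRORS = [
    ("unsupported url", "This site is not supported."),
    ("sign in", "This site requires login. Upload a cookies.txt file and try again."),
    ("private video", "This video is private. Upload a cookies.txt file from an account with access."),
    ("http error 429", "The site is rate limiting requests. Wait a while and retry."),
]
DEFAULT_MESSAGE = "Download failed. See the job log for details."


def friendly_error(raw: str) -> str:
    """Return a human-readable message for a raw downloader error string."""
    lowered = raw.lower()
    for needle, message in FRIENDLY_ERRORS:
        if needle in lowered:
            return message
    return DEFAULT_MESSAGE
```

Keeping the raw message in the job log while showing only the mapped text in the UI preserves the detail operators need for debugging.
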
---
## "Looks Done But Isn't" Checklist
- [ ] **Download engine:** Progress hook fires and updates DB — verify that it also correctly handles `total_bytes: None` (subtitle downloads, live streams) without `TypeError`
- [ ] **SSE streaming:** Events deliver in real time on initial connection — verify they also replay correctly after a client disconnect and reconnect using `Last-Event-ID`
- [ ] **Session cookie:** Cookie is set on first visit — verify it has `HttpOnly`, `SameSite=Lax`, and the correct domain/path; verify it is NOT `Secure` in local dev (blocks HTTP) but IS `Secure` in prod
- [ ] **Cookies.txt upload:** File is accepted and passed to yt-dlp — verify the file is deleted after the job completes and is not accessible via any API endpoint
- [ ] **Purge job:** Old jobs are deleted — verify the query explicitly filters `status IN ('completed', 'failed', 'cancelled')` and does not touch `status = 'downloading'`
- [ ] **Admin auth:** Login form accepts correct credentials — verify incorrect credentials return 401 with a constant-time comparison (no timing side channel); verify default credentials force a change prompt
- [ ] **Docker image:** Image builds and runs — verify multi-platform: `docker buildx build --platform linux/amd64,linux/arm64` succeeds before tagging v1.0
- [ ] **WAL mode:** SQLite is used — verify `PRAGMA journal_mode` returns `wal` at startup in health check or startup log
- [ ] **yt-dlp version:** Library is installed — verify `yt_dlp.version.__version__` is reported in the `/api/health` response and confirm it is >= 2023.07.06
- [ ] **SSE connection limit:** SSE works in one tab — verify in browser devtools that multiple tabs don't hit HTTP/1.1 6-connection limit (use HTTP/2 or test connection multiplexing)
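
The purge item in the checklist above is easy to verify with an in-memory database. This sketch assumes a hypothetical `jobs(id, status, completed_at)` table with ISO 8601 timestamps, because the real schema is not shown here.

```python
import sqlite3

TERMINAL_STATUSES = ("completed", "failed", "cancelled")


def purge_jobs(conn: sqlite3.Connection, cutoff: str) -> list:
    """Delete terminal jobs finished before `cutoff` and return their ids."""
    placeholders = ",".join("?" for _ in TERMINAL_STATUSES)
    # Active jobs never match: their status is excluded and completed_at is NULL.
    where = f"status IN ({placeholders}) AND completed_at < ?"
    params = (*TERMINAL_STATUSES, cutoff)
    rows = conn.execute(f"SELECT id FROM jobs WHERE {where} ORDER BY id", params)
    ids = [row[0] for row in rows]
    conn.execute(f"DELETE FROM jobs WHERE {where}", params)
    conn.commit()
    return ids
```

Returning the deleted ids lets the caller remove the matching files afterwards, so a job that is still `downloading` never has its file touched.
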
---
## Recovery Strategies
| Pitfall | Recovery Cost | Recovery Steps |
|---------|---------------|----------------|
| YoutubeDL instance sharing discovered late | MEDIUM | Audit all `YoutubeDL` instantiation sites; refactor to per-job pattern; existing jobs in-flight are safe (no state corruption once they complete) |
| CancelledError swallowing causing connection leak | LOW | Find `except Exception` blocks in SSE generators; add explicit `except asyncio.CancelledError: raise`; restart server to clear zombie connections |
| Purge bug deleted active download files | LOW | Restore file from backup if available; re-queue job; add status guard to purge query and write regression test |
| cookies.txt not being deleted (security incident) | HIGH | Audit `data/sessions/` directory for leftover cookie files; purge all; rotate any credentials whose cookies were uploaded; add deletion to job completion hook |
| SQLite locked under concurrent downloads | LOW | Enable WAL mode and `busy_timeout`; no data loss if writes are retried; restart not required |
| Docker image too large (>1GB) for arm64 users | MEDIUM | Add `.dockerignore` to exclude `node_modules`, `__pycache__`, `.git`; use multi-stage build with slim Python base; use `wader/static-ffmpeg` for static ffmpeg binary |
| yt-dlp extractor broken by upstream site change | LOW-MEDIUM | Update yt-dlp pin in `requirements.txt` and rebuild image; CI smoke test catches this before release; document manual update procedure in README |
---
## Pitfall-to-Phase Mapping
| Pitfall | Prevention Phase | Verification |
|---------|------------------|--------------|
| YoutubeDL instance not thread-safe | Phase 1: Core download engine | Test 3 simultaneous downloads; verify no cross-job progress corruption |
| Progress hook not asyncio-safe | Phase 1: Core download engine | Verify SSE receives progress while yt-dlp runs in executor thread |
| SQLite contention without WAL | Phase 1: Database setup | `PRAGMA journal_mode` returns `wal` in startup; no `SQLITE_BUSY` errors under 5 concurrent downloads |
| SSE CancelledError swallowing | Phase 2: SSE streaming | Kill a client mid-stream; verify server task count does not grow over 30 minutes |
| SSE reconnect storm / no replay | Phase 2: SSE streaming | Disconnect and reconnect; verify progress state is restored within 1 SSE cycle |
| cookies.txt leakage | Phase 2: Cookie auth feature | Verify per-session isolation paths; verify file is deleted on job completion |
| Purge deletes active downloads | Phase 3: Purge/session management | Unit test: start slow download, trigger purge, verify file untouched |
| Admin auth security gaps | Phase 3: Admin auth | Verify HttpOnly+SameSite; constant-time password comparison; default password forced change |
| Docker image bloat | Phase 4: Docker distribution | Measure image size post-build: target < 400MB compressed for amd64 |
| yt-dlp version pinning risk | Phase 1: setup + ongoing | `yt-dlp>=2023.07.06` in requirements; health endpoint reports version; CI smoke-test downloads from at least 2 sites |
---
## Sources
- [yt-dlp issue #9487: asyncio + multiprocessing / YoutubeDL not picklable](https://github.com/yt-dlp/yt-dlp/issues/9487)
- [yt-dlp issue #11022: Concurrent URL downloads not supported natively](https://github.com/yt-dlp/yt-dlp/issues/11022)
- [yt-dlp issue #5957: Progress hooks + writesubtitles / None type error + asyncio incompatibility](https://github.com/yt-dlp/yt-dlp/issues/5957)
- [yt-dlp Security Advisory GHSA-v8mc-9377-rwjj: Cookie leak via redirect (CVE-2023-35934)](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-v8mc-9377-rwjj)
- [sse-starlette: Client Disconnection Detection — CancelledError must be re-raised](https://deepwiki.com/sysid/sse-starlette/3.5-client-disconnection-detection)
- [MDN: Using server-sent events — reconnect and Last-Event-ID behavior](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events)
- [SSE production pitfalls: proxy buffering, reconnect, connection limits](https://dev.to/miketalbot/server-sent-events-are-still-not-production-ready-after-a-decade-a-lesson-for-me-a-warning-for-you-2gie)
- [Concurrency challenges in SQLite — write contention and WAL mode](https://www.slingacademy.com/article/concurrency-challenges-in-sqlite-and-how-to-overcome-them/)
- [aiosqlite 0.22.0 behavior change: connection is no longer a thread](https://github.com/sqlalchemy/sqlalchemy/issues/13039)
- [FastAPI SSE disconnect detection discussion](https://github.com/fastapi/fastapi/discussions/9398)
- [Browser connection limits for SSE: 6 per domain on HTTP/1.1](https://www.javascriptroom.com/blog/server-sent-events-and-browser-limits/)
- [wader/static-ffmpeg: multi-arch static ffmpeg binaries for Docker](https://github.com/wader/static-ffmpeg)
---
*Pitfalls research for: yt-dlp web frontend (media.rip v1.0)*
*Researched: 2026-03-17*

396
.planning/research/STACK.md Normal file
View file

@ -0,0 +1,396 @@
# Stack Research
**Domain:** Self-hosted yt-dlp web frontend (media downloader)
**Researched:** 2026-03-17
**Confidence:** HIGH — all versions verified against PyPI and npm as of research date
---
## Recommended Stack
### Core Technologies
| Technology | Version | Purpose | Why Recommended |
|------------|---------|---------|-----------------|
| Python | 3.12 | Backend runtime | Pinned in Dockerfile; `3.12-slim` is the smallest viable image. Avoids 3.13's passlib incompatibility. yt-dlp requires >=3.9. |
| FastAPI | 0.135.1 | HTTP API + SSE + middleware | Native SSE support added in 0.135.0 (EventSourceResponse). Async-first design matches the run_in_executor download pattern. HTTPBasic/HTTPBearer auth built in. |
| uvicorn | 0.42.0 | ASGI server | Standard FastAPI server. Use `uvicorn[standard]` for uvloop and httptools for production throughput. |
| yt-dlp | 2026.3.17 | Download engine | Used as a library (`import yt_dlp`), not subprocess. Gives synchronous progress hooks, structured error capture, and no shell-injection surface. |
| aiosqlite | 0.22.1 | Async SQLite | asyncio bridge over stdlib sqlite3. Single-file DB, zero external deps, sufficient for this concurrency model (small ThreadPoolExecutor). |
| APScheduler | 3.11.2 | Cron jobs (purge, session expiry) | 3.x is stable. 4.x is still alpha (4.0.0a6). Use `AsyncIOScheduler` from APScheduler 3.x — runs on FastAPI's event loop, started/stopped in the lifespan context manager. |
| pydantic | 2.12.5 | Data models and validation | FastAPI 0.135.x requires Pydantic v2. All request/response schemas and config validation. |
| pydantic-settings | 2.13.1 | Config loading from YAML + env | Install as `pydantic-settings[yaml]` for native YAML source support. Handles `MEDIARIP__SECTION__KEY` env var override pattern natively with `env_nested_delimiter='__'`. |
| sse-starlette | 3.3.3 | SSE EventSource response | Production-stable. Provides `EventSourceResponse`, handles client disconnect detection, cooperative shutdown, and multiple concurrent streams. Required even though FastAPI 0.135 has native SSE — sse-starlette's disconnect handling is more reliable for long-lived connections. |
### Supporting Libraries
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| python-multipart | 0.0.22 | Multipart form + file upload | Required for `UploadFile` (cookies.txt upload). FastAPI raises `RuntimeError` without it if any endpoint uses file/form data. |
| bcrypt | 5.0.0 | Password hashing for admin credentials | Direct bcrypt, no passlib wrapper. `bcrypt.hashpw()` / `bcrypt.checkpw()`. Avoids passlib's Python 3.12+ deprecation warnings and Python 3.13 breakage. |
| PyYAML | 6.0.x | YAML parsing for config.yaml | Used indirectly by `pydantic-settings[yaml]`. Pinning to 6.0.x avoids the arbitrary-code-execution issue in 5.x. |
| httpx | 0.28.1 | Async HTTP client for tests | Used with `ASGITransport` for FastAPI integration tests. Not needed at runtime. |
| pytest | 9.0.2 | Backend test runner | Requires Python >=3.10. Use with `anyio` marker for async tests. |
| anyio | bundled with FastAPI | Async test infrastructure | FastAPI uses anyio internally. `@pytest.mark.anyio` with `anyio_backend = "asyncio"` fixture is the correct pattern for async test functions. |
| vue | 3.5.30 | Frontend framework | Latest stable. 3.6.0 is in beta (Vapor mode) — avoid until stable. Composition API + `<script setup>` for all components. |
| vue-router | 5.0.3 | Frontend routing | Vue Router 5 is a non-breaking upgrade from 4 with file-based routing merged in. Use programmatic routing only — no file-based routing needed for this SPA. |
| pinia | 3.0.4 | Frontend state management | Pinia 3 drops Vue 2 support (irrelevant here). Better TypeScript inference than Vuex. Three stores: `downloads`, `config`, `ui`. |
| vite | 8.0.0 | Frontend build tool | Ships with Rolldown (Rust bundler), 10-30x faster builds. Node 22 required. |
| @vitejs/plugin-vue | 6.0.1 | Vue SFC support in Vite | Official Vite Vue plugin for `.vue` file compilation. |
| vue-tsc | latest | TypeScript type checking for .vue | Wraps `tsc` with Vue SFC awareness. Run as `vue-tsc --noEmit` in CI. |
| vitest | 4.1.0 | Frontend test runner | Requires Vite >=6. Native Vite integration, same config. Browser Mode now stable in v4. Use for component unit tests and store tests. |
| typescript | 5.x | TypeScript compiler | Pinia 3 requires >=4.5. Vue 3 + Vite works best with 5.x. |
### Development Tools
| Tool | Purpose | Notes |
|------|---------|-------|
| ruff | Python linting + formatting | v0.15.x. Replaces flake8, black, isort in one tool. `ruff check` + `ruff format`. Configure in `pyproject.toml`. |
| eslint | JavaScript/TypeScript linting | Use `@vue/eslint-config-typescript` preset for Vue 3 + TS. |
| vue-tsc | Vue SFC type checking | Run `vue-tsc --noEmit` in CI, not just `tsc`. Standard `tsc` does not understand `.vue` files. |
---
## Integration Architecture
### yt-dlp as Library: The Critical Pattern
yt-dlp's `YoutubeDL` is synchronous. FastAPI is async. Bridge with `loop.run_in_executor` using a `ThreadPoolExecutor`, NOT `ProcessPoolExecutor`: `YoutubeDL` objects contain file handles that cannot be pickled for process-based parallelism.
```python
# backend/app/core/downloader.py — canonical pattern
import asyncio
import logging
from concurrent.futures import ThreadPoolExecutor

import yt_dlp

# `config` is the loaded application settings object.
_executor = ThreadPoolExecutor(max_workers=config.downloads.max_concurrent)


class YDLLogger:
    """Suppress yt-dlp stdout; route to structured logging."""

    def debug(self, msg): pass  # suppress [debug] lines
    def info(self, msg): logging.info(msg)
    def warning(self, msg): logging.warning(msg)
    def error(self, msg): logging.error(msg)


def _make_progress_hook(job_id: str, sse_bus):
    def hook(d: dict):
        if d["status"] == "downloading":
            # Derive percent from byte counts; `_percent_str` is a display string.
            total = d.get("total_bytes") or d.get("total_bytes_estimate")
            done = d.get("downloaded_bytes") or 0
            # The first publish() argument is the bus routing key.
            sse_bus.publish(job_id, {
                "type": "job_update",
                "id": job_id,
                "percent": round(done / total * 100, 1) if total else 0.0,
                "speed": d.get("speed"),
                "eta": d.get("eta"),
                "downloaded_bytes": done,
            })
        elif d["status"] == "finished":
            sse_bus.publish(job_id, {
                "type": "job_update",
                "id": job_id,
                "status": "completed",
                "filename": d.get("filename"),
                "filesize": d.get("total_bytes") or d.get("total_bytes_estimate"),
            })
    return hook


def _run_download(url: str, ydl_opts: dict) -> dict:
    """Runs in thread pool. Returns info_dict on success."""
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        return ydl.extract_info(url, download=True)


async def download_async(url: str, ydl_opts: dict) -> dict:
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, _run_download, url, ydl_opts)
```
**Key yt-dlp options to set:**
```python
ydl_opts = {
    "quiet": True,                # suppress console output
    "noprogress": True,           # suppress progress bar (hooks handle this)
    "logger": YDLLogger(),
    "progress_hooks": [_make_progress_hook(job_id, sse_bus)],
    "outtmpl": output_template,   # resolved per source domain
    "format": format_id or "bestvideo+bestaudio/best",
    "cookiefile": cookie_path,    # None if no cookies.txt uploaded
    "noplaylist": not is_playlist_request,
    "extract_flat": False,        # False for actual download; True for format listing only
}
```
**Format extraction (no download):**
```python
ydl_opts = {"quiet": True, "extract_flat": True, "skip_download": True}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    info = ydl.extract_info(url, download=False)
    formats = info.get("formats", [])
```
**Progress hook dict keys available during `status == "downloading"`:**
- `_percent_str` — e.g. `" 45.2%"` (strip whitespace and `%`)
- `speed` — bytes/sec (float or None)
- `eta` — seconds remaining (int or None)
- `downloaded_bytes` — int
- `total_bytes` — int (may be None for live streams)
- `total_bytes_estimate` — int (fallback when total_bytes is None)
- `filename` — destination path
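
Because `total_bytes` can be missing, a percentage derived from the numeric keys is more robust than parsing `_percent_str`. The helper below is a minimal sketch. `progress_percent` is a hypothetical name, and returning `None` for an unknown total is a design assumption that lets the UI show an indeterminate bar.

```python
from typing import Optional


def progress_percent(d: dict) -> Optional[float]:
    """Return 0-100 from a progress-hook dict, or None when the total size is unknown."""
    total = d.get("total_bytes") or d.get("total_bytes_estimate")
    done = d.get("downloaded_bytes") or 0
    if not total:
        return None
    # Clamp at 100 because an estimated total can be smaller than the bytes received.
    return round(min(done / total, 1.0) * 100, 1)
```
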
### SSE Bus: asyncio.Queue per Session
```python
# backend/app/core/sse_bus.py — canonical pattern
import asyncio
from collections import defaultdict


class SSEBus:
    def __init__(self):
        self._queues: dict[str, list[asyncio.Queue]] = defaultdict(list)
        self._loop: asyncio.AbstractEventLoop | None = None

    def subscribe(self, session_id: str) -> asyncio.Queue:
        # Called on the event loop (SSE endpoint); remember the loop for publish().
        self._loop = asyncio.get_running_loop()
        q: asyncio.Queue = asyncio.Queue()
        self._queues[session_id].append(q)
        return q

    def unsubscribe(self, session_id: str, q: asyncio.Queue):
        queues = self._queues.get(session_id, [])
        if q in queues:
            queues.remove(q)
        if not queues:
            self._queues.pop(session_id, None)

    def publish(self, session_id: str, event: dict):
        """Called from thread pool workers, so it must be thread-safe."""
        # asyncio.Queue is NOT thread-safe, and a worker thread has no running loop:
        # schedule the put on the loop captured in subscribe().
        if self._loop is None:
            return  # no subscriber has connected yet
        for q in list(self._queues.get(session_id, [])):
            self._loop.call_soon_threadsafe(q.put_nowait, event)
```
**SSE endpoint using sse-starlette:**
```python
import asyncio
import json

from fastapi import Depends, Request
from sse_starlette.sse import EventSourceResponse


@router.get("/api/events")
async def events(request: Request, session_id: str = Depends(get_session)):
    async def generator():
        q = sse_bus.subscribe(session_id)
        try:
            # Replay current state on connect (page-refresh safe)
            jobs = await job_manager.get_jobs_for_session(session_id)
            yield {"event": "init", "data": json.dumps({"jobs": [j.to_dict() for j in jobs]})}
            while True:
                if await request.is_disconnected():
                    break
                try:
                    event = await asyncio.wait_for(q.get(), timeout=15.0)
                    yield {"event": event["type"], "data": json.dumps(event)}
                except asyncio.TimeoutError:
                    yield {"event": "ping", "data": ""}  # keepalive
        finally:
            sse_bus.unsubscribe(session_id, q)

    return EventSourceResponse(generator())
```
### APScheduler 3.x Lifespan Integration
```python
# backend/app/main.py
from contextlib import asynccontextmanager

from apscheduler.schedulers.asyncio import AsyncIOScheduler
from fastapi import FastAPI

scheduler = AsyncIOScheduler()


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup
    await db.init()
    if config.purge.mode == "scheduled":
        scheduler.add_job(
            run_purge,
            "cron",
            id="purge_job",
            **parse_cron(config.purge.schedule),  # parse "0 3 * * *" → hour=3, minute=0
        )
    # Start unconditionally so the shutdown() call below always has a running scheduler.
    scheduler.start()
    yield
    # Shutdown
    scheduler.shutdown(wait=False)
    await db.close()


app = FastAPI(lifespan=lifespan)
```
**Cron string parsing:** APScheduler 3.x does NOT accept raw cron strings. Parse `"0 3 * * *"` into kwargs manually or use `CronTrigger.from_crontab("0 3 * * *")`:
```python
from apscheduler.triggers.cron import CronTrigger
scheduler.add_job(run_purge, CronTrigger.from_crontab(config.purge.schedule))
```
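
If the kwargs route is preferred over `CronTrigger.from_crontab`, the manual parse is a short split. This is a sketch of a hypothetical `parse_cron` helper. It assumes a plain five-field expression and performs no validation of the field values themselves.

```python
CRON_FIELDS = ("minute", "hour", "day", "month", "day_of_week")


def parse_cron(expr: str) -> dict:
    """Split a five-field crontab expression into cron-trigger keyword arguments."""
    parts = expr.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 cron fields, got {len(parts)}: {expr!r}")
    return dict(zip(CRON_FIELDS, parts))
```

Numeric weekday values may not map to the same day in crontab and in APScheduler, so named days such as `mon` are the safer choice in `config.yaml`. Treat that as something to confirm against the APScheduler documentation.
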
### pydantic-settings Config Pattern
```python
# backend/app/config.py
from pydantic import BaseModel
from pydantic_settings import BaseSettings, SettingsConfigDict, YamlConfigSettingsSource


class DownloadsConfig(BaseModel):
    output_dir: str = "/downloads"
    max_concurrent: int = 3
    default_quality: str = "bestvideo+bestaudio/best"


class AppConfig(BaseSettings):
    downloads: DownloadsConfig = DownloadsConfig()
    # ... other sections

    model_config = SettingsConfigDict(
        env_prefix="MEDIARIP__",  # yields MEDIARIP__DOWNLOADS__MAX_CONCURRENT
        env_nested_delimiter="__",
        yaml_file="/config/config.yaml",
        yaml_file_encoding="utf-8",
    )

    @classmethod
    def settings_customise_sources(
        cls, settings_cls, init_settings, env_settings, dotenv_settings, file_secret_settings
    ):
        # Earlier sources win; field defaults apply when no source sets a value.
        return (
            env_settings,                            # MEDIARIP__SECTION__KEY highest priority
            YamlConfigSettingsSource(settings_cls),  # config.yaml
            init_settings,
        )
```
### Admin Auth: HTTPBasic + bcrypt
No JWT. No OAuth. Username/password stored (hashed) in SQLite `settings` table. Pattern mirrors qBittorrent/Sonarr.
```python
# backend/app/dependencies.py
import secrets

import bcrypt
from fastapi import Depends, HTTPException, status
from fastapi.security import HTTPBasic, HTTPBasicCredentials

security = HTTPBasic()


async def require_admin(credentials: HTTPBasicCredentials = Depends(security)):
    stored_username = await settings_store.get("admin_username")
    stored_hash = await settings_store.get("admin_password_hash")
    # Evaluate both checks before branching so a wrong username does not skip the hash check.
    username_ok = secrets.compare_digest(
        credentials.username.encode(), stored_username.encode()
    )
    password_ok = bcrypt.checkpw(credentials.password.encode(), stored_hash.encode())
    if not (username_ok and password_ok):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            headers={"WWW-Authenticate": "Basic"},
        )
```
**First-boot flow:** If no admin credentials in DB, generate random password, log it to stdout once, store hash. UI prompts forced change.
---
## Installation
```bash
# backend/requirements.txt — pinned versions
fastapi==0.135.1
uvicorn[standard]==0.42.0
yt-dlp==2026.3.17
aiosqlite==0.22.1
apscheduler==3.11.2
pydantic==2.12.5
pydantic-settings[yaml]==2.13.1
sse-starlette==3.3.3
python-multipart==0.0.22
bcrypt==5.0.0
PyYAML==6.0.2
# Dev/test only
httpx==0.28.1
pytest==9.0.2
anyio[trio]==4.* # anyio bundled with fastapi; install for pytest marker
ruff==0.15.*
```
```bash
# frontend/package.json (key deps)
npm install vue@3.5.30 vue-router@5.0.3 pinia@3.0.4
npm install -D vite@8.0.0 @vitejs/plugin-vue@6.0.1 vue-tsc typescript vitest@4.1.0
```
---
## Alternatives Considered
| Recommended | Alternative | When to Use Alternative |
|-------------|-------------|-------------------------|
| sse-starlette | FastAPI native SSE (0.135+) | Use native only for simple fire-and-forget streams. sse-starlette wins for long-lived connections needing disconnect detection and keepalive. |
| APScheduler 3.x | APScheduler 4.x | Revisit when 4.x exits alpha. 4.x has cleaner asyncio API but is not production-stable as of March 2026. |
| APScheduler 3.x | Celery + Redis | Only if distributed workers needed. Adds Redis dependency — unacceptable for single-container distribution goal. |
| aiosqlite (raw) | SQLAlchemy async + aiosqlite | SQLAlchemy adds overhead and ORM complexity. Raw aiosqlite with parameterized queries is sufficient for this schema. |
| bcrypt (direct) | passlib | passlib is unmaintained and throws deprecation warnings on Python 3.12. Will break on Python 3.13 (crypt module removed). |
| bcrypt (direct) | pwdlib | pwdlib 0.3.0 is Beta status. Fine for new projects, but bcrypt direct is simpler for a single-algorithm case. |
| pydantic-settings[yaml] | python-dotenv + manual YAML | pydantic-settings handles env var layering, type coercion, and nested delimiter out of the box. |
| ThreadPoolExecutor | ProcessPoolExecutor | YoutubeDL objects are not picklable, so handing one to a process pool fails with a pickling error. |
| Vue 3.5.x | Vue 3.6.x beta | 3.6 beta introduces Vapor mode (breaking internal changes). Wait for stable. |
| Vite 8 | Vite 6/7 | Vite 8 is current stable with Rolldown. Vitest 4.x requires Vite >=6, compatible with 8. |
---
## What NOT to Use
| Avoid | Why | Use Instead |
|-------|-----|-------------|
| WebSockets | Bidirectional protocol overhead; `EventSource` auto-reconnects natively; HTTP POST is sufficient for submitting downloads | SSE via sse-starlette |
| passlib | Last release years ago; `crypt` module deprecated since Python 3.11 and removed in Python 3.13; throws DeprecationWarning in prod | bcrypt directly |
| APScheduler 4.x | Still alpha (4.0.0a6) as of March 2026 | APScheduler 3.11.2 |
| ProcessPoolExecutor | YoutubeDL cannot be pickled — crashes immediately | ThreadPoolExecutor |
| SQLAlchemy ORM | Adds 3 abstraction layers for a schema that has 2 tables. Raw aiosqlite is ~50 lines | Raw aiosqlite |
| JWT / OAuth | Unnecessary complexity for an admin panel on a self-hosted tool. No multi-user auth needed. | HTTPBasic with bcrypt-hashed credentials |
| Vuex | Superseded by Pinia; Vuex has no active development for Vue 3 | Pinia 3 |
| Vue 3.6.x beta | Vapor mode is in flux; internal API changes can break component libraries | Vue 3.5.30 stable |
| axios | No advantage over browser `fetch` + `EventSource` for this app's API surface | Native `fetch` for REST, `EventSource` for SSE |
---
## Version Compatibility
| Package | Compatible With | Notes |
|---------|-----------------|-------|
| FastAPI 0.135.1 | Pydantic v2 only | Pydantic v1 not supported. |
| FastAPI 0.135.1 | Starlette 0.46.x | Pinned transitively; don't install Starlette separately unless matching. |
| sse-starlette 3.3.3 | Python >=3.10 | Will fail on Python 3.9. Project uses 3.12 — fine. |
| vitest 4.1.0 | Vite >=6.0.0 | Compatible with Vite 8. |
| APScheduler 3.11.2 | Python >=3.6 | `AsyncIOScheduler` requires asyncio event loop to already be running when `.start()` is called — hence lifespan pattern. |
| bcrypt 5.0.0 | Breaking: passwords >72 bytes raise ValueError | Not a concern for admin passwords. |
| pydantic-settings 2.13.1 | pydantic >=2.7.0 | Installed alongside FastAPI — transitive version is fine. |
| yt-dlp 2026.3.17 | ffmpeg (system package) | ffmpeg must be installed at the OS level (`apt-get install ffmpeg`). yt-dlp does not bundle it. The Dockerfile already handles this. |
---
## Sources
- [PyPI: yt-dlp](https://pypi.org/project/yt-dlp/) — version 2026.3.17 confirmed
- [PyPI: FastAPI](https://pypi.org/project/fastapi/) — version 0.135.1 confirmed
- [PyPI: uvicorn](https://pypi.org/project/uvicorn/) — version 0.42.0 confirmed
- [PyPI: aiosqlite](https://pypi.org/project/aiosqlite/) — version 0.22.1 confirmed
- [PyPI: APScheduler](https://pypi.org/project/apscheduler/) — 3.11.2 stable, 4.0.0a6 alpha
- [PyPI: pydantic-settings](https://pypi.org/project/pydantic-settings/) — version 2.13.1 confirmed
- [PyPI: sse-starlette](https://pypi.org/project/sse-starlette/) — version 3.3.3 confirmed
- [PyPI: bcrypt](https://pypi.org/project/bcrypt/) — version 5.0.0 confirmed
- [PyPI: httpx](https://pypi.org/project/httpx/) — version 0.28.1 confirmed
- [PyPI: pytest](https://pypi.org/project/pytest/) — version 9.0.2 confirmed
- [npm: vue](https://www.npmjs.com/package/vue) — 3.5.30 stable, 3.6.0-beta.6 available
- [npm: vue-router](https://www.npmjs.com/package/vue-router) — 5.0.3 confirmed (non-breaking from 4.x)
- [npm: pinia](https://www.npmjs.com/package/pinia) — 3.0.4 confirmed
- [npm: vite](https://vite.dev/releases) — 8.0.0 with Rolldown stable
- [Vitest 4.0 announcement](https://vitest.dev/blog/vitest-4) — version 4.1.0 confirmed
- [FastAPI HTTP Basic Auth docs](https://fastapi.tiangolo.com/advanced/security/http-basic-auth/) — HTTPBasic pattern
- [FastAPI SSE docs](https://fastapi.tiangolo.com/tutorial/server-sent-events/) — EventSourceResponse
- [sse-starlette GitHub](https://github.com/sysid/sse-starlette) — disconnect handling pattern
- [APScheduler 3.x docs](https://apscheduler.readthedocs.io/en/3.x/userguide.html) — CronTrigger.from_crontab
- [passlib deprecation discussion](https://github.com/fastapi/fastapi/discussions/11773) — confirmed broken on Python 3.13
- [yt-dlp asyncio issue #9487](https://github.com/yt-dlp/yt-dlp/issues/9487) — ThreadPoolExecutor vs ProcessPoolExecutor constraint
---
*Stack research for: media.rip() — self-hosted yt-dlp web frontend*
*Researched: 2026-03-17*

@ -0,0 +1,204 @@
# Project Research Summary
**Project:** media.rip() — self-hosted yt-dlp web frontend
**Domain:** Self-hosted media downloader / yt-dlp web UI
**Researched:** 2026-03-17
**Confidence:** HIGH
## Executive Summary
media.rip() is a self-hosted web UI for yt-dlp: users paste URLs, select quality, and the tool downloads media to a local volume. The competitive landscape (MeTube, yt-dlp-web-ui, ytptube) reveals a consistent set of gaps — no competitor does mobile well, none offer per-session isolation, and theming is either absent or env-var-only. The recommended approach is a Python 3.12 / FastAPI backend serving a Vue 3 SPA, with yt-dlp used as a library (not subprocess) inside a `ThreadPoolExecutor`, and real-time progress delivered over SSE rather than WebSockets. All versions are verified stable as of March 2026. The stack is well-documented with established integration patterns.
The primary architectural challenge is the sync-to-async bridge: yt-dlp is synchronous and blocking, FastAPI is async. The correct pattern — `ThreadPoolExecutor` + `loop.call_soon_threadsafe` to route progress hook events into per-session `asyncio.Queue`s — is well-understood and must be built correctly in Phase 1. Getting this wrong produces either a blocked event loop or silent event loss, and retrofitting it later is expensive. Every subsequent feature (SSE progress, session isolation, cookies.txt auth) depends on this bridge being correct.
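The bridge described above can be sketched without yt-dlp at all. The following is a minimal illustration, assuming a hypothetical `blocking_download` function that stands in for a real download; in yt-dlp the hook would be registered through the `progress_hooks` option instead. The names `run_job` and `blocking_download` are illustrative, not part of the project.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_download(job_id, loop, queue):
    """Hypothetical stand-in for a yt-dlp download; runs in a worker thread."""

    def progress_hook(status):
        # Called from the worker thread. The asyncio.Queue is not thread-safe,
        # so the put is scheduled on the event loop thread instead.
        event = {"job_id": job_id, **status}
        loop.call_soon_threadsafe(queue.put_nowait, event)

    for percent in (25, 50, 75, 100):
        time.sleep(0.01)  # simulate blocking network I/O
        progress_hook({"status": "downloading", "percent": percent})
    progress_hook({"status": "finished", "percent": 100})


async def run_job(job_id):
    # Capture the loop on the async side, before handing work to the thread.
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    events = []
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = loop.run_in_executor(
            executor, blocking_download, job_id, loop, queue
        )
        # The event loop stays free while the thread blocks.
        while True:
            event = await queue.get()
            events.append(event)
            if event["status"] == "finished":
                break
        await future
    return events


events = asyncio.run(run_job("job-1"))
```

In the real service the queue would belong to the `SSEBroker` rather than being created per job; the sketch only shows the thread-to-loop hand-off.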
The top risks are (1) shared `YoutubeDL` instances corrupting concurrent downloads, (2) SSE `CancelledError` swallowing creating zombie connections, (3) cookies.txt leakage via CVE-2023-35934 if cookie files are not per-session and purge-scoped, and (4) SQLite write contention without WAL mode. All four are preventable at setup time with known mitigations. The session isolation differentiator (the feature MeTube explicitly closed as "won't fix") is also the feature with the most architectural surface area — it must be designed in from Phase 1, not bolted on.
## Key Findings
### Recommended Stack
The backend is Python 3.12 (avoiding 3.13's passlib breakage), FastAPI 0.135.1 (Pydantic v2, native SSE support), yt-dlp 2026.3.17 as a library, aiosqlite 0.22.1 for async SQLite, APScheduler 3.x (not 4.x alpha) for cron jobs, and sse-starlette 3.3.3 for production-reliable SSE disconnect handling. Password hashing uses bcrypt 5.0.0 directly — passlib is unmaintained and breaks on Python 3.13. Config is loaded from `config.yaml` and env vars via `pydantic-settings[yaml]` with `MEDIARIP__SECTION__KEY` override pattern. The frontend is Vue 3.5.30 (avoiding 3.6 beta's Vapor mode churn), Pinia 3 (Vuex is dead for Vue 3), Vite 8 with Rolldown, and Vitest 4. See STACK.md for pinned versions and integration patterns.
**Core technologies:**
- Python 3.12 + FastAPI 0.135.1: async HTTP API, SSE, HTTPBasic auth — Pydantic v2 required, async-first design matches download model
- yt-dlp 2026.3.17 (library mode): download engine — used as `import yt_dlp`, not subprocess; gives structured progress hooks and no shell-injection surface
- aiosqlite 0.22.1: job/session/config persistence — single-file DB, zero external deps, WAL mode required for concurrent downloads
- sse-starlette 3.3.3: SSE transport — more reliable disconnect handling than FastAPI's native SSE for long-lived connections
- Vue 3.5.30 + Pinia 3 + Vite 8: frontend SPA — Composition API, `<script setup>`, Rolldown builds
- ThreadPoolExecutor (not ProcessPoolExecutor): runs yt-dlp sync code — `YoutubeDL` is not picklable; threads only
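To make the `MEDIARIP__SECTION__KEY` override convention concrete, here is a stdlib-only sketch of how such variable names map onto nested config keys. It is not how `pydantic-settings` is implemented, and the config keys (`purge`, `file_ttl_hours`) and helper names are hypothetical.

```python
def env_overrides(environ, prefix="MEDIARIP", delimiter="__"):
    """Turn MEDIARIP__SECTION__KEY=value pairs into a nested dict."""
    overrides = {}
    marker = prefix + delimiter
    for name, value in environ.items():
        if not name.startswith(marker):
            continue  # unrelated environment variable
        path = name[len(marker):].lower().split(delimiter)
        node = overrides
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return overrides


def deep_merge(base, override):
    """Return a copy of base with override applied recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical values loaded from config.yaml.
yaml_config = {"purge": {"mode": "scheduled", "file_ttl_hours": 24}}
fake_env = {"MEDIARIP__PURGE__FILE_TTL_HOURS": "72", "PATH": "/usr/bin"}
config = deep_merge(yaml_config, env_overrides(fake_env))
```

Note that the overridden value stays a string here; coercing it to the declared field type is one of the things `pydantic-settings` adds on top of this kind of layering.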
### Expected Features
The full v1.0 feature set is ambitious but well-scoped. All features are mapped to dependencies in FEATURES.md. Session isolation is the primary differentiator and the feature that drives architectural decisions for the entire product.
**Must have (table stakes):**
- URL submission + format/quality selector (live extraction via yt-dlp, not presets)
- Real-time SSE progress with SSE init replay on reconnect
- Download queue: filter, sort, cancel, playlist parent/child collapsible
- Session isolation: isolated (default) / shared / open modes via cookie-based UUID
- cookies.txt upload per-session (Netscape format, purge-scoped)
- Mobile-responsive layout (bottom tabs, 44px touch targets, card list at <768px)
- Admin panel: username/password login, session list, storage, manual purge, config editor
- Purge system: scheduled/manual/never, independent file and log TTLs
- Three built-in themes: cyberpunk (default), dark, light
- Docker: single image, GHCR + Docker Hub, amd64 + arm64
- Health endpoint, session export/import, link sharing, unsupported URL reporting
**Should have (competitive):**
- Drop-in custom theme system via volume mount — the feature MeTube refuses to build
- Source-aware output templates (per-site defaults)
- Heavily commented built-in themes as drop-in documentation
- Zero automatic outbound telemetry (explicit design constraint, not an afterthought)
**Defer (v2+):**
- Subscription/channel monitoring — fundamentally different product scope (TubeArchivist territory)
- External arr-stack API integration — architecture does not block this; clean service layer is ready
- Telegram/Discord bot — documented as extension point; clean REST API makes it straightforward later
**Anti-features (do not build):**
- OAuth/SSO, WebSockets, user accounts/registration, embedded video player, automatic yt-dlp updates at runtime, FlareSolverr integration
### Architecture Approach
The system is a single Docker container: Vue 3 SPA (built to `/app/static/` at image build time, served by FastAPI `StaticFiles`) communicating with a FastAPI backend over REST + SSE. The backend has a clear layered structure — `core/` (long-lived singletons: SSEBroker, ConfigManager, DB pool), `middleware/` (session cookie), `routers/` (thin, delegate to services), `services/` (business logic: DownloadService, PurgeService, SessionExporter). The critical architectural decision is the async bridge: `DownloadService` holds a dedicated `ThreadPoolExecutor`; progress hooks use `loop.call_soon_threadsafe` to route events into per-session `asyncio.Queue`s in the `SSEBroker` singleton. See ARCHITECTURE.md for the full system diagram, data flow paths, and anti-patterns.
**Major components:**
1. `SSEBroker` (`app/core/sse_broker.py`) — per-session `asyncio.Queue` fan-out; singleton; bridges thread-pool workers to SSE clients
2. `DownloadService` (`app/services/download.py`) — long-lived, owns `ThreadPoolExecutor`, job registry, and yt-dlp invocation per job
3. `SessionMiddleware` (`app/middleware/session.py`) — auto-creates `mrip_session` UUID cookie; stores opaque ID only (not content)
4. `ConfigManager` (`app/core/config.py`) — three-layer config: hardcoded defaults → `config.yaml` → SQLite admin writes
5. `PurgeService` (`app/services/purge.py`) — file TTL, session TTL, log trim; called by APScheduler and admin trigger
6. Vue Pinia `sse` store (`frontend/src/stores/sse.ts`) — isolated SSE lifecycle; downloads store subscribes to it
**Key patterns:**
- Sync-to-async bridge: `loop.call_soon_threadsafe(queue.put_nowait, event)` — never call asyncio primitives directly from progress hook
- Per-session SSE queue fan-out: `SSEBroker` maps `session_id → List[Queue]`; one queue per tab, not per session
- SSE replay on reconnect: endpoint replays current DB state as synthetic events before entering live queue
- Config hierarchy: defaults → YAML (seeds DB on first boot) → SQLite (live admin writes win)
- Opaque session cookie: only UUID stored in cookie; all state lives in SQLite
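The per-session fan-out pattern listed above can be sketched as follows. This is a minimal illustration, not the project's `SSEBroker`: the method names `subscribe`, `unsubscribe`, and `publish` are assumptions chosen for the example.

```python
import asyncio
from collections import defaultdict


class SSEBroker:
    """Maps session_id to a list of queues, one queue per open tab."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, session_id):
        queue = asyncio.Queue()
        self._subscribers[session_id].append(queue)
        return queue

    def unsubscribe(self, session_id, queue):
        queues = self._subscribers.get(session_id, [])
        if queue in queues:
            queues.remove(queue)
        if not queues:
            # Drop empty sessions so the mapping does not grow forever.
            self._subscribers.pop(session_id, None)

    def publish(self, session_id, event):
        # Every tab of the session gets its own copy of the event.
        for queue in list(self._subscribers.get(session_id, [])):
            queue.put_nowait(event)


async def demo():
    broker = SSEBroker()
    tab_a = broker.subscribe("session-1")
    tab_b = broker.subscribe("session-1")
    other = broker.subscribe("session-2")
    broker.publish("session-1", {"type": "progress", "percent": 40})
    got_a = await tab_a.get()
    got_b = await tab_b.get()
    broker.unsubscribe("session-1", tab_b)
    return got_a, got_b, other.empty()


got_a, got_b, other_empty = asyncio.run(demo())
```

Keeping one queue per tab means a slow or closed tab cannot consume events intended for another tab of the same session.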
### Critical Pitfalls
1. **Shared `YoutubeDL` instance across concurrent downloads** — create a fresh `YoutubeDL` per job inside the worker function; never share across threads. Warning signs: progress percentages swap between unrelated jobs; `TypeError` in progress hook. Address in Phase 1.
2. **Calling asyncio primitives directly from progress hook** — use `loop.call_soon_threadsafe(queue.put_nowait, event)` only; capture the event loop at FastAPI startup before executor threads start. Warning signs: SSE never receives progress; `RuntimeError: no running event loop`. Address in Phase 1.
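3. **SSE `CancelledError` swallowing creating zombie connections** — never let a bare `except:` or `except BaseException` swallow cancellation in SSE generators (since Python 3.8, `asyncio.CancelledError` derives from `BaseException`, so `except Exception` alone no longer catches it); always put cleanup in `try/finally` and explicitly `raise` in any `except asyncio.CancelledError`. Warning signs: server memory grows slowly; zombie tasks visible in `asyncio.all_tasks()`. Address in Phase 2.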
4. **SQLite write contention without WAL mode** — enable `PRAGMA journal_mode=WAL`, `PRAGMA synchronous=NORMAL`, `PRAGMA busy_timeout=5000` at DB init before any other schema work. Warning signs: `SQLITE_BUSY` errors under 3+ concurrent downloads. Address in Phase 1.
5. **cookies.txt leakage (CVE-2023-35934)** — pin `yt-dlp>=2023.07.06` (the release that patched the advisory); store cookies.txt per-session at `data/sessions/{session_id}/cookies.txt`; delete on job completion and session purge. Address in Phase 2 when cookie auth is implemented; pin version constraint in Phase 1.
6. **Purge deleting files for active downloads** — purge queries must filter `status IN ('completed', 'failed', 'cancelled')`; never rely on timestamp alone. Write a regression test as part of purge implementation. Address in Phase 5, when `PurgeService` is implemented.
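The pragma sequence from pitfall 4 can be checked with a short script. This sketch uses the standard-library `sqlite3` module so it runs without dependencies; the same `PRAGMA` statements can be issued through an aiosqlite connection. The `open_db` helper and the file name are illustrative.

```python
import os
import sqlite3
import tempfile


def open_db(path):
    """Open a connection and apply the pragmas before any schema work."""
    conn = sqlite3.connect(path)
    # PRAGMA journal_mode returns the mode that is actually in effect,
    # so the result is worth checking rather than assuming.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute("PRAGMA busy_timeout=5000")
    return conn, mode


# WAL needs a real file; an in-memory database cannot use it.
with tempfile.TemporaryDirectory() as tmp:
    conn, mode = open_db(os.path.join(tmp, "mediarip.db"))
    busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
    synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
    conn.close()
```

`journal_mode=WAL` persists in the database file, while `synchronous` and `busy_timeout` are per-connection settings and need to be applied on every new connection.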
## Implications for Roadmap
The build order from ARCHITECTURE.md is the correct dependency-respecting sequence. The SSE transport is on the critical path — all meaningful frontend progress validation requires it. Session isolation must be designed in from Phase 1 (the middleware and DB schema), not added in Phase 3.
### Phase 1: Foundation
**Rationale:** Everything else depends on this layer. DB schema, WAL mode, session cookie middleware, SSEBroker, and ConfigManager have no inter-dependencies and must be correct before any business logic is added. The yt-dlp integration pattern (ThreadPoolExecutor + `call_soon_threadsafe`) must also be established here — it is the load-bearing architectural decision.
**Delivers:** Working yt-dlp download engine, DB schema with WAL mode, session cookie middleware, SSEBroker, ConfigManager, URL submission + format probe API
**Addresses:** URL submission, format/quality selector, real-time SSE progress (the core loop)
**Avoids:** Shared `YoutubeDL` instance pitfall, asyncio bridge pitfall, SQLite WAL pitfall — all three must be implemented correctly in this phase, not retrofitted
### Phase 2: SSE Transport + Session System
**Rationale:** SSE replay-on-reconnect and per-session isolation are the features that differentiate this product from MeTube. Both require the DB and SSEBroker from Phase 1. SSE `Last-Event-ID` replay and session cookie handling must be designed together — they share state assumptions. cookies.txt upload is also here because it depends on the session system.
**Delivers:** Full SSE streaming with disconnect handling, reconnect replay, and per-session queue isolation; session isolation modes (isolated/shared/open); cookies.txt upload per-session
**Uses:** sse-starlette 3.3.3, `asyncio.Queue` per-session fan-out, aiosqlite session table
**Implements:** SSEBroker fan-out pattern, SSE reconnect replay, SessionMiddleware, `SessionService`
**Avoids:** `CancelledError` swallowing, SSE reconnect storm, cookies.txt CVE-2023-35934
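A minimal sketch of the disconnect-safe generator shape this phase relies on, using plain asyncio rather than sse-starlette. Cancelling the consuming task stands in for a client disconnect, and `cleanup_log` is a hypothetical substitute for unsubscribing the queue from the broker.

```python
import asyncio

cleanup_log = []  # stand-in for real cleanup such as broker.unsubscribe()


async def event_stream(queue, session_id):
    try:
        while True:
            event = await queue.get()
            yield event
    except asyncio.CancelledError:
        # Cancellation may be observed, but it must be re-raised
        # so the task actually terminates.
        cleanup_log.append(f"cancelled:{session_id}")
        raise
    finally:
        # Runs on normal exit, error, and cancellation alike.
        cleanup_log.append(f"unsubscribed:{session_id}")


async def consume(queue, session_id, received):
    async for event in event_stream(queue, session_id):
        received.append(event)


async def demo():
    queue = asyncio.Queue()
    received = []
    task = asyncio.create_task(consume(queue, "session-1", received))
    queue.put_nowait({"type": "progress", "percent": 10})
    await asyncio.sleep(0.01)  # let the consumer pick up the event
    task.cancel()  # simulates the client going away
    try:
        await task
    except asyncio.CancelledError:
        pass
    return received, task.cancelled()


received, was_cancelled = asyncio.run(demo())
```

If the `raise` were removed, the generator would finish silently and the task would report normal completion instead of cancellation, which is the failure mode this phase is meant to avoid.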
### Phase 3: Frontend Core
**Rationale:** Once the Phase 2 API shape is stable (SSE events typed, endpoints defined), the frontend can be built against it. Pinia SSE store and downloads store must be built together — their event contract is the interface. The download queue component drives the primary UX validation.
**Delivers:** Vue 3 SPA with download queue, format picker, progress bars, playlist parent/child rows, mobile-responsive layout (bottom tabs, 44px targets)
**Uses:** Vue 3.5.30, Pinia 3, Vite 8, `EventSource` API, `fetch` for REST
**Implements:** Pinia `sse` store (isolated lifecycle), `downloads` store (SSE-driven mutations), `DownloadQueue`, `FormatPicker`, `ProgressBar`, `PlaylistRow` components
### Phase 4: Admin + Auth
**Rationale:** Admin routes must be protected before the panel is built — shipping an unprotected admin panel even briefly is not acceptable. HTTPBasic + bcrypt is simple and sufficient; no JWT needed. Admin panel enables operator self-service for config, session management, and purge.
**Delivers:** Admin authentication (HTTPBasic + bcrypt, first-boot credential setup with forced change prompt), Admin panel UI (session list, storage view, manual purge trigger, live config editor, unsupported URL log download)
**Uses:** bcrypt 5.0.0 (direct, not passlib), `secrets.compare_digest` for constant-time comparison, `pydantic-settings[yaml]` config hierarchy
**Avoids:** Plaintext admin credentials, timing side channels in auth comparison
### Phase 5: Supporting Features
**Rationale:** These features enhance the product but do not block the primary user journey. Theme system requires a stable CSS variable contract (establish early in this phase before any components reference token names — changing token names later breaks all custom themes). Purge requires Admin auth from Phase 4. Session export depends on the session system from Phase 2.
**Delivers:** Three built-in themes (cyberpunk default, dark, light) + drop-in custom theme system via volume mount + theme picker UI; PurgeService with APScheduler cron (file TTL, session TTL, log rotation); session export/import; health endpoint; link sharing; unsupported URL reporting; source-aware output templates
**Avoids:** Purge-deletes-active-downloads pitfall (status guard required); theme token naming lock-in (establish CSS variable contract before component work)
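The status guard can be expressed as a single query plus a regression test. The sketch below assumes a hypothetical `jobs` table with `status` and `updated_at` columns and a hypothetical `downloading` status value; the real schema is defined in ARCHITECTURE.md and may differ.

```python
import sqlite3

TERMINAL_STATUSES = ("completed", "failed", "cancelled")


def purge_candidates(conn, cutoff_ts):
    """Return IDs of jobs that are both expired and in a terminal state."""
    placeholders = ",".join("?" for _ in TERMINAL_STATUSES)
    rows = conn.execute(
        f"SELECT id FROM jobs "
        f"WHERE status IN ({placeholders}) AND updated_at < ? "
        f"ORDER BY id",
        (*TERMINAL_STATUSES, cutoff_ts),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id TEXT PRIMARY KEY, status TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        ("old-done", "completed", 100),
        ("old-active", "downloading", 100),  # old timestamp, still running
        ("new-done", "completed", 900),
    ],
)
expired = purge_candidates(conn, cutoff_ts=500)
```

The `old-active` row is the regression case: it is older than the cutoff but must not be returned, because its download has not reached a terminal state.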
### Phase 6: Distribution
**Rationale:** Docker packaging is a feature for this audience. Multi-stage build keeps image size under 400MB compressed. amd64 + arm64 is required — arm64 users (Raspberry Pi, Apple Silicon NAS devices) are a significant self-hosted audience. CI/CD ensures the image stays functional as yt-dlp extractors evolve.
**Delivers:** Multi-stage Dockerfile (Node builder → Python deps builder → slim runtime with ffmpeg), docker-compose.yml, GitHub Actions CI (lint, type-check, test, Docker smoke), GitHub Actions CD (tag → build + push GHCR + Docker Hub → release)
**Avoids:** Docker image bloat (multi-stage build + `.dockerignore` + slim base targets <400MB compressed), stale extractor risk (CI smoke-tests downloads from 2+ sites)
### Phase Ordering Rationale
- Phase 1 before Phase 2: SSEBroker and DB must exist before SSE endpoint or session middleware can be built
- Phase 2 before Phase 3: Frontend SSE store requires a typed event contract; that contract comes from the working SSE endpoint
- Phase 4 after Phase 2: Admin routes depend on session infrastructure for session listing; auth must precede the panel itself
- Phase 5 after Phase 4: Purge needs admin auth; theme system needs stable components to reference token names
- Phase 6 last: Docker packaging wraps a working application; CI/CD requires the test suite from earlier phases
### Research Flags
Phases likely needing deeper research during planning:
- **Phase 2 (SSE + Session):** `Last-Event-ID` replay implementation details are non-trivial; session mode switching behavior (isolated → shared mid-deployment) needs explicit design before coding. Consider a dedicated research step on SSE event ID sequencing strategy.
- **Phase 5 (Theme system):** CSS variable contract naming is a one-way door — token names cannot change after operators write custom themes. Needs deliberate design (not just "we'll figure it out") before Phase 3 component work begins.
- **Phase 6 (Docker/CI):** Multi-platform QEMU builds on GitHub Actions standard runners can be slow; arm64 smoke testing strategy needs explicit plan.
Phases with standard patterns (skip research-phase):
- **Phase 1 (Foundation):** ThreadPoolExecutor + `call_soon_threadsafe` pattern is fully documented in STACK.md and ARCHITECTURE.md. WAL pragma sequence is known. DB schema is defined.
- **Phase 3 (Frontend Core):** Vue 3 + Pinia + Vite patterns are well-established. SSE via `EventSource` is a browser standard.
- **Phase 4 (Admin + Auth):** HTTPBasic + bcrypt pattern is fully specified in STACK.md. No novel patterns needed.
## Confidence Assessment
| Area | Confidence | Notes |
|------|------------|-------|
| Stack | HIGH | All versions verified against PyPI and npm as of 2026-03-17. Critical alternatives (passlib, APScheduler 4.x, ProcessPoolExecutor, Vue 3.6 beta) explicitly ruled out with rationale. |
| Features | HIGH (core), MEDIUM (UX patterns) | Competitor feature gaps verified via GitHub issues (MeTube #591 closed won't-fix). UX patterns (mobile layout specifics, theme interaction details) are based on community consensus, not official specs. |
| Architecture | HIGH (integration patterns), MEDIUM (schema shape) | ThreadPoolExecutor + `call_soon_threadsafe` pattern verified via yt-dlp issue #9487. Schema shape is a design choice, not a discovered pattern — reviewed but not battle-tested. |
| Pitfalls | HIGH (critical), MEDIUM (performance traps) | Critical pitfalls verified via CVE advisories, official yt-dlp issues, and sse-starlette docs. Performance trap thresholds (e.g., "10,000+ job rows for index to matter") are community estimates. |
**Overall confidence:** HIGH
### Gaps to Address
- **Session mode switching mid-deployment:** Research documents the data model implications (isolated rows remain per-session when switching to shared) but does not specify a migration or operator-facing behavior contract. Design explicitly before Phase 2 implementation.
- **CSS variable token naming:** No canonical reference for a yt-dlp-themed CSS variable contract exists. The token set must be designed from scratch in Phase 5 (or early Phase 3 if components will reference them). Treat as a design deliverable, not an implementation detail.
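- **HTTP/2 in single-container deployment:** SSE 6-connection-per-domain limit on HTTP/1.1 is documented as a risk. The mitigation (a reverse proxy such as nginx or Caddy terminating HTTP/2 in front of the app) is noted but not fully specified in the architecture. uvicorn itself does not serve HTTP/2 (its `--http` option only selects an HTTP/1.1 implementation), so serving HTTP/2 directly from the container would require a different ASGI server such as Hypercorn. Confirm which approach is the recommended default for the Docker compose reference deployment.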
- **yt-dlp extractor freshness strategy:** Pinning to `yt-dlp==2026.3.17` is correct for reproducibility, but extractors break as sites update. The update strategy ("publish new image on yt-dlp releases via CI") is noted but not implemented. Plan this in Phase 6 as a CI/CD workflow.
## Sources
### Primary (HIGH confidence)
- [PyPI: yt-dlp, FastAPI, uvicorn, aiosqlite, APScheduler, pydantic-settings, sse-starlette, bcrypt, httpx, pytest](https://pypi.org/) — all versions verified 2026-03-17
- [npm: vue, vue-router, pinia, vite, @vitejs/plugin-vue, vitest](https://www.npmjs.com/) — all versions verified 2026-03-17
- [yt-dlp Security Advisory GHSA-v8mc-9377-rwjj (CVE-2023-35934)](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-v8mc-9377-rwjj) — cookie leak via redirect
- [yt-dlp issue #9487](https://github.com/yt-dlp/yt-dlp/issues/9487) — ThreadPoolExecutor vs ProcessPoolExecutor constraint
- [MeTube issue #591](https://github.com/alexta69/metube/issues/591) — session isolation closed as won't-fix
- [sse-starlette: Client Disconnection Detection](https://deepwiki.com/sysid/sse-starlette/3.5-client-disconnection-detection) — CancelledError must be re-raised
- [FastAPI docs: HTTP Basic Auth](https://fastapi.tiangolo.com/advanced/security/http-basic-auth/) — HTTPBasic pattern
- [FastAPI docs: SSE](https://fastapi.tiangolo.com/tutorial/server-sent-events/) — EventSourceResponse
### Secondary (MEDIUM confidence)
- [MeTube GitHub](https://github.com/alexta69/metube) — competitor feature analysis
- [yt-dlp-web-ui GitHub](https://github.com/marcopiovanello/yt-dlp-web-ui) — competitor feature analysis
- [ytptube GitHub](https://github.com/arabcoders/ytptube) — competitor feature analysis
- [APScheduler 3.x docs](https://apscheduler.readthedocs.io/en/3.x/userguide.html) — CronTrigger.from_crontab pattern
- [Browser connection limits for SSE](https://www.javascriptroom.com/blog/server-sent-events-and-browser-limits/) — 6-connection HTTP/1.1 limit
- [passlib deprecation discussion](https://github.com/fastapi/fastapi/discussions/11773) — Python 3.12/3.13 breakage confirmed
### Tertiary (LOW confidence)
- [Docker image size targets for arm64](https://github.com/wader/static-ffmpeg) — community estimate of <400MB compressed; not formally benchmarked for this stack
---
*Research completed: 2026-03-17*
*Ready for roadmap: yes*