tubearr/docs/NAMING-METADATA-SPEC.md
jlightner 4fe4986f01
Some checks failed
CI / test (push) Failing after 17s
docs: naming templates + metadata embedding design spec
Four-slice feature: template engine with {token} syntax, Plex-compatible
metadata embedding via ffmpeg, batch rename tool, live preview UI.
Per-platform and per-channel naming profile support.

Design only — implementation in next milestone.
2026-04-04 03:27:12 +00:00

250 lines
8 KiB
Markdown

# Naming Templates & Metadata Embedding — Design Spec
> Drafted 2026-04-04. Build target: next milestone.
## Overview
Configurable filename templates with token-based variables, Plex-compatible metadata embedding via ffmpeg, and batch rename tooling for existing libraries.
---
## 1. Naming Template Engine
### Token Syntax
Templates use `{token}` placeholders with optional formatters:
| Token | Description | Example Output |
|-------|-------------|---------------|
| `{title}` | Content item title | `This is the video name!` |
| `{channel}` | Channel name | `RockinWorms Homestead` |
| `{platform}` | Platform name | `youtube` |
| `{Platform}` | Platform name (capitalized) | `YouTube` |
| `{date}` | Publish date (YYYY-MM-DD) | `2026-04-03` |
| `{date:MM-DD-YYYY}` | Publish date (custom format) | `04-03-2026` |
| `{date:YYYY}` | Year only | `2026` |
| `{date:YYYYMMDD}` | Compact date | `20260403` |
| `{quality}` | Resolution shorthand | `1080p` |
| `{codec}` | Video codec | `h264` |
| `{id}` | Platform content ID | `dQw4w9WgXcQ` |
| `{ext}` | File extension | `mp4` |
| `{duration}` | Duration (HH-MM-SS) | `00-12-34` |
| `{index}` | Item index (zero-padded) | `001` |
| `{type}` | Content type | `video` |
### Conditional Sections
Wrap optional sections in `[]`:
- `{channel} - {title} [{quality}]``RockinWorms - Title [1080p]` (quality present)
- `{channel} - {title} [{quality}]``RockinWorms - Title` (quality null — brackets removed)
### Path Template (directory structure)
Separate from filename template. Controls folder hierarchy:
- Default: `{platform}/{channel}`
- Custom: `{platform}/{channel}/{date:YYYY}` (year subfolders)
- Custom: `{Platform} - {channel}` (flat by channel)
### Live Preview
Settings UI shows a real-time preview as the user types:
```
Template: {channel} - {title} ({date:MM-DD-YYYY}) [{quality}]
Preview: RockinWorms Homestead - Sifting Worm Castings (04-03-2026) [1080p].mp4
```
Uses a sample content item from the platform's existing library for realistic preview.
---
## 2. Schema Changes
### `naming_profiles` table (new)
| Column | Type | Notes |
|--------|------|-------|
| id | INTEGER PK | |
| name | TEXT NOT NULL | e.g. "YouTube Default", "Music Archive" |
| pathTemplate | TEXT NOT NULL | Directory structure: `{platform}/{channel}` |
| filenameTemplate | TEXT NOT NULL | Filename: `{channel} - {title} ({date:MM-DD-YYYY}) [{quality}]` |
| createdAt | TEXT | |
| updatedAt | TEXT | |
### `platform_settings` additions
| Column | Type | Notes |
|--------|------|-------|
| namingProfileId | INTEGER FK → naming_profiles | Per-platform naming override |
### `channels` additions
| Column | Type | Notes |
|--------|------|-------|
| namingProfileId | INTEGER FK → naming_profiles | Per-channel override (trumps platform) |
### Resolution order
1. Channel-level `namingProfileId` (if set)
2. Platform-level `namingProfileId` (if set)
3. System default (hardcoded: `{platform}/{channel}/{title}`)
---
## 3. Plex Metadata Embedding
### Strategy
Post-download step using ffmpeg (already available in the container). Zero re-encoding — metadata written via `-c copy -metadata key=value`.
### Tags Written (MP4 container — Plex Local Media Assets compatible)
| ffmpeg key | Source | Plex field |
|-----------|--------|------------|
| `title` | Content item title | Title |
| `artist` | Channel name | Studio/Artist |
| `date` | publishedAt (YYYY-MM-DD) | Originally Available |
| `year` | publishedAt year | Year |
| `comment` | Channel description or content URL | Summary |
| `description` | Content URL | Summary (fallback) |
| `genre` | Platform name | Genre |
| `show` | Channel name | Show (for TV-style libraries) |
| `episode_sort` | Item index in channel | Sort Order |
| `synopsis` | Content URL | (for some Plex agents) |
### Thumbnail embedding
Already implemented via `--embed-thumbnail` in FormatProfile. Plex reads embedded cover art from MP4.
### MKV container
MKV uses Matroska tags — different key names but ffmpeg handles the translation. Same `-metadata` flags work.
### Implementation
New `MetadataEmbedder` service:
```typescript
class MetadataEmbedder {
async embedMetadata(filePath: string, metadata: MediaMetadata): Promise<void>
}
interface MediaMetadata {
title: string;
artist: string; // channel name
date: string; // YYYY-MM-DD
year: string; // YYYY
comment: string; // URL or description
genre: string; // platform
show: string; // channel name (for TV library mode)
}
```
Called after download completes, before marking status as `downloaded`. Uses `ffmpeg -i input -c copy -metadata ... output && mv output input` (atomic replace via temp file).
### Toggle
Add `embedMetadata: boolean` to FormatProfile (default: false). When enabled, the post-download step runs.
---
## 4. Batch Rename Tool
### Purpose
Apply a naming profile retroactively to already-downloaded files. Handles:
1. Rename files on disk
2. Update `filePath` in database
3. Update `qualityMetadata` path references
4. Handle conflicts (duplicate names)
### API
```
POST /api/v1/channel/:id/rename-preview
Body: { namingProfileId: number }
Response: { renames: [{ contentId, oldPath, newPath, conflict: boolean }] }
POST /api/v1/channel/:id/rename-apply
Body: { namingProfileId: number }
Response: { renamed: number, skipped: number, errors: number }
```
### UI
Channel detail page: "Rename Files" button → modal showing preview of all renames → confirm to apply.
### Safety
- Preview-first: always show what WILL change before applying
- Conflict detection: flag files that would collide
- Dry-run by default
- Rollback: store old paths in download_history for manual recovery
---
## 5. Frontend Components
### NamingProfileEditor
- Template input with token autocomplete (type `{` to see suggestions)
- Live preview using sample data from the platform
- Path template + filename template as separate fields
- Save as named profile
### Platform Settings Integration
- "Naming Profile" dropdown in PlatformSettingsForm
- Per-channel override in ChannelDetail settings section
### Batch Rename Modal
- Channel-scoped: triggered from ChannelDetail
- Shows table: Old Name → New Name with conflict highlighting
- Apply button with confirmation
- Progress indicator for large channels
---
## 6. Implementation Order
### Slice 1: Template Engine + Schema
- `NamingTemplate` class with `resolve(tokens)` method
- Token parsing, date formatting, conditional sections
- Migration: `naming_profiles` table + FK columns
- CRUD routes + repository
### Slice 2: Download Integration
- `FileOrganizer.buildOutputPath` uses naming profile instead of hardcoded pattern
- Resolution chain: channel → platform → system default
- Settings UI: NamingProfileEditor component + platform/channel dropdowns
### Slice 3: Metadata Embedding
- `MetadataEmbedder` service (ffmpeg -c copy -metadata)
- Post-download hook in DownloadService
- `embedMetadata` toggle on FormatProfile
- Verify Plex reads the tags (manual UAT)
### Slice 4: Batch Rename
- Preview API + Apply API
- Rename modal in ChannelDetail
- Conflict detection + rollback tracking
---
## 7. Plex Library Type Recommendations
Document in the app's help/docs:
| Content Type | Plex Library Type | Naming Pattern |
|-------------|------------------|----------------|
| YouTube channels (episodic) | TV Shows | `{channel} - s01e{index} - {title}` |
| YouTube channels (non-episodic) | Other Videos / Personal Media | `{channel} - {title} ({date:YYYY-MM-DD})` |
| Music (SoundCloud, Bandcamp) | Music | `{channel}/{title}` (+ audio tags) |
| Generic (Vimeo, misc) | Other Videos | `{title} ({date:YYYY})` |
---
## Open Questions
1. Should naming profiles be global or per-user? (Currently single-user, so global is fine)
2. Should we support yt-dlp's own output template syntax (`%(title)s`) as an alternative to our `{title}` tokens? Pros: power users already know it. Cons: two template syntaxes to maintain.
3. TV-style episode numbering: auto-increment based on publish date order, or explicit mapping?