Create Data-Model wiki page with full schema documentation

xpltd_admin 2026-04-03 22:40:30 -06:00
parent 1033723870
commit 3ef1e12be3

272
Data-Model.-.md Normal file

@ -0,0 +1,272 @@
# Data Model
| Meta | Value |
|------|-------|
| **Repo** | `xpltdco/tubearr` |
| **Page** | `Data-Model` |
| **Audience** | developers, agents |
| **Last Updated** | 2026-04-04 |
| **Status** | current |
## Overview
Tubearr uses SQLite (via LibSQL) with Drizzle ORM. The database runs in WAL (Write-Ahead Logging) mode for concurrent read/write support. All schema definitions live in `src/db/schema/` and data access goes through repository classes in `src/db/repositories/`.
## Entity Relationships
```mermaid
erDiagram
channels ||--o{ contentItems : "has many"
channels ||--o{ playlists : "has many"
channels ||--o{ downloadHistory : "logged in"
channels }o--o| formatProfiles : "default profile"
contentItems ||--o{ queueItems : "enqueued as"
contentItems ||--o{ downloadHistory : "logged in"
contentItems }o--o{ playlists : "belongs to (via contentPlaylist)"
platformSettings }o--o| formatProfiles : "default profile"
notificationSettings ||--|| notificationSettings : "standalone"
systemConfig ||--|| systemConfig : "standalone"
```
## Tables
### `channels`
**Schema file:** `src/db/schema/channels.ts`
**Repository:** `src/db/repositories/channel-repository.ts`
Monitored content sources — YouTube channels, SoundCloud artists, or generic URLs.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | INTEGER | No | autoincrement | Primary key |
| `name` | TEXT | No | — | Human-readable channel name |
| `platform` | TEXT | No | — | `youtube` \| `soundcloud` \| `generic` |
| `platformId` | TEXT | No | — | Platform-specific channel/artist ID |
| `url` | TEXT | No | — | Direct URL to channel/artist page |
| `monitoringEnabled` | INTEGER | No | 1 | Boolean: is monitoring active? |
| `monitoringMode` | TEXT | Yes | `all` | `all` \| `future` \| `existing` \| `none` |
| `checkInterval` | INTEGER | Yes | 360 | Minutes between monitoring checks |
| `formatProfileId` | INTEGER | Yes | NULL | FK → `formatProfiles.id` (ON DELETE SET NULL) |
| `imageUrl` | TEXT | Yes | NULL | Channel avatar/thumbnail URL |
| `bannerUrl` | TEXT | Yes | NULL | Channel banner image URL |
| `description` | TEXT | Yes | NULL | Channel description |
| `subscriberCount` | INTEGER | Yes | NULL | Subscriber/follower count |
| `metadata` | JSON | Yes | NULL | Platform-specific extra data |
| `lastCheckedAt` | TEXT | Yes | NULL | ISO timestamp of last monitoring check |
| `lastCheckStatus` | TEXT | Yes | NULL | `success` \| `error` \| `rate_limited` |
| `createdAt` | TEXT | No | now | Creation timestamp |
| `updatedAt` | TEXT | No | now | Last update timestamp |
### `contentItems`
**Schema file:** `src/db/schema/content.ts`
**Repository:** `src/db/repositories/content-repository.ts`
Individual downloadable media — videos, audio tracks, or livestreams.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | INTEGER | No | autoincrement | Primary key |
| `channelId` | INTEGER | No | — | FK → `channels.id` (ON DELETE CASCADE) |
| `title` | TEXT | No | — | Content title |
| `platformContentId` | TEXT | No | — | Platform-specific video/track ID |
| `url` | TEXT | No | — | Direct URL to content |
| `contentType` | TEXT | No | — | `video` \| `audio` \| `livestream` |
| `duration` | INTEGER | Yes | NULL | Duration in seconds |
| `filePath` | TEXT | Yes | NULL | Local filesystem path (set after download) |
| `fileSize` | INTEGER | Yes | NULL | File size in bytes (set after download) |
| `format` | TEXT | Yes | NULL | Container format: `mp4`, `webm`, `mp3`, etc. |
| `qualityMetadata` | JSON | Yes | NULL | Post-download analysis: resolution, codec, bitrate, warnings |
| `status` | TEXT | No | — | `monitored` \| `queued` \| `downloading` \| `downloaded` \| `failed` \| `ignored` |
| `thumbnailUrl` | TEXT | Yes | NULL | Content thumbnail URL |
| `publishedAt` | TEXT | Yes | NULL | ISO timestamp when content was published |
| `downloadedAt` | TEXT | Yes | NULL | ISO timestamp when download completed |
| `monitored` | INTEGER | Yes | 1 | Boolean: should this item be tracked? |
| `createdAt` | TEXT | No | now | Creation timestamp |
| `updatedAt` | TEXT | No | now | Last update timestamp |
### `formatProfiles`
**Schema file:** `src/db/schema/content.ts`
**Repository:** `src/db/repositories/format-profile-repository.ts`
Reusable download quality presets.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | INTEGER | No | autoincrement | Primary key |
| `name` | TEXT | No | — | Profile name (e.g., "High Quality", "Mobile") |
| `videoResolution` | TEXT | Yes | NULL | `best`, `1080p`, `720p`, `480p`, etc. |
| `audioCodec` | TEXT | Yes | NULL | `opus`, `aac`, `mp3`, `vorbis` |
| `audioBitrate` | TEXT | Yes | NULL | `320k`, `192k`, `128k`, etc. |
| `containerFormat` | TEXT | Yes | NULL | `mp4`, `mkv`, `webm`, `mp3`, `opus`, etc. |
| `isDefault` | INTEGER | Yes | 0 | Boolean: is this the default profile? |
| `subtitleLanguages` | TEXT | Yes | NULL | Comma-separated language codes (e.g., `en,es,fr`) |
| `embedSubtitles` | INTEGER | Yes | 0 | Boolean: embed subtitles in output file |
| `embedChapters` | INTEGER | Yes | 0 | Boolean: embed chapter markers |
| `embedThumbnail` | INTEGER | Yes | 0 | Boolean: embed thumbnail in metadata |
| `sponsorBlockRemove` | TEXT | Yes | NULL | Comma-separated SponsorBlock categories |
**SponsorBlock categories:** `sponsor`, `selfpromo`, `interaction`, `intro`, `outro`, `preview`, `music_offtopic`, `filler`
### `queueItems`
**Schema file:** `src/db/schema/queue.ts`
**Repository:** `src/db/repositories/queue-repository.ts`
Download queue with retry logic and status tracking.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | INTEGER | No | autoincrement | Primary key |
| `contentItemId` | INTEGER | No | — | FK → `contentItems.id` (ON DELETE CASCADE) |
| `status` | TEXT | No | — | `pending` \| `downloading` \| `completed` \| `failed` \| `cancelled` |
| `priority` | INTEGER | Yes | 0 | Higher = processed sooner |
| `attempts` | INTEGER | Yes | 0 | Number of download attempts |
| `maxAttempts` | INTEGER | Yes | 3 | Maximum retry attempts |
| `error` | TEXT | Yes | NULL | Last error message |
| `errorCategory` | TEXT | Yes | NULL | Classified error type (see below) |
| `startedAt` | TEXT | Yes | NULL | When download started |
| `completedAt` | TEXT | Yes | NULL | When download finished |
| `createdAt` | TEXT | No | now | Creation timestamp |
| `updatedAt` | TEXT | No | now | Last update timestamp |
**Error categories:** `rate_limit`, `format_unavailable`, `geo_blocked`, `age_restricted`, `private`, `network`, `sign_in_required`, `copyright`, `unknown`
### `downloadHistory`
**Schema file:** `src/db/schema/history.ts`
**Repository:** `src/db/repositories/history-repository.ts`
Activity log for auditing and analytics.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | INTEGER | No | autoincrement | Primary key |
| `contentItemId` | INTEGER | Yes | NULL | FK → `contentItems.id` (ON DELETE SET NULL) |
| `channelId` | INTEGER | Yes | NULL | FK → `channels.id` (ON DELETE SET NULL) |
| `eventType` | TEXT | No | — | `grabbed` \| `downloaded` \| `failed` \| `imported` \| `deleted` |
| `status` | TEXT | No | — | Event status description |
| `details` | JSON | Yes | NULL | Arbitrary event metadata |
| `createdAt` | TEXT | No | now | Timestamp |
### `notificationSettings`
**Schema file:** `src/db/schema/notifications.ts`
**Repository:** `src/db/repositories/notification-repository.ts`
Notification channel configurations.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | INTEGER | No | autoincrement | Primary key |
| `type` | TEXT | No | — | `discord` \| `email` \| `pushover` \| `telegram` |
| `name` | TEXT | No | — | Human-readable name |
| `enabled` | INTEGER | No | 1 | Boolean: is this channel active? |
| `config` | JSON | No | — | Type-specific settings (webhook URL, token, etc.) |
| `onGrab` | INTEGER | Yes | 1 | Notify when content is enqueued |
| `onDownload` | INTEGER | Yes | 1 | Notify when download completes |
| `onFailure` | INTEGER | Yes | 1 | Notify when download fails |
| `createdAt` | TEXT | No | now | Creation timestamp |
| `updatedAt` | TEXT | No | now | Last update timestamp |
### `platformSettings`
**Schema file:** `src/db/schema/platform-settings.ts`
**Repository:** `src/db/repositories/platform-settings-repository.ts`
Per-platform defaults for monitoring and downloads.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `platform` | TEXT | No | — | Primary key: `youtube` \| `soundcloud` \| `generic` |
| `defaultFormatProfileId` | INTEGER | Yes | NULL | FK → `formatProfiles.id` (ON DELETE SET NULL) |
| `checkInterval` | INTEGER | Yes | NULL | Default check interval (minutes) |
| `concurrencyLimit` | INTEGER | Yes | NULL | Max concurrent operations |
| `subtitleLanguages` | TEXT | Yes | NULL | Default subtitle languages |
| `grabAllEnabled` | INTEGER | Yes | 0 | Enable back-catalog import by default |
| `grabAllOrder` | TEXT | Yes | `newest` | `newest` \| `oldest` |
| `scanLimit` | INTEGER | Yes | NULL | Max items per monitoring scan |
| `rateLimitDelay` | INTEGER | Yes | NULL | Milliseconds between API requests |
| `defaultMonitoringMode` | TEXT | Yes | NULL | Default mode for new channels |
| `createdAt` | TEXT | No | now | Creation timestamp |
| `updatedAt` | TEXT | No | now | Last update timestamp |
### `playlists`
**Schema file:** `src/db/schema/playlists.ts`
**Repository:** `src/db/repositories/playlist-repository.ts`
Platform playlists discovered from channels.
| Column | Type | Nullable | Default | Description |
|--------|------|----------|---------|-------------|
| `id` | INTEGER | No | autoincrement | Primary key |
| `channelId` | INTEGER | No | — | FK → `channels.id` (ON DELETE CASCADE) |
| `platformPlaylistId` | TEXT | No | — | Platform playlist ID |
| `title` | TEXT | No | — | Playlist name |
| `position` | INTEGER | Yes | NULL | Display order |
| `createdAt` | TEXT | No | now | Creation timestamp |
| `updatedAt` | TEXT | No | now | Last update timestamp |
### `contentPlaylist` (junction)
**Schema file:** `src/db/schema/playlists.ts`
Many-to-many relationship between content items and playlists.
| Column | Type | Description |
|--------|------|-------------|
| `contentItemId` | INTEGER | FK → `contentItems.id` (ON DELETE CASCADE) |
| `playlistId` | INTEGER | FK → `playlists.id` (ON DELETE CASCADE) |
**Primary key:** composite (`contentItemId`, `playlistId`)
### `systemConfig`
**Schema file:** `src/db/schema/system.ts`
**Repository:** `src/db/repositories/system-config-repository.ts`
Key-value store for application-level settings.
| Column | Type | Description |
|--------|------|-------------|
| `key` | TEXT | Primary key (e.g., `api_key`, `check_interval`, `concurrent_downloads`, `ytdlp_last_updated`) |
| `value` | TEXT | Setting value |
| `createdAt` | TEXT | Creation timestamp |
| `updatedAt` | TEXT | Last update timestamp |
## Database Initialization
**File:** `src/db/index.ts`
On startup, the database connection sets these SQLite pragmas:
```sql
PRAGMA journal_mode = WAL; -- Write-Ahead Logging for concurrent reads
PRAGMA busy_timeout = 5000; -- Wait up to 5 seconds if DB is locked
PRAGMA foreign_keys = ON; -- Enforce referential integrity
```
## Migration Strategy
1. **Schema changes** are defined in `src/db/schema/*.ts` using Drizzle table definitions
2. **Generate migration:** `npm run db:generate` runs `drizzle-kit generate` — compares schema files against the current migration state and outputs new SQL migration files to `drizzle/`
3. **Apply migration:** `npm run db:migrate` runs `tsx src/db/migrate.ts` — applies pending migrations in order
4. **On startup:** Migrations run automatically (idempotent) before the server starts
5. **Migration files** in `drizzle/` are committed to git and applied in sequence
## Key Query Patterns
### Content with channel join
Most content queries join through `channels` to include channel name, platform, and image URL alongside content item data.
### Status-based filtering
Content items are frequently filtered by `status` (`monitored`, `queued`, `downloading`, `downloaded`, `failed`, `ignored`). The repository layer provides methods for each status transition.
### Deduplication
When monitoring discovers content, it checks `platformContentId` + `channelId` to avoid inserting duplicates. This is the primary uniqueness constraint for content items.
### Queue processing order
Queue items are selected by `status = 'pending'` ordered by `priority DESC, createdAt ASC` — higher priority first, then FIFO within the same priority level.