VidRip - Project Context

Project Overview

VidRip is a respectful YouTube video drip downloader: it downloads videos from channels slowly over time instead of scraping aggressively. Downloads run at configurable intervals with random variance, which makes the pattern less predictable.

Architecture

Tech Stack

  • Backend: Node.js, Express, TypeScript, SQLite (better-sqlite3), node-cron
  • Frontend: React, TypeScript, Vite, Tailwind CSS, React Router
  • External: yt-dlp (must be installed on system)

Project Structure

vidrip/
├── backend/
│   ├── src/
│   │   ├── db/
│   │   │   └── database.ts          # SQLite operations, schema, CRUD
│   │   ├── routes/
│   │   │   ├── channels.ts          # Channel management endpoints
│   │   │   ├── videos.ts            # Video management endpoints
│   │   │   └── config.ts            # Settings & scheduler status
│   │   ├── services/
│   │   │   ├── ytdlp.ts             # yt-dlp wrapper for downloads
│   │   │   └── scheduler.ts         # Download scheduler logic
│   │   ├── types/
│   │   │   └── index.ts             # TypeScript interfaces
│   │   └── server.ts                # Express app entry point
│   ├── downloads/                   # Downloaded video storage
│   ├── data.db                      # SQLite database (auto-created)
│   └── package.json
├── frontend/
│   ├── src/
│   │   ├── components/              # (empty - components in pages)
│   │   ├── pages/
│   │   │   ├── VideosPage.tsx       # List/add/filter videos
│   │   │   ├── ChannelsPage.tsx     # Manage channels
│   │   │   ├── SettingsPage.tsx     # Configure scheduler
│   │   │   └── VideoPlayerPage.tsx  # Watch videos
│   │   ├── services/
│   │   │   └── api.ts               # API client functions
│   │   ├── types/
│   │   │   └── index.ts             # TypeScript interfaces
│   │   ├── App.tsx                  # Router & navigation
│   │   ├── main.tsx                 # React entry point
│   │   └── index.css                # Tailwind imports
│   ├── index.html
│   ├── vite.config.ts               # Proxy to backend
│   └── package.json
└── README.md

Database Schema

Tables

  1. channels

    • id, url, name, channelId, addedAt, lastChecked, active
    • Stores YouTube channels to monitor
  2. videos

    • id, channelId, videoId, title, url, duration, thumbnail, uploadDate
    • status (pending/downloading/completed/failed), filePath, fileSize
    • addedAt, downloadedAt, error
    • Stores individual videos and download status
  3. config

    • id, key, value
    • Stores app configuration (intervalHours, varianceMinutes, enabled, etc.)
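
As a rough guide, the tables above might map onto TypeScript row types like the ones below. Only the column names and the four status values come from this document; every column type, the nullability choices, and the names ChannelRow, VideoRow, ConfigRow and isVideoStatus are assumptions, so check backend/src/db/database.ts and backend/src/types/index.ts for the real definitions.

```typescript
// Hypothetical row types inferred from the column lists above.
// Column types and nullability are assumptions, not the project's actual schema.
type VideoStatus = "pending" | "downloading" | "completed" | "failed";

interface ChannelRow {
  id: number;
  url: string;
  name: string;
  channelId: string;
  addedAt: string;
  lastChecked: string | null;
  active: boolean; // SQLite has no boolean type, so this is likely stored as 0/1
}

interface VideoRow {
  id: number;
  channelId: number | null; // assumed nullable for videos added individually by URL
  videoId: string;
  title: string;
  url: string;
  duration: number | null;
  thumbnail: string | null;
  uploadDate: string | null;
  status: VideoStatus;
  filePath: string | null;
  fileSize: number | null;
  addedAt: string;
  downloadedAt: string | null;
  error: string | null;
}

interface ConfigRow {
  id: number;
  key: string;
  value: string; // all settings are stored as strings
}

const VIDEO_STATUSES: readonly string[] = ["pending", "downloading", "completed", "failed"];

// Narrows an arbitrary string (for example a ?status= query value) to VideoStatus.
function isVideoStatus(value: string): value is VideoStatus {
  return VIDEO_STATUSES.indexOf(value) !== -1;
}
```

A guard like isVideoStatus is one way to reject unknown values of the optional ?status= filter before they reach a query.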

Key Flows

Download Scheduler (backend/src/services/scheduler.ts)

  1. Channel Checking: Runs every hour via cron to check channels for new videos
  2. Download Cycle:
    • Picks next pending video
    • Downloads with progress tracking
    • Marks completed or failed
    • Schedules the next download after intervalHours, plus or minus a random offset of up to varianceMinutes
  3. Restartable: Can be stopped/restarted when config changes
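
The stop/restart behaviour described above can be sketched as a small self-rescheduling timer loop. This is not the code in scheduler.ts: the class name DripScheduler, its method names, and the generation counter are all assumptions for illustration. The job callback stands in for one download cycle (pick a pending video, download it, mark it completed or failed), and nextDelayMs stands in for the interval-plus-variance calculation.

```typescript
// Hypothetical sketch of a restartable, self-rescheduling loop (not the project's code).
type Job = () => Promise<void>;

class DripScheduler {
  private readonly job: Job;
  private readonly nextDelayMs: () => number;
  private timer: ReturnType<typeof setTimeout> | null = null;
  private running = false;
  // Bumped on every start so a cycle that began before a restart never reschedules.
  private generation = 0;

  constructor(job: Job, nextDelayMs: () => number) {
    this.job = job;
    this.nextDelayMs = nextDelayMs;
  }

  start(): void {
    if (this.running) return;
    this.running = true;
    this.generation += 1;
    this.scheduleNext(this.generation);
  }

  stop(): void {
    this.running = false;
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
  }

  // Called when configuration changes so the new interval takes effect.
  restart(): void {
    this.stop();
    this.start();
  }

  isRunning(): boolean {
    return this.running;
  }

  private scheduleNext(generation: number): void {
    this.timer = setTimeout(() => {
      this.timer = null;
      void this.runOnce(generation);
    }, this.nextDelayMs());
  }

  private async runOnce(generation: number): Promise<void> {
    try {
      await this.job();
    } catch (err) {
      console.error("download cycle failed:", err);
    }
    // Only the most recent start() is allowed to keep the loop going.
    if (this.running && generation === this.generation) {
      this.scheduleNext(generation);
    }
  }
}
```

Because each cycle schedules the next one only after it finishes, at most one timer is pending at a time, which fits the single-download limitation noted later in this document.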

Adding a Channel

  1. User enters YouTube channel URL in frontend
  2. Backend calls getChannelInfo() to fetch channel metadata
  3. Backend calls getChannelVideos() to get all videos (flat playlist)
  4. Creates channel record and all video records with status='pending'
  5. Scheduler will pick them up based on configuration
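
Step 4 might look something like the sketch below, which turns flat-playlist entries into pending video records. The entry shape (FlatPlaylistEntry), the PendingVideo type, the function name toPendingVideos, and the duplicate filtering are assumptions for illustration; the real getChannelVideos() output and database.ts record shape may differ.

```typescript
// Assumed shape of one flat-playlist entry; real yt-dlp output carries more fields.
interface FlatPlaylistEntry {
  id: string;
  title: string;
  duration?: number | null;
}

// Hypothetical record shape for a video that has not been downloaded yet.
interface PendingVideo {
  channelId: number;
  videoId: string;
  title: string;
  url: string;
  duration: number | null;
  status: "pending";
}

// Builds pending records, skipping IDs that are already stored or repeated in the listing.
function toPendingVideos(
  channelDbId: number,
  entries: FlatPlaylistEntry[],
  knownVideoIds: ReadonlySet<string> = new Set<string>(),
): PendingVideo[] {
  const seen = new Set<string>(knownVideoIds);
  const result: PendingVideo[] = [];
  for (const entry of entries) {
    if (seen.has(entry.id)) continue;
    seen.add(entry.id);
    result.push({
      channelId: channelDbId,
      videoId: entry.id,
      title: entry.title,
      url: `https://www.youtube.com/watch?v=${entry.id}`,
      duration: entry.duration ?? null,
      status: "pending",
    });
  }
  return result;
}
```

Passing the set of already-stored video IDs would let the same helper serve a channel refresh, where only new videos should be added.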

Video Download

  1. Scheduler picks next pending video
  2. Calls yt-dlp with progress callbacks
  3. Downloads to backend/downloads/{videoId}.mp4
  4. Updates database with file path, size, status

API Endpoints

Channels

  • GET /api/channels - List all channels
  • GET /api/channels/:id - Get channel details
  • GET /api/channels/:id/videos - Get channel's videos
  • POST /api/channels - Add new channel (body: {url})
  • PATCH /api/channels/:id - Update channel (body: {active})
  • DELETE /api/channels/:id - Delete channel & videos
  • POST /api/channels/:id/refresh - Check for new videos

Videos

  • GET /api/videos - List all videos (optional ?status=)
  • GET /api/videos/:id - Get video details
  • POST /api/videos - Add single video (body: {url})
  • DELETE /api/videos/:id - Delete video & file
  • POST /api/videos/:id/retry - Retry failed video

Config

  • GET /api/config - Get all settings
  • PATCH /api/config - Update settings (body: {key: value})
  • GET /api/config/scheduler/status - Get scheduler status & progress

Static

  • /downloads/{videoId}.mp4 - Serve downloaded videos
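
A thin client over a few of these endpoints could be sketched as follows. The paths and request bodies come from the lists above; the names createApiClient, FetchLike and HttpResponse are hypothetical, and the assumptions that every call returns JSON and that errors surface as non-2xx statuses are not confirmed by this document. The fetch function is injected so the sketch does not depend on a particular runtime.

```typescript
// Minimal response and fetch shapes, kept local so the sketch is self-contained.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

interface RequestOptions {
  method?: string;
  headers?: Record<string, string>;
  body?: string;
}

type FetchLike = (url: string, init?: RequestOptions) => Promise<HttpResponse>;

// Hypothetical client factory; frontend/src/services/api.ts may be organised differently.
function createApiClient(fetchImpl: FetchLike, baseUrl: string = "/api") {
  async function request(method: string, path: string, body?: unknown): Promise<unknown> {
    const init: RequestOptions = { method };
    if (body !== undefined) {
      init.headers = { "Content-Type": "application/json" };
      init.body = JSON.stringify(body);
    }
    const res = await fetchImpl(`${baseUrl}${path}`, init);
    if (!res.ok) {
      throw new Error(`${method} ${path} failed with status ${res.status}`);
    }
    return res.json(); // assumes a JSON response body
  }

  return {
    listVideos: (status?: string) =>
      request("GET", status ? `/videos?status=${encodeURIComponent(status)}` : "/videos"),
    addChannel: (url: string) => request("POST", "/channels", { url }),
    setChannelActive: (id: number, active: boolean) =>
      request("PATCH", `/channels/${id}`, { active }),
    retryVideo: (id: number) => request("POST", `/videos/${id}/retry`),
    schedulerStatus: () => request("GET", "/config/scheduler/status"),
  };
}
```

In the browser this would typically be wired up as createApiClient((url, init) => fetch(url, init)), relying on the Vite proxy to forward /api to the backend.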

Configuration Options

Default values in database:

  • intervalHours: "3" - Average hours between downloads
  • varianceMinutes: "30" - Maximum random offset in minutes, added to or subtracted from the interval
  • maxConcurrentDownloads: "1" - Simultaneous downloads (currently only 1 supported)
  • enabled: "true" - Whether scheduler is active
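
Since every value is stored as a string, the backend has to convert settings before using them. One way to do that is sketched below. The keys and defaults are the ones listed above; the names SchedulerConfig and parseConfig, and the choice to fall back to the default when a value is missing or not a non-negative number, are assumptions.

```typescript
// Hypothetical typed view of the string-valued config table.
interface SchedulerConfig {
  intervalHours: number;
  varianceMinutes: number;
  maxConcurrentDownloads: number;
  enabled: boolean;
}

// Defaults as documented above.
const DEFAULT_CONFIG: SchedulerConfig = {
  intervalHours: 3,
  varianceMinutes: 30,
  maxConcurrentDownloads: 1,
  enabled: true,
};

// Converts a stored string to a non-negative number, or returns the fallback.
function parseNonNegative(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isFinite(n) && n >= 0 ? n : fallback;
}

// Turns key/value rows into a typed config object.
function parseConfig(rows: { key: string; value: string }[]): SchedulerConfig {
  const map = new Map<string, string>();
  for (const row of rows) map.set(row.key, row.value);
  const enabledRaw = map.get("enabled");
  return {
    intervalHours: parseNonNegative(map.get("intervalHours"), DEFAULT_CONFIG.intervalHours),
    varianceMinutes: parseNonNegative(map.get("varianceMinutes"), DEFAULT_CONFIG.varianceMinutes),
    maxConcurrentDownloads: parseNonNegative(
      map.get("maxConcurrentDownloads"),
      DEFAULT_CONFIG.maxConcurrentDownloads,
    ),
    enabled: enabledRaw === undefined ? DEFAULT_CONFIG.enabled : enabledRaw === "true",
  };
}
```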

Important Implementation Details

yt-dlp Integration

  • Uses spawn to call yt-dlp CLI (must be in PATH)
  • --dump-json for metadata extraction
  • --flat-playlist for channel video lists
  • Progress is parsed line by line from stdout
  • Downloads as MP4 (merges best video+audio)
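
The stdout progress parsing could look roughly like the sketch below. It assumes yt-dlp's default progress lines, which begin with [download] followed by a percentage, and it buffers chunks because a stream can split a line in the middle. The function names parseProgressLine and createLineSplitter are hypothetical, and the actual parsing in ytdlp.ts may differ.

```typescript
// Matches lines such as "[download]  45.3% of 10.00MiB at 1.20MiB/s ETA 00:05".
const PROGRESS_RE = /^\[download\]\s+(\d+(?:\.\d+)?)%/;

// Returns the percentage from a progress line, or null for any other line.
function parseProgressLine(line: string): number | null {
  const match = PROGRESS_RE.exec(line.trim());
  return match ? Number(match[1]) : null;
}

// Buffers raw stdout chunks and emits complete lines.
// Splits on both \r and \n because progress output may be redrawn with carriage returns.
function createLineSplitter(onLine: (line: string) => void): (chunk: string) => void {
  let buffer = "";
  return (chunk: string): void => {
    buffer += chunk;
    const parts = buffer.split(/[\r\n]+/);
    // The last element is an incomplete line (or empty); keep it for the next chunk.
    buffer = parts.pop() ?? "";
    for (const part of parts) {
      if (part.length > 0) onLine(part);
    }
  };
}
```

With a spawned process this would be fed from the stdout data event, for example child.stdout.on("data", (d) => push(d.toString())), where push is the function returned by createLineSplitter.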

Scheduler Variance

  • Prevents predictable download patterns
  • Calculates: baseMinutes = intervalHours * 60
  • Adds random: variance = random(-varianceMinutes, +varianceMinutes)
  • Next delay in minutes: baseMinutes + variance
  • Uses setTimeout for next download, not fixed cron
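
The calculation above can be sketched as a small function. The uniform distribution, the clamp at zero, and the function name nextDelayMs are assumptions; the random source is injectable so the result can be checked deterministically.

```typescript
// Returns the delay until the next download, in milliseconds.
// random() is expected to return a value in [0, 1), as Math.random does.
function nextDelayMs(
  intervalHours: number,
  varianceMinutes: number,
  random: () => number = Math.random,
): number {
  const baseMinutes = intervalHours * 60;
  // Map [0, 1) onto [-varianceMinutes, +varianceMinutes).
  const variance = (random() * 2 - 1) * varianceMinutes;
  // Clamp so a large variance can never produce a negative delay (an assumption).
  const delayMinutes = Math.max(0, baseMinutes + variance);
  return Math.round(delayMinutes * 60 * 1000);
}
```

With the documented defaults (3 hours, 30 minutes), the delay falls between 150 and 210 minutes, and the result is what would be passed to setTimeout.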

Frontend Proxy

  • Vite proxies /api and /downloads to backend (port 3001)
  • Allows development without CORS issues
  • A production deployment would need a reverse proxy (nginx or similar)
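
For reference, a proxy setup of this kind might be expressed in vite.config.ts roughly as follows. Port 3000 and backend port 3001 come from this document; the localhost target is an assumption, and plugins (such as the React plugin the project presumably uses) are left out to keep the fragment short, so the real file will differ.

```typescript
// Sketch of a vite.config.ts proxy section (not the project's actual file).
const backendTarget = "http://localhost:3001"; // assumed backend address

const config = {
  server: {
    port: 3000,
    proxy: {
      // Forward API calls and downloaded video files to the backend.
      "/api": backendTarget,
      "/downloads": backendTarget,
    },
  },
};

export default config;
```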

Known Limitations & Future Enhancements

Current Limitations

  1. Only 1 concurrent download supported (hardcoded in scheduler)
  2. No authentication/authorization
  3. No video queue reordering
  4. No bandwidth limiting
  5. No retry limit for failed videos
  6. No disk space checking
  7. No video preview before download
  8. Progress tracking only works during active download (not persisted)

Potential Enhancements

  • User authentication
  • Video quality selection
  • Download queue prioritization
  • Bandwidth throttling
  • Automatic old video cleanup
  • Video search/filtering by title
  • Channel categorization/tagging
  • Download history/statistics
  • Webhook notifications
  • Docker containerization
  • Multiple download quality profiles
  • Subtitle downloading

Development Commands

# Install all dependencies
npm run install:all

# Development (run in separate terminals)
npm run dev:backend    # Backend on :3001
npm run dev:frontend   # Frontend on :3000

# Production build
npm run build:backend
npm run build:frontend

# Production run
npm run start:backend
npm run start:frontend

Dependencies to Install

System Requirements

  • Node.js 18+
  • yt-dlp (via pip, brew, or binary)

Installation

# yt-dlp
pip install yt-dlp
# or
brew install yt-dlp

# Node dependencies
npm run install:all

Troubleshooting

Common Issues

  1. "yt-dlp not found": Ensure yt-dlp is in PATH
  2. Database locked: Only one backend instance should run
  3. Video won't play: Check file exists in backend/downloads/
  4. Scheduler not running: Check settings page, ensure enabled=true
  5. Channel refresh fails: YouTube may be rate limiting, wait and retry

File Locations

  • Database: backend/data.db
  • Videos: backend/downloads/{videoId}.mp4
  • Logs: Console output (not persisted to file currently)

Security Considerations

  • No authentication - anyone with access can manage downloads
  • Videos stored unencrypted on filesystem
  • No rate limiting on API endpoints
  • YouTube URLs not validated before passing to yt-dlp
  • Consider running behind reverse proxy with auth in production

Code Entry Points

To understand the codebase quickly:

  1. Start with backend/src/server.ts - see how routes connect
  2. Read backend/src/services/scheduler.ts - core business logic
  3. Check frontend/src/App.tsx - understand page routing
  4. Review backend/src/db/database.ts - database schema & operations

Testing

There are currently no automated tests. Manual testing checklist:

  • Add channel and verify videos appear
  • Add individual video by URL
  • Watch completed video
  • Change settings and verify scheduler restarts
  • Delete channel and verify cascade delete
  • Retry failed video
  • Refresh channel for new videos

Environment Variables

Backend supports (optional):

  • PORT - Server port (default: 3001)
  • NODE_ENV - Environment (development/production)

No .env file is currently required; the backend uses the defaults above.
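
Reading PORT with its documented default could be sketched like this. The function name resolvePort and the decision to fall back to 3001 when the value is missing or not a valid port number are assumptions; server.ts may handle this differently.

```typescript
// Resolves the server port from an environment-like object.
// Falls back to the default when PORT is unset, empty, or not a valid port (assumed behaviour).
function resolvePort(env: Record<string, string | undefined>, fallback: number = 3001): number {
  const raw = env["PORT"];
  if (raw === undefined || raw.trim() === "") return fallback;
  const port = Number(raw);
  return Number.isInteger(port) && port > 0 && port <= 65535 ? port : fallback;
}
```

In the backend this would be called as resolvePort(process.env).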