Getting Started¶

cast2md is a podcast transcription service. Add your podcast feeds, and cast2md builds a searchable transcript library -- automatically.

How It Works¶

New Episode Discovered
        │
        ▼
  Check External Transcripts
  (Podcast 2.0, Pocket Casts)
        │
   ┌────┴────┐
   │         │
Found    Not Found
   │         │
   ▼         ▼
 Done    Download Audio
            │
            ▼
      Transcribe (Whisper)
            │
            ▼
          Done

Feed discovery -- add RSS feeds, episodes are discovered automatically
Transcript download -- checks publisher transcripts and Pocket Casts first
Audio fallback -- downloads audio only when no external transcript exists
Whisper transcription -- local or distributed transcription
Search & access -- full-text and semantic search, REST API, MCP for Claude

Transcript-First Workflow¶

When a new episode is discovered, cast2md doesn't immediately download audio. Instead, it first checks for existing transcripts:

Podcast 2.0 -- publisher-provided transcripts via RSS <podcast:transcript> tags
Pocket Casts -- auto-generated transcripts from the Pocket Casts API
Whisper -- local transcription after audio download (last resort)

This saves storage and processing time -- audio is only downloaded when no external transcript is available.

Audio download is always available

You can always download audio manually for any episode, regardless of transcript availability. Use "Download Audio" on the episode detail page or the CLI/API.

Search¶

cast2md includes hybrid search combining full-text and semantic search:

Keyword search -- PostgreSQL full-text search for exact term matching
Semantic search -- sentence-transformers embeddings with pgvector for meaning-based queries (e.g., "cold water swimming" finds episodes about "Eisbaden")
Hybrid mode (default) -- combines both using Reciprocal Rank Fusion for best relevance

Search works across episode titles, descriptions, and transcript content.

Single Server is Enough¶

A single cast2md server handles the complete workflow -- downloading episodes, transcribing audio, and searching transcripts. No additional setup is needed beyond the server itself.

Remote transcriber nodes and RunPod GPU workers are entirely optional. They speed up transcription for large backlogs but aren't required for normal use.