TTS Engines¶
DPO Reader supports three text-to-speech backends with different trade-offs.
Bark (Default)¶
Neural TTS that runs locally. Produces natural-sounding speech with good intonation and emotion.
dpo-reader listen URL -e bark
Quality: Excellent
Speed: ~10 seconds per sentence
Hardware: GPU recommended (MPS on Apple Silicon, CUDA on NVIDIA)
Install: Included by default
Bark works without a GPU but runs slower. On Apple Silicon Macs, it uses Metal Performance Shaders automatically.
OpenAI¶
Cloud-based TTS with the best quality. Requires an API key and costs money per character.
export OPENAI_API_KEY=sk-...
dpo-reader listen URL -e openai
Quality: Best
Speed: Fast (network-dependent)
Hardware: Any (runs in cloud)
Cost: ~$0.015 per 1K characters
Get an API key at platform.openai.com/api-keys.
Available voices: alloy, echo, fable, onyx, nova, shimmer.
Piper¶
Lightweight local TTS optimized for CPU. Good for batch processing or machines without GPUs.
uv pip install dpo-reader[piper]
dpo-reader listen URL -e piper
Quality: Good
Speed: ~0.1 seconds per sentence
Hardware: CPU only (~50MB models)
Install:
dpo-reader[piper]
Piper uses ONNX runtime and works on any machine. Models download automatically on first use.
Comparison¶
Engine |
Quality |
Speed |
Requirements |
|---|---|---|---|
OpenAI |
Best |
Fast |
API key ($) |
Bark |
Excellent |
Slow |
GPU helps |
Piper |
Good |
Fast |
CPU only |