dpo_reader.tts.base

Base TTS backend interface.

Functions

fix_pronunciations(text) – Apply pronunciation fixes to text before TTS.

split_into_chunks(text[, max_chars]) – Split text into chunks suitable for TTS.

Classes

TTSBackend – Abstract base class for TTS backends.

TTSGenerator – High-level TTS generator with caching and progress tracking.

dpo_reader.tts.base.fix_pronunciations(text)[source]

Apply pronunciation fixes to text before TTS.

Parameters:

text (str)

Return type:

str

dpo_reader.tts.base.split_into_chunks(text, max_chars=250)[source]

Split text into chunks suitable for TTS.

Parameters:
  • text (str)

  • max_chars (int)

Return type:

list[str]
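
A brief usage sketch of the two helpers above; the sample text and the printed summary are illustrative, not part of the API:

from dpo_reader.tts.base import fix_pronunciations, split_into_chunks

# Normalize tricky words first, then break the result into TTS-sized pieces.
raw_text = "DPO stands for Direct Preference Optimization. " * 20
cleaned = fix_pronunciations(raw_text)

# max_chars defaults to 250 characters per chunk.
chunks = split_into_chunks(cleaned, max_chars=250)
print(f"{len(chunks)} chunks, longest is {max(len(c) for c in chunks)} chars")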

class dpo_reader.tts.base.TTSBackend[source]

Bases: ABC

Abstract base class for TTS backends.

name: str = 'base'

sample_rate: int = 24000

narrator_voice: str = 'default'

abstractmethod get_voices()[source]

Return list of available voice IDs.

Return type:

list[str]

abstractmethod synthesize(text, voice)[source]

Synthesize text to audio.

Parameters:
  • text (str) – Text to synthesize

  • voice (str) – Voice ID to use

Returns:

Audio as float32 numpy array

Return type:

ndarray

generate_silence(duration_seconds)[source]

Generate silence of specified duration.

Parameters:

duration_seconds (float)

Return type:

ndarray
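
A sketch of a minimal concrete backend, assuming only the interface documented above; the sine-tone synthesis and the voice IDs are stand-ins for a real TTS engine:

import numpy as np

from dpo_reader.tts.base import TTSBackend


class BeepBackend(TTSBackend):
    """Toy backend that renders each text as a short sine tone."""

    name = "beep"
    sample_rate = 24000
    narrator_voice = "beep-low"

    def get_voices(self):
        # Voice IDs are plain strings chosen by the backend.
        return ["beep-low", "beep-high"]

    def synthesize(self, text, voice):
        # Tone length scales with text length; pitch depends on the voice.
        duration = max(0.2, 0.01 * len(text))
        freq = 220.0 if voice == "beep-low" else 440.0
        t = np.arange(int(duration * self.sample_rate)) / self.sample_rate
        return (0.1 * np.sin(2 * np.pi * freq * t)).astype(np.float32)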

class dpo_reader.tts.base.TTSGenerator[source]

Bases: object

High-level TTS generator with caching and progress tracking.

__init__(backend, voice_assignment, cache_dir=None, include_attribution=True, pause_between_posts=1.5, narrator_voice=None)[source]

Parameters:
  • backend (TTSBackend)

  • voice_assignment

  • cache_dir

  • include_attribution (bool)

  • pause_between_posts (float)

  • narrator_voice

generate_post(post)[source]

Generate audio for a post, using cache if available.

Uses narrator voice for attribution (“Author says:”) and the author’s assigned voice for actual content.

Parameters:

post (Post)

Returns:

Tuple of (audio_array, attribution_samples) where attribution_samples is the number of samples used for the “Author says:” portion.

Return type:

tuple[np.ndarray, int]
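
A construction sketch reusing the BeepBackend example above; the shape of voice_assignment (author name to voice ID), the cache_dir value, and the way a Post is obtained are assumptions for illustration, not a documented contract:

from dpo_reader.tts.base import TTSGenerator

generator = TTSGenerator(
    backend=BeepBackend(),                    # any concrete TTSBackend
    voice_assignment={"alice": "beep-high"},  # assumed: author -> voice ID
    cache_dir="tts_cache",                    # assumed: path-like is accepted
    include_attribution=True,
    pause_between_posts=1.5,
)

# post is a dpo_reader Post instance obtained elsewhere (e.g. from a parsed thread).
audio, attribution_samples = generator.generate_post(post)
print(f"{len(audio)} samples total, {attribution_samples} for the attribution")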

generate_all(posts, progress_callback=None, return_segments=False)[source]

Generate audio for all posts.

Parameters:
  • posts (list[Post]) – List of posts to convert

  • progress_callback (Callable[..., Any] | None) – Optional callback(current, total, post)

  • return_segments (bool) – If True, return (audio, segments) where segments contains start/end sample positions for each post

Returns:

Audio array, or tuple of (audio, segments) if return_segments=True

Return type:

np.ndarray | tuple[np.ndarray, list[dict]]
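
A sketch of batch generation with a progress callback, reusing the generator above; posts is assumed to be a list of Post objects, and the keys inside each segment dict are not specified here, so the example only prints them:

def on_progress(current, total, post):
    print(f"synthesizing post {current}/{total}")

audio, segments = generator.generate_all(
    posts,
    progress_callback=on_progress,
    return_segments=True,
)

# Each segment records the start/end sample positions for one post.
for segment in segments:
    print(segment)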

generate_streaming(posts, progress_callback=None)[source]

Generate audio segments one at a time (yields as generated).

Yields:

Tuple of (audio_chunk, segment_info, post_index, total_posts)

Parameters:
  • posts (list[Post])

  • progress_callback (Callable[..., Any] | None)
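
A streaming sketch: chunks are consumed as they are produced (here simply collected and concatenated at the end); whether post_index is zero-based is an assumption:

import numpy as np

pieces = []
for audio_chunk, segment_info, post_index, total_posts in generator.generate_streaming(posts):
    print(f"segment {post_index} of {total_posts}: {len(audio_chunk)} samples")
    pieces.append(audio_chunk)

full_audio = np.concatenate(pieces) if pieces else np.zeros(0, dtype=np.float32)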