dpo_reader.tts.base

Base TTS backend interface.

Functions

fix_pronunciations(text) – Apply pronunciation fixes to text before TTS.

split_into_chunks(text[, max_chars]) – Split text into chunks suitable for TTS.

Classes

TTSBackend – Abstract base class for TTS backends.

TTSGenerator – High-level TTS generator with caching and progress tracking.

dpo_reader.tts.base.fix_pronunciations(text)[source]

Apply pronunciation fixes to text before TTS.

Parameters:

text (str)

Return type:

str

dpo_reader.tts.base.split_into_chunks(text, max_chars=250)[source]

Split text into chunks suitable for TTS.

Parameters:
  • text (str)

  • max_chars (int)

Return type:

list[str]
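
A brief usage sketch of the two helpers above; the sample text and the printed summary are illustrative, not part of the API:

from dpo_reader.tts.base import fix_pronunciations, split_into_chunks

# Normalize tricky words first, then break the result into TTS-sized pieces.
raw_text = "DPO stands for Direct Preference Optimization. " * 20
cleaned = fix_pronunciations(raw_text)

# max_chars defaults to 250 characters per chunk.
chunks = split_into_chunks(cleaned, max_chars=250)
print(f"{len(chunks)} chunks, longest is {max(len(c) for c in chunks)} chars")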

class dpo_reader.tts.base.TTSBackend[source]

Bases: ABC

Abstract base class for TTS backends.

name: str = 'base'

sample_rate: int = 24000

narrator_voice: str = 'default'

abstractmethod get_voices()[source]

Return list of available voice IDs.

Return type:

list[str]

abstractmethod synthesize(text, voice)[source]

Synthesize text to audio.

Parameters:
  • text (str) – Text to synthesize

  • voice (str) – Voice ID to use

Returns:

Audio as float32 numpy array

Return type:

ndarray

generate_silence(duration_seconds)[source]

Generate silence of specified duration.

Parameters:

duration_seconds (float)

Return type:

ndarray
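
A sketch of a minimal concrete backend, assuming only the interface documented above; the sine-tone synthesis and the voice IDs are stand-ins for a real TTS engine:

import numpy as np

from dpo_reader.tts.base import TTSBackend


class BeepBackend(TTSBackend):
    """Toy backend that renders each text as a short sine tone."""

    name = "beep"
    sample_rate = 24000
    narrator_voice = "beep-low"

    def get_voices(self):
        # Voice IDs are plain strings chosen by the backend.
        return ["beep-low", "beep-high"]

    def synthesize(self, text, voice):
        # Tone length scales with text length; pitch depends on the voice.
        duration = max(0.2, 0.01 * len(text))
        freq = 220.0 if voice == "beep-low" else 440.0
        t = np.arange(int(duration * self.sample_rate)) / self.sample_rate
        return (0.1 * np.sin(2 * np.pi * freq * t)).astype(np.float32)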

class dpo_reader.tts.base.TTSGenerator[source]

Bases: object

High-level TTS generator with caching and progress tracking.

__init__(backend, voice_assignment, cache_dir=None, include_attribution=True, pause_between_posts=1.5, narrator_voice=None)[source]

Parameters:
  • backend (TTSBackend)

  • voice_assignment

  • cache_dir

  • include_attribution (bool)

  • pause_between_posts (float)

  • narrator_voice

generate_post(post)[source]

Generate audio for a post, using cache if available.

Uses narrator voice for attribution (“Author says:”) and the author’s assigned voice for actual content.

Parameters:

post (Post)

Returns:

Tuple of (audio_array, attribution_samples) where attribution_samples is the number of samples used for the “Author says:” portion.

Return type:

tuple[np.ndarray, int]
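
A construction sketch reusing the BeepBackend example above; the shape of voice_assignment (author name to voice ID), the cache_dir value, and the way a Post is obtained are assumptions for illustration, not a documented contract:

from dpo_reader.tts.base import TTSGenerator

generator = TTSGenerator(
    backend=BeepBackend(),                    # any concrete TTSBackend
    voice_assignment={"alice": "beep-high"},  # assumed: author -> voice ID
    cache_dir="tts_cache",                    # assumed: path-like is accepted
    include_attribution=True,
    pause_between_posts=1.5,
)

# post is a dpo_reader Post instance obtained elsewhere (e.g. from a parsed thread).
audio, attribution_samples = generator.generate_post(post)
print(f"{len(audio)} samples total, {attribution_samples} for the attribution")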

generate_all(posts, progress_callback=None, return_segments=False)[source]

Generate audio for all posts.

Parameters:
  • posts (list[Post]) – List of posts to convert

  • progress_callback (Callable[..., Any] | None) – Optional callback(current, total, post)

  • return_segments (bool) – If True, return (audio, segments) where segments contains start/end sample positions for each post

Returns:

Audio array, or tuple of (audio, segments) if return_segments=True

Return type:

np.ndarray | tuple[np.ndarray, list[dict]]
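
A sketch of batch generation with a progress callback, reusing the generator above; posts is assumed to be a list of Post objects, and the keys inside each segment dict are not specified here, so the example only prints them:

def on_progress(current, total, post):
    print(f"synthesizing post {current}/{total}")

audio, segments = generator.generate_all(
    posts,
    progress_callback=on_progress,
    return_segments=True,
)

# Each segment records the start/end sample positions for one post.
for segment in segments:
    print(segment)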

generate_streaming(posts, progress_callback=None)[source]

Generate audio segments one at a time (yields as generated).

Yields:

Tuple of (audio_chunk, segment_info, post_index, total_posts)

Parameters:
  • posts (list[Post])

  • progress_callback (Callable[..., Any] | None)
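
A streaming sketch: chunks are consumed as they are produced (here simply collected and concatenated at the end); whether post_index is zero-based is an assumption:

import numpy as np

pieces = []
for audio_chunk, segment_info, post_index, total_posts in generator.generate_streaming(posts):
    print(f"segment {post_index} of {total_posts}: {len(audio_chunk)} samples")
    pieces.append(audio_chunk)

full_audio = np.concatenate(pieces) if pieces else np.zeros(0, dtype=np.float32)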