v25.12.33 #1508

Merged
ROBERT-MCDOWELL merged 12 commits into DrewThomasson:v25 from ROBERT-MCDOWELL:v25
Jan 10, 2026

Conversation

@ROBERT-MCDOWELL
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings January 10, 2026 02:51
@ROBERT-MCDOWELL ROBERT-MCDOWELL merged commit f309ec6 into DrewThomasson:v25 Jan 10, 2026
2 checks passed
Copilot AI left a comment

Pull request overview

This pull request (v25.12.33) introduces significant refactoring to the TTS (Text-to-Speech) engine architecture and audio processing pipeline. The changes focus on improving GPU/device handling, VTT subtitle generation, and text processing logic.

Changes:

  • Refactored GPU policy handling to support multiple device types (CUDA, ROCm, MPS, XPU) with improved AMP dtype selection
  • Moved VTT subtitle generation from inline processing to a deferred batch operation using audio file analysis
  • Extracted common TTS engine methods (_set_voice, _convert_sml) to a shared utility base class
  • Added text merging logic for very short rows in the join_ideogramms function
  • Fixed template parameter name in time format string substitution
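
To make the "text merging logic for very short rows" item concrete, here is a minimal sketch of what such a pass could look like. It is not the code from lib/core.py: the function name `merge_short_rows`, the `min_chars` threshold, and the rule of folding a short row into the preceding one are all assumptions made for illustration.

```python
from typing import List


def merge_short_rows(rows: List[str], min_chars: int = 3, sep: str = "") -> List[str]:
    """Fold rows shorter than min_chars into a neighbouring row.

    Hypothetical helper: the threshold and the merge direction are
    assumptions, not taken from the pull request.
    """
    merged: List[str] = []
    for row in rows:
        text = row.strip()
        if not text:
            continue  # drop empty rows entirely
        if merged and len(text) < min_chars:
            # Short row: append it to the previous row
            merged[-1] = merged[-1] + sep + text
        else:
            merged.append(text)
    # A short first row has no predecessor, so fold it into the next row
    if len(merged) > 1 and len(merged[0]) < min_chars:
        merged[1] = merged[0] + sep + merged[1]
        merged.pop(0)
    return merged
```

For ideographic text an empty `sep` keeps the characters contiguous; for space-delimited languages a caller would pass `sep=" "`.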

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 19 comments.

| File | Description |
| --- | --- |
| lib/gradio.py | Refactored event handler to chain voice list updates; simplified return values for change_gr_fine_tuned_list |
| lib/core.py | Added text merging pass, fixed time format parameter, refactored convert_chapters2audio to track sentences separately, extracted get_audio_duration function |
| lib/classes/tts_manager.py | Added create_sentences2vtt method to support deferred VTT generation |
| lib/classes/tts_engines/*.py | Removed per-sentence VTT tracking, added torch.autocast for GPU acceleration, refactored to use common _set_voice/_convert_sml methods |
| lib/classes/tts_engines/common/utils.py | Replaced _apply_cuda_policy with comprehensive _apply_gpu_policy supporting multiple devices; added _build_vtt_file for batch VTT generation; extracted _set_voice and _convert_sml methods |
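
The deferred VTT generation described above (measure each sentence's audio file, then write all cues in one pass) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's `_build_vtt_file` or `get_audio_duration`: the helper names `get_wav_duration`, `format_timestamp`, and `build_vtt` are hypothetical, and the sketch assumes one WAV file per sentence, read with the standard-library `wave` module.

```python
import wave
from typing import List


def get_wav_duration(path: str) -> float:
    """Return the duration of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, hh:mm:ss.ttt."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_vtt(sentences: List[str], durations: List[float]) -> str:
    """Build WebVTT text with one cue per sentence, laid end to end.

    Assumes the sentences play back to back with no gaps, so each cue
    starts where the previous one ended.
    """
    if len(sentences) != len(durations):
        raise ValueError("sentences and durations must have the same length")
    lines = ["WEBVTT", ""]
    start = 0.0
    for text, duration in zip(sentences, durations):
        end = start + duration
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # blank line terminates the cue
        start = end
    return "\n".join(lines)
```

A caller would collect `durations = [get_wav_duration(p) for p in sentence_files]` after synthesis finishes and write the result of `build_vtt(...)` to a `.vtt` file. Doing this once at the end keeps timing logic out of the individual TTS engines.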
