Merged
Pull request overview
This pull request (v25.12.33) introduces significant refactoring to the TTS (Text-to-Speech) engine architecture and audio processing pipeline. The changes focus on improving GPU/device handling, VTT subtitle generation, and text processing logic.
Changes:
- Refactored GPU policy handling to support multiple device types (CUDA, ROCm, MPS, XPU) with improved AMP dtype selection
- Moved VTT subtitle generation from inline processing to a deferred batch operation using audio file analysis
- Extracted common TTS engine methods (_set_voice, _convert_sml) to a shared utility base class
- Added text-merging logic for very short rows in the join_ideogramms function
- Fixed template parameter name in time format string substitution
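
To make the first bullet concrete, here is a minimal sketch of how a device-aware AMP dtype choice could look. It is not the code from this pull request: `AmpPolicy` and `pick_amp_policy` are hypothetical names, the dtype preferences (bfloat16 when supported, otherwise float16, and no autocast on CPU) are assumptions, and dtypes are returned as plain strings so the example runs without PyTorch installed. In real code the result would presumably be mapped to `torch.bfloat16` / `torch.float16` and passed to `torch.autocast(device_type=..., dtype=...)`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AmpPolicy:
    """Hypothetical container for the autocast settings of one device."""
    device_type: str          # value intended for torch.autocast(device_type=...)
    amp_dtype: Optional[str]  # "bfloat16", "float16", or None (autocast disabled)


def pick_amp_policy(device: str, bf16_supported: bool = False) -> AmpPolicy:
    """Choose an autocast dtype from a device string such as "cuda:0".

    The preference order below is an assumption for illustration only.
    """
    device_type = device.split(":", 1)[0].lower()

    # ROCm builds of PyTorch expose AMD GPUs under the "cuda" device type.
    if device_type == "rocm":
        device_type = "cuda"

    if device_type in ("cuda", "xpu"):
        return AmpPolicy(device_type, "bfloat16" if bf16_supported else "float16")
    if device_type == "mps":
        return AmpPolicy(device_type, "float16")

    # CPU and unknown devices: keep full precision.
    return AmpPolicy(device_type, None)
```

A caller would query its backend for bfloat16 support once, pass the result in as `bf16_supported`, and skip the autocast context entirely when `amp_dtype` is `None`.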
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| lib/gradio.py | Refactored event handler to chain voice list updates; simplified return values for change_gr_fine_tuned_list |
| lib/core.py | Added text merging pass, fixed time format parameter, refactored convert_chapters2audio to track sentences separately, extracted get_audio_duration function |
| lib/classes/tts_manager.py | Added create_sentences2vtt method to support deferred VTT generation |
| lib/classes/tts_engines/*.py | Removed per-sentence VTT tracking, added torch.autocast for GPU acceleration, refactored to use common _set_voice/_convert_sml methods |
| lib/classes/tts_engines/common/utils.py | Replaced _apply_cuda_policy with comprehensive _apply_gpu_policy supporting multiple devices; added _build_vtt_file for batch VTT generation; extracted _set_voice and _convert_sml methods |
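
The deferred VTT generation described in the table can be sketched as follows. This is an illustration under stated assumptions, not the project's `_build_vtt_file`: the helper names `format_vtt_time` and `build_vtt` are hypothetical, and the per-sentence durations are passed in directly, whereas the pull request reportedly derives them by analysing the generated audio files.

```python
from typing import List, Tuple


def format_vtt_time(seconds: float) -> str:
    """Format a time offset as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_vtt(sentences: List[Tuple[str, float]]) -> str:
    """Build WebVTT text from (sentence, duration_in_seconds) pairs.

    Cues are laid end to end: each cue starts where the previous one ended.
    """
    lines = ["WEBVTT", ""]
    start = 0.0
    for text, duration in sentences:
        end = start + duration
        lines.append(f"{format_vtt_time(start)} --> {format_vtt_time(end)}")
        lines.append(text)
        lines.append("")  # blank line terminates the cue
        start = end
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_vtt([("Hello.", 1.5), ("World.", 2.25)]))
```

Building all cues in one pass after synthesis keeps timing logic out of the individual TTS engines, which is consistent with the removal of per-sentence VTT tracking noted in the table.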
No description provided.