Stars
A complete academic research Skill suite. Supports Claude Code, ChatGPT / Codex CLI, and Gemini CLI.
A lightweight wake word detection engine. Train custom, high-accuracy models with minimal effort.
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
A framework for efficient model inference with omni-modality models
The most accurate natural language detection library for Python, suitable for short text and mixed-language text
A Python script that converts Romaji to Hiragana and/or Katakana
A fusion of a linear layer and a cross-entropy loss, written for PyTorch in Triton.
A nearly-live implementation of OpenAI's Whisper.
Faster Whisper transcription with CTranslate2
Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection
A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/abs/1905.11235.
Collection of articles, books, videos and other things I found useful for those interested in the topic.
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Toward a multi-modality language model: an implementation of GPT-4o/Project Astra
Inference of resemble denoiser
Various speech datasets made available to the public
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
Transcription, forced alignment, and audio indexing with OpenAI's Whisper
Download books from pubu.com.tw without buying them
Code for paper "Vocabulary Learning via Optimal Transport for Neural Machine Translation"


