saiful islam (saiful9379)

🎯 Focusing

An Enthusiastic AI | Deep Learning | ML Researcher

56 followers · 34 following

https://hishab.co
Gulshan-2 House 4A, Rd 96, Dhaka 1212
https://saiful9379.github.io
https://huggingface.co/saiful9379
in/saiful-islam-907128ba

Irodori-TTS Public
Forked from Aratako/Irodori-TTS

A Flow Matching-based Text-to-Speech Model with Emoji-driven Style Control

Python MIT License Updated Feb 27, 2026
personaplex Public
Forked from NVIDIA/personaplex

PersonaPlex code.

Python MIT License Updated Jan 16, 2026
ART Public
Forked from OpenPipe/ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python Apache License 2.0 Updated Sep 6, 2025
faster-whisper Public
Forked from SYSTRAN/faster-whisper

Faster Whisper transcription with CTranslate2

Python MIT License Updated Aug 16, 2025
auto-tuning-vllm Public
Forked from openshift-psap/auto-tuning-vllm

PSAP auto-tuning for vllm (vllm+guidellm+optuna)

Python Apache License 2.0 Updated Aug 12, 2025
S3Tokenizer Public
Forked from xingchensong/S3Tokenizer

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python Apache License 2.0 Updated Jun 16, 2025
ai-agents-for-beginners Public
Forked from microsoft/ai-agents-for-beginners

11 Lessons to Get Started Building AI Agents

Jupyter Notebook MIT License Updated May 21, 2025
QwenTTS Public

Python 3 Updated May 13, 2025
sesame-explorations Public
Forked from thomwolf/sesame-explorations

Python Apache License 2.0 Updated Apr 28, 2025
LLaSA_training Public
Forked from zhenye234/LLaSA_training

LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python Other Updated Apr 8, 2025
X-Codec-2.0 Public
Forked from zhenye234/X-Codec-2.0

Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python MIT License Updated Mar 12, 2025
awesome-japanese-nlp-resources Public
Forked from taishi-i/awesome-japanese-nlp-resources

A curated list of resources dedicated to Python libraries, LLMs, dictionaries, and corpora of NLP for Japanese

Creative Commons Zero v1.0 Universal Updated Feb 17, 2025
Zonos Public
Forked from sardorb3k/Zonos

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …

Python Apache License 2.0 Updated Feb 16, 2025
nanospeech Public
Forked from lucasnewman/nanospeech

A simple, hackable text-to-speech system in PyTorch and MLX

Python MIT License Updated Feb 14, 2025
seed-vc Public
Forked from Plachtaa/seed-vc

zero-shot voice conversion & singing voice conversion, with real-time support

Python GNU General Public License v3.0 Updated Feb 14, 2025
misaki Public
Forked from hexgrad/misaki

G2P

Python Apache License 2.0 Updated Feb 6, 2025
DeepSeek-R1 Public
Forked from deepseek-ai/DeepSeek-R1

MIT License Updated Jan 26, 2025
Auralis Public
Forked from astramind-ai/Auralis

A Fast TTS Engine

Python Other Updated Dec 23, 2024
CosyVoice Public
Forked from FunAudioLLM/CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python Apache License 2.0 Updated Dec 18, 2024
avro.py Public
Forked from hitblast/avro.py

⚡ A modern Pythonic implementation of Avro Phonetic.

Python MIT License Updated Nov 27, 2024
mini-omni2 Public
Forked from gpt-omni/mini-omni2

Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python MIT License Updated Nov 6, 2024
audioseal Public
Forked from facebookresearch/audioseal

Localized watermarking for AI-generated speech audios, with SOTA on robustness and very fast detector

Python MIT License Updated Oct 26, 2024
NeMo-text-processing Public
Forked from NVIDIA/NeMo-text-processing

NeMo text processing for ASR and TTS

Python Apache License 2.0 Updated Oct 23, 2024
LLaMA-Omni Public
Forked from menon92/LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python Apache License 2.0 Updated Sep 24, 2024
BigVGAN Public
Forked from NVIDIA/BigVGAN

Official PyTorch implementation of BigVGAN (ICLR 2023)

Python MIT License Updated Sep 5, 2024
dataspeech Public
Forked from huggingface/dataspeech

Python MIT License Updated Sep 3, 2024
descript-audio-codec Public
Forked from descriptinc/descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python MIT License Updated Jul 11, 2024
xtts-finetune-tests Public
Forked from daswer123/xtts-finetune-tests

In this repository I will be running various experiments on fine-tuning different parts of XTTS

Python 1 MIT License Updated Jun 22, 2024
open-speech-corpora Public
Forked from coqui-ai/open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

MIT License Updated Jun 6, 2024
resume_ai Public

Introducing Smart Resume AI: Revolutionizing Resume Sorting and Job Matching

Python 4 Updated May 29, 2024