A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 837 141 Updated Apr 15, 2026

A profiling and performance analysis tool for machine learning

C++ 500 84 Updated Apr 15, 2026

PyTorch Single Controller

Rust 1,012 157 Updated Apr 15, 2026

A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python 4,091 306 Updated Apr 13, 2026

Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)

Python 493 60 Updated Apr 3, 2026

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).

Python 424 63 Updated Jan 5, 2026

Lightweight, standalone C++ inference engine for Google's Gemma models.

C++ 6,853 624 Updated Apr 14, 2026

A simplified and automated orchestration workflow to perform ML end-to-end (E2E) model tests and benchmarking on Cloud VMs across different frameworks.

Python 61 63 Updated Apr 14, 2026

🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.

Python 17 13 Updated Jun 5, 2025

Generate snapshots and rankings of monthly committer and issue/PR activity

Shell 49 7 Updated Nov 11, 2024

Policy and data administration, distribution, and real-time updates on top of Policy Agents (OPA, Cedar, ...)

Python 5,446 273 Updated Apr 6, 2026

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 4,171 782 Updated Apr 15, 2026

A C++ standalone library for machine learning

C++ 5,440 503 Updated Feb 23, 2026

Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry lead…

Python 549 71 Updated Apr 9, 2026

A Python-level JIT compiler designed to make unmodified PyTorch programs faster.

Python 1,078 128 Updated Apr 17, 2024

Enabling PyTorch on Google TPU

C++ 2 1 Updated Feb 16, 2023

A performant and modular runtime for TensorFlow

C++ 754 123 Updated Sep 4, 2025

A list of awesome compiler projects and papers for tensor computation and deep learning.

1 Updated Jul 27, 2021

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,737 324 Updated Oct 19, 2024

Development repository for the Triton language and compiler

MLIR 18,942 2,762 Updated Apr 15, 2026

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

LLVM 37,866 16,855 Updated Apr 15, 2026

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 36,649 5,142 Updated Apr 15, 2026

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 99,135 27,493 Updated Apr 15, 2026

Model parallel transformers in JAX and Haiku

Python 6,366 884 Updated Jan 21, 2023

Hummingbird compiles trained ML models into tensor computation for faster inference.

Python 3,534 292 Updated Jul 17, 2025

Google Cloud TPU Utilization Bar for Training Models

Python 7 2 Updated Dec 23, 2020

Babysit your preemptible TPUs

Python 86 15 Updated Dec 3, 2022

Your PyTorch AI Factory - Flash enables you to easily configure and run complex AI recipes for over 15 tasks across 7 data domains

Python 1,729 212 Updated Oct 8, 2023

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 159,397 32,872 Updated Apr 15, 2026

PyTorch extensions for high performance and large scale training.

Python 3,405 296 Updated Apr 26, 2025