ManimML is a project focused on providing animations and visualizations of common machine learning concepts with the Manim Community Library.

Python 3,369 201 Updated Jun 22, 2024

Official training and inference code for VBVR (A Very Big Video Reasoning Suite)

Python 8 Updated Apr 9, 2026
Python 1 Updated Mar 26, 2026

[LightX2V](https://x2v.light-ai.top) integration for [OpenClaw](https://openclaw.ai) — image generation (t2i/i2i), video (t2v/i2v/s2v), TTS, and voice clone via cloud API.

Shell 8 Updated Mar 10, 2026

[CVPR2026] ConsistCompose: Unified Multimodal Layout Control for Image Composition

Python 10 2 Updated Apr 1, 2026

[CVPR 2026] EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents

Python 130 4 Updated Mar 10, 2026

Reinforcement Learning Framework for Visual Generation

Python 105 4 Updated Feb 13, 2026

In our implementation of Qwen-Image-Edit, we employ block causal attention to improve inference speed.

Python 48 2 Updated Feb 16, 2026

This is a collection of recent papers on reasoning in video generation models.

148 5 Updated Mar 30, 2026

Agentic LaTeX Writer - Local-first editor for AI-assisted academic writing

TypeScript 109 12 Updated Feb 23, 2026

[ICLR 2026] Official Code for "The Quest for Generalizable Motion Generation: Data, Model, and Evaluation"

Python 87 3 Updated Mar 25, 2026

NEO Series: Native Vision-Language Models from First Principles

Python 701 25 Updated Mar 23, 2026

[CVPR 2026] OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Python 156 7 Updated Mar 30, 2026

[CVPR2026] Scaling Spatial Intelligence with Multimodal Foundation Models

Python 198 11 Updated Apr 10, 2026

An open-source evaluation toolkit to evaluate MLLMs on Spatial Intelligence using the EASI protocol

Python 18 Updated Feb 13, 2026

Holistic Evaluation of Multimodal LLMs on Spatial Intelligence

Python 98 7 Updated Apr 3, 2026

This is a framework for evaluating reasoning in foundational video models.

Python 87 7 Updated Apr 1, 2026

Speech2Motion is a real-time streaming system that converts speech input into synchronized 3D character animations. The system provides intelligent motion matching based on speech content, keywords…

Python 4 Updated Feb 28, 2026

Audio2Face is a real-time audio-to-face animation service that converts streaming audio input into synchronized facial animation data. The system uses advanced machine learning models to extract au…

Python 11 Updated Jan 14, 2026

Orchestrator is a real-time intelligent conversation system for building personalized multimodal AI interaction workflows, including speech recognition (ASR), text conversation (LLM), text-to-speec…

Python 6 2 Updated Mar 2, 2026

Open-source Autonomous 3D Characters on the Web

TypeScript 221 22 Updated Jan 15, 2026

A simple, unified multimodal models training engine. Lean, flexible, and built for hacking at scale.

Python 757 35 Updated Apr 9, 2026

Audio-driven Digital Human Generation Model

41 Updated Sep 14, 2025

🎖️ A collection of badges for your project's README

259 35 Updated Feb 14, 2026

ComfyUI custom node for lightx2v

Python 79 7 Updated Apr 8, 2026

Qwen-Image-Lightning: Speed up Qwen-Image model with distillation

Python 1,286 44 Updated Jan 1, 2026

Wan2.2-Lightning: Speed up wan2.2 model with distillation

Python 282 17 Updated Nov 7, 2025

OpenXRLab Data Visualization Toolbox

JavaScript 6 Updated Jun 30, 2025