Stars
AutoGaze automatically removes redundant patches in a video, reducing #tokens in ViT/MLLM by 4x-100x.
EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video
[ICRA 2026] VITRA: Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
Syncthing-Fork - A Syncthing Wrapper for Android.
📚 Claude Code plugin that automates research papers study with automatic material generation, code demonstrations, and interactive web viewer.
Real-time global intelligence dashboard. AI-powered news aggregation, geopolitical monitoring, and infrastructure tracking in a unified situational awareness interface
Schmetzler / hamer_keypoints
Forked from geopavlakos/hamer
HaMeR: Reconstructing Hands in 3D with Transformers
This is a suite of tools/exploits that can be used with action/body cameras that use the Viidure application with the WiFi hotspot enabled. It can also run Gameboy games!
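⭐AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts. 🎯 Say goodbye to information overload: your AI public-opinion monitoring assistant and trending-topic filter! Aggregates trending topics from multiple platforms plus RSS subscriptions, with precise keyword filtering. AI-based news filtering, AI translation, and AI analysis briefings pushed straight to your phone; also supports integration with the MCP architecture…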
A collection of 100+ specialized Claude Code subagents covering a wide range of development use cases
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.
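JT808: JT808 protocol parsing. Supports TCP and UDP, is compatible in real time with the 2011, 2013, and 2019 versions of the protocol, and supports packet fragmentation. Supports the JT/T1078 audio/video protocol, the T/JSATL12 Jiangsu-standard active safety protocol, and the T/GDRTA002 Guangdong-standard active safety protocol, and supports encoding/decoding for Android clients.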
Protocol Buffers - Google's data interchange format
Download pictures (or videos) along with their captions and other metadata from Instagram.
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
A Python package providing additional service implementations for the Google ADK framework (S3, Redis, MongoDB, Azure, etc)
A powerful tool for creating datasets for LLM fine-tuning, RAG, and Eval
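🎬 VideoCaptioner (卡卡字幕助手) - An LLM-based intelligent subtitle assistant - End-to-end processing of video subtitle generation, sentence segmentation, correction, and subtitle translation! - An LLM-powered tool for easy and efficient video subtitling.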
Legacy Python library for Agentic Document Extraction (ADE). Use the landingai-ade library for all new projects.
A comprehensive collection of IQA papers
A Comprehensive Toolkit for High-Quality PDF Content Extraction
Implementation of my RAG system that won all categories in Enterprise RAG Challenge 2
HelixDB is an open-source graph-vector database built from scratch in Rust.
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
Generate impressive-looking terminal output to look busy when stakeholders walk by
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
Healthcare data standard in China
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each technique has a detailed notebook tutorial.
[NeurIPS 2023] The official repo for the paper: "Time Series as Images: Vision Transformer for Irregularly Sampled Time Series".




