View drogozhang's full-sized avatar
🐢
Running

Highlights

  • Pro

Organizations

@OSU-NLP-Group @open-vision-language @MMMU-Benchmark

Starred repositories

[ACL'26 Findings] The Model Agreed, But Didn’t Learn: Diagnosing Surface Compliance in Large Language Models

2 Updated Apr 8, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 358,294 72,831 Updated Apr 16, 2026

Code and data for the paper "Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation"

Python 9 Updated Feb 4, 2026

Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework

Python 272 20 Updated Jan 17, 2026

[NeurIPS 2025] CamSAM2: Segment Anything Accurately in Camouflaged Videos

Python 18 1 Updated Nov 19, 2025

Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.

Python 853 96 Updated Apr 13, 2026

AeroSpace is an i3-like tiling window manager for macOS

Swift 20,271 488 Updated Apr 14, 2026

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Python 183 21 Updated Mar 27, 2026

[NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge

Python 108 6 Updated Feb 28, 2026

Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers

HTML 13,880 1,368 Updated Apr 30, 2025

Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"

Python 1,068 117 Updated Mar 4, 2024

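A collection repository of AIGC-interview/CV-interview/LLMs-interview questions and answers, also including new ideas, new questions, new resources, and new projects from work and research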

2,823 248 Updated Mar 5, 2026

Machine Learning and Computer Vision Engineer - Technical Interview Questions

4,560 745 Updated Jan 24, 2026
Python 31 1 Updated Jun 9, 2025

🔥 A list of tools, frameworks, and resources for building AI web agents

Python 1,389 158 Updated Apr 8, 2026

Code for paper "Is Extending Modality The Right Path Towards Omni-Modality?"

Python 13 Updated Jun 3, 2025

[ICLR'26 Oral] RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments

Python 48 8 Updated Feb 9, 2026

PhyX: Does Your Model Have the "Wits" for Physical Reasoning?

Python 52 1 Updated Mar 16, 2026

[NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model

Python 65 3 Updated Apr 6, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 76,801 15,657 Updated Apr 16, 2026

🌎💪 BrowserGym, a Gym environment for web task automation

Python 1,195 167 Updated Mar 17, 2026

ALFWorld: Aligning Text and Embodied Environments for Interactive Learning

Python 711 84 Updated Feb 8, 2026

[TMLR'26] UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models

Python 54 4 Updated Mar 24, 2026

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,800 169 Updated Feb 27, 2026
HTML 6 Updated Oct 14, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 2,618 218 Updated Apr 14, 2026

Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capab…

JavaScript 7,862 1,792 Updated May 26, 2025

Pioneering Automated GUI Interaction with Native Agents

Python 10,089 732 Updated Jan 27, 2026

[ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

Python 135 19 Updated Mar 5, 2026