UI-TARS is an open-source multimodal “GUI agent” created by ByteDance: a model designed to perceive raw screenshots (or rendered UI frames), reason about what needs to be done, and then perform real interactions with graphical user interfaces (GUIs), such as clicking, typing, and navigating menus, across desktop, browser, mobile, and game environments. Rather than relying on rigid, manually scripted UI automation, UI-TARS uses a unified vision-language model (VLM) that integrates perception, reasoning, grounding, and action in one end-to-end framework. It “thinks before acting,” which enables flexible, general-purpose automation. This allows it to perform complex, multi-step tasks such as filling in forms, downloading files, navigating applications, and even controlling in-game actions, all by interpreting the UI visually, as a human would. The project supports local or remote deployment and offers a foundation for building GUI automation agents that are more robust and adaptable than script-based approaches.

Features

  • Vision-language model-based GUI agent: perceives raw screenshots and reasons about UI context
  • Unified action space: supports clicks, typing, gestures, and hotkeys across desktop, browser, mobile, and games
  • “Think-then-act” decision-making: performs internal reasoning (task decomposition, planning, reflection) before executing actions
  • Cross-platform GUI control: works across different operating systems, browsers, and application contexts
  • End-to-end automation: capable of carrying out full workflows (forms, downloads, navigation, game controls) without custom scripts per UI
  • Open-source, with published inference scripts and models that enable reproducibility and customization
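
The “think-then-act” loop and unified action space described above can be illustrated with a minimal Python sketch. This is not UI-TARS's actual API: the names `Action`, `Step`, `run_agent`, and `scripted_policy` are hypothetical, and the policy is a scripted stand-in for the vision-language model so that the example runs without a model or a real screen.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    """One step in a hypothetical unified action space (not the UI-TARS schema)."""
    kind: str                                # "click", "type", "hotkey", or "done"
    point: Optional[Tuple[int, int]] = None  # screen coordinates for a click
    text: Optional[str] = None               # text to type, or key name to press


@dataclass
class Step:
    thought: str    # reasoning produced before acting
    action: Action  # action chosen after that reasoning


# A policy maps (task, screenshot bytes, history so far) to the next Step.
# In a real agent this would be a call to the vision-language model.
Policy = Callable[[str, bytes, List[Step]], Step]


def run_agent(
    task: str,
    policy: Policy,
    capture: Callable[[], bytes],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Step]:
    """Perceive, reason, then act, until the policy reports completion."""
    history: List[Step] = []
    for _ in range(max_steps):
        screenshot = capture()                    # perceive the current screen
        step = policy(task, screenshot, history)  # think, then choose an action
        history.append(step)
        if step.action.kind == "done":
            break
        execute(step.action)                      # act on the GUI
    return history


def scripted_policy(task: str, screenshot: bytes, history: List[Step]) -> Step:
    """Stand-in for the model: replays a fixed plan, one step per call."""
    plan = [
        Step("The search box is empty, so click it first.", Action("click", point=(400, 120))),
        Step("The box has focus, so type the query.", Action("type", text="UI-TARS")),
        Step("The query is entered, so submit with Enter.", Action("hotkey", text="enter")),
        Step("Results are visible, so the task is complete.", Action("done")),
    ]
    return plan[min(len(history), len(plan) - 1)]


executed: List[Action] = []
trace = run_agent(
    task="Search for UI-TARS",
    policy=scripted_policy,
    capture=lambda: b"",      # placeholder screenshot bytes
    execute=executed.append,  # record actions instead of driving a real GUI
)
for step in trace:
    print(f"{step.thought} -> {step.action.kind}")
```

In a real deployment, `capture` would grab a screenshot and `execute` would dispatch to a platform-specific input backend; keeping both behind plain callables is one way to let the same loop serve desktop, browser, mobile, or game environments.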

License

Apache License V2.0

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Python

Related Categories

Python Artificial Intelligence Software

Registered

2025-12-01