Open Source Rust Large Language Models (LLM)

Browse free, open source Rust large language model (LLM) projects below.

  • 1
    llmfit

    157 models, 30 providers, one command to find what runs on your hardware

    llmfit is a terminal-based utility that helps developers determine which large language models can realistically run on their local hardware by analyzing system resources and model requirements. The tool automatically detects CPU, RAM, GPU, and VRAM specifications, then ranks available models based on performance factors such as speed, quality, and memory fit. It provides both an interactive terminal user interface and a traditional CLI mode, enabling flexible workflows for different user preferences. llmfit also supports advanced configurations including multi-GPU setups, mixture-of-experts architectures, and dynamic quantization recommendations. By presenting clear performance estimates and compatibility guidance, the project reduces the trial-and-error typically involved in local LLM experimentation. Overall, llmfit serves as a practical decision assistant for developers who want to run language models efficiently on their own machines.
    Downloads: 44 This Week
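The "memory fit" idea in the llmfit entry can be sketched in a few lines of Rust. This is only an illustration, not llmfit's actual scoring: the `ModelSpec` type, the function names, and the flat 20% overhead factor are all assumptions made for this example.

```rust
/// Hypothetical model description; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelSpec {
    pub name: &'static str,
    pub params_billions: f64,
    pub bits_per_weight: f64,
}

/// Rough memory footprint in GiB: parameters * bits / 8, plus an assumed
/// flat 20% overhead for the KV cache and runtime buffers.
pub fn estimated_gib(model: &ModelSpec) -> f64 {
    let weight_bytes = model.params_billions * 1e9 * model.bits_per_weight / 8.0;
    weight_bytes * 1.2 / (1024.0 * 1024.0 * 1024.0)
}

/// Returns the models whose estimate fits in `available_gib`, largest first.
pub fn models_that_fit(models: &[ModelSpec], available_gib: f64) -> Vec<ModelSpec> {
    let mut fitting: Vec<ModelSpec> = models
        .iter()
        .filter(|m| estimated_gib(m) <= available_gib)
        .cloned()
        .collect();
    // Prefer the largest model that still fits.
    fitting.sort_by(|a, b| b.params_billions.total_cmp(&a.params_billions));
    fitting
}
```

Calling `models_that_fit(&models, 16.0)` with a 7B 4-bit, a 13B 8-bit, and a 70B 4-bit model would keep the first two and rank the 13B model first under these assumptions.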
  • 2
    HASH

    The best way to use and work with blocks

    This is HASH's public monorepo, which contains our public code, docs, and other key resources. HASH is a platform for decision-making that helps you integrate, understand, and use data in a variety of ways. HASH does this by combining several powerful tools in one simple interface. These range from data pipelines and a graph database to an all-in-one workspace, no-code tool builder, and agent-based simulation engine. These exist at varying stages of maturity, and while some are polished, not all are ready for real-world production use. You can read more about our big-picture vision at hash.dev.
    Downloads: 10 This Week
  • 3
    rtk

    CLI proxy that reduces LLM token consumption

    rtk is an open-source command-line proxy designed to optimize interactions between AI coding agents and the terminal by reducing unnecessary token consumption. When AI assistants execute shell commands during software development tasks, the resulting terminal output often contains large amounts of repetitive or irrelevant information that can overwhelm the model’s context window. rtk intercepts these command outputs and compresses them into concise summaries before sending them to the language model. This process preserves important information while removing redundant data such as boilerplate logs, long directory listings, or repetitive test outputs. By minimizing the amount of noise sent to the AI model, the tool improves reasoning quality and allows longer development sessions within the same context window. The system is implemented as a lightweight Rust binary that runs locally and integrates easily with common AI coding environments.
    Downloads: 6 This Week
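One simple way to shrink command output before it reaches a model is to collapse repeated lines and cap the total length. The sketch below illustrates that general idea only; it is not rtk's implementation, and `compress_output` is a hypothetical name.

```rust
/// Collapses runs of identical consecutive lines into one line with a repeat
/// count, then keeps at most `max_lines` lines and notes how many were dropped.
/// Illustrative only; not rtk's actual algorithm.
pub fn compress_output(raw: &str, max_lines: usize) -> String {
    let mut collapsed: Vec<String> = Vec::new();
    let mut lines = raw.lines().peekable();
    while let Some(line) = lines.next() {
        // Count how many times this exact line repeats in a row.
        let mut count = 1;
        while lines.peek() == Some(&line) {
            lines.next();
            count += 1;
        }
        if count > 1 {
            collapsed.push(format!("{line} (x{count})"));
        } else {
            collapsed.push(line.to_string());
        }
    }
    let dropped = collapsed.len().saturating_sub(max_lines);
    collapsed.truncate(max_lines);
    if dropped > 0 {
        collapsed.push(format!("... {dropped} more lines omitted"));
    }
    collapsed.join("\n")
}
```

A real tool would likely apply command-specific rules (for example, summarizing test results differently from directory listings); this sketch treats all output the same way.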
  • 4
    gptcommit

    A git prepare-commit-msg hook for authoring commit messages with GPT-3

    gptcommit is a git prepare-commit-msg hook for authoring commit messages with GPT-3. It generates clear, comprehensive, and descriptive commit messages so you can focus on writing code. To use gptcommit, simply run git commit as you normally would. The hook automatically generates a commit message using a large language model like GPT. If you're not satisfied with the generated message, you can edit it before committing. By default, gptcommit uses the GPT-3 model, so make sure you have sufficient credits in your OpenAI account. Commit messages are a key channel for developers to communicate their work to others, especially in code reviews. When making complex code changes, it can be tedious to document the contents of each change thoroughly, and it is tempting to title a commit “fix bug” and move on. Surfacing these changes with gptcommit helps both author and reviewer by drawing attention to them.
    Downloads: 2 This Week
  • 5
    mistral.rs

    Fast, flexible LLM inference

    mistral.rs is a fast and flexible LLM inference engine implemented in Rust, designed to run and serve modern language models with an emphasis on performance and practical deployment. It provides multiple entry points for developers, including a CLI for running models locally and an HTTP server that exposes an OpenAI-compatible API surface for easy integration with existing clients. The project includes hardware-aware tooling that can benchmark a system and choose sensible quantization and device-mapping strategies, helping users get strong performance without manual tuning. It also supports serving multiple models from the same server process, enabling routing or quick switching between models depending on workload needs. For user-facing testing, mistral.rs can provide a built-in web UI, and it also offers a dedicated lightweight web chat interface that supports richer interaction patterns.
    Downloads: 2 This Week
  • 6
    MusicGPT

    Generate music based on natural language prompts using LLMs

    MusicGPT is an open-source application designed to generate music from natural language prompts using locally executed artificial intelligence models. The software allows users to run advanced music generation systems directly on their own devices without requiring heavy dependencies such as Python or full machine learning frameworks. Instead, it provides a lightweight environment capable of executing music generation models locally on CPUs or GPUs while maintaining strong performance across operating systems including Windows, macOS, and Linux. Users can describe a musical style, mood, or instrumentation using text prompts, and the system produces original audio samples based on those instructions. The application currently integrates with models such as MusicGen and is designed to support additional models transparently in the future. In addition to a command-line interface, the project includes a web-based interface that enables conversational interaction with the AI model.
    Downloads: 1 This Week
  • 7
    BAML

    The AI framework that adds the engineering to prompt engineering

    BAML is an open-source framework and domain-specific language designed to bring structured engineering practices to prompt development for large language model applications. Instead of treating prompts as unstructured text, BAML introduces a schema-driven approach where prompts are defined as typed functions with explicit inputs and outputs. This design allows developers to treat language model interactions as predictable software components rather than ad-hoc prompt strings. The framework enables developers to define prompt logic in a dedicated language while integrating it into applications written in various programming languages such as Python, TypeScript, Ruby, and Go. BAML also allows developers to specify which models are used for each prompt and how outputs should be validated or structured. By converting prompt engineering into a more formal programming workflow, the framework improves reliability, debugging, and maintainability of AI systems.
    Downloads: 0 This Week
  • 8
    Cake

    Distributed LLM and StableDiffusion inference

    Cake is a Rust framework for distributed inference of large models such as Llama 3 and Stable Diffusion, built on the Candle machine learning library. It shards a model across multiple devices so that each node holds and runs only part of the network, which makes it possible to run models that would not fit in the memory of any single machine. The project aims to turn a mix of consumer hardware, including iOS, Android, macOS, Linux, and Windows devices, into a heterogeneous inference cluster.
    Downloads: 0 This Week
  • 9
    Extractous

    Fast and efficient unstructured data extraction

    Extractous is a Rust-based unstructured data extraction library focused on fast local parsing of documents and other content-heavy files. Its purpose is to extract text and metadata efficiently from formats such as PDF, Word, HTML, email archives, images, and more, without depending on external APIs or separate parsing servers. The project emphasizes performance and low memory usage, and its maintainers describe it as a local-first alternative to heavier extraction stacks. For broader format support, the system combines its Rust core with ahead-of-time compiled Apache Tika shared libraries, which allows it to extend parsing coverage while still avoiding traditional server-based overhead. It also supports OCR for images and scanned documents through Tesseract, making it useful for document ingestion pipelines that include image-based or scanned inputs.
    Downloads: 0 This Week
  • 10
    Floneum

    Instant, controllable, local pre-trained AI models in Rust

    Floneum is an open-source platform for building AI-powered workflows using large language models through a visual and extensible interface. The system allows users to design complex AI pipelines using a drag-and-drop workflow builder rather than writing extensive code. It focuses on enabling developers and researchers to create language model applications that combine different tools, data sources, and AI capabilities into automated workflows. Floneum supports a plugin architecture that allows external components to extend the platform while maintaining isolation and security. Many plugins can be written in different programming languages and compiled to WebAssembly modules, allowing them to run safely within the system. The platform is implemented primarily in Rust and emphasizes performance, modularity, and local execution.
    Downloads: 0 This Week
  • 11
    Korvus

    Korvus is a search SDK that unifies the entire RAG pipeline

    Korvus is an open-source retrieval-augmented generation (RAG) pipeline designed to run entirely inside PostgreSQL, allowing developers to build AI search and knowledge systems directly within a database environment. The project consolidates the typical steps of a RAG pipeline—including embedding generation, document retrieval, reranking, and text generation—into a single query executed within the Postgres ecosystem. By leveraging PostgresML and vector extensions such as pgvector, Korvus eliminates the need for external microservices typically used for AI search architectures, reducing both system complexity and latency. The architecture enables machine learning operations to occur directly in the database, minimizing data transfer between services and improving overall performance for large datasets.
    Downloads: 0 This Week
  • 12
    LangChain Rust

    LangChain for Rust, the easiest way to write LLM-based programs

    LangChain Rust is an open-source Rust implementation inspired by the LangChain ecosystem for building applications powered by large language models. The library aims to provide Rust developers with a structured framework for orchestrating prompts, chains, agents, and external tools within LLM-driven workflows. By adapting LangChain concepts to the Rust programming language, the project emphasizes performance, safety, and efficient memory management. Developers can use the framework to build chatbots, autonomous agents, and knowledge-augmented AI systems that interact with external data sources. The library provides abstractions for model providers, prompt templates, conversation memory, and vector search integrations. It also enables the construction of multi-step pipelines where LLM outputs feed into subsequent actions or tool calls.
    Downloads: 0 This Week
  • 13
    Paddler

    Open-source LLM load balancer and serving platform for hosting LLMs

    Paddler is an open-source LLM infrastructure platform designed to deploy, manage, and scale large language models on private infrastructure. The system acts as a specialized load balancer and serving layer for language models, enabling organizations to run inference workloads without relying on external API providers. It supports running models locally through engines such as llama.cpp while distributing requests across multiple compute nodes to improve performance and reliability. The architecture is designed with privacy and cost control in mind, making it suitable for organizations that handle sensitive data or require predictable operational costs. Paddler also includes tools for monitoring, request buffering, and autoscaling integration so that deployments can adapt dynamically to changing workloads. A built-in administrative interface allows developers and operations teams to manage models, observe system performance, and test inference endpoints.
    Downloads: 0 This Week
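Distributing requests across compute nodes, as the Paddler entry describes, often comes down to picking the node with the most free capacity. The following is a minimal sketch of that routing policy under assumed names (`Node`, `pick_node`, and the slot fields are hypothetical); it does not reflect Paddler's real data structures or balancing strategy.

```rust
/// Hypothetical inference node with a fixed number of request slots.
#[derive(Debug)]
pub struct Node {
    pub name: &'static str,
    pub slots_total: usize,
    pub slots_busy: usize,
}

/// Picks the node with the most free slots, marks one slot as busy, and
/// returns the node's name. Returns `None` when every node is saturated,
/// which is where a real balancer would buffer or reject the request.
pub fn pick_node(nodes: &mut [Node]) -> Option<&'static str> {
    let node = nodes
        .iter_mut()
        .filter(|n| n.slots_busy < n.slots_total)
        .max_by_key(|n| n.slots_total - n.slots_busy)?;
    node.slots_busy += 1;
    Some(node.name)
}
```

A production balancer would also release slots when requests finish and track node health; both are left out here to keep the routing decision visible.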
  • 14
    SmartGPT

    A program that provides LLMs with ability to complete complex tasks

    SmartGPT is an experimental autonomous agent framework built to help large language models tackle complex tasks with minimal or no additional user input. It works by decomposing larger objectives into smaller steps and gathering information from the internet and other outside sources as needed. The project is written in Rust and emphasizes modularity, allowing developers to compose different “Autos” depending on the workflow they want to build. Its architecture separates responsibility between a dynamic agent that reasons about what to do next and a static agent that plans and executes tool chains in a defined order. The repository describes this approach as a way to improve flexibility and consistency compared with simpler agent loops, while still acknowledging that the project is highly experimental and not focused on backward compatibility.
    Downloads: 0 This Week
  • 15
    aigc

    An e-book about the real-world application of LLM

    "Building Large Language Model Applications: Application Development and Architecture Design" is an open source e-book about the real-world application of LLMs. It introduces the basics and applications of large language models, as well as how to build your own models. Topics include writing, developing, and managing prompts, exploring what the best large language models can offer, and pattern and architecture design for LLM application development.
    Downloads: 0 This Week
  • 16
    gptee

    LLMs done the UNIX-y way

    gptee outputs a language model's completion, using standard input as the prompt, and now supports GPT-3.5 chat completions. It was designed for use within shell scripts and other programs and also works in interactive shells, so you can compose commands and execute them in a script. Proceed with caution before running arbitrary shell scripts. With a chat completion model (like gpt-3.5-turbo), you can inject a system message with -s or --system. For davinci and other non-chat models, the output is prefixed to the prompt. By default gptee uses gpt-3.5-turbo, and a custom model can be selected instead.
    Downloads: 0 This Week
  • 17
    llm

    An ecosystem of Rust libraries for working with large language models

    llm is an ecosystem of Rust libraries for working with large language models - it's built on top of the fast, efficient GGML library for machine learning. The primary entry point for developers is the llm crate, which wraps the llm-base and the supported model crates. Documentation for the released version is available on Docs.rs. For end-users, there is a CLI application, llm-cli, which provides a convenient interface for interacting with supported models. Text generation can be done as a one-off based on a prompt, or interactively, through REPL or chat modes. The CLI can also be used to serialize (print) decoded models, quantize GGML files, or compute the perplexity of a model. It can be downloaded from the latest GitHub release or by installing it from crates.io.
    Downloads: 0 This Week
  • 18
    llm-chain

    Rust crate for building chains in large language models

    llm-chain is a collection of Rust crates that make working with large language models easier, so you can focus on building AI applications. You can create reusable, customizable prompt templates for consistent and structured interactions with LLMs, and build chains of prompts that execute more complex tasks step by step. The crates integrate with LLaMA models, Meta's (formerly Facebook's) research models, and also support Stanford's Alpaca models. You can extend your AI agents by giving them access to tools such as running Bash commands, executing Python scripts, or performing web searches.
    Downloads: 0 This Week
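The combination of prompt templates and step-by-step chains described for llm-chain can be illustrated without any LLM dependency. In this sketch the `{{name}}` placeholder syntax, the `render` and `run_chain` functions, and the closure standing in for a model are all assumptions for illustration; none of this is the llm-chain API.

```rust
use std::collections::HashMap;

/// Replaces every `{{key}}` placeholder in `template` with its value.
/// The placeholder syntax is an assumption made for this sketch.
pub fn render(template: &str, vars: &HashMap<&str, String>) -> String {
    let mut out = template.to_string();
    for (key, value) in vars {
        out = out.replace(&format!("{{{{{key}}}}}"), value);
    }
    out
}

/// Runs the templates in order, feeding each step's output into the next
/// step's `{{input}}` placeholder. `model` stands in for a real LLM call.
pub fn run_chain<F>(templates: &[&str], input: &str, mut model: F) -> String
where
    F: FnMut(&str) -> String,
{
    let mut current = input.to_string();
    for template in templates {
        let mut vars = HashMap::new();
        vars.insert("input", current.clone());
        current = model(&render(template, &vars));
    }
    current
}
```

Passing a closure instead of a concrete client keeps the chaining logic testable; a real application would swap in a function that calls its chosen model provider.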
  • 19
    lumen

    Beautiful git diff viewer, generate commits with AI

    Lumen is an open-source command-line developer tool that enhances Git workflows by combining advanced diff visualization with AI-powered code assistance. The tool provides an ergonomic interface for reviewing code changes directly in the terminal, offering syntax-highlighted diffs and structured output to make change analysis easier. In addition to displaying differences between commits, Lumen integrates AI services that can explain code changes, generate commit messages, and assist with Git operations. The platform supports multiple AI providers, allowing developers to connect to models from services such as OpenAI, Claude, Groq, or locally hosted inference engines. It also includes interactive exploration features that allow users to search through commits and understand the history of changes in a repository. Because it runs entirely from the command line, the tool integrates seamlessly into existing Git workflows without requiring graphical interfaces or additional IDE plugins.
    Downloads: 0 This Week
  • 20
    pgvecto.rs

    Vector database plugin for Postgres, written in Rust

    pgvecto.rs is a Postgres extension that provides vector similarity search functions. It is written in Rust and based on pgrx. It is currently under heavy development, so take care when using it in production. Because pgvecto.rs is a Postgres extension, you can use it directly within your existing database, which makes it easy to integrate into existing workflows and applications. pgvecto.rs also supports filtering: you can set conditions when searching or retrieving points, a feature its maintainers describe as missing from other Postgres extensions.
    Downloads: 0 This Week
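Filtered vector search means applying a condition to the candidate points and ranking only the survivors by distance. The in-memory sketch below shows that concept in plain Rust; it is not pgvecto.rs's SQL interface or index implementation, and the `Point` type and `filtered_nearest` function are hypothetical.

```rust
/// Hypothetical stored point: an id, a filterable attribute, and an embedding.
#[derive(Debug, Clone)]
pub struct Point {
    pub id: u32,
    pub category: &'static str,
    pub embedding: Vec<f32>,
}

/// Squared Euclidean distance between two vectors of equal length.
pub fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Brute-force filtered search: keep points in `category`, then return the
/// `k` closest to `query`. A real extension would use an index instead.
pub fn filtered_nearest<'a>(
    points: &'a [Point],
    query: &[f32],
    category: &str,
    k: usize,
) -> Vec<&'a Point> {
    let mut candidates: Vec<&Point> = points
        .iter()
        .filter(|p| p.category == category)
        .collect();
    candidates.sort_by(|a, b| {
        squared_l2(&a.embedding, query).total_cmp(&squared_l2(&b.embedding, query))
    });
    candidates.truncate(k);
    candidates
}
```

Filtering before ranking keeps the result set correct when the condition is selective; ranking first and filtering afterwards can return fewer than `k` matches.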
  • 21
    uzu

    A high-performance inference engine for AI models

    uzu is a high-performance inference engine designed to run artificial intelligence models efficiently on Apple Silicon hardware. Written primarily in Rust and leveraging Apple’s Metal framework, the project focuses on maximizing performance when executing large language models and other AI workloads on devices such as Mac computers with M-series chips. The engine implements a hybrid architecture in which model layers can be executed either as custom GPU kernels or through Apple’s MPSGraph API, allowing it to balance performance and compatibility depending on the workload. By utilizing Apple’s unified memory architecture, uzu reduces memory copying overhead and improves inference throughput for local AI workloads. The system includes a simple high-level API that enables developers to run models, create inference sessions, and generate outputs with minimal configuration.
    Downloads: 0 This Week