nexint-mem is the dedicated memory management subsystem for the NexInt AI Agent runtime. It treats Large Language Model (LLM) context windows as a finite resource, applying Operating System principles—such as paging, virtual memory, and LRU swapping—to manage agent state efficiently.
By abstracting raw text tokens into "pages" of memory, nexint-mem allows agents to maintain long-running conversations and complex reasoning chains without overflowing the context window or incurring excessive reallocation costs.
- Token Paging Architecture: Breaks down agent context into fixed-size "pages," enabling fine-grained control over what stays in the active context window (RAM) versus what gets swapped to disk or vector storage.
- Zero-Copy Context Management: Heavily utilizes `std::borrow::Cow` to manage string data. Static prompts remain borrowed, while dynamic responses are owned, minimizing unnecessary allocations.
- LRU/LFU Eviction Strategies: Implements Least Recently Used and Least Frequently Used algorithms to automatically retire old context when the token limit is reached, ensuring the most relevant information remains accessible.
- Strict Error Handling: All memory operations return explicit `Result` types (using `thiserror`), guaranteeing that allocation failures or out-of-bounds access are handled gracefully without panicking.
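As a standalone illustration of the zero-copy idea above (using only the standard library, not nexint-mem's actual types), `Cow` lets static prompts stay borrowed while runtime input carries an owned buffer. The helper names here are hypothetical:

```rust
use std::borrow::Cow;

// Hypothetical helpers: wrap text without forcing an allocation.
// Static prompts stay Cow::Borrowed; runtime input becomes Cow::Owned.
fn wrap_static(text: &'static str) -> Cow<'static, str> {
    Cow::Borrowed(text)
}

fn wrap_dynamic(text: String) -> Cow<'static, str> {
    Cow::Owned(text)
}

fn main() {
    let sys = wrap_static("You are a helpful AI assistant.");
    let user = wrap_dynamic(String::from("Explain quantum physics."));

    // Borrowed variants allocate nothing; owned variants own their buffer.
    assert!(matches!(sys, Cow::Borrowed(_)));
    assert!(matches!(user, Cow::Owned(_)));
    println!("{} / {}", sys, user);
}
```

Because both variants deref to `&str`, downstream code can consume either uniformly while only dynamic data pays for an allocation.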
nexint-mem introduces the concept of a Memory Management Unit (MMU) for LLMs:
- Page Table: Maps virtual token indices to physical storage locations (Active Context vs. Swap).
- Context Frame: A contiguous block of tokens representing a distinct thought or dialogue turn.
- Swapper: Handles the serialization of cold pages to persistent storage when memory pressure is high.
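The page-table/swapper split can be sketched with standard-library types. This is a minimal LRU model, not nexint-mem's real API; the struct and method names are illustrative:

```rust
use std::collections::{HashMap, VecDeque};

// Illustrative page table: active pages live "in context" (RAM),
// evicted pages go to a swap map standing in for disk/vector storage.
struct MiniMmu {
    capacity: usize,              // max pages in the active window
    active: HashMap<u32, String>, // page id -> content (active context)
    lru: VecDeque<u32>,           // front = least recently used
    swap: HashMap<u32, String>,   // cold storage
}

impl MiniMmu {
    fn new(capacity: usize) -> Self {
        Self { capacity, active: HashMap::new(), lru: VecDeque::new(), swap: HashMap::new() }
    }

    fn allocate(&mut self, id: u32, content: String) {
        if self.active.len() == self.capacity {
            // Memory pressure: swap out the least recently used page.
            if let Some(victim) = self.lru.pop_front() {
                if let Some(cold) = self.active.remove(&victim) {
                    self.swap.insert(victim, cold);
                }
            }
        }
        self.active.insert(id, content);
        self.lru.push_back(id);
    }

    fn touch(&mut self, id: u32) {
        // Mark a page as recently used so it survives eviction longer.
        if let Some(pos) = self.lru.iter().position(|&p| p == id) {
            self.lru.remove(pos);
            self.lru.push_back(id);
        }
    }
}

fn main() {
    let mut mmu = MiniMmu::new(2);
    mmu.allocate(1, "system prompt".into());
    mmu.allocate(2, "turn 1".into());
    mmu.touch(1);                     // page 1 is now most recent
    mmu.allocate(3, "turn 2".into()); // evicts page 2, the LRU
    assert!(mmu.swap.contains_key(&2));
    assert!(mmu.active.contains_key(&1) && mmu.active.contains_key(&3));
}
```

A production MMU would also track token counts per page and serialize evicted pages, but the eviction order shown here is the core of the LRU policy.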
Add this crate to your `Cargo.toml` (the example below also uses `anyhow` for its return type):

```toml
[dependencies]
nexint-mem = { path = "../nexint-mem" }
thiserror = "1.0"
anyhow = "1.0"
```

```rust
use nexint_mem::{MemoryManager, ContextPage, PageId};
use std::borrow::Cow;

fn main() -> anyhow::Result<()> {
    // Initialize Memory Manager with a 4096-token hard limit
    let mut mmu = MemoryManager::new(4096);

    // Create a new context page (e.g., system prompt)
    // Using Cow::Borrowed avoids allocating new strings for static data
    let sys_prompt = ContextPage::new(
        PageId::next(),
        Cow::Borrowed("You are a helpful AI assistant."),
    );

    // Allocate the page in the active window
    mmu.allocate(sys_prompt)?;

    // Simulate a conversation turn
    let user_input = String::from("Explain quantum physics.");
    let user_page = ContextPage::new(
        PageId::next(),
        Cow::Owned(user_input), // Owned data for dynamic input
    );

    // Attempt allocation; if the window is full, the eviction policy triggers
    if let Err(e) = mmu.allocate(user_page) {
        eprintln!("Memory allocation failed: {}", e);
        // Handle eviction or summary compression here
    }

    // Iterate over active context.
    // active_pages() returns an iterator, a zero-cost abstraction over the
    // underlying storage.
    let full_context: String = mmu.active_pages()
        .map(|page| page.content.as_ref())
        .collect();
    println!("Current Context: {}", full_context);

    Ok(())
}
```

Adhering to NexInt core standards:
- No Implicit Allocations: Avoid `.to_string()` on slices unless ownership is strictly required. Prefer `Cow<'a, str>`.
- No Index-Based Loops: Use iterators (`.iter()`, `.map()`, `.filter()`) to traverse context pages. Direct index access is discouraged to avoid bounds-check overhead.
- Panic Freedom: Never call `unwrap()` on memory allocation. Always handle the `Result` to decide whether to crash, retry, or evict data.
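The panic-freedom rule can be illustrated with a small standalone sketch. The error type and allocator below are hypothetical; in the crate the error would derive `thiserror::Error` with an `#[error("...")]` attribute, which is spelled out by hand here so the sketch compiles with the standard library alone:

```rust
use std::fmt;

// Illustrative error type; real nexint-mem variants may differ.
// With thiserror this would be `#[derive(Error)]` plus `#[error("...")]`.
#[derive(Debug)]
enum MemError {
    OutOfTokens { used: usize, limit: usize },
}

impl fmt::Display for MemError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MemError::OutOfTokens { used, limit } => {
                write!(f, "context window full: {used}/{limit} tokens")
            }
        }
    }
}

impl std::error::Error for MemError {}

// Hypothetical allocator: reports failure instead of panicking.
fn try_allocate(used: usize, limit: usize, cost: usize) -> Result<usize, MemError> {
    if used + cost > limit {
        Err(MemError::OutOfTokens { used, limit })
    } else {
        Ok(used + cost)
    }
}

fn main() {
    // Handle the Result explicitly: retry, evict, or degrade; never unwrap().
    match try_allocate(4000, 4096, 200) {
        Ok(used) => println!("allocated, now at {} tokens", used),
        Err(e) => eprintln!("allocation failed: {}", e),
    }
}
```

The caller decides the recovery strategy at the `match`, which is exactly the decision `unwrap()` would silently replace with a panic.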