Stars
Official Code of Memento: Fine-tuning LLM Agents without Fine-tuning LLMs
AI coding assistant skill (Claude Code, Codex, OpenCode, Cursor, Gemini CLI, GitHub Copilot CLI, OpenClaw, Factory Droid, Trae, Google Antigravity). Turn any folder of code, docs, papers, images, o…
Academic Research Skills for Claude Code: research → write → review → revise → finalize
AI agents running research on single-GPU nanochat training automatically
This project includes code for using the WebGym framework to train web agentic models.
Co-evolving policy actors and experience extractors for efficient experience-driven agent RL
General technology for enabling AI capabilities w/ LLMs and MLLMs
OpenClaw-RL: Train any agent simply by talking
The absolute trainer to light up AI agents.
Reinforcement Learning via Self-Distillation (SDPO)
Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning
Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix etc. Blog: mlengineer.io.
CL-bench: A Benchmark for Context Learning
Awesome In-Context RL: A curated list of In-Context Reinforcement Learning
A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning
Defeating the Training-Inference Mismatch via FP16
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Accompanying code for "Discovering State-of-the-art Reinforcement Algorithms" Nature publication
[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)
slime is an LLM post-training framework for RL Scaling.
Building Open LLM Web Agents with Self-Evolving Online Curriculum RL



