skillrl

Automatic skill distillation, retrieval, and evolution from AI coding agent trajectories.

AI coding agents solve the same types of problems repeatedly but start from scratch every session. When Claude Code figures out the exact sequence to fix a race condition in your WebSocket server, that knowledge vanishes when the session ends.

skillrl builds a persistent, searchable skill memory from agent trajectories. It implements the SkillRL three-phase framework — distillation, retrieval, and evolution — so agents improve over time without fine-tuning.

npm install -g skillrl

skillrl ingest conversation.jsonl   # Extract skills from a session
skillrl index                       # Build the vector search index
skillrl retrieve "fix race condition in WebSocket handler"
# Found 5 relevant skills:
#   async_concurrency_guard (relevance: 78%)
#   websocket_lifecycle_management (relevance: 71%)
#   ...

How It Works

skillrl implements the three-phase framework from the SkillRL paper (arXiv:2602.08234v1):

                    Agent Trajectories
                   (Claude Code, Kiro, Cursor, OpenClaw)
                           |
            +--------------+--------------+
            |                             |
     Success Trajectories          Failed Trajectories
            |                             |
            v                             v
   +-------------------+        +-------------------+
   |  DISTILLATION     |        |  DISTILLATION     |
   |  Extract reusable |        |  Synthesize       |
   |  patterns from    |        |  corrective       |
   |  what worked      |        |  skills from      |
   |  (Section 3.1)    |        |  what failed      |
   +--------+----------+        +--------+----------+
            |                             |
            +----------+    +-------------+
                       |    |
                       v    v
              +------------------+
              |  SKILL BANK      |
              |  Persistent JSON |
              |  Sg (general)    |
              |  Sk (domain)     |
              +--------+---------+
                       |
          +------------+------------+
          |                         |
          v                         v
 +-------------------+     +-------------------+
 |  RETRIEVAL        |     |  EVOLUTION        |
 |  Embedding KNN    |     |  Refine from      |
 |  + domain filter  |     |  accumulated      |
 |  (Section 3.2)    |     |  failures         |
 +--------+----------+     |  (Section 3.3)    |
          |                 +--------+----------+
          v                          |
 +------------------+                |
 |  Agent Context   | <--------------+
 |  Skills injected |
 |  into next       |
 |  session         |
 +------------------+

Phase 1: Distillation

An LLM analyzes the trajectory and extracts structured skills. Each skill contains 10 fields: name, description, domain, category, instructions (step-by-step), prerequisites, anti-patterns, examples (with context/input/output), tags, and a confidence score (0.0-1.0).

Separate prompts handle success vs. failure trajectories. From success: "what patterns should be repeated?" From failure: "what should have been done differently?" New skills are deduplicated against the existing bank using an LLM call.

Phase 2: Retrieval

Retrieval is a two-tier search. The fast path uses pre-built embeddings (Gemini gemini-embedding-001, 256 dimensions) stored in SQLite and queried with sqlite-vec KNN search: one API call produces the query embedding, and a pure SQL vector similarity query does the rest. Domain partition keys enable pre-filtered search: querying with --domain react scans only React skills, not the entire index.

The fallback path (when no embedding index exists) uses keyword matching on skill name/description/tags, then LLM-based reranking for ambiguous cases. The retriever tries embeddings first and falls back transparently.
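The keyword stage of the fallback path can be pictured as a token-overlap score over each skill's name, description, and tags. The sketch below is an illustration only: `SkillSummary`, `keywordScore`, `rankByKeywords`, and the scoring rule itself are assumptions rather than skillrl's actual implementation, and the LLM reranking step is left out.

```typescript
// Hypothetical shape: only the fields the fallback path is described as matching on.
interface SkillSummary {
  id: string;
  name: string;
  description: string;
  tags: string[];
}

// Lowercase and split on anything that is not a letter or digit,
// so "async_concurrency_guard" becomes ["async", "concurrency", "guard"].
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Fraction of distinct query tokens found in the skill's name, description, or tags.
function keywordScore(query: string, skill: SkillSummary): number {
  const queryTokens = Array.from(new Set(tokenize(query)));
  if (queryTokens.length === 0) return 0;
  const haystack = new Set(
    tokenize([skill.name, skill.description, ...skill.tags].join(" "))
  );
  const hits = queryTokens.filter((token) => haystack.has(token)).length;
  return hits / queryTokens.length;
}

// Rank skills by score, dropping those with no overlap at all.
function rankByKeywords(query: string, skills: SkillSummary[]): SkillSummary[] {
  return skills
    .map((skill) => ({ skill, score: keywordScore(query, skill) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.skill);
}
```

Exact token matching is deliberately crude ("handler" does not match "handlers"), which is one reason a reranking pass is useful for ambiguous cases.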

Phase 3: Evolution

After accumulating failures, the evolution cycle analyzes patterns across failed trajectories, creates new skills to fill gaps, refines existing skills with updated instructions and confidence, and deprecates skills that consistently lead to poor outcomes.
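As a rough illustration of the deprecation step, the sketch below marks a skill deprecated once it has enough usage history and a low success rate. The field names come from the Skill metadata type documented in this README; the two thresholds and the `applyDeprecationRule` function are hypothetical, since the actual criteria are not stated here.

```typescript
// Subset of the Skill metadata fields relevant to deprecation.
interface SkillOutcomeStats {
  usageCount: number;
  successRate: number; // 0.0 - 1.0
  deprecated: boolean;
  deprecationReason: string | null;
}

// Hypothetical thresholds; the real values are not documented.
const MIN_USES_BEFORE_JUDGING = 5;
const MIN_ACCEPTABLE_SUCCESS_RATE = 0.3;

// Return updated stats, marking the skill deprecated only when there is
// enough usage history AND the success rate is below the floor.
function applyDeprecationRule(stats: SkillOutcomeStats): SkillOutcomeStats {
  if (stats.deprecated) return stats;
  const enoughHistory = stats.usageCount >= MIN_USES_BEFORE_JUDGING;
  const poorOutcomes = stats.successRate < MIN_ACCEPTABLE_SUCCESS_RATE;
  if (!(enoughHistory && poorOutcomes)) return stats;
  return {
    ...stats,
    deprecated: true,
    deprecationReason: `success rate ${(stats.successRate * 100).toFixed(0)}% over ${stats.usageCount} uses`,
  };
}
```

Requiring a minimum usage count keeps a single early failure from retiring a skill that has barely been tried.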

Skill Organization

Skills follow the paper's dual-pool architecture:

Sg (General): Broadly applicable patterns — "Incremental Verification Loop", "Hypothesis-Driven Debugging"
Sk (Task-Specific): Domain-bound techniques — "React State Management", "SQL Query Optimization"
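One way to picture the two pools is as a flat list of general skills plus a per-domain map of task-specific skills. The sketch below assumes only the `category` and `domain` fields from the Skill type; `PooledSkill`, `SkillPools`, and `splitIntoPools` are hypothetical names, and the bank's on-disk layout may differ.

```typescript
type SkillCategory = "general" | "task-specific";

interface PooledSkill {
  name: string;
  domain: string;
  category: SkillCategory;
}

interface SkillPools {
  general: PooledSkill[];               // Sg: broadly applicable patterns
  byDomain: Map<string, PooledSkill[]>; // Sk: task-specific skills keyed by domain
}

// Partition a flat skill list into the general pool and per-domain pools.
function splitIntoPools(skills: PooledSkill[]): SkillPools {
  const pools: SkillPools = { general: [], byDomain: new Map() };
  for (const skill of skills) {
    if (skill.category === "general") {
      pools.general.push(skill);
    } else {
      const bucket = pools.byDomain.get(skill.domain) ?? [];
      bucket.push(skill);
      pools.byDomain.set(skill.domain, bucket);
    }
  }
  return pools;
}
```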

Quick Start

1. Configure Credentials

Gemini (default)

Get a free API key from Google AI Studio:

skillrl config YOUR_GEMINI_API_KEY

# Or environment variable
export GEMINI_API_KEY=your_api_key

Amazon Bedrock

npm install @aws-sdk/client-bedrock-runtime

export AWS_BEARER_TOKEN_BEDROCK=your_bedrock_api_key
export AWS_REGION=us-east-1

Claude (Anthropic)

export ANTHROPIC_API_KEY=your_api_key

2. Ingest a Trajectory

skillrl init                                     # Create empty skill bank
skillrl ingest conversation.jsonl                # Ingest Claude Code session
# Distilled 3 skills:
#   incremental_verification_loop [general] (confidence: 98%)
#   silent_failure_recovery [general] (confidence: 90%)
#   ...

skillrl index                                    # Build embedding index

3. Retrieve Skills

skillrl retrieve "implement JWT authentication with refresh tokens"
# Found 5 relevant skills:
#   auth_token_lifecycle (relevance: 74%)
#   ...

skillrl retrieve "fix race condition" --domain typescript

4. Export for Your IDE

skillrl export kiro-power --output ./power       # Kiro Power bundle
skillrl export skill-md --output SKILL.md        # Claude Code
skillrl export cursorrules --output .cursorrules  # Cursor

CLI Reference

Commands

Command	Description
`skillrl init`	Initialize an empty skill bank
`skillrl ingest <file>`	Parse a trajectory and extract skills
`skillrl ingest --stdin`	Read trajectory from stdin/pipe
`skillrl index`	Build/rebuild the embedding index
`skillrl retrieve "<task>"`	Find relevant skills by semantic search
`skillrl evolve <file>`	Run evolution cycle on failed trajectories
`skillrl export <format>`	Export skill bank (5 formats)
`skillrl list`	Display all skills with metadata
`skillrl stats`	Show skill bank statistics
`skillrl import <file>`	Merge skills from another bank
`skillrl config [api-key]`	Show or configure settings
`skillrl test`	Test LLM provider connection

Options

Option	Description
`--provider, -p`	LLM provider: `gemini` (default), `bedrock`, `claude`
`--model, -m`	Model alias or full ID (see Model Configuration)
`--bank-path`	Custom skill bank path (default: `.skillrl/bank.json`)
`--domain`	Filter by domain (e.g., `typescript`, `react`, `python`)
`--source`	Trajectory source: `claude-code`, `kiro`, `cursor`, `openclaw`, `custom`
`--output, -o`	Output path for exports
`--verbose, -v`	Show detailed output

Trajectory Formats

Format	Description	Auto-Detected
JSONL	Claude Code conversation logs	Yes
JSON	Structured `{ task, steps, outcome }` object	Yes
Kiro	Section-delimited logs with `---` markers	Yes
Text	Unstructured agent logs	Yes
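Auto-detection along these lines could be done with a few content checks. The heuristics below are assumptions for illustration (skillrl's real detection logic is not documented here), and `detectTrajectoryFormat` is a hypothetical name.

```typescript
type TrajectoryFormat = "jsonl" | "json" | "kiro" | "text";

// JSON.parse that reports failure as undefined instead of throwing.
function tryParse(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return undefined;
  }
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function detectTrajectoryFormat(raw: string): TrajectoryFormat {
  // A single JSON object with a "steps" field: structured { task, steps, outcome }.
  const whole = tryParse(raw);
  if (isRecord(whole) && "steps" in whole) return "json";

  const lines = raw
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  // Every non-empty line is its own JSON object: conversation log in JSONL.
  if (lines.length > 0 && lines.every((line) => isRecord(tryParse(line)))) return "jsonl";

  // Section-delimited log with "---" markers.
  if (lines.some((line) => line === "---")) return "kiro";

  // Anything else is treated as unstructured text.
  return "text";
}
```

The checks run from most to least specific, so unstructured text is only the last resort.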

MCP Server

skillrl includes an MCP server with 8 tools for direct agent integration.

Setup

Add to your MCP configuration (~/.claude.json, project .mcp.json, or equivalent):

{
  "mcpServers": {
    "skillrl": {
      "command": "npx",
      "args": ["-y", "skillrl-mcp"],
      "env": {
        "GEMINI_API_KEY": "your_api_key"
      }
    }
  }
}

For Bedrock:

{
  "mcpServers": {
    "skillrl": {
      "command": "npx",
      "args": ["-y", "skillrl-mcp"],
      "env": {
        "RLM_PROVIDER": "bedrock",
        "AWS_BEARER_TOKEN_BEDROCK": "your_key",
        "AWS_REGION": "us-east-1"
      }
    }
  }
}

Tools

Tool	Description
`skill_ingest`	Parse trajectory and extract skills
`skill_retrieve`	Semantic search for relevant skills
`skill_evolve`	Run evolution from failed trajectories
`skill_export`	Export to kiro-power, skill-md, cursorrules, markdown, or json
`skill_list`	List skills with domain/category/confidence filters
`skill_bank_stats`	Skill bank statistics
`skill_config`	Current configuration
`skill_index`	Build/rebuild embedding index

All tool inputs are validated with Zod schemas. The agent receives ranked skills with relevance scores, step-by-step instructions, and anti-patterns injected into its context.

Programmatic API

ESM-only ("type": "module"). Requires Node.js >= 18.

Factory Function

import { createSkillManager } from 'skillrl';

const manager = createSkillManager({
  provider: 'gemini',
  bankPath: '.skillrl/bank.json',
});

// Distill skills from a trajectory
const result = await manager.distiller.distill(trajectory);
console.log(`Extracted ${result.skills.length} skills`);

// Retrieve relevant skills
const retrieval = await manager.retriever.retrieve(
  'implement OAuth2 with PKCE flow',
  'typescript'
);
for (const { skill, relevanceScore } of retrieval.skills) {
  console.log(`${skill.name} (${(relevanceScore * 100).toFixed(0)}%)`);
}

// Evolve from failures
const evolution = await manager.evolver.evolve(failedTrajectories);
console.log(`New: ${evolution.newSkills.length}, Refined: ${evolution.refinedSkills.length}`);

Direct Class Usage

import { SkillBankManager, EmbeddingManager } from 'skillrl';

// Load the skill bank
const bank = new SkillBankManager({ bankPath: '.skillrl/bank.json' });
bank.load();
const skills = bank.listSkills();
console.log(`${skills.length} skills loaded`);

// Embedding index (SQLite + sqlite-vec)
const emb = new EmbeddingManager({ bankPath: '.skillrl/bank.json' });
await emb.load();

// KNN search with domain filtering
const results = await emb.search('handle authentication errors', {
  topK: 5,
  threshold: 0.3,
  domain: 'typescript',
});

for (const { skillId, score } of results) {
  console.log(`${skillId}: ${(score * 100).toFixed(1)}% match`);
}

emb.close();

Export

import { getExporter } from 'skillrl';

const exporter = getExporter('kiro-power');
const result = await exporter.export(bank.getBank(), {
  format: 'kiro-power',
  outputPath: './power',
  domain: 'typescript',
  minConfidence: 0.6,
});

Sub-Path Exports

import { resolveModelConfig } from 'skillrl/models';
import { getApiKey, detectProvider } from 'skillrl/config';
import type { LLMProvider } from 'skillrl/providers';

Embedding Index

The embedding index uses SQLite + sqlite-vec for hardware-accelerated KNN search.

Schema

CREATE TABLE index_metadata (
  key TEXT PRIMARY KEY, value TEXT NOT NULL
);

CREATE TABLE skill_metadata (
  skill_id TEXT PRIMARY KEY,
  text TEXT NOT NULL,
  domain TEXT NOT NULL DEFAULT 'general',
  model TEXT NOT NULL,
  dimensions INTEGER NOT NULL,
  updated_at TEXT NOT NULL
);
CREATE INDEX idx_skill_metadata_domain ON skill_metadata(domain);

CREATE VIRTUAL TABLE vec_skill_embeddings USING vec0(
  skill_id TEXT PRIMARY KEY,
  domain TEXT partition key,
  embedding float[256] distance_metric=cosine
);

How Search Works

1. The query string is embedded via gemini-embedding-001 (one API call, 256-dimensional vector).
2. The embedding is passed to sqlite-vec's MATCH operator with the requested k (top-K) and an optional domain partition filter.
3. sqlite-vec performs an exact (brute-force) K-nearest-neighbor scan over the selected partition(s) using cosine distance.
4. Results are converted to a similarity score: similarity = 1 - distance.

-- With domain filter (searches only the "react" partition):
SELECT skill_id, distance FROM vec_skill_embeddings
WHERE embedding MATCH ?1 AND k = ?2 AND domain = 'react'
ORDER BY distance;

-- Without domain filter (searches all partitions):
SELECT skill_id, distance FROM vec_skill_embeddings
WHERE embedding MATCH ?1 AND k = ?2
ORDER BY distance;
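To make the distance-to-similarity conversion concrete, here is a small sketch of cosine distance and the conversion applied to KNN rows. `KnnRow`, `ScoredMatch`, and `toScoredMatches` are hypothetical names, and applying the `threshold` to the similarity score (rather than the raw distance) is an assumption.

```typescript
// Cosine distance: 1 - cos(angle between a and b). Assumes non-zero vectors.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface KnnRow {
  skillId: string;
  distance: number; // as returned by the SQL query, smaller is closer
}

interface ScoredMatch {
  skillId: string;
  score: number; // similarity = 1 - distance, larger is closer
}

// Convert distances to similarity scores and drop weak matches.
function toScoredMatches(rows: KnnRow[], threshold: number): ScoredMatch[] {
  return rows
    .map((row) => ({ skillId: row.skillId, score: 1 - row.distance }))
    .filter((match) => match.score >= threshold);
}
```

Identical directions give a distance of 0 (similarity 1), and orthogonal vectors give a distance of 1 (similarity 0).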

Configuration

Setting	Value
Embedding model	`gemini-embedding-001` (256 dimensions, float32)
Distance metric	Cosine
Journal mode	WAL (concurrent readers, single writer)
Sync mode	NORMAL (acceptable for regenerable data)
Busy timeout	5000ms
Domain filtering	Partition key (pre-filtered, not post-filtered)
Storage	`embeddings.db` alongside `bank.json`

Migration

If an embeddings.json file exists from a previous version, it is automatically migrated to SQLite on the next load(), search(), or index call. The original file is renamed to embeddings.json.migrated.

Model Configuration

Gemini (Default)

Alias	Model ID
`fast`, `default`, `flash`	`gemini-3-flash-preview`
`smart`, `pro`	`gemini-3-pro-preview`
`flash-2`	`gemini-2.0-flash-exp`
`flash-2.5`	`gemini-2.5-flash`

Amazon Bedrock

Alias	Model ID
`fast`, `default`, `nova-2-lite`	`us.amazon.nova-2-lite-v1:0`
`smart`, `claude-4.5-sonnet`	`us.anthropic.claude-sonnet-4-5-*`
`claude-4.5-opus`	`us.anthropic.claude-opus-4-5-*`
`llama-4`	`us.meta.llama4-maverick-*`

Bedrock requires @aws-sdk/client-bedrock-runtime (optional peer dependency).

Claude (Anthropic)

Alias	Model ID
`fast`, `haiku`	`claude-haiku-4-5-20251001`
`smart`, `default`, `sonnet`	`claude-sonnet-4-5-20250929`
`opus`	`claude-opus-4-5-20251101`

Usage

skillrl ingest session.jsonl --model fast
skillrl ingest session.jsonl --model smart
skillrl ingest session.jsonl --provider bedrock --model claude-4.5-sonnet
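Alias resolution can be thought of as a table lookup that falls through to the raw value, since --model accepts either an alias or a full ID. The sketch below copies the Gemini and Claude tables above; Bedrock is omitted because its model IDs are shown with wildcards. `resolveModelId` is a hypothetical helper and not the package's `resolveModelConfig`, whose signature is not documented here.

```typescript
type AliasProvider = "gemini" | "claude";

// Alias tables copied from the Gemini and Claude sections above.
const MODEL_ALIASES: Record<AliasProvider, Record<string, string>> = {
  gemini: {
    fast: "gemini-3-flash-preview",
    default: "gemini-3-flash-preview",
    flash: "gemini-3-flash-preview",
    smart: "gemini-3-pro-preview",
    pro: "gemini-3-pro-preview",
    "flash-2": "gemini-2.0-flash-exp",
    "flash-2.5": "gemini-2.5-flash",
  },
  claude: {
    fast: "claude-haiku-4-5-20251001",
    haiku: "claude-haiku-4-5-20251001",
    smart: "claude-sonnet-4-5-20250929",
    default: "claude-sonnet-4-5-20250929",
    sonnet: "claude-sonnet-4-5-20250929",
    opus: "claude-opus-4-5-20251101",
  },
};

// Resolve an alias to a model ID; unknown values pass through as full IDs.
function resolveModelId(provider: AliasProvider, model?: string): string {
  const table = MODEL_ALIASES[provider];
  const key = model ?? "default";
  // hasOwnProperty avoids matching inherited keys such as "toString".
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : key;
}
```

Note that the same alias maps to different tiers per provider: `default` is the fast model on Gemini but the smart model on Claude.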

Export Formats

Kiro Power (`kiro-power`)

Full IDE integration bundle with directory structure:

power/
  POWER.md           # Activation manifest
  mcp.json           # MCP server configuration
  steering/          # Domain-grouped skill files
  hooks/             # Agent lifecycle hooks (auto-ingestion)

SKILL.md (`skill-md`)

YAML frontmatter + markdown. Compatible with Claude Code. Includes version, domains, skill count, and generated_by metadata.

.cursorrules (`cursorrules`)

Native Cursor IDE rules format. Skills sorted by confidence, inline domain tags.

Markdown (`markdown`)

Human-readable documentation with table of contents, overview table, and skills grouped by domain.

JSON (`json`)

Raw skill bank for programmatic use, backup, and import into other banks.

Configuration

Credential Resolution Order

1. Environment variables (GEMINI_API_KEY, AWS_BEARER_TOKEN_BEDROCK, ANTHROPIC_API_KEY)
2. .env file in current directory
3. .env.local file in current directory
4. ~/.skillrl/.env
5. ~/.config/skillrl/.env
6. ~/.skillrl/config.json
7. ~/.config/skillrl/config.json
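The resolution order amounts to "the first source that yields a value wins". The sketch below models each source as a labeled reader function so it stays independent of the file system; `CredentialSource`, `resolveCredential`, and the minimal `readDotEnvValue` parser are hypothetical and only illustrate the ordering.

```typescript
interface CredentialSource {
  label: string;
  read: () => string | undefined;
}

// Read KEY=value from .env-style text. Minimal parser: skips blank lines and
// "#" comments, and strips one pair of matching surrounding quotes.
function readDotEnvValue(contents: string, key: string): string | undefined {
  for (const rawLine of contents.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    if (line.slice(0, eq).trim() !== key) continue;
    const value = line.slice(eq + 1).trim();
    return value.replace(/^(["'])(.*)\1$/, "$2");
  }
  return undefined;
}

// Walk the sources in priority order and return the first non-empty value.
function resolveCredential(
  sources: CredentialSource[]
): { value: string; from: string } | undefined {
  for (const source of sources) {
    const value = source.read();
    if (value !== undefined && value.trim() !== "") {
      return { value, from: source.label };
    }
  }
  return undefined;
}
```

In a real setup the readers would wrap the environment and the files listed above, passed in the same order.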

Provider Detection

1. RLM_PROVIDER environment variable
2. provider field in config file
3. Auto-detect based on available credentials (Gemini -> Bedrock -> Claude)
4. Default: gemini
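The detection order can be written as a short cascade. This is a sketch of the documented order only: `detectProviderSketch` is a hypothetical function (not the `detectProvider` export from skillrl/config), and ignoring an unrecognized provider name instead of raising an error is an assumption.

```typescript
type ProviderId = "gemini" | "bedrock" | "claude";

// Only the environment variables named in this README.
interface ProviderEnv {
  RLM_PROVIDER?: string;
  GEMINI_API_KEY?: string;
  AWS_BEARER_TOKEN_BEDROCK?: string;
  ANTHROPIC_API_KEY?: string;
}

function isProviderId(value: string | undefined): value is ProviderId {
  return value === "gemini" || value === "bedrock" || value === "claude";
}

function detectProviderSketch(env: ProviderEnv, configProvider?: string): ProviderId {
  // 1. Explicit environment override.
  const fromEnv = env.RLM_PROVIDER;
  if (isProviderId(fromEnv)) return fromEnv;
  // 2. provider field from the config file.
  if (isProviderId(configProvider)) return configProvider;
  // 3. Auto-detect from available credentials: Gemini -> Bedrock -> Claude.
  if (env.GEMINI_API_KEY) return "gemini";
  if (env.AWS_BEARER_TOKEN_BEDROCK) return "bedrock";
  if (env.ANTHROPIC_API_KEY) return "claude";
  // 4. Default.
  return "gemini";
}
```

With both a Gemini and an Anthropic key present, auto-detection picks Gemini; setting RLM_PROVIDER is the way to override that.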

File Structure

.skillrl/
  bank.json              # Skill definitions and metadata
  embeddings.db          # SQLite vector index (auto-created by `skillrl index`)
  embeddings.db-wal      # WAL journal (auto-managed)
  embeddings.db-shm      # Shared memory (auto-managed)

Troubleshooting

"API key not configured"

skillrl config                 # Check current state
skillrl config YOUR_API_KEY    # Set Gemini key

"No skills found" on retrieve

Build the embedding index after ingesting:

skillrl stats    # Verify skills exist
skillrl index    # Build the index

"requires @aws-sdk/client-bedrock-runtime"

npm install @aws-sdk/client-bedrock-runtime

Slow first retrieval

The first search() call opens the SQLite database and loads the sqlite-vec extension (~5-20ms). Subsequent calls reuse the connection.

Embedding index size

For 500 skills with 256-dimension embeddings, expect ~2-5 MB. WAL and SHM files are temporary.
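For a sense of where that size comes from, the raw float32 vector payload can be computed directly. The sketch covers only the vectors themselves; the rest of the quoted range presumably comes from stored skill text, metadata, and SQLite storage overhead, which it does not model.

```typescript
const BYTES_PER_FLOAT32 = 4;

// Raw bytes needed to store the embedding vectors alone.
function rawVectorBytes(skillCount: number, dimensions: number): number {
  return skillCount * dimensions * BYTES_PER_FLOAT32;
}

const vectorBytes = rawVectorBytes(500, 256); // 500 * 256 * 4 = 512000 bytes
console.log(`${(vectorBytes / (1024 * 1024)).toFixed(2)} MiB of raw vectors`);
// prints "0.49 MiB of raw vectors"
```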

TypeScript Types

import type {
  Skill, SkillExample, SkillMetadata,
  SkillBank, SkillBankMetadata, SkillConfig,
  DistillationResult, RetrievalResult, EvolutionResult,
  ExportResult, ScoredSkill,
  Trajectory, TrajectoryStep, ToolCall,
  ExportOptions,
  SkillEmbedding, EmbeddingIndex,
  ProviderName, ResolvedModelConfig,
} from 'skillrl';

Skill Type

interface Skill {
  id: string;
  name: string;
  description: string;
  domain: string;                          // e.g., "typescript", "react", "python"
  category: 'general' | 'task-specific';
  instructions: string[];                  // Step-by-step
  prerequisites: string[];
  antiPatterns: string[];                  // What to avoid
  examples: { context: string; input: string; output: string }[];
  metadata: {
    usageCount: number;
    successRate: number;
    lastUsed: string | null;
    evolvedFrom: string | null;
    deprecated: boolean;
    deprecationReason: string | null;
  };
  tags: string[];
  confidence: number;                      // 0.0 - 1.0
  version: number;
  createdAt: string;
  updatedAt: string;
  sourceTrajectories: string[];
}

Security

API keys are stored locally and transmitted only to the configured LLM providers.
Path traversal protection prevents reads and writes outside the sandbox.
Zod validation, with length limits, is applied to all MCP tool inputs.
Output sanitization escapes markdown/YAML injection in exports.
Trajectory ingestion is read-only and never modifies source code.
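A common way to implement a path traversal check is to resolve the requested path against the sandbox root and reject anything that ends up outside it. The sketch below shows that general technique with Node's path module; `resolveInsideSandbox` is a hypothetical helper, not skillrl's actual code, and it does not follow symlinks.

```typescript
import * as path from "node:path";

// Resolve `requested` against `sandboxRoot` and throw if the result
// would land outside the sandbox directory.
function resolveInsideSandbox(sandboxRoot: string, requested: string): string {
  const root = path.resolve(sandboxRoot);
  const target = path.resolve(root, requested);
  const relative = path.relative(root, target);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`path escapes sandbox: ${requested}`);
  }
  return target;
}
```

Comparing the relative path, rather than checking for a string prefix, avoids accepting sibling directories such as /tmp/project-evil when the root is /tmp/project.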

License

MIT

Credits

Based on SkillRL: Skill-Based Transferable Reinforcement Learning for LLM Agents (arXiv:2602.08234v1).

Part of the RLM project.

Contributing

Contributions welcome. GitHub Issues | Discussions
