Production-Ready Text Embeddings & Reranking API by Remodl AI
LexIQ Vectors is a high-performance text embedding and reranking service developed by Remodl AI, Inc. It provides state-of-the-art multilingual text embeddings and document reranking through an OpenAI-compatible API, powered by Qwen3 models optimized for search and retrieval tasks.
- 🚀 Ultra-Fast Performance: Sub-200ms response times when warm
- 🌐 100+ Languages: Full multilingual support out of the box
- 🔍 Instruction-Aware: Boost search quality by 1-5% with custom instructions
- 🔄 OpenAI Compatible: Drop-in replacement for OpenAI embeddings
- 📊 Advanced Reranking: Optimize search results with state-of-the-art reranking
- 🏗️ Enterprise Ready: Built on RunPod serverless infrastructure for scale
- Models: Qwen3-Embedding-0.6B & Qwen3-Reranker-0.6B
- Infrastructure: RunPod Serverless GPU Workers
- Engine: Infinity (high-throughput inference)
- API: OpenAI-compatible REST endpoints
- Container: Docker with CUDA 12.4.1 support
Base URL: https://api.lexiq.dev
- `GET /openai/v1/models` - List available models
- `POST /openai/v1/embeddings` - Generate embeddings
- `POST /openai/v1/rerank` - Rerank documents
```python
from openai import OpenAI

# Initialize client
client = OpenAI(
    api_key="your-lexiq-api-key",
    base_url="https://api.lexiq.dev/openai"
)

# Generate embeddings
response = client.embeddings.create(
    model="models/Qwen3-Embedding-0.6B",
    input="What is machine learning?",
    extra_body={
        "prompt_type": "query"  # Optimize for search queries
    }
)

embedding = response.data[0].embedding
```

```python
import requests

# Generate embeddings
response = requests.post(
    "https://api.lexiq.dev/openai/v1/embeddings",
    headers={
        "Authorization": "Bearer your-lexiq-api-key",
        "Content-Type": "application/json"
    },
    json={
        "model": "models/Qwen3-Embedding-0.6B",
        "input": "What is machine learning?",
        "extra_body": {
            "prompt_type": "query"
        }
    }
)

embedding = response.json()["data"][0]["embedding"]
```

Improve search quality by using different instructions for queries vs. documents:
```python
# Using built-in optimization
embedding = client.embeddings.create(
    model="models/Qwen3-Embedding-0.6B",
    input="How to implement neural networks?",
    extra_body={
        "prompt_type": "query"
    }
)

# Or with a custom instruction
embedding = client.embeddings.create(
    model="models/Qwen3-Embedding-0.6B",
    input="How to implement neural networks?",
    extra_body={
        "instruction": "Represent this programming question for finding code examples"
    }
)
```

```python
# Documents typically don't need instructions
embedding = client.embeddings.create(
    model="models/Qwen3-Embedding-0.6B",
    input="Neural networks are a fundamental component of deep learning..."
)
```

Optimize search results by reranking documents based on relevance:
```python
import requests

# Rerank documents with the default instruction
response = requests.post(
    "https://api.lexiq.dev/openai/v1/rerank",
    headers={
        "Authorization": "Bearer your-lexiq-api-key",
        "Content-Type": "application/json"
    },
    json={
        "query": "What product has the best warranty?",
        "documents": [
            "Product A: 2-year comprehensive warranty",
            "Product B: Lifetime limited warranty",
            "Product C: 90-day warranty",
            "Product D: 5-year extended warranty available"
        ],
        "return_documents": True,
        "top_k": 3
    }
)

# Or with a custom instruction
response = requests.post(
    "https://api.lexiq.dev/openai/v1/rerank",
    headers={
        "Authorization": "Bearer your-lexiq-api-key",
        "Content-Type": "application/json"
    },
    json={
        "query": "What product has the best warranty?",
        "documents": [...],
        "extra_body": {
            "instruction": "Find products with the longest warranty duration and best coverage terms"
        },
        "return_documents": True,
        "top_k": 3
    }
)

# Results are sorted by relevance score
for result in response.json()["results"]:
    print(f"Score: {result['score']:.3f} - {result['document']}")
```

- Cold Start: 10-15 seconds (model loading)
- Warm Requests: 100-200ms (embeddings), 130-180ms (reranking)
- Throughput: Up to 300 concurrent requests
- Embedding Dimensions: 1024
- Max Sequence Length: 32,768 tokens
- Batch Processing: Optimized for batches up to 32
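Once you have the 1024-dimensional vectors back from the API, ranking documents against a query is a cosine-similarity computation on your side. A minimal sketch with NumPy (the random vectors stand in for real API responses; if the service returns unit-normalized embeddings, the normalization below is a harmless no-op):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 1024-dim stand-ins for real embedding responses
rng = np.random.default_rng(0)
query_vec = rng.normal(size=1024)
doc_vecs = rng.normal(size=(3, 1024))

# Rank documents by similarity to the query, highest first
scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

For large corpora, precompute and normalize document vectors once so each query reduces to a single matrix-vector product.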
- Search Queries: Always use `prompt_type: "query"` or custom search instructions
- Documents: Use raw text without instructions for best results
- Domain-Specific: Create custom instructions for specialized use cases
```python
# Process multiple texts in one request
embeddings = client.embeddings.create(
    model="models/Qwen3-Embedding-0.6B",
    input=[
        "First document",
        "Second document",
        "Third document"
    ]
)
```

```python
# Step 1: Embed the query
query_response = client.embeddings.create(
    model="models/Qwen3-Embedding-0.6B",
    input="user search query",
    extra_body={"prompt_type": "query"}
)
query_embedding = query_response.data[0].embedding

# Step 2: Vector search in your database
candidates = vector_db.search(query_embedding, top_k=100)

# Step 3: Rerank the top candidates
reranked = requests.post(
    "https://api.lexiq.dev/openai/v1/rerank",
    headers={"Authorization": "Bearer your-key"},
    json={
        "query": "user search query",
        "documents": candidates,
        "top_k": 10
    }
)
```

```python
# Academic search
extra_body={
    "instruction": "Represent this query for finding academic research papers"
}
```

```python
# Code search
extra_body={
    "instruction": "Represent this query for finding code implementations and examples"
}
```

```python
# E-commerce search
extra_body={
    "instruction": "Represent this query for e-commerce product search"
}
```

```python
# Documentation Q&A
extra_body={
    "instruction": "Represent this question for finding answers in documentation"
}
```

All errors follow the OpenAI error format:
```json
{
  "error": {
    "message": "Model not found",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
```

Common error types:

- `invalid_request_error` - Invalid parameters
- `authentication_error` - Invalid API key
- `rate_limit_error` - Too many requests
- `internal_error` - Server error
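Because every error carries a stable `type` field, clients can branch on it to decide whether to retry, re-authenticate, or surface the problem. A minimal sketch (the `classify_error` helper and the action names are illustrative, not part of any SDK):

```python
import json

def classify_error(response_body: str) -> str:
    """Map an OpenAI-style error payload to a coarse client action."""
    err = json.loads(response_body).get("error", {})
    etype = err.get("type", "")
    if etype == "rate_limit_error":
        return "retry_with_backoff"   # transient: slow down and retry
    if etype == "authentication_error":
        return "check_api_key"        # not retryable: fix credentials
    if etype == "invalid_request_error":
        return "fix_request"          # not retryable: fix parameters
    return "report_and_retry"         # internal_error or unknown

body = '{"error": {"message": "Model not found", "type": "invalid_request_error", "code": "model_not_found"}}'
print(classify_error(body))  # fix_request
```

In production you would pair this with exponential backoff for the retryable cases rather than retrying immediately.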
- Documentation: https://docs.lexiq.dev
- API Status: https://status.lexiq.dev
- Support: support@remodlai.com
Remodl AI, Inc. is dedicated to building production-ready AI infrastructure that scales. LexIQ Vectors is part of the LexIQ platform, providing enterprise-grade AI capabilities for modern applications.
Built with ❤️ by Remodl AI, Inc.