
ModelReins Guide

A lightweight model dispatch layer — route prompts to Claude, local LLMs, or any CLI tool, schedule tasks via cron, and wire Claude Code in as a background worker over MCP.

Fast Setup

Install the npm package, point it at a server, and dispatch your first request in under two minutes.

🔌

Any Backend

Cloud APIs, local models, or shell scripts — all behind one uniform interface.

🕐

Scheduled Agents

Cron-driven tasks that run prompts on a schedule and store results for later retrieval.

📈

Web Dashboard

Inspect dispatch history, live provider status, and scheduled job runs in one place.

1. Quick Start

ModelReins exposes a small HTTP API. The easiest way to interact with it is the modelreins npm package, which wraps every endpoint and handles auth. You can also hit the REST API directly from any language.

Install the client

shell
$ npm install -g modelreins
+ modelreins@1.0.0
added 42 packages in 3.2s

$ modelreins --version
1.0.0

Connect to a server

Point the client at your ModelReins instance. The base URL is stored in ~/.modelreins/config.json and can be overridden with MODELREINS_URL and MODELREINS_TOKEN environment variables.

shell
$ modelreins config set url http://192.168.0.246:8484
$ modelreins config set token YOUR_API_TOKEN

$ modelreins ping
✓ connected to ModelReins at http://192.168.0.246:8484
  version  : 1.0.0
  providers: 3 active
  uptime   : 4d 12h 7m

Dispatch your first prompt

modelreins dispatch sends a prompt to the default provider and streams the response to stdout. Use --provider to target a specific backend.

shell
$ modelreins dispatch "Summarise the last 5 git commits"

Dispatching to provider: claude (default)
────────────────────────────────────────
The last 5 commits focused on...

From JavaScript / TypeScript

typescript
import ModelReins from 'modelreins';

const mr = new ModelReins({
  url:   'http://192.168.0.246:8484',
  token: process.env.MODELREINS_TOKEN,
});

const result = await mr.dispatch({
  prompt:   'Explain quantum entanglement simply.',
  provider: 'claude',          // optional, uses default if omitted
  model:    'claude-sonnet-4-6', // optional, uses provider default
});

console.log(result.text);
console.log('tokens used:', result.usage);

Raw HTTP

bash
curl -s -X POST http://192.168.0.246:8484/api/dispatch \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -d '{"prompt":"Hello world","provider":"claude"}' \
  | jq '.text'
💡

Authentication tokens are managed in the Dashboard under Settings → API Tokens. Tokens can be scoped to specific providers or operations.

2. Providers

A provider is a named backend that ModelReins can route prompts to. Each provider has its own config block in config.yaml on the server side. Multiple providers of the same type can coexist with different names.

Set default: true on one provider to make it the target when no provider field is supplied in a dispatch request.

🤖

Claude Anthropic API

Hosted Claude models via the Anthropic Messages API

Requires an Anthropic API key. Supports streaming, system prompts, multi-turn conversation context, and tool use pass-through.

yaml — config.yaml
providers:
  claude:
    type:      anthropic
    default:   true
    apiKey:    ${ANTHROPIC_API_KEY}
    model:     claude-sonnet-4-6   # default model
    maxTokens: 8192
    systemPrompt: |
      You are a helpful assistant running in ModelReins.

Supported models

Model ID                    Context  Notes
claude-opus-4-6             200k     Most capable, highest cost
claude-sonnet-4-6           200k     Best balance — recommended default
claude-haiku-4-5-20251001   200k     Fast & cheap, great for classification
📀

LM Studio Local

OpenAI-compatible API served by LM Studio on the local network

LM Studio exposes an OpenAI-compatible endpoint on localhost:1234 by default. ModelReins uses the openai-compat type for any server speaking the OpenAI Chat Completions API.

yaml — config.yaml
providers:
  lmstudio:
    type:    openai-compat
    baseUrl: http://192.168.0.109:1234/v1
    apiKey:  lm-studio          # any string works
    model:   lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF
    stream:  true
⚠️

Enable the Local Server toggle in LM Studio and load a model before dispatching. LM Studio must remain running for the provider to be reachable.

🌞

Ollama Local

Native Ollama API or its OpenAI-compat shim

Ollama runs models locally and exposes both a native API on port 11434 and an OpenAI-compatible shim. ModelReins supports both; the native type gives access to Ollama-specific options like num_ctx.

yaml — config.yaml
providers:
  ollama:
    type:    ollama
    baseUrl: http://192.168.0.109:11434
    model:   llama3.2
    options:
      num_ctx:  32768
      num_gpu:  99
      temperature: 0.7

Pull a model

shell
$ ollama pull llama3.2
$ ollama pull qwen2.5-coder:7b
$ ollama list
NAME                        ID            SIZE
llama3.2:latest             ...           2.0 GB
qwen2.5-coder:7b            ...           4.7 GB
🔸

1minAI Cloud

Unified API hub — access 100+ models with one key

1minAI provides a single API key that unlocks access to GPT-4o, Gemini, Mistral, and many others via an OpenAI-compatible surface. Useful for fallback or cost comparison without managing multiple keys.

yaml — config.yaml
providers:
  1minai:
    type:    openai-compat
    baseUrl: https://api.1min.ai/v1
    apiKey:  ${ONEMINAI_API_KEY}
    model:   gpt-4o
    fallback: true  # used when primary provider errors
🛠

Custom CLI Bring Your Own

Pipe prompts through any shell command

The cli provider type shells out to an arbitrary command. The prompt is piped to stdin; the response is read from stdout. This unlocks any tool that accepts text input — scripts, local Claude Code instances, custom wrappers, etc.

yaml — config.yaml
providers:
  local-claude-code:
    type:    cli
    command: claude --print --dangerously-skip-permissions
    timeout: 120000  # ms
    env:
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}

  my-script:
    type:    cli
    command: /opt/modelreins/scripts/my_llm.sh
    timeout: 30000
💡

CLI providers inherit the server process's environment unless you override it with the env block. The working directory is the ModelReins server root.
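A CLI provider script only needs to honour the contract above: read the prompt from stdin, write the reply to stdout. A minimal sketch of what a hypothetical my_llm.sh body could look like (the echo reply is a stand-in for a real model call):

```shell
# Hypothetical my_llm.sh: ModelReins pipes the prompt to stdin
# and treats whatever appears on stdout as the dispatch result.
reply_to_prompt() {
  prompt=$(cat)                            # read the whole prompt from stdin
  printf 'echo-provider: %s\n' "$prompt"   # stand-in for a real model call
}

# Quick local check of the stdin -> stdout contract:
echo "Hello" | reply_to_prompt
```

Point the provider's command: field at the script and ModelReins handles the piping.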

3. Scheduled Tasks

Scheduled tasks dispatch a prompt on a cron schedule and store the result. Results are retrievable via the API or visible in the Dashboard. Tasks can target any configured provider and optionally POST their result to a webhook.

Cron API

Tasks are managed through the /api/tasks endpoint family.

typescript
import ModelReins from 'modelreins';
const mr = new ModelReins({ url: 'http://192.168.0.246:8484', token: TOKEN });

// Create a scheduled task
const task = await mr.tasks.create({
  name:     'daily-standup-summary',
  schedule: '0 9 * * 1-5',        // 9 AM Mon–Fri
  provider: 'claude',
  prompt:   'Summarise open GitHub PRs and suggest priorities.',
  webhook:  'https://hooks.slack.com/T.../B.../...',  // optional
  enabled:  true,
});
console.log(task.id);  // task_abc123

// List tasks
const all = await mr.tasks.list();

// Get latest result
const result = await mr.tasks.latestResult(task.id);

// Trigger immediately (one-off run)
await mr.tasks.run(task.id);

// Disable / enable
await mr.tasks.update(task.id, { enabled: false });

// Delete
await mr.tasks.delete(task.id);

Schedule presets

Common cron expressions for scheduled tasks.

0 * * * *      Every hour
*/15 * * * *   Every 15 min
0 9 * * 1-5    9 AM weekdays
0 8 * * 1      Monday 8 AM
0 0 * * *      Midnight daily
0 6 1 * *      1st of month
*/5 * * * *    Every 5 min
0 18 * * 5     Friday 6 PM

Examples

Log monitor

typescript
await mr.tasks.create({
  name:     'error-log-digest',
  schedule: '0 */4 * * *',  // every 4 hours
  provider: 'claude',
  prompt: `
    Read /var/log/app/error.log (last 500 lines).
    Summarise recurring errors, count by type,
    and flag anything that looks critical.
    Be concise — 10 lines max.
  `,
  context: { readFile: '/var/log/app/error.log' },
});

Weekly report via webhook

typescript
await mr.tasks.create({
  name:     'weekly-metrics',
  schedule: '0 8 * * 1',  // Monday 8 AM
  provider: 'claude',
  prompt:   'Generate a weekly metrics summary for the team.',
  webhook:  'https://hooks.slack.com/services/...',
  webhookFormat: 'slack',  // wraps text in Slack payload
});

Cron fields reference

Field         Range                    Special
Minute        0–59                     */n every n minutes
Hour          0–23                     */n every n hours
Day of month  1–31                     ? unspecified
Month         1–12 or JAN–DEC          * every month
Day of week   0–6 (Sun=0) or SUN–SAT   1-5 weekdays
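The field rules in this table can be sketched as a small validator. This is illustrative only: it covers plain numbers, *, */n steps, and a-b ranges, and omits named months/days and comma lists.

```typescript
// Per-field numeric ranges for the five standard cron fields,
// matching the reference table: minute, hour, day of month, month, day of week.
const FIELD_RANGES: [number, number][] = [
  [0, 59], // minute
  [0, 23], // hour
  [1, 31], // day of month
  [1, 12], // month
  [0, 6],  // day of week (Sun=0)
];

// Validate a 5-field cron expression against the ranges above.
function validCron(expr: string): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) return false;
  return fields.every((f, i) => {
    const [lo, hi] = FIELD_RANGES[i];
    if (f === '*') return true;                    // any value
    const step = f.match(/^\*\/(\d+)$/);           // */n steps
    if (step) return Number(step[1]) >= 1;
    const range = f.match(/^(\d+)-(\d+)$/);        // a-b ranges
    if (range) return Number(range[1]) >= lo && Number(range[2]) <= hi;
    return /^\d+$/.test(f) && Number(f) >= lo && Number(f) <= hi;
  });
}
```

Comma lists (1,3,5) and JAN–DEC / SUN–SAT names would need extra branches.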
4. MCP Channel

ModelReins can register itself as an MCP (Model Context Protocol) server. This lets any MCP-aware client — including Claude Code — call ModelReins tools directly from within a conversation, turning Claude Code into a background worker that dispatches sub-tasks through ModelReins.

🔗

MCP is Anthropic's open protocol for connecting AI models to external tools and data. Claude Code reads .mcp.json in the project root (or user-global ~/.claude/mcp.json) to discover available MCP servers.

Configure .mcp.json

Add a ModelReins entry to your project's .mcp.json. The MCP server is exposed by ModelReins itself over stdio or SSE — choose whichever your client supports.

stdio transport (recommended for Claude Code)

json — .mcp.json
{
  "mcpServers": {
    "modelreins": {
      "type": "stdio",
      "command": "npx",
      "args": ["modelreins", "mcp"],
      "env": {
        "MODELREINS_URL":   "http://192.168.0.246:8484",
        "MODELREINS_TOKEN": "YOUR_TOKEN_HERE"
      }
    }
  }
}

SSE transport (for web clients)

json — .mcp.json
{
  "mcpServers": {
    "modelreins": {
      "type": "sse",
      "url":  "http://192.168.0.246:8484/mcp/sse",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN_HERE"
      }
    }
  }
}

Available MCP tools

Once connected, these tools appear inside Claude Code (and other MCP clients):

Tool name        Description                                         Key params
mr_dispatch      Send a prompt to a provider and return the result   prompt, provider, model
mr_providers     List all configured providers and their status
mr_task_create   Create a new scheduled task                         name, schedule, prompt
mr_task_list     List scheduled tasks with last-run status
mr_task_result   Fetch the latest result for a task                  taskId
mr_task_run      Trigger a task immediately                          taskId

Example: Claude Code as a worker

With ModelReins as an MCP server and a cli provider pointing at Claude Code, the outer Claude instance can spin up inner Claude Code sessions for long-running tasks:

user prompt inside Claude Code
Use the mr_dispatch tool to have the local-claude-code provider
refactor the auth module at src/auth/* and return a summary of changes.

Claude Code will call mr_dispatch with provider: "local-claude-code", ModelReins shells out to claude --print, and the inner session performs the refactor. The outer session receives the result as the tool response.

⚠️

When using --dangerously-skip-permissions on the inner CLI provider, ensure it runs in an isolated working directory or container. Grant only the permissions the task actually needs.

Verify the connection

shell — inside claude code project
$ claude mcp list
modelreins   stdio   connected
  Tools: mr_dispatch, mr_providers, mr_task_create,
         mr_task_list, mr_task_result, mr_task_run
5. Dashboard

The web UI is served at the root of your ModelReins instance (http://192.168.0.246:8484). It provides a real-time view of the system without requiring any CLI setup.

🏠

Overview

Live provider status, total dispatches today, error rate, and average latency. Updates every 10 seconds.

📄

Dispatch History

Full log of every dispatch with prompt, response, provider, model, token count, latency, and status.

🕐

Scheduled Tasks

Create, edit, enable/disable, and manually trigger tasks. View run history and last output inline.

🔌

Provider Health

Per-provider uptime, last-error details, and a live "ping" button to test reachability.

📈

Usage Graphs

Token consumption and request volume over time, broken out by provider and model.

🔒

Settings

Manage API tokens, configure webhooks, edit server config (with live reload), and view server logs.

Keyboard shortcuts

Key    Action
G D    Go to Dispatch History
G T    Go to Scheduled Tasks
G P    Go to Providers
G S    Go to Settings
/      Focus search bar
R      Refresh current view
?      Show keyboard help
6. API Reference

All endpoints are under http://<host>:8484/api. Requests require an Authorization: Bearer <token> header unless the server is configured with auth: none. All request/response bodies are JSON.

📄

An OpenAPI 3.1 spec is available at /api/openapi.json and can be imported directly into Postman, Insomnia, or any compatible client.

POST /api/dispatch Send a prompt, get a response

Request body

json
{
  "prompt":   "Explain binary search trees.",
  "provider": "claude",          // optional
  "model":    "claude-sonnet-4-6", // optional
  "system":   "You are a CS tutor.", // optional
  "stream":   false,               // set true for SSE stream
  "options":  { "temperature": 0.5 }  // optional provider opts
}

Response

json
{
  "id":       "dsp_01jx...",
  "text":     "A binary search tree is...",
  "provider": "claude",
  "model":    "claude-sonnet-4-6",
  "usage":    { "input_tokens": 12, "output_tokens": 284 },
  "latencyMs": 1423,
  "createdAt": "2026-04-04T09:12:33Z"
}
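When stream is set to true, the response arrives as a Server-Sent Events stream instead of the JSON body above. The exact chunk shape is not documented here; assuming each data: line carries a JSON object with a text field, collecting the streamed text might look like:

```typescript
// Sketch: concatenate the `text` fields from an SSE response body.
// The chunk shape ({ text?: string }) is an assumption, not documented behaviour.
function collectSseText(raw: string): string {
  return raw
    .split('\n')
    .filter((line) => line.startsWith('data: '))            // keep SSE data lines only
    .map((line) => JSON.parse(line.slice(6)) as { text?: string })
    .map((chunk) => chunk.text ?? '')
    .join('');
}
```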
GET /api/providers List all providers
json — response
[
  {
    "name":      "claude",
    "type":      "anthropic",
    "default":   true,
    "status":    "healthy",
    "lastCheck": "2026-04-04T09:10:00Z"
  }
]
GET /api/tasks List scheduled tasks

Returns an array of task objects including id, name, schedule, enabled, lastRunAt, and lastStatus.

POST /api/tasks Create a scheduled task
json — request body
{
  "name":      "my-task",
  "schedule":  "0 9 * * 1-5",
  "provider":  "claude",
  "prompt":    "Your prompt here.",
  "enabled":   true,
  "webhook":   "https://..."   // optional
}
POST /api/tasks/:id/run Trigger a task immediately

Runs the task once regardless of its schedule. Returns a runId that can be polled at /api/tasks/:id/results/:runId.
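A common pattern is to trigger a run and then poll its result until it settles. This sketch uses a hypothetical getStatus callback standing in for GET /api/tasks/:id/results/:runId, and a made-up 'running' status value:

```typescript
// Sketch: poll a one-off run until it leaves the (assumed) 'running' state.
async function waitForRun(
  getStatus: () => Promise<string>,   // stand-in for the results/:runId request
  intervalMs = 1000,
  maxAttempts = 30,
): Promise<string> {
  for (let i = 0; i < maxAttempts; i++) {
    const status = await getStatus();
    if (status !== 'running') return status;                 // run has settled
    await new Promise((r) => setTimeout(r, intervalMs));     // wait before re-polling
  }
  throw new Error('run did not finish in time');
}
```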

PATCH /api/tasks/:id Update a task

Accepts any subset of the creation fields. Changes take effect at the next scheduled run.

DELETE /api/tasks/:id Delete a task

Permanently removes the task and its run history. Returns 204 No Content.

GET /api/tasks/:id/results Task run history

Returns paginated results. Query params: limit (default 20), offset.
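Walking the full history just means advancing offset until a short page comes back. A sketch with the page fetcher injected, so the paging loop itself is independent of any HTTP client (the TaskResult shape is illustrative):

```typescript
type TaskResult = { runId: string; status: string };
type Page = { items: TaskResult[] };

// Collect every run by paging through limit/offset until a short page signals the end.
async function allResults(
  fetchPage: (limit: number, offset: number) => Promise<Page>, // wraps GET .../results
  limit = 20,
): Promise<TaskResult[]> {
  const out: TaskResult[] = [];
  for (let offset = 0; ; offset += limit) {
    const page = await fetchPage(limit, offset);
    out.push(...page.items);
    if (page.items.length < limit) return out; // short page = last page
  }
}
```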

GET /api/health Server health check
json — response
{
  "status":    "ok",
  "version":   "1.0.0",
  "uptime":    391627,
  "providers": { "healthy": 3, "degraded": 0 }
}

Error responses

Status  Code               Meaning
400     VALIDATION_ERROR   Missing or invalid request fields
401     UNAUTHORIZED       Missing or invalid token
404     NOT_FOUND          Task or resource does not exist
429     RATE_LIMITED       Provider rate limit hit; Retry-After header set
502     PROVIDER_ERROR     Upstream provider returned an error
504     PROVIDER_TIMEOUT   Provider did not respond within the configured timeout
json — error envelope
{
  "error": {
    "code":    "PROVIDER_ERROR",
    "message": "Anthropic API returned 529 Overloaded",
    "provider":"claude",
    "retryAfter": 30
  }
}
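A client can use the retryAfter field to back off before retrying transient failures. A sketch of that decision logic (the retryable status set and the 5-second default are assumptions, not documented behaviour):

```typescript
// Shape of the error envelope shown above.
interface ErrorEnvelope {
  error: {
    code: string;
    message: string;
    provider?: string;
    retryAfter?: number; // seconds
  };
}

// Return a backoff delay in ms for retryable failures, or null if a retry won't help.
function parseRetryDelayMs(status: number, body: ErrorEnvelope): number | null {
  const retryable = status === 429 || status === 502 || status === 504; // assumed retryable set
  if (!retryable) return null;                // validation/auth errors won't succeed on retry
  const seconds = body.error.retryAfter ?? 5; // assumed default backoff when the field is absent
  return seconds * 1000;
}
```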