ModelReins Guide
A lightweight model dispatch layer — route prompts to Claude, local LLMs, or any CLI tool, schedule tasks via cron, and wire Claude Code in as a background worker over MCP.
Fast Setup
Install the npm package, point it at a server, and dispatch your first request in under two minutes.
Any Backend
Cloud APIs, local models, or shell scripts — all behind one uniform interface.
Scheduled Agents
Cron-driven tasks that run prompts on a schedule and store results for later retrieval.
Web Dashboard
Inspect dispatch history, live provider status, and scheduled job runs in one place.
Quick Start
ModelReins exposes a small HTTP API. The easiest way to interact with it is the
modelreins npm package, which wraps every endpoint and handles auth.
You can also hit the REST API directly from any language.
Install the client
$ npm install -g modelreins
+ [email protected]
added 42 packages in 3.2s
$ modelreins --version
1.0.0
Connect to a server
Point the client at your ModelReins instance. The base URL is stored in
~/.modelreins/config.json and can be overridden with
MODELREINS_URL and MODELREINS_TOKEN environment variables.
$ modelreins config set url http://192.168.0.246:8484
$ modelreins config set token YOUR_API_TOKEN
$ modelreins ping
✓ connected to ModelReins at http://192.168.0.246:8484
  version  : 1.0.0
  providers: 3 active
  uptime   : 4d 12h 7m
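After the two config set commands, ~/.modelreins/config.json likely looks like this (a hypothetical shape: only the url and token keys are implied by the commands above):

```json
{
  "url": "http://192.168.0.246:8484",
  "token": "YOUR_API_TOKEN"
}
```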
Dispatch your first prompt
modelreins dispatch sends a prompt to the default provider and streams the
response to stdout. Use --provider to target a specific backend.
$ modelreins dispatch "Summarise the last 5 git commits"
Dispatching to provider: claude (default)
────────────────────────────────────────
The last 5 commits focused on...
From JavaScript / TypeScript
import ModelReins from 'modelreins';
const mr = new ModelReins({
url: 'http://192.168.0.246:8484',
token: process.env.MODELREINS_TOKEN,
});
const result = await mr.dispatch({
prompt: 'Explain quantum entanglement simply.',
provider: 'claude', // optional, uses default if omitted
model: 'claude-sonnet-4-6', // optional, uses provider default
});
console.log(result.text);
console.log('tokens used:', result.usage);
Raw HTTP
curl -s -X POST http://192.168.0.246:8484/api/dispatch \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer YOUR_TOKEN' \
-d '{"prompt":"Hello world","provider":"claude"}' \
| jq '.text'
Authentication tokens are managed in the Dashboard under Settings → API Tokens. Tokens can be scoped to specific providers or operations.
Providers
A provider is a named backend that ModelReins can route prompts to.
Each provider has its own config block in config.yaml on the server side.
Multiple providers of the same type can coexist with different names.
Set default: true on one provider to make it the target when no
provider field is supplied in a dispatch request.
Claude Anthropic API
Hosted Claude models via the Anthropic Messages API
Requires an Anthropic API key. Supports streaming, system prompts, multi-turn conversation context, and tool use pass-through.
providers:
claude:
type: anthropic
default: true
apiKey: ${ANTHROPIC_API_KEY}
model: claude-sonnet-4-6 # default model
maxTokens: 8192
systemPrompt: |
You are a helpful assistant running in ModelReins.
Supported models
| Model ID | Context | Notes |
|---|---|---|
| claude-opus-4-6 | 200k | Most capable, highest cost |
| claude-sonnet-4-6 | 200k | Best balance — recommended default |
| claude-haiku-4-5-20251001 | 200k | Fast & cheap, great for classification |
LM Studio Local
OpenAI-compatible API served by LM Studio on the local network
LM Studio exposes an OpenAI-compatible endpoint on localhost:1234
by default. ModelReins uses the openai-compat type for any
server speaking the OpenAI Chat Completions API.
providers:
lmstudio:
type: openai-compat
baseUrl: http://192.168.0.109:1234/v1
apiKey: lm-studio # any string works
model: lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF
stream: true
Enable the Local Server toggle in LM Studio and load a model before dispatching. LM Studio must remain running for the provider to be reachable.
Ollama Local
Native Ollama API or its OpenAI-compat shim
Ollama runs models locally and exposes both a native API on port 11434
and an OpenAI-compatible shim. ModelReins supports both; the native type gives
access to Ollama-specific options like num_ctx.
providers:
ollama:
type: ollama
baseUrl: http://192.168.0.109:11434
model: llama3.2
options:
num_ctx: 32768
num_gpu: 99
temperature: 0.7
Pull a model
$ ollama pull llama3.2
$ ollama pull qwen2.5-coder:7b
$ ollama list
NAME               ID    SIZE
llama3.2:latest    ...   2.0 GB
qwen2.5-coder:7b   ...   4.7 GB
1minAI Cloud
Unified API hub — access 100+ models with one key
1minAI provides a single API key that unlocks access to GPT-4o, Gemini, Mistral, and many others via an OpenAI-compatible surface. Useful for fallback or cost comparison without managing multiple keys.
providers:
1minai:
type: openai-compat
baseUrl: https://api.1min.ai/v1
apiKey: ${ONEMINAI_API_KEY}
model: gpt-4o
fallback: true # used when primary provider errors
Custom CLI Bring Your Own
Pipe prompts through any shell command
The cli provider type shells out to an arbitrary command.
The prompt is piped to stdin; the response is read from stdout.
This unlocks any tool that accepts text input — scripts, local Claude Code
instances, custom wrappers, etc.
providers:
local-claude-code:
type: cli
command: claude --print --dangerously-skip-permissions
timeout: 120000 # ms
env:
ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
my-script:
type: cli
command: /opt/modelreins/scripts/my_llm.sh
timeout: 30000
CLI providers inherit the server process's environment unless you override it with the
env block. The working directory is the ModelReins server root.
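To make the stdin/stdout contract concrete, a custom cli provider can be any script that reads the prompt from stdin and prints a reply. The following is an illustrative Node sketch (the buildResponse logic and the MODELREINS_CLI guard are hypothetical, not part of ModelReins):

```javascript
// my_llm.js — minimal cli-provider sketch: prompt in on stdin, answer out on stdout.
// A provider could point at it with: command: node /opt/modelreins/scripts/my_llm.js

// Stand-in "model": report simple stats about the prompt.
function buildResponse(prompt) {
  const words = prompt.trim().split(/\s+/).filter(Boolean);
  return `Received ${words.length} words; first word: ${words[0] ?? '(empty)'}`;
}

// Wire stdin -> stdout only when explicitly invoked as a provider (guarded by
// an env var so the function can be imported or tested without blocking on stdin).
if (process.env.MODELREINS_CLI === '1') {
  let input = '';
  process.stdin.setEncoding('utf8');
  process.stdin.on('data', (chunk) => { input += chunk; });
  process.stdin.on('end', () => {
    process.stdout.write(buildResponse(input) + '\n');
  });
}
```

Anything the script prints to stdout becomes the dispatch result; keep diagnostics on stderr so they don't pollute the response.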
Scheduled Tasks
Scheduled tasks dispatch a prompt on a cron schedule and store the result. Results are retrievable via the API or visible in the Dashboard. Tasks can target any configured provider and optionally POST their result to a webhook.
Cron API
Tasks are managed through the /api/tasks endpoint family.
import ModelReins from 'modelreins';
const mr = new ModelReins({ url: 'http://192.168.0.246:8484', token: TOKEN });
// Create a scheduled task
const task = await mr.tasks.create({
name: 'daily-standup-summary',
schedule: '0 9 * * 1-5', // 9 AM Mon–Fri
provider: 'claude',
prompt: 'Summarise open GitHub PRs and suggest priorities.',
webhook: 'https://hooks.slack.com/T.../B.../...', // optional
enabled: true,
});
console.log(task.id); // task_abc123
// List tasks
const all = await mr.tasks.list();
// Get latest result
const result = await mr.tasks.latestResult(task.id);
// Trigger immediately (one-off run)
await mr.tasks.run(task.id);
// Disable / enable
await mr.tasks.update(task.id, { enabled: false });
// Delete
await mr.tasks.delete(task.id);
Examples
Log monitor
await mr.tasks.create({
name: 'error-log-digest',
schedule: '0 */4 * * *', // every 4 hours
provider: 'claude',
prompt: `
Read /var/log/app/error.log (last 500 lines).
Summarise recurring errors, count by type,
and flag anything that looks critical.
Be concise — 10 lines max.
`,
context: { readFile: '/var/log/app/error.log' },
});
Weekly report via webhook
await mr.tasks.create({
name: 'weekly-metrics',
schedule: '0 8 * * 1', // Monday 8 AM
provider: 'claude',
prompt: 'Generate a weekly metrics summary for the team.',
webhook: 'https://hooks.slack.com/services/...',
webhookFormat: 'slack', // wraps text in Slack payload
});
Cron fields reference
| Field | Range | Special |
|---|---|---|
| Minute | 0–59 | */n every n minutes |
| Hour | 0–23 | */n every n hours |
| Day of month | 1–31 | ? unspecified |
| Month | 1–12 or JAN–DEC | * every month |
| Day of week | 0–6 (Sun=0) or SUN–SAT | 1-5 weekdays |
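To make the field semantics concrete, here is a small matcher sketch (illustrative only, not ModelReins' actual scheduler) that checks a Date against a five-field expression supporting *, */n, ranges, and comma lists:

```javascript
// Does one cron field (e.g. "1-5", "*/15", "0,30", "*") match a value?
function fieldMatches(expr, value, rangeStart) {
  return expr.split(',').some((part) => {
    if (part === '*') return true;
    const step = part.match(/^\*\/(\d+)$/);          // */n: every n units
    if (step) return (value - rangeStart) % Number(step[1]) === 0;
    const range = part.match(/^(\d+)-(\d+)$/);       // a-b: inclusive range
    if (range) return value >= Number(range[1]) && value <= Number(range[2]);
    return Number(part) === value;                   // plain number
  });
}

// Check a Date against "minute hour day-of-month month day-of-week" (Sun = 0).
// Note: real cron ORs day-of-month and day-of-week when both are restricted;
// this sketch ANDs all five fields for brevity.
function cronMatches(expr, date) {
  const [min, hour, dom, mon, dow] = expr.trim().split(/\s+/);
  return (
    fieldMatches(min, date.getMinutes(), 0) &&
    fieldMatches(hour, date.getHours(), 0) &&
    fieldMatches(dom, date.getDate(), 1) &&
    fieldMatches(mon, date.getMonth() + 1, 1) &&
    fieldMatches(dow, date.getDay(), 0)
  );
}
```

For example, '0 9 * * 1-5' matches 09:00 on any weekday and nothing else.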
MCP Channel
ModelReins can register itself as an MCP (Model Context Protocol) server. This lets any MCP-aware client — including Claude Code — call ModelReins tools directly from within a conversation, turning Claude Code into a background worker that dispatches sub-tasks through ModelReins.
MCP is Anthropic's open protocol for connecting AI models to external tools and data.
Claude Code reads .mcp.json in the project root (or user-global
~/.claude/mcp.json) to discover available MCP servers.
Configure .mcp.json
Add a ModelReins entry to your project's .mcp.json. The MCP server
is exposed by ModelReins itself over stdio or SSE — choose whichever your client supports.
stdio transport (recommended for Claude Code)
{
"mcpServers": {
"modelreins": {
"type": "stdio",
"command": "npx",
"args": ["modelreins", "mcp"],
"env": {
"MODELREINS_URL": "http://192.168.0.246:8484",
"MODELREINS_TOKEN": "YOUR_TOKEN_HERE"
}
}
}
}
SSE transport (for web clients)
{
"mcpServers": {
"modelreins": {
"type": "sse",
"url": "http://192.168.0.246:8484/mcp/sse",
"headers": {
"Authorization": "Bearer YOUR_TOKEN_HERE"
}
}
}
}
Available MCP tools
Once connected, these tools appear inside Claude Code (and other MCP clients):
| Tool name | Description | Key params |
|---|---|---|
| mr_dispatch | Send a prompt to a provider and return the result | prompt, provider, model |
| mr_providers | List all configured providers and their status | — |
| mr_task_create | Create a new scheduled task | name, schedule, prompt |
| mr_task_list | List scheduled tasks with last-run status | — |
| mr_task_result | Fetch the latest result for a task | taskId |
| mr_task_run | Trigger a task immediately | taskId |
Example: Claude Code as a worker
With ModelReins as an MCP server and a cli provider pointing at Claude Code,
the outer Claude instance can spin up inner Claude Code sessions for long-running tasks:
Use the mr_dispatch tool to have the local-claude-code provider
refactor the auth module at src/auth/* and return a summary of changes.
Claude Code will call mr_dispatch with provider: "local-claude-code",
ModelReins shells out to claude --print, and the inner session performs
the refactor. The outer session receives the result as the tool response.
When using --dangerously-skip-permissions on the inner CLI provider,
ensure it runs in an isolated working directory or container. Grant only the
permissions the task actually needs.
Verify the connection
$ claude mcp list
modelreins  stdio  connected
  Tools: mr_dispatch, mr_providers, mr_task_create, mr_task_list, mr_task_result, mr_task_run
Dashboard
The web UI is served at the root of your ModelReins instance
(http://192.168.0.246:8484). It provides a real-time view
of the system without requiring any CLI setup.
Overview
Live provider status, total dispatches today, error rate, and average latency. Updates every 10 seconds.
Dispatch History
Full log of every dispatch with prompt, response, provider, model, token count, latency, and status.
Scheduled Tasks
Create, edit, enable/disable, and manually trigger tasks. View run history and last output inline.
Provider Health
Per-provider uptime, last-error details, and a live "ping" button to test reachability.
Usage Graphs
Token consumption and request volume over time, broken out by provider and model.
Settings
Manage API tokens, configure webhooks, edit server config (with live reload), and view server logs.
Keyboard shortcuts
| Key | Action |
|---|---|
| G D | Go to Dispatch History |
| G T | Go to Scheduled Tasks |
| G P | Go to Providers |
| G S | Go to Settings |
| / | Focus search bar |
| R | Refresh current view |
| ? | Show keyboard help |
API Reference
All endpoints are under http://<host>:8484/api.
Requests require an Authorization: Bearer <token> header unless
the server is configured with auth: none.
All request/response bodies are JSON.
An OpenAPI 3.1 spec is available at /api/openapi.json and can be
imported directly into Postman, Insomnia, or any compatible client.
Request body (POST /api/dispatch)
{
"prompt": "Explain binary search trees.",
"provider": "claude", // optional
"model": "claude-sonnet-4-6", // optional
"system": "You are a CS tutor.", // optional
"stream": false, // set true for SSE stream
"options": { "temperature": 0.5 } // optional provider opts
}
Response
{
"id": "dsp_01jx...",
"text": "A binary search tree is...",
"provider": "claude",
"model": "claude-sonnet-4-6",
"usage": { "input_tokens": 12, "output_tokens": 284 },
"latencyMs": 1423,
"createdAt": "2026-04-04T09:12:33Z"
}
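When stream is set to true, the response arrives as server-sent events rather than a single JSON body. Below is a minimal parser sketch for the SSE wire format; the per-event payload shape is an assumption here, so check the OpenAPI spec at /api/openapi.json for the actual chunk schema:

```javascript
// Split a raw SSE buffer into the data payloads of each event.
// SSE events are separated by a blank line; payload lines start with "data:".
function parseSSE(buffer) {
  return buffer
    .split('\n\n')
    .map((event) =>
      event
        .split('\n')
        .filter((line) => line.startsWith('data:'))
        .map((line) => line.slice(5).trim())
        .join('\n')
    )
    .filter((data) => data.length > 0);
}

// Hypothetical usage: POST to /api/dispatch with stream: true, accumulate
// response body chunks into a string, then JSON.parse each parseSSE payload.
```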
Provider list response
[
{
"name": "claude",
"type": "anthropic",
"default": true,
"status": "healthy",
"lastCheck": "2026-04-04T09:10:00Z"
}
]
Returns an array of task objects including id, name,
schedule, enabled, lastRunAt, and lastStatus.
Task creation request body
{
"name": "my-task",
"schedule": "0 9 * * 1-5",
"provider": "claude",
"prompt": "Your prompt here.",
"enabled": true,
"webhook": "https://..." // optional
}
Runs the task once regardless of its schedule. Returns a runId that can be
polled at /api/tasks/:id/results/:runId.
Accepts any subset of the creation fields. Changes take effect at the next scheduled run.
Permanently removes the task and its run history. Returns 204 No Content.
Returns paginated results. Query params: limit (default 20), offset.
Health check response
{
"status": "ok",
"version": "1.0.0",
"uptime": 391627,
"providers": { "healthy": 3, "degraded": 0 }
}
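The uptime value appears to be in seconds (391627 s is about 4.5 days, consistent with the ping output shown earlier; treat that unit as an assumption). A tiny helper to render it in the CLI's "Nd Nh Nm" style:

```javascript
// Convert an uptime in seconds to the "Nd Nh Nm" style shown by `modelreins ping`.
function formatUptime(totalSeconds) {
  const days = Math.floor(totalSeconds / 86400);
  const hours = Math.floor((totalSeconds % 86400) / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  return `${days}d ${hours}h ${minutes}m`;
}

// formatUptime(391627) -> "4d 12h 47m"
```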
Error responses
| Status | Code | Meaning |
|---|---|---|
| 400 | VALIDATION_ERROR | Missing or invalid request fields |
| 401 | UNAUTHORIZED | Missing or invalid token |
| 404 | NOT_FOUND | Task or resource does not exist |
| 429 | RATE_LIMITED | Provider rate limit hit; Retry-After header set |
| 502 | PROVIDER_ERROR | Upstream provider returned an error |
| 504 | PROVIDER_TIMEOUT | Provider did not respond within the configured timeout |
{
"error": {
"code": "PROVIDER_ERROR",
"message": "Anthropic API returned 529 Overloaded",
    "provider": "claude",
"retryAfter": 30
}
}
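Rate-limited responses carry a retryAfter hint in seconds, as in the example above. A retry sketch that honors it (illustrative pattern, not an SDK feature; the doDispatch callback is injected so the logic stands alone):

```javascript
// Retry a dispatch when the server signals 429 RATE_LIMITED, waiting the
// suggested retryAfter seconds between attempts (exponential fallback if absent).
async function dispatchWithRetry(doDispatch, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await doDispatch();
    if (res.status !== 429) return res;               // success or non-retryable error
    const body = await res.json();
    const waitSec = body.error?.retryAfter ?? 2 ** attempt;
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, waitSec * 1000));
    }
  }
  throw new Error(`rate limited: gave up after ${maxAttempts} attempts`);
}
```

Here doDispatch would typically be a closure over fetch against /api/dispatch; injecting it keeps the retry policy testable and transport-agnostic.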