Stop juggling AI tools.
Start orchestrating them.
One saddle. One brain. Range in.
The Companion packs the local routing brain, a fleet worker, and the Wall into one installer. It detects the AI tools you already have, runs offline by default, and only escalates to a frontier model when the question deserves it.
npm install -g modelreins-worker
We don't care and can't even see that info.
When you install ModelReins, you get Bob — a local brain that lives on your hardware. One Bob per fleet. Every companion you install shares the same intelligence. Your memories, routing patterns, and learned behaviors never leave your machines. Our servers handle dispatch and coordination. Bob handles everything else.
You stay because you want to, not because you have to. If you leave, Bob goes with you. Nothing held hostage. Nothing phoned home.
Personal infrastructure. Built for you, not on you.
Today Anthropic announced Project Glasswing — a $100M coalition with AWS, Apple, Cisco, CrowdStrike, Google, JPMorganChase, Microsoft, NVIDIA, Palo Alto Networks, the Linux Foundation, Broadcom, and more — using Claude Mythos Preview to find zero-days in critical infrastructure. Mythos has already found vulnerabilities that survived 27 years of human review and 5 million automated tests.
Twelve of the biggest tech companies in the world just publicly agreed: AI has crossed a threshold. The same capabilities that make a model powerful enough to find a 27-year-old zero-day in OpenBSD make it powerful enough to need governance.
The window between a vulnerability being discovered and being exploited by an adversary has collapsed — what once took months now happens in minutes with AI. That is not a reason to slow down; it’s a reason to move together, faster. If you want to deploy AI, you need security. That is why CrowdStrike is part of this effort from day one.
— Elia Zaitsev, CTO, CrowdStrike (Project Glasswing announcement, April 8, 2026)
Project Glasswing is the model layer. ModelReins is the orchestration and governance layer. Same problem, different rung. Both essential.
When a single frontier model scores 94% on SWE-bench Verified, the bottleneck isn’t capability anymore. It’s which agent handles which problem, who reviews, what happens when the rate limit hits, and whether anything you shipped can be audited tomorrow. That’s the layer ModelReins lives in.
One command. Your AI workforce is online.
$ npx modelreins-worker
 __  __           _      _ ____      _
|  \/  | ___   __| | ___| |  _ \ ___(_)_ __  ___
| |\/| |/ _ \ / _` |/ _ \ | |_) / _ \ | '_ \/ __|
| |  | | (_) | (_| |  __/ |  _ <  __/ | | | \__ \
|_|  |_|\___/ \__,_|\___|_|_| \_\___|_|_| |_|___/
by MEDiAGATO
Worker: haiku-devbox
Provider: anthropic (haiku-4.5)
Server: app.modelreins.com
Tags: draft,triage,cheap,fast
Session: spawn
[20:24:01] Ready — waiting for jobs...
[20:24:17] >>> Job #803 claimed
[20:24:17] Prompt: Write a product description for ModelReins...
[20:24:17] Spawning: anthropic-cli "Write a product description..."
[20:24:22] <<< Job #803 complete (exit 0, 4.8s)
[20:24:27] >>> Job #804 claimed
[20:24:27] Prompt: Triage this issue: auth middleware returns 403...
[20:24:29] <<< Job #804 complete (exit 0, 1.2s)
[20:24:34] Ready — waiting for jobs...
Google calls it:
"the shift from generative to agentic AI."
We just call it Tuesday.
ModelReins has been orchestrating multi-provider AI workforces while the industry was still writing trend reports about it.
Google Cloud AI Agent Trends 2026
Download the Companion. The wizard finds your Ollama, LM Studio, and any AI tools you already use, then pulls a small local model if you don’t have one yet.
Install the Saddle in VS Code. It finds the local Companion automatically over loopback. One keystroke to dispatch.
Type a prompt. The Matriarch picks the right worker — local first, frontier when you need it. Output streams back in real time.
Route tasks to any registered worker — manual, automatic, or fan-out to multiple workers in parallel.
Tag workers by capability. The Matriarch matches job complexity and type to the right worker automatically.
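To make tag-based matching concrete, here is a minimal sketch in Python. It is not ModelReins code: the `Worker` class, the `pick_worker` function, and the `opus-review` worker are hypothetical names invented for illustration, and the rule shown (first worker whose tags cover every tag the job asks for) is only one plausible matching policy.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical worker record: a name plus a set of capability tags."""
    name: str
    tags: set = field(default_factory=set)


def pick_worker(workers, required_tags):
    """Return the first worker whose tags cover every required tag, else None."""
    required = set(required_tags)
    for worker in workers:
        if required <= worker.tags:  # subset test: all required tags present
            return worker
    return None


workers = [
    Worker("haiku-devbox", {"draft", "triage", "cheap", "fast"}),
    Worker("opus-review", {"review", "deep"}),  # hypothetical second worker
]

chosen = pick_worker(workers, ["triage", "fast"])
print(chosen.name if chosen else "no match")
```

A real matcher would likely also weigh cost, load, and job complexity; the subset test is just the simplest rule that makes tags meaningful.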
Rate limited on Opus? The router fails over to Sonnet. Sonnet full? Ollama picks it up locally. Work never stops.
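The failover chain described above can be sketched as an ordered list of providers tried in turn. This is an illustrative sketch, not the ModelReins router: `RateLimited`, `dispatch_with_failover`, and the three stub provider functions are assumptions made up for the example, and no real provider API is called.

```python
class RateLimited(Exception):
    """Hypothetical error a provider call raises when its quota is exhausted."""


def dispatch_with_failover(prompt, providers):
    """Try each (name, call) pair in order; return (name, output) from the first success."""
    skipped = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except RateLimited:
            skipped.append(name)  # remember who was skipped, then try the next one
    raise RuntimeError(f"all providers rate limited: {skipped}")


# Stub providers standing in for real model calls.
def opus(prompt):
    raise RateLimited()


def sonnet(prompt):
    raise RateLimited()


def ollama(prompt):
    return f"local answer to: {prompt}"


chain = [("opus", opus), ("sonnet", sonnet), ("ollama", ollama)]
winner, output = dispatch_with_failover("Triage this issue", chain)
print(winner, "->", output)
```

Putting the local model last in the chain is what keeps work moving when every hosted provider is throttled.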
VS Code extension. One keystroke to dispatch from your editor. Output streams back inline. No context switch.
Your brain follows you across every machine you own. Context, history, and routing state — everywhere.
Per-job spend across every provider. See exactly what each task cost, in real time and in history.
Queue jobs to run at a time or on a cron. Overnight builds, morning reports, timed deploys.
Every output stored and searchable. Full audit log with worker, provider, cost, and duration.
Define your infrastructure in YAML. Workers know what exists, what's healthy, and what's rate-limited.
File, URL, or dead man's switch. Halt all workers instantly. Independent of the server.
Pointers, not passwords. Env vars or Vault. Workers get short-lived tokens, not the keys themselves.
Your API keys never touch the control plane. Workers fetch credentials locally and talk directly to providers.
Every action HMAC-signed and logged. Verify integrity. Ship to your SIEM.
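For readers unfamiliar with HMAC-signed logs, the idea can be sketched with Python's standard `hmac` module. The entry fields, the key, and the function names `sign_entry` and `verify_entry` are assumptions for illustration; the actual ModelReins log format and key handling are not described here.

```python
import hashlib
import hmac
import json


def sign_entry(entry, key):
    """Return a hex HMAC-SHA256 over a canonical JSON encoding of the entry."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_entry(entry, signature, key):
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_entry(entry, key), signature)


key = b"demo-key-not-for-production"  # assumption: a real key would come from a secret store
entry = {"job": 803, "worker": "haiku-devbox", "action": "complete", "exit": 0}
signature = sign_entry(entry, key)

tampered = dict(entry, exit=1)  # same entry with one field altered
print(verify_entry(entry, signature, key))
print(verify_entry(tampered, signature, key))
```

Sorting keys before signing matters: two logically identical entries must serialize to the same bytes, or verification would fail for reasons unrelated to tampering.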
Complete data isolation. Admin, operator, viewer. Teams share one server safely.
AI orchestration used to be a luxury reserved for teams with dedicated ML platform engineers. Solo devs and indie teams, whose code increasingly runs the world, have been left to hand-coordinate agents across a dozen browser tabs. The Companion’s free tier is our answer to that imbalance. Your laptop, your local brain, your workers. No account required to start.
Start free. Upgrade when you need more workers.
Run ModelReins on your own infrastructure. Bring your own API keys, keep data in-house, and manage AI workloads across teams with the controls you actually need.