What's new in ModelReins
Release history, features, and fixes. Two channels — the stable one you can trust, the beta one where the new stuff ships first.
Stable: last known-good. Pin to this track if you need a safety net. If beta ever blows up, come here.
Beta: where the new stuff ships first. Breaking a beta is fine; stable has your back.
SSRF hardening: decimal/hex/octal numeric IP forms are now decoded before DNS
- URLs like `http://2130706433/`, `http://0x7f000001/`, and `http://0177.0.0.1/` (all encodings of 127.0.0.1) are now decoded to their canonical IPv4 form BEFORE the safe-URL check, on every platform. Previously, Linux's getaddrinfo silently resolved decimal IPs to 127.0.0.1 (caught), but Windows getaddrinfo failed outright, so a Windows-hosted Companion treated the URL as 'DNS failed, denied' — denying correctly but losing the SSRF signal in audit logs. Defense-in-depth assumed numeric forms were caught here; now they actually are.
- Cloud metadata endpoint (`169.254.169.254`) audit messages now say 'link-local — covers cloud metadata' instead of the generic 'private network address'. In Python 3.13+, the metadata IP has both is_link_local=True and is_private=True; the safe-URL check now orders link-local first so the specific threat is named in audit logs.
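A minimal sketch of the two checks above, assuming a safe-URL check built on Python's standard `ipaddress` module. The names `decode_numeric_host` and `classify_blocked` are hypothetical, not ModelReins' actual functions, and the returned labels are shortened stand-ins for the real audit messages.

```python
import ipaddress
from urllib.parse import urlsplit


def _parse_part(part):
    """Parse one dotted component as hex (0x..), octal (0..) or decimal."""
    lowered = part.lower()
    if lowered.startswith("0x"):
        digits, base, allowed = lowered[2:], 16, "0123456789abcdef"
    elif len(lowered) > 1 and lowered.startswith("0"):
        digits, base, allowed = lowered[1:], 8, "01234567"
    else:
        digits, base, allowed = lowered, 10, "0123456789"
    if not digits or any(ch not in allowed for ch in digits):
        return None
    return int(digits, base)


def decode_numeric_host(host):
    """Return the canonical IPv4Address for a numeric host, else None."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    values = [_parse_part(p) for p in parts]
    if any(v is None for v in values):
        return None
    *leading, last = values
    if any(v > 0xFF for v in leading):
        return None
    # The final component fills every byte the leading ones did not.
    remaining_bytes = 4 - len(leading)
    if last >= 256 ** remaining_bytes:
        return None
    packed = 0
    for v in leading:
        packed = (packed << 8) | v
    packed = (packed << (8 * remaining_bytes)) | last
    return ipaddress.IPv4Address(packed)


def classify_blocked(url):
    """Name the reason a URL is blocked, or None if no numeric rule fires."""
    ip = decode_numeric_host(urlsplit(url).hostname or "")
    if ip is None:
        return None  # a real hostname: left to DNS and later checks
    if ip.is_loopback:
        return "loopback"
    # Link-local is tested before private so the metadata IP gets the
    # specific label even though both flags are true for it.
    if ip.is_link_local:
        return "link-local"
    if ip.is_private:
        return "private"
    return None
```

Because the decode step runs before any resolver call, the result no longer depends on how a given platform's getaddrinfo treats numeric forms.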
Atomic job claim — duplicate-execution hole closed at the dispatcher
- `PUT /jobs/{id}` with `status='running'` now conditionally claims the job only if its current status is `pending`. Two workers sharing a `WORKER_NAME` (companion restart in-flight, VM clone with stale hostname, network retry that lands after the first PUT already succeeded) can no longer both run the same prompt and double the token cost.
- Collision returns 409 Conflict with the current status; the channel daemon (companion 4.5.13+) handles 409 cleanly — treats it as 'someone else got this one', rolls back local state, retries on next poll. Other transitions (done/failed/etc) still go through the standard path.
- Pairs with the companion-side stuck-job watchdog (4.5.13) so the full duplicate-execution path is closed end-to-end.
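The conditional claim can be sketched as a single guarded UPDATE. This is an illustration only: the `jobs` table layout, the `claim_job` helper, and the use of SQLite are assumptions made for a self-contained example, not the dispatcher's real schema or storage.

```python
import sqlite3


def claim_job(conn, job_id):
    """Try to move a job from pending to running; return (http_status, status)."""
    cur = conn.execute(
        "UPDATE jobs SET status = 'running' WHERE id = ? AND status = 'pending'",
        (job_id,),
    )
    conn.commit()
    if cur.rowcount == 1:
        return 200, "running"  # this caller won the claim
    row = conn.execute(
        "SELECT status FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()
    if row is None:
        return 404, None
    return 409, row[0]  # someone else got it: report the current status


def demo_double_claim():
    """Two claims for the same job: the first wins, the second gets 409."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute("INSERT INTO jobs (id, status) VALUES (1, 'pending')")
    return claim_job(conn, 1), claim_job(conn, 1)
```

The status test lives inside the UPDATE's WHERE clause, so the check and the write happen as one statement and two workers cannot both observe `pending`.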
Channel watchdog, wizard polish, npm brain package, Linux exec fix
- If Claude crashes mid-stream or a notification gets dropped, the channel no longer wedges with a stuck currentJobId. A new watchdog clears the slot and best-effort reports the job as failed after 30 minutes, a window generous enough that legitimate long-running Claude tasks aren't flagged as stuck.
- Pairs with the SaaS-side atomic claim shipped in 4.8.x: the channel now correctly handles 409 Conflict on claim collisions (treats it as 'someone else got the job' and rolls back local state for the next poll).
- The wizard now auto-picks the Director when you save — one fewer click in the most common path (closes #150).
- 'Open the Wall' button on the wizard's success screen actually opens the Wall now (closes #149).
- The screensaver mode's npx-fallback exec line is now correctly tokenized so the multi-word command parses on `freedesktop`-spec systems. Linux users running Companion as a screensaver get the same auto-update behavior Windows users already had.
- Companion's local routing brain (`local-brain.js`) is now published as `@mediagato/brain` on npm. Same code, same behavior — just consumable by other modules in the elifant ecosystem (silicon worker SDK, future Director surfaces, etc.) without vendoring.
- Companion repo now ships with an explicit BUSL-1.1 license file. Same license that the rest of MEDiAGATO ecosystem code uses — no change in terms, just visible in the tree now.
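The stuck-job watchdog from the first item in this release could look roughly like the sketch below. The Companion itself is not written in Python; `ChannelSlot`, `watchdog_tick`, and the injected clock are hypothetical names used only to show the idea.

```python
import time

STUCK_AFTER_SECONDS = 30 * 60  # the 30-minute window from the release note


class ChannelSlot:
    """Holds at most one in-flight job and remembers when it was claimed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self.current_job_id = None
        self._claimed_at = None

    def claim(self, job_id):
        self.current_job_id = job_id
        self._claimed_at = self._clock()

    def release(self):
        self.current_job_id = None
        self._claimed_at = None

    def watchdog_tick(self, report_failed):
        """Clear a job held past the window; return its id, else None."""
        if self.current_job_id is None:
            return None
        if self._clock() - self._claimed_at < STUCK_AFTER_SECONDS:
            return None
        stuck = self.current_job_id
        self.release()  # free the slot first so polling can resume
        try:
            report_failed(stuck)
        except Exception:
            pass  # best-effort: a failed report must not re-wedge the channel
        return stuck
```

Releasing the slot before reporting means a failing report call cannot leave the channel wedged again.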
Three quick-fire patches that close the SERIAL-gap saga on Postgres + small operational nits
- Fixed a bug where create_review_job's combined INSERT+UPDATE was advancing the jobs.id SERIAL counter on every retry, leaving holes in the job ID sequence. Now split into a stable INSERT first, UPDATE second — no more skipped IDs on Postgres.
- Quality-gate flag is now coerced correctly under the Postgres TEXT column. Previously a stored '0' or 'false' could read as truthy because the TEXT representation was non-empty. The check is explicit now.
- Dev banner keys off the Host header instead of an env var, so staging and dev deploys correctly show their banner without needing a per-environment env var set.
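To illustrate the quality-gate fix: any non-empty string is truthy in Python, so a flag stored as TEXT needs an explicit comparison. The helper name and the accepted spellings below are assumptions for this sketch, not the server's actual list.

```python
# Spellings treated as "enabled"; everything else, including '0' and
# 'false', reads as disabled. This set is an assumption for illustration.
_TRUE_VALUES = {"1", "true", "yes", "on"}


def text_flag_enabled(raw):
    """Interpret a flag stored in a TEXT column as a real boolean."""
    if raw is None:
        return False
    return str(raw).strip().lower() in _TRUE_VALUES
```

For comparison, `bool("0")` and `bool("false")` both evaluate to `True`, which is the behavior the patch removes.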
Director planner goes LLM-driven — and 'tools' are now 'skills' to match the rest of the agent ecosystem
- The Director now drafts plans by calling a worker LLM with the skill catalog as context. Out-of-the-box requests like 'help me draft an email about X' or 'pull the latest GitHub stars and write a memo' get planned with real reasoning instead of pattern matching.
- Deterministic pattern matcher kept as the fallback. If the LLM is unreachable, returns malformed JSON, or names an unknown skill, the planner drops to the v4.7.0 patterns. End users never see a 500 from a bad LLM response.
- Set `MODELREINS_DIRECTOR_LLM_DISABLED=1` on a server to force pattern-matcher only — useful for low-resource self-host deploys or when the worker fleet is empty.
- Lined the Director up with the rest of the agent-platform vocabulary (Anthropic, OpenClaw, HyperAgent, Hermes, Wirken — they all call this 'skills'). Catalog endpoint at `GET /director/skills`. The legacy `/director/tools` returns the same payload under both keys for one release cycle, then it goes away.
- A skill is a registered named capability the Director can invoke. Today's catalog has 9 (dispatch_inference, web_search, worker_list, fleet_find_worker_with_capability, worker_currently_loaded, worker_list_models, worker_load_model, worker_unload_model, dispatch_e2e_smoke). The Skill Factory feature — Director writing its own skills at runtime — is queued.
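The fallback behavior described above can be sketched as follows. Only the environment variable name and the skill names come from the notes; `draft_plan`, `pattern_plan`, the plan shape, and the toy matching rule are hypothetical.

```python
import json
import os

# A subset of the catalog, enough for the sketch.
KNOWN_SKILLS = {"dispatch_inference", "web_search", "worker_list"}


def pattern_plan(request):
    """Stand-in for the deterministic matcher (the real patterns differ)."""
    skill = "web_search" if "search" in request.lower() else "dispatch_inference"
    return [{"skill": skill, "input": request}]


def draft_plan(request, call_llm, env=os.environ):
    """Prefer the LLM plan; drop to the pattern matcher on any problem."""
    if env.get("MODELREINS_DIRECTOR_LLM_DISABLED") == "1":
        return pattern_plan(request)
    try:
        steps = json.loads(call_llm(request))
        if not isinstance(steps, list) or not steps:
            raise ValueError("plan must be a non-empty list")
        for step in steps:
            if step["skill"] not in KNOWN_SKILLS:
                raise ValueError("unknown skill")
        return steps
    except Exception:
        # Unreachable LLM, malformed JSON, or an unknown skill all land
        # here, so the caller always receives a usable plan.
        return pattern_plan(request)
```

Every failure mode is caught in one place, so the caller gets a plan rather than an error.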
Director substrate ships — request planner + admin model swap + chat UI
- New `/director` chat UI lets you type a request, see the plan the Director assembled, approve or cancel, then execute. Plans persist; you can revisit any prior plan.
- Tool catalog at `GET /director/tools` lists what the Director can call: dispatch_inference, web_search, worker_list, fleet_find_worker_with_capability, worker_currently_loaded, worker_list_models, worker_load_model, worker_unload_model, dispatch_e2e_smoke.
- v4.7.0 ships a deterministic pattern-matcher planner (search / smoke / list / model-swap / generic). LLM-driven planning is layer-2 work — the substrate (catalog, schema, endpoints, UI) is now in place so the swap-in is a single class change.
- New endpoints under `/api/v1/workers/{worker_name}/`: `models` (list), `load_model`, `unload_model`, `admin_task/{id}` (track progress). Records intent in the new `worker_admin_tasks` table; daemon-side execution is queued.
- Pairs with the Holly LM Studio CT 200 substrate live at `holly-lmstudio-01`. Once the daemon-side handlers ship, the Director can swap models on-fleet ('swap to qwen-coder-7b on holly-lmstudio-01') as a single step.
- Master-token / platform-admin only. Tenant admins cannot load/unload on platform workers.
- New `director_plans` + `director_plan_steps` tables for plan persistence.
- New `worker_admin_tasks` table for model-swap operations.
Ollama install completes reliably.
- The setup wizard's Ollama install step now signals completion the moment the Ollama API is reachable, instead of waiting on a PowerShell wrapper handle that could hang indefinitely on Win11.
- If the UAC prompt is canceled or the install otherwise doesn't finish, the wizard surfaces a clear timeout message after 3 minutes instead of staying frozen.
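A rough sketch of "signal completion when the API is reachable, give up after 3 minutes". It assumes Ollama's default local port (11434) and its `/api/version` endpoint; the function names are hypothetical, and the wizard's real implementation is not Python.

```python
import time
import urllib.request


def ollama_api_reachable(base_url="http://127.0.0.1:11434"):
    """True once the local Ollama API answers (default port assumed)."""
    try:
        with urllib.request.urlopen(base_url + "/api/version", timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False


def wait_until_ready(probe, timeout_s=180.0, interval_s=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it succeeds or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False  # caller shows the timeout message
        sleep(interval_s)
```

A caller would run `wait_until_ready(ollama_api_reachable)` and show the timeout message when it returns `False`, without depending on the installer process handle at all.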
Artifacts land in the Wall.
- When workers upload images, video, audio, or PDFs via `upload_artifact`, they now surface in a dedicated Artifacts page in the Wall carousel — image content-types render as thumbnails inline, others show a content-type pill.
- Click any artifact to open its stable `/s/<slug>` URL in your system browser.
- If the Companion is paired with a SaaS instance older than API 4.6.0 (which doesn't expose `/artifacts`), the Wall just hides the Artifacts page — no error spam.
Smoother upgrades.
- The installer closes the running Companion automatically before installing the new build, so in-place upgrades always land on the latest version.
- The setup wizard's footer now shows the running Companion version.
One-click pairing from the Wall.
- When the Companion isn't paired with a fleet yet, the Wall now shows a Pair With Your Fleet prompt with the setup wizard one click away.
- Local-only mode is a one-click choice from the same prompt — for users who want the on-machine routing brain without a SaaS pairing.
- When the wizard saves, the Wall swaps to the live dashboard immediately — no manual reload, no restart.
Security and reliability hardening across the Companion.
- The local pairing handshake responds only to known IDE origins.
- Brain-channel access validates every caller against the Companion's own renderer paths.
- Installer execution requires absolute paths.
- Reset uses a marker-file + clean-boot pattern, so the wipe runs before any data files are opened.
- Ollama model downloads use an idle-timeout, so stalled connections surface clearly.
- Job dispatch is reentrancy-safe and preserves output chunk order when streaming long responses.
Steadier polling under network jitter.
- The poll loop now drops a tick if the previous one is still in flight, so a slow API response can't trigger two concurrent claims of the same job.
- A failed claim PUT (5xx, timeout, network blip) now rolls back the local state cleanly, so the next poll can retry the job from scratch instead of half-claiming it.
- MCP notification failures release the in-flight job slot rather than silently wedging the channel until restart.
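The first two polling fixes can be sketched together. `Poller` and its callbacks are hypothetical names; the non-blocking lock stands in for whatever in-flight guard the channel daemon actually uses.

```python
import threading


class Poller:
    """Poll loop that skips overlapping ticks and rolls back failed claims."""

    def __init__(self, fetch_pending, claim):
        self._fetch_pending = fetch_pending
        self._claim = claim
        self._tick_lock = threading.Lock()
        self.current_job_id = None

    def tick(self):
        # Drop this tick if the previous one has not finished yet.
        if not self._tick_lock.acquire(blocking=False):
            return "skipped"
        try:
            job_id = self._fetch_pending()
            if job_id is None:
                return "idle"
            self.current_job_id = job_id
            try:
                self._claim(job_id)
            except Exception:
                # 5xx, timeout, network blip: undo local state so the
                # next poll can retry the job from scratch.
                self.current_job_id = None
                return "claim_failed"
            return "claimed"
        finally:
            self._tick_lock.release()
```

The non-blocking acquire is what turns a slow API response into a dropped tick instead of a second concurrent claim.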
Visual artifacts ship + silicon workers as first-class identities
- Workers can now produce images, video, audio, and PDFs as first-class outputs. Each artifact gets a stable URL at `modelreins.com/s/<slug>` (3-tier hosting: companion-local, modelreins-tunnel, or your own protowebb).
- The review queue renders artifacts inline so a human can preview the actual image before approving it for publish — purpose-built for generative-AI compliance workflows where someone needs to eyeball every output.
- Foundation for the upcoming creative-team use case: marketing departments generate brand assets at scale, your queue catches anything off-brand before it goes live.
- New `workers_registry` table: every silicon worker has a persistent identity independent of heartbeat/presence — declared capabilities, risk tiers (auto / audit / approve / session), audit trail, revocation.
- New `/workers` dashboard for register / list / revoke. New reserved `platform` tenant for internal first-party workers (Moltbook bot, Director, Windows tester).
- Pairs with the silicon-worker SDK 0.2.0 (see below) for the full developer experience.
- New `/settings/api-keys` flow lets users mint scoped keys themselves. Keys appear once via a one-time-view URL with 15-minute TTL — no support tickets, no raw tokens emailed, no tokens in logs.
- Tightened plan-picker routing across Pro / Team / Lifetime so every upgrade path lands at a clean checkout page in one click.
Catch-up release: playwright provider, killswitch, rate-limit phrases
- Wrap web UIs that have no API as first-class workers. Useful when the only way to get something done is to drive a browser.
- Worker poll loop now honors a server-side killswitch — the dispatch fleet can be stopped centrally without restarting every worker.
- More upstream rate-limit error phrases recognized + classified, so workers back off gracefully instead of failing jobs.
Workers can submit visual artifacts + queue reviews
- `worker.upload_artifact(data, content_type, filename, ...)` — POST a binary blob (image, video, audio, PDF) through the artifact tunnel, get back `{slug, url, content_type, size_bytes}`. Multipart-encoded manually to stay stdlib-only — zero new dependencies.
- `worker.submit_review(type, title, content, preview, target, ...)` — push content to the human-review queue with optional artifact preview. Used when a worker produces output that shouldn't publish without a human eyeball (risk_tier=audit or approve).
- New `examples/image_generation_worker.py` shows the full pattern end-to-end: claim a job, call 1MinAI IMAGE_GENERATOR, upload result as artifact, submit to review queue. Drop-in starter for any text-to-image worker.
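For readers curious what "multipart-encoded manually" involves, here is a generic stdlib-only encoder. It is not the SDK's code: `encode_multipart` and the form field names used with it are assumptions, and the artifact tunnel's real field names may differ.

```python
import uuid


def encode_multipart(fields, file_field, filename, content_type, data):
    """Build a multipart/form-data body by hand; return (body, header value)."""
    boundary = uuid.uuid4().hex
    safe_name = filename.replace('"', "%22")  # keep the quoted string intact
    chunks = []
    for name, value in fields.items():
        chunks.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    chunks.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{safe_name}"\r\n'
         f"Content-Type: {content_type}\r\n\r\n").encode("utf-8")
    )
    chunks.append(data)  # raw bytes, no transfer encoding needed
    chunks.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"
```

The second return value goes in the request's `Content-Type` header, which is how the receiver learns where each part begins and ends.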
Build silicon workers in 60 seconds.
- A stdlib-only Python SDK for building silicon workers on ModelReins. One install command (`pip install modelreins-worker`), one class (`Worker`), five methods: heartbeat, inbox, claim, complete, run. The full public API fits on one screen.
- Works with any ModelReins server at version 4.5.0 or later. Uses worker-token auth through the existing one-time-view retrieval flow — raw tokens never transit email or chat.
- Silicon workers become first-class employees in your tenant: declared capabilities, risk tiers (auto / audit / approve / session), audit trail, revocation. See /workers on your dashboard to register your first one, or the docs page linked below.
Cleaner first-run on fresh Windows.
- Fresh Windows installs complete the Ollama setup step in about 30 seconds — Companion now requests elevation explicitly before handing the installer control.
- Carries forward 4.5.6's brighter Stirrup tray icon, clean first-run, double-click-safe installer, and 256k Ollama context default.
Brighter Stirrup in your tray.
- The Companion's tray icon now uses the same `#00ff88` green that's on the landing page and the logo. Brand-consistent, opaque, unmissable in a row of colorful system-tray icons.
- Carries forward 4.5.5's clean Windows first-run, double-click-safe installer, and 256k Ollama context default.
Cleaner Windows first-run.
- Fresh Windows installs now land on a working local worker without hand-holding. The install wizard progresses cleanly through Ollama setup and model pull on the first try.
- Double-clicking the downloaded installer no longer produces two wizards on top of each other — the installer now holds a single-instance lock so the second launch quietly no-ops.
- Ollama is configured with a 256k context window out of the box, ready for long prompts.
Zero-click auto-pair.
- When the Saddle has no token stored, it now probes the Companion's loopback router (127.0.0.1:11435/pair) and picks up your API token silently. Zero clicks, no paste, no tray hunt. Just works if the Companion is running on the same machine.
- Manual paste still works for remote-VSCode setups or split-machine configurations. Graceful fallback.
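The probe-then-fall-back flow might be sketched like this. The loopback URL comes from the note above; the JSON response shape (a `token` key) and the function names are assumptions, and the Saddle itself is a VS Code extension rather than Python.

```python
import json
import urllib.request

PAIR_URL = "http://127.0.0.1:11435/pair"


def fetch_pair_payload(url=PAIR_URL, timeout_s=1.0):
    """Ask the local Companion for its pairing payload."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


def resolve_token(stored_token, fetch=fetch_pair_payload):
    """Return (token, source): stored, auto-paired, or manual-paste."""
    if stored_token:
        return stored_token, "stored"
    try:
        token = fetch().get("token")  # hypothetical response key
    except (OSError, ValueError, AttributeError):
        token = None  # Companion absent or replied unexpectedly
    if token:
        return token, "auto-paired"
    return None, "manual-paste"  # remote or split-machine setups
```

A short timeout keeps first launch responsive when no Companion is listening on the loopback port.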
Connect, visibly.
- When the Saddle has no API token stored, the Patch panel shows an explicit "Not connected yet" card with a Connect button. Click it, paste your token from the Companion tray, done.
- "ModelReins: Connect to Patch" is now properly registered in the Command Palette, so you can start the auth flow from anywhere with Ctrl+Shift+P.
- After a successful connect, the fleet refreshes immediately — you see your worker show up without waiting for the next poll.
- README corrected: the extension is under Business Source License 1.1, not MIT. Matches the rest of the ModelReins repo.
- Saddle strip section count updated in docs (three segments: effort, mode, targets).
Fresh coat.
- Marketplace icon updated to the circular hands-and-reins mark that matches the Companion and the landing page.
- Extension description and README rewritten around the 4.5.x story: Companion-first, local-first, trivial-tier dispatch to your own hardware.
- The package-level config default is now "trivial" — matching the runtime. Fresh installs see "trivial" in the Saddle strip on first load.
Dispatch that matches your fleet.
- First-run effort tier is now "trivial" — maps to your local Companion worker, so fresh installs dispatch to qwen2.5 the moment you sign in. Click up to standard/deep/critical once you add cloud workers.
- Connect prompt asks for your API token and nothing else. Server URL defaults silently; override only if you self-host.
- When your fleet has no eligible reviewer for the current tier, the quality gate auto-passes with a note. Clean jobs table, clean head.
Auto-pair with the Saddle.
- The Companion now exposes its API credentials at `http://127.0.0.1:11435/pair` for any client running on the same machine. The Saddle uses this to auto-pair on first launch — you install the Companion, install the Saddle, and the Saddle just connects. No token paste, no tray hunt.
- Loopback-only (bound to 127.0.0.1), so nothing on the network can pull your token. Your keys stay local.
Tray lights up on finish.
- The tray menu's "Copy API Token" item enables the moment the setup wizard writes your token — no companion restart needed.
- Applies to both fresh-install wizards and the older save-config path.
First-run, every time.
- The wizard runs six named stages — detect your AI engine, install Ollama, pull the routing brain, connect your account, seed the Director, register the worker. Each stage reports status as it lands. Every external failure surfaces in plain text, no trailing ellipses.
- Your API token appears on the done screen with a copy button, ready for the Saddle. System tray handshake confirms where the app is running.
- Copy API Token — clipboard in one click, confirmation balloon.
- Check for Updates — pings the server, flags newer builds, points you at the download.
- Ollama submenu — reinstall, remove and clear the model cache, or open Ollama's site. Keep your local brain on your terms.
- The uninstaller asks whether to keep or wipe your ModelReins config, and separately whether to remove Ollama. Defaults keep everything; one-click wipes when you want a truly fresh start.
- Stale Ollama partial-chunk files get cleaned before each model pull, so interrupted downloads don't poison the next one.
- Ollama detection tries IPv4, IPv6, and hostname localhost so the VM quirks that used to hang the wizard don't anymore.
Red button in reach.
- A red "kill" button sits next to "send it." One tap aborts every in-flight job in the current thread, leaves other threads and other workers untouched. Scalpel, not sledgehammer.
- Works for both single-worker and fan-out dispatches. Fail-soft — one bad abort won't block the rest.
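A small sketch of the thread-scoped, fail-soft abort. The job record shape and the `abort_thread_jobs` helper are hypothetical.

```python
def abort_thread_jobs(in_flight, thread_id, abort):
    """Abort every in-flight job in one thread; return (aborted, failed) ids."""
    aborted, failed = [], []
    for job in in_flight:
        if job["thread_id"] != thread_id:
            continue  # other threads and their workers are left untouched
        try:
            abort(job["id"])
            aborted.append(job["id"])
        except Exception:
            # Fail-soft: record the failure and keep aborting the rest.
            failed.append(job["id"])
    return aborted, failed
```

Collecting failures instead of raising is what lets one bad abort pass without blocking the others.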
- Background stability work. No behavior changes to normal dispatch.
More than one mind on the problem.
- The target picker now takes checkboxes. Leave it empty for "router picks." Tick one to pin a worker. Tick two or more to dispatch the same prompt to all of them at once.
- Each worker's answer streams into its own card, side by side, so you can read them as they arrive. No more flipping between jobs to compare.
- A merged answer, combined automatically from every worker's response, appears at the top once everyone's done. Read just that if you're in a hurry, or compare and accept the one you liked best.
- Tap "accept" on any card to mark it as the one you went with. Tap "copy" to pull just that card's text into your clipboard. The Saddle remembers which one you accepted.
- The command strip shows "N workers ⤳ fan-out" when you've picked two or more, so you always know what's about to fire.
- Single-worker dispatch is unchanged — pick none or one and it works exactly like before.
- Background quality and stability work; nothing you'll notice.
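The target-picker rules above (none means the router picks, one pins a worker, two or more fan out) can be sketched with a thread pool. `dispatch`, `send`, and `pick_worker` are hypothetical names, and the merge step is left out because the notes do not say how answers are combined.

```python
from concurrent.futures import ThreadPoolExecutor


def dispatch(prompt, targets, send, pick_worker):
    """Send one prompt to every chosen worker; return answers keyed by worker."""
    if not targets:
        targets = [pick_worker(prompt)]  # empty selection: router picks
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = {name: pool.submit(send, name, prompt) for name in targets}
        # Each worker's answer stays separate so it can fill its own card.
        return {name: future.result() for name, future in futures.items()}
```

With a single target this reduces to an ordinary one-worker dispatch, matching the note that single-worker behavior is unchanged.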
Your brain. Your machine. Your rules.
- Every companion now ships with Bob — a local brain that lives on your machine. Your memories, patterns, and learned behaviors stay with Bob. Not on our servers. Not in our database.
- New accounts are seeded with default routing patterns on first setup. From there, Bob is yours.
- Download, run, useful in 3 minutes. The wizard handles everything: local routing engine, account creation, worker registration.
- Privacy-first welcome — the first screen explains our data promise. No buried terms of service.
- Inline signup — create your account directly in the companion. No browser tab, no copy-paste.
- Workers can now check out scoped context from Bob with a time limit and check it back in with results. Think of it as a library card for your brain.
- Every piece of data is tagged at creation so the system knows what's shareable and what's private.
- New human-readable job IDs.
- Workers now have a health lifecycle. Unresponsive workers are automatically removed from dispatch and archived for analysis.
- The In Memoriam page honors retired workers and the intelligence they contributed.
- Graduated security posture with automatic de-escalation timers. Full shutdown requires human confirmation — no automated path.
- Improved backend performance and reliability.
- The tagline bar now rotates through a list of Prince-tribute songs. Purple Reins. November Reins. Reinsdrops Keep Fallin' on My Head.
Mission Control comes alive.
- Mission Control — the dashboard idle screen now shows your live patch: active workers with heartbeat status, recent job history, and fleet activity. Refreshes every 30 seconds. Clears automatically when you start a conversation.
- ModelReins app menu — clean, branded menu replaces the generic default. Includes version display, About screen, and a direct link to documentation.
- Version in tray — build version visible in the tray tooltip and right-click menu so you always know exactly what you're running.
Tightened up. Locked down.
- ModelReins logo throughout — window, installer, and tray all use the correct icon.
- Local model detection — Ollama and LM Studio instances on the same machine are detected reliably during setup.
- Brain context injection is now opt-in and disabled by default. Workers only receive brain context when explicitly requested per dispatch.
- Credential pattern detection added to the brain sensitivity filter — scans both key names and value content before any context is shared.
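As an illustration of scanning both key names and value content, here is a minimal filter. The patterns are common examples chosen for this sketch, not the Companion's actual rule set, and `filter_context` is a hypothetical name.

```python
import re

# Key names that suggest a credential, regardless of the value.
_KEY_NAME = re.compile(r"(api[_-]?key|secret|token|password|passwd)", re.IGNORECASE)

# Value shapes that look like credentials, regardless of the key name.
_VALUE_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),  # AWS access key ID shape
    re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{20,}"),
]


def looks_sensitive(key, value):
    """True if either the key name or the value content looks like a secret."""
    if _KEY_NAME.search(key):
        return True
    return any(pattern.search(value) for pattern in _VALUE_PATTERNS)


def filter_context(entries):
    """Drop sensitive-looking entries before context leaves the brain."""
    return {k: v for k, v in entries.items() if not looks_sensitive(k, str(v))}
```

Checking values as well as names catches a secret pasted under an innocuous key such as `note`.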
Your patch. Your workers.
- Patch — the saddle now connects to your patch. Workers, jobs, and routing all flow through the patch you own.
- Version is now live — the saddle header always reflects the actual installed version.
The Wall is here. Cast your Director from the fleet.
- The Wall — windowed mission control with live fleet view, job stream, metrics, leaderboard, and audit trail.
- Page carousel — indicator dots at the bottom of the Wall are now clickable to jump directly to any page.
- Welcome splash — first-launch window explains the tray icon and how to open the Wall.
- First-close notice — closing the Wall for the first time shows a system notification reminding you the companion is still running in the tray.
Compact controls. Conversation thread. Enter to send.
- Conversation thread — responses render inline below the prompt, follow-ups carry full context, each reply has copy and insert-at-cursor actions.
- Command strip — effort, mode, and target compressed into a single clickable line above the prompt.
- Enter to send, Shift+Enter for new line.
- Stable and Beta download channels — two independent lanes, fix one without touching the other.
Multi-provider AI orchestration is here.
- Multi-provider support — Claude, OpenAI, Gemini, Ollama, LM Studio, OpenRouter, 1MinAI, or any CLI tool.
- Provider plugin registry — add new AI tools via YAML config, no code changes.
- PayPal billing — Free, Pro ($29/mo), Team ($79/mo) plans.
- npm package — npx @mediagato/modelreins-worker to connect in 60 seconds.
- Multi-tenant brain — engine (shared knowledge), tenant (isolated), ops (admin) scopes.
- Engine knowledge — workers learn error patterns and provider tips at startup.
- Preflight validation — workers verify environment before claiming jobs.
- Structured error classification — spawn_failed, auth_failed, timeout, rate_limited, etc.
- Dead man's switch — automated failsafe notifications.
- Quickstart docs at /docs/quickstart.
- Terminal showcase on landing page.
- Demo video.
- Killswitch — graduated defense levels with automatic de-escalation (all plans).
- Zero-knowledge architecture — API keys stay on your machine.
- HMAC-signed audit trail — every action logged with a tamper-evident signature.
- Conversation context — follow-up messages include full history.
- Enter to send, Shift+Enter for newline.
- Worker cards with provider color accents.
- Per-section error boundaries — one component crash doesn't kill the page.
- 10 provider YAML definitions (Claude, OpenAI, Gemini, Ollama, LM Studio, OpenRouter, 1MinAI, Nocturne, OpenClaw, custom).
- CI pipeline — lint, security, validate, test (31/31 passing), e2e, deploy.
- Nocturne/Ollama worker — free local AI via self-hosted models.
- Haiku worker — direct Anthropic API, pennies per job.
- Web Audio sound system — relay click on job complete, chime on worker online (toggle).
- Worker dispatch and job routing.
- Real-time dashboard with WebSocket streaming.
- Fleet context injection.
- Context policies (frontend, infrastructure, ci-cd scopes).
- Multi-tenant RBAC.
- Secrets brokering.
- Signed audit trail.
- Worker profiles and resumes.
- Cost tracking per job.
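The signed audit trail mentioned in these notes is typically built on a keyed hash. The sketch below uses Python's standard `hmac` module with SHA-256 over a canonical JSON encoding; the entry fields, the choice of hash, and the function names are assumptions, not a description of ModelReins' actual format.

```python
import hashlib
import hmac
import json


def sign_entry(secret, entry):
    """Return a hex HMAC-SHA256 signature over a canonical JSON encoding."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_entry(secret, entry, signature):
    """Constant-time check that an entry still matches its signature."""
    return hmac.compare_digest(sign_entry(secret, entry), signature)
```

Sorting keys before signing keeps the signature stable however the entry dict was built, and any later change to the entry makes verification fail.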