
# Troubleshooting

**Symptom:** Workers were running fine, then suddenly all jobs fail with `401 Unauthorized` or `Authentication failed`.

**Cause:** You rotated your API key (at the provider or at the ModelReins coordinator), but not all workers picked up the new key.

**Fix:**

1. Update the key in your environment or config file:

   ```shell
   export MODELREINS_API_KEY=new-key-here
   ```

2. Restart all workers:

   ```shell
   modelreins worker restart --all
   ```

3. If using the Companion App, open the tray menu → Settings → update the API key and click Save. The worker restarts automatically.

4. If using the MCP channel, update the key in your `.mcp.json` or VS Code settings and restart the MCP client.

**Prevention:** Use a shared config file or secrets manager so all workers read the key from the same source. The `MODELREINS_CONFIG_URL` environment variable can point workers at a remote config endpoint.
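One concrete shape for the shared-source approach is a launch script that sources the key from a single env file, so a rotation only edits that file. This is an illustrative sketch, not ModelReins behavior; the file path and contents are stand-ins (the sketch creates a temp file so it runs as-is):

```shell
# Illustrative sketch: every worker sources its API key from one shared
# env file, so a key rotation only needs to touch that file.
# The path would normally be fixed (e.g. under /etc); a temp file stands in here.
SHARED_ENV="$(mktemp)"
echo 'MODELREINS_API_KEY=new-key-here' > "$SHARED_ENV"

set -a            # auto-export every variable the file defines
. "$SHARED_ENV"
set +a

echo "using key: $MODELREINS_API_KEY"
```

After a rotation, updating the one file and running `modelreins worker restart --all` brings every worker onto the new key.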


**Symptom:** `modelreins worker start --provider ollama` fails with `Could not connect to Ollama at http://localhost:11434`.

**Fix:**

1. Check if Ollama is running:

   ```shell
   curl http://localhost:11434/api/tags
   ```

   If this fails, start the Ollama service:

   ```shell
   # Linux
   sudo systemctl start ollama
   # macOS — open the Ollama app, or:
   ollama serve
   ```

2. If Ollama is on a different host or port, point the worker at it:

   ```shell
   export MODELREINS_OLLAMA_HOST=http://192.168.1.50:11434
   ```

3. If running in Docker, make sure the container can reach the host network:

   ```shell
   docker run --network host mediagato/modelreins-worker --provider ollama
   ```

   Note that `--network host` only works on Linux. On macOS and Windows, Docker Desktop runs containers inside a VM, so set `MODELREINS_OLLAMA_HOST=http://host.docker.internal:11434` instead.

**Symptom:** Jobs dispatched to LM Studio fail with `Model not found` or `No model loaded`.

**Fix:**

1. Open LM Studio and check the Local Server tab.

2. Make sure a model is selected and loaded in the model dropdown. The server can run without a model loaded, but it won't process requests.

3. Verify the server is serving the expected model:

   ```shell
   curl http://localhost:1234/v1/models
   ```

4. If the model name in the response doesn't match your ModelReins config, update the config:

   ```json
   {
     "lmstudio": {
       "model": "TheBloke/Llama-3.2-GGUF"
     }
   }
   ```

   Use the exact model name from the `/v1/models` response.
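To compare the config against the `/v1/models` response without eyeballing JSON, a quick grep is enough. The response below is a canned example of the OpenAI-compatible list shape (the second model name is made up); in practice you would capture it with `curl -s http://localhost:1234/v1/models`:

```shell
# Canned /v1/models-style response; replace with:
#   response=$(curl -s http://localhost:1234/v1/models)
response='{"object":"list","data":[{"id":"TheBloke/Llama-3.2-GGUF"},{"id":"qwen2.5-7b-instruct"}]}'
want='TheBloke/Llama-3.2-GGUF'   # the model name from your ModelReins config

if printf '%s' "$response" | grep -q "\"id\":[[:space:]]*\"$want\""; then
  echo "config matches a served model"
else
  echo "no match: update lmstudio.model in your config"
fi
```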


**Symptom:** A job shows `status: running` indefinitely. The worker processing it may have crashed or disconnected.

**Fix:**

1. Check which worker has the job:

   ```shell
   modelreins job info <job-id>
   ```

2. Check whether that worker is still alive:

   ```shell
   modelreins worker list
   ```

3. If the worker is dead, release the job back to the queue:

   ```shell
   modelreins job retry <job-id>
   ```

4. If this happens frequently, increase the job timeout and enable automatic reaping:

   ```json
   {
     "jobs": {
       "timeout_seconds": 300,
       "reap_stale_after_seconds": 600
     }
   }
   ```

The coordinator will automatically requeue jobs that have been running longer than `reap_stale_after_seconds` without a heartbeat.
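The reaping rule reduces to arithmetic on the last heartbeat. This sketch assumes the coordinator compares heartbeat age against `reap_stale_after_seconds` (the timestamps are made up; 600 mirrors the config above):

```shell
# Sketch of the staleness check: a running job whose last heartbeat is
# older than reap_stale_after_seconds gets requeued.
reap_after=600
now=$(date +%s)
last_heartbeat=$(( now - 700 ))   # example: last heartbeat was 700s ago

age=$(( now - last_heartbeat ))
if [ "$age" -gt "$reap_after" ]; then
  echo "stale after ${age}s: requeue"
else
  echo "healthy (${age}s since heartbeat)"
fi
```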


**Symptom:** Cloud provider jobs fail with `429 Too Many Requests` or `Rate limit exceeded`.

**Fix:**

1. **Immediate:** Reduce the concurrency on the affected worker:

   ```shell
   modelreins worker update <worker-id> --concurrency 1
   ```

2. **Short-term:** Enable built-in rate limit handling. ModelReins will automatically back off and retry:

   ```json
   {
     "providers": {
       "claude": {
         "rate_limit": {
           "max_concurrent": 3,
           "retry_after_seconds": 10,
           "max_retries": 5
         }
       }
     }
   }
   ```

3. **Long-term:** Spread load across providers using routing rules. Add OpenRouter as a fallback; it handles rate limiting across multiple upstream providers:

   ```json
   {
     "routing": {
       "strategy": "fallback",
       "chain": ["claude", "openrouter"]
     }
   }
   ```
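For intuition, here is the retry schedule a client might follow under those rate-limit settings. `retry_after_seconds` (10) and `max_retries` (5) come from the config above; the doubling per attempt is an assumption about the backoff shape, not documented ModelReins behavior:

```shell
# Hypothetical backoff schedule: start at retry_after_seconds and
# double the wait on each subsequent retry, up to max_retries attempts.
base=10
max_retries=5
attempt=1
while [ "$attempt" -le "$max_retries" ]; do
  delay=$(( base << (attempt - 1) ))   # 10, 20, 40, 80, 160
  echo "retry $attempt: wait ${delay}s"
  attempt=$(( attempt + 1 ))
done
```

With a cap of 5 retries, a job that still sees 429s after roughly five minutes of waiting fails permanently, which is when the routing fallback takes over.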

**Symptom:** `modelreins worker list` shows the worker as connected, but it never processes any jobs.

**Fix:**

1. Check that the worker's provider matches the jobs in the queue:

   ```shell
   modelreins job list --status pending
   modelreins worker info <worker-id>
   ```

   If jobs are queued for `claude` but the worker only supports `ollama`, it won't pick them up.

2. Check the routing rules. If the routing config specifies a worker name or tag that doesn't match any worker, jobs sit in the queue indefinitely:

   ```shell
   modelreins config show routing
   ```

3. Verify the worker has available capacity:

   ```shell
   modelreins worker info <worker-id> --verbose
   ```

   Look for `concurrency: 0` or `paused: true`.


**Symptom:** The dashboard URL returns a blank page or connection refused.

**Fix:**

1. Check that the coordinator is running:

   ```shell
   modelreins status
   ```

2. The dashboard is served by the coordinator on the same port (default 7420). Verify it responds:

   ```shell
   curl http://localhost:7420/health
   ```

3. If accessing remotely, check that firewall rules allow traffic on port 7420.

4. If using a reverse proxy, ensure WebSocket connections are proxied (the dashboard uses WebSockets for live updates):

   ```nginx
   location / {
       proxy_pass http://localhost:7420;
       proxy_http_version 1.1;
       proxy_set_header Upgrade $http_upgrade;
       proxy_set_header Connection "upgrade";
   }
   ```