
candela-local

candela-local is a lightweight binary that runs on a developer’s machine. It provides:

  • Unified model discovery — one endpoint for local and cloud models
  • Smart routing — automatically sends requests to the right backend
  • Runtime management — start/stop Ollama, pull models, manage state
  • Local observability — capture every LLM call to SQLite with zero cloud dependencies

Solo Mode

For: Individual developers who want to run local models with full observability and zero cloud dependencies.

```yaml
# ~/.candela.yaml — Solo Mode
port: 8181
lm_studio_port: 1234
runtime_backend: ollama
```

What you get:

  • Local models via Ollama/vLLM/LM Studio on :1234
  • Embedded observability — every call traced to ~/.candela/traces.db
  • Management UI at http://localhost:8181/_local/
  • Model pulling, health monitoring, backend discovery
  • No cloud account, no authentication, no remote server needed
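The embedded trace store can be pictured as a small SQLite schema. A minimal sketch of the idea — the table and column names below are assumptions for illustration, not the actual layout of traces.db, and an in-memory database stands in for the real file:

```python
import sqlite3

# Illustrative schema; candela-local would open ~/.candela/traces.db instead.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS traces (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        model TEXT NOT NULL,
        prompt_tokens INTEGER,
        completion_tokens INTEGER,
        duration_ms REAL,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

def record_call(model, prompt_tokens, completion_tokens, duration_ms):
    """Persist one LLM call; the proxy would do this on every request."""
    conn.execute(
        "INSERT INTO traces (model, prompt_tokens, completion_tokens, duration_ms)"
        " VALUES (?, ?, ?, ?)",
        (model, prompt_tokens, completion_tokens, duration_ms),
    )
    conn.commit()

record_call("llama3.2:3b", 120, 48, 312.5)
row = conn.execute(
    "SELECT model, prompt_tokens + completion_tokens FROM traces"
).fetchone()
print(row)  # ('llama3.2:3b', 168)
```

Keeping traces in a local SQLite file is what makes the zero-cloud-dependency observability possible: the Traces card in the UI can query the same database directly.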

Solo + Cloud

For: Individual developers who want local and cloud models (Gemini, Claude) without deploying a Candela server. Uses Google ADC — the same identity you already have.

```yaml
# ~/.candela.yaml — Solo + Cloud
runtime_backend: ollama
providers:
  - name: google
    models: [gemini-2.5-pro, gemini-2.0-flash]
  - name: anthropic
    models: [claude-sonnet-4-20250514, claude-3-haiku]
vertex_ai:
  project: my-gcp-project
  region: us-central1
```

Prerequisites:

```sh
gcloud auth application-default login
```

What you get: Everything from Solo Mode, plus cloud models merged into /v1/models, smart routing (local stays local, cloud routes to Vertex AI), and all calls traced to SQLite.
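The merge of local and cloud model lists into one /v1/models payload can be sketched as a pure function. This is an illustrative sketch, not the binary's actual implementation (candela-local is Go); the `merged_models` name and the payload shape are assumptions, with model names taken from the example config above.

```python
def merged_models(local, providers):
    """Combine locally served models with configured cloud models into a
    single OpenAI-style model list (payload shape is illustrative)."""
    entries = [{"id": m, "owned_by": "local"} for m in local]
    for p in providers:
        entries += [{"id": m, "owned_by": p["name"]} for m in p["models"]]
    return entries

models = merged_models(
    local=["llama3.2:3b"],
    providers=[
        {"name": "google", "models": ["gemini-2.5-pro", "gemini-2.0-flash"]},
        {"name": "anthropic", "models": ["claude-sonnet-4-20250514"]},
    ],
)
print([m["id"] for m in models])
# ['llama3.2:3b', 'gemini-2.5-pro', 'gemini-2.0-flash', 'claude-sonnet-4-20250514']
```

A client listing models through the proxy sees one flat list and never needs to know which backend serves each entry.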

Architecture:

```
JetBrains / Cline / curl
            │
            ▼
     LM Compat (:1234)
  /v1/models → local + cloud models
  /v1/chat/completions
            │
  ├── local model ──▶ Ollama / vLLM
  │           │
  │      spanCapture
  │           │
  └── cloud model ──▶ pkg/proxy ──▶ Vertex AI (Google ADC)
            │
     SpanProcessor → SQLite (traces.db)
```
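The spanCapture step in the diagram wraps every call before it reaches a backend. A rough Python analogue of the idea (candela-local itself is Go, and the decorator, field names, and in-memory sink below are all illustrative):

```python
import time
from functools import wraps

captured = []  # stands in for the SpanProcessor → SQLite sink

def span_capture(fn):
    """Record the model and call duration for every wrapped backend call."""
    @wraps(fn)
    def wrapper(model, *args, **kwargs):
        start = time.perf_counter()
        result = fn(model, *args, **kwargs)
        captured.append({
            "model": model,
            "duration_ms": (time.perf_counter() - start) * 1000,
        })
        return result
    return wrapper

@span_capture
def chat(model, prompt):
    # A real backend call (Ollama / vLLM / Vertex AI) would go here.
    return f"echo: {prompt}"

chat("llama3.2:3b", "hello")
print(captured[0]["model"])  # llama3.2:3b
```

Because capture sits between the listener and both branches, local and cloud calls flow through the same tracing path.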

Team Mode

For: Teams that need budgeting, governance, and RBAC via a shared Candela cloud backend.

```yaml
# ~/.candela.yaml — Team Mode
port: 8181
lm_studio_port: 1234
runtime_backend: ollama
remote: https://candela-xxx.a.run.app
audience: "12345678.apps.googleusercontent.com"
```

What you get: Everything from Solo Mode, plus cloud models routed through the Candela server with automatic OIDC auth injection via ADC, team-wide cost tracking, and budget enforcement.
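In Team Mode, requests to the Candela server carry an OIDC identity token for the configured audience. A minimal sketch of the header injection, with the ADC-backed token fetch stubbed out (`with_oidc_auth` and `fetch_token` are illustrative names, not the binary's API; the real client uses Google's auth libraries):

```python
def with_oidc_auth(headers, fetch_token, audience):
    """Return a copy of headers with an OIDC bearer token injected.
    fetch_token stands in for an ADC-backed identity-token fetch."""
    out = dict(headers)
    out["Authorization"] = f"Bearer {fetch_token(audience)}"
    return out

# Stub fetcher; in reality ADC mints a short-lived token for the IAP audience.
fake_fetch = lambda aud: f"token-for-{aud}"

headers = with_oidc_auth(
    {"Content-Type": "application/json"},
    fake_fetch,
    "12345678.apps.googleusercontent.com",
)
print(headers["Authorization"])
```

The client never handles credentials directly — the proxy resolves them from the same ADC identity used for the gcloud login above.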


Install:

```sh
go install github.com/candelahq/candela/cmd/candela-local@latest
```
Configuration reference:

```yaml
# ── Required ──
runtime_backend: ollama                  # ollama | vllm | lmstudio

# ── Optional: Network ──
port: 8181                               # main proxy port (default: 8181)
lm_studio_port: 1234                     # LM compat listener (default: 1234)

# ── Optional: Direct Cloud (Solo + Cloud) ──
providers:                               # omit for local-only solo mode
  - name: google
    models: [gemini-2.5-pro]
  - name: anthropic
    models: [claude-sonnet-4-20250514]
vertex_ai:
  project: my-gcp-project                # required when providers is set
  region: us-central1                    # default: us-central1

# ── Optional: Team Mode (omit for Solo) ──
remote: https://candela-xxx.run.app      # Candela server URL
audience: "12345678.apps..."             # IAP audience for OIDC auth

# ── Optional: Advanced ──
local_upstream: http://localhost:11434   # explicit local runtime URL
state_db_path: ~/.candela/state.db       # runtime state persistence
```
| Request model | Mode | Where it runs |
| --- | --- | --- |
| `llama3.2:3b` | Any | Local (Ollama) — always preferred |
| `gemini-2.5-pro` | Solo + Cloud | Vertex AI (direct, via ADC) |
| `claude-sonnet-4-20250514` | Solo + Cloud | Vertex AI Anthropic |
| `gpt-4o` | Team | Cloud (via Candela server) |
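The routing rules in the table above can be summarized as a small decision function. This is a sketch inferred from the table, not the binary's actual routing code; the `route` name and backend labels are illustrative, though the "model not found" error text mirrors the troubleshooting section below.

```python
def route(model, local_models, cloud_models, remote):
    """Pick a backend for a requested model, mirroring the routing table."""
    if model in local_models:
        return "local"           # local is always preferred
    if model in cloud_models:
        return "vertex-ai"       # Solo + Cloud: direct via ADC
    if remote:
        return "candela-server"  # Team Mode: unknown models go upstream
    raise ValueError(f"model not found locally and no remote server configured: {model}")

local = {"llama3.2:3b"}
cloud = {"gemini-2.5-pro", "claude-sonnet-4-20250514"}
print(route("llama3.2:3b", local, cloud, None))        # local
print(route("gemini-2.5-pro", local, cloud, None))     # vertex-ai
print(route("gpt-4o", local, cloud, "https://candela-xxx.a.run.app"))  # candela-server
```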

Access at http://localhost:8181/_local/:

| Card | Description |
| --- | --- |
| Health | Runtime status, start/stop controls, uptime |
| Models | Loaded models with size, family, quantization |
| Pull Model | Download new models with progress tracking |
| Traces | Recent LLM calls with tokens, cost, duration |
| Backends | Auto-detected runtimes with install hints |
| Settings | State DB path, reset |
In JetBrains IDEs:

  1. Settings → AI Assistant → Enable “LM Studio”
  2. URL is pre-configured to http://localhost:1234 — just works!
  3. Select any model from the dropdown (local + cloud)
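Any OpenAI-compatible client talks to the :1234 listener the same way. A stdlib-only sketch that builds (but deliberately does not send) a chat completion request; the `build_chat_request` helper is illustrative, while the URL path and JSON shape follow the standard OpenAI chat API:

```python
import json
import urllib.request

def build_chat_request(model, prompt):
    """Prepare an OpenAI-style chat completion request for the LM compat listener."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        "http://localhost:1234/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("llama3.2:3b", "hello")
print(req.full_url, req.get_method())
# http://localhost:1234/v1/chat/completions POST
```

Sending it with `urllib.request.urlopen(req)` works once candela-local is running; the model field alone decides whether the call lands on Ollama, Vertex AI, or the Candela server.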
| Symptom | Cause | Fix |
| --- | --- | --- |
| "model not found locally and no remote server configured" | Solo Mode + unknown model | Add `providers` for cloud models |
| "vertex_ai.project is required" | `providers` set but no project | Add `vertex_ai.project` to config |
| "failed to get Google ADC" | ADC not configured | Run `gcloud auth application-default login` |
| "audience is required when remote is set" | Missing `audience` | Add IAP audience to config |
| Traces card shows "Traces not available" | Team Mode | Expected — check cloud dashboard |
| No models in `/v1/models` | Runtime not started | Start Ollama: `ollama serve` |