Model Access Control
Model access control lets you define which LLMs are approved for use within your organization. Requests for unapproved models are rejected at the proxy layer with a 403 Forbidden response, and each rejection is written to the audit trail.
Why Model Access Control?
Not all models are appropriate for every workload:
- Compliance: Some models aren’t approved for handling sensitive data
- Cost: GPT-4o costs 10-30× more than GPT-4o-mini for similar tasks
- Consistency: Standardize on approved models to reduce operational complexity
- Vendor risk: Limit exposure to specific providers
How It Works
The model access policy is evaluated as a pre-flight check in the governance pipeline, after the request body is parsed but before the budget gate:

    Request: POST /proxy/openai/v1/chat/completions
    Body: { "model": "gpt-4o", ... }
             │
             ▼
    ┌─────────────────┐
    │  Model Policy   │──── "gpt-4o" in allowlist? ──▶ ✅ Continue
    │  Evaluation     │──── "gpt-4o" in blocklist? ──▶ ❌ HTTP 403
    └─────────────────┘

Configuration
Global Allowlist
Define a global model policy that applies to all tenants. The policy runs in one of two modes, allowlist or blocklist.
In allowlist mode, only these models are permitted. Everything else is blocked.

    governance:
      models:
        mode: allowlist
        allowed:
          - gpt-4o-mini
          - gpt-4o
          - gemini-3.5-pro
          - gemini-3.5-flash
          - gemini-2.5-pro
          - gemini-2.5-flash
          - claude-sonnet-4-20250514

In blocklist mode, all models are permitted except these. Use this mode when you want to block specific high-cost or unapproved models.

    governance:
      models:
        mode: blocklist
        blocked:
          - o1-pro
          - claude-opus-4-20250514

Per-Tenant Overrides
Tenants can have their own model policies that override the global policy:

    governance:
      models:
        mode: allowlist
        allowed:
          - gpt-4o-mini
          - gemini-2.5-flash

    tenants:
      research-team:
        models:
          # Research team gets access to expensive models
          allowed:
            - gpt-4o
            - claude-sonnet-4-20250514
            - gemini-3.5-pro
            - gemini-2.5-pro
      contractor-pool:
        models:
          # Contractors restricted to cost-efficient models only
          allowed:
            - gpt-4o-mini
            - gemini-3.5-flash
            - gemini-2.5-flash

Enforcement Behavior
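Enforcement begins by working out which policy applies to the requesting tenant. The Python sketch below is illustrative only and is not the proxy's actual implementation: `ModelPolicy`, `effective_policy`, and `check_model` are hypothetical names, and it assumes that a tenant's `allowed` list replaces the global list outright, which the configuration format does not spell out. The denial body mirrors the fields of the sample 403 response in this section.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class ModelPolicy:
    """Hypothetical in-memory form of a `governance.models` block."""
    mode: str               # "allowlist" or "blocklist"
    models: FrozenSet[str]  # the `allowed` or `blocked` entries

    def permits(self, model: str) -> bool:
        if self.mode == "allowlist":
            return model in self.models
        if self.mode == "blocklist":
            return model not in self.models
        raise ValueError(f"unknown policy mode: {self.mode!r}")


def effective_policy(global_policy: ModelPolicy,
                     tenant_allowed: Dict[str, List[str]],
                     tenant_id: str) -> ModelPolicy:
    # Assumption: a tenant's `allowed` list replaces the global list
    # (no merging) and is always evaluated as an allowlist.
    override = tenant_allowed.get(tenant_id)
    if override is None:
        return global_policy
    return ModelPolicy(mode="allowlist", models=frozenset(override))


def check_model(model: str, tenant_id: str, global_policy: ModelPolicy,
                tenant_allowed: Dict[str, List[str]]) -> Optional[dict]:
    """Return None if the request may continue, else a 403 response body."""
    policy = effective_policy(global_policy, tenant_allowed, tenant_id)
    if policy.permits(model):
        return None
    if policy.mode == "allowlist":
        reason = "is not in the approved model list"
    else:
        reason = "is on the blocked model list"  # hypothetical wording
    return {
        "error": "model_not_allowed",
        "message": f"Model '{model}' {reason} for tenant '{tenant_id}'",
        "model": model,
        "tenant_id": tenant_id,
        "policy": policy.mode,
    }


# Values taken from the per-tenant override example.
GLOBAL = ModelPolicy("allowlist", frozenset({"gpt-4o-mini", "gemini-2.5-flash"}))
TENANTS = {
    "research-team": ["gpt-4o", "claude-sonnet-4-20250514",
                      "gemini-3.5-pro", "gemini-2.5-pro"],
    "contractor-pool": ["gpt-4o-mini", "gemini-3.5-flash", "gemini-2.5-flash"],
}

if __name__ == "__main__":
    print(check_model("gpt-4o", "research-team", GLOBAL, TENANTS))    # permitted
    print(check_model("o1-pro", "contractor-pool", GLOBAL, TENANTS))  # denied
```

If tenant lists are meant to extend the global list rather than replace it, `effective_policy` would instead take the union of the two sets.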
When a request is blocked:
- HTTP Status: `403 Forbidden`
- Response Body: JSON with the denied model name and policy reason
- Audit Log: Full entry with request metadata, user, tenant, and denied model
- Metrics: `candela_policy_model_denied_total` counter incremented

    {
      "error": "model_not_allowed",
      "message": "Model 'o1-pro' is not in the approved model list for tenant 'contractor-pool'",
      "model": "o1-pro",
      "tenant_id": "contractor-pool",
      "policy": "allowlist"
    }

Audit Trail
Every model policy decision, whether allowed or denied, is recorded:
| Field | Example |
|---|---|
| `action` | `model_policy_check` |
| `result` | `denied` or `allowed` |
| `model` | `o1-pro` |
| `tenant_id` | `contractor-pool` |
| `user_id` | `alice@example.com` |
| `policy_mode` | `allowlist` |
| `timestamp` | `2026-05-17T00:00:00Z` |
This gives compliance teams a complete record of which models were used and which were blocked, by whom, and when.
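To illustrate the record shape in the table above, the following Python sketch builds one audit entry per policy decision. The class and function names are hypothetical, and serializing each record as a single JSON line is an assumption; only the field names and example values come from the table.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ModelPolicyAuditRecord:
    """Hypothetical record type; field names follow the audit table."""
    result: str       # "allowed" or "denied"
    model: str
    tenant_id: str
    user_id: str
    policy_mode: str  # "allowlist" or "blocklist"
    timestamp: str    # UTC, ISO 8601 with a trailing "Z"
    action: str = "model_policy_check"


def make_audit_record(allowed: bool, model: str, tenant_id: str, user_id: str,
                      policy_mode: str,
                      now: Optional[datetime] = None) -> ModelPolicyAuditRecord:
    # Allowed decisions are recorded as well as denials.
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return ModelPolicyAuditRecord(
        result="allowed" if allowed else "denied",
        model=model,
        tenant_id=tenant_id,
        user_id=user_id,
        policy_mode=policy_mode,
        timestamp=moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
    )


def to_json_line(record: ModelPolicyAuditRecord) -> str:
    # Assumption: one JSON object per line; the actual storage format
    # of the audit trail is not specified in this section.
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    record = make_audit_record(False, "o1-pro", "contractor-pool",
                               "alice@example.com", "allowlist")
    print(to_json_line(record))
```

Passing an explicit `now` keeps the timestamp deterministic, which is convenient when testing audit output.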