Budget Enforcement
Budget enforcement is Candela’s most mature governance capability. Every LLM request passes through a real-time budget gate that checks, deducts, and alerts — ensuring no user or automation exceeds their approved spending limits. This is not a dashboard metric; it’s an active enforcement control that blocks requests at the proxy layer.
How It Works
Every LLM request passes through a two-phase budget check:
```
Request arrives
       │
       ▼
┌─────────────┐   Budget exhausted?
│ Pre-flight  │────────────────────────▶ HTTP 402
│ Budget Gate │                          "budget exhausted"
└──────┬──────┘
       │ ✅ Allowed
       ▼
┌─────────────┐
│  LLM Call   │   tokens counted,
│  (proxied)  │   cost calculated
└──────┬──────┘
       │
       ▼
┌─────────────┐   Threshold crossed?
│  Deduct &   │────────────────────────▶ 🔔 Alert
│   Notify    │                          (80%, 90%, 100%)
└─────────────┘
```

Key design decisions:
- Pre-flight check is soft — it checks if any budget remains but doesn’t estimate cost (impossible before the call)
- Deduction is synchronous — prevents billing bypass on crash
- Grants are spent first (waterfall: earliest-expiring grant → daily budget)
- Service accounts skip budget checks — they have no budget entries
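The two phases and the design decisions above can be sketched in Python. This is a minimal illustration under stated assumptions, not Candela's implementation: `Budget`, `BudgetExhausted`, `handle_request`, and `call_llm` are hypothetical names, and amounts are integer cents to avoid floating-point drift.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


class BudgetExhausted(Exception):
    """Hypothetical error that a proxy would map to HTTP 402."""


@dataclass
class Budget:
    limit_cents: int
    spent_cents: int = 0

    @property
    def remaining_cents(self) -> int:
        return max(self.limit_cents - self.spent_cents, 0)


def handle_request(budget: Optional[Budget],
                   call_llm: Callable[[], Tuple[str, int]]) -> str:
    """Run one proxied call through a two-phase budget check.

    `budget` is None for service accounts, which have no budget entry.
    `call_llm` returns (response_text, cost_cents); cost is known only afterwards.
    """
    # Phase 1: soft pre-flight check. It only asks whether anything is left.
    if budget is not None and budget.remaining_cents <= 0:
        raise BudgetExhausted("budget exhausted")

    response, cost_cents = call_llm()

    # Phase 2: deduct synchronously, before the response is returned.
    if budget is not None:
        budget.spent_cents += cost_cents
    return response
```

In this sketch the last permitted call can push spend past the limit, because the pre-flight check cannot know the cost in advance; the next request is then rejected.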
Setting Up Budgets
Server Configuration
Set a default daily budget for all new users in your server config:
```yaml
# config.yaml (candela-server)
users:
  default_daily_budget_usd: 5.00  # applied to auto-provisioned users
```

Admin API
Admins can manage budgets via the ConnectRPC UserService:
```bash
# Set a $10/day budget for a user
buf curl --protocol connect \
  https://candela.example.com/candela.v1.UserService/SetBudget \
  -d '{
    "user_id": "alice@example.com",
    "limit_usd": 10.0
  }'

# View current budget status
buf curl --protocol connect \
  https://candela.example.com/candela.v1.UserService/GetBudget \
  -d '{"user_id": "alice@example.com"}'
```

Response:
```json
{
  "budget": {
    "userId": "alice@example.com",
    "limitUsd": 10.0,
    "spentUsd": 3.47,
    "tokensUsed": 14200,
    "periodType": "BUDGET_PERIOD_DAILY",
    "periodKey": "2026-05-04"
  }
}
```

```bash
# Emergency: reset a user's daily spend to $0
buf curl --protocol connect \
  https://candela.example.com/candela.v1.UserService/ResetSpend \
  -d '{"user_id": "alice@example.com"}'
```

Self-Service API
Developers can check their own budget:
```bash
# Authenticated as the current user
buf curl --protocol connect \
  https://candela.example.com/candela.v1.UserService/GetMyBudget
```

Returns remaining budget across grants + daily allocation.
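Where `buf` is not installed, the same unary call can be made as a plain HTTP POST with a JSON body, which is how the Connect protocol carries unary requests. The sketch below only builds the request using the Python standard library. The helper name `connect_request` is hypothetical, and the bearer-token header is an assumption, since the authentication scheme is not described here.

```python
import json
import urllib.request
from typing import Optional


def connect_request(base_url: str, method: str, payload: dict,
                    token: Optional[str] = None) -> urllib.request.Request:
    """Build a Connect-protocol unary request with a JSON body."""
    headers = {
        "Content-Type": "application/json",
        "Connect-Protocol-Version": "1",
    }
    if token is not None:
        # Assumption: bearer-token auth; the real scheme may differ.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/candela.v1.UserService/{method}",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# An empty JSON object stands in for the empty request message.
req = connect_request("https://candela.example.com", "GetMyBudget", {})
```

Passing `req` to `urllib.request.urlopen` would send it; the response body is JSON.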
Grants
Grants are one-time budget bonuses with expiration dates. They’re consumed before the daily budget (waterfall order: earliest-expiring grant first).
```bash
buf curl --protocol connect \
  https://candela.example.com/candela.v1.UserService/CreateGrant \
  -d '{
    "user_id": "alice@example.com",
    "amount_usd": 50.0,
    "reason": "hackathon sprint",
    "starts_at": "2026-05-04T00:00:00Z",
    "expires_at": "2026-05-11T00:00:00Z"
  }'

buf curl --protocol connect \
  https://candela.example.com/candela.v1.UserService/ListGrants \
  -d '{"user_id": "alice@example.com", "active_only": true}'

buf curl --protocol connect \
  https://candela.example.com/candela.v1.UserService/RevokeGrant \
  -d '{
    "user_id": "alice@example.com",
    "grant_id": "abc123-..."
  }'
```

Deduction Waterfall
When a $0.50 LLM call completes:
- Check active grants (sorted by `expires_at` ascending)
- Deduct from the earliest-expiring grant until the cost is covered or the grant is exhausted
- Remaining cost → daily budget
- All updates are transactional
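The steps above can be sketched as a small Python function, using the figures from this section's $0.50 example. The `Grant` class and `deduct_waterfall` function are hypothetical names, amounts are integer cents, and the year in the dates is illustrative. The sketch mutates in-memory objects and does not model the transaction.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class Grant:
    name: str
    remaining_cents: int
    expires_at: date


def deduct_waterfall(cost_cents: int,
                     grants: List[Grant]) -> Tuple[List[Tuple[str, int]], int]:
    """Spend grants earliest-expiring first.

    Returns (per-grant deductions, cents left over for the daily budget).
    """
    deductions = []
    left = cost_cents
    for grant in sorted(grants, key=lambda g: g.expires_at):
        if left == 0:
            break
        take = min(left, grant.remaining_cents)
        if take > 0:
            grant.remaining_cents -= take
            deductions.append((grant.name, take))
            left -= take
    return deductions, left


grants = [
    Grant("B", 200, date(2026, 5, 10)),  # $2.00 remaining
    Grant("A", 30, date(2026, 5, 5)),    # $0.30 remaining
]
deductions, daily_cents = deduct_waterfall(50, grants)
# deductions == [("A", 30), ("B", 20)]; daily_cents == 0
```

A real implementation would wrap the grant and daily-budget updates in a single transaction so that a failure cannot leave a call partially charged.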
```
Cost: $0.50
├── Grant A ($0.30 remaining, expires May 5)  → deduct $0.30
├── Grant B ($2.00 remaining, expires May 10) → deduct $0.20
└── Daily budget                              → $0.00 (grants covered it)
```

Budget Alerts
Alerts fire when a user’s daily spend crosses configurable thresholds. Defaults:
| Threshold | Severity | Meaning |
|---|---|---|
| 80% | Warning | Approaching limit |
| 90% | Critical | Nearly exhausted |
| 100% | Blocked | Budget fully spent |
Notification Channels
| Channel | Status | How |
|---|---|---|
| Structured logs | ✅ Built-in | Cloud Logging → alert policy |
| Slack | 🔜 Planned | Webhook integration |
| Microsoft Teams | 🔜 Planned | Webhook integration |
Alerts are deduplicated — each threshold fires at most once per period per user.
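Threshold crossing with deduplication can be sketched as a set of already-fired `(user, period, threshold)` keys. This is an illustrative in-memory version; the `AlertTracker` class is a hypothetical name, and a real deployment would need to persist the fired set.

```python
from typing import List, Set, Tuple

THRESHOLDS = (80, 90, 100)  # default percentages


class AlertTracker:
    def __init__(self) -> None:
        # Keys that have already produced an alert.
        self._fired: Set[Tuple[str, str, int]] = set()

    def check(self, user_id: str, period_key: str,
              spent_cents: int, limit_cents: int) -> List[int]:
        """Return the thresholds newly crossed for this user and period."""
        if limit_cents <= 0:
            return []
        newly_crossed = []
        for pct in THRESHOLDS:
            key = (user_id, period_key, pct)
            # Integer form of spent / limit >= pct / 100, avoiding float rounding.
            crossed = spent_cents * 100 >= limit_cents * pct
            if crossed and key not in self._fired:
                self._fired.add(key)
                newly_crossed.append(pct)
        return newly_crossed
```

Because the period key is part of the dedup key, the same threshold can fire again once a new period starts.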
Cloud Logging Alert Policy
The log-based notifier emits structured warnings that can trigger GCP alert policies:
```
jsonPayload.message = "🔔 budget alert: 90% threshold reached"
```

Rate Limiting
Per-user rate limiting prevents runaway automation from draining budgets:
| Setting | Default | Scope |
|---|---|---|
| `rate_limit` | 60 req/min | Per user |
Rate limits use minute-window counters with a 2-minute TTL.
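A minute-window counter with a TTL can be sketched as follows. This is an in-memory illustration with hypothetical names (`MinuteWindowLimiter`, `allow`); the actual counter store is not described here. Keeping the TTL longer than the window presumably ensures a counter never expires while its minute is still active.

```python
import time
from typing import Callable, Dict, Tuple

WINDOW_SECONDS = 60
TTL_SECONDS = 120


class MinuteWindowLimiter:
    def __init__(self, limit_per_minute: int = 60,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit_per_minute
        self.clock = clock
        # (user_id, minute index) -> (count, expiry timestamp)
        self._counters: Dict[Tuple[str, int], Tuple[int, float]] = {}

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        # Drop expired counters, as a TTL-capable store would do on its own.
        self._counters = {k: v for k, v in self._counters.items() if v[1] > now}
        key = (user_id, int(now // WINDOW_SECONDS))
        count, expiry = self._counters.get(key, (0, now + TTL_SECONDS))
        if count >= self.limit:
            return False
        self._counters[key] = (count + 1, expiry)
        return True
```

Fixed windows are simple and cheap, at the cost of allowing a short burst across a minute boundary.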
Configuring Per-User Limits
```bash
buf curl --protocol connect \
  https://candela.example.com/candela.v1.UserService/UpdateUser \
  -d '{
    "id": "alice@example.com",
    "rate_limit": 120
  }'
```

Audit Trail
Every admin action is logged to an immutable audit collection:
| Action | Logged |
|---|---|
| `create_user` | ✅ |
| `set_budget` | ✅ |
| `reset_spend` | ✅ |
| `create_grant` | ✅ |
| `revoke_grant` | ✅ |
| `deactivate_user` | ✅ |
| `reactivate_user` | ✅ |
| `delete_user` | ✅ (global collection; survives deletion) |
```bash
buf curl --protocol connect \
  https://candela.example.com/candela.v1.UserService/ListAuditLog \
  -d '{"user_id": "alice@example.com", "limit": 20}'
```