LLM Routing

Routes each pipeline stage to an LLM provider. Every call goes through LLMClient.complete(stage="…"), and the stage name selects the route.
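A minimal sketch of how a stage-keyed call could be dispatched. Only the `LLMClient.complete(stage=…)` name comes from this page; the constructor, the `prompt` parameter, the `provider_a` name, and the callable-provider shape are assumptions made for illustration, not the actual client API.

```python
from typing import Callable, Dict

# Hypothetical provider shape: a callable that takes a prompt and returns text.
Provider = Callable[[str], str]


class LLMClient:
    """Illustrative stand-in, not the real client."""

    def __init__(self, routes: Dict[str, str], providers: Dict[str, Provider]):
        self._routes = routes        # stage name -> provider name
        self._providers = providers  # provider name -> callable

    def complete(self, stage: str, prompt: str) -> str:
        # The stage name alone decides which provider handles the call.
        if stage not in self._routes:
            raise KeyError(f"no route configured for stage {stage!r}")
        return self._providers[self._routes[stage]](prompt)


# "provider_a" is a placeholder, not a provider configured on this page.
client = LLMClient(
    routes={"kq_classifier": "provider_a"},
    providers={"provider_a": lambda p: f"[provider_a] {p}"},
)
reply = client.complete(stage="kq_classifier", prompt="Is this a knowledge question?")
```

Keying on the stage rather than on a model name means callers do not change when a route is edited here.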

Failover policy

Each stage has a primary and a fallback provider. If the primary returns an error, times out, or is rate limited, the runtime automatically switches to the fallback. Failovers are recorded in TurnTrace and surface as "provider degradation" anomalies in Insights.
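The policy above can be sketched roughly as follows. This is an illustration under stated assumptions, not the runtime's implementation: `ProviderError`, `Route`, the `TurnTrace` fields, and the provider names are all hypothetical, and the real runtime may retry, back off, or record failovers differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Provider = Callable[[str], str]


class ProviderError(Exception):
    """Hypothetical stand-in for a provider error or rate-limit response."""


@dataclass
class Route:
    primary: str
    fallback: str


@dataclass
class TurnTrace:
    # Assumed shape: one dict per failover event.
    failovers: List[dict] = field(default_factory=list)


def complete_with_failover(stage: str, prompt: str, routes: Dict[str, Route],
                           providers: Dict[str, Provider], trace: TurnTrace) -> str:
    route = routes[stage]
    try:
        return providers[route.primary](prompt)
    except (ProviderError, TimeoutError) as exc:
        # Record the failover so it can be reported later, then use the fallback.
        trace.failovers.append({
            "stage": stage,
            "from": route.primary,
            "to": route.fallback,
            "reason": type(exc).__name__,
        })
        return providers[route.fallback](prompt)


def rate_limited(prompt: str) -> str:
    # Simulates a primary provider that is currently rate limited.
    raise ProviderError("rate limited")


routes = {"handback_summary": Route(primary="provider_a", fallback="provider_b")}
providers = {"provider_a": rate_limited,
             "provider_b": lambda p: "summary from provider_b"}
trace = TurnTrace()
result = complete_with_failover("handback_summary", "Summarize the chat.",
                                routes, providers, trace)
```

In this sketch an error raised by the fallback itself propagates to the caller; whether the real runtime does the same is not stated on this page.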

Per-stage routing

Stage	Primary	Fallback	Rationale
`drafting` Final grounded answer to visitor			Strong multilingual support for 10 languages; 2M-token context; about half the cost of GPT-4.1
`kq_classifier` Knowledge-question binary classifier			Trivial task; sub-200 ms latency; cheap
`intent_rerank` Intent reranker post-embedding			Same as kq_classifier: trivial task; sub-200 ms latency; cheap
`triage_cluster` Failed-query cluster naming			Batched; cost-sensitive
`triage_fix_suggest` Quick-fix suggestion in triage queue			Quality matters here
`judge` Sampled LLM-as-judge audit			Independent model: avoid Gemini judging Gemini
`agent_suggested_reply` Agent suggested replies			Quality and voice matter
`handback_summary` Bot context summary after agent hand-back			Short summary; cheap
`realtime_voice` Realtime voice generation			OpenAI Realtime is the best realtime option today
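The `judge` rationale (do not let a model family grade its own output) could be enforced with a small configuration check. The sketch below is hypothetical: the `vendor/model` identifier format, the placeholder names, and the `(primary, fallback)` tuple layout are assumptions, since this page does not show how routes are stored.

```python
from typing import Dict, Tuple

# Assumed layout: stage name -> (primary, fallback), with "vendor/model" identifiers.
Routes = Dict[str, Tuple[str, str]]


def vendor(model_id: str) -> str:
    # Everything before the first "/" is treated as the vendor.
    return model_id.split("/", 1)[0]


def judge_is_independent(routes: Routes) -> bool:
    # True when the judge's primary comes from a different vendor
    # than the drafting stage's primary.
    drafting_primary, _ = routes["drafting"]
    judge_primary, _ = routes["judge"]
    return vendor(judge_primary) != vendor(drafting_primary)


# Placeholder identifiers, not the models configured on this page.
ok_routes = {
    "drafting": ("vendor_a/model-x", "vendor_b/model-y"),
    "judge": ("vendor_b/model-z", "vendor_a/model-x"),
}
bad_routes = {
    "drafting": ("vendor_a/model-x", "vendor_b/model-y"),
    "judge": ("vendor_a/model-w", "vendor_b/model-z"),
}
independent = judge_is_independent(ok_routes)
self_judging = not judge_is_independent(bad_routes)
```

This check looks only at primaries. After a failover the judge could still end up on the drafting vendor, so a stricter variant would compare fallbacks as well.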

Configured providers

Google Gemini

connected

GEMINI_API_KEY · last used 12s ago

OpenAI

connected

OPENAI_API_KEY · last used 2 min ago