Model Configuration Guidance
Overview
The combined (orchestrator) route in AI Optimizer performs classification, sub-session execution, and optional synthesis, all using the configured primary chat model (`ll_model`). This keeps the default configuration simple: a single model handles every stage. Developers building on this application should understand the implications and know where to customize if needed.
How It Works
When a user query enters the combined route, it passes through up to three LLM call stages:
1. Classification: The primary model determines whether the query should be handled by `nl2sql`, `vecsearch`, or `both`. This is a lightweight call with `max_tokens=10` and `temperature=0.0`, designed to return a single word.
2. Execution: Based on the classification result, the query is delegated to the appropriate sub-session. When the classification is `both`, the NL2SQL and VecSearch sub-sessions run in parallel.
3. Synthesis: When both routes are used, a final LLM call merges the two answers into a single coherent response. Synthesis is skipped only if the optional grade node explicitly marks the VecSearch result as not relevant, in which case the NL2SQL answer is returned alone.
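The three stages above can be sketched as follows. This is a minimal illustration, not the actual AI Optimizer implementation: the function names, placeholder sub-sessions, and stub LLM calls are all assumptions introduced for clarity.

```python
import asyncio

# Hypothetical stand-ins for the real sub-sessions and LLM calls.
async def run_nl2sql(query: str) -> str:
    return f"[nl2sql answer for: {query}]"      # placeholder sub-session

async def run_vecsearch(query: str) -> str:
    return f"[vecsearch answer for: {query}]"   # placeholder sub-session

async def classify(query: str) -> str:
    # Stage 1: single-word routing decision (nl2sql / vecsearch / both).
    return "both"                               # placeholder LLM call

async def synthesize(sql_answer: str, vec_answer: str) -> str:
    # Stage 3: merge the two answers into one coherent response.
    return sql_answer + "\n" + vec_answer       # placeholder LLM call

async def combined_route(query: str) -> str:
    route = await classify(query)
    if route == "nl2sql":
        return await run_nl2sql(query)
    if route == "vecsearch":
        return await run_vecsearch(query)
    # Stage 2 for "both": run the two sub-sessions in parallel.
    sql_answer, vec_answer = await asyncio.gather(
        run_nl2sql(query), run_vecsearch(query)
    )
    return await synthesize(sql_answer, vec_answer)
```

Running the two sub-sessions with `asyncio.gather` means the `both` path costs roughly one sub-session's latency plus the classification and synthesis round-trips, rather than the sum of all four calls in sequence.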
Considerations for Developers
Latency and Cost
Classification and synthesis add LLM round-trips that use the primary model. For high-throughput or cost-sensitive workloads, introducing a smaller, dedicated classifier model can reduce both latency and per-request cost for these lightweight calls.
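To make the cost of the classification call concrete, a request to a dedicated classifier might look like the sketch below. The prompt wording and model name are assumptions; the `max_tokens=10` and `temperature=0.0` settings match the route's documented configuration.

```python
# Hypothetical parameters for a dedicated classification call.
CLASSIFIER_PARAMS = {
    "model": "small-routing-model",  # assumed name for a lighter model
    "max_tokens": 10,                # a single-word answer is expected
    "temperature": 0.0,              # deterministic routing
}

def build_classifier_request(query: str) -> dict:
    # Assemble the lightweight request sent for routing only;
    # the expensive primary model is reserved for answering.
    prompt = (
        "Classify the query as exactly one of: nl2sql, vecsearch, both.\n"
        f"Query: {query}\nAnswer:"
    )
    return {**CLASSIFIER_PARAMS, "prompt": prompt}
```

Because the output is capped at ten tokens, the dominant cost of this call is the prompt itself, which is why a smaller model is usually sufficient here.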
Model Suitability
Some models may not perform optimally on very short, constrained classification prompts. If classification frequently falls back to `both` (visible in the application logs as a warning), consider evaluating a model better suited for routing tasks.
Customization Point
The classifier and synthesis calls use the same model as the primary `ll_model`; they are bound to it in the runtime wiring. Using a separate, lighter model for these lightweight calls is possible (the combined session accepts a distinct `classifier_model` parameter), but it requires a code change to pass a different value; there is no setting or UI field for it today. Substituting a smaller model here would not affect the models used by the NL2SQL or VecSearch sub-sessions.
Fallback Behavior
If classification fails or returns an unexpected value, the system defaults to running both routes. This ensures a response is always returned, even when the classifier encounters an error or produces an unrecognized output.
Synthesis has its own fallback: if the final merge call fails, the system returns the NL2SQL and VecSearch answers concatenated under labelled headings rather than surfacing the error to the user.
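The synthesis fallback can be sketched like this; the function name and the heading text are hypothetical, but the structure, trying the merge call and falling back to labelled concatenation, follows the behavior described above.

```python
def synthesize_or_concat(nl2sql_answer: str, vecsearch_answer: str, merge_call) -> str:
    # Attempt the final LLM merge; if it raises, return both answers
    # under labelled headings instead of surfacing the error.
    try:
        return merge_call(nl2sql_answer, vecsearch_answer)
    except Exception:
        return (
            "Database Results:\n" + nl2sql_answer + "\n\n"
            "Document Search Results:\n" + vecsearch_answer
        )
```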