Model Configuration Guidance

Overview

The combined (orchestrator) route in AI Optimizer performs classification, sub-session execution, and optional synthesis, all using the configured primary chat model (ll_model). This keeps the default configuration simple: a single model handles every stage. Developers building on this application should understand the implications and know where to customize if needed.

How It Works

When a user query enters the combined route, it passes through up to three LLM call stages:

  1. Classification: The primary model determines whether the query should be handled by nl2sql, vecsearch, or both. This is a lightweight call with max_tokens=10 and temperature=0.0, designed to return a single word.

  2. Execution โ€” Based on the classification result, the query is delegated to the appropriate sub-session. When the classification is both, the NL2SQL and VecSearch sub-sessions run in parallel.

  3. Synthesis โ€” When both routes are used, a final LLM call merges the two answers into a single coherent response. Synthesis is skipped only if the optional grade node explicitly marks the VecSearch result as not relevant, in which case the NL2SQL answer is returned alone.
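The three stages above can be sketched as follows. This is a minimal illustration, not the application's actual code: `FakeLLM`, `run_nl2sql`, `run_vecsearch`, and the prompt strings are all hypothetical stand-ins, and only the routing values (`nl2sql`, `vecsearch`, `both`) and call parameters (max_tokens=10, temperature=0.0) come from the description above.

```python
import asyncio

VALID_ROUTES = {"nl2sql", "vecsearch", "both"}

class FakeLLM:
    """Hypothetical stand-in for the primary chat model; returns a canned reply."""
    def __init__(self, reply: str):
        self.reply = reply

    async def complete(self, prompt: str, max_tokens: int, temperature: float) -> str:
        return self.reply

async def run_nl2sql(query: str) -> str:
    # Stand-in for the NL2SQL sub-session.
    return f"SQL answer for: {query}"

async def run_vecsearch(query: str) -> str:
    # Stand-in for the VecSearch sub-session.
    return f"Vector answer for: {query}"

async def classify(llm, query: str) -> str:
    # Stage 1: lightweight routing call (max_tokens=10, temperature=0.0)
    # expected to return a single word; anything unrecognized falls back to "both".
    raw = await llm.complete(f"Route this query: {query}", max_tokens=10, temperature=0.0)
    route = raw.strip().lower()
    return route if route in VALID_ROUTES else "both"

async def combined_route(llm, query: str) -> str:
    route = await classify(llm, query)
    # Stage 2: delegate to the appropriate sub-session.
    if route == "nl2sql":
        return await run_nl2sql(query)
    if route == "vecsearch":
        return await run_vecsearch(query)
    # "both": run the two sub-sessions in parallel...
    sql_answer, vec_answer = await asyncio.gather(
        run_nl2sql(query), run_vecsearch(query)
    )
    # Stage 3: ...then merge the answers with a final synthesis call.
    return await llm.complete(
        f"Merge these answers:\n{sql_answer}\n{vec_answer}",
        max_tokens=512, temperature=0.0,
    )
```

Note that in this sketch the same `llm` object serves every stage, mirroring the default wiring where ll_model handles classification, execution, and synthesis alike.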

Considerations for Developers

Latency and Cost

Classification and synthesis add LLM round-trips that use the primary model. For high-throughput or cost-sensitive workloads, introducing a smaller, dedicated classifier model can reduce both latency and per-request cost for these lightweight calls.
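A back-of-envelope estimate can make the overhead concrete. Every number below (token counts and per-token prices) is an illustrative assumption, not a measurement from the application:

```python
def added_cost(prompt_tokens: int, output_tokens: int,
               price_in: float, price_out: float) -> float:
    """Dollar cost of one extra LLM round-trip (all inputs are assumptions)."""
    return prompt_tokens * price_in + output_tokens * price_out

# Classification: short prompt, at most 10 output tokens.
classification = added_cost(200, 10, price_in=1e-6, price_out=2e-6)

# Synthesis: both sub-answers in the prompt, a full merged answer out.
synthesis = added_cost(1500, 400, price_in=1e-6, price_out=2e-6)

per_request_overhead = classification + synthesis
```

Under these assumed prices the classification call is roughly an order of magnitude cheaper than synthesis, which is why pointing just the classification stage at a smaller model is often the first optimization worth trying.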

Model Suitability

Some models may not perform optimally on very short, constrained classification prompts. If classification frequently falls back to both (visible in the application logs as a warning), consider evaluating a model better suited for routing tasks.

Customization Point

The classifier and synthesis calls use the same model as the primary ll_model; they are bound to it in the runtime wiring. Using a separate, lighter model for these lightweight calls is possible (the combined session accepts a distinct classifier_model parameter), but it requires a code change to pass a different value; there is no setting or UI field for it today. Substituting a smaller model here would not affect the models used by the NL2SQL or VecSearch sub-sessions.

Fallback Behavior

If classification fails or returns an unexpected value, the system defaults to running both routes. This ensures a response is always returned, even when the classifier encounters an error or produces an unrecognized output.

Synthesis has its own fallback: if the final merge call fails, the system returns the NL2SQL and VecSearch answers concatenated under labelled headings rather than surfacing the error to the user.