How to Specify the Generation Configuration when Using LLMs#
Generation parameters, such as temperature, top-p, and the maximum number of output tokens, are useful for achieving the desired performance with Large Language Models (LLMs).
In Agent Spec, these parameters can be configured in the default_generation_parameters attribute of the LlmConfig.
This guide will show you how to:
Configure the generation parameters for an agent.
Configure the generation parameters for a flow.
Note
For a deeper understanding of the impact of each generation parameter, refer to the resources at the bottom of this page.
Configure the generation parameters for an agent#
Customizing the generation configuration for an agent means specifying the generation configuration of the LLM used by the agent.
In pyagentspec, the generation configuration can be specified when creating an LlmConfig using the LlmGenerationConfig class.
This ensures that all outputs generated by the agent use the same generation configuration.
The LlmGenerationConfig is transformed into a dictionary during serialization. The class defines the following attributes, covering some of the most common and widely supported generation parameters, but it also allows arbitrary entries (see the sketch after the example below).
max_tokens: controls the maximum number of tokens to generate, ignoring the number of tokens in the prompt;
temperature: controls the randomness of the output;
top_p: controls the randomness of the output by restricting sampling to the smallest set of tokens whose cumulative probability exceeds top_p (nucleus sampling).
from pyagentspec.llms import LlmGenerationConfig
generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)
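The common attributes above are not the only ones you can set. Below is a minimal sketch of adding a provider-specific entry, assuming additional entries can be passed as extra keyword arguments; the frequency_penalty name is purely illustrative, and depending on your pyagentspec version extra entries may need to go through a dedicated field instead.
generation_config_with_extras = LlmGenerationConfig(
    max_tokens=512,
    temperature=1.0,
    top_p=1.0,
    frequency_penalty=0.2,  # hypothetical extra entry, serialized alongside the common parameters
)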
Agent Spec supports LLM configurations for several providers, and you can pass the generation_config to each of them.
from pyagentspec.llms import VllmConfig
llm_config = VllmConfig(
    name="vllm-llama-4-maverick",
    model_id="llama-4-maverick",
    url="http://url.to.my.vllm.server/llama4mav",
    default_generation_parameters=generation_config,
)
Important
API keys should not be stored anywhere in the Agent Spec representation of an agent.
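As a minimal illustration of this principle, credentials can be resolved from the runtime environment when the agent is executed instead of being written into the configuration; the variable name below is hypothetical, and how the key is handed to your execution runtime depends on the runtime you use.
import os
api_key = os.environ.get("MY_LLM_API_KEY")  # hypothetical variable name; the key never appears in the Agent Spec representation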
Now, you can build an agent using the LLM as follows:
from pyagentspec import Agent
agent = Agent(
    name="my_first_agent",
    llm_config=llm_config,
    system_prompt="You are a helpful assistant.",
)
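To check that the generation parameters travel with the agent, you can serialize it and inspect the result. This is a hedged sketch: the AgentSpecSerializer class and its to_json method are assumed names, so check the serialization utilities shipped with your pyagentspec version.
from pyagentspec.serialization import AgentSpecSerializer  # assumed import path
serialized_agent = AgentSpecSerializer().to_json(agent)
# The LlmGenerationConfig shows up in the output as a dictionary of generation parameters.
print(serialized_agent)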
Configure the generation parameters for a flow#
Customizing the generation configuration for a flow means specifying the generation configuration of the LLM used by a node.
Refer to the previous section to learn how to configure the generation parameters when initializing an LLM using the LlmGenerationConfig class.
For example, you can then create a one-step flow using the LlmNode with custom generation parameters as follows.
from pyagentspec.flows.edges import ControlFlowEdge, DataFlowEdge
from pyagentspec.flows.flow import Flow
from pyagentspec.flows.nodes import LlmNode, StartNode, EndNode
from pyagentspec.property import StringProperty
start_node = StartNode(name="start", inputs=[StringProperty(title="user_question")])
end_node = EndNode(name="end", outputs=[StringProperty(title="llm_output")])
llm_node = LlmNode(
    name="llm_node",
    prompt_template="{{user_question}}",
    llm_config=llm_config,
    outputs=[StringProperty(title="llm_output")],
)
flow = Flow(
    name="flow",
    start_node=start_node,
    nodes=[start_node, end_node, llm_node],
    control_flow_connections=[
        ControlFlowEdge(name="cfe1", from_node=start_node, to_node=llm_node),
        ControlFlowEdge(name="cfe2", from_node=llm_node, to_node=end_node),
    ],
    data_flow_connections=[
        DataFlowEdge(
            name="dfe1",
            source_node=start_node,
            source_output="user_question",
            destination_node=llm_node,
            destination_input="user_question",
        ),
        DataFlowEdge(
            name="dfe2",
            source_node=llm_node,
            source_output="llm_output",
            destination_node=end_node,
            destination_input="llm_output",
        ),
    ],
)
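If different nodes in a flow should generate differently, each node can be given its own LLM configuration. The following sketch reuses only the classes shown above; the node, parameter values, and names are illustrative.
# A second, more deterministic configuration for a hypothetical extra node.
concise_generation_config = LlmGenerationConfig(max_tokens=128, temperature=0.2, top_p=0.9)
concise_llm_config = VllmConfig(
    name="vllm-llama-4-maverick-concise",
    model_id="llama-4-maverick",
    url="http://url.to.my.vllm.server/llama4mav",
    default_generation_parameters=concise_generation_config,
)
summary_node = LlmNode(
    name="summary_node",
    prompt_template="Summarize the following answer: {{llm_output}}",
    llm_config=concise_llm_config,
    outputs=[StringProperty(title="summary")],
)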
Recap#
In this guide, you learned how to configure generation parameters for an agent or a flow.
Below is the complete code from this guide.
from pyagentspec.llms import LlmGenerationConfig, VllmConfig
generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)
llm_config = VllmConfig(
    name="vllm-llama-4-maverick",
    model_id="llama-4-maverick",
    url="http://url.to.my.vllm.server/llama4mav",
    default_generation_parameters=generation_config,
)
from pyagentspec import Agent
agent = Agent(
    name="my_first_agent",
    llm_config=llm_config,
    system_prompt="You are a helpful assistant.",
)
from pyagentspec.flows.edges import ControlFlowEdge, DataFlowEdge
from pyagentspec.flows.flow import Flow
from pyagentspec.flows.nodes import LlmNode, StartNode, EndNode
from pyagentspec.property import StringProperty
start_node = StartNode(name="start", inputs=[StringProperty(title="user_question")])
end_node = EndNode(name="end", outputs=[StringProperty(title="llm_output")])
llm_node = LlmNode(
    name="llm_node",
    prompt_template="{{user_question}}",
    llm_config=llm_config,
    outputs=[StringProperty(title="llm_output")],
)
flow = Flow(
    name="flow",
    start_node=start_node,
    nodes=[start_node, end_node, llm_node],
    control_flow_connections=[
        ControlFlowEdge(name="cfe1", from_node=start_node, to_node=llm_node),
        ControlFlowEdge(name="cfe2", from_node=llm_node, to_node=end_node),
    ],
    data_flow_connections=[
        DataFlowEdge(
            name="dfe1",
            source_node=start_node,
            source_output="user_question",
            destination_node=llm_node,
            destination_input="user_question",
        ),
        DataFlowEdge(
            name="dfe2",
            source_node=llm_node,
            source_output="llm_output",
            destination_node=end_node,
            destination_input="llm_output",
        ),
    ],
)
Next steps#
Having learned how to specify the generation configuration, you may now proceed to:
Some additional resources we recommend: