How to Specify the Generation Configuration when Using LLMs#
Generation parameters, such as temperature, top-p, and the maximum number of output tokens, are useful for achieving the desired performance with Large Language Models (LLMs).
In Agent Spec, these parameters can be configured in the default_generation_parameters attribute of the LlmConfig.
This guide will show you how to:
Configure the generation parameters for an agent.
Configure the generation parameters for a flow.
Note
For a deeper understanding of the impact of each generation parameter, refer to the resources at the bottom of this page.
Configure the generation parameters for an agent#
Customizing the generation configuration for an agent means specifying the generation configuration of the LLM used by the agent.
In pyagentspec, the generation configuration can be specified when creating an LlmConfig using the LlmGenerationConfig class.
This ensures that all outputs generated by the agent use the same generation configuration.
The LlmGenerationConfig is transformed into a dictionary during serialization. The class defines the following attributes, covering some of the most common and widely supported generation parameters, but it also allows arbitrary entries (see the sketch after the example below).
max_tokens: controls the maximum number of tokens to generate, ignoring the number of tokens in the prompt;
temperature: controls the randomness of the output;
top_p: controls the randomness of the output by restricting sampling to the smallest set of tokens whose cumulative probability exceeds top_p (nucleus sampling).
from pyagentspec.llms import LlmGenerationConfig
generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)
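The common attributes above are not the only ones you can set. Below is a minimal sketch of adding a provider-specific entry, assuming additional entries can be passed as extra keyword arguments; the frequency_penalty name is purely illustrative, and depending on your pyagentspec version extra entries may need to go through a dedicated field instead.
generation_config_with_extras = LlmGenerationConfig(
    max_tokens=512,
    temperature=1.0,
    top_p=1.0,
    frequency_penalty=0.2,  # hypothetical extra entry, serialized alongside the common parameters
)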
Agent Spec supports LLM configurations for several providers, and you can pass the generation_config to each of them.
from pyagentspec.llms import VllmConfig
llm_config = VllmConfig(
    name="vllm-llama-4-maverick",
    model_id="llama-4-maverick",
    url="http://url.to.my.vllm.server/llama4mav",
    default_generation_parameters=generation_config,
)
Important
API keys should not be stored anywhere in the Agent Spec representation of an agent.
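As a minimal illustration of this principle, credentials can be resolved from the runtime environment when the agent is executed instead of being written into the configuration; the variable name below is hypothetical, and how the key is handed to your execution runtime depends on the runtime you use.
import os
api_key = os.environ.get("MY_LLM_API_KEY")  # hypothetical variable name; the key never appears in the Agent Spec representation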
Now, you can build an agent using the LLM as follows:
from pyagentspec import Agent
agent = Agent(
    name="my_first_agent",
    llm_config=llm_config,
    system_prompt="You are a helpful assistant.",
)
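To check that the generation parameters travel with the agent, you can serialize it and inspect the result. This is a hedged sketch: the AgentSpecSerializer class and its to_json method are assumed names, so check the serialization utilities shipped with your pyagentspec version.
from pyagentspec.serialization import AgentSpecSerializer  # assumed import path
serialized_agent = AgentSpecSerializer().to_json(agent)
# The LlmGenerationConfig shows up in the output as a dictionary of generation parameters.
print(serialized_agent)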
Configure the generation parameters for a flow#
Customizing the generation configuration for a flow means specifying the generation configuration of the LLM used by a node.
Refer to the previous section to learn how to configure the generation parameters when initializing an LLM using the LlmGenerationConfig class.
For example, you can then create a one-step flow using the LlmNode with custom generation parameters as follows.
from pyagentspec.flows.edges import ControlFlowEdge, DataFlowEdge
from pyagentspec.flows.flow import Flow
from pyagentspec.flows.nodes import LlmNode, StartNode, EndNode
from pyagentspec.property import StringProperty
start_node = StartNode(name="start", inputs=[StringProperty(title="user_question")])
end_node = EndNode(name="end", outputs=[StringProperty(title="llm_output")])
llm_node = LlmNode(
    name="llm_node",
    prompt_template="{{user_question}}",
    llm_config=llm_config,
    outputs=[StringProperty(title="llm_output")],
)
flow = Flow(
    name="flow",
    start_node=start_node,
    nodes=[start_node, end_node, llm_node],
    control_flow_connections=[
        ControlFlowEdge(name="cfe1", from_node=start_node, to_node=llm_node),
        ControlFlowEdge(name="cfe2", from_node=llm_node, to_node=end_node),
    ],
    data_flow_connections=[
        DataFlowEdge(
            name="dfe1",
            source_node=start_node,
            source_output="user_question",
            destination_node=llm_node,
            destination_input="user_question",
        ),
        DataFlowEdge(
            name="dfe2",
            source_node=llm_node,
            source_output="llm_output",
            destination_node=end_node,
            destination_input="llm_output",
        ),
    ],
)
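If different nodes in a flow should generate differently, each node can be given its own LLM configuration. The following sketch reuses only the classes shown above; the node, parameter values, and names are illustrative.
# A second, more deterministic configuration for a hypothetical extra node.
concise_generation_config = LlmGenerationConfig(max_tokens=128, temperature=0.2, top_p=0.9)
concise_llm_config = VllmConfig(
    name="vllm-llama-4-maverick-concise",
    model_id="llama-4-maverick",
    url="http://url.to.my.vllm.server/llama4mav",
    default_generation_parameters=concise_generation_config,
)
summary_node = LlmNode(
    name="summary_node",
    prompt_template="Summarize the following answer: {{llm_output}}",
    llm_config=concise_llm_config,
    outputs=[StringProperty(title="summary")],
)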
Recap#
In this guide, you learned how to configure generation parameters for an agent or a flow.
Below is the complete code from this guide.
from pyagentspec.llms import LlmGenerationConfig, VllmConfig
generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)
llm_config = VllmConfig(
    name="vllm-llama-4-maverick",
    model_id="llama-4-maverick",
    url="http://url.to.my.vllm.server/llama4mav",
    default_generation_parameters=generation_config,
)
from pyagentspec import Agent
agent = Agent(
    name="my_first_agent",
    llm_config=llm_config,
    system_prompt="You are a helpful assistant.",
)
from pyagentspec.flows.edges import ControlFlowEdge, DataFlowEdge
from pyagentspec.flows.flow import Flow
from pyagentspec.flows.nodes import LlmNode, StartNode, EndNode
from pyagentspec.property import StringProperty
start_node = StartNode(name="start", inputs=[StringProperty(title="user_question")])
end_node = EndNode(name="end", outputs=[StringProperty(title="llm_output")])
llm_node = LlmNode(
    name="llm_node",
    prompt_template="{{user_question}}",
    llm_config=llm_config,
    outputs=[StringProperty(title="llm_output")],
)
flow = Flow(
    name="flow",
    start_node=start_node,
    nodes=[start_node, end_node, llm_node],
    control_flow_connections=[
        ControlFlowEdge(name="cfe1", from_node=start_node, to_node=llm_node),
        ControlFlowEdge(name="cfe2", from_node=llm_node, to_node=end_node),
    ],
    data_flow_connections=[
        DataFlowEdge(
            name="dfe1",
            source_node=start_node,
            source_output="user_question",
            destination_node=llm_node,
            destination_input="user_question",
        ),
        DataFlowEdge(
            name="dfe2",
            source_node=llm_node,
            source_output="llm_output",
            destination_node=end_node,
            destination_input="llm_output",
        ),
    ],
)
Next steps#
Having learned how to specify the generation configuration, you may now proceed to:
Some additional resources we recommend: