How to Specify the Generation Configuration when Using LLMs
Generation parameters, such as temperature, top-p, and the maximum number of output tokens, are important for achieving the desired performance with Large Language Models (LLMs).
In WayFlow, these parameters can be configured with the LlmGenerationConfig class.
This guide will show you how to:
- Configure the generation parameters for an agent.
- Configure the generation parameters for a flow.
- Apply the generation configuration from a dictionary.
- Save a custom generation configuration.
Note
For a deeper understanding of the impact of each generation parameter, refer to the resources at the bottom of this page.
Basic implementation
Configure the generation parameters for an agent
Customizing the generation configuration for an agent requires the use of the following wayflowcore components.
from wayflowcore.agent import Agent
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig
The generation configuration can be specified when initializing the LLM using the LlmGenerationConfig class. This ensures that all the outputs generated by the agent will have the same generation configuration.
The generation configuration dictionary can have the following arguments:
- max_tokens: controls the maximum number of tokens to generate, ignoring the number of tokens in the prompt;
- temperature: controls the randomness of the output;
- top_p: controls the randomness of the output by restricting sampling to the smallest set of tokens whose cumulative probability exceeds top_p (nucleus sampling);
- stop: defines a list of stop words that tell the LLM to stop generating;
- frequency_penalty: penalizes tokens according to how often they have already appeared, reducing repetition.
Additionally, the LlmGenerationConfig class offers the possibility to set a dictionary of arbitrary parameters, called extra_args, that is sent as part of the LLM generation call.
This allows specifying provider-specific parameters that might not be common to all providers.
Note
The extra parameters should never include sensitive information.
generation_config = LlmGenerationConfig(
max_tokens=512,
temperature=1.0,
top_p=1.0,
stop=["exit", "end"],
frequency_penalty=0,
extra_args={"seed": 1},
)
WayFlow supports several LLM API providers, and you can pass the generation_config to each of them.
Select an LLM from the options below:
from wayflowcore.models import OCIGenAIModel

llm = OCIGenAIModel(
    model_id="provider.model-id",
    service_endpoint="https://url-to-service-endpoint.com",
    compartment_id="compartment-id",
    auth_type="API_KEY",
    generation_config=generation_config,
)
from wayflowcore.models import VllmModel

llm = VllmModel(
    model_id="model-id",
    host_port="VLLM_HOST_PORT",
    generation_config=generation_config,
)
from wayflowcore.models import OllamaModel

llm = OllamaModel(
    model_id="model-id",
    generation_config=generation_config,
)
Important
API keys should not be stored anywhere in the code. Use environment variables and/or tools such as python-dotenv.
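For example, here is a minimal sketch of reading credentials from the environment with python-dotenv (the variable name MY_PROVIDER_API_KEY and the .env file are illustrative, not part of WayFlow):
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads key=value pairs from a local .env file into os.environ
api_key = os.environ["MY_PROVIDER_API_KEY"]  # hypothetical variable name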
Now, you can build an agent using the LLM as follows:
agent = Agent(llm=llm)
conversation = agent.start_conversation()
conversation.append_user_message("What is the capital of Switzerland?")
conversation.execute()
print(conversation.get_last_message())
Configure the generation parameters for a flow
Customizing the generation configuration for a flow requires the use of the following wayflowcore components.
from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.dataconnection import DataFlowEdge
from wayflowcore.flow import Flow
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig
from wayflowcore.property import StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep
Refer to the previous section to learn how to configure the generation parameters when initializing an LLM using the LlmGenerationConfig class.
You can then create a one-step flow using the PromptExecutionStep.
start_step = StartStep(name="start_step", input_descriptors=[StringProperty("user_question")])
prompt_step = PromptExecutionStep(
name="PromptExecution",
prompt_template="{{user_question}}",
llm=llm,
generation_config=LlmGenerationConfig(temperature=0.8),
)
flow = Flow(
begin_step=start_step,
control_flow_edges=[
ControlFlowEdge(source_step=start_step, destination_step=prompt_step),
ControlFlowEdge(source_step=prompt_step, destination_step=None),
],
data_flow_edges=[DataFlowEdge(start_step, "user_question", prompt_step, "user_question")],
)
conversation = flow.start_conversation(
inputs={"user_question": "What is the capital of Switzerland?"}
)
conversation.execute()
Important
The generation_config parameter passed to the PromptExecutionStep overrides the LLM’s original generation configuration.
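For instance, here is a minimal sketch of this precedence, reusing the llm defined earlier (the step name and temperature values are illustrative):
# Assume `llm` was initialized with generation_config=LlmGenerationConfig(temperature=0.1).
# This step still generates with temperature=0.9, because the step-level
# generation_config overrides the model-level one.
creative_step = PromptExecutionStep(
    name="CreativeStep",  # illustrative name
    prompt_template="{{user_question}}",
    llm=llm,
    generation_config=LlmGenerationConfig(temperature=0.9),
)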
Advanced usage
The LlmGenerationConfig class is a serializable object. It can be instantiated from a dictionary or saved to one, as you will see below.
Apply the generation configuration from a dictionary
If you have a generation configuration in a dictionary (for example, from a JSON or YAML file), you can instantiate the LlmGenerationConfig class as follows:
config_dict = {
"max_tokens": 512,
"temperature": 0.9,
}
config = LlmGenerationConfig.from_dict(config_dict)
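For example, here is a minimal sketch of reading such a dictionary from a JSON file with the standard library (the file name generation_config.json is hypothetical):
import json

# The file is expected to contain e.g. {"max_tokens": 512, "temperature": 0.9}
with open("generation_config.json") as f:
    config = LlmGenerationConfig.from_dict(json.load(f))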
Save a custom generation configuration
If you would like to share your specific generation configuration, you can create an LlmGenerationConfig instance and save it to a dictionary.
config = LlmGenerationConfig(max_tokens=1024, temperature=0.8, top_p=0.6)
config_dict = config.to_dict()
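Since the resulting dictionary contains only plain Python values, it can be written directly to a JSON (or YAML) file. A minimal sketch with the standard library (the output file name is hypothetical):
import json

with open("generation_config.json", "w") as f:
    json.dump(config_dict, f, indent=2)  # persists the configuration for sharing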
Agent Spec Exporting/Loading
You can export the assistant configuration to its Agent Spec representation using the AgentSpecExporter.
The following example exports the serialization of the flow defined above.
from wayflowcore.agentspec import AgentSpecExporter
serialized_assistant = AgentSpecExporter().to_yaml(flow)
Here is what the Agent Spec representation will look like:
component_type: Flow
id: fc3d10f4-5ee2-40d8-a580-0db6c44b0b39
name: flow_0e4b989a
description: ''
metadata:
__metadata_info__: {}
inputs:
- type: string
title: user_question
outputs:
- description: the generated text
type: string
title: output
start_node:
$component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
nodes:
- $component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
- $component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
- $component_ref: 158c838b-f8be-41ef-8b66-64348c8d379c
control_flow_connections:
- component_type: ControlFlowEdge
id: 7b8c0c0b-fcd1-4bf3-96be-dcf726ab1526
name: start_step_to_PromptExecution_control_flow_edge
description: null
metadata:
__metadata_info__: {}
from_node:
$component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
from_branch: null
to_node:
$component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
- component_type: ControlFlowEdge
id: 6b2c2840-126c-43fe-a8e1-f3cb08e8ae88
name: PromptExecution_to_None End node_control_flow_edge
description: null
metadata: {}
from_node:
$component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
from_branch: null
to_node:
$component_ref: 158c838b-f8be-41ef-8b66-64348c8d379c
data_flow_connections:
- component_type: DataFlowEdge
id: 86ba0435-be9b-46b0-97ae-64e145045e19
name: start_step_user_question_to_PromptExecution_user_question_data_flow_edge
description: null
metadata:
__metadata_info__: {}
source_node:
$component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
source_output: user_question
destination_node:
$component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
destination_input: user_question
- component_type: DataFlowEdge
id: 8638e234-9a23-45c5-89d4-296fc5a8c5ac
name: PromptExecution_output_to_None End node_output_data_flow_edge
description: null
metadata: {}
source_node:
$component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
source_output: output
destination_node:
$component_ref: 158c838b-f8be-41ef-8b66-64348c8d379c
destination_input: output
$referenced_components:
25917cac-52d4-4816-8c62-c18d8b70ee33:
component_type: LlmNode
id: 25917cac-52d4-4816-8c62-c18d8b70ee33
name: PromptExecution
description: ''
metadata:
__metadata_info__: {}
inputs:
- description: '"user_question" input variable for the template'
type: string
title: user_question
outputs:
- description: the generated text
type: string
title: output
branches:
- next
llm_config:
component_type: VllmConfig
id: 93d098ef-9643-4d38-a012-8903bacbb784
name: LLAMA_MODEL_ID
description: null
metadata:
__metadata_info__: {}
default_generation_parameters: null
url: LLAMA_API_URL
model_id: LLAMA_MODEL_ID
prompt_template: '{{user_question}}'
d8870848-f3c1-4a88-a0f3-b6ca20c61bab:
component_type: StartNode
id: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
name: start_step
description: ''
metadata:
__metadata_info__: {}
inputs:
- type: string
title: user_question
outputs:
- type: string
title: user_question
branches:
- next
158c838b-f8be-41ef-8b66-64348c8d379c:
component_type: EndNode
id: 158c838b-f8be-41ef-8b66-64348c8d379c
name: None End node
description: End node representing all transitions to None in the WayFlow flow
metadata: {}
inputs:
- description: the generated text
type: string
title: output
outputs:
- description: the generated text
type: string
title: output
branches: []
branch_name: next
agentspec_version: 25.4.1
You can then load the configuration back into an assistant using the AgentSpecLoader.
from wayflowcore.agentspec import AgentSpecLoader
assistant = AgentSpecLoader().load_yaml(serialized_assistant)
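The loaded assistant can then be used like the original flow. A minimal sketch, assuming the round trip preserves the flow's interface:
conversation = assistant.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
conversation.execute()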
Next steps
Having learned how to specify the generation configuration, you may now proceed to:
Some additional resources we recommend:
Full code
Click on the card at the top of this page to download the full code for this guide or copy the code below.
# Copyright © 2025 Oracle and/or its affiliates.
#
# This software is under the Universal Permissive License (UPL) 1.0
# (LICENSE-UPL or https://oss.oracle.com/licenses/upl) or Apache License
# 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0), at your option.

# %%[markdown]
# Code Example - How to Specify the Generation Configuration when Using LLMs
# --------------------------------------------------------------------------

# How to use:
# Create a new Python virtual environment and install the latest WayFlow version.
# ```bash
# python -m venv venv-wayflowcore
# source venv-wayflowcore/bin/activate
# pip install --upgrade pip
# pip install "wayflowcore==26.1"
# ```

# You can now run the script
# 1. As a Python file:
# ```bash
# python example_generationconfig.py
# ```
# 2. As a Notebook (in VSCode):
# When viewing the file,
# - press the keys Ctrl + Enter to run the selected cell
# - or Shift + Enter to run the selected cell and move to the cell below


# %%[markdown]
## Imports

# %%
from wayflowcore.agent import Agent
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig


# %%[markdown]
## Define the LLM generation configuration

# %%
generation_config = LlmGenerationConfig(
    max_tokens=512,
    temperature=1.0,
    top_p=1.0,
    stop=["exit", "end"],
    frequency_penalty=0,
    extra_args={"seed": 1},
)


# %%[markdown]
## Define the vLLM model

# %%
from wayflowcore.models import VllmModel

llm = VllmModel(
    model_id="LLAMA_MODEL_ID",
    host_port="LLAMA_API_URL",
    generation_config=generation_config,
)
# NOTE: host_port should be a string with the IP address/domain name and the port. An example string: "192.168.1.1:8000"
# NOTE: model_id usually indicates the HuggingFace model id,
# e.g. meta-llama/Llama-3.1-8B-Instruct from https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

# %%[markdown]
## Build the agent and run it

# %%
agent = Agent(llm=llm)
conversation = agent.start_conversation()
conversation.append_user_message("What is the capital of Switzerland?")
conversation.execute()
print(conversation.get_last_message())


# %%[markdown]
## Import what is needed to build a flow

# %%
from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.dataconnection import DataFlowEdge
from wayflowcore.flow import Flow
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig
from wayflowcore.property import StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep


# %%[markdown]
## Build the flow using custom generation parameters

# %%
start_step = StartStep(name="start_step", input_descriptors=[StringProperty("user_question")])
prompt_step = PromptExecutionStep(
    name="PromptExecution",
    prompt_template="{{user_question}}",
    llm=llm,
    generation_config=LlmGenerationConfig(temperature=0.8),
)
flow = Flow(
    begin_step=start_step,
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=prompt_step),
        ControlFlowEdge(source_step=prompt_step, destination_step=None),
    ],
    data_flow_edges=[DataFlowEdge(start_step, "user_question", prompt_step, "user_question")],
)
conversation = flow.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
conversation.execute()


# %%[markdown]
## Export config to Agent Spec

# %%
from wayflowcore.agentspec import AgentSpecExporter

serialized_assistant = AgentSpecExporter().to_yaml(flow)


# %%[markdown]
## Load Agent Spec config

# %%
from wayflowcore.agentspec import AgentSpecLoader

assistant = AgentSpecLoader().load_yaml(serialized_assistant)


# %%[markdown]
## Build the generation configuration from dictionary

# %%
config_dict = {
    "max_tokens": 512,
    "temperature": 0.9,
}

config = LlmGenerationConfig.from_dict(config_dict)


# %%[markdown]
## Export a generation configuration to dictionary

# %%
config = LlmGenerationConfig(max_tokens=1024, temperature=0.8, top_p=0.6)
config_dict = config.to_dict()