How to Specify the Generation Configuration when Using LLMs#
Generation parameters, such as temperature, top-p, the maximum number of output tokens, and per-token log-probabilities, are important for achieving the desired performance with Large Language Models (LLMs). In WayFlow, these parameters can be configured with the LlmGenerationConfig class.
This guide will show you how to:
Configure the generation parameters for an agent.
Configure the generation parameters for a flow.
Request token log probabilities.
Apply the generation configuration from a dictionary.
Save a custom generation configuration.
Note
For a deeper understanding of the impact of each generation parameter, refer to the resources at the bottom of this page.
Basic implementation#
Configure the generation parameters for an agent#
Customizing the generation configuration for an agent requires the use of the following wayflowcore components.
from wayflowcore.agent import Agent
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig
The generation configuration can be specified when initializing the LLM using the LlmGenerationConfig class. This ensures that all the outputs generated by the agent will have the same generation configuration.
The generation configuration accepts the following arguments:
max_tokens: the maximum number of tokens to generate, not counting the tokens in the prompt;
temperature: controls the randomness of the output (lower values make the output more deterministic);
top_p: restricts sampling to the most likely tokens whose cumulative probability reaches top_p (nucleus sampling);
stop: a list of stop words that tell the LLM to stop generating;
frequency_penalty: penalizes tokens according to how often they have already been generated, which discourages repetition;
top_logprobs: requests token-level log probabilities, including alternate candidates when the provider supports them.
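To build intuition for temperature and top_p, the sketch below applies the textbook definitions of temperature scaling and nucleus filtering to a toy three-token distribution. It uses only the standard library and does not call WayFlow; the function names and logit values are illustrative assumptions, and providers may differ in implementation details.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [value / temperature for value in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalize so the surviving tokens form a valid distribution.
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
sharp = softmax_with_temperature(logits, 0.5)  # more mass on the top token
flat = softmax_with_temperature(logits, 2.0)   # mass spread more evenly
nucleus = top_p_filter(softmax_with_temperature(logits, 1.0), 0.8)

print("temperature=0.5:", [round(p, 3) for p in sharp])
print("temperature=2.0:", [round(p, 3) for p in flat])
print("tokens kept with top_p=0.8:", sorted(nucleus))
```

With these values, lowering the temperature concentrates probability on the first token, and top_p=0.8 discards the least likely token before sampling.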
Additionally, LlmGenerationConfig lets you set a dictionary of arbitrary parameters, called extra_args, that is sent as part of the LLM generation call.
This allows you to specify provider-specific parameters that are not common to all providers.
Note
The extra parameters should never include sensitive information.
generation_config = LlmGenerationConfig(
    max_tokens=512,
    temperature=1.0,
    top_p=1.0,
    stop=["exit", "end"],
    frequency_penalty=0,
    extra_args={"seed": 1},
)
WayFlow supports several LLM API providers.
You can pass the generation_config for each of them.
Select an LLM from the options below:
# Option 1: OCI Generative AI
from wayflowcore.models import OCIGenAIModel, OCIClientConfigWithApiKey
llm = OCIGenAIModel(
    model_id="provider.model-id",
    compartment_id="compartment-id",
    client_config=OCIClientConfigWithApiKey(
        service_endpoint="https://url-to-service-endpoint.com",
    ),
    generation_config=generation_config,
)

# Option 2: vLLM
from wayflowcore.models import VllmModel
llm = VllmModel(
    model_id="model-id",
    host_port="VLLM_HOST_PORT",
    generation_config=generation_config,
)

# Option 3: Ollama
from wayflowcore.models import OllamaModel
llm = OllamaModel(
    model_id="model-id",
    generation_config=generation_config,
)
Important
API keys should not be stored anywhere in the code. Use environment variables and/or tools such as python-dotenv.
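As a minimal sketch of the environment-variable approach, the snippet below reads a connection setting from the environment and fails early when it is missing. The helper name read_required_env is hypothetical, and VLLM_HOST_PORT is simply the placeholder name used in this guide; the same pattern applies to API keys.

```python
import os


def read_required_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# For demonstration only: provide a default so the sketch runs on its own.
# In a real deployment, set the variable outside the code (shell, .env file, secret store).
os.environ.setdefault("VLLM_HOST_PORT", "localhost:8000")

host_port = read_required_env("VLLM_HOST_PORT")
print(host_port)
```

The resulting value can then be passed where the guide uses the "VLLM_HOST_PORT" placeholder string.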
Now, you can build an agent using the LLM as follows:
agent = Agent(llm=llm)
conversation = agent.start_conversation()
conversation.append_user_message("What is the capital of Switzerland?")
conversation.execute()
print(conversation.get_last_message())
Configure the generation parameters for a flow#
Customizing the generation configuration for a flow requires the use of the following wayflowcore components.
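from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.dataconnection import DataFlowEdge
from wayflowcore.flow import Flow
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig
from wayflowcore.property import StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep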
Refer to the previous section to learn how to configure the generation parameters when initializing an LLM using the LlmGenerationConfig class.
You can then create a one-step flow using the PromptExecutionStep step.
start_step = StartStep(name="start_step", input_descriptors=[StringProperty("user_question")])
prompt_step = PromptExecutionStep(
    name="PromptExecution",
    prompt_template="{{user_question}}",
    llm=llm,
    generation_config=LlmGenerationConfig(temperature=0.8),
)
flow = Flow(
    begin_step=start_step,
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=prompt_step),
        ControlFlowEdge(source_step=prompt_step, destination_step=None),
    ],
    data_flow_edges=[DataFlowEdge(start_step, "user_question", prompt_step, "user_question")],
)
conversation = flow.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
conversation.execute()
Important
The generation_config parameter passed to the PromptExecutionStep overrides the LLM’s original generation configuration.
Advanced usage#
The LlmGenerationConfig class is a serializable object. It can be instantiated from a dictionary or saved to one, as you will see below.
Request token log probabilities#
Use top_logprobs when you want the model to return token-level probabilities for generated text.
WayFlow stores those values on TextContent.logprobs for direct LLM calls, and the
PromptExecutionStep also exposes them as an additional logprobs output.
Note
top_logprobs is only available for raw text generation.
It is not supported with structured generation in PromptExecutionStep,
and support depends on the selected provider and model.
For direct LlmModel calls, configure top_logprobs on the prompt and inspect the TextContent chunk:
from wayflowcore.messagelist import Message, TextContent
from wayflowcore.models import Prompt
prompt = Prompt(
    messages=[Message(content="Say 'Bern' and nothing else.")],
    generation_config=LlmGenerationConfig(top_logprobs=2, max_tokens=16),
)
completion = llm.generate(prompt)
text_chunk = next(chunk for chunk in completion.message.contents if isinstance(chunk, TextContent))
print(text_chunk.content)
print(text_chunk.logprobs)
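Log probabilities are easier to interpret once converted to probabilities or aggregated over the sequence. The sketch below assumes you have already extracted one log probability per generated token into a plain list of floats; the exact structure of text_chunk.logprobs is not described in this guide, so the input values and helper names are hypothetical.

```python
import math

# Hypothetical per-token log probabilities (natural log) for a three-token answer.
token_logprobs = [-0.05, -0.20, -1.10]


def to_probabilities(logprobs):
    """Convert log probabilities back to probabilities in (0, 1]."""
    return [math.exp(value) for value in logprobs]


def sequence_logprob(logprobs):
    """Log probability of the whole sequence: the sum of the per-token values."""
    return sum(logprobs)


def perplexity(logprobs):
    """Exponential of the negative mean log probability; lower means more confident."""
    return math.exp(-sum(logprobs) / len(logprobs))


print([round(p, 3) for p in to_probabilities(token_logprobs)])
print(round(sequence_logprob(token_logprobs), 3))
print(round(perplexity(token_logprobs), 3))
```

A token with a log probability close to 0 was generated with high confidence, while strongly negative values point to tokens where the model was uncertain.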
For flows, you can request logprobs directly on PromptExecutionStep.
When enabled, the step appends a logprobs output alongside the normal text output:
from wayflowcore.executors.executionstatus import FinishedStatus
logprob_start_step = StartStep(
    name="logprob_start_step",
    input_descriptors=[StringProperty("user_question")],
)
logprob_step = PromptExecutionStep(
    name="PromptExecutionWithLogprobs",
    prompt_template="{{user_question}}",
    llm=llm,
    top_logprobs=2,
)
logprob_flow = Flow(
    begin_step=logprob_start_step,
    control_flow_edges=[
        ControlFlowEdge(source_step=logprob_start_step, destination_step=logprob_step),
        ControlFlowEdge(source_step=logprob_step, destination_step=None),
    ],
    data_flow_edges=[
        DataFlowEdge(logprob_start_step, "user_question", logprob_step, "user_question")
    ],
)
conversation = logprob_flow.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
status = conversation.execute()
if isinstance(status, FinishedStatus):
    print(status.output_values[PromptExecutionStep.OUTPUT])
    print(status.output_values[PromptExecutionStep.LOGPROBS])
Apply the generation configuration from a dictionary#
If you have a generation configuration in a dictionary (for example, from a JSON or YAML file), you can instantiate the LlmGenerationConfig class as follows:
config_dict = {
    "max_tokens": 512,
    "temperature": 0.9,
}
config = LlmGenerationConfig.from_dict(config_dict)
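When the dictionary comes from a JSON document, it can help to parse and sanity-check it before handing it to from_dict. The sketch below is one possible approach using only the standard library. The helper name and the KNOWN_KEYS set are assumptions: the set lists only the arguments named in this guide, and the class may accept others.

```python
import json

# Assumption: only the arguments named in this guide; adjust to your WayFlow version.
KNOWN_KEYS = {
    "max_tokens",
    "temperature",
    "top_p",
    "stop",
    "frequency_penalty",
    "top_logprobs",
    "extra_args",
}


def load_generation_config_dict(raw_json: str) -> dict:
    """Parse a JSON object and reject keys that look like typos."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise TypeError("the generation configuration must be a JSON object")
    unknown = set(data) - KNOWN_KEYS
    if unknown:
        raise ValueError(f"unexpected generation parameters: {sorted(unknown)}")
    return data


raw = '{"max_tokens": 512, "temperature": 0.9}'
config_dict = load_generation_config_dict(raw)
print(config_dict)

# With WayFlow installed, the dictionary can then be used as in the guide:
# config = LlmGenerationConfig.from_dict(config_dict)
```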
Save a custom generation configuration#
If you would like to share a specific generation configuration, you can create an LlmGenerationConfig instance and convert it to a dictionary.
config = LlmGenerationConfig(max_tokens=1024, temperature=0.8, top_p=0.6)
config_dict = config.to_dict()
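To share the configuration as a file, the dictionary can be written to disk and read back later. The sketch below uses a plain dictionary as a stand-in for the result of config.to_dict(), since the exact keys it returns are not shown in this guide, and it assumes that the result is JSON-serializable. The file name is illustrative.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for config.to_dict(); the real output may contain additional keys.
config_dict = {"max_tokens": 1024, "temperature": 0.8, "top_p": 0.6}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "generation_config.json"  # hypothetical file name
    # Save the configuration as human-readable JSON.
    path.write_text(json.dumps(config_dict, indent=2), encoding="utf-8")
    # Read it back, as a colleague receiving the file would do.
    restored = json.loads(path.read_text(encoding="utf-8"))

print(restored == config_dict)
```

The restored dictionary can be passed to LlmGenerationConfig.from_dict as described in the previous section.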
Agent Spec Exporting/Loading#
You can export the assistant to its Agent Spec configuration using the AgentSpecExporter.
The following example serializes the flow defined above.
from wayflowcore.agentspec import AgentSpecExporter
serialized_assistant = AgentSpecExporter().to_yaml(flow)
Here is what the Agent Spec representation looks like, first in JSON and then in YAML:
{
"component_type": "Flow",
"id": "fc3d10f4-5ee2-40d8-a580-0db6c44b0b39",
"name": "flow_0e4b989a",
"description": "",
"metadata": {
"__metadata_info__": {}
},
"inputs": [
{
"type": "string",
"title": "user_question"
}
],
"outputs": [
{
"description": "the generated text",
"type": "string",
"title": "output"
}
],
"start_node": {
"$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
},
"nodes": [
{
"$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
},
{
"$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
},
{
"$component_ref": "158c838b-f8be-41ef-8b66-64348c8d379c"
}
],
"control_flow_connections": [
{
"component_type": "ControlFlowEdge",
"id": "7b8c0c0b-fcd1-4bf3-96be-dcf726ab1526",
"name": "start_step_to_PromptExecution_control_flow_edge",
"description": null,
"metadata": {
"__metadata_info__": {}
},
"from_node": {
"$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
},
"from_branch": null,
"to_node": {
"$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
}
},
{
"component_type": "ControlFlowEdge",
"id": "6b2c2840-126c-43fe-a8e1-f3cb08e8ae88",
"name": "PromptExecution_to_None End node_control_flow_edge",
"description": null,
"metadata": {},
"from_node": {
"$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
},
"from_branch": null,
"to_node": {
"$component_ref": "158c838b-f8be-41ef-8b66-64348c8d379c"
}
}
],
"data_flow_connections": [
{
"component_type": "DataFlowEdge",
"id": "86ba0435-be9b-46b0-97ae-64e145045e19",
"name": "start_step_user_question_to_PromptExecution_user_question_data_flow_edge",
"description": null,
"metadata": {
"__metadata_info__": {}
},
"source_node": {
"$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
},
"source_output": "user_question",
"destination_node": {
"$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
},
"destination_input": "user_question"
},
{
"component_type": "DataFlowEdge",
"id": "8638e234-9a23-45c5-89d4-296fc5a8c5ac",
"name": "PromptExecution_output_to_None End node_output_data_flow_edge",
"description": null,
"metadata": {},
"source_node": {
"$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
},
"source_output": "output",
"destination_node": {
"$component_ref": "158c838b-f8be-41ef-8b66-64348c8d379c"
},
"destination_input": "output"
}
],
"$referenced_components": {
"25917cac-52d4-4816-8c62-c18d8b70ee33": {
"component_type": "LlmNode",
"id": "25917cac-52d4-4816-8c62-c18d8b70ee33",
"name": "PromptExecution",
"description": "",
"metadata": {
"__metadata_info__": {}
},
"inputs": [
{
"description": "\"user_question\" input variable for the template",
"type": "string",
"title": "user_question"
}
],
"outputs": [
{
"description": "the generated text",
"type": "string",
"title": "output"
}
],
"branches": [
"next"
],
"llm_config": {
"component_type": "VllmConfig",
"id": "93d098ef-9643-4d38-a012-8903bacbb784",
"name": "LLAMA_MODEL_ID",
"description": null,
"metadata": {
"__metadata_info__": {}
},
"default_generation_parameters": null,
"url": "LLAMA_API_URL",
"model_id": "LLAMA_MODEL_ID"
},
"prompt_template": "{{user_question}}"
},
"d8870848-f3c1-4a88-a0f3-b6ca20c61bab": {
"component_type": "StartNode",
"id": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab",
"name": "start_step",
"description": "",
"metadata": {
"__metadata_info__": {}
},
"inputs": [
{
"type": "string",
"title": "user_question"
}
],
"outputs": [
{
"type": "string",
"title": "user_question"
}
],
"branches": [
"next"
]
},
"158c838b-f8be-41ef-8b66-64348c8d379c": {
"component_type": "EndNode",
"id": "158c838b-f8be-41ef-8b66-64348c8d379c",
"name": "None End node",
"description": "End node representing all transitions to None in the WayFlow flow",
"metadata": {},
"inputs": [
{
"description": "the generated text",
"type": "string",
"title": "output"
}
],
"outputs": [
{
"description": "the generated text",
"type": "string",
"title": "output"
}
],
"branches": [],
"branch_name": "next"
}
},
"agentspec_version": "25.4.1"
}
component_type: Flow
id: fc3d10f4-5ee2-40d8-a580-0db6c44b0b39
name: flow_0e4b989a
description: ''
metadata:
  __metadata_info__: {}
inputs:
- type: string
  title: user_question
outputs:
- description: the generated text
  type: string
  title: output
start_node:
  $component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
nodes:
- $component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
- $component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
- $component_ref: 158c838b-f8be-41ef-8b66-64348c8d379c
control_flow_connections:
- component_type: ControlFlowEdge
  id: 7b8c0c0b-fcd1-4bf3-96be-dcf726ab1526
  name: start_step_to_PromptExecution_control_flow_edge
  description: null
  metadata:
    __metadata_info__: {}
  from_node:
    $component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
  from_branch: null
  to_node:
    $component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
- component_type: ControlFlowEdge
  id: 6b2c2840-126c-43fe-a8e1-f3cb08e8ae88
  name: PromptExecution_to_None End node_control_flow_edge
  description: null
  metadata: {}
  from_node:
    $component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
  from_branch: null
  to_node:
    $component_ref: 158c838b-f8be-41ef-8b66-64348c8d379c
data_flow_connections:
- component_type: DataFlowEdge
  id: 86ba0435-be9b-46b0-97ae-64e145045e19
  name: start_step_user_question_to_PromptExecution_user_question_data_flow_edge
  description: null
  metadata:
    __metadata_info__: {}
  source_node:
    $component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
  source_output: user_question
  destination_node:
    $component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
  destination_input: user_question
- component_type: DataFlowEdge
  id: 8638e234-9a23-45c5-89d4-296fc5a8c5ac
  name: PromptExecution_output_to_None End node_output_data_flow_edge
  description: null
  metadata: {}
  source_node:
    $component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
  source_output: output
  destination_node:
    $component_ref: 158c838b-f8be-41ef-8b66-64348c8d379c
  destination_input: output
$referenced_components:
  25917cac-52d4-4816-8c62-c18d8b70ee33:
    component_type: LlmNode
    id: 25917cac-52d4-4816-8c62-c18d8b70ee33
    name: PromptExecution
    description: ''
    metadata:
      __metadata_info__: {}
    inputs:
    - description: '"user_question" input variable for the template'
      type: string
      title: user_question
    outputs:
    - description: the generated text
      type: string
      title: output
    branches:
    - next
    llm_config:
      component_type: VllmConfig
      id: 93d098ef-9643-4d38-a012-8903bacbb784
      name: LLAMA_MODEL_ID
      description: null
      metadata:
        __metadata_info__: {}
      default_generation_parameters: null
      url: LLAMA_API_URL
      model_id: LLAMA_MODEL_ID
    prompt_template: '{{user_question}}'
  d8870848-f3c1-4a88-a0f3-b6ca20c61bab:
    component_type: StartNode
    id: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
    name: start_step
    description: ''
    metadata:
      __metadata_info__: {}
    inputs:
    - type: string
      title: user_question
    outputs:
    - type: string
      title: user_question
    branches:
    - next
  158c838b-f8be-41ef-8b66-64348c8d379c:
    component_type: EndNode
    id: 158c838b-f8be-41ef-8b66-64348c8d379c
    name: None End node
    description: End node representing all transitions to None in the WayFlow flow
    metadata: {}
    inputs:
    - description: the generated text
      type: string
      title: output
    outputs:
    - description: the generated text
      type: string
      title: output
    branches: []
    branch_name: next
agentspec_version: 25.4.1
You can then load the configuration back to an assistant using the AgentSpecLoader.
from wayflowcore.agentspec import AgentSpecLoader
assistant = AgentSpecLoader().load_yaml(serialized_assistant)
Next steps#
Having learned how to specify the generation configuration, you may now proceed to:
Some additional resources we recommend:
Full code#
Click on the card at the top of this page to download the full code for this guide or copy the code below.
# Copyright © 2025 Oracle and/or its affiliates.
#
# This software is under the Apache License 2.0
# (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0) or Universal Permissive License
# (UPL) 1.0 (LICENSE-UPL or https://oss.oracle.com/licenses/upl), at your option.

# %%[markdown]
# Code Example - How to Specify the Generation Configuration when Using LLMs
# --------------------------------------------------------------------------

# How to use:
# Create a new Python virtual environment and install the latest WayFlow version.
# ```bash
# python -m venv venv-wayflowcore
# source venv-wayflowcore/bin/activate
# pip install --upgrade pip
# pip install "wayflowcore==26.1.2"
# ```

# You can now run the script
# 1. As a Python file:
# ```bash
# python example_generationconfig.py
# ```
# 2. As a Notebook (in VSCode):
# When viewing the file,
#  - press the keys Ctrl + Enter to run the selected cell
#  - or Shift + Enter to run the selected cell and move to the cell below


# %%[markdown]
## Imports

# %%
from wayflowcore.agent import Agent
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig


# %%[markdown]
## Define the llm generation configuration

# %%
generation_config = LlmGenerationConfig(
    max_tokens=512,
    temperature=1.0,
    top_p=1.0,
    stop=["exit", "end"],
    frequency_penalty=0,
    extra_args={"seed": 1},
)


# %%[markdown]
## Define the vLLM

# %%
from wayflowcore.models import VllmModel

llm = VllmModel(
    model_id="LLAMA_MODEL_ID",
    host_port="LLAMA_API_URL",
    generation_config=generation_config,
)
# NOTE: host_port should be a string with the IP address/domain name and the port. An example string: "192.168.1.1:8000"
# NOTE: model_id usually indicates the HuggingFace model id,
# e.g. meta-llama/Llama-3.1-8B-Instruct from https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

# %%[markdown]
## Build the agent and run it

# %%
agent = Agent(llm=llm)
conversation = agent.start_conversation()
conversation.append_user_message("What is the capital of Switzerland?")
conversation.execute()
print(conversation.get_last_message())


# %%[markdown]
## Request logprobs from a direct llm call

# %%
from wayflowcore.messagelist import Message, TextContent
from wayflowcore.models import Prompt

prompt = Prompt(
    messages=[Message(content="Say 'Bern' and nothing else.")],
    generation_config=LlmGenerationConfig(top_logprobs=2, max_tokens=16),
)
completion = llm.generate(prompt)
text_chunk = next(chunk for chunk in completion.message.contents if isinstance(chunk, TextContent))

print(text_chunk.content)
print(text_chunk.logprobs)


from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.dataconnection import DataFlowEdge


# %%[markdown]
## Import what is needed to build a flow

# %%
from wayflowcore.flow import Flow
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig
from wayflowcore.property import StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep


# %%[markdown]
## Build the flow using custom generation parameters

# %%
start_step = StartStep(name="start_step", input_descriptors=[StringProperty("user_question")])
prompt_step = PromptExecutionStep(
    name="PromptExecution",
    prompt_template="{{user_question}}",
    llm=llm,
    generation_config=LlmGenerationConfig(temperature=0.8),
)
flow = Flow(
    begin_step=start_step,
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=prompt_step),
        ControlFlowEdge(source_step=prompt_step, destination_step=None),
    ],
    data_flow_edges=[DataFlowEdge(start_step, "user_question", prompt_step, "user_question")],
)
conversation = flow.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
conversation.execute()


# %%[markdown]
## Request logprobs from a flow step

# %%
from wayflowcore.executors.executionstatus import FinishedStatus

logprob_start_step = StartStep(
    name="logprob_start_step",
    input_descriptors=[StringProperty("user_question")],
)
logprob_step = PromptExecutionStep(
    name="PromptExecutionWithLogprobs",
    prompt_template="{{user_question}}",
    llm=llm,
    top_logprobs=2,
)
logprob_flow = Flow(
    begin_step=logprob_start_step,
    control_flow_edges=[
        ControlFlowEdge(source_step=logprob_start_step, destination_step=logprob_step),
        ControlFlowEdge(source_step=logprob_step, destination_step=None),
    ],
    data_flow_edges=[
        DataFlowEdge(logprob_start_step, "user_question", logprob_step, "user_question")
    ],
)
conversation = logprob_flow.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
status = conversation.execute()
if isinstance(status, FinishedStatus):
    print(status.output_values[PromptExecutionStep.OUTPUT])
    print(status.output_values[PromptExecutionStep.LOGPROBS])


# %%[markdown]
## Export config to Agent Spec

# %%
from wayflowcore.agentspec import AgentSpecExporter

serialized_assistant = AgentSpecExporter().to_yaml(flow)


# %%[markdown]
## Load Agent Spec config

# %%
from wayflowcore.agentspec import AgentSpecLoader

assistant = AgentSpecLoader().load_yaml(serialized_assistant)


# %%[markdown]
## Build the generation configuration from dictionary

# %%
config_dict = {
    "max_tokens": 512,
    "temperature": 0.9,
}

config = LlmGenerationConfig.from_dict(config_dict)


# %%[markdown]
## Export a generation configuration to dictionary

# %%

config = LlmGenerationConfig(max_tokens=1024, temperature=0.8, top_p=0.6)
config_dict = config.to_dict()