How to Specify the Generation Configuration when Using LLMs#

Download Python Script

Python script/notebook for this guide.

Generation parameters, such as temperature, top-p, the maximum number of output tokens, and per-token log-probabilities, are important for achieving the desired performance with Large Language Models (LLMs). In WayFlow, these parameters can be configured with the LlmGenerationConfig class.

This guide will show you how to:

Configure the generation parameters for an agent.
Configure the generation parameters for a flow.
Request token log probabilities.
Apply the generation configuration from a dictionary.
Save a custom generation configuration.

Note

For a deeper understanding of the impact of each generation parameter, refer to the resources at the bottom of this page.

Basic implementation#

Configure the generation parameters for an agent#

Customizing the generation configuration for an agent requires the use of the following wayflowcore components.

from wayflowcore.agent import Agent
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig

The generation configuration can be specified when initializing the LLM using the LlmGenerationConfig class. This ensures that all the outputs generated by the agent will have the same generation configuration.

The generation configuration dictionary can have the following arguments:

max_tokens: sets the maximum number of tokens to generate, not counting the tokens in the prompt;
temperature: controls the randomness of the output; lower values make the output more deterministic;
top_p: restricts sampling to the smallest set of most likely tokens whose cumulative probability reaches top_p (nucleus sampling);
stop: defines a list of stop strings; the LLM stops generating when it produces one of them;
frequency_penalty: penalizes tokens according to how often they have already been generated, which reduces repetition;
top_logprobs: requests token-level log probabilities, including alternate candidates when the provider supports them.
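
To make the effect of temperature and top_p more concrete, here is a minimal sketch in plain Python. It is not WayFlow code: the helper names `apply_temperature` and `top_p_filter` and the toy logits are assumptions chosen for illustration, and real providers may implement sampling differently.

```python
import math


def apply_temperature(logits, temperature):
    """Return softmax probabilities of the logits scaled by 1 / temperature."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalize the surviving tokens; every other token is dropped.
    return {i: probs[i] / mass for i in kept}


toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
sharp = apply_temperature(toy_logits, temperature=0.5)  # more peaked
flat = apply_temperature(toy_logits, temperature=2.0)  # closer to uniform
nucleus = top_p_filter(apply_temperature(toy_logits, temperature=1.0), top_p=0.8)

print(sharp, flat, nucleus)
```

A low temperature concentrates probability on the most likely token, while a low top_p removes the unlikely tail entirely before sampling.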

Additionally, LlmGenerationConfig accepts a dictionary of arbitrary parameters, called extra_args, that is sent as part of the LLM generation call. This allows you to specify provider-specific parameters that are not common to all providers.

Note

The extra parameters should never include sensitive information.

generation_config = LlmGenerationConfig(
    max_tokens=512,
    temperature=1.0,
    top_p=1.0,
    stop=["exit", "end"],
    frequency_penalty=0,
    extra_args={"seed": 1},
)
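
The stop list in the configuration above can be pictured as a truncation rule on the generated text. The sketch below is an illustration in plain Python, not the mechanism WayFlow or any provider uses: the helper `truncate_at_stop` is hypothetical, and providers typically apply stop strings during decoding rather than afterwards.

```python
def truncate_at_stop(text, stop):
    """Cut the text just before the earliest occurrence of any stop string."""
    cut = len(text)
    for marker in stop:
        position = text.find(marker)
        if position != -1:
            cut = min(cut, position)
    return text[:cut]


generated = "The capital is Bern. end of answer. exit"
truncated = truncate_at_stop(generated, ["exit", "end"])
print(repr(truncated))
```

Note that a stop string matches anywhere in the output, so short or common words can end the generation earlier than intended.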

WayFlow supports several LLM API providers, and you can pass the generation_config to each of them. The examples below cover OCI Generative AI, vLLM, and Ollama:

from wayflowcore.models import OCIGenAIModel, OCIClientConfigWithApiKey

llm = OCIGenAIModel(
    model_id="provider.model-id",
    compartment_id="compartment-id",
    client_config=OCIClientConfigWithApiKey(
        service_endpoint="https://url-to-service-endpoint.com",
    ),
    generation_config=generation_config,
)

from wayflowcore.models import VllmModel

llm = VllmModel(
    model_id="model-id",
    host_port="VLLM_HOST_PORT",
    generation_config=generation_config,
)

from wayflowcore.models import OllamaModel

llm = OllamaModel(
    model_id="model-id",
    generation_config=generation_config,
)

Important

API keys should not be stored anywhere in the code. Use environment variables and/or tools such as python-dotenv.
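
As a minimal sketch of that advice, the helper below reads a secret from an environment mapping and fails early when it is missing. The function `read_secret` and the variable name `EXAMPLE_LLM_API_KEY` are hypothetical; a demo mapping stands in for the real environment so that no actual key appears in the code.

```python
import os


def read_secret(name, environ=os.environ):
    """Return the value of an environment variable or fail with a clear message."""
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demo mapping used only to run the sketch; in practice, omit `environ`
# so that the value is read from the real process environment.
demo_environ = {"EXAMPLE_LLM_API_KEY": "placeholder-value"}
api_key = read_secret("EXAMPLE_LLM_API_KEY", environ=demo_environ)
```

Failing at startup with an explicit message is usually easier to debug than an authentication error raised later by the provider.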

Now, you can build an agent using the LLM as follows:

agent = Agent(llm=llm)
conversation = agent.start_conversation()
conversation.append_user_message("What is the capital of Switzerland?")
conversation.execute()
print(conversation.get_last_message())

Configure the generation parameters for a flow#

Customizing the generation configuration for a flow requires the use of the following wayflowcore components.

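from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.dataconnection import DataFlowEdge
from wayflowcore.flow import Flow
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig
from wayflowcore.property import StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep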

Refer to the previous section to learn how to configure the generation parameters when initializing an LLM using the LlmGenerationConfig class.

You can then create a one-step flow using the PromptExecutionStep step.

start_step = StartStep(name="start_step", input_descriptors=[StringProperty("user_question")])
prompt_step = PromptExecutionStep(
    name="PromptExecution",
    prompt_template="{{user_question}}",
    llm=llm,
    generation_config=LlmGenerationConfig(temperature=0.8),
)
flow = Flow(
    begin_step=start_step,
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=prompt_step),
        ControlFlowEdge(source_step=prompt_step, destination_step=None),
    ],
    data_flow_edges=[DataFlowEdge(start_step, "user_question", prompt_step, "user_question")],
)
conversation = flow.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
conversation.execute()

Important

The generation_config parameter passed to the PromptExecutionStep overrides the LLM’s original generation configuration.
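
One way to picture this precedence is a merge of two settings dictionaries in which step-level values win. The sketch below is an illustration only: this guide does not state whether parameters left unset at the step level fall back to the LLM-level values, so the field-wise merge and the helper `resolve_generation_settings` are assumptions, not WayFlow's implementation.

```python
def resolve_generation_settings(llm_level, step_level):
    """Merge two settings dicts; step-level values win when they are set."""
    resolved = dict(llm_level)
    for key, value in step_level.items():
        if value is not None:  # None is treated as "not set at the step level"
            resolved[key] = value
    return resolved


llm_level = {"max_tokens": 512, "temperature": 1.0, "top_p": 1.0}
step_level = {"temperature": 0.8, "top_p": None}
effective = resolve_generation_settings(llm_level, step_level)
print(effective)
```

If the exact fallback behavior matters for your use case, set every parameter you rely on explicitly in the step-level configuration.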

Advanced usage#

The LlmGenerationConfig class is a serializable object. It can be instantiated from a dictionary or saved to one, as you will see below.

Request token log probabilities#

Use top_logprobs when you want the model to return token-level probabilities for generated text. WayFlow stores those values on TextContent.logprobs for direct LLM calls, and the PromptExecutionStep also exposes them as an additional logprobs output.

Note

top_logprobs is only available for raw text generation. It is not supported with structured generation in PromptExecutionStep, and support depends on the selected provider and model.

For direct LlmModel calls, configure top_logprobs on the prompt and inspect the TextContent chunk:

from wayflowcore.messagelist import Message, TextContent
from wayflowcore.models import Prompt

prompt = Prompt(
    messages=[Message(content="Say 'Bern' and nothing else.")],
    generation_config=LlmGenerationConfig(top_logprobs=2, max_tokens=16),
)
completion = llm.generate(prompt)
text_chunk = next(chunk for chunk in completion.message.contents if isinstance(chunk, TextContent))

print(text_chunk.content)
print(text_chunk.logprobs)
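
Log probabilities are easier to interpret once converted back to probabilities. The sketch below uses hand-written (token, log probability) pairs; this shape and these values are assumptions for illustration, and the actual structure of text_chunk.logprobs may differ.

```python
import math

# Hypothetical per-token log probabilities for a two-token completion.
token_logprobs = [("Bern", -0.05), (".", -1.2)]

# exp(logprob) recovers the probability the model assigned to each token.
token_probs = [(token, math.exp(logprob)) for token, logprob in token_logprobs]

# Summing log probabilities is equivalent to multiplying probabilities.
sequence_logprob = sum(logprob for _, logprob in token_logprobs)
sequence_prob = math.exp(sequence_logprob)

for token, prob in token_probs:
    print(f"{token!r}: {prob:.3f}")
print(f"sequence: {sequence_prob:.3f}")
```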

For flows, you can request logprobs directly on PromptExecutionStep. When enabled, the step appends a logprobs output alongside the normal text output:

from wayflowcore.executors.executionstatus import FinishedStatus

logprob_start_step = StartStep(
    name="logprob_start_step",
    input_descriptors=[StringProperty("user_question")],
)
logprob_step = PromptExecutionStep(
    name="PromptExecutionWithLogprobs",
    prompt_template="{{user_question}}",
    llm=llm,
    top_logprobs=2,
)
logprob_flow = Flow(
    begin_step=logprob_start_step,
    control_flow_edges=[
        ControlFlowEdge(source_step=logprob_start_step, destination_step=logprob_step),
        ControlFlowEdge(source_step=logprob_step, destination_step=None),
    ],
    data_flow_edges=[
        DataFlowEdge(logprob_start_step, "user_question", logprob_step, "user_question")
    ],
)
conversation = logprob_flow.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
status = conversation.execute()
if isinstance(status, FinishedStatus):
    print(status.output_values[PromptExecutionStep.OUTPUT])
    print(status.output_values[PromptExecutionStep.LOGPROBS])
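
With top_logprobs=2, each position can carry an alternate candidate, which makes it possible to spot positions where the model hesitated. The sketch below assumes one dictionary of candidate token to log probability per position; this shape, the helper `confidence_margin`, and the 0.2 threshold are hypothetical and not the format of the logprobs output.

```python
import math

# Hypothetical candidates per generated position: token -> log probability.
positions = [
    {"Bern": -0.1, "Zurich": -2.5},
    {".": -0.7, "!": -0.9},
]


def confidence_margin(candidates):
    """Probability gap between the best and the second-best candidate."""
    ranked = sorted((math.exp(lp) for lp in candidates.values()), reverse=True)
    return ranked[0] - ranked[1]


margins = [confidence_margin(candidates) for candidates in positions]
# Positions with a small gap are those where the model was close to
# picking a different token.
uncertain = [index for index, margin in enumerate(margins) if margin < 0.2]
print(margins, uncertain)
```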

Apply the generation configuration from a dictionary#

If you have a generation configuration in a dictionary (for example, from a JSON or YAML file), you can instantiate the LlmGenerationConfig class as follows:

config_dict = {
    "max_tokens": 512,
    "temperature": 0.9,
}

config = LlmGenerationConfig.from_dict(config_dict)
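
When the dictionary comes from a file, it can help to reject misspelled keys before building the configuration. The sketch below parses a JSON string with the standard library and checks the keys against the parameter names listed in this guide; the helper `load_generation_settings` is hypothetical, and the set of keys WayFlow accepts may be larger.

```python
import json

# Parameter names taken from this guide; WayFlow may accept more.
KNOWN_KEYS = {
    "max_tokens",
    "temperature",
    "top_p",
    "stop",
    "frequency_penalty",
    "top_logprobs",
    "extra_args",
}


def load_generation_settings(raw_json):
    """Parse a JSON object and reject keys that are not in KNOWN_KEYS."""
    settings = json.loads(raw_json)
    if not isinstance(settings, dict):
        raise ValueError("Expected a JSON object")
    unknown = sorted(set(settings) - KNOWN_KEYS)
    if unknown:
        raise ValueError(f"Unknown generation parameters: {unknown}")
    return settings


config_dict = load_generation_settings('{"max_tokens": 512, "temperature": 0.9}')
# config_dict can then be passed to LlmGenerationConfig.from_dict.
```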

Save a custom generation configuration#

If you would like to share a specific generation configuration, you can create an LlmGenerationConfig instance and convert it to a dictionary.

config = LlmGenerationConfig(max_tokens=1024, temperature=0.8, top_p=0.6)
config_dict = config.to_dict()
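
Once the configuration is a dictionary, it can be written to disk with the standard library. The sketch below is a minimal round trip through a JSON file; the literal dictionary stands in for the result of config.to_dict(), whose exact keys may differ, and the file name is hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for `config.to_dict()`; the keys WayFlow emits may differ.
config_dict = {"max_tokens": 1024, "temperature": 0.8, "top_p": 0.6}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "generation_config.json"
    # Write the configuration, then read it back to check the round trip.
    path.write_text(json.dumps(config_dict, indent=2), encoding="utf-8")
    restored = json.loads(path.read_text(encoding="utf-8"))

print(restored)
```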

Agent Spec Exporting/Loading#

You can export an assistant to its Agent Spec configuration using the AgentSpecExporter. The following example serializes the flow defined above to YAML.

from wayflowcore.agentspec import AgentSpecExporter

serialized_assistant = AgentSpecExporter().to_yaml(flow)

The Agent Spec representation of the flow is shown below, first as JSON and then as YAML.

{
  "component_type": "Flow",
  "id": "fc3d10f4-5ee2-40d8-a580-0db6c44b0b39",
  "name": "flow_0e4b989a",
  "description": "",
  "metadata": {
    "__metadata_info__": {}
  },
  "inputs": [
    {
      "type": "string",
      "title": "user_question"
    }
  ],
  "outputs": [
    {
      "description": "the generated text",
      "type": "string",
      "title": "output"
    }
  ],
  "start_node": {
    "$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
  },
  "nodes": [
    {
      "$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
    },
    {
      "$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
    },
    {
      "$component_ref": "158c838b-f8be-41ef-8b66-64348c8d379c"
    }
  ],
  "control_flow_connections": [
    {
      "component_type": "ControlFlowEdge",
      "id": "7b8c0c0b-fcd1-4bf3-96be-dcf726ab1526",
      "name": "start_step_to_PromptExecution_control_flow_edge",
      "description": null,
      "metadata": {
        "__metadata_info__": {}
      },
      "from_node": {
        "$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
      },
      "from_branch": null,
      "to_node": {
        "$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
      }
    },
    {
      "component_type": "ControlFlowEdge",
      "id": "6b2c2840-126c-43fe-a8e1-f3cb08e8ae88",
      "name": "PromptExecution_to_None End node_control_flow_edge",
      "description": null,
      "metadata": {},
      "from_node": {
        "$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
      },
      "from_branch": null,
      "to_node": {
        "$component_ref": "158c838b-f8be-41ef-8b66-64348c8d379c"
      }
    }
  ],
  "data_flow_connections": [
    {
      "component_type": "DataFlowEdge",
      "id": "86ba0435-be9b-46b0-97ae-64e145045e19",
      "name": "start_step_user_question_to_PromptExecution_user_question_data_flow_edge",
      "description": null,
      "metadata": {
        "__metadata_info__": {}
      },
      "source_node": {
        "$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
      },
      "source_output": "user_question",
      "destination_node": {
        "$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
      },
      "destination_input": "user_question"
    },
    {
      "component_type": "DataFlowEdge",
      "id": "8638e234-9a23-45c5-89d4-296fc5a8c5ac",
      "name": "PromptExecution_output_to_None End node_output_data_flow_edge",
      "description": null,
      "metadata": {},
      "source_node": {
        "$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
      },
      "source_output": "output",
      "destination_node": {
        "$component_ref": "158c838b-f8be-41ef-8b66-64348c8d379c"
      },
      "destination_input": "output"
    }
  ],
  "$referenced_components": {
    "25917cac-52d4-4816-8c62-c18d8b70ee33": {
      "component_type": "LlmNode",
      "id": "25917cac-52d4-4816-8c62-c18d8b70ee33",
      "name": "PromptExecution",
      "description": "",
      "metadata": {
        "__metadata_info__": {}
      },
      "inputs": [
        {
          "description": "\"user_question\" input variable for the template",
          "type": "string",
          "title": "user_question"
        }
      ],
      "outputs": [
        {
          "description": "the generated text",
          "type": "string",
          "title": "output"
        }
      ],
      "branches": [
        "next"
      ],
      "llm_config": {
        "component_type": "VllmConfig",
        "id": "93d098ef-9643-4d38-a012-8903bacbb784",
        "name": "LLAMA_MODEL_ID",
        "description": null,
        "metadata": {
          "__metadata_info__": {}
        },
        "default_generation_parameters": null,
        "url": "LLAMA_API_URL",
        "model_id": "LLAMA_MODEL_ID"
      },
      "prompt_template": "{{user_question}}"
    },
    "d8870848-f3c1-4a88-a0f3-b6ca20c61bab": {
      "component_type": "StartNode",
      "id": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab",
      "name": "start_step",
      "description": "",
      "metadata": {
        "__metadata_info__": {}
      },
      "inputs": [
        {
          "type": "string",
          "title": "user_question"
        }
      ],
      "outputs": [
        {
          "type": "string",
          "title": "user_question"
        }
      ],
      "branches": [
        "next"
      ]
    },
    "158c838b-f8be-41ef-8b66-64348c8d379c": {
      "component_type": "EndNode",
      "id": "158c838b-f8be-41ef-8b66-64348c8d379c",
      "name": "None End node",
      "description": "End node representing all transitions to None in the WayFlow flow",
      "metadata": {},
      "inputs": [
        {
          "description": "the generated text",
          "type": "string",
          "title": "output"
        }
      ],
      "outputs": [
        {
          "description": "the generated text",
          "type": "string",
          "title": "output"
        }
      ],
      "branches": [],
      "branch_name": "next"
    }
  },
  "agentspec_version": "25.4.1"
}

component_type: Flow
id: fc3d10f4-5ee2-40d8-a580-0db6c44b0b39
name: flow_0e4b989a
description: ''
metadata:
  __metadata_info__: {}
inputs:
- type: string
  title: user_question
outputs:
- description: the generated text
  type: string
  title: output
start_node:
  $component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
nodes:
- $component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
- $component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
- $component_ref: 158c838b-f8be-41ef-8b66-64348c8d379c
control_flow_connections:
- component_type: ControlFlowEdge
  id: 7b8c0c0b-fcd1-4bf3-96be-dcf726ab1526
  name: start_step_to_PromptExecution_control_flow_edge
  description: null
  metadata:
    __metadata_info__: {}
  from_node:
    $component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
  from_branch: null
  to_node:
    $component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
- component_type: ControlFlowEdge
  id: 6b2c2840-126c-43fe-a8e1-f3cb08e8ae88
  name: PromptExecution_to_None End node_control_flow_edge
  description: null
  metadata: {}
  from_node:
    $component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
  from_branch: null
  to_node:
    $component_ref: 158c838b-f8be-41ef-8b66-64348c8d379c
data_flow_connections:
- component_type: DataFlowEdge
  id: 86ba0435-be9b-46b0-97ae-64e145045e19
  name: start_step_user_question_to_PromptExecution_user_question_data_flow_edge
  description: null
  metadata:
    __metadata_info__: {}
  source_node:
    $component_ref: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
  source_output: user_question
  destination_node:
    $component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
  destination_input: user_question
- component_type: DataFlowEdge
  id: 8638e234-9a23-45c5-89d4-296fc5a8c5ac
  name: PromptExecution_output_to_None End node_output_data_flow_edge
  description: null
  metadata: {}
  source_node:
    $component_ref: 25917cac-52d4-4816-8c62-c18d8b70ee33
  source_output: output
  destination_node:
    $component_ref: 158c838b-f8be-41ef-8b66-64348c8d379c
  destination_input: output
$referenced_components:
  25917cac-52d4-4816-8c62-c18d8b70ee33:
    component_type: LlmNode
    id: 25917cac-52d4-4816-8c62-c18d8b70ee33
    name: PromptExecution
    description: ''
    metadata:
      __metadata_info__: {}
    inputs:
    - description: '"user_question" input variable for the template'
      type: string
      title: user_question
    outputs:
    - description: the generated text
      type: string
      title: output
    branches:
    - next
    llm_config:
      component_type: VllmConfig
      id: 93d098ef-9643-4d38-a012-8903bacbb784
      name: LLAMA_MODEL_ID
      description: null
      metadata:
        __metadata_info__: {}
      default_generation_parameters: null
      url: LLAMA_API_URL
      model_id: LLAMA_MODEL_ID
    prompt_template: '{{user_question}}'
  d8870848-f3c1-4a88-a0f3-b6ca20c61bab:
    component_type: StartNode
    id: d8870848-f3c1-4a88-a0f3-b6ca20c61bab
    name: start_step
    description: ''
    metadata:
      __metadata_info__: {}
    inputs:
    - type: string
      title: user_question
    outputs:
    - type: string
      title: user_question
    branches:
    - next
  158c838b-f8be-41ef-8b66-64348c8d379c:
    component_type: EndNode
    id: 158c838b-f8be-41ef-8b66-64348c8d379c
    name: None End node
    description: End node representing all transitions to None in the WayFlow flow
    metadata: {}
    inputs:
    - description: the generated text
      type: string
      title: output
    outputs:
    - description: the generated text
      type: string
      title: output
    branches: []
    branch_name: next
agentspec_version: 25.4.1

You can then load the configuration back to an assistant using the AgentSpecLoader.

from wayflowcore.agentspec import AgentSpecLoader

assistant = AgentSpecLoader().load_yaml(serialized_assistant)

Next steps#

Having learned how to specify the generation configuration, you may now proceed to:

Some additional resources we recommend:

Full code#

Click on the card at the top of this page to download the full code for this guide or copy the code below.

# Copyright © 2025 Oracle and/or its affiliates.
#
# This software is under the Apache License 2.0
# (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0) or Universal Permissive License
# (UPL) 1.0 (LICENSE-UPL or https://oss.oracle.com/licenses/upl), at your option.

# %%[markdown]
# Code Example - How to Specify the Generation Configuration when Using LLMs
# --------------------------------------------------------------------------

# How to use:
# Create a new Python virtual environment and install the latest WayFlow version.
# ```bash
# python -m venv venv-wayflowcore
# source venv-wayflowcore/bin/activate
# pip install --upgrade pip
# pip install "wayflowcore==26.1.2"
# ```

# You can now run the script
# 1. As a Python file:
# ```bash
# python example_generationconfig.py
# ```
# 2. As a Notebook (in VSCode):
# When viewing the file,
#  - press the keys Ctrl + Enter to run the selected cell
#  - or Shift + Enter to run the selected cell and move to the cell below




# %%[markdown]
## Imports

# %%
from wayflowcore.agent import Agent
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig


# %%[markdown]
## Define the llm generation configuration

# %%
generation_config = LlmGenerationConfig(
    max_tokens=512,
    temperature=1.0,
    top_p=1.0,
    stop=["exit", "end"],
    frequency_penalty=0,
    extra_args={"seed": 1},
)


# %%[markdown]
## Define the vLLM

# %%
from wayflowcore.models import VllmModel

llm = VllmModel(
    model_id="LLAMA_MODEL_ID",
    host_port="LLAMA_API_URL",
    generation_config=generation_config,
)
# NOTE: host_port should be a string with the IP address/domain name and the port. An example string: "192.168.1.1:8000"
# NOTE: model_id usually indicates the HuggingFace model id,
# e.g. meta-llama/Llama-3.1-8B-Instruct from https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

# %%[markdown]
## Build the agent and run it

# %%
agent = Agent(llm=llm)
conversation = agent.start_conversation()
conversation.append_user_message("What is the capital of Switzerland?")
conversation.execute()
print(conversation.get_last_message())


# %%[markdown]
## Request logprobs from a direct llm call

# %%
from wayflowcore.messagelist import Message, TextContent
from wayflowcore.models import Prompt

prompt = Prompt(
    messages=[Message(content="Say 'Bern' and nothing else.")],
    generation_config=LlmGenerationConfig(top_logprobs=2, max_tokens=16),
)
completion = llm.generate(prompt)
text_chunk = next(chunk for chunk in completion.message.contents if isinstance(chunk, TextContent))

print(text_chunk.content)
print(text_chunk.logprobs)


# %%[markdown]
## Import what is needed to build a flow

# %%
from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.dataconnection import DataFlowEdge
from wayflowcore.flow import Flow
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig
from wayflowcore.property import StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep


# %%[markdown]
## Build the flow using custom generation parameters

# %%
start_step = StartStep(name="start_step", input_descriptors=[StringProperty("user_question")])
prompt_step = PromptExecutionStep(
    name="PromptExecution",
    prompt_template="{{user_question}}",
    llm=llm,
    generation_config=LlmGenerationConfig(temperature=0.8),
)
flow = Flow(
    begin_step=start_step,
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=prompt_step),
        ControlFlowEdge(source_step=prompt_step, destination_step=None),
    ],
    data_flow_edges=[DataFlowEdge(start_step, "user_question", prompt_step, "user_question")],
)
conversation = flow.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
conversation.execute()


# %%[markdown]
## Request logprobs from a flow step

# %%
from wayflowcore.executors.executionstatus import FinishedStatus

logprob_start_step = StartStep(
    name="logprob_start_step",
    input_descriptors=[StringProperty("user_question")],
)
logprob_step = PromptExecutionStep(
    name="PromptExecutionWithLogprobs",
    prompt_template="{{user_question}}",
    llm=llm,
    top_logprobs=2,
)
logprob_flow = Flow(
    begin_step=logprob_start_step,
    control_flow_edges=[
        ControlFlowEdge(source_step=logprob_start_step, destination_step=logprob_step),
        ControlFlowEdge(source_step=logprob_step, destination_step=None),
    ],
    data_flow_edges=[
        DataFlowEdge(logprob_start_step, "user_question", logprob_step, "user_question")
    ],
)
conversation = logprob_flow.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
status = conversation.execute()
if isinstance(status, FinishedStatus):
    print(status.output_values[PromptExecutionStep.OUTPUT])
    print(status.output_values[PromptExecutionStep.LOGPROBS])


# %%[markdown]
## Export config to Agent Spec

# %%
from wayflowcore.agentspec import AgentSpecExporter

serialized_assistant = AgentSpecExporter().to_yaml(flow)


# %%[markdown]
## Load Agent Spec config

# %%
from wayflowcore.agentspec import AgentSpecLoader

assistant = AgentSpecLoader().load_yaml(serialized_assistant)


# %%[markdown]
## Build the generation configuration from dictionary

# %%
config_dict = {
    "max_tokens": 512,
    "temperature": 0.9,
}

config = LlmGenerationConfig.from_dict(config_dict)


# %%[markdown]
## Export a generation configuration to dictionary

# %%

config = LlmGenerationConfig(max_tokens=1024, temperature=0.8, top_p=0.6)
config_dict = config.to_dict()