How to Specify the Generation Configuration when Using LLMs#

python-icon Download Python Script

Python script/notebook for this guide.

Generation Configuration how-to script

Generation parameters, such as temperature, top-p, and the maximum number of output tokens, are important for achieving the desired performance with Large Language Models (LLMs). In WayFlow, these parameters can be configured with the LlmGenerationConfig class.

This guide will show you how to:

  • Configure the generation parameters for an agent.

  • Configure the generation parameters for a flow.

  • Apply the generation configuration from a dictionary.

  • Save a custom generation configuration.

Note

For a deeper understanding of the impact of each generation parameter, refer to the resources at the bottom of this page.

Basic implementation#

Configure the generation parameters for an agent#

Customizing the generation configuration for an agent requires the use of the following wayflowcore components.

from wayflowcore.agent import Agent
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig

The generation configuration can be specified when initializing the LLM using the LlmGenerationConfig class. This ensures that all outputs generated by the agent are produced with the same generation settings.

The LlmGenerationConfig class accepts the following arguments:

  • max_tokens: the maximum number of tokens to generate, ignoring the number of tokens in the prompt;

  • temperature: controls the randomness of the output by scaling the token probabilities; lower values make the output more deterministic, higher values more varied;

  • top_p: restricts sampling to the smallest set of tokens whose cumulative probability exceeds top_p (nucleus sampling);

  • stop: defines a list of stop sequences at which the LLM stops generating;

  • frequency_penalty: penalizes tokens according to how often they have already appeared, reducing repetition.

Additionally, LlmGenerationConfig offers the possibility to set a dictionary of arbitrary parameters, called extra_args, that is sent as part of the LLM generation call. This allows you to specify provider-specific parameters that might not be common to all providers.

Note

The extra parameters should never include sensitive information.

generation_config = LlmGenerationConfig(
    max_tokens=512,
    temperature=1.0,
    top_p=1.0,
    stop=["exit", "end"],
    frequency_penalty=0,
    extra_args={"seed": 1},
)

WayFlow supports several LLM API providers, and you can pass the generation_config to any of them. The following example uses an OCI Generative AI model:

from wayflowcore.models import OCIGenAIModel

llm = OCIGenAIModel(
    model_id="provider.model-id",
    service_endpoint="https://url-to-service-endpoint.com",
    compartment_id="compartment-id",
    auth_type="API_KEY",
    generation_config=generation_config,
)

Important

API keys should not be stored anywhere in the code. Use environment variables and/or tools such as python-dotenv.
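Other providers accept the same parameter. For instance, with a vLLM-hosted model (the provider used in the full code at the end of this guide), the configuration is passed in the same way; the model_id and host_port values below are placeholders:

from wayflowcore.models import VllmModel

llm = VllmModel(
    model_id="LLAMA_MODEL_ID",  # usually a HuggingFace model id, e.g. meta-llama/Llama-3.1-8B-Instruct
    host_port="LLAMA_API_URL",  # IP address/domain name and port, e.g. "192.168.1.1:8000"
    generation_config=generation_config,
)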

Now, you can build an agent using the LLM as follows:

agent = Agent(llm=llm)
conversation = agent.start_conversation()
conversation.append_user_message("What is the capital of Switzerland?")
conversation.execute()
print(conversation.get_last_message())

Configure the generation parameters for a flow#

Customizing the generation configuration for a flow requires the use of the following wayflowcore components.

from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.dataconnection import DataFlowEdge
from wayflowcore.flow import Flow
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig
from wayflowcore.property import StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep

Refer to the previous section to learn how to configure the generation parameters when initializing an LLM using the LlmGenerationConfig class.

You can then create a one-step flow using a PromptExecutionStep.

start_step = StartStep(name="start_step", input_descriptors=[StringProperty("user_question")])
prompt_step = PromptExecutionStep(
    name="PromptExecution",
    prompt_template="{{user_question}}",
    llm=llm,
    generation_config=LlmGenerationConfig(temperature=0.8),
)
flow = Flow(
    begin_step=start_step,
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=prompt_step),
        ControlFlowEdge(source_step=prompt_step, destination_step=None),
    ],
    data_flow_edges=[DataFlowEdge(start_step, "user_question", prompt_step, "user_question")],
)
conversation = flow.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
conversation.execute()

Important

The generation_config parameter passed to the PromptExecutionStep overrides the LLM’s original generation configuration.
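Because the override is applied per step, different steps in the same flow can use different sampling settings. Below is a minimal sketch (the Extraction step is hypothetical and not wired into the flow above) pairing the step above with a more deterministic one:

# Hypothetical step, for illustration only: a low temperature makes the
# output close to deterministic, which suits extraction-style tasks.
extraction_step = PromptExecutionStep(
    name="Extraction",
    prompt_template="Extract the city name from: {{user_question}}",
    llm=llm,
    generation_config=LlmGenerationConfig(temperature=0.1),
)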

Advanced usage#

The LlmGenerationConfig class is a serializable object. It can be instantiated from a dictionary or saved to one, as you will see below.

Apply the generation configuration from a dictionary#

If you have a generation configuration in a dictionary (for example, from a JSON or YAML file), you can instantiate the LlmGenerationConfig class as follows:

config_dict = {
    "max_tokens": 512,
    "temperature": 0.9,
}

config = LlmGenerationConfig.from_dict(config_dict)
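For example, you can read the dictionary from a JSON file with the standard library (the file name below is hypothetical):

import json

# Load a generation configuration previously saved as JSON
with open("generation_config.json") as f:
    config = LlmGenerationConfig.from_dict(json.load(f))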

Save a custom generation configuration#

If you would like to share a specific generation configuration, you can create an LlmGenerationConfig instance and export it to a dictionary.


config = LlmGenerationConfig(max_tokens=1024, temperature=0.8, top_p=0.6)
config_dict = config.to_dict()
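The resulting dictionary can then be written to a file; a minimal sketch with the standard library (the file name is hypothetical):

import json

# Persist the configuration so it can be shared or versioned
with open("generation_config.json", "w") as f:
    json.dump(config.to_dict(), f, indent=2)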

Agent Spec Exporting/Loading#

You can export the assistant to its Agent Spec configuration using the AgentSpecExporter. The following example exports the serialization of the flow defined above.

from wayflowcore.agentspec import AgentSpecExporter

serialized_assistant = AgentSpecExporter().to_yaml(flow)

Here is what the Agent Spec representation will look like:

{
  "component_type": "Flow",
  "id": "fc3d10f4-5ee2-40d8-a580-0db6c44b0b39",
  "name": "flow_0e4b989a",
  "description": "",
  "metadata": {
    "__metadata_info__": {}
  },
  "inputs": [
    {
      "type": "string",
      "title": "user_question"
    }
  ],
  "outputs": [
    {
      "description": "the generated text",
      "type": "string",
      "title": "output"
    }
  ],
  "start_node": {
    "$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
  },
  "nodes": [
    {
      "$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
    },
    {
      "$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
    },
    {
      "$component_ref": "158c838b-f8be-41ef-8b66-64348c8d379c"
    }
  ],
  "control_flow_connections": [
    {
      "component_type": "ControlFlowEdge",
      "id": "7b8c0c0b-fcd1-4bf3-96be-dcf726ab1526",
      "name": "start_step_to_PromptExecution_control_flow_edge",
      "description": null,
      "metadata": {
        "__metadata_info__": {}
      },
      "from_node": {
        "$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
      },
      "from_branch": null,
      "to_node": {
        "$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
      }
    },
    {
      "component_type": "ControlFlowEdge",
      "id": "6b2c2840-126c-43fe-a8e1-f3cb08e8ae88",
      "name": "PromptExecution_to_None End node_control_flow_edge",
      "description": null,
      "metadata": {},
      "from_node": {
        "$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
      },
      "from_branch": null,
      "to_node": {
        "$component_ref": "158c838b-f8be-41ef-8b66-64348c8d379c"
      }
    }
  ],
  "data_flow_connections": [
    {
      "component_type": "DataFlowEdge",
      "id": "86ba0435-be9b-46b0-97ae-64e145045e19",
      "name": "start_step_user_question_to_PromptExecution_user_question_data_flow_edge",
      "description": null,
      "metadata": {
        "__metadata_info__": {}
      },
      "source_node": {
        "$component_ref": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab"
      },
      "source_output": "user_question",
      "destination_node": {
        "$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
      },
      "destination_input": "user_question"
    },
    {
      "component_type": "DataFlowEdge",
      "id": "8638e234-9a23-45c5-89d4-296fc5a8c5ac",
      "name": "PromptExecution_output_to_None End node_output_data_flow_edge",
      "description": null,
      "metadata": {},
      "source_node": {
        "$component_ref": "25917cac-52d4-4816-8c62-c18d8b70ee33"
      },
      "source_output": "output",
      "destination_node": {
        "$component_ref": "158c838b-f8be-41ef-8b66-64348c8d379c"
      },
      "destination_input": "output"
    }
  ],
  "$referenced_components": {
    "25917cac-52d4-4816-8c62-c18d8b70ee33": {
      "component_type": "LlmNode",
      "id": "25917cac-52d4-4816-8c62-c18d8b70ee33",
      "name": "PromptExecution",
      "description": "",
      "metadata": {
        "__metadata_info__": {}
      },
      "inputs": [
        {
          "description": "\"user_question\" input variable for the template",
          "type": "string",
          "title": "user_question"
        }
      ],
      "outputs": [
        {
          "description": "the generated text",
          "type": "string",
          "title": "output"
        }
      ],
      "branches": [
        "next"
      ],
      "llm_config": {
        "component_type": "VllmConfig",
        "id": "93d098ef-9643-4d38-a012-8903bacbb784",
        "name": "LLAMA_MODEL_ID",
        "description": null,
        "metadata": {
          "__metadata_info__": {}
        },
        "default_generation_parameters": null,
        "url": "LLAMA_API_URL",
        "model_id": "LLAMA_MODEL_ID"
      },
      "prompt_template": "{{user_question}}"
    },
    "d8870848-f3c1-4a88-a0f3-b6ca20c61bab": {
      "component_type": "StartNode",
      "id": "d8870848-f3c1-4a88-a0f3-b6ca20c61bab",
      "name": "start_step",
      "description": "",
      "metadata": {
        "__metadata_info__": {}
      },
      "inputs": [
        {
          "type": "string",
          "title": "user_question"
        }
      ],
      "outputs": [
        {
          "type": "string",
          "title": "user_question"
        }
      ],
      "branches": [
        "next"
      ]
    },
    "158c838b-f8be-41ef-8b66-64348c8d379c": {
      "component_type": "EndNode",
      "id": "158c838b-f8be-41ef-8b66-64348c8d379c",
      "name": "None End node",
      "description": "End node representing all transitions to None in the WayFlow flow",
      "metadata": {},
      "inputs": [
        {
          "description": "the generated text",
          "type": "string",
          "title": "output"
        }
      ],
      "outputs": [
        {
          "description": "the generated text",
          "type": "string",
          "title": "output"
        }
      ],
      "branches": [],
      "branch_name": "next"
    }
  },
  "agentspec_version": "25.4.1"
}

You can then load the configuration back to an assistant using the AgentSpecLoader.

from wayflowcore.agentspec import AgentSpecLoader

assistant = AgentSpecLoader().load_yaml(serialized_assistant)
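Assuming the loaded assistant behaves like the original flow, you can run it the same way:

conversation = assistant.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
conversation.execute()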

Next steps#

Having learned how to specify the generation configuration, you may now proceed to:

Some additional resources we recommend:

Full code#

Click on the card at the top of this page to download the full code for this guide or copy the code below.

# Copyright © 2025 Oracle and/or its affiliates.
#
# This software is under the Universal Permissive License
# (UPL) 1.0 (LICENSE-UPL or https://oss.oracle.com/licenses/upl) or Apache License
# 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0), at your option.

# %%[markdown]
# Code Example - How to Specify the Generation Configuration when Using LLMs
# --------------------------------------------------------------------------

# How to use:
# Create a new Python virtual environment and install the latest WayFlow version.
# ```bash
# python -m venv venv-wayflowcore
# source venv-wayflowcore/bin/activate
# pip install --upgrade pip
# pip install "wayflowcore==26.1"
# ```

# You can now run the script
# 1. As a Python file:
# ```bash
# python example_generationconfig.py
# ```
# 2. As a Notebook (in VSCode):
# When viewing the file,
#  - press the keys Ctrl + Enter to run the selected cell
#  - or Shift + Enter to run the selected cell and move to the cell below


# %%[markdown]
## Imports

# %%
from wayflowcore.agent import Agent
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig


# %%[markdown]
## Define the llm generation configuration

# %%
generation_config = LlmGenerationConfig(
    max_tokens=512,
    temperature=1.0,
    top_p=1.0,
    stop=["exit", "end"],
    frequency_penalty=0,
    extra_args={"seed": 1},
)


# %%[markdown]
## Define the vLLM

# %%
from wayflowcore.models import VllmModel

llm = VllmModel(
    model_id="LLAMA_MODEL_ID",
    host_port="LLAMA_API_URL",
    generation_config=generation_config,
)
# NOTE: host_port should be a string with the IP address/domain name and the port. An example string: "192.168.1.1:8000"
# NOTE: model_id usually indicates the HuggingFace model id,
# e.g. meta-llama/Llama-3.1-8B-Instruct from https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

# %%[markdown]
## Build the agent and run it

# %%
agent = Agent(llm=llm)
conversation = agent.start_conversation()
conversation.append_user_message("What is the capital of Switzerland?")
conversation.execute()
print(conversation.get_last_message())


# %%[markdown]
## Import what is needed to build a flow

# %%
from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.dataconnection import DataFlowEdge
from wayflowcore.flow import Flow
from wayflowcore.models.llmgenerationconfig import LlmGenerationConfig
from wayflowcore.property import StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep


# %%[markdown]
## Build the flow using custom generation parameters

# %%
start_step = StartStep(name="start_step", input_descriptors=[StringProperty("user_question")])
prompt_step = PromptExecutionStep(
    name="PromptExecution",
    prompt_template="{{user_question}}",
    llm=llm,
    generation_config=LlmGenerationConfig(temperature=0.8),
)
flow = Flow(
    begin_step=start_step,
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=prompt_step),
        ControlFlowEdge(source_step=prompt_step, destination_step=None),
    ],
    data_flow_edges=[DataFlowEdge(start_step, "user_question", prompt_step, "user_question")],
)
conversation = flow.start_conversation(
    inputs={"user_question": "What is the capital of Switzerland?"}
)
conversation.execute()


# %%[markdown]
## Export config to Agent Spec

# %%
from wayflowcore.agentspec import AgentSpecExporter

serialized_assistant = AgentSpecExporter().to_yaml(flow)


# %%[markdown]
## Load Agent Spec config

# %%
from wayflowcore.agentspec import AgentSpecLoader

assistant = AgentSpecLoader().load_yaml(serialized_assistant)


# %%[markdown]
## Build the generation configuration from dictionary

# %%
config_dict = {
    "max_tokens": 512,
    "temperature": 0.9,
}

config = LlmGenerationConfig.from_dict(config_dict)


# %%[markdown]
## Export a generation configuration to dictionary

# %%
config = LlmGenerationConfig(max_tokens=1024, temperature=0.8, top_p=0.6)
config_dict = config.to_dict()