How to Do Structured LLM Generation in Flows

Prerequisites

This guide assumes familiarity with Flows.

WayFlow lets you use LLMs to generate both free text and structured outputs. This guide shows you how to:

use the PromptExecutionStep to generate text using an LLM
use the PromptExecutionStep to generate structured outputs
use the AgentExecutionStep to generate structured outputs using an agent

Basic implementation

This section starts with plain text generation in a flow, which the structured generation examples later in the guide build on.

WayFlow supports several LLM API providers. Select one of the options below (OCI Generative AI, vLLM, or Ollama) to create the llm object used in the rest of this guide:

from wayflowcore.models import OCIGenAIModel

if __name__ == "__main__":

    llm = OCIGenAIModel(
        model_id="provider.model-id",
        service_endpoint="https://url-to-service-endpoint.com",
        compartment_id="compartment-id",
        auth_type="API_KEY",
    )

from wayflowcore.models import VllmModel

llm = VllmModel(
    model_id="model-id",
    host_port="VLLM_HOST_PORT",
)

from wayflowcore.models import OllamaModel

llm = OllamaModel(
    model_id="model-id",
)

Assume you want to summarize the following article:

article = """Sea turtles are ancient reptiles that have been around for over 100 million years. They play crucial roles in marine ecosystems, such as maintaining healthy seagrass beds and coral reefs. Unfortunately, they are under threat due to poaching, habitat loss, and pollution. Conservation efforts worldwide aim to protect nesting sites and reduce bycatch in fishing gear."""

WayFlow offers the PromptExecutionStep for this type of query. Use the code below to generate a 10-word summary:

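from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.dataconnection import DataFlowEdge
from wayflowcore.flow import Flow
from wayflowcore.property import StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep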

start_step = StartStep(input_descriptors=[StringProperty("article")])
summarize_step = PromptExecutionStep(
    llm=llm,
    prompt_template="""Summarize this article in 10 words:\n {{article}}""",
    output_mapping={PromptExecutionStep.OUTPUT: "summary"},
)
summarize_step_name = "summarize_step"
flow = Flow(
    begin_step_name="start_step",
    steps={
        "start_step": start_step,
        summarize_step_name: summarize_step,
    },
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=summarize_step),
        ControlFlowEdge(source_step=summarize_step, destination_step=None),
    ],
    data_flow_edges=[
        DataFlowEdge(start_step, "article", summarize_step, "article"),
    ],
)

Note

In the prompt, {{article}} is Jinja2 syntax for a placeholder variable, and each placeholder becomes an input of the step. If you use the plain form {{var_name}}, the variable named var_name is of type StringProperty. If you use any other Jinja2 construct (for loops, filters, and so on), the variable is of type AnyProperty.
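To make the distinction concrete, here is a minimal, regex-based sketch of how plain {{var_name}} placeholders could be told apart from richer Jinja2 expressions. It is an illustration only, not WayFlow's implementation, and the helper name simple_placeholders is hypothetical.

```python
import re
from typing import List

# Matches only the simplest placeholder form: {{ name }} with nothing else inside.
SIMPLE_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def simple_placeholders(template: str) -> List[str]:
    """Return the names of plain {{name}} placeholders, in order of first appearance."""
    seen: List[str] = []
    for name in SIMPLE_PLACEHOLDER.findall(template):
        if name not in seen:
            seen.append(name)
    return seen


template = "Summarize this article in 10 words:\n {{article}}"
print(simple_placeholders(template))  # ['article']

# A filter expression is not a plain placeholder, so it is not reported here.
print(simple_placeholders("{{ article | upper }}"))  # []
```

Anything the sketch does not report (filters, loops, attribute access) corresponds to the "anything else" case described in the note.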

Now execute the flow:

conversation = flow.start_conversation(inputs={"article": article})
status = conversation.execute()
print(status.output_values["summary"])
# Sea turtles face threats from poaching, habitat loss, and pollution globally.

As expected, your flow has generated the article summary!
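Note that the prompt only asks for a 10-word summary; nothing forces the model to comply. A quick check in plain Python, using the sample output above as a stand-in for status.output_values["summary"], shows that this particular summary has 11 words:

```python
# Stand-in for status.output_values["summary"] (the sample output shown above).
summary = "Sea turtles face threats from poaching, habitat loss, and pollution globally."

# Count whitespace-separated words; punctuation stays attached to its word.
word_count = len(summary.split())
print(word_count)  # 11
```

If the exact length matters, validate it in a later step rather than relying on the prompt alone.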

Structured generation with Flows

In many cases, generating raw text within a flow is not very useful, as it is difficult to leverage in later steps. Instead, you might want to generate attributes that follow a particular schema. The PromptExecutionStep class enables this through the output_descriptors parameter.

from wayflowcore.property import ListProperty, StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep

animal_output = StringProperty(
    name="animal_name",
    description="name of the animal",
    default_value="",
)
danger_level_output = StringProperty(
    name="danger_level",
    description='level of danger of the animal. Can be "HIGH", "MEDIUM" or "LOW"',
    default_value="",
)
threats_output = ListProperty(
    name="threats",
    description="list of threats for the animal",
    item_type=StringProperty("threat"),
    default_value=[],
)


start_step = StartStep(input_descriptors=[StringProperty("article")])
summarize_step = PromptExecutionStep(
    llm=llm,
    prompt_template="""Extract from the following article the name of the animal, its danger level and the threats it's subject to. The article:\n\n {{article}}""",
    output_descriptors=[animal_output, danger_level_output, threats_output],
)
summarize_step_name = "summarize_step"
flow = Flow(
    begin_step_name="start_step",
    steps={
        "start_step": start_step,
        summarize_step_name: summarize_step,
    },
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=summarize_step),
        ControlFlowEdge(source_step=summarize_step, destination_step=None),
    ],
    data_flow_edges=[
        DataFlowEdge(start_step, "article", summarize_step, "article"),
    ],
)

conversation = flow.start_conversation(inputs={"article": article})
status = conversation.execute()
print(status.output_values)
# {'threats': ['poaching', 'habitat loss', 'pollution'], 'danger_level': 'HIGH', 'animal_name': 'Sea turtles'}
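Because each output is now a separate, typed value, later code can use it directly. Keep in mind that danger_level is declared as a StringProperty, so the three allowed values are only suggested by its description. The sketch below uses the sample output above as a stand-in for status.output_values and a hypothetical helper, normalize_danger_level, to check the value before using it:

```python
from typing import Optional

ALLOWED_DANGER_LEVELS = {"HIGH", "MEDIUM", "LOW"}


def normalize_danger_level(value: str) -> Optional[str]:
    """Return the upper-cased level if it is one of the allowed values, else None."""
    candidate = value.strip().upper()
    return candidate if candidate in ALLOWED_DANGER_LEVELS else None


# Stand-in for status.output_values (the sample output shown above).
output_values = {
    "threats": ["poaching", "habitat loss", "pollution"],
    "danger_level": "HIGH",
    "animal_name": "Sea turtles",
}

level = normalize_danger_level(output_values["danger_level"])
if level is None:
    print("Unexpected danger level:", output_values["danger_level"])
else:
    threat_count = len(output_values["threats"])
    print(f"{output_values['animal_name']}: {level}, {threat_count} threats")
# Sea turtles: HIGH, 3 threats
```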

Complex JSON objects

Sometimes, you might need to generate an object that follows a specific JSON Schema. You can do that by using an output descriptor of type ObjectProperty, or by converting your JSON Schema directly into a descriptor with Property.from_json_schema:

from wayflowcore.property import Property, StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep

animal_json_schema = {
    "title": "animal_object",
    "description": "information about the animal",
    "type": "object",
    "properties": {
        "animal_name": {
            "type": "string",
            "description": "name of the animal",
            "default": "",
        },
        "danger_level": {
            "type": "string",
            "description": 'level of danger of the animal. Can be "HIGH", "MEDIUM" or "LOW"',
            "default": "",
        },
        "threats": {
            "type": "array",
            "description": "list of threats for the animal",
            "items": {"type": "string"},
            "default": [],
        },
    },
}
animal_descriptor = Property.from_json_schema(animal_json_schema)

start_step = StartStep(input_descriptors=[StringProperty("article")])
summarize_step = PromptExecutionStep(
    llm=llm,
    prompt_template="""Extract from the following article the name of the animal, its danger level and the threats it's subject to. The article:\n\n {{article}}""",
    output_descriptors=[animal_descriptor],
)
summarize_step_name = "summarize_step"
flow = Flow(
    begin_step_name="start_step",
    steps={
        "start_step": start_step,
        summarize_step_name: summarize_step,
    },
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=summarize_step),
        ControlFlowEdge(source_step=summarize_step, destination_step=None),
    ],
    data_flow_edges=[
        DataFlowEdge(start_step, "article", summarize_step, "article"),
    ],
)

conversation = flow.start_conversation(inputs={"article": article})
status = conversation.execute()
print(status.output_values)
# {'animal_object': {'animal_name': 'Sea turtles', 'danger_level': 'MEDIUM', 'threats': ['Poaching', 'Habitat loss', 'Pollution']}}
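The generated object is a regular nested dictionary, so it can be checked against the schema before it is used. The following is a minimal sketch that covers only the string, array, and object types used in this guide. The helper conforms is hypothetical and is not a full JSON Schema validator; the schema is abbreviated, and the object is the sample output shown above.

```python
from typing import Any, Dict

# Abbreviated version of animal_json_schema (descriptions and defaults omitted).
schema: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "animal_name": {"type": "string"},
        "danger_level": {"type": "string"},
        "threats": {"type": "array", "items": {"type": "string"}},
    },
}


def conforms(value: Any, schema: Dict[str, Any]) -> bool:
    """Check value against a small subset of JSON Schema (string, array, object)."""
    kind = schema.get("type")
    if kind == "string":
        return isinstance(value, str)
    if kind == "array":
        items = schema.get("items", {})
        return isinstance(value, list) and all(conforms(v, items) for v in value)
    if kind == "object":
        if not isinstance(value, dict):
            return False
        props = schema.get("properties", {})
        # Only declared properties are checked; missing or extra keys are ignored.
        return all(conforms(value[k], sub) for k, sub in props.items() if k in value)
    return True  # missing or unsupported "type": accept


animal = {
    "animal_name": "Sea turtles",
    "danger_level": "MEDIUM",
    "threats": ["Poaching", "Habitat loss", "Pollution"],
}
print(conforms(animal, schema))  # True
print(conforms({"threats": "pollution"}, schema))  # False
```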

Structured generation with Agents

In certain scenarios, you might need to invoke additional tools within your flow, which calls for an agent rather than a single prompt. To get structured outputs from an agent, wrap it in an AgentExecutionStep and pass the expected outputs through the output_descriptors parameter of the step.

from wayflowcore.agent import Agent, CallerInputMode
from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.steps import AgentExecutionStep, StartStep

start_step = StartStep(input_descriptors=[])
agent = Agent(
    llm=llm,
    custom_instruction="""Extract from the article given by the user the name of the animal, its danger level and the threats it's subject to.""",
    initial_message=None,
)

summarize_agent_step = AgentExecutionStep(
    agent=agent,
    output_descriptors=[animal_output, danger_level_output, threats_output],
    caller_input_mode=CallerInputMode.NEVER,
)
summarize_step_name = "summarize_step"
flow = Flow(
    begin_step_name="start_step",
    steps={
        "start_step": start_step,
        summarize_step_name: summarize_agent_step,
    },
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=summarize_agent_step),
        ControlFlowEdge(source_step=summarize_agent_step, destination_step=None),
    ],
    data_flow_edges=[],
)

conversation = flow.start_conversation()
conversation.append_user_message("Here is the article: " + article)
status = conversation.execute()
print(status.output_values)
# {'animal_name': 'Sea turtles', 'danger_level': 'HIGH', 'threats': ['poaching', 'habitat loss', 'pollution']}

Recap

In this guide, you learned how to incorporate LLMs into flows to:

generate raw text with the PromptExecutionStep class
produce structured output with the PromptExecutionStep class
perform structured generation with an agent and the AgentExecutionStep class

Below is the complete code from this guide.

article = """Sea turtles are ancient reptiles that have been around for over 100 million years. They play crucial roles in marine ecosystems, such as maintaining healthy seagrass beds and coral reefs. Unfortunately, they are under threat due to poaching, habitat loss, and pollution. Conservation efforts worldwide aim to protect nesting sites and reduce bycatch in fishing gear."""

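from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.dataconnection import DataFlowEdge
from wayflowcore.flow import Flow
from wayflowcore.models import LlmModelFactory
from wayflowcore.property import StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep

# model_config is your own LLM configuration; it is not defined in this guide.
llm = LlmModelFactory.from_config(model_config)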

start_step = StartStep(input_descriptors=[StringProperty("article")])
summarize_step = PromptExecutionStep(
    llm=llm,
    prompt_template="""Summarize this article in 10 words:\n {{article}}""",
    output_mapping={PromptExecutionStep.OUTPUT: "summary"},
)
summarize_step_name = "summarize_step"
flow = Flow(
    begin_step_name="start_step",
    steps={
        "start_step": start_step,
        summarize_step_name: summarize_step,
    },
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=summarize_step),
        ControlFlowEdge(source_step=summarize_step, destination_step=None),
    ],
    data_flow_edges=[
        DataFlowEdge(start_step, "article", summarize_step, "article"),
    ],
)

conversation = flow.start_conversation(inputs={"article": article})
status = conversation.execute()
print(status.output_values["summary"])
# Sea turtles face threats from poaching, habitat loss, and pollution globally.

from wayflowcore.property import ListProperty, StringProperty
from wayflowcore.steps import PromptExecutionStep, StartStep

animal_output = StringProperty(
    name="animal_name",
    description="name of the animal",
    default_value="",
)
danger_level_output = StringProperty(
    name="danger_level",
    description='level of danger of the animal. Can be "HIGH", "MEDIUM" or "LOW"',
    default_value="",
)
threats_output = ListProperty(
    name="threats",
    description="list of threats for the animal",
    item_type=StringProperty("threat"),
    default_value=[],
)


start_step = StartStep(input_descriptors=[StringProperty("article")])
summarize_step = PromptExecutionStep(
    llm=llm,
    prompt_template="""Extract from the following article the name of the animal, its danger level and the threats it's subject to. The article:\n\n {{article}}""",
    output_descriptors=[animal_output, danger_level_output, threats_output],
)
summarize_step_name = "summarize_step"
flow = Flow(
    begin_step_name="start_step",
    steps={
        "start_step": start_step,
        summarize_step_name: summarize_step,
    },
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=summarize_step),
        ControlFlowEdge(source_step=summarize_step, destination_step=None),
    ],
    data_flow_edges=[
        DataFlowEdge(start_step, "article", summarize_step, "article"),
    ],
)

conversation = flow.start_conversation(inputs={"article": article})
status = conversation.execute()
print(status.output_values)
# {'threats': ['poaching', 'habitat loss', 'pollution'], 'danger_level': 'HIGH', 'animal_name': 'Sea turtles'}

from wayflowcore.agent import Agent, CallerInputMode
from wayflowcore.controlconnection import ControlFlowEdge
from wayflowcore.steps import AgentExecutionStep, StartStep

start_step = StartStep(input_descriptors=[])
agent = Agent(
    llm=llm,
    custom_instruction="""Extract from the article given by the user the name of the animal, its danger level and the threats it's subject to.""",
    initial_message=None,
)

summarize_agent_step = AgentExecutionStep(
    agent=agent,
    output_descriptors=[animal_output, danger_level_output, threats_output],
    caller_input_mode=CallerInputMode.NEVER,
)
summarize_step_name = "summarize_step"
flow = Flow(
    begin_step_name="start_step",
    steps={
        "start_step": start_step,
        summarize_step_name: summarize_agent_step,
    },
    control_flow_edges=[
        ControlFlowEdge(source_step=start_step, destination_step=summarize_agent_step),
        ControlFlowEdge(source_step=summarize_agent_step, destination_step=None),
    ],
    data_flow_edges=[],
)

conversation = flow.start_conversation()
conversation.append_user_message("Here is the article: " + article)
status = conversation.execute()
print(status.output_values)
# {'animal_name': 'Sea turtles', 'danger_level': 'HIGH', 'threats': ['poaching', 'habitat loss', 'pollution']}

Next steps

Having learned how to perform structured generation in WayFlow, you may now proceed to:

Config Generation to change LLM generation parameters.
Catching Exceptions to ensure robustness of the generated outputs.