How to Serve Agents with WayFlow#

Download Python Script

Python script/notebook for this guide.

Serve Agents how-to script

Prerequisites

This guide assumes familiarity with:

WayFlow can host agents behind an OpenAI Responses API-compatible endpoint. Reliable serving unlocks predictable SLAs, reusable state, and consistent security, while letting clients keep using familiar OpenAI SDKs. Start with an in-memory setup for quick experiments, then add persistence to reuse conversation state, and layer on FastAPI security controls that fit your environment.

Create an agent to host#

WayFlow supports several LLM API providers. Select an LLM from the options below:

from wayflowcore.models import OCIGenAIModel, OCIClientConfigWithApiKey

llm = OCIGenAIModel(
    model_id="provider.model-id",
    compartment_id="compartment-id",
    client_config=OCIClientConfigWithApiKey(
        service_endpoint="https://url-to-service-endpoint.com",
    ),
)

Note

API keys should not be stored anywhere in the code. Use environment variables or tools such as python-dotenv.
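For example, credentials and endpoints can be read from the environment at startup. A minimal stdlib sketch (the variable names here are illustrative, not ones WayFlow defines):

```python
import os

# Illustrative variable names; use whatever your deployment defines.
# With python-dotenv, calling load_dotenv() before these lookups would
# also populate os.environ from a local .env file.
service_endpoint = os.environ.get("OCI_SERVICE_ENDPOINT", "")
compartment_id = os.environ.get("OCI_COMPARTMENT_ID", "")
```

This keeps secrets out of source control and lets each environment supply its own values.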

Then, create or reuse an agent you want to serve. You can define it as code:

from wayflowcore.agent import Agent
from wayflowcore.tools.toolhelpers import DescriptionMode, tool

@tool(description_mode=DescriptionMode.ONLY_DOCSTRING)
def get_policy(topic: str) -> str:
    """Return a short HR policy excerpt."""
    return f"{topic} is available. Check the HR portal for details."


agent = Agent(
    llm=llm,
    tools=[get_policy],
    custom_instruction="""You are an HR assistant.
- Call tools when you need facts.""",
)

API Reference: Agent | Tool

Export and reload agent specs#

Save your agent as an Agent Spec so you can deploy from a config file or ship it to another team. Reloading requires a tool_registry that maps tool names back to callables.

from wayflowcore.agentspec import AgentSpecExporter, AgentSpecLoader

spec = AgentSpecExporter().to_json(agent)
tool_registry = {"get_policy": get_policy}
agent_from_spec = AgentSpecLoader(tool_registry=tool_registry).load_json(spec)
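Because the exported spec is plain JSON, it can be written to a file and shipped alongside the tool registry. A minimal sketch of the file round-trip (using a placeholder spec string so the snippet stands alone; in practice you would write the string returned by `AgentSpecExporter().to_json(agent)`):

```python
import json
from pathlib import Path

# Placeholder for the JSON string returned by AgentSpecExporter().to_json(agent).
spec = '{"name": "hr-assistant"}'

# Write the spec so it can be deployed from a config file.
spec_path = Path("hr_agent.json")
spec_path.write_text(spec)

# Later, or on another machine, read the file back and pass the string
# to AgentSpecLoader(tool_registry=...).load_json(...).
reloaded = spec_path.read_text()
assert json.loads(reloaded)["name"] == "hr-assistant"
```

The file name matches the `--agent-config hr_agent.json` argument used in the CLI section below.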

API Reference: AgentSpecExporter | AgentSpecLoader

Run an in-memory Responses API server#

Expose the agent with OpenAIResponsesServer. The server mounts /v1/responses and /v1/models endpoints that work with the official openai SDK or OpenAICompatibleModel.

from wayflowcore.agentserver.server import OpenAIResponsesServer

server = OpenAIResponsesServer()
server.serve_agent("hr-assistant", agent)
app = server.get_app()

if __name__ == "__main__":
    server.run(host="127.0.0.1", port=3000)

You can now call the server with an OpenAI-compatible client:

from wayflowcore.models import OpenAIAPIType, OpenAICompatibleModel

client = OpenAICompatibleModel(
    model_id="hr-assistant",
    base_url="http://127.0.0.1:3000",
    api_type=OpenAIAPIType.RESPONSES,
)
completion = client.generate("Summarize the vacation policy.")
print(completion.message.content)
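Because the server speaks the OpenAI Responses API, any HTTP client can call it directly. A stdlib sketch of the wire format (the server from the snippet above must be running before `post_response` is called; the exact response fields depend on the server version):

```python
import json
import urllib.request

# Request body in the OpenAI Responses API shape.
payload = {
    "model": "hr-assistant",
    "input": "Summarize the vacation policy.",
}

def post_response(base_url: str = "http://127.0.0.1:3000") -> dict:
    # POST the payload to the /v1/responses endpoint mounted by the server.
    request = urllib.request.Request(
        f"{base_url}/v1/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```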

API Reference: OpenAIResponsesServer

Persist conversations with datastores#

To reuse conversation history across requests or server restarts, attach a datastore. Use ServerStorageConfig to define table and column names, then pass a supported Datastore implementation such as PostgresDatabaseDatastore or OracleDatabaseDatastore.

import os

from wayflowcore.agentserver import ServerStorageConfig
from wayflowcore.datastore.postgres import (
    WithoutTlsPostgresDatabaseConnectionConfig,
    PostgresDatabaseDatastore,
)

storage_config = ServerStorageConfig(table_name="assistant_conversations")

connection_config = WithoutTlsPostgresDatabaseConnectionConfig(
    user=os.environ.get("PG_USER", "postgres"),
    password=os.environ.get("PG_PASSWORD", "password"),
    url=os.environ.get("PG_HOST", "localhost:5432"),
)

datastore = PostgresDatabaseDatastore(
    schema=storage_config.to_schema(),
    connection_config=connection_config,
)

persistent_server = OpenAIResponsesServer(
    storage=datastore,
    storage_config=storage_config,
)
persistent_server.serve_agent("hr-assistant", agent)

In production, create the table beforehand, or run wayflow serve with --setup-datastore yes to let WayFlow prepare it when the backend supports schema management. WayFlow will not overwrite an existing table, so delete any previous table first if you want WayFlow to recreate it for you.

API Reference: ServerStorageConfig | Datastore

Add FastAPI security controls#

OpenAIResponsesServer gives you the FastAPI app instance so you can stack your own middleware, dependencies, or routers. This example enforces a simple bearer token check.

import os
import secrets

from fastapi import Request, status
from fastapi.responses import JSONResponse

API_TOKEN = os.environ.get("WAYFLOW_SERVER_TOKEN", "change-me")

secured_app = server.get_app()

@secured_app.middleware("http")
async def require_bearer_token(request: Request, call_next):
    auth_header = request.headers.get("authorization", "")
    expected_header = f"Bearer {API_TOKEN}"
    if not secrets.compare_digest(auth_header, expected_header):
        return JSONResponse(
            status_code=status.HTTP_401_UNAUTHORIZED,
            content={"detail": "Missing or invalid bearer token"},
        )
    return await call_next(request)
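The `secrets.compare_digest` call is deliberate: unlike `==`, it compares the two strings in constant time, so response timing does not leak information about the expected token. A standalone sketch of the same check:

```python
import secrets

def is_authorized(auth_header: str, token: str) -> bool:
    # Constant-time comparison; a plain == could short-circuit on the
    # first mismatched character and leak timing information.
    return secrets.compare_digest(auth_header, f"Bearer {token}")
```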

Replace the token check with your own authentication handler (OAuth2, mTLS validation, signed cookies, IP filtering, etc.) and add rate limiting or CORS rules as needed.

API Reference: OpenAIResponsesServer

Use the CLI#

You can also serve an agent spec file directly from the CLI:

wayflow serve \
  --api openai-responses \
  --agent-config hr_agent.json \
  --agent-id hr-assistant \
  --server-storage postgres-db \
  --datastore-connection-config postgres_conn.yaml \
  --setup-datastore yes

Pass --tool-registry to load your own tools, swap --server-storage to oracle-db or in-memory, and set --server-storage-config to override column names. See the API reference for a complete description of all arguments.

Warning

This CLI does not implement any security features; use it only for development or inside an already-secured environment such as OCI agent deployments. Missing controls include:

  • Authentication: No verification of caller identity—anyone with network access can invoke the agent.

  • Authorization: No role or permission checks to restrict which users can access specific agents or actions.

  • Rate limiting: No protection against excessive requests that could exhaust resources or incur runaway costs.

  • TLS/HTTPS: Traffic is unencrypted by default, risking interception of sensitive prompts and responses.

For production deployments, wrap the server with an API gateway, reverse proxy, or custom FastAPI middleware that enforces these controls.
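As one illustration of application-layer rate limiting, a per-client token-bucket check like the following could be called from an HTTP middleware before dispatching each request. This is a simplified in-process sketch with illustrative names, not a substitute for a gateway (it resets on restart and is not shared across workers):

```python
import time
from collections import defaultdict

class TokenBucket:
    """Allow roughly `rate` requests per second per client, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = defaultdict(lambda: capacity)   # tokens remaining per client
        self.updated = defaultdict(time.monotonic)    # last refill time per client

    def allow(self, client_id: str) -> bool:
        now = time.monotonic()
        elapsed = now - self.updated[client_id]
        self.updated[client_id] = now
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens[client_id] = min(
            self.capacity, self.tokens[client_id] + elapsed * self.rate
        )
        if self.tokens[client_id] >= 1:
            self.tokens[client_id] -= 1
            return True
        return False
```

A middleware would key the bucket on, for example, the client IP or API token, and return HTTP 429 when `allow` is False.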

Next steps#

Full code#

Click on the card at the top of this page to download the full code for this guide or copy the code below.

# Copyright © 2025 Oracle and/or its affiliates.
#
# This software is under the Apache License 2.0
# (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0) or Universal Permissive License
# (UPL) 1.0 (LICENSE-UPL or https://oss.oracle.com/licenses/upl), at your option.

# %%[markdown]
# WayFlow Code Example - How to Serve Agents with WayFlow
# -------------------------------------------------------

# How to use:
# Create a new Python virtual environment and install the latest WayFlow version.
# ```bash
# python -m venv venv-wayflowcore
# source venv-wayflowcore/bin/activate
# pip install --upgrade pip
# pip install "wayflowcore==26.1"
# ```

# You can now run the script
# 1. As a Python file:
# ```bash
# python howto_serve_agents.py
# ```
# 2. As a Notebook (in VSCode):
# When viewing the file,
#  - press the keys Ctrl + Enter to run the selected cell
#  - or Shift + Enter to run the selected cell and move to the cell below


import os


# %%[markdown]
## Define the llm

# %%
from wayflowcore.models import VllmModel

llm = VllmModel(
    model_id="LLAMA_MODEL_ID",
    host_port="LLAMA_API_URL",
)


# %%[markdown]
## Define the agent

# %%
from wayflowcore.agent import Agent
from wayflowcore.tools.toolhelpers import DescriptionMode, tool

@tool(description_mode=DescriptionMode.ONLY_DOCSTRING)
def get_policy(topic: str) -> str:
    """Return a short HR policy excerpt."""
    return f"{topic} is available. Check the HR portal for details."


agent = Agent(
    llm=llm,
    tools=[get_policy],
    custom_instruction="""You are an HR assistant.
- Call tools when you need facts.""",
)


# %%[markdown]
## Export agent spec

# %%
from wayflowcore.agentspec import AgentSpecExporter, AgentSpecLoader

spec = AgentSpecExporter().to_json(agent)
tool_registry = {"get_policy": get_policy}
agent_from_spec = AgentSpecLoader(tool_registry=tool_registry).load_json(spec)


# %%[markdown]
## Serve in memory

# %%
from wayflowcore.agentserver.server import OpenAIResponsesServer

server = OpenAIResponsesServer()
server.serve_agent("hr-assistant", agent)
app = server.get_app()

if __name__ == "__main__":
    server.run(host="127.0.0.1", port=3000)


# %%[markdown]
## Call the server

# %%
from wayflowcore.models import OpenAIAPIType, OpenAICompatibleModel

client = OpenAICompatibleModel(
    model_id="hr-assistant",
    base_url="http://127.0.0.1:3000",
    api_type=OpenAIAPIType.RESPONSES,
)
completion = client.generate("Summarize the vacation policy.")
print(completion.message.content)


# %%[markdown]
## Persistent storage

# %%
from wayflowcore.agentserver import ServerStorageConfig
from wayflowcore.datastore.postgres import (
    WithoutTlsPostgresDatabaseConnectionConfig,
    PostgresDatabaseDatastore,
)

storage_config = ServerStorageConfig(table_name="assistant_conversations")

connection_config = WithoutTlsPostgresDatabaseConnectionConfig(
    user=os.environ.get("PG_USER", "postgres"),
    password=os.environ.get("PG_PASSWORD", "password"),
    url=os.environ.get("PG_HOST", "localhost:5432"),
)

datastore = PostgresDatabaseDatastore(
    schema=storage_config.to_schema(),
    connection_config=connection_config,
)

persistent_server = OpenAIResponsesServer(
    storage=datastore,
    storage_config=storage_config,
)
persistent_server.serve_agent("hr-assistant", agent)


# %%[markdown]
## Add fastapi security

# %%
import secrets
from fastapi import Request, status
from fastapi.responses import JSONResponse

API_TOKEN = os.environ.get("WAYFLOW_SERVER_TOKEN", "change-me")

secured_app = server.get_app()

@secured_app.middleware("http")
async def require_bearer_token(request: Request, call_next):
    auth_header = request.headers.get("authorization", "")
    expected_header = f"Bearer {API_TOKEN}"
    if not secrets.compare_digest(auth_header, expected_header):
        return JSONResponse(
            status_code=status.HTTP_401_UNAUTHORIZED,
            content={"detail": "Missing or invalid bearer token"},
        )
    return await call_next(request)