How to Serve Agents with WayFlow#
WayFlow can host agents behind an OpenAI Responses API compatible endpoint. Reliable serving unlocks predictable SLAs, reusable state, and consistent security, while letting clients keep using familiar OpenAI SDKs. Start with an in-memory setup for quick experiments, then add persistence to reuse conversation state and layer FastAPI security controls that fit your environment.
Create an agent to host#
WayFlow supports several LLM API providers. Select an LLM from the options below:
from wayflowcore.models import OCIGenAIModel, OCIClientConfigWithApiKey

llm = OCIGenAIModel(
    model_id="provider.model-id",
    compartment_id="compartment-id",
    client_config=OCIClientConfigWithApiKey(
        service_endpoint="https://url-to-service-endpoint.com",
    ),
)
from wayflowcore.models import VllmModel

llm = VllmModel(
    model_id="model-id",
    host_port="VLLM_HOST_PORT",
)
from wayflowcore.models import OllamaModel

llm = OllamaModel(
    model_id="model-id",
)
Note
API keys should not be stored anywhere in the code. Use environment variables or tools such as python-dotenv.
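For example, you can read the key from the environment at startup. The variable name OCI_GENAI_API_KEY below is illustrative; use whatever name your provider expects:

```python
import os

# Illustrative variable name; use whatever your LLM provider expects.
api_key = os.environ.get("OCI_GENAI_API_KEY", "")
if not api_key:
    print("Warning: OCI_GENAI_API_KEY is not set")
```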
Then, create or reuse an agent you want to serve. You can define it as code:
from wayflowcore.agent import Agent
from wayflowcore.tools.toolhelpers import DescriptionMode, tool

@tool(description_mode=DescriptionMode.ONLY_DOCSTRING)
def get_policy(topic: str) -> str:
    """Return a short HR policy excerpt."""
    return f"{topic} is available. Check the HR portal for details."

agent = Agent(
    llm=llm,
    tools=[get_policy],
    custom_instruction="""You are an HR assistant.
- Call tools when you need facts.""",
)
Export and reload agent specs#
Save your agent as an Agent Spec so you can deploy from a config file or ship it to another team.
Reloading requires a tool_registry that maps tool names back to callables.
from wayflowcore.agentspec import AgentSpecExporter, AgentSpecLoader
spec = AgentSpecExporter().to_json(agent)
tool_registry = {"get_policy": get_policy}
agent_from_spec = AgentSpecLoader(tool_registry=tool_registry).load_json(spec)
API Reference: AgentSpecExporter | AgentSpecLoader
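For instance, the exported spec (a JSON string) can be written to disk and reloaded later or on another machine. The file name and placeholder content below are illustrative:

```python
import json
import tempfile
from pathlib import Path

# Stand-in for the output of AgentSpecExporter().to_json(agent).
spec = '{"component_type": "Agent"}'

config_path = Path(tempfile.gettempdir()) / "hr_agent.json"
config_path.write_text(spec)

# Later, in the deploying process:
reloaded_spec = config_path.read_text()
# agent = AgentSpecLoader(tool_registry={"get_policy": get_policy}).load_json(reloaded_spec)
```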
Run an in-memory Responses API server#
Expose the agent with OpenAIResponsesServer. The server mounts
/v1/responses and /v1/models endpoints that work with the official openai SDK or
OpenAICompatibleModel.
from wayflowcore.agentserver.server import OpenAIResponsesServer

server = OpenAIResponsesServer()
server.serve_agent("hr-assistant", agent)
app = server.get_app()

if __name__ == "__main__":
    server.run(host="127.0.0.1", port=3000)
You can now call the server with an OpenAI-compatible client:
from wayflowcore.models import OpenAIAPIType, OpenAICompatibleModel
client = OpenAICompatibleModel(
model_id="hr-assistant",
base_url="http://127.0.0.1:3000",
api_type=OpenAIAPIType.RESPONSES,
)
completion = client.generate("Summarize the vacation policy.")
print(completion.message.content)
API Reference: OpenAIResponsesServer
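Because the server speaks the OpenAI Responses API, any HTTP client can call it as well. The sketch below builds the raw request, assuming the server above is running locally; the payload fields follow the OpenAI Responses API shape:

```python
import json
import urllib.request

# POST /v1/responses payload, following the OpenAI Responses API shape.
payload = {
    "model": "hr-assistant",  # the agent id passed to serve_agent
    "input": "Summarize the vacation policy.",
}
request = urllib.request.Request(
    "http://127.0.0.1:3000/v1/responses",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With the server running, send the request and parse the response:
# response = json.load(urllib.request.urlopen(request))
```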
Persist conversations with datastores#
To reuse conversation history across requests or server restarts, attach a datastore. Use ServerStorageConfig to define table and column names, then pass a supported Datastore implementation such as PostgresDatabaseDatastore or OracleDatabaseDatastore.
import os

from wayflowcore.agentserver import ServerStorageConfig
from wayflowcore.datastore.postgres import (
    WithoutTlsPostgresDatabaseConnectionConfig,
    PostgresDatabaseDatastore,
)

storage_config = ServerStorageConfig(table_name="assistant_conversations")

connection_config = WithoutTlsPostgresDatabaseConnectionConfig(
    user=os.environ.get("PG_USER", "postgres"),
    password=os.environ.get("PG_PASSWORD", "password"),
    url=os.environ.get("PG_HOST", "localhost:5432"),
)

datastore = PostgresDatabaseDatastore(
    schema=storage_config.to_schema(),
    connection_config=connection_config,
)

persistent_server = OpenAIResponsesServer(
    storage=datastore,
    storage_config=storage_config,
)
persistent_server.serve_agent("hr-assistant", agent)
persistent_server.serve_agent("hr-assistant", agent)
In production, create the table beforehand, or run wayflow serve with --setup-datastore yes
to let WayFlow create it when the backend supports schema management. WayFlow never overwrites
an existing table, so if you want it to recreate one, drop the old table first.
API Reference: ServerStorageConfig | Datastore
Add FastAPI security controls#
OpenAIResponsesServer gives you the FastAPI app instance so
you can stack your own middleware, dependencies, or routers. This example enforces a simple bearer
token check.
import os
import secrets

from fastapi import Request, status
from fastapi.responses import JSONResponse

API_TOKEN = os.environ.get("WAYFLOW_SERVER_TOKEN", "change-me")

secured_app = server.get_app()

@secured_app.middleware("http")
async def require_bearer_token(request: Request, call_next):
    auth_header = request.headers.get("authorization", "")
    expected_header = f"Bearer {API_TOKEN}"
    if not secrets.compare_digest(auth_header, expected_header):
        return JSONResponse(
            status_code=status.HTTP_401_UNAUTHORIZED,
            content={"detail": "Missing or invalid bearer token"},
        )
    return await call_next(request)
Replace the token check with your own authentication handler (OAuth2, mTLS validation, signed cookies, IP filtering, etc.) and add rate limiting or CORS rules as needed.
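As one rate-limiting building block, the sketch below shows an in-process sliding-window counter that a middleware could consult per client. It is illustrative only; production deployments usually delegate rate limiting to a gateway or a shared store such as Redis:

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds per client key."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        hits = self._hits[key]
        # Discard timestamps that fell out of the window.
        while hits and now - hits[0] > self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window=60.0)
```

A middleware would call something like limiter.allow(request.client.host) before forwarding the request, and return HTTP 429 when it returns False.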
API Reference: OpenAIResponsesServer
Use the CLI#
You can also serve an agent spec file directly from the CLI:
wayflow serve \
--api openai-responses \
--agent-config hr_agent.json \
--agent-id hr-assistant \
--server-storage postgres-db \
--datastore-connection-config postgres_conn.yaml \
--setup-datastore yes
Pass --tool-registry to load your own tools, swap --server-storage to oracle-db or
in-memory, and set --server-storage-config to override column names. See the API reference for a
complete description of all arguments.
Warning
This CLI does not implement any security features; use it only for development or inside an already-secured environment such as OCI agent deployments. Missing controls include:
Authentication: No verification of caller identity; anyone with network access can invoke the agent.
Authorization: No role or permission checks to restrict which users can access specific agents or actions.
Rate limiting: No protection against excessive requests that could exhaust resources or incur runaway costs.
TLS/HTTPS: Traffic is unencrypted by default, risking interception of sensitive prompts and responses.
For production deployments, wrap the server with an API gateway, reverse proxy, or custom FastAPI middleware that enforces these controls.
Next steps#
Full code#
Click on the card at the top of this page to download the full code for this guide or copy the code below.
# Copyright © 2025 Oracle and/or its affiliates.
#
# This software is under the Apache License 2.0
# (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0) or Universal Permissive License
# (UPL) 1.0 (LICENSE-UPL or https://oss.oracle.com/licenses/upl), at your option.

# %%[markdown]
# WayFlow Code Example - How to Serve Agents with WayFlow
# -------------------------------------------------------

# How to use:
# Create a new Python virtual environment and install the latest WayFlow version.
# ```bash
# python -m venv venv-wayflowcore
# source venv-wayflowcore/bin/activate
# pip install --upgrade pip
# pip install "wayflowcore==26.1"
# ```

# You can now run the script
# 1. As a Python file:
# ```bash
# python howto_serve_agents.py
# ```
# 2. As a Notebook (in VSCode):
# When viewing the file,
# - press the keys Ctrl + Enter to run the selected cell
# - or Shift + Enter to run the selected cell and move to the cell below

import os


# %%[markdown]
## Define the llm

# %%
from wayflowcore.models import VllmModel

llm = VllmModel(
    model_id="LLAMA_MODEL_ID",
    host_port="LLAMA_API_URL",
)


# %%[markdown]
## Define the agent

# %%
from wayflowcore.agent import Agent
from wayflowcore.tools.toolhelpers import DescriptionMode, tool

@tool(description_mode=DescriptionMode.ONLY_DOCSTRING)
def get_policy(topic: str) -> str:
    """Return a short HR policy excerpt."""
    return f"{topic} is available. Check the HR portal for details."


agent = Agent(
    llm=llm,
    tools=[get_policy],
    custom_instruction="""You are an HR assistant.
- Call tools when you need facts.""",
)


# %%[markdown]
## Export agent spec

# %%
from wayflowcore.agentspec import AgentSpecExporter, AgentSpecLoader

spec = AgentSpecExporter().to_json(agent)
tool_registry = {"get_policy": get_policy}
agent_from_spec = AgentSpecLoader(tool_registry=tool_registry).load_json(spec)


# %%[markdown]
## Serve in memory

# %%
from wayflowcore.agentserver.server import OpenAIResponsesServer

server = OpenAIResponsesServer()
server.serve_agent("hr-assistant", agent)
app = server.get_app()

if __name__ == "__main__":
    server.run(host="127.0.0.1", port=3000)


# %%[markdown]
## Call the server

# %%
from wayflowcore.models import OpenAIAPIType, OpenAICompatibleModel

client = OpenAICompatibleModel(
    model_id="hr-assistant",
    base_url="http://127.0.0.1:3000",
    api_type=OpenAIAPIType.RESPONSES,
)
completion = client.generate("Summarize the vacation policy.")
print(completion.message.content)


# %%[markdown]
## Persistent storage

# %%
from wayflowcore.agentserver import ServerStorageConfig
from wayflowcore.datastore.postgres import (
    WithoutTlsPostgresDatabaseConnectionConfig,
    PostgresDatabaseDatastore,
)

storage_config = ServerStorageConfig(table_name="assistant_conversations")

connection_config = WithoutTlsPostgresDatabaseConnectionConfig(
    user=os.environ.get("PG_USER", "postgres"),
    password=os.environ.get("PG_PASSWORD", "password"),
    url=os.environ.get("PG_HOST", "localhost:5432"),
)

datastore = PostgresDatabaseDatastore(
    schema=storage_config.to_schema(),
    connection_config=connection_config,
)

persistent_server = OpenAIResponsesServer(
    storage=datastore,
    storage_config=storage_config,
)
persistent_server.serve_agent("hr-assistant", agent)


# %%[markdown]
## Add fastapi security

# %%
import secrets
from fastapi import Request, status
from fastapi.responses import JSONResponse

API_TOKEN = os.environ.get("WAYFLOW_SERVER_TOKEN", "change-me")

secured_app = server.get_app()

@secured_app.middleware("http")
async def require_bearer_token(request: Request, call_next):
    auth_header = request.headers.get("authorization", "")
    expected_header = f"Bearer {API_TOKEN}"
    if not secrets.compare_digest(auth_header, expected_header):
        return JSONResponse(
            status_code=status.HTTP_401_UNAUTHORIZED,
            content={"detail": "Missing or invalid bearer token"},
        )
    return await call_next(request)