Open Agent Specification Tracing (Agent Spec Tracing)#
Overview#
Open Agent Specification Tracing (short: Agent Spec Tracing) is an extension of Agent Spec that standardizes how agent and flow executions emit traces. It defines unified, implementation-agnostic semantics for:
Events: structured, point-in-time facts.
Spans: time-bounded execution contexts that group events.
Traces: trees of spans that represent an end-to-end execution.
SpanProcessors: pluggable hooks to consume spans and events (e.g., export to UIs, tracing backends, or logs).
Agent Spec Tracing enables:
Runtime adapters to emit consistent traces across different frameworks.
Consumers (observability backends, UIs, developer tooling) to ingest one standardized format regardless of the producer.
Agent Spec Tracing aligns with widely used observability concepts (e.g., OpenTelemetry), while grounding definitions in Agent Spec components and semantics. It specifies what spans and events to emit, when to emit them, and which attributes to include, including which attributes are sensitive.
Scope and goals#
Provide a canonical list of span and event types for Agent Spec runtimes.
Define lifecycle and attribute schema for each span/event.
Identify sensitive fields and how they should be handled.
Provide a minimal API surface for producers and consumers.
Remain neutral with respect to storage and transport (telemetry backends, UIs, files, etc.).
Core Concepts#
Event#
An Event is an atomic episode that occurs at a specific time instant. It always belongs to exactly one Span.
Events have a definition similar to the Agent Spec Components, and they have the same descriptive fields:
id, name, description, type, and metadata.
Additionally, they require a timestamp that defines when the event occurred, and extensions of this class can
add more attributes based on the event they represent, aimed at preserving all the relevant information
related to the event being recorded.
Events can also have sensitive fields, which must be declared and handled per Agent Spec guidelines.
class Event:
    id: str                         # Unique identifier for the event. Typically generated from component type plus a unique token.
    type: str                       # Concrete type specifier for this event
    name: str                       # Name of the event, if applicable
    description: str                # Description of the event, if applicable
    metadata: Dictionary[str, Any]  # Additional metadata that could be used for extra information
    timestamp: int                  # Nanoseconds since epoch
timestamp: time of occurrence (ns). Producers should use monotonic clocks where possible and/or convert to wall-clock ns as configured by the runtime.
Agent Spec Tracing defines a set of Event types with specific attributes, which you can find in the following sections.
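As a minimal illustration, a producer could build an event with a wall-clock nanosecond timestamp as in the sketch below. It assumes Pydantic-style keyword construction (as provided by the pyagentspec.tracing materialization described later in this document); the identifier and name values are illustrative.

import time

# Minimal sketch of constructing a base Event with a nanosecond timestamp.
# The keyword-style constructor assumes Pydantic-like models (as in pyagentspec.tracing);
# the id and name values are illustrative, not mandated by the specification.
event = Event(
    id="event/6f2c",
    type="Event",
    name="example event",
    description="Illustrative event for documentation purposes",
    metadata={},
    timestamp=time.time_ns(),  # nanoseconds since epoch
)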
Span#
A Span defines a time-bounded execution context. Each Span:
starts at start_time (ns) and ends at end_time (ns); end_time can be null while the span is not yet closed.
can contain zero or more Events.
can be nested (child span has a parent span).
Spans also have a definition similar to the Agent Spec Components, and they share the same descriptive fields:
id, name, description, type, and metadata.
Extensions of this Span can add more attributes based on the Span they represent.
Attributes on a Span typically reflect configuration that applies to the whole
duration of the Span (e.g., the Agent being executed, the LLM config, etc.).
Spans can also have sensitive fields, which must be declared and handled per Agent Spec guidelines.
Besides these attributes, Spans MUST also implement the following interface:
class Span:
    id: str                         # Unique identifier for the span. Typically generated from component type plus a unique token.
    type: str                       # Concrete type specifier for this span.
    name: str                       # Name of the span, if applicable
    description: str                # Description of the span, if applicable
    metadata: Dictionary[str, Any]  # Additional metadata that could be used for extra information
    start_time: int                 # Nanoseconds since epoch
    end_time: Optional[int]         # Nanoseconds since epoch; null while the span is still open
    events: List[Event]             # Events recorded within this span

    def start(self) -> None: ...
    def end(self) -> None: ...
    def add_event(self, event: Event) -> None: ...
start: called when the span starts.
end: called when the span ends.
add_event: called when an event has to be added to this Span. It MUST append the event to the events attribute.
Lifecycle rules:
Spans MUST have start_time when started, while end_time is set on end.
Events added to a Span MUST have timestamps within [start_time, end_time], whenever end_time is known.
Spans MAY contain child spans. Implementations SHOULD propagate correlation context so consumers can rebuild the tree.
Agent Spec Tracing defines a set of Span types with specific attributes, which you can find in the following sections.
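As an informal sketch of span nesting and event placement, the snippet below records a tool execution inside an agent execution. The concrete span and event types are defined in the following sections, the context-manager style mirrors the producer example shown later, and the agent, tool, and payload values are illustrative.

# Sketch of span nesting: a tool execution recorded inside an agent execution.
# my_agent, search_tool and the input/output payloads are illustrative.
with AgentExecutionSpan(name="AgentExecution - assistant", agent=my_agent) as agent_span:
    agent_span.add_event(AgentExecutionStart(inputs={"query": "hello"}))
    with ToolExecutionSpan(name="ToolExecution - search", tool=search_tool) as tool_span:
        tool_span.add_event(ToolExecutionRequest(tool=search_tool, request_id="req-1", inputs={"q": "hello"}))
        tool_span.add_event(ToolExecutionResponse(tool=search_tool, request_id="req-1", output={"hits": []}))
    agent_span.add_event(AgentExecutionEnd(outputs={"answer": "hi"}))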
SpanProcessor#
A SpanProcessor receives callbacks when Spans start/end and when Events are added. Processors are meant to consume the Agent Spec traces (spans, events) emitted by the runtime adapter during execution. They can be used to export traces to third-party consumers (e.g., OpenTelemetry, files, UIs).
A SpanProcessor MUST implement the following interface.
class SpanProcessor(ABC):
    def on_start(self, span: Span) -> None: ...
    def on_end(self, span: Span) -> None: ...
    def on_event(self, event: Event, span: Span) -> None: ...
    def startup(self) -> None: ...
    def shutdown(self) -> None: ...
startup: called when an Agent Spec Tracing session starts.
shutdown: called when an Agent Spec Tracing session ends.
on_start: called when a Span starts.
on_end: called when a Span ends.
on_event: called when an Event is added to a Span.
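A minimal sketch of a conforming processor, assuming the interface above (the logging destination is illustrative):

import logging

# Minimal sketch: a SpanProcessor that logs spans and events as they arrive.
class LoggingSpanProcessor(SpanProcessor):
    def __init__(self) -> None:
        self._logger = logging.getLogger("agentspec.tracing")

    def startup(self) -> None:
        self._logger.info("tracing session started")

    def shutdown(self) -> None:
        self._logger.info("tracing session ended")

    def on_start(self, span: Span) -> None:
        self._logger.info("span started: %s (%s)", span.name, span.type)

    def on_end(self, span: Span) -> None:
        self._logger.info("span ended: %s (%s)", span.name, span.type)

    def on_event(self, event: Event, span: Span) -> None:
        self._logger.info("event %s added to span %s", event.type, span.id)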
Trace#
A Trace groups all spans and events that belong to the same top-level assistant execution. It is the root where all the SpanProcessors that must be active during the assistant execution are declared.
Opening a Trace SHOULD call SpanProcessor.startup() on all configured processors.
Closing a Trace SHOULD call SpanProcessor.shutdown() on all configured processors.
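As an illustration of this lifecycle, a Trace used as a context manager could drive the processors as sketched below. This is illustrative only; the concrete Trace class is provided by the runtime (see the pyagentspec examples later in this document).

# Sketch: how opening/closing a Trace could drive the processor lifecycle.
class SimpleTrace:
    def __init__(self, name: str, span_processors: list) -> None:
        self.name = name
        self.span_processors = span_processors

    def __enter__(self) -> "SimpleTrace":
        for processor in self.span_processors:
            processor.startup()    # opening the Trace starts all configured processors
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        for processor in self.span_processors:
            processor.shutdown()   # closing the Trace shuts them down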
Standard Span Types#
This first version defines the following span types.
All spans inherit the attributes of the base Span class.
The attributes listed here are additional and span-specific.
Fields marked sensitive MUST be handled as described in Security Considerations.
LlmGenerationSpan#
Covers the whole LLM generation process.
Starts: when the LLM generation request is received and the LLM call is performed.
Ends: when the LLM output has been generated and is ready to be processed.
Attributes:
| Name | Description | Type | Default | Sensitive |
|---|---|---|---|---|
| llm_config | The LlmConfig performing the generation | LlmConfig | - | no |
ToolExecutionSpan#
Covers a tool execution (excluding client-side tools executed by the UI/client).
Starts: when tool execution starts.
Ends: when the tool execution completes and the result is ready to be processed.
Attributes:
| Name | Description | Type | Default | Sensitive |
|---|---|---|---|---|
| tool | The Tool being executed | Tool | - | no |
AgentExecutionSpan#
Represents the execution of an Agent. May be nested for sub-agents.
Starts: when the agent execution starts.
Ends: when the agent execution completes and outputs are ready to process.
Attributes:
| Name | Description | Type | Default | Sensitive |
|---|---|---|---|---|
| agent | The Agent being executed | Agent | - | no |
SwarmExecutionSpan#
Specialization of AgentExecutionSpan for a Swarm Component.
Starts: when swarm execution starts.
Ends: when swarm execution completes and outputs are ready.
Attributes:
| Name | Description | Type | Default | Sensitive |
|---|---|---|---|---|
| swarm | The Swarm being executed | Swarm | - | no |
ManagerWorkersExecutionSpan#
Specialization of AgentExecutionSpan for a Manager-Workers pattern.
Starts: when Manager-Workers execution starts.
Ends: when the execution completes and outputs are ready.
Attributes:
| Name | Description | Type | Default | Sensitive |
|---|---|---|---|---|
| managerworkers | The ManagerWorkers being executed | ManagerWorkers | - | no |
FlowExecutionSpan#
Covers the execution of a Flow.
Starts: when the Flow’s StartNode execution starts.
Ends: when one of the Flow’s EndNode executions finishes.
Attributes:
| Name | Description | Type | Default | Sensitive |
|---|---|---|---|---|
| flow | The Flow being executed | Flow | - | no |
NodeExecutionSpan#
Covers the execution of a single Node within a Flow.
Starts: when the Node execution starts on the given inputs.
Ends: when the Node execution ends and outputs are ready.
Attributes:
| Name | Description | Type | Default | Sensitive |
|---|---|---|---|---|
| node | The Node being executed | Node | - | no |
Standard Event Types#
All events inherit the attributes defined in the base Event class.
The following events define the default set for Agent Spec Tracing.
For each, we specify when it is emitted and which attributes it carries.
Fields marked sensitive MUST be handled as described in Security Considerations.
LLM events#
LlmGenerationRequest#
An LLM generation request was received. Emitted when an LlmGenerationSpan starts.
Attributes:
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| llm_config | The LlmConfig performing the generation | LlmConfig | - | no |
| request_id | Identifier of the generation request | str | - | no |
| llm_generation_config | The LLM generation parameters used for this call | Optional[LlmGenerationConfig] | null | no |
| prompt | Prompt that will be sent to the LLM; a list of Message with at least content and role, optionally sender | List[Message] | - | yes |
| tools | Tools sent as part of the generation request | Optional[List[Tool]] | null | no |
The Message model should be implemented as
class Message(BaseModel):
    """Model used to specify LLM message details in events and spans"""

    id: Optional[str] = None
    "Identifier of the message"
    content: str
    "Content of the message"
    sender: Optional[str] = None
    "Sender of the message"
    role: str
    "Role of the sender of the message. Typically 'user', 'assistant', or 'system'"
LlmGenerationResponse#
An LLM response was received. Emitted when an LlmGenerationSpan ends.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| llm_config | The LlmConfig performing the generation | LlmConfig | - | no |
| request_id | Identifier of the generation request | str | - | no |
| tool_calls | Tool calls returned by the LLM; each tool call is a ToolCall (see below) | List[ToolCall] | - | yes |
| completion_id | Identifier of the completion related to this response | Optional[str] | null | no |
| content | The content of the response (assistant message text) | str | - | yes |
The ToolCall model should be implemented as
class ToolCall(BaseModel):
    """Model for an LLM tool call."""

    call_id: str
    "Identifier of the tool call"
    tool_name: str
    "The name of the tool that should be called"
    arguments: str
    "The values of the arguments that should be passed to the tool, in JSON format"
LlmGenerationStreamingChunkReceived#
A streamed chunk was received during LLM generation.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| llm_config | The LlmConfig performing the generation | LlmConfig | - | no |
| request_id | Identifier of the generation request | str | - | no |
| tool_calls | Tool calls streamed by the LLM in this chunk; each tool call is a ToolCall (see above) | List[ToolCall] | - | yes |
| completion_id | Identifier of the parent completion (message or tool call) this chunk belongs to | Optional[str] | null | no |
| content | The chunk content. This is the delta compared to the last chunk received. | str | - | yes |
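For example, a consumer can rebuild the full streamed message by concatenating deltas per completion, as in the following sketch (attribute access assumes the event model described in the table above):

from collections import defaultdict

# Sketch: reassembling streamed content from LlmGenerationStreamingChunkReceived events.
# Each chunk carries the delta with respect to the previous chunk of the same completion,
# so concatenating deltas per completion_id rebuilds the full message.
completions: dict[str, str] = defaultdict(str)

def on_chunk(event) -> None:
    key = event.completion_id or event.request_id  # fall back to request_id if no completion_id yet
    completions[key] += event.content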
Tool events#
ToolExecutionRequest#
A tool execution request is received. Emitted when a ToolExecutionSpan starts (or a client tool is requested).
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| tool | The Tool being executed | Tool | - | no |
| request_id | Identifier of the tool execution request | str | - | no |
| inputs | Input values for the tool (one per input property) | dict[str, any] | - | yes |
ToolExecutionResponse#
A tool execution finishes and a result is received. Emitted when a ToolExecutionSpan ends (or a client tool result is received).
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| tool | The Tool being executed | Tool | - | no |
| request_id | Identifier of the tool execution request | str | - | no |
| output | Return value produced by the tool (one per output property) | dict[str, any] | - | yes |
ToolConfirmationRequest#
A tool confirmation is requested (e.g., human-in-the-loop approval before execution).
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| tool | The Tool being executed | Tool | - | no |
| tool_execution_request_id | Identifier of the related tool execution request | str | - | no |
| request_id | Identifier of this confirmation request | str | - | no |
ToolConfirmationResponse#
A tool confirmation response is received.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| tool | The Tool being executed | Tool | - | no |
| tool_execution_request_id | Identifier of the related tool execution request | str | - | no |
| request_id | Identifier of the confirmation request | str | - | no |
| execution_confirmed | Whether execution was confirmed | bool | - | no |
AgenticComponent events#
AgentExecutionStart#
Emitted when an AgentExecutionSpan starts.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| agent | The Agent being executed | Agent | - | no |
| inputs | Inputs used for the agent execution (one per input property) | dict[str, any] | - | yes |
AgentExecutionEnd#
Emitted when an AgentExecutionSpan ends.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| agent | The Agent being executed | Agent | - | no |
| outputs | Outputs produced by the agent (one per output property) | dict[str, any] | - | yes |
ManagerWorkersExecutionStart#
Emitted when a ManagerWorkersExecutionSpan starts.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| managerworkers | The ManagerWorkers being executed | ManagerWorkers | - | no |
| inputs | Inputs used for execution (one per input property) | dict[str, any] | - | yes |
ManagerWorkersExecutionEnd#
Emitted when a ManagerWorkersExecutionSpan ends.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| managerworkers | The ManagerWorkers being executed | ManagerWorkers | - | no |
| outputs | Outputs produced (one per output property) | dict[str, any] | - | yes |
SwarmExecutionStart#
Emitted when a SwarmExecutionSpan starts.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| swarm | The Swarm being executed | Swarm | - | no |
| inputs | Inputs used for the swarm execution (one per input property) | dict[str, any] | - | yes |
SwarmExecutionEnd#
Emitted when a SwarmExecutionSpan ends.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| swarm | The Swarm being executed | Swarm | - | no |
| outputs | Outputs produced (one per output property) | dict[str, any] | - | yes |
Flow events#
FlowExecutionStart#
Emitted when a FlowExecutionSpan starts.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| flow | The Flow being executed | Flow | - | no |
| inputs | Inputs used by the flow (one per StartNode input property) | dict[str, any] | - | yes |
FlowExecutionEnd#
Emitted when a FlowExecutionSpan ends.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| flow | The Flow being executed | Flow | - | no |
| outputs | Outputs produced by the flow (one per flow output property) | dict[str, any] | - | yes |
| branch_selected | Exit branch selected at the end of the Flow | str | - | no |
NodeExecutionStart#
Emitted when a NodeExecutionSpan starts.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| node | The Node being executed | Node | - | no |
| inputs | Inputs used by the node (one per node input property) | dict[str, any] | - | yes |
NodeExecutionEnd#
Emitted when a NodeExecutionSpan ends.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| node | The Node being executed | Node | - | no |
| outputs | Outputs produced by the node (one per node output property) | dict[str, any] | - | yes |
| branch_selected | Exit branch selected at the end of the Node | str | - | no |
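As an informal sketch of how these spans and events nest during a Flow run (constructor arguments and payloads are illustrative, not mandated by the specification):

# Sketch: a Flow run recorded as a FlowExecutionSpan with a nested NodeExecutionSpan.
# branch_selected records which exit branch was taken; flow, decide_node and the
# input/output payloads are illustrative.
with FlowExecutionSpan(name=f"FlowExecution - {flow.name}", flow=flow) as flow_span:
    flow_span.add_event(FlowExecutionStart(flow=flow, inputs=flow_inputs))
    with NodeExecutionSpan(name="NodeExecution - decide", node=decide_node) as node_span:
        node_span.add_event(NodeExecutionStart(node=decide_node, inputs=node_inputs))
        node_span.add_event(NodeExecutionEnd(node=decide_node, outputs=node_outputs, branch_selected="approved"))
    flow_span.add_event(FlowExecutionEnd(flow=flow, outputs=flow_outputs, branch_selected="done"))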
Conversation and control events#
ConversationMessageAdded#
A message was added to the conversation.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| message | The message added; must contain at least content and role, optionally sender | Message | - | yes |
ExceptionRaised#
An exception occurred during execution.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| exception_type | Type of the exception | str | - | no |
| exception_message | Exception message | str | - | yes |
| exception_stacktrace | Stacktrace of the exception, if available | Optional[str] | null | yes |
HumanInTheLoopRequest#
A human-in-the-loop intervention is required; execution is interrupted until a response is received.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| request_id | Identifier of the human-in-the-loop request | str | - | no |
| content | Request content forwarded to the user | dict[str, any] | {} (empty object) | yes |
HumanInTheLoopResponse#
A HITL response is received; execution resumes.
| Name | Description | Type | Default/Optional | Sensitive |
|---|---|---|---|---|
| request_id | Identifier of the HITL request | str | - | no |
| content | Response content received from the user | dict[str, any] | {} (empty object) | yes |
Deterministic identifiers and correlation#
Some events define correlation identifiers to allow consumers to link request/response and other events.
For example:
request_id: unique identifier of a single LLM or Tool execution request within a Span.
completion_id: identifier of a completion (LLM message or tool-call) which may receive streaming chunks.
tool_execution_request_id: identifier of a tool execution for confirmation.
Runtimes SHOULD ensure uniqueness within a Span and consistency across all related events.
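For instance, a consumer could use request_id to pair request and response events and derive per-request latency, as in the sketch below (event classes and attributes follow the tables above; the print target is illustrative):

# Sketch: pairing LLM request/response events via request_id to compute latency.
pending_requests: dict[str, int] = {}

def on_event(event, span) -> None:
    if isinstance(event, LlmGenerationRequest):
        pending_requests[event.request_id] = event.timestamp
    elif isinstance(event, LlmGenerationResponse):
        started = pending_requests.pop(event.request_id, None)
        if started is not None:
            latency_ms = (event.timestamp - started) / 1e6  # ns -> ms
            print(f"LLM generation {event.request_id} took {latency_ms:.1f} ms")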
PyAgentSpecTracing (Python materialization)#
The pyagentspec.tracing subpackage of pyagentspec provides convenient Pydantic-based models and interfaces so that:
Producers (adapters, runtimes) can emit spans/events according to Agent Spec Tracing standards.
Consumers (exporters, UIs) can receive and consume them via SpanProcessors.
Emitting traces (producer example)#
Here’s an example of how adapters can emit traces (i.e., start and close Spans, emit Events)
extracted from the AgentNode implementation of the LangGraph adapter in pyagentspec==26.1.0.
with AgentExecutionSpan(name=f"AgentExecution - {agentspec_agent.name}", agent=agentspec_agent) as span:
    span.add_event(AgentExecutionStart(inputs=inputs))
    result = agent.invoke(inputs, config)
    outputs = result.outputs if hasattr(result, "outputs") else {}
    span.add_event(AgentExecutionEnd(outputs=outputs))
Consuming traces (consumer example)#
class OpenTelemetrySpanProcessor(SpanProcessor):
    def __init__(self, sdk_processor: OtelSdkSpanProcessor):
        self._sdk_processor = sdk_processor

    def on_start(self, span: Span) -> None:
        otel_span = self._to_otel_span(span)
        otel_span.start(start_time=span.start_time)
        self._sdk_processor.on_start(span=otel_span)

    def on_end(self, span: Span) -> None:
        otel_span = self._to_otel_span(span)
        otel_span.end(end_time=span.end_time)
        self._sdk_processor.on_end(span=otel_span)

    def on_event(self, event: Event, span: Span) -> None:
        # Other processors may use this hook to stream events
        pass

    def startup(self) -> None:
        ...

    def shutdown(self) -> None:
        self._sdk_processor.shutdown()
Interoperability examples#
Tracing with LangGraph
from pyagentspec.adapters.langgraph import AgentSpecLoader
from openinference_spanprocessor import ArizePhoenixSpanProcessor
# Assuming this package implements a SpanProcessor that takes the Agent Spec Traces and sends them to an Arize Phoenix server

agent_json = read_json_file("my/agentspec/agent.json")
processor = ArizePhoenixSpanProcessor(mask_sensitive_information=False, project_name="agentspec-tracing-test")

with Trace(name="agentspec_langgraph_demo", span_processors=[processor]) as trace:
    agent = AgentSpecLoader().load_json(agent_json)
    result = agent.invoke({"inputs": {}, "messages": []})
Tracing with WayFlow
from wayflowcore.agentspec import AgentSpecLoader
from wayflowcore.agentspec.tracing import AgentSpecTracingEventListener
from wayflowcore.events.eventlistener import register_event_listeners
from openinference_spanprocessor import ArizePhoenixSpanProcessor

agent_json = read_json_file("my/agentspec/agent.json")
processor = ArizePhoenixSpanProcessor(mask_sensitive_information=False, project_name="agentspec-tracing-test")

with register_event_listeners([AgentSpecTracingEventListener()]):
    with Trace(name="agentspec_wayflow_demo", span_processors=[processor]) as trace:
        agent = AgentSpecLoader().load_json(agent_json)
        conversation = agent.start_conversation()
        status = conversation.execute()
Security Considerations#
Agent Spec Tracing inherits all security requirements from Agent Spec (see Security Considerations). Additionally, tracing frequently includes potentially sensitive information (PII), including, but not limited to:
LLM prompts and generated content
Tool inputs/outputs
Exception messages and stacktraces
Conversation messages
Implementing a SpanProcessor#
Key points:
Async / non-blocking - keep the span processor off the critical path to avoid impacting the agent's performance.
Robust error handling - never raise from span processor methods; drop or queue on failure.
Back-pressure - apply rate limits, size limits, or batch Spans and Events to avoid DoS on the collector.
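A sketch of a processor following these guidelines (the export function is a hypothetical placeholder for whatever collector or backend is targeted):

import queue
import threading

# Sketch: a non-blocking SpanProcessor. Callbacks only enqueue; a worker thread exports;
# overflow and exporter failures are dropped rather than raised.
class QueueingSpanProcessor(SpanProcessor):
    def __init__(self, max_items: int = 10_000) -> None:
        self._queue: queue.Queue = queue.Queue(maxsize=max_items)
        self._worker = threading.Thread(target=self._drain, daemon=True)

    def startup(self) -> None:
        self._worker.start()

    def shutdown(self) -> None:
        self._queue.put(None)              # sentinel to stop the worker
        self._worker.join(timeout=5)

    def on_start(self, span) -> None:
        self._enqueue(("start", span))

    def on_end(self, span) -> None:
        self._enqueue(("end", span))

    def on_event(self, event, span) -> None:
        self._enqueue(("event", event, span))

    def _enqueue(self, item) -> None:
        try:
            self._queue.put_nowait(item)   # never block the agent's execution path
        except queue.Full:
            pass                           # back-pressure: drop on overflow

    def _drain(self) -> None:
        while True:
            item = self._queue.get()
            if item is None:
                break
            try:
                export(item)               # hypothetical export call to a collector
            except Exception:
                pass                       # never propagate exporter failures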
Sensitive fields#
Each event table above flags attributes that are considered sensitive. Producers SHOULD mark and/or emit them using Agent Spec’s Sensitive Field mechanism where applicable; consumers (SpanProcessors) SHOULD:
Mask or omit sensitive fields by default when exporting traces.
Provide an explicit configuration to unmask for trusted environments.
Avoid logging or exporting sensitive data inadvertently.
Guidelines:
Attribute-level masking: redact entire values or apply irreversible masking (e.g., replace content with fixed placeholders or hashes, as policy dictates).
Downstream mappers (e.g., OpenTelemetry exporters) MUST NOT downgrade masking guarantees.
When masking affects correlation (e.g., truncating request_id), preserve minimal non-sensitive identifiers for linkage.
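A sketch of attribute-level masking applied before export (the set of sensitive field names mirrors the tables above; the placeholder value and the unmask switch are illustrative):

# Sketch: masking sensitive attributes before exporting them.
SENSITIVE_FIELDS = {
    "prompt", "content", "inputs", "outputs", "output", "tool_calls",
    "message", "exception_message", "exception_stacktrace",
}

def mask_attributes(attributes: dict, unmask: bool = False) -> dict:
    if unmask:
        return attributes  # explicit opt-in for trusted environments only
    return {
        key: "***MASKED***" if key in SENSITIVE_FIELDS else value
        for key, value in attributes.items()
    }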
Design Notes and Best Practices#
Event emission ordering: Within a span, events SHOULD be in timestamp order.
Time units: Use nanoseconds since epoch for timestamps and start/end times for consistency with common tracing systems.
Nesting: Prefer nesting spans to represent sub-operations (e.g., NodeExecutionSpan under FlowExecutionSpan, ToolExecutionSpan under AgentExecutionSpan).
Exceptions: Emit ExceptionRaised with type/message/stacktrace and consider adding it before ending the current span or on a dedicated error span, depending on runtime design.
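A sketch of this pattern, mirroring the producer example above (the tool, inputs, and run_tool helper are illustrative):

import traceback

# Sketch: recording a failure as an ExceptionRaised event before the span ends.
with ToolExecutionSpan(name="ToolExecution - search", tool=search_tool) as span:
    try:
        span.add_event(ToolExecutionRequest(tool=search_tool, request_id="req-1", inputs=inputs))
        result = run_tool(search_tool, inputs)  # hypothetical helper executing the tool
        span.add_event(ToolExecutionResponse(tool=search_tool, request_id="req-1", output=result))
    except Exception as exc:
        span.add_event(ExceptionRaised(
            exception_type=type(exc).__name__,
            exception_message=str(exc),
            exception_stacktrace=traceback.format_exc(),
        ))
        raise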
FAQ and Open Questions#
Naming alignment with observability#
This specification uses tracing terminology common in OpenTelemetry (Trace, Span, Event, SpanProcessor) to leverage community familiarity.
Environment context#
This version focuses on agentic execution tracing. Future versions may add execution-environment spans or include environment metadata on Trace.
References and Cross-links#
Agent Spec language specification: Agent Spec specification (nightly version 26.1.0.dev4)
Security guidelines: Security Considerations