Open Agent Specification Tracing (Agent Spec Tracing)#

Overview#

Open Agent Specification Tracing (short: Agent Spec Tracing) is an extension of Agent Spec that standardizes how agent and flow executions emit traces. It defines a unified, implementation-agnostic semantic for:

Events: structured, point-in-time facts.
Spans: time-bounded execution contexts that group events.
Traces: trees of spans that represent an end-to-end execution.
SpanProcessors: pluggable hooks to consume spans and events (e.g., export to UIs, tracing backends, or logs).

Agent Spec Tracing enables:

Runtime adapters to emit consistent traces across different frameworks.
Consumers (observability backends, UIs, developer tooling) to ingest one standardized format regardless of the producer.

Agent Spec Tracing aligns with widely used observability concepts (e.g., OpenTelemetry), while grounding definitions in Agent Spec components and semantics. It specifies what spans and events to emit, when to emit them, and which attributes to include, including which attributes are sensitive.

Scope and goals#

Provide a canonical list of span and event types for Agent Spec runtimes.
Define lifecycle and attribute schema for each span/event.
Identify sensitive fields and how they should be handled.
Provide a minimal API surface for producers and consumers.
Remain neutral to storage/transport (Telemetry, UIs, files, etc.).

Core Concepts#

Event#

An Event is an atomic episode that occurs at a specific time instant. It always belongs to exactly one Span.

Events have a definition similar to the Agent Spec Components, and they have the same descriptive fields: id, name, description, type, and metadata. Additionally, they require a timestamp that defines when the event occurred, and extensions of this class can add more attributes based on the event they represent, aimed at preserving all the relevant information related to the event being recorded. Events can also have Sensitive fields, that are declared and must be handled per Agent Spec guidelines.

class Event:
    id: str  # Unique identifier for the event. Typically generated from component type plus a unique token.
    type: str  # Concrete type specifier for this event
    name: str  # Name of the event, if applicable
    description: str  # Description of the event, if applicable
    metadata: Dictionary[str, Any]  # Additional metadata that could be used for extra information
    timestamp: int  # nanoseconds since epoch

timestamp: time of occurrence (ns). Producers should use monotonic clocks where possible and/or convert to wall-clock ns as configured by the runtime.

Agent Spec Tracing defines a set of Event types with specific attributes, that you can find in the following sections.

Span#

A Span defines a time-bounded execution context. Each Span:

starts at start_time (ns), ends at end_time (ns), end_time can be null if not closed.
can contain zero or more Events.
can be nested (child span has a parent span).

Also Spans have a definition similar to the Agent Spec Components, and they share the same descriptive fields: id, name, description, type, and metadata. Extensions of this Span can add more attributes based on the Span they represent. Attributes on a Span typically reflect configuration that applies to the whole duration of the Span (e.g., the Agent being executed, the LLM config, etc.). Spans can also have Sensitive fields, that are declared and must be handled per Agent Spec guidelines.

Besides these attributes, Spans MUST also implement the following interface:

class Span:
    id: str  # Unique identifier for the span. Typically generated from component type plus a unique token.
    type: str  # Concrete type specifier for this span.
    name: str  # Name of the span, if applicable
    description: str  # Description of the span, if applicable
    metadata: Dictionary[str, Any]  # Additional metadata that could be used for extra information
    start_time: int
    end_time: Optional[int]
    events: List[Event]

    def start(self) -> None: ...

    def end(self) -> None: ...

    def add_event(self, event: Event) -> None: ...

start: called when the span starts.
shutdown: called when the span ends.
add_event: called when an event has to be added to this Span. It MUST append the event in the events attribute.

Lifecycle rules:

Spans MUST have start_time when started, while end_time is set on end.
Events added to a Span MUST have timestamps within [start_time, end_time], whenever end_time is known.
Spans MAY contain child spans. Implementations SHOULD propagate correlation context so consumers can rebuild the tree.

Agent Spec Tracing defines a set of Span types with specific attributes, that you can find in the following sections.

SpanProcessor#

A SpanProcessor receives callbacks when Spans start/end and when Events are added. Processors are meant to consume the Agent Spec traces (spans, events) emitted by the runtime adapter during the execution. They can be used to export traces to third parties consumers (e.g., to OpenTelemetry, files, UIs).

A SpanProcessor MUST implement the following interface.

class SpanProcessor(ABC):

    def on_start(self, span: Span) -> None: ...

    def on_end(self, span: "Span") -> None: ...

    def on_event(self, event: Event, span: Span) -> None: ...

    def startup(self) -> None: ...

    def shutdown(self) -> None: ...

startup: called when an Agent Spec Tracing session starts.
shutdown: called when an Agent Spec Tracing session ends.
on_start: called when a Span starts.
on_end: called when a Span ends.
on_event: called when an Event is added to a Span.

Trace#

A Trace groups all spans and events that belong to the same top-level assistant execution. It is the root where all the SpanProcessors that must be active during the assistant execution are declared.

Opening a Trace SHOULD call SpanProcessor.startup() on all configured processors.
Closing a Trace SHOULD call SpanProcessor.shutdown() on all configured processors.

Standard Span Types#

This first version defines the following span types. All spans inherit the attributes of the base Span class. The attributes listed here are additional and span-specific. Fields marked sensitive MUST be handled as described in Security Considerations.

LlmGenerationSpan#

Covers the whole LLM generation process.

Starts: when the LLM generation request is received and the LLM call is performed.
Ends: when the LLM output has been generated and is ready to be processed.

Attributes:

Name	Description	Type	Default	Sensitive
llm_config	The LlmConfig performing the generation	LlmConfig		no

ToolExecutionSpan#

Covers a tool execution (excluding client-side tools executed by the UI/client).

Starts: when tool execution starts.
Ends: when the tool execution completes and the result is ready to be processed.

Attributes:

Name	Description	Type	Default	Sensitive
tool	The Tool being executed	Tool		no

AgentExecutionSpan#

Represents the execution of an Agent. May be nested for sub-agents.

Starts: when the agent execution starts.
Ends: when the agent execution completes and outputs are ready to process.

Attributes:

Name	Description	Type	Default	Sensitive
agent	The Agent being executed	Agent		no

SwarmExecutionSpan#

Specialization of AgentExecutionSpan for a Swarm Component.

Starts: when swarm execution starts.
Ends: when swarm execution completes and outputs are ready.

Attributes:

Name	Description	Type	Default	Sensitive
swarm	The Swarm being executed	Swarm		no

ManagerWorkersExecutionSpan#

Specialization of AgentExecutionSpan for a Manager-Workers pattern.

Starts: when Manager-Workers execution starts.
Ends: when the execution completes and outputs are ready.

Attributes:

Name	Description	Type	Default	Sensitive
managerworkers	The ManagerWorkers being executed	ManagerWorkers		no

FlowExecutionSpan#

Covers the execution of a Flow.

Starts: when the Flow’s StartNode execution starts.
Ends: when one of the Flow’s EndNode executions finishes.

Attributes:

Name	Description	Type	Default	Sensitive
flow	The Flow being executed	Flow		no

NodeExecutionSpan#

Covers the execution of a single Node within a Flow.

Starts: when the Node execution starts on the given inputs.
Ends: when the Node execution ends and outputs are ready.

Attributes:

Name	Description	Type	Default	Sensitive
node	The Node being executed	Node		no

Standard Event Types#

All events are inherit the attributes defined in the base Event class. The following events define the default set for Agent Spec Tracing. For each, we specify when it is emitted and which attributes it carries. Fields marked sensitive MUST be handled as described in Security Considerations.

LLM events#

LlmGenerationRequest#

An LLM generation request was received. Emitted when an LlmGenerationSpan starts.

Attributes:

Name	Description	Type	Default/Optional	Sensitive
llm_config	The LlmConfig performing the generation	LlmConfig		no
request_id	Identifier of the generation request	str		no
llm_generation_config	The LLM generation parameters used for this call	Optional[LlmGenerationConfig]	null	no
prompt	Prompt that will be sent to the LLM; a list of Message with at least content and role, optionally sender	List[Message]		yes
tools	Tools sent as part of the generation request	Optional[List[Tool]]	null	no

The Message model should be implemented as

class Message(BaseModel):
    """Model used to specify LLM message details in events and spans"""

    id: Optional[str] = None
    "Identifier of the message"

    content: str
    "Content of the message"

    sender: Optional[str] = None
    "Sender of the message"

    role: str
    "Role of the sender of the message. Typically 'user', 'assistant', or 'system'"

LlmGenerationResponse#

An LLM response was received. Emitted when an LlmGenerationSpan ends.

Name	Description	Type	Default/Optional	Sensitive
llm_config	The LlmConfig performing the generation	LlmConfig		no
request_id	Identifier of the generation request	str		no
tool_calls	Tool calls returned by the LLM. Each tool call should contain a `call_id` identifier, the `tool_name`, and the `arguments` with which the tool is being called as a string in JSON format.	List[ToolCall]		yes
completion_id	Identifier of the completion related to this response	Optional[str]	null	no
content	The content of the response (assistant message text)	str		yes

The ToolCall model should be implemented as

class ToolCall(BaseModel):
    """Model for an LLM tool call."""

    call_id: str
    "Identifier of the tool call"

    tool_name: str
    "The name of the tool that should be called"

    arguments: str
    "The values of the arguments that should be passed to the tool, in JSON format"

LlmGenerationStreamingChunkReceived#

A streamed chunk was received during LLM generation.

Name	Description	Type	Default/Optional	Sensitive
llm_config	The LlmConfig performing the generation	LlmConfig		no
request_id	Identifier of the generation request	str		no
tool_calls	Tool calls chunked by the LLM. Each tool call should contain a `call_id` identifier, the `tool_name`, and the `arguments` with which the tool is being called as a string in JSON format. The content of arguments should be considered as the delta compared to the last chunk received.	List[ToolCall]		yes
completion_id	Identifier of the parent completion (message or tool call) this chunk belongs to	Optional[str]	null	no
content	The chunk content. This is the delta compared to the last chunk received.	str		yes

Tool events#

ToolExecutionRequest#

A tool execution request is received. Emitted when a ToolExecutionSpan starts (or a client tool is requested).

Name	Description	Type	Sensitive
tool	The Tool being executed	Tool	no
request_id	Identifier of the tool execution request	str	no
inputs	Input values for the tool (one per input property)	dict[str, any]	yes

ToolExecutionResponse#

A tool execution finishes and a result is received. Emitted when a ToolExecutionSpan ends (or a client tool result is received).

Name	Description	Type	Sensitive
tool	The Tool being executed	Tool	no
request_id	Identifier of the tool execution request	str	no
output	Return value produced by the tool (one per output property)	dict[str, any]	yes

ToolConfirmationRequest#

A tool confirmation is requested (e.g., human-in-the-loop approval before execution).

Name	Description	Type	Sensitive
tool	The Tool being executed	Tool	no
tool_execution_request_id	Identifier of the related tool execution request	str	no
request_id	Identifier of this confirmation request	str	no

ToolConfirmationResponse#

A tool confirmation response is received.

Name	Description	Type	Sensitive
tool	The Tool being executed	Tool	no
tool_execution_request_id	Identifier of the related tool execution request	str	no
request_id	Identifier of the confirmation request	str	no
execution_confirmed	Whether execution was confirmed	bool	no

AgenticComponent events#

AgentExecutionStart#

Emitted when an AgentExecutionSpan starts.

Name	Description	Type	Default/Optional	Sensitive
agent	The Agent being executed	Agent		no
inputs	Inputs used for the agent execution (one per input property)	dict[str, any]		yes

AgentExecutionEnd#

Emitted when an AgentExecutionSpan ends.

Name	Description	Type	Default/Optional	Sensitive
agent	The Agent being executed	Agent		no
outputs	Outputs produced by the agent (one per output property)	dict[str, any]		yes

ManagerWorkersExecutionStart#

Emitted when a ManagerWorkersExecutionSpan starts.

Name	Description	Type	Default/Optional	Sensitive
managerworkers	The ManagerWorkers being executed	ManagerWorkers		no
inputs	Inputs used for execution (one per input property)	dict[str, any]		yes

ManagerWorkersExecutionEnd#

Emitted when a ManagerWorkersExecutionSpan ends.

Name	Description	Type	Default/Optional	Sensitive
managerworkers	The ManagerWorkers being executed	ManagerWorkers		no
outputs	Outputs produced (one per output property)	dict[str, any]		yes

SwarmExecutionStart#

Emitted when a SwarmExecutionSpan starts.

Name	Description	Type	Default/Optional	Sensitive
swarm	The Swarm being executed	Swarm		no
inputs	Inputs used for the swarm execution (one per input property)	dict[str, any]		yes

SwarmExecutionEnd#

Emitted when a SwarmExecutionSpan ends.

Name	Description	Type	Default/Optional	Sensitive
swarm	The Swarm being executed	Swarm		no
outputs	Outputs produced (one per output property)	dict[str, any]		yes

Flow events#

FlowExecutionStart#

Emitted when a FlowExecutionSpan starts.

Name	Description	Type	Default/Optional	Sensitive
flow	The Flow being executed	Flow		no
inputs	Inputs used by the flow (one per StartNode input property)	dict[str, any]		yes

FlowExecutionEnd#

Emitted when a FlowExecutionSpan ends.

Name	Description	Type	Sensitive
flow	The Flow being executed	Flow	no
outputs	Outputs produced by the flow (one per flow output property)	dict[str, any]	yes
branch_selected	Exit branch selected at the end of the Flow	str	no

NodeExecutionStart#

Emitted when a NodeExecutionSpan starts.

Name	Description	Type	Default/Optional	Sensitive
node	The Node being executed	Node		no
inputs	Inputs used by the node (one per node input property)	dict[str, any]		yes

NodeExecutionEnd#

Emitted when a NodeExecutionSpan ends.

Name	Description	Type	Sensitive
node	The Node being executed	Node	no
outputs	Outputs produced by the node (one per node output property)	dict[str, any]	yes
branch_selected	Exit branch selected at the end of the Node	str	no

Conversation and control events#

ConversationMessageAdded#

A message was added to the conversation.

Name	Description	Type	Default/Optional	Sensitive
message	The message added; must contain at least content and role, optionally sender	Message		yes

ExceptionRaised#

An exception occurred during execution.

Name	Description	Type	Default/Optional	Sensitive
exception_type	Type of the exception	str		no
exception_message	Exception message	str		yes
exception_stacktrace	Stacktrace of the exception, if available	Optional[str]	null	yes

HumanInTheLoopRequest#

A human-in-the-loop intervention is required; execution is interrupted until a response.

Name	Description	Type	Default/Optional	Sensitive
request_id	Identifier of the human-in-the-loop request	str		no
content	Request content forwarded to the user	dict[str, any]	{} (empty object)	yes

HumanInTheLoopResponse#

A HITL response is received; execution resumes.

Name	Description	Type	Default/Optional	Sensitive
request_id	Identifier of the HITL request	str		no
content	Response content received from the user	dict[str, any]	{} (empty object)	yes

Deterministic identifiers and correlation#

Some events define correlation identifiers to allow consumers to link request/response and other events.

For example:

request_id: unique identifier of a single LLM or Tool execution request within a Span.
completion_id: identifier of a completion (LLM message or tool-call) which may receive streaming chunks.
tool_execution_request_id: identifier of a tool execution for confirmation.

Runtimes SHOULD ensure uniqueness within a Span and consistency across all related events.

PyAgentSpecTracing (Python materialization)#

The pyagentspec.tracing subpackage of pyagentspec provides convenient Pydantic-based models and interfaces so that:

Producers (adapters, runtimes) can emit spans/events according to Agent Spec Tracing standards.
Consumers (exporters, UIs) can receive and consume them via SpanProcessors.

Emitting traces (producer example)#

Here’s an example of how adapters can emit traces (i.e., start and close Spans, emit Events) extracted from the AgentNode implementation of the LangGraph’s adapter in pyagentspec==26.1.0.

with AgentExecutionSpan(name=f"AgentExecution - {agentspec_agent.name}", agent=agentspec_agent) as span:
    span.add_event(AgentExecutionStart(inputs=inputs))
    result = agent.invoke(inputs, config)
    outputs = result.outputs if hasattr(result, "outputs") else {}
    span.add_event(AgentExecutionEnd(outputs=outputs))

Consuming traces (consumer example)#

class OpenTelemetrySpanProcessor(SpanProcessor):

    def __init__(self, sdk_processor: OtelSdkSpanProcessor):
        self._sdk_processor = sdk_processor

    def on_start(self, span: "Span") -> None:
        otel_span = self._to_otel_span(span)
        otel_span.start(start_time=span.start_time)
        self._sdk_processor.on_start(span=otel_span)

    def on_end(self, span: "Span") -> None:
        otel_span = self._to_otel_span(span)
        otel_span.end(end_time=span.end_time)
        self._sdk_processor.on_end(span=otel_span)

    def on_event(self, event: Event, span: Span) -> None:
        # Other processors may use this hook to stream events
        pass

    def startup(self) -> None:
        ...

    def shutdown(self) -> None:
        self._sdk_processor.shutdown

Interoperability examples#

Tracing with LangGraph

from pyagentspec.adapters.langgraph import AgentSpecLoader
from openinference_spanprocessor import ArizePhoenixSpanProcessor
# Assuming this package implements a SpanProcessor that takes the Agent Spec Traces and sends them to a Phoenix Arize server

agent_json = read_json_file("my/agentspec/agent.json")
processor = ArizePhoenixSpanProcessor(mask_sensitive_information=False, project_name="agentspec-tracing-test")

with Trace(name="agentspec_langgraph_demo", span_processors=[processor]) as trace:
    agent = AgentSpecLoader().load_json(agent_json)
    result = agent.invoke({"inputs": {}, "messages": []})

Tracing with WayFlow

from wayflowcore.agentspec import AgentSpecLoader
from wayflowcore.agentspec.tracing import AgentSpecTracingEventListener
from wayflowcore.events.eventlistener import register_event_listeners
from openinference_spanprocessor import ArizePhoenixSpanProcessor

agent_json = read_json_file("my/agentspec/agent.json")
processor = ArizePhoenixSpanProcessor(mask_sensitive_information=False, project_name="agentspec-tracing-test")

with register_event_listeners([AgentSpecTracingEventListener()]):
    with Trace(name="agentspec_wayflow_demo", span_processors=[processor]) as trace:
        agent = AgentSpecLoader().load_json(agent_json)
        conversation = agent.start_conversation()
        status = conversation.execute()

Security Considerations#

Agent Spec Tracing inherits all security requirements from Agent Spec (see Security Considerations). Additionally, tracing frequently includes potentially sensitive information (PII), including, but not limited to:

LLM prompts and generated content
Tool inputs/outputs
Exception messages and stacktraces
Conversation messages

Implementing a SpanProcessor#

Key points:

Async / non-blocking - keep the span processor off the critical path to avoid impacting agent’s performance.
Robust error handling - never raise from span processor methods; drop or queue on failure.
Back-pressure - apply rate limits, size limits, or batch Spans and Events to avoid DoS on the collector.

Sensitive fields#

Each event table above flags attributes that are considered sensitive. Producers SHOULD mark and/or emit them using Agent Spec’s Sensitive Field mechanism where applicable; consumers (SpanProcessors) SHOULD:

Mask or omit sensitive fields by default when exporting traces.
Provide an explicit configuration to unmask for trusted environments.
Avoid logging or exporting sensitive data inadvertently.

Guidelines:

Attribute-level masking: redact entire values or apply strongly irreversible masking (e.g., replace content with fixed placeholders or hashes as policy dictates).
Downstream mappers (e.g., OpenTelemetry exporters) MUST NOT downgrade masking guarantees.
When masking affects correlation (e.g., truncating request_id), preserve minimal non-sensitive identifiers for linkage.

Design Notes and Best Practices#

Event emission ordering: Within a span, events SHOULD be in timestamp order.
Time units: Use nanoseconds since epoch for timestamps and start/end times for consistency with common tracing systems.
Nesting: Prefer nesting spans to represent sub-operations (e.g., NodeExecutionSpan under FlowExecutionSpan, ToolExecutionSpan under AgentExecutionSpan).
Exceptions: Emit ExceptionRaised with type/message/stacktrace and consider adding it before ending the current span or on a dedicated error span, depending on runtime design.

FAQ and Open Questions#

Naming alignment with observability#

This specification uses tracing terminology common in OpenTelemetry (Trace, Span, Event, SpanProcessor) to leverage community familiarity.

Environment context#

This version focuses on agentic execution tracing. Future versions may add execution-environment spans or include environment metadata on Trace.

References and Cross-links#

Agent Spec language specification: Agent Spec specification (nightly version 26.1.0.dev0)
Security guidelines: Security Considerations