feat(openai-agents): Support span streaming by alexander-alderman-webb · Pull Request #6404 · getsentry/sentry-python

alexander-alderman-webb · 2026-05-26T06:10:05Z

Note: Depends on #6424

Description

Use sentry_sdk.traces.start_span, replace Span.set_data() with StreamedSpan.set_attribute() and Span.set_status(SPANSTATUS.INTERNAL_ERROR) with StreamedSpan.status = SpanStatus.ERROR when in span streaming mode.

Parametrize tests on the trace lifecycle option.
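A minimal sketch of the mapping described above. The two span classes and the status strings are stand-ins written for this example, not the SDK's real classes or constants. Only the method and attribute names (set_data, set_status, set_attribute, status) come from the description.

```python
class LegacySpan:
    """Stand-in for the transaction-based Span (illustrative only)."""

    def __init__(self):
        self.data = {}
        self.status = None

    def set_data(self, key, value):
        self.data[key] = value

    def set_status(self, status):
        self.status = status


class StreamedSpan:
    """Stand-in for the span type used in span streaming mode (illustrative only)."""

    def __init__(self):
        self.attributes = {}
        self.status = None

    def set_attribute(self, key, value):
        self.attributes[key] = value


# Placeholder values standing in for SPANSTATUS.INTERNAL_ERROR and SpanStatus.ERROR.
INTERNAL_ERROR = "internal_error"
ERROR = "error"


def set_span_value(span, key, value):
    # Streamed spans carry attributes; legacy spans carry data.
    if isinstance(span, StreamedSpan):
        span.set_attribute(key, value)
    else:
        span.set_data(key, value)


def mark_span_failed(span):
    # Streamed spans take the status by assignment; legacy spans use a setter.
    if isinstance(span, StreamedSpan):
        span.status = ERROR
    else:
        span.set_status(INTERNAL_ERROR)
```

Keeping the type check in one pair of helpers is only one possible layout; the PR itself may branch at each call site instead.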

Issues

Closes #6041

Reminders

Please add tests to validate your changes, and lint your code using tox -e linters.
Add GH Issue ID & Linear ID (if applicable)
PR title should use conventional commit style (feat:, fix:, ref:, meta:)
For external contributors: CONTRIBUTING.md, Sentry SDK development docs, Discord community

github-actions · 2026-05-26T06:11:12Z

Codecov Results 📊

✅ 89597 passed | ⏭️ 5956 skipped | Total: 95553 | Pass Rate: 93.77% | Execution Time: 305m 16s

📊 Comparison with Base Branch

Metric	Change
Total Tests	📈 +400
Passed Tests	📈 +457
Failed Tests	—
Skipped Tests	📉 -57

All tests are passing successfully.

✅ Patch coverage is 95.16%. Project has 2395 uncovered lines.
✅ Project coverage is 89.82%. Comparing base (base) to head (head).

Files with missing lines (5)

File	Patch %	Lines
sentry_sdk/integrations/openai_agents/patches/agent_run.py	85.71%	⚠️ 1 Missing and 3 partials
sentry_sdk/integrations/openai_agents/patches/runner.py	92.86%	⚠️ 1 Missing and 3 partials
sentry_sdk/integrations/openai_agents/spans/handoff.py	83.33%	⚠️ 2 Missing and 2 partials
sentry_sdk/integrations/openai_agents/patches/error_tracing.py	100.00%	⚠️ 2 partials
sentry_sdk/integrations/openai_agents/utils.py	92.59%	⚠️ 2 Missing

Coverage diff

@@            Coverage Diff             @@
##          main       #PR       +/-##
==========================================
+ Coverage    89.81%    89.82%    +0.01%
==========================================
  Files          192       192         —
  Lines        23457     23520       +63
  Branches      8060      8092       +32
==========================================
+ Hits         21066     21125       +59
- Misses        2391      2395        +4
- Partials      1328      1331        +3

Generated by Codecov Action

ericapisani

Overall this is looking good, but there are a few things to address: potential bugs in a couple of the updated conditionals, and inconsistency in how tests process spans (some assert the total number of spans and the order in which they appear, others don't).

ericapisani · 2026-05-26T13:15:00Z

+        span is None
+        or isinstance(span, StreamedSpan)
+        and span.end_timestamp is not None
+        or not isinstance(span, StreamedSpan)
+        and span.timestamp is not None


This is a bit difficult to read, and I think it could lead to an accidental bug because of the order of precedence of the and and or keywords (the former binds more tightly than the latter).

Reading this, I'm not sure if the intention was to have the statement evaluate as isinstance(span, StreamedSpan) and span.end_timestamp is not None rather than (span is None or isinstance(span, StreamedSpan)) and ...

Fair point about lack of intentionality.
The intention is for the condition to evaluate to true if the end timestamp has not yet been set, indicating that the span is still active. The conditional is complex because Span.timestamp and StreamedSpan.end_timestamp both represent the end timestamp (the instance variable was renamed).
I've added brackets in e243b86.
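To make the precedence point concrete, here is a sketch of the quoted fragment with explicit brackets. LegacySpan and StreamedSpan are stand-in dataclasses invented for this example, and is_missing_or_finished is a hypothetical name. How the surrounding code uses the result is not shown in the thread.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class LegacySpan:
    """Stand-in for Span: the end time lives in `timestamp`."""

    timestamp: Optional[float] = None


@dataclass
class StreamedSpan:
    """Stand-in for StreamedSpan: the end time lives in `end_timestamp`."""

    end_timestamp: Optional[float] = None


def is_missing_or_finished(span: Union[LegacySpan, StreamedSpan, None]) -> bool:
    # `and` binds more tightly than `or`, so these brackets do not change
    # the result; they only make the three alternatives visible.
    return (
        span is None
        or (isinstance(span, StreamedSpan) and span.end_timestamp is not None)
        or (not isinstance(span, StreamedSpan) and span.timestamp is not None)
    )
```

The other reading raised in the review, (span is None or isinstance(span, StreamedSpan)) and span.end_timestamp is not None, would raise an AttributeError when span is None, because the right-hand side would then be evaluated on None.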

ericapisani · 2026-05-26T13:19:34Z

+            name=f"chat {model_name}",
+            origin=SPAN_ORIGIN,
+        )
+        # TODO-anton: remove hardcoded stuff and replace something that also works for embedding and so on


I'm not sure how useful this comment is anymore - any chance we can remove it?

I think this is still a bug 😬.
Created #6417 to track.

ericapisani · 2026-05-26T14:22:44Z

+    span_streaming = has_span_streaming_enabled(sentry_sdk.get_client().options)
+    if span_streaming:
+        with sentry_sdk.traces.start_span(
+            name=f"handoff from {from_agent.name} to {to_agent_name}",


We have a gen_ai.agent.name attribute, and I think it should be set here with a value of from_agent.name (since that agent originates the action the span represents).

I'd have the same instinct to add an attribute to avoid users matching on the name.

However, we're removing the workflow span entirely per the RFC.

ericapisani · 2026-05-26T14:27:39Z

+            assert result.final_output == "Hello, how can I help you?"
+
+        sentry_sdk.flush()
+        spans = [item.payload for item in items]


Even if we're only interested in the invoke agent span and ai client span, we should still assert the number of expected spans here since this should be stable. If there are more or fewer spans, we'd want this test to fail.

I'd consider this out of scope for this PR because the existing test did not assert the total number of spans. The goal of the PR is to port the integration to support span streaming (and make the same test guarantees as before).

I'm not sure if it's always been the intention to open another pull request to introduce the stricter checks, but if so, please add that in the pull request descriptions going forward so it's clear if it's an intentional decision or an accidental oversight.

As we've been introducing these stricter checks as part of other streamed span integration migrations and code review comments have been left when it's not been done, it appears to me that these stricter checks are part of the migration process, not out of scope.

In the case of this change set it's especially noticeable because it's being applied in some places and not others.

I was neither intending to open another PR nor was this an accidental oversight.

Going by the migration guide in #5386 (comment) I'm adding test code equivalent to the tests for the transaction based traces. (emphasis mine below).

Parametrize all tests that deal with tracing to execute both in legacy and in span streaming mode. Add equivalent assertions in streaming mode. You can use the capture_items fixture for unwrapping envelope item items automatically.

The two instances I'm aware of where this has been brought up in reviews (#6206 (comment) and #6123 (comment)) were specifically pointing out looser assertions compared to the legacy path.

Based on the information I have seen, broader changes to the tests are not part of the span first migration. If they are, let's update the integration guide.

100% agree that the change set is a pain to read (and it's frustrating that code written so recently has so much tech debt).

With regards to porting tests, if the existing tests don't assert on the number/order of spans currently, fine to also not do that in scope of the span first conversion.

What I want to avoid during the migration is losing assertions compared to the legacy path. If they weren't there in the first place, we don't need to add them. (It's def nice to have though, and can be done low-effort with a clanker.)
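For reference, the stricter style discussed in this thread could look roughly like the sketch below. The span payloads are hand-written dicts rather than real captured items, the op strings are assumed placeholder values for OP.GEN_AI_INVOKE_AGENT and OP.GEN_AI_CHAT, and find_span and ops_in_order are hypothetical helpers.

```python
def ops_in_order(spans):
    """Return the sentry.op of every span, in the order the spans were captured."""
    return [span["attributes"]["sentry.op"] for span in spans]


def find_span(spans, op):
    """Return the single span with the given op, failing if there are zero or several."""
    matches = [span for span in spans if span["attributes"]["sentry.op"] == op]
    assert len(matches) == 1, f"expected exactly one {op!r} span, got {len(matches)}"
    return matches[0]


# Hand-written payloads for illustration; a real test would build this list
# from the captured items instead.
spans = [
    {"name": "chat gpt-4", "attributes": {"sentry.op": "gen_ai.chat"}},
    {"name": "invoke_agent test", "attributes": {"sentry.op": "gen_ai.invoke_agent"}},
]

# Strict checks: total count and order, then per-span lookups.
assert len(spans) == 2
assert ops_in_order(spans) == ["gen_ai.chat", "gen_ai.invoke_agent"]
ai_client_span = find_span(spans, "gen_ai.chat")
invoke_agent_span = find_span(spans, "gen_ai.invoke_agent")
```

Unlike next(...) over a generator, find_span fails when a second matching span appears, which is the kind of regression the count assertion is meant to catch.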

ericapisani · 2026-05-26T14:37:47Z

+        invoke_agent_span = next(
+            span
+            for span in spans
+            if span["attributes"]["sentry.op"] == OP.GEN_AI_INVOKE_AGENT
+        )
+        ai_client_span = next(
+            span for span in spans if span["attributes"]["sentry.op"] == OP.GEN_AI_CHAT
+        )


As I'm reviewing, I'm noticing some test cases with this pattern, and other test cases with this pattern.

Is there a reason why some are implemented one way vs the other? Is it to avoid assigning spans to variables that won't be asserted on?

I have no idea what the authors of the integration were thinking. I have not made a conscious decision in favor of one or the other style.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 515f38a.

ericapisani

Other than my one comment, LGTM. Approving so as not to block.

ericapisani · 2026-06-11T14:39:48Z

+    span_streaming = has_span_streaming_enabled(sentry_sdk.get_client().options)
+    if span_streaming:
+        span = sentry_sdk.traces.start_span(
+            name=f"{agent.name} workflow", attributes={"sentry.origin": SPAN_ORIGIN}
+        )
+
+        return span
+


Taking a look at get_start_span_function, it returns one of:

sentry_sdk.traces.start_span

sentry_sdk.start_span

sentry_sdk.start_transaction

depending on whether span streaming is enabled and whether a transaction is currently active.

To leverage the streamed-span awareness that already exists in that function, instead of invoking start_span directly as above, we can change how we invoke the function returned by get_start_span_function based on whether span streaming is enabled.

So the function body would look something like the following:

Suggested change

-    span_streaming = has_span_streaming_enabled(sentry_sdk.get_client().options)
-    if span_streaming:
-        span = sentry_sdk.traces.start_span(
-            name=f"{agent.name} workflow", attributes={"sentry.origin": SPAN_ORIGIN}
-        )
-
-        return span
+    span_streaming = has_span_streaming_enabled(sentry_sdk.get_client().options)
+    span_func = get_start_span_function()
+    if span_streaming:
+        return span_func(
+            name=f"{agent.name} workflow",
+            attributes={"sentry.origin": SPAN_ORIGIN}
+        )
+    else:
+        return span_func(
+            name=f"{agent.name} workflow",
+            origin=SPAN_ORIGIN
+        )
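The branch is still needed in the suggestion because the two kinds of start function take the origin differently. The sketch below illustrates that with stubs only: streaming_start_span, legacy_start_span, and this get_start_span_function (which takes a flag here, unlike the helper named in the review) are all hypothetical, and the SPAN_ORIGIN value is a placeholder.

```python
SPAN_ORIGIN = "auto.ai.openai_agents"  # placeholder value for this sketch


def streaming_start_span(name, attributes=None):
    """Stub for a streaming start function: origin travels as an attribute."""
    return {"mode": "streaming", "name": name, "attributes": dict(attributes or {})}


def legacy_start_span(name, origin=None):
    """Stub for a legacy start function: origin is its own keyword argument."""
    return {"mode": "legacy", "name": name, "origin": origin}


def get_start_span_function(span_streaming):
    """Stub selector; takes the flag explicitly to stay self-contained."""
    return streaming_start_span if span_streaming else legacy_start_span


def start_workflow_span(agent_name, span_streaming):
    span_func = get_start_span_function(span_streaming)
    name = f"{agent_name} workflow"
    if span_streaming:
        return span_func(name=name, attributes={"sentry.origin": SPAN_ORIGIN})
    return span_func(name=name, origin=SPAN_ORIGIN)
```

Calling start_workflow_span("triage", span_streaming=True) returns the streaming-shaped stub with the origin under "sentry.origin", while span_streaming=False returns the legacy-shaped stub with a separate origin field.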

alexander-alderman-webb added 2 commits May 22, 2026 11:59

fix(openai-agents): Remove redundant hosted MCP tool spans

6f67df0

feat(openai-agents): Support span streaming

0f53ac0

alexander-alderman-webb added 3 commits May 26, 2026 08:12

.

5a5f81d

mypy

c4a5366

mypy2

b2b0c34

Base automatically changed from webb/openai-agents/remote-tool to master May 26, 2026 07:31

merge master

bf37fa2

alexander-alderman-webb marked this pull request as ready for review May 26, 2026 07:41

alexander-alderman-webb requested a review from a team as a code owner May 26, 2026 07:41

cursor Bot reviewed May 26, 2026

Comment thread sentry_sdk/integrations/openai_agents/patches/runner.py

Comment thread sentry_sdk/integrations/openai_agents/utils.py

alexander-alderman-webb added 4 commits May 26, 2026 10:11

handle streamed span invoke agent

53a84ae

truncate invoke agent attributes

dcdeb8d

fix bool logic

2c1bd19

merge master

e9413c5

ericapisani requested changes May 26, 2026

add brackets to bool logic

e243b86

alexander-alderman-webb mentioned this pull request May 26, 2026

Create an enum for possible gen_ai.operation.name values #6416

Open

alexander-alderman-webb added 3 commits May 26, 2026 17:48

drop transaction reference in span streaming comment

931aa30

test(openai-agents): Deduplicate in tests by removing node.callspec.id matching

3a95520

Merge branch 'webb/openai-agents/remove-node-callspec' into webb/openai-agents/span-first-2

ab65aa4

alexander-alderman-webb changed the base branch from master to webb/openai-agents/remove-node-callspec May 27, 2026 07:28

merge cleanup

e1d6ebe

cursor Bot reviewed May 27, 2026

Comment thread sentry_sdk/integrations/openai_agents/patches/runner.py Outdated

add bool precedence

0b4ed5f

Base automatically changed from webb/openai-agents/remove-node-callspec to master May 27, 2026 14:14

alexander-alderman-webb added 2 commits May 27, 2026 16:17

merge master

f941351

remove redundant condition

515f38a

sentrivana approved these changes Jun 11, 2026

alexander-alderman-webb added the skip-changelog label Jun 11, 2026

cursor Bot reviewed Jun 11, 2026

Comment thread sentry_sdk/integrations/openai_agents/spans/execute_tool.py

set errors on streamed spans

d3f3d5b

ericapisani approved these changes Jun 11, 2026

3 participants