Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix scope management in StartSpan and doFinish #886

Merged
merged 1 commit into from
Oct 14, 2024

Conversation

Darevski
Copy link
Contributor

Problem:
Previously, when creating a chain of spans, events triggered with the sentry.CaptureMessage() could be associated with the wrong span due to using the incorrect span in the hub’s scope, which led to inaccurate trace data in Sentry.

Fix:
By introducing proper span scope management, each non-transaction span now pushes a scope when started and pops it when finished. This ensures that events are associated with the correct active span in the trace hierarchy. Transactions remain unchanged as the root elements.

Test Coverage:
The new TestSpanScopeManagement confirms that events captured after finishing a child span are correctly associated with that span, ensuring that the trace and span IDs are accurate.

Copy link

codecov bot commented Oct 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.86%. Comparing base (a6acd05) to head (5407f87).
Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #886      +/-   ##
==========================================
+ Coverage   82.85%   82.86%   +0.01%     
==========================================
  Files          55       55              
  Lines        4619     4623       +4     
==========================================
+ Hits         3827     3831       +4     
  Misses        636      636              
  Partials      156      156              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@cleptric
Copy link
Member

cleptric commented Oct 3, 2024

It seems very wasteful to create a new scope for each span. With the recent changes in 0.29.0 (/~https://github.com/getsentry/sentry-go/blob/master/tracing.go#L197-L198), this should already work, considering you're setting the correct hub on the context.

@Darevski
Copy link
Contributor Author

Darevski commented Oct 3, 2024

@cleptric

It seems very wasteful to create a new scope for each span. With the recent changes in 0.29.0 (/~https://github.com/getsentry/sentry-go/blob/master/tracing.go#L197-L198), this should already work, considering you're setting the correct hub on the context.

With version 0.29.0, the following code:

	transaction := sentry.StartTransaction(ctx, "parent-operation")
	defer transaction.Finish()

	childSpan := sentry.StartSpan(transaction.Context(), "child-operation")
	defer childSpan.Finish()

	subChildSpan := sentry.StartSpan(childSpan.Context(), "sub_child-operation")
	subChildSpan.Finish()
	
	sentry.CaptureMessage("Test event")

The code will act like this: the “Test event” message will be assigned to the already finished ‘sub_child-operation’ span instead of the ‘child-operation’ span.
Is this the correct behaviour? I don’t think so.

The same thing happens with all methods that use hub.scope internally.

You could take the test that I prepared for that case from my PR if you need it.

But the biggest problems start when you try to create distributed tracing

Here is the working sample code example:
https://go.dev/play/p/OudKFJv1OL0

This is the trace view result of the code with my fix:
image
We have a fully functional informative view, with the messages correctly assigned to the required spans and transactions.

And here is the result of this code without my fix (0.29.0):
image
Chain connection are missing

We could try to look deeper into the requests that are being sent to Sentry:

Here is the request without the fix (removed unusable info):
image
We could see that when we sending the "child-transaction", the "child-span" is related to the "parent-span" (from the first service) instead of the "child-transaction."

Here is the request with the fix:
image
Here we can see that the "child-span" correctly references the "child-transaction" span.

In the case that we start using OpenTelemetry, we encounter far more problems, as a lot of information is placed inside the scope span, and the hub.Scope().SetSpan(&span) call simply overrides it (the child span overrides the parent scope without any restore).

Could you kindly clarify what exactly I should do in this situation without modifying the Sentry codebase? Are you suggesting I should create a new hub for every span, or is there a more practical solution I’m missing?

@shlyamster
Copy link

@cleptric I confirm the presence of the problem described here in version 0.29.0. In my traces the same way root context is lost because of which the span order in the transaction is not correctly organized.

Here is an example trace on version 0.29.0. Note the Parent Span ID field at the root transaction.
2024-10-07 15 45 15

And here's the trace with the suggested fixes.
2024-10-07 15 45 28

@Darevski
Copy link
Contributor Author

Darevski commented Oct 8, 2024

@cleptric, the fix in #888 is only a partial solution, as it only resolves the trace context for the transaction.
What about capturing events like messages or exceptions, which I mentioned two messages earlier?

The main issue is that the scope retains the state after starting spans and holds onto that state indefinitely, until a new span is created. This scope is then applied to all subsequent events.

If there is a large trace with some information or exceptions, it’s very useful to see in which span that event was created. However, currently, it is connected to the most recently created span.

If you think that my fix is wasteful, could you please advise what could be done to resolve this?

Our company is really need this functionality of correct attaching events

@cleptric cleptric merged commit 20684c7 into getsentry:master Oct 14, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants