Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BREAKING: remove messaging.destination.kind and messaging.source.kind values #3214

Merged
merged 9 commits into from
Mar 21, 2023

Conversation

lmolkova
Copy link
Contributor

@lmolkova lmolkova commented Feb 15, 2023

Fixes #3170, #3265, #3249

Changes

We currently allow topic or queue on messaging.destination.kind. While it's common in messaging world to have one or another, messaging semantic conventions can be applied to AMPQ communication (which does not have topic/queue terminology), socket.io, and potentially other less traditional messaging use-cases.

It's unclear how messaging.destination.kind and messaging.source.kind could be used. The distinction between queue and topic is significant for messaging and distributed systems, but not for tracing.

In either case, tracing backends should expect to process traces from 0+ messaging and 0+ messaging consumers. In either case, message consumers can be simultaneous or consequent and there could be many of them.

The only known case (Solace) where it could be useful is when messaging system allows having queues and topic with the same name on the same broker, and it could be used to distinguish one from another.

Based on messaging SIG discussion, the attributes are removed for the time being until we understand if and how they are useful.

Depending on messaging system queues or topics behavior vary a lot and in future it would makes more sense to represent actual behavior with individual attributes such as:

  • auto-settlement (at-most-once or at least once guarantees)
  • settlement for individual messages or offsets
  • broadcast or unicast
  • etc

@pyohannes
Copy link
Contributor

I think we should also change messaging.source.kind accordingly.

@lmolkova
Copy link
Contributor Author

I think we should also change messaging.source.kind accordingly.

it's already set to true (inconsistently with destination).

@lmolkova lmolkova added area:semantic-conventions Related to semantic conventions semconv:messaging labels Feb 16, 2023
@lmolkova lmolkova marked this pull request as ready for review February 16, 2023 05:05
@lmolkova lmolkova requested review from a team February 16, 2023 05:05
@Oberon00
Copy link
Member

Oberon00 commented Feb 22, 2023

I think this attribute becomes much less useful if there are custom values allowed. Maybe we should instead have an attribute that tells us the expected delivery kind. E.g. instead of topic, we could use "to_all", instead of queue "to_any_one". If there are different delivery types, we might be able to come up with a complete (though maybe not fine-grained) list, e.g. maybe we have to add to_some, to_specific_one.

@lmolkova
Copy link
Contributor Author

lmolkova commented Feb 23, 2023

I think this attribute becomes much less useful if there are custom values allowed. Maybe we should instead have an attribute that tells us the expected delivery kind. E.g. instead of topic, we could use "to_all", instead of queue "to_any_one". If there are different delivery types, we might be able to come up with a complete (though maybe not fine-grained) list, e.g. maybe we have to add to_some, to_specific_one.

could you please explain why they become less useful? how do you expect this information to be used?

I want to go even further and change the requirement level on those #3249 since they don't tell much to backends.

@Oberon00
Copy link
Member

how do you expect this information to be used?

To tell whether the message is expected to be sent to one or more receivers for example.

I fear that if we allow custom values, every messaging system might have different names and basically we will have values that boil down to "my-messaging-system-flavored-topic" instead of "topic" because the topic concept on "my-messaging-system" is named differently there and maybe also behaves slightly differently.

@pyohannes
Copy link
Contributor

To tell whether the message is expected to be sent to one or more receivers for example.

That might be true for many cases, but it's not a conclusion one can generally draw and rely on. E. g., I could use a fan-out exchange in RabbitMQ, where one messages ends up in several queues, or I can have topics with just a single consumer.

I think what this destination/source kind tries to tell is, whether a message is settled individually (queue) or based on checkpoints (topic). In the first case, doing a settlement operation on a single message will only settle this single message. In the latter case, a settlement operation on a single message also settles all unsettled messages before this message in the message stream.

For me this attribute makes sense for messaging systems or abstractions that support both modes (like JMS, or Azure Service Bus). For other systems this attribute is of very limited value: if messaging.system is set to kafka, source/destination kind will always be topic, if it's rabbitmq it will always be queue. I don't see the attribute providing additional value in those cases.

With that said, I wonder whether it makes sense to make this attribute conditionally required only for messaging systems that support more than one destination/source kind? And then allow using terminology that makes sense for the particular system?

@lmolkova
Copy link
Contributor Author

lmolkova commented Feb 24, 2023

To tell whether the message is expected to be sent to one or more receivers for example.

That might be true for many cases, but it's not a conclusion one can generally draw and rely on. E. g., I could use a fan-out exchange in RabbitMQ, where one messages ends up in several queues, or I can have topics with just a single consumer.

Right, if I implement a backend, seeing topic or queue (especially if instrumentations are forced to pick one out of two despite its behavior being different) does not give me any reliable information on what to expect. In each case, I can expect to have 0 or more consumers, parallel or consecutive, direct or routed.

@Oberon00
Copy link
Member

Oberon00 commented Feb 27, 2023

In each case, I can expect to have 0 or more consumers, parallel or consecutive, direct or routed.

This is not true, because the messaging semantic conventions include very particular definitions of topic and queue. /~https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/messaging.md#destinations-and-sources

A message that is sent (the send-operation is often called "publish" in this context) to a topic is broadcasted to all consumers that have subscribed to the topic.

A message that is sent to a queue is processed by a message consumer (usually exactly once although some message systems support a more performant at-least-once mode for messages with idempotent processing).

But maybe this was ignored by instrumentations? I guess it might make sense to remove this attribute completely?

@lmolkova
Copy link
Contributor Author

lmolkova commented Feb 28, 2023

In each case, I can expect to have 0 or more consumers, parallel or consecutive, direct or routed.

This is not true, because the messaging semantic conventions include very particular definitions of topic and queue. /~https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/messaging.md#destinations-and-sources

What spec says is not correct:

A message that is sent to a queue is processed by a message consumer (usually exactly once although some message systems support a more performant at-least-once mode for messages with idempotent processing).

I don't know if any queue provides exactly once guarantee (or if it's even possible). All queues I know of provide at least once guarantees along with at-most once, which is configurable with auto-settlement.

Proofs:

Message can be delivered multiple times if it's not settled during delivery (ack not received, settlement call times out, etc). Some queues allow to defer/re-publish message at settlement (RabbitMQ, Azure ServiceBus)

Created #3265 to track it.

So generic OTel instrumentation does not know how settlement or broker are configured and cannot assume there will be exactly one delivery. Message can be re-routed or forked on the consumer, replicated to multiple queues to improve reliability and whatnot.

@Oberon00
Copy link
Member

Oberon00 commented Mar 1, 2023

IMHO, even more arguments for removing the attribute completely (hopefully along with the definitions of queue and topic in the spec)

@jsuereth
Copy link
Contributor

jsuereth commented Mar 1, 2023

Regarding the usage of "allow_custom_values" I'm actually thinking we should remove that feature entirely. I made a case during the last specification SiG and I'll try to write up my thoughts better on the subject. You can see some of the discussion in #3225.

I think the danger of "allow_custom_values: false" is that tooling vendors (out-of-the-box dashboards, alerts, queries) that assume values won't change put us in a fragile position. Having semantic meaning for some values AND allowing open values should lead to a much more flexible ecosystem.

@Oberon00
Copy link
Member

Oberon00 commented Mar 1, 2023

I think the danger of "allow_custom_values: false" is that tooling vendors (out-of-the-box dashboards, alerts, queries) that assume values won't change put us in a fragile position

That is also the value proposition of that feature. You could of course say that it should be used so sparingly that an explicit text in the description is better (and that there is no need for code generators to do anything depending on that). I don't think I share that opinion

@lmolkova
Copy link
Contributor Author

lmolkova commented Mar 1, 2023

IMHO, even more arguments for removing the attribute completely (hopefully along with the definitions of queue and topic in the spec)

good point! I will bring it up with Messaging SIG to discuss.

@lmolkova lmolkova force-pushed the msg-dest-kind-custom branch from 945eb86 to b0a24f3 Compare March 9, 2023 01:29
@lmolkova lmolkova changed the title Allow custom messaging.destination.kind values BREAKING: remove messaging.destination.kind and messaging.source.kind values Mar 9, 2023
@carlosalberto
Copy link
Contributor

@lmolkova I think we are ready to merge this once we create an issue for future tracking of what @dpauls provided as feedback.

@lmolkova
Copy link
Contributor Author

@lmolkova I think we are ready to merge this once we create an issue for future tracking of what @dpauls provided as feedback.

thank you for the remainder! I created open-telemetry/semantic-conventions#1220 to track the future work

@carlosalberto carlosalberto merged commit 89a9601 into open-telemetry:main Mar 21, 2023
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 31, 2024
… values (open-telemetry#3214)

Fixes open-telemetry#3170, open-telemetry#3265, open-telemetry#3249

## Changes

~~We currently allow `topic` or `queue` on `messaging.destination.kind`.
While it's common in messaging world to have one or another, messaging
semantic conventions can be applied to AMPQ communication (which does
not have topic/queue terminology), [socket.io](https://socket.io/), and
potentially other less traditional messaging use-cases.~~

It's unclear how `messaging.destination.kind` and
`messaging.source.kind` could be used. The distinction between queue and
topic is significant for messaging and distributed systems, but not for
tracing.

In either case, tracing backends should expect to process traces from 0+
messaging and 0+ messaging consumers. In either case, message consumers
can be simultaneous or consequent and there could be many of them.

The only known case (Solace) where it could be useful is when messaging
system allows having queues and topic with the same name on the same
broker, and it could be used to distinguish one from another.

Based on messaging SIG discussion, the attributes are removed for the
time being until we understand if and how they are useful.

Depending on messaging system queues or topics behavior vary a lot and
in future it would makes more sense to represent actual behavior with
individual attributes such as:
- auto-settlement (at-most-once or at least once guarantees)
- settlement for individual messages or offsets
- broadcast or unicast
- etc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions semconv:messaging
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Messaging terminology: queue/topic vs address
9 participants