[Bug]: All Operation search not functioning under Azure CosmosDB-for-Cassandra storage #5185
Comments
Apologies for the random ping @TheovanKraay, but you may be well placed to help with this.
The Cosmos DB API for Apache Cassandra does have some compatibility gaps. I would recommend running Jaeger with Azure Managed Instance for Apache Cassandra. This is an offering under Azure Cosmos DB, but it is a fully managed service for pure open-source Apache Cassandra with 100% compatibility. You should not have any problems with any of the Jaeger commands if you use this service instead.
Ok, thanks. No action is needed here from Jaeger, then. To anyone who finds this in a search: you will need to move to a full Cassandra or Elasticsearch storage backend.
According to Microsoft Questions, the former "Preview" feature will no longer be activated and will not become "GA": https://learn.microsoft.com/en-us/answers/questions/1338536/is-cosmosclusterindex-still-a-preview-feature (expand the comments on the accepted answer). This means that it is not, and will not be, possible to create a custom index with Azure Cosmos DB for Apache Cassandra. Running a managed instance with Apache Cassandra instead is much more costly, simply to compensate for this one missing feature. Is it still possible for Jaeger to work around this issue so that a query with operation = "All" works? Maybe a config option to sort the results after the data is fetched from Cassandra (with the disadvantage of being a little slower and/or more memory- and CPU-intensive)?
Jaeger uses this table for service-only searches:
The clustering primary index has been a feature of Cassandra since v2.x (10 years ago). If Cosmos does not support it, I don't know how it can claim to be Cassandra-compatible. Perhaps it has other workarounds for defining secondary indices.
@yurishkuro Strangely enough, the tables themselves can be created with a clustering index without any problems. But as soon as a query uses an ORDER BY, the server reports that the said custom index is missing, and that index simply cannot be created because it is not a supported feature:
SELECT * FROM jaegertracing.service_name_index WHERE bucket in (0,1,2,3,4,6,7,8,9) AND service_name = 'my-service' AND start_time > 1718728568730810 AND start_time < 1718728668730810 ORDER BY start_time DESC;
InvalidRequest: Error from server: code=2200 [Invalid query] message="ORDER BY requires creating a custom index: CosmosClusteringIndex. Please create a custom index and re-issue this query"
If one submits the query without ORDER BY, the data can be read. A potential workaround would therefore probably be to run the SELECT without ORDER BY and sort the results in the application logic:
SELECT * FROM jaegertracing.service_name_index WHERE bucket in (0,1,2,3,4,6,7,8,9) AND service_name = 'my-service' AND start_time > 1718728568730810 AND start_time < 1718728668730810;
service_name | bucket | start_time | trace_id
------------------+--------+------------------+------------------------------------
my-service | 1 | 1718728573990844 | 0x0d1b531a4024d9dc848c0793120673ad
my-service | 1 | 1718728572649478 | 0xcd2e6645a71ba85a1b12fce2e82b30a8
my-service | 1 | 1718728572637999 | 0xcd2e6645a71ba85a1b12fce2e82b30a8
my-service | 1 | 1718728571031771 | 0xdcfc1ac661729d6ca1aafe38bd146add
---MORE---
I don't think this can be solved with secondary indexes, as I understand that they only support faster filtering, but not sorting.
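To make the "sort in the application logic" idea concrete, here is a minimal sketch of what the client-side step could look like. It is not how Jaeger implements anything (Jaeger itself is written in Go); the names `IndexRow`, `newest_trace_ids`, and `SAMPLE_ROWS` are hypothetical, and the sample rows are simply the four rows from the cqlsh output above in shuffled order. It assumes the rows were fetched with the ORDER BY-free query.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple


class IndexRow(NamedTuple):
    """One row of service_name_index, mirroring the cqlsh output above."""
    service_name: str
    bucket: int
    start_time: int  # microseconds since the Unix epoch
    trace_id: str


def newest_trace_ids(rows: Iterable[IndexRow], limit: int) -> list[str]:
    """Return up to `limit` unique trace IDs, newest start_time first."""
    # Client-side replacement for the ORDER BY start_time DESC that Cosmos rejects.
    ordered = sorted(rows, key=lambda r: r.start_time, reverse=True)
    seen: set[str] = set()
    result: list[str] = []
    for row in ordered:
        if len(result) >= limit:
            break
        if row.trace_id in seen:
            continue  # the same trace can appear under several start times
        seen.add(row.trace_id)
        result.append(row.trace_id)
    return result


# The four rows shown above, deliberately out of order.
SAMPLE_ROWS = [
    IndexRow("my-service", 1, 1718728572637999, "0xcd2e6645a71ba85a1b12fce2e82b30a8"),
    IndexRow("my-service", 1, 1718728571031771, "0xdcfc1ac661729d6ca1aafe38bd146add"),
    IndexRow("my-service", 1, 1718728573990844, "0x0d1b531a4024d9dc848c0793120673ad"),
    IndexRow("my-service", 1, 1718728572649478, "0xcd2e6645a71ba85a1b12fce2e82b30a8"),
]

if __name__ == "__main__":
    for trace_id in newest_trace_ids(SAMPLE_ROWS, limit=2):
        print(trace_id)
```

The cost is the one already mentioned in this thread: if the server gives no ordering guarantee, a server-side LIMIT cannot be relied on to return the newest rows, so the whole time window has to be fetched before sorting.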
What happened?
I have installed Jaeger as a proof of concept and, to minimize costs, am using Azure Cosmos DB for Cassandra as the database backend, which I believe has been made to work based on the previously closed issues #1667 and #2467. The Jaeger parts of the tooling work successfully, but Cosmos DB's Cassandra API does not appear to be complete, which means it is not possible to use the "all" operation selection to see the recent traces ordered by time.
Steps to reproduce
Set up a tracing database using Azure Cosmos DB for Cassandra. Customize the install cqlsh commands for a single node and run them; they will succeed. Then set up something to send data through the collector to the backend, and everything should work normally.
Run the query UI pointing to the Azure DB instance. Select the "all" operation and click the "Find Traces" button. An error will be returned:
The query that causes this to occur is:
and the definition of service_name_index already states that start_time is part of the key and ordered, WITH CLUSTERING ORDER BY (start_time DESC), so I think this is Cosmos DB not conforming to the Cassandra API correctly. This is backed up by an issue on the Microsoft Learn site, https://learn.microsoft.com/en-us/answers/questions/1181520/cassandra-api-unable-to-run-query?page=1&orderby=Helpful&comment=answer-1177286#newest-answer-comment, where the user is directed to enable a preview Cassandra feature that I cannot find.
Expected behavior
It would be good if Jaeger could be made to work around this issue, or if some Azure-specific schema change could be identified that lets it work in spite of the missing feature in Cosmos DB.
I understand that this is not likely to be a problem in Jaeger. However, when researching whether this backend would function, all the information I could find suggested that it would work. If Azure Cosmos DB for Cassandra is not a viable backend because it lacks a required feature of the real Cassandra system, then it may be useful for others to be able to find this issue in searches.
Relevant log output
No response
Screenshot
No response
Additional context
No response
Jaeger backend version
1.53
SDK
OpenTelemetry Dotnet package 1.7.0
Pipeline
azure appservice -> jaeger-collector -> azurecosmosdb-for-cassandra
Storage backend
azurecosmosdb-for-cassandra
Operating system
Windows
Deployment model
No response
Deployment configs
create-schema-clean.txt