[Bug]: All Operation search not functioning under Azure CosmosDB-for-Cassandra storage #5185

Closed
Wraith2 opened this issue Feb 7, 2024 · 6 comments

@Wraith2

Wraith2 commented Feb 7, 2024

What happened?

I have installed Jaeger as a proof of concept and, to minimize costs, am using Azure Cosmos DB for Cassandra as the database backend, which I believe has been made to work based on previous closed issues #1667 and #2467. The Jaeger parts of the tooling work successfully, but Cosmos DB's Cassandra API does not appear to be complete, which means it is not possible to use the "all" operation selection to see the recent traces ordered by time.

Steps to reproduce

Set up a tracing database using Azure Cosmos DB for Cassandra. Customize the install cqlsh commands for a single node and run them; they will succeed. Proceed to set up something to send data through the collector to the backend, and everything should work normally.
Run the query UI pointing to the Azure DB instance. Select the "all" operation and click the "Find Traces" button. An error will be returned:

ORDER BY requires creating a custom index: CosmosClusteringIndex. Please create a custom index and re-issue this query

The query that causes this to occur is:

SELECT trace_id FROM service_name_index WHERE bucket IN (0,1,2,3,4,5,6,7,8,9) AND service_name = ? AND start_time > ?  AND start_time < ? ORDER BY start_time DESC LIMIT ?;

The definition of service_name_index already states that start_time is part of the key and ordered (WITH CLUSTERING ORDER BY (start_time DESC)), so I think this is Cosmos DB not conforming to the Cassandra API correctly. This is backed up by a question on the Microsoft Learn site, https://learn.microsoft.com/en-us/answers/questions/1181520/cassandra-api-unable-to-run-query?page=1&orderby=Helpful&comment=answer-1177286#newest-answer-comment, where the user is directed to enable a preview Cassandra feature that I cannot find.

Expected behavior

It would be good if Jaeger could be made to work around this issue, or if some Azure-specific schema change could be identified that lets it work in spite of the missing feature in Cosmos DB.

I understand that this is not likely to be a problem in Jaeger. However, when researching whether this backend would function, all the information I could find suggested that it would work. If Azure Cosmos DB for Cassandra is not a viable backend because it lacks a required feature of the real Cassandra system, then it may be useful for others to be able to find this issue in searches.

Relevant log output

No response

Screenshot

No response

Additional context

No response

Jaeger backend version

1.53

SDK

OpenTelemetry Dotnet package 1.7.0

Pipeline

azure appservice -> jaeger-collector -> azurecosmosdb-for-cassandra

Storage backend

azurecosmosdb-for-cassandra

Operating system

Windows

Deployment model

No response

Deployment configs

create-schema-clean.txt

@Wraith2 Wraith2 added the bug label Feb 7, 2024
@Wraith2 Wraith2 changed the title [Bug]: [Bug]: All Operation search not functioning under Azure CosmosDB-for-Cassandra storage Feb 7, 2024
@Wraith2
Author

Wraith2 commented Feb 12, 2024

Apologies for the random ping @TheovanKraay but you may be well placed to help with this.

@TheovanKraay

The Cosmos DB API for Apache Cassandra does have some compatibility gaps. I would recommend running Jaeger with Azure Managed Instance for Apache Cassandra. This is an offering under Azure Cosmos DB, but is a fully managed service for pure open-source Apache Cassandra with 100% compatibility. You should not have any problems with any of the Jaeger commands if using this service instead.

@Wraith2
Author

Wraith2 commented Feb 23, 2024

Ok, thanks.

No action needed here from Jaeger then. To anyone who finds this in a search: you will need to move to a full Cassandra or Elasticsearch storage backend.

@Wraith2 Wraith2 closed this as completed Feb 23, 2024
@jravnik

jravnik commented Jun 18, 2024

According to Microsoft Questions, the former "Preview" feature will no longer be activated and will not become "GA": https://learn.microsoft.com/en-us/answers/questions/1338536/is-cosmosclusterindex-still-a-preview-feature (expand comments on accepted answer)

This means that it is not and will not be possible to create a custom index with Azure Cosmos DB for Apache Cassandra. But running a managed instance with Apache Cassandra instead is much more costly in comparison, simply to compensate for this one missing feature.

Is it still possible for Jaeger to work around this issue so that a query with operation = "All" is possible? Maybe a config option to order after the data is fetched from Cassandra (of course with the disadvantage of being a little slower and/or more resource-intensive for memory/CPU)?
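
Such a config option could be as small as a flag that drops the ORDER BY clause. Below is a minimal sketch in Go, assuming a hypothetical `orderInDB` switch (nothing in this thread suggests Jaeger has such an option) and reusing the query text quoted in the issue description. The unordered variant also drops LIMIT and selects start_time: a limit applied before sorting could cut off the newest rows, and the client needs the timestamp to sort.

```go
package main

import "fmt"

// serviceNameQuery returns the service_name_index lookup quoted in this issue.
// orderInDB is a hypothetical switch for illustration, not an existing Jaeger option.
func serviceNameQuery(orderInDB bool) string {
	const where = " FROM service_name_index WHERE bucket IN (0,1,2,3,4,5,6,7,8,9)" +
		" AND service_name = ? AND start_time > ? AND start_time < ?"
	if orderInDB {
		// Regular Cassandra: let the database order and limit the rows.
		return "SELECT trace_id" + where + " ORDER BY start_time DESC LIMIT ?"
	}
	// Fallback: no ORDER BY and no LIMIT; the caller must sort and truncate.
	return "SELECT trace_id, start_time" + where
}

func main() {
	fmt.Println(serviceNameQuery(true))
	fmt.Println(serviceNameQuery(false))
}
```

The cost is that the fallback reads every index row in the time window, so a narrow search window matters more than it does with database-side ordering.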

@yurishkuro
Member

Jaeger uses this table for service-only searches:

CREATE TABLE IF NOT EXISTS ${keyspace}.service_name_index (
    service_name      text,
    bucket            int,
    start_time        bigint, -- microseconds since epoch
    trace_id          blob,
    PRIMARY KEY ((service_name, bucket), start_time)
) WITH CLUSTERING ORDER BY (start_time DESC)

The clustering primary index has been a feature of Cassandra since v2.x (10 years ago). If Cosmos does not support it, I don't know how it claims to be Cassandra-compatible. Perhaps it has other workarounds to define some secondary indices.

@jravnik
Copy link

jravnik commented Jun 19, 2024

@yurishkuro Strangely enough, the tables themselves can be created with a clustering index without any problems. But as soon as a query uses ORDER BY, the server reports that the custom index is missing, and that index simply cannot be created since it is not a supported feature:

SELECT * FROM jaegertracing.service_name_index WHERE bucket in (0,1,2,3,4,6,7,8,9) AND service_name = 'my-service' AND start_time > 1718728568730810 AND start_time < 1718728668730810 ORDER BY start_time DESC;

InvalidRequest: Error from server: code=2200 [Invalid query] message="ORDER BY requires creating a custom index: CosmosClusteringIndex. Please create a custom index and re-issue this query"

If one submits the query without ORDER BY, the data can be read. A potential workaround would therefore probably be to run the SELECT without ORDER BY and sort the results in the application logic:

SELECT * FROM jaegertracing.service_name_index WHERE bucket in (0,1,2,3,4,6,7,8,9) AND service_name = 'my-service' AND start_time > 1718728568730810 AND start_time < 1718728668730810;

 service_name | bucket | start_time       | trace_id
--------------+--------+------------------+------------------------------------
 my-service   |      1 | 1718728573990844 | 0x0d1b531a4024d9dc848c0793120673ad
 my-service   |      1 | 1718728572649478 | 0xcd2e6645a71ba85a1b12fce2e82b30a8
 my-service   |      1 | 1718728572637999 | 0xcd2e6645a71ba85a1b12fce2e82b30a8
 my-service   |      1 | 1718728571031771 | 0xdcfc1ac661729d6ca1aafe38bd146add

---MORE---

I don't think this can be solved with secondary indexes, as I understand that they only support faster filtering, but not sorting.
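
The application-side sort described above could look roughly like the following. This is a minimal sketch in Go (chosen because Jaeger is written in Go); the `indexRow` type and `newestTraceIDs` function are hypothetical names for illustration, not Jaeger code. It sorts the unordered rows by start_time descending, skips repeated trace IDs (an assumption, prompted by the duplicate trace_id in the sample output above), and applies the limit only after sorting.

```go
package main

import (
	"fmt"
	"sort"
)

// indexRow is a hypothetical in-memory form of one service_name_index row.
type indexRow struct {
	StartTime int64  // microseconds since epoch
	TraceID   string // hex-encoded trace_id blob
}

// newestTraceIDs returns up to limit distinct trace IDs, newest first.
// The input slice is left unmodified.
func newestTraceIDs(rows []indexRow, limit int) []string {
	sorted := make([]indexRow, len(rows))
	copy(sorted, rows)
	// Emulate ORDER BY start_time DESC on the client.
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].StartTime > sorted[j].StartTime
	})

	seen := make(map[string]bool)
	var ids []string
	for _, r := range sorted {
		if len(ids) >= limit {
			break
		}
		if seen[r.TraceID] {
			continue // several spans can index the same trace
		}
		seen[r.TraceID] = true
		ids = append(ids, r.TraceID)
	}
	return ids
}

func main() {
	// Rows from the sample output above, deliberately out of order.
	rows := []indexRow{
		{1718728571031771, "0xdcfc1ac661729d6ca1aafe38bd146add"},
		{1718728572637999, "0xcd2e6645a71ba85a1b12fce2e82b30a8"},
		{1718728573990844, "0x0d1b531a4024d9dc848c0793120673ad"},
		{1718728572649478, "0xcd2e6645a71ba85a1b12fce2e82b30a8"},
	}
	for _, id := range newestTraceIDs(rows, 2) {
		fmt.Println(id)
	}
}
```

Memory use grows with the number of index rows in the time window, which is the trade-off already noted for this approach.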
