Emit OpenTelemetry Metrics in the SDKs #352

Closed
6 tasks done
rhamzeh opened this issue Apr 30, 2024 · 0 comments
Labels
dotnet-sdk Affects the C#/DotNet SDK enhancement New feature or request epic go-sdk Affects the Go SDK java-sdk Affects the Java/Kotlin SDK js-sdk Affects the JavaScript SDK python-sdk Affects the Python SDK


Checklist

Describe the problem you'd like to have solved

As a consumer of the SDK, I would like to hook it up to my dashboards to get data on several metrics, as well as being able to configure proper logging and tracing.

Describe the ideal solution

For each SDK, users should be able to set up observability and connect it to their own infrastructure, delivered in three phases:

  • Phase 1: Metrics
  • Phase 2: Logging
  • Phase 3: Tracing
Metrics:

We're thinking of adding "fine-grained" config for the attributes/tags.

Something along the lines of:

// Proposed shape only; the final config structure may differ.
var configuration = new ClientConfiguration() {
    ApiUrl = "http://localhost:8080",
    StoreId = "...",
    Credentials = new Credentials() { ... },
    Telemetry = new OpenFgaTelemetryConfig {
        // Each configured metric lists the attributes to attach to it
        Metrics = new() {
            [TelemetryHistograms.RequestDuration] = new() {
                Attributes = [Attributes.AttributeRequestMethod, Attributes.AttributeRequestStoreId]
            },
            [TelemetryCounters.TokenExchangeCountKey] = new() {
                Attributes = [Attributes.AttributeRequestModelId, Attributes.AttributeRequestClientId]
            },
        }
    }
};
var fgaClient = new OpenFgaClient(configuration);

If telemetry is not configured, we would enable a base set of metrics with minimal attributes; if it is configured, we follow exactly what is configured. We will couple that with warnings in the OTel configuration documentation about which attributes could be cost-prohibitive.
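The fallback rule can be sketched in a few lines. This is a minimal Python illustration, not the API of any SDK: `DEFAULT_METRICS`, `resolve_metrics`, and the choice of default attributes per metric are assumptions made for the example; only the metric and attribute names are taken from this issue.

```python
from typing import Dict, List, Optional

# Hypothetical defaults: the metrics that are on by default, each with a
# minimal attribute set (the exact per-metric defaults are an assumption).
DEFAULT_METRICS: Dict[str, List[str]] = {
    "fga-client.request.duration": ["fga-client.request.method", "http.response.status_code"],
    "fga-client.query.duration": ["fga-client.request.method", "http.response.status_code"],
    "fga-client.credentials.request": ["fga-client.request.client_id"],
}


def resolve_metrics(user_metrics: Optional[Dict[str, List[str]]]) -> Dict[str, List[str]]:
    """Return the effective metric -> attributes mapping.

    No telemetry config: fall back to the defaults.
    Any telemetry config: use it exactly as given, with no merging.
    """
    if user_metrics is None:
        return {name: list(attrs) for name, attrs in DEFAULT_METRICS.items()}
    return {name: list(attrs) for name, attrs in user_metrics.items()}
```

Not merging user config with the defaults is one possible reading of "we follow whatever is configured"; it keeps the emitted data fully predictable from the config alone.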

Metrics needed

| Metric Name | Type | Enabled by Default | Description |
|---|---|---|---|
| fga-client.request.duration | Histogram | Yes | The total request time for FGA requests |
| fga-client.query.duration | Histogram | Yes | The amount of time the FGA server took to internally process and evaluate the request |
| fga-client.credentials.request | Counter | Yes | The total number of times a new token was requested when using ClientCredentials |
| fga-client.request.count | Counter | No | The total number of requests made to the FGA server |

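As a rough illustration of how the request-duration histogram and the credentials counter could be fed, here is a minimal sketch. `InMemoryMetrics` and `time_request` are hypothetical names, the in-memory store stands in for a real OpenTelemetry meter, and recording in milliseconds is an assumption.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import DefaultDict, Iterator, List


class InMemoryMetrics:
    """Stand-in for a metrics backend; keeps raw values per metric name."""

    def __init__(self) -> None:
        self.histograms: DefaultDict[str, List[float]] = defaultdict(list)
        self.counters: DefaultDict[str, int] = defaultdict(int)

    def record(self, name: str, value_ms: float) -> None:
        """Add one observation to a histogram."""
        self.histograms[name].append(value_ms)

    def add(self, name: str, amount: int = 1) -> None:
        """Increment a counter."""
        self.counters[name] += amount

    @contextmanager
    def time_request(self) -> Iterator[None]:
        """Record the wall-clock time of the wrapped block, in milliseconds."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.record("fga-client.request.duration", elapsed_ms)


metrics = InMemoryMetrics()
with metrics.time_request():
    time.sleep(0.01)  # placeholder for the actual HTTP call
metrics.add("fga-client.credentials.request")  # a new token was requested
```

Recording in a `finally` block means failed requests are timed too, which matters if latency is later broken down by status code.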

Supported attributes

| Attribute Name | Type | Enabled by Default | Description |
|---|---|---|---|
| fga-client.response.model_id | string | Yes | The authorization model ID that the FGA server used |
| fga-client.request.method | string | Yes | The FGA method/action that was performed (e.g. Check, ListObjects, ...) in TitleCase |
| fga-client.request.store_id | string | Yes | The store ID that was sent as part of the request |
| fga-client.request.model_id | string | Yes | The authorization model ID that was sent as part of the request, if any |
| fga-client.request.client_id | string | Yes | The client ID associated with the request, if any |
| fga-client.user | string | No | The user that is associated with the action of the request for check and list objects |
| http.request.resend_count | int | Yes | The number of retries attempted (only sent if the request was retried; a count of 1 means the request was retried once in addition to the original request) |
| http.response.status_code | int | Yes | The status code of the response |
| http.request.method | string | No | The HTTP method for the request |
| http.host | string | Yes | Host identifier of the origin the request was sent to |
| url.scheme | string | No | HTTP scheme of the request (http/https) |
| url.full | string | No | Full URL of the request |
| user_agent.original | string | Yes | User agent used in the query |
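To make the attribute rules concrete, the sketch below builds the attribute set for one data point. `build_attributes` and its parameters are hypothetical and cover only a handful of the attributes listed above; the function keeps only attributes that are enabled and have a value, and it omits `http.request.resend_count` unless the request was actually retried.

```python
from typing import Any, Dict, Iterable, Optional


def build_attributes(
    enabled: Iterable[str],
    *,
    method: str,
    store_id: str,
    status_code: int,
    retries: int = 0,
    model_id: Optional[str] = None,
    user: Optional[str] = None,
) -> Dict[str, Any]:
    """Return only the enabled attributes that have a value."""
    candidates: Dict[str, Any] = {
        "fga-client.request.method": method,
        "fga-client.request.store_id": store_id,
        "fga-client.request.model_id": model_id,
        "fga-client.user": user,
        "http.response.status_code": status_code,
        # Only sent if the request was retried.
        "http.request.resend_count": retries if retries > 0 else None,
    }
    allowed = set(enabled)
    return {k: v for k, v in candidates.items() if k in allowed and v is not None}


attrs = build_attributes(
    ["fga-client.request.method", "http.response.status_code", "http.request.resend_count"],
    method="Check",
    store_id="store-1",
    status_code=200,
    user="user:anne",
)
# attrs == {"fga-client.request.method": "Check", "http.response.status_code": 200}
```

In this sketch `fga-client.user` is dropped even though a value was supplied, because it was not in the enabled list: attributes that are off by default only appear when explicitly opted into.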

This keeps folks from enabling anything costly by accident (they'd have to manually opt in), while still giving them visibility into things like:

  • Whether a client ID is sending a disproportionate number of calls or token requests (which could indicate that it was misconfigured, e.g. the SDK is being initialized multiple times, causing a credential request per call)
  • Whether their new model is causing significantly more latency than the old one
  • Whether slow requests are due to retries (tracing helps here, but traces are usually sampled and people might miss this)
  • The ratio of successful vs. bad vs. rate-limited requests by model ID, store ID or client ID, so folks can understand whether a particular client is being called incorrectly or a particular model is problematic
  • Whether they still have old clients running that they need to upgrade, and how that correlates with the errors they are getting (through the user agent)
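The success vs. bad request vs. rate limit breakdown can be illustrated with a small aggregation over recorded attribute sets. In practice this grouping would normally be done in the metrics backend rather than in code; `outcome`, `outcomes_by`, the status-code bucketing, and the sample values are all made up for the example.

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def outcome(status_code: int) -> str:
    """Bucket an HTTP status code into a coarse outcome."""
    if status_code == 429:
        return "rate_limited"
    if 200 <= status_code < 300:
        return "success"
    if 400 <= status_code < 500:
        return "bad_request"
    return "other"


def outcomes_by(points: Iterable[Mapping[str, Any]], key: str) -> Counter:
    """Count data points per (attribute value, outcome) pair."""
    counts: Counter = Counter()
    for attrs in points:
        counts[(attrs.get(key), outcome(int(attrs["http.response.status_code"])))] += 1
    return counts


# Illustrative attribute sets, one per recorded request.
sample = [
    {"fga-client.request.client_id": "client-a", "http.response.status_code": 200},
    {"fga-client.request.client_id": "client-a", "http.response.status_code": 200},
    {"fga-client.request.client_id": "client-b", "http.response.status_code": 429},
    {"fga-client.request.client_id": "client-b", "http.response.status_code": 400},
]
counts = outcomes_by(sample, "fga-client.request.client_id")
```

Passing `fga-client.request.store_id` or `fga-client.request.model_id` as `key` gives the same breakdown per store or per model, provided those attributes are enabled on the metric.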

Documentation

  • Documentation in each SDK on how to configure logging, metrics and tracing
  • Documentation in our docs on setting up tracing and connecting it to Prometheus and Grafana

Configuration

For each SDK, we need to allow the configuration of tracing, metrics and logging.

For example, in the JS SDK we may add configuration modeled on the server config (note: the config structure may change).

Implementation

We will be using OpenTelemetry, e.g. open-telemetry/opentelemetry-js (for JS) or the appropriate SDK for each language: Language APIs & SDKs

Alternatives and current workarounds

No response

References

No response

Additional context

Roadmap Item: openfga/roadmap#41

@rhamzeh rhamzeh added enhancement New feature or request go-sdk Affects the Go SDK dotnet-sdk Affects the C#/DotNet SDK js-sdk Affects the JavaScript SDK python-sdk Affects the Python SDK java-sdk Affects the Java/Kotlin SDK epic labels Apr 30, 2024
@rhamzeh rhamzeh moved this from Backlog to Ready in SDKs and Tooling Jun 5, 2024
@rhamzeh rhamzeh moved this from Ready to In progress in SDKs and Tooling Jun 24, 2024
@ewanharris ewanharris moved this from In progress to Ready in SDKs and Tooling Jul 9, 2024
@rhamzeh rhamzeh moved this from Ready to In progress in SDKs and Tooling Jul 11, 2024
@rhamzeh rhamzeh changed the title Observability in the SDKs Emit OpenTelemetry Metrics in the SDKs Nov 25, 2024
@rhamzeh rhamzeh closed this as completed Nov 25, 2024
@github-project-automation github-project-automation bot moved this from In progress to Done in SDKs and Tooling Nov 25, 2024