Configurable _source.enabled Elastic mapping property #1629
Thank you, Lars, for offering to make a PR. |
The quoted section of the documentation explains why disabling the Just so we're on the same page, the fields are there (you can check them in the Kibana index pattern, for instance) and you can query/aggregate them, but the raw values don't show up for each document on the Kibana Discover page. I can see how this makes ad-hoc querying of the metrics difficult unless you already know the metric names/tags. It would be nice if Kibana could still auto-complete the field names/values for this purpose.

The proposal to enable the Other Elasticsearch/Kibana users also feel free to give your input on this, as the proposed change would affect the default behavior for all users of the Micrometer Elasticsearch registry.

The above being said, there is the workaround currently of specifying your own template that uses Also, I want to point out that recently, the following dashboards/visualizations of Micrometer metrics in Kibana have been published by some users. |
What's the impact?
To make the impact of this explicit (it might be obvious already, but just to put everyone on the same page): if you have the following mapping and 3 sample docs:
A search like
And Discover will show the same: But Visualizations (and thus Dashboards) will work just like normal: Is this the right tradeoff? I don't know :)

How much disk will this save?
As always, it depends. The number of fields, their mapping, how well the data can be compressed, and so on will all play a role. Do you have a representative dataset? Then we could easily try it out.

Which Elasticsearch versions support this?
This feature has been around for a long time. I'm not sure what version range of Elasticsearch you support, but you could probably enable this for all versions (for example see the docs for 5.0).

PS: Rollups
Maybe the solution could also include rollups? Basically it takes the raw data and pre-aggregates it. So you could have 10s intervals for today, but after 48h you only keep 5min intervals around; IMO that would save a lot more disk space (but add different tradeoffs).

PPS: Compression
As mentioned in the docs, you could also look into compression if you're not doing that already. |
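To put a rough number on the rollup idea above, here is a small sketch in plain Python. It is not the Elasticsearch rollup API, just an illustration of pre-aggregating 10s samples into 5min buckets; the `rollup` helper and the sample data are made up for this example.

```python
from collections import defaultdict

SECONDS_PER_DAY = 24 * 60 * 60


def docs_per_day(interval_seconds: int) -> int:
    """Documents written per day for one time series at a fixed interval."""
    return SECONDS_PER_DAY // interval_seconds


def rollup(samples, bucket_seconds):
    """Pre-aggregate (timestamp, value) samples into fixed-size time buckets.

    Hypothetical helper: keeps count, sum and max per bucket, which is enough
    to recompute averages and peaks without the raw samples.
    """
    buckets = defaultdict(lambda: {"count": 0, "sum": 0.0, "max": float("-inf")})
    for ts, value in samples:
        start = ts - ts % bucket_seconds  # align to the start of the bucket
        bucket = buckets[start]
        bucket["count"] += 1
        bucket["sum"] += value
        bucket["max"] = max(bucket["max"], value)
    return dict(buckets)


raw_per_day = docs_per_day(10)          # one doc every 10 seconds
rolled_per_day = docs_per_day(5 * 60)   # one doc every 5 minutes
reduction = raw_per_day / rolled_per_day

# Ten minutes of made-up samples at a 10s interval, all with value 1.0.
samples = [(ts, 1.0) for ts in range(0, 600, 10)]
rolled = rollup(samples, bucket_seconds=300)

print(raw_per_day, rolled_per_day, reduction)
print(rolled)
```

Per series, going from 10s to 5min intervals means 30 times fewer documents for the older data, at the cost of losing the individual samples inside each bucket.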
I like options |
BTW some other ideas for further reducing storage requirements:
|
The description states that this is the behaviour in Elasticsearch 7.3 and up, but if you check the docs this is the behavior in any Elasticsearch version. Don't see what is introduced in terms of |
I created #2363 to try to resolve this. |
I also created #2364 to incorporate the suggestion from @xeraa in #1629 (comment). |
We discussed this internally today. My thought is still that metrics data should not have Despite that, I can see the usefulness of it being easier to explore the contents of the metrics. As a compromise, we're thinking of making this configurable, but printing a warning log discouraging its use in production. See #2363. We can still consider feedback on the idea.

After our internal discussion, I now see in the latest documentation, it is worded more to encourage leaving
@xeraa I like the idea of a data-driven approach. Things can vary a lot, but a good baseline for a lot of users is probably the metrics that are auto-configured in a basic Spring Boot application. This can be reproduced by generating a project from https://start.spring.io with the Actuator and Web dependencies. Add We could run such an app for 10 minutes or so to extrapolate how much data might be generated in an hour and multiply it by a few hundred or thousand to estimate a production environment of many apps with many instances. I don't know if this is a good strategy for estimating the difference, or if there are fixed costs that would keep us from extrapolating the relatively small amount of data from 10 minutes with one instance to a longer run with many more instances.

Separate from this, we really probably should be moving in the direction of writing metric data as a data stream, I think. I don't know if that obviates this |
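The extrapolation described above is simple arithmetic; a sketch of it follows. The 2 MB figure for a 10-minute run and the 500-instance fleet are placeholders, not measurements, and the calculation is strictly linear, so it ignores exactly the fixed costs mentioned as a concern.

```python
def extrapolate_bytes(sample_bytes: float, sample_minutes: float,
                      target_hours: float, instances: int) -> float:
    """Linearly scale the index size observed in a short run.

    Assumes storage grows proportionally with time and instance count,
    which may not hold for small samples (fixed per-index overhead,
    compression improving with volume, and so on).
    """
    bytes_per_minute = sample_bytes / sample_minutes
    return bytes_per_minute * 60 * target_hours * instances


# Placeholder: index size after one instance ran for 10 minutes.
sample_bytes = 2_000_000

one_instance_hour = extrapolate_bytes(sample_bytes, 10, target_hours=1, instances=1)
fleet_day = extrapolate_bytes(sample_bytes, 10, target_hours=24, instances=500)

print(f"one instance, one hour: {one_instance_hour / 1e6:.1f} MB")
print(f"500 instances, one day: {fleet_day / 1e9:.1f} GB")
```

Running the same app twice, once with `_source` disabled and once enabled, and feeding both measured sizes through this would give a first estimate of the difference.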
But also warn about the costs associated as a mitigation to this mistakenly being enabled in a production environment. Closes gh-1629
Re-opening this as it looks like #2363 got inadvertently dropped from 1.10.0 when reverting 2.0-specific changes. See #2363 (comment) |
Looks like the easiest way to fix this is to clone the library, patch it, and build it myself. God bless open source. |
@denistsyplakov, the moment you do that, you start the adventure of syncing your fork and maintaining the CI/CD by yourself... Not sure which one is the bigger pain. |
Two sidenotes from the Elasticsearch side:
|
@xeraa thanks for always keeping us aware of the latest features that may be relevant. It's much appreciated. On TSDS, where things left off was #3763 (comment). As far as I'm aware, we can already write metrics using Micrometer to a TSDS using the same functionality that writes to a traditional index (per #2997), but we don't handle creating the index template (or equivalent) for it now. Maybe our next step should be to do what was mentioned in

@denistsyplakov Micrometer by default creates an index template if one doesn't already exist. You can create your own index template in your Elasticsearch instance and have Micrometer write to it. You could also modify the index template that Micrometer had previously created. So there should have been nothing blocking users this whole time from enabling source if they wanted - it merely wasn't a configuration option on the index template Micrometer creates if none exists.

That said, the feature is now (re-)merged for inclusion in 1.14.0-RC1 and will be available in 1.14.0-SNAPSHOT versions momentarily. |
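For anyone who wants to try the bring-your-own-template workaround, a minimal sketch of a template body with `_source` enabled might look like the following. It only builds and prints the JSON and does not talk to a cluster. The template name `metrics_template`, the index pattern `micrometer-metrics*`, and the two field mappings are assumptions for illustration, so compare them against the template actually present in your Elasticsearch instance before using any of it.

```python
import json

# Assumed names, not confirmed in this thread: check your own cluster.
TEMPLATE_NAME = "metrics_template"
INDEX_PATTERN = "micrometer-metrics*"


def build_template(source_enabled: bool) -> dict:
    """Build a 7.x-style legacy index template body (PUT _template/<name>)."""
    return {
        "index_patterns": [INDEX_PATTERN],
        "mappings": {
            # The setting this issue is about.
            "_source": {"enabled": source_enabled},
            # Illustrative subset of fields only.
            "properties": {
                "name": {"type": "keyword"},
                "count": {"type": "double"},
            },
        },
    }


body = build_template(source_enabled=True)
print(f"PUT _template/{TEMPLATE_NAME}")
print(json.dumps(body, indent=2))
```

A template only applies to indices created after it exists, so already-created indices keep whatever `_source` setting they were created with.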
From Elastic's documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/7.3/mapping-source-field.html
This means that it is not possible to see the fields in the index and Kibana view will not show the metrics.
I would make a pull request to change to enabled: true for Elasticsearch 7.3 and up.