Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracing Add-on For Linkerd #3955

Merged
merged 50 commits into from
Feb 26, 2020
Merged

Tracing Add-on For Linkerd #3955

merged 50 commits into from
Feb 26, 2020

Conversation

Pothulapati
Copy link
Contributor

@Pothulapati Pothulapati commented Jan 20, 2020

Follows #3794

This PR adds the tracing components as a Linkerd Add-On. The add-on model follows the same consistent approach both through Helm and Linkerd CLI.

The following changes are part of this PR:

  • Pre-Requisite Changes:
    • Moved some common yaml templates like _nodeselectors.yaml,etc required by multiple sub-charts to partials for re-use. Also updating the relevant helm tests.
  • Helm Changes:
    • A Tracing sub-chart is added with partials dependency (for injection). This sub-chart is also added as an optional dependency for linkerd2 chart.
    • Values.yaml has been updated to include the tracing add-on config.
    • linkerd-values cm has been added which stores the add-on configurations, and will only be present when there is at-least one Add-On.
  • Install Path:
    • Updates installOptions to include a config option, through which the configuration for the add-ons is given to the CLI.
    • During Install, A final step has been added to read config (if given), and the default Values will be overwritten by this.
    • If there are any add-ons present, those sub-charts are loaded and rendered in this step, by passing both .Values.global and the add-on values to the sub-chart.
  • Upgrade Path:
    • The upgrade path in the final stage, now first checks if the linkerd-values cm exists, and updates the default addon values with the values present in linkerd-values, this is done to have persistent config upgrades.
    • It then checks if there is a config option present, and overwrite the default values further with the options from the config file. This allows users to change configuration of add-ons during upgrades.

The logic to Overwrite the values struct, is done by this fork of mergo at /~https://github.com/pothulapati/mergo, because of this issue/decision from the official project.

To try tracing as an add-on:

  • Create a new config file as config.yaml
tracing:
  enabled: true
  • Now Run,
    linkerd install --config config.yaml --control-plane-tracing | kubectl apply -f -
    and see the magic happen ;)

@ihcsim @alpeb @zaharidichev @grampelberg

As add-ons re-use the partials helm chart, all the templates needed by multiple charts should be present in partials
This commit also updates the helm tests
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Tracing sub-chart includes open-census and jaeger components as a sub-chart which can be enabled as needed

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
This includes new interface for add-ons to implement, with example tracing implementation

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Changes include:
 - Adds an optional Linkerd Values configmap which stores add-on configuration when add-ons are present.
 - Updates Linkerd install path to check for add-ons and render their sub-charts.
 - Adds a install Option called config, which is used to pass confiugration for add-ons.
 - Uses a fork of mergo, to over-write default Values with the Values struct generated from config.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Upgrade path now checks for the linkerd-values cm, and overwrites the default values with it, if present.
It then checks the config option, for any further overwrites

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
also adds relevant nil checks

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
@Pothulapati
Copy link
Contributor Author

Process required to add new Add-Ons:

  • A sub-chart has to be created first at /charts/linkerd2/add-ons/<add-on>, and added as a dependency to the linkerd2 sub-chart, with the condition field as <add-on>.enabled from the linkerd2/values.yaml. The linkerd2/values.yaml will be the file where the default config for add-ons will also be present.
  • The sub-chart can optionally take a dependency on partials to have proxy injection by including the necessary templates.
  • The Values struct in pkg/charts/values.go should have the relevant field for add-on, with the same structure as that of the one present in Values.yaml.
  • The relevant add-on config struct will added at pkg/charts/linkerd2/<add-on>.go, and should also implement the AddOn interface, as that logic is used in the install and upgrade paths to render the add-on templates.
  • Finally, The mergeAddOnValues and checkAddOns functions in install.go should be updated to check for the relevant newly added add-ons.

@Pothulapati
Copy link
Contributor Author

This same add-on model can be used for making both grafana and prometheus as add-ons as per the test PR #3946
Separate PR's will be made for them, once this is accepted as there are some changes required to make grafana and prometheus as add-ons

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Also refactors the linkerd-values cm to work the same with helm

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
@Pothulapati
Copy link
Contributor Author

Sorry, Did not realize there was a small error with linkerd-config causing identity pod to fail,
Fixed it.

Ready to review, now :)

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Copy link
Member

@alpeb alpeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Pothulapati, I'm starting to review this. Looks promising!

I guess this supersedes #3946? I like that you're just doing the tracing part here, as an example, and then Grafana and Prometheus can be done later as separate PRs 😉

Quick question: why having the addons directly under charts/linkerd2 and not directly under charts?

Nit: An entry in .gitignore has to be added to avoid pushing anything under charts/**/charts (like partials-0.1.0.tgz)

I'll be giving this a full test next week 🎉

charts/linkerd2/add-ons/tracing/Chart.yaml Outdated Show resolved Hide resolved
cli/cmd/install.go Outdated Show resolved Hide resolved
pkg/charts/linkerd2/tracing.go Outdated Show resolved Hide resolved
Comment on lines 1183 to 1192
func mergeAddonValues(values, addonValues *l5dcharts.Values) error {

if addonValues.Tracing != nil {
if err := mergo.Merge(addonValues.Tracing, values.Tracing); err != nil {
return err
}
values.Tracing = addonValues.Tracing
}
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not having the new --config flag serve as overrides for all the values, not just for add-ons? And if so, it might be clearer to implement that as a preliminary PR...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason is that, some fields in values like config etc are generated way before in the install/upgrade path but as we override them with config last in the pipeline, this would require us to again redo those values.

Sure, we can keep this overwrite logic in the start but that would require a lot of code-changes. So, the plan is to do that next but keep it out of this PR. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow. Doesn't this overriding logic imply that only the values present in the file provided through --config will get overridden, leaving all the others untouched? Why would it need to regenerate those other values? Maybe an example might help me understand 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, Sorry for not being clear.
My main concern is with this, as it is built by combining various values and is added in the linkerd-config cm, we will have to regenerate that agian after the override this. We could regenrate that again if it was based on Values struct but currently it is based on identityInstallOptions.

Configs ConfigJSONs `json:"configs"`

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Another reason to refactor this at some point in the future so we have to rely less on the structs in install.go and leverage those in values.go more directly 😉

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend excluding this change from this PR for now. It feels like a more complicated problem than we first thought. The addon will still work without the --config flag, right? It will be restricted to just defaults in the values.yaml.

@@ -0,0 +1,33 @@
{{ if or (.Values.tracing.enabled) -}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the or here is not needed?
Also if tracing is not enabled then $dupValues won't be defined. Will that cause a problem down at the end of this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if condition is to not have linkerd-values cm to be deployed, when there are zero add-ons enabled. Hence, the or condition but once we make the linkerd-values common for all config, then we should be able to remove this if condition.

Also introduced a defaultGetFiles method for add-ons

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
@Pothulapati
Copy link
Contributor Author

@alpeb Updated gitignore for the same :) Thanks
reg sub-charts in linkerd2
There is no particular problem regarding that, but my opinion was that, as the sub-charts can't be re-used individually i.e without linkerd2 parent chart, It made sense to be inside. :) but I can move it if needed.

@Pothulapati Pothulapati requested a review from alpeb January 28, 2020 09:47
.gitignore Outdated Show resolved Hide resolved
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Copy link
Member

@alpeb alpeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Pothulapati, this tested great to me.

It tested well with helm install, helm upgrade, linkerd install and linkerd upgrade.
One bug though is that the --config option is only available to the "control-plane" stage in both linkerd install and linkerd upgrade.

reg sub-charts in linkerd2
There is no particular problem regarding that, but my opinion was that, as the sub-charts can't be re-used individually i.e without linkerd2 parent chart, It made sense to be inside. :) but I can move it if needed.

Neither do we publish the "partials" nor "patch" charts, yet they reside under /charts.

Various things to remember when this functionality gets documented:

  • Note that the collector and Jaeger are installed in linkerd's namespace, which differs from the current tracing walkthrough doc
  • Need docs on how to turn off (tracing.enabled: false plus kubectl apply --prune I guess?)

charts/linkerd2/templates/linkerd-values.yaml Outdated Show resolved Hide resolved
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
@Pothulapati
Copy link
Contributor Author

Neither do we publish the "partials" nor "patch" charts, yet they reside under /charts.

Yep, :) Moved them to charts now.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
cli/cmd/install.go Outdated Show resolved Hide resolved
cli/cmd/install.go Outdated Show resolved Hide resolved
Copy link
Member

@alpeb alpeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Pothulapati , I've added a few more comments below.
I tried testing but I couldn't see the tracing components in the output. I'm trying with

bin/go-run cli install --addon-config config.yaml

where config.yaml looks like

# Configuration for Add-ons
tracing:
  enabled: true
  collector:
    name: linkerd-collector
    image: omnition/opencensus-collector:0.1.10
    resources:
      cpu:
        limit: 1
        request: 200m
      memory:
        limit: 1Gi
        request: 400Mi
  jaeger:
    name: linkerd-jaeger
    image: jaegertracing/all-in-one:1.8
    resources:

Comment on lines 87 to 88
// readVirtualFiles read the content of a file from the VFS. If the file is
// directory, it also loads the children files content, recursively.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typos: reads and If the file is a directory

return nil
}

if err := walkVFS(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems walkVFS no longer needs to be declared as a lambda so you can flatten out the contents of this function

Comment on lines 75 to 85
// LoadDependencies loads all the dependent subcharts of the specified chart.
// It relies on LoadChart to load the files and metadata of the chart from the
// VFS.
func LoadDependencies(chartName string) ([]*chart.Chart, error) {
chart, err := LoadChart(chartName)
if err != nil {
return nil, err
}

return chart.Dependencies, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be further simplified by removing this function and just calling LoadChart directly and using .Dependencies on the returned value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the patch I sent to @Pothulapati, I didn't want to call LoadChart() in install.go directly, to avoid confusion with the existing l5dchart.Chart struct.

pkg/charts/charts_test.go Outdated Show resolved Hide resolved
Comment on lines 840 to 841
addOnValues, enabled := checkAddon(values, chartName)
if enabled {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be simplified with

if addOnValues, enabled := checkAddon(values, chartName); enabled {
...
}

Comment on lines 1223 to 1230
if !reflect.Indirect(r).FieldByName(strings.Title(name)).IsNil() {
if reflect.Indirect(r).FieldByName(strings.Title(name)).MapIndex(reflect.ValueOf("enabled")).Interface().(bool) {
values, err := yaml.Marshal(reflect.Indirect(r).FieldByName(strings.Title(name)).Interface())
if err != nil {
return nil, false
}
return values, true
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the addon config is missing then the value of reflect.Indirect(r).FieldByName(strings.Title(name)) would be an empty map (zero value) to which you can safely apply MapIndex, so no need of the nested if nor the IsNil() check.

You can also extract the common part of the remaining condition:

		config := reflect.Indirect(r).FieldByName(strings.Title(name)); 
                if config.MapIndex(reflect.ValueOf("enabled")).Interface().(bool) {
			values, err := yaml.Marshal(config.Interface())
			if err != nil {
				return nil, false
			}
			return values, true
		}

@alpeb
Copy link
Member

alpeb commented Feb 24, 2020

@Pothulapati Also, I wasn't able to test, but it appears this new approach isn't complatible with multi-stage? One thing that occurs to me for that would be to enforce putting config-level templates under a separate templates/config dir...

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
@ihcsim
Copy link
Contributor

ihcsim commented Feb 25, 2020

@alpeb We did some more testing yesterday, and decided that the only way to make sure the addons code works for the different install/upgrade commands and subcommands is to follow the pattern that is used for the partials, where we provide the full path inside the VFS. Let us know what you think. Thanks!

Copy link
Contributor

@ihcsim ihcsim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works great for me 👍! I tested with install, upgrade, install config, install control-plane and inject (with the --addon-config flag), using the CLI from the target/ folder to make sure it's truly reading the templates from the VFS.

@alpeb
Copy link
Member

alpeb commented Feb 25, 2020

@alpeb We did some more testing yesterday, and decided that the only way to make sure the addons code works for the different install/upgrade commands and subcommands is to follow the pattern that is used for the partials, where we provide the full path inside the VFS. Let us know what you think.

I liked how the previous changes allowed for so little customization to be required when adding a new add-on. So I was hoping the VFS loading logic, having being informed of the current stage, would be able to detect the stage some template belonged to given its subdir location or some other convention. Nice to have, but I would understand if that'd be too much at this point 🙂

@alpeb
Copy link
Member

alpeb commented Feb 25, 2020

Oh I see all the VFS magic from charts.go is gone now. Alright, that looks good to me; we'll see later if being able to add addons with little-to-none changes is really required 👍

Copy link
Member

@alpeb alpeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Pothulapati , it seems we're almost there 👍
I've added a couple of minor comments 🙏

addonValues := map[string][]byte{}

if values.Tracing != nil {
if values.Tracing["enabled"].(bool) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool that you found a way around reflection!

OTOH, If the enable property is not a boolean, this will cause a panic. Can you do instead something like

if enabled, ok := values.Tracing["enabled].(bool); !ok {
    return nil, fmt.Errorf("invalid value for 'Tracing.enabled' (should be boolean): ", values.Tracing["enabled"]);
} else if enabled {
...
}

cli/cmd/install.go Outdated Show resolved Hide resolved
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
@ihcsim
Copy link
Contributor

ihcsim commented Feb 26, 2020

I liked how the previous changes allowed for so little customization to be required when adding a new add-on

@alpeb Yeah, I back out of it because previously, Tarun and I thought of introducing the usage of the Helm lib to load charts in this PR. Then in a separate PR, see if we can refactor the existing charts code to use the same mechanism. But that would make it impossible to run the CLI unit tests (afaict) without requiring everyone to run bin/helm-build. (I didn't want to add a command.Exec() to the Go unit test code.) The reason is that when using the Helm lib, the render() function of install will always try to look for the addons subcharts in charts/linkerd2/charts, not charts/add-ons. There might be ways to try to bend it, but I feel it will further complicate the code.

We feel what we did here is easier to reason about, because it offers the consistency of using the same pattern as that used by the existing code to render the Linkerd, partials and CNI charts. Thanks for your help on this one!

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
@Pothulapati
Copy link
Contributor Author

@alpeb Thanks for the review! Updated the PR now.

As @ihcsim mentioned, These changes feel more concrete and stable as we are re-using the already present stuff. Also, Not using reflect, and writing concrete type checks make more sense!

@Pothulapati
Copy link
Contributor Author

One more thing, I think we should address is updating helm-build to also do dep up and get relevant sub-charts.

Copy link
Member

@alpeb alpeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ihcsim for the detailed explanation, and @Pothulapati for implementing this. I agree about the pending helm-build update, which can be tackled in a separate PR along with the extra CI step needed to publish the chart.

This looks good and tests fine for me! 🥇 🏆

@ihcsim ihcsim merged commit 948dc22 into linkerd:master Feb 26, 2020
kleimkuhler added a commit that referenced this pull request Feb 27, 2020
## edge-20.2.3

This release introduces the first optional add-on `tracing`, added through the
new add-on model!

The existing optional `tracing` components Jaeger and OpenCensus can now be
installed as add-on components.

There will be more information to come about the new add-on model, but please
refer to the details of [#3955](#3955) for how to get started.

* CLI
  * Added the `linkerd diagnostics` command to get metrics only from the
    control plane, excluding metrics from the data plane proxies (thanks
    @srv-twry!)
  * Added the `linkerd install --prometheus-image` option for installing a
    custom Prometheus image (thanks @christyjacob4!)
  * Fixed an issue with `linkerd upgrade` where changes to the `Namespace`
    object were ignored (thanks @supra08!)
* Controller
  * Added the `tracing` add-on which installs Jaeger and OpenCensus as add-on
    components (thanks @Pothulapati!!)
* Proxy
  * Increased the inbound router's default capacity from 100 to 10k to
    accommodate environments that have a high cardinality of virtual hosts
    served by a single pod
* Web UI
  * Fixed styling in the CallToAction banner (thanks @aliariff!)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants