nfd-topology-updater: retrieve kubelet config from API /configz #842

Merged
merged 1 commit into from
Nov 11, 2022

Conversation

Garrybest
Member

Currently, nfd-topology-updater retrieves a node's topologyManagerPolicy by reading the kubelet config file. However, the kubelet is sometimes configured through command-line flags instead of the config file, and sometimes the config file has been modified but the kubelet has not been restarted for the change to take effect.

I think it would be better to retrieve the current config from the kubelet API endpoint /configz, which returns the configuration the kubelet holds in memory. That config reflects what the kubelet is actually running with.
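As a rough illustration of the idea described above, the following Go sketch extracts the Topology Manager policy from a /configz response body. It assumes the response nests the configuration under a `kubeletconfig` key with a `topologyManagerPolicy` field; the helper name `policyFromConfigz` and the sample payload are hypothetical and are not taken from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// configzResponse models only the part of the response this sketch needs.
// Assumption: the kubelet configuration is nested under "kubeletconfig".
type configzResponse struct {
	KubeletConfig struct {
		TopologyManagerPolicy string `json:"topologyManagerPolicy"`
	} `json:"kubeletconfig"`
}

// policyFromConfigz extracts the Topology Manager policy from a raw
// /configz response body. Hypothetical helper, not the PR's code.
func policyFromConfigz(body []byte) (string, error) {
	var resp configzResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", fmt.Errorf("decoding configz response: %w", err)
	}
	if resp.KubeletConfig.TopologyManagerPolicy == "" {
		return "", fmt.Errorf("topologyManagerPolicy not present in response")
	}
	return resp.KubeletConfig.TopologyManagerPolicy, nil
}

func main() {
	// Hand-written sample payload for illustration only.
	sample := []byte(`{"kubeletconfig":{"topologyManagerPolicy":"single-numa-node"}}`)
	policy, err := policyFromConfigz(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("policy:", policy)
}
```

Fetching the body itself (authentication, TLS, and the node address) is left out on purpose, since those details depend on how the updater is deployed.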

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 2, 2022
@k8s-ci-robot
Contributor

Welcome @Garrybest!

It looks like this is your first PR to kubernetes-sigs/node-feature-discovery 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/node-feature-discovery has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Contributor

Hi @Garrybest. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 2, 2022
@Garrybest
Member Author

/cc @swatisehgal @fromanirh

@k8s-ci-robot k8s-ci-robot requested a review from swatisehgal July 2, 2022 12:14
@k8s-ci-robot
Contributor

@Garrybest: GitHub didn't allow me to request PR reviews from the following users: fromanirh.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @swatisehgal @fromanirh

@Garrybest
Member Author

I have tested the whole deployment in my own minikube cluster, and the logs show that this API works.

I0702 12:03:12.757291       1 main.go:74] detected kubelet Topology Manager policy "SingleNUMANodeContainerLevel"

Contributor

@ffromani ffromani left a comment

Thanks for this PR! This is an interesting direction.
We tried an approach like this some time ago but decided not to pursue it. I'm in favour of this general direction though, so I'll do some archeology to see whether the reasons why we stopped are still relevant.

@ffromani
Contributor

ffromani commented Jul 3, 2022

/cc @Tal-or

@k8s-ci-robot
Contributor

@fromanirh: GitHub didn't allow me to request PR reviews from the following users: Tal-or.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @Tal-or

@ffromani
Contributor

ffromani commented Jul 4, 2022

@Tal-or

@Tal-or
Contributor

Tal-or commented Jul 4, 2022

Very nice.
I don't recall atm why we eventually decided not to move on in this direction; I need to do some digging to find out.
It would be great to have some integration/e2e tests as part of this PR to validate this feature.

@Tal-or
Contributor

Tal-or commented Jul 5, 2022

There is some note in K8S docs:
https://v1-23.docs.kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/#generating-a-file-that-contains-the-current-configuration

which says:
Caution: The kubelet's configz endpoint is there to help with debugging, and is not a stable part of kubelet behavior. Do not rely on the behavior of this endpoint for production scenarios or for use with automated tools.

This is the reason why we decided to not pursue this direction eventually.
Maybe this warning isn't relevant anymore but this is something that should be figured out.

@Garrybest
Member Author

There is some note in K8S docs: https://v1-23.docs.kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/#generating-a-file-that-contains-the-current-configuration

which says: Caution: The kubelet's configz endpoint is there to help with debugging, and is not a stable part of kubelet behavior. Do not rely on the behavior of this endpoint for production scenarios or for use with automated tools.

This is the reason why we decided to not pursue this direction eventually. Maybe this warning isn't relevant anymore but this is something that should be figured out.

Thanks for the reminder. Now I try to use the token first. If that fails, we fall back to the config file.

@marquiz
Contributor

marquiz commented Jul 8, 2022

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 8, 2022
@marquiz marquiz added this to the v.0.12.0 milestone Jul 8, 2022
@marquiz
Contributor

marquiz commented Jul 8, 2022

Thanks @Garrybest for the PR. I think this makes sense.

I'm sorry I didn't have the time to review the PR this week and now I'm off to summer holidays 🙄 I will be off for four weeks but will review this when I'm back.

I'm not entirely sure about the fallback 🧐 How about changing -kubelet-config-file to -kubelet-config, so that you could then give an http:// endpoint or a file:// URI pointing to a local file?
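A minimal sketch of how a single option value could be dispatched on its URI scheme, as suggested here. The `configSource` type and `parseKubeletConfigArg` function are hypothetical names for illustration, and accepting `https` alongside `http` is an assumption of this sketch rather than something stated in the thread.

```go
package main

import (
	"fmt"
	"net/url"
)

// configSource describes where the kubelet configuration should be read from.
// Hypothetical type used only for this sketch.
type configSource struct {
	Kind     string // "file" or "http"
	Location string // local path for "file", full URL for "http"
}

// parseKubeletConfigArg classifies the option value by its URI scheme.
func parseKubeletConfigArg(arg string) (configSource, error) {
	u, err := url.Parse(arg)
	if err != nil {
		return configSource{}, fmt.Errorf("parsing %q: %w", arg, err)
	}
	switch u.Scheme {
	case "file":
		return configSource{Kind: "file", Location: u.Path}, nil
	case "http", "https":
		return configSource{Kind: "http", Location: u.String()}, nil
	default:
		return configSource{}, fmt.Errorf("unsupported scheme %q in %q", u.Scheme, arg)
	}
}

func main() {
	// Example values; the paths and addresses are illustrative only.
	for _, arg := range []string{
		"file:///var/lib/kubelet/config.yaml",
		"https://127.0.0.1:10250/configz",
		"/var/lib/kubelet/config.yaml",
	} {
		src, err := parseKubeletConfigArg(arg)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", src.Kind, src.Location)
	}
}
```

With an explicit scheme the user chooses the source up front, so no implicit fallback between the API and the file is needed.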

@ffromani
Contributor

ffromani commented Jul 9, 2022

I'm not entirely sure about the fallback 🧐 How about changing -kubelet-config-file to -kubelet-config, so that you could then give an http:// endpoint or a file:// URI pointing to a local file?

I like this idea!

@Garrybest
Member Author

/retest

@Garrybest
Member Author

Thanks @marquiz, I like this suggestion. Very cool 😄

@Garrybest Garrybest requested review from marquiz and removed request for swatisehgal November 7, 2022 02:04
Contributor

@marquiz marquiz left a comment

Thanks for the update @Garrybest! A few more comments but nothing big

@Garrybest Garrybest force-pushed the pr_config branch 2 times, most recently from ed0177c to 9ff5ea0 Compare November 7, 2022 13:07
@Garrybest
Member Author

Thanks for the suggestions!

@Garrybest Garrybest requested a review from marquiz November 7, 2022 13:33
Contributor

@marquiz marquiz left a comment

Just two more small nits and I would be good to go with this

@Garrybest
Member Author

/retest

@Garrybest
Member Author

It seems that there is something wrong with the robot 🤣

Contributor

@marquiz marquiz left a comment

Thanks for working on this and for the quick responses @Garrybest 😊 I think we can merge this but I'll give others some time to chime in, too

There really seems to be something odd in prow...
/retest

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 8, 2022
@Garrybest
Member Author

I think we can merge this but I'll give others some time to chime in, too

No problem. Thanks again for your rigorous reviewing 😄

@marquiz
Contributor

marquiz commented Nov 8, 2022

/retest

@ffromani
Contributor

ffromani commented Nov 8, 2022

Thanks for working on this and for the quick responses @Garrybest 😊 I think we can merge this but I'll give others some time to chime in, too

There really seems to be something odd in prow... /retest

Thanks for this! I don't have any additional comments.

@marquiz
Contributor

marquiz commented Nov 8, 2022

Thanks for this! I don't have any additional comments.

👍
/assign @fmuyassarov

prow had some issues so please fix those

Signed-off-by: Garrybest <garrybest@foxmail.com>
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Garrybest, marquiz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@marquiz
Contributor

marquiz commented Nov 11, 2022

@fmuyassarov @zvonkok you wanna check this or should we just merge?

@fmuyassarov
Member

@fmuyassarov @zvonkok you wanna check this or should we just merge?

I wanted to have a look. Will review it in a couple of hours today.

Member

@fmuyassarov fmuyassarov left a comment

/lgtm
Thanks

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 11, 2022
@k8s-ci-robot k8s-ci-robot merged commit 554145f into kubernetes-sigs:master Nov 11, 2022
@Garrybest Garrybest deleted the pr_config branch November 11, 2022 12:50
@marquiz marquiz mentioned this pull request Dec 20, 2022
22 tasks
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
7 participants