Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STOR-2040: CLI command to display bound pvc filesystem usage percentage #1854

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

gmeghnag
Copy link

The code makes use of a Prometheus query to implement the oc adm top persistentvolumeclaims and show usage statistics for bound persistentvolumeclaims, as follows:

oc adm top pvc --insecure-skip-tls-verify=true -n reproducer-pvc
NAMESPACE      NAME               USAGE(%) 
reproducer-pvc pvc-reproducer-pvc 98.28    
reproducer-pvc pvc-test-pvc       14.56    

It supports the following flags:

    -A, --all-namespaces=false:
	If present, list the pvc usage across all namespaces. Namespace in current context is ignored even if
	specified with --namespace

    --insecure-skip-tls-verify=false:
	If true, the server's certificate will not be checked for validity. This will make your HTTPS connections
	insecure

@openshift-ci-robot
Copy link

openshift-ci-robot commented Aug 22, 2024

@gmeghnag: This pull request references STOR-2040 which is a valid jira issue.

In response to this:

The code makes use of a Prometheus query to implement the oc adm top persistentvolumeclaims and show usage statistics for bound persistentvolumeclaims, as follows:

oc adm top pvc --insecure-skip-tls-verify=true -n reproducer-pvc
NAMESPACE      NAME               USAGE(%) 
reproducer-pvc pvc-reproducer-pvc 98.28    
reproducer-pvc pvc-test-pvc       14.56    

It supports the following flags:

   -A, --all-namespaces=false:
  If present, list the pvc usage across all namespaces. Namespace in current context is ignored even if
  specified with --namespace

   --insecure-skip-tls-verify=false:
  If true, the server's certificate will not be checked for validity. This will make your HTTPS connections
  insecure

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Aug 22, 2024
@openshift-ci openshift-ci bot requested review from ardaguclu and soltysh August 22, 2024 21:45
@ardaguclu
Copy link
Member

Thank you for spending time on this nice and useful feature. But that requires a KEP first to sync on the feature and implementation.

@gmeghnag
Copy link
Author

Hey @ardaguclu, what is a KEP? And how to create one? Thanks :)

@ardaguclu
Copy link
Member

Hey @ardaguclu, what is a KEP? And how to create one? Thanks :)

I think, you'd write an enhancement proposal under /~https://github.com/openshift/enhancements/tree/master/enhancements/oc and we'll discuss the design and align on it. Once it merges, we can return to this PR.

Copy link
Member

@dobsonj dobsonj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @gmeghnag :) This is nice work, I just have a few suggestions.

pkg/cli/admin/top/persistentvolumeclaims.go Show resolved Hide resolved
pkg/cli/admin/top/persistentvolumeclaims.go Show resolved Hide resolved
pkg/cli/admin/top/persistentvolumeclaims.go Outdated Show resolved Hide resolved
@dobsonj
Copy link
Member

dobsonj commented Oct 29, 2024

/retest

@ropatil010
Copy link

/test e2e-agnostic-ovn-cmd

@gmeghnag
Copy link
Author

/retest

@gmeghnag
Copy link
Author

DO NOT MERGE, I have to fix first https://issues.redhat.com/browse/OCPBUGS-44011

@gmeghnag
Copy link
Author

gmeghnag commented Nov 6, 2024

https://issues.redhat.com/browse/OCPBUGS-44011 fixed.

@gmeghnag
Copy link
Author

gmeghnag commented Nov 7, 2024

/retest

Copy link
Member

@ardaguclu ardaguclu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still believe that this is a cool feature and thanks for spending time on this. I think it requires major code refactorings though.

/label tide/merge-method-squash

@@ -0,0 +1,382 @@
package top
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you'll be the maintainer of this command, there should be a separate OWNERS file like this /~https://github.com/openshift/oc/blob/master/pkg/cli/admin/inspectalerts/OWNERS

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is still valid. We need OWNERS file to maintain and triage the issues with respect to this comment.

pkg/cli/admin/top/persistentvolumeclaims.go Outdated Show resolved Hide resolved
pkg/cli/admin/top/persistentvolumeclaims.go Show resolved Hide resolved
pkg/cli/admin/top/persistentvolumeclaims.go Outdated Show resolved Hide resolved
pkg/cli/admin/top/persistentvolumeclaims.go Outdated Show resolved Hide resolved
pkg/cli/admin/top/persistentvolumeclaims.go Show resolved Hide resolved
pkg/cli/admin/top/persistentvolumeclaims.go Outdated Show resolved Hide resolved
pkg/cli/admin/top/persistentvolumeclaims.go Show resolved Hide resolved
pkg/cli/admin/top/persistentvolumeclaims.go Outdated Show resolved Hide resolved
pkg/cli/admin/top/persistentvolumeclaims.go Show resolved Hide resolved
@openshift-ci openshift-ci bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Nov 19, 2024
@gmeghnag
Copy link
Author

@ardaguclu Thank you a lot for the code review!

Before spending time answering your questions and proceeding with the code modifications I would like to ask @tsmetana if doing so will make this PR merged, because, from our last chat I understood that you (with the storage leads) chose to go upstream first with this functionality, and doing so will probably need a different approach since Prometheus is not provided by default in vanilla Kubernetes.

In short, I don't want to spend time modifying the code if this PR will not be merged anyway.
Instead, if you think we can merge this PR in a future oc release, I'll work on that tomorrow.

Thanks

@tsmetana
Copy link
Member

@gmeghnag labelling the feature as experimental is exactly what would make it merge even if we want to implement this also in upstream kubectl. The experimental / TP nature gives us the possibility to have the feature early in oc, but be able to change it later in case the kubectl implementation would differ in an incompatible way.

Thanks for the awesome work!

@gmeghnag
Copy link
Author

@ardaguclu @tsmetana I've updated the PR, please let me know if anything else is needed! (:

@gmeghnag
Copy link
Author

/retest-required

@ardaguclu
Copy link
Member

Thank you. I quickly skimmed over the current code but we need to iterate a couple of times to be ready for merge. I think, it would be safer to defer this to 4.19 as unfortunately I don't have time for 4.18.

pkg/cli/admin/top/persistentvolumeclaims.go Show resolved Hide resolved
pkg/cli/admin/top/persistentvolumeclaims.go Outdated Show resolved Hide resolved
for _, promOutputDataResult := range promOutput.Data.Result {
namespaceName := promOutputDataResult.Metric["namespace"]
pvcName := promOutputDataResult.Metric["persistentvolumeclaim"]
usagePercentage := promOutputDataResult.Value[1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we know promOutputDataResult.Value's length is greater than 1 and we don't panic?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this is a "dogma" from Prometheus output, see here [1][2]:

Instant vectors are returned as result type vector. The corresponding result property has the following format:

[
  {
    "metric": { "<label_name>": "<label_value>", ... },
    "value": [ <unix_time>, "<sample_value>" ],
    "histogram": [ <unix_time>, <histogram> ]
  },
  ...
]

[1] https://prometheus.io/docs/prometheus/latest/querying/api/#instant-queries
[2] https://prometheus.io/docs/prometheus/latest/querying/api/#instant-vectors

Copy link
Contributor

openshift-ci bot commented Jan 21, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gmeghnag
Once this PR has been reviewed and has the lgtm label, please assign ardaguclu for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@@ -0,0 +1,382 @@
package top
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is still valid. We need OWNERS file to maintain and triage the issues with respect to this comment.


const (
localhostRecoveryTokenSecret = "localhost-recovery-client-token"
kubeApiserverNamespace = "openshift-kube-apiserver"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think using an arbitrary namespace would be allowed (especially using the openshift-kube-apiserver which is an important one). If there is no bearer tokens, it is better to return error?.

// RouteGetter is a function that gets a Route.
type RouteGetter func(ctx context.Context, namespace string, name string, opts metav1.GetOptions) (*routev1.Route, error)

type options struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

options is unexported which makes sense but why some of its fields are exported?

BearerToken string
}

func newOptions(streams genericiooptions.IOStreams) *options {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Better to export

o.BearerToken = o.ClientConfig.BearerToken
if len(o.ClientConfig.BearerToken) == 0 {
klog.V(4).Info(fmt.Sprintf(`no token is currently in use for this session, attempting to retrieve token from secret "%s" in namespace "%s"`, localhostRecoveryTokenSecret, kubeApiserverNamespace))
secret, err := o.ClientSet.CoreV1().Secrets(kubeApiserverNamespace).Get(context.TODO(), localhostRecoveryTokenSecret, metav1.GetOptions{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to continue with whatever we have defined in kubeconfig and should not try to use any other secret, etc.

return body, nil
}

type PromOutput struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can unexport this and the other nested ones.


reviewers:
- gmeghnag
- ardaguclu
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to add approvers and reviewers from which team will maintain this. I'd prefer not to be in here.

@@ -0,0 +1,8 @@
# See the OWNERS docs: https://git.k8s.io/community/contributors/guide/owners.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkg/cli/admin/top command exists and it is maintained by workloads team. What about moving this command to pkg/cli/admin/toppvc, etc. and have dedicated OWNERS?.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is also better to have an alias like this /~https://github.com/openshift/oc/blob/master/pkg/cli/admin/inspectalerts/OWNERS and

monitoring-approvers:


func (o *options) Run(ctx context.Context, args []string) error {
o.BearerToken = o.ClientConfig.BearerToken
if len(o.ClientConfig.BearerToken) == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also keep in mind that, kubeconfig has many fields to access cluster, not only bearer token. As a result, this command will not be usable if kubeconfig authentication mechanism is certificates, exec credential plugins, etc.

Copy link
Contributor

openshift-ci bot commented Jan 21, 2025

@gmeghnag: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn-serial 64325f9 link true /test e2e-aws-ovn-serial
ci/prow/e2e-agnostic-ovn-cmd 64325f9 link true /test e2e-agnostic-ovn-cmd
ci/prow/unit 64325f9 link true /test unit

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants