-
Notifications
You must be signed in to change notification settings - Fork 689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lownodeutilization]: Actual utilization: integration with Prometheus #1533
base: master
Are you sure you want to change the base?
[lownodeutilization]: Actual utilization: integration with Prometheus #1533
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hello, master. Due to the company's busy schedule previously, I only managed to complete half of the related KEP. I'm glad to see that you're working on this. It looks like you're aiming to reuse the current Node utilization logic. I have a few suggestions: It should support different data sources, similar to PayPal's load-watcher. Hope these suggestions help! |
Hello sir :) thank you for taking part in composing the out-of-tree descheduling plugin KEP.
You are on the right track here. I'd like to get in touch with load-watcher maintainers and extend the codebase to provide a generic interface for accessing metrics related to pod utilization as well. Currently, only actual node utilization gets collected. Meantime, I am forming the code here to be able to better integrate with other utilization sources like metrics.
This is where we can debate more. Thank you for sharing the specifics. There's an open issue for the pod autoscaler suggesting to introduce EMA: kubernetes/kubernetes#62235. Are you aware if there's a similar issue or a discussion for the cluster autoscaler? I'd love to learn more about how it's implemented there. Ultimately, the current plugin just needs to know which pod, when evicted, will improve the overall node/workload utilization when properly re-scheduled. I could see various ways to produce the utilization snapshot using various methods.
I can see how evicting hotspot pods is related to consuming the metrics/real-time node utilization. In the current plugin context this is more suitable for a new/different plugin. I can also see how |
c889a53
to
1f55c4d
Compare
d744a96
to
800c92c
Compare
kubernetes/kubernetes#128663 to address the discrepancy in the fake metrics client node/pod metricses resource name. |
f30f8a1
to
2e63411
Compare
0330902
to
baa6650
Compare
/test pull-descheduler-verify-master |
Integration with kubernetes metrics in #1555. |
baa6650
to
2442967
Compare
668f0fb
to
477104c
Compare
425f449
to
d16e2b0
Compare
/retest-required |
/retest-required |
fbf6fb2
to
3ec47f7
Compare
/hold |
Waiting for #1555 to merge |
4a21180
to
a4186f1
Compare
d30044e
to
c478ce8
Compare
c478ce8
to
d143aad
Compare
/hold cancel |
MetricsCollector MetricsCollector | ||
|
||
// Prometheus enables metrics collection through Prometheus | ||
Prometheus Prometheus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't new fields be optional and allow nil value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be good to also consider changing fields introduced in #1555
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is tricky. The versioned type adds the omitempty
tag. Yet given Prometheus
is a struct one needs to turn the field into a pointer to get it omitted in the serialized json. For the actual setting you can ignore the field. The field is not required.
@@ -57,4 +57,14 @@ type MetricsUtilization struct { | |||
// metricsServer enables metrics from a kubernetes metrics server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same question about MetricsUtilization
field: shouldn't it be nullable?
@@ -57,4 +57,14 @@ type MetricsUtilization struct { | |||
// metricsServer enables metrics from a kubernetes metrics server. | |||
// Please see https://kubernetes-sigs.github.io/metrics-server/ for more. | |||
MetricsServer bool `json:"metricsServer,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be better expressed by an enum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. Enums can be easily extended with other values besides true/false.
|
||
type AuthToken struct { | ||
// raw for a raw authentication token | ||
Raw string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we considered disadvantages of using Raw tokens? Wouldn't it be better to start with the secret ref? And revisit the raw token topic in the future?
type Prometheus struct { | ||
URL string | ||
AuthToken AuthToken | ||
InsecureSkipVerify bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question about supporting the InsecureSkipVerify
.
client._nodeUtilization = make(map[string]map[v1.ResourceName]*resource.Quantity) | ||
client._pods = make(map[string][]*v1.Pod) | ||
|
||
results, warnings, err := promv1.NewAPI(client.promClient).Query(context.TODO(), client.promQuery, time.Now()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nicer to have the context from above
func (client *prometheusUsageClient) resourceNames() []v1.ResourceName { | ||
return []v1.ResourceName{ResourceMetrics} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this method?
@@ -34,6 +34,8 @@ import ( | |||
"sigs.k8s.io/descheduler/pkg/utils" | |||
) | |||
|
|||
const ResourceMetrics = v1.ResourceName("MetricResource") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect the value to be the same as the variable name. Any reason why is it different?
if len(resourceNames) == 1 && resourceNames[0] == ResourceMetrics { | ||
// Make ResourceMetrics 100% => 100 points | ||
nodeCapacity[ResourceMetrics] = *resource.NewQuantity(int64(100), resource.DecimalSI) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: might be safer to double fail here. It expects that the plugin above does the validation correctly.
@@ -95,6 +106,11 @@ func NewLowNodeUtilization(args runtime.Object, handle frameworktypes.Handle) (f | |||
return nil, fmt.Errorf("metrics client not initialized") | |||
} | |||
usageClient = newActualUsageClient(resourceNames, handle.GetPodsAssignedToNodeFunc(), handle.MetricsCollector()) | |||
} else if lowNodeUtilizationArgsArgs.MetricsUtilization.Prometheus.Query != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as mentioned, enum would look better here
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Extend the actual utilization awareness with Prometheus integration.
For testing purposes: