Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CronJob: Respect ControllerRef #42177

Merged
merged 10 commits into from
Apr 20, 2017

Conversation

enisoc
Copy link
Member

@enisoc enisoc commented Feb 27, 2017

What this PR does / why we need it:

This is part of the completion of the ControllerRef proposal. It brings CronJob into compliance with ControllerRef. See the individual commit messages for details.

Which issue this PR fixes:

This ensures that other controllers do not fight over control of objects that a CronJob owns.

Special notes for your reviewer:

Release note:

CronJob controller now respects ControllerRef to avoid fighting with other controllers.

cc @erictune @kubernetes/sig-apps-pr-reviews

@enisoc enisoc added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Feb 27, 2017
@enisoc enisoc added this to the v1.6 milestone Feb 27, 2017
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 27, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@k8s-github-robot k8s-github-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 27, 2017
@enisoc enisoc force-pushed the controller-ref-cronjob branch 2 times, most recently from 4973031 to d471655 Compare February 27, 2017 18:42
Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@soltysh
Copy link
Contributor

soltysh commented Feb 27, 2017

/approve

@soltysh
Copy link
Contributor

soltysh commented Feb 27, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 27, 2017
@kow3ns
Copy link
Member

kow3ns commented Feb 27, 2017

Reviewed 9 of 9 files at r1.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks broke.


pkg/controller/cronjob/cronjob_controller.go, line 58 at r1 (raw file):

// ControllerKind contains the schema.GroupVersionKind for this controller type.
var ControllerKind = batch.SchemeGroupVersion.WithKind("CronJob")

Probably should not be exported from package


Comments from Reviewable

@enisoc enisoc force-pushed the controller-ref-cronjob branch from d471655 to 1083d91 Compare February 28, 2017 01:02
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 28, 2017
@enisoc enisoc added the kind/bug Categorizes issue or PR as related to a bug. label Mar 1, 2017
@calebamiles
Copy link
Contributor

@k8s-bot kops aws e2e test this
@k8s-bot cvm gce e2e test this
@k8s-bot gci gce e2e test this
@k8s-bot gce etcd3 e2e test this

@enisoc enisoc force-pushed the controller-ref-cronjob branch 2 times, most recently from 9586055 to 3e166bd Compare March 3, 2017 23:00
@enisoc
Copy link
Member Author

enisoc commented Mar 4, 2017

@k8s-bot node e2e test this
@k8s-bot gci gce e2e test this
@k8s-bot cvm gce e2e test this

@enisoc enisoc force-pushed the controller-ref-cronjob branch from 3e166bd to 7505b1f Compare March 6, 2017 19:18
@enisoc
Copy link
Member Author

enisoc commented Mar 6, 2017

@soltysh Please take a look. I believe all comments are addressed.

@soltysh
Copy link
Contributor

soltysh commented Mar 7, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 7, 2017
@soltysh
Copy link
Contributor

soltysh commented Mar 7, 2017

@ethernetdan @calebamiles is this ok to merge for 1.6 release?

// List children (Jobs) before parents (CronJob).
// This guarantees that if we see any Job that got orphaned by the GC orphan finalizer,
// we must also see that the parent CronJob has non-nil DeletionTimestamp (see #42639).
// Note that this only works because we are NOT using any caches here.
jl, err := jm.kubeClient.BatchV1().Jobs(metav1.NamespaceAll).List(metav1.ListOptions{})
if err != nil {
glog.Errorf("Error listing jobs")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error is not printed here

@@ -117,6 +116,14 @@ func (jm *CronJobController) syncAll() {
js := jl.Items
glog.V(4).Infof("Found %d jobs", len(js))

sjl, err := jm.kubeClient.BatchV2alpha1().CronJobs(metav1.NamespaceAll).List(metav1.ListOptions{})
if err != nil {
glog.Errorf("Error listing cronjobs: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change glog.Errorf calls to utilruntime.HandleError (the latter is ratelimited among other things)

@@ -237,6 +244,18 @@ func syncOne(sj *batchv2alpha1.CronJob, js []batchv1.Job, now time.Time, jc jobC
}
*sj = *updatedSJ

if sj.DeletionTimestamp != nil {
// The CronJob is being deleted.
// Don't do anything other than updating status.
Copy link
Contributor

@0xmichalis 0xmichalis Apr 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no status update.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh well it seems it has already happened above


// Adopt Jobs it owns that don't have ControllerRef yet.
// That is, the Jobs were created by a pre-v1.6.0 master.
It("should adopt Jobs it owns that don't have ControllerRef yet", func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be an integration test, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was mainly added because we've run into problems with RBAC before. Do integration tests cover that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect so, @deads2k ?

enisoc added 5 commits April 18, 2017 13:59
Now that CronJob adds ControllerRef to Jobs it creates,
we need to set this default so legacy behavior is maintained.
This should only happen if the Jobs were created by an older version
of the CronJob controller, since from now on we add ControllerRef upon
creation.

CronJob doesn't do actual adoption because it doesn't use label
selectors to find its Jobs. However, we should apply ControllerRef
for potential server-side cascading deletion, and to advise other
controllers we own these objects.
@enisoc enisoc force-pushed the controller-ref-cronjob branch from bb3305d to c168dad Compare April 18, 2017 21:18
continue
}
job.OwnerReferences = append(job.OwnerReferences, sjControllerRef)
updatedJob, err := jc.UpdateJob(job.Namespace, job)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have more time, shall we switch to patch?

@@ -324,7 +324,8 @@ var _ = framework.KubeDescribe("Generated release_1_5 clientset", func() {
observeCreation(w)

By("deleting the cronJob")
if err := cronJobClient.Delete(cronJob.Name, nil); err != nil {
// Use OrphanDependents=false so the delete is synchronous.
if err := cronJobClient.Delete(cronJob.Name, &metav1.DeleteOptions{OrphanDependents: new(bool)}); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OrphanDependents is being deprecated in 1.7. Should we switch to use deleteOptions.propagationPolicy instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// List children (Jobs) before parents (CronJob).
// This guarantees that if we see any Job that got orphaned by the GC orphan finalizer,
// we must also see that the parent CronJob has non-nil DeletionTimestamp (see #42639).
// Note that this only works because we are NOT using any caches here.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soltysh we'll need to take care of this when we change cronjob controller to use informer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we've talked about it during KubeCon. If I have a reasonable solution in place in openshift I'll make sure to fix cronjob controller.

@janetkuo
Copy link
Member

Do we need to test that SJ won't adopt Jobs it doesn't own?

@enisoc
Copy link
Member Author

enisoc commented Apr 19, 2017

Do we need to test that SJ won't adopt Jobs it doesn't own?

This is unit tested via TestGroupJobsByParent(). Jobs that the CronJob doesn't own should not even be passed into adoptJobs().

The e2e test is only to ensure surprises like RBAC don't prevent adopt calls from working. In this case, the correct action is to do nothing, so we don't need an e2e test to make sure other components don't interfere.

There is no "release/orphan" case since CronJob does not use a label selector to find its children.

@enisoc enisoc force-pushed the controller-ref-cronjob branch from c168dad to 01aded9 Compare April 19, 2017 22:31
enisoc added 5 commits April 19, 2017 15:42
This is needed now that the default is OrphanDependents.
This prevents a race with the GC while it orphans dependents.
Currently, an e2e test is the only way to ensure we have the proper RBAC
permissions to adopt Jobs.
@enisoc enisoc force-pushed the controller-ref-cronjob branch from 01aded9 to be1fe95 Compare April 19, 2017 22:42
Copy link
Member

@janetkuo janetkuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@enisoc enisoc added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 20, 2017
@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 42177, 42176, 44721)

@k8s-github-robot k8s-github-robot merged commit f25a657 into kubernetes:master Apr 20, 2017
@enisoc enisoc deleted the controller-ref-cronjob branch April 21, 2017 19:19
@enisoc enisoc mentioned this pull request Apr 24, 2017
@davidghiurco
Copy link

Does this mean that whenever I delete a cronjob from a k8s cluster, all its jobs, and associated pods would get deleted too?
Because I am still seeing orphaned jobs and pods with
kubectl get jobs
and
kubectl get pods

after deleting their associated cronjob. See my comment on #42627

@soltysh
Copy link
Contributor

soltysh commented Jun 14, 2018

Does this mean that whenever I delete a cronjob from a k8s cluster, all its jobs, and associated pods would get deleted too?

That depends how you remove the object, it is still possible to remove an object and orphan its dependents.

@davidghiurco
Copy link

I delete the object using helm.
e.g.
helm delete --purge <helm_release_name>

@soltysh
Copy link
Contributor

soltysh commented Jun 21, 2018

Can you provide me the actual requests helm delete sends over to the server, I'm not familiar with helm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.