create scheduling_decision metrics #18

Merged

@yangkev yangkev commented Mar 31, 2020

This commit introduces two new metrics:

scheduling_decision_invoke is a counter per CronJob, per Namespace,
that is incremented when the cronjob_controller decides to invoke a
CronJob.

scheduling_decision_skip is a counter per CronJob, per Namespace, that
is incremented when the cronjob_controller sees that a CronJob has unmet
schedule times but still chooses to skip invoking it. This can happen
because:

  1. The controller missed the deadline to invoke the CronJob
  2. ConcurrencyPolicy is Forbid, and a previous job is still running
  3. ConcurrencyPolicy is Replace, and an error occurs while replacing
    the currently running job (such as a failed deletion)
  4. The controller fails to instantiate a Job spec from the CronJob
    template

Note: Kubernetes 1.14 does not include k8s.io/component-base/metrics.
When porting this to 1.16, we'll need to switch to
k8s.io/component-base/metrics.

@yangkev yangkev force-pushed the yangkev-cronmetrics-schedulingdecision branch from cf28a35 to 163f7b1 Compare March 31, 2020 23:54
vllry commented Apr 1, 2020

/lgtm

@yangkev yangkev force-pushed the yangkev-test-toolate-patch branch from ad3088c to 4928a20 Compare April 1, 2020 00:25
The skip reasons are:
1. ConcurrencyPolicy would be violated
2. Scheduling deadline was missed
3. General Kubernetes errors
@yangkev yangkev force-pushed the yangkev-cronmetrics-schedulingdecision branch from 014c75e to 2dc7fdd Compare April 1, 2020 00:29
@yangkev yangkev changed the base branch from yangkev-test-toolate-patch to release-1.14.10-lyft April 1, 2020 00:30
@yangkev yangkev merged commit 1ff6294 into release-1.14.10-lyft Apr 1, 2020