Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove CronJob "100 missed start times" error (binary search approach) #13

Merged
merged 1 commit into from Mar 24, 2020

Conversation

vllry
Copy link

@vllry vllry commented Feb 24, 2020

This PR is a less efficient but simpler alternative to #12

As the robfig/cron module only exposes a Next() function for schedules, this PR uses a binary search, rather than brute force, to traverse the schedule window.

The flow of getLatestMissedScheduleBinarySearch is roughly this:

  1. Check the second half of the window for the earliest schedule. If there is one, recurse.
  2. Check the first half of the window for the earliest schedule. If there is one, recurse.
  3. If neither window contained a valid schedule, there are no missed start times.
    The answer is the latest missed start time that is found.

getRecentUnmetScheduleTimes() is still the interface for performing this check (in the interest of easy patching for the next few releases). getRecentUnmetScheduleTimes() retains the required properties of its return types - specifically, results[-1]is the most result missed start time, and 0 <= len(results) <= 2` (used for events/logging). The results no longer contain a precise count, or other start times prior to the most recent one.

@vllry vllry changed the title [WIP] Binary search: Remove CronJob "100 missed start times" error [Unready] Binary search: Remove CronJob "100 missed start times" error Feb 25, 2020
@vllry vllry force-pushed the cronjobs-100-binarysearch branch from 7534e37 to aa4de58 Compare March 17, 2020 23:46
@vllry vllry changed the title [Unready] Binary search: Remove CronJob "100 missed start times" error Binary search: Remove CronJob "100 missed start times" error Mar 17, 2020
@vllry vllry force-pushed the cronjobs-100-binarysearch branch from aa4de58 to b1d481d Compare March 18, 2020 21:08
@vllry vllry changed the title Binary search: Remove CronJob "100 missed start times" error Remove CronJob "100 missed start times" error (binary search approach) Mar 18, 2020
// If we are looking at a time window less than a minute, there is "room" for at most 1 schedule, and we can skip the binary search.
minimumTimeUnit := time.Minute
if endWindow.Sub(startWindow) < minimumTimeUnit {
nextStart := schedule.Next(startWindow)
Copy link
Author

@vllry vllry Mar 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be one less (small) Next call to do more of a check prior to both branches that call getLatestMissedScheduleBinarySearch. When this function calls itself, it already knows if the call will yield a new result or not (duration < minute couldn't possibly contain 2 schedules, so the Next schedule in that window is definitely the last one).

@vllry vllry force-pushed the cronjobs-100-binarysearch branch from b1d481d to f7716e7 Compare March 19, 2020 19:47
} else {
latestSchedule, found := getLatestMissedScheduleBinarySearch(nextSchedule, endWindow, schedule)
if found && latestSchedule != nextSchedule {
return latestSchedule, 2 // We missed at least 2 schedules. Return the latest.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused how we conclude that we missed atleast 2 schedules here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have 1 schedule (being in the parent else statement implies !nextSchedule.After(endWindow)), AND we have another schedule result after that.

Copy link
Author

@vllry vllry Mar 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This structure is a weird hack to get an accurate "none, one, multiple" missed-schedule count (used for telemetry purposes).

There are other ways to get that data, such as returning an (incompte) count from getLatestMissedScheduleBinarySearch, and checking making a Next(startWindow) call if and only if the count is 1. If the count is 1, and there was more than 1 missed start time, Next(startWindow) would return a unique value. If were was really only 1, it would result the same as the final result.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh ok I see

}

// Break the window into 2 halves for a binary search.
midway := startWindow.Add(endWindow.Sub(startWindow) / 2)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use the standard library call for the binary search implementation?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize there was one.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing it directly seems more straightforward than trying to cram something into the Search function. It's a binary search approach, but it's not entirely a classical binary search problem.

Comment on lines +176 to +183
if !nextScheduleAfterMidway.After(endWindow) {
latestFound, found := getLatestMissedScheduleBinarySearch(nextScheduleAfterMidway, endWindow, schedule) // Check a sub-window between the found schedule and the end of the window.
if found {
return latestFound, true
} else {
return nextScheduleAfterMidway, true
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just recurse on getLatestMissedScheduleBinarySearch(midway, endWindow, schedule)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little more efficient to jump ahead to nextScheduleAfterMidway, and ignore the gap between midway and nextScheduleAfterMidway. The underlying implementation of Next() is sensitive to the duration of time between the start and the result - for example, Next() is a bit slower to get the next monthly schedule than the next minutely one.

@vllry vllry force-pushed the cronjobs-100-binarysearch branch from fe8217f to 87476dc Compare March 23, 2020 23:54
Co-Authored-By: Tom Wanielista <tomwans@users.noreply.github.com>

Co-Authored-By: Kevin Yang <16491459+yangkev@users.noreply.github.com>
@vllry vllry force-pushed the cronjobs-100-binarysearch branch from 87476dc to 8410402 Compare March 24, 2020 00:02
@vllry vllry merged commit 6e25011 into release-1.14.7-lyft Mar 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants