-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Increase staggered build timeout to 180 min #18568
Increase staggered build timeout to 180 min #18568
Conversation
Hey @josephevans , Thanks for submitting the PR
CI supported jobs: [windows-cpu, unix-cpu, edge, centos-cpu, website, miscellaneous, centos-gpu, windows-gpu, unix-gpu, clang, sanity] Note: |
I think 180 min is too high for a simple docker rebuild from scratch. 60 min tops, everything else will just delay the display of a stuck job. |
I agree 180min is a little excessive, but the sanity build already has this value as a timeout. It doesn't make sense to have a timeout less than the sanity build. If we want to revisit the timeouts in the individual builds, that should fall under a separate project/PR. |
This was observed for this PR #18560 Where sanity build exceeded the 20min timeout. |
Well, I'm not going to give way to the broken window theory. I'm rather a fan of the boy scout rule. Feel free to reduce/align the timeout of the underlying sanity job while you're at it. |
The sanity build timeout was increased from 120 to 180 minutes over a year ago. We clearly had some issues with it timing out in the past, so I'd like to avoid having this happen again in the future. |
Exactly, it was done one and a half year ago and I personally merged it. But since then, the system has been reworked by @leezu . There is a system in place in cloud watch which records all the metrics. Since I no longer work at Amazon, I lost access to them. If you can present me with a graph of the maximum durations, I'll be happy to reconsider. But until proven otherwise, I think it's safe to go back to a more reasonable level. We don't want a PR to be hanging for three hours just because the sanity check got into a hiccup. Before this timeout was totally fine because it didn't matter. But since the decision was made zo safe a few bucks and in turn sacrifice the development speed, I'm gonna be harsh on the critical path. And this pipeline was intentionally put on the critical path, so I'm treating it as a such a critical piece in my reviews. @sandeep-krishnamurthy I understand the push towards frugality, but then I'd expect insisting on highest standards if a critical path is introduced. This is no means meant against Joseph, but if the newly introduced critical path is actually causing issues, I'd like to see it properly addressed and not just duct taped. This breaking point was known when the switch to a sequential system was made and now the impact is being noticed. So I'd appreciate if it could be addressed (or reverted) properly and not expect Joseph to just quickly fix it without allocating some time and priority for him. |
Thanks @marcoabreu for your very thoughtful comments. I've decreased the timeouts for sanity build to 60 min as well as the staggered build trigger. |
* Increase staggered build timeout to 180 min, since sanity build has 180 min timeout. * Decrease timeout so everyone is happy. Co-authored-by: Joe Evans <joeev@amazon.com>
* Increase staggered build timeout to 180 min, since sanity build has 180 min timeout. * Decrease timeout so everyone is happy. Co-authored-by: Joe Evans <joeev@amazon.com>
* Increase staggered build timeout to 180 min, since sanity build has 180 min timeout. * Decrease timeout so everyone is happy. Co-authored-by: Joe Evans <joeev@amazon.com> Co-authored-by: Joe Evans <joeev@amazon.com> Co-authored-by: Tao Lv <tao.a.lv@intel.com>
* Increase staggered build timeout to 180 min, since sanity build has 180 min timeout. * Decrease timeout so everyone is happy. Co-authored-by: Joe Evans <joeev@amazon.com>
Description
The sanity build already has a timeout of 180 minutes. The staggered full build pipeline currently has a timeout of 20 minutes, but we are seeing timeouts because sometimes the sanity build takes longer than this (because of having to rebuild docker caches for branches other than master.) This results in the rest of the builds failing to get triggered. Increasing this timeout to 180 minutes allows the sanity build to complete and won't prevent the rest of the builds from being triggered.