-
Notifications
You must be signed in to change notification settings - Fork 6.8k
C++ test Core dump DROPOUT_PERF.TimingGPU #9857
Comments
looks like someone broke it, eh? |
Yep :( |
then it's not "flaky", right? I mean, it's never broken before, right? I also noticed it breaking in another build today. |
It's flaky because otherwise it would have been reported during PR stage. Well the number of test failures drastically increased since the mkldnn merge. But that's why we merged it, right? To get some data early on. |
just a side note, should we add try catch around for TESTs and fail so that consecutive tests continue to execute ? |
No, this would waste too many resources. Generally, multiple failures can be fixed by one change. If we wouldn't follow fail-fast, a lot of time would be wasted. Considering the fact that PRs should be small, the chance of multiple independent bugs causing multiple test failures is quite small compared to the number of total tests executed within every PR. The right approach is to fix one test failure, create a new commit and then fix the next test failure. While this actually sounds pretty wasteful as well, past experience has shown that this is rarely required. Additionally, we're currently working on something that allows you to run tests locally in exactly the same way the CI does. This will allow you to test yourself before pushing it to the CI. |
@marcoabreu I ran this test ~1000 times, couldnt replicate the failure. Can we close this issue ? |
@spidydev @marcoabreu I think there were some crossed wires on this one. This test is actually enabled. Was it at one time disabled? |
@KellenSunderland From the history, I dont think the test was ever disabled. |
@spidydev agree. @marcoabreu think we could close this one? |
Sure, thanks for following up. I think this was due to the MKLDNN change. |
The issue is not related to C++ API. Removing the C++ label. @mxnet-label-bot remove [C++] |
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/399/pipeline
The text was updated successfully, but these errors were encountered: