This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Flaky cpp test: dropout gpu perf #14257

Closed
eric-haibin-lin opened this issue Feb 26, 2019 · 5 comments


@eric-haibin-lin
Member

The dropout cpp test fails intermittently on MXNet master. I can reproduce the failure by running the cpp test locally a couple of times.

Reference failure log:
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-14066/5/pipeline/

@mxnet-label-bot
Contributor

Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended labels: Test, Flaky

@piyushghai
Contributor

@mxnet-label-bot Add [Test, Flaky]

@wkcn
Member

wkcn commented Feb 28, 2019

There is another issue about dropout gpu perf.
http://jenkins.mxnet-ci.amazon-ml.com/job/mxnet-validation/job/unix-gpu/job/PR-14278/1/display/redirect

C++ exception with description "[07:12:07] /work/mxnet/src/resource.cc:429: Check failed: state_space->ctx.dev_id == stream->dev_id (0 vs. -1) The device id of cudnn dropout state space doesn't match that from stream.

Maybe we should modify tests/cpp/include/test_op.h, changing
opContext_.run_ctx.stream = mshadow::NewStream<gpu>(true, true);
to
opContext_.run_ctx.stream = mshadow::NewStream<gpu>(true, true, 0);
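
To illustrate why passing the device id might matter, here is a minimal sketch of the mismatch. Everything in it (FakeStream, FakeStateSpace, NewFakeStream, CheckSameDevice, DeviceCheckPasses) is a hypothetical stand-in written for this example, not the real mshadow or MXNet code. It assumes, based on the "(0 vs. -1)" in the log above, that a stream created without an explicit device id ends up with dev_id -1.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical, simplified stand-ins for illustration only.
// These are NOT the real mshadow/MXNet types.
struct FakeStream {
  int dev_id;  // device the stream is bound to
};

struct FakeStateSpace {
  int dev_id;  // device the dropout state space was allocated on
};

// Assumption: when no device id is passed, the stream's dev_id stays at -1,
// which would match the "(0 vs. -1)" shown in the log.
inline FakeStream NewFakeStream(int dev_id = -1) {
  return FakeStream{dev_id};
}

// Mirrors the spirit of the failing check: the state space and the stream
// must report the same device id, otherwise an exception is thrown.
inline void CheckSameDevice(const FakeStateSpace& space,
                            const FakeStream& stream) {
  if (space.dev_id != stream.dev_id) {
    throw std::runtime_error(
        "Check failed: state_space dev_id == stream dev_id (" +
        std::to_string(space.dev_id) + " vs. " +
        std::to_string(stream.dev_id) + ")");
  }
}

// Convenience wrapper: returns true when the check passes, false when it throws.
inline bool DeviceCheckPasses(const FakeStateSpace& space,
                              const FakeStream& stream) {
  try {
    CheckSameDevice(space, stream);
    return true;
  } catch (const std::runtime_error&) {
    return false;
  }
}
```

With a state space on device 0, DeviceCheckPasses(FakeStateSpace{0}, NewFakeStream()) returns false, while DeviceCheckPasses(FakeStateSpace{0}, NewFakeStream(0)) returns true, which is the effect the proposed one-argument change is aiming for.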

@eric-haibin-lin
Member Author

I see. Maybe that will fix the failure reported previously.

@eric-haibin-lin
Member Author

#14278 fixes the stream used in the test. Feel free to reopen if this issue happens again.
