This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Use cudnn for dropout by default #14278

Merged
merged 5 commits into from
Mar 5, 2019

roywei (Member) commented Feb 28, 2019

Description

Enable cuDNN for dropout by default, following #13896.
This was turned on only in Gluon dropout layers, but not in mx.nd.Dropout.

Test:
After changing the flag, I ran the sample code in #13896 and got the same speed as reported there.

import mxnet as mx
a = mx.nd.ones((10, 200, 300, 500), ctx=mx.gpu(0))
a.attach_grad()
mx.autograd.set_recording(True)
%timeit mx.nd.Dropout(a, 0.5, mode='always').wait_to_read()

4.24 ms ± 4.35 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
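
The `%timeit` line above only works inside IPython. A minimal sketch of the same measurement as a plain script follows, so that the cuDNN and non-cuDNN paths can be compared side by side. The helper names `time_call` and `bench_dropout` are hypothetical, and the sketch assumes an MXNet build that exposes the `cudnn_off` argument on `mx.nd.Dropout` (the flag whose default this PR changes) and a machine with a CUDA GPU.

```python
import statistics
import timeit


def time_call(fn, repeat=7, number=100):
    """Return (mean, std. dev.) of the per-call time in seconds.

    Mirrors the "7 runs, 100 loops each" layout that %timeit reported above.
    """
    totals = timeit.repeat(fn, repeat=repeat, number=number)
    per_call = [t / number for t in totals]
    return statistics.mean(per_call), statistics.stdev(per_call)


def bench_dropout(cudnn_off, shape=(10, 200, 300, 500)):
    """Time mx.nd.Dropout on GPU 0 with the given cudnn_off setting.

    Assumption: the installed MXNet accepts cudnn_off on mx.nd.Dropout.
    """
    import mxnet as mx  # imported lazily so time_call is usable without MXNet

    a = mx.nd.ones(shape, ctx=mx.gpu(0))
    a.attach_grad()
    mx.autograd.set_recording(True)
    try:
        return time_call(
            lambda: mx.nd.Dropout(
                a, 0.5, mode='always', cudnn_off=cudnn_off
            ).wait_to_read()
        )
    finally:
        mx.autograd.set_recording(False)


# Usage (needs a CUDA build of MXNet and a GPU):
#   print("cuDNN on: ", bench_dropout(cudnn_off=False))
#   print("cuDNN off:", bench_dropout(cudnn_off=True))
```

Calling `wait_to_read()` inside the timed function matters because MXNet executes operators asynchronously; without it the loop would mostly measure the time to enqueue the operator.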

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

Set the default value of cudnn_off to false.

wkcn (Member) commented Feb 28, 2019

Thanks for your contribution!

unix-gpu CI raises the following exception:

C++ exception with description "[07:12:07] /work/mxnet/src/resource.cc:429: Check failed: state_space->ctx.dev_id == stream->dev_id (0 vs. -1) The device id of cudnn dropout state space doesn't match that from stream.

Maybe we should modify tests/cpp/include/test_op.h, changing
opContext_.run_ctx.stream = mshadow::NewStream<gpu>(true, true);
to
opContext_.run_ctx.stream = mshadow::NewStream<gpu>(true, true, 0);
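
To illustrate why passing an explicit device id resolves the failed check, here is a toy Python model of the comparison quoted in the exception. It is not MXNet or mshadow code: the classes `Stream` and `DropoutStateSpace` and the function `check_same_device` are hypothetical, and the default id of -1 is an assumption read off the "(0 vs. -1)" in the error message.

```python
from dataclasses import dataclass

# Assumption: a stream created without an explicit device id carries -1,
# as suggested by the "(0 vs. -1)" in the CI error message.
DEFAULT_DEV_ID = -1


@dataclass
class Stream:
    """Hypothetical stand-in for the GPU stream used by the test harness."""
    dev_id: int = DEFAULT_DEV_ID


@dataclass
class DropoutStateSpace:
    """Hypothetical stand-in for the cuDNN dropout state space."""
    dev_id: int


def check_same_device(state_space, stream):
    """Model of the check: both objects must refer to the same device."""
    if state_space.dev_id != stream.dev_id:
        raise RuntimeError(
            "Check failed: state_space->ctx.dev_id == stream->dev_id "
            f"({state_space.dev_id} vs. {stream.dev_id})"
        )


state = DropoutStateSpace(dev_id=0)

# Stream created with an explicit id, like NewStream<gpu>(true, true, 0): passes.
check_same_device(state, Stream(dev_id=0))

# Stream created without an id, like NewStream<gpu>(true, true): fails.
try:
    check_same_device(state, Stream())
except RuntimeError as err:
    print(err)
```

In this model the test harness fails only once dropout goes through the cuDNN path, which matches the CI failure appearing after the default was changed.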

roywei changed the title from "use cudnn for dropout by default" to "[WIP]use cudnn for dropout by default" on Feb 28, 2019
roywei requested a review from anirudh2290 as a code owner on February 28, 2019 18:46
roywei changed the title from "[WIP]use cudnn for dropout by default" to "Use cudnn for dropout by default" on Feb 28, 2019
wkcn (Member) commented Mar 1, 2019

Thanks for your contribution!
LGTM.
Waiting for CI to finish.

tests/cpp/include/test_op.h (outdated review thread, resolved)
anirudhacharya (Member) commented

@mxnet-label-bot add [pr-awaiting-review]

marcoabreu added the pr-awaiting-review label (PR is waiting for code review) on Mar 1, 2019
eric-haibin-lin (Member) left a comment

Thanks for the fix

roywei (Member, Author) commented Mar 2, 2019

@wkcn Thanks a lot for pointing out the fix needed in the test!
It seems the last CI run had a problem reporting the correct status (all jobs have passed, but one is still shown as unfinished).

wkcn (Member) left a comment

LGTM.
Could you please retrigger the CI? Thanks!

roywei (Member, Author) commented Mar 3, 2019

It seems the empty commit did not trigger CI, so I am trying to close and reopen the PR.

roywei closed this on Mar 3, 2019
roywei reopened this on Mar 3, 2019
wkcn (Member) commented Mar 3, 2019

@roywei It seems that the CI has some problems, and all PRs are waiting for status to be reported.

roywei (Member, Author) commented Mar 3, 2019

@wkcn yes, and it might be related to #11654

wkcn (Member) commented Mar 4, 2019

@roywei Hi. The CI problem has been addressed. Could you please retrigger it? Thanks!

wkcn merged commit 7243806 into apache:master on Mar 5, 2019

wkcn (Member) commented Mar 5, 2019

Merged. Thanks!

vdantu pushed a commit to vdantu/incubator-mxnet that referenced this pull request Mar 31, 2019
* use cudnn for dropout by default

* update test

* use dev_id

* trigger ci

* trigger ci
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
* use cudnn for dropout by default

* update test

* use dev_id

* trigger ci

* trigger ci
Labels
pr-awaiting-review PR is waiting for code review
5 participants