[CI][NightlyTestsForBinaries] Test Large Tensor: GPU Failing #14981
Comments
Hey, this is the MXNet Label Bot.
@mxnet-label-bot add [test]
Fixed in the latest run; we can close this now: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/NightlyTestsForBinaries/detail/master/320/pipeline
Currently, both the CPU and GPU tests are disabled due to the same memory issue. After a discussion with @access2rohit and @apeforest, we can try a few things:
We are having problems testing the above solutions on CI machines that have multiple jobs running in parallel.
The test still failed with 200G of shared memory on P3.2x; we need another approach for testing large tensors.
Description
Test Large Tensor: GPU step is failing with:
See http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/NightlyTestsForBinaries/detail/master/312/pipeline/144 for the full log.