
[MXNET-1405] tests for large tensor support for Softmax operator #15042

Merged: 1 commit merged into apache:master from the softmax branch on Jun 4, 2019

Conversation

access2rohit (Contributor) commented on May 22, 2019

Description

Added a test case to verify large tensor support for the Softmax operator.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-1405], where MXNET-1405 refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change or have been fixed to be compatible with this change

access2rohit (Contributor, Author):
@mxnet-label-bot add [pr-awaiting-review]

marcoabreu added the pr-awaiting-review label ("PR is waiting for code review") on May 22, 2019
access2rohit (Contributor, Author):
@apeforest: Please review

@@ -279,6 +279,19 @@ def test_diag():
    assert_almost_equal(r.asnumpy(), np.diag(a_np, k=k))


def test_softmax():
    def softmax_forward(input_data, true_output):
        data = mx.sym.Variable('data')
Contributor: Why not just use the ndarray API?
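
For illustration, a minimal sketch of the imperative alternative the reviewer is suggesting: calling softmax through the ndarray API directly, with no Variable/executor boilerplate. The helper name softmax_forward_nd and the shrunken shapes are ours, not the PR's; the real test uses LARGE_X-sized arrays.

    import mxnet as mx
    import numpy as np
    from mxnet.test_utils import assert_almost_equal

    def softmax_forward_nd(input_data, true_output):
        # Imperative ndarray API: apply the operator directly, no symbol graph.
        out = mx.nd.softmax(input_data, axis=0)
        assert_almost_equal(out.asnumpy(), true_output, rtol=1e-5, atol=1e-5)

    # 128 identical rows, so softmax over axis 0 is uniform: 1/128 per entry.
    softmax_forward_nd(mx.nd.ones((128, 1000)), np.full((128, 1000), 1.0 / 128))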

        nparr = ndarr.asnumpy()
        assert_almost_equal(nparr, true_output, rtol=1e-5, atol=1e-5)

    softmax_forward(mx.nd.ones((128, LARGE_X)), np.full((128, LARGE_X), 0.0078125))
Contributor: Is this 0.0078125 hand-calculated? How do we verify it? Can you use a numpy calculation as the expected result?

Contributor (Author): Yes ... will do
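
For what it's worth, the constant is easy to check: softmax over a column of 128 identical values is uniform, so each entry is 1/128 = 0.0078125. A quick numpy verification (a sketch, not code from the PR):

    import numpy as np

    col = np.ones(128)
    expected = np.exp(col) / np.exp(col).sum()  # softmax of equal inputs is uniform
    assert np.allclose(expected, 1.0 / 128)     # each entry is 1/128 = 0.0078125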

    softmax_forward(mx.nd.ones((128, LARGE_X)), np.full((128, LARGE_X), 0.0078125))

Contributor: Why 128? Can we use predefined constants?

def test_softmax():
    def softmax_forward(input_data, true_output):

Contributor: If this is called once, do we need this subroutine?

Contributor (Author): ok

access2rohit force-pushed the softmax branch 3 times, most recently from 355049f to 03aa04d (May 29, 2019)
def test_softmax():
    input_data = mx.nd.ones((SMALL_Y, LARGE_X))
    true_output = np.full((SMALL_Y, LARGE_X), (1 / SMALL_Y))
    output = nd.softmax(input_data, axis=0)
Contributor: When you test this way, the softmax is operating on only one dimension, which does not exceed 2^32, right?

Contributor (Author): Yes.

Contributor: But shouldn't we test softmax over a vector larger than 2^32?

Contributor (Author): The test fails.

Contributor: Fine, let's leave that as a TODO for the next phase: supporting large integer sizes in a single dimension.
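
To make the distinction in this thread concrete: the test's array exceeds 2^32 elements in total, which is what exercises 64-bit indexing, but softmax(axis=0) reduces over only SMALL_Y values per column, so no single reduction axis comes close to 2^32. A sketch of the arithmetic, assuming the constant values from the large-tensor test file (they are not stated in this thread):

    # Assumed values of the test-file constants (illustrative).
    SMALL_Y = 50
    LARGE_X = 100_000_000

    print(SMALL_Y * LARGE_X > 2**32)  # True: total element count needs 64-bit indexing
    print(LARGE_X > 2**32)            # False: the longest single axis stays below 2^32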

apeforest (Contributor) left a review:
LGTM
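
For reference, a self-contained sketch of the test as it reads after review. The function body is taken from the diff above; the constant values and the closing assertion are assumptions, the latter mirroring the earlier revision of the helper. Allocating a SMALL_Y x LARGE_X float array takes tens of gigabytes, which is why, per the checklist, tests like this belong in the nightly suite rather than the unit tests.

    import numpy as np
    import mxnet as mx
    from mxnet import nd
    from mxnet.test_utils import assert_almost_equal

    # Assumed values of the test-file constants.
    LARGE_X = 100_000_000
    SMALL_Y = 50

    def test_softmax():
        input_data = mx.nd.ones((SMALL_Y, LARGE_X))
        true_output = np.full((SMALL_Y, LARGE_X), (1 / SMALL_Y))
        output = nd.softmax(input_data, axis=0)
        # Assumed closing assertion, mirroring the earlier revision.
        assert_almost_equal(output.asnumpy(), true_output, rtol=1e-5, atol=1e-5)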

access2rohit (Contributor, Author):
@mxnet-label-bot add [pr-awaiting-merge]

marcoabreu added the pr-awaiting-merge label ("Review and CI is complete. Ready to Merge") on May 31, 2019
access2rohit force-pushed the softmax branch 4 times, most recently from 15301a2 to f523f73 (June 2, 2019)
apeforest merged commit a37cd7a into apache:master on Jun 4, 2019
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
Labels: pr-awaiting-merge (Review and CI is complete. Ready to Merge), pr-awaiting-review (PR is waiting for code review)