This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Fix infer shape partial after unknown shape changed to -1 #14869

Merged
merged 17 commits into from
May 21, 2019

roywei
Member

@roywei roywei commented May 3, 2019

Description

Fixes #14833.
Since MXNet changed its representation of an unknown shape from 0 to -1, the shape inference logic of some operators can be wrong.
MXNet's unit tests do not cover this corner case, but Keras-MXNet relies heavily on partial shape inference, so many of its unit tests are failing.

  1. Changed CHECK_GE and CHECK_LE to compare ndim() against a signed int, because the LHS can now be -1.

  2. For some operators, when a tensor shape is entirely unknown, we should return directly. That check needs ndim_is_known rather than shape_is_known, because shape inference can still continue when the tensor shape is only partially unknown. Using shape_is_known causes a direct return unless the tensor shape is fully known.

  3. For binary ops, if the LHS and RHS both have fully unknown shapes, we should return directly. The dot op was missing this logic.

  4. Added unit tests.
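
Items 1 and 2 can be illustrated in plain Python. This is only a sketch, not MXNet code: the helper names mirror the C++ helpers mentioned in this PR, `None` is used here as a stand-in for an unknown ndim, and `as_uint32` is a hypothetical helper that imitates what happens when a signed -1 is compared against an unsigned value in C++.

```python
# Illustrative stand-ins only; these are NOT the MXNet implementations.
# A shape is a tuple of ints, where -1 marks an unknown dimension size.
# None stands in for a shape whose ndim is unknown (an assumption of this sketch).


def ndim_is_known(shape):
    """True when the number of dimensions is known."""
    return shape is not None


def shape_is_known(shape):
    """True only when ndim and every dimension size are known."""
    return ndim_is_known(shape) and all(dim != -1 for dim in shape)


def as_uint32(value):
    """Imitate converting a signed 32-bit int to an unsigned one."""
    return value % 2**32


partially_known = (-1, 3)   # ndim known, first dimension unknown
fully_unknown = None        # nothing known yet

# Item 2: a partially known shape should not trigger an early return.
print(ndim_is_known(partially_known), shape_is_known(partially_known))  # True False
print(ndim_is_known(fully_unknown), shape_is_known(fully_unknown))      # False False

# Item 1: -1 compared as unsigned looks like a huge positive number,
# so a ">= 1" check would pass when it should fail.
print(as_uint32(-1) >= 1)  # True
print(-1 >= 1)             # False
```

An early return guarded by `shape_is_known` would fire for `(-1, 3)` even though the operator could still infer part of its output shape; guarding on `ndim_is_known` fires only for the fully unknown case.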

This fixes failing keras-mxnet unit tests.

Thanks to @reminisce for helping me debug and figure out the changes needed!

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@anirudhacharya
Member

@mxnet-label-bot add [pr-work-in-progress]

@marcoabreu marcoabreu added the pr-work-in-progress PR is still work in progress label May 3, 2019
@roywei roywei changed the title [WIP][Do Not Merge]Fix infer shape partial after unknown shape changed to -1 Fix infer shape partial after unknown shape changed to -1 May 6, 2019
@roywei roywei force-pushed the fix_infer_shape branch from 5502fa0 to acaa440 Compare May 10, 2019 12:49
@reminisce reminisce self-requested a review May 11, 2019 04:05
@roywei roywei force-pushed the fix_infer_shape branch from 52f35a6 to 03136b9 Compare May 15, 2019 16:48
@roywei
Member Author

roywei commented May 15, 2019

@reminisce @haojin2 please help take a look, thanks!

@roywei
Member Author

roywei commented May 15, 2019

@mxnet-label-bot update[pr-awaiting-review]

@marcoabreu marcoabreu added pr-awaiting-review PR is waiting for code review and removed pr-work-in-progress PR is still work in progress labels May 15, 2019
@haojin2
Contributor

haojin2 commented May 16, 2019

@reminisce Any more comments on this?

@@ -1207,6 +1207,14 @@ inline bool DotShape(const nnvm::NodeAttrs& attrs,
CHECK_EQ(out_attrs->size(), 1U);
mxnet::TShape& lshape = (*in_attrs)[0];
mxnet::TShape& rshape = (*in_attrs)[1];
// check if lhs ndim is larger than 1 and last dim is known
if (lshape.ndim() < 1 || !dim_size_is_known(lshape, lshape.ndim() - 1)) {
Contributor

ndim=0 is a valid case representing scalar tensors. This line should be replaced by the following.

if (!ndim_is_known(lshape) || !ndim_is_known(rshape)) return false;
CHECK_GT(lshape.ndim(), 0) << "scalar tensor is not supported by this operator.";
CHECK_GT(rshape.ndim(), 0) << "scalar tensor is not supported by this operator.";

return false;
}
// check if rhs ndim is larger than 1 and first dim is known
if (rshape.ndim() < 1 || !dim_size_is_known(rshape, 0)) {
Contributor

Remove this.
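
Read together, the two review comments above amount to: wait when either ndim is unknown, reject scalar inputs, and otherwise proceed even if some dimension sizes are unknown. A minimal plain-Python sketch of that control flow follows. It is not the MXNet implementation: `infer_dot_shape` is a hypothetical name, `None` stands in for an unknown ndim, and transpose flags and any special handling of 1-D inputs are not modelled.

```python
# Sketch only; assumptions: None = unknown ndim, -1 = unknown dimension size.


def infer_dot_shape(lshape, rshape):
    """Return the inferred output shape, or None if inference must wait."""
    # Nothing can be inferred until both ndims are known.
    if lshape is None or rshape is None:
        return None
    # ndim == 0 is a valid scalar shape, but this operator rejects it.
    if len(lshape) == 0 or len(rshape) == 0:
        raise ValueError("scalar tensor is not supported by this operator.")
    # The contracted dimensions must agree when both are known.
    l_inner, r_inner = lshape[-1], rshape[0]
    if l_inner != -1 and r_inner != -1 and l_inner != r_inner:
        raise ValueError("inner dimensions do not match")
    # Unknown sizes (-1) are simply carried through to the output.
    return tuple(lshape[:-1]) + tuple(rshape[1:])


print(infer_dot_shape((-1, 4), (4, 3)))  # (-1, 3)
print(infer_dot_shape(None, (4, 3)))     # None
```

Returning `None` rather than raising in the unknown-ndim case mirrors the `return false` in the suggested code: the inference pass can retry once more information is available.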

@@ -8369,6 +8369,89 @@ def test_add_n():
assert_almost_equal(rslt.asnumpy(), add_n_rslt.asnumpy(), atol=1e-5)


def test_dot_partial_shape():
Contributor

Please move shape inference tests to test_infer_shape.py.

@roywei roywei force-pushed the fix_infer_shape branch from 2263a15 to 6e241ec Compare May 20, 2019 23:13
assert result == [(-1, 3)]


def test_where_partial_shape():
Contributor

I think you forgot to call these test functions in __main__.

Member Author

done.

@roywei roywei force-pushed the fix_infer_shape branch from 25a9da0 to c9c2abf Compare May 21, 2019 16:22
@reminisce reminisce merged commit 5854b98 into apache:master May 21, 2019
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
* change check and shape_is_known

* rever some changes

* revert

* revert

* revert

* add test

* add more tests

* update test dot

* fix test

* update reduce axes

* fix lint

* update check

* fix lint

* address comments

* remove invalid test case

* run all tests

* update test case
Labels
pr-awaiting-review PR is waiting for code review
Development

Successfully merging this pull request may close these issues.

[Numpy][Keras-MXNet] Infer shape partial failed for unknown shapes
5 participants