Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Revert the change on broadcast_to param shape #14998

Merged

Conversation

reminisce
Copy link
Contributor

Description

This PR reverts the mistaken change on broadcast_to's param shape introduced by #14661. There was no unit test for the param shape with 0 as dimension size so this error was not caught. Such case has been added in unit tests to ensure correctness.

Thank @DickJC123 for reporting this in #14954.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@reminisce reminisce added the Bug label May 20, 2019
@reminisce reminisce changed the title Revert the change broadcast_to param shape Revert the change on broadcast_to param shape May 20, 2019
@reminisce reminisce requested a review from eric-haibin-lin May 20, 2019 21:24
@reminisce
Copy link
Contributor Author

@DickJC123 Could you please give a review? Thanks.

@DickJC123
Copy link
Contributor

Trying to pop my to-do stack. Will look at this by EOD tomorrow.

@DickJC123
Copy link
Contributor

Thank you for your patience as I dealt with other MXNet fires. I still feel I want to think about the Shape serialization to files. But in the meantime, let me ask about the issue that started it all, and that you're addressing with this PR:

Isn't the real problem that the Python output shape parameter of broadcast_to() isn't being converted correctly by the parameter parser? In the default mode (so not numpy compatibility), a shape dimension size of 0 should be converted to -1 in the backend, which is what the broadcast_to() infer shape routine is already (i.e. correctly) looking for? In numpy compatibility mode, the user should use -1 for the output shape in Python to denote unknown.

@reminisce
Copy link
Contributor Author

@DickJC123 Thanks for the review. For this specific problem, it actually has nothing to do with numpy compatibility mode. The document says that 0 just means the output dimension size is the same as the corresponding input dimension size; it's not related to any judgement on whether the shape dimension size should be 0 or unknown. So per document, 0 should never be changed to -1 in whatever situation, but it was changed by a mistake. This is the same as MXNet reshape operator where the shape parameter has 0, -1, -2, -3, and -4 for different shape inference maneuvers.

@omarroberts40
Copy link

omarroberts40 commented May 24, 2019 via email

@fhieber
Copy link
Contributor

fhieber commented May 24, 2019

I would appreciate a timely merge of this fix. Thank you!

@DickJC123
Copy link
Contributor

I'm close to giving this the thumbs up (see the end of my comments). I think a PR relating to Shape serialization (as we have discussed) should be separate from this one because of the possible extended discussion, given the 1.5 release deadline and @fhieber's eagerness for this fix.

On this PR, let me try one more time with my argument. Consider numpy broadcast_to() behavior:

Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> x = np.array([1,2,3])
>>> np.broadcast_to(x, (3,3))
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])
>>> np.broadcast_to(x, (0,3))
array([], shape=(0, 3), dtype=int64)
>>> 

If we say that MXNet's broadcast_to() uses 0 to mean 'match input', regardless of compatibility mode, then we could never duplicate numpy's behavior. However, if we think of 0 to mean 'unknown' in non-compatibility mode, and we say that 'unknown' really means 'unknown until inputs are bound, then match input size', we'd retain numpy functionality:

In numpy compatibility mode, MXNet could broadcast to a shape with a zero size as in:
mx.sym.broadcast_to(x, (0,3)) and boadcast_to() an input-matching shape as in mx.sym.broadcast_to(x, (-1,3)) [this being an extension to numpy capability].

And speaking of which, is there a reason you chose -1 for unknown size in numpy compatibility mode? Why not None as in (None, 3) for the example above? If you chose None, then even MXNet reshape functionality could be considered a compatible superset of Numpy's (in compatibility mode):

Numpy reshape:
0 = 0 size
-1 = calculate dim so total elements match

MXNet not in numpy-compatible mode:
0 = output size matches input [in contrast to numpy definition]
-1 = calculate dim so total elements match
-2 -> -4 other MXNet-specific definitions

MXNet numpy-compatible mode:
None = unknown size (until binding), then output size matches input
0 = 0 size [like numpy]
-1 = calculate dim so total elements match [like numpy]
-2 -> -4 other MXNet-specific definitions

Having typed this up, I see one problem- what value holds 'None' size in the backend, if -1 might have a meaning as it does in reshape? It could be defined to have a non-conflicting value as in enum ShapeSizeConstants { kUnknownSize = MIN_INT } and tested via:

inline bool dim_size_is_known(const dim_t dim_size) {  return dim_size != kUnknownSize; }

Anyway thanks for reading this far. Unless this really resonates with you, I'll leave it by saying I support this PR as it stands- this corrects many of the Sockeye test failures I was seeing (the rest related to Shape serialization, not broadcast_to()).

@reminisce
Copy link
Contributor Author

@DickJC123 Thanks for the deep analysis. I can see the confusion of this numpy compatibility flag. We decided to change it to another name called np_shape in this PR: #15063

Please know that this flag is designed for only affecting the incoming MXNet-NumPy operators which are not in the master branch for now. Legacy operators are NOT expected to comply with NumPy when this flag is set True. We are not going to announce any numpy related design/implementation in 1.5 release notes.

@DickJC123
Copy link
Contributor

Thanks for your explanation. Again, this PR LGTM.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants