Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNet-1211] Factor and "Like" modes in BilinearResize2D operator #13226

Merged
merged 9 commits into from
May 5, 2019
Merged

[MXNet-1211] Factor and "Like" modes in BilinearResize2D operator #13226

merged 9 commits into from
May 5, 2019

Conversation

lobanov-m
Copy link
Contributor

@lobanov-m lobanov-m commented Nov 12, 2018

Description

This PR adds few options to BilinearResize2D operator. This options are needed for Fully-Convolutional realization of nets, that uses bilinear resize. The added modes are "odd_scale", "like" and few options to change height and width of tensor to even or odd: "to_even_down", "to_even_up", "to_odd_down", "to_odd_up".
Such transformations couldn't be done in other ways because when we use symbolic models we can't get tensors shapes from Python or any other frontend.
With these mods it is is possible to realize segmentation network ICNet that could work with input of any size.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • For user-facing API changes, API doc string has been updated.
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • "Odd-scale" is for getting odd height and width, for example for pooling, which is recommended by authors of DeepLab and ICNet. So if we have tensor with shape (n, c, 65, 33) we can resize it to (n, c, 33, 17) with parameters (scale_height=0.5, scale_width=0.5, mode='odd_scale') and back with parameters (scale_height=2, scale_width=2, mode='odd_scale')
  • "Like" is also useful mode which could be used in decoders of segmentation networks. It resizes first input to the shape of second input.
  • "to_even_..." and "to_odd_..." modes changes height and width to even and odd value respectively adding or subtracting 1 from h and w if required.
  • Changes in visualization.py are needed for normal graph visualization when using mode "Like". In this mode there are two input nodes, but actually one of them is used only for getting desired shape. Information about it in graph isn't important but graph becomes too complicated. So this change just deletes dependence from "like"-node.

Comments

  • The changes are almost backward-compatible: since we added mode "like", the operator takes two inputs, so in other modes second positional argument in Python (or other) should be None, and height and width became third and forth positional arguments respectively. So, some code, which used height and width as positional argument will not work.

@lobanov-m lobanov-m changed the title Bilinear resize scale like Factor and "Like" modes in BilinearResize2D operator Nov 12, 2018
@anirudhacharya
Copy link
Member

@mxnet-label-bot add [pr-awaiting-review]

@marcoabreu marcoabreu added the pr-awaiting-review PR is waiting for code review label Nov 13, 2018
@lobanov-m lobanov-m changed the title Factor and "Like" modes in BilinearResize2D operator [MXNet-1211] Factor and "Like" modes in BilinearResize2D operator Nov 13, 2018
@stu1130
Copy link
Contributor

stu1130 commented Nov 21, 2018

@apeforest @samskalicky @yuxihu @ChaiBapchya could you please take a look at this?

Copy link
Member

@anirudhacharya anirudhacharya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add tests for this operator?

.describe("output height (required)");
DMLC_DECLARE_FIELD(width).set_range(1, 10000)
.describe("output width (required)");
DMLC_DECLARE_FIELD(height).set_range(0, 10000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change the range from 1 to 0. How does 0 size make sense for resizing?

Copy link
Contributor Author

@lobanov-m lobanov-m Nov 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is required for using "scale"-mode. F.e. you have tensor with shape (1, 3, 100, 100), so you can do BilinearResize2D(tensor, height=0.5, width=0.5, mode='scale') and get tensor of shape (1, 3, 50, 50)

DMLC_DECLARE_FIELD(width).set_range(1, 10000)
.describe("output width (required)");
DMLC_DECLARE_FIELD(height).set_range(0, 10000)
.set_default(-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

height and width are currently required parameters. Why set a default value for them? Also how does setting default=-1 make sense when the range for the field is (0,1000)

Copy link
Contributor Author

@lobanov-m lobanov-m Nov 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not required parameter, when using mode 'like'. In this mode we can pass two tensors and the first tensor would be resized to second tensor's size. But for other modes it is still required, so default value is inappropriate and should raise error.

.describe(R"(output height if mode is "size" or scale on which input height should be multiplied if mode is
"scale" or "odd_scale")");
DMLC_DECLARE_FIELD(width).set_range(0, 10000)
.set_default(-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

.add_enum("like", bilinear_resize::like)
.add_enum("to_even", bilinear_resize::to_even)
.set_default(bilinear_resize::size)
.describe(R"(resizing mode. "size" - resize to distinct size; "scale" - original height and width are multiplied
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in scale mode what is the "appropriate scale"? I dont see that defined in the params.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think any. If we have original tensor with shape (1, 3, 10000, 10000) and resize it with height=0.001 and width=0.001 in "scale" mode we will get tensor with shape (1, 3, 10, 10). And back we can resize the result tensor in scale mode with height=100 and width=100 and get tensor of original size. I'll try to clarify this in description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably I used wrong term. Maybe it would be better if I change "scale" to "factor".

.set_default(bilinear_resize::size)
.describe(R"(resizing mode. "size" - resize to distinct size; "scale" - original height and width are multiplied
by appropriate scale; "odd_scale" - the same, but result height and width are odd value
(if h%2==0 then h=h+1 and the same for width); "like" - resize first input to the height
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be good to clarify what h is here? original image height or the height parameter in the parameter list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right, I'll do this, thank you.

CHECK_EQ(inputs.size(), 1);
bool modeLike = param.mode == bilinear_resize::like;
size_t expected = modeLike ? 2 : 1;
CHECK_EQ(outputs.size(), expected);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so there will be 2 outputs in like mode? what is the second output? if it does have two outputs, it should be specified in the param description.

Copy link
Contributor Author

@lobanov-m lobanov-m Nov 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two outputs of the backward function. The operator still has one output. In "like" mod we pass two input tensors to resize one to the size of second, so the backward function should return gradients to both tensors. Actually the second tensor, from which we get result size, should get zero gradients from the output of this operator, because it is needed only to get it's shape. It is realized in cc and cu files.

@lobanov-m
Copy link
Contributor Author

I'm realy sorry for force push, I get confused with rebase and commited all master to this branch, after that tried to roll back and finally erased my first commits. :(

@@ -6527,12 +6527,64 @@ def check_bilinear_resize_op(shape, height, width):
x = mx.nd.random.uniform(shape=shape)
y = mx.nd.contrib.BilinearResize2D(x, height=height, width=width)
assert_almost_equal(y.asnumpy(), py_bilinear_resize(x.asnumpy(), height, width))
def check_bilinear_resize_modes_op(shape, height_factor=None, width_factor=None, shape_1=None, mode=None):
x = mx.nd.random.uniform(shape=shape)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unit tests for operators should cover forward pass and backward pass. See here - https://mxnet.incubator.apache.org/faq/add_op_in_backend.html#unit-test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done :)

Copy link
Contributor

@ChaiBapchya ChaiBapchya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it is failing on 2 different validation CI checks
1 looks like uncovered a flaky test
other has build failure which needs your attention. Please review and make requisite changes. Thanks.

@lobanov-m
Copy link
Contributor Author

Thank you, @ChaiBapchya.
As I understood the build has failed because of the timeout.
Branches [GPU: MKLDNN ] and [GPU: MKLDNN_CUDNNOFF] has failed.
In pipeline there are lines:

[GPU: MKLDNN] Running on mxnetlinux-cpu_x4398dm2a1 in /home/jenkins_slave/workspace/alidation_unix-gpu_PR-13226-7WOHJMMJA2UQ45SCSTCQK6UH3VENFWCLRGEEC4X3CRC25ELVL4IQ   
[Pipeline] [GPU: MKLDNN] {    
[Pipeline] [GPU: MKLDNN] ws    
[GPU: MKLDNN] Running in /home/jenkins_slave/workspace/build-mkldnn-gpu    
[Pipeline] [GPU: MKLDNN] {    
[Pipeline] [GPU: MKLDNN] timeout    
[GPU: MKLDNN] Timeout set to expire in 2 hr 0 min    
[Pipeline] [GPU: MKLDNN] {

And in the log file of build there is no error and it starts at 19:30 and fails at 21:30:

[build-mkldnn-gpu] Running shell script
+ ci/build.py --docker-registry mxnetci --platform ubuntu_build_cuda --docker-build-retries 3 --shm-size 500m /work/runtime_functions.sh build_ubuntu_gpu_mkldnn
build.py: 2018-11-30 19:30:49,272Z INFO MXNet container based build tool.
...
Sending interrupt signal to process
build.py: 2018-11-30 21:30:29,503Z WARNING Signal 15 received, cleaning up...
build.py: 2018-11-30 21:30:29,503Z WARNING Cleaning up containers
Terminated
script returned exit code 143

The same for [GPU: MKLDNN_CUDNNOFF]. No error in log, build script lasts for 1h 59m 55s and receives interrupt signal. Jenkins build description.

What should I do with this problem?

@azai91
Copy link
Contributor

azai91 commented Dec 4, 2018

@lobanov-m this was around the time we were changing our CI. can you retrigger the build (push an empty commit).

@lobanov-m
Copy link
Contributor Author

Last mentioned problem with pull request have been resolved. Is anything else should be changed?

@sandeep-krishnamurthy
Copy link
Contributor

@anirudhacharya @azai91 - ping for review :-)

Copy link
Member

@anirudhacharya anirudhacharya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sandeep-krishnamurthy
Copy link
Contributor

@apeforest - Can you please help review this PR?
@lobanov-m - Can you please rebase? Thanks for your contributions

@vandanavk
Copy link
Contributor

@mxnet-label-bot update [pr-awaiting-response]

@marcoabreu marcoabreu added pr-awaiting-response PR is reviewed and waiting for contributor to respond and removed pr-awaiting-review PR is waiting for code review labels Feb 5, 2019
@lobanov-m
Copy link
Contributor Author

Rebase done.

@lobanov-m
Copy link
Contributor Author

@mxnet-label-bot update [pr-awaiting-review]

@marcoabreu marcoabreu added pr-awaiting-review PR is waiting for code review and removed pr-awaiting-response PR is reviewed and waiting for contributor to respond labels Feb 19, 2019
@anirudhacharya
Copy link
Member

@mxnet-label-bot add [pr-awaiting-merge]

@lobanov-m please rebase your PR to resolve conflicts.

@lobanov-m
Copy link
Contributor Author

@mxnet-label-bot add[pr-awaiting-review]

@marcoabreu marcoabreu added the pr-awaiting-review PR is waiting for code review label Mar 25, 2019
@lobanov-m
Copy link
Contributor Author

@mxnet-label-bot remove[pr-awaiting-response]

@marcoabreu marcoabreu removed the pr-awaiting-response PR is reviewed and waiting for contributor to respond label Mar 25, 2019
@lobanov-m
Copy link
Contributor Author

@pinaraws rebase done

@lobanov-m
Copy link
Contributor Author

Built failed because of

test_model.R:129: error: Fine-tune
cannot open URL 'http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-0126.params'
1: GetInception() at R-package/tests/testthat/test_model.R:129

and the second error:

--2019-03-25 08:58:30--  (try: 2)  http://data.dmlc.ml/mxnet/models/imagenet/inception-bn.tar.gz
Connecting to data.dmlc.ml (data.dmlc.ml)|54.208.175.7|:80... failed: Connection timed out.
Retrying.

@lobanov-m
Copy link
Contributor Author

lobanov-m commented Mar 26, 2019

Error in build doesn't connected with pull request:

Rscript -e "library(devtools); library(methods); options(repos=c(CRAN='https://cloud.r-project.org/')); install_deps(pkg='R-package', dependencies = TRUE)"
Error in library(devtools) : there is no package called 'devtools'

@piyushghai
Copy link
Contributor

@anirudhacharya Are all your concerns addressed on this PR ?
Is this good to merge ?

@anirudhacharya
Copy link
Member

@piyushghai yes, my concerns are addressed, please see here - #13226 (review)

@piyushghai
Copy link
Contributor

@ChaiBapchya Are your concerns also addressed on this PR ?

@Roshrini
Copy link
Member

@lobanov-m Can you please resolve conflicts?

Also added tests and some fixes to visualization needed due to added modes.
@lobanov-m
Copy link
Contributor Author

@Roshrini rebase done.

@wkcn wkcn removed the pr-awaiting-review PR is waiting for code review label Apr 28, 2019
Copy link
Member

@wkcn wkcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All concerns have been addresses.
LGTM.
Thanks @lobanov-m for the excellent work!
I will merge it after the PR passes.

@wkcn wkcn merged commit b30949f into apache:master May 5, 2019
@wkcn
Copy link
Member

wkcn commented May 5, 2019

Glad to merge it. Thank you!

@lobanov-m
Copy link
Contributor Author

It's good news. Thank you!

access2rohit pushed a commit to access2rohit/incubator-mxnet that referenced this pull request May 14, 2019
…ache#13226)

* Added "factor" and "like" modes into BilinearResize2D operator.
Also added tests and some fixes to visualization needed due to added modes.

* Lint fix

* Test fix

* retrigger CI

* retrigger CI

* Retrigger CI

* retrigger CI

* retrigger ci

* retrigger ci again
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
…ache#13226)

* Added "factor" and "like" modes into BilinearResize2D operator.
Also added tests and some fixes to visualization needed due to added modes.

* Lint fix

* Test fix

* retrigger CI

* retrigger CI

* Retrigger CI

* retrigger CI

* retrigger ci

* retrigger ci again
@lobanov-m lobanov-m deleted the bilinear-resize-scale-like branch August 14, 2019 14:17
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-merge Review and CI is complete. Ready to Merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.