This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[MXNET-1255] update hybridize documentation #13597
Merged
Merged
Changes from 4 commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -1,6 +1,9 @@ | ||||||
# Hybrid - Faster training and easy deployment | ||||||
|
||||||
*Note: a newer version is available [here](http://gluon.mxnet.io/chapter07_distributed-learning/hybridize.html).* | ||||||
*Related Contents:* | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
* [Fast, portable neural networks with Gluon HybridBlocks](https://gluon.mxnet.io/chapter07_distributed-learning/hybridize.html) | ||||||
* [A Hybrid of Imperative and Symbolic Programming | ||||||
](http://en.diveintodeeplearning.org/chapter_computational-performance/hybridize.html) | ||||||
|
||||||
Deep learning frameworks can be roughly divided into two categories: declarative | ||||||
and imperative. With declarative frameworks (including Tensorflow, Theano, etc) | ||||||
|
@@ -137,4 +140,106 @@ to gluon with `SymbolBlock`: | |||||
net2 = gluon.SymbolBlock.imports('model-symbol.json', ['data'], 'model-0001.params') | ||||||
``` | ||||||
|
||||||
<!-- INSERT SOURCE DOWNLOAD BUTTONS --> | ||||||
## Operators that do not work with hybridize | ||||||
|
||||||
If you want to hybridize your model, you must use `F.some_operator` in your 'hybrid_forward' function. | ||||||
`F` will be `mxnet.nd` before you hybridize and `mxnet.sym` after hybridize. While most APIs are the same in NDArray and Symbol, there are some differences. Writing `F.some_operator` and call `hybridize` may not work all of the time. | ||||||
Here we list some frequently used NDArray APIs that can't be hybridized and provide you the work arounds. | ||||||
|
||||||
### Element-wise Operators | ||||||
|
||||||
In NDArray APIs, the following arithmetic and comparison APIs are automatically broadcasted if the input NDArrays have different shapes. | ||||||
However, that's not the case in Symbol API. It's not automatically broadcasted, and you have to manually specify to use another set of broadcast operators for Symbols expected to have different shapes. | ||||||
|
||||||
|
||||||
| NDArray APIs | Description | | ||||||
aaronmarkham marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
|---|---| | ||||||
| [*NDArray.\__add\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__add__) | x.\__add\__(y) <=> x+y <=> mx.nd.add(x, y) | | ||||||
aaronmarkham marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
| [*NDArray.\__sub\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__sub__) | x.\__sub\__(y) <=> x-y <=> mx.nd.subtract(x, y) | | ||||||
| [*NDArray.\__mul\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__mul__) | x.\__mul\__(y) <=> x*y <=> mx.nd.multiply(x, y) | | ||||||
| [*NDArray.\__div\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__div__) | x.\__div\__(y) <=> x/y <=> mx.nd.divide(x, y) | | ||||||
| [*NDArray.\__mod\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__mod__) | x.\__mod\__(y) <=> x%y <=> mx.nd.modulo(x, y) | | ||||||
| [*NDArray.\__lt\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__lt__) | x.\__lt\__(y) <=> x<y <=> x mx.nd.lesser(x, y) | | ||||||
| [*NDArray.\__le\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__le__) | x.\__le\__(y) <=> x<=y <=> mx.nd.less_equal(x, y) | | ||||||
| [*NDArray.\__gt\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__gt__) | x.\__gt\__(y) <=> x>y <=> mx.nd.greater(x, y) | | ||||||
| [*NDArray.\__ge\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__ge__) | x.\__ge\__(y) <=> x>=y <=> mx.nd.greater_equal(x, y)| | ||||||
| [*NDArray.\__eq\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__eq__) | x.\__eq\__(y) <=> x==y <=> mx.nd.equal(x, y) | | ||||||
| [*NDArray.\__ne\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__ne__) | x.\__ne\__(y) <=> x!=y <=> mx.nd.not_equal(x, y) | | ||||||
|
||||||
The current workaround is to use corresponding broadcast operators for arithmetic and comparison to avoid potential hybridization failure when input shapes are different. | ||||||
|
||||||
| Symbol APIs | Description | | ||||||
|---|---| | ||||||
|[*broadcast_add*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_add) | Returns element-wise sum of the input arrays with broadcasting. | | ||||||
|[*broadcast_sub*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_sub) | Returns element-wise difference of the input arrays with broadcasting. | | ||||||
|[*broadcast_mul*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_mul) | Returns element-wise product of the input arrays with broadcasting. | | ||||||
|[*broadcast_div*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_div) | Returns element-wise division of the input arrays with broadcasting. | | ||||||
|[*broadcast_mod*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_mod) | Returns element-wise modulo of the input arrays with broadcasting. | | ||||||
|[*broadcast_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_equal) | Returns the result of element-wise *equal to* (==) comparison operation with broadcasting. | | ||||||
|[*broadcast_not_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_not_equal) | Returns the result of element-wise *not equal to* (!=) comparison operation with broadcasting. | | ||||||
|[*broadcast_greater*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_greater) | Returns the result of element-wise *greater than* (>) comparison operation with broadcasting. | | ||||||
|[*broadcast_greater_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_greater_equal) | Returns the result of element-wise *greater than or equal to* (>=) comparison operation with broadcasting. | | ||||||
|[*broadcast_lesser*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_lesser) | Returns the result of element-wise *lesser than* (<) comparison operation with broadcasting. | | ||||||
|[*broadcast_lesser_equal*](https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.broadcast_lesser_equal) | Returns the result of element-wise *lesser than or equal to* (<=) comparison operation with broadcasting. | | ||||||
|
||||||
For example, if you want to add a NDarray to your input x, use `broadcast_add` instead of `+`: | ||||||
|
||||||
```python | ||||||
def hybrid_forward(self, F, x): | ||||||
# avoid writing: return x + F.ones((1, 1)) | ||||||
return F.broadcast_add(x, F.ones((1, 1))) | ||||||
``` | ||||||
|
||||||
If you used `+`, it would still work before hybridization, but will throw an error of shape missmtach after hybridization. | ||||||
|
||||||
### Shape | ||||||
|
||||||
Gluon's imperative interface is very flexible and allows you to print the shape of the NDArray. However, Symbol does not have shape attributes. As a result, you need to avoid printing shapes in `hybrid_forward`. | ||||||
Otherwise, you will get the following error: | ||||||
```bash | ||||||
AttributeError: 'Symbol' object has no attribute 'shape' | ||||||
``` | ||||||
|
||||||
### Slice | ||||||
`[]` in NDArray is used to get a slice from the array. However, `[]` in Symbol is used to get an output from a grouped symbol. | ||||||
For example, you will get different results for the following method before and after hybridization. | ||||||
|
||||||
```python | ||||||
def hybrid_forward(self, F, x): | ||||||
return x[0] | ||||||
``` | ||||||
|
||||||
The current workaround is to explicitly call [`slice`](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.slice) or [`slice_axis`](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.slice_axis) operators in `hybrid_forward`. | ||||||
|
||||||
|
||||||
### Not implemented operators | ||||||
|
||||||
Some of the often used operators in NDArray are not implemented in Symbol, and will cause hybridization failure | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
|
||||||
#### NDArray.asnumpy | ||||||
Symbol does not support `asnumpy` function, you need to avoid calling `asnumpy` in `hybrid_forward`. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
|
||||||
#### Array creation APIs | ||||||
|
||||||
`mx.nd.array()` is used a lot, but Symbol does not have the `array` API. The current workaround is to use `F.ones`, `F.zeros`, or `F.full`, which exist in both the NDArray and Symbol APIs. | ||||||
|
||||||
#### In-Place Arithmetic Operators | ||||||
|
||||||
In-place arithmetic operators may be used in Gluon imperative mode, however if you expect to hybridize, you should write these operations explicitly instead. | ||||||
For example, avoid writing `x += y` and use `x = x + y`, otherwise you will get `NotImplementedError`. This applies to all the following operators: | ||||||
|
||||||
| NDArray in-place arithmetic operators | Description | | ||||||
|---|---| | ||||||
|[*NDArray.\__iadd\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__iadd__) | x.\__iadd\__(y) <=> x+=y | | ||||||
|[*NDArray.\__isub\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__isub__) | x.\__isub\__(y) <=> x-=y | | ||||||
|[*NDArray.\__imul\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__imul__) | x.\__imul\__(y) <=> x*=y | | ||||||
|[*NDArray.\__idiv\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__idiv__) | x.\__rdiv\__(y) <=> x/=y | | ||||||
|[*NDArray.\__imod\__*](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.NDArray.__imod__) | x.\__rmod\__(y) <=> x%=y | | ||||||
|
||||||
|
||||||
|
||||||
## Summary | ||||||
|
||||||
The recommended practice is to utilize the flexibility of imperative NDArray API during experimentation. Once you finalized your model, make necessary changes mentioned above so you can call `hybridize` function to improve performance. | ||||||
|
||||||
<!-- INSERT SOURCE DOWNLOAD BUTTONS --> |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Content instead of Contents (but don't restart CI just for that change).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a link to the new dive into deeplearning book so made it "contents"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a semantic thing I guess. There's probably a rule here but... usually you can say Content is plural already (as it refers to a mass of stuff) and you don't usually say "Related Contents". It's more common to say "Related Content" even if there's more than one item in the list.
It is weird because you do say "Table of Contents", not "Table of Content".
🤷♂️ English is weird. It's not a showstopper... it's fine whichever way you want to do it. (but do it my way). Jk