Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

fix ssd quantization script error #13843

Merged
merged 5 commits into from
Jan 15, 2019
Merged

Conversation

ciyongch
Copy link
Contributor

Description

This PR is to address an error of "symbol not found" during ssd quantization process.
Since there're some Op outputs are not in the format of {opname}_output, which results in no calibrated min/max for such variables in quantized param file, and cause the error.
Leave the calib_layer to None (which is default) is a safety way to do the calibration.

@pengzhao-intel @TaoLv @ZhennanQin @xinyu-intel @apeforest

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@ciyongch ciyongch requested a review from szha as a code owner January 11, 2019 02:15
@TaoLv TaoLv added Quantization Issues/Feature Requests related to Quantization Example labels Jan 11, 2019
@TaoLv
Copy link
Member

TaoLv commented Jan 11, 2019

Thank you for the quick fix @ciyongch . Do you mind pasting the error log here?

Copy link
Contributor

@xinyu-intel xinyu-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small changes

@@ -128,6 +127,8 @@ def save_params(fname, arg_params, aux_params, logger=None):
if exclude_first_conv:
excluded_sym_names += ['conv1_1']

excluded_sym_names += ['multibox_loc_pred', 'concat0', 'concat1']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add to line124

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done :)

@xinyu-intel
Copy link
Contributor

Would you mind also help me fix a typo in ssd readme?

data/
|---val.rec
|---val.lxt
|---val.idx
model/
|---ssd_vgg16_reduced_300-0000.params  <-----
|---ssd_vgg16_reduced_300-symbol.json

@ciyongch
Copy link
Contributor Author

@TaoLv, error log is as below:

$ python evaluate.py --cpu --num-batch 10 --batch-size 224 --deploy --prefix=./model/cqssd_
[17:17:21] src/io/iter_image_det_recordio.cc:283: ImageDetRecordIOParser: /home/ubuntu/mxnet/example/ssd/data/val.rec, use 32 threads for decoding..
[17:17:21] src/io/iter_image_det_recordio.cc:340: ImageDetRecordIOParser: /home/ubuntu/mxnet/example/ssd/data/val.rec, label padding width: 254
[17:17:23] src/operator/subgraph/mkldnn/mkldnn_conv_property.cc:138: Start to execute MKLDNN Convolution optimization pass.
Traceback (most recent call last):
  File "evaluate.py", line 108, in <module>
    voc07_metric=args.use_voc07_metric)
  File "/home/ubuntu/mxnet/example/ssd/evaluate/evaluate_net.py", line 107, in evaluate_net
    mod.set_params(args, auxs, allow_missing=False, force_init=True)
  File "/home/ubuntu/mxnet/python/mxnet/module/module.py", line 350, in set_params
    allow_extra=allow_extra)
  File "/home/ubuntu/mxnet/python/mxnet/module/module.py", line 309, in init_params
    _impl(desc, arr, arg_params)
  File "/home/ubuntu/mxnet/python/mxnet/module/module.py", line 300, in _impl
    raise RuntimeError("%s is not presented" % name)
RuntimeError: flatten10_0_max is not presented

Thank you for the quick fix @ciyongch . Do you mind pasting the error log here?

@ciyongch
Copy link
Contributor Author

no problem @xinyu-intel

@xinyu-intel
Copy link
Contributor

Thanks:)

label_names=(label_name,),
calib_quantize_op = True,
calib_quantize_op=True,
logger=logger)
sym_name = '%s-symbol.json' % ('./model/cqssd_vgg16_reduced_300')
param_name = '%s-%04d.params' % ('./model/cqssd_vgg16_reduced_300', epoch)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not remove 0000 in here instead of adding 0000 in the readme? Any special meaning for it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a convention of mxnet. We can save different parameter files at different epochs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. Agree the change makes sense :)

@@ -218,7 +218,7 @@ data/
|---val.lxt
|---val.idx
model/
|---ssd_vgg16_reduced_300.params
|---ssd_vgg16_reduced_300-0000.params
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add the note, user needs to rename, such as ssd-val-fc19a535.idx to val.idx.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another suggestion is to move/copy the SSD quantization instruction into ssd/README too.

Copy link
Member

@TaoLv TaoLv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@TaoLv
Copy link
Member

TaoLv commented Jan 13, 2019

@reminisce @ZhennanQin Please take a look.

Copy link
Contributor

@pengzhao-intel pengzhao-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for a hotfix.

Regarding other parts of changes I mentioned, I think it can be left to another PR.

@ciyongch
Copy link
Contributor Author

LGTM for a hotfix.

Regarding other parts of changes I mentioned, I think it can be left to another PR.

@pengzhao-intel Good suggestion to update the readme as you mentioned above, no problem to me to update the change in this PR:)

@ciyongch
Copy link
Contributor Author

@TaoLv @reminisce This PR is only a fix to SSD quantization script in example and the update to README, please help to merge if no other comments :)

Copy link
Member

@TaoLv TaoLv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments.

@@ -34,7 +34,7 @@ The following models have been tested on Linux systems.
|[Inception V3](#7)|[Gluon-CV](https://gluon-cv.mxnet.io/model_zoo/classification.html)|[Validation Dataset](http://data.mxnet.io/data/val_256_q90.rec)|76.49%/93.10% |76.38%/93% |
|[ResNet152-V2](#8)|[MXNet ModelZoo](http://data.mxnet.io/models/imagenet/resnet/152-layers/)|[Validation Dataset](http://data.mxnet.io/data/val_256_q90.rec)|76.76%/93.03%|76.48%/92.96%|
|[Inception-BN](#9)|[MXNet ModelZoo](http://data.mxnet.io/models/imagenet/inception-bn/)|[Validation Dataset](http://data.mxnet.io/data/val_256_q90.rec)|72.09%/90.60%|72.00%/90.53%|
| [SSD-VGG](#10) | [example/ssd](/~https://github.com/apache/incubator-mxnet/tree/master/example/ssd) | VOC2007/2012 | 0.83 mAP | 0.82 mAP |
| [SSD-VGG](#10) | [example/ssd](/~https://github.com/apache/incubator-mxnet/tree/master/example/ssd) | VOC2007/2012 | 0.8366 mAP | 0.8364 mAP |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSD-VGG16

@@ -210,40 +210,7 @@ python imagenet_inference.py --symbol-file=./model/imagenet1k-inception-bn-quant

<h3 id='10'>SSD-VGG</h3>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSD-VGG16

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no problem, any other description in current README need udpated?

@TaoLv TaoLv added the pr-awaiting-review PR is waiting for code review label Jan 14, 2019
@TaoLv
Copy link
Member

TaoLv commented Jan 15, 2019

Thank you for the fix @ciyongch. Merging it now.

@TaoLv TaoLv merged commit 4fe5461 into apache:master Jan 15, 2019
TaoLv pushed a commit to TaoLv/incubator-mxnet that referenced this pull request Jan 15, 2019
* fix ssd quantization script error

* update readme for ssd

* move quantized SSD instructions from quantization/README.md to ssd/README.md

* update ssd readme and accuracy

* update readme for SSD-vGG16
@TaoLv
Copy link
Member

TaoLv commented Jan 15, 2019

@lifu-wang You may be interested. We will have nightly build with this fix soon.

TaoLv added a commit that referenced this pull request Jan 16, 2019
* fix ssd quantization script error

* update readme for ssd

* move quantized SSD instructions from quantization/README.md to ssd/README.md

* update ssd readme and accuracy

* update readme for SSD-vGG16
stephenrawls pushed a commit to stephenrawls/incubator-mxnet that referenced this pull request Feb 16, 2019
* fix ssd quantization script error

* update readme for ssd

* move quantized SSD instructions from quantization/README.md to ssd/README.md

* update ssd readme and accuracy

* update readme for SSD-vGG16
lanking520 pushed a commit to lanking520/incubator-mxnet that referenced this pull request Feb 18, 2019
* fix ssd quantization script error

* update readme for ssd

* move quantized SSD instructions from quantization/README.md to ssd/README.md

* update ssd readme and accuracy

* update readme for SSD-vGG16
@ciyongch ciyongch deleted the udpate_ssd_script branch March 13, 2019 02:26
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
* fix ssd quantization script error

* update readme for ssd

* move quantized SSD instructions from quantization/README.md to ssd/README.md

* update ssd readme and accuracy

* update readme for SSD-vGG16
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Example pr-awaiting-review PR is waiting for code review Quantization Issues/Feature Requests related to Quantization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants