Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Enhance gpu quantization #14094

Merged
merged 12 commits into from
Mar 6, 2019
Merged

Enhance gpu quantization #14094

merged 12 commits into from
Mar 6, 2019

Conversation

jitMatrix
Copy link
Contributor

@jitMatrix jitMatrix commented Feb 8, 2019

Description

"Fixes #14092"

As #14092 mentioned, GPU only supports int8 quantization and does not support uint8. So, add an error message in quantize model function.

@reminisce

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@jitMatrix jitMatrix requested a review from szha as a code owner February 8, 2019 06:29
@@ -499,6 +499,9 @@ def quantize_model(sym, arg_params, aux_params,
if quantized_dtype not in ('int8', 'uint8'):
raise ValueError('unknown quantized_dtype %s received,'
' expected `int8` or `uint8`' % quantized_dtype)
if quantized_dtype == 'uint8' and ctx != cpu():
raise ValueError('currently gpu does not support uint8 quantization,'
' please set quantized_dtype to int8')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the something like below?
“Currently, uint8 quantization is only supported by CPU, please switch to the context of CPU or int8 data type for GPU"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, changed:)

Copy link
Contributor

@pengzhao-intel pengzhao-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks for the quick fix :)

@jitMatrix
Copy link
Contributor Author

@reminisce Can you help take a look at this? Thanks:)

@vandanavk
Copy link
Contributor

@mxnet-label-bot add [pr-awaiting-review, Quantization]

@marcoabreu marcoabreu added pr-awaiting-review PR is waiting for code review Quantization Issues/Feature Requests related to Quantization labels Feb 8, 2019
Copy link
Member

@anirudh2290 anirudh2290 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick fix!

@@ -499,6 +499,9 @@ def quantize_model(sym, arg_params, aux_params,
if quantized_dtype not in ('int8', 'uint8'):
raise ValueError('unknown quantized_dtype %s received,'
' expected `int8` or `uint8`' % quantized_dtype)
if quantized_dtype == 'uint8' and ctx != cpu():
raise ValueError('currently, uint8 quantization is only supported by CPU,'
' please switch to the context of CPU or int8 data type for GPU')
Copy link
Member

@anirudh2290 anirudh2290 Feb 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to add this error to backend like in the case for MKLDNN with int8 so that we dont have to add error handling to other frontends when we support quantization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently, only python frontend support quantization and in fact calibration progress will not use backend specific quantized operator. So I think it's good to add error message in this place currently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In QuantizeCompute (quantize-inl.h) you can check if std::is_same<xpu,gpu>::value and check for param.out_type and throw exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this modification can work since infer type error mxnet.base.MXNetError: [02:07:55] /home/ubuntu/experimentals/1.4_release/src/operator/quantization/../tensor/matrix_op-inl.h:250: Check failed: src.type_flag_ == ret.type_flag_ (3 vs. 5) will occur before QuantizeCompute and we cannot get the ctx information during infer stage. So I think it's good to interrupt this action during the calibration stage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isnt that called from the forward pass of quantized_conv ? The quantize forward pass should execute before this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add check src_type in quantized_conv.cu, please take a review again.

@ankkhedia
Copy link
Contributor

@rajeshii Thanks for the quick turnaround. Could you please look into comments by @anirudh2290

@@ -76,6 +76,9 @@ class QuantizedCuDNNConvOp {
if (param_.pad.ndim() == 0U) param_.pad = mshadow::Shape2(0, 0);
N = 0, H = 2, W = 3, C = 1;
src_type_ = mshadow::DataType<SrcType>::kCudnnFlag;
CHECK_EQ(src_type_, 5U)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the 5U here?

return
elif qdtype == 'uint8' and is_test_for_gpu():
print('skipped testing quantize_model for gpu uint8 since it is not supported yet')
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add else clause.

@vandanavk
Copy link
Contributor

@anirudh2290 @TaoLv is this PR good to go?

@vandanavk
Copy link
Contributor

@rajeshii Please add "Fixes #14092" to the PR description to automatically close the issue when this PR is merged.

@@ -110,6 +110,9 @@ class QuantizedCuDNNConvOp {
const TShape& fshape = filter.shape_;
const TShape& oshape = out.shape_;

CHECK_EQ(data.type_flag_, mshadow::kInt8)
<< "currently, uint8 quantization is only supported by CPU, "
"please switch to the context of CPU or int8 data type for GPU.";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add inside quantize-inl.h, this way it will return an error message even for networks without this op.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, added:)

Copy link
Member

@anirudh2290 anirudh2290 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

@vandanavk
Copy link
Contributor

@mxnet-label-bot update [Quantization, pr-awaiting-merge]

@marcoabreu marcoabreu added pr-awaiting-merge Review and CI is complete. Ready to Merge and removed pr-awaiting-review PR is waiting for code review labels Mar 5, 2019
@TaoLv TaoLv merged commit 49d7fc6 into apache:master Mar 6, 2019
vdantu pushed a commit to vdantu/incubator-mxnet that referenced this pull request Mar 31, 2019
* enhance gpu quantization

* fix test and improve error message

* add check srctype to quantized_conv.cu

* improve infer type

* fix lint

* add dtype check in quantize

* revert check in python level and quantized_conv

* Revert "add dtype check in quantize"

This reverts commit ab68668.

* add dtype check in quantize

* fix quantize test case
nswamy pushed a commit that referenced this pull request Apr 5, 2019
* enhance gpu quantization

* fix test and improve error message

* add check srctype to quantized_conv.cu

* improve infer type

* fix lint

* add dtype check in quantize

* revert check in python level and quantized_conv

* Revert "add dtype check in quantize"

This reverts commit ab68668.

* add dtype check in quantize

* fix quantize test case
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
* enhance gpu quantization

* fix test and improve error message

* add check srctype to quantized_conv.cu

* improve infer type

* fix lint

* add dtype check in quantize

* revert check in python level and quantized_conv

* Revert "add dtype check in quantize"

This reverts commit ab68668.

* add dtype check in quantize

* fix quantize test case
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-merge Review and CI is complete. Ready to Merge Quantization Issues/Feature Requests related to Quantization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve message when quantized_dtype uint8 is used with gpu context
9 participants