Conversation
python/mxnet/contrib/quantization.py
Outdated
```diff
@@ -499,6 +499,9 @@ def quantize_model(sym, arg_params, aux_params,
     if quantized_dtype not in ('int8', 'uint8'):
         raise ValueError('unknown quantized_dtype %s received,'
                          ' expected `int8` or `uint8`' % quantized_dtype)
+    if quantized_dtype == 'uint8' and ctx != cpu():
+        raise ValueError('currently gpu does not support uint8 quantization,'
+                         ' please set quantized_dtype to int8')
```
How about something like the following?

"Currently, uint8 quantization is only supported by CPU, please switch to the context of CPU or int8 data type for GPU"
okay, changed:)
LGTM
Thanks for the quick fix :)
@reminisce Can you help take a look at this? Thanks :)
@mxnet-label-bot add [pr-awaiting-review, Quantization]
Thanks for the quick fix!
python/mxnet/contrib/quantization.py
Outdated
```diff
@@ -499,6 +499,9 @@ def quantize_model(sym, arg_params, aux_params,
     if quantized_dtype not in ('int8', 'uint8'):
         raise ValueError('unknown quantized_dtype %s received,'
                          ' expected `int8` or `uint8`' % quantized_dtype)
+    if quantized_dtype == 'uint8' and ctx != cpu():
+        raise ValueError('currently, uint8 quantization is only supported by CPU,'
+                         ' please switch to the context of CPU or int8 data type for GPU')
```
Better to add this error to the backend, like in the case for MKLDNN with int8, so that we don't have to add error handling to other frontends when they support quantization.
Currently, only the Python frontend supports quantization, and in fact the calibration process will not use backend-specific quantized operators. So I think it's fine to add the error message here for now.
In QuantizeCompute (quantize-inl.h) you can check whether std::is_same<xpu, gpu>::value holds, check param.out_type, and throw an exception.
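A minimal sketch of that suggestion, for concreteness — the FCompute signature and the QuantizeParam lookup are standard MXNet plumbing assumed from context, not copied from this PR:

```cpp
// Sketch only: the proposed device/dtype guard at the top of QuantizeCompute
// in quantize-inl.h; the actual quantization implementation below it is elided.
template <typename xpu>
void QuantizeCompute(const nnvm::NodeAttrs& attrs, const OpContext& ctx,
                     const std::vector<TBlob>& inputs,
                     const std::vector<OpReqType>& req,
                     const std::vector<TBlob>& outputs) {
  const QuantizeParam& param = nnvm::get<QuantizeParam>(attrs.parsed);
  if (std::is_same<xpu, gpu>::value && param.out_type == mshadow::kUint8) {
    LOG(FATAL) << "currently, uint8 quantization is only supported by CPU, "
               << "please switch to the context of CPU or int8 data type for GPU";
  }
  // ... existing quantization kernel launch ...
}
```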
I don't think this modification can work, since the infer-type error

mxnet.base.MXNetError: [02:07:55] /home/ubuntu/experimentals/1.4_release/src/operator/quantization/../tensor/matrix_op-inl.h:250: Check failed: src.type_flag_ == ret.type_flag_ (3 vs. 5)

(3 is mshadow's kUint8 flag, 5 is kInt8) will occur before QuantizeCompute, and we cannot get the ctx information during the infer stage. So I think it's better to interrupt this action during the calibration stage.
Isn't that called from the forward pass of quantized_conv? The quantize forward pass should execute before this.
Added a check on src_type in quantized_conv.cu, please review again.
@rajeshii Thanks for the quick turnaround. Could you please look into the comments by @anirudh2290?
```diff
@@ -76,6 +76,9 @@ class QuantizedCuDNNConvOp {
     if (param_.pad.ndim() == 0U) param_.pad = mshadow::Shape2(0, 0);
     N = 0, H = 2, W = 3, C = 1;
     src_type_ = mshadow::DataType<SrcType>::kCudnnFlag;
+    CHECK_EQ(src_type_, 5U)
```
What's the 5U here?
```python
        return
    elif qdtype == 'uint8' and is_test_for_gpu():
        print('skipped testing quantize_model for gpu uint8 since it is not supported yet')
        return
```
Please add an else clause.
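Presumably something like the following shape — the first skip branch and the helper names are placeholders, since only the uint8/GPU branch appears in the diff above:

```python
def test_quantize_model(qdtype):
    if some_other_unsupported_combination(qdtype):   # placeholder for the elided first branch
        print('skipped testing quantize_model for this configuration')
        return
    elif qdtype == 'uint8' and is_test_for_gpu():
        print('skipped testing quantize_model for gpu uint8 since it is not supported yet')
        return
    else:
        run_quantize_model_checks(qdtype)            # placeholder for the actual test body
```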
@anirudh2290 @TaoLv Is this PR good to go?
```diff
@@ -110,6 +110,9 @@ class QuantizedCuDNNConvOp {
     const TShape& fshape = filter.shape_;
     const TShape& oshape = out.shape_;
 
+    CHECK_EQ(data.type_flag_, mshadow::kInt8)
+        << "currently, uint8 quantization is only supported by CPU, "
+           "please switch to the context of CPU or int8 data type for GPU.";
```
Can we add it inside quantize-inl.h? That way it will return an error message even for networks without this op.
ok, added:)
This reverts commit ab68668.
Thanks for the fix!
@mxnet-label-bot update [Quantization, pr-awaiting-merge]
* enhance gpu quantization
* fix test and improve error message
* add check srctype to quantized_conv.cu
* improve infer type
* fix lint
* add dtype check in quantize
* revert check in python level and quantized_conv
* Revert "add dtype check in quantize" (this reverts commit ab68668)
* add dtype check in quantize
* fix quantize test case
Description
"Fixes #14092"
As #14092 mentioned, GPU only supports int8 quantization, not uint8, so this PR adds an error message to the quantize_model function.
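For illustration, with the frontend check from the first revision of this PR (the check was later moved into the backend during review), a uint8 request on GPU fails fast — model setup is elided here:

```python
import mxnet as mx
from mxnet.contrib.quantization import quantize_model

# sym, arg_params, aux_params come from a trained FP32 model (elided)
quantize_model(sym, arg_params, aux_params,
               ctx=mx.gpu(0), quantized_dtype='uint8')
# ValueError: currently, uint8 quantization is only supported by CPU,
# please switch to the context of CPU or int8 data type for GPU
```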
@reminisce