
mkldnn quantized FC is slow #17705

Closed
eric-haibin-lin opened this issue Feb 27, 2020 · 4 comments

Comments

@eric-haibin-lin
Member

The int8-quantized BERT model is 2x slower than the float32 model.

Download trained SST params from: https://dist-bert.s3.amazonaws.com/demo/finetune/sst.params

Clone and install gluon-nlp v0.9.
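
For reference, a minimal setup sketch (assuming the v0.9.x branch of dmlc/gluon-nlp and that finetune_classifier.py lives under scripts/bert):

# clone gluon-nlp at v0.9 and install it in editable mode
git clone https://github.com/dmlc/gluon-nlp -b v0.9.x
cd gluon-nlp
pip install -e .
# the BERT finetuning/calibration script used below
cd scripts/bert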

# calibration
KMP_AFFINITY=granularity=fine,noduplicates,compact,1,0 OMP_NUM_THREADS=1 numactl --physcpubind=0 --membind=0 python3 finetune_classifier.py --task_name SST --only_calibration --model_parameters  sst.params
# fp32 inference
python3 finetune_classifier.py --task_name SST --epoch 1 --only_inference --model_parameters sst.params --round_to 128 --dev_batch_size 1
# int8 inference
python3 finetune_classifier.py --task_name SST --epoch 1 --only_inference --model_prefix ./output_dir/model_bert_SST_quantized_customize --deploy --round_to 128 --dev_batch_size 1

I'm running on a c5.12xlarge instance. I also tried setting OMP_NUM_THREADS=8, but int8 is still slower than float32.
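
One diagnostic that might help narrow down where the time goes is MKL-DNN's verbose mode, which prints one line per executed primitive with its data type and execution time. This is just a sketch; it assumes the MXNet build bundles MKL-DNN with verbose logging compiled in (the log prefix is mkldnn_verbose or dnnl_verbose depending on the bundled version):

# print each executed MKL-DNN primitive (kernel kind, int8/fp32, shapes, time in ms) during the int8 run
MKLDNN_VERBOSE=1 python3 finetune_classifier.py --task_name SST --epoch 1 --only_inference --model_prefix ./output_dir/model_bert_SST_quantized_customize --deploy --round_to 128 --dev_batch_size 1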

@pengzhao-intel
Contributor

@wuxun-zhang @ciyongch

@ciyongch
Contributor

Thanks for reporting this, I will take a look.

@ciyongch
Contributor

@eric-haibin-lin I just created PR #17707 to address this issue; please take a look.

@pengzhao-intel
Contributor

Feel free to reopen if the issue is not resolved.
