Add compile flag to choose data type for tensor size due to performance degradation #371
Conversation
@TaoLv @eric-haibin-lin @tqchen Please help to review.
LGTM
I also had an issue with tensor size types and overflow.
My understanding is that the change only affects the total size of the tensor, not the size of each dimension.
@TaoLv It affects the size of each dimension as well.
@apeforest Do you mean the size of each dimension will also be defined as index_t?
LGTM
@TaoLv Updated the gemm function signature with index_t. Please help to check whether this addresses your concern.
Thank you @apeforest. Do we have any large dot/GEMM unit tests in MXNet for this change?
@TaoLv There is one in test_operator.py: /~https://github.com/apache/incubator-mxnet/blob/e2f5b47346e148c2376da7e6628750747f2d6a94/tests/python/unittest/test_operator.py#L5668
@szha @eric-haibin-lin @tqchen Could you please help to review/merge this PR? Thanks!
@apeforest Looks like M/N/K in that case are really small (e.g. 2/2/3). I'm afraid that's not enough to test the changes in this PR.
@TaoLv The main purpose of this PR is not to support large tensors in the gemm engine. It is to fall back to int32 by default with a compilation flag. We need a more thorough inspection of the performance impact of using int64 before we turn the flag on again.
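For context, a minimal sketch of what such a flag-gated type selection could look like. The macro name MSHADOW_INT64_TENSOR_SIZE is used here only for illustration and may not match the PR exactly:

// Illustrative sketch only: pick the tensor index type at compile time.
// The flag name below is an assumption, not necessarily the one used in this PR.
#include <cstdint>

#if defined(MSHADOW_INT64_TENSOR_SIZE) && MSHADOW_INT64_TENSOR_SIZE == 1
typedef int64_t index_t;   // opt-in: allows tensors with more than INT32_MAX elements
#else
typedef int32_t index_t;   // default: keeps the original performance characteristics
#endif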
mshadow/dot_engine-inl.h
index_t m, index_t n, index_t k, float alpha,
const float *A, index_t lda,
const float *B, index_t ldb, float beta,
float *C, index_t ldc) {
cublasStatus_t err = cublasSgemm(Stream<gpu>::GetBlasHandle(stream), |
Do lda, ldb, ldc support int64 in cuBLAS?
cublasStatus_t cublasSgemm(cublasHandle_t handle,
cublasOperation_t transa, cublasOperation_t transb,
int m, int n, int k,
const float *alpha,
const float *A, int lda,
const float *B, int ldb,
const float *beta,
float *C, int ldc)
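As a side note on the question above, a hedged sketch of how a wrapper could guard the index_t-to-int narrowing before calling a 32-bit BLAS entry point (the helper name CheckedNarrow is hypothetical):

#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper: refuse to silently truncate a 64-bit extent when the
// underlying BLAS routine (e.g. cublasSgemm above) only accepts 32-bit ints.
inline int CheckedNarrow(int64_t value, const std::string &name) {
  if (value < 0 || value > std::numeric_limits<int>::max()) {
    throw std::range_error(name + " does not fit in the 32-bit int expected by the BLAS call");
  }
  return static_cast<int>(value);
}

// Usage sketch, following the parameter names in the signature above:
//   cublasSgemm(handle, transa, transb,
//               CheckedNarrow(m, "m"), CheckedNarrow(n, "n"), CheckedNarrow(k, "k"),
//               &alpha, A, CheckedNarrow(lda, "lda"),
//               B, CheckedNarrow(ldb, "ldb"),
//               &beta, C, CheckedNarrow(ldc, "ldc"));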
That's why I want to have a dot/GEMM unit test for large input tensors. The same problem may also exist for OpenBLAS and MKL.
Hi @apeforest, I'm okay if this flag only impacts the total tensor size, as BLAS libraries should work well with that. But if it also impacts the size of each dimension, then we need more changes and validation to accommodate that, and the flag is not ready to be exposed to users.
@TaoLv The data type of each dimension in the tensor will be defined as index_t (if the flag is on, it is int64). However, that does not mean we will support a size greater than INT32_MAX in any single dimension. We only support a total element count greater than INT32_MAX in the tensor.
Thank you for the explanation @apeforest. Then it seems we need to document this somewhere and prevent users from passing a large tensor (dim[x] > INT32_MAX) to operators. If so, I think there is no need to change the API in dot_engine.
Makes sense. I have reverted the API signature in dot_engine.
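To make the constraint discussed above concrete: the total element count may exceed INT32_MAX while every individual dimension stays within int32. For example, a (50000, 50000) tensor has 2.5e9 elements (above INT32_MAX ≈ 2.147e9) even though each dimension easily fits in 32 bits. A hedged sketch of such a check follows; the names are illustrative, not the actual mshadow API:

#include <cstdint>
#include <cassert>
#include <vector>

// Illustrative check only: each dimension must fit in int32 even when the
// total element count (with a 64-bit index_t) is allowed to exceed INT32_MAX.
inline int64_t CheckedTotalSize(const std::vector<int64_t> &shape) {
  int64_t total = 1;
  for (int64_t dim : shape) {
    assert(dim >= 0 && dim <= INT32_MAX && "per-dimension size above INT32_MAX is not supported");
    total *= dim;
  }
  return total;  // may legitimately exceed INT32_MAX
}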
Changing the data type for `index_t` from `uint32_t` to `int64_t` caused performance degradation in operators defined in the mshadow library. I can think of three solutions to this problem:
(1) Add a compilation flag to choose data types for tensor size (This PR)
(2) Add an environment variable to choose data type for tensor size at runtime
(3) Choose data type for tensor size at runtime based on the size of the tensor
Due to the urgency of the customer impact and the non-trivial changes required for approaches (2) and (3), this PR takes the quick fix of approach (1).
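For contrast, a rough sketch of what the runtime alternative, approach (3), might have involved. This is purely illustrative and not part of this PR; the kernel name is hypothetical:

#include <cstdint>
#include <cstdio>

// Hypothetical kernel templated on the index width.
template <typename IndexT>
void KernelImpl(const float *data, IndexT total_size) {
  // ... iterate over total_size elements using IndexT arithmetic ...
  std::printf("running with %zu-bit indices\n", sizeof(IndexT) * 8);
  (void)data;
}

// Approach (3): dispatch on the tensor size at runtime so that small tensors
// keep the faster 32-bit indexing and only large tensors pay for 64-bit.
inline void LaunchKernel(const float *data, int64_t total_size) {
  if (total_size <= INT32_MAX) {
    KernelImpl<int32_t>(data, static_cast<int32_t>(total_size));
  } else {
    KernelImpl<int64_t>(data, total_size);
  }
}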
For more information and performance analysis, please refer to PR:
apache/mxnet#14570