
use multi-thread eigen while run on mobile device #6751

Closed · wants to merge 16 commits

Conversation

@hjchen2 (Contributor) commented Dec 19, 2017

Use multi-threaded Eigen on mobile to speed up inference. The table below shows MobileNet results with different thread counts (test device: a standard Xiaomi MI5, with two CPU cores locked to 1363MHz and the other two locked to 1401MHz):

framework                  speed   cpu   memory   size
paddlepaddle               353ms   25%   210M     3M
paddlepaddle (2 threads)   290ms   42%   210M     3M
paddlepaddle (4 threads)   253ms   50%   210M     3M

For non-depthwise convolutions, Eigen with two threads gives roughly a 2x speedup and four threads roughly 3x, but since nearly 140ms is spent in batch normalization, the overall speedup is not that large.
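For background, the speedup comes from evaluating Eigen tensor contractions on a ThreadPoolDevice instead of the default single-threaded device. A minimal standalone sketch of that mechanism (not the PR's code; sizes and thread count are arbitrary):

#define EIGEN_USE_THREADS
#include <unsupported/Eigen/CXX11/Tensor>

int main() {
  // Two worker threads, matching the "2 threads" row of the table above.
  Eigen::ThreadPool pool(2);
  Eigen::ThreadPoolDevice device(&pool, /*num_cores=*/2);

  Eigen::Tensor<float, 2> a(256, 256), b(256, 256), c(256, 256);
  a.setRandom();
  b.setRandom();

  // Contract a's second dimension with b's first, i.e. a matrix multiply.
  Eigen::array<Eigen::IndexPair<int>, 1> dims = {{Eigen::IndexPair<int>(1, 0)}};
  // Assigning through .device(...) lets Eigen split the contraction across the pool.
  c.device(device) = a.contract(b, dims);
  return 0;
}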

@@ -12,6 +12,12 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#ifdef _OPENMP
Contributor:

What is the difference between the _OPENMP and non-_OPENMP branches?

Contributor Author:

Sorry, initially the check for whether OpenMP is used was done here, but that branch was later moved into ThreadsNumManager. The _OPENMP branch here does indeed need to be removed.

@@ -44,6 +50,17 @@ paddle_error paddle_init(int argc, char** argv) {
return kPD_NO_ERROR;
}

paddle_error paddle_set_num_threads(int n) {
Contributor:

These two interfaces may not be very necessary. In real scenarios, users generally don't know how many threads to set. How many threads to use for multi-threaded computation is something Paddle should work out itself based on each op's workload.

Contributor Author:

Right, ideally the framework would automatically adjust the number of threads based on the workload and the op's computation type, but Paddle can't do that yet. I still think this interface is worth having; at the very least it saves changing code every time when benchmarking performance~~~
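As an illustration only, a hedged sketch of how the proposed C-API call could be used from application code (paddle_init, paddle_set_num_threads and kPD_NO_ERROR come from the hunk above; the wrapper function and its error handling are assumptions):

#include "main.h"  // C-API header from this diff, assumed to declare both calls

int init_inference_engine(int argc, char** argv) {
  if (paddle_init(argc, argv) != kPD_NO_ERROR) {
    return -1;
  }
  // Hypothetical: request two worker threads for the Eigen-backed kernels,
  // e.g. to reproduce the rows of the benchmark table without recompiling.
  if (paddle_set_num_threads(2) != kPD_NO_ERROR) {
    return -1;
  }
  return 0;
}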


namespace paddle {

int GetAndroidCpuCount();
Contributor:

No need to declare this.

Contributor Author:

ok


int GetAndroidCpuCount();

int GetOSXCpuCount();
Contributor:

Same as above.

@@ -70,7 +72,11 @@ struct EigenBlasGemm {
dims[0].first = transA ? 0 : 1;
dims[0].second = transB ? 1 : 0;

Eigen::DefaultDevice device;
#if defined(__ANDROID__) || defined(__OSX__)
Contributor:

I see there is an EIGEN_USE_THREADS macro at compile time; why not use that macro?

Contributor Author:

My main concern here is getting it working on mobile. The server side could also define EIGEN_USE_THREADS to support multi-threaded computation, but that would need to be considered together with the trainer count to decide how to set the number of threads.

@hedaoyuan (Contributor) left a comment:

You can start by changing EigenBlasGemm::compute to support multi-threading; there is no need to change the API part for now.

@hjchen2 (Contributor Author) commented Dec 20, 2017

@hedaoyuan I've made the changes and resubmitted; please help review. Thanks~

#include "capi_private.h"
#include "main.h"
#include "paddle/function/EigenDevice.h"
Contributor:

No change is needed here.

Contributor Author:

OK, sure.

Contributor Author:

Done

public:
static void Set(int n) {
#ifdef _OPENMP
omp_set_num_threads(n);
Contributor:

What is the performance difference between these two ways of setting up multi-threading? I see -fopenmp isn't added to the compile options, so when would the _OPENMP path ever be used?

Contributor Author:

It isn't used yet, because using OpenMP requires compiling with g++, and g++-built binaries are considerably slower than clang. I previously tried using OpenMP multi-threading to optimize NeonDepthwiseConv, but NeonDepthwiseConv doesn't take much time, far less than the performance loss from switching to g++, so I dropped OpenMP while keeping the OpenMP way of setting the thread count. Also, Eigen supports creating its thread pool directly from the configured OpenMP thread count. If we don't plan to use OpenMP in the future, the _OPENMP branch here can indeed be removed.
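To make the two paths concrete, here is a rough sketch of a ThreadsNumManager along the lines discussed above (the _OPENMP branch mirrors the hunk; the non-OpenMP branch and the getter are assumptions, not the merged code):

#ifdef _OPENMP
#include <omp.h>
#endif

class ThreadsNumManager {
public:
  static void Set(int n) {
#ifdef _OPENMP
    omp_set_num_threads(n);  // only reachable when built with -fopenmp (g++)
#else
    num() = n;  // stored and read later when the Eigen thread pool is created
#endif
  }
  static int Get() { return num(); }

private:
  static int& num() {
    static int n = 1;  // default: single-threaded
    return n;
  }
};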

#include <sys/types.h>
#endif

// #include <android/log.h>
Contributor:

Please delete line 25.

Contributor Author:

OK.

}
int rank0, rank1;
int num = fscanf(fp, "%d-%d", &rank0, &rank1);
// __android_log_print(ANDROID_LOG_DEBUG, "Paddle",
Contributor:

Please delete the unused code.

Contributor Author:

OK.

Contributor Author:

Done
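For context, the hunk above parses a "lo-hi" range such as "0-3". A self-contained sketch of that pattern (the sysfs path and the fallback value are assumptions, not taken from the PR):

#include <cstdio>

int GetAndroidCpuCount() {
  FILE* fp = std::fopen("/sys/devices/system/cpu/possible", "r");
  if (fp == nullptr) return 1;  // fall back to a single core
  int rank0 = 0, rank1 = 0;
  int num = std::fscanf(fp, "%d-%d", &rank0, &rank1);
  std::fclose(fp);
  // A single number means one core; a "lo-hi" range means rank1 - rank0 + 1 cores.
  if (num < 1) return 1;
  return num == 1 ? 1 : rank1 - rank0 + 1;
}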

@hjchen2 (Contributor Author) left a comment:

fix style and remove openmp support

}
int rank0, rank1;
int num = fscanf(fp, "%d-%d", &rank0, &rank1);
// __android_log_print(ANDROID_LOG_DEBUG, "Paddle",
Contributor Author:

Done

#include "capi_private.h"
#include "main.h"
#include "paddle/function/EigenDevice.h"
Contributor Author:

Done

@@ -38,7 +38,7 @@ if(NOT WITH_TIMER)
endif(NOT WITH_TIMER)

if(USE_EIGEN_FOR_BLAS)
add_definitions(-DPADDLE_USE_EIGEN_FOR_BLAS)
add_definitions(-DPADDLE_USE_EIGEN_FOR_BLAS -DEIGEN_USE_THREADS)
Contributor:

Remove -DEIGEN_USE_THREADS; keep single-threaded computation as the default.

Contributor Author:

Done

#ifdef EIGEN_USE_THREADS
const Eigen::ThreadPoolDevice& device = GetThreadPoolDevice();
#else
const Eigen::DefaultDevice device;
Contributor:

This branch fails to compile. Also, consider writing a separate multi-threaded Gemm interface here.

Contributor Author:

Done
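To illustrate the "separate multi-threaded Gemm interface" suggestion, a hedged sketch built on Eigen's TensorMap (the real EigenBlasGemm::compute also handles transposes, strides, and alpha/beta; this simplified row-major signature is hypothetical). The caller would pass the device obtained from GetThreadPoolDevice():

#define EIGEN_USE_THREADS
#include <unsupported/Eigen/CXX11/Tensor>

// C = A(MxK) * B(KxN), all row-major, evaluated on the given thread-pool device.
template <class T>
void ThreadedGemm(const Eigen::ThreadPoolDevice& device,
                  int M, int N, int K,
                  const T* A, const T* B, T* C) {
  Eigen::TensorMap<Eigen::Tensor<const T, 2, Eigen::RowMajor>> a(A, M, K);
  Eigen::TensorMap<Eigen::Tensor<const T, 2, Eigen::RowMajor>> b(B, K, N);
  Eigen::TensorMap<Eigen::Tensor<T, 2, Eigen::RowMajor>> c(C, M, N);
  Eigen::array<Eigen::IndexPair<int>, 1> dims = {{Eigen::IndexPair<int>(1, 0)}};
  c.device(device) = a.contract(b, dims);
}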

#endif

const Eigen::ThreadPoolDevice& GetThreadPoolDevice() {
int num_threads = ThreadsNumManager::Get();
Contributor:

There's no need to set the thread count equal to the number of CPU cores; on some 8-core or 10-core systems, performance actually degrades. Consider just setting num_threads directly to 2 or 4 here.

Contributor Author:

Done, capped at a maximum of 2.
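A sketch of what "capped at 2" could look like (assumed details; ThreadsNumManager is the setter/getter from the earlier hunks, and the real merged code may differ): the pool is created once and never uses more than two threads, however many cores the phone reports.

#define EIGEN_USE_THREADS
#include <unsupported/Eigen/CXX11/Tensor>
#include <algorithm>

const Eigen::ThreadPoolDevice& GetThreadPoolDevice() {
  // Cap at 2: on 8- or 10-core phones more Eigen threads made things slower.
  static const int num_threads = std::min(ThreadsNumManager::Get(), 2);
  static Eigen::ThreadPool pool(num_threads);
  static Eigen::ThreadPoolDevice device(&pool, num_threads);
  return device;
}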

@CLAassistant commented May 24, 2018

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


chenhoujiang seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.
