Add SIMD flags for runtime check #800
using namespace paddle; // NOLINT

TEST(SIMDFlags, gccTest) {
#if (defined(__GNUC__) || defined(__GNUG__)) && !(defined(__clang__))
With GCC, the results can be compared against __builtin_cpu_supports.
Maybe we should try this on a few unusual CPUs. I'm not sure whether any of our machines have AMD CPUs.
Agreed, we really should test on other CPUs.
static SIMDFlags* instance();

inline bool isSSE() { return simd_flags_ & SIMD_SSE; }
inline bool isSSE() const;
Same for the accessors below.
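To illustrate the suggestion, here is a minimal sketch of const-qualified accessors on a bitmask. The class name `SimdFlagsSketch` and the `kSse`/`kSse2`/`kAvx` bit values are hypothetical stand-ins, not the PR's actual `SIMDFlags` layout.

```cpp
#include <cstdint>

// Hypothetical bit layout; the PR's real SIMD_* values are not shown here.
enum SimdBit : std::uint32_t {
  kSse = 1u << 0,
  kSse2 = 1u << 1,
  kAvx = 1u << 2,
};

class SimdFlagsSketch {
public:
  explicit SimdFlagsSketch(std::uint32_t flags) : flags_(flags) {}

  // const-qualified: callable through a const object, reference, or pointer.
  bool isSSE() const { return (flags_ & kSse) != 0; }
  bool isSSE2() const { return (flags_ & kSse2) != 0; }
  bool isAVX() const { return (flags_ & kAvx) != 0; }

private:
  std::uint32_t flags_;
};
```

Since the accessors only read the mask, marking them `const` costs nothing and lets callers hold the flags object as `const`.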
namespace paddle {

/// init simd instance
static InitFunction __init_simd_flags(
Also, it seems lazy initialization would be fine here; it doesn't necessarily have to be initialized before main().
done
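One common way to get the lazy initialization discussed above is a function-local static. The sketch below is only an illustration: `LazySimdFlags`, `probeCpuFlags`, and `probeCallCount` are hypothetical names, and the probe returns a placeholder value instead of a real CPUID result.

```cpp
#include <cstdint>

// Hypothetical call counter so the laziness is observable.
inline int& probeCallCount() {
  static int count = 0;
  return count;
}

// Hypothetical stand-in for the real CPUID probe.
inline std::uint32_t probeCpuFlags() {
  ++probeCallCount();
  return 0x3u;  // placeholder value, not a real CPUID result
}

class LazySimdFlags {
public:
  // Function-local static: constructed on the first call only, and
  // initialization is thread-safe since C++11.
  static const LazySimdFlags& instance() {
    static const LazySimdFlags flags;
    return flags;
  }

  std::uint32_t raw() const { return flags_; }

private:
  LazySimdFlags() : flags_(probeCpuFlags()) {}
  std::uint32_t flags_;
};
```

With this shape the probe runs the first time `instance()` is called rather than before `main()`, and returning a `const` reference keeps callers from modifying the detected flags.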
#if (defined(__GNUC__) || defined(__GNUG__)) && !(defined(__clang__))
  CHECK(__builtin_cpu_supports("sse") == HAS_SSE);
  CHECK(__builtin_cpu_supports("sse2") == HAS_SSE2);
  CHECK(__builtin_cpu_supports("sse3") == HAS_SSE3);
This is wrong. __builtin_cpu_supports returns a positive number when the feature is present, not necessarily 1 or true.
done
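The pitfall is that comparing a "nonzero means yes" int directly with a bool promotes the bool to 0 or 1, so a result such as 8 would not equal `true`. A minimal sketch of one way to normalize before comparing follows; `sameTruth` and `builtinHasSse2` are hypothetical helper names, and the non-x86 fallback is an assumption added so the snippet compiles everywhere.

```cpp
// Compare a "nonzero means yes" int against a bool by normalizing first.
inline bool sameTruth(int rawResult, bool expected) {
  return (rawResult != 0) == expected;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
// GCC/Clang builtin on x86; yields a positive int when the feature exists.
inline bool builtinHasSse2() { return __builtin_cpu_supports("sse2") != 0; }
#else
// Assumed fallback for other compilers or architectures: report "no".
inline bool builtinHasSse2() { return false; }
#endif
```

In the test this would correspond to something like `CHECK(sameTruth(__builtin_cpu_supports("sse"), HAS_SSE))`, assuming `HAS_SSE` is a bool.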
LGTM
#pragma once

#include <iostream>
What is this #include <iostream> for?
I had written an operator<< overload for ostream earlier; after deleting it, I forgot to remove the header.
SIMDFlags();

static SIMDFlags* instance();
instance() could also return a const pointer.
#ifdef _WIN32

/// for MSVC
#define CPUID(info, x) __cpuidex(info, x, 0)
Doesn't the _WIN32 branch need a header include?
OK, I'll fix that together with the other changes.
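For reference, MSVC declares `__cpuidex` in `<intrin.h>`, while GCC and Clang provide a `__cpuid_count` macro in `<cpuid.h>`. The sketch below wraps both behind one function; `cpuidSketch` and `cpuidHasSse2` are hypothetical names, the `_MSC_VER` check is an assumption (the PR tests `_WIN32`), and the zero-returning fallback exists only so the snippet compiles on non-x86 targets.

```cpp
#include <array>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>  // declares __cpuidex for MSVC
inline std::array<int, 4> cpuidSketch(int leaf, int subleaf) {
  int regs[4] = {0, 0, 0, 0};
  __cpuidex(regs, leaf, subleaf);
  return {{regs[0], regs[1], regs[2], regs[3]}};
}
#elif (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>  // provides the __cpuid_count macro for GCC/Clang
inline std::array<int, 4> cpuidSketch(int leaf, int subleaf) {
  unsigned int a = 0, b = 0, c = 0, d = 0;
  __cpuid_count(leaf, subleaf, a, b, c, d);
  return {{static_cast<int>(a), static_cast<int>(b),
           static_cast<int>(c), static_cast<int>(d)}};
}
#else
// Assumed fallback: no CPUID instruction available, report all zeros.
inline std::array<int, 4> cpuidSketch(int, int) { return {{0, 0, 0, 0}}; }
#endif

// CPUID leaf 1 reports SSE2 in EDX bit 26 (index 3 = EDX here).
inline bool cpuidHasSse2() {
  return (cpuidSketch(1, 0)[3] & (1 << 26)) != 0;
}
```

Keeping the include next to the intrinsic it declares makes it harder to drop one without the other.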