-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sycl: cleanup oneDNN related code #12097
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall. Can you paste here some performance numbers using the Nvidia backend with and without oneDNN? Since we are suggesting to enable oneDNN for Nvidia the performance should be at least as good.
af9f64c
to
3692b1a
Compare
llama.cpp use the official release of oneAPI (including oneDNN). So, this draft would be pending for a long time I guess. |
@NeoZhangJianyu The official release of oneDNN included in the oneapi release doesn't include nvidia support, and therefore it needs to be compiled from source. The oneDNN PR addresses an issue related to nvidia builds. Additionally, the changes don't affect intel devices as an 'official' release of oneDNN can still be used. |
ggml/src/ggml-sycl/gemm.hpp
Outdated
auto a_mem = dnnl::memory(a_in_md, eng, const_cast<void*>(a)); | ||
auto b_mem = dnnl::memory(b_in_md, eng, const_cast<void*>(b)); | ||
auto matmul_pd = dnnl::matmul::primitive_desc(eng, a_in_md, b_in_md, c_md); | ||
auto a_mem = dnnl::memory(a_in_md, eng, (void *) a); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove const_cast
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was an oversight, I've reverted it, thanks for flagging.
ggml/src/ggml-sycl/gemm.hpp
Outdated
auto b_mem = dnnl::memory(b_in_md, eng, const_cast<void*>(b)); | ||
auto matmul_pd = dnnl::matmul::primitive_desc(eng, a_in_md, b_in_md, c_md); | ||
auto a_mem = dnnl::memory(a_in_md, eng, (void *) a); | ||
auto b_mem = dnnl::memory(b_in_md, eng, (void *) b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
@Rbiessy numbers below:
build: 3692b1a (4518)
build: 3692b1a (4518)
build: 3692b1a (4518)
build: 3692b1a (4518)
build: 3692b1a (4518)
build: 3692b1a (4518) |
3692b1a
to
11bc77b
Compare
11bc77b
to
ec9f879
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes! Overall LGTM...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM as well, thanks for adding the performance as well.
This PR cleans up and improves some oneDNN-related code:
Marking it as a draft PR as it needs oneapi-src/oneDNN#2768 to be merged in oneDNN to fix a bug with missing dependencies in oneDNN.