Low GPU utilization on python/paddle/v2/fluid/tests/book/test_label_semantic_roles.py #7652
Comments
yes, now the
I have tested the performance: running on CUDAPlace is about twice as fast as running on CPUPlace. You can check it yourself.
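For reference, a minimal sketch (not from the issue) of how such a timing comparison could be set up; `run_one_pass` is a hypothetical callable standing in for the `exe.run()` training iterations of test_label_semantic_roles.py:

```python
import time

import paddle.v2.fluid as fluid


def time_training(place, run_one_pass, passes=1):
    """Build an executor on `place`, run `passes` training passes, return seconds."""
    exe = fluid.Executor(place)
    exe.run(fluid.default_startup_program())
    start = time.time()
    for _ in range(passes):
        run_one_pass(exe)  # hypothetical helper wrapping the exe.run() loop
    return time.time() - start


# Hypothetical usage, after the network has been built:
#   cpu_s = time_training(fluid.CPUPlace(), run_one_pass)
#   gpu_s = time_training(fluid.CUDAPlace(0), run_one_pass)
#   print("speedup: %.2fx" % (cpu_s / gpu_s))
```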
@jacquesqiao I will remove the copy operation in the linear_chain_crf op to avoid this repeated copy. Related to #7654
The memory copy inside linear_chain_crf_op is removed in PR #7675. This fixes the problem that both the framework and the operator itself copy the inputs out of GPU memory and then copy the results back to GPU memory when this operator runs on GPU.
We used the profiling tool to look at the execution time of the different ops; the results are below.
As you can see, the crf-related ops account for most of the total execution time. This is because crf runs on the CPU; we can address this later by providing a GPU implementation of crf.
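As a rough illustration of this kind of measurement, here is a hedged sketch of per-op profiling; it assumes the `paddle.v2.fluid.profiler.profiler` context manager, whose exact API at the 0.11.0 release may differ:

```python
import paddle.v2.fluid.profiler as profiler


def profile_one_pass(exe, program, feed_batches, fetch_list):
    """Run the given feed batches under the profiler; a per-op timing table
    is printed when the context manager exits."""
    # 'All' asks for both CPU-side and GPU-side timings, sorted by total cost.
    with profiler.profiler('All', sorted_key='total'):
        for feed in feed_batches:
            exe.run(program, feed=feed, fetch_list=fetch_list)
```

In a table like this, the crf-related ops described above would show up at the top, since they dominate the total time.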
Compile version: 0.11.0
Device: GPU, one P40 card
Script: python/paddle/v2/fluid/tests/book/test_label_semantic_roles.py
I changed the script at line 178 from place = fluid.CPUPlace() to place = fluid.CUDAPlace(0).
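In context, the change looks like the following sketch (the surrounding lines are paraphrased from the book example; only the `place` line differs from the original script):

```python
import paddle.v2.fluid as fluid

# place = fluid.CPUPlace()   # original line 178
place = fluid.CUDAPlace(0)   # run on the first GPU instead
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
```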
The example runs normally, but GPU utilization is only about 20% on a single card. I guess some ops are not running on the GPU. Is there any way I can check whether all ops are running on GPU devices?
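One hedged way to check this programmatically is sketched below; it assumes the `core.op_support_gpu` binding exposed by fluid around that release, and it only reports whether a GPU kernel is registered for each op type, not where the framework actually places each op:

```python
import paddle.v2.fluid as fluid
import paddle.v2.fluid.core as core


def report_gpu_support(program):
    """Print, for each distinct op type in the program, whether a GPU kernel exists."""
    op_types = set(op.type for op in program.global_block().ops)
    for op_type in sorted(op_types):
        print("%-25s GPU kernel: %s" % (op_type, core.op_support_gpu(op_type)))


# Hypothetical usage, after building the network:
#   report_gpu_support(fluid.default_main_program())
```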