-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GemmConvMobileFunction(optimized for mobile) #7034
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost LGTM.
paddle/function/GemmConvOp.cpp
Outdated
size_t colWidth = outputHeight * outputWidth; | ||
// Max col matrix height 256, Max col matrix width 1024 | ||
size_t stepColHeight = std::min(colHeight, (size_t)256); | ||
size_t stepColWidth = std::min(colWidth, (size_t)2048); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static_cast<size_t>(256)
static_cast<size_t>(2048)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
paddle/function/Im2ColTest.cpp
Outdated
for (size_t filterWidth : {3, 7}) { | ||
for (size_t stride : {1, 2}) { | ||
for (size_t padding : {0, 1}) { | ||
for (size_t dilation : {1, 3}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the test case can be reduced to speed up the unit testing time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
paddle/function/Im2Col.h
Outdated
(imRowIdx - paddingHeight) >= inputHeight || | ||
(imColIdx - paddingWidth) < 0 || | ||
(imColIdx - paddingWidth) >= inputWidth) { | ||
colData[colh * colWidthSize + colw] = T(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static_cast<T>(0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
outputWidth}); | ||
|
||
resizeBuffer<Device>(stepColHeight * stepColWidth * sizeof(real)); | ||
colData = reinterpret_cast<real*>(memory_->getBuf()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that the memory_
is not released in every layer. Do you think the if at most 2M extra workspace is used in every layer, we do not need to release it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Add release the memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Fix #6801 #6802
#6802 (review)