Add reduce op #4086
Conversation
ops::ReduceKernel<paddle::platform::CPUPlace, float, ops::MinFunctor>);
REGISTER_OP_CPU_KERNEL(reduce_min_grad,
                       ops::ReduceGradKernel<paddle::platform::CPUPlace, float,
                                             ops::MaxOrMinGradFunctor>);
I think min, max, etc. could be written as an attr so that only one kernel is registered; the .cc file would then be much shorter and cleaner.
ReduceMinOpMaker, ReduceMaxOpMaker, and the others are currently almost entirely duplicated. Four separate kernels are written now; merging them into one would save nearly 3/4 of the code.
Sorry for the late reply. This follows TensorFlow's approach (/~https://github.com/tensorflow/tensorflow/blob/216dcbf1e08c02d87774120ebd5b251c5c30c56c/tensorflow/core/kernels/reduction_ops_sum.cc#L26), and PyTorch has similar per-op reduce operations as well (/~https://github.com/pytorch/pytorch/blob/master/torch/autograd/_functions/reduce.py). Splitting into multiple OPs seems semantically clearer, and writing min/max as an attr would make the kernel code rather long. Other reduce operations may be added later, and a potential advantage of the functor approach is that the existing kernel can be reused, which the current kernel otherwise wouldn't easily allow. I'm not sure which is better. Thanks for the comments and suggestions~
  break;
case 6:
  ReduceCompute<6>(context);
  break;
What do these 6 cases mean?
These correspond to the supported EigenTensor ranks. Since EigenTensor takes the rank as a template parameter, which must be known at compile time, the cases are written out explicitly here.
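For readers following along, here is a minimal standalone sketch of this dispatch pattern (the function names here are made up for illustration, not the PR's): the tensor rank is only known at run time, while EigenTensor needs it at compile time, hence the explicit switch over template instances.

#include <stdexcept>

// Hypothetical illustration: pick a compile-time rank D from a run-time rank.
template <int D>
void ReduceWithRank(/* tensor arguments elided */) {
  // In the real kernel, EigenTensor<T, D>::From(...) would be used here.
}

void Reduce(int rank) {
  switch (rank) {
    case 1: ReduceWithRank<1>(); break;
    case 2: ReduceWithRank<2>(); break;
    case 3: ReduceWithRank<3>(); break;
    case 4: ReduceWithRank<4>(); break;
    case 5: ReduceWithRank<5>(); break;
    case 6: ReduceWithRank<6>(); break;  // rank 6 is the cap discussed below
    default: throw std::invalid_argument("only ranks 1-6 are supported");
  }
}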
auto out = EigenTensor < T, D == 1 ? 1 : (D - 1) > ::From(*output, dims);
auto& place = context.GetEigenDevice<Place>();
Functor functor;
functor(place, x, out, reduce_dim);
If this were written as an attr, no functor would be needed here either; a switch would do, and about half of the code in lines 29-86 could be saved.
A potential advantage of the functor approach is that it is likely easier to reuse; the current kernel doesn't seem easy to reuse otherwise.
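For context, roughly what the functor pattern looks like (a sketch modeled on this PR's approach, not verbatim): each reduce op contributes only a small policy struct, and the single shared ReduceKernel template stays unchanged.

#include <unsupported/Eigen/CXX11/Tensor>

// Each reduce op supplies only a tiny policy object; the shared
// ReduceKernel template instantiates it and never has to change.
struct MaxFunctor {
  template <typename Place, typename X, typename Y, typename Dim>
  void operator()(const Place& place, X& x, Y& y, const Dim& dim) {
    y.device(place) = x.maximum(dim);  // Eigen reduction along `dim`
  }
};

struct MinFunctor {
  template <typename Place, typename X, typename Y, typename Dim>
  void operator()(const Place& place, X& x, Y& y, const Dim& dim) {
    y.device(place) = x.minimum(dim);
  }
};

// Adding a new reduce op then amounts to a new functor plus one
// REGISTER_OP_*_KERNEL line like the reduce_min registrations quoted above.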
ops::ReduceKernel<paddle::platform::GPUPlace, float, ops::MinFunctor>);
REGISTER_OP_GPU_KERNEL(reduce_min_grad,
                       ops::ReduceGradKernel<paddle::platform::GPUPlace, float,
                                             ops::MaxOrMinGradFunctor>);
In the current implementation, each loop iteration launches an Eigen GPU kernel once, which will be rather slow. Consider writing a dedicated kernel.
I don't quite follow what this means; could you explain in a bit more detail?~
Some of the code could be omitted.
self.attrs = {'dim': 1}
self.outputs = {'Out': self.inputs['X'].mean(axis=self.attrs['dim'])}

def test_check_output(self):
Define a base class, then write TestMeanOp(xxxx) inheriting from it; functions like test_check_output can then all be reused.
Good suggestion, thanks. reduce_max and reduce_min skip the gradient check here, and the test contents of the several OPs are not very uniform, so I'll leave this unchanged for now.
paddle/operators/reduce_op.cc
Outdated
"(Tensor) The input tensor. Tensors with rank at most 6 are supported"); | ||
AddOutput("Out", "(Tensor) The result tensor."); | ||
AddComment(R"DOC( | ||
ReduceMean operator computes the sum of input tensor along the given dimension. |
The OpMakers for Sum, Mean, Max, and Min are basically identical. They could share one base class, with subclasses handling only the parts that differ.
For reference: #4139 ElementwiseOpmaker.
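A rough sketch of the suggested refactor (class names and signatures are illustrative, extrapolated from the code quoted above, not a definitive implementation): the base OpMaker declares the shared X/Out/dim pieces once, and each subclass fills in only the op-specific comment.

class ReduceOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  ReduceOpMaker(framework::OpProto* proto,
                framework::OpAttrChecker* op_checker)
      : OpProtoAndCheckerMaker(proto, op_checker) {
    // Shared by every reduce op.
    AddInput("X",
             "(Tensor) The input tensor. Tensors with rank at most 6 are "
             "supported");
    AddOutput("Out", "(Tensor) The result tensor.");
    AddAttr<int>("dim",
                 "(int, default 0) The dimension to reduce. "
                 "Must be in the range [-rank(input), rank(input))")
        .SetDefault(0);
  }
};

class ReduceSumOpMaker : public ReduceOpMaker {
 public:
  ReduceSumOpMaker(framework::OpProto* proto,
                   framework::OpAttrChecker* op_checker)
      : ReduceOpMaker(proto, op_checker) {
    // Only the documentation differs per op.
    AddComment("ReduceSum operator computes the sum of the input tensor "
               "along the given dimension.");
  }
};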
Done. Thanks~
paddle/operators/reduce_op.cc
Outdated
    : OpProtoAndCheckerMaker(proto, op_checker) {
  AddInput(
      "X",
      "(Tensor) The input tensor. Tensors with rank at most 6 are supported");
Our Tensor supports up to 9 dimensions. Where does this limit of 6 come from?
This currently follows crop_op (/~https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/crop_op.h); it may indeed need to be unified.
And then?
Force-pushed from ebcf710 to 8b3bf28
paddle/operators/reduce_op.cc
Outdated
"(Tensor) The input tensor. Tensors with rank at most 6 are supported"); | ||
AddOutput("Out", "(Tensor) The result tensor."); | ||
AddComment(R"DOC( | ||
ReduceMean operator computes the sum of input tensor along the given dimension. |
ReduceMean -> ReduceSum
paddle/operators/reduce_op.cc
Outdated
namespace operators {

using framework::Tensor;
using framework::LoDTensor;
remove this line.
Done.
paddle/operators/reduce_op.cc
Outdated
  dims_vector.erase(dims_vector.begin() + dim);
}
auto out_dims = framework::make_ddim(dims_vector);
ctx.Output<framework::LoDTensor>("Out")->Resize(out_dims);
framework::LoDTensor -> framework::Tensor
Done.
paddle/operators/reduce_op.cc
Outdated
AddOutput("Out", "(Tensor) The result tensor."); | ||
AddAttr<int>("dim", | ||
"(int, default 0) The dimension to reduce. " | ||
"Must be in the range [-rank(input), rank(input))") |
Add more comments for the dim, or add some examples in the Doc.
A negative dim counts from the last dimension here, right? Please add a comment.
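For example, the usual convention (a minimal sketch, assuming the documented range [-rank(input), rank(input)); the helper name is made up) maps a negative dim onto an axis counted from the end:

// A negative `dim` counts from the last dimension; normalizing it lets the
// rest of the kernel work with a plain non-negative axis index.
int NormalizeDim(int dim, int rank) {
  return dim < 0 ? dim + rank : dim;  // e.g. dim = -1 with rank 4 -> axis 3
}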
Done.
paddle/operators/reduce_op.h
Outdated
auto equals = x == y.broadcast(dim);
auto ones = dx.constant(1);
auto zeros = dx.constant(0);
dx.device(place) = dy.broadcast(dim) * equals.select(ones, zeros);
If there are multiple max/min values, this implementation propagates the full grad to every max value in the backward pass, rather than giving some of them zero gradient. Also, after discussing with @guoshengCS: TF averages the gradient over multiple max/min values here: /~https://github.com/tensorflow/tensorflow/blob/37f7ad75bbd2ca140d1092342eb3590d54193bc8/tensorflow/cc/gradients/math_grad.cc#L711
Please add a comment on how we handle this~
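To make the two policies concrete, a small standalone sketch (plain loops rather than the PR's Eigen code; MaxGrad1D is a made-up helper) contrasting "every tie gets the full gradient" with TF's "split the gradient among the ties":

#include <algorithm>
#include <vector>

// Gradient of max() over a 1-D slice. With spread_evenly == false every
// tying maximum receives the full upstream gradient `dy`, which is what the
// equals.select(ones, zeros) expression above computes; with
// spread_evenly == true the gradient is divided among the ties, as TF does.
std::vector<float> MaxGrad1D(const std::vector<float>& x, float dy,
                             bool spread_evenly) {
  float m = *std::max_element(x.begin(), x.end());
  int ties = static_cast<int>(std::count(x.begin(), x.end(), m));
  std::vector<float> dx(x.size(), 0.f);
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (x[i] == m) dx[i] = spread_evenly ? dy / ties : dy;
  }
  return dx;
}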
Done.
paddle/operators/reduce_op.h
Outdated
auto* input2 = context.Input<Tensor>(framework::GradVarName("Out"));
auto* output = context.Output<Tensor>(framework::GradVarName("X"));

if (output != nullptr) {
If the backward op has only one output, the implementation doesn't need to handle the output == nullptr case; that case is turned into a NOP in /~https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/backward.cc#L67.
Done.

 private:
  template <size_t D>
  void ReduceCompute(const framework::ExecutionContext& context) const {
To distinguish it from the ReduceCompute above, should this one be called ReduceGradCompute?
Done.
paddle/operators/reduce_op.h
Outdated

// For EigenTensor unsupported reduce
template <typename T, typename Functor>
class ReduceGradEigenFreeKernel : public framework::OpKernel {
Where is this kernel used?
Done. Removed for now.
Force-pushed from 7651339 to 477a6a0
LGTM.
LGTM
Resolves #4060