-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add broadcasting support (e.g. matrix-vector) for cos sim operator. #3918
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost LGTM.
paddle/operators/cos_sim_op.cc
Outdated
"Shape of Input(Out) must be [X.Dim(0), 1]."); | ||
auto out_grad_dims = | ||
ctx.Input<Tensor>(framework::GradVarName("Out"))->dims(); | ||
PADDLE_ENFORCE_EQ(out_grad_dims, framework::make_ddim({x_dims[0], 1}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can using auto x_dims = framework::make_ddim({x_dims[0], 1})
to define x_dims
first, and in the following you can just use x_dims
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
paddle/operators/cos_sim_op.h
Outdated
auto place = context.GetEigenDevice<Place>(); | ||
auto xy = (x * y).sum(Eigen::array<int, 1>({1})); | ||
x_norm.device(place) = x.square().sum(Eigen::array<int, 1>({1})).sqrt(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can define row_along = Eigen::array<int, 1>({1});
first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
auto xy = (x * y).sum(Eigen::array<int, 1>({1})); | ||
z.device(place) = xy / x_norm / y_norm; | ||
} else { | ||
Eigen::DSizes<int, 2> bcast(rows_x, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better change bcast
to bcast_dims
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to bcast_rows
paddle/operators/cos_sim_op.h
Outdated
int rows_y = in_y->dims()[0]; | ||
int cols = framework::product(in_x->dims()) / rows_x; | ||
auto x = EigenMatrix<T>::From(*in_x, framework::make_ddim({rows_x, cols})); | ||
auto y = EigenMatrix<T>::From(*in_y, framework::make_ddim({rows_y, cols})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can save framework::make_ddim({rows_x, cols})
and framework::make_ddim({rows_y, cols})
to variables and reuse them in the followings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
paddle/operators/cos_sim_op.h
Outdated
Eigen::DSizes<int, 2> bcast_row(rows_x, 1); | ||
auto y_bcast = y.broadcast(bcast_row); | ||
auto y_snorm_bcast = | ||
y_norm.square().eval().broadcast(bcast_row).eval().broadcast(bcast); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please merge the two broadcast
operations to one like broadcast({rows, cols})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very comprehensive unit tests! Thanks!
paddle/operators/cos_sim_op.cc
Outdated
// shape check | ||
auto x_dims = ctx.Input<Tensor>("X")->dims(); | ||
auto y_dims = ctx.Input<Tensor>("Y")->dims(); | ||
PADDLE_ENFORCE_EQ(framework::arity(x_dims), framework::arity(y_dims), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DDim
has a member function size()
, with returns the arity of itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
paddle/operators/cos_sim_op.cc
Outdated
PADDLE_ENFORCE_EQ( | ||
framework::slice_ddim(x_dims, 1, framework::arity(x_dims)), | ||
framework::slice_ddim(y_dims, 1, framework::arity(y_dims)), | ||
"All dimensions except 1st of Input(X) and Input(Y) must be equal."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
‘... the 1st ...’ ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
paddle/operators/cos_sim_op.cc
Outdated
auto y_dims = ctx.Input<Tensor>("Y")->dims(); | ||
PADDLE_ENFORCE_EQ(framework::arity(x_dims), framework::arity(y_dims), | ||
"Ranks of Input(X) and Input(Y) must be equal."); | ||
PADDLE_ENFORCE_GE(framework::arity(x_dims), 2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't the rank of x be 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The operator assumes that the first dim of the tensor is for batch axis, so we require that the rank to be larger than 1. Otherwise,it is difficult to tell whether a vector of shape 5 is actually 5 x 1 (five samples in the batch) or 1 x 5 (one sample)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very reasonable! Thanks!
paddle/operators/cos_sim_op.cc
Outdated
"Rank of Input(X) must not be less than 2."); | ||
PADDLE_ENFORCE_EQ( | ||
framework::slice_ddim(x_dims, 1, framework::arity(x_dims)), | ||
framework::slice_ddim(y_dims, 1, framework::arity(y_dims)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these two sliced dims are the same, tensors must have the same rank. The above PADDLE_ENFORCE_EQ(framework::arity(x_dims), framework::arity(y_dims), ...
is unnecessary.
The infershape of backward operator has the same problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do not make sure that X and Y are of the same rank (meaning rank of Y is not less than 2), framework::slice_ddim(y_dims, 1, framework::arity(y_dims))
might throw exceptions because framework::arity(y_dims)
might be smaller than 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it!
paddle/operators/cos_sim_op.cc
Outdated
framework::slice_ddim(y_dims, 1, framework::arity(y_dims)), | ||
"All dimensions except 1st of Input(X) and Input(Y) must be equal."); | ||
PADDLE_ENFORCE(x_dims[0] == y_dims[0] || y_dims[0] == 1, | ||
"1st dimension of Input(Y) must be equal to Input(X) or " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
‘The 1st dimension...’
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
paddle/operators/cos_sim_op.cc
Outdated
"just 1 (which will be broadcasted to match Input(X))."); | ||
|
||
// resize tensor | ||
ctx.Output<Tensor>("Out")->Resize({x_dims[0], 1}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the second dimension is just 1, why do we need it? We can squeeze it to one dimension.
The infershape of backward operator has the same problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. That's a question we've discussed with @QiJune ---- make the output tensor to be of rank 1 or of rank 2? I prefer rank2, since when we use the output for further processing, it is very clear that it is a batch of samples of scalar instead of one single sample of vector. In a word, it reduces ambiguity. We can discuss this more.
paddle/operators/cos_sim_op.cc
Outdated
AddInput("X", "The first input of cos_sim op."); | ||
AddInput("Y", "The second input of cos_sim op."); | ||
AddInput("X", "The 1st input of cos_sim op."); | ||
AddInput("Y", "The 2nd input of cos_sim op."); | ||
AddOutput("Out", "The output of cos_sim op."); | ||
AddOutput("XNorm", "Row norm of the first input.").AsIntermediate(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is 'row norm'? Please add more instruction for this output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
paddle/operators/cos_sim_op.h
Outdated
int rows_x = in_x->dims()[0]; | ||
int rows_y = in_y->dims()[0]; | ||
int cols = framework::product(in_x->dims()) / rows_x; | ||
auto x = EigenMatrix<T>::From(*in_x, framework::make_ddim({rows_x, cols})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EigenMatrix
has a member function Reshape
, which converts a paddle tensor to EigenMatrix.
static typename EigenMatrix::ConstType Reshape(const Tensor& tensor, int num_col_dims);
The EigenMatrix's first dimension(column length) will be the product of tensor's first num_col_dims
dimensions. In current case, it is 1.
See /~https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/eigen.h#L67
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you for the excellent job!
Resolve #3917