Add cosine similarity operator. #3815

Merged
merged 3 commits into PaddlePaddle:develop from cos_sim_layer2 on Sep 6, 2017

Conversation

xinghai-sun (Contributor):

Resolved #3814

dzhwinter (Contributor) previously approved these changes on Sep 4, 2017:

LGTM++

grad_y->mutable_data<T>(context.GetPlace());

auto dims = x->dims();
int size = static_cast<int>(framework::product(dims));
Contributor:

Maybe the return type ssize_t is not proper here. Since the product of dims measures the variable's reserved memory, int (or int64) should be enough. Is there any other reason behind it? @Canpio @QiJune

xinghai-sun (Contributor, author), Sep 5, 2017:

It seems that no negative values are needed for many of the variables in dim.h, so why not use size_t directly instead of int, to avoid such unnecessary type conversions?

JiayiFeng (Collaborator), Sep 5, 2017:

Using int64_t can't solve the problem, because int64_t also cannot be cast to int automatically.

Member:

Everything shape-, size-, and dim-related will use int64_t, and I will fix it later. So the code here will be:

int64_t size = framework::product(dims);
auto new_dims = framework::make_ddim({dims[0], size / dims[0]});

xinghai-sun (Contributor, author):

Why not use size_t for all of them?
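
For reference, a minimal standalone sketch of the flattening discussed above, using only standard-library types as stand-ins for Paddle's DDim / framework::product API (the shape values are made up):

#include <cstdint>
#include <functional>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
  // Hypothetical tensor shape; stands in for x->dims().
  std::vector<int64_t> dims = {32, 64, 128};
  // Accumulating in int64_t avoids the narrowing static_cast<int>,
  // mirroring the proposal that the product of dims be int64_t.
  int64_t size = std::accumulate(dims.begin(), dims.end(), int64_t{1},
                                 std::multiplies<int64_t>());
  // Flatten all trailing dimensions into a 2-D shape {dims[0], size / dims[0]},
  // as in the make_ddim call above.
  std::cout << dims[0] << " x " << size / dims[0] << std::endl;
  return 0;
}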

auto *x_grad = ctx.Output<Tensor>(framework::GradVarName("X"));
auto *y_grad = ctx.Output<Tensor>(framework::GradVarName("Y"));
x_grad->Resize(x_dims);
y_grad->Resize(y_dims);
Contributor:

if (x_grad) x_grad->Resize(x_dims);
if (y_grad) y_grad->Resize(y_dims);

xinghai-sun (Contributor, author):

Done.
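
Why the check matters, as a minimal standalone sketch (the Tensor and Dims types below are illustrative stand-ins, not Paddle's actual classes): when one input is marked as not needing a gradient, its grad output variable can be absent, so an unguarded Resize would dereference a null pointer.

#include <cstdint>
#include <vector>

// Stand-in types for this sketch only.
struct Dims { std::vector<int64_t> d; };
struct Tensor {
  Dims dims;
  void Resize(const Dims& new_dims) { dims = new_dims; }
};

// Guard the Resize calls as suggested in the review comment above.
void ResizeGradOutputs(Tensor* x_grad, Tensor* y_grad,
                       const Dims& x_dims, const Dims& y_dims) {
  if (x_grad) x_grad->Resize(x_dims);
  if (y_grad) y_grad->Resize(y_dims);
}

int main() {
  Dims x_dims{{4, 8}}, y_dims{{4, 8}};
  Tensor x_grad;
  // y_grad is absent here, e.g. Y does not need a gradient.
  ResizeGradOutputs(&x_grad, nullptr, x_dims, y_dims);
  return 0;
}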

auto XY = (X * Y).sum(Eigen::array<int, 1>({1}));
XNorm.device(place) = (X * X).sum(Eigen::array<int, 1>({1})).sqrt();
YNorm.device(place) = (Y * Y).sum(Eigen::array<int, 1>({1})).sqrt();
Z.device(place) = XY / XNorm / YNorm;
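
For context, the row-wise quantity the snippet above computes is the standard cosine similarity; in LaTeX, with i indexing rows and j indexing columns:

z_i = \frac{\sum_j x_{ij} y_{ij}}{\sqrt{\sum_j x_{ij}^2}\,\sqrt{\sum_j y_{ij}^2}}
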
Contributor:

The names of variables (including function parameters) and data members are all lowercase

https://google.github.io/styleguide/cppguide.html#Variable_Names

xinghai-sun (Contributor, author):

Done.

}
self.compare_grad(op, inputs)
self.check_grad(
op, inputs, set(["X", "Y"]), "Out", max_relative_error=0.05)
Contributor:

xinghai-sun (Contributor, author):

Done.

dX.device(place) =
dZ_bcast * (Y / norm_prod_bcast - Z_bcast * X / X_snorm_bcast);
dY.device(place) =
dZ_bcast * (X / norm_prod_bcast - Z_bcast * Y / Y_snorm_bcast);
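
For context, with z_i as above, the per-element gradients that these expressions implement are, in LaTeX:

\frac{\partial z_i}{\partial x_{ij}} = \frac{y_{ij}}{\lVert x_i \rVert\,\lVert y_i \rVert} - z_i \frac{x_{ij}}{\lVert x_i \rVert^2},
\qquad
\frac{\partial z_i}{\partial y_{ij}} = \frac{x_{ij}}{\lVert x_i \rVert\,\lVert y_i \rVert} - z_i \frac{y_{ij}}{\lVert y_i \rVert^2}

each multiplied by the broadcast upstream gradient dZ.
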
Contributor:

Need to handle the case where one of the inputs has no gradient.

if (dX) {
  auto X_snorm_bcast = X_norm.square().eval().broadcast(bcast);
  dX.device(place) =
    dZ_bcast * (Y / norm_prod_bcast - Z_bcast * X / X_snorm_bcast);
}
// ...

xinghai-sun (Contributor, author):

Done.

xinghai-sun merged commit b59f301 into PaddlePaddle:develop on Sep 6, 2017.
xinghai-sun deleted the cos_sim_layer2 branch on September 6, 2017, 06:34.