Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add axis for mul_op and rowwise_add_op #3888

Merged
merged 18 commits into from
Sep 8, 2017

Conversation

JiayiFeng
Copy link
Collaborator

@JiayiFeng JiayiFeng commented Sep 5, 2017

fixes #3722

  1. Add a global function FlattenToMatrix to convert a tensor to a matrix.
  2. Add attributes x_num_col_dims and y_num_col_dims for mul_op and adjust its InferShape and kernel computation. num_col_dims mean how many dimensions will be producted togother to build the result matrix's first dimension.
    e.g. [2,3,4,5,6] num_col_dims=3 ====> [24, 30]
  3. Add unit tests for cases that mul_op takes tensors as inputs
  4. Add axis for rowwise_add_op
  5. Add unit tests for cases that rowwise_add_op takes tensors as inputs

@JiayiFeng JiayiFeng requested a review from reyoung September 5, 2017 18:17
PADDLE_ENFORCE(num_row_dims > 0 && num_row_dims < rank,
"`num_row_dims` must be between (0, rank_of_tensor).");
return EigenMatrix::From(
tensor, make_ddim({static_cast<int>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make_ddim could be removed, just {0, 10} is OK.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a method in class DDim, such as

class DDim {
 public:
  Dim<2> FlattenToMat(int numFlattenDims) const;
};

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@JiayiFeng JiayiFeng changed the title [WIP] Add axis for operators Add axis for mul_op and rowwise_add_op Sep 5, 2017
@QiJune
Copy link
Member

QiJune commented Sep 6, 2017

Whether should we add a reshape operator? if user want to multiply two tensors, he should reshape the tensor to matrix first.

@JiayiFeng
Copy link
Collaborator Author

JiayiFeng commented Sep 6, 2017

@QiJune It's a feasible way, but might be slow and consume more memory.

public:
explicit EqualLargerThanChecker(T lower_bound) : lower_bound_(lower_bound) {}
void operator()(T& value) const {
PADDLE_ENFORCE(value >= lower_bound_, "equal_larger_than check fail");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check fail -> check fails

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -148,5 +148,13 @@ inline Tensor& Tensor::Resize(const DDim& dims) {

inline const DDim& Tensor::dims() const { return dims_; }

template <typename T>
inline Tensor FlattenToMatrix(const Tensor& src, int num_row_dims) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to add the explanation for num_row_dims.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

num_row_dims is not easy to use, so I use num_col_dms instead. And comments have been added.

AddAttr<int>(
"x_num_row_dims",
"mul_op can take tensors with more than two dimensions as input `X`, "
"in that case, tensors will be flattened to a matrix. The matrix's "
Copy link
Contributor

@qingqing01 qingqing01 Sep 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flattened -> reshaped? In the Numpy, flatten means converting to a vector.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"mul_op can take tensors with more than two dimensions as input `X`, "
"in that case, tensors will be flattened to a matrix. The matrix's "
"second dimension(row length) will be the product of tensor's last "
"`num_row_dims` dimensions, and the matrix's first dimension(column "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

second dimension(row length)
matrix's first dimension(column length)

matrix's first dimension是 dims[0]? second dimension是dims[1]吗? 如果是,matrix's first dimension表示的row length(也就是height), second dimension表示的是col length(也就是width)。

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我想表达的row length的意思是“行的长度”,所以似乎应该是width?

"`num_row_dims` dimensions, and the matrix's first dimension(column "
"length) will be the product of tensor's first `rank - num_row_dims` "
"dimensions.")
.SetDefault(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

依据上面的描述,和最常用的情况不符合,最常用的是reshape成:height = dims[0], width = product(dims[1:])

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已经修改,把参数从num_raw_dims改成了num_col_dims,表示乘起来的前面维度的数目

@@ -2,13 +2,13 @@

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
you may obtain a copy of the License at
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you -> You

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
WITHOUT WARRANTIES OR CONDITIONS OF ANy KIND, either express or implied.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ANy -> ANY

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

z->mutable_data<T>(context.GetPlace());
const Tensor* x = context.Input<Tensor>("X");
const Tensor* y = context.Input<Tensor>("Y");
Tensor* Z = context.Output<Tensor>("Out");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Z -> z

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

DDim flatten_to_2d(const DDim& src, int num_col_dims) {
int rank = src.size();
return make_ddim(
{static_cast<int>(product(slice_ddim(src, 0, num_col_dims))),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it seems that another PR changes int -> int64_t for ddim.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to change here. Thanks!


template <typename T, int MajorType = Eigen::RowMajor,
typename IndexType = Eigen::DenseIndex>
struct EigenVector : public EigenTensor<T, 1, MajorType, IndexType> {
// Flatten reshapes a Tensor into an EigenVector.
static typename EigenVector::Type Flatten(Tensor& tensor) {
return EigenVector::From(
tensor, make_ddim({static_cast<int>(product(tensor.dims_))}));
return EigenVector::From(tensor, {static_cast<int>(product(tensor.dims_))});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int -> int64

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t or ssize_t is better, please do not mix use of int64 and size_t

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

product() retuens int64_t now, so static_cast can be removed.

}

static typename EigenVector::ConstType Flatten(const Tensor& tensor) {
return EigenVector::From(
tensor, make_ddim({static_cast<int>(product(tensor.dims_))}));
return EigenVector::From(tensor, {static_cast<int>(product(tensor.dims_))});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int -> int64

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

DDim flatten_to_1d(const DDim& src) {
return make_ddim({static_cast<int>(product(src))});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int -> int64

TEST(Eigen, MatrixReshape) {
Tensor t;
float* p =
t.mutable_data<float>(make_ddim({2, 3, 6, 4}), platform::CPUPlace());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make_ddim is not needed, just t.mutable_data<float>({2, 3, 6, 4}) is cool.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@@ -148,5 +148,13 @@ inline Tensor& Tensor::Resize(const DDim& dims) {

inline const DDim& Tensor::dims() const { return dims_; }

template <typename T>
inline Tensor ReshapeToMatrix(const Tensor& src, int num_col_dims) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline is not needed in class method. It will be compiler's choice whether inline or not.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a class method. It's a global function.

using namespace paddle::framework;
using namespace paddle::platform;
Tensor src;
int* src_ptr = src.mutable_data<int>(make_ddim({2, 3, 4, 9}), CPUPlace());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make_ddim is not needed.

ctx.op().Input("Y"));
auto x_dims = ctx.Input<Tensor>("X")->dims();
auto y_dims = ctx.Input<Tensor>("Y")->dims();
int x_num_col_dims = GetAttr<int>("x_num_col_dims");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry... I just merge a PR, change GetAttr to Attr. Since all method in Op is Input, Output. Not GetInput or GetOutput.

@@ -47,6 +56,23 @@ class MulOpMaker : public framework::OpProtoAndCheckerMaker {
AddInput("X", "The first input of mul op");
AddInput("Y", "The second input of mul op");
AddOutput("Out", "The output of mul op");
AddAttr<int>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a very useful syntax in C++ 11.

AddAttr<int>("x_num_col_dims", R"DOC(mul_op can take ...
....
)DOC");

R"LABEL(...)LABEL" is just like python's """...""". which LABEL is a custom label to identify where the string begins and ends.

See http://en.cppreference.com/w/cpp/language/string_literal

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, Thank you!

"Out@GRAD M X N must equal to Y dims 1, N ");

auto x_mat_dims =
framework::flatten_to_2d(x_dims, GetAttr<int>("x_num_col_dims"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetAttr->Attr

auto x_mat_dims =
framework::flatten_to_2d(x_dims, GetAttr<int>("x_num_col_dims"));
auto y_mat_dims =
framework::flatten_to_2d(y_dims, GetAttr<int>("y_num_col_dims"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetAttr->Attr

x_dims.size(), b_dims.size(),
"The rank of input `X` must be larger than the one of input `b`.");

int num_col_dims = x_dims.size() - b_dims.size();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting implementation here. So the rowwise_add's num_col_dims is decided by the rank difference.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It makes sure that b is flattened to a vector.

reyoung
reyoung previously approved these changes Sep 6, 2017
Copy link
Collaborator

@reyoung reyoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent Job!

But sorry for I just merging A PR change GetAttr -> Attr. So please merge develop branch before merge.

};

template <typename T>
class EqualLargerThanChecker {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is better to compatible with gtest. Such as CHECK_GE or something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EqualLargerThan is a function, not a macro, so the name shall not be too short.

public:
explicit EqualLargerThanChecker(T lower_bound) : lower_bound_(lower_bound) {}
void operator()(T& value) const {
PADDLE_ENFORCE(value >= lower_bound_, "equal_larger_than check fails.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PADDLE_ENFORCE_GE(xxx, xxx, "comment")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!


template <typename T, int MajorType = Eigen::RowMajor,
typename IndexType = Eigen::DenseIndex>
struct EigenVector : public EigenTensor<T, 1, MajorType, IndexType> {
// Flatten reshapes a Tensor into an EigenVector.
static typename EigenVector::Type Flatten(Tensor& tensor) {
return EigenVector::From(
tensor, make_ddim({static_cast<int>(product(tensor.dims_))}));
return EigenVector::From(tensor, {static_cast<int>(product(tensor.dims_))});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t or ssize_t is better, please do not mix use of int64 and size_t

max_relative_error=0.5,
no_grad_set={"Y"})


# TODO(dzh,qijun) : mulgrad test case need transpose feature of blas library
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line of comment can be removed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@JiayiFeng JiayiFeng requested a review from wangkuiyi September 7, 2017 05:04
qingqing01
qingqing01 previously approved these changes Sep 7, 2017
Copy link
Collaborator

@reyoung reyoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent

@JiayiFeng JiayiFeng merged commit 544458e into PaddlePaddle:develop Sep 8, 2017
@JiayiFeng JiayiFeng deleted the dev_add_axis branch September 8, 2017 04:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Need attribute axis for rowwise_add, fc operator, etc.
5 participants