Row convolution operation. #2373
Conversation
Please document row_conv in layers.rst. Also, could you add the documentation for layer.multiplex as well? (#2308)
Also, when implementing the compute kernels, please use templates rather than `real` wherever possible, so that supporting float16 or other types later will be easier.
paddle/function/RowConvOp.cpp
Outdated
CHECK_EQ(in.shape().ndims(), 2UL);
CHECK_EQ(out.shape().ndims(), 2UL);
CHECK_EQ(in.shape()[1], out.shape()[1]);
CHECK_EQ(in.shape()[0], out.shape()[0]);
The three lines 147-149 can be replaced with CHECK(in.shape() == out.shape());
Done.
paddle/function/RowConvOp.cpp
Outdated
CHECK_EQ(in.shape().ndims(), 2UL);
CHECK_EQ(outGrad.shape().ndims(), 2UL);
CHECK_EQ(in.shape()[1], outGrad.shape()[1]);
CHECK_EQ(in.shape()[0], outGrad.shape()[0]);
CHECK(in.shape() == outGrad.shape());
CHECK(in.shape() == inGrad.shape());
Done.
MatrixPtr wGrad = weight_->getWGrad();
size_t h = getInputValue(0)->getHeight();
size_t w = getInputValue(0)->getWidth();
outputs.addArg(
It would be better to split the inputGrad and weightGrad computations into two separate Functions. That way, you avoid creating an empty argument.
if (inGrad) {
backwardInput(...);
}
if (wGrad) {
backwardWeight(...);
}
Added a TODO; this will be split into two functions in a follow-up.
resetOutput(height, width);

const auto startPos = getInput(0).sequenceStartPositions->getVector(useGpu_);
wDims_ = TensorShape({contexLength_, width});
wDims_ is the shape of weight_, right? Is contexLength_ != weight_.height_ here?
Here contexLength_ == weight_.height_.
OK, then it would be more readable to build wDims from the weight attributes: TensorShape({weight.height_, weight.width_}).
Done.
__shared__ real sw[BLOCK_H][BLOCK_W];

for (int i = tidy; i < context; i += blky) {
If context > 32, could adding an outer for loop here remove the need for KeRowConv2?
In the paper the context is 19. Since cases where this value exceeds 32 should be rare, I split it into two kernels so the reads and writes in each kernel stay relatively simple.
}

template <>
void RowConvGrad<DEVICE_TYPE_CPU>(const CpuMatrix& outG,
Implementing filterG and inG in one kernel here does not improve performance; it would be better to write them as two separate kernel functions.
Added a TODO; will fix in a follow-up PR.
paddle/function/RowConvOp.cpp
Outdated
// check
CHECK_EQ(2UL, inputs.size());
CHECK_EQ(1UL, outputs.size());
CHECK_EQ(outputs[0].getArgType(), ADD_TO);
The forward pass could also implement ASSIGN_TO, which would speed up inference.
Added two TODOs; will address them in a follow-up PR.
@qingqing01 There may be an issue here: in DS2, the row convolution needs to be placed after the RNNs, i.e. its input layer outputs a sequence, not a dense vector. Let's discuss whether we need to develop a sequence-typed 2-D conv (the existing sequence-typed conv is 1-D and has no stride); otherwise we would need repeated conversions between sequence and dense_vector, which could hurt performance. Also, is the lookahead conv kernel not supported in this implementation?
The input and output of the row convolution implemented here are both sequence type; see
Since the row conv input already supports the sequence type, this shouldn't be necessary.
I don't quite understand what this sentence refers to.
You mean the filter? /~https://github.com/PaddlePaddle/Paddle/pull/2373/files#diff-7017557ab2e2d717f6816645d86eee5cR30 A learnable filter (weight) is supported here, and the user can configure the conv kernel size; it's just that the configured context_len equals
Got it. Thanks @qingqing01.
Fix #2228
Add a row convolution function.