Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix the div 0 error of sequence_concat #49963

Merged
merged 8 commits into from
Feb 7, 2023

Conversation

Liyulingyue
Copy link
Contributor

PR types

Bug fixes

PR changes

APIs

Describe

@paddle-bot
Copy link

paddle-bot bot commented Jan 20, 2023

你的PR提交成功,感谢你对开源项目的贡献!
请关注后续CI自动化测试结果,详情请参考Paddle-CI手册
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot paddle-bot bot added contributor External developers status: proposed labels Jan 20, 2023
@Liyulingyue
Copy link
Contributor Author

Liyulingyue commented Jan 21, 2023

不知道为什么加入语句似乎没有触发执行,需要一些提示和帮助

@SigureMo
Copy link
Member

不知道为什么加入语句似乎没有触发执行,需要一些提示和帮助

2023-01-20 20:12:37 ======================================================================
2023-01-20 20:12:37 ERROR: test_errors (test_sequence_concat.TestSequenceConcatOpError)
2023-01-20 20:12:37 ----------------------------------------------------------------------
2023-01-20 20:12:37 Traceback (most recent call last):
2023-01-20 20:12:37   File "/workspace/Paddle/build/python/paddle/fluid/tests/unittests/sequence/test_sequence_concat.py", line 140, in test_errors
2023-01-20 20:12:37     self.assertRaises(ValueError, test_0_shape)
2023-01-20 20:12:37   File "/opt/_internal/cpython-3.7.0/lib/python3.7/unittest/case.py", line 743, in assertRaises
2023-01-20 20:12:37     return context.handle('assertRaises', args, kwargs)
2023-01-20 20:12:37   File "/opt/_internal/cpython-3.7.0/lib/python3.7/unittest/case.py", line 178, in handle
2023-01-20 20:12:37     callable_obj(*args, **kwargs)
2023-01-20 20:12:37   File "/workspace/Paddle/build/python/paddle/fluid/tests/unittests/sequence/test_sequence_concat.py", line 135, in test_0_shape
2023-01-20 20:12:37     x4_data = fluid.layers.data(name="x4", shape=[0], dtype='float32')
2023-01-20 20:12:37 AttributeError: module 'paddle.fluid.layers' has no attribute 'data'
2023-01-20 20:12:37 
2023-01-20 20:12:37 ----------------------------------------------------------------------

根据报错可以发现 paddle.fluid.layers.data 这个 API 已经不存在了,这是在清理 fluid 过程中 #49301 清理掉的一个 API,可以 merge 一下 develop 并参考该 PR 修改 paddle.fluid.layers.datapaddle.static.data,再看看是否还会有问题~

wanghuancoder
wanghuancoder previously approved these changes Feb 3, 2023
Copy link
Contributor

@wanghuancoder wanghuancoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

if input_x.dim() == 1 and 0 in input_x.shape:
raise ValueError(
'the shape of element in input list should not be 0.'
)
Copy link

@iclementine iclementine Feb 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change intended to prevent division by zero at SequenceConcatOp's InferShape?
image

If so, why only raise Exception when an input has shape [0]. In my opinion, when an input has a leading zero in its shape, like [0, 2, 3], an exception should be raised.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, I will try to solve it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change intended to prevent division by zero at SequenceConcatOp's InferShape? image

If so, why only raise Exception when an input has shape [0]. In my opinion, when an input has a leading zero in its shape, like [0, 2, 3], an exception should be raised.

Hi, in the follow case, an exception have been writen.

If I simplely require that 'the input size must not be 0', some collisions may occur.

So, could you tell me where should I modify or what should I do?

import paddle
paddle.enable_static()
x = paddle.static.data(name='x', shape=[1,1], dtype='float32', lod_level=1)
y = paddle.static.data(name='y', shape=[1,0], dtype='float32', lod_level=1)
out = paddle.static.nn.sequence_concat(input=[x, y])
ValueError: Variable 'y' has been created before. The previous shape is (1, 0), the new shape is (0, 1). They are not matched.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I simplely require that 'the input size must not be 0', some collisions may occur.

emmmm,就这个问题而言,因为是 InferShape 里出现了除第一个维度的需求,'the input size must not be 0' 的检查是不是不太合适?when an input has a leading zero in its shape 对此的检查就刚刚好

Hi, in the follow case, an exception have been writen.

ValueError: Variable 'y' has been created before. The previous shape is (1, 0), the new shape is (0, 1). They are not matched.

这个错误与本问题无关啊,只要你重新使用相同 name 定义了不同 shape 的 Variable 就会出现此错误,这里上文应该有一个定义过的 shape=(1, 0)y,删掉或者将这个重命名即可(Jupyter 直接 Restart kernel

So, could you tell me where should I modify or what should I do?

直接在 Python 端修改:

- if input_x.dim() == 1 and 0 in input_x.shape: # 仅仅要求 shape != [0]
+ if input_x.shape[0] == 0:                     # 要求 shape != [0, ...]

或者直接在 C++ 端增加一样的检查,比如这里 71 行后就很合适,个人觉得直接 C++ 端更好一些?

batch_size += x_dim[0];
if (feature_size == 0) {
feature_size = phi::product(x_dim) / x_dim[0];
} else {
PADDLE_ENFORCE_EQ(
feature_size,
phi::product(x_dim) / x_dim[0],
platform::errors::InvalidArgument(
"Each input of SequenceConcatOp inputs must have same feature "
"size, But "
"the feature size we received is %d, the feature size of 1st "
"input is %d",
feature_size,
phi::product(x_dim) / x_dim[0]));
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I simplely require that 'the input size must not be 0', some collisions may occur.

emmmm,就这个问题而言,因为是 InferShape 里出现了除第一个维度的需求,'the input size must not be 0' 的检查是不是不太合适?when an input has a leading zero in its shape 对此的检查就刚刚好

Hi, in the follow case, an exception have been writen.

ValueError: Variable 'y' has been created before. The previous shape is (1, 0), the new shape is (0, 1). They are not matched.

这个错误与本问题无关啊,只要你重新使用相同 name 定义了不同 shape 的 Variable 就会出现此错误,这里上文应该有一个定义过的 shape=(1, 0)y,删掉或者将这个重命名即可(Jupyter 直接 Restart kernel

So, could you tell me where should I modify or what should I do?

直接在 Python 端修改:

- if input_x.dim() == 1 and 0 in input_x.shape: # 仅仅要求 shape != [0]
+ if input_x.shape[0] == 0:                     # 要求 shape != [0, ...]

或者直接在 C++ 端增加一样的检查,比如这里 71 行后就很合适,个人觉得直接 C++ 端更好一些?

batch_size += x_dim[0];
if (feature_size == 0) {
feature_size = phi::product(x_dim) / x_dim[0];
} else {
PADDLE_ENFORCE_EQ(
feature_size,
phi::product(x_dim) / x_dim[0],
platform::errors::InvalidArgument(
"Each input of SequenceConcatOp inputs must have same feature "
"size, But "
"the feature size we received is %d, the feature size of 1st "
"input is %d",
feature_size,
phi::product(x_dim) / x_dim[0]));
}
}

更新一下报错信息~

ValueError: (InvalidArgument) Each input of SequenceConcatOp inputs must have same feature size, But the feature size we received is 1, the feature size of 1st input is 0
  [Hint: Expected feature_size == phi::product(x_dim) / x_dim[0], but received feature_size:1 != phi::product(x_dim) / x_dim[0]:0.] (at /paddle/paddle/fluid/operators/sequence_ops/sequence_concat_op.cc:84)
  [operator < sequence_concat > error]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我也是觉得在 c++ 端添加 x_dim[0] 不能为 0 的检查会更好一些。

P.S. 由于现在 paddle 对 zero-sized tensor 支持不够好,所以目前添加这样的检查(无论是在 c++ 端还是在 Python 端)我觉得都 ok.

不过其实为了计算 feature_size 大可以用 product(x_shape[1: ]) 的方式来计算而不必用 x_numel / x_shape[0] 的方式计算. (P.S. paddle 的 lod tensor (多层不定长序列)的第 0 维表示递归折叠后的长度,而后面的可以称为 feature_shape,表示序列中的基础元素的形状)

@luotao1 luotao1 closed this Feb 6, 2023
@luotao1 luotao1 reopened this Feb 6, 2023
@PaddlePaddle PaddlePaddle locked and limited conversation to collaborators Feb 6, 2023
@PaddlePaddle PaddlePaddle unlocked this conversation Feb 6, 2023
@Liyulingyue Liyulingyue requested review from iclementine, SigureMo and wanghuancoder and removed request for iclementine, SigureMo and wanghuancoder February 7, 2023 05:16
Copy link

@iclementine iclementine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@iclementine iclementine merged commit 84fe2de into PaddlePaddle:develop Feb 7, 2023
@Liyulingyue Liyulingyue deleted the div0_4 branch February 21, 2023 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contributor External developers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants