[PaddlePaddle Hackathon 4 No.8] Add xlogy API to Paddle #365

Li-fAngyU · 2023-02-21T08:12:12Z

Add paddle.xlogy rfc file.

paddle-bot · 2023-02-21T08:12:17Z

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.

luotao1 · 2023-02-21T08:55:19Z

rfcs/APIs/20230221_api_design_for_xlogy.md

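+Paddle does not yet provide xlogy. Using input * log(other) directly does not satisfy the piecewise definition; extra handling is needed for cases such as float('inf') and float('nan').
+
+
+# 3. Survey of Industry Solutions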


Please also cover tf.math.xlogy here, and include it in the comparative analysis that follows.

luotao1 · 2023-02-21T08:57:29Z

rfcs/APIs/20230221_api_design_for_xlogy.md

+}
+```
+
+As shown, it simply uses if statements to check whether y is NaN and whether x equals 0. Note that the NaN check on y takes priority over the x == 0 check. The CUDA kernel and the CPU kernel are similar.


Typo in 「需要注意的时」 ("Note that"): 时 should be 是.
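The branch order described in the RFC text above can be sketched as a scalar function. This is a minimal pure-Python illustration, not the torch kernel itself; the name `xlogy_scalar` is hypothetical, and the explicit branches for `y < 0` and `y == 0` are there only because `math.log` raises an exception where floating-point kernels return NaN or -inf.

```python
import math


def xlogy_scalar(x: float, y: float) -> float:
    """Scalar sketch of xlogy with the NaN check ahead of the x == 0 check."""
    # NaN in y is checked first, so it wins even when x == 0.
    if math.isnan(y):
        return math.nan
    # x == 0 gives 0 regardless of log(y) being -inf, inf, or NaN.
    if x == 0:
        return 0.0
    # math.log raises on y <= 0, so mimic floating-point results by hand.
    if y < 0:
        return math.nan
    if y == 0:
        return x * -math.inf
    return x * math.log(y)


for x, y in [(0.0, 0.0), (0.0, -1.0), (0.0, math.nan), (2.0, 1.0), (1.0, 0.0)]:
    print(x, y, xlogy_scalar(x, y))
```

Swapping the first two branches would make `xlogy_scalar(0.0, math.nan)` return 0 instead of NaN, which is why the order matters.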

luotao1 · 2023-02-21T08:57:39Z

rfcs/APIs/20230221_api_design_for_xlogy.md

+        return x * zlog(y)
+```
+
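+As shown, this also uses if statements directly to implement the piecewise logic. Unlike torch, however, scipy does not check for NaN separately and return early. This is understandable because log(NaN) is also NaN, and any arithmetic operation involving NaN still yields NaN.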


Add a comma: 「这是可以理解的，因为」 ("This is understandable, because").

luotao1 · 2023-02-21T09:12:59Z

rfcs/APIs/20230221_api_design_for_xlogy.md

+Note: x == 0 and other != NaN (can be equivalently replaced by) x == 0 and log(other) != NaN.
+
+This gives the following chain of equivalent substitutions:
+`x == 0 and other != NaN` <==> `x == 0 and log(other) != NaN` <==> `x == 0 and (other <= 0 or other == inf)`


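Is the substitution x == 0 and log(other) != NaN <==> x == 0 and (other <= 0 or other == inf) correct?

Is log(other) != NaN really equivalent to other <= 0 or other == inf?

That substitution is not quite right. I will rewrite this passage in the next commit.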
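A quick check makes the mismatch concrete. The sketch below is pure Python rather than Paddle; `ieee_log` is a hypothetical helper that returns NaN or -inf where `math.log` would raise, to mimic floating-point kernel behavior. It compares "log(other) is not NaN" with the proposed condition for a few sample values.

```python
import math


def ieee_log(y: float) -> float:
    """Natural log that returns NaN / -inf instead of raising."""
    if math.isnan(y) or y < 0:
        return math.nan    # log of NaN or of a negative number is NaN
    if y == 0:
        return -math.inf   # log(0) is -inf
    return math.log(y)     # math.log(inf) is inf


def log_is_not_nan(y: float) -> bool:
    return not math.isnan(ieee_log(y))


def proposed_condition(y: float) -> bool:
    # Right-hand side of the substitution questioned in this thread.
    return y <= 0 or y == math.inf


for y in (-1.0, 0.0, 2.0, math.inf, math.nan):
    print(y, log_is_not_nan(y), proposed_condition(y))
```

The two predicates disagree at `other = -1.0` (log is NaN, yet `other <= 0` holds) and at `other = 2.0` (log is finite, yet the proposed condition is false), so they are not equivalent.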

luotao1 · 2023-02-21T09:30:50Z

rfcs/APIs/20230221_api_design_for_xlogy.md

+    check_variable_and_dtype(x, 'x', ['float32', 'float64'], 'xlogy')
+    mask = (x == 0) & ((other <= 0) | (other == float('inf')))
+    other = paddle.where(mask, paddle.ones(other.shape, other.dtype), other)
+    return x * paddle.log(other)


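If x = 0 and other = NaN, line 94 returns 0. Shouldn't it be NaN?

About this: when I was considering int support for input x, I ran a quick test. In Paddle, when a is an int type, a*NaN is 0; when a is a float type, a*NaN is NaN. (In theory, any operation involving NaN should yield NaN. I'm not entirely sure, but a quick test in torch suggests so.)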

Paddle test code:

import paddle
print('paddle version:', paddle.__version__)
a_int = paddle.to_tensor([0], dtype=paddle.int32)
a_float = paddle.to_tensor([0.], dtype=paddle.float32)
a_nan = paddle.to_tensor([float('nan')], dtype=paddle.float32)
print(a_float * a_nan)
print(a_int * a_nan)
print(a_float * paddle.log(a_nan))
print(a_int * paddle.log(a_nan))
# paddle version: 0.0.0
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True, [nan])
# Tensor(shape=[1], dtype=int32, place=Place(cpu), stop_gradient=True, [0])
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True, [nan])
# Tensor(shape=[1], dtype=int32, place=Place(cpu), stop_gradient=True, [0])

Torch test code:

import torch
print('torch version:', torch.__version__)
a_int = torch.Tensor([0]).int()
a_float = torch.Tensor([0.]).float()
a_nan = torch.Tensor([float('nan')])
print('a_int dtype:', a_int.dtype)
print('a_float dtype:', a_float.dtype)
print(a_float * a_nan)
print(a_int * a_nan)
print(a_float * torch.log(a_nan))
print(a_int * torch.log(a_nan))
# torch version: 1.13.1+cu116
# a_int dtype: torch.int32
# a_float dtype: torch.float32
# tensor([nan])
# tensor([nan])
# tensor([nan])
# tensor([nan])

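Because the result of a*NaN differs depending on the dtype of a, I restricted input x to float types only.

but a quick test in torch suggests so

Could you post your torch test code and its output?

OK, I have posted it above.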

luotao1 · 2023-02-21T09:35:49Z

rfcs/APIs/20230221_api_design_for_xlogy.md

+Paddle: xlogy API code:
+```python
+def xlogy(x, other, name=None):
+    check_variable_and_dtype(x, 'x', ['float32', 'float64'], 'xlogy')


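x should also be allowed to be a non-float type, and y needs a dtype check as well.

The dtype question for x is explained above. The next commit will add a dtype check for y.

Judging from other frameworks, all of them support int for x/y.

But paddle.log(x) only supports float32 and float64.

paddle.log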

luotao1 · 2023-02-21T09:37:51Z

rfcs/APIs/20230221_api_design_for_xlogy.md

+def xlogy(x, other, name=None):
+    check_variable_and_dtype(x, 'x', ['float32', 'float64'], 'xlogy')
+    mask = (x == 0) & ((other <= 0) | (other == float('inf')))
+    other = paddle.where(mask, paddle.ones(other.shape, other.dtype), other)


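Would using more than one where make the logic clearer and more intuitive?

I thought about it, and it seems that multiple where calls cannot select the condition (x == 0) & ((other <= 0) | (other == float('inf'))).

For example, use one where to handle other = NaN first, then another where to handle x = 0?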
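One possible reading of the two-where suggestion is sketched below on plain Python lists. The `where` function is a hypothetical list helper standing in for paddle.where, `ieee_log` mimics floating-point log, and the order of the two steps (x == 0 first, NaN override last) is an assumption chosen so that NaN in `other` keeps priority. It illustrates the semantics only and is not the RFC's implementation.

```python
import math


def ieee_log(y: float) -> float:
    """Natural log that returns NaN / -inf instead of raising."""
    if math.isnan(y) or y < 0:
        return math.nan
    if y == 0:
        return -math.inf
    return math.log(y)


def where(cond, a, b):
    """Element-wise select on lists: a[i] if cond[i] else b[i]."""
    return [ai if c else bi for c, ai, bi in zip(cond, a, b)]


def xlogy_two_where(x, other):
    n = len(x)
    # Raw product; may contain NaN from 0 * inf or from log of a negative.
    raw = [xi * ieee_log(yi) for xi, yi in zip(x, other)]
    # Step 1: wherever x == 0, force the result to 0.
    out = where([xi == 0 for xi in x], [0.0] * n, raw)
    # Step 2: wherever other is NaN, force NaN; applied last so it wins.
    out = where([math.isnan(yi) for yi in other], [math.nan] * n, out)
    return out


x = [0.0, 0.0, 0.0, 2.0, 0.0]
other = [0.0, -1.0, math.inf, math.e, math.nan]
print(xlogy_two_where(x, other))
```

With this ordering, no combined mask over `other <= 0` or `other == inf` is needed: the first step covers every x == 0 case, and the second restores NaN where `other` is NaN.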

Li-fAngyU · 2023-02-21T10:47:35Z

Got it, I will fix this.

luotao1

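Let me ask around about how to handle these two discrepancies:

paddle.log(x) only supports float32 and float64, whereas torch.log(x) supports int: int in, float out.
In Paddle, when a is an int type, a*NaN is 0; when a is a float type, a*NaN is NaN. (In theory, any operation involving NaN should yield NaN. I'm not entirely sure, but a quick test in torch suggests so.) For the test code, see [PaddlePaddle Hackathon 4 No.8] Add xlogy API to Paddle #365 (comment)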

luotao1 · 2023-02-22T03:29:28Z

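Conclusions from the discussion with @jeff41404:

Thank you very much for your feedback:

int * nan should equal nan. This is a bug and needs to be fixed.
Supporting int in the log family is a feature. The logic is not hard, but take care with the backward pass when implementing it (PyTorch is a useful reference). In theory, int data is not differentiable and has no gradient.

Please handle these 2 issues as part of this task. If the workload turns out to be too large, contact us at any time.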
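The expected result for int * nan follows from type promotion plus IEEE 754 NaN propagation: the int operand is promoted to float, and any product with NaN is NaN. The snippet below shows that rule in plain Python as a reference point; it says nothing about where the bug sits in Paddle's kernels.

```python
import math

nan = float('nan')
int_zero = 0        # integer operand
float_zero = 0.0    # floating-point operand

# Python promotes the int to float before multiplying, so both products
# follow IEEE 754 and propagate NaN.
int_product = int_zero * nan
float_product = float_zero * nan

print(type(int_product).__name__, math.isnan(int_product))
print(type(float_product).__name__, math.isnan(float_product))
```

Both lines report a float result that is NaN, which matches the torch output posted earlier in this thread.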

Li-fAngyU · 2023-02-22T03:44:33Z

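For the bug where int*nan equals 0, I am not sure where to start fixing it.
Regarding int support in log, I have the following question:

torch does not allow requires_grad=True on int tensors, so no backward gradient can be computed for int variables.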

example code：

import torch
print('torch version:', torch.__version__)
a = torch.Tensor([2]).int()
a.requires_grad = True
y = torch.log(a)
# torch version: 1.13.1+cu116
#     3 a = torch.Tensor([2]).int()
# ----> 4 a.requires_grad = True
#     5 y = torch.log(a)
# 
# RuntimeError: only Tensors of floating point and complex dtype can require gradients

Paddle, on the other hand, appears to support backward gradients for int variables.

example code:

import paddle
a = paddle.to_tensor([2], 'int32', stop_gradient=False)
y = 3*a
y.backward()
print(a.grad.numpy())
# [3]

So, when adding int support to log, should Paddle's ability to compute backward gradients for int variables be preserved?

luotao1 · 2023-02-22T06:54:52Z

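So, when adding int support to log, should Paddle's ability to compute backward gradients for int variables be preserved?

After discussing with @jeff41404: this is also a bug and needs to be fixed as well.

For the bug where int*nan equals 0, I am not sure where to start fixing it.

You can open 3 issues (int*nan, int types should not have backward gradients, the log family needs int support) to get more help.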

Li-fAngyU · 2023-03-01T04:24:59Z

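Does development of the xlogy API have to wait until the bugs above are fixed?

luotao1 · 2023-03-01T07:07:47Z

Does development of the xlogy API have to wait until the bugs above are fixed?

Yes. Let's see whether we can team up to fix the bugs first.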

add xlogy rfc file

3384304

paddle-bot bot added contributor status: proposed labels Feb 21, 2023

This was referenced Feb 21, 2023

[PaddlePaddle Hackathon 4] 8. Add xlogy API to Paddle PaddlePaddle/Paddle#50688

Closed

[PaddlePaddle Hackathon Season 4] Task Overview PaddlePaddle/Paddle#50629

Closed

luotao1 assigned luotao1 and cloud2009 Feb 21, 2023

Li-fAngyU added 2 commits February 21, 2023 16:54

Fix text errors

7a7c91a

Fix text errors

4aea09a

luotao1 reviewed Feb 21, 2023


fix upon problems.

2faa97d

Li-fAngyU requested a review from luotao1 February 21, 2023 11:51

luotao1 reviewed Feb 21, 2023


This was referenced Feb 22, 2023

Multiplying an int variable by NaN yields 0 PaddlePaddle/Paddle#50768

Closed

Int variables should not support backward gradients PaddlePaddle/Paddle#50770

Open

log-type APIs need to support Int types PaddlePaddle/Paddle#50772

Closed

Merge branch 'PaddlePaddle:master' into master

ddb8527

Li-fAngyU changed the title ~~add xlogy rfc file~~ [PaddlePaddle Hackathon 4 No.8] Add xlogy API to Paddle Mar 3, 2023

luotao1 assigned Ligoml Mar 6, 2023

Li-fAngyU mentioned this pull request Mar 6, 2023

[PaddlePaddle Hackathon 4 No.23] Add API vander PaddlePaddle/docs#5681

Merged

Ligoml mentioned this pull request Mar 7, 2023

[PaddlePaddle Hackathon Season 4] Task Overview PaddlePaddle/Paddle#51281

Closed

Ligoml removed the status: proposed label Aug 29, 2023

cloud2009 closed this Sep 11, 2023
