[PaddlePaddle Hackathon 4 No.8] Add the xlogy API to Paddle #365

Closed
wants to merge 5 commits into from

Contributor

@Li-fAngyU Li-fAngyU commented Feb 21, 2023

Add paddle.xlogy rfc file.

PR link: PaddlePaddle/Paddle#50688

@paddle-bot

paddle-bot bot commented Feb 21, 2023

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.

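Paddle does not yet have xlogy. Using input * log(other) directly does not satisfy the piecewise definition of the function; extra handling is needed for cases such as float('inf') and float('nan').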
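To make the gap concrete, here is a minimal scalar sketch of why the plain product falls short. It assumes IEEE 754 conventions for the logarithm (log(0) = -inf, log of a negative number = NaN). `ieee_log` and `naive_xlogy` are hypothetical helper names used only for illustration; they are not Paddle APIs.

```python
import math


def ieee_log(y):
    """Real log with IEEE-style results (assumed) instead of math.log's ValueError."""
    if math.isnan(y) or y < 0:
        return math.nan
    if y == 0:
        return -math.inf
    return math.log(y)  # math.log(inf) is inf


def naive_xlogy(x, y):
    """The plain product x * log(y), with no special-casing of x == 0."""
    return x * ieee_log(y)


# At x == 0 the product becomes 0 * (-inf), 0 * inf, or 0 * nan, all of which
# are nan, whereas the piecewise definition expects 0 for any non-NaN y.
for y in (0.0, math.inf, -1.0):
    print(y, naive_xlogy(0.0, y))
```

For positive finite `y` the plain product already gives 0 at `x == 0`, so only these edge cases need extra handling.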


# 3. Survey of Industry Solutions

Collaborator

Please add how tf.math.xlogy handles this, and include it in the comparative analysis that follows.

Contributor Author

OK.

}
```

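As shown, the kernel simply uses if statements to check whether y is NaN and whether x equals 0. Note that the check for y being NaN takes priority over the check for x being 0. The CUDA kernel and the CPU kernel are similar.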
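The branch order described above can be sketched for a single scalar pair as follows. This is an illustrative Python rendering under assumed IEEE 754 log semantics, not the actual torch kernel; `xlogy_ref` and `ieee_log` are hypothetical names.

```python
import math


def ieee_log(y):
    """Real log with IEEE-style results (assumed) instead of math.log's ValueError."""
    if math.isnan(y) or y < 0:
        return math.nan
    if y == 0:
        return -math.inf
    return math.log(y)


def xlogy_ref(x, y):
    # 1. A NaN in y wins, even when x == 0.
    if math.isnan(y):
        return math.nan
    # 2. Only then does x == 0 force the result to 0.
    if x == 0:
        return 0.0
    # 3. Otherwise fall back to the ordinary product.
    return x * ieee_log(y)


print(xlogy_ref(0.0, math.nan), xlogy_ref(0.0, 0.0), xlogy_ref(0.0, -1.0))
```

Swapping the first two checks would make `xlogy_ref(0.0, math.nan)` return 0 instead of NaN, which is why the order matters.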
Collaborator

Typo in the Chinese text "需要注意的时": 时 should be 是.

return x * zlog(y)
```

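As shown, this also implements the piecewise logic directly with if statements. Unlike torch, however, scipy has no separate check-and-return for NaN values. This is understandable, because log(NaN) is also NaN, and any arithmetic operation involving NaN yields NaN.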
Collaborator

Punctuation suggestion for the Chinese text: 这是可以理解的,因为 (add a comma before 因为).

Note: `x == 0 and other != NaN` can be equivalently replaced with `x == 0 and log(other) != NaN`.

This gives the following chain of equivalences:
`x == 0 and other != NaN` <==> `x == 0 and log(other) != NaN` <==> `x == 0 and (other <= 0 or other == inf)`
Collaborator

Is the substitution `x == 0 and log(other) != NaN` <==> `x == 0 and (other <= 0 or other == inf)` correct?

Is `log(other) != NaN` really equivalent to `other <= 0 or other == inf`?

Contributor Author

The substitution is not quite right. I will rewrite this passage in the next commit.
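One quick way to check the disputed equivalence is to tabulate both sides over a few special values. The sketch below assumes IEEE 754 log semantics (log(0) = -inf, log of a negative number = NaN); all helper names are hypothetical.

```python
import math


def ieee_log(y):
    """Real log with IEEE-style results (assumed) instead of math.log's ValueError."""
    if math.isnan(y) or y < 0:
        return math.nan
    if y == 0:
        return -math.inf
    return math.log(y)


def log_is_not_nan(y):
    """Left-hand side: log(other) != NaN."""
    return not math.isnan(ieee_log(y))


def claimed_condition(y):
    """Right-hand side from the RFC draft: other <= 0 or other == inf."""
    return y <= 0 or y == math.inf


samples = [-math.inf, -1.0, 0.0, 0.5, 1.0, math.inf, math.nan]
for y in samples:
    print(f"{y!r:>5}  log(y) != NaN: {log_is_not_nan(y)!s:<5}  claimed: {claimed_condition(y)}")
```

The two columns disagree for negative values and for positive finite values: under these assumptions `log(other) != NaN` holds exactly when `other >= 0`. The mask built from the right-hand side is a different thing: combined with `x == 0`, it selects the non-NaN inputs for which `0 * log(other)` would not already be 0.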

check_variable_and_dtype(x, 'x', ['float32', 'float64'], 'xlogy')
mask = (x == 0) & ((other <= 0) | (other == float('inf')))
other = paddle.where(mask, paddle.ones(other.shape, other.dtype), other)
return x * paddle.log(other)
Collaborator

If x = 0 and other = NaN, line 94 returns 0, but shouldn't it be NaN?

Contributor Author

@Li-fAngyU Li-fAngyU Feb 21, 2023

About this: when I was considering whether input x should support int types, I ran a small test. In Paddle, when a is an int tensor, a*NaN is 0; when a is a float tensor, a*NaN is NaN. (In theory NaN combined with any number should be NaN, I think. I am not entirely sure, but a quick test in torch suggests so.)

Paddle test code:

import paddle
print('paddle version:', paddle.__version__)
a_int = paddle.to_tensor([0], dtype=paddle.int32)
a_float = paddle.to_tensor([0.], dtype=paddle.float32)
a_nan = paddle.to_tensor([float('nan')], dtype=paddle.float32)
print(a_float * a_nan)
print(a_int * a_nan)
print(a_float * paddle.log(a_nan))
print(a_int * paddle.log(a_nan))
# paddle version: 0.0.0
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
#        [nan])
# Tensor(shape=[1], dtype=int32, place=Place(cpu), stop_gradient=True,
#        [0])
# Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
#        [nan])
# Tensor(shape=[1], dtype=int32, place=Place(cpu), stop_gradient=True,
#        [0])

Torch test code:

import torch
print('torch version:',torch.__version__)
a_int = torch.Tensor([0]).int()
a_float = torch.Tensor([0.]).float()
a_nan = torch.Tensor([float('nan')])
print('a_int dtype:', a_int.dtype)
print('a_float dtype:', a_float.dtype)
print(a_float*a_nan)
print(a_int*a_nan)
print(a_float*torch.log(a_nan))
print(a_int*torch.log(a_nan))
# torch version: 1.13.1+cu116
# a_int dtype: torch.int32
# a_float dtype: torch.float32
# tensor([nan])
# tensor([nan])
# tensor([nan])
# tensor([nan])

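Because a*NaN gives inconsistent results depending on the dtype of a, I restricted input x to float types only (float32 and float64, as in the dtype check).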

Collaborator

"but a quick test in torch suggests so"

Could you post your torch test code and its output?

Contributor Author

OK, I have posted them.

Paddle: xlogy API code:
```python
def xlogy(x, other, name=None):
    check_variable_and_dtype(x, 'x', ['float32', 'float64'], 'xlogy')
```
Collaborator

x should also be allowed to be a non-float type, and y needs a dtype check as well.

Contributor Author

The dtype question for x is explained above. I will add a dtype check for y in the next commit.

Collaborator

Looking at other frameworks, all of them support int types for x/y.

Contributor Author

But paddle.log(x) supports only float32 and float64.

paddle.log

def xlogy(x, other, name=None):
    check_variable_and_dtype(x, 'x', ['float32', 'float64'], 'xlogy')
    mask = (x == 0) & ((other <= 0) | (other == float('inf')))
    other = paddle.where(mask, paddle.ones(other.shape, other.dtype), other)
Collaborator

@luotao1 luotao1 Feb 21, 2023

Wouldn't the logic be clearer and more intuitive with more than one where?

Contributor Author

@Li-fAngyU Li-fAngyU Feb 21, 2023

I thought about it, and it seems that multiple where calls cannot select the condition (x == 0) & ((other <= 0) | (other == float('inf'))).

Collaborator

For example, use one where to handle the other = NaN case first, then another where to handle the x = 0 case?
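The two-step idea suggested here can be sketched with plain Python lists. This is only an illustration under assumed IEEE 754 log semantics: `where` is a hypothetical list-based stand-in for an elementwise select such as `paddle.where`, and `xlogy_two_where` is not the code from the RFC.

```python
import math


def ieee_log(y):
    """Real log with IEEE-style results (assumed) instead of math.log's ValueError."""
    if math.isnan(y) or y < 0:
        return math.nan
    if y == 0:
        return -math.inf
    return math.log(y)


def where(cond, a, b):
    """Elementwise select: take a[i] where cond[i] is true, else b[i]."""
    return [ai if c else bi for c, ai, bi in zip(cond, a, b)]


def xlogy_two_where(x, other):
    n = len(x)
    product = [xi * ieee_log(oi) for xi, oi in zip(x, other)]
    # Inner select: x == 0 forces the result to 0.
    zeroed = where([xi == 0 for xi in x], [0.0] * n, product)
    # Outer select: a NaN in `other` takes priority over x == 0.
    return where([math.isnan(oi) for oi in other], [math.nan] * n, zeroed)


print(xlogy_two_where([0.0, 0.0, 0.0, 2.0], [math.nan, 0.0, -1.0, 1.0]))
```

Because each select handles exactly one condition, the priority of the NaN case is visible in the nesting order rather than hidden inside a combined mask.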

@Li-fAngyU
Contributor Author

Got it, I will fix this.

@Li-fAngyU Li-fAngyU requested a review from luotao1 February 21, 2023 11:51
Collaborator

@luotao1 luotao1 left a comment

Let me check how the following two discrepancies should be handled:

  1. paddle.log(x) supports only float32 and float64, whereas torch.log(x) accepts int: int input, float output.
  2. In Paddle, when a is an int tensor, a*NaN is 0; when a is a float tensor, a*NaN is NaN. (In theory NaN combined with any number should be NaN, I think. I am not entirely sure, but a quick test in torch suggests so.) For the test code, see [PaddlePaddle Hackathon 4 No.8] Add the xlogy API to Paddle #365 (comment)

@luotao1
Collaborator

luotao1 commented Feb 22, 2023

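Conclusion from the discussion with @jeff41404:

Thank you very much for the feedback:

  • int * nan should equal nan. This is a bug and needs to be fixed.
  • Supporting int in the log family is a feature. The logic is not hard, but the implementation must take care with the backward pass (PyTorch is worth a look). In theory, int data is not differentiable and has no gradient.

Please handle both of these issues as part of this task. If the workload turns out to be too large, contact us at any time.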
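For reference, the expected result in the first bullet matches ordinary IEEE 754 arithmetic once the int operand is promoted to float. The plain-Python sketch below involves no Paddle code and says nothing about where the bug lives; `promote_then_multiply` is a hypothetical name.

```python
import math


def promote_then_multiply(a_int, b_float):
    """Promote the int operand to float first, then multiply."""
    return float(a_int) * b_float


print(promote_then_multiply(0, math.nan))  # nan
```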

@Li-fAngyU
Contributor Author

Li-fAngyU commented Feb 22, 2023

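  • I am not sure where to start fixing the bug where int*nan equals 0.
  • I have the following question about log supporting int:

torch does not allow requires_grad=True on an int tensor, so gradients cannot be computed for int variables in the backward pass.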

example code:

import torch
print('torch version:', torch.__version__)
a = torch.Tensor([2]).int()
a.requires_grad = True
y = torch.log(a)
# torch version: 1.13.1+cu116
#     3 a = torch.Tensor([2]).int()
# ----> 4 a.requires_grad = True
#     5 y = torch.log(a)
# 
# RuntimeError: only Tensors of floating point and complex dtype can require gradients

Paddle, on the other hand, appears to support backward gradients for int variables.

example code:

import paddle
a = paddle.to_tensor([2], 'int32', stop_gradient=False)
y = 3*a
y.backward()
print(a.grad.numpy())
# [3]

So, when adding int support to log, should Paddle's ability to compute backward gradients for int variables be preserved?

@luotao1
Collaborator

luotao1 commented Feb 22, 2023

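So, when adding int support to log, should Paddle's ability to compute backward gradients for int variables be preserved?

After discussing with @jeff41404: this is also a bug and needs to be fixed as well.

I am not sure where to start fixing the bug where int*nan equals 0.

You can open three issues (int*nan, int types should not have backward gradients, and the log family needs to support int types) to get more help.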

@Li-fAngyU
Contributor Author

Does development of the xlogy API need to wait until the bugs above are fixed?

@luotao1
Collaborator

luotao1 commented Mar 1, 2023

Does development of the xlogy API need to wait until the bugs above are fixed?

Yes. Let's see whether we can team up to fix the bugs first.

@Li-fAngyU Li-fAngyU changed the title from "add xlogy rfc file" to "[PaddlePaddle Hackathon 4 No.8] Add the xlogy API to Paddle" Mar 3, 2023
@cloud2009 cloud2009 closed this Sep 11, 2023
4 participants