
【PaddlePaddle Hackathon 4】[103] Add tie_weights capability: submit RFC document #5098

Merged
merged 3 commits into PaddlePaddle:develop on Mar 10, 2023

Conversation

qiuwenbogdut
Contributor

PR types

Others

PR changes

Docs

Description

[103] Add tie_weights capability: submit RFC document

@CLAassistant

CLAassistant commented Mar 4, 2023

CLA assistant check
All committers have signed the CLA.

@paddle-bot

paddle-bot bot commented Mar 4, 2023

Thanks for your contribution!

@codecov

codecov bot commented Mar 4, 2023

Codecov Report

Merging #5098 (a52af67) into develop (9497c54) will increase coverage by 5.07%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           develop    #5098      +/-   ##
===========================================
+ Coverage    46.36%   51.44%   +5.07%     
===========================================
  Files          448      465      +17     
  Lines        64619    66479    +1860     
===========================================
+ Hits         29958    34197    +4239     
+ Misses       34661    32282    -2379     

see 177 files with indirect coverage changes


Collaborator

@sijunhe sijunhe left a comment

Thanks for your RFC. To make it easier to browse, please change the screenshots to Markdown code blocks and attach code links, for example:

def tie_weights(self):
    """
    Tie the weights between the input embeddings and the output embeddings.
    """
    if hasattr(self, "get_output_embeddings") and hasattr(self, "get_input_embeddings"):
        output_embeddings = self.get_output_embeddings()
        if output_embeddings is not None:
            self._tie_or_clone_weights(output_embeddings, self.get_input_embeddings())

Code link

@qiuwenbogdut
Contributor Author

@sijunhe Hi, I have resubmitted a new version of the RFC document.

Collaborator

@sijunhe sijunhe left a comment

The images can be deleted from this PR now.
@gongel please take a look and review.


(1) [Code link 1](/~https://github.com/qiuwenbogdut/PaddleNLP/blob/develop/examples/language_model/transformer-xl/mem_transformer.py#L811)

Collaborator

Actually, most tie_weights implementations within PaddleNLP are done directly at the model layer definition level (see the example), rather than implemented uniformly outside the model the way transformers does it. That said, the goal of this project is exactly to see whether it can be implemented uniformly outside the models, so that each model does not have to implement it on its own.
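
For illustration, layer-level tying of this kind might look roughly like the following sketch (hedged: the class name and dimensions are invented for illustration and are not taken from the linked example):

```python
import paddle

class TinyTiedLM(paddle.nn.Layer):
    """Illustrative sketch: the output projection reuses the embedding Parameter,
    so tying happens where the layers are defined rather than in a shared utility."""

    def __init__(self, vocab_size=100, hidden_size=16):
        super().__init__()
        self.embedding = paddle.nn.Embedding(vocab_size, hidden_size)
        # Tied at definition time: no separate output-projection parameter is created.
        self.decoder_weight = self.embedding.weight

    def forward(self, token_ids):
        hidden = self.embedding(token_ids)  # stand-in for the full transformer body
        # The embedding weight is [vocab_size, hidden_size]; transpose it for the vocab projection.
        return paddle.matmul(hidden, self.decoder_weight, transpose_y=True)
```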

Contributor Author

For a unified implementation outside the model like transformers, should I add the tie_weights code here in model_utils.py#L897 under the PaddleNLP directory?

Contributor Author

Looking into it, there are two kinds of tie_weights implementations in Paddle:

  1. One defines a tie_weights function in modeling.py, and the corresponding model also implements get_input_embeddings() and get_output_embeddings() to fetch the input and output embedding layers.
  2. The other directly assigns the input embedding's weight to the output layer's weight when the model layers are defined.

When we implement tie_weights in model_utils.py, both of the above cases are considered (a rough sketch follows after this list):

  1. Bind the weights of the input and output embedding layers.
  2. Fetch the input embedding layer and the output weight, and assign the input embedding layer's weight to the output layer's embedding weight.
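
A rough sketch of what the unified, transformers-style method could look like (hedged: the accessor names follow the convention quoted earlier in this thread and this is not the final PaddleNLP API; whether the direct assignment on the last line is sufficient in Paddle is exactly what the following comments discuss):

```python
def tie_weights(self):
    """Tie the input and output embedding weights when both accessors exist."""
    if hasattr(self, "get_output_embeddings") and hasattr(self, "get_input_embeddings"):
        output_embeddings = self.get_output_embeddings()
        input_embeddings = self.get_input_embeddings()
        if output_embeddings is not None and input_embeddings is not None:
            # Point both layers at the same Parameter object so that gradients
            # and optimizer updates are shared.
            output_embeddings.weight = input_embeddings.weight
```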

Member

@gongel gongel Mar 7, 2023

Directly copying the weight is not feasible in Paddle; an operation like the following cannot modify the output layer's weight:

output_embeddings.weight = input_embeddings.weight

Contributor Author

@gongel I wrote a small script locally to test this. After the operation
output_embeddings.weight = input_embeddings.weight

  • input_embedding.weight has the same id as output_embedding.weight.
  • Modifying the value of input_embedding's weight also changes the value in output_embedding.

The test code is as follows:

import numpy as np
from paddle.nn import Embedding

"""step1: define two different Embedding objects, AA and BB"""
print('------------step1')
AA = Embedding(1, 2)
BB = Embedding(1, 2)

AA.weight = BB.weight  # tie the weights

"""step2: check the result of the tying"""
print('------------step2')
print('Do AA and BB have the same id:', AA is BB, id(AA), id(BB))                               # AA and BB have different ids
print('Do AA.weight and BB.weight have the same id:', AA.weight is BB.weight, id(AA.weight), id(BB.weight))   # but AA.weight and BB.weight have the same id

print("AA.weight: ", AA.weight)
print("BB.weight: ", BB.weight)


"""step3: modify the value of AA's weight and see whether BB's weight changes as well"""
print('------------step3')
AA.weight.set_value(np.array([[4.0, 6.0]], dtype=np.float32))

print('Do the modified AA.weight and BB.weight still have the same id:', AA.weight is BB.weight, id(AA.weight), id(BB.weight))  # AA.weight and BB.weight still have the same id
print("AA.weight after modification: ", AA.weight)
print("BB.weight:", BB.weight)

Contributor Author


1. Get the model's input embedding weight object A.
2. Get the model's output embedding weight object B.
3. Make A and B both point to the same weight value.

Collaborator

For Paddle, step 3 will be the difficult point, which is why PaddleNLP previously had implementations like this one.

Contributor Author

Do you mean that the pretrained model may be followed by different heads, for example:

  • a classification head (a linear layer),
  • a language modeling head,
  • etc.

Some tasks may have no output embedding at all, for example an ERNIE-based classification task.

There are also cases where the pretrained model's input embedding is fed into the language modeling head to initialize the output embedding. Is this the implementation difficulty you mean?

Member

Maintaining a single object is feasible: pass the embedding's weight directly to the head to build the linear output layer. The expectation is to obtain the weight from get_input_embeddings() and then pass it to the head layer; be careful about the order of model instantiation and tie_weights.
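
A minimal sketch of this idea (hedged: the class and usage below are illustrative, not the final implementation): the head is built from the already-instantiated model's input embedding weight and keeps a shared reference to it.

```python
import paddle

class TiedLMHead(paddle.nn.Layer):
    """Illustrative head constructed from an existing embedding Parameter."""

    def __init__(self, embedding_weight):
        super().__init__()
        self.weight = embedding_weight  # shared reference, no copy is made

    def forward(self, hidden_states):
        # The embedding weight is [vocab_size, hidden_size]; transpose it for the projection.
        return paddle.matmul(hidden_states, self.weight, transpose_y=True)

# Order matters: instantiate the base model first, then fetch its input embeddings,
# and only then build (or tie) the output head.
# base_model = SomePretrainedModel(...)  # hypothetical model class
# lm_head = TiedLMHead(base_model.get_input_embeddings().weight)
```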

## API implementation plan

# VI. Testing and acceptance considerations
Reference: [New API testing and acceptance criteria](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/dev_guides/api_contributing_guides/api_accpetance_criteria_cn.html)
Collaborator

For this part, think about what unit tests need to be added to prove that tie_weights succeeded. You can look at the current tests under tests/transformers and consider how to add a generic test in /~https://github.com/PaddlePaddle/PaddleNLP/blob/develop/tests/transformers/test_modeling_common.py#L61 for models that implement tie_weights.

Contributor Author

The models found so far that come with their own tie_weights implementation are:

Should the unit tests first check whether the input_embedding and output_embedding of these existing models point to the same weight object?

Contributor Author

Two cases are considered when writing the tests:

  1. One where a tie_weights function is defined in modeling.py, and the corresponding model also implements get_input_embeddings() and get_output_embeddings() to fetch the input and output embedding layers.
  2. One where the input embedding layer's weight is directly assigned to the output layer's weight when the model layers are defined.

Member

As mentioned above, that approach is not feasible.

@gongel
Member

gongel commented Mar 7, 2023

@qiuwenbogdut Most of the existing tie_weights functions in PaddleNLP are implemented incorrectly. How do we verify it? (This will also be needed later when unit tests are added.) My suggestion is that there are two ways (a rough sketch of both checks follows after this list):

  1. Directly compare the ids of the output-layer weight and the input-layer weight; pass if they are identical, otherwise fail.
  2. Train for a few steps; after several backward passes, check whether the output-layer weight and the input-layer weight are still identical; pass if they are, otherwise fail.
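
A hedged sketch of what these two checks could look like as test helpers (assumptions: the model exposes get_input_embeddings() / get_output_embeddings(), and loss_fn / data_loader are placeholders for whatever the test fixture provides):

```python
import numpy as np
import paddle

def check_tied_by_id(model):
    """Approach 1: both layers must hold the very same Parameter object."""
    inp = model.get_input_embeddings()
    out = model.get_output_embeddings()
    return out is not None and inp.weight is out.weight

def check_tied_after_training(model, loss_fn, data_loader, steps=3):
    """Approach 2: after a few optimizer steps, the two weights must still match."""
    optimizer = paddle.optimizer.SGD(learning_rate=1e-3, parameters=model.parameters())
    for step, batch in enumerate(data_loader):
        loss = loss_fn(model, batch)  # placeholder loss computation
        loss.backward()
        optimizer.step()
        optimizer.clear_grad()
        if step + 1 >= steps:
            break
    inp = model.get_input_embeddings().weight.numpy()
    out = model.get_output_embeddings().weight.numpy()
    return np.array_equal(inp, out)
```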

@qiuwenbogdut qiuwenbogdut requested a review from gongel March 8, 2023 10:11
@gongel
Member

gongel commented Mar 8, 2023

Got it. Your demo does work as expected; looking forward to your implementation!

Collaborator

@sijunhe sijunhe left a comment

@qiuwenbogdut Hello. Please do not implement the code in this PR yet; first integrate the content we discussed into the RFC, and we will merge that. The code implementation can go into a new PR.

@qiuwenbogdut
Contributor Author

@sijunhe OK, tomorrow I will first integrate the discussed content into the RFC, and open a new PR when submitting the code.

@qiuwenbogdut
Contributor Author

@sijunhe @gongel The RFC document has been integrated and updated; please review it, thanks.

Member

@gongel gongel left a comment

LGTM. Pay attention to the transpose when assigning the weight.

Collaborator

@sijunhe sijunhe left a comment

LGTM. Looking forward to your implementation!

@sijunhe sijunhe merged commit 20f1edd into PaddlePaddle:develop Mar 10, 2023