add MobileNetV3 #38653
Thanks for your contribution!
Sorry to inform you that f747ccd's CI runs passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CI jobs manually.
Sorry to inform you that 3b22137's CI runs passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CI jobs manually.
        return paddle.multiply(x=identity, y=x)


class ConvBNLayer(nn.Layer):
Please check whether these classes can reuse existing ones.
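ConvBNLayer does look reusable, and as more vision models are added this duplication will only grow. Among the current models, similar code already appears in:
- inceptionv3.ConvBNLayer
- mobilenetv1.ConvBNLayer
- mobilenetv2.ConvBNLayer
- mobilenetv3.ConvBNLayer (the current model)
- resnext.ConvBNLayer
- shufflenetv2.ConvBNLayer
Could we follow torchvision and add an easily reusable generic module to paddle.vision.ops, something like torchvision's ConvNormActivation? There, Norm is not restricted to BN, apparently to stay compatible with other normalization modules such as LN (for ConvNeXt, for example).
However, the ConvBNLayer in each existing model names its internal parameters differently, so sharing one implementation may require updating the existing weights, and the overall engineering effort could be substantial.
If this plan is acceptable, I can create the reusable ConvNormActivation module in this PR and use it in mobilenetv3. In a follow-up PR I would then use it in the other 5 models and update their parameters step by step.
@LielinJiang Would that be OK?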
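To illustrate the weight-update concern above, here is a minimal sketch of remapping parameter names when a model switches to a shared block. It uses plain dictionaries in place of real state dicts, and every key name and rename rule below is hypothetical; the actual parameter names in the existing models are not shown in this thread.

```python
def rename_keys(state_dict, rename_rules):
    """Return a copy of state_dict with each (old, new) substring rule applied to its keys."""
    renamed = {}
    for key, value in state_dict.items():
        new_key = key
        for old, new in rename_rules:
            new_key = new_key.replace(old, new)
        renamed[new_key] = value
    return renamed


# Hypothetical keys from a model-specific ConvBNLayer (placeholders, not real names).
old_state = {
    "conv1._conv.weight": "conv-weights",
    "conv1._batch_norm.weight": "bn-scale",
}

# Hypothetical rules mapping to a sequential Conv-Norm-Activation layout.
rules = [("._conv.", ".0."), ("._batch_norm.", ".1.")]

new_state = rename_keys(old_state, rules)
print(sorted(new_state))
```

A conversion like this would have to be written and verified per model, which is why migrating the other models was proposed as a separate follow-up.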
Yes, that works~
        self.avg_pool = nn.AdaptiveAvgPool2D(1)
        self.conv1 = nn.Conv2D(
            in_channels=channels,
            out_channels=channels // reduction,
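While modifying the code and converting weights, I noticed that the current paddleclas implementation computes the squeeze channel count in the SE module by dividing by 4 directly, whereas both keras.applications and torchvision divide by 4 and then apply _make_divisible so that the result is divisible by 8. For example, for mobilenetv3_large at scale=1, the first convolution in the first SE module should take 72 input channels and output _make_divisible(72//4)=24, but the current paddleclas implementation outputs 72//4=18. I'm not sure whether this is an oversight, but it does differ from the keras and pytorch implementations.
In that case, should we give up on the pretrained weights from paddleclas (if the model structure is corrected, the SE shapes will not match at all) and instead use weights converted from the existing torchvision models? That said, torchvision currently only provides scale=1.0 weights for both small and large, far fewer than paddleclas.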
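The arithmetic above can be checked with a short sketch. The helper follows the rounding rule commonly used for `_make_divisible` (round to the nearest multiple of the divisor, then bump up if the result fell below 90% of the input); the name `make_divisible` and the variable names are illustrative and not taken from this PR.

```python
def make_divisible(value, divisor=8, min_value=None):
    """Round value to the nearest multiple of divisor without dropping below 90% of it."""
    if min_value is None:
        min_value = divisor
    rounded = max(min_value, int(value + divisor / 2) // divisor * divisor)
    # If rounding lost more than 10% of the original value, go up one step.
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded


expanded_channels = 72  # input to the first SE module of mobilenetv3_large at scale=1

plain_squeeze = expanded_channels // 4                     # direct division, as in paddleclas
aligned_squeeze = make_divisible(expanded_channels // 4)   # division followed by rounding

print(plain_squeeze, aligned_squeeze)  # 18 24
```

Because the two squeeze widths differ (18 versus 24), the SE convolution weights have different shapes, so weights trained with one variant cannot be loaded into the other.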
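This part was originally aligned with gluoncv, because at the time the torch implementation could not be trained to the accuracy reported in the paper. My suggestion is to merge the paddleclas version first, then document how it differs from the torchvision version.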
OK~
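I discussed this again with the colleague who did the original reproduction, and the clas version does differ from the versions in every other framework (including gluoncv). So let's use the torchvision version after all and convert the existing weights first. We will figure out the missing weights later.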
OK, I'll try the torchvision version then~
Great, thanks for taking this on~
LGTM
LGTM
"""
Configurable block used for Convolution-Normalization-Activation blocks.
This code is based on the torchvision code with modifications.
You can also see at https://github.com/pytorch/vision/blob/main/torchvision/ops/misc.py#L68
The documentation needs to be revised: it must not be duplicated, and the reference link should be removed~
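What does "must not be duplicated" refer to? Also, this API is not exposed publicly, so for now no documentation will be generated for it on the official site~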
PR types
New features
PR changes
APIs
Describe
Add the following models to paddle.vision.models
Tasks
Performance updated
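AI Studio test details: https://aistudio.baidu.com/studio/project/partial/verify/3294252/037497a8fe694d89954b1c6e9bf274dd
Baseline reference: https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/docs/en/ImageNet_models_en.md
Although pretrained models exist at many different scales, this PR follows paddle's mobilenet_v1 and mobilenet_v2 and uses a scale parameter to adjust the width instead of exposing a separate API for each scale. This also matches torchvision, which exposes only mobilenet_v3_small and mobilenet_v3_large. I'm not sure whether this is the right choice; if it should instead expose one API per pretrained model, as PaddleClas does, I can change it.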
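As a rough illustration of the two API styles discussed above, the sketch below derives layer widths from a single `scale` argument and shows how a per-scale entry point could be layered on top if needed. The channel list, the function names, and the use of divisible-by-8 rounding are assumptions made for this example, not the code in this PR.

```python
from functools import partial


def make_divisible(value, divisor=8, min_value=None):
    """Round value to the nearest multiple of divisor without dropping below 90% of it."""
    if min_value is None:
        min_value = divisor
    rounded = max(min_value, int(value + divisor / 2) // divisor * divisor)
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded


# Illustrative base widths only; the real configuration has more entries.
BASE_CHANNELS = [16, 24, 40, 80, 112, 160]


def scaled_channels(scale=1.0):
    """Single entry point: one function, width controlled by the scale argument."""
    return [make_divisible(c * scale) for c in BASE_CHANNELS]


# Hypothetical per-scale alias in the PaddleClas style, built from the same function.
scaled_channels_x0_5 = partial(scaled_channels, scale=0.5)

print(scaled_channels(1.0))    # [16, 24, 40, 80, 112, 160]
print(scaled_channels_x0_5())  # [8, 16, 24, 40, 56, 80]
```

With a single parameterized entry point, per-scale aliases remain a thin wrapper that can be added later without changing the model code.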