Optimize CINN cache key #37786

Merged: 11 commits merged on Dec 9, 2021

@thisjiang thisjiang commented Dec 2, 2021

PR types

Performance optimization

PR changes

OPs

Describe

Optimize cache key generation for cinn_launch_op compilation. Tested on a 32 GB V100 with batch size 32: after the optimization, the cost of constructing and looking up the cinn_cache_key drops from 5 ms to 1 ms, and throughput (IPS) improves from 241 images/sec to 245 images/sec, a gain of about 1.6%.

Before Optimizing:
image

After Optimizing:
image

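Optimization approach

In addition to the existing check that decides whether two graphs are the same by comparing their structure, add a check that decides by comparing graph addresses.

  • Add a CinnCacheKey base class that contains most of the functions a key needs; its constructor takes a graph_hash function used to obtain the graph's hash value.
  • Derive CinnCacheKeyByStructure from CinnCacheKey, with graph_hash returning the hash of the graph structure. This class is the original structure-based check.
  • Derive CinnCacheKeyByAddress from CinnCacheKey, with graph_hash returning the hash of the graph address. This class is the new address-based check.
  • Compile now performs a two-level check: the first level checks whether the graph address matches, the second whether the graph structure matches. The address check is much faster than the structure check, so this improves performance over the original single-level structure check.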
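The class layout described above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: it borrows the PR's class names for readability, but `Graph`, `ShapeMap`, `HashCombine`, the constructor signatures, and the member names are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real graph type: just a list of op names.
struct Graph {
  std::vector<std::string> ops;
};

using ShapeMap = std::map<std::string, std::vector<int>>;  // assumed layout

// Mixes one hash value into a running seed (boost-style hash_combine).
inline void HashCombine(std::size_t* seed, std::size_t value) {
  *seed ^= value + 0x9e3779b9 + (*seed << 6) + (*seed >> 2);
}

// Base key: stores input shapes, target arch, and a graph hash produced by a
// strategy function that each derived class passes to the constructor.
class CinnCacheKey {
 public:
  using GraphHashStrategy = std::function<std::size_t(const Graph&)>;

  CinnCacheKey(const Graph& graph, ShapeMap input_shapes, std::string arch,
               const GraphHashStrategy& graph_hash)
      : graph_hash_val_(graph_hash(graph)),
        input_shapes_(std::move(input_shapes)),
        arch_(std::move(arch)) {}

  bool operator==(const CinnCacheKey& other) const {
    return graph_hash_val_ == other.graph_hash_val_ &&
           input_shapes_ == other.input_shapes_ && arch_ == other.arch_;
  }

  // Hash functor usable as the hasher argument of std::unordered_map.
  struct Hash {
    std::size_t operator()(const CinnCacheKey& key) const {
      std::size_t seed = key.graph_hash_val_;
      for (const auto& name_and_shape : key.input_shapes_) {
        HashCombine(&seed, std::hash<std::string>()(name_and_shape.first));
        for (int dim : name_and_shape.second) {
          HashCombine(&seed, std::hash<int>()(dim));
        }
      }
      HashCombine(&seed, std::hash<std::string>()(key.arch_));
      return seed;
    }
  };

 private:
  std::size_t graph_hash_val_;
  ShapeMap input_shapes_;
  std::string arch_;
};

// Structure-based key: hashing walks the whole graph, so cost grows with it.
class CinnCacheKeyByStructure : public CinnCacheKey {
 public:
  CinnCacheKeyByStructure(const Graph& graph, ShapeMap shapes, std::string arch)
      : CinnCacheKey(graph, std::move(shapes), std::move(arch), HashGraph) {}

 private:
  static std::size_t HashGraph(const Graph& graph) {
    std::size_t seed = 0;
    for (const auto& op : graph.ops) {
      HashCombine(&seed, std::hash<std::string>()(op));
    }
    return seed;
  }
};

// Address-based key: the "hash" is simply the graph's address, read in
// constant time regardless of graph size.
class CinnCacheKeyByAddress : public CinnCacheKey {
 public:
  CinnCacheKeyByAddress(const Graph& graph, ShapeMap shapes, std::string arch)
      : CinnCacheKey(graph, std::move(shapes), std::move(arch), HashGraph) {}

 private:
  static std::size_t HashGraph(const Graph& graph) {
    return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(&graph));
  }
};
```

In this sketch, two distinct `Graph` objects with identical contents produce equal `CinnCacheKeyByStructure` keys but different `CinnCacheKeyByAddress` keys, which is the distinction the two cache levels rely on.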

paddle-bot-old bot commented Dec 2, 2021

Thanks for your contribution!
Please wait for the CI results first. See Paddle CI Manual for details.

@@ -29,55 +29,69 @@ namespace paddle {
namespace framework {
namespace paddle2cinn {

using GraphHashProto = CinnCacheKey::GraphHashProto;
Contributor

What is the Proto suffix supposed to mean here? At first I thought this was a protobuf class.

Contributor Author

proto stands for prototype; here it means a function prototype.

Contributor

@CtfGo CtfGo Dec 6, 2021

This looks like the Strategy pattern from design patterns. Consider using Strategy as the suffix instead, since Proto is easily confused with protobuf. It's not important though, so it's up to you.

Contributor Author

Done, renamed GraphHashProto to GraphHashStrategy.

Comment on lines -90 to +95
-  std::unordered_map<CinnCacheKey, std::unique_ptr<CinnCompiledObject>,
+  std::unordered_map<CinnCacheKeyByAddress, CinnCompiledObject*,
                      CinnCacheKey::Hash>
-      cache_;
+      cache_by_address_;
+  std::unordered_map<CinnCacheKeyByStructure,
+                     std::unique_ptr<CinnCompiledObject>, CinnCacheKey::Hash>
+      cache_by_struct_;
Contributor

Is the second-level cache useless now? Missing the first-level cache but hitting the second would require two different subgraphs whose structure, variable names, shapes, and arch are all identical, yet the variable names should never be able to match.

Contributor Author

I think it is still useful. For example, what if the graph gets cloned, so that the structure is exactly the same but the address has changed?

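Contributor

In the current setup, CinnCompiler's compile is only called from cinn_launch_op, which first uses compilation_key to fetch the Graph from CinnCompiler. Both the compilation_key and the stored Graph stay unchanged once the subgraph-partitioning pass has run, so the case you describe should not occur. It is fine to keep the second level, though. Later we can see whether the variable names can be dropped from the key, so that graphs with identical structure also hit the second-level cache.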
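To make the cloned-graph case from this thread concrete, here is a minimal sketch of a two-level lookup in which the structure-level map owns the compiled objects and the address-level map holds non-owning pointers, mirroring the ownership split in the diff above. Everything here is hypothetical: `TwoLevelCompiler`, `Graph`, `CompiledObject`, `Serialize`, and the counters are invented for illustration, and the keys are reduced to a raw pointer and a string, whereas the PR's keys also carry input shapes and the target arch.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the real graph and compiled-object types.
struct Graph {
  std::vector<std::string> ops;
};
struct CompiledObject {
  std::string program;
};

class TwoLevelCompiler {
 public:
  const CompiledObject& Compile(const Graph& graph) {
    // Level 1: lookup by address, no traversal of the graph needed.
    auto by_addr = cache_by_address_.find(&graph);
    if (by_addr != cache_by_address_.end()) {
      ++address_hits_;
      return *by_addr->second;
    }
    // Level 2: lookup by structure, which requires walking the graph.
    const std::string structure = Serialize(graph);
    auto by_struct = cache_by_struct_.find(structure);
    if (by_struct == cache_by_struct_.end()) {
      ++real_compiles_;  // stands in for the expensive compilation
      by_struct = cache_by_struct_
                      .emplace(structure,
                               std::make_unique<CompiledObject>(
                                   CompiledObject{"compiled:" + structure}))
                      .first;
    } else {
      ++structure_hits_;
    }
    // Remember the address so the next call with this graph stops at level 1.
    cache_by_address_[&graph] = by_struct->second.get();
    return *by_struct->second;
  }

  std::size_t address_hits() const { return address_hits_; }
  std::size_t structure_hits() const { return structure_hits_; }
  std::size_t real_compiles() const { return real_compiles_; }

 private:
  // Flattens the graph into a string that identifies its structure.
  static std::string Serialize(const Graph& graph) {
    std::string out;
    for (const auto& op : graph.ops) {
      out += op;
      out += ';';
    }
    return out;
  }

  // Non-owning: points at objects held by cache_by_struct_.
  std::unordered_map<const Graph*, CompiledObject*> cache_by_address_;
  // Owning: each compiled object lives exactly once, keyed by structure.
  std::unordered_map<std::string, std::unique_ptr<CompiledObject>>
      cache_by_struct_;
  std::size_t address_hits_ = 0;
  std::size_t structure_hits_ = 0;
  std::size_t real_compiles_ = 0;
};
```

With this sketch, compiling a graph and then a copy of it triggers one real compilation: the copy misses the address level, hits the structure level, and is then registered at the address level for subsequent calls. Note that an address key is only meaningful while the graph object it points to stays alive.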

CtfGo previously approved these changes Dec 7, 2021
Contributor

@CtfGo CtfGo left a comment

LGTM

Contributor

@wzzju wzzju left a comment

LGTM

@Xreki Xreki merged commit 2567dfa into PaddlePaddle:develop Dec 9, 2021
@thisjiang thisjiang deleted the optimize_cinn_cache_key branch December 10, 2021 03:49