
[Inference] add config.enable_low_precision_io api and remove rely on AnalysisConfig::Precison in trt #52485

Merged
merged 2 commits into PaddlePaddle:develop on May 22, 2023

Conversation

yuanlehome
Contributor

@yuanlehome yuanlehome commented Apr 3, 2023

PR types

New features

PR changes

APIs

Description

  • Remove the Paddle-TRT implementation's dependency on the upper-layer interface AnalysisConfig::Precision; internally, phi::DataType is used instead.
  • Add a config.enable_low_precision_io API to set the input/output data type for mixed-precision inference; it may only be used when the model itself is FP32 and GPU mixed-precision mode is enabled (see the sketch after this list).
    • Defaults to false (i.e. the API is not called): the input/output data type is FP32.
      • For an FP32 model, feed float data and float data is returned (the usual case).
    • If set to true, feed float16 data and float16 data is returned (with one possible exception: if the last op only supports FP32 or can only run on the CPU, float data is returned). This is the case this PR specifically supports.
    • Other cases not listed above are invalid, and the behavior is undefined.
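A minimal usage sketch of the new API (not from the PR itself; the Python binding name enable_low_precision_io, the model paths, and the input shape are assumptions):

```python
import numpy as np
from paddle.inference import Config, PrecisionType, create_predictor

# Placeholder FP32 model files; replace with real paths.
config = Config("model.pdmodel", "model.pdiparams")
config.enable_use_gpu(256, 0)  # GPU mixed-precision prerequisite
config.enable_tensorrt_engine(
    workspace_size=1 << 30,
    precision_mode=PrecisionType.Half,
    use_static=False,
    use_calib_mode=False,
)
config.enable_low_precision_io(True)  # feed/fetch float16 instead of float32

predictor = create_predictor(config)
x = np.random.rand(1, 3, 224, 224).astype(np.float16)  # inputs must now be fp16
inp = predictor.get_input_handle(predictor.get_input_names()[0])
inp.reshape(x.shape)
inp.copy_from_cpu(x)
predictor.run()
out = predictor.get_output_handle(predictor.get_output_names()[0]).copy_to_cpu()
# out.dtype is float16, unless the last op only supports fp32 or runs on CPU.
```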

Fixes: #54042, #54032, #54129

@paddle-bot

paddle-bot bot commented Apr 3, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@yuanlehome yuanlehome changed the title Specify io dtype [Paddle Inference] add config.keep_io_precision api Apr 3, 2023
@yuanlehome yuanlehome changed the title [Paddle Inference] add config.keep_io_precision api [Paddle Inference] add config.keep_io_datatype api Apr 4, 2023
@yuanlehome yuanlehome force-pushed the specify_io_dtype branch 4 times, most recently from 86cd435 to 44f13a0 on April 14, 2023 03:16
@leo0519
Collaborator

leo0519 commented Apr 17, 2023

Hi @yuanlehome, is there any unit test for the new config keep_io_datatype?

@yuanlehome
Contributor Author

yuanlehome commented Apr 17, 2023

Hi @yuanlehome, is there any unit test for the new config keep_io_datatype?

Not currently; I need to resolve some unit test issues on CI first.
Perhaps you can tell me what kind of usage examples you need? TRT? Naive GPU? Model precision?

@leo0519
Collaborator

leo0519 commented Apr 17, 2023

Perhaps you can tell me what kind of usage examples you need? TRT? Naive GPU? Model precision?

A simple network fed into Paddle Inference under the following conditions would be good for checking the input datatype of the TRT engine.

  1. enable_use_gpu is set
  2. enable_tensorrt_engine is set with precision FP16
  3. keep_io_datatype is set to false

The resulting predictor in Paddle Inference should take an FP16 GPU tensor as input and produce an FP16 tensor as output.

To test the IR graph that adds casts, a program containing some operators that cannot be converted to TRT is also required.
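Spelled out as config calls, those three conditions might look like the sketch below (hedged: keep_io_datatype was this PR's earlier draft name, and the merged API enable_low_precision_io(True) corresponds to keep_io_datatype=false; the model paths are placeholders):

```python
from paddle.inference import Config, PrecisionType

config = Config("simple_net.pdmodel", "simple_net.pdiparams")     # placeholder paths
config.enable_use_gpu(256, 0)                                     # condition 1
config.enable_tensorrt_engine(precision_mode=PrecisionType.Half)  # condition 2
config.enable_low_precision_io(True)  # condition 3: keep_io_datatype = false
```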

@yuanlehome
Contributor Author

Perhaps you can tell me what kind of usage examples you need? TRT? Naive GPU? Model precision?

A simple network fed into Paddle Inference under the following conditions would be good for checking the input datatype of the TRT engine.

  1. enable_use_gpu is set
  2. enable_tensorrt_engine is set with precision FP16
  3. keep_io_datatype is set to false

The resulting predictor in Paddle Inference should take an FP16 GPU tensor as input and produce an FP16 tensor as output.

To test the IR graph that adds casts, a program containing some operators that cannot be converted to TRT is also required.

Good tips. I'd like to wait for this PR to be merged before proceeding.

@leo0519
Collaborator

leo0519 commented Apr 17, 2023

Hi @yuanlehome, I would like to add more information about the UT:

The UT has two parts:

  • Test FP16 IO

    1. Initialize a paddle program with a simple network (e.g. fully-connected -> batchnorm -> relu -> ...; OPs that can be converted into TRT)
    2. Run Paddle Inference with GPU on and TensorRT off, and then take the output as baseline.
    3. Run Paddle Inference with GPU on, TensorRT on (TRT precision is set FP16), and keep_io_datatype False
    4. Feed FP16 input data and compare the output with baseline.
  • Test auto_mixed subgraph pass

    1. Initialize a paddle program with some operators Paddle-TRT cannot convert.
    2. Run Paddle Inference with GPU on and TensorRT off and take the output as baseline.
    3. Run Paddle Inference with GPU on and TensorRT on (FP16 precision), and keep_io_datatype False
    4. Feed input data and compare the output with baseline.

If there are any questions, please let me know.
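A rough Python skeleton of the first part might look like this (a sketch under assumptions: the file names are made up, the tolerance is illustrative, and enable_low_precision_io(True) plays the role of keep_io_datatype=false):

```python
import numpy as np
from paddle.inference import Config, PrecisionType, create_predictor

def run(config, x):
    # Run one predictor end to end and return the first output as a numpy array.
    predictor = create_predictor(config)
    inp = predictor.get_input_handle(predictor.get_input_names()[0])
    inp.reshape(x.shape)
    inp.copy_from_cpu(x)
    predictor.run()
    return predictor.get_output_handle(predictor.get_output_names()[0]).copy_to_cpu()

x = np.random.rand(1, 3, 32, 32).astype(np.float32)

# Step 2: GPU on, TensorRT off -> FP32 baseline.
baseline_cfg = Config("net.pdmodel", "net.pdiparams")
baseline_cfg.enable_use_gpu(256, 0)
baseline = run(baseline_cfg, x)

# Step 3: GPU on, TensorRT FP16, low-precision IO enabled.
trt_cfg = Config("net.pdmodel", "net.pdiparams")
trt_cfg.enable_use_gpu(256, 0)
trt_cfg.enable_tensorrt_engine(precision_mode=PrecisionType.Half)
trt_cfg.enable_low_precision_io(True)
out = run(trt_cfg, x.astype(np.float16))  # Step 4: feed FP16 input data

# Step 4: compare with the baseline under a loose FP16 tolerance.
np.testing.assert_allclose(out.astype(np.float32), baseline, rtol=1e-2, atol=1e-2)
```

The second part could reuse the same skeleton with a program containing operators Paddle-TRT cannot convert, so the auto_mixed subgraph pass and its inserted casts get exercised.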

@paddle-ci-bot

paddle-ci-bot bot commented Apr 22, 2023

Sorry to inform you that more than 7 days have passed since 44f13a0's CIs succeeded. To prevent PR conflicts, you need to re-run all CIs manually.

@yuanlehome yuanlehome force-pushed the specify_io_dtype branch 4 times, most recently from 6d7bf24 to 13f08cb on May 18, 2023 04:45
@yuanlehome yuanlehome changed the title [Paddle Inference] add config.keep_io_datatype api [Paddle Inference] add config.enable_low_precision_io api and remove rely on AnalysisConfig::Precison in trt May 18, 2023
@yuanlehome yuanlehome changed the title [Paddle Inference] add config.enable_low_precision_io api and remove rely on AnalysisConfig::Precison in trt [Inference] add config.enable_low_precision_io api and remove rely on AnalysisConfig::Precison in trt May 18, 2023
@yuanlehome yuanlehome force-pushed the specify_io_dtype branch 3 times, most recently from c7e2b05 to b0efe3e on May 19, 2023 03:56
Contributor

@XieYunshen XieYunshen left a comment


LGTM for set_tests_properties(test_trt_inference_fp16_io PROPERTIES TIMEOUT 300)

Contributor

@jiweibo jiweibo left a comment


LGTM

Contributor

@XiaoguangHu01 XiaoguangHu01 left a comment


LGTM

Contributor

@zhangjun zhangjun left a comment


LGTM

@jiweibo jiweibo merged commit d1bbd90 into PaddlePaddle:develop May 22, 2023
bukejiyu pushed a commit to bukejiyu/Paddle that referenced this pull request May 22, 2023