
[Inference] add config.enable_low_precision_io api and remove rely on AnalysisConfig::Precison in trt #52485

Merged
merged 2 commits into PaddlePaddle:develop on May 22, 2023

Conversation

yuanlehome
Contributor

@yuanlehome yuanlehome commented Apr 3, 2023

PR types

New features

PR changes

APIs

Description

  • Remove the Paddle-TRT implementation's dependency on the upper-layer interface AnalysisConfig::Precision; internally, phi::DataType is used instead.
  • Add a config.enable_low_precision_io API to set the input/output data type for mixed-precision inference; it may only be used when the model itself is FP32 and GPU mixed-precision mode is enabled (see the sketch after this list).
    • Defaults to false (i.e. the API is not called): the input/output data type is FP32.
      • For an FP32 model, feed float data and float data is returned (the usual case).
    • If set to true, feed float16 data and float16 data is returned (with one possible exception: if the last op only supports FP32 or can only run on the CPU, float data is returned). This is the case this PR specifically supports.
    • Other cases not listed above are invalid, and the behavior is undefined.
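A minimal usage sketch of the new API (not from the PR itself; the Python binding name enable_low_precision_io, the model paths, and the input shape are assumptions):

```python
import numpy as np
from paddle.inference import Config, PrecisionType, create_predictor

# Placeholder FP32 model files; replace with real paths.
config = Config("model.pdmodel", "model.pdiparams")
config.enable_use_gpu(256, 0)  # GPU mixed-precision prerequisite
config.enable_tensorrt_engine(
    workspace_size=1 << 30,
    precision_mode=PrecisionType.Half,
    use_static=False,
    use_calib_mode=False,
)
config.enable_low_precision_io(True)  # feed/fetch float16 instead of float32

predictor = create_predictor(config)
x = np.random.rand(1, 3, 224, 224).astype(np.float16)  # inputs must now be fp16
inp = predictor.get_input_handle(predictor.get_input_names()[0])
inp.reshape(x.shape)
inp.copy_from_cpu(x)
predictor.run()
out = predictor.get_output_handle(predictor.get_output_names()[0]).copy_to_cpu()
# out.dtype is float16, unless the last op only supports fp32 or runs on CPU.
```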

Fixes: #54042, #54032, #54129

@paddle-bot

paddle-bot bot commented Apr 3, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@yuanlehome yuanlehome changed the title Specify io dtype [Paddle Inference] add config.keep_io_precision api Apr 3, 2023
@yuanlehome yuanlehome changed the title [Paddle Inference] add config.keep_io_precision api [Paddle Inference] add config.keep_io_datatype api Apr 4, 2023
@yuanlehome yuanlehome force-pushed the specify_io_dtype branch 4 times, most recently from 86cd435 to 44f13a0 on April 14, 2023 03:16
@leo0519
Collaborator

leo0519 commented Apr 17, 2023

Hi @yuanlehome, is there any unit test for the new config keep_io_datatype?

@yuanlehome
Contributor Author

yuanlehome commented Apr 17, 2023

Hi @yuanlehome, is there any unit test for the new config keep_io_datatype?

Not currently; I need to resolve some unit test issues on CI first.
Perhaps you can tell me what kind of usage examples you need? TRT? Naive GPU? Model precision?

@leo0519
Collaborator

leo0519 commented Apr 17, 2023

Perhaps you can tell me what kind of usage examples you need? TRT? Naive GPU? Model precision?

A simple network fed into Paddle Inference under the following conditions would be good for checking the input datatype of the TRT engine.

  1. enable_use_gpu is set
  2. enable_tensorrt_engine is set with precision FP16
  3. keep_io_datatype is set to false

The resulting predictor in Paddle Inference should take an FP16 GPU tensor as input and produce an FP16 tensor as output.

To test the IR graph that adds casts, a program containing some operators that cannot be converted to TRT is also required.
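Spelled out as config calls, those three conditions might look like the sketch below (hedged: keep_io_datatype was this PR's earlier draft name, and the merged API enable_low_precision_io(True) corresponds to keep_io_datatype=false; the model paths are placeholders):

```python
from paddle.inference import Config, PrecisionType

config = Config("simple_net.pdmodel", "simple_net.pdiparams")     # placeholder paths
config.enable_use_gpu(256, 0)                                     # condition 1
config.enable_tensorrt_engine(precision_mode=PrecisionType.Half)  # condition 2
config.enable_low_precision_io(True)  # condition 3: keep_io_datatype = false
```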

@yuanlehome
Contributor Author

Perhaps you can tell me what kind of usage examples you need? TRT? Naive GPU? Model precision?

A simple network fed into Paddle Inference under the following conditions would be good for checking the input datatype of the TRT engine.

  1. enable_use_gpu is set
  2. enable_tensorrt_engine is set with precision FP16
  3. keep_io_datatype is set to false

The resulting predictor in Paddle Inference should take an FP16 GPU tensor as input and produce an FP16 tensor as output.

To test the IR graph that adds casts, a program containing some operators that cannot be converted to TRT is also required.

Good tips. I'd like to wait for this PR to be merged before proceeding.

@leo0519
Collaborator

leo0519 commented Apr 17, 2023

Hi @yuanlehome, I would like to add more information about the UT:

The UT has two parts:

  • Test FP16 IO

    1. Initialize a paddle program with a simple network (e.g. fully-connected -> batchnorm -> relu -> ...; OPs that can be converted into TRT)
    2. Run Paddle Inference with GPU on and TensorRT off, and then take the output as baseline.
    3. Run Paddle Inference with GPU on, TensorRT on (TRT precision is set FP16), and keep_io_datatype False
    4. Feed FP16 input data and compare the output with baseline.
  • Test auto_mixed subgraph pass

    1. Initialize a paddle program with some operators Paddle-TRT cannot convert.
    2. Run Paddle Inference with GPU on and TensorRT off and take the output as baseline.
    3. Run Paddle Inference with GPU on and TensorRT on (FP16 precision), and keep_io_datatype False
    4. Feed input data and compare the output with baseline.

If there are any questions, please let me know.
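A rough Python skeleton of the first part might look like this (a sketch under assumptions: the file names are made up, the tolerance is illustrative, and enable_low_precision_io(True) plays the role of keep_io_datatype=false):

```python
import numpy as np
from paddle.inference import Config, PrecisionType, create_predictor

def run(config, x):
    # Run one predictor end to end and return the first output as a numpy array.
    predictor = create_predictor(config)
    inp = predictor.get_input_handle(predictor.get_input_names()[0])
    inp.reshape(x.shape)
    inp.copy_from_cpu(x)
    predictor.run()
    return predictor.get_output_handle(predictor.get_output_names()[0]).copy_to_cpu()

x = np.random.rand(1, 3, 32, 32).astype(np.float32)

# Step 2: GPU on, TensorRT off -> FP32 baseline.
baseline_cfg = Config("net.pdmodel", "net.pdiparams")
baseline_cfg.enable_use_gpu(256, 0)
baseline = run(baseline_cfg, x)

# Step 3: GPU on, TensorRT FP16, low-precision IO enabled.
trt_cfg = Config("net.pdmodel", "net.pdiparams")
trt_cfg.enable_use_gpu(256, 0)
trt_cfg.enable_tensorrt_engine(precision_mode=PrecisionType.Half)
trt_cfg.enable_low_precision_io(True)
out = run(trt_cfg, x.astype(np.float16))  # Step 4: feed FP16 input data

# Step 4: compare with the baseline under a loose FP16 tolerance.
np.testing.assert_allclose(out.astype(np.float32), baseline, rtol=1e-2, atol=1e-2)
```

The second part could reuse the same skeleton with a program containing operators Paddle-TRT cannot convert, so the auto_mixed subgraph pass and its inserted casts get exercised.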

@paddle-ci-bot

paddle-ci-bot bot commented Apr 22, 2023

Sorry to inform you that more than 7 days have passed since 44f13a0's CIs succeeded. To prevent PR conflicts, you need to re-run all CIs manually.

@yuanlehome yuanlehome force-pushed the specify_io_dtype branch 4 times, most recently from 6d7bf24 to 13f08cb on May 18, 2023 04:45
@yuanlehome yuanlehome changed the title [Paddle Inference] add config.keep_io_datatype api [Paddle Inference] add config.enable_low_precision_io api and remove rely on AnalysisConfig::Precison in trt May 18, 2023
@yuanlehome yuanlehome changed the title [Paddle Inference] add config.enable_low_precision_io api and remove rely on AnalysisConfig::Precison in trt [Inference] add config.enable_low_precision_io api and remove rely on AnalysisConfig::Precison in trt May 18, 2023
@yuanlehome yuanlehome force-pushed the specify_io_dtype branch 3 times, most recently from c7e2b05 to b0efe3e on May 19, 2023 03:56
Contributor

@XieYunshen XieYunshen left a comment


LGTM for set_tests_properties(test_trt_inference_fp16_io PROPERTIES TIMEOUT 300)

Contributor

@jiweibo jiweibo left a comment


LGTM

Contributor

@XiaoguangHu01 XiaoguangHu01 left a comment


LGTM

Contributor

@zhangjun zhangjun left a comment


LGTM

@jiweibo jiweibo merged commit d1bbd90 into PaddlePaddle:develop May 22, 2023
bukejiyu pushed a commit to bukejiyu/Paddle that referenced this pull request May 22, 2023