
Supports serving for PPMiniLM #1620

Merged 11 commits into PaddlePaddle:develop on Feb 17, 2022

Conversation

@LiuChiachi (Contributor) commented Jan 21, 2022

PR types

New features

PR changes

Models

Description

Supports Serving for PPMiniLM

This PR can run once the following PRs are merged:

add copyright for serving

reorganize
@LiuChiachi LiuChiachi changed the title Supports serving Supports serving for PPMiniLM Jan 21, 2022
@LiuChiachi LiuChiachi marked this pull request as ready for review January 21, 2022 08:37

```python
class PPMiniLMOp(Op):
    def init_op(self):
        import paddlenlp as ppnlp
```
Member:

```python
from paddlenlp.transformers import PPMiniLMTokenizer
```

Contributor (Author):

Thanks. Done.
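As a sketch of the pattern settled on in this exchange (import only the tokenizer the Op needs rather than all of paddlenlp), here is a self-contained illustration. The `Op` base class and the tokenizer are stubbed so the snippet runs without paddle installed; in the real code the import would be `from paddlenlp.transformers import PPMiniLMTokenizer` and the Op would derive from Paddle Serving's pipeline `Op`.

```python
class Op:  # stand-in for paddle_serving_server's pipeline Op base class
    pass

class PPMiniLMTokenizer:  # stub standing in for the real paddlenlp tokenizer
    def __call__(self, text):
        # toy encoding, only to make the sketch executable
        return {"input_ids": [ord(c) % 100 for c in text]}

class PPMiniLMOp(Op):
    def init_op(self):
        # build the tokenizer once when the serving op initializes
        self.tokenizer = PPMiniLMTokenizer()

op = PPMiniLMOp()
op.init_op()
print(op.tokenizer("hi")["input_ids"])  # -> [4, 5]
```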

@LiuChiachi LiuChiachi requested a review from ZeyuChen January 24, 2022 12:24
@@ -0,0 +1,82 @@
# Inference for PP-MiniLM with the Paddle Serving API
Member:

Service Deployment of PP-MiniLM with Paddle Serving

Paddle Serving is not an API, and this is not inference; it is service-oriented deployment.

Contributor (Author):

Got it, thanks for pointing this out; fixed.

@@ -394,6 +403,12 @@ cd ..



<a name="使用PaddleServing预测"></a>

### Prediction with Paddle Serving
Member:

Service deployment with Paddle Serving

Contributor (Author):

Thanks, fixed :)


### Prediction
### Prediction with Paddle Inference
Member:

Inference deployment with Paddle Inference

| File | Description |
|---|---|
| ppminilm.pdiparams | Model weight file, loaded at inference time |
| ppminilm.pdmodel | Model structure file, loaded at inference time |

Assume these 2 files have been generated, where the model is one with the FasterTokenizer operator integrated, and they are placed under the directory `$MODEL_DIR`.
Member:

"where the model is one with the FasterTokenizer operator integrated"
Does the user need to be aware of this prerequisite?

Contributor (Author):

No, removed. The exported model includes the FasterTokenizer operator by default, so there is no need to emphasize it. The Lite side will separately note that it currently only supports models without the FasterTokenizer operator.


Using Paddle Serving requires installing the relevant modules on the server side; a version later than v0.8.0 is required:
```shell
pip install paddle-serving-app paddle-serving-client paddle-serving-server paddlepaddle
```
Member:

Why guide users to install paddlepaddle here as well? Could this conflict with the paddle that is already installed alongside paddlenlp?

Contributor (Author):

Thanks for pointing this out; fixed.

Before starting prediction, modify the settings in the config file as needed. The main options are:

- `rpc_port` : the RPC port.
- `device_type` : 0 for cpu, 1 for gpu, 2 for tensorRT, 3 for arm cpu, 4 for kunlun xpu.
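As a hedged illustration of the `device_type` codes listed above, a small validation helper could look like this. The helper and its name are hypothetical, not part of Paddle Serving's API; it only encodes the documented mapping.

```python
# Hypothetical helper mirroring the device_type codes from the config docs.
DEVICE_TYPES = {
    0: "cpu",
    1: "gpu",
    2: "tensorrt",
    3: "arm cpu",
    4: "kunlun xpu",
}

def device_name(device_type: int) -> str:
    """Translate a config device_type code into a readable device name."""
    try:
        return DEVICE_TYPES[device_type]
    except KeyError:
        raise ValueError(f"unknown device_type: {device_type}")

print(device_name(1))  # -> gpu
```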
Member:

CPU, GPU, TensorRT, Arm CPU, Kunlun XPU
Note the standard spelling of these terms.

Contributor (Author):

Thanks, fixed.

@@ -0,0 +1,35 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
Member:

What is this file for? I don't see the docs introducing the RPC client.
So by default only the web service is used?

Contributor (Author):

The web service starts the server, and the RPC client sends requests; the two are used together as a pair.
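The server/client pairing described here is the standard RPC pattern. A minimal stdlib analogue (using Python's `xmlrpc` rather than Paddle Serving's actual API, so the sketch is self-contained) shows one process exposing a predict-style function and a matching client sending a request; the `classify` function is a hypothetical stand-in for the PP-MiniLM pipeline's predict step.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def classify(text):
    # hypothetical stand-in for the PP-MiniLM pipeline's predict step
    return {"label": "positive" if "good" in text else "negative"}

# "web service" side: expose the function on an ephemeral port
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(classify, "classify")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# "rpc client" side: connect to the server and send a request
client = ServerProxy(f"http://127.0.0.1:{port}")
result = client.classify("this movie is good")
server.shutdown()
print(result["label"])  # -> positive
```

The point mirrored from the thread: the client does no inference itself; it only packs the request and reads back what the server computed.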


```

## Start the client for inference
Member:

In Serving's logic, the client is started not to run inference but to fetch results from the server.
These headings are not technically rigorous.

Contributor (Author):

Thanks for pointing this out; changed to "Start the client to send inference requests".

* [Requirements](#环境要求)
* [How to Run](#运行方式)
* [Performance Tests](#性能测试)
* [Prediction with Paddle Serving](#使用PaddleServing预测)
Member:

Deployment and prediction are two different things.

@tianxin1860 left a comment:

LGTM

@LiuChiachi LiuChiachi merged commit 9ac1714 into PaddlePaddle:develop Feb 17, 2022