Supports serving for PPMiniLM #1620
Conversation
add copyright for serving reorganize
class PPMiniLMOp(Op):
    def init_op(self):
        import paddlenlp as ppnlp
from paddlenlp.transformers import PPMiniLMTokenizer
Thanks. Done.
…nto add-ppminilm-serving
@@ -0,0 +1,82 @@
# PP-MiniLM inference with the Paddle Serving API
Suggest: "Service deployment for PP-MiniLM with Paddle Serving".
Paddle Serving is not an API, and this is not just inference; it is service deployment.
Got it, thanks for pointing that out. Fixed.
@@ -394,6 +403,12 @@ cd ..
<a name="使用PaddleServing预测"></a>
### Prediction with Paddle Serving
Suggest: "Service deployment with Paddle Serving".
Thanks, fixed :)
### Prediction
### Prediction with Paddle Inference
Suggest: "Inference deployment with Paddle Inference".
| ppminilm.pdiparams | Model weights file, loaded at inference time |
| ppminilm.pdmodel | Model structure file, loaded at inference time |
Assume these 2 files have been generated and placed in the directory `$MODEL_DIR`, where the model is one that integrates the FasterTokenizer operator.
"where the model is one that integrates the FasterTokenizer operator" — does the user need to be aware of this precondition?
No, deleted. The exported model includes the FasterTokenizer operator by default, so there is no need to emphasize it. We will document separately on the Lite side that, for now, only models without the FasterTokenizer operator are supported there.
Using Paddle Serving requires installing the related modules on the server side, version v0.8.0 or later:
```shell
pip install paddle-serving-app paddle-serving-client paddle-serving-server paddlepaddle
```
Why guide users to install paddlepaddle here? Could this conflict with the paddle already installed alongside paddlenlp?
Thanks for pointing that out; fixed.
Before starting prediction, adjust the settings in the config file for your environment. The main fields to modify are explained below:
- `rpc_port` : the RPC port.
- `device_type` : 0 for CPU, 1 for GPU, 2 for TensorRT, 3 for Arm CPU, 4 for Kunlun XPU.
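For context, fields like these typically live in the service's config file. A minimal hedged sketch follows — the nesting, the op name, and any field beyond `rpc_port`/`device_type` are assumptions for illustration, not necessarily this PR's actual config:

```yaml
rpc_port: 18090        # port the RPC service listens on (example value)
op:
  ppminilm:            # op name is an assumption for illustration
    local_service_conf:
      device_type: 1   # 0 CPU, 1 GPU, 2 TensorRT, 3 Arm CPU, 4 Kunlun XPU
      devices: "0"     # card id(s) when a GPU/XPU device_type is selected
```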
CPU, GPU, TensorRT, Arm CPU, Kunlun XPU —
mind the standard spelling of these terms.
Thanks, fixed.
@@ -0,0 +1,35 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
What is this file for? I don't see the RPC client covered in the docs.
So by default only the web service is used, right?
The web service starts the server, and the RPC client sends requests; the two are used together.
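As a rough, hypothetical illustration of that pairing (the `"key"`/`"value"` body layout, the `"sentence"` feed name, and the `build_request` helper are assumptions for illustration, not this PR's actual API), a client-side request body for the web service might be assembled like this:

```python
import json

def build_request(sentence):
    # Hypothetical JSON body for a Paddle Serving pipeline-style web
    # endpoint. The "key"/"value" layout and the "sentence" feed name
    # are assumptions for illustration only.
    return json.dumps({"key": ["sentence"], "value": [sentence]},
                      ensure_ascii=False)

body = build_request("this restaurant is great")
print(json.loads(body)["value"][0])  # round-trips the input sentence
```

The server then runs the model on the posted text and returns its result, which is what "starting the client" actually retrieves.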
## Start the client for inference
In Serving's model, the client is started not to run inference but to fetch results from the server.
These headings are technically imprecise.
Thanks for pointing that out; changed it to "start the client to send an inference request".
* [Environment requirements](#环境要求)
* [How to run](#运行方式)
* [Performance testing](#性能测试)
* [Prediction with Paddle Serving](#使用PaddleServing预测)
Deployment and prediction are two different things.
LGTM
PR types
New features
PR changes
Models
Description
Supports Serving for PPMiniLM
This PR could run if these PRs are merged: