ctc init #63
ctc/model.py
Outdated
stride_x=1,
stride_y=1,
block_x=1,
block_y=3)
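Per PaddlePaddle/Paddle#2296, the output of conv_features here appears to have c=128, h=11, w=3, so block_y should be 11. This setting is probably wrong, which would explain both the GPU memory problem and the problem reported in issue #2296.
The training configuration is written, so this is basically ready for a PR.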
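To make the shape argument concrete, here is a minimal sketch of the sliding-window arithmetic for the c=128, h=11, w=3 feature map mentioned above. It assumes no padding and that each block is flattened across all channels; `block_expand_shape` is a hypothetical helper written for illustration, not a PaddlePaddle API.

```python
def block_expand_shape(c, h, w, block_x, block_y, stride_x=1, stride_y=1):
    """Return (num_steps, step_dim) for sliding a block over a c x h x w map.

    Assumption: zero padding, and the block fits inside the feature map.
    """
    if block_x > w or block_y > h:
        raise ValueError("block must fit inside the feature map")
    steps_y = (h - block_y) // stride_y + 1  # window positions along height
    steps_x = (w - block_x) // stride_x + 1  # window positions along width
    return steps_y * steps_x, c * block_x * block_y


# Setting from the diff: block_y=3
print(block_expand_shape(128, 11, 3, block_x=1, block_y=3))
# Setting suggested in the review: block_y=11 (full feature-map height)
print(block_expand_shape(128, 11, 3, block_x=1, block_y=11))
```

Under these assumptions, block_y=3 yields 27 steps of width 384, while block_y=11 yields 3 steps of width 1408, one per feature-map column.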
Also, an inference procedure is needed.
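As one possible starting point for the inference step, the sketch below shows CTC best-path (greedy) decoding in plain Python: take the most likely label at each time step, merge consecutive repeats, then drop blanks. The function name `ctc_greedy_decode` and the choice of blank index are assumptions for illustration; the PR may well use a different decoder or a PaddlePaddle layer instead.

```python
def ctc_greedy_decode(probs, blank):
    """Best-path CTC decoding over a list of per-step probability lists."""
    # Most likely label index at each time step.
    best_path = [max(range(len(step)), key=step.__getitem__) for step in probs]
    decoded = []
    prev = None
    for label in best_path:
        # Keep a label only when it starts a new run and is not the blank.
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded


# Three classes; index 2 is assumed to be the blank label.
example_probs = [
    [0.90, 0.05, 0.05],  # label 0
    [0.80, 0.10, 0.10],  # label 0 again, merged with the previous step
    [0.10, 0.10, 0.80],  # blank, separates the two runs of label 0
    [0.70, 0.20, 0.10],  # label 0
    [0.10, 0.80, 0.10],  # label 1
]
print(ctc_greedy_decode(example_probs, blank=2))
```

The blank between the two runs of label 0 is what lets the decoder emit the same label twice in a row.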
ctc/README.md
Outdated
# CTC (Connectionist Temporal Classification) Model: CRNN Tutorial
## Background

Real-world sequence learning tasks must predict the corresponding label sequence from a continuous input sequence,
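"Sequence learning tasks must predict the corresponding label sequence from a continuous input sequence"
Is "continuous input" accurate? The input in machine translation is not continuous, is it?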
done
ctc/README.md
Outdated
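## Background

Real-world sequence learning tasks must predict the corresponding label sequence from a continuous input sequence,
for example, a speech recognition task derives the corresponding text sequence from continuous speech, similar to a seq2seq task;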
"seq2seq task"
If this is a proper term, it needs a link.
ctc/README.md
Outdated
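CTC-related models are one class of algorithms for this kind of seq2seq task. Specifically, a CTC model performs one classification at each time step of the input sequence and outputs one label (the origin of "Classification" in CTC),
and finally processes the resulting label sequence into the corresponding output sequence (see the algorithm below).

CTC algorithms are applied in many fields, such as handwritten digit recognition, speech recognition, gesture recognition, and continuous image text recognition. Apart from the domain knowledge that differs between tasks,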
"Continuous image text recognition" (连续图像文字识别) is not precise enough.
ctc/README.md
Outdated
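CTC algorithms are applied in many fields, such as handwritten digit recognition, speech recognition, gesture recognition, and continuous image text recognition. Apart from the domain knowledge that differs between tasks,
all of these tasks take a continuous sequence as input and produce a label sequence as output.

This article targets the **Scene Text Recognition (STR)** task and demonstrates how to use PaddlePaddle to implement a one-stop (一站式) CTC model, **CRNN (Convolutional Recurrent Neural Network)**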
一站式 (one-stop) -> 端到端 (end-to-end)
ctc/README.md
Outdated
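CTC algorithms are applied in many fields, such as handwritten digit recognition, speech recognition, gesture recognition, and continuous image text recognition. Apart from the domain knowledge that differs between tasks,
all of these tasks take a continuous sequence as input and produce a label sequence as output.

This article targets the **Scene Text Recognition (STR)** task and demonstrates how to use PaddlePaddle to implement a one-stop (一站式) CTC model, **CRNN (Convolutional Recurrent Neural Network)**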
Should OCR be mentioned here, with an explanation of how it differs from STR?
ctc/data_provider.py
Outdated
if __name__ == '__main__':
    image_file_list = '/home/disk1/yanchunwei/90kDICT32px/train_all.txt'
This script needs a step that downloads the data; the path here can then be replaced with the location of the downloaded files.
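A download step along these lines could look like the sketch below, which fetches a file only when it is not already present. The helper name `ensure_downloaded` and its parameters are hypothetical, and the dataset URL is deliberately left to the caller since the review does not give one.

```python
import os
import urllib.request


def ensure_downloaded(url, dest_path):
    """Download url to dest_path unless the file already exists."""
    if os.path.exists(dest_path):
        return dest_path
    dest_dir = os.path.dirname(dest_path)
    if dest_dir:
        os.makedirs(dest_dir, exist_ok=True)
    # Write to a temporary name first so an interrupted download
    # is never mistaken for a complete file on the next run.
    tmp_path = dest_path + ".part"
    urllib.request.urlretrieve(url, tmp_path)
    os.replace(tmp_path, dest_path)
    return dest_path
```

The returned path can then stand in for the hard-coded one in the `__main__` block.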
ctc/README.md
Outdated
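### Image Data and Processing
This task uses the dataset \[[4](#参考文献)\], which contains image data and the corresponding target text. The target text to be predicted must be converted into a one-dimensional list of IDs, which we implement with the following class

```python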
There is no need to paste this much code here; just tell the user which script and function it lives in.
ctc/data_provider.py
Outdated
if self.fixed_shape:
    image = cv2.resize(
        image, self.fixed_shape, interpolation=cv2.INTER_CUBIC)
    # image = to_chw(image)
image = to_chw(image)
Remove the commented-out code.
ctr/model.py
Outdated
@@ -0,0 +1,76 @@
#!/usr/bin/env python
Can the CTR and DSSM code and documentation be removed from this PR?
ctc/train.py
Outdated
trainer.train(
    reader=paddle.batch(
        paddle.reader.shuffle(dataset.train, buf_size=100),
buf_size could be increased a bit to widen the shuffle range.
That runs into the problem of GPU memory being allocated multiple times.
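For context on why buf_size bounds the shuffle range, here is a minimal sketch of buffered shuffling over a reader. It is written in the spirit of `paddle.reader.shuffle` but is not its actual source; `buffered_shuffle` and the `seed` parameter are assumptions added for illustration.

```python
import random


def buffered_shuffle(reader, buf_size, seed=None):
    """Wrap a reader so samples are shuffled within chunks of buf_size."""
    rng = random.Random(seed)

    def shuffled_reader():
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) >= buf_size:
                # A sample can only move within the chunk it was read into.
                rng.shuffle(buf)
                yield from buf
                buf = []
        if buf:
            # Flush the final, possibly shorter, chunk.
            rng.shuffle(buf)
            yield from buf

    return shuffled_reader


def ten_samples():
    return iter(range(10))


print(list(buffered_shuffle(ten_samples, buf_size=4, seed=0)()))
```

In this sketch a sample never leaves its chunk, so a small buf_size keeps the data close to its original order, while a larger one mixes more widely at the cost of holding more samples in host memory.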