Stack LSTM Net for Paddle Book6 #5503
python/paddle/v2/framework/layers.py
Outdated
'isReverse': is_reverse,
'gateActivation': gate_activation,
'cellActivation': cell_activation,
'candidateActivation': candidate_activation
These attr names have been changed to snake_case; please update.
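For illustration only, the renaming asked for here can be sketched as a mechanical camelCase to snake_case conversion. The helper name `camel_to_snake` is hypothetical, and the converted names are what the mechanical rule produces; the authoritative attr names are whatever the operator definition declares, so they should be checked there.

```python
import re


def camel_to_snake(name):
    """Convert a camelCase attribute name to snake_case (hypothetical helper)."""
    # Insert "_" before every uppercase letter that is not the first
    # character, then lowercase the whole string.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


# The four attr names shown in the hunk above.
old_attrs = ["isReverse", "gateActivation", "cellActivation",
             "candidateActivation"]
new_attrs = [camel_to_snake(a) for a in old_attrs]
print(new_attrs)
```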
python/paddle/v2/framework/layers.py
Outdated
'cellActivation': cell_activation,
'candidateActivation': candidate_activation
})
return hidden
The Cell is also an output.
inputs = [fc1, lstm1]

for i in range(2, stacked_num + 1):
    fc = layers.fc(input=inputs, size=hid_dim)
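This differs from the book: https://github.com/PaddlePaddle/book/blob/develop/06.understand_sentiment/train.py#L58
This fc has two inputs and therefore two sets of weights, each with its own initialization. Note in particular that the weight for the lstm input is initialized to 0.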
fc_para_attr = paddle.attr.Param(learning_rate=1e-3)
lstm_para_attr = paddle.attr.Param(initial_std=0., learning_rate=1.)
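To make the point about the two weight sets concrete, here is a minimal pure-Python sketch (not Paddle code) of a fully connected layer with two inputs. It assumes that `initial_std=0.` yields an all-zero initial weight, in which case the lstm input contributes nothing to the output at the first step. The function name `fc_two_inputs` and all values are made up for illustration.

```python
def fc_two_inputs(x_fc, x_lstm, w_fc, w_lstm, bias):
    """Compute out[j] = sum_i x_fc[i] * w_fc[i][j]
                      + sum_k x_lstm[k] * w_lstm[k][j] + bias[j]."""
    out = []
    for j in range(len(bias)):
        total = bias[j]
        # Contribution of the first input through its own weight matrix.
        total += sum(x * row[j] for x, row in zip(x_fc, w_fc))
        # Contribution of the second input through a separate weight matrix.
        total += sum(x * row[j] for x, row in zip(x_lstm, w_lstm))
        out.append(total)
    return out


x_fc = [1.0, 2.0]
x_lstm = [5.0, -3.0]
w_fc = [[0.5, -1.0], [0.25, 2.0]]
w_lstm_zero = [[0.0, 0.0], [0.0, 0.0]]  # assumed effect of initial_std=0.
bias = [0.0, 0.0]

out = fc_two_inputs(x_fc, x_lstm, w_fc, w_lstm_zero, bias)
print(out)  # the lstm input has no effect while its weight is all zeros
```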
for i in range(2, stacked_num + 1):
    fc = layers.fc(input=inputs, size=hid_dim)
    lstm = layers.dynamic_lstm(
        input=fc, size=hid_dim, is_reverse=(i % 2) == 0)
One more difference from the book here: the lstm's candidate_activation (called act in the book) is relu there, see
https://github.com/PaddlePaddle/book/blob/develop/06.understand_sentiment/train.py#L80
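As a rough illustration of where the candidate activation enters, the following scalar sketch implements one step of the textbook LSTM update. It is not the Paddle kernel: the function `lstm_step`, its arguments, and the pre-activation values are assumptions, and the real op works on matrices and may differ in details such as peepholes.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def lstm_step(pre_i, pre_f, pre_o, pre_c, c_prev,
              gate_activation=sigmoid,
              cell_activation=math.tanh,
              candidate_activation=math.tanh):
    """One scalar LSTM step from already-projected pre-activations."""
    i = gate_activation(pre_i)               # input gate
    f = gate_activation(pre_f)               # forget gate
    o = gate_activation(pre_o)               # output gate
    c_tilde = candidate_activation(pre_c)    # candidate cell value
    c = f * c_prev + i * c_tilde             # new cell state
    h = o * cell_activation(c)               # new hidden state
    return h, c


# Same inputs, different candidate activation.
h_tanh, c_tanh = lstm_step(0.0, 0.0, 0.0, -2.0, c_prev=1.0)
h_relu, c_relu = lstm_step(0.0, 0.0, 0.0, -2.0, c_prev=1.0,
                           candidate_activation=relu)
print(c_tanh, c_relu)
```

With a negative candidate pre-activation, relu contributes nothing to the cell state while tanh pulls it down, so the two settings are not interchangeable when comparing against the book's results.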
prediction = layers.fc(input=[fc_last, lstm_last],
                       size=class_dim,
                       act='softmax')
cost = layers.cross_entropy(input=prediction, label=label)
For numerical stability we have softmax_with_cross_entropy_op. I suggest replacing softmax + cross_entropy in the demo with softmax_with_cross_entropy_op?
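The stability concern can be shown with a small pure-Python sketch. This illustrates the general log-sum-exp technique that a fused softmax plus cross-entropy computation can use; it is not the implementation of softmax_with_cross_entropy_op, and both function names are made up for the example.

```python
import math


def naive_softmax_cross_entropy(logits, label):
    """Softmax followed by a separate cross entropy."""
    exps = [math.exp(z) for z in logits]   # overflows for large logits
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[label])


def fused_softmax_cross_entropy(logits, label):
    """Cross entropy computed directly from logits via log-sum-exp."""
    m = max(logits)
    # Subtracting the max keeps every exponent <= 0, so exp cannot overflow.
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


small = [1.0, 2.0, 3.0]
diff = abs(naive_softmax_cross_entropy(small, 2)
           - fused_softmax_cross_entropy(small, 2))

large = [1000.0, 0.0]
try:
    naive_softmax_cross_entropy(large, 1)
    naive_failed = False
except OverflowError:
    naive_failed = True

fused_loss = fused_softmax_cross_entropy(large, 1)
print(diff, naive_failed, fused_loss)
```

On moderate logits the two versions agree; on large logits only the fused form returns a finite loss.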
paddle.reader.shuffle(
    paddle.dataset.imdb.train(word_dict), buf_size=1000),
batch_size=BATCH_SIZE)
place = core.CPUPlace()
Should we add a GPU example as well?
# place = core.GPUPlace(0)
outs = exe.run(g_main_program,
               feed={"words": tensor_words,
                     "label": tensor_label},
               fetch_list=[cost, acc])
Will this become a demo later? If so, shouldn't it also be evaluated on the test set? (A TODO would also be fine, to be handled in a follow-up PR.)
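A possible shape for such a test pass, sketched without any Paddle calls: loop over test batches, fetch cost and accuracy per batch, and report the sample-weighted average. The names `evaluate` and `run_batch` are hypothetical; in the real script `run_batch` would wrap an `exe.run` call on a test program, and the stand-in below only mimics its return values.

```python
def evaluate(batches, run_batch):
    """Return sample-weighted average (cost, acc) over all batches."""
    total_cost = 0.0
    total_acc = 0.0
    num_samples = 0
    for batch in batches:
        cost, acc = run_batch(batch)
        # Weight by batch size so a smaller final batch is not over-counted.
        total_cost += cost * len(batch)
        total_acc += acc * len(batch)
        num_samples += len(batch)
    if num_samples == 0:
        raise ValueError("no test samples")
    return total_cost / num_samples, total_acc / num_samples


def fake_run_batch(batch):
    """Stand-in for the executor call: batch is a list of (pred, label)."""
    acc = sum(1 for pred, label in batch if pred == label) / len(batch)
    return 1.0 - acc, acc  # made-up cost, for illustration only


test_batches = [[(1, 1), (0, 1)], [(1, 1)]]
avg_cost, avg_acc = evaluate(test_batches, fake_run_batch)
print(avg_cost, avg_acc)
```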
I approve this PR, but some of the review comments above still need to be addressed later. I created issue #5591 to track them.
Fix #5504