-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimizer use init program #5275
Optimizer use init program #5275
Conversation
def create_persistable_var(self, init_program, initializer, prefix, *args, | ||
**kwargs): | ||
""" | ||
create a persistable var in init_program.global_block and current block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need to create the variable in both init_program and main_program. Parameters are created only in init_program.
If init_program and main_program are in the same scope, will the same var_name be a conflict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now framework will create some persistable var or parameter in both init_program and program, and prepend init_op in init_program to initialize the persistable var, when executor create a real variable, if it finds the var is persistable, it will only create it in the root scope of Executor.
* Using the LayerHelper to create initialize operator and variables
Polish Optimizer initialization code.
@@ -63,7 +67,7 @@ def _finish_update(self, block): | |||
""" | |||
pass | |||
|
|||
def _add_accumulator(self, block, name, param, dtype=None, fill_value=0.0): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not remove dtype. It is not necessary that the type of the accumulator will be the same as that of param. dtype should be an argument to the function. There can be cases where we want integer accumulators for float parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
if dtype is None:
dtype = param.data_type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. If dtype is None, then let it be float, which is the default when we create variables. dtype
should be independent of the param type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I will change this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. If dtype is None, then let it be float, which is the default when we create variables. dtype should be independent of the param type.
We cannot use float
for every accumulator. If the parameter is double, the accumulator should be double.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that sounds like a valid point. @jacquesqiao could you please change it what @reyoung has suggested. I apologize for the rework that you have to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you.
project: #4679
fix: #4975