Optimizer use init program #5275

Merged

Conversation

@jacquesqiao (Member) commented Nov 1, 2017

project: #4679
fix: #4975

def create_persistable_var(self, init_program, initializer, prefix, *args,
                           **kwargs):
    """
    create a persistable var in init_program.global_block and current block.
Member

Why do we need to create the variable in both init_program and main_program? Parameters are created only in init_program.
If init_program and main_program are in the same scope, will the same var_name cause a conflict?

Member Author

The framework now creates some persistable variables or parameters in both init_program and the main program, and prepends init ops to init_program to initialize the persistable variables. When the executor creates the actual variable, if it finds that the variable is persistable, it creates it only in the root scope of the Executor.
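
To make this concrete, here is a small self-contained mock of that pattern (plain Python, not the real PaddlePaddle API; `Var`, `Program`, `Executor`, and `momentum_acc` are illustrative names, and "fill_constant" stands in for the init op prepended to init_program): the variable is declared in both programs, initialized only through init_program, and materialized exactly once in the executor's root scope, which is why the shared var_name does not conflict.

```python
# Schematic mock of the behavior described above -- plain Python, NOT the real
# PaddlePaddle API. `Var`, `Program`, `Executor`, and `momentum_acc` are
# illustrative names; "fill_constant" stands in for the prepended init op.

class Var(object):
    def __init__(self, name, persistable=False):
        self.name = name
        self.persistable = persistable


class Program(object):
    def __init__(self):
        self.vars = {}   # declarations only: name -> Var
        self.ops = []    # (op_type, output_name, value)

    def create_var(self, name, persistable=False):
        self.vars[name] = Var(name, persistable)
        return self.vars[name]

    def prepend_op(self, op_type, output, value=0.0):
        self.ops.insert(0, (op_type, output, value))


class Executor(object):
    def __init__(self):
        self.root_scope = {}   # persistable variables live here, created once

    def run(self, program):
        local_scope = {}       # non-persistable temporaries, recreated per run
        for var in program.vars.values():
            if var.persistable:
                # Declared in several programs, but materialized only once.
                self.root_scope.setdefault(var.name, None)
            else:
                local_scope[var.name] = None
        for op_type, out, value in program.ops:
            if op_type == "fill_constant":
                self.root_scope[out] = value   # the init op fills the var


init_program, main_program = Program(), Program()

# Declare the accumulator in both programs; initialize it only via init_program.
init_program.create_var("momentum_acc", persistable=True)
init_program.prepend_op("fill_constant", "momentum_acc", value=0.0)
main_program.create_var("momentum_acc", persistable=True)

exe = Executor()
exe.run(init_program)   # creates and fills the persistable var in root scope
exe.run(main_program)   # finds it already there; same var_name, no conflict
assert exe.root_scope["momentum_acc"] == 0.0
```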

jacquesqiao and others added 3 commits November 1, 2017 11:22
* Using the LayerHelper to create initialize operator and variables
* Polish Optimizer initialization code.
@@ -63,7 +67,7 @@ def _finish_update(self, block):
"""
pass

def _add_accumulator(self, block, name, param, dtype=None, fill_value=0.0):
Contributor

We should not remove dtype. The accumulator's type need not be the same as that of param, so dtype should be an argument to the function. There can be cases where we want integer accumulators for float parameters.

Collaborator

OK.

if dtype is None:
    dtype = param.data_type

@abhinavarora (Contributor) Nov 1, 2017

Not really. If dtype is None, then let it be float, which is the default when we create variables. dtype should be independent of the param type.

Member Author

OK, I will change this.

Member Author

done

Collaborator

> Not really. If dtype is None, then let it be float, which is the default when we create variables. dtype should be independent of the param type.

We cannot use float for every accumulator. If the parameter is double, the accumulator should be double.

Contributor

Yes, that sounds like a valid point. @jacquesqiao, could you please change it to what @reyoung has suggested? I apologize for the rework that you have to do.

Member Author

done
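
For reference, a minimal self-contained sketch of the dtype rule the thread converged on (plain Python, not the actual PaddlePaddle implementation; `FakeParam` and `resolve_accumulator_dtype` are illustrative names): dtype stays an explicit argument, so an accumulator can use a different type than its parameter, but when it is omitted it falls back to the parameter's own data type, so a float64 parameter gets a float64 accumulator rather than a float32 one.

```python
# Minimal sketch of the agreed dtype rule -- plain Python, NOT the actual
# PaddlePaddle implementation. `FakeParam` and `resolve_accumulator_dtype`
# are illustrative names.

class FakeParam(object):      # stand-in for a framework Parameter
    def __init__(self, name, shape, data_type):
        self.name = name
        self.shape = shape
        self.data_type = data_type   # e.g. "float32", "float64", "int64"


def resolve_accumulator_dtype(param, dtype=None):
    """Pick the dtype of an optimizer accumulator created for `param`."""
    if dtype is None:
        # Default to the parameter's own type (@reyoung's point): a float64
        # parameter must not silently get a float32 accumulator.
        return param.data_type
    # An explicit dtype still wins, e.g. an integer accumulator for a
    # floating-point parameter (@abhinavarora's example).
    return dtype


w = FakeParam("w", (784, 10), "float64")
assert resolve_accumulator_dtype(w) == "float64"         # default follows param
assert resolve_accumulator_dtype(w, "int64") == "int64"  # explicit override
```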

@abhinavarora (Contributor) left a comment

LGTM! Thank you.

@jacquesqiao merged commit f48159a into PaddlePaddle:develop on Nov 2, 2017

Successfully merging this pull request may close these issues.

Fix tensor initialization in Optimizers Python wrappers