[opt] Add regularization and Nesterov for merged_momentum op #37527

zhangbo9674 · 2021-11-24T13:17:46Z

PR types

Performance optimization

PR changes

OPs

Describe

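Enhance the merged_momentum op, including:

Support multiple learning rates (one per input parameter) instead of only a single lr;
Add the use_nesterov attribute to support the Nesterov update strategy;
Add regularization attributes to support regularization in the update.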
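To make the three features concrete, here is a minimal CPU sketch of one merged momentum step. It is not the kernel from this PR: `MomentumConfig` and `MergedMomentumStep` are hypothetical names, tensors are modeled as flat `std::vector<float>`, and the sketch assumes that a single shared learning rate is still accepted alongside one rate per parameter. The update follows the usual momentum formulas, with L2 decay folded into the gradient before the velocity update.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration only; names and layout are not Paddle's.
struct MomentumConfig {
  float mu = 0.9f;            // momentum coefficient
  bool use_nesterov = false;  // use the Nesterov form of the update
};

void MergedMomentumStep(std::vector<std::vector<float>>& params,
                        const std::vector<std::vector<float>>& grads,
                        std::vector<std::vector<float>>& velocities,
                        const std::vector<float>& lrs,
                        const std::vector<std::string>& reg_methods,
                        const std::vector<float>& reg_coeffs,
                        const MomentumConfig& cfg) {
  const std::size_t n = params.size();
  assert(grads.size() == n && velocities.size() == n);
  // Assumption: either one shared learning rate or one per parameter.
  assert(lrs.size() == 1 || lrs.size() == n);
  // Regularization settings are optional, but must match the params if given.
  assert(reg_methods.empty() || reg_methods.size() == n);
  assert(reg_coeffs.size() == reg_methods.size());

  for (std::size_t idx = 0; idx < n; ++idx) {
    const float lr = lrs.size() == 1 ? lrs[0] : lrs[idx];
    const bool l2 = !reg_methods.empty() && reg_methods[idx] == "l2_decay";
    const float coeff = l2 ? reg_coeffs[idx] : 0.0f;
    for (std::size_t i = 0; i < params[idx].size(); ++i) {
      float& p = params[idx][i];
      float& v = velocities[idx][i];
      // L2 decay adds coeff * param to the gradient.
      const float g = grads[idx][i] + coeff * p;
      v = cfg.mu * v + g;
      // Nesterov looks ahead by one extra momentum term.
      p -= cfg.use_nesterov ? (g + cfg.mu * v) * lr : lr * v;
    }
  }
}
```

Passing empty `reg_methods` and `reg_coeffs` disables regularization for every parameter, which mirrors the optional nature of the new attributes.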

paddle-bot-old · 2021-11-24T13:17:50Z

Thanks for your contribution!
Please wait for the CI results first. See Paddle CI Manual for details.

zhiqiu · 2021-11-26T03:08:50Z

paddle/fluid/operators/optimizers/merged_momentum_op.h

+                            "Attr(regularization_coeff) number must be equal "
+                            "to Input(Param) number."));
+    }
+    VLOG(1) << use_nesterov << regularization_methods.size()


Suggested change

VLOG(1) << use_nesterov << regularization_methods.size()

VLOG(5) << "use_nesterov: " << use_nesterov <<", regularization_methods.size(): " << regularization_methods.size()

zhiqiu · 2021-11-26T03:10:14Z

paddle/fluid/operators/optimizers/merged_momentum_op.h

+                            "to Input(Param) number."));
+      PADDLE_ENFORCE_EQ(n, regularization_coeffs.size(),
+                        platform::errors::InvalidArgument(
+                            "Attr(regularization_coeff) number must be equal "


The size of Attr(regularization_coeff) must be equal to the size of Input(Param), but got the size of Attr(regularization_coeff) is %d, the size of Input(Param) is %d

Please make the error message more informative, and do the same for the other checks.
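As a rough illustration of the suggestion, the sketch below builds a message that reports both sizes when the check fails. `CheckSameSize` is a hypothetical stand-in written with the standard library only; the real code would keep using `PADDLE_ENFORCE_EQ` with `platform::errors::InvalidArgument` and `%d` placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an enforce-style size check.
// Throws with a message that states the rule and both actual sizes.
void CheckSameSize(const std::string& attr_name, std::size_t attr_size,
                   std::size_t param_size) {
  if (attr_size == param_size) return;
  throw std::invalid_argument(
      "The size of Attr(" + attr_name +
      ") must be equal to the size of Input(Param), but got the size of Attr(" +
      attr_name + ") is " + std::to_string(attr_size) +
      ", the size of Input(Param) is " + std::to_string(param_size) + ".");
}
```

Reporting the actual values lets a user see at a glance which side of the comparison is wrong, without having to add logging and rerun.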

zhiqiu · 2021-11-26T03:11:40Z

paddle/fluid/operators/optimizers/merged_momentum_op.cc

@@ -68,6 +69,18 @@ class MergedMomentumOpMaker : public framework::OpProtoAndCheckerMaker {
        .AsDispensable()
        .AsDuplicable();
    AddAttr<float>("mu", "(float) Momentum coefficient");
+    AddAttr<bool>("use_nesterov",
+                  "(bool, default false) "
+                  "Use Nesterov Momentum")


Suggested change

"Use Nesterov Momentum")

"Use Nesterov Momentum or not")

zhiqiu · 2021-11-26T03:15:01Z

paddle/fluid/operators/optimizers/merged_momentum_op.h

-      PADDLE_LAUNCH_MERGED_MOMENTUM_KERNEL(false);
-    }
+      for (size_t idx = 0; idx < n; idx++) {
+        std::string regularization_method = " ";


Suggested change

std::string regularization_method = " ";

std::string regularization_method = "";

sneaxiy · 2021-11-29T16:51:37Z

paddle/fluid/operators/optimizers/merged_momentum_op.h


-#undef PADDLE_LAUNCH_MERGED_MOMENTUM_KERNEL
+        params_out[idx]->data<T>();
+        velocitys_out[idx]->data<MPType>();


What is the purpose of these two lines? Are they only there to check whether params_out[idx] and velocitys_out[idx] are properly initialized?

Thanks, this code has been deleted.

sneaxiy · 2021-11-29T17:04:16Z

paddle/fluid/operators/optimizers/merged_momentum_op.h

+        if (regularization_methods.size() != 0) {
+          regularization_method = regularization_methods[idx];
+        }
+        RegularizationType regularization_flag{RegularizationType::kNONE};


NIT, not required to change. The following might be simpler:

RegularizationType regularization_flag = regularization_methods.size() > 0 && regularization_methods[idx] == "l2_decay" ? RegularizationType::kL2DECAY : RegularizationType::kNONE;

Thanks, this code has been modified according to the comments.
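The one-expression form discussed above can be sketched as a small standalone function. The enum is redeclared locally so the snippet compiles on its own, and `GetRegularizationFlag` is a hypothetical helper name, not something from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Local copy of the enum for illustration; Paddle declares its own.
enum class RegularizationType { kNONE, kL2DECAY };

// Hypothetical helper: an empty list means "no regularization for any param".
RegularizationType GetRegularizationFlag(
    const std::vector<std::string>& regularization_methods, std::size_t idx) {
  return (!regularization_methods.empty() &&
          regularization_methods[idx] == "l2_decay")
             ? RegularizationType::kL2DECAY
             : RegularizationType::kNONE;
}
```

Because `&&` short-circuits, `regularization_methods[idx]` is never read when the list is empty, so no temporary string variable is needed.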

sneaxiy · 2021-11-29T17:05:08Z

paddle/fluid/operators/optimizers/merged_momentum_op.h

+            }
+          }
+        }
+      }


There seems to be a lot of code duplicated from momentum_op.h. Maybe we could use a common function defined in momentum_op.h?

I think this code already reuses the DenseMomentumFunctor defined in momentum_op.h.

zhiqiu

LGTM

Superjomn

LGTM

…ddle#37527) * add regularation and Nesterov for mergerd_momentum * refine unittest for use_nesterov attr * refine op check * refine code * fix bug * refine code of regularization_flag * delete useless code

add regularation and Nesterov for mergerd_momentum

d07149f

zhangbo9674 added 2 commits November 25, 2021 09:28

refine unittest for use_nesterov attr

f27e863

refine op check

e27ce50

zhiqiu reviewed Nov 26, 2021

View reviewed changes

zhangbo9674 added 2 commits November 26, 2021 09:13

refine code

ea8cb54

fix bug

c049ae0

sneaxiy reviewed Nov 29, 2021

View reviewed changes

zhangbo9674 added 2 commits November 30, 2021 03:41

refine code of regularization_flag

cf953a3

delete useless code

46e67d0

zhiqiu approved these changes Nov 30, 2021

View reviewed changes

Superjomn approved these changes Nov 30, 2021

View reviewed changes

zhiqiu merged commit c8ffdec into PaddlePaddle:develop Nov 30, 2021

zhangbo9674 mentioned this pull request Dec 1, 2021

Add multi_tensor for momentum optimizer and clear_grads #37564

Merged

zhangbo9674 deleted the dev/merge_momentum branch March 2, 2023 02:57
