Using go to implement parameter server side optimization #265

Open
typhoonzero opened this issue Aug 3, 2017 · 6 comments

@typhoonzero
Collaborator

typhoonzero commented Aug 3, 2017

There are two ways to accomplish this:

  1. Port the Eigen library to Go, then use Go to do the optimization.
    • This provides a general tensor library in Go that can be used elsewhere.
    • Since the pserver rarely uses the GPU, ops do not need to provide both CPU and GPU implementations.
  2. Port Paddle "operators" to Go, then use Go to do the optimization.
    • The ops can be used in both the pserver and the trainers, for both remote and local optimization.
    • This is a higher-level implementation.
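
To make the idea of a pserver-side optimization step in Go concrete, here is a minimal sketch of a plain SGD update on flat `float32` slices. The function name `sgdUpdate` and the slice-based parameter layout are assumptions made for illustration; they are not Paddle's API and say nothing about which of the two options is chosen.

```go
package main

import "fmt"

// sgdUpdate applies one plain SGD step in place: param[i] -= lr * grad[i].
// It is a hypothetical helper, not part of Paddle.
// It returns an error if the two slices differ in length.
func sgdUpdate(param, grad []float32, lr float32) error {
	if len(param) != len(grad) {
		return fmt.Errorf("length mismatch: param=%d grad=%d", len(param), len(grad))
	}
	for i := range param {
		param[i] -= lr * grad[i]
	}
	return nil
}

func main() {
	param := []float32{1, 2, 3}
	grad := []float32{0.5, 0.5, 0.5}
	if err := sgdUpdate(param, grad, 0.1); err != nil {
		panic(err)
	}
	fmt.Println(param)
}
```

A real optimizer would also need per-parameter state (momentum, learning-rate schedules) and support for more than one element type, which is where the choice between the two options starts to matter.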
@helinwang
Collaborator

helinwang commented Aug 3, 2017

Here are some initial thoughts:

This provides a general tensor library in Go that can be used elsewhere.
If we want a tensor library in Go, this native library could be an option: https://github.com/gonum/gonum

For option 2: the optimizer is currently implemented in C++, so I think it will be quite simple to compile the C++ operator implementation into the optimizer.

If we want to call Eigen directly from Go (frequently), we need to be careful about the cgo cost: https://www.cockroachlabs.com/blog/the-cost-and-complexity-of-cgo/

I will take a closer look tomorrow, and we can discuss it tomorrow night, US time.

@typhoonzero
Collaborator Author

typhoonzero commented Aug 3, 2017

It seems that https://golang.org/pkg/reflect/#MakeFunc can be used to build tensor operations that work across multiple types.

Edit:

It seems MakeFunc cannot do the arithmetic directly, because values of type reflect.Value cannot simply be added together. I will keep trying later.
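
One possible workaround, sketched here under assumptions rather than taken from the actual experiment: the function passed to `reflect.MakeFunc` can switch on the argument's `Kind` and go through the `Int()`/`Float()` accessors, since `reflect.Value` has no `+` operator. The helper name `makeAdd` is hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

// makeAdd fills the function variable pointed to by fptr with an "add"
// implementation built by reflect.MakeFunc. Because reflect.Value cannot
// be added directly, the body switches on Kind and converts through
// Int()/Float() before writing the result back.
func makeAdd(fptr interface{}) {
	fn := reflect.ValueOf(fptr).Elem()
	add := func(args []reflect.Value) []reflect.Value {
		a, b := args[0], args[1]
		out := reflect.New(a.Type()).Elem()
		switch a.Kind() {
		case reflect.Int, reflect.Int32, reflect.Int64:
			out.SetInt(a.Int() + b.Int())
		case reflect.Float32, reflect.Float64:
			out.SetFloat(a.Float() + b.Float())
		default:
			panic("unsupported kind: " + a.Kind().String())
		}
		return []reflect.Value{out}
	}
	fn.Set(reflect.MakeFunc(fn.Type(), add))
}

func main() {
	var addInt func(int, int) int
	var addF32 func(float32, float32) float32
	makeAdd(&addInt)
	makeAdd(&addF32)
	fmt.Println(addInt(2, 3), addF32(1.5, 2.25)) // prints "5 3.75"
}
```

Every call goes through reflection and a widening conversion to `int64` or `float64`, so this is likely to be far slower than a typed loop; it only shows that the approach is expressible.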

@typhoonzero
Collaborator Author

For option 2: the optimizer is currently implemented in C++, so I think it will be quite simple to compile the C++ operator implementation into the optimizer.

No. We must re-implement the optimizers as "Ops", which involves more complexity.

@typhoonzero
Collaborator Author

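I recently tried a few ways to port Eigen:

  1. Using reflect, which requires forced type conversions; this causes a fairly serious performance loss.
  2. Using "text/template" and go generate to generate code for multiple types; the templates are unreadable.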

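On cgo performance:
Frequent cgo calls hurt performance badly, and for a computation library frequent calls are unavoidable. Unlike a Python C extension, cgo does not bring a performance gain; it reduces performance instead. So porting Eigen through a cgo extension cannot achieve performance close to native Eigen.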

@helinwang
Collaborator

helinwang commented Aug 9, 2017

@typhoonzero

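Using reflect, which requires forced type conversions; this causes a fairly serious performance loss.
Using "text/template" and go generate to generate code for multiple types; the templates are unreadable.

I'm very curious how both of these were implemented. Could you post some sample code? (I think this could become a highlight in https://github.com/PaddlePaddle/blog/issues/1)

for a computation library frequent calls are unavoidable

I think it depends on what kind of computation library it is. For a low-level library like Eigen, calls may be very frequent. If a layer (or even an Op) is implemented in C++ and called from Go, perhaps the cgo overhead is negligible compared with the time spent on the computation?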
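
The granularity argument can be sketched without real cgo by counting simulated boundary crossings. Everything below is a stand-in: `foreignAdd` and `foreignAddVec` are ordinary Go functions that only pretend to be C calls, and the `crossings` counter is a hypothetical device for illustration, not a measurement.

```go
package main

import "fmt"

// crossings counts simulated Go-to-C boundary crossings. It is a stand-in
// for real cgo calls, which are not used in this sketch.
var crossings int

// foreignAdd stands in for a C function that adds two scalars.
func foreignAdd(a, b float32) float32 {
	crossings++
	return a + b
}

// foreignAddVec stands in for a C function (for example a whole Op)
// that adds two vectors in one call.
func foreignAddVec(dst, a, b []float32) {
	crossings++
	for i := range dst {
		dst[i] = a[i] + b[i]
	}
}

// fineGrained crosses the simulated boundary once per element.
func fineGrained(dst, a, b []float32) {
	for i := range dst {
		dst[i] = foreignAdd(a[i], b[i])
	}
}

// coarseGrained crosses the simulated boundary once per vector.
func coarseGrained(dst, a, b []float32) {
	foreignAddVec(dst, a, b)
}

func main() {
	n := 1000
	a, b, dst := make([]float32, n), make([]float32, n), make([]float32, n)

	crossings = 0
	fineGrained(dst, a, b)
	fmt.Println("fine-grained crossings:", crossings)

	crossings = 0
	coarseGrained(dst, a, b)
	fmt.Println("coarse-grained crossings:", crossings)
}
```

With a per-call cost that is fixed, the fine-grained version pays it once per element while the coarse-grained version pays it once per Op, so the overhead is amortized over the whole computation.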

@typhoonzero
Collaborator Author

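I'm very curious how both of these were implemented. Could you post some sample code? (I think this could become a highlight in PaddlePaddle/blog#1)

Sure. This is only half implemented so far; I can finish a demo case later and post it on the blog.

I think it depends on what kind of computation library it is. For a low-level library like Eigen, calls may be very frequent. If a layer (or even an Op) is implemented in C++ and called from Go, perhaps the cgo overhead is negligible compared with the time spent on the computation?

Right. If the Op or Layer is implemented in C++, most of the code still lives on the C++ side. This approach is similar to TensorFlow, with Go used only to describe the network configuration.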

@dzhwinter dzhwinter self-assigned this Aug 11, 2017