[Roadmap] v0.2 release checklist #302
M2C. Models:
Core system improvement:
Project improvement:
Others:
I think we should support an operator that does …
BTW, any action item for accelerating GAT?
@zheng-da Is it similar to the "sparse softmax"?
I see what you mean. How are we going to implement these operators: in DGL or in the backend? If we implement them in DGL, how do we support async computation in MXNet?
That would be the sparse softmax I proposed?
It seems that PyTorch operators can be implemented externally (/~https://github.com/rusty1s/pytorch_scatter), so putting that in the DGL repo should be fine. I don't know if/how external operators can hook into MXNet; would we have to compile MXNet from source? Also, I guess MXNet could implement these operators in their own repo regardless, since having these sparse operators should always be beneficial.
In terms of implementation, it's better to put them in DGL so they can be used with every framework. In general, we should follow each framework's guidance on implementing custom operators (such as this guide in PyTorch). We should avoid dependencies on the frameworks' C++ libraries. This leaves us a few choices, including: … In terms of async, is MXNet's CustomOp async or not?
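For concreteness, a minimal sketch of the Python-level extension pattern that the PyTorch custom-op guide describes. The op and its names here are hypothetical and only illustrate the shape of the extension point, not DGL's actual implementation:

```python
import torch

class SrcMulEdge(torch.autograd.Function):
    """Hypothetical op: multiply (already gathered) source-node
    features by edge features, with a hand-written backward pass."""

    @staticmethod
    def forward(ctx, src_feat, edge_feat):
        ctx.save_for_backward(src_feat, edge_feat)
        return src_feat * edge_feat

    @staticmethod
    def backward(ctx, grad_out):
        src_feat, edge_feat = ctx.saved_tensors
        # d(out)/d(src_feat) = edge_feat, d(out)/d(edge_feat) = src_feat
        return grad_out * edge_feat, grad_out * src_feat

# usage: out = SrcMulEdge.apply(src_feat, edge_feat)
```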
Is there any plan for …?
Previously, we discussed caching the results from the schedulers, which helps us avoid the expensive scheduling. I just realized that there is also a lot of data copying from CPU to GPU during the computation, even though we have already copied all the data in Frame to GPU. The copies happen on Index (I suppose an Index is always created on CPU first). Caching the scheduling results would also help avoid these CPU-to-GPU copies.
@zheng-da, agreed. This should be put on the roadmap.
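A minimal sketch of the caching idea under discussion; all names here are hypothetical, not DGL's actual scheduler API:

```python
# Sketch: memoize the schedule per (graph, message fn, reduce fn) so that
# repeated update_all calls on the same graph reuse the plan -- including
# any index tensors already moved to GPU -- instead of rebuilding them.
_schedule_cache = {}

def get_schedule(graph_key, msg_fn_name, reduce_fn_name, build_plan):
    key = (graph_key, msg_fn_name, reduce_fn_name)
    if key not in _schedule_cache:
        _schedule_cache[key] = build_plan()  # expensive; runs once per key
    return _schedule_cache[key]
```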
This is somewhat related to the sparse softmax proposed by @BarclayII. In my mind, there are two levels. The lower level is …
I assume we also need a "sparse softmax" kernel (similar to TF)? What I was thinking is to have …
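To make the kernel concrete, here is a rough sketch of an edge (sparse) softmax in plain PyTorch, normalizing over the incoming edges of each destination node. This is only an illustration of the semantics, not the proposed kernel itself, and it assumes PyTorch >= 1.12 for `scatter_reduce`:

```python
import torch

def edge_softmax(scores, dst, num_nodes):
    # scores: (E,) one logit per edge; dst: (E,) destination node ids.
    # Per-destination max for numerical stability.
    node_max = torch.full((num_nodes,), float('-inf')).scatter_reduce(
        0, dst, scores, reduce='amax', include_self=True)
    exp = (scores - node_max[dst]).exp()
    # Sum of exponentials per destination node, then normalize each edge.
    node_sum = torch.zeros(num_nodes).index_add_(0, dst, exp)
    return exp / node_sum[dst]

# e.g. two edges into node 0 and one into node 1:
# edge_softmax(torch.tensor([1.0, 2.0, 0.5]), torch.tensor([0, 0, 1]), 2)
# -> tensor([0.2689, 0.7311, 1.0000])
```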
We should add more MXNet tutorials on the website.
In terms of implementing the new operators, CustomOp in MXNet might not be a good way to do it; it's usually very slow. For performance, it's still best to implement them directly in the backend frameworks. At least we can do that in MXNet; not sure about PyTorch.
Do you know why it is slow? It might be a good chance to improve that part. We also need to benchmark PyTorch's custom op to see how much overhead it has. We should try our best to have them in DGL; otherwise, it will be really difficult to maintain them in every framework.
It calls Python code from C code. Because the operator is implemented in Python, its expressiveness is limited; implementing sparse softmax efficiently in Python is hard.
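For illustration, this is the shape of a Python CustomOp in MXNet. Every forward/backward call crosses the C++/Python boundary, which is the overhead mentioned above; the op itself is a toy (scale by 2), not a real sparse softmax:

```python
import mxnet as mx

class ScaleOp(mx.operator.CustomOp):
    def forward(self, is_train, req, in_data, out_data, aux):
        # This body runs as a Python callback from the C++ engine --
        # the round-trip is where the CustomOp overhead comes from.
        self.assign(out_data[0], req[0], 2 * in_data[0])

    def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
        self.assign(in_grad[0], req[0], 2 * out_grad[0])

@mx.operator.register("scale2x")
class ScaleOpProp(mx.operator.CustomOpProp):
    def __init__(self):
        super(ScaleOpProp, self).__init__(need_top_grad=True)

    def list_arguments(self):
        return ['data']

    def infer_shape(self, in_shape):
        # Output shape matches the input; no auxiliary states.
        return in_shape, [in_shape[0]], []

    def create_operator(self, ctx, shapes, dtypes):
        return ScaleOp()
```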
For sparse softmax I created a feature request in the MXNet repo: apache/mxnet#12729
Minor suggestion for project improvement:
graph_nets?
@alwem, could you elaborate?
@jermainewang Have you looked into the ideas in graph_nets? Some of their solutions seem good!
We did some investigation of graph_nets and found that DGL could cover all the models in it. Maybe we missed something; could you point it out?
Hi @Huangzhanpeng, thank you for the suggestion. It would be great if you could help contribute node2vec and GraphRNN to DGL. From my understanding, the random walk can be done in networkx first and then used in DGL (see the sketch below). GraphRNN is similar to the DGMG model (see our tutorials here) in that it is a generative model trained on a sequence of nodes/edges, so I guess there will be many shared building blocks between the two.
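For example, a bare-bones uniform random walk in networkx (i.e. node2vec with p = q = 1; the function name is just illustrative), whose output walks could then feed a skip-gram model:

```python
import random
import networkx as nx

def uniform_walks(g, walk_length, num_walks):
    """Collect num_walks uniform random walks from every node of g."""
    walks = []
    for _ in range(num_walks):
        for start in g.nodes():
            walk = [start]
            while len(walk) < walk_length:
                nbrs = list(g.neighbors(walk[-1]))
                if not nbrs:  # dead end: stop this walk early
                    break
                walk.append(random.choice(nbrs))
            walks.append(walk)
    return walks

walks = uniform_walks(nx.karate_club_graph(), walk_length=10, num_walks=5)
```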
@jermainewang Thank you for your response. In my actual work, node2vec's random walk with networkx does not scale to large graphs. If there is time, I would really like to try implementing GraphRNN in DGL.
@Huangzhanpeng There is always time :). Please go ahead. If you encounter any problems during the implementation, feel free to raise questions on https://discuss.dgl.ai; the team is very responsive. About the random walk: @BarclayII is surveying the common random walk algorithms, and we might include APIs for them in our next release.
Just updated the roadmap with a checklist. Our tentative date for this release is 02/28 (this month). For all committers @zheng-da @szha @BarclayII @VoVAllen @ylfdq1118 @yzh119 @GaiYu0 @mufeili @aksnzhy @zzhang-cn @ZiyueHuang, please vote +1 if you agree with this plan.
I would rather reply with an emoji; a +1 reply would pollute the thread.
The release plan has passed the vote.
May I kindly ask whether there is an updated tentative date for the 0.2 release? I'm desperately waiting for some features and unfortunately cannot build DGL from source on the server. Thanks for your efforts!
@lgalke Thanks for asking. Our release has been delayed by a week due to some performance issues found recently. We are waiting for the final PR (#434) to be merged, so you can expect a new release in two days! It's our first major release since open-sourcing, so we are still adapting to the release process. Thank you for your patience.
v0.2 has just been officially released. Thanks everyone for the support! |
Thanks everyone for the hard work. We really did a lot for a smooth beta release. With the repo now open and more community help incoming, it is a good time to figure out the roadmap to the v0.2 release. Here is a draft proposal; feel free to reply, comment, and discuss. Note that the list is long, but we can figure out priorities later. We'd like to hear your opinions and push DGL to the next stage.
Model examples
Core system improvement
Tutorial/Blog
Project improvement
Deferred goals
(will not be included in this release unless someone takes over)
src_mul_edge
src_mul_dst