This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Feature request: share memory between numpy.array and mxnet.ndarray #14244

Closed
SunDoge opened this issue Feb 24, 2019 · 17 comments

Comments

@SunDoge

SunDoge commented Feb 24, 2019

Last time I did a benchmark and @szha pointed out that

MXNet made the choice of not doing a zero-copy for numpy arrays, but instead making a copy of the numpy data.

This is a safe choice, but it can sometimes be a performance problem. I checked the C API and found no function for sharing data that is owned outside MXNet. Is it possible for MXNet to provide such an API?
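
To make the copy versus zero-copy distinction concrete, here is a minimal NumPy-only sketch. It does not call MXNet; it only shows what "sharing memory" means at the buffer level. `np.array` copies by default, `np.asarray` on an existing ndarray does not, and `np.shares_memory` reports which is which. The variable names are illustrative.

```python
import numpy as np

src = np.arange(6, dtype=np.float32)

copied = np.array(src)      # default copy=True: allocates a new buffer
aliased = np.asarray(src)   # no copy for an ndarray with a matching dtype

# Writes to the source are visible only through the aliasing array.
src[0] = 42.0

copy_shares = np.shares_memory(src, copied)
alias_shares = np.shares_memory(src, aliased)
```

The copy is safe because later writes to `src` cannot affect it; the alias avoids the allocation and memcpy, at the cost of coupling the two objects' lifetimes and contents.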

@junrushao
Member

junrushao commented Feb 26, 2019

I will take a look this weekend.

@junrushao
Member

junrushao commented Mar 5, 2019

Hey, I did some research this weekend and am convinced it might be possible. Is it acceptable to assume we are using Cython (instead of ctypes), @szha?

@szha
Member

szha commented Mar 5, 2019

@junrushao1994 could you elaborate on the consideration?

@junrushao
Member

The point is that we need to transfer or share ownership of numpy's ndarray with MXNet's C++ backend, because we cannot guarantee that the frontend object will exist forever.

Therefore, we need a customized deleter for MXNet's NDArray, i.e. something that calls Py_DECREF on the numpy ndarray object from the C++ backend. It is possible to implement the deleter via ctypes or Cython, and then pass it to something roughly like the following chunk of code:

https://github.com/apache/incubator-mxnet/blob/0f88f61379bd5f59fff6b825be1507d020bf2b7e/include/mxnet/ndarray.h#L131-L148

Per private discussion with @reminisce, his concern is how this could be compatible with the executor and memory planning, in which the executor may take over the ownership. I am not sure about this.
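
A minimal sketch of the deleter idea described in this comment, using ctypes only. It is not MXNet code: `borrow`, `release`, and `DELETER` are hypothetical names, the "backend" is simulated in Python, and using `id(arr)` as the object address is a CPython-specific assumption. The only real APIs used are `ctypes.pythonapi.Py_IncRef` / `Py_DecRef`, `ctypes.CFUNCTYPE`, and NumPy's `ndarray.ctypes.data`.

```python
import ctypes
import weakref

import numpy as np

# Function forms of Py_INCREF / Py_DECREF exported by the CPython runtime.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None

# Hypothetical deleter signature a backend might accept: void (*)(void *handle)
DELETER = ctypes.CFUNCTYPE(None, ctypes.c_void_p)


def borrow(arr):
    """Take a strong reference on behalf of the backend.

    Returns the raw data pointer plus an opaque handle (the object's
    address, CPython-specific) to hand back to the deleter later.
    """
    _incref(arr)
    return arr.ctypes.data, id(arr)


@DELETER
def release(handle):
    """Deleter: drop the reference that borrow() took."""
    obj = ctypes.cast(handle, ctypes.py_object).value
    _decref(obj)


arr = np.arange(4.0)
alive = weakref.ref(arr)
data_ptr, handle = borrow(arr)

del arr  # the frontend name goes away; the borrowed reference remains
survived_del = alive() is not None

# The buffer is still valid, so the "backend" can read it by address.
value_via_ptr = (ctypes.c_double * 4).from_address(data_ptr)[2]

release(handle)  # the backend is done; the array can now be freed
freed_after_release = alive() is None
```

In a real integration the backend would store `release` and `handle` next to the data pointer and invoke the callback when its own tensor is destroyed; the Python side must also keep the `release` callback object alive for as long as the backend may call it.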

@wkcn
Member

wkcn commented Mar 9, 2019

For sharing from numpy to NDArray, we should use ctypes or the weakref module to add a reference to the numpy object, and drop that reference through NDArray::deleter.

For sharing from NDArray to numpy, I think we can add a deleter attribute to the numpy object.

https://docs.scipy.org/doc/numpy/user/basics.subclassing.html?highlight=deleter
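
One way to picture the NDArray-to-numpy direction, as a sketch under stated assumptions rather than the approach this thread settled on: wrap an externally owned buffer without copying and attach a cleanup callback that runs when the NumPy view is collected. Here a ctypes array stands in for backend memory, and `weakref.finalize` plays the role of the "deleter"; the `released` list is purely illustrative.

```python
import ctypes
import weakref

import numpy as np

released = []

# Stand-in for memory owned by a backend (assumption: a plain ctypes array).
buf = (ctypes.c_float * 3)(1.0, 2.0, 3.0)

# Zero-copy view over the buffer via the buffer protocol.
view = np.frombuffer(buf, dtype=np.float32)

# "Deleter": called once the view object is garbage collected.
weakref.finalize(view, released.append, "backend buffer released")

view[0] = 10.0                      # write through the NumPy view...
write_visible = buf[0] == 10.0      # ...and observe it in the owner's buffer

not_released_yet = released == []
del view                            # last reference gone, finalizer runs
```

The callback here only records a message; in a real binding it would tell the backend that Python no longer needs the buffer.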

@junrushao
Member

Agreed with @wkcn

@junrushao
Member

junrushao commented Mar 27, 2019

I prototyped a version that supports zero copy from numpy to DLManagedTensor in dlpack 0.2, but it turns out that MXNet doesn't support dlpack 0.2 yet...
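
For readers unfamiliar with DLPack-based sharing, here is a small illustration using NumPy's own DLPack support. This is not the prototype mentioned above: it assumes a NumPy version that provides `np.from_dlpack` (1.22 or later), which postdates this thread, and it round-trips within NumPy only.

```python
import numpy as np

a = np.arange(4, dtype=np.float32)

# from_dlpack consumes the capsule produced by a.__dlpack__();
# the resulting array aliases a's buffer instead of copying it.
b = np.from_dlpack(a)

b_shares = np.shares_memory(a, b)

# A write through the original is visible through the imported array.
a[1] = 99.0
```

The capsule carries a deleter alongside the data pointer, which is the same ownership hand-off discussed earlier in this thread.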

@wkcn
Member

wkcn commented Mar 28, 2019

We can update the submodule dlpack.

@junrushao
Member

@wkcn I opened a PR for this: dmlc/dlpack#38.

@junrushao
Member

junrushao commented Mar 28, 2019

@wkcn Sorry I made a mistake. @reminisce reminded me that we already got dlpack 0.2 in MXNet, so everything should be fine

@wkcn
Member

wkcn commented Mar 28, 2019

In the submodule DLPack, DLPACK_VERSION is 010 rather than 020.

@junrushao
Member

@wkcn this is because dlpack forgot to change its version number to 020 when releasing v0.2 lol. I checked the commit hash here, and it is identical to tag v0.2.

@junrushao
Member

I was lazy for a while, but now I am thinking of adding an API to MXNet. @wkcn @szha @SunDoge What do you guys think of names for this API? mx.nd.zerocopy_from() or anything else?

@wkcn
Member

wkcn commented Apr 8, 2019

I think it is suitable to add a new argument.

NDArray to NumPy

a = mx.nd.array([1,2,3])
b = a.asnumpy(shared=True)

NumPy to NDArray

c = np.array([4,5])
d = mx.nd.array(c, shared=True)

@reminisce
Contributor

I think adding a parameter to the existing API may introduce a certain level of ambiguity, which is not desirable. For example, mx.nd.array accepts more than just numpy arrays as arguments, but shared=True would only work for numpy arrays. We can consider adding this parameter to a new API instead, for example mx.nd.from_numpy(zero_copy=True), which falls back to mx.nd.array when zero_copy is False.
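
The shape of the proposed API can be sketched without MXNet. `from_numpy_sketch` below is a hypothetical NumPy-only stand-in, not the MXNet function: it returns another NumPy array that either aliases or copies the input. Rejecting non-ndarray inputs reflects the ambiguity concern above; the C-contiguity check is an added assumption about what a backend would need.

```python
import numpy as np


def from_numpy_sketch(arr, zero_copy=True):
    """Hypothetical stand-in for the proposed API (NOT MXNet code)."""
    if not isinstance(arr, np.ndarray):
        # A dedicated entry point can reject lists, scalars, etc. outright.
        raise TypeError("from_numpy_sketch only accepts numpy.ndarray")
    if not zero_copy:
        return arr.copy()  # fallback path: plain copy
    if not arr.flags["C_CONTIGUOUS"]:
        # Assumption: sharing requires a dense, row-major buffer.
        raise ValueError("zero-copy sharing needs a C-contiguous array")
    return arr.view()      # aliases the same memory


c = np.array([4, 5])
d = from_numpy_sketch(c)                    # shares memory with c
e = from_numpy_sketch(c, zero_copy=False)   # independent copy

c[0] = 7  # visible through d, not through e
```

Keeping the flag on a numpy-specific function means its meaning is never undefined for other input types.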

@junrushao
Member

@reminisce Sounds good. Will do!

@SunDoge
Author

SunDoge commented Jun 4, 2019

Thx.


5 participants