-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Fix memory leak for size-zero ndarray #14365
Conversation
@mxnet-label-bot update [pr-work-in-progress] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution!
LGTM: )
@yuxihu Can you look at the failing checks. LGTM |
Yes, I am looking into the test failures. |
Very nice catch @yuxihu ! |
2248647
to
142ddcf
Compare
Left few nit picky comments |
@eric-haibin-lin please help review. @mxnet-label-bot update [pr-awaiting-review] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still around size 0 to page size?
Yes. I am not sure the reason behind the logic so I do not change it in this PR. We also use aligned alloc in CPU which allocates 16/64 bytes when size is 0. |
@eric-haibin-lin It is ready for final review. |
This reverts commit 3ab1dec.
* free memory for size zero storage handle * skip adding nullptr into GPU memory pool * set context for aux handle * set context for aux handle once it is created
* free memory for size zero storage handle * skip adding nullptr into GPU memory pool * set context for aux handle * set context for aux handle once it is created
* free memory for size zero storage handle * skip adding nullptr into GPU memory pool * set context for aux handle * set context for aux handle once it is created
* free memory for size zero storage handle * skip adding nullptr into GPU memory pool * set context for aux handle * set context for aux handle once it is created
Fixes #13951 Fixes #14358
For size-zero ndarray (e.g. mx.nd.array([]), mx.nd.ones(0)), the storage handle size is 0. Currently we only free handles which size is larger than 0. This leads to memory leak for size-zero ndarray.
In this PR, we remove the check on storage handle size which was used to decide if we need to free a storage handle. After relaxing the check, we need to make sure nullptr is not reused in pooled storage manager and the context for aux handle is correctly set for sparse ndarray.
With this PR, the memory leak issues mentioned above are fixed.