-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Conversation
@@ -165,6 +166,46 @@ def test_multiple_waitalls(): | |||
assert caught, "No exception thrown" | |||
mx.nd.waitall() | |||
|
|||
@with_seed() | |||
def test_exc_profiler(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems a bit heavy for a unittest, I think passing some nd.ones() through a simple dense layer would do the same right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have simplified it. Please take a look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR LGTM. But did you try to investigate why this happened? Can a handle be double freed such that the counter is incorrect? I noticed that we are passing handle by value when calling Free() which is not ideal as the caller might still hold the handle dptr and the handle size is not zeroed.
@yuxihu Good point I didnt investigate whether handle is getting double freed. I will check the same, though this PR doesn't have to block on it. |
Yes, it is good to have the logic implemented in this PR anyway. The investigation can be done separately. |
Description
After adding the waitall PR the nightlies started failing and was reported here : #14397 (comment)
This was because waitall rethrowing exceptions that were hidden earlier.
Seems like operator-= for ProfileCounter is called during frees which checks if the amount to be freed is greater than alloced. Added a check before operator is called.
More permanent fix would be to understand why memory pool counters are incorrect when free is called but this PR should stop the profiler nightly test failures.
@ThomasDelteil
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments