Conversation
Please remove commented out code |
Thanks for the quick fix! I left 1 comment. Rest LGTM! Post complete test run for test_large_array.py here, then it should be good to go. |
8d3ea39 to fd69dfb
1399bbd to 3a4025f
The run gets killed once it reaches the maximum memory limit (480G/480G) of the p3 instance.
|
@anirudh2290 can you review and merge this |
A few tests are failing after you resumed running. Why?
|
Tests that give errors when run together pass when run individually.
I'm guessing this has to do with memory overflow (not sure, though), but we've been seeing it for quite some time.
|
I assume you're testing with the teardown for now. But before making the PR final, I'd appreciate it if you could elaborate. Feel free to ping me once you've figured it out. |
@marcoabreu: we need the teardown to free the memory allocated by each test once it finishes, which wasn't happening before. These tests require a lot of memory, and running all of them in a single execution caused an out-of-memory error because memory was not being freed in time (as observed using |
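As a minimal sketch of the teardown pattern described in this comment: the helper below runs a test and always runs a cleanup step afterwards, whether the test passes or fails. The names `run_with_teardown`, `passing_test`, and `release_memory` are hypothetical and exist only for illustration. In the PR the cleanup step is the pair of calls `mx.nd.waitall()` and `mx.cpu().empty_cache()`; a plain callback stands in for them here so the example runs without MXNet.

```python
def run_with_teardown(test_fn, teardown_fn):
    """Run test_fn and always call teardown_fn afterwards.

    Any exception raised by the test propagates to the caller, so a
    failing test still fails; the cleanup runs in both cases.
    """
    try:
        test_fn()
    finally:
        # Assumption: in the PR this is where mx.nd.waitall() and
        # mx.cpu().empty_cache() run; here it is a generic callback.
        teardown_fn()


# Usage with stand-in callables (hypothetical, no MXNet required).
calls = []


def passing_test():
    calls.append("test")


def release_memory():
    calls.append("teardown")


run_with_teardown(passing_test, release_memory)
print(calls)  # ['test', 'teardown']
```

Putting the cleanup in `finally` means a failing test cannot skip it, which matters when the next test needs the memory the previous one held.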
Is there an issue for the disabled nightly large tensor tests? If not, can you please open an issue for these disabled tests. |
Also, why does the title still say WIP? |
        raise
    finally:
        mx.nd.waitall()
        mx.cpu().empty_cache()
Are the tests only for the CPU context?
Yes. We don't test on GPU (as large tensors are not supported on GPU).
* fix activation
* remove comments
* fix copy_to
* fix lint, remove redundant function, fix shape sizes for random functions
* fix sigmoid issue
* fix leaky relu
* fix random shuffle
* fix pooling
* fix dropout
* fix index copy
* add teardown and fix lint
* post test cleanup
* removed decorator since it needs C API for CPU memory release
Description
A very large value that exceeds CPU memory gives an error.
Hence the value was made just greater than 2**32.
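To make the sizing argument concrete, here is a rough back-of-the-envelope sketch. It assumes 4-byte (float32) elements, which the description does not state, and the names `INDEX_LIMIT`, `BYTES_PER_ELEMENT`, and `footprint_gib` are hypothetical. It only illustrates why a size just above 2**32 is enough to exercise large-tensor indexing while staying far below the host's memory limit.

```python
INDEX_LIMIT = 2**32        # number of distinct values of a 32-bit index
BYTES_PER_ELEMENT = 4      # assumption: float32 elements


def footprint_gib(num_elements, bytes_per_element=BYTES_PER_ELEMENT):
    """Return the memory needed for num_elements, in GiB."""
    return num_elements * bytes_per_element / 2**30


# Hypothetical element count just past the 32-bit boundary.
just_over = INDEX_LIMIT + 1

print(just_over > INDEX_LIMIT)     # True
print(footprint_gib(INDEX_LIMIT))  # 16.0
```

Under these assumptions a single array of that size needs roughly 16 GiB, so one such array fits comfortably, whereas a much larger value can exceed what the machine has available.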