TBlob.get_with_shape: new and old shape do not match total elements #2286
Comments
This is the repository for TuriCreate, not for MXNet. Did you mean to create an issue for MXNet? I'm going to close this issue. Feel free to clarify how this is a TuriCreate issue and reopen.
No, that is not the problem. TuriCreate requires MXNet 1.1.0, which has the issue. It's not compatible with the fixed version of MXNet, because that version needs a new numpy and tensor. Can TuriCreate be updated to not require such outdated frameworks? I don't see why you'd close the issue without even considering the context. TuriCreate is crippled by MXNet, and you should change your requirements or else no one will be able to create big models. I can't use the audio classifier past ~60K items before it dies.
@rplom - What OS and Python version are you using? Could you share the Python stack trace and your code?
@TobyRoseman I opened the other ticket about the requirements. I have Python 3.6.5. But the thing is, unless you can build a custom MXNet with the USE_INT64_TENSOR_SIZE flag (which I will do locally), you don't get the fixed 64-bit integer stuff. If I build that, I'm on MXNet 1.5, and then it's the classic pulling of a dependency thread. Thanks for looking at this. I have something pretty amazing I'm building with Turi and am pushing the boundaries a bit.
I'm happy to share the code privately with you, but I don't want to post it here.
My tensor size is now over 4 billion and mxnet can't handle it. I built the new branch of MXNet and tried it, but by pulling that in I needed a new numpy and tensorflow. Turi kind of works with the new libraries but some accuracy seems lost.
File "/Users/creator/venv/lib/python3.6/site-packages/mxnet/base.py", line 146, in check_call
raise MXNetError(py_str(LIB.MXGetLastError()))
mxnet.base.MXNetError: [05:01:12] include/mxnet/./tensor_blob.h:276: Check failed: this->shape.Size() == shape.Size() (4304252928 vs. 9285632) TBlob.get_with_shape: new and old shape do not match total elements
Stack trace returned 8 entries:
[bt] (0) 0 libmxnet.so 0x000000014c7a17cf libmxnet.so + 59343
[bt] (1) 1 libmxnet.so 0x000000014c7a156f libmxnet.so + 58735
[bt] (2) 2 libmxnet.so 0x000000014c7c3249 libmxnet.so + 197193
[bt] (3) 3 libmxnet.so 0x000000014d89f01e MXNDListFree + 1430366
[bt] (4) 4 libmxnet.so 0x000000014d886f95 MXNDListFree + 1331925
[bt] (5) 5 libmxnet.so 0x000000014d7115fd MXNDArraySyncCopyFromCPU + 13
[bt] (6) 6 _ctypes.cpython-36m-darwin.so 0x0000000112d00237 ffi_call_unix64 + 79
[bt] (7) 7 ??? 0x00007ffee14b78c0 0x0 + 140732678240448
Large Tensor Support
Before MXNet 1.5.0, MXNet supported a maximal tensor size of around 4 billion (2³²). This was due to uint32_t being used as the default data type for tensor size, as well as for variable indexing. Now you can enable large tensor support by changing the following build flag to 1: USE_INT64_TENSOR_SIZE = 1 (note that this is set to 0 by default). This enables large-scale training, for example large graph network training using Deep Graph Library.
see: apache/mxnet#9207
https://medium.com/apache-mxnet/apache-mxnet-1-5-0-release-is-now-available-4138f5233401
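The two numbers in the "Check failed" message above look consistent with this 32-bit limit: the smaller value is exactly the larger one reduced modulo 2³². The sketch below just checks that arithmetic. It does not call MXNet, and `wrapped_size` and `fits_in_uint32` are hypothetical helper names used only for illustration. That the mismatch is caused by a uint32 wraparound is an inference from the numbers, not something the error message itself states.

```python
from functools import reduce
from operator import mul

UINT32_MODULUS = 2 ** 32  # number of distinct values an unsigned 32-bit integer can hold


def wrapped_size(num_elements):
    """Return the element count as it would read if truncated to 32 unsigned bits."""
    return num_elements % UINT32_MODULUS


def fits_in_uint32(shape):
    """Hypothetical pre-check: does the total element count of `shape` fit in uint32?"""
    total = reduce(mul, shape, 1)
    return total < UINT32_MODULUS


actual = 4304252928    # left-hand value in the "Check failed" message
reported = 9285632     # right-hand value in the same message

# The difference between the two values is exactly 2**32.
print(actual - reported == UINT32_MODULUS)
# Truncating the real size to 32 bits reproduces the smaller value.
print(wrapped_size(actual) == reported)
```

A check like `fits_in_uint32` could be run on an array's shape before handing it to a build without USE_INT64_TENSOR_SIZE, to fail early with a clearer message.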