Too large max depth value in _recursive_fork_recordio #12619
Comments
Thanks for submitting the issue @caiqi |
see #12622 |
@zhreshold I changed the code you committed, but the error still exists |
@Angzz What OS? Can you print this for me to debug?
|
@zhreshold |
Okay, I modified the search depth to be less aggressive. |
@zhreshold OK, I will update to the MXNet pre-release version and run an experiment, thanks |
After updating to 1.3.1b20180925, an error occurs when training SSD on COCO, but VOC trains normally: ---------------- train log and error log ------------------ INFO:root:Start training from [Epoch 0] |
@Angzz Would disabling these lines help? /~https://github.com/apache/incubator-mxnet/blob/29ac19124555ca838f5f3a01da638eda221b07b2/python/mxnet/gluon/data/dataloader.py#L181-L183 Are you using RecordFiles? If not, it has nothing to do with JPEG images. |
@zhreshold Sorry, I don't understand why these lines should be deleted. If they are deleted, won't the recursive mechanism stop working? I do not use the |
After training to epoch 13 on COCO, another error occurs:
[13:44:22] src/resource.cc:262: Ignore CUDA Error
[13:44:22] src/storage/storage.cc:65: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading CUDA: initialization error
Stack trace returned 10 entries:
[13:44:22] src/engine/threaded_engine_perdevice.cc:99: Ignore CUDA Error
[13:44:22] /home/travis/build/dmlc/mxnet-distro/mxnet-build/3rdparty/mshadow/mshadow/./tensor_gpu-inl.h:35: Check failed: e == cudaSuccess CUDA: initialization error
Stack trace returned 10 entries:
[13:44:22] src/resource.cc:262: Ignore CUDA Error
[13:44:22] src/storage/storage.cc:65: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading CUDA: initialization error
Stack trace returned 10 entries:
terminate called after throwing an instance of 'std::system_error' |
Finally, I solved this problem by following this link: |
@Angzz Not sure why, maybe it is Python-related. However, it is not relevant to this thread. I am going to close this issue. Let me know if anyone is still getting the same original recursion error. |
Hi, I am getting a very similar error:
I am using the latest I have a custom class, that inherits from Just applying the transforms sequentially works fine, but Composing them results in a RecursionError. |
Reopening this issue since it looks like we have a public example now in the lipnet code that can be used to figure out what's going on... |
@RuRo has your problem been solved? |
It seems that 1000 is too large for _recursive_fork_recordio in
/~https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/data/dataloader.py#L178
Even if len(obj.__dict__.items()) > 2, this function will be called more than 2 ** 1000 times.
The following code in /~https://github.com/dmlc/gluon-cv/blob/master/scripts/detection/ssd/train_ssd.py#L96 in gluon-cv will cause
RecursionError: maximum recursion depth exceeded in comparison
error on Windows 10 with the latest build. I found that the reason is that the dataset object contains a HybridSequential object, and the HybridSequential has many children. This function was introduced in commit #12554. Is it OK to return early from this function when obj is not an instance of mx.gluon.data.dataset.Dataset?
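
For anyone trying to reproduce the idea without MXNet installed, here is a minimal sketch of one way a depth-capped attribute walk can still hit RecursionError, and how a smaller cap or a type guard avoids it. It is not the actual _recursive_fork_recordio code: FakeRecordFile, FakeDataset, FakeBlock, walk and the only_types parameter are all hypothetical stand-ins, and the reference cycle between the two blocks is an assumption about what the object graph might look like.

```python
import sys


class FakeRecordFile:
    """Hypothetical stand-in for a record-file handle that must be reopened after a fork."""

    def __init__(self):
        self.reopened = 0

    def reopen(self):
        self.reopened += 1


class FakeDataset:
    """Hypothetical stand-in for a dataset class."""


class FakeBlock:
    """Hypothetical stand-in for a network block whose child points back at its parent."""

    def __init__(self):
        self.child = None


def walk(obj, depth, max_depth, only_types=None):
    """Visit attributes recursively and reopen every FakeRecordFile that is found."""
    if depth >= max_depth:
        return
    if isinstance(obj, FakeRecordFile):
        obj.reopen()
        return
    # Optional guard: do not descend into objects of unrelated types.
    if only_types is not None and not isinstance(obj, only_types):
        return
    if hasattr(obj, "__dict__"):
        for value in vars(obj).values():
            walk(value, depth + 1, max_depth, only_types)


# A dataset that holds a record file and a block with a reference cycle (assumed).
parent, child = FakeBlock(), FakeBlock()
parent.child, child.child = child, parent
dataset = FakeDataset()
dataset.record = FakeRecordFile()
dataset.transform = parent

# A depth cap above the interpreter's recursion limit never gets a chance to fire.
try:
    walk(dataset, 0, max_depth=sys.getrecursionlimit() + 100)
    deep_walk_failed = False
except RecursionError:
    deep_walk_failed = True

# A small cap, or a type guard, keeps the walk shallow.
walk(dataset, 0, max_depth=4)
walk(dataset, 0, max_depth=sys.getrecursionlimit() + 100, only_types=(FakeDataset,))
print(deep_walk_failed, dataset.record.reopened)
```

The type guard corresponds to the question above about returning early for non-Dataset objects. Whether that is safe in MXNet depends on where record files can actually be nested, which this sketch does not model.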