While refactoring Paddle code to use the unsupported Tensor module of Eigen3, I ran into a problem: with GCC 5.4, a build with -O1 works, but a build with -O2 causes a segmentation fault.
There is a bug in the unsupported Tensor module: m_devicePropInitialized and m_deviceProperties are defined as static variables in a header file. Thanks to @hedaoyuan for helping to debug and verify this.
If the TensorDeviceCuda.h header file is included by several .cc files, each .cc file gets its own copy of m_devicePropInitialized and m_deviceProperties, so their values can differ between files. The copy in the first file may be true (the device properties have been initialized) while the copy in the second file is still false. Reading m_deviceProperties[0] in that second file then causes a segmentation fault, because m_deviceProperties is still nullptr there.
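The per-file copies described above can be reproduced without CUDA or Eigen. The following is only a minimal sketch: the macro DEVICE_PROP_HEADER, the namespaces tu_a and tu_b, and all variable and function names are hypothetical stand-ins, not Eigen code. The macro plays the role of the header's contents, and each namespace plays the role of one .cc file that includes it.

```cpp
#include <cassert>

// Hypothetical stand-in for the contents of a header that defines static
// variables. Every expansion creates a separate set of variables, just as
// every .cc file that includes such a header gets its own copies.
#define DEVICE_PROP_HEADER                 \
  static bool initialized = false;         \
  static int* properties = nullptr;        \
  static void initialize_properties() {    \
    if (!initialized) {                    \
      static int storage[1] = {42};        \
      properties = storage;                \
      initialized = true;                  \
    }                                      \
  }

// Each namespace simulates one translation unit including the header.
namespace tu_a { DEVICE_PROP_HEADER }
namespace tu_b { DEVICE_PROP_HEADER }

// Initializes only tu_a's copy, then reports whether tu_b's copy is still
// untouched. Dereferencing tu_b::properties at this point would be the
// null-pointer access that shows up as a segmentation fault.
bool second_copy_stays_uninitialized() {
  tu_a::initialize_properties();
  return tu_a::initialized && tu_a::properties != nullptr &&
         !tu_b::initialized && tu_b::properties == nullptr;
}

// Convenience check that can be called from any entry point.
void check_static_copies() {
  assert(second_copy_stays_uninitialized());
}
```

Calling check_static_copies() from a main function passes, which shows that initializing one copy does not affect the other. A common way to avoid this class of bug is to keep a single definition in one .cc file and declare it extern in the header, though whether that fits Eigen's header-only design is a separate question.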
I found that TensorFlow also uses the Tensor module of Eigen3 but does not hit this error. TensorFlow implements its own EigenCudaStreamDevice. Interestingly, the constructor of EigenCudaStreamDevice takes no CUDA stream; the stream is passed in later, through Reinitialize.
So I implemented an EigenCudaStreamDevice class in the same way TensorFlow does. I also set the GCC version to 5.4 and compile in Release mode.
I will look into TensorFlow and Eigen3 further.