Revert "Improve FC perf when no_bias=False" #15099
This reverts commit 6cf964a.
cc @anirudh2290 |
Really weird! The CI check passed even on my PR. So if the CI check doesn't fail, does that tell us much? |
We certainly need to find the root cause on why CI wasn't reliable in this case. (e.g. could it have been just a cache issue?) |
The root cause seems to be not enough shared memory in the docker container used in CI. Using fixes the build. We have some files which are very slow to compile and produce big binaries; I think it is an excess of inlining. |
Hmm, if this was required, how did the original PR pass CI check? |
I was able to reproduce this with make with on my local machine with command:
Note: I was able to reproduce this only with a clean build after make clean. For example, after building on top of the previous commit I wasn't able to reproduce it. |
@stu1130 has tried with ~2G and failed. It also seems 8G in #15100 failed.
We tried with 500m before #15033 and everything worked fine. Everything after this commit failed with 500m.
I would prefer to merge this and keep #15084 open until we find root cause. |
@roywei sounds good to me. |
Is there a next step for those of us with PRs that failed CI due to this issue? |
@aaronmarkham Rebase and retrigger CI and it should be fine. |
We need to investigate what's going on, I thought it passed for me at 8G but then my PR validation failed. Could be a linker bug or an offset linking bigger than 2^32, it's still not 100% clear to me. |
I'd suggest an strace dump of the linker. |
Just failed with 16G for me. |
Reverts #15033
@stu1130 and I are running a binary search on the commits that may cause #15084
This seems to be the one. The reason for the CI build failure is still unknown, as it only happens in CI environments; building locally with GPU + MKLDNN is fine on master.
I'm creating this PR to test whether the CI check can pass.
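
For anyone who wants to repeat the binary search over commits mentioned above, here is a minimal sketch of the idea. It is not the script that was actually used: the function name `first_bad_commit`, the `is_bad` callback, and the commit labels are all hypothetical. It assumes the commits are ordered oldest to newest, that the newest one fails, and that once a commit fails every later commit fails too. In practice `is_bad` would check out the commit and run a clean build, since the failure was only reproducible after make clean.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is True.

    Assumptions (not taken from the thread): commits are ordered oldest
    to newest, the last commit is bad, and badness is monotonic, i.e.
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # culprit is at mid or earlier
        else:
            lo = mid + 1      # culprit is strictly after mid
    return commits[lo]


# Hypothetical history: placeholder labels, not real commit hashes.
history = ["a1", "b2", "c3", "d4", "e5"]
broken_from = history.index("c3")  # pretend the build breaks starting at "c3"

# Stand-in for "check out the commit, do a clean build, report failure".
culprit = first_bad_commit(history, lambda c: history.index(c) >= broken_from)
print(culprit)
```

Each probe halves the remaining range, so even a long list of candidate commits needs only a handful of clean builds.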