Intermittent hang/deadlock in .net core 5.0 RC1 while debugging #42375
Comments
Tagging subscribers to this area: @tommcdon |
Some more debugging. If I start with ctrl-f5 (no debugger) and later...
The third could be a red herring since it's an intermittent issue, but it seems consistent. Also worth noting in the third case: the diagnostics tool in VS reports 1.3 GB of memory after I have been using a native debugger but only ~700 MB when I use a managed debugger. |
@joakimriedel would it be possible to send us a dump of the debuggee, devenv, and mvsmon at the point of the hang? To securely share files, please open a VS Feedback Item and send us a link to it. |
@tommcdon please see the original VS Feedback Item in the first issue post above: https://developercommunity.visualstudio.com/content/problem/1187332/debugger-hangs-sporadically-in-visual-studio-2019.html |
Btw, it was another red herring about the breakpoints mentioned in the VS Feedback Item. This will reproduce without first setting any breakpoints in the code. |
@tommcdon I just found something! The frequency of this error happening depends on the amount of LogLevel I set. Using all "Debug" it reproduces almost instantly on first load of the application. If I set all LogLevels to "Warning" (almost none or little log items) I cannot seem to reproduce. Does this give you any pointers on where the problem might lie? |
@joakimriedel, if you collected a dump using dotnet-dump, you can use the |
@josalem very interesting, thanks for the tip! all threads but these three are in either thread A
thread B
thread C
My amateur guess is that these three threads are deadlocking each other somehow. I made two dumps of the hung process a minute apart, and it showed the same three threads with the same call stack. This actually strengthens the observation I had above that modifying the LogLevel will increase the likelihood of reproducing the error. ping @tommcdon |
@joakimriedel thank you for sending the VS feedback item our way. It looks like there may be a thread suspension related issue here that I am hoping @kouvel or @noahfalk can take a look at. It appears that the debugger is requesting debugger suspension but we are not able to synchronize for the debugger. If this is a regression between P8 and RC1, then the stack below might be interesting. Out of curiosity, can you try disabling tiered compilation by setting COMPLUS_TieredCompilation=0?
|
This looks like another case of #38736. We start choking on processes that are heavily async trying to make sure evaluating properties and functions won't deadlock when we try to freeze the process. However, we often do so too eagerly. That issue is open to relax the conditions under which we emit the notifications. I don't have a good solution for your perf issue under the debugger. The only work around I know of is to reduce the number of events getting generated. A trace will tell us for sure if it's the logging machinery as you believe (and if it's even the notifications as I'm thinking). |
@hoyosjs as you see in the attached image in the first post cpu is less than 10% when it hangs. But you are right that it is a heavily async process. In the huge call stack that I only supplied the top rows for above, there are 15+ |
It looks like the thread that holds the lock above is stuck trying to enter cooperative GC mode:
I think #40060 may have missed this spin-wait that is blocking switching GC modes. I'll look into fixing this. In the meantime, it should be possible to work around this by disabling tiered compilation when debugging. This can be done in the project file of the web app as below. After a clean build the next launch should make the config effective.
<Project Sdk="Microsoft.NET.Sdk.Web">
  <PropertyGroup>
    <TieredCompilation>false</TieredCompilation>
  </PropertyGroup>
</Project> |
Thanks @kouvel but unfortunately it still reproduces with TieredCompilation set to false. I also tried the environment variable COMPLUS_TieredCompilation pointed out by @tommcdon to no avail. My problem is that I am debugging some hard query issues in EF Core 5.0 RC1 where I need more verbose log output but setting the LogLevel to Debug will hang the debugger due to this regression. I hope you will find a solution to this. I am surprised to not see many others affected by this? Is this an AMD-specific problem running Threadripper? I will try our solution on an Intel-machine and see if it still reproduces. EDIT: No. Reproduces on an Intel machine as well. A bit different call stacks. Is this the SpinWait you referred to?
I also find that last one interesting in the light of various deadlocks in the SqlClient: |
Hmm strange. Could you please share a heap dump of the IISExpress process while VS is in the hung state with TieredCompilation set to false in the project file (in the same feedback ticket)? Just want to see if the setting is effective in the process and if there is maybe a different type of hang also happening. |
The spin-wait I was referring to above is in coreclr.dll here:
I think that is still an issue, but perhaps there are other issues involved, or for some reason the setting to turn off tiering maybe is not working. Hopefully the new heap dump would provide more info. |
@kouvel I made another dump with tiering off. Unfortunately I cannot seem to edit the feedback ticket that I opened through VS2019. I'm not sure what kind of details that would be exposed in a dump like this, so I password protected the link. Send me an email or suggest other means of transportation for the password. |
Thanks @joakimriedel, couldn't find your e-mail, can you e-mail me at kouvel@microsoft.com? |
@kouvel you've got mail. |
It looks like the same underlying issue, I forgot that the problematic code path is taken in another case. Could you please try with |
Any estimate on when this might be patched in a service release? Debugging experience in .NET Core 5 is very frustrating since it hangs at least once every third hour or so at random places, something that never happened in earlier versions. Restarting the debugging session is a simple workaround, but every time I lose process state (and time). |
This one seems to be a bit more complicated than the other one. We're currently discussing solutions, at the moment I'm reasonably confident that something can be done to resolve the deadlock even if the diagnostics experience is a bit worse when the rare case does happen. Still aiming for the first servicing release. |
@jeffschwMSFT do you happen to know when the first servicing release for 5.0 would be released? |
We are taking issues for consideration now for 5.0.1. |
Thanks for investigating @kouvel ! Out of curiosity; from the response to this issue not many others seem to be affected by this bug. Our application is a pretty standard .NET Core solution with MVC and Web API - why do I hit this edge case and not other people with a similar setup? |
The timing window is typically short. A thread checks that debugger suspension is not in progress, soon afterwards the debugger asks to suspend the runtime, then the thread may for example trigger a GC from allocating a few bytes when calling a virtual or interface method through a new type for the first time, which is also rare. The timing window may increase if there are more threads to suspend. |
…ding in some cases
1. When suspending for the debugger is in progress (the debugger is waiting for some threads to reach a safe point for suspension), a thread that is not yet suspended may trigger another runtime suspension. This is currently not allowed because the order of operations conflicts with requirements to send GC events for managed data breakpoints to work correctly when suspending for a GC. Instead, the thread suspends for the debugger first, and after the runtime is resumed, continues suspending for GC.
2. At the same time, if the thread that is not suspended yet is in a forbid-suspend-for-debugger region, it cannot suspend for the debugger, which conflicts with the above scenario, but is currently necessary for the issue fixed by dotnet#40060
3. The current plan is to change managed data breakpoints implementation to pin objects instead of using GC events to track object relocation, and to deprecate the GC events APIs
4. With that, the requirement in #1 goes away, so this change conditions the check to avoid suspending the runtime during a pending suspension for the debugger when GC events are not enabled
- Verified that the latest deadlock seen in dotnet#42375 manifests only when a data breakpoint is set and not otherwise
- Combined with dotnet#44471 and a VS update to use that to switch to the pinning mechanism, the deadlock issue seen above should disappear completely
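For readers who want to follow the four points above, here is a toy model of the decision they describe, written in Python. It is only a sketch of one reading of the commit message, not the runtime's actual code: the names `ThreadState`, `Action`, `decide`, and the `conditioned_on_gc_events` flag are all made up for illustration. It assumes the changed check forces a thread to yield to the debugger only while GC events (used by managed data breakpoints) are enabled.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SUSPEND_FOR_GC = "suspend the runtime for GC now"
    YIELD_TO_DEBUGGER_FIRST = "suspend for the debugger, continue GC after resume"
    DEADLOCK = "must yield to the debugger but is not allowed to"


@dataclass(frozen=True)
class ThreadState:
    # Hypothetical flags; they mirror the conditions named in the commit message.
    debugger_suspend_pending: bool   # debugger is waiting for threads to reach a safe point
    in_forbid_suspend_region: bool   # thread is in a forbid-suspend-for-debugger region
    gc_events_enabled: bool          # GC events are on (e.g. a data breakpoint is set)


def decide(state: ThreadState, conditioned_on_gc_events: bool) -> Action:
    """Decide what a thread does when it wants to trigger a GC suspension.

    conditioned_on_gc_events=False models the behavior before the change
    (always yield to a pending debugger suspension); True models the change
    (yield only when GC events are enabled).
    """
    must_yield = state.debugger_suspend_pending and (
        state.gc_events_enabled or not conditioned_on_gc_events
    )
    if not must_yield:
        return Action.SUSPEND_FOR_GC
    if state.in_forbid_suspend_region:
        # Point 2: the thread cannot suspend for the debugger here, yet
        # point 1 says it must before suspending for GC.
        return Action.DEADLOCK
    return Action.YIELD_TO_DEBUGGER_FIRST


if __name__ == "__main__":
    stuck = ThreadState(debugger_suspend_pending=True,
                        in_forbid_suspend_region=True,
                        gc_events_enabled=False)
    print("before change:", decide(stuck, conditioned_on_gc_events=False).name)
    print("after change: ", decide(stuck, conditioned_on_gc_events=True).name)
```

In this model the conflicting case still ends in `DEADLOCK` when `gc_events_enabled` is true, which lines up with the note that the remaining deadlock was seen only with a data breakpoint set.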
This github issue was difficult to find in Google results. We are experiencing similar hangs, where debugging hangs, and the memory and CPU graphs come to a halt. Experienced it while debugging both a console application, and an asp.net application, both under .NET 5.0, both with EF Core 5.0. |
@lcrumbling this fix will probably be available in 5.0.1 if it's this same issue. It's currently getting ported to servicing. Sorry for the inconvenience. |
Should be fixed by #44563, which is expected to be included in 5.0.1 |
This issue is fixed in the 5.0.100 .NET 5 SDK, right? I ask because something similar occurs on my machine. |
@MikaelFerland there's a rare case where it's not fixed for 5.0.100. That fix will be released in 5.0.101. If you can grab a dump using something like |
@hoyosjs thank you, I got 3 dumps : 1 of VS (devenv.exe) and the 2 others are my processes. If the deadlock is happening in VS you need the devenv.exe dump right? |
If you have something like OneDrive, probably upload them there and have a link protected to share just with juan.hoyos at microsoft dot com. I'll let you know once I have them. Are your processes x64? (with the exception of devenv) |
They are x64 processes, you should have received the link. |
@MikaelFerland I tried this and sadly I need the app side to see this. You can stop sharing the OneDrive share. You can upload the files securely on https://developercommunity2.visualstudio.com/t/Debugger-hangs-sporadically-in-Visual-St/1187332 |
@hoyosjs The files have been uploaded! |
I just encountered the same issue as the above users. I have another project to upgrade as well. Will be interesting to see if it encounters the same issue, or if it is ok. |
@gobananasgo are you hitting it with .net version 5.0.100? |
@mangod9 |
@sossee2 that sounds like a different issue, could you please open a new issue? It might be useful to have access to a crash dump as well. |
Description
After upgrading our web application from 3.1 to 5.0 RC1, I get intermittent hangs/deadlocks while debugging. At first, I thought it was a problem specific to Visual Studio 16.8 preview 3 (https://developercommunity.visualstudio.com/content/problem/1187332/debugger-hangs-sporadically-in-visual-studio-2019.html), but I have now verified that the issue is the same when debugging through Visual Studio Code which makes me think this is related to the 5.0 RC1 runtime and not the IDE.
I have not been able to reproduce this when running the project without debugging (ctrl-f5) - only when debugging (f5).
Unfortunately I cannot reliably reproduce this. It seems to be timing related and happens more often when loading pages in the application that fire a lot of different connections to the server. The application also heavily uses EF Core and SignalR Core.
Configuration
.NET Core 5.0 RC1
Win10 build 2004
Threadripper CPU
Regression?
Yes. This never occurred in .NET Core 3.1.
Other information
I find it very hard to debug this issue. It only happens when starting the project in the debugger. But since it totally hangs the debugger, I am unable to break into the application. I cannot attach another debugger, then I get the error "another debugger is already attached".
Some characteristics:
When it happens, the diagnostics logger stops updating
I cannot break or terminate the debugger
The IIS worker process is idle.
The output window stops logging any more entries. The last entry is at random, I cannot see any pattern in that the hang would happen after a certain kind of event.
The only way to get back control is to kill the IIS worker process manually through task manager.
I have followed the steps to generate a dotnet-dump, but do not know how to analyze it.
How can I go forward to resolve this? I can reproduce but not choose when to reproduce. Sometimes it hangs on first load, sometimes I have to click around in the application for a few minutes to reproduce.