This repository has been archived by the owner on Oct 7, 2020. It is now read-only.

Under no circumstances should HIE ever stop responding #1146

Open
KillyMXI opened this issue Mar 25, 2019 · 21 comments

Comments

@KillyMXI

The title is a quote from #461.

I was looking through the open issues and it really struck me how common this pattern is:
On any unexpected input, HIE just withdraws into itself and stops responding.
And that is the worst way to handle the situation when we have a really unstable piece of software with a lot of things "unexpected".
As a user, I:

  • can't continue my work, even with the parts that are known to work;
  • have to go through a lot of trouble to gather debug information.

The issues with the same pattern:

There is no way we can work productively with HIE while you unhurriedly play whack-a-mole with the issues.
I don't want to learn to avoid certain lines just because you can't handle them.
And restarting HIE to recover its functionality takes too long.

Messaging itself MUST be more fault-tolerant.

  • Failed to handle a request? Report back as useful a failure message as you can in the given situation and get ready to process the following requests. This is not about debug logging; it is about explaining to the user why their request wasn't handled;
  • IDE extensions should be able to tell the user what went wrong and what can be done about it;
  • The user should be able to use other parts of HIE's functionality while you're working on fixes for reported issues.
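
The three points above can be sketched as a request loop that never lets one failing handler end the session. This is a minimal illustration in Python, not HIE's actual code: the handler names and the `serve` function are hypothetical. The only facts relied on are that JSON-RPC 2.0 (which LSP builds on) lets a response carry an `error` object with a `code` and `message`, and that it reserves -32601 for "Method not found" and -32603 for "Internal error".

```python
# JSON-RPC 2.0 reserved error codes.
METHOD_NOT_FOUND = -32601
INTERNAL_ERROR = -32603


def handle_request(handlers, request):
    """Run one request; always return a response dict, never raise."""
    request_id = request.get("id")
    method = request.get("method")
    handler = handlers.get(method)
    if handler is None:
        return {"jsonrpc": "2.0", "id": request_id,
                "error": {"code": METHOD_NOT_FOUND,
                          "message": f"no handler for {method}"}}
    try:
        return {"jsonrpc": "2.0", "id": request_id,
                "result": handler(request.get("params"))}
    except Exception as exc:
        # Deliberately broad: one bad request must not stop the loop.
        return {"jsonrpc": "2.0", "id": request_id,
                "error": {"code": INTERNAL_ERROR,
                          "message": f"{method} failed: {exc}"}}


def serve(handlers, requests):
    """Answer every request in order, even if earlier ones failed."""
    return [handle_request(handlers, request) for request in requests]


# Hypothetical handlers: hover works, formatting always crashes.
def hover(params):
    return {"contents": "foo :: Int"}


def broken_formatting(params):
    raise ValueError("unexpected input")


HANDLERS = {
    "textDocument/hover": hover,
    "textDocument/formatting": broken_formatting,
}

responses = serve(HANDLERS, [
    {"jsonrpc": "2.0", "id": 1, "method": "textDocument/formatting"},
    {"jsonrpc": "2.0", "id": 2, "method": "textDocument/hover"},
])
```

Here the formatting request comes back with an error the editor extension can show to the user, and the hover request that follows is still answered.
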
@mpickering
Collaborator

Seems like this should be one of the higher priority issues to fix.

@Anrock Anrock pinned this issue Apr 18, 2019
@Anrock
Collaborator

Anrock commented Apr 18, 2019

Agreed. I've pinned the issue to help with visibility.

@alanz
Collaborator

alanz commented Apr 20, 2019

I think the first step is to have some reproducible bug reports.

That is because the underlying architecture is supposed to make it robust: the main loop keeps processing regardless, we have deferred execution to provide a result if the GHC invocation fails, and things like Hover should respond immediately based on there being a currently cached result.

@mpickering
Collaborator

I think actions like "hover" block because the implementation was inefficient.

I also observed that reporting a large number of diagnostics (500) causes something to pause; I'm not sure whether that was an HIE or a VS Code issue.

@alanz
Collaborator

alanz commented Apr 21, 2019

I am pretty sure that the default setting limits the number of reported issues to 100. They are sorted so that the first ones are kept, so perhaps the sorting process is where the delay comes from.
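
Whether sorting really is the bottleneck is not established in this thread. If it were, one option would be to select only the first N diagnostics by position instead of sorting all of them. The sketch below uses Python's `heapq.nsmallest` for that; the diagnostic shape (`line`, `column`, `message`) and the function name are assumptions made for illustration, not HIE's data model.

```python
import heapq
import random


def first_diagnostics(diagnostics, limit=100):
    """Return the `limit` earliest diagnostics, ordered by position."""
    return heapq.nsmallest(
        limit, diagnostics, key=lambda d: (d["line"], d["column"])
    )


# 500 made-up diagnostics in shuffled order, one per line.
rng = random.Random(0)
lines = list(range(500))
rng.shuffle(lines)
diagnostics = [
    {"line": n, "column": 0, "message": f"warning on line {n}"} for n in lines
]

shown = first_diagnostics(diagnostics)
```

The result matches what a full sort followed by truncation would give, but only the retained items have to be kept in order.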

@alanz
Collaborator

alanz commented Apr 21, 2019

And I think the current cause of the delays is completion requests, if anyone wants to take a look. They should work like Hover: if there is no currently cached module, the request returns immediately with an empty result.
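
The behaviour described here, answering from the cache or returning an empty result right away, can be sketched as follows. The `ModuleCache` class and its methods are hypothetical and stand in for whatever HIE uses internally; the point is only that the request path never waits for a compile.

```python
class ModuleCache:
    """Holds the symbols of the last successfully loaded version of each module."""

    def __init__(self):
        self._symbols_by_uri = {}

    def store(self, uri, symbols):
        # Called by the (slow) background compile when it succeeds.
        self._symbols_by_uri[uri] = list(symbols)

    def completions(self, uri, prefix):
        symbols = self._symbols_by_uri.get(uri)
        if symbols is None:
            # No cached module yet: answer immediately with nothing
            # instead of blocking until the module has been compiled.
            return []
        return sorted(s for s in symbols if s.startswith(prefix))


cache = ModuleCache()
before = cache.completions("file:///Main.hs", "ma")
cache.store("file:///Main.hs", ["main", "mapM_", "foldr"])
after = cache.completions("file:///Main.hs", "ma")
```

Before the first successful load the editor gets an empty list; afterwards it gets `main` and `mapM_`.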

@bscottm

bscottm commented Apr 28, 2019

I think this is the appropriate place to post my particular issue, which resembles a closed file descriptor, but seems more closely related to this issue -- HIE stalls after 3-4 requests. At first I thought the problem was with SublimeText, the LSP package and the Windows implementation of pipes. I tried hie-8.6.4 under Linux on VirtualBox and saw the same problem. It seems to take about three or four JSON RPCs before hie-8.6.4 stalls.

LSP's basic loop calls Python's str.readline() to grab the next line of input looking for the "Content-length" header. That just stalls waiting for input. At first I thought that it might be a closed file descriptor, but I would expect the str.readline() loop to return EOF (an empty string.)
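
For readers unfamiliar with the framing being described: the LSP base protocol sends each message as a `Content-Length` header, a blank line, and then that many bytes of JSON. The sketch below is not the SublimeText LSP package's code; `read_message` and `frame` are hypothetical helpers that show how a reader can tell a closed stream apart from a complete message. On a real pipe, `readline()` blocks for as long as the server stays alive but silent, which matches the stall reported here; the in-memory stream below can only demonstrate the EOF case.

```python
import io
import json


def read_message(stream):
    """Read one LSP message from a binary stream; return None on EOF."""
    content_length = None
    while True:
        line = stream.readline()
        if line == b"":
            # readline() returns an empty value only at end of stream.
            return None
        if line == b"\r\n":
            break  # a blank line ends the header section
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())
    if content_length is None:
        raise ValueError("missing Content-Length header")
    body = stream.read(content_length)
    if len(body) < content_length:
        return None  # stream closed in the middle of a message
    return json.loads(body.decode("utf-8"))


def frame(payload):
    """Encode a payload the way the LSP base protocol frames it."""
    body = json.dumps(payload).encode("utf-8")
    return b"Content-Length: %d\r\n\r\n" % len(body) + body


stream = io.BytesIO(
    frame({"jsonrpc": "2.0", "id": 1, "result": None})
    + frame({"jsonrpc": "2.0", "id": 2, "result": []})
)
first = read_message(stream)
second = read_message(stream)
third = read_message(stream)  # nothing left: EOF
```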

This particular problem showed up about two to three weeks ago, and I've been trying to diagnose it when I can free up cycles. I'm on the latest github version with submodules synced. The hie-wrapper log is attached, although I anonymized some of my paths. Apologies for the big source file that gets downloaded to the server...

hie-wrapper.log

@mpickering
Collaborator

@bscottm Can you please open a new issue for your report?

@NickSeagull

Is there something people eager to contribute can do to help with this? 😃

@fendor
Collaborator

fendor commented Jun 2, 2019

Sorry for the late response, @NickSeagull!
There is a bunch of stuff you can do:

Improve underlying bios system

Currently the most unstable part of hie seems to be the build system itself. If you are comfortable with stuff like ghc-mod and cabal, then you can try to help out on issue #1126. Ask @mpickering where help may be required.

Plugins that crash

Sometimes plugins crash in hie, taking the whole application with them. Although I am currently not aware of plugins that regularly crash, each plugin must be validated on its own. So far, I have checked hoogle, hsimport, package, pragma, floskell and brittany. Almost all of them can still crash on IOExceptions, such as File not found or Permission denied. In practice though, these issues have never happened to me and there are no open issues for that. To help with this, you can go through the plugins, adding comments and tests.

Build system

The build system of hie has a bunch of problems at the moment. Tests that validate that everything still works as expected are missing. Also, if you are a Windows expert, you can try to tackle Windows-specific issues, such as #1219, #1133, #983, ... the list goes on.

Hlint data-files

Although hie no longer crashes if hlint crashes, it is not optimal that hlint cannot be used if the hlint data-files are missing. Issue #1143 discusses the problem in detail. If you want to tackle this, talk to me or @power-fungus; we have already decided on a plan of attack.

Windows-specific problems

It seems we do not have a lot of Windows developers at the moment. If you want to help with that, the most obvious problem is that the tests currently do not run on Windows. It may be an issue with lsp-test, but currently no one knows for sure.

@jneira
Member

jneira commented Jun 2, 2019

Nice summary, @fendor. I'll try to help with any of the Windows-related ones.

@ProofOfKeags

The underlying crashes may be more difficult to fix, but making sure that they don't take HIE down with them seems like a fairly straightforward fix, no? It may be ugly, but couldn't you indiscriminately catch all plugin exceptions and then report some null value back? Or does LSP have no room in the spec to represent the failure of a request?

@fendor
Collaborator

fendor commented Jun 17, 2019

That is true, it is difficult to fix! However, just catching all errors is in my opinion not the right way. A lot of the errors are fatal, i.e. they are not recoverable without human intervention. In this case, better error reporting and an expanded help section might also be a good idea.
Catching all plugin exceptions does make sense, and the hoogle and hlint plugins already do that, but smart error messages are still a must-have in this case, instead of failing silently or semi-silently. So, taking care of proper error handling is also an open issue, afaik.
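
One way to read "catch everything, but with smart error messages" is a wrapper that treats the known failure classes (file not found, permission denied) separately from truly unexpected ones. The sketch below is an illustration in Python under that assumption; `run_plugin`, `read_config` and the plugin name are hypothetical, and deciding which errors are fatal would still need per-plugin judgement.

```python
import os
import tempfile


def run_plugin(name, action):
    """Run a plugin action; return (ok, value_or_message) instead of raising."""
    try:
        return True, action()
    except FileNotFoundError as exc:
        # Known, user-fixable problem: say what is missing.
        return False, f"{name}: file not found: {exc.filename}. Check that the file exists."
    except PermissionError as exc:
        return False, f"{name}: permission denied: {exc.filename}. Check the file permissions."
    except Exception as exc:
        # Unknown failure: report it loudly rather than failing silently.
        return False, f"{name}: unexpected {type(exc).__name__}: {exc}. Please report this."


def read_config(path):
    with open(path) as handle:
        return handle.read()


missing = os.path.join(tempfile.mkdtemp(), "missing.yaml")
results = [
    run_plugin("demo-plugin", lambda: read_config(missing)),
    run_plugin("demo-plugin", lambda: 1 // 0),
    run_plugin("demo-plugin", lambda: "formatted source"),
]
```

The first two calls return a message that names the plugin and the cause; the third returns the plugin's result unchanged.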

@ProofOfKeags

What thread in the code should I start pulling at to take care of that issue? Also, is there an issue number you know about that I could look at?

@fendor
Collaborator

fendor commented Jun 17, 2019

Which issue exactly? Plugin stability?

@ProofOfKeags

"proper error handling", related to stability of course

@fendor
Collaborator

fendor commented Jun 17, 2019

That seems to be rather complicated. You have to differentiate between Commands, CodeActions and stuff that happens in the protocol, e.g. LspStdio. There is no specific issue for that afaik and I am not sure how it actually can be done properly.

@robrix
Contributor

robrix commented Jul 5, 2019

I think the first step is to have some reproducible bug reports.

I can reproduce unresponsive hie quite reliably. Would you prefer me to file new issues, or add reports here?

Per #1146 (comment), going to file a new issue.

@DanielG DanielG unpinned this issue Jul 27, 2019
@NickSeagull

Would it make sense to create a "parent issue" to start tracking the state of this? I'm eager to create it and to define some subtasks that can be split into other issues if they need to be tracked, with links to PRs, etc.

I'd love to make HIE super stable, and would like to help with the organization and the coding afterwards.

@jneira
Member

jneira commented Dec 21, 2019

I think this meta issue was really useful, but the new hie architecture should be more stable and may make it obsolete.
I would close it if nobody disagrees with that.

@KillyMXI
Author

KillyMXI commented Dec 21, 2019

As the author of this issue, ideally, I should check how my experience has changed since I opened it.
Unfortunately, I moved away from Haskell back then, after having too many issues, and right now I'm not sure I have enough time to bring everything into working order and play around.
I would like to, but I can't give any time estimate now.

I find it unfortunate that I cannot see here what exactly was done to prevent this sort of failure from happening. I only see that some of the linked issues were addressed (when HIE is compiled with GHC 8.6.5...), while others are still open.

I would've had no objections if I were able to see that the changes in place were not just patches for concrete holes, but have actually increased fault tolerance.

I've just taken a quick glance through newer issues, and from #1480 I can conclude that HIE can still crash on user code and imported libraries.

That was my initial point. Development tools face unpolished code the most, and should withstand any bad code.

10 participants