Stop infinite retry when WSL raised error #8968

Closed
licanhua opened this issue Oct 11, 2020 · 7 comments

@licanhua

licanhua commented Oct 11, 2020

Hello,

I'm from Microsoft and work on WSL. From our WSL measurements, I found that Docker Desktop is likely retrying indefinitely when WSL has a problem on the user's machine (for example, the lxss manager crashes, virtual networking cannot be created, WSL bugs...).
On specific devices, WSL is launched more than 50,000 times per day and always fails with an error. My guess is that this is because Docker Desktop is retrying.
I don't have access to those machines, but they must be kept very busy retrying against an unrecoverable WSL error, and their users probably complain that Docker Desktop doesn't work without knowing why.

If Docker Desktop does retry indefinitely, I would suggest retrying only a small number of times and, once that threshold is reached, showing a message that WSL has a problem and asking the user to fix it first.
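
The suggestion above could be sketched roughly as follows. This is only an illustration of a bounded retry with a growing delay, not Docker Desktop's actual code: `start_with_bounded_retry`, `WslUnavailableError`, the `launch` callback, and the default limits are all hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Optional


class WslUnavailableError(RuntimeError):
    """Hypothetical error raised once the retry budget is exhausted."""


def start_with_bounded_retry(
    launch: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call launch() until it succeeds or max_attempts is reached.

    Returns the number of attempts used. The wait doubles after each failure.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            launch()
            return attempt
        except Exception as exc:  # assumption: any exception means WSL failed
            last_error = exc
            if attempt < max_attempts:
                # Wait 1x, 2x, 4x... base_delay before the next attempt.
                sleep(base_delay * 2 ** (attempt - 1))
    # Give up and surface the problem instead of retrying forever.
    raise WslUnavailableError(
        f"WSL failed {max_attempts} times; please fix WSL first "
        f"(last error: {last_error})"
    ) from last_error
```

With a cap like this, a machine with an unrecoverable WSL error would see a handful of launch attempts and one clear message, rather than tens of thousands of launches per day.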

@stephen-turner
Contributor

@simonferquel Any comments?

@simonferquel

This is actually something we recently fixed, but it is not yet released.
It will be in the next Edge release. (Not in the next stable release, though. Should we consider cherry-picking the fix?)

@licanhua
Author

How did you fix it? By retrying a limited number of times, or by retrying, waiting, and retrying again?
When will the fix be released? I'm happy to verify it on my side (I can make lxss return any error you like) to see whether it fixes the problem.
I don't know what the impact is from the user's point of view, but I would really like this to be fixed on stable too.

@simonferquel

We previously had a limitation in our architecture that made a retry loop mandatory there, because the backend startup and the distro integration logic were controlled from 2 different processes that did not communicate with each other. We changed all that, so we now know exactly when to trigger the integration logic within the backend startup code.
So now, if there is a failure, we know that it is not caused by synchronization issues but is a real error where our integration logic fails to run within your distro. There are some known reasons why it would fail (e.g. distros without a glibc, like Alpine), and we also had bugs in this area with exotic shells etc.

The fix will appear in the next Edge version. When the integration process fails, you'll now get a native Windows notification asking whether you want to restart the integration process.
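
As a rough sketch of the behavior described here (run the integration once, and on failure let the user decide whether to restart it), something like the following captures the idea. It is an assumption-laden illustration, not the real implementation: `run_integration`, `integrate`, and `ask_user_to_retry` are hypothetical names, and the callback stands in for the native Windows notification.

```python
from typing import Callable


def run_integration(
    integrate: Callable[[], None],
    ask_user_to_retry: Callable[[str], bool],
) -> bool:
    """Run integrate(); on failure, retry only if the user agrees.

    Returns True once integration succeeds, False if the user declines.
    """
    while True:
        try:
            integrate()
            return True
        except Exception as exc:  # assumption: any exception is a real failure
            # No automatic retry: every further attempt needs user consent.
            if not ask_user_to_retry(str(exc)):
                return False
```

The difference from a blind retry loop is that the process never relaunches WSL on its own after an error, so a persistent failure results in one prompt instead of continuous launches.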

@simonferquel

The issue has been fixed in Edge release 2.4.2.0, thank you for reporting it.

@Vankog

Vankog commented Oct 22, 2020

@simonferquel

Actually, I have been getting this particular notification since the latest release and, to be honest, I did not know how to fix it. There is no visible information on what to do.

I was lucky that I had read the release notes and remembered something about this restart notification. That gave me a breadcrumb to follow to this issue, where I finally read the one piece of information that actually helped:

There are some known reasons why it would fail (e.g. distros without a glibc like Alpine)

I do use Alpine as my main WSL2 distro. So only now do I know I should switch.

Thus, I suggest the following:
If this issue arises and the notification triggers, could you somehow provide some further information to the user? Some troubleshooting or a known-problems page or such?
I mean, Docker does not complain about this in any visible way, except for this notification. So what's the impact? (Fallback to non-WSL mode?)
It would be nice to be pointed in the right direction for troubleshooting. (I actually ran Diagnostics, but I couldn't make any sense of the dump.)

Thanks! :-)

@docker docker locked and limited conversation to collaborators Nov 21, 2020