I recently ran into a problem: one of my nodes suddenly had a “nervous breakdown” and was no longer accessible through Traefik. All the page showed was “Bad Gateway”. I was sure it had been working before, and I hadn’t changed any settings.
I tried various solutions, including adding a new overlay network to Traefik and to the services on the problematic node. Nothing worked. In the end I drained and restarted the node, after which the problem was gone. On an Ubuntu instance, the commands to do so are:
docker node ls # to find the name/id of the node
docker node update --availability drain worker1
# after all containers are removed
sudo reboot
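The "after all containers are removed" step can be scripted instead of checked by hand. The following is only a rough sketch, not a tested procedure: the function name `drain_and_wait` is made up for illustration, it assumes it is run on a manager node, and it assumes that polling the current state reported by `docker node ps` is a good enough signal that the node is empty.

```shell
# Hypothetical helper: drain a swarm node, then block until none of its
# tasks report a current state of "Running" anymore.
# Assumption: run on a manager node; the node name is passed as $1.
drain_and_wait() {
    node="$1"

    # Stop scheduling new tasks on the node and shut down the existing ones.
    docker node update --availability drain "$node" || return 1

    # Poll the node's task list; CurrentState looks like "Running 5 minutes ago".
    while docker node ps "$node" --format '{{.CurrentState}}' | grep -q '^Running'; do
        sleep 2
    done
}

# Usage (on a manager): drain_and_wait worker1
# Afterwards, log in to the drained node itself and run: sudo reboot
```

Polling the current state rather than the desired state is deliberate here: the desired state flips to shutdown as soon as the node is drained, while the containers may still need a few seconds to stop.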
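Once the node is back up, set it to active again (docker node update --availability active worker1), because a drained node stays drained across a reboot. Afterwards, you can simply rescale or redeploy some of your services to repopulate the instance. There currently isn’t a dedicated “rebalance” command, because the Docker team is reluctant to kill healthy containers just to redistribute them.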
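One way to repopulate the instance without touching each service by hand is to force an update of every service, which lets the scheduler place tasks again. This is a sketch under assumptions, not a recommendation: the function name `reactivate_and_rebalance` is hypothetical, it must run on a manager node, and forcing an update restarts the tasks of every service, so expect a short disruption.

```shell
# Hypothetical helper: make a previously drained node schedulable again and
# force a rolling update of all services so tasks can be redistributed.
# Assumption: run on a manager node; the node name is passed as $1.
reactivate_and_rebalance() {
    node="$1"

    # The node keeps its "drain" availability until it is set back to active.
    docker node update --availability active "$node" || return 1

    # "docker service ls -q" prints only service IDs; --force triggers an
    # update even though the service definition has not changed.
    for service in $(docker service ls -q); do
        docker service update --force "$service"
    done
}

# Usage (on a manager): reactivate_and_rebalance worker1
```

If only a few services matter, calling `docker service update --force` on those alone keeps the rest of the swarm undisturbed.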