I'm trying to build a system to collect data from sensors attached to several Raspberry Pi 3 boards.
To do that, I created a docker-compose file with a number of services that I plan to distribute using Docker Swarm. However, a few seconds after the worker node joins, `docker node ls` on the manager always shows the worker node with status `down`.
This is what I have already tried:
Initial setup used for testing:
- 2 Windows 11 PCs on the same local network
- Docker Desktop with WSL2 backend on both
- The first PC is a custom desktop connected via Ethernet
- The second is a Surface Pro 8 connected via WiFi
- Before every try, I opened the following ports on both PCs: 2377 TCP, 7946 TCP/UDP, 4789 UDP
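
For reference, a minimal sketch of how the TCP side of that port list could be probed from the other machine (for example from a WSL2 or Ubuntu shell). The function names `swarm_ports`, `check_tcp` and `probe_host` are made up for this example, and it assumes `bash` and the coreutils `timeout` command are available. The UDP ports (7946 and 4789) cannot be confirmed with a simple connect test, so they are skipped.

```shell
# Ports Swarm uses between nodes (same list as in the bullet above).
swarm_ports() {
  printf '%s\n' 2377/tcp 7946/tcp 7946/udp 4789/udp
}

# Return 0 if a TCP connection to host ($1) on port ($2) succeeds within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device, so no extra tools are needed.
check_tcp() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Probe every TCP port from the list against one host and report the result.
probe_host() {
  host=$1
  for entry in $(swarm_ports); do
    # Skip UDP entries: a connect test says nothing reliable about them.
    [ "${entry#*/}" = tcp ] || continue
    port=${entry%/*}
    if check_tcp "$host" "$port"; then
      echo "$port/tcp reachable"
    else
      echo "$port/tcp NOT reachable"
    fi
  done
}
```

Usage would be something like `probe_host <MANAGER-LOCAL-IP>` from the worker, and the same against the worker's address from the manager. Ports 2377 and 7946 only answer while the target is an active swarm node.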
First try (based on this guide):
- Run `docker swarm init` on the desktop PC
- Run `docker run --rm -d -p 0.0.0.0:2378:2378 --name swarm-manager-proxy alpine/socat tcp-l:2378,fork,reuseaddr tcp:<IP-INIT>:2377` using the IP obtained with the first command (usually 192.168.65.4)
- Run `docker swarm join --token <TOKEN> <MANAGER-LOCAL-IP>:2378` on the SP8 using the IP of the desktop in the local network
Result: The connection succeeds, but after a few seconds, `docker node ls` shows the worker node in `down` status, and `docker node inspect <WORKER-ID>` reports the `heartbeat failure` status.
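
A compact way to capture that status is the `--format` option of `docker node inspect`. The sketch below wraps it in a helper (the name `node_status` is made up for this example) that prints the node state, the status message, and the address the manager has recorded for the node. That last field may be worth comparing with the worker's real LAN address, on the assumption that Docker Desktop's internal VM network is involved.

```shell
# Print "state | message | address" for one node, as seen by the manager.
# $1 is the node ID or hostname shown by `docker node ls`.
node_status() {
  docker node inspect "$1" \
    --format '{{ .Status.State }} | {{ .Status.Message }} | {{ .Status.Addr }}'
}
```

It would be run on the manager as `node_status <WORKER-ID>`.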
Second try: Swapped the manager and worker PCs and obtained the same result.
Third try:
- Installed Ubuntu on the desktop PC
- Installed Docker following this guide up to step 3
- Connected the SP8 as a worker
Result: Same as before
Fourth try: Reinstalled Docker Desktop on the SP8, same result as before
Fifth try: Swapped manager/worker role, using the alpine/socat container workaround on the SP8 (still with Ubuntu on the desktop)
Result: THE WORKER NODE (desktop) STAYS ACTIVE!
In this setup, I'm able to deploy a stack using a docker-compose file; the services get started correctly on the desired machines, but I was not able to verify if they are able to communicate via the overlay network.
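
One possible way to check overlay connectivity, sketched under a few assumptions: the commands are run on the manager, the network and service names (`overlay-test`, `overlay-web`) are placeholders invented for this example, and the public `nginx:alpine` and `alpine` images can be pulled. The idea is to pin a small web service to the worker, then reach it by service name from a throwaway container started on the manager, so the request has to cross the overlay network.

```shell
NET=overlay-test   # placeholder network name
SVC=overlay-web    # placeholder service name

# Create an attachable overlay network and a web service pinned to worker nodes.
overlay_test_up() {
  docker network create --driver overlay --attachable "$NET"
  docker service create --name "$SVC" --network "$NET" \
    --constraint node.role==worker nginx:alpine
}

# From a standalone container on this node, fetch the service by its name.
# Success means DNS resolution and traffic over the overlay both work.
overlay_test_probe() {
  if docker run --rm --network "$NET" alpine \
       wget -q -T 5 -O /dev/null "http://$SVC"; then
    echo "overlay OK"
  else
    echo "overlay FAILED"
  fi
}

# Remove the test service and network again.
overlay_test_down() {
  docker service rm "$SVC"
  docker network rm "$NET"
}
```

The intended order is `overlay_test_up`, then `overlay_test_probe` once `docker service ps overlay-web` shows the task running on the worker, then `overlay_test_down`.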
So, at this point, I don't understand what is going on. Could someone explain whether this is due to a limitation of Docker Swarm under WSL2, or whether I'm doing something wrong?