---
service: docker
symptoms: cannot connect to docker daemon, docker daemon failed to start, docker socket permission denied, containers cannot resolve dns, docker network broken, daemon.json conflict, docker oom, unable to remove filesystem
tags: docker, dockerd, containerd, container, daemon, daemon.json, cgroup, dns, docker0, socket, compose
---

## Symptoms

- `Cannot connect to the Docker daemon. Is the docker daemon running on this host?`
- `permission denied` on `/var/run/docker.sock`
- `dockerd` fails to start after a `daemon.json` change
- Containers cannot resolve DNS or pull images
- Docker bridge/network disappears or container networking breaks after boot
- Container or daemon is killed by the kernel OOM killer
- `Error: Unable to remove filesystem` when removing a container

## Diagnostics

### Check daemon health and client target

```
docker info
systemctl is-active docker
systemctl status docker
ps -ef | grep dockerd
env | grep DOCKER_HOST
```

If `DOCKER_HOST` is set incorrectly, the CLI may be talking to the wrong daemon.

### Check daemon logs and startup failures

```
journalctl -u docker -n 200
journalctl -u containerd -n 100
cat /etc/docker/daemon.json
systemctl cat docker
```

Look for conflicts between `daemon.json` keys and systemd startup flags, especially duplicate `hosts` settings.

### Check socket permissions and group access

```
ls -la /var/run/docker.sock
id
getent group docker
ls -la ~/.docker/
```

If the user was added to the `docker` group recently, a new login shell may be required.

### Check kernel, cgroups, and memory pressure

```
uname -r
free -h
dmesg | grep -i -E 'docker|cgroup|oom|killed process'
```

Low memory, missing kernel features, or cgroup issues can stop containers or the daemon.

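
When cgroup issues are suspected, it can help to know which cgroup hierarchy the host is running. The following is a minimal sketch, assuming a Linux host where `/sys/fs/cgroup` reports `cgroup2fs` for the unified (v2) hierarchy and `tmpfs` for v1 or hybrid setups. The function name `cgroup_version_from_fstype` is a hypothetical helper, not part of Docker.

```shell
#!/bin/sh
# Map the filesystem type mounted at /sys/fs/cgroup to a cgroup version.
# Assumption: cgroup2fs means the unified v2 hierarchy; tmpfs means v1
# (hybrid setups also show tmpfs and are reported as v1 here).
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# Query the live host; an empty result (stat failed) maps to "unknown".
fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || true)
echo "host cgroup version: $(cgroup_version_from_fstype "$fstype")"
```

Recent Docker releases also print a cgroup driver and version in `docker info`, which can be compared against this result.
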
### Check Docker networking and DNS

```
docker network ls
ip addr show docker0
sysctl net.ipv4.ip_forward
cat /etc/resolv.conf
ps aux | grep dnsmasq
```

Loopback DNS resolvers in `/etc/resolv.conf` (for example `127.0.0.1` from a local `dnsmasq`) often break container DNS unless Docker is given explicit nameservers.

### Check storage and stuck mounts

```
df -h /var/lib/docker
docker system df
lsof /var/lib/docker
```

Bind-mounting `/var/lib/docker` into other containers can keep container filesystems busy and block removal.

## Remediation

**Daemon not running or client aimed at the wrong host:** Unset an incorrect `DOCKER_HOST`, then restart the daemon:

```
unset DOCKER_HOST
sudo systemctl restart docker
```

**`daemon.json` conflicts with systemd flags:** Remove the duplicate setting from one of the two places, or create a systemd override so `dockerd` is started without the conflicting flags.

**Permission denied on Docker socket:** Add the user to the `docker` group, then log out and back in, or run `newgrp docker` to pick up the group in the current shell:

```
sudo usermod -aG docker "$USER"
newgrp docker
```

If `~/.docker/` was created by `sudo`, fix ownership:

```
sudo chown "$USER":"$USER" "$HOME/.docker" -R
sudo chmod g+rwx "$HOME/.docker" -R
```

**Container DNS broken:** Configure explicit DNS servers in `/etc/docker/daemon.json`, then restart Docker.

**Docker networking disappears after boot:** Stop the host network manager from managing Docker interfaces and confirm `net.ipv4.ip_forward=1`.

**OOM kills:** Treat this as host memory pressure first: reduce the workload, add memory, or enforce container memory limits (for example with `docker run --memory`).

**Unable to remove filesystem:** Find the process holding the path open with `lsof`, then stop that process or the container bind-mounting `/var/lib/docker`.
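
For the container DNS case, the loopback-resolver check and the resulting `dns` setting can be sketched as below. This is a minimal illustration, not a definitive fix: `has_loopback_resolver` and `suggest_dns_fragment` are hypothetical helper names, and the nameserver addresses passed in are placeholders to be replaced with resolvers reachable from the containers. The sketch only prints a suggested `daemon.json` fragment and does not modify `/etc/docker/daemon.json`, so existing keys are not overwritten.

```shell
#!/bin/sh
# Return 0 if the given resolv.conf-style file lists a loopback nameserver
# (127.x.x.x or ::1), 1 otherwise.
has_loopback_resolver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+(127\.|::1)' "$1"
}

# If the file ($1) uses a loopback resolver, print a daemon.json fragment
# that sets the "dns" key to the two nameservers given as $2 and $3.
# Prints nothing when no loopback resolver is found.
suggest_dns_fragment() {
    if has_loopback_resolver "$1"; then
        printf '{ "dns": ["%s", "%s"] }\n' "$2" "$3"
    fi
}
```

A possible invocation is `suggest_dns_fragment /etc/resolv.conf 192.0.2.10 192.0.2.11`, where both addresses are documentation placeholders. The printed `dns` key would then be merged by hand into `/etc/docker/daemon.json` before restarting Docker.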