failed to detect process id for "docker" - failed to find pid of "docker": exit status 1

Open zhangjunli177 opened this issue 9 years ago • 1 comments

After docker and k8s are installed on each node, the following error is written to /var/log/messages every 5 minutes:

Jul 26 02:46:13 oakcloud-origin-node-ve98q5ri origin-node: I0726 02:46:13.459145 27742 openstack.go:289] Claiming to support Instances
Jul 26 02:46:14 oakcloud-origin-node-ve98q5ri origin-node: E0726 02:46:14.252638 27742 container_manager_linux.go:267] failed to detect process id for "docker" - failed to find pid of "docker": exit status 1
Jul 26 02:46:24 oakcloud-origin-node-ve98q5ri origin-node: I0726 02:46:24.009912 27742 openstack.go:289] Claiming to support Instances


According to https://github.com/kubernetes/kubernetes/issues/26259 , the docker process runs under the name "docker-current", while k8s looks for a process named "docker". That mismatch causes the problem.
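To make the mismatch concrete, here is a minimal sketch of a name-based process lookup. It assumes the kubelet finds the daemon by process name, roughly the way pidof does; find_pid_by_name and the demo-daemon names are hypothetical stand-ins and not Kubernetes code. It needs bash and a Linux /proc.

```shell
#!/usr/bin/env bash
# find_pid_by_name NAME (hypothetical helper): print the PIDs whose argv[0]
# basename equals NAME. Returns 1 when nothing matches, which is the
# situation behind "failed to find pid of ...: exit status 1".
find_pid_by_name() {
    local name=$1 found=1 dir argv0
    for dir in /proc/[0-9]*; do
        argv0=
        # cmdline is NUL-separated; read only the first field (argv[0]).
        { IFS= read -r -d '' argv0 < "$dir/cmdline"; } 2>/dev/null || true
        if [ "${argv0##*/}" = "$name" ]; then
            echo "${dir#/proc/}"
            found=0
        fi
    done
    return "$found"
}

# Stand-in for the daemon: a sleep whose argv[0] is "demo-daemon-current".
bash -c 'exec -a demo-daemon-current sleep 30' &
demo_pid=$!
sleep 1   # give the child time to exec

# Looking up the short name fails; looking up the real name succeeds.
if find_pid_by_name demo-daemon > /dev/null; then
    short_lookup=found
else
    short_lookup=missing
fi
long_lookup=$(find_pid_by_name demo-daemon-current)

kill "$demo_pid"
wait "$demo_pid" 2>/dev/null || true
```

Starting the stand-in with exec -a demo-daemon instead would make the short-name lookup succeed, which is the idea behind the exec -a docker workaround.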

Following the solution in that post, I updated the function docker_install_and_enable() in openshift-on-openstack/fragments/master-boot.sh and node-boot.sh, which fixes the error:

function docker_install_and_enable() {
    if ! rpm -q docker
    then
        yum -y install docker || notify_failure "could not install docker"
    fi
    systemctl enable docker
    sed -i "s#/usr/bin/docker#exec -a docker /usr/bin/docker#" /usr/lib/systemd/system/docker.service
}

As I'm still a newbie here, I'm not sure whether this is the right place for the change. The code also looks ugly; there should be a better way.
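One thing that can be tidied up is that the sed rewrite is not idempotent: running it a second time would prepend exec -a docker again. A minimal sketch of a guarded version follows. patch_unit is a hypothetical helper, and the ExecStart line in the demo is an assumed sample rather than the exact contents of the packaged docker.service.

```shell
#!/usr/bin/env bash
# patch_unit FILE (hypothetical helper): apply the exec -a rewrite only if
# the file does not already contain it.
patch_unit() {
    local unit=$1
    if ! grep -q 'exec -a docker /usr/bin/docker' "$unit"; then
        sed -i 's#/usr/bin/docker#exec -a docker /usr/bin/docker#' "$unit"
    fi
}

# Demo on a scratch file instead of /usr/lib/systemd/system/docker.service.
unit=$(mktemp)
printf '%s\n' '[Service]' \
    "ExecStart=/bin/sh -c '/usr/bin/docker-current daemon'" > "$unit"

patch_unit "$unit"
patch_unit "$unit"   # the second call leaves the file unchanged

patched=$(grep -c 'exec -a docker /usr/bin/docker-current' "$unit" || true)
doubled=$(grep -c 'exec -a docker exec -a docker' "$unit" || true)
```

This still edits the packaged unit file, so a package update may overwrite the change.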

zhangjunli177, Aug 02 '16 07:08

Hi @zhangjunli177 , thanks for reporting this. This issue is also tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1337400 and the fix has already been merged upstream - https://github.com/kubernetes/kubernetes/pull/25907/commits/6744a7417aa846f89f4bb6e25c49561709f06dff . It's fixed in the RHEL atomic-openshift packages and should hopefully be fixed in the next openshift-origin release.

For the above reason I would prefer not including a workaround in these templates (supposing a fix is released soon for openshift-origin). As a quick fix you might put something like this into the "docker_install_and_enable" function (or a separate function):

mkdir -p /etc/systemd/system/docker.service.d
cat << EOF > /etc/systemd/system/docker.service.d/override.conf
[Service]
ExecStart=exec -a docker /usr/bin/docker-current daemon \
--exec-opt native.cgroupdriver=systemd \
$OPTIONS \
$DOCKER_STORAGE_OPTIONS \
$DOCKER_NETWORK_OPTIONS \
$ADD_REGISTRY \
$BLOCK_REGISTRY \
$INSECURE_REGISTRY
EOF
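A slightly more defensive variant of the drop-in idea, as an untested sketch under stated assumptions: the heredoc delimiter is quoted so the boot script's shell does not expand $OPTIONS and the other variables while writing the file; an empty ExecStart= line clears the command inherited from docker.service before a new one is set; and the command is wrapped in /bin/sh -c because exec is a shell builtin, which assumes /bin/sh is bash (as on RHEL/CentOS) so that exec -a is available. write_docker_override is a hypothetical helper name, and the demo writes to a scratch directory rather than /etc.

```shell
#!/usr/bin/env bash
# write_docker_override DIR (hypothetical helper): write the drop-in into DIR.
write_docker_override() {
    local dropin_dir=$1
    mkdir -p "$dropin_dir"
    # 'EOF' is quoted: the $VARS below are written literally and are
    # resolved later from the unit's environment, not by this script.
    cat << 'EOF' > "$dropin_dir/override.conf"
[Service]
# An empty ExecStart= clears the command inherited from docker.service.
ExecStart=
ExecStart=/bin/sh -c 'exec -a docker /usr/bin/docker-current daemon \
    --exec-opt native.cgroupdriver=systemd \
    $OPTIONS \
    $DOCKER_STORAGE_OPTIONS \
    $DOCKER_NETWORK_OPTIONS \
    $ADD_REGISTRY \
    $BLOCK_REGISTRY \
    $INSECURE_REGISTRY'
EOF
}

# Demo: write into a scratch directory and inspect the result.
demo_dir=$(mktemp -d)
write_docker_override "$demo_dir"
cat "$demo_dir/override.conf"
```

On a real node the target directory would be /etc/systemd/system/docker.service.d, followed by systemctl daemon-reload and a docker restart.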

jprovaznik, Aug 09 '16 09:08