xiaoping378.github.io icon indicating copy to clipboard operation
xiaoping378.github.io copied to clipboard

构建生产环境级的docker Swarm集群

Open xiaoping378 opened this issue 9 years ago • 0 comments

构建生产环境级的docker Swarm集群

此文档适用于不低于1.12版本的docker,因为swarm已内置于docker-engine里。

  1. 硬件需求

    这里以5台PC服务器为例, 分别如下作用

    • manager0
    • manager1
    • node0
    • node1
    • node2
  2. 每台PC上安装docker-engine

    一台一台的ssh上去执行,或者使用ansible批量部署工具。

    安装docker-engine

    curl -sSL https://get.docker.com/ | sh
    

    启动之,并使之监听2375端口

    sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
    

    亦可修改配置,使之永久生效

    mkdir /etc/systemd/system/docker.service.d
    cat <<EOF >>/etc/systemd/system/docker.service.d/docker.conf
    [Service]
    ExecStart=
    ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --dns 180.76.76.76  --insecure-registry registry.cecf.com -g /home/Docker/docker
    EOF
    

    如果开启了防火墙,需要开启如下端口

    • TCP port 2377 for cluster management communications
    • TCP and UDP port 7946 for communication among nodes
    • TCP and UDP port 4789 for overlay network traffic
  3. 创建swarm

    docker swarm init --advertise-addr <MANAGER-IP>
    

    我的实例里如下:

    [root@manager0 ~]# docker swarm init --advertise-addr 10.42.0.243
    Swarm initialized: current node (e5eqi0lue90uidzsfddeqwfl8) is now a manager.
    
    To add a worker to this swarm, run the following command:
    
        docker swarm join \
        --token SWMTKN-1-3iskhw3lsc9pkdtijj1d23lg9tp7duj18f477i5ywgezry7zlt-dfwjbsjleoajcdj13psu702s6 \
        10.42.0.243:2377
    
    To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
    

    使用 --advertise-addr 来声明manager0的IP,其他的nodes必须可以和此IP互通, 一旦完整初始化,此node即是manger又是worker node.

    通过docker info来查看

    $ docker info
    
    Containers: 2
    Running: 0
    Paused: 0
    Stopped: 2
      ...snip...
    Swarm: active
      NodeID: e5eqi0lue90uidzsfddeqwfl8
      Is Manager: true
      Managers: 1
      Nodes: 1
      ...snip...
    

    通过docker node ls来查看集群的node信息

    [root@manager0 ~]# docker node ls
    ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
    e5eqi0lue90uidzsfddeqwfl8 *  manager0  Ready   Active        Leader
    
    

    这里的*来指明docker client正在链接在这个node上。

  4. 加入swarm集群

    执行在manager0上产生docker swarm init产生的结果即可

    如果当时没记录下来,还可以在manager上补看 想把node以worker身份加入,在manager0上执行下面的命令来补看。

    docker swarm join-token worker
    

    想把node以manager身份加入,在manager0上执行下面的命令来来补看。

    docker swarm join-token manager
    

    为了manager的高可用,我这里需要在manager1上执行

    docker swarm join \
    --token SWMTKN-1-3iskhw3lsc9pkdtijj1d23lg9tp7duj18f477i5ywgezry7zlt-86dk7l9usp1yh4uc3rjchf2hu \
    10.42.0.243:2377
    

    我这里就是依次在node0-2上执行

    docker swarm join \
      --token SWMTKN-1-3iskhw3lsc9pkdtijj1d23lg9tp7duj18f477i5ywgezry7zlt-dfwjbsjleoajcdj13psu702s6 \
      10.42.0.243:2377
    

    这样node就会加入之前我们创建的swarm集群里。

    再通过docker node ls来查看现在的集群情况, swarm的集群里是以node为实例的

    [root@manager0 ~]# docker node ls
    ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
    0tr5fu8ebi27cp2ot210t67fx    manager1  Ready   Active        Reachable
    46irkik4idjk8rjy7pqjb84x0    node1     Ready   Active
    79hlu1m7x9p4cc4npa4xjuax3    node0     Ready   Active
    9535h8ow82s8mzuw5kud2mwl3    consul0   Ready   Active
    e5eqi0lue90uidzsfddeqwfl8 *  manager0  Ready   Active        Leader
    

    这里MANAFER标明各node的身份,空即为worker身份。

  5. 部署服务

    Usage:    docker service COMMAND
    
    Manage Docker services
    
    Options:
          --help   Print usage
    
    Commands:
      create      Create a new service
      inspect     Display detailed information on one or more services
      ps          List the tasks of a service
      ls          List services
      rm          Remove one or more services
      scale       Scale one or multiple services
      update      Update a service
    

    部署示例如下:

    docker service create --replicas 2 --name helloworld alpine ping 300.cn
    

    docker service ls罗列swarm集群的所有services docker service ps helloworld查看service部署到了哪个node上 docker service inspect helloworld 查看service 资源、状态等具体信息 docker servcie scale helloworld=5来扩容service的个数 docker service rm helloworld 来删除service docker service update 来实现更新service的各项属性,包括滚动升级等。

    可更新的属性包含如下:

    Usage:    docker service update [OPTIONS] SERVICE
    
    Update a service
    
    Options:
          --args string                    Service command args
          --constraint-add value           Add or update placement constraints (default [])
          --constraint-rm value            Remove a constraint (default [])
          --container-label-add value      Add or update container labels (default [])
          --container-label-rm value       Remove a container label by its key (default [])
          --endpoint-mode string           Endpoint mode (vip or dnsrr)
          --env-add value                  Add or update environment variables (default [])
          --env-rm value                   Remove an environment variable (default [])
          --help                           Print usage
          --image string                   Service image tag
          --label-add value                Add or update service labels (default [])
          --label-rm value                 Remove a label by its key (default [])
          --limit-cpu value                Limit CPUs (default 0.000)
          --limit-memory value             Limit Memory (default 0 B)
          --log-driver string              Logging driver for service
          --log-opt value                  Logging driver options (default [])
          --mount-add value                Add or update a mount on a service
          --mount-rm value                 Remove a mount by its target path (default [])
          --name string                    Service name
          --publish-add value              Add or update a published port (default [])
          --publish-rm value               Remove a published port by its target port (default [])
          --replicas value                 Number of tasks (default none)
          --reserve-cpu value              Reserve CPUs (default 0.000)
          --reserve-memory value           Reserve Memory (default 0 B)
          --restart-condition string       Restart when condition is met (none, on-failure, or any)
          --restart-delay value            Delay between restart attempts (default none)
          --restart-max-attempts value     Maximum number of restarts before giving up (default none)
          --restart-window value           Window used to evaluate the restart policy (default none)
          --stop-grace-period value        Time to wait before force killing a container (default none)
          --update-delay duration          Delay between updates
          --update-failure-action string   Action on update failure (pause|continue) (default "pause")
          --update-parallelism uint        Maximum number of tasks updated simultaneously (0 to update all at once) (default 1)
      -u, --user string                    Username or UID
          --with-registry-auth             Send registry authentication details to swarm agents
      -w, --workdir string                 Working directory inside the container
    

xiaoping378 avatar Sep 08 '16 09:09 xiaoping378