
ceph-mgr and ceph-osd are not starting

Open · ghost opened this issue 6 years ago · 1 comment

Is this a request for help?: Yes


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Version of Helm and Kubernetes:

$ helm version
Client: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.4", GitCommit:"c27b913fddd1a6c480c229191a087698aa92f0b1", GitTreeState:"clean", BuildDate:"2019-02-28T13:37:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.4", GitCommit:"c27b913fddd1a6c480c229191a087698aa92f0b1", GitTreeState:"clean", BuildDate:"2019-02-28T13:30:26Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

Which chart: ceph

What happened: ceph-mgr and ceph-osd won't start up. Output of ceph-mgr:

+ source variables_entrypoint.sh 
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr' 
++ : ceph 
++ : ceph-config/ceph 
++ : 
++ : 
++ : 0 
++ : dockerblade-slot6-oben.example.com 
++ : dockerblade-slot6-oben.example.com 
++ : /etc/ceph/monmap-ceph 
++ : /var/lib/ceph/mon/ceph-dockerblade-slot6-oben.example.com 
++ : 0 
++ : 0 
++ : mds-dockerblade-slot6-oben.example.com 
++ : 0 
++ : 100 
++ : 0 
++ : 0 
+++ uuidgen 
++ : 5700ffd2-02f6-4212-8a76-8a57f3fe2a04 
+++ uuidgen 
++ : 38e7aef4-c42b-457a-af33-fa8dc3ff1eb7 
++ : root=default host=dockerblade-slot6-oben.example.com 
++ : 0 
++ : cephfs 
++ : cephfs_data 
++ : 8 
++ : cephfs_metadata 
++ : 8 
++ : dockerblade-slot6-oben.example.com 
++ : 
++ : 
++ : 8080 
++ : 0 
++ : 9000 
++ : 0.0.0.0 
++ : cephnfs 
++ : dockerblade-slot6-oben.example.com 
++ : 0.0.0.0 
++ CLI_OPTS='--cluster ceph' 
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d' 
++ MOUNT_OPTS='-t xfs -o noatime,inode64' 
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-dockerblade-slot6-oben.example.com/keyring 
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring 
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring 
++ RGW_KEYRING=/var/lib/ceph/radosgw/dockerblade-slot6-oben.example.com/keyring 
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-dockerblade-slot6-oben.example.com/keyring 
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring 
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring 
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring 
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph 
+ source common_functions.sh 
++ set -ex 
+ [[ ! -e /usr/bin/ceph-mgr ]] 
+ [[ ! -e /etc/ceph/ceph.conf ]] 
+ '[' 0 -eq 1 ']' 
+ '[' '!' -e /var/lib/ceph/mgr/ceph-dockerblade-slot6-oben.example.com/keyring ']' 
+ timeout 10 ceph --cluster ceph auth get-or-create mgr.dockerblade-slot6-oben.example.com mon 'allow profile mgr' osd 'allow *' mds 'allow *' -o /var/lib/ceph/mgr/ceph-dockerblade-slot6-oben.example.com/keyring 
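
The trace stops at the timeout 10 ceph ... auth get-or-create call, which suggests the mgr pod cannot reach the monitors within the 10-second timeout. A minimal check sketch, assuming the chart is deployed in the ceph namespace; the pod name placeholders are not from the log and have to be filled in from the pod list:

kubectl -n ceph get pods -o wide
kubectl -n ceph exec <a-mon-pod> -- ceph -s
kubectl -n ceph exec <the-mgr-pod> -- timeout 10 ceph --cluster ceph -s

If the last command also times out, the mgr pod simply cannot reach the mon endpoints.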

and the output of ceph-osd's osd-prepare-pod:

+ export LC_ALL=C 
+ LC_ALL=C 
+ source variables_entrypoint.sh 
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr' 
++ : ceph 
++ : ceph-config/ceph 
++ : 
++ : osd_ceph_disk_prepare 
++ : 1 
++ : dockerblade-slot5-unten 
++ : dockerblade-slot5-unten 
++ : /etc/ceph/monmap-ceph 
++ : /var/lib/ceph/mon/ceph-dockerblade-slot5-unten 
++ : 0 
++ : 0 
++ : mds-dockerblade-slot5-unten 
++ : 0 
++ : 100 
++ : 0 
++ : 0 
+++ uuidgen 
++ : e101933b-67b3-4267-824f-173d2ef7a47b 
+++ uuidgen 
++ : 10dd57d2-f3c7-4cab-88ea-8e3771baeaa7 
++ : root=default host=dockerblade-slot5-unten 
++ : 0 
++ : cephfs 
++ : cephfs_data 
++ : 8 
++ : cephfs_metadata 
++ : 8 
++ : dockerblade-slot5-unten 
++ : 
++ : 
++ : 8080 
++ : 0 
++ : 9000 
++ : 0.0.0.0 
++ : cephnfs 
++ : dockerblade-slot5-unten 
++ : 0.0.0.0 
++ CLI_OPTS='--cluster ceph' 
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d' 
++ MOUNT_OPTS='-t xfs -o noatime,inode64' 
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-dockerblade-slot5-unten/keyring 
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring 
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring 
++ RGW_KEYRING=/var/lib/ceph/radosgw/dockerblade-slot5-unten/keyring 
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-dockerblade-slot5-unten/keyring 
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring 
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring 
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring 
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph 
+ source common_functions.sh 
++ set -ex 
+ is_available rpm 
+ command -v rpm 
+ is_available dpkg 
+ command -v dpkg 
+ OS_VENDOR=ubuntu 
+ source /etc/default/ceph 
++ TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728 
+ case "$CEPH_DAEMON" in 
+ OSD_TYPE=prepare 
+ start_osd 
+ [[ ! -e /etc/ceph/ceph.conf ]] 
+ '[' 1 -eq 1 ']' 
+ [[ ! -e /etc/ceph/ceph.client.admin.keyring ]] 
+ case "$OSD_TYPE" in 
+ source osd_disk_prepare.sh 
++ set -ex 
+ osd_disk_prepare 
+ [[ -z /dev/container/block-data ]] 
+ [[ ! -e /dev/container/block-data ]] 
+ '[' '!' -e /var/lib/ceph/bootstrap-osd/ceph.keyring ']' 
+ timeout 10 ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring health 
+ exit 1 
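
The prepare pod shows the same pattern: timeout 10 ceph ... health produces no output and the script exits 1, which again points at the monitors being unreachable from this pod. A small sketch for checking the bootstrap keyring and repeating the health call by hand (namespace and pod name are assumptions; the keyring path is taken from the log above):

kubectl -n ceph get secrets | grep bootstrap-osd
kubectl -n ceph exec <the-osd-prepare-pod> -- \
  timeout 10 ceph --cluster ceph --name client.bootstrap-osd \
  --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring health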

What you expected to happen: to start up flawlessly

Anything else we need to know: here is my overrides.yml:

network:
  public: 10.42.0.0/16
  cluster: 10.42.0.0/16

osd_devices:
  - name: block-data
    device: /dev/container/block-data
    zap: "1"

storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s
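
As a quick sanity check on each OSD node, it may also be worth confirming that the device path referenced under osd_devices actually exists there; a generic sketch (the path is taken from the overrides, the commands are a plain assumption about the node tooling):

ls -l /dev/container/block-data
lsblk /dev/container/block-data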

I am using Rancher 2 / RKE on bare metal. I am unsure about the network setup. Maybe I have some issues here:

  • All nodes (6) can see and reach each other by IPv4 address only. Although the nodes have names, there is no DNS set up outside of the cluster.
  • Rancher/RKE sets up a flannel network with CIDR 10.42.0.0/16, which is what I used for network.public and network.cluster (see the check sketched below this list).
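
A sketch of how one might verify which addresses the monitors actually registered on (the ceph namespace is an assumption, and the mon pod name has to be filled in from the pod list):

kubectl -n ceph get pods -o wide | grep mon
kubectl -n ceph exec <a-mon-pod> -- ceph mon dump

If the mon map lists node IPs rather than pod IPs from 10.42.0.0/16 (or vice versa), the network.public and network.cluster values may need to match that network instead.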

ghost · Mar 26 '19 10:03

Hi :) Did you manage to overcome this issue? I'm experiencing the same problem.

ranrubin · Sep 01 '19 13:09