「Rook Ceph」- Troubleshooting Common OSD Issues

  CREATED BY JENKINSBOT

[ROOK-CEPH] OSD Init Container Fails to Start

ceph: failed to initialize OSD · Issue #8023 · rook/rook · GitHub
Cluster unavailable after node reboot, symlink already exist · Issue #10860 · rook/rook · GitHub

Problem Description

In Rook Ceph, after a node reboots, the Init Container (activate) of the OSD-<ID> Pod fails to start and reports the following error:

# kubectl logs rook-ceph-osd-5-7f759955bc-9bqt4 -c activate 
...
Running command: /usr/bin/ceph-bluestore-tool prime-osd-dir --dev /dev/sdb --path /var/lib/ceph/osd/ceph-5 --no-mon-config
Running command: /usr/bin/chown -R ceph:ceph /dev/sdb
Running command: /usr/bin/ln -s /dev/sdb /var/lib/ceph/osd/ceph-5/block
 stderr: ln: failed to create symbolic link '/var/lib/ceph/osd/ceph-5/block': File exists
Traceback (most recent call last):
  File "/usr/sbin/ceph-volume", line 11, in <module>
    load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
...

Solution

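Locate the activate-osd storage directory mounted by the activate container, and delete the block file inside it (it is a symbolic link left over from before the reboot, which is why ln reports "File exists").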
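
The cleanup can be sketched as a small shell helper. This is a minimal sketch, not an official Rook procedure: the function name remove_stale_block_link is hypothetical, and the commented kubectl lookup assumes that activate-osd is a hostPath volume on the node, which should be verified against the Pod spec in your cluster. The helper only removes block when it really is a symbolic link, so a regular file or device node is never deleted by mistake.

```shell
#!/bin/sh
# Hypothetical helper: remove the stale "block" symlink from an OSD directory.
# It refuses to delete anything that is not a symbolic link.
remove_stale_block_link() {
    osd_dir="$1"
    link="$osd_dir/block"
    if [ -L "$link" ]; then
        echo "removing stale symlink: $link -> $(readlink "$link")"
        rm -- "$link"
    elif [ -e "$link" ]; then
        echo "refusing to remove $link: not a symbolic link" >&2
        return 1
    else
        echo "nothing to do: $link does not exist"
    fi
}

# Assumed usage (run the lookup from a machine with kubectl access, then run
# the helper on the node that hosts the OSD). The jsonpath assumes the
# activate-osd volume is a hostPath volume; check the Pod spec first.
#
#   kubectl get pod rook-ceph-osd-5-7f759955bc-9bqt4 \
#     -o jsonpath='{.spec.volumes[?(@.name=="activate-osd")].hostPath.path}'
#
#   remove_stale_block_link "<path printed by the command above>"
```

Once the symlink is gone, the activate container should be able to create it again on its next restart. If the Pod does not recover on its own, deleting the Pod so that its Deployment recreates it is one way to trigger a fresh attempt.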