「Ceph」- CRUSH map

  CREATED BY JENKINSBOT

CRUSH algorithm

The CRUSH algorithm enables the Ceph Storage Cluster to scale, rebalance, and recover dynamically.
Using the CRUSH algorithm, Ceph calculates which placement group (PG) should contain an object, and then calculates which Ceph OSD Daemon should store that placement group.
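
The two-step calculation (object to PG, then PG to OSDs) can be sketched roughly as follows. This is only an illustration under loud assumptions: the function names are hypothetical, `zlib.crc32` stands in for Ceph's own hash, and the OSD ranking is a simple rendezvous-style stand-in for the real CRUSH descent through the hierarchy.

```python
import zlib
from typing import List


def object_to_pg(object_name: str, pg_num: int) -> int:
    """Map an object name to a placement group id (toy hash, not Ceph's)."""
    return zlib.crc32(object_name.encode("utf-8")) % pg_num


def pg_to_osds(pg_id: int, osds: List[int], replicas: int) -> List[int]:
    """Pick `replicas` distinct OSDs for a PG.

    Assumption: OSDs are ranked by a per-(PG, OSD) hash, which is a
    stand-in for the real CRUSH calculation, not a copy of it.
    """
    ranked = sorted(
        osds,
        key=lambda osd: zlib.crc32(f"{pg_id}:{osd}".encode("utf-8")),
        reverse=True,
    )
    return ranked[:replicas]


pg = object_to_pg("my-object", pg_num=128)
acting = pg_to_osds(pg, osds=[0, 1, 2, 3], replicas=3)
print(pg, acting)
```

The point of the sketch is that both steps are pure calculations: any client holding the same inputs arrives at the same PG and the same OSD set without consulting a lookup table.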

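The CRUSH map stores the list of cluster devices, the list of buckets, the failure-domain hierarchy, and the rules, defined over failure domains, that are applied when data is stored.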

# ceph osd crush dump

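Contents of the CRUSH map

The CRUSH map contains many things, but mainly these four parts: Device, Bucket Type, Bucket, and Rule.

Device

The list of all devices in the cluster, located at the beginning of the CRUSH map: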

# devices
device 0 osd.0 class ssd
device 1 osd.1 class ssd
device 2 osd.2 class ssd
device 3 osd.3 class ssd
...

To map PGs to OSDs, CRUSH needs the list of OSD devices.
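
As a small illustration of how the device list could be read from a decompiled CRUSH map, here is a hedged sketch. The `Device` type and `parse_devices` helper are hypothetical names introduced for this example, and the regular expression only covers the `device <id> <name> [class <class>]` form shown above.

```python
import re
from typing import List, NamedTuple, Optional


class Device(NamedTuple):
    id: int
    name: str
    device_class: Optional[str]


# Matches lines such as "device 0 osd.0 class ssd"; the class part is optional.
DEVICE_RE = re.compile(r"^device\s+(\d+)\s+(\S+)(?:\s+class\s+(\S+))?\s*$")


def parse_devices(text: str) -> List[Device]:
    """Collect every device line; comments and other lines are skipped."""
    devices = []
    for line in text.splitlines():
        match = DEVICE_RE.match(line.strip())
        if match:
            devices.append(Device(int(match.group(1)), match.group(2), match.group(3)))
    return devices


sample = """\
# devices
device 0 osd.0 class ssd
device 1 osd.1 class ssd
"""
print(parse_devices(sample))
```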

Bucket Type

...
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 zone
type 10 region
type 11 root
...

Bucket

host laptop-asus-k53sd {
	id -7		# do not change unnecessarily
	id -8 class ssd		# do not change unnecessarily
	# weight 1.819
	alg straw2
	hash 0	# rjenkins1
	item osd.3 weight 1.819
}
host pc-amd64-100249 {
    ...
}
host pc-amd64-100254 {
    ...
}
root default {
	id -1		# do not change unnecessarily
	id -4 class ssd		# do not change unnecessarily
	# weight 2.529
	alg straw2
	hash 0	# rjenkins1
	item pc-amd64-100254 weight 0.300
	item pc-amd64-100249 weight 0.409
	item laptop-asus-k53sd weight 1.819
}
...
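
Each bucket above uses `alg straw2`. The idea behind straw2 is that every item draws a "straw" whose length depends on a hash and on the item's weight, and the longest straw wins, so items are picked in proportion to their weights. The sketch below illustrates that idea only: `straw2_choose` is a hypothetical helper, SHA-256 and floating-point `math.log` are assumptions standing in for Ceph's rjenkins1 hash and fixed-point arithmetic.

```python
import hashlib
import math
from collections import Counter
from typing import Dict

# Item weights copied from the "root default" bucket above.
ROOT_DEFAULT: Dict[str, float] = {
    "pc-amd64-100254": 0.300,
    "pc-amd64-100249": 0.409,
    "laptop-asus-k53sd": 1.819,
}


def straw2_choose(items: Dict[str, float], key: str) -> str:
    """Return the item with the longest straw for this key."""
    best_item, best_draw = None, -math.inf
    for name, weight in items.items():
        if weight <= 0:
            continue  # zero-weight items never win
        digest = hashlib.sha256(f"{key}/{name}".encode("utf-8")).digest()
        # Uniform value in (0, 1] derived from the hash.
        u = (int.from_bytes(digest[:8], "big") + 1) / 2**64
        # ln(u) <= 0; dividing by a larger weight moves the draw toward 0.
        draw = math.log(u) / weight
        if draw > best_draw:
            best_item, best_draw = name, draw
    return best_item


counts = Counter(straw2_choose(ROOT_DEFAULT, f"pg-{i}") for i in range(20000))
for name, count in counts.most_common():
    print(name, round(count / 20000, 3))
```

Because each item's draw depends only on its own weight, changing one item's weight only moves data to or from that item, which is the property that keeps rebalancing small.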

Rules

...
rule replicated_rule {
	id 0
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type host
	step emit
}
...
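
The three steps of `replicated_rule` (take the root, choose one leaf under each of N distinct hosts, emit) can be walked through with a toy model. Everything here is an assumption made for illustration: the hierarchy, host names, and weights in `TREE` are invented, the helper names are hypothetical, and ranking items by a straw-style draw replaces the retry logic of the real implementation.

```python
import hashlib
import math
from typing import Dict, List

# Hypothetical hierarchy: root -> hosts -> OSDs (names and weights are made up).
TREE: Dict[str, Dict[str, float]] = {
    "default": {"host-a": 2.0, "host-b": 1.0, "host-c": 1.0},
    "host-a": {"osd.0": 1.0, "osd.1": 1.0},
    "host-b": {"osd.2": 1.0},
    "host-c": {"osd.3": 1.0},
}


def _draw(key: str, name: str, weight: float) -> float:
    """Straw-style draw: a hash-derived value scaled by the item weight."""
    digest = hashlib.sha256(f"{key}/{name}".encode("utf-8")).digest()
    u = (int.from_bytes(digest[:8], "big") + 1) / 2**64
    return math.log(u) / weight


def choose_distinct(bucket: str, key: str, n: int) -> List[str]:
    """Pick up to n distinct children of a bucket, best draw first."""
    items = TREE[bucket]
    ranked = sorted(items, key=lambda name: _draw(key, name, items[name]), reverse=True)
    return ranked[:n]


def run_replicated_rule(pg_id: int, replicas: int) -> List[str]:
    key = f"pg{pg_id}"
    start = "default"                                      # step take default
    hosts = choose_distinct(start, key, replicas)          # chooseleaf firstn 0 type host
    osds = [choose_distinct(h, key, 1)[0] for h in hosts]  # one leaf (OSD) per host
    return osds                                            # step emit


print(run_replicated_rule(pg_id=7, replicas=3))
```

In the rule, `firstn 0` means "as many as the pool's replica count", which is why the sketch takes `replicas` as a parameter. Choosing at the `host` level first is what keeps two replicas of the same PG off the same host.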

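Relationship between the parts

Device, Bucket Type, and Bucket together describe the layout of the storage devices. The layout is a tree with two kinds of elements: interior nodes (buckets) and leaves (devices).
A Rule refers to Devices, Bucket Types, and Buckets to control how data in a pool is stored: replication, placement, and attributes.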