This section shows how to add OSDs to, or remove OSDs from, an existing Ceph cluster.
        +--------------------+           |           +----------------------+
        | [client.srv.local] |10.0.0.30  |  10.0.0.31|    [www.srv.local]   |
        |     Ceph Client    +-----------+-----------+        RADOSGW       |
        |                    |           |           |                      |
        +--------------------+           |           +----------------------+
            +----------------------------+----------------------------+
            |                            |                            |
            |10.0.0.51                   |10.0.0.52                   |10.0.0.53
+-----------+-----------+    +-----------+-----------+    +-----------+-----------+
|   [node01.srv.local]  |    |   [node02.srv.local]  |    |   [node03.srv.local]  |
|     Object Storage    +----+     Object Storage    +----+     Object Storage    |
|     Monitor Daemon    |    |                       |    |                       |
|     Manager Daemon    |    |                       |    |                       |
+-----------------------+    +-----------------------+    +-----------------------+
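[1] As an example, add a new node [node04] as an OSD, working from the admin node.
The block device used on [node04] in this example is [/dev/sdb]. Note that the sample output below shows [/dev/vdb1] rather than [/dev/sdb1]; substitute the device name that applies on your node.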
# transfer public key
root@node01:~# ssh-copy-id node04
# install required packages
root@node01:~# ssh node04 "apt update; apt -y install ceph"
# transfer required files
root@node01:~# scp /etc/ceph/ceph.conf node04:/etc/ceph/ceph.conf
root@node01:~# scp /etc/ceph/ceph.client.admin.keyring node04:/etc/ceph
root@node01:~# scp /var/lib/ceph/bootstrap-osd/ceph.keyring node04:/var/lib/ceph/bootstrap-osd
# fix ownership, partition the disk and create the OSD
root@node01:~# ssh node04 \
"chown ceph:ceph /etc/ceph/ceph.* /var/lib/ceph/bootstrap-osd/*; \
parted --script /dev/sdb 'mklabel gpt'; \
parted --script /dev/sdb 'mkpart primary 0% 100%'; \
ceph-volume lvm create --data /dev/sdb1"
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 2c941f07-96f7-4b51-91fd-5296a102eaa9
Running command: /usr/sbin/vgcreate --force --yes ceph-d299b0ff-e127-4de7-a0f4-bf8dffd95308 /dev/vdb1
stdout: Physical volume "/dev/vdb1" successfully created.
stdout: Volume group "ceph-d299b0ff-e127-4de7-a0f4-bf8dffd95308" successfully created
Running command: /usr/sbin/lvcreate --yes -l 100%FREE -n osd-block-2c941f07-96f7-4b51-91fd-5296a102eaa9 ceph-d299b0ff-e127-4de7-a0f4-bf8dffd95308
stdout: Logical volume "osd-block-2c941f07-96f7-4b51-91fd-5296a102eaa9" created.
.....
.....
Running command: /usr/bin/systemctl start ceph-osd@3
--> ceph-volume lvm activate successful for osd ID: 3
--> ceph-volume lvm create successful for: /dev/vdb1
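To repeat the step above for another disk, the remote command string can be generated by a small function. This is only a sketch: [osd_prepare_cmd] is a hypothetical name, and it assumes the first partition is named by appending [1] to the device (true for [/dev/sdX] and [/dev/vdX], not for NVMe devices).

```shell
#!/bin/sh
# Hypothetical helper: print the remote command for preparing one OSD disk.
# Assumption: the first partition is "<device>1" (sdX/vdX naming only).
osd_prepare_cmd() {
    dev="$1"
    # '%s\n' keeps the literal "0% 100%" from being read as a printf format.
    printf '%s\n' "chown ceph:ceph /etc/ceph/ceph.* /var/lib/ceph/bootstrap-osd/*; parted --script ${dev} 'mklabel gpt'; parted --script ${dev} 'mkpart primary 0% 100%'; ceph-volume lvm create --data ${dev}1"
}

# Usage, run on the admin node:
#   ssh node04 "$(osd_prepare_cmd /dev/sdb)"
```

Printing the command first, before passing it to [ssh], makes it easy to check exactly what will run on the new node.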
root@node01:~# ceph -s
  cluster:
    id:     72840c24-3a82-4e28-be87-cf9f905918fb
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum node01 (age 3h)
    mgr: node01(active, since 31m)
    mds: cephfs:1 {0=node01=up:active}
    osd: 4 osds: 4 up (since 4m), 4 in (since 4m)
    rgw: 1 daemon active (www)

  task status:
    scrub status:
        mds.node01: idle

  data:
    pools:   8 pools, 193 pgs
    objects: 215 objects, 35 KiB
    usage:   4.4 GiB used, 316 GiB / 320 GiB avail
    pgs:     193 active+clean
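The check performed by eye above (every OSD both [up] and [in]) can also be scripted. The following is a minimal sketch that parses the [osd:] line of [ceph -s] in the format shown in this example; [osds_all_up_in] is a hypothetical name, and other Ceph releases may print the line differently.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph -s` output on stdin and succeed only if
# the number of OSDs equals both the "up" count and the "in" count.
osds_all_up_in() {
    awk '
        /^[ \t]*osd:/ {
            total = $2
            for (i = 3; i <= NF; i++) {
                # the count is the field just before the word "up" / "in"
                if ($i ~ /^up,?$/)    up   = $(i - 1)
                if ($i ~ /^in[,;]?$/) n_in = $(i - 1)
            }
            found = 1
        }
        END { exit !(found && total == up && total == n_in) }
    '
}

# Usage:
#   ceph -s | osds_all_up_in && echo "all OSDs are up and in"
```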
[2] To remove an OSD node from an existing cluster, run the following commands.
This example removes the [node04] node.
root@node01:~# ceph -s
  cluster:
    id:     72840c24-3a82-4e28-be87-cf9f905918fb
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum node01 (age 3h)
    mgr: node01(active, since 32m)
    mds: cephfs:1 {0=node01=up:active}
    osd: 4 osds: 4 up (since 4m), 4 in (since 4m)
    rgw: 1 daemon active (www)

  task status:
    scrub status:
        mds.node01: idle

  data:
    pools:   8 pools, 193 pgs
    objects: 215 objects, 35 KiB
    usage:   4.4 GiB used, 316 GiB / 320 GiB avail
    pgs:     193 active+clean
root@node01:~# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME        STATUS  REWEIGHT  PRI-AFF
-1         0.31238  root default
-3         0.07809      host node01
 0    hdd  0.07809          osd.0        up   1.00000  1.00000
-5         0.07809      host node02
 1    hdd  0.07809          osd.1        up   1.00000  1.00000
-7         0.07809      host node03
 2    hdd  0.07809          osd.2        up   1.00000  1.00000
-9         0.07809      host node04
 3    hdd  0.07809          osd.3        up   1.00000  1.00000
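When a node carries several OSDs, reading their IDs off the tree by hand is error-prone. Below is a sketch that extracts the OSD IDs listed under one host from [ceph osd tree] output. [osds_on_host] is a hypothetical name, and the column positions assume the layout shown above, with a device class set on every OSD.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph osd tree` output on stdin and print the
# IDs of the OSDs listed under the given host.
# Assumption: host rows are "ID WEIGHT host NAME" and OSD rows are
# "ID CLASS WEIGHT osd.N ...", as in the sample output.
osds_on_host() {
    awk -v host="$1" '
        $3 == "host" { current = $4; next }   # entering a host bucket
        $3 == "root" { current = "";  next }  # a new root resets the host
        current == host && $4 ~ /^osd\.[0-9]+$/ { print $1 }
    '
}

# Usage:
#   ceph osd tree | osds_on_host node04
```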
# specify the ID of the OSD you'd like to remove
root@node01:~# ceph osd out 3
marked out osd.3.
# watch the cluster status live
# after [ceph osd out ***] runs, rebalancing starts automatically
# to quit the live watch, press [Ctrl + c]
root@node01:~# ceph -w
  cluster:
    id:     72840c24-3a82-4e28-be87-cf9f905918fb
    health: HEALTH_WARN
            Reduced data availability: 33 pgs inactive, 15 pgs peering
            Degraded data redundancy: 15/645 objects degraded (2.326%), 4 pgs degraded

  services:
    mon: 1 daemons, quorum node01 (age 3h)
    mgr: node01(active, since 33m)
    mds: cephfs:1 {0=node01=up:active}
    osd: 4 osds: 4 up (since 5m), 3 in (since 8s); 46 remapped pgs
    rgw: 1 daemon active (www)

  task status:
    scrub status:
        mds.node01: idle

  data:
    pools:   8 pools, 193 pgs
    objects: 215 objects, 35 KiB
    usage:   3.3 GiB used, 237 GiB / 240 GiB avail
    pgs:     17.617% pgs unknown
             59.067% pgs not active
             15/645 objects degraded (2.326%)
             95 activating
             45 active+clean
             34 unknown
             9  remapped+peering
             6  peering
             4  activating+degraded

  progress:
    Rebalancing after osd.3 marked out (5s)
      [==..........................] (remaining: 45s)

2020-08-31T19:38:47.069524+0900 mon.node01 [WRN] Health check update: Degraded data redundancy: 160/645 objects degraded (24.806%), 32 pgs degraded (PG_DEGRADED)
2020-08-31T19:38:47.069550+0900 mon.node01 [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 33 pgs inactive, 15 pgs peering)
.....
.....
2020-08-31T19:40:10.658884+0900 mon.node01 [INF] Health check cleared: PG_DEGRADED (was: Degraded data redundancy: 1 pg undersized)
2020-08-31T19:40:10.658994+0900 mon.node01 [INF] Cluster is now healthy
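Instead of watching [ceph -w] interactively, a script can poll [ceph health] until the cluster reports [HEALTH_OK]. This is a minimal sketch: [wait_for_health_ok] and its timeout and interval arguments are hypothetical, and the defaults are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: poll `ceph health` until it reports HEALTH_OK.
#   $1 = timeout in seconds (default 600), $2 = poll interval (default 5)
# Returns 0 once healthy, 1 if the timeout is reached first.
wait_for_health_ok() {
    timeout="${1:-600}"
    interval="${2:-5}"
    waited=0
    while :; do
        status=$(ceph health 2>/dev/null)
        case "$status" in
            HEALTH_OK*) return 0 ;;
        esac
        [ "$waited" -ge "$timeout" ] && return 1
        sleep "$interval"
        waited=$((waited + interval))
    done
}

# Usage:
#   ceph osd out 3 && wait_for_health_ok 900 10
```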
# once the status returns to [HEALTH_OK], stop and disable the OSD service on the target node
root@node01:~# ssh node04 "systemctl disable --now ceph-osd@3.service"
# remove the OSD from the cluster by specifying its ID
root@node01:~# ceph osd purge 3 --yes-i-really-mean-it
purged osd.3
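The stop-and-purge steps can be guarded with [ceph osd safe-to-destroy], which fails while placement groups still depend on the OSD. The sketch below wraps the two commands used above behind that check; [retire_osd], its retry count, and the [RETIRE_INTERVAL] variable are hypothetical, and it assumes the OSD has already been marked out.

```shell
#!/bin/sh
# Hypothetical helper: stop and purge one OSD once Ceph says it is safe.
#   $1 = OSD ID, $2 = host running it, $3 = max checks (default 60)
# Assumption: `ceph osd out <id>` was already run for this OSD.
retire_osd() {
    id="$1"
    host="$2"
    tries="${3:-60}"
    # safe-to-destroy exits non-zero while data still depends on the OSD
    until ceph osd safe-to-destroy "osd.${id}" >/dev/null 2>&1; do
        tries=$((tries - 1))
        if [ "$tries" -le 0 ]; then
            echo "osd.${id} is still not safe to destroy" >&2
            return 1
        fi
        sleep "${RETIRE_INTERVAL:-10}"
    done
    ssh "$host" "systemctl disable --now ceph-osd@${id}.service" || return 1
    ceph osd purge "$id" --yes-i-really-mean-it
}

# Usage:
#   retire_osd 3 node04
```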
root@node01:~# ceph -s
  cluster:
    id:     72840c24-3a82-4e28-be87-cf9f905918fb
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum node01 (age 3h)
    mgr: node01(active, since 36m)
    mds: cephfs:1 {0=node01=up:active}
    osd: 3 osds: 3 up (since 28s), 3 in (since 3m)
    rgw: 1 daemon active (www)

  task status:
    scrub status:
        mds.node01: idle

  data:
    pools:   8 pools, 193 pgs
    objects: 215 objects, 35 KiB
    usage:   3.3 GiB used, 237 GiB / 240 GiB avail
    pgs:     193 active+clean