Hey admin colleagues
1) Do you run Ceph?
2) If so, do you have an SSD-backed pool?
3) Do you use this pool for RBD images?
If you answered yes to all three, do you run fstrim inside these RBD devices? (If yes, please leave a comment so we can maybe reach out to you.) If you don't, and there's a reason for it, please leave a comment as well.
@ops yep, and I already did back when I still had HDDs underneath, since fstrim on an RBD image reduces the number of objects in use - thin provisioning and stuff
Discard for the SSDs underneath is already issued by BlueStore
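For anyone wondering why the object count drops: RBD stripes an image over fixed-size RADOS objects (4 MiB by default) and only creates an object once something is written to it, so discarded ranges can give whole objects back. A rough sketch of that arithmetic, assuming the default object size; the function names and the 120 GiB figure are made up for illustration and are not from anyone's cluster here:

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def max_objects(image_bytes, object_bytes=4 * MIB):
    """Upper bound on RADOS objects backing a fully written image."""
    # Ceiling division: a partially filled last object still counts.
    return -(-image_bytes // object_bytes)

def objects_freed_at_most(trimmed_bytes, object_bytes=4 * MIB):
    """Upper bound on whole objects a trim can release.

    Assumes the discarded ranges are object-aligned and contiguous;
    in practice fewer objects go away, since partial discards only
    shrink or zero parts of an object.
    """
    return trimmed_bytes // object_bytes

print(max_objects(300 * GIB))            # objects for a full 300 GiB image
print(objects_freed_at_most(120 * GIB))  # hypothetical 120 GiB of free space
```

If the image was created with a non-default object size, pass that as `object_bytes` instead.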
@LittleFox the performance with HDDs here was too bad and HDDs too cheap.
Do you see major performance issues during the fstrim run on your VMs? How big is your cluster?
@ops the fstrim operation takes some time (at most around a minute per image), but that's mostly because it's only half-automated and doesn't run very frequently
For the few volumes with automated fstrim, I don't see any performance problems so far - but I mostly use these volumes as k8s PVs, not as disks for virtual machines, so they don't serve an OS, only the application itself
@LittleFox how big are these images? A minute is quite short compared to our timings. Could you tell me something about the number of OSDs and the current write rate in IOPS across the pool?
@ops fwiw a full fstrim on the 300GiB took 159s with up to 1.5k IOPS reported by ceph - the filesystems are a mix of ext4, xfs and btrfs, with most data on btrfs
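To put that timing next to other clusters, it can help to normalise it to a rate. A minimal sketch using only the two numbers quoted above (300 GiB, 159 s); the function name is made up, and it assumes the whole 300 GiB was walked, which overstates things because fstrim only discards free space:

```python
def trim_rate_mib_s(size_gib, seconds):
    """Nominal trim rate in MiB/s over the whole given size."""
    return size_gib * 1024 / seconds

rate = trim_rate_mib_s(300, 159)
print(f"{rate:.0f} MiB/s nominal")
```

Plugging in your own image size and runtime gives a number that is at least roughly comparable, as long as the amount of free space is in the same ballpark.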