Hey admin colleagues

1) Do you run CEPH?
2) If so, do you have a SSD backed pool?
3) Do you use this pool for RBD images?

If you've answered all three with yes: do you run fstrim inside these RBD devices? (If yes, please leave a comment so we can maybe reach out to you.) If you deliberately don't do it, please leave a comment as well.

@ops We are using trim in VMs running on Ceph RBDs in SSD pools, mostly because it seemed like the right thing to do if Ceph is to be able to reclaim pool capacity from partially filled RBDs. I haven't paid much attention to whether this decreases performance; is that something you have experienced?

@INCO During an fstrim run the filesystem is nearly unusably slow. For example, reading the 40 KB that the textfile component of node_exporter needs takes two minutes.
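One possible mitigation for the slowdown described above (a sketch, not something tested on the setups in this thread): instead of one big trim, issue bounded trims over the filesystem with pauses in between, using fstrim's `-o`/`--offset` and `-l`/`--length` options from util-linux. The function names and the chunk/pause values are illustrative assumptions:

```python
import subprocess
import time


def chunked_trim_cmds(fs_size, chunk=1 << 30, mountpoint="/"):
    """Yield one fstrim invocation per chunk-sized slice of the filesystem,
    so each call only trims a bounded byte range (fstrim -o/--offset and
    -l/--length restrict the search range of a single run)."""
    offset = 0
    while offset < fs_size:
        yield ["fstrim", "-v", "-o", str(offset), "-l", str(chunk), mountpoint]
        offset += chunk


def run_chunked_trim(fs_size, chunk=1 << 30, mountpoint="/", pause=5,
                     dry_run=True):
    """Run the chunked trims with a pause between chunks so regular IO
    gets breathing room. dry_run only prints the commands."""
    for cmd in chunked_trim_cmds(fs_size, chunk, mountpoint):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # needs root inside the VM
            time.sleep(pause)
```

Whether this actually helps depends on how the discards land on the RBD layer; the point is only to spread the trim load over time instead of issuing it all at once.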


@INCO How big is your cluster, how full is it, and what is your use case? If we may ask.


@ops Our cluster is pretty small: 30% of 24 TB (raw size) is in use, mostly for lightly used VM OS disks.
Thank you for the information; I'll keep that in mind should we also run into slowdown issues.

So far our biggest issue has been ungraceful SSD failures: it apparently takes some time for the OSD to be marked as failed and for IO to be redirected to a redundant copy on another OSD, and until then we saw some VMs freeze on disk IO.
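For reference, the delay until a dead OSD is reported down is governed by Ceph's heartbeat settings. A sketch of the relevant `ceph.conf` options (the values shown are the usual defaults and purely illustrative; tune with care, since overly aggressive values cause flapping):

```ini
[osd]
# seconds between heartbeat pings to peer OSDs
osd_heartbeat_interval = 6
# seconds without a heartbeat before peers report the OSD down
osd_heartbeat_grace = 20
```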

@INCO How many OSDs do you have, and with how many (write) IOPS?

@ops I don't have exact statistics at hand, but at the moment it's fewer than 100 IOPS across 12 OSDs. So, next to nothing, I guess.

@INCO OK, that's far too small to compare to our setup. Thanks for your help anyway.

Uberspace Mastodon