
6 posts tagged with "recovery"

· One min read
Joachim Kraftmayer

Commvault has been in use as a data protection solution for years, and the customer is now looking to replace the existing storage backend (EMC) for its entire customer environments.

Commvault provides data backup through a single interface. Through the gradual deployment of Ceph S3 in several expansion stages, the customer built confidence in Ceph as a storage technology, and backups are gradually being transferred to the new backend.

In the first phase, Ceph S3 demonstrated its performance and scalability capabilities.

In the following phases, the focus will be on flexibility and use as unified storage for cloud computing and Kubernetes.

For all these scenarios, the customer relies on Ceph as an extremely scalable, high-performance and cost-effective storage backend.

Ceph S3 easily handles over 1 PB of backup data and more than 500 GB per hour of backup throughput, and it is ready to grow even further as requirements increase in the future.

After in-depth consultation, we were able to exceed the customer’s expectations for the Ceph cluster in production.

· One min read
Joachim Kraftmayer

The customer uses Commvault as a data backup solution for their entire customer environments.

Wherever the data resides, Commvault provides the backup of the data through a single interface. The customer thus avoids costly data loss scenarios, disconnected data silos, lack of recovery SLAs and inefficient scaling.

For all these scenarios, the customer relies on Ceph as a powerful and cost-effective storage backend for Commvault.

Ceph easily handles over 2 PB of backup data and more than 1 TB per hour of backup throughput, and it is ready to grow even further as requirements increase in the future.

In conclusion, we were able to clearly exceed the customer’s expectations of the Ceph cluster as early as the test phase.

· One min read
Joachim Kraftmayer

The crash module collects information about daemon crashdumps and stores it in the Ceph cluster for later analysis.

If you see the RECENT_CRASH health warning ("daemons have recently crashed") in the Ceph status output (ceph -s), you should first run the following command to list all collected crashes:

ceph crash ls

The output shows which OSD(s) had or have problems, together with the respective time of occurrence.
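
For illustration, the listing might look roughly like this (the ID and entity below are made up, not from a real cluster):

ID                                                                  ENTITY  NEW
2023-01-17T10:23:11.123456Z_1b2f3a4c-5d6e-4f80-9a1b-2c3d4e5f6a7b    osd.7   *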

You can get more information about the respective crash event with:

ceph crash info <ID>

If the crash is no longer relevant, it can be acknowledged with one of the following two commands:

ceph crash archive <ID>

or

ceph crash archive-all

After that the warning disappears from the ceph status output.

Sources

https://docs.ceph.com/en/quincy/mgr/crash/

· One min read
Joachim Kraftmayer

Log in to one Ceph monitor node and create a new recovery client:

You can do this with client.admin, but I prefer to create a separate recovery client.

cephadm docker host:

ceph -n mon. --keyring /var/lib/ceph/<fsid>/mon.<mon-name>/keyring auth get-or-create client.recovery mon 'allow *' mds 'allow *' mgr 'allow *' osd 'allow *'

ceph standard host:

ceph -n mon. --keyring /var/lib/ceph/mon/<mon-name>/keyring auth get-or-create client.recovery mon 'allow *' mds 'allow *' mgr 'allow *' osd 'allow *'

Install ceph-common:

apt install ceph-common

Create the following two files:

/etc/ceph/ceph.conf

[global]
fsid = <cluster fsid; you can find it in the ceph_fsid file in each OSD, MON or MGR data directory>
mon_host = [v2:<ip addr of the active ceph monitor>:3300/0,v1:<ip addr of the active ceph monitor>:6789/0]

/etc/ceph/ceph.client.recovery.keyring (add the output of the get-or-create command; replace the ":" with " = " and put the client name in square brackets)
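
For illustration, a minimal sketch of what the two finished files could look like (the fsid and monitor IP are placeholders, not real values):

/etc/ceph/ceph.conf

[global]
fsid = 4b5c8c0a-ff60-4a03-9f1c-2a8e7d9c1b2a
mon_host = [v2:10.0.0.11:3300/0,v1:10.0.0.11:6789/0]

/etc/ceph/ceph.client.recovery.keyring

[client.recovery]
    key = <key from the get-or-create output>

Afterwards you can check that the recovery client works, for example with:

ceph -n client.recovery -s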

· One min read
Joachim Kraftmayer
  • osd max backfills: This is the maximum number of backfill operations allowed to or from an OSD. The higher the number, the quicker the recovery, which might impact overall cluster performance until recovery finishes.
  • osd recovery max active: This is the maximum number of active recovery requests. The higher the number, the quicker the recovery, which might impact overall cluster performance until recovery finishes.
  • osd recovery op priority: This is the priority set for recovery operations. The lower the number, the higher the recovery priority. Higher recovery priority might cause performance degradation until recovery completes.
ceph tell 'osd.*' injectargs --osd-max-backfills=2 --osd-recovery-max-active=2
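
Note that injectargs only changes the values in the running daemons. As a sketch of an alternative, recent Ceph releases can also persist the same values in the central config database (option names as above):

ceph config set osd osd_max_backfills 2
ceph config set osd osd_recovery_max_active 2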

Recommendation

Start in small steps, observe the Ceph status, client IOPs and throughput and then continue to increase in small steps.

In production, with regard to the applications and hardware infrastructure, we recommend setting these values back to their defaults as soon as possible.
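
If the values were persisted via ceph config, a sketch of removing the overrides again, which returns the options to their defaults:

ceph config rm osd osd_max_backfills
ceph config rm osd osd_recovery_max_active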

Sources

https://www.suse.com/support/kb/doc/?id=000019693