The Ceph Community hosted its first post-pandemic event at the Bloomberg offices in New York City. Ceph Day NYC was a great success!
Ceph Reef vs Quincy RBD Performance
Clyso's Mark Nelson has written the first part in a series looking at performance testing of the upcoming Ceph Reef release vs the previous Quincy release. See the blog post here!
Please feel free to contact us if you are interested in Ceph support or performance consulting!
ceph - how to disable the mclock scheduler
After more than 4 years of development, the mClock scheduler is the default in Ceph Quincy (version 17). If you don't want to use this scheduler, you can disable it with the option osd_op_queue.
WPQ was the default before Ceph Quincy and the change requires a restart of the OSDs.
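A minimal sketch of switching back, assuming a package-based deployment where the OSDs are restarted via systemd (cephadm users would restart the OSD daemons through the orchestrator instead):
# set the scheduler back to WPQ for all OSDs
ceph config set osd osd_op_queue wpq
# verify the setting
ceph config get osd osd_op_queue
# restart the OSDs on each OSD host so the change takes effect
systemctl restart ceph-osd.target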
Sources:
https://docs.ceph.com/en/quincy/rados/configuration/osd-config-ref/#confval-osd_op_queue
https://docs.ceph.com/en/quincy/rados/configuration/osd-config-ref/#qos-based-on-mclock
Fix CephFS Filesystem Read-Only
After a reboot of the MDS server it can happen that the CephFS file system becomes read-only:
HEALTH_WARN 1 MDSs are read only
[WRN] MDS_READ_ONLY: 1 MDSs are read only
mds.XXX(mds.0): MDS in read-only mode
https://tracker.ceph.com/issues/58082
In the MDS log you will find the following entries:
log_channel(cluster) log [ERR] : failed to commit dir 0x1 object, errno -22
mds.0.11963 unhandled write error (22) Invalid argument, force readonly...
mds.0.cache force file system read-only
log_channel(cluster) log [WRN] : force file system read-only
mds.0.server force_clients_readonly
This is a known upstream issue, though the fix is not merged yet.
As a workaround you can use the following steps:
ceph config set mds mds_dir_max_commit_size 80
ceph fs fail <fs_name>
ceph fs set <fs_name> joinable true
If this is not successful, you may need to increase mds_dir_max_commit_size further, e.g. to 160, and repeat the steps, as sketched below.
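A sketch of the second attempt with the larger commit size; <fs_name> is a placeholder for your file system name:
ceph config set mds mds_dir_max_commit_size 160
ceph fs fail <fs_name>
ceph fs set <fs_name> joinable true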
Ceph Quincy release with bugfix for PGLog dups
Our bugfix from earlier this year was published in the Ceph Quincy release 17.2.4.
Trimming of PGLog dups is now controlled by size instead of the version. This fixes the PGLog inflation issue that was happening when online (in OSD) trimming jammed after a PG split operation. Also, a new offline mechanism has been added: ceph-objectstore-tool now has a trim-pg-log-dups op that targets situations where an OSD is unable to boot due to those inflated dups. If that is the case, in OSD logs the “You can be hit by THE DUPS BUG” warning will be visible. Relevant tracker: https://tracker.ceph.com/issues/53729
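A rough sketch of the offline trim for a single PG, assuming a package-based install with OSD id 17 and PG 2.7 as placeholders; depending on how inflated the dups are, additional trim-related options may be needed (see the related posts below):
# stop the affected OSD so ceph-objectstore-tool can open its object store
systemctl stop ceph-osd@17
# trim the inflated PG log dups offline for one PG
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-17 --op trim-pg-log-dups --pgid 2.7
# start the OSD again
systemctl start ceph-osd@17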
related posts
osds with unlimited ram growth
how to identify osds affected by pg dup bug
Sources
https://docs.ceph.com/en/latest/releases/quincy/#v17-2-4-quincy
[WRN] clients failing to respond to cache pressure
When the MDS cache fills up, the MDS must evict inodes from its cache. This also means that the MDS will ask some clients to drop inodes from their caches as well.
The MDS asks the CephFS client several times to release these inodes. If the client does not respond to the cache recall requests, Ceph will log this warning.
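As a starting point for investigating the warning, the following sketch shows how to identify the clients and inspect the MDS cache limit; raising mds_cache_memory_limit (default 4 GiB) is only one possible mitigation, and fixing a misbehaving client may be the better one:
# show which clients the warning refers to
ceph health detail
# inspect the current MDS cache memory limit
ceph config get mds mds_cache_memory_limit
# one possible mitigation: raise the limit, e.g. to 8 GiB
ceph config set mds mds_cache_memory_limit 8589934592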
IBM will add Red Hat storage product roadmaps and Red Hat associate teams to the IBM Storage business unit
ARMONK, N.Y., Oct. 4, 2022 /PRNewswire/ -- IBM (NYSE: IBM) announced today it will add Red Hat storage product roadmaps and Red Hat associate teams to the IBM Storage business unit, bringing consistent application and data storage across on-premises infrastructure and cloud.
rook ceph validate that the RBD cache is active inside the k8s pod
validate if the RBD Cache is active on your client
The cache has been enabled by default since version 0.87.
To explicitly enable the cache on the client side, add the following configuration to /etc/ceph/ceph.conf:
[client]
rbd cache = true
rbd cache writethrough until flush = true
add local admin socket
So that you can also verify the status on the client side, add the following two parameters:
[client]
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
log file = /var/log/ceph/
configure permissions and security
Both paths must be writable by the user running the application that uses the RBD library, and security frameworks such as SELinux or AppArmor must be configured accordingly.
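A small sketch of what that can look like for a KVM guest; the qemu user and group are assumptions and depend on your distribution and security setup:
# create the socket and log directories and make them writable for the client process
sudo mkdir -p /var/run/ceph /var/log/ceph
sudo chown qemu:qemu /var/run/ceph /var/log/ceph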
request info via the admin socket
Once this is done, run your application that is supposed to use librbd (kvm, docker, podman, ...) and request the information via the admin daemon socket:
$ sudo ceph --admin-daemon /var/run/ceph/ceph-client.admin.66606.140190886662256.asok config show | grep rbd_cache
"rbd_cache": "true",
"rbd_cache_writethrough_until_flush": "true",
"rbd_cache_size": "33554432",
"rbd_cache_max_dirty": "25165824",
"rbd_cache_target_dirty": "16777216",
"rbd_cache_max_dirty_age": "1",
"rbd_cache_max_dirty_object": "0",
"rbd_cache_block_writes_upfront": "false",
Verify the cache behaviour
To measure the performance difference, you can deactivate the cache in the [client] section of your ceph.conf as follows:
[client]
rbd cache = false
Then run a fio benchmark with the following command:
fio --name=rbd-cache-test --ioengine=rbd --pool=<pool-name> --rbdname=rbd1 --direct=1 --fsync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based
Finally, run this test with RBD client cache enabled and disabled and you should notice a significant difference.
Sources
https://www.sebastien-han.fr/blog/2015/09/02/ceph-validate-that-the-rbd-cache-is-active/
ceph-mgr recreate sqlite database for devicehealth module
If you had to recreate the device_health_metrics or .mgr pool, the devicehealth module is missing its SQLite3 database structure. You have to recreate the structure manually.
crash events
backtrace": [
" File \"/usr/share/ceph/mgr/devicehealth/module.py\", line 373, in serve\n self.scrape_all()",
" File \"/usr/share/ceph/mgr/devicehealth/module.py\", line 425, in scrape_all\n self.put_device_metrics(device, data)",
" File \"/usr/share/ceph/mgr/devicehealth/module.py\", line 500, in put_device_metrics\n self._create_device(devid)",
" File \"/usr/share/ceph/mgr/devicehealth/module.py\", line 487, in _create_device\n cursor = self.db.execute(SQL, (devid,))",
"sqlite3.InternalError: unknown operation"
install required packages
apt install libsqlite3-mod-ceph libsqlite3-mod-ceph-dev
create database
clyso@compute-21:~$ sqlite3 -cmd '.load libcephsqlite.so' -cmd '.open file:///.mgr:devicehealth/main.db?vfs=ceph'
main: "" r/w
SQLite version 3.39.1 2022-07-13 19:41:41
Enter ".help" for usage hints.
sqlite>
list databases
clyso@compute-21:~$ sqlite3 -cmd '.load libcephsqlite.so' -cmd '.databases'
main: "" r/w
SQLite version 3.39.1 2022-07-13 19:41:41
Enter ".help" for usage hints.
sqlite>
create table
clyso@compute-21:~$ sqlite3 -cmd '.load libcephsqlite.so' -cmd '.open file:///.mgr:devicehealth/main.db?vfs=ceph'
SQLite version 3.39.1 2022-07-13 19:41:41
Enter ".help" for usage hints.
sqlite> CREATE TABLE IF NOT EXISTS MgrModuleKV (
key TEXT PRIMARY KEY,
value NOT NULL
) WITHOUT ROWID;
sqlite> INSERT OR IGNORE INTO MgrModuleKV (key, value) VALUES ('__version', 0);
sqlite> .tables
Device DeviceHealthMetrics MgrModuleKV
sqlite>
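Once the structure exists, one way to verify that the devicehealth module can write again (assuming a running cluster with at least one monitored device) is to restart the active mgr and trigger a scrape:
# restart the active ceph-mgr so the devicehealth module reopens the database
ceph mgr fail
# trigger a health metrics scrape and check that no new crash events appear
ceph device scrape-health-metrics
ceph crash ls-new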
sources
https://ceph.io/en/news/blog/2021/new-in-pacific-sql-on-ceph
https://docs.ceph.com/en/latest/rados/api/libcephsqlite/
https://docs.ceph.com/en/latest/rados/api/libcephsqlite/#usage
https://github.com/ceph/ceph/blob/main/src/pybind/mgr
https://github.com/ceph/ceph/blob/main/src/pybind/mgr/devicehealth/module.py
Ceph S3 load and performance test
motivation
We have intensively tested Ceph S3 with OpenStack Swift before, and we were interested in the behavior of the radosgw stack in Ceph. We paid particular attention to the size and number of objects in relation to the resource consumption of the radosgw process. The effects on the response latencies of radosgw were also important to us, in order to be able to plan the right sizing of the physical and virtual environments.
technical topics
From a technical point of view, we were interested in the behavior of radosgw in the following areas:
- dynamic bucket sharding
- http frontend difference between Civetweb and Beast
- index pool io pattern and latencies
- data pool io pattern and latencies with erasure-coded and replicated pools
- fast_read vs. standard read for workloads with large and small objects (see the example below)
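For illustration, the last point refers to the per-pool fast_read flag, which only applies to erasure-coded pools; a minimal sketch with a placeholder pool name:
# check whether fast_read is enabled on the erasure-coded data pool
ceph osd pool get <data-pool> fast_read
# enable it: reads return as soon as enough shards have arrived to reconstruct the object
ceph osd pool set <data-pool> fast_read 1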
requirements
When choosing the right tool, it was important for us to be able to test both small and large Ceph clusters with several thousand OSDs.
We wanted to be able to evaluate the test results from a file as well as have a graphical representation as time-series data.
For time-series data we rely on the standard stack with Grafana, Prometheus and Thanos.
The main Prometheus exporters we use are ceph-mgr-exporter and node-exporter.
load and performance tools
CBT - The Ceph Benchmarking Tool
CBT is a testing harness written in Python.
s3-tests
This is a set of unofficial Amazon AWS S3 compatibility tests.
https://github.com/ceph/s3-tests
COSBench - Cloud Object Storage Benchmark
COSBench is a benchmarking tool to measure the performance of Cloud Object Storage services.
https://github.com/intel-cloud/cosbench
Gosbench
Gosbench is the Golang reimplementation of COSBench. It is a distributed S3 performance benchmark tool with a Prometheus exporter, leveraging the official Golang AWS SDK.
https://github.com/mulbc/gosbench
hsbench
hsbench is an S3 compatible benchmark originally based on wasabi-tech/s3-benchmark.
https://github.com/markhpc/hsbench
Warp
MinIO's S3 benchmarking tool.
the tool of our choice
getput
getput can be run individually on a test client.
gpsuite is responsible for synchronization and scaling across any number of test clients. Communication takes place via ssh keys and the simultaneous start of all s3 test clients is synchronized over a common time base.
Installation on linux as script or as container is supported.