Ceph radosgw-admin: remove rights from users
Example of removing capabilities from a user with radosgw-admin:
Remove the buckets=read capability from the user clyso-user-id.
radosgw-admin caps rm --uid=clyso-user-id --caps="buckets=read"
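Whether the capability was actually removed can be checked afterwards in the caps array of the user's metadata, for example:
```bash
radosgw-admin user info --uid=clyso-user-id
```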
The aim is to scale the RGW instances of the production system so that 10,000 active connections are possible.
The following configuration emerged for our setup from various test runs:
[client.rgw.<id>]
keyring = /etc/ceph/ceph.client.rgw.keyring
rgw content length compat = true
rgw dns name = <rgw.hostname.clyso.com>
rgw enable ops log = false
rgw enable usage log = false
rgw frontends = civetweb port=80 error_log_file=/var/log/radosgw/civetweb.error.log
rgw num rados handles = 8
rgw swift url = http://<rgw.hostname.clyso.com>
rgw thread pool size = 512
rgw thread pool size is the default value for num_threads of the civetweb web server.
Line 54: https://github.com/ceph/ceph/blob/master/src/rgw/rgw_civetweb_frontend.cc
set_conf_default(conf_map, "num_threads",
std::to_string(g_conf->rgw_thread_pool_size));
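To check whether the configured value is actually in effect at runtime, it can be read back through the RGW admin socket; the socket path below is only an example and depends on the client name of your gateway:
```bash
# Read the effective thread pool size from a running RGW instance
# (adjust the admin socket path to your deployment)
ceph daemon /var/run/ceph/ceph-client.rgw.<id>.asok config get rgw_thread_pool_size

# Count currently established connections on the civetweb port (port 80 in this config)
ss -tn state established '( sport = :80 )' | tail -n +2 | wc -l
```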
[client.radosgw]
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw content length compat = true
rgw dns name = <fqdn hostname>
rgw enable ops log = false
rgw enable usage log = false
rgw frontends = civetweb port=8080 num_threads=512 error_log_file=/var/log/radosgw/civetweb.error.log
rgw num rados handles = 8
rgw swift url = http://<fqdn hostname>
rgw thread pool size = 512
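Changes to these options only take effect after the gateway has been restarted; with a systemd-based deployment this is typically something like the following (the exact unit name depends on how the gateway was set up):
```bash
systemctl restart ceph-radosgw@rgw.<id>
```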
https://github.com/ceph/ceph/blob/master/doc/radosgw/config-ref.rst
http://docs.ceph.com/docs/master/radosgw/config-ref/
https://github.com/ceph/ceph/blob/master/src/rgw/rgw_civetweb_frontend.cc
http://www.osris.org/performance/rgw.html
https://www.swiftstack.com/docs/integration/python-swiftclient.html
If the description of the Ceph options for ceph.conf is not sufficient, or if one or the other description is missing from the official documentation, you might find it here:
raw.githubusercontent.com/ceph/ceph/master/src/common/options.cc
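For example, the description and default of a single option can be looked up directly from that file (assuming curl and grep are available and the option is still defined there):
```bash
curl -s https://raw.githubusercontent.com/ceph/ceph/master/src/common/options.cc \
  | grep -A 3 '"rgw_thread_pool_size"'
```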
Ceph Version: Luminous 12.2.2 with Filestore and XFS
After more than two years, several OSDs on the production Ceph cluster reported the error message:
** ERROR: osd init failed: (28) No space left on device
and terminated themselves. Attempts to restart the OSDs always ended with the same error message.
The Ceph cluster changed from HEALTH_OK to HEALTH_ERR status with the warnings:
ceph osd near full
ceph pool near full
A superficial check with df -h showed 71% to 89% used disk space, yet no more files could be created in the file system.
Neither a remount nor an unmount and mount changed the situation.
The first suspicion was that the inode64 option for XFS might be missing, but this option was set. After a closer examination of the internal statistics of the XFS file system with
xfs_db -r "-c freesp -s" /dev/sdd1
df -h
df -i
we chose the following solution:
First we set the noout flag with
ceph osd set noout
so that the data of the stopped OSDs would not be re-replicated and fill the remaining OSDs any further. We then redistributed the data across the remaining Ceph cluster according to utilization with
ceph osd reweight-by-utilization
We then moved a single PG (important: always a different PG per OSD) from each affected OSD to /root to gain additional space on the file system, and started the OSDs.
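On Filestore, every PG is stored as a separate directory below the OSD's current/ directory, so this step roughly looks like the following sketch (paths are the Filestore defaults, <pgid> is a placeholder; use a different PG per OSD, as noted above):
```bash
# Stop the affected OSD before touching its data directory
systemctl stop ceph-osd@<ID>

# Find large PG directories and move one of them aside to free space
du -sh /var/lib/ceph/osd/ceph-<ID>/current/*_head | sort -h | tail -5
mv /var/lib/ceph/osd/ceph-<ID>/current/<pgid>_head /root/

# Start the OSD again
systemctl start ceph-osd@<ID>
```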
In the next step, we deleted virtual machine images that were no longer required from our cloud environment.
It took some time for the blocked requests to clear and the system to resume normal operation.
Unfortunately, it was not possible for us to definitively clarify the cause.
However, as we are currently in the process of switching from Filestore to Bluestore, we will soon no longer need XFS.
In case you quickly need the syntax for the radosgw-admin object stat command:
clyso-ceph-rgw-client:~/clyso # radosgw-admin object stat --bucket=size-container --object=clysofile
{
"name": "clysofile",
"size": 26,
"policy": {
"acl": {
"acl_user_map": [
{
"user": "clyso-user",
"acl": 15
}
],
"acl_group_map": [],
"grant_map": [
{
"id": "clyso-user",
"grant": {
"type": {
"type": 0
},
"id": "clyso-user",
"email": "",
"permission": {
"flags": 15
},
"name": "clyso-admin",
"group": 0,
"url_spec": ""
}
}
]
},
"owner": {
"id": "clyso-user",
"display_name": "clyso-admin"
}
},
"etag": "clyso-user",
"tag": "d667b6f1-5737-4f5e-bad0-fc030f0a4e94.11729649.143382",
"manifest": {
"objs": [],
"obj_size": 26,
"explicit_objs": "false",
"head_size": 26,
"max_head_size": 4194304,
"prefix": ".ZQzVc6phBAMCv3lSbiHBo0fftkpXmjm_",
"rules": [
{
"key": 0,
"val": {
"start_part_num": 0,
"start_ofs": 4194304,
"part_size": 0,
"stripe_max_size": 4194304,
"override_prefix": ""
}
}
],
"tail_instance": "",
"tail_placement": {
"bucket": {
"name": "size-container",
"marker": "d667b6f1-5737-4f5e-bad0-fc030f0a4e94.11750341.561",
"bucket_id": "d667b6f1-5737-4f5e-bad0-fc030f0a4e94.11750341.561",
"tenant": "",
"explicit_placement": {
"data_pool": "",
"data_extra_pool": "",
"index_pool": ""
}
},
"placement_rule": "default-placement"
}
},
"attrs": {
"user.rgw.pg_ver": "��",
"user.rgw.source_zone": "eR[�\u0011",
"user.rgw.tail_tag": "d667b6f1-5737-4f5e-bad0-fc030f0a4e94.11729649.143382",
"user.rgw.x-amz-meta-mtime": "1535100720.157102"
}
}
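If only a single field such as the object size is needed, for example in a script, it can be extracted with jq (assuming jq is installed):
```bash
# Prints 26 for the object shown above
radosgw-admin object stat --bucket=size-container --object=clysofile | jq '.size'
```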
The Ceph cluster has recognized that a placement group (PG) is missing important information. This may be missing information about write operations that have taken place, or there may be no error-free copies left.
The recommendation is to bring all OSDs that are in the down or out state back into the Ceph cluster, as they could contain the required information. In the case of an Erasure Coding (EC) pool, temporarily reducing min_size can enable recovery. However, min_size cannot be smaller than the number of data chunks defined for this pool.
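A minimal sketch of temporarily lowering min_size on an EC pool (pool name and values are placeholders; k is the number of data chunks of the EC profile, and the previous value should be restored once the PG has recovered):
```bash
# Check the current value
ceph osd pool get <pool> min_size

# Temporarily lower it to the number of data chunks (k) to allow recovery
ceph osd pool set <pool> min_size <k>

# Restore the previous value after recovery has finished
ceph osd pool set <pool> min_size <previous value>
```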
https://docs.ceph.com/docs/master/rados/operations/pg-states/
https://docs.ceph.com/docs/master/rados/operations/erasure-code/
In a production cluster, the removal of OSDs or entire hosts can affect regular operations for users, depending on the load. It is therefore recommended to remove an OSD or a host from production gradually, in order to maintain full replication throughout the entire process.
You can execute the commands manually step by step and always wait until the data has been completely redistributed in the cluster; a small automation sketch follows after the list.
ceph osd crush reweight osd.<ID> 1.0
ceph osd crush reweight osd.<ID> 8.0
ceph osd crush reweight osd.<ID> 6.0
ceph osd crush reweight osd.<ID> 4.0
ceph osd crush reweight osd.<ID> 2.0
ceph osd crush reweight osd.<ID> 0.0
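A minimal sketch of how these steps could be automated (this is not the script mentioned below; the weight steps and the health check are only examples and assume the cluster is otherwise healthy between steps):
```bash
#!/bin/bash
# Gradually drain an OSD and wait for the cluster to settle between steps
ID=$1
for WEIGHT in 6.0 4.0 2.0 0.0; do
    ceph osd crush reweight osd.${ID} ${WEIGHT}
    # Wait until the data has been completely redistributed
    while ! ceph health | grep -q HEALTH_OK; do
        sleep 60
    done
done
```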
We wrote our own script for automation years ago, so it should also work with earlier versions, such as Hammer, Jewel, Kraken and Luminous.
ceph osd out <ID>
SysVinit: service ceph stop osd.<ID>
SYSTEMD: systemctl stop ceph-osd@<ID>
ceph osd crush remove osd.<ID>
ceph auth del osd.<ID>
ceph osd rm <ID>
Caution: when elements are deleted from the CRUSH map, the Ceph cluster starts rebalancing the data distribution.
During recovery, the Ceph OSD daemon can consume more than 2.5 times its configured main memory.
Current consumption:
ceph daemon osd.X dump_mempools|jq '.total.bytes'
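To get an overview of all OSDs on a host, the same query can be run against every admin socket (default socket path assumed, jq used for parsing):
```bash
for SOCK in /var/run/ceph/ceph-osd.*.asok; do
    echo -n "${SOCK}: "
    ceph daemon ${SOCK} dump_mempools | jq '.total.bytes'
done
```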
root@master.qa.cloud.clyso.com:~ # radosgw-admin user list
[
...
"57574cda626b45fba1cd96e68a57ced2",
...
"admin",
...
]
radosgw-admin user info --uid=57574cda626b45fba1cd96e68a57ced2
{
"user_id": "57574cda626b45fba1cd96e68a57ced2",
"display_name": "qa-clyso-backup",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"auid": 0,
"subusers": [],
"keys": [],
"swift_keys": [],
"caps": [],
"op_mask": "read, write, delete",
"default_placement": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"temp_url_keys": [],
"type": "keystone"
}
root@master.qa.cloud.clyso.com:~ # radosgw-admin quota set --quota-scope=user --uid=57574cda626b45fba1cd96e68a57ced2 --max-size=32985348833280
## verify the set quota max_size and max_size_kb
root@master.qa.cloud.clyso.com:~ # radosgw-admin user info --uid=57574cda626b45fba1cd96e68a57ced2
{
"user_id": "57574cda626b45fba1cd96e68a57ced2",
"display_name": "qa-clyso-backup",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"auid": 0,
"subusers": [],
"keys": [],
"swift_keys": [],
"caps": [],
"op_mask": "read, write, delete",
"default_placement": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": 32985348833280,
"max_size_kb": 32212254720,
"max_objects": -1
},
"temp_url_keys": [],
"type": "keystone"
}
root@master.qa.cloud.clyso.com:~ # radosgw-admin quota enable --quota-scope=user --uid=57574cda626b45fba1cd96e68a57ced2
root@master.qa.cloud.clyso.com:~ # radosgw-admin user info --uid=57574cda626b45fba1cd96e68a57ced2
{
"user_id": "57574cda626b45fba1cd96e68a57ced2",
"display_name": "qa-clyso-backup",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"auid": 0,
"subusers": [],
"keys": [],
"swift_keys": [],
"caps": [],
"op_mask": "read, write, delete",
"default_placement": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": true,
"check_on_raw": false,
"max_size": 32985348833280,
"max_size_kb": 32212254720,
"max_objects": -1
},
"temp_url_keys": [],
"type": "keystone"
}
root@master.qa.cloud.clyso.com:~ # radosgw-admin user stats --uid=57574cda626b45fba1cd96e68a57ced2 --sync-stats
{
"stats": {
"total_entries": 10404,
"total_bytes": 54915680,
"total_bytes_rounded": 94674944
},
"last_stats_sync": "2017-08-21 07:09:58.909073Z",
"last_stats_update": "2017-08-21 07:09:58.906372Z"
}
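The --max-size value is given in bytes; the 32985348833280 used above corresponds to 30 TiB, which can be verified quickly in the shell:
```bash
echo $((30 * 1024 ** 4))          # 32985348833280 bytes = 30 TiB
echo $((32985348833280 / 1024))   # 32212254720, the max_size_kb shown above
```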
Again and again I come across people who use a replication of 2 for a replicated Ceph pool.
If you know exactly what you are doing, you can do this. I would strongly advise against it.
One simple reason is that you cannot form a clear majority with only two parties; there always has to be at least a third.
There are failure scenarios in which both OSDs (osdA and osdB) of a placement group (replication size = 2) can quickly become unavailable. If osdA fails, the cluster only has one remaining copy of the object, and the default value (min_size = 2) on the pool means that the cluster no longer allows any write operations to the object.
With min_size = 1 (not recommended), osdB could go down for a short time and osdA could come back. Now osdA does not know whether further write operations were carried out on osdB during its offline phase.
You can then only hope that all OSDs come back, or you have to decide manually which data set is the most recent, while more and more blocked requests accumulate in the cluster waiting to access the data.
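The replication and min_size of an existing pool can be checked and, if necessary, raised as follows (the pool name is a placeholder):
```bash
ceph osd pool get <pool> size
ceph osd pool get <pool> min_size

# Recommended for replicated pools: three copies, at least two for writes
ceph osd pool set <pool> size 3
ceph osd pool set <pool> min_size 2
```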