Upstream work on the discoverability of certain recommended volume type aspects.
As a user, I want to be able to look into a volume type object and see all aspects it fulfills so that I can choose the best suited volume type for my volumes.
In #265 a standard for volume types was created. Right now, SCS, providers, and consumers need to rely on tags in the description of a volume type to discover volume types with the recommended aspects `encrypted` and `replicated`.
We want to find a way to use the internal extra_specs of volume types to describe these two aspects when they are present in a volume type. If this is not possible, we would like to introduce another property that has to be set by the admin when creating the volume type. Only then will it be possible to automatically check whether a volume type offers replication or encryption.
Definition of Done:
- [ ] The aspect of encryption is discoverable for a user role in upstream OpenStack
- [ ] The aspect of replication is discoverable for a user role in upstream OpenStack
- [ ] The standard is changed to use the upstream way of discovering replication and encryption in a volume type
For volume types there already exists the concept of user-visible extra specs, which are controlled by policy: https://docs.openstack.org/cinder/latest/admin/user-visible-extra-specs.html It should be possible to add support for an encryption extra spec, as everything necessary is already there. As an admin, the encryption parameters can be accessed like this:
stack@woc15:~/devstack$ openstack volume type show LUKS --encryption-type
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| access_project_ids | None |
| description | None |
| encryption | cipher='aes-xts-plain64', control_location='front-end', created_at='2023-11-30T16:13:15.000000', deleted='False', deleted_at=, encryption_id='872a9738-34d0-43d6-977a-91ea4858e74f', key_size='256', provider='luks', updated_at= |
| id | bee35d94-d42c-42d8-b226-32be9ca5b694 |
| is_public | True |
| name | LUKS |
| properties | |
| qos_specs_id | None |
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
For replication it is harder, as it depends on the backend used. That means it cannot be derived from inside OpenStack but would need input from the provider on whether there is replication or not. I will discuss this with the Cinder developers.
But it might be possible to do this easily, as setting extra specs is currently only allowed for admins and only when the volume type is not in use. So it would just be necessary to make something like this visible to users:
openstack volume type set --property replications=3 LUKS
openstack volume type show LUKS --encryption-type
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| access_project_ids | None |
| description | None |
| encryption | cipher='aes-xts-plain64', control_location='front-end', created_at='2023-11-30T16:13:15.000000', deleted='False', deleted_at=, encryption_id='872a9738-34d0-43d6-977a-91ea4858e74f', key_size='256', provider='luks', updated_at= |
| id | bee35d94-d42c-42d8-b226-32be9ca5b694 |
| is_public | True |
| name | LUKS |
| properties | replications='3' |
| qos_specs_id | None |
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
This is currently only visible for admins.
While going through the Cinder code to check the places for any upstream patches, I stumbled across the policies: https://github.com/openstack/cinder/blob/master/cinder/policies/volume_type.py#L156 These allow users to read the extra specs, but this can be restricted through a policy file (only allowing admin access). We should keep that in mind.
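For illustration, such a restriction would go into a `policy.yaml` for Cinder. This is a hedged sketch: the exact policy names must be verified against the linked `volume_type.py` for the release in use.

```yaml
# policy.yaml sketch -- restrict reading volume type extra specs to admins.
# Policy names are assumptions; check cinder/policies/volume_type.py.
"volume_extra_spec:get": "rule:admin_api"
"volume_extra_spec:get_all": "rule:admin_api"
```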
I am currently trying to implement a user-visible extra spec for encryption that is automatically set when creating, and unset when deleting, an encryption type in here: https://github.com/openstack/cinder/blob/master/cinder/api/contrib/volume_type_encryption.py It should not be settable outside of that workflow.
I am currently running into some errors with my implementation and am trying to fix them:
openstack volume type create --encryption-provider nova.volume.encryptors.luks.LuksEncryptor --encryption-cipher aes-xts-plain64 --encryption-key-size 256 --encryption-control-location front-end LUKS
Failed to set encryption information for this volume type: The resource could not be found. (HTTP 404)
local variable 'encryption' referenced before assignment
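The `referenced before assignment` error is the classic pattern of a variable that is only assigned inside a conditional branch. This is a generic illustration of that pattern, not the actual Cinder code:

```python
# Generic sketch of the "local variable referenced before assignment"
# pattern and its fix; not the actual Cinder implementation.

def broken(create_spec: bool):
    if create_spec:
        encryption = {"provider": "luks"}
    # UnboundLocalError when create_spec is False: 'encryption' was
    # never assigned on that path.
    return encryption

def fixed(create_spec: bool):
    encryption = None  # initialize before the conditional
    if create_spec:
        encryption = {"provider": "luks"}
    return encryption
```

The fix is simply to initialize the variable on every code path before it is returned or used.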
I found a neat solution for a possible encryption extra spec:
stack@devstack:~/devstack$ openstack volume type create --encryption-provider nova.volume.encryptors.luks.LuksEncryptor --encryption-cipher aes-xts-plain64 --encryption-key-size 256 --encryption-control-location front-end LUKS3
+-------------+------------------------------------------------------------------------------+
| Field | Value |
+-------------+------------------------------------------------------------------------------+
| description | None |
| encryption | cipher='aes-xts-plain64', control_location='front-end', |
| | encryption_id='135648f5-a979-4ee4-a5ed-14115431bf0f', key_size='256', |
| | provider='nova.volume.encryptors.luks.LuksEncryptor' |
| id | d0ff021e-9379-452d-8596-f1669447edbd |
| is_public | True |
| name | LUKS3 |
+-------------+------------------------------------------------------------------------------+
stack@devstack:~/devstack$ openstack volume type show LUKS3
+--------------------+--------------------------------------+
| Field | Value |
+--------------------+--------------------------------------+
| access_project_ids | None |
| description | None |
| id | d0ff021e-9379-452d-8596-f1669447edbd |
| is_public | True |
| name | LUKS3 |
| properties | encryption_enabled='1' |
| qos_specs_id | None |
+--------------------+--------------------------------------+
stack@devstack:~/devstack$ source openrc demo
WARNING: setting legacy OS_TENANT_NAME to support cli tools.
stack@devstack:~/devstack$ openstack volume type show LUKS3
+--------------------+--------------------------------------+
| Field | Value |
+--------------------+--------------------------------------+
| access_project_ids | None |
| description | None |
| id | d0ff021e-9379-452d-8596-f1669447edbd |
| is_public | True |
| name | LUKS3 |
| properties | encryption_enabled='1' |
+--------------------+--------------------------------------+
It is created when creating an encrypted volume type and is visible to users. I will polish it, create an upstream patchset, and ask whether that would fit.
I pushed my changes to https://review.opendev.org/c/openstack/cinder/+/907519 to discuss them with upstream.
I was fixing a few errors and found out that creating custom properties / extra specs makes volume creation fail. I am asking upstream about the details of this.
I am investigating a hint that it might be the scheduler that is responsible for the errors with custom properties. Mailing list discussion: https://lists.openstack.org/archives/list/[email protected]/thread/ZWXFYIQ3FSFC5MGTSIMVEHCEJRQQRQ3X/
The cause of this is indeed the Cinder scheduler, which checks ALL properties (== extra_specs) for their compatibility with the backends here: https://github.com/openstack/cinder/blob/master/cinder/scheduler/filters/capabilities_filter.py#L56
Changing this filter would be possible, but whether we would only allow one or two special extra_specs to be skipped in this filter, or change the filter to use a list of extra_specs to check for their presence and compatibility, is something we need to discuss with upstream.
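The first variant (skipping a known set of informational keys) can be sketched as follows. This is a minimal illustration, not the actual Cinder filter code; the key names in `INFORMATIONAL_KEYS` are assumptions taken from the examples above.

```python
# Minimal sketch of a capabilities-style filter that skips purely
# informational extra specs. Not the actual Cinder implementation;
# the key names below are assumptions.

INFORMATIONAL_KEYS = {"replications", "encryption_enabled"}

def satisfies_extra_specs(capabilities: dict, extra_specs: dict) -> bool:
    """Return True if the backend capabilities match all extra specs,
    ignoring keys that only carry information for users."""
    for key, requested in extra_specs.items():
        if key in INFORMATIONAL_KEYS:
            continue  # never match these against backend capabilities
        if capabilities.get(key) != requested:
            return False
    return True
```

With such a skip list, a volume type carrying `replications='3'` would no longer be rejected by the scheduler just because no backend reports a `replications` capability.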
I attended the Cinder meeting to discuss my findings, but unfortunately there was not enough time; we will continue the discussion via the ML or next week.
https://meetings.opendev.org/meetings/cinder/2024/cinder.2024-02-07-14.00.log.html
thanks @josephineSei for keeping this updated.
After trying to reach out to people and going through the patches again, I added this as a discussion point for the midcycle PTG: https://etherpad.opendev.org/p/cinder-caracal-midcycles
To be better prepared for the discussion tomorrow, I am looking for alternative ways to include the visibility of either volume-type-defined or administrator-defined aspects in volume types.
While it may be possible to include a method in the API call for showing a volume type that tells normal users whether an encryption type is present, this is not feasible for replication, which is backend-specific. That aspect has to be set by the admin explicitly, but is NOT allowed to interact with the volume scheduler, as that might lead to errors.
mysql> describe volume_types;
+--------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| created_at | datetime | YES | | NULL | |
| updated_at | datetime | YES | | NULL | |
| deleted_at | datetime | YES | | NULL | |
| deleted | tinyint(1) | YES | | NULL | |
| id | varchar(36) | NO | PRI | NULL | |
| name | varchar(255) | YES | | NULL | |
| qos_specs_id | varchar(36) | YES | MUL | NULL | |
| is_public | tinyint(1) | YES | | NULL | |
| description | varchar(255) | YES | | NULL | |
+--------------+--------------+------+-----+---------+-------+
9 rows in set (0.01 sec)
mysql> describe volume_type_extra_specs;
+----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+----------------+
| created_at | datetime | YES | | NULL | |
| updated_at | datetime | YES | | NULL | |
| deleted_at | datetime | YES | | NULL | |
| deleted | tinyint(1) | YES | | NULL | |
| id | int | NO | PRI | NULL | auto_increment |
| volume_type_id | varchar(36) | NO | MUL | NULL | |
| key | varchar(255) | YES | | NULL | |
| value | varchar(255) | YES | | NULL | |
+----------------+--------------+------+-----+---------+----------------+
8 rows in set (0.00 sec)
mysql> describe encryption;
+------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+-------+
| created_at | datetime | YES | | NULL | |
| updated_at | datetime | YES | | NULL | |
| deleted_at | datetime | YES | | NULL | |
| deleted | tinyint(1) | YES | | NULL | |
| cipher | varchar(255) | YES | | NULL | |
| control_location | varchar(255) | NO | | NULL | |
| key_size | int | YES | | NULL | |
| provider | varchar(255) | NO | | NULL | |
| volume_type_id | varchar(36) | NO | | NULL | |
| encryption_id | varchar(36) | NO | PRI | NULL | |
+------------------+--------------+------+-----+---------+-------+
10 rows in set (0.01 sec)
The database shows that the encryption parameters are saved in a separate table. Somewhere to simply store information would be the extra_specs table; the alternative is to create a new additional table. The latter has the downside that, next to the extra_specs table, an additional metadata table (or something alike) would be needed. This would blow up the database and put work on the API methods and maybe even on the view of the volume type objects. In that case the extra specs and metadata would share the "properties" field of the volume type object:
openstack volume type show LUKS --encryption-type
+--------------------+----------------------------------------------------------------------+
| Field | Value |
+--------------------+----------------------------------------------------------------------+
| access_project_ids | None |
| description | None |
| encryption | cipher='aes-xts-plain64', control_location='front-end', |
| | created_at='2024-02-01T09:33:28.000000', deleted='False', |
| | deleted_at=, encryption_id='d7995c8e-83a4-4535-94ee-151e482dac2e', |
| | key_size='256', |
| | provider='nova.volume.encryptors.luks.LuksEncryptor', updated_at= |
| id | 39bc174e-c075-4296-8a09-f3b274ae0caa |
| is_public | True |
| name | LUKS |
| properties | |
| qos_specs_id | None |
+--------------------+----------------------------------------------------------------------+
A better option would be to put the extra_specs into a separate field and leave the properties field for other metadata. But I doubt that adjusting OpenStack like this would be accepted.
- In the end I can conclude that the best option would be to open up the extra_specs to also hold key-value pairs that are not used by the scheduler (most likely the easiest way to implement: no API or DB changes).
- The second best option would be to define a new field for any user input and information that works exactly like the properties field; it would require a new DB table, new API endpoints, and some rework of the views (API and DB changes, but backwards compatible).
- The third option would be to put the information into the properties field mixed with the extra_specs, but only in the API: keeping the separation between two tables in the DB and just mixing those inputs when information is requested (new DB table, API behavior might change, a lot of work in separating the properties; imho this is error-prone).
- The last option would be to define a new field, move all extra_specs into that field, and keep the properties field for any other user-visible information (DB and API changes, NOT backwards compatible; best separation but the most unlikely option).
I attended the Cinder midcycle meeting to discuss the use cases we have and what we want to achieve. I explained everything, and we also discussed several options to achieve this (like the ones I mentioned in the last comment). Unfortunately, they also have no certain direction to push me to, so the discussion on how to achieve this will continue on the ML and may even result in a spec for Cinder. Most likely we will either only get an easy way to discover the encryption OR we might need to adjust the DB and API with a new metadata field.
The etherpad of the midcycle with an Action Item for me: https://etherpad.opendev.org/p/cinder-caracal-midcycles
I have created a blueprint[^1] and am working out a spec for the change in volume types (still in progress, as I also have to consider all DB, API, and other impacts ALL options would have right now). I already wrote an email to the mailing list[^2] containing the options to discuss, so the topic gets more attention.
Someone already answered on the ML, but that input does not help for backends with Ceph. It shows exactly the problem we have when using backend-internal configuration to achieve replication, which is transparent to Cinder:
$ sudo ceph osd pool ls detail
....
pool 46 'docker_volumes' replicated size 3 min_size 2 crush_rule 2 object_hash......
$ cinder get-pools --detail
+-----------------------------+----------------------------------------------------------------------------------------------+
| Property | Value |
+-----------------------------+----------------------------------------------------------------------------------------------+
| allocated_capacity_gb | XXX |
| backend_state | up |
| driver_version | x.y.0 |
| filter_function | None |
| free_capacity_gb | xyzxyz.ab |
| goodness_function | None |
| location_info | ceph:/etc/ceph/ceph-ext.conf.................. |
| max_over_subscription_ratio | xx.x |
| multiattach | True |
| name | block@three_times_replicated-sitewide#three_times_replicated-sitewide |
| provisioned_capacity_gb | xxx |
| replication_enabled | False |
| reserved_percentage | 0 |
| storage_protocol | ceph |
| thin_provisioning_support | True |
| timestamp | 2024-02-16T08:16:13.541591 |
| total_capacity_gb | xyzxyz.ab |
| vendor_name | Open Source |
| volume_backend_name | three_times_replicated-sitewide |
+-----------------------------+----------------------------------------------------------------------------------------------+
I pushed the first version of the spec to Gerrit[^1], where I tried to show the impact of the several options on the workflow in Cinder. This is necessary after we did not come to a decision in the midcycle meeting. I hope we get a decision on what to do before, or at the latest at, the next PTG. [^1]: https://review.opendev.org/c/openstack/cinder-specs/+/909195
I attended the Public Cloud SIG to align with the overall discoverability efforts in OpenStack. I also attended the Cinder meeting and put the spec on the review list.
I edited the spec according to the review I got and fixed some grammatical errors.
I attended the Cinder meeting and again put the spec on the review list. I also added the spec to the etherpad for the next PTG as a topic to discuss: https://etherpad.opendev.org/p/dalmatian-ptg-cinder
I prepared the spec patch for the next cycle.
After the new document structure for the next cycle was in place, I updated the spec patch again: https://review.opendev.org/c/openstack/cinder-specs/+/909195
I prepared myself for the discussion of the different ways for user-visible information in volume types at the PTG, which will happen today or maybe tomorrow.
Results of the PTG discussion
I presented some options on how to tackle the user visibility and we came to an agreement:
- A new DB table for volume_type_metadata should be exclusively used for information from the deployer for the user.
- This should be presented in a new metadata field for volume types in the API.
- To standardize the phrases in the metadata field, metadefs can be used in addition.
Metadef API
The metadef API currently does exist in Glance: https://docs.openstack.org/api-ref/image/v2/metadefs-index.html
And from this document it will define metadata keys for various resources in OpenStack: https://docs.openstack.org/glance/latest/user/metadefs-concepts.html
Adding a new definition there seems to be the equivalent of a standard in SCS: users could adhere to it but are not forced to.
Nevertheless, for SCS it might be worth looking into this metadefs API in a separate issue.
For the scope of this issue this is just a possible second phase after introducing a way for deployers to add metadata to volume types. (https://github.com/SovereignCloudStack/standards/issues/565)
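For illustration, a metadef namespace for volume type aspects could look roughly like the sketch below. The namespace, resource type, and property names are assumptions made up for this example; the structure follows the linked metadefs concepts document.

```json
{
  "namespace": "OS::Cinder::VolumeTypeAspects",
  "display_name": "Volume Type Aspects",
  "description": "Standardized keys describing encryption and replication of a volume type",
  "resource_type_associations": [
    {"name": "OS::Cinder::Volumetype"}
  ],
  "properties": {
    "encrypted": {"type": "string", "title": "Encrypted"},
    "replicated": {"type": "string", "title": "Replicated"}
  }
}
```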
Work items
- [ ] rewrite the spec
- [ ] add patches to Cinder after the spec was merged (only for the DB and the API)
- [ ] add patches to OSC/SDK to be able to add something to the new metadata field
- [ ] adjust Standard and Test Script to use the new field (maybe done after the Dalmatian release)
2nd Phase:
- [ ] add metadef description for encrypted and replicated volume types
I updated the spec: https://review.opendev.org/c/openstack/cinder-specs/+/909195
To be prepared for the next Cinder meeting, I am updating my devstack node. I want to start implementing on the current master, but I was running into some issues with the new release. One of them was that Glance did not answer - it wasn't even rolled out correctly in combination with Ceph. I found the reason by chance, and it had nothing to do with the error message, which was:
Failed to contact the endpoint at http://.... for discovery. Fallback to using that endpoint as the base url.
Failed to contact the endpoint at http://.... for discovery. Fallback to using that endpoint as the base url.
The image service for : exists but does not have any supported versions.
It was a configuration option in glance-api.conf. The new default for "image_cache_driver" since 2024.1 is "centralized_db", but something in the Glance process expected one of the older config values. After I changed it, I was able to contact the Glance endpoint.
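The workaround amounts to pinning the cache driver to an older accepted value in glance-api.conf. A hedged sketch; `sqlite` is used here as an example of a pre-2024.1 value, verify the valid values for your release:

```ini
# glance-api.conf -- revert to an older cache driver value (sketch;
# "sqlite" is an assumed example of a pre-2024.1 accepted value)
[DEFAULT]
image_cache_driver = sqlite
```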
Today I finished the work on my devstack (at least I think so), re-iterated and presented the spec at the Cinder team meeting (https://meetings.opendev.org/meetings/cinder/2024/cinder.2024-04-24-14.01.log.html), and began research on the implementation.
This will be done in small parts:
- Implementing an upgrade path to add a new db table: volume_type_metadata
- Adding all database requests for this new table
- Adjusting the current API to also show the new metadata field
- Adding the new API to set and unset new metadata.
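The new `volume_type_metadata` table could mirror the shape of the existing `volume_type_extra_specs` table shown above. As a quick sanity check of that shape, here is a sketch of the DDL validated against an in-memory SQLite database; the actual change would of course go through Cinder's migration machinery, and the column set is my assumption modeled on `volume_type_extra_specs`:

```python
# Sketch: a volume_type_metadata table modeled on the existing
# volume_type_extra_specs table (column set is an assumption).
# Validated here against in-memory SQLite, not a real Cinder migration.
import sqlite3

DDL = """
CREATE TABLE volume_type_metadata (
    created_at DATETIME,
    updated_at DATETIME,
    deleted_at DATETIME,
    deleted TINYINT(1),
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    volume_type_id VARCHAR(36) NOT NULL,
    "key" VARCHAR(255),
    value VARCHAR(255)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)
# Introspect the created table to confirm the column layout.
cols = [row[1] for row in conn.execute("PRAGMA table_info(volume_type_metadata)")]
```

Keeping the layout identical to `volume_type_extra_specs` (eight columns, keyed by `volume_type_id`) should minimize the rework needed in the DB API and object layers.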
I am researching and implementing the first part, upgrading the Cinder DB. Currently my devstack seems to not run through all Cinder DB migrations.
Maybe that is because there are two ways of upgrading the Cinder DB: alembic and sqlalchemy. I added the table description here and here.
I added a first patchset with the database upgrade: https://review.opendev.org/c/openstack/cinder/+/918316 I am working on the API extension to show the metadata in the volume type object. This will also go into the patch set above.
With part 1 in the first patchset, I started working on part three: actually showing the metadata field in the volume type.
I am looking through the open review comments on the spec.