Thursday, 2024-10-17

*** elodilles_pto is now known as elodilles09:12
pas-ha[m]I failed to read the whole backlog of discussion on admin-ness etc, but some of my thoughts: you want 'admin' or 'reader' or whatever only in one service? as admin create app creds with access rules limiting those to the API of this service only. Done.09:51
pas-ha[m]The actual problem I faced is that in all the services, 'adminness' means 2 things - admin API (possibility to perform a certain action, like 'create a flavor') and admin for DB (possibility to read other projects' data). And currently these are conflated.09:53
pas-ha[m]This is why for now the only way to create a 'global reader' using a project-scoped token is to create app creds with admin role and access rules scoped to GET APIs of all services only. This way all non-read requests will be blocked by keystonemiddleware, but once the request reaches the actual service, it can do (read) whatever admin can.09:55
pas-ha[m]What I also found suboptimal is that in most services the actual 'personas' are defined as constants in code, not as rules that can be easily overridden or extended by the cloud owner, requiring modification of many-many policy rules instead of just several...09:57
pas-ha[m]And I do agree that 'manager' will be a better solution instead of making a role do different things depending on the project it is assigned to. Leave 'admin' to be the admin of the whole cloud, period, and watch whom you give it carefully. Replace 'project admins' and even 'domain admins' with project and domain managers. Should make things more clear IMO.09:59
sean-k-mooneypas-ha[m]: scoping access based on the service has never been supported12:26
sean-k-mooneyit was discussed and declared out of scope of the current srbac work12:26
sean-k-mooneypas-ha[m]: part of the problem there is keystone maps roles to users in a given project12:27
sean-k-mooneybut to my knowledge there is no concept of a role assignment mapping between a user and a (service,project) pair12:27
sean-k-mooneypas-ha[m]: so you can't express service (or endpoint) based access control in the keystone data model without just creating many custom roles12:28
pas-ha[m]yeah, I know, and I am fine with that, just pointing out how one can achieve that right now w/o any hacks, with APIs already available, and that is fine I guess12:28
sean-k-mooneywhich was rejected for interop reasons12:28
pas-ha[m]app creds + access rules = profit12:28
sean-k-mooneyoh you can scope app creds12:29
sean-k-mooneygood to know12:29
pas-ha[m]down to specific API path and verb12:29
sean-k-mooneypersonally i prefer using application creds even as a human12:29
pas-ha[m]so like can give a get access to a specific server only (by uuid) and nothing else12:30
sean-k-mooneythat's very powerful and useful to know indeed12:30
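The "app creds + access rules" idea above can be sketched as the request body for creating an application credential in Keystone. This is a minimal sketch, assuming the documented `access_rules` shape of `service`, `method` and `path`; the credential name and the server UUID are hypothetical placeholders, and the body would still need to be sent to Keystone by a client of your choice.

```python
import json


def build_app_credential(name, rules):
    """Build a Keystone application-credential request body.

    `rules` is an iterable of (service_type, http_method, path) tuples.
    """
    access_rules = [
        {"service": service, "method": method, "path": path}
        for service, method, path in rules
    ]
    return {"application_credential": {"name": name, "access_rules": access_rules}}


# Hypothetical server UUID, used only for illustration.
SERVER_ID = "11111111-2222-3333-4444-555555555555"

body = build_app_credential(
    "read-one-server",  # hypothetical credential name
    [("compute", "GET", f"/v2.1/servers/{SERVER_ID}")],
)
print(json.dumps(body, indent=2))
```

A credential built this way can only issue GET requests for that one server path; anything else is rejected before it reaches the service.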
pas-ha[m]switching the subject to scheduling, may I bring your attention to https://bugs.launchpad.net/nova/+bug/2076089 ? is there other way currently for an admin to boot an instance on a disabled compute host?12:30
sean-k-mooneyi knew i could scope by roles when using them but i have never looked at "access rules" before12:31
sean-k-mooneypas-ha[m]: boot or spawn12:31
pas-ha[m]boot12:31
sean-k-mooneydisabling computes is explicitly for making it impossible to schedule a new workload12:31
sean-k-mooneyso https://bugs.launchpad.net/nova/+bug/2076089 is not valid12:32
pas-ha[m]but it was working before 😕12:32
sean-k-mooneythe disable feature is designed to explicitly block that 12:32
pas-ha[m]and is a normal part of node onboarding12:32
sean-k-mooneyyou can start existing vms on a disabled compute12:32
pas-ha[m]how does one test a node w/o exposing it to the whole cloud?12:32
sean-k-mooneyin general you don't, but in practice via a host aggregate and something like the tenant filter12:33
pas-ha[m]short of adding a new az of course... that users still can see and attempt to boot on12:33
sean-k-mooneyyou can do it in a way that users won't see using a number of methods12:33
pas-ha[m]yeah... sad ☹️ though still can migrate to a disabled host using old API12:33
sean-k-mooneyyou can use a private flavor and required trait12:34
sean-k-mooneythe tenant isolation filter or some other methods12:34
sean-k-mooneypas-ha[m]: basically set https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.enable_new_services to false12:34
pas-ha[m]with required trait that does not prevent from other flavors seeing it as available12:34
sean-k-mooneythat will cause it to be enrolled as disabled, then you can put it into a separate aggregate that only your test project can use and test it after you enable it12:35
sean-k-mooneypas-ha[m]: the basic workflow is enroll new nodes as disabled, put them in a restricted test aggregate, enable them, test them and finally remove them from the aggregate12:37
sean-k-mooneypas-ha[m]: https://docs.openstack.org/nova/latest/admin/aggregates.html#tenant-isolation-with-placement is what i generally recommend for this12:38
sean-k-mooneybut there are other approaches that work too12:39
sean-k-mooneyi prefer doing it with tenant isolation so that you can use the actual flavors/images you intend to use and just delete the placement aggregate when you're done12:40
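The onboarding workflow described above can be written down as an ordered list of `openstack` CLI invocations. This is only a sketch: the host, aggregate and project names are hypothetical, it assumes the node registered as disabled because `enable_new_services` is false, and it assumes the aggregate is restricted through `filter_tenant_id` metadata as used by the tenant isolation features linked above. Which scheduler option or filter enforces that restriction depends on the deployment.

```python
def onboarding_commands(host, aggregate, project_id):
    """Return the onboarding steps as argv lists, in the order they should run."""
    return [
        # 1. Create a restricted test aggregate for the test project.
        ["openstack", "aggregate", "create", aggregate],
        ["openstack", "aggregate", "set",
         "--property", f"filter_tenant_id={project_id}", aggregate],
        # 2. Put the (still disabled) new node into it.
        ["openstack", "aggregate", "add", "host", aggregate, host],
        # 3. Enable the compute service so the test project can schedule to it.
        ["openstack", "compute", "service", "set", "--enable", host, "nova-compute"],
        # 4. After testing, release the node to the rest of the cloud.
        ["openstack", "aggregate", "remove", "host", aggregate, host],
    ]


# Hypothetical names, for illustration only.
for argv in onboarding_commands("cmp-new-01", "onboarding-test", "test-project-id"):
    print(" ".join(argv))
```

Keeping the steps as data makes it easy to review the order before running anything, for example by passing each list to `subprocess.run` once the test project has finished its checks.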
sean-k-mooneypas-ha[m]: you can't migrate via the old api by the way, the way disable works will filter out the host so --force won't work12:41
pas-ha[m]nope,it does work, tested on Antelope12:41
sean-k-mooneyit should not because the placement prefilter can't be turned off12:41
pas-ha[m]```12:42
sean-k-mooneyif it works its a bug12:42
pas-ha[m] * import openstack... (full message at <https://matrix.org/oftc/media/v1/media/download/AUsfx3faFN1prujTkMr2mB2XbtUHmg98qQVs5w-ulEaFy1UTw3_HViUOe21mF6pw3ijesukW_RFxPoO-qfRuuFpCeS4pkHBQAG1hdHJpeC5vcmcvWEJQRlZGcGRCc0JuWWFiZmdocmlKaGlw>)12:42
pas-ha[m]API ref says the same btw. if you use older than 2.30 api, the scheduler does not apply when migrating to a specific host12:43
sean-k-mooney pas-ha[m]  https://github.com/openstack/nova/blob/master/nova/scheduler/request_filter.py#L242-L25512:43
sean-k-mooneypas-ha[m]: every request we make to placement unconditionally forbids hosts with the COMPUTE_STATUS_DISABLED trait12:43
pas-ha[m]afaiu in older api scheduler is not involved at all I guess...12:44
pas-ha[m]https://docs.openstack.org/api-ref/compute/#live-migrate-server-os-migratelive-action12:44
pas-ha[m]Prior to microversion 2.30, specifying a host will bypass validation by the scheduler, which could result in failures to actually migrate the instance to the specified host, or over-subscription of the host.12:44
sean-k-mooneyit is but it does very minimal checks12:44
sean-k-mooneybut the placement one i did not think was skipped12:44
sean-k-mooneybasically it does not run any filters or weighers12:44
sean-k-mooneybut it does confirm the host exists12:45
sean-k-mooneyit's possible we have a bug and we are early outing before checking with placement12:45
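The disagreement above comes down to whether a forced target host is still checked against placement. The toy model below is not Nova's code; the function names and the `check_placement` switch are hypothetical and exist only to show how an early return before the placement check would let a disabled host through.

```python
DISABLED_TRAIT = "COMPUTE_STATUS_DISABLED"


def placement_candidates(hosts, forbidden=(DISABLED_TRAIT,)):
    """Return the hosts that carry none of the forbidden traits."""
    return sorted(
        name for name, traits in hosts.items()
        if not (set(forbidden) & set(traits))
    )


def pick_target(hosts, requested_host, check_placement=True):
    """Toy model of targeting a specific host for a move operation."""
    if requested_host not in hosts:
        raise LookupError(f"unknown host {requested_host}")
    if check_placement and requested_host not in placement_candidates(hosts):
        return None  # rejected: the host is disabled
    return requested_host


hosts = {
    "cmp1": {"HW_CPU_X86_AVX2"},
    "cmp2": {"HW_CPU_X86_AVX2", DISABLED_TRAIT},
}
print(placement_candidates(hosts))                      # only cmp1 qualifies
print(pick_target(hosts, "cmp2"))                       # expected behaviour: rejected
print(pick_target(hosts, "cmp2", check_placement=False))  # the suspected early-out
```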
pas-ha[m]sean-k-mooney: and, changing to other scheduling problems, what would you think on https://bugs.launchpad.net/nova/+bug/2078669 (and https://review.opendev.org/c/openstack/nova/+/927744)?12:49
pas-ha[m]this is also something that was working before the AZFilter was removed (we had to re-instate it back in downstream 😕 )12:49
pas-ha[m]how can I boot an instance 'not in any of those AZs that I see' ?12:50
sean-k-mooneythat's a different issue and strictly speaking also not supported12:52
sean-k-mooneypas-ha[m]: see the red warning box in https://docs.openstack.org/nova/latest/admin/availability-zones.html12:53
sean-k-mooneyi am more open to supporting the default nova az but bauzas is strongly against it. we have documented that you should not ever request the nova az effectively since AZs were first added12:54
*** iurygregory_ is now known as iurygregory12:57
bauzassean-k-mooney: hmmm, what is the question ?12:57
pas-ha[m]yeah, I know that doc. Still, this is in admin part of docs, and users rarely are aware of it, and they do see 'nova' as in az list... and again, if I have hosts in az 'foo' and hosts in no explicit az ('nova'), previously it was possible to boot "not in foo" == boot in 'nova' az. Now its not possible any more 😕12:57
pas-ha[m]btw, the api ref says that regular users should not see 'nova' az in az list, but I am pretty sure that's still the case even on master12:58
pas-ha[m]bauzas: https://bugs.launchpad.net/nova/+bug/2078669 and the patch to that12:58
sean-k-mooneyhorizon adds it to the list if no azs are returned i believe12:59
pas-ha[m]openstack cli does show it too12:59
bauzaspas-ha[m]: tbc, the concern is that by default 'nova' isn't an AZ13:00
bauzasif you want, you can change the default value for instances so there you could create a specific aggregate with a 'nova' AZ 13:01
bauzasbut if you set an instance with '-az nova' then we'll pin into it13:01
pas-ha[m]I know, but it is represented as such to the users still. And the user sees az 'expensive' and az 'nova', user thinks 'i definitely do not need 'expensive', and uses 'nova'. Before, with AZFilter being available, the instances landed where user intended. Now they are not.13:03
sean-k-mooneyubuntu@devstack-iso:~$ openstack --os-cloud=devstack --debug availability zone list --compute 2>&1| nc termbin.com 999913:05
sean-k-mooneyhttps://termbin.com/b8q913:05
sean-k-mooneybauzas: we do have a bug ^13:05
pas-ha[m]I proposed a patch that would make az 'nova' (or whatever default az is) be treated as 'not any host that belongs to any explicit az' (if there's no actual explicit az nova of course). IMO that should be exactly what the user intended.13:06
sean-k-mooneypas-ha[m]: that would break things13:06
pas-ha[m]this was effectively working like that until AZFilter was removed recently13:07
sean-k-mooneythe problem with that is we do not allow moving hosts between azs13:07
pas-ha[m](in bobcat or caracal)13:07
sean-k-mooneyif they have instances on them that is13:08
pas-ha[m]yeah, but I am not forcing any AZ here. I am only forcing placement to discard any other candidates13:08
sean-k-mooneyyou can't do that efficiently without basically passing all az aggregates in a not13:09
sean-k-mooneywhich does not scale13:09
sean-k-mooneyand will hit the max url limit in large clouds13:09
sean-k-mooneythe only way to make this work would be to group all hosts not in an az into a separate aggregate13:09
sean-k-mooneyand then explicitly request that13:09
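The scaling concern above can be made concrete by measuring how long a placement query grows when every AZ aggregate has to be excluded. This is a rough sketch: it assumes the `member_of=!in:<uuid>,<uuid>` forbidden-aggregate syntax of the placement API, uses random UUIDs in place of real aggregates, and takes 8192 bytes as an assumed (not universal) URL limit.

```python
import uuid
from urllib.parse import urlencode

ASSUMED_URL_LIMIT = 8192  # assumption: a common default, real limits vary


def forbidden_aggregates_query(agg_uuids):
    """Build an allocation-candidates query that excludes every given aggregate."""
    params = {
        "resources": "VCPU:1",
        "member_of": "!in:" + ",".join(agg_uuids),
    }
    return "/allocation_candidates?" + urlencode(params)


for count in (1, 10, 100, 1000):
    aggregates = [str(uuid.uuid4()) for _ in range(count)]
    length = len(forbidden_aggregates_query(aggregates))
    print(count, length, "over limit" if length > ASSUMED_URL_LIMIT else "ok")
```

Each extra aggregate adds a 36-character UUID plus an encoded separator, so the URL grows linearly with the number of AZ aggregates, while requesting one catch-all aggregate keeps it constant.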
pas-ha[m]yeah, that's an interesting thought.. about the url length at least13:10
bauzasnot sure I understand the problem sorry13:10
bauzasthe fact that we return 'nova' per the user AZ list ?13:11
sean-k-mooneybauzas: as a non admin if you do an az list you get nova13:11
bauzasif so, we should *not* return it, I agree13:11
sean-k-mooneywhich means it looks valid to an end user13:11
sean-k-mooneybecause they can't know what the default az is13:11
sean-k-mooney*default internal az is13:11
sean-k-mooneyactually no i was right the first time https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.default_availability_zone13:12
bauzasokay, then I agree: we shouldn't return the default AZ in that list13:12
bauzashttps://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.default_availability_zone13:12
sean-k-mooneyyou can actually create a real az called nova and change that to something else13:12
sean-k-mooneythe default az can be returned if it's an actual az13:13
bauzasfor defaulting to pin a specific AZ, this is https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.default_schedule_zone13:13
bauzaswe could even make default_az to None, I don't see why we need to have 'nova'13:13
sean-k-mooneyi was just thinking about that13:13
bauzasbut we could and should discuss this at the PTG13:13
sean-k-mooneyya i don't know off the top of my head anything that would break13:13
sean-k-mooneyit would stop instances reporting an az if they are on a host that is not part of any az13:14
sean-k-mooneybut that is more correct than "nova"13:14
pas-ha[m]as I said, api ref already promises that non-admin should not see it13:14
pas-ha[m]>  The default availability zone is named nova. This AZ is only shown when listing the availability zones as an admin.13:14
pas-ha[m]https://docs.openstack.org/api-ref/compute/#create-server13:14
sean-k-mooneypas-ha[m] right but we have a bug because that is not what the api is doing13:15
sean-k-mooneyhttp://192.168.50.7:80 "GET /compute/v2.1/os-availability-zone HTTP/1.1" 200 9713:16
sean-k-mooneyRESP: [200] Connection: close Content-Length: 97 Content-Type: application/json Date: Thu, 17 Oct 2024 13:15:54 GMT OpenStack-API-Version: compute 2.1 Server: Apache/2.4.58 (Ubuntu) Vary: OpenStack-API-Version,X-OpenStack-Nova-API-Version X-OpenStack-Nova-API-Version: 2.1 x-compute-request-id: req-17af8a83-82f0-45ce-81fd-c6d54a0c0ad9 x-openstack-request-id:13:16
sean-k-mooneyreq-17af8a83-82f0-45ce-81fd-c6d54a0c0ad913:16
sean-k-mooneyRESP BODY: {"availabilityZoneInfo": [{"zoneName": "nova", "zoneState": {"available": true}, "hosts": null}]}13:16
sean-k-mooneyi'll quickly check but i'm 99% sure i don't have any host aggregate defined on this devstack13:16
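The response body pasted above can be checked mechanically for the default zone. A small sketch, using the body exactly as it appears in the log; the helper name `zone_names` is made up for this example.

```python
import json

# Response body copied from the debug output above.
RESP_BODY = (
    '{"availabilityZoneInfo": [{"zoneName": "nova", '
    '"zoneState": {"available": true}, "hosts": null}]}'
)


def zone_names(body):
    """Return the zone names listed in an os-availability-zone response."""
    return [zone["zoneName"] for zone in json.loads(body)["availabilityZoneInfo"]]


names = zone_names(RESP_BODY)
print(names)
print("default 'nova' zone exposed:", "nova" in names)
```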
pas-ha[m]And I am afraid something would break if we start doing that. This would effectively force operators to always put hosts into some explicit AZ, because users who are used to pick the AZ from the list will just use one of those the list presents them, making hosts not in any AZ underused13:17
sean-k-mooneywell az are optional13:18
sean-k-mooneyi would expect most users not to specify one13:18
sean-k-mooneyunless it's via horizon in which case they still don't need to select one13:18
sean-k-mooneypas-ha[m]: the answer to this is AZs should be replaced with a first class api object and we should have some way to auto enroll compute services in AZs13:24
sean-k-mooneythe problem is we have never found an acceptable solution to do that but i know others are interested in that direction13:25
pas-ha[m]just to re-iterate, for the exact use case that we had broken for us:... (full message at <https://matrix.org/oftc/media/v1/media/download/AZi6ENlpQm3FPa8euVAaf8oGaS7R6Eq6HS5W4Rwq_LvrIBUm5dYilERrZnNQr5tnKVfobGYw-LXZy-WaqFYBXT5CeS4sIl5AAG1hdHJpeC5vcmcva0RHSEVMREVleFlzVUF6SldadkRXeWlW>)13:27
sean-k-mooneyan AZ is not the correct way to enable that usecase13:28
sean-k-mooneyit can be done with az but it's error prone13:28
sean-k-mooneypas-ha[m]: just to make sure i understand correctly do you allow hugepage flavor to be used elsewhere or rather what is the criteria for allowing it on a dpdk host13:31
pas-ha[m]only those hosts are configured with huge pages, so scheduler filters other hosts out. the problem is to keep flavors that do not use huge pages off those hosts.13:33
pas-ha[m]probably adding some extra spec like forbidden=... to any other flavor would work too, but that's more cumbersome than what was working before.13:34
sean-k-mooneyyou can map specific flavors to hosts with the AggregateTypeAffinityFilter, you can also use AggregateInstanceExtraSpecsFilter or aggregate_image_properties_isolation to enforce extra specs are requested in the flavor/image, or you can use the placement feature that was created explicitly for this type of grouping13:34
sean-k-mooneyhttps://docs.openstack.org/nova/latest/reference/isolate-aggregates.html13:34
sean-k-mooneyif you use  ^13:35
sean-k-mooneyyou can make it so that if a flavor does not request a custom trait it can't be scheduled to the host13:35
sean-k-mooneyyou just create a CUSTOM_DPDK trait. create a host aggregate with your dpdk hosts and add trait:CUSTOM_DPDK=required13:37
sean-k-mooneyand request trait:CUSTOM_DPDK=required in your dpdk flavors13:37
sean-k-mooneypas-ha[m]: that was added in train https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/placement-req-filter-forbidden-aggregates.html13:40
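The isolation rule described above can be modelled in a few lines: an aggregate that requires traits is only usable by a flavor that requests all of them. This is a toy model, not Nova's implementation; the flavor dictionaries are hypothetical and only the `trait:CUSTOM_DPDK=required` key comes from the discussion.

```python
def required_traits(extra_specs):
    """Extract trait names marked as required from extra specs or aggregate metadata."""
    return {
        key.split(":", 1)[1]
        for key, value in extra_specs.items()
        if key.startswith("trait:") and value == "required"
    }


def aggregate_allowed(flavor_extra_specs, aggregate_metadata):
    """An isolated aggregate is usable only if the flavor requests all its traits."""
    return required_traits(aggregate_metadata) <= required_traits(flavor_extra_specs)


dpdk_aggregate = {"trait:CUSTOM_DPDK": "required"}
dpdk_flavor = {"trait:CUSTOM_DPDK": "required", "hw:mem_page_size": "large"}
plain_flavor = {"hw:cpu_policy": "shared"}  # hypothetical non-DPDK flavor

print(aggregate_allowed(dpdk_flavor, dpdk_aggregate))   # DPDK flavor may land here
print(aggregate_allowed(plain_flavor, dpdk_aggregate))  # plain flavor is kept out
```

The subset check is the whole point: a flavor without the trait is excluded from the DPDK hosts without having to add a forbidden extra spec to every other flavor.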
opendevreviewGhanshyam proposed openstack/nova master: Update gate jobs as per the 2025.1 cycle testing runtime  https://review.opendev.org/c/openstack/nova/+/93264818:39
opendevreviewMengyang Zhang proposed openstack/nova-specs master: Add Burst Length Support to Cinder QoS  https://review.opendev.org/c/openstack/nova-specs/+/93265320:00
opendevreviewMengyang Zhang proposed openstack/nova-specs master: Add Burst Length Support to Cinder QoS  https://review.opendev.org/c/openstack/nova-specs/+/93265320:25