opendevreview | Merged openstack/nova master: docs: Replace 'nova boot' with 'openstack server create' https://review.opendev.org/c/openstack/nova/+/794007 | 01:37 |
---|---|---|
*** abhishekk is now known as akekane|home | 04:44 | |
*** akekane|home is now known as abhishekk | 04:44 | |
opendevreview | Takashi Kajinami proposed openstack/nova master: Clean up allocations left by evacuation when deleting service https://review.opendev.org/c/openstack/nova/+/778696 | 06:07 |
opendevreview | Takashi Kajinami proposed openstack/nova master: Clean up allocations left by evacuation when deleting service https://review.opendev.org/c/openstack/nova/+/778696 | 06:19 |
*** bhagyashris_ is now known as bhagyashris|ruck | 06:25 | |
opendevreview | Yongli He proposed openstack/nova master: Smartnic support - cyborg drive https://review.opendev.org/c/openstack/nova/+/771362 | 08:12 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - new vnic type https://review.opendev.org/c/openstack/nova/+/771363 | 08:12 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - create arqs https://review.opendev.org/c/openstack/nova/+/758944 | 08:12 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - build instance with smartnic arqs https://review.opendev.org/c/openstack/nova/+/798249 | 08:12 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - cleanup arqs https://review.opendev.org/c/openstack/nova/+/798054 | 08:12 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - reject server move and suspend https://review.opendev.org/c/openstack/nova/+/779913 | 08:12 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - functional tests https://review.opendev.org/c/openstack/nova/+/780147 | 08:12 |
gibi | stephenfin: can we get this in https://review.opendev.org/c/openstack/os-resource-classes/+/796591 I'd like to get an os-resource-classes lib released as neutron also needs this | 08:42 |
opendevreview | Merged openstack/os-resource-classes master: Add packet rate related resource classes https://review.opendev.org/c/openstack/os-resource-classes/+/796591 | 09:34 |
opendevreview | Stephen Finucane proposed openstack/nova master: Use neutronclient's port binding APIs https://review.opendev.org/c/openstack/nova/+/706295 | 10:21 |
opendevreview | Stephen Finucane proposed openstack/nova master: fixtures: Raise HTTP 409 if binding is already active https://review.opendev.org/c/openstack/nova/+/800767 | 10:21 |
stephenfin | gibi: Respun that again (hopefully the last time) if you care to take a look ^ | 10:21 |
gibi | on it | 10:23 |
stephenfin | ta | 10:23 |
gibi | done | 10:26 |
gibi | thanks for fixing the todo I left in the fixture | 10:26 |
* gibi disappears for 90 mins | 10:26 | |
opendevreview | Takashi Kajinami proposed openstack/nova master: Clean up allocations left by evacuation when deleting service https://review.opendev.org/c/openstack/nova/+/778696 | 10:28 |
opendevreview | Takashi Kajinami proposed openstack/nova master: Clean up allocations left by evacuation when deleting service https://review.opendev.org/c/openstack/nova/+/778696 | 10:33 |
opendevreview | Takashi Kajinami proposed openstack/nova master: Clean up allocations left by evacuation when deleting service https://review.opendev.org/c/openstack/nova/+/778696 | 10:36 |
opendevreview | Balazs Gibizer proposed openstack/placement master: Move placement specs from nova https://review.opendev.org/c/openstack/placement/+/800769 | 10:47 |
sean-k-mooney | stephenfin: if the recheck fails on https://review.opendev.org/c/openstack/os-vif/+/798055 i'll address the remaining typos | 11:32 |
sean-k-mooney | otherwise i'll fix them in a followup when i start working on the trunk bridge deletion bug. | 11:32 |
* gibi resurfaces | 12:00 | |
opendevreview | Balazs Gibizer proposed openstack/nova-specs master: Move placement specs to placement repo https://review.opendev.org/c/openstack/nova-specs/+/800775 | 12:04 |
gibi | melwitt: I think the best is to move the placement targeting specs to the placement repository. So I proposed a nova-specs and a placement patch to do so | 12:05 |
gibi | melwitt: https://review.opendev.org/c/openstack/nova-specs/+/800775 | 12:05 |
gibi | melwitt: https://review.opendev.org/c/openstack/placement/+/800769 | 12:05 |
sean-k-mooney | was that motivated by my comment on melwitt's consumer types patch | 12:09 |
gibi | sean-k-mooney: I already detected my mistake when we merged my re-parent RP spec | 12:11 |
gibi | sean-k-mooney: just haven't had the time to do the move since | 12:12 |
*** hemna0 is now known as hemna | 12:53 | |
opendevreview | Stephen Finucane proposed openstack/nova master: db: Final cleanups https://review.opendev.org/c/openstack/nova/+/800484 | 13:03 |
gibi | sean-k-mooney: I looked through the cyborg smartnic impl. I'm missing the part where the smartnic arq triggers a piece of domain xml being generated to pass through a VF. The flavor based arqs are handled via the arq uuids passed to the libvirt driver but the smartnic impl does not pass the smartnic arqs this way. Does this xml generation happen because the port has some pci info in the binding profile? | 14:45 |
gibi | the last bullet in the spec https://specs.openstack.org/openstack/nova-specs/specs/xena/approved/support-sriov-smartnic.html#nova seems to confirm my above guess | 14:48 |
melwitt | gibi: ack will review them, thanks for doing that | 15:26 |
gibi | melwitt: thanks | 15:32 |
mnaser | hmm, i ran into an interesting scenario. i have 2 az in a cloud with cross_az_attach=False, everything works as expected, but for bfv, since a volume type is only available in a specific az, volumes go to 'error' right away | 15:46 |
mnaser | would this be a 'cinder' bug technically because nova is requesting a volume with no type included but az=<foobar> and it's going to the default that is not available in az=<foobar> ? | 15:47 |
mnaser | yeah, i think that's more of a cinder thing now that i'm writing it out.. i'll take it there if anyone is curious | 15:48 |
opendevreview | Balazs Gibizer proposed openstack/placement master: Add support for RP re-parenting and orphaning https://review.opendev.org/c/openstack/placement/+/784020 | 15:56 |
gibi | sean-k-mooney: I fixed nits and replied to your comment in ^^ | 15:56 |
NobodyCam | morning Nova Folks.. I have an instance that is stuck in build state. it shows up with openstack server list. But I get "No server with a name or ID" when attempting to run an openstack server show.. would anyone be able to point me in the right direction to remove this failed instance | 15:56 |
melwitt | NobodyCam: you might have an orphaned build_requests record for the instance. if you have one of those records and no instance_mappings record for the instance, you will need to delete the build_requests record from the database manually | 15:59 |
melwitt | these tables are both in the nova_api database | 16:00 |
NobodyCam | ahh Thank you melwitt !! | 16:00 |
NobodyCam | been looking at nova db. haven't even looked at the api side ;p | 16:01 |
melwitt | NobodyCam: yeah.. you likely don't have any trace of the instance in the nova db. usually the reason you can see the instance in the 'server list' but not the 'server show' is because 'server list' will count build_requests as instances as well in order to preserve the behavior where you can see an instance immediately after you request to boot it. but if there's no instance_mappings record, nova will think it can't find anything. it | 16:06 |
melwitt | was this bug https://bugs.launchpad.net/nova/+bug/1784093 | 16:06 |
melwitt | are you running a version older than queens? | 16:06 |
NobodyCam | actually in this region it is Queens, it's in process of migrating to ussuri but not complete yet | 16:09 |
melwitt | hm, ok. there might be another corner case but ^ was the most common one | 16:10 |
melwitt | released in 17.0.11 | 16:11 |
NobodyCam | Interesting not in build_requests but did find reference to the instance_id in instance_mappings table in nova_api DB | 16:20 |
melwitt | oh, so the inverse? hm | 16:24 |
NobodyCam | I should note this region had a network outage, caused some strange things, I am attempting to recover | 16:25 |
melwitt | I need to go back through and look at all those again, there's some places where if one db write succeeds while another fails, you get these bad states | 16:26 |
NobodyCam | Safe to just nuke the instance_mapping record? | 16:26 |
melwitt | yeah, I'd make sure it's not in nova.instances and then delete nova_api.instance_mappings, nova_api.request_specs for the instance uuid | 16:27 |
NobodyCam | : thumbs_up : | 16:27 |
NobodyCam | Thank you melwitt I think this will get things back on track for us! | 16:42 |
melwitt | np, good luck | 16:43 |
gmann | gibi: dansmith: replied to these comments on the 'project admin getting hypervisor uuid' spec https://review.opendev.org/c/openstack/nova-specs/+/793011/3/specs/xena/approved/allow-project-admin-list-hypervisors.rst#43 | 17:18 |
gmann | gibi: dansmith https://review.opendev.org/c/openstack/nova-specs/+/793011/3/specs/xena/approved/allow-project-admin-list-hypervisors.rst#49 | 17:18 |
gmann | can you please confirm/reply on those? I will update the spec accordingly | 17:18 |
gmann | sorry for responding late on this | 17:20 |
opendevreview | melanie witt proposed openstack/nova stable/train: [CI] Fix gate by using zuulv3 live migration and grenade jobs https://review.opendev.org/c/openstack/nova/+/795435 | 17:31 |
opendevreview | Stephen Finucane proposed openstack/nova master: nova-manage: Introduce bdm show, refresh, get_connector commands https://review.opendev.org/c/openstack/nova/+/800634 | 17:33 |
opendevreview | Ghanshyam proposed openstack/nova-specs master: Allow project admin to list hypervisors https://review.opendev.org/c/openstack/nova-specs/+/793011 | 18:25 |
gmann | dansmith: ^^ updated | 18:26 |
dansmith | gmann: okay just to be clear, you're planning to accept the uuid as the hostname, in the same field, and without a microversion for that? | 18:28 |
gmann | dansmith: yeah. | 18:28 |
dansmith | okay I guess I dunno about the legitimacy of that.. because someone won't be able to know whether a nova is new enough to accept that behavior. it's not a structural change in the request/response, but it is a different behavior that they won't know if they can use or not, right? | 18:29 |
gmann | dansmith: humm, behavior wise yes it is changed.. | 18:31 |
dansmith | yeah, so I dunno, probably best to get someone else's opinion on the matter, but it sure seems like that should go hand-in-hand with the microversion to expose it | 18:32 |
gmann | the issue with a microversion bump is that it would then not be aligned with the policy changes, which are made without a microversion. | 18:34 |
sean-k-mooney | why are we overloading the field | 18:34 |
gmann | I think booting a server with a uuid will end up with an error? | 18:35 |
sean-k-mooney | i can kind of understand at the osc level allowing --host | 18:35 |
gmann | currently | 18:35 |
sean-k-mooney | to be the uuid or hostname | 18:35 |
sean-k-mooney | but at the api that feels weird to me | 18:35 |
gmann | humm | 18:35 |
gmann | testing here https://review.opendev.org/c/openstack/tempest/+/793632 | 18:35 |
sean-k-mooney | we have instances of this i think, for instance show or flavor show, where you can pass the name or uuid | 18:36 |
sean-k-mooney | but i kind of assumed we were going to add a new field for this | 18:37 |
sean-k-mooney | is that not what we meant by https://review.opendev.org/c/openstack/nova-specs/+/793011/4/specs/xena/approved/allow-project-admin-list-hypervisors.rst#97 | 18:38 |
gmann | sean-k-mooney: for this case, we are thinking of allowing it in the same field and that is why there is an interop issue | 18:38 |
sean-k-mooney | well we said we would accept hypervisor-uuid now | 18:38 |
sean-k-mooney | so that is a new field no? | 18:39 |
sean-k-mooney | i have not been following this closely sorry | 18:39 |
gmann | sean-k-mooney: no, in the same field, like in 'availability_zone' az:host:node with host as a uuid | 18:39 |
gmann | if new field then sure we need microversion bump | 18:40 |
sean-k-mooney | i kind of feel like this should be a microversion bump | 18:41 |
sean-k-mooney | i mean it's unlikely you are using uuids for your hypervisor host names but it would have been allowed before | 18:41 |
sean-k-mooney | actually don't we do that for ironic | 18:41 |
sean-k-mooney | the hypervisor_hostname is the ironic node uuid | 18:42 |
dansmith | but that is the actual hostname we record, | 18:42 |
dansmith | so it's different | 18:42 |
sean-k-mooney | so if we reuse the same field don't we have the possibility of a uuid collision, even though that won't happen in reality | 18:43 |
opendevreview | Merged openstack/nova master: Use neutronclient's port binding APIs https://review.opendev.org/c/openstack/nova/+/706295 | 18:43 |
sean-k-mooney | dansmith: well for ironic servers the hypervisor hostname for each server is the uuid, right | 18:43 |
dansmith | right, that's my point.. so it's not like there's prior art for making them interchangeable, it's that we actually record the uuid *as* the hostname in that case | 18:44 |
sean-k-mooney | yes | 18:44 |
sean-k-mooney | do we also make the compute node uuid match the host name for ironic? | 18:44 |
dansmith | yeah, I think that changed at some point and now we do | 18:46 |
sean-k-mooney | yes https://github.com/openstack/nova/blob/master/nova/virt/ironic/driver.py#L354-L355 | 18:46 |
sean-k-mooney | ya i dont think it always was either but it is now | 18:46 |
sean-k-mooney | https://github.com/openstack/nova/commit/9f28727eb75e05e07bad51b6eecce667d09dfb65 | 18:46 |
sean-k-mooney | to fix https://bugs.launchpad.net/nova/+bug/1771806 | 18:47 |
sean-k-mooney | gmann: that tempest test https://review.opendev.org/c/openstack/tempest/+/793632/7/tempest/scenario/test_server_multinode.py is still using the old way to select hosts | 18:48 |
sean-k-mooney | using the az hack no? | 18:48 |
gmann | sean-k-mooney: adding host uuid there https://review.opendev.org/c/openstack/tempest/+/793632/7/tempest/scenario/test_server_multinode.py#61 | 18:49 |
sean-k-mooney | but you are using host_name | 18:50 |
sean-k-mooney | not host or hypervisor_hostname | 18:51 |
sean-k-mooney | its still computing the az | 18:51 |
sean-k-mooney | https://review.opendev.org/c/openstack/tempest/+/793632/7/tempest/scenario/test_server_multinode.py#82 | 18:51 |
sean-k-mooney | it's not using https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/add-host-and-hypervisor-hostname-flag-to-create-server.html | 18:52 |
gmann | host_name is what i changed to add uuid in L61 | 18:52 |
gmann | host['host_name'] is uuid | 18:53 |
gmann | I remember it failed with uuid but double checking | 18:53 |
sean-k-mooney | so using the az the way this test does is more or less deprecated | 18:53 |
sean-k-mooney | we can test this but we should be testing with 2.74 | 18:54 |
sean-k-mooney | and not setting the az at all | 18:54 |
gmann | sean-k-mooney: I am testing with the AZ case for force host. 'host', 'hostname', which use the scheduler, I am not testing currently | 18:54 |
gmann | AZ way is not deprecated right? | 18:54 |
sean-k-mooney | it's strongly discouraged | 18:55 |
gmann | create server support both | 18:55 |
sean-k-mooney | it's not deprecated and i don't think we want to extend its usage | 18:55 |
gmann | yeah discouraged but not deprecated | 18:55 |
sean-k-mooney | i would like to deprecate it | 18:55 |
gmann | I am just checking compatibility of uuid case | 18:55 |
sean-k-mooney | its replacement has been available since train | 18:55 |
sean-k-mooney | gmann: well i guess my point is i don't think we should be supporting this the old way | 18:57 |
sean-k-mooney | i'm fine with project admins requesting a host using the info from the hypervisor api to get the uuid | 18:57 |
sean-k-mooney | but i'm not sure they should be able to force the host and bypass the scheduler filters | 18:58 |
gmann | sean-k-mooney: sure. but in the current proposed change I am checking uuid compatibility. But if we want to disallow bypassing the scheduler filters, that is a separate behavior change, or we can say 'bypassing the scheduler filters stops working with new rbac' | 18:59 |
sean-k-mooney | no i dont think that is really valid. | 19:00 |
sean-k-mooney | it's an interop issue | 19:00 |
sean-k-mooney | i can't tell if you are using the new rbac or not | 19:00 |
gmann | I think we can discuss this separately to deprecate/stop the 'bypass the scheduler filters' case instead of mixing the two | 19:00 |
sean-k-mooney | which is why i have argued that any policy change should be a microversion bump | 19:00 |
sean-k-mooney | the old az format can work without any nova code changes right | 19:01 |
sean-k-mooney | with the uuid because it accepted both | 19:01 |
gmann | yeah that is why I kind of agree with dansmith's concern about the 'need of microversion bump'; I am just testing as well so that I am fully convinced of the interop issue | 19:01 |
gmann | sean-k-mooney: let's see. i can add host way also in test | 19:02 |
sean-k-mooney | gmann: can you add a new tempest test to test with the other way to request it, without bypassing the filters | 19:02 |
gmann | sure | 19:02 |
sean-k-mooney | cool | 19:02 |
sean-k-mooney | in which case we would expect the uuid to be in host? or hypervisor_hostname? | 19:03 |
sean-k-mooney | or in hypervisor_uuid | 19:03 |
sean-k-mooney | which does not exist today | 19:04 |
sean-k-mooney | dansmith: was this what you were poking at^ | 19:04 |
gmann | if new field then we need microversion bump for sure | 19:05 |
sean-k-mooney | it feels a little odd to me to do openstack hypervisor list and get the uuid and put it in hypervisor_hostname | 19:05 |
sean-k-mooney | and using host would feel equally odd since that is the host on which the compute service runs, so for ironic that is not the same thing | 19:06 |
sean-k-mooney | so for the newer way i think we need a hypervisor_uuid field on server create to use this | 19:07 |
dansmith | I'm not so concerned about new field vs new behavior for existing field, | 19:07 |
dansmith | but it seems worthy of a microversion to me either way so we know if we can use it or not | 19:08 |
sean-k-mooney | dansmith: well with a new field the old field behavior would not change | 19:08 |
dansmith | right but new field would mean microversion for sure | 19:08 |
gmann | yeah, only way to avoid microversion is existing field but that also seems difficult/not right | 19:09 |
dansmith | well, I was saying I don't even think you can avoid it then, because we won't know when we can or can't use it | 19:10 |
gmann | yeah.. | 19:10 |
gmann | now we are back to same issue on how to solve this for new rbac | 19:10 |
sean-k-mooney | if it was a uuid we would basically have to always check if there was a hypervisor_hostname with the value or a hypervisor_uuid with that value | 19:11 |
sean-k-mooney | assuming it was one field | 19:11 |
sean-k-mooney | gmann: for new rbac i don't understand the problem | 19:11 |
sean-k-mooney | you just use the new microversion; how does that affect things? | 19:12 |
gmann | sean-k-mooney: a project admin needs to know the hypervisor name to boot a server on a host | 19:12 |
sean-k-mooney | right which it can't get today without system_reader | 19:12 |
gmann | with new rbac and an older microversion (than where we fix this), a project admin would not be able to boot a server on a specified host | 19:13 |
sean-k-mooney | we don't | 19:13 |
sean-k-mooney | we only support this from the new microversion on | 19:13 |
sean-k-mooney | and if you want to support that for old microversions give them system_reader or create a custom policy role for hypervisor list | 19:14 |
sean-k-mooney | so effectively i believe we should just pretend that we are allowing project admins to specify a host as a new feature in xena | 19:15 |
gmann | yeah so for older microversions they would have no choice other than updating the policy, which is the issue we have today | 19:16 |
gmann | and with old rbac default they are able to do | 19:16 |
gmann | I think we need to think more on this. I am not hurrying it for Xena at the last min. | 19:17 |
sean-k-mooney | well let's take a step back | 19:18 |
sean-k-mooney | the old policy for hypervisors was admin api and the new one is system reader | 19:18 |
sean-k-mooney | correct | 19:18 |
sean-k-mooney | so up to now without custom policy you basically need to be an admin or have readonly admin rights to list hostnames via the hypervisors api | 19:19 |
sean-k-mooney | so even though project_admins could specify a host https://github.com/openstack/nova/blob/master/nova/policies/servers.py#L177-L196 | 19:20 |
sean-k-mooney | in practice they did not know the name unless an admin told them | 19:21 |
sean-k-mooney | so in practice they could not use this capability | 19:21 |
sean-k-mooney | and the same is true of the non az way https://github.com/openstack/nova/blob/master/nova/policies/servers.py#L177-L225 | 19:21 |
gmann | sean-k-mooney: legacy project_admins can list hypervisor and specify host | 19:22 |
sean-k-mooney | gmann: there is no such thing as a legacy project admin | 19:22 |
gmann | legacy admin | 19:22 |
sean-k-mooney | they are a full admin | 19:23 |
gmann | yeah the old admin can do both operations | 19:23 |
sean-k-mooney | right | 19:23 |
sean-k-mooney | so i dont see why we have to try and support this before xena | 19:24 |
sean-k-mooney | we may have typed PROJECT_ADMIN on those policies | 19:24 |
sean-k-mooney | but in reality from a PROJECT_ADMIN point of view they could not use them | 19:24 |
gmann | PROJECT_ADMIN is new rbac policy | 19:24 |
sean-k-mooney | yes i know | 19:25 |
gmann | if we think of the old admin way, they were allowed to list hypervisors and boot a server on one. but with new rbac no admin, system or project, can do that | 19:25 |
sean-k-mooney | correct | 19:25 |
sean-k-mooney | although system admin would fail for another reason | 19:25 |
sean-k-mooney | namely because it does not have a project uuid | 19:26 |
sean-k-mooney | so what i'm proposing is simple: document that to use requested_destination with a uuid you need a new microversion, and add a new hypervisor_uuid field to server create | 19:27 |
sean-k-mooney | in addition to that we can allow listing the hypervisors' uuids via os-hypervisors with project admin | 19:27 |
sean-k-mooney | that is consistent with our microversion guideline as we are adding a new feature to the api: booting with a hypervisor uuid as a target host | 19:28 |
gmann | yeah we can always do that if we want to leave old microversions unsolved. but as discussed at the xena PTG, and since the start, we are first trying "how we can solve this problem without microversion" | 19:28 |
sean-k-mooney | and it enables the use case with the new rbac | 19:29 |
sean-k-mooney | gmann: right, my answer to "how we can solve this problem without microversion" is we should not | 19:29 |
sean-k-mooney | unless the resolution is don't use RBAC with the old microversions | 19:30 |
gmann | one way is going back and allowing project-admin to list hypervisor names. but again it violates our new rbac goal | 19:33 |
melwitt | don't we already show real hypervisor hostname vs obfuscated one in the same field depending on whether admin or non today? how is allowing uuid as well any different? | 19:35 |
melwitt | that is, I don't see the problem with showing a uuid there without a microversion | 19:35 |
opendevreview | Merged openstack/nova stable/ussuri: guestfs: With libguestfs >= v1.41.1 decode returned bytes to string https://review.opendev.org/c/openstack/nova/+/787902 | 19:36 |
sean-k-mooney | melwitt: no i dont think we do | 19:40 |
mnaser | i know i'm not supposed to be mucking around with this, but besides request_specs in nova_api and instances.availability_zone in nova.. is there anywhere else that leaves a reference to the az that an instance lives in? | 19:40 |
mnaser | i'm doing some very bad(tm) things and updating az column in instances worked for some instances but did not for some other ones, the api still reports the old az | 19:41 |
melwitt | sean-k-mooney: oh, you're right. there is a hostId field | 19:41 |
gmann | melwitt: showing uuid is fine but specifying that uuid in POST /servers leads to interop issue | 19:41 |
sean-k-mooney | yes we have hostId for normal users | 19:42 |
sean-k-mooney | and then OS-EXT-SRV-ATTR:hypervisor_hostname and OS-EXT-SRV-ATTR:host | 19:42 |
sean-k-mooney | for admins | 19:42 |
gmann | POST /servers, which accepts a hostname today, starting to accept a host uuid is something that needs a microversion | 19:42 |
sean-k-mooney | mnaser: its in neutron and cinder also | 19:43 |
melwitt | oh, sorry, the details of this must have fallen out of my brain | 19:43 |
sean-k-mooney | mnaser: we set the az in the device_owner field in the neutron port bindings and i think in the cinder volume attachments | 19:44 |
mnaser | sean-k-mooney: right.. in this case those are purposely non-bfv systems so no cinder attachments, now neutron is a good point | 19:44 |
mnaser | but still.. nova api reports old az still, not sure where to change | 19:44 |
mnaser | looks for instance_extra and instance_system_metadata | 19:45 |
sean-k-mooney | ya one of those would be likely but maybe this is in the api db somewhere | 19:45 |
sean-k-mooney | we had a list at one point | 19:45 |
mnaser | build_requests has nothing, i already updated nova_api.request_specs.spec | 19:46 |
mnaser | maybe its cached at this point | 19:46 |
melwitt | yeah, you might be getting https://github.com/openstack/nova/blob/master/nova/availability_zones.py#L195-L211 | 19:48 |
mnaser | ahhh so it is cached | 19:48 |
sean-k-mooney | do you know where we block you removing a host from an az if it has instances | 19:49 |
sean-k-mooney | the commit that blocked that has a list of all the places where az are stored i think too | 19:49 |
mnaser | yeah i think in my case it's more of the cache getting the host az to present via the api | 19:50 |
melwitt | https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/services.py#L281-L284 | 19:51 |
sean-k-mooney | https://github.com/openstack/nova/commit/8e19ef4173906da0b7c761da4de0728a2fd71e24 | 19:52 |
sean-k-mooney | there we go ^ | 19:52 |
mnaser | ok so melwitt's theory is that i'm hitting the case where the az for the compute node != az for instance record, and then it's just getting the compute node one | 19:52 |
melwitt | maybe. I didn't look deep into it | 19:53 |
sean-k-mooney | mnaser: did you rename the az or move the compute node | 19:54 |
mnaser | sean-k-mooney: neither.. i am trying to cold migrate to another az :X | 19:54 |
sean-k-mooney | ah ya that breaks things | 19:54 |
sean-k-mooney | gibi and bauzas were working on that recently | 19:54 |
mnaser | in theory it all works but just the scheduler is not happy | 19:54 |
sean-k-mooney | one sec | 19:54 |
sean-k-mooney | mnaser: well if the vm requested an az in the first place you are not allowed to migrate to another az | 19:55 |
mnaser | melwitt: that was it, removing the compute node (that was disabled) out of the host aggregate that was setting the az made it go back to the db value | 19:55 |
sean-k-mooney | if it did not yes it can work | 19:55 |
mnaser | sean-k-mooney: but what if you use default_schedule_zone=nova :) | 19:55 |
sean-k-mooney | then it will populate that in the request spec | 19:55 |
sean-k-mooney | and you can migrate within that | 19:56 |
mnaser | but in my case, lets imagine it was set to default_schedule_zone=foobar and now we're trying to clean it up to be all inside `nova` | 19:56 |
sean-k-mooney | mnaser: right that is not supported | 19:56 |
mnaser | hence the bad things(tm) | 19:56 |
sean-k-mooney | so you're in for pain | 19:56 |
sean-k-mooney | yep | 19:56 |
sean-k-mooney | operators do tend to do that from time to time i have noticed | 19:57 |
mnaser | i expect to be told collect the broken pieces on my own if it breaks :) | 19:57 |
sean-k-mooney | so maybe there should be a way to do that at some point | 19:57 |
mnaser | i think it's cause operators feel that az's are not very 'heavy' constraints | 19:57 |
mnaser | and then many years go by and you're like oh wait this isn't straight forward... | 19:57 |
sean-k-mooney | right AZs are things you set up once and never touch as they're user facing | 19:58 |
sean-k-mooney | host aggregates you can change to your heart's delight as they are not | 19:58 |
sean-k-mooney | you probably know as well as anyone that openstack AZs are not like aws AZs where each maps to a different datacenter/fault domain | 19:59 |
sean-k-mooney | but in terms of thinking about changing them you should treat them that way | 20:00 |
mnaser | yeah but im saying host aggregates seem flexible, az's are hard set | 20:01 |
mnaser | the mix of both probably gives the impression that one is just as flexible as the other | 20:01 |
sean-k-mooney | yep and the fact that an az is just a metadata tag on a host aggregate probably does not help | 20:01 |
sean-k-mooney | i never want to write this but i could see someone writing a nova-manage command or something that would move a host or vms between azs, but realistically that would better live out of tree | 20:02 |
sean-k-mooney | there are far too many choices to make on what to do. do you just update the AZ in the db, or do you move vms to other nodes in the az they requested if set, and move the others, or the ones that are in a specified az, to the new az | 20:04 |
mnaser | yeah the combination of possible scenarios is .. a lot | 20:04 |
sean-k-mooney | and likely will be different for each operator/case | 20:06 |
sean-k-mooney | which is why we have never standardised a tool to do this in nova | 20:06 |
sean-k-mooney | mnaser: https://review.opendev.org/c/openstack/nova/+/798145 for https://bugs.launchpad.net/nova/+bug/1934770 | 20:08 |
sean-k-mooney | mnaser: that is probably of interest to you | 20:08 |
sean-k-mooney | mnaser: your migrate issue might be similar | 20:09 |
sean-k-mooney | mnaser: we could consider allowing cross az live/cold migrate explicitly in the api as a new feature at some point | 20:10 |
sean-k-mooney | mnaser: that would allow you to move a host between azs by live migrating the vms to the new az and then when the host is empty just remove it from one and add it to the other | 20:11 |
sean-k-mooney | specifying an az to live/cold migrate would have to update the request spec and other db fields with the new az but it could be done | 20:12 |
sean-k-mooney | although not this cycle at this point | 20:12 |
sean-k-mooney | anyway i need to call it a day | 20:12 |
opendevreview | Ghanshyam proposed openstack/nova master: DNM: testing https://review.opendev.org/c/openstack/nova/+/794863 | 23:01 |
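
The manual cleanup melwitt describes to NobodyCam (15:59 to 16:27) amounts to: confirm the instance uuid is absent from nova.instances, then delete whatever rows still reference it in nova_api. Below is a minimal sketch of that check, not an operator tool. It assumes an in-memory SQLite stand-in with one uuid column per table; the helper names (`make_demo_db`, `find_orphans`, `delete_orphans`) are hypothetical, and a real deployment keeps `nova` and `nova_api` as separate databases with far richer schemas.

```python
import sqlite3

# nova_api tables named in the discussion; real schemas have many more columns.
API_TABLES = ("build_requests", "instance_mappings", "request_specs")


def make_demo_db():
    """Create an in-memory stand-in with simplified nova / nova_api tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE instances (uuid TEXT)")
    for table in API_TABLES:
        conn.execute(f"CREATE TABLE {table} (instance_uuid TEXT)")
    return conn


def find_orphans(conn, instance_uuid):
    """List API tables that reference an instance missing from instances."""
    exists = conn.execute(
        "SELECT 1 FROM instances WHERE uuid = ?", (instance_uuid,)
    ).fetchone()
    if exists is not None:
        return []  # the instance is real; nothing should be removed by hand
    orphans = []
    for table in API_TABLES:
        hit = conn.execute(
            f"SELECT 1 FROM {table} WHERE instance_uuid = ?", (instance_uuid,)
        ).fetchone()
        if hit is not None:
            orphans.append(table)
    return orphans


def delete_orphans(conn, instance_uuid):
    """Delete the orphaned rows and return the tables that were touched."""
    tables = find_orphans(conn, instance_uuid)
    for table in tables:
        conn.execute(
            f"DELETE FROM {table} WHERE instance_uuid = ?", (instance_uuid,)
        )
    conn.commit()
    return tables
```

Checking `instances` first mirrors melwitt's advice to make sure the uuid is not in nova.instances before deleting anything; on a production database, a backup beforehand would be prudent.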
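
The two ways of targeting a host that sean-k-mooney and gmann contrast (the legacy `availability_zone` string versus the `host` / `hypervisor_hostname` fields from the Train spec linked above) can be sketched as request bodies for POST /servers. This is an illustrative sketch only: the function names and all ids are hypothetical placeholders, and it assumes the ZONE:HOST:NODE form of the legacy string and microversion 2.74 for the newer fields.

```python
def legacy_forced_host_az(zone, host, node=None):
    """Build the legacy availability_zone string (assumed ZONE:HOST[:NODE])."""
    parts = [zone, host]
    if node is not None:
        parts.append(node)
    return ":".join(parts)


def build_server_create(name, image_ref, flavor_ref,
                        host=None, hypervisor_hostname=None):
    """Return (headers, body) for a server create that requests a host.

    The host and hypervisor_hostname keys are only accepted from
    compute API microversion 2.74 onward, so the header pins that version.
    """
    server = {
        "name": name,
        "imageRef": image_ref,
        "flavorRef": flavor_ref,
        "networks": "auto",
    }
    if host is not None:
        server["host"] = host
    if hypervisor_hostname is not None:
        server["hypervisor_hostname"] = hypervisor_hostname
    headers = {
        "Content-Type": "application/json",
        "OpenStack-API-Version": "compute 2.74",
    }
    return headers, {"server": server}
```

The difference debated in the log is behavioral: the legacy string forces the host and skips the scheduler filters, while the 2.74 fields only request a destination that the scheduler still validates.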