16:00:24 <bauzas> #startmeeting nova
16:00:24 <opendevmeet> Meeting started Tue May 24 16:00:24 2022 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:24 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:24 <opendevmeet> The meeting name has been set to 'nova'
16:00:32 <bauzas> hey folks
16:00:37 <whoami-rajat> Hi
16:00:53 <bauzas> #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting
16:01:15 <elodilles> o/
16:01:59 <opendevreview> Kashyap Chamarthy proposed openstack/nova master: libvirt: Add a workaround to skip compareCPU() on destination https://review.opendev.org/c/openstack/nova/+/838926
16:02:08 <gibi> o/
16:02:33 <bauzas> ok, let's start
16:02:38 <bauzas> #topic Bugs (stuck/critical)
16:02:43 <bauzas> #info No Critical bug
16:02:47 <bauzas> #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 15 new untriaged bugs (+1 since the last meeting)
16:02:56 <bauzas> no worries sean, I saw you worked hard on it
16:03:04 <bauzas> #link https://storyboard.openstack.org/#!/project/openstack/placement 26 open stories (0 since the last meeting) in Storyboard for Placement
16:03:09 <bauzas> #info Add yourself to the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster
16:03:22 <bauzas> sean-k-mooney: any bugs you want to discuss for triage?
16:03:40 <sean-k-mooney> nothing pressing
16:03:42 <sean-k-mooney> https://etherpad.opendev.org/p/nova-bug-triage-2022-05-17
16:03:46 <sean-k-mooney> we had one feature request
16:03:49 <sean-k-mooney> for NUMA in placement
16:03:56 <sean-k-mooney> which I marked as invalid
16:04:07 <sean-k-mooney> and one duplicate of an
16:04:14 <sean-k-mooney> oslo.messaging bug
16:04:26 <sean-k-mooney> the rest did not have enough info to really triage
16:04:31 <sean-k-mooney> so I marked them incomplete
16:04:42 <sean-k-mooney> I also checked some of the incomplete ones from last week
16:04:46 <sean-k-mooney> but no change really
16:05:13 <sean-k-mooney> one fixed bug from stephen https://bugs.launchpad.net/nova/+bug/1974173
16:05:20 <sean-k-mooney> that's about it
16:06:34 <bauzas> ok thanks
16:06:40 <bauzas> and thanks again for triaging
16:06:57 <bauzas> elodilles: are you okay with taking the baton for this week?
16:07:44 <elodilles> bauzas: yep o7
16:07:54 <bauzas> thanks
16:08:00 <bauzas> #info Next bug baton is passed to elodilles
16:08:20 <bauzas> #topic Gate status
16:08:59 <bauzas> #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs
16:09:03 <bauzas> #link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly Placement periodic job status
16:09:08 <bauzas> #link https://zuul.opendev.org/t/openstack/builds?job_name=nova-emulation&pipeline=periodic-weekly&skip=0 Emulation periodic job runs
16:09:14 <bauzas> as you can see ^ nothing to report
16:09:45 <bauzas> both jobs and pipelines work
16:09:52 <bauzas> #info Please look at the gate failures and file a bug report with the gate-failure tag.
16:09:57 <bauzas> #info STOP DOING BLIND RECHECKS aka. 'recheck' https://docs.openstack.org/project-team-guide/testing.html#how-to-handle-test-failures
16:10:05 <bauzas> as a reminder for everyone ^ :)
16:10:30 <gibi> please note that we are still playing whack-a-mole with the volume detach issue. There are still open tempest patches adding more SSHABLE waiters
16:10:49 <bauzas> gibi: yup, we'll discuss this under the stable topic
16:11:03 <gibi> ack, but this is still affecting master :)
16:11:05 <sean-k-mooney> bauzas: well, it affects master too
16:11:10 <sean-k-mooney> but sure
16:11:30 <bauzas> yep, I know, thanks for the reminder that it also impacts master
16:12:14 <bauzas> #topic Release Planning
16:12:19 <bauzas> #link https://releases.openstack.org/zed/schedule.html
16:12:22 <bauzas> #info Zed-1 was last week
16:12:48 <bauzas> thanks sean-k-mooney for accepting the zed-1 releases for the projects
16:13:03 <bauzas> oh, actually elodilles
16:13:12 <sean-k-mooney> yeah, it was elodilles
16:13:25 <sean-k-mooney> I replied on one after the fact
16:13:27 <bauzas> #link https://review.opendev.org/c/openstack/releases/+/841851 novaclient release for zed-1
16:13:34 <elodilles> well, it had a deadline, so it needed a review & merge o:)
16:13:43 <sean-k-mooney> we discussed it but I forgot to do it before the deadline
16:13:51 <bauzas> #link https://review.opendev.org/c/openstack/releases/+/841845 os-vif release for zed-1
16:13:56 <bauzas> sean-k-mooney: me too
16:14:05 <sean-k-mooney> elodilles: strictly speaking we don't have to do it by m1
16:14:08 <bauzas> and I was off this Friday, which didn't help
16:14:17 <sean-k-mooney> that is just the convention the release team is following
16:14:30 <sean-k-mooney> but it's not required by the release model
16:14:34 <bauzas> elodilles: don't be afraid to ping me if you need me to review some release change
16:14:38 <sean-k-mooney> we just need an intermediary release :)
16:14:43 <bauzas> correct
16:15:06 <elodilles> bauzas: ack :)
16:16:36 <elodilles> sean-k-mooney: not necessary, yes, but if there is no -1 from the team, then the release managers merge the generated patches at the deadlines o:)
16:16:55 <sean-k-mooney> elodilles: right, the deadline is actually m3
16:17:04 <sean-k-mooney> the docs don't mention m1 at all
16:17:21 <sean-k-mooney> that is just a holdover from the release-with-milestones model
16:17:31 <sean-k-mooney> but thanks for taking care of it in any case
16:18:48 <sean-k-mooney> https://github.com/openstack/releases/blob/61f891ddd7bd3b28ac7b5e7e9e1d9203fbbe297d/doc/source/reference/release_models.rst#cycle-with-intermediary=
16:19:18 <elodilles> sean-k-mooney: see #2, and its last chapter: https://releases.openstack.org/reference/process.html#milestone-1
16:19:34 <bauzas> elodilles: how can I see whether, for example, os-vif uses the cycle-with-rc model or the cycle-with-intermediary one?
16:19:55 <sean-k-mooney> elodilles: yep, that is not in line with the governance doc
16:19:57 <elodilles> bauzas: in the yaml file under deliverables/zed
16:20:05 <sean-k-mooney> anyway, it's not important now
16:20:18 <bauzas> elodilles: ok, because https://releases.openstack.org/teams/nova.html doesn't show it
16:20:31 <bauzas> anyway, moving on
16:20:42 <elodilles> ++
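[editor's note: the release model bauzas asks about is recorded in the per-deliverable file elodilles points to in the openstack/releases repo. A minimal sketch of what deliverables/zed/os-vif.yaml might look like — the version and hash values are illustrative placeholders, not the real ones:

    launchpad: os-vif
    release-model: cycle-with-intermediary
    team: nova
    type: library
    repository-settings:
      openstack/os-vif: {}
    releases:
      - version: 3.0.0
        projects:
          - repo: openstack/os-vif
            hash: <commit sha>

the release-model field is what distinguishes a cycle-with-intermediary deliverable from a cycle-with-rc one.]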
16:20:49 <bauzas> #topic Review priorities
16:20:55 <bauzas> #link https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement+OR+project:openstack/os-traits+OR+project:openstack/os-resource-classes+OR+project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/osc-placement)+label:Review-Priority%252B1
16:21:00 <bauzas> #link https://review.opendev.org/c/openstack/project-config/+/837595 Gerrit policy for the Review-Priority contributors flag. Naming bikeshed in there.
16:21:06 <bauzas> #link https://docs.openstack.org/nova/latest/contributor/process.html#what-the-review-priority-label-in-gerrit-are-use-for Documentation we already have
16:21:39 <bauzas> I provided a comment on https://review.opendev.org/c/openstack/project-config/+/837595
16:21:43 <bauzas> please review it
16:21:58 <gibi> done :)
16:22:38 <bauzas> thanks
16:22:50 <bauzas> I'm French, so in general I'm not good at naming things
16:23:06 <bauzas> but at least I try to find a consensus
16:23:14 <gibi> thank you for that
16:23:39 <bauzas> I think all contributors know what nova-core means
16:23:52 <bauzas> hopefully
16:24:15 <gibi> that is a fair assumption
16:24:35 <bauzas> for other repos, we could of course name the label differently, like 'osvif-core', if this is named by gerrit
16:25:21 <bauzas> i.e. a nova-specs-core review promise
16:25:31 <bauzas> os-vif-core, etc.
16:25:41 <bauzas> but this is a naming bikeshed
16:26:24 <bauzas> anyway, moving on
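[editor's note: for readers unfamiliar with the change under review — per-project Gerrit labels are declared in the project ACL files in openstack/project-config. A hypothetical sketch of what such a label stanza could look like; the label and value names are precisely the part being bikeshedded, so treat them as placeholders:

    [label "Review-Priority"]
      function = NoBlock
      value = 0 No priority
      value = +1 Review priority

a NoBlock label records the vote without gating submission, which fits a "review promise" flag rather than a blocking vote.]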
16:26:32 <bauzas> #topic Stable Branches
16:26:40 <bauzas> in general I ask elodilles
16:26:46 <bauzas> but this time, let me do it
16:26:52 <bauzas> #info ussuri and older branches are still blocked, newer branches should be OK
16:27:03 <bauzas> melwitt had a point
16:27:33 <elodilles> just an update for that ^^^ I think ussuri is blocked, but the older branches are not blocked anymore
16:27:47 <bauzas> #link https://etherpad.opendev.org/p/nova-stable-branch-ci stable branch CI issues tracking, feel free to update it with stable branch CI issues
16:27:58 <bauzas> elodilles: woah
16:28:11 <bauzas> kudos to the team then
16:28:13 <elodilles> bauzas: the l-c (lower-constraints) patches were merged
16:28:32 <elodilles> bauzas: I don't say they don't have intermittent failures though o:)
16:28:43 <bauzas> elodilles: I thought most of the issues were related to volume detach things, which are unrelated to l-c
16:28:44 <elodilles> but at least they are not blocked
16:28:48 <bauzas> ah
16:29:23 <bauzas> elodilles: but then, why is ussuri blocked while the older branches are not?
16:29:47 <elodilles> ussuri and train were where tempest was not pinned,
16:29:58 <elodilles> and where tempest is running with py36
16:30:08 <elodilles> if I'm not mistaken, that's it
16:30:21 <elodilles> and gmann's train fix has landed
16:31:04 <bauzas> ok thanks
16:31:05 <elodilles> originally we thought that ussuri does not need a fix as it already has zuulv3 jobs, but that's unfortunately not true
16:31:21 <bauzas> gmann told me he couldn't attend this meeting, so let's discuss this again next week
16:31:33 <elodilles> I mean, it has zuulv3 jobs, but we are still facing the same issue
16:31:42 <elodilles> bauzas: ++
16:31:43 <gibi> so I think the next step is still to gather the intermittent failures and try to fix them
16:32:02 <bauzas> gibi: yeah, we'll track those on a weekly basis thanks to the etherpad
16:32:12 <gibi> ack
16:32:31 <elodilles> thanks melwitt for starting the etherpad \o/
16:32:37 <bauzas> yup, melwitt++
16:34:29 <bauzas> anything to discuss about those intermittent issues btw?
16:35:30 <elodilles> I guess we still need to collect them to have the full picture
16:35:36 <bauzas> yup
16:36:32 <gibi> yepp
16:37:11 <elodilles> maybe one note: for placement we don't have periodic-stable on wallaby and older
16:37:45 <bauzas> :/
16:37:49 <gibi> elodilles: do you suspect some instability in placement?
16:38:14 <elodilles> gibi: nope, but the placement gate is broken on wallaby and older
16:38:15 <gibi> or is this just proactively running some jobs
16:38:25 <gibi> broken?!
16:38:27 <elodilles> gibi: see melwitt's etherpad
16:38:28 <gibi> that is bad :/
16:38:46 <elodilles> though they are probably some known issues to fix
16:39:09 <gibi> I agree we should add some periodic jobs there then
16:39:25 <elodilles> gibi: ack, I can backport the patch that added the periodic
16:39:39 <elodilles> * periodic-stable
16:40:18 <bauzas> gibi: agreed too
16:41:04 <bauzas> moving on?
16:41:14 <elodilles> bauzas: ++
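[editor's note: the backport elodilles volunteers for is a small Zuul config change on each stable branch of placement. A sketch of the kind of stanza involved, assuming the standard openstack-zuul-jobs template is named periodic-stable-jobs — check the master-branch patch for the exact template and job names:

    - project:
        templates:
          - periodic-stable-jobs

adding the template makes the branch's jobs run in the periodic-stable pipeline, so breakage gets noticed without waiting for a human-submitted change to fail.]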
16:41:24 <bauzas> #topic Open discussion
16:41:29 <bauzas> (whoami-rajat) Discussion regarding the design of the rebuild volume-backed instance feature
16:41:35 <bauzas> whoami-rajat: your turn
16:41:35 <whoami-rajat> Hi
16:41:39 <whoami-rajat> thanks
16:41:52 <whoami-rajat> #link https://review.opendev.org/c/openstack/nova-specs/+/840155
16:42:20 <whoami-rajat> So I started working on this feature in yoga (this was proposed/reproposed several times before) and the spec got approved
16:42:41 <whoami-rajat> now while reproposing it, sean-k-mooney has some concerns regarding the new parameter we are introducing, ``reimage_boot_volume``
16:43:01 <whoami-rajat> it's a request parameter to tell the API that we are performing a rebuild on a volume-backed instance and not on an ephemeral disk
16:43:13 <sean-k-mooney> yep
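[editor's note: for context, rebuild is a server action in the compute API. A sketch of what a rebuild request carrying the proposed flag might look like — reimage_boot_volume is only a spec proposal at the time of this meeting, not part of the released API:

    POST /v2.1/servers/{server_id}/action
    {
        "rebuild": {
            "imageRef": "<image uuid>",
            "reimage_boot_volume": true
        }
    }

the debate below is whether such an explicit opt-in flag should exist at all, or whether a microversion bump alone should change the behaviour.]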
16:43:28 <whoami-rajat> initially the idea was not to have feature parity between both workflows, but later there were many concerns about this operation being destructive
16:43:38 <whoami-rajat> even if you follow the past specs, the concern has been discussed
16:44:15 <whoami-rajat> so lyarwood suggested adding this ``reimage_boot_volume`` parameter so that only users who opt in (as it has a data loss risk) would be able to do it
16:44:22 <sean-k-mooney> I really think that having feature parity between bfv=True|false is important
16:44:38 <sean-k-mooney> I don't think the data loss argument holds
16:44:57 <sean-k-mooney> my reason is that this is a deliberate instance action to rebuild the root disk
16:45:06 <gibi> rebuild is destructive for image-based instances too
16:45:11 <sean-k-mooney> yep
16:45:30 <sean-k-mooney> and rebuild is not the same as evacuate
16:45:31 <whoami-rajat> yes, but in this case the destructive operation is performed by cinder, since the volume resides on the cinder side
16:45:45 <sean-k-mooney> for evacuate we should preserve the data
16:45:48 <bauzas> that's the whole purpose of this spec
16:45:58 <sean-k-mooney> for rebuild via the api we should reimage the root volume
16:46:01 <bauzas> rebuild on BFV wasn't destructive, right?
16:46:14 <sean-k-mooney> rebuild was rejected
16:46:16 <sean-k-mooney> for bfv
16:46:17 <whoami-rajat> we didn't support rebuild on BFV
16:46:34 <sean-k-mooney> so the whole point is to allow rebuild with bfv
16:46:50 <bauzas> if so, there is a clear implication of what rebuild means for the root disk
16:47:13 <bauzas> we blocked it because we were unable to rebuild the root disk if bfv
16:47:19 <sean-k-mooney> and technically extra ephemeral disks
16:47:28 <sean-k-mooney> bauzas: correct
16:48:05 <bauzas> then I don't see a need for differentiating BFV and non-BFV from an API pov
16:48:20 <bauzas> both will be destructive for the root disk
16:48:33 <sean-k-mooney> if so, we also do not need an api microversion, correct
16:48:39 <sean-k-mooney> and no api change at all
16:48:44 <sean-k-mooney> we just remove the block
16:48:50 <whoami-rajat> the destructive nature of this operation was the concern from many folks, I can't name everyone but this was approved in yoga so you can see
16:48:51 <bauzas> good question
16:48:53 <sean-k-mooney> when cinder is new enough
16:49:10 <whoami-rajat> dansmith has been actively reviewing the changes I proposed last cycle, so maybe he can weigh in
16:49:49 <bauzas> whoami-rajat: frankly, if we were to add some parameter, it would rather be for *not* recreating the volume
16:50:10 <dansmith> bauzas: the point of the spec/effort is to rebuild the root volume
16:50:17 <dansmith> i.e. to reimage it, but let cinder do the reimaging
16:50:29 <bauzas> dansmith: that's what I understand
16:50:38 <bauzas> so...
16:51:37 <bauzas> tbc, I don't see a need for an API param that'd say "yes, I want to rebuild by reimaging"
16:51:56 <bauzas> which would imply that the default would be "rebuild by not reimaging"
16:52:25 <sean-k-mooney> bauzas: no, the default would reject
16:52:49 <sean-k-mooney> bauzas: that was the behavior that I think lee suggested, but I don't think I reviewed the previous iteration
16:52:57 <dansmith> I think a user-initiated rebuild where we don't reimage root is pointless, right?
16:53:06 <dansmith> as long as we don't rebuild on evacuate then we're good,
16:53:06 <sean-k-mooney> correct
16:53:13 <sean-k-mooney> yeah
16:53:13 <bauzas> I agree
16:53:19 <dansmith> but this is specifically to make BFV behave like regular instances
16:53:48 <sean-k-mooney> right, so evacuate should continue to preserve the root disk if it's on shared storage
16:53:50 <bauzas> correct me if I'm wrong, but I feel we are on the same page
16:53:57 <bauzas> evacuate should differ
16:54:06 <sean-k-mooney> and rebuild will always reimage it, provided cinder is new enough
16:54:06 <whoami-rajat> Since the main destruction is performed on the cinder side, I know a lot of folks on the cinder side that won't agree with the idea of not adding this additional precautionary measure to avoid it
16:54:13 <bauzas> but rebuild should behave like on a regular instance, i.e. reimage
16:54:17 <dansmith> bauzas: okay, I guess I thought you were arguing for a special param
16:54:20 <whoami-rajat> that's where the initial concern started ^
16:54:35 <bauzas> dansmith: I was arguing in exactly the other direction, see above :)
16:54:36 <sean-k-mooney> I really don't like the idea of making bfv special in the nova api
16:54:45 <bauzas> me neither
16:54:51 <dansmith> bauzas: ack, sorry, I'm double-meeting-ing
16:55:04 <bauzas> from an API point of view, this is clear
16:55:11 <sean-k-mooney> whoami-rajat: if we want to prevent this from the cinder side
16:55:21 <sean-k-mooney> I think cinder needs a way to block the reimage, not nova
16:55:31 <bauzas> of course, since we share the same internal methods for evacuate and rebuild, we should make them differ based on some conditional
16:55:32 <sean-k-mooney> like locking the volume or similar
16:55:47 <bauzas> but this conditional doesn't have to be exposed at the API level
16:55:57 <sean-k-mooney> bauzas: I think we pass a flag to rebuild to signal if it's an evacuate, right
16:55:58 <dansmith> bauzas: we already have a flag to pass,
16:56:04 <dansmith> bauzas: because we have to honor the old microversion,
16:56:11 <dansmith> so we can just make sure it's ==false for the evac case
16:56:16 <bauzas> dansmith: yeah, I know, that's the conditional I thought of
16:56:30 <dansmith> conditional at the rpc layer, but the only conditional in the api is "old or new microversion"
16:56:44 <dansmith> the only conditional *should* be version, I mean
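[editor's note: a loose Python sketch of the shape dansmith describes — illustrative only; the function and helper names are made up and this is not actual nova code:

    def rebuild_instance(instance, image_ref, evacuate, reimage_boot_volume):
        # reimage_boot_volume is only ever True for a user-initiated rebuild
        # on the new microversion; the API forces it to False for evacuate.
        if instance_is_volume_backed(instance):
            if evacuate or not reimage_boot_volume:
                # evacuate (or an old-microversion rebuild): leave the
                # root volume's data untouched
                preserve_root_volume(instance)
            else:
                # user asked for a rebuild: let cinder reimage the volume
                ask_cinder_to_reimage(instance, image_ref)
        else:
            # regular instances: rebuild has always recreated the root disk
            recreate_root_disk(instance, image_ref)

the only API-level conditional is the microversion; everything else is decided at the RPC layer.]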
16:56:46 <bauzas> dansmith: correct, that being said, there was an open question
16:56:50 <sean-k-mooney> dansmith: well, do we need a microversion
16:56:57 <bauzas> about whether we would even need a microversion
16:57:01 <sean-k-mooney> there is no api request change
16:57:08 <bauzas> if we just unblock
16:57:10 <dansmith> I think we absolutely do,
16:57:17 <sean-k-mooney> I really think we should not
16:57:19 <dansmith> because right now rebuild does not destroy data and after this, it would
16:57:30 <sean-k-mooney> right now it rejects the request
16:57:35 <whoami-rajat> sean-k-mooney: if the operation is initiated from the nova side, I'm not sure how we can provide a user input from the cinder side to block this
16:57:44 <dansmith> sean-k-mooney: only if the image is different
16:58:00 <sean-k-mooney> dansmith: no, it's always rejected, I thought
16:58:06 <dansmith> sean-k-mooney: if the image is the same, it allows it
16:58:13 <sean-k-mooney> ...
16:58:13 <dansmith> whoami-rajat: right?
16:58:15 <whoami-rajat> but maybe I'm the only one defending the proposal
16:58:43 <whoami-rajat> dansmith: yes, for the same image it does allow the rebuild
16:58:54 <dansmith> sean-k-mooney: ^
16:59:14 <bauzas> hah
16:59:18 <sean-k-mooney> that seems like a bug
16:59:23 <sean-k-mooney> since that also destroys data
16:59:34 <bauzas> (18:46:01) bauzas: rebuild on BFV wasn't destructive, right?
16:59:45 <sean-k-mooney> there is no difference from a data perspective if you use the same image or a different one
16:59:47 <bauzas> damn, we're almost at the end of our time
16:59:49 <dansmith> sean-k-mooney: it doesn't on BFV but does on regular instances
17:00:13 <dansmith> sean-k-mooney: on BFV, if the image is the same, it will just rebuild the ports or whatever, but no change to the disk
17:00:20 <dansmith> but it will destroy the disk with the same image on a regular instance
17:00:23 <bauzas> I'll close this meeting, but I beg the people here to continue discussing this topic after
17:00:24 <sean-k-mooney> that's the same as a hard reboot
17:00:37 <dansmith> sean-k-mooney: alas, it's api behavior we have had for YEARS
17:00:44 <sean-k-mooney> rebuild is not a move op
17:00:49 <dansmith> so changing it to now destroy data is a Bad Plan (tm)
17:00:51 <sean-k-mooney> and it should not really update the port either
17:00:55 <dansmith> well understood :)
17:00:59 <bauzas> thanks all, and for people interested in this bfv rebuild discussion, please stay around
17:01:04 <bauzas> #endmeeting