16:00:04 <bauzas> #startmeeting nova
16:00:04 <opendevmeet> Meeting started Tue Apr 26 16:00:04 2022 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:04 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:04 <opendevmeet> The meeting name has been set to 'nova'
16:00:11 <gibi> o/
16:00:29 <bauzas> ehlo compute-ers
16:00:53 <bauzas> #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting
16:01:14 <Uggla> o/
16:01:37 <elodilles> o/
16:03:04 <bauzas> okay, let's start, folks can join once they're dragged out of some big internal meeting :)
16:03:24 <gmann> o/
16:03:26 <bauzas> #topic Bugs (stuck/critical)
16:03:28 <bauzas> damn
16:03:30 <bauzas> #topic Bugs (stuck/critical)
16:03:35 <bauzas> #info No Critical bug
16:03:40 <bauzas> #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 28 new untriaged bugs (-7 since the last meeting)
16:03:46 <bauzas> kudos to gibi for this hard work
16:03:59 <bauzas> #link https://storyboard.openstack.org/#!/project/openstack/placement 26 open stories (0 since the last meeting) in Storyboard for Placement
16:04:06 <bauzas> #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster
16:04:25 <bauzas> gibi: want to discuss the bugs you triaged before we look at next week's bug baton owner ?
16:04:31 <gibi> just a small link
16:04:33 <gibi> https://etherpad.opendev.org/p/nova-bug-triage-20220419
16:04:41 <gibi> I collected the triaged bugs here
16:04:55 <gibi> I think the only interesting information is the bugs in triaged state without assignee
16:05:02 <bauzas> as I said before, it's a good idea to keep some references, mostly about the incomplete ones
16:05:02 <gibi> we have one this week
16:05:14 <gibi> [Needs Assignee] Concurrent migration of vms with the same multiattach volume fails https://bugs.launchpad.net/nova/+bug/1968645
16:05:20 <bauzas> well, we have around 950 open bugs
16:05:29 <gibi> yeah, but at least this is a fresh one :)
16:05:37 <gibi> probably the reporter is still around if we have questions :)
16:06:02 <bauzas> gibi: what we could agree on is to find volunteers only for the High bugs that just got triaged
16:06:36 <gibi> I think it is worth advertising these bugs, but I agree we don't have to forcefully assign them
16:06:49 <bauzas> sounds like low-hanging fruit
16:07:03 <bauzas> if this is about adding a retry loop
16:07:14 <gibi> yeah, it does not seem super hard
16:07:21 <gibi> yepp it is a retry loop
16:07:26 <gibi> for cinder attachment create
16:07:38 <gibi> anyhow we can move on :)
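Since the fix discussed above is "just a retry loop" around cinder attachment create, a minimal sketch of the shape such a fix could take (the function and exception names here are hypothetical illustrations, not Nova's or cinder's actual API):

```python
import time

class AttachmentCreateConflict(Exception):
    """Hypothetical stand-in for the transient conflict cinder can return
    when two migrations race on the same multiattach volume."""

def create_attachment_with_retry(create_fn, volume_id, server_id,
                                 attempts=3, delay=1.0):
    """Retry an attachment create that hit a transient conflict.

    create_fn is any callable performing the actual attachment create;
    it is retried up to `attempts` times, sleeping `delay` seconds
    between tries, and the final failure is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return create_fn(volume_id, server_id)
        except AttachmentCreateConflict:
            if attempt == attempts:
                raise
            time.sleep(delay)
```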
16:08:16 <gibi> I'm ready to pass the baton
16:08:51 <bauzas> gibi: I'll then add the low-hanging-fruit tag
16:08:57 <gibi> bauzas: works of me
16:08:58 <bauzas> this may help
16:09:01 <gibi> *for
16:10:15 <bauzas> gibi: nah, it worked *of* you as you proposed the solution :)
16:10:47 <bauzas> anyway, this leads to the last point
16:10:49 <bauzas> melwitt: around ?
16:11:03 <melwitt> yes
16:11:15 <bauzas> melwitt: you're next in the bug triage roster
16:11:29 <bauzas> melwitt: do you feel brave enough to get the bug baton ?
16:12:01 <melwitt> bauzas: haha, yes. sounds cool
16:12:28 <bauzas> as a reminder for everyone, this baton doesn't imply any matter of ownership or responsibility
16:12:47 <bauzas> anyone wanting to help is welcome, based on their free time
16:13:00 <bauzas> and others may help if they want
16:13:16 <bauzas> melwitt: which leads to me saying you can ping me if you need help for upstream triage
16:13:29 <melwitt> cool, thanks
16:13:38 <bauzas> #info Next bug baton is passed to melwitt
16:13:46 <bauzas> melwitt: thanks, very much appreciated
16:13:57 <bauzas> next topic then,
16:14:01 <bauzas> #topic Gate status
16:14:06 <bauzas> #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs
16:14:26 <bauzas> #link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly Placement periodic job status
16:14:30 <bauzas> #link https://zuul.opendev.org/t/openstack/builds?job_name=nova-emulation&pipeline=periodic-weekly&skip=0 Emulation periodic job runs
16:14:34 <bauzas> #info Please look at the gate failures and file a bug report with the gate-failure tag.
16:14:51 <bauzas> it's been a while since we got a new gate failure
16:15:05 <gibi> I saw some intermittent ones but had no time to dig in
16:15:19 <gmann> the centos-8-stream non-voting job is failing 100% now.
16:15:23 <gmann> As the testing runtime for Zed is centos-9-stream, and in QA we agreed to drop c8s support in devstack as well as in Tempest, I have proposed the changes there and will also announce it on the ML. I'm also moving the c8s job to c9s, please review it - https://review.opendev.org/c/openstack/nova/+/839275
16:15:44 <bauzas> thanks gmann
16:15:48 <gmann> this series #link https://review.opendev.org/q/topic:drop-c8s-testing
16:16:09 <bauzas> gmann: on my list, will vote on it later today or tomorrow
16:16:26 <gmann> thanks, meanwhile I will get the tempest depends-on merged
16:17:03 <bauzas> cool
16:17:24 <bauzas> I have to sit down and consider all the implications, but your patch looks good to me
16:17:35 <bauzas> and either way, that ship has sailed
16:17:52 <bauzas> centos9 is targeted for zed
16:17:57 <gmann> yeah
16:18:29 <bauzas> do we have any tracking LP bug about the centos8 failures ?
16:18:58 <bauzas> ideally, I'd want those to be wontfix if we decide we move on
16:19:15 <gmann> no bug as of now, we are following the testing runtime. and as projects drop py36 support, the job breaks, as it has for nova now
16:19:29 <bauzas> ok
16:19:37 <bauzas> then no paperwork to fill in
16:19:45 <gmann> it is failing because we made nova require >=py3.8, and other projects will do the same
16:19:45 <bauzas> :)
16:19:52 <bauzas> oh
16:19:55 <bauzas> right
16:20:51 <bauzas> anyway, I guess we can continue
16:20:58 <gmann> yeah
16:21:10 <bauzas> #info STOP DOING BLIND RECHECKS aka. 'recheck' https://docs.openstack.org/project-team-guide/testing.html#how-to-handle-test-failures
16:21:30 <bauzas> just as a weekly periodic reminder (which includes me, stupid bias)
16:21:54 <bauzas> next topic,
16:21:59 <bauzas> #topic Release Planning
16:22:03 <bauzas> #link https://releases.openstack.org/zed/schedule.html
16:22:07 <bauzas> #info Zed-1 is due in 3 weeks
16:22:10 <bauzas> tick-tock
16:22:27 <artom> ♫ on the clock ♫
16:22:50 * melwitt knows the reference and feels embarrassed :)
16:23:10 <bauzas> I said last week I should ask this week for a spec review day, but how do people feel about scheduling it in two weeks ?
16:23:16 <gibi> bah, I have to allocate some time to the placement PCI tracking spec
16:23:19 <artom> No shame, those pop songs are earworms by design
16:23:52 <bauzas> artom: but the party don't stop
16:23:52 <artom> gibi, you have sean-k-mooney's original spec to work from, so not starting from scratch
16:24:00 <gibi> artom: true true
16:24:15 * bauzas appreciates the english grammar mistake, btw.
16:24:15 <gibi> bauzas: I'm OK to have a review day next week or the week after
16:24:33 <melwitt> :)
16:24:35 <bauzas> gibi: I have to write some spec for deprecating the keypair generation, you know
16:24:46 <bauzas> so
16:24:47 <gibi> :)
16:25:12 <bauzas> maybe schedule the spec review day not next tuesday, but the tuesday after that ?
16:25:22 <bauzas> ie. May 10th ?
16:25:37 <sean-k-mooney> the 10th, yeah, that would be OK I think
16:26:13 <bauzas> I see no objections
16:26:23 <gibi> good for me
16:26:41 <bauzas> #agreed first spec review day will happen on May 10th, bauzas to communicate thru the mailing list
16:27:09 <bauzas> this leaves 2 weeks for people writing specs, you are warned
16:27:20 <bauzas> (again, this includes me)
16:27:32 * gibi feels warned
16:27:55 * bauzas feels gibi overfeels more than he should :)
16:28:17 <bauzas> we'll have another round of spec reviews, as we agreed last PTG, either way
16:28:51 <bauzas> ok, next
16:29:00 <bauzas> #topic Review priorities
16:29:04 <bauzas> #link https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement+OR+project:openstack/os-traits+OR+project:openstack/os-resource-classes+OR+project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/osc-placement)+label:Review-Priority%252B1
16:29:52 <bauzas> the vIOMMU change probably needs paperwork
16:30:41 <bauzas> tl;dr: https://review.opendev.org/c/openstack/nova/+/830646 requires at least a blueprint, and maybe a spec
16:30:42 <sean-k-mooney> yeah, I started to review it; I kind of think it should have a mini spec
16:30:50 <melwitt> +1
16:31:09 <bauzas> I can use my hammer
16:31:21 <bauzas> but I could soften it
16:31:51 <bauzas> explaining that we require a blueprint and some discussion at a nova meeting before we can continue reviewing it
16:31:59 <bauzas> ricolin: are you around ?
16:32:18 <sean-k-mooney> I'm not against the proposal; in fact I have wanted to add support for a while, but we need to agree on the extra_specs/image properties and define the scope
16:32:28 <sean-k-mooney> I was recommending keeping it small for now
16:32:31 <bauzas> yeah, and that's why we need a debate
16:32:55 <bauzas> not about which paper stamp we should use
16:33:02 <sean-k-mooney> there is vIOMMU support with the limited change required for their accelerator to work
16:33:08 <bauzas> but whether we all agree on the design
16:33:13 <sean-k-mooney> and then there is full support with security and isolation
16:33:38 <bauzas> correct, that's why we need to discuss this correctly and address the design scope
16:33:49 <sean-k-mooney> i would suggest we split it like that and only do the former this cycle
16:34:11 <bauzas> sean-k-mooney: well, we need an owner, at first :)
16:34:35 <bauzas> even if we agree on the direction, we need gears
16:34:37 <sean-k-mooney> yep, mnaser also expressed interest. I'm not sure that stephenfin will work on it this cycle
16:35:05 <bauzas> I'm sure stephenfin said he was okay to leave it for others to continue :)
16:35:27 <bauzas> hence the gerrit hammer
16:35:47 <bauzas> this may help people to react and assign some time for this
16:36:43 <bauzas> I'll also drop the review-prio flag which is meaningless in this case as we can't merge it as it is
16:37:43 <bauzas> ok, moving on if nobody yells
16:38:10 <bauzas> #topic Stable Branches
16:38:14 <bauzas> elodilles: your turn
16:38:20 <elodilles> #info ussuri and older branches are blocked until 'l-c drop' patches merge - https://review.opendev.org/q/I514f6b337ffefef90a0ce9ab0b4afd083caa277e
16:38:30 <elodilles> #info other branches should be OK
16:38:36 <elodilles> #info nova projects' stable/victoria transitioned to Extended Maintenance - no further releases will be produced from victoria, but branch remains open to accept bug fixes
16:38:44 <elodilles> and that's all I think ^^^
16:39:15 <melwitt> we need second core wink wink
16:39:24 <elodilles> :]
16:39:41 <gibi> I would be happy to approve... ;)
16:39:52 <gibi> (has some blocked train backports :D)
16:40:03 <bauzas> I can do things
16:40:12 <bauzas> my brain fsck'd me
16:40:12 <gibi> bauzas: ask elodilles to add me into the stable-core group
16:40:20 <gibi> then I can help
16:40:53 <bauzas> gibi: I think we said at the PTG I should propose your name to the stable team
16:41:01 <gibi> yeah I think so
16:41:05 <gibi> so lets do it :D
16:41:05 <bauzas> so,
16:41:19 <bauzas> #1 I'll do my homework and review such l-c patches
16:41:53 <bauzas> #2 I'll do my duty and engage discussions about reconciling the nova team and the nova-stable team in some intelligent manner
16:43:18 <bauzas> last topic in the agenda,
16:43:21 <bauzas> #topic Open discussion
16:43:27 <bauzas> (gibi) Allow claiming PCI PF if child VF is unavailable https://review.opendev.org/c/openstack/nova/+/838555
16:43:33 <bauzas> gibi: take the mic
16:43:38 <gibi> thanks
16:43:45 <gibi> so
16:43:51 <gibi> it is a bug
16:43:58 <gibi> we saw DB inconsistencies at customers
16:44:23 <gibi> the pci_devices table contains available PF and unavailable children VF rows
16:44:39 <gibi> this is basically an impossible situation
16:44:54 <gibi> the VF should be available if the PF is available
16:45:04 <gibi> or the VF should be allocated
16:45:12 <gibi> anyhow
16:45:26 <gibi> I proposed a fix https://review.opendev.org/c/openstack/nova/+/838555
16:45:40 <gibi> it removes some of the strictness of the state check during the PCI claim
16:45:56 <gibi> basically allows allocating the available PF if the children VFs are unavailable
16:46:06 <gibi> this would heal the inconsistent DB state
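As a rough illustration of the relaxed state check described above (simplified stand-in classes, not Nova's actual PciDevice objects): a strict claim would refuse the PF unless every child VF is 'available'; tolerating 'unavailable' VFs lets the claim proceed and heals the impossible DB state with a warning instead of a failure.

```python
class Dev:
    """Simplified stand-in for a PCI device row (not Nova's PciDevice)."""
    def __init__(self, address, status):
        self.address = address
        self.status = status  # 'available', 'unavailable' or 'allocated'

def claim_pf(pf, child_vfs):
    """Claim an available PF, tolerating VFs stuck in 'unavailable'.

    VFs wrongly left 'unavailable' while their parent PF is 'available'
    are accepted (with a warning) instead of failing the claim. After
    the claim the PF is 'allocated' and every child VF is 'unavailable',
    which is a consistent combination again.
    """
    if pf.status != 'available':
        raise ValueError('PF %s is not claimable' % pf.address)
    for vf in child_vfs:
        if vf.status == 'allocated':
            raise ValueError('VF %s is in use' % vf.address)
        if vf.status == 'unavailable':
            print('WARNING: healing inconsistent VF %s' % vf.address)
        vf.status = 'unavailable'
    pf.status = 'allocated'
```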
16:46:13 <gibi> artom had a good point in the review
16:46:32 <gibi> that we tend to handle DB healing via nova-manage CLI instead
16:46:45 <bauzas> not true
16:46:59 <sean-k-mooney> we heal some things on object load from the db
16:47:02 <bauzas> we had db healing made thru data migrations
16:47:51 <artom> Right, but those fixes are of the form "we had thing X a long time ago, and it might still be in the DB, so now when we load it we convert from thing X"
16:47:59 <sean-k-mooney> we do have the heal-allocations-type commands, but I think healing this on agent start is the right thing
16:48:01 <bauzas> we only heal things thru the CLI if this is for example something due to some relationship between two DBs
16:48:08 <artom> vs examples like placement-audit and Lee's connection_info update
16:48:09 <sean-k-mooney> as we are also fixing the in-memory representation
16:48:18 <bauzas> sean-k-mooney: correct, because two DBs were involved
16:48:52 <artom> My other point was - the DB somehow got into an inconsistent state, wouldn't it be wiser to at least let the operator know, vs silently fixing it?
16:48:58 <bauzas> in the case of placement audit, this was about reconciling two datastores kept from two different projects
16:49:05 <sean-k-mooney> artom: i dont think so
16:49:17 <bauzas> I agree with sean-k-mooney
16:49:25 <bauzas> we could log such thing
16:49:32 <sean-k-mooney> artom: I think this happened because of how the customer recreated the compute node after the HDD died
16:49:32 <bauzas> but no need to claim it loud
16:49:34 <artom> OK :) Not a hill I want to die on, but wanted to at least raise the question
16:49:35 <gibi> I've added a WARNING log in the patch
16:49:52 <sean-k-mooney> i dont think this is something that most operators would hit
16:49:52 <artom> Seems like I'm outnumbered :)
16:50:12 <bauzas> artom: I won't ask you how many divisions you have
16:50:19 <gibi> sean-k-mooney: one more thing, you said it should be fixed at agent restart. Now my patch fixes it during PCI claim
16:50:31 <artom> bauzas, divisions o_O?
16:50:51 <bauzas> gibi: I like the pci claim approach
16:51:05 <sean-k-mooney> i have not reviewd the third patch yet
16:51:08 <gibi> sean-k-mooney: and I have a separate patch for the agent restart + remove VF + inconsistent state case
16:51:24 <gibi> sean-k-mooney: ahh, OK, let me know your opinion once you've reviewed it
16:51:27 <bauzas> artom: sorry, I'll DM you the reference :)
16:51:40 <sean-k-mooney> ack we can proceed on the gerrit review
16:51:47 <artom> It's going to be an obscure French thing, isn't it :P
16:52:28 <gibi> artom, sean-k-mooney, bauzas: thanks, we can move on
16:52:37 <bauzas> ++
16:52:56 <bauzas> sounds like we have a consensus : review gibi's patch
16:52:56 <gibi> I will comment on the patch linking to the meeting logs
16:53:28 <bauzas> #agreed let's continue to review gibi's work on pci claims fixing the inconsistency
16:53:45 <bauzas> that's all we had for the meeting
16:53:55 <bauzas> any last minute item people wanna raise ?
16:54:32 <bauzas> looks not,
16:54:37 <gibi> -
16:54:37 <bauzas> thanks all !
16:54:39 <gibi> thanks!
16:54:46 <bauzas> #endmeeting