16:00:59 #startmeeting nova
16:00:59 Meeting started Tue Aug 22 16:00:59 2023 UTC and is due to finish in 60 minutes. The chair is gibi. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:59 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:59 The meeting name has been set to 'nova'
16:01:04 \o
16:01:05 o/
16:01:10 o/
16:01:11 o/
16:01:11 o/
16:01:18 when does bauzas get back? 2024?
16:01:25 i think monday
16:01:31 but i didn't check which year
16:01:39 "Monday, 2024"
16:02:23 yepp, 28th of Aug he is back
16:02:49 so you will get rid of me being the chair after this meeting :)
16:03:16 let's start
16:03:25 #topic Bugs (stuck/critical)
16:03:32 I don't see any critical bug
16:03:47 #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 43 new untriaged bugs (+3 since the last meeting)
16:03:52 #info bug baton is on PTO (bauzas please pass it forward after the summer period)
16:04:09 are there any bugs we need to discuss?
16:04:41 If anyone could assess https://bugs.launchpad.net/nova/+bug/2023035 it would be appreciated, as it's blocking us upgrading to 2023.1 without using old config workarounds
16:05:16 Merged openstack/nova master: Log excessive lazy-loading behavior https://review.opendev.org/c/openstack/nova/+/891340
16:05:44 \o/
16:06:39 andrewbonney: I see you know about the workaround
16:07:16 Yeah, it just doesn't feel ideal if it's easily patchable
16:07:20 andrewbonney: I did not look deeper, but it seems one of the recent comments shows the difference between the two host cpus
16:07:40 isn't this a libvirt thing anyway?
16:08:05 could be
16:08:51 I guess it says the libvirt version is the same on both ends, so hmm
16:08:54 `cpu_mode = host-model`, so I guess it is a difference between the models
16:09:03 maybe we could get kashyap to look at this
16:09:31 We'd be happy to provide any other debug info if it's useful
16:09:31 there are many factors that can cause it to fail
16:10:19 do you have different kernel versions on the different hosts?
16:10:41 this kind of sounds like perhaps some of the mitigations that are enabled are not the same on both
16:10:50 The testing we did found that the running XML for a guest failed the virsh hypervisor-cpu-compare on the same host it was running on
16:11:08 Where cpu-compare worked
16:11:57 and do you have the patches that we merged recently to update how we do the cpu compatibility checks?
16:12:27 This was using b9089ac from 2023.1
16:12:27 namely https://review.opendev.org/c/openstack/nova/+/869950
16:12:44 Yes, we think that caused it
16:12:47 ok, that should be in there then
16:13:24 OK. I will ask kashyap to look at it when he is back
16:13:25 well, that is using the new api
16:13:37 that the libvirt folks told us to use
16:13:41 andrewbonney: is it simple to just revert that patch from what you have to see?
16:14:12 Yes, I could do that tomorrow and update the bug, but given what we tested with the virsh command line I think it proves it already
16:14:34 but yeah, we probably need to dig deeper if this is the new api we're supposed to be using and it's not doing what we want.. not sure if that means we need to do something different or not
16:14:41 It felt like the guest XML that was being put into the new API was wrong, but I don't know libvirt well enough to be sure
16:14:42 but kashyap would probably be a good person to chase that down
16:14:57 Thanks. Don't want to take up all your meeting time :)
16:15:04 OK, moving on
16:15:12 any other bug to discuss?
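
(For reference, the virsh-level check andrewbonney describes above can be reproduced roughly as follows; the domain name and file names are placeholders, and hypervisor-cpu-compare is the newer comparison API that the linked nova patch reportedly switched to.)

    # extract the <cpu> element from the running guest's live XML
    # ("instance-00000042" is a placeholder domain name)
    virsh dumpxml instance-00000042 | xmllint --xpath '/domain/cpu' - > guest-cpu.xml

    # older check: compare the guest CPU definition against the host CPU description
    virsh cpu-compare guest-cpu.xml

    # newer check: compare against the CPU the hypervisor can actually provide
    virsh hypervisor-cpu-compare guest-cpu.xml

    # the old-style nova.conf workaround mentioned in the bug would be pinning a
    # named model instead of host-model (values here are purely illustrative):
    #   [libvirt]
    #   cpu_mode = custom
    #   cpu_models = Haswell-noTSX-IBRS
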
16:16:05 #topic Gate status
16:16:22 definitely improving
16:16:40 glad to hear that
16:16:50 one of my patches merged today with a single +W and no rechecks
16:16:54 which was pretty shocking :)
16:17:30 we also merged one functional improvement https://review.opendev.org/c/openstack/nova/+/892126
16:17:39 hopefully that will help too
16:17:48 yeah, which sounds like it may have more impact than it appears, which is cool
16:18:07 I'll be looking for functional failures in the gate queue once that has soaked for a while
16:18:11 I will check the functional results later this week to see whether the common issues have disappeared or not
16:18:16 yeah
16:18:17 yeah, hopefully it does not get worse during release time, current gate stability is much better
16:18:59 yep, and if we can get some +Ws on these, they'll further reduce some of our runtime loading: https://review.opendev.org/q/topic:reduce-lazy-loads
16:19:22 during periods of high load, it can take a bit to execute a lazy load, and it's all unnecessary
16:20:25 but yeah, summary is we're in a much better place than we were.. not enough to consider it done, but enough to not be very fearful of release time I think :)
16:20:27 yepp, those are easy wins
16:20:54 #info Please look at the gate failures and file a bug report with the gate-failure tag.
16:20:59 i can try and take a look at them
16:21:03 moving on
16:21:14 #topic Release Planning
16:21:36 we have 2 weeks before Feature Freeze
16:22:05 I expect bauzas will start a feature tracking pad as soon as he is back
16:22:28 any features we need to discuss here?
16:24:03 then moving on
16:24:14 #topic Stable Branches
16:24:29 elodilles_pto is off
16:25:02 is there any stable topic to discuss?
16:26:23 #topic Open discussion
16:26:28 nothing on the agenda
16:26:41 Spencer Harmon proposed openstack/nova master: libvirt tsc_frequency https://review.opendev.org/c/openstack/nova/+/891924
16:26:42 any other topics to bring up?
16:27:09 Yes
16:27:18 go ahead
16:27:27 I should have mentioned this with the features maybe
16:27:33 no worries
16:27:52 I'm still working on the NFS online resize feature
16:28:15 didn't we say no to exposing the tsc_frequency in the past...
16:28:52 Sorry for the noise; didn't mean to interrupt. Trying to get tests working. Maybe a topic for next week?
16:29:09 And there is a change for nova that has open dependencies in Cinder: https://review.opendev.org/c/openstack/nova/+/873560
16:30:02 So bauzas listed it as "Action from owner required" in the feature tracking pad
16:30:10 kgube: so that will likely have to be next cycle
16:30:13 kgube: is there a nova spec for that side?
16:30:22 yes
16:30:30 it was not approved this cycle i think
16:30:32 the spec has been merged
16:30:34 maybe last cycle
16:30:40 okay, I can't find it
16:31:12 oh, it was
16:31:14 https://review.opendev.org/c/openstack/nova-specs/+/877233
16:31:20 https://specs.openstack.org/openstack/nova-specs/specs/2023.2/approved/use-extend-volume-completion-action.html
16:31:24 https://review.opendev.org/c/openstack/nova-specs/+/877233
16:31:41 got it, wrong topic
16:32:02 Anyway, the Cinder team has been slow with reviews
16:32:04 does the nova side of this break anything without the cinder dependency?
16:32:04 ok, well we can review what the status of the cinder side is
16:32:21 i don't believe so
16:32:22 There is a +1, but not from a core
16:32:27 if the cinder side isn't going to be merged ASAP, I think we probably need to punt
16:32:36 and not waste time reviewing the nova side in the last two weeks
16:32:58 hm, okay
16:33:57 I agree
16:34:22 so if we merged the nova code the cinder volume would never be in extending, so it would not break anything, but i also don't like having dead code, especially when it has an api microversion
16:35:13 so ya, i think we would likely want to merge both in the same release
16:35:42 strictly speaking old nova and new cinder, or old cinder and new nova, should work with each other
16:36:13 yeah, but since it requires an api change,
16:36:44 I think we should not presumptively merge this because cinder might change their definition of what the event means, when it comes, etc and then we end up with a thing we have to technically support for no reason
16:36:47 + a compute service change and min version check
16:36:51 It's a bit frustrating because the Cinder change has been ready for review for weeks, and is even on top of their feature tracking etherpad: https://etherpad.opendev.org/p/cinder-2023.2-bobcat-features
16:36:57 I think we've always expected these to merge in the dependent service first
16:37:10 but I guess they have been very busy
16:37:16 sean-k-mooney: right, there are REST API changes, implicit RPC ones, service versions, etc
16:37:18 certainly with neutron we have
16:37:35 kgube: yeah, I definitely understand your frustration
16:37:40 kgube: yes, this is frustrating
16:37:58 kgube: do you have depends-on or tempest/devstack coverage to show this working, by any chance?
16:38:33 this is feeling more like an early, pre-milestone-1 type of change than one for the next two weeks
16:39:00 but if you have already staged the patches with depends-on and can show this working in ci then that is more compelling
16:39:37 sean-k-mooney: It is covered by the existing online volume extend tests if you run them on an affected driver
16:39:47 no devstack change that I see
16:40:05 we do not have a cinder nfs job in nova's gate
16:40:06 kgube: right, but we need to see them running against the proper driver and this change
16:40:46 the depends-on relationship is not quite right, although it's close
16:40:52 I guess the nfs job on this patch maybe? https://review.opendev.org/c/openstack/cinder/+/873889/7?tab=change-view-tab-header-zuul-results-summary
16:40:54 There is a devstack-nfs job
16:40:58 https://review.opendev.org/c/openstack/cinder/+/873686/9 would be the patch to make the nfs driver work
16:41:23 that patch has depends-on relationships to cinder itself, which is weird
16:41:38 ya, so that patch should either depend on the nova patch
16:41:45 or the nova patch should ideally depend on it
16:41:53 well, not really,
16:42:01 because you need to run a cinder patch to see that job
16:42:11 the nova patch depends on the cinder client patch, which depends on https://review.opendev.org/c/openstack/cinder/+/873557/8
16:42:13 it looks like every patch in the stack depends on the thing below it or something?
16:42:16 confusing
16:42:31 this one depends on the nova patch: https://review.opendev.org/c/openstack/nova/+/873560
16:42:32 ya, that will work but it's not how it's meant to be used
16:42:37 and the top cinder one depends on that one
16:42:48 so it's a bit convoluted
16:42:48 oh, the netapp one
16:42:55 anyway, kgube disconnected
16:42:58 ya, so the issue is that the api change patch is first
16:43:03 so we can probably end it here and discuss later
16:43:05 and then the driver patches are after that
16:43:10 sure
16:43:40 OK
16:43:47 is there any other topic before we close?
16:44:03 sorry, I got disconnected
16:44:35 kgube: we were just deciphering the patch dependency network to figure out how to see a job running the whole stack
16:44:45 which I think is there, but it's convoluted
16:44:51 either way I think the answer is the same :/
16:45:17 yeah, I was afraid of that
16:45:23 but I can understand it
16:45:58 i don't know if cinder has the same policy as nova
16:46:11 but for nova we merge the api change last in a series
16:46:22 so the fact that it is first in the cinder one is also a little odd
16:47:10 I don't see any other topic being raised. So I will close the meeting and you can continue untangling the dep tree on the channel
16:47:10 Well, the nova change needs the new volume action
16:47:15 #endmeeting
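
(For reference on the depends-on mechanics discussed above: Zuul resolves cross-project dependencies from Depends-On footers in commit messages, so a patch in one project can be tested in CI against unmerged changes in another. A minimal sketch, using a change URL from the discussion; which patch in the stack should carry which footer is exactly the open question from the meeting:)

    Depends-On: https://review.opendev.org/c/openstack/nova/+/873560

With such a footer on, for example, the cinder nfs driver patch, Zuul checks out the referenced nova change when preparing that patch's jobs, so a cinder nfs job could exercise the whole stack end to end.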