16:00:18 #startmeeting nova
16:00:19 Meeting started Thu Sep 17 16:00:18 2020 UTC and is due to finish in 60 minutes. The chair is gibi. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:20 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:23 The meeting name has been set to 'nova'
16:00:31 \o
16:00:38 o/
16:00:38 o/
16:00:44 ~~~o/~~~
16:00:58 o/
16:01:29 dansmith are you partially under water?
16:01:41 gibi: nearly suffocated by smoke
16:01:49 I'll work on my ascii art :)
16:01:54 sorry to hear that
16:02:36 OK, let's get started
16:02:46 #topic Bugs (stuck/critical)
16:03:04 No critical bug
16:03:11 #link 7 new untriaged bugs (-29 since the last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
16:03:25 I would like to thank every bug triager
16:03:43 not so long ago we had more than 100 such bugs
16:03:56 thank you!
16:04:20 rc critical bugs #link https://bugs.launchpad.net/nova/+bugs?field.tag=victoria-rc-potential
16:04:22 \o/
16:04:25 we have one
16:04:39 https://bugs.launchpad.net/nova/+bug/1882521
16:04:40 Launchpad bug 1882521 in OpenStack Compute (nova) "Failing device detachments on Focal" [High,Confirmed] - Assigned to Lee Yarwood (lyarwood)
16:04:49 the Focal saga.
16:04:53 gmann left some notes on the agenda
16:05:00 Until the qemu bug (https://bugs.launchpad.net/qemu/+bug/1894804) is fixed we have two options
16:05:01 Launchpad bug 1894804 in qemu (Ubuntu) "Second DEVICE_DELETED event missing during virtio-blk disk device detach" [Undecided,New]
16:05:07 yeah, we can go with either of the options I listed there
16:05:11 Skip the 3 failing tests and move the integration job to run on Focal - https://review.opendev.org/#/c/734700/4/tools/tempest-integrated-gate-compute-blacklist.txt
16:05:16 To keep running these tests somewhere, we need the 'tempest-integrated-compute' job to keep running on Bionic and the rest of the jobs to move to Focal.
16:05:45 I would skip these tests and move our testing to Focal if this is the only blocker of Focal
16:05:46 to keep test coverage I think the 2nd one is the better choice
16:07:02 barbican is one blocker for now along with nova. the gnocchi one is fixed by ceilometer on mariadb
16:07:26 keeping one job running on bionic is no issue and we will not lose the test coverage either
16:07:29 sure, if other bugs make it hard to move to Focal then let's keep our testing on Bionic
16:08:02 it will be like we are testing on Bionic as well as on Focal (all other jobs on Focal with these tests skipped)
16:08:27 only 'tempest-integrated-compute' will run on bionic until the qemu bug is fixed
16:08:36 that works for me
16:08:55 what about the others in the room?
16:09:18 trying to digest
16:09:42 gibi: we could move that to centos8 too but that works for me
16:09:51 tl;dr: all jobs on Focal except one running the 3 tests that fail on Focal ?
16:10:03 amirite ?
16:10:03 yeah
16:10:16 bauzas: yes
16:10:20 this sounds like a reasonable tradeoff to me
16:10:27 with very little effort
16:10:41 objections here that I can't see ?
16:10:57 no objection but using centos8 i think makes more sense
16:11:02 this will though mean we will run two different qemu versions ?
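(For context: option 1 above uses tempest's skip-list mechanism, where tools/tempest-integrated-gate-compute-blacklist.txt lists test-name regexes that the integrated-gate jobs exclude, one per line, with '#' comments. A minimal sketch only, assuming that file format; the test names below are placeholders, the real three are in the linked review.)

    # Illustrative entries only - the actual three test names are in
    # https://review.opendev.org/#/c/734700/.
    # Device detach fails on Focal until the qemu DEVICE_DELETED bug
    # (LP#1894804) is fixed, so skip the affected tests here.
    tempest\.api\.compute\..*detach_volume.*
    tempest\.api\.compute\..*detach_interface.*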
16:11:03 it's one of the supported OSes
16:11:04 makes sense to me
16:11:12 but 18.04 is fine too for now
16:11:29 (that being said, I'm all fine with two versions of qemu in different jobs)
16:11:50 I would go with the easier solution as we have limited time until the release
16:11:52 sean-k-mooney: for the long term I have a plan to distribute the testing across different distros, not only ubuntu, but let's do that later, not at this stage of the release
16:11:53 that just means we can't bump the minimum versions on libvirt, that's it
16:12:04 amirite too on this ? ^
16:12:10 gmann: sure
16:12:32 i.e. we stick with the minimum versions being Bionic's
16:12:33 bauzas: am no, we just enable the cloud archive
16:12:37 on bionic
16:12:44 we can still bump the min versions
16:12:54 ok that was my question
16:13:03 so this is not just "stick with Bionic"
16:13:12 this is "stick with Bionic + UCA"
16:13:17 us bionic with the ussuri cloud archive
16:13:26 gotcha
16:13:35 okay this sounds good to me
16:13:42 so, wait,
16:13:47 and then we can bump the minimums on victoria
16:13:52 am I right that we've been testing on bionic all cycle,
16:14:01 yep
16:14:03 yes,
16:14:05 ahah, good point
16:14:07 and that this would move 95% of our tests to focal suddenly, and only run the problematic ones on bionic?
16:14:08 we were meant to move like before m2
16:14:25 do we have existing jobs that run on Focal ?
16:14:36 even if they're not in the check pipelines
16:14:37 if so, I think it makes sense to test 100% on bionic, and 95% on focal at least until the release and then cut the bionic job down to the problem children
16:14:39 os-vif moved a while ago
16:14:47 other projects have too
16:14:53 bauzas: this one tests https://review.opendev.org/#/c/734700/5
16:14:58 nova has been lagging behind
16:15:16 dansmith: yup, double the jobs
16:15:18 and i had the nova gates tested with that, which worked well except for these 3 failing tests
16:15:27 bauzas: yeah
16:15:28 dansmith: bionic is not in the supported runtimes list in the governance repo
16:15:49 sean-k-mooney: but that's just out of sync with reality (i.e. what it has been tested on all cycle) right?
16:15:50 it's not meant to be used for testing for victoria
16:15:51 dansmith: yeah Focal is the testing runtime for Victoria
16:15:58 sean-k-mooney: at which point is this a blocker ?
16:16:21 (us continuing to use Bionic for three jobs)
16:16:23 this was dropped as a blocker when we figured out it's a qemu bug
16:16:24 man
16:16:26 3 tests
16:16:39 we can use it for 3 tests
16:16:44 who said this was a blocker ?
16:16:50 but we should move the testing to focal
16:16:55 so that stable/victoria is on focal
16:16:59 can't we just be brave and tell what happens in the relnotes or somewhere ?
16:17:14 it is a decision we need to make before the Focal migration is done (tempest and devstack job switch)
16:17:17 like, "this has been tested upstream on both Focal and Bionic"
16:17:38 we could but we should still move most of the jobs to focal
16:17:41 after all, if the jobs pass on Bionic, what's the matter ?
16:17:53 operators just know they can use both
16:18:01 which is nice to them
16:18:21 e.g. i think 100% bionic would be wrong since victoria will not be packaged for or supported on bionic by canonical
16:18:26 and I don't want us to transform ourselves into Canonical QE
16:18:35 but we are keeping bionic testing only for the failing tests, just to avoid losing coverage otherwise
16:18:38 bauzas: so you say we document the broken tests in reno. That is cool.
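(For context: the 2nd option being discussed - keep only tempest-integrated-compute on Bionic while every other job moves to Focal - amounts to a nodeset override on that one Zuul job. An illustrative sketch only, not gmann's actual patch; the nodeset name is assumed to exist in openstack-zuul-jobs.)

    # Illustrative Zuul job variant; the real change is in
    # https://review.opendev.org/#/c/734700/.
    - job:
        name: tempest-integrated-compute
        # Keep this single job on Ubuntu 18.04 (Bionic) so the tests hit by
        # the qemu device-detach bug retain coverage; all other integrated
        # jobs move to Ubuntu 20.04 (Focal).
        nodeset: openstack-single-node-bionic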
16:18:52 I'm just saying let's be honest
16:18:54 it is Focal we have to tell users that we are testing with
16:18:55 sure
16:19:12 sean-k-mooney: yeah I get that, but it just seems like we're shifting what we're validating at the last minute.. a cycle full of stable tests replaced by a couple weeks of a different environment
16:19:20 since people would run on Focal, those tests mean that we're broken either way
16:19:21 I dunno, just seems easy to keep both, but maybe not
16:19:27 dansmith: the patches have been up for a very long time
16:19:38 they were blocked by these 3 tests
16:19:38 dansmith: I'm personally in favor of being pragmatic like you
16:19:48 sean-k-mooney: what does that have to do with what we've actually been testing?
16:19:50 can we change the openstack communication about Focal support?
16:19:51 and us not being pedantic on things we "support" upstream
16:20:05 the point is actually very simple
16:20:14 dansmith: nothing, other than i think it would be incorrect not to test on focal too
16:20:18 bauzas: right, openstack governance saying V means Focal is fine, but if we haven't been testing on that all cycle...
16:20:23 we do support Focal EXCEPT the fact we regress on some identified bugs
16:20:26 gibi: i do not think so as most of the testing-ready and tox-based jobs moved to Focal already
16:20:32 sean-k-mooney: no, I'm arguing for both, that's all
16:20:32 bauzas: +1
16:20:42 dansmith: ya im ok with both
16:20:47 ack
16:21:01 dansmith: the current proposal is to run just tempest-integrated-compute on 18.04
16:21:10 so we could document something saying "heh, look, those are the bugs with Focal, please consider switching back to Ussuri QEMU versions if you hit those things"
16:21:10 running all jobs on both would be excessive
16:21:13 dansmith: but we cannot test those from the start of the cycle, it is a migration cycle and at the end if testing is green then we are good
16:21:24 sean-k-mooney: oh I thought it was "just three tests of that" .. if it's the whole normal job, then that's fine
16:21:36 dansmith: yes, the whole job
16:21:39 OK I think we have converged
16:21:44 gmann: cool
16:21:58 run tempest-integrated-compute on 18.04, move the rest to 20.04, document the qemu bug in a reno
16:22:09 sounds good
16:22:19 I'm still struggling with the exact impact of the QEMU bugs so we could properly document them
16:22:25 in the prelude section
16:22:35 but if people feel brave enough, they can patch my prelude change
16:22:37 bauzas: that's obviously your job, mr. prelude :P
16:22:38 bauzas: device detach fails if the system is under load
16:22:46 so volume detach
16:23:02 dansmith: damn you.
16:23:19 sean-k-mooney: okay, then I can try to write something in the prelude
16:23:23 bauzas: sean-k-mooney and lyarwood can help with that prelude part
16:23:29 yep
16:23:30 ok i will update my tempest patch for keeping 'tempest-integrated-compute' on 18.04.
16:23:37 let's do this in another change on top of the existing one
16:23:39 bauzas: :P
16:23:55 sure we can do it in a follow up if you like
16:23:57 any other bug we need to touch?
16:24:02 here, now
16:24:13 sean-k-mooney: I'll follow up with you on the workaround we need to document
16:24:23 (and lyarwood)
16:24:28 bauzas: there isn't really a workaround
16:24:35 but sure we can do it after the meeting
16:24:38 downgrade QEMU?
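(For context: the release note agreed on here goes into a reno file under releasenotes/notes/. A minimal sketch of how the qemu detach issue could be documented, assuming it lands in an issues section alongside the prelude; the filename and wording are illustrative, not the merged note at https://review.opendev.org/#/c/751045/.)

    # releasenotes/notes/focal-qemu-device-detach-known-issue-xxxxxxxx.yaml
    # (hypothetical filename; reno appends a random suffix)
    issues:
      - |
        On Ubuntu 20.04 (Focal), detaching a virtio-blk disk device can fail
        while the guest is under load because QEMU does not emit the second
        DEVICE_DELETED event (https://bugs.launchpad.net/qemu/+bug/1894804).
        Until the QEMU bug is fixed, upstream CI keeps the
        tempest-integrated-compute job running on Ubuntu 18.04.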
16:24:43 maybe
16:24:47 either way, we're done
16:24:53 OK, moving on
16:24:58 #topic Release Planning
16:25:03 next week we need to produce RC1
16:25:07 * lyarwood catches up in the background
16:25:10 release tracking etherpad #link https://etherpad.opendev.org/p/nova-victoria-rc-potential
16:25:22 besides the prelude I'm not tracking any patch that needs to land before
16:25:25 RC1
16:25:28 if you have such a patch let me know
16:25:38 prelude: #link https://review.opendev.org/#/c/751045/
16:25:54 also please tag RC critical bugs with the victoria-rc-potential tag in launchpad
16:25:57 if any
16:26:26 I also did not see any FFE request on the ML so the nova feature freeze is now complete
16:26:46 is there anything else about the release we need to discuss?
16:26:50 huzzah
16:27:52 #topic Stable Branches
16:28:02 lyarwood, elod: any news?
16:28:14 Nothing from me, elod
16:28:16 ?
16:28:35 there was the gate issue (pypi proxy), but that is fixed
16:29:05 otherwise i guess we need to prepare for the stable/victoria cut next week
16:29:25 thanks
16:29:39 #topic Sub/related team Highlights
16:29:59 API (gmann)
16:30:17 elod: stable/victoria won't be open until we GA
16:30:36 we'll have the stable branch, but this will be for regressions only that require a new RC
16:30:41 only one thing - we can backport this once it is merged - https://review.opendev.org/#/c/752211/1
16:31:00 that's all from me
16:31:25 thanks
16:31:31 Libvirt (bauzas)
16:31:40 heh
16:31:52 well, haven't looked at the etherpad tbh
16:32:03 but we're in the RC period, if people have bugs, that's all good
16:32:15 I do remember we now have the version bumps
16:32:29 https://review.opendev.org/#/q/status:open+project:openstack/nova+branch:master+topic:bump-libvirt-qemu-victoria
16:32:34 yeah sorry, was the outcome of the focal discussion earlier that we will move to it?
16:32:39 or only in part?
16:32:45 lyarwood: we can bump AFAICU
16:32:45 if it's only in part we can't bump versions
16:32:57 as Bionic will run Ubuntu UCA on a single job
16:33:06 and the other jobs running Focal
16:33:12 okay, well it's really really late to be doing this but if people are happy we could merge that
16:33:37 actually you're making a point
16:33:40 * lyarwood forgets if there was anything else outstanding with the series
16:33:42 and dansmith had the same concern
16:33:43 is there a risk? we are bumping the minimum but we tested with newer than the new minimum, didn't we?
16:33:44 lyarwood: this one https://review.opendev.org/#/c/734700/6/.zuul.yaml@221
16:33:54 gibi: the fact is that we never really ran Focal
16:34:17 OK, then let's have the Focal switch first
16:34:23 bumping the minimums + running Focal on 95% of jobs presents a certain risk of instability
16:34:37 which means we won't bump for Victoria
16:34:41 yes
16:34:48 it means we will not have time to bump in
16:34:49 V
16:34:50 I'm cool with this, lyarwood concerns ?
16:35:04 I'm fine pushing the bump to W
16:35:10 did we not already merge some of the bumps?
16:35:14 no
16:35:27 nope
16:35:36 really? i had to rebase one of my patches because of one of the patches in that series
16:35:49 can I gently push a procedural -2, pleeaaaaase ?
16:35:57 the test changes landed sean-k-mooney
16:36:04 it's been a while since I was mean
16:36:07 but they fixed broken tests anyway
16:36:12 right, that is what broke me
16:36:15 ok
16:36:18 bauzas: yeah go ahead
16:36:24 okay we can move on
16:36:40 am we probably should merge the updated next version number
16:36:44 but maybe not the rest
16:36:57 so next cycle we can still move to the one we were planning to move to
16:37:00 lyarwood: <3
16:37:14 e.g. libvirt 6.0.0 for wallaby
16:37:18 instead of 5.0.0
16:37:39 actually https://review.opendev.org/#/c/749707/3 could be mergeable
16:38:12 sean-k-mooney: okay I can break that out if it keeps that part on track for W
16:38:19 I'd personally like to see https://review.opendev.org/#/c/746981/ merged
16:38:22 okay, I guess we have a consensus on blocking the minimum bumps, let's discuss offline what's acceptable for V
16:38:30 the rest are noise and not necessary
16:38:40 how dare you
16:38:42 in the context of the release, that is
16:38:42 stephenfin: yeah, that's what I lean towards considering
16:38:42 :D
16:39:05 okay, I'll cut the head off into another patch
16:39:06 so we actually bump the min version
16:39:09 but not the cleanup
16:39:09 no
16:39:16 so we can still support the older version
16:39:23 sean-k-mooney: that's what I'm suggesting
16:39:28 though not for that reason
16:39:33 n-cpu doesn't start with the older version
16:39:41 the MIN_ version would still be bumped
16:39:46 so that's pointless
16:39:51 ah right
16:40:06 I'm saying leave the dead code there
16:40:15 so bump next_min_version but nothing else
16:40:23 next_min and min
16:40:33 haha this is fun
16:40:53 so if we do that we'd also need bionic on UCA
16:40:56 stephenfin: min will break because of https://review.opendev.org/#/c/746981/5/nova/virt/libvirt/driver.py@788
16:40:59 oh
16:41:05 my question is simple: does Bionic with Ussuri UCA support libvirt==5.0.0 ?
16:41:07 sorry, I thought we'd done the UCA change
16:41:15 if not, ignore me
16:41:16 bauzas: yes
16:41:26 i think that might be the train version, i'll check
16:41:33 but it has a 5.x version
16:41:39 18:34 < bauzas> bumping the minimums + running Focal on 95% of jobs presents a certain
16:41:42 risk of instability
16:41:45 this is where we started ^^
16:41:45 okay, let's keep the procedural hold and figure out a masterplan tomorrow
16:41:58 bauzas: +1
16:42:09 train is 5.4
16:42:14 http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/bionic-updates/train/main/binary-amd64/Packages
16:42:24 ussuri is 6.0.0 from focal
16:42:42 dansmith: we could be discussing out of your usual work hours, any concerns you'd raise ?
16:42:49 so we would enable train not ussuri, as i said previously
16:42:52 or just say "I dare don't care"
16:43:15 yeah otherwise we'd hit the same focal issue, iirc that's what happened when I tried this a while ago as a workaround
16:43:41 https://review.opendev.org/#/c/747123/
16:43:55 sorry, I zoned out, thought we were done with this
16:44:01 it's actually fun that 4 Red Hat engineers are trying to fix an Ubuntu problem
16:44:05 I wish we were
16:44:11 I should get a doubled payslip
16:44:42 either way, I'm done with this
16:44:43 bauzas: you can get mine, I don't have a distro behind me ;)
16:45:08 let's figure out tomorrow what we can do without breaking too many things
16:45:19 OK
16:45:30 any last words?
16:45:57 install Fedora ?
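(For context: the "bump next_min but hold min" idea above maps onto the version constants checked at compute startup in nova/virt/libvirt/driver.py, where n-cpu refuses to start below the MIN_* values and only warns below the NEXT_MIN_* values. A rough sketch of that split, with placeholder values taken from the discussion rather than the values that actually merged.)

    # nova/virt/libvirt/driver.py (sketch only, placeholder values)
    # Held for Victoria so Bionic + UCA (libvirt 5.x from the Train cloud
    # archive) can still start n-cpu:
    MIN_LIBVIRT_VERSION = (4, 0, 0)
    # Raised now, so the Wallaby minimum can move to the Focal/Ussuri-era
    # libvirt 6.0.0 instead of the previously planned 5.0.0:
    NEXT_MIN_LIBVIRT_VERSION = (6, 0, 0)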
16:45:58 #topic PTG and Forum planning
16:46:08 mail #link http://lists.openstack.org/pipermail/openstack-discuss/2020-September/017063.html
16:46:19 PTG topics #link https://etherpad.opendev.org/p/nova-wallaby-ptg
16:46:32 bauzas: man, don't get sean-k-mooney started...
16:46:49 stephenfin: don't reply to a troll
16:46:49 anything you want to bring up about the PTG?
16:46:51 never ever.
16:47:05 :)
16:47:06 do we need cross-project sessions organized?
16:47:41 im not sure
16:47:52 maybe with cinder and neutron
16:48:11 nothing from the API perspective.
16:48:25 i assume unified limits will continue
16:48:27 sean-k-mooney: do you have a specific topic in mind?
16:48:38 for neutron and cinder?
16:48:43 for neutron we have the optional numa affinity with neutron ports
16:48:54 neutron are finishing merging their side currently
16:49:04 so really it's just our bit left
16:49:09 not sure that needs discussion
16:49:17 maybe a neutron xp session ?
16:49:34 just because I've been told we should have a conversation
16:49:53 on crazy pants, but whatever
16:50:08 do you have specific topics
16:50:12 OK, I will talk to slaweq to set things in motion
16:50:21 maybe the neutron side has some topics
16:50:27 ya
16:50:40 or we just use that slot for coffee and small talk
16:50:42 i expect routed networks scheduling to be of interest
16:51:26 lyarwood: anything for cinder?
16:51:58 sean-k-mooney: sorry had to step away
16:52:23 sean-k-mooney: nothing feature wise but it would be useful to have a slot
16:52:35 sean-k-mooney: don't you have some ideas on things that were of interest to our PM ?
16:52:44 either way
16:52:50 yes
16:52:51 OK then I will set up slots with neutron and cinder
16:52:57 I don't have specific things in minde
16:52:59 thanks gibi
16:52:59 mind*
16:53:05 anything else about the PTG?
16:53:14 that being said, discussing next steps with routed networks could be a thing
16:53:44 gibi: happy to help organise the cinder session FWIW
16:53:56 reminder about the zuulv3 legacy job removal: there's https://review.opendev.org/#/c/744883/ (and a future one which depends on it) and
16:53:59 ups, sorry
16:54:03 too early
16:54:03 lyarwood: thanks
16:54:14 useful
16:54:19 #topic Open discussion
16:54:37 >.< sorry
16:54:39 tosky: re the zuulv3 thing, reviews would be appreciated as I'm struggling to find time to finish this
16:54:59 tosky: I'm just borking up some simple ansible tasks at the moment iirc
16:55:15 obviously reviews from the rest of the nova team would also be appreciated ;)
16:55:16 lyarwood: I guess that once nova-multinode-evacuate is implemented, nova-multinode-evacuate-ceph will be easily set up?
16:55:29 tosky: that needs a multinode base job that they don't have yet
16:55:36 tosky: and isn't simple with the current state of the plugin
16:55:44 tosky: I was going to suggest leaving that as a TODO for W
16:55:44 and there is also nova-multinode-live-migration-ceph, right
16:56:03 with the underlying multinode job for ceph both would be easy, yes
16:56:07 we have grenade also pending on this - https://review.opendev.org/#/c/742056/
16:56:08 but we don't have that yet
16:56:10 lyarwood: that's up to the nova team; from my point of view, I just need the legacy jobs to disappear
16:56:12 i need to check the failure
16:56:20 yep, I was going to mention that grenade one
16:56:50 tosky: kk well as no one else seems interested let's say that's the plan for V
16:57:15 tosky: land the evacuate job, defer the ceph jobs until W and remove the legacy LM job
16:57:25 tosky: oh and the grenade job in V
16:57:31 gmann: maybe in the grenade job you could reuse part of the work done by lyarwood to remove the usage of the hooks
16:57:38 but I haven't checked if it covers that
16:57:46 yep
16:57:49 tosky: gmann has already done some of this iirc
16:58:13 so please merge everything you can before the V branching next week - just to spare you another set of backports!
16:58:18 yeah, refreshing the results to see what failed
16:58:39 * lyarwood will try to find time tomorrow to finish up the evacuate job
16:58:55 thanks! We are almost there with this goal (throughout openstack as a whole)
16:59:14 anything else for the last 40 seconds?
17:00:04 then thanks for joining
17:00:07 #endmeeting