Tuesday, 2023-11-28

sahido/09:45
dvo-plvHello sean-k-mooney gibi Could you please review this spec and code11:22
dvo-plvhttps://review.opendev.org/c/openstack/nova-specs/+/89592411:22
dvo-plvhttps://review.opendev.org/c/openstack/nova/+/87607511:22
opendevreviewSylvain Bauza proposed openstack/nova master: libvirt: Cap with max_instances GPU types  https://review.opendev.org/c/openstack/nova/+/89962514:29
opendevreviewSylvain Bauza proposed openstack/nova master: WIP  https://review.opendev.org/c/openstack/nova/+/90208414:29
bauzasreminder : nova meeting in 12 mins15:48
bauzas*here15:49
* gibi will be a bit distracted during the meeting15:52
bauzas#startmeeting nova16:00
opendevmeetMeeting started Tue Nov 28 16:00:54 2023 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.16:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:00
opendevmeetThe meeting name has been set to 'nova'16:00
elodilleso/16:00
bauzas#link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting16:01
bauzaswho's around ?16:01
fwiesel\o16:01
auniyal6o/16:02
Ugglao/16:03
bauzasokay let's start16:03
bauzas#topic Bugs (stuck/critical) 16:03
bauzas#info No Critical bug16:03
bauzas#link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 38 new untriaged bugs (+1 since the last meeting)16:03
bauzasI've triaged two simple bugs16:04
bauzasmaybe gmann could look at https://bugs.launchpad.net/nova/+bug/204403516:04
bauzasnothing else to say16:05
bauzas#info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster16:05
bauzas#info bug baton is gibi16:05
bauzasshit16:05
bauzas#undo16:05
opendevmeetRemoving item from minutes: #info bug baton is gibi16:05
bauzassorry16:05
bauzasgibi: can you actually take the baton ?16:05
bauzaslooks like gibi is not there, we'll see later16:08
bauzasmoving on16:09
bauzas#topic Gate status 16:09
bauzas#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:09
bauzas#link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&pipeline=periodic-weekly Nova&Placement periodic jobs status16:09
bauzas#info Please look at the gate failures and file a bug report with the gate-failure tag.16:10
bauzasnova-emulation still fails :(16:10
bauzas#link https://zuul.openstack.org/build/13439683a24840f8b7a358c5e51e5c4e16:10
sean-k-mooneyit likely will until someone has time to fix it16:10
sean-k-mooney to be clear looking at the output only 1 test failed16:11
dansmithgate is not very healthy16:11
dansmithI've been rechecking something for a while and can't make it past16:12
dansmith13 rechecks so far :/16:12
dansmiththis is my last day of the year, unfortunately, so I'm hoping someone else can take up the charge of trying to track down and improve things16:12
bauzasyeah I've seen some rechecks16:13
bauzasI'll see if I have time... after all those mdev thingies unfortunately :(16:13
* gibi joins late16:14
bauzasanyway, I don't think we can discuss anything now16:14
* gibi does not make promises16:15
bauzasshall we move on ?16:16
bauzasgibi: ack16:16
bauzas#topic Release Planning 16:18
bauzas#link https://releases.openstack.org/caracal/schedule.html#nova16:18
bauzas(the releases patch got merged :) )16:19
elodilles\o/16:19
bauzasnext week, we'll have a spec review day16:19
bauzas#info Spec review day planned next Dec 5th 202416:19
bauzas#unso16:19
bauzas#undo16:19
opendevmeetRemoving item from minutes: #info Spec review day planned next Dec 5th 202416:19
bauzas#info Spec review day planned next Dec 5th 202316:19
bauzas#info Caracal-2 (and spec freeze) milestone in 6 weeks16:21
bauzasanything about releases ?16:21
bauzaslooks not16:24
bauzas#topic Review priorities 16:24
bauzas#link https://etherpad.opendev.org/p/nova-caracal-status16:24
bauzaswe have a review demand16:24
bauzas#link https://bugs.launchpad.net/nova/+bug/1869804 and https://review.opendev.org/c/openstack/nova/+/87777316:25
bauzasI've seen sean-k-mooney already reviewing it so I'm fine with moving it to the 'accepted bugfixes' section16:25
bauzasanyway16:27
bauzasmoving on16:27
bauzas#topic Stable Branches 16:27
bauzaselodilles: go for it16:27
sean-k-mooneyyes its a non trivial fix but im reviewing it16:27
elodilles#info stable gates don't seem blocked16:27
elodilles#info stable release patches still open for review: https://review.opendev.org/q/project:openstack/releases+is:open+intopic:nova16:28
elodilles+1: yoga is going to be unmaintained, so final stable/yoga release should happen ASAP - https://etherpad.opendev.org/p/nova-stable-yoga-eom16:28
elodillesactually, i've updated the release patches ^^^16:28
elodillesto contain the recently merged stable backports16:28
elodillesand i suggest not waiting anymore16:29
elodillesthe patches were proposed 1 month ago :(16:29
elodillesand we can release *anytime*16:29
elodilles(except yoga of course, as that release is the final one)16:29
elodillesso release liaisons or PTL pls review o:)16:30
elodillesand the usual, last item:16:30
elodilles#info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci16:30
elodillesthat's all from me16:30
bauzas++16:31
bauzasand yeah I've seen the releases patches16:31
bauzasfwiw, I'm ok with releasing 2023.1 and 2023.2 since both have the fixes I want16:32
elodilles\o/16:32
bauzasZed is still missing the RPC backports16:32
bauzasI'll try to look at them tomorrow16:32
bauzaselodilles: ping me again tomorrow16:32
elodillesbauzas: ack, thanks in advance!16:33
bauzasok, next is a new topic16:36
bauzas#topic vmwareapi 3rd-party CI efforts Highlights 16:36
fwieselShall I?16:36
bauzasfwiesel: shoot :)16:36
fwiesel#Info Gerrit Event Stream connected to our CI (Argo) with personal user16:36
fwieselSo, nothing spectacular. Events go in, they could trigger whatever we want to run later. That part is the more tricky one, and people are working on it.16:37
fwieselThe most complicated part is the networking, we try to cut it down as much as possible so we test only nova.16:38
fwiesel#Info CI user needs a name. According to naming convention: SAP vmwareapi CI. Has risk of confusion, as SAP is not a vendor of the technology. Our preference simply "SAP CI" or "SAP tempest CI".16:38
fwieselSo, the naming convention on the 3rd party CI suggests "Company Technology CI"16:39
bauzasfwiesel: I'd prefer 'vmwareapi CI run by SAP'16:39
bauzassomething telling users what exactly has been run16:40
gibi+1 on keep mentioning vmwareapi in the name16:40
fwieselThat would work. But if we run another CI, say for cinder, then we end up with the same thing as the VMware nsxt CI running cinder16:40
bauzasor yeah, "SAP vmwareapi CI"16:40
bauzashonestly, I don't wanna bikeshed16:41
bauzasI understand that you'd appreciate the SAP name in the CI job name16:41
fwieselOkay then we go with your proposal "vmwareapi CI run by SAP"16:41
bauzasbut I also want my users to know that it's testing with vmwareapi virt driver16:41
bauzasso, please keep both names16:41
fwieselFine by me.16:41
bauzasI'm French and I'm bad at namings :)16:41
bauzasbut my gut is saying "any words with both SAP and vmwareapi are fine by me"16:42
fwieselThen we agree on "vmwareapi CI run by SAP"?16:42
fwiesel#agreed The CI user will be named "vmwareapi CI run by SAP"16:43
fwieselThat's from my side. Any questions?16:44
fwieselWell, then I think we are through.16:45
bauzas++16:46
bauzasthanks fwiesel for the report16:46
bauzas:)16:46
bauzasgreatly appreciated16:46
fwieselYou're welcome16:46
bauzaslast topic then16:47
bauzas#topic Open discussion 16:47
bauzasthe agenda is empty16:47
bauzasanything anyone ?16:47
sean-k-mooneynope16:48
bauzascool16:48
bauzasthanks all, giving you back 12 mins16:48
bauzas#endmeeting16:48
opendevmeetMeeting ended Tue Nov 28 16:48:22 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:48
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2023/nova.2023-11-28-16.00.html16:48
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2023/nova.2023-11-28-16.00.txt16:48
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2023/nova.2023-11-28-16.00.log.html16:48
elodillesthanks bauzas o-16:49
elodilleso/16:50
fwieselthanks a lot  \o16:50
melwittdansmith: I have noticed (anecdotally) that a large portion of gate failures have been due to guest kernel panics. I have spent some time looking at them and see for example "pci 0000:00:03.6: BAR 13: failed to assign [io  size 0x1000]" and "pci 0000:00:03.6: BAR 13: no space for [io  size 0x1000]" I'm not sure how to tell which thing caused the panic. and in my search for more info online I haven't found anything other than mention of19:41
melwitt old kernel bugs or people saying to run fsck and try again. not sure how to approach this, so if you have any thoughts, it would probably help19:41
melwittit seems weird to me how frequent this is. don't know what that could mean either. in the past guest kernel panics were a lot more rare19:42
dansmithorly, well, that might explain the uptick lately... but yeah, I'd lean on kashyap and sean-k-mooney to look at those sorts of things19:46
dansmithunfortunately, we've had trouble getting our kernel team to look at those things because we don't really run a supported kernel, but with the recent cirros update, we're closer to a support-able kernel I think19:47
dansmithso might be worth trying again19:47
melwittok, yeah. I checked and we are running the newest kernel that cirros supports. I did wonder if there could be a chance for improvement by using something other than cirros, like to get a different/newer kernel version19:49
melwittor I guess maybe building our own cirros with a newer kernel 19:50
dansmithI mean, it's possible but cirros is also tailored to work in very small environments, which is important for us20:07
dansmitha stock distro kernel (and certainly image) won't likely work well in our super tiny guests20:07
dansmithand if we had to increase the size of those, we'd need to make other changes20:08
dansmithgetting some kernel person to tell us what the crashes mean would be a good first start20:08
dansmithlike if it's "that should never happen" or "they're already memory constrained and hitting a bug as a result" or whatever20:08
melwittyeah20:30
opendevreviewMerged openstack/python-novaclient master: Fix typos  https://review.opendev.org/c/openstack/python-novaclient/+/90145721:58
sean-k-mooney[m]melwitt: cirros uses the ubuntu lts kernel just in case you didn't know that23:44
sean-k-mooney[m]i really should try and make time to get my alpine series working again but honestly haven't had time or energy to do that.23:45
melwittsean-k-mooney[m]: I saw on their site 🙂 but I also see the cirros we're using is also using the nearly newest kernel for jammy already so I guess maybe there's little chance a newer version would make a difference23:46
sean-k-mooney[m]yep 6.x is using the 22.04/jammy kernel23:47
sean-k-mooney[m]part of the issue, to dan's point, is that kernel does expect to be deployed in a vm with more ram than we have in our normal guest vms23:47
melwittsean-k-mooney[m]: this is a paste of the kernel panic if you may be able to tell what it means https://pastebin.com/raw/cVDem0Xr23:48
melwittI see23:48
sean-k-mooney[m]is that q35 or pc i wonder23:51
sean-k-mooney[m]i can't quite tell from that dmesg output23:51
* melwitt looks23:52
sean-k-mooney[m]its q3523:53
sean-k-mooney[m]pcieport 0000:00:02.0: PME:23:53
sean-k-mooney[m]so i wonder if we could mitigate this by plugging less pcie ports23:53
melwittit's from this nova-next fail https://0c0c4e33641a15d6198e-9e6a8447d6325ce608287fa851bc53b0.ssl.cf2.rackcdn.com/892800/3/gate/nova-next/4b8b450/testr_results.html23:54
sean-k-mooney[m]ya so nova next is where we currently test q3523:55
sean-k-mooney[m]and its the only job that uses q3523:55
clarkbsean-k-mooney[m]: I wasn't aware the linux kernel publishes minimum memory requirements23:55
clarkband I definitely have run linux in less memory than the ci test nodes...23:55
sean-k-mooney[m]clarkb: linux does not, canonical does for their images23:56
sean-k-mooney[m]and ya puppy linux or tinycore or alpine23:56
sean-k-mooney[m]all support much less but i think the lowest canonical support is 25623:57
sean-k-mooney[m]maybe 128. that's not just for the kernel of course, that's for the full image with user space23:57
melwittit's also happening in other jobs but maybe for different reasons? example https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_fe3/898188/16/check/tempest-integrated-storage/fe33fa7/testr_results.html23:57
melwittthat ^ is not q3523:58
JayFWe run tinycore for Ironic test VMs, and already have automation for building test images, if that's something you all have interest in.23:59
JayFhttps://opendev.org/openstack/ironic-python-agent-builder/src/branch/master/tinyipa23:59
sean-k-mooney[m]https://ubuntu.com/core/docs/system-requirements so 256mb is their minimum for their smallest image23:59
JayFObviously we have to put more in there than you all would, but if you go that route there's no need to stop from zero.23:59
JayFs/stop/start/23:59
