sahid | o/ | 09:45 |
---|---|---|
dvo-plv | Hello sean-k-mooney gibi Could you please review this spec and code | 11:22 |
dvo-plv | https://review.opendev.org/c/openstack/nova-specs/+/895924 | 11:22 |
dvo-plv | https://review.opendev.org/c/openstack/nova/+/876075 | 11:22 |
opendevreview | Sylvain Bauza proposed openstack/nova master: libvirt: Cap with max_instances GPU types https://review.opendev.org/c/openstack/nova/+/899625 | 14:29 |
opendevreview | Sylvain Bauza proposed openstack/nova master: WIP https://review.opendev.org/c/openstack/nova/+/902084 | 14:29 |
bauzas | reminder : nova meeting in 12 mins | 15:48 |
bauzas | *here | 15:49 |
* gibi | will be a bit distracted during the meeting | 15:52 |
bauzas | #startmeeting nova | 16:00 |
opendevmeet | Meeting started Tue Nov 28 16:00:54 2023 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:00 |
opendevmeet | The meeting name has been set to 'nova' | 16:00 |
elodilles | o/ | 16:00 |
bauzas | #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting | 16:01 |
bauzas | who's around ? | 16:01 |
fwiesel | \o | 16:01 |
auniyal6 | o/ | 16:02 |
Uggla | o/ | 16:03 |
bauzas | okay let's start | 16:03 |
bauzas | #topic Bugs (stuck/critical) | 16:03 |
bauzas | #info No Critical bug | 16:03 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 38 new untriaged bugs (+1 since the last meeting) | 16:03 |
bauzas | I've triaged two simple bugs | 16:04 |
bauzas | maybe gmann could look at https://bugs.launchpad.net/nova/+bug/2044035 | 16:04 |
bauzas | nothing else to say | 16:05 |
bauzas | #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster | 16:05 |
bauzas | #info bug baton is gibi | 16:05 |
bauzas | shit | 16:05 |
bauzas | #undo | 16:05 |
opendevmeet | Removing item from minutes: #info bug baton is gibi | 16:05 |
bauzas | sorry | 16:05 |
bauzas | gibi: can you actually take the baton ? | 16:05 |
bauzas | looks like gibi is not there, we'll see later | 16:08 |
bauzas | moving on | 16:09 |
bauzas | #topic Gate status | 16:09 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs | 16:09 |
bauzas | #link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&pipeline=periodic-weekly Nova&Placement periodic jobs status | 16:09 |
bauzas | #info Please look at the gate failures and file a bug report with the gate-failure tag. | 16:10 |
bauzas | nova-emulation still fails :( | 16:10 |
bauzas | #link https://zuul.openstack.org/build/13439683a24840f8b7a358c5e51e5c4e | 16:10 |
sean-k-mooney | it likely will until someone has time to fix it | 16:10 |
sean-k-mooney | to be clear looking at the output only 1 test failed | 16:11 |
dansmith | gate is not very healthy | 16:11 |
dansmith | I've been rechecking something for a while and can't make it past | 16:12 |
dansmith | 13 rechecks so far :/ | 16:12 |
dansmith | this is my last day of the year, unfortunately, so I'm hoping someone else can take up the charge of trying to track down and improve things | 16:12 |
bauzas | yeah I've seen some rechecks | 16:13 |
bauzas | I'll see if I have time... after all those mdev thingies unfortunately :( | 16:13 |
* gibi | joins late | 16:14 |
bauzas | anyway, I don't think we can discuss anything now | 16:14 |
* gibi | does not make promises | 16:15 |
bauzas | shall we move on ? | 16:16 |
bauzas | gibi: ack | 16:16 |
bauzas | #topic Release Planning | 16:18 |
bauzas | #link https://releases.openstack.org/caracal/schedule.html#nova | 16:18 |
bauzas | (the releases patch got merged :) ) | 16:19 |
elodilles | \o/ | 16:19 |
bauzas | next week, we'll have a spec review day | 16:19 |
bauzas | #info Spec review day planned next Dec 5th 2024 | 16:19 |
bauzas | #unso | 16:19 |
bauzas | #undo | 16:19 |
opendevmeet | Removing item from minutes: #info Spec review day planned next Dec 5th 2024 | 16:19 |
bauzas | #info Spec review day planned next Dec 5th 2023 | 16:19 |
bauzas | #info Caracal-2 (and spec freeze) milestone in 6 weeks | 16:21 |
bauzas | anything about releases ? | 16:21 |
bauzas | looks not | 16:24 |
bauzas | #topic Review priorities | 16:24 |
bauzas | #link https://etherpad.opendev.org/p/nova-caracal-status | 16:24 |
bauzas | we have a review demand | 16:24 |
bauzas | #link https://bugs.launchpad.net/nova/+bug/1869804 and https://review.opendev.org/c/openstack/nova/+/877773 | 16:25 |
bauzas | I've seen sean-k-mooney already reviewing it so I'm fine with moving it to the 'accepted bugfixes' section | 16:25 |
bauzas | anyway | 16:27 |
bauzas | moving on | 16:27 |
bauzas | #topic Stable Branches | 16:27 |
bauzas | elodilles: go for it | 16:27 |
sean-k-mooney | yes its a non trivial fix but im reviewing it | 16:27 |
elodilles | #info stable gates don't seem blocked | 16:27 |
elodilles | #info stable release patches still open for review: https://review.opendev.org/q/project:openstack/releases+is:open+intopic:nova | 16:28 |
elodilles | +1: yoga is going to be unmaintained, so final stable/yoga release should happen ASAP - https://etherpad.opendev.org/p/nova-stable-yoga-eom | 16:28 |
elodilles | actually, i've updated the release patches ^^^ | 16:28 |
elodilles | to contain the recently merged stable backports | 16:28 |
elodilles | and i suggest not waiting anymore | 16:29 |
elodilles | the patches were proposed 1 month ago :( | 16:29 |
elodilles | and we can release *anytime* | 16:29 |
elodilles | (except yoga of course, as that release is the final one) | 16:29 |
elodilles | so release liaisons or PTL pls review o:) | 16:30 |
elodilles | and the usual, last item: | 16:30 |
elodilles | #info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci | 16:30 |
elodilles | that's all from me | 16:30 |
bauzas | ++ | 16:31 |
bauzas | and yeah I've seen the releases patches | 16:31 |
bauzas | fwiw, I'm ok with releasing 2023.1 and 2023.2 since both have the fixes I want | 16:32 |
elodilles | \o/ | 16:32 |
bauzas | Zed is still missing the RPC backports | 16:32 |
bauzas | I'll try to look at them tomorrow | 16:32 |
bauzas | elodilles: ping me again tomorrow | 16:32 |
elodilles | bauzas: ack, thanks in advance! | 16:33 |
bauzas | ok, next is a new topic | 16:36 |
bauzas | #topic vmwareapi 3rd-party CI efforts Highlights | 16:36 |
fwiesel | Shall I? | 16:36 |
bauzas | fwiesel: shoot :) | 16:36 |
fwiesel | #Info Gerrit Event Stream connected to our CI (Argo) with personal user | 16:36 |
fwiesel | So, nothing spectacular. Events go in, they could trigger whatever we want to run later. That part is the trickier one, and people are working on it. | 16:37 |
fwiesel | The most complicated part is the networking, we try to cut it down as much as possible so we test only nova. | 16:38 |
fwiesel | #Info CI user needs a name. According to naming convention: SAP vmwareapi CI. Has risk of confusion, as SAP is not a vendor of the technology. Our preference simply "SAP CI" or "SAP tempest CI". | 16:38 |
fwiesel | So, the naming convention on the 3rd party CI suggests "Company Technology CI" | 16:39 |
bauzas | fwiesel: I'd prefer 'vmwareapi CI run by SAP' | 16:39 |
bauzas | something telling users what exactly has been run | 16:40 |
gibi | +1 on keep mentioning vmwareapi in the name | 16:40 |
fwiesel | That would work. But if we run another CI, say for cinder, then we end up with the same thing as the VMware nsxt CI running cinder | 16:40 |
bauzas | or yeah, "SAP vmwareapi CI" | 16:40 |
bauzas | honestly, I don't wanna bikeshed | 16:41 |
bauzas | I understand that you'd appreciate the SAP name in the CI job name | 16:41 |
fwiesel | Okay then we go with your proposal "vmwareapi CI run by SAP" | 16:41 |
bauzas | but I also want my users to know that it's testing with vmwareapi virt driver | 16:41 |
bauzas | so, please keep both names | 16:41 |
fwiesel | Fine by me. | 16:41 |
bauzas | I'm French and I'm bad at namings :) | 16:41 |
bauzas | but my gut is saying "any words with both SAP and vmwareapi are fine by me" | 16:42 |
fwiesel | Then we agree on "vmwareapi CI run by SAP"? | 16:42 |
fwiesel | #agreed The CI user will be named "vmwareapi CI run by SAP" | 16:43 |
fwiesel | That's from my side. Any questions? | 16:44 |
fwiesel | Well, then I think we are through. | 16:45 |
bauzas | ++ | 16:46 |
bauzas | thanks fwiesel for the report | 16:46 |
bauzas | :) | 16:46 |
bauzas | greatly appreciated | 16:46 |
fwiesel | You're welcome | 16:46 |
bauzas | last topic then | 16:47 |
bauzas | #topic Open discussion | 16:47 |
bauzas | the agenda is empty | 16:47 |
bauzas | anything anyone ? | 16:47 |
sean-k-mooney | nope | 16:48 |
bauzas | cool | 16:48 |
bauzas | thanks all, giving you back 12 mins | 16:48 |
bauzas | #endmeeting | 16:48 |
opendevmeet | Meeting ended Tue Nov 28 16:48:22 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:48 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2023/nova.2023-11-28-16.00.html | 16:48 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2023/nova.2023-11-28-16.00.txt | 16:48 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2023/nova.2023-11-28-16.00.log.html | 16:48 |
elodilles | thanks bauzas o- | 16:49 |
elodilles | o/ | 16:50 |
fwiesel | thanks a lot \o | 16:50 |
melwitt | dansmith: I have noticed (anecdotally) that a large portion of gate failures have been due to guest kernel panics. I have spent some time looking at them and see for example "pci 0000:00:03.6: BAR 13: failed to assign [io size 0x1000]" and "pci 0000:00:03.6: BAR 13: no space for [io size 0x1000]" I'm not sure how to tell which thing caused the panic. and in my search for more info online I haven't found anything other than mention of | 19:41 |
melwitt | old kernel bugs or people saying to run fsck and try again. not sure how to approach this, so if you have any thoughts, it would probably help | 19:41 |
melwitt | it seems weird to me how frequent this is. don't know what that could mean either. in the past guest kernel panics were a lot more rare | 19:42 |
dansmith | orly, well, that might explain the uptick lately... but yeah, I'd lean on kashyap and sean-k-mooney to look at those sorts of things | 19:46 |
dansmith | unfortunately, we've had trouble getting our kernel team to look at those things because we don't really run a supported kernel, but with the recent cirros update, we're closer to a support-able kernel I think | 19:47 |
dansmith | so might be worth trying again | 19:47 |
melwitt | ok, yeah. I checked and we are running the newest kernel that cirros supports. I did wonder if there could be a chance for improvement by using something other than cirros, like to get a different/newer kernel version | 19:49 |
melwitt | or I guess maybe building our own cirros with a newer kernel | 19:50 |
dansmith | I mean, it's possible but cirros is also tailored to work in very small environments, which is important for us | 20:07 |
dansmith | a stock distro kernel (and certainly image) won't likely work well in our super tiny guests | 20:07 |
dansmith | and if we had to increase the size of those, we'd need to make other changes | 20:08 |
dansmith | getting some kernel person to tell us what the crashes mean would be a good first start | 20:08 |
dansmith | like if it's "that should never happen" or "they're already memory constrained and hitting a bug as a result" or whatever | 20:08 |
melwitt | yeah | 20:30 |
opendevreview | Merged openstack/python-novaclient master: Fix typos https://review.opendev.org/c/openstack/python-novaclient/+/901457 | 21:58 |
sean-k-mooney[m] | melwitt: cirros uses the ubuntu lts kernel just in case you didn't know that | 23:44 |
sean-k-mooney[m] | i really should try and make time to get my alpine series working again but honestly haven't had time or energy to do that. | 23:45 |
melwitt | sean-k-mooney[m]: I saw on their site 🙂 but I also see the cirros we're using is also using the nearly newest kernel for jammy already so I guess maybe there's little chance a newer version would make a difference | 23:46 |
sean-k-mooney[m] | yep 6.x is using the 22.04/jammy kernel | 23:47 |
sean-k-mooney[m] | part of the issue, to dan's point, is that the kernel does expect to be deployed in a vm with more ram than we have in our normal guest vms | 23:47 |
melwitt | sean-k-mooney[m]: this is a paste of the kernel panic if you may be able to tell what it means https://pastebin.com/raw/cVDem0Xr | 23:48 |
melwitt | I see | 23:48 |
sean-k-mooney[m] | is that q35 or pc i wonder | 23:51 |
sean-k-mooney[m] | i can't quite tell from that dmesg output | 23:51 |
* melwitt | looks | 23:52 |
sean-k-mooney[m] | its q35 | 23:53 |
sean-k-mooney[m] | pcieport 0000:00:02.0: PME: | 23:53 |
sean-k-mooney[m] | so i wonder if we could mitigate this by plugging fewer pcie ports | 23:53 |
melwitt | it's from this nova-next fail https://0c0c4e33641a15d6198e-9e6a8447d6325ce608287fa851bc53b0.ssl.cf2.rackcdn.com/892800/3/gate/nova-next/4b8b450/testr_results.html | 23:54 |
sean-k-mooney[m] | ya so nova next is where we currently test q35 | 23:55 |
sean-k-mooney[m] | and its the only job that uses q35 | 23:55 |
clarkb | sean-k-mooney[m]: I wasn't aware the linux kernel publishes minimum memory requirements | 23:55 |
clarkb | and I definitely have run linux in less memory than the ci test nodes... | 23:55 |
sean-k-mooney[m] | clarkb: linux does not, canonical does for their images | 23:56 |
sean-k-mooney[m] | and ya puppy linux or tinycore or alpine | 23:56 |
sean-k-mooney[m] | all support much less but i think the lowest canonical supports is 256 | 23:57 |
sean-k-mooney[m] | maybe 128. that's not just for the kernel of course, that's for the full image with user space | 23:57 |
melwitt | it's also happening in other jobs but maybe for different reasons? example https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_fe3/898188/16/check/tempest-integrated-storage/fe33fa7/testr_results.html | 23:57 |
melwitt | that ^ is not q35 | 23:58 |
JayF | We run tinycore for Ironic test VMs, and already have automation for building test images, if that's something you all have interest in. | 23:59 |
JayF | https://opendev.org/openstack/ironic-python-agent-builder/src/branch/master/tinyipa | 23:59 |
sean-k-mooney[m] | https://ubuntu.com/core/docs/system-requirements so 256mb is their minimum for their smallest image | 23:59 |
JayF | Obviously we have to put more in there than you all would, but if you go that route there's no need to stop from zero. | 23:59 |
JayF | s/stop/start/ | 23:59 |