16:00:15 <bauzas> #startmeeting nova
16:00:15 <opendevmeet> Meeting started Tue Feb 13 16:00:15 2024 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:15 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:15 <opendevmeet> The meeting name has been set to 'nova'
16:00:19 <bauzas> hey folks
16:00:35 <auniyal> o/
16:00:38 <dansmith> o/
16:00:49 <bauzas> sorry, these days I'm a bit off from the channel
16:01:03 <bauzas> #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting
16:01:05 <elodilles> o/
16:01:08 <ratailor> o/
16:01:13 <bauzas> let's start
16:01:17 <bauzas> people will arrive
16:01:23 <bauzas> #topic Bugs (stuck/critical)
16:01:27 <bauzas> #info No Critical bug
16:01:31 <bauzas> #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 59 new untriaged bugs (+3 since the last meeting)
16:01:36 <bauzas> #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster
16:01:40 <bauzas> #info bug baton is bauzas
16:01:41 <gibi> o/
16:01:48 <bauzas> (I forgot again to look at the bugs :( )
16:02:03 <bauzas> any bugs you would want to discuss ?
16:02:34 <bauzas> looks not, moving on
16:02:43 <bauzas> #topic Gate status
16:02:46 <fwiesel> \o
16:02:48 <bauzas> #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs
16:02:52 <bauzas> #link https://etherpad.opendev.org/p/nova-ci-failures-minimal
16:02:56 <bauzas> #link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&pipeline=periodic-weekly Nova&Placement periodic jobs status
16:03:00 <bauzas> #info Please look at the gate failures and file a bug report with the gate-failure tag.
16:03:34 <bauzas> so, we have some issue with the c9s job
16:03:40 <bauzas> but we know why :)
16:03:55 <bauzas> (some libvirt regression)
16:04:46 <bauzas> in the next week, the new libvirt release (that fixes the issue) should be in c9s
16:05:14 <bauzas> (hopefully)
16:05:52 <bauzas> any CI failures you would want to discuss ?
16:07:54 <bauzas> looks not
16:08:01 <bauzas> moving on
16:08:17 <bauzas> #topic Release Planning
16:08:21 <bauzas> #link https://releases.openstack.org/caracal/schedule.html#nova
16:08:25 <bauzas> #info Caracal-3 (and feature freeze) milestone in 2 weeks
16:08:29 <bauzas> tick-tock
16:08:38 <bauzas> 2 weeks and a half, tbh
16:09:21 <bauzas> for me, given I was working on my own series for testing it, I'll eventually review by tomorrow
16:09:33 <bauzas> review *other* features, I mean
16:09:57 <bauzas> I was wanting to do this today, but I had another problem that I fixed
16:10:15 <bauzas> so, please look at your gerrit emails tomorrow :)=
16:10:21 <bauzas> #topic Review priorities
16:10:24 <bauzas> #link https://etherpad.opendev.org/p/nova-caracal-status
16:10:30 <bauzas> again, everything is there
16:10:46 <bauzas> nothing to say more about it
16:10:52 <bauzas> #topic Stable Branches
16:10:59 <bauzas> elodilles: heya
16:11:02 <elodilles> o/
16:11:09 <elodilles> #info stable/ussuri transitioned to End of Life
16:11:21 <elodilles> this i forgot to mention last week ^^^
16:11:34 <bauzas> \o/
16:11:35 <elodilles> #info unmaintained/yoga is open for patches
16:11:56 <elodilles> though the gate might still be problematic, but at least the generated patch has merged \o/
16:12:12 <elodilles> #info stable gates don't seem blocked, though deleted stable/yoga can cause problems (e.g. on grenade jobs)
16:12:51 <elodilles> this might be important for 2023.1 (skip level grenade) and zed (grenade) for the coming weeks ^^^
16:13:14 <elodilles> so probably the best is to follow things here:
16:13:18 <elodilles> #info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci
16:13:32 <elodilles> and maybe one more thing to mention:
16:13:49 <elodilles> now that we have unmaintained/yoga
16:14:20 <elodilles> so far only openstack-unmaintained-core has rights for nova's unmaintained/ branch
16:14:31 <elodilles> there are two options here
16:14:41 <elodilles> 1) add people to this group
16:14:55 <elodilles> 2) create a nova-unmaintained-core group
16:15:09 <dansmith> the goal is that most projects go with #1
16:15:11 <elodilles> and populate that one with interested people ^^^
16:15:26 <bauzas> I have no opinion
16:15:34 <elodilles> dansmith: yepp, true, though some projects have already started to create their own group
16:15:46 <dansmith> meaning "nova doesn't really maintain the unmaintained stuff, but community members that are interested in old versions do"
16:16:00 <dansmith> elodilles: right, but see the discussion in -tc right now.. that seems to be a misunderstanding
16:16:02 <elodilles> anyway, this is just a heads up to the team to think about this
16:16:17 <bauzas> but in general, people wanting to "maintain" a project also want to maintain cinder and neutron
16:16:18 <gibi> I don't want to maintain ussuri :)
16:16:20 <elodilles> dansmith: ACK
16:16:22 <dansmith> we're hoping/expecting most projects to effectively let those dry up and only get maintained if there are people around to care
16:16:47 <dansmith> bauzas: right, the idea is someone maintaining ussuri needs to care about nova and neutron in that release, so let them
16:16:56 <elodilles> ACK, then we can keep things as they are :)
16:17:01 <bauzas> tbh, after thinking a bit, I wouldn't want to have a specific group told 'nova-' something
16:17:05 <dansmith> that's certainly my preference
16:17:21 <elodilles> (note, i am member of openstack-unmaintained-core, so either way, i have rights to mess around there)
16:17:22 <bauzas> so no to 2) if the name is 'nova-unmaintained-core'
16:17:29 <dansmith> the idea is that the nova project gets to stop worrying about those old releases
16:17:42 <dansmith> if we create a nova- group, we kinda have to keep caring about it
16:17:45 <bauzas> yeah, that's why I don't want that group to be named 'nova'
16:17:52 <bauzas> dansmith: that's my point
16:17:56 <dansmith> yup
16:18:08 <elodilles> ACK
16:18:28 <elodilles> OK, that's it from my side then about stable :X
16:18:39 <bauzas> anyway, the unmaintained branch is no longer supported by nova
16:19:07 <bauzas> so, people can create groups like they want
16:19:29 <bauzas> but again, I'm fine as long as they don't name those groups after the projects
16:20:13 <bauzas> unmaintained-compute-specialized-group meh to it
16:20:39 <elodilles> no need for a group at all, there is openstack-unmaintained-core group already
16:20:53 <bauzas> then I'm cool
16:21:01 <dansmith> and that's the preference :)
16:21:08 <elodilles> +1
16:21:23 <dansmith> the per-project group override is for people who don't want the new plan, basically
16:21:34 <bauzas> anyway, unmaintained projects are actually now forks
16:21:35 <dansmith> I think this horse is dead
16:21:51 <bauzas> unmaintained branches
16:21:53 <bauzas> *
16:22:24 <bauzas> people can fork as much as they want provided they don't name those branches "nova-something"
16:23:03 <bauzas> anyway, I'm done
16:23:07 <bauzas> moving on ?
16:23:10 <elodilles> +1
16:23:22 <bauzas> #topic vmwareapi 3rd-party CI efforts Highlights
16:23:28 <fwiesel> #info Fixes to CI for various branches. Still missing are versions prior to Zed (which require Ubuntu 20.04)
16:23:29 <bauzas> fwiesel: heya
16:23:51 <bauzas> yoga is now unmaintained :D
16:23:56 <fwiesel> So, I only tested before on master, which didn't translate well to other branches.
16:24:20 <bauzas> so IMHO you shouldn't really care about maintaining 3rd party jobs for Yoga and older branches
16:24:42 <fwiesel> Problem is xena. It is also Ubuntu 20.04.
16:24:58 <fwiesel> Ah, x<z ...
16:25:15 <fwiesel> Great, that was my implicit question. So we are fine just with zed and later?
16:25:22 <elodilles> (and Zed will move to Unmaintained in ~3 months)
16:26:02 <bauzas> fwiesel: that actually depends on what you want to test for yourselves
16:26:32 <bauzas> but if you don't really want to test Xena for your own sake, fwiw, the nova project is fine with not testing vmwareapi on that branch
16:26:44 <fwiesel> Well, we want to move away from Xena ourselves. My main reason for testing older versions was trying to bisect the bug I was mentioning.
16:26:54 <fwiesel> #info Debugging local root disk (Raised bug: https://bugs.launchpad.net/nova/+bug/2053027)
16:27:21 <fwiesel> That took most of my time. It works for us on our heavily patched xena, I hoped to establish a base-line.
16:28:01 <fwiesel> At least from what I can gather, it is a rather strange one, and probably is too much for the summary in this meeting.
16:28:52 <fwiesel> I am almost of a mind to rip out the code for image upload and replace it with the one used in cinder (incidentally the one in oslo.vmware). Any strong feelings on this one?
16:28:53 <bauzas> fwiesel: from your report, it sounds to me this is a glance issue
16:28:59 <fwiesel> Well, cinder works.
16:29:09 <fwiesel> With the same glance :)
16:29:11 <bauzas> so that's a client issue
16:29:36 <fwiesel> It all runs in the same venv, so the code for all the libraries are the same.
16:31:24 <fwiesel> The debug output for the requests from nova and cinder against glance is almost the same. Cinder also passes on the X-Service-Token to glance, while nova doesn't.
16:31:38 <dansmith> not sure how it's a glance thing
16:31:52 <fwiesel> Not that I believe that to be the error, just for completeness.
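(Editor's note: for context on the X-Service-Token difference mentioned above, this is a sketch of the standard nova.conf `[service_user]` configuration that makes nova send a service token alongside the user token on requests to glance and other services. The auth values below are illustrative placeholders, not taken from the deployment under discussion.)

```ini
[service_user]
# When true, nova attaches an X-Service-Token header to outgoing
# requests (e.g. to glance), in addition to the user's token.
send_service_user_token = true
auth_type = password
# Placeholder endpoint/credentials; deployment-specific.
auth_url = http://keystone.example.org/identity
username = nova
password = SERVICE_PASSWORD
user_domain_name = Default
project_name = service
project_domain_name = Default
```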
16:32:11 <bauzas> dansmith: right my bad, it's a client thing
16:32:20 <bauzas> but the OSError sounds a permission issue
16:32:23 <fwiesel> Either way, probably not something that can be solved in a couple of minutes in this meeting, I presume
16:32:54 <fwiesel> You get an "OSError" because the client (i.e. nova) closes the connections, so glance cannot write to the socket anymore.
16:33:32 <dansmith> fwiesel: not sure that makes sense either :)
16:34:09 <fwiesel> Yeah, either way. I would suggest to leave the discussion on the details of the bug for after the meeting.
16:34:11 <dansmith> I'll have to look more at the logs after the meeting, there's not really enough meat in the bug to really see I think
16:34:17 <bauzas> I don't really know the workflow that's implied by this oslo.vmware call
16:35:07 <bauzas> oh found it
16:35:09 <bauzas> http://openstack-ci-logs.global.cloud.sap/openstack/nova/35af4b345d997b63f999a090e236d91b78ea4304/n-cpu-1.service.log
16:35:47 <bauzas> that's when getting the image from glance
16:36:27 <sean-k-mooney> right, so that should be over HTTP via the glance API
16:36:36 <sean-k-mooney> even if it's on NFS as a storage backend
16:37:06 <bauzas> I don't actually indeed see why we need to call oslo.vmware
16:37:34 <sean-k-mooney> well for the volume creation we would just ask cinder to create the volume from the image right
16:38:04 <fwiesel> That happens for the boot from volume case. And that works. But we also have the boot from "local" disk, and then nova needs to pull the image.
16:38:11 <sean-k-mooney> we would only need to call into oslo.vmware when asking vmware to create the instance using the cinder volume
16:38:22 <bauzas> yeah their problem is with the standard boot from image to disk
16:38:38 <sean-k-mooney> ok but we don't have an agent running on the ESXi host
16:38:46 <sean-k-mooney> or vSphere node
16:39:06 <sean-k-mooney> we are asking vSphere to download it from glance right
16:39:11 <bauzas> apparently they do some caching on the downloaded image
16:39:22 <bauzas> so this is indeed not a glance communication problem
16:39:42 <dansmith> can we discuss after the meeting?
16:39:48 <bauzas> from what I can understand from the stacktrace, the image is downloaded but then you call oslo.vmware to cache that image
16:39:54 <bauzas> dansmith: good point
16:40:01 <bauzas> the next topic should be empty
16:40:27 <bauzas> fwiesel: are you done with your points ? we'll continue troubleshooting right after the meeting
16:40:47 <bauzas> apparently so
16:40:48 <fwiesel> I am done. Over to you
16:40:51 <bauzas> #topic Open discussion
16:40:55 <bauzas> .
16:41:00 <bauzas> nothing in the agenda
16:41:04 <bauzas> anything anyone ?
16:41:20 <sean-k-mooney> did you want to chat about the state of the vGPU series
16:41:36 <bauzas> sean-k-mooney: well, I eventually was able to live-migrate
16:41:39 <sean-k-mooney> or wait till next week when you have done some more testing
16:41:57 <bauzas> I know now the reason why we need some very large downtime option
16:42:29 <sean-k-mooney> we have https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_downtime
16:42:37 <sean-k-mooney> is that enough to configure it
16:42:41 <bauzas> but again, I think I'm done, I'm currently working on providing a asciinema stuff
16:42:50 <bauzas> so people will see it
16:43:03 <bauzas> sean-k-mooney: correct, that and the two other options
16:43:09 <sean-k-mooney> ack
16:43:24 <sean-k-mooney> i just wanted to confirm that the existing config options and the code you have for review are enough
16:44:20 <sean-k-mooney> assuming yes we can proceed on gerrit
16:44:21 <bauzas> yeah, so I'll modify https://review.opendev.org/c/openstack/nova/+/904258/13/doc/source/admin/virtual-gpu.rst
16:44:42 <bauzas> to explain that people will need to use a large max downtime
16:44:54 <sean-k-mooney> ack
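(Editor's note: the option linked above and "the two other options" bauzas refers to are the three `[libvirt]` downtime knobs in nova.conf. A sketch of what a "very large downtime" configuration could look like; the values are illustrative, not recommendations from the meeting.)

```ini
[libvirt]
# Maximum permitted downtime, in milliseconds, for the final cutover
# of a live migration. The default (500 ms) is far too small for
# vGPU-heavy guests; a large value lets the migration converge.
live_migration_downtime = 5000
# Number of incremental steps used to reach the maximum downtime.
live_migration_downtime_steps = 10
# Time to wait, in seconds per GiB of guest RAM, between each step increase.
live_migration_downtime_delay = 75
```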
16:45:07 <bauzas> anyway, I'm done now
16:45:13 <bauzas> any other things ?
16:45:41 <sean-k-mooney> nope, just wanted to confirm that before i review the rest of your series
16:45:52 <bauzas> sean-k-mooney: no worries
16:45:58 <bauzas> so, thanks all
16:46:07 <fwiesel> Thanks a lot.
16:46:08 <bauzas> #endmeeting