opendevreview | OpenStack Proposal Bot proposed openstack/nova master: Imported Translations from Zanata https://review.opendev.org/c/openstack/nova/+/903427 | 02:33 |
opendevreview | alisafari proposed openstack/nova stable/2023.2: Fix traits to cpu flags mapping https://review.opendev.org/c/openstack/nova/+/903443 | 05:26 |
opendevreview | alisafari proposed openstack/nova stable/2023.1: Fix traits to cpu flags mapping https://review.opendev.org/c/openstack/nova/+/903444 | 05:27 |
opendevreview | Alex Welsh proposed openstack/nova master: Make map_cell0 command update existing mappings https://review.opendev.org/c/openstack/nova/+/903140 | 09:42 |
opendevreview | Alex Welsh proposed openstack/nova master: Make map_cell0 command update existing mappings https://review.opendev.org/c/openstack/nova/+/903140 | 10:13 |
opendevreview | Alex Welsh proposed openstack/nova master: Make map_cell0 command update existing mappings https://review.opendev.org/c/openstack/nova/+/903140 | 13:47 |
bauzas | sean-k-mooney: could I ask a gentle lift for https://review.opendev.org/c/openstack/nova-specs/+/900636 ? | 14:05 |
opendevreview | Alex Welsh proposed openstack/nova master: Make map_cell0 command update existing mappings https://review.opendev.org/c/openstack/nova/+/903140 | 14:17 |
opendevreview | Stephen Finucane proposed openstack/nova master: docs: Revamp the security groups guide https://review.opendev.org/c/openstack/nova/+/903507 | 14:36 |
stephenfin | Nice doc improvement there (IMO) based on some testing I did recently ^ | 14:40 |
opendevreview | Merged openstack/nova master: Support setting alias on libvirt disks https://review.opendev.org/c/openstack/nova/+/892800 | 14:43 |
opendevreview | Merged openstack/nova master: Set libvirt device alias for volumes https://review.opendev.org/c/openstack/nova/+/892801 | 14:53 |
opendevreview | Merged openstack/nova master: Detach disks using alias when possible https://review.opendev.org/c/openstack/nova/+/893068 | 14:54 |
bauzas | gibi: sean-k-mooney: like I said above, could you please swing my nova spec for gpu live-migration ? https://review.opendev.org/c/openstack/nova-specs/+/900636 | 15:10 |
opendevreview | Takashi Kajinami proposed openstack/nova master: Remove deprecated [api] use_forwarded_for https://review.opendev.org/c/openstack/nova/+/903339 | 15:13 |
bauzas | reminder : nova meeting in 20 mins | 15:40 |
bauzas | *her | 15:40 |
gibi | I will need to disappear 17:30 CET | 15:50 |
bauzas | ack | 15:59 |
bauzas | #startmeeting nova | 16:01 |
opendevmeet | Meeting started Tue Dec 12 16:01:15 2023 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:01 |
opendevmeet | The meeting name has been set to 'nova' | 16:01 |
bauzas | sorry folks for the delay, I had to write the agenda :D | 16:01 |
bauzas | #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting | 16:01 |
bauzas | who's around ? | 16:01 |
fwiesel | o/ | 16:01 |
grandchild | o/ | 16:01 |
JayF | o/ | 16:02 |
bauzas | let's slowly start | 16:04 |
bauzas | hopefully people will join | 16:04 |
bauzas | #topic Bugs (stuck/critical) | 16:04 |
bauzas | #info No Critical bug | 16:05 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 41 new untriaged bugs (+3 since the last meeting) | 16:05 |
bauzas | not sure anyone had time to look at bugs this week | 16:05 |
bauzas | #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster | 16:05 |
bauzas | melwitt is on the next round for the bug baton but she's off | 16:05 |
bauzas | I'll ask her later iirc | 16:06 |
bauzas | anything about bugs ? | 16:06 |
bauzas | looks not | 16:06 |
bauzas | moving on | 16:06 |
bauzas | #topic Gate status | 16:06 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs | 16:06 |
bauzas | looks like the gate is more stable today | 16:06 |
* gibi had no time to look at bugs sorry | 16:07 | |
bauzas | I was able to recheck a few changes without issues | 16:07 |
bauzas | #link https://etherpad.opendev.org/p/nova-ci-failures-minimal | 16:07 |
bauzas | most of the patches from https://review.opendev.org/q/topic:%22nova-ci-improvements%22 are now merged | 16:08 |
JayF | I'll note that Ironic's gate was broken with some of the recent Ironic<>Nova driver changes; there's a small fix in the gate now we've been trying to merge since yesterday. I don't think there's an action for Nova team as it was quickly approved and I'm rechecking. | 16:08 |
bauzas | JayF: ack thanks | 16:09 |
bauzas | #link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&pipeline=periodic-weekly Nova&Placement periodic jobs status | 16:09 |
bauzas | all greens for master | 16:09 |
bauzas | nova-emulation continues to fail on stable/zed, but that's not a problem | 16:09 |
bauzas | #info Please look at the gate failures and file a bug report with the gate-failure tag. | 16:10 |
bauzas | anything about the gate ? | 16:10 |
bauzas | looks not | 16:10 |
bauzas | #topic Release Planning | 16:10 |
bauzas | #link https://releases.openstack.org/caracal/schedule.html#nova | 16:10 |
bauzas | #info Caracal-2 (and spec freeze) milestone in 4 weeks | 16:10 |
bauzas | time flies | 16:10 |
bauzas | last week, we had a good spec review day with 3 specs merged | 16:10 |
bauzas | but I beg cores here to look at the other specs :) | 16:11 |
bauzas | fwiw, I'll do my duty | 16:11 |
bauzas | https://review.opendev.org/q/project:openstack/nova-specs+is:open+file:%5Especs/2024.1/.* is the list of open specs for 2024.1 | 16:12 |
bauzas | that's it for me on the release cadence | 16:13 |
bauzas | nothing else really important | 16:13 |
bauzas | moving on | 16:13 |
bauzas | #topic Review priorities | 16:13 |
Uggla | o/ | 16:13 |
bauzas | one important thing | 16:13 |
bauzas | #link https://etherpad.opendev.org/p/nova-caracal-status | 16:13 |
bauzas | I updated the etherpad | 16:13 |
bauzas | #info please use and reuse this etherpad by looking at both the specs and the bugfixes | 16:14 |
sean-k-mooney | do we want to add a fixed/merged section in that | 16:14 |
bauzas | sean-k-mooney: we have it | 16:14 |
bauzas | but not for the bugfixes | 16:14 |
bauzas | I can add it | 16:14 |
sean-k-mooney | ya we have a feature-complete section | 16:15 |
sean-k-mooney | basically it might be a nice reference for the prologue or release summary | 16:15 |
bauzas | yes | 16:15 |
bauzas | anyway, moving on | 16:16 |
bauzas | #topic Stable Branches | 16:16 |
bauzas | elodilles_pto: oh, he's on PTO | 16:16 |
bauzas | #info stable gates don't seem blocked | 16:16 |
bauzas | #info stable release patches still open for review: https://review.opendev.org/q/project:openstack/releases+is:open+intopic:nova | 16:17 |
bauzas | #info yoga is going to be unmaintained, so final stable/yoga release should happen ASAP - https://etherpad.opendev.org/p/nova-stable-yoga-eom | 16:17 |
bauzas | also, I'll add my own point | 16:17 |
bauzas | #link Yoga EOL change https://review.opendev.org/c/openstack/releases/+/903278 | 16:18 |
bauzas | folks, if you want to wait to merge the EOL change until some other change lands, please say so above ^ | 16:19 |
bauzas | for me, I already +1d this EOL change | 16:19 |
bauzas | oh shit | 16:19 |
JayF | I'll note that's the Ussuri EOL change if you wanna fix the minutes | 16:19 |
bauzas | #undo | 16:19 |
opendevmeet | Removing item from minutes: #link https://review.opendev.org/c/openstack/releases/+/903278 | 16:19 |
JayF | jinx :) | 16:19 |
bauzas | #link *Ussuri* EOL change https://review.opendev.org/c/openstack/releases/+/903278 | 16:19 |
bauzas | voila | 16:20 |
bauzas | for Yoga, that's for EM | 16:20 |
bauzas | #info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci | 16:20 |
bauzas | that's it I guess for the stable topic | 16:20 |
bauzas | anything else for stable branches ? | 16:20 |
sean-k-mooney | one thing | 16:21 |
sean-k-mooney | depending on how the new images work on master | 16:21 |
sean-k-mooney | we may want to consider backporting those changes to stable branches if we see | 16:21 |
sean-k-mooney | kernel panics there | 16:21 |
sean-k-mooney | so we should revisit this topic in a few weeks | 16:21 |
sean-k-mooney | nothing to do now | 16:21 |
bauzas | we should wait a bit until we backport them | 16:22 |
bauzas | but sure | 16:22 |
bauzas | looks to me like the gate is better now | 16:22 |
bauzas | ok, then moving to the next topic | 16:23 |
bauzas | #topic vmwareapi 3rd-party CI efforts Highlights | 16:23 |
fwiesel | #Info Fully working devstack in lab environment manually set up. Now working on automatic setup / teardown for CI | 16:23 |
bauzas | fwiesel: grandchild: anything you want to tell ? | 16:23 |
bauzas | fwiesel: ++ | 16:23 |
bauzas | thanks ! | 16:23 |
fwiesel | So, "fully working" meaning, we can spin up an instance and it has network, etc..., but.... | 16:23 |
fwiesel | #Info Initial test runs show vmwareapi is broken (only boot from volume works, not nova boot), bugfixes will come after working CI | 16:23 |
bauzas | ahah | 16:24 |
fwiesel | We cannot simply boot from an image | 16:24 |
bauzas | so that's why we need a 3rd party CI :) | 16:24 |
bauzas | looks like we regressed at some point | 16:24 |
fwiesel | So probably the CI needs to be non-voting in the beginning, I assume | 16:24 |
bauzas | fwiesel: will you be able to provide some bugfixes ? have you found the root cause ? | 16:24 |
bauzas | fwiesel: oh yeah, definitely | 16:25 |
bauzas | we'll run the job as non-voting first | 16:25 |
fwiesel | bauzas: I haven't looked at the root cause yet, as I thought a working CI has priority. Then we tackle the bugs one by one. | 16:25 |
bauzas | fwiesel: cool, no worries | 16:25 |
sean-k-mooney | fwiesel: third party ci cannot be voting | 16:25 |
bauzas | fwiesel: in case you need help, we can discuss those in your topic in the next weeks | 16:25 |
fwiesel | bauzas: But yes, we will be able to provide the bug fixes. It should not be terribly difficult | 16:25 |
sean-k-mooney | you can leave code review +1 or -1 | 16:25 |
sean-k-mooney | but never verified +1 or -1 | 16:26 |
sean-k-mooney | or at least in a way that will prevent a patch merging | 16:26 |
fwiesel | sean-k-mooney: Ah, thanks for the explanation. Then I got the terminology wrong. | 16:26 |
sean-k-mooney | no worries | 16:26 |
bauzas | sean-k-mooney: I think we had 3rd-party CI jobs voting before ? (with zuul 2) | 16:26 |
sean-k-mooney | no | 16:26 |
sean-k-mooney | never | 16:26 |
sean-k-mooney | we discussed it in the past and said that was not ok | 16:27 |
sean-k-mooney | as we can't have the gate blocked by a third party ci | 16:27 |
sean-k-mooney | if we are reviewing a vmware patch and it breaks the vmware ci | 16:27 |
fwiesel | I meant, it probably shouldn't even leave a -1 in the beginning | 16:27 |
sean-k-mooney | we are very unlikely to merge it | 16:27 |
sean-k-mooney | but that was left to cores to judge | 16:27 |
clarkb | you can have a third party CI -1 and +1 without blocking anything. Only -2 blocks | 16:27 |
bauzas | okay, maybe this was in 2015 or earlier, but IIRC we had a job that was testing the DB time when upgrading and it was a 3rd-party CI | 16:28 |
sean-k-mooney | clarkb: we can yes | 16:28 |
sean-k-mooney | since the gate only looks at Verified from zuul | 16:28 |
bauzas | but maybe it never voted, can't exactly remember the details | 16:28 |
sean-k-mooney | bauzas: as far as i am aware we have never had third party voting ci and | 16:29 |
sean-k-mooney | i am not sure i want to change that in the future | 16:29 |
bauzas | anyway, this is not a problem | 16:29 |
sean-k-mooney | sure we just need to see the logs and if it passed or failed | 16:29 |
bauzas | let's see what fwiesel and grandchild can do with their CI and what they can provide for regression bugfixes | 16:29 |
fwiesel | That's from my side. Any questions? | 16:30 |
sean-k-mooney | yep | 16:30 |
sean-k-mooney | fwiesel: just one thing | 16:30 |
bauzas | fwiw, I'm okay with checking some link every week during our meeting to see how many job runs failed | 16:30 |
sean-k-mooney | you said local images dont work | 16:30 |
sean-k-mooney | did you make sure to use vmdks | 16:30 |
sean-k-mooney | instead of qcow | 16:30 |
bauzas | so even if we don't make them voting, we could continue to check those jobs continue to work | 16:30 |
fwiesel | sean-k-mooney: Sure, we only run with vmdks. | 16:31 |
sean-k-mooney | ok i was wondering if it was a simple format issue | 16:31 |
sean-k-mooney | feel free to file a bug with details when you have time | 16:31 |
bauzas | fwiesel: do you know we changed the VMDK types ? | 16:31 |
sean-k-mooney | oh we blocked one of the types right | 16:32 |
bauzas | https://bugs.launchpad.net/nova/+bug/1996188 for the context | 16:32 |
fwiesel | bauzas: No, I don't. Thanks for the info | 16:32 |
bauzas | so you now need to pass an allowed list of vmdk types | 16:32 |
fwiesel | Ah, no. That one is fine... The same check is in cinder, and it works with boot from volume | 16:32 |
bauzas | maybe this is the root cause, maybe not | 16:32 |
bauzas | fwiesel: nova has its own config option | 16:33 |
bauzas | https://review.opendev.org/c/openstack/nova/+/871612 | 16:33 |
fwiesel | Sure, but we use the image subformat that isn't blocked | 16:33 |
bauzas | cool | 16:33 |
fwiesel | (streamOptimized or something) | 16:33 |
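For context on the allowed-list being discussed: since the VMDK image-processing security fix, nova only accepts a configurable set of VMDK subformats. A sketch of the relevant nova.conf fragment (option name and defaults as I recall them from that fix; verify against the release notes for your branch):

```ini
[compute]
# Only VMDK images whose subformat ("create type") is in this list are
# accepted; streamOptimized is the subformat fwiesel mentions above.
vmdk_allowed_types = streamOptimized,monolithicSparse
```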
bauzas | was more a fyi, just in case | 16:34 |
bauzas | the vmdk bug hit me before :) | 16:34 |
bauzas | and when we had it, no one was around to tell us whether it was a problem for vmwareapi :) | 16:34 |
bauzas | anyway | 16:34 |
bauzas | shouldn't be a problem | 16:35 |
bauzas | fwiesel: thanks for the report, greatly appreciated | 16:35 |
fwiesel | you're welcome | 16:35 |
bauzas | fwiesel: someone also freaked out in the mailing list | 16:35 |
bauzas | I haven't replied but you could | 16:35 |
fwiesel | bauzas: Good idea, I will | 16:35 |
fwiesel | bauzas: openstack-discuss? | 16:36 |
bauzas | oh sean-k-mooney did | 16:36 |
bauzas | fwiesel: yup https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/message/DLTQ2KHQPFD4S7LLYKTWUPNFXRSTRTHU/ | 16:36 |
sean-k-mooney | ya i just said what we discussed before | 16:36 |
sean-k-mooney | i.e. we won't remove anything until at least m2 depending on the ci status | 16:37 |
sean-k-mooney | and advised that while people can deprecate support | 16:37 |
bauzas | fwiw, I see good progress | 16:37 |
sean-k-mooney | they should not remove it | 16:37 |
bauzas | but let's continue to discuss it every week | 16:37 |
sean-k-mooney | yep | 16:37 |
bauzas | anyway, good talks | 16:37 |
bauzas | fwiw, I also advertised in the previous OpenInfraLive episode about what we agreed at the PTG and I explained verbally our current state, which is good for telling 'if you care about something, come tell us' | 16:38 |
bauzas | taking vmwareapi non-removal as an example of a working effort | 16:39 |
bauzas | anyway, time flies | 16:39 |
bauzas | #topic Open discussion | 16:39 |
bauzas | we have two topics | 16:39 |
bauzas | that we punted from last week | 16:40 |
bauzas | (artom) Specless blueprint for persistent mdevs | 16:40 |
bauzas | https://blueprints.launchpad.net/nova/+spec/persistent-mdevs | 16:40 |
bauzas | artom: around ? | 16:40 |
artom | Heya | 16:40 |
artom | To basically yeah - from memory, this would be limited to the libvirt driver, the idea is to persist mdevs when an instance is booted that uses an mdev | 16:41 |
bauzas | I'm maybe opinionated, so I won't really say a lot, but I think this is a simple specless feature that only touches how our virt driver is creating a mdev | 16:41 |
artom | So that in case of a host reboot, the instances can come back without operator intervention | 16:41 |
artom | There might be operator intervention necessary to manually clean mdevs in certain cases | 16:42 |
bauzas | I don't see any upgrade concerns about it, the fact is that we will start persisting mdevs upon reboot by every compute that's upgraded | 16:42 |
artom | Because the mdevs would outlive their instances and host reboots | 16:42 |
artom | So for instance, changing the enabled mdev types (after draining the host), the operator would need to clean up the old mdevs | 16:42 |
sean-k-mooney | one thing you need to be careful of | 16:42 |
bauzas | artom: surely this will require a release note and some upstream docs, but it doesn't require adding a new DB model or anything RPC-related | 16:42 |
gibi | as draining the host will not remove mdevs? | 16:42 |
sean-k-mooney | is on restarting nova to a version with this support | 16:43 |
sean-k-mooney | if we have vms using mdevs created via sysfs | 16:43 |
sean-k-mooney | we need to support creating the libvirt nodedev object to persist them | 16:43 |
bauzas | sean-k-mooney: I see the upgrade path for persisting the mdevs to restart the instances | 16:44 |
sean-k-mooney | i.e. we need to support upgrade in place without restarts | 16:44 |
sean-k-mooney | why? | 16:44 |
sean-k-mooney | we should not need vm downtime or move operations | 16:44 |
bauzas | oh my bad, I was wrong | 16:44 |
bauzas | the mdev would already be created | 16:44 |
sean-k-mooney | yep | 16:44 |
sean-k-mooney | and in use by the vm | 16:44 |
sean-k-mooney | we just need to create the mdev with the same uuid in the libvirt api | 16:45 |
sean-k-mooney | to have it persisted | 16:45 |
bauzas | so, I think this feature requires some upgrade doc that explains how to persist the mdev | 16:45 |
sean-k-mooney | well nova can do it | 16:45 |
bauzas | I mean, an admin doc | 16:45 |
sean-k-mooney | but we can have an upgrade doc to cover how this works | 16:45 |
sean-k-mooney | i would hope it's literally just update the nova-compute binary and restart the compute agent | 16:46 |
sean-k-mooney | no other upgrade impact | 16:46 |
bauzas | sean-k-mooney: do you think of a nova-compute startup method that would check every single mdev and would persist it ? | 16:46 |
sean-k-mooney | init host can reconcile what we expect based on the current xmls | 16:46 |
sean-k-mooney | ya | 16:46 |
sean-k-mooney | that is what i was thinking | 16:46 |
bauzas | that's an additional effort, sure, still specless I think | 16:46 |
sean-k-mooney | we can defer that to implementation review if we like | 16:46 |
sean-k-mooney | but i would like to see upgrade in place support with or without a hard reboot i guess | 16:47 |
bauzas | sounds acceptable to me | 16:47 |
bauzas | we already have a broken method that's run on init_host | 16:47 |
bauzas | we could amend it to persist the mdev instead | 16:47 |
bauzas | (ie. delete and recreate it using libvirt API with the same uuid) | 16:48 |
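The recreate-with-the-same-uuid step discussed above can be sketched against the libvirt nodedev API. This is a minimal illustration, not the eventual nova implementation; it assumes a libvirt new enough to expose nodeDeviceDefineXML (>= 7.3.0), and only the XML-building part is shown runnable here:

```python
# Sketch of persisting a transient mdev via the libvirt nodedev API.
# The device XML shape follows the libvirt mdev documentation; parent
# device and type names below are illustrative.

def build_mdev_nodedev_xml(parent_dev: str, mdev_type: str, uuid: str) -> str:
    """Build the nodedev XML for a mediated device with a fixed UUID."""
    return (
        "<device>\n"
        f"  <parent>{parent_dev}</parent>\n"
        "  <capability type='mdev'>\n"
        f"    <type id='{mdev_type}'/>\n"
        f"    <uuid>{uuid}</uuid>\n"
        "  </capability>\n"
        "</device>\n"
    )

# With libvirt-python against libvirt >= 7.3.0, the already-running
# transient mdev could then be persisted (not executed in this sketch):
#   dev = conn.nodeDeviceDefineXML(build_mdev_nodedev_xml(...))
```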
bauzas | artom: does that sound reasonable to you ? | 16:48 |
artom | Sorry, reading back, in multiple places at the same time | 16:49 |
bauzas | IRC meetings are good, but people do many things at the same time :) | 16:49 |
bauzas | I wish we could be more in sync somehow :) | 16:49 |
bauzas | given we only have 10 mins left and another procedural approval to do | 16:50 |
bauzas | lemme summarize | 16:50 |
bauzas | 1/ the feature will ensure that every new mdev is created using the libvirt API | 16:51 |
bauzas | 2/ at compute restart, the implementation will check every mdev that's created and will recreate it using the libvirt API | 16:51 |
artom | OK, yeah, I think that makes sense, though the mechanics of persisting existing transient mdevs are less obvious to me at this time | 16:52 |
bauzas | 3/ documentation will address the fact that the operator needs some cleanup (unpersisting the mdevs) in case they want to change the type, in addition to the fact that they need to drain vgpu instances from that host | 16:52 |
sean-k-mooney | i would hope it's just generating an xml and asking libvirt to create it | 16:52 |
bauzas | artom: don't be afraid, I exactly see what, where and how to do it | 16:52 |
sean-k-mooney | it should see that it already exists and i hope just write the mdevctl file | 16:52 |
sean-k-mooney | we can do this at the end of the series | 16:53 |
sean-k-mooney | to not block the overall feature | 16:53 |
bauzas | based on those 3 bullet points, I don't see anything that requires a spec | 16:53 |
bauzas | anyone disagreeing ? | 16:53 |
bauzas | looks not | 16:53 |
bauzas | and as a reminder, a specless approval isn't a blank check about design | 16:54 |
bauzas | if something controversial comes up, we could revisit that and ask for a spec | 16:54 |
bauzas | based on that | 16:54 |
gibi | looks OK to me | 16:54 |
bauzas | #agreed https://blueprints.launchpad.net/nova/+spec/persistent-mdevs accepted as a specless blueprint | 16:54 |
gibi | I'm not clear about when manual cleanup is needed but we can discuss that in the review | 16:55 |
bauzas | #action artom to amend the blueprint description to note what we agreed | 16:55 |
bauzas | moving on | 16:55 |
bauzas | I really want the last item to be discussed | 16:55 |
bauzas | (JayF/johnthetubaguy) Specless blueprint for ironic guest metadata | 16:55 |
JayF | o/ | 16:55 |
bauzas | JayF: 'sup ? | 16:55 |
JayF | I am unsure if John will be here, but I am. | 16:55 |
bauzas | https://blueprints.launchpad.net/nova/+spec/ironic-guest-metadata | 16:55 |
bauzas | shoot | 16:55 |
sean-k-mooney | reading it quickly it looks reasonable but i would use flavor uuid instead of name or both | 16:56 |
bauzas | ditto here | 16:56 |
JayF | Essentially, libvirt instances get a large amount of useful metadata that Ironic would like to get as well for various uses -- the primary case that drove us to this was implementing Ironic's "automatic_lessee" support, allowing Ironic to use the project that provisioned an instance some RBAC access to it | 16:57 |
JayF | but generally many of those metadata items map to previous feature requests / things that operators have asked for in node instance_info for troubleshooting in the past (like flavor) | 16:57 |
bauzas | I only care about the upgrade path | 16:57 |
JayF | so it seemed like a good fit/easy win | 16:57 |
sean-k-mooney | so i guess my request would be: can we update the description with the full list of things we want to be set | 16:57 |
bauzas | would we need some interim period to ensure all computes report that metadata ? | 16:58 |
JayF | Essentially this is just additional metadata you'd set on deploy | 16:58 |
JayF | it'd be Ironic's job to do the right thing if it's set/not set | 16:58 |
sean-k-mooney | so this is setting metadata on the ironic nodes right | 16:58 |
sean-k-mooney | not the compute nodes | 16:58 |
JayF | from a Nova standpoint, it should be 100% backwards compatible | 16:58 |
sean-k-mooney | as in when an instance is scheduled to a node | 16:58 |
sean-k-mooney | so for upgrade i guess we could have a nova-manage command | 16:58 |
sean-k-mooney | to set it for existing instances | 16:58 |
JayF | oh, you mean for backfilling instance metadata, I understand | 16:59 |
sean-k-mooney | yep | 16:59 |
JayF | that's a case I hadn't even considered! | 16:59 |
sean-k-mooney | so i assume this would normally only be set on spawn | 16:59 |
sean-k-mooney | since ironic does not support resize | 16:59 |
sean-k-mooney | well spawn or rebuild/evacuate | 16:59 |
JayF | Alright, looks like I have two actions: 1) List all the specific fields and 2) add details about migration path for preexisting instances and how/if they get metadata | 17:00 |
JayF | sean-k-mooney: we do rebuild, very common use case | 17:00 |
sean-k-mooney | ya so we would want to update it on rebuild right | 17:00 |
sean-k-mooney | so 3 actions. list the data to set, list when it will be set | 17:00 |
bauzas | sounds then an implementation detail to me | 17:00 |
sean-k-mooney | and then if we want to have a nova-manage command to backfill then detail that too | 17:00 |
JayF | yeah, but one I don't mind having enumerated in the blueprint to ensure I don't miss it | 17:00 |
bauzas | nova-manage what ? fill my ironic stuff for that instance ? | 17:01 |
sean-k-mooney | bauzas: ya set the metadata on the corresponding ironic node for an existing instance | 17:02 |
bauzas | couldn't it be some ironic script that would gather the details from the nova API ? | 17:02 |
sean-k-mooney | that should be pretty simple to do, it's just an ironic api call | 17:02 |
sean-k-mooney | it could but that feels like a worse solution to me | 17:02 |
JayF | So I'll note | 17:02 |
bauzas | sean-k-mooney: I'm not 100% happy with a very specific virt driver method in our nova-manage command | 17:02 |
JayF | From an Ironic standpoint, having the data backfilled is not super awesome | 17:03 |
sean-k-mooney | im pretty sure it would not be the first | 17:03 |
JayF | we aren't going to do much with it | 17:03 |
bauzas | this would require nova-manage to be able to speak the ironic language | 17:03 |
sean-k-mooney | but this feels like the volume refresh commands to me | 17:03 |
sean-k-mooney | bauzas: it already can talk to ironic | 17:03 |
JayF | so while I'm happy to implement it, and I'm sure someone would find a use for it, I don't think it's in the primary path for enabling the sorta features we want | 17:03 |
sean-k-mooney | so i dont really mind where it ends up i guess but i think it would be nice to have | 17:03 |
bauzas | sean-k-mooney: creds and all the like are set in nova.conf, which nova-manage reads ? | 17:03 |
JayF | bauzas: you already have creds on nova-computes to do the calls you need | 17:04 |
JayF | PATCH calls to /v1/node/{UUID} | 17:04 |
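The node update JayF mentions is a JSON Patch request (in the current Bare Metal API the path is /v1/nodes/{node_uuid}). A sketch of the patch body such a backfill script might send; the key names under instance_info are purely hypothetical illustrations, not what the blueprint will define:

```python
# Sketch of a JSON Patch body for PATCH /v1/nodes/{node_uuid}.
# instance_info key names here are assumptions for illustration only.

def build_backfill_patch(flavor_id: str, flavor_name: str, project_id: str) -> list:
    """Build a JSON Patch document backfilling instance metadata on a node."""
    return [
        {"op": "add", "path": "/instance_info/nova_flavor_id", "value": flavor_id},
        {"op": "add", "path": "/instance_info/nova_flavor_name", "value": flavor_name},
        {"op": "add", "path": "/instance_info/project_id", "value": project_id},
    ]
```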
bauzas | JayF: nova-manage isn't meant to be run on nova computes | 17:04 |
JayF | ack | 17:04 |
sean-k-mooney | bauzas: well it's normally run on the controller which often by accident is where nova-compute with ironic runs | 17:05 |
bauzas | despite we shipped that sail with the volume attach command :) | 17:05 |
sean-k-mooney | but we cant assume it will be colocated | 17:05 |
bauzas | did I say " we shipped that sail" ? oh gosh, I'm tired | 17:06 |
bauzas | anyway | 17:06 |
bauzas | sounds there is kind of a grey path about what we would do for non-greenfield instances | 17:06 |
JayF | I like the idea of Ironic owning the migration script | 17:07 |
bauzas | time is flying tho and we're *again* late | 17:07 |
JayF | and will think about it further and likely propose that | 17:07 |
sean-k-mooney | we should figure out a solution for existing instances but we dont need to do that now | 17:07 |
bauzas | (I'm really sorry about it) | 17:07 |
bauzas | JayF: what would be your preference ? | 17:07 |
sean-k-mooney | JayF: do you want to think about that for a few days and let us know what you think is the best approach | 17:07 |
bauzas | approving the blueprint with a note saying "this is only a path for new instances, the migration path is yet to be defined" ? | 17:08 |
sean-k-mooney | im kind of feeling like a spec would help by the way | 17:08 |
bauzas | or we could revisit the approval in later meetings | 17:08 |
JayF | Yeah I'm thinking Ironic side script, because then we can allow the Ironic-side actions to be done, too | 17:08 |
JayF | I'd say lets revisit | 17:08 |
sean-k-mooney | but if other are ok im not going to say we must have one | 17:08 |
JayF | I think I'll get to talk to John in the intervening week | 17:08 |
bauzas | okay, I'll keep the blueprint in the agenda | 17:08 |
JayF | I wasn't sure what the edges were on this, now I know what they are and can file them down :) | 17:08 |
JayF | thank you | 17:08 |
bauzas | and we could revisit it next week | 17:08 |
bauzas | thanks | 17:09 |
sean-k-mooney | i just dont know this code as well as others so it would help me to have a little more detail. but we could just put more detail in the blueprint | 17:09 |
bauzas | and because we're horribly late, I'll end the meeting now | 17:09 |
sean-k-mooney | o/ | 17:09 |
bauzas | thanks all | 17:09 |
bauzas | and sorry again | 17:09 |
bauzas | #endmeeting | 17:09 |
opendevmeet | Meeting ended Tue Dec 12 17:09:32 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 17:09 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2023/nova.2023-12-12-16.01.html | 17:09 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2023/nova.2023-12-12-16.01.txt | 17:09 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2023/nova.2023-12-12-16.01.log.html | 17:09 |
gibi | thanks bauzas | 17:09 |
* bauzas switches to advent of code | 17:09 | |
JayF | sean-k-mooney: the base feature in Ironic we want to hook into is right now, w/standalone ironic, you set [conductor]automatic_lessee=true, you get node.lessee={project_id that spawned it} set automatically by Ironic, and you can use RBAC to give some Ironic API access to that node (anything from read access to ability to self-serve some maintenance items) | 17:10 |
JayF | sean-k-mooney: so doing a backfill using nova-based script would be awkward, because we could backfill the nova half but the Ironic side wouldn't have taken the "on spawn" action... but if we do it from Ironic side, it can pull info from Nova, make the change to put the metadata on the node, and also use Ironic config to flip other node settings as needed (for example: setting | 17:11 |
JayF | node lessee) | 17:11 |
sean-k-mooney | JayF: ok so you think it would be better to have an ironic-manage or even an ironic api endpoint | 17:12 |
sean-k-mooney | to trigger the backfill or just do it with a script | 17:12 |
sean-k-mooney | i thought it might be simpler to just have nova-manage call the function that normally generates the metadata we would set and call the ironic api with that | 17:13 |
JayF | A script, for sure, and I could see the shape of that script being such that nova-backfill is an option on it | 17:13 |
sean-k-mooney | but if that is not simple then no need to do it in nova | 17:13 |
JayF | Yeah but like I said, that's only like, half of the equation and alone is not super valuable | 17:13 |
sean-k-mooney | ack | 17:14 |
sean-k-mooney | for what it's worth you have reminded me that i have wanted to extend the metadata in the libvirt xml for years | 17:14 |
sean-k-mooney | specifically i want to have both the image and flavor (name and uuid) and have the image/flavor extra specs directly in the metadata | 17:14 |
JayF | well that's like, another thing I wanted to say | 17:14 |
JayF | Why do we want a libvirt metadata list and an Ironic metadata list? | 17:15 |
JayF | can there not just be a "here's a big list of instance metadata that hypervisors that care about metadata can have" | 17:15 |
sean-k-mooney | we could have this in common code yes | 17:15 |
JayF | John and I were looking at this, and that's where the bp came from | 17:16 |
sean-k-mooney | all the driver needs to do is transform the dict of metadata to where it stores it | 17:16 |
JayF | ++ that would be ideal | 17:16 |
sean-k-mooney | so the reason i want to add both the name and uuid is the image uses one and the flavor uses the other | 17:16 |
sean-k-mooney | both should use both | 17:16 |
sean-k-mooney | and the extra_specs/image properties are so if you dont have a db dump | 17:17 |
JayF | the only fields I have strong feelings about are any-flavor-identifier-at-all and the project information | 17:17 |
sean-k-mooney | you can just look at the xml and know everything about it | 17:17 |
JayF | the rest of them are things that are situationally useful | 17:17 |
JayF | for almost all of those metadata fields in the libvirt driver, I was able to think of a troubleshooting situation where having that in node.instance_info would've been useful :) | 17:18 |
sean-k-mooney | JayF: related question | 17:18 |
sean-k-mooney | out of interest why do you care about the flavor | 17:18 |
JayF | Super common support question for Ironic admins | 17:18 |
sean-k-mooney | what will it be used for? | 17:18 |
JayF | that's the answer | 17:18 |
JayF | 'what flavor did you use to boot this' often dictates some actions Ironic takes on spawn | 17:19 |
sean-k-mooney | oh ok so the resource class on the node is not enough | 17:19 |
JayF | yep | 17:19 |
sean-k-mooney | ok makes sense | 17:19 |
sean-k-mooney | this is why it's there in the libvirt xml | 17:19 |
sean-k-mooney | but often i get a compute log with no flavor info | 17:19 |
JayF | because resource_class + capabilities (not sure I'm using the right nova term?) means you might have the same node | 17:19 |
sean-k-mooney | so that's why the flavor details like the extra specs are useful to me | 17:19 |
JayF | but it gets a different deployment template, giving it different bios config on spawn | 17:19 |
johnthetubaguy | sean-k-mooney: FWIW, I feel like this should be "similar" to what we do with libvirt xml, to get the extra info in there. | 17:20 |
JayF | so you might have, supersize-node-configa supersize-node-configb and both are node.resource_class=supersize-node | 17:20 |
sean-k-mooney | johnthetubaguy: ya so the libvirt part is just using the flavor and image + the project id i think | 17:20 |
JayF | johnthetubaguy: we were just talking about maybe that code becoming common | 17:20 |
JayF | johnthetubaguy: so the ironic/libvirt driver code would just be taking a dict of metadata and putting it in the right place | 17:21 |
johnthetubaguy | sean-k-mooney: yeah, I think it makes sense to be the same, or at least similar. | 17:21 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/virt/libvirt/config.py#L3600-L3642 | 17:22 |
sean-k-mooney | well actually https://github.com/openstack/nova/blob/master/nova/virt/libvirt/config.py#L3600-L3699 | 17:22 |
sean-k-mooney | i have been meaning to extend that for a while but it was never a high enough priority | 17:23 |
sean-k-mooney | i think that has most of what you want already | 17:23 |
JayF | I'm pretty sure that's *exactly* the code we had up in the meeting when we made this bp | 17:24 |
johnthetubaguy | JayF: what is missing? yeah, that is the code I was thinking about before. | 17:24 |
JayF | as long as it has the project that spawned it and the flavor that's the primary things we care about | 17:26 |
JayF | and I see a self.owner and a self.flavor | 17:26 |
johnthetubaguy | JayF: +1 I think that has it all. Feels like a case of adding to_dict(), in a way. | 17:28 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L5780 | 17:28 |
sean-k-mooney | so this is where we generate this today | 17:28 |
johnthetubaguy | ah, that was it. | 17:28 |
sean-k-mooney | we could move that up to the base virt driver | 17:28 |
sean-k-mooney | it just takes the instance and network info today | 17:28 |
sean-k-mooney | ok so not quite | 17:29 |
sean-k-mooney | it's currently building libvirt objects | 17:29 |
johnthetubaguy | Yeah, if we have something generating a dict we can both use, that would have everything we need | 17:29 |
sean-k-mooney | but we could build a dict or other object that is not libvirt specific | 17:29 |
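The driver-agnostic object the discussion converges on could be a plain dict built in the base virt driver. A minimal sketch of that idea follows; the function name, argument shape, and key names here are all assumptions for illustration, not nova's actual API (the real code takes an objects.Instance plus network_info):

```python
# Hypothetical sketch of a common metadata builder that any virt driver
# (libvirt, ironic, ...) could consume. Names and keys are illustrative.

def build_instance_metadata(instance):
    """Return a libvirt-agnostic dict of instance metadata.

    `instance` is assumed to be a dict-like view of the nova instance,
    carrying both name and uuid for the flavor/image, plus the flavor
    extra_specs, per the discussion above.
    """
    flavor = instance["flavor"]
    return {
        "name": instance["display_name"],
        "owner": {
            "user": {"uuid": instance["user_id"]},
            "project": {"uuid": instance["project_id"]},
        },
        "flavor": {
            "name": flavor["name"],
            "uuid": flavor["uuid"],
            "extra_specs": dict(flavor.get("extra_specs", {})),
        },
        "image": {"uuid": instance["image_ref"]},
    }
```

Each driver would then only need to transform this dict into wherever it stores metadata: the libvirt driver into the domain XML, the ironic driver into a node.instance_info patch.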
JayF | and then the MVP for the Ironic change becomes "toss that metadata into the instance_info patch on spawn" and "do the proper $whatever on rebuild/other non-spawn instance modifying actions" | 17:31 |
sean-k-mooney | basically ya | 17:31 |
JayF | then we'll have an Ironic side change to respect that metadata for automatic_lessee and to add a backfill script | 17:31 |
sean-k-mooney | the libvirt driver just generates it every time we generate an xml | 17:31 |
JayF | (which the backfill script will also backfill node.lessee if configured to do so, which is why having it in Ironic is great) | 17:32 |
sean-k-mooney | so on reboot as well | 17:32 |
sean-k-mooney | im not sure if you just want to do that | 17:32 |
sean-k-mooney | and say reboot the node to backfill or not | 17:32 |
sean-k-mooney | that is how we handle it when we add more metadata for libvirt | 17:32 |
sean-k-mooney | just tell people reboot, cold migrate or anything else that updates the xml | 17:33 |
JayF | What kind of actions cause metadata to change? | 17:33 |
JayF | Let me ask that differently. | 17:33 |
JayF | What kind of non-API-initiated-actions can cause metadata to change? | 17:33 |
JayF | ^ user | 17:33 |
sean-k-mooney | interface attach/detach, resize, rebuild and evacuate | 17:33 |
sean-k-mooney | technically it won't change on reboot but we just do that anyway because it's simpler since we delete and recreate the xml on every reboot | 17:34 |
JayF | we may want to consider filtering interface metadata in ironic instance_info, because ironic has its own network info | 17:34 |
JayF | and rebuild is the only other case that impacts Ironic | 17:34 |
sean-k-mooney | i think it's ok for a virt driver to project a view of the metadata that only has the fields that make sense to it | 17:35 |
JayF | so I think we should be able to get it to a case where Ironic instance metadata would only change meaningfully on rebuild/spawn/destroy | 17:35 |
sean-k-mooney | ya that sounds about right | 17:35 |
johnthetubaguy | sean-k-mooney: yeah, I think the common code can provide the set of metadata allowed, Ironic can pick a subset of that, I think skipping the network bits makes sense. | 17:36 |
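The "pick a subset" idea above could be as simple as a key filter on the common metadata dict. A sketch, where the helper name, the field set, and the "network" key are all illustrative assumptions rather than existing nova/ironic code:

```python
# Fields the ironic driver might keep in node.instance_info; ironic has
# its own authoritative view of ports/IPs, so network-style keys are
# dropped rather than duplicated. Names are illustrative assumptions.
IRONIC_METADATA_FIELDS = frozenset({"name", "owner", "flavor", "image"})


def filter_for_ironic(metadata):
    """Project the common metadata dict down to the ironic-relevant view."""
    return {k: v for k, v in metadata.items() if k in IRONIC_METADATA_FIELDS}
```

With a filter like this, the ironic instance_info would only change meaningfully on spawn/rebuild/destroy, since interface attach/detach no longer touches any field it stores.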
sean-k-mooney | with that said | 17:36 |
sean-k-mooney | on the network side i know ironic supports portgroups | 17:36 |
sean-k-mooney | but when an ironic node is associated with a nova instance | 17:36 |
sean-k-mooney | the only addition or removal of interfaces | 17:37 |
sean-k-mooney | should be done via nova api | 17:37 |
sean-k-mooney | im aware that vif plug in the ironic driver calls ironic which does things with neutron | 17:38 |
sean-k-mooney | but it should not really affect which ports/port groups are associated with the instance from a neutron point of view | 17:38 |
JayF | Yes. The feeling behind not wanting to pass thru the network metadata is much, much more basic than that: there's already like, three views to networks when people come to Ironic with support: Nova's view, Neutron's view, Ironic's view (port object)... not even counting various bifrost/whatever type of configs | 17:39 |
JayF | I just don't want to see any bugs with "here's the node.instance_info['meta']['network']" or pollute the DB with more redundant info :) | 17:40 |
sean-k-mooney | :) | 17:40 |
sean-k-mooney | well the metadata does not have the full network object | 17:40 |
sean-k-mooney | just the ips | 17:40 |
sean-k-mooney | associated with an instance | 17:40 |
sean-k-mooney | but if you already have that no need to duplicate it as you said | 17:41 |
JayF | which is actually information that people might want | 17:41 |
JayF | IP is in neutron, we don't have it in Ironic | 17:41 |
JayF | Ironic holds more of the information about the physical port: mac address, binding profile/metadata about how it's physically connected to switches, etc | 17:41 |
sean-k-mooney | the ip stuff was a relatively late addition | 17:45 |
sean-k-mooney | https://github.com/openstack/nova/commit/838370a49014351051bbef2d1c2ada1f47ac2bfb | 17:45 |
sean-k-mooney | it was added in wallaby | 17:46 |
sean-k-mooney | anyway that data would be available for you to consume if it was useful | 17:46 |
sean-k-mooney | JayF: this is the spec we used to add that https://specs.openstack.org/openstack/nova-specs/specs/wallaby/implemented/libvirt-driver-ip-metadata.html | 17:47 |
opendevreview | Stephen Finucane proposed openstack/nova master: Bump hacking version https://review.opendev.org/c/openstack/nova/+/903529 | 18:39 |
opendevreview | Stephen Finucane proposed openstack/nova master: Resolve mypy error https://review.opendev.org/c/openstack/nova/+/903530 | 18:39 |
stephenfin | That mypy fix is required to unblock an u-c bump, while the other fix and the u-c bump are required to fix installing nova (for unit tests) on Python 3.12 (Fedora 39) ^ | 18:46 |
JayF | sean-k-mooney: johnthetubaguy: Is there anything I can do to help get https://review.opendev.org/c/openstack/nova/+/900831 landed? Was hoping to get this in front of the sharding re-proposal patch, which I was hoping to do this week (about to go rev the spec) | 22:01 |
opendevreview | Merged openstack/nova master: Fix regression breaking Ironic boot-from-volume https://review.opendev.org/c/openstack/nova/+/903324 | 22:41 |
JayF | Huzzah! | 22:41 |
opendevreview | Jay Faulkner proposed openstack/nova-specs master: Re-submit Ironic-shards for Caracal https://review.opendev.org/c/openstack/nova-specs/+/902698 | 23:25 |
opendevreview | Jay Faulkner proposed openstack/nova master: [ironic] Partition & use cache for list_instance* https://review.opendev.org/c/openstack/nova/+/900831 | 23:51 |
*** haleyb is now known as haleyb|out | 23:53 |