15:03:46 <TheJulia> #startmeeting ironic
15:03:46 <opendevmeet> Meeting started Mon Aug 25 15:03:46 2025 UTC and is due to finish in 60 minutes. The chair is TheJulia. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:03:46 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:03:46 <opendevmeet> The meeting name has been set to 'ironic'
15:03:47 <TheJulia> o/
15:03:51 <clif> o/
15:04:26 <kubajj> o/
15:05:08 <cardoe> o/
15:05:09 <TheJulia> So who is going to chair our meeting this week? Or are we all going to need to have more coffeeeeee?
15:05:14 <mostepha> o/
15:05:26 <TheJulia> \o
15:05:52 <TheJulia> I guess I'm chairing today
15:06:01 <TheJulia> Welcome to this week's Ironic meeting!
15:06:06 <TheJulia> Our agenda can be found on the wiki!
15:06:08 <TheJulia> #link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_August_25.2C_2025
15:06:19 <cid> o/
15:06:44 <JayF> o/
15:06:51 <TheJulia> As for our standard reminder, please review priority items labeled with the hashtag ironic-week-prio. A dashboard link is available for this purpose.
15:06:53 <TheJulia> #link https://review.opendev.org/q/hashtag:%22ironic-week-prio%22+(status:open)
15:07:26 <TheJulia> This week is R-5 on the release schedule.
15:07:28 <TheJulia> #link https://releases.openstack.org/flamingo/schedule.html
15:08:04 <TheJulia> As a reminder! This week other projects freeze. Final releases for client libraries occur, and requirements also freeze. You can see the full list on the schedule.
15:08:30 <TheJulia> Next week! Cycle highlights are due, so we should also consider composing our prelude for this release.
15:09:03 <TheJulia> Based upon my experience over the years, we're likely going to have to release in 2 weeks.
15:09:04 <JayF> I'll take a look at cycle highlights
15:09:11 <JayF> #action JayF to do cycle highlights
15:09:19 <TheJulia> #chair jayf
15:09:19 <opendevmeet> Current chairs: TheJulia jayf
15:09:21 <JayF> #action JayF to do cycle highlights
15:09:32 <TheJulia> Does anyone else have anything to remind us of or announce this week?
15:10:40 <TheJulia> I guess not, onward!
15:10:45 <TheJulia> #topic Working Group Updates
15:10:58 <TheJulia> For our first update, I'll be proxying for the Standalone networking working group.
15:12:04 <TheJulia> alegacy sent me a scheduled message and let me know that he made progress but feels he will need to revise the spec. Basically, they realized a chicken/egg issue that we're already aware of in ironic, but I suspect we never clearly documented. tl;dr: how do you get the data automatically?
15:12:27 <TheJulia> In terms of Eventlet removal!
15:13:02 <JayF> I move that we rename the Eventlet WG the Eventlet is dead partytime group
15:13:03 <JayF> ;)
15:13:18 <TheJulia> It feels like we're in the final clean-up stage, where we may be partying and forgetting any final clean-ups.
15:13:37 <TheJulia> For any final cleanups, let's go ahead and get them tracked, and raise awareness if we're aware of any.
15:13:50 <JayF> request logging fix landed this morning, there's a couple of dev environment cleanups I have up that (I think?) have not yet landed
15:14:00 <TheJulia> That matches my awareness
15:14:09 <JayF> https://review.opendev.org/c/openstack/ironic/+/958249
15:14:18 <TheJulia> I think we were also going to pull the wsgi entry point?
15:14:25 <JayF> Ah, that's a good one too
15:14:31 <TheJulia> or am I incorrectly presenting that one?
15:14:40 <TheJulia> (why? I don't know?!)
15:14:59 <JayF> I think that's accurate, but I don't think it's tied with eventlet directly
15:15:04 <JayF> so much as tied with "remove old crap" :)
15:15:09 <TheJulia> Yeah
15:15:28 <TheJulia> Okay, anything else regarding working group updates?
15:15:48 <JayF> For eventlet WG
15:15:57 <JayF> we should mention we are encouraging contributors to set aside some meaningful time
15:16:06 <JayF> to perform basic testing locally; if you have hardware, test against it
15:16:16 <JayF> the weirder the use case tested, the better :)
15:16:28 <TheJulia> I concur
15:16:46 <JayF> also 👏 to our CNCF counterparts for helping us find the first (only?) major regression
15:17:11 <TheJulia> #agreed Contributors should try to take time to exercise the "weirder" use cases to observe if they have found any regressions.
15:17:12 <TheJulia> ++
15:17:38 <TheJulia> FWIW, I did sit down and verify the vnc console stuff was still working last week, and I was able to confirm it was happy as a single process.
15:18:11 <JayF> I would *really* like testing against the ilo driver
15:18:26 <JayF> but I don't personally have access to that hardware so :/
15:18:27 <TheJulia> Unfortunately, I have no ilo gear
15:19:42 <TheJulia> I *suspect* we may have another issue brewing, of sorts, with the redfish cache, but I've not sat down to explore or inspect logs
15:19:51 <TheJulia> (session cache)
15:20:46 <kubajj> TheJulia: I recently figured out that we do have some ilo nodes we could test stuff on
15:20:47 <TheJulia> Anyhow, it looks like we have no discussion topics today, so we can proceed to the Bug Deputy updates if there is nothing else
15:21:10 <JayF> it might be a good idea to chat about portgroup.physical_network; but that can wait until open discussion
15:21:17 <TheJulia> Yeah, it would :)
15:21:26 <TheJulia> kubajj: if y'all can, that would be much appreciated.
15:21:56 * TheJulia waits for a moment to see if there is anything eventlet/testing related before we proceed
15:22:02 <kubajj> TheJulia: although I am not 100% familiar with the potential issues that could arise
15:23:49 <TheJulia> kubajj: largely, the removal of eventlet changes the threading model to use... actual threads. It increases memory usage and may expose issues in drivers doing anything odd. Because threads will basically become blocking, the aspects eventlet monkeypatches, largely around socket/thread interactions, can appear as pain points
15:24:34 <kubajj> TheJulia: I assume this involves both ironic and the IPA?
15:25:03 <TheJulia> kubajj: Eventlet was removed from both. IPA is less multi-threaded and is much simpler in many ways. Ironic at its core is *really* where it is at.
15:26:08 <JayF> It's hard for me to imagine we have any edges in IPA that haven't been tested
15:26:14 <kubajj> TheJulia: ok, we will try to build Ironic and IPA from master and see what fails
15:26:23 <JayF> We made a point to test image streaming + failures when building that feature, which I think is the edgiest case there
15:26:25 <JayF> but IMBW :)
15:26:35 <TheJulia> kind of, yeah, and also there is an easy fallback path for IPA: load a slightly older image up and roll forward while we work a bug ;)
15:27:02 <TheJulia> #topic Bug Deputy Updates
15:27:13 <TheJulia> iurygregory: you're up if you're around
15:27:27 <iurygregory> quick updates
15:27:33 <iurygregory> 3 new bugs during the week
15:27:40 <iurygregory> all have patches up =)
15:27:49 <TheJulia> I think 2 are actually "resolved" for now
15:27:52 <iurygregory> some are even merged
15:27:54 <iurygregory> yeah
15:27:58 <TheJulia> cool cool
15:29:45 <TheJulia> Anything else? Would anyone like to be our bug deputy for this week?
15:29:57 <clif> I'll volunteer
15:30:04 <clif> assuming no one else is in line already
15:30:30 <TheJulia> #action clif to be our bug deputy for this week!
15:30:38 <clif> \o/
15:31:22 <TheJulia> Well, since we have no RFEs to review this week!
15:31:25 <TheJulia> #topic Open Discussion
15:31:45 <kubajj> anybody know what's up with IPA CI?
15:32:09 <TheJulia> kubajj: is it blowing up a lot in dib's extract-image logic?
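[Editorial aside: a minimal stdlib-only sketch of the threading-model shift TheJulia describes at 15:23:49. This is illustrative, not Ironic code — under eventlet, monkeypatched calls like time.sleep yield cooperatively inside one OS thread; with eventlet removed, each worker occupies a real OS thread and blocking calls really block, which is why memory use grows and "odd" driver socket/thread behavior can surface as pain points.]

```python
import threading
import time

results = []

def blocking_worker(name, delay):
    # With native threads, this sleep truly blocks an OS thread.
    # Under eventlet, time.sleep would have been monkeypatched into
    # a cooperative yield on a single OS thread instead.
    time.sleep(delay)
    results.append(name)

# Each worker now costs a full OS thread (stack memory and all),
# rather than a lightweight greenthread.
threads = [threading.Thread(target=blocking_worker, args=(f"t{i}", 0.05))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```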
15:32:13 <JayF> #note CI for anything that needs a CentOS image, including IPA CI and 4k test jobs in Ironic, is failing due to upstream CentOS issues.
15:32:17 <JayF> TheJulia: yes
15:32:22 <TheJulia> yeah, that!
15:32:27 <JayF> I noticed it last week when the 4k job was busted
15:32:36 <JayF> figured no need to pile on and be too loud about it
15:32:55 <TheJulia> They already have multiple groups of folks being loud on the ticket
15:33:13 <TheJulia> #link https://issues.redhat.com/browse/CS-2983
15:33:22 <clif> I'm a bit perturbed it hasn't been addressed yet
15:33:24 <kubajj> I am not sure if it is that, but could be https://zuul.opendev.org/t/openstack/build/5a5e32685bb14ace847a3b9f8fa0d07a
15:33:26 <TheJulia> So, as for that physical_network topic!
15:33:29 <TheJulia> clif: as am I....
15:33:57 <JayF> looking at https://docs.openstack.org/ironic/latest/admin/networking.html#terminology
15:34:13 <JayF> plus the fact that we can loosen things up later, but it's not as easy to tighten them up later
15:34:37 <JayF> I think it makes sense to make portgroup.physical_network either be inherited or be limited to values present on the underlying ports
15:35:00 <JayF> the part that makes this hairy is that I suspect it's possible to have portgroup X containing ports A and B, and having ports A and B have different physical networks
15:35:09 <JayF> even though that config wouldn't make sense, if the data exists we'd have to handle it
15:35:21 <JayF> does that match cardoe and TheJulia's concerns/thoughts?
15:35:26 * JayF tried to catch up this morning :D
15:37:21 <TheJulia> I think having a portgroup spanning two physically different networks is *really* a bad idea... but it's sort of kind of valid modeling-wise
15:37:32 <TheJulia> because you *could* sort of say the portgroup exists on *both*
15:37:44 <TheJulia> I mean... not ideal, likely model breaking overall, but sort of possible.
15:38:01 <TheJulia> And even then, odds are you *really* wouldn't have a proper LACP config in such a case
15:38:04 <JayF> my question is
15:38:10 <JayF> is there ever a case where that ^^ makes sense
15:38:23 <JayF> that doesn't involve "operator physically moves the cable from X fabric to Y"
15:38:25 <JayF> I can't find one.
15:38:38 <JayF> and if the physical world changes, asking people to update the API isn't just OK; it's the most sensible thing
15:39:06 <TheJulia> I think if someone came with a case where they said it makes sense, I'd challenge them to explain why. I mean, I guess I can see a sort of meta question to this argument; I'm viewing it as a shorthand, but there are other fields in this model they could also use
15:39:24 <TheJulia> I'm thinking inherit and enforce, and maybe we put a config option around the validation
15:39:34 <JayF> I'd say just lock down by default
15:39:34 <TheJulia> and if someone does something like that, blow up/detonate/scream/etc
15:39:41 <TheJulia> and re-iterate in the docs
15:39:45 <TheJulia> cardoe: thoughts^
15:39:48 <JayF> if you wanna set portgroup.physical_network, it needs to be one that's on port.physical_network
15:39:51 <TheJulia> I'm good with that, personally
15:39:59 <JayF> if there's ever a mismatch when we go to do things, go boom
15:40:08 <TheJulia> yeah
15:40:09 <JayF> *then if later needed* add the toggle switch and implement the other case
15:40:27 <JayF> I'd rather be able to make assumptions about the data if the use case isn't obvious
15:41:08 <TheJulia> fair
15:41:35 <TheJulia> clif: seems like you'd need to add more validation logic (if you haven't already)
15:42:05 <TheJulia> (It occurs to me cardoe has a meeting overlap around now on Mondays as well)
15:42:07 <cardoe> Sorry, I had to step away and am getting caught up.
15:42:26 <JayF> TheJulia: clif: should we also add enforcement the other way: if someone updates port.physical_network to a different value than its portgroup, is that OK? Should that blank the portgroup value?
15:42:31 <cardoe> Yes. I've got the "show slide decks to senior leadership about project status" call right now.
15:42:34 <JayF> Or is this just a safeguard on the way in
15:43:17 <clif> can a port belong to more than one portgroup?
15:43:59 <clif> and once a port is in a portgroup, what do we currently do if the port changes somehow?
15:43:59 <cardoe> JayF: you summed it up well.
15:44:09 <TheJulia> clif: no, it can't belong to more than one afaik
15:44:20 <cardoe> yeah, it needs to be removed afaik
15:44:42 <TheJulia> so there is more guardrail logic needed on the port, but I think that could also be added separately
15:44:51 <TheJulia> as long as we don't forget it.
15:45:29 <cardoe> I'd like it if, when 2 ports are part of the same PortGroup and someone tries to update the physical_network on 1 Port to differ from the other Port, they get an error. That's actually a real jira I've got here.
15:46:06 <TheJulia> Actually, that... could make a lot of sense, use the portgroup as the aggregate to reset the ports
15:46:12 <clif> so if physical network is something derived/inherited from member ports, is category as well?
15:46:19 <TheJulia> since... changing them separately is sort of a PITA.
15:46:32 <cardoe> TheJulia: That would be good with me.
15:46:38 <TheJulia> category is for humans, not the code ;)
15:46:44 <TheJulia> at least, I think it is for humans
15:47:00 <JayF> my real concern about this suggestion
15:47:10 <JayF> is that we don't have an easy way in the client to do that update atomically, do we?
15:47:36 <JayF> I guess you can't update portgroup/port atomically in our API at all
15:47:38 <TheJulia> Going back to the other open discussion topic, I've politely asked for an ETR as to when they expect to have the mirrors resolved
15:47:47 * TheJulia is tempted to also increase the issue's priority
15:47:51 <JayF> so it'd have to be a side-effect of updating portgroup.physical_network to update ports[].physical_network ... which seems OK?
15:48:02 <JayF> as long as we document it loudly :)
15:48:19 <TheJulia> I think it would be a super logical side-effect
15:48:23 <cardoe> JayF: yeah, if we said: to update physical_network on Ports in a PortGroup, you must update the PortGroup.
15:48:44 <TheJulia> And doing that is also way more user friendly, really
15:48:51 <TheJulia> Just, needs to be documented, very loudly.
15:48:52 <JayF> would we allow port.physical_network to be manually updated if on the relevant microversion?
15:48:57 <JayF> My feeling is NO
15:49:06 <JayF> because that could cause inconsistency
15:49:06 <cardoe> I'm a no as well.
15:49:23 <TheJulia> Also likely with an "expect to wait a little bit for the ironic-neutron-agent to pick up the change"
15:49:37 <JayF> but we /could/ permanently break some (weird and maybe unrealistic, as noted above) workflows that might need ports on different physnets in a group
15:50:15 <TheJulia> Portgroups are a bit of a weird edge case, so I think it's a "if set on the port but not the portgroup, it's okay to reset/change"
15:50:25 <TheJulia> but the moment it's set, it would need to be set the same across all ports
15:50:35 <TheJulia> and if you're changing one, you need to change them all
15:50:44 <JayF> To summarize:
15:50:44 <JayF> - When we update portgroup.physical_network, update member ports' physical_network
15:50:44 <JayF> - On that same microversion boundary, disallow update of port.physical_network when a member of a portgroup (with a friendly directive error IMO)
15:50:44 <JayF> - Document the hell out of this behavior change in release notes and anywhere we mention physical_network
15:50:46 <TheJulia> *unless* you remove the port from the portgroup
15:51:41 <JayF> so there's one case left that I'm unsure about
15:51:41 <TheJulia> I believe that sums it up, aside from possibly permitting it in the case where portgroup.physical_network is not already set
15:52:09 <TheJulia> They are not in a great state, but it just means they aren't using the portgroup.physical_network field at all
15:52:20 <TheJulia> (That is less likely to be breaking to anyone as well)
15:52:23 <JayF> if I have a preexisting db with portgroup X containing A and B ports, and they are *already on* different physical_networks .... it's just okay?
15:52:24 <opendevreview> Jakub Jelinek proposed openstack/ironic-python-agent master: Fix skip block devices for RAID arrays https://review.opendev.org/c/openstack/ironic-python-agent/+/937342
15:52:37 <JayF> sounds like, based on what you said, we allow that
15:52:38 <TheJulia> JayF: I'll file that under "not great"
15:52:51 <clif> so are we saying that portgroup.physical_network is its own attribute/member but its value(s) are logically limited to its members
15:52:54 <clif> ?
15:52:57 <JayF> well, you want to enable not-great as long as portgroup.physical_network is null (which makes sense and is less breaky)
15:52:58 <TheJulia> yeah, it's "not great" and if they try to use this feature then we need to break... err... I mean guide them
15:53:21 <JayF> clif: so think about physical network as actually reflecting a switch (fabric) in the real world
15:53:35 <JayF> there is *never* a case we can figure out where two ports would be bonded (portgrouped) without being on the same switch fabric
15:53:57 <JayF> but we cannot assume there's not some downstream somewhere abusing that field for something else, or using it in a patch, etc, so we're trying to be as not-heavy-handed as possible
15:54:16 <JayF> update of this item: - On that same microversion boundary, disallow update of port.physical_network when a member of a portgroup that has portgroup.physical_network set (with a friendly directive error IMO)
15:54:17 <cardoe> Agreed.
15:54:28 <JayF> cardoe might be that downstream /s
15:54:29 <JayF> lol
15:54:48 <TheJulia> time check, 5 minutes
15:55:08 <JayF> clif: did that help?
15:55:16 <clif> I think so
15:55:43 <JayF> in a perfect world modelled like we expect, ports[].physical_network would never be different, but we're used to people abusing our fields for other things
15:56:12 <JayF> SGTM. If you'll update that patch we can have a look.
15:56:19 <JayF> I think this issue is at an end, and probably the whole meeting?
15:56:27 <JayF> Anything else on this topic or open discussion in general?
15:56:54 <TheJulia> I *suspect* we're good....
15:57:08 * TheJulia expects someone to do the typical new topic right as we end the meeting ;)
15:57:15 <JayF> Oh, one more thing!
15:57:16 <cid> :D
15:57:19 <JayF> #endmeeting
15:57:39 <JayF> #endmeeting ironic
15:57:44 <JayF> I thought I was chair?
15:57:44 <TheJulia> ohh, interesting
15:57:49 <TheJulia> I think because it is case matching
15:57:50 <TheJulia> doh!
15:57:52 <TheJulia> #endmeeting
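[Editorial aside: the portgroup.physical_network rules JayF summarized at 15:50:44 (cascade portgroup updates to member ports; reject direct port updates once the portgroup value is set; tolerate pre-existing mismatches while the portgroup value is null) could be sketched roughly like this. All names and structures here are illustrative helpers, not the actual Ironic objects or API.]

```python
class PhysicalNetworkConflict(Exception):
    """Hypothetical error for a disallowed physical_network update."""


def update_portgroup_physnet(portgroup, ports, physnet):
    # Setting the portgroup's physical_network cascades to member ports,
    # since they must all share one switch fabric.
    portgroup["physical_network"] = physnet
    for port in ports:
        port["physical_network"] = physnet


def update_port_physnet(port, portgroup, physnet):
    # Direct port updates are rejected once the portgroup pins a physnet;
    # the friendly error directs the caller to the portgroup instead.
    if portgroup and portgroup.get("physical_network") is not None:
        raise PhysicalNetworkConflict(
            "update physical_network on the portgroup instead")
    port["physical_network"] = physnet


pg = {"physical_network": None}
ports = [{"physical_network": "fab-a"}, {"physical_network": "fab-b"}]

# Pre-existing mismatched data is tolerated while the portgroup value is null...
update_port_physnet(ports[0], pg, "fab-b")

# ...but once the portgroup value is set, it wins and cascades:
update_portgroup_physnet(pg, ports, "fab-b")
assert all(p["physical_network"] == "fab-b" for p in ports)

try:
    update_port_physnet(ports[1], pg, "fab-a")
except PhysicalNetworkConflict as e:
    print("rejected:", e)
```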