15:03:46 <TheJulia> #startmeeting ironic
15:03:46 <opendevmeet> Meeting started Mon Aug 25 15:03:46 2025 UTC and is due to finish in 60 minutes.  The chair is TheJulia. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:03:46 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:03:46 <opendevmeet> The meeting name has been set to 'ironic'
15:03:47 <TheJulia> o/
15:03:51 <clif> o/
15:04:26 <kubajj> o/
15:05:08 <cardoe> o/
15:05:09 <TheJulia> So who is going to chair our meeting this week? Or are we all going to need to have more coffeeeeee?
15:05:14 <mostepha> o/
15:05:26 <TheJulia> \o
15:05:52 <TheJulia> I guess I'm chairing today
15:06:01 <TheJulia> Welcome to this week's Ironic meeting!
15:06:06 <TheJulia> Our agenda can be found on the wiki!
15:06:08 <TheJulia> #link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_August_25.2C_2025
15:06:19 <cid> o/
15:06:44 <JayF> o/
15:06:51 <TheJulia> As for our standard reminder, please review priority items labeled with the hashtag ironic-week-prio. A dashboard link is available for this purpose.
15:06:53 <TheJulia> #link https://review.opendev.org/q/hashtag:%22ironic-week-prio%22+(status:open)
15:07:26 <TheJulia> This week is R-5 on the release schedule.
15:07:28 <TheJulia> #link https://releases.openstack.org/flamingo/schedule.html
15:08:04 <TheJulia> As a reminder! This week other projects freeze. Final releases for client libraries occur, requirements also freeze, and well, yeah. You can see the full list on the schedule
15:08:30 <TheJulia> Next week! Cycle highlights are due, so we should also consider composing our prelude for this release
15:09:03 <TheJulia> Based upon my experience over the years, we're likely going to have to release in 2 weeks.
15:09:04 <JayF> I'll take a look at cycle highlights
15:09:11 <JayF> #action JayF to do cycle highlights
15:09:19 <TheJulia> #chair jayf
15:09:19 <opendevmeet> Current chairs: TheJulia jayf
15:09:21 <JayF> #action JayF to do cycle highlights
15:09:32 <TheJulia> Does anyone else have anything to remind us of or announce this week?
15:10:40 <TheJulia> I guess not, onward!
15:10:45 <TheJulia> #topic Working Group Updates
15:10:58 <TheJulia> Our first update I'll be proxying for the Standalone networking working group.
15:12:04 <TheJulia> alegacy messaged me with a scheduled message and let me know that he made progress but feels he will need to revise the spec. Basically they realized a chicken/egg issue that we're already aware of in ironic, but I suspect we never clearly documented. tl;dr how do you get the data automatically.
15:12:27 <TheJulia> In terms of Eventlet removal!
15:13:02 <JayF> I move that we rename the Eventlet WG the Eventlet is dead partytime group
15:13:03 <JayF> ;)
15:13:18 <TheJulia> It feels like we're in final clean-up stage, where we may be partying and forgetting any final clean-ups.
15:13:37 <TheJulia> For any final cleanups, let's go ahead and get them tracked and raise awareness if we're aware of any.
15:13:50 <JayF> request logging fix landed this morning, there's a couple of dev environment cleanups I have up that (I think?) have not yet landed
15:14:00 <TheJulia> That matches my awareness
15:14:09 <JayF> https://review.opendev.org/c/openstack/ironic/+/958249
15:14:18 <TheJulia> I think we were also going to pull the wsgi entry point?
15:14:25 <JayF> Ah, that's a good one too
15:14:31 <TheJulia> or am I incorrectly presenting that one?
15:14:40 <TheJulia> (why? I don't know?!)
15:14:59 <JayF> I think that's accurate, but I don't think it's tied with eventlet directly
15:15:04 <JayF> so much as tied with "remove old crap" :)
15:15:09 <TheJulia> Yeah
15:15:28 <TheJulia> Okay, anything else regarding working group updates?
15:15:48 <JayF> For eventlet WG
15:15:57 <JayF> we should mention we are encouraging contributors to set aside some meaningful time
15:16:06 <JayF> to perform basic testing locally, if you have hardware test against it
15:16:16 <JayF> the weirder the use case tested the better :)
15:16:28 <TheJulia> I concur
15:16:46 <JayF> also 👏 to our CNCF counterparts for helping us find the first (only?) major regression
15:17:11 <TheJulia> #agreed Contributors should try to take time to exercise the "weirder" use cases to observe if they have found any regressions.
15:17:12 <TheJulia> ++
15:17:38 <TheJulia> FWIW, I did sit down and just verify the vnc console stuff was working still last week and I was able to confirm it was happy as a single process.
15:18:11 <JayF> I would *really* like testing against ilo driver
15:18:26 <JayF> but I don't personally have access to that hardware so :/
15:18:27 <TheJulia> Unfortunately, I have no ilo gear
15:19:42 <TheJulia> I *suspect* we may have another issue brewing, of sorts, with the redfish cache, but I've not sat down to explore or inspect logs
15:19:51 <TheJulia> (session cache)
15:20:46 <kubajj> TheJulia: I recently figured out that we do have some ilo nodes we could test stuff on
15:20:47 <TheJulia> Anyhow, it looks like we have no discussion topics today, so we can proceed to the Bug Deputy updates if there is nothing else
15:21:10 <JayF> it might be a good idea to chat about portgroup.physical_network; but that can wait until open discussion
15:21:17 <TheJulia> Yeah, it would :)
15:21:26 <TheJulia> kubajj: if y'all can, that would be much appreciated.
15:21:56 * TheJulia waits for a moment to see if there is anything eventlet/testing related before we proceed
15:22:02 <kubajj> TheJulia: although I am not 100% familiar with the potential issues that could arise
15:23:49 <TheJulia> kubajj: largely the removal of eventlet changes the threading model to use... actual threads. It increases memory usage and may expose issues in drivers doing anything odd. Because threads become truly blocking, the aspects eventlet monkeypatches (largely socket/thread interactions) can appear as pain points
15:24:34 <kubajj> TheJulia: I assume this involves both ironic and the IPA?
15:25:03 <TheJulia> kubajj: Eventlet was removed from both, IPA is less multi-threaded and is much simpler in many ways. Ironic at its core is *really* where it is at.
15:26:08 <JayF> It's hard for me to imagine we have any edges in IPA that haven't been tested
15:26:14 <kubajj> TheJulia: ok, we will try to build Ironic and IPA from master and see what fails
15:26:23 <JayF> We made a point to test image streaming + failures when building that feature which I think is the edgiest case there
15:26:25 <JayF> but IMBW :)
15:26:35 <TheJulia> kind of yeah, and also there is an easy fallback path for IPA, load a slightly older image up and roll forward while we work a bug ;)
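(The threading-model change TheJulia describes above can be illustrated with a small stdlib-only sketch; this is illustrative, not Ironic code. With eventlet removed, concurrency comes from real OS threads, and a blocking call occupies a whole thread rather than cooperatively yielding as an eventlet-monkeypatched call would.)

```python
# Illustration only, not Ironic code. With eventlet removed, concurrency
# comes from real OS threads; a blocking call ties up an entire thread.
import threading
import time

results = []

def worker(n):
    # time.sleep() genuinely blocks this OS thread; under eventlet's
    # monkeypatching it would instead yield to other green threads.
    time.sleep(0.01)
    results.append(n)

# Spawn real OS threads and wait for all of them to finish.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This is why driver code doing "anything odd" around sockets or shared state can surface as a pain point: behavior that was cooperative under eventlet is now preemptive and truly blocking.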
15:27:02 <TheJulia> #topic Bug Deputy Updates
15:27:13 <TheJulia> iurygregory: you're up if you're around
15:27:27 <iurygregory> quick updates
15:27:33 <iurygregory> 3 new bugs during the week
15:27:40 <iurygregory> all have patches up =)
15:27:49 <TheJulia> I think 2 are actually "resolved" for now
15:27:52 <iurygregory> some are even merged
15:27:54 <iurygregory> yeah
15:27:58 <TheJulia> cool cool
15:29:45 <TheJulia> Anything else? Would anyone like to be our bug deputy for this week?
15:29:57 <clif> I'll volunteer
15:30:04 <clif> assuming no one else is in line already
15:30:30 <TheJulia> #action clif to be our bug deputy for this week!
15:30:38 <clif> \o/
15:31:22 <TheJulia> Well, since we have no RFEs to review this week!
15:31:25 <TheJulia> #topic Open Discussion
15:31:45 <kubajj> anybody knows what's up with IPA CI?
15:32:09 <TheJulia> kubajj: is it blowing up a lot in dib's extract-image logic ?
15:32:13 <JayF> #note CI for anything that needs a CentOS image, including IPA CI and 4k test jobs in Ironic, are failing due to upstream CentOS issues.
15:32:17 <JayF> TheJulia: yes
15:32:22 <TheJulia> yeah, that!
15:32:27 <JayF> I noticed it last week when 4k job was busted
15:32:36 <JayF> figured no need to pile on and be too loud about it
15:32:55 <TheJulia> They already have multiple groups of folks being loud on the ticket
15:33:13 <TheJulia> #link https://issues.redhat.com/browse/CS-2983
15:33:22 <clif> I'm a bit perturbed it hasn't been addressed yet
15:33:24 <kubajj> I am not sure if it is that, but could be https://zuul.opendev.org/t/openstack/build/5a5e32685bb14ace847a3b9f8fa0d07a
15:33:26 <TheJulia> So, as for that physical_network topic!
15:33:29 <TheJulia> clif: as am I....
15:33:57 <JayF> looking at https://docs.openstack.org/ironic/latest/admin/networking.html#terminology
15:34:13 <JayF> plus the fact that we can loosen things up later, but it's not as easy to tighten them up later
15:34:37 <JayF> I think it makes sense to make portgroup.physical_network either be inherited or be limited to values present on their underlying ports
15:35:00 <JayF> the part that makes this hairy is that I suspect it's possible to have portgroup X containing ports A and B, and having ports A and B have different physical networks
15:35:09 <JayF> even though that config wouldn't make sense, if the data exists we'd have to handle it
15:35:21 <JayF> does that match cardoe and TheJulia concerns/thoughts?
15:35:26 * JayF tried to catch up this morning :D
15:37:21 <TheJulia> I think having a portgroup spanning two physically different networks is *really* a bad idea... but it's sort of kind of valid modeling wise
15:37:32 <TheJulia> because you *could* sort of say the portgroup exists on *both*
15:37:44 <TheJulia> I mean... not ideal, likely model breaking overall, but sort of possible.
15:38:01 <TheJulia> And even then, odds are you *really* wouldn't have a proper LACP config in such a case
15:38:04 <JayF> my question is
15:38:10 <JayF> is there ever a case where that ^^ makes sense
15:38:23 <JayF> that doesn't involve "operator physically moves the cable from X fabric to Y"
15:38:25 <JayF> I can't find one.
15:38:38 <JayF> and if the physical world changes, asking people to update the API isn't just OK; it's the most sensible thing
15:39:06 <TheJulia> I think if someone came with a case where they said it makes sense, I'd challenge them to explain why. I mean, I guess I can see a sort of meta question to this argument, I'm viewing it as a shorthand, but there are other fields in this model they could also use
15:39:24 <TheJulia> I'm thinking inherit and enforce, and maybe we put a config option around the validation
15:39:34 <JayF> I'd say just lock down by default
15:39:34 <TheJulia> and if someone does something like that blow up/detonate/scream/etc
15:39:41 <TheJulia> and re-iterate in the docs
15:39:45 <TheJulia> cardoe: thoughts^
15:39:48 <JayF> if you wanna set portgroup.physical_network, it needs to be one that's on port.physical_network
15:39:51 <TheJulia> I'm good with that, personally
15:39:59 <JayF> if there's ever a mismatch when we go to do things, go boom
15:40:08 <TheJulia> yeah
15:40:09 <JayF> *then if later needed* add the toggle switch and implement the other case
15:40:27 <JayF> I'd rather be able to make assumptions about the data if the use case isn't obvious
15:41:08 <TheJulia> fair
15:41:35 <TheJulia> clif: seems like you'd need to add more validation logic (if you haven't already)
15:42:05 <TheJulia> (It occurs to me cardoe has a meeting overlap around now on Mondays as well...)
15:42:07 <cardoe> Sorry I had to step away and getting caught up.
15:42:26 <JayF> TheJulia: clif: should we also add enforcement the other way: if someone updates port.physical_network to a different value than its portgroup, is that OK? Should that blank the portgroup value?
15:42:31 <cardoe> Yes. I've got the "show slide decks to senior leadership about project status" call right now.
15:42:34 <JayF> Or is this just a safeguard on the way in
15:43:17 <clif> can a port belong to more than one portgroup?
15:43:59 <clif> and once a port is in a portgroup what do we currently do if the port changes somehow?
15:43:59 <cardoe> JayF: you summed it up well.
15:44:09 <TheJulia> clif: no, it can't belong to more than one afaik
15:44:20 <cardoe> yeah it needs to be removed afaik
15:44:42 <TheJulia> so there is more guardrail logic needed on the port, but I think that could also be added separately
15:44:51 <TheJulia> as long as we don't forget it.
15:45:29 <cardoe> I'd like it if 2 ports are part of the same PortGroup and someone tries to update the physical_network on 1 Port to differ from the other Port that they get an error. That's actually a real jira I've got here.
15:46:06 <TheJulia> Actually, that... could make a lot of sense, use the portgroup as the aggregate to reset the ports
15:46:12 <clif> so if physical network is something derived/inherited from member ports, is category as well?
15:46:19 <TheJulia> since... changing them is sort of a PITA separately.
15:46:32 <cardoe> TheJulia: That would be good with me.
15:46:38 <TheJulia> category is for humans, not the code ;)
15:46:44 <TheJulia> at least, I think it is for humans
15:47:00 <JayF> my real concern about this suggestion
15:47:10 <JayF> is that we don't have an easy way in the client to do that update atomically, do we?
15:47:36 <JayF> I guess you can't update portgroup/port atomically in our API at all
15:47:38 <TheJulia> Going back to the other open discussion topic, I've politely asked for an ETR as to when they expect to have the mirrors resolved
15:47:47 * TheJulia is tempted to also increase the issue's priority
15:47:51 <JayF> so it'd have to be a side-effect of updating portgroup.physical_network to update ports[].physical_network ... which seems OK?
15:48:02 <JayF> as long as we document it loudly :)
15:48:19 <TheJulia> I think it would be a super logical side-effect
15:48:23 <cardoe> JayF: yeah if we said to update physical_network on Ports in a PortGroup, you must update the PortGroup.
15:48:44 <TheJulia> And doing that is also way more user friendly, really
15:48:51 <TheJulia> Just, needs to be documented, very loudly.
15:48:52 <JayF> would we allow port.physical_network to be manually updated if on the relevant microversion?
15:48:57 <JayF> My feeling is NO
15:49:06 <JayF> because that could cause inconsistency
15:49:06 <cardoe> I'm a no as well.
15:49:23 <TheJulia> Also likely with an "expect to wait for a little bit for the ironic-neutron-agent to pick up the change"
15:49:37 <JayF> but we /could/ permanently break some (weird and maybe unrealistic, as noted above) workflows that might need ports on different physnets in a group
15:50:15 <TheJulia> Portgroups are a bit of a weird edge case, so I think it's a "if set on the port but not portgroup, it's okay to reset/change"
15:50:25 <TheJulia> but the moment its set it would need to be set the same across all ports
15:50:35 <TheJulia> and if you're changing one, you need to change them all
15:50:44 <JayF> To summarize:
15:50:44 <JayF> - When we update portgroup.physical_network, update member ports.physical_network
15:50:44 <JayF> - On that same microversion boundary, disallow update of port.physical_network when a member of a portgroup (with a friendly directive error IMO)
15:50:44 <JayF> - Document the hell out of this behavior change in release notes and anywhere we mention physical_network
15:50:46 <TheJulia> *unless* you remove the port from the portgroup
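(A minimal sketch of the validation rules summarized above. All names here, such as update_portgroup_physnet and PhysnetConflict, are illustrative, not Ironic's actual API; this just encodes the behavior agreed in the discussion: setting portgroup.physical_network propagates to member ports and must be consistent with them, while direct port updates are rejected once the portgroup value is set.)

```python
# Hypothetical sketch of the rules discussed above -- not Ironic code.

class PhysnetConflict(Exception):
    """Raised when physical_network values would become inconsistent."""


def update_portgroup_physnet(new_physnet, member_physnets):
    """Set portgroup.physical_network and propagate it to member ports.

    The new value must already be present on a member port, or the
    member ports must have no physical_network set yet.
    """
    if new_physnet is not None and member_physnets and \
            not any(p in (None, new_physnet) for p in member_physnets):
        raise PhysnetConflict(
            "portgroup.physical_network must match a member port's value")
    # The side effect discussed: reset all member ports to the new value.
    return [new_physnet] * len(member_physnets)


def update_port_physnet(new_physnet, portgroup_physnet):
    """Reject direct port updates while the parent portgroup value is set."""
    if portgroup_physnet is not None and new_physnet != portgroup_physnet:
        raise PhysnetConflict(
            "update the portgroup's physical_network, not the individual port")
    return new_physnet
```

For example, `update_portgroup_physnet("physnet1", [None, None])` would succeed and return `["physnet1", "physnet1"]`, while updating a single member port to a conflicting value would raise.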
15:51:41 <JayF> so there's one case left that I'm unsure about
15:51:41 <TheJulia> I believe that sums it up, aside from possibly permitting in the case of if portgroup.physical_network is not already set
15:52:09 <TheJulia> They are not in a great state, but it just means they aren't using the portgroup.physical_network field at all
15:52:20 <TheJulia> (That is less likely to be breaking to anyone as well)
15:52:23 <JayF> if I have a preexisting db with portgroup X containing A and B ports, and they are *already on* different physical_networks .... it's just okay?
15:52:24 <opendevreview> Jakub Jelinek proposed openstack/ironic-python-agent master: Fix skip block devices for RAID arrays  https://review.opendev.org/c/openstack/ironic-python-agent/+/937342
15:52:37 <JayF> sounds like based on what you said, we allow that
15:52:38 <TheJulia> JayF: I'll file under "not great"
15:52:51 <clif> so are we saying that portgroup.physical_network is its own attribute/member but its value(s) are logically limited to its members
15:52:54 <clif> ?
15:52:57 <JayF> well you want to enable not-great as long as portgroup.physical_network is null (which makes sense and is less breaky)
15:52:58 <TheJulia> yeah, its "not great" and if they try to use this feature then we need to break... err... I mean guide them
15:53:21 <JayF> clif: so think about physical network as actually reflecting a switch (fabric) in the real world
15:53:35 <JayF> there is *never* a case we can figure out where two ports would be bonded (portgrouped) without being on the same switch fabric
15:53:57 <JayF> but we cannot assume there's not some downstream somewhere abusing that field for something else, or using it in a patch, etc, so we're trying to be as not-heavy-handed as possible
15:54:16 <JayF> update of this item: - On that same microversion boundary, disallow update of port.physical_network when a member of a portgroup that has portgroup.physical_network set (with a friendly directive error IMO)
15:54:17 <cardoe> Agreed.
15:54:28 <JayF> cardoe might be that downstream /s
15:54:29 <JayF> lol
15:54:48 <TheJulia> time check, 5 minutes
15:55:08 <JayF> clif: did that help?
15:55:16 <clif> I think so
15:55:43 <JayF> in a perfect world modeled like we expect, ports[].physical_network would never differ, but we're used to people abusing our fields for other things
15:56:12 <JayF> SGTM. If you'll update that patch we can have a look.
15:56:19 <JayF> I think this issue is at an end, and probably the whole meeting?
15:56:27 <JayF> Anything else on this topic or open discussion in general?
15:56:54 <TheJulia> I *suspect* we're good....
15:57:08 * TheJulia expects someone to do the typical new topic right as we end the meeting ;)
15:57:15 <JayF> Oh, one more thing!
15:57:16 <cid> :D
15:57:19 <JayF> #endmeeting
15:57:39 <JayF> #endmeeting ironic
15:57:44 <JayF> I thought I was chair?
15:57:44 <TheJulia> ohh, interesting
15:57:49 <TheJulia> I think because it is case matching
15:57:50 <TheJulia> doh!
15:57:52 <TheJulia> #endmeeting