15:01:29 <JayF> #startmeeting ironic
15:01:29 <opendevmeet> Meeting started Mon Dec 11 15:01:29 2023 UTC and is due to finish in 60 minutes.  The chair is JayF. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:29 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:29 <opendevmeet> The meeting name has been set to 'ironic'
15:01:38 <JayF> #topic Announcements/Reminder
15:01:45 <JayF> #info Standing reminder to review patches tagged ironic-week-prio and to hashtag any patches ready for review with ironic-week-prio: https://tinyurl.com/ironic-weekly-prio-dash
15:03:01 <TheJulia> o/
15:03:07 <TheJulia> dtantsur: w/r/t ncsi, just an option afaik
15:03:24 <JayF> No action items from last week, skipping that one
15:03:40 <JayF> #topic Caracal release schedule
15:03:50 <JayF> #info Next milestone C-2, Jan 11
15:03:59 <JayF> #topic Meeting schedule for holiday
15:04:22 <JayF> I propose we cancel the 12/25 and 1/1 meetings. Well, I propose I won't be here, you all can do what you want with Christmas + New Years Day :D
15:05:20 <JayF> If I don't hear an objection going to move ahead with that plan
15:05:31 <rpittau> o/
15:06:33 <JayF> #action JayF to email mailing list about Christmas + New Year's Day meetings being cancelled
15:06:39 <JayF> #topic Review Ironic CI Status
15:06:55 <JayF> So Ironic CI is currently busted, BFV job was broken by some Ironic driver changes that landed
15:06:55 <dtantsur> The bfv fix hasn't merged yet..
15:07:09 <JayF> the fix is in the nova gate, if it gets clogged up we can push a quick -nv to get things free
15:07:28 <JayF> I'm a bit confused how this landed broken, I had a patch w/Depends-On on the tip that passed all our jobs
15:07:51 <opendevreview> Verification of a change to openstack/ironic-python-agent master failed: Fix referencing to the raid_device var which is not set  https://review.opendev.org/c/openstack/ironic-python-agent/+/900324
15:07:51 <TheJulia> I don't know which patches are in question, but it looked like it was lacking a key we expected to be present
15:08:13 <TheJulia> possibly a race condition?!
15:08:47 <JayF> Well, there was a comment on the original change that it could be racey, but with dtantsur's fix, it doesn't look like it could've ever worked
15:08:56 <JayF> so I'm thinking maybe a follow up just to ensure our jobs are running nova from git
15:09:04 <JayF> well, we know it is, because they broke
15:09:29 <JayF> Either way, we know the path forward, path behind matters less, not like we push major changes to Ironic<>Nova driver often.
15:09:39 <JayF> Anything else on CI/Nova breakage?
15:10:21 <dtantsur> We cannot run nova NOT from git
15:11:03 <JayF> fair
15:11:08 <JayF> moving on
15:11:11 <JayF> #topic Bug Deputy
15:11:21 <JayF> I was the bug deputy. I triaged some bugs. I did not put together a report.
15:11:35 <JayF> I'll note for whoever is taking it next (I can go another week if needed), there's something mildly screwy with the dashboard
15:11:55 <JayF> I didn't have time to look, but either launchpad API returns bad data or something in our bug deduplication breaks on non-Ironic projects
15:12:07 <JayF> but I managed to triage most new bugs
15:12:32 <rpittau> I can go for it this week
15:12:44 <JayF> #action rpittau to be bug deputy this week
15:12:53 <JayF> well, I say I didn't have time to look; I looked; I didn't solve lol
15:12:59 <rpittau> :D
15:13:04 <JayF> We have lots of RFEs up
15:13:09 <JayF> including some I proxied as bug deputy
15:13:13 <JayF> #topic RFE Review
15:13:23 <JayF> dtantsur: has three, going to link them in and let you talk about them
15:13:29 <JayF> #link https://bugs.launchpad.net/ironic/+bug/2045548
15:13:34 <JayF> #link https://bugs.launchpad.net/ironic/+bug/2045551
15:13:46 <JayF> both carryovers from last week, we didn't have many folks in the meeting and wanted more people to see these
15:13:51 <JayF> they were provisionally approved
15:13:54 <dtantsur> Yeah, thanks!
15:14:07 <JayF> #link https://bugs.launchpad.net/sushy-tools/+bug/2046153 "Testing the minimal subset of Redfish features" is the new one for dtantsur
15:14:10 <TheJulia> no objection to 2045548, only concern about the kernel command line length limit
15:14:23 <dtantsur> TheJulia: do you remember the limit from the top of your head?
15:14:31 <TheJulia> 1024 chars
15:14:34 <TheJulia> total
15:14:43 <dtantsur> I think we're quite good still
15:15:02 <TheJulia> yeah, maybe move it to the end *just in case*
15:15:34 <TheJulia> I think the kernel truncates it, if memory serves it is also configurable and at one point was 256 chars
15:15:38 <dtantsur> I'll see how doable that is (without relying on Python dict ordering)
15:15:59 <TheJulia> I was thinking template wise
15:16:00 <TheJulia> fwiw
15:16:15 <dtantsur> I think all these arguments end up in a dict.. anyway, technical details
15:16:47 <TheJulia> they do, but the actual line draws from several different fields in the template, the actual consumption side, yes, hits a dict and entries may be truncated
15:16:47 <JayF> how long until ipa-b64-config=dgjskljdfhgksljdhg
15:17:05 <dtantsur> lol
15:17:17 <JayF> I'm only half-joking, we're getting more complex in the preboot configuration we need
15:17:21 <dtantsur> with virtual media, we should start embedding it as a file already (we actually have all the code there)
15:17:32 <JayF> that is worthy of being written up
15:17:52 * JayF notes he is +1 to all the proposed RFEs
15:18:01 <dtantsur> I'll try to.. but I cannot do everything at once
15:18:20 <TheJulia> The second rfe, I guess I'm trying to understand why we feel the need to create the second script/why we feel it is explicitly needed. I.e. is there a path not to have it, and do we need more, or less complexity to get there
15:18:31 <TheJulia> everything at once is impossible
15:18:33 <dtantsur> 2045551 is a bit more interesting. It's continuation of the whole "self-configuring ironic" thing
15:18:51 <dtantsur> to be clear: this script exists now, it's just normally hand-written
15:19:08 <dtantsur> we *could* roll this logic into boot.ipxe, but then we won't be able to reuse the ipxe_script_template
15:19:49 <dtantsur> like, I don't want https://github.com/metal3-io/ironic-image/blob/main/ironic-config/inspector.ipxe.j2 to be handwritten any more
15:19:51 <TheJulia> I'm personally pro all together, if possible
15:20:39 <dtantsur> Not sure I know what you mean
15:21:10 <TheJulia> from my pov, there is no technical reason why we've kept that pattern, out of "this is the way we always did it"
15:21:24 <TheJulia> and we should evolve that. What that looks like exactly, more so depends on the overall requirements
15:21:30 <dtantsur> it falls the loop <MAC1>, <MAC2>, ..., fallback
15:21:49 <dtantsur> where do you suggest it goes, https://opendev.org/openstack/ironic/src/branch/master/ironic/drivers/modules/boot.ipxe ?
15:21:57 <TheJulia> yes, except what did we do in bifrost?
15:22:04 <dtantsur> same as in metal3: a separate file
15:22:20 <dtantsur> it's also the approach we document for ironic-inspector
15:22:29 <TheJulia> we could take boot_failed path
15:22:38 <TheJulia> oh, so bifrost got changed away from the single file?
15:22:44 <dtantsur> if we add that to boot.ipxe, then the operators who customize boot.ipxe and ipxe_config.template will need to change their templates
15:22:50 <TheJulia> yes
15:22:51 <dtantsur> bifrost has always worked like that
15:22:53 <TheJulia> that is unavoidable
15:23:03 <dtantsur> nothing I touched implemented inspection as part of boot.ipxe
15:23:10 <dtantsur> (I don't think TripleO either)
15:23:12 <TheJulia> maybe bifrost for the past five years, but a very long time ago it was a single file
15:23:33 <dtantsur> possibly? if so, it's long gone
15:23:38 <TheJulia> yeah
15:23:45 <dtantsur> I'm not sure why we'd ask people to duplicate their templates though
15:23:52 <TheJulia> it worked quite nicely, from my point of view, but that was a long time ago
15:23:58 <TheJulia> nor am I
15:23:58 <dtantsur> (not my pain any more since we don't customize ipxe_config.template, but still)
15:24:02 <TheJulia> other than hardware discovery
15:24:54 <dtantsur> I'm afraid your memory misleads you
15:25:04 <dtantsur> https://opendev.org/openstack/bifrost/commit/c1f9beac6358efd70a4197f57f71ef98499fa7d6 is your patch that introduced inspector support, and it uses the separate file
15:25:05 <TheJulia> sweet!
15:25:48 <dtantsur> The value I see in the separate file is to reuse ipxe_config.template out of box
15:26:13 * TheJulia shrugs
15:26:13 <dtantsur> so, people who customized something for cleaning, deployment and managed inspection will get the same thing for the unmanaged inspection
15:26:42 <dtantsur> I'm worried about managed and unmanaged inspection ever diverging
15:26:52 <TheJulia> yeah, I just worry how much debt we're carrying for that and if we should make it more streamlined, or not
15:27:11 <JayF> I need to step away for a minute; if you all get done with this next up are those two sushy-tools RFEs which I'm +1 with very low context on (I put them on the agenda from seeing them in triage)
15:27:14 <JayF> #chair TheJulia dtantsur
15:27:14 <opendevmeet> Current chairs: JayF TheJulia dtantsur
15:27:14 <TheJulia> I guess i always thought we were going to try and drift everything towards managed
15:27:26 <TheJulia> anyway, we should carry on
15:27:32 <dtantsur> We will try, but we don't succeed right away
15:27:40 <TheJulia> naturally :)
15:27:42 <dtantsur> People want auto-discovery all the time...
15:27:48 <TheJulia> true
15:27:59 <TheJulia> and we have to still sort of have that as a path, but that doesn't mean things can't evolve
15:28:09 <TheJulia> Onward?
15:28:18 <dtantsur> Mmmm, so no consensus?
15:28:27 <TheJulia> dtantsur: lazy consensus?
15:28:38 <dtantsur> I thought you were against the RFE as it's written?
15:29:01 <TheJulia> eh, not a fan, but doesn't mean I'm going to take a hardline stance on it
15:29:09 <TheJulia> I'm more asking questions for the big picture
15:29:34 <TheJulia> #link https://bugs.launchpad.net/sushy-tools/+bug/2045906
15:29:36 <TheJulia> #link https://bugs.launchpad.net/sushy-tools/+bug/2045908
15:30:05 <dtantsur> I don't have anything against extending sushy-tools as long as we have some understanding why people are doing it
15:30:11 <dtantsur> i.e. they're not doing it in production
15:30:11 <TheJulia> ++
15:30:25 <TheJulia> The first one, seems like a "non-issue"/"just do it"
15:30:27 <JayF> I think these are the folks who came by here the other day
15:30:41 <JayF> someone dropped in and asked if we'd be OK with getting features into sushy-tools
15:30:50 <dtantsur> yeah, SKU,Serial are not even RFE-worthy
15:30:56 <JayF> and I gave them the usual interrogation; sounds like they are building a product in the category of Ironic
15:31:05 <JayF> and want to enhance the testing suite in sushy-tools more
15:31:06 <TheJulia> the latter one, yeah, I'm trying to understand why
15:31:20 <dtantsur> if it's for testing their product - no problem
15:31:41 <dtantsur> as long as it does not end up putting more burden on us
15:31:43 <JayF> TheJulia: I could easily see a datacenter management product monitoring power usage, especially if it's oriented differently than OpenStack
15:32:02 <JayF> dtantsur: if anything, getting another project depending on sushy{,-tools} may reduce our overall burden
15:32:07 <TheJulia> JayF: of course, which I think kind of goes to dtantsur's comment
15:32:19 <dtantsur> by the way, we should rename sushy-tools
15:32:24 <dtantsur> it causes too much confusion with sushy proper
15:32:27 <TheJulia> ++++
15:32:45 * dtantsur has recently learned there is an alternative to sushy-tools called ksushy
15:32:58 <TheJulia> .....
15:33:05 <dtantsur> ikr?
15:33:10 <JayF> too bad we didn't get OIF to (tm) sushy
15:33:12 <JayF> lol;
15:33:16 <dtantsur> :D
15:33:25 <JayF> I'd be OK with a sushy-tools rename, maybe start a next-PTG etherpad with that?
15:33:42 <dtantsur> possibly
15:33:44 <TheJulia> or send something to the mailing list
15:33:49 <TheJulia> deferring to a PTG is a bad habit
15:33:59 <TheJulia> because it forces discussion to halt until high bandwidth discussion
15:34:01 <JayF> Maybe; but in this case I'd say halfway thru the cycle is a bad time to rename it
15:34:03 <dtantsur> I'm +1 to both proposals if they don't end up dumping a lot of obscure code on us
15:34:04 <TheJulia> which causes the topic to drop
15:34:14 <JayF> which is the only reason I punted it; not for the high-bw but for the "after release" :)
15:34:27 <TheJulia> JayF: would need to be well communicated and we would already be into the next cycle once we reach the PTG
15:34:50 <TheJulia> I've left a comment on the first sushy-tools rfe
15:34:55 <JayF> good point
15:34:56 <dtantsur> We could have renaming by forking fwiw
15:35:00 <JayF> so we are approving all those RFEs, right?
15:35:04 <dtantsur> Also dropping the static emulator that I'm afraid nobody uses
15:35:06 <dtantsur> yeah
15:35:18 <JayF> #info All RFEs evaluated at meeting are approved.
15:35:21 <JayF> #topic Open Discussion
15:35:24 <dtantsur> mmm, also mine? :)
15:35:26 <TheJulia> uhh, we didn't get to the last one
15:35:29 <dtantsur> I don't mind it :D
15:35:31 <JayF> #undo
15:35:31 <opendevmeet> Removing item from minutes: #topic Open Discussion
15:35:33 <JayF> #undo
15:35:33 <opendevmeet> Removing item from minutes: #info All RFEs evaluated at meeting are approved.
15:35:46 * JayF presses << on the tape player
15:35:49 <JayF> What last one TheJulia?
15:36:12 <dtantsur> #link https://bugs.launchpad.net/sushy-tools/+bug/2046153 Testing the minimal subset of Redfish features
15:36:12 <JayF> TheJulia: we batched dtantsur's three together, so even though we did them out of order, I thought we discussed all 5?
15:36:24 <dtantsur> It's not even really an RFE, more of a heads-up
15:36:38 <dtantsur> I want us to avoid relying on a redfish implementation having ALL features we support
15:36:38 <TheJulia> the last one wasn't linked until when dtantsur just did so
15:36:43 <TheJulia> thought we were going in order
15:36:52 <dtantsur> for that, I want a CI job that runs with a bare sushy-tools with no extras at all
15:36:59 <dtantsur> objections, concerns?
15:37:20 <TheJulia> dtantsur: none really
15:37:46 <TheJulia> it *should* be a test scenario in the tempest plugin explicitly instead of a separate CI job itself, if at all possible.
15:37:50 <JayF> TheJulia: some mornings I can hide when I didn't get my coffee made before meeting... others... :)
15:38:04 <dtantsur> TheJulia: I was thinking of just changing one non-vmedia job
15:38:06 <TheJulia> or just, run the emulator on minimal and make sure none of the advanced features are turned on
15:38:29 <dtantsur> all I care about is that we don't e.g. depend on firmware versions being available as we've just accidentally done
15:38:29 <TheJulia> ok
15:38:43 <dtantsur> (context: https://review.opendev.org/c/openstack/ironic/+/903185)
15:39:27 <dtantsur> (in the past, we accidentally made EthernetInterfaces required, breaking mmm Cisco?)
15:39:42 <TheJulia> sweet!
15:39:51 <JayF> sounds like a pretty good idea then
15:40:28 <TheJulia> yeah, every redfish feature we do, we should be checking "if the value is not none"
15:40:35 <TheJulia> stupidly easy thing to miss though
15:40:47 <dtantsur> on that note, I also want to document which resources we require and which we can use if they are present
15:41:00 <dtantsur> (quite unfortunate that the profiles work stagnated..)
15:41:30 <TheJulia> so a profile merged, but it is not quite clear
15:41:51 <dtantsur> Background: from time to time, our partners ask me for a list of Redfish APIs we require
15:42:04 <dtantsur> so I essentially have this document downstream. I can just publish it.
15:42:48 <TheJulia> makes sense, the profile is rather broad
15:43:00 <TheJulia> And I'm not sure anyone is using it :\
15:43:01 <JayF> IDK how to go about this; but if we could document quirks too that'd be cool
15:43:11 <JayF> e.g. the bug around gigabyte servers wanting a different payload for boot mode than anything else
15:43:22 <dtantsur> We're trying to fix such things instead :)
15:43:24 <TheJulia> Did we get a reply to that one?
15:44:06 <JayF> https://bugs.launchpad.net/ironic/+bug/2045191 yes
15:44:17 <JayF> In fact it looks very good
15:44:22 <JayF> I left it incomplete because I wanted someone more redfishy to verify it was enough info
15:44:24 <JayF> but it LGTM
15:44:48 <JayF> I also reached out a handful of different ways trying to get a contact in Gigabyte's server engineering team
15:45:16 <JayF> #topic Open Discussion
15:45:23 <JayF> (we are basically already doing this, just reflecting the reality)
15:45:43 <dtantsur> another fishy item, just as a heads-up: https://review.opendev.org/c/openstack/sushy-tools/+/903331
15:45:46 <TheJulia> oh joy, yeah, what we have to do with them was semi-incompatible with some of the other vendors
15:46:04 <TheJulia> ..... hmmmmmm
15:46:19 <JayF> This is part of why I'm trying to work out a contact in their server group :D
15:46:31 <JayF> Seems like the sort of thing that ... collaboration could improve future versions of
15:46:36 <TheJulia> yeah
15:46:39 <JayF> (I do not think I was/will be successful FWIW)
15:47:39 <TheJulia> I don't know if I'll have time to take a look this week, this week is shaping up already to be a very busy week
15:48:01 <JayF> I'm going to be focusing primarily on nova driver bits, trying to get sharding spec mergable and getting all the code lined back up
15:48:07 <JayF> so I can work on those tempest tests again
15:48:13 <TheJulia> I suspect we could likely just teach ironic to recognize such a constraint and send back the same value or something slightly wicked
15:48:31 <TheJulia> but where in that sequence is sort of a question, I'll need to look a little deeper
15:49:49 <JayF> Makes sense.
15:50:00 <JayF> Oh, one thing I wanted to mention for open discussion
15:50:19 <JayF> right now there are patches up in openstack/releases to retire Ironic Train/Ussuri branches due to the unmaintained branches migration
15:50:36 <JayF> AIUI devstack/various supporting CI cast things are going away for Train/Ussuri
15:50:55 <opendevreview> Riccardo Pittau proposed openstack/bifrost master: Fix disable-dhcp option in playbook  https://review.opendev.org/c/openstack/bifrost/+/903135
15:51:02 <JayF> if anyone wants to keep these up, and in unmaintained/* branches, they have [slightly less than, at this point] a month to go -1 that patch and volunteer to be the steward of those branches
15:51:27 <JayF> https://review.opendev.org/c/openstack/releases/+/903199
15:51:59 <JayF> #info Train is scheduled to be EOL: https://review.opendev.org/c/openstack/releases/+/903199 -- if you wish it to not EOL, you must -1 that patch and volunteer to steward the branch.
15:52:08 <JayF> Anything else about that or for open discussion?
15:52:29 <TheJulia> I guess the Train is soon departing the last station
15:53:51 <JayF> 4+ years is too good of a run for a single release of software :D
15:53:58 <JayF> I'm going to close out the meeting if there's nothing else?
15:54:11 <TheJulia> ... unless we're required to release for 5 years
15:54:48 <TheJulia> https://www.digitalsme.eu/cyber-resilience-act-the-eu-strikes-a-deal-on-security-requirements-for-digital-products/
15:55:34 <JayF> I think this is confirmation we're off the Ironic path for the day :D
15:55:36 <JayF> #endmeeting