15:00:16 <rpittau> #startmeeting ironic
15:00:16 <opendevmeet> Meeting started Mon Jul 29 15:00:16 2024 UTC and is due to finish in 60 minutes. The chair is rpittau. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:16 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:16 <opendevmeet> The meeting name has been set to 'ironic'
15:00:27 <rpittau> Hello everyone!
15:00:27 <rpittau> Welcome to our weekly meeting!
15:00:27 <rpittau> The meeting agenda can be found here:
15:00:27 <rpittau> #link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_July_29.2C_2024
15:01:03 <TheJulia> o/
15:01:50 <mohammed> o/
15:02:00 <JayF> O/
15:02:48 <cardoe> o/
15:03:31 <rpittau> alright let's start
15:03:34 <masghar> o/
15:03:35 <rpittau> #topic Announcements/Reminders
15:03:42 <cid> o/
15:03:53 <rpittau> #info Standing reminder to review patches tagged ironic-week-prio and to hashtag any patches ready for review with ironic-week-prio
15:03:53 <rpittau> #link https://tinyurl.com/ironic-weekly-prio-dash
15:03:56 <iurygregory> o/
15:04:20 <rpittau> 2-3 patches look ready there
15:05:01 <rpittau> #info 2024.2 Dalmatian Release Schedule
15:05:02 <rpittau> #link https://releases.openstack.org/dalmatian/schedule.html
15:05:02 <rpittau> new bugfix branches will be requested this week
15:05:31 <rpittau> we're at R-9, 3 weeks to release for non-client libraries
15:06:06 <rpittau> #info The next OpenInfra PTG will take place October 21-25, 2024 virtually! Registration is now open!
15:06:06 <rpittau> #link https://ptg.openinfra.dev
15:06:06 <rpittau> the ironic team has been registered!
15:06:17 <rpittau> and JayF prepared an etherpad page
15:06:24 <rpittau> #link https://etherpad.opendev.org/p/ironic-ptg-october-2024
15:06:31 <TheJulia> ... 21st through 25th... right when I was planning on being on vacation :(
15:06:37 <TheJulia> le-sigh
15:06:39 <rpittau> put your ideas/suggestions/comments there
15:06:48 <rpittau> TheJulia: d'oh
15:07:13 <rpittau> I may miss a couple of days too, probably half of that week
15:07:52 <rpittau> well, we'll see
15:08:27 <rpittau> anything else we want to announce/remind ?
15:08:52 <JayF> I'll note that that ptg schedule is also within a stone's throw of my brother's wedding. I don't have a direct conflict, but I could see enough of us having trouble that we may want to have some type of off-cycle ironic sync
15:09:42 <rpittau> JayF: let's see how many people are available and then we can definitely think about having a separate sync, maybe reducing the official PTG time
15:09:43 <TheJulia> Honestly, I think more high bandwidth/sync communication is better than once every six months
15:10:04 <TheJulia> so if we have a pre-ptg sync, we could at least get some stuff raised up/identified for wider discussion
15:10:10 <JayF> That's actually a pretty good observation. At a minimum, consider bringing back spuc... We could also consider doing a once-monthly video meeting similar to what the TC does
15:10:33 <TheJulia> which would better enable us to move forward with one person who might have previously been critical for the discussion to take place
15:10:59 <rpittau> sure, although IMHO the PTG helps on planning for the next release, so we should still decide on priorities for the next cycle
15:11:06 <TheJulia> Sanity preservation unconference often did end up being some deep technical discussion
15:11:18 <TheJulia> rpittau: agree completely
15:11:45 <TheJulia> I just don't think we should shoehorn all activities including problem definition into the PTG and expect success to result
15:12:37 <JayF> ++
15:12:38 <TheJulia> like Adam's whole disk encryption proposal.... that might need some advance discussion before planning or at least base context sharing
15:12:42 <rpittau> I guess we can come up with the priorities during other discussions and then just write them down to have them ready for the next cycle
15:12:56 <JayF> TheJulia: I just wanna talk with him about it b/c it's so dang cool :D
15:13:27 <TheJulia> Well, spreading understanding is distinctly different than defining and deciding on priorities
15:13:40 <TheJulia> JayF: oh my, I haven't even read it yet because insanely busy
15:13:46 <JayF> we've often combined what and how in PTG
15:13:48 * TheJulia resists for now
15:13:50 <JayF> splitting them is wise
15:14:06 <TheJulia> we don't have to entirely split, just... spread context, and it doesn't have to be formal
15:14:13 <JayF> TheJulia: eh, more that I'm a sicko who appreciates seeing crap I've manually done in a gentoo desktop get mainstream enough that I might get it for my day job :D
15:14:42 <TheJulia> eh, that is more Freaky, TBH :)
15:14:46 <cardoe> Someone likes self flogging.
15:14:59 <TheJulia> heh
15:15:00 <JayF> not exactly :D
15:15:10 <TheJulia> Okay, back to the topic at hand!
15:15:14 <JayF> It's not that tough anyway, dracut supports it well
15:15:24 <rpittau> let's see how many topics we have for the PTG first
15:15:57 <TheJulia> I do think bringing back the SPUC is worthwhile just from a contributor connection/relationship building standpoint
15:16:15 <JayF> ++ especially with new folks in the community
15:16:35 <cardoe> Can someone define SPUC? (sorry newbie here)
15:16:42 <TheJulia> Sanity Preservation Un-Conference
15:16:49 <masghar> (Also listening)
15:16:59 <TheJulia> A weekly call we did during the pandemic to give people an outlet to vent/connect/relate
15:17:09 <TheJulia> And discuss any topics in high bandwidth
15:17:11 <masghar> Oh that sounds nice
15:17:44 <TheJulia> Keep in mind, when we started it, most of the core contributors were losing all remaining sanity
15:17:48 <TheJulia> But it really did help
15:18:23 <JayF> We eventually gave up on sanity preservation
15:18:30 <JayF> but friendship is valuable even for those who have no sanity :)
15:18:35 <TheJulia> ++
15:18:37 <rpittau> I agree
15:18:37 <rpittau> if we bring that back maybe we could do it once per month or something like that
15:18:42 <TheJulia> Anyway, let's get the meeting back on track
15:18:59 <rpittau> yep
15:19:11 <rpittau> #topic Review Ironic CI status
15:19:16 <TheJulia> Anything that is not weekly makes it hard to keep it in mind, fwiw.
15:19:28 <rpittau> #info ironic CI was impacted by an issue caused by the removal of simplejson from osc reqs
15:19:59 <JayF> stevebaker[m] and I are doing a stable branch CI audit, I think much of that is in better shape than usual
15:20:04 <JayF> but I need to check in to see how far we've gotten
15:20:22 <rpittau> JayF: I think I reviewed most of the changes already
15:20:55 <rpittau> btw this is the etherpad link
15:20:55 <rpittau> #link https://etherpad.opendev.org/p/ironic-ci-audit-july-2024
15:22:12 <JayF> yeah I just haven't checked that in a couple days, had other distractions
15:22:17 <rpittau> I'll have another look at the patches during the week
15:22:58 <rpittau> probably we need to clarify that anything that still uses CS8 must be removed or made non-voting
15:23:21 <rpittau> not sure it's worth upgrading to CS9 unless we still maintain the branch
15:23:45 <TheJulia> Removed, I believe
15:24:00 <TheJulia> since the base images are also gone and the jobs just get ignored as a result
15:24:12 <rpittau> yeah, actually better not waste CI resources
15:24:17 <TheJulia> ++
15:24:47 <rpittau> anything else CI related ?
15:25:41 <rpittau> alright so we don't have discussion topics, unless someone has something to bring up
15:26:08 <JayF> I think cardoe had something for open discussion
15:26:18 <rpittau> sure thing
15:26:27 <cardoe> Hello all. Got a few asks.
15:27:02 <cardoe> So working on an effort around a sizeable amount of machines living in Ironic (multiple ones that we'll have as regions)
15:27:17 <cardoe> Our deployment is centered around Kubernetes and we're using OpenStack Helm.
15:27:38 <cardoe> We're letting people consume the hardware via Nova and some like jamesdenton consume it via baremetal directly.
15:27:43 <cardoe> Just wanted to share that context.
15:27:58 <TheJulia> Interesting....
15:28:04 <TheJulia> context++
15:28:14 <JayF> oh, you're working on my old cluster, or something adjacent to it :D
15:28:15 <cardoe> Happy to give more details anytime as well.
15:28:40 <cardoe> So we've tried to use the redfish inspector via sushy and there's quite a few bits lacking.
15:28:45 <JayF> I'm sorry and/or you're welcome as appropriate :)
15:28:57 <chris218> Hi is the meeting still ongoing?
15:29:05 <TheJulia> chris218: yes, cardoe has the floor
15:29:07 <cardoe> I've been monkey patching some more endpoints into sushy because I know there's a specific targeted profile, but how much appetite is there for extensions there?
15:29:21 <cardoe> e.g. ethernet_interfaces is busted on a bunch of hardware.
15:29:32 <cardoe> https://review.opendev.org/c/openstack/ironic/+/924943
15:29:45 <TheJulia> cardoe: so there is a lot of history here, so I guess to sort of start off
15:30:10 <cardoe> e.g. Dell R76xx family of hardware with iDRAC 9 7.x.y.z returns only the 2 on board 1GB interfaces and then an empty MAC address and none of the other NIC cards plugged in.
15:30:13 <TheJulia> The out-of-band introspection is minimal information, but I think we would accept patches to obtain/match/identify more within reason
15:30:22 <TheJulia> ethernet_interfaces themselves has a long, painful, history
15:30:33 <cardoe> Right. I've switched us to using Ports
15:30:37 <TheJulia> cool cool
15:30:57 <dtantsur> You have a fix though, it just needs to be moved to sushy?
15:31:40 <cardoe> Well the fix for that change request is to just filter out the empty MAC. Which is done in a few places in Ironic code. Probably should be centralized.
15:31:40 <TheJulia> so the other challenge there is depending on the cards, the interface between the BMC and the firmware (i.e. is it dell's version of the firmware, or chipset OEM's card, or even third party firmware), you're going to get different BMC reporting behavior *as well*
15:31:58 <cardoe> My "full" fix is to stop using ethernet_interfaces and use the ports endpoint in Redfish, but that's a big change to the interop profile.
15:32:00 <TheJulia> filtering the empty mac makes sense
15:32:26 <dtantsur> cardoe: yeah, also we cannot stop using them - we'll definitely find hardware that only has EthernetInterfaces...
15:32:28 <cardoe> I've got piles of HP and Dell gear only that I'm testing against. So that's gonna be my bias filter.
15:32:35 <dtantsur> (we've been in similar positions several times)
15:32:37 <TheJulia> the underlying problem is.... I'm trying to think of the protocol used for out of band communications between the BMC and the cards, it knows it is a nic, but not enough information/support to pass along the mac address
15:32:46 <TheJulia> so filtering empty makes *tons* of sense
15:33:06 <dtantsur> but we definitely need to fix sushy to rule out empty strings. if you're not planning on it, we should probably Just Do It ourselves
15:33:22 <cardoe> So would there be an appetite for adding a "ports" interface to sushy? Ironic doesn't have to depend on it.
15:33:32 <dtantsur> TheJulia: we already have "is not None" btw, but that does not capture empty strings
15:33:35 <TheJulia> I would be happy to review such patches
15:34:02 <cardoe> You don't have to do it. You gimme $TIME and I'll cherry-pick them out and push them to gerrit
15:34:04 <dtantsur> cardoe: we're open for any standard-compliant extensions that open new possibilities for Ironic or adjacent projects
15:34:05 <TheJulia> dtantsur: yeah, that was sort of what I was figuring
15:34:34 <cardoe> Yeah so my Ports implementation follows the DMTF spec
15:35:02 <rpittau> that's a great selling point :)
15:35:06 <cardoe> I tried to make it as identical as possible to how you guys have done the others. But I'm sure there's some stuff I've missed and don't know. I'm happy to get review and do the lift to get it landed.
15:35:46 <JayF> that sounds like about as easy of a change as we'll ever have to approve from a theoretical standpoint. A standards-compliant implementation of a redfish endpoint mimicking existing style :D
15:35:55 <cardoe> Essentially I'm trying to add as much of what Ironic Inspector did into the Redfish inspector.
15:36:07 <cardoe> To have as much out of band as possible.
15:36:24 <dtantsur> ++
15:36:42 <cardoe> When I send a human into the DC to fix up a box, I wanna quickly out of band make sure the hardware they swapped is where it's supposed to be.
15:37:00 <cardoe> and before they get back to their desk be able to tell them "no go back and fix it"
15:37:05 <TheJulia> The one thing to keep in mind, you may still want a full OS to boot and look for any devices, because BMC support and reporting relies on firmware on the card behaving on the i2c bus
15:37:44 <TheJulia> ++
15:37:45 <cardoe> 100%. We're doing out-of-band not using the ironic redfish inspector today. But I'm pushing my team to use sushy (our fork) for all of that.
15:38:04 <cardoe> Then we're flipping it to ironic inspector to do the more complete inspection.
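The empty-MAC filtering discussed above (an "is not None" check that lets empty strings through) can be illustrated with a minimal sketch. This is not the code from sushy, Ironic, or the linked change; the helper name `usable_macs` and the sample data are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Optional


def usable_macs(macs: Iterable[Optional[str]]) -> List[str]:
    """Return normalized MACs, skipping None, empty, and whitespace-only values.

    Hypothetical helper: the name and behavior are assumptions for illustration.
    """
    result = []
    for mac in macs:
        # An "is not None" check alone would let "" through; also testing the
        # stripped string rejects None, "" and "   " alike.
        if mac is None or not mac.strip():
            continue
        result.append(mac.strip().lower())
    return result


# Sample values of the kind a BMC might report (made up for this sketch).
reported = ["AA:BB:CC:DD:EE:01", "", None, "  ", "aa:bb:cc:dd:ee:02"]
print(usable_macs(reported))
```

Centralizing a check like this in one helper, rather than repeating it at each call site, is one way to address the "done in a few places" concern raised in the discussion.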
15:38:10 <TheJulia> very cool
15:38:17 <TheJulia> Sounds good
15:38:26 <cardoe> Switching to the "agent" currently.
15:38:57 <cardoe> Just wondering if we can maybe contribute this to the redfish inspector eventually.
15:39:07 <cardoe> Or land our sushy bits into actual upstream.
15:39:18 <JayF> I think the answer is very yes
15:39:30 <cardoe> We did some Dell specific bits in our sushy fork which now has us wanting to fork sushy-oem-idrac and get it there.
15:39:33 <cardoe> Okay good.
15:39:45 <JayF> I think we've considered in the past rolling sushy-oem-idrac back into sushy
15:39:53 <JayF> might be worth having a larger discussion about that if it makes what you're doing easier, too
15:39:59 <cardoe> The other issue we've got is for new hardware out of the box (or maybe pallet? or shipping container? I dunno never seen the gear in person)
15:40:18 <cardoe> A lot of Dell stuff doesn't PXE boot for example.
15:40:32 <TheJulia> sushy-oem-idrac is largely for the super-specific behavior dell doesn't intend to change/fix/make compliant
15:40:35 <cardoe> We do some Redfish calls to fix it up
15:40:37 <dtantsur> is virtual media an option (instead of pxe)?
15:40:44 <TheJulia> will the dell gear do httpboot out of the box?
15:41:04 <cardoe> Not sure. I need to still test with that.
15:41:23 <cardoe> That was my backport question cause the idrac driver doesn't allow redfish-https
15:41:47 <TheJulia> oh, to send a network boot url out of band?
15:42:12 <cardoe> yeah. It's on a REAL soon sprint for someone to work that.
15:42:17 <cardoe> And confirm
15:42:25 <TheJulia> yeah, that is likely just a bug from my point of view
15:42:36 <cardoe> I assume by httpboot you meant redfish-https
15:42:44 <TheJulia> yeah, one of the flavors
15:42:53 <TheJulia> there are two distinct http boot flavors
15:43:22 <cardoe> So we poke the BIOS settings via redfish
15:43:25 <dtantsur> virtual media has existed for somewhat longer though
15:43:37 <TheJulia> I think dell gear from the factory ships leaning towards httpboot out of the box, but I've never seen a fresh from the factory box which has not been modified by well meaning humans
15:43:48 <cardoe> yeah virtual media doesn't work on the last dozen? or more racks we've gotten from Dell.
15:44:07 <TheJulia> oh joy
15:44:28 <cardoe> https://paste.opendev.org/show/bpoxYEvFBbqHpsGTy1IF/ https://paste.opendev.org/show/bUZln4wKQ1eINnjpTscw/
15:44:28 <TheJulia> cardoe: if you can get us details in a bug, we might be able to assist with that
15:44:35 <dtantsur> huh, interesting. we have bugs from time to time, but never seen any issues at scale
15:44:35 <cardoe> Will do.
15:44:56 <cardoe> I mean I didn't try EVERY server in the rack
15:45:12 <TheJulia> I think this one has appeared in very recent firmware
15:45:14 <TheJulia> "Virtual Media is detached or Virtual Media devices are already in use."
15:45:19 <cardoe> But n is definitely n>1
15:45:29 <cardoe> yes. These boxes all came with iDRAC9 7.00.60.00
15:45:30 <TheJulia> I ?think? Iury was starting to look a week or two ago
15:45:43 * iurygregory looks
15:45:51 <cardoe> Whatever the 6.x.y.z version was didn't do this.
15:46:06 <TheJulia> yeah
15:46:16 <cardoe> So to that effect, I tried to use the bios config of the redfish driver in Ironic.
15:46:19 <iurygregory> oh I saw this error once
15:46:24 <TheJulia> Anything else cardoe? I think chris218 was next up :)
15:46:24 <cardoe> But it doesn't allow for disable_ramdisk=True
15:46:31 <iurygregory> but it was also a networking problem
15:46:50 <dtantsur> cardoe: re disable_ramdisk, janders and I have mid-term plans to look into that (in the context of servicing)
15:46:56 <iurygregory> ironic couldn't reach the address where I had the vmedia
15:47:05 <cardoe> okay I'll work with janders on that.
15:47:14 <cardoe> Perfect. I'll give up the floor. Thank you all.
15:47:19 <rpittau> cardoe: would be nice to continue the discussion, please open a bug and we'll look into it :)
15:47:20 <rpittau> thanks!
15:47:20 <TheJulia> Thanks cardoe!
15:47:53 <rpittau> any other discussion topics?
15:47:56 <cardoe> Just wanna say thank you all for the hard work. I wanna make sure we contribute back and get my folks engaged so thanks for entertaining me and my questions.
15:48:26 <TheJulia> chris218: you were wondering if the meeting was still in progress? Do you have a topic for discussion?
15:50:48 <TheJulia> I guess something got their attention.
15:50:54 * TheJulia shrugs
15:51:02 <cid> chris218 is probably away from the keyboard.
15:51:04 <rpittau> let's move on
15:51:19 <rpittau> #topic Bug Deputy Updates
15:51:24 <rpittau> cid: anything to report?
15:51:41 <cid> Nothing really.
15:51:59 <cid> Only a new bug was filed and someone helped me with triaging it.
15:52:10 * cid one
15:52:35 <rpittau> ok cool
15:52:46 <rpittau> any volunteer for bug deputy for this week ?
15:53:10 <iurygregory> I can
15:53:15 <rpittau> thanks iurygregory :)
15:53:49 <rpittau> #topic RFE review
15:54:00 <rpittau> we have an rfe to discuss apparently https://review.opendev.org/c/openstack/sushy-tools/+/923111
15:54:27 <JayF> There seems to be a lot of ... undocumented context around the direction of sushy-tools?
15:54:38 <mohammed> a tiny patch on sushy-tools. It's a minimal and generic hook that can be enabled to send status changes to a pluggable component's API endpoint
15:54:47 <JayF> I don't have a strong care about it, other than worrying about if this complexity could be maintained if the folks interested in these advanced features went away
15:54:56 <dtantsur> JayF: not really a lot, the root is very simple: the group of us wants to adapt sushy-tools for scale-testing Ironic
15:55:02 <dtantsur> like, scale scale
15:55:28 <JayF> That sounds like a worthy thing, and exciting, but without the plan laid out for upstream about how it's being done, it does make it difficult to review
15:55:31 <dtantsur> then there is some history on how we came to this specific proposal
15:55:53 <TheJulia> I think the problem is the original discussion didn't spread wide so ongoing context is lost
15:56:08 <dtantsur> JayF: have you seen the updated docs? https://review.opendev.org/c/openstack/sushy-tools/+/923111/14/doc/source/user/dynamic-emulator.rst
15:56:09 <TheJulia> (Hey, this is the sort of thing SPUC would have helped with! *ducks*)
15:56:19 <dtantsur> SPUUUC \o/
15:56:21 <mohammed> Why do we need it? It extends the fake system, which is currently insufficient for testing end-to-end Ironic deployments.
15:56:26 <masghar> SPUC++
15:56:41 <JayF> I guess I just wish I could see the whole picture
15:56:50 <mohammed> who will use it? For now, but not limited to, the fake IPA (which mocks some real IPA command executions). This component is in progress on the Metal3 org, and we plan to integrate it with the fake system using this interface.
15:56:55 <JayF> right now it feels like I'm reviewing the "y" of the xy problem on all these changes because I can't see the bigger picture
15:57:08 <chris218> TheJulia: nah I had a question about implementing custom ironic driver
15:57:10 <JayF> which is mostly fine, I rarely review sushy* stuff anyway, but it seems to need the review attention now
15:57:30 <mohammed> Note: Sushy-tools emulates Redfish-compliant hardware, creating a "fake" environment to mimic the behavior of physical hardware for testing and development purposes. The fake system within sushy-tools adds another layer of simulation, creating fake drivers and components on top of the already emulated hardware environment. Essentially, the fake system is a "fake of the fake" and is intended only for testing purposes. I
15:57:30 <mohammed> just want to use it, and if there is no motivation to merge it into sushy-tools, I can host it elsewhere to continue using it effectively.
15:58:17 <cardoe> I mean I've thought about using something similar to that for some "fast" tests.
15:58:20 <TheJulia> I think it is likely okay
15:58:48 <TheJulia> it being the doing a thing, not having looked at the content yet
15:58:55 <cardoe> But our concern has definitely been to not pull a CrowdStrike and implement our behavior to the fake implementation and then have issues in prod.
15:59:30 <TheJulia> .... too soon
15:59:42 * TheJulia had to deal with airline flight drama
16:00:14 <mohammed> i do not think sushy will run in any prod env
16:00:22 <dtantsur> cardoe: sushy-tools is something that should not exist in production hence we're trying to put everything there
16:00:33 <dtantsur> mohammed: please avoid confusion: sushy vs sushy-tools
16:00:41 <dtantsur> sushy is a Python library that Ironic uses
16:00:43 <JayF> mohammed: yeah, mainly my question in the review was in an attempt to have this conversation happen
16:00:45 <rpittau> I guess we need to keep discussing that, maybe one or more reviews with questions/answers in the patch could help
16:01:02 <JayF> mohammed: the conversation has happened, people are more on the same page, I think you'll get this all landed
16:01:13 <TheJulia> JayF: ++
16:01:14 <dtantsur> I think the patch looks solid, I just spotted one issue with power state handling
16:01:23 <rpittau> alright then :)
16:02:08 <rpittau> going to close the meeting, we're a couple of minutes out
16:02:10 <rpittau> thanks all!
16:02:14 <rpittau> #endmeeting