15:00:14 #startmeeting ironic
15:00:14 Meeting started Mon Aug 5 15:00:14 2024 UTC and is due to finish in 60 minutes. The chair is rpittau. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:14 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:14 The meeting name has been set to 'ironic'
15:00:33 o/
15:00:34 o/
15:00:36 o/
15:00:39 Hello everyone!
15:00:39 Welcome to our weekly meeting!
15:00:39 The meeting agenda can be found here:
15:00:39 #link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_August_05.2C_2024
15:00:46 o/
15:00:54 o/
15:01:19 #topic Announcements/Reminders
15:01:32 Standing reminder to review patches tagged ironic-week-prio and to hashtag any patches ready for review with ironic-week-prio
15:01:32 #link https://tinyurl.com/ironic-weekly-prio-dash
15:02:02 o/
15:02:26 we have some patches in merge conflict in the list, worth having a look there
15:03:04 #info 2024.2 Dalmatian Release Schedule
15:03:04 #link https://releases.openstack.org/dalmatian/schedule.html
15:03:08 Maryna Savchenko proposed openstack/ironic master: Fixes config drive base64 encoding in kickstart file https://review.opendev.org/c/openstack/ironic/+/925569
15:03:12 we're at R-8 two months to go!
15:03:15 k, I'll try to rebase those this week
15:03:29 thanks TheJulia :)
15:03:52 Maryna Savchenko proposed openstack/ironic master: Fixes config drive base64 encoding in kickstart file https://review.opendev.org/c/openstack/ironic/+/925569
15:03:54 Feature Freeze is in 3 weeks
15:04:24 I really, really need reviews on ironic-guest-metadata
15:04:29 because Nova will enforce FF
15:05:14 ++
15:05:21 JayF: I see it's failing CI, unrelated ?
15:05:30 yes
15:05:38 Maryna Savchenko proposed openstack/ironic master: Fixes config drive base64 encoding in kickstart file https://review.opendev.org/c/openstack/ironic/+/925569
15:05:39 just not rechecking like a madman on something that almost certainly will get revised
15:05:47 ok
15:06:31 any other priorities since we're close to FF ?
15:06:45 What's FF?
15:06:50 Feature Freeze
15:06:58 Oh, okay
15:07:17 I'm soliciting reviews on runbooks
15:07:19 JayF: so you need it merged, what, this week?
15:07:21 in ironic/.
15:07:23 ?
15:07:25 Any feedback is appreciated
15:07:47 TheJulia: I need feedback on the Ironic and Nova changes, and I will ask Nova for nova side reviews
15:08:00 TheJulia: the feature basically straddles both projects so I see it as a situation where both should be +2 before either lands
15:08:06 JayF: if you can post a link, don't know if that is a topic or hashtag
15:08:31 I can't find it in the weekly priorities patches
15:08:38 #link https://review.opendev.org/c/openstack/ironic/+/924887
15:08:45 Thanks!
15:08:46 TheJulia: https://review.opendev.org/q/topic:%22ironic-guest-metadata%22
15:08:47 cid, I will take another look this week, the patch is big, but by Wed I should provide some feedback
15:09:04 Tks
15:09:32 ok moving on
15:10:08 #info the next OpenInfra PTG will take place October 21-25, 2024 virtually! don't forget to register
15:10:08 add your name and topics to the etherpad
15:10:08 #link https://etherpad.opendev.org/p/ironic-ptg-october-2024
15:10:35 JayF: thanks
15:10:39 we'll have to decide the schedule soon so please add your topics there
15:11:32 btw I forgot one very important thing!
15:11:55 #info PTL elections nominations open next week!
15:12:12 #link https://releases.openstack.org/dalmatian/schedule.html#tc-and-ptl-elections
15:12:53 Do you intend to run again?
15:13:44 JayF: not sure, I'm still evaluating some things at the moment
15:14:25 please let us know soon if you decide not to
15:14:34 Yeah, someone else has to carve that time out if you don't
15:14:51 So I've been nudged to run for TC so I intend on doing that.
15:15:04 TheJulia: as soon as I know I will let the community know
15:16:44 I was hoping to land https://review.opendev.org/c/openstack/python-ironicclient/+/924895 into the next client release as well if at all possible.
15:16:47 cardoe didn't even make me get out my cudgel
15:17:49 onward!
15:17:56 #topic Review Ironic CI status
15:18:05 seems like the CI was pretty stable last week
15:18:22 anything worth mentioning ?
15:19:14 alright!
15:19:26 #topic Discussions
15:19:33 cardoe: added the ironic-week-prio label to that change
15:19:44 adam-metal3: you have 3 topics
15:19:57 rpittau: yes
15:20:18 should I go one by one or should I just answer questions?
15:20:54 adam-metal3: let's do that in order, I think a discussion already started before the meeting
15:21:34 +1 to one by one
15:21:53 sure
15:22:25 So the first topic is RFE Ironic Support for custom reboot https://bugs.launchpad.net/ironic/+bug/2076099, https://review.opendev.org/c/openstack/ironic/+/925703 the second link is the implementation
15:23:20 In the current implementation of sushy-tools, if you would like to use an OStack VM to emulate a node with sushy-tools but the VM has booted from a volume or a snapshot, there will be a failure
15:23:59 Re reboot: I've left a comment about using deploy steps overrides - please check
15:24:02 the reason is that sushy-tools currently only looks for the boot image that is directly attached to the VM instead of checking the "volume chain"
15:24:03 I would prefer a route, like Dmitry already posted there, of using existing configs / generic configs to achieve this goal. I am afraid a specific option of this type would be creating technical debt and potentially lead to folks breaking their system if they don't know where all the seams lie.
15:24:22 Re sushy-tools: I don't care much about it as long as the implementation is not too crazy to review and maintain
15:24:41 and you can answer "why does this exist" just by reading the repo/docs
15:24:47 (re sushy-tools ^)
15:25:15 Personally for me I'd rather see 1 path and maybe we fix it with some kind of callback to ensure we're ready to proceed to the next step?
15:25:20 re: reboot, I think there is a mis-conveyance of context
15:25:36 oo shit so the first was the reboot
15:25:41 because the proposal reads as a step, but *not really* based upon the code. Just highlighting as it is a risk to create confusion
15:25:47 sorry sorry for the reboot I have the explanation here:
15:25:55 So RFE 2 custom reboot workflow: dev creates a custom hardware manager, adds "custom_reboot: True" to the step(s)'s properties (in the code of the custom hardware manager, same as we do for "requires_boot" property), when Ironic reaches this custom step during execution it detects and saves "custom_reboot=true" to the current "Task" instance, when the execution reaches the mandatory "agent_tear_down" out of band
15:25:55 step (priority 40) then it checks if the current "Task" has the "custom_reboot" value set to true and if yes it will skip the reboot. TLDR this is an option for those who write custom hardware managers, the property is set during the implementation of the custom steps, no configuration needed during runtime. No exposure of variable on Ironic API, no variable exposure in Ironic or IPA config, all configuration is
15:25:56 contained within the custom hardware manager.
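A minimal sketch of the custom reboot workflow as described in the message above, in plain Python with no Ironic or IPA imports. The "custom_reboot" step property is the proposal under discussion, not an existing Ironic feature; `Task`, `run_steps`, `STEPS`, and the step names other than "agent_tear_down" are hypothetical stand-ins used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for the per-deployment task state."""
    custom_reboot: bool = False
    log: list = field(default_factory=list)


# Steps as a custom hardware manager might declare them. The
# "custom_reboot" key is the proposed property, not an existing one.
STEPS = [
    {"step": "write_image", "priority": 80},
    {"step": "vendor_finalize", "priority": 50, "custom_reboot": True},
    {"step": "agent_tear_down", "priority": 40},
]


def run_steps(task, steps):
    """Run steps in descending priority order, honouring custom_reboot."""
    for step in sorted(steps, key=lambda s: s["priority"], reverse=True):
        if step.get("custom_reboot"):
            # Record on the task that a step handles the reboot itself.
            task.custom_reboot = True
        if step["step"] == "agent_tear_down":
            # The mandatory tear-down step checks the flag before rebooting.
            if task.custom_reboot:
                task.log.append("agent_tear_down: reboot skipped")
            else:
                task.log.append("agent_tear_down: reboot issued")
            continue
        task.log.append(f"{step['step']}: done")
    return task.log
```

The sketch only models the flag hand-off between a custom step and the tear-down step. It does not address when Ironic should switch networks after a step-driven reboot.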
15:26:23 yeah but that's based on a flawed assumption: that the machine, if rebooted in-band, would be able to recover itself
15:26:25 What I mean is: disabling reboot after deployment is possible now, via deploy steps
15:26:31 how does Ironic know when to flip the network if I have that enabled?
15:26:36 So can you go into what the overall goal is? What's the big overall problem to solve?
15:27:17 that's why I said earlier that I think the only way this makes sense (outside of the generic route dtantsur already laid out in the bug) is if we make required reboot steps more explicit and documented/overridable in custom-agent similar to how we do required deploy-steps
15:27:18 I suspect this might be a separate case to support, not the default usage case, fwiw
15:27:21 JayF: yeah that's a question I would have as well. How does it know when to move that.
15:27:50 explicit++
15:27:50 That is sort of critical for those with security boundaries to maintain
15:28:03 but maybe custom-agent is not that deployment interface?
15:28:29 I guess what we all miss is a pretty detailed user story
15:28:46 ^ agreed
15:28:47 yeah I think something like custom-agent where you are explicitly opting into "Ironic the orchestration engine for me to do deployments in" mode and opting out of "Ironic the deployment tool" mode
15:28:59 dtantsur: ++++
15:29:09 JayF: this is how we use it in OpenShift, yes
15:30:06 So I would need to provide a user-story, that is fine
15:30:21 And I think that is fine, just need to be expressly understood and shared around the differences for interfaces because they all serve different problem spaces
15:30:22 is there something else that is an issue ?
15:30:38 It's hard to tell until we know the WHY
15:30:39 adam-metal3: just a statement to set us all to the same context as to how the usage will occur
15:30:52 We need to be able to answer Why
15:31:16 adam-metal3: I'd say generally we have trepidation for allowing people to override parts of Ironic that can impact the ability for it to work and be secure. That means you gotta tell us a really compelling story as to why it's worth it.
15:31:46 And we're all at different starting places with different contexts
15:32:03 which is why the discussion has headed down the path it has
15:32:15 We should likely move on to the next item
15:32:51 so the process would be that I write user stories in the RFE and then I bring it up again on the next meeting?
15:32:55 or just on the chat?
15:33:06 adam-metal3: just on the RFE
15:33:20 and you can always use chats in IRC/meeting to refine if folks are around
15:33:23 yeah, on the RFE. Something to help us get to the same place so we understand why
15:33:25 JayF: ++
15:34:22 Okay, then for the second, the sushy-tools improvement, is there something I should do there in your opinion?
15:34:32 can you link it?
15:34:32 That reads like a bug to me
15:34:40 I was thinking the same
15:34:45 "I can't use this snapshot which is a registered VM"
15:34:46 [Adam] RFE sushy-tools Support for OStack VMs booted from volume/snapshot https://bugs.launchpad.net/sushy-tools/+bug/2067405, https://review.opendev.org/c/openstack/sushy-tools/+/912113
15:35:20 I agree it's a bug not an rfe
15:35:37 It's a lot of code, but I think the motivation is quite clear
15:36:14 I think it could be, and sometimes bugs are a lot of code to fix. :(
15:36:46 well the support for volumes and snapshots was missing completely so that is why I needed that amount of code
15:36:48 ... a lot of that is test and docstrings :)
15:36:57 and sadly volumes can be cloned from volumes, that complicates things
15:37:17 it was not an objection, just an observation
15:37:20 and yes mostly unit tests
15:37:32 So somewhat related, do we have good docs on how to use the different test / emulation modes of sushy-tools?
15:37:43 Like OpenStack VMs like adam-metal3 is doing?
15:37:48 cardoe: last time I looked, not really.
15:37:59 cardoe: Patches welcome is the saying :)
15:38:06 Oh for sure.
15:38:13 I've been trying to be better about asking for docs for sushy-tools changes
15:38:20 I was just going to say maybe that's another place that the user story case could help.
15:38:25 Please keep in mind, it is a testing only tool, not meant for production in any context.
15:38:31 Absolutely.
15:38:57 ++
15:38:57 onward?
15:39:06 I'll try to write something up.
15:39:13 cardoe: much appreciated!
15:39:22 adam-metal3: one more topic for you, but I think we discussed this already?
15:39:26 yeh sure as I understand there is no issue with the sushy-tools bugfix (not RFE)
15:39:50 Yeah I started to receive reviews on the 3rd topic the encryption, thanks for that
15:40:00 I just wanted to bring it up officially too
15:40:09 [Adam] RFE IPA LUKS+TPM https://bugs.launchpad.net/ironic/+bug/2073762, proposal: https://review.opendev.org/c/openstack/ironic-specs/+/924993
15:40:29 I started reviewing that today. I think some reformatting is going to help a lot
15:40:37 I've echoed lots of dmitry's comments
15:40:40 or, at least I am
15:40:43 I guess we can approve that RFE unless someone has objections
15:40:44 This is in a sense the opposite case: it has great background information but could use a bit more technical details
15:42:05 Yes my takeaway from dtantsur's comment was that a "sequence" type description is missing so I will extend the whole image and partition image sections with e2e workflow explanation
15:43:14 I like the overall idea. I have some serious concerns
15:43:15 I'll note that re: technical details, we should be a bit flexible too, the LUKS stuff is a bear to get working IRL
15:43:32 I am excited about the possibilities this feature enables though, and will look at that spec Soon(tm)
15:43:44 I'm definitely not suggesting to describe every line of the future code :)
15:43:54 honestly that might be a good one to put on for the PTG if it's not fully specified by then
15:43:55 dtantsur: ++
15:43:58 or even if it is, to share context
15:44:16 it is indeed a potential topic for PTG
15:44:22 I suspect this might have to be finalized at the ptg
15:44:30 Because it is a huge amount of work
15:44:54 yeah agree, I will most probably push an e2e implementation for the whole disk image workflow to have some code present alongside the discussion anyways I am writing it already
15:46:06 the partition image support is a bit more rocky path because of the initrd editing that I have proposed, I might need a different approach there
15:46:16 adam-metal3: feel free to add the topic here https://etherpad.opendev.org/p/ironic-ptg-october-2024
15:46:25 I will note that having the encryption method be pluggable would be super interesting to me. My downstream currently does disk encryption via a custom step assisted by an external device.
15:46:30 To be honest, I wouldn't be sad if partition image support for it was never implemented,
15:46:41 We'd likely move to such a pluggable interface if it was flexible enough
15:46:46 The reality is that we have to do a weird dance to move artifacts over for proper UEFI boot to happen
15:47:09 I will note I *think* that is managed somewhat via the OOB, so making it a step/agent only thing wouldn't fly for that generic case
15:47:41 inside of a step, decisions can be made, but it has to know what is wanted by the user to make proper decisions
15:48:06 that is actually a point of concern I have as well, but I need an overall flow to validate if I'm interpreting it the way that worries me
15:48:29 Overall phases are a good thing, logically decomposing work is also a good thing
15:49:26 let's move on?
15:49:29 ++
15:49:31 ++
15:49:33 ++
15:50:07 next is me
15:50:07 I have one "simple" question: do we want to support 2 ansible versions in bifrost?
15:50:15 as in: 2 major versions?
15:50:17 2 major versions
15:50:18 yeah
15:50:22 wo wo
15:50:25 wow wow
15:50:29 =O
15:50:31 *want* could be a strong word, but it's not impossible
15:50:36 what's the background?
15:50:37 the current situation is that ansible 9 is "going away"
15:50:47 * TheJulia suggests "keeping it simple"
15:50:47 and ansible 10 requires python 3.10
15:50:55 wheeeeeeeeeeeee
15:50:59 so we can't test that on centos9 :D
15:51:00 fun
15:51:02 * TheJulia notes this is release note territory
15:51:06 yeah
15:51:21 so I have a patch to use both at the same time based on the distro
15:51:26 We should involve kayobe in this decision, perhaps?
15:51:26 but yeah....
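A small sketch of the constraint raised above: choosing an Ansible major version from the Python version of the host that runs Ansible. It assumes, as stated in the discussion, that Ansible 10 requires Python 3.10 or newer. The function name `pick_ansible_major` and the 9/10 fallback rule are illustrative and are not taken from the bifrost patch, which selects by distro.

```python
import sys


def pick_ansible_major(python_version=None):
    """Return the Ansible major version to use for a controller Python.

    Assumption from the discussion: Ansible 10 needs Python >= 3.10,
    so older interpreters fall back to Ansible 9.
    """
    if python_version is None:
        # Default to the interpreter running this code.
        python_version = sys.version_info[:2]
    return 10 if tuple(python_version) >= (3, 10) else 9
```

Under this rule a host whose interpreter is Python 3.9 would stay on Ansible 9, while Python 3.10 and newer would move to Ansible 10.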
15:51:33 ohh god
15:51:35 in fact, if you could just propose a release note for usage of ansible on bifrost and ironic now to highlight operators will likely have issues
15:51:45 I rarely hear of people using bifrost directly in my day-to-day, but I know a *lot* of folks use kayobe to bootstrap
15:51:48 JayF: probably, although we keep compatibility with ansible9 anyway
15:51:51 so maybe being nice and coordinating with them would be good
15:52:11 rpittau: I wonder if the situation with non-default Pythons in CentOS has improved
15:52:13 JayF: that is a good point, at least on the what beyond raising awareness
15:52:23 Previously, they lacked important packages
15:52:28 * dtantsur podman run
15:52:35 So I suggest 301-redirect this conversation to mailing list?
15:52:43 ++
15:52:44 dtantsur: it has improved, but I'm not sure about using a non default python in this case
15:52:54 why not?
15:53:18 we would use it only in bifrost CI, nowhere else
15:53:47 and it would mean abandoning ansible9
15:53:50 python3-libselinux still has 1 version, nevermind
15:54:02 also yeah not sure about some packages
15:54:15 the problematic ones are selinux, firewalld and dnf IIRC
15:54:38 always ...
15:54:43 * TheJulia has to switch focus in 6 minutes
15:55:20 I can probably complete the patch and see how it goes, and coordinate with kayobe if needed
15:55:30 ++
15:55:41 not sure if this is a problem for bifrost too, but the new ansible version also drops a lot of ansible support for the remote nodes
15:55:43 Assume you will need to since they are all ubutnu I think
15:55:56 so you need python3.10 where you run ansible but then your inventory needs 3.8 or newer too iirc
15:55:58 ubuntu
15:56:09 Gonna be rude and just dump here cause I've gotta do a demo of Ironic internally in 6 minutes and I'll be sharing my screen. I'm wanting to fix up Ironic and anything it needs for Python 3.12. https://review.opendev.org/c/openstack/requirements/+/925306 is my blocker but I plan on sending you guys a bunch of changes once that lands for other pieces. Going to aim to drop all of setuptools rather than pull it in as a depend
15:56:09 so expect to see those changes coming.
15:56:10 clarkb: thanks! not a problem for us: we mostly run ansible locally
15:56:19 clarkb: yeah there are some changes we need to test in advance
15:56:37 cardoe: no big deal, I think many of us have a hard stop at the hour :)
15:56:41 cardoe: cool cool, I think. Good luck!
15:56:49 good luck cardoe :)
15:57:01 I've also got a lot of Dell gear that behaves badly. I'm not sure the best way to proceed to fix things. I've created https://bugs.launchpad.net/sushy/+bug/2075979 and just hoping for a suggestion. I'll write the code.
15:57:04 alright I'll move forward with that and see
15:57:30 Lastly, to deal with these Dells... I need to touch BIOS settings without the IPA booting... so I made https://bugs.launchpad.net/ironic/+bug/2075980 but wanted to make sure that was okay before implementing something.
15:57:58 we just have bug deputy left
15:58:06 #topic Bug Deputy Updates
15:58:11 o/
15:58:14 iurygregory: anything worth mentioning?
15:58:24 * TheJulia notes this has been a super lively meeting
15:58:48 rpittau, I've checked a few bugs, did the triage, closed some old ones
15:58:56 but I was wondering about ironic-inspector bugs
15:59:01 Merged openstack/bifrost master: Remove unused get_md5 https://review.opendev.org/c/openstack/bifrost/+/925669
15:59:03 Merged openstack/bifrost master: Update min required pip version to 22.3.1 https://review.opendev.org/c/openstack/bifrost/+/925602
15:59:04 cardoe: re: https://bugs.launchpad.net/sushy/+bug/2075979 that might be a problem on Dell side actually
15:59:05 Merged openstack/bifrost master: pip: Use SETUPTOOLS_USE_STDLIB if python < 3.12 https://review.opendev.org/c/openstack/bifrost/+/924828
15:59:11 I saw one that was recently reported (like in June)
15:59:34 May be value in evaluating if those bugs would exist in the Ironic impl, and if so, porting the bug report over
15:59:34 so I'm thinking if we would still accept changes there
15:59:38 iurygregory: we should treat inspector bugs as normal, the project is still active even if deprecated
15:59:41 otherwise probably should keep inspector fixes to medium+ bugfixes
15:59:44 https://bugs.launchpad.net/ironic-inspector/+bug/2073588
15:59:49 ++
15:59:54 ok o/
16:00:14 we're out of time
16:00:20 any volunteer for this week's bug deputy?
16:00:37 I'm slammed
16:00:50 heh me too :/
16:00:56 I could
16:00:59 thanks cid
16:01:01 thanks cid :)
16:01:04 thanks all!
16:01:08 #endmeeting