Monday, 2024-08-12

*** dtantsur_ is now known as dtantsur02:24
jssfrHuh. https://review.opendev.org/c/openstack/ironic-python-agent/+/925087 failed in Zuul, but running `tox` finds no fails on my local machine. Is there anything Zuul does differently so I can reproduce the issue?06:53
rpittaugood morning ironic! o/06:58
opendevreviewRiccardo Pittau proposed openstack/ironic bugfix/24.0: [bugfix only] Remove deleted lextudio packages  https://review.opendev.org/c/openstack/ironic/+/92611907:21
iurygregorygood morning ironic10:45
cidI have updated ironic's wiki/meetings page with my bug deputy report, please review for correctness and contained report details ahead of the weekly meeting.12:14
cidhttps://wiki.openstack.org/wiki/Meetings/Ironic12:14
iurygregoryhttps://bugs.launchpad.net/ironic/+bug/2076265 I do think it makes sense to set importance high ..12:35
iurygregoryhttps://bugs.launchpad.net/metalsmith/+bug/2076344 I would leave to metalsmith experts... but it makes sense to me, I would mark as low12:36
iurygregoryyay for a new implementation on hardware vendor side https://bugs.launchpad.net/sushy/+bug/2075979 ...12:37
iurygregoryor at least it seems like a new implementation at first glance, I need more coffee before looking at bugs XD12:43
rpittauwasn't that already addressed ?12:44
rpittauoh no, nvm12:45
rpittauit could be an issue on the dell firmware12:46
rpittaucid: thanks for the report!12:47
rpittaucid: in https://bugs.launchpad.net/metalsmith/+bug/2076344 it mentions metalsmith 1.4.4 but the latest is 2.3.012:48
rpittau1.4.4 is 2 years old12:48
cidrpittau, Hmm, so likely already fixed. How about people still using that version though 12:51
rpittauthat's a great question, 1.4.4 is from wallaby which is currently unmaintained, they should not use that12:52
cid++, should I add a comment "to upgrade to a later version" in the bug?12:54
cidOr just leave it 12:54
rpittauthe minimum supported version is 1.10.1 which is from stable/2023.112:54
rpittaucid: yeah, suggest to use a more recent version to see if the bug is still present12:54
rpittauthanks!12:54
cidrpittau, tks12:55
iurygregoryrpittau, yeah, I'm thinking about a firmware issue also =( 13:00
rpittaucid: if you quickly update https://review.opendev.org/c/openstack/ironic/+/926082 I guess we can merge it as quickly :)13:07
cidrpittau: on it13:08
opendevreviewcid proposed openstack/ironic master: Update error message  https://review.opendev.org/c/openstack/ironic/+/92608213:10
TheJuliagood morning13:14
opendevreviewRiccardo Pittau proposed openstack/ironic-lib master: Fix invalid UTF-8 characters in execute output  https://review.opendev.org/c/openstack/ironic-lib/+/92604513:55
rpittaustill unsure if we should ignore or log the invalid characters , reviews welcome13:56
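The "ignore or log" choice raised above can be sketched in a few lines of Python. This is only an illustration and not the code from the ironic-lib review: the function name `decode_output` and the `log_invalid` flag are hypothetical, and it assumes the command output arrives as raw bytes.

```python
import logging

LOG = logging.getLogger(__name__)


def decode_output(raw: bytes, log_invalid: bool = True) -> str:
    """Decode command output, tolerating invalid UTF-8 bytes.

    Hypothetical helper: with log_invalid=True the first offending
    bytes are logged before being dropped, otherwise they are
    dropped silently.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        if log_invalid:
            LOG.warning("Invalid UTF-8 in command output at byte %d: %r",
                        exc.start, raw[exc.start:exc.end])
        # errors="ignore" drops every undecodable byte from the result.
        return raw.decode("utf-8", errors="ignore")
```

Using `errors="replace"` or `errors="backslashreplace"` instead of `"ignore"` would keep a visible marker in the decoded string, which may be preferable when the output is later shown to an operator.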
rpittau#startmeeting ironic15:00
opendevmeetMeeting started Mon Aug 12 15:00:15 2024 UTC and is due to finish in 60 minutes.  The chair is rpittau. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'ironic'15:00
iurygregoryo/15:00
dtantsuro/15:00
rpittauHello everyone!15:00
rpittauWelcome to our weekly meeting!15:00
rpittauThe meeting agenda can be found here:15:00
rpittau#link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_August_12.2C_202415:00
TheJuliao/15:00
JayFo/15:00
rpittau#topic Announcements/Reminders15:01
rpittau#info Standing reminder to review patches tagged ironic-week-prio and to hashtag any patches ready for review with ironic-week-prio15:01
rpittau#link https://tinyurl.com/ironic-weekly-prio-dash15:01
cardoeo/15:01
rpittauit's not looking bad, lots of new patches, some from last week15:01
masgharo/15:02
rpittau#info 2024.2 Dalmatian Release Schedule15:02
rpittau#link https://releases.openstack.org/dalmatian/schedule.html15:02
rpittauwe're at week -7 !15:02
cido/15:02
rpittauwe should start thinking about what's missing in the non-client and client libraries15:02
rpittauFF is in 2 weeks also15:02
rpittau#info PTL nominations period between August 14 and August 2815:03
rpittauI have decided to run for a second mandate as PTL15:03
iurygregorynice =D15:03
rpittau:)15:03
rpittausorry for the late notice, just life in the middle of everything, it's been a long summer and it hasn't ended yet!15:04
JayFThanks Riccardo!15:04
TheJuliaAwesome, thanks!15:04
rpittaumy pleasure, really :)15:04
rpittauI'm glad I can do it15:05
rpittau#info the next OpenInfra PTG which will take place October 21-25, 2024 virtually! Registration is now open!15:05
rpittau#link  https://ptg.openinfra.dev/15:05
rpittauthe ironic team has been registered15:05
rpittauplease add your name and topics here15:05
rpittau https://etherpad.opendev.org/p/ironic-ptg-october-202415:05
rpittauwe still have some time for topics15:06
rpittauanything else to announce/remind ?15:07
rpittauokey dokey, onward!15:07
rpittau#topic Review Ironic CI status15:08
rpittauanything worth mentioning from last week?15:08
rpittauI've only seen some instability on pkgs repos, but it recovered pretty quickly15:08
rpittaualright, great week for CI :D15:09
rpittau#topic Discussions 15:09
rpittaunothing to discuss, unless anyone has anything to mention?15:09
JayFIs someone making sure old bugfix branches still get retired?15:10
rpittauJayF: I am :)15:10
JayFAwesome, thanks :) 15:10
rpittaunp! :)15:10
rpittauactually maybe I should write something down about the procedure, I'll take a note on that15:11
TheJulia++15:11
rpittauok!15:11
rpittauanything else?15:11
masgharI would like to mention something15:12
rpittaumasghar: please go ahead :)15:12
masgharUnfortunately I have not been able to complete the inspection rules work so far15:12
masghar(Have been a little overwhelmed with bugs and such downstream)15:12
TheJuliaNo worries, it happens :)15:13
masghar(Just a heads-up)15:13
rpittauno worries masghar :)15:13
masgharThanks, will try to carve out time for it15:13
rpittauthanks for the headsup15:13
masgharNo problem, and thanks15:14
masghar(That's it from me)15:14
JayFmasghar: If we can document, in detail, what's done and what's left to be done, I think cid is willing to help pickup some of that work. I'm unsure if the knowledge transfer is worth it :) 15:14
JayFmasghar: so know that's an option at hand if you'd like to exercise it15:15
masgharI have a very tiny patch that I started a few months ago15:15
masgharI can explain my thought process to cid or whoever asks15:15
masgharI appreciate the offer of help :)15:16
cidmasghar, JayF, In my todo for the week15:16
rpittauthanks cid :)15:17
masgharThanks cid!15:17
cidnop!15:17
* cid that should be no problems :)15:18
rpittauanything else? otherwise I have one quick thing15:18
rpittauok15:19
rpittauI forgot to mention that it's almost time for the highlights15:19
rpittauI will take care of them, at least start them, I'll be out at the beginning of September when they're due, but I'll do my best to finish them before I leave for my PTO15:19
rpittauif you have anything you want to mention about this development cycle please let me know15:19
iurygregoryack15:20
rpittaualright, moving on15:20
rpittau#topic Bug Deputy Updates15:20
rpittaucid: thanks for taking care of that, anything to mention?15:21
cidSo, I needed help triaging 4 bugs, I think two can be considered done, except these other two:15:21
cidhttps://bugs.launchpad.net/sushy/+bug/207597915:21
cidhttps://bugs.launchpad.net/sushy/+bug/207598015:21
cardoeSo the first one is mine.15:21
TheJuliaI suspect yours is a valid bug, but I'm having trouble understanding exactly what the partition is in the API15:22
cardoeEssentially sushy does not see those NICs. Dell says they conform to Redfish, etc.15:22
rpittaucardoe: which version of iDRAC9 firmware are you using ? cause that could be a firmware issue15:22
cardoeIt doesn't matter. It's the same for at least half a dozen versions. Both 6.x and 7.x15:23
rpittauok, that's what I wanted to exclude :)15:23
TheJuliait's sort of a framing issue, and my comment sort of reflects this, we need to understand how to frame it and I think we're missing context.15:23
cardoeYeah I'm sure I'm not providing the right details.15:24
TheJuliaWell, you're providing what you have :)15:24
cardoeSo basically when ironic calls `ethernet_interfaces.summary` on Sushy, those NICs are excluded.15:24
TheJuliaI guess sushy needs to understand "which one is actually important"15:25
TheJuliaand identify when to check/reconcile them together15:25
cardoeSo the ramdisk inspector sees NIC.Slot.1-1 as ens2f0np0 for example. Which would imply NIC.Slot.1-1-115:27
dtantsurI see a big problem in that there is no link between the two EthernetInterface objects15:27
cardoeBut then the other port on that card is NIC.Slot.1-2 for example... it's called ens2f1np1 which implies NIC.Slot.1-2-215:28
TheJuliadtantsur: ... that is indeed a huge issue15:29
dtantsurA workaround would be to ignore the Health record if its values are null (as opposed to unhealthy)15:29
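The workaround suggested above (treat a null Health as "unknown" rather than as a failure) might look roughly like the sketch below. It operates on plain dictionaries shaped like Redfish EthernetInterface resources and is not sushy's actual implementation: the function name `usable_interfaces` is hypothetical, and keeping only "OK" among the non-null values is an assumption.

```python
def usable_interfaces(interfaces):
    """Keep interfaces unless they are explicitly reported as unhealthy.

    Hypothetical helper working on dicts shaped like Redfish
    EthernetInterface resources; not taken from sushy.
    """
    kept = []
    for eth in interfaces:
        status = eth.get("Status") or {}
        health = status.get("Health")
        # A null or missing Health means "unknown", so it is not
        # treated as a reason to exclude the interface.
        if health is None or health == "OK":
            kept.append(eth)
    return kept
```

With this rule an interface reporting `"Health": null` would still be listed, while one reporting `"Critical"` would be excluded.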
TheJuliabut, 1-1-1 is a noted partition, but still sort of goes back to what is the partition in the context15:29
cardoeI'm gonna get someone from Dell's firmware team on a call.15:29
TheJulia+++++++15:29
dtantsur++15:30
dtantsur"InterfaceEnabled": false is very concerning (but we don't look at it.. yet?)15:30
cardoeBut I did want to at least bubble this up to you guys and see if I could come up with an acceptable way to map them and put that in sushy-oem-idrac or something.15:30
TheJuliaI guess that explains why it fails15:30
TheJuliathe partition has no mac15:30
cardoeRight15:30
TheJuliaso more info needed, I guess next item?15:32
cardoeSo another weird wrinkle15:32
cardoeWhen I set the devices as PXE or HTTPS bootable from Dell's HTTP UI15:32
cidNext item is: https://bugs.launchpad.net/sushy/+bug/207598015:32
cidThen, one last one from TheJulia: https://bugs.launchpad.net/ironic-python-agent/+bug/2076367 (sounded like something that needs to be discussed, so...)15:32
cardoeThe values that Ironic pulls out from the BIOS are "NIC.Slot.1-1" and "NIC.Embedded.1-1-1"15:33
dtantsurAhh ohh. 2075980 is a long standing pain point15:33
cardoeYeah 2075980 is mine as well. Happy to write patches.15:33
dtantsurcardoe: what's your timezone? I'm happy to chat about the potential fix when I'm not boiling alive in this bloody weather.15:34
JayFIs there something sneakily not straightforward about 2075980?15:34
cardoeI'm CST (or is it CDT right now?)15:34
dtantsurJayF: yes, figuring out what to use instead of IPA to detect the finish of the operation and the subsequent reboot.15:34
dtantsurit could be simple, but we need to take a careful look (and ideally involve janders and iurygregory)15:35
JayFwhy is "flip the power on and wait" just like we do for going ACTIVE on deployment not sufficient?15:36
dtantsurJayF: wait for what?15:36
cardoeSo when I locally patched it, Ironic got mad that the node went away for a while.15:36
TheJuliapower doesn't even necessarily need to be on.15:36
JayFcardoe just pointed at the thing I was trying to see15:36
dtantsurYeah, during BIOS settings or firmware updates, the machine is doing $weird_things for several minutes15:36
iurygregoryyeah 15:36
iurygregoryin firmware updates the bmc will be unresponsive for some time also15:37
cardoeIronic was happy knowing that it would come back to the IPA after a while.15:37
iurygregoryat least with iDRAC the UI goes down XD15:37
dtantsuryeah, we cannot even be sure the BMC behaves during the process15:37
dtantsurso it's "wait for $something and retry if the BMC is not reachable or returns HTTP 5xx"15:38
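The "wait for $something and retry" idea can be sketched as a small polling loop. Everything here is hypothetical and not Ironic code: `BMCUnavailable` stands in for whatever error signals an unreachable BMC or an HTTP 5xx response, `check` is a caller-supplied function that returns True once the operation has finished, and the default timeout and interval are arbitrary.

```python
import time


class BMCUnavailable(Exception):
    """Hypothetical error: BMC unreachable or answering with HTTP 5xx."""


def wait_for(check, timeout=1800.0, interval=10.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or the timeout expires.

    BMCUnavailable is treated as transient and retried, since the BMC
    may misbehave while BIOS settings or firmware are being applied.
    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout
    while True:
        try:
            if check():
                return True
        except BMCUnavailable:
            pass  # transient failure, try again on the next iteration
        if clock() >= deadline:
            return False
        sleep(interval)
```

The open question from the discussion remains what `check` should actually observe once IPA is no longer the signal; the loop only shows how transient BMC failures could be tolerated around it.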
TheJuliaI think we're going down a rabbit hole15:38
dtantsurPTG topic? :)15:38
TheJuliabut maybe disjoint it from the overall issue15:38
rpittausounds like it :)15:38
TheJuliaYes, there are issues, but that is not blocking to trying to fix cardoe 15:38
TheJuliaor at least, cardoe and his efforts are input into that discussion15:39
TheJuliaso we shouldn't inadvertently block15:39
cardoeSo if I just updated 1 setting and let it expect to come back after a while that was fine.15:39
JayFmy question of "where is the hard here" has been more than sufficiently answered :D 15:39
cardoeBut you can't chain 2 clean/service steps15:40
TheJuliacardoe: quite possibly, but you'll need to provide input to the discussions :)15:40
cardoeKick the bug back to me to write up more details when running with a patched Ironic allowing that behavior15:40
cardoeFair?15:40
iurygregoryi'm a bit confused by "you can't chain 2 clean/service steps" .-.15:41
dtantsuryeah, let's start with that, and maybe let's have a high-bandwidth discussion afterwards15:41
TheJuliaiurygregory: i think it means, the second step might fail15:41
iurygregoryTheJulia, oh ok! 15:41
TheJuliaiurygregory: because we don't have a complete understanding15:41
JayFYeah it sounds like the workaround is setting cleaning timeout so high it just works15:41
JayFI've seen similar workaround for in-band steps that rebooted outta band of ironic15:42
rpittauor it doesn't :)15:42
rpittaureboot time is so unpredictable with bios/bmc updates15:42
iurygregorywell, I had a funny iLO bug when doing firmware update (but only happened when doing between two specific versions)15:42
dtantsurI think TheJulia has rightfully hinted that we should collect more information and thoughts before coming up with a solution :)15:43
iurygregoryeven increasing the timeout the node went to clean failed because it failed to power on after reboot15:43
rpittauyes, I agree15:43
TheJuliadtantsur: bingo15:43
iurygregory++15:43
TheJuliaonward?15:45
rpittauyeah15:45
rpittauany other bug to check ?15:45
cid++15:45
rpittauotherwise I think we're good for today :)15:45
rpittauoh wait15:45
rpittauany volunteer for the bug deputy this week ?15:46
cidHappy to do it again15:46
cidThis could be a topic for another day:15:46
cidhttps://bugs.launchpad.net/ironic-python-agent/+bug/207636715:46
cidAnd my observation is worthy of note too: A non-core bug deputy might need to be able to revert the status of a bug that shows as 'In Progress' when the assignee has abandoned it.15:46
rpittauright15:47
JayFI am surprised the ironic-drivers group we added you to doesn't have that ability15:47
JayFcid: I'd say 2076367 is Low15:47
dtantsurJayF: me too15:47
TheJuliaYeah, there is an opportunity to make IPA a little smarter there, but definitely low priority15:47
JayFIt's behavior that's existed for years, that's mildly annoying but the real price paid is minimal (5 seconds?)15:47
TheJuliaeh... 20+ locally15:48
TheJuliaat least, it feels like 20+15:48
TheJuliain a VM15:48
JayFit's a trivial enough fix in any event ... if os.path.exists()15:48
JayF(on the various ipmi device locations)15:48
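The check JayF describes could be as small as the sketch below. It is an illustration rather than the eventual IPA patch: the function name `ipmi_device_present` is hypothetical, and the path list is an assumption based on the device nodes commonly used by the Linux IPMI driver.

```python
import os

# Assumed device node locations for the Linux IPMI driver.
IPMI_DEVICE_PATHS = ("/dev/ipmi0", "/dev/ipmi/0", "/dev/ipmidev/0")


def ipmi_device_present(paths=IPMI_DEVICE_PATHS, exists=os.path.exists):
    """Return True if any known IPMI device node exists.

    Hypothetical helper: a caller could skip IPMI probing entirely
    when this returns False instead of waiting for it to time out.
    """
    return any(exists(path) for path in paths)
```

On a VM without an emulated BMC this would return False immediately, which is where the delay mentioned above is paid today.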
rpittaulooks like a low priority indeed, and a quick fix15:48
* cid Updated 2076367 bug's importance.15:49
rpittaucid: thanks for volunteering, again! :D15:49
rpittauI forgot one more thing!15:50
rpittauI will be out next monday, so someone will have to take care of the meeting and meeting notes, please :)15:50
cidno p15:50
JayFI can run it if you want15:51
rpittauthanks JayF, much appreciated15:51
rpittaualright, I think that's it for today15:52
rpittauthanks everyone!15:52
rpittau#endmeeting15:52
opendevmeetMeeting ended Mon Aug 12 15:52:23 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:52
opendevmeetMinutes:        https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-08-12-15.00.html15:52
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-08-12-15.00.txt15:52
opendevmeetLog:            https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-08-12-15.00.log.html15:52
opendevreviewcid proposed openstack/ironic master: Follow up to the runbooks change (#922142)  https://review.opendev.org/c/openstack/ironic/+/92591715:53
opendevreviewcid proposed openstack/ironic master: Follow up to the runbooks change (#922142)  https://review.opendev.org/c/openstack/ironic/+/92591716:02
cardoeAnything we can do to nudge some of the PRs along? A number from the weekly tag are just waiting on workflows.16:04
TheJuliaI'm heads down the next couple days, I can devote time to reviewing later this week16:07
JayFI'm in a similar boat, but given I have a meeting at EOD I'll dedicate everything after that to reviewing, might get some of them landed if it's not already reviewed by me.16:10
rpittaugood night! o/16:26
opendevreviewMerged openstack/ironic master: DevStack: enable the new in-band inspection by default  https://review.opendev.org/c/openstack/ironic/+/92568817:01
cardoeThere's no "baremetal node eject" like command to let me just nuke a node from the DB even if it's in "active" state?18:12
cardoeOr like delete this node but don't actually tear it down?18:12
cardoeAlso if I told you Dell just told me... "try this rando firmware blob as an iDRAC update"18:13
JayF"delete" removes a node from Ironic management entirely18:16
JayF"undeploy" deprovisions a node and initiates automated cleaning. There are tricks for disabling automated cleaning per node, but it's one of those situations where -- usually -- if you think you need it, there's *probably* a more appropriate path to get where you're going.18:16
cardoeDefinitely not a feature I'm looking for. More along the lines of testing.18:59
TheJuliatesting wise, just to rip a node out people just set maintenance and then delete a node19:03
cido/19:05
cardoeThank you TheJulia 20:10
