| Nick | Message | Time |
|---|---|---|
| opendevreview | Merged openstack/ironic-prometheus-exporter stable/2025.2: Fix handling of unknown metric keys in ironic parser https://review.opendev.org/c/openstack/ironic-prometheus-exporter/+/971072 | 00:33 |
| opendevreview | Steve Baker proposed openstack/ironic master: OciImageService detect bootc image https://review.opendev.org/c/openstack/ironic/+/966760 | 02:16 |
| opendevreview | Steve Baker proposed openstack/ironic master: WIP autodetect deploy interface https://review.opendev.org/c/openstack/ironic/+/973187 | 02:16 |
| *** | jroll07 is now known as jroll0 | 07:16 |
| rpittau | good morning ironic! o/ | 07:43 |
| opendevreview | Merged openstack/ironic master: Simplify ovn vtep microversion logic https://review.opendev.org/c/openstack/ironic/+/971494 | 08:31 |
| opendevreview | Pierre Riteau proposed openstack/bifrost master: Fix path to bifrost.crt with non-default tls_root https://review.opendev.org/c/openstack/bifrost/+/973452 | 10:21 |
| opendevreview | Pierre Riteau proposed openstack/bifrost master: Fix path to bifrost.crt with non-default tls_root https://review.opendev.org/c/openstack/bifrost/+/973452 | 10:25 |
| priteau | Hello Bifrost team. Is there a way to disable the new OCI artifact registry? In kolla-ansible/kayobe land we are running Bifrost inside a Docker container and that is failing to run podman inside for now, breaking our CI | 10:33 |
| Continuity | TheJulia: amazing:D | 10:33 |
| Continuity | Sorry a little behind on my IRC | 10:33 |
| opendevreview | Pierre Riteau proposed openstack/bifrost master: CI: Fix SLURP upgrade jobs https://review.opendev.org/c/openstack/bifrost/+/973458 | 10:58 |
| opendevreview | Pierre Riteau proposed openstack/bifrost master: CI: Fix previous release for SLURP upgrade jobs https://review.opendev.org/c/openstack/bifrost/+/973458 | 10:58 |
| opendevreview | Bartosz Bezak proposed openstack/networking-generic-switch master: Add Arista bond trunk support https://review.opendev.org/c/openstack/networking-generic-switch/+/973461 | 11:31 |
| opendevreview | Pierre Riteau proposed openstack/bifrost master: CI: Ensure firewalld is unmasked https://review.opendev.org/c/openstack/bifrost/+/973463 | 12:04 |
| opendevreview | Bartosz Bezak proposed openstack/networking-generic-switch stable/2025.1: [DNM] Add Arista bond trunk support https://review.opendev.org/c/openstack/networking-generic-switch/+/973470 | 13:11 |
| opendevreview | Pierre Riteau proposed openstack/bifrost master: CI: Ensure firewalld is unmasked https://review.opendev.org/c/openstack/bifrost/+/973463 | 13:33 |
| opendevreview | Pierre Riteau proposed openstack/bifrost master: Ensure firewalld is unmasked https://review.opendev.org/c/openstack/bifrost/+/973463 | 13:36 |
| opendevreview | Michal Nasiadka proposed openstack/networking-generic-switch stable/2025.2: Use upper constraints in pep8 job https://review.opendev.org/c/openstack/networking-generic-switch/+/973480 | 13:56 |
| opendevreview | Michal Nasiadka proposed openstack/networking-generic-switch stable/2025.2: Use upper constraints in pep8 job https://review.opendev.org/c/openstack/networking-generic-switch/+/973480 | 13:56 |
| opendevreview | Michal Nasiadka proposed openstack/networking-generic-switch stable/2025.1: Use upper constraints in pep8 job https://review.opendev.org/c/openstack/networking-generic-switch/+/973481 | 13:56 |
| opendevreview | Michal Nasiadka proposed openstack/networking-generic-switch stable/2025.1: Use upper constraints in pep8 job https://review.opendev.org/c/openstack/networking-generic-switch/+/973481 | 13:56 |
| opendevreview | Michal Nasiadka proposed openstack/networking-generic-switch stable/2025.1: Use upper constraints in pep8 job https://review.opendev.org/c/openstack/networking-generic-switch/+/973481 | 14:16 |
| opendevreview | Michal Nasiadka proposed openstack/networking-generic-switch stable/2025.2: Use upper constraints in pep8 job https://review.opendev.org/c/openstack/networking-generic-switch/+/973480 | 14:16 |
| TheJulia | Continuity: no worries | 14:26 |
| TheJulia | priteau: looks like the option would need to be added to do such | 14:30 |
| TheJulia | That also means some of the outstanding patches would likely need to also make its usage optional. :\ | 14:31 |
| TheJulia | priteau: any chance kayobe might be interested in leveraging OCI URLs instead of bringing your own url/artifact ? | 14:37 |
| opendevreview | Merged openstack/python-ironicclient stable/2025.2: feat: add 'vendor' and 'category' for port object https://review.opendev.org/c/openstack/python-ironicclient/+/973439 | 14:37 |
| opendevreview | Michal Nasiadka proposed openstack/networking-generic-switch stable/2025.2: Use upper constraints in pep8 job https://review.opendev.org/c/openstack/networking-generic-switch/+/973480 | 14:38 |
| priteau | TheJulia: What do you mean by OCI URL? Registry URL for an image? | 14:43 |
| priteau | In Kayobe we already deploy a local Docker registry, so ideally we would reuse it instead of having Bifrost deploy its own | 14:44 |
| TheJulia | priteau: so the direction bifrost is moving in is to use an image registry to back the need for the webserver for user image artifacts, which allows for more options (and also allows for things like bootc deployed hosts as well, eventually). | 14:45 |
| TheJulia | so, conceivably, if we put in knobs on the registry install/use, you could disable the registry from installing in your jobs, and when the other patches merge to use it, you could likely then just offer a patch for an external registry to be used | 14:48 |
| priteau | Do you store qcow2/raw images in this registry? | 14:48 |
| TheJulia | A patch which has not landed yet does, yes | 14:49 |
| TheJulia | Using ORAS | 14:49 |
| priteau | Nice stuff | 14:51 |
| priteau | Do you have a timeline for completely removing the options of keeping images on the built-in HTTP server? | 14:52 |
| TheJulia | once loaded in, Ironic uses the OCI client we added back in... late 2024... (wow, where has time flown), to retrieve the artifact | 14:53 |
| TheJulia | A timeline has not yet been set, but I think the consensus has been to just try and move bifrost forward | 14:53 |
| TheJulia | not knowing the kayobe CI details | 14:53 |
| TheJulia | Another topic: Who here cares about Dell OS10 (Formerly Force10 OS, version 10) switches and use of VXLAN? | 14:54 |
| alegacy | TheJulia: we care about os10 switches, but not VXLAN (yet) | 16:08 |
| TheJulia | So, some analysis points to them being mostly coupled with vxrail setups, which may be on their way out anyway, but those could apparently be ordered with different vendor switches, so no real numbers there. I guess the reason I raise the question is they have a 3 step configuration model for the actual attachment of a VNI to a vlan through an internal range limited value. Just not really sure about demand or relevance at this point | 16:12 |
| TheJulia | (also, since I've heard they are heading towards being phased out in favor of SONiC (although, I have no idea what might replace that given Dell SONiC requires a license to run sonic at all which expires after some number of years...)) | 16:12 |
| *** | gmaan is now known as gmaan_afk | 16:33 |
| TheJulia | JayF: When you get a chance, please take a look at https://review.opendev.org/c/openstack/ironic/+/973187 I suspect we might need a little more data | 16:55 |
| JayF | https://review.opendev.org/c/openstack/ironic/+/973187/3#message-5972ac124953e722a05fc5c145187ed263780407 | 16:59 |
| JayF | I suggested to make it detect bootc only for now, and I can do ramdisk as the next step | 16:59 |
| JayF | since there are questions about how to make ramdisk work that I can't answer now and likely don't have time to tackle this week | 16:59 |
| JayF | who were our outreachy mentors last cycle? rpittau and someone else? Can one of you all DM me? | 17:07 |
| cardoe | JayF: I'd also appreciate https://review.opendev.org/c/openstack/ironic/+/973294 so that I can work on testing clif's stuff. | 17:16 |
| opendevreview | Julia Kreger proposed openstack/networking-generic-switch master: WIP: l2vni plug case with Cisco NXOS https://review.opendev.org/c/openstack/networking-generic-switch/+/968377 | 17:37 |
| opendevreview | Julia Kreger proposed openstack/networking-generic-switch master: WIP: Arista EOS and vendor neutral SONiC support for VXLAN attachments https://review.opendev.org/c/openstack/networking-generic-switch/+/972763 | 17:37 |
| opendevreview | Julia Kreger proposed openstack/networking-generic-switch master: WIP: VXLAN: Add Junos, Cumulus NVUE, and denote Dell OS10 as unsupported https://review.opendev.org/c/openstack/networking-generic-switch/+/972764 | 17:37 |
| opendevreview | Julia Kreger proposed openstack/networking-generic-switch master: WIP: OVS testing patch for 'vxlan' binding model https://review.opendev.org/c/openstack/networking-generic-switch/+/972765 | 17:37 |
| opendevreview | Merged openstack/networking-generic-switch stable/2025.2: Use upper constraints in pep8 job https://review.opendev.org/c/openstack/networking-generic-switch/+/973480 | 17:38 |
| opendevreview | Merged openstack/networking-generic-switch stable/2025.1: Use upper constraints in pep8 job https://review.opendev.org/c/openstack/networking-generic-switch/+/973481 | 17:38 |
| *** | gmaan_afk is now known as gmaan | 17:46 |
| opendevreview | Julia Kreger proposed openstack/networking-generic-switch master: Migrate setup configuration to pyproject.toml format https://review.opendev.org/c/openstack/networking-generic-switch/+/973526 | 17:53 |
| cardoe | TheJulia: free pony if you add pre-commit in there too | 18:02 |
| TheJulia | My claude assistant may be up for that, need to level set the repos to a better state... really | 18:02 |
| TheJulia | Awwww... let's see if I remember after 11 am. ;) | 18:06 |
| cardoe | So if a machine is in "inspect failed", what's the right operation to get it back to a point that I can inspect it again? "manage"? | 18:13 |
| TheJulia | Easy win: https://review.opendev.org/c/openstack/ironic/+/973283 | 18:15 |
| TheJulia | cardoe, that is one of yours :) | 18:15 |
| TheJulia | cardoe: yes, that should be | 18:16 |
| TheJulia | Although, I thought we kept the auto roll back pattern, maybe we didn't with inspection from inspector? | 18:16 |
| cardoe | what's wrong with the patch? | 18:19 |
| cardoe | So I ask because I'm about to file a bug. | 18:20 |
| cardoe | I'm still using agent inspect in this env. And the agent had an error. So it went to "inspect failed". The baremetal port retained its internal_info {'inspection_vif_port_id': '8013b009-2064-423b-9b99-a38953cef7d3'} even after I did "manage" | 18:21 |
| cardoe | The port remained in neutron | 18:21 |
| cardoe | And when I went to inspect again I got a conflict on the port. | 18:21 |
| TheJulia | cardoe: nothing wrong with the patch | 18:23 |
| opendevreview | Merged openstack/ironic master: fix: port endpoints did not return vendor and category and fix docs https://review.opendev.org/c/openstack/ironic/+/973294 | 18:23 |
| opendevreview | Merged openstack/ironic master: Add positive port api category/vendor field test https://review.opendev.org/c/openstack/ironic/+/973396 | 18:23 |
| TheJulia | oh jeeze, yeah, that's a bug | 18:23 |
| TheJulia | on inspect error being reached, the port should have been ripped out | 18:24 |
| TheJulia | along with the vif | 18:24 |
| cardoe | oh I see what happens. | 18:24 |
| cardoe | We left it on there and when it went to do cleaning it said "gimme the attached vif" and it got the inspection port | 18:24 |
| opendevreview | Doug Goldstein proposed openstack/ironic stable/2025.2: fix: port endpoints did not return vendor and category and fix docs https://review.opendev.org/c/openstack/ironic/+/973532 | 18:25 |
| TheJulia | so we failed to capture/handle/remove the vif/port/attachment with the error? | 18:25 |
| opendevreview | Doug Goldstein proposed openstack/ironic bugfix/33.0: fix: port endpoints did not return vendor and category and fix docs https://review.opendev.org/c/openstack/ironic/+/973533 | 18:26 |
| cardoe | Yeah. I don't know if it's supposed to be removed with the error or when it moves away from the failed state? | 18:28 |
| cardoe | blerp | 18:33 |
| cardoe | janders: so I'll +2 that patch if ya add some constants and make the clean up clear like dtantsur said... what makes me think of this... | 18:33 |
| cardoe | https://github.com/openstack/ironic/blob/09761d9549286b56561d7c2dee16beb83e62e21f/ironic/conductor/utils.py#L1486 | 18:34 |
| cardoe | https://github.com/openstack/ironic/blob/09761d9549286b56561d7c2dee16beb83e62e21f/ironic/drivers/modules/network/common.py#L478 | 18:34 |
| TheJulia | It should happen in the error handler, not on the move from the error state | 18:41 |
| TheJulia | cardoe: have you had a chance to move on the mech driver locally on the neutron repo? | 18:42 |
| TheJulia | (trying to think through the building blocks here, before I start to piece it together) | 18:42 |
| cardoe | to the networking-baremetal? not yet. | 18:43 |
| TheJulia | k | 18:43 |
| cardoe | I will. | 18:48 |
| cardoe | So this box hit a timeout in the inspection. It didn't get attached physically on the network correctly. | 18:49 |
| TheJulia | ok | 18:49 |
| TheJulia | ... what went sideways? delayed bind? | 18:50 |
| cardoe | Some change in our mech driver. | 18:51 |
| cardoe | But the neutron port from inspection gets left around with the MAC address of that port. | 18:51 |
| cardoe | And inspection_vif_id stuck around | 18:52 |
| TheJulia | yeah, that definitely needs fixing | 18:52 |
| cardoe | The box went to cleaning and we clean up ports by MAC address | 18:52 |
| cardoe | And due to those links I had above, the order of the vif_port_id is done differently in two different places. | 18:52 |
| TheJulia | ugh, yeah | 18:53 |
| TheJulia | so the new inspection code doesn't handle errors like the others | 18:53 |
| cardoe | So we run remove_ports_from_network() which removes a port based on the MAC from the network. But now there was an inspection port and a cleaning port. When cleaning finished it removed the inspection_vif_id port from neutron. But removed the 'cleaning_vif_id' from the baremetal port. | 18:53 |
| TheJulia | and it is, very definitely, minimal | 18:53 |
| cardoe | So now the cleaning port with the MAC address was left behind along with a reference on the baremetal port. | 18:54 |
| TheJulia | The root issue is the original inspection failure, right? | 18:54 |
| cardoe | And somehow when the box was used again the cleaning port was used and bound down to the box and not the tenant port. | 18:55 |
| cardoe | Yeah. | 18:55 |
| cardoe | Just walking up the stack of operations from the logs. | 18:55 |
| TheJulia | could you just create a quick bug, I see the issue and can hammer a fix out kind of quick | 18:56 |
| cardoe | Yeah. | 18:56 |
| TheJulia | Thanks | 18:56 |
| cardoe | So one idea I've had recently is on the neutron ports. We set 'device_owner' to 'baremetal:none'. Really what matters or what we need is that baremetal: prefix. We've been utilizing for our network node baremetal port 'baremetal:network' in our mech driver. | 18:57 |
| TheJulia | Interesting | 18:57 |
| cardoe | But I've been thinking that neutron should probably mark the port as 'baremetal:cleaning', 'baremetal:inspecting', 'baremetal:provisioning', 'baremetal:tenant', 'baremetal:rescuing' | 18:57 |
| TheJulia | That is more state conveyance and tracking, not sure but maybe could be a case if extra guardrails are needed | 18:58 |
| cardoe | Yes but the guardrail I was thinking is that when we go to tell neutron THIS PORT RIGHT HERE ATTACH IT NOW... we can ensure we select the right port. | 18:59 |
| cardoe | Cause right now we lookup a port with the MAC address for device == node_id | 18:59 |
| cardoe | So if a cleaning network port or an inspecting network port didn't get cleaned up by accident.... | 19:00 |
| TheJulia | hmm | 19:00 |
| cardoe | Just an idea. | 19:00 |
| TheJulia | it's sort of an issue a few steps removed though | 19:00 |
| TheJulia | it's not really wrong to use the mac since the mac is forced to be "unique" | 19:00 |
| cardoe | for baremetal ports but not neutron ports | 19:01 |
| cardoe | This issue only happened because of the fix for https://bugs.launchpad.net/ironic/+bug/2106073 | 19:01 |
| cardoe | Because we now create the tenant port early | 19:01 |
| cardoe | bugs.launchpad.net is down | 19:02 |
| cardoe | out of curiosity what tracks the timeout for inspection? | 19:03 |
| cardoe | oh ironic/conductor/manager.py _check_inspect_wait_timeouts | 19:05 |
| cardoe | Nothing in there cleans up | 19:08 |
| opendevreview | Julia Kreger proposed openstack/networking-baremetal master: Swap networking-baremetal to use pyproject.toml https://review.opendev.org/c/openstack/networking-baremetal/+/973539 | 19:11 |
| TheJulia | launchpaddddddddd | 19:12 |
| TheJulia | oh, that's wonderful, because even the inspection error handler fails to do basic cleanup | 19:12 |
| opendevreview | Verification of a change to openstack/ironic master failed: Add LLDP collect for DRAC Redfish inspection https://review.opendev.org/c/openstack/ironic/+/970630 | 19:31 |
| cardoe | So I went crazy and started switching from random strings to an enum | 20:13 |
| cardoe | I discovered that servicing_vif_port_id is missed entirely in a couple of places. | 20:14 |
| cardoe | TheJulia: do we make a new vif/port for servicing? | 20:34 |
| cardoe | Or are we recycling the tenant port? | 20:35 |
| TheJulia | make a new one | 20:36 |
| TheJulia | I've got patch in progress | 20:36 |
| cardoe | ooo well my patch is gonna be a lot of fun | 20:37 |
| cardoe | https://github.com/openstack/ironic/blob/09761d9549286b56561d7c2dee16beb83e62e21f/ironic/drivers/modules/network/common.py#L478-L482 | 20:37 |
| cardoe | We do not get the "servicing_vif_port_id" | 20:38 |
| cardoe | We're gonna grab the tenant one | 20:38 |
| TheJulia | In a meeting atm, so can't look atm | 20:45 |
| cardoe | Here's the thing... I don't really know what it impacts. | 21:02 |
| opendevreview | Doug Goldstein proposed openstack/ironic master: fix: refactor VIF network types to be an enum and ensure all are checked https://review.opendev.org/c/openstack/ironic/+/973557 | 21:08 |
| cardoe | launchpad is still dead. | 21:10 |
| TheJulia | likely just an oversight, I am not sure much code actually calls get_current_vif | 21:42 |
| TheJulia | Claude is acting... very concerningly. | 21:42 |
| cardoe | Well I'll review whatever ya come up with. | 22:00 |
| JayF | https://review.opendev.org/c/openstack/ironic/+/964502 metal3-integration is failing on this, it looks like nodes are not coming online from cleaning | 22:09 |
| JayF | but I can't seem to find any of the ironic-bm-log equivalents, or even find where the ironic-anything logs are | 22:10 |
| opendevreview | Julia Kreger proposed openstack/ironic master: WIP: Fix introspection failure handling https://review.opendev.org/c/openstack/ironic/+/973560 | 22:13 |
| TheJulia | cardoe: ^^^ passes tests, but claude was acting a bit wonky, a close eye is likely needed | 22:14 |
| cardoe | JayF: yeah I couldn't figure out how to look at it. They're expecting the nodes to be in a different state is all I could tell. | 22:16 |
| cardoe | The inspection failure change looks reasonable. | 22:29 |
| TheJulia | good to know, I'll review it myself in the morning and give it a cleanup spin | 22:29 |
| TheJulia | I basically had 1-on-1 and then a team meeting so my brain is in word use mode, not code logic mode | 22:30 |
| TheJulia | but, my devstack is restacking with the stack of ngs code for vxlan fun | 22:30 |
| opendevreview | Doug Goldstein proposed openstack/ironic master: fix: refactor VIF network types to be an enum and ensure all are checked https://review.opendev.org/c/openstack/ironic/+/973557 | 22:31 |
| cardoe | I reviewed the ironic standalone networking. I'm the same as everyone else as a +1 to show it. | 22:32 |
| TheJulia | cool cool | 22:36 |
| cardoe | It's got a metric ton of strings all over the place. The first time we've gotta add support for something else it's gonna be like ^ above. | 22:36 |
| TheJulia | yeah | 22:37 |
| opendevreview | Verification of a change to openstack/ironic master failed: Add LLDP collect for DRAC Redfish inspection https://review.opendev.org/c/openstack/ironic/+/970630 | 22:49 |