| JayF | cardoe: https://youtu.be/Vtrdr-nxvAA#t=15s | 00:00 |
|---|---|---|
| TheJulia | so much troublemaking :) | 00:07 |
| TheJulia | cardoe: two questions: 1) could I get the ironic api log just to be super sure, specifically I just want to see the high level requests/transactions for the node. 2) Can you confirm there are no orphaned vif records? | 00:21 |
| cardoe | yes.. by orphaned vif you mean a neutron port right? or something in nova? | 00:22 |
| opendevreview | Merged openstack/ironic stable/2025.2: Filter null NIC firmware versions from cache https://review.opendev.org/c/openstack/ironic/+/966773 | 00:52 |
| TheJulia | cardoe: yes, recorded on an ironic node | 01:20 |
| *** mdfr5 is now known as mdfr | 04:30 | |
| rpittau | good morning ironic! happy friday! o/ | 07:30 |
| opendevreview | David Nwosu proposed openstack/ironic master: Move configdrive functions to configdrive_utils and configdrive tests to test_configdrive_utils https://review.opendev.org/c/openstack/ironic/+/965880 | 08:42 |
| opendevreview | David Nwosu proposed openstack/ironic master: Enable codespell to use pyproject.toml for spelling exceptions https://review.opendev.org/c/openstack/ironic/+/967965 | 08:51 |
| rpittau | we really need to merge https://review.opendev.org/c/openstack/ironic/+/967821 | 09:45 |
| rpittau | It's just bringing things back to what they were before the change in oslo.service, this is blocking ironic and metal3 CI :/ | 09:45 |
| *** tkajinam is now known as Guest31836 | 11:05 | |
| alegacy | cid: JayF: dtantsur: TheJulia: ... friendly reminder to those of you that did a 1st pass on my patchset to swing by and take a look at the latest revision! please and thank you: https://review.opendev.org/q/topic:%22feature/standalone-networking%22+status:open | 14:01 |
| dtantsur | on my plans for today | 14:01 |
| alegacy | dtantsur: thank you! | 14:02 |
| zigo | Hi. scciclient fails because it's not pysnmp 7 compatible: | 14:04 |
| zigo | https://bugs.debian.org/1117426 | 14:04 |
| zigo | Is it still maintained? | 14:04 |
| TheJulia | alegacy: also planning to spend some time reviewing today as well | 14:05 |
| alegacy | TheJulia: perfect! | 14:06 |
| cardoe | TheJulia: I'll get ya those logs today but someone just asked me about... "The mac address 14:23:f3:f4:c7:e0 is in use." when a build hit one of the servers that didn't clean up the other day. | 14:08 |
| cardoe | So I'm gonna guess that the answer to your "was a vif left around" is gonna be yes | 14:08 |
| TheJulia | zigo: for your purposes, the answer is likely no, and I believe a patch is up to deprecate the fujitsu driver which would ultimately remove python-scciclient as a dependency | 14:09 |
| zigo | TheJulia: FYI, because of that, Ironic was removed from Debian Testing. :/ | 14:09 |
| TheJulia | we literally just got them to agree in writing this week. removal/deprecation wise, our hands are tied by openstack community process | 14:11 |
| zigo | Ok, thanks for the info. | 14:12 |
| TheJulia | cardoe: mac in use in neutron, oh absolutely, I was meaning on the ironic node, which I don't think generates that error, I think that one comes from neutron | 14:12 |
| zigo | Though it's only the SNMP part, which isn't much. | 14:12 |
| zigo | It's a shame, we should only remove that part, IMO. | 14:12 |
| cardoe | the internal info on the baremetal port still shows a neutron vif as well. | 14:12 |
| zigo | FYI, that's what I'm currently doing. | 14:12 |
| skrobul | cardoe: yup, the vif was left around - https://gist.githubusercontent.com/skrobul/21b9961d2bdb686cf26819ed966108bb/raw/4b1f582ea31ae1988f842203cc7904d4b998abd2/leftover.txt | 14:13 |
| cardoe | TheJulia: ^ that answers that | 14:13 |
| zigo | (ie: remove snmp tests for scciclient, and fix the pysnmp imports) | 14:13 |
| TheJulia | cardoe: not really | 14:13 |
| TheJulia | cardoe: that is neutron, you crashed neutron | 14:13 |
| TheJulia | I'm talking about vif attachment records in ironic | 14:13 |
| TheJulia | openstack baremetal node vif list <node_uuid> | 14:13 |
| skrobul | TheJulia: in that case no, it's empty. The 'internal info' in 'baremetal port list --long' is empty too. | 14:17 |
| TheJulia | cool cool | 14:21 |
| TheJulia | Thanks | 14:21 |
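(A minimal openstacksdk sketch of the same check; the cloud name and node UUID are placeholders, and the proxy calls are assumed to match current openstacksdk rather than being taken from this conversation.)

```python
# Sketch only: check ironic-side VIF attachment records with openstacksdk.
# "mycloud" and NODE are placeholders; the proxy methods are assumed to
# match current openstacksdk and are not quoted from this log.
import openstack

conn = openstack.connect(cloud="mycloud")
NODE = "<node_uuid>"

# Equivalent of `openstack baremetal node vif list <node_uuid>`
for vif in conn.baremetal.list_node_vifs(NODE):
    print("attached vif:", vif)

# internal_info on each baremetal port should also be empty of vif ids
for port in conn.baremetal.ports(node=NODE, details=True):
    print(port.id, port.internal_info.get("tenant_vif_port_id"))
```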
| zigo | This one also worries me: https://bugs.debian.org/1117734 is pykmip still used by Ironic ? | 14:21 |
| zigo | From the bug report: | 14:22 |
| zigo | "There is a suggested fix as a PR on the upstream github | 14:22 |
| zigo | at https://github.com/OpenKMIP/PyKMIP/pull/707, but it seems like the | 14:22 |
| zigo | project is abandoned." | 14:22 |
| * TheJulia raises an eyebrow | 14:22 | |
| JayF | If we use pykmip it's a transitory dependency, not a direct one | 14:23 |
| zigo | My bad, it looks like barbican uses it, not Ironic. | 14:26 |
| zigo | Listed as extras, and in tests. | 14:27 |
| zigo | The proposed patch seems easy enough to add, I'll try. | 14:27 |
| cardoe | the pysnmp stuff is "solved" by dropping proliantutils | 14:28 |
| TheJulia | cardoe: python-scciclient is also impacted | 14:30 |
| TheJulia | Speaking of, who was going to put the deprecation in for ilo stuffs? | 14:32 |
| cardoe | It's there | 14:34 |
| cardoe | https://review.opendev.org/c/openstack/ironic/+/965009 | 14:35 |
| cardoe | It's unfortunately gonna cause me to not upgrade to 2026.1 for a while | 14:36 |
| cardoe | Cause I've got ilo5 stuff still | 14:36 |
| cardoe | For my test gear | 14:36 |
| TheJulia | That's removal, we need to deprecate first | 14:46 |
| opendevreview | Merged openstack/ironic master: Fix singleprocess launcher compatibility with oslo.service 4.4+ https://review.opendev.org/c/openstack/ironic/+/967821 | 15:23 |
| JayF | TheJulia: the way to deal with that is the same as irmc, right: we're only deprecating it until the SNMP driver is dead next cycle | 15:39 |
| TheJulia | Yeah, I think so | 15:39 |
| dtantsur | TheJulia: the VXLAN spec draft will be easier to read if you link it from the index (and thus cause it to actually get rendered) | 17:00 |
| dtantsur | it does seem to render locally even now, I wonder why I get 404 on the preview in the CI | 17:02 |
| TheJulia | weeeeird | 17:09 |
| TheJulia | ack, I thought I fixed that | 17:09 |
| TheJulia | Thanks for the heads up | 17:09 |
| * TheJulia finally gets off vxlan call and falls over dead | 17:09 | |
| dtantsur | Sure. I'm afraid my contribution to this document will be a few readability nits. | 17:10 |
| TheJulia | That is perfectly okay | 17:10 |
| dtantsur | So far, there is no paragraph that I've understood :D | 17:10 |
| TheJulia | That is perfectly okay | 17:10 |
| dtantsur | All I understand is that you people are way too smart for me | 17:10 |
| TheJulia | It is a subset of context acquired by drinking delicious beverages | 17:11 |
| TheJulia | Meanwhile, livers complain | 17:11 |
| dtantsur | Yeah, mine won't survive enough alcohol | 17:11 |
| TheJulia | fair enough | 17:11 |
| TheJulia | Doug started a call close to 2 hours ago and I finally got off a call with the last neutron person a few minutes ago | 17:12 |
| dtantsur | Impressive dedication | 17:12 |
| TheJulia | My spec proposes two separate aspects, a broad idea of connectivity as it relates to OVN+neutron, and then the NGS side impact. The broad connectivity idea *may* end up in something like networking-baremetal or elsewhere, but what cardoe has been pushing to bring clarity to first is the port binding and segmentation issues | 17:14 |
| TheJulia | then the actual layer 2 logical segment bridging as a secondary aspect | 17:14 |
| dtantsur | The spec could actually use some sort of "dummy's guide to this spec" :D as in, a summary for people for whom FRR is the sound that a cat can make. | 17:16 |
| TheJulia | lol, okay | 17:18 |
| TheJulia | If this becomes a thing *in* ironic, we must name modules after colors of cats and their behaviors. I'm just unsure which thing should be the orange cat or the black cat (commonly referred to as a void) | 17:19 |
| dtantsur | Orange cat should regularly send packets where they're not supposed to be | 17:20 |
| dtantsur | "Why are 10% of packets dropped?" "Have you tried looking under the fridge?" "Oh heck, there are gigabytes of them here!!" | 17:24 |
| cardoe | That's the thing. It's complicated. And every time I make a simple representation people tell me "well why can't it just be X". | 17:24 |
| dtantsur | Maybe calling it "Layman's summary" will be offensive enough to people who know networking to prevent them from commenting? | 17:25 |
| dtantsur | Layhumans/Peasants/Dmitry-who-should-have-known-networking-after-all-these-years-here summary, pick yours | 17:26 |
| TheJulia | cardoe: my favorite is "just use type5, that is what we're standardizing on" | 17:27 |
| TheJulia | dtantsur: would "Friends, Romans, Countrymen, lend me your ears" style work? | 17:29 |
| dtantsur | Perfect! | 17:29 |
| TheJulia | "As if your Marcus Antonius, please give me a simple, easy to understand primer" | 17:30 |
| * TheJulia expects latin as the result | 17:30 | |
| TheJulia | dtantsur: does https://review.opendev.org/c/openstack/neutron/+/965415 help as a primer? | 17:44 |
| dtantsur | I'll have to read it on Monday with a fresh head, thanks | 17:47 |
| TheJulia | ok | 17:48 |
| cardoe | dtantsur: https://cardoe.com/neutron/evpn-vxlan-network/admin/data-center-networks.html is a rendered copy of 965415 as well. | 17:53 |
| dtantsur | nice! | 17:54 |
| dking | dtantsur: I have an unimportant question about BMO. For https://github.com/metal3-io/baremetal-operator/blob/main/config/base/rbac/kustomization.yaml, I see that most of the rbac .yaml files aren't being used. Is that intentional? Am I missing something, or do those get picked up on some specific kustomize? | 17:58 |
| dtantsur | dking: I suspect they're auto-generated examples (you mean stuff like *_role.yaml right?) | 18:00 |
| dtantsur | https://github.com/metal3-io/baremetal-operator/blob/main/config/base/rbac/role.yaml is what the actual service user uses | 18:00 |
| dking | dtantsur: Correct. I see that role.yaml and about 8 other files are used, but that's out of about 25 files. | 18:02 |
| TheJulia | clif: I just -1'ed to request a quick revision on 964895. If you want to do it as a follow-up, that's fine. Hopefully test order won't change and will just work as well, but that might be another reason to use id value over uuid :) Just let me know if you want to do it as a follow-up or not | 18:06 |
| opendevreview | Merged openstack/ironic master: Add a script to copy inspection data between Swift buckets https://review.opendev.org/c/openstack/ironic/+/966899 | 18:11 |
| clif | TheJulia: So you mean sort by ID in descending order (greatest/highest ID would be first)? | 18:14 |
| TheJulia | Yeah, that was also the way I was interpreting JayF's comment as well | 18:15 |
| clif | JayF didn't seem to explicitly favor one over the other but newer first may make more sense | 18:17 |
| cardoe | So https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/966192 can we land that? | 18:23 |
| cardoe | or maybe https://review.opendev.org/c/openstack/ironic/+/964570 with a cherry on top? | 18:24 |
| TheJulia | cardoe: I workflowed it and then saw the other comment and un-workflowed the first one | 18:28 |
| TheJulia | so.. if people are okay we can workflow it | 18:28 |
| cardoe | So I think there's value in the less combined thing... but the current one is already combined enough that people cannot use it | 18:28 |
| cardoe | So I thought if there's interest in it... we make a "base" | 18:29 |
| TheJulia | fair enough | 18:29 |
| cardoe | The only reason we wanted the less combined thing is cause the virtual-media not waiting 30 seconds bug | 18:30 |
| cardoe | We needed to turn off dhcp-all-interfaces | 18:30 |
| TheJulia | that's super bizarre to me | 18:31 |
| TheJulia | since when we've tested, and maybe it's just that specific hardware, I dunno | 18:32 |
| opendevreview | Clif Houck proposed openstack/ironic master: Trait Based Networking Filter Expression Parsing and Base Models https://review.opendev.org/c/openstack/ironic/+/961498 | 18:37 |
| opendevreview | Clif Houck proposed openstack/ironic master: Configuration file for Trait Based Networking https://review.opendev.org/c/openstack/ironic/+/962598 | 18:37 |
| opendevreview | Clif Houck proposed openstack/ironic master: Generate network plan based on trait based networking config https://review.opendev.org/c/openstack/ironic/+/964895 | 18:37 |
| opendevreview | Clif Houck proposed openstack/ironic master: Trait Based Networking Simulator https://review.opendev.org/c/openstack/ironic/+/966202 | 18:37 |
| opendevreview | Clif Houck proposed openstack/ironic master: WIP: Add configuration options for trait based networking https://review.opendev.org/c/openstack/ironic/+/968054 | 18:37 |
| cardoe | TheJulia: what's the ironic-api log snippet you're looking for? | 18:47 |
| cardoe | Cause I included the one with the 500 error | 18:48 |
| TheJulia | cardoe: anything that includes the additional calls made by nova afterwards with their corresponding error codes | 18:49 |
| TheJulia | since that is just the call itself, the thing which needs to be done is ensure/verify where things went sideways on the unwind | 18:50 |
| cardoe | 2025-11-20 12:19:03.160 8 INFO ironic.api [None req-899167ff-7be6-4a7c-b72f-ce5c4ca9becd 6dff409ebb31414299c3d0cd837eea9e 32e02632f4f04415bab5895d1e7247b7 - - a6f7dcd63c9b4940915062f57a48df77 7f46f53fcb3c4625a343eaa35b5e0d04] 10.64.48.142 - DELETE /v1/nodes/86eb7354-cc10-4173-8ff2-d1ac2ea6befd/vifs/ecf586d0-99c2-4867-b516-6fa2c39c30b1 - 204 (1537.92ms) | 18:52 |
| cardoe | That seems good and validates what skrobul said above | 18:52 |
| TheJulia | Okay, any others, specifically looking for a PATCH call | 18:52 |
| cardoe | 2025-11-20 12:19:03.889 8 INFO ironic.api [None req-7880e28e-bbc9-4c00-8e88-aa946c5500d6 6dff409ebb31414299c3d0cd837eea9e 32e02632f4f04415bab5895d1e7247b7 - - a6f7dcd63c9b4940915062f57a48df77 7f46f53fcb3c4625a343eaa35b5e0d04] 10.64.48.142 - PATCH /v1/nodes/86eb7354-cc10-4173-8ff2-d1ac2ea6befd - 200 (609.26ms) | 18:53 |
| TheJulia | WUT | 18:53 |
| TheJulia | oh, that is a different failure | 18:53 |
| TheJulia | look at the times on https://gist.github.com/cardoe/b0aefe21b1fc7b81c38bed8dad8e14b2 | 18:53 |
| cardoe | Look at my fail-y copy and paste with the different times... it's that node. grafana copy and paste got fixed | 18:54 |
| cardoe | That's 2 seconds after the failure | 18:54 |
| cardoe | 2025-11-20 12:19:01.544 8 INFO ironic.api [None req-2415f97c-664f-4a61-8241-12161991eb12 6dff409ebb31414299c3d0cd837eea9e 32e02632f4f04415bab5895d1e7247b7 - - a6f7dcd63c9b4940915062f57a48df77 7f46f53fcb3c4625a343eaa35b5e0d04] 10.64.48.142 - POST /v1/nodes/86eb7354-cc10-4173-8ff2-d1ac2ea6befd/vifs - 500 (7047.78ms) | 18:55 |
| cardoe | That's the one that started it | 18:55 |
| TheJulia | oh, okay | 18:56 |
| TheJulia | so... hmm | 18:56 |
| TheJulia | and that patch is the last call on that node? | 18:57 |
| cardoe | ugh | 18:57 |
| cardoe | you're right its a race | 18:57 |
| cardoe | cause that one looks like it didn't have the instance-uuid left behind | 18:57 |
| TheJulia | but another attempt did, so I bet it has a different error code on the patch | 18:57 |
| TheJulia | I *bet* the lock was still unwinding from the vif removal | 18:58 |
| TheJulia | That is super rapid succession | 18:58 |
| TheJulia | and maybe the thing to do is resolve the request and unwind the lock before returning the response | 18:58 |
| TheJulia | I *bet* that would actually fix it | 18:58 |
| cardoe | 2025-11-20 12:37:00.921 8 INFO ironic.api [None req-75fab2b4-b31f-4ff4-9609-b18b4b81fadf 6dff409ebb31414299c3d0cd837eea9e 32e02632f4f04415bab5895d1e7247b7 - - a6f7dcd63c9b4940915062f57a48df77 7f46f53fcb3c4625a343eaa35b5e0d04] 10.64.48.142 - PATCH /v1/nodes/86eb7354-cc10-4173-8ff2-d1ac2ea6befd - 409 (2588.32ms) | 18:59 |
| cardoe | That's a run with instance-uuid left behind | 18:59 |
| cardoe | Everything else is the same error-wise... it's the same 500 error with the "Could not seed network configuration for VIF" | 19:00 |
| opendevreview | Julia Kreger proposed openstack/ironic master: doc: trivial: Quick revision of README https://review.opendev.org/c/openstack/ironic/+/968056 | 19:00 |
| TheJulia | Yeah, a held lock will do that | 19:01 |
| TheJulia | cardoe: do you have defaults for [conductor]node_locked_retry_interval and [conductor]node_locked_retry_attempts ? | 19:13 |
| TheJulia | likely not since it should be 1 and 3 respectively | 19:15 |
| cardoe | not set in the file | 19:18 |
| TheJulia | so, it's running long, but it should be taking longer than 2.588 seconds | 19:19 |
| cardoe | this was on a box that's got everything converged on it and hosting virtual nodes so it's fairly slammed | 19:19 |
| TheJulia | could be one extra retry is getting in there I guess | 19:19 |
| opendevreview | Julia Kreger proposed openstack/ironic master: WIP: Downgrade the lock on vif detach https://review.opendev.org/c/openstack/ironic/+/968061 | 19:34 |
| TheJulia | oh, 2.5s makes sense because of the way tenacity works, the first one counts immediately and 1 second between each | 19:39 |
| TheJulia | so yeah, it's just unwinding. | 19:39 |
| TheJulia | So yeah, I think downgrading the lock is the only real viable solution, the alternative is to play with extending the retries | 19:46 |
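(An illustrative tenacity sketch of the timing described here, using the defaults mentioned above, attempts=3 and a 1-second interval; it is not the actual ironic task_manager code.)

```python
# Illustrative only: approximates how a still-held node lock exhausts its
# retries in roughly two seconds with attempts=3 and a 1-second interval,
# matching the ~2.5s PATCH 409 seen above. Not ironic's actual code.
import tenacity


class NodeLocked(Exception):
    pass


@tenacity.retry(
    retry=tenacity.retry_if_exception_type(NodeLocked),
    stop=tenacity.stop_after_attempt(3),   # [conductor]node_locked_retry_attempts
    wait=tenacity.wait_fixed(1),           # [conductor]node_locked_retry_interval
    reraise=True,
)
def reserve_node():
    # First attempt fires immediately, then ~1 second between retries, so
    # the loop gives up after roughly 2 seconds plus request overhead.
    raise NodeLocked("node is locked by another process")


try:
    reserve_node()
except NodeLocked:
    print("lock never released in time -> API returns 409")
```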
| opendevreview | Julia Kreger proposed openstack/ironic master: WIP: Downgrade the lock on vif detach https://review.opendev.org/c/openstack/ironic/+/968061 | 19:51 |
| TheJulia | cardoe: give ^^^ a spin if you wouldn't mind on your ironic conductor. Otherwise the lock is unwound in __exit__ which I'm wondering might be a bit too late. The alternative is really tuning, or going with a "this is somehow imperative, and we need to keep retrying"; update_node and task reservation locking code only has the basic retries on the update_node | 19:55 |
| JayF | TheJulia: +1 to the suggested solution of ensuring the lock is unwound before the response is given (in general we should see if we can enforce this pattern) | 19:55 |
| JayF | lol I'm a little late I see | 19:55 |
| opendevreview | Merged openstack/ironic-python-agent stable/2025.2: Test advertised ip reachability before assigning it https://review.opendev.org/c/openstack/ironic-python-agent/+/966671 | 19:55 |
| TheJulia | The call is "exception or success through no exception", __exit__ should be unwinding it all sooner, but yeah... | 19:56 |
| TheJulia | at least, sooner than the call but that might just be racey overall | 19:56 |
| TheJulia | The alternative is largely to add some more logging to understand why the node lock was still held | 19:57 |
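(A toy context-manager sketch of the release-in-__exit__ pattern under discussion; names and structure are invented for illustration and this is not ironic's TaskManager.)

```python
# Toy illustration: the reservation is only dropped in __exit__, i.e. after
# the work (and the HTTP response body) has been assembled, which leaves a
# short window where a follow-up request still sees the node as locked.
import contextlib
import threading

_reservations = {}
_guard = threading.Lock()


@contextlib.contextmanager
def acquire_node(node_id, owner):
    with _guard:
        if node_id in _reservations:
            raise RuntimeError("NodeLocked")  # surfaces as HTTP 409 at the API
        _reservations[node_id] = owner
    try:
        yield
    finally:
        # Released here, in __exit__; anything done between the yield and
        # returning the response races with a rapid follow-up request.
        with _guard:
            _reservations.pop(node_id, None)
```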
| JayF | It's weird to me, given the retry config, that the lock could be held so long | 19:57 |
| JayF | because if we 403 to nova it should retry 5 times which takes about 8 seconds | 19:58 |
| TheJulia | well | 19:58 |
| TheJulia | we 409 | 19:58 |
| JayF | 409 is what I meant | 19:58 |
| JayF | 409/503 gets retried by default in osdk | 19:58 |
| TheJulia | We 500, then 409, I'm not sure nova is retrying on a 409, it's in a general exception catch | 19:58 |
| TheJulia | oh, news to me | 19:58 |
| JayF | this is the road cid and I was going down | 19:58 |
| JayF | when we saw the servicing bug | 19:59 |
| JayF | by default Ironic osdk clients retry 503/409 up to 5 times | 19:59 |
| TheJulia | but functionally what is happening is we get vif detach, we're returning it but the lock is still somehow there, in rapid succession nova is immediately kicking back. As doug noted, if the machine is under load, things might get a little funky | 19:59 |
| JayF | I really suspect that cardoe is running something that's suppressing retries | 19:59 |
| TheJulia | That... might also do it | 20:00 |
| JayF | like configuring it to retry less either intentionally or unintentionally | 20:00 |
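(A rough sketch of the client-side retry knobs being referred to, using keystoneauth's adapter options; the values shown are illustrative, not anyone's actual deployment config.)

```python
# Rough illustration of the client-side retries being discussed: keystoneauth
# adapters can retry specific status codes. Values are illustrative only.
from keystoneauth1 import adapter, session
from keystoneauth1.identity import v3

auth = v3.Password(
    auth_url="https://keystone.example.com/v3",
    username="nova",
    password="secret",
    project_name="service",
    user_domain_name="Default",
    project_domain_name="Default",
)
sess = session.Session(auth=auth)

ironic = adapter.Adapter(
    session=sess,
    service_type="baremetal",
    # Retry these responses a few times before surfacing the error; if a
    # deployment overrides this to 0, the 409 from a still-held lock goes
    # straight back to the caller.
    status_code_retries=5,
    retriable_status_codes=[409, 503],
)
```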
| TheJulia | I just don't know why that lock was sitting around for so long | 20:00 |
| TheJulia | maybe power state sync snuck in?! | 20:00 |
| JayF | not saying we should require the retry to work properly by any means, but that the story here is missing the "why did retries not happen" and/or "why was the lock held for several seconds so retries didn't matter" | 20:00 |
| TheJulia | The *only* alternative I can think of is "recognize this is a complete detach of the instance metadata due to instance_info and instance_uuid both being reset, and hard lock the update or somehow have more aggressive internal retry logic" | 20:01 |
| TheJulia | But that feels super complex possibly | 20:01 |
| TheJulia | and leads down a path of wanting to over-engineer | 20:01 |
| JayF | well, where do we die with the lack of a lock | 20:01 |
| opendevreview | Marcus Furlong proposed openstack/sushy master: remove tests for boot and actions missing attributes https://review.opendev.org/c/openstack/sushy/+/968069 | 20:02 |
| JayF | internal retries on requests that involve removal of instance_uuid is a really, really clever idea depending on where it'd blow up | 20:02 |
| JayF | maybe with deferred task framework, we could one day do a "accepted" type of response for undeploys in cases of locked node | 20:02 |
| JayF | like have a class of things that if locked, we will do it later instead | 20:02 |
| JayF | IDK | 20:02 |
| * JayF spitballing | 20:03 | |
| TheJulia | everything points to update_node call handling inside manager.py | 20:05 |
| TheJulia | I dunno, I'd rather not modify the task_manager with an optional retry override | 20:05 |
| TheJulia | but if cardoe can give some insight, that will help in the end | 20:06 |
| opendevreview | Verification of a change to openstack/ironic-python-agent-builder master failed: Add simple-init by default https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/966192 | 20:39 |
| opendevreview | Nahian Pathan proposed openstack/ironic master: Reduce API calls when collecting sensor data with redfish https://review.opendev.org/c/openstack/ironic/+/955484 | 20:51 |
| opendevreview | Julia Kreger proposed openstack/ironic-python-agent master: ci: disable vnc on IPA jobs https://review.opendev.org/c/openstack/ironic-python-agent/+/968078 | 20:56 |
| opendevreview | Julia Kreger proposed openstack/ironic-python-agent master: Remove PReP support https://review.opendev.org/c/openstack/ironic-python-agent/+/965390 | 21:03 |
| JayF | there was a change already up for that? | 21:10 |
| JayF | https://review.opendev.org/c/openstack/ironic-python-agent/+/964200 | 21:10 |
| JayF | cid proposed it before the ptg even hit | 21:10 |
| JayF | oh, deprecate vs remove | 21:11 |
| JayF | hm | 21:11 |
| opendevreview | Julia Kreger proposed openstack/ironic master: deprecate PReP partition support https://review.opendev.org/c/openstack/ironic/+/968081 | 21:21 |
| JayF | TheJulia: ^ | 21:22 |
| JayF | I'm not sure which direction we wanna go, but the patches in flight right now are duplicating work :/ | 21:22 |
| TheJulia | oh, I had no idea cid was going to do that | 21:25 |
| TheJulia | well, odds are we can safely remove it from IPA as long as we don't remove the call handling in ironic until the deprecation, but *shrugs* | 21:25 |
| JayF | it was done pre-ptg :) | 21:26 |
| TheJulia | oh, really?! | 21:26 |
| TheJulia | wow | 21:26 |
| TheJulia | okay | 21:26 |
| JayF | yeah, it's not labelled -prio | 21:26 |
| JayF | so it's sorta in a weird limbo | 21:26 |
| TheJulia | I mean, we should review patches outside of -prio, but prio does give us a smaller list :) | 21:28 |
| JayF | I actually stopped using prio dash for the most part | 21:30 |
| JayF | unless I'm really, really low on time | 21:30 |
| JayF | started noticing too many non-core patches getting big review delays, so I just use my "anything in ironic repo I haven't reviewed" search | 21:31 |
| TheJulia | cid, on https://review.opendev.org/c/openstack/python-ironicclient/+/955102, you need to update the client's max version, it's a bit hidden but... I think in http.py ;) | 21:32 |
| TheJulia | Yeah, sort of the same | 21:33 |
| JayF | speaking of review; I need to make 100% sure alain gets a review from me today | 21:33 |
| * JayF goes and does that | 21:33 | |
| cardoe | I've been adding stuff to ironic-week-prio that seems ready. | 21:34 |
| cardoe | But the queue is backed up | 21:34 |
| JayF | yeah I'm not saying doing that or using that dashboard is bad | 21:34 |
| JayF | but as someone with a lot of review b/w I try to cast a larger net | 21:35 |
| TheJulia | some of the items also need to be fixed/rebased/adjusted | 21:37 |
| JayF | IMO as cores we should drop the tag from anything needing code changes that hasn't been updated in a couple of days | 21:38 |
| JayF | but maybe that's a bit aggro :) | 21:38 |
| TheJulia | Yeah | 21:44 |
| TheJulia | I'm sort of pondering trying to take a day, and that might be something I do this next week, to take a look at a couple of them and just do a "I'm updating it for you" | 21:44 |
| TheJulia | It's something I tend to do at times but haven't done recently | 21:44 |
| JayF | I am like 50/50 on if I'm working Monday, if you wanted to carve some time out for a group review/rebase party, I would make time for it | 21:45 |
| * JayF is out most of next week going home to NC to visit family | 21:45 | |
| TheJulia | JayF: enjoy! | 21:45 |
| TheJulia | I'm sort of hoping next week is nice and quiet, I've got some open downstream items which really require people to re-appear who are on PTO right now, so... *shrug* | 21:46 |
| JayF | if we got a third core we could just go land half the world, one person to update, two to review, bang out some patches | 21:46 |
| TheJulia | +++ | 21:46 |
| TheJulia | I might be able to make some time on monday | 21:46 |
| JayF | if so we need to pick a time since I won't just be around all day :) lmk, no rush really | 21:46 |
| TheJulia | It looks like, and I'll double check, but I think I only have 1 meeting on Monday | 21:46 |
| TheJulia | monday morning after the team meeting? | 21:46 |
| JayF | I think I have a vet appt on Monday (for my cats :P) but otherwise I should be clear. Unsure when that is, I hope I put it on the calendar lol | 21:47 |
| JayF | I should be able to make that. Not gonna set an alarm for the meeting though :D | 21:47 |
| TheJulia | lol | 21:47 |
| TheJulia | sounds good | 21:47 |
| JayF | cid: if you are working monday, and can come to the review jam right after team meeting, please do :D | 21:48 |
| JayF | cardoe: also ^^^ similarly a chance for you to help us shave down that list | 21:48 |
| TheJulia | that list should shrink, I aggressively approved some backports | 21:49 |
| TheJulia | But, yeah. | 21:49 |
| cid | ,,,, | 21:50 |
| opendevreview | Merged openstack/ironic master: Support segmented port ranges https://review.opendev.org/c/openstack/ironic/+/967727 | 21:50 |
| * cid tests, for some reason I couldn't send messages here for a while now. | 21:50 | |
| cid | TheJulia, patch 955102 ack'ed | 21:50 |
| cid | JayF, sure, I will be working Monday. | 21:50 |
| opendevreview | cid proposed openstack/python-ironicclient master: A new `instance_name` field to the node object https://review.opendev.org/c/openstack/python-ironicclient/+/955102 | 21:51 |
| opendevreview | Doug Goldstein proposed openstack/ironic master: fix: glance image member lookup resulted in an empty list always https://review.opendev.org/c/openstack/ironic/+/968087 | 21:51 |
| cardoe | JayF: yep | 21:51 |
| cardoe | I've been slowly approving backports | 21:51 |
| JayF | cardoe: I literally spent my first like, 6 months at GR-OSS just backporting every bugfix since the beginning of time. Satisfying but time consuming work for something so simple. | 21:52 |
| cardoe | ^ that's not even what I'm trying to fix but it's spamming my logs... so that's fixed now... it needs to be backported to 2025.2 | 21:52 |
| cardoe | but in all seriousness... do we want to hold off on https://review.opendev.org/c/openstack/ironic/+/964570 ? | 21:53 |
| cardoe | I'm waiting on that to decide how I approach some of the other segment issues | 21:53 |
| cardoe | As well as refreshing my patch on deleting the check that JayF wanted clif to delete. | 21:53 |
| JayF | I don't have enough context to say one way or the other re 964570 | 21:54 |
| cardoe | probably a TheJulia poke | 21:55 |
| TheJulia | merge it | 21:56 |
| TheJulia | MMMEEERGE! | 21:56 |
| TheJulia | cardoe: unless you don't want us to?! | 21:56 |
| cardoe | well I didn't think it right to +W my own patch | 21:57 |
| TheJulia | fair enough | 21:57 |
| * TheJulia does it | 21:57 | |
| TheJulia | done | 21:57 |
| JayF | cardoe: I don't hesitate to +W my own patch if it has 2x+2 from other cores, as long as it's not something I JUST posted up | 21:58 |
| TheJulia | same | 21:59 |
| TheJulia | If it's been up for a while, I'm okay with it, otherwise I only do stuff like that if it's a CI fix or something semi-critical so other stuff can flow | 21:59 |
| opendevreview | Merged openstack/ironic-python-agent stable/2025.1: Test advertised ip reachability before assigning it https://review.opendev.org/c/openstack/ironic-python-agent/+/966776 | 22:08 |
| opendevreview | Merged openstack/ironic-python-agent bugfix/11.1: Test advertised ip reachability before assigning it https://review.opendev.org/c/openstack/ironic-python-agent/+/966774 | 22:08 |
| opendevreview | Merged openstack/ironic-python-agent bugfix/11.0: Test advertised ip reachability before assigning it https://review.opendev.org/c/openstack/ironic-python-agent/+/966775 | 22:08 |
| opendevreview | Merged openstack/ironic-python-agent bugfix/11.1: Fix RuntimeError when stopping heartbeater in rescue mode https://review.opendev.org/c/openstack/ironic-python-agent/+/967356 | 22:08 |
| opendevreview | Merged openstack/ironic-python-agent bugfix/11.0: Fix RuntimeError when stopping heartbeater in rescue mode https://review.opendev.org/c/openstack/ironic-python-agent/+/967357 | 22:08 |
| cardoe | My favorite messages in this channel. | 22:09 |
| JayF | bugfix/11.* made me do a double take | 22:15 |
| JayF | until I saw IPA lol | 22:15 |
| cardoe | okay now that this member list thing is quiet... here's the other weird one | 22:18 |
| cardoe | 2025-11-21 16:24:09.848 13 ERROR ironic.drivers.modules.image_cache [None req-1642551b-ad53-4692-927d-eca2d6cdf90f 6dff409ebb31414299c3d0cd837eea9e 32e02632f4f04415bab5895d1e7247b7 - - a6f7dcd63c9b4940915062f57a48df77 7f46f53fcb3c4625a343eaa35b5e0d04] Could not link image 5048de36-4c01-4047-b638-195e639cb1e3 from /var/lib/openstack-helm/ironic/master_images/5048de36-4c01-4047-b638-195e639cb1e3.converted to | 22:18 |
| cardoe | /var/lib/openstack-helm/ironic/images/5180e19d-c3c6-4afb-b626-08d70ec1f456/disk, error: [Errno 2] No such file or directory: '/var/lib/openstack-helm/ironic/master_images/tmpgmxnx_5h/5048de36-4c01-4047-b638-195e639cb1e3.converted': FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/openstack-helm/ironic/master_images/tmpgmxnx_5h/5048de36-4c01-4047-b638-195e639cb1e3.converted' | 22:18 |
| JayF | I have a present for you cardoe | 22:20 |
| cardoe | The weekend? | 22:20 |
| JayF | cardoe: https://review.opendev.org/c/openstack/ironic/+/964502 I suspect you're hitting the bug this fixes | 22:20 |
| cardoe | https://opendev.org/openstack/ironic/src/commit/27c805b97ffd7a4a159860537311815f579d9cf3/ironic/drivers/modules/image_cache.py#L243 I'm trying to figure out if it's the first os.link or the 2nd | 22:20 |
| JayF | cardoe: I spent a hilariously long time trying to get a big enough image into glance to test that before ragequitting it for a while, and I haven't gotten back around | 22:20 |
| cardoe | okay so it doesn't happen on 2025.1 and it doesn't happen when I turn caching off. | 22:22 |
| cardoe | so what you're saying is I gotta stop slamming this overloaded box? | 22:24 |
| cardoe | It's actually not overloaded. Someone played a funny on me and set the k8s resource limit on CPU per pod to like nada. | 22:25 |
| JayF | I think there's a case that can happen (emphasis on the /think/, this is 100% untested hypothesis territory) where your conductor gets timed out while doing the hash | 22:26 |
| JayF | our code that deletes an image after a failed deployment fires | 22:26 |
| JayF | and you sorta end up in a loop of sadness | 22:26 |
| JayF | I know johnthetubaguy has reported having deployments fail the first time and succeed later, but I think that's with a version of ironic old enough to not clear cache on failures | 22:26 |
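(A stripped-down sketch of the hard-link step from the image_cache.py line referenced above and the race being hypothesized; paths, names, and the fallback behaviour are illustrative only.)

```python
# Sketch only: the master copy gets hard-linked into the node's instance
# directory. If a concurrent cleanup removes the master/temp file between
# the cache lookup and the link, os.link raises the FileNotFoundError seen
# in the log above. Paths and the recovery path here are made up.
import errno
import os


def link_master_into_instance_dir(master_path, dest_path):
    try:
        os.link(master_path, dest_path)  # hard link, no data copy
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            # Master image vanished underneath us (e.g. cleaned up after a
            # failed deploy elsewhere) -> caller should re-fetch instead.
            raise RuntimeError(
                "master image %s disappeared before it could be linked to %s"
                % (master_path, dest_path)) from exc
        raise
```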
| cardoe | I'm being an absolute stick in the mud and wanting 2025.2 to work happy on this box that's clearly struggling and not just on regular hardware. | 22:27 |
| opendevreview | Verification of a change to openstack/ironic-python-agent-builder master failed: Add simple-init by default https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/966192 | 22:53 |
| cardoe | JayF: yes jsonschema works with yaml | 23:02 |
| opendevreview | Merged openstack/ironic master: Include inspector conf groups in sample conf/docs https://review.opendev.org/c/openstack/ironic/+/952338 | 23:09 |
| opendevreview | Merged openstack/ironic master: pass along physical_network to neutron from the baremetal port https://review.opendev.org/c/openstack/ironic/+/964570 | 23:28 |
| JayF | cardoe: clif: Then we should probably use jsonschema to do the validation unless there's a specific reason we cannot. | 23:48 |
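(A minimal sketch of validating a YAML document with jsonschema, as suggested here; the schema and document are made up for illustration.)

```python
# Minimal sketch: PyYAML parses the document into plain dicts/lists, then
# jsonschema validates that structure. Schema and document are invented.
import jsonschema
import yaml

schema = {
    "type": "object",
    "properties": {
        "networks": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["name"],
                "properties": {"name": {"type": "string"}},
            },
        },
    },
    "required": ["networks"],
}

document = yaml.safe_load("""
networks:
  - name: provisioning
  - name: tenant
""")

jsonschema.validate(instance=document, schema=schema)
print("document is valid")
```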