| opendevreview | Verification of a change to openstack/ironic master failed: Remove old -ipmitool job definitions https://review.opendev.org/c/openstack/ironic/+/974789 | 00:32 |
|---|---|---|
| opendevreview | Merged openstack/ironic master: Change default value of some redfish firmware update config https://review.opendev.org/c/openstack/ironic/+/974800 | 00:43 |
| Continuity | TheJulia: not sure I understand, I'll catch up with you tomorrow | 00:52 |
| TheJulia | Continuity: it’s always easier to embrace the thing in motion if you align with it, harder if you try to move against it. | 01:22 |
| TheJulia | Momentum and all ;) | 01:23 |
| opendevreview | Merged openstack/networking-generic-switch master: Migrate setup configuration to pyproject.toml format https://review.opendev.org/c/openstack/networking-generic-switch/+/973526 | 02:47 |
| opendevreview | Jacob Anders proposed openstack/python-ironicclient master: Remove health from default node list columns https://review.opendev.org/c/openstack/python-ironicclient/+/974984 | 08:03 |
| janders | ^^ follow up to health monitoring change, client-side. This is to prevent this: https://paste.openstack.org/show/bwS9KrGw1g9AHzcNIPNU/ CC stevebaker[m] cardoe. Would welcome your thoughts, meanwhile will work on a doco addition to cover health. | 08:09 |
| janders | Cause: we removed health from default API fields in the last round of reviews, but this got missed on the client side. Happens I suppose :) | 08:10 |
| rpittau | good morning ironic! o/ | 08:13 |
| opendevreview | Pierre Riteau proposed openstack/ironic master: Set columns in bios_settings table to bigint https://review.opendev.org/c/openstack/ironic/+/968348 | 08:22 |
| janders | hey rpittau o/ | 08:25 |
| abongale | good morning o/ | 08:33 |
| opendevreview | Jacob Anders proposed openstack/ironic master: Add documentation for node health monitoring feature https://review.opendev.org/c/openstack/ironic/+/974985 | 08:37 |
| dtantsur | cardoe: since we want consistency between redfish and agent inspections, it makes sense to delegate to the hook. A change like that will definitely need a release note though because of upgrade impact. | 12:40 |
| dtantsur | ehmmm, folks, can anyone open https://docs.openstack.org/ironic/latest/ ? | 12:45 |
| cardoe | janders: I don’t think we can edit release notes like that. | 12:57 |
| cardoe | dtantsur: loads for me | 12:57 |
| kubajj | hello Ironic! o/ | 13:23 |
| kubajj | This might be a stupid question: Is the default_inspect_interface config option not used in Ironic? | 13:24 |
| kubajj | It is either in docs, api-ref or the config https://www.irccloud.com/pastebin/WMq8MRMk/git%20grep%20default_inspect_interface%20over%20ironic%20repo | 13:27 |
| dtantsur | kubajj: it's supposed to be used | 13:32 |
| TheJulia | good morning | 14:01 |
| TheJulia | BRRAAAINS | 14:01 |
| TheJulia | dtantsur: The latest docs worked just fine for me | 14:10 |
| dtantsur | yeah, they open for me now too. got 403 a few times. | 14:11 |
| TheJulia | Interesting | 14:14 |
| janders | cardoe ACK, will respin it | 14:16 |
| janders | (w/r/t health monitoring client followup) | 14:16 |
| cardoe | rpittau: if ya get a sec to respin... https://review.opendev.org/c/openstack/bifrost/+/974855 | 14:27 |
| rpittau | cardoe: done | 14:28 |
| abongale | JayF: stevebaker[m]: thanks for the review on https://review.opendev.org/c/openstack/python-ironicclient/+/973948. | 14:33 |
| abongale | JayF: raised some questions on the patchset: How much API compatibility do we owe existing users?, is this OK with a major version bump?, and can the SDK team review? | 14:35 |
| abongale | idk how to address them, if ironic can chime in, please? | 14:35 |
| cardoe | I'm good with a major bump for it to ship. | 14:40 |
| cardoe | That change would make scripting the client much easier for me. | 14:40 |
| cardoe | Cause right now the field names change inconsistently. | 14:40 |
| TheJulia | it's sort of a bug fix because of that inconsistency, but we should treat it as major because it is breaking. One thing for sure, it will be breaking for folks with scripting looking for existing fields. | 14:43 |
| opendevreview | Allain Legacy proposed openstack/ironic master: Add ironic-networking network interface https://review.opendev.org/c/openstack/ironic/+/966470 | 14:44 |
| opendevreview | Allain Legacy proposed openstack/ironic master: Add standalone networking service installation guide https://review.opendev.org/c/openstack/ironic/+/966471 | 14:44 |
| opendevreview | Allain Legacy proposed openstack/ironic master: Improve exception handling in switch driver factory https://review.opendev.org/c/openstack/ironic/+/969852 | 14:44 |
| opendevreview | Allain Legacy proposed openstack/ironic master: Address remaining review comments for rpc methods https://review.opendev.org/c/openstack/ironic/+/971184 | 14:44 |
| opendevreview | Merged openstack/ironic-python-agent bugfix/11.4: Update .gitreview for bugfix/11.4 https://review.opendev.org/c/openstack/ironic-python-agent/+/974739 | 14:50 |
| cardoe | TheJulia: yep. break all my scripts please cause the scripts in OpenStack Helm suck cause of this inconsistency. | 14:52 |
| TheJulia | ... That is not a response I expected from the hint of challenge ;) | 14:53 |
| opendevreview | Clif Houck proposed openstack/ironic master: Prevent multiple attach actions being generated for the same port https://review.opendev.org/c/openstack/ironic/+/974520 | 14:54 |
| opendevreview | Clif Houck proposed openstack/ironic master: Filter out NoMatch actions in _vif_attach_tbn https://review.opendev.org/c/openstack/ironic/+/974569 | 14:54 |
| opendevreview | Clif Houck proposed openstack/ironic master: Update TBN config file to improve trait structure https://review.opendev.org/c/openstack/ironic/+/974776 | 14:54 |
| opendevreview | Clif Houck proposed openstack/ironic master: Add an ordering method for TBN traits https://review.opendev.org/c/openstack/ironic/+/974519 | 14:54 |
| opendevreview | Clif Houck proposed openstack/ironic master: Add default trait behavior to _vif_attach_tbn https://review.opendev.org/c/openstack/ironic/+/974960 | 14:54 |
| opendevreview | Clif Houck proposed openstack/ironic master: Filter out already attached portlikes in plan_vif_attach https://review.opendev.org/c/openstack/ironic/+/974521 | 14:54 |
| opendevreview | Clif Houck proposed openstack/ironic master: WIP: Update TBN simulator for vif_attach planning https://review.opendev.org/c/openstack/ironic/+/973691 | 14:54 |
| JayF | I did put a link to that patch in the SDK IRC | 15:03 |
| TheJulia | Maybe a pending comment? | 15:04 |
| opendevreview | Jacob Anders proposed openstack/python-ironicclient master: Remove health from default node list columns https://review.opendev.org/c/openstack/python-ironicclient/+/974984 | 15:09 |
| cardoe | TheJulia: so OpenStack Helm likes to do their init / bootstrap stuff with bash scripts. And they've optionally got quite a bit that can be done like add some nodes and ports for the nodes via a YAML file. | 15:10 |
| janders | cardoe removed the release note edit and created a new reno instead ^ | 15:10 |
| cardoe | I cannot easily parse the data with some helper functions cause the field names change between input and output. It's led to a number of bugs in the past. | 15:10 |
| TheJulia | woot, I got my updated ironic-neutron-agent started.... but we won't talk about the exceptions coming out of it | 15:11 |
| * TheJulia | suspects the software is traveling in a light cone past the event horizon | 15:11 |
| janders | kubajj once we have the client follow up sorted, would you be able to / interested in testing hardware health monitoring on a test system? I tested it on real hardware, but only one machine (which implies one model) and I can't easily simulate failures (so can't validate how it reacts to failure). I figured you folks may have a ton of hardware | 15:12 |
| janders | including some moderately broken nodes :) | 15:12 |
| cardoe | janders: were you talking about setting up an endpoint in the conductor that the redfish could send hardware events back to? | 15:14 |
| janders | cardoe no, this is separate - but I am interested in this also, with extending health monitoring in mind | 15:15 |
| janders | I think iurygregory and dtantsur were looking into events | 15:15 |
| cardoe | Do we have a bug for it or a tracker? | 15:15 |
| cardoe | I just want to reference it cause I think I'm going to ask Nidhi to work on that next. | 15:15 |
| cardoe | It's either that or the storage stuff, TheJulia. | 15:16 |
| TheJulia | who huh what? | 15:16 |
| cardoe | Nidhi on my team that's done a bunch of the redfish inspection changes. | 15:16 |
| TheJulia | oh, bugs please ? :) | 15:16 |
| * TheJulia | walks the corgi overlord | 15:17 |
| cardoe | So before I forget... https://review.opendev.org/c/openstack/python-ironicclient/+/974984 once that's in that should make the client 100% compat with the upcoming api release. | 15:17 |
| dtantsur | One of the worst aspects of IRC: not surviving even a mere reconnect... | 15:17 |
| dtantsur | cardoe: not sure if the link went through: https://docs.openstack.org/ironic/latest/admin/drivers/redfish/passthru.html#create-subscription | 15:18 |
| dtantsur | I'd be very interested in having something in Ironic that would serve as a sink for events though | 15:18 |
| dtantsur | even if it's as simple as dumping everything with certain severity in the logs | 15:19 |
| cardoe | dtantsur: yeah exactly.. building on that | 15:20 |
| cardoe | https://specs.openstack.org/openstack/ironic-specs/specs/backlog/event-subscription.html we've got that spec, which says it's a backlog. I've not read it, but I know that passthru exists, so maybe it's done? | 15:21 |
| janders | once we have monitoring all the way through the stack with health info sourced in Ironic in BMO, I'd be interested in extending it from a single OK|WARN|CRIT aggregate field to providing more data on what's actually failing | 15:22 |
| kubajj | janders: oh, we do have broken nodes | 15:22 |
| janders | kubajj awesome! :) | 15:22 |
| dtantsur | cardoe: it was not done as a generic API | 15:22 |
| janders | If I can ask you to re+2 the patch we will have all the code in master shortly | 15:22 |
| kubajj | I could probably cherry-pick the changes to our qa region | 15:23 |
| janders | awesome | 15:23 |
| janders | it would be great to see how the feature works in the wild | 15:23 |
| janders | my main focus areas are 1) reacting to actual failures and 2) how it works on heterogeneous hw | 15:24 |
| kubajj | janders: I need all 3, right? | 15:24 |
| kubajj | (ironic, sdk, client) | 15:24 |
| janders | yes | 15:24 |
| janders | + client-followup (to avoid messy outputs) | 15:24 |
| janders | so 4 | 15:24 |
| janders | will you be interested in having health info in metal3/BMO (what I will be working on next w/r/t this feature)? | 15:25 |
| janders | or are you thinking Ironic at this stage? | 15:26 |
| TheJulia | dtantsur: ... I thought there was some work on event sinking at some point? | 15:30 |
| dtantsur | There was a desire :) | 15:31 |
| opendevreview | Merged openstack/ironic-prometheus-exporter master: Use SERVICE_HOST for url https://review.opendev.org/c/openstack/ironic-prometheus-exporter/+/974631 | 15:31 |
| dtantsur | It may even end up on our backlog in 2026 (in the form I mentioned above: just dump them in the conductor logs) | 15:31 |
| JayF | Node history might be an interesting place to toss those kind of alerts too | 15:47 |
| JayF | Or potentially even republishing as notifications? | 15:47 |
| * TheJulia | pokes neutron and wonders why it is seemingly on vacation | 16:33 |
| TheJulia | ... wut | 16:35 |
| TheJulia | doh! | 16:39 |
| TheJulia | misconfiguration | 16:39 |
| TheJulia | JayF: it might also be a super bad idea, some gear as I understand it dumps an event for any sensor change | 16:40 |
| JayF | You shouldn't be able to write things that throw events or alerts until you've been on call lol | 16:40 |
| TheJulia | "oh, temp sensor reading now 29.1c" "oh, temp sensor is now 29.2" *human stands in front of rack* "temp sensor is now 29.4!" | 16:40 |
| JayF | alert fatigue, my friends | 16:40 |
| JayF | :( | 16:40 |
| JayF | even my meat thermometer is smart enough to let me configure how many degrees of change before it pushes an update | 16:41 |
| cardoe | What about those that throw events for the same temp? because the value changed but due to rounding the displayed value didn't change. | 16:51 |
| cardoe | It's devices like that which sometimes make me think... ya know... global EMP... let's start over from the ground up. | 16:52 |
| JayF | the only real way to solve stuff like that in the medium/long term is 1) working with DMTF via folks like janders and 2) convincing your purchasing team that quality of BMC software matters, and having them communicate that to their vendors | 16:53 |
| JayF | ultimately, our own employers' willingness to buy hardware with subpar firmware is the primary root cause of the issue we have any power to address | 16:53 |
| TheJulia | yup | 16:54 |
| JayF | In my experience working with vendor engineers, they've come off as smug caricatures of a "works for me" developer who is tossing crap over the wall :( | 16:54 |
| TheJulia | and trying to close the loop | 16:54 |
| JayF | "No, our software isn't broken because [specific CURL command] works" | 16:54 |
| dtantsur | :D | 17:10 |
| dtantsur | Folks, do we have anything for the pattern like: I want networking prepared for deployment but then no actual deployment to happen? Is it essentially the custom-deploy interface without any extra steps? | 17:11 |
| JayF | noop deploy interface, I assume, doesn't touch networking? | 17:16 |
| JayF | I would expect noop deploy interface to still end up with all the network bits in the right place if configured though | 17:17 |
| JayF | so... is that your answer? | 17:17 |
| clif | TheJulia: Regarding https://review.opendev.org/c/openstack/ironic/+/972495/comment/e8914144_63b34c62/ do you want me to remove the in-class lock and just use the global module-level lock? I don't think there is a risk of cross-locking atm but I could be mistaken. | 17:23 |
| TheJulia | clif: I guess it depends on how you're using the class, if you're using it once and there is only one instance, then why have the lock. | 17:23 |
| TheJulia | clif: then the other question is do you really really need a sub-lock, because each time it is initialized is an instance of it, and cross-threading wise it needs to be defined in advance of a thread launch | 17:24 |
| TheJulia | That is where I was kind of coming from, without looking at the actual class instantiation, does that make sense? | 17:25 |
| clif | Yes, I suppose that's true. | 17:25 |
| clif | I'll remove the class-level lock | 17:25 |
| TheJulia | ok, cool, thanks! | 17:25 |
| opendevreview | Clif Houck proposed openstack/ironic master: Add ConfigLoader class to TBN and use it in TaskManager https://review.opendev.org/c/openstack/ironic/+/972495 | 17:29 |
| opendevreview | Clif Houck proposed openstack/ironic master: Integrate TBN with vif_attach https://review.opendev.org/c/openstack/ironic/+/973690 | 17:29 |
| opendevreview | Clif Houck proposed openstack/ironic master: Prevent multiple attach actions being generated for the same port https://review.opendev.org/c/openstack/ironic/+/974520 | 17:29 |
| opendevreview | Clif Houck proposed openstack/ironic master: Filter out NoMatch actions in _vif_attach_tbn https://review.opendev.org/c/openstack/ironic/+/974569 | 17:29 |
| opendevreview | Clif Houck proposed openstack/ironic master: Update TBN config file to improve trait structure https://review.opendev.org/c/openstack/ironic/+/974776 | 17:29 |
| opendevreview | Clif Houck proposed openstack/ironic master: Add an ordering method for TBN traits https://review.opendev.org/c/openstack/ironic/+/974519 | 17:29 |
| opendevreview | Clif Houck proposed openstack/ironic master: Add default trait behavior to _vif_attach_tbn https://review.opendev.org/c/openstack/ironic/+/974960 | 17:29 |
| opendevreview | Clif Houck proposed openstack/ironic master: Filter out already attached portlikes in plan_vif_attach https://review.opendev.org/c/openstack/ironic/+/974521 | 17:29 |
| opendevreview | Clif Houck proposed openstack/ironic master: WIP: Update TBN simulator for vif_attach planning https://review.opendev.org/c/openstack/ironic/+/973691 | 17:30 |
| opendevreview | Clif Houck proposed openstack/ironic master: Add ConfigLoader class to TBN and use it in TaskManager https://review.opendev.org/c/openstack/ironic/+/972495 | 17:44 |
| opendevreview | Clif Houck proposed openstack/ironic master: Integrate TBN with vif_attach https://review.opendev.org/c/openstack/ironic/+/973690 | 17:44 |
| opendevreview | Clif Houck proposed openstack/ironic master: Prevent multiple attach actions being generated for the same port https://review.opendev.org/c/openstack/ironic/+/974520 | 17:44 |
| opendevreview | Clif Houck proposed openstack/ironic master: Filter out NoMatch actions in _vif_attach_tbn https://review.opendev.org/c/openstack/ironic/+/974569 | 17:44 |
| opendevreview | Clif Houck proposed openstack/ironic master: Update TBN config file to improve trait structure https://review.opendev.org/c/openstack/ironic/+/974776 | 17:44 |
| opendevreview | Clif Houck proposed openstack/ironic master: Add an ordering method for TBN traits https://review.opendev.org/c/openstack/ironic/+/974519 | 17:44 |
| opendevreview | Clif Houck proposed openstack/ironic master: Add default trait behavior to _vif_attach_tbn https://review.opendev.org/c/openstack/ironic/+/974960 | 17:44 |
| opendevreview | Clif Houck proposed openstack/ironic master: Filter out already attached portlikes in plan_vif_attach https://review.opendev.org/c/openstack/ironic/+/974521 | 17:44 |
| opendevreview | Clif Houck proposed openstack/ironic master: WIP: Update TBN simulator for vif_attach planning https://review.opendev.org/c/openstack/ironic/+/973691 | 17:44 |
| JayF | ^^ We are making major progress on this and more of the patches are landable | 18:23 |
| TheJulia | cool cool | 18:23 |
| TheJulia | Thanks! | 18:24 |
| JayF | I <3 all the ddt tests | 18:26 |
| JayF | I used to hate that big decorator but not so much anymore :D | 18:26 |
| janders | cardoe JayF happy to help. Dealing with two issues at DMTF at the moment, one should be sorted this week so can raise another one if need be. Happy to talk further. | 19:01 |
| opendevreview | Merged openstack/python-ironicclient master: Remove health from default node list columns https://review.opendev.org/c/openstack/python-ironicclient/+/974984 | 19:10 |
| janders | ^ yeay! last actual health monitoring patch from the initial implementation, thanks JayF clif cid | 19:16 |
| janders | now just doco left: https://review.opendev.org/c/openstack/ironic/+/974985 | 19:16 |
| janders | if you have any feedback I will be happy to pass it to Claude | 19:16 |
| *** | dking is now known as Guest702 | 19:22 |
| *** | Guest702 is now known as dking | 19:25 |
| dking | dtantsur (or anybody who knows): For IrSO, where are the deploy_kernel and deploy_ramdisk variable set? | 19:27 |
| opendevreview | Nahian Pathan proposed openstack/ironic master: Reduce API calls when collecting sensor data with redfish https://review.opendev.org/c/openstack/ironic/+/955484 | 20:20 |
| cardoe | janders, JayF: I was just making a joke about bad hardware in the world. Nothing specific. | 21:40 |
| JayF | too late, I already signed you up for "fix all problems with hardware ever" workitem for next cycle | 21:44 |
| TheJulia | dude, that is every cycle | 21:44 |
| JayF | well, we usually spread it across the team | 21:44 |
| JayF | next cycle? All Doug. | 21:44 |
| JayF | TheJulia: alegacy: https://review.opendev.org/c/openstack/ironic/+/966471/10..11#message-69fe92d772eee613906ea45b10528d3e7e7a24b0 I don't wanna leave a -1 on there, but I am not OK with us pre-determining we're going to break our API at some future point | 21:47 |
| JayF | forcing a migration to get new features? very yes. Planning to fully remove something that someone might have deployed working during Gazpacho? not cool :( | 21:48 |
| TheJulia | Lets take a step back | 21:51 |
| TheJulia | what did we do with vifs ? | 21:51 |
| JayF | I don't know what you're getting at? | 21:52 |
| JayF | if I, as an operator, deploy this, and then 2-3 releases later have to migrate, I'd be upset and for good reason | 21:53 |
| JayF | that's entirely the mindset I'm in around this | 21:53 |
| TheJulia | yeah, and I think you're jumping to an extreme interpretation | 21:53 |
| TheJulia | In that, with vifs, which was the same sort of pattern | 21:53 |
| TheJulia | we KEPT the extra field mappings for ages | 21:53 |
| JayF | I said "I assume we'll keep things working per microversion" and the response was "no, we're planning to break it" | 21:53 |
| JayF | as long as our plan involves something like that, it's fine | 21:53 |
| TheJulia | oh, no, you don't microversion that away | 21:54 |
| TheJulia | you always have to check it | 21:54 |
| JayF | I just don't want us to treat shipping a feature labelled "experimental" as carte blanche to break it up | 21:54 |
| JayF | s/it up/the api/ | 21:54 |
| TheJulia | the mechanism later is microversion guarded though | 21:54 |
| TheJulia | extra, is just a freeform field | 21:54 |
| JayF | true-freeform | 21:54 |
| JayF | not fake-freeform with a schema like local_link_conn | 21:54 |
| TheJulia | so, when you're looking at it, you can't use the microversion because you don't know it | 21:54 |
| JayF | it'd have to look something like our instance_name thing | 21:54 |
| JayF | where if you API call to update extra[standalone_info] (whatever the key is), we intercept it and put it in the new field | 21:55 |
| JayF | (we proxy instance_info[instance_name] to node.instance_name in API code now) | 21:55 |
| JayF | the reason this in particular is rough to me | 21:56 |
| JayF | is that it's even worse than an "N" calls for "N" nodes | 21:56 |
| JayF | this is N calls for N ports which is a multiplier of nodes | 21:56 |
| JayF | so asking operators after the fact to move that metadata around is a big, big ask | 21:56 |
| JayF | (and bluntly, exactly the sort of thing that has caused pain for my downstream, operationally, slowing upgrades -- e.g. the ilo->redfish driver swap) | 21:57 |
| JayF | I'm not just being an API pedant for the hell of it :) | 21:57 |
| JayF | an online_data_migration + API shim could probably allow us to do the progression as we intend without putting operators at risk of a nasty migration | 22:00 |
| TheJulia | clarifying comment on the post, I do like a migration goal if it is clean enough | 22:02 |
| TheJulia | we... for some reason... refused to go down that path in large part because of nova | 22:02 |
| JayF | https://review.opendev.org/c/openstack/ironic/+/966471/10..11#message-69fe92d772eee613906ea45b10528d3e7e7a24b0 I flipped my vote back | 22:06 |
| TheJulia | We couldn't get them to change the interface quickly and we kept getting push back and even cases where people were starting to run dramatically different versions | 22:07 |
| JayF | It's interesting because with ironic-networking and many new features | 22:09 |
| TheJulia | oh, absolutely | 22:09 |
| TheJulia | claude, how did you break tooze hashring.py | 22:09 |
| JayF | we're moving more and more from "we control both server/client of API" to "we are an API that is used for god-knows-what" | 22:09 |
| TheJulia | yeah | 22:11 |
| TheJulia | Continuity: sorry, didn't remember to ping you earlier. Did my last statement make sense? | 22:22 |
| cardoe | speaking of vif mappings and cleaning that up still... https://review.opendev.org/c/openstack/ironic/+/973557 that's fixing the last of that up and making the interface safe from future changes. | 22:35 |
| cardoe | If ya wanna peek TheJulia | 22:35 |
| cardoe | It fixes a couple of outstanding bugs with servicing vs tenant vif selection | 22:36 |
| cardoe | How those bugs manifest themselves... I dunno yet. | 22:36 |
| TheJulia | I'll have to look with a fresh brain | 22:42 |
| TheJulia | been fairly heads down on the agent | 22:42 |
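
The 16:40 to 16:51 exchange about alert fatigue (a BMC emitting an event for every 0.1 degree change, versus a thermometer that lets you configure how many degrees of change trigger an update) describes what is commonly called a deadband filter. A minimal sketch of that idea follows. The class name, the `threshold` parameter, and the sensor id are hypothetical illustrations and are not taken from Ironic or any Redfish library.

```python
from dataclasses import dataclass, field


@dataclass
class DeadbandFilter:
    """Suppress sensor updates that moved less than `threshold`.

    Hypothetical helper for illustration only; not part of Ironic.
    """

    threshold: float = 1.0
    # Last value actually reported, keyed by sensor id.
    _last_reported: dict = field(default_factory=dict)

    def should_report(self, sensor_id: str, reading: float) -> bool:
        last = self._last_reported.get(sensor_id)
        # Always report the first reading; afterwards only report when the
        # value has moved at least `threshold` away from the last report.
        if last is None or abs(reading - last) >= self.threshold:
            self._last_reported[sensor_id] = reading
            return True
        return False


# Readings loosely based on the example in the log, plus two made-up ones.
readings = [29.1, 29.2, 29.4, 30.2, 30.3]
deadband = DeadbandFilter(threshold=1.0)
reported = [r for r in readings if deadband.should_report("temp0", r)]
print(reported)
```

Comparing against the last reported value, rather than the previous raw reading, keeps a slow drift from being silently swallowed: many small steps still add up to a report once they cross the threshold.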
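
The API shim JayF describes around 21:55 (intercept a write to a legacy key under `extra` and store it in a dedicated field, the way `instance_info[instance_name]` is said to be proxied to `node.instance_name`) could look roughly like the sketch below. It uses plain dicts rather than Ironic's API or object code, and the names `standalone_info`, `network_info`, and `apply_port_update` are assumptions made up for illustration; the log itself says "whatever the key is".

```python
import copy

LEGACY_KEY = "standalone_info"  # hypothetical legacy key under `extra`
NEW_FIELD = "network_info"      # hypothetical dedicated field


def apply_port_update(port: dict, update: dict) -> dict:
    """Return a copy of `port` with `update` applied.

    A write to extra[LEGACY_KEY] is redirected to NEW_FIELD so that
    old clients keep working while data lands in the new location.
    """
    result = copy.deepcopy(port)
    update = copy.deepcopy(update)

    extra = update.get("extra", {})
    if LEGACY_KEY in extra:
        legacy_value = extra.pop(LEGACY_KEY)
        # An explicit value for the new field wins over the legacy one.
        update.setdefault(NEW_FIELD, legacy_value)

    for key, value in update.items():
        if key == "extra":
            # Merge remaining free-form keys instead of replacing them.
            result.setdefault("extra", {}).update(value)
        else:
            result[key] = value
    return result


port = {"uuid": "p1", "extra": {"note": "keep"}}
updated = apply_port_update(port, {"extra": {LEGACY_KEY: {"switch": "sw1"}}})
print(updated)
```

A shim like this only covers new writes. Existing rows would still need a one-time move, which is the role JayF assigns to an online data migration at 22:00.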