*** zbitter is now known as zaneb | 04:26 | |
rpittau | good morning ironic! o/ | 06:34 |
---|---|---|
opendevreview | Verification of a change to openstack/ironic master failed: Build PXE config for node in SERVICING state https://review.opendev.org/c/openstack/ironic/+/922006 | 06:41 |
masghar | Morning! | 08:19 |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: [WIP] Update the redfish interoperability profile https://review.opendev.org/c/openstack/ironic/+/920574 | 10:00 |
opendevreview | Verification of a change to openstack/ironic master failed: Update version change log with special treatment of .json removal https://review.opendev.org/c/openstack/ironic/+/921966 | 10:03 |
sylvr | Hello ! I have some questions regarding Ironic/Bifrost IPMI driver and the autodiscovery of my nodes (I'm using Kayobe to deploy Bifrost/Ironic), I'm able to use ipmitool to power cycle my nodes, but they don't seem to register themselves to Bifrost/Ironic... | 10:46 |
sylvr | I planned to check the log on the PXEboot server to check if any nodes are downloading the IPA ramdisk or if it is the IPA that are unable to register the nodes | 10:47 |
dtantsur | sylvr: first and foremost, have you enabled discovery in bifrost? it's off by default. | 10:48 |
sylvr | In `ostack/src/kayobe-config/etc/kayobe/inspector.yml` I have ```# Whether to enable discovery of nodes not managed by Ironic. | 10:50 |
sylvr | inspector_enable_discovery: "true" | 10:50 |
sylvr | ``` | 10:50 |
sylvr | and when trying to use another driver (redfish) the discovery was working, but I had issues with the driver being unable to communicate with the iDRAC because they're too old... | 10:51 |
dtantsur | It's weird: the driver does not affect how the hosts are booted | 11:08 |
sylvr | my thoughts too | 11:10 |
sylvr | how can I check the PXEboot server logs to see if my nodes are attempting pxeboot on the correct network? | 11:15 |
dtantsur | sylvr: it's either dnsmasq or tftpd (but I'd probably just use tcpdump) | 11:26 |
sylvr | okay thanks, I'll try | 11:27 |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: WIP migration guide from inspector https://review.opendev.org/c/openstack/ironic/+/922089 | 11:28 |
sylvr | dtantsur: okay, so I see some traffic on the network, but mostly the seed/bifrost communicating with the gateway (sometimes some ARP between nodes and seed/bifrost) and some UDP, but still nothing in `baremetal node list` | 12:07 |
sylvr | I power-cycled my nodes through ipmitool, but it didn't seem like a lot was happening... I'm going to take a look at my config files, maybe I overrode the discovery | 12:09 |
sylvr | BRB | 12:09 |
rpittau | FYI metal3 integration job is broken, again | 12:31 |
opendevreview | cid proposed openstack/ironic master: Provision ARM (aarch64) fake-bare-metal-vms https://review.opendev.org/c/openstack/ironic/+/915441 | 12:32 |
sylvr | back! | 12:40 |
TheJulia | good morning | 12:45 |
opendevreview | cid proposed openstack/ironic master: Provision ARM (aarch64) fake-bare-metal-vms https://review.opendev.org/c/openstack/ironic/+/915441 | 12:53 |
TheJulia | youch metal3 ModuleNotFoundError: No module named 'jinja2' | 13:13 |
TheJulia | sigh | 13:18 |
TheJulia | https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_43f/922006/2/check/ironic-tempest-ramdisk-bios-snmp-pxe/43f756c/controller/logs/screen-ir-cond.txt Wheeeee | 13:24 |
dtantsur | TheJulia: https://github.com/metal3-io/metal3-dev-env/pull/1427 | 13:24 |
dtantsur | good morning | 13:24 |
opendevreview | cid proposed openstack/ironic master: Flexible IPMI credential persistence method configuration https://review.opendev.org/c/openstack/ironic/+/917229 | 14:10 |
TheJulia | That should work I guess | 14:10 |
opendevreview | cid proposed openstack/ironic master: Provision ARM (aarch64) fake-bare-metal-vms https://review.opendev.org/c/openstack/ironic/+/915441 | 14:17 |
JayF | Do we need to have metal3-integration stop voting in ironic or ensure it's "voting" (or whatever the equivalent is) for metal3? | 14:43 |
rpittau | JayF: after the fix merges it will start working again | 14:44 |
rpittau | uhm wait I think I read your question wrong :D | 14:45 |
JayF | So I'm saying, the only way for that to get broken in the way it is, would be a commit that changed the behavior and broke it, yeah? | 14:45 |
JayF | And it not getting tested until *we* tested it? | 14:45 |
JayF | I'm saying if *Ironic* is the first time that metal3 code gets tested, we should not be voting on it | 14:45 |
rpittau | yes, but the breaking change was not tested with the behavior we use in ironic CI AFAICS | 14:46 |
rpittau | we're supposed to run some tests, we didn't | 14:46 |
rpittau | I guess the metal3-dev-env tests should be mandatory in metal3-dev-env repo | 14:46 |
rpittau | I don't think they are | 14:46 |
JayF | That's mainly what I'm getting at, I think something needs to be made voting somewhere on metal3 side | 14:47 |
rpittau | yeah, it happened last week as well | 14:51 |
rpittau | probably worth discussing during the next metal3 meeting | 14:51 |
TheJulia | I guess constraints don't actually check ironic either? | 14:51 |
TheJulia | err, requirements | 14:52 |
rpittau | TheJulia: they should, there's an ironic job in requirements AFAIK | 14:54 |
TheJulia | I guess it doesn't actually force usage of the newer version | 14:55 |
JayF | TheJulia: I'm confused, are we broken by a new requirement right now? | 14:56 |
TheJulia | The new version of pysnmp horribly breaks ironic's driver loading | 14:56 |
TheJulia | not sure why yet | 14:56 |
rpittau | TheJulia: in the uc patch the ironic job was green https://review.opendev.org/c/openstack/requirements/+/915830 | 14:57 |
JayF | We have an ironic side patch to land | 14:58 |
JayF | I think(?) | 14:58 |
rpittau | yeah | 14:58 |
rpittau | unit tests need to be fixed | 14:58 |
TheJulia | yeah, that is horribly broken | 14:58 |
TheJulia | :( | 14:58 |
rpittau | #startmeeting ironic | 15:00 |
opendevmeet | Meeting started Mon Jun 17 15:00:26 2024 UTC and is due to finish in 60 minutes. The chair is rpittau. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:00 |
opendevmeet | The meeting name has been set to 'ironic' | 15:00 |
JayF | o/ | 15:00 |
TheJulia | o/ | 15:00 |
iurygregory | oi/ | 15:00 |
rpittau | Hello everyone! | 15:01 |
rpittau | Welcome to our weekly meeting! | 15:01 |
rpittau | The meeting agenda can be found here: | 15:01 |
rpittau | #link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_June_17.2C_2024 | 15:01 |
rpittau | #topic Announcements/Reminders | 15:02 |
rpittau | #info Standing reminder to review patches tagged ironic-week-prio and to hashtag any patches ready for review with ironic-week-prio: | 15:02 |
dtantsur | o/ | 15:02 |
rpittau | #link https://tinyurl.com/ironic-weekly-prio-dash | 15:02 |
rpittau | #info The Ironic/Bare Metal SIG meetup at CERN happened on June 5, leaving the link to the notes still for this week | 15:03 |
rpittau | #link https://etherpad.opendev.org/p/baremetal-sig-cern-june-5-2024 | 15:03 |
JayF | I filed several RFEs out of the feedback from that | 15:03 |
JayF | I haven't added them to RFE review in the meeting though, I missed that | 15:03 |
rpittau | JayF: I saw that, thanks | 15:03 |
JayF | we can potentially go over them at the appropriate time if desired | 15:03 |
rpittau | sounds good | 15:03 |
rpittau | #info 2024.2 Dalmatian Release Schedule | 15:04 |
rpittau | #link https://releases.openstack.org/dalmatian/schedule.html | 15:04 |
rpittau | anything else to announce/remind ? | 15:05 |
rpittau | ok, moving on | 15:06 |
rpittau | #topic Review Ironic CI status | 15:06 |
rpittau | #info metal3 integration job is broken again \o/ | 15:06 |
rpittau | probably worth discussing why we didn't see CI breaking in metal3-dev-env during the metal3 meeting on Wednesday | 15:07 |
rpittau | anything else CI related ? | 15:08 |
JayF | It sounds like the pySNMP change hurt us, too | 15:08 |
JayF | I'll look at 912328 and see if I can get it over the line with some time today | 15:08 |
TheJulia | https://d2e31a25148547148788-51f9901de94768cc4d0e03f07c031664.ssl.cf1.rackcdn.com/921966/2/gate/ironic-tempest-ramdisk-bios-snmp-pxe/7f972ac/controller/logs/screen-ir-cond.txt <-- confirmed | 15:08 |
TheJulia | somewhere driver loading now breaks | 15:09 |
rpittau | I left a message for Lex Li in the original thread on github https://github.com/lextudio/pysnmp/issues/49 | 15:09 |
TheJulia | Haven't gotten far enough to figure it out exactly | 15:09 |
sylvr | dtantsur: I found the issue I had, and yes it was a misconfiguration... | 15:09 |
dtantsur | good to know! | 15:09 |
sylvr | etc/kayobe/kolla/config/bifrost/bifrost.yml contained the wrong driver, I corrected it and now it works | 15:10 |
rpittau | TheJulia: maybe some hints from the failing unit tests? | 15:10 |
TheJulia | the failing unit tests with the patch to bump our local state is 2000+ failing tests for me locally | 15:11 |
JayF | it was only 6 in the gate, weird | 15:11 |
TheJulia | but that has been the path I've been looking at | 15:11 |
sylvr | I added the kolla dir to my .gitignore to avoid pushing some secrets to the git repo, but then I didn't track this file... so it was hidden from my config. | 15:11 |
sylvr | thanks! | 15:11 |
TheJulia | JayF: I should check the eventlet version of the gate... | 15:12 |
TheJulia | Anyway, it's all broken, we can move the meeting along | 15:12 |
TheJulia | it's going to take time to figure out what is going on exactly | 15:12 |
JayF | TheJulia: seeing 6 locally here on a clean tox run too, fwiw; I'll take a look at getting those to pass locally post-meeting | 15:13 |
rpittau | TheJulia: I can propose a revert for the uc change until our snmp job passes | 15:14 |
TheJulia | Let's wait and see, I think it is just worse with lextudio's patch, at least on debian | 15:14 |
rpittau | ok | 15:14 |
TheJulia | btw, was that count with just tox -epy3, or did you try tox -ecover ? | 15:15 |
JayF | TheJulia: tox -epy3 on python3.12 | 15:15 |
JayF | ^ -r | 15:15 |
TheJulia | I ran cover with 3.11 | 15:15 |
TheJulia | ack | 15:15 |
JayF | cover can be a bit ... misleading when there are certain kinds of failures ime | 15:15 |
JayF | e.g. some failures can cascade | 15:16 |
JayF | but we should probably move on the meeting :D | 15:16 |
rpittau | let's move on! | 15:16 |
rpittau | :) | 15:16 |
rpittau | #Discussion topics | 15:16 |
rpittau | heh | 15:16 |
rpittau | #topic Discussion topics | 15:16 |
rpittau | JayF: you have something there :) | 15:16 |
JayF | So Reverbverbverb has a draft up of his report on Ironic documentation in google docs | 15:17 |
JayF | I implore you, please find time in the next week to go over it and make comments | 15:17 |
JayF | we will be converting this into "final version" and filing bugs around the action items as early as this time next week barring feedback to the contrary | 15:17 |
rpittau | #action review reverbverbverb Ironic documentation audit update & results | 15:17 |
rpittau | #link https://docs.google.com/document/d/1e9URuPHKNTx5QXdkCFAzsS0EAAxPJ-xQC4BEpDwESp4/edit | 15:17 |
JayF | ty for that, you got it copied before I did :D | 15:18 |
rpittau | thanks Reverbverbverb for that! | 15:18 |
Reverbverbverb | My pleasure. Please comment in the doc. | 15:18 |
rpittau | will do :) | 15:18 |
rpittau | anything else to add? | 15:19 |
Reverbverbverb | My review of the doc is necessarily big-picture, but please feel free to comment on anything about the doc | 15:20 |
JayF | I think that's all for our action item | 15:21 |
Reverbverbverb | I'll be breaking the analysis down to (I hope) actionable items, so anything you have to say will help | 15:21 |
rpittau | alright, thanks again | 15:21 |
Reverbverbverb | 👍 | 15:21 |
dtantsur | Thanks Reverbverbverb! | 15:21 |
rpittau | #topic Bug Deputy Updates | 15:22 |
rpittau | so that was me last week | 15:22 |
rpittau | haven't seen a lot of movements in terms of bugs but this popped up on Friday | 15:22 |
rpittau | #link https://bugs.launchpad.net/ironic/+bug/2069413 | 15:22 |
TheJulia | Looks like they proposed a fix, I'll try to take a look this week | 15:23 |
rpittau | thanks TheJulia | 15:23 |
rpittau | any volunteer for this week's bug deputy ? | 15:23 |
JayF | give it to me | 15:24 |
rpittau | thanks JayF :) | 15:24 |
rpittau | moving on! | 15:24 |
rpittau | #topic RFE Review | 15:24 |
rpittau | JayF: you want to mention the RFEs you opened ? | 15:25 |
JayF | yeah I have a bunch of recent rfes that need review, not all are mine but all need attention | 15:25 |
JayF | #link https://bugs.launchpad.net/ironic/+bug/2069085 | 15:25 |
JayF | RFE: Add a burnin_gpu step | 15:25 |
JayF | I'll note all these are against *ironic* even though for many of them, IPA commits are all that are needed | 15:26 |
JayF | That's pretty straightforward; as requested by operators at the cern meetup, we can hook up the stress-ng support for GPUs to a burnin_gpu step | 15:26 |
rpittau | ok | 15:26 |
rpittau | we can always add the project there | 15:26 |
JayF | I don't think there's anything controversial here and it's a good first issue | 15:26 |
rpittau | that looks ok to me | 15:27 |
JayF | #link https://bugs.launchpad.net/ironic/+bug/2069083 | 15:27 |
JayF | RFE: Use results of burnin_* methods for inspection | 15:27 |
JayF | I'll be honest, this probably needs more design than has been done in the bug currently | 15:27 |
JayF | but at a high level; operators want us to optionally report metrics from burnin_ methods to inspection so they can compare performance | 15:27 |
TheJulia | ... that seems fairly nebulous | 15:28 |
TheJulia | or at least, at a high level a bit outside of what we can do with the stack with where we're going | 15:29 |
TheJulia | since surely they mean inspector | 15:29 |
JayF | how is "they mean inspector" different than what I said? | 15:29 |
TheJulia | we're on a path to deprecate inspector | 15:30 |
dtantsur | but we can still report stuff? the functional difference is not that huge | 15:30 |
TheJulia | and inspector really only holds value introspection runs, so we're talking about storing more data and extending then | 15:30 |
JayF | Yes, but we have similar functionality in an ironic-forward way? | 15:30 |
JayF | I specifically didn't talk about internal ironic vs external inspector because AIUI it doesn't make much of a difference at that level in ipa | 15:30 |
* dtantsur is writing a migration guide by the way | 15:30 | |
TheJulia | I guess I'm semi -1 to extending expector | 15:30 |
TheJulia | inspector | 15:30 |
* TheJulia needs way more coffee this morning | 15:31 | |
rpittau | we can always hold until the migration has been completed | 15:31 |
dtantsur | well, yeah, inspector is frozen at this point | 15:31 |
rpittau | or add the feature to the agent in ironic | 15:31 |
JayF | I don't comprehend how this extends inspector -- if I were implementing this, I'd focus on new ironic-based inspection | 15:31 |
rpittau | yep, exactly | 15:31 |
dtantsur | same | 15:31 |
JayF | we will still have the ability to inspect stuff and report results | 15:31 |
TheJulia | okay then | 15:31 |
JayF | I'm OK with us not approving it because it *is* very vague and might need design | 15:32 |
rpittau | it actually makes sense to have that in the ironic project :) | 15:32 |
JayF | but it's 100% not the intention for this to be an `ironic-inspector` feature | 15:32 |
dtantsur | also, if what gets extended is IPA, it does not matter what the server side is | 15:32 |
rpittau | yeah | 15:32 |
dtantsur | if something gets into the inspection data, it will be stored either way | 15:32 |
TheJulia | so I guess going back to my lack of caffeinated state, I read Jay's summary as they want us to compare the results | 15:33 |
TheJulia | which sent my brain a step further | 15:33 |
JayF | Ah, I'll be 100% clear we're talking about storing metrics from the run, and nothing else | 15:33 |
JayF | the use case brought up IRL was "sometimes machines just randomly perform out of spec" | 15:33 |
dtantsur | Well, we could write an inspection hook to compare the old data with the new one.. that would depend on which server accepts the data | 15:33 |
JayF | with a story about NVMe drives that *over performed* by a factor of 2s | 15:33 |
JayF | *2x | 15:33 |
JayF | dtantsur: I'd suggest that be a separate, next step | 15:34 |
* dtantsur nods | 15:34 | |
JayF | dtantsur: part of the goal here is to have some low-hanging-fruit | 15:34 |
rpittau | it clearly needs more info, but I'm ok with that | 15:34 |
JayF | I put a clarifying comment in 2069083, going to move on without marking as approved since it needs more info | 15:35 |
rpittau | yep, thanks | 15:35 |
JayF | #link https://bugs.launchpad.net/ironic/+bug/2068530 | 15:35 |
JayF | [RFE] Allow an operator to block all future bios deployments | 15:35 |
TheJulia | That was an idea which came up in discussion which I kind of liked | 15:35 |
JayF | The idea here is to have a conductor-level setting which prevents new nodes from being created / nodes from being updated to use BIOS boot mode | 15:35 |
JayF | In order for places to enforce UEFI-only boot mode | 15:36 |
JayF | Hmm that's not exactly how it is written | 15:36 |
JayF | how it's written is to block *deployments* if in bios boot mode, that's not exactly the same thing | 15:36 |
rpittau | this kind of means that we're deprecating legacy BIOS deployments | 15:36 |
JayF | rpittau: it means we're allowing some operators to flip a switch to disable them *in their environment* if they have a separate requirement | 15:37 |
JayF | rpittau: it's just convenient we can flip it to default-enabled when we're ready to deprecate | 15:37 |
rpittau | yes, I remember the discussion now, and I'm all for it | 15:37 |
TheJulia | I noted it as deployment in large part because you can have machines in different states | 15:37 |
TheJulia | and the logical place to prevent that sort of thing is in the deployment pathway code because you want to be able to unprovision machines | 15:38 |
rpittau | of course, makes sense | 15:38 |
TheJulia | sort of like the retire logic with cleaning | 15:38 |
JayF | well, that doesn't exactly match what the operator in the room was saying | 15:38 |
JayF | they were saying they had an issue where a device would be misconfigured at enrollment with a bios boot mode | 15:38 |
JayF | setting back their efforts to "drain" bios booting out of the environment | 15:38 |
JayF | I think there's potential value in checking at the API level (you can't explicitly set to bios) as well as at deployment, but IMBW | 15:39 |
JayF | at least that's how I understood it | 15:39 |
TheJulia | So, we're semi-seeing such issues with some hardware, but it doesn't really show until cleaning/deployment | 15:40 |
TheJulia | specifically hardware which *has* to be requested to be in bios mode, but we shouldn't model the universe around the exception | 15:40 |
JayF | I guess I'm wondering if we'd ever have operators who'd say "I don't even want a ramdisk to boot via bios" | 15:40 |
TheJulia | quite possible, but they are going to have to somehow drain out their current state | 15:40 |
JayF | that seems to be the meaningful difference between our two perspectives, yeah? | 15:41 |
TheJulia | somewhat, really it is likely two knobs | 15:41 |
JayF | ack | 15:41 |
TheJulia | and two distinct checks at different points | 15:41 |
JayF | I'll update the RFE with that comment | 15:41 |
JayF | I think it's clear we are onboard to do something like this? | 15:41 |
TheJulia | one early on in the entry flow to move a node out of enrollment, and likely later on with deployments | 15:41 |
TheJulia | so if you switch both, you'll just end up with nodes which cannot be deployed anymore and can't add anymore | 15:41 |
TheJulia | well, until you fix them or replace them :) | 15:42 |
JayF | updated with that comment, marking approved | 15:42 |
JayF | TheJulia: this is one of yours | 15:42 |
JayF | #link https://bugs.launchpad.net/ironic/+bug/2067073 | 15:43 |
TheJulia | by out of enrollment, I mean upon managing one node | 15:43 |
JayF | [RFE] HTTP ISO Boot via Network (UEFI) HTTP Boot | 15:43 |
JayF | This seems like a pretty straightforward extension of the new http-* based boot interfaces | 15:44 |
TheJulia | So this one is unrelated to the meetup, but the idea is "what if folks didn't want ipxe or a network style PXE loader anymore", and is there any way to direct boot an ISO via the network. Today there is not a delineation as the httpboot interfaces are delineated and the pure network+dhcp boot path leverages a network bootloader, or provide the entire url in advance to the BMC | 15:44 |
TheJulia | yeah, pretty much just a slightly different option so you don't need a network bootloader at all | 15:45 |
rpittau | that is very interesting | 15:45 |
TheJulia | It will only really be useful with a managed dhcp server though | 15:45 |
JayF | I think silence is acceptance? | 15:47 |
TheJulia | I suspect so | 15:48 |
rpittau | good for me | 15:48 |
JayF | that's the end of the list | 15:48 |
rpittau | thanks! | 15:48 |
JayF | how refreshing! Features people asked for, in real life | 15:48 |
JayF | not filtered through someone else's product team :P | 15:48 |
rpittau | heh :) | 15:48 |
rpittau | #topic Open Discussion | 15:49 |
TheJulia | eh... that last one is sort of me saving my sanity from a product team :) | 15:49 |
JayF | As mentioned Friday, we're probably going to start piloting asyncio-based solutions for removing eventlet from IPA | 15:49 |
rpittau | we still have 10 minutes in case someone has something to discuss about | 15:49 |
JayF | I ask folks who may have an opinion to please have it early, I'll make sure stuff gets posted for review quickly | 15:49 |
TheJulia | I really don't have an opinion formed right now, but asyncio seems fine to me | 15:51 |
JayF | that's all I had for open discussion | 15:51 |
JayF | TheJulia: my sincere hope is it's a giant nothingburger for something the size of IPA | 15:51 |
TheJulia | it really should be, honestly | 15:52 |
rpittau | anything else to discuss today? | 15:52 |
TheJulia | we could discuss pysnmp fun and my many broken tests on python 3.11 | 15:53 |
TheJulia | but that doesn't seem *that* important :) | 15:53 |
JayF | I want to look at that with an IDE up and not IRC up before talking about it more | 15:54 |
TheJulia | fwiw, I think its presence angers eventlet, but we can end the meeting | 15:55 |
rpittau | thanks everyone! | 15:56 |
rpittau | #endmeeting | 15:56 |
opendevmeet | Meeting ended Mon Jun 17 15:56:01 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:56 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-06-17-15.00.html | 15:56 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-06-17-15.00.txt | 15:56 |
opendevmeet | Log: https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-06-17-15.00.log.html | 15:56 |
JayF | jfyi only seeing six failures on cover job, too | 16:06 |
JayF | running tests on python 3.9 to see if it's a version thing | 16:07 |
rpittau | good night! o/ | 16:30 |
opendevreview | Jay Faulkner proposed openstack/ironic master: Upgrade pysnmp-lextudio. https://review.opendev.org/c/openstack/ironic/+/912328 | 16:33 |
opendevreview | Merged openstack/python-ironicclient master: Support traits configuration on baremetal create CLI https://review.opendev.org/c/openstack/python-ironicclient/+/919432 | 16:33 |
JayF | TheJulia: ^^ 912328 passing all tests locally now ; curious to see how tempest likes it | 16:38 |
TheJulia | I'm curious if all jobs will pass | 16:39 |
TheJulia | same behavior on your python 3.9? | 16:39 |
opendevreview | Jay Faulkner proposed openstack/ironic master: Upgrade pysnmp-lextudio. https://review.opendev.org/c/openstack/ironic/+/912328 | 16:40 |
JayF | yes, py3.9 didn't change how tests ran | 16:40 |
TheJulia | okay, that is *weird* | 16:40 |
* JayF does it again to verify | 16:40 | |
TheJulia | I guess we will see what the snmp job does since that seems to be the one which easily fails | 16:41 |
TheJulia | and... local python why you fail so many! | 16:41 |
JayF | check it out -> tox -epy3 + -ecover + -epep8 + -epy39 all pass | 16:41 |
JayF | tox is running under python 3.12, all are fresh (-reenvname) environments | 16:41 |
TheJulia | running tox -r -epy3 | 16:42 |
JayF | actually I've tested with tox running under 3.12 (system) and 3.11 (venv) | 16:42 |
TheJulia | heh, 2995 tests failing | 16:45 |
JayF | you did `tox -repy3`? What version did it use? | 16:46 |
JayF | I assume you don't have something like python 3.8 locally ? | 16:46 |
JayF | or you have some package in your global site-packages that is wreaking havoc on your unit tests, perhaps? | 16:47 |
JayF | https://paste.opendev.org/show/bOEgomYRO3Y9qI8pQfde/ a pip freeze from a working python 3.9 (py39) run | 16:48 |
TheJulia | 3.11.2 on debian, system python loaded into the venv | 16:49 |
JayF | my tox is running on python 3.12.3 (gentoo) and that paste is from it running a python 3.9.19 (gentoo) inside that venv | 16:51 |
TheJulia | https://www.irccloud.com/pastebin/QRj0mUmc/ | 16:51 |
JayF | interesting, idk how I ended up with the bonus packages | 16:52 |
JayF | maybe just py3.9 vs py3.11? | 16:52 |
JayF | I'll do a 3.11 run | 16:52 |
TheJulia | zeroconf pulls in async-timeout it looks like | 16:52 |
opendevreview | cid proposed openstack/ironic master: Provision ARM (aarch64) fake-bare-metal-vms https://review.opendev.org/c/openstack/ironic/+/915441 | 16:54 |
JayF | py311 passes cleanly too, and package list looks similar | 16:57 |
TheJulia | weeeird | 16:58 |
* JayF runs with driver libs | 16:58 | |
JayF | I am highly curious if somehow | 16:58 |
JayF | you're using debian eventlet version | 16:58 |
JayF | and it doesn't have the async bridge causing all things to break | 16:58 |
TheJulia | that... might.. be | 17:01 |
TheJulia | give me a few and I'll try to install from source | 17:01 |
TheJulia | 2989 \o/ | 17:11 |
TheJulia | dunno, I'll have to keep hunting | 17:11 |
opendevreview | cid proposed openstack/ironic master: WIP: Self-Service via Runbooks [Prototype] https://review.opendev.org/c/openstack/ironic/+/922142 | 17:25 |
TheJulia | so ironic-tempest-ramdisk-bios-snmp-pxe failed | 17:25 |
TheJulia | https://6e60353ab6408ff35f72-65a5b55aa83e6a6123b1ed34561cc44d.ssl.cf5.rackcdn.com/912328/4/check/ironic-tempest-ramdisk-bios-snmp-pxe/c56fd24/controller/logs/screen-ir-cond.txt | 17:25 |
JayF | aha | 17:27 |
JayF | I made the code match the unit test | 17:27 |
JayF | instead of making the unit test match the code | 17:27 |
JayF | and that wants the code to look the other way | 17:27 |
TheJulia | yup | 17:27 |
JayF | I'll revise it | 17:27 |
TheJulia | ok | 17:27 |
opendevreview | Jay Faulkner proposed openstack/ironic master: Upgrade pysnmp-lextudio. https://review.opendev.org/c/openstack/ironic/+/912328 | 17:54 |
opendevreview | cid proposed openstack/ironic master: Provision ARM (aarch64) fake-bare-metal-vms https://review.opendev.org/c/openstack/ironic/+/915441 | 18:45 |
JayF | > Jun 17 18:20:17.711718 np0037743883 ironic-conductor[94821]: ERROR ironic.conductor.verify [None req-62350a1e-928f-45a7-9663-1b161177abc1 None None] Failed to get power state for node e7d72db7-b1bf-4020-864b-6434f13f924e. Error: SNMP operation 'GET' failed: No SNMP response received before timeout: ironic.common.exception.SNMPFailure: SNMP operation 'GET' failed: No SNMP response received before timeout | 18:54 |
JayF | much less interesting error this time :/ | 18:54 |
JayF | virtualpdu is busted | 18:55 |
JayF | https://zuul.opendev.org/t/openstack/build/9fe579e010e44c51b598caa819020762/log/controller/logs/screen-virtualpdu.txt | 18:55 |
JayF | this is not going to be trivial, going to try and jfdi it though | 19:01 |
JayF | so I *think* https://review.opendev.org/c/openstack/virtualpdu/+/881538 may have landed without ever running on that version due to constraints | 19:24 |
JayF | I can verify that with that reverted, it passes tests on python 3.11 (but blows up on python 3.12 because asyncore doesn't exist there) | 19:26 |
opendevreview | Jay Faulkner proposed openstack/virtualpdu master: DNM: See if virtualpdu is happier on older pysnmp https://review.opendev.org/c/openstack/virtualpdu/+/922147 | 19:27 |
opendevreview | Jay Faulkner proposed openstack/ironic master: Upgrade pysnmp-lextudio. https://review.opendev.org/c/openstack/ironic/+/912328 | 19:27 |
JayF | stacking them for science | 19:28 |
JayF | if this works oh boy we are in for pain | 19:28 |
rpittau | JayF: virtualpdu should work with pysnmp-lextudio and pyasn1 | 19:39 |
rpittau | you just need to change asynsock to asyncio | 19:39 |
JayF | I will re-check but that did not work in my testing | 19:40 |
JayF | rpittau: very much no | 19:56 |
JayF | at least on py3.12, I should try on py3.11 I guess | 19:56 |
JayF | > RuntimeError: This event loop is already running | 19:57 |
JayF | littered throughout the tests | 19:57 |
JayF | I am almost certain at this point the lower-level methods used by virtualpdu changed from non-asyncio-native to asyncio-native and we may have to do work to make it work | 19:58 |
JayF | I'm going to push up a reproducer | 20:00 |
opendevreview | Jay Faulkner proposed openstack/virtualpdu master: DNM: Example of brokeness https://review.opendev.org/c/openstack/virtualpdu/+/922149 | 20:03 |
JayF | rpittau: ^ fails tests angrily. I'm going to keep digging but wanted to show your assumption is (apparently?) bad or give you a chance to tell me where I went wrong :D | 20:04 |
rpittau | my assumption was based on this change https://github.com/lextudio/pysnmp/commit/12ed9cc996a0a8415e3b7cf7abb5e967da61c201 | 20:04 |
JayF | looks like the failures are coming from the cmdgen code | 20:08 |
JayF | I think we might need to run the asyncio event loop at startup now | 20:08 |
rpittau | yep | 20:08 |
cid | Hmm. Curious to see the root cause of these failures :)) | 20:11 |
rpittau | cid: pysnmp version upgrade | 20:11 |
cid | Logging off now, but very active on my mobile for a couple of hours | 20:11 |
cid | rpittau: same with the metal3 job? | 20:12 |
rpittau | cid: no, this is completely different, metal3 was a missing python library | 20:13 |
cid | rpittau: Oh, got it. | 20:14 |
JayF | this is basically trying to jump like, a year+ of library updates | 20:24 |
JayF | which also crosses an asyncore->asyncio migration | 20:24 |
JayF | in a project that until today I'm pretty sure I haven't spent more than 5 minutes looking at | 20:24 |
JayF | I give up and/or need to pair with someone on this | 20:52 |
JayF | virtualpdu unit tests are broken in a way that suggests to me something is screwed up around threading/async-y-ness | 20:52 |
JayF | but I don't have a good enough mental model of the code to know exactly where it might be falling through | 20:52 |
cid | If a pairing session happens, I would like to be part of it too. | 21:03 |
cid | By tomorrow, I will be getting familiar with the job log | 21:04 |
JayF | I suspect there's a nonzero chance rpittau just figures it out before I start tomorrow, but if not and we pair I'll ping you | 21:12 |
opendevreview | Jay Faulkner proposed openstack/ironic master: Temporarily non-voting on SNMP job https://review.opendev.org/c/openstack/ironic/+/922153 | 21:18 |
JayF | went ahead and put ^ that up since I suspect we'll end up needing it to get virtualpdu+ironic fixes coordinated | 21:18 |
cid | JayF: noted | 21:25 |
opendevreview | Jay Faulkner proposed openstack/virtualpdu master: DNM: Example of brokeness https://review.opendev.org/c/openstack/virtualpdu/+/922149 | 21:41 |
opendevreview | Jay Faulkner proposed openstack/ironic master: Upgrade pysnmp-lextudio. https://review.opendev.org/c/openstack/ironic/+/912328 | 21:47 |
JayF | We will have to land https://review.opendev.org/c/openstack/ironic/+/922153 and then 912328 before we can fix virtualpdu | 21:47 |
JayF | I am fairly sure 912328 is correct because we attempt to connect to the broken PDU and handle the error correctly; | 21:48 |
JayF | if it works, then we'll see the tempest job pass on virtualpdu when we fix it | 21:48 |
cardoe | I was following https://opendev.org/openstack/ironic/commit/e19fd1d050cbabfdafa40c4743f82cd4c7649f82 change that's for adding HTTPBoot for UEFI devices and I noticed the only hardware type that had this added was the generic hardware type. It's left off for the redfish hardware type. Was wondering if anyone knew if that was intentional or accidental? Due to the way that Redfish overrides supported_boot_interfaces in | 21:55 |
cardoe | https://opendev.org/openstack/ironic/src/commit/ebbc8300c36c4d17cc89b0b37f0c2dd030ff2d5d/ironic/drivers/redfish.py#L57 it's not possible to even enable it via config. | 21:55 |
JayF | Isn't that roughly equivalent to redfishhttpsboot? | 22:03 |
JayF | yeah, it's not because the other interface uses ipxe | 22:05 |
JayF | hmm | 22:06 |
JayF | cardoe: I'll PR a fix up for that, but I'll ask TheJulia to review it; she knows more about the edges here than I do and might have some insight as to why it was excluded initially | 22:07 |
JayF | cardoe: do you have a launchpad bug filed about it, perhaps? | 22:07 |
cardoe | I haven't made one. | 22:07 |
iurygregory | I think the equivalent for redfish is RedfishHttpsBoot no? | 22:07 |
cardoe | I was gonna edit my setup locally and submit a patch to let it happen. | 22:07 |
iurygregory | redfish-https * | 22:08 |
JayF | so redfish-https doesn't give you the option to have ipxe in the middle afaict | 22:08 |
opendevreview | Riccardo Pittau proposed openstack/virtualpdu master: [DNM] Test timeout https://review.opendev.org/c/openstack/virtualpdu/+/922158 | 22:08 |
cardoe | No it's not the same as redfish-https. https://docs.openstack.org/ironic/latest/admin/drivers/redfish.html#redfish-http-s-boot per that it would be like the virtual-media route. | 22:09 |
opendevreview | Riccardo Pittau proposed openstack/virtualpdu master: [DNM] Test timeout https://review.opendev.org/c/openstack/virtualpdu/+/922158 | 22:09 |
iurygregory | right, I just checked https://opendev.org/openstack/ironic/commit/041a7d7064491958278725123af0c1b8fa8aefe5 | 22:09 |
JayF | I could see a world where you had a redfish that didn't support BMC-driven http booting | 22:10 |
JayF | and could need to use the pxe-to-http uefi stuff | 22:10 |
JayF | but again, not sure if there's something I don't know here | 22:10 |
iurygregory | right | 22:11 |
cardoe | Well so right now my BMC interface doesn't have the ability to fetch files from my HTTP server. So I was hoping to use the HTTPBoot to avoid TFTP. | 22:12 |
TheJulia | Fun! | 22:12 |
opendevreview | Jay Faulkner proposed openstack/ironic master: Enable HTTP network boot for Redfish hardware https://review.opendev.org/c/openstack/ironic/+/922160 | 22:13 |
JayF | cardoe: it makes more sense at least, operationally, for a BMC to be capable of it but the BMC not being in the right network | 22:14 |
TheJulia | cardoe: so yeah, looks like an oversight on my end | 22:14 |
TheJulia | At the doctors office | 22:14 |
JayF | Alternatively, we implemented your feature request about 15 minutes after you asked for it ;) | 22:15 |
cardoe | No rush at all. I was gonna test it and submit a patch. But looks like Jay beat me to the patch part. | 22:15 |
JayF | I see no evidence looking at the boot interface that it won't just work | 22:15 |
cardoe | I take it redfish-https is really the best interface overall for redfish hardware then and I should get my network fixed up. | 22:15 |
JayF | Everything has upsides and downsides? I think we generally default to not hooking *Everything* up because the matrix gets huge | 22:16 |
JayF | as seen here, not even we are always sure off the top of our heads what the exact differences are | 22:17 |
TheJulia | Yeah, it should just work | 22:17 |
TheJulia | The bottom line is redfish should really be importing generic as a base and then adding, but almost no drivers *actually* do that :( | 22:17 |
* TheJulia gets to listen to the doctor tell her she needs more exercise and less cholesterol. | 22:18 | |
JayF | I was expecting to see super()+redfish_list | 22:18 |
JayF | and was surprised when I didn't | 22:18 |
cardoe | You and me both Julia. | 22:18 |
cardoe | Thanks for the quick patch Jay. | 22:21 |
TheJulia | Doctor is doing that with my wife now, I’m up next! | 22:21 |
JayF | hey, I'm happy to write a line to enable a whole thing | 22:22 |
* JayF standing on the shoulders of Julia | 22:22 | |
JayF | lol | 22:22 |
TheJulia | Give it a shot! My turn | 22:23 |
cardoe | Well I'll try to contribute in the future rather than just asking questions or making bugs. Working on deploying Ironic via OpenStack Helm currently and most of the patching has been over on that side of the world. | 22:24 |
JayF | We're happy to help wherever we can. If you're ever looking for something to get started with contributing, we try to keep some bugs tagged `low-hanging-fruit` in the bugtracker to be simple onboarding tasks | 22:27 |
cardoe | And setting up a relay again needs to be first order of business so I don't drop. Thank you all for the quick replies. | 22:28 |
JayF | cardoe: many of us use irccloud.com; its free plan keeps you connected for 2 hours after your client disconnects | 22:37 |
TheJulia | I pay for it actually, but I’m silly and like 24/7 notices from it | 22:42 |
cardoe | I appreciate the suggestion. Been off IRC since I tore down my Xen dev box after I stopped being a maintainer. I learned where my ZNC was running that day. | 22:46 |
cardoe | Now with irccloud so I don't disappear... | 23:50 |
cardoe | I didn't have a chance to participate in OpenInfra Days or at any past PTGs, but an area of interest is the thread that Julia had about the Ironic networking story. | 23:51 |
TheJulia | That is definitely not at the top of my priority list unfortunately | 23:52 |
* TheJulia has much on that list | 23:53 | |
cardoe | I've heard your name floated long before I joined so I can only imagine that list is long. | 23:54 |
TheJulia | Unfortunately, I got to write a nice board resolution today and I still haven't started on any of the items I had planned on working on today :( | 23:58 |
TheJulia | .... and now that I've been appropriately told "you need to exercise more!" | 23:59 |
cardoe | So the focus for me is around ZTP for infrastructure (servers and network gear) and how to advance that. | 23:59 |
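
A rough sketch of the pattern discussed at 22:17-22:18 (a hardware type replacing the generic list of boot interfaces versus extending it with `super()`). This is not Ironic's actual code: the class names and the interface name strings below are hypothetical placeholders, and the real hardware types return interface classes rather than strings. It only illustrates why a boot interface added to the generic type does not reach a subclass that overrides the whole list.

```python
class GenericHardware:
    """Hypothetical stand-in for a generic hardware type."""

    @property
    def supported_boot_interfaces(self):
        # Placeholder names; a new entry ("http") is added here.
        return ["ipxe", "pxe", "http"]


class RedfishOverriding(GenericHardware):
    """Replaces the list outright, so additions to the base are lost."""

    @property
    def supported_boot_interfaces(self):
        return ["redfish-virtual-media", "redfish-https", "ipxe", "pxe"]


class RedfishExtending(GenericHardware):
    """Prepends its own entries and inherits whatever the base offers."""

    @property
    def supported_boot_interfaces(self):
        own = ["redfish-virtual-media", "redfish-https"]
        # super() works with properties, so base additions come along.
        return own + super().supported_boot_interfaces


if __name__ == "__main__":
    print("http" in RedfishOverriding().supported_boot_interfaces)
    print("http" in RedfishExtending().supported_boot_interfaces)
```

With the overriding variant, every interface added to the base type has to be repeated by hand in the subclass; with the extending variant it is picked up automatically, at the cost of the subclass no longer controlling the full list.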