Monday, 2024-06-17

*** zbitter is now known as zaneb04:26
rpittaugood morning ironic! o/06:34
opendevreviewVerification of a change to openstack/ironic master failed: Build PXE config for node in SERVICING state  https://review.opendev.org/c/openstack/ironic/+/92200606:41
masgharMorning!08:19
opendevreviewDmitry Tantsur proposed openstack/ironic master: [WIP] Update the redfish interoperability profile  https://review.opendev.org/c/openstack/ironic/+/92057410:00
opendevreviewVerification of a change to openstack/ironic master failed: Update version change log with special treatment of .json removal  https://review.opendev.org/c/openstack/ironic/+/92196610:03
sylvrHello ! I have some questions regarding Ironic/Bifrost IPMI driver and the autodiscovery of my nodes (I'm using Kayobe to deploy Bifrost/Ironic), I'm able to use ipmitool to power cycle my nodes, but they don't seem to register themselves to Bifrost/Ironic...10:46
sylvrI planned to check the logs on the PXE boot server to see if any nodes are downloading the IPA ramdisk or if it is IPA that is unable to register the nodes10:47
dtantsursylvr: first and foremost, have you enabled discovery in bifrost? it's off by default.10:48
sylvrIn `ostack/src/kayobe-config/etc/kayobe/inspector.yml` I have ```# Whether to enable discovery of nodes not managed by Ironic.10:50
sylvrinspector_enable_discovery: "true"10:50
sylvr```10:50
sylvrand when trying to use another driver (redfish) the discovery was working, but I had issues with the driver being unable to communicate with the iDRACs because they're too old...10:51
dtantsurIt's weird: the driver does not affect how the hosts are booted11:08
sylvrmy thoughts too11:10
sylvrhow can I check the PXEboot server logs to see if my nodes are attempting pxeboot on the correct network?11:15
dtantsursylvr: it's either dnsmasq or tftpd (but I'd probably just use tcpdump)11:26
sylvrokay thanks, I'll try11:27
opendevreviewDmitry Tantsur proposed openstack/ironic master: WIP migration guide from inspector  https://review.opendev.org/c/openstack/ironic/+/92208911:28
sylvrdtantsur: okay, so I see some traffic on the network, but mostly the seed/bifrost communicating with the gateway (sometimes some ARP between nodes and seed/bifrost) and some UDP, but still nothing in `baremetal node list`12:07
sylvrI power-cycled my nodes through ipmitool, but it didn't seem like a lot was happening... I'm going to take a look at my config files, maybe I did override the discovery12:09
sylvrBRB12:09
rpittauFYI metal3 integration job is broken, again12:31
opendevreviewcid proposed openstack/ironic master: Provision ARM (aarch64) fake-bare-metal-vms  https://review.opendev.org/c/openstack/ironic/+/91544112:32
sylvrback!12:40
TheJuliagood morning12:45
opendevreviewcid proposed openstack/ironic master: Provision ARM (aarch64) fake-bare-metal-vms  https://review.opendev.org/c/openstack/ironic/+/91544112:53
TheJuliayouch metal3 ModuleNotFoundError: No module named 'jinja2'13:13
TheJuliasigh13:18
TheJuliahttps://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_43f/922006/2/check/ironic-tempest-ramdisk-bios-snmp-pxe/43f756c/controller/logs/screen-ir-cond.txt Wheeeee13:24
dtantsurTheJulia: https://github.com/metal3-io/metal3-dev-env/pull/142713:24
dtantsurgood morning13:24
opendevreviewcid proposed openstack/ironic master: Flexible IPMI credential persistence method configuration  https://review.opendev.org/c/openstack/ironic/+/91722914:10
TheJuliaThat should work I guess14:10
opendevreviewcid proposed openstack/ironic master: Provision ARM (aarch64) fake-bare-metal-vms  https://review.opendev.org/c/openstack/ironic/+/91544114:17
JayFDo we need to have metal3-integration stop voting in ironic or ensure it's "voting" (or whatever the equivalent is) for metal3?14:43
rpittauJayF: after the fix merges it will start working again14:44
rpittauuhm wait I think I read your question wrong :D14:45
JayFSo I'm saying, the only way for that to get broken in the way it is, would be a commit that changed the behavior and broke it, yeah?14:45
JayFAnd it not getting tested until *we* tested it?14:45
JayFI'm saying if *Ironic* is the first time that metal3 code gets tested, we should not be voting on it14:45
rpittauyes, but the breaking change was not tested with the behavior we use in ironic CI AFAICS14:46
rpittauwe're supposed to run some tests, we didn't14:46
rpittauI guess the metal3-dev-env tests should be mandatory in metal3-dev-env repo14:46
rpittauI don't think they are14:46
JayFThat's mainly what I'm getting at, I think something needs to be made voting somewhere on metal3 side14:47
rpittauyeah, it happened last week as well14:51
rpittauprobably worth discussing during the next metal3 meeting14:51
TheJuliaI guess constraints don't actually check ironic either?14:51
TheJuliaerr, requirements14:52
rpittauTheJulia: they should, there's an ironic job in requirements AFAIK14:54
TheJuliaI guess it doesn't actually force usage of the newer version14:55
JayFTheJulia: I'm confused, are we broken by a new requirement right now?14:56
TheJuliaThe new version of pysnmp horribly breaks ironic's driver loading14:56
TheJulianot sure why yet14:56
rpittauTheJulia: in the uc patch the ironic job was green https://review.opendev.org/c/openstack/requirements/+/91583014:57
JayFWe have an ironic side patch to land14:58
JayFI think(?)14:58
rpittauyeah14:58
rpittauunit tests need to be fixed14:58
TheJuliayeah, that is horribly broken14:58
TheJulia:(14:58
rpittau#startmeeting ironic15:00
opendevmeetMeeting started Mon Jun 17 15:00:26 2024 UTC and is due to finish in 60 minutes.  The chair is rpittau. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'ironic'15:00
JayFo/15:00
TheJuliao/15:00
iurygregoryoi/15:00
rpittauHello everyone!15:01
rpittauWelcome to our weekly meeting!15:01
rpittauThe meeting agenda can be found here:15:01
rpittau#link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_June_17.2C_202415:01
rpittau#topic Announcements/Reminders15:02
rpittau#info Standing reminder to review patches tagged ironic-week-prio and to hashtag any patches ready for review with ironic-week-prio:15:02
dtantsuro/15:02
rpittau#link https://tinyurl.com/ironic-weekly-prio-dash15:02
rpittau#info The Ironic/Bare Metal SIG meetup at CERN happened on June 5, leaving the link to the notes still for this week15:03
rpittau#link  https://etherpad.opendev.org/p/baremetal-sig-cern-june-5-202415:03
JayFI filed several RFEs out of the feedback from that15:03
JayFI haven't added them to RFE review in the meeting though, I missed that15:03
rpittauJayF: I saw that, thanks15:03
JayFwe can potentially go over them at the appropriate time if desired15:03
rpittausounds good15:03
rpittau#info 2024.2 Dalmatian Release Schedule15:04
rpittau#link  https://releases.openstack.org/dalmatian/schedule.html15:04
rpittauanything else to announce/remind ?15:05
rpittauok, moving on15:06
rpittau#topic Review Ironic CI status15:06
rpittau#info metal3 integration job is broken again \o/15:06
rpittauprobably worth discussing why we didn't see CI breaking in metal3-dev-env during the metal3 meeting on Wednesday15:07
rpittauanything else CI related ?15:08
JayFIt sounds like the pySNMP change hurt us, too 15:08
JayFI'll look at 912328 and see if I can get it over the line with some time today15:08
TheJuliahttps://d2e31a25148547148788-51f9901de94768cc4d0e03f07c031664.ssl.cf1.rackcdn.com/921966/2/gate/ironic-tempest-ramdisk-bios-snmp-pxe/7f972ac/controller/logs/screen-ir-cond.txt <-- confirmed15:08
TheJuliasomewhere driver loading now breaks15:09
rpittauI left a message for Lex Li in the original thread on github https://github.com/lextudio/pysnmp/issues/4915:09
TheJuliaHaven't gotten far enough to figure it out exactly15:09
sylvrdtantsur: I found the issue I had, and yes it was a misconfiguration...15:09
dtantsurgood to know!15:09
sylvretc/kayobe/kolla/config/bifrost/bifrost.yml contained the wrong driver, I corrected it and now it works15:10
rpittauTheJulia: maybe some hints from the failing unit tests?15:10
TheJuliathe failing unit tests with the patch to bump our local state is 2000+ failing tests for me locally15:11
JayFit was only 6 in the gate, weird15:11
TheJuliabut that has been the path I've been looking at15:11
sylvrI added the kolla dir to my .gitignore to avoid pushing some secrets to the git repo, but then this file was no longer tracked... so it was hidden from my config..15:11
sylvrthanks!15:11
TheJuliaJayF: I should check the eventlet version of the gate...15:12
TheJuliaAnyway, it's all broken, we can move the meeting along15:12
TheJuliait's going to take time to figure out what is going on exactly15:12
JayFTheJulia: seeing 6 locally here on a clean tox run too, fwiw; I'll take a look at getting those to pass locally post-meeting15:13
rpittauTheJulia: I can propose a revert for the uc change until our snmp job passes15:14
TheJuliaLets wait and see, I think it is just worse with lextudio's patch, at least on debian15:14
rpittauok15:14
TheJuliabtw, was that count with just tox -epy3, or did you try tox -ecover ?15:15
JayFTheJulia: tox -epy3 on python3.1215:15
JayF   ^ -r15:15
TheJuliaI ran cover with 3.1115:15
TheJuliaack15:15
JayFcover can be a bit ... misleading when there are certain kinds of failures ime15:15
JayFe.g. some failures can cascade15:16
JayFbut we should probably move on the meeting :D 15:16
rpittaulet's move on!15:16
rpittau:)15:16
rpittau#Discussion topics15:16
rpittauheh15:16
rpittau#topic Discussion topics15:16
rpittauJayF: you have something there :)15:16
JayFSo Reverbverbverb has a draft up of his report on Ironic documentation in google docs15:17
JayFI implore you, please find time in the next week to go over it and make comments15:17
JayFwe will be converting this into "final version" and filing bugs around the action items as early as this time next week barring feedback to the contrary15:17
rpittau#action review reverbverbverb Ironic documentation audit update & results15:17
rpittau#link https://docs.google.com/document/d/1e9URuPHKNTx5QXdkCFAzsS0EAAxPJ-xQC4BEpDwESp4/edit15:17
JayFty for that, you got it copied before I did :D 15:18
rpittauthanks Reverbverbverb for that!15:18
ReverbverbverbMy pleasure. Please comment in the doc. 15:18
rpittauwill do :)15:18
rpittauanything else to add?15:19
ReverbverbverbMy review of the doc is necessarily big-picture, but please feel free to comment on anything about the doc15:20
JayFI think that's all for our action item15:21
ReverbverbverbI'll be breaking the analysis down to (I hope) actionable items, so anything you have to say will help15:21
rpittaualright, thanks again15:21
Reverbverbverb👍15:21
dtantsurThanks Reverbverbverb!15:21
rpittau#topic Bug Deputy Updates15:22
rpittauso that was me last week15:22
rpittauhaven't seen a lot of movements in terms of bugs but this popped up on Friday15:22
rpittau#link  https://bugs.launchpad.net/ironic/+bug/206941315:22
TheJuliaLooks like they proposed a fix, I'll try to take a look this week15:23
rpittauthanks TheJulia 15:23
rpittauany volunteer for this week's bug deputy?15:23
JayFgive it to me15:24
rpittauthanks JayF :)15:24
rpittaumoving on!15:24
rpittau#topic RFE Review15:24
rpittauJayF: you want to mention the RFEs you opened ?15:25
JayFyeah I have a bunch of recent rfes that need review, not all are mine but all need attention15:25
JayF#link https://bugs.launchpad.net/ironic/+bug/206908515:25
JayF RFE: Add a burnin_gpu step 15:25
JayFI'll note all these are against *ironic* even though for many of them, IPA commits are all that are needed15:26
JayFThat's pretty straightforward; as requested by operators at the cern meetup, we can hook up the stress-ng support for GPUs to a burnin_gpu step15:26
rpittauok15:26
rpittauwe can always add the project there15:26
JayFI don't think there's anything controversial here and it's a good first issue15:26
rpittauthat looks ok to me15:27
JayF#link https://bugs.launchpad.net/ironic/+bug/206908315:27
JayF RFE: Use results of burnin_* methods for inspection 15:27
JayFI'll be honest, this probably needs more design than has been done in the bug currently15:27
JayFbut at a high level; operators want us to optionally report metrics from burnin_ methods to inspection so they can compare performance15:27
TheJulia... that seems fairly nebulous15:28
TheJuliaor at least, at a high level a bit outside of what we can do with the stack with where we're going15:29
TheJuliasince surely they mean inspector15:29
JayFhow is "they mean inspector" different than what I said?15:29
TheJuliawe're on a path to deprecate inspector15:30
dtantsurbut we can still report stuff? the functional difference is not that huge15:30
TheJuliaand inspector really only holds value for introspection runs, so we're talking about storing more data and extending it then15:30
JayFYes, but we have similar functionality in an ironic-forward way?15:30
JayFI specifically didn't talk about internal ironic vs external inspector because AIUI it doesn't make much of a difference at that level in ipa15:30
* dtantsur is writing a migration guide by the way15:30
TheJuliaI guess I'm semi -1 to extending expector15:30
TheJuliainspector15:30
* TheJulia needs way more coffee this morning15:31
rpittauwe can always hold until the migration has been completed15:31
dtantsurwell, yeah, inspector is frozen at this point15:31
rpittauor add the feature to the agent in ironic15:31
JayFI don't comprehend how this extends inspector -- if I were implementing this, I'd focus on new ironic-based inspection 15:31
rpittauyep, exactly15:31
dtantsursame15:31
JayFwe will still have the ability to inspect stuff and report results15:31
TheJuliaokay then15:31
JayFI'm OK with us not approving it because it *is* very vague and might need design15:32
rpittauit actually makes sense to have that in the ironic project :)15:32
JayFbut it's 100% not the intention for this to be an `ironic-inspector` feature15:32
dtantsuralso, if what gets extended is IPA, it does not matter what the server side is15:32
rpittauyeah15:32
dtantsurif something gets into the inspection data, it will be stored either way15:32
TheJuliaso I guess going back to my lack of caffeinated state, I read Jay's summary as they want us to compare the results15:33
TheJuliawhich sent my brain a step further15:33
JayFAh, I'll be 100% clear we're talking about storing metrics from the run, and nothing else15:33
JayFthe use case brought up IRL was "sometimes machines just randomly perform out of spec"15:33
dtantsurWell, we could write an inspection hook to compare the old data with the new one.. that would depend on which server accepts the data15:33
JayFwith a story about NVMe drives that *over performed* by a factor of 2s15:33
JayF*2x15:33
JayFdtantsur: I'd suggest that be a separate, next step15:34
* dtantsur nods15:34
JayFdtantsur: part of the goal here is to have some low-hanging-fruit15:34
rpittauit clearly needs more info, but I'm ok with that15:34
JayFI put a clarifying comment in 2069083, going to move on without marking as approved since it needs more info15:35
rpittauyep, thanks15:35
JayF#link https://bugs.launchpad.net/ironic/+bug/206853015:35
JayF [RFE] Allow an operator to block all future bios deployments 15:35
TheJuliaThat was an idea which came up in discussion which I kind of liked15:35
JayFThe idea here is to have a conductor-level setting which prevents new nodes from being created / nodes from being updated to use BIOS boot mode15:35
JayFIn order for places to enforce UEFI-only boot mode15:36
JayFHmm that's not exactly how it is written 15:36
JayFhow it's written is to block *deployments* if in bios boot mode, that's not exactly the same thing15:36
rpittauthis kind of means that we're deprecating legacy BIOS deployments15:36
JayFrpittau: it means we're allowing some operators to flip a switch to disable them *in their environment* if they have a separate requirement15:37
JayFrpittau: it's just convenient we can flip it to default-enabled when we're ready to deprecate15:37
rpittauyes, I remember the discussion now, and I'm all for it15:37
TheJuliaI noted it as deployment in large part because you can have machines in different states15:37
TheJuliaand the logical place to prevent that sort of thing is in the deployment pathway code because you want to be able to unprovision machines15:38
rpittauof course, makes sense15:38
TheJuliasort of like the retire logic with cleaning15:38
JayFwell, that doesn't exactly match what the operator in the room was saying15:38
JayFthey were saying they had an issue where a device would be misconfigured at enrollment with a bios boot mode15:38
JayFsetting back their efforts to "drain" bios booting out of the environment15:38
JayFI think there's potential value in checking at the API level (you can't explicitly set to bios) as well as at deployment, but IMBW15:39
JayFat least that's how I understood it15:39
TheJuliaSo, we're semi-seeing such issues with some hardware, but it doesn't really show until cleaning/deployment15:40
TheJuliaspecifically hardware which *has* to be requested to be in bios mode, but we shouldn't model the universe around the exception15:40
JayFI guess I'm wondering if we'd ever have operators who'd say "I don't even want a ramdisk to boot via bios"15:40
TheJuliaquite possible, but they are going to have to somehow drain out their current state15:40
JayFthat seems to be the meaningful difference between our two perspectives, yeah?15:41
TheJuliasomewhat, really it is likely two knobs15:41
JayFack15:41
TheJuliaand two distinct checks at different points15:41
JayFI'll update the RFE with that comment15:41
JayFI think it's clear we are onboard to do something like this?15:41
TheJuliaone early on in the entry flow to move a node out of enrollment, and likely later on with deployments15:41
TheJuliaso if you switch both, you'll just end up with nodes which cannot be deployed anymore and you can't add any more15:41
TheJuliawell, until you fix them or replace them :)15:42
JayFupdated with that comment, marking approved15:42
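A minimal sketch of the "two knobs, two distinct checks" idea discussed above, assuming hypothetical option and function names (`block_bios_on_manage`, `block_bios_on_deploy`, `check_manage`, `check_deploy`); none of these exist in Ironic and the RFE does not define them. It only illustrates that one check sits where a node leaves enrollment, the other sits in the deployment path, and unprovisioning stays allowed.

```python
from dataclasses import dataclass


class BiosBootBlocked(Exception):
    """Raised when a hypothetical policy knob rejects BIOS boot mode."""


@dataclass
class BiosPolicySketch:
    # Hypothetical knob names, invented for this sketch only.
    block_bios_on_manage: bool = False
    block_bios_on_deploy: bool = False


def check_manage(policy, boot_mode):
    """Early check: a node is being moved out of enrollment."""
    if policy.block_bios_on_manage and boot_mode == "bios":
        raise BiosBootBlocked("BIOS boot mode rejected when managing a node")


def check_deploy(policy, boot_mode):
    """Late check: a deployment is requested on an existing node."""
    if policy.block_bios_on_deploy and boot_mode == "bios":
        raise BiosBootBlocked("BIOS boot mode rejected at deployment")


def check_undeploy(policy, boot_mode):
    """Unprovisioning is never blocked, so BIOS nodes can be drained."""
    return True
```

With both knobs enabled, existing BIOS nodes can still be unprovisioned, but they cannot be deployed again and no new BIOS nodes can be managed, which matches the outcome described in the discussion.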
JayFTheJulia: this is one of yours15:42
JayF#link https://bugs.launchpad.net/ironic/+bug/206707315:43
TheJuliaby out of enrollment, I mean upon managing one node15:43
JayF [RFE] HTTP ISO Boot via Network (UEFI) HTTP Boot 15:43
JayFThis seems like a pretty straightforward extension of the new http-* based boot interfaces15:44
TheJuliaSo this one is unrelated to the meetup, but the idea is "what if folks didn't want ipxe or a network style PXE loader anymore", and is there any way to direct boot an ISO via the network. Today there is not a delineation as the httpboot interfaces are delineated and the pure network+dhcp boot path leverages a network bootloader, or provide the entire url in advance to the BMC15:44
TheJuliayeah, pretty much just a slightly different option so you don't need a network bootloader at all15:45
rpittauthat is very interesting15:45
TheJuliaIt will only really be useful with a managed dhcp server though15:45
JayFI think silence is acceptance?15:47
TheJuliaI suspect so15:48
rpittaugood for me15:48
JayFthat's the end of the list15:48
rpittauthanks!15:48
JayFhow refreshing! Features people asked for, in real life15:48
JayFnot filtered through someone else's product team :P 15:48
rpittauheh :)15:48
rpittau#topic Open Discussion15:49
TheJuliaeh... that last one is sort of me saving my sanity from a product team :)15:49
JayFAs mentioned Friday, we're probably going to start piloting asyncio-based solutions for removing eventlet from IPA15:49
rpittauwe still have 10 minutes in case someone has something to discuss15:49
JayFI ask folks who may have an opinion to please have it early, I'll make sure stuff gets posted for review quickly15:49
TheJuliaI really don't have an opinion formed right now, but asyncio seems fine to me15:51
JayFthat's all I had for open discussion15:51
JayFTheJulia: my sincere hope is it's a giant nothingburger for something the size of IPA15:51
TheJuliait really should be, honestly15:52
rpittauanything else to discuss today?15:52
TheJuliawe could discuss pysnmp fun and my many broken tests on python 3.1115:53
TheJuliabut that doesn't seem *that* important :)15:53
JayFI want to look at that with an IDE up and not IRC up before talking about it more15:54
TheJuliafwiw, I think its presence angers eventlet, but we can end the meeting15:55
rpittauthanks everyone!15:56
rpittau#endmeeting15:56
opendevmeetMeeting ended Mon Jun 17 15:56:01 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:56
opendevmeetMinutes:        https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-06-17-15.00.html15:56
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-06-17-15.00.txt15:56
opendevmeetLog:            https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-06-17-15.00.log.html15:56
JayFjfyi only seeing six failures on cover job, too16:06
JayFrunning tests on python 3.9 to see if it's a version thing16:07
rpittaugood night! o/16:30
opendevreviewJay Faulkner proposed openstack/ironic master: Upgrade pysnmp-lextudio.  https://review.opendev.org/c/openstack/ironic/+/91232816:33
opendevreviewMerged openstack/python-ironicclient master: Support traits configuration on baremetal create CLI  https://review.opendev.org/c/openstack/python-ironicclient/+/91943216:33
JayFTheJulia: ^^ 912328 passing all tests locally now ; curious to see how tempest likes it16:38
TheJuliaI'm curious if all jobs will pass16:39
TheJuliasame behavior on your python 3.9?16:39
opendevreviewJay Faulkner proposed openstack/ironic master: Upgrade pysnmp-lextudio.  https://review.opendev.org/c/openstack/ironic/+/91232816:40
JayFyes, py3.9 didn't change how tests ran16:40
TheJuliaokay, that is *weird*16:40
* JayF does it again to verify16:40
TheJuliaI guess we will see what the snmp job does since that seems to be the one which easily fails16:41
TheJuliaand... local python why you fail so many!16:41
JayFcheck it out -> tox -epy3 + -ecover + -epep8 + -epy39 all pass 16:41
JayFtox is running under python 3.12, all are fresh (-reenvname) environments16:41
TheJuliarunning tox -r -epy316:42
JayFactually I've tested with tox running under 3.12 (system) and 3.11 (venv)16:42
TheJuliaheh, 2995 tests failing16:45
JayFyou did `tox -repy3`? What version did it use?16:46
JayFI assume you don't have something like python 3.8 locally ?16:46
JayFor you have some package in your global site-packages that is wreaking havoc on your unit tests, perhaps?16:47
JayFhttps://paste.opendev.org/show/bOEgomYRO3Y9qI8pQfde/ a pip freeze from a working python 3.9 (py39) run16:48
TheJulia3.11.2 on debian, system python loaded into the venv16:49
JayFmy tox is running on python 3.12.3 (gentoo) and that paste is from it running a python 3.9.19 (gentoo) inside that venv16:51
TheJuliahttps://www.irccloud.com/pastebin/QRj0mUmc/16:51
JayFinteresting, idk how I ended up with the bonus packages16:52
JayFmaybe just py3.9 vs py3.11?16:52
JayFI'll do a 3.11 run16:52
TheJuliazeroconf pulls in async-timeout it looks like16:52
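For comparing two `pip freeze` outputs like the pastes above, a small helper along these lines may be handy. This is only a sketch: the function names are made up for illustration, it handles plain `name==version` lines only, and it skips anything else (editable installs, URL requirements, comments).

```python
def parse_freeze(text):
    """Map package name -> version for plain 'name==version' lines."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments and anything that is not a simple pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        packages[name.lower()] = version
    return packages


def diff_freeze(left_text, right_text):
    """Return (only_in_left, only_in_right, version_changes)."""
    left = parse_freeze(left_text)
    right = parse_freeze(right_text)
    only_left = sorted(set(left) - set(right))
    only_right = sorted(set(right) - set(left))
    changed = {
        name: (left[name], right[name])
        for name in sorted(set(left) & set(right))
        if left[name] != right[name]
    }
    return only_left, only_right, changed
```

Feeding it the text of a passing environment and a failing one gives the extra packages on each side plus any version mismatches, which is usually quicker than eyeballing two pastes.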
opendevreviewcid proposed openstack/ironic master: Provision ARM (aarch64) fake-bare-metal-vms  https://review.opendev.org/c/openstack/ironic/+/91544116:54
JayFpy311 passes cleanly too, and package list looks similar16:57
TheJuliaweeeird16:58
* JayF runs with driver libs16:58
JayFI am highly curious if somehow16:58
JayFyou're using debian eventlet version16:58
JayFand it doesn't have the async bridge causing all things to break16:58
TheJuliathat... might.. be17:01
TheJuliagive me a few and I'll try to install from source17:01
TheJulia2989 \o/17:11
TheJuliadunno, I'll have to keep hunting17:11
opendevreviewcid proposed openstack/ironic master: WIP: Self-Service via Runbooks [Prototype]  https://review.opendev.org/c/openstack/ironic/+/92214217:25
TheJuliaso ironic-tempest-ramdisk-bios-snmp-pxe failed17:25
TheJuliahttps://6e60353ab6408ff35f72-65a5b55aa83e6a6123b1ed34561cc44d.ssl.cf5.rackcdn.com/912328/4/check/ironic-tempest-ramdisk-bios-snmp-pxe/c56fd24/controller/logs/screen-ir-cond.txt17:25
JayFaha17:27
JayFI made the code match the unit test17:27
JayFinstead of making the unit test match the code17:27
JayFand that wants the code to look the other way17:27
TheJuliayup17:27
JayFI'll revise it17:27
TheJuliaok17:27
opendevreviewJay Faulkner proposed openstack/ironic master: Upgrade pysnmp-lextudio.  https://review.opendev.org/c/openstack/ironic/+/91232817:54
opendevreviewcid proposed openstack/ironic master: Provision ARM (aarch64) fake-bare-metal-vms  https://review.opendev.org/c/openstack/ironic/+/91544118:45
JayF> Jun 17 18:20:17.711718 np0037743883 ironic-conductor[94821]: ERROR ironic.conductor.verify [None req-62350a1e-928f-45a7-9663-1b161177abc1 None None] Failed to get power state for node e7d72db7-b1bf-4020-864b-6434f13f924e. Error: SNMP operation 'GET' failed: No SNMP response received before timeout: ironic.common.exception.SNMPFailure: SNMP operation 'GET' failed: No SNMP response received before timeout18:54
JayFmuch less interesting error this time :/18:54
JayFvirtualpdu is busted18:55
JayFhttps://zuul.opendev.org/t/openstack/build/9fe579e010e44c51b598caa819020762/log/controller/logs/screen-virtualpdu.txt18:55
JayFthis is not going to be trivial, going to try and jfdi it though19:01
JayFso I *think* https://review.opendev.org/c/openstack/virtualpdu/+/881538 may have landed without ever running on that version due to constraints19:24
JayFI can verify that with that reverted, it passes tests on python 3.11 (but blows up on python 3.12 because asyncore doesn't exist there)19:26
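The python 3.12 blow-up mentioned above comes from `asyncore` being removed from the standard library in that release. A quick way to check whether the interpreter in a given tox environment still ships it, without importing the deprecated module, could look like this (a sketch; the variable name is arbitrary):

```python
import importlib.util
import sys

# find_spec only locates the module, it does not import it, so this
# avoids the DeprecationWarning raised on import in 3.10 and 3.11.
has_asyncore = importlib.util.find_spec("asyncore") is not None

print(
    "python %d.%d: asyncore available: %s"
    % (sys.version_info[0], sys.version_info[1], has_asyncore)
)
```

Code that still depends on an asyncore-based transport can only run on interpreters where this reports the module as available, unless a third-party backport is installed.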
opendevreviewJay Faulkner proposed openstack/virtualpdu master: DNM: See if virtualpdu is happier on older pysnmp  https://review.opendev.org/c/openstack/virtualpdu/+/92214719:27
opendevreviewJay Faulkner proposed openstack/ironic master: Upgrade pysnmp-lextudio.  https://review.opendev.org/c/openstack/ironic/+/91232819:27
JayFstacking them for science19:28
JayFif this works oh boy we are in for pain19:28
rpittauJayF: virtualpdu should work with pysnmp-lextudio and pyasn119:39
rpittauyou just need to change asynsock to asyncio19:39
JayFI will re-check but that did not work in my testing19:40
JayFrpittau: very much no19:56
JayFat least on py3.12, I should try on py3.11 I guess19:56
JayF> RuntimeError: This event loop is already running19:57
JayFlittered throughout the tests19:57
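For context on that error: asyncio raises it when blocking, synchronous-style code calls `run_until_complete()` on a loop that is already running, for example from inside a coroutine. The snippet below is a standalone reproduction with made-up function names; it does not use pysnmp or virtualpdu and is not a diagnosis of the actual failure.

```python
import asyncio


async def snmp_like_call():
    # Stand-in for a library call that became asyncio-native.
    return "response"


async def caller_already_inside_loop():
    loop = asyncio.get_running_loop()
    coro = snmp_like_call()
    try:
        # Synchronous-style driving of a coroutine from inside a
        # running loop is rejected by asyncio.
        return loop.run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "never awaited" warning
        return str(exc)


message = asyncio.run(caller_already_inside_loop())
print(message)
```

The usual ways out are to `await` the coroutine from async code, or to own a single event loop at startup and schedule work onto it, rather than nesting loop runs.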
JayFI am almost certain at this point the lower-level methods used by virtualpdu changed from non-asyncio-native to asyncio-native and we may have to do work to make it work19:58
JayFI'm going to push up a reproducer20:00
opendevreviewJay Faulkner proposed openstack/virtualpdu master: DNM: Example of brokeness  https://review.opendev.org/c/openstack/virtualpdu/+/92214920:03
JayFrpittau: ^ fails tests angrily. I'm going to keep digging but wanted to show your assumption is (apparently?) bad or give you a chance to tell me where I went wrong :D 20:04
rpittaumy assumption was based on this change https://github.com/lextudio/pysnmp/commit/12ed9cc996a0a8415e3b7cf7abb5e967da61c20120:04
JayFlooks like the failures are coming from the cmdgen code20:08
JayFI think we might need to run the asyncio event loop at startup now20:08
rpittauyep20:08
cidHmm. Curious to see the root cause of these failures :))20:11
rpittaucid: pysnmp version upgrade20:11
cidLogging off now, but very active on my mobile for a couple of hours 20:11
cidrpittau: same with the metal3 job?20:12
rpittaucid: no, this is completely different, metal3 was a missing python library20:13
cidrpittau: Oh, got it.20:14
JayFthis is basically trying to jump like, a year+ of library updates20:24
JayFwhich also crosses an asyncore->asyncio migration20:24
JayFin a project that until today I'm pretty sure I haven't spent more than 5 minutes looking at20:24
JayFI give up and/or need to pair with someone on this20:52
JayFvirtualpdu unit tests are broken in a way that suggests to me something is screwed up around threading/async-y-ness20:52
JayFbut I don't have a good enough mental model of the code to know exactly where it might be falling through20:52
cidIf a pairing session happens, I would like to be part of it too.21:03
cidBy tomorrow, I will be getting familiar with the job log 21:04
JayFI suspect there's a nonzero chance rpittau just figures it out before I start tomorrow, but if not and we pair I'll ping you21:12
opendevreviewJay Faulkner proposed openstack/ironic master: Temporarily non-voting on SNMP job  https://review.opendev.org/c/openstack/ironic/+/92215321:18
JayFwent ahead and put ^ that up since I suspect we'll end up needing it to get virtualpdu+ironic fixes coordinated21:18
cidJayF: noted21:25
opendevreviewJay Faulkner proposed openstack/virtualpdu master: DNM: Example of brokeness  https://review.opendev.org/c/openstack/virtualpdu/+/92214921:41
opendevreviewJay Faulkner proposed openstack/ironic master: Upgrade pysnmp-lextudio.  https://review.opendev.org/c/openstack/ironic/+/91232821:47
JayFWe will have to land https://review.opendev.org/c/openstack/ironic/+/922153 and then 912328 before we can fix virtualpdu21:47
JayFI am fairly sure 912328 is correct because we attempt to connect to the broken PDU and handle the error correctly; 21:48
JayFif it works, then we'll see the tempest job pass on virtualpdu when we fix it21:48
cardoeI was following https://opendev.org/openstack/ironic/commit/e19fd1d050cbabfdafa40c4743f82cd4c7649f82 change that's for adding HTTPBoot for UEFI devices and I noticed the only hardware type that had this added was the generic hardware type. It's left off for the redfish hardware type. Was wondering if anyone knew if that was intentional or accidental? Due to the way that Redfish overrides supported_boot_interfaces in21:55
cardoe https://opendev.org/openstack/ironic/src/commit/ebbc8300c36c4d17cc89b0b37f0c2dd030ff2d5d/ironic/drivers/redfish.py#L57 it's not possible to even enable it via config.21:55
JayFIsn't that roughly equivalent to redfishhttpsboot?22:03
JayFyeah, it's not because the other interface uses ipxe22:05
JayFhmm22:06
JayFcardoe: I'll PR a fix up for that, but I'll ask TheJulia to review it; she knows more about the edges here than I do and might have some insight as to why it was excluded initially22:07
JayFcardoe: do you have a launchpad bug filed about it, perhaps?22:07
cardoeI haven't made one.22:07
iurygregoryI think the equivalent for redfish is RedfishHttpsBoot no?22:07
cardoeI was gonna edit my setup locally and submit a patch to let it happen.22:07
iurygregoryredfish-https *22:08
JayFso redfish-https doesn't give you the option to have ipxe in the middle afaict22:08
opendevreviewRiccardo Pittau proposed openstack/virtualpdu master: [DNM] Test timeout  https://review.opendev.org/c/openstack/virtualpdu/+/92215822:08
cardoeNo it's not the same as redfish-https. https://docs.openstack.org/ironic/latest/admin/drivers/redfish.html#redfish-http-s-boot per that it would be like the virtual-media route.22:09
opendevreviewRiccardo Pittau proposed openstack/virtualpdu master: [DNM] Test timeout  https://review.opendev.org/c/openstack/virtualpdu/+/92215822:09
iurygregoryright, I just checked https://opendev.org/openstack/ironic/commit/041a7d7064491958278725123af0c1b8fa8aefe522:09
JayFI could see a world where you had a redfish that didn't support BMC-driven http booting22:10
JayFand could need to use the pxe-to-http uefi stuff22:10
JayFbut again, not sure if there's something I don't know here22:10
iurygregoryright22:11
cardoeWell so right now my BMC interface doesn't have the ability to fetch files from my HTTP server. So I was hoping to use the HTTPBoot to avoid TFTP.22:12
TheJuliaFun!22:12
opendevreviewJay Faulkner proposed openstack/ironic master: Enable HTTP network boot for Redfish hardware  https://review.opendev.org/c/openstack/ironic/+/92216022:13
JayFcardoe: it makes more sense at least, operationally, for a BMC to be capable of it but the BMC not being in the right network22:14
TheJuliacardoe: so yeah, looks like an oversight on my end22:14
TheJuliaAt the doctors office22:14
JayFAlternatively, we implemented your feature request about 15 minutes after you asked for it ;) 22:15
cardoeNo rush at all. I was gonna test it and submit a patch. But looks like Jay beat me to the patch part.22:15
JayFI see no evidence looking at the boot interface that it won't just work22:15
cardoeI take it redfish-https is really the best interface overall for redfish hardware then and I should get my network fixed up.22:15
JayFEverything has upsides and downsides? I think we generally default to not hooking *Everything* up because the matrix gets huge22:16
JayFas seen here, not even we are always sure off the top of our heads what the exact differences are22:17
TheJuliaYeah, it should just work22:17
TheJuliaThe bottom line is redfish should really be importing generic as a base and then adding, but almost no drivers *actually* do that :(22:17
* TheJulia gets to listen to the doctor tell her she needs more exercise and less cholesterol.22:18
JayFI was expecting to see super()+redfish_list22:18
JayFand was surprised when I didn't 22:18
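To illustrate the difference between overriding the list and the `super()` + extra-list pattern mentioned above, here is a simplified sketch. The class names are invented and the interface names are plain strings used for illustration; this is not the actual Ironic driver code.

```python
class GenericHardwareSketch:
    """Stand-in for a generic hardware type (hypothetical)."""

    @property
    def supported_boot_interfaces(self):
        return ["ipxe", "pxe", "http-ipxe", "http"]


class RedfishOverrideSketch(GenericHardwareSketch):
    """Replaces the parent list: new parent entries are silently lost."""

    @property
    def supported_boot_interfaces(self):
        return ["redfish-virtual-media", "redfish-https", "ipxe", "pxe"]


class RedfishExtendSketch(GenericHardwareSketch):
    """Prepends its own entries and inherits whatever the parent adds."""

    @property
    def supported_boot_interfaces(self):
        own = ["redfish-virtual-media", "redfish-https"]
        return own + super().supported_boot_interfaces


print(RedfishOverrideSketch().supported_boot_interfaces)
print(RedfishExtendSketch().supported_boot_interfaces)
```

With the override style, an interface added to the parent never shows up in the child unless someone remembers to list it again, which is how a new interface can end up impossible to enable via config on the child type.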
cardoeYou and me both Julia.22:18
cardoeThanks for the quick patch Jay.22:21
TheJuliaDoctor is doing that with my wife now, I’m up next!22:21
JayFhey, I'm happy to write a line to enable a whole thing22:22
* JayF standing on the shoulders of Julia22:22
JayFlol22:22
TheJuliaGive it a shot! My turn22:23
cardoeWell I'll try to contribute in the future rather than just asking questions or making bugs. Working on deploying Ironic via OpenStack Helm currently and most of the patching has been over on that side of the world.22:24
JayFWe're happy to help wherever we can. If you're ever looking for something to get started with contributing, we try to keep some bugs tagged `low-hanging-fruit` in the bugtracker to be simple onboarding tasks22:27
cardoeAnd setting up a relay again needs to be first order of business so I don't drop. Thank you all for the quick replies.22:28
JayFcardoe: many of us use irccloud.com; its free plan keeps you connected for 2 hours after your client disconnects22:37
TheJuliaI pay for it actually, but I’m silly and like 24/7 notices from it22:42
cardoeI appreciate the suggestion. Been off IRC since I tore down my Xen dev box after I stopped being a maintainer. I learned where my ZNC was running that day.22:46
cardoeNow with irccloud so I don't disappear...23:50
cardoeI didn't have a chance to participate in OpenInfra Days or at any past PTGs, but an area of interest is the thread that Julia had about the Ironic networking story.23:51
TheJuliaThat is definitely not at the top of my priority list unfortunately23:52
* TheJulia has much on that list23:53
cardoeI've heard your name floated long before I joined so I can only imagine that list is long.23:54
TheJuliaUnfortunately, I got to write a nice board resolution today and I still haven't started on any of the items I had planned on working on today :(23:58
TheJulia.... and now that I've been appropriately told "you need to exercise more!"23:59
cardoeSo the focus for me is around ZTP for infrastructure (servers and networks gear) and how to advance that.23:59
