Wednesday, 2025-09-17

opendevreviewJacob Anders proposed openstack/ironic master: Fix intermittent Redfish firmware update failures with BMC validation  https://review.opendev.org/c/openstack/ironic/+/960230  01:03
opendevreviewJacob Anders proposed openstack/ironic master: Fix intermittent Redfish firmware update failures with BMC validation  https://review.opendev.org/c/openstack/ironic/+/960230  01:05
opendevreviewJacob Anders proposed openstack/ironic master: Fix intermittent Redfish firmware update failures with BMC validation  https://review.opendev.org/c/openstack/ironic/+/960230  01:06
opendevreviewJacob Anders proposed openstack/ironic master: [WIP] Make cache_firmware_components more resilient during upgrades  https://review.opendev.org/c/openstack/ironic/+/960711  03:24
opendevreviewSteve Baker proposed openstack/ironic master: Replace Chrome/Selenium console with Firefox extension  https://review.opendev.org/c/openstack/ironic/+/961434  05:05
opendevreviewSteve Baker proposed openstack/ironic master: Replace Chrome/Selenium console with Firefox extension  https://review.opendev.org/c/openstack/ironic/+/961434  05:17
dtantsurError while preparing to deploy to node 050264d4-5439-44f0-a9d7-c34f855a8f0a: Image oci://quay.io/dtantsur/cirros:0.6.3 12:01
dtantsurcould not be found.: ironic.common.exception.ImageNotFound: Image oci://quay.io/dtantsur/cirros:0.6.3 could not be found12:01
dtantsurI don't think this feature even works...12:01
opendevreviewDmitry Tantsur proposed openstack/ironic master: OCI: accept both content types when requesting a manifest  https://review.opendev.org/c/openstack/ironic/+/960283  12:12
dtantsurmore debugging ^^^12:12
opendevreviewDmitry Tantsur proposed openstack/bifrost master: WIP add an OCI artifact registry  https://review.opendev.org/c/openstack/bifrost/+/961388  12:33
opendevreviewDmitry Tantsur proposed openstack/bifrost master: Remove the ability to install and use ironic-inspector  https://review.opendev.org/c/openstack/bifrost/+/887934  13:02
dtantsurJayF: revived the old patch of mine ^^ let me know if you think it's too early13:02
JayFI think for the sake of the release teams sanity, we should wait to do the paperwork until after the official release date. As far as prep work like this, I think it's okay.13:24
JayFIf someone wanted to get really industrious, they could also propose all of the delete everything and replace it with a readme patches 😂13:24
TheJuliagood morning13:42
opendevreviewJakub Jelinek proposed openstack/ironic-python-agent master: Fix skip block devices for RAID arrays  https://review.opendev.org/c/openstack/ironic-python-agent/+/937342  13:49
opendevreviewJakub Jelinek proposed openstack/ironic-python-agent master: Fix erasable devices check  https://review.opendev.org/c/openstack/ironic-python-agent/+/961485  14:09
TheJuliawell, dib seems broken on fedora14:10
TheJuliale-sigh14:10
dtantsurTheJulia: which registries have you tested the OCI feature with?14:13
TheJuliaquay.io and the image registry included with OpenShift14:16
dtantsurhmm, quay.io definitely does not work in the current version14:18
TheJuliadtantsur: can you try something like oci://quay.io/podman/machine-os:5.3-amd64 ?14:18
TheJuliait will be a way bigger payload, but that was the working example I was using14:19
TheJuliaugh, it looks like dib+ipa-b is super broken14:20
dtantsurtrying now14:20
dtantsurTheJulia: Image oci://quay.io/podman/machine-os:5.3-amd64 could not be found.14:20
TheJuliawut?!14:21
dtantsurI'm now adding logging everywhere, I'm not sure where this one instance is coming from14:21
TheJuliaok14:21
TheJuliaugh14:21
TheJuliaactually, try14:22
TheJuliaoci://quay.io/podman/machine-os:5.3  14:22
opendevreviewJakub Jelinek proposed openstack/ironic-python-agent master: Fix erasable devices check  https://review.opendev.org/c/openstack/ironic-python-agent/+/961485  14:23
TheJuliaTag structure wise, 5.3 should take it down the most complex path as well with most possibilities of being happy14:23
dtantsurso far so good14:23
dtantsurah, we're probably restarting IPA because of the previous error14:24
dtantsurno, wait, it works14:25
TheJuliaAHH!14:25
dtantsurTheJulia: this one works.. but we need this feature to work with more than one image :)14:25
TheJuliaso its something in the matching logic then14:25
dtantsurI'm trying to find where this ImageNotFound is coming from. It's not oci_registry.py apparently?14:26
TheJuliagive me a minute to pull that up14:27
TheJuliaI'm trying to help iury with drac10 stuffs14:27
dtantsurNow I'm looking at OciImageService and.. it does not support tags at all?? Then what supports them?14:27
TheJuliaso https://github.com/openstack/ironic/blob/master/ironic/common/oci_registry.py#L527 should be attempting to resolve the tags and what *should* be happening is it gets a list of possible tags back based upon what was supplied14:31
TheJuliahttps://github.com/openstack/ironic/blob/master/ironic/common/oci_registry.py#L535 should be getting a whole index back based upon the tag and resolution14:31
dtantsurIt's not coming from oci_registry.py but I have another case14:31
TheJuliawell, hold on14:31
TheJuliaso the entry point into the call sequence should be https://github.com/openstack/ironic/blob/master/ironic/common/oci_registry.py#L596  14:32
TheJuliahttps://github.com/openstack/ironic/blob/master/ironic/common/image_service.py#L640-L804 should be taking the resulting data and then figuring out/matching out what is there14:33
TheJuliaI'm wondering if your tag matching doesn't line up with what it is searching for?! Or are you not even making it there ?14:33
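The lookup flow TheJulia outlines above, resolve the supplied tag against the registry's tag list, fetch the artifact index for the match, then pick manifests out of it, can be sketched roughly like this. The function name and matching rule are illustrative assumptions, not the actual code in oci_registry.py:

```python
# Hypothetical sketch of the tag-resolution flow described above; names
# and the prefix-matching rule are assumptions, not ironic's real API.

def resolve_artifact(tags_in_registry, index_by_tag, requested_tag):
    """Return the manifest list behind requested_tag, or None if absent."""
    # Step 1: find candidate tags -- an exact match, or compound tags such
    # as "5.3-amd64" when "5.3" was requested.
    candidates = [t for t in tags_in_registry
                  if t == requested_tag or t.startswith(requested_tag + '-')]
    if not candidates:
        return None
    # Step 2: fetch the whole index for the resolved tag and hand back its
    # manifest list for the later matching step in image_service.py.
    index = index_by_tag.get(candidates[0], {})
    return index.get('manifests')
```

With a compound-tag layout like podman's machine-os, a requested "5.3" would match both "5.3" and "5.3-amd64", which is the "most complex path" mentioned earlier in the conversation.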
opendevreviewDmitry Tantsur proposed openstack/ironic master: OCI: accept both content types when requesting a manifest  https://review.opendev.org/c/openstack/ironic/+/960283  14:35
dtantsurI'll run with ^^ and hopefully see14:35
TheJuliak14:35
TheJuliaiurygregory: I'm going to build a centos machine to try and reproduce your build issue14:35
TheJuliaLooks like dib's current code expects the same version of python in the build environment14:38
TheJuliawhich is wrong14:38
*** dansmith_ is now known as dansmith14:39
*** JakubJelnek[m] is now known as kubajj14:41
dtantsurTheJulia: the problem is: I simply did `oras push <tag> <file>`. I expect roughly 100% of users trying this feature to do the same (and fail the same way) :(14:44
TheJuliaFair, uhh... I'm trying to remember what it does under the hood and if that has changed at all14:44
TheJuliadtantsur: I'm trying to remember, is that what was written in the docs, or did I frame the docs much more towards explicitly using a shadigest style URL ?14:52
TheJuliaiurygregory: now updating centos14:53
dtantsurTheJulia: you're not really prescriptive there, but we absolutely need both to work IMO14:53
dtantsurhttps://docs.openstack.org/ironic/latest/admin/oci-container-registry.html#available-url-formats  14:53
TheJuliaokay, yeah14:54
TheJuliaSo I never did tag creation in quay, since I started to wholly discount the idea thanks to openshift's image registry14:54
TheJuliaso, distinctly possible oras is doing something we don't expect and is not matching in the path I noted14:55
TheJuliaThat is where I would expect it to be breaking without being in any position to reproduce/try at the moment14:55
dtantsurDon't worry, I can iterate on it further. I just need to find the exact place where ImageNotFound appears14:55
dtantsurUnfortunately, rebuilding dev-scripts with a new revision takes like half an hour :(14:56
dtantsurand I'm still unable to rebuild my bifrost environment with CS1014:56
TheJuliaCS9 or ubuntu+devstack ? I mean, its not *that* difficult to do and its not like your doing anything special aside from setting an image_source14:57
TheJuliaWOW14:57
TheJulia9-stream firmware-images package is like over 1GB14:58
TheJulia75% done and at 880 MB14:58
TheJuliaat some point, we're just going to flip the code around, copy the couple of NIC firmwares worth keeping, wax the entire folder, and rebuild it14:59
dtantsurCS9 is not supported by bifrost14:59
TheJuliaoh well14:59
dtantsurand bloody virt-builder still does not support CS10, so I need to change my tooling14:59
TheJulia1.1 GB at 95%, WOW15:00
dtantsurreplace https://github.com/dtantsur/config/blob/85facb18464a96bc995632f6b2f9da713b40490e/virt-install.sh with something else15:00
iurygregoryTheJulia, ack tks15:09
TheJuliaso I had to pass DIB_RELEASE as well on it, but it's getting started15:10
TheJulialoops attached, extracting15:12
TheJuliadoing the magic15:12
JayFhttps://superuser.openinfra.org/articles/2025-superuser-awards-nominee-g-research/ just went live. Not pasting it here so my Ironic colleagues can stuff ballots in. Nope not at all... ;) 15:19
opendevreviewClif Houck proposed openstack/ironic master: WIP: Trait Based Networking Filter Expression Parsing and Base Models  https://review.opendev.org/c/openstack/ironic/+/961498  15:31
TheJuliaiurygregory: how did you install diskimage-builder && ironic-python-agent-builder ?15:31
TheJuliaregardless, I think your build is out of date because currently it is horribly broken, specifically: https://tarballs.opendev.org/openstack/ironic-python-agent-builder/dib/  15:37
TheJuliaWe never put a project.toml file in ironic-python-agent-builder FWIW15:38
iurygregoryTheJulia, basically it was via bifrost and after pulling the new code I've run pip install . for ipa-b15:45
iurygregorygoing to prepare lunch, will be back in ~2hrs15:45
dtantsurTheJulia: finally useful logging15:48
dtantsurCannot use image oci://quay.io/dtantsur/cirros:0.6.3: the artifact index does not contain a list of manifests: {'schemaVersion': 2, 'mediaType': 'application/vnd.oci.image.manifest.v1+json', 'artifactType': 'application/vnd.unknown.artifact.v1', 'config': {'mediaType': 'application/vnd.oci.empty.v1+json', 'digest': 'sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a', 'size': 2, 'data': 'e30='}, 'layers': [{'mediaType': 'application/vnd.oci.image.layer.v1.tar', 'digest': 'sha256:7d6355852aeb6dbcd191bcda7cd74f1536cfe5cbf8a10495a7283a8396e4b75b', 'size': 21692416, 'annotations': {'org.opencontainers.image.title': 'cirros-0.6.3-x86_64-disk.img'}}], 'annotations': {'org.opencontainers.image.created': '2025-09-09T13:50:07Z'}}  15:48
TheJuliawoohoo15:49
dtantsurI think our code expects one more layer of indirection where here it's already THE manifest?15:49
TheJuliajust draw a direct line in that code!?15:49
dtantsurwhich is probably why the Accept header does not work?15:49
TheJuliaYeah, we do, and yeah15:49
TheJuliaI'd just try and draw the direct line, but using the length15:49
TheJuliaif there is one or more15:49
TheJuliabecause then we know its a composite tag15:49
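The "direct line" fix being discussed, detecting when the fetched document is already THE manifest (as with oras-pushed artifacts) rather than an index wrapping one, might look roughly like this. This is a sketch with a hypothetical helper name, not ironic's actual implementation:

```python
# Hypothetical sketch (not the real ironic code): distinguish an OCI image
# index, which wraps a 'manifests' list, from a single image manifest,
# which directly carries 'layers' -- the shape `oras push` produces.

def extract_manifests(doc):
    """Normalize an index or a bare manifest to a list of manifests."""
    if 'manifests' in doc:
        # A true index (composite tag): one entry per platform/variant.
        return doc['manifests']
    if 'layers' in doc:
        # Already a single manifest: wrap it so callers can treat both
        # shapes uniformly instead of raising ImageNotFound.
        return [doc]
    raise ValueError('neither an image index nor an image manifest')
```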
TheJuliaiurygregory: finally got it to build, I think its environment variable differences passed in, ramdisk of 487MB on a local build.16:00
dtantsursigh, there is also mandatory disktype that needs fixing..16:12
TheJuliayeah, just make it not mandatory in such a case, if there try and use it, if not don't16:16
TheJuliathe existing modeling assumed compound object, such as podman's machine-os which was the prior art we based upon16:17
TheJuliaThe super simple path of 1-1 mapping... yeah16:17
dtantsur`'OciImageService.identify_specific_image' is too complex` jesus how I hate everything...16:18
dtantsurso I'm also on the hook for a refactoring, nice16:18
* dtantsur gets a snack first16:19
TheJuliayou could always bump complexity, create a bug, and wait for me to be able to cycle to it16:20
TheJulia(I'm awful, I know)16:20
dtantsurI *think* there is a simple subroutine I can extract into a new method16:49
* TheJulia falls over dddead17:06
* TheJulia just had a long-ish discussion regarding bootc17:08
opendevreviewDmitry Tantsur proposed openstack/ironic master: Fix OCI artifacts pointing to a single manifest  https://review.opendev.org/c/openstack/ironic/+/960283  17:19
dtantsurThe patch is getting out of hand ^^17:19
dtantsurneeds testing still17:19
dtantsurTheJulia: it's surprisingly quiet in my world about bootc. although somebody in metal3 upstream has already asked if we can somehow build IPA with it :D17:29
JayFI got some feedback that ironic-weekly-prio was tagged on some stuff, and was autocompleting in that field. I dropped that invalid tag (with the LY) from the merged/abandoned patches that had it in an attempt to correct17:30
TheJuliaJayF: the auto pull of items with open/merged state?17:31
JayFI'm saying simply: many older patches had hashtag:ironic-weekly-prio17:31
TheJuliaJayF: yeah, okay17:31
JayFwhen I asked clif1 to h/t his change, he noted it autocompleted with the ly version17:31
JayFso I deleted them17:31
TheJuliaoh, ironic-week-prio vs ironic-weekly-prio (which I don't think I've ever tagged, but I could see how it gets there)17:32
JayFyeah exactly17:32
TheJuliadtantsur: interesting in that we added logic to support it in the agent, but to use it to run the agent is... Interesting.17:32
*** clif1 is now known as clif17:32
dtantsurI'm not sure it was a well-thought idea :) we had a brainstorming session about distributing IPA images17:33
dtantsurThe immediate desire is to switch to oci:// by default and ship IPA images via our Quay17:35
dtantsurAnd the next question was of course "oh, and can we layer them for easy modifications"? :)17:35
TheJuliaYeah, that is a plus17:36
TheJuliaI mean, its more like just running a container17:36
TheJuliaUhh, lets see, where did I put my brain17:38
dtantsurCats were playing with it, so now it's under some furniture?17:38
* iurygregory is back17:42
iurygregoryTheJulia, what command did you use?17:43
TheJuliadtantsur: oh noes... the orange cat has it.. Rutro!17:44
dtantsurforget about it then :D17:44
dtantsur(what would an orange cat do with a brain??)17:44
TheJuliaiurygregory:  DIB_RELEASE=9-stream ironic-python-agent-builder centos -o ironic-python-agent  17:44
iurygregoryack, trying now17:45
TheJuliadtantsur: given they universally share a single braincell.... take a nap?!17:45
dtantsurtrue17:45
TheJuliaYes, orange cat's eyes are closed. He occupies the dog bed.17:47
dtantsurA good position to be in! (my condolences to the dog)17:48
TheJulia(He is used to it...)17:53
dtantsurOkay, my testing environment won't rebuild for some time more, I guess I'll test the new revision tomorrow.17:53
dtantsurIf that works, I'll try ORAS+docker's registry, I guess17:54
iurygregory595MB ironic-python-agent.initramfs  17:57
TheJulia... I guess your going to need to take it apart18:13
TheJuliaand look inside18:13
iurygregoryyeah18:14
iurygregoryI'm going to also test building in my laptop just to see what size will be18:14
TheJuliaiurygregory: by chance, did you put forward a change around https://github.com/iurygregory/openstack-sushy/commit/8039b9f8a42e0600a467849b3dc77f79397422e3 into gerrit?18:24
iurygregoryTheJulia, nope, still testing downstream to see if really makes sense18:26
TheJuliabased upon your notes, it seems to, fwiw.18:26
iurygregoryack18:27
TheJuliaFWIW, I'm looking at starting to move some of these fixes downstream into openstack18:38
iurygregoryok18:56
JayFI didn't realize y'all had any sushy compat code in downstream places :-O18:59
TheJuliawell, we have the older releases we've consumed downstream19:05
JayFhonestly am more just curious how your end of the world works overall19:06
iurygregoryon ocp is mostly to make it easier to consume from source so we can map the content for each ocp release we have19:35
iurygregoryso we don't have to be strict with an upstream release (since the upstream/downstream cycle) is quite different 6 and 4 months kinda19:36
JayFsensible20:01
TheJuliaosp/rhoso is release tied, so we have downstream git mirrors which have additional branches and additional processes we need to walk through. From there we go through build/release processes20:21
TheJuliaiurygregory: so looking at your change again, I'm thinking that might be wrong because its going to raise an exception regardless then because there won't be a location. What needs to happen, I think, is first check if we have a 200 error code, then hand over to that location handling code and the reference lookup using that20:28
TheJuliathoughts?20:29
*** mnaser_ is now known as mnaser20:36
TheJuliaI guess I'm mentally trying to avoid raising extensionerror20:36
mnaserwhos ready for some FUN20:37
TheJuliadefine fun? and type of fun?20:38
TheJuliamnaser: whats going on?20:38
mnaseri have a f(riggin)antastic environment that takes maybe up to 60s for a port to go up20:38
JayFhopefully the fun where he approves adamcarthur5's openstack-exporter changes ;) 20:38
mnaserhttps://paste.openstack.org/show/bDbiC2ZIPAB1u6YBRnoE/  20:38
* JayF hands mnaser a "stp off" /s 20:38
mnaseri have a feeling that ironic-conductor is in a busy loop20:38
TheJuliamnaser: line carrier up or for packets to actually be forwarded?20:39
JayFso you're saying the *BMC* port goes out to lunch?20:39
JayFbecause that's querying the BMC port20:39
mnaserright, but isnt oslo_service.loopingcall.LoopingCallTimeOut seeming to imply that the loop never ran for 82s?20:39
* TheJulia twitches about eventlet20:40
mnaserour bff eventlet indeed20:40
JayFThis looks like Ironic is querying /your BMC/ for power state and it's not up20:40
JayFso again I ask: do the *BMC* ports flap on power change?20:40
JayFbecause we might have a corrolation!=causation issue but I'm not 100% sure20:40
TheJuliamnaser: ipmi or redfish?20:40
mnaserredfish20:41
TheJuliauhh....20:41
TheJuliapresently searching for a word20:41
TheJuliais it a shared port with the OS? or a dedicated BMC port?20:41
mnaserdedicated20:41
TheJuliaAnd it just doesn't respond, times out, etc? Do we see packets?20:42
mnaserlet me double check but for example20:43
TheJuliaThe looping call only calling back to it after so many seconds is... 8|20:43
mnaserhttps://www.irccloud.com/pastebin/H2ZNJSpW/  20:43
* TheJulia blinks20:44
TheJuliaooookay20:44
TheJuliaokay20:44
TheJuliaso the overall loopingcall I guess is wrapping the neutron interaction and you're getting the thread slayed basically20:45
TheJuliaeventlet is standing there going "no, you cannot proceed"20:45
JayFis this how needing more workers presents?!20:45
TheJuliaor the overall port operation is hanging for a super long time, finally returns20:46
mnaserso the port was plugged "2025-09-17 20:22:05.057"20:46
TheJuliaoh. wow.20:46
mnaserwhich is fine, you see it 3 minutes less in the logs that the ports are there20:46
TheJuliahow many nodes is this conductor managing?20:46
mnasera whopping 80 something split by 3 conductors20:47
TheJulia... Are there a bunch of override timeouts?20:47
mnaseronly for neutron since these _FUN_ cumulus switches take ~30s to actually apply their configs * two switches for a bond20:47
TheJulia(We've seen some folks try to tune things like timeouts to insane values and break things in super weird ways, just trying to understand the scope)20:47
TheJuliaokay okay20:48
TheJuliaso, uhhhhhh20:48
mnaserand to avoid massive races, we have locking in place, so this is with me trying to put a 60s sleep between every manage20:48
TheJuliaokay, I think I understand what is going on, give me a few minutes while I dig through the code20:49
JayFI'm very curious to see :D 20:49
mnaser(i understand i am dealing with a turd here.. but i do think we can get away if the behaviour is indeed blocking and tenacity is not yielding for some reason)20:49
JayFmnaser: the amount I'm curious what the behavior would be on post-eventlet ironic is maximum right now :D 20:50
mnaserare ya suggesting an upgrade on a thursday20:50
mnaserso right now, with a 60s sleep, i managed to get 21 clean failed, 7 managable20:51
TheJuliacrazy question20:52
TheJuliawhat python version is this?20:52
mnaser3.10.1220:53
TheJuliaso flow wise, your asking for the node to be provided, and they are all going down this path20:55
mnaserrunning a runbook more specifically, but i think it ends up the same issue if it was a normal provide too20:55
mnaserhttps://www.irccloud.com/pastebin/vtB9qfWc/  20:55
mnaserthey all seem to be floating around 75-80s20:55
TheJuliayeah20:55
TheJuliaCurious20:56
mnaserhttps://opendev.org/openstack/oslo.service/src/branch/master/oslo_service/backend/_eventlet/loopingcall.py#L54-L60  20:56
mnaseri mean it sounds/feels like the system didnt power on for 75s "in theory"20:57
mnaserbut that just seems off20:57
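For context on the LoopingCallTimeOut messages above: a dynamic-backoff loop roughly like oslo.service's (an assumed shape, not copied from loopingcall.py) retries with a growing interval and gives up once the accumulated wait crosses a timeout, which is how messages like "timed out after 82 seconds" arise:

```python
# Rough sketch of a dynamic-backoff looping call (assumption, not the
# actual oslo.service implementation): the wait between retries grows
# geometrically up to a cap, and the call aborts once adding another
# interval would exceed the overall timeout.

def backoff_schedule(initial=1.0, factor=2.0, max_interval=30.0, timeout=186.0):
    """Return (list of waits, total time waited) before the loop gives up."""
    waits, total, interval = [], 0.0, initial
    while total + interval <= timeout:
        waits.append(interval)
        total += interval
        interval = min(interval * factor, max_interval)
    return waits, total
```

Under these assumed parameters most of the elapsed time is spent in the capped 30-second waits, so a BMC that answers slowly only needs to miss a handful of polls before the whole call times out.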
TheJuliamnaser: any chance we can get a peek at a sanitized ironic.conf ?20:58
TheJuliathis feels super bizarre21:00
mnaserhttps://www.irccloud.com/pastebin/IZn2Imum/  21:01
TheJuliasecondary question: what kind of gear is this?21:01
mnaserguilherme told me your favorite kind21:01
mnaserthe novo21:02
TheJulianovo?!21:04
JayFfwiw cc: kubajj I'm working on that IPA Hardware manager refactor21:04
TheJuliaso, the power sync interval stands out to me in the configuration21:04
JayFfigured if anyone should be responsible to clean that up it should be the original spiller :)21:04
mnaserlenovo => le novo => the novo21:04
TheJuliaset to every 30 seconds21:04
TheJulialol21:04
TheJulianow I get it!21:04
mnaseri'm all bad jokes after digging at this for the past few hours21:04
mnasermush brain21:04
TheJuliaso, every 30 seconds for every bmc is super aggressive21:05
TheJuliaI'd back it to 60 since the base standards expect 1 interaction every 60 seconds, specifically around ipmi, but the vendors get weird about redfish and sessions as well21:05
TheJuliauhhhhh. speaking of redfish, is it session auth ?21:05
mnasersession auth.. is that just username/pw?21:06
TheJuliathis would be governed with driver_info parameters21:06
TheJuliasession auth is username+password, but where the conductor saves a session token in ram21:06
JayFevery 30 seconds is MASSIVE compared to what I've ever run at scale21:06
TheJuliaso it doesn't re-auth21:06
JayFI think we run 5 minutes+ in current downstraem21:06
TheJuliayou could be exhausting the session limits of the BMCs21:06
TheJuliasome vendors only allow so many distinct logins, fwiw21:06
TheJuliathe novo gear, I don't know about21:07
TheJuliaBUT, I've seen/heard grumbling about some gear *also* taking *forever* to reflect/update power state changes too21:07
mnaserif i have username+pw that is session, right? or is there an extra knob to flip?21:07
TheJuliaLike... 2.5 minutes which forces the sync interval to also be raised upwards21:07
TheJuliamnaser: we can use it to create a session, or we use it for basic re-auth every thime21:08
TheJuliauhhh21:08
mnaserhttps://www.irccloud.com/pastebin/jHChM3Vs/  21:08
TheJuliaauth_type in ironic.conf21:08
TheJuliaso sessions auth it *should* be21:09
TheJuliabased upon the info you've provided21:09
TheJuliawell, it would be "auto", where it tries/prefers session auth and falls back21:09
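If memory of the option names is right, the conductor-wide default lives in ironic.conf and individual nodes can override it through driver_info; a minimal fragment, assuming the current option names and defaults, would be:

```ini
[redfish]
# "auto" tries session auth (the conductor caches a session token in RAM
# so it does not re-authenticate on every poll) and falls back to basic
# auth; per-node override: driver_info['redfish_auth_type'].
# Other accepted values: session, basic.
auth_type = auto
```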
TheJuliaOkay, what else...21:09
mnaserTheJulia: ok so i got a fun one for you, i looekd at docs at why its 30s21:10
mnaserlooks like we had it at default..21:10
TheJuliaoh?21:10
mnaser2025-07-18 14:10:01.932 1 ERROR oslo.service.loopingcall [-] Dynamic backoff interval looping call 'ironic.conductor.utils.node_wait_for_power_state.<locals>._wait' failed: oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after 186.12 seconds21:10
TheJuliaWUT21:11
mnaserokay hold21:11
mnaserthis runbook includes bios.apply_configuration21:11
mnaseri wonder if that's mucking about21:11
TheJuliaoh21:11
TheJuliaOH21:11
TheJuliayeah21:11
mnaserso.. do we need a sleep clean step?21:12
TheJuliaTechnically, version wise, we might have one21:12
TheJuliaI'd just do a sleep in a playbook first to let things settle down post-apply because things may be autonomously rebooting21:12
mnaserwell after the apply bios config, we have some erase and raid stuff we do after21:13
mnaserso i wonder if we split that into two runbooks..21:13
mnaseror if we can put a sleep in there, we have 2024.2 here i think21:14
TheJuliaI'd give it a shot. I know we've long seen vendors do weird things when bios settings get changed21:14
TheJuliayeah, I think the code which gives you a sleep option is in 2025.2, but first step is to just delineate if one is causing another issue21:15
TheJuliaThe whole thing very much looks like we're sitting there aggressively trying to engage with the bmc and not getting a response21:15
TheJuliathe *other* issue, is the connection timeouts are also 60 seconds, so you're basically constantly trying to check power state and if the BMC is not responding, then bad things will start to happen21:15
mnaseri think post apply configuration maybe the bmc is not happy21:15
TheJuliaI'd packet capture to verify somewhere in there21:16
TheJuliaYeah, I'd concur with that21:16
TheJuliaand I bet that is starting a sort of cascade of unhappiness21:16
TheJuliasort of like a portable storm cloud21:16
mnaserso maybe two runbooks, one for bios, and one for the normal raid cleanup21:16
mnaserand go for a nap inbetween21:16
TheJuliasandwich for the novo ;)21:17
opendevreviewJay Faulkner proposed openstack/ironic master: Increase default sync_power_state_interval  https://review.opendev.org/c/openstack/ironic/+/961554  21:17
JayFdid somebody say bad default?21:17
TheJuliaeek, 60 is okay-ish21:18
TheJuliajust... the interval is exceeding the timeouts21:18
TheJuliadefault timeouts are also 60 seconds21:18
TheJulia(I... think.)21:18
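The interval-versus-timeout mismatch TheJulia flags can be seen with some illustrative arithmetic (a hypothetical helper, not ironic code): if each power-state query may block for up to the connection timeout, a sync interval shorter than that timeout lets polls against a slow BMC pile up instead of spacing out.

```python
# Illustrative arithmetic only, not a model of ironic's internals: count
# how many power-state queries are in flight at once when each query can
# take `response_time` seconds and a new one starts every `sync_interval`.

def polls_outstanding(sync_interval, response_time, window):
    """Queries simultaneously in flight over a `window` of seconds."""
    started = window // sync_interval                      # polls launched
    finished = max(0, (window - response_time) // sync_interval)
    return started - finished
```

With a 30-second interval and a BMC that takes 60 seconds to answer, two polls overlap at any moment; raising the interval to at least the timeout keeps it to one, which is the reasoning behind backing the interval off to 60 seconds or more.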
mnaseroh also i managed to get an sqlalachemy trace from the api in messing with this too!21:19
mnaserrunbook update with without "args" in the json blob21:20
* TheJulia vomits21:20
TheJuliawecanhasbugpls?21:20
mnaseryes i will write that down 21:21
opendevreviewSteve Baker proposed openstack/ironic master: Replace Chrome/Selenium console with Firefox extension  https://review.opendev.org/c/openstack/ironic/+/961434  21:22
TheJuliamnaser: thanks21:24
TheJuliastevebaker[m]: impressive21:33
stevebaker[m]TheJulia: ty!21:33
JayFI am so close to getting all unit tests passing on this IPA HWM refactor21:35
mnaserso i guess.. technically... do we need to reboot or even boot up the system for the redfish bios interface?22:31
mnaserhttps://github.com/openstack/ironic/blob/247ae57a22d1b48173398e940d62e531c237d66e/ironic/drivers/modules/redfish/bios.py#L318  22:33
mnaserit seems we're rebooting right meow22:33
JayFiurygregory has some work ongoing in that space iirc, we have lots of unneccessary reboots in redfish-only step actions22:34
opendevreviewJay Faulkner proposed openstack/ironic-python-agent master: WIP: Refactor ironic_python_agent/hardware.py into multiple modules  https://review.opendev.org/c/openstack/ironic-python-agent/+/961559  22:36
* JayF wonders if mnaser is seeing his messages?22:38
mnaserpo9=]o22:39
mnasersorry22:39
mnaserkid attacking laptop22:39
mnaserim still trying to figure out how to run this bios thing :(22:39
JayFYeah, I think the reboots are unavoidable at this time22:41
TheJuliaIf memory serves the bmc can also say "i need to reboot for this to take effect"22:44
mnaserpower_state_change_timeout .. gonna try playing with that, since that's really my issue, maaaybe23:08
iurygregoryI think for bios settings we need to reboot, some firmware updates we can try to avoid rebooting 23:19
iurygregoryTheJulia, by any chance can you share the .initramfs and .kernel you generated so I can test in the idrac10?23:20
cardoeYeah, there's a bit we get back when we upload the firmware or set BIOS settings, if memory serves me correctly, that we're just throwing out the window.23:25
cardoeLike the BMC will give you back a Job ID if it wants you to reboot it or do something further.23:26
cardoeI had a WIP patch a while back that I tried to save that ID and make some choices from there23:26
cardoeBut I couldn't figure out how to make the operation conditionally do a reboot and ensure the IPA was setup to boot again or not.23:27
*** ex_tnode7 is now known as ex_tnode23:40
iurygregoryTheJulia, oh I only saw this now: so looking at your change again, I'm thinking that might be wrong because its going to raise an exception regardless then because there won't be a location. What needs to happen, I think, is first check if we have a 200 error code, then hand over to that location handling code and the reference lookup using that23:46
iurygregoryI think it makes sense 23:46

Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!