| TheJulia | happy day! | 00:00 |
|---|---|---|
| cardoe | janders: come over to my little special corner. The CI just gives us cat poop in little baggies. | 00:00 |
| TheJulia | cardoe: I thought our CI system was like a Litter-Robot... so BIG baggies?! | 00:01 |
| opendevreview | Merged openstack/ironic master: Remove direct mapping from API -> DB https://review.opendev.org/c/openstack/ironic/+/956512 | 00:03 |
| cardoe | They can be quite large. | 00:05 |
| TheJulia | cardoe: I haven't had any time to cycle over to the issue you're looking at | 00:07 |
| TheJulia | unfortunately | 00:07 |
| TheJulia | Maybe tomorrow, I've got to rebuild my test VM and make sure something is happy | 00:07 |
| cardoe | It's all good. I haven't either. | 00:09 |
| iurygregory | TheJulia, shouldn't we remove wip from the commit title https://review.opendev.org/c/openstack/ironic/+/956972 ? | 00:10 |
| TheJulia | iurygregory: I need to sit down to self review it at this point and likely make doc changes, I can do that unless you or jacob want to sanity review it | 00:10 |
| TheJulia | Jacob has already commented in irc though, which is a positive sign | 00:11 |
| iurygregory | yeah I was about to mention that | 00:11 |
| TheJulia | I'm going to go begin to prepare dinner here | 00:12 |
| iurygregory | enjoy :D | 00:12 |
| iurygregory | I just had lasagna here | 00:12 |
| opendevreview | Verification of a change to openstack/ironic master failed: Fix service failed state transitions for wait/hold https://review.opendev.org/c/openstack/ironic/+/957290 | 00:15 |
| opendevreview | Merged openstack/ironic master: Optional indirection API use https://review.opendev.org/c/openstack/ironic/+/956504 | 00:25 |
| janders | TheJulia ++ for removing WIP | 00:25 |
| janders | shall I start working on the doco change related to aborting servicing? Happy to | 00:25 |
| janders | half of gardening done, just dropped in to check messages, off for another 20 mins and then back properly | 00:26 |
| opendevreview | Merged openstack/ironic master: Revert "ci: temporary metal3 integration job disable" https://review.opendev.org/c/openstack/ironic/+/956953 | 00:35 |
| opendevreview | Merged openstack/ironic master: Clean-up misc eventlet references https://review.opendev.org/c/openstack/ironic/+/955632 | 00:35 |
| opendevreview | Jacob Anders proposed openstack/ironic master: Fix servicing abort to respect abortable flag https://review.opendev.org/c/openstack/ironic/+/957189 | 02:12 |
| opendevreview | Jacob Anders proposed openstack/ironic master: WIP: update documentation to include servicing abort. https://review.opendev.org/c/openstack/ironic/+/957825 | 02:34 |
| janders | TheJulia when you're online in the morning, please have a look, ^^ is my initial attempt at service-abort doco change. Question: do we update the state machine svg by hand or is it auto-generated? | 02:35 |
| TheJulia | it's a command to do it, I can do it tomorrow | 03:03 |
| opendevreview | Clif Houck proposed openstack/ironic master: Add a new 'category' field to the Port object https://review.opendev.org/c/openstack/ironic/+/955447 | 03:03 |
| TheJulia | it's in tox.ini, fwiw | 03:03 |
| janders | thank you! :) | 03:04 |
| opendevreview | Merged openstack/ironic master: Fix service failed state transitions for wait/hold https://review.opendev.org/c/openstack/ironic/+/957290 | 03:10 |
| opendevreview | Clif Houck proposed openstack/ironic master: Add a new 'physical_network' field to the Portgroup object https://review.opendev.org/c/openstack/ironic/+/955625 | 03:22 |
| opendevreview | OpenStack Proposal Bot proposed openstack/ironic-ui master: Imported Translations from Zanata https://review.opendev.org/c/openstack/ironic-ui/+/957829 | 03:33 |
| opendevreview | Jacob Anders proposed openstack/ironic master: WIP: update documentation to include servicing abort. https://review.opendev.org/c/openstack/ironic/+/957825 | 03:34 |
| janders | further to above I have a general doco contributions question: do we write Ironic docs manually or is Ironic documentation auto-generated? | 05:07 |
| rpittau | good morning ironic! o/ | 06:47 |
| janders | hey rpittau o/ | 06:47 |
| rpittau | janders: re ironic docs: it is usually manually written :) | 06:48 |
| janders | thank you rpittau | 06:48 |
| janders | probably a good job for Claude, however I did jump the gun with the above change (all manual). Was all done before I thought about it :) | 06:48 |
| rpittau | I'm sure Claude will be happy to help :D | 07:20 |
| opendevreview | Jacob Anders proposed openstack/ironic master: WIP: update documentation to include servicing abort. https://review.opendev.org/c/openstack/ironic/+/957825 | 07:42 |
| opendevreview | Jacob Anders proposed openstack/ironic master: WIP: update documentation to include servicing abort. https://review.opendev.org/c/openstack/ironic/+/957825 | 07:44 |
| janders | ^^ rebased on service-abort patch AND regenerated the state machine diagram | 07:45 |
| janders | looks better now | 07:45 |
| rpittau | FYI final release for sushy for Flamingo has been requested https://review.opendev.org/c/openstack/releases/+/957742 | 07:59 |
| opendevreview | Riccardo Pittau proposed openstack/bifrost master: Deprecate support for Debian 11 Bullseye https://review.opendev.org/c/openstack/bifrost/+/957847 | 08:38 |
| rpittau | forgot debian 11 has python 3.9 by default! ^ | 08:39 |
| sa | Hi all, We are seeing an issue with InsertMedia via Redfish on HPE Compute Scale-up Server 3200: The path .../VirtualMedia/0/Actions/VirtualMedia.InsertMedia exists, but calls fail with sushy.exceptions.ResourceNotFoundError. We are using the current sushy and ironic versions with the latest merged patches. Could you advise: Are there any known conditions where InsertMedia fails even when the resource path exists? Are th | 09:51 |
| sa | Are there limitations on image URL types or lengths for VirtualMedia on this platform? Any suggestions to reliably insert a UEFI boot ISO in this setup? Thank you for your guidance. Best regards, Pooja Sangle | 09:51 |
| Sandzwerg[m] | Morning Ironic. Regarding the request above: We now found out that the things don't support HTTP(S) for virtual-media only CIFS or NFS. https://docs.openstack.org/ironic/latest/admin/drivers/redfish.html#redfish-virtual-media mentions "The idea behind virtual media boot is that BMC gets hold of the boot image one way or the other (e.g. by HTTP GET, other methods are defined in the standard), then “inserts” it into node’s | 10:59 |
| Sandzwerg[m] | virtual drive as if it was burnt on a physical CD/DVD." but does ironic supports anything else apart from HTTP(S)? | 10:59 |
| frickler | Sandzwerg[m]: I have a similar issue and already opened an RFE bug https://bugs.launchpad.net/ironic/+bug/2119212, planning to submit some code for it real soon(tm) | 11:35 |
| Sandzwerg[m] | Sounds promising. Thanks. I'll follow that bug :) | 12:02 |
| TheJulia | good morning | 13:13 |
| darkhackernc | 0/ | 13:13 |
| cardoe | JayF: I do wanna talk about the quirks thing at some point | 13:23 |
| cardoe | I'm just neck deep in cinder right now. | 13:23 |
| rpittau | cardoe: I hope not literally :D | 13:27 |
| cardoe | heh. it might be more fun | 13:28 |
| cardoe | Just diving into a new to me code base is always fraught with battles. | 13:28 |
| opendevreview | Morten Stephansen proposed openstack/ironic-python-agent stable/2025.1: Fix for motherboards where efibootmgr returns UTF-8. https://review.opendev.org/c/openstack/ironic-python-agent/+/957909 | 13:30 |
| JayF | cardoe: let's have the conversation async in the etherpad so it can be a jumping off point for the PTG | 13:41 |
| rpittau | JayF: you have the etherpad for the PTG already created ? | 13:44 |
| cardoe | JayF: good call. | 13:47 |
| rpittau | mmm RPC connection not working well with Python 3.10? jammy does not like it | 13:56 |
| rpittau | https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_2d0/openstack/2d0742476cfc4b12aa93e91ab0e67ea3/logs/ironic.log | 13:57 |
| opendevreview | Clif Houck proposed openstack/ironic master: Add a new 'category' field to the Portgroup object https://review.opendev.org/c/openstack/ironic/+/955713 | 14:12 |
| JayF | rpittau: yep it's linked on the whiteboard | 14:21 |
| rpittau | JayF: thanks | 14:21 |
| opendevreview | cid proposed openstack/ironic master: Image cached on deployment failure https://review.opendev.org/c/openstack/ironic/+/957613 | 14:45 |
| opendevreview | cid proposed openstack/ironic master: Clear image cache on deployment failure https://review.opendev.org/c/openstack/ironic/+/957613 | 14:53 |
| *** | darmach47 is now known as darmach4 | 14:58 |
| TheJulia | I've already had someone reach out for eventlet help, success?!? | 15:00 |
| JayF | oh yeah that's right, it's party day | 15:37 |
| JayF | \o/ /o\ \o/ /o\ | 15:37 |
| TheJulia | Indeed! | 15:38 |
| opendevreview | Stanislav Dmitriev proposed openstack/ironic-python-agent stable/2024.2: Fix for motherboards where efibootmgr returns UTF-8. https://review.opendev.org/c/openstack/ironic-python-agent/+/957947 | 15:56 |
| opendevreview | Stanislav Dmitriev proposed openstack/ironic-python-agent stable/2024.1: Fix for motherboards where efibootmgr returns UTF-8. https://review.opendev.org/c/openstack/ironic-python-agent/+/957948 | 15:57 |
| opendevreview | Merged openstack/ironic-ui master: Fix small mistake in text https://review.opendev.org/c/openstack/ironic-ui/+/956614 | 16:16 |
| clif | It seems like something may have broken with downloading the CentOS GenericCloud-9-latest image? It is listed as last updated today and at least one voting test is failing to download it: https://zuul.opendev.org/t/openstack/build/7e229768b74e4e8c86805e984e429847/log/job-output.txt?severity=0#24758 | 16:47 |
| JayF | looking | 16:47 |
| clif | maybe not downloading but something with repacking the base image | 16:48 |
| JayF | 2025-08-19 03:27:45.681573 \| controller \| 2025-08-19 03:27:45.680 \| qemu-img: error while reading at byte 6186532864: Input/output error | 16:48 |
| JayF | that is ... weird | 16:48 |
| JayF | implies a bad download or full disk or something along those lines I'd suspect | 16:48 |
| clif | so download succeeds, then it barfs trying to do `qemu-img convert` | 16:48 |
| JayF | so a few things I usually check in this case: | 16:50 |
| JayF | other changes, is that same job failing or passing | 16:50 |
| JayF | if it's passing; is it a machine in the same cloud | 16:50 |
| JayF | (you can usually tell by mirror urls but I think it's in metadata somewhere) | 16:51 |
| JayF | if the answer is "it's passing on a machine in the same cloud" I'd recheck | 16:51 |
| JayF | if the answer is "it's passing on a machine in a different cloud" I'd recheck and note the additional datapoint | 16:51 |
| JayF | also checking the system output to see if any of the other things (e.g. full disk) happened | 16:51 |
| JayF | https://6d6df89134025ef5b0e9-648722ac87374da2f576895eac8df5a8.ssl.cf5.rackcdn.com/openstack/7e229768b74e4e8c86805e984e429847/controller/logs/worlddump-latest.txt yeah the systems info looks good | 16:53 |
| JayF | the thing that job specifically tests is weird though: you have to have special disk images for things to work on 4k | 16:53 |
| JayF | so there's also a possibility something changed about converting the images to 4k breaking that job | 16:53 |
| JayF | but asking if other jobs are passing will help answer that | 16:54 |
| clif | what is 4k in this context? | 16:54 |
| JayF | https://zuul.opendev.org/t/openstack/builds | 16:54 |
| JayF | disk block size | 16:54 |
| JayF | as opposed to 512, the standard | 16:54 |
| JayF | (well, "standard" meaning how it's been done a long long time, I'm sure it's all standard to someone) | 16:54 |
| clif | here is another job doing the same thing: https://zuul.opendev.org/t/openstack/build/a827f19c81bb49baac21aa0777854d3a/log/job-output.txt#24818 | 16:55 |
| JayF | https://zuul.opendev.org/t/openstack/builds?job_name=ironic-tempest-uefi-redfish-vmedia-4k&skip=0 clif that looks bad | 16:55 |
| clif | aha | 16:55 |
| clif | yea they started failing around the time that new image was published | 16:55 |
| JayF | so I'd spot check those, maybe 3-4 of them, make sure it's a similar failure | 16:55 |
| clif | https://cloud.centos.org/centos/9-stream/x86_64/images/ | 16:55 |
| JayF | then we beg TheJulia for her 4k voodoo knowledge | 16:55 |
| JayF | and/or mark it nonvoting temporarily while we figure it out | 16:55 |
| clif | I mean I already have found two with the exact same failure | 16:56 |
| JayF | I think we have a hypothesis | 16:56 |
| clif | I agree | 16:56 |
| JayF | if you want, you can document this in a bug, file a review that temporarily marks this job nonvoting | 16:56 |
| * | TheJulia hides | 16:56 |
| JayF | we won't merge it immediately, but we will if there's no path to fix soon | 16:56 |
| clif | sure | 16:56 |
| TheJulia | give me a few to context switch, I'm digging into another issue right now | 16:56 |
| clif | either non-voting, or temporarily peg the image to previous version? | 16:56 |
| JayF | if you can peg to previous version you get three gold stars | 16:57 |
| JayF | but make sure you document it in a bug so we don't have it pinned to august 18, 2025 on august 18, 2026 :D | 16:57 |
| clif | yea true, well I'll see how easy/hard that is to do | 16:58 |
| clif | does this go against ironic or tempest in the bug tracker? | 16:59 |
| JayF | ironic | 17:00 |
| JayF | we have our own ironic-tempest-plugin as well | 17:00 |
| JayF | but a bug would only go there if the *test itself* was broken | 17:00 |
| JayF | in this case, we're breaking in the devstack setup inside ironic/devstack/lib/ironic | 17:00 |
| opendevreview | Stephen Finucane proposed openstack/ironic master: api: Allow more types for updates https://review.opendev.org/c/openstack/ironic/+/957960 | 17:00 |
| JayF | clearly in ironic's land | 17:00 |
| JayF | but we might also discover, for instance, a new DIB release is impacting (doubtful given failures) or qemu-img (again doubtful) | 17:00 |
| *** | sfinucan is now known as stephenfin | 17:02 |
| stephenfin | rpittau: cid: Small follow-up for https://review.opendev.org/c/openstack/ironic/+/945218 there ^ | 17:04 |
| stephenfin | Spotted it in the SDK CI https://zuul.opendev.org/t/openstack/builds?job_name=openstacksdk-functional-devstack-ironic&project=openstack/openstacksdk | 17:04 |
| JayF | lookin | 17:05 |
| opendevreview | Stephen Finucane proposed openstack/ironic master: api: Add schema for bios API (responses) https://review.opendev.org/c/openstack/ironic/+/952149 | 17:05 |
| clif | filed: https://bugs.launchpad.net/ironic/+bug/2120974 | 17:05 |
| JayF | good stuff, if you can get the version pin done I can land that for a CI fix, otherwise do the voting:false patch and we'll have that as an escape hatch | 17:05 |
| clif | looking | 17:09 |
| opendevreview | Clif Houck proposed openstack/ironic master: Make ironic-tempest-uefi-redfish-vmedia-4k non-voting https://review.opendev.org/c/openstack/ironic/+/957962 | 17:14 |
| JayF | yeah I was afraid it wouldn't be pinnable :( | 17:15 |
| clif | proposing that for now, will look at doing the version pin too if that's preferable | 17:15 |
| JayF | ack | 17:15 |
| JayF | that would be ideal | 17:16 |
| opendevreview | Clif Houck proposed openstack/ironic master: Make ironic-tempest-uefi-redfish-vmedia-4k non-voting https://review.opendev.org/c/openstack/ironic/+/957962 | 17:16 |
| clif | I think we would have to patch diskimage_builder somehow either in its tree or however we pull it into the devstack environment in order to peg or point it to a previous version | 17:37 |
| clif | which seems like a lot of work for something that may be fixed upstream in centos land | 17:37 |
| clif | so unless it's incredibly important I propose we just make it non-voting for now and watch for another centos image release | 17:38 |
| JayF | my bigger concern is that it's not a *bug* in the centos image, it's some kind of intended-change that has side effects on us | 17:43 |
| JayF | I'd suggest we give some time for TheJulia or stevebaker[m] to have a look before we mark it -nv | 17:43 |
| TheJulia | what is going on? | 17:44 |
| JayF | 4k job failing since ~3am (when the timestamp for the updated centos image is) erroring with 2025-08-19 03:27:45.681573 \| controller \| 2025-08-19 03:27:45.680 \| qemu-img: error while reading at byte 6186532864: Input/output error | 17:44 |
| TheJulia | yeah, we've seen that before | 17:44 |
| JayF | during the 4k image conversion piece | 17:44 |
| TheJulia | the mirror is bad | 17:44 |
| JayF | aha | 17:44 |
| TheJulia | it clears up eventually once the copy gets resynced | 17:45 |
| TheJulia | or it's a partial image on a mirror | 17:45 |
| JayF | is there room for us to do something like add sha1/md5 checking to avoid a wild goose chase in the future? | 17:45 |
| JayF | or is the sha1/md5 right and the image is just bad | 17:45 |
| TheJulia | DIB would need to do that | 17:45 |
| TheJulia | but I think that is definitely something which we should do and if we get a crazy error code... I dunno, skip the job | 17:45 |
| JayF | clif: if you wanna add that feature to dib it would be nice to have and downstream would likely benefit too; up to you | 17:46 |
| JayF | and/or implementing TheJulia's suggestion; but I have no idea how to make a devstack setup fail in a way that passes the job | 17:46 |
| * | TheJulia finishes pinning down a customer complaint regarding proliantutils | 17:47 |
| JayF | you know, I wouldn't mind keeping the lights on for that if they had given us the keys | 17:57 |
| TheJulia | yeaaah | 17:57 |
| TheJulia | This customer is unhappy that nic0 is the pxe nic, but when proliantutils uses redfish the bmc somehow boots nic1 | 18:00 |
| TheJulia | with PXE | 18:00 |
| JayF | well, good luck patching it. | 18:02 |
| TheJulia | yeah, no | 18:02 |
| JayF | I had a patch up, they asked after 4 months for a unit test | 18:02 |
| JayF | and I decided not to waste my time | 18:03 |
| TheJulia | heh | 18:03 |
| clif | JayF: which feature? trying a different mirror? | 18:04 |
| JayF | DIB checking sha1/md5 on the image and/or us configuring it to do so if it already supports it | 18:05 |
| clif | I'll take a look | 18:05 |
| JayF | just generally trying to turn that awful error into "yeah it's a bad image download" | 18:05 |
| TheJulia | and also likely detecting such a failure and likely blowing up the job in a way that we can know "oh, it was this" | 18:05 |
| JayF | the first thing I'd do is validate the assumption that the hash woulda shown this | 18:05 |
| TheJulia | JayF: like 2 weeks ago, the image on mirrors was like 268 MB for a few hours and that did the same exact thing. | 18:06 |
| clif | I'd be surprised if it doesn't already do that | 18:06 |
| JayF | but in general, yes, just somehow make our devstack failure loud in the right way so you and I, or some other victim in the future, doesn't lose time digging into a known issue | 18:06 |
| JayF | clif: me too, but sometimes things are surprising, especially in tools like DIB which usually get updated as-needed | 18:06 |
| clif | perhaps it does not after an initial skim of git grep md5/sha | 18:08 |
| clif | I'll dig some more | 18:09 |
| JayF | yeah like I said up to you | 18:09 |
| JayF | these "make CI failures less annoying" rabbitholes are infinitely deep | 18:09 |
| JayF | so sometimes you take the shot, sometimes you move on :D | 18:09 |
| clif | I'm happy to take the shot, occasionally at least | 18:09 |
| JayF | ++ | 18:13 |
| JayF | make sure to link the PR into here and/or to me | 18:13 |
| TheJulia | okay, brains | 18:31 |
| TheJulia | word from centos land | 20:46 |
| TheJulia | ( I prodded a centos board member after being on a call with someone expressing a very similar issue. ) | 20:47 |
| TheJulia | They are working on it, they are aware of it. No ETR but they have identified the root cause and are going to try and fix the root issue. | 20:49 |
| JayF | good stuff, thank you for closing that loop | 21:01 |
| clif | https://review.opendev.org/c/openstack/diskimage-builder/+/957983 | 21:21 |
| clif | works and in the process I discovered that a bunch of the most recent sha256sums for centos images are missing/zero length so that's fun | 21:22 |
| clif | idk what's going on with their infra/mirrors | 21:22 |
| clif | I have taken psychic damage | 21:23 |
| JayF | TheJulia: ^ perhaps more data for your CentOS board member contact | 21:37 |
| JayF | clif: it's okay, we have a cleric, she'll help you | 21:37 |
| janders | good morning Ironic o/ | 21:46 |
| janders | w/r/t repos/mirrors (we are also being hit by this downstream) - would there be any point in considering running our own? | 21:46 |
| JayF | opendev does mirror some items, but not everything | 21:47 |
| JayF | and often our image needs don't align with the rest of the community | 21:47 |
| JayF | so it's sorta a more complex question than it should be tbh | 21:47 |
| janders | I used to hit this waay too often in my devops days, got annoyed, set up my own with snapshotting mechanism and never looked back | 21:47 |
| janders | I would have paths with "live" mirrors as well as "frozen" ones snapshotted at a certain date | 21:48 |
| janders | in case of garbage landing in repos breaking stuff I could just repoint either the client or the "default" symlink | 21:48 |
| janders | there is complexity involved but the tradeoff is fixing this whole class of problems | 21:49 |
| janders | for those familiar with RH Satellite channel concept, I am thinking something similar but more lightweight and using CentOS tooling | 21:50 |
| janders | "only" drawbacks : 1) setup/maintenance effort 2) this needs a few TB disk space | 21:51 |
| janders | but such approach does a pretty good job of swapping random, unpredictable but intense CI pain to constant background pain that can be lived with so to speak :) | 21:52 |
| TheJulia | I'm 95% sure opendev doesn't actually mirror the qcow2 images | 22:21 |
| TheJulia | and they are trying to hold without expanding the AFS mirrors | 22:22 |
| TheJulia | so centos is sort of the first victim to suffer mirroring wise | 22:22 |
| TheJulia | clif: that is the issue, basically a race condition aiui | 22:31 |
| opendevreview | Julia Kreger proposed openstack/ironic master: trivial: fix benchmark data generation script https://review.opendev.org/c/openstack/ironic/+/955099 | 22:57 |
| opendevreview | Julia Kreger proposed openstack/ironic master: Fix the ability to escape service fail https://review.opendev.org/c/openstack/ironic/+/956972 | 23:08 |
| TheJulia | janders: if you want to propose a doc change after ^, that would help. Just to keep clarity | 23:09 |
| janders | TheJulia ACK. I stacked the doco change on top of ^^, will have a look at the latest revision of 956972 shortly | 23:13 |
| janders | (I forgot that initially and was wondering why state machine diagram didn't update - doh! :) ) | 23:14 |
| TheJulia | oh, let me do that as a fresh change | 23:50 |
| TheJulia | uhhhhhhhhh | 23:50 |
| opendevreview | Julia Kreger proposed openstack/ironic master: Update the state machine diagram https://review.opendev.org/c/openstack/ironic/+/957990 | 23:51 |
| TheJulia | janders: ^ | 23:51 |
| opendevreview | Steve Baker proposed openstack/networking-generic-switch master: WIP Add security group support to ovs https://review.opendev.org/c/openstack/networking-generic-switch/+/956519 | 23:56 |
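
The checksum idea discussed above (JayF's "DIB checking sha1/md5 on the image", TheJulia's "blowing up the job in a way that we can know", and clif's finding that some published sha256sums are zero length) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the diskimage-builder change linked in the log and not DIB's API: `ChecksumError`, `parse_sha256`, `sha256_of` and `verify_image` are hypothetical names, and the checksum text is assumed to use either the GNU `<digest>  <filename>` layout or the BSD `SHA256 (<filename>) = <digest>` layout.

```python
import hashlib
import re
from pathlib import Path


class ChecksumError(Exception):
    """Hypothetical error: the image could not be verified."""


# A standalone run of exactly 64 hex characters, i.e. a SHA-256 digest.
_HEX64 = re.compile(r"\b[0-9a-fA-F]{64}\b")


def parse_sha256(checksum_text: str, image_name: str) -> str:
    """Return the expected digest for image_name from a checksum file.

    Accepts GNU style ("<digest>  <name>") and BSD style
    ("SHA256 (<name>) = <digest>") lines. Both layouts are assumptions.
    """
    if not checksum_text.strip():
        # Covers the "missing/zero length" checksum files seen in the log.
        raise ChecksumError("checksum file is empty (zero length)")
    for line in checksum_text.splitlines():
        if image_name in line:
            match = _HEX64.search(line)
            if match:
                return match.group(0).lower()
    raise ChecksumError(f"no SHA-256 entry found for {image_name}")


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash the file in chunks so multi-GB images are not read into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(image: Path, checksum_text: str) -> None:
    """Raise ChecksumError with a loud, specific message on any mismatch."""
    expected = parse_sha256(checksum_text, image.name)
    actual = sha256_of(image)
    if actual != expected:
        raise ChecksumError(
            f"bad image download: {image.name} has sha256 {actual}, "
            f"expected {expected}"
        )
```

Running a check like this before `qemu-img convert` would turn the opaque "Input/output error" into a message that names the download as the problem, provided the published digest is itself correct. As JayF notes in the log, that assumption would need validating first.
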