JayF | thank you | 00:19 |
---|---|---|
iurygregory | https://zuul.opendev.org/t/openstack/build/566e49467d9344c4a890fb748bfefaef/log/controller/logs/ironic-bm-logs/node-1_console_log.txt doesn't look good, trying to find more info on the logs to see how we can fix | 02:58 |
iurygregory | ideas are welcome ofc | 02:58 |
TheJulia | Jammy ipxe whee | 03:15 |
TheJulia | I suspect dhcp, but it would need to be checked in the neutron dhcp logs | 03:15 |
rpittau | good morning ironic! o/ | 06:24 |
opendevreview | Dmitry Tantsur proposed openstack/bifrost stable/yoga: CI: Update cached cirros image to 0.5.3 https://review.opendev.org/c/openstack/bifrost/+/887656 | 08:07 |
dtantsur | next one ^^ | 08:07 |
samuelkunkel[m] | Good morning ironic | 08:10 |
samuelkunkel[m] | I have a question: we have nodes that cannot register themselves to ironic via the IPA because we have some other nodes in maintenance (they get their ILOs exchanged by a "combined ILO card" - hpe stuff). During that time a DELL node cannot register itself, logging: | 08:11 |
rpittau | dtantsur: approved, yoga is still in Maintained status, so should have upgrade jobs, I guess | 08:11 |
dtantsur | yep (if they don't pass, we'll think) | 08:11 |
samuelkunkel[m] | The following failures happened during running pre-processing hooks:\nNode not found hook failed: Failed to resolve the hostname (ilocz22410kgd.mc.infra.eu01.int.stackit.cloud) for node 8660cb5e-6308-4d2d-9d28-6dff2f08a6e3. | 08:11 |
samuelkunkel[m] | So the Inspector complains that there is an HPE Node not reachable via its redfish (ilo) | 08:12 |
samuelkunkel[m] | and therefore responds with a 400 to a DELL node wanting to register itself | 08:12 |
dtantsur | I don't remember this code well, but it probably tries to de-duplicate nodes | 08:12 |
samuelkunkel[m] | is this intended because the BMC is not reachable? | 08:12 |
dtantsur | And probably tries to resolve hostnames for all Ironic nodes | 08:12 |
samuelkunkel[m] | ahh | 08:12 |
dtantsur | samuelkunkel[m]: we definitely need a better error message. I don't know if we can safely ignore the error though... | 08:17 |
samuelkunkel[m] | for now I remove all the nodes in maintenance | 08:25 |
samuelkunkel[m] | as I will redeploy them afterwards anyway | 08:25 |
samuelkunkel[m] | after removing the nodes in question (16) - everything works fine | 08:26 |
samuelkunkel[m] | noted to myself: having nodes without redfish / bmc connectivity is not a good state | 08:26 |
opendevreview | Jacob Anders proposed openstack/ironic master: [WIP] Retry connecting vmedia through a DVD device if available. https://review.opendev.org/c/openstack/ironic/+/887665 | 10:20 |
opendevreview | Jacob Anders proposed openstack/ironic master: [WIP] Retry connecting vmedia through a DVD device if available. https://review.opendev.org/c/openstack/ironic/+/887665 | 10:49 |
dtantsur | My first in-band inspection without ironic-inspector succeeded \o/ | 10:59 |
dtantsur | (Ports only, masghar is looking into the remaining hooks) | 10:59 |
janders | \o/ | 11:01 |
opendevreview | Dmitry Tantsur proposed openstack/bifrost master: Make inspector.ipxe respect inspector_debug https://review.opendev.org/c/openstack/bifrost/+/887667 | 11:03 |
dtantsur | Error: iLO get_power_status failed, error: RIBCL is disabled | 11:06 |
dtantsur | So, apparently the ilo hardware type may not work with ilo6 at all. TheJulia ^^^ | 11:06 |
iurygregory | good morning Ironic | 11:31 |
iurygregory | oh WOW O.o | 11:33 |
iurygregory | re ilo hardware type may not work with ilo6 | 11:33 |
samuelkunkel[m] | generic redfish works pretty good with ilo6 | 11:33 |
iurygregory | thank god | 11:34 |
iurygregory | \o/ | 11:35 |
opendevreview | Verification of a change to openstack/bifrost stable/yoga failed: CI: Update cached cirros image to 0.5.3 https://review.opendev.org/c/openstack/bifrost/+/887656 | 11:37 |
rpittau | samuelkunkel[m]: I really hope that's true :) | 12:17 |
dtantsur | samuelkunkel[m]: nice, thanks for confirming! I wonder if we need to update the docs... | 12:21 |
samuelkunkel[m] | So I have currently only 5 ILO6 Nodes, and TheJulia did a fix (https://bugs.launchpad.net/sushy/+bug/2016307). | 12:22 |
samuelkunkel[m] | Since then I did not see any issues. | 12:22 |
dtantsur | aha, so there was a bug fixed, good to know | 12:29 |
dtantsur | (someone is asking me about wallaby) | 12:29 |
samuelkunkel[m] | we are using zed - so not really able to comment about that | 12:29 |
dtantsur | samuelkunkel[m]: is this the fix in question? https://review.opendev.org/q/Ib78198a60a8924de934bda0c9a0b9298541496cf | 12:30 |
samuelkunkel[m] | yes | 12:30 |
samuelkunkel[m] | I have backported it in our environment if I recall correctly | 12:32 |
samuelkunkel[m] | as it's not merged in the zed equivalent of sushy | 12:32 |
dtantsur | samuelkunkel[m]: yeah, I'm looking at it, but it's causing merge conflicts | 12:36 |
dtantsur | not even this patch, but the patch that is required for it | 12:37 |
dtantsur | https://review.opendev.org/c/openstack/sushy/+/867675 | 12:37 |
opendevreview | Dmitry Tantsur proposed openstack/sushy stable/zed: Retry on ilo state error https://review.opendev.org/c/openstack/sushy/+/882746 | 12:43 |
iurygregory | does anyone remember why we have IRONIC_VM_COUNT: 4 in standalone jobs? but our ironic-base is defined with 2? grenade also has 4 and most other jobs have IRONIC_VM_COUNT as 3 (some inspector jobs only have 1) | 13:28 |
dtantsur | iurygregory: grenade and standalone jobs run the most tests | 13:28 |
TheJulia | good morning | 13:29 |
TheJulia | dtantsur: wow, I've kind of floated we detect and warn on ilo usage | 13:29 |
dtantsur | Or even outright refuse to operate if we can confirm that ilo6 is not going to work. | 13:30 |
TheJulia | The other challenge is the next generation proliants can be ordered with openbmc instead | 13:30 |
dtantsur | which also won't have RIBCL, I assume? | 13:30 |
TheJulia | correct | 13:31 |
rpittau | TheJulia: so it's confirmed next gen proliantutils will support openbmc ? | 13:31 |
samuelkunkel[m] | you can order proliant g11 with either ILO or openbmc | 13:31 |
samuelkunkel[m] | it's basically up to the buyer | 13:31 |
TheJulia | rpittau: no, it is not confirmed | 13:31 |
samuelkunkel[m] | the hpe rep "strongly advises" to not do this ;) | 13:31 |
rpittau | :D | 13:32 |
TheJulia | heh | 13:32 |
samuelkunkel[m] | do we have any case of people already doing that? | 13:32 |
TheJulia | The hardware for it is just not available in the general market | 13:32 |
rpittau | I'm not aware | 13:32 |
TheJulia | sometime in Q3 AIUI | 13:32 |
TheJulia | dtantsur: was that an error/exception we could catch on w/r/t the ilo6?! | 13:42 |
dtantsur | possibly, I don't have many details | 13:43 |
iurygregory | TheJulia, I also thought about dhcp, I saw a few errors in the neutron dhcp logs but after each error there was a warning https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_566/885276/6/check/ironic-standalone-redfish/566e494/controller/logs/screen-q-dhcp.txt "ERROR neutron.agent.linux.external_process [-] dnsmasq for dhcp with uuid 323e34a1-6ce9-4413-84f6-6724b78fb51e not found. The process | 13:51 |
iurygregory | should not have died" "WARNING neutron.agent.linux.external_process [-] Respawning dnsmasq for uuid 323e34a1-6ce9-4413-84f6-6724b78fb51e" | 13:51 |
opendevreview | Merged openstack/bifrost stable/yoga: CI: Update cached cirros image to 0.5.3 https://review.opendev.org/c/openstack/bifrost/+/887656 | 14:04 |
opendevreview | Dmitry Tantsur proposed openstack/bifrost stable/zed: CI: Update cached cirros image to 0.5.3 https://review.opendev.org/c/openstack/bifrost/+/887699 | 14:05 |
dtantsur | next one ^^^ | 14:05 |
dtantsur | see you tomorrow o/ | 14:05 |
TheJulia | oh, lovely | 14:17 |
TheJulia | iurygregory: if it did die, that would totally do it | 14:17 |
iurygregory | even if it restarts right after? humm | 14:17 |
iurygregory | we have the chance to fail during that time... | 14:18 |
opendevreview | Julia Kreger proposed openstack/ironic master: DNM Enable OVN https://review.opendev.org/c/openstack/ironic/+/885087 | 14:23 |
TheJulia | indeed | 14:23 |
iurygregory | I shouldn't have opened devstack/lib/ironic in this patch... | 14:25 |
TheJulia | heh | 14:26 |
TheJulia | I'd really like to get us mainly off of dnsmasq based jobs because of stability reasons | 14:26 |
iurygregory | ++ | 14:26 |
opendevreview | Daniel Bengtsson proposed openstack/metalsmith master: Display a message if the undercloudrc file is not loaded https://review.opendev.org/c/openstack/metalsmith/+/887704 | 14:42 |
rpittau | good night! o/ | 15:30 |
TheJulia | dtantsur: was that rib disabled error from the ilo hardware type or the ilo5 hardware type? | 15:41 |
JayF | Are there any changes for Ironic libraries we need to land before b-2 tomorrow? | 16:23 |
TheJulia | I don't believe so | 16:40 |
JayF | good stuff, I'm going to make sure all the release prs are up to date then | 16:40 |
TheJulia | k | 16:40 |
JayF | I'm going to be gone some this afternoon; flexing out some time since I fly to UK early Saturday :| | 16:41 |
TheJulia | Makes sense, have a good flight | 16:51 |
JayF | Yeah, will be glad to have travel done for the rest of the year | 16:52 |
JayF | I'm checking r/n to make sure there's nothing about travelling internationally by air that's changed in the years since I've done it lol | 16:52 |
TheJulia | eek, for the UK too | 16:53 |
* TheJulia | wonders if you need an ETA | 16:53 |
JayF | everything I've seen says I need no advance anything | 16:56 |
JayF | eta is a useful keyword though, now I'm in another rabbithole lol | 16:58 |
JayF | looks like not required yet, but I can get one to make my life easier (which I will) | 17:00 |
JayF | not yet, but soon you can get them | 17:02 |
JayF | so I should check each time I gotta go, apparently | 17:02 |
opendevreview | Julia Kreger proposed openstack/ironic-python-agent master: DNM: Test logging number of bytes downloaded https://review.opendev.org/c/openstack/ironic-python-agent/+/887729 | 17:06 |
TheJulia | so, it looks like if the connection breaks mid download, we don't actually notice anything (nor can we actually figure it out) | 17:07 |
TheJulia | we think the download is done thanks to http | 17:07 |
TheJulia | ... also this is a gray area because we don't have much we can do and we don't test python-requests returning content so the only way for me to validate my change works to just log is to put it through CI :\ | 17:11 |
TheJulia | 2023-07-05T15:21:31.595Z|00018|lflow|WARN|error parsing actions "reg0[3] = put_dhcp_opts(offerip = 10.1.0.17, bootfile_name = "http://173.231.255.103:3928/boot.ipxe", bootfile_name_alt = "undionly.kpxe", [trim] ; next;": Syntax error at `bootfile_name_alt' expecting DHCPv4 option name. | 17:23 |
opendevreview | Julia Kreger proposed openstack/ironic master: DNM Enable OVN https://review.opendev.org/c/openstack/ironic/+/885087 | 17:24 |
TheJulia | iurygregory: you around? | 18:16 |
opendevreview | Julia Kreger proposed openstack/ironic master: DNM Enable OVN https://review.opendev.org/c/openstack/ironic/+/885087 | 18:31 |
stevebaker[m] | Good morning | 20:25 |
TheJulia | good morning stevebaker[m] | 20:25 |
opendevreview | Julia Kreger proposed openstack/ironic master: DNM Enable OVN https://review.opendev.org/c/openstack/ironic/+/885087 | 21:50 |
iurygregory | TheJulia, now I'm | 22:19 |
iurygregory | I was at the doctor | 22:20 |
iurygregory | the errors above are from the attempt to use OVN? | 22:21 |
TheJulia | no worries | 22:29 |
TheJulia | yeah, from ovn without latest version | 22:29 |
opendevreview | Julia Kreger proposed openstack/ironic-python-agent master: Log the number of bytes downloaded https://review.opendev.org/c/openstack/ironic-python-agent/+/887729 | 22:40 |
TheJulia | so ^^^ is a result of a customer case where I can't tell if a download is getting interrupted mid stream, but it sure looks like it | 22:51 |
TheJulia | checksums don't match and that is the simplest answer I can think of | 22:51 |
iurygregory | I think I saw this discussion on slack =) | 22:51 |
TheJulia | iurygregory: I found where they tried iscsi, and the transfer io errored about half way | 22:52 |
iurygregory | ouch >.< | 22:53 |
iurygregory | will add the patch to my list of review | 22:53 |
TheJulia | thanks | 22:53 |
TheJulia | we really have no way to go "we know what the size is" up front either | 22:54 |
TheJulia | and the body/content size header is optional | 22:54 |
TheJulia | so..... | 22:54 |
iurygregory | "it's fine" | 22:54 |
TheJulia | logging seems like the minimal and maximal | 22:54 |
TheJulia | pretty much | 22:54 |
iurygregory | <insert the gif here> | 22:54 |
iurygregory | yup, log would totally help | 22:54 |
* TheJulia | adds some more fire | 22:54 |
JayF | I want the Ironic installshield(tm) progress bar | 23:09 |
JayF | where it goes 1%, up through 10-15% in small steps, sorta stays there, jumps to 45% where it stays for three hours, then jumps to 99% and stays there for another hour, then it's done | 23:10 |
* JayF | wonders if anyone is old enough to remember the days of a software install taking hours | 23:10 |
TheJulia | “Please insert diskette 23” | 23:13 |
JayF | I got an old powerbook a couple years back, only real interactability with it is via floppies, and I have the hardware to image them | 23:15 |
opendevreview | Merged openstack/bifrost stable/zed: CI: Update cached cirros image to 0.5.3 https://review.opendev.org/c/openstack/bifrost/+/887699 | 23:15 |
JayF | so I did a few of those, getting newer mac os system and old apps onto it | 23:15 |
JayF | fun story: only about 9 out of 10 new-old-stock floppies still work, and those are mostly gone. Nowadays you just buy giant bulk quantities of floppies and get very, very low yields. yay. | 23:16 |
TheJulia | My parents had an autocad license…. We had boxes of diskettes | 23:16 |
JayF | we had a family friend who believed in copying that floppy LOL | 23:17 |
JayF | I think I have a sealed OEM floppy copy of Windows NT 3.51 somewhere in my retro stuff | 23:17 |
TheJulia | we could copy the floppies, but when you had hardware keys... | 23:19 |
JayF | dead serious, I remember my dad taking NASCAR Racing (by Papyrus, the best NASCAR game made that ended up turning into iRacing, yes, seriously) | 23:19 |
JayF | copying the floppies, but the copy-protect was a brochure about nascar tracks | 23:19 |
JayF | with dark-red background and black text so you couldn't use a B&W copier | 23:20 |
JayF | my dad went down to the print shop and paid like $3 a copy to make copies of that guide for all his work buddies lol | 23:20 |
JayF | I still can identify the outline of some tracks from doing the copy protection loading into that game | 23:20 |
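
Note on the 17:07 and 22:51-22:54 discussion about logging the number of bytes downloaded: the idea can be roughly sketched as below. This is not the code from the ironic-python-agent change under review; `consume_and_count` and its parameters are hypothetical names used only for illustration. It counts the bytes actually received from a stream of chunks, hashes them, and compares the count against a declared Content-Length only when one is present, since that header is optional.

```python
import hashlib
import logging

LOG = logging.getLogger(__name__)


def consume_and_count(chunks, declared_length=None, algo="sha256"):
    """Consume an iterable of byte chunks, counting and hashing them.

    Hypothetical helper, for illustration only.
    declared_length is the raw Content-Length header value, or None
    when the server did not send one.
    Returns (received_bytes, expected_bytes_or_None, hex_digest).
    """
    hasher = hashlib.new(algo)
    received = 0
    for chunk in chunks:
        hasher.update(chunk)
        received += len(chunk)

    expected = int(declared_length) if declared_length is not None else None
    if expected is None:
        # Nothing to compare against, so logging is all we can do.
        LOG.info("Downloaded %d bytes (no Content-Length sent)", received)
    elif received != expected:
        # A short read is consistent with a connection dropped mid-stream.
        LOG.warning("Downloaded %d bytes, Content-Length declared %d",
                    received, expected)
    else:
        LOG.info("Downloaded %d bytes, matching Content-Length", received)
    return received, expected, hasher.hexdigest()
```

With python-requests, the chunks could come from `response.iter_content(chunk_size=...)` on a request made with `stream=True`, and the declared length from `response.headers.get("Content-Length")`. One caveat: if the response is compressed in transit, the decoded byte count may legitimately differ from Content-Length, so a mismatch is a hint to investigate rather than proof of truncation.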