*** pmannidi is now known as pmannidi|brb | 00:40 | |
*** pmannidi|brb is now known as pmannidi | 01:21 | |
*** pmannidi is now known as pmannidi|brb | 01:48 | |
*** pmannidi|brb is now known as pmannidi | 02:30 | |
arne_wiebalck | Good morning janders and Ironic! | 07:18 |
---|---|---|
iurygregory | good morning janders arne_wiebalck and Ironic o/ | 08:00 |
arne_wiebalck | hey iurygregory o/ | 08:01 |
rpittau | good morning ironic! o/ | 08:04 |
rpittau | ooook something's off with ironic-standalone-ipa-src and metalsmith-integration-ipa-src, and it looks like ironic-standalone has become (more) unstable recently | 08:20 |
opendevreview | Riccardo Pittau proposed openstack/ironic-python-agent stable/xena: Delete EFI boot entry duplicate labels first https://review.opendev.org/c/openstack/ironic-python-agent/+/816489 | 08:22 |
iurygregory | good morning rpittau o/ | 08:26 |
rpittau | hey iurygregory :) | 08:26 |
iurygregory | rpittau, well about ironic-standalone-ipa-src https://opendev.org/openstack/ironic-python-agent/commit/a67807b9b6017a8a1dc250bd7a7141e6728544cb | 08:28 |
iurygregory | and seems like we have the job voting in ipa-builder... but not in IPA =) | 08:29 |
rpittau | yeah but it's giving more issues recently, check the stats | 08:31 |
rpittau | also the metalsmith job is voting | 08:31 |
rpittau | I'm actually thinking to make them voting again | 08:32 |
iurygregory | hummm 39/50 SUCCESS | 1/50 ABORTED | 1/50 TIMED_OUT | 8/50 FAILURE | 08:34 |
iurygregory | for ironic-standalone-ipa-src | 08:35 |
iurygregory | considering only ipa-builder repo | 08:35 |
opendevreview | Riccardo Pittau proposed openstack/ironic-python-agent master: Use json for lsblk output https://review.opendev.org/c/openstack/ironic-python-agent/+/775391 | 08:36 |
iurygregory | oh 1/50 CANCELED =) | 08:36 |
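
The tallies quoted above add up to the 50 builds mentioned, which a few lines of Python can confirm. This is only a quick arithmetic check on the numbers from the chat; the variable names are illustrative and nothing here queries Zuul.

```python
from collections import Counter

# Build results for ironic-standalone-ipa-src as quoted in the chat
# (ipa-builder repo only, last 50 runs).
results = Counter(SUCCESS=39, FAILURE=8, ABORTED=1, TIMED_OUT=1, CANCELED=1)

total = sum(results.values())
success_rate = results["SUCCESS"] / total
non_success = total - results["SUCCESS"]

print(f"{total} builds, {success_rate:.0%} successful, {non_success} not")
```

Roughly one build in five not succeeding is in line with rpittau's impression that the job has become less stable.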
rpittau | the error is during raid creation "DEBUG ironic_lib.utils [-] Command stderr is: "mdadm: cannot open /dev/vdb1: No such file or directory" | 08:43 |
rpittau | the partition seems to be created | 08:45 |
rpittau | maybe partition table not updated? | 08:47 |
iurygregory | didn't we have a case like this in the past? (the partition table didn't get updated) | 08:48 |
rpittau | yep, trying to remember | 08:48 |
* iurygregory wondering if arne_wiebalck remember something | 08:49 | |
rpittau | we do run partx before calling mdadm https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/hardware.py#L2169 | 08:50 |
iurygregory | omg! | 08:50 |
iurygregory | Necessary, if we want to avoid hitting an error when creating the mdadm array below 'mdadm: cannot open /dev/nvme1n1p1: No such file or directory' | 08:51 |
iurygregory | :D | 08:51 |
rpittau | yeeeeep | 08:52 |
iurygregory | can we cry now or just later? | 08:55 |
*** pmannidi is now known as pmannidi|brb | 08:55 | |
rpittau | also the last commit was done to prevent such a case https://opendev.org/openstack/ironic-python-agent/commit/9d707e9f4bab40109b7e29df2136e86d65325ea3 | 08:59 |
arne_wiebalck | iurygregory: rpittau: yes | 09:08 |
arne_wiebalck | there is an issue when the device nodes are not yet created | 09:08 |
arne_wiebalck | I've put in 2 patches lately | 09:09 |
arne_wiebalck | https://review.opendev.org/c/openstack/ironic-python-agent/+/816192 | 09:09 |
arne_wiebalck | https://review.opendev.org/c/openstack/ironic-python-agent/+/812470 | 09:10 |
iurygregory | https://review.opendev.org/c/openstack/ironic-python-agent/+/816192 I would say would probably help (but metalsmith failed /facepalm) | 09:11 |
rpittau | right, so need to see why the metalsmith job is now failing | 09:11 |
iurygregory | yeah | 09:11 |
arne_wiebalck | the comment has the error message, but the fix is not sufficient | 09:11 |
rpittau | ok, I see | 09:13 |
rpittau | arne_wiebalck: I'm wondering if we should make rescan_device public, move that to hardware.py and call that directly instead of lines 2156-2170 with your change that adds the call to blockdev | 09:25 |
arne_wiebalck | rpittau: sorry, which lines are these? | 09:27 |
* arne_wiebalck has a meeting now ... | 09:28 | |
rpittau | arne_wiebalck: https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/hardware.py#L2156 | 09:28 |
rpittau | maybe add that after the except on L2171 | 09:28 |
arne_wiebalck | rpittau: yes, I was thinking this as well | 09:35 |
arne_wiebalck | rpittau: we should pull out the big sweep every time | 09:35 |
rpittau | yeah | 09:35 |
arne_wiebalck | acc. to the internet, all these calls may be necessary in different situations | 09:36 |
arne_wiebalck | all of them work sometimes | 09:36 |
arne_wiebalck | :-/ | 09:36 |
rpittau | exactly, and they're usually fast if everything's right | 09:36 |
arne_wiebalck | yep | 09:36 |
arne_wiebalck | we probably have more places where we call udev settle for similar reasons | 09:37 |
arne_wiebalck | but these two are very similar already | 09:37 |
arne_wiebalck | and already justify your proposal I think | 09:37 |
rpittau | yes, it already makes sense for just those two | 09:38 |
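
The "big sweep" idea discussed above could look roughly like the sketch below: one helper that runs every rescan step each time, on the assumption that each step is cheap when nothing needs doing. The helper name `full_rescan`, the injected `run` callable, and the order of the commands are assumptions made for illustration; this is not the actual `rescan_device` code in ironic-python-agent. The commands themselves (`partx -u`, `udevadm settle`, `blockdev --rereadpt`) are standard util-linux and udev tools.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(list(cmd), check=True)


def full_rescan(device: str, run: Runner = _run) -> List[List[str]]:
    """Run every rescan step for ``device`` (hypothetical helper).

    Each step is best effort: a failing step is recorded and the sweep
    continues. Returns the list of commands that failed.
    """
    commands = [
        ["partx", "-u", device],            # update the kernel's partition view
        ["udevadm", "settle"],              # wait for pending udev events
        ["blockdev", "--rereadpt", device],  # ask the kernel to re-read the table
        ["udevadm", "settle"],              # wait again for new device nodes
    ]
    failed = []
    for cmd in commands:
        try:
            run(cmd)
        except (subprocess.CalledProcessError, OSError):
            failed.append(cmd)
    return failed
```

Injecting the runner keeps the helper testable without root access or a real block device; a caller that already has its own command wrapper could pass that in instead.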
dtantsur | morning ironic | 10:39 |
iurygregory | morning dtantsur o/ | 10:49 |
opendevreview | Aija Jauntēva proposed x/sushy-oem-idrac master: Fix moved tests https://review.opendev.org/c/x/sushy-oem-idrac/+/816641 | 11:02 |
*** dviroel|rover|out is now known as dviroel|rover | 11:12 | |
dtantsur | does anyone know what happened with the metalsmith jobs? iurygregory, rpittau? | 11:24 |
dtantsur | hmm, the statistics are not too bad https://zuul.openstack.org/builds?job_name=metalsmith-integration-ipa-src | 11:24 |
dtantsur | maybe a temporary glitch | 11:24 |
rpittau | could be just a glitch, I got worried because it failed 3 times in a row in https://review.opendev.org/c/openstack/ironic-python-agent/+/816192 | 11:25 |
iurygregory | dtantsur, yeah I agree, I checked in the zuul the history of last 50 runs also | 11:25 |
dtantsur | InstanceDeployFailure: Failed to install a bootloader when deploying node c7c7e554-453a-4842-8438-756135047c11. Error: No partition with UUID 6b1e3209-4f48-46de-a022-a6ea69099155 found on device /dev/vda | 11:25 |
rpittau | that | 11:26 |
dtantsur | hmmm | 11:26 |
iurygregory | maybe something specific in a given cloud? (I haven't checked if there is something common between the failures) | 11:27 |
* dtantsur fixes the double error handling in the meantime | 11:43 | |
dtantsur | File "/home/dtantsur/Projects/ironic/.tox/py3/lib/python3.10/site-packages/eventlet/timeout.py", line 166, in wrap_is_timeout | 11:45 |
dtantsur | base.is_timeout = property(lambda _: True) | 11:45 |
dtantsur | TypeError: cannot set 'is_timeout' attribute of immutable type 'TimeoutError' | 11:45 |
dtantsur | DEAR EVENTLET!!1 | 11:45 |
dtantsur | has anyone tried running our unit tests on python 3.10? | 11:45 |
rpittau | I did | 11:54 |
rpittau | didn't see that though | 11:54 |
rpittau | mmm I ran it before upgrading to F35, but it was py310 anyway | 11:55 |
dtantsur | maybe F35 has something newer? dunno | 11:56 |
rpittau | oh wow now I see that failing in ipa | 11:57 |
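
The traceback above boils down to assigning an attribute on a built-in exception type, which Python does not allow. The sketch below reproduces that behaviour without eventlet; `can_patch_attribute` and `MyTimeout` are hypothetical names used only for the demonstration. One plausible reason it surfaces on 3.10 is that `socket.timeout` became an alias of the built-in `TimeoutError` in that release, but that link to the eventlet failure is an assumption, not something confirmed in the chat.

```python
def can_patch_attribute(exc_type) -> bool:
    """Try the same assignment as in the traceback; report whether it worked."""
    try:
        exc_type.is_timeout = property(lambda _: True)
    except TypeError:
        # Built-in (immutable) types reject attribute assignment.
        return False
    return True


class MyTimeout(Exception):
    """An ordinary Python-level exception class, which is mutable."""


print(can_patch_attribute(TimeoutError))  # False: built-in type
print(can_patch_attribute(MyTimeout))     # True: user-defined class
```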
rpittau | I have to split now, see you tomorrow folks! o/ | 12:02 |
dtantsur | see you | 12:04 |
dtantsur | ehmm, it looks like lsblk actually loses partition UUID Oo | 12:04 |
dtantsur | before re-reading: https://zuul.openstack.org/build/963864600d774599a59e835b60d3d0ef/log/controller/ironic-bm-logs/node-0_no_ansi_2021-11-03-20:17:45.log#2961 | 12:04 |
dtantsur | after: https://zuul.openstack.org/build/963864600d774599a59e835b60d3d0ef/log/controller/ironic-bm-logs/node-0_no_ansi_2021-11-03-20:17:45.log#2987 | 12:04 |
dtantsur | arne_wiebalck: can it be a side effect of rereadpt? | 12:05 |
dtantsur | I suspect your patch may be breaking it | 12:06 |
dtantsur | funnily, the lsblk output in deploy logs does contain the required UUID | 12:08 |
dtantsur | I wonder if we should reorder partition read with udev settle? is it somehow asynchronous? | 12:08 |
iurygregory | dtantsur, I just got the same error running ipa in 3.10 locally (F34) | 12:25 |
dtantsur | sigh | 12:28 |
TheJulia | good morning | 12:43 |
arne_wiebalck | dtantsur: o.O (no idea) | 12:44 |
dtantsur | morning TheJulia | 12:44 |
arne_wiebalck | dtantsur: the lsblk calls are slightly different | 12:44 |
arne_wiebalck | dtantsur: not sure this is relevant | 12:45 |
dtantsur | arne_wiebalck: yeah, but they shouldn't return so different results | 12:45 |
* arne_wiebalck has to step away for 45mins, will have a look after | 12:46 | |
arne_wiebalck | dtantsur: not sure how re-reading the pt would make UUIDs disappear | 12:46 |
dtantsur | yeah, sounds crazy | 12:46 |
janders | see you tomorrow Ironic o/ | 12:47 |
arne_wiebalck | bye janders o/ | 12:49 |
arne_wiebalck | dtantsur: I can try to reproduce this on real hardware | 12:50 |
arne_wiebalck | dtantsur: just add the lsblk calls before and after rescan_device() | 12:50 |
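
A before/after comparison like the one proposed above can be automated by capturing `lsblk --json -o NAME,PARTUUID <device>` on both sides of the rescan and diffing the results. The sketch below only parses and compares two such captures; the function names and the sample JSON are made up for illustration (the UUID is the one from the error message earlier, and placing it on `vda1` is an assumption).

```python
import json
from typing import Dict, List, Optional


def partuuids(lsblk_json: str) -> Dict[str, Optional[str]]:
    """Map device name to PARTUUID from ``lsblk --json -o NAME,PARTUUID``."""
    found: Dict[str, Optional[str]] = {}

    def walk(nodes):
        for node in nodes:
            found[node["name"]] = node.get("partuuid")
            walk(node.get("children", []))

    walk(json.loads(lsblk_json)["blockdevices"])
    return found


def lost_uuids(before: Dict[str, Optional[str]],
               after: Dict[str, Optional[str]]) -> List[str]:
    """Names that had a PARTUUID before the rescan but not after it."""
    return sorted(name for name, uuid in before.items()
                  if uuid and not after.get(name))


# Hypothetical captures taken before and after the rescan.
BEFORE = ('{"blockdevices": [{"name": "vda", "partuuid": null, "children": '
          '[{"name": "vda1", '
          '"partuuid": "6b1e3209-4f48-46de-a022-a6ea69099155"}]}]}')
AFTER = ('{"blockdevices": [{"name": "vda", "partuuid": null, "children": '
         '[{"name": "vda1", "partuuid": null}]}]}')

print(lost_uuids(partuuids(BEFORE), partuuids(AFTER)))  # ['vda1']
```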
opendevreview | Dmitry Tantsur proposed openstack/ironic master: Avoid handling a deploy failure twice https://review.opendev.org/c/openstack/ironic/+/816684 | 12:52 |
arne_wiebalck | Good morning, TheJulia | 12:53 |
opendevreview | Michal Nasiadka proposed openstack/ironic-python-agent master: Add LVM based image support to MD scenario https://review.opendev.org/c/openstack/ironic-python-agent/+/816685 | 13:03 |
opendevreview | Michal Nasiadka proposed openstack/ironic-python-agent master: Add LVM based image support to MD scenario https://review.opendev.org/c/openstack/ironic-python-agent/+/816685 | 13:06 |
*** pmannidi|brb is now known as pmannidi | 13:10 | |
dtantsur | could I get a 2nd +2 on https://review.opendev.org/c/openstack/ironic-python-agent/+/815861 please? | 13:21 |
TheJulia | I can look at outstanding changes on ipa a little later | 13:24 |
*** pmannidi is now known as pmannidi|AFK | 13:30 | |
*** dviroel|rover is now known as dviroel|lunch|appt|afk | 15:03 | |
opendevreview | Merged openstack/ironic-python-agent master: Always include the oslo_log log file in ramdisk logs https://review.opendev.org/c/openstack/ironic-python-agent/+/815861 | 15:14 |
opendevreview | Dmitry Tantsur proposed openstack/sushy master: Migrate System constants to enums https://review.opendev.org/c/openstack/sushy/+/816717 | 15:55 |
opendevreview | Dmitry Tantsur proposed openstack/ironic-python-agent stable/xena: Always include the oslo_log log file in ramdisk logs https://review.opendev.org/c/openstack/ironic-python-agent/+/816664 | 15:56 |
dtantsur | the migration to enums will break ironic unit tests, le sigh | 15:58 |
opendevreview | Julia Kreger proposed openstack/ironic master: Test nova-compute fix https://review.opendev.org/c/openstack/ironic/+/813264 | 15:58 |
dtantsur | what is worse, it will also break some actual code where we assume constants are strings. double sigh. | 15:59 |
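
The breakage described above is the usual consequence of replacing string constants with plain `enum.Enum` members: comparisons against raw strings stop matching. The sketch below shows the effect with a made-up `PowerState` enum; it is not sushy's actual definition, and the constant names are hypothetical.

```python
import enum


class PowerState(enum.Enum):
    """Hypothetical enum standing in for a migrated constant group."""
    ON = "on"
    OFF = "off"


LEGACY_STRING = "on"  # what older code might still compare against

print(PowerState.ON == LEGACY_STRING)        # False: member is not a str
print(PowerState.ON.value == LEGACY_STRING)  # True: compare the value
print(PowerState("on") is PowerState.ON)     # True: look up by value
```

Code that stores, serializes, or compares these constants as strings would need to go through `.value` or convert with `PowerState(...)` at the boundary.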
dtantsur | TheJulia: do we have a single reason to enable only "pxe", not "ipxe" by default? | 16:05 |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: Enable Redfish and iPXE by default https://review.opendev.org/c/openstack/ironic/+/816721 | 16:15 |
dtantsur | anyway, here goes ^^^ | 16:15 |
TheJulia | dtantsur: none, change it! | 16:18 |
TheJulia | :) | 16:18 |
dtantsur | \o/ | 16:19 |
* dtantsur looks at redfish code and wants to cry.. again | 16:27 | |
opendevreview | Julia Kreger proposed openstack/ironic-python-agent master: Fix UEFI record regex https://review.opendev.org/c/openstack/ironic-python-agent/+/816723 | 16:44 |
TheJulia | le-sigh | 16:45 |
opendevreview | Julia Kreger proposed openstack/ironic-python-agent master: Fix UEFI record regex https://review.opendev.org/c/openstack/ironic-python-agent/+/816723 | 16:45 |
NobodyCam | good Morning Ironic folks | 17:02 |
dtantsur | moar merge conflicts \o/ | 17:03 |
dtantsur | morning NobodyCam | 17:03 |
NobodyCam | joy.. good morning dtantsur :) o/ | 17:03 |
arne_wiebalck | Good morning, NobodyCam o/ | 17:04 |
dtantsur | TheJulia: I assume colon is actually never there? | 17:04 |
NobodyCam | hey hey arne_wiebalck :) | 17:04 |
TheJulia | dtantsur: it is not | 17:04 |
dtantsur | okie | 17:04 |
NobodyCam | morning TheJulia | 17:04 |
TheJulia | I must have just mentally gone "this makes sense!" | 17:04 |
dtantsur | too much sense for EFI :) | 17:05 |
TheJulia | and kept putting them into the samples whereas I developed the regex against data from a customer | 17:05 |
TheJulia | but somewhere along the lines added the colon into it :\ | 17:05 |
* TheJulia kicks self | 17:05 | |
dtantsur | my laptop doesn't have colons but does have stars | 17:05 |
TheJulia | my laptop doesn't have stars :) | 17:05 |
dtantsur | \o/ | 17:05 |
dtantsur | do you have a thinkpad as well? | 17:05 |
TheJulia | I cracked open the efibootmgr code to see what it does | 17:05 |
TheJulia | yup | 17:06 |
dtantsur | \o/ consistency \o/ | 17:06 |
dtantsur | I'm glad they dropped U from UEFI :D | 17:06 |
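
Based on the exchange above (no colon after the boot number, an optional star on some entries), a tolerant pattern for boot entry lines might look like the sketch below. The regex, the function name, and the sample output are all illustrative assumptions; this is not the pattern from the ironic-python-agent patch.

```python
import re
from typing import List, Tuple

# "Boot" + four hex digits, an optional "*" marker, then the label.
# No colon is expected after the boot number.
BOOT_ENTRY = re.compile(
    r"^Boot([0-9A-Fa-f]{4})(\*?)[ \t]+(.+)$", re.MULTILINE)


def parse_boot_entries(output: str) -> List[Tuple[str, bool, str]]:
    """Return (boot number, has star, label) for each entry line."""
    return [(num, star == "*", label.strip())
            for num, star, label in BOOT_ENTRY.findall(output)]


# Hypothetical efibootmgr-style output, one entry with a star, one without.
SAMPLE = """BootCurrent: 0000
BootOrder: 0000,0001
Boot0000* Fedora
Boot0001  Network Boot
"""

for entry in parse_boot_entries(SAMPLE):
    print(entry)
```

The `BootCurrent` and `BootOrder` lines are skipped because the four characters after `Boot` must all be hex digits.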
arne_wiebalck | dtantsur: I finally have an image now to test what happens with the partition UUIDs on my hardware ... let's see ... | 17:07 |
dtantsur | on top of that, my EFI has 30 boot records | 17:08 |
dtantsur | or so | 17:08 |
arne_wiebalck | dtantsur: obvious question: what's the limit? :-D | 17:08 |
dtantsur | obvious answer: vendor dependent? :) dunno actually | 17:09 |
TheJulia | arne_wiebalck: AIUI, it is like 4 or 8kB of space | 17:09 |
TheJulia | I *have* seen reports of 30+ records just not fitting into the space | 17:09 |
arne_wiebalck | TheJulia: heh, that would make quite some entries :) | 17:09 |
TheJulia | evil, misbehaving HBA adapters of evil | 17:09 |
dtantsur | hehe, at least I hope nobody allows it to overflow into the neighboring regions :D | 17:10 |
TheJulia | EvilHBA == HBA that adds undesired UEFI NVRAM entries | 17:10 |
TheJulia | dtantsur: I don't remember, I *think* it silently fails | 17:10 |
dtantsur | so good | 17:11 |
arne_wiebalck | TheJulia has been at the edge of the world, it seems | 17:11 |
TheJulia | almost wholesome sort of good | 17:11 |
dtantsur | computers are cursed | 17:11 |
JayF | Computers are the curse; people who work on them are the cursed ones. | 17:12 |
JayF | and Ironic is the blackest grimoire of them all | 17:12 |
dtantsur | :D | 17:14 |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: Fix RedfishManagement.get_mac_addresses and related functions https://review.opendev.org/c/openstack/ironic/+/816726 | 17:14 |
dtantsur | janders: FYI ^^ | 17:14 |
*** dviroel|lunch|appt|afk is now known as dviroel|rover | 17:20 | |
TheJulia | dtantsur: JayF: this is all in accordance with the prophecy, yes? | 17:37 |
arne_wiebalck | dtantsur: I have some output ... and some questions :) | 17:37 |
dtantsur | TheJulia: exactly | 17:38 |
arne_wiebalck | dtantsur: I do lsblk before and after the call to blockdev | 17:38 |
arne_wiebalck | dtantsur: the output differs | 17:38 |
arne_wiebalck | dtantsur: but it seems to differ since the partition table also changed (since the image comes with a partition table) | 17:39 |
arne_wiebalck | dtantsur: our image has 4 partitions, while before the image is dumped there are only 2 | 17:40 |
arne_wiebalck | dtantsur: so, at the moment I would say: yes there are no more partition UUIDs, but these are not the same partitions | 17:40 |
dtantsur | O___o | 17:40 |
arne_wiebalck | dtantsur: does that make sense, or is it just too late for me? | 17:41 |
dtantsur | I'm on a call, but from a quick read it makes no sense | 17:41 |
arne_wiebalck | we can discuss tomorrow | 17:41 |
arne_wiebalck | I think it makes sense in the sense we make this call to get the new partitions | 17:42 |
dtantsur | I think we read the partitions after writing the image, not before | 17:47 |
dtantsur | in case of the CI there are no previous partitions? | 17:47 |
arne_wiebalck | the issue for the blockdev call was that we write the image but we do not see the partitions the image brought | 17:49 |
arne_wiebalck | in the log output I captured this was the case: first 2, then 4 partitions | 17:49 |
arne_wiebalck | I guess I would need to wait for a case where I have 4 partitions before | 17:49 |
arne_wiebalck | to see if the UUIDs disappear | 17:49 |
arne_wiebalck | atm, the number of partitions changes and the new ones have no UUIDs | 17:50 |
arne_wiebalck | the first part is what I want, the second I cannot explain atm | 17:50 |
dtantsur | hmm, hold on, it's a partition image, it doesn't have its own partitions? | 17:51 |
dtantsur | and whole disk images are not written on a partition? | 17:51 |
* dtantsur is confused | 17:51 | |
arne_wiebalck | the image is a whole disk image | 17:51 |
dtantsur | okay, so it's not the same case as in the CI | 17:51 |
arne_wiebalck | right | 17:51 |
arne_wiebalck | CI is a partition image? | 17:52 |
dtantsur | the metalsmith jobs do both. the whole disk image works (it does not REALLY need the partition UUID) | 17:52 |
arne_wiebalck | oh, wait | 17:53 |
arne_wiebalck | this is also iscsi on top | 17:54 |
arne_wiebalck | my test node still has that | 17:54 |
arne_wiebalck | not sure if that matters | 17:54 |
arne_wiebalck | but the 2nd partition before the image is dumped is the config drive I would think | 17:55 |
arne_wiebalck | hmm ... hmm ... since the UUID is also empty, I am wondering if that would break s/w RAID | 17:57 |
arne_wiebalck | since that will need the uuid of the root fs | 17:58 |
arne_wiebalck | and, yeah, since this is rescan_device rather than manage_uefi, it is now in the RAID code path as well | 18:01 |
opendevreview | Arne Wiebalck proposed openstack/ironic master: [Trivial] Clarify conditions under which power recovery is attempted https://review.opendev.org/c/openstack/ironic/+/816733 | 18:02 |
arne_wiebalck | I will test if s/w RAID still works ... isn't this fun :) | 18:05 |
arne_wiebalck | ok, I will leave the details for tomorrow :-D, have a good night everyone o/ | 18:06 |
dtantsur | o/ | 18:38 |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: Enable Redfish and iPXE by default https://review.opendev.org/c/openstack/ironic/+/816721 | 18:46 |
arne_wiebalck | dtantsur: the UUIDs really disappear | 18:47 |
arne_wiebalck | dtantsur: https://paste.opendev.org/show/810784/ | 18:48 |
arne_wiebalck | dtantsur: the deployment works and once the instance is booted, all UUIDs are back | 18:49 |
arne_wiebalck | dtantsur: so maybe there is some other race where blockdev triggers the non-availability of the uuids for some time? | 18:50 |
arne_wiebalck | *the deployment of a WDI on a s/w RAID seems to work | 18:50 |
stevebaker[m] | morning | 20:21 |
*** dviroel|rover is now known as dviroel|out | 22:13 | |
janders | good morning stevebaker[m] and Ironic o/ | 23:37 |