| *** pmannidi is now known as pmannidi|brb | 00:40 | |
| *** pmannidi|brb is now known as pmannidi | 01:21 | |
| *** pmannidi is now known as pmannidi|brb | 01:48 | |
| *** pmannidi|brb is now known as pmannidi | 02:30 | |
| arne_wiebalck | Good morning janders and Ironic! | 07:18 |
|---|---|---|
| iurygregory | good morning janders arne_wiebalck and Ironic o/ | 08:00 |
| arne_wiebalck | hey iurygregory o/ | 08:01 |
| rpittau | good morning ironic! o/ | 08:04 |
| rpittau | ooook something's off with ironic-standalone-ipa-src and metalsmith-integration-ipa-src, and it looks like ironic-standalone has become (more) unstable recently | 08:20 |
| opendevreview | Riccardo Pittau proposed openstack/ironic-python-agent stable/xena: Delete EFI boot entry duplicate labels first https://review.opendev.org/c/openstack/ironic-python-agent/+/816489 | 08:22 |
| iurygregory | good morning rpittau o/ | 08:26 |
| rpittau | hey iurygregory :) | 08:26 |
| iurygregory | rpittau, well about ironic-standalone-ipa-src https://opendev.org/openstack/ironic-python-agent/commit/a67807b9b6017a8a1dc250bd7a7141e6728544cb | 08:28 |
| iurygregory | and seems like we have the job voting in ipa-builder... but not in IPA =) | 08:29 |
| rpittau | yeah but it's giving more issues recently, check the stats | 08:31 |
| rpittau | also the metalsmith job is voting | 08:31 |
| rpittau | I'm actually thinking to make them voting again | 08:32 |
| iurygregory | hummm 39/50 SUCCESS \| 1/50 ABORTED \| 1/50 TIMED_OUT \| 8/50 FAILURE | 08:34 |
| iurygregory | for ironic-standalone-ipa-src | 08:35 |
| iurygregory | considering only ipa-builder repo | 08:35 |
| opendevreview | Riccardo Pittau proposed openstack/ironic-python-agent master: Use json for lsblk output https://review.opendev.org/c/openstack/ironic-python-agent/+/775391 | 08:36 |
| iurygregory | oh 1/50 CANCELED =) | 08:36 |
| rpittau | the error is during raid creation "DEBUG ironic_lib.utils [-] Command stderr is: "mdadm: cannot open /dev/vdb1: No such file or directory" | 08:43 |
| rpittau | the partition seems to be created | 08:45 |
| rpittau | maybe partition table not updated? | 08:47 |
| iurygregory | didn't we have a case like this in the past? (the partition table didn't get updated) | 08:48 |
| rpittau | yep, trying to remember | 08:48 |
| * iurygregory wondering if arne_wiebalck remembers something | 08:49 | |
| rpittau | we do run partx before calling mdadm https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/hardware.py#L2169 | 08:50 |
| iurygregory | omg! | 08:50 |
| iurygregory | Necessary, if we want to avoid hitting an error when creating the mdadm array below 'mdadm: cannot open /dev/nvme1n1p1: No such file or directory' | 08:51 |
| iurygregory | :D | 08:51 |
| rpittau | yeeeeep | 08:52 |
| iurygregory | can we cry now or just later? | 08:55 |
| *** pmannidi is now known as pmannidi|brb | 08:55 | |
| rpittau | also the last commit was done to prevent such a case https://opendev.org/openstack/ironic-python-agent/commit/9d707e9f4bab40109b7e29df2136e86d65325ea3 | 08:59 |
| arne_wiebalck | iurygregory: rpittau: yes | 09:08 |
| arne_wiebalck | there is an issue when the device nodes are not yet created | 09:08 |
| arne_wiebalck | I've put in 2 patches lately | 09:09 |
| arne_wiebalck | https://review.opendev.org/c/openstack/ironic-python-agent/+/816192 | 09:09 |
| arne_wiebalck | https://review.opendev.org/c/openstack/ironic-python-agent/+/812470 | 09:10 |
| iurygregory | https://review.opendev.org/c/openstack/ironic-python-agent/+/816192 I would say would probably help (but metalsmith failed /facepalm) | 09:11 |
| rpittau | right, so need to see why the metalsmith job is now failing | 09:11 |
| iurygregory | yeah | 09:11 |
| arne_wiebalck | the comment has the error message, but the fix is not sufficient | 09:11 |
| rpittau | ok, I see | 09:13 |
| rpittau | arne_wiebalck: I'm wondering if we should make rescan_device public, move that to hardware.py and call that directly instead of lines 2156-2170 with your change that adds the call to blockdev | 09:25 |
| arne_wiebalck | rpittau: sorry, which lines are these? | 09:27 |
| * arne_wiebalck has a meeting now ... | 09:28 | |
| rpittau | arne_wiebalck: https://opendev.org/openstack/ironic-python-agent/src/branch/master/ironic_python_agent/hardware.py#L2156 | 09:28 |
| rpittau | maybe add that after the except on L2171 | 09:28 |
| arne_wiebalck | rpittau: yes, I was thinking this as well | 09:35 |
| arne_wiebalck | rpittau: we should pull out the big sweep every time | 09:35 |
| rpittau | yeah | 09:35 |
| arne_wiebalck | acc. to the internet, all these calls may be necessary in different situations | 09:36 |
| arne_wiebalck | all of them work sometimes | 09:36 |
| arne_wiebalck | :-/ | 09:36 |
| rpittau | exactly, and they're usually fast if everything's right | 09:36 |
| arne_wiebalck | yep | 09:36 |
| arne_wiebalck | we probably have more places where we call udev settle for similar reasons | 09:37 |
| arne_wiebalck | but these two are very similar already | 09:37 |
| arne_wiebalck | and already justify your proposal I think | 09:37 |
| rpittau | yes, it already makes sense for just those two | 09:38 |
| dtantsur | morning ironic | 10:39 |
| iurygregory | morning dtantsur o/ | 10:49 |
| opendevreview | Aija Jauntēva proposed x/sushy-oem-idrac master: Fix moved tests https://review.opendev.org/c/x/sushy-oem-idrac/+/816641 | 11:02 |
| *** dviroel|rover|out is now known as dviroel|rover | 11:12 | |
| dtantsur | does anyone know what happened with the metalsmith jobs? iurygregory, rpittau? | 11:24 |
| dtantsur | hmm, the statistics are not too bad https://zuul.openstack.org/builds?job_name=metalsmith-integration-ipa-src | 11:24 |
| dtantsur | maybe a temporary glitch | 11:24 |
| rpittau | could be just a glitch, I got worried because it failed 3 times in a row in https://review.opendev.org/c/openstack/ironic-python-agent/+/816192 | 11:25 |
| iurygregory | dtantsur, yeah I agree, I also checked the history of the last 50 runs in zuul | 11:25 |
| dtantsur | InstanceDeployFailure: Failed to install a bootloader when deploying node c7c7e554-453a-4842-8438-756135047c11. Error: No partition with UUID 6b1e3209-4f48-46de-a022-a6ea69099155 found on device /dev/vda | 11:25 |
| rpittau | that | 11:26 |
| dtantsur | hmmm | 11:26 |
| iurygregory | maybe something specific in a given cloud? (I haven't checked if there is something common between the failures) | 11:27 |
| * dtantsur fixes the double error handling in the meantime | 11:43 | |
| dtantsur | File "/home/dtantsur/Projects/ironic/.tox/py3/lib/python3.10/site-packages/eventlet/timeout.py", line 166, in wrap_is_timeout | 11:45 |
| dtantsur | base.is_timeout = property(lambda _: True) | 11:45 |
| dtantsur | TypeError: cannot set 'is_timeout' attribute of immutable type 'TimeoutError' | 11:45 |
| dtantsur | DEAR EVENTLET!!1 | 11:45 |
| dtantsur | has anyone tried running our unit tests on python 3.10? | 11:45 |
| rpittau | I did | 11:54 |
| rpittau | didn't see that though | 11:54 |
| rpittau | mmm I ran it before upgrading to F35, but it was py310 anyway | 11:55 |
| dtantsur | maybe F35 has something newer? dunno | 11:56 |
| rpittau | oh wow now I see that failing in ipa | 11:57 |
| rpittau | I have to split now, see you tomorrow folks! o/ | 12:02 |
| dtantsur | see you | 12:04 |
| dtantsur | ehmm, it looks like lsblk actually loses partition UUID Oo | 12:04 |
| dtantsur | before re-reading: https://zuul.openstack.org/build/963864600d774599a59e835b60d3d0ef/log/controller/ironic-bm-logs/node-0_no_ansi_2021-11-03-20:17:45.log#2961 | 12:04 |
| dtantsur | after: https://zuul.openstack.org/build/963864600d774599a59e835b60d3d0ef/log/controller/ironic-bm-logs/node-0_no_ansi_2021-11-03-20:17:45.log#2987 | 12:04 |
| dtantsur | arne_wiebalck: can it be a side effect of rereadpt? | 12:05 |
| dtantsur | I suspect your patch may be breaking it | 12:06 |
| dtantsur | funnily, the lsblk output in deploy logs does contain the required UUID | 12:08 |
| dtantsur | I wonder if we should reorder partition read with udev settle? is it somehow asynchronous? | 12:08 |
| iurygregory | dtantsur, I just got the same error running ipa in 3.10 locally (F34) | 12:25 |
| dtantsur | sigh | 12:28 |
| TheJulia | good morning | 12:43 |
| arne_wiebalck | dtantsur: o.O (no idea) | 12:44 |
| dtantsur | morning TheJulia | 12:44 |
| arne_wiebalck | dtantsur: the lsblk calls are slightly different | 12:44 |
| arne_wiebalck | dtantsur: not sure this is relevant | 12:45 |
| dtantsur | arne_wiebalck: yeah, but they shouldn't return so different results | 12:45 |
| * arne_wiebalck has to step away for 45mins, will have a look after | 12:46 | |
| arne_wiebalck | dtantsur: not sure how re-reading the pt would make UUIDs disappear | 12:46 |
| dtantsur | yeah, sounds crazy | 12:46 |
| janders | see you tomorrow Ironic o/ | 12:47 |
| arne_wiebalck | bye janders o/ | 12:49 |
| arne_wiebalck | dtantsur: I can try to reproduce this on real hardware | 12:50 |
| arne_wiebalck | dtantsur: just add the lsblk calls before and after rescan_device() | 12:50 |
| opendevreview | Dmitry Tantsur proposed openstack/ironic master: Avoid handling a deploy failure twice https://review.opendev.org/c/openstack/ironic/+/816684 | 12:52 |
| arne_wiebalck | Good morning, TheJulia | 12:53 |
| opendevreview | Michal Nasiadka proposed openstack/ironic-python-agent master: Add LVM based image support to MD scenario https://review.opendev.org/c/openstack/ironic-python-agent/+/816685 | 13:03 |
| opendevreview | Michal Nasiadka proposed openstack/ironic-python-agent master: Add LVM based image support to MD scenario https://review.opendev.org/c/openstack/ironic-python-agent/+/816685 | 13:06 |
| *** pmannidi|brb is now known as pmannidi | 13:10 | |
| dtantsur | could I get a 2nd +2 on https://review.opendev.org/c/openstack/ironic-python-agent/+/815861 please? | 13:21 |
| TheJulia | I can look at outstanding changes on ipa a little later | 13:24 |
| *** pmannidi is now known as pmannidi|AFK | 13:30 | |
| *** dviroel|rover is now known as dviroel|lunch|appt|afk | 15:03 | |
| opendevreview | Merged openstack/ironic-python-agent master: Always include the oslo_log log file in ramdisk logs https://review.opendev.org/c/openstack/ironic-python-agent/+/815861 | 15:14 |
| opendevreview | Dmitry Tantsur proposed openstack/sushy master: Migrate System constants to enums https://review.opendev.org/c/openstack/sushy/+/816717 | 15:55 |
| opendevreview | Dmitry Tantsur proposed openstack/ironic-python-agent stable/xena: Always include the oslo_log log file in ramdisk logs https://review.opendev.org/c/openstack/ironic-python-agent/+/816664 | 15:56 |
| dtantsur | the migration to enums will break ironic unit tests, le sigh | 15:58 |
| opendevreview | Julia Kreger proposed openstack/ironic master: Test nova-compute fix https://review.opendev.org/c/openstack/ironic/+/813264 | 15:58 |
| dtantsur | what is worse, it will also break some actual code where we assume constants are strings. double sigh. | 15:59 |
| dtantsur | TheJulia: do we have a single reason to enable only "pxe", not "ipxe" by default? | 16:05 |
| opendevreview | Dmitry Tantsur proposed openstack/ironic master: Enable Redfish and iPXE by default https://review.opendev.org/c/openstack/ironic/+/816721 | 16:15 |
| dtantsur | anyway, here goes ^^^ | 16:15 |
| TheJulia | dtantsur: none, change it! | 16:18 |
| TheJulia | :) | 16:18 |
| dtantsur | \o/ | 16:19 |
| * dtantsur looks at redfish code and wants to cry.. again | 16:27 | |
| opendevreview | Julia Kreger proposed openstack/ironic-python-agent master: Fix UEFI record regex https://review.opendev.org/c/openstack/ironic-python-agent/+/816723 | 16:44 |
| TheJulia | le-sigh | 16:45 |
| opendevreview | Julia Kreger proposed openstack/ironic-python-agent master: Fix UEFI record regex https://review.opendev.org/c/openstack/ironic-python-agent/+/816723 | 16:45 |
| NobodyCam | good Morning Ironic folks | 17:02 |
| dtantsur | moar merge conflicts \o/ | 17:03 |
| dtantsur | morning NobodyCam | 17:03 |
| NobodyCam | joy.. good morning dtantsur :) o/ | 17:03 |
| arne_wiebalck | Good morning, NobodyCam o/ | 17:04 |
| dtantsur | TheJulia: I assume colon is actually never there? | 17:04 |
| NobodyCam | hey hey arne_wiebalck :) | 17:04 |
| TheJulia | dtantsur: it is not | 17:04 |
| dtantsur | okie | 17:04 |
| NobodyCam | morning TheJulia | 17:04 |
| TheJulia | I must have just mentally gone "this makes sense!" | 17:04 |
| dtantsur | too much sense for EFI :) | 17:05 |
| TheJulia | and kept putting them into the samples, whereas I developed the regex against data from a customer | 17:05 |
| TheJulia | but somewhere along the lines added the colon into it :\ | 17:05 |
| * TheJulia kicks self | 17:05 | |
| dtantsur | my laptop doesn't have colons but does have stars | 17:05 |
| TheJulia | my laptop doesn't have stars :) | 17:05 |
| dtantsur | \o/ | 17:05 |
| dtantsur | do you have a thinkpad as well? | 17:05 |
| TheJulia | I cracked open the efibootmgr code to see what it does | 17:05 |
| TheJulia | yup | 17:06 |
| dtantsur | \o/ consistency \o/ | 17:06 |
| dtantsur | I'm glad they dropped U from UEFI :D | 17:06 |
| arne_wiebalck | dtantsur: I finally have an image now to test what happens with the partition UUIDs on my hardware ... let's see ... | 17:07 |
| dtantsur | on top of that, my EFI has 30 boot records | 17:08 |
| dtantsur | or so | 17:08 |
| arne_wiebalck | dtantsur: obvious question: what's the limit? :-D | 17:08 |
| dtantsur | obvious answer: vendor dependent? :) dunno actually | 17:09 |
| TheJulia | arne_wiebalck: AIUI, it is like 4 or 8kB of space | 17:09 |
| TheJulia | I *have* seen reports of 30+ records just not fitting into the space | 17:09 |
| arne_wiebalck | TheJulia: heh, that would make quite some entries :) | 17:09 |
| TheJulia | evil, misbehaving HBA adapters of evil | 17:09 |
| dtantsur | hehe, at least I hope nobody allows it to overflow into the neighboring regions :D | 17:10 |
| TheJulia | EvilHBA == HBA that adds undesired UEFI NVRAM entries | 17:10 |
| TheJulia | dtantsur: I don't remember, I *think* it silently fails | 17:10 |
| dtantsur | so good | 17:11 |
| arne_wiebalck | TheJulia has been at the edge of the world, it seems | 17:11 |
| TheJulia | almost wholesome sort of good | 17:11 |
| dtantsur | computers are cursed | 17:11 |
| JayF | Computers are the curse; people who work on them are the cursed ones. | 17:12 |
| JayF | and Ironic is the blackest grimoire of them all | 17:12 |
| dtantsur | :D | 17:14 |
| opendevreview | Dmitry Tantsur proposed openstack/ironic master: Fix RedfishManagement.get_mac_addresses and related functions https://review.opendev.org/c/openstack/ironic/+/816726 | 17:14 |
| dtantsur | janders: FYI ^^ | 17:14 |
| *** dviroel|lunch|appt|afk is now known as dviroel|rover | 17:20 | |
| TheJulia | dtantsur: JayF: this is all in accordance with the prophecy, yes? | 17:37 |
| arne_wiebalck | dtantsur: I have some output ... and some questions :) | 17:37 |
| dtantsur | TheJulia: exactly | 17:38 |
| arne_wiebalck | dtantsur: I do lsblk before and after the call to blockdev | 17:38 |
| arne_wiebalck | dtantsur: the output differs | 17:38 |
| arne_wiebalck | dtantsur: but it seems to differ since the partition table also changed (since the image comes with a partition table) | 17:39 |
| arne_wiebalck | dtantsur: our image has 4 partitions, while before the image is dumped there are only 2 | 17:40 |
| arne_wiebalck | dtantsur: so, at the moment I would say: yes there are no more partition UUIDs, but these are not the same partitions | 17:40 |
| dtantsur | O___o | 17:40 |
| arne_wiebalck | dtantsur: does that make sense or is it just too late for me ? | 17:41 |
| dtantsur | I'm on a call, but from a quick read it makes no sense | 17:41 |
| arne_wiebalck | we can discuss tomorrow | 17:41 |
| arne_wiebalck | I think it makes sense in the sense we make this call to get the new partitions | 17:42 |
| dtantsur | I think we read the partitions after writing the image, not before | 17:47 |
| dtantsur | in case of the CI there are no previous partitions? | 17:47 |
| arne_wiebalck | the issue for the blockdev call was that we write the image but we do not see the partitions the image brought | 17:49 |
| arne_wiebalck | in the log output I captured this was the case: first 2, then 4 partitions | 17:49 |
| arne_wiebalck | I guess I would need to wait for a case where I have 4 partitions before | 17:49 |
| arne_wiebalck | to see if the UUIDs disappear | 17:49 |
| arne_wiebalck | atm, the number of partitions changes and the new ones have no UUIDs | 17:50 |
| arne_wiebalck | the first part is what I want, the second I cannot explain atm | 17:50 |
| dtantsur | hmm, hold on, it's a partition image, it doesn't have its own partitions? | 17:51 |
| dtantsur | and whole disk images are not written on a partition? | 17:51 |
| * dtantsur is confused | 17:51 | |
| arne_wiebalck | the image is a whole disk image | 17:51 |
| dtantsur | okay, so it's not the same case as in the CI | 17:51 |
| arne_wiebalck | right | 17:51 |
| arne_wiebalck | CI is a partition image? | 17:52 |
| dtantsur | the metalsmith jobs do both. the whole disk image works (it does not REALLY need the partition UUID) | 17:52 |
| arne_wiebalck | oh, wait | 17:53 |
| arne_wiebalck | this is also iscsi on top | 17:54 |
| arne_wiebalck | my test node still has that | 17:54 |
| arne_wiebalck | not sure if that matters | 17:54 |
| arne_wiebalck | but the 2nd partition before the image is dumped is the config drive I would think | 17:55 |
| arne_wiebalck | hmm ... hmm ... since the UUID is also empty, I am wondering if that would break s/w RAID | 17:57 |
| arne_wiebalck | since that will need the uuid of the root fs | 17:58 |
| arne_wiebalck | and, yeah, since this is rescan_device rather than manage_uefi, it is now in the RAID code path as well | 18:01 |
| opendevreview | Arne Wiebalck proposed openstack/ironic master: [Trivial] Clarify conditions under which power recovery is attempted https://review.opendev.org/c/openstack/ironic/+/816733 | 18:02 |
| arne_wiebalck | I will test if s/w RAID still works ... isn't this fun :) | 18:05 |
| arne_wiebalck | ok, I will leave the details for tomorrow :-D, have a good night everyone o/ | 18:06 |
| dtantsur | o/ | 18:38 |
| opendevreview | Dmitry Tantsur proposed openstack/ironic master: Enable Redfish and iPXE by default https://review.opendev.org/c/openstack/ironic/+/816721 | 18:46 |
| arne_wiebalck | dtantsur: the UUIDs really disappear | 18:47 |
| arne_wiebalck | dtantsur: https://paste.opendev.org/show/810784/ | 18:48 |
| arne_wiebalck | dtantsur: the deployment works and once the instance is booted, all UUIDs are back | 18:49 |
| arne_wiebalck | dtantsur: so maybe there is some other race where blockdev triggers the non-availability of the uuids for some time? | 18:50 |
| arne_wiebalck | *the deployment of a WDI on a s/w RAID seems to work | 18:50 |
| stevebaker[m] | morning | 20:21 |
| *** dviroel|rover is now known as dviroel|out | 22:13 | |
| janders | good morning stevebaker[m] and Ironic o/ | 23:37 |
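
The "big sweep" discussed between 09:25 and 09:38 (run every rescan call each time, since each of them only helps in some situations) could look roughly like the sketch below. This is an illustration under assumptions, not the ironic-python-agent code: the function names `rescan_commands` and `rescan_device`, the injectable `run` parameter, the exact command order, and the choice to collect failures rather than raise are all hypothetical. Only the tools themselves (`partx`, `udevadm settle`, `blockdev --rereadpt`) come from the conversation.

```python
import subprocess


def rescan_commands(device):
    """Return the sequence of rescan commands for one block device.

    The order is an assumption made for this sketch.
    """
    return [
        ["partx", "-u", device],             # update the kernel's partition view
        ["udevadm", "settle"],               # wait for pending udev events
        ["blockdev", "--rereadpt", device],  # ask the kernel to re-read the table
        ["udevadm", "settle"],               # wait again for new device nodes
    ]


def rescan_device(device, run=subprocess.run):
    """Run every rescan command and return the list of (command, error) failures.

    A failing command does not stop the sweep, because any single call
    may be unnecessary or unsupported in a given situation.
    """
    failures = []
    for cmd in rescan_commands(device):
        try:
            run(cmd, check=True, capture_output=True)
        except (OSError, subprocess.CalledProcessError) as exc:
            failures.append((cmd, exc))
    return failures
```

A caller might log the returned failures at debug level and then proceed to `mdadm` or `lsblk`. Given the later observation that partition UUIDs can be missing right after `blockdev`, a retry around the follow-up read may also be worth considering.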