TheJulia | dtantsur: doesn't Xena override the last bug fix branch at this point? | 00:16 |
---|---|---|
TheJulia | arne_wiebalck: so, thinking through the nova patch I'm working on. I *think* changing the conductor group would just get it re-assigned onto a compute node host where it is then visible, if it is in the instance list visible to that compute node, but that likely has a failure case I'm not thinking of | 00:19 |
TheJulia | arne_wiebalck: could there be another case or a permutation on that? | 00:19 |
opendevreview | Jacob Anders proposed openstack/sushy master: Changing boot device string for vmedia on SuperMicro https://review.opendev.org/c/openstack/sushy/+/817137 | 05:31 |
*** pmannidi is now known as pmannidi|pto | 06:20 | |
iurygregory | good morning Ironic o/ | 06:43 |
arne_wiebalck | Good morning, iurygregory janders and Ironic! | 07:16 |
iurygregory | arne_wiebalck, o// | 07:16 |
arne_wiebalck | TheJulia: The issue we see on Nova (Stein) is that the moved host is only soft-deleted by the old compute node and then fails to be re-added by the new one (since the UUID already exists). | 07:19 |
opendevreview | Arne Wiebalck proposed openstack/ironic-python-agent stable/xena: Re-read the partition table with partx -a https://review.opendev.org/c/openstack/ironic-python-agent/+/817143 | 07:25 |
opendevreview | Arne Wiebalck proposed openstack/ironic-python-agent stable/wallaby: Re-read the partition table with partx -a https://review.opendev.org/c/openstack/ironic-python-agent/+/817145 | 07:32 |
arne_wiebalck | iurygregory: which bugfix branch(es) should the partx patch go to (and how do I find them next time)? | 07:33 |
iurygregory | arne_wiebalck, bugfix/18.1 if possible :D | 07:34 |
iurygregory | (downstream we are using this one =X) | 07:34 |
arne_wiebalck | this is for the IPA, not Ironic | 07:34 |
iurygregory | oh | 07:34 |
iurygregory | let me check =) | 07:34 |
arne_wiebalck | bugfix/8.1 ? | 07:34 |
iurygregory | arne_wiebalck, yup =) | 07:35 |
arne_wiebalck | iurygregory: policy is that it only goes to the latest bugfix? | 07:36 |
arne_wiebalck | iurygregory: i.e. not to bugfix/8.0 | 07:36 |
iurygregory | arne_wiebalck, it makes sense to me (but I don't remember if we did specify if we need to always backport to bugfix branches) | 07:37 |
* iurygregory looks for the spec about the new release model to double check | 07:37 | |
iurygregory | Bugfix branches (for deliverables that have them) are supported for 6 months. Only high and critical bug fixes are accepted during the whole support time. | 07:40 |
iurygregory | well, we should backport them to all based on that (but since we don't know if people are really using bugfix/8.0 for ipa it's not mandatory) | 07:41 |
iurygregory | https://specs.openstack.org/openstack/ironic-specs/specs/15.1/new-release-model.html | 07:42 |
iurygregory | arne_wiebalck, does it make sense to you? =) | 07:44 |
*** sshnaidm is now known as sshnaidm|afk | 07:45 | |
arne_wiebalck | iurygregory: let's start with 8.1 then and see | 07:45 |
arne_wiebalck | iurygregory: 8.1 is the bugfix branch for Xena, right? | 07:46 |
iurygregory | yup | 07:46 |
arne_wiebalck | iurygregory: then I am lost as to why the patch applies cleanly to stable/xena, but not to bugfix/8.1 | 07:46 |
iurygregory | huh? :O | 07:46 |
arne_wiebalck | iurygregory: hmm ... maybe there are non-backported patches | 07:47 |
iurygregory | I was about to say this :D | 07:47 |
arne_wiebalck | i.e. sth in stable/xena which did not go into bugfix/8.1 | 07:47 |
arne_wiebalck | ok | 07:47 |
arne_wiebalck | patch incoming ... :) | 07:47 |
iurygregory | maybe some of TheJulia's patches? | 07:49 |
opendevreview | Arne Wiebalck proposed openstack/ironic-python-agent bugfix/8.1: Re-read the partition table with partx -a https://review.opendev.org/c/openstack/ironic-python-agent/+/817151 | 07:49 |
arne_wiebalck | iurygregory: yes | 07:49 |
opendevreview | ZhouHao proposed openstack/ironic master: [iRMC] Convert the type of irmc_port to int https://review.opendev.org/c/openstack/ironic/+/817154 | 07:52 |
opendevreview | ZhouHao proposed openstack/ironic master: [iRMC] Convert the type of irmc_port to int https://review.opendev.org/c/openstack/ironic/+/817154 | 07:54 |
rpittau | good morning ironic! o/ | 08:43 |
dtantsur | morning ironic | 08:44 |
dtantsur | TheJulia: some of us do use bugfix/18.1 still | 08:44 |
rpittau | oh yes we do | 08:45 |
iurygregory | morning rpittau dtantsur o/ | 08:45 |
rpittau | hey iurygregory :) | 08:45 |
janders | hey arne_wiebalck iurygregory rpittau dtantsur and Ironic o/ | 08:45 |
rpittau | hey janders :) | 08:46 |
iurygregory | janders, o/ | 08:46 |
opendevreview | Riccardo Pittau proposed openstack/ironic-python-agent-builder master: Build and publish arm64 debian based ipa ramdisk https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/815815 | 08:49 |
*** sshnaidm|afk is now known as sshnaidm | 09:40 | |
opendevreview | Riccardo Pittau proposed openstack/ironic-python-agent master: Move rescan device function to general utils https://review.opendev.org/c/openstack/ironic-python-agent/+/816916 | 09:58 |
rpittau | ^ we probably need udevadm settle after parted too, so I'm wondering if we should invert and run udevadm settle *before* partx in rescan_device, or before AND after | 10:08 |
iurygregory | omg | 10:09 |
rpittau | heh | 10:10 |
dtantsur | le sigh | 10:12 |
rpittau | I believe before should be enough as the uuids should be already updated by udev, let's see what the CI says | 10:12 |
rpittau | I may even move udevadm_settle to utils at this point | 10:12 |
dtantsur | I'm quite sure we need udev after partx, that was the outcome of Arne's previous attempt | 10:12 |
rpittau | oh right :/ | 10:13 |
arne_wiebalck | yes, udev should be last | 10:43 |
arne_wiebalck | it is the sync point where all the newly found partitions are turned into device nodes | 10:43 |
rpittau | alright, let's see how the CI goes, in the end it may be that we can ignore that error I'm seeing and we just need a little customization of partx inside rescan_device | 10:44 |
*** dviroel|out is now known as dviroel | 11:07 | |
opendevreview | Verification of a change to openstack/ironic master failed: Avoid handling a deploy failure twice https://review.opendev.org/c/openstack/ironic/+/816684 | 11:26 |
rpittau | mmm I still see ironic-standalone-ipa-src failing for 'mdadm: cannot open /dev/vdb1: No such file or directory' | 11:33 |
iurygregory | huh?! but we already merged the fix right? | 11:36 |
rpittau | yeah | 11:36 |
arne_wiebalck | rpittau: does the partx -a call fail? it does when I run it on VMs, it works on BM, though | 11:37 |
rpittau | mmm interesting it still shows partx -u | 11:38 |
arne_wiebalck | aha | 11:38 |
dtantsur | the image wasn't rebuilt yet? | 11:40 |
rpittau | mmm the change was just for image though, I'm reading this from hardware | 11:41 |
arne_wiebalck | aren't there 2 partx calls, one in rescan_device, one in the s/w RAID part in hardware.py | 11:41 |
rpittau | arne_wiebalck: yeah this is in the raid part | 11:41 |
arne_wiebalck | rpittau: this is the one we wanted to subsequently change to call the new rescan | 11:42 |
rpittau | yep | 11:42 |
arne_wiebalck | we need the refactoring to merge first I guess? | 11:42 |
rpittau | yes, at this point, or we can split that in two for the sake of backporting the entire fix | 11:43 |
rpittau | what I mean is we change -u to -a in the raid part as a fix for that, and we can backport it easily | 11:47 |
rpittau | then we do the refactor and we don't necessarily need to backport it | 11:47 |
arne_wiebalck | you mean: have partx -u replaced in the raid code now, then merge your refactoring patch, then call rescan_device from the new place? | 11:47 |
rpittau | yeah | 11:47 |
arne_wiebalck | sounds good to me ... you want to do the u/a replacement or shall I do it? | 11:48 |
rpittau | I can do that so I directly rebase the refactor onto it | 11:49 |
arne_wiebalck | rpittau: +1, ty | 11:49 |
opendevreview | Merged openstack/ironic-python-agent-builder master: new element burn-in for package stress-ng, added fio https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/815453 | 11:52 |
opendevreview | Merged openstack/ironic-python-agent stable/xena: Re-read the partition table with partx -a https://review.opendev.org/c/openstack/ironic-python-agent/+/817143 | 11:59 |
opendevreview | Merged openstack/ironic-python-agent stable/wallaby: Re-read the partition table with partx -a https://review.opendev.org/c/openstack/ironic-python-agent/+/817145 | 11:59 |
opendevreview | Riccardo Pittau proposed openstack/ironic-python-agent master: Re-read the partition table with partx -a, part 2 https://review.opendev.org/c/openstack/ironic-python-agent/+/817197 | 12:03 |
opendevreview | Verification of a change to openstack/ironic master failed: Avoid handling a deploy failure twice https://review.opendev.org/c/openstack/ironic/+/816684 | 12:05 |
opendevreview | Jacob Anders proposed openstack/sushy master: Changing boot device string for vmedia on SuperMicro https://review.opendev.org/c/openstack/sushy/+/817137 | 12:13 |
opendevreview | Riccardo Pittau proposed openstack/ironic-python-agent master: Move rescan device function to general utils https://review.opendev.org/c/openstack/ironic-python-agent/+/816916 | 12:16 |
arne_wiebalck | well ... the raid code is calling udev, then partx ... rescan is doing it the other way round :) | 12:17 |
arne_wiebalck | let's see what implications this has ... | 12:18 |
opendevreview | Merged openstack/ironic-python-agent bugfix/8.1: Re-read the partition table with partx -a https://review.opendev.org/c/openstack/ironic-python-agent/+/817151 | 12:18 |
opendevreview | Dmitry Tantsur proposed openstack/ironic-python-agent master: Simplify error messages when running clean/deploy step https://review.opendev.org/c/openstack/ironic-python-agent/+/817201 | 13:00 |
janders | see you tomorrow Ironic o/ | 13:14 |
iurygregory | bye janders o/ | 13:19 |
opendevreview | Riccardo Pittau proposed openstack/ironic bugfix/18.1: Fix idrac-wsman deploy with existing non-BIOS jobs https://review.opendev.org/c/openstack/ironic/+/817107 | 13:27 |
arne_wiebalck | Bare Metal SIG meeting in 10mins! Topic: "Hardware Burn-in with Ironic", details on https://etherpad.opendev.org/p/bare-metal-sig | 13:51 |
rpioso | arne_wiebalck: I lost my Zoom connection to the meeting. | 14:12 |
TheJulia | arne_wiebalck: hmm, soft delete does definitely sound like a separate conundrum | 14:31 |
arne_wiebalck | TheJulia: ok | 14:57 |
arne_wiebalck | TheJulia: this is why I was a little careful on saying this is an issue, since it also may have changed already | 14:57 |
opendevreview | Merged openstack/ironic-python-agent stable/wallaby: Output verbose info from efibootmgr https://review.opendev.org/c/openstack/ironic-python-agent/+/817012 | 15:09 |
opendevreview | Merged openstack/ironic-python-agent stable/wallaby: Delete EFI boot entry duplicate labels first https://review.opendev.org/c/openstack/ironic-python-agent/+/817013 | 15:09 |
opendevreview | Merged openstack/ironic master: Avoid handling a deploy failure twice https://review.opendev.org/c/openstack/ironic/+/816684 | 15:12 |
opendevreview | Dmitry Tantsur proposed openstack/ironic stable/xena: Avoid handling a deploy failure twice https://review.opendev.org/c/openstack/ironic/+/817109 | 15:27 |
TheJulia | arne_wiebalck: is it just a matter of changing soft deleted back to active in the db? | 15:30 |
arne_wiebalck | TheJulia: I cannot say, tbh: for sure, nova does not expect to find an entry. Whether undeleting it, or hard-deleting and re-creating it, is better, I cannot judge. | 15:40 |
arne_wiebalck | TheJulia: Belmiro (as our Nova expert) wanted to check more recent releases first, before patching, esp. as we are preparing the Nova upgrade. | 15:42 |
TheJulia | okay, well, if it is not auto-recovered, we likely could do it in the compute node soon then and just update the db, but that entirely depends on what else nova does | 15:44 |
opendevreview | Riccardo Pittau proposed openstack/ironic-python-agent master: Move rescan device function to general utils https://review.opendev.org/c/openstack/ironic-python-agent/+/816916 | 16:33 |
arne_wiebalck | I do not have the full logs to prove it, but we just removed a test node from our deployment which has been instantiated with Ironic certainly more than 10k times (probably even more than 20k times). | 16:58 |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: Fix RedfishManagement.get_mac_addresses and related functions https://review.opendev.org/c/openstack/ironic/+/816726 | 17:00 |
dtantsur | arne_wiebalck: wow Oo | 17:00 |
arne_wiebalck | rally, once an hour, and the node was added in nov 2017 | 17:01 |
dtantsur | impressive :) | 17:10 |
dtantsur | see you tomorrow folks o/ | 17:10 |
arne_wiebalck | bye dtantsur o/ | 17:10 |
arne_wiebalck | TheJulia: it is not auto-recovered from what I see ... maybe when the DB is finally purged after some months :-D | 17:10 |
*** JasonF is now known as JayF | 17:12 | |
arne_wiebalck | bye everyone, see you tomorrow o/ | 17:12 |
JayF | arne_wiebalck: we hit numbers like that with onmetal | 17:12 |
JayF | arne_wiebalck: because with the number of nodes we had in inventory + the amount of CI we ran, we provisioned a node dozens of times a week | 17:13 |
arne_wiebalck | JayF: I imagine! | 17:13 |
JayF | it's more impressive if you hit those numbers with real usage | 17:13 |
JayF | and not CI-style usage | 17:13 |
arne_wiebalck | JayF: heh, true, but all the activity on our cloud is dominated by the CI/Rally activity | 17:14 |
arne_wiebalck | JayF: it is still cool to see how the numbers accumulate | 17:14 |
JayF | It's often easy to focus on the times when things don't work right. | 17:14 |
arne_wiebalck | JayF: and also to think how often that node has been rebooted | 17:14 |
JayF | When you hit big milestones, it helps you reflect on all the stuff that silently works | 17:15 |
arne_wiebalck | JayF: exactly | 17:15 |
arne_wiebalck | o/ | 17:15 |
JayF | o/ | 17:15 |
TheJulia | arne_wiebalck: you should so tweet about a node deployed 10k+ times :) | 17:17 |
TheJulia | arne_wiebalck: I guess in that case, regarding auto recovery, we just need to have a bug someplace and then it *might* be stupidly easy to fix soon() | 17:18 |
rpittau | bye everyone! o/ | 17:32 |
*** dviroel is now known as dviroel|out | 20:51 | |
opendevreview | Jacob Anders proposed openstack/sushy master: Changing boot device string for vmedia on SuperMicro https://review.opendev.org/c/openstack/sushy/+/817137 | 20:53 |
janders | good morning Ironic o/ | 21:42 |
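
A minimal sketch of the ordering arne_wiebalck and dtantsur describe between 10:12 and 10:43: `partx -a` runs first, and `udevadm settle` runs last as the sync point where newly found partitions become device nodes. The function names, the injectable `run` parameter, and the lack of any error handling are assumptions for illustration only; this is not the actual `rescan_device` from ironic-python-agent.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical type alias: anything that accepts a command (list of strings).
Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise on a non-zero exit."""
    subprocess.run(list(cmd), check=True)


def rescan_commands(device: str) -> List[List[str]]:
    """Return the commands in the order discussed in the log."""
    return [
        # Ask the kernel to add the partitions found on the device.
        ["partx", "-a", device],
        # Wait for udev to finish processing the resulting events, so the
        # partition device nodes exist before anything (e.g. mdadm) uses them.
        ["udevadm", "settle"],
    ]


def rescan_device(device: str, run: Runner = _run) -> None:
    """Run the rescan commands in order using the given runner."""
    for cmd in rescan_commands(device):
        run(cmd)
```

Passing a custom `run` (for example `calls.append` on a list) records the commands instead of executing them, so the ordering can be checked without touching a real disk. Whether a failing `partx -a` should be tolerated, as raised at 10:44 and 11:37, is left open here.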