opendevreview | Michal Nasiadka proposed openstack/kolla stable/yoga: Pin iptables to 1.8.4 in Centos Stream 8 https://review.opendev.org/c/openstack/kolla/+/893359 | 06:37 |
---|---|---|
opendevreview | Bartosz Bezak proposed openstack/kolla stable/yoga: Pin iptables to 1.8.4 in Centos Stream 8 https://review.opendev.org/c/openstack/kolla/+/893423 | 07:22 |
opendevreview | Bartosz Bezak proposed openstack/kolla stable/yoga: Pin iptables to 1.8.4 in Centos Stream 8 https://review.opendev.org/c/openstack/kolla/+/893423 | 07:24 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: ovn: Improve clustering https://review.opendev.org/c/openstack/kolla-ansible/+/868929 | 07:37 |
opendevreview | Merged openstack/kayobe stable/xena: Remove upgrade jobs following Wallaby EOL https://review.opendev.org/c/openstack/kayobe/+/893435 | 07:39 |
opendevreview | Maksim Malchuk proposed openstack/kolla stable/yoga: Add server-status handler to Rocky/Centos Apache conf https://review.opendev.org/c/openstack/kolla/+/893242 | 08:19 |
SvenKieske | ouch @ that iptables regression, at least they added an upstream test to detect that in the future :) | 08:32 |
frickler | evidence #371 in the "centos stream is unusable" case | 08:35 |
mnasiadka | that's not really a regression, that's the RH way of building packages: they backport the patches they like instead of taking the latest git version from the application repository - it was fixed long ago in 1.8.5, but they decided not to include the patch when they bumped iptables to 1.8.5 in c8s | 08:35 |
opendevreview | Pierre Riteau proposed openstack/kayobe stable/xena: Speed up calls to Bifrost https://review.opendev.org/c/openstack/kayobe/+/893204 | 08:35 |
bbezak | frickler: totally true | 08:40 |
SvenKieske | frickler: I needed to suppress the urge to write that :D | 08:40 |
SvenKieske | to be fair: redhat has at least _some_ QA, but I feel most of it is benefiting fedora these days :D (I won't complain) | 08:41 |
TK_ | Hello Guys, I have a quick support request, I have 4 compute nodes but when I migrate an instance from one compute to another I get the error below | 08:43 |
TK_ | https://paste.openstack.org/show/bWws544zUWbMR6AJBvw4/ | 08:43 |
frickler | whatever they do, it pretty obviously is no longer a stable distro, which is what we usually require for our CI platforms | 08:43 |
frickler | TK_: didn't you already create a bug report for that issue? | 08:44 |
SvenKieske | TK_ what do you not understand about your error? the CPUs you are migrating between aren't compatible (I'm just rephrasing the error message here). They need to be. | 08:44 |
frickler | also not everyone may identify as "Guys", just saying | 08:44 |
frickler | SvenKieske: I think there may have been a bug in nova about this somehow | 08:45 |
SvenKieske | make a diff between "lscpu" on each node and that should tell you what is wrong in most cases. there are edge cases where that might not be enough though. | 08:45 |
TK_ | @Frickler .. No harm intended ... My bad | 08:45 |
SvenKieske | frickler: yes, nova tried to be clever and used its own CPU comparison feature, but that got patched out..1 or 2 releases ago? | 08:46 |
SvenKieske | this: https://review.opendev.org/c/openstack/nova/+/838926 was the original problem | 08:46 |
TK_ | So I guess I will have to just wait for a patch | 08:48 |
SvenKieske | uh oh, they try to be clever again and reintroduce that? https://review.opendev.org/c/openstack/nova/+/762330 | 08:48 |
SvenKieske | TK_ That is already merged, your error most likely is not related to that bug | 08:48 |
SvenKieske | but you need to investigate to check if it is or is not related :) | 08:48 |
frickler | this looks similar and isn't fixed yet afaict https://bugs.launchpad.net/nova/+bug/2023035 | 08:48 |
SvenKieske | I honestly don't know why nova tries to reinvent the wheel and always slaps its own cpu compare function on top of libvirt, and then has to fix the ensuing mess | 08:49 |
SvenKieske | also nova logs are constantly lying and referring to upstream libvirt cpu maps when in reality nova uses its own comparison function which fails. | 08:51 |
frickler | this revert is also still blocked, need to ping ppl again https://review.opendev.org/c/openstack/nova/+/871968 | 08:52 |
SvenKieske | mhm, what _is_ the state of the nova patches here? There is e.g. https://review.opendev.org/c/openstack/nova/+/762330 | 08:53 |
SvenKieske | not merged, I notice I'm even CC'ed to that | 08:54 |
SvenKieske | there's also https://review.opendev.org/c/openstack/nova/+/869587 also not merged | 08:55 |
SvenKieske | and also: https://review.opendev.org/c/openstack/nova/+/838552; also not merged | 08:55 |
SvenKieske | seems like a real mess currently | 08:56 |
SvenKieske | seems I don't know the current state of nova stuff. I originally had the hopes this was fixed by the first mentioned fix https://review.opendev.org/c/openstack/nova/+/838926 | 08:56 |
TK_ | Let me try a few things... I will update with the findings | 08:59 |
SvenKieske | also funny this wasn't catched during integration tests; I'm fairly certain nova _does_ run some live migration tests. | 09:00 |
SvenKieske | caught* | 09:00 |
opendevreview | Maksim Malchuk proposed openstack/kolla-ansible master: Add forgotten releasenote https://review.opendev.org/c/openstack/kolla-ansible/+/893482 | 09:05 |
SvenKieske | ah nice, libvirt has a new API: https://review.opendev.org/c/openstack/nova/+/869950 interesting didn't read about this stuff this year | 09:05 |
bbezak | we've also added this workaround to disable nova comparison and rely on libvirt comparison - https://review.opendev.org/c/openstack/kolla-ansible/+/888028 | 09:09 |
SvenKieske | ah right, always weird to be reminded of commits I did review, didn't remember that one. | 09:10 |
SvenKieske | so above mentioned bug report confirms reverting 468b03e0ee4a917ae26106f6e57081bcd9e7a65b from stable/2023.1 fixed the issue for one user | 09:11 |
SvenKieske | but I still haven't fully grasped why the new, supposedly better, api leads to worse results :) | 09:11 |
SvenKieske | any, probably a topic for #openstack-nova | 09:12 |
SvenKieske | anyway* | 09:12 |
* | SvenKieske can't type today.. | 09:12 |
opendevreview | Maksim Malchuk proposed openstack/kolla-ansible master: Add forgotten release note for 886747 https://review.opendev.org/c/openstack/kolla-ansible/+/893482 | 09:14 |
opendevreview | Matt Crees proposed openstack/kolla master: Document KOLLA_UPGRADE_CHECK environment variable https://review.opendev.org/c/openstack/kolla/+/893484 | 09:22 |
opendevreview | Maksim Malchuk proposed openstack/kolla-ansible stable/2023.1: Use better default bind address for ironic-tftp https://review.opendev.org/c/openstack/kolla-ansible/+/893385 | 09:26 |
opendevreview | Matt Crees proposed openstack/kolla master: Document KOLLA_UPGRADE_CHECK environment variable https://review.opendev.org/c/openstack/kolla/+/893484 | 09:28 |
opendevreview | Maksim Malchuk proposed openstack/kolla-ansible stable/zed: Use better default bind address for ironic-tftp https://review.opendev.org/c/openstack/kolla-ansible/+/893386 | 09:33 |
opendevreview | Maksim Malchuk proposed openstack/kolla-ansible stable/yoga: Use better default bind address for ironic-tftp https://review.opendev.org/c/openstack/kolla-ansible/+/893421 | 09:36 |
opendevreview | Maksim Malchuk proposed openstack/kolla-ansible stable/xena: Use better default bind address for ironic-tftp https://review.opendev.org/c/openstack/kolla-ansible/+/893422 | 09:37 |
TK_ | What is weird is the CPUs are exactly the same | 09:51 |
SvenKieske | TK_ can you post the "lscpu" output of both hosts on paste.openstack.org ? | 09:52 |
SvenKieske | in my experience they are most of the time not the same, but the difference can be really subtle, like a single missing register, especially intel is notorious for disabling/enabling cpu stuff depending on how much money you throw at them.. | 09:54 |
TK_ | https://paste.openstack.org/show/bnqkvU8xapOY6lZGsIpe/ | 09:54 |
SvenKieske | diff --side-by-side --suppress-common-lines cpu3 cpu1 | 10:00 |
SvenKieske | cpu1 has these flags that cpu3 hasn't: | 10:00 |
SvenKieske | mhm, these diffs are butchered..there are weird line breaks | 10:02 |
hrw | hwp_* flags are missing in one and present in second | 10:02 |
hrw | TK_: you are running different OS/kernel on them, right? | 10:02 |
TK_ | I am running Ubuntu 20 on both | 10:02 |
SvenKieske | that famous patch: https://patchwork.kernel.org/project/linux-pm/patch/1442944296-11737-1-git-send-email-kristen@linux.intel.com/ | 10:03 |
hrw | lscpu output differs suggesting different OS versions | 10:03 |
frickler | TK_: which kernel versions? also there is no Ubuntu 20, you likely run Ubuntu 20.04? | 10:04 |
TK_ | VERSION="20.04.2 LTS (Focal Fossa)" VERSION="20.04.6 LTS (Focal Fossa)" | 10:05 |
TK_ | They are different | 10:05 |
frickler | that's not the kernel version, "uname --kernel-version" shows it | 10:07 |
TK_ | Should that be a problem ? | 10:07 |
bbezak | maybe microcode versions differ ? | 10:07 |
TK_ | #176-Ubuntu SMP Mon Aug 14 12:04:20 UTC 2023 #165-Ubuntu SMP Tue Apr 18 08:53:12 UTC 2023 | 10:08 |
TK_ | Those are the outputs | 10:08 |
frickler | so different kernels, as hrw suggested | 10:08 |
frickler | if the newer one has the patch SvenKieske mentioned, that's your issue | 10:08 |
SvenKieske | I'm almost certain there was a bug report for that, but I can't find it currently.. | 10:10 |
TK_ | Would you recommend installing 20.04.2 on the new servers instead of 20.04.6 | 10:10 |
SvenKieske | regarding the live migration part | 10:10 |
SvenKieske | I'd generally recommend keeping these versions in sync on all your hypervisors: base distro version, kernel version (down to the patch level), libvirt, qemu, docker | 10:11 |
TK_ | ok | 10:11 |
SvenKieske | you of course need to deviate from this during upgrades | 10:11 |
SvenKieske | ah I forgot an important one: cpu microcode! | 10:11 |
TK_ | Do I need to upgrade them ? | 10:14 |
opendevreview | Merged openstack/kolla stable/yoga: Pin iptables to 1.8.4 in Centos Stream 8 https://review.opendev.org/c/openstack/kolla/+/893423 | 10:22 |
hrw | TK_: upgrade 22.04 to current state on all machines? | 10:23 |
hrw | TK_: and then keep all systems in sync? | 10:23 |
hrw | TK_: I could recommend using 22.04.latest on new systems and migrate old ones to same version | 10:25 |
kevko | SvenKieske: did you read my comment regarding your +1 ? :D | 11:08 |
SvenKieske | kevko: not just yet, had lunch break and now the next meeting is starting; will have a look later | 11:57 |
TK_ | I deployed wallaby using kolla-ansible and as per the documentation, It requires Ubuntu 20.04 | 12:16 |
TK_ | https://docs.openstack.org/kolla-ansible/wallaby/user/support-matrix | 12:17 |
SvenKieske | the long cooking designate change would be happy about some (core) reviews: https://review.opendev.org/c/openstack/kolla-ansible/+/878270 :) | 13:22 |
opendevreview | Bartosz Bezak proposed openstack/kolla-ansible stable/zed: Added precheck for OpenSearch migration https://review.opendev.org/c/openstack/kolla-ansible/+/893453 | 13:39 |
SvenKieske | kevko: I replied with a long rant about CI :) | 13:53 |
SvenKieske | mnasiadka: frickler: might be worthwhile reading for you too, if you like reading rants about CI and stuff: https://review.opendev.org/c/openstack/kolla-ansible/+/799229/comment/ab898aa0_5ea90b38/ if not, better skip it :D | 13:54 |
SvenKieske | kevko: besides my rant: good catches, especially the relnotes, I hate large changesets because I always tend to miss something in them. I really don't trust the changesets if they are large. I also don't trust the reviewers, because no person can stay alert for the amount of time needed to really review such large changes. | 13:56 |
SvenKieske | so you review those in batches and have then to check, that you didn't miss anything in between, and that you have all the connections between different code parts present in your mind. very difficult imho. changesets > 200 lines are evil. | 13:57 |
SvenKieske | that number was arbitrarily chosen, but you get the point :) | 13:58 |
opendevreview | Bartosz Bezak proposed openstack/kolla-ansible stable/zed: Added precheck for OpenSearch migration https://review.opendev.org/c/openstack/kolla-ansible/+/893453 | 13:58 |
SvenKieske | regarding my CI rant: a quick win imho would be if zuul only posted the results of failed jobs, in good ol' unix tradition (no output=everything is fine). nobody looks at successful jobs anyway, right? | 14:00 |
SvenKieske | kevko: I replied to my own reply with an example which might explain why I'm lost in our current CI system. maybe someone can explain it to me, so I can do a better job at checking jobs :) | 14:09 |
frickler | SvenKieske: there are lots of reasons to look at successful jobs, too | 14:10 |
SvenKieske | ah okay, lol. is there a list somewhere? how do we get anything done this way? I mean you surely don't check manually the CI output of every job on every changeset? | 14:12 |
SvenKieske | so there needs to be some kind of heuristic. if the answer is "you get experience and a feeling over time to know which jobs to check" maybe we can extract this useful knowledge from your heads so I don't have to make errors for years until I have the same knowledge? | 14:13 |
SvenKieske | and maybe write that down somewhere, for all the other people also. looking at that change, two other people also gave +1, so obviously didn't check the zuul jobs carefully enough as well. | 14:14 |
SvenKieske | one of those was maksim who I would not consider inexperienced :) (don't want to call you out maksim, I just think we need to improve this stuff) | 14:15 |
SvenKieske | I'm also fine with an incomplete jobs which to check, or the reverse, a list of jobs never to check, but the current status quo for me is basically: every job _might_ be important to check, no matter the job result, which is just ridiculous. can we please make it at least a little easier to contribute? | 14:19 |
SvenKieske | incomplete list* | 14:19 |
SvenKieske | frickler: this testset only has an ara sqlite dump and not the actual openstack-tests present, how to debug this? https://zuul.opendev.org/t/openstack/build/49c367a8d6f04f27b166edb3f7061360/logs | 15:31 |
SvenKieske | nvm I'm blind | 15:31 |
opendevreview | Merged openstack/kayobe master: Use merge_configs and merge_yaml to generate Kolla custom config https://review.opendev.org/c/openstack/kayobe/+/782749 | 18:12 |
opendevreview | Merged openstack/kayobe master: Add cached plugin https://review.opendev.org/c/openstack/kayobe/+/803064 | 18:12 |
opendevreview | Merged openstack/kayobe master: Kayobe environment dependencies https://review.opendev.org/c/openstack/kayobe/+/802865 | 18:12 |
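
The lscpu comparison suggested in the log (diffing the output from two compute nodes and looking for flags such as `hwp_*` that only one host reports) can be sketched in Python. This is a minimal illustration, not a tool anyone in the channel used. It assumes the flags appear on a single `Flags:` line of the lscpu output. The sample strings and the names `parse_flags`, `diff_flags`, `CPU1_SAMPLE` and `CPU3_SAMPLE` are hypothetical and not taken from the pastes.

```python
def parse_flags(lscpu_text):
    """Return the set of CPU flags from the 'Flags:' line of lscpu output.

    Assumption: the flags sit on one line; pastes with wrapped lines
    (as complained about in the log) would need to be re-joined first.
    """
    for line in lscpu_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "flags":
            return set(value.split())
    return set()


def diff_flags(lscpu_a, lscpu_b):
    """Return (flags only in A, flags only in B), each as a sorted list."""
    flags_a = parse_flags(lscpu_a)
    flags_b = parse_flags(lscpu_b)
    return sorted(flags_a - flags_b), sorted(flags_b - flags_a)


# Made-up sample data for illustration only, not the real host output.
CPU1_SAMPLE = "Architecture: x86_64\nFlags: fpu vme sse sse2 hwp hwp_notify\n"
CPU3_SAMPLE = "Architecture: x86_64\nFlags: fpu vme sse sse2\n"

if __name__ == "__main__":
    only_cpu1, only_cpu3 = diff_flags(CPU1_SAMPLE, CPU3_SAMPLE)
    print("only on cpu1:", " ".join(only_cpu1))
    print("only on cpu3:", " ".join(only_cpu3))
```

Comparing flag sets rather than raw lines sidesteps the ordering and line-break noise that made the side-by-side diff hard to read. It covers only the flag list, so kernel, microcode, libvirt and qemu versions still have to be compared separately.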