Thursday, 2025-10-09

*** mhen_ is now known as mhen01:59
jlejeunesean-k-mooney: hello, is that better: https://review.opendev.org/c/openstack/nova/+/963368 ?07:01
elodilleshi nova-stable-maint members o/ i've sent a mail to ML yesterday about 2024.1 Caracal transition to Unmaintained in 2 weeks (!), we could probably do a final stable release of 2024.1 Caracal, but we should be quick about it (but be careful not to break anything! as this will be final!)07:46
elodillesbut before we cut a release, maybe a couple of backports can land, see the etherpad: https://etherpad.opendev.org/p/nova-stable-caracal-eom07:47
bauzasack will review those07:49
jlejeuneelodilles: bauzas: it could be great also to backport that topic: https://review.opendev.org/q/topic:%22bug/2085135%22 :)07:55
elodillesjlejeune: once they got merged to master branch and backports are proposed, then feel free to ping me / add me as reviewer to those backports :)08:22
jlejeuneelodilles: got it, thanks08:27
opendevreviewJulien Le Jeune proposed openstack/nova master: Reset the mapped field of nodes at service deletion  https://review.opendev.org/c/openstack/nova/+/93302209:41
opendevreviewSvein-Erik Skjelbred proposed openstack/nova stable/2025.1: lvcreate fails now and then.  https://review.opendev.org/c/openstack/nova/+/96351310:48
sean-k-mooneyjlejeune: yep thanks11:15
sean-k-mooneyjlejeune: that is now followign or standar repoeducer and fix pattern using a dedicated regression test in the regression folder11:16
opendevreviewJulien Le Jeune proposed openstack/nova master: Reset the mapped field of nodes at service deletion  https://review.opendev.org/c/openstack/nova/+/93302211:35
opendevreviewJulien Le Jeune proposed openstack/nova master: Adds regression test for bug LP#2085135  https://review.opendev.org/c/openstack/nova/+/96336812:31
opendevreviewJulien Le Jeune proposed openstack/nova master: Reset the mapped field of nodes at service deletion  https://review.opendev.org/c/openstack/nova/+/93302212:31
opendevreviewJulien Le Jeune proposed openstack/nova master: Adds regression test for bug LP#2085135  https://review.opendev.org/c/openstack/nova/+/96336812:58
opendevreviewJulien Le Jeune proposed openstack/nova master: Reset the mapped field of nodes at service deletion  https://review.opendev.org/c/openstack/nova/+/93302212:58
opendevreviewJulien Le Jeune proposed openstack/nova master: Reset the mapped field of nodes at service deletion  https://review.opendev.org/c/openstack/nova/+/93302213:18
jlejeunesean-k-mooney: I've updated my changes following your reviews13:21
sean-k-mooneycool ill re reivew. di that work locally for you https://review.opendev.org/c/openstack/nova/+/933022/9/nova/db/main/api.py13:23
*** haleyb_ is now known as haleyb13:28
jlejeunesean-k-mooney: yes it works as expected 13:45
opendevreviewKamil Sambor proposed openstack/nova master: Switch nova-conductor to use global executor  https://review.opendev.org/c/openstack/nova/+/96247813:56
ratailorsean-k-mooney, gibi could you please review https://review.opendev.org/c/openstack/nova/+/963348 https://review.opendev.org/c/openstack/nova/+/961352 and https://review.opendev.org/c/openstack/nova/+/96121114:27
opendevreviewBalazs Gibizer proposed openstack/placement master: Prune a_c search space by invalid prefixes  https://review.opendev.org/c/openstack/placement/+/96277614:32
gibidansmith: I will accept the heat for it but I ended up changing the whole pruning logic while tried to pull out the capacity check function call and the odometer logic. I ended up dropping the odometer logic in favor of a recusive tree search as it results in a lot cleaner code and allows one extra optimization out of the box that is dropping half of the steps out in the symmetric case. So the 3by3 14:32
gibitest now checks only 30 partial candidates not 60. This gives us even more performance buffer. I also managed to avoid changing the conditional ARR copy logic which is a nice risk reduction alone. ^^14:32
dansmithooof14:32
sean-k-mooneygibi: long term do you think we will want to keep the fallback to itertools.product14:38
sean-k-mooneygibi: if the performace is simialr for small trees i wonder if we coudl jsut alwasy use the new soltuion14:39
sean-k-mooneyper haps somethignto let bake for a while and evolve too in a future release14:39
dansmithI asked for the fallback specifically to avoid switching the world to this (well, the previous) much more complicated solution until it had seen some time on the battlefield, especially when the stdlib version works fine for everyone else (so far)14:40
gibisean-k-mooney: dansmith asked to put this behind a workaround flag to mitigate risk for the majority of our users who are not using PCI in Placement (or other sources of wide provider trees) so I think we want to bake it 14:40
sean-k-mooneyack14:40
sean-k-mooneyso after our converstaion on this topic i was thinkign that we may want to consider changign the default of https://docs.openstack.org/placement/latest/configuration/config.html#placement.allocation_candidates_generation_strategy to breadth-first and deprecating randomize_allocation_candidates (possibley replacing it with a random allocation_candidates_generation_strategy14:43
sean-k-mooneypolicy if needed)14:43
sean-k-mooneyi also was wodnerign if optimize_for_wide_provider_trees really shoudl be a workaround option14:45
sean-k-mooneyto me this feel much more fitign to be grouped with allocation_candidates_generation_strategy14:45
sean-k-mooneyor  randomize_allocation_candidates in the palcement section14:46
gibiyeah workarond make sense if we know we will go back and redesign the modeling of PCI devices and that make the current optimization unnecessary14:46
dansmithworkaround because we expect to make it the default eventually, and right now this is basically for one very specific use case...as a workaround.. if we're going to keep it forever, then fine to put it in main config, but it feels like something we plan to drop when we're comfortable (or redesign this entirely)14:46
sean-k-mooneyok fair, i was thinking about the fact we dont want to ship with a workaroudn enabled by defalt14:47
sean-k-mooneyso i was tryign to see how we woudl evovel this to on by default without forcing that in a single release14:47
sean-k-mooneyim not opposed to puting it in workarounds14:48
sean-k-mooneyjust wanted to raise that14:48
sean-k-mooneyi was kind fo thinking add in 2026.1 make default in 2026.2 and remove in 2027.1 somethign like that but we dont need to rush this either14:49
gibimoving a config between sections is not hard, oslo has deprecated section arg. 14:49
sean-k-mooneythat was partly why i was asking about the perfroamce for small treese vs itertools.product14:49
gibiso if we decide to make it default true, then we can also move it to [placement] section14:49
sean-k-mooneyya that also fair, it handels the legacy name for us14:50
dansmithwe're not going to ship with a workaround defaulted to true14:50
sean-k-mooneyyep i know but im not really considerign this a wrokaround at this point it feel more liek a real performafce optimisation/mini feature14:51
sean-k-mooneyanyway im ok with the current proposal14:51
sean-k-mooneyi.e. to have it be a workaroudn for now. ill try and make some time to look at the revised patch in detail14:51
dansmithgibi: tbh, even though I'm impressed at getting more performance out, I'm (as you guessed) a bit sad to have reset the progress a week14:52
dansmithhow much better is it for the current known-worst case of like 20ish providers?14:52
gibidansmith: test_many_non_viable_candidates_21_21 went from 8.3 sec (odometer) to 3 sec (recursive tree search). But I would not pivot the impl because performance (alone) I pivoted due to how much cleaner the code now. (I guess it is matter of taste a bit if a maintaining a set of indexes is better than maintaining stack for recursion)15:02
gibitest_many_non_viable_candidates_32_32_two_computes went from 20 to 2.515:02
sean-k-mooneyv15 test_many_non_viable_candidates_64_64_two_computes [8.489979s] v16 test_many_non_viable_candidates_64_64_two_computes [0.485396s]15:03
dansmithbut the immediate goal here was to avoid it being longer than the scheduler timeout to fix the actual failure, which even at 64x on the previous version was well accomplished correct?15:04
dansmithanyway, I preempted vtpm stuff for the reviews recently and was geared up to get back to that today and feel like I'm now reset back a week :(15:04
sean-k-mooneywell that is with 10 allcoation candiate max15:04
sean-k-mooneybut yes if you tune small enough you can with v1515:05
sean-k-mooneyv16 woudl allow you to have that higher15:05
sean-k-mooneyfor the test gibi ran i see v15 test_many_non_viable_candidates_21_21 [15.881115s] v 16 test_many_non_viable_candidates_21_21 [4.901830s]15:06
dansmithI get that it's faster, I'm just not sure it's faster enough to matter _that_ much when we're trying to get something landed to fix a thing.. and I can already see from a skim that it's cleaner,15:07
dansmithI just don't know that it's important enough to have reset the game here15:07
sean-k-mooneyyou saying restore v15 and make v16 a patch on top?15:08
gibiI think the cleaner code worth the re-review effort as probably need to read this piece of code multiple times in the future15:08
dansmithI'm not saying anything.15:08
sean-k-mooneygibi: given nova-next is currently testing thread mode can i suggest enablign this in nova-ovs-hybrid-plug. i have mentioned befor ethat i woudl like to rename this to nova-alt-config to beterer refect that we use ti to test non defautl config liek ml2/ovs or spice instead of vnc.15:33
sean-k-mooneyi really dont expect that to be a stablity issue given the job wont reallly exersie teh new codepaths15:35
sean-k-mooneyi.e. we dont have a lot of child providers in that job today15:35
gibiwe have no jobs with child providers really in upstream CI15:35
sean-k-mooneywe will if we ever finish enabling the mtty change or similar but ya we dotn really today15:36
gibiI even feel like enabling it in nova-next is OK as this is a placement config and placement is not changed threading15:36
sean-k-mooneythat is fine too15:36
sean-k-mooneyi just didnt want to intoduce a diffent variabel to have to debug in that job15:37
sean-k-mooneybasiclly im suggstign tweak a job just to have it enabeld 15:37
gibieven the breadth-first config is supposed to be enabled in nova-next I just never managed to get back to the devstack patch that make it possible 15:37
sean-k-mooneybut im not too pushed on which job15:37
gibihttps://review.opendev.org/c/openstack/devstack/+/93982515:38
sean-k-mooneyoh is that still open15:39
gibihttps://review.opendev.org/c/openstack/nova/+/93727515:40
gibiyepp15:40
gibibut I will use the time now to fix that up15:40
opendevreviewAlan Bishop proposed openstack/nova-specs master: Show swap_progress in attachment details  https://review.opendev.org/c/openstack/nova-specs/+/93748516:25
dansmithmelwitt: hmm, so I moved to the next patch (host), configured a flavor for host, booted a server which clearly has host set (in sysmeta) and I get the same failure to restart it as admin17:21
dansmithhost means the user still owns the key in barbican, but we don't need to hit barbican to start anymore because it's persisted in libvirt right?17:22
melwittdansmith: oh, maybe that's an oversight on my part. admin should be able to live migrate it but I don't believe I changed the code in hard reboot to not go to barbican if 'host'17:24
melwittso yeah I think I just need to bypass that if 'host'. I had been focusing on the live migration only. I should also add a hard reboot by admin to my tempest tests17:26
opendevreviewMerged openstack/placement master: Reproduce GET a_c slowness bug/2126751  https://review.opendev.org/c/openstack/placement/+/96259617:27
dansmithmelwitt: you claim it in the config can be rebooted :)17:46
melwittdansmith: uh oh17:48
dansmithoh, well, it says "host reboot" actually, so maybe that works because of libvirt?17:56
dansmithanyway, I think we want that to be a feature of host security, so we should probably make a note to fix it even if not in this patch17:57
melwittyeah, I'll try to fix it in the same patch since i'm changing stuff anyways17:58
opendevreviewmelanie witt proposed openstack/nova master: Dump conf to debug log earlier in the WSGI app pipeline  https://review.opendev.org/c/openstack/nova/+/96361918:48
*** haleyb is now known as haleyb|out23:22

Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!