Wednesday, 2026-05-06

08:54 *** jgilaber_ is now known as jgilaber
12:08 <opendevreview> Douglas Viroel proposed openstack/watcher master: [WIP] Add optional lock-free mode for cluster data models  https://review.opendev.org/c/openstack/watcher/+/987492
12:14 <opendevreview> Douglas Viroel proposed openstack/watcher master: [WIP] Add optional lock-free mode for cluster data models  https://review.opendev.org/c/openstack/watcher/+/987492
12:18 <dviroel> amoralej: ^
12:18 <dviroel> seems to improve a lot, at least in my env it was ~89% faster
12:23 <amoralej> cool! in the watcher_stabilization case?
12:24 <amoralej> we discussed that in the past, but I didn't think it would be such a great improvement for that particular one
12:32 <dviroel> amoralej: yes, I was testing with workload_stabilization
12:33 <amoralej> that's nice, that would automatically improve all strategies
12:37 <amoralej> I'll give it a try in a local env
12:59 <sean-k-mooney> dviroel: we need to be very, very careful if we allow not taking a lock
13:00 <sean-k-mooney> I have not reviewed your WIP yet
13:00 <sean-k-mooney> but we need to ensure it's correct in all usages
13:02 <sean-k-mooney> my inclination is to say that instead of adding a no-lock version we should assess why the locks were there in the first place
13:03 <sean-k-mooney> skimming it, I'm borderline -2 because the use of locks is encoded as a private internal on the model root
13:03 <sean-k-mooney> that is very non-obvious and will likely cause issues with debugging in the future
13:06 <opendevreview> Alfredo Moralejo proposed openstack/watcher stable/2026.1: Add debug logging for host rejection in workload_balance  https://review.opendev.org/c/openstack/watcher/+/987497
13:06 <dviroel> yeah, agree, we need to make sure that we are using a no-lock version only in very specific cases, which may need more investigation as part of this solution
13:07 <dviroel> I would say that the lock is originally there because there are 2 different threads that may change the models: the collectors and the notifications
13:07 <sean-k-mooney> if we know it's safe to call a specific function without a lock, I would prefer to have a locked and an unlocked version
13:08 <dviroel> sean-k-mooney: ack, we can discuss a different approach instead of the property
13:08 <amoralej> actually, it may be a good opportunity to extend the documentation about how to use the model in the strategies
13:08 <dviroel> amoralej: valid point
13:08 <sean-k-mooney> well, even in eventlet mode you still need to lock shared data structures like that
13:10 <sean-k-mooney> so it's not so much that we have multiple threads, but concurrent access
13:11 <sean-k-mooney> @lockutils.synchronized("model_root") feels very heavy for this use case
13:11 <dviroel> yes, that's what I meant to say in the end
13:11 <sean-k-mooney> @lockutils.synchronized("model_root") is a filesystem lock
13:12 <amoralej> locking for the shared model, which is updated by the notifications and the collector, while not locking the in-audit execution copies looks like a good starting point, and is what the patch implements
13:12 <sean-k-mooney> it would be better to have a lock in the instance (self.model_lock) and have a decorator that acquires the lock for the current instance of the object
13:12 <sean-k-mooney> then when we make a copy they can have separate locks
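The per-instance lock plus decorator idea described above could be sketched roughly like this. This is a minimal illustration, not Watcher's actual code: `ModelRoot`, `synchronized_model`, and `_model_lock` are hypothetical names chosen for the example.

```python
import copy
import functools
import threading


def synchronized_model(func):
    """Acquire the instance's own lock around the wrapped method.

    Assumes the decorated object creates a ``_model_lock`` in __init__,
    so each copy of the model carries its own independent lock.
    """
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        with self._model_lock:
            return func(self, *args, **kwargs)
    return wrapper


class ModelRoot:
    def __init__(self):
        self._model_lock = threading.RLock()
        self._nodes = {}

    @synchronized_model
    def add_node(self, uuid, node):
        self._nodes[uuid] = node

    @synchronized_model
    def get_node(self, uuid):
        return self._nodes[uuid]

    @synchronized_model
    def copy(self):
        # Deep-copy the data, but give the copy a fresh lock so the
        # audit's working copy never contends with the shared model.
        new = ModelRoot()
        new._nodes = copy.deepcopy(self._nodes)
        return new
```

Because the lock lives on the instance, the same decorating pattern keeps working if one instance is ever shared across threads or greenlets, while copies stop contending with each other.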
13:12 <amoralej> but yep, it needs to be treated carefully ...
13:13 <dviroel> indeed, a filesystem lock is expensive
13:13 <sean-k-mooney> amoralej: right, but we are making a copy so that it won't be updated by the collector
13:13 <sean-k-mooney> the issue here is that, because it's a filesystem lock, each copy still shares the lock
13:13 <amoralej> we don't want the collector to update a copy of the model which is in use
13:13 <amoralej> I mean, in use by an audit
13:13 <sean-k-mooney> if we make the lock per instance of the object, that won't be a problem any more
13:14 <amoralej> ah, sorry, about that, yep, makes sense
13:14 <sean-k-mooney> amoralej: right, but we can handle that by doing a CAS (compare-exchange) when updating the shared instance
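The CAS approach mentioned here could look something like the following sketch. This is purely illustrative: `SharedModelHolder` and `update` are hypothetical names, and Watcher does not necessarily implement the shared-model update this way.

```python
import threading


class SharedModelHolder:
    """Holds the shared cluster model reference.

    Writers build an updated copy off to the side, then try to publish
    it with a compare-and-swap, retrying if another writer (e.g. a
    collector or a notification handler) got there first.
    """

    def __init__(self, model):
        self._lock = threading.Lock()  # guards only the reference swap
        self._model = model

    def get(self):
        return self._model

    def compare_and_swap(self, expected, new):
        # Publish ``new`` only if the current model is still the one the
        # caller originally read; otherwise the caller must retry.
        with self._lock:
            if self._model is expected:
                self._model = new
                return True
            return False


def update(holder, apply_change):
    """Retry loop: read, copy-and-modify, CAS until the swap sticks."""
    while True:
        current = holder.get()
        candidate = apply_change(dict(current))  # always work on a copy
        if holder.compare_and_swap(current, candidate):
            return candidate
```

The lock here is held only for the pointer comparison and swap, never while the model is being read or modified, which is what keeps contention low.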
13:14 <amoralej> I didn't know @lockutils.synchronized was on the filesystem, tbh
13:15 <sean-k-mooney> yes, it's effectively a full inter-process lock
13:16 <sean-k-mooney> https://github.com/openstack/oslo.concurrency/blob/master/oslo_concurrency/lockutils.py#L469
13:18 <sean-k-mooney> to be fair, that depends on whether external is set to true or not
13:18 <sean-k-mooney> it is in nova, but the default in oslo is false
13:18 <sean-k-mooney> so this might not be a file-level lock, but we are still sharing it when we should not be, because of how this is implemented
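The sharing problem being described can be seen with a tiny stdlib mock of name-keyed locks. This is not oslo.concurrency's code, just an illustration of why every model copy decorated with the same lock name ends up contending on one lock:

```python
import threading

# A decorator like @synchronized("model_root") keys the lock on a *name*,
# so every object, and every copy of it, that uses that name shares one
# process-wide lock (and with external=True, a file lock shared across
# processes as well).
_locks = {}
_registry_lock = threading.Lock()


def get_named_lock(name):
    """Return the single lock registered under ``name``, creating it once."""
    with _registry_lock:
        return _locks.setdefault(name, threading.RLock())


lock_for_shared_model = get_named_lock("model_root")
lock_for_audit_copy = get_named_lock("model_root")

# Two "different" models still contend on the very same lock object:
assert lock_for_shared_model is lock_for_audit_copy
```

Moving the lock onto each instance, as suggested earlier, breaks exactly this name-to-lock aliasing.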
13:21 <dviroel> yeah, I see your point, all copies would be sharing the same lock, which doesn't make much sense
13:24 <sean-k-mooney> it won't be that hard to create a new decorator to fix this
13:24 <sean-k-mooney> but that's how I would do it: put the lock in the instance of the data model and have the decorator acquire it via self
13:25 <sean-k-mooney> that will allow us to use the same pattern of decorating the methods, and if a single instance of the object is ever shared across threads or greenlets it will just work
13:25 <amoralej> anyway, aren't they two different issues? 1. where to lock and where not to lock, and 2. what's the best way to lock and the scope when locking
13:25 <sean-k-mooney> no need to opt in or out
13:26 <sean-k-mooney> amoralej: well, yes and no; if there is no contention on a lock, it's basically free
13:26 <sean-k-mooney> so if we stop sharing the locks between copies, I suspect the majority of the performance impact will go away
13:26 <amoralej> we could easily test it
13:26 <dviroel> we would need to test and get some results
13:26 <amoralej> yep
13:26 <sean-k-mooney> give claude the irc logs and ask it to create a poc
13:27 <sean-k-mooney> :)
13:27 <amoralej> but not locking when it's not needed also sounds good to me :)
13:27 <sean-k-mooney> yep, that needs more thought
13:27 <sean-k-mooney> I would be reluctant to backport the lock removal
13:28 <sean-k-mooney> changing to a better lock, I think, is much simpler to justify
13:28 <sean-k-mooney> so we can do both
13:28 <amoralej> that's right
13:31 <dviroel> I'll give it a try
13:34 <sean-k-mooney> I assume this is related to the decision engine scaling work or the eventlet removal work?
13:37 <dviroel> related to the scaling work
13:37 <dviroel> this LP: https://bugs.launchpad.net/watcher/+bug/2141951
14:09 <opendevreview> Ivan Anfimov proposed openstack/watcher-dashboard master: Update Babel configuration  https://review.opendev.org/c/openstack/watcher-dashboard/+/987518
14:10 <opendevreview> Ivan Anfimov proposed openstack/watcher-dashboard master: Update Babel configuration  https://review.opendev.org/c/openstack/watcher-dashboard/+/987518
15:07 <dviroel> sean-k-mooney: amoralej: so yeah, it seems that replacing lockutils.synchronized with a simple instance lock already improves the execution, to an execution time close to the no-lock proposal
15:09 <amoralej> cool, will you send a separate review with that? I may test both
15:10 <dviroel> amoralej: yes
15:17 <sean-k-mooney> sweet
15:23 <dviroel> btw, sean-k-mooney, if you have some time today, would you like to take a look at https://review.opendev.org/c/openstack/watcher/+/986486?
15:31 <sean-k-mooney> I'm reviewing the tempest tests first
15:31 <sean-k-mooney> but yes, I have that open
15:36 <opendevreview> Douglas Viroel proposed openstack/watcher master: Replace global lockutils locks with per-instance RLock  https://review.opendev.org/c/openstack/watcher/+/987540
15:38 <dviroel> forgot to WIP it
15:39 <dviroel> amoralej: ^
15:39 <dviroel> claude thinks that it is ready for production XD
15:41 <amoralej> np, it may be! :)
15:41 <amoralej> add a -W if you prefer not to resend with WIP
15:43 <sean-k-mooney> I mean, obviously just ship it, what's the worst that can happen when changing concurrency primitives
15:43 <sean-k-mooney> I wonder if teim will agree with claude while using claude (with glm) to review what claude did :)
16:26 <opendevreview> Alfredo Moralejo proposed openstack/watcher stable/2026.1: Add debug logging for host rejection in workload_balance  https://review.opendev.org/c/openstack/watcher/+/987497
17:17 <sean-k-mooney> dviroel: I won't push my comments on https://review.opendev.org/c/openstack/watcher-specs/+/969840 this evening because they are from ps 7, pre-PTG, but I'll see if I can review it this week
17:18 <dviroel> ok, thanks sean-k-mooney
17:18 <sean-k-mooney> I also finished reviewing the bfv code. the code itself looks reasonable, but there are gaps in the testing, at least on the tempest side
17:20 <sean-k-mooney> there is no end-to-end bfv coverage with an actual cold/live migration to confirm watcher actually calls nova correctly, nor is there any testing with a flavor with non-zero swap/ephemeral
17:21 <sean-k-mooney> I'm going to grab dinner
17:36 <dviroel> indeed, I was treating the migration the same as the existing ones, but we don't have any bfv coverage in tempest; we need that before we can actually say that it is supported
17:37 <dviroel> the test is creating the instance, but not migrating it

Generated by irclog2html.py 4.1.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!