12:00:19 <dviroel> #startmeeting watcher
12:00:19 <opendevmeet> Meeting started Thu Aug  7 12:00:19 2025 UTC and is due to finish in 60 minutes.  The chair is dviroel. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:00:19 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
12:00:19 <opendevmeet> The meeting name has been set to 'watcher'
12:00:26 <dviroel> who is around today?
12:00:36 <amoralej> o/
12:01:18 <morenod> o/
12:02:01 <dviroel> courtesy ping: jgilaber sean-k-mooney chandankumar rlandy
12:02:11 <rlandy> o/
12:02:21 <dviroel> alright, let's start with today's meeting agenda
12:02:36 <dviroel> #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L40 (Meeting agenda)
12:02:43 <dviroel> feel free to add your own topics to the agenda
12:03:17 <dviroel> #topic Announcements
12:03:48 <dviroel> Flamingo release schedule
12:03:51 <dviroel> #link  https://releases.openstack.org/flamingo/schedule.html
12:04:24 <dviroel> adding it here just to mention that we are 3 weeks from feature freeze
12:05:06 <dviroel> and reminder that no features should be merged after this milestone
12:05:08 <dviroel> so please add your changes to the review list in our meeting agenda
12:05:28 <dviroel> today I added a section in the etherpad
12:05:36 <sean-k-mooney> o/
12:05:53 <dviroel> to help us track the main topics that we plan to review
12:06:09 <dviroel> #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L18
12:06:36 <dviroel> i added 3 proposed features by their topics
12:07:18 <dviroel> feel free to comment, post status updates and so on
12:07:57 <dviroel> sorry jwysogla for missing the Aetos topic :(
12:08:03 <dviroel> thanks for adding
12:08:04 <sean-k-mooney> i guess we never created a watcher status etherpad
12:08:32 <dviroel> sean-k-mooney: yeah, we may create one, to include everything, bugfixes, features, etc..
12:08:58 <sean-k-mooney> well we can use the meeting one but this is what nova does https://etherpad.opendev.org/p/nova-2025.2-status
12:09:15 <sean-k-mooney> i thought we did this last cycle but maybe not
12:09:25 <sean-k-mooney> im ok using the irc etherpad for now
12:09:33 <sean-k-mooney> but it's nice to have a per cycle one as well
12:09:56 <sean-k-mooney> less noise and it's useful for cycle highlights and the release note prelude
12:10:00 <dviroel> yes it is, since the number of patches is also growing
12:10:58 <dviroel> ok, any other announcement?
12:11:00 <sean-k-mooney> let's create one async and start using it for the remainder of the cycle and put a link at the top of the irc etherpad
12:11:09 <dviroel> sean-k-mooney: +1
12:11:49 <dviroel> i can start one after the meeting
12:11:55 <rlandy> +1
12:12:30 <dviroel> #action dviroel to start a watcher status etherpad and link it from the meeting etherpad
12:13:33 <dviroel> ok so, next topic in the list
12:13:44 <dviroel> (dviroel): Eventlet Removal Updates
12:13:57 <dviroel> #topic Eventlet Removal Updates
12:14:12 <dviroel> #link https://etherpad.opendev.org/p/watcher-eventlet-removal (watcher eventlet removal etherpad)
12:14:35 <dviroel> there is good news
12:14:48 <dviroel> the patch that extends the decision engine to support threading mode merged
12:14:58 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/952257 (Extend decision engine to support threading mode)
12:15:09 <dviroel> ty all for the reviews
12:15:24 <dviroel> note that this patch added a new voting job
12:15:44 <dviroel> watcher-prometheus-integration-threading runs with decision-engine in threading mode
12:16:00 <dviroel> we decided to keep it voting, since it was stable enough
12:16:08 <amoralej> cool
12:16:32 <dviroel> let us know if your patch starts hitting any issues with that specific job, and we can help with debugging, or just make it non-voting if required
12:16:54 <dviroel> there is another job proposed in
12:17:01 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/955097 (Add a new tox environment to run unit tests in threading mode)
12:17:25 <dviroel> that adds a new tox environment to run unit tests in threading mode too
12:17:39 <dviroel> which should be merging soon too, i think
12:17:49 <sean-k-mooney> i left that unapproved to allow others to review
12:17:55 <dviroel> ++
12:18:09 <sean-k-mooney> but ya likely early next week if there are no objections
12:18:25 <dviroel> from the decision-engine side, I still plan to go through the default number of threads for every threadpool
12:18:36 <dviroel> and propose an update if needed
12:18:46 <sean-k-mooney> so im not sure if that will be
12:19:08 <sean-k-mooney> looking at the eventlet code they were previously limiting the eventlet pool to 4 eventlets i think
12:19:23 <sean-k-mooney> i have not checked all code paths but if that was in fact correct for all pools
12:19:34 <sean-k-mooney> then i dont think there will be an issue using 4 real os threads
12:19:57 <sean-k-mooney> the concern with using real os threads was mainly for projects where that default limit is 10,000
12:19:58 <dviroel> right, that's the conclusion that I would expect to reach, but it is still an in-progress task on my side
12:20:02 <sean-k-mooney> which i think it was in nova
12:20:20 <sean-k-mooney> ack
12:20:40 <dviroel> we may also want to include threadpool stats in debug logs, to help us on debugging
12:20:52 <dviroel> I think that nova is also adding something similar
12:21:03 <sean-k-mooney> ya gibi has added a hook for futurist's stats mechanism
12:21:06 <dviroel> but that is not only for decision-engine, but for all
12:21:11 <sean-k-mooney> so you can likely port that
12:21:17 <dviroel> +1
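A rough sketch of the kind of threadpool stats mentioned above, assuming a plain standard-library pool rather than futurist. `CountingPool` and its counter names are hypothetical and are not taken from watcher or nova. The pool is capped at 4 workers, mirroring the limit sean-k-mooney recalls from the eventlet code, and it counts finished and failed tasks so they could be written to debug logs:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CountingPool:
    """Hypothetical wrapper that tracks simple per-pool statistics."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self.executed = 0
        self.failures = 0

    def submit(self, fn, *args, **kwargs):
        future = self._pool.submit(fn, *args, **kwargs)
        # The callback runs once the future is done (or cancelled).
        future.add_done_callback(self._record)
        return future

    def _record(self, future):
        if future.cancelled():
            return
        with self._lock:
            if future.exception() is not None:
                self.failures += 1
            else:
                self.executed += 1

    def stats(self):
        with self._lock:
            return {"executed": self.executed, "failures": self.failures}

    def shutdown(self):
        # Waits for the workers, so the counters are final afterwards.
        self._pool.shutdown(wait=True)
```

A service could log `pool.stats()` at debug level at regular intervals; whether watcher should do this through futurist instead is left open here, as it was in the meeting.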
12:21:30 <dviroel> next thing would be to continue the work in the applier
12:21:44 <dviroel> I will provide more updates as soon as I have something
12:21:46 <sean-k-mooney> https://review.opendev.org/c/openstack/nova/+/948340
12:21:54 <dviroel> or on the issues that we find along the way
12:22:09 <dviroel> sean-k-mooney: yeah, this one
12:22:15 <sean-k-mooney> the applier will be a little trickier
12:22:33 <sean-k-mooney> so the applier is implementing cancellation of the task by killing the greenthread
12:22:42 <dviroel> yes!
12:22:43 <sean-k-mooney> that is not something you can do with an OS pthread
12:23:00 <sean-k-mooney> so we may have to consider using a process pool but that has other problems
12:23:16 <sean-k-mooney> or we may have to rethink how that works in general
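One possible shape of such a rethink, purely illustrative and not something decided in the meeting: since an OS thread cannot be killed from outside, the task can check a shared flag between steps and stop on its own. The `run_action` function and its return values are hypothetical, not the applier's actual API:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_action(steps, cancel_event):
    """Run the steps in order, stopping as soon as cancellation is requested."""
    done = []
    for step in steps:
        # Cooperative check: the thread itself decides to stop.
        if cancel_event.is_set():
            return ("cancelled", done)
        done.append(step())
    return ("succeeded", done)


cancel = threading.Event()
# The second step requests cancellation, so the third one never runs.
steps = [lambda: 1, lambda: cancel.set() or 2, lambda: 3]
with ThreadPoolExecutor(max_workers=1) as pool:
    result = pool.submit(run_action, steps, cancel).result()
```

The trade-off is that a step that is already blocked, for example waiting on a long migration, only notices the cancellation once it returns, which is part of why the greenthread kill is not trivially replaceable.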
12:23:26 <dviroel> indeed, the applier logic needs to be evaluated
12:23:38 <sean-k-mooney> if we do not have time to do that this cycle you have still made a lot of progress
12:24:47 <sean-k-mooney> dviroel: have we ensured that there is no eventlet usage in the api?
12:25:12 <sean-k-mooney> i know you marked the deprecation of the console script as done
12:25:39 <dviroel> the only usage is in the console script
12:25:40 <sean-k-mooney> but when we run under uwsgi have you confirmed we never use eventlet in the api
12:25:56 <sean-k-mooney> ya the console script is not a problem because we are just going to delete that
12:26:25 <sean-k-mooney> its the wsgi application that is the main part we need to validate
12:26:41 <dviroel> sean-k-mooney: only by digging through the code, and I didn't find anything
12:26:47 <sean-k-mooney> ack
12:27:40 <dviroel> but worth revisiting yeah
12:27:49 <dviroel> thanks for the reminder
12:28:01 <dviroel> any other question for this topic?
12:28:27 <amoralej> you are making great progress on this
12:28:32 <dviroel> moving forward then, since we have things to cover
12:28:38 * dviroel thanks amoralej
12:28:57 <dviroel> #topic (dviroel) Fix for bug #2098984
12:29:18 <dviroel> i was recently hitting this bug, with a continuous audit
12:29:20 <dviroel> #link  https://bugs.launchpad.net/watcher/+bug/2098984 (Zone Migration Strategy failing to build a list of instances for migration)
12:29:31 <dviroel> the tl;dr; is
12:30:07 <dviroel> zone migration strategy uses nova and cinder clients to retrieve instances and volumes, and then checks if they exist in the compute and storage models
12:30:51 <dviroel> when the instance or volume doesn't exist in the model, it raises an exception, which impacts the audit, setting it to Failure
12:31:05 <dviroel> so the audit doesn't run anymore
12:31:11 <dviroel> there is a patch
12:31:30 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/956198
12:32:03 <dviroel> where I try to fix this issue with a simple try/except to ignore the instance/volume
12:32:14 <dviroel> which is the expected behavior in the current code
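The try/except approach described above can be sketched roughly as follows. This is a minimal illustration with hypothetical stand-ins (`InstanceNotFound`, `FakeComputeModel`, `instances_known_to_model`), not the code from the patch under review: instances returned by the API but missing from the model are skipped instead of failing the whole audit.

```python
class InstanceNotFound(Exception):
    """Hypothetical stand-in for the model's 'not found' error."""


class FakeComputeModel:
    """Hypothetical in-memory model keyed by instance UUID."""

    def __init__(self, instances):
        self._instances = dict(instances)

    def get_instance_by_uuid(self, uuid):
        try:
            return self._instances[uuid]
        except KeyError:
            raise InstanceNotFound(uuid)


def instances_known_to_model(api_uuids, model):
    """Keep only the instances that the model also knows about."""
    found = []
    for uuid in api_uuids:
        try:
            found.append(model.get_instance_by_uuid(uuid))
        except InstanceNotFound:
            # Ignore this instance rather than failing the audit.
            continue
    return found
```

A real implementation would likely also log each skipped element so that operators can see why an instance or volume was left out of the action plan.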
12:32:42 <dviroel> amoralej raised a question of whether we should instead fix the strategy to only look at the model, and not use the clients
12:33:05 <sean-k-mooney> ya so there are multiple parts to this. 1 your patch is correct and valid, 2 amoralej is also correct that the strategy should be using the data from the model not hitting the api directly
12:33:07 <dviroel> so the idea here is to get feedback on that
12:33:24 <amoralej> it's fine for me to fix in two steps
12:33:24 <sean-k-mooney> i think we will want to fix both issues as separate bugs
12:33:29 <amoralej> wfm
12:34:14 <amoralej> 1st merge the try/except from dviroel and file a new bug with "zone_migration uses nova client instead of the model to retrieve instances"
12:34:18 <dviroel> i also agree that we shouldn't be using these clients directly in the strategy
12:35:06 <sean-k-mooney> well using the client to take actions is obviously fine as part of the application of the action plan.
12:35:19 <sean-k-mooney> in limited cases it may be ok in the decision engine
12:35:32 <amoralej> theoretically, consuming from model should avoid hitting the other issue, but there may be race conditions with model updates, so good to handle exceptions also
12:35:33 <sean-k-mooney> to enrich the objects from the model with extra data
12:35:40 <sean-k-mooney> but we should ideally avoid that
12:36:10 <dviroel> the 2nd fix changes the behavior a little bit
12:36:10 <sean-k-mooney> amoralej: well the race can always happen and we need to be tolerant of that
12:36:21 <sean-k-mooney> which is where the skip feature comes in
12:36:29 <amoralej> yes
12:36:44 <amoralej> but in that case it is between running the audit and executing the action
12:36:49 <dviroel> by getting the list of instances/volumes from the api, we also avoid adding already deleted elements, that are still in the model
12:36:56 <amoralej> while this particular bug is while running the audit, not the action
12:37:21 <dviroel> but yes, this can be mitigated by the migration action, in the pre_ methods
12:37:27 <sean-k-mooney> dviroel: that is correct but we have to handle that the element might be deleted in the applier anyway
12:37:34 <dviroel> ++
12:37:36 <amoralej> yes
12:37:39 <sean-k-mooney> because between the action plan being created and it being approved it could happen
12:37:46 <dviroel> true
12:37:58 <sean-k-mooney> dviroel: so where i am with this is i think im pretty comfortable backporting your initial patch
12:38:13 <sean-k-mooney> im less sure about changing to using the model
12:38:35 <amoralej> note that when i said to use only the model (unless we can't), i meant in the strategy execution, not in the action execution
12:38:55 <dviroel> so we agree with the current patch, file a new bug to replace the current implementation with model only, and we can discuss afterwards whether it should be backported or not
12:39:03 <amoralej> +1
12:39:04 <sean-k-mooney> that's a bigger change and we need to make sure it does not result in other bugs before considering backporting it. that's the other reason i would prefer 2 patches and 2 bugs
12:39:15 <sean-k-mooney> +1
12:40:03 <dviroel> #action dviroel to file a new bug to explain the current behavior with zone_migration using nova/cinder clients to get info about resources
12:40:41 <dviroel> #agreed to continue with the fix https://review.opendev.org/c/openstack/watcher/+/956198
12:41:05 <dviroel> thanks for the feedback on that o/
12:41:29 <dviroel> moving to next topic then
12:41:36 <dviroel> #topic (amoralej) Update about skipped actions feature
12:41:48 <dviroel> amoralej: want to highlight your latest changes?
12:41:51 <amoralej> #link https://review.opendev.org/q/topic:%22blueprint-add-skip-actions%22
12:42:09 <amoralej> so, i've reorganized the changes as discussed a couple of meetings back
12:42:32 <amoralej> main change is that all the api related changes are in the last one https://review.opendev.org/c/openstack/watcher/+/955753/
12:42:52 <amoralej> including status_message visibility and new actions patch
12:43:04 <amoralej> both covered by the same microversion 1.5
12:43:19 <amoralej> instead of in separate ones, as in my previous patches
12:43:41 <amoralej> other than that, i think organization now is much more clear
12:43:44 <amoralej> and easy to review
12:44:17 <dviroel> amoralej++
12:44:35 <dviroel> already started to review them, based on the relation chain
12:44:36 <amoralej> also, I sent a new tempest test https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/955775 which is using the microversion support
12:44:40 <amoralej> that dviroel added
12:44:43 <dviroel> ++
12:44:49 <amoralej> and worked great
12:45:00 * dviroel needs to update its own tempest patch to use that
12:45:11 <amoralej> running in functional master job, skipped in functional-<stable> ones
12:45:57 <amoralej> I think we discussed pretty much the implementation details in a previous mtg, so this was just heads-up for reviews
12:46:04 <dviroel> thanks amoralej
12:46:10 <amoralej> the only missing part is the watcherclient and watcher-dashboard one
12:46:28 <amoralej> I'll work on that too
12:46:44 <dviroel> so please folks, review these changes when you have a chance
12:47:10 <dviroel> amoralej: anything else?
12:47:15 <sean-k-mooney> i have approved the parent
12:47:24 <sean-k-mooney> https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/956306/3 so that should merge soon
12:47:26 <amoralej> nothing else from my side
12:47:29 <amoralej> thanks
12:48:08 <dviroel> nice, thanks sean-k-mooney
12:48:11 <dviroel> thanks amoralej
12:48:24 <dviroel> #topic Reviews
12:48:34 <dviroel> again, no specific reviews to mention here
12:48:44 <dviroel> just take a look on the m-3 review list
12:48:57 <sean-k-mooney> jwysogla: probably want to bring up theirs?
12:49:01 <dviroel> see what you can help us to move forward
12:49:45 <sean-k-mooney> i did a first pass on the aetos change yesterday https://review.opendev.org/c/openstack/watcher/+/955608
12:49:58 <sean-k-mooney> its in merge conflict because of some other stuff we landed
12:50:08 <sean-k-mooney> but overall it looks pretty reasonable
12:50:31 <dviroel> great, going to take a look soon too
12:50:34 <sean-k-mooney> the main risks here are on the timing of releasing the new version of the observability client and getting that into upper constraints
12:51:18 <sean-k-mooney> jwysogla: what are the chances of getting jaunLarriba to approve the release patch and getting that done this week?
12:51:50 <sean-k-mooney> oh they already have https://review.opendev.org/c/openstack/releases/+/956657
12:52:09 <sean-k-mooney> ok so this is just pending the release team to approve
12:53:08 <dviroel> ok, so we need to follow up on that too
12:53:30 <dviroel> we can add it under its change in the status etherpad
12:53:46 <dviroel> jwysogla: pls give us feedback on that, once you are back online
12:54:07 <jwysogla> Sorry, I was in another meeting
12:54:10 <dviroel> sean-k-mooney: thanks for raising this concern
12:54:19 <sean-k-mooney> jwysogla: no worries
12:54:44 <sean-k-mooney> once the release is done a bot will propose an update to the requirements repo to bump it in upper constraints
12:54:53 <jwysogla> I hope the observabilityclient releases soon. Then I can finish up the patch and get the CI passing. I ran all the tests locally and they were working, so I hope there is not much work left.
12:55:01 <sean-k-mooney> until that is also merged the watcher patch will be blocked
12:55:13 <sean-k-mooney> but that does not mean we cant review in advance of that
12:55:25 <jwysogla> yes, thanks
12:55:33 <dviroel> correct
12:55:49 <sean-k-mooney> jwysogla: there are some followups that we may want to discuss after or at the ptg
12:55:50 <jwysogla> I should also mention, that I'll be on PTO last 2 weeks of August, which is quite unfortunate timing.
12:56:22 <jwysogla> But until then, that patch will be my top priority.
12:57:12 <dviroel> ok, this is important to know
12:57:15 <sean-k-mooney> jwysogla: basically next cycle, if we can, i would like to see if we can evolve our jobs away from sg-core to use the new prometheus scrape endpoint. i think that would be a good topic to discuss in the telemetry ptg session or any time before then after we have cut RC1
12:57:43 <jwysogla> +1 to that
12:57:56 <sean-k-mooney> we also need to discuss the future of gnocchi support in ceilometer
12:58:29 <morenod> +1
12:58:32 <sean-k-mooney> is it correct that there is a plan to retire the way ceilometer feeds data to gnocchi today
12:58:34 <jwysogla> Yes, that's also a good topic for the PTG.
12:59:03 <sean-k-mooney> we had a related topic that we want to discuss at the ptg about the future of the gnocchi backend and that would obviously be a factor
12:59:50 <jwysogla> I don't think we have any official "plan". In my personal opinion, gnocchi should go away at some point in the future. But afaik most users of openstack telemetry outside of Red Hat currently use gnocchi, so we can't just deprecate it.
12:59:52 <sean-k-mooney> jwysogla: is this something you can feed back to the rest of the telemetry team (i.e. our interest in discussing this)
13:00:11 <jwysogla> yes
13:00:36 <sean-k-mooney> gnocchi relies on sg_core today, is that correct?
13:00:45 <sean-k-mooney> is there an intent to deprecate sg_core
13:01:26 <sean-k-mooney> or the ceilometer part. actually we can leave that to a future time
13:01:32 <sean-k-mooney> we are at the top of the hour
13:01:42 <dviroel> yeah
13:01:50 <jwysogla> sg-core is used purely for Prometheus. I think (I'd need to check to be sure) you can currently use the Prometheus exporter just for compute metrics, so ceilometer-central still needs to go through the notification agent and sg-core
13:01:55 <dviroel> sean-k-mooney: ok to move your topic to the next meeting?
13:02:14 <sean-k-mooney> yep
13:02:22 <dviroel> rlandy: thanks for volunteering to be the next chair
13:02:41 <dviroel> ok, so let's wrap up for today
13:02:49 <dviroel> we will meet again next week
13:02:55 <dviroel> thank you all for participating
13:03:04 <dviroel> #endmeeting