12:04:30 <chandankumar> #startmeeting watcher
12:04:30 <opendevmeet> Meeting started Thu Jul 31 12:04:30 2025 UTC and is due to finish in 60 minutes.  The chair is chandankumar. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:04:30 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
12:04:30 <opendevmeet> The meeting name has been set to 'watcher'
12:04:32 <dviroel> the name is required, only the name is enough
12:04:34 <dviroel> :)
12:04:45 <chandankumar> courtesy ping: sean-k-mooney chandankumar morenod rlandy
12:04:53 <sean-k-mooney> o/
12:04:54 <rlandy> I'm here :)
12:05:00 <chandankumar> o/
12:05:06 <morenod> o/
12:05:16 <chandankumar> let's start with today's meeting agenda
12:05:42 <chandankumar> #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L21 (Meeting agenda)
12:05:52 <chandankumar> feel free to add your own topics to the agenda
12:05:58 <chandankumar> Starting with the first one
12:06:06 <chandankumar> #topic Eventlet Removal
12:06:09 <dviroel> o/
12:06:22 <dviroel> as usual, the etherpad link
12:06:25 <dviroel> #link https://etherpad.opendev.org/p/watcher-eventlet-removal (watcher eventlet removal etherpad)
12:06:25 <chandankumar> #link https://etherpad.opendev.org/p/watcher-eventlet-removal (watcher eventlet removal etherpad)
12:06:29 <dviroel> :)
12:06:38 <dviroel> some minor changes this week
12:07:13 <dviroel> i removed the depends-on changes from the main dec-engine patch
12:07:16 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/952257 (Extend decision engine to support threading mode)
12:07:47 <dviroel> the devstack one merged, the other one was the tempest-plugin change, which is not required to merge the main one
12:08:22 <dviroel> but there is another DNM change just to test the new continuous audit test:
12:08:33 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/956199
12:09:11 <opendevreview> David proposed openstack/watcher master: Disable real metrics on devstack injected data jobs  https://review.opendev.org/c/openstack/watcher/+/955281
12:09:29 <dviroel> note that we discussed replacing the continuous audit test with a unit or functional test
12:09:44 <sean-k-mooney> yep devstack change merged yesterday so that unblocks that patch
12:10:24 <sean-k-mooney> we can have both
12:10:24 <dviroel> it turns out that I couldn't yet find a way of mocking everything needed to simulate the behavior found with the continuous audit thread
12:10:38 <sean-k-mooney> ack
12:10:42 <dviroel> instead I updated the tempest-plugin change
12:10:53 <dviroel> #link https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/954264
12:11:07 <dviroel> to use only one audit as Alfredo suggested
12:11:19 <dviroel> and it turns out that I hit another bug
12:11:31 <dviroel> one from zone_migration that I filed in the past
12:12:05 * dviroel find the link
12:12:11 <dviroel> #link https://bugs.launchpad.net/watcher/+bug/2098984
12:12:33 <dviroel> so i started to hit this issue with continuous audit, with a 10s interval
12:12:40 <dviroel> CI also hit that issue
12:13:19 <sean-k-mooney> is that the issue with not sharing the same model?
12:13:26 <dviroel> no, another one
12:13:34 <sean-k-mooney> oh ok
12:13:38 <dviroel> zone_migration gets instances/volumes from nova/cinder while they aren't yet in the model
12:13:52 <dviroel> it raises an exception, since it is not properly handled
12:13:53 <sean-k-mooney> oh didn't we fix that before
12:13:58 <sean-k-mooney> for other strategies
12:14:16 <sean-k-mooney> we added a polling loop or something like that to make sure the model was synced
12:14:40 <dviroel> this is specific for zone_migration implementation, not all strategies use clients to get info about instances/volumes
12:14:55 <dviroel> the proposed fix:
12:14:57 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/956198/1/watcher/decision_engine/strategy/strategies/zone_migration.py
12:15:10 <sean-k-mooney> oh i see
12:15:24 <sean-k-mooney> you're fixing this from the watcher side not the test side
12:15:34 <dviroel> another patch to add a unit test for this scenario:
12:15:36 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/956197
12:15:37 <sean-k-mooney> ya this feels like it's a real watcher bug
12:16:01 <sean-k-mooney> hum
12:16:13 <sean-k-mooney> so you are planning to fix this by filtering to only the ones in the model
12:16:27 <dviroel> sean-k-mooney: it was doing this already
12:16:45 <dviroel> but not treating the exception
12:16:58 <sean-k-mooney> ah you're right
12:17:14 <sean-k-mooney> so the other way to address this is to update the model with the missing instance
12:17:33 <sean-k-mooney> i guess we can consider that as a later enhancement and fix the exception handling first
12:18:13 <sean-k-mooney> ok i think just handling the exception is more backportable anyway
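The "handle the exception first" approach discussed above could look roughly like the sketch below. This is not the proposed patch: `InstanceNotFound`, `FakeComputeModel`, and `filter_instances_in_model` are hypothetical names used only for illustration, and the sketch assumes that looking up an instance missing from the model raises an exception, as described in the discussion.

```python
import logging

LOG = logging.getLogger(__name__)


class InstanceNotFound(Exception):
    """Hypothetical error raised when an instance is not in the model."""


class FakeComputeModel:
    """Hypothetical stand-in for the cluster data model."""

    def __init__(self, known_ids):
        self._known_ids = set(known_ids)

    def get_instance_by_uuid(self, uuid):
        if uuid not in self._known_ids:
            raise InstanceNotFound(uuid)
        return {"uuid": uuid}


def filter_instances_in_model(instance_ids, model):
    """Keep only instances the model knows about, skipping the rest."""
    present = []
    for uuid in instance_ids:
        try:
            present.append(model.get_instance_by_uuid(uuid))
        except InstanceNotFound:
            # The service reported the instance, but the model has not
            # caught up yet: skip it instead of failing the whole audit.
            LOG.debug("Instance %s not in model yet, skipping", uuid)
    return present


model = FakeComputeModel(["a", "b"])
result = filter_instances_in_model(["a", "new", "b"], model)
```

Skipping unknown instances keeps the change small, which matches the point that this is easier to backport than updating the model with the missing instance.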
12:18:52 <dviroel> right, we can further discuss that, even if strategies should be getting info directly from the services..
12:19:06 <dviroel> but yes, we should backport this one
12:19:38 <dviroel> in the etherpad there is a link to the error in CI, if someone wants to take a look
12:20:01 <dviroel> alright, this bug is not eventlet related
12:20:15 <dviroel> but one change led to another
12:20:22 <dviroel> and I ended up fixing this bug
12:20:58 <dviroel> interesting that the continuous audit test was useful for catching it
12:20:59 <sean-k-mooney> ya so we won't backport any of the eventlet changes but this is a legitimate bug in its own right
12:21:04 <sean-k-mooney> and we likely should backport that
12:21:08 <dviroel> +1
12:21:15 <sean-k-mooney> so thanks for filing a separate tracker and splitting it out
12:21:42 <chandankumar> #link https://bugs.launchpad.net/watcher/+bug/2098984
12:21:45 <dviroel> sure np
12:21:56 <chandankumar> and fix https://review.opendev.org/c/openstack/watcher/+/956198
12:22:10 <sean-k-mooney> chandankumar: yep dviroel linked those above
12:22:40 <chandankumar> yup
12:22:42 <dviroel> alright, if nobody has any questions, that covers my eventlet part
12:23:06 <sean-k-mooney> one
12:23:07 <chandankumar> thank you dviroel for sharing the update :-)
12:23:11 <sean-k-mooney> but slightly unrelated
12:23:22 <sean-k-mooney> the content provider job failed to build https://softwarefactory-project.io/zuul/t/rdoproject.org/build/6a8fe1f8aa174887803d784ec9cebdc4
12:23:47 <chandankumar> sean-k-mooney: the fix merged, few hours back
12:23:48 <dviroel> yeah, it is failing in a lot of jobs, but I haven't started the investigation yet
12:23:52 <sean-k-mooney> have we seen that on other patches or do folks know why
12:24:00 <dviroel> chandankumar: oh, good to know, i was about to ask you
12:24:09 <sean-k-mooney> oh cool
12:24:16 <sean-k-mooney> all good then
12:24:32 <chandankumar> thanks sean-k-mooney for bringing that one
12:24:33 <dviroel> i will recheck the patches afterwards then
12:24:48 <sean-k-mooney> """ The task includes an option with an undefined variable. The error was: {{ ansible_user }}: 'ansible_user' is undefined. 'ansible_user' is undefined. {{ ansible_user }}: 'ansible_user' is undefined. 'ansible_user' is undefined"""
12:25:01 <sean-k-mooney> i think perhaps ansible_user was missing :)
12:25:12 <sean-k-mooney> ansible can be a bit verbose occasionally
12:25:16 <chandankumar> https://github.com/openstack-k8s-operators/ci-framework/commit/225d9d2f4b38a8d8e7e56bd431bb056462aab8c6
12:25:46 <dviroel> yeah right, it was podman role
12:26:20 <rlandy> showed up late yesterday
12:26:26 <rlandy> chandankumar, fixed it today
12:26:31 <dviroel> chandankumar++
12:26:39 <chandankumar> Since no further question, moving now to next topic
12:26:42 <dviroel> chandankumar: we can move to next topic
12:26:57 <chandankumar> #topic Croniter swap with apscheduler
12:27:32 <chandankumar> I was working on above topic and we had a long discussion for the same here https://review.opendev.org/c/openstack/watcher/+/955459/5#message-191158289ed45d4824525724dc38d247c0e8d4bc
12:27:35 <chandankumar> #link https://review.opendev.org/c/openstack/watcher/+/955459/5#message-191158289ed45d4824525724dc38d247c0e8d4bc
12:27:55 <chandankumar> I tried to summarize notes here https://etherpad.opendev.org/p/watcher-croniter-swap, but I will drop them here also
12:28:06 <chandankumar> The review discussed migrating from croniter to apscheduler's CronTrigger.
12:28:17 <chandankumar> Croniter supports a 7 field format (with years and seconds as optional fields) while apscheduler supports a 5 field format.
12:28:34 <chandankumar> The watcher continuous audit spec does not provide any info about supporting the 5 or 7 field format.
12:28:41 <chandankumar> Since we are going to swap croniter usage with apscheduler, we saw a few issues/concerns.
12:28:48 <chandankumar> Upgrade Impact: Existing scheduled jobs (continuous audits) using croniter-specific syntax (which becomes an invalid format) will fail after the migration.
12:28:55 <chandankumar> Critical Failure: ongoing continuous audits created after the "bad-formatted" one will also fail to schedule their next runs, as the worker responsible for scheduling fails with an uncaught exception.
12:29:14 <chandankumar> Thank you sean-k-mooney and Alfredo for actively reviewing and providing feedback on this
12:29:28 <chandankumar> In order to mitigate all these issues, the following plan is suggested:
12:29:36 <chandankumar> 1. We need to add a watcher status check to detect if any audits are using an incompatible interval format.
12:29:42 <chandankumar> 2. we need to deprecate the use of the 6/7 column format and log a warning when it's used. we can do that by trying to use apscheduler, then falling back to croniter if apscheduler cannot parse it.
12:30:02 <chandankumar> 3. do the migration automatically on load from the db.
12:30:08 <chandankumar> 4. provide a CLI tool to do an online migration of the data via watcher-manage to convert from the 6/7 format to the 5 format
12:30:13 <chandankumar> 5. document a manual procedure to do the conversion via the api
12:30:20 <chandankumar> 6. Finally by 2026.2 we will drop the fallback and only use apscheduler.
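Step 2 of the plan above (try the new parser, fall back to the legacy one, and log a deprecation warning) can be sketched as follows. This is a minimal, standard-library-only illustration: `parse_five_field` and `parse_legacy` are hypothetical stand-ins that only count fields, where real code would presumably call apscheduler and croniter instead; `parse_interval` is likewise an assumed name.

```python
import logging

LOG = logging.getLogger(__name__)


def parse_five_field(expr):
    """Hypothetical stand-in for the new 5-field-only parser."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d" % len(fields))
    return tuple(fields)


def parse_legacy(expr):
    """Hypothetical stand-in for the legacy parser (5 to 7 fields)."""
    fields = expr.split()
    if not 5 <= len(fields) <= 7:
        raise ValueError("expected 5 to 7 fields, got %d" % len(fields))
    return tuple(fields)


def parse_interval(expr, primary=parse_five_field, fallback=parse_legacy):
    """Return (parsed, used_fallback); raise ValueError if both fail."""
    try:
        return primary(expr), False
    except ValueError:
        # A ValueError from the fallback propagates to the caller, so a
        # single bad expression is reported rather than silently accepted.
        parsed = fallback(expr)
        LOG.warning("Deprecated cron format in use: %r", expr)
        return parsed, True
```

Returning the `used_fallback` flag lets a caller decide whether to migrate the stored value on load (step 3) without parsing twice.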
12:31:02 <chandankumar> we also need to add proper exception handling and api validations for these formats.
12:31:19 <dviroel> so we will call the 6/7 format invalid already? we will just accept it as input and do the conversion
12:31:21 <chandankumar> The main thing we wanted to discuss is whether to support the 5 field or 7 field format
12:31:34 <sean-k-mooney> the api validation can basically just be "parse it with apscheduler or croniter"
12:31:54 <sean-k-mooney> dviroel: so i could not find anything to say it was ever officially supported
12:32:06 <dviroel> ack, we can justify that was never supported
12:32:07 <sean-k-mooney> the plan above is the most conservative option
12:32:18 <dviroel> and it will be an invalid input in future releases
12:32:21 <dviroel> yeah
12:32:27 <chandankumar> we went over code and specs, there is no mention of formats
12:32:35 <chandankumar> the test uses 5 field format
12:32:43 <dviroel> yeah, I saw your comments about specs/releasenotes
12:33:01 <sean-k-mooney> the aggressive option is to say no, it was never supported, we only support the 5 column format. but even if we did that i think the watcher-status command and possibly a helper command to do the conversion would be good to have
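A status check of the kind mentioned above could, in its simplest form, scan stored audit intervals and flag anything that is not a 5-field expression. The sketch below is only an illustration: `find_incompatible_audits` and the sample data are hypothetical, and treating a purely numeric interval as a period in seconds is an assumption of this sketch, not something stated in the meeting.

```python
def find_incompatible_audits(audits):
    """Return ids of audits whose interval is not a 5-field cron string."""
    flagged = []
    for audit_id, interval in audits:
        text = str(interval).strip()
        if text.isdigit():
            # Assumption: a plain integer is a period in seconds, not cron.
            continue
        if len(text.split()) != 5:
            flagged.append(audit_id)
    return flagged


# Hypothetical sample data: (audit id, stored interval).
audits = [
    ("audit-1", "*/10 * * * *"),
    ("audit-2", "*/10 * * * * 15"),
    ("audit-3", "600"),
    ("audit-4", "0 0 1 1 * 0 2026"),
]
flagged = find_incompatible_audits(audits)
```

A field count is a deliberately coarse test; it only identifies candidates for conversion and does not validate the expression itself.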
12:33:14 <chandankumar> yup
12:33:33 <sean-k-mooney> given someone has taken over maintenance of it again
12:33:38 <dviroel> yes, since there wasn't anything blocking it before
12:33:43 <sean-k-mooney> i think we are ok to take the conservative one
12:34:34 <chandankumar> ok
12:34:37 <dviroel> yeah, looks like a good approach
12:35:02 <chandankumar> one more question, since we have a plan in place, do we want to document the plan in a spec, or would the existing bug be fine to track it?
12:35:18 <sean-k-mooney> we have one other option by the way, we could vendor a 7 column parser in watcher. i would prefer not to but that is an option if we really need that in the future.
12:36:43 <sean-k-mooney> that's a good question
12:36:53 <sean-k-mooney> i think we can use the existing bug
12:37:22 <sean-k-mooney> we may want to have a blueprint or a second bug to track the followup work
12:37:40 <dviroel> or even create more bugs, like the missing API validation, or for the missing doc
12:37:45 <dviroel> etc
12:38:03 <sean-k-mooney> for next cycle and the one after. this does not feel like it needs a spec but i'm not opposed. ya the validation etc. can be tracked separately
12:38:46 <chandankumar> more bugs sounds good.
12:39:52 <chandankumar> I will add this info to the bugs and will update the review based on the plan.
12:40:00 <dviroel> ack chandankumar
12:40:18 <chandankumar> That's all I wanted to discuss on the croniter swap.
12:40:51 <chandankumar> Any questions or concerns on this topic before moving to next one.
12:40:55 <dviroel> tks chandankumar
12:41:11 <dviroel> we can move, lot to cover yet
12:41:16 <chandankumar> thank you sean-k-mooney dviroel for the discussion!
12:41:26 <chandankumar> #topic Open Reviews
12:41:44 <chandankumar> #link https://review.opendev.org/c/openstack/watcher/+/955711 (Fix api-ref doc for GET /infra-optim/v1/data_model)
12:41:48 <dviroel> i have a few to request attention
12:41:56 <dviroel> not going to spend too much time on them
12:42:03 <chandankumar> dviroel: go ahead
12:42:13 <dviroel> there is a doc update, pls check the related bug
12:42:26 <dviroel> #link https://bugs.launchpad.net/watcher/+bug/2117726
12:42:37 <dviroel> we can further discuss in the bug
12:43:00 <dviroel> but the api-ref wasn't reflecting all the fields
12:43:24 <dviroel> and looking at the code, it seems that they were there since the beginning
12:44:14 <dviroel> I also added a few unit tests to validate the response:
12:44:17 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/955820
12:44:44 <dviroel> maybe not the best way to do that, but I accept reviews or proposals for enhancements
12:45:20 <dviroel> and finally, a small update in the extend compute model attributes spec
12:45:22 <dviroel> #link https://review.opendev.org/c/openstack/watcher-specs/+/955921
12:45:27 <sean-k-mooney> we have api sample tests
12:45:40 <sean-k-mooney> so we may want to enhance those too
12:45:51 <dviroel> to also include the flavor extra_specs in compute model
12:46:14 <dviroel> sean-k-mooney: right
12:47:04 <sean-k-mooney> you still have that last one marked as WIP in gerrit
12:47:27 <sean-k-mooney> most projects don't use that feature from my experience but is there a specific reason?
12:47:54 <dviroel> sean-k-mooney: you are talking about:
12:47:58 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/955827 ?
12:48:18 <dviroel> I found an issue and marked it as WIP again, but I can W-1 too
12:48:58 <sean-k-mooney> ack normally we use -w instead
12:49:02 <dviroel> yep
12:49:16 <sean-k-mooney> part of the reason i prefer that
12:49:21 <sean-k-mooney> other than avoiding change :)
12:49:29 <sean-k-mooney> is i like to leave a comment why
12:49:29 <dviroel> done
12:49:55 <sean-k-mooney> i.e. so reviewers know what the issue you found is if it's not obvious
12:50:10 <dviroel> yeah, i will add more details about it in a few
12:50:23 <dviroel> tks
12:50:24 <sean-k-mooney> no worries you mentioned it was an issue with notifications
12:50:43 <sean-k-mooney> that's basically enough to let others know "oh this will get revised again"
12:50:56 <dviroel> ++
12:51:10 <chandankumar> there are a few more reviews from quangngo at the bottom that I am going to cover in this section, if that's ok?
12:51:36 <dviroel> chandankumar: sure, pls go ahead, i will get back to extend-compute-model next week
12:51:50 <chandankumar> Reviews related to Add options to disable migration in host maintenance
12:52:00 <chandankumar> #link https://review.opendev.org/c/openstack/watcher/+/952538
12:52:21 <chandankumar> #link https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/954214 (Add tests for disable migration in host maintenance)
12:52:39 <chandankumar> Please take a look at these reviews.
12:52:42 <sean-k-mooney> that was getting pretty close i think. i looked at much of the code but not the testing in detail
12:52:59 <chandankumar> there are some questions from author on etherpad, let me bring one by one
12:53:01 <dviroel> i still owe reviews there, but it is on my list
12:53:19 <chandankumar> Is it possible for this feature to appear in 2025.02 release?
12:53:45 <dviroel> 2025.2 yes right
12:54:00 <dviroel> we are 4 weeks from the feature freeze
12:54:07 <sean-k-mooney> yes this will likely be in 2025.2
12:54:21 <sean-k-mooney> ubuntu are free to backport this downstream only to their distro
12:54:25 <dviroel> but if the question was 2025.1, that's a no
12:54:31 <sean-k-mooney> but we won't be backporting this upstream
12:54:55 <chandankumar> there was one follow-up question also: Question for Ubuntu SRU: is this feature backportable to any current stable branches? (A no is expected, the Ubuntu SRU decision just requires upstream confirmation)
12:54:55 <sean-k-mooney> we also are unlikely to backport this to our downstream
12:55:21 <sean-k-mooney> features are not allowed to be backported under stable policy
12:55:38 <sean-k-mooney> so this was never a backport candidate
12:56:06 <chandankumar> quangngo: I hope it answers your queries.
12:56:11 <sean-k-mooney> https://docs.openstack.org/project-team-guide/stable-branches.html#appropriate-fixes
12:56:17 <dviroel> ++
12:56:25 <quangngo> yes, we expect that, ack!
12:56:51 <dviroel> quangngo: tks for proposing the patches, I will take a look on those
12:57:04 <chandankumar> Since we have 4 mins left. I am going to move over to next topic
12:57:12 <dviroel> sure
12:57:40 <sean-k-mooney> quangngo: in this particular case canonical likely could backport that enhancement downstream safely
12:57:57 <sean-k-mooney> but it's more risk than we would normally take upstream
12:58:04 <chandankumar> #topic monasca retirement and sdk adoption
12:58:24 <sean-k-mooney> ya so i added that
12:58:29 <sean-k-mooney> tl;dr
12:58:43 <sean-k-mooney> the tc has resolved to continue with the retirement process for monasca
12:59:03 <sean-k-mooney> so in the next few weeks the git repos will be retired and there will be no future releases of monasca
12:59:12 <dviroel> rip monasca
12:59:19 <sean-k-mooney> 5 months ago we deprecated support
12:59:30 <sean-k-mooney> and we had planned to remove it in 2026.2
12:59:42 <sean-k-mooney> to mitigate the impact of the retirement
12:59:58 <sean-k-mooney> i plan to work on some targeted patches to make it an optional dependency for this cycle
13:00:13 <sean-k-mooney> we can discuss for next cycle if we want to accelerate the removal
13:00:15 <dviroel> +1
13:00:17 <sean-k-mooney> or not
13:00:40 <sean-k-mooney> we have no tempest test or docs so i was going to propose dropping it at the start of 2026.1
13:00:44 <dviroel> making the import conditional would be great
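One common way to make a client library an optional dependency is to guard the import and check for it before use. The sketch below shows that general pattern only; `optional_import` and `monasca_available` are hypothetical helper names, and `monascaclient` appears purely as an illustrative module name, not as a statement of how the planned patches will be written.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# 'monascaclient' is used here only as an illustrative module name.
monasca_client = optional_import("monascaclient")


def monasca_available():
    """Report whether the optional client could be imported."""
    return monasca_client is not None
```

Code that needs the optional client can then call `monasca_available()` and raise a clear configuration error when the package is missing, instead of failing at import time.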
13:01:06 <sean-k-mooney> so the follow up to that is we should do the same with all the datasource and openstack project clients
13:01:15 <sean-k-mooney> and ideally replace the project clients with the openstack sdk
13:01:22 <dviroel> +1
13:01:25 <chandankumar> +1
13:01:29 <sean-k-mooney> that is work for next cycle
13:01:52 <chandankumar> thank you sean-k-mooney for bringing that up.
13:01:59 <sean-k-mooney> i will likely draw up a proposal for that prior to the ptg and either create a spec or blueprint
13:02:19 <sean-k-mooney> that's basically all i had.
13:02:28 <chandankumar> Since we are running out of time, I will go with last topic
13:02:33 <dviroel> sean-k-mooney: thanks for that
13:02:56 <chandankumar> #topic volunteer to chair for next week meeting
13:03:10 <chandankumar> Anyone would like to take it?
13:03:12 <dviroel> i can chair, since I will be out on 14th
13:03:40 <chandankumar> thanks dviroel
13:03:44 <chandankumar> time to wrap up
13:03:47 <dviroel> :)
13:03:51 <chandankumar> thank you all for attending
13:03:54 <chandankumar> #endmeeting