12:04:30 <chandankumar> #startmeeting watcher
12:04:30 <opendevmeet> Meeting started Thu Jul 31 12:04:30 2025 UTC and is due to finish in 60 minutes. The chair is chandankumar. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:04:30 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
12:04:30 <opendevmeet> The meeting name has been set to 'watcher'
12:04:32 <dviroel> the name is required, only the name is enough
12:04:34 <dviroel> :)
12:04:45 <chandankumar> courtesy ping: sean-k-mooney chandankumar morenod rlandy
12:04:53 <sean-k-mooney> o/
12:04:54 <rlandy> I'm here :)
12:05:00 <chandankumar> o/
12:05:06 <morenod> o/
12:05:16 <chandankumar> let's start with today's meeting agenda
12:05:42 <chandankumar> #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L21 (Meeting agenda)
12:05:52 <chandankumar> feel free to add your own topics to the agenda
12:05:58 <chandankumar> Starting with the first one
12:06:06 <chandankumar> #topic Eventlet Removal
12:06:09 <dviroel> o/
12:06:22 <dviroel> as usual, the etherpad link
12:06:25 <dviroel> #link https://etherpad.opendev.org/p/watcher-eventlet-removal (watcher eventlet removal etherpad)
12:06:25 <chandankumar> #link https://etherpad.opendev.org/p/watcher-eventlet-removal (watcher eventlet removal etherpad)
12:06:29 <dviroel> :)
12:06:38 <dviroel> some minor changes this week
12:07:13 <dviroel> i removed the depends-on changes from the main dec-engine patch
12:07:16 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/952257 (Extend decision engine to support threading mode)
12:07:47 <dviroel> the devstack one merged, the other one was the tempest-plugin change, which is not required to merge the main one
12:08:22 <dviroel> but there is another DNM change just to test the new continuous audit test:
12:08:33 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/956199
12:09:11 <opendevreview> David proposed openstack/watcher master: Disable real metrics on devstack injected data jobs https://review.opendev.org/c/openstack/watcher/+/955281
12:09:29 <dviroel> note that we discussed replacing the continuous audit test with a unit or functional test
12:09:44 <sean-k-mooney> yep devstack change merged yesterday so that unblocks that patch
12:10:24 <sean-k-mooney> we can have both
12:10:24 <dviroel> it turns out that I couldn't find a way yet of mocking everything needed to simulate the behavior found with the continuous audit thread
12:10:38 <sean-k-mooney> ack
12:10:42 <dviroel> instead I updated the tempest-plugin change
12:10:53 <dviroel> #link https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/954264
12:11:07 <dviroel> to use only one audit as Alfredo suggested
12:11:19 <dviroel> and it turns out that I hit another bug
12:11:31 <dviroel> one from zone_migration that I filed in the past
12:12:05 * dviroel finds the link
12:12:11 <dviroel> #link https://bugs.launchpad.net/watcher/+bug/2098984
12:12:33 <dviroel> so i started to hit this issue with continuous audit, with a 10s interval
12:12:40 <dviroel> CI also hit that issue
12:13:19 <sean-k-mooney> is that the issue with not sharing the same model?
12:13:26 <dviroel> no, another one
12:13:34 <sean-k-mooney> oh ok
12:13:38 <dviroel> zone_migration gets instances/volumes from nova/cinder while they aren't yet in the model
12:13:52 <dviroel> it raises an exception, since it is not properly handled
12:13:53 <sean-k-mooney> oh didn't we fix that before
12:13:58 <sean-k-mooney> for other strategies
12:14:16 <sean-k-mooney> we added a polling loop or something like that to make sure the model was synced
12:14:40 <dviroel> this is specific to the zone_migration implementation, not all strategies use clients to get info about instances/volumes
12:14:55 <dviroel> the proposed fix:
12:14:57 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/956198/1/watcher/decision_engine/strategy/strategies/zone_migration.py
12:15:10 <sean-k-mooney> oh i see
12:15:24 <sean-k-mooney> you're fixing this from the watcher side not the test side
12:15:34 <dviroel> another patch to add a unit test for this scenario:
12:15:36 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/956197
12:15:37 <sean-k-mooney> ya this feels like it's a real watcher bug
12:16:01 <sean-k-mooney> hum
12:16:13 <sean-k-mooney> so you are planning to fix this by filtering to only the ones in the model
12:16:27 <dviroel> sean-k-mooney: it was doing this already
12:16:45 <dviroel> but not handling the exception
12:16:58 <sean-k-mooney> ah you're right
12:17:14 <sean-k-mooney> so the other way to address this is to update the model with the missing instance
12:17:33 <sean-k-mooney> i guess we can consider that as a later enhancement and fix the exception handling first
12:18:13 <sean-k-mooney> ok i think just handling the exception is more backportable anyway
12:18:52 <dviroel> right, we can further discuss that, even if strategies should be getting info directly from the services..
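A minimal sketch of the exception-handling approach discussed above, not the code from review 956198. It assumes a model lookup that raises when an instance reported by the service has not been synced into the model yet. `ComputeModel`, `InstanceNotFound`, `get_instance_by_uuid`, and `filter_instances_in_model` are hypothetical stand-ins used only for illustration.

```python
import logging

LOG = logging.getLogger(__name__)


class InstanceNotFound(Exception):
    """Hypothetical stand-in for the model's 'instance not found' error."""


class ComputeModel:
    """Hypothetical, minimal stand-in for a cluster data model."""

    def __init__(self, instances):
        self._instances = dict(instances)

    def get_instance_by_uuid(self, uuid):
        try:
            return self._instances[uuid]
        except KeyError:
            raise InstanceNotFound(uuid) from None


def filter_instances_in_model(model, service_instance_uuids):
    """Keep only the instances that the model already knows about.

    Instances reported by the service but not yet present in the model
    are skipped with a log message instead of aborting the whole audit.
    """
    known = []
    for uuid in service_instance_uuids:
        try:
            known.append(model.get_instance_by_uuid(uuid))
        except InstanceNotFound:
            LOG.debug("Instance %s is not in the model yet; skipping", uuid)
    return known
```

With a model that only contains instance "a", calling `filter_instances_in_model(model, ["a", "b"])` returns the entry for "a" and skips "b". Updating the model with the missing instance, the alternative raised in the discussion, is deliberately out of scope here.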
12:19:06 <dviroel> but yes, we should backport this one
12:19:38 <dviroel> in the etherpad there is a link to the error in CI, if someone wants to take a look
12:20:01 <dviroel> alright, this bug is not eventlet related
12:20:15 <dviroel> but one change leads to another
12:20:22 <dviroel> and I ended up fixing this bug
12:20:58 <dviroel> interesting that the continuous audit test was useful for catching it
12:20:59 <sean-k-mooney> ya so we won't backport any of the eventlet changes but this is a legitimate bug in its own right
12:21:04 <sean-k-mooney> and we likely should backport that
12:21:08 <dviroel> +1
12:21:15 <sean-k-mooney> so thanks for filing a separate tracker and splitting it out
12:21:42 <chandankumar> #link https://bugs.launchpad.net/watcher/+bug/2098984
12:21:45 <dviroel> sure np
12:21:56 <chandankumar> and fix https://review.opendev.org/c/openstack/watcher/+/956198
12:22:10 <sean-k-mooney> chandankumar: yep dviroel linked those above
12:22:40 <chandankumar> yup
12:22:42 <dviroel> alright, if nobody has any questions, that covers my eventlet part
12:23:06 <sean-k-mooney> one
12:23:07 <chandankumar> thank you dviroel for sharing the update :-)
12:23:11 <sean-k-mooney> but slightly unrelated
12:23:22 <sean-k-mooney> the content provider job failed to build https://softwarefactory-project.io/zuul/t/rdoproject.org/build/6a8fe1f8aa174887803d784ec9cebdc4
12:23:47 <chandankumar> sean-k-mooney: the fix merged a few hours back
12:23:48 <dviroel> yeah, it is failing in a lot of jobs, but I still didn't start the investigation
12:23:52 <sean-k-mooney> have we seen that on other patches or do folks know why
12:24:00 <dviroel> chandankumar: oh, good to know, i was about to ask you
12:24:09 <sean-k-mooney> oh cool
12:24:16 <sean-k-mooney> all good then
12:24:32 <chandankumar> thanks sean-k-mooney for bringing that one up
12:24:33 <dviroel> i will recheck the patches afterwards then
12:24:48 <sean-k-mooney> """ The task includes an option with an undefined variable. The error was: {{ ansible_user }}: 'ansible_user' is undefined. 'ansible_user' is undefined. {{ ansible_user }}: 'ansible_user' is undefined. 'ansible_user' is undefined"""
12:25:01 <sean-k-mooney> i think perhaps ansible_user was missing :)
12:25:12 <sean-k-mooney> ansible can be a bit verbose occasionally
12:25:16 <chandankumar> https://github.com/openstack-k8s-operators/ci-framework/commit/225d9d2f4b38a8d8e7e56bd431bb056462aab8c6
12:25:46 <dviroel> yeah right, it was the podman role
12:26:20 <rlandy> showed up late yesterday
12:26:26 <rlandy> chandankumar, fixed it today
12:26:31 <dviroel> chandankumar++
12:26:39 <chandankumar> Since no further questions, moving now to next topic
12:26:42 <dviroel> chandankumar: we can move to next topic
12:26:57 <chandankumar> #topic Croniter swap with apscheduler
12:27:32 <chandankumar> I was working on the above topic and we had a long discussion about it here https://review.opendev.org/c/openstack/watcher/+/955459/5#message-191158289ed45d4824525724dc38d247c0e8d4bc
12:27:35 <chandankumar> #link https://review.opendev.org/c/openstack/watcher/+/955459/5#message-191158289ed45d4824525724dc38d247c0e8d4bc
12:27:55 <chandankumar> I tried to summarize notes here https://etherpad.opendev.org/p/watcher-croniter-swap, but I will drop them here also
12:28:06 <chandankumar> The review discussed migrating from croniter to apscheduler's CronTrigger.
12:28:17 <chandankumar> croniter supports a 7 field format (with seconds and year as optional fields) while apscheduler supports a 5 field format.
12:28:34 <chandankumar> The watcher continuous audit spec does not provide any info about supporting the 5 or 7 field format.
12:28:41 <chandankumar> Since we are going to swap croniter usage with apscheduler, we saw a few issues/concerns.
12:28:48 <chandankumar> Upgrade Impact: Existing scheduled jobs (continuous audits) using croniter-specific syntax (which becomes an invalid format) will fail after the migration.
12:28:55 <chandankumar> Critical Failure: ongoing continuous audits created after the badly formatted one will also fail to schedule their next runs, as the worker responsible for scheduling fails with an uncaught exception.
12:29:14 <chandankumar> Thank you sean-k-mooney and Alfredo for actively reviewing and providing feedback on this
12:29:28 <chandankumar> In order to mitigate all these issues, the following plan is suggested:
12:29:36 <chandankumar> 1. We need to add a watcher-status check to detect if any audits are using an incompatible interval format.
12:29:42 <chandankumar> 2. we need to deprecate the use of the 6/7 column format and log a warning when it's used. we can do that by trying to use apscheduler and then falling back to croniter if apscheduler cannot parse it.
12:30:02 <chandankumar> 3. do the migration automatically on load from the db.
12:30:08 <chandankumar> 4. provide a CLI tool to do an online migration of the data via watcher-manage to convert from the 6/7 format to the 5 format
12:30:13 <chandankumar> 5. document a manual procedure to do the conversion via the api
12:30:20 <chandankumar> 6. Finally by 2026.2 we will drop the fallback and only use apscheduler.
12:31:02 <chandankumar> we also need to add proper exception handling and api validations for these formats.
12:31:19 <dviroel> so we will call the 6/7 format invalid already? we will just accept its input and do the conversion
12:31:21 <chandankumar> The main thing we wanted to discuss is whether to support the 5 field or 7 field format
12:31:34 <sean-k-mooney> the api validation can basically just be "parse it with apscheduler or croniter"
12:31:54 <sean-k-mooney> dviroel: so i could not find anything to say it was ever officially supported
12:32:06 <dviroel> ack, we can justify that it was never supported
12:32:07 <sean-k-mooney> the plan above is the most conservative option
12:32:18 <dviroel> and it will be an invalid input in future releases
12:32:21 <dviroel> yeah
12:32:27 <chandankumar> we went over code and specs, there is no mention of formats
12:32:35 <chandankumar> the test uses the 5 field format
12:32:43 <dviroel> yeah, I saw your comments about specs/releasenotes
12:33:01 <sean-k-mooney> the aggressive option is to say no, it was never supported, we only support the 5 column format. but even if we did that i think the watcher-status command and possibly a helper command to do the conversion would be good to have
12:33:14 <chandankumar> yup
12:33:33 <sean-k-mooney> given someone has taken over maintenance of it again
12:33:38 <dviroel> yes, since there wasn't anything blocking it before
12:33:43 <sean-k-mooney> i think we are ok to take the conservative one
12:34:34 <chandankumar> ok
12:34:37 <dviroel> yeah, looks like a good approach
12:35:02 <chandankumar> one more question, since we have a plan in place, do we want to document the plan in a spec or would the existing bug be fine to track it?
12:35:18 <sean-k-mooney> we have one other option by the way, we could vendor a 7 column parser in watcher. i would prefer not to but that is an option if we really need that in the future.
12:36:43 <sean-k-mooney> that's a good question
12:36:53 <sean-k-mooney> i think we can use the existing bug
12:37:22 <sean-k-mooney> we may want to have a blueprint or a second bug to track the followup work
12:37:40 <dviroel> or even create more bugs, like the missing API validation, or for the missing doc
12:37:45 <dviroel> etc
12:38:03 <sean-k-mooney> for next cycle and the one after. this does not feel like it needs a spec but i'm not opposed. ya the validation etc. can be tracked separately
12:38:46 <chandankumar> more bugs sounds good.
12:39:52 <chandankumar> I will add this info to the bugs and will update the review based on the plan.
12:40:00 <dviroel> ack chandankumar
12:40:18 <chandankumar> That's all I wanted to discuss on the croniter swap.
12:40:51 <chandankumar> Any questions or concerns on this topic before moving to the next one?
12:40:55 <dviroel> tks chandankumar
12:41:11 <dviroel> we can move, lots to cover yet
12:41:16 <chandankumar> thank you sean-k-mooney dviroel for the discussion!
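Step 2 of the plan above (try the strict parser first, fall back to the lenient one, and log a deprecation warning) can be sketched as follows. This is an illustration under stated assumptions, not the code under review: `five_field_parser` and `lenient_parser` are hypothetical stand-ins for the apscheduler and croniter parsers, and the sketch assumes both signal a parse failure with `ValueError`, which would need to be checked against the real libraries.

```python
import logging

LOG = logging.getLogger(__name__)


def five_field_parser(expression):
    """Hypothetical stand-in for a strict parser: 5-field cron strings only."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d" % len(fields))
    return ("primary", len(fields))


def lenient_parser(expression):
    """Hypothetical stand-in for a lenient parser: 5 to 7 fields."""
    fields = expression.split()
    if not 5 <= len(fields) <= 7:
        raise ValueError("expected 5 to 7 fields, got %d" % len(fields))
    return ("fallback", len(fields))


def parse_with_fallback(expression, primary, fallback, warn=LOG.warning):
    """Parse with the primary parser; fall back and warn if it refuses.

    If the fallback also rejects the expression, its ValueError propagates
    so that API validation can reject the input.
    """
    try:
        return primary(expression)
    except ValueError:
        parsed = fallback(expression)
        warn("Cron expression %r uses a deprecated format" % expression)
        return parsed
```

Passing the parsers in as arguments keeps the fallback easy to drop later: once the deprecation period ends, the caller can invoke the primary parser directly. The same function could back the API validation mentioned in the discussion, since an expression that neither parser accepts raises.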
12:41:26 <chandankumar> #topic Open Reviews
12:41:44 <chandankumar> #link https://review.opendev.org/c/openstack/watcher/+/955711 (Fix api-ref doc for GET /infra-optim/v1/data_model)
12:41:48 <dviroel> i have a few to request attention on
12:41:56 <dviroel> not going to spend too much time on them
12:42:03 <chandankumar> dviroel: go ahead
12:42:13 <dviroel> there is a doc update, pls check the related bug
12:42:26 <dviroel> #link https://bugs.launchpad.net/watcher/+bug/2117726
12:42:37 <dviroel> we can further discuss in the bug
12:43:00 <dviroel> but the api-ref wasn't reflecting all the fields
12:43:24 <dviroel> and looking at the code, it seems that they were there since the beginning
12:44:14 <dviroel> I also added a few unit tests to validate the response:
12:44:17 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/955820
12:44:44 <dviroel> maybe not the best way to do that, but I accept reviews or proposals for enhancements
12:45:20 <dviroel> and finally, a small update in the extend compute model attributes spec
12:45:22 <dviroel> #link https://review.opendev.org/c/openstack/watcher-specs/+/955921
12:45:27 <sean-k-mooney> we have api sample tests
12:45:40 <sean-k-mooney> so we may want to enhance those too
12:45:51 <dviroel> to also include the flavor extra_specs in the compute model
12:46:14 <dviroel> sean-k-mooney: right
12:47:04 <sean-k-mooney> you still have that last one marked as WIP in gerrit
12:47:27 <sean-k-mooney> most projects don't use that feature from my experience but is there a specific reason?
12:47:54 <dviroel> sean-k-mooney: you are talking about:
12:47:58 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/955827 ?
12:48:18 <dviroel> I found an issue and marked it as WIP again, but I can W-1 too
12:48:58 <sean-k-mooney> ack normally we use -w instead
12:49:02 <dviroel> yep
12:49:16 <sean-k-mooney> part of the reason i prefer that
12:49:21 <sean-k-mooney> other than avoiding change :)
12:49:29 <sean-k-mooney> is i like to leave a comment why
12:49:29 <dviroel> done
12:49:55 <sean-k-mooney> i.e. so reviewers know what the issue you found is if it's not obvious
12:50:10 <dviroel> yeah, i will add more details about it in a few
12:50:23 <dviroel> tks
12:50:24 <sean-k-mooney> no worries you mentioned it was an issue with notifications
12:50:43 <sean-k-mooney> that's basically enough to let others know "oh this will get revised again"
12:50:56 <dviroel> ++
12:51:10 <chandankumar> there are a few more reviews from quangngo at the bottom that I am going to cover in this section, if that's ok?
12:51:36 <dviroel> chandankumar: sure, pls go ahead, i will get back to extend-compute-model next week
12:51:50 <chandankumar> Reviews related to Add options to disable migration in host maintenance
12:52:00 <chandankumar> #link https://review.opendev.org/c/openstack/watcher/+/952538
12:52:21 <chandankumar> #link https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/954214 (Add tests for disable migration in host maintenance)
12:52:39 <chandankumar> Please take a look at these reviews.
12:52:42 <sean-k-mooney> that was getting pretty close i think. i looked at much of the code but not the testing in detail
12:52:59 <chandankumar> there are some questions from the author on the etherpad, let me bring them up one by one
12:53:01 <dviroel> i still owe reviews there, but it is on my list
12:53:19 <chandankumar> Is it possible for this feature to appear in 2025.02 release?
12:53:45 <dviroel> 2025.2 yes right
12:54:00 <dviroel> we are 4 weeks from the feature freeze
12:54:07 <sean-k-mooney> yes this will likely be in 2025.2
12:54:21 <sean-k-mooney> ubuntu are free to backport this downstream only to their distro
12:54:25 <dviroel> but if the question was 2025.1, that's a no
12:54:31 <sean-k-mooney> but we won't be backporting this upstream
12:54:55 <chandankumar> there was one follow up question also: Question for Ubuntu SRU: backportability of this feature to any current stable branches? (A no is expected, the Ubuntu SRU decision just requires upstream confirmation)
12:54:55 <sean-k-mooney> we also are unlikely to backport this to our downstream
12:55:21 <sean-k-mooney> features are not allowed to be backported under stable policy
12:55:38 <sean-k-mooney> so this was never a backport candidate
12:56:06 <chandankumar> quangngo: I hope that answers your queries.
12:56:11 <sean-k-mooney> https://docs.openstack.org/project-team-guide/stable-branches.html#appropriate-fixes
12:56:17 <dviroel> ++
12:56:25 <quangngo> yes, we expect that, ack!
12:56:51 <dviroel> quangngo: tks for proposing the patches, I will take a look at those
12:57:04 <chandankumar> Since we have 4 mins left. I am going to move over to next topic
12:57:12 <dviroel> sure
12:57:40 <sean-k-mooney> quangngo: in this particular case canonical likely could backport that enhancement downstream safely
12:57:57 <sean-k-mooney> but it's more risk than we would normally take upstream
12:58:04 <chandankumar> #topic monasca retirement and sdk adoption
12:58:24 <sean-k-mooney> ya so i added that
12:58:29 <sean-k-mooney> tl;dr
12:58:43 <sean-k-mooney> the tc has resolved to continue with the retirement process for monasca
12:59:03 <sean-k-mooney> so in the next few weeks the git repos will be retired and there will be no future releases of monasca
12:59:12 <dviroel> rip monasca
12:59:19 <sean-k-mooney> 5 months ago we deprecated support
12:59:30 <sean-k-mooney> and we had planned to remove it in 2026.2
12:59:42 <sean-k-mooney> to mitigate the impact of the retirement
12:59:58 <sean-k-mooney> i plan to work on some targeted patches to make it an optional dependency for this cycle
13:00:13 <sean-k-mooney> we can discuss for next cycle if we want to accelerate the removal
13:00:15 <dviroel> +1
13:00:17 <sean-k-mooney> or not
13:00:40 <sean-k-mooney> we have no tempest tests or docs so i was going to propose dropping it at the start of 2026.1
13:00:44 <dviroel> making the import conditional would be great
13:01:06 <sean-k-mooney> so the follow up to that is we should do the same with all the datasource and openstack project clients
13:01:15 <sean-k-mooney> and ideally replace the project clients with the openstack sdk
13:01:22 <dviroel> +1
13:01:25 <chandankumar> +1
13:01:29 <sean-k-mooney> that is work for next cycle
13:01:52 <chandankumar> thank you sean-k-mooney for bringing that up.
13:01:59 <sean-k-mooney> i will likely draw up a proposal for that prior to the ptg and either create a spec or blueprint
13:02:19 <sean-k-mooney> that's basically all i had.
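The "optional dependency" and "conditional import" idea mentioned above could look roughly like this. It is a generic sketch, not the planned watcher patch: `load_optional_client` and `OptionalDatasource` are hypothetical names, and the module name is whatever client library the datasource needs.

```python
import importlib


def load_optional_client(module_name):
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


class OptionalDatasource:
    """Hypothetical wrapper around a datasource whose client may be absent."""

    def __init__(self, module_name):
        self.module_name = module_name
        self._module = load_optional_client(module_name)

    @property
    def available(self):
        return self._module is not None

    def require(self):
        """Return the client module, failing only when it is actually used."""
        if self._module is None:
            raise RuntimeError(
                "datasource client %r is not installed" % self.module_name)
        return self._module
```

Deferring the failure to `require()` means a deployment that never selects the datasource does not need its client library installed, while one that does select it gets an explicit error instead of an import failure at service start.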
13:02:28 <chandankumar> Since we are running out of time, I will go to the last topic
13:02:33 <dviroel> sean-k-mooney: thanks for that
13:02:56 <chandankumar> #topic volunteer to chair next week's meeting
13:03:10 <chandankumar> Would anyone like to take it?
13:03:12 <dviroel> i can chair, since I will be out on 14th
13:03:40 <chandankumar> thanks dviroel
13:03:44 <chandankumar> time to wrap up
13:03:47 <dviroel> :)
13:03:51 <chandankumar> thank you all for attending
13:03:54 <chandankumar> #endmeeting