12:04:30 #startmeeting watcher
12:04:30 Meeting started Thu Jul 31 12:04:30 2025 UTC and is due to finish in 60 minutes. The chair is chandankumar. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:04:30 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
12:04:30 The meeting name has been set to 'watcher'
12:04:32 the name is required, only the name is enough
12:04:34 :)
12:04:45 courtesy ping: sean-k-mooney chandankumar morenod rlandy
12:04:53 o/
12:04:54 I'm here :)
12:05:00 o/
12:05:06 o/
12:05:16 let's start with today's meeting agenda
12:05:42 #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L21 (Meeting agenda)
12:05:52 feel free to add your own topics to the agenda
12:05:58 Starting with the first one
12:06:06 #topic Eventlet Removal
12:06:09 o/
12:06:22 as usual, the etherpad link
12:06:25 #link https://etherpad.opendev.org/p/watcher-eventlet-removal (watcher eventlet removal etherpad)
12:06:25 #link https://etherpad.opendev.org/p/watcher-eventlet-removal (watcher eventlet removal etherpad)
12:06:29 :)
12:06:38 some minor changes this week
12:07:13 i removed the depends-on changes from the main dec-engine patch
12:07:16 #link https://review.opendev.org/c/openstack/watcher/+/952257 (Extend decision engine to support threading mode)
12:07:47 the devstack one merged, the other one was the tempest-plugin change, which is not required to merge the main one
12:08:22 but there is another DNM change just to test the new continuous audit test:
12:08:33 #link https://review.opendev.org/c/openstack/watcher/+/956199
12:09:11 David proposed openstack/watcher master: Disable real metrics on devstack injected data jobs https://review.opendev.org/c/openstack/watcher/+/955281
12:09:29 note that we discussed replacing the continuous audit test with a unit or functional test
12:09:44 yep devstack change merged yesterday so that unblocks that patch
12:10:24 we can have both
12:10:24 it turns out that I couldn't find a way
yet of mocking everything needed to simulate the behavior found with the continuous audit thread
12:10:38 ack
12:10:42 instead I updated the tempest-plugin change
12:10:53 #link https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/954264
12:11:07 to use only one audit as Alfredo suggested
12:11:19 and it turns out that I hit another bug
12:11:31 one from zone_migration that I filed in the past
12:12:05 * dviroel finds the link
12:12:11 #link https://bugs.launchpad.net/watcher/+bug/2098984
12:12:33 so i started to hit this issue with continuous audit, with a 10s interval
12:12:40 CI also hit that issue
12:13:19 is that the issue with not sharing the same model?
12:13:26 no, another one
12:13:34 oh ok
12:13:38 zone_migration gets instances/volumes from nova/cinder while they aren't yet in the model
12:13:52 it raises an exception, since it is not properly handled
12:13:53 oh didn't we fix that before
12:13:58 for other strategies
12:14:16 we added a polling loop or something like that to make sure the model was synced
12:14:40 this is specific to the zone_migration implementation, not all strategies use clients to get info about instances/volumes
12:14:55 the proposed fix:
12:14:57 #link https://review.opendev.org/c/openstack/watcher/+/956198/1/watcher/decision_engine/strategy/strategies/zone_migration.py
12:15:10 oh i see
12:15:24 you're fixing this from the watcher side not the test side
12:15:34 another patch to add a unit test for this scenario:
12:15:36 #link https://review.opendev.org/c/openstack/watcher/+/956197
12:15:37 ya this feels like it's a real watcher bug
12:16:01 hum
12:16:13 so you are planning to fix this by filtering to only the ones in the model
12:16:27 sean-k-mooney: it was doing this already
12:16:45 but not handling the exception
12:16:58 ah you're right
12:17:14 so the other way to address this is to update the model with the missing instance
12:17:33 i guess we can consider that as a later enhancement and fix the exception handling first
12:18:13 ok i
think just handling the exception is more backportable anyway
12:18:52 right, we can discuss that further, even if strategies should be getting info directly from the services..
12:19:06 but yes, we should backport this one
12:19:38 in the etherpad there is a link to the error in CI, if someone wants to take a look
12:20:01 alright, this bug is not eventlet related
12:20:15 but one change leads to another
12:20:22 and I ended up fixing this bug
12:20:58 interesting that the continuous audit test was useful for catching it
12:20:59 ya so we won't backport any of the eventlet changes but this is a legitimate bug in its own right
12:21:04 and we likely should backport that
12:21:08 +1
12:21:15 so thanks for filing a separate tracker and splitting it out
12:21:42 #link https://bugs.launchpad.net/watcher/+bug/2098984
12:21:45 sure np
12:21:56 and fix https://review.opendev.org/c/openstack/watcher/+/956198
12:22:10 chandankumar: yep dviroel linked those above
12:22:40 yup
12:22:42 alright, if nobody has any questions, that covers my eventlet part
12:23:06 one
12:23:07 thank you dviroel for sharing the update :-)
12:23:11 but slightly unrelated
12:23:22 the content provider job failed to build https://softwarefactory-project.io/zuul/t/rdoproject.org/build/6a8fe1f8aa174887803d784ec9cebdc4
12:23:47 sean-k-mooney: the fix merged, a few hours back
12:23:48 yeah, it is failing in a lot of jobs, but I still didn't start the investigation
12:23:52 have we seen that on other patches or do folks know why
12:24:00 chandankumar: oh, good to know, i was about to ask you
12:24:09 oh cool
12:24:16 all good then
12:24:32 thanks sean-k-mooney for bringing that one up
12:24:33 i will recheck the patches afterwards then
12:24:48 """ The task includes an option with an undefined variable. The error was: {{ ansible_user }}: 'ansible_user' is undefined. 'ansible_user' is undefined. {{ ansible_user }}: 'ansible_user' is undefined.
'ansible_user' is undefined"""
12:25:01 i think perhaps ansible_user was missing :)
12:25:12 ansible can be a bit verbose occasionally
12:25:16 https://github.com/openstack-k8s-operators/ci-framework/commit/225d9d2f4b38a8d8e7e56bd431bb056462aab8c6
12:25:46 yeah right, it was the podman role
12:26:20 showed up late yesterday
12:26:26 chandankumar, fixed it today
12:26:31 chandankumar++
12:26:39 Since there are no further questions, moving now to the next topic
12:26:42 chandankumar: we can move to next topic
12:26:57 #topic Croniter swap with apscheduler
12:27:32 I was working on the above topic and we had a long discussion about it here https://review.opendev.org/c/openstack/watcher/+/955459/5#message-191158289ed45d4824525724dc38d247c0e8d4bc
12:27:35 #link https://review.opendev.org/c/openstack/watcher/+/955459/5#message-191158289ed45d4824525724dc38d247c0e8d4bc
12:27:55 I tried to summarize notes here https://etherpad.opendev.org/p/watcher-croniter-swap, but I will drop them here also
12:28:06 The review discussed migrating from croniter to apscheduler's CronTrigger.
12:28:17 Croniter supports a 7 field format (with years and seconds as optional fields) while apscheduler supports a 5 field format.
12:28:34 The watcher continuous audit specs do not provide any info about supporting 5 or 7 field formats.
12:28:41 Since we are going to swap croniter usage with apscheduler, we saw a few issues/concerns.
12:28:48 Upgrade Impact: Existing scheduled jobs (continuous audits) using croniter-specific syntax (which becomes an invalid format) will fail after the migration.
12:28:55 Critical Failure: ongoing continuous audits created after the "bad-formatted" one will also fail to schedule their next runs, as the worker responsible for scheduling fails with an uncaught exception.
12:29:14 Thank you sean-k-mooney and Alfredo for actively reviewing and providing feedback on this
12:29:28 In order to mitigate these issues, the following plan is suggested:
12:29:36 1.
We need to add a watcher status check to detect if any audits are using an incompatible interval format.
12:29:42 2. we need to deprecate the use of the 6/7 column format and log a warning when it's used. we can do that by trying to use apscheduler then falling back to croniter if apscheduler cannot parse it.
12:30:02 3. do the migration automatically on load from the db.
12:30:08 4. provide a CLI tool to do an online migration of the data via watcher-manage to convert from 6/7 format to 5 format
12:30:13 5. document a manual procedure to do the conversion via the api
12:30:20 6. Finally by 2026.2 we will drop the fallback and only use apscheduler.
12:31:02 we also need to add proper exception handling and api validations for these formats.
12:31:19 so we will call the 6/7 format invalid already? we will just accept its input and do the conversion
12:31:21 The main thing we wanted to discuss is whether to support the 5 field or 7 field format
12:31:34 the api validation can basically just be "parse it with apscheduler or croniter"
12:31:54 dviroel: so i could not find anything to say it was ever officially supported
12:32:06 ack, we can justify that it was never supported
12:32:07 the plan above is the most conservative option
12:32:18 and will be an invalid input in future releases
12:32:21 yeah
12:32:27 we went over code and specs, there is no mention of formats
12:32:35 the test uses 5 field format
12:32:43 yeah, I saw your comments about specs/releasenotes
12:33:01 the aggressive option is to say no it was never supported, we only support the 5 column format.
but even if we did that i think the watcher-status command and possibly a helper command to do the conversion would be good to have
12:33:14 yup
12:33:33 given someone has taken over maintenance of it again
12:33:38 yes, since there wasn't anything blocking it before
12:33:43 i think we are ok to take the conservative one
12:34:34 ok
12:34:37 yeah, looks like a good approach
12:35:02 one more question, since we have a plan in place, do we want to document the plan in a spec or would the existing bug be fine to track it?
12:35:18 we have one other option by the way, we could vendor a 7 column parser in watcher. i would prefer not to but that is an option if we really need that in the future.
12:36:43 that's a good question
12:36:53 i think we can use the existing bug
12:37:22 we may want to have a blueprint or a second bug to track the followup work
12:37:40 or even create more bugs, like the missing API validation, or for the missing doc
12:37:45 etc
12:38:03 for next cycle and the one after. this does not feel like it needs a spec but i'm not opposed. ya the validation etc. can be tracked separately
12:38:46 more bugs sounds good.
12:39:52 I will add this info to the bugs and will update the review based on the plan.
12:40:00 ack chandankumar
12:40:18 That's all I wanted to discuss on the croniter swap.
12:40:51 Any questions or concerns on this topic before moving to the next one?
12:40:55 tks chandankumar
12:41:11 we can move, lots to cover yet
12:41:16 thank you sean-k-mooney dviroel for the discussion!
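A minimal sketch of the status check and deprecation warning discussed in this topic, assuming audit intervals are stored as strings that are either a plain number of seconds or a cron expression. The function names and category labels are hypothetical and are not taken from the Watcher code base; counting fields only separates the 5 field format from the 6/7 field one and does not validate the expression itself.

```python
import warnings


def cron_field_count(interval: str) -> int:
    """Return the number of whitespace-separated fields in a cron string."""
    return len(interval.split())


def classify_interval(interval: str) -> str:
    """Classify an audit interval string (hypothetical categories).

    'seconds' for a plain integer, 'cron5' for a 5 field cron string,
    'cron-extended' for a 6 or 7 field string, 'invalid' otherwise.
    """
    text = interval.strip()
    if text.isdigit():
        return "seconds"
    fields = cron_field_count(text)
    if fields == 5:
        return "cron5"
    if fields in (6, 7):
        return "cron-extended"
    return "invalid"


def find_deprecated_intervals(audits):
    """Return the ids of audits whose interval uses the 6/7 field format.

    `audits` is any iterable of (audit_id, interval) pairs. A
    DeprecationWarning is emitted for every flagged audit.
    """
    flagged = []
    for audit_id, interval in audits:
        if classify_interval(interval) == "cron-extended":
            warnings.warn(
                f"audit {audit_id} uses a deprecated 6/7 field interval: "
                f"{interval!r}",
                DeprecationWarning,
                stacklevel=2,
            )
            flagged.append(audit_id)
    return flagged
```

A status check built this way could report the flagged ids so operators know which audits to convert before the fallback is dropped.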
12:41:26 #topic Open Reviews
12:41:44 #link https://review.opendev.org/c/openstack/watcher/+/955711 (Fix api-ref doc for GET /infra-optim/v1/data_model)
12:41:48 i have a few to request attention for
12:41:56 not going to spend too much time on them
12:42:03 dviroel: go ahead
12:42:13 there is a doc update, pls check the related bug
12:42:26 #link https://bugs.launchpad.net/watcher/+bug/2117726
12:42:37 we can discuss further in the bug
12:43:00 but the api-ref wasn't reflecting all the fields
12:43:24 and looking at the code, it seems that they were there since the beginning
12:44:14 I also added a few unit tests to validate the response:
12:44:17 #link https://review.opendev.org/c/openstack/watcher/+/955820
12:44:44 maybe not the best way to do that, but I welcome reviews or proposals for enhancements
12:45:20 and finally, a small update in the extend compute model attributes spec
12:45:22 #link https://review.opendev.org/c/openstack/watcher-specs/+/955921
12:45:27 we have api sample tests
12:45:40 so we may want to enhance those too
12:45:51 to also include the flavor extra_specs in compute model
12:46:14 sean-k-mooney: right
12:47:04 you still have that last one marked as WIP in gerrit
12:47:27 most projects don't use that feature from my experience but is there a specific reason?
12:47:54 sean-k-mooney: you are talking about:
12:47:58 #link https://review.opendev.org/c/openstack/watcher/+/955827 ?
12:48:18 I found an issue and marked it as WIP again, but I can W-1 too
12:48:58 ack normally we use -w instead
12:49:02 yep
12:49:16 part of the reason i prefer that
12:49:21 other than avoiding change :)
12:49:29 is i like to leave a comment why
12:49:29 done
12:49:55 i.e. so reviewers know what the issue you found is if it's not obvious
12:50:10 yeah, i will add more details about it in a few
12:50:23 tks
12:50:24 no worries you mentioned it was an issue with notifications
12:50:43 that's basically enough to let others know "oh this will get revised again"
12:50:56 ++
12:51:10 there are a few more reviews from quangngo at the bottom that I am going to cover in this section, if that's ok?
12:51:36 chandankumar: sure, pls go ahead, i will get back to extend-compute-model next week
12:51:50 Reviews related to Add options to disable migration in host maintenance
12:52:00 #link https://review.opendev.org/c/openstack/watcher/+/952538
12:52:21 #link Add tests for disable migration in host maintenance https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/954214
12:52:39 Please take a look at these reviews.
12:52:42 that was getting pretty close i think. i looked at much of the code but not the testing in detail
12:52:59 there are some questions from the author on the etherpad, let me bring them up one by one
12:53:01 i still owe reviews there, but it is on my list
12:53:19 Is it possible for this feature to appear in 2025.02 release?
12:53:45 2025.2 yes right
12:54:00 we are 4 weeks from the feature freeze
12:54:07 yes this will likely be in 2025.2
12:54:21 ubuntu is free to backport this downstream only to their distro
12:54:25 but if the question was 2025.1, that's a no
12:54:31 but we won't be backporting this upstream
12:54:55 there was one follow up question also: Question for Ubuntu SRU: backportability of this feature to any current stable branches?
(A no expected, Ubuntu SRU decision just requires upstream confirmation)
12:54:55 we also are unlikely to backport this to our downstream
12:55:21 features are not allowed to be backported under stable policy
12:55:38 so this was never a backport candidate
12:56:06 quangngo: I hope it answers your queries.
12:56:11 https://docs.openstack.org/project-team-guide/stable-branches.html#appropriate-fixes
12:56:17 ++
12:56:25 yes, we expect that, ack!
12:56:51 quangngo: tks for proposing the patches, I will take a look at those
12:57:04 Since we have 4 mins left, I am going to move on to the next topic
12:57:12 sure
12:57:40 quangngo: in this particular case canonical likely could backport that enhancement downstream safely
12:57:57 but it's more risk than we would normally take upstream
12:58:04 #topic monasca retirement and sdk adoption
12:58:24 ya so i added that
12:58:29 tl;dr
12:58:43 the tc has resolved to continue with the retirement process for monasca
12:59:03 so in the next few weeks the git repos will be retired and there will be no future releases of monasca
12:59:12 rip monasca
12:59:19 5 months ago we deprecated support
12:59:30 and we had planned to remove it in 2026.2
12:59:42 to mitigate the impact of the retirement
12:59:58 i plan to work on some targeted patches to make it an optional dependency for this cycle
13:00:13 we can discuss for next cycle if we want to accelerate the removal
13:00:15 +1
13:00:17 or not
13:00:40 we have no tempest test or docs so i was going to propose dropping it at the start of 2026.1
13:00:44 making the import conditional would be great
13:01:06 so the follow up to that is we should do the same with all the datasource and openstack project clients
13:01:15 and ideally replace the project clients with the openstack sdk
13:01:22 +1
13:01:25 +1
13:01:29 that is work for next cycle
13:01:52 thank you sean-k-mooney for bringing that up.
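A minimal sketch of the "conditional import" idea mentioned above, assuming a datasource only needs its client library when it is actually enabled. `load_optional` and `OptionalDatasource` are hypothetical names for illustration and do not reflect how Watcher structures its datasource code.

```python
import importlib


def load_optional(module_name: str):
    """Import a module if it is installed, otherwise return None."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package lands here instead of breaking startup.
        return None


class OptionalDatasource:
    """Hypothetical wrapper that only works when its client is installed."""

    def __init__(self, module_name: str):
        self.module_name = module_name
        self._client_module = load_optional(module_name)

    @property
    def available(self) -> bool:
        """True when the client library could be imported."""
        return self._client_module is not None

    def require(self):
        """Return the client module or raise a clear error if it is missing."""
        if self._client_module is None:
            raise RuntimeError(
                f"datasource client {self.module_name!r} is not installed"
            )
        return self._client_module
```

With this shape the service can start without the client package present, and the error only surfaces when an operator actually configures that datasource.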
13:01:59 i will likely draw up a proposal for that prior to the ptg and either create a spec or blueprint
13:02:19 that's basically all i had.
13:02:28 Since we are running out of time, I will go with the last topic
13:02:33 sean-k-mooney: thanks for that
13:02:56 #topic volunteer to chair for next week meeting
13:03:10 Anyone would like to take it?
13:03:12 i can chair, since I will be out on 14th
13:03:40 thanks dviroel
13:03:44 time to wrap up
13:03:47 :)
13:03:51 thank you all for attending
13:03:54 #endmeeting