Thursday, 2025-10-02

dviroelhi all, watcher meeting will start in 15m, here in this channel11:46
opendevreviewAlfredo Moralejo proposed openstack/watcher master: [WIP] Add end-to-end strategies execution tests  https://review.opendev.org/c/openstack/watcher/+/96278411:54
dviroel#startmeeting watcher12:01
opendevmeetMeeting started Thu Oct  2 12:01:22 2025 UTC and is due to finish in 60 minutes.  The chair is dviroel. Information about MeetBot at http://wiki.debian.org/MeetBot.12:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.12:01
opendevmeetThe meeting name has been set to 'watcher'12:01
dviroelhi all o/12:01
dviroelwho is around today?12:01
jgilabero/12:02
morenodo/12:02
amoralejo/12:02
dviroelcourtesy ping: sean-k-mooney chandankumar rlandy12:02
dviroelok, let's start with today's meeting agenda12:02
dviroel#link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L28 (Meeting agenda)12:02
dviroelfeel free to add your own topics to our agenda12:03
sean-k-mooneyo/12:03
dviroel#topic Announcements12:03
dviroelone quick announcement12:03
dviroelwhich you should already know12:04
dviroelOpenStack Flamingo is now released!12:04
dviroelhttps://lists.openstack.org/archives/list/openstack-announce@lists.openstack.org/thread/7ANVEGX7NMVJ7A6ROGCPGXGC7NWQ4UBT/12:04
dviroelthat reminds us that we can get our backports to stable/2025.212:05
dviroeland continue with backports to older releases12:05
sean-k-mooneyyep that is very true12:05
dviroeli also added that comment in reviews topic12:05
dviroelany other announcement before we move on?12:06
dviroelok12:06
dviroel#topic PTG schedule for Watcher12:06
dviroelwe started this discussion last week12:07
dviroelwe just need to define our schedule, so I can book our room12:07
dviroelthere were 2 proposals from last week12:07
dviroel1- Wed-Fri (13 UTC, 14 UTC, 15 UTC)12:07
dviroel2 - Tue-Thu (13 UTC, 14 UTC, 15 UTC)12:07
dviroelboth of them will conflict with other projects, like noa12:08
dviroels/noa/nova12:08
dviroelI see that sean-k-mooney also proposes to go with Tue-Thu12:09
sean-k-mooney2 would be my preference yes12:09
sean-k-mooneyeither are ok12:09
amoralej+1 to Tue-Thu12:09
dviroelthis gives Fri as backup, or to join any other session12:09
dviroelI am also ok with Tue-Thu, and we can still block the time slots to join other teams' discussions12:10
dviroelso let's keep Tue-Thu (13 UTC, 14 UTC, 15 UTC) and if needed, we add Fri to the list12:10
jgilaber+1 to option 212:10
dviroel#agree watcher ptg schedule to be booked: Tuesday, Wednesday and Thurdsays, 13 UTC, 14 UTC, 15 UTC12:11
dviroel#undo12:12
opendevmeetRemoving item from minutes: #agreed watcher ptg schedule to be booked: Tuesday, Wednesday and Thurdsays, 13 UTC, 14 UTC, 15 UTC12:12
dviroel#agreed watcher ptg schedule to be booked: Tuesday, Wednesday and Thursday, at 13 UTC, 14 UTC, 15 UTC12:12
dviroelok,  our planning etherpad is here:12:12
dviroel#link https://etherpad.opendev.org/p/watcher-2026.1-ptg12:12
dviroelplease add any other topics, with your name, that you plan to cover12:13
dviroeli should work on setting the slots for each12:13
dviroel#action dviroel to book ptg slots for watcher12:14
dviroelok, anything else in this topic?12:14
dviroel#topic Unused code to handle Active/Active decision-engine services12:14
amoralejthat's mine12:14
dviroelall yours12:15
amoralejso, while digging into the A/A solution for decision-engine i found this https://github.com/openstack/watcher/blob/master/watcher/api/scheduling.py12:15
amoralejthat implements a basic approach to move audits to alive decision-engines when one is detected as dead12:16
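The rebalancing idea described here (move audits away from a decision-engine detected as dead) can be sketched roughly as follows. This is a minimal illustration, not the code in watcher/api/scheduling.py: the function name `rebalance_audits`, the plain dict of audit id to hostname, and the round-robin choice of target are all assumptions made for the example.

```python
from itertools import cycle


def rebalance_audits(audits, alive_hosts):
    """Reassign audits owned by dead hosts to alive hosts, round-robin.

    audits: dict mapping a (hypothetical) audit id to its current hostname.
    alive_hosts: collection of hostnames whose decision-engine is up.
    Returns a new dict; audits already on an alive host are left untouched.
    """
    if not alive_hosts:
        # Nobody can take the work; leave assignments unchanged.
        return dict(audits)
    # Sort so that the assignment is deterministic across runs.
    targets = cycle(sorted(alive_hosts))
    result = {}
    for audit_id in sorted(audits):
        host = audits[audit_id]
        result[audit_id] = host if host in alive_hosts else next(targets)
    return result


audits = {"a1": "de-1", "a2": "de-2", "a3": "de-1"}
print(rebalance_audits(audits, {"de-2", "de-3"}))
```

Sorting both the audits and the hosts keeps the result stable, which matters if more than one process could compute the same reassignment.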
sean-k-mooneyya that's more of a stopgap than a real solution12:16
amoralejyes12:16
sean-k-mooneythat does not mean we can't start with that but it should not be in the api12:16
amoralejso, i have two questions, could this be a short-term fix?12:16
amoralejif so, how/where should we run it?12:17
sean-k-mooneyif we add a new watcher-scheduler console script entry point that just runs this then yes12:17
amoralejyes, that's what i was thinking12:17
sean-k-mooneyif we do that we can then build on that to build a real solution12:17
amoralejthat'd be fine, but in that case, we should also define what would be the long-term real solution (that may be the discussion in ptg)12:18
sean-k-mooneyi mentioned this in the ptg topic but my approach would be to have the scheduler dispatch work to the decision engine via an rpc to a shared queue that the decision engines (plural) listen on12:18
amoralejanother option would be to run it as a thread in the decision-engine, but i don't know if i like that12:19
sean-k-mooneyso decision engines would not poll and would not do any scheduling of the continuous audits at all12:19
sean-k-mooneyamoralej: we could, but if we did that12:19
amoralejdecision-engine would just get an rpc at the time to execute it and would execute it, right?12:19
sean-k-mooneythen we would need to also have a distributed lock/memory view12:19
sean-k-mooneyso that only one of the services did the reshuffle12:20
sean-k-mooneyamoralej: yes that is my idea12:20
amoralejyes, that's another question, we should only run a instance of the scheduler12:20
amoraleji mean, in the short term solution12:20
amoralejwith the simple approach12:20
sean-k-mooneywe would remove the use of apscheduler entirely from applier and decision engine12:20
amoralejor do some kind of locking12:20
sean-k-mooneyand make them purely workers that execute actions or audits in response to an rpc message 12:20
sean-k-mooneyamoralej: we can also do a very simple form of leader election12:21
sean-k-mooneyi did that in nova's schduler recently12:21
amoralejbased on something in db?12:21
sean-k-mooneyso all decision engines could run it in a new thread 12:21
sean-k-mooneyyes i can find the patch and link it to you after the meeting12:22
sean-k-mooneyit's a very simple approach based on a rendezvous hash12:22
amoralejno, no, no problem, i was just thinking in if we would need some other infra service12:22
amoralejactually, i think something simple would be enough12:22
sean-k-mooneyhttps://github.com/openstack/nova/commit/e98393c5c26743ec4c862af3e1a4beaa7f2d174b12:23
amoralejis it worth discussing and sending a patch for the 1st step intermediate solution before PTG, or better to hold on and do the full discussion in PTG?12:23
sean-k-mooneyi think having a poc to look at is good12:23
sean-k-mooneyso if you have time why not 12:24
sean-k-mooneybut if you look at https://github.com/openstack/nova/commit/e98393c5c26743ec4c862af3e1a4beaa7f2d174b#diff-ce71048fd132b0db262fad40554388acaa624bbb621e1141020c30840cdc1472R11212:24
amoralejthanks12:24
sean-k-mooneyi literally just looked up the set of schedulers (in our case decision engines), filtered them by up and sorted them to make it stable12:24
sean-k-mooneyand then the first one does the work and the rest early return12:24
sean-k-mooneywe could do the same thing with this rebalance approach12:25
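The leader selection described in the last few messages (list the services, keep the ones that are up, sort for stability, let the first one do the work) can be sketched as below. This is an illustrative sketch only and is not taken from the linked nova commit or from watcher; the `Service` dataclass and the function names `elect_leader` and `maybe_rebalance` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical stand-in for a decision-engine service record."""
    host: str
    is_up: bool


def elect_leader(services):
    """Return the host of the elected leader, or None if no service is up."""
    # Filter by "up", then sort so every service computes the same answer.
    alive = sorted(s.host for s in services if s.is_up)
    return alive[0] if alive else None


def maybe_rebalance(my_host, services, rebalance):
    """Call rebalance() only on the leader; every other host returns early."""
    if elect_leader(services) != my_host:
        return False
    rebalance()
    return True


services = [Service("de-2", True), Service("de-1", False), Service("de-3", True)]
print(elect_leader(services))
```

No coordination service is needed as long as all services read the same service list. If the leader dies, the next periodic run on the remaining hosts picks a new first entry, which is the "heals over time" property.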
amoralejyes, i think that would be enough12:25
dviroel+112:25
amoralejso, i understand we are fine with going for this intermediate solution before implementing a rpc based one12:25
sean-k-mooneythis basically relies on the fact that if it takes a few seconds for the leader to change it's fine because it will heal over time12:26
amoralejwhich will likely need more changes12:26
sean-k-mooneyamoralej: i think yes. it's better to make incremental progress12:26
sean-k-mooneyas long as that does not block a more complete solution down the road12:26
sean-k-mooneyi don't really see this as creating any tech debt12:26
amoraleji tested the current approach by running a standalone watcher-api and it worked fine, i mean audits were rebalanced and the new decision-engine picked them up in the next execution12:27
amoralejack, thanks12:27
sean-k-mooneyi think we should proceed with that as an initial step but in the ptg we should discuss what the next step would be 12:27
amoralejfrom reporting pov, may i report this as a bug?12:27
sean-k-mooneyyes in that the functionality is not supported in the wsgi mode and we have deprecated the eventlet standalone version of the api12:28
amoralejscheduling.APISchedulingService() is not executed when running as wsgi?12:28
sean-k-mooneyso its a regression12:28
amoralejyep, agreed12:28
amoralejso, i think that's it about this topic12:28
sean-k-mooneyand as such could be a bug. even if it's a borderline feature.12:28
dviroelamoralej: no it is not, afaik12:28
amoraleji meant the bug would be "scheduling.APISchedulingService() is not executed when running as wsgi" 12:29
amoralejsorry i was not clear :)12:29
amoralejthat can be understood as a bug although borderline feature, as Sean said12:30
dviroeli think that the functionality of migrating audits in the end12:30
dviroelbut indeed, this will be a great topic for  our PTG12:31
sean-k-mooneyi think dviroel meant no it's not called in wsgi mode12:31
dviroelyes ^12:31
amoralejit's in the list12:32
dviroelok, thanks amoralej 12:32
dviroelok, lets move to the next topic12:32
dviroelwe can cover this with more details at the ptg12:32
dviroel#topic what about testing the end-to-end strategy execution as unit tests?12:32
amoralej#link https://review.opendev.org/c/openstack/watcher/+/96278412:33
amoraleji sent that WIP patch to get feedback12:33
dviroelit is a patch that you just pushed, right amoralej ?12:33
amoralejyes12:33
amoraleji realized that at least in some of the strategies, our unit tests are focused on each method12:34
amoralejwhich is fine12:34
amoralejbut, i was wondering if we should also run tests which execute the entire strategy for a predefined metrics and model12:34
amoralejand check the resulting solution12:34
sean-k-mooneyamoralej: so technicaly that would not be a unit test12:35
amoraleji called this end-to-end strategy testing (a better name may exist)12:35
amoralejyes, that's my doubt12:35
sean-k-mooneybut it would be a functional test in nova parlance12:35
sean-k-mooneyand i want to build a functional test suite12:35
amoralejit would allow us to test much more complex cases than we do in tempest12:35
sean-k-mooneyso i'm ok with this type of testing but there is more test setup we need to do to do it properly12:35
amoralejtesting with hundreds of computes and vms12:35
sean-k-mooneyyep12:36
amoralejwhat would "properly" mean, from your pov?12:36
amoralejin my patch the coverage scope is restricted to the strategy itself12:36
sean-k-mooneyone of the tenets of doing correct functional testing is you minimize any mocking of the watcher project but use fixtures for external services12:37
amoralejyou mean, simulate prometheus, nova, etc... and run watcher in "real mode" with  no mocks, right?12:37
sean-k-mooneyso the way you do that is you start a watcher api, decision engine and applier in the test using the oslo messaging in-memory message bus and sqlite12:37
sean-k-mooneythen your test actually calls the api with a post to trigger the audit12:38
sean-k-mooneybut you use fixtures to emulate the responses from nova etc.12:38
sean-k-mooneyso yes12:38
sean-k-mooneyi think you have a middle ground12:38
amoralejgot it, what i proposed is more than unit tests but less than functional tests ...12:38
amoralejexactly12:39
sean-k-mooneyso what i would suggest for now12:39
sean-k-mooneyis we add watcher.test.unit.scenario12:39
sean-k-mooneybut eventually i would like to have watcher.test.functional.*12:39
sean-k-mooneywhich will do even less mocking12:39
amoraleji like the idea of moving these intermediate tests to their own folder and classes12:40
amoralejmakes sense12:40
amoraleji will12:40
jgilaberthere are similar tests to what amoralej proposed in the zone_migration https://github.com/openstack/watcher/blob/03073a1b0d8dacfc49b2d220a1120be381d831d1/watcher/tests/decision_engine/strategy/strategies/test_zone_migration.py#L67812:41
jgilaberI don't know if also for other strategies12:41
amoraleji checked the workload_balancing and i think some others, but yeah, not all12:42
sean-k-mooneyso as a general rule or guideline12:42
amoraleji think it may deserve a full check on all the strategies, at least the non-experimental ones12:43
sean-k-mooneyunit tests should test one thing, they should mock any calls that are to functions in a different module and any functions in the current module that have side effects12:43
sean-k-mooneythat does not mean they have to test exactly one function but they should be small and targeted12:43
sean-k-mooneyhaving unit.scenario tests in their own folder12:44
sean-k-mooneymakes it clear that they are not following the normal pattern12:44
sean-k-mooneyand makes maintaining/reviewing them simpler12:44
sean-k-mooneyso if we want to add more scenario tests i'm fine with that12:44
jgilaber+1 to moving these existing tests to dedicated folder12:44
amoralej+112:44
sean-k-mooneybut we should still write the simple unit tests to test the relevant functions on their own too12:45
amoraleji also created a way to define the metrics we want to get for each host and instance, instead of having them hardcoded as we have today12:45
sean-k-mooneydviroel: any input on ^12:45
amoralejbased on the uuid12:45
amoralejit's not very elegant, tbh12:46
amoralejbut i'd like to get your feedback on that too12:46
sean-k-mooneyamoralej: cool that could become the basis for a test fixture in the future12:46
dviroeli am not sure if .scenario or e2e are the best names in this situation, but I also don't have any other idea12:46
amoralejhttps://review.opendev.org/c/openstack/watcher/+/962784/1/watcher/tests/decision_engine/model/gnocchi_metrics.py12:46
dviroelbut I am +1 on moving to a different directory12:47
amoralej"ComputeNode hostname="hostname_1" uuid="Node_1_CPU_5_RAM_46" :)12:47
amoraleji couldn't find a better way to add arbitrary metadata into the model12:47
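To make the uuid-as-metadata idea concrete, here is a small sketch of how a value such as "Node_1_CPU_5_RAM_46" could be decoded into metric values. It is only a guess at the convention, assuming a two-token identity prefix followed by alternating NAME_value pairs; the function name `parse_metrics_from_uuid` is hypothetical and the code in the WIP patch may work differently.

```python
def parse_metrics_from_uuid(uuid):
    """Decode metric values embedded in a test uuid.

    Assumed layout: "<kind>_<index>_<NAME>_<value>[_<NAME>_<value>...]",
    for example "Node_1_CPU_5_RAM_46".
    """
    parts = uuid.split("_")
    # The first two tokens identify the element (e.g. "Node", "1").
    pairs = parts[2:]
    if len(pairs) % 2 != 0:
        raise ValueError("expected NAME_value pairs in %r" % uuid)
    return {pairs[i]: float(pairs[i + 1]) for i in range(0, len(pairs), 2)}


print(parse_metrics_from_uuid("Node_1_CPU_5_RAM_46"))
```

The trade-off of this encoding is that the model XML and its metrics stay in one place, at the cost of overloading the uuid field with test data.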
sean-k-mooneyoh i thought you meant it was done via a map lookup12:48
sean-k-mooneyi was thinking more that in the test you would do something like metrics_fixture.register_metrics(uuid, {...})12:48
amoralejcurrently, there are hardcoded values for specific compute names and instances in https://review.opendev.org/c/openstack/watcher/+/962784/1/watcher/tests/decision_engine/model/gnocchi_metrics.py12:48
sean-k-mooneyand then when you used the metrics client it would return those metrics etc.12:49
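The map-lookup fixture suggested here could look something like the sketch below. Nothing like this exists in watcher as far as this log shows: the class name `MetricsFixture`, the `get_metric` accessor and the metric names are all hypothetical. A real version would probably sit behind whatever datasource client the strategies call.

```python
class MetricsFixture:
    """Hypothetical in-memory store of per-resource metrics for tests."""

    def __init__(self):
        self._metrics = {}

    def register_metrics(self, uuid, metrics):
        """Register (or update) the metrics returned for a resource uuid."""
        self._metrics.setdefault(uuid, {}).update(metrics)

    def get_metric(self, uuid, name, default=None):
        """Return a registered metric value, or default if none was set."""
        return self._metrics.get(uuid, {}).get(name, default)


metrics_fixture = MetricsFixture()
metrics_fixture.register_metrics("node-1", {"cpu_util": 5.0, "ram_util": 46.0})
print(metrics_fixture.get_metric("node-1", "cpu_util"))
```

Compared with encoding values in the uuid, this keeps test data explicit in each test at the cost of defining the model and its metrics in two places.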
amoraleji can look for a better way, i can think of something like that12:49
sean-k-mooneyso we don't need to design this now12:49
sean-k-mooneybefore we move on12:49
sean-k-mooneyone thing i wanted to do to make room for experiments and improvements12:50
dviroelwe may want to have a topic to discuss more about our unit tests and functional tests at the PTG - in case someone wants to take it12:50
amoralejbut i liked the idea of having both the metrics and model defined together in that xml12:50
amoraleji will add it dviroel 12:50
sean-k-mooneyis move all the tests from watcher/tests/ to watcher/tests/unit/12:50
amoralejbut anyone can take it12:50
dviroelamoralej: nice, thanks12:50
sean-k-mooneybecause i want to add watcher/tests/functional and watcher/tests/watcher_fixtures/12:50
sean-k-mooneywe can discuss that more in the ptg too but does ^ sound ok to folks12:51
dviroelmakes sense12:51
jgilaber+112:51
amoralejsounds good12:51
sean-k-mooneycool anything more on this topic for today?12:52
dviroelok, let's move on, and add more feedback in the patch12:52
amoralejnot from my side12:52
dviroelack12:52
dviroeltks amoralej 12:52
dviroel#Reviews12:52
dviroelthere is nothing new there12:53
dviroelI just added a reminder to our 2025.2 open backports12:53
dviroel#link https://review.opendev.org/q/project:openstack/watcher+branch:stable/2025.2+is:open12:53
sean-k-mooneyi guess one minor update12:53
dviroelyes12:53
sean-k-mooneyi approved the watcher-spec patch to create the 2026.1 folder yesterday12:54
sean-k-mooneyso if we need to create them that's not possible12:54
sean-k-mooney*now12:54
sean-k-mooneythanks dviroel for proposing that12:54
dviroelnp, we also agreed at some point that, for the next release, we would split that change12:55
dviroeland create the 2026.2 earlier12:55
dviroelso we can decide to move specs to next release earlier too12:55
sean-k-mooneyright we should create the new folder at m2 and do the approved -> implemented move at m312:55
dviroel+!12:56
dviroel+112:56
dviroelany other review that someone wants to bring to this meeting?12:56
dviroelthere are some open changes in watcher-tempest-plugin12:56
dviroelbut all under review I think12:56
sean-k-mooneyjgilaber: i think you had 3?12:56
dviroel#link https://review.opendev.org/q/project:openstack/watcher-tempest-plugin+status:open12:57
jgilaberI have 2 currently in the tempest plugin12:57
dviroelI have been working on some refactoring too:12:57
jgilaberone was merged12:57
dviroel#link https://review.opendev.org/q/topic:%22organize_tests%22 12:57
sean-k-mooneyand there is an nfs patch too?12:57
jgilabernot upstream12:58
sean-k-mooneyah. so going to dviroel's series12:58
dviroeltks12:58
sean-k-mooneyso the first is just reorganising where tests are located to group them more logically 12:59
dviroelwe only have 1 minute12:59
amoraleji already commented in https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/960310, i think that refactor is good12:59
amoralejthanks for taking care of it12:59
dviroelyes, there are duplicated tests12:59
dviroelthe ones in test_execute_strategies 12:59
sean-k-mooneyya i see you are also changing the decorators.idempotent_id13:00
sean-k-mooneyso normally we don't change that if we are renaming a test13:00
dviroelsince there are duplicated tests, I was trying to keep one of them13:00
dviroelso it is possible that it is moving the test and the id 13:01
sean-k-mooneyyep that is fine but we should keep one of the two idempotent_id13:01
dviroelbut requires some review on that too yes13:01
dviroelto make sure that is correct13:02
sean-k-mooneyon the downstream side those IDs are what is used to track things in polarion13:02
sean-k-mooneyso that when we go from version to version even if the test is renamed we have continuity of the test cases13:02
morenodtry to keep the latest tests, they are the ones being executed and tracked in polarion13:02
dviroelgood point, i will double check that13:03
dviroelthanks for raising this 13:03
opendevreviewJoan Gilabert proposed openstack/watcher-tempest-plugin master: Test zone migration volume and compute migrations  https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/96270213:03
dviroelok, we are over time13:03
dviroelplease add your questions and comments in the patch, I will get to them asap13:03
dviroelanything else?13:04
dviroelok, we will meet again next week13:04
dviroelthank you all for participating13:05
dviroel#endmeeting13:05
opendevmeetMeeting ended Thu Oct  2 13:05:06 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)13:05
opendevmeetMinutes:        https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-10-02-12.01.html13:05
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-10-02-12.01.txt13:05
opendevmeetLog:            https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-10-02-12.01.log.html13:05
morenodthanks dviroel 13:05
opendevreviewTakashi Kajinami proposed openstack/watcher master: Migrate bandit options to pyproject.toml  https://review.opendev.org/c/openstack/watcher/+/96283015:05
opendevreviewDouglas Viroel proposed openstack/watcher-tempest-plugin master: Organize strategy tests and remove duplicated tests  https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/96176218:09
opendevreviewDouglas Viroel proposed openstack/watcher-tempest-plugin master: Refactor execute_strategy method into smaller helpers  https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/96031018:09
