| opendevreview | Joan Gilabert proposed openstack/watcher master: Handle optional pool fields in Cinder notification https://review.opendev.org/c/openstack/watcher/+/961667 | 08:26 |
|---|---|---|
| chandankumar | sean-k-mooney: thank you for sharing the detailed improvement plan https://gist.github.com/SeanMooney/8746be1c29bbb78a63ef88f0104ea54f , it will help a lot! | 09:32 |
| sean-k-mooney | chandankumar: i also created a high level spec to cover the over all usecase and motivation https://review.opendev.org/c/openstack/watcher-specs/+/963438 | 10:58 |
| sean-k-mooney | https://gist.github.com/SeanMooney/8746be1c29bbb78a63ef88f0104ea54f is the high level plane but the detailed low level impeleation plan i created is https://gist.github.com/SeanMooney/f56c7fd6f55ac48958a5c549e1701b6c | 10:59 |
| dviroel | hi all o/, watcher meeting will start in 5 | 11:55 |
| dviroel | #startmeeting watcher | 12:01 |
| opendevmeet | Meeting started Thu Oct 9 12:01:02 2025 UTC and is due to finish in 60 minutes. The chair is dviroel. Information about MeetBot at http://wiki.debian.org/MeetBot. | 12:01 |
| opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 12:01 |
| opendevmeet | The meeting name has been set to 'watcher' | 12:01 |
| dviroel | hi all, lets see who is around today | 12:01 |
| jgilaber_ | o/ | 12:01 |
| *** jgilaber_ is now known as jgilaber | 12:01 | |
| chandankumar | o/ | 12:01 |
| amoralej | o/ | 12:01 |
| dviroel | courtesy ping: sean-k-mooney morenod rlandy | 12:02 |
| dviroel | let's start with today's meeting agenda | 12:02 |
| dviroel | #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L27 (Meeting agenda) | 12:02 |
| dviroel | feel free to add your own topics to the agenda | 12:02 |
| sean-k-mooney | o/ | 12:02 |
| dviroel | there are topics to place your changes that requires attention from reviewers | 12:03 |
| dviroel | lets start with | 12:03 |
| dviroel | #topic Announcements | 12:03 |
| dviroel | PTG Schedule | 12:03 |
| dviroel | #link https://ptg.opendev.org/ptg.html | 12:03 |
| dviroel | based on our decisions from last meeting | 12:04 |
| dviroel | i booked Tue, Wed and Thu for watcher | 12:04 |
| dviroel | 13 - 15 UTC | 12:04 |
| dviroel | one week before, I plan to organise our topics | 12:04 |
| dviroel | if you have any topic that you want to discuss, please add it to our PTG etherpad | 12:05 |
| dviroel | #link https://etherpad.opendev.org/p/watcher-2026.1-ptg | 12:05 |
| sean-k-mooney | 13 - 15 UTC being an incluse range so we have 3 hours each day | 12:05 |
| dviroel | correct | 12:05 |
| sean-k-mooney | there are sever topic i think can be merged togther into sessions | 12:07 |
| sean-k-mooney | i.e. many of the curernt topic have similar theme so we can group the related topics toghther | 12:08 |
| dviroel | agreed | 12:08 |
| amoralej | +1 | 12:08 |
| dviroel | I plan to group similar theme | 12:08 |
| dviroel | and also merge some of the topics that I plan to bring | 12:09 |
| sean-k-mooney | we should invite the telemetry team to https://etherpad.opendev.org/p/watcher-2026.1-ptg#L35 : Future of datasouces backends and untested integrations | 12:09 |
| amoralej | I added one about "Improving testing coverage for strategies", that should be in the CI testing/coverage block i think | 12:09 |
| sean-k-mooney | they are nto required but they may have input into the Aetos vs promethose topic | 12:09 |
| dviroel | ack | 12:09 |
| sean-k-mooney | there are at least 3 diffent ci/testing topics | 12:10 |
| sean-k-mooney | and i wote 2 specs/impelation plans with ai over the last few days to help with that as well | 12:10 |
| dviroel | so yeah, for now, we can just drop topics there in the etherpad | 12:11 |
| amoralej | I just moved the one i created | 12:11 |
| dviroel | we can then group them and find a slot for it | 12:11 |
| amoralej | thanks Sean for adding the link to your spec there | 12:11 |
| chandankumar | Do we want to merge watcher-dashboard ci topic under other CI topics? | 12:12 |
| jgilaber | I think the dashboard case might be different enough to merit being a separate topic | 12:12 |
| jgilaber | because of the stack used (django, html, etc) | 12:12 |
| dviroel | yeah, but we can have both in a sequence | 12:13 |
| chandankumar | +1 to keep in sequence | 12:13 |
| jgilaber | +1 | 12:13 |
| amoralej | lgtm | 12:13 |
| dviroel | ack, anything else related to the vPTG? | 12:14 |
| dviroel | any other announcement? | 12:15 |
| dviroel | #topic Moving decision-engine monitoring to the decision-engine service | 12:15 |
| dviroel | hey amoralej, seems that this one is yours | 12:15 |
| amoralej | yes | 12:15 |
| dviroel | #link https://review.opendev.org/q/topic:%22bug-2126767%22 | 12:15 |
| amoralej | #link https://review.opendev.org/q/topic:%22bug-2126767%22 | 12:15 |
| dviroel | :) | 12:15 |
| amoralej | damn :) | 12:16 |
| amoralej | so, i sent a couple of patches related to moving the decision engine monitoring to the decision-engine service | 12:16 |
| amoralej | I have several questions | 12:16 |
| amoralej | I moved it from the watcher-api to the decision-engine inconditionally | 12:17 |
| amoralej | i mean, no deprecation, no config parameter, etc... | 12:17 |
| amoralej | i think this is transparent for the users | 12:17 |
| amoralej | and even to the admin | 12:17 |
| amoralej | so i don't see the need to do something more complex than adding a release note | 12:17 |
| amoralej | wdyt ? | 12:17 |
| dviroel | in the end you are really fixing a bug | 12:18 |
| amoralej | an user running watcher-api as a wsgi will get a bug fixed when updates, an user running eventlet based watcher-api will keep same functionality but in a different service | 12:19 |
| jgilaber | iiuc the monitor service is currently not running by default right? | 12:19 |
| amoralej | yes, that's also my opinion | 12:19 |
| amoralej | it is running when running watcher-api as a standalone service only | 12:19 |
| amoralej | not as wsgi server | 12:19 |
| jgilaber | we don't have any env like that in ci right? | 12:20 |
| dviroel | the standalone? i don't think so | 12:20 |
| sean-k-mooney | sorry had to leave for a phone call | 12:21 |
| sean-k-mooney | back | 12:21 |
| amoralej | np | 12:21 |
| amoralej | right, devstack runs it with uwsgi iiuc | 12:21 |
| * dviroel look at ci failure | 12:21 | |
| * dviroel not related | 12:22 | |
| jgilaber | it doesn't look like we lose any functionality to me | 12:22 |
| sean-k-mooney | well the watcher-api evently proicess is due to be delete perhaps even this cycle althoug definlty in 2026.2 | 12:23 |
| sean-k-mooney | so i think this is not really a regression. however we coudl allow it to work form either location | 12:23 |
| amoralej | yeah, but then we'd have to add parameters, etc... dunno if it's worthy tbh | 12:23 |
| amoralej | that was my doubt | 12:24 |
| sean-k-mooney | ack | 12:24 |
| amoralej | if we would allow to run in either or both, we'd need to include the api also in the leader election process etc.. | 12:24 |
| sean-k-mooney | we may or may not want to have a config opiont | 12:25 |
| dviroel | to keep the current behavior, you would need to add a config to enable/disable the functionality in dec-eng, with the later disabled by default | 12:25 |
| sean-k-mooney | *option | 12:25 |
| sean-k-mooney | right so we likely would want to have the config option in either case | 12:25 |
| amoralej | yes, that'd be like the way to keep current behavior by default and adding the new one | 12:25 |
| sean-k-mooney | we can have 1 that is shared between both services | 12:26 |
| amoralej | but watcher-api is not registeres as a service in watcher | 12:26 |
| amoralej | i mean, for the leader election | 12:26 |
| sean-k-mooney | ah that is a good point | 12:26 |
| sean-k-mooney | the implemtin woudl not be identical althogu i guess it does not have to be | 12:26 |
| sean-k-mooney | we could copy rather then move the currnt code and deprecate teh nova-api version | 12:27 |
| amoralej | the question is if it's worthy | 12:27 |
| sean-k-mooney | again im not sure its worht the effort either | 12:27 |
| amoralej | yes, that'd be an option sean-k-mooney, but again if an user runs it in both sides, may have unexpected | 12:27 |
| sean-k-mooney | so i think we can just deprecate teh nova-api version and docuemtn it and remove it | 12:27 |
| amoralej | actually, i realized i added it as a fix in releasenote, and may be also deprecated | 12:28 |
| sean-k-mooney | well this is also not a bug | 12:28 |
| sean-k-mooney | its a feature so it shoud have a bluepritn at a minium | 12:28 |
| amoralej | we discussed in lat minute it could be a bug | 12:28 |
| amoralej | last meeting, i meant | 12:29 |
| amoralej | given that it's regression on moving from eventlet to wsgi | 12:29 |
| sean-k-mooney | it coudl be a whislise bug to minize paper work but its not somethign we woudl consider backporting upstream | 12:29 |
| amoralej | ah, ok, i wasn't thinking in backporting at this point tbh | 12:29 |
| sean-k-mooney | i guess we can move forward with thsi and dicuss this more in the implementaion | 12:30 |
| dviroel | yeah, doesn't seems safe to backport | 12:30 |
| amoralej | let's keep the discussion about backporting as a separate one | 12:30 |
| amoralej | ok | 12:30 |
| amoralej | so next question is, i added the leader election in spearate review, dunnow if you prefer me to squash? | 12:31 |
| amoralej | i found easier to review 1. move 2. add leader election | 12:31 |
| amoralej | but can squash if you prefer | 12:32 |
| jgilaber | +1 to have two patches, they are easier to review and smaller | 12:32 |
| sean-k-mooney | i think either is fien it woueld be more correct to squash but we can just have a single release note for both patches and update it in the second one to note it is now safe to enable if you have more then one desion engine | 12:32 |
| amoralej | i like that | 12:33 |
| dviroel | ok to split if the first one works properly without the second one | 12:33 |
| sean-k-mooney | the first patch shoudl have a cofnig option to opt into it and say this is currently only supproted on one desion engine at a time and you remove that limitation in the second patch | 12:33 |
| amoralej | that's the point, the first one only may be problematic if running multiple dec-eng | 12:33 |
| amoralej | then i will squash :) | 12:34 |
| sean-k-mooney | ok well the other thing is did you look at why watcher-prometheus-integration-threading failed | 12:34 |
| sean-k-mooney | ah it was just 1 test | 12:34 |
| dviroel | yes, just checked | 12:34 |
| amoralej | i think it's unrelated, it's a tempest test failed | 12:34 |
| sean-k-mooney | so likely not relatd | 12:35 |
| amoralej | but yeah, i will check why that on e failed | 12:35 |
| dviroel | with the recent refactoring on tempest | 12:35 |
| jgilaber | that job seems a bit flaky, I've seen it fail in a different patch https://zuul.opendev.org/t/openstack/builds?job_name=watcher-prometheus-integration-threading&project=openstack/watcher | 12:35 |
| sean-k-mooney | we should look at enabling a second desion engine in the job also to test this properly | 12:35 |
| dviroel | i may see some test that sometimes don't generate a Recommended action plan, and will fail | 12:35 |
| amoralej | that's another question sean-k-mooney, lemme go to that at the end :) | 12:35 |
| dviroel | so we would need to improve these tests | 12:36 |
| dviroel | to avoid failures like that | 12:36 |
| amoralej | another question i have, current code only takes care of audit migrations when there is a change in the state of a decision-engine. It saves some api calls and processing but may have corner cases | 12:36 |
| sean-k-mooney | dviroel: ya i have notice at least one test failing semi often because of not actions in the plan | 12:36 |
| amoralej | i.e. decision-engine-0 dies while decision-engine-1 is not running. Then decision-engine-1 is started | 12:36 |
| sean-k-mooney | well the way this shoudl work is 1 the perodic shoudl run on agent start and 2 it shoudl set that it is the leader | 12:37 |
| amoralej | if a decision-engine is detected as failed at start, then it will not look for assigned audits to it | 12:37 |
| sean-k-mooney | so decision-engine-1 on start shoudl rebalance | 12:37 |
| amoralej | not with current implementation | 12:37 |
| sean-k-mooney | right but that need to be fixed in the first patch | 12:38 |
| amoralej | current = the one in watcher-api | 12:38 |
| amoralej | ack, I will, i was unsure that was a bug or a feature :) | 12:38 |
| sean-k-mooney | well i was leaning betwen blueprint and specless blueprint so im definlty in hte feature camp for this over all | 12:38 |
| dviroel | so, maybe another reason to consider a new implementation in dec-eng, and deprecate the api version one | 12:38 |
| sean-k-mooney | im ok with it as a bug because of the eventlet removal btu if we are going to supprot this protperly in the descion engin | 12:39 |
| sean-k-mooney | i dont wnat to claim supprot for this until the know issuee are adressed | 12:39 |
| sean-k-mooney | that does nto mean it all has to happen in one patch | 12:40 |
| sean-k-mooney | but it does mean that the first patch in its current form should not have Closes-Bug: #2126767 | 12:40 |
| sean-k-mooney | it should have partial-bug | 12:40 |
| amoralej | i was thinking in squashing | 12:40 |
| amoralej | would it be fine if i squash both patches + fix the logic in the new implementation for the issue i mentioned? | 12:41 |
| dviroel | hum | 12:42 |
| amoralej | feel free to propose what'd be better | 12:42 |
| dviroel | to use the same bug, would be good to add more details to the LP | 12:43 |
| amoralej | or i can treat the second issue as a separate bug | 12:43 |
| sean-k-mooney | so you coudl fix the issue of not handling the down service in the api version first | 12:44 |
| sean-k-mooney | before the prot/refactor that your are doing | 12:44 |
| amoralej | ah, that'd be another option... right | 12:44 |
| dviroel | split the issues in two LP seems better imho | 12:46 |
| amoralej | report it as a new bug (bug2), 1st review: fix bug2 in watcher-api, 2nd review: move + leader election | 12:46 |
| sean-k-mooney | that sound reasonable to me +1 | 12:46 |
| dviroel | ok, amoralej anything else in this topic? I think that we can get more feedbacks in the reviews | 12:47 |
| amoralej | wrt testing | 12:47 |
| dviroel | sure | 12:47 |
| amoralej | dunno if today or in ptg | 12:47 |
| amoralej | how/if we test high availability upstream | 12:47 |
| amoralej | but, we are short on time, we can discuss that in next meeting or ptg, no problem | 12:48 |
| dviroel | i think that we can discuss at the ptg? it may require more time to discuss | 12:48 |
| amoralej | yeah +1 | 12:48 |
| amoralej | i'm done then, thanks for your feedback! | 12:48 |
| dviroel | thanks amoralej@ | 12:48 |
| dviroel | s/@/! | 12:48 |
| dviroel | we have 10m, so lets quickly cover some open reviews | 12:49 |
| dviroel | #topic Reviews | 12:49 |
| dviroel | i will cover mine first | 12:49 |
| dviroel | 960381: Fail if action plan cannot be execute | https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/960381 | 12:49 |
| dviroel | this is kind of one of the things that i wanted to fix | 12:50 |
| dviroel | test were just retuning sucess if no action plan is generated | 12:50 |
| sean-k-mooney | amoralej: we do not need to test the fail over but i do wnat to see 2 things, 1 we shoudl have 2 dession engine enabled in a job and 2 i want to see a debug log showing that one was elected the leader and the other was not. | 12:50 |
| dviroel | which doesn't actually properly test the outcome of a strategy | 12:50 |
| sean-k-mooney | "Fail if action plan cannot be execute" | 12:50 |
| sean-k-mooney | ya that is an important fix IMO | 12:51 |
| dviroel | so we might see some test that sometimes don't generate an action plan, to fail in CI | 12:51 |
| sean-k-mooney | it may uncover other flaky tests but that a good thing | 12:51 |
| dviroel | like the one in amoralej patches above | 12:51 |
| sean-k-mooney | ya | 12:52 |
| dviroel | which is likely the case | 12:52 |
| jgilaber | +1 | 12:52 |
| dviroel | and then we can work on improving the test, to keep ci more stable | 12:52 |
| dviroel | ok, next one | 12:52 |
| sean-k-mooney | i just reviewd it now and approved it FYI | 12:52 |
| dviroel | one from eventlet removal that was pending at my side | 12:52 |
| dviroel | sean-k-mooney: tks | 12:53 |
| dviroel | 960881: Add ThreadPool statistics debug log | https://review.opendev.org/c/openstack/watcher/+/960881 | 12:53 |
| dviroel | this is a similar approach from nova code, that print threadpool statistics | 12:53 |
| dviroel | which would be helpfull for debug | 12:54 |
| sean-k-mooney | ya i agree that it should eb off by default and have its own config option | 12:54 |
| dviroel | since watcher has couple of thread pools around | 12:54 |
| sean-k-mooney | i think print_thread_pool_stats is ok | 12:55 |
| sean-k-mooney | and i see you enabled it in the zuul jobs | 12:55 |
| sean-k-mooney | ill review teh patch fully and the job output later but i think this looks good overall | 12:55 |
| dviroel | ack, thanks | 12:55 |
| dviroel | thanks jgilaber for the review ther too | 12:55 |
| dviroel | next one | 12:55 |
| dviroel | 955472: Add Continuous audit dummy strategy scenario test | https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/955472 | 12:56 |
| chandankumar | It adds coverage for continuous audit using dummy strategies in scenario test, validating all state of Audit, Updating start/end time/interval with sec and crontab. | 12:56 |
| chandankumar | It will make sure everything is working for continous audit. | 12:56 |
| dviroel | chandankumar: ack, i already start reviewing it | 12:56 |
| chandankumar | dviroel: thank you! | 12:57 |
| dviroel | we are moving from nothing to many test for continuous audits | 12:57 |
| dviroel | :) | 12:57 |
| dviroel | thanks for proposing | 12:57 |
| dviroel | next one | 12:57 |
| dviroel | 958644: Add test for volume migrate with zone migration | https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/958644 | 12:57 |
| jgilaber | this adds a test for zone migration where it migrates volumes | 12:58 |
| dviroel | jgilaber: is proposing some new volume migrate tests | 12:58 |
| dviroel | ++ | 12:58 |
| jgilaber | we already merged the one with retype | 12:58 |
| dviroel | ack, S change, I can add to my list too | 12:58 |
| dviroel | thanks | 12:58 |
| jgilaber | this test assumes that there are at least two cinder pools configured with the same volume_backend_name | 12:59 |
| jgilaber | otherwise it's skipped | 12:59 |
| dviroel | ack, I see that you rebased based on the refactoring ++ | 12:59 |
| jgilaber | I'm looking to add a similar test with multiple backends | 12:59 |
| dviroel | ack | 13:00 |
| dviroel | 960265: Handle missing fields building storage model XML | https://review.opendev.org/c/openstack/watcher/+/960265 | 13:00 |
| dviroel | this is a fix for https://bugs.launchpad.net/watcher/+bug/2121147 | 13:00 |
| jgilaber | this and the other two patches are all related to the storage model | 13:00 |
| dviroel | 961667: Handle optional pool fields in Cinder notification | https://review.opendev.org/c/openstack/watcher/+/961667 | 13:01 |
| jgilaber | it logs some exceptions that are not really errors and can be confusing for the user | 13:01 |
| dviroel | 958781: Add some debug logs to storage model | https://review.opendev.org/c/openstack/watcher/+/958781 | 13:01 |
| jgilaber | last one adds logs to clarify, the other two prevents erros over optional fields | 13:01 |
| dviroel | ack, so it would be good to review all them together | 13:02 |
| sean-k-mooney | jgilaber: ok so you moved it form log.excption which is actully at error level to log.debug | 13:02 |
| jgilaber | I would say they are not super urgent, mainly UX improvements | 13:02 |
| dviroel | ack | 13:02 |
| sean-k-mooney | im not sure https://review.opendev.org/c/openstack/watcher/+/960265 is the write way to fix that but i have not looked deeply enought to provdie a difetn suggestion | 13:03 |
| sean-k-mooney | to me that is just hiding the issue rather then fixing it | 13:03 |
| jgilaber | sean-k-mooney, yes, since it was not really an error ( the storage model is building correctly) | 13:03 |
| dviroel | yeah, I can't provide a feedback now, I will need to spend some time on it, even being XS change :) | 13:03 |
| jgilaber | I think the right fix is to remove those optional fields | 13:04 |
| jgilaber | but this is a small improvement for the logs | 13:04 |
| dviroel | ack, we are over time, going to close the meeting, but feel free to continue the discussion in the channel | 13:05 |
| dviroel | thanks jgilaber for volunteering to chair next meeting | 13:05 |
| sean-k-mooney | jgilaber: i fell like we shoudl do the real fix rather then the logging change | 13:05 |
| dviroel | let's wrap up | 13:05 |
| dviroel | thank you all for participating | 13:05 |
| dviroel | #endmeeting | 13:05 |
| opendevmeet | Meeting ended Thu Oct 9 13:05:57 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 13:05 |
| opendevmeet | Minutes: https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-10-09-12.01.html | 13:05 |
| opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-10-09-12.01.txt | 13:05 |
| opendevmeet | Log: https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-10-09-12.01.log.html | 13:05 |
| chandankumar | dviroel: ++ thank you for running the meeting! | 13:06 |
| sean-k-mooney | https://review.opendev.org/c/openstack/watcher/+/958781 seams harmless enough to me so im +2 on it | 13:06 |
| amoralej | thank you dviroel++ ! | 13:07 |
| jgilaber | sean-k-mooney, that's fair, but I would probably have to tackle it in the future, I'm not sure how complex that might turn out to be | 13:09 |
| jgilaber | especially considering this is low severity bug | 13:09 |
| *** haleyb_ is now known as haleyb | 13:28 | |
| opendevreview | Merged openstack/watcher-tempest-plugin master: Fail if action plan cannot be execute https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/960381 | 13:29 |
| opendevreview | Joan Gilabert proposed openstack/watcher-tempest-plugin master: Test zone migration volume and compute migrations https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/962702 | 13:32 |
| opendevreview | Joan Gilabert proposed openstack/watcher-tempest-plugin master: Add extra checks to zone migration retype test https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/963559 | 13:32 |
| opendevreview | chandan kumar proposed openstack/watcher-tempest-plugin master: Added tempest API tests for continous audit https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/956004 | 14:17 |
| opendevreview | chandan kumar proposed openstack/watcher-tempest-plugin master: Added tempest API tests for continous audit https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/956004 | 14:22 |
| opendevreview | Merged openstack/watcher master: Add some debug logs to storage model https://review.opendev.org/c/openstack/watcher/+/958781 | 14:23 |
| opendevreview | Douglas Viroel proposed openstack/watcher-tempest-plugin master: Add tests for extended compute datamodel https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/956116 | 19:35 |
| opendevreview | Douglas Viroel proposed openstack/watcher master: Enable data_model tests in jobs and update tempest configs https://review.opendev.org/c/openstack/watcher/+/958678 | 19:36 |
| *** haleyb is now known as haleyb|out | 23:22 | |
Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!