12:00:51 <jgilaber> #startmeeting watcher
12:00:51 <opendevmeet> Meeting started Thu Oct 16 12:00:51 2025 UTC and is due to finish in 60 minutes. The chair is jgilaber. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:00:51 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
12:00:51 <opendevmeet> The meeting name has been set to 'watcher'
12:01:04 <jgilaber> hi all, let's see who is around today
12:01:18 <amoralej> o/
12:02:00 <chandankumar> o/
12:02:04 <rlandy> o/
12:02:20 <jgilaber> courtesy ping dviroel sean-k-mooney morenod
12:02:40 <jgilaber> let's start with today's meeting agenda
12:02:48 <jgilaber> #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L23
12:03:19 <jgilaber> feel free to add more topics to the agenda
12:03:28 <jgilaber> let's start with the first topic
12:03:38 <jgilaber> #topic watcher rally-jobs
12:03:51 <chandankumar> I have added this topic.
12:04:11 <jgilaber> chandankumar, feel free to cover it
12:04:21 <chandankumar> I was going through the rally-jobs directory in the watcher repo and searching around openstack codesearch, and here is what I found
12:04:44 <jgilaber> #link https://github.com/openstack/watcher/blob/master/rally-jobs/watcher-watcher.yaml
12:04:55 <chandankumar> In the rally-openstack project, we have rally jobs defined for almost all projects: https://github.com/openstack/rally-openstack/tree/master/rally-jobs
12:05:22 <jgilaber> #link https://github.com/openstack/rally-openstack/tree/master/rally-jobs
12:05:34 <chandankumar> I compared the rally-openstack watcher rally content with https://github.com/openstack/watcher/blob/master/rally-jobs/watcher-watcher.yaml
12:05:39 <chandankumar> it was kind of similar
12:06:10 <chandankumar> the rally-openstack watcher job runs against rally-openstack (which receives few CRs) in non-voting mode
12:06:21 <chandankumar> job definition: https://github.com/openstack/rally-openstack/blob/master/.zuul.d/rally-task-watcher.yaml
12:06:31 <chandankumar> and job results: https://zuul.openstack.org/builds?job_name=rally-task-watcher&skip=0
12:06:34 <jgilaber> #link https://zuul.openstack.org/builds?job_name=rally-task-watcher&skip=0
12:06:42 <chandankumar> logs of the last run: https://7286c318f284276b918d-33403e99d7f0c0f7ca362b53c8ca1faf.ssl.cf2.rackcdn.com/openstack/bb656b811fba45e48d88edc1eee4659b/results/report.html
12:07:02 <chandankumar> it only creates and deletes audits and audit templates to test CRUD operations.
12:07:33 <chandankumar> I was also checking other projects. For neutron, they have two rally jobs, one for ovn and one for ovs, using the neutron repo rally directory
12:07:56 <chandankumar> https://github.com/openstack/neutron/blob/master/zuul.d/rally.yaml and similar for cinder https://github.com/openstack/cinder/blob/master/.zuul.yaml#L164
12:08:18 <chandankumar> My question here is: do we want to add a new rally job for watcher following the cinder/neutron model and keep it running?
12:08:38 <chandankumar> the new job would use the rally files from the watcher repo
12:09:04 <jgilaber> I have no experience with rally, what would that job do/test?
12:09:30 <chandankumar> Based on the test results, I can see it is creating/deleting audit templates
12:09:41 <amoralej> https://github.com/openstack/rally the idea is to run performance tests
12:10:02 <amoralej> you can define an action and ask rally to run it a number of times concurrently
12:10:09 <amoralej> or a set of actions
12:10:21 <amoralej> defining parallelism etc...
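
For reference, the shape amoralej describes is roughly what rally-jobs/watcher-watcher.yaml encodes: a scenario plus a runner section that sets how many iterations run and with what concurrency. A minimal sketch, assuming the Watcher.create_audit_template_and_delete scenario shipped by rally-openstack (the exact scenario and argument names should be checked against the linked files):

  ---
    Watcher.create_audit_template_and_delete:
      -
        args:
          # goal/strategy pair used to build the audit template;
          # placeholder values, any installed pair would do
          goal:
            name: "dummy"
          strategy:
            name: "dummy"
        runner:
          # 10 iterations, 2 of them running in parallel
          type: "constant"
          times: 10
          concurrency: 2
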
12:11:05 <amoralej> in this case, we would be testing mostly api scalability, i.e. whether it's able to create X audits, list them, etc... and in how much time
12:11:13 <jgilaber> is this using a real deployment with devstack?
12:11:33 <chandankumar> different types of reports: https://7286c318f284276b918d-33403e99d7f0c0f7ca362b53c8ca1faf.ssl.cf2.rackcdn.com/openstack/bb656b811fba45e48d88edc1eee4659b/
12:12:04 <amoralej> yes, it's devstack
12:12:11 <amoralej> https://7286c318f284276b918d-33403e99d7f0c0f7ca362b53c8ca1faf.ssl.cf2.rackcdn.com/openstack/bb656b811fba45e48d88edc1eee4659b/controller/logs/
12:12:21 <chandankumar> jgilaber: yes, it is using devstack with a real deployment, job definition https://github.com/openstack/rally-openstack/blob/master/.zuul.d/rally-task-watcher.yaml
12:12:27 <jgilaber> ack thanks
12:13:57 <amoralej> @chandankumar, can you define a max expected time to run a test or something like that?
12:14:04 <amoralej> what's considered a test failure?
12:14:06 <rlandy> how likely are these tests to find a real problem with what they can actually run?
12:14:45 <rlandy> perhaps the value add here is unclear
12:16:26 <chandankumar> amoralej: those are good questions, I do not have an answer yet, I still need to explore it.
12:16:30 <chandankumar> based on the doc https://docs.openstack.org/rally/latest/quick_start/tutorial/step_4_adding_success_criteria_for_benchmarks.html
12:16:49 <amoralej> i think it can be useful to find potential scalability issues, but i also see some limitations for the watcher case
12:17:03 <jgilaber> #link https://docs.openstack.org/rally/latest/quick_start/tutorial/step_4_adding_success_criteria_for_benchmarks.html
12:17:15 <morenod> I think that rally generates value on modules where every api call generates tasks which consume resources or create openstack objects. in our case, it will be useful if we can stress audits or actions
12:17:29 <chandankumar> we can set SLA criteria on each task
12:17:49 <amoralej> good, based on max run time and failure rate
12:17:59 <jgilaber> the balancing strategies might be the most interesting to test, I think
12:18:36 <jgilaber> they are the most computationally expensive vs others like zone migration or host maintenance, where nova/cinder do most of the work
12:18:45 <morenod> but rally is not going to check if the strategy has been applied, it is going to validate that the api call creating it answers correctly, right?
12:19:05 <chandankumar> morenod: that I need to explore.
12:19:12 <amoralej> i think we could define what specific aspects we want to test
12:19:26 <chandankumar> I was just checking where things are defined and used currently.
12:19:26 <jgilaber> hmm that's a good point morenod, the strategy would not make much of a difference then
12:19:30 <amoralej> and see where rally can help, and where we need something else
12:20:45 <amoralej> also, some tests will depend more on the environment than on the number of api call runs, etc...
12:21:01 <chandankumar> Maybe I will try with one strategy end to end and see what happens via rally
12:21:09 <chandankumar> that may give some data
12:21:13 <amoralej> what would happen if we run workload_stabilization on 100s of hosts and 1000s of vms?
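
The success criteria discussed here (max run time and failure rate, per the linked rally tutorial) are declared per scenario entry, at the same level as the runner section of the sketch above. A sketch with placeholder thresholds, not taken from any existing job:

        sla:
          # any failed iteration fails the whole task
          failure_rate:
            max: 0
          # cap the time a single iteration may take
          max_seconds_per_iteration: 60
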
12:21:33 <amoralej> we don't need to run many api calls, but one in a big environment
12:22:02 <amoralej> although rally may also help there to define a max time
12:22:49 <jgilaber> ack, thanks chandankumar for looking into this, I think we can have a more detailed discussion in the future, once we know more
12:23:04 <chandankumar> yup
12:23:55 <jgilaber> if there are no further comments we can move on to the next topic, any last minute requests for reviews?
12:24:00 <chandankumar> for right now, I will look into the workload_stabilization case with rally
12:24:31 <chandankumar> jgilaber: sounds good
12:25:07 <jgilaber> ok, no reviews this week, I added a few bugs for triage
12:25:11 <jgilaber> #topic Bug triage
12:25:25 <jgilaber> first one is https://bugs.launchpad.net/watcher/+bug/2121807
12:25:29 <jgilaber> #link https://bugs.launchpad.net/watcher/+bug/2121807
12:26:42 <jgilaber> according to the report the bug is new in version 14.0
12:27:14 <jgilaber> which is based on epoxy https://releases.openstack.org/teams/watcher.html#team-epoxy-watcher
12:28:08 <amoralej> we should recommend using wsgi for the watcher api
12:28:22 <amoralej> i think it's using the eventlet standalone server
12:28:24 <amoralej> right?
12:28:55 <jgilaber> I think you're right
12:28:57 <jgilaber> command: [
12:28:57 <jgilaber> "watchmedo", "auto-restart", "--directory=/app", "--pattern=*.py", "--recursive", "--", # Dev only
12:28:57 <jgilaber> "bash", "-c",
12:28:57 <jgilaber> "/usr/local/bin/watcher-api --config-file /etc/watcher/watcher.conf > /app/logs/app.log 2>&1"
12:28:58 <jgilaber> ]
12:29:18 <jgilaber> this is the command for the watcher-api container in the docker compose attached to the report
12:29:41 <jgilaber> did we deprecate that usage?
12:29:55 <amoralej> i'd say so
12:30:04 <amoralej> as part of the eventlet changes
12:31:19 <jgilaber> I'm trying to find something in the docs to that effect
12:32:35 <jgilaber> found this release note but I'm not sure it's the same https://docs.openstack.org/releasenotes/watcher/2025.1.html#deprecation-notes
12:33:50 <amoralej> that's different i think
12:35:40 <jgilaber> the installation guide covers only installing using packages https://docs.openstack.org/watcher/latest/install/install-ubuntu.html
12:37:03 <amoralej> i think there was a general recommendation about running api services as wsgi services
12:37:12 <amoralej> i'm not sure if we document it properly, tbh
12:37:48 <jgilaber> from the container logs it looks like it's running on python 3.13, which I don't think we have tested
12:37:49 <jgilaber> 2025-09-01 14:11:02.093 8 DEBUG watcher.common.service [-] ******************************************************************************** log_opt_values /usr/local/lib/python3.13/site-packages/oslo_config/cfg.py:2804
12:38:36 <amoralej> yes, good point too
12:39:11 <jgilaber> for now I think we can ask them to run with python 3.12 and mark the bug as need info?
12:39:30 <jgilaber> incomplete, actually
12:42:06 <amoralej> i'd include the wsgi recommendation
12:42:14 <amoralej> yes, and move it to incomplete
12:43:51 <jgilaber> amoralej, can I ask you to add a comment with the wsgi recommendation?
12:44:01 <amoralej> sure
12:44:06 <amoralej> i will do it after the mtg
12:44:08 <jgilaber> thanks!
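
As background for that wsgi recommendation, a sketch of how the reporter's compose file could serve the API through a WSGI server instead of the standalone watcher-api command; the watcher-api-wsgi entry point name and the uwsgi options are assumptions to verify against the installed watcher release (9322 is watcher's default API port):

  services:
    watcher-api:
      # hypothetical replacement for the command pasted above: run the API
      # under uwsgi rather than the deprecated standalone eventlet server
      command: >
        uwsgi --http 0.0.0.0:9322
              --wsgi-file /usr/local/bin/watcher-api-wsgi
              --processes 2 --threads 4
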
12:44:21 <jgilaber> we can move to the second bug https://bugs.launchpad.net/watcher/+bug/2127777
12:44:23 <jgilaber> #link https://bugs.launchpad.net/watcher/+bug/2127777
12:44:55 <jgilaber> amoralej, this was filed after last week's meeting discussion right?
12:45:22 <amoralej> yes, actually i already sent a patch for it
12:45:56 <jgilaber> so I don't think we need to discuss much here, just set the importance to high or critical since it's already in progress
12:46:11 <amoralej> i've just assigned it to me and set it as medium
12:46:15 <amoralej> but i can raise it
12:46:47 <jgilaber> unless others object I'm ok with that
12:46:53 <chandankumar> Can we also add the 2026.1 series as a target?
12:47:31 <amoralej> done
12:47:39 <chandankumar> thanks!
12:48:08 <jgilaber> ack, thanks, looks like we are done, on to the next one https://bugs.launchpad.net/watcher/+bug/2127485
12:48:11 <jgilaber> #link https://bugs.launchpad.net/watcher/+bug/2127485
12:49:00 <jgilaber> I opened this one because the cinder client method for migrate does an incomplete check when trying to determine if it can migrate a volume
12:49:39 <jgilaber> it expects that the volume's type is configured with the same volume_backend_name as the destination pool
12:49:44 <jgilaber> which is not required
12:50:13 <jgilaber> I already created a patch for it, it's missing an importance, which I think would be medium
12:51:36 <jgilaber> any comments/objections?
12:52:37 <jgilaber> I'll take silence as a yes :) and set the importance
12:53:33 <jgilaber> so with that we've reached the end of the agenda, any last minute topics? otherwise we just need a volunteer for next week's meeting
12:54:33 <amoralej> i can take it
12:54:34 <morenod> I will
12:54:39 <amoralej> morenod wins :)
12:54:39 <morenod> all yours :P
12:54:48 <morenod> photo finish?
12:54:56 <jgilaber> I'll let you fight for it ;)
12:55:04 <chandankumar> one more thing, during ptg week, do we want to cancel the weekly meeting?
12:55:12 <rlandy> ack - was going to ask that
12:55:13 <amoralej> morenod deserves it :)
12:55:19 <amoralej> i'd cancel
12:55:23 <rlandy> ie: the one after next week?
12:55:25 <jgilaber> I think so, that is in two weeks time right?
12:55:29 <chandankumar> yes
12:55:33 <rlandy> there is one next week
12:55:45 <rlandy> the one in two weeks would be ptg
12:55:47 <rlandy> correct
12:56:09 <rlandy> +1 to cancel
12:56:11 <jgilaber> so we meet as usual next week, we cancel the next one for the ptg, and it looks like there is consensus
12:56:56 <amoralej> yep
12:57:48 <jgilaber> ok so that's all for today, thanks for participating!
12:58:03 <jgilaber> #endmeeting