12:00:51 <jgilaber> #startmeeting watcher
12:00:51 <opendevmeet> Meeting started Thu Oct 16 12:00:51 2025 UTC and is due to finish in 60 minutes.  The chair is jgilaber. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:00:51 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
12:00:51 <opendevmeet> The meeting name has been set to 'watcher'
12:01:04 <jgilaber> hi all, let's see who is around today
12:01:18 <amoralej> o/
12:02:00 <chandankumar> o/
12:02:04 <rlandy> o/
12:02:20 <jgilaber> courtesy ping dviroel sean-k-mooney morenod
12:02:40 <jgilaber> let's start with today's meeting agenda
12:02:48 <jgilaber> #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L23
12:03:19 <jgilaber> feel free to add more topics to the agenda
12:03:28 <jgilaber> let's start with the first topic
12:03:38 <jgilaber> #topic watcher rally-jobs
12:03:51 <chandankumar> I have added this topic.
12:04:11 <jgilaber> chandankumar, feel free to cover it
12:04:21 <chandankumar> I was going through rally-jobs directory in watcher repo and searching around openstack codesearch and here is what i found
12:04:44 <jgilaber> #link https://github.com/openstack/watcher/blob/master/rally-jobs/watcher-watcher.yaml
12:04:55 <chandankumar> In rally-openstack project, we have rally jobs defined for almost all projects https://github.com/openstack/rally-openstack/tree/master/rally-jobs
12:05:22 <jgilaber> #link https://github.com/openstack/rally-openstack/tree/master/rally-jobs
12:05:34 <chandankumar> I compared the content of rally-openstack watcher rally content with https://github.com/openstack/watcher/blob/master/rally-jobs/watcher-watcher.yaml
12:05:39 <chandankumar> it was kind of similar
12:06:10 <chandankumar> the rally openstack watcher job is running against rally-openstack (receives few crs) in non-voting mode
12:06:21 <chandankumar> job definition: https://github.com/openstack/rally-openstack/blob/master/.zuul.d/rally-task-watcher.yaml
12:06:31 <chandankumar> and job results: https://zuul.openstack.org/builds?job_name=rally-task-watcher&skip=0
12:06:34 <jgilaber> #link https://zuul.openstack.org/builds?job_name=rally-task-watcher&skip=0
12:06:42 <chandankumar> logs of last run: https://7286c318f284276b918d-33403e99d7f0c0f7ca362b53c8ca1faf.ssl.cf2.rackcdn.com/openstack/bb656b811fba45e48d88edc1eee4659b/results/report.html
12:07:02 <chandankumar> it only creates and deletes audit and audit templates to test crud operations.
12:07:33 <chandankumar> I was also checking other projects. for neutron, they have two rally jobs, one for ovn and ovs using neutron repo rally directory
12:07:56 <chandankumar> https://github.com/openstack/neutron/blob/master/zuul.d/rally.yaml and similar for cinder https://github.com/openstack/cinder/blob/master/.zuul.yaml#L164
12:08:18 <chandankumar> My question here is do we want to add a new rally job for watcher following cinder/neutron model and keep them running?
12:08:38 <chandankumar> the new job will use rally files from watcher repo
12:09:04 <jgilaber> I have no experience with rally, what would that job do/test?
12:09:30 <chandankumar> Based on test results, I can see it is creating/deleting audit templates
12:09:41 <amoralej> https://github.com/openstack/rally the idea is to run performance tests
12:10:02 <amoralej> you can define an action and ask rally to run it a number of times concurrently
12:10:09 <amoralej> or set of actions
12:10:21 <amoralej> defining parallelism etc...
12:11:05 <amoralej> in this case, we would be testing mostly api scalability, i.e. if it's able to create X audits, list them, etc... in how much time
12:11:13 <jgilaber> this is using a real deployment with devstack?
12:11:33 <chandankumar> different types of reports: https://7286c318f284276b918d-33403e99d7f0c0f7ca362b53c8ca1faf.ssl.cf2.rackcdn.com/openstack/bb656b811fba45e48d88edc1eee4659b/
12:12:04 <amoralej> yes, it's devstack
12:12:11 <amoralej> https://7286c318f284276b918d-33403e99d7f0c0f7ca362b53c8ca1faf.ssl.cf2.rackcdn.com/openstack/bb656b811fba45e48d88edc1eee4659b/controller/logs/
12:12:21 <chandankumar> jgilaber: yes, it is using devstack with real deployment job definition https://github.com/openstack/rally-openstack/blob/master/.zuul.d/rally-task-watcher.yaml
12:12:27 <jgilaber> ack thanks
12:13:57 <amoralej> @chandankumar, you can define max expected time to run a test or something like that?
12:14:04 <amoralej> what's considered a test failure?
12:14:06 <rlandy> how likely are these tests to find a real problem with what they can actually run?
12:14:45 <rlandy> perhaps the value add here is unclear
12:16:26 <chandankumar> amoralej: those are good questions, I do not have an answer, I still need to explore it.
12:16:30 <chandankumar> based on doc https://docs.openstack.org/rally/latest/quick_start/tutorial/step_4_adding_success_criteria_for_benchmarks.html
12:16:49 <amoralej> i think it can be useful to find potential scalability issues, but i also see some limitations for the watcher case
12:17:03 <jgilaber> #link https://docs.openstack.org/rally/latest/quick_start/tutorial/step_4_adding_success_criteria_for_benchmarks.html
12:17:15 <morenod> I think that rally generates value on modules where every api call generates tasks which consume resources or create openstack objects. in our case, it will be useful if we can stress audits or actions
12:17:29 <chandankumar> we can set SLA service on each task
12:17:49 <amoralej> good, based on run max time and failure rate
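[Editor's sketch] The SLA idea discussed above (a maximum run time per iteration plus a failure-rate limit) could look roughly like the following Rally task, built here in Python and dumped as JSON. The scenario name and the "dummy" goal/strategy follow the rally-openstack Watcher samples as best recalled and should be checked against rally-jobs/watcher-watcher.yaml; every numeric value is an illustrative assumption, not something agreed in the meeting.

```python
import json


def build_task(times=10, concurrency=2, max_seconds=5.0, max_failure_percent=0):
    """Return a Rally task dict with one Watcher workload and two SLA checks."""
    workload = {
        # Assumed arguments: the "dummy" goal and strategy are placeholders.
        "args": {"goal": {"name": "dummy"}, "strategy": {"name": "dummy"}},
        # Run the scenario `times` times, `concurrency` iterations in parallel.
        "runner": {"type": "constant", "times": times, "concurrency": concurrency},
        # The task fails if any iteration is too slow or too many iterations fail.
        "sla": {
            "max_seconds_per_iteration": max_seconds,
            "failure_rate": {"max": max_failure_percent},
        },
    }
    return {"Watcher.create_audit_template_and_delete": [workload]}


if __name__ == "__main__":
    print(json.dumps(build_task(), indent=2))
```

Saved to a file, the JSON output could be passed to `rally task start <file>`; whether the thresholds are meaningful for Watcher is exactly the open question raised in the discussion.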
12:17:59 <jgilaber> the balancing strategies might be the most interesting to test I think
12:18:36 <jgilaber> they are the most computationally expensive vs others like zone migration or host maintenance where nova/cinder do most of the work
12:18:45 <morenod> but rally is not going to check if the strategy has been applied, it is going to validate that the api call of creating it will answer correctly, right?
12:19:05 <chandankumar> morenod: that I need to explore.
12:19:12 <amoralej> i think we could define what specific aspects we want to test
12:19:26 <chandankumar> I was just checking where things are defined and used currently.
12:19:26 <jgilaber> hmm that's a good point morenod, the strategy would not make much of a difference then
12:19:30 <amoralej> and see where rally can help, and where we need something else
12:20:45 <amoralej> also, some tests will depend more on the environment than on the number of api call runs, etc...
12:21:01 <chandankumar> May we I will try with one strategy end to end and see what happens via rally
12:21:09 <chandankumar> that may give some data
12:21:13 <amoralej> what would happen if we run a workload_stabilization on 100s of hosts and 1000s of vms?
12:21:17 <chandankumar> s/we/be
12:21:33 <amoralej> we don't need to run many api calls, but one in a big environment
12:22:02 <amoralej> although rally may also help there to define max time
12:22:49 <jgilaber> ack thanks chandankumar for looking into this, I think we can have a more detailed discussion in the future, once we know more
12:23:04 <chandankumar> yup
12:23:55 <jgilaber> if there are no further comments we can move on to the next topic, any last minute request for reviews?
12:24:00 <chandankumar> for right now, I will look into workload_stabilization case with rally case
12:24:11 <chandankumar> *rally
12:24:31 <chandankumar> jgilaber: sounds good
12:25:07 <jgilaber> ok, no reviews this week I added a few bugs for triage
12:25:11 <jgilaber> #topic Bug triage
12:25:25 <jgilaber> first one is https://bugs.launchpad.net/watcher/+bug/2121807
12:25:29 <jgilaber> #link https://bugs.launchpad.net/watcher/+bug/2121807
12:26:42 <jgilaber> according to the report the bug is new in version 14.0
12:27:14 <jgilaber> which is based on epoxy https://releases.openstack.org/teams/watcher.html#team-epoxy-watcher
12:28:08 <amoralej> we should recommend to use wsgi for the watcher api
12:28:22 <amoralej> i think it's using eventlet standalone server
12:28:24 <amoralej> right?
12:28:55 <jgilaber> I think you're right
12:28:57 <jgilaber> command: [
12:28:57 <jgilaber> "watchmedo", "auto-restart", "--directory=/app", "--pattern=*.py", "--recursive", "--",  # Dev only
12:28:57 <jgilaber> "bash", "-c",
12:28:57 <jgilaber> "/usr/local/bin/watcher-api --config-file /etc/watcher/watcher.conf > /app/logs/app.log 2>&1"
12:28:58 <jgilaber> ]
12:29:18 <jgilaber> this is the command for the watcher-api container in the docker compose attached to the report
12:29:41 <jgilaber> did we deprecate that usage?
12:29:55 <amoralej> i'd say so
12:30:04 <amoralej> as part of the eventlet changes
12:31:19 <jgilaber> I'm trying to find something in the docs to that effect
12:32:35 <jgilaber> found this release note but I'm not sure it's the same https://docs.openstack.org/releasenotes/watcher/2025.1.html#deprecation-notes
12:33:50 <amoralej> that's different i think
12:35:40 <jgilaber> the installation guide covers only installing using packages https://docs.openstack.org/watcher/latest/install/install-ubuntu.html
12:37:03 <amoralej> i think there was a general recommendation about running api services as wsgi services
12:37:12 <amoralej> i'm not sure if we document it properly, tbh
12:37:48 <jgilaber> from the container logs it looks like it's running in python 3.13 which I don't think we have tested
12:37:49 <jgilaber> 2025-09-01 14:11:02.093 8 DEBUG watcher.common.service [-] ******************************************************************************** log_opt_values /usr/local/lib/python3.13/site-packages/oslo_config/cfg.py:2804
12:38:36 <amoralej> yes, good point too
12:39:11 <jgilaber> for now I think we can ask to run with python 3.12 and mark the bug as need info?
12:39:30 <jgilaber> incomplete, actually
12:42:06 <amoralej> i'd include the wsgi recommendation
12:42:14 <amoralej> yes, and move it to incomplete
12:43:51 <jgilaber> amoralej, can I ask you to add a comment with the wsgi recommendation?
12:44:01 <amoralej> sure
12:44:06 <amoralej> i will do it after the mtg
12:44:08 <jgilaber> thanks!
12:44:21 <jgilaber> we can move to the second bug https://bugs.launchpad.net/watcher/+bug/2127777
12:44:23 <jgilaber> #link https://bugs.launchpad.net/watcher/+bug/2127777
12:44:55 <jgilaber> amoralej, this was filed after last week's meeting discussion right?
12:45:22 <amoralej> yes, actually i already sent a patch for it
12:45:56 <jgilaber> so I don't think we need to discuss much here, just set the importance to high or critical since it's already in progress
12:46:11 <amoralej> i've just assigned to me and set as medium
12:46:15 <amoralej> but i can raise
12:46:47 <jgilaber> unless others object I'm ok with that
12:46:53 <chandankumar> Can we also add target to series 2026.1?
12:47:31 <amoralej> done
12:47:39 <chandankumar> thanks!
12:48:08 <jgilaber> ack, thanks, looks like we are done, on to the next one https://bugs.launchpad.net/watcher/+bug/2127485
12:48:11 <jgilaber> #link https://bugs.launchpad.net/watcher/+bug/2127485
12:49:00 <jgilaber> I opened this one because the cinder client method for migrate does an incomplete check when trying to determine if it can migrate a volume
12:49:39 <jgilaber> it expects that the volume's type is configured with the same volume_backend_name as the destination pool
12:49:44 <jgilaber> which is not required
12:50:13 <jgilaber> I already created a patch for it, it's missing importance, which I think would be medium
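[Editor's sketch] To make the reported behaviour concrete, the following is a simplified, hypothetical illustration of the kind of strict check described above. It is not Watcher's actual code: the function name, the plain-dict inputs, and the backend name "lvm-1" are assumptions made for the example.

```python
def strict_backend_check(type_extra_specs, dest_pool_capabilities):
    """Hypothetical strict check: the volume type must name the pool's backend."""
    type_backend = type_extra_specs.get("volume_backend_name")
    pool_backend = dest_pool_capabilities.get("volume_backend_name")
    return type_backend is not None and type_backend == pool_backend


# A volume type pinned to the destination backend passes the strict check.
pinned_type = {"volume_backend_name": "lvm-1"}
# A volume type with no backend pinned is rejected by the strict check.
unpinned_type = {}
dest_pool = {"volume_backend_name": "lvm-1"}

print(strict_backend_check(pinned_type, dest_pool))
print(strict_backend_check(unpinned_type, dest_pool))
```

The second call shows the case from the bug report: the volume type does not set volume_backend_name, so a check of this shape refuses a migration even though, as noted above, a matching backend name is not required.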
12:51:36 <jgilaber> any comments/objections?
12:52:37 <jgilaber> I'll take silence as a yes :) and set the importance
12:53:33 <jgilaber> so with that we've reached the end of the agenda, any last minute topic? otherwise we just need a volunteer for next week's meeting
12:54:33 <amoralej> i can take it
12:54:34 <morenod> I will
12:54:39 <amoralej> morenod wins :)
12:54:39 <morenod> all yours :P
12:54:48 <morenod> photo finish?
12:54:56 <jgilaber> I'll let you fight for it ;)
12:55:04 <chandankumar> one more thing, During ptg week, do we want to cancel weekly meeting?
12:55:12 <rlandy> ack - was going to ask that
12:55:13 <amoralej> morenod deserves it :)
12:55:19 <amoralej> i'd cancel
12:55:23 <rlandy> ie: the one after next week?
12:55:25 <jgilaber> I think so, that is in two weeks time right?
12:55:29 <chandankumar> yes
12:55:33 <rlandy> there is one next week
12:55:45 <rlandy> the one in two weeks would be ptg
12:55:47 <rlandy> correct
12:56:09 <rlandy> +1 to cancel
12:56:11 <jgilaber> so we meet as usual next week, we cancel the next for ptg, and looks like there is consensus
12:56:56 <amoralej> yep
12:57:48 <jgilaber> ok so that's all for today, thanks for participating!
12:58:03 <jgilaber> #endmeeting