12:00:43 <dviroel> #startmeeting watcher
12:00:43 <opendevmeet> Meeting started Thu Jun 12 12:00:43 2025 UTC and is due to finish in 60 minutes.  The chair is dviroel. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:00:43 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
12:00:43 <opendevmeet> The meeting name has been set to 'watcher'
12:00:58 <dviroel> hi all, who's is around today?
12:01:29 <sean-k-mooney> o/
12:01:37 <sean-k-mooney> im doing spec review but im here
12:01:43 <dviroel> courtesy ping: amoralej jgilaber chandankumar morenod
12:01:46 <jgilaber> o/
12:01:51 <mtembo> o/
12:02:01 <opendevreview> David proposed openstack/watcher-tempest-plugin master: Add tests for workload_balance with injected data  https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/949722
12:02:04 <amoralej> o/
12:02:28 <dviroel> alright, let's start with today's meeting agenda
12:02:33 <dviroel> #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L45 (Meeting agenda)
12:02:51 <dviroel> feel free to add your own topics to the agenda
12:03:00 <dviroel> there are topics to place your changes that requires attention from reviewers
12:03:10 <dviroel> lets start with the first one
12:03:27 <dviroel> #topic Eventlet Removal
12:03:35 <dviroel> #link https://etherpad.opendev.org/p/watcher-eventlet-removal
12:03:40 <rlandy> o/
12:03:50 <dviroel> not too much to add i think
12:03:51 <dviroel> I updated the patch to move eventlet commands to a different directory
12:04:06 <dviroel> and propose a WIP patch for decision engine
12:04:12 <dviroel> link for related patches:
12:04:18 <dviroel> #link https://etherpad.opendev.org/p/watcher-eventlet-removal#L11
12:04:37 <dviroel> the WIP patch for decision engine, which is now running on CI, is failing :P
12:04:59 <dviroel> the patch allows to replace usage of green thread pools and test against oslo.service new threading backend
12:05:20 <dviroel> as for now, there is an issue with cluster data model updates which I'm investigating and working on a fix on my env
12:05:49 <dviroel> i will bring more updates about that soon
12:06:26 <dviroel> any question or comments on this topic?
12:07:27 <dviroel> ok, moving to the next topic
12:07:36 <amoralej> the plan is to make the change configurable and disabled by default?
12:07:47 <amoralej> or moving to the new threading backend in oslo.service by default?
12:08:02 <sean-k-mooney> different project are taking diffent approches
12:08:12 <sean-k-mooney> nova is making it configurable via a ENV var
12:08:29 <sean-k-mooney> ironic and neutron just ripped out supprot  i think
12:08:30 <dviroel> yes amoralej, and have a CI job running with eventlet disabled for the services that we add support for it
12:09:06 <dviroel> i think that, for now we could go with the ENV var to enable/disable eventlet, like nova
12:09:44 <amoralej> i ask because i just found that the oslo.service supporting the threading backend is blocked in RDO because it requires some other updates in libs
12:09:48 <amoralej> cotyledon, iirc
12:10:04 <amoralej> nothing strategic
12:10:12 <sean-k-mooney> yes it does
12:10:19 <sean-k-mooney> alhtouhg that is not a new requriement in general
12:10:30 <sean-k-mooney> cotyledon was already used in some projects in openstack
12:10:33 <amoralej> yes, but it requires > 2.0
12:10:37 <sean-k-mooney> that shoudl already be packaged in rdo
12:10:41 <amoralej> and there is 1.7 or something like that
12:10:55 <amoralej> not big thing, but needs to be done
12:11:00 <sean-k-mooney> ack but that would have been requireed for the other service that nativly use it already
12:11:03 <amoralej> also new backend is added as an extras
12:11:14 <amoralej> so, it needs to provide an extra optional rpm
12:11:37 <sean-k-mooney> good to know
12:11:54 <dviroel> right, good point amoralej
12:11:55 <sean-k-mooney> once we start merging support in watcher we wil want to make it a required dep for the watcher rpm
12:12:23 <amoralej> it has automatic deps, so if we add it to requiremets.txt, it should be automatically in the rpm
12:12:26 <sean-k-mooney> but that can happen later in the cycle once we actully have it woring
12:12:33 <amoralej> yeap
12:13:14 <dviroel> thanks for the heads up amoralej
12:13:33 <sean-k-mooney> dviroel: this is not the way to test that by the way https://review.opendev.org/c/openstack/watcher/+/952257/1/watcher/eventlet.py
12:13:52 <sean-k-mooney> you can tweak the job to set the env var for the relevent service
12:14:12 <sean-k-mooney> basicly you shoudl not need to comemnt and hard code the behvior
12:14:40 <sean-k-mooney> also returnign FALSE here https://review.opendev.org/c/openstack/watcher/+/952257/1/watcher/eventlet.py#26
12:14:45 <sean-k-mooney> is semanticly incorrect
12:15:13 <dviroel> it checks the MONKEY_PATCHED for that
12:15:35 <dviroel> but we can discuss more about there in the change, for sure
12:15:43 <dviroel> the idea is not to hardcoded things
12:16:08 <dviroel> but work based on OS_WATCHER_DISABLE_EVENTLET_PATCHING
12:16:26 <sean-k-mooney> i know but monkey patch shoudl return None
12:16:40 <sean-k-mooney> i dont want to use   if _monkey_patch():
12:17:17 <sean-k-mooney> if is_patching_enabled(): is fine
12:17:57 <sean-k-mooney> but dont conflate the two "_monkey_patch" shoudl patch if not allreay patched it shoudl not  be used to make desions
12:18:08 <sean-k-mooney> anyway lets move on
12:18:31 <dviroel> ack, we can discuss more about this in the patch on later in the channel
12:18:38 <dviroel> thanks sean-k-mooney
12:18:58 <sean-k-mooney> also is_patched is the function that you shoudl be useing to make desciosn outside this module
12:19:06 <sean-k-mooney> ill leave thos comment on the review
12:19:18 <dviroel> lets move on to the next topic, and feel free to comment on the WIP patch too, maybe I'm missing some details there
12:19:21 <sean-k-mooney> is_patching_enabled shoudl be private and internal.
12:19:21 <dviroel> sean-k-mooney: ty
12:20:18 <dviroel> #topic Call for reviews
12:20:31 <dviroel> thre is some doc reviews open
12:20:42 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/951002 Update Host Maintenance strategy documentation
12:21:08 <dviroel> already has some votes on it ^
12:22:15 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/951025  Update Workload Balance strategy documentation
12:22:46 <dviroel> rlandy: any comment on this one?
12:23:04 <rlandy> not really - just needs some more votes
12:23:18 <rlandy> the only ones for further discussion are the tables ones
12:23:28 <dviroel> ack, it was recently update, i will review again
12:23:33 <rlandy> those two should match
12:23:38 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/951440  Add table - level of test/usage per strategy
12:23:49 <dviroel> and
12:23:53 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/951699  Add Integrations doc page with support matrix
12:25:01 <dviroel> ack, i see that you moved the status from Production -> Supported
12:25:14 <dviroel> to match with the Integration table
12:25:28 <dviroel> they look that are in sync now
12:26:17 <dviroel> going to review it again and vote
12:26:33 <dviroel> pls all, take some time to review these doc updates ^
12:26:49 <dviroel> there is one more
12:26:52 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/951899 Add doc clarifications for Zone Migration
12:27:29 <dviroel> ty all for helping improving the docs (and reviewing them)
12:27:51 <dviroel> any other doc patches?
12:28:25 <dviroel> now about specs
12:28:41 <sean-k-mooney> ill try and review those all after the meeting
12:28:58 <dviroel> the spec:  feat(spec): disable cold/live migration for host maintenance strategy merged minutes ago
12:29:08 <sean-k-mooney> yep
12:29:11 <dviroel> we still have 3 open for review
12:29:19 <dviroel> #link https://review.opendev.org/c/openstack/watcher-specs/+/947282 (Adds spec for extend compute model attributes)
12:30:01 <dviroel> which I recently updated to reduce the scope ^ - we would end now with 2 specs, one for extending the model, and anothe one for using this new info in strategies
12:30:29 <sean-k-mooney> ya we likely will be able to complete the first this cycle i think
12:30:30 <dviroel> going to propose another one to cover the strategies part
12:30:51 <sean-k-mooney> the second might be next cycle as im not sure we can design it in time for sepc freeze
12:31:03 <sean-k-mooney> we can see how things go
12:31:10 <dviroel> correct, there is a lot to discuss in the second one
12:31:37 <dviroel> ty for all for comments/reviews
12:31:41 <dviroel> next spec
12:31:44 <dviroel> #link https://review.opendev.org/c/openstack/watcher-specs/+/949528 (Add spec for skip actions from action plan executions)
12:31:59 <amoralej> about that, i still need to update it with your comments
12:32:18 <dviroel> i need to revisit the last ps too
12:32:32 <amoralej> but there is an opened topic from gmaan , which is the status of the actionplan when one ore more actions are skipped
12:32:32 <dviroel> it is on my spec review list
12:33:20 <amoralej> my initial approach was keeping it as succeeded but he challenged if succeeded is the same as succeeded_with_skipped
12:33:33 <amoralej> wdyt? should we add a new state to the actionplan?
12:33:55 <amoralej> or keep as succeeded with some note or something visible?
12:34:17 <sean-k-mooney> thats an interesting question
12:34:27 <amoralej> it is :)
12:34:31 <sean-k-mooney> i dont htink it shoudl be a uniqe state
12:35:10 <sean-k-mooney> it could be but then it calles into question the diffent betwen skiped intentionally and skipped by precondtions
12:35:54 <sean-k-mooney> it feels incorrect to skip actions before executing the plann and then to say succeeded_with_skipped
12:36:06 <sean-k-mooney> if not addtional actions were skipped during the execution
12:36:27 <sean-k-mooney> to feel missleadign to me as all the actions i asked to executed were executed
12:36:49 <amoralej> that's a good point, so skipping manually actions would not affect to the succeeded status, but automatically would
12:37:11 <sean-k-mooney> so i woudl only add succeeded_with_skiped if you explcitly exluded ones that were skipped before we started the action plan and that feels overly compelxt IMO
12:37:34 <sean-k-mooney> can you capture this in the spec. i can try an loop back
12:37:42 <sean-k-mooney> and we can discuss it more with gmaan  if they have time
12:37:55 <amoralej> so, what would be the case if the action skips automatically based on pre-condution?
12:38:05 <sean-k-mooney> i think we could do it based on th reason filed
12:38:10 <amoralej> yes, we can keep discussing in the review
12:38:10 <jgilaber> amoralej: I would still consider an succesful actionplan that had actions skipped automatically as success
12:38:25 <sean-k-mooney> basiclaly if we set it to "skipped by user" or some knwo string when skipping ebfore the execution
12:39:02 <sean-k-mooney> then we could perhaps have successs_with_skipped if it only was reported that way when there is skipped_by_precondtion in the reaons
12:39:10 <sean-k-mooney> that might be the way forward
12:40:09 <sean-k-mooney> amoralej: in either case we shoudl capture this in the alternitive section regardles of which path we take
12:40:16 <sean-k-mooney> to record this design choice
12:40:18 <amoralej> ack
12:41:19 <dviroel> ack, so for those that want to know more, pls check the spec, there are some discussions in there already
12:41:30 <amoralej> i will do, but instead of using the text reason field, i wonder if that should be a new one
12:41:48 <amoralej> anyway, yes, let's keep the discussion in the spec
12:41:57 <amoralej> any feedback is appreciated
12:42:10 <dviroel> tks amoralej for the updates
12:42:21 <dviroel> thre is one more spec open
12:42:28 <dviroel> #link https://review.opendev.org/c/openstack/watcher-specs/+/949804 (Support multitenancy with Prometheus datasource)
12:42:45 <dviroel> this one I still need to review :(
12:43:30 <dviroel> but sean-k-mooney already raise some discussion/suggestions there
12:43:31 <dviroel> ++
12:43:53 <sean-k-mooney> ya im not partically happy with how the design of this is beign done
12:44:13 <sean-k-mooney> metta context as well i think they are assumeign we would backprot this supprot downstream
12:44:34 <sean-k-mooney> which they have not discuss with us and i did not plan to do
12:44:51 <sean-k-mooney> that a donwstream dicsusion but
12:45:08 <sean-k-mooney> form my perspecitve they are not properly doing open desgins and following upsream convestion which im annoyed by
12:46:44 <dviroel> ack, thanks for raising that and for requesting a spec, which allow us to discuss more about the design
12:47:02 <dviroel> i will try to get some eyes on this one too
12:47:18 <dviroel> if someone else have some time, pls also take a look
12:47:49 <sean-k-mooney> as it stand the current issue i think is a blocker is changing the obeserablity client to require that we alwasy pass a keystone session
12:48:04 <sean-k-mooney> that shoudl only be done if Ateos is used
12:48:10 <dviroel> +1
12:48:32 <sean-k-mooney> i condiser it to be a security issue
12:48:41 <dviroel> #link https://opendev.org/openstack/aetos fyi
12:48:41 <sean-k-mooney> to pass a admin toke to prometiosu if it is not required
12:48:50 <sean-k-mooney> because if that is not authenicated viah https
12:48:58 <sean-k-mooney> then it can be sniffed on the wire
12:49:04 <sean-k-mooney> and now you have full admin rights
12:49:33 <sean-k-mooney> so it is not accpable to pass it when it is not used and it need to be opt in for upgrade reasons
12:50:33 <dviroel> upgrade reasons is a important point to raise, for sure
12:50:58 <dviroel> and also the security concerns
12:51:27 <dviroel> thanks for the discussions sean-k-mooney - we should all ready the spec and the comments there
12:51:45 <dviroel> since it is a important design change for watcher
12:52:08 <dviroel> going to move forward, and we should bring these specs again in the next meetings..
12:52:33 <dviroel> that's all for now in specs
12:52:39 <dviroel> there are some review request for bug fixes too
12:52:48 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/952212 bug in workload stabilization
12:53:18 <amoralej> i added a those, which are more urgent and i expect are easy to review
12:53:29 <dviroel> ack, going to take a look
12:53:46 <dviroel> and
12:53:52 <dviroel> #link https://review.opendev.org/c/openstack/watcher/+/952364 Aggregate by label when querying instance cpu usage in prometheus
12:54:10 <dviroel> this second one I need some time to process the query:)
12:54:52 <opendevreview> David proposed openstack/watcher-tempest-plugin master: Add tests for workload_balance with injected data  https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/949722
12:55:00 <amoralej> sure :)
12:55:26 <dviroel> but it is something important to review
12:56:12 <dviroel> any question on these patches?
12:56:23 <amoralej> it's finally only changing the "avg by (instance)"  to "avg by (resource)"
12:56:43 <amoralej> but i took the oportunity to move the query format to dict which i think it's better
12:56:55 <amoralej> from %s to dict, i mean
12:56:55 <morenod> I confirm that after that change, the workload_balance under prometheus is consistent
12:57:30 <dviroel> ack, it is a small change, but it is also a dangerous one :)
12:57:39 <dviroel> so it requires more eyes
12:58:11 <dviroel> i will take a look
12:58:22 <amoralej> yep, we can break it all by chaging prometheus queries, that's right
12:58:33 <amoralej> :)
12:58:56 <dviroel> we are almost out of time
12:59:49 <dviroel> we can review the bugs listed after the meeting, or someone want to highlight something?
13:01:04 <dviroel> thanks mtembo for volunteer to chair next week meeting
13:01:30 <dviroel> I won't join next week's meeting, since it is a public holiday here
13:01:38 <mtembo> thanks too dviroel
13:02:22 <dviroel> ok, i will move the bugs listed to next week meeting, and close for today
13:02:42 <dviroel> thank you all for joining
13:02:56 <dviroel> #endmeeting