opendevreview | Alfredo Moralejo proposed openstack/watcher master: Fix instance migrations in zone_migration strategy together with volumes https://review.opendev.org/c/openstack/watcher/+/952116 | 06:27 |
---|---|---|
opendevreview | David proposed openstack/watcher-tempest-plugin master: Add tests for workload_balance with injected data https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/949722 | 09:11 |
dviroel | sean-k-mooney: hi - pls revote on this spec when you have some time: https://review.opendev.org/c/openstack/watcher-specs/+/943873 | 11:14 |
sean-k-mooney | done i missed that hey updated it | 11:17 |
opendevreview | Merged openstack/watcher-specs master: feat(spec): disable cold/live migration for host maintenance strategy https://review.opendev.org/c/openstack/watcher-specs/+/943873 | 11:24 |
opendevreview | Merged openstack/watcher-tempest-plugin master: Update tests for host maintenance strategy. https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/950851 | 11:54 |
*** amoralej_ is now known as amoralej | 11:59 | |
dviroel | #startmeeting watcher | 12:00 |
opendevmeet | Meeting started Thu Jun 12 12:00:43 2025 UTC and is due to finish in 60 minutes. The chair is dviroel. Information about MeetBot at http://wiki.debian.org/MeetBot. | 12:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 12:00 |
opendevmeet | The meeting name has been set to 'watcher' | 12:00 |
dviroel | hi all, who's is around today? | 12:00 |
sean-k-mooney | o/ | 12:01 |
sean-k-mooney | im doing spec review but im here | 12:01 |
dviroel | courtesy ping: amoralej jgilaber chandankumar morenod | 12:01 |
jgilaber | o/ | 12:01 |
mtembo | o/ | 12:01 |
opendevreview | David proposed openstack/watcher-tempest-plugin master: Add tests for workload_balance with injected data https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/949722 | 12:02 |
amoralej | o/ | 12:02 |
dviroel | alright, let's start with today's meeting agenda | 12:02 |
dviroel | #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L45 (Meeting agenda) | 12:02 |
dviroel | feel free to add your own topics to the agenda | 12:02 |
dviroel | there are topics to place your changes that requires attention from reviewers | 12:03 |
dviroel | lets start with the first one | 12:03 |
dviroel | #topic Eventlet Removal | 12:03 |
dviroel | #link https://etherpad.opendev.org/p/watcher-eventlet-removal | 12:03 |
rlandy | o/ | 12:03 |
dviroel | not too much to add i think | 12:03 |
dviroel | I updated the patch to move eventlet commands to a different directory | 12:03 |
dviroel | and propose a WIP patch for decision engine | 12:04 |
dviroel | link for related patches: | 12:04 |
dviroel | #link https://etherpad.opendev.org/p/watcher-eventlet-removal#L11 | 12:04 |
dviroel | the WIP patch for decision engine, which is now running on CI, is failing :P | 12:04 |
dviroel | the patch allows to replace usage of green thread pools and test against oslo.service new threading backend | 12:04 |
dviroel | as for now, there is an issue with cluster data model updates which I'm investigating and working on a fix on my env | 12:05 |
dviroel | i will bring more updates about that soon | 12:05 |
dviroel | any question or comments on this topic? | 12:06 |
dviroel | ok, moving to the next topic | 12:07 |
amoralej | the plan is to make the change configurable and disabled by default? | 12:07 |
amoralej | or moving to the new threading backend in oslo.service by default? | 12:07 |
sean-k-mooney | different project are taking diffent approches | 12:08 |
sean-k-mooney | nova is making it configurable via a ENV var | 12:08 |
sean-k-mooney | ironic and neutron just ripped out supprot i think | 12:08 |
dviroel | yes amoralej, and have a CI job running with eventlet disabled for the services that we add support for it | 12:08 |
dviroel | i think that, for now we could go with the ENV var to enable/disable eventlet, like nova | 12:09 |
amoralej | i ask because i just found that the oslo.service supporting the threading backend is blocked in RDO because it requires some other updates in libs | 12:09 |
amoralej | cotyledon, iirc | 12:09 |
amoralej | nothing strategic | 12:10 |
sean-k-mooney | yes it does | 12:10 |
sean-k-mooney | alhtouhg that is not a new requriement in general | 12:10 |
sean-k-mooney | cotyledon was already used in some projects in openstack | 12:10 |
amoralej | yes, but it requires > 2.0 | 12:10 |
sean-k-mooney | that shoudl already be packaged in rdo | 12:10 |
amoralej | and there is 1.7 or something like that | 12:10 |
amoralej | not big thing, but needs to be done | 12:10 |
sean-k-mooney | ack but that would have been requireed for the other service that nativly use it already | 12:11 |
amoralej | also new backend is added as an extras | 12:11 |
amoralej | so, it needs to provide an extra optional rpm | 12:11 |
sean-k-mooney | good to know | 12:11 |
dviroel | right, good point amoralej | 12:11 |
sean-k-mooney | once we start merging support in watcher we wil want to make it a required dep for the watcher rpm | 12:11 |
amoralej | it has automatic deps, so if we add it to requiremets.txt, it should be automatically in the rpm | 12:12 |
sean-k-mooney | but that can happen later in the cycle once we actully have it woring | 12:12 |
amoralej | yeap | 12:12 |
dviroel | thanks for the heads up amoralej | 12:13 |
sean-k-mooney | dviroel: this is not the way to test that by the way https://review.opendev.org/c/openstack/watcher/+/952257/1/watcher/eventlet.py | 12:13 |
sean-k-mooney | you can tweak the job to set the env var for the relevent service | 12:13 |
sean-k-mooney | basicly you shoudl not need to comemnt and hard code the behvior | 12:14 |
sean-k-mooney | also returnign FALSE here https://review.opendev.org/c/openstack/watcher/+/952257/1/watcher/eventlet.py#26 | 12:14 |
sean-k-mooney | is semanticly incorrect | 12:14 |
dviroel | it checks the MONKEY_PATCHED for that | 12:15 |
dviroel | but we can discuss more about there in the change, for sure | 12:15 |
dviroel | the idea is not to hardcoded things | 12:15 |
dviroel | but work based on OS_WATCHER_DISABLE_EVENTLET_PATCHING | 12:16 |
sean-k-mooney | i know but monkey patch shoudl return None | 12:16 |
sean-k-mooney | i dont want to use if _monkey_patch(): | 12:16 |
sean-k-mooney | if is_patching_enabled(): is fine | 12:17 |
sean-k-mooney | but dont conflate the two "_monkey_patch" shoudl patch if not allreay patched it shoudl not be used to make desions | 12:17 |
sean-k-mooney | anyway lets move on | 12:18 |
dviroel | ack, we can discuss more about this in the patch on later in the channel | 12:18 |
dviroel | thanks sean-k-mooney | 12:18 |
sean-k-mooney | also is_patched is the function that you shoudl be useing to make desciosn outside this module | 12:18 |
sean-k-mooney | ill leave thos comment on the review | 12:19 |
dviroel | lets move on to the next topic, and feel free to comment on the WIP patch too, maybe I'm missing some details there | 12:19 |
sean-k-mooney | is_patching_enabled shoudl be private and internal. | 12:19 |
dviroel | sean-k-mooney: ty | 12:19 |
dviroel | #topic Call for reviews | 12:20 |
dviroel | thre is some doc reviews open | 12:20 |
dviroel | #link https://review.opendev.org/c/openstack/watcher/+/951002 Update Host Maintenance strategy documentation | 12:20 |
dviroel | already has some votes on it ^ | 12:21 |
dviroel | #link https://review.opendev.org/c/openstack/watcher/+/951025 Update Workload Balance strategy documentation | 12:22 |
dviroel | rlandy: any comment on this one? | 12:22 |
rlandy | not really - just needs some more votes | 12:23 |
rlandy | the only ones for further discussion are the tables ones | 12:23 |
dviroel | ack, it was recently update, i will review again | 12:23 |
rlandy | those two should match | 12:23 |
dviroel | #link https://review.opendev.org/c/openstack/watcher/+/951440 Add table - level of test/usage per strategy | 12:23 |
dviroel | and | 12:23 |
dviroel | #link https://review.opendev.org/c/openstack/watcher/+/951699 Add Integrations doc page with support matrix | 12:23 |
dviroel | ack, i see that you moved the status from Production -> Supported | 12:25 |
dviroel | to match with the Integration table | 12:25 |
dviroel | they look that are in sync now | 12:25 |
dviroel | going to review it again and vote | 12:26 |
dviroel | pls all, take some time to review these doc updates ^ | 12:26 |
dviroel | there is one more | 12:26 |
dviroel | #link https://review.opendev.org/c/openstack/watcher/+/951899 Add doc clarifications for Zone Migration | 12:26 |
dviroel | ty all for helping improving the docs (and reviewing them) | 12:27 |
dviroel | any other doc patches? | 12:27 |
dviroel | now about specs | 12:28 |
sean-k-mooney | ill try and review those all after the meeting | 12:28 |
dviroel | the spec: feat(spec): disable cold/live migration for host maintenance strategy merged minutes ago | 12:28 |
sean-k-mooney | yep | 12:29 |
dviroel | we still have 3 open for review | 12:29 |
dviroel | #link https://review.opendev.org/c/openstack/watcher-specs/+/947282 (Adds spec for extend compute model attributes) | 12:29 |
dviroel | which I recently updated to reduce the scope ^ - we would end now with 2 specs, one for extending the model, and anothe one for using this new info in strategies | 12:30 |
sean-k-mooney | ya we likely will be able to complete the first this cycle i think | 12:30 |
dviroel | going to propose another one to cover the strategies part | 12:30 |
sean-k-mooney | the second might be next cycle as im not sure we can design it in time for sepc freeze | 12:30 |
sean-k-mooney | we can see how things go | 12:31 |
dviroel | correct, there is a lot to discuss in the second one | 12:31 |
dviroel | ty for all for comments/reviews | 12:31 |
dviroel | next spec | 12:31 |
dviroel | #link https://review.opendev.org/c/openstack/watcher-specs/+/949528 (Add spec for skip actions from action plan executions) | 12:31 |
amoralej | about that, i still need to update it with your comments | 12:31 |
dviroel | i need to revisit the last ps too | 12:32 |
amoralej | but there is an opened topic from gmaan , which is the status of the actionplan when one ore more actions are skipped | 12:32 |
dviroel | it is on my spec review list | 12:32 |
amoralej | my initial approach was keeping it as succeeded but he challenged if succeeded is the same as succeeded_with_skipped | 12:33 |
amoralej | wdyt? should we add a new state to the actionplan? | 12:33 |
amoralej | or keep as succeeded with some note or something visible? | 12:33 |
sean-k-mooney | thats an interesting question | 12:34 |
amoralej | it is :) | 12:34 |
sean-k-mooney | i dont htink it shoudl be a uniqe state | 12:34 |
sean-k-mooney | it could be but then it calles into question the diffent betwen skiped intentionally and skipped by precondtions | 12:35 |
sean-k-mooney | it feels incorrect to skip actions before executing the plann and then to say succeeded_with_skipped | 12:35 |
sean-k-mooney | if not addtional actions were skipped during the execution | 12:36 |
sean-k-mooney | to feel missleadign to me as all the actions i asked to executed were executed | 12:36 |
amoralej | that's a good point, so skipping manually actions would not affect to the succeeded status, but automatically would | 12:36 |
sean-k-mooney | so i woudl only add succeeded_with_skiped if you explcitly exluded ones that were skipped before we started the action plan and that feels overly compelxt IMO | 12:37 |
sean-k-mooney | can you capture this in the spec. i can try an loop back | 12:37 |
sean-k-mooney | and we can discuss it more with gmaan if they have time | 12:37 |
amoralej | so, what would be the case if the action skips automatically based on pre-condution? | 12:37 |
sean-k-mooney | i think we could do it based on th reason filed | 12:38 |
amoralej | yes, we can keep discussing in the review | 12:38 |
jgilaber | amoralej: I would still consider an succesful actionplan that had actions skipped automatically as success | 12:38 |
sean-k-mooney | basiclaly if we set it to "skipped by user" or some knwo string when skipping ebfore the execution | 12:38 |
sean-k-mooney | then we could perhaps have successs_with_skipped if it only was reported that way when there is skipped_by_precondtion in the reaons | 12:39 |
sean-k-mooney | that might be the way forward | 12:39 |
sean-k-mooney | amoralej: in either case we shoudl capture this in the alternitive section regardles of which path we take | 12:40 |
sean-k-mooney | to record this design choice | 12:40 |
amoralej | ack | 12:40 |
dviroel | ack, so for those that want to know more, pls check the spec, there are some discussions in there already | 12:41 |
amoralej | i will do, but instead of using the text reason field, i wonder if that should be a new one | 12:41 |
amoralej | anyway, yes, let's keep the discussion in the spec | 12:41 |
amoralej | any feedback is appreciated | 12:41 |
dviroel | tks amoralej for the updates | 12:42 |
dviroel | thre is one more spec open | 12:42 |
dviroel | #link https://review.opendev.org/c/openstack/watcher-specs/+/949804 (Support multitenancy with Prometheus datasource) | 12:42 |
dviroel | this one I still need to review :( | 12:42 |
dviroel | but sean-k-mooney already raise some discussion/suggestions there | 12:43 |
dviroel | ++ | 12:43 |
sean-k-mooney | ya im not partically happy with how the design of this is beign done | 12:43 |
sean-k-mooney | metta context as well i think they are assumeign we would backprot this supprot downstream | 12:44 |
sean-k-mooney | which they have not discuss with us and i did not plan to do | 12:44 |
sean-k-mooney | that a donwstream dicsusion but | 12:44 |
sean-k-mooney | form my perspecitve they are not properly doing open desgins and following upsream convestion which im annoyed by | 12:45 |
dviroel | ack, thanks for raising that and for requesting a spec, which allow us to discuss more about the design | 12:46 |
dviroel | i will try to get some eyes on this one too | 12:47 |
dviroel | if someone else have some time, pls also take a look | 12:47 |
sean-k-mooney | as it stand the current issue i think is a blocker is changing the obeserablity client to require that we alwasy pass a keystone session | 12:47 |
sean-k-mooney | that shoudl only be done if Ateos is used | 12:48 |
dviroel | +1 | 12:48 |
sean-k-mooney | i condiser it to be a security issue | 12:48 |
dviroel | #link https://opendev.org/openstack/aetos fyi | 12:48 |
sean-k-mooney | to pass a admin toke to prometiosu if it is not required | 12:48 |
sean-k-mooney | because if that is not authenicated viah https | 12:48 |
sean-k-mooney | then it can be sniffed on the wire | 12:48 |
sean-k-mooney | and now you have full admin rights | 12:49 |
sean-k-mooney | so it is not accpable to pass it when it is not used and it need to be opt in for upgrade reasons | 12:49 |
dviroel | upgrade reasons is a important point to raise, for sure | 12:50 |
dviroel | and also the security concerns | 12:50 |
dviroel | thanks for the discussions sean-k-mooney - we should all ready the spec and the comments there | 12:51 |
dviroel | since it is a important design change for watcher | 12:51 |
dviroel | going to move forward, and we should bring these specs again in the next meetings.. | 12:52 |
dviroel | that's all for now in specs | 12:52 |
dviroel | there are some review request for bug fixes too | 12:52 |
dviroel | #link https://review.opendev.org/c/openstack/watcher/+/952212 bug in workload stabilization | 12:52 |
amoralej | i added a those, which are more urgent and i expect are easy to review | 12:53 |
dviroel | ack, going to take a look | 12:53 |
dviroel | and | 12:53 |
dviroel | #link https://review.opendev.org/c/openstack/watcher/+/952364 Aggregate by label when querying instance cpu usage in prometheus | 12:53 |
dviroel | this second one I need some time to process the query:) | 12:54 |
opendevreview | David proposed openstack/watcher-tempest-plugin master: Add tests for workload_balance with injected data https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/949722 | 12:54 |
amoralej | sure :) | 12:55 |
dviroel | but it is something important to review | 12:55 |
dviroel | any question on these patches? | 12:56 |
amoralej | it's finally only changing the "avg by (instance)" to "avg by (resource)" | 12:56 |
amoralej | but i took the oportunity to move the query format to dict which i think it's better | 12:56 |
amoralej | from %s to dict, i mean | 12:56 |
morenod | I confirm that after that change, the workload_balance under prometheus is consistent | 12:56 |
dviroel | ack, it is a small change, but it is also a dangerous one :) | 12:57 |
dviroel | so it requires more eyes | 12:57 |
dviroel | i will take a look | 12:58 |
amoralej | yep, we can break it all by chaging prometheus queries, that's right | 12:58 |
amoralej | :) | 12:58 |
dviroel | we are almost out of time | 12:58 |
dviroel | we can review the bugs listed after the meeting, or someone want to highlight something? | 12:59 |
dviroel | thanks mtembo for volunteer to chair next week meeting | 13:01 |
dviroel | I won't join next week's meeting, since it is a public holiday here | 13:01 |
mtembo | thanks too dviroel | 13:01 |
dviroel | ok, i will move the bugs listed to next week meeting, and close for today | 13:02 |
dviroel | thank you all for joining | 13:02 |
dviroel | #endmeeting | 13:02 |
opendevmeet | Meeting ended Thu Jun 12 13:02:56 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 13:02 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-06-12-12.00.html | 13:02 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-06-12-12.00.txt | 13:02 |
opendevmeet | Log: https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-06-12-12.00.log.html | 13:02 |
amoralej | thanks dviroel for chairing! | 13:04 |
gmaan | sean-k-mooney: amoralej: ++ on only add skip for automatic skipped actions. Main idea is user should know if actions are skipped .when they skip they know so nothing needed but if automatic skipped then they should know either from state or some extra field with reason. | 16:52 |
gmaan | Also, it does not need to be a new state as such but adding skipped action info in other field can server the same purpose | 16:53 |
gmaan | amoralej: I kept spec open to reply but got distracted on other tasks. I replied to your comments in spec, please check | 16:59 |
sean-k-mooney | gmaan: well you can list the action of an action plan and see the status fo each | 17:01 |
sean-k-mooney | gmaan: so in the current propaly the action plan woudl be succeeded adn the action that were skipped would have the skipped statsu and a populated reaosn field | 17:02 |
sean-k-mooney | to say why | 17:02 |
sean-k-mooney | i.e. live mgiration was skipped because the vm was deleted | 17:03 |
sean-k-mooney | somethign like that | 17:03 |
sean-k-mooney | <action> was skipped beacue <precondition> was not met | 17:03 |
gmaan | true but having a field say "action_skipped" on action plan level will give a easy way to know and if user need to go in action status detail | 17:03 |
gmaan | it can be None if no action skipped | 17:03 |
sean-k-mooney | so if we look at action plan show https://docs.openstack.org/api-ref/resource-optimization/#id42 that is the respocen today | 17:04 |
sean-k-mooney | it does not refenc the actions at all | 17:05 |
sean-k-mooney | your suggestion we asdd a list or somehting to say that some acttions were skipped? | 17:05 |
sean-k-mooney | we coudl have a genera "status_message" field rather then "reason" and provide a human readiable message i guess | 17:06 |
gmaan | yeah, because if we keep state same 'success' which make sense then another field showing detail about 'success' state can be useful | 17:06 |
gmaan | sean-k-mooney: ++ "status_message" | 17:07 |
sean-k-mooney | ok that i can get behind, a detail fild would also work | 17:07 |
sean-k-mooney | it woudl be nice ot use the same name for the field on the action and action plan | 17:07 |
sean-k-mooney | but i dont have a stong prefernce | 17:07 |
sean-k-mooney | i jsut want it to be consditent where reasonable | 17:07 |
gmaan | making it more generic and not just limiting to 'skipped" is good idea | 17:07 |
sean-k-mooney | it woudl be useful to report info on the audit too | 17:08 |
gmaan | yeah | 17:08 |
sean-k-mooney | i thikn "status_message", "message", or "detail" are all good options | 17:09 |
sean-k-mooney | amoralej: ^ | 17:09 |
sean-k-mooney | orginall i was suggesting "reason" for skipped, canceled and failed | 17:09 |
sean-k-mooney | but reaons is weired for success | 17:09 |
sean-k-mooney | i know nova uses fault for error reproting but im not sure that the right example to follow | 17:11 |
opendevreview | Douglas Viroel proposed openstack/watcher master: WIP - Merge decision engine services into a single one https://review.opendev.org/c/openstack/watcher/+/952499 | 19:06 |
opendevreview | Douglas Viroel proposed openstack/watcher master: Extend decision engine to support threading mode https://review.opendev.org/c/openstack/watcher/+/952257 | 19:17 |
Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!