opendevreview | Takashi Kajinami proposed openstack/puppet-watcher master: Deprecate support for [oslo_messaging_rabbit] heartbeat_in_pthread https://review.opendev.org/c/openstack/puppet-watcher/+/942045 | 00:53 |
---|---|---|
opendevreview | Merged openstack/puppet-watcher master: Stop using absolute names for defined resource types https://review.opendev.org/c/openstack/puppet-watcher/+/941964 | 01:32 |
amoralej | meeting is starting in a minute, please add your topics to https://etherpad.opendev.org/p/openstack-watcher-irc-meeting | 11:59 |
amoralej | #startmeeting Watcher IRC Weekly Meeting - 20 February 2025 | 12:01 |
opendevmeet | Meeting started Thu Feb 20 12:01:21 2025 UTC and is due to finish in 60 minutes. The chair is amoralej. Information about MeetBot at http://wiki.debian.org/MeetBot. | 12:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 12:01 |
opendevmeet | The meeting name has been set to 'watcher_irc_weekly_meeting___20_february_2025' | 12:01 |
amoralej | o/ | 12:01 |
chandankumar | o/ | 12:01 |
amoralej | who is around? | 12:01 |
jgilaber | o/ | 12:01 |
jneo8 | o/ | 12:01 |
dviroel | o/ | 12:01 |
rlandy | o/ | 12:01 |
marios | o/ hello | 12:01 |
amoralej | hello everyone | 12:01 |
amoralej | I'm giving one more minute to add topics | 12:02 |
amoralej | #topic Host Maintenance strategy adjustment, non-breaking | 12:03 |
amoralej | #info For instances with ephemeral disks—especially databases—we want to ensure they remain on the same node during maintenance rather than being migrated. However, the current strategy migrates all instances on a node by default | 12:03 |
amoralej | jneo8, you want to introduce the topic ? | 12:04 |
jneo8 | Yes | 12:04 |
jneo8 | So now we rely on the host maintenance strategy to put a node into maintenance. But if there is an instance which is a database with a huge ephemeral disk, we don't want to migrate it. And this is not supported by the current strategy right now. | 12:05 |
marios | jneo8: for 1 (from etherpad) ... is that in nova or watcher? I mean "introduce a tag in instance properties"... and for 2 'detect' is that in watcher (trying to understand the proposal) | 12:06 |
marios | also, i am wondering if the work dviroel is doing with the scheduler hints can help us | 12:06 |
jneo8 | So I aim to propose option 1 first. | 12:07 |
amoralej | One of the points is if we want to make that part of instance configuration (tag) or audit/strategy parameter | 12:07 |
jneo8 | In strategy parameter, you mean pass the instance id you want to ignore? | 12:08 |
amoralej | i.e. a parameter in the strategy with max_ephemeral_disk_size, if an instance's ephemeral disk > that value, we'd just stop | 12:08 |
amoralej | no, parameter would be like a max ephemeral disk size, or even disabling migration with ephemeral | 12:09 |
amoralej | it'd be a parametrized version of your option 2 | 12:09 |
jneo8 | Both work. I prefer the first one, as I somehow feel it may be more flexible. | 12:10 |
amoralej | depends, i.e. option 2 would allow to run a host_maintenance with migration "always" for an expected long maintenance | 12:11 |
amoralej | and migrate "only small disks instances" for short maintenances | 12:11 |
amoralej | or not migrate at all ephemeral disks based ones in short maintenances | 12:11 |
amoralej | modifying parameters in audit is easier than modifying tags in individual instances | 12:12 |
amoralej | I'd say ... | 12:12 |
jneo8 | I see... | 12:12 |
sean-k-mooney | o/ | 12:12 |
dviroel | i think that we already support the proposal 1, but with audit scope | 12:13 |
amoralej | yeah, i was also thinking about scope | 12:13 |
dviroel | https://github.com/openstack/watcher/blob/7fcca0cc469b89957fd3821c72c3bb2d167a23ba/watcher/decision_engine/scope/compute.py#L136 | 12:13 |
amoralej | but I guess jneo8 would like to stop the instances in that case | 12:13 |
amoralej | not migrate, but stop | 12:13 |
amoralej | and audit would leave the tagged instances out of scope totally | 12:14 |
amoralej | i guess | 12:14 |
sean-k-mooney | so i have a general topic with the nova folks for the ptg | 12:14 |
sean-k-mooney | not sure what room it will be in | 12:14 |
sean-k-mooney | but i want to discuss with them how we can annotate instances with policies that would influence the behavior of watcher | 12:14 |
sean-k-mooney | i.e. optimize: flavor extra specs or something else | 12:15 |
dviroel | i understood that from 1. the instances would be only skipped from migrating | 12:15 |
sean-k-mooney | so you could express if an instance can be migrated and if so how, live vs cold, things like that | 12:15 |
sean-k-mooney | doing it via the audit scope may be an option too | 12:16 |
sean-k-mooney | as could be a different proposal which was to allow you to modify the action plan before it's executed | 12:16 |
sean-k-mooney | i.e. skipping actions or changing them | 12:16 |
sean-k-mooney | i.e. change the proposed action from cold migrate to noop or live | 12:16 |
jneo8 | > not migrate, but stop | 12:17 |
jneo8 | > and audit would leave the tagged instances out of scope totally | 12:17 |
jneo8 | manually stopping instances and excluding them can work. But it needs some manual operation to stop those instances. So I'd prefer to handle all operations in watcher. | 12:17 |
sean-k-mooney | jneo8: i think there is definitely scope to make it easier to express this in watcher and make it automatic | 12:17 |
sean-k-mooney | we probably should discuss this more in the ptg and prepare a spec to capture the motivating usecases and come up with a design | 12:18 |
amoralej | yes, I also think we should look for a way to automate it | 12:18 |
sean-k-mooney | jneo8: i will note that tags in nova are intended for users to be able to list instances | 12:18 |
sean-k-mooney | they are not meant to change the behavior of other projects | 12:19 |
sean-k-mooney | so masakari is abusing the tag interface | 12:19 |
sean-k-mooney | we could abuse it in the same way but that's not what that api is for | 12:19 |
sean-k-mooney | but that is one of the options i wanted to discuss with the wider core team | 12:19 |
jneo8 | ok | 12:20 |
amoralej | also, if we stop the instances for maintenance, once the host comes up, instances will stay stopped, right? | 12:20 |
sean-k-mooney | i think nova could benefit from a per-instance updatable way to have metadata like this that is key-value and intended for consumption by other projects | 12:21 |
amoralej | so manual task post-maintenance would be required? | 12:21 |
jneo8 | > so manual task post-maintenance would be required? | 12:21 |
jneo8 | That will be a new strategy, right? | 12:21 |
sean-k-mooney | amoralej: depending on how we implemented it we could perhaps annotate them in a way that we could start them again as a result of a new audit | 12:21 |
amoralej | If we get creative, an audit could create two action plans, one pre and one post :) | 12:22 |
sean-k-mooney | yes that might be an option | 12:23 |
amoralej | but yeah, doing a new audit with a new strategy would be also a simpler option | 12:23 |
sean-k-mooney | these are all things that we should discuss in a spec or the ptg | 12:23 |
amoralej | yes, i think it's a good topic | 12:24 |
jneo8 | I can help to create first version of spec. | 12:24 |
amoralej | jneo8, if you agree, i will add your use case to the ptg item lists in the strategies part | 12:24 |
sean-k-mooney | specs are intended to capture the thought process that informed the design and express the intent, not just the mechanics, so future us know why we did it that way :) | 12:24 |
jneo8 | Sure | 12:24 |
amoralej | the one_audit -> two_actionplans would be an API thing, i guess | 12:25 |
amoralej | i guess now it's 1-1 | 12:25 |
sean-k-mooney | maybe not | 12:25 |
amoralej | right, continuous can create multiple actionplans ... | 12:25 |
amoralej | so from api pov, maybe it's fine | 12:26 |
sean-k-mooney | we would have to see but to me the audit is an output of the strategy/goal | 12:26 |
sean-k-mooney | the fact it's 1:1 is also not strictly true for continuous audits today | 12:26 |
amoralej | yeah, i don't know either | 12:26 |
sean-k-mooney | jneo8: thanks for bringing this up | 12:27 |
amoralej | yes | 12:28 |
jneo8 | Sure | 12:28 |
amoralej | thanks jneo8, it really deserves discussion | 12:28 |
jneo8 | Thanks, and sure, I will keep following up. | 12:28 |
amoralej | so, i guess it will be on hold until ptg ? | 12:29 |
jneo8 | I guess yes... | 12:29 |
amoralej | ok | 12:29 |
marios | if you like you could start working on the spec (under 2025.1 for now) | 12:29 |
marios | and we can move it to 2025.2 when it comes at ptg time | 12:30 |
amoralej | as a workaround you can use the scope as dviroel and sean-k-mooney mentioned | 12:30 |
sean-k-mooney | marios: actually we could just do that now | 12:30 |
marios | sure | 12:31 |
sean-k-mooney | we already have the 2025.2 directory in nova | 12:31 |
sean-k-mooney | conventionally we create it sometime between spec freeze (milestone 2) and feature freeze (m3) | 12:31 |
sean-k-mooney | jneo8: so if you felt like it you could copy paste the template and create the 2025.2 dir when working on a draft | 12:32 |
sean-k-mooney | if not one of us will do that in the next week or two | 12:32 |
jneo8 | I guess I may start after next week. | 12:33 |
sean-k-mooney | no rush | 12:33 |
jneo8 | I can wait for you to create that. | 12:33 |
sean-k-mooney | sure, the template will remain mostly the same bar the numbers | 12:34 |
sean-k-mooney | we may update it in the future but we have not had that discussion or really looked at it in detail so i would not worry about that too much | 12:34 |
amoralej | ok, so i think we can move to next topic? | 12:35 |
sean-k-mooney | sure | 12:35 |
amoralej | #topic Request for reviews | 12:35 |
amoralej | #info Add support for prometheus datasource in scenario tests | 12:35 |
amoralej | #link https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/942141 | 12:35 |
amoralej | #link https://review.opendev.org/c/openstack/watcher/+/942150 | 12:36 |
amoralej | dviroel, this is yours | 12:36 |
dviroel | i proposed this tempest-plugin improvement, to support injecting metrics in prometheus | 12:36 |
dviroel | in the way that we do today for gnocchi | 12:36 |
dviroel | there is another watcher change that is testing it | 12:36 |
dviroel | #link https://review.opendev.org/c/openstack/watcher/+/942150 | 12:37 |
dviroel | the watcher-prometheus-integration job is setting prometheus as datasource in tempest config | 12:37 |
dviroel | zone_migration test failed 2x there | 12:38 |
dviroel | but it is one of the strategies that don't really require metrics to work | 12:38 |
dviroel | i think that is a bug on zone_migration strategy, that might deserve a bug report | 12:39 |
dviroel | it tries to get instances from the model, but they don't yet exist there | 12:39 |
dviroel | but what I would like to ask is really some eyes on the tempest change | 12:40 |
amoralej | ok | 12:40 |
amoralej | it's great progress to be able to inject metrics | 12:40 |
dviroel | I also plan to improve these methods (for prometheus) to also support calling promtool in a remote host, to cover other deployments | 12:41 |
dviroel | it will be proposed as a follow up for this one | 12:41 |
amoralej | the connection to remote would be part of the plugin or we'd replace promtool by some kind of wrapper ? | 12:42 |
amoralej | transparently, i mean | 12:42 |
amoralej | how do others do it? | 12:42 |
dviroel | other plugins use tempest ssh client to run commands, usually | 12:43 |
amoralej | just asking, no problem if we didn't dig into it | 12:43 |
dviroel | I am planning to also go with the same approach | 12:43 |
amoralej | ah, right, tempest already has functions to do so | 12:43 |
amoralej | good | 12:43 |
sean-k-mooney | well | 12:43 |
dviroel | sean-k-mooney: mentioned these days the problem with paramiko and FIPS also | 12:43 |
sean-k-mooney | yes and no | 12:43 |
sean-k-mooney | the functions in tempest are for sshing into the guest | 12:44 |
sean-k-mooney | i don't think we should use ssh in the devstack jobs to execute promtool in general | 12:44 |
amoralej | yeah, my question was about the other deployment cases | 12:44 |
amoralej | i assume it will be local in devstack | 12:45 |
sean-k-mooney | we could but i think the plugin should support it for cases where we need to support remote execution of the too | 12:45 |
dviroel | for devstack jobs, it would continue to use subprocess | 12:45 |
sean-k-mooney | *tool | 12:45 |
sean-k-mooney | yep so that's what the whitebox tempest plugin does | 12:45 |
amoralej | +1 | 12:45 |
sean-k-mooney | local execution in devstack and ssh for tripleo | 12:45 |
sean-k-mooney | same test just slightly different way of calling it | 12:46 |
sean-k-mooney | in general tempest is not allowed to connect to the underlying host infrastructure | 12:46 |
amoralej | ok, so let's review dviroel patches when we have some time | 12:46 |
sean-k-mooney | that is because tempest is blackbox testing i.e. only using interfaces available to an end user (admin or otherwise) the "whitebox" name | 12:46 |
sean-k-mooney | is signifying that it's looking at the internals or retrieving data not otherwise available | 12:47 |
sean-k-mooney | for watcher it's sort of gray box testing | 12:47 |
sean-k-mooney | we are looking at interfaces available to watcher which is fine to do in tempest | 12:48 |
sean-k-mooney | but also injecting things which is borderline but i think ok too | 12:48 |
dviroel | +1 | 12:48 |
sean-k-mooney | so ya lets continue with the review | 12:49 |
sean-k-mooney | i think it's highly likely we will be able to merge that this cycle before FF | 12:49 |
dviroel | ack, the main focus of the first one is to enable it on devstack jobs | 12:49 |
dviroel | amoralej: that's what I have for today, tks | 12:51 |
amoralej | thanks dviroel | 12:51 |
amoralej | there are no more topics and we have some minutes | 12:52 |
amoralej | is there anything else you'd like to discuss or want to use the time for some bug triage? | 12:52 |
sean-k-mooney | just an fyi to non redhatters | 12:52 |
sean-k-mooney | tomorrow is a company day at redhat so we will mostly not be around | 12:53 |
sean-k-mooney | psa over :) | 12:53 |
amoralej | right :) | 12:53 |
amoralej | #topic bug triage | 12:53 |
amoralej | let's see if we can take care of something | 12:54 |
amoralej | #link https://bugs.launchpad.net/watcher/+bug/2098701 | 12:54 |
rlandy | malinga is out ... sorry forgot to fill that | 12:54 |
amoralej | that's a new bug reported two days ago | 12:54 |
amoralej | no problem rlandy it's on me as chair :) | 12:54 |
amoralej | a user is asking about "Is there any configuration in watcher to force migration within same aggregate?" | 12:54 |
amoralej | i'd say that also depends on the strategy, whether it just asks nova to find the location or provides a destination? | 12:55 |
dviroel | again, maybe audit scope? providing only the aggregate wanted? | 12:57 |
* dviroel needs some time to play with audit scope | 12:57 | |
* amoralej too | 12:57 | |
amoralej | audit scope restricts the instances that are included in the actions, but i'm not sure about the destination of a migrate action | 12:58 |
amoralej | i guess this somehow interferes with nova scheduling too | 12:59 |
sean-k-mooney | aggregates are not user visible | 13:00 |
rlandy | (time check) | 13:00 |
sean-k-mooney | however watcher is admin only | 13:00 |
amoralej | yeah, we are out of time, sorry | 13:00 |
sean-k-mooney | so maybe | 13:00 |
sean-k-mooney | we can come back to this but its a feature not a bug | 13:00 |
sean-k-mooney | so wishlist at most | 13:00 |
amoralej | we can discuss that in the channel after the meeting | 13:00 |
sean-k-mooney | and move to a spec/blueprint if/when we do it | 13:01 |
amoralej | for now, i will not update the bug | 13:01 |
amoralej | any volunteer for next week chair? | 13:01 |
rlandy | I'll take it | 13:02 |
amoralej | #action rlandy will chair next week | 13:02 |
amoralej | thanks rlandy | 13:02 |
amoralej | and thank you all for joining! | 13:02 |
amoralej | #endmeeting | 13:02 |
opendevmeet | Meeting ended Thu Feb 20 13:02:47 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 13:02 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/watcher_irc_weekly_meeting___20_february_2025/2025/watcher_irc_weekly_meeting___20_february_2025.2025-02-20-12.01.html | 13:02 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/watcher_irc_weekly_meeting___20_february_2025/2025/watcher_irc_weekly_meeting___20_february_2025.2025-02-20-12.01.txt | 13:02 |
opendevmeet | Log: https://meetings.opendev.org/meetings/watcher_irc_weekly_meeting___20_february_2025/2025/watcher_irc_weekly_meeting___20_february_2025.2025-02-20-12.01.log.html | 13:02 |
marios | nice one amoralej | 13:03 |
amoralej | thank you marios | 13:04 |
chandankumar | dviroel: sean-k-mooney Hello, one small review https://review.opendev.org/c/openstack/devstack-plugin-prometheus/+/942053 , test results: https://review.opendev.org/c/openstack/devstack-plugin-prometheus/+/942053/2#message-d46678e889cb3a0b56af568ce0eba1cd1a0127f4 , please take a look, when free, thank you! | 13:59 |
opendevreview | Douglas Viroel proposed openstack/watcher-tempest-plugin master: Add a check for instances in the compute model https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/942308 | 14:23 |
opendevreview | Douglas Viroel proposed openstack/watcher-tempest-plugin master: Add support for prometheus datasource in scenario tests https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/942141 | 15:14 |
opendevreview | Douglas Viroel proposed openstack/watcher master: Enable prometheus datasource in watcher-prometheus-integration job https://review.opendev.org/c/openstack/watcher/+/942150 | 16:32 |
opendevreview | Merged openstack/puppet-watcher master: Deprecate support for [oslo_messaging_rabbit] heartbeat_in_pthread https://review.opendev.org/c/openstack/puppet-watcher/+/942045 | 16:33 |
*** haleyb is now known as haleyb|out | 22:49 |
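
The parametrized variant of option 2 from the host maintenance topic (a maximum ephemeral disk size above which an instance is stopped rather than migrated) might be sketched roughly as follows. This is only an illustration of the idea raised in the meeting, not Watcher code: the names `Instance`, `Action`, `plan_action`, and `max_ephemeral_gb` are hypothetical, and treating `None` as "always migrate" is an assumption made here for the sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    """Hypothetical per-instance outcomes for a host maintenance audit."""
    MIGRATE = "migrate"
    STOP = "stop"


@dataclass
class Instance:
    """Hypothetical, minimal view of an instance (not a Watcher model class)."""
    name: str
    ephemeral_gb: int


def plan_action(instance: Instance, max_ephemeral_gb: Optional[int]) -> Action:
    """Pick an action for one instance on the node going into maintenance.

    Assumption for this sketch: None means "always migrate" (long
    maintenance); 0 means "never migrate instances that have any
    ephemeral disk" (short maintenance).
    """
    if max_ephemeral_gb is None:
        return Action.MIGRATE
    if instance.ephemeral_gb > max_ephemeral_gb:
        # Too much ephemeral data to move: stop the instance in place.
        return Action.STOP
    return Action.MIGRATE


if __name__ == "__main__":
    instances = [Instance("db-1", 500), Instance("web-1", 20)]
    for inst in instances:
        print(inst.name, plan_action(inst, max_ephemeral_gb=100).value)
```

Because the threshold is passed per audit rather than stored on each instance, the same set of instances can be handled differently for long and short maintenance windows, which is the trade-off amoralej raised against per-instance tags.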