Thursday, 2025-02-20

opendevreviewTakashi Kajinami proposed openstack/puppet-watcher master: Deprecate support for [oslo_messaging_rabbit] heartbeat_in_pthread  https://review.opendev.org/c/openstack/puppet-watcher/+/94204500:53
opendevreviewMerged openstack/puppet-watcher master: Stop using absolute names for defined resource types  https://review.opendev.org/c/openstack/puppet-watcher/+/94196401:32
amoralejmeeting is starting in a minute, please add your topics to https://etherpad.opendev.org/p/openstack-watcher-irc-meeting11:59
amoralej#startmeeting Watcher IRC Weekly Meeting - 20 February 202512:01
opendevmeetMeeting started Thu Feb 20 12:01:21 2025 UTC and is due to finish in 60 minutes.  The chair is amoralej. Information about MeetBot at http://wiki.debian.org/MeetBot.12:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.12:01
opendevmeetThe meeting name has been set to 'watcher_irc_weekly_meeting___20_february_2025'12:01
amoralejo/12:01
chandankumaro/12:01
amoralejwho is around?12:01
jgilabero/12:01
jneo8o/12:01
dviroelo/12:01
rlandyo/12:01
marioso/ hello12:01
amoralejhello everyone12:01
amoralejI'm giving one more minute to add topics12:02
amoralej#topic Host Maintenance strategy adjustment, non-breaking12:03
amoralej#info For instances with ephemeral disks—especially databases—we want to ensure they remain on the same node during maintenance rather than being migrated. However, the current strategy migrates all instances on a node by default12:03
amoralejjneo8, you want to introduce the topic ?12:04
jneo8Yes12:04
jneo8So now we rely on the host maintenance strategy to put a node into maintenance. But if there is an instance that is a database with a huge ephemeral disk, we don't want to migrate it. And this is not supported by the current strategy right now.12:05
mariosjneo8: for 1 (from etherpad) ... is that in nova or watcher? I mean "introduce a tag in instance properties"... and for 2 'detect' is that in watcher (trying to understand the proposal)12:06
mariosalso, i am wondering if the work dviroel is doing with the scheduler hints can help us 12:06
jneo8So I aim to propose option 1 first.12:07
amoralejOne of the points is whether we want to make that part of instance configuration (tag) or an audit/strategy parameter12:07
jneo8By strategy parameter, you mean passing the instance id you want to ignore?12:08
amoraleji.e. a max_ephemeral_disk_size parameter in the strategy; if an instance's ephemeral disk is larger than that value, we'd just stop12:08
amoralejno, parameter would be like a max ephemeral disk size, or even disabling migration with ephemeral12:09
amoralejit'd be a parametrized version of your option 212:09
jneo8Both work. I prefer the first one as I somehow feel it may be more flexible.12:10
amoralejdepends, i.e. option 2 would allow running a host_maintenance with migration "always" for an expected long maintenance12:11
amoralejand migrate "only small disks instances" for short maintenances12:11
amoralejor not migrate at all ephemeral disks based ones in short maintenances12:11
amoralejmodifying parameters in an audit is easier than modifying tags in individual instances12:12
amoralejI'd say ...12:12
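
The parametrized variant of option 2 discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `max_ephemeral_disk_gb`, the `Instance` class, and `plan_for_maintenance` are hypothetical names, not part of Watcher's host_maintenance strategy today.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Instance:
    # Hypothetical, simplified view of an instance; not Watcher's model class.
    uuid: str
    ephemeral_gb: int


def plan_for_maintenance(
    instances: List[Instance],
    max_ephemeral_disk_gb: Optional[int] = None,
) -> Tuple[List[Instance], List[Instance]]:
    """Split instances into (to_migrate, to_keep).

    max_ephemeral_disk_gb=None means "always migrate" (long maintenance);
    0 means "never migrate instances that have an ephemeral disk".
    """
    if max_ephemeral_disk_gb is None:
        return list(instances), []
    to_migrate, to_keep = [], []
    for inst in instances:
        if inst.ephemeral_gb > max_ephemeral_disk_gb:
            # Too large to move for this maintenance window: leave it on the node.
            to_keep.append(inst)
        else:
            to_migrate.append(inst)
    return to_migrate, to_keep
```

What happens to the instances in `to_keep` (stop them, or leave them untouched) is the open design question from the discussion and is deliberately left out of the sketch.
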
jneo8I see...12:12
sean-k-mooneyo/12:12
dviroeli think that we already support the proposal 1, but with audit scope12:13
amoralejyeah, i was also thinking about scope12:13
dviroelhttps://github.com/openstack/watcher/blob/7fcca0cc469b89957fd3821c72c3bb2d167a23ba/watcher/decision_engine/scope/compute.py#L13612:13
amoralejbut I guess jneo8 would like to stop the instances in that case12:13
amoralejnot migrate, but stop12:13
amoralejand audit would leave the tagged instances out of scope totally12:14
amoraleji guess12:14
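
As a rough illustration of the scope workaround, the sketch below filters instances using an exclude list shaped like an audit scope. The dictionary layout (a `compute` section with an `exclude` entry listing `instances` by `uuid`) is an assumption about Watcher's compute scope and should be checked against the linked compute.py; `excluded_instance_uuids` and `in_scope` are hypothetical helpers, not Watcher code.

```python
from typing import Dict, List, Set

# Assumed shape of an audit scope that excludes specific instances.
AUDIT_SCOPE = [
    {"compute": [
        {"exclude": [
            {"instances": [{"uuid": "db-instance-uuid"}]},
        ]},
    ]},
]


def excluded_instance_uuids(scope: List[Dict]) -> Set[str]:
    """Collect instance UUIDs listed under compute -> exclude -> instances."""
    uuids = set()
    for section in scope:
        for rule in section.get("compute", []):
            for excluded in rule.get("exclude", []):
                for inst in excluded.get("instances", []):
                    uuids.add(inst["uuid"])
    return uuids


def in_scope(instance_uuids: List[str], scope: List[Dict]) -> List[str]:
    """Return the instances an audit would still consider."""
    skipped = excluded_instance_uuids(scope)
    return [u for u in instance_uuids if u not in skipped]
```

Instances filtered this way are out of scope entirely, which matches the point above: the audit neither migrates nor stops them.
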
sean-k-mooneyso i have a general topic with the nova folks for the ptg12:14
sean-k-mooneynot sure what room it will be in12:14
sean-k-mooneybut i want to discuss with them how we can annotate instances with policies that would influence the behavior of watcher12:14
sean-k-mooneyi.e. optimize: flavor extra specs or something else12:15
dviroeli understood that from 1. the instances would be only skipped from migrating12:15
sean-k-mooneyso you could express if an instance can be migrated and if so how, live vs cold, things like that12:15
sean-k-mooneydoing it via the audit scope may be an option too12:16
sean-k-mooneyas could be a different proposal which was to allow you to modify the action plan before it's executed12:16
sean-k-mooneyi.e. skipping actions or changing them12:16
sean-k-mooneyi.e. change the proposed action from cold migrate to noop or live12:16
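
The idea of editing an action plan before execution might look something like the sketch below. Everything here is hypothetical: the action dictionaries, the `policy` mapping, and `adjust_actions` are illustrative only and do not reflect Watcher's action plan API.

```python
from typing import Dict, List

# Hypothetical per-instance policy: uuid -> "noop", "live", or "cold".
Policy = Dict[str, str]


def adjust_actions(actions: List[dict], policy: Policy) -> List[dict]:
    """Return a copy of the plan with migrate actions rewritten per policy.

    Each action is an illustrative dict such as
    {"type": "migrate", "uuid": "...", "migration_type": "cold"}.
    """
    adjusted = []
    for action in actions:
        new_action = dict(action)
        wanted = policy.get(action.get("uuid"))
        if action.get("type") == "migrate" and wanted:
            if wanted == "noop":
                # Skip the migration but keep a placeholder in the plan.
                new_action = {"type": "noop", "uuid": action["uuid"]}
            else:
                new_action["migration_type"] = wanted
        adjusted.append(new_action)
    return adjusted
```

The input list is left unmodified, so the original plan stays available for comparison or audit.
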
jneo8> not migrate, but stop12:17
jneo8> and audit would leave the tagged instances out of scope totally12:17
jneo8Manually stopping instances and excluding them can work, but it needs some manual operation to stop those instances. So I'd prefer to handle all operations in watcher.12:17
sean-k-mooneyjneo8: i think there is definitely scope to make it easier to express this in watcher and make it automatic12:17
sean-k-mooneywe probably should discuss this more in the ptg and prepare a spec to capture the motivating usecases and come up with a design 12:18
amoralejyes, I also think we should look for a way to automate it12:18
sean-k-mooneyjneo8: i will note that tags in nova are intended for users to be able to list instances 12:18
sean-k-mooneythey are not meant to change the behavior of other projects12:19
sean-k-mooneyso masakari is abusing the tag interface12:19
sean-k-mooneywe could abuse it in the same way but that's not what that api is for12:19
sean-k-mooneybut that is one of the options i wanted to discuss with the wider core team12:19
jneo8ok12:20
amoralejalso, if we stop the instances for maintenance, once the host comes up, instances will stay stopped, right?12:20
sean-k-mooneyi think nova could benefit from a per-instance updatable way to have metadata like this that is key-value and intended for consumption by other projects12:21
amoralejso manual task post-maintenance would be required?12:21
jneo8> so manual task post-maintenance would be required?12:21
jneo8That will be a new strategy, right?12:21
sean-k-mooneyamoralej: depending on how we implemented it we could perhaps annotate them in a way that we could start them again as a result of a new audit12:21
amoralejIf we get creative, an audit could create two action plans, one pre and one post :)12:22
sean-k-mooneyyes that might be an option12:23
amoralejbut yeah, doing a new audit with a new strategy would be also a simpler option12:23
sean-k-mooneythese are all things that we should discuss in a spec or the ptg12:23
amoralejyes, i think it's a good topic12:24
jneo8I can help to create first version of spec.12:24
amoralejjneo8, if you agree, i will add your use case to the ptg item lists in the strategies part12:24
sean-k-mooneyspecs are intended to capture the thought process that informed the design and express the intent, not just the mechanics, so future us know why we did it that way :)12:24
jneo8Sure12:24
amoralejthe one_audit -> two_auditplanes would be an API thing, i guess12:25
amoraleji guess now it's 1-112:25
sean-k-mooneymaybe not12:25
amoralejright, continuous can create multiple actionplans ...12:25
amoralejso from api pov, maybe it's fine12:26
sean-k-mooneywe would have to see but to me the audit is an output of the strategy/goal12:26
sean-k-mooneythe fact it's 1:1 is also not strictly true for continuous audits today12:26
amoralejyeah, i don't know either12:26
sean-k-mooneyjneo8: thanks for bringing this up12:27
amoralejyes12:28
jneo8Sure12:28
amoralejthanks jneo8, it really deserves discussion12:28
jneo8Thanks, and sure I will keep following up.12:28
amoralejso, i guess it will be on hold until ptg ?12:29
jneo8I guess yes...12:29
amoralejok12:29
mariosif you like you could start working on the spec (under 2025.1 for now)12:29
mariosand we can move it to 2025.2 when it comes at ptg time12:30
amoralejas a workaround you can use the scope as dviroel and sean-k-mooney mentioned12:30
sean-k-mooneymarios: actually we could just do that now12:30
mariossure12:31
sean-k-mooneywe already have the 2025.2 directory in nova12:31
sean-k-mooneyconventionally we create it sometime between spec freeze (milestone 2) and feature freeze (m3)12:31
sean-k-mooneyjneo8: so if you felt like it you could copy paste the template and create the 2025.2 dir when working on a draft12:32
sean-k-mooneyif not one of us will do that in the next week or two12:32
jneo8I guess I may start after next week.12:33
sean-k-mooneyno rush12:33
jneo8I can wait for you to create that.12:33
sean-k-mooneysure the template will remain mostly the same bar the numbers12:34
sean-k-mooneywe may update it in the future but we have not had that discussion or really looked at it in detail so i would not worry about that too much12:34
amoralejok, so i think we can move to next topic?12:35
sean-k-mooneysure12:35
amoralej#topic Request for reviews12:35
amoralej#info Add support for prometheus datasource in scenario tests12:35
amoralej#link https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/94214112:35
amoralej#link https://review.opendev.org/c/openstack/watcher/+/94215012:36
amoralejdviroel, this is yours12:36
dviroeli proposed this tempest-plugin improvement, to support injecting metrics in prometheus12:36
dviroelin the way that we do today for gnocchi12:36
dviroelthere is another watcher change that is testing it12:36
dviroel#link https://review.opendev.org/c/openstack/watcher/+/94215012:37
dviroelthe watcher-prometheus-integration job is setting prometheus as datasource in tempest config12:37
dviroelzone_migration test failed 2x there12:38
dviroelbut it is one of the strategies that don't really require metrics to work12:38
dviroeli think that is a bug in the zone_migration strategy, that might deserve a bug report12:39
dviroelit tries to get instances from the model, but they don't yet exist there12:39
dviroelbut what I would like to ask is really some eyes on the tempest change12:40
amoralejok12:40
amoralejit's great progress to be able to inject metrics12:40
dviroelI also plan to improve these methods (for prometheus) to also support calling promtool on a remote host, to cover other deployments12:41
dviroelit will be proposed as a follow up for this one12:41
amoralejthe connection to remote would be part of the plugin or we'd replace promtool by some kind of wrapper ?12:42
amoralejtransparently, i mean12:42
amoralejhow others do it?12:42
dviroelother plugins use tempest ssh client to run commands, usually12:43
amoralejjust asking, no problem if we didn't dig into it12:43
dviroelI am planning to also go with the same approach12:43
amoralejah, right, tempest already has functions to do so12:43
amoralejgood12:43
sean-k-mooneywell12:43
dviroelsean-k-mooney: mentioned these days the problem with paramiko and FIPS also12:43
sean-k-mooneyyes and no12:43
sean-k-mooneythe functions in tempest are for sshing into the guest12:44
sean-k-mooneyi dont think we should use ssh in the devstack jobs to execute promtool in general12:44
amoralejyeah, my question was about the other deployment cases12:44
amoraleji assume it will be local in devstack12:45
sean-k-mooneywe could but i think the plugin should support it for cases where we need to support remote execution of the too12:45
dviroelfor devstack jobs, it would continue to use subprocess12:45
sean-k-mooney*tool12:45
sean-k-mooneyyep so that's what the whitebox tempest plugin does12:45
amoralej+112:45
sean-k-mooneylocal execution in devstack and ssh for tripleo12:45
sean-k-mooneysame test just slightly different way of calling it12:46
sean-k-mooneyin general tempest is not allowed to connect to the underlying host infrastructure12:46
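
The "same test, different way of calling it" approach could be sketched as below. This is an assumption-laden illustration: it shells out to the `ssh` command as a stand-in for whatever SSH client the plugin ends up using, and `build_command` and `run_tool` are hypothetical helpers, not functions from tempest or the watcher-tempest-plugin.

```python
import shlex
import subprocess
from typing import List, Optional


def build_command(args: List[str], remote_host: Optional[str] = None) -> List[str]:
    """Return argv for local execution, or wrapped in ssh for a remote host."""
    if remote_host is None:
        return list(args)
    # The remote shell re-parses the command line, so quote each argument.
    return ["ssh", remote_host, " ".join(shlex.quote(a) for a in args)]


def run_tool(args: List[str], remote_host: Optional[str] = None) -> str:
    """Run the tool and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        build_command(args, remote_host),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

A test would call `run_tool([...])` in a devstack job and pass `remote_host` only in deployments where the tool is not available locally.
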
amoralejok, so let's review dviroel patches when we have some time12:46
sean-k-mooneythat is because tempest is blackbox testing i.e. only using interfaces available to an end user (admin or otherwise) the "whitebox" name12:46
sean-k-mooneyis signifying that it's looking at the internals or retrieving data not otherwise available12:47
sean-k-mooneyfor watcher it's sort of gray box testing 12:47
sean-k-mooneywe are looking at interfaces available to watcher which is fine to do in tempest12:48
sean-k-mooneybut also injecting things which is borderline but i think ok too12:48
dviroel+112:48
sean-k-mooneyso ya lets continue with the review12:49
sean-k-mooneyi think it's highly likely we will be able to merge that this cycle before FF12:49
dviroelack, the main focus of the first one is to enable it on devstack jobs12:49
dviroelamoralej: that's what I have for today, tks12:51
amoralejthanks dviroel 12:51
amoralejthere are no more topics and we have some minutes12:52
amoralejis there anything else you'd like to discuss or want to use the time for some bug triage?12:52
sean-k-mooneyjust an fyi to non redhatters12:52
sean-k-mooneytomorrow is a company day at redhat so we will mostly not be around 12:53
sean-k-mooneypsa over :)12:53
amoralejright :)12:53
amoralej#topic bug triage12:53
amoralejlet's see if we can take care of something12:54
amoralej#link https://bugs.launchpad.net/watcher/+bug/209870112:54
rlandymalinga is out ... sorry forgot to fill that12:54
amoralejthat's a new bug reported two days ago12:54
amoralejno problem rlandy it's on me as chair :)12:54
amoraleja user is asking about "Is there any configuration in watcher to force migration within same aggregate?"12:54
amoraleji'd say that also depends on the strategy, whether it just asks nova to find the location or provides a destination?12:55
dviroelagain, maybe audit scope? providing only the aggregate wanted?12:57
* dviroel needs some time to play with audit scope12:57
* amoralej too12:57
amoralejaudit scope restricts the instances that are included in the actions, but i'm not sure about the destination of a migrate action12:58
amoraleji guess this somehow interferes with nova scheduling too12:59
sean-k-mooneyaggregates are not user visible13:00
rlandy(time check)13:00
sean-k-mooneyhowever watcher is admin only13:00
amoralejyeah, we are out of time, sorry13:00
sean-k-mooneyso maybe13:00
sean-k-mooneywe can come back to this but it's a feature not a bug13:00
sean-k-mooneyso wishlist at most13:00
amoralejwe can discuss that in the channel after the meeting13:00
sean-k-mooneyand move to a spec/blueprint if/when we do it13:01
amoralejfor now, i will not update the bug13:01
amoralejany volunteer for next week chair?13:01
rlandyI'll take it13:02
amoralej#action rlandy will chair next week13:02
amoralejthanks rlandy13:02
amoralejand thank you all for joining!13:02
amoralej#endmeeting13:02
opendevmeetMeeting ended Thu Feb 20 13:02:47 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)13:02
opendevmeetMinutes:        https://meetings.opendev.org/meetings/watcher_irc_weekly_meeting___20_february_2025/2025/watcher_irc_weekly_meeting___20_february_2025.2025-02-20-12.01.html13:02
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/watcher_irc_weekly_meeting___20_february_2025/2025/watcher_irc_weekly_meeting___20_february_2025.2025-02-20-12.01.txt13:02
opendevmeetLog:            https://meetings.opendev.org/meetings/watcher_irc_weekly_meeting___20_february_2025/2025/watcher_irc_weekly_meeting___20_february_2025.2025-02-20-12.01.log.html13:02
mariosnice one amoralej 13:03
amoralejthank you marios 13:04
chandankumardviroel: sean-k-mooney Hello, one small review https://review.opendev.org/c/openstack/devstack-plugin-prometheus/+/942053 , test results: https://review.opendev.org/c/openstack/devstack-plugin-prometheus/+/942053/2#message-d46678e889cb3a0b56af568ce0eba1cd1a0127f4 , please take a look, when free, thank you!13:59
opendevreviewDouglas Viroel proposed openstack/watcher-tempest-plugin master: Add a check for instances in the compute model  https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/94230814:23
opendevreviewDouglas Viroel proposed openstack/watcher-tempest-plugin master: Add support for prometheus datasource in scenario tests  https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/94214115:14
opendevreviewDouglas Viroel proposed openstack/watcher master: Enable prometheus datasource in watcher-prometheus-integration job  https://review.opendev.org/c/openstack/watcher/+/94215016:32
opendevreviewMerged openstack/puppet-watcher master: Deprecate support for [oslo_messaging_rabbit] heartbeat_in_pthread  https://review.opendev.org/c/openstack/puppet-watcher/+/94204516:33
*** haleyb is now known as haleyb|out22:49
