12:01:19 #startmeeting Watcher meeting - 2025-01-09
12:01:19 Meeting started Thu Jan 9 12:01:19 2025 UTC and is due to finish in 60 minutes. The chair is amoralej. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:01:19 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
12:01:19 The meeting name has been set to 'watcher_meeting___2025_01_09'
12:01:33 \o/
12:01:38 o/
12:01:45 o/
12:01:47 hello o/
12:01:55 o/
12:02:06 hi o/
12:02:12 #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting meeting agenda
12:02:27 please add your topics to the list if you have something else
12:02:30 o/
12:02:59 let's start with the agenda items for today
12:03:21 #topic (marios) let's revisit the backports discussion prompted by https://review.opendev.org/c/openstack/watcher/+/937823/comment/eb0827ce_db971940/
12:03:47 o/ hello, so there has been some discussion on gerrit already in response to jneo8 patches but...
12:04:06 o/ (apologies for joining late)
12:04:25 tl;dr no objection to backports but they will be taken on a case by case basis. some of the proposed patches need to go on 2024.2 first and some of them need discussion because the proposed backports are still ongoing work in this cycle
12:04:54 the topic of backports has come up in the past and we weren't sure if folks were interested since the project had very low volume in the last few cycles
12:05:20 so it seems jneo8 is interested in deploying 2024.1 and will request a point release once the required backports are agreed and merged
12:05:39 k... that's the intro :) anyone have something to add to this topic? comments, concerns?
12:06:37 one comment
12:06:56 the backports seem to be motivated by supporting python 3.12
12:07:15 Yes
12:07:15 even if we backport the required patches to make it work it won't add official support to older branches
12:07:33 the first release to support python 3.12 upstream will be 2025.1
12:07:49 on older releases it will be considered experimental
12:07:53 I see, but when will 2025.1 be released?
12:08:14 in about 6-8 weeks
12:08:16 https://releases.openstack.org/epoxy/schedule.html
12:08:28 for what it's worth master does not fully work on 3.12 yet
12:08:45 the initial patches i wrote to try and fix the eventlet issues didn't actually fix it properly
12:09:03 we still see blocking calls on the main loop
12:09:15 so we still have work to do to properly support it
12:10:07 I think making the master branch work properly makes sense.
12:11:24 so, if the goal is to have it running on python 3.12, should we wait at least until master is working before moving on with any backport?
12:11:36 any backport related to 3.12, i mean
12:11:45 well i think maybe the backports will no longer be required
12:11:59 if the target is 2025.1
12:12:08 yep, likely by then the best option may be to move to 2025.1 ...
12:12:09 So it looks like the backports should be discussed after the master branch supports python 3.12 properly, it may be too early to do it now.
12:12:36 we can backport some of the fixes as they are unrelated to 3.12
12:12:57 but i think we should wait for the master support to be finalised before the 3.12 specific ones are considered
12:13:31 I'd suggest setting the ones related to 3.12 as WIP or in some way that makes it clear which ones are waiting for 3.12 support
12:13:33 I see, that makes sense.
12:14:16 https://review.opendev.org/c/openstack/watcher/+/938429 specifically is the one that should wait
12:14:25 or even abandon them and restore later, whatever the owner prefers
12:14:50 I can abandon them first.
12:16:21 wrt proposing to 2024.2, i understand that's discussed in the reviews, no need to discuss here?
12:17:20 yeah i think so... some of the backports which are otherwise good to go (not related to 3.12 for example) are blocked because they need to be proposed to 2024.2 first
12:17:30 right, the summary is that stable policy does not allow skipping branches
12:17:41 yep, that's important
12:17:45 ack
12:17:54 ack
12:18:35 one final comment: https://review.opendev.org/c/openstack/watcher/+/938435 can't be backported because it increases the required version of oslo.utils
12:19:10 it's not needed however as datetime.utcnow() is only deprecated not removed in 3.12
12:19:20 again that's captured in the review
12:19:41 yes, and oslo.utils-7.0.0 is caracal release
12:20:32 actually, caracal is 2024.1 so https://review.opendev.org/c/openstack/watcher/+/938435/1/requirements.txt may be backportable to 2024.1
12:20:47 if that value is correct
12:20:52 it's not
12:21:00 without this change we support 3.36
12:21:12 https://github.com/openstack/requirements/blob/stable/2024.1/upper-constraints.txt#L600
12:21:13 we are not allowed to raise the min version in a backport
12:21:18 even if it's released
12:21:25 ah, didn't know that!
12:21:46 i thought it was possible as soon as it was part of the release ...
12:21:47 ok
12:21:55 good to know
12:22:02 the exception is for security reasons
12:22:11 but that does not apply here
12:22:16 yep
12:22:39 anything else about backports or can we move to the next topic?
12:22:49 not from me :)
12:23:08 not from me. And thanks for the input!!
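On the 12:19:10 point that datetime.utcnow() is deprecated but not removed in Python 3.12: a minimal stdlib-only sketch of how the deprecated call can be replaced without raising any minimum library version. The helper names utcnow_naive and utcnow_aware are hypothetical, and this is not necessarily what the review under discussion does.

```python
import datetime


def utcnow_aware():
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.datetime.now(datetime.timezone.utc)


def utcnow_naive():
    """Return a naive UTC datetime, matching what utcnow() used to return.

    Hypothetical helper: useful where existing code compares against
    naive timestamps and cannot yet switch to aware datetimes.
    """
    return utcnow_aware().replace(tzinfo=None)


if __name__ == "__main__":
    print(utcnow_aware().isoformat())
    print(utcnow_naive().isoformat())
```

Dropping tzinfo keeps comparisons with existing naive timestamps working, at the cost of carrying no timezone information; new code is usually better served by the aware variant.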
12:23:16 good, let's move on then
12:23:27 #topic (amoralej) call for a triage session to triage existing bugs in launchpad
12:24:02 We have been discussing how to proceed with the existing bugs reported in launchpad for watcher
12:24:13 #link https://bugs.launchpad.net/watcher
12:24:44 some of them have been there for some time and have not been triaged, some were triaged a long time ago and may be worth revisiting
12:25:24 43 is not an impossible number (but still some effort to try and understand and triage these)
12:25:33 so, the proposal is to schedule a triage session to work together and coordinate on irc so that we can see what can be closed and what should be prioritized
12:26:17 wdyt, any other proposal?
12:26:26 yep, i'd say 43 is doable
12:26:26 we might want to have a google meet as well
12:26:44 amoralej: there are also these: https://bugs.launchpad.net/watcher-dashboard
12:26:45 isn't there a foundation approved one, jitsi?
12:26:53 we can use any tool
12:26:56 right
12:27:03 #link https://bugs.launchpad.net/watcher-dashboard
12:27:09 thanks for the reminder rlandy
12:27:23 my point was more we may need higher bandwidth
12:27:33 we can start with irc
12:27:45 and perhaps have a second pass with a higher bandwidth medium after
12:27:46 ack yes i think a call will be easier to coordinate, we can get through more bugs
12:27:52 +1
12:27:57 from community pov, is it fine to use gmeeting? i had no idea how to create a meeting with opendev jitsi, but it'd be good
12:28:13 i guess as long as anyone interested can access the call ... ?
12:28:25 yes we use it from time to time for nova
12:28:35 the important thing is to make it open to all
12:28:54 good, then I'd say google meet is the easier path ...
12:28:54 we have also used other tools in the past but mostly we try to be irc first
12:29:15 I think one gmeet will be enough to set the process in motion
12:29:19 irc after that
12:30:03 gmeet in first call could be good for coordination
12:30:23 and I think it'd be good to schedule it asap, maybe next week ?
12:30:50 maybe this time slot on tuesday?
12:31:01 wfm
12:31:33 +1
12:31:35 sure we can make that work
12:32:31 so we can give that slot as agreed? anyone wants to propose a different one?
12:33:03 +1 from me
12:33:21 we should announce the meeting on the mailing list
12:33:27 #agreed we will have a Watcher triage session on tuesday 14th at 12:00 UTC
12:33:35 and provide a link for anyone who wants to join
12:33:40 yes, we will also announce in mailing list
12:33:41 (we can coordinate in this irc channel)
12:34:20 wrt irc only vs gmeet (+irc) vs other
12:35:52 i understood there is agreement on gmeet ?
12:36:10 if someone cannot join they should be able to reach us somewhere
12:36:15 is what i was thinking of
12:36:27 #agreed google meet + irc will be used for coordination
12:36:33 but yeah the meeting will be gmeet
12:37:11 so, i think we are done with this topic
12:38:09 next topic in the agenda is from jneo8 about support for 3.12, was it covered in the previous one or is there something else you want to discuss?
12:39:05 perhaps a brief comment on the current state of things
12:39:07 No, I think it's been covered.
12:39:25 on master we see eventlet related issues that only happen on 3.12
12:39:36 out of curiosity, do other openstack projects work fine (experimentally) with 3.12 and 2024.1 branch?
12:39:41 specifically on ubuntu noble
12:39:50 yes
12:39:52 for the most part
12:40:06 the issue with watcher is it is mixing 3 concurrency models at once
12:40:28 it's using native threads via APScheduler + eventlet + asyncio
12:40:34 I think magnum has issues as well with 3.12 and 2024.1 branch
12:40:51 ya not all projects work with 3.12 in 2024.1
12:41:10 ack
12:41:16 it was only added to the testing runtime as experimental in 2024.2 and required this cycle for 2025.1
12:41:53 almost none of openstack uses APScheduler and it's the interaction between that and eventlet and python 3.12 that is broken
12:42:09 which is why nova/neutron/glance are not affected in the same way
12:42:19 i expected that it would be in worse situation, tbh
12:42:55 so i think it's clear enough and we can move to next topic or we will run out of time
12:43:36 #topic (marios): update on prometheus datasource https://review.opendev.org/c/openstack/watcher/+/934423
12:44:12 thanks, so some summary of the current state. i'd still like to merge this 'real soon now'.
12:44:32 there have been some review requests in the last couple of days around 2 themes that i am working on for v30 coming today or tomorrow
12:44:57 one theme is to add a retry when we can't resolve the prometheus exporter hostname in the internal fqdn_instance_map
12:45:10 retry means rebuild the instance maps and retry the query before giving up
12:45:34 the other theme is around the client config options naming and being consistent with other projects
12:45:57 so removing the prometheus_ prefix and using the 'standard' name for the tls options
12:46:28 sean-k-mooney: my plan is to implement the naming changes but push the oslo config validation out as future work, agree?
12:46:42 not decided if max_min inversion will be included in v30 or also pushed
12:46:52 but agree on the client opts, let's get those right before merge
12:46:58 (also splitting the host and port)
12:47:23 so any comments or questions on this or any other topic related to this patch ?
12:48:01 ok well moving the min_max around is purely internal
12:48:07 right
12:48:08 so that can be a follow up
12:48:48 i think we can proceed with what you plan to push up soon
12:49:08 so for now let's proceed with the split you're proposing and we can take it from there
12:49:19 ack
12:50:20 we are done with the topic?
12:50:28 from my side yes
12:51:09 so, i think we can move to the last one
12:51:28 #topic (hemanth): noisy neighbour strategy not working as cpu_l3_cache metric is not collected
12:51:32 #link https://bugs.launchpad.net/ceilometer/+bug/2081128
12:51:59 hemanth, you want to introduce the topic ?
12:52:21 This is more of a question if anyone is using the noisy neighbour strategy, if so how?
12:52:44 The strategy relies on metric cpu_l3_cache which is not collected in ceilometer/gnocchi
12:52:50 we probably should deprecate it in the current form
12:53:08 if i recall correctly we removed the functionality related to this from libvirt too
12:53:14 or rather nova
12:53:42 yes
12:54:03 i guess it might be possible that the stats were collected some other way to feed them into gnocchi
12:54:08 cpu_l3_cache is still in the ceilometer documentation https://docs.openstack.org/ceilometer/latest/admin/telemetry-measurements.html#openstack-compute i guess that should be considered a bug?
12:54:49 i mean, a documentation bug :)
12:54:52 likely yes
12:55:28 Merged openstack/watcher stable/2024.1: Update .gitreview for stable/2024.1 https://review.opendev.org/c/openstack/watcher/+/913343
12:55:29 Merged openstack/watcher stable/2024.1: Update TOX_CONSTRAINTS_FILE for stable/2024.1 https://review.opendev.org/c/openstack/watcher/+/913344
12:56:34 i think we should likely mark the strategy as deprecated and reach out on the mailing list to see if anyone is using it
12:56:39 we can also ask at the ptg
12:56:53 ok sure
12:56:57 if there is no feedback from users we can look to remove it or reimplement it next cycle
12:57:07 ack
12:57:56 and maybe report it as a bug to ceilometer too, to check if they can get the metric in some other way or remove it from doc
12:58:29 for what it's worth there are other factors that can be used to detect noisy neighbors, l3 cache usage is not really a good metric for this IMO
12:58:43 i think that's off topic however
12:58:52 so, will we wait for the PTG before marking it as deprecated, or is it something we can initiate first ?
12:59:08 yep, we are almost out of time
12:59:17 no i think we should mark it deprecated now
12:59:32 but we should not look at removing it until we have a wider discussion
12:59:37 yep
12:59:53 per the SLURP policy
12:59:54 I can submit a PR to deprecate early next week
13:00:09 thanks hemanth, that'd be great
13:00:11 we are not allowed to remove features without advertising the deprecation in a SLURP release
13:00:43 so, unless there is some last minute topic, i'm closing the meeting
13:01:21 then thanks all for participating! see you on tuesday!
13:01:29 o/
13:01:31 thank you amoralej \o
13:01:32 #endmeeting
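
The retry behaviour described at 12:44:57 and 12:45:10 (rebuild the instance maps, then retry once before giving up) can be sketched roughly as below. Only the name fqdn_instance_map comes from the discussion; the class, the HostNotFound exception, the discover_targets callable and the shape of the map are assumptions made for illustration, and this is not the code in the patch under review.

```python
class HostNotFound(Exception):
    """Hypothetical error: the hostname is not in the instance map."""


class PrometheusHelperSketch:
    """Illustrative stand-in only, not Watcher's actual helper class."""

    def __init__(self, discover_targets):
        # discover_targets is an assumed callable returning a mapping of
        # {fqdn: "host:port"} for the exporters Prometheus currently knows.
        self._discover_targets = discover_targets
        self.fqdn_instance_map = {}
        self.rebuilds = 0
        self._rebuild_maps()

    def _rebuild_maps(self):
        # Take a fresh snapshot of the known targets.
        self.fqdn_instance_map = dict(self._discover_targets())
        self.rebuilds += 1

    def resolve_instance(self, fqdn):
        # First attempt: use the cached map.
        instance = self.fqdn_instance_map.get(fqdn)
        if instance is None:
            # Cache miss: rebuild the map once and retry before giving up.
            self._rebuild_maps()
            instance = self.fqdn_instance_map.get(fqdn)
        if instance is None:
            raise HostNotFound(fqdn)
        return instance


# Usage: a host that appears after the map was first built is still found.
targets = {"compute-0.example.com": "10.0.0.5:9100"}
helper = PrometheusHelperSketch(lambda: targets)
targets["compute-1.example.com"] = "10.0.0.6:9100"
resolved = helper.resolve_instance("compute-1.example.com")
```

Limiting the retry to a single rebuild keeps a genuinely unknown host from triggering repeated discovery calls on every query.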