12:03:21 <amoralej> #startmeeting Watcher meeting - 2025-01-16
12:03:21 <opendevmeet> Meeting started Thu Jan 16 12:03:21 2025 UTC and is due to finish in 60 minutes.  The chair is amoralej. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:03:21 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
12:03:21 <opendevmeet> The meeting name has been set to 'watcher_meeting___2025_01_16'
12:03:48 <amoralej> please, add your topics to https://etherpad.opendev.org/p/openstack-watcher-irc-meeting
12:04:42 <marios> o/
12:04:46 <amoralej> #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting meeting agenda
12:04:53 <amoralej> let's start with the first topic
12:05:11 <amoralej> #topic (rlandy): with Martin Kopec changing roles, we will need new cores for watcher-tempest-plugin
12:05:45 <amoralej> rlandy, you want to introduce the topic?
12:06:02 <rlandy> martin has switched roles
12:06:28 <rlandy> as such we will need to propose other cores for watcher-tempest-plugin
12:06:38 <rlandy> in time
12:06:44 <rlandy> this is just a team fyi
12:07:10 <marios> i think we have a general problem with lack of cores (basically we now have just one active core sean-k-mooney )
12:07:30 <marios> in the tempest-plugin case, until now we had martin as well (it is the exception) so here we are also down to one core
12:07:35 <marios> fetching the group for reference...
12:08:01 <sean-k-mooney> well realistically until late december we didn't have any active cores
12:08:18 <sean-k-mooney> so martin has only been in the list for a few weeks and they were on pto for part of that
12:09:00 <sean-k-mooney> i added them on the 28th of november
12:09:12 <marios> #info https://review.opendev.org/admin/groups/09a91d8e24af9ce44b80062c4851a1d2fa3d4d14 watcher-tempest-core gerrit group
12:09:44 <marios> #info https://review.opendev.org/admin/groups/09a91d8e24af9ce44b80062c4851a1d2fa3d4d14,members watcher-tempest-core members
12:09:52 <sean-k-mooney> so i was going to propose that we do a review of core membership the first week of february
12:10:04 <rlandy> +1
12:10:09 <dviroel> +1
12:10:38 <marios> sounds good i think we have discussed doing this around this timeframe before (in the context of things not being able to merge in watcher and dashboard repos)
12:10:45 <sean-k-mooney> my plan was to review the review stats in https://openstack.biterg.io/app/dashboards and propose a set of potential cores for each of the watcher groups to the mailing list
12:11:23 <sean-k-mooney> if we want i can try and prepare that email before the next meeting
12:11:45 <sean-k-mooney> and then we can wait for feedback and discuss in the next meeting
12:12:09 <amoralej> #agreed we will do a review of core membership the first week of february
12:12:09 <sean-k-mooney> if there are no objects by the meeting after that on jan 30th
12:12:14 <sean-k-mooney> i can implement the changes
12:12:20 <sean-k-mooney> *objections
12:12:22 <amoralej> i guess that's a good plan
12:13:08 <amoralej> so, we can move to next topic?
12:14:40 <amoralej> i will open next one
12:14:47 <amoralej> #topic (marios): update on prometheus datasource
12:14:56 <marios> thanks amoralej
12:14:59 <amoralej> #link https://review.opendev.org/c/openstack/watcher/+/934423
12:15:36 <marios> as discussed last week there were some requested changes around the auth options, making the fqdn_instance_map more like a cache with a rebuild & retry at least once
12:15:57 <marios> those and some other smaller bits were implemented now (including removing the 'prometheus_' prefix on the config options)
12:16:36 <marios> there have been no further comments or requests yet and we have a +2 from sean-k-mooney and +1 from various other folks who have requested changes
12:16:48 <marios> thanks again to everyone for all your suggestions and improvements.
12:17:10 <marios> since we have the core issue, I would propose that if there are no negative comments by end of next week we can merge it?
12:17:17 * marios checks date on the patch
12:17:27 <marios> (I discussed that bit about merge with sean-k-mooney privately already)
12:17:47 <sean-k-mooney> yep
12:17:59 <marios> yeah so i updated that jan 10th
12:18:00 <amoralej> +1 to merge it asap
12:18:09 <sean-k-mooney> so for single core approval i want to 1 leave time for others to review, ideally at least 2 weeks
12:18:10 <marios> i'd say tomorrow is 1 week, so next friday sounds good sean-k-mooney ?
12:18:22 <sean-k-mooney> 2 see reviews from non cores with no objections
12:18:36 <amoralej> btw, i started some work to integrate that in a deployment tool and I'm already relying on the config options set in latest PS :)
12:18:40 <sean-k-mooney> and 3 address it by building out the core team so it is not required long term
12:19:56 <sean-k-mooney> @marios yes the end of next week was what i had in mind so either after the next team meeting or friday
12:20:27 <marios> #info planning to merge https://review.opendev.org/c/openstack/watcher/+/934423 by next friday 24 unless there are objections
12:21:01 <marios> so amoralej has already started iterating with the instance work on top
12:21:18 <marios> i wouldn't want it slipping further into february to merge i mean
12:21:35 <marios> thanks, that's all i had on this topic amoralej if there are no further comments we can move on?
12:21:47 <amoralej> actually my instance work is next topic :)
12:22:12 <amoralej> #topic (amoralej) add instance metrics into prometheus datasource
12:22:25 <amoralej> #link https://review.opendev.org/c/openstack/watcher/+/938893/
12:22:30 <amoralej> this is mainly a call for review
12:22:57 <amoralej> Given that the merge of previous one is approaching I'd like to also get this one reviewed when you have a chance
12:23:11 <marios> i think its already looking good amoralej thanks for jumping on that
12:24:29 <amoralej> it is much simpler than the one adding the datasource so i hope it will be faster to review
12:25:43 <sean-k-mooney> the main thing that i think is missing (and could be in a follow up patch)
12:26:04 <sean-k-mooney> is i would like us to also extend the new tempest job to start testing with the new datasource
12:26:38 <sean-k-mooney> that does require work in the tempest plugin but we should at least enable/configure the new data source sooner rather than later once it's merged
12:27:02 <marios> yes big +1
12:27:21 <marios> amoralej: has been testing on his env with the datasource but we should get into ci asap
12:27:23 <amoralej> yes, +1 for me too
12:27:45 <sean-k-mooney> im ok to defer the decision on whether we start doing that in https://review.opendev.org/c/openstack/watcher/+/938893 or a follow up patch
12:27:48 <amoralej> as you said, I'd propose to make that follow up patch
12:27:58 <sean-k-mooney> but it would be nice to work on that before we merge it
12:28:00 <marios> yeah i think it can/should be different patch
12:28:03 <rlandy> chandankumar is close to being able to add prometheus metrics in the plugin
12:28:05 <sean-k-mooney> ack
12:28:15 <rlandy> so we should be able to extend the test shortly
12:28:40 <amoralej> for me it is a matter of time, if we can add it soon, no problem in including it into the patch
12:28:47 <sean-k-mooney> ok we can defer this to gerrit review and see how the various efforts come together
12:28:58 <sean-k-mooney> i think we are generally in agreement on the direction
12:29:14 <dviroel> yes
12:30:23 <amoralej> ok
12:30:27 <amoralej> so that's it for this one
12:30:31 <amoralej> moving to next one
12:30:55 <amoralej> #topic (rlandy) reminder bug triage continuation on Tuesday, January 21, 2025 12pm UTC (gmeet and IRC as sent on the ML)
12:31:04 <amoralej> i guess this is just a reminder
12:31:48 <amoralej> #info next bug triage session is on Tuesday, January 21, 2025 12pm UTC, details for anyone interested are in the mailing list
12:31:51 <marios> rlandy: did a great job running that but... do you want to do it again or would you like us to rotate?
12:32:37 <amoralej> #link https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/message/5KZUHLXUOGTBXMQ4HERO52XB7I5A3HXI/
12:33:20 <marios> rlandy: do you want us to rotate the chair on that? ^^
12:34:01 <rlandy> either way
12:34:27 <rlandy> I can finish off this one - and then next time we do it, someone else can take it
12:34:39 <rlandy> ie: I'll take this Tuesday's
12:34:48 <marios> sounds good thank you rlandy
12:34:59 <marios> we should be able to get through a fair chunk after next meeting
12:35:06 <marios> we had some overhead on the first one ;)
12:35:13 <amoralej> btw, i found it a great exercise to keep learning about the status of watcher
12:35:42 <amoralej> yep, we will go faster next time :)
12:36:49 <amoralej> #topic next chair
12:37:06 <amoralej> any volunteer to chair next meeting? i don't want to forget this time
12:37:08 <amoralej> :)
12:37:20 <marios> i didn't do this for a while, but if there is someone who wants to/didn't go yet i will not fight you for it
12:38:15 <rlandy> looks like it's yours marios
12:38:21 <marios> yup
12:38:30 <amoralej> #action marios will chair next meeting
12:38:35 <amoralej> #topic open floor
12:39:07 <amoralej> apart from the ones in the agenda, is there some other topic you'd like to discuss ?
12:39:22 <sean-k-mooney> i have one minor update
12:39:28 <sean-k-mooney> not in the agenda
12:39:31 * dviroel proposing myself to chair the meeting on the 30th
12:39:39 <amoralej> thanks dviroel
12:39:48 <amoralej> ok, sean-k-mooney go ahead please
12:40:13 <sean-k-mooney> #link https://bugs.launchpad.net/watcher/+bug/2086710
12:40:25 <sean-k-mooney> i have been looking into ^ on and off for a while
12:40:35 <amoralej> that's an important one ...
12:40:45 <sean-k-mooney> chatting to jayF yesterday i took a look at eventlet
12:40:58 <sean-k-mooney> and found that the behavior was changed in
12:41:02 <sean-k-mooney> #link https://github.com/eventlet/eventlet/pull/932
12:41:02 <marios> i think this randomly fails in the -strategies job right ?
12:41:13 <sean-k-mooney> so i have filed
12:41:17 <sean-k-mooney> #link https://github.com/eventlet/eventlet/issues/1014
12:41:28 <sean-k-mooney> marios: no this is not related to -strategies
12:41:37 <sean-k-mooney> that is a bug in the tempest plugin
12:42:05 <sean-k-mooney> the -strategies job failure is
12:42:09 <sean-k-mooney> #link https://bugs.launchpad.net/watcher-tempest-plugin/+bug/2090854
12:42:40 <marios> thank you for looking at that
12:42:52 <amoralej> sean-k-mooney, so we need to wait on the eventlet patch, right? no fix from watcher side ?
12:42:55 <sean-k-mooney> anyway my update is that the runtime errors were previously asserts in eventlet prior to eventlet 0.36.0
12:43:21 <sean-k-mooney> at least one of the new exceptions seems to have been incorrect to raise
12:43:35 <sean-k-mooney> the other exception may be valid and may be a sqlalchemy 2.0 issue
12:44:11 <sean-k-mooney> so right now its not clear if we will have to fix anything more in watcher for this or if we will need to address this in eventlet/sqlalchemy
12:44:35 <sean-k-mooney> ill continue to follow this and update folks but that was what i wanted to highlight
12:45:12 <amoralej> freeze for non-client libs is Feb 17 - Feb 21, i hope it arrives on time
12:45:31 <amoralej> we probably can manage it as some kind of exception otherwise
12:46:12 <sean-k-mooney> so its also unclear why we are seeing different behavior on 3.9 vs 3.12
12:46:28 <sean-k-mooney> but yes we will have to see how we proceed if we do not have a resolution by then
12:47:05 <sean-k-mooney> ill also note that i would have expected this to also impact other services like nova
12:47:19 <sean-k-mooney> the fact it's not implies there is some watcher specific context to this
12:47:31 <sean-k-mooney> which is why im going to continue to look into this in parallel
12:48:04 <amoralej> as you said before, probably the specific thing is using APScheduler, right?
12:48:15 <sean-k-mooney> yes
12:48:25 <sean-k-mooney> i did a review of which projects use it in openstack
12:48:28 <sean-k-mooney> there are 5
12:48:34 <sean-k-mooney> 3 are dead
12:48:49 <sean-k-mooney> zuul does not use eventlet
12:49:09 <sean-k-mooney> so watcher is the only "active" project with eventlet + apscheduler
12:49:32 <sean-k-mooney> i also think it required the sqlalchemy datastore to be used with apscheduler too
12:49:51 <sean-k-mooney> im planning to try and create a smaller reproducer with that combination today
12:49:56 <amoralej> so, it'd make sense that the issue is in the combination... this is a hard one sean-k-mooney , thanks for investigating it
12:50:33 <sean-k-mooney> a possible PTG topic might be should we remove/replace the usage of apscheduler entirely
12:51:07 <sean-k-mooney> that's not something i think we can do this cycle however. removing eventlet would also be an option but again too large to consider for 2025.1
12:51:53 <sean-k-mooney> as a community it is something we should evaluate however, as long term apscheduler is working on a 4.0 release that is not backwards compatible
12:52:00 <amoralej> we probably can follow what other active projects with similar usecase are doing
12:52:13 <sean-k-mooney> so we would have to do a large migration in the next cycle or two anyway from 3.x
12:52:47 <amoralej> what are the alternatives to apscheduler ?
12:53:04 <amoralej> which projects may have a similar usecase ?
12:53:13 <sean-k-mooney> :) good question. today we are using it for two things
12:53:23 <sean-k-mooney> 1 running periodic tasks that don't need to use it
12:53:30 <sean-k-mooney> and the continuous audit
12:53:42 <sean-k-mooney> the first usecase is easy to remove
12:53:49 <sean-k-mooney> the second is why it was added in the first place
12:53:54 <amoralej> yeah
12:54:13 <sean-k-mooney> there is also a scalability element
12:54:33 <sean-k-mooney> currently its not clear if we can horizontally scale watcher
12:54:48 <sean-k-mooney> i.e. to distribute the continuous audits between daemons
12:55:00 <sean-k-mooney> anyway i think we can take this out of the meeting
12:55:07 <amoralej> right
12:55:22 <amoralej> but it is an interesting conversation to have, out of the mtg :)
12:55:30 <amoralej> also, we only have 5 more minutes
12:55:47 <amoralej> anything else you want to add before closing ?
12:57:52 <amoralej> I am closing then
12:58:01 <amoralej> Thanks all for joining!
12:58:06 <rlandy> thank you amoralej
12:58:12 <jgilaber> thanks!
12:58:15 <marios> thanks all and thanks for running amoralej o/
12:58:23 <amoralej> #endmeeting