amoralej | dviroel, microversion testing worked fine in https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/955775 it ran it in the master job and skipped it in the stable ones even without https://review.opendev.org/c/openstack/watcher/+/956380 | 06:42 |
---|---|---|
dviroel | amoralej: nice | 10:54 |
dviroel | the api tests don't have a min required microversion, so they all run in stable branches | 10:55 |
dviroel | most of the scenario tests have a min required of 1.3, so they should be skipped in stable branches if we don't backport the devstack change | 10:55 |
amoralej | but the one i'm adding in https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/955775/4/watcher_tempest_plugin/tests/api/admin/test_action.py is being properly skipped | 10:55 |
amoralej | i meant | 10:56 |
amoralej | in stable branches jobs | 10:56 |
dviroel | yep | 10:57 |
dviroel | "setUpClass (watcher_tempest_plugin.tests.api.admin.test_action.TestPatchAction) ... SKIPPED: The microversion range[1.5 - latest] of this test is out of the configuration range[None - None]" | 10:57 |
dviroel | but after backporting the devstack change, we expect: | 10:58 |
dviroel | setUpClass (watcher_tempest_plugin.tests.api.admin.test_action.TestPatchAction) ... SKIPPED: The microversion range[1.5 - latest] of this test is out of the configuration range[1.0 - 1.4]. | 10:58 |
dviroel | if we configure one of the api tests with min_microversion 1.3, for instance, it will also be skipped, which is not correct, but the api tests don't have any | 11:00 |
amoralej | ah, so now it works because it has latest as default | 11:06 |
amoralej | ack, thanks | 11:06 |
amoralej | it works smoothly, anyway, nice work | 11:06 |
dviroel | ++ | 11:13 |
dviroel | hi folks, a reminder that watcher meeting starts in 10 minutes, here in this channel | 11:50 |
dviroel | please add your topics to the meeting agenda: https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L40 | 11:50 |
dviroel | #startmeeting watcher | 12:00 |
opendevmeet | Meeting started Thu Aug 7 12:00:19 2025 UTC and is due to finish in 60 minutes. The chair is dviroel. Information about MeetBot at http://wiki.debian.org/MeetBot. | 12:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 12:00 |
opendevmeet | The meeting name has been set to 'watcher' | 12:00 |
dviroel | who is around today? | 12:00 |
amoralej | o/ | 12:00 |
morenod | o/ | 12:01 |
dviroel | courtesy ping: jgilaber sean-k-mooney chandankumar rlandy | 12:02 |
rlandy | o/ | 12:02 |
dviroel | alright, let's start with today's meeting agenda | 12:02 |
dviroel | #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L40 (Meeting agenda) | 12:02 |
dviroel | feel free to add your own topics to the agenda | 12:02 |
dviroel | #topic Announcements | 12:03 |
dviroel | Flamingo release schedule | 12:03 |
dviroel | #link https://releases.openstack.org/flamingo/schedule.html | 12:03 |
dviroel | adding it here just to mention that we are 3 weeks from feature freeze | 12:04 |
dviroel | and reminder that no features should be merged after this milestone | 12:05 |
dviroel | so please add your changes to the review list in our meeting agenda | 12:05 |
dviroel | today I added a section in the etherpad | 12:05 |
sean-k-mooney | o/ | 12:05 |
dviroel | to help us track the main topic that we plan to review | 12:05 |
dviroel | #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L18 | 12:06 |
dviroel | i added 3 proposed features by their topics | 12:06 |
dviroel | feel free to comment, add status updates and so on | 12:07 |
dviroel | sorry jwysogla for missing the Aetos topic :( | 12:07 |
dviroel | thanks for adding | 12:08 |
sean-k-mooney | i guess we never created a watcher status etherpad | 12:08 |
dviroel | sean-k-mooney: yeah, we may create one, to include everything, bugfixes, features, etc.. | 12:08 |
sean-k-mooney | well we can use the meeting one but this is what nova does https://etherpad.opendev.org/p/nova-2025.2-status | 12:08 |
sean-k-mooney | i thought we did this last cycle but maybe not | 12:09 |
sean-k-mooney | im ok using the irc etherpad for now | 12:09 |
sean-k-mooney | but it's nice to have a per cycle one as well | 12:09 |
sean-k-mooney | less noise and it's useful for cycle highlights and the release note prelude | 12:09 |
dviroel | yes it is, since the amount of patches is also growing | 12:10 |
dviroel | ok, any other announcement? | 12:10 |
sean-k-mooney | let's create one async and start using it for the remainder of the cycle and put a link at the top of the irc etherpad | 12:11 |
dviroel | sean-k-mooney: +1 | 12:11 |
dviroel | i can start one after the meeting | 12:11 |
rlandy | +1 | 12:11 |
dviroel | #action dviroel to start a watcher status etherpad and link it in the meeting etherpad | 12:12 |
dviroel | ok so, next topic in the list | 12:13 |
dviroel | (dviroel): Eventlet Removal Updates | 12:13 |
dviroel | #topic Eventlet Removal Updates | 12:13 |
dviroel | #link https://etherpad.opendev.org/p/watcher-eventlet-removal (watcher eventlet removal etherpad) | 12:14 |
dviroel | there is good news | 12:14 |
dviroel | the patch that extends the decision engine to support threading mode merged | 12:14 |
dviroel | #link https://review.opendev.org/c/openstack/watcher/+/952257 (Extend decision engine to support threading mode) | 12:14 |
dviroel | ty all for the reviews | 12:15 |
dviroel | note that this patch added a new voting job | 12:15 |
dviroel | watcher-prometheus-integration-threading runs with decision-engine in threading mode | 12:15 |
dviroel | we decided to keep it voting, since it was stable enough | 12:16 |
amoralej | cool | 12:16 |
dviroel | let us know if your patch starts hitting any issue with that specific job, and we can help with debugging, or just make it non-voting if required | 12:16 |
dviroel | there is another job proposed in | 12:16 |
dviroel | #link https://review.opendev.org/c/openstack/watcher/+/955097 (Add a new tox environment to run unit tests in threading mode) | 12:17 |
dviroel | that adds a new tox environment to run unit tests in threading mode too | 12:17 |
dviroel | which should be merging soon too, i think | 12:17 |
sean-k-mooney | i left that unapproved to allow others to review | 12:17 |
dviroel | ++ | 12:17 |
sean-k-mooney | but ya likely early next week if there are no objections | 12:18 |
dviroel | from the decision-engine side, I still expect to go through the default number of threads of all the threadpools | 12:18 |
dviroel | and propose an update if needed | 12:18 |
sean-k-mooney | so im not sure if that will be needed | 12:18 |
sean-k-mooney | looking at the eventlet code they were previously limiting the eventlet pool to 4 eventlets i think | 12:19 |
sean-k-mooney | i have not checked all code paths but if that was in fact correct for all pools | 12:19 |
sean-k-mooney | then i don't think there will be an issue using 4 real os threads | 12:19 |
sean-k-mooney | the concern with using real os threads was mainly for projects where that default limit is 10,000 | 12:19 |
dviroel | right, that's the conclusion that I would expect to reach, but it's still an in-progress task on my side | 12:19 |
sean-k-mooney | which i think it was in nova | 12:20 |
sean-k-mooney | ack | 12:20 |
dviroel | we may also want to include threadpool stats in debug logs, to help us with debugging | 12:20 |
dviroel | I think that nova is also adding something similar | 12:20 |
sean-k-mooney | ya gibi has added a hook for futurist's stats mechanism | 12:21 |
dviroel | but that is not only for decision-engine, but for all | 12:21 |
sean-k-mooney | so you can likely port that | 12:21 |
dviroel | +1 | 12:21 |
dviroel | next thing would be to continue the work in the applier | 12:21 |
dviroel | I will provide more updates as soon as I have something | 12:21 |
sean-k-mooney | https://review.opendev.org/c/openstack/nova/+/948340 | 12:21 |
dviroel | or the issues that we find along the way | 12:21 |
dviroel | sean-k-mooney: yeah, this one | 12:22 |
sean-k-mooney | the applier will be a little trickier | 12:22 |
sean-k-mooney | so the applier is implementing cancellation of the task by killing the greenthread | 12:22 |
dviroel | yes! | 12:22 |
sean-k-mooney | that is not something you can do with an OS pthread | 12:22 |
sean-k-mooney | so we may have to consider using a process pool but that has other problems | 12:23 |
sean-k-mooney | or we may have to rethink how that works in general | 12:23 |
dviroel | indeed, the applier logic needs to be evaluated | 12:23 |
sean-k-mooney | if we did not have time to do that this cycle you have still made a lot of progress | 12:23 |
sean-k-mooney | dviroel: have we ensured that there is no eventlet usage in the api? | 12:24 |
sean-k-mooney | i know you marked the deprecation of the console script as done | 12:25 |
dviroel | there is usage only in the console script | 12:25 |
sean-k-mooney | but when we run under uwsgi have you confirmed we never use eventlet in the api | 12:25 |
sean-k-mooney | ya the console script is not a problem because we are just going to delete that | 12:25 |
sean-k-mooney | it's the wsgi application that is the main part we need to validate | 12:26 |
dviroel | sean-k-mooney: only digging through the code, and I didn't find anything | 12:26 |
sean-k-mooney | ack | 12:26 |
dviroel | but worth revisiting yeah | 12:27 |
dviroel | thanks for the reminder | 12:27 |
dviroel | any other question for this topic? | 12:28 |
amoralej | you are making great progress on this | 12:28 |
dviroel | moving forward then, since we have things to cover | 12:28 |
* dviroel | thanks amoralej | 12:28 |
dviroel | #topic (dviroel) Fix for bug #2098984 | 12:28 |
dviroel | i was recently hitting this bug, with a continuous audit | 12:29 |
dviroel | #link https://bugs.launchpad.net/watcher/+bug/2098984 (Zone Migration Strategy failing to build a list of instances for migration) | 12:29 |
dviroel | the tl;dr; is | 12:29 |
dviroel | zone migration strategy uses nova and cinder client to retrieve instances and volumes, and then checks if they exist in the compute and storage models | 12:30 |
dviroel | when the instance or volume doesn't exist in the model, it raises an exception, which impacts the audit, setting it to Failure | 12:30 |
dviroel | so the audit doesn't run anymore | 12:31 |
dviroel | there is a patch | 12:31 |
dviroel | #link https://review.opendev.org/c/openstack/watcher/+/956198 | 12:31 |
dviroel | where I try to fix this issue with a simple try/except to ignore the instance/volume | 12:32 |
dviroel | which is the expected behavior in the current code | 12:32 |
dviroel | amoralej raised the question of whether we should instead fix the strategy to only look at the model, and not use the clients | 12:32 |
sean-k-mooney | ya so there are multiple parts to this. 1 your patch is correct and valid, 2 amoralej is also correct that the strategy should be using the data from the model not hitting the api directly | 12:33 |
dviroel | so the idea here is to get feedback on that | 12:33 |
amoralej | it's fine for me to fix in two steps | 12:33 |
sean-k-mooney | i think we will want to fix both issues as separate bugs | 12:33 |
amoralej | wfm | 12:33 |
amoralej | 1st merge the try/except from dviroel and file a new bug with "zone_migration uses nova client instead of the model to retrieve instances" | 12:34 |
dviroel | i also agree that we shouldn't be using these clients directly in the strategy | 12:34 |
sean-k-mooney | well using the client to take actions is obviously fine as part of the application of the action plan. | 12:35 |
sean-k-mooney | in limited cases it may be ok in the decision engine | 12:35 |
amoralej | theoretically, consuming from model should avoid hitting the other issue, but there may be race conditions with model updates, so good to handle exceptions also | 12:35 |
sean-k-mooney | to enrich the objects from the model with extra data | 12:35 |
sean-k-mooney | but we should ideally avoid that | 12:35 |
dviroel | the 2nd fix changes the behavior a little bit | 12:36 |
sean-k-mooney | amoralej: well the race can always happen and we need to be tolerant of that | 12:36 |
sean-k-mooney | which is where the skip feature comes in | 12:36 |
amoralej | yes | 12:36 |
amoralej | but in that case it's between running the audit and executing the action | 12:36 |
dviroel | by getting the list of instances/volumes from the api, we also avoid adding already deleted elements, that are still in the model | 12:36 |
amoralej | while this particular bug is while running the audit, not the action | 12:36 |
dviroel | but yes, this can be mitigated by the migration action, in the pre_ methods | 12:37 |
sean-k-mooney | dviroel: that is correct but we have to handle that the element might be deleted in the applier anyway | 12:37 |
dviroel | ++ | 12:37 |
amoralej | yes | 12:37 |
sean-k-mooney | because between the action plan being created and it being approved it could happen | 12:37 |
dviroel | true | 12:37 |
sean-k-mooney | dviroel: so where i am with this is i think im pretty comfortable backporting your initial patch | 12:37 |
sean-k-mooney | im less sure about changing to using the model | 12:38 |
amoralej | note that when i said to use only the model (unless we can't), i meant in the strategy execution, not in the action execution | 12:38 |
dviroel | so we agree with the current patch, file a new bug to replace the current implementation with model only, and we can discuss if it should be backported afterwards or not | 12:38 |
amoralej | +1 | 12:39 |
sean-k-mooney | that's a bigger change and we need to make sure it does not result in other bugs before considering backporting it. that's the other reason i would prefer 2 patches and 2 bugs | 12:39 |
sean-k-mooney | +1 | 12:39 |
dviroel | #action dviroel to file a new bug explaining the current behavior with zone_migration using nova/cinder clients to get info about resources | 12:40 |
dviroel | #agreed to continue with the fix https://review.opendev.org/c/openstack/watcher/+/956198 | 12:40 |
dviroel | thanks for the feedback on that o/ | 12:41 |
dviroel | moving to next topic then | 12:41 |
dviroel | #topic (amoralej) Update about skipped actions feature | 12:41 |
dviroel | amoralej: want to highlight your latest changes? | 12:41 |
amoralej | #link https://review.opendev.org/q/topic:%22blueprint-add-skip-actions%22 | 12:41 |
amoralej | so, i've reorganized the changes as discussed a couple of meetings back | 12:42 |
amoralej | main change is that all the api related changes are in the last one https://review.opendev.org/c/openstack/watcher/+/955753/ | 12:42 |
amoralej | including status_message visibility and new actions patch | 12:42 |
amoralej | both covered by the same microversion 1.5 | 12:43 |
amoralej | instead of in separate ones, as in my previous patches | 12:43 |
amoralej | other than that, i think the organization is now much clearer | 12:43 |
amoralej | and easier to review | 12:43 |
dviroel | amoralej++ | 12:44 |
dviroel | already started to review them, based on the relation chain | 12:44 |
amoralej | also, I sent a new tempest test https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/955775 which is using the microversion support | 12:44 |
amoralej | that dviroel added | 12:44 |
dviroel | ++ | 12:44 |
amoralej | and worked great | 12:44 |
* dviroel | needs to update its own tempest patch to use that | 12:45 |
amoralej | running in the functional master job, skipped in the functional-<stable> ones | 12:45 |
amoralej | I think we discussed pretty much the implementation details in a previous mtg, so this was just heads-up for reviews | 12:45 |
dviroel | thanks amoralej | 12:46 |
amoralej | the only missing part is the watcherclient and watcher-dashboard one | 12:46 |
amoralej | I'll work on that too | 12:46 |
dviroel | so please folks, review these changes when you have a chance | 12:46 |
dviroel | amoralej: anything else? | 12:47 |
sean-k-mooney | i have approved the parent | 12:47 |
sean-k-mooney | https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/956306/3 so that should merge soon | 12:47 |
amoralej | nothing else from my side | 12:47 |
amoralej | thanks | 12:47 |
dviroel | nice, thanks sean-k-mooney | 12:48 |
dviroel | thanks amoralej | 12:48 |
dviroel | #topic Reviews | 12:48 |
dviroel | again, no specific reviews to mention here | 12:48 |
dviroel | just take a look at the m-3 review list | 12:48 |
sean-k-mooney | jwysogla: probably want to bring up theirs? | 12:48 |
dviroel | see what you can help us to move forward | 12:49 |
sean-k-mooney | i did a first pass on the aetos change yesterday https://review.opendev.org/c/openstack/watcher/+/955608 | 12:49 |
sean-k-mooney | it's in merge conflict because of some other stuff we landed | 12:49 |
sean-k-mooney | but overall it looks pretty reasonable | 12:50 |
dviroel | great, going to take a look soon too | 12:50 |
sean-k-mooney | the main risks here are the timing of releasing the new version of the observability client and getting that into upper constraints | 12:50 |
sean-k-mooney | jwysogla: what are the chances of getting jaunLarriba to approve the release patch and getting that done this week? | 12:51 |
sean-k-mooney | oh they already have https://review.opendev.org/c/openstack/releases/+/956657 | 12:51 |
sean-k-mooney | ok so this is just pending the release team to approve | 12:52 |
dviroel | ok, so we need to follow up on that too | 12:53 |
dviroel | we can add it under its change in the status etherpad | 12:53 |
dviroel | jwysogla: pls provide us a feedback on that, once you are back online | 12:53 |
jwysogla | Sorry, I was in another meeting | 12:54 |
dviroel | sean-k-mooney: thanks for raising this concern | 12:54 |
sean-k-mooney | jwysogla: no worries | 12:54 |
sean-k-mooney | once the release is done a bot will propose an update to the requirements repo to bump it in upper constraints | 12:54 |
jwysogla | I hope the observabilityclient releases soon. Then I can finish up the patch and get the CI passing. I ran all the tests locally and they were working, so I hope there is not much work left. | 12:54 |
sean-k-mooney | until that is also merged the watcher patch will be blocked | 12:55 |
sean-k-mooney | but that does not mean we can't review in advance of that | 12:55 |
jwysogla | yes, thanks | 12:55 |
dviroel | correct | 12:55 |
sean-k-mooney | jwysogla: there are some followups that we may want to discuss after or at the ptg | 12:55 |
jwysogla | I should also mention, that I'll be on PTO last 2 weeks of August, which is quite unfortunate timing. | 12:55 |
jwysogla | But until then, that patch will be my top priority. | 12:56 |
dviroel | ok, this is important to know | 12:57 |
sean-k-mooney | jwysogla: basically next cycle, if we can, i would like to see if we can evolve our jobs away from sg-core to use the new prometheus scrape endpoint. i think that would be a good topic to discuss in the telemetry ptg session or any time before then after we have cut RC1 | 12:57 |
jwysogla | +1 to that | 12:57 |
sean-k-mooney | we also need to discuss the future of gnocchi support in ceilometer | 12:57 |
morenod | +1 | 12:58 |
sean-k-mooney | is it correct that there is a plan to retire the way ceilometer feeds data to gnocchi today? | 12:58 |
jwysogla | Yes, that's also a good topic for the PTG. | 12:58 |
sean-k-mooney | we had a related topic that we want to discuss at the ptg about the future of the gnocchi backend and that would obviously be a factor | 12:59 |
jwysogla | I don't think we have any official "plan". In my personal opinion, gnocchi should go away at some point in the future. But afaik most users of openstack telemetry outside of Red Hat currently use gnocchi, so we can't just deprecate it. | 12:59 |
sean-k-mooney | jwysogla: is this something you can feed back to the rest of the telemetry team (i.e. our interest in discussing this)? | 12:59 |
jwysogla | yes | 13:00 |
sean-k-mooney | gnocchi relies on sg_core today, is that correct? | 13:00 |
sean-k-mooney | is there an intent to deprecate sg_core? | 13:00 |
sean-k-mooney | or the ceilometer part. actually we can leave that to a future time | 13:01 |
sean-k-mooney | we are at the top of the hour | 13:01 |
dviroel | yeah | 13:01 |
jwysogla | sg-core is used purely for Prometheus. I think (I'd need to check to be sure) you can currently use the Prometheus exporter just for compute metrics, so ceilometer-central still needs to go through the notification agent and sg-core | 13:01 |
dviroel | sean-k-mooney: ok to move your topic to the next meeting? | 13:01 |
sean-k-mooney | yep | 13:02 |
dviroel | rlandy: thanks for volunteering to be the next chair | 13:02 |
dviroel | ok, so let's wrap up for today | 13:02 |
dviroel | we will meet again next week | 13:02 |
dviroel | thank you all for participating | 13:02 |
dviroel | #endmeeting | 13:03 |
opendevmeet | Meeting ended Thu Aug 7 13:03:04 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 13:03 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-08-07-12.00.html | 13:03 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-08-07-12.00.txt | 13:03 |
opendevmeet | Log: https://meetings.opendev.org/meetings/watcher/2025/watcher.2025-08-07-12.00.log.html | 13:03 |
rlandy | thanks dviroel | 13:03 |
amoralej_ | thanks dviroel for chairing | 13:03 |
morenod | thanks dviroel | 13:03 |
opendevreview | Merged openstack/watcher-tempest-plugin master: Add support for microversion testing for api and scenario tests https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/956306 | 13:16 |
sean-k-mooney | jwysogla: we are not actually including sg-core in our gnocchi job and efoly mentioned that it has a different way to get the metrics. ceilometer sends the metrics to it directly | 13:24 |
sean-k-mooney | so sg-core is only relevant in the scope of the prometheus and aetos jobs | 13:25 |
jwysogla | correct | 13:25 |
jwysogla | Originally ceilometer was designed to push metrics to storage. But Prometheus is designed to pull them on its own from exporters. So sg-core is basically acting as a buffer, it'll receive metrics pushed by ceilometer and expose them for Prometheus to scrape. | 13:26 |
sean-k-mooney | jwysogla: i assume there is a plan to eventually extend the scrape endpoint work to also work for ceilometer-central? | 13:32 |
sean-k-mooney | watcher in theory should not need to care how the metrics get into aetos/prometheus | 13:33 |
sean-k-mooney | im more interested if we can have fewer dependencies in our jobs than anything else | 13:34 |
jwysogla | I had a short conversation with Juan just now. And sg-core apparently isn't meant to be deprecated any time soon and currently the prometheus exporter approach isn't usable for metrics coming from notifications (I don't know if Watcher uses these for anything). But I don't have all the knowledge in this topic, Juan would be a better person to ask about the Prometheus exporter, so I think this all is a | 13:38 |
jwysogla | pretty good topic for the PTG. | 13:38 |
sean-k-mooney | ack, let's discuss it then so | 13:41 |
sean-k-mooney | i can reach out to juan directly if it becomes relevant before then | 13:41 |
opendevreview | Jaromír Wysoglad proposed openstack/watcher master: Add Aetos datasource https://review.opendev.org/c/openstack/watcher/+/955608 | 15:15 |
opendevreview | Merged openstack/watcher master: Replace dateutils usage with datetime and oslo.utils https://review.opendev.org/c/openstack/watcher/+/955809 | 20:46 |
rlandy | sean-k-mooney: hi wrt https://review.opendev.org/c/openstack/watcher/+/954067 - I checked and didn't find other references to the "Watcher Overload standard deviation algorithm". Where/how do I add the release note to mention the changed strategy reference name? | 21:41 |
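
The microversion skip behaviour discussed at the start of this log (a test is skipped when its microversion range falls outside the configured range) can be illustrated with a minimal sketch. This is not the tempest or watcher-tempest-plugin implementation; `parse` and `should_skip` are hypothetical helper names, and treating an unset configuration bound as version 0.0 is an assumption chosen to reproduce the "range[None - None]" skip quoted in the log.

```python
import math


def parse(version):
    """Turn 'X.Y', 'latest' or None into a comparable (major, minor) tuple.

    Assumption: an unset bound (None) compares as 0.0 and 'latest'
    compares higher than any real version.
    """
    if version is None:
        return (0, 0)
    if version == "latest":
        return (math.inf, math.inf)
    major, minor = version.split(".")
    return (int(major), int(minor))


def should_skip(test_min, test_max, cfg_min, cfg_max):
    """Return True when the test range and the configured range do not overlap."""
    return (parse(test_max) < parse(cfg_min)
            or parse(cfg_max) < parse(test_min))


# Test requiring 1.5+, nothing configured (stable branch without the devstack change).
print(should_skip("1.5", "latest", None, None))
# Same test, stable branch configured with 1.0 - 1.4.
print(should_skip("1.5", "latest", "1.0", "1.4"))
# Same test, master configured with 1.0 - latest.
print(should_skip("1.5", "latest", "1.0", "latest"))
```

Under these assumptions a test with no minimum microversion always runs, while a test with a minimum of 1.3 would be skipped whenever nothing is configured, which is the case dviroel flags as "not correct" for stable branches that do support 1.3.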