| nick | message | time |
|---|---|---|
| opendevreview | chandan kumar proposed openstack/watcher-dashboard master: [poc]Add Playwright-based E2E testing framework https://review.opendev.org/c/openstack/watcher-dashboard/+/970353 | 04:09 |
| opendevreview | Alfredo Moralejo proposed openstack/watcher master: Revert "Update migration notification" https://review.opendev.org/c/openstack/watcher/+/974298 | 09:23 |
| opendevreview | Douglas Viroel proposed openstack/watcher master: Deprecate MAAS integration https://review.opendev.org/c/openstack/watcher/+/974308 | 11:15 |
| opendevreview | Joan Gilabert proposed openstack/watcher master: Add wrapper classes for novaclient objects https://review.opendev.org/c/openstack/watcher/+/972912 | 11:43 |
| opendevreview | Joan Gilabert proposed openstack/watcher master: Use wrapper classes for novaclient objects https://review.opendev.org/c/openstack/watcher/+/972913 | 11:43 |
| opendevreview | Douglas Viroel proposed openstack/watcher master: DNM - Eventlet code removal https://review.opendev.org/c/openstack/watcher/+/973995 | 11:46 |
| dviroel | hi all, watcher meeting will start in 7m, please add your topics/reviews/bugs to the agenda: https://etherpad.opendev.org/p/openstack-watcher-irc-meeting | 11:53 |
| dviroel | #startmeeting watcher | 12:01 |
| opendevmeet | Meeting started Thu Jan 22 12:01:08 2026 UTC and is due to finish in 60 minutes. The chair is dviroel. Information about MeetBot at http://wiki.debian.org/MeetBot. | 12:01 |
| opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 12:01 |
| opendevmeet | The meeting name has been set to 'watcher' | 12:01 |
| dviroel | hey o/ | 12:01 |
| rlandy | o/ | 12:01 |
| morenod | o/ | 12:01 |
| jgilaber | o/ | 12:01 |
| amoralej | o/ | 12:01 |
| dviroel | courtesy ping list: sean-k-mooney chandankumar | 12:02 |
| dviroel | thanks for joining folks o/ | 12:02 |
| dviroel | let's start with today's meeting agenda | 12:02 |
| dviroel | #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L28 | 12:03 |
| dviroel | feel free to add your own topics to the agenda | 12:03 |
| rlandy | chandankumar is away today | 12:03 |
| dviroel | rlandy: ack, tks | 12:03 |
| dviroel | #topic Eventlet removal | 12:04 |
| dviroel | added a topic for a quick recap on our current status | 12:04 |
| dviroel | we recently merged the last applier patch, to include threading mode | 12:04 |
| dviroel | all components now support threading mode, and there is a ci job that runs on check against all new proposed patches | 12:05 |
| dviroel | there is also a job that runs all unit tests with python3.12 | 12:05 |
| dviroel | I recently proposed a DNM patch to remove all eventlet code | 12:06 |
| dviroel | and see how it goes in our CI jobs | 12:06 |
| dviroel | #link https://review.opendev.org/c/openstack/watcher/+/973995 | 12:06 |
| dviroel | the PS1 was green, but was still missing MAAS eventlet code removal | 12:06 |
| dviroel | the PS2 now has everything removed and we will see ci results soon | 12:07 |
| dviroel | and talking about MAAS | 12:07 |
| dviroel | i just pushed the deprecation of MAAS integration | 12:08 |
| dviroel | #link https://review.opendev.org/c/openstack/watcher/+/974308 | 12:08 |
| dviroel | if nobody fixes its implementation in the following releases, we will need to delete the code, together with eventlet code | 12:08 |
| dviroel | so it is important to deprecate it in this release | 12:09 |
| dviroel | i would also like to request some reviews on | 12:09 |
| dviroel | patch: Remove eventlet-based timeout in CDM collectors: https://review.opendev.org/c/openstack/watcher/+/968568 | 12:10 |
| dviroel | due to the removal of eventlet timeout based implementation, there is a proposal for adding new timeouts inside nova's collector code, using futurist waiters | 12:10 |
| dviroel | thanks all that already reviewed | 12:11 |
| dviroel | and finally | 12:11 |
| dviroel | just to mention what would be the next big thing in the eventlet removal | 12:11 |
| dviroel | https://bugs.launchpad.net/watcher/+bug/2133505 ([RFE] Enable Applier to run Taskflow parallel engine with native threads) | 12:11 |
| dviroel | this RFE summarizes the need for a parallel engine execution when running native threads | 12:12 |
| dviroel | this is important to have before removing eventlet implementation | 12:12 |
| amoralej | it's the last issue to fix for eventlet, right? | 12:12 |
| dviroel | yeah, I think so | 12:13 |
| amoralej | good | 12:13 |
| dviroel | but it can turn out to be very complicated | 12:13 |
| amoralej | :( | 12:13 |
| dviroel | :/ | 12:13 |
| jgilaber | thanks for the great work dviroel++ | 12:14 |
| dviroel | :) | 12:14 |
| dviroel | any other questions/comments on that topic? | 12:14 |
| amoralej | yep dviroel++ this was a hard one | 12:14 |
| dviroel | ok, next topic then | 12:15 |
| dviroel | #topic Reviews | 12:15 |
| dviroel | chandankumar is out, but he left some changes for us | 12:15 |
| dviroel | Add spec for improving watcher-dashboard testing with selenium vs playwright comparison based on poc | 12:16 |
| dviroel | #link https://review.opendev.org/c/openstack/watcher-specs/+/970220 | 12:16 |
| dviroel | it seems that was just a recent move to the 2026.2 directory | 12:16 |
| dviroel | there is also a link to the recent poc | 12:17 |
| dviroel | #link https://review.opendev.org/c/openstack/watcher-dashboard/+/970353 (playwright poc) | 12:17 |
| dviroel | so i guess that we could review and early merge this 2026.2 spec | 12:17 |
| dviroel | next | 12:17 |
| dviroel | 973806: Backport of three nodeset for realdata to stable/2025.2 | 12:18 |
| morenod | that's me | 12:18 |
| dviroel | #link https://review.opendev.org/c/openstack/watcher/+/973806 | 12:18 |
| morenod | last week we merged on master, and it was fine during the CI on the weekend | 12:18 |
| dviroel | oh nice | 12:18 |
| morenod | but this CI also runs on 2025.2, so we need this backport | 12:18 |
| dviroel | ack, I can do a review after the meeting and get this in | 12:19 |
| morenod | it also runs on the weekend, so if we can get it merged today or tomorrow, we will see the results on monday | 12:19 |
| amoralej | master jobs have been passing consistently after that change? | 12:19 |
| morenod | there is only one run | 12:19 |
| morenod | but yes | 12:19 |
| amoralej | for periodic real-data jobs i mean | 12:19 |
| amoralej | it was consistently failing before, so i guess that's a good sign :) | 12:20 |
| sean-k-mooney | o/ | 12:20 |
| dviroel | ok, the next review then is | 12:21 |
| dviroel | Last pending patch of the applier-monitor series (comments are fixed in last PS) | 12:21 |
| dviroel | #link https://review.opendev.org/c/openstack/watcher/+/970614 | 12:21 |
| dviroel | this one is on my list for today | 12:21 |
| amoralej | yes, that comes from the service monitors one, i hope it has all the requested fixes, thanks! | 12:21 |
| jgilaber | me too, I'll get to that later today, but it was almost ready when I first reviewed | 12:22 |
| dviroel | ack, any other review requests? | 12:23 |
| amoralej | I have one but it's related to one of the bugs in the list, so we can cover it there | 12:23 |
| dviroel | ack | 12:23 |
| dviroel | #topic Bugs | 12:23 |
| dviroel | i added 3 bugs to the etherpad today | 12:23 |
| dviroel | DataModel is not updated properly after live migrations with notifications - NEW | 12:24 |
| dviroel | #link https://bugs.launchpad.net/watcher/+bug/2138857 | 12:24 |
| dviroel | this one amoralej ^? | 12:24 |
| amoralej | yes, that's the one i reported today | 12:25 |
| dviroel | i remember someone raising a similar issue in another old bug | 12:25 |
| amoralej | so, while testing locally i found some unexpected behaviors when notifications were enabled | 12:25 |
| amoralej | me too, but i couldn't find it, tbh | 12:25 |
| amoralej | I sent patch https://review.opendev.org/c/openstack/watcher/+/974298 | 12:25 |
| amoralej | surprisingly, it was fine when notifications were introduced, but then the event for live migration end was changed in https://review.opendev.org/c/openstack/watcher/+/658973 and that broke it | 12:26 |
| amoralej | or something has changed in the nova notifications since then, dunno | 12:26 |
| amoralej | but what i find when checking the notification payloads is that instance.live_migration_post_dest.end is the correct event to consume | 12:27 |
| amoralej | you can see the full payloads in the lp | 12:27 |
| dviroel | ok, i can also double check that during reviews | 12:28 |
| amoralej | thanks | 12:28 |
| dviroel | and this only happens with live migration? | 12:28 |
| amoralej | yes, only affects live migrations | 12:28 |
| amoralej | the actual issue is that the model is wrongly updated on live-migrations | 12:29 |
| amoralej | it's fixed in the following periodic sync | 12:29 |
| dviroel | oh ok | 12:29 |
| amoralej | but there is a time gap where the host assigned to the vm is wrong | 12:29 |
| dviroel | thanks for reporting and quickly providing a fix | 12:30 |
| dviroel | i will take a look with more time | 12:30 |
| amoralej | that will deserve backport imo btw | 12:30 |
| dviroel | sure | 12:30 |
| dviroel | any other comments wrt this bug? | 12:31 |
| dviroel | next one | 12:31 |
| dviroel | Race condition in ONESHOT Audit cancel workflow | 12:32 |
| dviroel | #link https://bugs.launchpad.net/watcher/+bug/2134046 | 12:32 |
| dviroel | ah! I remember this one | 12:32 |
| amoralej | yes, we discussed that some time ago | 12:32 |
| dviroel | it happens sometimes in CI, depending on how the test is written in tempest | 12:33 |
| amoralej | and the infra load, as it's a race condition | 12:33 |
| dviroel | I can confirm that it happens yes | 12:33 |
| dviroel | priority for this one? | 12:33 |
| amoralej | Chandan mentioned he would work on it in https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/969366 | 12:35 |
| amoralej | which btw, i think works around it | 12:35 |
| jgilaber | I think high, since we encounter it from time to time in ci | 12:35 |
| amoralej | i think it will not affect real world users | 12:35 |
| dviroel | yeah | 12:36 |
| jgilaber | although medium would also make sense since it's not consistent | 12:36 |
| amoralej | that's why we didn't prioritize it too much | 12:36 |
| amoralej | but yeah, it's a pain in ci | 12:36 |
| dviroel | i would say that is a Low, it is just an annoyance in ci | 12:36 |
| dviroel | in the end, it will really execute the oneshot audit, that was previously cancelled | 12:36 |
| dviroel | but it is indeed a nice to have | 12:37 |
| amoralej | only happens if you try to cancel a oneshot audit right after creating it and before it goes into ONGOING, which is usually just a very short time interval | 12:37 |
| dviroel | btw, s/Priority/Importance | 12:37 |
| amoralej | correct, it's a valid bug under certain circumstances | 12:37 |
| dviroel | low or medium in this case? | 12:37 |
| amoralej | medium :) | 12:38 |
| dviroel | ok | 12:38 |
| amoralej | I proposed a couple of alternatives, btw, i think the one to follow is making the decision-engine validate that the audit is not CANCELLED before moving it to ONGOING. | 12:38 |
| dviroel | ack | 12:39 |
| jgilaber | +1 to that | 12:39 |
| amoralej | if we don't fix it, i'd be in favor of merging the workaround https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/969366 to at least fix the ci | 12:39 |
| jgilaber | it should fix the problem and have fewer implications than the alternative | 12:39 |
| amoralej | i mean, if we don't fix it soon | 12:40 |
| dviroel | ok, so amoralej you are the assignee of this one :) | 12:40 |
| dviroel | ack | 12:40 |
| amoralej | ok, no problem :) | 12:40 |
| dviroel | so the next bug in the list | 12:41 |
| dviroel | Resource name is not required in the schema of the change_nova_service_state action | 12:41 |
| dviroel | #link https://bugs.launchpad.net/watcher/+bug/2133199 | 12:41 |
| dviroel | this one has been there for a while | 12:41 |
| jgilaber | yes, I opened this one while reviewing amoralej's patch documenting the existing actions | 12:43 |
| jgilaber | the action schema and code are not in sync | 12:43 |
| jgilaber | the action uses resource_name unconditionally but it's not required in the spec | 12:44 |
| jgilaber | s/spec/schema | 12:44 |
| amoralej | also, we have duplicated information there, resource_id and resource_name iirc. we should rely on only one | 12:45 |
| jgilaber | there is also a comment by sean-k-mooney in the bug pointing to the action using an old compute api | 12:45 |
| jgilaber | I have not dug into the action code | 12:45 |
| dviroel | got it, it is something that needs to be fixed | 12:46 |
| dviroel | but we don't see any failures today because the code itself is configuring these parameters? | 12:46 |
| dviroel | it could be an issue when a user creates their own actions with the actuator? | 12:47 |
| amoralej | yes | 12:47 |
| amoralej | in the existing strategies, it's always using resource_name and resource_id, and the action only uses resource_name | 12:48 |
| jgilaber | for example the host maintenance strategy sets both https://github.com/openstack/watcher/blob/8a884e3d5166678b85eaf264e518ef24b30085e5/watcher/decision_engine/strategy/strategies/host_maintenance.py#L165 | 12:48 |
| jgilaber | so it won't cause a problem | 12:48 |
| sean-k-mooney | the schema and code are restricting use cases that nova supports | 12:48 |
| amoralej | i.e. https://github.com/openstack/watcher/blob/4f17759e794d2f07e7ae209f1f07f4e79733db1d/watcher/decision_engine/strategy/strategies/host_maintenance.py#L163-L180 | 12:48 |
| sean-k-mooney | so resource_name in at least some cases should not be required | 12:48 |
| sean-k-mooney | but we would need to change the implementation to make that work if i recall | 12:49 |
| amoralej | for me, the question is if we should always use resource_id and remove resource_name | 12:49 |
| amoralej | yes, it needs changes in the implementation | 12:49 |
| sean-k-mooney | when calling nova probably, but we don't necessarily want to change our inputs to the strategies to require that | 12:50 |
| sean-k-mooney | the hostname in nova originally (i mean up until train ish) didn't officially support using FQDNs | 12:51 |
| sean-k-mooney | tripleo was abusing a gap in our validation for years before we officially supported that | 12:51 |
| sean-k-mooney | host names were required to be globally unique | 12:52 |
| sean-k-mooney | in some cases however, i.e. multi-cell deployments | 12:53 |
| sean-k-mooney | it was possible to have 2 computes with the same service name but different uuids | 12:53 |
| amoralej | so, one more reason to use the uuid i guess | 12:53 |
| sean-k-mooney | now that won't happen in our product and we tell people to either use an FQDN now or a globally unique host name | 12:53 |
| sean-k-mooney | well i'm mostly saying in practice today people know to either make the short host name unique or use the fqdn | 12:54 |
| sean-k-mooney | so it's a factor but we could just say don't do that if we have a case where they conflict | 12:54 |
| sean-k-mooney | anyway we likely should accept either value but require that one of the 2 is set and prefer the uuid over the name where the uuid is available in most cases | 12:56 |
| dviroel | how do we proceed with this LP? | 12:57 |
| jgilaber | +1 to that | 12:57 |
| dviroel | fix the schema as mentioned? | 12:57 |
| dviroel | and open a RFE based on sean's comment #1? | 12:57 |
| dviroel | to update the code to use the correct nova api | 12:57 |
| sean-k-mooney | well i would consider comment 1 to be a bugfix not an RFE | 12:58 |
| sean-k-mooney | i.e. just internally using a different nova api is not a feature if it does not break existing external interaction | 12:58 |
| sean-k-mooney | my preference is to not change the schema and to modify the code to work when resource name is not provided | 12:59 |
| amoralej | and when it's provided, what'd be the expected behavior? | 13:00 |
| amoralej | which one would be used? | 13:00 |
| sean-k-mooney | if both are there, the uuid; if only the resource name, we would retrieve the uuid from our compute model and use the uuid | 13:01 |
| amoralej | so resource_name wins | 13:01 |
| sean-k-mooney | i mean it depends on if we are talking about running the action via the actuator strategy where we specify the action directly | 13:01 |
| amoralej | ah, sorry, the resource_id is mandatory | 13:01 |
| amoralej | that was my point | 13:01 |
| sean-k-mooney | or if we are talking about internally | 13:01 |
| sean-k-mooney | internally i think we should move to exclusively using the uuid | 13:02 |
| amoralej | ^ that was my preference too | 13:02 |
| amoralej | that said, imo this is low | 13:03 |
| sean-k-mooney | so the decision engine should exclusively create actions using resource_id going forward | 13:03 |
| sean-k-mooney | ya this is a latent bug, low is ok | 13:03 |
| sean-k-mooney | medium would be fine too | 13:03 |
| dviroel | ack, i will update the LP and link our logs there | 13:04 |
| dviroel | anything else to discuss? | 13:04 |
| dviroel | we are already overtime | 13:04 |
| dviroel | chandankumar will chair next week's meeting | 13:05 |
| dviroel | let's wrap up | 13:05 |
| dviroel | we will meet again next week | 13:05 |
| dviroel | thank you all for participating | 13:05 |
| dviroel | #endmeeting | 13:05 |
| opendevmeet | Meeting ended Thu Jan 22 13:05:44 2026 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 13:05 |
| opendevmeet | Minutes: https://meetings.opendev.org/meetings/watcher/2026/watcher.2026-01-22-12.01.html | 13:05 |
| opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/watcher/2026/watcher.2026-01-22-12.01.txt | 13:05 |
| opendevmeet | Log: https://meetings.opendev.org/meetings/watcher/2026/watcher.2026-01-22-12.01.log.html | 13:05 |
| opendevreview | David proposed openstack/watcher stable/2025.2: Move real data jobs nodeset to three nodes (two computes + 1 controller) https://review.opendev.org/c/openstack/watcher/+/973806 | 15:21 |
| dviroel | jgilaber: if you are still around, pls revote ^ | 17:07 |
| jgilaber | done | 17:15 |
| opendevreview | Joan Gilabert proposed openstack/watcher master: Add wrapper classes for novaclient objects https://review.opendev.org/c/openstack/watcher/+/972912 | 17:17 |
| opendevreview | Joan Gilabert proposed openstack/watcher master: Use wrapper classes for novaclient objects https://review.opendev.org/c/openstack/watcher/+/972913 | 17:17 |
| opendevreview | Merged openstack/watcher stable/2025.2: Move real data jobs nodeset to three nodes (two computes + 1 controller) https://review.opendev.org/c/openstack/watcher/+/973806 | 18:58 |
| opendevreview | Merged openstack/watcher master: Homogenize ActionPlans cancel behavior in threading and eventlet mode https://review.opendev.org/c/openstack/watcher/+/973274 | 19:24 |
| opendevreview | Douglas Viroel proposed openstack/watcher-specs master: Add spec for Audit Pipeline feature https://review.opendev.org/c/openstack/watcher-specs/+/969840 | 19:51 |
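
The fix discussed for bug 2134046 (have the decision-engine check that a ONESHOT audit is not CANCELLED before moving it to ONGOING) can be sketched as a guarded state transition. This is only an illustration of the idea, not Watcher's code: `AuditStore`, `AuditState`, `cancel_audit` and `start_audit` are hypothetical names, and an in-memory dict with a lock stands in for the real database layer.

```python
import threading
from enum import Enum


class AuditState(Enum):
    # Hypothetical subset of states, just enough to show the race.
    PENDING = "PENDING"
    ONGOING = "ONGOING"
    CANCELLED = "CANCELLED"


class AuditStore:
    """Hypothetical in-memory stand-in for the audit table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._states = {}

    def create(self, audit_id):
        with self._lock:
            self._states[audit_id] = AuditState.PENDING

    def get(self, audit_id):
        with self._lock:
            return self._states[audit_id]

    def transition(self, audit_id, expected, new):
        """Set the state to `new` only if it is currently `expected`.

        The check and the write happen under one lock, so a concurrent
        cancel cannot slip in between them.
        """
        with self._lock:
            if self._states[audit_id] is not expected:
                return False
            self._states[audit_id] = new
            return True


def cancel_audit(store, audit_id):
    # API side: cancel an audit that has not started yet.
    return store.transition(
        audit_id, AuditState.PENDING, AuditState.CANCELLED)


def start_audit(store, audit_id):
    # Decision-engine side: refuse to start unless still PENDING,
    # which covers the "already CANCELLED" case from the bug.
    return store.transition(
        audit_id, AuditState.PENDING, AuditState.ONGOING)
```

With this shape, whichever of cancel and start reaches the store first wins, and the loser gets `False` back instead of silently overwriting the other's state. In a real service the same guard would need to be an atomic conditional update in the database rather than a process-local lock.
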
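
The direction suggested for bug 2133199 (accept either `resource_id` or `resource_name`, require that one of the two is set, and prefer the uuid) can be sketched as a small resolver. This is a rough illustration under assumptions, not the action's actual code: `resolve_compute_uuid` and the `name_to_uuids` mapping are hypothetical, with the mapping standing in for a lookup in the compute data model.

```python
def resolve_compute_uuid(params, name_to_uuids):
    """Return the compute service uuid for an action's input parameters.

    `params` is the action input dict; `name_to_uuids` is a hypothetical
    mapping from host name to a list of service uuids, standing in for
    the compute model.
    """
    resource_id = params.get("resource_id")
    resource_name = params.get("resource_name")

    if not resource_id and not resource_name:
        raise ValueError(
            "one of resource_id or resource_name is required")

    # Prefer the uuid whenever it is provided.
    if resource_id:
        return resource_id

    # Only the name was given: look the uuid up. A name that maps to
    # zero or several uuids (e.g. the multi-cell case mentioned in the
    # meeting) is rejected rather than guessed.
    matches = name_to_uuids.get(resource_name, [])
    if len(matches) != 1:
        raise LookupError(
            "cannot resolve a unique uuid for %r" % resource_name)
    return matches[0]
```

For example, `resolve_compute_uuid({"resource_name": "compute-0"}, {"compute-0": ["uuid-a"]})` falls back to the name lookup, while passing both keys returns the given `resource_id` untouched.
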