Tuesday, 2025-06-17

opendevreviewJamesLin proposed openstack/watcher-specs master: fix: Update the launchpad blueprint link of host-maintenance-strategy-disable-migration spec  https://review.opendev.org/c/openstack/watcher-specs/+/95272901:13
opendevreviewVictoria Martinez de la Cruz proposed openstack/grian-ui master: [WIP] Volume metrics admin dashboard  https://review.opendev.org/c/openstack/grian-ui/+/95275010:32
sean-k-mooneygmaan: amoralej  fun faction all existign action pre and post condtions are litrualy nops11:23
sean-k-mooneywe should change that but that makes the effect on the skipped stte simpelr.11:24
sean-k-mooneywe shoudl not remove the content form the spec11:24
sean-k-mooneyas we want to defien the behavior moving forward11:24
sean-k-mooneybut ya thats a thing.11:24
amoralejactually, i thought of that part of the skipped while checking existing pre_conditions in actions11:25
amoralejwhich are empty, but i think it's good to implement them once we have skipped in place11:25
amoralejas you said, to move forward11:26
sean-k-mooneyyep, at a minitum we shoudl be assertign the resouce we are goign to perfrom an action on exist and is in a vaild sate11:26
amoralejyes11:26
sean-k-mooneyi was reviewign the new stop action for jameslin's spec https://review.opendev.org/c/openstack/watcher/+/952538/4/watcher/applier/actions/stop.py11:27
sean-k-mooneyand they actully impelemted pre/post condtion checks11:27
sean-k-mooneyalthoguh we will need to modify https://github.com/openstack/watcher/blob/master/watcher/applier/workflow_engine/default.py#L163 to actully mark the action as failed if the post condition failed11:29
amoralejthat's good11:29
amoralejalthough we may move to skipped once we have it. i.e. instance does not exist or it is already stopped11:30
sean-k-mooneyskipped woudl be correct for preconditions failure not for post condtion failures11:30
sean-k-mooneyif the post_condtions fail then the action failed11:31
amoralejyes, i meant pre, i was reading that11:31
sean-k-mooneyright11:31
amoralejagree, post failures -> failed11:31
sean-k-mooneyso when we have skipped then we need to make sure that that works proeprly11:31
sean-k-mooneyyou shoudl be able to handel that generically here https://github.com/openstack/watcher/blob/master/watcher/applier/workflow_engine/default.py#L139-L14411:33
sean-k-mooneyjust wrappign the cal to pre_conditosn in a try except11:33
sean-k-mooneyand marking the action as skiped if it raises11:33
amoralejyes, that was my plan11:35
opendevreviewAlfredo Moralejo proposed openstack/watcher master: Add debug message to report calculated metric for workload_balance  https://review.opendev.org/c/openstack/watcher/+/95163912:12
opendevreviewDavid proposed openstack/watcher-tempest-plugin master: Add tests for workload balance with real data  https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/95033513:24
opendevreviewJoan Gilabert proposed openstack/watcher master: Support zone migration audit without compute_nodes  https://review.opendev.org/c/openstack/watcher/+/95066513:31
opendevreviewMerged openstack/watcher-specs master: fix: Update the launchpad blueprint link of host-maintenance-strategy-disable-migration spec  https://review.opendev.org/c/openstack/watcher-specs/+/95272914:44
amoralejsean-k-mooney, wrt scope, after checking with jgilaber the current behavior of scope in the audits is that the model which is presented to every audit execution in the decision-engine is scoped according to the scope definition, https://github.com/openstack/watcher/blob/31879d26f4889c6367d85ff279f369913c672df7/watcher/decision_engine/strategy/strategies/base.py#L289-L29115:04
amoralejso scope should be the "top level filter" for the audit15:05
amoralejeven in zone_migration it has more priority that the list of src_node in compute_nodes parameter15:05
amoralejjgilaber ^ is this correct?15:05
amoralejwell, there is a bug, but that's a different story :)15:05
jgilaberyes, the model that is exposed to the strategy takes into account the scope properly15:14
jgilaberbut the input for the audit is not modified, so as amoralej says in the zone migration if you ask for a migration that is out of the scope the audit will crash15:15
sean-k-mooneyamoralej: ok but the zone migration only move vms on the comptue nodes that are in the input parmaters15:21
sean-k-mooneyamoralej: so if you only use scope it wont work15:21
amoralejyes15:21
sean-k-mooneywell it might but it wont move anythign15:21
amoralejit won't move anything if compute_nodes is empty, regardless of the scope15:22
sean-k-mooneywe would need something like a "compute_nodes_form_scope" value in the input_parmaters15:22
amoralejok, now i understand your point, i didn't get it before, so you want to infer input_parameter.compute_nodes from the scope15:23
amoralejthat's an rfe :)15:23
sean-k-mooneyright if we wanted to defien a zone via a host aggreate and pass that host aggrate via the scope15:23
sean-k-mooneywe would need to infer the computes form taht scope15:24
sean-k-mooneyfor it ot be useful15:24
sean-k-mooneyyes it defintly is15:24
sean-k-mooneyfor some stragies that just operate on teh whole model15:25
sean-k-mooneyyou dont need the infernce15:25
sean-k-mooneybut for zone migration you do 15:25
amoralejyep15:25
sean-k-mooneyit looks like host_matintance adn zone_migration are outlier15:28
sean-k-mooneythe other stargies do not use input_parmaters to contol whcih resouce to modify15:29
sean-k-mooneyfor zone migration the solution might be to deprecate compute_nodes and make ti optional in a newer microverion15:29
sean-k-mooneyand instead contole it via scope15:30
amoralejbut then we'd need a way to specify dst_node if we want to maintain current behavior15:30
amoralejcompute_nodes can be made of src_node, dst_node pairs15:31
sean-k-mooneywell i guess you would not need to fuly deprecate it jsut he soruce nod aspect15:31
sean-k-mooneyi guess curently its also quite limited in that you cant have multiple src_nodes15:31
amoralejit's 1:1, yes15:32
sean-k-mooneywhich is really somethign we woudl want to have for a "zone based" stratgey15:32
sean-k-mooneyrealisticly i think in the medium term we may end up decidng to create new stragies to replace this15:32
sean-k-mooneyyou can do the vm part with workload stablisation15:33
sean-k-mooneyand the volume part to some degree with https://docs.openstack.org/watcher/latest/strategies/storage_capacity_balance.html15:33
sean-k-mooneybut we may decised that its better to create a clean impletation without the legacy15:34
sean-k-mooneythat will replcae the existing one15:34
amoralejyes, mid-term i think it's worthy to rethink what we want to do with this goal and strategy15:35
sean-k-mooneythe only thing you cant do with the other two today15:35
sean-k-mooneyis move the vms and the volumes in one action plan15:35
amoralejand moving vms from a predefined set of hosts15:36
amoralejhost_maintenance can do only one15:36
amoralejand workload_stabilization would not move all the vms15:37
amoraleji mean, at least we enhance it for that use case15:37
amoralejsean-k-mooney, wrt unit tests to assert debug messages, should i mock LoggerAdapter.debug ?15:37
amoralejwhat's the best way to do it?15:37
sean-k-mooneydid the job report. if the line has coverage now then the fixture that jgilaber proted form nova tests teh formating15:38
sean-k-mooneyhow we woudl normally test this is mock the LOG varible in teh module under test and assert it called with teh correct paramter or assert teh expected string is in teh output of the log fixture15:39
sean-k-mooneyeither works15:39
jgilaberthe problem with the fixture is that with the current config it will not capture debug logs in the output15:40
amoralejwe are not including debug messages in the log fixtures, iiuc15:40
jgilaberit executes them, but with a NullHandler15:40
sean-k-mooneyjgilaber: debug logs requrie you to set an envionemt variable15:40
amoralejso mocking LOG seems the way to go15:40
sean-k-mooneyOS_DEBUG15:40
sean-k-mooneyand we shoudl be setting that in the ci jobs15:40
jgilaberyes, ok then that is missing15:40
sean-k-mooneyjgilaber: actully i think we may not to save space i cant recall15:41
sean-k-mooneyamoralej: but ya mockign log.debug is more convetnional15:41
jgilaberI guessed that it was disabled by default because it would be quite verbose15:41
sean-k-mooneyi know we had issues in teh past with hitting the max message size of subunit15:41
sean-k-mooneywhen we had debug enabeld in nova15:42
sean-k-mooneythat was eventually fixed in stestr15:42
sean-k-mooneybut it was a proablme in the past15:42
sean-k-mooneyamoralej: so if we look at the coverage report 15:43
sean-k-mooneyhttps://cba07d0052e893dee9e9-23ff2657fe87c0d4ee3109cc6b8bff44.ssl.cf2.rackcdn.com/openstack/d872cc5544854bd28ecf23f9c9e342de/cover/z_65cb95e032a919a1_workload_balance_py.html#t20115:43
sean-k-mooneywe have unit test coverage for the debug slog you added at least there15:43
sean-k-mooneywe have no unit test coverage for instance_ram_usage...15:44
sean-k-mooneyamoralej: that lack of coverage was actully a problem before your change15:44
sean-k-mooneyso you do not need to fix it in this one althoguh i wont object if you do15:44
sean-k-mooneybut we should add that to the list of tech debt15:45
sean-k-mooneyi dont think we need 100% unit test coverage15:45
sean-k-mooneybut we shoudl try to impvoe our current gaps over time15:45
amoralejgot it15:46
opendevreviewAlfredo Moralejo proposed openstack/watcher master: Add debug message to report calculated metric for workload_balance  https://review.opendev.org/c/openstack/watcher/+/95163917:15
-opendevstatus- NOTICE: Zuul jobs reporting POST_FAILURE were due to an incident with one of our cloud providers; this provider has been temporarily disabled and changes can be rechecked22:38

Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!