| opendevreview | Thomas Goirand proposed openstack/watcher master: Fix term.py with 0.22.2 https://review.opendev.org/c/openstack/watcher/+/964762 | 10:40 |
|---|---|---|
| opendevreview | chandan kumar proposed openstack/watcher-dashboard master: Add option to SKIP Actions https://review.opendev.org/c/openstack/watcher-dashboard/+/958209 | 12:06 |
| chandankumar | dviroel: Hello, please add this review to your list https://review.opendev.org/c/openstack/watcher-dashboard/+/958209, thank you! | 12:06 |
| dviroel | chandankumar: ack | 12:10 |
| opendevreview | chandan kumar proposed openstack/watcher-dashboard master: Remove legacy integration test framework https://review.opendev.org/c/openstack/watcher-dashboard/+/964775 | 13:13 |
| opendevreview | chandan kumar proposed openstack/watcher-dashboard master: Add option to SKIP Actions https://review.opendev.org/c/openstack/watcher-dashboard/+/958209 | 13:15 |
| dviroel | sean-k-mooney: hi o/ - when you have some time, can you revisit https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/956116 ? I removed some previously added config options to make it simpler | 13:20 |
| opendevreview | Joan Gilabert proposed openstack/watcher master: Fix zone migration to accept dst_pool or dst_type https://review.opendev.org/c/openstack/watcher/+/964776 | 13:27 |
| sean-k-mooney | oh that, sure, I'll review it now | 14:05 |
| sean-k-mooney | dviroel: +2 with comments | 14:33 |
| sean-k-mooney | read tomorrow as next week since this is a long weekend | 14:34 |
| dviroel | thanks sean-k-mooney i will take a look | 14:45 |
| chandankumar | sean-k-mooney: dviroel hello, when you get time, please have a look at these two https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/956004 and https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/955472 thank you! | 15:09 |
| sean-k-mooney | I'm reviewing the second one currently | 15:10 |
| opendevreview | Merged openstack/watcher-tempest-plugin master: Add tests for extended compute datamodel https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/956116 | 15:11 |
| opendevreview | Joan Gilabert proposed openstack/watcher-tempest-plugin master: Add test for volume migrate with zone migration https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/958644 | 15:11 |
| opendevreview | Joan Gilabert proposed openstack/watcher-tempest-plugin master: Test zone migration volume and compute migrations https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/962702 | 15:11 |
| opendevreview | Joan Gilabert proposed openstack/watcher-tempest-plugin master: Add extra checks to zone migration retype test https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/963559 | 15:11 |
| opendevreview | Joan Gilabert proposed openstack/watcher-tempest-plugin master: Add test for volume migrate with zone migration https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/958644 | 15:19 |
| opendevreview | Joan Gilabert proposed openstack/watcher-tempest-plugin master: Test zone migration volume and compute migrations https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/962702 | 15:19 |
| opendevreview | Joan Gilabert proposed openstack/watcher-tempest-plugin master: Add extra checks to zone migration retype test https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/963559 | 15:19 |
| sean-k-mooney | jgilaber: dviroel we have a problem with our tempest plugin | 15:41 |
| sean-k-mooney | 20 tests in 3194.4666 sec. | 15:41 |
| sean-k-mooney | that is nowhere near ok | 15:41 |
| sean-k-mooney | the way we are building the tests, we are waiting far, far too long and reconciling things far too slowly | 15:41 |
| jgilaber | hmm I thought with notifications the waiting would not be too bad | 15:43 |
| sean-k-mooney | 20 tests in 50 minutes is extreme | 15:43 |
| sean-k-mooney | watcher_tempest_plugin.tests.scenario.test_execute_zone_migration.TestExecuteZoneMigrationStrategyVolume.test_execute_zone_migration_volume_retype [211.483003s] | 15:43 |
| sean-k-mooney | the zone migration tests are some of the slowest | 15:43 |
| sean-k-mooney | anything over 90 seconds should be a cause for alarm, but we are double or triple that in many cases | 15:44 |
| sean-k-mooney | the problem is that a lot of the tests are polling and have embedded sleeps | 15:49 |
| sean-k-mooney | we run tests sequentially so that pattern can't scale | 15:50 |
| sean-k-mooney | if we are polling like that we need to use much, much shorter intervals, like 0.5-1.0 seconds, not 10-15 | 15:50 |
| jgilaber | sean-k-mooney, you mean in snippets like https://github.com/openstack/watcher-tempest-plugin/blob/99388fae9603a71564456757149dbad2d004c0cd/watcher_tempest_plugin/tests/scenario/base.py#L230? | 15:53 |
| sean-k-mooney | yep | 15:54 |
| sean-k-mooney | no test is meant to take more than 5 minutes total by default, I believe, so a single step being allowed to wait up to 10 mins is not correct | 15:54 |
| sean-k-mooney | https://github.com/openstack/watcher-tempest-plugin/blob/99388fae9603a71564456757149dbad2d004c0cd/watcher_tempest_plugin/tests/scenario/base.py#L255-L261 | 15:54 |
| sean-k-mooney | there is a reason why these default to .5 seconds | 15:55 |
| sean-k-mooney | often we set it to 0.2 or similar | 15:55 |
| sean-k-mooney | anything above about 30 seconds is considered a slow test for tempest; 30-90 is ok for a scenario test, but if we are getting into the 200-300 second range, that's a problem | 15:56 |
| sean-k-mooney | jgilaber: part of the reason why we are injecting data is so we can speed up the tests and have them run consistently | 15:57 |
| sean-k-mooney | we can discuss this in the testing PTG session next week | 15:58 |
| sean-k-mooney | but we can't keep adding tempest tests that take that long to run, and we should try to optimise the existing ones | 15:58 |
| jgilaber | sounds good, I'm trying to get some timings from the tempest logs | 15:58 |
| sean-k-mooney | https://zuul.opendev.org/t/openstack/build/a96614ce7797470e949eabf56c15632f/log/job-output.txt#54694 | 15:59 |
| sean-k-mooney | it's not logged in the tempest logs by default; also we can get that info from stestr with a tweak to the job | 15:59 |
| sean-k-mooney | the test time is however in the raw job output | 15:59 |
| jgilaber | {0} watcher_tempest_plugin.tests.scenario.test_execute_zone_migration.TestExecuteZoneMigrationStrategy.test_execute_zone_migration_with_destination_host [218.771057s] ... ok | 16:01 |
| jgilaber | 1 {0} watcher_tempest_plugin.tests.scenario.test_execute_zone_migration.TestExecuteZoneMigrationStrategy.test_execute_zone_migration_without_destination_host [234.584126s] ... ok | 16:01 |
| jgilaber | 3 {0} watcher_tempest_plugin.tests.scenario.test_execute_zone_migration.TestExecuteZoneMigrationStrategyVolume.test_execute_zone_migration_volume_retype [211.483003s] ... ok | 16:01 |
| jgilaber | all three tests for zone migration have very similar runtimes | 16:02 |
| dviroel | we can add some additional debug logging in the wait/sleep parts, to identify where/why | 16:04 |
| dviroel | wait for instances in model is common in most of the scenario tests | 16:04 |
| jgilaber | most likely culprit for zone migration is wait_for_instances_in_model, since for example the volume retype test creates server/volumes directly from tempest lib methods | 16:06 |
| jgilaber | not the methods from the plugin base | 16:06 |
| jgilaber | and the waits for the audits/action plan | 16:07 |
| dviroel | the collector run period is 2 min, but we should get model updates from notifications faster | 16:09 |
| jgilaber | I'm creating a quick and dirty patch with timing logs to check which calls take longer | 16:19 |
| opendevreview | Joan Gilabert proposed openstack/watcher-tempest-plugin master: [DNM] Log timing for function calls https://review.opendev.org/c/openstack/watcher-tempest-plugin/+/964800 | 16:24 |
| jgilaber | ^^ should confirm our suspicions | 16:24 |
| dviroel | reducing the sleep time in model check should help, but I would like to see if there is something else yes | 16:35 |
| jgilaber | I'll check back on Monday, I'm leaving for today o/ | 16:35 |
| dviroel | sure, thanks jgilaber | 16:35 |
| dviroel | have a good weekend | 16:35 |
| sean-k-mooney | dviroel: we are injecting metrics and enabling notifications to not depend on that | 16:52 |
| sean-k-mooney | but ya the collector interval could be part of the problem | 16:52 |
| sean-k-mooney | my expectation is that the creation of the test VM and the notification of the same should take a couple of seconds at most | 16:53 |
| sean-k-mooney | say 10 seconds from when we do the API request | 16:54 |
| sean-k-mooney | so I'm expecting to have the VM in our model very shortly after that | 16:54 |
| dviroel | right, unless we have something not working properly on the notifications side | 16:54 |
| sean-k-mooney | yep | 16:54 |
| dviroel | the only test that needs the collector sync is the one that I just added, to get pinned_az info | 16:54 |
| sean-k-mooney | so before we add more tests to the plugin and start getting job timeouts | 16:54 |
| sean-k-mooney | we will need to dig into that | 16:55 |
| dviroel | since pinned_az is not in notifications from nova, which we could also fix in future releases | 16:55 |
| dviroel | agree | 16:55 |
| sean-k-mooney | dviroel: ack, I think this is a preexisting issue | 16:55 |
| sean-k-mooney | as in we have other tests that are quite long | 16:55 |
| sean-k-mooney | I looked at nova and its live migration scenario tests are all in the 50-80 second range | 16:56 |
| sean-k-mooney | I would expect our zone migration ones to be in a similar ballpark | 16:56 |
| dviroel | ack | 16:57 |
| sean-k-mooney | it might be slightly longer but not by a lot ideally | 16:57 |
| dviroel | the model update waiting can be the thing | 16:57 |
| dviroel | it first waits for the instance to be in the model | 16:57 |
| dviroel | and in the end it waits for the instance to be deleted from the model | 16:57 |
| dviroel | waits two times for the model | 16:58 |
| dviroel | + migrations and other things | 16:58 |
| sean-k-mooney | well the other issue i think is https://github.com/openstack/watcher/blob/74efcbf9992b0ffee1fcd5bc72b8b4f7963a4166/watcher/common/cinder_helper.py#L151 | 16:59 |
| sean-k-mooney | https://github.com/openstack/watcher/blob/74efcbf9992b0ffee1fcd5bc72b8b4f7963a4166/watcher/common/nova_helper.py#L162-L176 | 16:59 |
| sean-k-mooney | we have lots of places in watcher that are injecting hard-coded sleeps | 17:00 |
| sean-k-mooney | https://github.com/openstack/watcher/blob/74efcbf9992b0ffee1fcd5bc72b8b4f7963a4166/watcher/common/nova_helper.py#L218-L221 | 17:00 |
| sean-k-mooney | every time.sleep we have in the APIs like that is technical debt | 17:00 |
| sean-k-mooney | we should be doing an exponential backoff with an upper bound | 17:02 |
| dviroel | yeah, makes sense | 17:04 |
| dviroel | each part has its own sleep | 17:04 |
| sean-k-mooney | including the applier executor loop | 17:04 |
| sean-k-mooney | https://github.com/openstack/watcher/blob/74efcbf9992b0ffee1fcd5bc72b8b4f7963a4166/watcher/applier/workflow_engine/base.py#L247 | 17:04 |
| sean-k-mooney | we really need to rewrite this to use threading events or futures or similar. | 17:05 |
| sean-k-mooney | all these 1 second sleeps are quickly going to add up | 17:05 |
| sean-k-mooney | this is kind of the other half of the scalability question that amoralej is looking into | 17:06 |
| sean-k-mooney | we need to enable horizontal scalability but we also need to address the core of the executor loop and make that more efficient too | 17:07 |
| dviroel | ack, there will be a session for the applier in general too, in which we should cover this part | 17:09 |
| dviroel | there is also a critical part in the model collector too, with sleeps, that will require threading events | 17:10 |
| sean-k-mooney | ya so we should move away from sleep to events or futures, which support timeouts when you wait on them | 17:11 |
| sean-k-mooney | now if we need to look up external data, like polling a migration for completion | 17:11 |
| sean-k-mooney | then sure we need to still poll | 17:11 |
| sean-k-mooney | but we should not do that on a fixed interval | 17:12 |
| dviroel | yeah | 17:33 |
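
A minimal sketch of the polling pattern suggested in the discussion above: start from a sub-second interval and back off exponentially up to an upper bound, with an overall timeout. The helper name `wait_for` and all of its parameters are hypothetical; this is not code from watcher or watcher-tempest-plugin, and the default values are only illustrative.

```python
import time


def wait_for(predicate, timeout=60.0, initial_interval=0.2,
             max_interval=5.0, factor=2.0,
             clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it is truthy or the timeout expires.

    Hypothetical helper: the pause between polls starts at
    initial_interval and is multiplied by factor after each poll,
    but never exceeds max_interval. Returns True on success and
    False if the timeout is reached first.
    """
    deadline = clock() + timeout
    interval = initial_interval
    while True:
        if predicate():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
        # Exponential backoff, capped at the upper bound.
        interval = min(interval * factor, max_interval)
```

A caller could use it as `wait_for(lambda: instance_in_model(uuid), timeout=30)`, where `instance_in_model` stands in for whatever check the test needs. The `clock` and `sleep` parameters are injectable so the backoff schedule can be verified without real waiting.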