opendevreview | Ghanshyam proposed openstack/tempest master: Test Nova and Glance RBAC policy old defaults https://review.opendev.org/c/openstack/tempest/+/884764 | 03:32 |
---|---|---|
opendevreview | Ghanshyam proposed openstack/tempest master: Make glance-multistore-cinder-import voting https://review.opendev.org/c/openstack/tempest/+/885810 | 03:32 |
kopecmartin | frickler: hmm, i see also 4 non devstack patches there - https://review.opendev.org/q/topic:drop-old-distros | 06:16 |
frickler | kopecmartin: now I see those, too. maybe gerrit was slow with updating caches or so after you changed the topics yesterday | 06:18 |
slaweq | frickler hi, quick question - can You +W https://review.opendev.org/c/openstack/devstack/+/882770 now as yoga patch was merged already? Thx in advance | 08:51 |
whoami-rajat | damiandabrowski, no problem, thanks for the fixes | 08:55 |
frickler | slaweq: done | 09:22 |
slaweq | Thx | 09:23 |
opendevreview | Merged openstack/devstack stable/xena: Fix installation of OVS/OVN from sources https://review.opendev.org/c/openstack/devstack/+/882770 | 10:48 |
opendevreview | Maxim Sava proposed openstack/tempest master: Add image admin test to list public image https://review.opendev.org/c/openstack/tempest/+/886340 | 12:18 |
opendevreview | Maxim Sava proposed openstack/tempest master: Add image admin test to list public image https://review.opendev.org/c/openstack/tempest/+/886340 | 12:19 |
melwitt | does anyone know if there is anything in flight for addressing job timeouts? I feel like I'm getting a timeout every run ... should we consider increasing some of the timeouts or? | 17:32 |
gmann | melwitt: I am also seeing that since last week, nothing is up to fix it as we do not know the reason why it started now | 17:43 |
gmann | some jobs like tempest-slow are already at max time, which I think means the tests need to be split, but for other jobs something else is slow as they do not have a large number of tests | 17:43 |
gmann | and we already refactored tempest-full and few more job in last month or so | 17:44 |
melwitt | gmann: ack, thanks. weirdly one I'm seeing a lot is tempest-integrated-compute | 17:46 |
gmann | humm | 17:46 |
melwitt | but I see some pretty long test times there like | 17:47 |
melwitt | 2023-06-28 17:14:13.107467 | controller | {0} tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_cross_tenant_traffic [187.746724s] ... ok | 17:47 |
melwitt | 2023-06-28 17:16:36.239756 | controller | {0} tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_in_tenant_traffic [143.121620s] ... ok | 17:47 |
melwitt | aside, I think it would be nice to have stestr output the names of the N slowest tests after the result to make it easier to look at the slowest ones. if I can figure out where to do that I will propose a patch | 17:48 |
gmann | I think those are ok, the actual issue is that three workers are idle and the 4th one got all the remaining test execution, so it times out | 17:50 |
gmann | https://ca38b56652cffb3b5e17-289b34983b13ece36a1a19c591f4d0ce.ssl.cf5.rackcdn.com/877056/5/check/tempest-integrated-compute/87764a4/controller/logs/stackviz/index.html#/stdin/timeline?test=tempest.api.compute.servers.test_server_rescue.ServerStableDeviceRescueTest.test_stable_device_rescue_disk_scsi | 17:50 |
gmann | we have seen these before also when dansmith reported unbalancing of test run on all workers | 17:50 |
gmann | not sure how to solve this. one way is to group the tests but that can make the situation worse in many cases | 17:51 |
gmann | melwitt: +1 that will be good data. | 17:58 |
gmann | and in this timeout, we can see worker3 is taking another 30 min when all other workers are idle https://zuul.opendev.org/t/openstack/build/87764a41607b420297f35d4e13a2b4d9/log/job-output.txt#21053-21058 | 17:58 |
gmann | this is success case with distributed load on all worker https://zuul.opendev.org/t/openstack/build/fc83c60a34504c8da3dab0bd1fef83c3/log/job-output.txt#21002-21007 | 17:58 |
melwitt | gmann: oh, nice insight. I didn't even notice that | 17:59 |
melwitt | I'll probably poke at it a bit to see if I get any ideas | 17:59 |
gmann | one idea I see is to increase the concurrency in tempest run, so if we have any node with more than 4 cpus it will utilize all cpus to run tests | 18:00 |
melwitt | gmann: ah yeah.. by default stestr uses concurrency based on how many cpus detected but I guess we must be setting it to 4 always | 18:18 |
opendevreview | Ghanshyam proposed openstack/tempest master: Increase concurrency for tempest-slow jobs https://review.opendev.org/c/openstack/tempest/+/887218 | 18:22 |
gmann | melwitt: ^^ one thing is this: we set it to two on the slow job and that might be the cause of the long run time in this job | 18:23 |
gmann | but for the rest I think we are letting stestr decide; checking if we are restricting it to 4 somewhere | 18:23 |
melwitt | I feel like i saw something recently /me looks | 18:24 |
melwitt | not sure what I was thinking of, but while looking I found this https://opendev.org/openstack/tempest/src/branch/master/roles/run-tempest/tasks/main.yaml#L18 | 18:30 |
melwitt | I dunno whether that connects to stestr somehow | 18:31 |
melwitt | oh yeah, derp https://opendev.org/openstack/tempest/src/branch/master/roles/run-tempest/tasks/main.yaml#L126 | 18:31 |
opendevreview | Dan Smith proposed openstack/devstack master: nova: Bump timeout-per-gb for BFV rebuild ops https://review.opendev.org/c/openstack/devstack/+/887110 | 18:32 |
melwitt | iiuc that is setting concurrency to be number of cpus // 2 | 18:32 |
* dansmith | is not very smart | 18:32 |
gmann | melwitt: yeah, I am looking at the same. it turns 8 cores into 4 here | 18:32 |
gmann | let me remove this logic and not set concurrency as default_concurrency | 18:33 |
melwitt | ++ | 18:33 |
gmann | and this confirms 4 is passed to tempest run and to stestr https://zuul.opendev.org/t/openstack/build/fc83c60a34504c8da3dab0bd1fef83c3/log/job-output.txt#21008 | 18:34 |
melwitt | yeah, I def saw that early on but didn't know where it was coming from until now | 18:34 |
melwitt | aside, looks like mtreinish had worked on dynamic worker balance in the past https://github.com/mtreinish/stestr/pull/271 but still WIP | 18:38 |
melwitt | I was thinking that's what we need and then wondered if anyone proposed it yet, and the answer is yes :) | 18:39 |
opendevreview | Ghanshyam proposed openstack/tempest master: Do not set the concurrency for tempest run https://review.opendev.org/c/openstack/tempest/+/887220 | 18:41 |
gmann | melwitt: ah right, that will be good way | 18:42 |
dansmith | dynamic balance would be hella better | 18:52 |
gmann | yeah | 18:54 |
opendevreview | Ghanshyam proposed openstack/tempest master: Do not set the concurrency for tempest run https://review.opendev.org/c/openstack/tempest/+/887220 | 19:43 |
opendevreview | Dan Smith proposed openstack/devstack master: nova: Bump timeout-per-gb for BFV rebuild ops https://review.opendev.org/c/openstack/devstack/+/887110 | 21:13 |
opendevreview | Ghanshyam proposed openstack/tempest master: Do not set the concurrency for tempest run https://review.opendev.org/c/openstack/tempest/+/887220 | 21:17 |
opendevreview | Ghanshyam proposed openstack/tempest master: Run slow tests parallely https://review.opendev.org/c/openstack/tempest/+/887237 | 23:52 |
gmann | tempest-slow job runs tests serially, so increasing the concurrency has no impact. I think running in parallel is not causing much issue, let's try that in main job too | 23:53 |
opendevreview | Ghanshyam proposed openstack/tempest master: Run slow tests parallely https://review.opendev.org/c/openstack/tempest/+/887237 | 23:56 |
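
Editor's note on melwitt's 17:48 suggestion (list the N slowest tests) and gmann's 17:50 point about one worker carrying the tail of the run: the per-test lines quoted at 17:47 already contain a worker index (`{0}`), a test id, and a duration, so both views can be pulled from `job-output.txt` after the fact. The following is a minimal sketch, not a proposed stestr patch. The regular expression is inferred only from the two lines quoted in the log, and the function names (`parse_results`, `slowest`, `worker_totals`) are hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the two result lines quoted in the log, e.g.
# "{0} tempest.scenario....test_cross_tenant_traffic [187.746724s] ... ok"
# Assumption: other result lines in job-output.txt follow the same shape.
LINE_RE = re.compile(
    r"\{(?P<worker>\d+)\}\s+(?P<test>\S+)\s+"
    r"\[(?P<secs>\d+(?:\.\d+)?)s\]\s+\.\.\.\s+(?P<status>\w+)"
)

# The two lines pasted by melwitt at 17:47, used here as sample input.
SAMPLE = [
    "2023-06-28 17:14:13.107467 | controller | {0} "
    "tempest.scenario.test_security_groups_basic_ops."
    "TestSecurityGroupsBasicOps.test_cross_tenant_traffic [187.746724s] ... ok",
    "2023-06-28 17:16:36.239756 | controller | {0} "
    "tempest.scenario.test_security_groups_basic_ops."
    "TestSecurityGroupsBasicOps.test_in_tenant_traffic [143.121620s] ... ok",
]


def parse_results(lines):
    """Yield (worker, test_id, seconds, status) for each line that matches."""
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            yield int(m["worker"]), m["test"], float(m["secs"]), m["status"]


def slowest(results, n):
    """Return the n results with the largest duration, slowest first."""
    return sorted(results, key=lambda r: r[2], reverse=True)[:n]


def worker_totals(results):
    """Sum test durations per worker index to expose uneven load."""
    totals = defaultdict(float)
    for worker, _test, secs, _status in results:
        totals[worker] += secs
    return dict(totals)


if __name__ == "__main__":
    results = list(parse_results(SAMPLE))
    for worker, test, secs, status in slowest(results, 2):
        print(f"{secs:10.3f}s  worker {worker}  {status}  {test}")
    for worker, total in sorted(worker_totals(results).items()):
        print(f"worker {worker}: {total:.3f}s of test time")
```

To use it on a real run, replace `SAMPLE` with the lines of a downloaded `job-output.txt`. A worker whose total is far above the others would match the pattern gmann points out in the failing build.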
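
Editor's note on the 18:32 to 18:52 exchange: two separate effects are discussed, the role halving the CPU count to get the concurrency (melwitt: "number of cpus // 2", gmann: 8 cores become 4), and tests being unevenly spread across workers, which the WIP stestr dynamic balance PR aims to address. The toy model below illustrates both under loud assumptions: the durations are invented, `static_round_robin` is a stand-in for any fixed up-front assignment and is not stestr's actual partitioning, and the `max(1, ...)` floor in `halved_concurrency` is an editorial guess, not something confirmed in the log.

```python
import heapq


def halved_concurrency(cpus):
    """Concurrency as described in the chat: number of cpus // 2.

    The floor of 1 is an assumption added for illustration.
    """
    return max(1, cpus // 2)


def static_round_robin(durations, workers):
    """Assign tests to workers in turn before the run starts.

    Returns the total busy time per worker.
    """
    loads = [0.0] * workers
    for i, secs in enumerate(durations):
        loads[i % workers] += secs
    return loads


def dynamic_pull(durations, workers):
    """Hand each test to whichever worker becomes free first.

    Returns the finish time per worker.
    """
    heap = [(0.0, w) for w in range(workers)]
    heapq.heapify(heap)
    loads = [0.0] * workers
    for secs in durations:
        busy_until, w = heapq.heappop(heap)
        loads[w] = busy_until + secs
        heapq.heappush(heap, (loads[w], w))
    return loads


if __name__ == "__main__":
    workers = halved_concurrency(8)
    # Hypothetical workload: one slow test followed by three fast ones,
    # repeated, so a fixed rotation keeps sending slow tests to worker 0.
    durations = [30.0, 1.0, 1.0, 1.0] * 5
    static = static_round_robin(durations, workers)
    dynamic = dynamic_pull(durations, workers)
    print("workers:", workers)
    print("static  per-worker:", static, "wall time:", max(static))
    print("dynamic per-worker:", dynamic, "wall time:", max(dynamic))
```

In this contrived workload the fixed assignment leaves most workers idle while one finishes the slow tests, which resembles the failing build gmann links at 17:58. Pulling work as workers free up shortens the wall time without changing the worker count, which is why dynamic balancing and raising concurrency are complementary rather than alternatives.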