15:00:14 <slaweq> #startmeeting neutron_ci
15:00:16 <openstack> Meeting started Wed Apr 15 15:00:14 2020 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:17 <slaweq> hi
15:00:17 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:20 <openstack> The meeting name has been set to 'neutron_ci'
15:00:39 <njohnston> o/
15:01:19 <slaweq> hi njohnston
15:01:26 <ralonsoh> hi
15:01:27 <bcafarel> o/
15:01:31 <njohnston> hello, how are you? Did you have a good dyngus day?
15:01:33 <slaweq> hi
15:01:39 <slaweq> njohnston: yes, thx
15:01:42 <slaweq> I had
15:01:49 <slaweq> do You have it also in the US?
15:02:12 <njohnston> my father's family is from Buffalo, New York, which has one of the most active dyngus day celebrations in the US
15:02:23 <slaweq> nice :)
15:02:37 <lajoskatona> Hi
15:03:18 <slaweq> I was splashing water on kids from my neighborhood from the window on the first floor
15:03:21 <slaweq> it was fun :)
15:03:24 <slaweq> hi lajoskatona :)
15:03:31 <slaweq> ok, let's start the meeting
15:03:43 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:03:52 <slaweq> #topic Actions from previous meetings
15:04:01 <slaweq> first action
15:04:02 <slaweq> slaweq to continue investigation of fullstack SG test broken pipe failures
15:04:31 <slaweq> I found out that in some cases it may be a race condition between starting the client and reading from it
15:05:28 <slaweq> so the solution IMHO is to handle BrokenPipeException in the same way as e.g. RuntimeError
15:05:35 <njohnston> makes sense
15:05:40 <slaweq> the patch is here https://review.opendev.org/#/c/718781
15:05:49 <slaweq> please review if You have some time
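For reference, a minimal sketch of the idea discussed above: treat a broken pipe from a just-started client the same way as the "not ready yet" error the test already retries on. The helper name and the retry loop are hypothetical illustrations only, not the actual code of the patch in review 718781.

    import subprocess
    import time


    def write_when_ready(client_cmd, payload, attempts=10, interval=1.0):
        """Write bytes to a freshly started client, retrying until it is ready."""
        for _ in range(attempts):
            proc = subprocess.Popen(client_cmd, stdin=subprocess.PIPE)
            try:
                proc.stdin.write(payload)
                proc.stdin.flush()
                return proc
            except BrokenPipeError:
                # The client went away before its connection was established,
                # which is the race described above; treat it like the
                # RuntimeError case and retry instead of failing the test.
                proc.kill()
                proc.wait()
                time.sleep(interval)
        raise RuntimeError("client did not become ready in time")

Retrying on BrokenPipeError keeps the test robust against the startup race instead of failing the whole run on the first broken pipe.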
15:06:06 <slaweq> next one
15:06:08 <slaweq> slaweq to check server termination on multicast test
15:06:17 <slaweq> I finally dug into it
15:07:12 <slaweq> and I found out that in the tempest "all-plugin" env there is a 1200-second timeout set for each test
15:07:38 <slaweq> and in the case of our tests which require the advanced image it may simply not be enough
15:07:51 <slaweq> so we are hitting this timeout e.g. during the cleanup phase
15:08:11 <bcafarel> where do we ask again for more powerful hardware? ;)
15:08:13 <slaweq> I proposed the patch https://review.opendev.org/#/c/719927/ to increase this timeout for some tests only
15:08:41 <slaweq> but as I talked with gmann today, it seems that a better way would be to do it like e.g. ironic did: https://opendev.org/openstack/ironic/src/branch/master/zuul.d/ironic-jobs.yaml#L414
15:08:51 <slaweq> and set a higher timeout globally, "per job"
15:09:02 <slaweq> so I will update my patch and do it that way
15:09:18 <slaweq> bcafarel: I think that it's more a matter of nested virtualisation
15:09:26 <slaweq> it would be much better if that finally worked fine
15:09:49 <bcafarel> oh yes
15:10:15 <slaweq> anyway, that's all about that one from me
15:10:21 <slaweq> questions/comments?
15:10:25 <lajoskatona> bcafarel: from fortnebula there is an option to fetch bigger VMs
15:10:43 <lajoskatona> but that is just temporary for testing things
15:10:58 <bcafarel> slaweq: so you will update it to use a longer overall timeout? (which may help for later "long" tests)
15:11:00 <lajoskatona> at least it was when I last needed more MEM
15:11:09 <bcafarel> nice, I did not know that
15:11:22 <bcafarel> though hopefully just larger timeouts will be enough
15:11:52 <slaweq> bcafarel: yes, I will update it to set a longer timeout per test in the job definition
15:12:03 <slaweq> it's not "job's timeout" but "test's timeout"
15:12:26 <bcafarel> ok sounds good
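For reference, a minimal sketch of what that per-test timeout means in practice: in the usual OpenStack pattern the job definition exports OS_TEST_TIMEOUT and each test arms its own timer from it, so raising the value in the zuul job (as in the ironic jobs linked above) lengthens the limit for every individual test rather than for the job as a whole. The class below only illustrates that mechanism under this assumption; it is not tempest's actual base class, and the class name is made up.

    import os

    import fixtures
    import testtools


    class TimeLimitedTestCase(testtools.TestCase):
        """Sketch: each test arms its own timer from OS_TEST_TIMEOUT."""

        def setUp(self):
            super().setUp()
            # Raising OS_TEST_TIMEOUT in the job definition changes this
            # per-test limit (which also covers the cleanup phase), not the
            # overall zuul job timeout.
            timeout = int(os.environ.get('OS_TEST_TIMEOUT', 0))
            if timeout > 0:
                self.useFixture(fixtures.Timeout(timeout, gentle=True))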
15:12:57 <slaweq> ok, next one
15:12:59 <slaweq> slaweq to ping yamamoto about midonet gate problems
15:13:05 <slaweq> I sent him an email last week
15:13:10 <slaweq> but he didn't reply
15:13:42 <slaweq> I will try to catch him during the drivers meeting, because for now networking-midonet's gate is still broken
15:14:20 <slaweq> #action slaweq to ping yamamoto about midonet gate problems
15:14:23 <lajoskatona> I looked for him as well to ask about taas, as the reviews are stopped there
15:14:54 <slaweq> he's not very active in the community these days
15:15:38 <slaweq> and I think he is the only "active" maintainer of networking-midonet
15:16:21 <slaweq> ok, last one from previous week
15:16:23 <slaweq> bcafarel to check and update stable branches grafana dashboards
15:17:26 <bcafarel> done in https://review.opendev.org/#/c/718676/, in the end that was almost a full rewrite
15:17:38 <slaweq> thx bcafarel
15:17:44 <bcafarel> the stable dashboards are quite close to the neutron master ones now:
15:17:53 <bcafarel> http://grafana.openstack.org/d/pM54U-Kiz/neutron-failure-rate-previous-stable-release?orgId=1
15:18:00 <bcafarel> http://grafana.openstack.org/d/dCFVU-Kik/neutron-failure-rate-older-stable-release?orgId=1
15:18:16 <njohnston> thanks bcafarel!
15:19:09 <slaweq> I added the links to the meeting agenda
15:19:31 <slaweq> ok, I think that this is all regarding actions from last week
15:19:34 <slaweq> #topic Stadium projects
15:19:46 <slaweq> any updates about zuulv3?
15:21:19 <slaweq> ok, I guess this silence means "no updates" :)
15:21:35 <njohnston> nothing from me
15:21:36 <lajoskatona> not from me at least
15:21:46 <njohnston> networking-odl still has https://review.opendev.org/#/c/672925/ pending
15:21:56 <slaweq> ok
15:22:04 <njohnston> and networking-midonet still has nothing IIUC
15:22:10 <slaweq> I also don't have any updates about IPv6-only testing
15:22:34 <slaweq> regarding stadium issues, I know only about this one with networking-midonet which we already talked about
15:22:49 <slaweq> any other issues/questions regarding stadium and ci?
15:24:08 <njohnston> nope
15:24:31 <bcafarel> not from me either
15:24:34 <ralonsoh> no
15:24:53 <slaweq> ok, let's move on
15:24:58 <slaweq> #topic Stable branches
15:25:03 <slaweq> Train dashboard: http://grafana.openstack.org/d/pM54U-Kiz/neutron-failure-rate-previous-stable-release?orgId=1
15:25:05 <slaweq> Stein dashboard: http://grafana.openstack.org/d/dCFVU-Kik/neutron-failure-rate-older-stable-release?orgId=1
15:25:56 <slaweq> I don't see too much data on those dashboards so far
15:26:28 <njohnston> there rarely is
15:27:28 <ralonsoh> I need to review neutron-ovn-tempest-ovs-release, it's failing in all jobs
15:27:55 <slaweq> ralonsoh: but are You talking about stable branches or master?
15:28:02 <ralonsoh> sorry, master
15:28:06 <ralonsoh> my bad
15:28:24 <bcafarel> fullstack hopefully will be better with the backports of your fixes, slaweq
15:28:40 <slaweq> bcafarel: we will see :)
15:28:56 <slaweq> there is also this rally issue https://bugs.launchpad.net/neutron/+bug/1871596
15:28:56 <openstack> Launchpad bug 1871596 in neutron "Rally job on stable branches is failing" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
15:28:56 <bcafarel> grenade, yes, is one of the common recheck causes in stable (though luckily not that bad)
15:29:03 <slaweq> but we should be good with it now, right?
15:29:29 <bcafarel> yep, andreykurilin has merged the change to fix rocky
15:29:37 <slaweq> great
15:29:40 <bcafarel> (and networking-ovn stein/rocky also)
15:29:44 <slaweq> thx lajoskatona and bcafarel for taking care of this
15:29:59 <bcafarel> just waiting for some rechecks to complete to mark it as "back in working order"!
15:29:59 <slaweq> bcafarel: so we can close this LP, right?
15:31:43 <lajoskatona> good to hear :-)
15:31:52 <slaweq> today I got a couple of +1s from zuul for various stable branches, so IMO it's fine now
15:32:02 <bcafarel> ok, checking the last results it does indeed look good
15:32:09 <bcafarel> one LP down
15:32:14 <slaweq> \o/
15:32:20 <lajoskatona> ok, I will abandon my rally devstack plugin capping patch then
15:32:25 <slaweq> thx lajoskatona
15:33:01 <slaweq> anything else related to stable branches for today?
15:34:12 <slaweq> ok, let's move on
15:34:14 <slaweq> #topic Grafana
15:34:42 <slaweq> I still need to update my patch https://review.opendev.org/#/c/718392/ - thx bcafarel for the review
15:35:26 <bcafarel> getting the correct position in the file will probably take longer than fixing my nit :)
15:35:42 <slaweq> ralonsoh: speaking about ovn jobs, it seems that all of them except "slow" are going up in failure rate today
15:35:50 <slaweq> so probably there is some problem with those jobs
15:36:42 <slaweq> https://e93bf74b2537cfa96a59-e4f20cff14b59b3a1c5b0d28b2b173f9.ssl.cf5.rackcdn.com/719765/3/check/neutron-ovn-tempest-ovs-release/0982c80/testr_results.html
15:37:13 <slaweq> seems that this subnetpool test is the culprit
15:37:41 <ralonsoh> I'll take a look today
15:37:53 <slaweq> ralonsoh: thx
15:38:08 <slaweq> #action ralonsoh to check ovn jobs failure
15:38:48 <slaweq> apart from that, I think that things look pretty good this week
15:39:14 <slaweq> even the functional jobs are finally working better thanks to maciejjozefczyk's and ralonsoh's work
15:40:56 <slaweq> anything else regarding grafana? do You see anything which worries You?
15:41:53 <maciejjozefczyk> ralonsoh, I can help if you need any help :)
15:42:52 <slaweq> thx maciejjozefczyk for volunteering :)
15:43:20 <ralonsoh> thanks a lot!
15:43:38 <slaweq> ok, regarding other issues, I don't have anything really new for today
15:44:11 <slaweq> so if You don't have anything else to talk about today, I will give You some time back :)
15:44:43 <bcafarel> I won't complain that we have less to talk about in this meeting :)
15:44:51 <maciejjozefczyk> :)
15:44:52 <slaweq> bcafarel: :)
15:45:11 <njohnston> +100
15:45:41 <slaweq> ok, so thx for attending and see You online :)
15:45:42 <slaweq> o/
15:45:45 <slaweq> #endmeeting