15:00:27 <slaweq> #startmeeting neutron_ci
15:00:28 <openstack> Meeting started Wed Jul 1 15:00:27 2020 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:29 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:30 <slaweq> hi
15:00:31 <openstack> The meeting name has been set to 'neutron_ci'
15:00:32 <ralonsoh> hi
15:02:30 <slaweq> let's wait a few more minutes for the others
15:02:35 <maciejjozefczyk> \p
15:02:37 <maciejjozefczyk> \o
15:03:56 <slaweq> bcafarel: njohnston: ping :)
15:04:09 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:04:25 <bcafarel> o/
15:04:37 <bcafarel> sorry, I was listening to the Edge session
15:04:42 <slaweq> ahh
15:04:43 <slaweq> ok
15:04:48 <slaweq> I forgot about it
15:04:58 <slaweq> ok, let's start
15:05:00 <slaweq> #topic Actions from previous meetings
15:05:05 <slaweq> bcafarel to check gate status on rocky and queens (uwsgi problem)
15:05:39 <bcafarel> ok, so the patch in neutron itself is not enough, we need multiple fixes in devstack
15:05:58 <bcafarel> the latest iteration in devstack itself looks good to go: https://review.opendev.org/#/c/735615/
15:06:43 <bcafarel> then we will need https://review.opendev.org/#/c/738851/ or something like that on top - with Depends-On I still see failures that should be fixed by the devstack one
15:06:52 <bcafarel> hopefully a recheck once it is merged will be greener
15:07:14 <bcafarel> and once rocky is finally back on track, similar backports for older branches :)
15:07:24 <slaweq> ok
15:07:39 <slaweq> so it seems we are still far from a green gate for queens and rocky
15:08:11 <njohnston> o/
15:08:47 <bcafarel> yup :/ but there is progress at least
15:09:00 <slaweq> thx bcafarel for taking care of it
15:09:20 <slaweq> ok, next one
15:09:25 <slaweq> maciejjozefczyk will check the test_ovsdb_monitor.TestNBDbMonitorOverTcp.test_floatingip_mac_bindings failure in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
15:09:40 <maciejjozefczyk> fixed
15:09:50 <slaweq> thx maciejjozefczyk :)
15:09:55 <maciejjozefczyk> #link https://review.opendev.org/#/c/738415/
15:10:06 <slaweq> so next one
15:10:09 <slaweq> ralonsoh will check the get_datapath_id issues in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
15:10:41 <slaweq> maciejjozefczyk: one question, should we backport it to ussuri?
15:10:42 <ralonsoh> slaweq, sorry, I didn't start debugging
15:11:23 <maciejjozefczyk> slaweq, hmm, we can
15:11:45 <slaweq> maciejjozefczyk: ok, can You propose the backport?
15:11:49 <maciejjozefczyk> slaweq, clicked :D
15:11:52 <slaweq> thx
15:12:05 <slaweq> ralonsoh: sure, I know You were busy with other things
15:12:12 <slaweq> will You try to check that next week?
15:12:15 <ralonsoh> sure
15:12:28 <slaweq> #action ralonsoh will check get_datapath_id issues in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
15:12:29 <slaweq> thx
15:12:37 <slaweq> ok, next one
15:12:42 <slaweq> slaweq will check errors with non existing interface in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
15:12:51 <slaweq> I didn't have too much time to check this one
15:13:57 <slaweq> but from what I looked at today, it seems to me that maybe some tests are "overlapping" and another test cleaned up some port/bridge
15:14:26 <slaweq> I will probably add some additional debug logging to be able to investigate it more when it happens again
15:14:45 <slaweq> and I will try to continue work on it this week
15:14:49 <slaweq> #action slaweq will check errors with non existing interface in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
15:15:01 <slaweq> and the last one
15:15:02 <slaweq> maciejjozefczyk to check the failing neutron-ovn-tempest-ovs-master-fedora periodic job
15:15:11 <maciejjozefczyk> yeah
15:15:15 <maciejjozefczyk> #link https://review.opendev.org/#/c/737984/
15:15:47 <slaweq> I saw that the ovn jobs were failing on this patch this morning
15:15:54 <maciejjozefczyk> the issue is trivial, the ovs compilation code was duplicated in both the ovn and ovs modules
15:15:56 <slaweq> so let's wait and see how it goes now :)
15:16:06 <maciejjozefczyk> yeah, slaweq, this should be fine now
15:16:14 <maciejjozefczyk> (I hope so) :D
15:16:39 <bcafarel> and I think that one may be interesting for the focal transition too
15:16:43 <maciejjozefczyk> This is about cleaning and unifying the way we compile ovs/ovn, so next time we'll need to fix one function instead of two
15:16:48 <maciejjozefczyk> bcafarel, indeed
15:17:25 <slaweq> ++
15:17:27 <slaweq> thx
15:18:03 <slaweq> ok
15:18:17 <slaweq> those are all the actions from last week
15:18:23 <slaweq> let's move on to the next topic
15:18:24 <slaweq> #topic Stadium projects
15:18:59 <slaweq> with the migration to zuul v3, the stadium projects are actually in good shape now
15:19:07 <slaweq> as only the neutron-ovn-grenade job is still missing
15:19:22 <njohnston> \o/
15:19:35 <slaweq> anything else You want to discuss about stadium projects today?
15:21:12 <slaweq> ok, so let's move on
15:21:18 <slaweq> #topic Stable branches
15:21:23 <slaweq> Ussuri dashboard: http://grafana.openstack.org/d/pM54U-Kiz/neutron-failure-rate-previous-stable-release?orgId=1
15:21:25 <slaweq> Train dashboard: http://grafana.openstack.org/d/dCFVU-Kik/neutron-failure-rate-older-stable-release?orgId=1
15:22:05 <slaweq> from what I was seeing this week, the branches which are not EM are running pretty well now
15:22:09 <bcafarel> I mostly looked at rocky this week, but I think ussuri to stein were OK
15:22:14 <slaweq> and the EM ones are red due to known reasons
15:23:14 <slaweq> I thought about one small improvement to save some gate resources: wdyt about moving all non-voting jobs to the experimental queue in the EM branches?
15:23:55 <bcafarel> :) I was thinking about that when filing https://review.opendev.org/#/c/738851/
15:24:20 <slaweq> :)
15:24:28 <slaweq> so there are at least 2 of us
15:24:42 <slaweq> that would save about 4-6 jobs, mostly multinode
15:24:55 <slaweq> so pretty many vms spawned to test each patch
15:25:16 <bcafarel> the probability is low that someone will work on fixing them in EM, and I don't think anyone checks their results in backports
15:25:26 <slaweq> exactly
15:25:37 <slaweq> almost nobody is checking non-voting jobs even in master
15:27:31 <slaweq> ralonsoh: njohnston any thoughts?
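[Editor's note: the experimental-queue move slaweq proposes above would be a zuul layout change in the affected stable branches. A hypothetical sketch is below — the job names are examples only and the exact layout depends on each branch's .zuul.yaml; jobs listed under the experimental pipeline only run when someone leaves a "check experimental" review comment, so they stop consuming node resources on every patch set.]

```yaml
# Hypothetical sketch: non-voting jobs move from the check pipeline
# to experimental, so they run only on demand in EM branches.
- project:
    check:
      jobs:
        - neutron-functional
    experimental:
      jobs:
        - neutron-tempest-dvr-ha-multinode-full
        - neutron-tempest-iptables_hybrid
```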
15:27:43 <slaweq> if You are ok with this I will propose such a patch this week
15:27:44 <ralonsoh> agree with removing them
15:28:18 <njohnston> I absolutely agree
15:28:27 <njohnston> There is no reason to be using those resources
15:28:47 <slaweq> ok
15:28:49 <slaweq> thx
15:28:54 <slaweq> so I will propose such a change
15:29:12 <slaweq> #action slaweq to move non-voting jobs to experimental queue in EM branches
15:29:32 <slaweq> anything else regarding stable branches? or can we move on?
15:29:50 <bcafarel> nothing from me
15:30:44 <njohnston> nothing from me
15:30:54 <slaweq> ok, let's move on then
15:30:56 <slaweq> #topic Grafana
15:31:02 <slaweq> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:32:09 <slaweq> I don't see any serious problems there
15:33:16 <slaweq> I pushed a small patch to update the dashboard: https://review.opendev.org/738784
15:34:32 <slaweq> ok, let's talk about some specific issues
15:34:34 <slaweq> #topic Tempest/Scenario
15:34:48 <slaweq> I found a few failures for which I opened LPs
15:34:56 <slaweq> the first one is
15:35:04 <slaweq> tempest.api.network.admin.test_routers.RoutersAdminTest.test_create_router_set_gateway_with_fixed_ip
15:35:09 <slaweq> in the job neutron-tempest-dvr-ha-multinode-full
15:35:19 <slaweq> it happens very often
15:35:28 <slaweq> like e.g.: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_2c4/734876/1/check/neutron-tempest-dvr-ha-multinode-full/2c466a0/testr_results.html
15:35:35 <slaweq> Bug reported: https://bugs.launchpad.net/neutron/+bug/1885897
15:35:35 <openstack> Launchpad bug 1885897 in neutron "Tempest test_create_router_set_gateway_with_fixed_ip test is failing often in dvr scenario job" [High,Confirmed]
15:35:51 <slaweq> in fact this is the main reason for the frequent failures of this job
15:36:34 <slaweq> any volunteer to check that one?
15:36:43 <slaweq> if not, I will try to check it this week 15:36:44 <ralonsoh> sorry, not this week 15:37:10 <slaweq> ok, I will check it 15:37:29 <slaweq> #action slaweq to investigate failures in test_create_router_set_gateway_with_fixed_ip test 15:37:41 <slaweq> I think that this may be some tempest cleaning issue maybe 15:37:45 <slaweq> or something like that 15:37:50 <slaweq> as it happens very often 15:38:11 <slaweq> next one is related to qos test 15:38:25 <slaweq> I found it in neutron-tempest-plugin-scenario-openvswitch job but I think it may happen also in other jobs 15:38:30 <slaweq> neutron_tempest_plugin.scenario.test_qos.QoSTest.test_qos_basic_and_update 15:38:35 <slaweq> https://20f4a85411442f4e3555-9f5a5e2736e26bdd8715596753fafe10.ssl.cf2.rackcdn.com/734876/1/check/neutron-tempest-plugin-scenario-openvswitch/a31f86b/testr_results.html 15:38:42 <slaweq> Bug reported https://bugs.launchpad.net/neutron/+bug/1885899 15:38:42 <openstack> Launchpad bug 1885899 in neutron "test_qos_basic_and_update test is failing" [Critical,Confirmed] 15:38:55 <slaweq> seems like nc wasn't spawned properly - maybe we should add additional logging in https://github.com/openstack/neutron-tempest-plugin/blob/master/neutron_tempest_plugin/common/utils.py#L122 ? 15:40:14 <slaweq> anyone wants to check that? 15:40:27 <ralonsoh> but I remember you changed the way to spawn nc 15:40:31 <ralonsoh> making it more reliable 15:40:45 <slaweq> yes 15:40:57 <ralonsoh> sorry, but next week 15:41:01 <ralonsoh> not this one 15:41:18 <slaweq> ok, lets keep it unassigned, maybe someone will want to check it 15:41:29 <slaweq> I marked it as critical because it impacts voting jobs 15:41:47 <slaweq> ok, next one 15:42:07 <slaweq> this is related only to the ovn based jobs where test neutron_tempest_plugin.scenario.test_connectivity.NetworkConnectivityTest.test_connectivity_through_2_routers is failing 15:42:14 <slaweq> like e.g. 
https://4ec598fcefc6b0367120-6910015cdc6b96c34eca0ab65a68e7f2.ssl.cf5.rackcdn.com/696926/18/check/neutron-ovn-tempest-full-multinode-ovs-master/c1c51ca/testr_results.html 15:42:25 <slaweq> Bug reported: https://bugs.launchpad.net/neutron/+bug/1885898 15:42:25 <openstack> Launchpad bug 1885898 in neutron "test connectivity through 2 routers fails in neutron-ovn-tempest-full-multinode-ovs-master job" [High,Confirmed] 15:42:39 <slaweq> maciejjozefczyk: will You have time to take a look into this? 15:43:10 <maciejjozefczyk> slaweq, You have my sword 15:43:20 <slaweq> maciejjozefczyk: thx a lot 15:43:33 <bcafarel> :) 15:43:34 <slaweq> #action maciejjozefczyk to check neutron_tempest_plugin.scenario.test_connectivity.NetworkConnectivityTest.test_connectivity_through_2_routers failure in ovn jobs 15:44:43 <slaweq> maciejjozefczyk: there is also another failure in ovn based jobs 15:44:44 <slaweq> neutron_tempest_plugin.scenario.test_trunk.TrunkTest.test_trunk_subport_lifecycle 15:44:48 <slaweq> https://7bea12b2d1429b68c6c8-10caedded388001c6bbc38619ca4b324.ssl.cf2.rackcdn.com/737047/8/check/neutron-ovn-tempest-full-multinode-ovs-master/592e31b/testr_results.html 15:44:56 <slaweq> Bug reported: https://bugs.launchpad.net/neutron/+bug/1885900 15:44:56 <openstack> Launchpad bug 1885900 in neutron "test_trunk_subport_lifecycle is failing in ovn based jobs" [Critical,Confirmed] 15:44:57 <maciejjozefczyk> slaweq, yeah notiecd that bug :( 15:45:08 <maciejjozefczyk> that fails pretty often now? 
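[Editor's note: the "additional logging" slaweq suggested earlier for the wait helper in neutron_tempest_plugin/common/utils.py (bug 1885899) could look roughly like the sketch below. This is a hedged illustration of the idea, not the actual utils.py code; the function name, signature, and exception type are modeled on a generic wait_until_true-style polling helper and may differ from the real one.]

```python
import logging
import time

LOG = logging.getLogger(__name__)


class WaitTimeout(Exception):
    """Raised when the predicate does not become true in time."""


def wait_until_true(predicate, timeout=60, sleep=1, exception=None):
    """Poll predicate until it returns True, logging each failed attempt.

    The per-attempt debug line is the point of the sketch: when a test
    like test_qos_basic_and_update times out waiting for nc, the logs
    then show how many attempts ran and for how long before the failure.
    """
    start = time.monotonic()
    attempt = 0
    while not predicate():
        attempt += 1
        elapsed = time.monotonic() - start
        LOG.debug("wait_until_true: attempt %d still failing after %.1fs",
                  attempt, elapsed)
        if elapsed > timeout:
            if exception is not None:
                raise exception
            raise WaitTimeout("Timed out after %d seconds" % timeout)
        time.sleep(sleep)
```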
15:45:14 <slaweq> You probably won't have time to work on both this week but please keep it in mind :)
15:45:54 <slaweq> I saw it a couple of times at least in the last week
15:46:12 <slaweq> see http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%20%5C%22line%20240%2C%20in%20test_trunk_subport_lifecycle%5C%22
15:46:47 <slaweq> and that's all I had for today
15:47:03 <slaweq> I opened an LP for each of the failures so we can track them there
15:47:07 <maciejjozefczyk> slaweq, thanks for the link
15:47:16 <slaweq> maciejjozefczyk: yw :)
15:47:26 <slaweq> anything else You want to talk about today?
15:48:42 * bcafarel looks for correct tab to copy link
15:49:31 <bcafarel> https://review.opendev.org/#/c/738163/ - I started to change a few bits for focal
15:49:46 <bcafarel> still an early WIP but if you want to add stuff, please do so
15:50:18 <slaweq> thx bcafarel
15:50:21 <maciejjozefczyk> sure bcafarel, thanks
15:50:30 <maciejjozefczyk> ovn fails in all the cases :D
15:50:41 <slaweq> I added myself to the reviewers to stay up to date with this :)
15:50:50 <slaweq> damn ovn :P
15:50:51 <njohnston> +1
15:50:58 <bcafarel> :) yes it needs more work on the skip ovs/ovn compilation bits
15:51:02 <slaweq> we should move it to the stadium projects ;P
15:51:12 <bcafarel> ahah
15:51:19 <slaweq> maciejjozefczyk: wdyt?
15:51:21 <slaweq> :D
15:51:50 <njohnston> ROFL
15:53:00 <slaweq> ok, I think we can finish this meeting now
15:53:04 <slaweq> thx for attending
15:53:08 <slaweq> and see You next week :)
15:53:12 <slaweq> #endmeeting