16:00:01 #startmeeting neutron_ci
16:00:01 Meeting started Tue Sep 24 16:00:01 2019 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:05 o/
16:00:06 The meeting name has been set to 'neutron_ci'
16:00:07 hi (again) :)
16:00:28 hi
16:00:31 long time no see :)
16:00:51 lol
16:01:13 :)
16:01:17 ok, let's start
16:01:26 #topic Actions from previous meetings
16:01:30 first one is
16:01:40 mlavalle to continue investigating router migrations issue
16:02:32 but I think mlavalle is not here now
16:02:34 (also open up Grafana dashboard in some tab: http://grafana.openstack.org/dashboard/db/neutron-failure-rate )
16:02:39 bcafarel: right
16:02:48 I forgot about it, thx for reminder
16:03:08 let's move on to the second action from last week
16:03:18 ralonsoh to check if https://review.opendev.org/#/c/679428/ should be backported to stable branches
16:03:33 no, that's not needed
16:03:43 this method is not in stable branches
16:04:17 ralonsoh: thx, that's good
16:04:27 ok, next one
16:04:29 ralonsoh to report bug and investigate issue with neutron_tempest_plugin.scenario.test_qos.QoSTest.test_qos_basic_and_update
16:05:08 yes, that was related to https://bugs.launchpad.net/neutron/+bug/1833721
16:05:08 Launchpad bug 1833721 in neutron "ip_lib synchronized decorator should wrap the privileged one" [Medium,Fix released] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
16:05:20 #link https://review.opendev.org/#/c/683109/
16:05:51 really
16:06:00 ?
16:06:15 sorry
16:06:18 but why was it always causing failure of this one test? do You know?
16:06:34 yes, the problem was related to an ip_lib method timeout
16:07:14 sorry, I mixed up the bugs
16:07:21 #link https://bugs.launchpad.net/neutron/+bug/1844516
16:07:21 Launchpad bug 1844516 in neutron "[neutron-tempest-plugin] SSH timeout exceptions when executing remote commands" [Medium,Fix released] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
16:07:36 #link https://review.opendev.org/#/c/682864/
16:07:49 that was the real problem for the qos test
16:08:11 ralonsoh: ok, that sounds more correct :)
16:08:26 so we should be fine with this issue now, thx a lot
16:08:37 during the BW test the ssh connection was raising a timeout exception
16:08:47 so this patch tries to mitigate that
16:10:18 (that's all from my side)
16:10:22 thx ralonsoh
16:10:48 ok, next one
16:10:51 also on ralonsoh :)
16:10:56 ralonsoh to report bug with "Multiple possible networks found"
16:11:07 yes one sec
16:11:40 #link https://bugs.launchpad.net/tempest/+bug/1844568
16:11:40 Launchpad bug 1844568 in tempest "[compute] "create_test_server" if networks is undefined and more than one network is present" [Undecided,In progress] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
16:11:48 and the patch #link https://review.opendev.org/#/c/682964/
16:12:10 the problem is, sometimes, when a port is created without specifying the network
16:12:15 ahh, You even created patch. Nice :)
16:12:26 there is more than one network, so Nova fails
16:12:50 with exception "NetworkAmbiguous"
16:12:52 that's all
16:13:11 thx for update ralonsoh
16:13:25 and the last one was
16:13:27 slaweq to report bug with fullstack test_bw_limit_qos_port_removed test
16:13:37 I reported bug today https://bugs.launchpad.net/neutron/+bug/1845176
16:13:38 Launchpad bug 1845176 in neutron "Removing of QoS queue in neutron-ovs-agent fails due to existing references" [Medium,Confirmed]
16:13:46 but didn't have time to look into it more
16:14:41 maybe I can take a look at this one tomorrow
16:14:49 I will leave it unassigned for now, maybe someone will want to take a look
16:14:55 I'll ping you if I start looking at this one
16:14:56 if not I will try to check it
16:15:02 ralonsoh: sure, thx :)
16:15:24 those were all the actions from last week which I had
16:15:32 so let's move on to the next topic
16:15:33 #topic Stadium projects
16:15:50 Python 3 migration
16:15:56 Stadium projects etherpad: https://etherpad.openstack.org/p/neutron_stadium_python3_status
16:16:13 njohnston: do You have any updates on this maybe?
16:17:22 No, I have not seen lajoskatona or yamamoto online
16:17:36 I need to check on the status of bagpipe, I had thought we were done with that
16:18:33 all reviews seem merged, so maybe just missing an etherpad update yes
16:18:40 it seems so
16:18:58 njohnston: will You check that and update etherpad or should I do it?
16:20:04 I'll do it
16:20:10 njohnston: thx a lot
16:20:21 #action njohnston update python 3 stadium etherpad for bagpipe
16:20:55 ok, next is
16:20:57 tempest-plugins migration
16:21:02 Etherpad: https://etherpad.openstack.org/p/neutron_stadium_move_to_tempest_plugin_repo
16:21:33 for neutron-dynamic-routing patch is ready for review https://review.opendev.org/#/c/652099
16:21:43 please take a look at it if You have a few minutes
16:21:49 thanks tidwellr for fixing my nitpicking comment :)
16:22:21 also related small patch: https://review.opendev.org/#/c/683935/
16:23:19 thx bcafarel +2 already
16:23:33 that was fast!
16:23:39 :)
16:23:47 +2
16:24:17 +2+W
16:24:21 :)
16:24:28 ok, anything else related to stadium projects You want to discuss today?
16:25:03 I had a little gate fix for neutron-fwaas functional tests
16:25:04 https://review.opendev.org/#/c/684386/
16:25:15 take a look; it's blocking the PDF goal merge for that project
16:25:40 njohnston: https://review.opendev.org/#/c/652812/2
16:25:45 this one was first I think :)
16:26:05 and it seems to be fixing same issue, right?
16:26:19 indeed! I did not catch that in my search; we can push that one instead
16:26:37 it was there since April
16:26:39 LOL
16:26:45 I found it just today
16:27:47 njohnston: thx for bringing it up here :)
16:28:05 (we should be more careful when changing class definitions)
16:28:35 (me the first)
16:28:48 yes, we should
16:29:25 that's why at one of the last PTGs we agreed to add some stadium projects' jobs to neutron check queue
16:29:37 but only networking-ovn and midonet were interested in that
16:29:51 other stadium projects didn't propose such jobs
16:30:07 ahh, sorry
16:30:15 there is also ironic job and openstacksdk now
16:30:30 and tripleo :)
16:31:02 lots of fun now
16:31:07 :)
16:31:53 but we will discuss future of stadium projects in Shanghai so we can also think about their CI :)
16:32:07 ok, let's move on
16:32:16 #topic Grafana
16:32:16 sounds like an interesting subtopic yes
16:32:30 http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:33:30 I think we have improved the stability last week
16:34:20 ralonsoh: yes, but I still see quite high numbers on functional/fullstack jobs
16:34:45 and also grenade jobs are failing about 20% this week :/
16:34:56 I'm trying to debug all those FT/fullstack ones
16:35:21 for those I have some examples today
16:35:50 also UT are failing around 20-30% recently
16:36:04 but for those I didn't find any specific issue not related to the patch on which it was running
16:36:13 so I would not bother with that too much for now
16:37:07 slaweq, if you find something suspicious, ping me
16:37:14 ralonsoh: sure, thx
16:37:27 any other findings from grafana?
16:38:05 ok, let's move on then
16:38:45 #topic fullstack/functional
16:38:59 for functional tests I found a couple of issues
16:39:05 neutron.tests.functional.agent.linux.test_l3_tc_lib.TcLibTestCase.test_get_existing_filter_ids
16:39:11 https://99b34ecc9afda69f8d26-4c2619fbf66a72a5befb0d8c52d9c271.ssl.cf1.rackcdn.com/682418/7/check/neutron-functional-python27/fb1cc17/testr_results.html.gz - similar issue like we saw last week also,
16:39:46 but I think this is a problem in testcase library
16:39:48 in py2
16:40:17 but what is causing such problem?
16:40:25 we have to have some trigger for it IMO
16:40:46 slaweq, I tried to debug this but not too much
16:40:53 because it is not happening in Py3
16:41:06 yes, I saw it twice in py2 job
16:41:29 maybe we can just live with it for a few more weeks and then it will be gone when we finally get rid of py2
16:42:02 that's the spirit: if a test fails, remove it!!!!
16:42:11 ralonsoh: LOL
16:42:12 +1
16:42:21 but don't tell anyone ;)
16:42:31 so, use secret.review.openstack.org ?
16:42:38 hehehehe
16:42:38 s/openstack/opendev/
16:42:52 in the past I was proposing here to use https://pypi.org/project/pytest-vw/ module
16:43:04 but we are not using pytest so it's hard :/
16:43:06 :D
16:43:42 ok, now seriously :)
16:43:49 another issue which I found
16:43:57 neutron.tests.functional.agent.linux.test_iptables.IptablesManagerTestCase.test_tcp_output
16:44:02 https://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_8fe/682418/7/gate/neutron-functional/8fe4fa5/testr_results.html.gz - looks like something new for me, needs to be checked,
16:44:15 I didn't find the problem there
16:44:34 IMO, the rule was never applied
16:45:06 ralonsoh: in this test_tcp_output ?
16:45:11 yes
16:45:23 so that is the problem, no?
16:45:28 yes
16:45:38 but I can't confirm this 100%
16:46:12 I will take a look into this more deeply this week
16:46:39 #action slaweq to investigate issue with neutron.tests.functional.agent.linux.test_iptables.IptablesManagerTestCase.test_tcp_output
16:46:57 ok, another one:
16:46:59 neutron.tests.functional.agent.linux.test_keepalived.KeepalivedManagerTestCase.test_keepalived_spawns_conflicting_pid_base_process
16:47:06 https://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c52/683915/2/gate/neutron-functional-python27/c529449/testr_results.html.gz
16:47:15 yes this one
16:47:26 #link https://review.opendev.org/#/c/684249/
16:47:39 #link https://bugs.launchpad.net/neutron/+bug/1845150
16:47:39 Launchpad bug 1845150 in neutron "[FT] "keepalived" needs network interfaces configured as in its own config" [High,In progress] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
16:48:12 ralonsoh: are You sure this is the real reason for the failure?
16:48:25 I was checking journal log from this job today
16:48:39 I think so, the keepalived process stops when eth0 does not have an IP
16:49:11 and this may cause the test to fail
16:49:29 I found that when developing https://review.opendev.org/#/c/681671/
16:50:00 the os.kill() call didn't work because the keepalived process was not there anymore
16:50:03 but I found in journal log many occurrences of "Cannot find an IP address to use for interface"
16:50:11 I know
16:50:16 so it seems that in many other, passing tests it was the same
16:51:17 the problem is that keepalived does not stop working immediately
16:51:51 ahh, so some race
16:51:57 I think so
16:52:07 But, of course, there could be another problem
16:52:09 and in most cases test is finished before keepalived crash
16:52:13 but IMO this patch is legit
16:52:16 makes sense
16:52:24 yes, I just +2'ed this patch
16:53:06 ok, and the last one from functional tests:
16:53:08 neutron.tests.functional.agent.linux.test_bridge_lib.FdbInterfaceTestCase.test_add_delete(no_namespace)
16:53:18 https://9722deaa313ebebb56dc-c08b881decb3106ff13d720dd4a26025.ssl.cf5.rackcdn.com/681846/5/check/neutron-functional-python27/c3dc48a/testr_results.html.gz
16:53:26 again with those test cases (I'm the father)
16:53:36 LOL
16:53:47 I pushed a patch to make the interface names random
16:53:54 this should not happen
16:54:02 I'll take a look at this tomorrow
16:54:07 thx
16:54:29 #action ralonsoh to take a look at failing neutron.tests.functional.agent.linux.test_bridge_lib.FdbInterfaceTestCase.test_add_delete(no_namespace)
16:54:49 ok, that's all for functional tests from me
16:54:59 anything else You have regarding functional job?
16:55:37 ok, I take that as no
16:55:40 so fullstack
16:55:48 neutron.tests.fullstack.test_agent_bandwidth_report.TestPlacementBandwidthReport.test_configurations_are_synced_towards_placement
16:55:54 https://d3359cf0499b7fa3a209-a0ae01eb6742268974ea7eef585da77c.ssl.cf1.rackcdn.com/681846/5/check/neutron-fullstack/014ed7d/testr_results.html.gz
16:56:12 I think that here it would be good if rubasov could take a look maybe
16:56:47 I will ask him tomorrow if he will have some time to take a look
16:57:17 #action slaweq to ask rubasov if he can check neutron.tests.fullstack.test_agent_bandwidth_report.TestPlacementBandwidthReport.test_configurations_are_synced_towards_placement
16:57:31 and the last one for today:
16:57:33 neutron.tests.fullstack.test_l3_agent.TestLegacyL3Agent.test_mtu_update
16:57:37 https://df0eb3e2e26f1607f7d8-b5f72c94f829be93029a2756be493e29.ssl.cf2.rackcdn.com/679813/2/gate/neutron-fullstack/dfbde3f/testr_results.html.gz
16:58:19 I didn't find the error there, the L3 agent is setting the mtu in the interface
16:59:01 I will take a deeper look at that one during this week
16:59:14 #action slaweq to investigate neutron.tests.fullstack.test_l3_agent.TestLegacyL3Agent.test_mtu_update
16:59:22 we are almost out of time now
16:59:51 I just want to ask You for review of 2 patches https://review.opendev.org/#/c/683853/ and https://review.opendev.org/#/c/681607/ if You will have some time
16:59:55 thx in advance :)
17:00:00 and we are out of time
17:00:04 thx for attending
17:00:07 bye
17:00:09 #endmeeting