16:00:47 <slaweq> #startmeeting neutron_ci
16:00:48 <openstack> Meeting started Tue May 21 16:00:47 2019 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:49 <slaweq> hi
16:00:50 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:51 <ralonsoh> hi
16:00:52 <openstack> The meeting name has been set to 'neutron_ci'
16:00:53 <njohnston> o/
16:01:44 <slaweq> let's wait a few more minutes for mlavalle and others
16:02:09 <bcafarel> just passing by to say hi before I leave :)
16:02:17 <slaweq> hi bcafarel :)
16:02:32 <bcafarel> hi and bye!
16:02:50 <njohnston> see you later bcafarel
16:02:53 <haleyb> hi
16:02:59 <slaweq> ok, let's start
16:03:02 <slaweq> first of all
16:03:05 <slaweq> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:03:14 <slaweq> please load it now to have it ready later :)
16:03:42 <slaweq> and one small announcement - I moved the agenda of this meeting to the etherpad: https://etherpad.openstack.org/p/neutron-ci-meetings
16:03:58 <slaweq> so You can take a look at it and add anything You want to it :)
16:04:21 <slaweq> first topic for today
16:04:23 <slaweq> #topic Actions from previous meetings
16:04:27 <slaweq> and first action
16:04:33 <slaweq> mlavalle to continue debugging the reasons for neutron-tempest-plugin-dvr-multinode-scenario failures
16:05:01 <slaweq> I think that mlavalle is not here now
16:05:11 <slaweq> so let's skip to the actions assigned to other people
16:05:26 <slaweq> ralonsoh to debug the issue with the neutron_tempest_plugin.api.admin.test_network_segment_range test
16:06:06 <ralonsoh> slaweq, sorry again but I didn't find anything yet
16:06:19 <slaweq> ok, no problem
16:06:27 <slaweq> I don't think it is a very urgent issue for now
16:06:54 <slaweq> can I assign it to You for next week also?
16:07:12 <ralonsoh> yes but I'll need a bit of help
16:07:17 <ralonsoh> I can't find the problem there
16:07:45 <slaweq> TBH I haven't seen this issue recently
16:08:05 <mlavalle> slaweq: I have to look at an internal issue
16:08:07 <slaweq> so maybe let's just wait until it happens again, then report a proper bug on launchpad and work on it
16:08:16 <slaweq> ralonsoh: how about that?
16:08:19 <ralonsoh> slaweq, perfect
16:08:24 <slaweq> ralonsoh: ok, thx
16:08:29 <slaweq> mlavalle: sure, no problem :)
16:09:59 <slaweq> ok, so let's go back to mlavalle's actions now
16:10:08 <slaweq> mlavalle to continue debugging the reasons for neutron-tempest-plugin-dvr-multinode-scenario failures
16:10:14 <slaweq> any updates on this one?
16:10:20 <mlavalle> I haven't had much time to work on that
16:10:54 <slaweq> ok, can I assign it to You for next week also?
16:10:59 <mlavalle> yes
16:11:02 <slaweq> #action mlavalle to continue debugging the reasons for neutron-tempest-plugin-dvr-multinode-scenario failures
16:11:03 <slaweq> thx :)
16:11:13 <slaweq> so next one
16:11:14 <slaweq> mlavalle to talk with nova folks about slow responses for metadata requests
16:11:27 <mlavalle> didn't have time, sorry :-)
16:11:31 <mlavalle> ;-(
16:11:38 <slaweq> no problem :)
16:11:52 <slaweq> can I assign it to You for next week then?
16:11:56 <mlavalle> yes
16:12:12 <slaweq> #action mlavalle to talk with nova folks about slow responses for metadata requests
16:12:13 <slaweq> thx
16:12:20 <slaweq> and the last one:
16:12:22 <slaweq> slaweq to fix number of tempest-slow-py3 jobs in grafana
16:12:32 <slaweq> I didn't, but haleyb fixed it
16:12:34 <slaweq> thx haleyb :)
16:12:45 <slaweq> any questions/comments?
16:12:50 <haleyb> :)
16:13:12 <haleyb> i don't know what i did though
16:13:47 <slaweq> You fixed the number of jobs in the grafana dashboard's config :)
16:14:11 <slaweq> I don't have the link to the patch now but it was already merged for sure
16:14:41 <haleyb> oh yes, that one
16:14:50 <slaweq> yep
16:14:59 <slaweq> ok, let's move forward then
16:15:02 <slaweq> next topic
16:15:04 <slaweq> #topic Stadium projects
16:15:10 <slaweq> Python 3 migration
16:15:18 <slaweq> Stadium projects etherpad: https://etherpad.openstack.org/p/neutron_stadium_python3_status
16:15:23 <slaweq> any updates on this?
16:16:04 <mlavalle> I have to talk to yamamoto about this, in regards to midonet
16:17:43 <njohnston> did not have a chance this week to work on it; current state of the fwaas tests in neutron-tempest-plugin is that they wait forever and all tests die with timeouts
16:18:08 <slaweq> that's bad :/
16:18:14 <slaweq> so fwaas isn't py3 ready yet?
16:18:32 <njohnston> I think it is py3 ready, it just has massive issues in other areas
16:18:40 <slaweq> ahh ok :)
16:20:12 <slaweq> ok, so I think that we can move forward as there aren't many updates on the py3 migration
16:20:19 <slaweq> tempest-plugins migration
16:20:20 <njohnston> yeah
16:20:25 <slaweq> Etherpad: https://etherpad.openstack.org/p/neutron_stadium_move_to_tempest_plugin_repo
16:20:47 <njohnston> I realize I gave my tempest plugin update in the py3 section, sorry about that.
16:21:04 <slaweq> here I know that bcafarel made some progress and his first patch is even merged already
16:21:10 <slaweq> so he is the first one :)
16:21:14 <slaweq> njohnston: no problem :)
16:21:42 <slaweq> for networking-bgpvpn I have patches ready for review:
16:21:48 <slaweq> Step 1: https://review.openstack.org/652991
16:21:50 <slaweq> Step 2: https://review.opendev.org/#/c/657793/
16:22:42 <slaweq> so I kindly ask for reviews :)
16:22:57 <mlavalle> ok
16:23:00 <slaweq> any other comments/questions on this topic?
16:23:01 <njohnston> I'll take a look
16:23:07 <slaweq> thx guys
16:23:34 <mlavalle> not from me
16:24:25 <slaweq> ok, so let's move on then
16:24:34 <slaweq> #topic Grafana
16:24:51 <slaweq> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate (was already given at the beginning)
16:26:17 <slaweq> there are some gaps from the weekend but apart from that I think our CI has looked quite good in the last few days
16:26:40 <njohnston> agreed
16:26:41 <slaweq> and I also saw many patches merged even without rechecks, which is surprising :)
16:26:59 <mlavalle> yes, it looks good
16:27:12 <njohnston> yep!
16:28:00 <slaweq> even the ssh problems have happened less often recently, but I'm not sure if that is just "by accident" or if there was e.g. some change in infra which helped with it somehow
16:28:59 <slaweq> one problem which I still see is the fullstack and (a bit less, but still) functional tests
16:29:07 <slaweq> both of them have quite high failure rates
16:29:51 <njohnston> fullstack is high, around 30%
16:29:58 <slaweq> yep
16:30:03 <slaweq> functional was like that last week too
16:30:40 <njohnston> but in grafana, in the check queue, functional seems much lower, between 5 and 11%, so hopefully we fixed something compared to last week
16:30:49 <slaweq> but then we merged https://review.opendev.org/#/c/657849/ from ralonsoh and I think that helped a lot
16:31:53 <slaweq> so let's talk about fullstack tests now
16:31:59 <slaweq> #topic fullstack/functional
16:32:32 <slaweq> I was checking the results of fullstack jobs from the last couple of days
16:32:46 <slaweq> and I found a couple of failure examples
16:33:01 <slaweq> one is (again) a problem with neutron.tests.fullstack.test_l3_agent.TestHAL3Agent
16:33:08 <slaweq> like in http://logs.openstack.org/78/653378/7/check/neutron-fullstack-with-uwsgi/d8b47f9/testr_results.html.gz
16:33:18 <slaweq> but I think I saw it a few more times during the last couple of days
16:34:50 <slaweq> I can find and reopen the bug related to this
16:35:09 <slaweq> but I don't think I will have time to look into this in the next few days
16:35:17 <slaweq> so maybe someone else will want to take a look
16:35:54 <slaweq> #action slaweq to reopen bug related to failures of neutron.tests.fullstack.test_l3_agent.TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost
16:36:12 <slaweq> I will ask liuyulong tomorrow if he can take a look at this once again
16:36:32 <slaweq> among other errors I also saw the neutron.tests.fullstack.test_dhcp_agent.TestDhcpAgentHA.test_reschedule_network_on_new_agent test failing
16:36:38 <slaweq> http://logs.openstack.org/87/658787/5/check/neutron-fullstack/7d35c49/testr_results.html.gz
16:36:53 <slaweq> ralonsoh: I know You were looking into such a failure some time ago, right?
16:37:20 <ralonsoh> slaweq, I don't remember this one
16:38:19 <slaweq> ralonsoh: ha, found it: https://bugs.launchpad.net/neutron/+bug/1799555
16:38:20 <openstack> Launchpad bug 1799555 in neutron "Fullstack test neutron.tests.fullstack.test_dhcp_agent.TestDhcpAgentHA.test_reschedule_network_on_new_agent timeout" [High,Confirmed] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
16:38:34 <ralonsoh> slaweq, yes, I was looking at the patch
16:38:37 <ralonsoh> https://review.opendev.org/#/c/643079/
16:38:59 <ralonsoh> same as before, I didn't find anything relevant to solve the bug
16:39:26 <ralonsoh> I'll take a look at those logs
16:39:37 <slaweq> maybe we should add some additional logging to the master branch
16:39:54 <slaweq> it may help us investigate when the same issue happens again
16:40:01 <slaweq> what do You think about it?
16:40:06 <ralonsoh> slaweq, I'll propose a patch for this
16:40:13 <slaweq> ralonsoh++ thx
16:40:34 <slaweq> #action ralonsoh to propose patch with additional logging to help debug https://bugs.launchpad.net/neutron/+bug/1799555
16:40:35 <openstack> Launchpad bug 1799555 in neutron "Fullstack test neutron.tests.fullstack.test_dhcp_agent.TestDhcpAgentHA.test_reschedule_network_on_new_agent timeout" [High,Confirmed] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
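A rough illustration of the kind of additional logging discussed above, for context only (it is not ralonsoh's actual patch): log the DHCP scheduler state on every poll, so that a test_reschedule_network_on_new_agent timeout leaves a trace of which agents were hosting the network. The helper and its parameters are hypothetical; only oslo.log and neutron's wait_until_true are assumed.

```python
# Illustrative sketch only -- the function and parameter names are made up,
# and list_hosting_agents is a hypothetical callable returning the agents
# currently hosting the network.
from oslo_log import log as logging

from neutron.common import utils as common_utils

LOG = logging.getLogger(__name__)


def wait_until_network_rescheduled(list_hosting_agents, network_id,
                                   dead_agent_host, timeout=60):
    def _rescheduled():
        agents = list_hosting_agents(network_id)
        # Log the full agent list on every poll, so a timeout leaves a
        # trace of what the scheduler reported right before giving up.
        LOG.debug("DHCP agents hosting network %s: %s", network_id,
                  [(agent['host'], agent['alive'], agent['admin_state_up'])
                   for agent in agents])
        return bool(agents) and all(
            agent['host'] != dead_agent_host for agent in agents)

    common_utils.wait_until_true(_rescheduled, timeout=timeout)
```

Putting the logging inside the poll predicate means the last few debug lines before a timeout show exactly what the scheduler reported, which should make a failure like the one in bug 1799555 easier to analyze.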
16:41:29 <slaweq> there was also one failure of the test_min_bw_qos_policy_rule_lifecycle test:
16:41:35 <slaweq> http://logs.openstack.org/46/453346/11/check/neutron-fullstack/7c6c94b/testr_results.html.gz
16:42:26 <slaweq> and there are some errors in the log there http://logs.openstack.org/46/453346/11/check/neutron-fullstack/7c6c94b/controller/logs/dsvm-fullstack-logs/TestMinBwQoSOvs.test_min_bw_qos_policy_rule_lifecycle_egress,openflow-cli_/neutron-openvswitch-agent--2019-05-21--00-19-19-155407_log.txt.gz?level=ERROR
16:42:44 <ralonsoh> slaweq, but I think that was solved
16:42:52 <ralonsoh> slaweq, I'll review it too
16:43:04 <slaweq> results are from 2019-05-21 00:19
16:43:55 <slaweq> but it is quite an old patch so it might be that this is just an old error
16:44:05 <slaweq> it happened on this patch https://review.opendev.org/#/c/453346/
16:44:40 <ralonsoh> yes, I saw this. I think the patch I applied some weeks ago solved this
16:44:57 <slaweq> ok, so we should be good with this one then :)
16:45:02 <slaweq> thx ralonsoh for confirmation
16:45:08 <ralonsoh> there should be no complaint about deleting a non-existing QoS rule
16:45:24 <slaweq> ok, so let's move on to functional tests now
16:45:37 <slaweq> I saw 2 "new" issues there
16:45:48 <slaweq> the first is that the test_ha_router_failover test fails again
16:45:56 <slaweq> http://logs.openstack.org/61/659861/1/check/neutron-functional/3708673/testr_results.html.gz - I reported it as a new bug https://bugs.launchpad.net/neutron/+bug/1829889
16:45:58 <openstack> Launchpad bug 1829889 in neutron "_assert_ipv6_accept_ra method should wait until proper settings will be configured" [Medium,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
16:46:03 <slaweq> I will take care of this one
16:46:14 <slaweq> I think I already know what the issue is there
16:46:26 <slaweq> and I described it in the bug report
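For context, a minimal sketch of the "wait instead of assert once" direction described in bug 1829889, not the actual fix; get_accept_ra is a hypothetical callable standing in for the real sysctl read done inside the router namespace, and the timeout is arbitrary.

```python
# Minimal sketch, not the actual fix: poll the accept_ra sysctl until it
# reaches the expected state instead of asserting its value only once.
from neutron.common import utils as common_utils


def wait_for_accept_ra(get_accept_ra, enabled=True, timeout=10):
    """Wait until net.ipv6.conf.<dev>.accept_ra matches the expected state."""
    def _state_matches():
        # accept_ra is 0 when disabled and 1 or 2 when enabled.
        return (int(get_accept_ra()) != 0) == enabled

    common_utils.wait_until_true(_state_matches, timeout=timeout)
```

The point is simply that the L3 agent applies the setting asynchronously, so the functional test has to wait for it rather than read it immediately after the failover.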
16:46:35 <slaweq> and the second issue which I found is:
16:46:37 <slaweq> neutron.tests.functional.agent.linux.test_bridge_lib.FdbInterfaceTestCase
16:46:42 <slaweq> http://logs.openstack.org/84/647484/9/check/neutron-functional/6709666/testr_results.html.gz
16:46:53 <slaweq> for which njohnston reported a bug https://bugs.launchpad.net/neutron/+bug/1829890
16:46:54 <openstack> Launchpad bug 1829890 in neutron "neutron-functional CI job fails with InterfaceAlreadyExists error" [Undecided,New]
16:47:27 <slaweq> it failed on patch https://review.opendev.org/#/c/647484
16:47:37 <slaweq> and I saw it only on this patch
16:47:50 <slaweq> but it doesn't look related to this patch IMO
16:47:55 <njohnston> http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Interface%20interface%20already%20exists%5C%22
16:48:10 <njohnston> and expand out to 7 days
16:48:24 <njohnston> I get 11 failures on 5 different changes
16:48:58 <slaweq> ok, so it is an issue, thx njohnston
16:49:06 <slaweq> is there any volunteer to look into this one?
16:49:30 <mlavalle> I have some catch up to do this week
16:49:46 <mlavalle> but if by next week, nobody volunteers, I'll jump in
16:50:01 <slaweq> ok, I will add it to my todo for this week, but I'm not sure if I will have time
16:50:18 <slaweq> so I will not assign it to myself for now, maybe there will be someone else who wants to take it
16:50:22 <njohnston> I'm in the same boat; I have a number of things ahead of it, but if I get a chance I'll jump in
16:51:14 <slaweq> ok
16:51:23 <slaweq> so that's all regarding fullstack/functional jobs
16:51:30 <slaweq> any questions/comments?
16:51:46 <mlavalle> not from me
16:52:01 <slaweq> ok
16:52:16 <slaweq> let's move on quickly to the next topic then
16:52:17 <slaweq> #topic Tempest/Scenario
16:52:40 <slaweq> first of all, I want to mention that we quite often have failures with errors like
16:52:44 <slaweq> "Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'"
16:52:57 <slaweq> it causes errors in the devstack deployment and the job fails
16:53:03 <slaweq> it's not related to neutron directly
16:53:18 <slaweq> and AFAIK infra people are aware of this
16:53:45 <slaweq> and the second thing which I want to mention
16:53:46 <slaweq> I did a patch with a summary of tempest jobs
16:53:54 <slaweq> https://review.opendev.org/#/c/660349/ - please review it
16:54:26 <slaweq> it's a follow-up to the discussion from Denver
16:54:41 <mlavalle> nice!
16:54:56 <slaweq> when this is merged, my plan is to maybe switch some of those tempest jobs to neutron-tempest-plugin jobs
16:55:31 <slaweq> as IMHO we don't need to run tempest-full-xxx jobs with every possible config like dvr/l3ha/lb/ovs/....
16:55:57 <slaweq> we can IMHO run tempest-full with some default config and then test neutron-related things with other configurations
16:56:08 <njohnston> cool idea, I like it
16:56:27 <slaweq> but I want to have this list merged first and then use it as a "todo" list :)
16:56:39 <mlavalle> ok
16:56:44 <slaweq> there is also a list of grenade jobs in this patch
16:56:55 <slaweq> and speaking of grenade jobs, I sent an email some time ago
16:57:02 <slaweq> http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006146.html
16:57:17 <slaweq> please read it, and tell me what You think about it
16:57:39 <slaweq> I will also ask gmann and other qa folks about their opinion on this
16:58:05 <slaweq> and that's all from my side :)
16:58:15 <slaweq> any questions/comments?
16:58:21 <slaweq> we have about 1 minute left
16:58:31 <mlavalle> not from me
16:58:36 <njohnston> yeah I was waiting for gmann's response to that email
16:58:56 <slaweq> ok, I will ping him also :)
16:59:05 <njohnston> thanks slaweq
16:59:14 <slaweq> so if there is nothing else
16:59:17 <slaweq> thx for attending
16:59:22 <slaweq> and have a nice week :)
16:59:28 <slaweq> #endmeeting