14:00:13 <liuyulong> #startmeeting neutron_l3
14:00:13 <openstack> Meeting started Wed Jul 10 14:00:13 2019 UTC and is due to finish in 60 minutes. The chair is liuyulong. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:16 <openstack> The meeting name has been set to 'neutron_l3'
14:00:20 <njohnston> o/
14:00:28 <liuyulong> #chair haleyb
14:00:29 <openstack> Current chairs: haleyb liuyulong
14:01:01 <ralonsoh> hi
14:01:59 <liuyulong88> The nickname 'liuyulong' is already in use!
14:02:12 <liuyulong88> OK
14:02:22 <liuyulong88> #topic Announcements
14:03:30 <liuyulong_> I have no announcement today.
14:03:43 <liuyulong_> If you have, please go ahead.
14:04:46 <liuyulong_> OK, let's move on.
14:04:53 <liuyulong_> #topic Bugs
14:05:16 <liuyulong_> #link http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007455.html
14:05:22 <liuyulong_> Bence Romsics (rubasov) was our bug deputy the week before last, thank you.
14:05:35 <liuyulong_> #link http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007577.html
14:05:41 <liuyulong_> Bernard Cafarelli (bcafarel), last week bug deputy, also thank you.
14:06:45 <liuyulong_> #link https://bugs.launchpad.net/neutron/+bug/1835044
14:06:46 <openstack> Launchpad bug 1835044 in neutron "[Queens] Memory leak in pyroute2 0.4.21" [High,Won't fix] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
14:06:54 <liuyulong_> This is now marked as wont-fix.
14:06:58 <ralonsoh> yes
14:07:09 <liuyulong_> But for this popular rpm repo, we still do not have a new version of pyroute2 for queens release.
14:07:11 <ralonsoh> because we can't modify stable requirements
14:07:12 <liuyulong_> #link http://mirror.centos.org/centos/7.6.1810/cloud/x86_64/openstack-queens/
14:07:18 <liuyulong_> It is still python2-pyroute2-0.4.21-1.el7.noarch.rpm
14:07:33 <slaweq> hi
14:07:40 <ralonsoh> each company should fix this
14:07:54 <ralonsoh> we are pushing the changes for our RPM repos
14:08:19 <ralonsoh> but this won't be changed in a stable branch unless this is a security problem
14:08:52 <liuyulong_> So the R and S repos have a new version?
14:09:05 <ralonsoh> in devstack/requirements yes
14:09:08 <ralonsoh> not Q
14:09:17 <ralonsoh> 0.5.2 vs 0.4.21
14:10:05 <liuyulong_> Our environments are all running queens, we indeed need a repo fix. : )
14:10:39 <liuyulong_> ralonsoh, thank you for working on this.
14:10:46 <liuyulong_> #link https://bugs.launchpad.net/neutron/+bug/1834308
14:10:47 <openstack> Launchpad bug 1834308 in neutron "[DVR][DB] too many slow query during agent restart" [Medium,Confirmed] - Assigned to LIU Yulong (dragon889)
14:10:47 <ralonsoh> my pleasure
14:11:02 <liuyulong_> I will submit a fix for the DVR related DB queries.
14:11:31 <liuyulong_> Our DBA helped me get some slow query logs.
14:13:15 <liuyulong> sorry lost the connection again....
14:13:39 <liuyulong> We noticed there will be 300k+ slow queries (0.5s+) during a 30-node ovs-agent restart.
14:14:08 <liuyulong> Yeah, most of them are related to DVR
14:14:32 <liuyulong> next one may be related to this.
14:14:34 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1835663
14:14:34 <openstack> Launchpad bug 1835663 in neutron "Some L3 RPCs are time-consuming especially get_routers" [Medium,Confirmed]
14:14:42 <liuyulong> As you noticed, it is really slow.
14:14:47 <liuyulong> http://logs.openstack.org/11/669111/4/check/neutron-tempest-plugin-dvr-multinode-scenario/dc3af26/controller/logs/screen-q-l3.txt.gz#_Jul_07_04_18_11_791730
14:15:24 <liuyulong> IMO, 37s for a single RPC, this is not acceptable for a production environment. My OP colleagues will complain. : )
14:16:55 <liuyulong> Neutron server side DB slow queries may be one reason.
14:18:32 <haleyb> liuyulong: ack, and that log was from a check job? that's pretty bad
14:19:29 <liuyulong> For this log here, maybe the upstream CI neutron server just hit its bottleneck. It cannot answer too many RPC calls concurrently.
14:19:52 <liuyulong> haleyb, yes, it is bad
14:20:20 <liuyulong> #link https://review.opendev.org/#/c/669111/
14:20:33 <liuyulong> ralonsoh, slaweq, hi, this patch ^^
14:21:01 <ralonsoh> there is an implementation for this function
14:21:03 <liuyulong> The time cost wrapper, I left some comments
14:21:16 <ralonsoh> I'll review it after the meeting
14:21:52 <liuyulong> ralonsoh, yes, it's good to know we have a similar function already.
14:22:46 <liuyulong> Let me quote the comment here:
14:23:03 <liuyulong> but it can not distinguish each call for same RPC, so I will still add a wrapper here which call that function inside. And a log for the function start is needed as well. We need to know the precise call start and end.
14:23:35 <njohnston> yes exactly I think that if you override the message argument to the oslo.utils time_it function and add the generated uuid then you get the benefit of the function
14:24:22 <ralonsoh> njohnston, agree
14:24:34 <njohnston> but you could do that without a separate decorator
14:25:52 <liuyulong> njohnston, how to enable the start log without a new decorator?
14:26:33 <liuyulong> We may want to see what happened between start and end.
14:26:58 <liuyulong> time_it just logs the duration.
14:27:33 <njohnston> @osloutils.time_it(message="time-cost: %(seconds).02f seconds to run function '%(func_name)s', uuid=" + uuidutils.generate_uuid())
14:29:12 <njohnston> I see, so you believe there is value in having the "call: start" separate from the "call: ended, time = %d" log messages
14:29:19 <njohnston> I can see your point
14:31:02 <slaweq> I agree with liuyulong that separate logs for start and end can be useful
14:31:09 <liuyulong> And I have another concern, that 'StopWatch' is used in the 'time_it'. It looks pretty complicated, don't know if it will cause something wrong in RPC calls.
14:31:29 <ralonsoh> this is just a context manager
14:33:16 <liuyulong> OK, I will refactor this decorator.
14:33:30 <liuyulong> It will be useful for upstream CI
14:35:48 <liuyulong> I have no bug today, last week was a bit quiet for L3.
14:36:41 <ralonsoh> I still have one bug in L3
14:36:45 <ralonsoh> #link https://bugs.launchpad.net/neutron/+bug/1732458
14:36:46 <openstack> Launchpad bug 1732458 in neutron "deleted_ports memory leak in dhcp agent" [Medium,In progress] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
14:36:57 <ralonsoh> #link https://review.opendev.org/#/c/521035/
14:37:19 <ralonsoh> (CI is not passing but I'm rechecking)
14:37:41 <liuyulong> A very old one
14:39:09 <liuyulong> Recently we met many exceptions about DHCP during some upgrading or restarting.
14:39:58 <liuyulong> And I'm deciding to remove the DHCP agent in our local environment.
14:40:44 <liuyulong> config_drive or L2-agent self-service DHCP looks more friendly to large scale clouds.
14:41:35 <slaweq> liuyulong: there is an RFE about a distributed dhcp agent
14:41:50 <slaweq> let me find it
14:42:03 <liuyulong> Yes, all from our OPs complain.
14:42:04 <amotoki> liuyulong: what do you mean by 'L2-agent self-service DHCP'?
14:42:22 <liuyulong> slaweq, did you mean the OVN related RFE?
14:42:25 <slaweq> liuyulong: https://bugs.launchpad.net/neutron/+bug/1806390
14:42:26 <openstack> Launchpad bug 1806390 in neutron "[RFE] Distributed DHCP agent " [Wishlist,In progress] - Assigned to Yang Youseok (ileixe)
14:42:37 <slaweq> liuyulong: no, this one isn't related to OVN
14:42:38 <liuyulong> amotoki, it is a local implementation.
14:43:10 <amotoki> liuyulong: okay, is it a kind of distributed one?
14:43:14 <liuyulong> amotoki, since the OVS-agent has full knowledge of port IP and MAC.
14:43:27 <slaweq> amotoki: I remember when I was in OVH we also had something like that - neutron-ovs agent was spawning simple udhcpd service for each port on host - and that worked very well :)
14:44:07 <amotoki> liuyulong: slaweq: thanks. it reminds me of nova-network dhcp stuff per compute node.
14:44:58 <amotoki> the proposed distributed dhcp agent would be similar.
14:45:03 <liuyulong> https://review.opendev.org/#/c/658414/9/specs/train/ml2ovs-ovn-convergence.rst@38
14:45:18 <liuyulong> I left a comment here, but no response for now. : )
14:45:33 <liuyulong> ML2+OVS+DVR and OVN
14:46:18 <haleyb> liuyulong: i will look at your comment...
14:46:43 <liuyulong> OK, next topic
14:46:58 <liuyulong> #topic Routed Networks
14:47:22 <liuyulong> I'm now interested in how this will work for an external network with multiple segments.
14:47:36 <liuyulong> Yes, I mean the public (provider) network for router gateway and floating IP.
14:48:58 <liuyulong> I also left some comments here: https://review.opendev.org/#/c/657170/
14:49:02 <liuyulong> No response.
14:49:22 <liuyulong> mlavalle, tidwellr, wwriverrat: your turn now.
14:50:52 <liuyulong> No updates?
14:51:01 <liuyulong> Next topic
14:51:15 <liuyulong> #topic On demand agenda
14:51:54 <liuyulong> I have one more thing about OVN and dvr.
14:52:00 <liuyulong> #link https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr
14:52:08 <liuyulong> Maybe we shoud add a note for this BP, or mark it as something like not-complete or abandoned.
14:52:12 <liuyulong> s/should
14:52:40 <liuyulong> And also abandon the related gerrit patch.
14:55:25 <haleyb> yes, i don't think that will be implemented
14:55:28 <amotoki> +1. it clarifies the current situation and it is useful especially for operators.
14:57:02 <liuyulong> OK, time is up, let's stop here.
14:57:10 <liuyulong> Thank you guys.
14:57:15 <liuyulong> #endmeeting
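
Editor's note on the time-cost wrapper discussed under the Bugs topic (14:22 to 14:33): the sketch below illustrates the behaviour liuyulong asked for, a separate log line at call start and at call end, both tagged with an ID generated per call. It is not the code from the patch under review. It uses only the Python standard library rather than oslo.utils, and the names log_call_timing and call_id, plus the get_routers stand-in, are hypothetical. The ID is created inside the wrapper; an ID concatenated into a decorator argument would be evaluated once, when the function is decorated, and so would be shared by every call.

```python
import functools
import logging
import time
import uuid

LOG = logging.getLogger(__name__)


def log_call_timing(func):
    """Log the start and end of every call, tagged with a per-call ID.

    Hypothetical helper for illustration; not part of neutron or oslo.utils.
    """

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Generated inside the wrapper, so each invocation gets its own ID.
        call_id = uuid.uuid4().hex
        LOG.debug("call start: %s, call_id=%s", func.__name__, call_id)
        started = time.monotonic()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on both normal return and exception, so every start
            # line has a matching end line.
            elapsed = time.monotonic() - started
            LOG.debug("call end: %s, call_id=%s, time-cost=%.2f seconds",
                      func.__name__, call_id, elapsed)

    return wrapper


@log_call_timing
def get_routers(router_ids):
    # Hypothetical stand-in for a slow RPC handler.
    return [{"id": router_id} for router_id in router_ids]
```

With this shape, concurrent calls to the same RPC can be told apart by grepping for one call_id, and whatever is logged between the start and end lines can be attributed to that call.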