Friday, 2019-01-18

mlavalle#startmeeting neutron_drivers14:00
openstackMeeting started Fri Jan 18 14:00:55 2019 UTC and is due to finish in 60 minutes.  The chair is mlavalle. Information about MeetBot at
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.14:00
*** openstack changes topic to " (Meeting topic: neutron_drivers)"14:00
openstackThe meeting name has been set to 'neutron_drivers'14:00
mlavallelet's wait 2 min for people to congregate14:01
*** njohnston_ has joined #openstack-meeting14:02
njohnston_haleyb let me know he would miss the first half of the meeting, his previous appointment is running late14:02
mlavalleso we need amotoki to have quorum14:03
mlavallethanks for the update njohnston_ :-)14:04
mlavallein the meantime: cheng1 is there something you want to bring up?14:04
openstackLaunchpad bug 1808731 in neutron "[RFE] Needs to restart metadata proxy with the start/restart of l3/dhcp agent" [Undecided,Triaged] - Assigned to cheng li (chengli3)14:05
cheng1this bug, will we have a look?14:05
*** Chenjie has joined #openstack-meeting14:05
mlavalleI left a comment last night in that bug14:06
*** abishop has joined #openstack-meeting14:06
mlavalleand I think slaweq agrees with me14:06
mlavalleand haleyb as well14:06
cheng1mlavalle: we may need the same implementation as other agents, like dnsmasq14:07
*** abishop has left #openstack-meeting14:07
slaweqmlavalle: basically I agree with Your comment, but I also think that doing restart of haproxy only during start of agent may work - as it's only short time when it will not be available14:07
slaweqso in fact I think that in most cases client will have longer http timeout configured and will wait for response14:08
cheng1sure, it will14:08
cheng1like dnsmasq, we don't stop metadata proxy with the stop of l3/dhcp agent14:09
cheng1just restart it with the restart/start of l3/dhcp agent14:09
*** jamesmcarthur has quit IRC14:09
mlavallein the case of l3 agents, today haproxy is not related to start of the agent14:10
njohnston_So if this is really for an upgrades concern, doesn't this properly live in the realm of orchestration such as what ansible or puppet provides?  Usually a tool like that will be used to choreograph the sequence of events needed for upgrades, if I am not mistaken.14:10
mlavallenjohnston_: that's the point of my comment of last night14:11
mlavalleas usual, you articulated it better than me :-)14:11
cheng1it's not only about upgrade14:11
cheng1In the current implementation, the metadata proxy doesn't restart even if we restart the l3 agent14:12
cheng1or dhcp agent14:12
njohnston_why would you want it to?14:12
cheng1this will result in issue14:12
mlavallebut going back to my previous point, in the case of router based proxies, the trigger is the creation, update or deletion or routers14:13
cheng1For example, if we change the metadata_port from default 9697 to 969914:13
*** yamamoto has joined #openstack-meeting14:13
cheng1the new 9699 port will not be used, because metadata proxy doesn't restart14:14
mlavalleso, when we restart the agent, we need to restart the proxies in that agent14:14
mlavalleall the proxies in all the routers in that agent^^^14:14
cheng1yes, that's what I want to say14:14
*** yamamoto has quit IRC14:15
*** yamamoto has joined #openstack-meeting14:15
mlavallecould that be very disruptive?14:15
liuyulongSuch change should not manipulate for a cloud administrator, IMO.14:15
mlavalleliuyulong: what do you mean? can you clarify?14:16
*** mriedem has joined #openstack-meeting14:16
njohnston_what if the agents could reread the configuration when it changes, like what we implemented for neutron server?  That would be a way to nondisruptively load the change.  Since if you're changing the metadata proxy config file you are managing objects on that host, it should not be too much of an extra step to issue a signal to trigger the config file reload14:16
liuyulongIIRC, such proxy should not do any config change during agent down time.14:17
mlavalleliuyulong: yes, that's been my point14:17
slaweqnjohnston_: but we are talking here about reload haproxy config, not l3 agent14:18
slaweqcan haproxy reload config without restart?14:18
*** yamamoto has quit IRC14:18
mlavallecheng1: also, what are the use cases where you have seen the need for this? can you elaborate on that? we are having a theoretical discussion without understanding the need behind it14:19
mlavalleI am hesitant to introduce changes in the data plane in principle, however minor, but maybe the need justifies it14:20
cheng1mlavalle: there is a running openstack env14:20
*** crazzy has joined #openstack-meeting14:20
cheng1for some reason, I want to change metadata_port14:20
cheng1the step is to change the config in neutron config files14:21
cheng1then restart l3 agent14:21
cheng1but restarting the l3 agent doesn't restart the metadata proxy14:21
cheng1the current implementation reloads by 'kill', which doesn't re-generate the configuration from neutron.conf14:22
mlavallebut that is a hypothetical: "for some reason". do you have an actual use case?14:22
cheng1not really, but it could be a use case14:22
mlavallewe can all go through the code and start hypothesizing about changes we could make to it14:23
*** tetsuro has quit IRC14:23
mlavallebut we have a huge installed base to worry about. we should introduce only changes where the actual reward justifies the risk we always run when we introduce changes14:24
cheng1just would like to confirm this bug in the meeting14:24
slaweqmlavalle: well said :)14:24
*** priteau has joined #openstack-meeting14:25
mlavalleand I am known to be in favor of introducing more features and running the risk of doing so, but I want to know the reward in terms of the benefits we will deliver to ACTUAL use cases. Then I'm willing to run the associated risks14:26
liuyulongActually the l3-agent and dhcp-agent can be hosted on the dedicated machines. And the port allocations for such key/bottleneck services should be planned in advance.14:26
*** awaugama has joined #openstack-meeting14:26
slaweqcheng1: mlavalle: what about writing somewhere in docs that if You change something related to the metadata proxy in the L3 config, e.g. metadata_port, You should kill the haproxy services before restarting the L3 agent for the changes to take effect14:27
cheng1besides metadata_port, there could be other parameters for metadata proxy.14:27
slaweqthat would be for sure less risky :)14:28
slaweqand operators would be aware of it then IMO14:28
mlavallebut today, if you kill the proxies and re-start the agent, you won't re-start the proxies14:29
mlavalleproxies are associated to routers events14:29
*** lpetrut has joined #openstack-meeting14:30
slaweqmlavalle: L3 agent will not check if proxy is running and start it if it's not?14:30
slaweqI think it will take care of it during restart14:30
liuyulongslaweq, mlavalle, a router admin-state down/up action may work. But such action will cause inevitable data plane down.14:31
cheng1I see we restart dnsmasq with the restart of dhcp agent14:32
cheng1why don't we restart the metadata proxy with the l3/dhcp agent?14:32
mlavallecheng1: yes, but that is not a valid analogy. it is a consequence of the way we use dnsmasq14:32
mlavalleand the way dnsmasq works14:33
liuyulongcheng1, DHCP can reload the config because normal users can set attributes for it.14:33
liuyulongmeanwhile meta proxy is transparent to users14:33
mlavalleslaweq: you spent a long time with a public cloud operator. In your experience, is this something relevant to that operator?14:35
mlavallesame question to you liuyulong. you work today for a big operator14:36
slaweqmlavalle: unfortunately in this case there is no L3 agent used at all14:36
mlavalleslaweq: good answer... LOL14:36
liuyulongmlavalle, yes?14:36
mlavalleliuyulong: I was asking whether a change like the one cheng1 is proposing would be beneficial for the company you work for14:37
mlavalleand I ask these questions because so far we don't have an actual use case. So I want to see what operators would think about it14:39
liuyulongmlavalle, as I said before such change should not happen during agent down time. The config should be planned in advance. And it then should not change in the cloud's entire life cycle. Otherwise it will increase operational difficulty.14:40
slaweqmaybe it would be good idea to send email to ML and ask operators about their feedback on this?14:40
slaweqand then we can get back to this discussion here :)14:41
mlavalleliuyulong: thanks14:41
liuyulongmlavalle, np14:41
*** hongbin has joined #openstack-meeting14:42
mlavalleslaweq: I have expressed my opinion on this and I don't see the need to make changes for the sake of making changes. But I am willing to be overruled if:14:42
mlavalle1) other members of the drivers team reach consensus this is needed14:42
mlavalle2) we identify a set of actual operators who see the benefit of this. The ML is a good way to try to get that feedback. Great suggestion14:43
mlavalleit can even be a topic of discussion for the next forum / ptg14:44
*** davidsha has joined #openstack-meeting14:45
cheng1mlavalle: got it, thanks14:45
slaweqforum would be better IMO, and I agree with You mlavalle :)14:45
mlavallecheng1: do you want to start a thread in the ML and see if we get feedback from the community14:45
mlavalleBTW, cheng1, slaweq, liuyulong, njohnston_: great discussion. This is what this meeting is for ;-)14:47
mlavallesorry I played the stubborn role this time around14:47
mlavallecheng1: and I thank you for your suggestions and arguments. They are always welcome14:48
cheng1mlavalle: Maybe after days, I can try ML14:49
mlavallecheng1: great. thanks again14:49
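
The signal-triggered reload that njohnston_ floated at 14:16 can be sketched roughly as follows. This is a generic illustration, not Neutron or haproxy code: the class name ReloadableConfig, the INI layout, and the handler wiring are all assumptions; only the option name metadata_port and its default 9697 come from the discussion. As slaweq pointed out, a real fix would also have to regenerate and reload the haproxy configuration, which this sketch does not attempt.

```python
import configparser
import signal


class ReloadableConfig:
    """Hypothetical holder that re-reads metadata_port from an INI file."""

    def __init__(self, path, default_port=9697):
        self.path = path
        self.default_port = default_port
        self.metadata_port = default_port
        self.reload()

    def reload(self):
        # Parse the file from scratch so removed options fall back to defaults.
        parser = configparser.ConfigParser()
        parser.read(self.path)  # a missing file is silently ignored
        self.metadata_port = parser.getint(
            "DEFAULT", "metadata_port", fallback=self.default_port)
        return self.metadata_port

    def install_sighup_handler(self):
        # SIGHUP is POSIX-only; the handler just re-reads the file.
        signal.signal(signal.SIGHUP, lambda signum, frame: self.reload())
```

With the handler installed, editing the file and sending SIGHUP to the process updates metadata_port without a restart; consumers of the value would still need to act on the change.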
mlavalleSince we don't have drivers quorum, and we have only 10 minutes left, I propose we look at
openstackLaunchpad bug 1811166 in neutron "[RFE] Enforce router admin_state_up=False before distributed update" [Wishlist,New] - Assigned to Matt Welch (mattw4)14:51
mlavalleafter reading it last night, my thinking is that this is not a RFE. It's really a bug14:51
mlavallethe code should do what the submitter proposes:
mlavalleslaweq, liuyulong, njohnston_, cheng1: what do you think?14:52
davidshaIt might have been a mistaken use of the RFE tag, they don't seem to have any other bugs to their name.14:53
mlavallehe was nudged by the bugs deputy of that week to classify it as a RFE14:54
mlavallethat's my impression at least14:54
slaweqit was marked as RFE as it is trying to change existing API behaviour I guess14:54
slaweqother than that I agree that it should be like that14:55
*** hongbin has quit IRC14:55
slaweqso it should require admin_state_up=False before migration14:55
davidshaYa, spotted that in the comments just there14:55
mlavalleyes and in fixing it, we should document properly the change of behavior to what it should have been from the beginning14:56
slaweqand now the question is: do we need shim api extension to make this fix discoverable?14:56
liuyulongagree with slaweq14:57
mlavalleyes, slaweq, I think you are right14:57
mlavalleas much as I dislike our extensions sprawl, in this case I think it is needed14:57
mlavallegood suggestion slaweq14:57
slaweqI agree that it's necessary14:58
slaweqalso, I think it would be good to add scenario tests to
liuyulongSeems the bug requests to save one step to accomplish that migration?14:58
slaweqbecause now we are not testing if migration is forbidden when the router doesn't have admin_state_up=False14:58
slaweqand it would be useful IMHO14:58
mlavallegood point slaweq14:59
mlavalleso let's handle this as a bug14:59
mlavalleand I will update it with your suggestions14:59
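
The behaviour agreed on above (require admin_state_up=False before changing a router's distributed flag) and the kind of check slaweq's proposed tests would exercise can be sketched as below. All names here (RouterMigrationError, validate_distributed_update) and the use of plain dicts are assumptions for illustration; this is not the Neutron implementation.

```python
class RouterMigrationError(Exception):
    """Raised when 'distributed' is changed on a router that is still up."""


def validate_distributed_update(current, requested):
    """Reject a change of 'distributed' unless the router is admin-down.

    'current' is the stored router and 'requested' is the update body;
    both are plain dicts here, purely for illustration.
    """
    if "distributed" not in requested:
        return  # the update does not touch the flag
    if requested["distributed"] == current.get("distributed", False):
        return  # flag present but unchanged, so no migration happens
    if current.get("admin_state_up", True):
        raise RouterMigrationError(
            "set admin_state_up=False before changing 'distributed'")
```

A negative test then only needs to assert that the update is refused while the router is up, and accepted once it is down.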
mlavallethanks for your attendance. We'll do it all over again next week14:59
*** armstrong has joined #openstack-meeting15:00
mlavallehave a great weekend!15:00
slaweqhave a great weekend!15:00
*** openstack changes topic to "OpenStack Meetings ||"15:00
openstackMeeting ended Fri Jan 18 15:00:14 2019 UTC.  Information about MeetBot at . (v 0.1.4)15:00
openstackMinutes (text):
davidshaThanks, enjoy the weekend!15:00
*** njohnston_ is now known as njohnston15:00
mlavalledavidsha: nice to see you around. we want to see more of you15:00
cheng1good weekend!15:00
*** cheng1 has quit IRC15:01
davidshamlavalle: I'm going to keep trying!15:02
