14:00:04 <slaweq> #startmeeting neutron_drivers
14:00:05 <openstack> Meeting started Fri Feb  7 14:00:04 2020 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:06 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:08 <openstack> The meeting name has been set to 'neutron_drivers'
14:00:18 <njohnston> o/
14:00:20 <ralonsoh> hi
14:00:25 <yamamoto> hii
14:00:25 <hjensas> o/
14:00:30 <stephen-ma> hi
14:00:56 <haleyb> hi
14:01:17 <slaweq> I think we have quorum already so lets start
14:01:28 <slaweq> #topic RFEs
14:01:32 <amotoki> hi
14:01:45 <slaweq> as we have hjensas here, lets start with his rfe
14:01:50 <slaweq> https://bugs.launchpad.net/neutron/+bug/1861032
14:01:50 <openstack> Launchpad bug 1861032 in neutron "[RFE] Add support for configuring dnsmasq with multiple IPv6 addresses in same subnet on same port" [Undecided,In progress] - Assigned to Harald Jensås (harald-jensas)
14:03:02 <hjensas> hi, so this is still pending the actual functionality being merged in dnsmasq. But the last update from Simon was that he is working on a rewrite of the patch I proposed.
14:03:52 <hjensas> In neutron the requirement is a change in the dnsmasq driver, to write the new config. I have an early review up for that: https://review.opendev.org/704436
14:05:10 <slaweq> for me this rfe is pretty straight forward when new functionality will be merged  and release in dnsmasq
14:05:46 <slaweq> but hjensas do You think we could e.g. bump the min required dnsmasq version or do some sanity check like https://github.com/openstack/neutron/blob/master/neutron/cmd/sanity/checks.py#L46
14:06:06 <slaweq> and simply warn users that this will not be supported on older versions
14:06:17 <slaweq> instead of adding new config knob?
14:07:16 <hjensas> if we write the new config format on old dnsmasq the server won't start, as it will interpret the list of IPv6 addresses as one bad IPv6 address.
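[To illustrate hjensas's point about the host-file format: the addresses below are made up, and the exact bracket syntax is an assumption for illustration, not the final merged dnsmasq format.]

```
# Today's single-address reservation -- any dnsmasq parses this:
fa:16:3e:aa:bb:cc,host-2001-db8--10.openstacklocal,[2001:db8::10]

# New multi-address form (bracket syntax illustrative): an old
# dnsmasq tries to read the whole list as one IPv6 address, fails
# to parse it, and refuses to start -- hence the config knob.
fa:16:3e:aa:bb:cc,host-2001-db8--10.openstacklocal,[2001:db8::10][2001:db8::11]
```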
14:07:27 <njohnston> what if an older version of dnsmasq needs to be run for other reasons
14:07:49 <ralonsoh> we need this config knob and the sanity check for now
14:07:57 <njohnston> I agree
14:08:11 <slaweq> can't we e.g. try to discover dnsmasq version in the driver?
14:08:15 <slaweq> it shouldn't be hard
14:08:26 <slaweq> we have dnsmasq version check in sanity checks so we could reuse it
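[For context, the check slaweq refers to boils down to parsing `dnsmasq --version` output. A minimal sketch, assuming a hypothetical minimum version; the function names are invented here and are not neutron's actual API:]

```python
import re
import subprocess

# Hypothetical minimum version for the new multi-address dhcp-host
# format -- the real number was not known at the time of this meeting.
MIN_DNSMASQ_VERSION = (2, 81)

def parse_dnsmasq_version(output):
    """Extract (major, minor) from `dnsmasq --version` output."""
    # The first line looks like: "Dnsmasq version 2.80  Copyright (c) ..."
    m = re.search(r'version (\d+)\.(\d+)', output)
    if not m:
        raise RuntimeError('cannot parse dnsmasq version')
    return int(m.group(1)), int(m.group(2))

def multi_ipv6_reservations_supported():
    """Run the installed dnsmasq and compare its version to the minimum."""
    out = subprocess.run(['dnsmasq', '--version'],
                         capture_output=True, text=True).stdout
    return parse_dnsmasq_version(out) >= MIN_DNSMASQ_VERSION
```

[As the discussion below notes, distro backports make the reported version an unreliable signal, which is why the config knob won out.]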
14:08:31 <amotoki> can we expect distros to ship the expected dnsmasq versions in the same neutron releases?
14:08:37 <haleyb> yeah, and running dnsmasq --version might not be perfect either as distros backport things
14:08:56 <slaweq> ahh, backports, ok
14:09:13 <slaweq> so it seems that config option will be the best solution here
14:09:21 <haleyb> what about OVN? :)
14:09:42 <njohnston> haleyb: that was my next question :-)
14:10:00 <hjensas> the problem with discovering the version is downstream packaging. i.e we may have 2.80? in CentOS 8, that will stay 2.80 but we may pick features from 2.81 in packaging.
14:10:25 <amotoki> so my question does not matter?
14:10:27 <slaweq> to me this seems like a dnsmasq-driver-specific rfe, so it's not related to ovn at all
14:11:13 <slaweq> amotoki: I think we can't expect what versions of dnsmasq will be supported in which distro
14:11:28 <haleyb> slaweq: i was more wondering if OVN will also only give a lease for the first IPv6 address, else it has the same "bug"
14:11:37 <amotoki> slaweq: so it leads to another question I had.
14:12:05 <njohnston> right, the question is does OVN DHCP server need to implement the same multiple return that dnsmasq does?
14:12:22 <amotoki> if we provide this feature, do we need to expose which feature(s) are available in neutron dhcp drivers?
14:13:14 <hjensas> for OVN I think it depends on whether they want to break the RFC or not, i.e. allow the CLID/IAIDs to change and always serve the one IPv6 address based on MAC.
14:14:32 <amotoki> IIRC we don't provide available features via the neutron API. If so, API users need to check available features by sending an API request.
14:15:03 <haleyb> hjensas: do you think we'll ever be using the OVN dhcp driver for Ironic nodes?  we did add external port support recently
14:15:38 <slaweq> amotoki: I don't think this should be discoverable through API
14:16:21 <hjensas> haleyb: we have not had a discussion on supporting OVN in my team that I'm aware of.
14:17:10 <slaweq> haleyb: for the ovn driver we should probably first check if this is supported by ovn and if so, how it should be configured, then we can add support for it in the ovn driver if needed
14:17:24 <slaweq> but maybe it will also require changes on ovn side first
14:17:34 <hjensas> haleyb: but, as OVN is being used in both openstack and OKD Metal3 etc. it may come up later.
14:18:10 <haleyb> slaweq: ack, just didn't want them to diverge if this is the behavior we want
14:18:32 <slaweq> haleyb: yes, I know
14:19:01 <amotoki> slaweq: if it depends on the dhcp driver, do you mean that API consumers should try API requests to check if a specific feature is available?
14:19:05 <slaweq> but I understand this rfe more like "adding support for some dnsmasq feature in neutron's dnsmasq driver"
14:19:20 <hjensas> both kea and dnsmasq are going to support a list of IPv6 addresses for a single host reservation.
14:19:44 <njohnston> I think it makes sense for us to approve this, and note that this is something we need parity on before OVN DHCP can support Ironic
14:20:24 <slaweq> amotoki: if we would like to go that way, we would probably need another rfe for that
14:20:41 <slaweq> amotoki: but I'm not sure if we would like to expose something like that for users
14:20:49 <amotoki> slaweq: what happens if a configured dhcp driver does not support this feature?
14:21:17 <slaweq> amotoki: in this case, now dnsmasq will have only first IP address configured to provide via DHCP
14:22:01 <slaweq> and IIUC that will be the driver's behaviour if the operator does not switch this new config option to the "new behaviour"
14:22:13 <slaweq> hjensas: is that correct?
14:22:15 <hjensas> slaweq: correct
14:23:03 <slaweq> amotoki: so in such a case a user who is using Ironic and Neutron will have the same problem as they have now
14:23:26 <slaweq> IIUC, during booting of the host there may be problems with giving the proper IP address to the host
14:23:29 <slaweq> hjensas: correct?
14:24:14 <amotoki> slaweq: hjensas: ah, I see. thanks. our current behavior is not enough and we need improvements in a dhcp driver side like dnsmasq.
14:24:42 <amotoki> sorry that i misunderstood the situation.
14:24:58 <slaweq> amotoki: that's correct, but this improvement is first needed on dnsmasq side
14:25:06 <slaweq> then we can add it in our driver
14:25:16 <amotoki> makes sense now
14:26:19 <slaweq> and I agree with njohnston to approve this rfe with some note about ovn dhcp (and other potentially) drivers
14:26:22 <haleyb> we can add this to the features gap document for OVN, i just can't remember if that's in the doc migration or just the spec
14:26:36 <slaweq> haleyb: it was in the spec only IIRC
14:27:04 * haleyb can add the gaps as a .rst after the docs merge
14:27:22 <slaweq> haleyb: that is good idea
14:28:47 <slaweq> so, IMO we can approve this RFE with note that it should be added to ovn gaps doc which haleyb will propose
14:28:53 <slaweq> are You ok with that?
14:28:59 <amotoki> looks good
14:29:07 <njohnston> +1
14:29:21 <ralonsoh> +1
14:29:23 <haleyb> +1 from me, gaps doc will just be what's in the spec
14:29:48 <yamamoto> what's OKD Metal3?
14:30:29 <yamamoto> anyway +1
14:30:47 <slaweq> yamamoto: https://metal3.io/blog/2019/06/25/Metal3.html
14:30:52 <slaweq> I think hjensas was talking about this
14:31:01 <hjensas> slaweq: yamamoto: yes.
14:31:04 * haleyb thought it was a heavy metal rock band :)
14:31:13 <yamamoto> thank you
14:31:17 <ralonsoh> haleyb, me too!
14:31:21 <slaweq> lol
14:31:39 <slaweq> ok, so will be approved
14:31:45 <slaweq> I will update LP after the meeting
14:31:48 <slaweq> thx
14:31:57 <slaweq> lets move on
14:31:59 <slaweq> next one
14:32:02 <slaweq> https://bugs.launchpad.net/neutron/+bug/1859362
14:32:02 <openstack> Launchpad bug 1859362 in neutron "Neutron accepts arbitrary MTU values for networks" [Wishlist,New]
14:32:03 <hjensas> thanks everyone. Now I just have to be patient.
14:32:48 <slaweq> hjensas: so this will for sure not be in Ussuri, right?
14:32:55 <ralonsoh> I don't know if Jeroen is here
14:32:59 <slaweq> we can schedule it for V cycle?
14:35:01 <slaweq> ok, lets talk about https://bugs.launchpad.net/neutron/+bug/1859362 now as we already switched to it :)
14:35:01 <openstack> Launchpad bug 1859362 in neutron "Neutron accepts arbitrary MTU values for networks" [Wishlist,New]
14:35:21 <slaweq> IMO good summary of this rfe is in comment https://bugs.launchpad.net/neutron/+bug/1859362/comments/8
14:35:40 <ralonsoh> yes, but those limits are still artificial
14:35:58 <ralonsoh> those are the recommended values, but why should we limit this in Neutron?
14:36:32 <ralonsoh> In c#9, we can see how to limit those values if needed, or change them
14:36:45 <ralonsoh> just my opinion
14:36:58 <TomStappaerts> Sorry to butt in but didn't we tackle this already? https://review.opendev.org/#/c/688656
14:37:59 <TomStappaerts> Ah I see the ticket requests min values as well
14:38:06 <slaweq> TomStappaerts: no, I think it's a bit different thing
14:38:44 <slaweq> and in this rfe the reporter also wants the possibility to set a network's mtu higher than NeutronGlobalPhysnetMtu
14:39:09 <slaweq> my concern here is adding yet another 2 config knobs related to the mtu
14:40:22 <slaweq> we already have "global_physnet_mtu", "physical_network_mtus", "path_mtu"
14:40:24 <njohnston> I have the same concern.  MTU fiddling can lead quickly to madness, and it took so much effort to get this simplified in the past
14:40:32 <amotoki> is NeutronGlobalPhysnetMtu a config?
14:40:51 <ralonsoh> amotoki, in tripleO
14:41:00 <amotoki> ah...
14:41:13 <slaweq> amotoki: but it's related to "global_physnet_mtu" in Neutron I guess
14:41:25 <ralonsoh> slaweq, it is
14:41:32 <ralonsoh> NeutronGlobalPhysnetMtu: {{ overcloud_neutron_global_physnet_mtu }}
14:41:58 <amotoki> I see
14:42:14 <amotoki> it sounds good if we can provide consistent MTU value for a network.
14:42:44 <amotoki> this RFE makes sense from the perspective that we need some consistent MTU check.
14:43:33 <njohnston> So the RFE proposes to limit values to 1280-9250, with the upper bound based on the capabilities of current networking equipment.  But I don't think we will be the first to know when network equipment evolves to support more than that, nor should we need to.
14:43:45 <ralonsoh> that's the point
14:43:50 <ralonsoh> there are two proposals
14:43:52 <ralonsoh> 1) limit the values
14:44:15 <ralonsoh> 2) ensure the MTU values depending on the physnet mtu (comment #9)
14:44:27 <ralonsoh> second one, as amotoki said, makes sense
14:44:33 <amotoki> I haven't checked whether we need two different configs for tenant net and physnet, which depends on the driver.
14:44:41 <amotoki> ralonsoh: you are faster than me.
14:45:12 <amotoki> mmmm...... I sent incomplete sentence
14:45:38 <slaweq> ralonsoh: but in LP there is also a use case given where a vxlan network's mtu is higher than the underlying phys net's mtu
14:46:11 <ralonsoh> slaweq, from global_physnet_mtu
14:46:13 <ralonsoh> "For overlay networks such as VXLAN, neutron automatically subtracts the overlay protocol overhead from this value. Defaults to 1500, the standard value for Ethernet."
14:46:33 <ralonsoh> we can deal with this depending on the overlay protocol
14:47:13 <slaweq> ralonsoh: yes, but please check comment #8
14:48:40 <ralonsoh> slaweq, well the matter here is how to calculate the overhead of the overlay protocol
14:48:44 <ralonsoh> isn't it?
14:49:44 <slaweq> I'm not sure tbh
14:49:58 <haleyb> ralonsoh: i thought we did that already?  ip version + overlay header size
14:50:11 <slaweq> haleyb: yes, that is done in type driver
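[The calculation haleyb and slaweq refer to amounts to subtracting the outer IP header plus the encapsulation headers from the physnet MTU. A sketch with the commonly cited VXLAN numbers; the constant and function names are invented here, though the result matches the familiar 1500 -> 1450 VXLAN-over-IPv4 default:]

```python
# Header sizes subtracted from the physnet MTU for a VXLAN overlay.
IP_HEADER_LENGTH = {4: 20, 6: 40}  # outer IP header by IP version
VXLAN_ENCAP_OVERHEAD = 30          # outer Ethernet (14) + UDP (8) + VXLAN (8)

def vxlan_network_mtu(physnet_mtu, overlay_ip_version=4):
    """MTU advertised to instances on a VXLAN network."""
    return physnet_mtu - IP_HEADER_LENGTH[overlay_ip_version] - VXLAN_ENCAP_OVERHEAD

print(vxlan_network_mtu(1500))      # 1450
print(vxlan_network_mtu(1500, 6))   # 1430
```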
14:50:49 <slaweq> but IIUC the LP reporter of this rfe wants the possibility to enforce a higher mtu for vxlan networks in some cases
14:51:59 <slaweq> so IMO this sounds a bit weird, as on the one hand he wants to limit min_mtu to protect users against some "strange things" with too low mtu, but on the other hand he wants the possibility to set a "bad" mtu value for vxlan networks and get very bad performance
14:52:18 <slaweq> or I'm missing something there :/
14:52:19 <njohnston> so what do we lose if we keep the behavior we have right now?  In comment #5 the OP says "I'll admit it is a corner case, as nobody in their right mind would pick an MTU that low" - and same applies to too high.  I don't think we need to restrict every way that a user could potentially shoot themselves in the foot.
14:52:53 <slaweq> njohnston: +100 for that :)
14:53:40 <njohnston> And I am not 100% sure the OP thinks the ability to set the MTU too high to oversubscribe the network is needed, he just sort of thinks it's a neat idea.
14:53:45 <njohnston> "On the upper side, I actually do see a use case for allowing the MTU to be set higher than the MTU on the underlying physical network. However, network admins would likely still prefer to put an upper bound on the MTU values picked by their tenants, and control whether or not this type of "oversubscription" is allowed on the network."
14:54:03 <njohnston> that is a pretty ambivalent comment
14:55:09 <ralonsoh> I think we can ask for a concise and specific RFE description, related to one single problem and maybe a patch showing what he really wants
14:55:39 <njohnston> I don't see a specific use case that is being advocated for here, just a "let's redesign MTU handling so these theoretical use cases might be used later" and I don't want to whack the MTU beehive just for that
14:56:10 <njohnston> ralonsoh: +1
14:56:20 <slaweq> I agree with You both here
14:56:25 <haleyb> me too
14:56:30 <amotoki> ralonsoh: +1
14:56:40 <slaweq> and I really don't want to have yet another 2 or more config options related to mtu settings in Neutron :)
14:56:43 <amotoki> IMHO we can expect operators to configure the right value for their deployments, but it would be great if we could enforce the check for MTUs requested by regular users. that's from my experience as an operator
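[A check like the one amotoki describes could be simple bounds validation on user-requested values. A sketch only; the bounds are the ones floated in the RFE discussion above, not values neutron actually enforces:]

```python
IPV6_MIN_MTU = 1280      # RFC 8200 minimum link MTU for IPv6
PROPOSED_MAX_MTU = 9250  # upper bound proposed in the RFE

def validate_requested_mtu(mtu, underlay_mtu):
    """Reject obviously bad tenant-requested MTU values (sketch only)."""
    if mtu < IPV6_MIN_MTU:
        raise ValueError(
            f'MTU {mtu} is below the IPv6 minimum of {IPV6_MIN_MTU}')
    if mtu > min(underlay_mtu, PROPOSED_MAX_MTU):
        raise ValueError(
            f'MTU {mtu} exceeds what the underlay supports')
    return mtu
```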
14:58:37 <slaweq> ok, we are almost at the top of the hour so lets stop this discussion here, I will write some summary in a comment on the LP and ask for a concise RFE description and the exact use case related to this
14:58:44 <slaweq> and we can get back to it later
14:58:48 <slaweq> is it ok for You?
14:58:58 <yamamoto> +1
14:59:01 <ralonsoh> +1
14:59:02 <njohnston> +1
14:59:05 <haleyb> +1
14:59:08 <amotoki> +1
14:59:19 <slaweq> ok, thx a lot
14:59:23 <ralonsoh> bye!
14:59:25 <slaweq> thx for attending
14:59:30 <slaweq> and have a great weekend
14:59:32 <slaweq> o/
14:59:34 <slaweq> #endmeeting