22:00:30 <kevinbenton> #startmeeting neutron_drivers
22:00:31 <openstack> Meeting started Thu Jul 20 22:00:30 2017 UTC and is due to finish in 60 minutes.  The chair is kevinbenton. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:00:32 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:00:35 <openstack> The meeting name has been set to 'neutron_drivers'
22:00:54 <ihrachys> o/
22:01:18 <kevinbenton> armax, mlavalle, amotoki: ping
22:01:21 <armax> pong
22:01:22 <mlavalle> o/
22:02:18 <kevinbenton> before we start with RFEs, i need to bring up an issue i ran into with making it possible to expose 'router:external' subnets
22:02:51 <kevinbenton> in order to allow the policy engine to permit subnets to be visible
22:03:07 <kevinbenton> (context is #link https://launchpad.net/bugs/1653932)
22:03:07 <openstack> Launchpad bug 1653932 in neutron "[rfe] network router:external field not exported" [Wishlist,In progress] - Assigned to Kevin Benton (kevinbenton)
22:03:31 <kevinbenton> i have to add a DB hook that makes the user subnet query actually return the external subnets as well
22:04:05 <kevinbenton> however, if we don't make external subnets visible by default in policy.json
22:04:24 <kevinbenton> there is a mismatch between what the DB returns and what is filtered out by policy
22:04:45 <kevinbenton> this becomes a problem in pagination
22:04:58 <kevinbenton> because pagination asks for X records
22:05:10 <kevinbenton> but then policy engine filters out some and we end up with a shortage
22:05:11 <kevinbenton> http://logs.openstack.org/94/476094/4/check/gate-neutron-dsvm-api-ubuntu-xenial/a42d8fe/testr_results.html.gz
22:05:30 <ihrachys> (I knew those tests would trigger one day lol!)
22:05:53 <kevinbenton> so the question is, do we adjust the pagination tests to expect sometimes fewer records than asked for?
22:06:14 <ihrachys> I don't think so, that would be a weird api behaviour
22:06:21 <ihrachys> horizon will be all over us
22:06:27 <kevinbenton> do we attempt to have the policy engine inform the DB query builder (quite difficult)
22:06:50 <kevinbenton> or do we just adjust our default policy to now expose external subnets?
22:06:51 <ihrachys> they complained about inconsistent/broken behavior of pagination in different projects before. we will add more to the grief. ;)
22:07:07 <kevinbenton> ihrachys: yeah, i can see that
22:07:33 <kevinbenton> i've outlined it in this email thread http://lists.openstack.org/pipermail/openstack-dev/2017-July/119863.html
22:07:44 <armax> not sure I fully grasp why the pagination tests now fail
22:07:59 <kevinbenton> armax: db hook makes the query return external subnets
22:08:12 <kevinbenton> armax: so you have a mix of your own and external subnets
22:08:14 <ihrachys> armax, native pagination enabled for the plugin
22:08:18 <armax> but changing tests or policy.json seems like masking a breaking change
22:08:21 <kevinbenton> armax: but policy engine by default filters that out
22:08:32 <armax> kevinbenton: so the pagination tests break because now there’s more subnets?
22:08:41 <ihrachys> I think we will need to pass those filters from policy engine into plugin somehow
22:08:43 <ihrachys> not sure how though
22:08:56 <kevinbenton> armax: yeah, the pagination tests will break whenever there is a mismatch between policy and db result
22:09:04 <armax> I see
22:09:13 <kevinbenton> any time the policy engine filters out entire records
22:09:33 <armax> but shouldn’t pagination come after policy filtering?
22:09:40 <armax> in the pipeline I mean
22:09:46 <ihrachys> armax, pagination is implemented in sqlalchemy code
22:09:59 <ihrachys> if plugin supports 'native' pagination
22:10:03 <ihrachys> which is the case for ml2
22:10:03 <armax> i see
22:10:08 <kevinbenton> right
22:10:11 <armax> ok now I get it
22:10:29 <kevinbenton> as ihrachys highlighted, the correct fix is to somehow have the policy engine policies build the queries used
22:10:59 <kevinbenton> this has been on my wishlist for a long time because the policy engine is actually a lot less flexible than it appears
22:11:19 <kevinbenton> (the DB queries filter out any non-default access)
22:11:57 <kevinbenton> but it's going to be a significant effort i think to have something that translates policy entries into column filters
22:12:02 <kevinbenton> for SQLA
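[Editor's note: a rough sketch of what "translating policy entries into column filters" might look like. All names here are hypothetical; nothing like this existed in neutron at the time, and a real implementation would emit SQLAlchemy WHERE clauses rather than Python predicates.]

```python
# Hypothetical sketch: compiling oslo.policy-style rules into row predicates
# that a query builder could translate into SQL filters. Not neutron code.

def compile_rule(rule, context):
    """Turn a policy rule string into a per-row predicate."""
    if rule == "rule:admin_or_owner":
        return lambda row: (context["is_admin"]
                            or row["tenant_id"] == context["tenant_id"])
    if rule == "rule:external":
        return lambda row: row["router:external"]
    raise ValueError("unsupported rule: %s" % rule)

def visible(rows, rules, context):
    # OR semantics, as in a policy like "rule:admin_or_owner or rule:external".
    preds = [compile_rule(r, context) for r in rules]
    return [row for row in rows if any(p(row) for p in preds)]

ctx = {"tenant_id": "t1", "is_admin": False}
rows = [
    {"id": "s1", "tenant_id": "t1", "router:external": False},
    {"id": "s2", "tenant_id": "infra", "router:external": True},
    {"id": "s3", "tenant_id": "t2", "router:external": False},
]
result = visible(rows, ["rule:admin_or_owner", "rule:external"], ctx)
print([r["id"] for r in result])  # ['s1', 's2']
```

Filtering at this layer (before pagination slices the result) is what would keep the DB output and the policy output consistent.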
22:12:15 <armax> yeah
22:12:18 <armax> that sounds messy
22:12:24 <ihrachys> kevinbenton, can we pass policy info via context?
22:12:38 <armax> unless this was special cased just to deal with subnets for external networks
22:12:49 <kevinbenton> ihrachys: yeah, i think we can do the work to get policies attached to context
22:12:59 <ihrachys> then deep in db code, we would extract that and build accordingly
22:13:06 <ihrachys> ok
22:13:16 <ihrachys> at least no need for a new plugin entry point :p
22:13:27 <kevinbenton> ihrachys: yeah, i think the 'build accordingly' bit is where it gets messy
22:13:36 <ihrachys> and it sounds like this work better go to oslo.policy/oslo.context?
22:13:37 <armax> I suppose another crazy idea would be to turn external networks into regular networks
22:13:54 <ihrachys> kevinbenton, if it's a special case that we know about, it's not that bad
22:13:54 <armax> with an extra column ‘external'?
22:14:01 <armax> and kill the external table?
22:14:18 <kevinbenton> armax: i don't think that helps. this is for subnets
22:14:49 <armax> but then the query may turn less ugly
22:15:10 <kevinbenton> armax: the query itself isn't terrible. it's knowing when to use it
22:15:30 <kevinbenton> ihrachys: so are you thinking maybe just special case this one query?
22:16:11 <ihrachys> yeah. I would refrain from generalizing if it's complex and we don't have another use case for that
22:17:22 <kevinbenton> ok, well i'll start by looking into getting the policies visible on the context or somewhere where the query builder will have access to them
22:17:59 <kevinbenton> #link https://bugs.launchpad.net/neutron/+bugs?field.status%3Alist=Triaged&field.tag=rfe
22:18:05 <armax> so to recap
22:18:12 <armax> before we go in the rfe list
22:18:34 <kevinbenton> policy mismatch with db hooks == broken pagination
22:18:39 <armax> changing the policy.json file to expose the list of subnets for external networks will make the pagination tests pass?
22:18:45 <kevinbenton> armax: yes
22:19:04 <kevinbenton> but then we have the issue of exposing something we didn't previously
22:19:11 <kevinbenton> on upgrade
22:19:17 <armax> what if we add a shim extension to that?
22:19:21 <armax> :)
22:19:32 <kevinbenton> well we could have an extension to indicate the behavior
22:19:38 <armax> it’s definitely a side effect
22:19:51 <kevinbenton> but inevitably some tempest test or something will explode somewhere
22:19:53 <armax> but what could go wrong if more subnets are exposed to a tenant now during subnet-list?
22:20:04 <kevinbenton> where it assumes if a tenant can see a subnet and it's not shared, the tenant owns it
22:20:09 <armax> yeah, that’s the danger, but aside that side effect
22:20:36 <armax> what else could go wrong?
22:21:00 <kevinbenton> we also need to verify that things like router interface add
22:21:11 <kevinbenton> verify ownership by tenant matching
22:21:40 <kevinbenton> there could be places where we attach by subnet like that which depend on the DB not returning anything to verify if a user has access to it
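[Editor's note: the broken assumption being discussed can be sketched as follows. Hypothetical illustration, not tempest or neutron code: the shortcut "visible and not shared, therefore owned by the caller" stops holding once external subnets become visible.]

```python
# Hypothetical sketch of the ownership shortcut that breaks once external
# subnets show up in a tenant's subnet list.

def assumed_owner(subnet, tenant_id):
    """Old assumption: visible + not shared => owned by the caller."""
    return not subnet["shared"]

def actual_owner(subnet, tenant_id):
    return subnet["tenant_id"] == tenant_id

external = {"id": "ext-sub", "tenant_id": "infra", "shared": False}
# Once a tenant can list this subnet, the shortcut gives the wrong answer:
print(assumed_owner(external, "t1"), actual_owner(external, "t1"))  # True False
```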
22:22:32 <armax> sure, same problem arises when the operator does enable this via policy though
22:22:33 <armax> correct?
22:23:08 <kevinbenton> armax: yep
22:24:06 <armax> so chances are that exposing subnets for an external network will still create issues we haven’t thought of
22:24:19 <armax> and I wonder if enabling this by default will allow us to root them out faster
22:24:31 <kevinbenton> armax: yep
22:24:48 <kevinbenton> if only a few operators use the non-default policy we probably won't hear about it until pike is EOL
22:24:50 <mlavalle> and that fixes the tests, right?
22:25:07 <kevinbenton> mlavalle: yep
22:25:14 <armax> so that’s why I am a bit nervous
22:25:30 <armax> I mean if the one certainty we have is that this change will cause more 'bugs'
22:25:41 <armax> I wonder if we’re better off biting the bullet sooner rather than later
22:25:55 <armax> or we forget about it and not proceed altogether
22:26:00 <kevinbenton> it would also enable us to make the router operations more consistent from the user perspective
22:26:02 <armax> just a thought
22:26:07 <kevinbenton> gateway attach could take a subnet
22:26:29 <armax> well, now you’re going down the path of changing API
22:26:38 <armax> which shouldn’t be necessary
22:26:54 <armax> I mean, yeah, I can see that this change opens up the possibility
22:26:59 <kevinbenton> yeah, that's just a potential thing
22:27:04 <kevinbenton> not required at all
22:27:56 <armax> is subnets for external networks something sensitive?
22:28:16 <kevinbenton> i'm not sure
22:28:28 <armax> is there anything in the response payload that could be considered too infra specific?
22:28:29 <kevinbenton> i haven't been able to think of a case where it would be a problem
22:28:55 <kevinbenton> maybe the operator stored their root password to the gateway in the subnet description
22:29:00 <armax> yeah
22:29:02 <armax> :)
22:29:19 <mlavalle> but that's bad operator practice, LOL
22:29:29 <kevinbenton> one possible thing is the subnet service types feature
22:29:36 <armax> what about it?
22:29:48 <armax> you mean that gets exposed?
22:29:49 <kevinbenton> there might be a subnet meant just for DVR ports that users never really see
22:29:50 <kevinbenton> yeah
22:29:54 <armax> right
22:29:59 <kevinbenton> so they will see the floating IP subnet
22:30:00 <armax> I was thinking about that
22:30:05 <armax> or segment ID?
22:30:06 <kevinbenton> and an 'internal' subnet
22:30:17 <armax> is that on subnets? I can’t remember
22:30:30 <kevinbenton> well segment_id isn't a huge issue
22:30:34 <kevinbenton> just a UUID
22:30:52 <kevinbenton> https://github.com/openstack/neutron/blob/master/etc/policy.json#L21
22:30:58 <kevinbenton> it's admin-only anyway
22:31:18 <kevinbenton> so they wouldn't be able to see segment_id or service_types
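[Editor's note: the policy.json defaults being referenced look roughly like this. Paraphrased for illustration; the exact rules at the linked file may differ.]

```json
{
    "get_subnet": "rule:admin_or_owner or shared",
    "get_subnet:segment_id": "rule:admin_only",
    "get_subnet:service_types": "rule:admin_only"
}
```

The per-attribute rules hide `segment_id` and `service_types` from non-admins even when the subnet record itself is returned.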
22:31:22 <armax> so are service types
22:31:25 <mlavalle> we already have a service type network:floatingip
22:31:27 <armax> except on the GET
22:31:45 <kevinbenton> but they would still be able to see the subnets themselves
22:32:10 <mlavalle> see example 2 in https://docs.openstack.org/neutron/latest/admin/config-service-subnets.html
22:32:13 <kevinbenton> so if you have a weird subnet just for DVR interfaces on non-routable addresses you might get questions about it showing up in users' subnet lists
22:32:57 <kevinbenton> mlavalle: yeah
22:33:18 <kevinbenton> so would normal users seeing all of those subnets show up be okay?
22:33:55 <mlavalle> in principle, I don't think so
22:34:04 <armax> it feels wrong
22:34:13 <kevinbenton> yeah, it feels like exposing infra stuff
22:34:27 <armax> even if the operator changed the policy.json from default
22:35:00 <armax> so on the RFE
22:35:06 <armax> what was the exact use case?
22:35:32 <kevinbenton> armax: https://bugs.launchpad.net/neutron/+bug/1653932
22:35:32 <openstack> Launchpad bug 1653932 in neutron "[rfe] network router:external field not exported" [Wishlist,In progress] - Assigned to Kevin Benton (kevinbenton)
22:35:32 <armax> I wonder if it’s something PaaS related?
22:35:44 <armax> what comment #?
22:35:46 <kevinbenton> it's actually about IP visibility for allocating floating IPs
22:35:51 <kevinbenton> the description
22:36:06 <kevinbenton> users don't know which subnet to pick from
22:36:11 <armax> right, but why wouldn’t a random IP not be enough?
22:36:45 <kevinbenton> some IPs have access to one thing and others to something else
22:37:12 <armax> like if there’s some sort of IP whitelist somewhere?
22:37:22 <armax> or even routing wise?
22:37:26 <kevinbenton> yeah
22:37:38 <ihrachys> why not creating two networks?
22:37:42 <armax> ...right
22:37:46 <armax> I was gonna say the same
22:38:07 <armax> though now the choice is not based on IP address but 'label'
22:38:19 <ihrachys> well it's even better to my taste
22:38:34 <ihrachys> you can tag the network, put a nice name and even a description, whatever
22:38:45 <armax> network pistachio or network strawberry
22:38:50 <armax> yum
22:39:39 <ihrachys> ok so how about clarifying use case again before doing coding?
22:39:59 <mlavalle> so maybe go back to the rfe and have a dialog with the submitter to dig in the use case?
22:40:01 <armax> I suppose we had done that, but it looks like we did not
22:40:47 <mlavalle> well, now we have a set of consequences that we can discuss with submitter
22:41:34 <kevinbenton> armax: can you ask for clarification?
22:41:41 <armax> sure
22:41:45 <armax> can do
22:41:55 <kevinbenton> this does seem like a request for generic policy support without a very clear use case
22:42:03 <kevinbenton> he does mention that floating IP is just an example
22:42:17 <mlavalle> re-reading the rfe, he actually mentions that fip is just an example
22:42:32 <mlavalle> comment #4
22:42:41 <kevinbenton> yeah
22:42:54 <armax> true
22:43:05 <kevinbenton> maybe it's actually something service_types can help with
22:43:07 <armax> but since we have now understood the implications
22:43:07 <mlavalle> and we decided to approach the solution from that perspective
22:43:14 <kevinbenton> maybe one of the subnets is private and useless
22:43:17 <kevinbenton> for floating IPs
22:43:19 <armax> much more than we had at the beginning
22:43:28 <armax> and the code doesn’t look pretty and risk of errors is high
22:43:29 <mlavalle> and yes, now we know the implications
22:43:37 <armax> let’s go back and fully understand what we need
22:43:52 <kevinbenton> ok
22:43:56 <armax> kevinbenton: service types need to have subnets exposed nonetheless no?
22:44:15 <kevinbenton> true
22:44:19 <armax> Ok
22:44:26 <armax> I think we beat this horse hard enough
22:44:27 <armax> let’s move on
22:44:31 <armax> I’d say
22:44:38 <kevinbenton> i want to discuss https://bugs.launchpad.net/neutron/+bug/1604222 right away
22:44:38 <openstack> Launchpad bug 1604222 in neutron "[RFE] Implement vlan transparent for openvswitch ML2 driver" [Wishlist,Triaged] - Assigned to Trevor McCasland (twm2016)
22:44:39 <mlavalle> LOL, it must be dead now
22:44:41 <armax> that’s rude
22:44:44 <kevinbenton> :)
22:44:51 <armax> kevinbenton: please?
22:44:59 <kevinbenton> armax: please what?
22:45:07 <armax> i want to discuss https://bugs.launchpad.net/neutron/+bug/1604222 right away
22:45:12 <armax> where are your manners?
22:45:18 <kevinbenton> rather than going to the list i mean
22:45:22 <armax> still
22:45:24 <armax> be polite
22:45:27 <armax> :P
22:45:27 <kevinbenton> (even though I think this was at the top)
22:45:35 <ihrachys> it was
22:45:50 <kevinbenton> there is some mixup here between two things
22:45:54 <armax> do we have full support in OVS?
22:45:58 <armax> nowadays?
22:46:06 <kevinbenton> 2.8 will have it it sounds like
22:46:12 <armax> g
22:46:14 <ihrachys> "this will be released in ovs 2.8 in august prior to the release of pike"
22:46:22 <armax> finally
22:46:33 <ihrachys> there is no official release yet, but people can always backport/pull master
22:46:34 <kevinbenton> however, what Trevor discussed is not related to vlan transparency
22:46:48 <kevinbenton> he is working on a QinQ network type driver
22:47:01 <kevinbenton> which will also depend on QinQ support in OVS for an OVS implementation
22:47:18 <armax> not sure I understand the need for a new type driver
22:47:23 <kevinbenton> but they are orthogonal and i don't want the QinQ type driver to get mixed up with vlan transparency
22:47:52 <armax> a type driver goes hand in hand with a ml2 driver, no?
22:47:58 <kevinbenton> armax: the QinQ driver is to allow double-tagged packets to hit the wire
22:48:46 <ihrachys> how do drivers that implement vlan transparency do it?
22:48:49 <mlavalle> and he is sub-classing the vlan type to do that
22:48:52 <armax> oh so you’re saying it’s a way for the OVS agent to understand that it needs to do tag push/pop?
22:49:02 <armax> rather than strip/replace?
22:49:04 <kevinbenton> see this is the problem :)
22:49:17 <kevinbenton> the qinq type driver has nothing to do with vlan transparency
22:49:27 <armax> then I still don’t understand
22:49:41 <kevinbenton> the qinq type driver is just like the vlan driver
22:49:44 <armax> right
22:49:49 <mlavalle> with more tags
22:49:51 <kevinbenton> except it says to place two tags in the header
22:49:53 <armax> so if it’s yet another tunnelling driver
22:49:53 <mlavalle> inside
22:49:56 <kevinbenton> yep
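[Editor's note: "two tags in the header" refers to 802.1ad (QinQ) stacking, illustrated below. Hypothetical helper for illustration only, unrelated to the proposed type driver code; the standard TPIDs are 0x88A8 for the outer service tag and 0x8100 for the inner customer tag.]

```python
# Illustration of a QinQ (802.1ad) double-tagged Ethernet header.
import struct

ETH_P_8021AD = 0x88A8  # outer tag TPID (service/S-tag)
ETH_P_8021Q = 0x8100   # inner tag TPID (customer/C-tag)
ETH_P_IP = 0x0800

def qinq_header(dst, src, s_vid, c_vid):
    """Build dst/src MACs followed by two stacked VLAN tags (PCP/DEI zero)."""
    return (
        dst + src
        + struct.pack("!HH", ETH_P_8021AD, s_vid)  # outer (provider) tag
        + struct.pack("!HH", ETH_P_8021Q, c_vid)   # inner (customer) tag
        + struct.pack("!H", ETH_P_IP)              # encapsulated ethertype
    )

hdr = qinq_header(b"\xff" * 6, b"\x00" * 6, s_vid=100, c_vid=200)
# 6 + 6 bytes of MACs, 4 bytes per tag, 2 bytes of final ethertype = 22
print(len(hdr))  # 22
```

Two 12-bit VID spaces stacked this way is also where the "scale way beyond 4096 vlans" point below comes from.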
22:50:05 <armax> then it has no reason to exist in the repo
22:50:07 <armax> what’s the use case?
22:50:13 <armax> I mean
22:50:17 <armax> I have no reason against it
22:50:21 <kevinbenton> geneve has no reason to exist in the repo
22:50:35 <armax> I wasn’t PTL when that merged ;)
22:50:39 <armax> I don’t think
22:50:47 <kevinbenton> it's not vendor specific
22:50:51 <armax> but my question is what’s it for?
22:51:02 <armax> if it’s not going to be used in vlan transparency?
22:51:05 <armax> do we know?
22:51:06 <kevinbenton> if you have infra expecting qinq frames
22:51:11 <armax> I can see what geneve is going to be used for
22:51:17 <armax> OVN
22:51:28 <kevinbenton> it lets you scale way beyond 4096 vlans
22:51:29 <armax> qinq type driver
22:51:44 <armax> do we know the mech driver that will leverage it?
22:51:45 <armax> dude
22:51:49 <armax> stop lecturing us on QinQ
22:51:50 <kevinbenton> SR-IOV
22:51:56 <kevinbenton> armax: then don't ask questions
22:51:56 <armax> we all get it, I think :P
22:52:06 <armax> you’re answering the questions I am asking
22:52:15 <kevinbenton> then we're in agreement?
22:52:23 <armax> I think so
22:52:31 <kevinbenton> ok, qinq is fine
22:52:32 <kevinbenton> :)
22:52:34 <armax> so you’re saying that the sriov driver will use this type driver?
22:52:37 <kevinbenton> but it needs a separate RFE
22:52:49 <kevinbenton> armax: yes, the SR-IOV folks were requesting this at PTG and summit
22:52:58 <kevinbenton> i need to talk to trevor to have him file a separate RFE
22:53:02 <armax> OK
22:53:17 <armax> so there’s work on the sriov driver to stitch things together?
22:53:20 <kevinbenton> because the VLAN transparency is going to be completely different work on the OVS agent openflow pipeline
22:53:33 <kevinbenton> armax: yep, just to configure the tagging like it does for vlans
22:54:07 <armax> so long as the relationship is properly documented in the process of contributing this, then I think this is fine
22:54:13 <armax> do we need model changes you reckon?
22:54:50 <kevinbenton> no, i don't think so, but that will leave this VLAN transparency RFE open to be implemented by someone else
22:55:04 <kevinbenton> so if anyone's company is depending on OVS transparency support, find some contributors :)
22:55:44 <ihrachys> I think we may be interested, the topic flew around before, but maybe it was for ovs, I will check :)
22:56:35 <armax> it would be nice to finally close the circle about VLAN management and OVS
22:56:43 <kevinbenton> ihrachys: sounds good, we might as well :)
22:57:12 <ihrachys> kevinbenton, noted
22:57:29 <armax> OK
22:57:35 <armax> 3 mins to the top of the hour
22:58:05 <kevinbenton> ok, sorry
22:58:08 <kevinbenton> was commenting on RFE
22:58:17 <armax> no need to be sorry
22:58:21 <armax> just did your job for once
22:58:22 <kevinbenton> anyone have any last minute announcements?
22:58:25 <armax> :P
22:58:30 <ihrachys> no
22:58:32 <armax> none from me
22:58:40 <mlavalle> no
22:58:51 <kevinbenton> ok, then i'll give you a bunch of free time back :)
22:58:53 <kevinbenton> #endmeeting