22:00:30 <kevinbenton> #startmeeting neutron_drivers
22:00:31 <openstack> Meeting started Thu Jul 20 22:00:30 2017 UTC and is due to finish in 60 minutes. The chair is kevinbenton. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:00:32 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:00:35 <openstack> The meeting name has been set to 'neutron_drivers'
22:00:54 <ihrachys> o/
22:01:18 <kevinbenton> armax, mlavalle, amotoki: ping
22:01:21 <armax> pong
22:01:22 <mlavalle> o/
22:02:18 <kevinbenton> before we start with RFEs, i need to bring up an issue i ran into with making it possible to expose 'router:external' subnets
22:02:51 <kevinbenton> in order to allow the policy engine to permit subnets to be visible
22:03:07 <kevinbenton> (context is #link https://launchpad.net/bugs/1653932)
22:03:07 <openstack> Launchpad bug 1653932 in neutron "[rfe] network router:external field not exported" [Wishlist,In progress] - Assigned to Kevin Benton (kevinbenton)
22:03:31 <kevinbenton> i have to add a DB hook that makes the user subnet query actually return the external subnets as well
22:04:05 <kevinbenton> however, if we don't make external subnets visible by default in policy.json
22:04:24 <kevinbenton> there is a mismatch between what the DB returns and what is filtered out by policy
22:04:45 <kevinbenton> this becomes a problem in pagination
22:04:58 <kevinbenton> because pagination asks for X records
22:05:10 <kevinbenton> but then the policy engine filters out some and we end up with a shortage
22:05:11 <kevinbenton> http://logs.openstack.org/94/476094/4/check/gate-neutron-dsvm-api-ubuntu-xenial/a42d8fe/testr_results.html.gz
22:05:30 <ihrachys> (I knew those tests would trigger one day lol!)
22:05:53 <kevinbenton> so the question is, do we adjust the pagination tests to expect sometimes fewer records than asked for?
22:06:14 <ihrachys> I don't think so, that would be a weird API behaviour
22:06:21 <ihrachys> horizon will be all over us
22:06:27 <kevinbenton> do we attempt to have the policy engine inform the DB query builder (quite difficult)
22:06:50 <kevinbenton> or do we just adjust our default policy to now expose external subnets?
22:06:51 <ihrachys> they complained about inconsistent/broken behavior of pagination in different projects before. we will add more to the grief. ;)
22:07:07 <kevinbenton> ihrachys: yeah, i can see that
22:07:33 <kevinbenton> i've outlined it in this email thread http://lists.openstack.org/pipermail/openstack-dev/2017-July/119863.html
22:07:44 <armax> not sure I fully grasp why the pagination tests now fail
22:07:59 <kevinbenton> armax: db hook makes the query return external subnets
22:08:12 <kevinbenton> armax: so you have a mix of your own and external subnets
22:08:14 <ihrachys> armax, native pagination is enabled for the plugin
22:08:18 <armax> but changing tests or policy.json seems like masking a breaking change
22:08:21 <kevinbenton> armax: but the policy engine by default filters that out
22:08:32 <armax> kevinbenton: so the pagination tests break because now there are more subnets?
22:08:41 <ihrachys> I think we will need to pass those filters from the policy engine into the plugin somehow
22:08:43 <ihrachys> not sure how though
22:08:56 <kevinbenton> armax: yeah, the pagination tests will break whenever there is a mismatch between the policy and the db result
22:09:04 <armax> I see
22:09:13 <kevinbenton> any time the policy engine filters out entire records
22:09:33 <armax> but shouldn’t pagination come after policy filtering?
22:09:40 <armax> in the pipeline I mean
22:09:46 <ihrachys> armax, pagination is implemented in sqlalchemy code
22:09:59 <ihrachys> if the plugin supports 'native' pagination
22:10:03 <ihrachys> which is the case for ml2
22:10:03 <armax> i see
22:10:08 <kevinbenton> right
22:10:11 <armax> ok now I get it
22:10:29 <kevinbenton> as ihrachys highlighted, the correct fix is to somehow have the policy engine policies build the queries used
22:10:59 <kevinbenton> this has been on my wishlist for a long time because the policy engine is actually a lot less flexible than it appears
22:11:19 <kevinbenton> (the DB queries filter out any non-default access)
22:11:57 <kevinbenton> but it's going to be a significant effort i think to have something that translates policy entries into column filters
22:12:02 <kevinbenton> for SQLA
22:12:15 <armax> yeah
22:12:18 <armax> that sounds messy
22:12:24 <ihrachys> kevinbenton, can we pass policy info via context?
22:12:38 <armax> unless this was special-cased just to deal with subnets for external networks
22:12:49 <kevinbenton> ihrachys: yeah, i think we can do the work to get policies attached to the context
22:12:59 <ihrachys> then deep in the db code, we would extract that and build accordingly
22:13:06 <ihrachys> ok
22:13:16 <ihrachys> at least no need for a new plugin entry point :p
22:13:27 <kevinbenton> ihrachys: yeah, i think the 'build accordingly' bit is where it gets messy
22:13:36 <ihrachys> and it sounds like this work had better go into oslo.policy/oslo.context?
22:13:37 <armax> I suppose another crazy idea would be to turn external networks into regular networks
22:13:54 <ihrachys> kevinbenton, if it's a special case that we know about, it's not that bad
22:13:54 <armax> with an extra column ‘external'?
22:14:01 <armax> and kill the external table?
22:14:18 <kevinbenton> armax: i don't think that helps. this is for subnets
22:14:49 <armax> but then the query may turn out less ugly
22:15:10 <kevinbenton> armax: the query itself isn't terrible. it's knowing when to use it
22:15:30 <kevinbenton> ihrachys: so are you thinking maybe just special-case this one query?
22:16:11 <ihrachys> yeah. I would refrain from generalizing if it's complex and we don't have another use case for it
22:17:22 <kevinbenton> ok, well i'll start by looking into getting the policies visible on the context or somewhere where the query builder will have access to them
22:17:59 <kevinbenton> #link https://bugs.launchpad.net/neutron/+bugs?field.status%3Alist=Triaged&field.tag=rfe
22:18:05 <armax> so to recap
22:18:12 <armax> before we go into the rfe list
22:18:34 <kevinbenton> policy mismatch with db hooks == broken pagination
22:18:39 <armax> changing the policy.json file to expose the list of subnets for external networks will make the pagination tests pass?
22:18:45 <kevinbenton> armax: yes
22:19:04 <kevinbenton> but then we have the issue of exposing something we didn't previously
22:19:11 <kevinbenton> on upgrade
22:19:17 <armax> what if we add a shim extension for that?
22:19:21 <armax> :)
22:19:32 <kevinbenton> well we could have an extension to indicate the behavior
22:19:38 <armax> it’s definitely a side effect
22:19:51 <kevinbenton> but inevitably some tempest test or something will explode somewhere
22:19:53 <armax> but what could go wrong if more subnets are exposed to a tenant now during subnet-list?
22:20:04 <kevinbenton> where it assumes if a tenant can see a subnet and it's not shared, the tenant owns it
22:20:09 <armax> yeah, that’s the danger, but aside from that side effect
22:20:36 <armax> what else could go wrong?
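[Editor's note: the pagination failure and the context-based fix discussed above can be sketched with a toy model. All names here are illustrative, not Neutron's actual code.]

```python
# Toy model of the bug: 'native' pagination slices at the DB layer,
# but the policy engine filters AFTER the page was cut, so a page
# can come back short.

# Ten subnets: odd ids sit on external networks, even ids are the
# tenant's own.
SUBNETS = [{"id": i, "external": i % 2 == 1} for i in range(10)]

def db_query(limit, include_external):
    """Stand-in for the SQLAlchemy query: the slice is policy-unaware."""
    rows = [s for s in SUBNETS if include_external or not s["external"]]
    return rows[:limit]

def policy_filter(records, can_see_external):
    """Stand-in for the policy engine pass over the returned page."""
    return [r for r in records if can_see_external or not r["external"]]

# Broken: the DB hook returns external subnets, the default policy
# hides them, and a page of 4 comes back short.
page = policy_filter(db_query(4, include_external=True),
                     can_see_external=False)
print(len(page))  # 2 -- the shortage the pagination tests caught

# Fix sketch: evaluate the visibility decision from policy once per
# request, hang it on the context, and let the query builder use it
# so the DB result and the policy result agree.
can_see = False  # e.g. derived from the "get_subnet" policy rule
page = policy_filter(db_query(4, include_external=can_see),
                     can_see_external=can_see)
print(len(page))  # 4 -- full page, DB and policy now consistent
```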
22:21:00 <kevinbenton> we also need to verify that things like router interface add
22:21:11 <kevinbenton> verify ownership by tenant matching
22:21:40 <kevinbenton> there could be places where we attach by subnet like that which depend on the DB not returning anything to verify whether a user has access to it
22:22:32 <armax> sure, the same problem arises when the operator enables this via policy though
22:22:33 <armax> correct?
22:23:08 <kevinbenton> armax: yep
22:24:06 <armax> so chances are that exposing subnets for an external network will still create issues we haven’t thought of
22:24:19 <armax> and I wonder if enabling this by default will allow us to root them out faster
22:24:31 <kevinbenton> armax: yep
22:24:48 <kevinbenton> if only a few operators use the non-default policy we probably won't hear about it until pike is EOL
22:24:50 <mlavalle> and that fixes the tests, right?
22:25:07 <kevinbenton> mlavalle: yep
22:25:14 <armax> so that’s why I am a bit nervous
22:25:30 <armax> I mean if the one certainty we have is that this change will cause more 'bugs'
22:25:41 <armax> I wonder if we’re better off biting the bullet sooner rather than later
22:25:55 <armax> or we forget about it and don’t proceed altogether
22:26:00 <kevinbenton> it would also enable us to make the router operations more consistent from the user perspective
22:26:02 <armax> just a thought
22:26:07 <kevinbenton> gateway attach could take a subnet
22:26:29 <armax> well, now you’re going down the path of changing the API
22:26:38 <armax> which shouldn’t be necessary
22:26:54 <armax> I mean, yeah, I can see that this change opens up the possibility
22:26:59 <kevinbenton> yeah, that's just a potential thing
22:27:04 <kevinbenton> not required at all
22:27:56 <armax> are subnets for external networks something sensitive?
22:28:16 <kevinbenton> i'm not sure
22:28:28 <armax> is there anything in the response payload that could be considered too infra specific?
22:28:29 <kevinbenton> i haven't been able to think of a case where it would be a problem
22:28:55 <kevinbenton> maybe the operator stored their root password to the gateway in the subnet description
22:29:00 <armax> yeah
22:29:02 <armax> :)
22:29:19 <mlavalle> but that's bad operator practice, LOL
22:29:29 <kevinbenton> one possible thing is the subnet service types feature
22:29:36 <armax> what about it?
22:29:48 <armax> you mean that gets exposed?
22:29:49 <kevinbenton> there might be a subnet meant just for DVR ports that users never really see
22:29:50 <kevinbenton> yeah
22:29:54 <armax> right
22:29:59 <kevinbenton> so they will see the floating IP subnet
22:30:00 <armax> I was thinking about that
22:30:05 <armax> or segment ID?
22:30:06 <kevinbenton> and an 'internal' subnet
22:30:17 <armax> is that on subnets? I can’t remember
22:30:30 <kevinbenton> well segment_id isn't a huge issue
22:30:34 <kevinbenton> just a UUID
22:30:52 <kevinbenton> https://github.com/openstack/neutron/blob/master/etc/policy.json#L21
22:30:58 <kevinbenton> it's admin-only anyway
22:31:18 <kevinbenton> so they wouldn't be able to see segment_id or service_types
22:31:22 <armax> so are service types
22:31:25 <mlavalle> we already have a service type network:floatingip
22:31:27 <armax> except on the GET
22:31:45 <kevinbenton> but they would still be able to see the subnets themselves
22:32:10 <mlavalle> see example 2 in https://docs.openstack.org/neutron/latest/admin/config-service-subnets.html
22:32:13 <kevinbenton> so if you have a weird subnet just for DVR interfaces on non-routable addresses you might get questions about it showing up in users' subnet lists
22:32:57 <kevinbenton> mlavalle: yeah
22:33:18 <kevinbenton> so would normal users seeing all of those subnets show up be okay?
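[Editor's note: the distinction kevinbenton draws at 22:31:18/22:31:45 — admin-only *fields* are hidden while the subnet record itself stays visible — can be illustrated with a toy sketch. This is not Neutron's actual policy code; the field names follow the discussion, and the UUIDs are placeholders.]

```python
# Illustrative sketch of admin-only field stripping on a GET response:
# the subnet is returned, but infra-specific attributes are withheld
# from non-admin callers (as the policy.json rules linked above do).

ADMIN_ONLY_FIELDS = {"segment_id", "service_types"}

def apply_field_policy(subnet, is_admin):
    """Drop admin-only attributes for non-admin callers."""
    if is_admin:
        return dict(subnet)
    return {k: v for k, v in subnet.items()
            if k not in ADMIN_ONLY_FIELDS}

subnet = {
    "id": "f3a7...",       # placeholder UUID
    "cidr": "203.0.113.0/24",
    "segment_id": "9c1e...",  # placeholder UUID
    "service_types": ["network:floatingip_agent_gateway"],
}

print(sorted(apply_field_policy(subnet, is_admin=False)))
# ['cidr', 'id'] -- the subnet itself is still visible
```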
22:33:55 <mlavalle> in principle, I don't think so
22:34:04 <armax> it feels wrong
22:34:13 <kevinbenton> yeah, it feels like exposing infra stuff
22:34:27 <armax> even if the operator changed the policy.json from default
22:35:00 <armax> so on the RFE
22:35:06 <armax> what was the exact use case?
22:35:32 <kevinbenton> armax: https://bugs.launchpad.net/neutron/+bug/1653932
22:35:32 <openstack> Launchpad bug 1653932 in neutron "[rfe] network router:external field not exported" [Wishlist,In progress] - Assigned to Kevin Benton (kevinbenton)
22:35:32 <armax> I wonder if it’s something PaaS related?
22:35:44 <armax> what comment #?
22:35:46 <kevinbenton> it's actually about IP visibility for allocating floating IPs
22:35:51 <kevinbenton> the description
22:36:06 <kevinbenton> users don't know which subnet to pick from
22:36:11 <armax> right, but why wouldn’t a random IP be enough?
22:36:45 <kevinbenton> some IPs have access to one thing and others to something else
22:37:12 <armax> like if there’s some sort of IP whitelist somewhere?
22:37:22 <armax> or even routing wise?
22:37:26 <kevinbenton> yeah
22:37:38 <ihrachys> why not create two networks?
22:37:42 <armax> ...right
22:37:46 <armax> I was gonna say the same
22:38:07 <armax> though then the choice is not based on IP address but 'label'
22:38:19 <ihrachys> well it's even better to my taste
22:38:34 <ihrachys> you can tag the network, put a nice name and even a description, whatever
22:38:45 <armax> network pistachio or network strawberry
22:38:50 <armax> yum
22:39:39 <ihrachys> ok so how about clarifying the use case again before doing any coding?
22:39:59 <mlavalle> so maybe go back to the rfe and have a dialog with the submitter to dig into the use case?
22:40:01 <armax> I suppose we had done that, but it looks like we did not
22:40:47 <mlavalle> well, now we have a set of consequences that we can discuss with the submitter
22:41:34 <kevinbenton> armax: can you ask for clarification?
22:41:41 <armax> sure
22:41:45 <armax> can do
22:41:55 <kevinbenton> this does seem like a request for generic policy support without a very clear use case
22:42:03 <kevinbenton> he does mention that floating IP is just an example
22:42:17 <mlavalle> re-reading the rfe, he actually mentions that fip is just an example
22:42:32 <mlavalle> comment #4
22:42:41 <kevinbenton> yeah
22:42:54 <armax> true
22:43:05 <kevinbenton> maybe it's actually something service_types can help with
22:43:07 <armax> but since we have now understood the implications
22:43:07 <mlavalle> and we decided to approach the solution from that perspective
22:43:14 <kevinbenton> maybe one of the subnets is private and useless
22:43:17 <kevinbenton> for floating IPs
22:43:19 <armax> much more than we had at the beginning
22:43:28 <armax> and the code doesn’t look pretty and the risk of errors is high
22:43:29 <mlavalle> and yes, now we know the implications
22:43:37 <armax> let’s go back and fully understand what we need
22:43:52 <kevinbenton> ok
22:43:56 <armax> kevinbenton: service types need to have subnets exposed nonetheless, no?
22:44:15 <kevinbenton> true
22:44:19 <armax> Ok
22:44:26 <armax> I think we beat this horse hard enough
22:44:27 <armax> let’s move on
22:44:31 <armax> I’d say
22:44:38 <kevinbenton> i want to discuss https://bugs.launchpad.net/neutron/+bug/1604222 right away
22:44:38 <openstack> Launchpad bug 1604222 in neutron "[RFE] Implement vlan transparent for openvswitch ML2 driver" [Wishlist,Triaged] - Assigned to Trevor McCasland (twm2016)
22:44:39 <mlavalle> LOL, it must be dead now
22:44:41 <armax> that’s rude
22:44:44 <kevinbenton> :)
22:44:51 <armax> kevinbenton: please?
22:44:59 <kevinbenton> armax: please what?
22:45:07 <armax> i want to discuss https://bugs.launchpad.net/neutron/+bug/1604222 right away
22:45:12 <armax> where are your manners?
22:45:18 <kevinbenton> rather than going to the list i mean
22:45:22 <armax> still
22:45:24 <armax> be polite
22:45:27 <armax> :P
22:45:27 <kevinbenton> (even though I think this was at the top)
22:45:35 <ihrachys> it was
22:45:50 <kevinbenton> there is some mixup here between two things
22:45:54 <armax> do we have full support in OVS?
22:45:58 <armax> nowadays?
22:46:06 <kevinbenton> 2.8 will have it, it sounds like
22:46:12 <armax> g
22:46:14 <ihrachys> "this will be released in ovs 2.8 in august prior to the release of pike"
22:46:22 <armax> finally
22:46:33 <ihrachys> there is no official release yet, but people can always backport/pull master
22:46:34 <kevinbenton> however, what Trevor discussed is not related to vlan transparency
22:46:48 <kevinbenton> he is working on a QinQ network type driver
22:47:01 <kevinbenton> which will also depend on QinQ support in OVS for an OVS implementation
22:47:18 <armax> not sure I understand the need for a new type driver
22:47:23 <kevinbenton> but they are orthogonal and i don't want the QinQ type driver to get mixed up with vlan transparency
22:47:52 <armax> a type driver goes hand in hand with an ml2 driver, no?
22:47:58 <kevinbenton> armax: the QinQ driver is to allow double-tagged packets to hit the wire
22:48:46 <ihrachys> how do drivers that implement vlan transparency do it?
22:48:49 <mlavalle> and he is sub-classing the vlan type to do that
22:48:52 <armax> oh so you’re saying it’s a way for the OVS agent to understand that it needs to do push/pop?
22:49:02 <armax> rather than strip/replace?
22:49:04 <kevinbenton> see this is the problem :)
22:49:17 <kevinbenton> the qinq type driver has nothing to do with vlan transparency
22:49:27 <armax> then I still don’t understand
22:49:41 <kevinbenton> the qinq type driver is just like the vlan driver
22:49:44 <armax> right
22:49:49 <mlavalle> with more tags
22:49:51 <kevinbenton> except it tells it to place two tags in the header
22:49:53 <armax> so if it’s yet another tunnelling driver
22:49:53 <mlavalle> inside
22:49:56 <kevinbenton> yep
22:50:05 <armax> it has no reason to exist in the repo
22:50:07 <armax> what’s the use case?
22:50:13 <armax> I mean
22:50:17 <armax> I have nothing against it
22:50:21 <kevinbenton> geneve has no reason to exist in the repo
22:50:35 <armax> I wasn’t PTL when that merged ;)
22:50:39 <armax> I don’t think
22:50:47 <kevinbenton> it's not vendor specific
22:50:51 <armax> but my question is what’s it for?
22:51:02 <armax> if it’s not going to be used for vlan transparency?
22:51:05 <armax> do we know?
22:51:06 <kevinbenton> if you have infra expecting qinq frames
22:51:11 <armax> I can see what geneve is going to be used for
22:51:17 <armax> OVN
22:51:28 <kevinbenton> it lets you scale way beyond 4096 vlans
22:51:29 <armax> qinq type driver
22:51:44 <armax> do we know the mech driver that will leverage it?
22:51:45 <armax> dude
22:51:49 <armax> stop lecturing us on QinQ
22:51:50 <kevinbenton> SR-IOV
22:51:56 <kevinbenton> armax: then don't ask questions
22:51:56 <armax> we all get it, I think :P
22:52:06 <armax> you’re answering the questions I am asking
22:52:15 <kevinbenton> then we're in agreement?
22:52:23 <armax> I think so
22:52:31 <kevinbenton> ok, qinq is fine
22:52:32 <kevinbenton> :)
22:52:34 <armax> so you’re saying that the SR-IOV driver will use this type driver?
22:52:37 <kevinbenton> but it needs a separate RFE
22:52:49 <kevinbenton> armax: yes, the SR-IOV folks were requesting this at the PTG and the summit
22:52:58 <kevinbenton> i need to talk to trevor to have him file a separate RFE
22:53:02 <armax> OK
22:53:17 <armax> so there’s work on the sriov driver to stitch things together?
22:53:20 <kevinbenton> because the VLAN transparency is going to be completely different work on the OVS agent openflow pipeline
22:53:33 <kevinbenton> armax: yep, just to configure the tagging like it does for vlans
22:54:07 <armax> so long as the relationship is properly documented in the process of contributing this, then I think this is fine
22:54:13 <armax> do you reckon we need model changes?
22:54:50 <kevinbenton> no, i don't think so, but that will leave this VLAN transparency RFE open to be implemented by someone else
22:55:04 <kevinbenton> so if anyone's company is depending on OVS transparency support, find some contributors :)
22:55:44 <ihrachys> I think we may be interested; the topic flew around, but maybe it was for ovs, I will check :)
22:56:35 <armax> it would be nice to finally close the circle on VLAN management and OVS
22:56:43 <kevinbenton> ihrachys: sounds good, we might as well :)
22:57:12 <ihrachys> kevinbenton, noted
22:57:29 <armax> OK
22:57:35 <armax> 3 mins to the top of the hour
22:58:05 <kevinbenton> ok, sorry
22:58:08 <kevinbenton> was commenting on the RFE
22:58:17 <armax> no need to be sorry
22:58:21 <armax> just did your job for once
22:58:22 <kevinbenton> anyone have any last minute announcements?
22:58:25 <armax> :P
22:58:30 <ihrachys> no
22:58:32 <armax> none from me
22:58:40 <mlavalle> no
22:58:51 <kevinbenton> ok, then i'll give you a bunch of free time back :)
22:58:53 <kevinbenton> #endmeeting
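[Editor's note: on kevinbenton's point at 22:51:28 that QinQ "lets you scale way beyond 4096 vlans" — this is simple arithmetic. An 802.1Q VLAN ID is 12 bits, with IDs 0 and 4095 reserved, leaving 4094 usable IDs per tag; stacking an outer tag on an inner tag makes each segment an (outer, inner) pair.]

```python
# 802.1Q VLAN IDs are 12 bits; 0 and 4095 are reserved, so each tag
# has 4094 usable IDs.  QinQ stacks two tags, so the segment space is
# the product of the two ranges.
USABLE_VLANS = 4094

single_tag = USABLE_VLANS
double_tag = USABLE_VLANS * USABLE_VLANS

print(single_tag)  # 4094
print(double_tag)  # 16760836 -- roughly 16.7 million segments
```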