22:00:30 #startmeeting neutron_drivers
22:00:31 Meeting started Thu Jul 20 22:00:30 2017 UTC and is due to finish in 60 minutes. The chair is kevinbenton. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:00:32 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:00:35 The meeting name has been set to 'neutron_drivers'
22:00:54 o/
22:01:18 armax, mlavalle, amotoki: ping
22:01:21 pong
22:01:22 o/
22:02:18 before we start with RFEs, i need to bring up an issue i ran into with making it possible to expose 'router:external' subnets
22:02:51 in order to allow the policy engine to permit subnets to be visible
22:03:07 (context is #link https://launchpad.net/bugs/1653932)
22:03:07 Launchpad bug 1653932 in neutron "[rfe] network router:external field not exported" [Wishlist,In progress] - Assigned to Kevin Benton (kevinbenton)
22:03:31 i have to add a DB hook that makes the user subnet query actually return the external subnets as well
22:04:05 however, if we don't make external subnets visible by default in policy.json
22:04:24 there is a mismatch between what the DB returns and what is filtered out by policy
22:04:45 this becomes a problem in pagination
22:04:58 because pagination asks for X records
22:05:10 but then the policy engine filters out some and we end up with a shortage
22:05:11 http://logs.openstack.org/94/476094/4/check/gate-neutron-dsvm-api-ubuntu-xenial/a42d8fe/testr_results.html.gz
22:05:30 (I knew those tests would trigger one day lol!)
22:05:53 so the question is, do we adjust the pagination tests to expect sometimes fewer records than asked for?
22:06:14 I don't think so, that would be a weird api behaviour
22:06:21 horizon will be all over us
22:06:27 do we attempt to have the policy engine inform the DB query builder (quite difficult)
22:06:50 or do we just adjust our default policy to now expose external subnets?
22:06:51 they complained about inconsistent/broken behavior of pagination in different projects before. we will add more to the grief. ;)
22:07:07 ihrachys: yeah, i can see that
22:07:33 i've outlined it in this email thread http://lists.openstack.org/pipermail/openstack-dev/2017-July/119863.html
22:07:44 not sure I fully grasp why the pagination tests fail now
22:07:59 armax: the db hook makes the query return external subnets
22:08:12 armax: so you have a mix of your own and external subnets
22:08:14 armax, native pagination is enabled for the plugin
22:08:18 but changing tests or policy.json seems like masking a breaking change
22:08:21 armax: but the policy engine by default filters that out
22:08:32 kevinbenton: so the pagination tests break because now there’s more subnets?
22:08:41 I think we will need to pass those filters from the policy engine into the plugin somehow
22:08:43 not sure how though
22:08:56 armax: yeah, the pagination tests will break whenever there is a mismatch between policy and db result
22:09:04 I see
22:09:13 any time the policy engine filters out entire records
22:09:33 but shouldn’t pagination come after policy filtering?
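A minimal sketch of the mismatch being described, using made-up names rather than the actual neutron code paths: native pagination fetches a fixed number of rows from the DB (now including external subnets), and the default policy then drops some of them afterwards, so a page can come back short even though more visible records exist.

    # Hypothetical illustration only; names do not match neutron's real code.
    def db_page(subnets, limit):
        # "Native" pagination: the DB layer returns exactly `limit` rows,
        # now including subnets that live on router:external networks.
        return subnets[:limit]

    def policy_filter(context, subnets):
        # Policy enforcement runs afterwards; with the default policy.json
        # the external subnets are filtered back out.
        return [s for s in subnets
                if s['tenant_id'] == context['tenant_id'] or s['shared']]

    context = {'tenant_id': 'tenant-a'}
    all_subnets = [
        {'id': 'own-1', 'tenant_id': 'tenant-a', 'shared': False},
        {'id': 'ext-1', 'tenant_id': 'admin', 'shared': False},  # external subnet
        {'id': 'own-2', 'tenant_id': 'tenant-a', 'shared': False},
    ]

    page = policy_filter(context, db_page(all_subnets, limit=2))
    # The caller asked for 2 records but gets only 1 back, even though a second
    # visible subnet ('own-2') exists -- which is what trips the pagination tests.
    print(page)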
22:09:40 in the pipeline I mean
22:09:46 armax, pagination is implemented in sqlalchemy code
22:09:59 if the plugin supports 'native' pagination
22:10:03 which is the case for ml2
22:10:03 i see
22:10:08 right
22:10:11 ok now I get it
22:10:29 as ihrachys highlighted, the correct fix is to somehow have the policy engine policies build the queries used
22:10:59 this has been on my wishlist for a long time because the policy engine is actually a lot less flexible than it appears
22:11:19 (the DB queries filter out any non-default access)
22:11:57 but it's going to be a significant effort i think to have something that translates policy entries into column filters
22:12:02 for SQLA
22:12:15 yeah
22:12:18 that sounds messy
22:12:24 kevinbenton, can we pass policy info via context?
22:12:38 unless this was special cased just to deal with subnets for external networks
22:12:49 ihrachys: yeah, i think we can do the work to get policies attached to the context
22:12:59 then deep in db code, we would extract that and build accordingly
22:13:06 ok
22:13:16 at least no need for a new plugin entry point :p
22:13:27 ihrachys: yeah, i think the 'build accordingly' bit is where it gets messy
22:13:36 and it sounds like this work had better go to oslo.policy/oslo.context?
22:13:37 I suppose another crazy idea would be to turn external networks into regular networks
22:13:54 kevinbenton, if it's a special case that we know about, it's not that bad
22:13:54 with an extra column 'external'?
22:14:01 and kill the external table?
22:14:18 armax: i don't think that helps. this is for subnets
22:14:49 but then the query may turn out less ugly
22:15:10 armax: the query itself isn't terrible. it's knowing when to use it
22:15:30 ihrachys: so are you thinking maybe just special case this one query?
22:16:11 yeah. I would refrain from generalizing if it's complex and we don't have another use case for that
22:17:22 ok, well i'll start by looking into getting the policies visible on the context or somewhere where the query builder will have access to them
22:17:59 #link https://bugs.launchpad.net/neutron/+bugs?field.status%3Alist=Triaged&field.tag=rfe
22:18:05 so to recap
22:18:12 before we go into the rfe list
22:18:34 policy mismatch with db hooks == broken pagination
22:18:39 changing the policy.json file to expose the list of subnets for external networks will make the pagination tests pass?
22:18:45 armax: yes
22:19:04 but then we have the issue of exposing something we didn't previously
22:19:11 on upgrade
22:19:17 what if we add a shim extension to that?
22:19:21 :)
22:19:32 well we could have an extension to indicate the behavior
22:19:38 it’s definitely a side effect
22:19:51 but inevitably some tempest test or something will explode somewhere
22:19:53 but what could go wrong if more subnets are exposed to a tenant now during subnet-list?
22:20:04 where it assumes that if a tenant can see a subnet and it's not shared, the tenant owns it
22:20:09 yeah, that’s the danger, but aside from that side effect
22:20:36 what else could go wrong?
22:21:00 we also need to verify that things like router interface add
22:21:11 verify ownership by tenant matching
22:21:40 there could be places where we attach by subnet like that which depend on the DB not returning anything to verify whether a user has access to it
22:22:32 sure, the same problem arises when the operator does enable this via policy though
22:22:33 correct?
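A rough sketch of the approach discussed above (policy information attached to the context, with the external-network subnet case special-cased in the query builder). The model and attribute names here are simplified stand-ins, not neutron's actual ones.

    # Sketch only: simplified models and a hypothetical 'expose_external_subnets'
    # flag that the policy layer would set on the request context.
    from sqlalchemy import Column, String, or_
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class Subnet(Base):
        __tablename__ = 'subnets'
        id = Column(String(36), primary_key=True)
        project_id = Column(String(255))
        network_id = Column(String(36))

    class ExternalNetwork(Base):
        __tablename__ = 'externalnetworks'
        network_id = Column(String(36), primary_key=True)

    def subnet_query(session, context):
        query = session.query(Subnet)
        if getattr(context, 'expose_external_subnets', False):
            # Special case for this one query: also return subnets that sit on
            # router:external networks, so that what the DB hands to native
            # pagination matches what the policy engine will let through.
            external_ids = session.query(ExternalNetwork.network_id)
            return query.filter(or_(Subnet.project_id == context.project_id,
                                    Subnet.network_id.in_(external_ids)))
        return query.filter(Subnet.project_id == context.project_id)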
22:23:08 armax: yep
22:24:06 so chances are that exposing subnets for an external network will still create issues we haven’t thought of
22:24:19 and I wonder if enabling this by default will allow us to root them out faster
22:24:31 armax: yep
22:24:48 if only a few operators use the non-default policy we probably won't hear about it until pike is EOL
22:24:50 and that fixes the tests, right?
22:25:07 mlavalle: yep
22:25:14 so that’s why I am a bit nervous
22:25:30 I mean if the one certainty we have is that this change will cause more 'bugs'
22:25:41 I wonder if we’re better off biting the bullet sooner rather than later
22:25:55 or we forget about it and don't proceed at all
22:26:00 it would also enable us to make the router operations more consistent from the user perspective
22:26:02 just a though
22:26:03 t
22:26:07 gateway attach could take a subnet
22:26:29 well, now you’re going down the path of changing the API
22:26:38 which shouldn’t be necessary
22:26:54 I mean, yeah, I can see that this change opens up the possibility
22:26:59 yeah, that's just a potential thing
22:27:04 not required at all
22:27:56 are subnets for external networks something sensitive?
22:28:16 i'm not sure
22:28:28 is there anything in the response payload that could be considered too infra-specific?
22:28:29 i haven't been able to think of a case where it would be a problem
22:28:55 maybe the operator stored their root password to the gateway in the subnet description
22:29:00 yeah
22:29:02 :)
22:29:19 but that's bad operator practice, LOL
22:29:29 one possible thing is the subnet service types feature
22:29:36 what about it?
22:29:48 you mean that gets exposed?
22:29:49 there might be a subnet meant just for DVR ports that users never really see
22:29:50 yeah
22:29:54 right
22:29:59 so they will see the floating IP subnet
22:30:00 I was thinking about that
22:30:05 or segment ID?
22:30:06 and an 'internal' subnet
22:30:17 is that on subnets? I can’t remember
22:30:30 well segment_id isn't a huge issue
22:30:34 just a UUID
22:30:52 https://github.com/openstack/neutron/blob/master/etc/policy.json#L21
22:30:58 it's admin-only anyway
22:31:18 so they wouldn't be able to see segment_id or service_types
22:31:22 so are service types
22:31:25 we already have a service type network:floatingip
22:31:27 except on the GET
22:31:45 but they would still be able to see the subnets themselves
22:32:10 see example 2 in https://docs.openstack.org/neutron/latest/admin/config-service-subnets.html
22:32:13 so if you have a weird subnet just for DVR interfaces on non-routable addresses you might get questions about it showing up in users' subnet lists
22:32:57 mlavalle: yeah
22:33:18 so would it be okay for normal users to see all of those subnets show up?
22:33:55 in principle, I don't think so
22:34:04 it feels wrong
22:34:13 yeah, it feels like exposing infra stuff
22:34:27 even if the operator changed the policy.json from the default
22:35:00 so on the RFE
22:35:06 what was the exact use case?
22:35:32 armax: https://bugs.launchpad.net/neutron/+bug/1653932
22:35:32 Launchpad bug 1653932 in neutron "[rfe] network router:external field not exported" [Wishlist,In progress] - Assigned to Kevin Benton (kevinbenton)
22:35:32 I wonder if it’s something PaaS related?
22:35:44 what comment #?
22:35:46 it's actually about IP visibility for allocating floating IPs
22:35:51 the description
22:36:06 users don't know which subnet to pick from
22:36:11 right, but why wouldn’t a random IP be enough?
22:36:45 some IPs have access to one thing and others to something else
22:37:12 like if there’s some sort of IP whitelist somewhere?
22:37:22 or even routing-wise?
22:37:26 yeah
22:37:38 why not create two networks?
22:37:42 ...right
22:37:46 I was gonna say the same
22:38:07 though now the choice is not based on IP address but 'label'
22:38:19 well it's even better to my taste
22:38:34 you can tag the network, put a nice name and even a description, whatever
22:38:45 network pistachio or network strawberry
22:38:50 yum
22:39:39 ok so how about clarifying the use case again before doing any coding?
22:39:59 so maybe go back to the rfe and have a dialog with the submitter to dig into the use case?
22:40:01 I suppose we had done that, but it looks like we did not
22:40:47 well, now we have a set of consequences that we can discuss with the submitter
22:41:34 armax: can you ask for clarification?
22:41:41 sure
22:41:45 can do
22:41:55 this does seem like a request for generic policy support without a very clear use case
22:42:03 he does mention that floating IP is just an example
22:42:17 re-reading the rfe, he actually mentions that fip is just an example
22:42:32 comment #4
22:42:41 yeah
22:42:54 true
22:43:05 maybe it's actually something service_types can help with
22:43:07 but since we have now understood the implications
22:43:07 and we decided to approach the solution from that perspective
22:43:14 maybe one of the subnets is private and useless
22:43:17 for floating IPs
22:43:19 much more than we had at the beginning
22:43:28 and the code doesn’t look pretty and the risk of errors is high
22:43:29 and yes, now we know the implications
22:43:37 let’s go back and fully understand what we need
22:43:52 ok
22:43:56 kevinbenton: service types need to have subnets exposed nonetheless, no?
22:44:15 true
22:44:19 Ok
22:44:26 I think we beat this horse hard enough
22:44:27 let’s move on
22:44:31 I’d say
22:44:38 i want to discuss https://bugs.launchpad.net/neutron/+bug/1604222 right away
22:44:38 Launchpad bug 1604222 in neutron "[RFE] Implement vlan transparent for openvswitch ML2 driver" [Wishlist,Triaged] - Assigned to Trevor McCasland (twm2016)
22:44:39 LOL, it must be dead now
22:44:41 that’s rude
22:44:44 :)
22:44:51 kevinbenton: please?
22:44:59 armax: please what?
22:45:07 i want to discuss https://bugs.launchpad.net/neutron/+bug/1604222 right away
22:45:12 where are your manners?
22:45:18 rather than going to the list i mean
22:45:22 still
22:45:24 be polite
22:45:27 :P
22:45:27 (even though I think this was at the top)
22:45:35 it was
22:45:50 there is some mixup here between two things
22:45:54 do we have full support in OVS?
22:45:58 nowadays?
22:46:06 2.8 will have it, it sounds like
22:46:12 g
22:46:14 "this will be released in ovs 2.8 in august prior to the release of pike"
22:46:22 finally
22:46:33 there is no official release yet, but people can always backport/pull master
22:46:34 however, what Trevor discussed is not related to vlan transparency
22:46:48 he is working on a QinQ network type driver
22:47:01 which will also depend on QinQ support in OVS for an OVS implementation
22:47:18 not sure I understand the need for a new type drivers
22:47:21 driver*
22:47:23 but they are orthogonal and i don't want the QinQ type driver to get mixed up with vlan transparency
22:47:52 a type driver goes hand in hand with an ml2 driver, no?
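For reference, a rough sketch of what a QinQ network type driver along the lines discussed below might look like if it reuses the existing ML2 VLAN type driver. The class and constant names are illustrative; this is not the actual proposed patch.

    # Illustrative sketch only, under the assumption that the QinQ type can
    # inherit the VLAN type driver's segment allocation as-is.
    from neutron.plugins.ml2.drivers import type_vlan

    TYPE_QINQ = 'qinq'

    class QinQTypeDriver(type_vlan.VlanTypeDriver):
        """A VLAN-like type driver that advertises a double-tagged network type.

        Segment allocation is inherited from the VLAN type driver; mechanism
        drivers (e.g. SR-IOV) would key off the 'qinq' type to program two
        802.1Q tags instead of one.
        """

        def get_type(self):
            return TYPE_QINQ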
22:47:58 armax: the QinQ driver is to allow double-encapped packets to hit the wire
22:48:12 double-tagged*
22:48:46 how do drivers that implement vlan transparency do it?
22:48:49 and he is sub-classing the vlan type to do that
22:48:52 oh so you’re saying it’s a way for the OVS agent to understand that it needs to do push/pop?
22:49:02 rather than strip/replace?
22:49:04 see this is the problem :)
22:49:17 the qinq type driver has nothing to do with vlan transparency
22:49:27 then I still don’t understand
22:49:41 the qinq type driver is just like the vlan driver
22:49:44 right
22:49:49 with more tags
22:49:51 except it says to place two tags in the header
22:49:53 so if it’s yet another tunnelling driver
22:49:53 inside
22:49:56 yep
22:50:05 it does have no reason to exist in the repo
22:50:07 what’s the use case?
22:50:13 I mean
22:50:17 I have nothing against it
22:50:21 geneve has no reason to exist in the repo
22:50:35 I wasn’t PTL when that merged ;)
22:50:39 I don’t think
22:50:47 it's not vendor specific
22:50:51 but my question is what’s it for?
22:51:02 if it’s not going to be used in vlan transparency?
22:51:05 do we know?
22:51:06 if you have infra expecting qinq frames
22:51:11 I can see what geneve is going to be used for
22:51:17 OVN
22:51:28 it lets you scale way beyond 4096 vlans
22:51:29 qinq type driver
22:51:44 do we know the mech driver that will leverage it?
22:51:45 dude
22:51:49 stop lecturing us on QinQ
22:51:50 SR-IOV
22:51:56 armax: then don't ask questions
22:51:56 we all get it, I think :P
22:52:06 you’re answering the questions I am asking
22:52:15 then we're in agreement?
22:52:23 I think so
22:52:31 ok, qinq is fine
22:52:32 :)
22:52:34 so you’re saying that the sriov driver will use this type driver?
22:52:37 but it needs a separate RFE
22:52:49 armax: yes, the SR-IOV folks were requesting this at the PTG and the summit
22:52:58 i need to talk to trevor to have him file a separate RFE
22:53:02 OK
22:53:17 so there’s work on the sriov driver to stitch things together?
22:53:20 because the VLAN transparency is going to be completely different work on the OVS agent openflow pipeline
22:53:33 armax: yep, just to configure the tagging like it does for vlans
22:54:07 so long as the relationship is properly documented in the process of contributing this, then I think this is fine
22:54:13 do we need model changes, you reckon?
22:54:50 no, i don't think so, but that will leave this VLAN transparency RFE open to be implemented by someone else
22:55:04 so if anyone's company is depending on OVS transparency support, find some contributors :)
22:55:44 I think we may be interested, I think the topic flew around, but maybe it was for ovs, I will check :)
22:56:35 it would be nice to finally close the circle about VLAN management and OVS
22:56:43 ihrachys: sounds good, we might as well :)
22:57:12 kevinbenton, noted
22:57:29 OK
22:57:35 3 mins to the top of the hour
22:58:05 ok, sorry
22:58:08 was commenting on the RFE
22:58:17 no need to be sorry
22:58:21 just did your job for once
22:58:22 anyone have any last-minute announcements?
22:58:25 :P
22:58:30 no
22:58:32 none from em
22:58:33 me
22:58:40 no
22:58:51 ok, then i'll give you a bunch of free time back :)
22:58:53 #endmeeting