22:02:31 #startmeeting neutron_drivers
22:02:32 Meeting started Thu Jul 7 22:02:31 2016 UTC and is due to finish in 60 minutes. The chair is armax. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:02:33 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:02:35 The meeting name has been set to 'neutron_drivers'
22:02:36 hi
22:02:36 hello folks
22:02:52 yo
22:03:15 so, let’s go over the list of RFEs that you guys systematically ignore on a weekly basis
22:03:22 ack
22:03:30 starting with the stick today.
22:03:41 Well, I am stating the obvious
22:03:55 it's ok, your carrot is just a big stick painted orange too, but we all love you anyway.
22:04:01 :)
22:04:45 the list
22:04:49 #link https://bugs.launchpad.net/neutron/+bugs?field.status%3Alist=Triaged&field.tag=rfe&orderby=datecreated&start=0
22:04:58 bug #1476527
22:04:58 bug 1476527 in neutron "[RFE] Add common classifier resource" [Wishlist,Triaged] https://launchpad.net/bugs/1476527
22:05:35 * HenryG notes that the "ignoring" is also probably an indication of reality, i.e. the list is longer than we have capacity for.
22:06:25 HenryG: perhaps, but no one is asking you to dedicate the entire week to your duty as a driver
22:06:31 I saw the other spec today, about the IETF-standard-based SFC / classifier
22:06:49 HenryG: a couple of hours a week bundled across all of us would make a huge difference, but anyway I digress
22:07:00 and I must remark, I heard concerns from people leading SFC in other projects that networking-sfc diverges a bit,
22:07:01 ajo: link?
22:07:11 yikes... /me tries
22:07:39 igordcard: ^^
22:07:55 probably from igordcard, yes
22:07:57 * ajo keeps looking
22:08:07 the bottom line with this one is that Igor chose to propose this to neutron-specs; I think if the intention is to provide a model and API, then people who are interested in working on this should simply work on the neutron-classifier project
22:08:17 I provided some comments/feedback on the RFE
22:08:31 ajo: https://review.openstack.org/#/c/333993/ ?
22:08:34 this was pretty much what we already agreed on about a year ago
22:08:56 https://review.openstack.org/#/c/308453/
22:09:11 ah, right. that one
22:09:36 honestly the classification rule sets are much cleaner and more modular than the flow classifiers defined in the sfc spec
22:09:50 which are just huge rows with lots of columns
22:09:58 which eventually end up being incompatible
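[Editor's note: a minimal sketch of the contrast ajo is drawing. The field and attribute names below are illustrative only and are taken from neither spec.]

```python
# A flow classifier as one wide row: every matchable field is a column,
# and most columns sit unused (NULL) for any given classifier.
flow_classifier = {
    'ethertype': 'IPv4',
    'protocol': 'tcp',
    'source_ip_prefix': '10.0.0.0/24',
    'destination_ip_prefix': None,
    'source_port_range_min': None,
    'source_port_range_max': None,
    'destination_port_range_min': 80,
    'destination_port_range_max': 80,
    # ... one column per field the API may ever want to match on
}

# A classification rule set: small, typed rules composed into a group,
# so new match types extend the model instead of widening the row.
classification_group = {
    'name': 'web-traffic',
    'rules': [
        {'type': 'ipv4', 'src_prefix': '10.0.0.0/24'},
        {'type': 'tcp', 'dst_port_min': 80, 'dst_port_max': 80},
    ],
}
```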
22:10:16 Well, a rule-based approach adds complexity, to be honest
22:10:25 ajo: so are you saying that review 308453 clashes with review 333993
22:10:31 ?
22:10:54 surely the format for packet classification and rules is something that has been solved well elsewhere, and we can do something similar (and compatible), right?
22:11:11 they propose different models
22:11:35 ajo: and yet it’s the same person who proposes both?
22:11:38 I am puzzled
22:11:52 it's the same?
22:11:59 I'm puzzled too, I just saw the other proposal today
22:12:23 anyhow, I think the objective of this discussion is not so much how to converge on a commonly agreed API, but rather whether we consider traffic classification in scope for Neutron
22:12:25 I think it is
22:12:37 armax: +1
22:12:44 thanks for re-railing the conversation again, sorry
22:13:04 but to work in a way that this thing can be reusable across the stadium projects, it clearly needs to be a module on its own
22:13:26 now we can take as much time as we need to iterate on the spec until consensus is achieved
22:13:38 I think review 333993 is, as it says in the commit message, a place for documenting and generating discussion of possible approaches. does not clash per se with review 308453
22:13:55 if interested folks are happy to iterate on proposals targeting neutron-specs, at this point I don’t care one way or another
22:13:55 i agree that it makes sense as a neutron concept.
22:14:29 if we want the discussion on the spec, we should highlight it in the next neutron meeting or something.
22:14:50 I believe it's valuable in the neutron context,
22:14:52 since the problem space and solution are not as well defined as in other efforts, we definitely want to give folks room for freedom, and hence yet another reason for having neutron-classifier as a standalone thing
22:15:27 you mean a separate repo in the stadium?
22:15:46 armax, eventually I believe it will be valuable to make qos depend on it
22:15:53 HenryG: not a stadium project, it’s yet another openstack project repo
22:16:04 ajo: prove to us we can use it in the qos and we’ll talk
22:16:11 ajo: we’re far from getting to that point
22:16:17 I am not signing off a blank check
22:16:20 sorry
22:16:26 doesn't the classifier already have its own repo to use?
22:16:30 yes
22:16:32 yes
22:16:39 I don’t want to care if it breaks
22:16:44 I don’t want to review infra patches
22:16:52 well, it's just a db model, mostly
22:17:03 until we get it to a solid place
22:17:18 I am not sure a separate repo works. it is unclear what approach we can take for a common classifier.
22:17:41 IIRC when we started neutron-classifier we tried to start with db models and validators.
22:17:45 * igordcard reading from the beginning
22:17:50 amotoki: when we get to a point that for instance the qos folks can/want to use it, then we can talk
22:17:59 until then, this is just a proposal on paper
22:18:11 armax: agree
22:18:17 that might go nowhere, still
22:18:23 amotoki: it can either be a lib or a consumer; a separate repo can always work, just with more hoops to integrate.
22:18:35 in this case, it'd be a lib on pypi
22:18:41 well, if that's doable, I don't disagree
22:18:53 dougwig: sounds fair. it works
22:19:02 dougwig: it’d need to be a separate thing anyway as it may be reused by more than a single project
22:19:26 it imposes a lot of extra work for distributions to get it consumed,
22:19:40 and I believe it deserves decent oversight to get something that's really common and works for all
22:19:46 and the ultimate commandment of the Stadium is thou shalt not import neutron
22:19:51 otherwise, we will end up with 10 common classifiers
22:19:55 :)
22:20:00 ajo: +1
22:20:07 ajo: I expect the oversight to come from the neutron cores who care
22:20:10 you being one of them
22:20:16 anyone else, please sign up
22:21:06 I think we can make it work, we only need perseverance and diligence
22:21:15 shall we move on?
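[Editor's note: a sketch of the consumption model dougwig describes, i.e. neutron-classifier published as a library on pypi that stadium projects (and eventually neutron QoS) import, without ever importing neutron itself. All module, class, and function names below are hypothetical; the project's real API was still being designed at this point.]

```python
# Hypothetical: what a consumer such as networking-sfc might do if
# neutron-classifier shipped shared models/validators as a pypi lib.
from neutron_classifier import models as clsf_models        # hypothetical
from neutron_classifier import validators as clsf_validate  # hypothetical

def create_flow_classifier(context, data):
    clsf_validate.validate_classification(data)      # shared validation logic
    group = clsf_models.ClassificationGroup(**data)  # shared db model
    context.session.add(group)
    return group
# Note the absence of any "import neutron": the lib has to stand alone.
```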
22:21:42 I will sign up, but I'll need help
22:21:57 ajo: let’s take this offline between you and me to make sure you can be effective in helping the lot converge on a solution
22:22:08 armax, ack :)
22:22:28 ok
22:22:35 anyone else want to add anything?
22:23:40 in general we feel positive about traffic classification being something we care about; we’ll have to incubate the work and watch it closely
22:24:00 once this becomes more than just a spec we’ll talk about how to bring it into the neutron fold
22:24:03 moving on
22:24:10 bug #1552680
22:24:10 bug 1552680 in neutron "[RFE] Add support for DLM" [Wishlist,Triaged] https://launchpad.net/bugs/1552680 - Assigned to John Schwarz (jschwarz)
22:24:22 I did my homework
22:24:27 cough cough
22:24:47 * amuller hands armax a double chocolate fudge peanut butter cookie
22:24:49 Mirantis does not deploy tooz right now unless Ceilometer is used
22:25:03 yes, HOS is the same
22:25:06 so Cinder must not require it
22:25:18 * carl_baldwin did some homework too.
22:25:24 Cinder doesn't require it yet.
22:25:24 kevinbenton: Cinder is in the process of adopting it
22:25:26 but not yet
22:25:39 armax, kevinbenton: correct
22:25:50 (it's WIP)
22:26:11 so, punt until this gets more mature?
22:26:28 so the bottom line is: we can expect operators/distros to support tooz/a backend of your choice at some point
22:26:45 dougwig: at this point, I’d punt it to be the first thing we do as soon as Ocata opens up
22:26:51 dougwig: If everyone punted, no one would adopt it. We know it's used by Ceilometer and likely to be used by Cinder.
22:26:58 that gives us an extra 6 months to get the message out
22:27:24 amuller: i don't think nova and neutron need to be on the bleeding edge of helping adoption. ceilometer is still very optional.
22:27:32 for us, stability is king.
22:27:35 so my position would be, we can start working on it, but merge it as soon as ocata master opens up
22:27:50 dougwig: How would you explain nova and oslo versioned objects, API versioning?
22:28:01 Sometimes the big projects march first
22:28:19 ovo has no deployer impact.
22:28:25 not every project is alike, not every problem is alike
22:28:36 API versioning, pinning and rolling upgrades do.
22:28:56 the compare-and-contrast exercise is not really helping us make a decision in this matter IMO
22:29:12 tooz requires another system with a separate life-cycle of upgrades and maintenance
22:29:15 armax: I agree that work should continue but not merge until early O
22:29:24 is Ocata a timeline people feel comfortable with or do we want to defer it to P?
22:29:33 because like it or not
22:29:41 if Cinder is in the process, what is their timeline?
22:29:48 I think tooz/a backend is something distros and operators will have to bite the bullet on at some point
22:29:53 might as well leverage that if we can
22:30:07 kevinbenton: if they behave perhaps N?
22:30:16 I don't understand how anyone could possibly have enough information to decide *now* if this should preemptively be punted to P, before we've even started the O cycle.
22:30:22 i'm ok with O, provided the performance and stability testing shows a positive return. if we're just doing it because, then i see no point ever.
22:31:08 amuller: well it’s more a matter of whether the extra 6 months on top of the other 6 months would be welcomed by distros and operators
22:31:31 but I am more like let’s go in O, as that might well slip to P anyway :)
22:31:39 one never knows!
22:31:43 well if cinder forces it for N, i'm fine with O
22:31:53 ok, let’s provisionally go with O
22:32:00 and reassess as things evolve
22:32:05 jschwarz: go at it!
22:32:07 +1
22:32:10 +1
22:32:11 +1
22:32:14 jschwarz: but take it easy
22:32:23 jschwarz: don’t go until like 4am every day, ok?
22:32:40 jschwarz taking it easy? :)
22:32:45 I mean if you want to party, fine by me
22:32:52 It's 1:30am for John, he may not actually be here =p
22:33:02 he will read it, I'm sure :-)
22:33:09 ok, moving on?
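[Editor's note: for context, the tooz primitive the DLM RFE is about. A minimal sketch using tooz's coordination API; the backend URL is a deployment-specific placeholder (zookeeper, redis, memcached, etc. are all drivers) and do_exclusive_work is a hypothetical critical section.]

```python
from tooz import coordination

# Backend choice is the deployer-impact point debated above; the URL here
# is only a placeholder.
coordinator = coordination.get_coordinator(
    'zookeeper://127.0.0.1:2181', b'neutron-server-1')
coordinator.start()

# A named distributed lock: only one member across the whole deployment
# holds it at a time, regardless of which server process asks.
lock = coordinator.get_lock(b'neutron-shared-resource')
with lock:
    do_exclusive_work()  # hypothetical critical section

coordinator.stop()
```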
22:33:26 bug #1580880
22:33:26 bug 1580880 in neutron "[RFE] Distributed Portbinding for all port types" [Wishlist,Triaged] https://launchpad.net/bugs/1580880 - Assigned to Andreas Scheuring (andreas-scheuring)
22:33:32 anyone seen this one?
22:34:03 I think this is trying to chew more than it can handle
22:34:20 This is the one driven by live migration.
22:34:25 I provided feedback, I am not clear on the conclusion yet
22:34:28 On the slate to discuss at the Nova mid-cycle.
22:34:47 carl_baldwin: right, we need to agree on the scope of the work for N
22:34:53 but this is clearly spilling into O
22:35:03 agreed
22:35:24 It isn't on nova's priority list for N, so it won't make it there either.
22:35:31 I’d rather solve the model and logical schema duplication first and worry about the potential API changes later
22:35:35 i said last week, and i'll repeat, we shouldn't pigeonhole this as just live migration. it's also a very relevant use case for service VMs (which basically do live migration as part of their lifecycle.)
22:35:40 obviously the former needs to accommodate the latter
22:36:11 dougwig: I don't see any demand for that. Is there much that I'm not aware of?
22:36:12 dougwig: what do you mean? I might have missed your position
22:36:22 from last week’s meeting
22:36:45 dougwig: I'm not saying it isn't relevant, I just don't hear people asking for it. But, maybe I'm not listening in the right places.
22:36:50 i know that we'd use it in octavia. we can't do zero-downtime upgrades today.
22:37:07 it might be that folks have just learned to work around it.
22:37:29 dougwig: ok, but at the very bottom you’re still live-migrating VMs, aren’t you?
22:37:51 yes, same basic use case, as long as we don't have to use the nova live-migrate mechanism itself.
22:37:56 gotcha
22:38:13 dougwig: so in other words you’d be ok if the bindings were extended just to two hosts rather than N
22:38:15 dougwig: So, live-migrating a port without the VM?
22:38:35 carl_baldwin: yes, that'd work
22:38:41 or do you see a case where binding a port to N hosts is something that’s relevant to your use case?
22:38:47 armax: how do we know what the second host is?
22:39:03 in the case of Nova, the scheduler or the user tells you
22:39:34 then yes, i'd be ok with a post-launch attach, so 2 would cover it.
22:40:00 ok, I am trying to understand if the added complexity of a super-generic port binding API is justified
22:40:01 armax: so you don't think it's worth unspecial-casing DVR?
22:40:34 kevinbenton: from an API point of view I am not sure if we ever need to fiddle with the compute bindings from the REST API
22:41:14 kevinbenton: but I might be missing something, so it’s worth asking around
22:41:41 so for this one we shall iterate on it and provide Andreas with as much feedback as we can
22:41:47 please folks, do review the spec
22:42:37 ok
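[Editor's note: an illustrative sketch of the shape this RFE is after. At this point a port carries a single binding:host_id; the proposal is, roughly, one binding per host for the duration of a migration. The structures below are hypothetical, not the spec's actual API.]

```python
# Today: a port is bound to exactly one host.
port_before = {
    'id': 'PORT_UUID',  # placeholder
    'binding:host_id': 'compute-1',
    'binding:vif_type': 'ovs',
}

# Proposed (roughly): one binding per host during a migration, so both the
# source and the target host can wire the port up before cut-over.
port_during_migration = {
    'id': 'PORT_UUID',
    'bindings': [
        {'host': 'compute-1', 'vif_type': 'ovs', 'status': 'ACTIVE'},    # source
        {'host': 'compute-2', 'vif_type': 'ovs', 'status': 'INACTIVE'},  # target
    ],
}
# On completion the target binding is activated and the source binding
# removed. dougwig's service-VM case needs the same two-host shape, just
# without nova's live-migration machinery driving it.
```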
22:42:42 bug #1583694
22:42:43 bug 1583694 in neutron "[RFE] DVR support for Allowed_address_pair port that are bound to multiple ACTIVE VM ports" [Wishlist,Triaged] https://launchpad.net/bugs/1583694 - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
22:43:08 I may be completely biased here, but I lean towards the nay rather than the yay
22:43:41 I have some concerns about the added complexity and future headaches that this use case may bring
22:43:56 does anyone share my feelings? Or am I simply a chicken?
22:44:13 I think your concerns are pretty well stated in the comments.
22:44:41 I know I stated them well, I was asking if you guys agreed :)
22:44:47 If there is an easy way to modify the topology to achieve the goal ...
22:44:55 i think it sucks that DVR is incompatible with allowed address pairs
22:45:12 can't do any HA services offered by VMs
22:45:49 kevinbenton: it would be incompatible only under certain conditions
22:46:32 carl_baldwin: what’s your position?
22:46:39 carl_baldwin: you’re a lot more involved in DVR than I am
22:46:50 armax: under any condition where DVR is used :)
22:47:01 kevinbenton: not true
22:47:05 armax: a VM can't do a virtual IP with another VM
22:47:11 kevinbenton: you can use allowed address pairs with DVR today
22:47:12 armax: I've only just read your comment. Need to digest.
22:47:18 carl_baldwin: ok
22:47:27 anyone else have an opinion?
22:47:50 come on folks, your job as a driver is to have an opinion, pick your brushes
22:47:51 Question about the DVR FIP bug?
22:48:16 sorry, i'm still stuck on whether i even want DVR as a first-class feature in neutron, which biases me greatly.
22:48:25 armax: no, you can't have a virtual IP unless i'm missing something
22:48:31 armax: that's the whole point of the request, no?
22:48:55 armax: e.g. multiple ports have the same IP in allowed address pairs, which they use VRRP or something similar to advertise
22:48:58 if DVR is a first-class citizen, then punting this means other first-class features don't work intuitively, which i disagree with.
22:49:03 kevinbenton: active/active, no
22:49:09 kevinbenton: active/passive works
22:49:15 kevinbenton: that’s my understanding anyway
22:49:23 I might be wrong
22:49:37 Active/Passive does not work
22:49:50 armax: unless the VMs update the API somehow, this doesn't work at all because nothing can tell the l3 agents that the IP has moved
22:50:26 armax: so active/passive only works if 'passive' means that the virtual IP is removed from the allowed address pairs
22:50:30 Right, the DVR stuff is ignoring the GARP that VRRP/keepalived sends when it moves the IP
22:50:31 on the non-active hosts
22:50:32 johnsom: I am gonna have to fish for bug reports filed in the past then
22:51:34 kevinbenton: I don’t recall off the top of my head how allowed address pair support was added to DVR in the past
22:51:47 kevinbenton: but I do recall that some fixes landed
22:51:50 The issue for us is Octavia uses allowed address pairs and VRRP to do active/standby. With DVR enabled, floating IPs don't work.
22:51:53 carl_baldwin: does that ring a bell?
22:52:12 an active/passive mechanism that needs API access to perform a failover is far from ideal... you'd want a failover process that uses solely the data plane, not the control plane
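[Editor's note: a sketch of the Octavia-style setup under discussion, using python-neutronclient; the VIP address, port IDs, and session setup are placeholders. Both member ports list the shared VIP in allowed_address_pairs, and keepalived moves the VIP between them via VRRP plus gratuitous ARP, which is exactly the signal DVR ignores.]

```python
from neutronclient.v2_0 import client

neutron = client.Client(session=sess)  # placeholder keystoneauth session

VIP = '10.0.0.100'  # placeholder virtual IP shared by the pair
for port_id in (active_port_id, standby_port_id):  # placeholder port IDs
    # Both ports are allowed to send traffic as the VIP; which one actually
    # owns it at any moment is decided by VRRP on the data plane.
    neutron.update_port(port_id, {
        'port': {'allowed_address_pairs': [{'ip_address': VIP}]},
    })
# The floating IP is then associated with a separate port whose fixed IP is
# the VIP. Under DVR there is no single host binding for that port, so the
# FIP never gets placed anywhere, which is the failure johnsom describes.
```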
22:52:15 johnsom: the use case is clear
22:52:25 otherwise you're assuming that the control plane is even up, and in an HA scenario you don't really wanna do that
22:52:37 but the solution proposed is cobbling together DVR and CVR to deliver a Frankenstein
22:52:37 armax: It rings a bell. I don't think it added anything having to do with multiple ports with the same IP.
22:52:55 Yeah, failover via the API is too slow. We can fail over in a second or two with VRRP.
22:53:14 that defeats the purpose of VRRP
22:53:53 it's the speed, but it's also the robustness of the solution
22:54:03 if robustness is a word :)
22:54:07 it is
22:54:31 The problem is that the fip is conceptually on the other side of a router from the VMs and where VRRP is doing its thing. I don't see a way around going through the CVR.
22:54:31 so, are we saying that in the absence of a better architectural solution being proposed this is dead in the water?
22:54:54 well we can have a solution that works but is less than optimal i think
22:55:08 like "don't vrrp with fips" ?
22:55:09 if a floating IP is associated with something that DVR can't find
22:55:18 it is realized on the centralized node
22:55:25 My question/proposal is why can't DVR honor the GARPs like normal switches do.
22:55:53 it sounds like just a bug in the DVR case to me, though it may not be easy to fix.
22:56:21 a complex bug to solve
22:56:22 johnsom: i think it could. it's just that the floating IP doesn't even get placed anywhere by dvr
22:56:31 johnsom: because it doesn't know where to put it
22:56:37 kevinbenton: +1 The fip isn't hosted anywhere.
22:57:01 carl_baldwin: and the solution is to place it on the network node
22:57:10 i think a regression to centralized would at least get us to parity with legacy routing for an interim solution
22:57:13 armax: Where else would you put it?
22:57:23 carl_baldwin: I don’t disagree
22:57:48 carl_baldwin: all I am saying is that the added complexity to deal with this scares the hell out of me
22:58:18 I want to limit the opportunities for people to say that DVR sucks
22:58:26 rather than the contrary
22:58:31 This is already one of those. :)
22:58:32 but they’d say it anyway
22:58:33 :)
22:58:38 carl_baldwin: +1 :)
22:59:01 well, no, I slightly disagree
22:59:28 Yeah, we have had a number of folks hit this limitation and be upset
23:00:37 ok, let’s brainstorm on this a little longer
23:00:45 we’re at time
23:00:57 carl_baldwin: let’s take this offline
23:01:06 thanks folks and keep up the good work
23:01:11 #endmeeting
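[Editor's note: a pseudo-code sketch of the interim behaviour floated around 22:55, not actual neutron code; the helper name and host label are invented. When DVR cannot resolve a floating IP's fixed IP to a single bound host, fall back to hosting the FIP centrally, as legacy (CVR) routing would.]

```python
def schedule_floating_ip(fip, ports_with_fixed_ip):
    """Pick where to realize a FIP (hypothetical helper, names invented)."""
    bound_hosts = {p['binding:host_id'] for p in ports_with_fixed_ip
                   if p.get('binding:host_id')}
    if len(bound_hosts) == 1:
        # Normal DVR: the FIP lives on the compute node hosting the port.
        return bound_hosts.pop()
    # Unbound or ambiguous (e.g. an allowed-address-pairs VIP shared by
    # several ACTIVE ports): centralize the FIP on the network node, at
    # parity with legacy routing.
    return 'network-node'
```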