22:02:31 <armax> #startmeeting neutron_drivers
22:02:32 <openstack> Meeting started Thu Jul  7 22:02:31 2016 UTC and is due to finish in 60 minutes.  The chair is armax. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:02:33 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:02:35 <openstack> The meeting name has been set to 'neutron_drivers'
22:02:36 <cgoncalves> hi
22:02:36 <armax> hello folks
22:02:52 <amuller> yo
22:03:15 <armax> so, let’s go over the list of RFEs that you guys systematically ignore on a weekly basis
22:03:22 <HenryG> ack
22:03:30 <dougwig> starting with the stick today.
22:03:41 <armax> Well, I am stating the obvious
22:03:55 <dougwig> it's ok, your carrot is just a big stick painted orange too, but we all love you anyway.
22:04:01 <armax> :)
22:04:45 <armax> the list
22:04:49 <armax> #link https://bugs.launchpad.net/neutron/+bugs?field.status%3Alist=Triaged&field.tag=rfe&orderby=datecreated&start=0
22:04:58 <armax> bug #1476527
22:04:58 <openstack> bug 1476527 in neutron "[RFE] Add common classifier resource" [Wishlist,Triaged] https://launchpad.net/bugs/1476527
22:05:35 * HenryG notes that the "ignoring" is also probably an indication of reality, i.e. the list is longer than we have capacity for.
22:06:25 <armax> HenryG: perhaps, but no one is asking you to dedicate the entire week to your duties as a driver
22:06:31 <ajo> I saw the other spec today about the IETF-standards-based SFC / classifier
22:06:49 <armax> HenryG: a couple of hours a week combined across all of us would make a huge difference, but anyway I digress
22:07:00 <ajo> and I must note, I've heard concerns from people leading SFC in other projects that networking-sfc diverges a bit,
22:07:01 <armax> ajo: link?
22:07:11 <ajo> yikes... /me tries
22:07:39 <cgoncalves> igordcard: ^^
22:07:55 <ajo> probably from igordcard, yes
22:07:57 * ajo keeps looking
22:08:07 <armax> the bottom line with this one is that Igor chose to propose this to neutron-specs; I think if the intention is to provide a model and API, then people who are interested in working on this should simply work on the neutron-classifier project
22:08:17 <armax> I provided some comment/feedback on the RFE
22:08:31 <cgoncalves> ajo: https://review.openstack.org/#/c/333993/ ?
22:08:34 <armax> this was pretty much what we already agreed like a year ago
22:08:56 <ajo> https://review.openstack.org/#/c/308453/
22:09:11 <cgoncalves> ah, right. that one
22:09:36 <ajo> honestly the classification rule sets are much cleaner and more modular than the flow classifiers defined in the sfc spec
22:09:50 <ajo> which are just huge rows with lots of columns
22:09:58 <ajo> and eventually end up incompatible with each other
22:10:16 <ajo> Well, a rule based approach adds complexity, to be honest
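(For context, a minimal illustration of the modeling difference ajo describes above — the field names are hypothetical and taken from neither spec: a monolithic flow-classifier row where every match field is a column, versus small composable classification rules grouped into sets.)

    # Hypothetical sketch only; not the schema of either review under discussion.

    # "Flow classifier" style: one wide row, every match field a column,
    # most of them NULL for any given classifier.
    flow_classifier = {
        'name': 'web-traffic',
        'ethertype': 'IPv4',
        'protocol': 'tcp',
        'source_ip_prefix': '10.0.0.0/24',
        'destination_ip_prefix': None,
        'source_port_range_min': None,
        'source_port_range_max': None,
        'destination_port_range_min': 80,
        'destination_port_range_max': 80,
        # ...and so on, one column for every field anyone ever wants to match
    }

    # "Classification rule set" style: small typed rules composed into a set.
    ipv4_rule = {'type': 'ipv4', 'src_prefix': '10.0.0.0/24'}
    tcp_rule = {'type': 'tcp', 'dst_port_min': 80, 'dst_port_max': 80}
    rule_set = {'name': 'web-traffic', 'rules': [ipv4_rule, tcp_rule]}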
22:10:25 <armax> ajo: so are you saying that review 308453 clashes with review 333993
22:10:31 <armax> ?
22:10:54 <dougwig> surely the format for packet classification and rules is something that has been solved well elsewhere, and we can do something similar (and compatible), right?
22:11:11 <ajo> they propose different models
22:11:35 <armax> ajo: and yet it’s the same person who proposes both?
22:11:38 <armax> I am puzzled
22:11:52 <ajo> it's the same?
22:11:59 <ajo> I'm puzzled too, I just saw the other proposal today
22:12:23 <armax> anyhow, I think the objective of this discussion is not so much how to converge on a commonly agreed API, but rather whether we consider traffic classification in scope for Neutron
22:12:25 <armax> I think it is
22:12:37 <ajo> armax: +1
22:12:44 <ajo> thanks for putting the conversation back on track, sorry
22:13:04 <armax> but for this thing to be reusable across the stadium projects, it clearly needs to be a module on its own
22:13:26 <armax> now we can take as much time as we need to iterate on the spec until consensus is achieved
22:13:38 <cgoncalves> I think review 333993 is, as it says in the commit message, a place for documenting and generating discussion of possible approaches. does not clash per se with review 308453
22:13:55 <armax> if interested folks are happy to iterate on proposals targeting neutron-specs, at this point I don’t care one way or another
22:13:55 <dougwig> i agree that it makes sense as a neutron concept.
22:14:29 <dougwig> if we want the discussion on the spec, we should highlight it in the next neutron meeting or something.
22:14:50 <ajo> I believe it's valuable in the neutron context,
22:14:52 <armax> since the problem space and solution are not as well defined as others, we definitely want to give folks room for freedom, hence yet another reason for having neutron-classifier as a standalone thing
22:15:27 <HenryG> you mean a separate repo in the stadium?
22:15:46 <ajo> armax, eventually I believe it will be valuable to make qos depend on it
22:15:53 <armax> HenryG: not a stadium project, just yet another openstack project repo
22:16:04 <armax> ajo: prove to us that we can use it in QoS and we'll talk
22:16:11 <armax> ajo: we're far from getting to that point
22:16:17 <armax> I am not signing off on a blank check
22:16:20 <armax> sorry
22:16:26 <dougwig> doesn't the classifier already have its own repo to use?
22:16:30 <armax> yew
22:16:32 <armax> yes
22:16:39 <armax> I don’t want to care if it breaks
22:16:44 <armax> I don’t want to review infra patches
22:16:52 <ajo> well, it's just a db model, mostly
22:17:03 <armax> until we get it to a solid place
22:17:18 <amotoki> I am not sure a separate repo works. it is unclear what approach we can take for a common classifier.
22:17:41 <amotoki> IIRC when we started neutron-classifier we tried to start with db models and validators.
22:17:45 * igordcard reading from the beginning
22:17:50 <armax> amotoki: when we get to a point that for instance the qos folks can/want to use it, then we can talk
22:17:59 <armax> until then, this is just a proposal on paper
22:18:11 <amotoki> armax: agree
22:18:17 <armax> that might go nowhere, still
22:18:23 <dougwig> amotoki: it can either be a lib or a consumer; a separate repo can always work, just with more hoops to integrate.
22:18:35 <dougwig> in this case, it'd be a lib in pypi
22:18:41 <ajo> well, if that's doable, I don't disagree
22:18:53 <amotoki> dougwig: sounds fair. it works
22:19:02 <armax> dougwig: it'd need to be a separate thing anyway as it may be reused by more than a single project
22:19:26 <ajo> it imposes a lot of extra work on distributions to get it consumed,
22:19:40 <ajo> and I believe it deserves decent oversight to get something that's really common and works for all
22:19:46 <armax> and the ultimate commandment of the Stadium is: thou shalt not import neutron
22:19:51 <ajo> otherwise, we will end up with 10 common classifiers
22:19:55 <ajo> :)
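(A sketch of the consumption pattern dougwig and armax are describing — the package and function names below are illustrative, not the real neutron-classifier API: the classifier ships as a plain library on pypi, owning its db models and validators, and any consuming project imports it; the dependency arrow never points back at neutron.)

    # Hypothetical consumer inside neutron or a stadium project; names
    # below are made up for illustration.
    from neutron_classifier import models as cls_models          # pypi lib
    from neutron_classifier import validators as cls_validators  # pypi lib

    def create_classification_group(context, group_data):
        # The library owns validation and persistence of classifier
        # resources; the consumer (QoS, SFC, FWaaS, ...) only wires the
        # resulting id into its own resources. Crucially, the library
        # itself never imports neutron.
        cls_validators.validate(group_data)
        return cls_models.ClassificationGroup.create(context, group_data)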
22:20:00 <cgoncalves> ajo: +1
22:20:07 <armax> ajo: I expect the oversight to come from the neutron cores who care
22:20:10 <armax> you being one of them
22:20:16 <armax> any other, please sign up
22:21:06 <armax> I think we can make it work, we only need perseverance and diligence
22:21:15 <armax> shall we move on?
22:21:42 <ajo> I will sign up, but I'll need help
22:21:57 <armax> ajo: let’s take this offline between you and me to make sure you can be effective in helping the lot converge on a solution
22:22:08 <ajo> armax, ack :)
22:22:28 <armax> ok
22:22:35 <armax> does anyone else want to add anything?
22:23:40 <armax> in general we feel positive about traffic classification being something we care about; we'll have to incubate the work and watch it closely
22:24:00 <armax> once this becomes more than just a spec we’ll talk on how to bring it into the neutron fold
22:24:03 <armax> moving on
22:24:10 <armax> bug #1552680
22:24:10 <openstack> bug 1552680 in neutron "[RFE] Add support for DLM" [Wishlist,Triaged] https://launchpad.net/bugs/1552680 - Assigned to John Schwarz (jschwarz)
22:24:22 <armax> I did my homework
22:24:27 <armax> cough cough
22:24:47 * amuller hands armax a double chocolate fudge peanut butter cookie
22:24:49 <kevinbenton> Mirantis does not deploy tooz right now unless Ceilometer is used
22:25:03 <armax> yes, HOS is the same
22:25:06 <kevinbenton> so Cinder must not require it
22:25:18 * carl_baldwin did some homework too.
22:25:24 <carl_baldwin> Cinder doesn't require it yet.
22:25:24 <armax> kevinbenton: Cinder is in progress to adopt it
22:25:26 <armax> but not yet
22:25:39 <ajo> armax, Kevinbenton: correct
22:25:50 <ajo> (it's WIP)
22:26:11 <dougwig> so, punt until this gets more mature?
22:26:28 <armax> so the bottom line is: we can expect operators/distros to support tooz with a backend of their choice at some point
22:26:45 <armax> dougwig: at this point, I’d punt it to be the first thing we do as soon as Ocata opens up
22:26:51 <amuller> dougwig: If everyone punted, no one would adopt it. We know it's used by Ceilometer and likely to be used by Cinder.
22:26:58 <armax> that gives us an extra 6 months to get the message out
22:27:24 <dougwig> amuller: i don't think nova and neutron need to be on the bleeding edge of helping adoption. ceilometer is still very optional.
22:27:32 <dougwig> for us, stability is king.
22:27:35 <armax> so my position would be, we can start working on it, but merge it as soon as ocata master opens up
22:27:50 <amuller> dougwig: How would you explain nova and oslo versioned objects, API versioning
22:28:01 <amuller> Sometimes the big projects march first
22:28:19 <dougwig> ovo has no deployer impact.
22:28:25 <armax> not every project is alike, not every problem is alike
22:28:36 <amuller> API versioning, pinning and rolling upgrades do.
22:28:56 <armax> the compare and contrast exercise is not really helping us make a decision in this matter IMO
22:29:12 <kevinbenton> tooz requires another system with a separate life-cycle of upgrades and maintenance
22:29:15 <amuller> armax: I agree that work should continue but not merged until early O
22:29:24 <armax> is Ocata a timeline people feel comfortable with or do we want to defer it to P?
22:29:33 <armax> because like it or not
22:29:41 <kevinbenton> if Cinder is in the process, what is their timeline?
22:29:48 <armax> I think tooz/backend is something distros and operators will have to bite at some point
22:29:53 <armax> might as well leverage that if we can
22:30:07 <armax> kevinbenton: if they behave perhaps N?
22:30:16 <amuller> I don't understand how anyone could possibly have enough information to decide *now* if this should preemptively be punted to P, before we even started the O cycle.
22:30:22 <dougwig> i'm ok with O, provided the performance and stability testing shows a positive return. if we're just doing it because, then i see no point ever.
22:31:08 <armax> amuller: well it's more a matter of whether the extra 6 months on top of the other 6 months would be welcomed by distros and operators
22:31:31 <armax> but I am more like let’s go in O, as that might as well slip to P :)
22:31:39 <armax> one never knows!
22:31:43 <kevinbenton> well if cinder forces it for N, i'm fine with O
22:31:53 <armax> ok, let’s provisionally go with O
22:32:00 <armax> and reassess as things evolve
22:32:05 <armax> jschwarz: go at it!
22:32:07 <carl_baldwin> +1
22:32:10 <amuller> +1
22:32:11 <ajo> +1
22:32:14 <armax> jschwarz: but take it easy
22:32:23 <armax> jschwarz: don’t go like until 4am every day, ok?
22:32:40 <ajo> jschwarz taking it easy? :)
22:32:45 <armax> I mean if you want to party, fine by me
22:32:52 <amuller> It's 1:30am for John, he may not actually be here =p
22:33:02 <ajo> he will read, I'm sure :-)
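(For reference, the shape of the tooz API being debated — a minimal sketch assuming a memcached backend; the coordinator calls are real tooz, but the backend URL, member id, and lock name are illustrative. The deployer-impact argument above is about the backend: it is a separate system with its own upgrade and maintenance life-cycle.)

    from tooz import coordination

    # Backend choice (memcached, etcd, zookeeper, ...) is what distros
    # and operators would have to support.
    coordinator = coordination.get_coordinator(
        'memcached://127.0.0.1:11211', b'neutron-server-1')
    coordinator.start(start_heart=True)

    # A named distributed lock: blocks until acquired across all
    # coordinator members, released on exit.
    with coordinator.get_lock(b'neutron-some-shared-resource'):
        pass  # critical section that must be exclusive cluster-wide

    coordinator.stop()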
22:33:09 <armax> ok, moving on?
22:33:26 <armax> bug #1580880
22:33:26 <openstack> bug 1580880 in neutron "[RFE] Distributed Portbinding for all port types" [Wishlist,Triaged] https://launchpad.net/bugs/1580880 - Assigned to Andreas Scheuring (andreas-scheuring)
22:33:32 <armax> anyone seen this one?
22:34:03 <armax> I think this is trying to chew more than it can handle
22:34:20 <carl_baldwin> This is the one driven by live migration.
22:34:25 <armax> I provided feedback, but I am not clear on the conclusion yet
22:34:28 <carl_baldwin> On the slate to discuss at the Nova mid-cycle.
22:34:47 <armax> carl_baldwin: right, we need to agree on the scope of the work for N
22:34:53 <armax> but this is clearly spilling into O
22:35:03 <carl_baldwin> agreed
22:35:24 <carl_baldwin> It isn't on nova's priority list for N, so it won't make it there either.
22:35:31 <armax> I’d rather solve the model and logical schema duplication first and worry about the potential API changes later
22:35:35 <dougwig> i said last week, and i'll repeat, we shouldn't pigeonhole this as just live migration. it's also a very relevant use case for service VMs (which basically do live migration as part of their lifecycle.)
22:35:40 <armax> obviously the former needs to accommodate the latter
22:36:11 <carl_baldwin> dougwig: I don't see any demand for that.  Is there much that I'm not aware of?
22:36:12 <armax> dougwig: what do you mean? I might have missed your position
22:36:22 <armax> from last week’s meeting
22:36:45 <carl_baldwin> dougwig: I'm not saying it isn't relevant, I just don't hear people asking for it.  But, maybe I'm not listening in the right places.
22:36:50 <dougwig> i know that we'd use it in octavia.  we can't do zero downtime upgrades today.
22:37:07 <dougwig> it might be that folks have just learned to workaround it.
22:37:29 <armax> dougwig: ok, but at the very bottom you're still live migrating VMs, aren't you?
22:37:51 <dougwig> yes, same basic use case, as long as we don't have to use the nova live migrate mechanism itself.
22:37:56 <armax> gotcha
22:38:13 <armax> dougwig: so in other words you'd be ok if the bindings were extended to just two hosts rather than N
22:38:15 <carl_baldwin> dougwig: So, live migrating a port without the VM?
22:38:35 <dougwig> carl_baldwin: yes, that'd work
22:38:41 <armax> or do you see a case where binding a port to N hosts is something that's relevant to your use case?
22:38:47 <dougwig> armax: how do we know what the second host is?
22:39:03 <armax> in the case of Nova, the scheduler or the user tells you
22:39:34 <dougwig> then yes, i'd be ok with a post-launch attach, so 2 would cover it.
22:40:00 <armax> ok, I am trying to understand if the added complexity of a super generic port binding API is justified
22:40:01 <kevinbenton> armax: so you don't think it's worth unspecial-casing DVR?
22:40:34 <armax> kevinbenton: from an API point of view I am not sure if we ever need to fiddle with the compute bindings from the REST API
22:41:14 <armax> kevinbenton: but I might be missing something, so it’s worth asking around
22:41:41 <armax> so for this one we shall iterate on it and provide Andreas with as much feedback as we can
22:41:47 <armax> please folks, do review the spec
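(To make the two-host case concrete — a purely illustrative data shape, since no such API existed at the time: a port carrying bindings on both the source and target host during live migration, instead of ML2's single binding per port.)

    # Hypothetical shape only; the field names are made up for illustration.
    port = {
        'id': 'PORT-UUID',
        'bindings': [
            {'host': 'compute-1', 'vif_type': 'ovs', 'status': 'ACTIVE'},
            # An inactive binding pre-created on the migration target so
            # the VIF can be plugged before the VM actually moves.
            {'host': 'compute-2', 'vif_type': 'ovs', 'status': 'INACTIVE'},
        ],
    }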
22:42:37 <armax> ok
22:42:42 <armax> bug #1583694
22:42:43 <openstack> bug 1583694 in neutron "[RFE] DVR support for Allowed_address_pair port that are bound to multiple ACTIVE VM ports" [Wishlist,Triaged] https://launchpad.net/bugs/1583694 - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
22:43:08 <armax> I may be completely biased here, but I am towards the nay rather than yay
22:43:41 <armax> I have some concerns on the added complexity and future headaches that this use case may bring
22:43:56 <armax> does anyone share my feelings? Or am I simply a chicken?
22:44:13 <carl_baldwin> I think your concerns are pretty well stated in the comments.
22:44:41 <armax> I know I stated them well, I was asking if you guys agreed :)
22:44:47 <carl_baldwin> If there is an easy way to modify topology to achieve the goal ...
22:44:55 <kevinbenton> i think it sucks that DVR is incompatible with allowed address pairs
22:45:12 <kevinbenton> can't do any HA services offered by VMs
22:45:49 <armax> kevinbenton: it would be incompatible only under certain conditions
22:46:32 <armax> carl_baldwin: what’s your position?
22:46:39 <armax> carl_baldwin: you’re a lot more involved in DVR than I
22:46:50 <kevinbenton> armax: under any condition where DVR is used :)
22:47:01 <armax> kevinbenton: not true
22:47:05 <kevinbenton> armax: a VM can't do a virtual IP with another VM
22:47:11 <armax> kevinbenton: you can use allowed address pairs with DVR today
22:47:12 <carl_baldwin> armax: I've only just read your comment.  Need to digest.
22:47:18 <armax> carl_baldwin: ok
22:47:27 <armax> anyone else has an opinion?
22:47:50 <armax> come on folks, your job as a driver is to have an opinion, pick your brushes
22:47:51 <johnsom> Question about the DVR FIP bug?
22:48:16 <dougwig> sorry, i'm still stuck on whether i even want DVR as a first-class feature in neutron, which biases me greatly.
22:48:25 <kevinbenton> armax: no, you can't have a virtual IP unless i'm missing something
22:48:31 <kevinbenton> armax: that's the whole point of the request, no?
22:48:55 <kevinbenton> armax: e.g. multiple ports have the same IP in allowed address pairs that they use VRRP or something similar to advertise
22:48:58 <dougwig> if DVR is a first-class citizen, then punting this means other first-class features don't work intuitively, which i disagree with.
22:49:03 <armax> kevinbenton: active active, no
22:49:09 <armax> kevinbenton: active/passive it works
22:49:15 <armax> kevinbenton: that’s my understanding anyway
22:49:23 <armax> I might be wrong
22:49:37 <johnsom> Active/Passive does not work
22:49:50 <kevinbenton> armax: unless the VMs update the API somehow, this doesn't work at all because nothing can tell the l3 agents that the IP has moved
22:50:26 <kevinbenton> armax: so active/passive only works if 'passive' means that the virtual IP is removed from the allowed address pairs
22:50:30 <johnsom> Right, the DVR stuff is ignoring the GARP that VRRP/keepalived sends when it moves the IP
22:50:31 <kevinbenton> on the non-active hosts
22:50:32 <armax> johnsom: I am gonna have to fish for bug reports filed in the past then
22:51:34 <armax> kevinbenton: I don't recall off the top of my head how allowed address pair support was added to DVR in the past
22:51:47 <armax> kevinbenton: but I do recall that some fixes landed
22:51:50 <johnsom> The issue for us is Octavia uses allowed address pairs and VRRP to do active/standby.  With DVR enabled, floating IPs don't work.
22:51:53 <armax> carl_baldwin: does that ring a bell?
22:52:12 <amuller> an active/passive mechanism that needs API access to perform a failover is far from ideal... you'd want a failover process that uses solely the data plane, not the control plane
22:52:15 <armax> johnsom: the use case is clear
22:52:25 <amuller> otherwise you're assuming that the control plane is even up and in an HA scenario you don't really wanna do that
22:52:37 <armax> but the solution proposed is cobbling together DVR and CVR to deliver a frankenstein
22:52:37 <carl_baldwin> armax: It rings a bell.  I don't think it added anything having to do with multiple ports with the same IP.
22:52:55 <johnsom> Yeah, failover via the API is too slow.  We can fail over in a second or two with VRRP.
22:53:14 <ajo> that defeats the purpose of VRRP
22:53:53 <amuller> it's speed but it's also the robustness of the solution
22:54:03 <amuller> if robustness is a word :)
22:54:07 <armax> it is
22:54:31 <carl_baldwin> The problem is that the fip is conceptually on the other side of a router from the VMs and where VRRP is doing its thing.  I don't see a way around going through the CVR.
22:54:31 <armax> so, are we saying that in lieu of a better architectural solution being proposed this is dead in the water?
22:54:54 <kevinbenton> well we can have a solution that works but is less than optimal i think
22:55:08 <dougwig> like "don't vrrp with fips" ?
22:55:09 <kevinbenton> if floating IP is associated with something that DVR can't find
22:55:18 <kevinbenton> it is realized on the centralized node
22:55:25 <johnsom> My question/proposal is: why can't DVR honor the GARPs like normal switches do?
22:55:53 <amotoki> it sounds like just a bug in the DVR case to me, though it may not be easy to fix.
22:56:21 <ajo> a complex-to-solve bug
22:56:22 <kevinbenton> johnsom: i think it could. it's just that the floating IP doesn't even get placed anywhere by dvr
22:56:31 <kevinbenton> johnsom: because it doesn't know where to put it
22:56:37 <carl_baldwin> kevinbenton: +1  The fip isn't hosted anywhere.
22:57:01 <armax> carl_baldwin: and the solution is to place it on the network node
22:57:10 <kevinbenton> i think a regression to centralized would at least get us to parity with legacy routing for an interim solution
22:57:13 <carl_baldwin> armax: Where else would you put it?
22:57:23 <armax> carl_baldwin: I don’t disagree
22:57:48 <armax> carl_baldwin: all I am saying is that the added complexity to deal with this scares the hell out of me
22:58:18 <armax> I want to limit the opportunities for people to say that DVR sucks
22:58:26 <armax> rather than the contrary
22:58:31 <carl_baldwin> This is already one of those.  :)
22:58:32 <armax> but they’d say it anyway
22:58:33 <armax> :)
22:58:38 <kevinbenton> carl_baldwin: +1 :)
22:59:01 <armax> well, no I slightly disagree
22:59:28 <johnsom> Yeah, we have had a number of folks hit this limitation and be upset
23:00:37 <armax> ok, let’s brainstorm on this a little longer
23:00:45 <armax> we’re at time
23:00:57 <armax> carl_baldwin: let’s take this offline
23:01:06 <armax> thanks folks and keep up the good work
23:01:11 <armax> #endmeeting