20:00:01 <johnsom> #startmeeting Octavia
20:00:01 <openstack> Meeting started Wed Apr 12 20:00:01 2017 UTC and is due to finish in 60 minutes. The chair is johnsom. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:02 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:04 <openstack> The meeting name has been set to 'octavia'
20:00:17 <johnsom> Hi folks
20:00:21 <nmagnezi> o/
20:00:42 <JudeC> Hello
20:00:45 <xgerman> o/
20:00:51 <johnsom> Welcome JudeC
20:01:10 <jniesz> hello
20:01:19 <johnsom> Hi jniesz
20:01:30 <johnsom> #topic Announcements
20:01:51 <xgerman> PTL elections coming up
20:02:04 <johnsom> It is Pike milestone 1 week. I plan to cut our milestone releases for both neutron-lbaas and octavia today. Dashboard has already gone out.
20:02:15 <xgerman> nice!
20:02:24 <johnsom> I have also submitted the Pike goals patches.
20:02:43 <johnsom> So I think we are in good shape for Pike-1
20:03:13 <johnsom> TC elections are coming up first (Soonish), so keep an eye out for that voting e-mail.
20:03:30 <xgerman> and there are Q&A with the candidates on the mailing list
20:04:15 <johnsom> As german said, PTL elections are coming as well. Not sure when that is starting actually.
20:04:23 <xgerman> TC
20:04:37 <xgerman> oh, my mistype was right
20:04:42 <johnsom> Ah, ok.
20:04:51 <johnsom> Any other announcements?
20:05:18 <johnsom> #topic Brief progress reports / bugs needing review
20:06:03 <johnsom> Aside from the PTL-ish stuff listed above I have been working on the API reference. I posted the listener section for review and hope to start pools today.
20:07:03 <johnsom> We had a couple of specs merge for QoS and alternate IP/ports for health monitors. hopefully we will see patches soon.
20:07:25 <johnsom> The L3 Act/Act spec is also up for review, so please take a look and comment.
20:07:51 <johnsom> Any other updates or bugs of note?
20:08:00 <nmagnezi> xgerman, this ^ was officially adopyted by you, right? :)
20:08:03 <jniesz> johnsom, I saw the notes regarding the FIP
20:08:04 <nmagnezi> adopted*
20:08:10 <nmagnezi> act/act
20:08:29 <xgerman> no, this is a new spec
20:08:39 <xgerman> I am trying to salvage IBM’s work
20:08:48 <johnsom> jniesz Yeah, I posted some comments/questions on the spec
20:09:18 <jniesz> for the FIP part to work FIP api would need to support exception rule to not perform SNAT on anycast IP
20:09:21 <johnsom> We can talk about those in open discussion if you would like
20:10:07 <jniesz> sure, I did comment on your feedback
20:10:21 <johnsom> Ok cool, I will take a look
20:11:50 <johnsom> #topic Boston Summit planning
20:12:11 <johnsom> The summit is next month. Who all is attending?
20:12:23 <xgerman> me
20:12:28 <xgerman> and rm_work
20:12:40 <johnsom> Unfortunately my travel funding got pulled at the last minute here, so I will not be attending.
20:12:59 <m-greene_> I’ll be there, spending 30% of my time on booth duty :)
20:13:06 <johnsom> Having four events (summits and PTGs) a year is making the bean counters unhappy....
20:13:45 <johnsom> Ok, cool so three folks.
20:14:04 <johnsom> I had planned to discuss the four presentations we have, but I don't see rm_work here
20:14:54 <johnsom> xgerman Is there anything you wanted to discuss about those or do we have a plan?
20:15:34 <xgerman> well, I don’t have a plan but will focus on the presentations/lab after my time off
20:15:53 <xgerman> for lab I will see if I can recycle what we did in Austin
20:16:21 <johnsom> Ok. Maybe you can convince m-greene_ to help out... grin
20:16:50 <johnsom> #topic PTG for Queens in September - Are you planning to attend?
20:17:17 <johnsom> After the summit the PTG will be in September in Denver.
20:17:26 <xgerman> depending on the funding situation
20:17:33 <m-greene_> does xgerman have my email? I’m not good about trolling IRC frequently
20:17:36 <johnsom> I am planning to attend.
20:17:44 <xgerman> m-greene_ likely not
20:17:45 <jniesz> I will try to get funding for that, so hopefully we can attend
20:18:10 <johnsom> The foundation has asked the PTLs to give an estimate of how many folks are going to attend from our projects. Do you all have an estimate?
20:18:38 <johnsom> I will ask for a dedicated room at this PTG.
20:18:54 <nmagnezi> johnsom, I cannot estimate at the moment
20:18:54 <m-greene_> me +1 (at least)… Denver is home for my team
20:18:55 <xgerman> I think there are people (me included) who would like the PTG to be phased out and replaced with a design summit at the main summit
20:19:00 <jniesz> I would estimate 2-3 from my side
20:19:54 <johnsom> Ok, thanks. I will give my estimate. I guess it's for the number of tickets and room sizes. Atlanta was limited to 500.
20:20:10 <johnsom> Thank you for the estimates.
20:20:34 <johnsom> #topic Discuss asynchronous API and response handling
20:21:18 <johnsom> So this agenda item has been an issue for a while. I want to get some feedback from the team so we can clean this up in the octavia v2 API.
20:22:32 <johnsom> Our API is asynchronous which means we continue to finish the request after we have returned a response to the user.
20:23:01 <nmagnezi> ack
20:23:09 <johnsom> That means that we may not have determined if the request was fulfilled completely at response time.
20:23:17 <johnsom> The question on the table is:
20:24:24 <johnsom> Should we return to the user a view that represents the future state (fields updated, etc.) that is "fake" and subsequent GET calls may reflect the "old" settings with a status of PENDING_UPDATE.
20:24:26 <johnsom> Or
20:25:19 <johnsom> Should we return a "current state" view, with old values and "PENDING_UPDATE" with the new values reflected once the operation fully completes and we return the status to "ACTIVE"
20:25:44 <johnsom> Thoughts or comments?
20:25:59 <johnsom> It's a bit of an odd situation.
20:26:02 <xgerman> I like the latter since this lines up with our behavior with POST
20:26:27 <johnsom> Yes, the POST would behave like the first option
20:27:08 <nmagnezi> I tend to agree with xgerman, but I feel like I'm not fully knowledgeable of the situation here. are there any related bugs/patches to read?
20:27:21 <johnsom> I guess there is a third option, which would be GET calls get the new values, but on failure they may revert
20:27:33 <nmagnezi> overall, aren't we currently doing the latter anyways? (for example in loadbalancer create)
20:28:41 <johnsom> There aren't any bugs for this (yet). It came up during the new API work. The octavia v2 API behaves like the second option right now (for updates/deletes)
20:29:00 <xgerman> yeah, it was confusing me
20:29:08 <johnsom> POST doesn't have "old" values, so it will always show the "new" values.
20:30:17 <jniesz> hmm, if the call eventually fails, does that mean the old data is gone or reverted back?
20:30:43 <xgerman> the new data is never saved in the DB — so the old data will always be back
20:30:53 <johnsom> Currently if the operation fails the user will see the old values
20:30:54 <xgerman> which makes subsequent gets confusing
20:31:08 <xgerman> (while the operation runs)
20:31:37 <m-greene_> the DB not updating immediately might cause problems.
20:31:46 <johnsom> Right, we defer the database update until the last step in the flow. This preserves the view of what is "currently" running in the load balancers
20:32:19 <m-greene_> for our driver, there are some ops that take some time (like getting another neutron port) so our code has a periodic task and we expect to use the DB as the current source of truth
20:32:39 <m-greene_> we don’t cache the “new” update service definition until it’s deployed, then push new data back up into the DB
20:33:20 <m-greene_> so I’d expect the octavia DB to update immediately, hand off to the vendor driver, and we only update status from PENDING to ACTIVE
20:34:00 <johnsom> Ok, I was wondering about that. As it is coded now, we would hand you the new data in the handler message but not update the database
20:34:07 <m-greene_> what if the vendor driver dies in the middle, we’ve lost the request entirely and cannot recover and hand back in to a replacement driver (HA)
20:35:26 <johnsom> Two things. 1. the view the user gets reflects the fact that the operation was lost as the old config is still registered. 2. I would hope that is queued / persisted in some way.
20:36:13 <m-greene_> our design “goal” (not 100% true) is to be completely stateless and rely on the (currently neutron) DB
20:36:24 <johnsom> Ok
20:37:18 <xgerman> mmh, since we hand it to our taskflow system which could be backed by a DB for HA we don’t worry about resending
20:37:20 <johnsom> So from your perspective, in a failure situation, it is ok for the database to reflect a configuration back to the user that does not represent what is actually configured on the appliance?
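The two update-response options posed above (20:24/20:25) can be sketched as follows. This is a toy illustration only, not Octavia code; the dict stands in for a DB row, and the function names and record layout are invented for the example.

```python
# Toy sketch of the two async-update response styles discussed above.
# Nothing here is real Octavia code; names are invented for illustration.

def update_future_view(record, patch):
    """Option 1: respond with the *future* state (the requested values),
    even though a worker has not applied them yet. Later GETs may still
    show the old values while provisioning_status is PENDING_UPDATE."""
    record["provisioning_status"] = "PENDING_UPDATE"
    view = dict(record)
    view.update(patch)  # show the new values immediately (the "fake" view)
    return view

def update_current_view(record, patch):
    """Option 2 (what the octavia v2 API does for updates today):
    respond with the old values plus PENDING_UPDATE; the new values
    appear only after the flow completes and status returns to ACTIVE."""
    record["provisioning_status"] = "PENDING_UPDATE"
    return dict(record)  # old values, pending status
```

With `record = {"name": "old", "provisioning_status": "ACTIVE"}` and `patch = {"name": "new"}`, option 1 returns the name "new" immediately, while option 2 keeps returning "old" until the operation finishes.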
20:37:41 <johnsom> xgerman right
20:38:15 <xgerman> yeah, so for it to be stateless we would need to have a handshake on the queue
20:38:29 <xgerman> that they resend if the consumer dies
20:38:35 <johnsom> Right, don't confirm the pull from the queue until it completes
20:38:57 <johnsom> Assuming there is a queue, which isn't a given
20:39:00 <xgerman> I think that would solve m-greene_ ’s issue
20:39:18 <xgerman> well, we could require it if you run a stateless driver - just saying
20:40:30 <johnsom> I think we need to think about the database access too as that is something I would like to discourage without an abstraction layer. I don't like DB changes potentially breaking drivers.
20:41:27 <xgerman> +1
20:41:29 <m-greene_> agreed- we need to fetch service definition, update status, etc.
20:41:41 <johnsom> Yep
20:41:54 <johnsom> Hmm, ok. More to think about.
20:42:20 <johnsom> I know that I need to write up a driver spec here soon-ish where we can hammer out the details.
20:42:28 <xgerman> +1
20:43:21 <johnsom> Alright should we move on to open discussion and revisit next week?
20:43:44 <johnsom> #topic Open Discussion
20:44:05 <johnsom> Other topics for today? L3 Act/Act?
20:44:30 <nmagnezi> I have a question but i think jniesz had something to discuss
20:45:10 <jniesz> main question I had was around next steps for FIP
20:46:04 <johnsom> Ok. Is FIP a use case in your environment or do you expect the VIP to simply be on tenant networks?
20:46:09 <jniesz> right now I don't think this will work because FIP is 1:1 NAT between private / public IP
20:46:22 <jniesz> in our environment the IPs will be directly routeable from network
20:46:26 <jniesz> so we won't use FIPs
20:47:03 <johnsom> Right. I kind of expect that will be a common case to not use FIPs.
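The queue handshake discussed above at 20:38 ("don't confirm the pull from the queue until it completes") can be sketched with Python's stdlib queue. This is illustrative only; a real deployment would rely on a message broker's ack/redelivery semantics, and every name below is invented.

```python
import queue

def drain(work_queue, apply_fn):
    """Process queued driver requests, 'acknowledging' each one only
    after apply_fn succeeds; a request whose handler fails is re-queued
    instead of being lost. Sketch only: a real stateless driver would
    use the broker's acknowledgement mechanism, not a local re-put."""
    processed = []
    while True:
        try:
            msg = work_queue.get_nowait()
        except queue.Empty:
            return processed
        try:
            apply_fn(msg)            # may crash mid-operation
        except Exception:
            work_queue.put(msg)      # simulate broker redelivery
        else:
            processed.append(msg)    # success: safe to confirm
        finally:
            work_queue.task_done()
```

The point of the design is that a consumer dying between `get` and `apply_fn` leaves the message unconfirmed, so a replacement consumer can pick it up rather than the request vanishing.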
20:47:18 <jniesz> also neutron plugins that are evpn bgp based won't use FIP
20:47:31 <jniesz> FIP really is for overlay tenant networks
20:47:34 <johnsom> So, maybe that is just something this approach does not address. If someone needs it, it can be handled a different way in the future.
20:47:40 <xgerman> I don’t think we make you use FIPs
20:48:01 <xgerman> can’t you just give as the VIP network your public net?
20:48:05 <jniesz> yes, I figured other active/active driver could handle other cases
20:48:40 <xgerman> Pretty sure FIP was optional
20:48:53 <jniesz> ok, I will remove it from spec
20:48:56 <johnsom> I am good with that. So we can just remove it from the diagram and maybe make a comment about it being out of scope for the spec.
20:49:26 <johnsom> nmagnezi What is your question?
20:49:47 <nmagnezi> re: https://review.openstack.org/#/c/454172/
20:50:11 <nmagnezi> why do we use auth_strategy = noauth as default, and not keystone?
20:51:02 <xgerman> in several deployments the internal endpoint is used which is only accessible by whitelisted IPs
20:51:16 <xgerman> so no need to authenticate for some
20:51:18 <johnsom> Until now we didn't have keystone integration to authenticate tokens. It just hadn't been developed yet. It wasn't critical as we were behind neutron-lbaas which we trusted.
20:51:39 <johnsom> This was fixed with the new API work
20:51:57 <xgerman> yeah, now if security requires you can add keystone authentication
20:52:10 <johnsom> So, now-ish we should be able to change the default to keystone
20:52:26 <johnsom> at least in our gates.
20:52:27 <xgerman> mmh, not sure…
20:52:36 <johnsom> That would be a breaking change if we did it blanket
20:52:52 <xgerman> yes, and might lead to more problems we get asked about
20:53:03 <johnsom> Right
20:53:24 <xgerman> nmagnezi is there a particular reason we should default it?
20:53:59 <nmagnezi> xgerman, i don't have a strong opinion here, just wondered why it is set the way it is
20:54:16 <xgerman> ok, for production you might want to make sure to enable it ;-)
20:54:35 <johnsom> Yeah, mostly just an incremental development thing.
20:54:51 <xgerman> +q
20:54:53 <xgerman> +1
20:54:57 <nmagnezi> xgerman, hah... what would we need to do to make it default on our gates? I think testing as close as possible to production setting has value
20:55:38 <xgerman> touche
20:55:44 <nmagnezi> :D
20:55:53 <johnsom> That is pretty easy actually. We can just update our devstack plugin or gate hook to set it to keystone. I think there was a proposed patch for that at one point.
20:56:39 <johnsom> Oh, I should also mention, I will be cutting stable branch releases as well to pick up the fixes for devstack and diskimage-builder breakage and some backported fixes. Look for those this week as well.
20:56:49 <nmagnezi> johnsom, i can submit a new patch
20:56:51 <johnsom> Also of note, mitaka is officially EOL now.
20:56:56 <xgerman> +1
20:57:10 <johnsom> The stable branch will be removed soon.
20:57:11 <xgerman> I will be on vacation until Wednesday so skip the next meeting
20:57:26 <johnsom> The party for that is it was the last branch with lbaas v1 API in it....
20:57:39 <xgerman> Hooray!!
20:57:59 <nmagnezi> johnsom, I think I have some backports up for review. generally we should make sure we look at all proposed backports before you make the cut :)
20:58:01 <johnsom> Yeah, that is nice
20:58:21 <nmagnezi> johnsom, neat! (v1..)
20:58:36 <xgerman> yeah, we can review the backports
20:58:47 <johnsom> nmagnezi I thought I had looked at most, but if you have a list you want me to watch for, please send it
20:58:58 <xgerman> +1
20:59:44 <nmagnezi> johnsom, sure will
20:59:45 <johnsom> helper function, I see that one
21:00:13 <johnsom> Ok, we are out of time. Thanks for joining! see you in the lbaas channel
21:00:18 <johnsom> #endmeeting
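For reference, the auth_strategy change discussed at 20:50 would be a configuration switch along the following lines. This is a hypothetical octavia.conf fragment written from memory; the section and option names may differ between releases, so check the Octavia configuration reference before relying on it.

```ini
[api_settings]
# The default under discussion is noauth; keystone makes the API
# validate tokens via keystonemiddleware.
auth_strategy = keystone

[keystone_authtoken]
# Standard keystonemiddleware options (auth_url, service credentials,
# etc.) go here; exact option names depend on the installed release.
```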