20:00:01 <johnsom> #startmeeting Octavia
20:00:01 <openstack> Meeting started Wed Apr 12 20:00:01 2017 UTC and is due to finish in 60 minutes. The chair is johnsom. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:02 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:04 <openstack> The meeting name has been set to 'octavia'
20:00:17 <johnsom> Hi folks
20:00:21 <nmagnezi> o/
20:00:42 <JudeC> Hello
20:00:45 <xgerman> o/
20:00:51 <johnsom> Welcome JudeC
20:01:10 <jniesz> hello
20:01:19 <johnsom> Hi jniesz
20:01:30 <johnsom> #topic Announcements
20:01:51 <xgerman> PTL elections coming up
20:02:04 <johnsom> It is Pike milestone 1 week. I plan to cut our milestone releases for both neutron-lbaas and octavia today. Dashboard has already gone out.
20:02:15 <xgerman> nice!
20:02:24 <johnsom> I have also submitted the Pike goals patches.
20:02:43 <johnsom> So I think we are in good shape for Pike-1
20:03:13 <johnsom> TC elections are coming up first (Soonish), so keep an eye out for that voting e-mail.
20:03:30 <xgerman> and there are Q&A with the candidates on the mailing list
20:04:15 <johnsom> As german said, PTL elections are coming as well. Not sure when that is starting actually.
20:04:23 <xgerman> TC
20:04:37 <xgerman> oh, my mistype was right
20:04:42 <johnsom> Ah, ok.
20:04:51 <johnsom> Any other announcements?
20:05:18 <johnsom> #topic Brief progress reports / bugs needing review
20:06:03 <johnsom> Aside from the PTL-ish stuff listed above I have been working on the API reference. I posted the listener section for review and hope to start pools today.
20:07:03 <johnsom> We had a couple of specs merge for QoS and alternate IP/ports for health monitors. hopefully we will see patches soon.
20:07:25 <johnsom> The L3 Act/Act spec is also up for review, so please take a look and comment.
20:07:51 <johnsom> Any other updates or bugs of note?
20:08:00 <nmagnezi> xgerman, this ^ was officially adopyted by you, right? :)
20:08:03 <jniesz> johnsom, I saw the notes regarding the FIP
20:08:04 <nmagnezi> adopted*
20:08:10 <nmagnezi> act/act
20:08:29 <xgerman> no, this is a new spec
20:08:39 <xgerman> I am trying to salvage IBM’s work
20:08:48 <johnsom> jniesz Yeah, I posted some comments/questions on the spec
20:09:18 <jniesz> for the FIP part to work FIP api would need to support exception rule to not perform SNAT on anycast IP
20:09:21 <johnsom> We can talk about those in open discussion if you would like
20:10:07 <jniesz> sure, I did comment on your feedback
20:10:21 <johnsom> Ok cool, I will take a look
20:11:50 <johnsom> #topic Boston Summit planning
20:12:11 <johnsom> The summit is next month. Who all is attending?
20:12:23 <xgerman> me
20:12:28 <xgerman> and rm_work
20:12:40 <johnsom> Unfortunately my travel funding got pulled at the last minute here, so I will not be attending.
20:12:59 <m-greene_> I’ll be there, spending 30% of my time on booth duty :)
20:13:06 <johnsom> Having four events (summits and PTGs) a year is making the bean counters unhappy....
20:13:45 <johnsom> Ok, cool so three folks.
20:14:04 <johnsom> I had planned to discuss the four presentations we have, but I don't see rm_work here
20:14:54 <johnsom> xgerman Is there anything you wanted to discuss about those or do we have a plan?
20:15:34 <xgerman> well, I don’t have a plan but will focus on the presentations/lab after my time off
20:15:53 <xgerman> for lab I will see if I can recycle what we did in Austin
20:16:21 <johnsom> Ok. Maybe you can convince m-greene_ to help out... grin
20:16:50 <johnsom> #topic PTG for Queens in September - Are you planning to attend?
20:17:17 <johnsom> After the summit the PTG will be in September in Denver.
20:17:26 <xgerman> depending on the funding situation
20:17:33 <m-greene_> does xgerman have my email? I’m not good about trolling IRC frequently
20:17:36 <johnsom> I am planning to attend.
20:17:44 <xgerman> m-greene_ likely not
20:17:45 <jniesz> I will try to get funding for that, so hopefully we can attend
20:18:10 <johnsom> The foundation has asked the PTLs to give an estimate of how many folks are going to attend from our projects. Do you all have an estimate?
20:18:38 <johnsom> I will ask for a dedicated room at this PTG.
20:18:54 <nmagnezi> johnsom, I cannot estimate at the moment
20:18:54 <m-greene_> me +1 (at least)… Denver is home for my team
20:18:55 <xgerman> I think there are people (me included) who would like the PTG to be phased out and replaced with a design summit at the main summit
20:19:00 <jniesz> I would estimate 2-3 from my side
20:19:54 <johnsom> Ok, thanks. I will give my estimate. I guess it's for the number of tickets and room sizes. Atlanta was limited to 500.
20:20:10 <johnsom> Thank you for the estimates.
20:20:34 <johnsom> #topic Discuss asynchronous API and response handling
20:21:18 <johnsom> So this agenda item has been an issue for a while. I want to get some feedback from the team so we can clean this up in the octavia v2 API.
20:22:32 <johnsom> Our API is asynchronous which means we continue to finish the request after we have returned a response to the user.
20:23:01 <nmagnezi> ack
20:23:09 <johnsom> That means that we may not have determined if the request was fulfilled completely at response time.
20:23:17 <johnsom> The question on the table is:
20:24:24 <johnsom> Should we return to the user a view that represents the future state (fields updated, etc.) that is "fake" and subsequent GET calls may reflect the "old" settings with a status of PENDING_UPDATE.
20:24:26 <johnsom> Or
20:25:19 <johnsom> Should we return a "current state" view, with old values and "PENDING_UPDATE" with the new values reflected once the operation fully completes and we return the status to "ACTIVE"
20:25:44 <johnsom> Thoughts or comments?
20:25:59 <johnsom> It's a bit of an odd situation.
20:26:02 <xgerman> I like the latter since this lines up with our behavior with POST
20:26:27 <johnsom> Yes, the POST would behave like the first option
20:27:08 <nmagnezi> I tend to agree with xgerman, but I feel like I'm not fully knowledgeable of the situation here. are there any related bugs/patches to read?
20:27:21 <johnsom> I guess there is a third option, which would be GET calls get the new values, but on failure they may revert
20:27:33 <nmagnezi> overall, aren't we currently doing the latter anyways? (for example in loadbalancer create)
20:28:41 <johnsom> There aren't any bugs for this (yet). It came up during the new API work. The octavia v2 API behaves like the second option right now (for updates/deletes)
20:29:00 <xgerman> yeah, it was confusing me
20:29:08 <johnsom> POST doesn't have "old" values, so it will always show the "new" values.
20:30:17 <jniesz> hmm, if the call eventually fails, does that mean the old data is gone or reverted back?
20:30:43 <xgerman> the new data is never saved in the DB — so the old data will always be back
20:30:53 <johnsom> Currently if the operation fails the user will see the old values
20:30:54 <xgerman> which makes subsequent gets confusing
20:31:08 <xgerman> (while the operation runs)
20:31:37 <m-greene_> the DB not updating immediately might cause problems.
20:31:46 <johnsom> Right, we defer the database update until the last step in the flow. This preserves the view of what is "currently" running in the load balancers
20:32:19 <m-greene_> for our driver, there are some ops that take some time (like getting another neutron port) so our code has a periodic task and we expect to use the DB as the current source of truth
20:32:39 <m-greene_> we don’t cache the “new” update service definition until it’s deployed, then push new data back up into the DB
20:33:20 <m-greene_> so I’d expect the octavia DB to update immediately, hand off to the vendor driver, and we only update status from PENDING to ACTIVE
20:34:00 <johnsom> Ok, I was wondering about that. As it is coded now, we would hand you the new data in the handler message but not update the database
20:34:07 <m-greene_> what if the vendor driver dies in the middle, we’ve lost the request entirely and cannot recover and hand back in to a replacement driver (HA)
20:35:26 <johnsom> Two things. 1. the view the user gets reflects the fact that the operation was lost as the old config is still registered. 2. I would hope that is queued / persisted in some way.
20:36:13 <m-greene_> our design “goal” (not 100% true) is to be completely stateless and rely on the (currently neutron) DB
20:36:24 <johnsom> Ok
20:37:18 <xgerman> mmh, since we hand it to our taskflow system which could be backed by a DB for HA we don’t worry about resending
20:37:20 <johnsom> So from your perspective, in a failure situation, it is ok for the database to reflect a configuration back to the user that does not represent what is actually configured on the appliance?
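The two update-response options posed above (20:24/20:25) can be sketched as follows. This is a toy illustration only, not Octavia code; the dict stands in for a DB row, and the function names and record layout are invented for the example.

```python
# Toy sketch of the two async-update response styles discussed above.
# Nothing here is real Octavia code; names are invented for illustration.

def update_future_view(record, patch):
    """Option 1: respond with the *future* state (the requested values),
    even though a worker has not applied them yet. Later GETs may still
    show the old values while provisioning_status is PENDING_UPDATE."""
    record["provisioning_status"] = "PENDING_UPDATE"
    view = dict(record)
    view.update(patch)  # show the new values immediately (the "fake" view)
    return view

def update_current_view(record, patch):
    """Option 2 (what the octavia v2 API does for updates today):
    respond with the old values plus PENDING_UPDATE; the new values
    appear only after the flow completes and status returns to ACTIVE."""
    record["provisioning_status"] = "PENDING_UPDATE"
    return dict(record)  # old values, pending status
```

With `record = {"name": "old", "provisioning_status": "ACTIVE"}` and `patch = {"name": "new"}`, option 1 returns the name "new" immediately, while option 2 keeps returning "old" until the operation finishes.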
20:37:41 <johnsom> xgerman right
20:38:15 <xgerman> yeah, so for it to be stateless we would need to have a handshake on the queue
20:38:29 <xgerman> that they resend if the consumer dies
20:38:35 <johnsom> Right, don't confirm the pull from the queue until it completes
20:38:57 <johnsom> Assuming there is a queue, which isn't a given
20:39:00 <xgerman> I think that would solve m-greene_ ’s issue
20:39:18 <xgerman> well, we could require it if you run a stateless driver - just saying
20:40:30 <johnsom> I think we need to think about the database access too as that is something I would like to discourage without an abstraction layer. I don't like DB changes potentially breaking drivers.
20:41:27 <xgerman> +1
20:41:29 <m-greene_> agreed- we need to fetch service definition, update status, etc.
20:41:41 <johnsom> Yep
20:41:54 <johnsom> Hmm, ok. More to think about.
20:42:20 <johnsom> I know that I need to write up a driver spec here soon-ish where we can hammer out the details.
20:42:28 <xgerman> +1
20:43:21 <johnsom> Alright should we move on to open discussion and revisit next week?
20:43:44 <johnsom> #topic Open Discussion
20:44:05 <johnsom> Other topics for today? L3 Act/Act?
20:44:30 <nmagnezi> I have a question but i think jniesz had something to discuss
20:45:10 <jniesz> main question I had was around next steps for FIP
20:46:04 <johnsom> Ok. Is FIP a use case in your environment or do you expect the VIP to simply be on tenant networks?
20:46:09 <jniesz> right now I don't think this will work because FIP is 1:1 NAT between private / public IP
20:46:22 <jniesz> in our environment the IPs will be directly routeable from network
20:46:26 <jniesz> so we won't use FIPs
20:47:03 <johnsom> Right. I kind of expect that will be a common case to not use FIPs.
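The queue handshake discussed above at 20:38 ("don't confirm the pull from the queue until it completes") can be sketched with Python's stdlib queue. This is illustrative only; a real deployment would rely on a message broker's ack/redelivery semantics, and every name below is invented.

```python
import queue

def drain(work_queue, apply_fn):
    """Process queued driver requests, 'acknowledging' each one only
    after apply_fn succeeds; a request whose handler fails is re-queued
    instead of being lost. Sketch only: a real stateless driver would
    use the broker's acknowledgement mechanism, not a local re-put."""
    processed = []
    while True:
        try:
            msg = work_queue.get_nowait()
        except queue.Empty:
            return processed
        try:
            apply_fn(msg)            # may crash mid-operation
        except Exception:
            work_queue.put(msg)      # simulate broker redelivery
        else:
            processed.append(msg)    # success: safe to confirm
        finally:
            work_queue.task_done()
```

The point of the design is that a consumer dying between `get` and `apply_fn` leaves the message unconfirmed, so a replacement consumer can pick it up rather than the request vanishing.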
20:47:18 <jniesz> also neutron plugins that are evpn bgp based won't use FIP
20:47:31 <jniesz> FIP really is for overlay tenant networks
20:47:34 <johnsom> So, maybe that is just something this approach does not address. If someone needs it, it can be handled a different way in the future.
20:47:40 <xgerman> I don’t think we make you use FIPs
20:48:01 <xgerman> can’t you just give as the VIP network your public net?
20:48:05 <jniesz> yes, I figured other active/active driver could handle other cases
20:48:40 <xgerman> Pretty sure FIP was optional
20:48:53 <jniesz> ok, I will remove it from spec
20:48:56 <johnsom> I am good with that. So we can just remove it from the diagram and maybe make a comment about it being out of scope for the spec.
20:49:26 <johnsom> nmagnezi What is your question?
20:49:47 <nmagnezi> re: https://review.openstack.org/#/c/454172/
20:50:11 <nmagnezi> why do we use auth_strategy = noauth as default, and not keystone?
20:51:02 <xgerman> in several deployments the internal endpoint is used which is only accessible by whitelisted IPs
20:51:16 <xgerman> so no need to authenticate for some
20:51:18 <johnsom> Until now we didn't have keystone integration to authenticate tokens. It just hadn't been developed yet. It wasn't critical as we were behind neutron-lbaas which we trusted.
20:51:39 <johnsom> This was fixed with the new API work
20:51:57 <xgerman> yeah, now if security requires you can add keystone authentication
20:52:10 <johnsom> So, now-ish we should be able to change the default to keystone
20:52:26 <johnsom> at least in our gates.
20:52:27 <xgerman> mmh, not sure…
20:52:36 <johnsom> That would be a breaking change if we did it blanket
20:52:52 <xgerman> yes, and might lead to more problems we get asked about
20:53:03 <johnsom> Right
20:53:24 <xgerman> nmagnezi is there a particular reason we should default it?
20:53:59 <nmagnezi> xgerman, i don't have a strong opinion here, just wondered why it is set the way it is
20:54:16 <xgerman> ok, for production you might want to make sure to enable it ;-)
20:54:35 <johnsom> Yeah, mostly just an incremental development thing.
20:54:51 <xgerman> +q
20:54:53 <xgerman> +1
20:54:57 <nmagnezi> xgerman, hah... what would we need to do to make it default on our gates? I think testing as close as possible to production setting has value
20:55:38 <xgerman> touche
20:55:44 <nmagnezi> :D
20:55:53 <johnsom> That is pretty easy actually. We can just update our devstack plugin or gate hook to set it to keystone. I think there was a proposed patch for that at one point.
20:56:39 <johnsom> Oh, I should also mention, I will be cutting stable branch releases as well to pick up the fixes for devstack and diskimage-builder breakage and some backported fixes. Look for those this week as well.
20:56:49 <nmagnezi> johnsom, i can submit a new patch
20:56:51 <johnsom> Also of note, mitaka is officially EOL now.
20:56:56 <xgerman> +1
20:57:10 <johnsom> The stable branch will be removed soon.
20:57:11 <xgerman> I will be on vacation until Wednesday so skip the next meeting
20:57:26 <johnsom> The party for that is it was the last branch with lbaas v1 API in it....
20:57:39 <xgerman> Hooray!!
20:57:59 <nmagnezi> johnsom, I think I have some backports up for review. generally we should make sure we look at all proposed backports before you make the cut :)
20:58:01 <johnsom> Yeah, that is nice
20:58:21 <nmagnezi> johnsom, neat! (v1..)
20:58:36 <xgerman> yeah, we can review the backports
20:58:47 <johnsom> nmagnezi I thought I had looked at most, but if you have a list you want me to watch for, please send it
20:58:58 <xgerman> +1
20:59:44 <nmagnezi> johnsom, sure will
20:59:45 <johnsom> helper function, I see that one
21:00:13 <johnsom> Ok, we are out of time. Thanks for joining! see you in the lbaas channel
21:00:18 <johnsom> #endmeeting
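For reference, the auth_strategy change discussed at 20:50 would be a configuration switch along the following lines. This is a hypothetical octavia.conf fragment written from memory; the section and option names may differ between releases, so check the Octavia configuration reference before relying on it.

```ini
[api_settings]
# The default under discussion is noauth; keystone makes the API
# validate tokens via keystonemiddleware.
auth_strategy = keystone

[keystone_authtoken]
# Standard keystonemiddleware options (auth_url, service credentials,
# etc.) go here; exact option names depend on the installed release.
```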