19:59:53 #startmeeting Octavia
19:59:54 Meeting started Wed Aug 3 19:59:53 2016 UTC and is due to finish in 60 minutes. The chair is xgerman. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:59:55 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:59:57 The meeting name has been set to 'octavia'
20:00:07 Hi all
20:00:20 johnsom asked me to run the meeting today :-)
20:00:43 Howdy, howdy!
20:00:43 hey
20:00:50 hi
20:00:50 #chair sbalukoff
20:00:51 Current chairs: sbalukoff xgerman
20:00:52 o/
20:00:53 hola
20:01:00 Uh-oh!
20:01:05 I can feeeeel the powah!
20:01:10 lol
20:01:24 #topic Announcements
20:01:36 Octavia / Neutron-LBaaS mid-cycle - https://etherpad.openstack.org/p/lbaas-octavia-newton-midcycle - https://wiki.openstack.org/wiki/Sprints/LbaasNewtonSprint
20:01:51 Midcycle - yada, yada
20:01:59 Please RSVP for it if you haven't already.
20:02:13 Or yes, "Midcycle - yada, yada"
20:02:24 o/ I'm late
20:02:28 yep, be sure not to be vegetarian :-)
20:02:35 hi
20:03:02 TrevorV: You know, if you don't point it out explicitly, people probably wouldn't notice. ;)
20:03:08 +1
20:03:12 sbalukoff, I have a natural -v
20:03:17 Haha
20:03:24 also I am pretty sure easiness are fast approaching...
20:03:44 easiness?
20:03:51 * TrevorV xgerman lost me
20:04:17 Me too.
20:04:23 sorry my keyboard hates a few characters
20:04:25 deadlines
20:04:28 Oh!
20:04:43 Right.
20:04:49 Yes, get your code done.
20:04:56 N-3 8/29
20:05:07 #link http://releases.openstack.org/newton/schedule.html
20:05:12 Right after the mid-cycle.
20:05:15 that seems to be feature freeze as well
20:05:45 Yep. So if it doesn't get in by the end of the mid-cycle, it probably doesn't get in. Unless you're really good at bribing people to work through the weekend.
20:06:03 #topic Brief progress reports / bugs needing review
20:06:35 #link https://review.openstack.org/#/c/326018
20:06:40 I apologize that I had almost no time for upstream work this last week. The good news is that the internal stuff I was working on is basically done, so now I have a lot of time for upstream reviews and whatnot.
20:06:44 * TrevorV regrets to inform that he has still been working almost entirely downstream
20:06:59 So, I'll be spending a number of days working on that and hopefully getting people un-stuck.
20:07:18 https://review.openstack.org/345003, https://review.openstack.org/350274 - my two patches
20:07:22 * xgerman has been re-orged again and is busy with containers
20:07:25 #link https://review.openstack.org/#/c/323188
20:07:50 #link https://review.openstack.org/345003
20:07:51 diltram: Can you link those on separate lines (the way evgenyf just did)?
20:07:56 Haha
20:07:57 Ok.
20:08:06 #link https://review.openstack.org/350274
20:08:08 done :)
20:08:09 yeah, I am more like Review-as-a-Service, so holler if you need me to look at sth
20:08:21 xgerman: Ok, cool.
20:08:33 I'm glad you're able to still do n-lbaas / octavia reviews at all. :)
20:08:51 in my copious free time ;-)
20:09:02 do not have updates this week, still the same as last week...
20:09:04 Parts of my team are also stuck in container hell right now. Luckily, my "evade" rolls have been excellent the last couple weeks.
20:09:06 https://review.openstack.org/#/c/342695/
20:09:44 chenli: Ok, and I apologize I didn't get reviews of your code done last week. Unless I get hit by a bus, it should happen this week.
20:09:50 And I have a question about this one: https://review.openstack.org/#/c/347715/
20:10:28 hi
20:10:40 chenli: Can you ask your question during open discussion?
20:10:48 +1
20:10:59 sbalukoff: sure
20:12:20 Any other progress reports?
20:12:56 #topic Patches requiring discussion/consensus
20:13:10 https://review.openstack.org/#/c/325624/ (Update member status to return real statuses) [rm_work]
20:13:26 #link https://review.openstack.org/#/c/325624/
20:13:39 rm_work?
20:14:37 anybody?
20:14:43 Hmmm....
20:14:50 otherwise I will defer until johnsom is back ;-)
20:15:06 though it didn't look offending to me
20:15:19 So, really, rm_work needs to offer input on this.
20:15:21 :/
20:15:28 Not sure where rm_work is, but maybe defer until Michael is around.
20:15:32 Yep.
20:15:34 +1
20:15:44 ok, will defer the second rm_work topic as well
20:16:14 #link https://review.openstack.org/#/c/255963/
20:16:22 unless anybody has info...
20:16:46 Oh!
20:16:50 I see sbalukoff among the reviewers...
20:16:53 I think that one was my objection.
20:17:12 So, the way this patch works, there are no restrictions on what the tenant can specify here.
20:17:47 and you're feeling that the tenant will destroy something?
20:18:02 And there's no "standard way" of doing things defined.
20:18:20 So, for example, if you want X-Forwarded-For inserted...
20:18:21 I am always a bit worried about free-for-alls as well
20:18:32 One vendor could tell you to set the parameter to X-Forwarded-For: True
20:18:40 And another to: X-Forwarded-For: client_ip
20:18:43 Or some such nonsense.
20:19:05 This also opens up this feature to being abused by vendors to activate vendor-specific functionality through the neutron-lbaas API.
20:19:17 Which breaks cloud inter-operability.
20:19:31 I recall us having this conversation a long time ago, and I thought we were going to add some defaults in with basic validation around their structure/args, and then add others if they became "normal"
20:19:59 I think we did discuss this at the January mid-cycle. But I forget what our conclusions were.
20:20:08 hahaha
20:20:18 Well I'm against arbitrary headers...
20:20:21 In any case, I feel bad poo-pooing the idea in this meeting if there's not anyone here to defend the way it's implemented in this patch set.
20:20:46 Yeah, push it off for when rm_work and johnsom are here
20:20:51 Ok.
20:20:51 k
20:21:16 https://review.openstack.org/#/c/333800/ (How to set loadbalancer_pending_timeout) [chenli]
20:21:31 that's the last one and I believe chenli is here...
20:21:53 Again, the major objections here were from johnsom.
20:21:59 This change tries to fix the case where a load balancer stays in pending status forever
20:22:07 We discussed it a little last week. I think people were supposed to comment on the patch.
20:22:10 :)
20:22:15 (Again, apologies I didn't end up having time for that.)
20:22:21 same here
20:22:54 So, I think johnsom's objection wasn't that this is a bad idea, but I think he thought this would happen eventually anyway (like after an hour).
20:23:07 If that's not the case, then we definitely have a bug on our hands.
20:23:29 unfortunately it will not change
20:23:31 I think stuff stays in pending forever
20:23:37 diltram +1
20:23:43 Also, I think he didn't like it happening as part of an API get request-- that it ought to be something that happens in the housekeeper or something.
20:23:44 and worse you can't delete
20:24:00 Yeah, that's a problem.
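
A minimal sketch of the housekeeping-style check being debated at this point; as the discussion below shows, the team ultimately leans toward fixing this in the flows instead. The one-hour timeout, the class, and the attribute names are illustrative assumptions, not Octavia's actual schema or config.

    # Illustrative only: periodically flip LBs stuck in a PENDING_* state
    # past a timeout to ERROR, so they can at least be deleted instead of
    # hanging forever.  Timeout and field names are assumptions.
    import datetime

    PENDING_TIMEOUT = datetime.timedelta(hours=1)
    PENDING_STATES = ('PENDING_CREATE', 'PENDING_UPDATE', 'PENDING_DELETE')

    class LoadBalancerRecord(object):
        """Stand-in for a load balancer DB row (hypothetical)."""
        def __init__(self, lb_id, provisioning_status, updated_at):
            self.lb_id = lb_id
            self.provisioning_status = provisioning_status
            self.updated_at = updated_at

    def mark_stale_pending(load_balancers, now=None):
        """House-keeping pass: mark LBs stuck in PENDING_* as ERROR."""
        now = now or datetime.datetime.utcnow()
        for lb in load_balancers:
            if (lb.provisioning_status in PENDING_STATES
                    and now - lb.updated_at > PENDING_TIMEOUT):
                lb.provisioning_status = 'ERROR'
        return load_balancers
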
20:24:00 because there is nothing checking whether it gets updated or not, and nothing working on this lb
20:24:38 the original idea was to do this on the api get_loadbalancers_list
20:24:53 Yep. And I see two bugs referenced by this... so yes, this is definitely a bug
20:25:16 and we proposed to do this in hk but johnsom doesn't want to add this code there
20:25:17 It seems "strange" to change the status of something on a get request.
20:25:24 Get requests should never cause changes to the system
20:25:29 yep
20:25:39 I think hk is the right place as well
20:25:56 exactly, because of that we're still looking for some place which should be responsible for that
20:26:07 I have strong feelings on this
20:26:10 health might be another area
20:26:15 johnsom!!
20:26:20 HK does sound like the right place for this. I don't think this should be part of a CW thread, as CW is likely to get restarted and then we've lost the ability to recover from an indefinite PENDING status.
20:26:26 Oh yay!
20:26:26 Ignoring a finance presentation
20:26:29 HAHA
20:26:33 On mobile.
20:26:35 Give us your feels!
20:26:41 I was talking about health and it looks like a good place
20:27:03 I know this is kinda odd to propose, but what if o-hk just checked for a running controller worker?
20:27:14 Then updated it to ERROR state or something
20:27:31 TrevorV: Still doesn't fix the problem if the CW gets restarted.
20:27:40 So, my issue is this should be handled in the flow/thread that was taking the action. I really don't want outside threads guessing at what is going on in other processes
20:27:44 TrevorV: Also sounds unnecessarily complicated.
20:27:47 if it gets restarted it should re-establish connection, right?
20:27:53 but how will o-hk know that this particular lb was created when o-cw was down?
20:27:59 Oh, you mean the flow dies
20:28:00 Got it.
20:28:41 johnsom: I think the problem is that the flow won't always be reliable.
20:28:46 Controller dies, this is why we need to enable job board
20:28:46 we need to implement flow checkpointing and state keepers
20:28:55 Right, job board.
20:29:01 +1
20:29:02 Right. Job board.
20:29:18 But... that's probably a lot of work. Should this bug fix wait until then?
20:29:20 ok, so the flow should wait and if it doesn't go active move it to error
20:29:41 I thought this was trying to solve our broken revert situation, not complete controller failure
20:29:43 Or rather, should we make it part of the flow, and just live with the fact that restarting CW could break things until we get job board implemented?
20:29:57 sbalukoff: in my opinion, yes
20:30:01 +1
20:30:16 sbalukoff: +1
20:30:40 Right. Because we expect o-cw to be pretty reliable. Certainly more so than the amphora boot process (and its bazillion dependencies)
20:30:56 I'm OK with that. What do you think, johnsom?
20:31:02 wait, now I'm confused, is the problem not "what do we do when a create_lb flow is interrupted?"
20:31:16 TrevorV: No, that's not the problem.
20:31:22 I see two issues. 1: flows that don't terminate in an end state, i.e. active or error. This is a "fix the reverts" issue
20:31:26 Or rather, not the problem we're trying to solve right now.
20:32:04 2: the controller process dies, so the flow doesn't progress. This is where I think job board is the right answer
20:32:18 +1 on job board being the solution to problem 2.
20:32:21 johnsom, so in either case, this review is "moot"?
20:33:03 can I ask, what is 'job board' ?
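
For reference on the "job board" question above: it refers to TaskFlow's jobboard/persistence mechanism (diltram links the docs just below). A rough sketch of the idea follows; the board name, ZooKeeper settings, and helper functions are assumed for illustration, and the exact TaskFlow calls should be checked against those docs.

    # Rough sketch of the TaskFlow "job board" idea: work is posted to a
    # shared (here ZooKeeper-backed) board, so if a controller worker dies
    # mid-flow another worker can claim the job and resume it, instead of
    # leaving the LB in PENDING_* forever.  Names and config are illustrative.
    import contextlib

    from taskflow.jobs import backends as job_backends

    JOBBOARD_CONF = {
        'board': 'zookeeper',          # assumed backend
        'hosts': ['127.0.0.1:2181'],   # assumed ZooKeeper endpoint
    }

    def post_create_lb_job(lb_id):
        # Producer side: record the pending work on the shared board.
        with contextlib.closing(
                job_backends.fetch('octavia-jobs', JOBBOARD_CONF)) as board:
            board.connect()
            board.post('create-load-balancer', details={'lb_id': lb_id})

    def run_unclaimed_jobs(worker_name):
        # Consumer side: claim jobs (including ones orphaned by a dead
        # controller) and run or resume their flows.
        with contextlib.closing(
                job_backends.fetch('octavia-jobs', JOBBOARD_CONF)) as board:
            board.connect()
            for job in board.iterjobs(ensure_fresh=True, only_unclaimed=True):
                board.claim(job, worker_name)
                try:
                    # ... load and resume the flow from persistence here ...
                    board.consume(job, worker_name)   # finished successfully
                except Exception:
                    board.abandon(job, worker_name)   # let another worker retry
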
20:33:08 I did not like the approach in this CR and wanted to discuss it with the wider team
20:33:13 So, we should fix the flow that leaves things in PENDING status indefinitely. Possibly introducing a new timeout in the process. But it won't be something that gets set on a GET, nor will it be part of HK?
20:33:49 Yeah, that would be fixes in the flows
20:33:52 Imho
20:33:58 Thanks for your feels, johnsom. I think you are right.
20:33:58 Well, apparently it just means we need to make sure all our reverts are functional, but idk that it involves any kind of timeout
20:34:02 chenli: http://docs.openstack.org/developer/taskflow/persistence.html
20:34:11 diltram: thanks!
20:34:34 TrevorV: Yep.
20:34:38 johnsom, does that sound right to you as well?
20:34:47 It would be much better if the error comes as soon as possible anyway.
20:34:52 sbalukoff: there is the create_lb_flow rewritten so it should not leave in pending create any more
20:34:54 Instead of relying on a timeout.
20:34:55 any lb
20:34:57 Thanks diltram
20:35:12 Oh, ok!
20:35:18 Cool beans.
20:35:23 chenli, johnsom: np
20:35:28 diltram, that's a separate review though right? Not the one we're talking about here? So maybe this review is just... "dead" (No offense)
20:35:43 TrevorV: Possibly, yes.
20:35:54 I haven't seen the other review yet.
20:35:56 TrevorV: looks like this review is dead
20:35:59 Sorry, I just want to be clear about what's being discussed and such.
20:36:09 But yeah, setting this status on a GET is not the right way to do it in any case.
20:36:10 ok, I will abandon this one, for now.
20:36:18 Sorry chenli, thanks for your work.
20:36:18 Ok!
20:36:28 there is the new create_lb_flow + johnsom is working on the failover flow + implementing job boards
20:36:35 Oh!
20:36:53 TrevorV: thanks
20:36:56 Yeah, diltram one of those is the review you linked earlier right? I'll try and take a look later. Was concerned you merged two flows together
20:36:58 Ok. I'll look for that when it's ready, then. Is there a review number for that already?
20:37:05 TrevorV: yes
20:37:14 Ok.
20:37:17 Linked previously.
20:37:27 I merged them to never ever leave an lb in pending_create
20:37:30 Well, I have not started work on job boards yet. I think the merge is higher priority right now
20:37:50 Probably. o-cw doesn't die that often.
20:37:55 because right now if something breaks it should revert everything
20:38:06 johnsom: +1
20:38:17 Yeah, diltram fixed our tech debt there. Looking forward to reviewing it!
20:38:19 diltram, agreed, but typically we kept flows separate due to their repetitive use, that's what I was concerned about
20:38:37 Just going to read through it, could totally be a non-issue, ya know?
20:38:53 TrevorV: I'm reusing existing flows so no worry :)
20:39:00 Awesome awesome
20:39:03 it's only a merge and a small rebuild
20:39:14 +1
20:39:40 TrevorV this was a single flow that was split due to missing functionality in taskflow. It was fixed, so diltram fixed the todo
20:40:11 nice
20:40:50 Ooooh sick, thanks johnsom
20:41:13 thx johnsom
20:41:40 #topic Open Discussion
20:41:50 looks like we might use the whole hour :-)
20:42:05 sometimes we need to :P
20:42:13 one question about updating the pool: https://bugs.launchpad.net/python-neutronclient/+bug/1606822
20:42:13 Launchpad bug 1606822 in python-neutronclient "can not update lbaas pool name" [Undecided,New] - Assigned to li,chen (chen-li)
20:42:50 neutron-client is a bit tricky since I lost track of what's going on with openstack client (Bosc)
20:42:52 osc
20:42:56 is anyone actively testing a multinode octavia installation?
20:43:16 xgerman: x-client is deprecated
20:43:18 diltram: You mean, putting the controller on different hosts?
20:43:24 we (=HPE) install that at our customers
20:43:24 sbalukoff: yes
20:43:36 when I update the pool name, it asks for --lb-algorithm, but Reedip said in the comments that this is requested by the lb team: https://review.openstack.org/#/c/347715/
20:43:46 diltram: We've been doing a little testing of it at IBM. I expect we'll be doing a lot more testing of it in coming weeks.
20:44:10 yes, we've been doing full functional testing recently.
20:44:12 ok, great, so something is working :P
20:44:18 It is not yet deprecated last I saw, but there is guidance. I think bug fixes are still ok
20:44:19 using CLI commands
20:44:32 chenli nice!!
20:44:38 chenli: Aah-- ok, so if the lb_algorithm is set, it shouldn't be asking for it again. Especially not on an update. So yes, that's a bug.
20:44:50 I heard a rumor that neutron-client will be a plugin to osc
20:45:06 xgerman: really?
20:45:07 xgerman: Sure. But we have to work with what we have right now, yes?
20:45:18 xgerman: o.. then we'd better test with osc...
20:45:39 There is a parallel path for a while. We are moving to osc plugins.
20:45:53 Does osc do any neutron-lbaas stuff right now? (I haven't looked in a while.)
20:46:05 If you really want I can summon the neutron PTL and he will lecture you
20:46:12 :P
20:46:21 he-who-should-not-be-named
20:46:22 I saw him last week. He gave me the evil eye.
20:46:25 I was tickled pink.
20:46:25 sbalukoff: last I looked, no
20:46:45 Hahahhaa
20:46:57 ok, so we should be good with that pool fix ;-)
20:47:07 nothing reported grepping for lbaas
20:47:09 Right.
20:47:18 There is a spec that covers this. I don't have the link on my mobile
20:47:30 k, sounds good
20:47:44 or load
20:47:54 ok! I'll add a test case for it and wait for reviews. :)
20:47:56 If it is a bug, I can lobby to get it in. The deadline will be soon though
20:48:03 Ok!
20:48:09 yep, ONE week before code-freeze
20:48:16 so about two weeks
20:48:20 So, two weeks-ish...
20:48:21 As the client freezes early each cycle
20:48:23 jinx!
20:48:30 Another question is about https://bugs.launchpad.net/octavia/+bug/1607818
20:48:30 Launchpad bug 1607818 in octavia "Can not get loadbalancer statistics by neutron command" [Undecided,New]
20:48:31 johnsom +1
20:48:57 no statistics in the neutron db currently?
20:49:13 Yeah, our statistics interface is shit.
20:49:24 We need statistics. It is very important.
20:49:25 I think we need to overhaul it.
20:49:41 Yep, it's essential for billing in any case.
20:49:42 So, I think there is a listener vs load balancer disconnect if I remember
20:50:11 yeah, this is something we were too hand-wavey about when we designed lbaas v2.
20:50:12 In Octavia we wanted listener level stats
20:50:16 And now it's biting us in the ass.
20:50:43 For ssl vs non-ssl listeners if I remember right
20:50:57 Yes. Because billing will likely be vastly different for them.
20:51:29 Can someone confirm they are different octavia vs v2?
20:51:30 Al worked on making it for listeners in LBaaS v2 a long time ago
20:51:48 xgerman: How far did he get?
20:52:02 not far... probably abandoned
20:52:06 Any links for previous work?
20:52:08 Again, being on mobile is rough.
20:52:22 Octavia definitely does statistics at the listener level.
20:52:29 Yes
20:52:41 Looking in n-lbaas real quick...
20:52:45 I can't remember v2
20:52:49 lol
20:52:55 https://review.openstack.org/#/c/158823/
20:53:13 abandoned
20:53:36 Yes, n-lbaas v2 has it at the load balancer level.
20:53:46 diltram yep
20:53:48 If we can pass through the query, that would be cool. Otherwise the event streamer would need to be updated to sync stats as well as status
20:53:52 it was close though ;-)
20:54:16 This might also be a defer to the merged api work...
20:54:18 I am for passing through the query...
20:54:25 We should probably look at adding statistics at the listener level to n-lbaas as well.
20:54:29 so we need to add a REST API in octavia?
20:54:34 +1000
20:54:38 Since it really is going to be important for billing.
20:54:42 chenli +1
20:54:56 there was also a plan to stream them straight to ceilometer, etc.
20:55:04 but an API is needed as well
20:55:20 (hence the handwaving sbalukoff alluded to)
20:55:22 It sounds like we need to define a spec about this.
20:55:29 can anyone kick that patchset - https://review.openstack.org/#/c/324197/ - it's stuck
20:55:39 Streaming to ceilometer is a whole different conversation
20:55:45 Because I want to make sure people have a chance to give feedback on the nature of this feature. :/
20:56:02 maybe at first we will grab that data
20:56:06 +1
20:56:16 there even was a spec for streaming straight from amps on the ceilometer side (also abandoned)
20:56:20 after that we can think about which way to share them
20:56:22 diltram +1
20:56:34 I like your pragmatism
20:56:41 thx ;)
20:56:43 We do collect it in octavia
20:57:10 y, I see the octavia database updated by o-hm
20:57:25 well, there was talk that this would not scale
20:57:43 but let's run with that ;-)
20:57:50 Heh!
20:58:13 I doubt mysql can go beyond 10,000 LBs
20:58:26 That is why stats have their own db setup. But this is yet another different issue
20:58:33 Indeed.
20:58:40 ok, enough bikeshedding
20:58:45 T-2
20:58:50 What, we have a whole minute left!
20:59:04 ok. so first we need to add a REST API to feed neutron-server the statistics
20:59:07 in case we have a "serious" topic
20:59:16 I will try to work on that :)
20:59:19 chenli +1
20:59:21 Thanks, chenli!
20:59:26 because this is really important to us.
20:59:27 and very first we need a spec ;-)
20:59:35 Yes, please spec this out!
20:59:40 sure!
20:59:49 So, the original was about exposing the stats. Look at the octavia api spec and implement what was approved
20:59:52 chenli: understood - but I wouldn't be surprised if it misses N
20:59:56 Yes!
21:00:09 It's OK so long as it makes it into master.
21:00:19 #link https://review.openstack.org/#/c/324197/
21:00:22 Anyway, thanks folks!
21:00:34 again, please, some core kick that patchset in the ass
21:00:44 diltram: Ok!
21:00:48 thx and cu
21:00:51 #endmeeting
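
A small illustration of the stats disconnect discussed above: Octavia keeps statistics per listener, while the neutron-lbaas v2 API reports them per load balancer, so any pass-through or event-streamer sync has to aggregate. The counter names follow the existing v2 stats response; the helper itself is only a sketch, not code from either project.

    # Illustrative only: sum per-listener counters into the load-balancer
    # level structure the neutron-lbaas v2 /stats call returns.
    def aggregate_lb_stats(listener_stats):
        totals = {'bytes_in': 0, 'bytes_out': 0,
                  'active_connections': 0, 'total_connections': 0}
        for stats in listener_stats:
            for key in totals:
                totals[key] += stats.get(key, 0)
        return totals

    # Example: an SSL and a non-SSL listener on one LB, which is exactly the
    # case where per-listener numbers matter for billing.
    print(aggregate_lb_stats([
        {'bytes_in': 1200, 'bytes_out': 5400,
         'active_connections': 3, 'total_connections': 90},
        {'bytes_in': 800, 'bytes_out': 2600,
         'active_connections': 1, 'total_connections': 40},
    ]))
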