15:00:05 #startmeeting openstack-helm
15:00:05 Meeting started Tue Aug 1 15:00:05 2017 UTC and is due to finish in 60 minutes. The chair is srwilkers. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:06 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:09 The meeting name has been set to 'openstack_helm'
15:00:14 hello
15:00:21 hello everyone o/
15:00:26 o/
15:00:46 got the link out late -- here's the etherpad for today. let's take about 5 minutes and add any agenda items we'd like to discuss today: https://etherpad.openstack.org/p/openstack-helm-meeting-2017-08-01
15:01:29 o/
15:01:30 o/
15:05:10 nice :) let's get to it then
15:05:20 #topic release schedule
15:05:26 v1k0d3n: floor's yours
15:07:19 portdirect: i'll defer to you since i see you've left comments here
15:07:27 hey all o/
15:07:33 hey dude :)
15:07:35 floor's yours
15:07:38 crazy morning, sorry for being late.
15:07:40 oh - it's started - lol
15:07:55 so i see that this is a topic at the PTG too...
15:08:09 but can we at least get the ball rolling a bit on RC releases?
15:08:35 I think we need to define what we mean by a release
15:08:39 and what that entails
15:08:52 from my perspective there are two criteria:
15:08:57 1) we work :D
15:09:34 it would help first with point-in-time releases, and it would also help with getting vendors involved...because they can reliably say "use this RC release with our code". contrail is a good example of this: they are currently using tiller v2.4.2. as we keep moving forward...they are left behind, and this can be frustrating.
15:09:36 2) we are locking in a version of our format/layout of charts
15:09:59 v1k0d3n: they are using tiller 2.5 on their master
15:10:16 i'll push them to update their public code
15:10:18 yes...working is good. i think identifying _what_ works is good enough for RC releases though. and of course documenting what we know _doesn't_ work.
15:10:22 would that be ok?
15:10:51 i thought i was using master when attempting to package. i will check again.
15:11:08 i am starting to work with them here as well. also folks on the ODL side.
15:11:27 thanks for the extra push portdirect :)
15:11:39 nice - we have a bm lab here that we are working on with juniper
15:11:40 but would RC releases be acceptable for the team?
15:12:02 under the condition that we document what does/doesn't work for that RC?
15:12:05 I'd be more comfortable defining the criteria for a release before pushing for one
15:12:16 can we make this a topic at the ptg?
15:12:18 +1 on portdirect
15:12:25 there are some things that need to be tied up before i think setting releases is appropriate
15:12:53 that works too. but to be fair, we've had a few releases that were mostly working at one point...and never really branched/tagged anything. it would be nice to capture those when we hit a milestone.
15:13:05 currently master always works :D
15:13:28 with conditions ;)
15:13:30 we should keep it that way. :D
15:14:16 i also would like to discuss what "dependencies" we would like to present with our release/tag policy. for example, openstack version info.
15:14:25 if you were able to start working on the code again v1k0d3n it would be awesome - it would help us get there quicker
15:14:50 as a user explores openstack-helm...there's no real snapshot view of what works and what doesn't.
15:15:11 i am trying portdirect. changing jobs is pretty....disruptive. :)
15:15:25 so, i guess no on the RC releases is what i'm hearing.
15:15:55 which is fine...but how do we proceed. what is expected, or what is the schedule, besides AT&T's 1710 and 1802?
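[editor's note: the vendor-pinning concern above (e.g. Contrail building against tiller v2.4.2) can be checked mechanically before deploying charts. A minimal sketch in Python, assuming Helm 2's `helm version --server` output contains a SemVer string; the helper names and the required version are illustrative and not part of any openstack-helm tooling:]

```python
import re
import subprocess


def tiller_server_version():
    """Return the tiller (Helm server) version reported by `helm version --server`."""
    # Helm 2 prints something like: Server: &version.Version{SemVer:"v2.4.2", ...}
    out = subprocess.check_output(["helm", "version", "--server"]).decode()
    match = re.search(r'v(\d+\.\d+\.\d+)', out)
    if not match:
        raise RuntimeError("could not parse tiller version from: %s" % out.strip())
    return match.group(1)


def check_tiller(required="2.4.2"):
    """Warn if the running tiller differs from the version a chart set was validated against."""
    found = tiller_server_version()
    if found != required:
        print("warning: charts were validated against tiller v%s, found v%s"
              % (required, found))
    return found


if __name__ == "__main__":
    check_tiller()
```

[a tagged RC could record the tiller/kubernetes versions it was validated against in exactly this way, giving the "snapshot view of what works" asked for above.]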
15:15:55 i think it'd be appropriate to discuss this more in detail during the PTG -- i think timeboxing that decision to part of a weekly meeting is difficult
15:16:11 in order to reliably release anything, the gates and functional tests should be improved
15:16:54 to make sure that ceph, ingress or any OpenStack project is working correctly
15:17:28 ok, well i guess it will have to wait until the PTG.
15:18:44 any other points of discussion here?
15:19:43 #topic list of reviews
15:19:51 korzen: floor's yours to start :)
15:20:18 ok, so like I told you last meeting
15:20:33 I have prepared the specs for the Neutron multi-SDN approach
15:20:51 I would like to get some reviews there
15:21:01 if everybody is on the same page
15:21:19 korzen: glanced at it this morning. looking good :)
15:21:19 the general approach is captured here:
15:21:37 #link https://review.openstack.org/#/c/487427/1 spec: Neutron multiple SDNs design
15:21:53 #link https://review.openstack.org/#/c/489580/2 spec: Add linux bridge to Neutron chart
15:22:06 as well as the detailed implementation:
15:22:14 #link https://review.openstack.org/#/c/481225 Neutron: Enable decomposition of chart into discrete manifests
15:22:30 #link https://review.openstack.org/#/c/466293/14 Neutron: add linuxbridge daemonset and config script
15:22:33 for those specs
15:22:41 so you can take a look at the general idea
15:22:50 as well as the implementation details
15:22:56 I need to review the workings of the first two - but am totally happy with the direction.
15:23:15 we should get infra to provide another three-node gate for lb
15:23:20 and retire the two-node one i think
15:23:32 yeah, the two-node can go as far as i'm concerned
15:23:47 thanks jayahn for updating the SONA chart
15:23:59 #link https://review.openstack.org/#/c/489500/ SONA
15:24:13 in that link, you can see what external SDN integration would look like
15:24:16 not me, it is siri. :)
15:24:24 ^^ let's get that in the gate too :)
15:24:41 as without it, it's hard for many people to review
15:25:10 yes
15:25:10 +1
15:25:12 anything we need to follow up on to do "let's get that in the gate too"?
15:25:13 and the docs
15:25:58 #link https://review.openstack.org/#/c/470326 docs: Neutron multiple SDN approach design
15:26:10 jayahn: i'll ask for a three-node gate for osh infra as well
15:26:11 this is my old documentation
15:26:49 I would need to rewrite it
15:27:15 and it should contain the overall information about networking inside OSH
15:27:17 korzen: if you could that would be awesome - thanks for really moving this forward
15:28:08 korzen: nice
15:28:40 ok, I guess I'm done here
15:28:55 really hoping for the reviews
15:28:57 :)
15:29:04 any other review-related topics from anyone?
15:29:24 I do not see dulek around, but what about his fernet tokens?
15:30:22 #link https://review.openstack.org/#/c/463730/
15:30:23 Add support for Keystone's fernet tokens
15:31:54 I kinda dropped the ball there
15:32:01 really want it
15:32:08 but also want it in the gate properly
15:32:31 if dulek could add support and the docs for cron jobs then we should merge asap
15:32:45 but I'll try to find a window to complete my ps there
15:32:51 ok
15:33:14 https://review.openstack.org/#/c/484958/
15:33:34 though if anyone wanted to take that over I'd be grateful :)
15:35:06 #topic open discussion
15:35:18 any other topics not listed to cover?
15:36:29 not from me
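[editor's note: on the fernet review above, the "cron jobs" requested would periodically rotate Keystone's fernet keys. A minimal sketch of such a Kubernetes CronJob, rendered from Python rather than from a Helm template; the image, schedule and apiVersion are assumptions, not taken from the patch under review. Requires PyYAML:]

```python
import yaml  # pip install pyyaml


def fernet_rotate_cronjob(namespace="openstack",
                          image="docker.io/kolla/ubuntu-source-keystone:3.0.3"):
    """Build a CronJob manifest that runs `keystone-manage fernet_rotate` on a schedule."""
    return {
        "apiVersion": "batch/v2alpha1",  # CronJob was alpha at the time; newer clusters use batch/v1beta1
        "kind": "CronJob",
        "metadata": {"name": "keystone-fernet-rotate", "namespace": namespace},
        "spec": {
            "schedule": "0 */12 * * *",  # rotate twice a day (illustrative)
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "OnFailure",
                            "containers": [{
                                "name": "keystone-fernet-rotate",
                                "image": image,
                                "command": ["keystone-manage", "fernet_rotate",
                                            "--keystone-user", "keystone",
                                            "--keystone-group", "keystone"],
                            }],
                        }
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    print(yaml.safe_dump(fernet_rotate_cronjob(), default_flow_style=False))
```

[note that a real rotation job also has to publish the rotated key repository back to every keystone replica (e.g. via a Secret), which is the part the review and its gating still need to cover.]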
15:36:53 ah.
15:37:07 one question about liveness checks
15:37:26 what is the current status on implementing liveness checks?
15:38:13 for each service? i was following up, but then it seems like it ended with "we need operator feedback on this". correct?
15:39:15 i think the takeaway was that we'd tie that in to some of the prometheus work that was targeted
15:42:37 anything else?
15:43:51 we agreed that the liveness should be reported via log analysis
15:44:03 thus the work on LMA
15:44:28 yep. there's been some feedback left on the LMA spec in flight. I'll get that updated accordingly today
15:44:51 setting up a liveness probe on each pod, and that should be reported via log analysis?
15:45:41 the first idea was that we should check the DB connection, rabbitmq
15:45:59 and service reporting in the server, like nova-manage service list
15:46:10 as a liveness check
15:46:57 but then the conclusion was: if a service is not able to connect to the DB/rabbit/nova api - can we restart the pod and expect to fix the situation?
15:47:55 so the second idea is to report that some service has issues as alarms
15:48:31 okay, i got the idea of how the discussion went. :) I will also think about this more.
15:48:48 yeah it's a tricky one
15:48:48 yeah, feedback would be awesome. :) can adjust as necessary
15:48:53 can be very service / case specific.
15:49:03 personally I'm very worried about restarting the pod if a dep goes down
15:49:18 as I have nightmares of a situation cascading out of control
15:49:27 ++ portdirect
15:49:28 in the event of a small outage of, say, rabbitmq
15:49:33 domino effect
15:49:37 true
15:50:54 yeah, true.
15:51:07 so we are currently working on the LMA infra
15:51:13 prometheus, fluentd
15:51:28 to be able to support such a use case
15:54:08 okay, we can end this one now, only 7 min left. obviously, this requires more thinking.
15:54:24 sounds good :)
15:55:26 i'll go ahead and close out the meeting. we can carry over any remaining conversation to openstack-helm. see you there :)
15:55:29 #endmeeting
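[editor's note: a minimal sketch of the dependency check debated above (DB and RabbitMQ reachability), with assumed service names and ports. As the meeting concluded, wiring this into a livenessProbe is risky - a short rabbitmq outage would restart every dependent pod - so output like this is better fed to the LMA/prometheus side as an alarm than used as a restart trigger:]

```python
# Plain TCP reachability checks; the service names and ports are assumptions.
import socket
import sys

DEPENDENCIES = {
    "mariadb": ("mariadb.openstack.svc.cluster.local", 3306),
    "rabbitmq": ("rabbitmq.openstack.svc.cluster.local", 5672),
}


def reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def main():
    failures = sorted(name for name, (host, port) in DEPENDENCIES.items()
                      if not reachable(host, port))
    if failures:
        # If this ran as an exec livenessProbe, the non-zero exit would restart the
        # pod on every dependency outage - the domino effect the meeting warned about.
        print("degraded: unreachable dependencies: %s" % ", ".join(failures))
        return 1
    print("ok: all dependencies reachable")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```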