16:01:08 #startmeeting kolla
16:01:09 Meeting started Wed Oct 12 16:01:08 2016 UTC and is due to finish in 60 minutes. The chair is inc0. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:10 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:13 The meeting name has been set to 'kolla'
16:01:14 o/
16:01:15 woot
16:01:19 0/
16:01:20 o/
16:01:25 o/
16:01:25 o/
16:01:27 o/
16:01:28 o/
16:01:29 o/
16:01:30 o/
16:01:31 o/
16:01:33 no roll call topic?
16:01:34 hold on
16:01:40 #topic rollcall - woot for kolla
16:01:42 o/
16:01:44 lul
16:01:49 \o/
16:01:54 woot guys plz;)
16:01:55 woot
16:01:57 woot
16:01:58 woot
16:02:04 i do not consent ;)
16:02:13 \o/00T
16:02:14 woot again
16:02:16 woot
16:02:19 woot
16:02:21 vhosakot, nice
16:02:26 woot
16:02:27 ;)
16:02:37 \o/
16:02:47 ok, cores please -2 sdakes changes until he submits to our internal rituals
16:02:58 your pain not mine :)
16:03:08 (kidding ofc, we'll deal with it physically in summit if he won't)
16:03:18 #topic announcements
16:03:21 buy me lots o beers
16:03:21 i'm broke ;)
16:03:42 1. RC2 is tagging today, we need to open stable/newton branch
16:03:58 we will have rc3 due to critical bugs outstanding
16:04:08 it's going to be Oct 18
16:04:23 2. reminder - summit schedule is up for review
16:04:41 any announcements from community?
16:05:18 guess not
16:05:25 #topic newton release
16:05:27 sdake, you're up
16:05:47 https://launchpad.net/kolla/+milestone/newton-rc2
16:06:00 the summary there is there are 7 critical bugs that make kolla unusable
16:06:13 we have a slew of high bugs which make a particular service have some kind of defect
16:06:25 o/
16:06:32 on the plus side, we have fixed 66(!) bugs in rc2
16:06:44 we also have a slew of bugs in the "INPROGRESS STATE"
16:06:55 if there is a bug in INPROGRESS that can be merged, let's get it merged
16:07:11 after we tag rc2, only critical bugs will be backported to stable/newton
16:07:21 a few mechanics on the tagging/branching
16:07:41 rc2 tags today 23:45 UTC -> the release team tags when they get to it
16:07:51 the branch gets created when the release team gets to it
16:07:51 they are not slow
16:07:59 but don't expect a branch immediately when the tag occurs
16:08:13 for rc3, we are carrying 7 bugs
16:08:21 these need critical attention
16:08:27 there are some other bugs in that list that are in triaged
16:08:32 that are critical
16:08:56 if someone could get them out of the triaged state, that would be fantastic (by confirming, marking invalid, etc)
16:09:10 i will be carrying over triaged bugs that are critical as well as confirmed bugs
16:09:39 if they are marked critical - I assume whoever triaged it understands the bug impacts all of kolla
16:09:51 now, what to do if you have a high severity bug that you think should be critical?
16:09:54 change it to critical
16:10:02 (before 23:45)
16:10:21 the workflow seems to be going well from last meeting
16:10:42 it would be great if we could finish the job on those inprogress bugs rather than having to backport them, so i'd suggest fixing those today :)
16:10:46 or reviewing them
16:10:51 etc
16:10:58 if it's critical, just assume it will be fixed starting tomorrow
16:11:04 I need 1 more week out of everyone :)
16:11:12 and then we have free time for some time :)
16:11:23 (tag is tuesday for rc3)
16:11:32 3 days, then summit:)
16:11:46 shhhh :)
16:12:03 after summit - you will have plenty o time to recover :)
16:12:40 inc0 you wanted a status update on each bug?
16:12:52 rather each critical
16:12:53 yes, let's do this, for critical at least
16:13:04 cool take it away - have to do something rq like :)
16:13:08 #topic critical bugs review
16:13:26 let's spend a few moments to discuss each critical bug
16:13:37 to help fix it
16:13:53 1 https://bugs.launchpad.net/kolla/+bug/1564773
16:13:53 Launchpad bug 1564773 in kolla mitaka "Continously restarting rabbitmq container for CentOS" [Critical,Confirmed]
16:14:55 hmm, anyone have steps to reproduce this one?
16:15:34 i followed this bug. but got nothing about it. :(
16:15:58 dug
16:15:59 duh
16:16:02 Bug report time is too long.
16:16:13 yes, but comments are fres
16:16:14 h
16:16:20 so it's still a thing as it seems
16:16:25 yes. the error message is different too.
16:16:34 tomorrow I'll try to triage this bug, today the epel repo is broken for a while
16:16:37 sean-k-mooney noted it happens on master, can he support?
16:16:42 pbourke confirmed it
16:16:47 i changed it to critical as a result
16:17:43 we need to circle back with pbourke on this issue and see what his definition of confirmed is
16:17:51 i'll do that tomorrow morning
16:18:11 one moment I'll check now
16:18:21 oops sorry pbourke didn't know you were here
16:18:51 i confirmed it as there are at least 3 separate people in the comments saying they see it
16:18:57 as well as our own sean-k-mooney
16:19:15 pbourke well sean should know how the software works
16:19:22 can a core take on confirming this problem
16:19:33 pbourke, can you check with Sean tomorrow and try to triage it?
16:19:38 will do
16:19:39 and fixing it
16:19:42 thanks
16:19:57 https://bugs.launchpad.net/kolla/+bug/1616268
16:19:57 Launchpad bug 1616268 in kolla mitaka "Stale namespace removal causing "RTNETLINK answers: Invalid argument" errors" [Critical,Confirmed] - Assigned to Jeffrey Zhang (jeffrey4l)
16:20:13 Jeffrey4l_, I understand you have a theory and you're going to fix it?
16:20:27 pretty sure Jeffrey4l_ is on the money on it
16:20:32 inc0, yes. i am working on it. will push a PS later today.
16:20:39 great
16:20:40 thanks
16:20:52 https://bugs.launchpad.net/bugs/1617334
16:20:53 Launchpad bug 1617334 in kolla "reconfigure action fails on [neutron | Restart the neutron_openvswitch_agent container] " [Critical,Triaged]
16:21:12 this is marked critical because if reconfigure is busted, that's not good
16:21:21 it's not marked confirmed because it hasn't been confirmed in the bug log
16:21:27 with several core reviewers trying iirc
16:21:51 i use openvswitch all the time
16:21:53 and it wfm
16:21:56 even reconfigure
16:22:01 same here
16:22:14 i'm surprised linuxbridge works for this fellow and not ovs
16:22:30 a good course of action would be to request more info and mark it incomplete
16:22:42 determining what info to request is difficult ...
16:23:52 but it's not triaged as well right?
16:24:01 it's been triaged
16:24:06 it hasn't been confirmed
16:24:25 triage = setting milestone and priority
16:24:51 Jeffrey4l_ for this bug, you requested his inventory file
16:25:09 sdake, yep. but still have no idea now.
16:25:10 Jeffrey4l_ so hard for us to mark it incomplete without actually looking at his inventory file
16:25:21 Jeffrey4l_ ok - i've got this one
16:25:24 go ahead and move on inc0
16:25:25 nothing too special there
16:25:33 https://bugs.launchpad.net/kolla/+bug/1625648
16:25:33 Launchpad bug 1625648 in kolla "horizon login fails with TLS enabled" [Critical,Triaged]
16:25:55 bjolo promised to try to reproduce
16:26:30 i don't reproduce this bug
16:26:56 let's wait for bjolo
16:27:00 https://bugs.launchpad.net/kolla/+bug/1631072
16:27:00 Launchpad bug 1631072 in kolla "iscsid container: mkdir /sys/kernel/config: operation not permitted" [High,Triaged]
16:27:24 are we going through the high bugs too?
16:27:37 this is a ubuntu bug.
16:27:38 ahh copyfail sorry
16:27:57 https://bugs.launchpad.net/kolla/+bug/1623013
16:27:57 Launchpad bug 1623013 in kolla "keystone-fernet: rsync: Failed to exec ssh: No such file or directory (2)" [Critical,In progress] - Assigned to Christian Berendt (berendt)
16:28:04 this must be fixed
16:28:05 see #4 and #6
16:28:21 berendt any updates? :)
16:28:33 this is under review i think.
16:28:35 i thought Jeffrey4l_ took this one
16:28:43 i see
16:28:51 https://review.openstack.org/#/c/369418
16:28:51 well - seems like who's on first to me :)
16:28:59 review is welcome ;)
16:29:21 ok, make this review priority folks please
16:29:24 groan
16:29:32 so - one thing i failed to mention in the announcements
16:29:34 the gate is busted
16:29:37 how to proceed? with or without ssh?
16:29:39 because ceph.com is down
16:29:51 yeah, but we can still review code
16:29:53 without merging it
16:29:53 i really don't want to merge anything complex before rc2
16:29:56 right
16:30:05 so review away - pls don't merge anything that isn't totally obvious :)
16:30:07 we can not other ceph mirror.
16:30:10 ceph is back i thought
16:30:10 so let's not merge until the gates are up, but review it
16:30:15 we can use other ceph mirror
16:30:26 jascott1, cool.
16:30:32 jascott1 no it is still down
16:30:37 at least on my site
16:30:56 digitalocean has an outage
16:31:03 ironically enough caused by ceph :)
16:31:14 i think that's the place hosting this
16:31:21 or maybe it was somewhere else
16:31:22 anyway - it's down for me too
16:31:24 so last bug
16:31:28 https://bugs.launchpad.net/kolla/+bug/1631503
16:31:28 Launchpad bug 1631503 in kolla "inconsistent UID in rabbitmq results in inability to upgrade" [Critical,In progress] - Assigned to Steven Dake (sdake)
16:31:33 it's dreamhost
16:31:38 jascott1 right thanks
16:31:46 fix is here: https://review.openstack.org/384598
16:32:03 i will update the commit message.
16:32:04 for rabbitmq
16:32:15 but the fix logic is done.
16:32:15 we need to figure out other affected services
16:32:24 inc0 anything with a named volume
16:32:38 which is almost everything
16:32:51 right. so i think we need another bug to describe the detail rather than centos rabbitmq upgrade issue.
16:33:00 Jeffrey4l_, can you make your review partial-bug please?
16:33:07 or that
16:33:10 change the topic of the bug eff
16:33:11 Jeffrey4l_
16:33:21 click yellow button next to bug topic
16:33:22 inc0, np.
16:33:51 sdake, yep. change it to ?
16:34:51 Jeffrey4l_ make something up? :)
16:35:11 "upgrades broken because of inconsistent UIDs" should do it
16:35:30 done.
16:35:30 I'd rather not create 10 bugs for each service, but once Jeffrey's patch is merged we need to do the same thing for all the services
16:35:34 so we can spread work
16:35:48 let's make sure to use same topic in review
16:35:56 https://review.openstack.org/#/q/topic:bug/1631503
16:36:04 to avoid duplication of work
16:36:15 and reviewers, please keep an eye on this link
16:36:29 inc0 an etherpad would come in handy here i think - to figure out who is doing which part if you intend to spread load
16:36:54 we need something to track all the containers which need a fix. should we use bugs/work items or anything else.
16:37:02 6 days to 3.0.0 - spreading work will be more difficult I think
16:37:12 sdake, agreed to use etherpad.
16:37:13 well, one person won't handle it all
16:37:45 #link https://etherpad.openstack.org/p/kolla-bug-1631503
16:37:58 cool so if spreading work let's use etherpad - Jeffrey4l_ can you link that in the bug so devs can find it
16:38:08 no.
16:38:15 sorry. np.
16:38:48 http://s2.quickmeme.com/img/48/483ef0911f5e27073a015b45aee7a288b9c8d3bfa104f8bfe6625572f97cfa52.jpg sorry, couldn't help myself
16:39:20 it is a typo. ;(
16:39:38 I know, don't worry;)
16:40:02 Jeffrey4l_, I changed topic on your ps to bug/1631503
16:40:18 ok.
16:40:29 so between etherpad and this gerrit review queue, let's fix this
16:40:53 should we update the bug? it does not just fix rabbitmq upgrade in centos now.
16:41:11 Jeffrey4l_ we are doing partial-fix on each component i think
16:41:26 but the bug title is bad: inconsistent UID in rabbitmq results in inability to upgrade
16:41:32 https://bugs.launchpad.net/kolla/+bug/1631503 <= changed name
16:41:32 Launchpad bug 1631503 in kolla "inconsistent UID in named volume results in inability to upgrade" [Critical,In progress] - Assigned to Steven Dake (sdake)
16:41:44 Jeffrey4l_ i gave you a title in scrollback to use for bug description
16:41:46 no mention of centos there
16:41:46 actually it is: fix the uid issue during upgrade
16:42:33 hmm. drop the "rabbitmq" word?
16:42:38 I did:)
16:42:50 cool
16:42:51 replaced it with named volume
16:43:04 another thing is: should we backport this to mitaka?
16:43:13 one note on this bug
16:43:23 so again:
16:43:23 folks - the approach has been bikeshedded to death
16:43:27 #link https://review.openstack.org/#/q/topic:bug/1631503
16:43:29 let's just merge the design as is
16:43:38 this link has to be reviewed, keep an eye on it
16:44:03 by bikeshed to death - i've played telephone for at least 8 hours on this specific issue
16:44:34 ok, I think we stressed the importance enough:)
16:45:10 so, we have a status update on all critical bugs
16:45:16 let's get them fixed in the following week
16:45:30 anything else to add?
16:45:48 guess not
16:45:55 #topic open discussion
16:46:12 inc0, can we reorder the summit sessions
16:46:12 so one thing, you guys think we need meeting next week?
16:46:33 do we have semi-fixed summit sessions?
16:46:35 yeah, let me do it now
16:46:35 yes meeting next week - last minute summit prep is critical
16:46:43 ok
16:46:47 meeting stays then
16:46:54 About Heka, do we have any good alternative?
16:47:30 zhubingbing we need an honest evaluation of the alternatives
16:47:37 of which there are two that I am aware of
16:48:03 iirc there is a solution already in use in kolla-k8s?
16:48:05 zhubingbing, I see 3 now, filebeat and fluentd are there, snap is meaning to grow this feature in near future (we should be ok with timing)
16:48:43 snap sounds like heka 2.0 :)
16:49:00 rhallisey, sdake what were your rescheduling issues? I'm switching osic review with ci
16:49:01 -)
16:49:10 sdake, yeah, snap is exactly that
16:49:19 inc0, can we move the road map later
16:49:20 heka was cool, until mozilla dropped it
16:49:29 I want to change around a bunch of things
16:49:36 disagree with the current order
16:49:55 I do have some conflicts, but I think the overall order needs to be changed
16:49:57 should look at logstash again also
16:50:08 pbourke logstash-forwarder is EOL bro ;)
16:50:11 imo
16:50:13 Mozilla mention this yesterday failed to get kk8s up due to ceph.com outage so worked on ansible playbooks, reviewed code. Today create cinder ansible commit. Blockers: ceph.com is still down:(
16:50:23 pbourke filebeat.. not logstash
16:50:23 whoa epic fail
16:50:25 sdake: you sure?
16:50:31 https://github.com/trink/hindsight)
16:50:35 pbourke yup look at their repo
16:50:48 logstash-forwarder is replaced by Beats iirc
16:50:53 rhallisey, what's wrong with it?;)
16:50:57 seems promising?
16:50:57 pbourke elastic EOL'ed logstash-forwarder and introduced filebeat instead
16:51:15 inc0, could u check this again https://review.openstack.org/378762 i prefer to use the same timezone for all containers.
16:51:16 why? don't know ;)
16:51:18 sdake: Beats, indeed
16:51:32 inc0, let's get into it after the meeting
16:51:44 what's the difference between logstash and logstash forwarder :/
16:51:56 ok rhallisey
16:51:58 logstash collects logs and sends them to elasticsearch
16:52:07 what does iirc mean? i don't know ;)
16:52:10 logstash-forwarder sends logs to logstash (or elasticsearch)
16:52:11 Jeffrey4l_, ok, I just wanted more context
16:52:13 we don't need logstash itself
16:52:27 pbourke: Elastic introduced Beats as a protocol for log shipping purposes
16:52:35 is mozilla's own recommended replacement not suitable? https://github.com/trink/hindsight
16:52:37 what happened to logstash? isn't it widely used?
16:53:00 jascott1 we need an honest eval of the different solutions
16:53:07 rather than picking one randomly :)
16:53:12 jascott1: i would think twice before using something from mozilla again
16:53:21 oh sure but no one had mentioned it. got it.
16:53:21 we'd better talk about the alternatives after the summit ;)
16:53:22 heh
16:53:23 logstash-forwarder was replaced by filebeat
16:53:24 well, it's a 4th option to look at sdake
16:53:28 logstash is not deprecated
16:53:40 logstash is 300M JVM iirc
16:53:49 sdake: are there some criteria for evaluating a logging engine?
16:54:16 Filebeat is a tiny little go app.
16:54:19 it has to do what heka is doing today and not be deprecated;)
16:54:21 duonghq not that i know of - an action item
16:54:31 ya
16:54:32 (Logstash-forwarder was also pretty cute and tiny)
16:54:44 if it meets these 2 simple criteria - pick the best one
16:54:44 wirehead_ minus the java jvm part...
16:54:59 it should also be efficient - 3rd criteria
16:55:01 You don’t need the Logstash engine actually running unless you are mediating queues and stuff, filebeat can go straight to elasticsearch
16:55:09 25% of our cpu consumption on controller nodes is the logging stack
16:55:25 Naw, logstash-forwarder is also a go app
16:55:52 guess it's time to learn go :)
16:56:01 agreed with sdake
16:56:10 i was making a joke
16:56:14 but ok :)
16:56:14 It’s awful
16:56:48 I mean, Logstash / Filebeat / ElasticSearch is sufficiently popular that there has to be a clear migration off of it if something changes.
16:56:52 about summit sessions, have we fixed topic about kolla-k8s?
16:57:08 duonghq, 2 sessions for kolla-k8s
16:57:12 duonghq the schedule is in #openstack-kolla topic
16:57:13 roadmap and arch
16:57:14 duonghq, ya there are 2
16:57:36 anyway, we're running out of time
16:57:39 okay
16:57:44 let's move to our normal channel please
16:57:56 unless there are last comments to make?:)
16:57:57 bye;)
16:58:04 thank you all
16:58:12 ok, thank you guys! see you in summit I hope!
16:58:12 hey guys.
16:58:15 #endmeeting kolla