15:59:50 #startmeeting kolla
15:59:51 Meeting started Wed Jun 22 15:59:50 2016 UTC and is due to finish in 60 minutes. The chair is sdake. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:59:52 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:59:54 The meeting name has been set to 'kolla'
15:59:56 #topic rollcall
16:00:02 hello
16:00:04 o/
16:00:06 \o/ containers FTW
16:00:14 I can hardly contain my excitement.
16:00:20 me either wirehead_
16:00:38 o/
16:00:39 o/
16:00:40 o/
16:00:51 0/
16:01:04 #topic announcements
16:01:14 midcycle July 12-13th at Ansible HQ
16:01:21 #link https://wiki.openstack.org/wiki/Meetings/Kolla
16:01:29 I am going to put up an Eventbrite page for folks to register
16:01:36 please register if you intend to come
16:01:38 if you intend to come I'd book travel now
16:01:55 within 14 days most companies won't approve travel
16:02:00 yah
16:02:19 sorry, I'll have to miss the midcycle meet :( traveling on work... I'll attend remotely through WebEx and participate in the etherpad
16:02:26 #topic refactor build.py
16:02:44 ok
16:02:47 I guess that's me
16:02:54 so guys, build.py is a mess
16:03:11 before that, the link for the midcycle is here:
16:03:15 #link https://wiki.openstack.org/wiki/Sprints/KollaNewtonSprint
16:03:18 sorry about that
16:03:20 we all know it, we kinda want to ignore it but that won't make the problem go away
16:03:20 o/
16:04:04 I meant to say "show of hands who will help with that" but since akwasnie already volunteered...
16:04:05 :P
16:04:14 ;-)
16:04:22 ya agree build.py is a charlie foxtrot
16:04:45 i think we don't know what to do with it inc0
16:04:57 ;)
16:05:04 perhaps the way to get started is to brainstorm a list of things we want to fix
16:05:06 well there are a few things
16:05:23 one would be separation of jinja2 generation and dockerfile builds
16:05:39 decouple the whole giant script into importable submodules
16:06:16 i'd like build.py to be usable as a python library as well ;)
16:06:16 I'm still kinda reluctant to copy the whole thing to tmp but since we don't have a better idea... we might think about it
16:06:25 yeah, that's part of it
16:06:29 technically it already is
16:06:36 right
16:06:49 I already moved it out of command
16:06:51 well first step is to get a blueprint up
16:07:00 second step is to write down a list of work items
16:07:02 however we still need to go forwards
16:07:22 bottom line, is there anyone feeling like leading this effort?
16:07:37 Well, the "I want to build a pile of templated containers from the CLI" is an interesting and non-Kolla-specific issue.
16:08:57 inc0, if you can have the bp and list of tasks i can help with some tasks
16:09:07 one option people have kicked out is to use ansible to build the containers
16:09:21 inc0: I can help too.. is there a link for the bp?
16:09:26 i don't know if that would be flexible enough for us
16:09:34 that's one option i think we should be considering
16:09:37 at least a PoC
16:09:43 I think it might be less imposing to first split the pieces up, then find people who can lead each piece.
16:09:59 coolsvap vhosakot I'd really like one of you to take the lead then ;)
16:10:06 ya inc0 - first and second steps above, then we can distribute
16:10:23 I can contribute ideas, but I would be lying if I said I did enough research into whether or not they are any good ;)
16:10:30 I need to add nova-docker to my nova containers - so I could work on something like that? maybe something with the new 'customization' stuff?
16:10:36 inc0 i think the problem is coolsvap and vhosakot don't know what work items to write in the blueprint
16:10:55 ok, so 3 of us, let's work on it together then, ok?
16:11:20 I just don't want to dominate this discussion, I just speculate what could be done
16:11:21 inc0: does refactoring build.py include changing the way it logs output as well?
16:11:37 a proper refactor changes no functionality
16:11:38 sdake: yep, I'll need help writing work items
16:11:39 potentially, I know that harlowja already started this work
16:11:41 it's a restructure of the code
16:11:42 inc0, wfm
16:12:06 sdake, yeah, but fixing what's broken/stupid/could be better could also be good
16:12:29 i think separating the jinja2 generation from the build into modules is a good starting point
16:12:32 vhosakot, so how about we do a good old etherpad brainstorm later?
16:12:41 a third module would be building the dependency graph
16:12:57 maybe include multiple base images while we're at it
16:12:58 Easier to first refactor with the understanding of what needs to change, then the change, IMHO.
16:13:03 since it seems to be a recurring topic
16:13:21 i recommend not changing functionality during a refactor
16:13:24 wirehead_, yeah, totally agree, let's brainstorm ideas first
16:13:28 it makes it harder to keep correct
16:13:31 yep, we'll add an etherpad to brainstorm the work items first... I have some ideas about avoiding bad cross-builds (building Ubuntu images on CentOS and vice-versa)
16:13:41 sdake, but we can pave the way so this feature will be easier to implement
16:14:01 inc0: we've got our own parallel non-openstack-oriented implementation of some of build.py
16:14:06 i think hands will be full turning this into 3 submodules
16:14:08 o/
16:14:16 hey Jeffrey4l
16:14:35 sdake, sorry, i am late.
16:14:39 inc0: So at the very least, I can see if dcwangmit01_ and I can offer some perspective.
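[editor's note] The three submodules discussed above (jinja2 generation, the dependency graph, and the build itself) can be sketched roughly as below. This is a hypothetical illustration of the dependency-graph piece only — the function names and the image map are made up, not from the blueprint:

```python
# Hypothetical sketch of the "dependency graph" submodule discussed above.
# Kolla images form a tree (child image -> parent image); builds must run
# parents-first. Names here are illustrative, not Kolla's real structure.
import collections

def build_dependency_graph(images):
    """Invert the child->parent map into parent->children edges."""
    graph = collections.defaultdict(list)
    for name, parent in images.items():
        if parent is not None:
            graph[parent].append(name)
    return graph

def build_order(images):
    """Breadth-first order so every parent is built before its children."""
    graph = build_dependency_graph(images)
    order = []
    queue = collections.deque(n for n, p in images.items() if p is None)
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(graph[node])
    return order

images = {"base": None, "openstack-base": "base",
          "nova-base": "openstack-base", "nova-api": "nova-base"}
print(build_order(images))  # base first, leaves last
```

Factoring this out would let the build step consume a plain ordered list, independent of how the Dockerfiles were templated.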
16:14:39 ok gotta move on :)
16:14:44 #topic do we need tqdm
16:14:44 wirehead_, I'd be interested to hear about the reasons you made it and improvements
16:15:00 harlowja, around?
16:15:05 tqdm?
16:15:07 what's that?
16:15:13 no idea what it is or who added it to the agenda
16:15:19 https://github.com/noamraph/tqdm
16:15:22 inc0: that's to show a progress bar
16:15:23 I think we should solve bugs like https://bugs.launchpad.net/kolla/+bug/1546652 when re-factoring build.py
16:15:23 Launchpad bug 1546652 in kolla "Add "--force-cross-build" argument to kolla-build" [Low,Triaged] - Assigned to Vikram Hosakote (vhosakot)
16:15:23 it's a library included by harlowja in one of the logging reviews
16:15:40 covered it last week
16:15:44 yes
16:15:47 but we could review
16:15:48 :)
16:15:57 the review in the requirements queue is abandoned
16:16:09 so unless we want it it will not be restored again
16:16:36 ok, so this has been covered, rhallisey?
16:16:48 ya I mentioned it last week
16:16:53 Yeah, I am all for friendly logging, but I don't think we had any actionable discussion items. :/
16:16:55 #topic customization of dockerfiles
16:17:15 sooo
16:17:27 we already have the mechanism partially reviewed/merged
16:17:46 and by partial he means it does about 25% of what it needs to do :)
16:17:47 sdake, you mentioned that you would like to take the lead on this?
16:18:06 yes of course - as soon as i am unburied from my current work
16:18:12 then distribute the work to the community
16:18:24 we should be able to knock it out in a week or so once the foundation is done
16:19:16 what are the outstanding patches that need to merge?
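[editor's note] The Dockerfile customization mechanism being ported to is Jinja2-based: stock templates expose blocks that an operator's override template can fill. A minimal sketch of that idea, assuming nothing about Kolla's actual template names or block names:

```python
# Minimal sketch of Jinja2 block-override Dockerfile customization.
# Template contents and block names are illustrative, not Kolla's files.
from jinja2 import DictLoader, Environment

templates = {
    # Stock template ships with an empty extension point.
    "Dockerfile.j2": (
        "FROM {{ base_image }}\n"
        "{% block footer %}{% endblock %}"
    ),
    # Operator override extends the stock template and fills the block.
    "override.j2": (
        "{% extends 'Dockerfile.j2' %}"
        "{% block footer %}RUN pip install my-plugin\n{% endblock %}"
    ),
}

env = Environment(loader=DictLoader(templates))
rendered = env.get_template("override.j2").render(base_image="centos:7")
print(rendered)
```

Porting the "40 container sets" then amounts to restructuring each stock Dockerfile template so its customizable sections live inside named blocks.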
16:19:25 i'm hopeful the core review team can spare 30 min - 2 hours to port our container sets (40 of them) to this new model
16:19:42 before we do that, we need the model to be correct
16:19:44 i'll be working on that first
16:19:52 mandre they are unwritten as of yet
16:19:52 inc0: I need to learn the customization stuff - might as well have me do something useful while I learn it
16:19:55 I'm game for helping the port sdake.
16:20:12 britthouser cool - this work will be easy - look at one dockerfile, make it look like another
16:20:18 sdake: me too.. will help with ports
16:20:19 Mech422__, ofc..
16:20:29 lots of work to do ;)
16:20:39 but there are 130 containers in our tree
16:20:42 so it takes a while to do
16:20:58 but the real problem is that work can't begin until i finish the prototype inc0 started
16:21:18 so that's where that is
16:21:26 any Q?
16:22:00 #topic non-ini config
16:22:01 #link https://review.openstack.org/#/q/topic:bp/third-party-plugin-support
16:22:09 ok, that's important
16:22:13 sdake: thanks inc0
16:22:25 we have merge config for ini
16:22:35 but we have no way of allowing customizations of non-ini config
16:22:52 also as it turns out, even our merge config isn't perfect
16:23:06 so I thought of changing the way we deploy configs slightly
16:23:19 merge_config does not work for non-ini config files.. we need to just over-write instead of merging
16:23:21 and figure out how to decouple generation of configs from deployment of them
16:23:40 like, prep all the configs per hostname
16:23:57 and then the op will be able to change them to their liking
16:24:10 and only then deploy
16:24:17 what do you think of such a solution?
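[editor's note] The two code paths being proposed — merge for ini files, plain overwrite for everything else — can be sketched in a few lines. The function name and the extension check are illustrative simplifications, not Kolla's actual merge_config implementation:

```python
# Sketch of the proposal: ini files get merged key by key, any other
# format is copied over verbatim (no parser per format).
import configparser
import os
import shutil
import tempfile

def deploy_config(src, dest):
    """Merge ini-style files; overwrite everything else whole."""
    if src.endswith((".ini", ".conf")):  # simplification for illustration
        merged = configparser.ConfigParser()
        merged.read([dest, src])  # files later in the list win on conflicts
        with open(dest, "w") as f:
            merged.write(f)
    else:
        shutil.copyfile(src, dest)

# Demo: an operator override flips one key, the rest survives the merge.
tmp = tempfile.mkdtemp()
base = os.path.join(tmp, "nova.conf")
override = os.path.join(tmp, "override.conf")
with open(base, "w") as f:
    f.write("[DEFAULT]\ndebug = False\nworkers = 4\n")
with open(override, "w") as f:
    f.write("[DEFAULT]\ndebug = True\n")
deploy_config(override, base)

cfg = configparser.ConfigParser()
cfg.read(base)
print(cfg["DEFAULT"]["debug"], cfg["DEFAULT"]["workers"])  # True 4
```

The "prep per hostname, let the operator edit, then deploy" idea would run this generation step into a per-node tree first, with the copy to the target host as a separate action.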
16:24:19 that adds another action
16:24:23 inc0: seems like something that will be useful for kolla-kubernetes
16:24:29 inc0: I like the simplicity
16:24:40 inc0: just have a tree of configs, edit it, deploy it
16:24:56 well, it will be a lot of files though
16:25:00 as we need this per-node
16:25:11 anyway, UX will be important in this one
16:25:22 Note that kolla-kubernetes is anti-hostname.
16:25:26 and as sdake said, we might change deploy to actually be 2 separate steps
16:25:29 inc0: yeah - but it's conceptually very easy to explain to people "go to configs/node1/etc/globals.yml and edit foo"
16:25:38 Most of my patches are turning off per-hostname config generation for the kube case.
16:25:54 yeah, this will be ansible-specific to a point
16:26:14 so, do you guys like this approach?
16:26:31 i like the merge approach because it has fewer steps
16:26:36 as in one :)
16:26:43 sdake, but it's not possible for non-ini
16:26:47 i think this is a good midcycle topic
16:26:51 right, for non-ini just copy over
16:26:55 sounds good to me
16:27:03 and it won't be possible for non-ini as we don't really want to write a parser for every config format there is
16:27:13 inc0 i think we all agree on that point
16:27:15 ok, we can talk at the midcycle about it
16:27:28 #topic gate instability
16:27:59 who added this to the agenda?
16:28:01 we still have non-voting gates
16:28:09 hold on, looking for the link
16:28:37 sorry, got disconnected...
16:29:12 #link http://lists.openstack.org/pipermail/openstack-dev/2016-June/097401.html
16:29:23 we still have non-voting, unstable gates
16:29:26 we just gave you all the AIs vhosakot
16:29:29 inc0: /me whispers augeas :-)
16:29:52 the unstable gates need to be fixed before there is to be any voting
16:29:56 FAILED on the gate doesn't mean we didn't do a good patch
16:30:04 wow ;)
16:30:24 at one point the gates were pretty stable
16:30:29 and i was almost going to make them voting
16:30:36 or at least one of them
16:30:37 inc0, yes but it's very unpredictable
16:30:37 soo, we *really* need to get to the bottom of the problem and fix it
16:30:41 now the gates fail all the damn time
16:31:06 and many times just a couple of rechecks solve the issue
16:31:12 if it's an infra problem, we need to work with infra to find a solution
16:31:14 recheck is the wrong answer
16:31:21 yes
16:31:23 yeah, we shouldn't ever recheck
16:31:33 recheck means gates are busted
16:31:36 5% recheck ok
16:31:36 50% recheck not ok
16:31:50 inc0, when we look at the logs we know it's not related to the fix
16:32:06 coolsvap yes but that doesn't tell you the patch is correct
16:32:08 coolsvap, then let's start gathering the issues it WAS related to
16:32:12 no developer trusts the gates
16:32:14 and see if we can fix those
16:32:28 inc0 good idea
16:32:29 moment
16:32:37 at some point we want to say "gates failed so I failed"
16:32:49 inc0, agreed
16:32:57 #link https://etherpad.openstack.org/p/kolla-gate-failures
16:33:03 after that we make them voting, and after that we start going towards "gates are good so patch is good"
16:33:06 let's start first by classifying the failures
16:33:24 inc0 we can't make them voting until we are using mirrors in infra
16:33:29 sdake, I don't think we should do this right now, as it will take lots of time to figure out the real issue
16:33:43 let's classify the issues
16:34:00 we don't have to do it during the meeting
16:34:06 but it needs to be done over time
16:34:29 anyway, show of hands, who will volunteer for the "gate task force"
16:34:31 o/
16:35:14 ehh... always alone ;_(
16:35:21 before we commit to fixing stuff
16:35:27 let's figure out what needs fixing
16:35:46 I mean people who will commit time to look at a number of logs and figure out together the common problems
16:36:07 oh, i expect the core team during their reviews to help identify the common problems
16:36:08 not debug them
16:36:10 but identify them
16:36:20 for example, I sometimes see the gate try to pull from docker.io
16:36:24 i am always debugging the gate. it is a pain without the gate.
16:36:26 it should never do that except for the centos image
16:37:08 and why is centos a special beast?
16:37:20 inc0, the centos base image ;)
16:37:23 ubuntu same thing
16:37:26 or rather any base image
16:37:36 ya centos:7
16:37:39 or ubuntu:16.04
16:37:46 shouldn't we never pull the images from docker.io?
16:37:49 that is the only thing that should be coming from docker.io
16:37:56 i mean the kolla images.
16:38:04 There is a setting in the docker config to block docker.io
16:38:06 Jeffrey4l right - but i see gate failures where the gate is trying to pull kolla images from docker.io
16:38:14 yes.
16:38:20 britthouser we need docker.io for the base image
16:38:36 anyway that is just one i saw of many
16:38:40 rabbitmq implodes
16:38:42 often
16:38:45 guys, back to the topic at hand, let's not look for solutions here, just figure out the process of finding problems
16:38:47 i think there may be something wrong in the kolla_docker script.
16:38:49 oh....so it can't find the image locally, so it starts searching.
16:38:57 the process of finding the problems is as follows:
16:39:12 1. when reviewing, look at the failed gate - and provide a link to the failure in the etherpad
16:39:14 so we can use the etherpad as a scratchpad and add bugs
16:39:31 in 1-2 weeks we can come back and have lots of data to work with
16:39:41 and actually solve it, a person per gate problem
16:39:52 any disagreement?
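[editor's note] The classification step described above could start as a simple bucketing of failure logs by known signatures. The signatures below are guesses based only on the failure modes mentioned in this meeting (stray docker.io pulls, rabbitmq implosions), not a real list:

```python
# Sketch of bucketing gate failure logs by signature, so the etherpad
# can accumulate counts instead of raw links. Signatures are guesses
# based on the failure modes mentioned in the meeting.
import collections
import re

SIGNATURES = [
    ("unexpected docker.io pull", re.compile(r"pulling .*docker\.io.*kolla")),
    ("rabbitmq implosion", re.compile(r"rabbitmq.*(crash|refused|timeout)", re.I)),
    ("package mirror flake", re.compile(r"Cannot retrieve|Failed to fetch")),
]

def classify(log_text):
    """Return the first bucket whose signature matches, else 'unclassified'."""
    for bucket, pattern in SIGNATURES:
        if pattern.search(log_text):
            return bucket
    return "unclassified"

logs = [
    "ERROR pulling from docker.io/kolla/centos-binary-nova-api",
    "rabbitmq-server: connection refused",
    "some brand new breakage",
]
counts = collections.Counter(classify(line) for line in logs)
print(counts)
```

Anything landing in the "unclassified" bucket is exactly what the etherpad review pass is meant to surface.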
16:40:19 let's use a specific tag for gate-related bugs
16:40:20 agree.
16:40:27 nope, looks like a good plan
16:40:30 and submit bugs for every issue we see
16:40:36 let's not even file bugs yet
16:40:42 we have over 60 bugs in the new state
16:40:53 filing bugs isn't helpful because they aren't being managed effectively by the core team
16:41:00 and I really can't do it alone
16:41:02 that's a different issue
16:41:06 project too big
16:41:12 i realize it's a different issue
16:41:21 let's manage them well ;)
16:41:24 however filing bugs compounds it
16:41:49 then let's not avoid filing, let's manage these bugs well so it won't compound the issue
16:41:58 it's hard to tell the noise from the stuff we need to look at immediately
16:42:00 and in the process learn to manage the rest of the bugs well too ;)
16:42:16 i intend to teach people at the midcycle how to triage properly
16:42:36 anyway moving on :)
16:42:40 i agree with sdake, every morning i look at bugs i find more bugs to update than to look at for resolution
16:42:43 seems like summarizing gate issues into buckets, and filing a bug for each bucket, would keep the signal-to-noise ratio down?
16:43:12 britthouser right, that is the purpose of the etherpad - to determine a list of critical bugs we can track as gate problems
16:43:17 britthouser, yeah, we just need a good process to determine the buckets ;)
16:43:33 * britthouser likes to state the obvious I guess
16:43:41 softball ftw ;)
16:43:49 #topic kolla-kubernetes
16:43:55 so bottom line, look at suspiciously failed gate logs and see what's wrong
16:44:10 i'd recommend not diagnosing
16:44:15 unless you're keen to do so
16:44:24 diagnosing a gate bug takes 2-3 weeks
16:44:38 let's figure out which ones we have
16:44:44 I've been looking at ansible for deployment of kolla-kube. The ansible kubernetes module needs a few patches which I'm almost done with
16:45:36 rhallisey please link the specification
16:45:38 doing ansible for deployment would take some of the load off of kube to handle deployment
16:45:46 didn't make a spec
16:45:48 it's just a poc
16:45:57 nah i mean the k8s + kolla spec
16:46:01 k
16:46:10 https://review.openstack.org/#/c/304182/
16:46:14 ok cats
16:46:19 need +2s on the spec please
16:46:20 rhallisey, so what do you think? does it seem like a good idea at all?
16:46:30 9 +2s is enough to +a
16:46:37 oh are there 9
16:46:41 wow you guys move fast :)
16:46:56 inc0, well, it beats python + etcd + bash scripts
16:47:06 ok w+1 :)
16:47:06 rhallisey did a good job on this one ;) I +2 without reading ;) (j/k)
16:47:08 the downside would be that kube doesn't have as much control
16:47:32 does it need this kind of control though?
16:47:33 inc0, :)
16:48:04 inc0, need, not really. But it would make use of the native upgrade workflow
16:48:25 but even with the native upgrade workflow, it wouldn't be possible to upgrade openstack without life cycle hooks
16:48:26 i don't think the native upgrade workflow actually works in our use case
16:48:28 well I'm afraid that you won't be able to do all the upgrades with k8s native stuff
16:48:31 but that is down the road imo
16:48:33 which don't exist atm
16:48:55 anyway, I'm happy to help with figuring upgrades out
16:48:59 inc0, we could still do the upgrades
16:49:05 it would just be ansible-driven
16:49:07 one way or another
16:49:11 Yeah, I'm not entirely sure where the k8s community is going to head as far as upgrades go.
16:49:28 So far, the feeling I've been getting is that the community doesn't see it as part of core.
16:49:54 wirehead_, I expect them to build in more advanced life cycle hooks so that you can actually do some stuff in the middle of scaling down/up the containers
16:49:55 So we might be stuck with our upgrades regardless.
16:50:00 Yeah.
16:50:08 but we could be waiting a long time
16:50:11 ~9 months
16:50:12 Yeah.
16:50:16 we can gradually remove ansible from it
16:50:23 inc0, that's also true
16:50:25 that just seems bizarre
16:50:33 pbourke, what's that
16:50:34 kubernetes literally has no concept of upgrades?
16:50:36 pbourke shaking my head too
16:50:46 pbourke, it does, but not to the level we need it
16:50:47 Also, there's weirdness at the moment with the kolla-kubernetes CLI and how some parts of it are very Ansible and other parts of it aren't.
16:50:48 pbourke, upgrades -> scale down and scale up
16:50:49 that's it
16:50:50 pbourke it does but they lack flexibility
16:50:55 inc0: that's not really upgrades though
16:50:58 pbourke, it's exactly like inc0 said
16:51:04 true... but that's what it is
16:51:23 I think a lot of the early adopters have made their systems less reliant upon harsh database upgrades.
16:51:35 So they just plain don't need an OpenStack-styled upgrade.
16:51:51 yeah, container apps are different
16:52:00 I highly doubt anyone in the kube community has upgraded an 'application' as complex as openstack that needs to be cluster-aware
16:52:06 we have to deal with not exactly containerish apps though
16:52:08 the killer here is cluster-aware
16:52:19 yeah, kinda
16:52:29 topic change
16:52:31 we either build that awareness into ansible or into the containers
16:52:36 so bottom line, I'd rather see proper upgrades than be true to k8s 100%
16:52:39 networking security in kubernetes
16:52:51 i see this 0.0.0.0 bind-all going on
16:53:01 i think that may be secure in a kubernetes world
16:53:20 but it sure does make it easy for data to leak to the wrong location
16:53:21 It's more-or-less the only way to fly in the kube world.
16:53:26 well, it will be 0.0.0.0 without net=host sdake
16:53:30 so it is secure really
16:53:36 ya there's no net=host
16:53:42 0.0.0.0 is a problem with net=host
16:53:55 i get it
16:54:08 but the management network should be different from the api network for example
16:54:16 does k8s offer that kind of functionality?
16:54:21 mgmt==api
16:54:27 we can't use net=host for multinode because say if two of the same pods end up on the same node. Bad news
16:54:30 right - that's a security problem
16:54:33 the data plane is different, but that won't be affected by k8s
16:55:04 So, right now, none of the services are exposed outside of the k8s cluster.
16:55:26 We need to create Ingress resources to expose the API services.
16:55:37 sdake, I think we need more knowledge to determine the security of k8s though
16:55:38 would someone write up a state of the union on security in k8s with the work going on
16:55:41 i'd like to understand it
16:55:50 because 1 of the 2 things that consistently kills projects is poor security
16:56:05 wirehead_, doc? :)
16:56:25 Sure.
16:56:30 cool
16:56:33 I can start a doc and we can tackle it together if you like
16:56:47 specifically I'd like to understand how, if i have access to the api network, i can't get at the management network :)
16:57:05 sdake, api network == public endpoints?
16:57:21 So, we're going to need to write that up for the docs regardless because you need to be careful about a few configuration details in the everything-in-one-kube-cluster model.
16:57:38 inc0 yup
16:57:51 if there is net=host in k8s, what will openvswitch_db/libvirt be like?
16:58:10 Jeffrey4l, I don't think we can use k8s for neutron agents
16:58:12 they will be net=host I believe
16:58:12 or nova-compute
16:58:18 ok.
16:58:25 all compute services will be anchored too
16:58:32 not in its usual way anyway
16:58:39 net=host services can access kube services.
16:58:48 got it.
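[editor's note] The Service + Ingress exposure described above would look roughly like the fragment below. This is a config sketch only — service names, ports, and hostnames are illustrative, not from kolla-kubernetes (the Ingress API group shown is the `extensions/v1beta1` one current at the time of this meeting):

```yaml
# Illustrative only - not an actual kolla-kubernetes manifest.
# A cluster-internal Service fronting an API pod...
apiVersion: v1
kind: Service
metadata:
  name: keystone-api
spec:
  selector:
    app: keystone
  ports:
    - port: 5000
---
# ...and an Ingress resource exposing it outside the cluster.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: keystone-api
spec:
  rules:
    - host: keystone.example.com
      http:
        paths:
          - backend:
              serviceName: keystone-api
              servicePort: 5000
```

Inside the cluster the 0.0.0.0 bind stays invisible; only what an Ingress rule names is reachable from outside, which is the security property being discussed.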
16:58:51 It's just that you can't expose an API that uses net=host as a kube service without work.
16:59:14 ok meeting time is up
16:59:21 thanks folks
16:59:23 thx
16:59:26 have to overflow - apologies for no open discussion
16:59:28 Oh, and I've got it to a demoable state.
16:59:28 #endmeeting