20:00:20 <sdake> #startmeeting kolla
20:00:23 <openstack> Meeting started Mon Feb 9 20:00:20 2015 UTC and is due to finish in 60 minutes. The chair is sdake. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:24 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:26 <openstack> The meeting name has been set to 'kolla'
20:00:36 <sdake> #topic rollcall
20:00:39 <daneyon_> here
20:00:44 <britthouser> here
20:00:48 <sdake> steak here \o/
20:00:53 <rhallisey> hello
20:00:57 <sdake> hey ryan
20:01:00 <sdake> hey daneyon
20:01:02 <sdake> hey britt
20:01:02 <daneyon_> hi
20:01:04 <sdake> jpeeler make it?
20:01:21 <jpeeler> thanks for the ping, yeah
20:01:25 <sdake> cool
20:01:30 <sdake> glad you hang out in this channel :)
20:01:32 <sdake> #topic agenda
20:02:05 <sdake> #link https://wiki.openstack.org/wiki/Meetings/Kolla#Agenda_for_next_meeting
20:02:11 <sdake> anyone have anything to add last minute?
20:02:29 <daneyon_> nada
20:02:38 <rhallisey> nothing
20:02:53 <sdake> #topic review of super-privileged container approach
20:03:11 <sdake> I don't know if everyone has had a chance to read the super-privileged container spec
20:03:19 <sdake> but it proposes a new direction for kolla
20:03:42 <sdake> #link https://review.openstack.org/#/c/153798/
20:03:46 <daneyon_> i just performed another review and submitted it just b4 the meeting. I wanted to get your feedback on details for dealing with mysql data
20:04:10 <sdake> in summary the proposal is to remove kubernetes as a dependency and focus on using docker only with the full docker API available
20:04:17 <sdake> rhallisey have you had a chance to review the spec?
20:04:19 <daneyon_> to perform complete separation among the services, should we have a mysql container for each service?
20:04:25 <sdake> I saw daneyon's and jpeeler's reviews
20:04:34 <daneyon_> I agree with that approach.
20:04:37 <rhallisey> sdake, ya I left a few comments
20:04:50 <sdake> daneyon sounds cool, but complicated :)
20:04:57 <sdake> we don't have to specify how we do that part tho
20:05:10 <sdake> I just want general agreement on a change in focus
20:05:14 <sdake> because we will have to implement it
20:05:26 <sdake> without this change in focus, I'm not sure what else can be done with the current kolla implementation
20:05:52 <sdake> I have a lot of outstanding comments I see in the review, I'll submit an update today
20:06:11 <sdake> what I'd like is for the four of us here from the core team to unanimously agree to the specification
20:06:25 <sdake> if you disagree, propose an alternative that is viable :)
20:06:36 <daneyon_> I agree to the change in focus.
20:06:38 <sdake> that means 4 +2 votes
20:06:45 <sdake> on the specification
20:06:54 <sdake> (mine is implicit in the review request :)
20:07:10 <sdake> jpeeler/rhallisey able to review this over the next week and beat the spec into submission then?
20:07:47 <rhallisey> I read it and I think it's a good idea
20:07:51 <jpeeler> sdake: yeah i can do that. but from what i've seen, minor details are all that's left
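
(For reference: a minimal sketch of what the super-privileged container approach under review boils down to - launching a service with the plain docker CLI and host-level access, no kubernetes involved. The image and container names and the bind mounts are hypothetical; the flags are standard docker options.)

    docker run -d --name nova-compute \
        --privileged --net=host --pid=host \
        -v /run:/run -v /var/lib/nova:/var/lib/nova \
        kollaglue/nova-compute
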
20:08:04 <rhallisey> +2 on the idea for me
20:08:08 <sdake> cool, so hopefully we can get the spec approved this week - sounds like everyone is on board
20:08:25 <sdake> #topic milestone #3 planning
20:08:27 <daneyon_> I just +2'd
20:08:35 <sdake> well it needs love yet daneyon :)
20:08:41 <sdake> but thanks for the vote of confidence :)
20:08:47 <daneyon_> for sure
20:09:09 <daneyon_> so i would think milestone 3 is going to see big changes :-)
20:09:14 <sdake> Ok, so since we are using this new approach, I think we need to start to define the blueprints that make up milestone #3 - which is defined as launching stuff via SPC
20:10:10 <daneyon_> Would creating an easy to use dev environment be at the top of the list? I don't think kube-heat will be of use to us
20:11:08 <sdake> daneyon_ I think we can easily create something based on old heat-kubernetes that just launches 3 VMs to run the various nodes
20:11:14 <sdake> #link https://etherpad.openstack.org/p/kolla-blueprint-brainstorm
20:11:29 <jpeeler> how is the networking going to work?
20:11:36 <sdake> I'd like to brainstorm here in this etherpad, and I'll convert em into blueprints
20:11:43 <sdake> --net=host, in other words using the host network stack
20:12:21 <jpeeler> does that give each container a "real" ip?
20:12:27 <daneyon_> jpeeler: I think we start off with nova-network like we did before
20:12:42 <sdake> ya lets start with nova-network
20:12:53 <sdake> jpeeler it gives all containers the same ip on the system
20:12:56 <daneyon_> jpeeler: then we move to neutron, using either ovs or linuxbridge + ML2 plugin
20:13:39 <britthouser> so the development env would be a dressed-down all-in-one system?
20:13:47 <jpeeler> i guess i'm thinking more about container communication
20:13:54 <sdake> britthouser more or less
20:14:17 <sdake> jpeeler the containers essentially use the host's network, so if they communicate it's almost as if they were a process running in the host os
20:14:33 <jpeeler> ok just trying to wrap my head around it, thanks
20:14:41 <sdake> need more input on the blueprints - can folks start adding stuff to the etherpad ;-)
20:15:00 <sdake> What about the containers we need that we don't yet have
20:15:08 <sdake> rhallisey can you add those to the etherpad
20:15:38 <daneyon_> testing?
20:15:43 <rhallisey> sdake, sure, I'm not sure what we don't have yet..
20:15:52 * rhallisey looks
20:16:12 <sdake> rhallisey read the spec - it has the desired container set - compare vs what we have
20:16:40 <rhallisey> gotcha
20:16:42 <sdake> jpeeler super-privileged containers are a really thin chroot essentially :)
20:16:56 <sdake> so we have "develop new containers"
20:17:06 <sdake> add that in and put some subheadings :)
20:17:20 <daneyon_> what are the OS details that we want to run this stuff on... Fedora atomic?
20:17:32 <sdake> atomic doesn't have git or any useful tools for development
20:17:33 <britthouser> Would the deployment scheme be one container per VM? or all containers on 3 VMs? or doesn't really matter?
20:17:35 <sdake> I think we want to stick with f21
20:17:55 <sdake> britthouser so we need a tool to deploy an individual container set
20:18:02 <sdake> can you add that to the etherpad?
20:18:06 <jpeeler> sorry, one more question: is it too tripleO unfriendly to use something like fig to bootstrap the environment?
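
(For reference: a minimal sketch of the --net=host behaviour sdake describes above - a container sharing the host's network stack and therefore its IPs. busybox is just an example image that ships the ip applet.)

    docker run --rm --net=host busybox ip addr
    # prints the host's interfaces and addresses, not a container-private IP
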
20:18:18 <sdake> then when you're on a vm, you can run the tool
20:18:30 <sdake> jpeeler I'd like to not complicate things early with new tools or systems if possible
20:18:41 <sdake> but long term we can add things like fig or puppet
20:18:42 <jpeeler> it's part of docker now as far as i know
20:18:45 <sdake> I just don't know the right answer yet
20:18:57 <sdake> can you give a brief overview of fig?
20:19:23 <jpeeler> ha, not a competent one. i thought it was like "vagrant up" for containers.
20:19:30 <bdastur> is f21 for the container base image?
20:20:08 <daneyon_> I would like to discuss the use of HA tools such as Pacemaker, Corosync, before we add those to the container list
20:20:12 <sdake> bdastur good question, atm we use f21, but I'd like to go to Centos
20:21:05 <jpeeler> sdake: https://www.orchardup.com/blog/fig if it's not in docker (i haven't checked), then not worth thinking about
20:21:40 <sdake> daneyon_ if you want to have a discussion about HA, we should do it in the spec imo
20:21:54 <sdake> because those are specified in the spec directly
20:22:17 <sdake> swift I think doesn't work :)
20:22:54 <daneyon_> sdake: Do you think that's an implementation detail? Some people may want to use corosync, others galera
20:23:19 <sdake> galera is only for mysql iiuc
20:23:24 <notmyname> ?
20:23:25 <sdake> we need galera too to do ha for mysql
20:23:32 <notmyname> what doesn't work in swift?
20:23:44 <sdake> kolla containers for swift are busted notmyname
20:24:02 <sdake> nothing of your fault - our fault :)
20:24:16 <bdastur> shouldn't galera be part of the same container as mysql
20:24:43 <bdastur> I actually managed to get two containers on two VMs running rabbitmq in a HA cluster
20:24:51 <daneyon_> sdake: I guess in general, I have been involved in HA implementations that do not use the HA tools mentioned in the spec. I think HA may be outside the scope of the initial spec?
20:24:57 <sdake> each logical service goes in a separate container, and shares the host as necessary via bind mounting /run or /var
20:25:25 <sdake> ok we can drop HA, although I'd like to tackle HA of mysql and rabbit if possible
20:26:48 <sdake> so lets have a more general discussion about ha quickly
20:26:50 <daneyon_> sdake: I want to tackle it too. I think it must get done for anyone to use kolla in prod. we can work on an implementation.
20:26:56 <sdake> do folks want to tackle ha in milestone 3 or later?
20:27:04 <daneyon_> OK, HA?
20:27:57 <sdake> we know we want ha for mysql via galera and rabbitmq
20:28:02 <sdake> maybe we should just start with those
20:28:05 <bdastur> better to validate HA earlier to make sure there are no obstacles we did not anticipate
20:28:09 <sdake> although I like the check script idea
20:28:59 <sdake> ok, well lets do this
20:29:05 <britthouser> galera and rabbit seem to be the most popular (or at least most talked about) ways of doing HA, so that is a good starting point.
20:29:06 <sdake> lets assume we are going to implement everything in the spec
20:29:25 <sdake> and write out the blueprints assuming we are doing that
20:29:35 <sdake> and if any part of the spec changes, we can just "Not" the blueprint and forget about it ;-)
20:29:45 <sdake> (between now and approval of the spec that is)
20:29:52 <daneyon_> an initial HA implementation could be 1. HAProxy for API endpoints, MySQL VIP. 2. Galera for clustering the MySQL DB 3. Standard Rabbit Clustering.... implementation specifics would need to get worked out.
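
(For reference: a minimal sketch of point 3 of daneyon_'s outline - standard RabbitMQ clustering - along the lines of the two-node setup bdastur describes. The hostname rabbit@node1 is hypothetical; the rabbitmqctl commands are the stock clustering workflow.)

    # on the second node, after syncing /var/lib/rabbitmq/.erlang.cookie from the first node
    rabbitmqctl stop_app
    rabbitmqctl join_cluster rabbit@node1
    rabbitmqctl start_app
    rabbitmqctl cluster_status    # should now list both nodes
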
20:30:42 <sdake> ok, what runs the container check script and restarts it if busted?
20:30:48 <sdake> I guess we need a tool for that!
20:31:01 <sdake> then we can possibly remove corosync and pacemaker
20:31:15 <daneyon_> sdake: if we're going to cluster the DB and MQ, we need something for the API endpoints, DB VIP -> HAProxy or some other OSS SLB
20:31:34 <sdake> wow too many acronyms my brain just imploded ;-)
20:31:43 <sdake> SLB = ?
20:31:54 <britthouser> server load balancer
20:31:58 <daneyon_> sdake: How does the script know if the container is busted?
20:32:11 <sdake> it runs a check script in the container via docker exec
20:32:16 <daneyon_> SLB - Server Load Balancer, HAProxy, F5, etc..
20:32:19 <sdake> if the check script returns 0 - gtg, returns -1, restart
20:32:59 <sdake> the check script can do some form of healthcheck on the software in the container
20:33:15 <britthouser> Probably need some intelligence in that script. If it dies 3x, then don't restart or something like that.
20:33:25 <sdake> is HAProxy any good? Or is there a better tool
20:33:29 <sdake> britthouser that is called escalation
20:33:38 <sdake> at that point, you would want to reset the machine
20:33:43 <sdake> but lets not think about escalation now
20:33:46 <daneyon_> sdake: is the check script looking to see if, for example, a test tenant can talk to the Keystone API and create a user, endpoint, etc?
20:33:47 <britthouser> Ok.
20:33:49 <sdake> lets assume it's dumb and simple :)
20:34:04 <sdake> right, that would be a keystone-api check script
20:35:06 <sdake> We need something to check the health of the machine, but I think that is above the line we care about
20:35:12 <sdake> that is what pacemaker + corosync tackle
20:35:25 <sdake> we only care about health of containers
20:35:31 <sdake> in a real system, someone will have to sort out health of the bare metal as well
20:35:42 <daneyon_> sdake: instead of managing our own scripts for health checking, there is a tool (trying to remember the name) that can run as a process within each container to perform deep health checking.
20:36:04 <britthouser> Ok...so corosync+pacemaker aren't implementing the HA between openstack services, just the containers themselves. I misunderstood that earlier.
20:36:06 <sdake> it health checks openstack specifics?
20:36:11 <daneyon_> Let me dig up the name and I'll send it along.
20:36:25 <sdake> put a link in the etherpad
20:36:29 <daneyon_> yes.
20:36:59 <sdake> britthouser I removed pacemaker+corosync, I think they are not necessary for container management
20:37:10 <daneyon_> i'll dig it up.... in the meantime, lets not set it in stone that we are going to create our own shell scripts from scratch to perform health checking.
20:37:13 <sdake> corosync + pacemaker manage a group of machines, we are only talking a single machine
20:37:17 <daneyon_> will do
20:37:43 * britthouser rereads...
20:38:05 <daneyon_> we can run health checks on each of the cluster tools, HAProxy, Galera, etc..
20:38:21 <britthouser> I'm with you now sdake
20:38:31 <daneyon_> I don't see a need for corosync or pacemaker at this point. I'm open to hearing from others on the need though
20:39:01 <sdake> if we need something to monitor health of machines and restart them, that is where pacemaker and corosync come in
20:39:43 <daneyon_> health check monitoring through monit #link http://mmonit.com/monit/
20:41:06 <sdake> license?
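
(For reference: a minimal sketch of the check-script pattern sdake describes - run a per-container check via docker exec and restart the container when it exits non-zero. The container names and the /opt/kolla/check.sh path are hypothetical.)

    for c in keystone-api glance-api mariadb rabbitmq; do
        if ! docker exec "$c" /opt/kolla/check.sh; then
            # check returned non-zero: treat the container as busted and restart it
            docker restart "$c"
        fi
    done
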
20:41:25 <daneyon_> open source
20:41:34 <sdake> which one :)
20:41:41 <jpeeler> AGPL
20:41:50 <sdake> gaahh
20:42:00 <sdake> who would use that license!
20:42:09 <sdake> that one makes lawyers cringe for some reason
20:43:08 <sdake> perhaps we can just put monit in a container if it does the job
20:43:14 <sdake> run with --pid=host
20:43:35 <daneyon_> for now, I say we need to investigate the best way to perform health checking in each container and across the container cluster. Creating our own scripts from scratch should be the last option... I'm open to other tools. I just mention Monit because I used it back when i was doing customer deployments and it worked well
20:43:40 <sdake> so launch would look like docker exec blah monit x
20:43:58 <sdake> cool i'll play with it today daneyon_
20:44:02 <sdake> looks small enough
20:44:23 <jpeeler> +1 on avoiding using our own scripts
20:44:44 <sdake> ya we want our scripts to be simple - less than 100 lines if possible
20:44:54 <sdake> although not always possible inside a container
20:45:20 <daneyon_> Monit needs to be able to restart the pid of the service it's monitoring
20:45:25 <sdake> ok, i'll spec monit after I give it a go
20:45:47 <sdake> daneyon_ it would be able to do that with --pid=host
20:46:03 <daneyon_> sdake: roger that
20:46:28 <sdake> any other suggested blueprints?
20:46:32 <daneyon_> what about container logging?
20:46:43 <sdake> that needs to go in the spec if you want it :)
20:46:59 <sdake> maybe we can defer that to a later spec
20:47:07 <daneyon_> if we're going to address monitoring the services, should we dev a logging solution?
20:47:09 <sdake> I have no idea how to do the job on that point
20:47:33 <sdake> I like logging to stdout personally :)
20:47:42 <sdake> and have some tool capture them via docker log
20:47:51 <daneyon_> OK, we can address logging in a follow-on milestone
20:48:01 <sdake> or a follow-on spec in this milestone as well
20:48:06 <sdake> so dates
20:48:15 <sdake> I think we have beat the etherpad into pretty good shape
20:48:25 <sdake> I'll not convert to blueprints until we have approved the spec
20:48:33 <sdake> so if you want, feel free to edit as you see fit
20:49:15 <sdake> #link https://wiki.openstack.org/wiki/Kilo_Release_Schedule
20:49:49 <sdake> March 19 is k3, I think it makes a lot of sense to align with the OpenStack project's release schedule
20:50:05 <sdake> and that is about 4 weeks after we wrap up our planning
20:50:25 <sdake> yay/nay? :)
20:50:44 <rhallisey> sounds good
20:50:52 <daneyon_> yah
20:51:13 <sdake> cool sounds good then no nays :)
20:51:16 <sdake> #topic open discussion
20:51:25 <sdake> we have 9 minutes but we had a lot of open discussion already
20:51:32 <sdake> seems like people are fired up for this new approach to me :)
20:51:40 <daneyon_> ya!
20:51:50 <britthouser> Huzzah!
20:51:53 <daneyon_> Another monitoring option #link https://github.com/stackforge/monitoring-for-openstack
20:52:03 <rhallisey> sdake, when we get there, I can write us the selinux policy needed when using the super-privileged containers
20:52:11 <daneyon_> not much dev lately, but definitely better than starting from scratch
20:52:13 <sdake> rhallisey that would totally rock!
20:53:45 <sdake> ok anything else?
20:53:49 <sdake> or shall we end the meeting
20:54:12 <sdake> daneyon_ those monitoring scripts look interesting, would work well as check scripts
20:54:26 <daneyon_> time for lunch. thx.
20:54:32 <sdake> #endmeeting
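
(For reference: a minimal sketch of the monit-in-a-container idea discussed above - monit run with --pid=host so it can see and restart host-side service pids, invoked via docker exec, with stdout captured through docker logs. The kollaglue/monit image name is hypothetical.)

    docker run -d --name monit --pid=host --net=host kollaglue/monit
    docker exec monit monit reload    # the "docker exec blah monit x" launch shape
    docker logs -f monit              # capture stdout output via docker logs
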