18:00:07 <daneyon> #startmeeting container-networking
18:00:09 <openstack> Meeting started Thu Sep 24 18:00:07 2015 UTC and is due to finish in 60 minutes. The chair is daneyon. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:10 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:12 <openstack> The meeting name has been set to 'container_networking'
18:00:16 <daneyon> Agenda
18:00:20 <daneyon> #link https://wiki.openstack.org/wiki/Meetings/Containers#Agenda
18:00:39 <daneyon> I'll give everyone a minute to review the agenda.
18:00:57 <daneyon> #topic roll call
18:01:01 <daneyon> o/
18:01:08 <hongbin_> o/
18:01:12 <eghobo> o/
18:01:14 <s3wong> o/
18:01:16 <gangil1> o/
18:01:19 <Tango> o/
18:02:16 <daneyon> Thank you hongbin s3wong gangil1 Tango for attending.
18:02:25 <daneyon> #topic Discuss discovery changes required for implementing Flannel in Swarm
18:02:45 <daneyon> This topic was discussed over irc yesterday and over the ML last week.
18:03:09 <daneyon> I just want to make sure everyone understands the issue with discovery for swarm.
18:03:34 <daneyon> Would you like me to provide a quick overview of the issue or does everyone understand it?
18:04:31 <gangil1> daneyon: I haven't read about it, so would go through it first and then ping you if I have any doubts.
18:04:40 <adrian_otto> o/
18:04:55 <daneyon> adrian_otto thanks for joining.
18:05:23 <daneyon> adrian_otto we are at topic: Discuss discovery changes required for implementing Flannel in Swarm
18:05:59 <daneyon> I wanted to take this time to make sure everyone understands the discovery issue and agree on the solution.
18:06:53 <daneyon> as part of this patch, swarm public discovery is removed.
18:06:54 <daneyon> #link https://review.openstack.org/#/c/224367/
18:07:38 <daneyon> instead swarm will use etcd for bootstrapping a swarm cluster.
18:08:18 <Tango> If later we find that we need another method of discovery for something else, would we run into the same situation?
18:08:28 <daneyon> keep in mind that flannel requires etcd. flannel uses etcd for shared config among the flannel daemons that run across nodes
18:08:51 <daneyon> Tango that is a good point
18:09:08 <daneyon> It seems like consul and etcd are the discovery kings
18:09:21 <daneyon> I filed a bp to address discovery from a bigger picture
18:09:34 <daneyon> make discovery more pluggable, configurable, etc..
18:09:45 <Tango> that would be good.
18:10:08 <daneyon> that is outside of my current focus of implementing the container network model across all bay types
18:10:56 <Tango> I think it's reasonable for now
18:11:57 <adrian_otto> agreed
18:11:58 <daneyon> hongbin_ eghobo or adrian_otto do you have any questions or concerns regarding discovery?
18:12:06 <hongbin_> no
18:12:08 <daneyon> adrian_otto thx
18:12:30 <daneyon> i'll wait 1 minute before moving to our next topic.
18:12:44 <daneyon> hongbin_ thanks for the feedback.
18:12:58 <eghobo> I will comment at review, I am still thinking
18:13:13 <daneyon> eghobo that makes sense. thanks.
18:13:18 <daneyon> #topic Review Swarm patch
18:13:35 <daneyon> I'll take a few minutes to cover the main points of the patch.
18:14:26 <daneyon> 1. Implements flannel for swarm bay types. We can now have containers run across multiple nodes and they can communicate with one another using flannel's overlay (UDP or VXLAN) network.
18:14:53 <daneyon> I have tested this multiple times using native tools
18:15:35 <daneyon> Does anyone have time to test the patch?
18:15:51 <daneyon> Back to the patch review
18:16:10 <hongbin_> I will if I find some time
18:16:25 <eghobo> daneyon: but user still can run without flannel?
18:16:26 <daneyon> I removed swarm public discovery. Instead swarm uses etcd to bootstrap the swarm cluster.
18:16:34 <Tango> Can you post the link?
18:16:43 <daneyon> I implemented etcd for swarm.
18:17:21 <daneyon> eghobo no, flannel is the default network-driver if one is not specified at the baymodel creation.
18:17:43 <eghobo> hmm, not sure I agree with it
18:18:01 <daneyon> w/o flannel docker does not have the ability to communicate across hosts until libnetwork is implemented and you use the native overlay driver or a libnetwork remote driver
18:18:13 <eghobo> many people run Swarm and Mesos without special network
18:18:25 <daneyon> We will eventually add libnetwork as a magnum network-driver, but libnetwork is only supported in docker experimental.
18:18:41 <eghobo> only Kub is very strict at network side
18:18:58 <daneyon> swarm patch
18:19:00 <daneyon> #link https://review.openstack.org/#/c/224367/
18:20:03 <Tango> Thanks
18:20:09 <eghobo> daneyon: if my nodes can communicate docker/swarm will communicate
18:20:30 <eghobo> of course without any isolation
18:21:09 <daneyon> eghobo before this patch, containers within a swarm bay type could not communicate across nodes unless you expose the container port to the host. this is because the swarm bay type was using docker legacy networking (docker bridge).
18:22:16 <Tango> So we are really enhancing current docker networking?
18:22:25 <eghobo> aha, got what you mean now, thx
18:22:41 <daneyon> eghobo swarm containers can not directly communicate with one another unless you do either 1. expose the container port to the host or 2. Use libnetwork overlay or remote driver that supports multi-host.
18:23:17 <adrian_otto> daneyon, I had inline comments with questions. None were answered.
18:23:29 <adrian_otto> I was particularly interested in all the repeated code
18:24:37 <daneyon> supporting native (i.e. not exposing ports to hosts) container-to-container communication is part of the Magnum Container Network Model. This is also where Docker Swarm is heading.
18:25:10 <daneyon> adrian_otto I submitted my latest patch just before this meeting. I plan to go back and address everyone's review comments.
18:25:13 <adrian_otto> also why did you take out ExecStartPost?
18:25:17 <eghobo> daneyon: flannel is value add no question, my concern is that Docker folks show Swarm without networking everywhere and user will get different experience with Magnum
18:25:52 <Tango> +1
18:25:57 <hongbin_> We could set different default network_driver per bay type
18:26:13 <hongbin_> k8s default to flannel, swarm default to something else
18:26:43 <eghobo> hongbin_: +1
18:27:17 <Tango> If user develops something on Magnum Swarm cluster and uses the networking capability here, when they bring their containers elsewhere, they may not work
18:27:24 <daneyon> eghobo libnetwork will be the preferred networking method for Docker Swarm. When it gets out of experimental, we will go through the process of adding the driver. Swarm will be the first bay type. Unless there is an issue with the community, libnetwork will be the default net driver for swarm bay types
18:27:33 <adrian_otto> you should be able to specify --network-driver=none
18:27:43 <daneyon> hongbin_ beat me to the punch ;-)
18:27:47 <adrian_otto> and get the current setup with no networking for swarm if that's what you want
18:28:39 <Tango> Fair enough. We should write this up in a user guide so it's clear.
18:29:03 <hongbin_> yes, --network-driver=none makes sense I think
18:29:09 <eghobo> daneyon: I agree about libnetwork but it's been experimental too long from my point of view ;)
18:29:31 <eghobo> we have users who want service now ;)
18:29:35 <daneyon> Tango after the cnm gets delivered to all 3 bay types, I will work on the docs
18:30:34 <daneyon> eghobo then it comes down to what do we do with magnum? Do we want magnum to be stable or on the bleeding edge? It's my understanding that production ready was a top goal set by adrian_otto
18:30:35 <adrian_otto> I repeated my questions on patch set 4 and voted on the patch again.
18:30:56 <adrian_otto> production ready is key
18:30:57 <daneyon> If that's not the case, then let's use docker experimental instead and we can add libnetwork.
18:31:16 <daneyon> adrian_otto thx. I'll def address your comments.
18:31:18 <adrian_otto> we can offer optional features that use newer things, but we need to have the basics covered first.
18:32:27 <daneyon> adrian_otto each bay type requires a network driver. each bay type has a default network driver of flannel. This can and will change over time.
18:33:20 <daneyon> I am +1 for using stable releases of docker and other tools instead of experimental.
18:33:26 <eghobo> daneyon: does it mean you are against the --network-driver=none idea?
18:33:35 <daneyon> users want to start deploying containerized apps in OS
18:33:58 <daneyon> if we provide a low quality service, then users will write off the magnum project.
18:34:21 <daneyon> eghobo what does network-driver none do?
18:34:44 <eghobo> nothing
18:35:09 <daneyon> if we provide network-driver none to a k8s bay type, the result is a broken bay.
18:35:13 <eghobo> and Mesos/Swarm users are comfortable with this model
18:35:46 <daneyon> k8s will not work using legacy docker bridging.
18:35:49 <eghobo> it's just for Mesos and Swarm, we must have driver for Kub
18:36:58 <daneyon> eghobo if we default to flannel as the net-driver, users can still expose host ports, etc..
18:37:54 <eghobo> correct and I think it's a good option for advanced users
18:38:04 <daneyon> swarm is headed in the direction of multi-host networking, so I think we are getting in front of this. I foresee all coe's using multi-host networking.
18:38:30 <adrian_otto> daneyon, why do you think that swarm bays are broken without flannel?
18:38:31 <daneyon> If the community prefers to have an option for none, then it can be implemented.
18:39:40 <hongbin_> adrian_otto: I guess daneyon means k8s bays are broken without flannel
18:40:15 <daneyon> adrian_otto swarm bays work w/o flannel.
Container-to-container communication with our current swarm requires either 1. Expose the container port to the host. or 2. Support a separate network provider such as flannel, libnetwork, etc..
18:41:10 <adrian_otto> right. you can use docker with -v to expose the container port to the host
18:41:38 <daneyon> hongbin_ yes. k8s requires flannel. Future does not currently require a multi-host network provider, but signs indicate that one will be required in the future
18:41:48 <adrian_otto> the DEFAULT network driver per bay should be one that does not violate the principle of least surprise
18:41:52 <daneyon> i can't comment on mesos b/c i don't know enough about that bay type yet.
18:42:15 <daneyon> adrian_otto correct
18:42:20 <adrian_otto> so for now the default for swarm could actually be "none", and the user could enable networking by changing it to "flannel"
18:42:35 <adrian_otto> the default on k8s bays, should be "flannel"
18:42:43 <daneyon> adrian_otto ok
18:42:54 <adrian_otto> and if you think you know what you are doing you could set it to "none" and do something more exotic perhaps
18:42:57 <hongbin_> I guess the default for mesos is none as well
18:43:05 <adrian_otto> it might be a way to simplify using k8s built-in features
18:43:13 <adrian_otto> hongbin_: yes
18:43:56 <adrian_otto> my apologies, but I have to depart here in a minute so I can not stay for the end.
18:44:09 <daneyon> I'll refactor the patch so flannel is not the swarm default
18:44:14 <eghobo> adrian_otto: kub doesn't have built-in features, we need flannel
18:44:42 <adrian_otto> eghobo: that is true today, but that's likely to change
18:45:05 <adrian_otto> I'm not arguing for disabling flannel for k8s bays.
that's not the important point
18:45:15 <eghobo> ok, it looks like you know more than us ;)
18:45:21 <adrian_otto> I'm more interested in providing a close-to-native experience for each COE
18:45:36 <daneyon> adrian_otto that makes sense
18:45:58 <daneyon> I'll update the patch
18:46:06 <adrian_otto> and as each COE evolves, we can follow the prevailing direction each heads
18:46:20 <daneyon> adrian_otto agreed
18:46:54 * adrian_otto waves
18:46:57 <adrian_otto> catch you next time
18:47:47 <daneyon> so then do we even need to implement flannel for swarm and mesos? I thought the original charter was to provide a native multi-host container networking solution for all bay types.
18:48:18 <Tango> I think it's a good option to have
18:48:32 <hongbin_> I think it is good to have, just not set it as default
18:48:50 <eghobo> +1, it's value add
18:48:51 <Tango> especially if libnetwork is heading that way
18:49:04 <daneyon> #action danehans to look into changing the default network-driver for swarm to none.
18:49:57 <daneyon> ok, then I think we agree on supporting flannel in swarm, but not using it for the default net driver
18:50:10 <daneyon> unless there are questions, let's move on.
18:50:18 <daneyon> and thanks for the good discussion.
18:50:26 <daneyon> #topic Review Action Items
18:50:30 * daneyon everyone who votes on the kuryr design spec to continue tracking the spec to completion by voting.
18:50:40 <daneyon> I see everyone voted.
18:50:57 <daneyon> thanks again for taking the time to review the kuryr spec and cast your vote.
18:51:02 * daneyon danehans to continue coordinating with gsagie on a combined kuryr/magnum design summit session.
18:51:14 <daneyon> I have not discussed this with gsagie
18:51:22 <daneyon> I will move this one forward.
18:51:31 <daneyon> #action danehans to continue coordinating with gsagie on a combined kuryr/magnum design summit session.
18:51:39 <daneyon> #topic Open Discussion
18:51:59 <daneyon> anyone have a topic to discuss?
18:52:17 <daneyon> ok
18:52:36 <daneyon> then I will close out our meeting.
18:52:45 <daneyon> thanks again for the great discussion
18:52:56 <daneyon> #endmeeting
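
Note on the agreed outcome: the meeting settled on a per-bay-type default network driver (flannel for k8s, none for swarm and mesos), with flannel still selectable for swarm. A minimal Python sketch of that rule follows. It is illustrative only and is not Magnum's code: the names DEFAULT_NETWORK_DRIVER, SUPPORTED_NETWORK_DRIVERS and resolve_network_driver are hypothetical, and the supported sets are assumptions drawn from the discussion (k8s treated as requiring flannel, mesos limited to none because no other driver was discussed as implemented).

```python
from typing import Optional

# Hypothetical table: defaults per bay type as discussed in the meeting.
DEFAULT_NETWORK_DRIVER = {
    "kubernetes": "flannel",
    "swarm": "none",
    "mesos": "none",
}

# Assumption: which drivers each bay type accepts. The meeting noted that a
# k8s bay with no network driver would be broken, so "none" is excluded here.
SUPPORTED_NETWORK_DRIVERS = {
    "kubernetes": {"flannel"},
    "swarm": {"none", "flannel"},
    "mesos": {"none"},
}


def resolve_network_driver(coe: str, requested: Optional[str] = None) -> str:
    """Return the driver to use for a bay type, falling back to its default."""
    if coe not in DEFAULT_NETWORK_DRIVER:
        raise ValueError(f"unknown bay type: {coe!r}")
    # Use the per-bay-type default when the user did not pass a driver.
    driver = requested if requested is not None else DEFAULT_NETWORK_DRIVER[coe]
    if driver not in SUPPORTED_NETWORK_DRIVERS[coe]:
        raise ValueError(f"network driver {driver!r} is not supported for {coe!r}")
    return driver
```

Under these assumptions, resolve_network_driver("swarm") yields "none", while resolve_network_driver("swarm", "flannel") opts in to flannel. Whether k8s should accept "none" for advanced users was left open in the discussion, so the rejection above is one possible reading rather than a decision.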