21:00:11 <strigazi> #startmeeting containers
21:00:12 <openstack> Meeting started Tue Oct 23 21:00:11 2018 UTC and is due to finish in 60 minutes.  The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:16 <openstack> The meeting name has been set to 'containers'
21:00:19 <strigazi> #topic Roll Call
21:00:29 <strigazi> o/
21:00:32 <ttsiouts> o/
21:01:17 <flwang> o/
21:02:47 <imdigitaljim> o/
21:03:14 <strigazi> brtknr: you here?
21:03:17 <eandersson> o/
21:03:53 <strigazi> Thanks for joining the meeting ttsiouts flwang imdigitaljim eandersson
21:03:59 <strigazi> Agenda:
21:04:05 <strigazi> #link https://wiki.openstack.org/wiki/Meetings/Containers#Agenda_for_2018-10-23_2100_UTC
21:04:16 <strigazi> it has some items
21:04:27 <strigazi> #topic Stories/Tasks
21:04:51 <strigazi> 1. node groups https://review.openstack.org/#/c/607363/
21:05:13 <strigazi> I think we are pretty close to the final state of the spec
21:05:21 <strigazi> please take a look
21:05:44 <ttsiouts> strigazi: tomorrow I will push again
21:06:02 <ttsiouts> incorporating ricardo's comments
21:06:41 <strigazi> oh I thought you pushed today. ok guys, take a look, and take another look tmr as well :)
21:07:22 <ttsiouts> :) tmr better
21:07:24 <strigazi> @all do you want to discuss anything about nodegroups today?
21:07:46 <strigazi> questions about nodegroups?
21:08:23 <strigazi> ok, next then
21:08:26 <schaney> o/ sorry for lateness, but yes
21:09:06 <strigazi> schaney: hello. you have smth about nodegroups?
21:09:08 <schaney> any mechanism to interact with NGs individually?
21:09:29 <schaney> as opposed to the top level cluster or stack
21:09:57 <strigazi> the api will be like:
21:10:01 <ttsiouts> schaney: do you mean updating a specific nodegroup?
21:10:20 <schaney> yes for scaling or the like
21:10:33 <strigazi> cluster/<cluster-identity>/nodegroup/<nodegroup-identity>
21:10:48 <strigazi> so PATCH cluster/<cluster-identity>/nodegroup/<nodegroup-identity>
21:10:48 <ttsiouts> https://review.openstack.org/#/c/607363/2/specs/stein/magnum-nodegroups.rst@117
21:11:37 <colin-> sorry i'm late!
21:11:49 <strigazi> colin-: welcome
21:12:44 <schaney> oh gotcha, I'll have to dig into the work a bit. under the hood, magnum just targets the name of the node group represented by the heat parameter though?
21:13:05 <flwang> ttsiouts: the node groups is basically the same thing like node pool in GKE, right?
21:13:32 <ttsiouts> schaney: that's the plan
21:13:48 <ttsiouts> flwang: exactly
21:14:05 <flwang> ttsiouts: cool
21:14:37 <schaney> i see, did there happen to be any work on the "random node scale down" issue when magnum shrinks the cluster?
21:14:40 <flwang> ttsiouts: i will review the spec first
21:14:56 <schaney> from the API route it would seem so?
21:14:59 <flwang> i think we probably better read the spec first and put comments in the code
21:15:07 <flwang> instead of discussing design details here
21:15:12 <schaney> good call
21:16:05 <ttsiouts> schaney: we want to add a CLI for removing specific nodes from the cluster
21:16:18 <ttsiouts> but this will come further down the road
21:16:21 <strigazi> schaney: this won't be covered by this spec, but we should track it somewhere else
21:16:39 <schaney> gotcha, thanks for the info
21:16:48 <ttsiouts> flwang: thanks!
21:17:05 <ttsiouts> flwang: tmr it will be more complete
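A sketch of how the node-removal CLI mentioned above might eventually look; the command name and flag are purely hypothetical, since nothing was decided here:

    # Hypothetical: shrink the cluster to 2 nodes, choosing which node goes away.
    openstack coe cluster resize mycluster 2 --nodes-to-remove <server-uuid>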
21:17:17 <imdigitaljim> we're looking at modifying our driver to perhaps consume senlin clusters for each group
21:17:39 <imdigitaljim> masters, minions-group-1, ... minions-group-n
21:17:49 <flwang> imdigitaljim: is it a hard dependency?
21:17:51 <imdigitaljim> would all have a senlin profile/cluster in heat
21:17:53 <flwang> i mean for senlin
21:18:01 <strigazi> imdigitaljim: that could be done, that is why we have drivers
21:18:14 <strigazi> flwang:  should be optional
21:18:34 <strigazi> like alternative
21:18:41 <imdigitaljim> it would probably be that the driver takes on a senlin dependency (not magnum as a whole)
21:18:44 <imdigitaljim> just like octavia or not
21:19:08 <strigazi> when the cluster drivers were proposed, senlin and ansible were the reasoning behind it
21:19:20 <imdigitaljim> we're more focused on autoscaling/autohealing rather than cli manual scaling
21:20:02 <imdigitaljim> the senlin PTL is here and is actively talking to the heat PTL about managing senlin resources in heat, so we might be able to create a better opportunity in-house for senlin + heat + magnum
21:20:47 <strigazi> imdigitaljim:  here, like in this meeting?
21:20:51 <imdigitaljim> no
21:21:00 <imdigitaljim> sorry i just mean he works at blizzard
21:22:19 <strigazi> This plan is compatible with nodegroups, and nodegroups actually make it easier
21:22:55 <imdigitaljim> we think so
21:23:04 <strigazi> I'm not aware of it in detail, but it sounds doable
21:23:17 <schaney> Senlin would work well with the NG layout; one thing to note is Senlin's dedicated API
21:23:26 <imdigitaljim> yeah we aren't either, but we'll be working it out over the next couple of weeks
21:23:50 <imdigitaljim> and seeing if it's feasible within reason
21:23:50 <eandersson> The Senlin PTL will be in Berlin btw
21:23:53 <imdigitaljim> ^
21:24:06 <strigazi> cool
21:24:16 <cbrumm__> It's honestly too early to be talking about; there's a lot of heat/senlin groundwork to do first
21:24:32 <cbrumm__> But hey, it's a thing we're thinking about.
21:24:43 <strigazi> fair enough
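A rough sketch of how a driver could model one minion group with Senlin resources inside a Heat template, assuming the existing OS::Senlin::Profile and OS::Senlin::Cluster resource types; every property value here is illustrative:

    # Illustrative Heat template fragment for a "minions-group-1" nodegroup.
    cat > minion-group.yaml <<'EOF'
    heat_template_version: queens
    resources:
      minion_profile:
        type: OS::Senlin::Profile
        properties:
          type: os.nova.server-1.0
          properties:
            flavor: m1.medium
            image: fedora-atomic-27
            networks:
              - network: private
      minions_group_1:
        type: OS::Senlin::Cluster
        properties:
          profile: {get_resource: minion_profile}
          desired_capacity: 3
          min_size: 1
          max_size: 10
    EOF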
21:25:38 <strigazi> shall we move on?
21:26:01 <cbrumm__> yes
21:26:30 <strigazi> I pushed two patches we were talking about, one is:
21:26:41 <strigazi> Add heat_container_agent_tag label https://review.openstack.org/#/c/612727
21:27:20 <strigazi> we discussed with flwang and eandersson already, others have a look too
21:27:52 <strigazi> the tag of the heat-agent was hardcoded; this makes it a label.
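With that patch in, overriding the agent image would look roughly like this (the tag value is illustrative):

    # Pick a heat-container-agent tag per template instead of the hardcoded one.
    openstack coe cluster template create k8s-atomic \
        --image fedora-atomic-27 --coe kubernetes \
        --external-network public --flavor m1.medium \
        --labels heat_container_agent_tag=rocky-stable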
21:28:21 <strigazi> the 2nd one needs some discussion, it is
21:28:30 <strigazi> deploy tiller by default https://review.openstack.org/#/c/612336/
21:29:30 <strigazi> Shall we have it in by default or optional?
21:30:39 <flwang> strigazi: any potential issue if we enable it by default?
21:30:41 <strigazi> and then the next steps are: with TLS or without? a tiller per namespace, or one with the cluster-role?
21:31:16 <strigazi> flwang: the user might want a different tiller config
21:31:25 <flwang> strigazi: that's the problem i think
21:31:34 <strigazi> flwang: other than that, tiller will just be there, silent
21:32:16 <flwang> we have seen similar 'issues' with other features we enabled, like the keystone auth integration
21:32:17 <strigazi> flwang: so you are in favor of optional
21:32:38 <flwang> a newly enabled feature may introduce a bunch of config
21:32:51 <flwang> but right now in magnum, we don't have a good way to maintain those configs
21:33:08 <flwang> labels are too flexible and loose
21:33:14 <flwang> i prefer optional
21:33:38 <flwang> based on the feedback we've got so far, most customers just want a vanilla k8s cluster
21:34:05 <flwang> if they want something, they can DIY
21:34:26 <strigazi> what does vanilla mean? api, sch, cm, kubelet, proxy, dns, cni
21:34:38 <flwang> and i agree, it's because we (catalyst cloud) are a public cloud and our customers' requirements vary
21:34:57 <flwang> vanilla means a pure cluster, without too many plugins/addons
21:35:11 <cbrumm__> flwang: We've been getting some similar feedback from power users too
21:35:17 <flwang> for a private cloud, things may be different
21:35:40 <cbrumm__> but I think that's expected from power users that are used to DIY
21:35:48 <flwang> most of the customers of k8s know how to play with it
21:36:13 <flwang> what they want is just a stable k8s cluster with good HA and integration with the underlying cloud provider
21:36:48 <flwang> cbrumm__: what do you mean 'power users'?
21:37:00 <strigazi> so optional
21:37:21 <cbrumm__> people who've used k8s before, outside of a managed service
21:37:34 <flwang> cbrumm__: ok, i see, thx
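If tiller lands as opt-in, enabling it would presumably look like any other label; the label name here is hypothetical until the patch settles:

    # Hypothetical opt-in: only deploy tiller when explicitly requested.
    openstack coe cluster create my-cluster \
        --cluster-template k8s-atomic \
        --labels tiller_enabled=true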
21:38:24 <strigazi> flwang: any argument against having this optional?
21:39:07 <flwang> strigazi: TBH, I would suggest we start to define a better addon architecture
21:39:10 <imdigitaljim> ^
21:39:29 <flwang> like refactor the labels
21:39:37 <imdigitaljim> that's one of the goals our driver plans to solve
21:39:53 <flwang> imdigitaljim: show me the code ;)
21:40:21 <strigazi> so we don't add anything until we refactor?
21:40:25 <flwang> i have heard about the v2 driver a million times, i want to see the code :D
21:40:33 <flwang> strigazi: i'm not saying that
21:40:38 <imdigitaljim> :D when i get some free cycles and feel like it's in a good spot for uploading
21:40:42 <flwang> i just say we should be more careful
21:42:49 <strigazi> the current model is not unreasonable. we need to define 'careful' and not turn the service into a framework
21:43:47 <strigazi> I think for v1, which can't be refactored, only replaced/deprecated, the model for addons is there
21:44:27 <strigazi> labels for on/off and tags
21:44:57 <imdigitaljim> imo we should look into a config file
21:45:13 <imdigitaljim> kubernetes followed the same pattern when they realized there were too many flags
21:45:38 <strigazi> a config file to create the labels/fields, or a config file to pass to the cluster?
21:46:12 <openstackgerrit> Merged openstack/magnum stable/queens: Provide a region to the K8S Fedora Atomic config  https://review.openstack.org/612199
21:46:19 <imdigitaljim> openstack coe cluster template create --config myvalues.yaml
21:46:48 <strigazi> these values are like labels?
21:46:53 <imdigitaljim> could be everything
21:46:57 <imdigitaljim> could be just labels
21:47:02 <strigazi> code too?
21:47:13 <strigazi> like to code?
21:47:19 <imdigitaljim> ?
21:47:20 <strigazi> or link to code?
21:47:53 <imdigitaljim> instead of like --labels octavia_enabled=true, etc etc
21:47:56 <imdigitaljim> it could be
21:48:03 <imdigitaljim> [LoadBalancer]
21:48:12 <strigazi> got it
21:48:15 <imdigitaljim> octavia_enabled=true
21:48:27 <imdigitaljim> but you could also do
21:48:29 <imdigitaljim> [Network]
21:48:55 <imdigitaljim> floating_ ...  fixed_network= .. fixed_subnet=
21:48:58 <flwang> yaml or the ini format
21:49:02 <imdigitaljim> either or
21:49:02 <imdigitaljim> any
21:49:04 <imdigitaljim> json
21:49:09 <flwang> yep
21:49:09 <imdigitaljim> doesn't matter, however we want to do it
21:49:18 <flwang> agree
21:49:21 <imdigitaljim> imo I'm a fan of that model
21:49:43 <flwang> and in that case, we can publish sample config files
21:49:48 <strigazi> flwang: what would cover your concern about the loose label design?
21:49:52 <flwang> and user can decide how to combine the config
21:50:11 <flwang> strigazi: yep, that's basically the arch in my mind
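To make the idea concrete, a sketch of what that could look like; the --config flag, the file format, and the section/key names are all illustrative, since nothing is decided yet:

    # Hypothetical: collect today's loose labels into sections of one file.
    cat > myvalues.conf <<'EOF'
    [LoadBalancer]
    octavia_enabled=true

    [Network]
    fixed_network=private
    fixed_subnet=private-subnet
    EOF
    openstack coe cluster template create k8s-atomic --config myvalues.conf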
21:51:05 <strigazi> #action strigazi to draft a spec and story for creating cluster with a config file.
21:51:13 <flwang> strigazi: can we discuss this one  https://github.com/kubernetes/cloud-provider-openstack/issues/280  next?
21:51:30 <strigazi> I'll try to bring smth in the next meeting for this
21:51:39 <flwang> strigazi: thanks
21:52:24 <strigazi> before going into the CPO bug
21:53:02 <strigazi> @all take a look at the rest of the reviews listed in the agenda. they are ready to go in
21:53:07 <strigazi> https://wiki.openstack.org/wiki/Meetings/Containers#Agenda_for_2018-10-23_2100_UTC
21:53:18 <colin-> will do
21:54:00 <strigazi> flwang: what might help a little is using the config drive before the metadata service
21:54:05 <strigazi> colin-: thx
21:55:03 <flwang> strigazi: does that need change in magnum?
21:55:14 <flwang> imdigitaljim: did you ever see this issue  https://github.com/kubernetes/cloud-provider-openstack/issues/280 ?
21:55:16 <strigazi> imdigitaljim: eandersson colin- what is your experience with CPO
21:55:31 <flwang> worker nodes are missing ips
21:55:35 <imdigitaljim> i have not experienced this issue
21:55:36 <strigazi> flwang: in the config of the CPO
21:55:48 <flwang> imdigitaljim: probably because you're using v1.12?
21:55:57 <imdigitaljim> we have an internal downstream with a few patches on CPO
21:56:06 <imdigitaljim> we're waiting on upstream commit permission for the kubernetes org
21:56:12 <imdigitaljim> (blizzard admin stuff)
21:56:20 <imdigitaljim> we're on 1.12.1 correct
21:56:24 <strigazi> patches regarding this bug?
21:56:30 <imdigitaljim> no
21:56:51 <imdigitaljim> UDP support, LBaaS naming, and a LBaaS edge case
21:56:59 <flwang> https://github.com/kubernetes/kubernetes/pull/65226#issuecomment-431933545
21:57:04 <colin-> to jim, out loud just now i said "it's been better than starting from scratch" :)
21:57:19 <colin-> found it useful and it has definitely saved us some time, but as he said we've also found some gaps we want to address
21:57:34 <flwang> it seems like a very common, high-chance problem
21:58:10 <strigazi> colin-: you talk about k/cpo?
21:58:15 <colin-> yes
21:58:34 <flwang> strigazi: when you say 'config of CPO', does that mean we at least have to use cm+CPO mode?
21:59:27 <strigazi> flwang: https://github.com/openstack/magnum/blob/master/magnum/drivers/common/templates/kubernetes/fragments/write-kube-os-config.sh#L12
22:00:08 <strigazi> it is adding [Metadata] search-order=configDrive,metadataService
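For reference, the resulting cloud-config would look roughly like this; the file path and auth values are illustrative:

    # Sketch of the config the fragment writes, plus the proposed [Metadata] section.
    cat >> /etc/kubernetes/cloud-config <<'EOF'
    [Global]
    auth-url=...
    region=...

    [Metadata]
    search-order=configDrive,metadataService
    EOF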
22:01:11 <flwang> strigazi: got it. but based on https://github.com/kubernetes/cloud-provider-openstack/issues/280#issuecomment-427416908
22:01:57 <flwang> does that mean we only need to add this one line: [Metadata] search-order=configDrive,metadataService ?
22:02:25 <strigazi> flwang: the code is... even if you set [Metadata] search-order=configDrive to disable the metadata service, it still calls the metadataService
22:03:11 <strigazi> I tried with configdrive only and it was still making calls to the APIs.
22:03:30 <flwang> i'm confused
22:03:56 <flwang> does that need any change in nova?
22:04:04 <flwang> i mean nova config
22:04:23 <strigazi> I'll end the meeting so it stays at ~1 hour, and we can continue after
22:04:33 <flwang> cool, thanks
22:04:40 <openstackgerrit> Merged openstack/magnum master: Minor fixes to re-align with Ironic  https://review.openstack.org/612748
22:04:53 <strigazi> @all thanks for joining the meeting
22:05:02 <strigazi> see you next week
22:05:22 <strigazi> #endmeeting