21:00:07 <strigazi> #startmeeting containers
21:00:08 <openstack> Meeting started Tue Sep 25 21:00:07 2018 UTC and is due to finish in 60 minutes. The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:09 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:12 <openstack> The meeting name has been set to 'containers'
21:00:19 <strigazi> #topic Roll Call
21:00:23 <strigazi> o/
21:00:24 <ttsiouts> o/
21:00:25 <cbrumm> o/
21:00:29 <colin-> hello
21:00:54 <colin-> jim is otw
21:01:45 <strigazi> it seems that flwang is not here
21:01:48 <strigazi> colin-: cool
21:02:02 <strigazi> agenda:
21:02:04 <strigazi> #link https://wiki.openstack.org/wiki/Meetings/Containers#Agenda_for_2018-09-25_2100_UTC
21:02:21 <imdigitaljim> o/
21:02:34 <strigazi> #topic Stories/Tasks
21:02:37 <strigazi> imdigitaljim: hello
21:02:59 <strigazi> I have put 4 items in the agenda
21:03:26 <strigazi> The 1st one is merged in rocky: Fix cluster update command https://storyboard.openstack.org/#!/story/1722573 Patch in review: https://review.openstack.org/#/c/600806/ \o/
21:03:51 <cbrumm> very nice
21:04:07 <strigazi> no more broken stacks cause one char changed in the templates
21:04:27 <strigazi> And actually I want to mention the 4th one:
21:04:42 <strigazi> scale cluster as admin or other user in the same project
21:04:49 <strigazi> #link https://storyboard.openstack.org/#!/story/2002648
21:05:01 <strigazi> We have discussed this before,
21:05:23 <strigazi> and I think our only option is pass the public key as a string.
21:05:38 <strigazi> plus the patch from imdigitaljim to not pass a keypair at all
21:05:57 <imdigitaljim> yeah this story wont be an issue for us
21:05:59 <strigazi> imdigitaljim: cbrumm you are not using keypairs at all
21:06:00 <strigazi> ?
21:06:04 <imdigitaljim> correct
21:06:11 <strigazi> only sssd?
21:06:19 <imdigitaljim> yeah
21:06:24 <imdigitaljim> keypair is less secure as well
21:06:31 <imdigitaljim> since if anyone gets access to said key
21:06:32 <strigazi> does this make sense to go upstream?
21:06:43 <strigazi> it is a ds right?
21:06:59 <imdigitaljim> its fine to support it but we should consider the option for without
21:07:05 <strigazi> we could have a recipe we some common bits
21:07:21 <strigazi> without sssd?
21:07:49 <imdigitaljim> yeah that would be good, an option that works as you need it to and an option that will not worry about it at all for usages like sssd
21:07:53 <strigazi> *we could have a recipe with some common bits
21:07:59 <imdigitaljim> yup
21:08:15 <imdigitaljim> ive noticed this issue occur in other cases too btw
21:08:22 <strigazi> like?
21:08:49 <imdigitaljim> not with keys but just policy control flow
21:09:04 <strigazi> oh, right
21:09:38 <imdigitaljim> we have a current issue where admin/owner A creates cluster in tenant A for user B, the user B cannot create a config file (using CLI/API) for that cluster because they are neither admin/owner
21:09:59 <imdigitaljim> and user B belongs to tenant A as well
21:10:20 <strigazi> that is fixable in the policy file
21:10:20 <imdigitaljim> we would like any users of tenant A to be able to generate a config for clusters of tenant A
21:10:25 <imdigitaljim> not in its current state
21:10:46 <imdigitaljim> its an API enforcement issue where our issue sits
21:10:46 <strigazi> we have it, without any other change
21:10:53 <strigazi> one sec
21:11:01 <imdigitaljim> maybe share the policy, perhaps we're missing something :D
21:12:03 <strigazi> "certificate:create": "rule:admin_or_owner or rule:cluster_user",
21:12:06 <strigazi> "certificate:get": "rule:admin_or_owner or rule:cluster_user",
21:12:31 <imdigitaljim> what is your cluster_user rule
21:12:32 <strigazi> "admin_or_user": "is_admin:True or user_id:%(user_id)s",
21:12:32 <strigazi> "cluster_user": "user_id:%(trustee_user_id)s",
21:12:37 <imdigitaljim> thats what we have
21:13:19 <strigazi> also: "admin_or_owner": "is_admin:True or project_id:%(project_id)s",
21:13:38 <canori02> o/
21:13:49 <strigazi> hey canori02
21:14:12 <imdigitaljim> yeah thats what we have, i think theres a condition that doesnt get met somewhere and it fails the policy
21:14:21 <imdigitaljim> ill have to find it, sorry it was a couple weeks ago
21:14:51 <strigazi> imdigitaljim: that is our policy, works for brtknr too
21:15:17 <imdigitaljim> yeah, id like for it to work too :)
21:15:51 <strigazi> imdigitaljim: I'll double check in devstack too
21:16:14 <strigazi> ok, I have two more
21:16:41 <strigazi> This patch requires a first pass: [k8s] Add vulnerability scanner https://review.openstack.org/#/c/598142
21:16:59 <strigazi> it was done by an intern, in the past months at CERN
21:17:20 <strigazi> it is a scanner to scan all images in a running cluster
21:17:35 <strigazi> combined with a clair serve
21:17:37 <strigazi> combined with a clair server
21:17:52 <strigazi> You can have a look and give some input
21:17:58 <imdigitaljim> oh excellent
21:18:51 <strigazi> The first iteration works only for public images, in subsequent steps we can enhance it to work for private registries too
21:19:04 <imdigitaljim> great!
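For readers following along, the rules strigazi pasted above come from Magnum's oslo.policy configuration (typically /etc/magnum/policy.json at the time; policy.yaml syntax is shown here only because it allows comments). Collected into one fragment they look roughly like the sketch below; the surrounding defaults in any given deployment may differ:

    # Any admin, or any user in the cluster's project.
    "admin_or_owner": "is_admin:True or project_id:%(project_id)s"
    # Any admin, or the specific user recorded on the resource.
    "admin_or_user": "is_admin:True or user_id:%(user_id)s"
    # The trustee user Magnum created for the cluster (used by the nodes themselves).
    "cluster_user": "user_id:%(trustee_user_id)s"
    # Certificate operations, which is what the CLI uses to generate a cluster config:
    # project members or the cluster's trustee user are allowed.
    "certificate:create": "rule:admin_or_owner or rule:cluster_user"
    "certificate:get": "rule:admin_or_owner or rule:cluster_user"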
21:19:16 <colin-> yeah that could be really useful
21:19:24 <imdigitaljim> looks good on everything but ill have some comments for the shell file
21:20:03 <strigazi> nice :) The last item, from me and ttsiouts, is about nodegroups. Nodegroups patches: https://review.openstack.org/#/q/status:open+project:openstack/magnum+branch:master+topic:magnum_nodegroups
21:20:26 <imdigitaljim> yeah can we discuss that
21:20:32 <imdigitaljim> im not sure what thats about/whats its purpose
21:20:33 <strigazi> We need to dig up the spec and bring it up to date, but these patches are a kickstart
21:20:34 <ttsiouts> imdigitaljim: I'm drafting a spec for this
21:20:35 <imdigitaljim> i couldnt follow
21:20:39 <imdigitaljim> oh ok great thanks
21:21:16 <ttsiouts> cool
21:21:33 <ttsiouts> I'll try to have it upstream asap
21:21:34 <strigazi> atm the clusters are homogeneous, one AZ one flavor
21:21:54 <imdigitaljim> oh is it for cluster node groups
21:21:57 <imdigitaljim> i understand
21:22:00 <colin-> strigazi: is this to provide the option to support different types of minions in the cluster?
21:22:03 <colin-> distinctly
21:22:06 <strigazi> yes
21:22:08 <imdigitaljim> i think i have some other thoughts too for the WIP
21:22:08 <colin-> neat
21:22:41 <strigazi> From our side,
21:23:02 <strigazi> the idea is to have a minimum of two groups of nodes
21:23:11 <strigazi> one for master one for minion
21:23:25 <strigazi> and then add as you go, like in GKE
21:23:35 <strigazi> in gke they call them nodepools
21:24:27 <strigazi> we don't have a strong opinion on the master nodegroups, but I think it is the most straightforward option atm
21:24:38 <strigazi> imdigitaljim: do you have some quick input
21:24:51 <strigazi> we can take the details in the spec
21:25:02 <imdigitaljim> yeah a couple questions
21:25:24 <imdigitaljim> so, is this intended to be runtime nodegroups or determined at creation time?
21:26:06 <strigazi> the first two nodegroups will be created at creation time and then the user will add more
21:26:22 <strigazi> like now
21:26:28 <strigazi> when you create a cluster
21:26:50 <strigazi> the heat stack has two resource groups, one for master one for minions
21:27:02 <strigazi> this can be the minimum
21:27:20 <colin-> could you add a nodegroup to a cluster that was created without it?
21:27:23 <colin-> at a later time?
21:27:36 <strigazi> then you call POST cluster/UUID/nodegroups and you add more
21:27:45 <colin-> interesting
21:28:17 <imdigitaljim> for this design i was thinking something more clever with leveraging heat more
21:28:18 <imdigitaljim> https://docs.openstack.org/heat/latest/template_guide/hot_spec.html
21:28:23 <strigazi> colin-: it could be possible, but I'm not sure what the benefit is. IMO for this use case
21:28:30 <imdigitaljim> if we update the minimum heat we could have a repeat for the # of pools
21:28:42 <strigazi> imdigitaljim: this is what we want to do ^^
21:28:58 <imdigitaljim> so like 1-N pools, and provide the data through template (for now)
21:28:59 <strigazi> imdigitaljim: not many stacks
21:29:14 <imdigitaljim> pools/resourcegroups
21:29:30 <strigazi> a shallow nested stack
21:29:52 <imdigitaljim> yeah
21:29:58 <imdigitaljim> so where do all these controllers come into play
21:30:09 <imdigitaljim> i dont see why these would be necessary to accomplish node pools
21:30:19 <strigazi> colin-: for this use case we could have the concept of external groups or smth
21:30:41 <colin-> ok
21:31:17 <strigazi> imdigitaljim: in the end it would be one stack. But end users, who don't know about heat, need a way to express this
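To make the "repeat for the # of pools" idea above concrete, here is a minimal sketch of a HOT fragment that stamps out N worker pools with OS::Heat::ResourceGroup inside a single shallow parent stack. The parameter names and the nested template name (kubeminion-pool.yaml) are illustrative assumptions, not the actual Magnum templates or the nodegroups patches:

    heat_template_version: queens

    parameters:
      pool_count:
        type: number
        default: 1
        description: number of worker pools (nodegroups) to create
      minion_flavor:
        type: string
        default: m1.medium

    resources:
      # One nested stack per pool, all children of the same parent stack.
      worker_pools:
        type: OS::Heat::ResourceGroup
        properties:
          count: { get_param: pool_count }
          resource_def:
            # kubeminion-pool.yaml is a hypothetical nested template describing
            # one pool (its own node count, flavor, AZ, labels, ...).
            type: kubeminion-pool.yaml
            properties:
              pool_index: "%index%"
              flavor: { get_param: minion_flavor }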
21:31:32 <strigazi> imdigitaljim: we need a route in the api
21:32:26 <strigazi> imdigitaljim: otherwise we need to do CRUD operations in a field or many fields in the cluster
21:33:00 <strigazi> have a nodegroup field that describes those pools/groups
21:33:34 <imdigitaljim> oh is this part for the feedback for api/cli/ on what exists?
21:34:00 <imdigitaljim> feedback/usage via cli/api?
21:34:39 <strigazi> I think I got the question and I'll say yes
21:34:43 <strigazi> :)
21:34:53 <imdigitaljim> let me sit on it a little longer
21:35:01 <imdigitaljim> and maybe if you can answer those questions from ricardo
21:35:10 <strigazi> ok
21:35:11 <imdigitaljim> but if its that then i can better judge the PR :)
21:35:34 <imdigitaljim> but i do think i understand what these PR's are now
21:35:46 <strigazi> :)
21:36:31 <imdigitaljim> yeah
21:36:33 <imdigitaljim> now i see
21:36:35 <imdigitaljim> cool beans
21:36:37 <imdigitaljim> looks about right
21:36:43 <imdigitaljim> ill keep following it
21:36:47 <imdigitaljim> thanks for clarifying!
21:36:53 <strigazi> :)
21:37:01 <strigazi> ttsiouts++
21:37:11 <ttsiouts> :)
21:37:37 <strigazi> oh, I would like to add two more things
21:37:46 <strigazi> one is, for imdigitaljim
21:38:09 <strigazi> Do you have experience on rebooting cluster nodes?
21:38:32 <imdigitaljim> yeah
21:38:35 <imdigitaljim> somewhat
21:38:42 <strigazi> our experience is pretty unpleasant with flannel
21:38:58 <cbrumm> we've played a lot with killing and creating minions, rebooting is generally fine too
21:39:24 <imdigitaljim> ^
21:39:38 <imdigitaljim> and also killing LB's and recoverying
21:39:39 <strigazi> with the current model of flannel, 30% of the nodes lose network
21:40:03 <imdigitaljim> recoverying/recovering*
21:40:09 <strigazi> I hope that the self-hosted flannel works better
21:40:18 <imdigitaljim> yeah i feel like it would
21:40:44 <imdigitaljim> i think you guys are doing the right thing switching to a self-hosted flannel imho
21:40:47 <strigazi> cbrumm: imdigitaljim your experience is with calico hosted on k8s, right?
21:40:49 <imdigitaljim> or join us with calico
21:40:54 <cbrumm> yeah
21:40:55 <imdigitaljim> yeah
21:41:05 <colin-> did you guys already consider that strigazi ?
21:41:08 <imdigitaljim> we're using latest calico 3.3.9
21:41:09 <colin-> must have at some point
21:41:30 <cbrumm> calico has "just worked" for us
21:41:33 <strigazi> we stuck with what we know, no other reason so far
21:41:45 <colin-> understood
21:41:46 <imdigitaljim> cbrumm+1
21:42:00 <strigazi> but we must give it a go
21:42:09 <colin-> it's nice not to deal with any layer 2 matters i have to say
21:42:21 <strigazi> we also have tungsten waiting in the corner and we kind of wait for it
21:42:22 <colin-> been a relief for me personally from an operator perspective to use calico only
21:42:45 <strigazi> colin-: you use calico for vms too?
21:43:27 <cbrumm> colin is with us
21:43:28 <colin-> as much as imdigitaljim and cbrumm do :)
21:43:57 <strigazi> oh, right :)
21:44:17 <strigazi> it is close to midnight here, sorry :)
21:45:26 <strigazi> the last thing is for people interested in Fedora CoreOS
21:46:06 <strigazi> I promised the FCOS team to try systemd portable services for kubelet and dockerd/containerd
21:46:38 <strigazi> But I didn't have time so far, anyone who wants to help is more than welcome
21:46:50 <strigazi> I'm fetching the pointer
21:47:30 <cbrumm> not sure we'll have time to try it out
21:47:35 <imdigitaljim> not sure we can aid with that yet but keep a finger on them for a minimal image ;)
21:47:36 <cbrumm> might, but our timeline is tight
21:47:41 <strigazi> #link https://github.com/systemd/systemd/blob/master/docs/PORTABLE_SERVICES.md
21:48:01 <imdigitaljim> strigazi: ill catch up on the literature
21:48:33 <strigazi> The goal is to run the kubelet as a portable systemd service
21:49:08 <imdigitaljim> oh i see
21:49:13 <strigazi> I just wanted to share it with you
21:49:16 <imdigitaljim> its super similar to the atomic install model already
21:49:18 <imdigitaljim> yeah
21:49:21 <imdigitaljim> ill read up some more
21:49:24 <strigazi> maybe canori01 is interested too
21:50:00 <strigazi> imdigitaljim: and should work in many distros (? or !)
21:50:13 <imdigitaljim> yeah
21:50:35 <imdigitaljim> its the same pattern/benefits as containers
21:50:49 <imdigitaljim> just rebranded/slightly different
21:51:29 <strigazi> plus maintained by the systemd team
21:51:47 <cbrumm> I think this is the right thing to look into
21:51:48 <imdigitaljim> i can see kubelet being done fairly easily
21:51:57 <imdigitaljim> but dockerd/containerd would be much more complicated
21:52:06 <cbrumm> We'll all want to make sure it works well, but its the correct starting place
21:52:41 <strigazi> imdigitaljim: would it though? we managed to run dockerd in a syscontainer already
21:52:55 <strigazi> let's see
21:53:07 <imdigitaljim> perhaps
21:53:20 <imdigitaljim> maybe im thinking of something more complicated
21:53:23 <imdigitaljim> and not this context
21:53:28 <imdigitaljim> but ill check it out
21:53:33 <imdigitaljim> do you have the dockerd in a syscontainer?
21:53:40 <imdigitaljim> does it look like the dind project?
21:53:58 <strigazi> yes, for swarm, but we look to use it for k8s too
21:54:25 <strigazi> imdigitaljim: no, not like dind
21:54:45 <strigazi> imdigitaljim: https://gitlab.cern.ch/cloud/docker-ce-centos/
21:55:35 <imdigitaljim> oh ok
21:55:37 <imdigitaljim> cool
21:55:40 <imdigitaljim> and this works for you already?
21:56:02 <strigazi> yes
21:56:28 <strigazi> for swarm for a year or so
21:56:46 <imdigitaljim> i just personally dont have intimate knowledge of the dockerd requirements but if you've got it already it should be cake!
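For anyone who wants to pick up the portable-services task mentioned above, a minimal sketch of what running kubelet as a systemd portable service could look like, following the PORTABLE_SERVICES.md document linked in the log. The image name and path (kubelet.raw) and its contents are assumptions for illustration only; no such image exists yet:

    # Attach a portable service image that ships the kubelet binary plus a
    # kubelet.service unit in /usr/lib/systemd/system (image name is hypothetical).
    portablectl attach /var/lib/portables/kubelet.raw

    # The unit then shows up like any other service on the host.
    systemctl enable --now kubelet.service
    systemctl status kubelet.service

    # List attached portable images, or detach one again.
    portablectl list
    portablectl detach kubelet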
21:56:54 <strigazi> for k8s we didn't put a lot of effort in, but for some tests it was fine
21:57:27 <strigazi> imdigitaljim: the only corner case can be mounting weird dirs on the host
21:57:30 <imdigitaljim> yeah
21:57:40 <imdigitaljim> thats where my complexities were concerned
21:57:44 <strigazi> imdigitaljim: our mount points are pretty much standard
21:57:48 <imdigitaljim> weird dirs/weird mounts
21:58:20 <imdigitaljim> /weird permissions
21:58:21 <colin-> interesting idea, would be curious to see how it's implemented for k8s and how kubelet reacts
21:58:30 <strigazi> imdigitaljim: we have tested mounting cinder volumes too
21:59:00 <imdigitaljim> anyways yeah we'll keep an eye on it and catch up
21:59:40 <strigazi> imdigitaljim: colin- if dockerd and kubelet share the proper bind mounts it "Just Works"
22:00:02 <colin-> nice
22:00:24 <colin-> good to remember that does still happen in real life :)
22:00:29 <colin-> (sometimes)
22:00:34 <strigazi> :)
22:00:34 <imdigitaljim> 'proper' :P
22:00:42 <imdigitaljim> is the complexity
22:00:43 <imdigitaljim> but yeah
22:01:00 <cbrumm> need to go, bye everyone
22:01:07 <strigazi> we are an hour in
22:01:12 <strigazi> cbrumm: thanks
22:01:48 <strigazi> let's wrap then
22:02:04 <strigazi> Thanks for joining the meeting everyone
22:02:10 <colin-> ttyl!
22:02:24 <ttsiouts> bye!
22:02:36 <strigazi> #endmeeting