10:03:24 <strigazi> #startmeeting containers
10:03:25 <openstack> Meeting started Tue Jun 26 10:03:24 2018 UTC and is due to finish in 60 minutes. The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot.
10:03:26 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
10:03:29 <openstack> The meeting name has been set to 'containers'
10:03:38 <strigazi> #topic Roll Call
10:03:56 <slunkad> hi
10:04:12 <strigazi> hi slunkad
10:04:33 <flwang1> o/
10:05:47 <strigazi> hi flwang1
10:05:50 <strigazi> #topic Announcements
10:06:50 <strigazi> Last week we had another IRC meeting on Thursday, I sent a summary to the mailing list:
10:07:33 <strigazi> #link http://lists.openstack.org/pipermail/openstack-dev/2018-June/131769.html
10:08:25 <strigazi> We will continue with two meetings per week for three more weeks
10:08:45 <strigazi> #topic Blueprints/Bugs/Ideas
10:09:14 * strigazi still has slow network
10:10:14 <strigazi> Last week I discovered two issues and created the corresponding stories, I'd like your input on them
10:10:28 <strigazi> Allow cluster scaling from different users in the same project
10:10:37 <strigazi> #link https://storyboard.openstack.org/#!/story/2002648
10:12:34 <flwang1> cool
10:12:49 <strigazi> This is quite important for long-running clusters. I need to update the story, it is not only about scaling, it is about any operation hiding a stack update
10:12:59 <flwang1> but that means we're treating the cluster as a shared resource in the same tenant
10:13:31 <strigazi> yes, but it is not only that
10:14:47 <strigazi> it is also for admins. if your client opens a support ticket and asks you to do something as admin that needs a stack update, you cannot do it, the stack update will fail
10:15:12 <flwang1> ok, that makes sense
10:15:31 <flwang1> admin support operations are another topic we may need to discuss later
10:16:09 <strigazi1> well, why later? we had this problem yesterday :)
10:17:00 <strigazi1> So the proposed solution is to create a keypair in the stack
10:17:06 <strigazi1> what do you think?
10:17:12 <flwang1> i mean the admin support
10:17:22 <flwang1> it could be a bigger topic
10:17:41 <flwang1> i'm not blocking this particular issue
10:18:12 <strigazi1> ok, but what do you think about the solution?
10:19:36 <strigazi1> Taking the public_key from nova, giving it to heat, which will create a new keypair and pass it to the servers
10:19:38 <flwang1> the one you proposed in the story?
10:20:06 <strigazi1> this is only the heat part, but yes
10:20:52 <flwang1> does that mean any other user doing a stack update will regenerate a new keypair?
10:22:08 <strigazi1> I think it doesn't, but I haven't checked. The effect in heat though is that the update succeeds and the vms are not touched because of the key
10:23:30 <strigazi1> Even if it does, is that a problem?
10:24:00 <flwang1> will it prevent the original owner from ssh-ing into the VM?
10:24:16 <strigazi1> no
10:24:43 <strigazi1> because even if it creates a new keypair, the keypair will be created with the original public_key
10:25:14 <flwang1> ah, i see. so you mean the new keypair will be appended to the existing pub key?
10:25:31 <flwang1> s/new keypair/new pub key
10:26:34 <strigazi1> no, I need to check if there actually is a new keypair.
10:26:37 <strigazi1> even if there is
10:27:12 <strigazi1> It will be a new resource in nova created with the same data: the string with the public_key of the original owner
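A rough sketch of the heat side of the idea above (not the actual patch; the parameter name, resource names, image and flavor are placeholders): the owner's nova public key string is passed in as a template parameter, an OS::Nova::KeyPair is created inside the stack from it, and the servers reference that in-stack keypair, so later stack updates by other users or by admins no longer depend on a keypair owned by the original user.

    heat_template_version: queens

    parameters:
      # hypothetical parameter: public_key string copied from the owner's nova keypair
      existing_public_key:
        type: string
        description: public_key of the cluster owner's nova keypair

    resources:
      # keypair owned by the stack, created with the original owner's key material
      cluster_keypair:
        type: OS::Nova::KeyPair
        properties:
          name:
            list_join: ['-', [{get_param: 'OS::stack_name'}, 'keypair']]
          public_key: {get_param: existing_public_key}

      # minimal server showing how the in-stack keypair would be consumed
      example_server:
        type: OS::Nova::Server
        properties:
          image: fedora-atomic-latest
          flavor: m1.small
          key_name: {get_resource: cluster_keypair}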
10:28:16 <flwang1> i think you'd better propose a patch and we can continue the discussion there
10:28:20 <strigazi1> ricolin: are you here?
10:28:36 <strigazi1> flwang1: ok
10:28:44 <strigazi1> flwang1: I see that you hesitate though
10:29:13 <strigazi1> the next bug I encountered
10:29:26 <strigazi1> was again for admins, but for swarm
10:29:27 <flwang1> strigazi1: not really, i just need some time to understand ;)
10:29:39 <strigazi1> flwang1: :)
10:30:12 <flwang1> i have no idea about swarm, but i'm listening
10:30:20 <strigazi1> in kubernetes, the Resource Groups have a removal policy
10:30:26 <strigazi1> flwang1: it is heat again
10:31:09 <strigazi1> which means that if you pass the ids of some resources or their IPs it will remove them
10:31:13 <strigazi1> see here:
10:32:01 <strigazi1> http://git.openstack.org/cgit/openstack/magnum/tree/magnum/drivers/k8s_fedora_atomic_v1/templates/kubecluster.yaml#n703
10:32:24 <strigazi1> flwang1: slunkad this is good to know for our kubernetes drivers ^^
10:33:02 <strigazi1> so when there is a bad node or a node that failed to deploy, with this trick we can remove it
10:33:27 <strigazi1> for us (CERN), when we have a large cluster, we scale in steps
10:34:04 <strigazi1> and when we hit UPDATE_FAILED, we fix the stack and continue
10:34:27 <flwang1> manually?
10:34:27 <strigazi1> we need the same for swarm
10:34:37 <strigazi1> yes
10:34:41 <flwang1> ok
10:34:50 <strigazi1> manually with the APIs
10:35:01 <flwang1> but i can see this could be part of the auto healing feature
10:35:04 <strigazi1> it is not that manual
10:35:11 <flwang1> i see
10:35:16 <strigazi1> I was typing this
10:35:33 <strigazi1> for healing, auto or not
10:35:42 <flwang1> right
10:35:43 <strigazi1> this is how it is going to happen
10:35:57 <strigazi1> with removal_policies
10:36:07 <flwang1> cool
10:36:14 <flwang1> it would be nice to have
10:37:08 <slunkad> strigazi1: what does that do exactly?
10:38:10 <strigazi1> slunkad: if you do: openstack stack update <stack_id> --existing --parameter minions_to_remove=124,201
10:38:35 <strigazi1> it will remove from the resource group the two nested stacks with ids 124 and 201
10:38:45 <strigazi1> instead of ids you can use private ips
10:39:15 <strigazi1> openstack stack update <stack_id> --existing --parameter minions_to_remove=10.0.0.4,10.0.0.123
10:39:28 <strigazi1> makes sense?
10:39:41 <slunkad> yes, and the swarm driver does not have this?
10:39:57 <strigazi1> https://docs.openstack.org/heat/latest/template_guide/openstack.html#OS::Heat::ResourceGroup-prop-removal_policies
10:40:09 <strigazi1> slunkad: it doesn't
10:40:28 <slunkad> ok, good to know then, will keep this in mind for the opensuse driver
10:40:41 <slunkad> do you have a bug filed for this?
10:40:57 <strigazi1> yes
10:41:32 <strigazi1> #link https://storyboard.openstack.org/#!/story/2002677
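For reference, the hook in the linked kubecluster.yaml looks roughly like the snippet below (heavily trimmed, only the parts relevant to removal are kept): the minions_to_remove parameter feeds the removal_policies property of the ResourceGroup, so an update that passes specific nested stack ids or private IPs deletes exactly those members instead of trimming from the end of the group. A swarm or opensuse driver would need an equivalent parameter and policy.

    parameters:
      minions_to_remove:
        type: comma_delimited_list
        default: []
        description: nested stack ids or private IPs of minions to remove on update

    resources:
      kube_minions:
        type: OS::Heat::ResourceGroup
        properties:
          count: {get_param: number_of_minions}
          # members listed here are deleted on update instead of scaling in from the end
          removal_policies: [{resource_list: {get_param: minions_to_remove}}]
          resource_def:
            type: kubeminion.yaml
            # per-minion properties trimmed for brevity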
10:41:51 <strigazi1> ok, next, flwang1 did you see my comment on multi-master?
10:42:09 <strigazi1> #link https://review.openstack.org/#/c/576029
10:42:17 <flwang1> strigazi: yep, i saw that. but i don't think it's related to the sync keys patch
10:42:33 <flwang1> it may be another corner case of the multi master race condition issue
10:43:01 <strigazi1> flwang1: not exactly, but still the deployment doesn't work
10:43:25 <strigazi1> flwang1: we can merge yours but multi master still won't be functional
10:44:04 <flwang1> i see. i will give the multi master deployment another try, but based on my testing on an openlab machine with 256G ram, my fix just works
10:44:24 <flwang1> i will do another round of testing for the multi master case and get back to you
10:44:35 <strigazi1> openlab?
10:44:57 <flwang1> forgot that :)
10:45:33 <strigazi1> I don't know why it works
10:45:47 <strigazi1> for it worked once
10:45:51 <strigazi1> for me it worked once
10:45:57 <flwang1> interesting, i will test it again and let you know
10:45:58 <strigazi1> out of 30
10:46:07 <flwang1> i tested it several times and it never failed
10:46:11 <strigazi1> happy to +2
10:47:00 <flwang1> it would be nice because I think it's a different issue from the service account key issue
10:47:21 <flwang1> i can open another story/task to track your issue
10:48:18 <strigazi1> ok
10:49:04 <strigazi1> that's it from me
10:49:35 <flwang1> my turn now?
10:49:47 <strigazi1> yes
10:49:51 <flwang1> cool
10:50:42 <flwang1> last week, i mainly focused on magnum support in gophercloud and shade/openstacksdk, because catalyst cloud would like to enable users to use terraform and ansible to talk to the magnum api
10:51:32 <flwang1> i'm still working on the k8s-keystone-auth integration, will submit a testing image to openstackmagnum soon
10:52:10 <flwang1> for auto healing, my patch https://review.openstack.org/#/c/570818/ is still waiting for review
10:52:55 <flwang1> for this cycle, i probably won't take on new stuff, just finish the things on my plate
10:53:28 <flwang1> strigazi1: i'm keen to know the upgrade and auto healing status from you and ricardo
10:53:40 <strigazi1> flwang1: thanks, the patch looks good
10:53:57 <strigazi1> just point to the story instead of the blueprint
10:54:09 <flwang1> strigazi1: ok, will update
10:55:21 <strigazi1> flwang1: I'm trying to finish my part. I sent out a patch for the api 10 days ago which I'm basing the rest on
10:55:55 <flwang1> strigazi1: is the current one ready for testing?
10:56:14 <flwang1> i'm worried about whether we can finish them in this cycle
10:56:55 <strigazi1> I hope we'll discuss it on Thursday
10:57:10 <flwang1> will we have ricardo?
10:57:19 <strigazi1> I don't know
10:57:36 <flwang1> strigazi1: fair enough
10:57:46 <flwang1> if so, let's focus on the upgrade one
10:57:57 <flwang1> let me know if there is anything i can help with
10:58:22 <strigazi1> thanks
10:58:58 <strigazi1> thanks for joining the meeting flwang1 slunkad
10:59:24 <strigazi1> #endmeeting
10:59:55 <flwang1> thank you, strigazi1
11:00:01 <strigazi> #endmeeting