10:03:24 <strigazi> #startmeeting containers
10:03:25 <openstack> Meeting started Tue Jun 26 10:03:24 2018 UTC and is due to finish in 60 minutes. The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot.
10:03:26 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
10:03:29 <openstack> The meeting name has been set to 'containers'
10:03:38 <strigazi> #topic Roll Call
10:03:56 <slunkad> hi
10:04:12 <strigazi> hi slunkad
10:04:33 <flwang1> o/
10:05:47 <strigazi> hi flwang1
10:05:50 <strigazi> #topic Announcements
10:06:50 <strigazi> Last week we had another IRC meeting on Thursday, I sent a summary to the mailing list:
10:07:33 <strigazi> #link http://lists.openstack.org/pipermail/openstack-dev/2018-June/131769.html
10:08:25 <strigazi> We will continue with two meetings per week for three more weeks
10:08:45 <strigazi> #topic Blueprints/Bugs/Ideas
10:09:14 * strigazi still has slow network
10:10:14 <strigazi> Last week I discovered two issues and created the corresponding stories, I'd like your input on them
10:10:28 <strigazi> Allow cluster scaling from different users in the same project
10:10:37 <strigazi> #link https://storyboard.openstack.org/#!/story/2002648
10:12:34 <flwang1> cool
10:12:49 <strigazi> This is quite important for long-running clusters. I need to update the story, it is not only about scaling, it is about any operation hiding a stack update
10:12:59 <flwang1> but that means we're treating the cluster as a shared resource in the same tenant
10:13:31 <strigazi> yes, but it is not only that
10:14:47 <strigazi> it is also for admins. if your client opens a support ticket and asks you to do something as admin that needs a stack update, you cannot do it, the stack update will fail
10:15:12 <flwang1> ok, that makes sense
10:15:31 <flwang1> admin support operations are another topic we may need to discuss later
10:16:09 <strigazi1> well, why later? we had this problem yesterday :)
10:17:00 <strigazi1> So the proposed solution is to create a keypair in the stack
10:17:06 <strigazi1> what do you think?
10:17:12 <flwang1> i mean the admin support
10:17:22 <flwang1> it could be a bigger topic
10:17:41 <flwang1> i'm not blocking this particular issue
10:18:12 <strigazi1> ok, but what do you think about the solution?
10:19:36 <strigazi1> Taking the public_key from nova, giving it to heat, which will create a new keypair and pass it to the servers
10:19:38 <flwang1> the one you proposed in the story?
10:20:06 <strigazi1> this is only the heat part, but yes
10:20:52 <flwang1> does that mean any other user doing a stack update will regenerate a new keypair?
10:22:08 <strigazi1> I think it doesn't, but I haven't checked. The effect in heat though is that the update succeeds and the vms are not touched because of the key
10:23:30 <strigazi1> Even if it does, is that a problem?
10:24:00 <flwang1> will it prevent the original owner from ssh-ing into the VM?
10:24:16 <strigazi1> no
10:24:43 <strigazi1> because even if it creates a new keypair, the keypair will be created with the original public_key
10:25:14 <flwang1> ah, i see. so you mean the new keypair will be appended to the existing pub key?
10:25:31 <flwang1> s/new keypair/new pub key
10:26:34 <strigazi1> no, I need to check if there actually is a new keypair.
10:26:37 <strigazi1> even if there is
10:27:12 <strigazi1> It will be a new resource in nova created with the same data: the string with the public_key of the original owner
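A rough sketch of the heat side of the idea above (not the actual patch; the parameter name, resource names, image and flavor are placeholders): the owner's nova public key string is passed in as a template parameter, an OS::Nova::KeyPair is created inside the stack from it, and the servers reference that in-stack keypair, so later stack updates by other users or by admins no longer depend on a keypair owned by the original user.

    heat_template_version: queens

    parameters:
      # hypothetical parameter: public_key string copied from the owner's nova keypair
      existing_public_key:
        type: string
        description: public_key of the cluster owner's nova keypair

    resources:
      # keypair owned by the stack, created with the original owner's key material
      cluster_keypair:
        type: OS::Nova::KeyPair
        properties:
          name:
            list_join: ['-', [{get_param: 'OS::stack_name'}, 'keypair']]
          public_key: {get_param: existing_public_key}

      # minimal server showing how the in-stack keypair would be consumed
      example_server:
        type: OS::Nova::Server
        properties:
          image: fedora-atomic-latest
          flavor: m1.small
          key_name: {get_resource: cluster_keypair}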
10:28:16 <flwang1> i think you'd better propose a patch and we can continue the discussion there
10:28:20 <strigazi1> ricolin: are you here?
10:28:36 <strigazi1> flwang1: ok
10:28:44 <strigazi1> flwang1: I see that you hesitate though
10:29:13 <strigazi1> the next bug I encountered
10:29:26 <strigazi1> was again for admins, but for swarm
10:29:27 <flwang1> strigazi1: not really, i just need some time to understand ;)
10:29:39 <strigazi1> flwang1: :)
10:30:12 <flwang1> i have no idea about swarm, but i'm listening
10:30:20 <strigazi1> in kubernetes, the Resource Groups have a removal policy
10:30:26 <strigazi1> flwang1: it is heat again
10:31:09 <strigazi1> which means that if you pass the ids of some resources or their IPs it will remove them
10:31:13 <strigazi1> see here:
10:32:01 <strigazi1> http://git.openstack.org/cgit/openstack/magnum/tree/magnum/drivers/k8s_fedora_atomic_v1/templates/kubecluster.yaml#n703
10:32:24 <strigazi1> flwang1: slunkad this is good to know for our kubernetes drivers ^^
10:33:02 <strigazi1> so when there is a bad node or a node that failed to deploy, with this trick we can remove it
10:33:27 <strigazi1> for us (CERN), when we have a large cluster, we scale in steps
10:34:04 <strigazi1> and when we hit UPDATE_FAILED, we fix the stack and continue
10:34:27 <flwang1> manually?
10:34:27 <strigazi1> we need the same for swarm
10:34:37 <strigazi1> yes
10:34:41 <flwang1> ok
10:34:50 <strigazi1> manually with the APIs
10:35:01 <flwang1> but i can see this could be part of the auto healing feature
10:35:04 <strigazi1> it is not that manual
10:35:11 <flwang1> i see
10:35:16 <strigazi1> I was typing this
10:35:33 <strigazi1> for healing, auto or not
10:35:42 <flwang1> right
10:35:43 <strigazi1> this is how it is going to happen
10:35:57 <strigazi1> with removal_policies
10:36:07 <flwang1> cool
10:36:14 <flwang1> it would be nice to have
10:37:08 <slunkad> strigazi1: what does that do exactly?
10:38:10 <strigazi1> slunkad: if you do: openstack stack update <stack_id> --existing --parameter minions_to_remove=124,201
10:38:35 <strigazi1> it will remove from the resource group the two nested stacks with ids 124 and 201
10:38:45 <strigazi1> instead of ids you can use private ips
10:39:15 <strigazi1> openstack stack update <stack_id> --existing --parameter minions_to_remove=10.0.0.4,10.0.0.123
10:39:28 <strigazi1> makes sense?
10:39:41 <slunkad> yes, and the swarm driver does not have this?
10:39:57 <strigazi1> https://docs.openstack.org/heat/latest/template_guide/openstack.html#OS::Heat::ResourceGroup-prop-removal_policies
10:40:09 <strigazi1> slunkad: it doesn't
10:40:28 <slunkad> ok, good to know then, will keep this in mind for the opensuse driver
10:40:41 <slunkad> do you have a bug filed for this?
10:40:57 <strigazi1> yes
10:41:32 <strigazi1> #link https://storyboard.openstack.org/#!/story/2002677
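For reference, the hook in the linked kubecluster.yaml looks roughly like the snippet below (heavily trimmed, only the parts relevant to removal are kept): the minions_to_remove parameter feeds the removal_policies property of the ResourceGroup, so an update that passes specific nested stack ids or private IPs deletes exactly those members instead of trimming from the end of the group. A swarm or opensuse driver would need an equivalent parameter and policy.

    parameters:
      minions_to_remove:
        type: comma_delimited_list
        default: []
        description: nested stack ids or private IPs of minions to remove on update

    resources:
      kube_minions:
        type: OS::Heat::ResourceGroup
        properties:
          count: {get_param: number_of_minions}
          # members listed here are deleted on update instead of scaling in from the end
          removal_policies: [{resource_list: {get_param: minions_to_remove}}]
          resource_def:
            type: kubeminion.yaml
            # per-minion properties trimmed for brevity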
10:41:51 <strigazi1> ok, next, flwang1 did you see my comment on multi-master?
10:42:09 <strigazi1> #link https://review.openstack.org/#/c/576029
10:42:17 <flwang1> strigazi: yep, i saw that. but i don't think it's related to the sync keys patch
10:42:33 <flwang1> it may be another corner case of the multi master race condition issue
10:43:01 <strigazi1> flwang1: not exactly, but still the deployment doesn't work
10:43:25 <strigazi1> flwang1: we can merge yours but multi master still won't be functional
10:44:04 <flwang1> i see. i will give the multi master deployment another try, but based on my testing on an openlab machine with 256G ram, my fix just works
10:44:24 <flwang1> i will do another round of testing for the multi master case and get back to you
10:44:35 <strigazi1> openlab?
10:44:57 <flwang1> forgot that :)
10:45:33 <strigazi1> I don't know why it works
10:45:47 <strigazi1> for it worked once
10:45:51 <strigazi1> for me it worked once
10:45:57 <flwang1> interesting, i will test it again and let you know
10:45:58 <strigazi1> out of 30
10:46:07 <flwang1> i tested it several times and it never failed
10:46:11 <strigazi1> happy to +2
10:47:00 <flwang1> it would be nice because I think it's a different issue from the service account key issue
10:47:21 <flwang1> i can open another story/task to track your issue
10:48:18 <strigazi1> ok
10:49:04 <strigazi1> that's it from me
10:49:35 <flwang1> my turn now?
10:49:47 <strigazi1> yes
10:49:51 <flwang1> cool
10:50:42 <flwang1> last week, i mainly focused on magnum support in gophercloud and shade/openstacksdk, because catalyst cloud would like to enable users to use terraform and ansible to talk to the magnum api
10:51:32 <flwang1> i'm still working on the k8s-keystone-auth integration, will submit a testing image to openstackmagnum soon
10:52:10 <flwang1> for auto healing, my patch https://review.openstack.org/#/c/570818/ is still waiting for review
10:52:55 <flwang1> for this cycle, i probably won't take on new stuff, just finish the things on my plate
10:53:28 <flwang1> strigazi1: i'm keen to know the upgrade and auto healing status from you and ricardo
10:53:40 <strigazi1> flwang1: thanks, the patch looks good
10:53:57 <strigazi1> just point to the story instead of the blueprint
10:54:09 <flwang1> strigazi1: ok, will update
10:55:21 <strigazi1> flwang1: I'm trying to finish my part. I sent out a patch for the api 10 days ago which I'm basing the rest on
10:55:55 <flwang1> strigazi1: is the current one ready for testing?
10:56:14 <flwang1> i'm worried about whether we can finish them in this cycle
10:56:55 <strigazi1> I hope we'll discuss it on Thursday
10:57:10 <flwang1> will we have ricardo?
10:57:19 <strigazi1> I don't know
10:57:36 <flwang1> strigazi1: fair enough
10:57:46 <flwang1> if so, let's focus on the upgrade one
10:57:57 <flwang1> let me know if there is anything i can help with
10:58:22 <strigazi1> thanks
10:58:58 <strigazi1> thanks for joining the meeting flwang1 slunkad
10:59:24 <strigazi1> #endmeeting
10:59:55 <flwang1> thank you, strigazi1
11:00:01 <strigazi> #endmeeting