10:03:24 #startmeeting containers
10:03:25 Meeting started Tue Jun 26 10:03:24 2018 UTC and is due to finish in 60 minutes. The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot.
10:03:26 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
10:03:29 The meeting name has been set to 'containers'
10:03:38 #topic Roll Call
10:03:56 hi
10:04:12 hi slunkad
10:04:33 o/
10:05:47 hi flwang1
10:05:50 #topic Announcements
10:06:50 Last week we had another IRC meeting on Thursday, I sent a summary to the mailing list:
10:07:33 #link http://lists.openstack.org/pipermail/openstack-dev/2018-June/131769.html
10:08:25 We will continue with two meetings per week for three more weeks
10:08:45 #topic Blueprints/Bugs/Ideas
10:09:14 * strigazi still has a slow network
10:10:14 Last week I discovered two issues and created the corresponding stories, I'd like your input on them
10:10:28 Allow cluster scaling from different users in the same project
10:10:37 #link https://storyboard.openstack.org/#!/story/2002648
10:12:34 cool
10:12:49 This is quite important for long-running clusters. I need to update the story, it is not only for scaling, it is for any operation hiding a stack update
10:12:59 but that means we're treating the cluster as a shared resource in the same tenant
10:13:31 yes, but it is not only that
10:14:47 it is also for admins. If your client opens a support ticket and asks you to do something as admin that needs a stack update, you can not do it, the stack update will fail
10:15:12 ok, that makes sense
10:15:31 admin support operations are another topic we may need to discuss later
10:16:09 well, why later? we had this problem yesterday :)
10:17:00 So the proposed solution is to create a keypair in the stack
10:17:06 what do you think?
10:17:12 i mean the admin support
10:17:22 it could be a bigger topic
10:17:41 i'm not blocking this particular issue
10:18:12 ok, but what do you think about the solution?
10:19:36 Taking the public_key from nova, giving it to heat, which will create a new keypair and pass it to the servers
10:19:38 the one you proposed in the story?
10:20:06 this is only the part in heat, but yes
10:20:52 does that mean any other user doing a stack update will regenerate a new keypair?
10:22:08 I think it doesn't, but I haven't checked. The effect in heat though is that the update succeeds and the vms are not touched because of the key
10:23:30 Even if it does, is this a problem?
10:24:00 will it prevent the original owner from sshing into the VM?
10:24:16 no
10:24:43 because even if it creates a new keypair, the keypair will be created with the original public_key
10:25:14 ah, i see. so you mean the new keypair will be appended to the existing pub key?
10:25:31 s/new keypair/new pub key
10:26:34 no, I need to check if there actually is a new keypair.
10:26:37 even if there is
10:27:12 It will be a new resource in nova created with the same data: the string with the public_key of the original owner
10:28:16 i think you'd better propose a patch and we continue the discussion there
10:28:20 ricolin: are you here?
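The keypair-in-the-stack idea discussed above can be sketched as a Heat template snippet like the one below. This is only a rough sketch of the approach in story 2002648 as described in the discussion, not the actual patch: the parameter and resource names (cluster_public_key, cluster_keypair, example_server) and the image/flavor values are illustrative. The point is that the stack owns an OS::Nova::KeyPair built from the original owner's public_key string, so a stack update run by another user in the project (or by an admin) reuses the same key material and the servers are not rebuilt.

    heat_template_version: 2016-10-14

    parameters:
      # public_key string copied from the original owner's nova keypair
      cluster_public_key:
        type: string

    resources:
      # keypair owned by the stack: created from the owner's public key,
      # so later updates by other users keep the same key on the servers
      cluster_keypair:
        type: OS::Nova::KeyPair
        properties:
          name:
            list_join: ['-', [{get_param: 'OS::stack_name'}, 'keypair']]
          public_key: {get_param: cluster_public_key}

      # minimal server just to show the wiring; image/flavor are placeholders
      example_server:
        type: OS::Nova::Server
        properties:
          image: fedora-atomic
          flavor: m1.small
          # get_resource on OS::Nova::KeyPair resolves to the keypair name
          key_name: {get_resource: cluster_keypair}

Whether or not nova ends up with a second keypair record (the open question at 10:26:34), the key material passed to the servers is the same string, which is what keeps the stack update from touching the vms.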
10:28:36 flwang1: ok
10:28:44 flwang1: I see that you hesitate though
10:29:13 the next bug I encountered
10:29:26 was again for admins, but for swarm
10:29:27 strigazi1: not really, i just need some time to understand ;)
10:29:39 flwang1: :)
10:30:12 i have no idea about swarm, but i'm listening
10:30:20 in kubernetes, the Resource Groups have a removal policy
10:30:26 flwang1: it is heat again
10:31:09 which means that if you pass the ids of some resources or IPs it will remove them
10:31:13 see here:
10:32:01 http://git.openstack.org/cgit/openstack/magnum/tree/magnum/drivers/k8s_fedora_atomic_v1/templates/kubecluster.yaml#n703
10:32:24 flwang1: slunkad this is good to know in our kubernetes drivers ^^
10:33:02 so when there is a bad node or a node that failed to deploy, with this trick we can remove it
10:33:27 for us (CERN), when we have a large cluster, we scale in steps
10:34:04 and when we hit UPDATE_FAILED, we fix the stack and continue
10:34:27 manually?
10:34:27 we need the same for swarm
10:34:37 yes
10:34:41 ok
10:34:50 manually with the APIs
10:35:01 but i can see this could be a part of the auto healing feature
10:35:04 it is not that manual
10:35:11 i see
10:35:16 I was typing this
10:35:33 then for healing, auto or not
10:35:42 right
10:35:43 this is how it is going to happen
10:35:57 with removal_policies
10:36:07 cool
10:36:14 it would be nice to have
10:37:08 strigazi1: what does that do exactly?
10:38:10 slunkad: if you do: openstack stack update --existing --parameter minions_to_remove=124,201
10:38:35 it will remove from the resource group the two nested stacks with ids 124 and 201
10:38:45 instead of ids you can use private ips
10:39:15 openstack stack update --existing --parameter minions_to_remove=10.0.0.4,10.0.0.123
10:39:28 makes sense?
10:39:41 yes, and the swarm driver does not have this?
10:39:57 https://docs.openstack.org/heat/latest/template_guide/openstack.html#OS::Heat::ResourceGroup-prop-removal_policies
10:40:09 slunkad: it doesn't
10:40:28 ok, good to know then, will keep this in mind for the opensuse driver
10:40:41 do you have a bug filed for this?
10:40:57 yes
10:41:32 #link https://storyboard.openstack.org/#!/story/2002677
10:41:51 ok, next, flwang1 did you see my comment on multimaster?
10:42:09 #link https://review.openstack.org/#/c/576029
10:42:17 strigazi: yep, i saw that. but i don't think it's related to the sync keys patch
10:42:33 it may be another corner case of the multi master race condition issue
10:43:01 flwang1: not exactly, but still the deployment doesn't work
10:43:25 flwang1: we can merge yours, but multi master still won't be functional
10:44:04 i see. i will give the multi master deployment another try. but based on my testing, on an openlab machine with 256G ram, my fix just works
10:44:24 i will do another round of testing for the multi master case and get back to you
10:44:35 openlab?
10:44:57 forget that :)
10:45:33 I don't know why it works
10:45:47 for it worked once
10:45:51 for me it worked once
10:45:57 interesting, i will test it again and let you know
10:45:58 out of 30
10:46:07 i tested it several times and no fail
10:46:11 happy to +2
10:47:00 it would be nice because I think it's a different issue from the service account key issue
10:47:21 i can open another story/task to track your issue
10:48:18 ok
10:49:04 that's it from me
10:49:35 my turn now?
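For reference, this is roughly what the relevant part of the kubecluster.yaml linked at 10:32:01 looks like; a trimmed sketch that keeps only the pieces needed for removal_policies (the real template passes many more properties to kubeminion.yaml):

    parameters:
      number_of_minions:
        type: number
        default: 1
      minions_to_remove:
        type: comma_delimited_list
        default: []
        description: >
          group members to remove on the next update, referenced either
          by their resource id/index or by the node's private IP

    resources:
      kube_minions:
        type: OS::Heat::ResourceGroup
        properties:
          count: {get_param: number_of_minions}
          # members listed here are the ones removed when the group
          # is updated and resources have to be deleted
          removal_policies:
            [{resource_list: {get_param: minions_to_remove}}]
          resource_def:
            type: kubeminion.yaml

Dropping a broken node is then the stack update quoted above (openstack stack update --existing --parameter minions_to_remove=10.0.0.4,10.0.0.123). The swarm driver templates lack the equivalent parameter and policy, which is what story 2002677 tracks.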
10:49:47 yes
10:49:51 cool
10:50:42 last week, i mainly focused on the magnum support in gophercloud and shade/openstacksdk, because catalyst cloud would like to enable users to use terraform and ansible to talk to the magnum api
10:51:32 i'm still working on the k8s-keystone-auth integration, will submit a testing image to openstackmagnum soon
10:52:10 for auto healing, my patch https://review.openstack.org/#/c/570818/ is still waiting for review
10:52:55 for this cycle, i probably won't take on new stuff, just finish the things on my plate
10:53:28 strigazi1: i'm keen to know the upgrade and auto healing status from you and ricardo
10:53:40 flwang1: thanks, the patch looks good
10:53:57 just point to the story instead of the blueprint
10:54:09 strigazi1: ok, will update
10:55:21 flwang1: I'm trying to finish my part. I sent out a patch for the api 10 days ago which I'm basing the rest on
10:55:55 strigazi1: is the current one ready for testing?
10:56:14 i'm worried about whether we can finish them in this cycle
10:56:55 I hope we'll discuss it on Thursday
10:57:10 will we have ricardo?
10:57:19 I don't know
10:57:36 strigazi1: fair enough
10:57:46 if so, let's focus on the upgrade one
10:57:57 let me know if there is anything i can help with
10:58:22 thanks
10:58:58 thanks for joining the meeting flwang1 slunkad
10:59:24 #endmeeting
10:59:55 thank you, strigazi1
11:00:01 #endmeeting