20:59:26 <strigazi> #startmeeting containers
20:59:27 <openstack> Meeting started Tue Feb 26 20:59:26 2019 UTC and is due to finish in 60 minutes. The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:59:28 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:59:30 <openstack> The meeting name has been set to 'containers'
20:59:34 <strigazi> #topic Roll Call
20:59:36 <strigazi> o/
20:59:40 <schaney> o/
20:59:43 <jakeyip> o/
20:59:43 <brtknr> o/
21:00:00 <flwang> o/
21:00:43 <strigazi> #topic Stories/Tasks
21:00:55 <strigazi> 1. openstack autoscaler
21:01:16 <strigazi> #link https://github.com/kubernetes/autoscaler/pull/1690
21:01:28 <strigazi> by far the highest number of comments in the repo
21:01:55 <schaney> was glad to get a chance to review, any thoughts on how far off a merge is?
21:02:14 <flwang> schaney: are you Scott?
21:02:18 <schaney> will we want to wait on the magnum resize API?
21:02:24 <schaney> flwang: yes, that's me
21:02:28 <strigazi> I think it is close, if we show agreement to them
21:02:31 <flwang> schaney: welcome, glad you could join us
21:02:39 <strigazi> from the openstack and magnum side
21:02:41 <schaney> :)
21:02:58 <strigazi> From the openstack PoV it should be ok,
21:03:04 <flwang> schaney: we also need work in gophercloud for the new api, so i'm not sure if we can wait
21:03:13 <flwang> strigazi: ^
21:03:20 <strigazi> since they (the CA maintainers) are ok to have two
21:03:52 <strigazi> What difference does it make to us?
21:03:53 <flwang> strigazi: yep, we can get the current one in, and propose the new magnum_manager
21:03:56 <schaney> any thoughts on moving away from the API polling once the magnum implementation is complete?
21:04:03 <flwang> no difference for Magnum
21:04:08 <strigazi> if we agree on the design, implementation and direction
21:04:25 <strigazi> schaney: we can do that too
21:04:46 <schaney> awesome
21:05:06 <strigazi> we can leave that as a third step?
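[The three-step plan discussed above can be condensed into a sketch. This is a hypothetical Python stand-in, not the real Go interfaces in kubernetes/autoscaler: the cluster-autoscaler expects each node group to expose roughly this contract, which the magnum provider in PR #1690 fills in by polling the Magnum/Heat APIs, and could later implement via a native resize endpoint instead.]

```python
class NodeGroup:
    """Minimal, illustrative stand-in for a CA node group backed by a
    Magnum cluster. Names and structure are the editor's sketch, not
    the upstream API."""

    def __init__(self, cluster_id, min_size, max_size, current_size):
        self.cluster_id = cluster_id
        self.min_size = min_size
        self.max_size = max_size
        self._size = current_size

    def target_size(self):
        return self._size

    def increase_size(self, delta):
        if self._size + delta > self.max_size:
            raise ValueError("would exceed max_size")
        # real provider: update the cluster (today via polling/Heat,
        # later via the planned magnum resize API)
        self._size += delta

    def delete_nodes(self, node_ids):
        if self._size - len(node_ids) < self.min_size:
            raise ValueError("would go below min_size")
        # real provider: hand the nova server uuids to heat's
        # removal_policies (see the ref_map discussion below)
        self._size -= len(node_ids)
```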
21:05:07 <flwang> strigazi: can we just rename the current pr to openstack_magnum_manager.go
21:05:14 <strigazi> First, merge it as it is now
21:05:20 <flwang> and refactor it once we have the new api
21:05:23 <strigazi> 2nd, add the resize api
21:05:33 <strigazi> and then remove polling
21:06:02 <strigazi> schaney: makes sense?
21:06:22 <strigazi> all this would happen in this cycle
21:06:22 <schaney> sounds good, it will be easier to start tackling specific areas once it's out there
21:06:30 <flwang> FWIW, i don't mind getting the current PR in as is, and as soon as the new /resize api is ready we can decide what to do in CA
21:07:14 <strigazi> if there are no more objections to the current implementation, we can push the CA team to merge
21:08:07 <brtknr> it's not so much an objection, but how will the cluster API stuff affect this?
21:08:12 <strigazi> Is the ip vs id vs uuid thing clear?
21:08:22 <strigazi> brtknr: not at all
21:08:38 <strigazi> the cluster api will be very different
21:08:48 <schaney> I am not actually clear on how your templates create the IP mapping
21:08:54 <strigazi> like google has two implementations, one for gce and one for gke
21:09:10 <flwang> schaney: are you talking about this? https://review.openstack.org/639053
21:09:37 <strigazi> schaney: flwang: bear with me for the explanation, also this change needs more comments in the commit message ^^
21:10:02 <strigazi> in heat, a resource group creates a stack with depth two
21:10:33 <strigazi> the first nested stack, kube_minions, has a ref_map output
21:10:39 <strigazi> which goes like this:
21:10:45 <strigazi> 0: <smth>
21:10:51 <strigazi> 1: <smth>
21:10:52 <strigazi> and so on
21:11:29 <strigazi> These indices are the minion-INDEX numbers
21:11:48 <strigazi> and the indices in the ResourceGroup
21:12:12 <strigazi> A RG supports removal_policies
21:12:49 <strigazi> which means you can pass a list of indices as a param, and heat will remove those resources from the RG
21:13:12 <brtknr> I am not clear on what is using the change made in https://review.openstack.org/639053 atm
21:13:26 <strigazi> additionally, heat will track which indices have been removed and won't create them again
21:13:31 <strigazi> brtknr: bear with me
21:13:40 <strigazi> so,
21:14:00 <strigazi> in the first implementation of removal policies in the k8s templates
21:14:27 <strigazi> the IP was used as an id in this list:
21:14:36 <strigazi> 0: private-ip-1
21:14:42 <strigazi> 1: private-ip-2
21:14:55 <strigazi> (or zero based :))
21:15:25 <strigazi> then it was changed with this commit:
21:15:54 <strigazi> https://github.com/openstack/magnum/commit/3ca2eb30369a00240a92c254c95bea6c7a60fee1
21:16:07 <strigazi> and the ref_map became like this:
21:16:26 <strigazi> 0: stack_id_of_nested_stack_0_depth_2
21:16:32 <strigazi> 1: stack_id_of_nested_stack_1_depth_2
21:16:59 <strigazi> and the above patch broke the removal policy of the resource group
21:17:23 <strigazi> meaning, if you passed a list of ips to the removal policy after the above patch
21:17:50 <strigazi> heat wouldn't understand which index in the RG that ip belonged to
21:18:01 <strigazi> that is why it didn't work for flwang and schaney
21:18:14 <schaney> gotcha
21:18:30 <strigazi> flwang now proposes a change
21:18:37 <strigazi> to make the ref_map:
21:18:51 <strigazi> 0: nova_server_uuid_0
21:18:55 <strigazi> 1: nova_server_uuid_1
21:19:11 <strigazi> you can inspect this map in a current cluster like this:
21:19:43 <colin-> sorry i'm late
21:19:48 <strigazi> openstack stack list --nested | grep <parent_stack_name> | grep kube_minions
21:20:03 <strigazi> and then show the nested stack of depth 1
21:20:14 <strigazi> you will see the ref_map
21:20:15 <strigazi> eg:
21:21:41 <brtknr> `openstack stack list --nested` is a nice trick!
21:21:43 <brtknr> til
21:22:04 <strigazi> http://paste.openstack.org/show/746304/
21:22:22 <strigazi> this is with the IP ^^
21:22:23 <brtknr> i've always done `openstack stack resource list k8s-stack --nested-depth=4`
21:23:17 <eandersson_> o/
21:23:23 <strigazi> http://paste.openstack.org/show/746305/
21:23:40 <strigazi> this is with the stack_id
21:23:49 <imdigitaljim> \o
21:23:51 <strigazi> check uuid b4e8a1ec-0b76-48cb-b486-2a5459ea45d4
21:24:01 <strigazi> in the ref_map and in the list of stacks
21:24:24 <imdigitaljim> i like the new change to uuid =)
21:24:54 <flwang> imdigitaljim: yep, uuid is more reliable than ip in some cases
21:24:55 <strigazi> after said change, we will see the nova uuid there
21:25:24 <strigazi> so, in heat we can pass either the server uuid or the index
21:25:36 <strigazi> then heat will store the removed ids here:
21:26:24 <strigazi> http://paste.openstack.org/show/746306/
21:26:33 <strigazi> makes sense?
21:27:36 <jakeyip> sounds good to me
21:27:52 <schaney> yep! the confusion on my end was the "output" relationship to the removal policy member
21:28:30 <schaney> and the nested stack vs the resource representing the stack
21:28:37 <schaney> makes sense now though
21:29:03 <strigazi> I spent a full morning with thomas on this
21:29:51 <brtknr> do you need https://review.openstack.org/639053 for resize to work?
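[strigazi's ref_map walkthrough above can be condensed into a toy model. The data below is illustrative, not from a real cluster (only the one uuid quoted in the log is reused); after the proposed change, the kube_minions ResourceGroup ref_map maps RG index to nova server uuid, so finding the member to remove is a scan of one map.]

```python
# Toy model of the kube_minions ref_map after the uuid change
# (https://review.openstack.org/639053): RG index -> nova server uuid.
ref_map = {
    "0": "b4e8a1ec-0b76-48cb-b486-2a5459ea45d4",
    "1": "0d2f1d3a-5f6e-4c7b-9a8d-1e2f3a4b5c6d",  # made-up uuid
}

def index_for_server(ref_map, nova_uuid):
    """Reverse lookup: which ResourceGroup index owns this nova server?

    Removal needs exactly this - heat's removal_policies can then be
    fed the index (or, with the uuid-based map, the uuid directly).
    """
    for index, uuid in ref_map.items():
        if uuid == nova_uuid:
            return index
    raise KeyError(nova_uuid)
```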
21:29:59 <strigazi> brtknr: yes
21:30:12 <strigazi> brtknr: to work by giving nova uuids
21:30:26 <brtknr> as it doesn't seem linked on gerrit as a dependency
21:30:29 <jakeyip> in https://github.com/openstack/magnum/commit/3ca2eb30369a00240a92c254c95bea6c7a60fee1 the name for the key is OS::stack_id — does that need to change, or will it be confusing if we use it for something else?
21:31:01 <strigazi> jakeyip: i don't think we have an option there
21:31:13 <strigazi> it needs to be explained well
21:31:15 <brtknr> yes, probably better to call it nova_uuid?
21:31:26 <flwang> brtknr: because i assume https://review.openstack.org/639053 will be merged very soon
21:31:27 <brtknr> or OS::nova_uuid?
21:31:38 <flwang> but the resize patch may take a bit longer, sorry for the confusion
21:31:46 <strigazi> brtknr: nova_uuid or another name doesn't work
21:32:04 <strigazi> not sure if OS::nova_uuid makes sense to heat
21:32:09 <strigazi> (to me it does)
21:32:39 <brtknr> oh okay! i didn't realise it was a component of heat
21:33:04 <strigazi> brtknr: needs to be double checked
21:33:29 <strigazi> the important part is that the ref_map I mentioned before has the correct content
21:33:43 <brtknr> https://docs.openstack.org/heat/rocky/template_guide/composition.html
21:33:53 <brtknr> sounds like we're stuck with OS::stack_id
21:34:29 <strigazi> yeap
21:34:36 <flwang> we should move on and discuss details on the patch
21:34:46 <strigazi> with comments in the code, it should be ok
21:35:18 <flwang> strigazi: i will update the patch based on the above discussion
21:35:24 <strigazi> schaney and colleagues, flwang, brtknr: we have agreement, right?
21:35:43 <brtknr> +1
21:35:53 <strigazi> imdigitaljim: colin- eandersson_ ^^
21:36:18 <colin-> on the UUID portion?
21:36:23 <schaney> yeah, UUID will work well for us
21:36:24 <strigazi> yes
21:36:26 <colin-> lgtm
21:36:29 <imdigitaljim> thanks for the clarity, uuid looks good and works for us
21:36:51 <strigazi> \o/
21:37:15 <jakeyip> I don't have objections, but I would like to read the patch first; I am a bit confused whether stack_id is the same as nova_uuid, or whether you can get one from the other
21:37:33 <strigazi> jakeyip: they are different
21:38:04 <strigazi> but that stack_id logically corresponds to that nova_uuid
21:38:19 <jakeyip> if we can derive nova_uuid from stack_id, should we do that instead?
21:38:25 <eandersson_> sounds good
21:39:06 <strigazi> jakeyip: well, it is the other way round
21:39:20 <strigazi> jakeyip: derive the stack_id from the nova_uuid
21:39:36 <imdigitaljim> well, the stack contains the nova server
21:39:48 <imdigitaljim> so it makes sense to use stack anyway
21:39:52 <strigazi> jakeyip: the CA or the user will know which server they want to remove
21:40:00 <jakeyip> I thought the stack will have a nova_server with the uuid
21:40:25 <strigazi> imdigitaljim: that is correct, but the user or the CA will know the uuid
21:40:50 <imdigitaljim> you mean the nova uuid?
21:40:54 <imdigitaljim> strigazi: ^
21:40:56 <strigazi> yes
21:41:30 <strigazi> eg when you do kubectl describe node <foo> you see the nova_uuid of the server
21:42:03 <imdigitaljim> yeah, but it's in the autoscaler, i'm missing why the user's knowledge even matters
21:42:14 <strigazi> jakeyip: also, to be clear, the nova_uuid won't replace the stack uuid, the stack will still have its uuid
21:42:47 <imdigitaljim> either way, i'm happy with the approach
21:42:50 <jakeyip> yes. but can whichever code just look for the stack_id and the OS::Nova::Server uuid of that stack?
21:42:51 <imdigitaljim> good choices
21:43:05 <strigazi> imdigitaljim: for example, for the resize API, there are cases where a user wants to get rid of a specific node
21:43:23 <brtknr> oh yeah, you're right, under `ProviderID: openstack:///231ba791-91ec-4540-a580-3ef493e36055`
21:43:23 <imdigitaljim> ah, fair point
21:43:25 <imdigitaljim> good call
21:44:37 <strigazi> jakeyip: can you imagine the user frustration? additionally, in the autoscaler
21:44:55 <strigazi> the CA wants to remove the nova server with uuid A.
21:45:45 <strigazi> then the CA needs to call heat and show all nested stacks to find which stack this server belongs to
21:46:05 <schaney> it saves a gnarly reverse lookup =)
21:46:15 <colin-> certificate authority?
21:46:24 <jakeyip> that's just code? I am more worried about hijacking a variable that used to mean one thing and making it mean another
21:46:25 <colin-> sorry, not following the thread of convo
21:46:28 <brtknr> colin-: cluster autoscaler?
21:46:29 <strigazi> maybe it can be done with a single list? or with an extra map we maintain as a stack output
21:46:32 <colin-> oh
21:46:42 <colin-> that might get tricky :)
21:46:49 <imdigitaljim> CAS maybe haha
21:46:52 <strigazi> colin-: I got used to it xD
21:47:23 <flwang> should we move to the next topic?
21:47:51 <flwang> strigazi: i'd like to know about the rolling upgrade work and the design of the resize/upgrade api
21:48:08 <imdigitaljim> could you include both server_id and stack_id in the output and use that as a reference point?
21:48:09 <imdigitaljim> is that a thing?
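[The "gnarly reverse lookup" schaney mentions can be sketched concretely. With stack ids as the ref_map values (the state before https://review.openstack.org/639053), mapping a nova server uuid back to its ResourceGroup index means inspecting every nested stack, one heat API call per RG member; all data below is hypothetical.]

```python
# Pre-change ref_map: RG index -> nested stack id (hypothetical values).
ref_map_stack_ids = {"0": "stack-aaa", "1": "stack-bbb"}

def server_of_stack(stack_id):
    # stands in for one `openstack stack resource list <stack_id>` call
    # that reveals the OS::Nova::Server inside the nested stack
    return {"stack-aaa": "nova-uuid-1", "stack-bbb": "nova-uuid-2"}[stack_id]

def index_via_stacks(nova_uuid):
    """One heat API call per RG member, versus zero extra calls when
    the ref_map stores nova uuids directly."""
    for index, stack_id in ref_map_stack_ids.items():
        if server_of_stack(stack_id) == nova_uuid:
            return index
    raise KeyError(nova_uuid)
```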
21:48:20 <strigazi> imdigitaljim: I don't think so
21:48:30 <jakeyip> agree with flwang, maybe we discuss this straight after the meeting
21:48:37 <flwang> i'm thinking if we should use POST instead of PATCH for actions, and if we should follow the actions api design of nova/cinder/etc
21:48:39 <imdigitaljim> would be interesting to test
21:48:47 <imdigitaljim> if the heat update can target either or
21:48:58 <imdigitaljim> just like if ip AND stack_id are present
21:49:14 <imdigitaljim> because then it would be a trivial problem
21:49:32 <strigazi> it is a map, so I don't think so
21:50:11 <strigazi> flwang: what do you mean?
21:50:41 <jakeyip> flwang: is there a review with this topic?
21:50:53 <imdigitaljim> strigazi: is the key it looks for OS::stack_id?
21:51:00 <strigazi> resize != upgrade
21:51:01 <imdigitaljim> (i'm new to some of this convo)
21:51:06 <brtknr> https://storyboard.openstack.org/#!/story/2005054
21:51:42 <strigazi> imdigitaljim: the key is stack_id
21:51:51 <imdigitaljim> kk
21:51:53 <imdigitaljim> thanks
21:52:00 <strigazi> flwang: can you explain more the PATCH vs POST?
21:52:09 <flwang> strigazi: nova/cinder use an api like <id>/action and include the action name in the post body
21:52:32 <strigazi> flwang: we can do that
21:52:49 <flwang> so two points: 1. should we use POST instead of PATCH 2. should we follow the post body style of nova/cinder, etc
21:52:55 <strigazi> flwang: in practice, we won't see any difference
21:53:14 <strigazi> pointer on 2.?
21:53:29 <flwang> strigazi: i know, but i think we should make openstack like a building, instead of building blocks with different designs
21:53:33 <imdigitaljim> well
21:53:38 <imdigitaljim> following a more restful paradigm
21:53:45 <imdigitaljim> patch is practically the only appropriate option
21:53:45 <imdigitaljim> https://fullstack-developer.academy/restful-api-design-post-vs-put-vs-patch/
21:53:47 <flwang> strigazi: https://developer.openstack.org/api-ref/compute/?expanded=rebuild-server-rebuild-action-detail,resume-suspended-server-resume-action-detail
21:53:47 <imdigitaljim> something like this
21:54:11 <flwang> imdigitaljim: openstack does have some guidelines about api design
21:54:33 <flwang> but the thing i'm discussing is a bit different from the method
21:55:09 <jakeyip> flwang: pardon my ignorance, what is the difference between this and PATCH at https://developer.openstack.org/api-ref/container-infrastructure-management/?expanded=update-information-of-cluster-detail#update-information-of-cluster
21:55:57 <flwang> jakeyip: we're going to add a new api, <cluster id>/actions, for upgrade and resize
21:56:00 <strigazi> imdigitaljim: flwang: I agree with flwang, we can follow a pattern similar to other projects.
21:56:20 <imdigitaljim> i also agree with following similar patterns as other projects
21:56:26 <imdigitaljim> just making sure we understand them =)
21:56:53 <flwang> imdigitaljim: thanks, and yes, we're aware of the http method differences
21:57:04 <jakeyip> flwang: for resize it will be something in addition to the original PATCH function?
21:57:21 <flwang> and here, upgrade/resize are really not a normal update of the resource (the cluster here)
21:57:23 <strigazi> personally I prefer patch, but for the data model we have, there is no real difference, at least IMO
21:57:25 <brtknr> flwang: although nova seems to use PUT for update rather than PATCH or POST
21:57:47 <flwang> in both the resize and upgrade cases, we're doing node replacement, delete, add new, etc
21:58:03 <flwang> brtknr: yep, but that's a historical issue i think
21:58:31 <strigazi> brtknr: also, put is used for properties/metadata only
21:58:50 <flwang> when we say PATCH, it's mostly a normal partial update of the resource
21:59:02 <flwang> but those actions are really beyond that
21:59:41 <strigazi> I might add that they are "to infinity and beyond"
21:59:56 <flwang> strigazi: haha, buzz lightyear fans here
22:00:57 <brtknr> hmm, i'd vote for PATCH but there is not much precedent in other openstack projects... i wonder why
22:00:58 <jakeyip> I feel POST is good. PUT/PATCH is more restrictive. It's much easier to refactor POST into PATCH/PUT later if it makes sense, but not the other way round
22:01:28 <jakeyip> since we don't have a concrete idea of how it is going to look, let us get on with POST for now
22:01:39 <flwang> yep, we can discuss on the patch
22:01:53 <flwang> we're running out of time
22:01:57 <flwang> strigazi: ^
22:02:04 <strigazi> yes,
22:02:29 <strigazi> just very quickly, brtknr: can you mention the kubelet/node-ip thing?
22:02:48 <imdigitaljim> post makes sense for these scaling operations
22:02:55 <imdigitaljim> but maybe patch if versions are updated or anything?
22:03:22 <brtknr> strigazi: yes, it's been bugging me for weeks, my minion InternalIP keeps flipping between the ip addresses it has been assigned on 3 different interfaces...
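[For concreteness, here is a sketch of what a request in the nova/cinder "actions" style flwang describes might look like. All field names and the cluster id are illustrative only; the real Magnum resize API was still under review (https://review.openstack.org/638572) at the time of this meeting.]

```python
import json

# Hypothetical POST body for <cluster id>/actions, mirroring the
# nova/cinder pattern of one top-level key naming the action.
cluster_id = "11111111-2222-3333-4444-555555555555"  # made-up id
action = {
    "resize": {
        "node_count": 5,
        # specific nodes to drop first, by nova server uuid
        "nodes_to_remove": ["b4e8a1ec-0b76-48cb-b486-2a5459ea45d4"],
    }
}
# POST rather than PATCH: the action is an operation performed on the
# cluster, not a partial update of the cluster's representation.
url = "/v1/clusters/%s/actions" % cluster_id
body = json.dumps(action)
```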
22:03:46 <strigazi> I think we can drop the node-ip, since the certificate has only one ip
22:03:56 <brtknr> I have a special setup where each node has 3 interfaces, 1 for the provider network and 1 high throughput and 1 high latency
22:04:19 <brtknr> however, assigning node-ip is not working
22:04:25 <colin-> whose poor app gets the high latency card XD?
22:04:41 <brtknr> colin-: low latency :P
22:04:43 <colin-> sorry, understand we're short on time
22:05:12 <brtknr> i have applied the --node-ip arg to kubelet and the ip doesn't stick, the ips still keep changing
22:05:47 <brtknr> the consequence of this is that pods running on those minions become unavailable for the duration that the ip is on a different interface
22:06:22 <brtknr> my temporary workaround is that the order that kube-apiserver resolves hosts is Hostname,InternalIP,ExternalIP
22:06:28 <strigazi> brtknr: I thought it might be simpler :) we can discuss it tmr or in storyboard/mailing list?
22:06:35 <imdigitaljim> random question
22:06:36 <brtknr> It was InternalIP,Hostname,ExternalIP
22:06:39 <imdigitaljim> https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
22:06:48 <imdigitaljim> --address 0.0.0.0
22:06:53 <imdigitaljim> do you bind a specific address?
22:07:06 <brtknr> imdigitaljim: yes, i bound it to the node IP
22:07:10 <imdigitaljim> gotcha
22:07:15 <brtknr> it was already bound to the node ip
22:07:19 <brtknr> by default
22:07:21 <imdigitaljim> just curious how that is all done with multi-interface
22:07:50 <colin-> personally curious how kube-proxy or similar would handle such a setup and rule/translation enforcement etc
22:07:54 <brtknr> is there any reason why we can't do Hostname,InternalIP,ExternalIP ordering by default?
22:07:58 <imdigitaljim> https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/
22:08:01 <imdigitaljim> same with kube-proxy?
22:08:08 <imdigitaljim> do you do stuff differently here?
22:09:06 <brtknr> I haven't touched kube-proxy settings because I couldn't find it
22:09:30 <openstackgerrit> Feilong Wang proposed openstack/magnum master: [WIP] Support <cluster>/actions/resize API https://review.openstack.org/638572
22:09:30 <imdigitaljim> --bind-address 0.0.0.0 Default: 0.0.0.0
22:09:32 <imdigitaljim> maybe?
22:09:32 <strigazi> brtknr: /etc/kubernetes/proxy
22:09:36 <imdigitaljim> check this out?
22:09:54 <strigazi> in magnum it has the default
22:10:01 <imdigitaljim> which is all interfaces?
22:10:04 <strigazi> yes
22:10:07 <imdigitaljim> shouldn't it be node only here?
22:10:15 <strigazi> for proxy?
22:10:30 <imdigitaljim> i guess given what he's doing with his interfaces
22:10:41 <brtknr> oh okay, i'll try adding --bind-address=NODE_IP
22:10:55 <brtknr> to /etc/kubernetes/proxy
22:11:12 <imdigitaljim> i'm just curious, i don't have a solution
22:11:17 <colin-> failing that, i'd try imdigitaljim's suggestion of wildcarding it
22:11:20 <imdigitaljim> but maybe worth a shot
22:11:20 <colin-> just for troubleshooting
22:11:33 <brtknr> wildcarding?
22:11:34 <colin-> oh, that may be the default, my mistake
22:11:50 <colin-> 0.0.0.0/0
22:12:35 <brtknr> colin-: how would that help?
22:12:46 <brtknr> according to the docs, 0.0.0.0 is already the default
22:12:48 <brtknr> https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/
22:12:58 <brtknr> for --bind-address
22:12:59 <strigazi> brtknr: colin-: imdigitaljim: let's end the meeting and just continue?
22:13:15 <brtknr> sure, this is not going to be resolved very easily :)
22:13:30 <strigazi> thanks
22:13:43 <imdigitaljim> yeah, i'm just throwing out ideas
22:13:50 <imdigitaljim> maybe a few things to think about/try
22:14:16 <imdigitaljim> maybe to get brtknr unstuck
22:14:42 <strigazi> flwang: brtknr jakeyip schaney colin- eandersson imdigitaljim: thanks for joining and for the discussion on the autoscaler.
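[The flapping InternalIP brtknr describes comes from the node carrying addresses on three interfaces. A small sketch (CIDRs and addresses are hypothetical) of the selection that kubelet's --node-ip flag is meant to pin down: always advertise the one address on the provider network.]

```python
import ipaddress

def pick_node_ip(interface_ips, provider_cidr):
    """Return the address that lives on the provider network, i.e. the
    value you would pass to kubelet's --node-ip so the InternalIP stops
    flipping between interfaces."""
    net = ipaddress.ip_network(provider_cidr)
    for ip in interface_ips:
        if ipaddress.ip_address(ip) in net:
            return ip
    raise ValueError("no interface on the provider network")
```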
22:14:54 <imdigitaljim> yeah, thanks for clearing up some stuff =)
22:14:58 <imdigitaljim> looking forward to the merge
22:15:11 <strigazi> :)
22:15:19 <strigazi> #endmeeting