20:59:26 <strigazi> #startmeeting containers
20:59:27 <openstack> Meeting started Tue Feb 26 20:59:26 2019 UTC and is due to finish in 60 minutes. The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:59:28 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:59:30 <openstack> The meeting name has been set to 'containers'
20:59:34 <strigazi> #topic Roll Call
20:59:36 <strigazi> o/
20:59:40 <schaney> o/
20:59:43 <jakeyip> o/
20:59:43 <brtknr> o/
21:00:00 <flwang> o/
21:00:43 <strigazi> #topic Stories/Tasks
21:00:55 <strigazi> 1. openstack autoscaler
21:01:16 <strigazi> #link https://github.com/kubernetes/autoscaler/pull/1690
21:01:28 <strigazi> by far the highest number of comments in the repo
21:01:55 <schaney> was glad to get a chance to review, any thoughts on how far off a merge is?
21:02:14 <flwang> schaney: are you Scott?
21:02:18 <schaney> will we want to wait on the magnum resize API?
21:02:24 <schaney> flwang: yes, that's me
21:02:28 <strigazi> I think it is close, if we show agreement to them
21:02:31 <flwang> schaney: welcome, glad you could join us
21:02:39 <strigazi> from the openstack and magnum side
21:02:41 <schaney> :)
21:02:58 <strigazi> From the openstack PoV it should be ok,
21:03:04 <flwang> schaney: we also need work in gophercloud for the new api, so i'm not sure if we can wait
21:03:13 <flwang> strigazi: ^
21:03:20 <strigazi> since they (the CA maintainers) are ok to have two
21:03:52 <strigazi> What difference does it make to us?
21:03:53 <flwang> strigazi: yep, we can get the current one in, and propose the new magnum_manager
21:03:56 <schaney> any thoughts on moving away from the API polling once the magnum implementation is complete?
21:04:03 <flwang> no difference for Magnum
21:04:08 <strigazi> if we agree on the design, implementation and direction
21:04:25 <strigazi> schaney: we can do that too
21:04:46 <schaney> awesome
21:05:06 <strigazi> we can leave that as a third step?
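[The three-step plan discussed above can be condensed into a sketch. This is a hypothetical Python stand-in, not the real Go interfaces in kubernetes/autoscaler: the cluster-autoscaler expects each node group to expose roughly this contract, which the magnum provider in PR #1690 fills in by polling the Magnum/Heat APIs, and could later implement via a native resize endpoint instead.]

```python
class NodeGroup:
    """Minimal, illustrative stand-in for a CA node group backed by a
    Magnum cluster. Names and structure are the editor's sketch, not
    the upstream API."""

    def __init__(self, cluster_id, min_size, max_size, current_size):
        self.cluster_id = cluster_id
        self.min_size = min_size
        self.max_size = max_size
        self._size = current_size

    def target_size(self):
        return self._size

    def increase_size(self, delta):
        if self._size + delta > self.max_size:
            raise ValueError("would exceed max_size")
        # real provider: update the cluster (today via polling/Heat,
        # later via the planned magnum resize API)
        self._size += delta

    def delete_nodes(self, node_ids):
        if self._size - len(node_ids) < self.min_size:
            raise ValueError("would go below min_size")
        # real provider: hand the nova server uuids to heat's
        # removal_policies (see the ref_map discussion below)
        self._size -= len(node_ids)
```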
21:05:07 <flwang> strigazi: can we just rename the current pr to openstack_magnum_manager.go
21:05:14 <strigazi> First, merge it as it is now
21:05:20 <flwang> and refactor it once we have the new api
21:05:23 <strigazi> 2nd, add the resize api
21:05:33 <strigazi> and then remove polling
21:06:02 <strigazi> schaney: makes sense?
21:06:22 <strigazi> all this would happen in this cycle
21:06:22 <schaney> sounds good, it will be easier to start tackling specific areas once it's out there
21:06:30 <flwang> FWIW, i don't mind getting the current PR in as is, and as soon as the new /resize api is ready we can decide what to do in CA
21:07:14 <strigazi> if there are no more objections to the current implementation, we can push the CA team to merge
21:08:07 <brtknr> it's not so much an objection, but how will the cluster API stuff affect this?
21:08:12 <strigazi> Is the ip vs id vs uuid thing clear?
21:08:22 <strigazi> brtknr: not at all
21:08:38 <strigazi> the cluster api will be very different
21:08:48 <schaney> I am not actually clear on how your templates create the IP mapping
21:08:54 <strigazi> like google has two implementations, one for gce and one for gke
21:09:10 <flwang> schaney: are you talking about this? https://review.openstack.org/639053
21:09:37 <strigazi> schaney: flwang: bear with me for the explanation, also this change needs more comments in the commit message ^^
21:10:02 <strigazi> in heat, a resource group creates a stack with depth two
21:10:33 <strigazi> the first nested stack, kube_minions, has a ref_map output
21:10:39 <strigazi> which goes like this:
21:10:45 <strigazi> 0: <smth>
21:10:51 <strigazi> 1: <smth>
21:10:52 <strigazi> and so on
21:11:29 <strigazi> These indices are the minion-INDEX numbers
21:11:48 <strigazi> and the indices in the ResourceGroup
21:12:12 <strigazi> A RG supports removal_policies
21:12:49 <strigazi> which means you can pass a list of indices as a param, and heat will remove those resources from the RG
21:13:12 <brtknr> I am not clear on what is using the change made in https://review.openstack.org/639053 atm
21:13:26 <strigazi> additionally, heat will track which indices have been removed and won't create them again
21:13:31 <strigazi> brtknr: bear with me
21:13:40 <strigazi> so,
21:14:00 <strigazi> in the first implementation of removal policies in the k8s templates
21:14:27 <strigazi> the IP was used as an id in this list:
21:14:36 <strigazi> 0: private-ip-1
21:14:42 <strigazi> 1: private-ip-2
21:14:55 <strigazi> (or zero based :))
21:15:25 <strigazi> then it was changed with this commit:
21:15:54 <strigazi> https://github.com/openstack/magnum/commit/3ca2eb30369a00240a92c254c95bea6c7a60fee1
21:16:07 <strigazi> and the ref_map became like this:
21:16:26 <strigazi> 0: stack_id_of_nested_stack_0_depth_2
21:16:32 <strigazi> 1: stack_id_of_nested_stack_1_depth_2
21:16:59 <strigazi> and the above patch broke the removal policy of the resource group
21:17:23 <strigazi> meaning, if you passed a list of ips to the removal policy after the above patch
21:17:50 <strigazi> heat wouldn't understand which index in the RG that ip belonged to
21:18:01 <strigazi> that is why it didn't work for flwang and schaney
21:18:14 <schaney> gotcha
21:18:30 <strigazi> flwang now proposes a change
21:18:37 <strigazi> to make the ref_map:
21:18:51 <strigazi> 0: nova_server_uuid_0
21:18:55 <strigazi> 1: nova_server_uuid_1
21:19:11 <strigazi> you can inspect this map in a current cluster like this:
21:19:43 <colin-> sorry i'm late
21:19:48 <strigazi> openstack stack list --nested | grep <parent_stack_name> | grep kube_minions
21:20:03 <strigazi> and then show the nested stack of depth 1
21:20:14 <strigazi> you will see the ref_map
21:20:15 <strigazi> eg:
21:21:41 <brtknr> `openstack stack list --nested` is a nice trick!
21:21:43 <brtknr> til
21:22:04 <strigazi> http://paste.openstack.org/show/746304/
21:22:22 <strigazi> this is with the IP ^^
21:22:23 <brtknr> i've always done `openstack stack resource list k8s-stack --nested-depth=4`
21:23:17 <eandersson_> o/
21:23:23 <strigazi> http://paste.openstack.org/show/746305/
21:23:40 <strigazi> this is with the stack_id
21:23:49 <imdigitaljim> \o
21:23:51 <strigazi> check uuid b4e8a1ec-0b76-48cb-b486-2a5459ea45d4
21:24:01 <strigazi> in the ref_map and in the list of stacks
21:24:24 <imdigitaljim> i like the new change to uuid =)
21:24:54 <flwang> imdigitaljim: yep, uuid is more reliable than ip in some cases
21:24:55 <strigazi> after said change, we will see the nova uuid there
21:25:24 <strigazi> so, in heat we can pass either the server uuid or the index
21:25:36 <strigazi> then heat will store the removed ids here:
21:26:24 <strigazi> http://paste.openstack.org/show/746306/
21:26:33 <strigazi> makes sense?
21:27:36 <jakeyip> sounds good to me
21:27:52 <schaney> yep! the confusion on my end was the "output" relationship to the removal policy member
21:28:30 <schaney> and the nested stack vs the resource representing the stack
21:28:37 <schaney> makes sense now though
21:29:03 <strigazi> I spent a full morning with thomas on this
21:29:51 <brtknr> do you need https://review.openstack.org/639053 for resize to work?
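[strigazi's ref_map walkthrough above can be condensed into a toy model. The data below is illustrative, not from a real cluster (only the one uuid quoted in the log is reused); after the proposed change, the kube_minions ResourceGroup ref_map maps RG index to nova server uuid, so finding the member to remove is a scan of one map.]

```python
# Toy model of the kube_minions ref_map after the uuid change
# (https://review.openstack.org/639053): RG index -> nova server uuid.
ref_map = {
    "0": "b4e8a1ec-0b76-48cb-b486-2a5459ea45d4",
    "1": "0d2f1d3a-5f6e-4c7b-9a8d-1e2f3a4b5c6d",  # made-up uuid
}

def index_for_server(ref_map, nova_uuid):
    """Reverse lookup: which ResourceGroup index owns this nova server?

    Removal needs exactly this - heat's removal_policies can then be
    fed the index (or, with the uuid-based map, the uuid directly).
    """
    for index, uuid in ref_map.items():
        if uuid == nova_uuid:
            return index
    raise KeyError(nova_uuid)
```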
21:29:59 <strigazi> brtknr: yes
21:30:12 <strigazi> brtknr: to work by giving nova uuids
21:30:26 <brtknr> as it doesn't seem linked on gerrit as a dependency
21:30:29 <jakeyip> in https://github.com/openstack/magnum/commit/3ca2eb30369a00240a92c254c95bea6c7a60fee1 the name for the key is OS::stack_id — does that need to change, or will it be confusing if we use it for something else?
21:31:01 <strigazi> jakeyip: i don't think we have an option there
21:31:13 <strigazi> it needs to be explained well
21:31:15 <brtknr> yes, probably better to call it nova_uuid?
21:31:26 <flwang> brtknr: because i assume https://review.openstack.org/639053 will be merged very soon
21:31:27 <brtknr> or OS::nova_uuid?
21:31:38 <flwang> but the resize patch may take a bit longer, sorry for the confusion
21:31:46 <strigazi> brtknr: nova_uuid or another name doesn't work
21:32:04 <strigazi> not sure if OS::nova_uuid makes sense to heat
21:32:09 <strigazi> (to me it does)
21:32:39 <brtknr> oh okay! i didn't realise it was a component of heat
21:33:04 <strigazi> brtknr: needs to be double checked
21:33:29 <strigazi> the important part is that the ref_map I mentioned before has the correct content
21:33:43 <brtknr> https://docs.openstack.org/heat/rocky/template_guide/composition.html
21:33:53 <brtknr> sounds like we're stuck with OS::stack_id
21:34:29 <strigazi> yeap
21:34:36 <flwang> we should move on and discuss details on the patch
21:34:46 <strigazi> with comments in the code, it should be ok
21:35:18 <flwang> strigazi: i will update the patch based on the above discussion
21:35:24 <strigazi> schaney and colleagues, flwang, brtknr: we have agreement, right?
21:35:43 <brtknr> +1
21:35:53 <strigazi> imdigitaljim: colin- eandersson_ ^^
21:36:18 <colin-> on the UUID portion?
21:36:23 <schaney> yeah, UUID will work well for us
21:36:24 <strigazi> yes
21:36:26 <colin-> lgtm
21:36:29 <imdigitaljim> thanks for the clarity, uuid looks good and works for us
21:36:51 <strigazi> \o/
21:37:15 <jakeyip> I don't have objections, but I would like to read the patch first; I am a bit confused whether stack_id is the same as nova_uuid, or whether you can get one from the other
21:37:33 <strigazi> jakeyip: they are different
21:38:04 <strigazi> but that stack_id logically corresponds to that nova_uuid
21:38:19 <jakeyip> if we can derive nova_uuid from stack_id, should we do that instead?
21:38:25 <eandersson_> sounds good
21:39:06 <strigazi> jakeyip: well, it is the other way round
21:39:20 <strigazi> jakeyip: derive the stack_id from the nova_uuid
21:39:36 <imdigitaljim> well, the stack contains the nova server
21:39:48 <imdigitaljim> so it makes sense to use stack anyway
21:39:52 <strigazi> jakeyip: the CA or the user will know which server they want to remove
21:40:00 <jakeyip> I thought the stack will have a nova_server with the uuid
21:40:25 <strigazi> imdigitaljim: that is correct, but the user or the CA will know the uuid
21:40:50 <imdigitaljim> you mean the nova uuid?
21:40:54 <imdigitaljim> strigazi: ^
21:40:56 <strigazi> yes
21:41:30 <strigazi> eg when you do kubectl describe node <foo> you see the nova_uuid of the server
21:42:03 <imdigitaljim> yeah, but it's in the autoscaler, i'm missing why the user's knowledge even matters
21:42:14 <strigazi> jakeyip: also, to be clear, the nova_uuid won't replace the stack uuid, the stack will still have its uuid
21:42:47 <imdigitaljim> either way, i'm happy with the approach
21:42:50 <jakeyip> yes. but can whichever code just look for the stack_id and the OS::Nova::Server uuid of that stack?
21:42:51 <imdigitaljim> good choices
21:43:05 <strigazi> imdigitaljim: for example, for the resize API, there are cases where a user wants to get rid of a specific node
21:43:23 <brtknr> oh yeah, you're right, under `ProviderID: openstack:///231ba791-91ec-4540-a580-3ef493e36055`
21:43:23 <imdigitaljim> ah, fair point
21:43:25 <imdigitaljim> good call
21:44:37 <strigazi> jakeyip: can you imagine the user frustration? additionally, in the autoscaler
21:44:55 <strigazi> the CA wants to remove the nova server with uuid A.
21:45:45 <strigazi> then the CA needs to call heat and show all nested stacks to find which stack this server belongs to
21:46:05 <schaney> it saves a gnarly reverse lookup =)
21:46:15 <colin-> certificate authority?
21:46:24 <jakeyip> that's just code? I am more worried about hijacking a variable that used to mean one thing and making it mean another
21:46:25 <colin-> sorry, not following the thread of convo
21:46:28 <brtknr> colin-: cluster autoscaler?
21:46:29 <strigazi> maybe it can be done with a single list? or with an extra map we maintain as a stack output
21:46:32 <colin-> oh
21:46:42 <colin-> that might get tricky :)
21:46:49 <imdigitaljim> CAS maybe haha
21:46:52 <strigazi> colin-: I got used to it xD
21:47:23 <flwang> should we move to the next topic?
21:47:51 <flwang> strigazi: i'd like to know about the rolling upgrade work and the design of the resize/upgrade api
21:48:08 <imdigitaljim> could you include both server_id and stack_id in the output and use that as a reference point?
21:48:09 <imdigitaljim> is that a thing?
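[The "gnarly reverse lookup" schaney mentions can be sketched concretely. With stack ids as the ref_map values (the state before https://review.openstack.org/639053), mapping a nova server uuid back to its ResourceGroup index means inspecting every nested stack, one heat API call per RG member; all data below is hypothetical.]

```python
# Pre-change ref_map: RG index -> nested stack id (hypothetical values).
ref_map_stack_ids = {"0": "stack-aaa", "1": "stack-bbb"}

def server_of_stack(stack_id):
    # stands in for one `openstack stack resource list <stack_id>` call
    # that reveals the OS::Nova::Server inside the nested stack
    return {"stack-aaa": "nova-uuid-1", "stack-bbb": "nova-uuid-2"}[stack_id]

def index_via_stacks(nova_uuid):
    """One heat API call per RG member, versus zero extra calls when
    the ref_map stores nova uuids directly."""
    for index, stack_id in ref_map_stack_ids.items():
        if server_of_stack(stack_id) == nova_uuid:
            return index
    raise KeyError(nova_uuid)
```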
21:48:20 <strigazi> imdigitaljim: I don't think so
21:48:30 <jakeyip> agree with flwang, maybe we discuss this straight after the meeting
21:48:37 <flwang> i'm thinking if we should use POST instead of PATCH for actions, and if we should follow the actions api design of nova/cinder/etc
21:48:39 <imdigitaljim> would be interesting to test
21:48:47 <imdigitaljim> if the heat update can target either or
21:48:58 <imdigitaljim> just like if ip AND stack_id are present
21:49:14 <imdigitaljim> because then it would be a trivial problem
21:49:32 <strigazi> it is a map, so I don't think so
21:50:11 <strigazi> flwang: what do you mean?
21:50:41 <jakeyip> flwang: is there a review with this topic?
21:50:53 <imdigitaljim> strigazi: is the key it looks for OS::stack_id?
21:51:00 <strigazi> resize != upgrade
21:51:01 <imdigitaljim> (i'm new to some of this convo)
21:51:06 <brtknr> https://storyboard.openstack.org/#!/story/2005054
21:51:42 <strigazi> imdigitaljim: the key is stack_id
21:51:51 <imdigitaljim> kk
21:51:53 <imdigitaljim> thanks
21:52:00 <strigazi> flwang: can you explain more the PATCH vs POST?
21:52:09 <flwang> strigazi: nova/cinder use an api like <id>/action and include the action name in the post body
21:52:32 <strigazi> flwang: we can do that
21:52:49 <flwang> so two points: 1. should we use POST instead of PATCH 2. should we follow the post body style of nova/cinder, etc
21:52:55 <strigazi> flwang: in practice, we won't see any difference
21:53:14 <strigazi> pointer on 2.?
21:53:29 <flwang> strigazi: i know, but i think we should make openstack like a building, instead of building blocks with different designs
21:53:33 <imdigitaljim> well
21:53:38 <imdigitaljim> following a more restful paradigm
21:53:45 <imdigitaljim> patch is practically the only appropriate option
21:53:45 <imdigitaljim> https://fullstack-developer.academy/restful-api-design-post-vs-put-vs-patch/
21:53:47 <flwang> strigazi: https://developer.openstack.org/api-ref/compute/?expanded=rebuild-server-rebuild-action-detail,resume-suspended-server-resume-action-detail
21:53:47 <imdigitaljim> something like this
21:54:11 <flwang> imdigitaljim: openstack does have some guidelines about api design
21:54:33 <flwang> but the thing i'm discussing is a bit different from the method
21:55:09 <jakeyip> flwang: pardon my ignorance, what is the difference between this and PATCH at https://developer.openstack.org/api-ref/container-infrastructure-management/?expanded=update-information-of-cluster-detail#update-information-of-cluster
21:55:57 <flwang> jakeyip: we're going to add a new api, <cluster id>/actions, for upgrade and resize
21:56:00 <strigazi> imdigitaljim: flwang: I agree with flwang, we can follow a pattern similar to other projects.
21:56:20 <imdigitaljim> i also agree with following similar patterns as other projects
21:56:26 <imdigitaljim> just making sure we understand them =)
21:56:53 <flwang> imdigitaljim: thanks, and yes, we're aware of the http method differences
21:57:04 <jakeyip> flwang: for resize it will be something in addition to the original PATCH function?
21:57:21 <flwang> and here, upgrade/resize are really not a normal update of the resource (the cluster here)
21:57:23 <strigazi> personally I prefer patch, but for the data model we have, there is no real difference, at least IMO
21:57:25 <brtknr> flwang: although nova seems to use PUT for update rather than PATCH or POST
21:57:47 <flwang> in both the resize and upgrade cases, we're doing node replacement, delete, add new, etc
21:58:03 <flwang> brtknr: yep, but that's a historical issue i think
21:58:31 <strigazi> brtknr: also, put is used for properties/metadata only
21:58:50 <flwang> when we say PATCH, it's mostly a normal partial update of the resource
21:59:02 <flwang> but those actions are really beyond that
21:59:41 <strigazi> I might add that they are "to infinity and beyond"
21:59:56 <flwang> strigazi: haha, buzz lightyear fans here
22:00:57 <brtknr> hmm, i'd vote for PATCH but there is not much precedent in other openstack projects... i wonder why
22:00:58 <jakeyip> I feel POST is good. PUT/PATCH is more restrictive. It's much easier to refactor POST into PATCH/PUT later if it makes sense, but not the other way round
22:01:28 <jakeyip> since we don't have a concrete idea of how it is going to look, let us get on with POST for now
22:01:39 <flwang> yep, we can discuss on the patch
22:01:53 <flwang> we're running out of time
22:01:57 <flwang> strigazi: ^
22:02:04 <strigazi> yes,
22:02:29 <strigazi> just very quickly, brtknr: can you mention the kubelet/node-ip thing?
22:02:48 <imdigitaljim> post makes sense for these scaling operations
22:02:55 <imdigitaljim> but maybe patch if versions are updated or anything?
22:03:22 <brtknr> strigazi: yes, it's been bugging me for weeks, my minion InternalIP keeps flipping between the ip addresses it has been assigned on 3 different interfaces...
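[For concreteness, here is a sketch of what a request in the nova/cinder "actions" style flwang describes might look like. All field names and the cluster id are illustrative only; the real Magnum resize API was still under review (https://review.openstack.org/638572) at the time of this meeting.]

```python
import json

# Hypothetical POST body for <cluster id>/actions, mirroring the
# nova/cinder pattern of one top-level key naming the action.
cluster_id = "11111111-2222-3333-4444-555555555555"  # made-up id
action = {
    "resize": {
        "node_count": 5,
        # specific nodes to drop first, by nova server uuid
        "nodes_to_remove": ["b4e8a1ec-0b76-48cb-b486-2a5459ea45d4"],
    }
}
# POST rather than PATCH: the action is an operation performed on the
# cluster, not a partial update of the cluster's representation.
url = "/v1/clusters/%s/actions" % cluster_id
body = json.dumps(action)
```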
22:03:46 <strigazi> I think we can drop the node-ip, since the certificate has only one ip
22:03:56 <brtknr> I have a special setup where each node has 3 interfaces, 1 for the provider network and 1 high throughput and 1 high latency
22:04:19 <brtknr> however, assigning node-ip is not working
22:04:25 <colin-> whose poor app gets the high latency card XD?
22:04:41 <brtknr> colin-: low latency :P
22:04:43 <colin-> sorry, understand we're short on time
22:05:12 <brtknr> i have applied the --node-ip arg to kubelet and the ip doesn't stick, the ips still keep changing
22:05:47 <brtknr> the consequence of this is that pods running on those minions become unavailable for the duration that the ip is on a different interface
22:06:22 <brtknr> my temporary workaround is that the order that kube-apiserver resolves hosts is Hostname,InternalIP,ExternalIP
22:06:28 <strigazi> brtknr: I thought it might be simpler :) we can discuss it tmr or in storyboard/mailing list?
22:06:35 <imdigitaljim> random question
22:06:36 <brtknr> It was InternalIP,Hostname,ExternalIP
22:06:39 <imdigitaljim> https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
22:06:48 <imdigitaljim> --address 0.0.0.0
22:06:53 <imdigitaljim> do you bind a specific address?
22:07:06 <brtknr> imdigitaljim: yes, i bound it to the node IP
22:07:10 <imdigitaljim> gotcha
22:07:15 <brtknr> it was already bound to the node ip
22:07:19 <brtknr> by default
22:07:21 <imdigitaljim> just curious how that is all done with multi-interface
22:07:50 <colin-> personally curious how kube-proxy or similar would handle such a setup and rule/translation enforcement etc
22:07:54 <brtknr> is there any reason why we can't do Hostname,InternalIP,ExternalIP ordering by default?
22:07:58 <imdigitaljim> https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/
22:08:01 <imdigitaljim> same with kube-proxy?
22:08:08 <imdigitaljim> do you do stuff differently here?
22:09:06 <brtknr> I haven't touched kube-proxy settings because I couldn't find it
22:09:30 <openstackgerrit> Feilong Wang proposed openstack/magnum master: [WIP] Support <cluster>/actions/resize API https://review.openstack.org/638572
22:09:30 <imdigitaljim> --bind-address 0.0.0.0 Default: 0.0.0.0
22:09:32 <imdigitaljim> maybe?
22:09:32 <strigazi> brtknr: /etc/kubernetes/proxy
22:09:36 <imdigitaljim> check this out?
22:09:54 <strigazi> in magnum it has the default
22:10:01 <imdigitaljim> which is all interfaces?
22:10:04 <strigazi> yes
22:10:07 <imdigitaljim> shouldn't it be node only here?
22:10:15 <strigazi> for proxy?
22:10:30 <imdigitaljim> i guess given what he's doing with his interfaces
22:10:41 <brtknr> oh okay, i'll try adding --bind-address=NODE_IP
22:10:55 <brtknr> to /etc/kubernetes/proxy
22:11:12 <imdigitaljim> i'm just curious, i don't have a solution
22:11:17 <colin-> failing that, i'd try imdigitaljim's suggestion of wildcarding it
22:11:20 <imdigitaljim> but maybe worth a shot
22:11:20 <colin-> just for troubleshooting
22:11:33 <brtknr> wildcarding?
22:11:34 <colin-> oh, that may be the default, my mistake
22:11:50 <colin-> 0.0.0.0/0
22:12:35 <brtknr> colin-: how would that help?
22:12:46 <brtknr> according to the docs, 0.0.0.0 is already the default
22:12:48 <brtknr> https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/
22:12:58 <brtknr> for --bind-address
22:12:59 <strigazi> brtknr: colin-: imdigitaljim: let's end the meeting and just continue?
22:13:15 <brtknr> sure, this is not going to be resolved very easily :)
22:13:30 <strigazi> thanks
22:13:43 <imdigitaljim> yeah, i'm just throwing out ideas
22:13:50 <imdigitaljim> maybe a few things to think about/try
22:14:16 <imdigitaljim> maybe to get brtknr unstuck
22:14:42 <strigazi> flwang: brtknr jakeyip schaney colin- eandersson imdigitaljim: thanks for joining and for the discussion on the autoscaler.
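[The flapping InternalIP brtknr describes comes from the node carrying addresses on three interfaces. A small sketch (CIDRs and addresses are hypothetical) of the selection that kubelet's --node-ip flag is meant to pin down: always advertise the one address on the provider network.]

```python
import ipaddress

def pick_node_ip(interface_ips, provider_cidr):
    """Return the address that lives on the provider network, i.e. the
    value you would pass to kubelet's --node-ip so the InternalIP stops
    flipping between interfaces."""
    net = ipaddress.ip_network(provider_cidr)
    for ip in interface_ips:
        if ipaddress.ip_address(ip) in net:
            return ip
    raise ValueError("no interface on the provider network")
```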
22:14:54 <imdigitaljim> yeah, thanks for clearing up some stuff =)
22:14:58 <imdigitaljim> looking forward to the merge
22:15:11 <strigazi> :)
22:15:19 <strigazi> #endmeeting