09:01:00 <flwang1> #startmeeting magnum
09:01:01 <openstack> Meeting started Wed Apr  1 09:01:00 2020 UTC and is due to finish in 60 minutes.  The chair is flwang1. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:01:02 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:01:04 <openstack> The meeting name has been set to 'magnum'
09:01:09 <flwang1> #topic roll call
09:01:11 <flwang1> o/
09:01:49 <strigazi> o/
09:02:07 <flwang1> brtknr:
09:02:38 <brtknr> /
09:02:40 <brtknr> o/
09:02:43 <flwang1> cool
09:03:18 <flwang1> before going through the topics, just want to let you guys know, i will be serving as Magnum PTL for the V cycle
09:03:38 <flwang1> thank you guys for your support in the last 2 cycles
09:04:20 <brtknr> thank you!
09:04:45 <strigazi> cheers
09:04:53 <flwang1> strigazi: cheers
09:05:02 <flwang1> #topic train 9.3.0
09:05:16 <flwang1> anything else we need to discuss about the 9.3.0?
09:05:38 <brtknr> not really, just that we should add release notes for important fixes
09:05:50 <brtknr> i think there were quite a few in this release
09:05:55 <ttsiouts> ο/
09:06:02 <brtknr> oh hi ttsiouts !
09:06:03 <strigazi> nothing from me
09:06:04 <flwang1> brtknr: i agree we should provide more detailed release notes
09:06:17 <strigazi> I usually -1 for reno
09:06:21 <strigazi> what did we miss?
09:06:25 <strigazi> brtknr: explain
09:06:48 <flwang1> brtknr means we need more detailed release notes, is that right?
09:07:13 <strigazi> I mean, which fixed we didn't add reno for, specifically
09:07:25 <strigazi> s/fixed/fixes/
09:07:34 <brtknr> not more detailed, but rather add them for various fixes we merged
09:07:51 <brtknr> i will need to go search
09:07:58 <brtknr> will add it to etherpad
09:08:07 <brtknr> not worth further discussion here
09:09:16 <flwang1> like this https://review.opendev.org/#/c/715661/ :) ?
09:09:41 <strigazi> 17c6034e Add fcct config for coreos user_data
09:09:51 <flwang1> https://review.opendev.org/#/c/715410/ :(
09:09:58 <brtknr> flwang1: thats right
09:10:13 <brtknr> mostly me :P
09:10:16 <flwang1> brtknr: those are your babies
09:10:19 <strigazi> 17c6034e Add fcct config for coreos user_data, if this needs a reno then every single commit needs a reno.
09:10:42 <brtknr> strigazi: more of the user facing stuff
09:10:54 <brtknr> i would disagree that 17c6034e Add fcct config for coreos user_data needs a reno
09:11:22 <strigazi> you pasted it in etherpad :)
09:11:25 <flwang1> ok, i think we're aware of the issue now and we just need to pay more attention and handle it case by case?
09:12:00 <brtknr> strigazi: i am deleting them one by one,
09:12:08 <brtknr> my brain can only work so fast
09:12:28 <strigazi> let's move on
09:13:00 <flwang1> #topic health status update
09:13:13 <strigazi> anything to discuss?
09:13:20 <strigazi> brtknr: needs to review it first
09:13:21 <flwang1> do you guys have new comments about this?
09:14:21 <strigazi> I don't
09:14:22 <brtknr> strigazi: i already added my comments once, need to retest it with the magnum_auto_healer label, havent found time, sorry
09:14:39 <flwang1> ok, let's move on then
09:14:42 <flwang1> #topic Multi AZ support for master nodes https://review.opendev.org/714347
09:14:57 <flwang1> strigazi: brtknr: did you get a chance to review this idea?
09:15:13 <flwang1> strigazi: i know cern is using multi az, does it sound right to you?
09:15:42 <strigazi> without fixing adding/removing etcd members via etcd itself (not heat), this makes little sense.
09:16:00 <strigazi> we don't use multi az for masters
09:16:41 <flwang1> i know you don't, but why is 'add/remove etcd' a blocker for this one?
09:16:43 <strigazi> multi-az without NGs sounds like overengineering to me, but we discussed this before.
09:17:17 <flwang1> this is for masters, we already have multi az support for workers based on NG
09:17:19 <strigazi> you need to remove a dead member from etcd
09:17:40 <flwang1> we can use auto healing
09:17:50 <strigazi> no
09:18:01 <flwang1> why
09:18:16 <strigazi> as you wish, guys, do it with autohealing
09:18:50 <strigazi> If it is opt in, lgtm
09:19:13 <flwang1> why don't you explain more to help us understand?
09:20:21 <flwang1> i know this is not the perfect solution, but i can see the benefits
09:20:27 <strigazi> etcd uses the raft algorithm for consensus
09:20:48 <strigazi> It starts with one master and more members are added
09:21:11 <strigazi> each member contacts all the other members and they elect a leader.
09:21:19 <flwang1> i understand your concern, you mean user may lose 1/2 etcd members in 1 az?
09:21:50 <strigazi> In magnum, we use discovery.etcd.io to advertise the IPs of all members so they can elect a leader
09:22:27 <flwang1> let's say there are 3 az and 3 master nodes or 5 master nodes
09:22:49 <flwang1> can you please help me understand why it doesn't work?
09:22:54 <strigazi> When you change the number of members, something needs to drop or add the old/new member via the etcd API/leader
09:23:13 <strigazi> ^^
09:23:49 <flwang1> why do we have to change the number of members?  when losing an AZ?
09:24:18 <strigazi> What if an etcd member is running there?
09:24:34 <strigazi> Who is going to delete it?
09:25:21 <flwang1> sorry, why do we have to delete the member? in current design, when do we have to delete a member?
09:25:22 <brtknr> strigazi: where does an etcd member need to be deleted from?
09:25:53 <strigazi> And then, when a new member is added, how will it be added? with the discovery url it is not possible, the url is used only for bootstrap
09:26:09 <strigazi> "sorry, why do we have to delete the member? in current design, when do we have to delete a member?"
09:26:16 <strigazi> never ^^ we don't change master nodes
09:26:23 <strigazi> "strigazi: where does an etcd member need to be deleted from?"
09:26:25 <strigazi> etcd ^^
09:27:17 <strigazi> https://etcd.io/docs/v3.3.12/op-guide/runtime-configuration/#remove-a-member
09:27:19 <brtknr> so the master_az_list should work with fixed number of masters but not when we have scalable masters
09:27:35 <strigazi> yes
09:27:37 <flwang1> strigazi: yep, i can see where you are coming from, but i think i'm trying to fix this based on the current design
09:27:52 <brtknr> does the leader election fail if a member is dead?
09:28:06 <flwang1> and even if we could support the resized master later, my current way should also work
09:28:07 <brtknr> or will the remaining members elect a leader and carry on?
09:28:15 <strigazi> so you plan to leave it dangling
09:28:48 <strigazi> I haven't tried, but the docs don't say: stop the member and everything is good.
09:28:53 <flwang1> strigazi: the list is automatically generated based on the new number
09:29:23 <strigazi> ok
09:29:31 <flwang1> strigazi: see https://review.opendev.org/#/c/714347/2/magnum/drivers/heat/k8s_fedora_template_def.py@225
09:29:49 <flwang1> it's not a fixed list
09:29:53 <strigazi> ok
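The idea of generating the AZ list from the master count, as flwang1 describes above, can be illustrated with a minimal sketch. This is not the code under review in 714347; the function name `spread_masters_across_azs` and the round-robin policy are assumptions made purely for illustration.

```python
from itertools import cycle, islice


def spread_masters_across_azs(master_count, availability_zones):
    """Return one AZ name per master, assigned round-robin.

    Hypothetical helper: the name and the round-robin policy are
    assumptions, not taken from the Magnum patch.
    """
    if master_count < 0:
        raise ValueError("master_count must be non-negative")
    if not availability_zones:
        raise ValueError("at least one availability zone is required")
    # cycle() repeats the AZ list forever; islice() cuts it to master_count.
    return list(islice(cycle(availability_zones), master_count))


# The list is recomputed from the current master count, so it is not fixed.
print(spread_masters_across_azs(3, ["az1", "az2", "az3"]))
print(spread_masters_across_azs(5, ["az1", "az2", "az3"]))
```

Because the list is derived from the count each time, changing the number of masters simply produces a new placement list; it does not by itself add or remove etcd members.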
09:30:21 <flwang1> are we on the same page now? do you think my current solution could work ?
09:30:29 <strigazi> I don't knwo
09:30:32 <strigazi> I don't know
09:30:48 <strigazi> let's move on, when it is complete we can test
09:31:00 <flwang1> TBH, i haven't done full testing, but the idea is there
09:31:05 <brtknr> what i dont understand is what the multi_az_list has to do with removing etcd member
09:31:05 <flwang1> ok, i will keep polishing it
09:31:21 <brtknr> seems like two separate issues
09:31:31 <flwang1> brtknr: +1
09:31:36 <strigazi> let's move on
09:31:38 <brtknr> might be missing a point here
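On brtknr's earlier question about whether the remaining members can carry on when one is dead: Raft needs a majority (quorum) of the configured members, and a dead member still counts toward the configured total until it is removed from etcd. A minimal sketch of that arithmetic follows; the helper names and the example AZ layouts are assumptions for illustration only.

```python
def quorum(member_count):
    """Majority needed by Raft for a cluster of member_count members."""
    return member_count // 2 + 1


def survives_az_loss(masters_per_az, lost_az):
    """True if the members outside lost_az still form a quorum.

    masters_per_az maps an AZ name to the number of etcd members in it.
    The lost members are still counted in the total, since nothing has
    removed them from the etcd membership.
    """
    total = sum(masters_per_az.values())
    remaining = total - masters_per_az[lost_az]
    return remaining >= quorum(total)


# 3 masters, one per AZ: losing any single AZ leaves 2 of 3 members.
print(survives_az_loss({"az1": 1, "az2": 1, "az3": 1}, "az1"))
# 3 masters over 2 AZs: losing the AZ that holds 2 leaves 1 of 3 members.
print(survives_az_loss({"az1": 2, "az2": 1}, "az1"))
```

This only covers the quorum arithmetic. It says nothing about how a replacement member would join, which is the part strigazi points out the discovery URL cannot handle after bootstrap.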
09:31:38 <flwang1> #topic Allowed CIDRs for master LB  https://review.opendev.org/#/c/715747/
09:31:57 <flwang1> here is a new feature introduced in Octavia stein release
09:32:08 <flwang1> to allow setting a CIDR list for lb
09:32:10 <brtknr> this is conditional on heat merging the equivalent change right?
09:32:18 <brtknr> i.e. its not supported in heat yet
09:32:30 <brtknr> i.e. its not supported in stable/train branch yet
09:32:35 <flwang1> it's supported in heat master branch
09:32:39 <brtknr> yep
09:32:50 <flwang1> i'm trying to cherrypick to train
09:32:55 <brtknr> ah i see
09:33:02 <flwang1> since it's a very useful feature
09:33:13 <flwang1> especially when you want to open the lb on public
09:33:18 <brtknr> i'm happy with this patch
09:33:59 <flwang1> strigazi: i need your help https://review.opendev.org/#/c/715747/1..2/magnum/drivers/common/templates/lb_api.yaml
09:34:10 <flwang1> how should we deal with this case?
09:34:37 <strigazi> this was added where?
09:34:38 <flwang1> we're using an old heat template version, but how can we use a good new feature from the latest heat version?
09:34:49 <strigazi> the feature in heat
09:35:10 <flwang1> https://review.opendev.org/#/c/715747/2/magnum/drivers/k8s_fedora_atomic_v1/templates/kubecluster.yaml
09:35:17 <brtknr> can we not upgrade the template version?
09:35:30 <strigazi> this was added where? the feature in heat
09:35:38 <flwang1> strigazi: wait a sec
09:35:50 <flwang1> https://review.opendev.org/#/c/715748/
09:35:51 <strigazi> Which version do we need to change to?
09:36:15 <strigazi> so train
09:36:23 <flwang1> heat support this https://review.opendev.org/#/c/715748/1/heat/engine/resources/openstack/octavia/listener.py@137
09:36:29 <strigazi> or ussuri
09:36:41 <flwang1> but i haven't tested if heat can automatically respect the version
09:36:47 <flwang1> yep, Ussuri now
09:36:58 <flwang1> and i'm trying to cherry pick it to train
09:37:17 <flwang1> let's assume it can't be cherrypicked
09:37:47 <flwang1> strigazi: any idea?
09:37:50 <strigazi> so change to  ussuri, then heat ussuri/train is a hard dep for magnum when API LB is required
09:38:05 <flwang1> that's my concern
09:38:26 <flwang1> that's why i'm asking you if there is a better solution
09:38:49 <flwang1> to avoid the hard dependency
09:38:51 <strigazi> Add one more template?
09:39:25 <strigazi> fyi, at CERN it is not an issue, we upgrade heat and we drop this part anyway.
09:39:56 <flwang1> yep, another template may work
09:40:00 <brtknr> drop what part?
09:40:07 <flwang1> i will think about it
09:40:12 <strigazi> Maybe it will be an issue, but we will add one more local patch
09:40:25 <flwang1> will cern be interested in this feature?
09:41:00 <brtknr> is there a way to set allowed_cidr to the loadbalancer
09:41:06 <brtknr> is there a way to set allowed_cidr to the loadbalancer directly?
09:41:24 <strigazi> we don't have LBaaS. It is a very new feature with tungsten.
09:41:43 <flwang1> brtknr: should be, but it's not handy
09:42:31 <flwang1> are you talking about setting it in python code by calling the octavia client?
09:42:42 <flwang1> brtknr: ^
09:42:48 <brtknr> we already have train heat requirement for train magnum
09:42:57 <brtknr> because of fedora coreos
09:43:01 <strigazi> Maybe we take this in gerrit? Is there anything else for the meeting?
09:43:02 <flwang1> right
09:43:08 <flwang1> nothing else
09:43:16 <flwang1> i'm good
09:43:27 <flwang1> anything else you guys want to discuss?
09:43:42 <brtknr> ttsiouts: dioguerra said you are working on the labels
09:44:18 <strigazi> we don't have a spec yet
09:44:27 <strigazi> when can discuss it then
09:44:41 <strigazi> we can discuss it then
09:44:45 <brtknr> ok sure
09:44:46 <ttsiouts> brtknr: I'm drafting the spec
09:44:59 <ttsiouts> I'll try to upload it today
09:45:12 <brtknr> ttsiouts: sounds good :)
09:47:09 <brtknr> 1 more thing
09:47:34 <brtknr> flwang1: strigazi: what are the current limitations for enabling resizable masters
09:47:44 <flwang1> etcd
09:48:12 <brtknr> is that it?
09:48:37 <brtknr> because we dont have a way of adding/removing members dynamically?
09:48:56 <strigazi> have you ever tried it?
09:49:09 <strigazi> you can use the heat api to change the number of masters
09:49:40 <brtknr> strigazi: havent tried yet
09:50:53 <flwang1> strigazi: i'm a bit confused, do you mean calling heat api directly to resize master could work?
09:51:11 <strigazi> it will work towards failure
09:51:47 <flwang1> right
09:51:51 <brtknr> strigazi: how do you resize directly using heat api?
09:51:55 <flwang1> let's end the meeting
09:52:06 <openstackgerrit> Merged openstack/magnum master: Update hacking for Python3  https://review.opendev.org/716347
09:52:09 <flwang1> we can continue the discussion
09:52:13 <flwang1> #endmeeting