09:01:00 #startmeeting magnum
09:01:01 Meeting started Wed Apr 1 09:01:00 2020 UTC and is due to finish in 60 minutes. The chair is flwang1. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:01:02 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:01:04 The meeting name has been set to 'magnum'
09:01:09 #topic roll call
09:01:11 o/
09:01:49 o/
09:02:07 brtknr:
09:02:38 /
09:02:40 o/
09:02:43 cool
09:03:18 before going through the topics, just want to let you guys know, i will be serving as Magnum PTL for the V cycle
09:03:38 thank you for your support in the last 2 cycles
09:04:20 thank you!
09:04:45 cheers
09:04:53 strigazi: cheers
09:05:02 #topic train 9.3.0
09:05:16 anything else we need to discuss about the 9.3.0?
09:05:38 not really, just that we should add release notes for important fixes
09:05:50 i think there were quite a few in this release
09:05:55 ο/
09:06:02 oh hi ttsiouts !
09:06:03 nothing from me
09:06:04 brtknr: i agree we should provide more detailed release notes
09:06:17 I usually -1 for reno
09:06:21 what we missed
09:06:25 brtknr: explain
09:06:48 brtknr means we need more detailed release notes, is that it?
09:07:13 I mean, which fixed we didn't add reno for, specifically
09:07:25 s/fixed/fixes/
09:07:34 not more detailed, but rather add them for various fixes we merged
09:07:51 i will need to go search
09:07:58 will add it to etherpad
09:08:07 not worth further discussion here
09:09:16 like this https://review.opendev.org/#/c/715661/ :) ?
09:09:41 17c6034e Add fcct config for coreos user_data
09:09:51 https://review.opendev.org/#/c/715410/ :(
09:09:58 flwang1: that's right
09:10:13 mostly me :P
09:10:16 brtknr: those are your baby
09:10:19 17c6034e Add fcct config for coreos user_data, if this needs a reno then every single commit needs a reno.
09:10:42 strigazi: more of the user-facing stuff
09:10:54 i would disagree that 17c6034e Add fcct config for coreos user_data needs a reno
09:11:22 you pasted it in etherpad :)
09:11:25 ok, i think we're aware of the issue now and we just need to pay more attention, case by case?
09:12:00 strigazi: i am deleting them one by one,
09:12:08 my brain can only work so fast
09:12:28 let's move on
09:13:00 #topic health status update
09:13:13 anything to discuss?
09:13:20 brtknr: needs to review it first
09:13:21 do you guys have new comments about this?
09:14:21 I don't
09:14:22 strigazi: i already added my comments once, need to retest it with the magnum_auto_healer label, haven't found time, sorry
09:14:39 ok, let's move on then
09:14:42 #topic Multi AZ support for master nodes https://review.opendev.org/714347
09:14:57 strigazi: brtknr: did you get a chance to review this idea?
09:15:13 strigazi: i know cern is using multi az, does it sound right to you?
09:15:42 without fixing add/del of etcd members via etcd, not heat, this makes little sense.
09:16:00 we don't use multi az for masters
09:16:41 i know you don't, but why is 'add/remove etcd' a blocker for this one?
09:16:43 multi-az without NGs sounds like overengineering to me, but we discussed this before.
09:17:17 this is for masters, we already have multi az support for workers based on NGs
09:17:19 you need to remove a dead member from etcd
09:17:40 we can use auto healing
09:17:50 no
09:18:01 why
09:18:16 as you want, guys, do it with autohealing
09:18:50 If it is opt in, lgtm
09:19:13 why don't you explain more to help us understand?
09:20:21 i know this is not the perfect solution, but i can see the benefits
09:20:27 etcd uses the raft algorithm for consensus
09:20:48 It starts with one master and more members are added
09:21:11 each member contacts all the other members and they elect a leader.
09:21:19 i understand your concern, you mean the user may lose 1/2 of the etcd members in 1 az?
09:21:50 In magnum, we use discovery.etcd.io to advertise the IPs of all members so they can elect a leader
09:22:27 let's say there are 3 az and 3 master nodes or 5 master nodes
09:22:49 can you please help me understand why it doesn't work?
09:22:54 When you change the number of members, something needs to drop or add the old/new member via the etcd API/leader
09:23:13 ^^
09:23:49 why do we have to change the number of members? when losing an AZ?
09:24:18 What if an etcd member is running there?
09:24:34 Who is going to delete it?
09:25:21 sorry, why do we have to delete the member? in the current design, when do we have to delete a member?
09:25:22 strigazi: where does an etcd member need to be deleted from?
09:25:53 And then, when a new member is added, how will it be added? with the discovery url it is not possible, the url is used only for bootstrap
09:26:09 sorry, why do we have to delete the member? in the current design, when do we have to delete a member?
09:26:16 never ^^ we don't change master nodes
09:26:23 strigazi: where does an etcd member need to be deleted from?
09:26:25 etcd ^^
09:27:17 https://etcd.io/docs/v3.3.12/op-guide/runtime-configuration/#remove-a-member
09:27:19 so the master_az_list should work with a fixed number of masters but not when we have scalable masters
09:27:35 yes
09:27:37 strigazi: yep, i can see where you are coming from, but i think i'm trying to fix this based on the current design
09:27:52 does the leader election fail if a member is dead?
09:28:06 and even if we could support resizing masters later, my current way should also work
09:28:07 or will the remaining members elect a leader and carry on?
09:28:15 so you plan to leave it dangling
09:28:48 I haven't tried, but the docs don't say: stop the member and everything is good.
09:28:53 strigazi: the list is automatically generated based on the new number
09:29:23 ok
09:29:31 strigazi: see https://review.opendev.org/#/c/714347/2/magnum/drivers/heat/k8s_fedora_template_def.py@225
09:29:49 it's not a fixed list
09:29:53 ok
09:30:21 are we on the same page now? do you think my current solution could work?
09:30:29 I don't knwo
09:30:32 I don't know
09:30:48 let's move on, when it is complete we can test
09:31:00 TBH, i haven't done full testing, but the idea is there
09:31:05 what i don't understand is what the multi_az_list has to do with removing an etcd member
09:31:05 ok, i will keep polishing it
09:31:21 seems like two separate issues
09:31:31 brtknr: +1
09:31:36 let's move on
09:31:38 might be missing a point here
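For context on the runtime-reconfiguration doc linked above: dropping a dead etcd member has to go through the etcd API itself, not Heat. A minimal sketch of what that looks like, assuming the v3 etcdctl CLI is available on a master; the endpoint and member name are placeholders, not values from the Magnum templates:

```python
# Sketch only: remove a dead member via etcdctl (etcd v3), per the
# runtime-reconfiguration doc linked in the discussion above.
import json
import subprocess

ETCDCTL = ["etcdctl", "--endpoints", "https://127.0.0.1:2379"]  # placeholder endpoint

def remove_dead_member(dead_member_name):
    # `etcdctl member list -w json` reports each member's numeric ID.
    out = subprocess.run(ETCDCTL + ["member", "list", "-w", "json"],
                         capture_output=True, text=True, check=True).stdout
    for member in json.loads(out)["members"]:
        if member.get("name") == dead_member_name:
            # `member remove` expects the ID in hex.
            subprocess.run(ETCDCTL + ["member", "remove", format(member["ID"], "x")],
                           check=True)
            return True
    return False
```

A replacement node would then need an `etcdctl member add` plus matching startup flags on the new member; the discovery.etcd.io URL only helps during the initial bootstrap, which is the gap strigazi is pointing at.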
09:31:38 #topic Allowed CIDRs for master LB https://review.opendev.org/#/c/715747/
09:31:57 there is a new feature introduced in the Octavia stein release
09:32:08 to allow setting a CIDR list for the lb
09:32:10 this is conditional on heat merging the equivalent change, right?
09:32:18 i.e. it's not supported in heat yet
09:32:30 i.e. it's not supported in the stable/train branch yet
09:32:35 it's supported in the heat master branch
09:32:39 yep
09:32:50 i'm trying to cherry-pick it to train
09:32:55 ah i see
09:33:02 since it's a very useful feature
09:33:13 especially when you want to open the lb to the public
09:33:18 i'm happy with this patch
09:33:59 strigazi: i need your help https://review.opendev.org/#/c/715747/1..2/magnum/drivers/common/templates/lb_api.yaml
09:34:10 how should we deal with this case?
09:34:37 this was added where?
09:34:38 we're using an old heat template version, but how can we use a good new feature from the latest heat version
09:34:49 the feature in heat
09:35:10 https://review.opendev.org/#/c/715747/2/magnum/drivers/k8s_fedora_atomic_v1/templates/kubecluster.yaml
09:35:17 can we not upgrade the template version?
09:35:30 this was added where? the feature in heat
09:35:38 strigazi: wait a sec
09:35:50 https://review.opendev.org/#/c/715748/
09:35:51 Which version do we need to change to?
09:36:15 so train
09:36:23 heat supports this https://review.opendev.org/#/c/715748/1/heat/engine/resources/openstack/octavia/listener.py@137
09:36:29 or ussuri
09:36:41 but i haven't tested if heat can automatically respect the version
09:36:47 yep, Ussuri now
09:36:58 and i'm trying to cherry-pick it to train
09:37:17 let's assume it can't be cherry-picked
09:37:47 strigazi: any idea?
09:37:50 so change to ussuri, then heat ussuri/train is a hard dep for magnum when an API LB is required
09:38:05 that's my concern
09:38:26 that's why i'm asking you if there is a better solution
09:38:49 to avoid the hard dependency
09:38:51 Add one more template?
09:39:25 fyi, at CERN it is not an issue, we upgrade heat and we drop this part anyway.
09:39:56 yep, another template may work
09:40:00 drop what part?
09:40:07 i will think about it
09:40:12 Maybe it will be an issue, but we will add one more local patch
09:40:25 will cern be interested in this feature?
09:41:00 is there a way to set allowed_cidr on the loadbalancer
09:41:06 is there a way to set allowed_cidr on the loadbalancer directly?
09:41:24 we don't have LBaaS. It is a very new feature with tungsten.
09:41:43 brtknr: should be, but it's not handy
09:42:31 are you talking about setting it in python code by calling the octavia client?
09:42:42 brtknr: ^
09:42:48 we already have a train heat requirement for train magnum
09:42:57 because of fedora coreos
09:43:01 Maybe we take this to gerrit? Is there anything else for the meeting?
09:43:02 right
09:43:08 nothing else
09:43:16 i'm good
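As an aside on brtknr's question about setting allowed_cidr on the loadbalancer directly: until the Heat-template path is sorted out, it can be set against the Octavia API itself. A rough sketch with openstacksdk, assuming the cloud entry, listener name and CIDRs are placeholders and that the deployed Octavia/SDK versions expose allowed_cidrs on listeners:

```python
# Sketch only: set allowed_cidrs on the API loadbalancer listener directly
# via the Octavia API, bypassing the Heat template.
import openstack

conn = openstack.connect(cloud="mycloud")  # placeholder clouds.yaml entry

listener = conn.load_balancer.find_listener("my-cluster-api-listener")  # placeholder name
conn.load_balancer.update_listener(
    listener,
    allowed_cidrs=["10.0.0.0/8", "203.0.113.0/24"],  # example CIDRs only
)
```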
09:43:27 anything else you guys want to discuss?
09:43:42 ttsiouts: dioguerra said you are working on the labels
09:44:18 we don't have a spec yet
09:44:27 when can discuss it then
09:44:41 we can discuss it then
09:44:45 ok sure
09:44:46 brtknr: I'm drafting the spec
09:44:59 I'll try to upload it today
09:45:12 ttsiouts: sounds good :)
09:47:09 1 more thing
09:47:34 flwang1: strigazi: what are the current limitations for enabling resizable masters
09:47:44 etcd
09:48:12 is that it?
09:48:37 because we don't have a way of adding/removing members dynamically?
09:48:56 have you ever tried it?
09:49:09 you can use the heat api to change the number of masters
09:49:40 strigazi: haven't tried yet
09:50:53 strigazi: i'm a bit confused, do you mean calling the heat api directly to resize masters could work?
09:51:11 it will work towards failure
09:51:47 right
09:51:51 strigazi: how do you resize directly using the heat api?
09:51:55 let's end the meeting
09:52:06 Merged openstack/magnum master: Update hacking for Python3 https://review.opendev.org/716347
09:52:09 we can continue the discussion
09:52:13 #endmeeting
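On the closing question about resizing masters directly through the Heat API: the idea is a stack update that changes only the master count, roughly as sketched below. The stack identifier and the number_of_masters parameter are assumptions based on the Magnum Heat templates, and as strigazi notes, without adjusting etcd membership this "will work towards failure".

```python
# Illustration only: bump the master count with a PATCH-style Heat stack
# update. The stack name is a placeholder; etcd membership is NOT updated
# by this, which is the limitation discussed above.
import subprocess

stack = "my-cluster-abc123"  # Heat stack backing the Magnum cluster

subprocess.run(
    ["openstack", "stack", "update", "--existing",
     "--parameter", "number_of_masters=3", stack],
    check=True,
)
```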