14:00:16 #startmeeting kuryr
14:00:16 Meeting started Mon Feb 19 14:00:16 2018 UTC and is due to finish in 60 minutes. The chair is irenab. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:17 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:20 The meeting name has been set to 'kuryr'
14:00:37 Hi
14:00:40 Hello!
14:00:54 Hi
14:01:27 Let's wait few more minutes for others to join
14:01:47 Greetings
14:02:04 o/
14:02:44 Hi all, thank you for joining. Let's start
14:02:56 #topic kuryr-libnetwork
14:02:58 o/
14:03:13 o/
14:04:16 there is a patch #link https://review.openstack.org/#/c/544548/ that deals with tags extension changes in neutron
14:05:00 the question is if previous extensions, meaning previous neutron versions should be supported
14:05:18 irenab: I'm inclined to say that they should
14:05:27 is it a lot of trouble to make it compatible with both?
14:05:41 actually there are 3 options
14:06:02 tag and tag_ext are deprecated since pike: https://docs.openstack.org/releasenotes/neutron/pike.html#deprecation-notes
14:06:20 And guys planned to remove it in Queens, but left it. It's getting removed in Rocky.
14:06:22 It does not look like a big deal
14:06:57 So question is do we want Rocky's Kuryr to support Ocata?
14:07:07 s/do we want/do we care
14:07:41 dulek: it only depends on the cost
14:07:56 I believe the cost is not big
14:08:13 #chair celebdor
14:08:14 Current chairs: celebdor irenab
14:08:33 so shall we vote for keeping both?
14:08:47 if the cost is not big, since we are not using any new features of the new tags
14:08:51 we should support them
14:08:56 +1
14:09:02 dulek ?
14:09:31 Works for me either way. It's not a lot to carry in the code.
14:09:40 great
14:09:53 any other libnetwork related issues to discuss?
14:10:33 ok, moving on
14:10:41 #topic kuryr-kubernetes
14:11:03 anyone want to update?
14:11:26 man...
14:11:37 Well, I filed two bugs last friday
14:11:46 links?
14:11:55 basically with neutron-lbaasv2 and firewall=openvswitch
14:11:58 services are broken
14:13:24 just a sec
14:13:30 #link https://bugs.launchpad.net/kuryr-kubernetes/+bug/1749968
14:13:30 Launchpad bug 1749968 in kuryr-kubernetes "services backed by neutron-lbaas do not work with native ovs firewall" [Critical,In progress] - Assigned to Antoni Segura Puimedon (celebdor)
14:13:41 thanks yboaron
14:13:53 the biggest contention point though
14:14:12 is that for east-west, setting the same SG as the pods work
14:14:17 for loadbalancer service type
14:14:42 we either need to dynamically change the ports in a SG per LB depending ont he listeners
14:15:03 or just allow everything in for external traffic when service is loadbalancer type
14:15:10 for the issue yboaron raises I was that woith Dragonflow there is a need to add SG for the specific port of the servic itself
14:15:22 since neutron-lbaasv2 is deprected I'm inclined to just allow everything
14:15:39 (specially considering haproxy will only LB the listener ports anyway)
14:15:55 irenab, IIRC it was LoadBalancer service type --> means N-S
14:16:03 yes
14:16:52 celebdor, the patch you proposed covers what you just explained?
14:17:19 only covers east-west
14:17:26 n-s needs to be added
14:17:30 one of the two options above
14:17:49 #link https://review.openstack.org/#/c/545363/ - should support east-west
14:17:56 (I'd rather not carry the extra SG per LB since it feels like repeating octavia code, but I can be persuaded)
14:18:47 the question is if we officially support HAProxy flavor (till it is removed)
14:18:53 irenab: dulek: ltomasbo: yboaron: ^^
14:19:12 irenab: do you have native lb already in df?
14:19:33 not yet, speced but not implemented
14:20:11 celebdor: I don't have strong opinions here, but with neutron-lbaas being deprecated, I'd say to stick with Octavia's way.
14:20:28 neither do I
14:20:49 but agree with dulek, give preference to Octavia's way
14:20:57 dulek, Octavia is a bit more resource heavy flavor,I won't be surprised if some will choose HAProxy
14:21:23 irenab: so do you use octavia or lbaasv2?
14:21:24 irenab, that's true
14:21:39 celebdor, use for what?
14:21:48 I mean the kuryr users
14:21:49 irenab: in your deployments
14:22:04 I'm inclined to support lbaasv2 until we have either containerized octavia or octavia plugins
14:22:38 I agree, then probably need to solve n-s in the Octavia like way
14:22:38 celebdor: It's awful that we don't have anything rock-solid in LB space…
14:23:02 I mean - we wonder if anyone will use Octavia, while LBaaSv2 is already deprecated.
14:23:09 This isn't really healthy situation.
14:23:37 irenab: I was afraid you'd say that
14:23:49 dulek, as far as I remember there is no specific details about when lbaasv2 will be removed
14:24:02 irenab: It was announced recently.
14:24:04 dulek: Octavia, as is, can only be used at scale for N-S
14:24:14 in the kubernetes scenario
14:24:14 celebdor, lets finalize it at PTG?
14:24:24 irenab: well, I'd like to merge it by wednesday
14:24:27 I think it is doable
14:24:30 with a bit of help
14:24:47 irenab: the advantage is that we can take "inspiration" from the octavia code
14:24:48 celebdor: That's a release blocker for Queens or not?
14:24:57 dulek: for me it is
14:25:01 irenab: http://lists.openstack.org/pipermail/openstack-dev/2018-January/126836.html
14:25:43 celebdor: Then we need to have it fixed this week.
14:25:44 dulek, "We are not announcing the end of the deprecation cycle at this time"
14:26:04 when is the deadline for Q. release?
14:26:06 irenab: Uh, oh. Did I simply read what I wanted to read? :P
14:26:08 it's like
14:26:20 we don't want you to use it, but we won't screw you if you do
14:26:25 that's how I read it
14:26:37 Q is released next Wednesday. We can have another RC before that date.
14:26:46 yes, mainly no new features on lbaasv2, but you can keep using it as is
14:26:55 dulek: this wednesday or next week's
14:26:57 ?
14:27:10 celebdor: IIRC next, but let me check again.
14:27:40 thanks dulek
14:27:46 Oh, I feel like I'm PTL again
14:27:48 :D
14:27:54 Ah, okay. So we have time until Friday to issue another RC.
14:27:58 https://releases.openstack.org/queens/schedule.html#q-finalrc
14:28:10 well then, lets try to have a fix and merge it by Friday
14:28:15 let's
14:28:44 do we have anything else?
14:28:49 oh, yes
14:29:00 maysamacedos reported that the cni daemon is leaking memory
14:29:09 yup
14:29:11 since she is working on a very cool feature
14:29:21 liveness checks for the cni daemon
14:29:32 that mark unhealthy when we go over certain memory
14:29:41 the IPDB more precisely
14:29:48 I fixed in pyroute2 upstream one memory leak
14:29:55 but we know there is another in setns
14:29:58 maybe more
14:30:15 interesting
14:30:25 umm, nice!
14:30:26 this is only a chance to talk about the liveness memory leak health
14:31:01 should we maintain this (IPDB) check?
14:31:12 maysamacedos: Had you checked the leak with newer pyroute2 version, including celebdor's fix?
14:31:16 I think we should
14:31:24 dulek: yes
14:31:26 dulek: she did
14:31:35 And that doesn't solve it? :(
14:31:37 we were discussing it last week
14:31:43 dulek: no
14:31:48 dulek: as I said, I know there's at least one more
14:31:56 what I proposed was to have a conf option
14:32:15 that says use at max 8GiB of mem
14:32:23 if you use more it means you are leaking
14:32:26 and you should be killed
14:32:35 hmm
14:32:48 celebdor, so when this check fails, node cannot host any more Pods?
14:32:49 dulek: which reminds me... Do we always evict entries from the registry?
14:32:50 celebdor: Let's just have the limit configurable. -1 means no limit, the you can set it in MiBs.
14:33:04 irenab: no, when it fails the health check fails
14:33:14 and this leads to what?
14:33:14 irenab: No, kuryr-daemon will get restarted.
14:33:15 and k8s should restart the cni daemon
14:33:21 and then it should start working again :-)
14:33:27 Magic. ;)
14:33:28 ok
14:33:29 the node will go notready -> ready
14:33:38 maysamacedos: it does restart it, right?
14:33:44 celebdor: I don't think that, most likely CNI will not notice.
14:33:47 tes
14:33:47 so existing Pods are not affected
14:33:53 yes celebdor
14:34:21 good
14:34:22 celebdor: And about deleting stuff from registry. We currently don't do that, don't even watch for DELETEs.
14:34:26 irenab: they are not
14:34:35 dulek: that's leaking
14:34:42 ok
14:34:44 not really
14:34:45 celebdor: I think it's easy to be implemented now, as we have locks now.
14:34:48 it's ballooning
14:34:50 xD
14:34:53 :)
14:35:01 celebdor: I can try to fix that today.
14:35:03 dulek: please, file a bug
14:35:12 and fix it :P
14:35:23 nice troubleshooting session :-)
14:35:42 any other issues to discuss?
14:35:42 irenab: I feel like I'm forgetting about some release bug
14:35:49 but we can probably address it by the .1
14:35:53 .0 is always dangerous
14:35:55 :P
14:36:04 celebdor: You have two critical issues in launchpad.
14:36:15 https://bugs.launchpad.net/kuryr-kubernetes
14:36:38 the lbaas ones
14:36:54 actually there are 3
14:37:07 dulek: what about this one https://bugs.launchpad.net/kuryr-kubernetes/+bug/1731485
14:37:07 Launchpad bug 1731485 in kuryr-kubernetes "Kuryr ignores CNI_CONTAINERID when serving requests" [Critical,In progress] - Assigned to Michal Dulko (michal-dulko-f)
14:37:11 it has your name on it
14:37:13 :-)
14:37:28 celebdor: Yup. I don't know how to fix it for case without kuryr-daemon.
14:37:44 https://bugs.launchpad.net/kuryr-kubernetes/+bug/1749921
14:37:45 Launchpad bug 1749921 in kuryr-kubernetes "Loadbalancer service type fails to create due to subnet access policy" [Critical,In progress] - Assigned to Yossi Boaron (yossi-boaron-1234)
14:38:00 And for kuryr-daemon it's fixed.
I think we'll get over it once kuryr-daemon will become a default.
14:38:44 https://review.openstack.org/#/c/545270/ is almost ready to be merged
14:39:08 dulek: what do you think is missing to make it default
14:39:11 We saw some strange issue with unhealthy lb handler, just need to verify it is resolved
14:39:12 (apart from mem leaks)
14:39:26 irenab, Did you manage to check it ?
14:39:39 irenab: I thought yboaron found out what it was
14:39:44 or you mean with df?
14:39:49 yboaron, not the latest version, will complete asap
14:39:55 celebdor: Nothing really. We gate on that, we know it solves an issue that we don't know how to approach without kuryr-daemon.
14:40:12 celebdor: I'd say it's ready to become a default in Rocky.
14:40:17 It worked for me , crossing fingers :-)
14:40:17 celebdor, not with DF, it was some kuryr-kubernetes issue
14:40:26 yboaron: what is that ep_subsets thing?
14:40:33 I don't recall putting that there
14:40:35 what did I miss
14:40:54 dulek: alright
14:41:11 you can put a BP to make it default and deprecate haproxy on Rocky
14:41:12 That was the exception I got , triggers the unhealthy of LB
14:41:24 yboaron: what's it about?
14:41:30 do you have a stactrace?
14:41:42 celebdor, I posted it on the patch
14:41:50 iterable of None object
14:42:02 will check, thanks
14:42:15 celebdor: HAProxy? I've talked about deploying without kuryr-daemon. :P
14:42:42 I think celebdor is ulti tasking as usual, abit of context switch slip ...
14:42:45 dulek: and that's why I meant non daemonized cni
14:42:49 I'm just mistyping
14:42:51 :)
14:42:59 I suspect that service doesn't work in latest devstack - I was in the middle of testing that .. and guess what happened to RDO ?
14:43:00 irenab: only 4 conversations at one
14:43:10 yboaron: Kaboom?
14:43:24 celebdor, Yep
14:43:38 yboaron, dmellado mentioned maitenance of RDO
14:44:01 irenab, yes disks upgrade ..
14:44:18 Shall we move to open discussions?
14:44:22 Did someone check service LB lately on devstack ?
14:44:31 irenab: agreed
14:44:38 yboaron, what do you mean?
14:44:59 clean devstack - create LB E-W
14:45:06 clusterIP
14:45:24 not recently
14:45:26 yboaron: on BM?
14:45:27 I saw strange things with the health reports
14:45:42 yboaron: you mean the controller heatlh?
14:45:43 celebdor, VM-devstack
14:45:50 ok
14:45:51 celebdor, yes
14:45:54 that's "baremetal"
14:46:01 for poor people without real hardware
14:46:03 xD
14:46:06 yboaron, please open a bug if you see some issues
14:46:21 irenab, I will
14:46:22 #topic open discussions
14:46:29 #chair dmellado
14:46:29 Current chairs: celebdor dmellado irenab
14:46:58 dmellado asked me to remind about comming PTG, its going to be next week
14:47:14 \o/
14:47:38 we have an etherpad where anyone who want to suggest session can add it #link https://etherpad.openstack.org/p/kuryr-ptg-rocky
14:47:41 yep, he wanted to plan a bit the schedule https://ethercalc.openstack.org/kuryr_ptg
14:48:26 correct, so please if you plan to join remotely add the prefered time slot based on the availability at the ethercalc
14:48:36 ltomasbo: Why is 15:30-16:30 not available everywhere?
14:48:38 irenab: theoretically the proposal time is closed
14:48:50 only scheduling now, right?
14:49:03 nap time?
14:49:26 :-)
14:49:41 xD
14:49:45 I don't know...
14:50:30 we'll ask dmellado when he's available :-)
14:50:34 I think its bug in the schedule, let;s give dmellado a chance to fix it
14:50:53 othewise we'll do bug fixing sessions
14:51:03 +1
14:51:09 any other topics?
14:51:41 not here
14:51:44 then I guess we can close the meeting
14:51:48 Folks , please review #link https://review.openstack.org/#/c/536387/
14:52:05 garyloug: updates?
14:52:05 yboaron, will do
14:52:12 irenab, 10x
14:52:31 also Danil told that he fixed his patches for interaction with the CNI-daemon, they can be reviewed
14:52:45 #link https://review.openstack.org/#/c/471012/ https://review.openstack.org/#/c/512280/ https://review.openstack.org/#/c/512281/
14:52:51 kaliya, did you try them?
14:53:08 Not yet, but I plan to check on devstack the one which fails in zuul
14:53:20 kaliya, great, thanks
14:53:24 This one https://review.openstack.org/#/c/524590/
14:53:31 if anyone has ideas, welcome :)
14:53:51 Hi kaliya, no update, something came up in my work last week that needed my attention. I'm working on the code right now and need to get it through internal review before I push.
14:54:14 we have a meeting tomorrow on this topic with garyloug and dmellado
14:54:21 don't forget! XD
14:54:32 We plan to have a session on multivif at PTG also to go though the code, hope it will be the latest wehn we finalize the patches
14:54:39 kaliya: thanks
14:54:46 good to know irenab
14:55:05 dulek: after you finish the registry delete fix, take a look at danil's cni patches ;-)
14:55:07 kaliya, worth to share the meeting details if someone wants to join
14:55:29 we should put the bluejeans links soon
14:55:30 Great, irenab has this session been scheduled for any day or time?
14:55:33 dmellado has the calendar event
14:55:42 garyloug: kaliya: right
14:55:47 garyloug, I do not know
14:55:49 celebdor: Sure, I'm helping daniel debug them. :)
14:55:50 he'll probably update tomorrow or wednesday
14:55:58 dulek: daniel or danil?
14:56:06 No worries, I'll keep an eye on the doc
14:56:07 we have both
14:56:09 :-)
14:56:11 celebdor: danil, right.
14:56:14 :-)
14:56:24 ok, we have 2 mins left
14:57:15 thank you all for joining
14:57:21 #endmeeting
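
Note on the memory health check discussed around 14:31-14:33: the proposal was a configurable memory ceiling for the CNI daemon, expressed in MiB, where -1 means no limit and exceeding the ceiling fails the health check so that Kubernetes restarts the daemon. A minimal sketch of that rule follows. It is not kuryr-kubernetes code: the names `NO_LIMIT`, `peak_rss_mib` and `is_memory_healthy` are hypothetical, and reading peak RSS through the standard `resource` module is an assumption made for illustration (the meeting did not say how memory would be measured).

```python
import resource
import sys

NO_LIMIT = -1  # hypothetical sentinel: -1 disables the check, as suggested in the meeting


def peak_rss_mib():
    """Return this process's peak resident set size in MiB (assumed metric)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def is_memory_healthy(used_mib, max_mib=NO_LIMIT):
    """Report healthy unless a limit is configured and usage exceeds it."""
    if max_mib == NO_LIMIT:
        return True
    return used_mib <= max_mib
```

With the 8 GiB figure mentioned in the discussion, a liveness handler could call `is_memory_healthy(peak_rss_mib(), 8192)` and report failure when it returns `False`. Peak RSS never decreases during a process's lifetime, which suits a leak detector whose only remedy is a restart, but a current-usage metric may be preferable in practice.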