14:00:13 #startmeeting kuryr
14:00:14 Meeting started Mon Jan 14 14:00:13 2019 UTC and is due to finish in 60 minutes. The chair is dmellado. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:15 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:17 The meeting name has been set to 'kuryr'
14:00:23 Hi kuryrs, who's here today for the meeting? ;)
14:00:51 o/
14:01:14 #chair dulek
14:01:15 Current chairs: dmellado dulek
14:02:38 o/
14:03:43 #chair ltomasbo
14:03:44 Current chairs: dmellado dulek ltomasbo
14:04:27 so, not a really big quorum, but anyway
14:04:33 #topic kuryr-kubernetes
14:04:55 I can share something about CI.
14:04:56 First of all, thanks for all your efforts
14:05:04 dmellado: Okay, sorry, go on. ;)
14:05:07 I just want to note that on Fri we cut a new version
14:05:16 from kuryr, kuryr-kubernetes and kuryr-tempest-plugin
14:05:28 to align with upstream M2 as well as OCP
14:05:35 kudos all \o/
14:05:57 secondly, let's go to the CI. I have some findings as well, but I'll let you go first, dulek
14:06:45 I've talked again with the OVH guys. They are pretty sure that they've done everything possible to increase storage performance.
14:07:26 I've also tried running etcd with higher IO priority and with all data in a ramdisk.
14:07:41 dulek: hmmm, I was going to ask you about the ramdisk, but if it's already there...
14:07:54 failures seem to be the same as we had before the holidays...
14:08:06 Neither of those helped much, so it seems to me like the etcd issues that randomly blow up gate runs may be due to a different contention.
14:08:23 The natural next idea would be to increase CPU priority for the etcd process.
14:08:43 I'll try sending a patch today.
14:08:47 dulek: just niceness on the devstack plugin?
14:09:15 dmellado: Yeah, actually on devstack itself - etcd is part of core devstack.
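[Editor's note] The "niceness" idea discussed above can be sketched as follows. This is a hypothetical illustration only: the real change would land in devstack's etcd setup (e.g. via systemd's `Nice=` setting or `renice`), not in Python, and `boost_etcd_priority` is a made-up helper name. Negative nice values (higher priority) require root, so the demo below uses a positive value on the current process, which needs no privileges.

```python
import os

def boost_etcd_priority(pid, niceness=-10):
    """Change a process's CPU scheduling priority via its nice value.

    Lower (more negative) nice values mean higher priority; setting a
    negative value requires root. Sketch of the etcd tweak discussed
    above, not an actual devstack patch.
    """
    os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)

# Unprivileged demo: pid 0 means "the current process", and raising the
# nice value (lowering priority) is always allowed. For etcd you would
# pass its real PID and a negative niceness as root.
new_nice = boost_etcd_priority(0, niceness=5)
```

The same effect from the command line would be `sudo renice -n -10 -p <etcd-pid>`, and `ionice` covers the IO-priority experiment mentioned earlier.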
14:09:23 Another thing I was thinking about would be to just install etcd on a different node
14:09:36 but that would imply that *all* jobs would have to be multinode
14:10:09 dmellado: I've looked into that. It's pretty hard to do.
14:10:38 dmellado: Because you don't have the subnode's IP on the master.
14:10:44 dmellado: Only the other way around.
14:10:50 dulek: hmmm, I see
14:11:02 dmellado: So it's hard to configure stuff with the etcd IP on the subnode.
14:11:10 And the other way around with all the other services.
14:11:19 I'd like to spend some resources on this, as it's getting really hard as of now to get a proper CI run :\
14:12:33 dulek: if you send the patch, please add me there as a reviewer
14:12:48 I'm also trying some hacks from my side, as well as maysa
14:13:15 dmellado: Sure.
14:13:35 another thing we thought about doing is to increase the timeout
14:13:51 as some of the jobs were just running out of time, but that wouldn't be the main issue
14:14:04 ltomasbo: any idea on this from your side?
14:15:05 dmellado, I was not checking the CI in the last week, so I'm not sure
14:15:34 dmellado, we found other problems though, such as the ones related to neutron being slow and randomly breaking the pool/namespace gates
14:15:48 ltomasbo: you mean (more) slow? xD
14:16:02 I wasn't tracking those, would you mind elaborating on that, ltomasbo?
14:16:33 dmellado, the tempest test that gcheresh_ was doing about the pool was hitting random results on the number of ports created
14:17:00 dmellado, and it was just because neutron was too slow creating 10 ports in a bulk request, therefore kuryr was triggering a second one
14:17:14 ltomasbo: like neutron not being able to cope with the amount of requests?
14:17:27 dmellado, this is fixing the issue: https://review.openstack.org/#/c/628160/
14:17:32 well, not fixing, avoiding it
14:17:46 oh, now I do recall it
14:17:47 dmellado, ltomasbo: Hey, most of the timeouts are due to lost notifications from the K8s API due to etcd timeouts!
14:17:51 yeah, mitigating it at least
14:17:52 dmellado, it is able, but it took more time than the waiting time kuryr has before triggering another repopulation
14:17:53 So it probably won't help
14:18:17 so in the end we got a bottleneck on the etcd stuff
14:18:28 dulek, this one actually is not related to that, as it is kuryr-controller -> neutron, without etcd
14:19:51 dmellado, btw, maysa and I found a few issues with the PS merged last week about NetworkPolicies
14:19:53 ltomasbo: Okay.
14:20:26 ltomasbo: saw your comments, I'll be keeping up with the patches and the bugs and review them
14:20:30 and we are seeing a weird behaviour where svc rules are not being properly created
14:20:33 ltomasbo: please also do track them
14:21:03 dmellado, I actually -W'd this one (as it hadn't been merged yet): https://review.openstack.org/#/c/629856/3
14:21:12 and it's missing some part
14:21:15 ltomasbo: we'll mark these as bugs and once they're fixed we'll do a minor release with the bugfixes
14:21:27 that is already marked as a bug
14:21:32 yep, saw it
14:21:38 I'm speaking in a general way
14:21:43 should there be any new findings
14:22:04 thanks for finding it, anyway
14:23:10 in any case let's prioritize the CI debugging, as it'd be blocking patches from getting merged anyway
14:23:50 anything else around, fellas?
14:24:13 I have another thing to bring up
14:24:25 shoot
14:24:49 dmellado, why don't we move the kuryr meeting to CET or whatever instead of UTC... the time change then causes collisions...
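[Editor's note] The bulk-port race described above can be illustrated with a toy sketch. All names here (`PortPool`, `populate`) are made up for illustration and are not kuryr's actual classes: if a second repopulation fires while Neutron is still serving the first bulk request, the pool ends up with twice the ports; an in-flight guard, roughly the effect of the mitigation linked above, avoids the duplicate trigger.

```python
import threading
import time

class PortPool:
    """Toy pool demonstrating the duplicate-repopulation race.

    Hypothetical sketch only; kuryr's real pooling logic lives in the
    kuryr-kubernetes vif_pool driver and works differently in detail.
    """

    def __init__(self):
        self.ports = []
        self._lock = threading.Lock()
        self._populating = False

    def populate(self, count, neutron_delay):
        # Guard: refuse to start a second bulk request while one is
        # still in flight (the essence of the mitigation discussed).
        with self._lock:
            if self._populating:
                return False
            self._populating = True
        time.sleep(neutron_delay)  # stand-in for a slow Neutron bulk create
        new_ports = ["port-%d" % i for i in range(count)]
        with self._lock:
            self.ports.extend(new_ports)
            self._populating = False
        return True

pool = PortPool()
first = threading.Thread(target=pool.populate, args=(10, 0.2))
first.start()
time.sleep(0.05)
# Kuryr's wait time elapses and a second repopulation attempt fires
# while Neutron is still working on the first bulk request:
retried = pool.populate(10, 0)
first.join()
# With the guard, the pool ends up with 10 ports, not 20.
```

Without the `_populating` check, both calls would run their bulk create and the port count would come out doubled, which matches the random port counts gcheresh_'s tempest test was hitting.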
14:25:02 dmellado, we discussed moving it 1 hour later, but then it never happened
14:25:11 ltomasbo: oh, I thought I already did it, not a problem from my side
14:25:24 not a problem for me either
14:25:26 #TODO(dmellado) move kuryr meeting to 1 hour later and CET to avoid time zone collisions
14:25:42 ltomasbo: I'll send a patch for this + an email to the upstream ML
14:25:56 so it wouldn't collide with any other meetings next week
14:26:34 dmellado: if you move it to CET
14:26:42 what happens when Europe moves to CEST
14:26:54 it would change like 1 hour
14:26:55 if that ends up happening (not sure which countries will cancel DST)
14:27:15 (I'm not trolling, I swear)
14:27:29 Isn't UTC the only way of specifying upstream meeting times?
14:27:42 well, I'll change it afterwards to CEST
14:27:42 dmellado: did you and maysams finally figure out the test flakiness?
14:27:44 xD
14:27:52 celebdor: we were discussing that on the first part of the meeting
14:27:55 just scroll
14:27:57 xd
14:28:05 I saw you spam "recheck" like there was no tomorrow
14:28:08 during the weekend
14:28:16 it kept me entertained, I must say
14:28:21 like watching a lottery
14:28:45 celebdor: it was even a script xD
14:28:51 which I checked and so xD
14:29:02 so, anyways
14:29:13 dulek: yep, only UTC upstream, I'll change it again when DST comes
14:29:36 dmellado: That works. So is the next meeting going to be at the same time or not?
14:29:45 dulek: 1 hour later now
14:29:48 15:00 UTC
14:29:56 okey dokey
14:29:58 dmellado: Okay!
14:29:58 thanks dmellado!
14:30:10 next order of business
14:30:47 is our participation, or lack thereof, in the PTG official, dmellado?
14:31:02 celebdor: yep, for the PTG I added ourselves as tentative
14:31:16 as I got a good-to-go in terms of budget
14:31:26 I would dare say I'm not tempted to attend
14:31:31 and also reached out to some Samsung + Intel folks who asked to wait
14:31:38 so I have asked for the smallest room available, just in case
14:31:43 where is it this time?
14:31:52 tell me it's not Denver
14:31:54 celebdor: you're joking, aren't you?
14:31:56 xD
14:31:56 Not Denver?
14:31:59 it actually is xd
14:32:07 so Denver again
14:32:22 although it seems that it's not at the same hotel but downtown
14:32:22 there are no connections from my town to Denver
14:32:27 there CAN'T be
14:32:35 so I might have to go there in order to get some decent sleep this time
14:32:39 I have spoken
14:32:41 xD
14:32:48 but if you all wanna come to my town you're all welcome
14:32:52 anyway
14:32:59 more interesting things
14:33:18 celebdor: mainly my concern is about cross-project sessions
14:33:28 btw, ltomasbo dulek celebdor https://review.openstack.org/630689
14:33:30 there you go
14:33:39 dmellado: like which?
14:33:48 like Octavia and Infra
14:34:09 in any case, this time I would really love for some QE representative to go there and contribute
14:34:32 now that you mention Octavia...
14:34:52 celebdor: more interesting things?
14:34:52 I would like to explore the two options
14:35:02 a gate with the OVN provider
14:35:06 (for Octavia)
14:35:39 celebdor, we already have a gate with Octavia + the OVN provider
14:35:51 dmellado: can you remind me if there is any chance bigger than "no rain in a month in dmellado's town" of having a pod-in-VM gate?
14:35:56 celebdor, it's experimental
14:36:12 yboaron: what would it take for it to not be experimental anymore?
14:36:14 celebdor: actually there hasn't been rain in a month
14:36:26 but I wouldn't count on it on a fast path
14:36:30 celebdor: for it to be stable for some time
14:36:32 dmellado: any flying pigs as well?
14:36:39 then we'd move it out of experimental
14:36:40 celebdor, it seems that the LB service always fails
14:36:43 dmellado: gotcha
14:37:02 somebody should work with the networking-ovn folks then
14:37:04 :-0
14:37:06 :-)
14:37:17 celebdor: be warned, I was told they have a terrible new manager
14:37:31 the other thing I want to look into is optional support for kube-proxy in the pod-in-VM scenario
14:37:38 I asked reedip today about FIP support, and he told me that it's merged
14:37:48 dmellado: upstream projects have no manager
14:38:01 only lazy PTLs
14:38:14 yboaron: so what are we missing?
14:38:46 I just reran check experimental on this gate... let's wait and see
14:39:02 in case it fails I'll work with reedip on that
14:39:50 dmellado: did you consider moving the openstack/release model of kuryr to independent or cycle-with-intermediary?
14:40:02 sorry, the latter is what we have now, innit?
14:40:08 celebdor: yep, we have the latter
14:41:06 maybe we should be releasing with K8s releases
14:41:18 celebdor: that's something I was thinking about as well
14:41:40 dmellado: and what conclusions did you come to?
14:41:42 but in any case it shouldn't hurt us to do a release with openstack as well, just to align for now with the upper reqs
14:42:24 I would like to check how we would deal with requirements on the upstream gates in such a case in non-containerized mode
14:42:37 I do recall that we had some discussions on that, but I'm open to suggestions
14:43:04 now that we'll be slightly out of craziness mode I'll send an invite to you and some RDO folks as well
14:43:21 I'd like to re-sync with jpena and amoralej before changing anything
14:43:43 the upper constraints can be troublesome indeed
14:45:14 anyways, I'll update all of you folks on this at the next meeting
14:45:54 any other topic to cover for today, anyone?
14:48:14 All right! Then thanks all for attending, we'll be at #openstack-kuryr if you have any follow-up!
14:48:17 #endmeeting