14:00:52 #startmeeting kolla
14:00:52 Meeting started Wed Jun 1 14:00:52 2022 UTC and is due to finish in 60 minutes. The chair is mnasiadka. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:52 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:52 The meeting name has been set to 'kolla'
14:00:52 mnasiadka: meeting time?
14:01:00 ha, jinx
14:01:03 o/
14:01:04 o/
14:01:14 \o
14:01:18 #topic rollcall
14:01:27 now you can repeat ;-)
14:01:28 o/
14:01:31 hi
14:03:06 let's move on mnasiadka
14:03:13 #topic agenda
14:03:23 * Announcements
14:03:23 * Review action items from the last meeting
14:03:23 * CI status
14:03:23 * Release tasks
14:03:23 * Current cycle planning
14:03:25 * Additional agenda (from whiteboard)
14:03:25 * Open discussion
14:03:29 #topic Announcements
14:03:56 OpenInfra Summit next week - but you know that
14:04:01 #topic Review action items from the last meeting
14:04:03 let's check
14:04:22 #link https://grafana.opendev.org/d/c0d59dad13/kolla-failure-rate?orgId=1
14:04:46 still waiting on feedback whether to add more jobs or fewer. also maybe a second page for arm jobs?
14:04:47 frickler update grafana dashboard
14:04:47 hrw to look for aarch64 failures in yoga/master
14:04:47 frickler add a mention of doing monthly point releases for stable branches in kolla meetings docs page
14:04:47 mnasiadka add more release liaisons
14:05:16 also https://review.opendev.org/c/openstack/kolla/+/844286 with questions for the stable releases
14:05:16 frickler: Ceph jobs failure rates?
14:05:34 mnasiadka: sure, just tell me what to add
14:05:48 ceph came to my mind, too
14:06:31 failure rates at 100% most of the time do not sound good, and it's not true
14:06:35 regarding aarch I had a look too, seems debian has updated qemu which is broken
14:06:43 I don't think this gives the right perspective
14:07:39 yoctozepto: which ones are you talking about?
14:08:18 frickler: kolla-ansible-ubuntu-source
14:08:20 well, all of the graphs show 100% failure rate from time to time
14:08:37 or are these not percent?
14:08:52 I mean, we probably need a different time window
14:09:11 so that one failed job doesn't create a spike to 100%
14:09:52 it's nice to have the graphs but I struggle with their interpretation
14:10:30 iirc the timeframe is 24h, so if there is only one job per day (periodic), it's either 100% or 0%. I'll see if that can be increased to a week or so
14:11:01 even if you go to a 6 months time window - these are still spikes to 100%
14:11:11 https://grafana.opendev.org/d/c0d59dad13/kolla-failure-rate?orgId=1&viewPanel=4&from=now-6M&to=now
14:11:27 the calculation still looks at 24h intervals
14:11:34 right
14:11:56 https://opendev.org/openstack/project-config/src/branch/master/grafana/kolla.yaml#L36
14:12:07 movingAverage(...'24hours')
14:12:14 I'll try to change that
14:12:29 ok, anyway - we need to be able to see some benefit in those dashboards
14:12:35 but thanks for working on them
14:12:51 #action frickler to continue working on Grafana dashboards
14:13:02 maybe once we no longer have large incidents every couple of days, things will look better
14:13:14 hrw is absent, what about those master/yoga failures, are those fixed?
14:13:38 no, aarch debian is completely broken with their qemu update
14:13:51 yeah, whatever happened
14:13:54 cirros did not adapt
14:13:59 I heard ubuntu worked
14:14:13 we can either test with the ubuntu cloud image instead of cirros or discuss whether we can use an older qemu, not from -backports
14:14:32 or somebody can tell me what I need to change in cirros
14:14:44 ok, so it's still in progress
14:14:52 I tested with newer kernel and grub, that didn't help
14:14:55 #action hrw to look for aarch64 failures in yoga/master
14:15:15 frickler add a mention of doing monthly point releases for stable branches in kolla meetings docs page - I've seen a patch in progress with comments
14:15:36 #link https://review.opendev.org/c/openstack/kolla/+/844286
14:16:06 once those questions are answered, I can also create the release patch(es)
14:16:38 commented
14:16:40 let's go on
14:16:50 I haven't requested more release liaisons
14:16:59 #action mnasiadka add more release liaisons
14:17:07 #topic CI status
14:17:26 green after yet another emergency weekend
14:17:31 fantastic
14:17:43 thankfully, me and frickler seem to be workaholics :-)
14:17:52 #topic Release tasks
14:18:10 end me)
14:18:15 and me)
14:18:22 R-18 - nothing on our radar I guess
14:18:42 There was somebody on the list complaining about Kolla 14.0.0 failing to build Ubuntu
14:18:52 Did we fix that and should we release 14.0.1?
14:19:06 mmalchuk: :-)
14:19:17 mnasiadka: frickler will release
14:19:19 I mean
14:19:21 propose
14:19:28 we fixed that indeed
14:19:31 ok
14:20:06 frickler: happy to follow the process we agree to and post new point releases for Yoga?
14:20:07 I've replied, they confirmed that it is fixed
14:20:50 frickler: well, proposing stable releases for all branches in one go was my idea
14:21:09 ehm ...
mnasiadka:
14:21:25 yoctozepto: that's a good idea
14:21:27 yoctozepto: thank you
14:21:31 :D
14:21:32 yeah, in one go - sorry ;-)
14:21:59 #topic Current cycle planning
14:23:17 Ubuntu 22.04 LTS is moving forward-ish - Horizon is broken for Python 3.10 (https://review.opendev.org/c/openstack/horizon/+/830618), MariaDB is broken-ish (but worked around for now)
14:24:38 what's wrong with mariadb?
14:24:43 (just curious)
14:25:14 https://bugs.launchpad.net/ubuntu/+source/mariadb-10.6/+bug/1970634
14:25:30 fix is out, but in kinetic-proposed
14:25:51 (and we're using mariadb packages from Ubuntu repo for now, because upstream MariaDB hasn't built those yet)
14:27:07 Anything on other priorities?
14:27:30 the solution is "disable lto" nice
14:27:34 so much for optimisations
14:27:36 :D
14:27:46 https://etherpad.opendev.org/p/KollaWhiteBoard - L289
14:27:50 mnasiadka: I guess we should discuss letsencrypt
14:27:54 I posted some comments there
14:27:58 please check them out
14:28:00 yoctozepto: yeah well, tell me why mariadb checks kernel version ;-)
14:28:07 (I have, as I promised to do it in May :D )
14:28:24 ok, I'll try to look into it - mgoddard got some cycles to look there as well?
14:29:01 yeah, should do
14:29:36 so let's try to focus on getting this in finally, then we can look at podman
14:30:01 #topic Additional agenda (from whiteboard)
14:30:09 (yoctozepto) https://review.opendev.org/c/openstack/kolla/+/843751/
14:30:45 kevko will also want his proxysql in
14:30:50 we merged the first part
14:31:01 and regarding that patch posted by mnasiadka:
14:31:17 how do we go about removing the admin endpoints?
14:31:32 we did not in yoga, we are leaving users with admin endpoints
14:31:51 for keystone it matters what this points to but we don't have any strategy to apply atm
14:32:07 that's what needs discussion
14:32:45 I would leave it to deployers to decide when to clean up
14:33:01 agree
14:33:16 and it doesn't hurt if they decide not to do it at all
14:33:51 works for me
14:34:14 case solved I guess
14:34:19 but there is a discussion with mgoddard that he wants it differently
14:34:33 let me find that
14:34:56 https://review.opendev.org/c/openstack/kolla-ansible/+/840898
14:34:57 well, we have been removing endpoints in the past (like cinderv2)
14:35:20 I'd prefer that upgraded systems look like fresh deploys, where possible & sensible
14:35:33 what's the argument for leaving the admin endpoints?
14:35:34 yeah, so maybe do it again and be gone with that extra unused endpoint?
14:35:53 mgoddard: I guess the argument is us being lazy - it's still a valid argument!
14:35:56 :D
14:36:21 that patch is about changing the URL for the keystone admin endpoint, that's a different thing
14:36:43 frickler: well, different but very much related
14:36:50 update or remove
14:36:53 that is the question
14:37:02 and I guess then we should remove all admin endpoints
14:37:23 for the other services' admin endpoints, the URL is the same as internal iirc. so they will continue to work
14:37:33 btw, is the keystone admin endpoint still used anywhere? maybe we should run tempest soon to discover this ;d
14:37:40 for the different port, we want to clean up haproxy so it will no longer work
14:37:53 yoctozepto: but we're lazy)
14:37:58 @frickler is that something for ra-rau or me?
14:38:04 frickler: I agree with mgoddard that it's better to clean up
14:38:17 agree with frickler, they are fairly different things
14:38:26 so 1, removing admin endpoints
14:38:41 the argument not to clean up may be that users can be using them without deployers being aware of it
14:38:44 a reason we might not want to do it on upgrade is backwards compat
14:39:06 since it is a local config in clouds.yaml or wherever
14:39:16 what if after upgrade some need a revert?
14:39:23 ok, makes sense
14:39:32 yoctozepto: it was used in keystone client, but it doesn't default to it anymore (at least now)
14:39:32 mmalchuk: not something we can support really
14:39:34 OTOH, we didn't make it optional, so they're essentially running custom endpoints
14:39:42 mnasiadka: awesome
14:40:00 and if they change FQDN or VIP, the admin endpoints will be invalid
14:40:13 users may to some time to update their clients
14:40:19 mgoddard: optional what? (I did not follow)
14:40:29 yoctozepto: optional admin endpoints
14:40:33 s/to/take/
14:41:10 mgoddard: ack
14:41:22 so they can go desync
14:41:27 which is bad
14:41:33 and surprising probably
14:41:37 Kolla-Ansible created those endpoints, Kolla-Ansible should remove them on upgrade - if something breaks - the user can add the endpoint back manually (and therefore they will be treated as something beyond k-a control)
14:41:55 makes sense?
14:42:36 +2
14:42:57 ok. indeed makes sense
14:43:00 also what if I deploy region2
14:43:06 but let's leave yoga alone?
this needs to be advertised better I guess
14:43:18 mgoddard: then you will have region2 deployed
14:43:22 ba-dum-tss
14:43:25 :D
14:43:27 with no admin endpoints
14:43:31 anyways
14:43:56 Yes, let's do it in Zed I think - enough time to understand if fresh deployments without admin endpoints break any project
14:44:02 +1
14:44:23 ok, then I will focus on cleaning up admin endpoints
14:44:58 Ok, enough on that I think :)
14:45:03 Second additional topic is: (frickler) Monthly stable releases https://review.opendev.org/c/openstack/kolla/+/844286
14:45:17 we tackled that already I guess
14:45:23 I think so
14:45:24 just wanted to have the link handy
14:45:42 goodie
14:45:55 #topic Open discussion
14:46:02 Anyone, anything?
14:46:08 please take a look https://review.opendev.org/c/openstack/kayobe/+/840033
14:46:16 mgoddard knows it
14:46:48 bad issue, needs to be backported too
14:47:11 when is the switch from yoga to master due?
14:47:27 * frickler wanting to dump centos :D
14:47:53 also https://review.opendev.org/c/openstack/kolla/+/842472
14:48:07 we need this customized from kayobe
14:48:32 frickler: I think once we have CentOS Stream 9 working, and now we don't.
14:49:11 mnasiadka: seriously? I thought we had a schedule for these things ;-)
14:49:14 so we wait without limit for c9s?
14:49:32 if cs9 is not there, we drop cs8 and live happily without it for the time being
14:49:36 3 months or 3 years, whatever :)
14:49:41 Yeah, just laughing
14:50:53 please +W : https://review.opendev.org/q/I3ab603f7cab7946ea8f2e063fe91190d6592066a
14:51:36 done
14:51:44 thx
14:51:47 ok
14:51:50 #endmeeting
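
Note on the failure-rate graphs (14:08:20 to 14:12:14): the point that a 24h window with one periodic job per day can only read 0% or 100% can be illustrated with a rough sketch. This is plain Python, not the Graphite movingAverage() function used by the dashboard; the failure_rate helper, the run times, and the single failing day are all made-up assumptions for illustration.

```python
from datetime import datetime, timedelta


def failure_rate(results, window_end, window):
    """Percent of failed runs with a timestamp in (window_end - window, window_end].

    `results` is a list of (timestamp, failed) pairs. Hypothetical helper,
    not part of Grafana, Graphite, or Zuul.
    """
    start = window_end - window
    in_window = [failed for ts, failed in results if start < ts <= window_end]
    if not in_window:
        return None  # no runs in the window, so no rate to report
    return 100.0 * sum(in_window) / len(in_window)


# Assumed data: one periodic run per day for a week, only the run on day 3 fails.
base = datetime(2022, 5, 25, 6, 0)
runs = [(base + timedelta(days=d), d == 3) for d in range(7)]

# 24h window ending shortly after the failed run: it contains exactly one run.
daily = failure_rate(runs, base + timedelta(days=3, hours=1), timedelta(hours=24))

# 7-day window ending after the last run: it contains all seven runs.
weekly = failure_rate(runs, base + timedelta(days=6, hours=1), timedelta(days=7))

print(f"24h window: {daily:.1f}%")
print(f"7d window:  {weekly:.1f}%")
```

With this assumed data the 24h window contains a single run, so one failure reads as a full spike, while the 7-day window spreads the same failure over seven runs. Whether a week is the right window for the real dashboard is left open in the discussion above.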