14:00:52 <mnasiadka> #startmeeting kolla
14:00:52 <opendevmeet> Meeting started Wed Jun 1 14:00:52 2022 UTC and is due to finish in 60 minutes. The chair is mnasiadka. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:52 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:52 <opendevmeet> The meeting name has been set to 'kolla'
14:00:52 <frickler> mnasiadka: meeting time?
14:01:00 <frickler> ha, jinx
14:01:03 <frickler> o/
14:01:04 <yoctozepto> o/
14:01:14 <mgoddard> \o
14:01:18 <mnasiadka> #topic rollcall
14:01:27 <mnasiadka> now you can repeat ;-)
14:01:28 <mnasiadka> o/
14:01:31 <mmalchuk> hi
14:03:06 <yoctozepto> let's move on mnasiadka
14:03:13 <mnasiadka> #topic agenda
14:03:23 <mnasiadka> * Announcements
14:03:23 <mnasiadka> * Review action items from the last meeting
14:03:23 <mnasiadka> * CI status
14:03:23 <mnasiadka> * Release tasks
14:03:23 <mnasiadka> * Current cycle planning
14:03:25 <mnasiadka> * Additional agenda (from whiteboard)
14:03:25 <mnasiadka> * Open discussion
14:03:29 <mnasiadka> #topic Announcements
14:03:56 <mnasiadka> OpenInfra Summit next week - but you know that
14:04:01 <mnasiadka> #topic Review action items from the last meeting
14:04:03 <mnasiadka> let's check
14:04:22 <frickler> #link https://grafana.opendev.org/d/c0d59dad13/kolla-failure-rate?orgId=1
14:04:46 <frickler> still waiting on feedback whether to add more jobs or less. also maybe second page for arm jobs?
14:04:47 <mnasiadka> frickler update grafana dashboard
14:04:47 <mnasiadka> hrw to look for aarch64 failures in yoga/master
14:04:47 <mnasiadka> hrw to look for aarch64 failures in yoga/master
14:04:47 <mnasiadka> frickler add a mention of doing monthly point releases for stable branches in kolla meetings docs page
14:04:47 <mnasiadka> mnasiadka add more release liasons
14:05:16 <frickler> also https://review.opendev.org/c/openstack/kolla/+/844286 with questions for the stable releases
14:05:16 <mnasiadka> frickler: Ceph jobs failure rates?
14:05:34 <frickler> mnasiadka: sure, just tell me what to add
14:05:48 <frickler> ceph came to my mind, too
14:06:31 <yoctozepto> failure rates most of the time at 100% do not sound well and it's not true
14:06:35 <frickler> regarding aarch I did a look too, seems debian has updated qemu which is broken
14:06:43 <yoctozepto> I don't think this gives the right perspective
14:07:39 <frickler> yoctozepto: which ones are you talking about?
14:08:18 <yoctozepto> frickler: kolla-ansible-ubuntu-source
14:08:20 <mnasiadka> well, all of the graphs show 100% failure rate from time to time
14:08:37 <mnasiadka> or are these not percent?
14:08:52 <yoctozepto> I mean, we probably need a different timewindow
14:09:11 <yoctozepto> so as not to get one failed jobs create a spike to 100%
14:09:52 <yoctozepto> it's nice to have the graphs but I struggle with their interpretation
14:10:30 <frickler> iirc the timeframe is 24h, so if there is only one job per day (periodic), it's either 100% or 0%. I'll see if that can be increased to a week or so
14:11:01 <mnasiadka> even if you go to 6 months time window - these are still spikes to 100%
14:11:11 <mnasiadka> https://grafana.opendev.org/d/c0d59dad13/kolla-failure-rate?orgId=1&viewPanel=4&from=now-6M&to=now
14:11:27 <frickler> the calculation still looks at 24h intervals
14:11:34 <mnasiadka> right
14:11:56 <frickler> https://opendev.org/openstack/project-config/src/branch/master/grafana/kolla.yaml#L36
14:12:07 <frickler> movingAverage(...'24hours')
14:12:14 <frickler> I'll try to change that
14:12:29 <mnasiadka> ok, anyway - we need to be able to see some benefit in those dashboards
14:12:35 <mnasiadka> but thanks for working on them
14:12:51 <mnasiadka> #action frickler to continue working on Grafana dashboards
14:13:02 <frickler> maybe once we no longer have large incidents every couple of days, things will look better
14:13:14 <mnasiadka> hrw is absent, what about those master/yoga failures, are those fixed?
14:13:38 <frickler> no, aarch debian is completely broken with their qemu update
14:13:51 <yoctozepto> yeah, whatever happened
14:13:54 <yoctozepto> cirros did not adapt
14:13:59 <yoctozepto> I heard ubuntu worked
14:14:13 <frickler> we can either test with ubuntu cloud image instead of cirros or discuss whether we can use older qemu from not -backports
14:14:32 <frickler> or somebody can tell me what I need to change in cirros
14:14:44 <mnasiadka> ok, so it's still in progress
14:14:52 <frickler> I tested with newer kernel and grub, that didn't help
14:14:55 <mnasiadka> #action hrw to look for aarch64 failures in yoga/master
14:15:15 <mnasiadka> frickler add a mention of doing monthly point releases for stable branches in kolla meetings docs page - I've seen a patch in progress with comments
14:15:36 <frickler> #link https://review.opendev.org/c/openstack/kolla/+/844286
14:16:06 <frickler> once those questions are answered, I can also create the release patch(es)
14:16:38 <mnasiadka> commented
14:16:40 <mnasiadka> let's go on
14:16:50 <mnasiadka> I haven't requested more releases liaisons
14:16:59 <mnasiadka> #action mnasiadka add more release liasons
14:17:07 <mnasiadka> #topic CI status
14:17:26 <yoctozepto> green after yet another emergency weekend
14:17:31 <mnasiadka> fantastic
14:17:43 <yoctozepto> thankfully, me and frickler seem to be workaholics :-)
14:17:52 <mnasiadka> #topic Release tasks
14:18:10 <mmalchuk> end me)
14:18:15 <mmalchuk> and me)
14:18:22 <mnasiadka> R-18 - nothing on our radar I guess
14:18:42 <mnasiadka> There was somebody on the list complaining about Kolla 14.0.0 failing to build Ubuntu
14:18:52 <mnasiadka> Did we fix that and should release 14.0.1?
14:19:06 <yoctozepto> mmalchuk: :-)
14:19:17 <yoctozepto> mnasiadka: frickler will release
14:19:19 <yoctozepto> I mean
14:19:21 <yoctozepto> propose
14:19:28 <yoctozepto> we fixed that indeed
14:19:31 <mnasiadka> ok
14:20:06 <mnasiadka> frickler: happy to follow the process we agree to and post new point releases for Yoga?
14:20:07 <priteau> I've replied, they confirmed that it is fixed
14:20:50 <frickler> frickler: well I'd propose stable releases for all branches in one go was my idea
14:21:09 <frickler> ehm ... mnasiadka:
14:21:25 <yoctozepto> yoctozepto: that's a good idea
14:21:27 <yoctozepto> yoctozepto: thank you
14:21:31 <yoctozepto> :D
14:21:32 <mnasiadka> yeah, in one go - sorry ;-)
14:21:59 <mnasiadka> #topic Current cycle planning
14:23:17 <mnasiadka> Ubuntu 22.04 LTS is moving forward-ish - Horizon is broken for Python 3.10 (https://review.opendev.org/c/openstack/horizon/+/830618), MariaDB is broken-ish (but workarounded for now)
14:24:38 <yoctozepto> what's wrong with mariadb?
14:24:43 <yoctozepto> (just curious)
14:25:14 <mnasiadka> https://bugs.launchpad.net/ubuntu/+source/mariadb-10.6/+bug/1970634
14:25:30 <mnasiadka> fix is out, but in kinetic-proposed
14:25:51 <mnasiadka> (and we're using mariadb packages from Ubuntu repo for now, because upstream MariaDB hasn't built those yet)
14:27:07 <mnasiadka> Anything on other priorities?
14:27:30 <yoctozepto> the solution is "disable lto" nice
14:27:34 <yoctozepto> so much for optimisations
14:27:36 <yoctozepto> :D
14:27:46 <mnasiadka> https://etherpad.opendev.org/p/KollaWhiteBoard - L289
14:27:50 <yoctozepto> mnasiadka: I guess we should discuss letsencrypt
14:27:54 <yoctozepto> I posted some comments there
14:27:58 <yoctozepto> please check them out
14:28:00 <mnasiadka> yoctozepto: yeah well, tell me why mariadb checks kernel version ;-)
14:28:07 <yoctozepto> (I have as I promised to do it May :D )
14:28:24 <mnasiadka> ok, I'll try to look into it - mgoddard got some cycles to look there as well?
14:29:01 <mgoddard> yeah, should do
14:29:36 <mnasiadka> so let's try to focus on getting this in finally, then we can look at podman
14:30:01 <mnasiadka> #topic Additional agenda (from whiteboard)
14:30:09 <mnasiadka> (yoctozepto) https://review.opendev.org/c/openstack/kolla/+/843751/
14:30:45 <yoctozepto> kevko will also want his proxysql in
14:30:50 <yoctozepto> we merged the first part
14:31:01 <yoctozepto> and regarding that patch posted by mnasiadka:
14:31:17 <yoctozepto> how do we go about removing the admin endpoints?
14:31:32 <yoctozepto> we did not in yoga, we are leaving users with admin endpoints
14:31:51 <yoctozepto> for keystone it matters what this points to but we don't have any strategy atm to apply
14:32:07 <yoctozepto> that's what needs discussion
14:32:45 <frickler> I would leave it to deployers to decide when to clean up
14:33:01 <mmalchuk> agree
14:33:16 <frickler> and it doesn't hurt if they decide not to do it at all
14:33:51 <yoctozepto> works for me
14:34:14 <mnasiadka> case solved I guess
14:34:19 <yoctozepto> but there is a discussion with mgoddard that he wants it differently
14:34:33 <yoctozepto> let me find that
14:34:56 <yoctozepto> https://review.opendev.org/c/openstack/kolla-ansible/+/840898
14:34:57 <mnasiadka> well, we have been removing endpoints in the past (like cinderv2)
14:35:20 <mgoddard> I'd prefer that upgraded systems look like fresh deploys, where possible & sensible
14:35:33 <mgoddard> what's the argument for leaving the admin endpoints?
14:35:34 <yoctozepto> yeah, so maybe do it again and be gone with that extra unused endpoint?
14:35:53 <yoctozepto> mgoddard: I guess the argument is us being lazy - it's still a valid argument!
14:35:56 <yoctozepto> :D
14:36:21 <frickler> that patch is about changing the URL for the keystone admin endpoint, that's a different thing
14:36:43 <yoctozepto> frickler: well, different but very much related
14:36:50 <yoctozepto> update or remove
14:36:53 <yoctozepto> that is the question
14:37:02 <yoctozepto> and I guess then we should remove all admin endpoints
14:37:23 <frickler> for the other service's admin endpoints, the URL is the same as internal iirc. so they will continue to work
14:37:33 <yoctozepto> btw, is the keystone admin endpoint still used anywhere? maybe we should run tempest soon to discover this ;d
14:37:40 <frickler> for the different port, we want to cleanup haproxy so it will no longer work
14:37:53 <mmalchuk> yoctozepto: but we'r lazy)
14:37:58 <tibeer> @frickler is that something for ra-rau or me?
14:38:04 <yoctozepto> frickler: I agree with mgoddard that it's better to clean up
14:38:17 <mgoddard> agree with frickler, they are fairly different things
14:38:26 <mgoddard> so 1, removing admin endpoints
14:38:41 <frickler> the argument not to clean up may be that users can be using them without deployers being aware of it
14:38:44 <mgoddard> a reason we might not want to do it on upgrade is backwards compat
14:39:06 <frickler> since it is a local config in clouds.yaml or whereever
14:39:16 <mmalchuk> what if after upgrade some need a revert?
14:39:23 <yoctozepto> ok, makes sense
14:39:32 <mnasiadka> yoctozepto: it was used in keystone client, but it doesn't default to it anymore (at least now)
14:39:32 <yoctozepto> mmalchuk: not something we can support really
14:39:34 <mgoddard> OTOH, we didn't make it optional, so they're essentially running custom endpoints
14:39:42 <yoctozepto> mnasiadka: awesome
14:40:00 <mgoddard> and if they change FQDN or VIP, the admin endpoints will be invalid
14:40:13 <frickler> users may to some time to update their clients
14:40:19 <yoctozepto> mgoddard: optional what? (I did not follow)
14:40:29 <mgoddard> yoctozepto: optional admin endpoints
14:40:33 <frickler> s/to/take/
14:41:10 <yoctozepto> mgoddard: ack
14:41:22 <yoctozepto> so they can go desync
14:41:27 <yoctozepto> which is bad
14:41:33 <yoctozepto> and surprising probably
14:41:37 <mnasiadka> Kolla-Ansible created those endpoints, Kolla-Ansible should remove them on upgrade - if something breaks - the user can add the endpoint back manually (and therefore they will be treated as something beyond k-a control)
14:41:55 <mnasiadka> makes sense?
14:42:36 <yoctozepto> +2
14:42:57 <mmalchuk> ok. indeed make sense
14:43:00 <mgoddard> also what if I deploy region2
14:43:06 <yoctozepto> but let's leave yoga alone? this needs to be advertised better I guess
14:43:18 <yoctozepto> mgoddard: then you will have region2 deployed
14:43:22 <yoctozepto> ba-dum-tss
14:43:25 <yoctozepto> :D
14:43:27 <mgoddard> with no admin endpoints
14:43:31 <mgoddard> anyways
14:43:56 <mnasiadka> Yes, let's do it in Zed I think - enough time to understand if fresh deployments without admin endpoints break any project
14:44:02 <mgoddard> +1
14:44:23 <yoctozepto> ok, then I will focus on cleaning up admin endpoints
14:44:58 <mnasiadka> Ok, enough on that I think :)
14:45:03 <mnasiadka> Second additional topic is: (frickler) Monthly stable releases https://review.opendev.org/c/openstack/kolla/+/844286
14:45:17 <frickler> we tackled that already I guess
14:45:23 <mnasiadka> I think so
14:45:24 <frickler> just wanted to have the link handy
14:45:42 <mnasiadka> goodie
14:45:55 <mnasiadka> #topic Open discussion
14:46:02 <mnasiadka> Anyone, anything?
14:46:08 <mmalchuk> please take a look https://review.opendev.org/c/openstack/kayobe/+/840033
14:46:16 <mmalchuk> mgoddard know it
14:46:48 <mmalchuk> bad issue need to be backported too
14:47:11 <frickler> when is the switch from yoga to master due?
14:47:27 * frickler wanting to dump centos :D
14:47:53 <mmalchuk> also https://review.opendev.org/c/openstack/kolla/+/842472
14:48:07 <mmalchuk> we need this customized from kayobe
14:48:32 <mnasiadka> frickler: I think once we have CentOS Stream 9 working, and now we don't.
14:49:11 <yoctozepto> mnasiadka: seriously? I thought we had a schedule for these things ;-)
14:49:14 <frickler> so we wait without limit for c9s?
14:49:32 <yoctozepto> if cs9 is not there, we drop cs8 and live happily without it for the time being
14:49:36 <mnasiadka> 3 months or 3 years, whatever :)
14:49:41 <mnasiadka> Yeah, just laughing
14:50:53 <mmalchuk> please +W : https://review.opendev.org/q/I3ab603f7cab7946ea8f2e063fe91190d6592066a
14:51:36 <mnasiadka> done
14:51:44 <mmalchuk> thx
14:51:47 <mnasiadka> ok
14:51:50 <mnasiadka> #endmeeting
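
Note on the failure-rate graphs discussed under "Review action items from the last meeting": the point made there is that with a 24h window and a single periodic run per day, each data point can only be 0% or 100%. The following is a minimal sketch of that arithmetic, not the dashboard's actual query. The run history and the helper `windowed_failure_rate` are made up for illustration, and it does not reproduce Graphite's `movingAverage` exactly.

```python
from datetime import datetime, timedelta


def windowed_failure_rate(runs, end, window):
    """Percent of failed runs with end - window < timestamp <= end.

    `runs` is a list of (timestamp, failed) pairs. Returns None when the
    window contains no runs at all.
    """
    in_window = [failed for ts, failed in runs if end - window < ts <= end]
    if not in_window:
        return None
    return 100.0 * sum(in_window) / len(in_window)


# Hypothetical history: one periodic run per day for 7 days,
# with a single failure on the fourth day.
start = datetime(2022, 5, 25, 6, 0)
runs = [(start + timedelta(days=d), d == 3) for d in range(7)]

# A 24h window evaluated at each run sees exactly one run.
daily = [windowed_failure_rate(runs, ts, timedelta(hours=24)) for ts, _ in runs]

# A 7-day window evaluated at the last run sees all seven runs.
weekly = windowed_failure_rate(runs, runs[-1][0], timedelta(days=7))

print(daily)   # each value is either 0.0 or 100.0
print(weekly)  # one failure out of seven runs
```

Under these assumptions a wider window turns the single failure into a small bump instead of a spike to 100%, which is the effect the proposed change to a week-long interval is after.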