#openstack-kolla log

14:00:52 <mnasiadka> #startmeeting kolla
14:00:52 <opendevmeet> Meeting started Wed Jun  1 14:00:52 2022 UTC and is due to finish in 60 minutes.  The chair is mnasiadka. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:52 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:52 <opendevmeet> The meeting name has been set to 'kolla'
14:00:52 <frickler> mnasiadka: meeting time?
14:01:00 <frickler> ha, jinx
14:01:03 <frickler> o/
14:01:04 <yoctozepto> o/
14:01:14 <mgoddard> \o
14:01:18 <mnasiadka> #topic rollcall
14:01:27 <mnasiadka> now you can repeat ;-)
14:01:28 <mnasiadka> o/
14:01:31 <mmalchuk> hi
14:03:06 <yoctozepto> let's move on mnasiadka
14:03:13 <mnasiadka> #topic agenda
14:03:23 <mnasiadka> * Announcements
14:03:23 <mnasiadka> * Review action items from the last meeting
14:03:23 <mnasiadka> * CI status
14:03:23 <mnasiadka> * Release tasks
14:03:23 <mnasiadka> * Current cycle planning
14:03:25 <mnasiadka> * Additional agenda (from whiteboard)
14:03:25 <mnasiadka> * Open discussion
14:03:29 <mnasiadka> #topic Announcements
14:03:56 <mnasiadka> OpenInfra Summit next week - but you know that
14:04:01 <mnasiadka> #topic Review action items from the last meeting
14:04:03 <mnasiadka> let's check
14:04:22 <frickler> #link https://grafana.opendev.org/d/c0d59dad13/kolla-failure-rate?orgId=1
14:04:46 <frickler> still waiting on feedback whether to add more jobs or less. also maybe second page for arm jobs?
14:04:47 <mnasiadka> frickler update grafana dashboard
14:04:47 <mnasiadka> hrw to look for aarch64 failures in yoga/master
14:04:47 <mnasiadka> frickler add a mention of doing monthly point releases for stable branches in kolla meetings docs page
14:04:47 <mnasiadka> mnasiadka add more release liaisons
14:05:16 <frickler> also https://review.opendev.org/c/openstack/kolla/+/844286 with questions for the stable releases
14:05:16 <mnasiadka> frickler: Ceph jobs failure rates?
14:05:34 <frickler> mnasiadka: sure, just tell me what to add
14:05:48 <frickler> ceph came to my mind, too
14:06:31 <yoctozepto> failure rates at 100% most of the time do not sound good, and it's not true
14:06:35 <frickler> regarding aarch I did a look too, seems debian has updated qemu which is broken
14:06:43 <yoctozepto> I don't think this gives the right perspective
14:07:39 <frickler> yoctozepto: which ones are you talking about?
14:08:18 <yoctozepto> frickler: kolla-ansible-ubuntu-source
14:08:20 <mnasiadka> well, all of the graphs show 100% failure rate from time to time
14:08:37 <mnasiadka> or are these not percent?
14:08:52 <yoctozepto> I mean, we probably need a different timewindow
14:09:11 <yoctozepto> so that one failed job does not create a spike to 100%
14:09:52 <yoctozepto> it's nice to have the graphs but I struggle with their interpretation
14:10:30 <frickler> iirc the timeframe is 24h, so if there is only one job per day (periodic), it's either 100% or 0%. I'll see if that can be increased to a week or so
14:11:01 <mnasiadka> even if you go to 6 months time window - these are still spikes to 100%
14:11:11 <mnasiadka> https://grafana.opendev.org/d/c0d59dad13/kolla-failure-rate?orgId=1&viewPanel=4&from=now-6M&to=now
14:11:27 <frickler> the calculation still looks at 24h intervals
14:11:34 <mnasiadka> right
14:11:56 <frickler> https://opendev.org/openstack/project-config/src/branch/master/grafana/kolla.yaml#L36
14:12:07 <frickler> movingAverage(...'24hours')
14:12:14 <frickler> I'll try to change that
14:12:29 <mnasiadka> ok, anyway - we need to be able to see some benefit in those dashboards
14:12:35 <mnasiadka> but thanks for working on them
14:12:51 <mnasiadka> #action frickler to continue working on Grafana dashboards
14:13:02 <frickler> maybe once we no longer have large incidents every couple of days, things will look better
14:13:14 <mnasiadka> hrw is absent, what about those master/yoga failures, are those fixed?
14:13:38 <frickler> no, aarch debian is completely broken with their qemu update
14:13:51 <yoctozepto> yeah, whatever happened
14:13:54 <yoctozepto> cirros did not adapt
14:13:59 <yoctozepto> I heard ubuntu worked
14:14:13 <frickler> we can either test with ubuntu cloud image instead of cirros or discuss whether we can use older qemu from not -backports
14:14:32 <frickler> or somebody can tell me what I need to change in cirros
14:14:44 <mnasiadka> ok, so it's still in progress
14:14:52 <frickler> I tested with newer kernel and grub, that didn't help
14:14:55 <mnasiadka> #action hrw to look for aarch64 failures in yoga/master
14:15:15 <mnasiadka> frickler add a mention of doing monthly point releases for stable branches in kolla meetings docs page - I've seen a patch in progress with comments
14:15:36 <frickler> #link https://review.opendev.org/c/openstack/kolla/+/844286
14:16:06 <frickler> once those questions are answered, I can also create the release patch(es)
14:16:38 <mnasiadka> commented
14:16:40 <mnasiadka> let's go on
14:16:50 <mnasiadka> I haven't requested more release liaisons
14:16:59 <mnasiadka> #action mnasiadka add more release liaisons
14:17:07 <mnasiadka> #topic CI status
14:17:26 <yoctozepto> green after yet another emergency weekend
14:17:31 <mnasiadka> fantastic
14:17:43 <yoctozepto> thankfully, me and frickler seem to be workaholics :-)
14:17:52 <mnasiadka> #topic Release tasks
14:18:10 <mmalchuk> end me)
14:18:15 <mmalchuk> and me)
14:18:22 <mnasiadka> R-18 - nothing on our radar I guess
14:18:42 <mnasiadka> There was somebody on the list complaining about Kolla 14.0.0 failing to build Ubuntu
14:18:52 <mnasiadka> Did we fix that and should release 14.0.1?
14:19:06 <yoctozepto> mmalchuk: :-)
14:19:17 <yoctozepto> mnasiadka: frickler will release
14:19:19 <yoctozepto> I mean
14:19:21 <yoctozepto> propose
14:19:28 <yoctozepto> we fixed that indeed
14:19:31 <mnasiadka> ok
14:20:06 <mnasiadka> frickler: happy to follow the process we agree to and post new point releases for Yoga?
14:20:07 <priteau> I've replied, they confirmed that it is fixed
14:20:50 <frickler> frickler: well I'd propose stable releases for all branches in one go was my idea
14:21:09 <frickler> ehm ... mnasiadka:
14:21:25 <yoctozepto> yoctozepto: that's a good idea
14:21:27 <yoctozepto> yoctozepto: thank you
14:21:31 <yoctozepto> :D
14:21:32 <mnasiadka> yeah, in one go - sorry ;-)
14:21:59 <mnasiadka> #topic Current cycle planning
14:23:17 <mnasiadka> Ubuntu 22.04 LTS is moving forward-ish - Horizon is broken for Python 3.10 (https://review.opendev.org/c/openstack/horizon/+/830618), MariaDB is broken-ish (but worked around for now)
14:24:38 <yoctozepto> what's wrong with mariadb?
14:24:43 <yoctozepto> (just curious)
14:25:14 <mnasiadka> https://bugs.launchpad.net/ubuntu/+source/mariadb-10.6/+bug/1970634
14:25:30 <mnasiadka> fix is out, but in kinetic-proposed
14:25:51 <mnasiadka> (and we're using mariadb packages from Ubuntu repo for now, because upstream MariaDB hasn't built those yet)
14:27:07 <mnasiadka> Anything on other priorities?
14:27:30 <yoctozepto> the solution is "disable lto" nice
14:27:34 <yoctozepto> so much for optimisations
14:27:36 <yoctozepto> :D
14:27:46 <mnasiadka> https://etherpad.opendev.org/p/KollaWhiteBoard - L289
14:27:50 <yoctozepto> mnasiadka: I guess we should discuss letsencrypt
14:27:54 <yoctozepto> I posted some comments there
14:27:58 <yoctozepto> please check them out
14:28:00 <mnasiadka> yoctozepto: yeah well, tell me why mariadb checks kernel version ;-)
14:28:07 <yoctozepto> (I have, as I promised to do it in May :D )
14:28:24 <mnasiadka> ok, I'll try to look into it - mgoddard got some cycles to look there as well?
14:29:01 <mgoddard> yeah, should do
14:29:36 <mnasiadka> so let's try to focus on getting this in finally, then we can look at podman
14:30:01 <mnasiadka> #topic Additional agenda (from whiteboard)
14:30:09 <mnasiadka> (yoctozepto) https://review.opendev.org/c/openstack/kolla/+/843751/
14:30:45 <yoctozepto> kevko will also want his proxysql in
14:30:50 <yoctozepto> we merged the first part
14:31:01 <yoctozepto> and regarding that patch posted by mnasiadka:
14:31:17 <yoctozepto> how do we go about removing the admin endpoints?
14:31:32 <yoctozepto> we did not in yoga, we are leaving users with admin endpoints
14:31:51 <yoctozepto> for keystone it matters what this points to but we don't have any strategy atm to apply
14:32:07 <yoctozepto> that's what needs discussion
14:32:45 <frickler> I would leave it to deployers to decide when to clean up
14:33:01 <mmalchuk> agree
14:33:16 <frickler> and it doesn't hurt if they decide not to do it at all
14:33:51 <yoctozepto> works for me
14:34:14 <mnasiadka> case solved I guess
14:34:19 <yoctozepto> but there is a discussion with mgoddard that he wants it differently
14:34:33 <yoctozepto> let me find that
14:34:56 <yoctozepto> https://review.opendev.org/c/openstack/kolla-ansible/+/840898
14:34:57 <mnasiadka> well, we have been removing endpoints in the past (like cinderv2)
14:35:20 <mgoddard> I'd prefer that upgraded systems look like fresh deploys, where possible & sensible
14:35:33 <mgoddard> what's the argument for leaving the admin endpoints?
14:35:34 <yoctozepto> yeah, so maybe do it again and be gone with that extra unused endpoint?
14:35:53 <yoctozepto> mgoddard: I guess the argument is us being lazy - it's still a valid argument!
14:35:56 <yoctozepto> :D
14:36:21 <frickler> that patch is about changing the URL for the keystone admin endpoint, that's a different thing
14:36:43 <yoctozepto> frickler: well, different but very much related
14:36:50 <yoctozepto> update or remove
14:36:53 <yoctozepto> that is the question
14:37:02 <yoctozepto> and I guess then we should remove all admin endpoints
14:37:23 <frickler> for the other service's admin endpoints, the URL is the same as internal iirc. so they will continue to work
14:37:33 <yoctozepto> btw, is the keystone admin endpoint still used anywhere? maybe we should run tempest soon to discover this ;d
14:37:40 <frickler> for the different port, we want to cleanup haproxy so it will no longer work
14:37:53 <mmalchuk> yoctozepto: but we're lazy)
14:37:58 <tibeer> @frickler is that something for ra-rau or me?
14:38:04 <yoctozepto> frickler: I agree with mgoddard that it's better to clean up
14:38:17 <mgoddard> agree with frickler, they are fairly different things
14:38:26 <mgoddard> so 1, removing admin endpoints
14:38:41 <frickler> the argument not to clean up may be that users can be using them without deployers being aware of it
14:38:44 <mgoddard> a reason we might not want to do it on upgrade is backwards compat
14:39:06 <frickler> since it is a local config in clouds.yaml or wherever
14:39:16 <mmalchuk> what if after upgrade some need a revert?
14:39:23 <yoctozepto> ok, makes sense
14:39:32 <mnasiadka> yoctozepto: it was used in keystone client, but it doesn't default to it anymore (at least now)
14:39:32 <yoctozepto> mmalchuk: not something we can support really
14:39:34 <mgoddard> OTOH, we didn't make it optional, so they're essentially running custom endpoints
14:39:42 <yoctozepto> mnasiadka:  awesome
14:40:00 <mgoddard> and if they change FQDN or VIP, the admin endpoints will be invalid
14:40:13 <frickler> users may to some time to update their clients
14:40:19 <yoctozepto> mgoddard: optional what? (I did not follow)
14:40:29 <mgoddard> yoctozepto: optional admin endpoints
14:40:33 <frickler> s/to/take/
14:41:10 <yoctozepto> mgoddard: ack
14:41:22 <yoctozepto> so they can go desync
14:41:27 <yoctozepto> which is bad
14:41:33 <yoctozepto> and surprising probably
14:41:37 <mnasiadka> Kolla-Ansible created those endpoints, Kolla-Ansible should remove them on upgrade - if something breaks - the user can add the endpoint back manually (and therefore they will be treated as something beyond k-a control)
14:41:55 <mnasiadka> makes sense?
14:42:36 <yoctozepto> +2
14:42:57 <mmalchuk> ok. indeed makes sense
14:43:00 <mgoddard> also what if I deploy region2
14:43:06 <yoctozepto> but let's leave yoga alone? this needs to be advertised better I guess
14:43:18 <yoctozepto> mgoddard: then you will have region2 deployed
14:43:22 <yoctozepto> ba-dum-tss
14:43:25 <yoctozepto> :D
14:43:27 <mgoddard> with no admin endpoints
14:43:31 <mgoddard> anyways
14:43:56 <mnasiadka> Yes, let's do it in Zed I think - enough time to understand if fresh deployments without admin endpoints break any project
14:44:02 <mgoddard> +1
14:44:23 <yoctozepto> ok, then I will focus on cleaning up admin endpoints
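A minimal sketch of the cleanup agreed above, assuming the endpoint catalogue is available as a list of records with `service`, `interface` and `url` fields. The sample data and the helper `split_admin_endpoints` are hypothetical; this is not how Kolla-Ansible implements the removal.

```python
def split_admin_endpoints(endpoints):
    """Split endpoint records into those to keep and the admin ones to remove."""
    keep = [e for e in endpoints if e["interface"] != "admin"]
    remove = [e for e in endpoints if e["interface"] == "admin"]
    return keep, remove


# Hypothetical catalogue of an upgraded deployment (URLs are placeholders).
catalogue = [
    {"service": "keystone", "interface": "public", "url": "https://cloud.example.com:5000"},
    {"service": "keystone", "interface": "internal", "url": "http://10.0.0.10:5000"},
    {"service": "keystone", "interface": "admin", "url": "http://10.0.0.10:35357"},
    {"service": "glance", "interface": "internal", "url": "http://10.0.0.10:9292"},
    {"service": "glance", "interface": "admin", "url": "http://10.0.0.10:9292"},
]

keep, remove = split_admin_endpoints(catalogue)

for endpoint in remove:
    print("would remove:", endpoint["service"], endpoint["url"])
```

A deployer who still depends on an admin endpoint could re-create it manually afterwards, at which point it is outside Kolla-Ansible's control, as mnasiadka notes.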
14:44:58 <mnasiadka> Ok, enough on that I think :)
14:45:03 <mnasiadka> Second additional topic is: (frickler) Monthly stable releases https://review.opendev.org/c/openstack/kolla/+/844286
14:45:17 <frickler> we tackled that already I guess
14:45:23 <mnasiadka> I think so
14:45:24 <frickler> just wanted to have the link handy
14:45:42 <mnasiadka> goodie
14:45:55 <mnasiadka> #topic Open discussion
14:46:02 <mnasiadka> Anyone, anything?
14:46:08 <mmalchuk> please take a look https://review.opendev.org/c/openstack/kayobe/+/840033
14:46:16 <mmalchuk> mgoddard knows about it
14:46:48 <mmalchuk> it's a bad issue, the fix needs to be backported too
14:47:11 <frickler> when is the switch from yoga to master due?
14:47:27 * frickler wanting to dump centos :D
14:47:53 <mmalchuk> also https://review.opendev.org/c/openstack/kolla/+/842472
14:48:07 <mmalchuk> we need this customized from kayobe
14:48:32 <mnasiadka> frickler: I think once we have CentOS Stream 9 working, and now we don't.
14:49:11 <yoctozepto> mnasiadka: seriously? I thought we had a schedule for these things ;-)
14:49:14 <frickler> so we wait without limit for c9s?
14:49:32 <yoctozepto> if cs9 is not there, we drop cs8 and live happily without it for the time being
14:49:36 <mnasiadka> 3 months or 3 years, whatever :)
14:49:41 <mnasiadka> Yeah, just laughing
14:50:53 <mmalchuk> please +W : https://review.opendev.org/q/I3ab603f7cab7946ea8f2e063fe91190d6592066a
14:51:36 <mnasiadka> done
14:51:44 <mmalchuk> thx
14:51:47 <mnasiadka> ok
14:51:50 <mnasiadka> #endmeeting