15:01:23 <mgoddard> #startmeeting kolla
15:01:25 <openstack> Meeting started Wed Nov 18 15:01:23 2020 UTC and is due to finish in 60 minutes.  The chair is mgoddard. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:26 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:28 <openstack> The meeting name has been set to 'kolla'
15:01:33 <mgoddard> #topic rollcall
15:02:00 <yoctozepto> o/
15:02:03 <mgoddard> \o
15:02:09 <wuchunyang> o/
15:02:13 <headphoneJames> o/
15:04:06 <mgoddard> #topic agenda
15:04:19 <mgoddard> * Roll-call
15:04:21 <mgoddard> * Announcements
15:04:23 <mgoddard> ** Vote for Kolla Wallaby priorities https://etherpad.opendev.org/p/kolla-wallaby-priorities
15:04:25 <mgoddard> * Review action items from the last meeting
15:04:27 <mgoddard> * CI status
15:04:30 <mgoddard> * Victoria release planning
15:04:31 <mgoddard> * Dockerhub pull rate limits https://etherpad.opendev.org/p/docker-pull-limits
15:04:33 <mgoddard> * Cinder active/active https://bugs.launchpad.net/kolla-ansible/+bug/1904062
15:04:35 <openstack> Launchpad bug 1904062 in kolla-ansible wallaby "external ceph cinder volume config breaks volumes on ussuri upgrade" [High,In progress] - Assigned to Michal Nasiadka (mnasiadka)
15:04:35 <mgoddard> * Wallaby PTG actions
15:04:37 <mgoddard> * Review new retirements (Wallaby)
15:04:39 <mgoddard> * Cinder v2 to be dropped in Wallaby http://lists.openstack.org/pipermail/openstack-discuss/2020-November/018697.html
15:04:41 <mgoddard> #topic announcements
15:04:43 <mgoddard> #info Vote for Kolla Wallaby priorities
15:04:54 <mgoddard> #link https://etherpad.opendev.org/p/kolla-wallaby-priorities
15:05:06 <mgoddard> I'll close the poll at the end of the week
15:05:21 <yoctozepto> ++
15:05:30 <mgoddard> #action mgoddard to email openstack-discuss about final reminder for wallaby priority voting
15:06:31 <mgoddard> Any other announcements?
15:07:05 <wuchunyang> no...
15:07:06 <mgoddard> #topic Review action items from the last meeting
15:07:22 <mgoddard> mgoddard check if escurator broken and mark unbuildable if so
15:07:35 <mgoddard> I think this was resolved
15:07:51 <mgoddard> #topic CI status
15:08:04 <mnasiadka> resolved with a workaround I think, we are still waiting for escurator to make a new release and go back to normal :)
15:08:48 <mgoddard> ok
15:09:15 <mgoddard> [minor] NFV job broken due to lack of Aodh for Tacker/Heat
15:09:39 <mgoddard> I guess that affects master & victoria?
15:12:19 <mgoddard> master/victoria: 2020-10-23 08:02:41.182 6 CRITICAL nova [req-24a27c19-7657-43dc-bb1b-b93223fd8b29 - - - - -] Unhandled error: TypeError: _wrap_socket_sni() got an unexpected keyword argument 'ca_certs' - https://bugs.launchpad.net/nova/+bug/1902696
15:12:20 <openstack> Launchpad bug 1902696 in oslo.messaging "nova-compute fails with Unhandled error: TypeError: _wrap_socket_sni() got an unexpected keyword argument 'ca_certs'" [Undecided,New]
15:13:31 <imtiazc> Is there any plan to support Selinux with Kolla?
15:13:44 <mgoddard> imtiazc: we're in a meeting right now
15:14:49 <mnasiadka> mgoddard: nfv and nova rabbitmq tls bug are two different things?
15:14:55 <mgoddard> yes
15:15:04 <mgoddard> I got no response so moved on...
15:15:56 <yoctozepto> i will take a look at that nfv later this week
15:16:12 <yoctozepto> I just hoped someone else would do it, but it seems there is no real interest in nfv nowadays
15:18:13 <wuchunyang> i have never used nfv before..
15:19:08 <mgoddard> I just commented on https://review.opendev.org/#/c/761194/. I'm not convinced we need to wait for a kombu release
15:19:09 <patchbot> patch 761194 - requirements - Pin kombu and amqp requirements due to incompatibi... - 2 patch sets
15:19:48 <mnasiadka> mgoddard: I think it needs some analysis ;)
15:20:10 <mgoddard> maybe. I'm no requirements expert
15:20:42 <mnasiadka> or maybe it's already fixed, there was a revert somewhere - let's just recheck it and analyse if it fails ;)
15:20:43 <yoctozepto> wuchunyang: me neither
15:20:52 <mnasiadka> yoctozepto: nobody has :)
15:20:54 <mgoddard> https://review.opendev.org/#/c/761519/
15:20:54 <patchbot> patch 761519 - kolla-ansible - Revert "CI: Temporarily disable rabbitmq internal ... - 1 patch set
15:21:14 <yoctozepto> mnasiadka: :D
15:21:45 <mgoddard> ok, move on?
15:21:51 <mnasiadka> mgoddard: sent for recheck, will monitor that
15:21:54 <mnasiadka> yep, move on!
15:21:58 <mgoddard> #topic Victoria release planning
15:22:14 <yoctozepto> mgoddard: you can action me on nfv
15:22:24 <mgoddard> #action yoctozepto fix NFV
15:22:40 <mgoddard> I think we are at the point where we should list the blockers for the Victoria release
15:23:10 <mgoddard> #link https://etherpad.opendev.org/p/KollaWhiteBoard
15:23:14 <mgoddard> L144
15:23:18 <mgoddard> hit me
15:23:27 <mgoddard> RMQ TLS?
15:24:07 <mnasiadka> for sure
15:24:18 <mgoddard> probably not a real blocker
15:24:30 <mgoddard> but would be nice to release the feature working
15:24:53 <mnasiadka> well, we can post a release note that it currently doesn't work due to this bug, and then post a reno when it's fixed
15:26:18 <mgoddard> Any other blockers?
15:26:30 <mgoddard> Cinder active/active
15:27:16 <mnasiadka> no critical bugs in launchpad
15:27:37 <mnasiadka> well, cinder active/active with current config just forces users to rewrite volumes to another cinder-volume agent ;)
15:27:49 <mnasiadka> but we have that somewhere later in agenda I think
15:28:27 <mgoddard> yes
15:28:44 <mgoddard> I don't think we should release without a long-term config though
15:28:53 <mgoddard> https://review.opendev.org/#/c/760308/
15:28:54 <patchbot> patch 760308 - kolla-ansible - kibana: Remove 6.x migration from upgrade - 1 patch set
15:30:04 <yoctozepto> mgoddard: long-term
15:30:04 <yoctozepto> ?
15:30:09 <yoctozepto> mgoddard: you mean to push a-a?
15:30:15 <yoctozepto> or a/a
15:30:56 <mnasiadka> so should we discuss it now, if it's a blocker and how are we planning to make it work?
15:31:25 <mgoddard> I mean let's not release with one config, then recommend changing it once we've worked out how to do it properly
15:31:46 <mgoddard> anyway, it's a later topic
15:32:24 <mgoddard> any other release blockers?
15:33:00 <openstackgerrit> Michal Nasiadka proposed openstack/kolla-ansible master: Revert "CI: Temporarily disable rabbitmq internal tls"  https://review.opendev.org/761519
15:33:47 <mgoddard> I proposed an RC1 release for kayobe, still waiting on it
15:34:46 <mgoddard> https://review.opendev.org/#/c/763022/
15:34:46 <patchbot> patch 763022 - releases - Release Kayobe 9.0.0.0rc1 and branch for Victoria - 1 patch set
15:34:59 <mgoddard> #topic Dockerhub pull rate limits https://etherpad.opendev.org/p/docker-pull-limits
15:35:12 <yoctozepto> what should we do
15:35:17 <yoctozepto> oh what should we doooo
15:35:19 <yoctozepto> :D
15:35:42 <mnasiadka> remove kolla in Docker Hub and deprecate ;)
15:36:26 * mgoddard makes notes in https://etherpad.opendev.org/p/docker-pull-limits
15:37:53 * yoctozepto does too
15:38:03 <yoctozepto> let's look there mnasiadka
15:38:21 <mnasiadka> looking
15:45:46 <mgoddard> yoctozepto, mnasiadka, hrw
15:45:57 <mgoddard> are those the options?
15:46:00 <mgoddard> are there more?
15:46:56 <mnasiadka> there was a proposal to push weekly and encourage users to build their own images, but it doesn't matter if we fail daily or weekly (well matters if we wait another week to push something, and we fail again)
15:47:06 <yoctozepto> mgoddard: 3 is likely extra to me, not a user-faced option
15:47:23 <yoctozepto> mnasiadka: well, limits are mostly on pulls
15:47:23 <mnasiadka> yeah, that's true
15:47:30 <yoctozepto> still, the limits are per ip address
15:47:40 <mgoddard> well if we do 3 then we don't use dockerhub in CI
15:47:44 <yoctozepto> so unless one user downloads more than 100 images at once
15:47:44 <mgoddard> so no limits
15:47:47 <yoctozepto> they should be pretty happy
15:48:11 <yoctozepto> mgoddard: yeah, that's what I started to explore in my mind
15:49:06 <yoctozepto> mgoddard: the general downside of 3 is that we don't yet know how it is supposed to work
15:49:11 <mgoddard> unclear if the registry is designed to hold images between jobs?
15:49:31 <mgoddard> yoctozepto: I had a look at it, I have more of an understanding now
15:51:31 <mgoddard> I think we need some more info from infra about the intermediate registry
15:52:10 <yoctozepto> mgoddard: I looked at it in the past, though I had lots of other stuff on my mind so forgot
15:52:23 <yoctozepto> mgoddard: one is definitely temporary
15:52:28 <yoctozepto> mgoddard: so it would not do
15:52:32 <mgoddard> buildset is temporary
15:52:37 <yoctozepto> mgoddard: I don't remember any permanent one
15:52:49 <mgoddard> intermediate is permanent but not guaranteed to be reliable
15:54:33 <yoctozepto> mgoddard: hah, could you link to the docs?
15:55:02 <mgoddard> #link https://zuul-ci.org/docs/zuul-jobs/docker-image.html
15:55:17 <mgoddard> #link https://docs.opendev.org/opendev/base-jobs/latest/docker-image.html
15:56:18 <mgoddard> I don't think we have an answer, but we do have some options listed
15:56:23 <mgoddard> let's move on
15:56:26 <yoctozepto> mgoddard: but this intermediate is infra-level
15:56:46 <mgoddard> yes
15:56:53 <mgoddard> #topic Cinder active/active https://bugs.launchpad.net/kolla-ansible/+bug/1904062
15:56:56 <openstack> Launchpad bug 1904062 in kolla-ansible wallaby "external ceph cinder volume config breaks volumes on ussuri upgrade" [High,In progress] - Assigned to Michal Nasiadka (mnasiadka)
15:57:46 <mgoddard> What cases do we need to consider?
15:58:01 <mgoddard> Train with backend_host set
15:58:02 <yoctozepto> it's probably best to go real a/a already
15:58:07 <yoctozepto> less playing around
15:58:15 <mgoddard> Ussuri with backend_host set
15:58:16 <yoctozepto> yes, train is supposedly fine
15:58:24 <mgoddard> Ussuri with backend_host not set
15:58:28 <yoctozepto> yup
15:58:48 <yoctozepto> we might want to craft help instructions for currently affected
15:58:52 <mgoddard> So we only need to change ussuri (and beyond)
15:59:06 <yoctozepto> so train->ussuri broken upgraders
15:59:10 <mgoddard> and ensure that an upgrade from train with backend_host set works
15:59:41 <mgoddard> and bear in mind that users may have custom config with backend_host in :)
15:59:49 <yoctozepto> yeah, lovely case
16:00:01 <yoctozepto> we need to use all our channels to make this a known issue
16:00:06 <yoctozepto> which means
16:00:08 <yoctozepto> release notes
16:00:11 <yoctozepto> and mailing list
16:00:12 <openstackgerrit> Victor Chembaev proposed openstack/kolla-ansible master: RabbitMQ handler refactored to restart services in serial  https://review.opendev.org/763137
16:00:19 <yoctozepto> and potentially personal emails
16:00:20 <mgoddard> kolla klub
16:00:23 <yoctozepto> yup
16:01:34 <mgoddard> so I guess first step is to decide on the end state
16:01:43 <mgoddard> cluster=ceph, coordinator enabled
16:01:50 <mgoddard> precheck for a coordinator?
16:02:45 <mnasiadka> yoctozepto: well, we had backend_host in docs only for external ceph, if someone had template overrides for cinder in Train, it should work in Ussuri I think?
16:03:20 <mgoddard> mnasiadka: yes. it should work as well as it ever did
16:03:40 <mgoddard> the problem comes if you try to remove the custom config and use what's in kolla
16:04:00 <mnasiadka> so, we should have a precheck for coordinator backend - and use redis or etcd when they are enabled
16:04:12 <mnasiadka> should we favor one over the other, if both are enabled?
16:04:40 <mgoddard> we do use them if enabled
16:05:22 <mnasiadka> so we just need precheck, if external ceph and cinder-volume group has more than one host
16:05:45 <mnasiadka> or even only if cinder-volume > 1
16:05:52 <mgoddard> yes
16:05:57 <yoctozepto> though we need to make redis and/or etcd robust in ha
16:06:16 <mnasiadka> so we need to test both
16:06:19 <yoctozepto> otherwise it might be better to stick to bad and fugly active-active in active-passive
16:06:29 <mgoddard> ?
16:06:46 <mnasiadka> do we have any CI with coordination backend for any service?
16:07:21 <mgoddard> I don't believe so
16:07:23 <mnasiadka> well, anyway I need to look into why ceph-ansible jobs are failing so often, so we just may enable it there
16:07:47 <mnasiadka> I remember we had some issues with etcd3/etcd3gw, so maybe this is the backend we should look into ;)
16:09:39 <yoctozepto> mnasiadka: etcd3gw is the way to go
16:09:40 <mgoddard> let's wrap up, we're over time
16:09:45 <yoctozepto> the other one breaks w/ eventlet
16:09:52 <yoctozepto> let's continue after
16:09:54 <yoctozepto> thx mgoddard
16:09:58 <mgoddard> #endmeeting
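
Note on the precheck discussed under the Cinder active/active topic: the rule proposed there (fail when the cinder-volume group has more than one host and neither redis nor etcd is enabled) can be sketched in Python as below. This is only an illustration of the decision logic. The function name, its parameters, and the messages are hypothetical and are not taken from kolla-ansible, which would more likely express such a check as Ansible tasks. Which backend to prefer when both are enabled was left open in the meeting, so the sketch simply reports every enabled backend.

```python
from typing import List, Tuple


def cinder_coordination_precheck(
    cinder_volume_hosts: List[str],
    enable_redis: bool,
    enable_etcd: bool,
) -> Tuple[bool, str]:
    """Hypothetical precheck: require a coordination backend when more
    than one cinder-volume host is deployed.

    Returns (ok, message). Names and messages are illustrative only.
    """
    # A single cinder-volume host has nothing to coordinate with.
    if len(cinder_volume_hosts) <= 1:
        return True, "single cinder-volume host, no coordinator required"

    # Collect the coordination backends that are enabled; no preference
    # between them is assumed here.
    backends = []
    if enable_redis:
        backends.append("redis")
    if enable_etcd:
        backends.append("etcd")

    if backends:
        return True, "coordination backend(s) enabled: " + ", ".join(backends)

    return False, (
        "multiple cinder-volume hosts but neither redis nor etcd is enabled"
    )
```

For example, `cinder_coordination_precheck(["ctl1", "ctl2"], False, False)` reports a failure, while the same call with `enable_etcd=True` passes.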