15:01:23 #startmeeting kolla
15:01:25 Meeting started Wed Nov 18 15:01:23 2020 UTC and is due to finish in 60 minutes. The chair is mgoddard. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:26 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:28 The meeting name has been set to 'kolla'
15:01:33 #topic rollcall
15:02:00 o/
15:02:03 \o
15:02:09 o/
15:02:13 o/
15:04:06 #topic agenda
15:04:19 * Roll-call
15:04:21 * Announcements
15:04:23 ** Vote for Kolla Wallaby priorities https://etherpad.opendev.org/p/kolla-wallaby-priorities
15:04:25 * Review action items from the last meeting
15:04:27 * CI status
15:04:30 * Victoria release planning
15:04:31 * Dockerhub pull rate limits https://etherpad.opendev.org/p/docker-pull-limits
15:04:33 * Cinder active/active https://bugs.launchpad.net/kolla-ansible/+bug/1904062
15:04:35 Launchpad bug 1904062 in kolla-ansible wallaby "external ceph cinder volume config breaks volumes on ussuri upgrade" [High,In progress] - Assigned to Michal Nasiadka (mnasiadka)
15:04:35 * Wallaby PTG actions
15:04:37 * Review new retirements (Wallaby)
15:04:39 * Cinder v2 to be dropped in Wallaby http://lists.openstack.org/pipermail/openstack-discuss/2020-November/018697.html
15:04:41 #topic announcements
15:04:43 #info Vote for Kolla Wallaby priorities
15:04:54 #link https://etherpad.opendev.org/p/kolla-wallaby-priorities
15:05:06 I'll close the poll at the end of the week
15:05:21 ++
15:05:30 #action mgoddard to email openstack-discuss about final reminder for wallaby priority voting
15:06:31 Any other announcements?
15:07:05 no...
15:07:06 #topic Review action items from the last meeting
15:07:22 mgoddard check if escurator broken and mark unbuildable if so
15:07:35 I think this was resolved
15:07:51 #topic CI status
15:08:04 resolved with a workaround I think, we are still waiting for escurator to make a new release and go back to normal :)
15:08:48 ok
15:09:15 [minor] NFV job broken due to lack of Aodh for Tacker/Heat
15:09:39 I guess that affects master & victoria?
15:12:19 master/victoria: 2020-10-23 08:02:41.182 6 CRITICAL nova [req-24a27c19-7657-43dc-bb1b-b93223fd8b29 - - - - -] Unhandled error: TypeError: _wrap_socket_sni() got an unexpected keyword argument 'ca_certs' - https://bugs.launchpad.net/nova/+bug/1902696
15:12:20 Launchpad bug 1902696 in oslo.messaging "nova-compute fails with Unhandled error: TypeError: _wrap_socket_sni() got an unexpected keyword argument 'ca_certs'" [Undecided,New]
15:13:31 Is there any plan to support SELinux with Kolla?
15:13:44 imtiazc: we're in a meeting right now
15:14:49 mgoddard: nfv and nova rabbitmq tls bug are two different things?
15:14:55 yes
15:15:04 I got no response so moved on...
15:15:56 i will take a look at that nfv later this week
15:16:12 just hoped someone else would do but it seems no real interest in nfv nowadays
15:18:13 i have never used nfv before..
15:19:08 I just commented on https://review.opendev.org/#/c/761194/. I'm not convinced we need to wait for a kombu release
15:19:09 patch 761194 - requirements - Pin kombu and amqp requirements due to incompatibi... - 2 patch sets
15:19:48 mgoddard: I think it needs some analysis ;)
15:20:10 maybe. I'm no requirements expert
15:20:42 or maybe it's already fixed, there was a revert somewhere - let's just recheck it and analyse if it fails ;)
15:20:43 wuchunyang: me neither
15:20:52 yoctozepto: nobody has :)
15:20:54 https://review.opendev.org/#/c/761519/
15:20:54 patch 761519 - kolla-ansible - Revert "CI: Temporarily disable rabbitmq internal ... - 1 patch set
15:21:14 mnasiadka: :D
15:21:45 ok, move on?
15:21:51 mgoddard: sent for recheck, will monitor that
15:21:54 yep, move on!
15:21:58 #topic Victoria release planning
15:22:14 mgoddard: you can action me on nfv
15:22:24 #action yoctozepto fix NFV
15:22:40 I think we are at the point where we should list the blockers for the Victoria release
15:23:10 #link https://etherpad.opendev.org/p/KollaWhiteBoard
15:23:14 L144
15:23:18 hit me
15:23:27 RMQ TLS?
15:24:07 for sure
15:24:18 probably not a real blocker
15:24:30 but would be nice to release the feature working
15:24:53 well, we can post a release note that it currently doesn't work due to this bug, and then post a reno when it's fixed
15:26:18 Any other blockers?
15:26:30 Cinder active/active
15:27:16 no critical bugs in launchpad
15:27:37 well, cinder active/active with current config just forces users to rewrite volumes to another cinder-volume agent ;)
15:27:49 but we have that somewhere later in agenda I think
15:28:27 yes
15:28:44 I don't think we should release without a long-term config though
15:28:53 https://review.opendev.org/#/c/760308/
15:28:54 patch 760308 - kolla-ansible - kibana: Remove 6.x migration from upgrade - 1 patch set
15:30:04 mgoddard: long-term
15:30:04 ?
15:30:09 mgoddard: you mean to push a-a?
15:30:15 or a/a
15:30:56 so should we discuss it now, if it's a blocker and how are we planning to make it work?
15:31:25 I mean let's not release with one config, then recommend changing it once we've worked out how to do it properly
15:31:46 anyway, it's a later topic
15:32:24 any other release blockers?
15:33:00 Michal Nasiadka proposed openstack/kolla-ansible master: Revert "CI: Temporarily disable rabbitmq internal tls" https://review.opendev.org/761519
15:33:47 I proposed an RC1 release for kayobe, still waiting on it
15:34:46 https://review.opendev.org/#/c/763022/
15:34:46 patch 763022 - releases - Release Kayobe 9.0.0.0rc1 and branch for Victoria - 1 patch set
15:34:59 #topic Dockerhub pull rate limits https://etherpad.opendev.org/p/docker-pull-limits
15:35:12 what should we do
15:35:17 oh what should we doooo
15:35:19 :D
15:35:42 remove kolla in Docker Hub and deprecate ;)
15:36:26 * mgoddard makes notes in https://etherpad.opendev.org/p/docker-pull-limits
15:37:53 * yoctozepto does too
15:38:03 let's look there mnasiadka
15:38:21 looking
15:45:46 yoctozepto, mnasiadka, hrw
15:45:57 are those the options?
15:46:00 are there more?
15:46:56 there was a proposal to push weekly and encourage users to build their own images, but it doesn't matter if we fail daily or weekly (well, it matters if we wait another week to push something, and we fail again)
15:47:06 mgoddard: 3 is likely extra to me, not a user-facing option
15:47:23 mnasiadka: well, limits are mostly on pulls
15:47:23 yeah, that's true
15:47:30 still, the limits are per ip address
15:47:40 well if we do 3 then we don't use dockerhub in CI
15:47:44 so unless one user downloads more than 100 images at once
15:47:44 so no limits
15:47:47 they should be pretty happy
15:48:11 mgoddard: yeah, that's what I started to explore in my mind
15:49:06 mgoddard: the general downside of 3 is that we don't yet know how it is supposed to work
15:49:11 unclear if the registry is designed to hold images between jobs?
15:49:31 yoctozepto: I had a look at it, I have more of an understanding now
15:51:31 I think we need some more info from infra about the intermediate registry
15:52:10 mgoddard: I looked at it in the past, though I had lots of other stuff on my mind so forgot
15:52:23 mgoddard: one is definitely temporary
15:52:28 mgoddard: so it would not do
15:52:32 buildset is temporary
15:52:37 mgoddard: not remember any permanent
15:52:49 intermediate is permanent but not guaranteed to be reliable
15:54:33 mgoddard: hah, could you link to the docs?
15:55:02 #link https://zuul-ci.org/docs/zuul-jobs/docker-image.html
15:55:17 #link https://docs.opendev.org/opendev/base-jobs/latest/docker-image.html
15:56:18 I don't think we have an answer, but we do have some options listed
15:56:23 let's move on
15:56:26 mgoddard: but this intermediate is infra-level
15:56:46 yes
15:56:53 #topic Cinder active/active https://bugs.launchpad.net/kolla-ansible/+bug/1904062
15:56:56 Launchpad bug 1904062 in kolla-ansible wallaby "external ceph cinder volume config breaks volumes on ussuri upgrade" [High,In progress] - Assigned to Michal Nasiadka (mnasiadka)
15:57:46 What cases do we need to consider?
15:58:01 Train with backend_host set
15:58:02 it's probably best to go real a/a already
15:58:07 less playing around
15:58:15 Ussuri with backend_host set
15:58:16 yes, train is supposedly fine
15:58:24 Ussuri with backend_host not set
15:58:28 yup
15:58:48 we might want to craft help instructions for currently affected
15:58:52 So we only need to change ussuri (and beyond)
15:59:06 so train->ussuri broken upgraders
15:59:10 and ensure that an upgrade from train with backend_host set works
15:59:41 and bear in mind that users may have custom config with backend_host in :)
15:59:49 yeah, lovely case
16:00:01 we need to use all our channels to make this a known issue
16:00:06 which means
16:00:08 release notes
16:00:11 and mailing list
16:00:12 Victor Chembaev proposed openstack/kolla-ansible master: RabbitMQ handler refactored to restart services in serial https://review.opendev.org/763137
16:00:19 and potentially personal emails
16:00:20 kolla klub
16:00:23 yup
16:01:34 so I guess first step is to decide on the end state
16:01:43 cluster=ceph, coordinator enabled
16:01:50 precheck for a coordinator?
16:02:45 yoctozepto: well, we had backend_host in docs only for external ceph, if someone had template overrides for cinder in Train, it should work in Ussuri I think?
16:03:20 mnasiadka: yes. it should work as well as it ever did
16:03:40 the problem comes if you try to remove the custom config and use what's in kolla
16:04:00 so, we should have a precheck for coordinator backend - and use redis or etcd when they are enabled
16:04:12 should we favor one over the other, if both are enabled?
16:04:40 we do use them if enabled
16:05:22 so we just need precheck, if external ceph and cinder-volume group has more than one host
16:05:45 or even only if cinder-volume > 1
16:05:52 yes
16:05:57 though we need to make redis and/or etcd robust in ha
16:06:16 so we need to test both
16:06:19 otherwise it might be better to stick to bad and fugly active-active in active-passive
16:06:29 ?
16:06:46 do we have any CI with coordination backend for any service?
16:07:21 I don't believe so
16:07:23 well, anyway I need to look into why ceph-ansible jobs are failing so often, so we just may enable it there
16:07:47 I remember we had some issues with etcd3/etcd3gw, so maybe this is the backend we should look into ;)
16:09:39 mnasiadka: etcd3gw is the way to go
16:09:40 let's wrap up, we're over time
16:09:45 the other one breaks w/ eventlet
16:09:52 let's continue after
16:09:54 thx mgoddard
16:09:58 #endmeeting
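
The precheck discussed under the Cinder active/active topic (fail when the cinder-volume group has more than one host and no coordination backend is enabled) can be sketched as below. This is only an illustration in Python under stated assumptions, not kolla-ansible code: the function names, the host names and the two boolean flags are hypothetical, and the redis-before-etcd preference is an assumption, since the meeting left open which backend to favor when both are enabled.

```python
from typing import List, Optional


def pick_coordination_backend(enable_redis: bool, enable_etcd: bool) -> Optional[str]:
    """Return the coordination backend to use, or None if neither is enabled."""
    # Assumption: redis is preferred when both are enabled. The meeting asked
    # this question but did not settle it.
    if enable_redis:
        return "redis"
    if enable_etcd:
        return "etcd"
    return None


def precheck_cinder_coordination(
    cinder_volume_hosts: List[str],
    enable_redis: bool,
    enable_etcd: bool,
) -> List[str]:
    """Return a list of precheck errors; an empty list means the check passed."""
    errors = []
    backend = pick_coordination_backend(enable_redis, enable_etcd)
    # Rule from the discussion: more than one cinder-volume host needs a coordinator.
    if len(cinder_volume_hosts) > 1 and backend is None:
        errors.append(
            "cinder-volume runs on %d hosts but no coordination backend "
            "(redis or etcd) is enabled" % len(cinder_volume_hosts)
        )
    return errors


if __name__ == "__main__":
    # Hypothetical inventory: two cinder-volume hosts, no coordinator enabled.
    for error in precheck_cinder_coordination(["storage01", "storage02"], False, False):
        print("PRECHECK FAILED:", error)
```

A single cinder-volume host passes regardless of the flags, which matches the "only if cinder-volume > 1" variant raised in the meeting.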