15:01:09 #startmeeting kolla
15:01:09 Meeting started Wed Aug 11 15:01:09 2021 UTC and is due to finish in 60 minutes. The chair is mgoddard. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:09 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:09 The meeting name has been set to 'kolla'
15:01:19 mgoddard mnasiadka hrw egonzalez yoctozepto rafaelweingartne cosmicsound osmanlicilegi bbezak parallax Fl1nt
15:01:21 ^ meeting
15:01:25 #topic rollcall
15:01:42 \o
15:02:11 o/
15:03:11 |o|
15:04:14 o/
15:05:38 #topic agenda
15:05:46 * Roll-call
15:05:48 * Agenda
15:05:50 * Announcements
15:05:52 ** TC & PTL election season looms http://lists.openstack.org/pipermail/openstack-discuss/2021-August/024093.html
15:05:54 * Review action items from the last meeting
15:05:56 * CI status
15:05:58 ** Discussion on general issues
15:06:00 * Release tasks
15:06:02 * Xena cycle planning
15:06:04 ** Xena feature prioritisation https://docs.google.com/spreadsheets/d/1BuVMwP8eLnOVJDX8f3Nb6hCrNcNpRQl57T2ENU9Xao8
15:06:06 ** Clean up old priorities from the whiteboard to get fresher look at it
15:06:08 * Kolla operator pain points https://etherpad.opendev.org/p/pain-point-elimination
15:06:10 * Open discussion
15:06:12 ** Kolla Ansible Framework and its QA https://etherpad.opendev.org/p/kolla-ansible-framework
15:06:14 #topic announcements
15:06:16 #info TC & PTL election season looms
15:06:18 #link http://lists.openstack.org/pipermail/openstack-discuss/2021-August/024093.html
15:06:37 Nominations start on 17th August
15:06:53 I will not run again for PTL
15:07:15 You will always stay our BDFL mgoddard
15:07:33 is there no escape? :)
15:07:35 (in our hearts)
15:09:18 <3
15:09:34 small plug: I will be participating in tomorrow's opendev.live session on ironic
15:09:45 yeah, I will be watching you
15:10:38 psst: let's increase the tempo a bit; we are already 10 mins into the meeting and no meat still :-)
15:11:25 (I also wonder where mnasiadka is :-) )
15:11:27 #link https://www.youtube.com/channel/UCQ74G2gKXdpwZkXEsclzcrA
15:11:44 #topic Review action items from the last meeting
15:11:55 none
15:12:10 #topic CI status
15:12:52 I updated it today for k and k-a
15:12:59 not sure about kayobe
15:13:05 anything new?
15:13:28 not really, I deleted some old stuff; zun seems to be failing on ubuntu
15:13:31 but that's it
15:13:47 I wanted to discuss general issues we have listed at the top in the whiteboard
15:13:56 (this is the extra subtopic for CI status)
15:14:01 I think Kayobe is green at the moment
15:14:25 +1
15:14:30 Are ussuri and train still red?
15:14:44 unsure, probably not
15:15:10 There are no open patches against them
15:15:25 last ussuri runs look ok
15:15:25 no periodic jobs to check?
15:16:08 No periodic
15:16:10 no
15:16:20 go ahead yoctozepto
15:16:48 ok
15:17:11 I have collected 5 general issues
15:17:24 i.e. mostly job-type-independent
15:17:39 the first one is SIGPIPE rc=-13 error
15:17:48 I have never got this outside of CI
15:17:53 have you?
15:18:10 not that I remember
15:18:12 (for the notes please see the whiteboard; I will not be repasting them in the chat)
15:18:24 priteau and you?
15:18:35 (nobody else to ask today)
15:18:46 osmanlicilegi is here
15:19:24 sorry! missed due to the same colour as priteau in my client :-)
15:19:26 yoctozepto: I don't think I have ever seen this error
15:19:38 yoctozepto: do you have some proposal for it?
15:19:49 yeah, I'm pretty sure it's something really weird going on in the CI
15:20:00 mgoddard: no, for this issue I'm just collecting others' feedback
15:20:07 not much to work on
15:20:15 ok, let's move on to the next issue
15:20:25 failing pulls
15:20:31 this has a fix
15:20:34 it has been discussed
15:20:39 so just go and merge :-)
15:20:59 yoctozepto: I'm trying to catch up on what I've missed for a while :/
15:20:59 It happens outside of CI so it's legit and fixing it is nice for end-users as well
15:21:20 osmanlicilegi: sure; the question was whether you have ever seen the rc=-13 error from kolla-ansible when running it locally
15:21:35 osmanlicilegi: so it's independent of the upstream knowledge :-)
15:22:01 never seen this but I'll recheck
15:22:04 any thoughts on issue #2? if not, let's move on to #3.
15:22:17 osmanlicilegi: thanks, no sweat :-)
15:22:54 so for issue #2 just merge the proposal
15:23:04 and issue #3 is about weird attach behaviour
15:23:20 this has not happened to me in prod but seems more legit than issue #1
15:23:28 yoctozepto: retry is good. Never seen it with local registry but improved CI stability will be good
15:23:28 though it's probably an upstream bug
15:24:06 did you manage to reproduce the attachment issue?
15:24:52 mgoddard: not locally; but it seems to be luck-based so one more try could trigger it
15:24:57 but one has to stop somewhere lol
15:25:06 it's repeatable in CI though
15:25:15 I have a proposal that can be rechecked to trigger it
15:25:21 as it only runs the affected jobs
15:25:58 well, at least I *had*
15:26:01 waiting longer didn't help?
15:26:06 nope, it did not
15:26:12 it seems like some process just does not complete
15:26:16 no errors, no nothing
15:26:20 the volume gets stuck
15:26:24 only ever seen this on ubuntu
15:26:32 and the cinder backend is irrelevant
15:26:43 (always nice to be able to blame ceph but not this time, fellas)
15:27:10 raise with cinder?
15:27:28 yeah, that is my proposal as well
15:27:32 just enquiring first
15:27:36 ok, thanks
15:28:02 oh, I see I switched the order a bit
15:28:16 so that was actually issue #4
15:28:21 issue #3 is about the logging mess
15:28:45 I guess it's obvious we need a volunteer for better logs
15:29:00 can we link to any related proposals? do any come to mind?
15:30:19 there are a few patches proposed for logging
15:30:52 it depends which part of the mess you want to fix
15:31:34 well, at least the one making it hard to find issues in the log files themselves
15:31:49 please just add any links you deem valuable
15:32:24 nothing else to discuss for this particular issue
15:32:31 as for issue #5:
15:32:42 do we agree to ignore this in CI?
15:32:57 because otherwise we would have to nicely stop all the services in the proper order
15:33:07 and only then start them again
15:33:14 how many retries do we use?
15:33:47 that's a good question; more precisely: how many retries does placement use in this config
15:34:04 the issue is that keystone goes down during upgrade
15:34:16 and haproxy also needs to pick it back up once it's alive
15:34:29 so a few retries should help
15:34:33 perhaps it's not doing enough
15:34:35 let's check
15:34:40 https://docs.openstack.org/placement/wallaby/configuration/config.html#keystone_authtoken.http_request_max_retries
15:35:20 raise to 5 in CI?
15:35:24 or 6
15:35:42 could do
15:35:52 or maybe we just ignore it :)
15:36:22 mgoddard: I'll try with that bump
15:36:35 I have only ever seen placement and (more rarely) neutron hit this
15:36:42 they are the most talkative, it seems
15:36:48 for whatever reason
15:36:55 could use bumping for all core services
15:37:01 all right
15:37:03 the plan is there
15:37:10 thank you for the fruitful discussion
15:37:16 precisely what I wanted :-)
15:38:57 thanks yoctozepto for bringing it up
15:39:04 my pleasure!
15:39:07 #topic Release tasks
15:39:17 Finally, it is R-8
15:39:26 :O
15:39:27 so according to https://docs.openstack.org/kolla/latest/contributor/release-management.html
15:39:41 we must Switch binary images to current release
15:39:57 would anyone like to do it?
15:40:25 "like" is a strong word :-)
15:41:09 I can give it a try if no one else is already on it
15:41:21 I have other stuff to do as well but I guess it won't hurt me to bump; though I'm looking forward to broader participation :-)
15:41:33 oh, priteau already volunteered, good!
15:41:51 #action priteau to Switch binary images to current release
15:41:53 thanks priteau
15:41:57 thanks ++
15:42:10 we need to think about cycle highlights soon
15:42:13 but it can wait
15:42:19 yeah, next meeting
15:42:24 let's add it to the agenda though
15:42:30 I will add it then
15:42:31 #topic Clean up old priorities from the whiteboard to get fresher look at it
15:42:38 mine again
15:42:48 I suggest we simply clean all completed ones
15:42:58 simple topic :-)
15:43:03 just gathering your approval
15:44:24 makes sense
15:44:35 the list wasn't really updated for xena
15:44:43 yup
15:44:50 ok, so I can clean this up
15:44:54 no problem
15:45:01 you can action me on it
15:45:11 #action clean up whiteboard priorities
15:45:27 Pierre Riteau proposed openstack/kolla master: [release] Use UCA Xena https://review.opendev.org/c/openstack/kolla/+/804268
15:45:33 CentOS Stream 9...
15:45:51 we should probably check in with RDO on that one
15:46:38 it's cutting it quite fine for a major upgrade
15:46:49 argh, noez
15:46:58 I forgot we are awaiting a landslide
15:47:08 #action mgoddard check in with RDO re CS9
15:47:37 #topic Kolla operator pain points https://etherpad.opendev.org/p/pain-point-elimination
15:48:08 I don't see any new ones since last time
15:48:27 me neither
15:48:30 #topic Kolla Ansible Framework and its QA https://etherpad.opendev.org/p/kolla-ansible-framework
15:48:47 so, I have done a larger writeup
15:49:12 the gist is that we should do some core-involving exercises to maintain a better posture :-)
15:49:32 my goal for today is to share this with you
15:49:44 and ask you for collaboration
15:50:10 Pierre Riteau proposed openstack/kolla master: [release] Use RDO master Delorean packages https://review.opendev.org/c/openstack/kolla/+/804269
15:50:11 who would want to drive/discuss this with me? (even independently of our general meetings, so as not to eat up their time)
15:50:56 I see only mgoddard is lurking in the etherpad so not many ppl to ask :-)
15:51:13 Sorry, was working on my action :P
15:51:38 priteau: no problem
15:52:29 it makes sense to me to expand this model
15:53:10 it could easily be used for DB setup, check-containers.yml, etc.
15:53:31 config would be more work
15:53:42 but we could start simple with config.json
15:54:19 yeah, we have config-check in the works (on me), but config itself, especially config.json, is a nice candidate
15:54:43 so, as you figured, the work is threefold: document, refactor, test
15:55:00 but first, obviously, decide on the scope etc.
15:55:17 put any relevant ideas somewhere in that etherpad
15:55:18 any time
15:55:35 So it's about moving more K-A code into a framework, so there is less duplication of code between roles? With, ultimately, services just being defined as a dict of their config?
15:55:52 Merged openstack/kolla-ansible stable/wallaby: Extra var ironic_enable_keystone_integration added. https://review.opendev.org/c/openstack/kolla-ansible/+/804087
15:56:36 yoctozepto: missing me?
15:56:44 priteau: +/- yeah
15:56:56 but for starters, to clean up what we already have and decide on the next steps
15:57:00 Merged openstack/kolla-ansible stable/wallaby: ironic: Follow up for ironic_enable_keystone_integration https://review.opendev.org/c/openstack/kolla-ansible/+/804161
15:57:02 which ultimately could work as you described
15:57:12 I suppose my question would be, how do we prioritise this against other work?
15:57:54 tough nut; I can drive most of this because it feels important to me
15:58:11 but I need you for discussion and review, of course
15:58:20 bring your ideas, thoughts, comments
15:58:33 mnasiadka: I'm always missing fellow cores :-)
15:58:52 *you = you all
16:00:50 all right, we are past time unfortunately :-(
16:00:57 indeed
16:01:05 thanks for driving discussions today yoctozepto
16:01:15 mgoddard: you are welcome; thanks for chairing
16:01:41 #endmeeting
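Reference note on the retries bump discussed at 15:33-15:36: the keystone_authtoken.http_request_max_retries option (default 3, per the placement docs linked in the log) can be raised through Kolla Ansible's standard service config override mechanism. A minimal sketch follows, assuming the default node_custom_config path of /etc/kolla/config; the value 5 simply mirrors the number floated in the meeting, not a tested recommendation.

    # /etc/kolla/config/placement.conf (merged into the placement service config by kolla-ansible)
    [keystone_authtoken]
    # Default is 3; allow a few extra retries so requests survive keystone
    # restarting behind haproxy during an upgrade.
    http_request_max_retries = 5

The same override could be dropped in for other core services' .conf files if, as suggested in the meeting, the bump turns out to be worth applying more broadly.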