15:00:25 #startmeeting tc
15:00:25 Meeting started Thu Mar 17 15:00:25 2022 UTC and is due to finish in 60 minutes. The chair is gmann. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:25 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:25 The meeting name has been set to 'tc'
15:00:28 #topic Roll call
15:00:30 o/
15:00:37 o/
15:00:43 o/
15:00:51 o/
15:00:59 o/
15:01:08 o/
15:01:30 o/
15:02:01 #link https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee
15:02:04 Today's agenda ^^
15:02:13 let's start
15:02:16 #topic Follow up on past action items
15:02:20 no action items from last meeting
15:02:29 #topic Gate health check
15:02:41 any news on the gate?
15:03:21 there was a creeping failure in the nova-ceph-multistore job,
15:03:36 which was OOMing mysql, which I hopefully fixed by trimming down
15:03:42 o/
15:03:45 that affected a few projects at least
15:04:03 +1
15:04:14 I know we've still got the volume detach failure thing going on with the centos jobs, at seemingly 100%, but those aren't voting anywhere that I know of
15:04:25 yeah, I am able to make the rescue server test pass with the SSH-able server fix, but there are two more tests failing #link https://review.opendev.org/c/openstack/tempest/+/831608
15:04:50 oh I thought all the sshable fixes were merged
15:05:06 oh right, I remember this one now, nevermind
15:05:07 even after unrescuing the server we need to wait for it to be ssh-ready before the detach happens
15:05:25 this breaks non-centos jobs now right?
15:05:27 we upgraded gitea yesterday, and this exposed a regression with pip install of git remotes from it, which impacted blazar's jobs because they were still configured to try zuul v2 era zuul-cloner workflows
15:05:29 yeah, not all. the rescue test was seen as a failure in the reported bug, but there are a few more
15:05:50 dansmith: no, they will pass. i did recheck.
15:06:07 gmann: oh, then why haven't we merged it yet?
15:06:07 i've got a fix in the pipe to address the pip install errors, but it also exposed that their jobs are sorely in need of modernizing
15:06:22 dansmith: I was trying to be a little smart by having an active server, then rescue/unrescue, and then SSH, but that did not work
15:06:38 dansmith: It just passed yesterday might so will merge after gate pass :)
15:06:42 oh okay
15:06:43 *yesterday night
15:07:28 fungi: you mean their jobs are also on zuul v2?
15:08:08 On a related note, the Cinder team is starting a renewed effort to get people to not do 'naked rechecks'.
15:08:19 oh very nice
15:08:21 +1
15:08:25 \o/
15:08:27 gmann: more that their jobs hadn't been touched since the zuul v2 days, so were trying to use the zuul-cloner tool to check out openstack/nova, and because that's not a thing any longer they were falling back to pip install nova from a git remote
15:08:38 jungleboyj: I try to shame people when I find them doing that, with evidence that it's wrong :)
15:08:50 * jungleboyj isn't surprised
15:08:54 :-)
15:09:01 dansmith: we expect no less of you
15:09:07 but it works ;)
15:09:07 slaweq had a good script to collect recheck numbers.
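(On the "wait for ssh-ready before detach" fix discussed above: the general pattern is to poll the guest's SSH port until it answers before issuing the detach. A minimal illustrative sketch follows; the server/volume calls shown are hypothetical, and tempest's actual fix in review 831608 uses its own validation helpers rather than this code.)

```python
# Minimal sketch of the "wait until the server is SSH-able before detaching"
# pattern discussed above. Illustrative only: tempest's real fix
# (https://review.opendev.org/c/openstack/tempest/+/831608) uses its own
# validation helpers, and the server/volume calls below are hypothetical.
import socket
import time


def wait_for_ssh(host, port=22, timeout=300, interval=5):
    """Poll until a TCP connection to the SSH port succeeds, or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return  # the guest is accepting SSH connections
        except OSError:
            time.sleep(interval)
    raise TimeoutError(f"{host}:{port} not reachable after {timeout}s")


# Hypothetical usage after an unrescue, before a volume detach:
# server.unrescue()
# wait_for_ssh(server.floating_ip)  # guest must be up or the detach can race
# volume.detach()
```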
15:09:11 speaking about rechecks, I prepared some data as we talked about last week
15:09:17 *hides*
15:09:20 yeah, I was coming to that
15:09:20 https://docs.google.com/spreadsheets/d/1zlJixCttF1e7ZSJdZzORfrJWRqsllRDn97FHyTr11e8/edit?usp=sharing
15:09:30 here is the data for the "main" repositories
15:09:54 and if you want it for each repo in openstack/ there is a tar.gz file http://kaplonski.pl/files/openstack_rechecks_data.tar.gz
15:10:20 I collected data for each repo in openstack from the last 365 days
15:10:40 Oh that's a pretty cool visualization.
15:10:49 very nice.
15:10:56 I'm not really sure what the numbers are though
15:11:11 the numbers there are basically the average number of rechecks done on the last PS before it was merged
15:11:13 is it like <2 rechecks per week for most projects? or am I reading it wrongly?
15:11:13 is this rechecks or just build failures?
15:11:16 dansmith: Ok, good. Not just me.
15:11:24 averaged for every week
15:11:25 slaweq: ah okay
15:11:30 Ah ...
15:11:43 slaweq: so higher numbers potentially mean recheck grinding to get a patch in?
15:11:52 dansmith: right
15:11:59 ohk so it is per patch, not all jobs
15:12:13 it's per patch, averaged per week
15:12:13 then 1 or 2 is still a high number per patch
15:12:15 So, on average every other patch has to go through a recheck to merge.
15:12:24 jungleboyj: yeah
15:12:35 Wow.
15:12:36 is that just for gate pipeline failures, or check as well?
15:12:44 good question ^^
15:12:45 fungi: both
15:13:01 check probably has most of them, but yeah, gate also has rechecks
15:13:08 so given that the patch has to pass once in check and once in gate in order to merge
15:13:08 I was basically counting "recheck" comments on the last patch sets
15:13:31 of course there may be some patches where rechecks were done "on purpose"
15:13:44 but in general I think it's not a very common practice
15:13:57 Fair assumption.
15:14:01 also note that long patch series and/or depends-on can skew this, since one change failing can cause all the ones which rely on it to also fail
15:14:14 slaweq: ohk so on the last PS, not all rechecks on that commit?
15:14:28 fungi: true, it's not an ideal metric for sure
15:14:54 Oh wow, if it is just the last patch set, then the actual number of rechecks per patch could be higher.
15:14:56 gmann: yes, I was counting only the last patch set, as I assumed that if that PS finally merged, it means it was good
15:15:01 yeah, DNM/testing patches are also in that, but that is ok
15:15:21 DNM patches aren't in that metric. I was filtering for only merged patches
15:15:29 k, +1
15:15:42 jungleboyj: but remember that this includes check, so it includes the "surely the problem isn't my patch, oh i guess maybe it is?" rechecks too
15:16:02 or it would if you included patch sets before the final one
15:16:09 I've been using this script and metric in neutron for some time now, and even if it's not great, it shows us the current state of the Neutron CI pretty clearly :)
15:16:25 ++
15:16:32 Some data is better than no data.
15:16:36 yeah I think this probably gives us a good view of how much rechecking needs to happen to get something to land,
15:16:42 https://github.com/slawqo/tools/blob/master/rechecks/rechecks.py
15:16:46 that is the script
15:16:48 slaweq: and would it be a lot of data if we collected all rechecks, including those on patch sets before the merged one?
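(A rough sketch of the kind of query behind those numbers, assuming the Gerrit REST API on review.opendev.org: for each recently merged change, count Zuul "Build failed" comments on the final patch set, then average. slaweq's actual rechecks.py differs in its details; this only approximates the approach.)

```python
# Approximate sketch of the metric discussed above: for recently merged
# changes in a project, count "Build failed" comments on the last patch set.
# This is not slaweq's rechecks.py, just an illustration of the approach.
import json
import urllib.request

GERRIT = "https://review.opendev.org"


def gerrit_get(path):
    # Gerrit prefixes its JSON responses with a ")]}'" line; strip it.
    with urllib.request.urlopen(GERRIT + path) as resp:
        return json.loads(resp.read().decode().split("\n", 1)[1])


def failed_builds_on_last_ps(change):
    msgs = change.get("messages", [])
    if not msgs:
        return 0
    last_ps = max(m.get("_revision_number", 0) for m in msgs)
    return sum(1 for m in msgs
               if m.get("_revision_number") == last_ps
               and "Build failed" in m.get("message", ""))


# Merged neutron changes updated in the last 7 days, with their comments.
changes = gerrit_get("/changes/?q=project:openstack/neutron+status:merged"
                     "+-age:7d&o=MESSAGES&n=100")
avg = sum(failed_builds_on_last_ps(c) for c in changes) / max(len(changes), 1)
print(f"average failed builds on the final PS this week: {avg:.2f}")
```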
15:16:59 but it probably needs a bit more to tell us more than that, like whether individual patches are actually rechecked more than the average, etc
15:17:00 dansmith: yeah
15:17:10 but as a heartbeat sort of thing, if the graph goes up -> bad
15:18:19 dansmith: I can prepare some data "per patch" too
15:18:33 I will need to modify that script but it shouldn't be hard
15:18:40 yeah, and we can ignore patches with just 1 recheck or so if there is too much data
15:19:05 slaweq: I'm not asking you to do that, just suggesting, but yeah, always nice to have data and more data :)
15:19:05 slaweq: just to make sure, is it just 'recheck', not recheck with a reason?
15:19:24 or both
15:19:40 gmann: actually it is counting the number of comments like "Build failed" on the last PS
15:19:49 https://github.com/slawqo/tools/blob/master/rechecks/rechecks.py#L155
15:19:53 I really hate that we've drifted away from "recheck with reason" .. I wish we could encourage better behavior there sometime
15:19:56 *somehow
15:20:22 The same way you should be descriptive with your commit message, you should be descriptive with why you are rechecking.
15:20:38 diablo_rojo: I agree, and I never do naked rechecks, but I'm in the minority
15:20:45 "recheck with reason" has always been optional. people who are inclined to look into and record why they're rechecking something will do it regardless of whether it's expected, and people who don't want to bother will make something up like they did back when we enforced it
15:20:53 I know I am not innocent when it comes to rechecks without explicitly saying why.
15:20:53 dansmith: sure, I will modify it to have data "per patch"
15:20:55 I just kind of assumed it triggered off the comment just being "recheck"
15:20:56 and I was shamed for my use of shame, so.. lollipops? :)
15:21:04 I don't know if it will be for next week, but I will do that
15:21:17 * jungleboyj is guilty as well.
15:21:19 fungi: yeah I know
15:21:24 My guess is it's a training issue. We see just "recheck" so we use just "recheck"
15:21:33 Sure it's optional, but it would be better if we made it the majority rather than a minority.
15:21:39 slaweq: no hurry. I am going to add this recheck script/data topic at the PTG, and we can discuss what data we want to monitor per week in zed
15:21:42 Wouldn't want dansmith feeling lonely, you know.
15:21:54 * dansmith sobs uncontrollably
15:22:03 per-week data will be easiest, as we monitor weekly, so it will be easy to check whether for all patches or per patch
15:22:05 gmann: sure, great idea. I will be more than happy to help with that
15:22:06 spotz_: it used to be required, but people would just "recheck foo" or "recheck bug 00000"
15:22:07 There there ...
15:22:11 i agree it's a good practice, but it's not a good source of data unfortunately, because of the number of people who knowingly pollute it
15:22:35 * diablo_rojo hands dansmith a handkerchief "there there"
15:22:44 slaweq: thanks for this.
15:22:45 maybe we should try to encourage PTLs to push the better behavior in their teams
15:23:01 dansmith: That is where Cinder is starting.
15:23:06 slaweq, yes, thanks for the data. I look forward to the per-patch info!
15:23:13 sure, how? in the TC+PTL sessions or on the ML?
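(On the "naked recheck" vs "recheck with reason" distinction: if anyone did want to mine the comments, a classifier is trivial to sketch, though as fungi notes the stated reasons themselves can't be trusted. The helper below is hypothetical, not part of rechecks.py or any other existing tool.)

```python
# Hypothetical helper for classifying recheck comments when mining Gerrit
# data: distinguishes a bare "recheck" from "recheck <reason>". As noted
# above, a stated reason may still be noise ("recheck foo"), so this only
# measures form, not sincerity.
import re

BARE = re.compile(r"^\s*recheck\s*$", re.IGNORECASE)
WITH_REASON = re.compile(r"^\s*recheck\s+(?P<reason>\S.*)",
                         re.IGNORECASE | re.DOTALL)


def classify(comment):
    """Return ('naked', None), ('with-reason', reason), or ('other', None)."""
    if BARE.match(comment):
        return ("naked", None)
    match = WITH_REASON.match(comment)
    if match:
        return ("with-reason", match.group("reason").strip())
    return ("other", None)


assert classify("recheck") == ("naked", None)
assert classify("recheck bug 00000") == ("with-reason", "bug 00000")
assert classify("Build failed (check pipeline).") == ("other", None)
```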
15:23:17 jungleboyj: ack, well, let's try to spread that
15:23:25 gmann: yeah, we could start in the PTG session
15:23:28 +1
15:23:37 I will add it
15:23:44 yeah, in neutron we are trying to do "recheck with reason" too, but it's not always easy
15:23:46 gmann: cool
15:23:50 +2
15:23:56 and I also don't do it sometimes :/
15:24:04 but I will try to do better :)
15:24:12 #action gmann to add recheck data topic in PTG etherpad (TC and TC+PTL for awareness)
15:24:12 slaweq: be the change.. be the change.. :P
15:24:22 +100
15:24:32 this is our advice in cinder: https://docs.openstack.org/cinder/latest/contributor/gerrit.html#ci-job-rechecks
15:24:33 dansmith: yes sir! :D
15:24:44 :)
15:24:48 heheh
15:24:51 24 minutes in and still on gate, eh?
15:24:54 just putting it out there, because i don't know that we are generating machine-parseable comments
15:25:42 It is our favorite topic, dansmith
15:25:42 rosmaita: "recheck I don't know but at least I looked" is better to me than nothing
15:25:57 fungi: coming back to the blazar issue, do you have a link for that job, or know if they are working to fix their side?
15:26:54 priteau is working on it, but it was jobs for blazar-nova specifically
15:26:57 rosmaita: I sometimes see machine-generated comments (not recheck) and those annoy me more than anything
15:27:08 fungi: ok.
15:27:42 i have tried to get our third-party ci to add the appropriate gerrit tag so they don't pollute the comments, but you can see how much success i have had
15:27:55 and as frickler pointed out today, I pushed moving the l-c job to focal/py38, but there are existing config errors in that field which need to be fixed
15:28:45 gmann: an old tools/tox_install.sh in blazar-nova specifically
15:28:57 fungi: I see
15:29:13 good discussion on gate things today. anything else?
15:29:19 +1000
15:29:40 very glad to see the gate getting proper attention
15:29:48 slaweq: nice work on that script, btw
15:29:51 true, +10000 :)
15:29:55 thx
15:30:04 gmann: you just had to +10x me huh?
15:30:11 yeah
15:30:13 hah
15:30:29 #topic Z cycle Leaderless projects
15:31:08 only 1 project, adjutant, is left, which we are waiting on until the end of March. we will discuss that at the PTG
15:31:14 I will remove it from the agenda
15:31:23 #topic PTG Preparation
15:31:40 ++
15:31:42 #link https://etherpad.opendev.org/p/tc-yoga-ptg
15:31:51 #link https://etherpad.opendev.org/p/tc-ptl-interaction-zed
15:31:58 please add topics in those etherpads
15:32:14 timeslots are finalized and I have updated them on the ML as well as in the etherpad
15:32:44 note that the schedule and precreated etherpad links are now live in ptgbot, so we can probably safely start adding overrides if needed. diablo_rojo would know for sure though
15:32:52 the first one is the link from the yoga ptg :)
15:32:58 I have informed the Kubernetes steering committee about joining us at the PTG
15:33:00 gmann: do we need to bring up Sahara, Magnum, etc. there, or will it be too late?
15:33:10 I think you can override it now.
15:33:29 sorry #link https://etherpad.opendev.org/p/tc-zed-ptg
15:33:40 here we go.
15:33:44 spotz_: in the TC+PTL sessions?
15:33:49 Yeah
15:34:11 yeah, we invite everyone actually, not specific projects
15:34:13 Everything looks correct at this point. I just need to do the zoom room setup once we are closer, but that shouldn't affect other things.
15:34:30 diablo_rojo: +1, nice
15:35:29 spotz_: and for less active/broken projects like sahara and magnum we can address/ping them separately. I would like to keep the TC+PTL sessions as get/give feedback sessions instead of going towards project health checks
15:36:00 if we do project health checks, many PTLs will not join :)
15:36:20 gmann: ok
15:36:40 spotz_: for magnum I know there are a few new cores from last cycle whom you can ping.
15:36:41 Probably true.
15:37:23 anything else on the PTG?
15:37:57 Please register if you haven't yet!
15:38:15 +1, i did.
15:38:28 Me too 🙂
15:38:40 #topic Open Reviews
15:38:43 #link https://review.opendev.org/q/projects:openstack/governance+is:open
15:38:45 \o/
15:39:01 I need one more vote on slaweq's vice-chair nomination #link https://review.opendev.org/c/openstack/governance/+/833171
15:39:15 voted :)
15:39:20 all other open reviews are good, either waiting for time or a PTL +1
15:39:36 thanks, that is all from my side today. anything else to discuss?
15:39:41 nice :)
15:39:45 we have around 21 min
15:40:03 Voted! Thank you slaweq !
15:40:18 thanks, and yes, thanks slaweq for volunteering
15:40:26 Assuming we have the joint leadership meeting in Berlin, do we want to do anything separate from that?
15:40:37 yw, I hope I will learn quickly and be able to help gmann there :)
15:40:45 slaweq: +100
15:41:05 Forum submissions should be opening next week I think
15:41:20 spotz_: I think that is a good one to restart. and the joint leadership meeting is enough, at least for Board interaction
15:41:56 Sounds good, I pinged the Ops Meetup folks as we're 10 weeks out and really need to get planning
15:42:00 diablo_rojo: on Forum sessions, do we need a TC volunteer for the selection committee like we used to have?
15:42:12 spotz_: +1 on the ops meetup.
15:42:17 I have a few PTL volunteers actually
15:42:28 So we are good for OpenStack Forum selection representation
15:42:31 I told her I would if no one else stepped up
15:42:46 That too :)
15:43:21 dansmith: nice, I saw the wiki, and if i understand correctly the requirement is not that two TC members have to be on the selection committee, but that it can be anyone from the community, right?
15:43:33 diablo_rojo: ^^
15:43:37 lol
15:43:40 dansmith: please ignore
15:43:46 aheh
15:43:50 I was like ..uhh
15:43:50 both your names start with d* :)
15:43:53 Yeah, it can be anyone from the community, just ideally someone in a governance position
15:44:03 so PTLs are great too
15:44:50 Yeah dansmith, here I thought you were the Forum expert lol
15:44:52 "1 delegate from each OpenInfra Project
15:44:52 2 OpenInfra Foundation staff members"
15:45:02 yep
15:45:13 diablo_rojo: it may be good to mention the governance expectation clearly in there
15:45:28 gmann, it says so elsewhere in the wiki I believe
15:45:34 #link https://wiki.openstack.org/wiki/Forum
15:46:09 "The TC and UC are best placed to decide which of their members should represent each body...." maybe this line can be modified now?
15:46:20 this is left over from the previous requirement?
15:46:29 Ah yeah, that needs to be updated.
15:46:34 I will tweak it later today
15:46:42 k, just making sure we do not miss anything from the TC side which we need to do
15:46:47 diablo_rojo: thanks
15:46:53 I would let you know if we were :)
15:47:01 great.
15:47:15 any other topic to discuss?
15:48:12 #endmeeting