15:00:24 #startmeeting tc 15:00:25 Meeting started Thu Jan 21 15:00:24 2021 UTC and is due to finish in 60 minutes. The chair is mnaser. Information about MeetBot at http://wiki.debian.org/MeetBot. 15:00:27 #topic rollcall 15:00:27 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 15:00:29 The meeting name has been set to 'tc' 15:00:30 o/ 15:00:35 o/ 15:00:36 too fast? 15:00:38 o/ 15:00:47 openstack is having a slow morning i guess 15:00:58 o/ 15:02:02 well, we can get started 15:02:08 #topic Follow up on past action items 15:02:18 #link http://eavesdrop.openstack.org/meetings/tc/2021/tc.2021-01-14-15.00.html 15:02:21 irc is not 100% synchronous, so if messages lag between the server to which you're connected and the server to which the bot is connected, you'll see a delay in its responding 15:02:48 fungi: so i guess i can add irc to 'asynchronous collaboration tools' :P 15:02:53 ok, so we have 15:02:57 diablo_rojo complete retirement of karbor 15:03:20 i know there was a few patches left 15:03:53 #link https://review.opendev.org/c/openstack/governance/+/767056 15:04:09 #link https://review.opendev.org/c/openstack/project-config/+/767057 15:04:34 looks like we had a mishap with the rebase :) i'll keep it on the list 15:04:39 #action diablo_rojo complete retirement of karbor 15:04:52 mnaser to remove proposed goal topic from agenda << gone 15:05:14 diablo_rojo reach out to SIGs/ML and start auditing states of SIGs << neither ricolin or diablo_rojo seem to be around right now, so i think we can follow up on this later 15:05:30 #action diablo_rojo/ricolin reach out to SIGs/ML and start auditing states of SIGs 15:05:39 mnaser remove annual report suggestions from agenda << that's done 15:05:56 dansmith update osc change to include ci/docs commentary < thats in https://review.opendev.org/c/openstack/governance/+/759904 15:06:17 thanks dansmith for revising that, and it looks like we're on the right track, tc-members, please review ^ 15:06:31 mnaser add openstack/infra-core as discussion topic in opendev meeting << done, will discuss later 15:06:39 gmann follow-up with zigo if debian uses l-c 15:06:40 I just +1d, I was waiting until others had since I revised it latest 15:07:01 l-c thing we can discuss later, we have more response in ML 15:07:07 ok cool, perfect 15:07:08 later in topic section 15:07:28 gmann update supported distros to drop opensuse < has a discussion topic too, so perhaps we can keep it for that? 15:07:39 https://review.opendev.org/c/openstack/devstack/+/769884 15:07:45 sure 15:08:08 cool, tryingh to get through those so we can get to the discussions 15:08:19 #topic Audit SIG list and chairs (diablo_rojo / ricolin) 15:08:49 no updates that i've seen on this, unless i missed something on the ML, we can probably keep that for next time? 15:09:17 +1 15:09:34 #topic Add Resolution of TC stance on the OpenStackClient (diablo_rojo / dansmith) 15:09:47 not much to do about this anymore at this point I think, just need to get tc members to review the patch and land it 15:10:03 so i think we can remove this topic for the future perhaps and keep it under the open reviews? 15:10:24 yeah 15:10:53 #action mnaser drop "Add Resolution of TC stance on the OpenStackClient" from agenda 15:10:59 #topic Gate performance and heavy job configs (dansmith) 15:11:03 #link http://paste.openstack.org/show/jD6kAP9tHk7PZr2nhv8h/ 15:11:07 So, 15:11:13 I'm concerned about the gate lately 15:11:24 last week, we had a 24+ hour turnaround literally all week, 15:11:29 this week it's been 8+ hours 15:11:45 for someone in the last timezone (i.e me), that means it is impossible to submit a job and get a result in the same working day 15:11:51 that is super frustrating 15:11:53 triploe, neutron is more than 50% consumption 15:11:58 tripleo 15:11:59 right, that's the takeaway there 15:12:05 :-( 15:12:07 yeah, that is really rough 15:12:17 neutron is going to remove some of their tripleo jobs, but it's going to take a while to land that, 15:12:28 and even still, they have several 2+ hour jobs in their large list 15:12:33 note that nova is 5% of the total.. NOVA 15:12:42 dansmith: question, is this a lack of resources (compute power) or is this unreliable jobs (so causing large number of resets) 15:12:44 it definitely slow down the development 15:12:45 I've already identified some fat we should cut from the nova list 15:13:12 (spontaneously pinging clarkb / fungi if they have thoughts on all of this too) 15:13:23 mnaser: it's always lack of resources, but in this case, it's waaaaay too many jobs that are multi-hour jobs 15:13:25 mnaser: we could be in a lot better place if tripleo wasn't taking a quarter of the gate on super long multinode jobs, 15:13:27 I see more time in queue to get the node for run so lack of resources ? 15:13:35 and definitely bad to drag those into neutron 15:13:53 I've been talking to clarkb and fungi .. they're tired of me already :) 15:14:05 I'm trying to point us (the TC) at maybe some people engineering to work on, 15:14:08 tempest-slow job also take much time 2 hrs or so which I have in list to optimize 15:14:23 specifically trying to ask projects to cut their jobs down to more reasonable sizes, 15:14:27 and push things into periodic 15:14:39 dansmith: oh i could never get tired of you ;) 15:14:46 * dansmith makes eyes at fungi 15:15:03 thanks for trying to improve devstack/tempest performance! 15:15:08 hmm 15:15:10 periodic and experimental is good idea so that they can be run on demand too 15:15:13 I've also been working on optimizing devstack itself, and we could use some general job perf optimizations I think, which nobody ever wants to work on 15:15:21 but we can't keep going like this 15:15:33 a day turnaround is going to discourage people from testing things 15:15:51 have we lost any nodepool providers lately that might have contributed to worsening this? 15:15:55 for devstack main challenge is to maintain the new framework/things if we change 15:16:19 mnaser: I really don't think it's a recent abrupt loss of resources, I think it's jobs growth 15:16:28 mnaser: not particularly recently, it seems projects are just adding an ever increasing amount of jobs and not necessarily looking at which ones they can safely stop running 15:16:34 that ^ 15:16:45 and hopefully that's where the TC can put some pressure on people 15:16:48 i _hate_ to say it but does uh, a quota be something we might look into adding? 15:16:51 though also there has been a significant post-holiday rush leading into milestone 2 15:17:00 because the health of these systems impacts the health of the contributor community 15:17:11 true, thats good point 15:17:25 it also changes our 'gate everything' mentality 15:17:36 where we just want to slip things across because its such a pita to land things 15:18:01 because 5 revisions of a patch = 5 business days 15:18:06 at least 15:18:07 i was surprised to find out that neutron had started running (multi-stage, multi-node) tripleo jobs in check and gate 15:18:12 I literally submitted something the other day, 15:18:21 and waited until the NEXT DAY to realize I had forgotten to git-add a file 15:18:22 that sucked 15:18:33 three-hour multi-node jobs and other jobs which pause waiting on those to finish consume rather a lot of our available quota 15:18:38 fungi: right, that's the thing I'm hoping we can put some pressure on 15:18:50 dansmith: patch for that is here: https://review.opendev.org/c/openstack/neutron/+/771486 and just needs approval 15:19:02 and we already merged https://review.opendev.org/c/openstack/neutron/+/770832 15:19:03 neutron has optimized it a lot in past but with tripelo jobs it is back to more worst 15:19:23 slaweq: yes, and thank you VERY much for proposing it :) 15:19:37 yw :) 15:19:41 anyway, I want to bring it up 15:19:54 i think cross gating is a really neat thing, but we might not have the necessary r sources to drive it 15:20:00 but another point we need to consider is 'does this narrow down our upstream testing quality' ? 15:20:04 and maybe we can periodically revisit this and/or figure out some guidelines, and some requests to the projects 15:20:25 cross-gating could still be done with a less heavy-weight approach, i expect 15:20:26 I dunno what we can ask of tripleo, but even if neutron went to 5% like nova, I think tripleo would probably swell to more than its current 25% 15:20:31 and I dunno what to do about that 15:20:53 does 3rd party CI/CD causing more load? I have not checked yet 15:21:06 3rd party ci doesn't really affect it that much 15:21:11 ok 15:21:23 the script clarkb's been using to provide node-hour breakdowns by project and job can help highlight opportunities to normalize utilization 15:21:59 dansmith: +1, we need to get tripleo with 10% somehow 15:22:06 gmann: ++ 15:22:14 maybe we can find a way to work with the tripleo team to find a way to off-load many of their jobs 15:22:18 and yeah, just to put it in perspective, all the non-openstack projects (everything in x/, but also zuul, opendev, airship, starlingx, et cetera) account for <5% of our available quota utilization at peak 15:22:27 gmann: I'm tempted to say that tripleo is special enough in how heavy it is that they really need to have their own resources, but I know that's a hard line to take 15:22:28 i know all of these jobs and that doesnt include a bunch that run inside the rdo cloud 15:22:29 The 3rd party shouldn't be making a big difference. 15:22:59 yeah 15:23:34 What about volume of developer changes to CI usage 15:23:48 fungi: mnaser not sure if we can add cap for node utilization per project even some node are free 15:23:56 fungi: has a script for the review load, but it crashed for me 15:24:00 and I've been busy with other stuff, 15:24:06 but that would be a good metric, mnaser 15:24:07 so that there is always nodepool for other patches whenever they are in gate 15:24:45 so say if nova want more jobs and load then it effect nova gate run time only 15:24:48 Can we look at that metric before we start asking projects to lower their CI usage 15:25:06 and then project will automatically narrow down their jobs 15:25:20 yeah, for opendev as a whole we have an "engagement" reporting tool i've been working on which includes a breakdown of review activity by git namespace, it also shards the gerrit queries by repository so could probably be extended/modified to do stats based on deliverable aggregation to teams from the governance projects data 15:25:22 gmann: well, that means work for zuul people, and while technical solutions are nice, we can chase the people ones first I think 15:25:42 fungi: do you have a paste of that? I couldn't get it to run 15:26:09 mnaser: not all projects are going to be the same review-to-cpu-cycles ratio of course 15:26:26 mnaser: but if it's like 10x for one project over another similar one that'd be useful 15:26:38 #link http://paste.openstack.org/show/801829 preliminary OpenDev 2020 Engagement Report 15:26:47 dansmith: right yes, so we can approach a specific team and point out these 'fair-use' ratios 15:27:22 ok so maybe lets come up with some actionable things we can try doing and following up on 15:27:29 fungi: ah okay that's not quite granular enough for projects yet 15:27:43 well, one is the neutron paring down, which is in progress 15:27:58 right, that's what i was saying, it would need modifying to report by repo, but the querying is already sharded by repo to stabilize the pagination from the rest api 15:28:00 i think one action is maybe first approach teams with high usage (say tripleo) and ask what can be done about it? 15:28:12 +1 15:28:25 yeah 15:28:28 #link https://review.opendev.org/729293 OpenDev Engagament Reporting Tool 15:28:38 is where i've been working on the script itself 15:28:48 fungi: ah, gotcha 15:29:00 (that's what produced that 2020 report) 15:29:06 can anyone reach out to perhaps the PTL of these high use projects with a tc hat on and ask what we can do to improve the situation? 15:29:41 i'm also happy to volunteer too 15:29:52 pretty sure tripleo and neutron are already aware as they've been in prior discussions with opendev/tact sig representatives about disproportionate resource consumption 15:29:52 I don't have a relationship with any of the tripleo people, but I have already been pushing on neutron 15:30:12 may be TC liaison for tripleo ? 15:30:21 fungi: i think we've historically had those discussions for months and years but i think we've reached a point where the discussion needs to transition to "what can you do about this?" 15:30:39 gmann: i think i don't mind picking it up and reaching out to them and seeing what we can do 15:30:50 sure 15:31:06 well, to be fair, tripleo has been working on slimming it down. there was a point where they were consistently >50% of our overall consumption, then it was 40%, now it's down around 30% 15:31:15 #action mnaser reach out to tripleo + other high usage job teams re: usage 15:31:21 thanks 15:31:23 I forgot about the liaisons.. who is the tripleo one? 15:31:24 fungi: ah yes that's good too 15:31:34 fungi: ack, maybe neutron's adoption of that has regressed us then 15:31:36 that'll be on the project teams page 15:31:47 perhaps we can wait and see how things are once the neutron patch to drop those goes 15:31:48 https://governance.openstack.org/tc/reference/projects/tripleo.html 15:31:49 dansmith: it was me and ricolin :) 15:32:14 o/ 15:32:18 aight 15:32:31 cool, i say we should move on the next topics to make sure we can hit all topics 15:32:36 and we can follow up on this next week 15:32:39 yeah 15:32:41 thanks for bringing it up dansmith :> 15:32:44 tripleo has been working on ways to avoid redundancy between their jobs, run fewer jobs, make their frameworks more efficient, et cetera and it's had an impact, but they're still by far the highest resource consumer when measured on a per-team basis 15:32:47 yeah sorry, but thanks for humoring me :) 15:32:49 +1, thanks dansmith 15:33:01 #topic Audit and clean-up tags (gmann) 15:33:14 we have manila application for API tag 15:33:16 https://review.opendev.org/c/openstack/governance/+/770859 15:33:23 need more review 15:33:28 * gouthamr sneaks in 15:33:59 we can do in gerrit, nothing much to discuss here unless any question 15:34:18 gmann, one question 15:34:25 also I am thinking to remove this from agenda as such and add when there is progress or things to discuss ? 15:34:28 ricolin: sure 15:34:40 if a project didnot impl micro version, is that mean it should not apply for that tag? 15:35:08 as the docs looks like so 15:35:13 ricolin: microversion is not mandatory, if project does any versioing for API to discover the changes/features then it satisfy the rquiremnet 15:35:27 ricolin: we changed that doc to clarify that but let me check 15:35:50 it does tell about that 15:35:53 https://governance.openstack.org/tc/reference/tags/assert_supports-api-interoperability.html 15:36:07 point 2, say aor a discoverablity mechanism 15:36:17 i think i clarified it in ML also 15:36:29 but do we need to add more clarification in doc ? 15:36:36 okay make sense to me 15:36:57 gmann, I think current docs is fine 15:37:01 ok, ricolin feel free to propose some change id doc is not so clear that can help other projects too 15:37:04 ok 15:37:12 id/if 15:37:19 sure:) 15:37:22 thanks 15:37:51 mnaser: reposting if you missed it. 'also I am thinking to remove this from agenda as such and add when there is progress or things to discuss ?' 15:38:01 gmann: yeah, that's fair with me too 15:38:10 doesn't seem to need that much so we can add it ad-hoc 15:38:15 yeah 15:38:51 #action mnaser drop API tags topic from weekly agenda 15:39:03 for the rest, we just continue to follow up on reviews 15:39:20 #topic infra-core updates (mnaser) 15:39:37 i brought up the topic on tuesday's opendev meeting which made for some interesting discussion 15:40:09 #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-01-19-19.01.html 15:40:18 see "Discuss infra-core (on behalf of OpenStack)" 15:40:27 #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-01-19-19.01.log.html#l-19 15:41:07 the summary and actionable thing at the end of the discussion was that fungi was going to draft up an email to reach out to the community if anyone is looking to volunteer, and perhaps we can take an opportunity to improve the processes and tools to simplify the work of any reviewers 15:41:34 yeah, i'm hoping to get a tact sig request for help e-mail out to openstack-discuss by the weekend 15:41:37 i don't think there's anything actionable from a tc's pov at the moment, so i'm afraid maybe we can put the topic aside and if we don't gather any attention for volunteers we can bring that up? 15:41:53 I think doc are really in good shape but yeah we can simplify if needed 15:42:05 also big thanks to mnaser and gmann for pitching in there so far 15:42:54 the tl;dr though is that we need people invested in openstack to review configuration chages on behalf of openstack. if your config changes aren't getting reviewed fast enough for your liking, that's on you (at least in part) 15:43:24 seems fair. i'll drop this topic for now and we can bring it up if it's an issue 15:43:33 +1 15:43:35 ++ 15:43:36 #action mnaser remove infra-core discussion from agenda 15:43:46 #topic Dropping lower-constraints testing from all projects (gmann) 15:44:18 we have responses in ML and few suggestion also 15:44:24 and its going on 15:44:31 yeah, discussion has been lively 15:44:54 that's kinda good i guess :) 15:45:09 do we have anything to update/discuss on that or shall we leave the discussion to continue and go back to follow up on it next week? 15:45:09 so should we continue discussion for some more and then finalize anything in TC or now? 15:45:39 FYI I will start the SIG update thing these two days 15:46:10 cool ricolin. i think i like leaving the discussion run while it's active and we can follow up on it again next week if we seem to be getting to a consensus next week 15:46:21 it seems like a popular compromise might be to limit lower-constraints.txt files to just what's listed in requirements.txt for a project in master, and drop the lower bounds checking as soon as stable is branched (i really don't see it being a viable approach in stable branches) 15:46:53 current suggestion from stephen and what os-win and networking-hyperv tried is 1. 'remove the indirect deps form l-c file' which can ease the maintenance 2. remove the job form stable branches 15:46:54 The discussion has been good. Let's see where it goes. 15:47:16 yeah what fungi wrote 15:47:46 fungi: for stable branch, we should remove it. 15:48:08 i think for today and for time purposes, maybe we should let the conversation keep going and see where we are next week 15:48:09 but not just existing stable branches, future stable branches are going to hit the same problems as they age 15:48:13 so we can have a solid reliable recommendation 15:48:27 make sense, 15:48:28 yeah, i agree the discussion hasn't played out yet 15:48:34 Makes sense. 15:48:50 okey cool 15:48:53 #topic Decide on OpenSUSE in testing runtime (gmann) 15:48:55 really do not want us to interrupt the ongoing discussion and conclude in half way 15:49:00 #link https://review.opendev.org/c/openstack/devstack/+/769884 15:49:04 #link https://review.opendev.org/c/openstack/governance/+/770855 15:49:15 yeah patches are up for this, need more review 15:49:26 tc-members: ^ lets get reviews on the governance change so we can let the qa team do their cleanups 15:49:28 its a trival review 15:50:07 and I think nothing else needed on this so we can remove from agenda too ? 15:50:14 yeah i think its just mostly a review 15:50:18 yeah 15:50:28 * dansmith hath blessed it with his +1 15:50:29 #action mnaser remove "Decide on OpenSUSE in testing runtime (gmann)" from agenda 15:50:39 #topic Define 2021 upstream investment opportunities 15:50:54 I'm starting a hedge fund, looking for investors 15:51:05 we have not got any help on this yet for many years but in case we want to continue it for 2021. 15:51:18 dansmith: waited all meeting long to drop this 15:51:31 hah 15:51:34 i have pushed to continue the 2020 one which are goal, qa, rbac 15:51:50 i'm ok with just copying them over and merging them, but perhaps there will be a time to evaluate if there's just any point in doing all of this 15:52:05 year after year we've put in effort into writing all this stuff up but it kinda sadly just ends up sitting 15:52:19 and when these things do get help or traction, it's probably rarely been someone who read this document and decided to put resources on it 15:52:29 yeah, at some point we have to stop this if it does not help 15:53:14 but not fair to drop due to COVID situation where we should not expect more unpaid volunteer. if company invest then good 15:54:02 prior expectation was that we would direct board members to that list, pick a different one to highlight in each foundation monthly newsletter, et cetera 15:54:29 i get that there's not a lot of attention on it, but i suspect we've been forgetting to bring anyone's attention to it as well 15:54:30 #link Define 2021 upstream investment opportunities 15:54:31 yeah but I do not think board has been any help till now 15:54:33 #undo 15:54:34 Removing item from minutes: #link Define 15:54:40 #link https://review.opendev.org/c/openstack/governance/+/771707 15:55:22 gmann: ok great, thank you for bringing this up, i think we can drop this as it's just an open review now? 15:55:31 yup 15:56:07 #action mnaser drop "Define 2021 upstream investment opportunities" from agenda 15:56:16 #topic open discussion / open reviews 15:56:30 i'll do a run over the open things soon, this past week has been pretty hectic with a personal move so 15:59:52 are we done? 15:59:59 seems so, we can end 16:00:04 mnaser: ? 16:00:55 mnaser: I bill overtime by the minute 16:01:10 :) 16:01:23 Sorry, had to step out for a sec. Let me end it before I go in deep debt 16:01:26 #endmeeting