15:00:13 #startmeeting tc
15:00:13 Meeting started Thu Aug 26 15:00:13 2021 UTC and is due to finish in 60 minutes. The chair is gmann. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:13 The meeting name has been set to 'tc'
15:00:23 #topic Roll call
15:00:30 tc-members: meeting time
15:00:31 o/
15:00:50 o/
15:01:28 o/
15:01:38 o/
15:01:46 absence: Belmiro Moreira (belmoreira) - PTO
15:02:09 * jungleboyj is back
15:02:17 o/
15:02:21 o/
15:02:31 after surviving Tropical Storm Fred
15:02:39 Belmiro was online yesterday
15:02:40 :O
15:02:48 let's start
15:02:57 #link https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Agenda_Suggestions
15:03:00 today agenda ^^
15:03:12 #topic Follow up on past action items
15:03:31 gmann to add py3.6 testing plan (after its EOL -Dec 2021) in PTG etherpad
15:03:36 that is done
15:03:38 hello all
15:04:10 #topic Gate health check (dansmith/yoctozepto)
15:04:15 so,
15:04:22 definitely a lot of rush load starting I think
15:04:33 we've noticed a bunch of timeouts,
15:04:45 yeah, it's starting to get hot
15:04:48 both pip related and some like failures to do certain operations
15:05:02 I saw at least one this morning that had either a rabbit failure, or some dropped mq messages or something
15:05:03 we're also seeing node quota backlogs during some times of day
15:05:04 release time
15:05:12 ++
15:05:17 not sure that's really anything more than "running hot" as yoctozepto says, but... it's definitely slowing things down
15:05:26 gmann: yep
15:05:55 some of our cloud donors put our account on dedicated host aggregates, and our workloads become their own noisy neighbors, slowing down things like i/o
15:06:21 I don't blame them.. is that recent?
15:06:29 not recent at all, no
15:06:36 oh okay
15:06:42 but we only see the impact when we're "running hot"
15:06:51 well, that feeds into our release "running hot" causing some things like rabbit fails
15:06:52 yeah
15:06:57 once that oversubscription is really becoming a constraint on the provider side
15:07:05 and that makes jobs take longer
15:07:09 * dansmith notes yoctozepto has coined a term that will live in infamy
15:07:41 :D
15:07:48 do we have temp settings :P
15:08:05 as in thermocouples?
15:08:10 oh, running hot, i get it
15:08:15 The opposite of Cool Runnings
15:08:22 dansmith ++
15:08:38 what can we do about this? putting experimental pipeline or so on hold at 'running hot :)' time? but they might not be more.
15:08:50 I dunno really
15:09:02 knowing that the failures are a result of load helps to avoid getting too concerned at least
15:09:29 we could lower our quotas in providers where we see this issue is especially pronounced, maybe, but scheduling isn't always going to evenly distribute our workload across available hosts either so that may not help
15:10:20 (scheduling on the nova/placement side in the provider i mean)
15:10:40 yeah, I dunno
15:10:52 does it happen so that all jobs from one patchset go to one provider?
15:10:56 or are they scattered?
15:11:10 * yoctozepto never remembers
15:11:13 I think scattered ?
15:11:21 they're scattered, but nodes in a multi-node job all get satisfied from the same provider
15:11:29 yeah, ok
15:12:15 let's see next week how it is and then we can discuss if anything can be done or just live with that
15:12:35 anything else on gate health ?
15:13:10 #topic Xena Cycle Tracker
15:13:14 #link https://etherpad.opendev.org/p/tc-xena-tracker
15:13:25 I have divided it in pending and completed items
15:13:39 as we are very close to finish the Xena cycle, let's iterate on pending one
15:13:55 1. TC members to drive the Y cycle community wide goal
15:14:00 ricolin: diablo_rojo_phone ^^
15:14:36 secure RBAC goal is selected at least.
15:14:47 I think it a little depends on the painpoint one
15:15:01 but do we always need two goals in a cycle?
15:15:03 ricolin: but if we want to select more than one goal right?
15:15:15 ricolin: no, it can be zero or more
15:15:29 I kind of wondering about pain point
15:15:40 I prefer to be on in Yoga but that is something we can discuss in PTG also once we have more candidate
15:15:54 for tls support in gate, it can be done without making it a goal IMO
15:16:11 ricolin: we will discuss pain point things in next topic
15:16:12 At least it's not a problem for every team
15:16:28 ricolin: yeah, it does not need to be goal as such
15:16:49 let's see how many candidates we have and then in PTG we can check them and close this
15:16:54 Interesting how hard it is to collect pain points. Is everyone in too much pain to respond or ... ?
15:17:10 jungleboyj: yes :)
15:17:26 jungleboyj: i mean there are lot of painpoint listed in etherpad
15:17:32 I think it's up to how we can actually drive that and make sense for project teams:)
15:17:50 yeah. anyways let's discuss that in separate topic we have for that
15:17:54 2. Audit and finish the previous cycle pending community-wide goal work
15:18:10 contributor guide is almost done, I will check if anything pending
15:18:14 or they’re not on the community and have no clue we’re asking
15:18:28 pdf-guide is something I will start in Yoga, no bandwidth in Xena.
15:18:44 3. Review the tags for usefulness and cleanup. Based on what left, make a decision on whether to continue the tag framework based or not.
15:18:46 yoctozepto: jungleboyj ^^
15:19:13 gmann: Haven't had a chance to engage on that.
15:19:26 me neither
15:19:31 Will see if I can make some time to look at that in the coming week.
15:19:36 I think it aligns with what we want to discuss in the ptg
15:19:42 regarding project quality
15:19:51 we might want to change the framework
15:19:56 So, it would be good to get the leg work done before the PTG.
15:19:58 so it would be good to start something earlier
15:20:07 yoctozepto: if we can have some re-work/feedback on that then we can discuss more
15:20:08 yeah, mindshare ++
15:20:14 yeah
15:20:16 yeah
15:20:36 thanks. please add 'Status' there once you start it,
15:20:36 (and then folks come around and think tc is just throwing "yeahs" around)
15:20:42 ok
15:20:56 4. Getting projects to broadcast out/mentor
15:21:01 spotz: belmoreira ^^
15:21:05 * dansmith says nuke the tags
15:21:26 Ok. yoctozepto let's get some notes put together.
15:21:47 ( jungleboyj: yeah, let's catch up next week )
15:21:51 jungleboyj: +1, thanks that will be helpful . may be start a separate etherpad or so
15:21:57 ( dansmith: that's the plan +/- )
15:22:10 le woot
15:22:16 yoctozepto: __
15:22:18 on Broadcast/mentor, same spotz please add note if you are planning to do this one
15:22:23 5. Stable core team process change
15:22:24 yoctozepto: ++
15:22:26 this is imp
15:22:28 mnaser: jungleboyj ^^
15:22:51 sorry, i've dropped the ball on this, i'm going to write something up
15:23:09 do we have some etherpad with context that i can use
15:23:25 mnaser: in Xena PTG etherpad
15:23:27 I have the info will pull addresses from election repo:5
15:23:43 ok
15:23:52 mnaser: Let me know when you have something and I can help review.
15:23:57 L320 https://etherpad.opendev.org/p/tc-xena-ptg
15:24:24 added in tracking etherpad too
15:24:28 mnaser: jungleboyj thanks
15:24:48 6. Project Health checks:
15:24:57 ricolin: belmoreira ^^
15:25:12 this is needed to remove the TC liaison things
15:25:53 That has not started yet, I will start taking a look into it
15:26:10 ricolin: thanks, please add note also in tracking etherpad
15:26:18 rest other are done, thanks for those.
15:26:20 okay
15:26:26 moving next.
15:26:28 #topic Moving weekly meetings to video/voice calls.
15:26:48 so we discussed this in last week meeting and agreed on doing it monthly for now
15:27:04 and in PTG, we will discuss based on our experience
15:27:24 today we need to decide the channel/tool for video call.
15:27:48 Interesting. Cinder has been doing this as well.
15:27:50 and time: which i propose as next week meeting (1st meeting of month) as video call
15:28:12 jungleboyj: yeah.
15:29:04 gmann ++
15:29:07 gmann: sounds good to me
15:29:37 what are the tool options? gmeet and zoom I assume?
15:29:38 mnaser: you suggested google meet last time. or any other tool ?
15:29:46 dansmith: yeah
15:29:48 I'm pro gmeet
15:29:53 i did suggest google meet because i think it has really good transcription
15:30:29 zoom is 40 min restriction unless paid subscription
15:30:45 we dont use zoom anymore so i cant help with that
15:30:59 but if people are really tied to zoom i think oif uses it
15:31:00 oh I thought the foundation would zoom us up
15:31:00 gmeet sounds fine
15:31:02 so we could get a room easily
15:31:03 but in that case, gmeet
15:31:06 Maybe we could get a Foundation room?
15:31:14 but yes, for me, i prefer gmeet
15:31:26 spotz_: no need if we prefer gmeet
15:31:27 +1 on transcription
15:31:31 ^
15:31:33 function
15:31:37 I’m good with either
15:31:49 I am find with either as well.
15:32:16 only thing in gmeet is restricted access from china or other company proxy especially asia side
15:32:34 Oh yeah.
15:32:35 jungleboyj: it's: find - found - found ;-)
15:32:40 guh
15:32:43 but we do not have much audience from there in TC meeting and ricolin can access so ...
15:32:50 ++
15:32:54 :-)
15:33:00 must say, zoom is not working well for company proxy in China too
15:33:18 ohk
15:33:20 ok, so it's decided
15:33:21 the counterpoint is that the reason zoom works from mainland china is that zoom cooperates with the chinese government to allow them to watch/listen/record any zoom calls their residents participate in
15:34:00 *smh*
15:34:02 let's go with gmeet as majority says. and we can discuss in PTG if any objection on access things
15:34:05 So damned if you do and damned if you don’t
15:34:27 I will prepare the schedule and send
15:34:48 anything else on this?
15:35:11 #topic Next step on pain points from projects/SIGs/pop-up (ricolin)
15:35:13 #link https://etherpad.opendev.org/p/pain-point-elimination
15:35:30 we do not need to discuss all pain points here but decide on how to proceed on these
15:35:43 having separate brainstorm sessions or so or in PTG ?
15:35:56 And as we try to survey those data and see if it can be a goal, that means we need to find some reasonable suggestions for project teams.
15:35:56 And encourage them to pick up one from the list and target it across cycles
15:36:30 for goal, we need some common pain point applicable to majority of projects
15:37:09 it seems hard to find a pattern
15:37:10 Oh, we actually do have quite a few pain points in here.
15:37:12 but yes, if we can help on individual project specific pain point then it will be great
15:37:14 That leaves us only core team size issue?:/
15:37:40 ricolin: yeah
15:38:14 ricolin: so idea here is 'how TC can help on those' right?
15:38:57 may be we can start sorting such things which we can help and leaving project specific pain point which they have to work on mostly ?
15:38:59 or how TC can get project teams interested in working on and picking up some items IMO
15:39:09 humm
15:39:46 I can see one pattern - rmq
15:39:55 gosh yes
15:39:56 does tc see a way to escape this trap?
15:40:05 if we can have a hybrid rabbitmq/grpc driver or something
15:40:10 it should be coordinated at oslo's level
15:40:16 it already is
15:40:21 that would make me the happiest person on earth :-)
15:40:21 so tc-wise makes sense to touch it?
15:40:27 dansmith: what do you mean?
15:40:37 as noted, things like nova depend on o.msg not rabbit itself
15:40:42 yoctozepto: agree
15:40:59 dansmith: ah, you replied to "it should be coordinated at oslo's level" before I ended the sentence, ok
15:41:08 yes
15:41:14 ok,ok
15:41:14 :) so doing it in o.m side ?
15:41:29 which would be a bit of a group effort i guess
15:41:37 yeah, I was just saying that it's openstack-wide any way we look at it
15:41:42 mnaser, +1
15:41:46 ++
15:41:56 ++
15:42:07 I feel like this is a "obviously the grass would be greener with a different tool" sort of thing
15:42:33 dansmith: ask any operator about the worst part of openstack deployment (-:
15:42:36 * fungi tries to forget the proton replacement work
15:42:47 dansmith: rabbitmq is _pain_
15:42:55 Fair warning: We've lost basically all of our messaging experts in Oslo over the past couple of years.
15:42:57 yoctozepto: yeah totally, but I need to be convinced that replacing it just magically makes that better
15:42:58 Rabbit!
15:42:59 99% of major cloud outages i've seen across many deployment is rabbitmq
15:43:08 I only run rpc over rabbitmq and I kill it entirely if it starts acting weird
15:43:09 and a large majority of edge cases too
15:43:11 no regrets
15:43:14 yoctozepto, not just deployment, but also long term running with it;/
15:43:22 We had some folks looking to pus QDR and Kafka, but they're not around anymore.
15:43:30 *push
15:43:35 dansmith: yeah, ironic has json rpc which is painless
15:43:45 yoctozepto: and also doesn't queue
15:43:46 mnaser: is that because rabbit is terrible, or because the way openstack services utilize rpc message queuing is terrible?
15:44:07 fungi: i think rabbitmq has a lot of problems with queues randomly crashing, eating messages that it never delivers to the other side
15:44:11 fungi: I think we lack rabbitmq expertise on all sides here
15:44:27 like, only the actual rabbitmq folks know how to handle rabbitmq
15:44:28 or something
15:44:46 i echo the sentiment that it would be unfortunate to replace the backend only to find out it's the overall model which is really to blame
15:44:55 might be the fact it's in glorious erlang (-:
15:45:02 ok, 15 min left...
15:45:02 fungi: ++
15:45:04 One question for all, if we are talking about replacing rabbitmq, what tool will be the clear winner?
15:45:20 maybe bringing some operators to the table
15:45:23 and people interested in this subject
15:45:25 should we have a separate call to discuss rmq or other such pain points ?
15:45:26 mnaser, +1
15:45:27 to sit together at the ptg or call
15:45:28 yep ^
15:45:31 gmann ++
15:45:46 mnaser: sure, a pre-PTG call and then in PTG
15:45:54 woohoo
15:45:56 Maybe we can invite some for our next meeting?
15:46:06 that needs to be a ptg thing I think
15:46:16 dansmith: definitely too
15:46:18 ricolin: would you help in scheduling that? not just TC but for openstack-discuss audience ?
15:46:29 sure
15:46:30 dansmith: sure
15:46:30 I think we lost bnemec's message in the process
15:46:35 ricolin: thanks
15:46:39 gmann, to make sure
15:46:41 yeah bnemec's message is critical too, IMHO
15:46:44 "We had some folks looking to push QDR and Kafka, but they're not around anymore."
15:46:49 ricolin: also can you add in PTG etherpad too ?
15:46:49 you mean schedule on PTG or next meeting?
15:46:53 and : "Fair warning: We've lost basically all of our messaging experts in Oslo over the past couple of years."
15:46:53 because swapping out the devil we know for one we don't is not an improvement, IMHO
15:47:07 #action ricolin to schedule a call for pain point and add it in PTG etherpad too
15:47:15 dansmith: I think the point is we actually don't know rabbitmq except for the fact it breaks
15:47:34 ricolin: both, one meeting before PTG and then continue in PTG ?
15:48:00 ( btw, bnemec, what is QDR? I'm not recognising the acronym )
15:48:01 gmann, okay, will send out something
15:48:14 ricolin: thanks.
15:48:20 #topic Leaderless projects
15:48:29 #link https://etherpad.opendev.org/p/yoga-leaderless
15:48:29 yoctozepto: Qpid Data Router
15:48:35 yoctozepto: I don't think that's true
15:48:38 we are left with two projects for leader assignments
15:48:43 o/
15:48:51 yoctozepto: there are lots of things I don't know the internals of, but I know the characteristics, failure patterns, and mitigation techniques
15:48:53 Adjutant and Sahara
15:49:04 we have jeremyfreudberg here so we can start with sahara :)
15:49:23 jeremyfreudberg: hi
15:49:24 yeah
15:49:33 yeah, i can update on the ptl search. in addition to emailing the list i emailed a bunch of people directly
15:49:39 jeremyfreudberg: any possible candidate you think of who can lead in Yoga?
15:49:40 ( dansmith: fair enough; though I would still argue we don't know enough as we use the hammer most of the time )
15:49:43 I only got one response, from two engineers at Inspur. they use Sahara now and are still interested in improving it. qiujunting@inspur.com - Qiu Fossen, and ruifaling@inspur.com - ruifaling
15:49:44 ( bnemec: thank you )
15:50:13 i asked them about taking over as PTL or participating in DPL, but no response (yet?)
15:50:13 thanks jeremyfreudberg for joining
15:50:20 communication from inspur is not always great
15:50:28 anyway, i'm fine to discuss all options for sahara.
15:50:44 jeremyfreudberg: thanks for reaching out to them for options.
15:50:50 ++
15:51:21 even if we retire it now then it can be added back anytime if anyone wants to maintain it
15:51:53 or we can call out for leader on openstack-discuss ML first ?
15:52:14 yeah but we need to discuss those quality merits we want to have in openstack; I argue we can't have extremely struggling projects under our banner
15:52:43 yoctozepto: agree on that.
15:52:45 gmann: I think sahara is being handled for now; let's wait
15:52:56 yoctozepto: ++
15:53:08 so what next step. wait for inspur and then retire?
15:53:41 I updated sahara in etherpad
15:53:46 thanks
15:53:51 so Adjutant
15:53:51 wait for inspur and retire if no response ++
15:53:54 yes
15:53:55 +1
15:54:03 my thought is wait a while longer for inspur, if that's okay with TC. i will discuss with tosky also about retiring some plugins which are a lost cause
15:54:24 jeremyfreudberg: ++
15:54:30 jeremyfreudberg: thanks, makes sense to me but what should we declare sahara's status?
15:54:32 jeremyfreudberg: sure, let's wait for some time
15:54:34 +
15:54:36 transitional PTL jeremyfreudberg?
15:55:10 I will say wait for 1 week till we have PTL election close (Cyborg election)
15:55:26 gmann: I think jeremyfreudberg suggested to wait LONGER than that
15:55:38 but if that's the case, then I'm ok and no questions asked
15:56:02 we can if jeremyfreudberg is fine leading it until then ? and once jeremyfreudberg gives go ahead then we can proceed ?
15:56:19 as I said, both options work for me
15:56:26 i really don't want to put my name for another cycle, even if it's temporary
15:56:37 ok, so we wait only till end of elections
15:56:39 enough time
15:56:46 yeah.
15:56:48 Makes sense to me.
15:56:50 +1
15:56:51 if they need more, then imagine the project's struggle
15:56:57 now Adjutant
15:57:02 tosky and i will follow up later to be sure
15:57:02 so Adjutant ? did anyone reach out to previous PTL ?
15:57:05 who wants to press the red button?
15:57:22 jeremyfreudberg: thanks a lot for helping in transition
15:57:28 ++
15:57:29 if some time more is required, of course I can fill the PTL role for either the rebirth or the shutdown phase
15:57:30 ooh, I thought ianychoi[m] was reaching out to previous ptls?
15:57:37 but let's see
15:57:50 ok, so for sahara we have tosky as transitional PTL
15:57:51 tosky: ack, thanks
15:57:52 writing it down
15:58:19 done
15:58:49 Yeah even I’m a bit lost on any next steps if any I need to do for the election or if everything is in process
15:58:55 ok two min left moving to open review and then we can discuss on Adjutant
15:59:16 #topic Open Reviews
15:59:21 spotz: let's sync up after meeting
15:59:30 for this https://review.opendev.org/c/openstack/governance/+/805105
15:59:31 Sounds good
15:59:50 mnaser: can you review the depends-on #link https://review.opendev.org/c/openstack/project-config/+/805103/
16:00:19 done :)
16:00:25 other open review is Venus project application which is under review #link https://review.opendev.org/c/openstack/governance/+/804824
16:00:28 mnaser: thanks
16:01:07 let's close the meeting and continue discussion on election/leaderless projects
16:01:11 thanks all for joining
16:01:14 #endmeeting