Tuesday, 2021-03-30

*** hamalq has quit IRC01:25
*** sboyron has joined #opendev-meeting06:54
*** hashar has joined #opendev-meeting06:57
*** hashar_ has joined #opendev-meeting07:43
*** hashar has quit IRC07:46
*** hashar_ is now known as hashar07:51
*** hashar has quit IRC09:20
*** hashar has joined #opendev-meeting11:39
*** hashar has quit IRC13:24
*** hashar has joined #opendev-meeting15:32
*** hamalq has joined #opendev-meeting16:15
*** hamalq_ has joined #opendev-meeting16:19
*** hamalq has quit IRC16:20
*** hashar is now known as hasharDinner17:36
*** diablo_rojo has joined #opendev-meeting18:52
clarkbanyone else here for the meeting?19:00
fungiyeah, more or less19:00
clarkb#startmeeting infra19:01
openstackMeeting started Tue Mar 30 19:01:18 2021 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
*** openstack changes topic to " (Meeting topic: infra)"19:01
openstackThe meeting name has been set to 'infra'19:01
clarkb#link http://lists.opendev.org/pipermail/service-discuss/2021-March/000199.html Our Agenda19:01
diablo_rojoo/19:01
clarkbI wasn't around last week, but will do my best :) feel free to jump in and help keep things going in the right direction19:01
ianwo/19:02
clarkb#topic Announcements19:02
*** openstack changes topic to "Announcements (Meeting topic: infra)"19:03
clarkbI didn't have any. Do others?19:03
fungii don't think so19:03
fungigitea was upgraded19:03
fungikeep an eye out for oddities?19:04
clarkb++19:04
fungizuul was recently updated to move internal scheduler state into zookeeper19:04
fungikeep an eye on that too19:04
clarkb#topic Actions from last meeting19:05
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)"19:05
clarkb#link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-03-23-19.01.txt minutes from last meeting19:05
clarkbianw had an action to start asterisk retirement. I saw an email to service-discuss about it.19:05
ianwno response on that, so i guess i'll propose the changes soon19:05
clarkbianw do you want to keep the action around until the changes are up and/or landed? seems to be moving along at least19:06
ianwsure, make sure i don't forget :)19:06
clarkb#action ianw Propose changes for asterisk retirement19:06
clarkb#topic Priority Efforts19:06
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)"19:06
clarkb#topic OpenDev19:06
*** openstack changes topic to "OpenDev (Meeting topic: infra)"19:06
clarkbas mentioned we upgraded gitea from 1.13.1 to 1.13.619:07
clarkbkeep an eye out for weirdness.19:07
clarkbDo we also want to reenable project description updates and see if 1.13.6 handles that better? or maybe get the token usage change in first?19:07
ianwtokens seems to maybe isolate us from any future hashing changes, but either way i think we can19:08
clarkbianw: maybe I should push up the description update change again and then compare dstat results with and without the token use.19:09
clarkbthat should give us a good indication for whether or not 1.13.6 has improved hashing enough or not?19:09
fungimaybe19:09
ianw#link https://review.opendev.org/c/opendev/system-config/+/78288719:09
fungiit was never completely smoking gun that project management changes triggered the cpu load19:09
ianwfor anyone reading without context :)19:09
fungithey would sometimes overload *a* gitea backend and the rest would be perfectly happy19:10
clarkbya I suspect it has to do with background load as well19:10
fungiso if we want to experiment in that direction, we'll need to leave it in that state for a while and it's not a surety19:10
clarkbdue to the way we load balance we don't necessarily get a very balanced load19:10
clarkbI also made some new progress on the gerrit account classification process before taking time off19:11
clarkbif you can review groups in review:~clarkb/gerrit_user_cleanups/notes.20210315 and determine if they can be safely cleaned up like previous groups that would be great19:12
clarkbI'll pick that up again as others have had a chance to cross check my work19:12
clarkb#link https://review.opendev.org/c/opendev/system-config/+/780663 more user auditing improvements19:12
clarkbthat is a related scripting improvement. Looks like I have one +2 so I may just approve it today19:12
clarkbessentially I had the scripts collect a bunch of data into yaml then I could run "queries" against it to see different angles19:13
clarkbthe different angles are written down in the file above and can be cross checked19:13
clarkb#topic Update Configuration Management19:14
*** openstack changes topic to "Update Configuration Management (Meeting topic: infra)"19:14
clarkbAny new config mgmt updates we should be aware of/review?19:14
fungii don't think so19:16
clarkb#topic General Topics19:16
*** openstack changes topic to "General Topics (Meeting topic: infra)"19:16
clarkb#topic Server Upgrades19:16
*** openstack changes topic to "Server Upgrades (Meeting topic: infra)"19:16
clarkbI did end up completing the upgrades for zuul executors and mergers and nodepool launchers19:16
clarkbThat leaves us with the zookeeper cluster and the scheduler itself19:17
clarkbI have started looking at the zk upgrade and writing notes on an etherpad19:17
clarkb#link https://etherpad.opendev.org/p/opendev-zookeeper-upgrade-202119:17
clarkbthat etherpad proposes two options we could take to do the upgrade. If y'all can review it and make sure the plans are complete and/or express an opinion on which path you would like to take I can boot instances and keep pushing on that19:18
clarkb#topic Deploy new refstack server19:20
*** openstack changes topic to "Deploy new refstack server (Meeting topic: infra)"19:20
clarkb#link https://review.opendev.org/c/opendev/system-config/+/78159319:20
clarkbthis change merged yesterday. ianw  should I go ahead and remove this item from the meeting agenda?19:20
ianwyep, deployment job ran so i'm not aware of anything else to do there19:20
clarkbcool I'll get that cleaned up19:21
clarkb#topic PTG Planning19:22
*** openstack changes topic to "PTG Planning (Meeting topic: infra)"19:22
clarkbI did submit a survey and put us on the schedule last week19:22
clarkbthe event runs April 19-23 and I selected Thursday April 22 1400-1600UTC and 2200-0000UTC for us19:22
clarkbthe first time should hopefully work for those in EU timezones and the second for those in asia/pacific/australia19:23
clarkbmy thought on that was we could do office hours and try to help some of our new project-config reviewers get up to speed or help other projects with infra related items19:23
clarkbif the times just don't work or you think we need more or less let me know. I indicated we may need to rearrange scheduling when I filled out the survey19:24
clarkb#topic docs-old volume cleanup19:24
*** openstack changes topic to "docs-old volume cleanup (Meeting topic: infra)"19:24
clarkbnot sure if this is still current but it was on the agenda so here it is :)19:25
ianwoh it was from when i was clearing out space the other day19:25
ianwdo we still need docs-old?19:26
fungiwe do not19:26
clarkbis docs-old where we stashed the really old openstack documentation so that it could be found if people have really old installations but otherwise wouldn't show up in google results?19:26
fungithat was kept around for people to manually copy things from if we failed to rebuild them during the transition to zuul v319:27
fungii think anything we weren't actively building but was relevant was manually copied to the docs volume19:27
ianwclarkb: yeah, it leaking into google via https://static.opendev.org/docs-old/, which i guess has nothing to stop that, was a concern19:27
ianwok, well it sounds like i can remove it then19:28
fungiwe should probably add a robots.txt to exclude spiders from the whole static vhost19:28
clarkbwould it make sense to see if Ajaeger has an opinion?19:28
clarkbsince Ajaeger was pretty involved in that at the time iirc19:28
ianwfungi: yeah, i can propose that.  everything visible there should have a "real" front-end i guess19:29
clarkbI don't have enough of the historical context to make a decision. I'll defer to others, but suggest maybe double checking with ajaeger if we can19:31
ianwok, i can ask, don't want to bother him with too much old cruft these days :)19:31
clarkbya I don't think ajaeger needs to help with cleanup or backups or anything, just indicate if he thinks any of it is worth saving19:32
clarkb#topic planet.openstack.org19:32
*** openstack changes topic to "planet.openstack.org (Meeting topic: infra)"19:32
clarkbAnother one I don't have a ton of background on but I see a retire it option and I like the sound of that >_>19:33
clarkblooks like the aggregator software is not being maintained anymore which puts us in a weird spot doing server updates19:33
ianwyeah, linux australia retired their planet which made me think of it19:33
fungii guess we should probably at least let the folks using it know somehow19:33
fungilike make an announcement19:33
clarkb++ and probably send that one to openstack-discuss given the service utilization19:33
ianwi did poke at aggregation software, i can't see any that look python3 and maintained19:34
fungii could get the foundation to include a link to the announcement in a newsletter19:34
clarkbbasically say the software is not maintained and we can't find alternatives. We will retire the service as a result.19:34
ianwi thought we could replace it with a site on static that has an OPML of the existing blogs if we like19:34
ianwthese days, a RSS to twitter feed would probably be more relevant anyway19:34
fungior if the foundation sees benefit in it, they may have a different way they would want to do something similar anyway19:34
fungiyeah19:34
fungimicroblogging sites have really become the modern blog aggregators anyway19:35
ianw(i did actually look for an rss to twitter thing too, thinking that would be more relevant.  nothing immediately jumped out, a bunch of SaaS type things)19:35
clarkbya twitter, hacker news, reddit etc seem to be the modern tools19:36
clarkband authors just send out links from their accounts on those platforms19:36
ianwvale RSS, RIP with google reader19:36
ianwmaybe give me an action item to remember and i can send that mail and start the process19:37
clarkb#action ianw Announce planet.o.o retirement19:38
ianwi am old enough to remember when jdub wrote and released the original "planet" and we all thought that was super cool and created a bunch of planets19:38
clarkb#topic Tarballs ORD replication19:39
*** openstack changes topic to "Tarballs ORD replication (Meeting topic: infra)"19:39
ianwok, last one, again from clearing out things earlier in the week19:39
ianwof the things we might want to keep if a datacentre burns down, i think tarballs is pretty much the only one not replicated?19:40
ianw#link https://etherpad.opendev.org/p/gjzssFmxw48Nn3_SBVo619:40
ianwthat's the list19:40
ianwdocs is already replicated19:41
clarkb++ I think the biggest consideration has been that the vos release to a remote site of large sets of data isn't quick19:41
clarkbI think tarballs is not as large as our mirrors but bigger than docs?19:41
clarkbI also suspect that we can set it up and see how bad it is and go from there?19:41
fungiyeah, in that ballpark19:41
fungialso the churn is not bad as it's mostly append-only19:41
fungior at least that's the impression i have19:42
fungii guess we'll find out if that's really true19:42
ianwyeah, i don't think it's day-to-day operation; just recovery situations19:42
ianwwhich happen more than you'd hope19:42
ianwbut still, i'd hate to feel silly if something happened and we just didn't have a copy of it19:43
clarkbya I think this is the sort of thing where we can make the change, monitor it to see if it is unhappy and go from there19:44
ianwORD has plenty of space.  we can always drop the RO there in a recovery situation i guess too, if we need19:44
ianwalright, i'll set that up.  lmn if you think anything else in that list is similar19:44
clarkbI want to say the newer openafs version we upgraded to is better about higher latency links?19:44
ianwapparently, but still there's only so fast data gets between the two when it's a full replication scenario19:44
clarkbianw: maybe do all the project.* volumes?19:45
clarkbI think those host docs for various things like zuul and starlingx19:46
clarkbmirror.* shouldn't matter and is likely to be the most impacted by latency19:46
ianwyeah, probably a good idea.  i can update the docs for volume creation because we've sometimes done it and sometimes not it seems19:46
clarkb++19:46
fungisure, small volumes are probably good to mirror more widely if for no other reason than we can, and they're one less thing we might lose in a disaster19:47
ianwyeah, it all seems theoretical, but then ... fires do happen! :)19:48
clarkbindeed19:49
clarkb#topic Open Discussion19:49
*** openstack changes topic to "Open Discussion (Meeting topic: infra)"19:49
clarkbThat was all on the published agenda19:49
ianwi have a couple of easy ones from things that popped up19:49
clarkbworth noting we think we have identified a zuul memory leak which is causing zk disconnects19:50
ianw#link https://review.opendev.org/c/opendev/system-config/+/78286819:50
ianwstops dstat output to syslog19:50
clarkbfungi was going to restart the scheduler to reset the leak and keep us limping along. corvus mentioned being able to actually debug tomorrow19:50
ianw#link https://review.opendev.org/c/opendev/system-config/+/78312019:50
ianwputs haproxy logs into our standard container locations19:50
ianw#link https://review.opendev.org/c/opendev/system-config/+/78289819:50
clarkbianw: the dstat thing is unexpected but change lgtm19:51
ianwallows us to boot very large servers when they are donated to us :)19:51
clarkbha on that last one19:51
fungiyeah, we're a few minutes out from being able to restart the scheduler without worrying about openstack release impact19:52
fungii'm just waiting for one build to finish updating the releases site19:52
ianwis it helpful to restart with a debugger or anything for the leak?19:52
fungioh, clarkb, that oddity we were looking at with stale gerritlib used in a jeepyb job? it happened again when i rechecked19:53
ianwclarkb: yeah, i was like "i'm sure i provided a reasonable size for boot from volume ... is growroot failing, etc. etc." :)19:53
clarkbianw: I want to say we already have a hook to run profiling on object counts19:53
clarkbianw: but that is a good question and we should confirm with corvus before we restart19:53
corvusi have not previously used a debugger when debugging a zuul memory leak; only the repl and siguser19:53
corvusi'm always open to new suggestions on debugging memleaks though :)19:54
clarkbseems like the repl stuff and getting object counts has been really helpful in the past at least19:54
clarkbcorvus: when I've tried in the past it's been "fun" to figure out adding debugging symbols and all that. I suspect that since we use a compiled python via docker that this may be even more fun?19:56
clarkbwe can't just install the debugger symbols package from debian19:56
clarkb(sorting that out may be a fun exercise for someone with free time though as it may be useful generally)19:57
clarkbsounds like this may be about it. I can end here and we can go have breakfast/lunch/dinner :)19:57
clarkbthank you everyone!19:57
clarkb#endmeeting19:57
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev"19:57
openstackMeeting ended Tue Mar 30 19:57:31 2021 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:57
openstackMinutes:        http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-03-30-19.01.html19:57
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-03-30-19.01.txt19:57
openstackLog:            http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-03-30-19.01.log.html19:57
fungithanks clarkb!19:58
diablo_rojothanks clarkb!19:58
*** hasharDinner has quit IRC20:20
*** openstackstatus has quit IRC22:42
*** openstack has joined #opendev-meeting22:43
*** ChanServ sets mode: +o openstack22:43
*** sboyron has quit IRC23:02
