clarkb | and so when we started the scheduler up again it got failures from those merger instances when fetching configs? | 00:00 |
ianw | i didn't, only the scheduler. that's a good point. i'll do a full restart | 00:00 |
clarkb | ya ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) shows up in the zuul error report | 00:00 |
ianw | i'm running the full restart playbook now | 00:01 |
clarkb | ok | 00:01 |
ianw | i didn't think about the mergers | 00:01 |
ianw | ok that's finished | 00:02 |
clarkb | it's still running through its processes to start up though | 00:03 |
ianw | yep | 00:03 |
clarkb | ianw: Unknown projects: opendev/meetbot | 00:05 |
clarkb | in my paranoia I wonder if that wasn't synced over to the new server properly? | 00:06 |
clarkb | hrm no it seems git/opendev/meetbot.git does exist | 00:06 |
clarkb | it might be an order of operations thing loading configs? | 00:07 |
clarkb | ya I think it's a cross tenant order of operations thing | 00:07 |
ianw | loading, loading, loading ... | 00:12 |
ianw | ok, seems back | 00:13 |
clarkb | the error list is much much smaller now too :) | 00:13 |
clarkb | I think you can recheck your system-config change | 00:13 |
ianw | yep, it's running now | 00:14 |
ianw | \o/ | 00:14 |
ianw | i think time for a cup of tea and take a breath! | 00:14 |
clarkb | I want to see the jobs actually start but I agree | 00:15 |
clarkb | https://zuul.opendev.org/t/openstack/stream/5427ef9af78943c5aafe41ca8431fa99?logfile=console.log is the tox-docs job and it did just start | 00:15 |
clarkb | I'm a bit worried that this extended queued time is due to the mergers taking longer to set up repos | 00:17 |
clarkb | but I guess it could also just be slow node launches. Far too early to say | 00:17 |
ianw | the rest are multi node jobs i think | 00:18 |
ianw | not quite i guess | 00:19 |
ianw | we have quite a few building nodes | 00:20 |
clarkb | Looking at zuul logs I think the issue is noderequest fulfilment | 00:20 |
clarkb | the scheduler has only accepted 2 completed node requests | 00:21 |
clarkb | and another job just started. I just need to learn to be patient I think | 00:21 |
ianw | inap-mtl01 has a bunch of building nodes that look exactly like what we want :) | 00:21 |
clarkb | I wonder if we are hitting its image cache problems that we run into periodically where it has a slow period. I think everything is going as expected except for slow node boots and that is independent of our work today | 00:22 |
clarkb | ianw: I think I'll take a break now since stuff seems to be moving the right direction. I'll check in later | 00:23 |
ianw | ++ thank you! | 00:24 |
clarkb | `grep 'Accepting node request' /var/log/zuul/debug.log` on the scheduler if you want to see its progress using nodesets | 00:24 |
clarkb | though I guess that isn't much different than checking the js dashboard | 00:25 |
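A minimal sketch of following that from the shell, assuming the debug log path quoted above; the exact message text may vary between Zuul versions.

```bash
# Follow node request acceptance as the scheduler works through its backlog
# (log path taken from the grep suggested above).
sudo tail -f /var/log/zuul/debug.log | grep --line-buffered 'Accepting node request'

# Or just count how many requests have been accepted so far.
sudo grep -c 'Accepting node request' /var/log/zuul/debug.log
```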
ianw | yeah it's building a ton of nodes, so i think it's just getting itself warmed up | 00:25 |
clarkb | and thank you for doing all the hard work to make this happen :) | 00:25 |
clarkb | there we go, a bunch of jobs just started on the system-config change | 00:29 |
clarkb | and really need to take a break for dinner now. | 00:30 |
fungi | i'm not really around still, sorry, but all's looking okay? | 00:47 |
Clark[m] | We've sorted through the issues that have come up so far. Currently waiting for zuul to merge the system config change to make review02 the review server in Ansible. Then we can run the playbook manually | 00:48 |
Clark[m] | Then once that is happy we can reenable Ansible and do some cleanup | 00:49 |
Clark[m] | I'm working on dinner then probably a walk then will check back in again | 00:49 |
ianw | ahh, that system-config job wasn't actually running the gerrit checks prior | 00:56 |
Clark[m] | Ya it's different because you touched the config file | 00:57 |
opendevreview | Ian Wienand proposed opendev/system-config master: review02: move out of staging group https://review.opendev.org/c/opendev/system-config/+/797563 | 00:58 |
ianw | Clark[m]: another attempt that updates the system-config-run job as well | 00:59 |
clarkb | +2 | 01:00 |
Clark[m] | ianw: looks like it is still failing | 01:33 |
ianw | argh | 01:44 |
ianw | groupadd: GID '3000' already exists | 01:46 |
Clark[m] | That's the Gerrit gid ? | 01:47 |
Clark[m] | Maybe we are running twice for some reason? | 01:47 |
ianw | calling it review02 will help avoid having to fiddle with fake letsencrypt certs | 01:50 |
ianw | (in the system-config-run test) | 01:50 |
opendevreview | Ian Wienand proposed opendev/system-config master: review02: move out of staging group https://review.opendev.org/c/opendev/system-config/+/797563 | 01:53 |
clarkb | ianw: any idea why the Create Gerrit Group task isn't the only thing creating a gid 3000? | 01:54 |
ianw | no i'm going to watch this more closely | 01:55 |
clarkb | ok | 01:55 |
ianw | it is created as 3000 on review02 | 01:55 |
clarkb | I wonder if some package on focal that we install is creating a group after we shift the min gid and uid? | 02:01 |
clarkb | ianw: do you have a hold set up for the runs of ^ | 02:01 |
clarkb | might be a good idea to see what /etc/group says about gid 3000 | 02:01 |
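A quick sketch of the check being suggested, assuming shell access to the held node; the gerrit2 group name comes from the output that shows up later in this log.

```bash
# Does anything already own GID 3000, and what is it called?
getent group 3000

# Any users with that GID as their primary group?
awk -F: '$4 == 3000 {print}' /etc/passwd

# Any files already owned by that GID (a hint at what created it)?
sudo find / -maxdepth 3 -xdev -gid 3000 2>/dev/null
```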
ianw | yep :) | 02:04 |
clarkb | gerrit has merged a change since the move (just one more indication that things are working overall) | 02:05 |
clarkb | https://review.opendev.org/c/openstack/sushy/+/801034/ that one | 02:05 |
clarkb | I'm going to double check it shows up on the giteas | 02:06 |
ianw | yeah i think it's fine. i really wish i'd noticed this job not running prior to this | 02:06 |
clarkb | ianw: I wonder if we should consider skipping ahead and reenqueuing zuul changes? | 02:08 |
clarkb | fwiw that change showed up on the giteas just fine | 02:08 |
ianw | yep if this run doesn't pass i'll skip ahead, re-enqueue the changes (there's only about 4) and make sure the backups are running | 02:15 |
ianw | the node doesn't appear to start with a gid 3000, so that's something | 02:17 |
clarkb | ianw: for general sanity should we remove the replication config on review01? maybe just comment it out in the file? | 02:20 |
clarkb | (just wondering what happens if gerrit starts there again unexpectedly and I think the only real issue would be if it replicated) | 02:21 |
ianw | sure, i can move that out of the way. the apache is still serving the maintenance page so should be hard to merge anything | 02:26 |
ianw | i've moved it to a .post-upgrade file | 02:26 |
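Roughly what that looks like on the old server; the path is a guess at the usual Gerrit site layout, and any rename the replication plugin won't pick up works.

```bash
# On review01: park the replication config so an accidental Gerrit start
# can't push refs to the giteas. Path is an assumption about the site layout.
sudo mv /home/gerrit2/review_site/etc/replication.config \
        /home/gerrit2/review_site/etc/replication.config.post-upgrade
```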
clarkb | thanks | 02:27 |
ianw | gerrit2:x:3000: | 02:33 |
ianw | so it made the group | 02:33 |
clarkb | that's good | 02:34 |
clarkb | could it have been a side effect of the LE failure somehow? | 02:34 |
ianw | i think it must have been, but i'm not sure how | 02:35 |
ianw | Destination directory /etc/netplan does not exist | 02:36 |
ianw | sigh | 02:36 |
ianw | it's going to be a bit more work getting the CI job and hence ansible run working | 02:37 |
clarkb | seems like it is getting close now though as that is the last bit of the playbook isn't it? | 02:37 |
clarkb | ianw: I think you can split the netplan fix up out into another playbook and have that not run in the test job? | 02:38 |
ianw | we only want to do that on the production host | 02:38 |
clarkb | but do run it in the infra prod job | 02:38 |
clarkb | ya | 02:38 |
clarkb | or use a testing flag and only do that when it is undefined or false? | 02:40 |
clarkb | that might be better for simplifying testing and keeping things consistent across hosts | 02:40 |
ianw | i will re-enqueue the zuul changes, put review02 in emergency and allow ansible to start running again | 02:40 |
clarkb | ok | 02:40 |
ianw | i'm happy the server is operational, it's now just making sure the ansible apply is idempotent and doesn't move it backwards :) | 02:41 |
clarkb | ++ | 02:42 |
clarkb | I'm working on an update to your change to do the its a test flag | 02:42 |
clarkb | for the netplan config | 02:42 |
opendevreview | Clark Boylan proposed opendev/system-config master: review02: move out of staging group https://review.opendev.org/c/opendev/system-config/+/797563 | 02:45 |
clarkb | ianw: ^ something like that for the netplan issue maybe | 02:45 |
ianw | thanks | 02:46 |
clarkb | that hasn't kicked the running jobs out of check yet | 02:47 |
clarkb | a new change has just entered check so why hasn't the new patchset of ^ bumped the old one out | 02:51 |
clarkb | that could be a bug in the zuul pipeline changes I guess | 02:51 |
clarkb | now it's queued up. Ya I suspect some sort of starvation processing the pipelines | 02:53 |
ianw | maybe the last job there was in its post phase or something | 02:54 |
clarkb | well we did redo the pipeline processing in zuul this last week | 02:54 |
clarkb | so it could totally be something to do with that | 02:54 |
clarkb | ianw: do you know why we need to build the gerrit images in those changes too? | 02:56 |
clarkb | hrm I bet we turned it on for test_gerrit.py but we don't really need it? probably helps in the long run | 02:56 |
clarkb | I expect we had problems where we were updating the tests and trying to test new images with depends on or similar and it wasn't working | 02:56 |
ianw | i think system-config-run-review depends on the images so it always builds them? | 03:00 |
ianw | i don't imagine we'll be taking the server down at this point, so i think we can announce that it is back online | 03:03 |
clarkb | maybe mention that we are still working through restoration of our config management processes so acl changes and new projects aren't possible yet | 03:05 |
ianw | i might keep it simple and say the update is over, and if i can't get this sorted by EOD (which I should be able to) call that out | 03:17 |
clarkb | ok | 03:18 |
ianw | I wonder if the "unknown" time remaining somehow has to do with the pause entering the gate | 03:22 |
clarkb | ya zuul only shows a number there once all jobs have at least started | 03:24 |
ianw | #status alert The maintenance of the review.opendev.org Gerrit service is now complete and service has been restored. Please alert us in #opendev if you have any issues. Thank you | 03:24 |
opendevstatus | ianw: sending alert | 03:24 |
clarkb | so having a pause and then waiting for some stuff to happen causes that to happen in the zuul web ui | 03:25 |
-opendevstatus- NOTICE: The maintenance of the review.opendev.org Gerrit service is now complete and service has been restored. Please alert us in #opendev if you have any issues. Thank you | 03:25 | |
clarkb | ianw: do alerts change the topic? | 03:25 |
clarkb | doesn't look like it. I guess | 03:25 |
ianw | not at the moment | 03:25 |
ianw | something to do with acl permissions in oftc or something or other | 03:25 |
ianw | oh, doh, there's an end command isn't there | 03:26 |
clarkb | ya you #status ok to end the alert | 03:26 |
clarkb | which sets the topics back again iirc | 03:26 |
ianw | #status ok | 03:27 |
clarkb | I usually use #notice unless I know I want it in the topics | 03:27 |
clarkb | it might not process that until it is done processing the alert (and you may need to reissue it?) | 03:27 |
ianw | oh well it's in the checklist for next time :) | 03:28 |
ianw | probably it's good that it's been so long since i sent a global alert that i forgot! | 03:29 |
ianw | review jobs running now, fingers crossed | 03:29 |
opendevstatus | ianw: sending ok | 03:30 |
ianw | system-config-run-review-3.2 success ! yay | 03:48 |
clarkb | progress | 03:48 |
ianw | i've disabled the backup cron jobs on review01 and will get backups happening on 02 once 797563 merges and i run it | 03:54 |
clarkb | ianw: ok. Keep in mind having review02 in the emergency file makes running the playbook weird | 03:54 |
ianw | yep i have a command that uses inventory out of my checkout | 03:54 |
clarkb | I think you may end up with a huge set of jobs for zuul to work through because the inventory changed. If service-review is far down the list you might get away with just running the playbook after removing 02 from the emergency file | 03:55 |
clarkb | ianw: without the emergency file? | 03:55 |
ianw | yeah, for just running review. i'll run it by hand as i want to watch it | 03:55 |
clarkb | got it | 03:55 |
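A hedged sketch of running just the review playbook by hand from a system-config checkout; the inventory path, limit, and flags here are assumptions rather than the exact command used.

```bash
# Run only the review service playbook against the new host, reading the
# inventory from a local system-config checkout. Paths are illustrative.
cd ~/src/opendev.org/opendev/system-config
sudo ansible-playbook -v \
    -i inventory/base/hosts.yaml \
    --limit review02.opendev.org \
    playbooks/service-review.yaml
```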
clarkb | ianw: I'm a little annoyed we'll get a new gerrit image we don't need, but at the same time we just updated the gerrit image so that should be fine for whenever we restart in the future | 04:01 |
ianw | yeah, i'm not sure of a way around that | 04:04 |
opendevreview | Ian Wienand proposed opendev/system-config master: gerrit: fix Launchpad credentials write https://review.opendev.org/c/opendev/system-config/+/801227 | 04:07 |
opendevreview | Merged opendev/system-config master: review02: move out of staging group https://review.opendev.org/c/opendev/system-config/+/797563 | 04:49 |
ianw | yay, it's that easy | 04:51 |
*** ykarel|away is now known as ykarel | 04:53 | |
*** dpawlik0 is now known as dpawlik | 05:10 | |
ianw | ok i have run the review playbook against the new server and everything looks good. replication config is setup, nothing out of order in the other configs, cron jobs are there for cleanup etc. | 05:17 |
ianw | i'm taking the server out of emergency as it should be fine now | 05:17 |
opendevreview | Ian Wienand proposed opendev/system-config master: backups: add review02.opendev.org https://review.opendev.org/c/opendev/system-config/+/797564 | 05:29 |
*** mgoddard- is now known as mgoddard | 06:04 | |
*** amoralej|off is now known as amoralej | 06:10 | |
opendevreview | Merged opendev/system-config master: backups: add review02.opendev.org https://review.opendev.org/c/opendev/system-config/+/797564 | 06:19 |
ianw | looks like package installation on review02 is actually borked due to https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1926918 | 06:44 |
ianw | i'm going to try the downgrade mentioned there | 06:44 |
ianw | i think we actually might need to check all our focal systems for this | 06:45 |
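A sketch of checking for and rolling back the affected glibc, per the downgrade workaround mentioned in that bug; the version string below is only an example and should be replaced with whatever apt-cache policy reports.

```bash
# What libc6 versions does apt know about, and which is installed?
apt-cache policy libc6

# Pin back to the previous archive version. The version shown here is only
# an example; substitute the candidate listed by the policy output above.
sudo apt-get install --allow-downgrades \
    libc6=2.31-0ubuntu9.2 libc-bin=2.31-0ubuntu9.2
```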
jssfr | Good morning everyone. First time contributor to OpenStack here. My company just signed the CCLA, with my address on the list. I am now looking at the gerrit UI to figure out how to apply this. The only choices I have are the "OpenStack Individual Contributor License Agreement" and two "externally managed" ones. Should I sign the ICLA (<https://docs.openstack.org/contributors/common/setup-gerrit.html#contributors-from-a-company-or-organization> seems to suggest that) or will an external process (which I may need to poke?) add some system CLA to my account once the CCLA by my employer has been processed? | 06:49 |
*** dpawlik5 is now known as dpawlik | 06:50 | |
ianw | jssfr: i'm no expert here, but *you* should sign the ICLA and the corporate one is an extra thing for company lawyers and the opendev foundation | 07:02 |
*** hashar is now known as Guest1365 | 07:02 | |
*** hashar_ is now known as hashar | 07:02 | |
ianw | ok, i finally got borg onto review02. running initial backups now | 07:02 |
opendevreview | Ian Wienand proposed opendev/system-config master: review02: skip ~gerrit2/tmp in backup https://review.opendev.org/c/opendev/system-config/+/801235 | 07:05 |
*** dpawlik7 is now known as dpawlik | 07:06 | |
jssfr | ianw, aha, that is a viewpoint which fits my mental model *and* the stuff written on the page. Thanks! | 07:17 |
ianw | it could be made more explicit, i'd probably agree | 07:22 |
jssfr | I mean the text as written is unambiguous, but combined with the slightly aged screenshots, I wasn't sure if the process is still up-to-date. | 07:25 |
ianw | jssfr: i'm sure a contribution would be welcome to https://opendev.org/openstack/contributor-guide/src/branch/master/doc/source/common/setup-gerrit.rst :) | 07:26 |
ianw | ok, long day, but all 24 of the checklist points are marked off on https://etherpad.opendev.org/p/gerrit-upgrade-2021 | 07:42 |
ianw | the server is up with no complaints, it's processed quite a few changes now, and it has had successful backup runs | 07:43 |
ianw | nothing on the cleanup list can't wait | 07:44 |
ianw | i'll try to check back in for the next few hours, but i'm mostly out now | 07:44 |
opendevreview | Merged opendev/system-config master: review02: skip ~gerrit2/tmp in backup https://review.opendev.org/c/opendev/system-config/+/801235 | 08:14 |
*** dpawlik3 is now known as dpawlik | 08:30 | |
*** dpawlik4 is now known as dpawlik | 08:44 | |
*** ykarel is now known as ykarel|lunch | 08:54 | |
*** rpittau|afk is now known as rpittau | 09:31 | |
*** kopecmartin is now known as kopecmartin|pto | 09:45 | |
*** ykarel|lunch is now known as ykarel | 09:59 | |
fungi | jssfr: just to clarify, the ccla is purely paperwork, a best-effort/honor-system tracking of affiliations for contributors to official open infrastructure foundation projects. in contrast, the icla is required for all contributors to certain projects, for example openstack, and enforced in the code review system so that it prevents contributions from being pushed for repos under the | 11:29 |
fungi | governance of those projects unless you've agreed to it | 11:29 |
jssfr | aha, understood | 11:34 |
fungi | infra-root: just a reminder, i'm still away and on the road all day today, but should be around and start catching back up tomorrow | 12:26 |
*** amoralej is now known as amoralej|lunch | 12:52 | |
opendevreview | Ananya Banerjee proposed opendev/elastic-recheck master: Run elastic-recheck container https://review.opendev.org/c/opendev/elastic-recheck/+/729623 | 13:06 |
*** sshnaidm|afk is now known as sshnaidm | 13:10 | |
opendevreview | Ananya proposed opendev/elastic-recheck master: Run elastic-recheck container https://review.opendev.org/c/opendev/elastic-recheck/+/729623 | 13:14 |
mnaser | is it me or does gerrit feel much more snappy/quick | 13:42 |
*** amoralej|lunch is now known as amoralej | 13:43 | |
rm_work | Question for folks -- when you do DB maintenance, do you just ... take down the DB briefly and expect OpenStack services to deal with retries or whatever for the duration? Do you have a more complex strategy? Turn off OpenStack services first? Keep the DB available via a mirrored DB setup using the read-only node? | 13:49 |
clarkb | rm_work: we don't operate openstack services for the most part so not in a great position to answer | 14:28 |
rm_work | heh yeah but this channel is a who's who of operators :D | 14:28 |
clarkb | mnaser: our big theory for why the old gerrit was very slow was memory contention preventing gerrit and the operating system and the web server from having enough memory to all be happy at once. The new instance is larger (thank you mnaser and vexxhost!) allowing us to allocate more memory for each of those memory consumers | 14:28 |
clarkb | mnaser: long story short I'm very glad to hear you think it is snappier and we thank you for the help in making that happen :) | 14:29 |
mnaser | clarkb: yeah, i'm happy that it actually had a positive impact -- i tried removing a topic from a change and it happened instantly vs before which would take quite a bit of time :) | 14:29 |
clarkb | rm_work: the two recent db maintenances we did (the gerrit move and a zuul upgrade that required a db migration) were both done with services down. Not ideal but things are getting better slowly. | 14:30 |
clarkb | mnaser: another good test is dansmith's giant patch bombs :) pushing those has been very slow in the past | 14:31 |
clarkb | a series of changes or change updates to a large repo in particular | 14:31 |
clarkb | since things seem to be going well this morning I'm going to go find breakfast and do my normal startup routine. I'm hoping that I can then start hacking on testing of our project rename playbook today as well in prep for the planned renames sometime next week | 14:33 |
clarkb | rm_work: back when lifeless was thinking about these problems I think he liked the idea of a transparent cutover using an intelligent proxy | 14:34 |
clarkb | rm_work: I have no idea how feasible that is with the tooling available today, but basically you mirror the database then force all reads and writes to go through a proxy to keep things in sync. Then to cut over you have the proxy halt connections momentarily while you do a catch up on the new side and then remove the old side from the proxy | 14:35 |
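Not something OpenDev runs, just a minimal sketch of that catch-up-and-cut-over step assuming a MySQL 8 primary/replica pair behind a proxy (older servers use the SLAVE spellings); hostnames and credentials are placeholders.

```bash
# 1. Stop writes on the old primary while the proxy holds client connections.
mysql -h old-db -e "SET GLOBAL super_read_only = ON"

# 2. Wait for the replica to catch up (Seconds_Behind_Source reaching 0).
mysql -h new-db -e "SHOW REPLICA STATUS\G" | grep -i seconds_behind

# 3. Promote the replica, then repoint the proxy at new-db.
mysql -h new-db -e "STOP REPLICA; RESET REPLICA ALL; SET GLOBAL read_only = OFF"
```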
rm_work | yeah the thing i've run into is a DB team that thought it'd be helpful of them to use read-only mode for cutovers rather than a hard outage, and some services (at least octavia) that are coded to understand and retry on that, but write failures cause them to behave BADLY | 14:35 |
rm_work | trying to figure out if it's reasonable and normal to just do hard-down for maintenance on the DB briefly, and if most services play nice with that | 14:36 |
rm_work | * yeah the thing i've run into is a DB team that thought it'd be helpful of them to use read-only mode for cutovers rather than a hard outage, and some services (at least octavia) that are coded to understand and retry on hard-outage, but write failures cause them to behave BADLY | 14:37 |
*** ykarel is now known as ykarel|away | 14:38 | |
rm_work | sorry for probably misusing your channel, I have a kind of bad habit of that since it's the best place I know of to catch a specific set of people 😅 | 14:39 |
clarkb | no problem, I just wanted to be clear that we don't really have direct experience with that problem and openstack. Though I suppose other channel lurkers may (like mnaser?) | 14:39 |
rm_work | yeah I was about to ping him directly :P | 14:40 |
clarkb | corvus: yesterday when we were trying to get the system-config change to specify review02 as the new gerrit server tested and landed I pushed a new patchset for the change and zuul didn't evict the old patchset as quickly as I expected it would. | 15:20 |
clarkb | corvus: https://review.opendev.org/c/opendev/system-config/+/797563 is the change and it was patchset 5 in check when I pushed patchset 6. I don't think this is currently urgent but it occurred to me that that may indicate starvation in the pipeline processing loops? | 15:22 |
clarkb | wanted to call it out in case others notice similar | 15:22 |
dtantsur | hey! are there any mirror problems with opensuse nodes? https://zuul.opendev.org/t/openstack/build/4ba8493813d440998547da49825f7440/log/job-output.txt#673 | 15:34 |
clarkb | dtantsur: we may have synced bad state from our upstream mirror | 15:35 |
clarkb | looks like we last synced opensuse 18 days ago. The upstream we are using has a different repomd.xml that points at a file present in the upstream dir http://mirror.us.leaseweb.net/opensuse/update/leap/15.2/oss/repodata/ | 15:39 |
clarkb | VLDB: vldb entry is already locked | 15:41 |
clarkb | that is why we aren't updating that volume. I'll dig into that | 15:41 |
dtantsur | thank you! | 15:45 |
clarkb | I don't see any running vos release for that on the system that does the vos releases. I've held the flock we use to do the mirror updates for opensuse and will break the vldb lock and manually run the mirror update script | 15:47 |
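Roughly the sequence being described, using standard OpenAFS tooling; the volume name, lock file, and update script here are guesses based on context rather than the exact names on the mirror-update host.

```bash
# Clear the stale VLDB lock, release the volume to the read-only sites,
# then re-run the mirror update under the same flock the cron job uses.
vos unlock mirror.opensuse -localauth
vos release mirror.opensuse -localauth

flock -n /var/run/opensuse-mirror.lock /usr/local/bin/opensuse-mirror-update
```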
*** marios is now known as marios|out | 16:29 | |
*** amoralej is now known as amoralej|off | 16:57 | |
clarkb | dtantsur: I think you should be good now. I'm rerunning the update manually one more time to convince myself that it is happy on the update side, but the mirrors show the new content as expected | 17:03 |
dtantsur | great, thanks! I'll create a test patch | 17:04 |
clarkb | infra-root I've updated https://gerrit-review.googlesource.com/c/gerrit/+/312302/ with tests and if those pass in upstream CI (figuring out how to run them locally was an experience) I'll see what I can do to get reviews upstream | 17:07 |
clarkb | dtantsur: thank you for letting us know | 17:09 |
*** rpittau is now known as rpittau|afk | 17:38 | |
opendevreview | Chao Zhang proposed zuul/zuul-jobs master: Update commits since tag calculation https://review.opendev.org/c/zuul/zuul-jobs/+/801370 | 18:03 |
opendevreview | Chao Zhang proposed zuul/zuul-jobs master: Update commits since tag calculation https://review.opendev.org/c/zuul/zuul-jobs/+/801370 | 18:04 |
opendevreview | melanie witt proposed openstack/project-config master: Set launchpad bug Fix Released after adding comment https://review.opendev.org/c/openstack/project-config/+/801376 | 18:54 |
timburke | i saw the all-clear notice went out a while ago, but i'm still getting redirects to the maintenance page when i go to https://review.opendev.org/ -- is that expected? | 19:22 |
timburke | i'm also seeing errors like "ssh: connect to host review.opendev.org port 29418: Connection refused" if i use git/git-review, which makes me curious about how the patches just above got submitted :-/ | 19:24 |
timburke | maybe i've got some stale dns? review.opendev.org and review.openstack.org both seem to resolve to 104.130.246.32 for me, fwiw | 19:27 |
opendevreview | melanie witt proposed openstack/project-config master: Set launchpad bug Fix Released after adding comment https://review.opendev.org/c/openstack/project-config/+/801376 | 19:33 |
Clark[m] | Yes, that is the old DNS record | 19:33 |
Clark[m] | timburke: any idea what might be holding on to that value? We lowered the ttl to 5 minutes last week and prior to that it was 60 minutes. Both much shorter than the time between now and when we updated dns | 19:34 |
timburke | seems to be something on my end -- dig's telling me there's a TTL of 0 (!!) coming from SERVER: 127.0.0.53#53(127.0.0.53) :-( | 19:36 |
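For anyone hitting the same symptom, a quick way to tell resolver caching apart from a local override (which is what this turned out to be), using the hostname from the discussion:

```bash
# Ask the local stub resolver and an external resolver; a mismatch points
# at local state rather than the zone itself.
dig +short review.opendev.org
dig +short review.opendev.org @1.1.1.1

# getent goes through nsswitch, so an /etc/hosts override shows up here too.
getent hosts review.opendev.org
grep -n 'review\.' /etc/hosts
```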
timburke | definitely user error! turns out i've got something in /etc/hosts with a comment like "WTF IPv6 (Nov 2020)" 🤣 | 19:37 |
timburke | ignore me :-) | 19:37 |
opendevreview | Chao Zhang proposed zuul/zuul-jobs master: Update commits since tag calculation https://review.opendev.org/c/zuul/zuul-jobs/+/801370 | 19:59 |
opendevreview | Chao Zhang proposed zuul/zuul-jobs master: Update commits since tag calculation https://review.opendev.org/c/zuul/zuul-jobs/+/801370 | 20:00 |
clarkb | I've made my edits to the team meeting agenda. I'll hold off on sending it until ianw can check it for any missing important items/details | 20:32 |
clarkb | please add your edits soon though :) | 20:32 |
clarkb | infra-root rax rebooted eavesdrop01.opendev.org a few minutes ago. A heads up if you notice bots acting weird | 20:58 |
ianw | o/ | 22:24 |
clarkb | ianw: good morning. It's been really quiet. I think things went well. I did leave a -1 over a small thing on your change to fix the lp creds file | 22:25 |
ianw | agenda lgtm thanks | 22:25 |
ianw | yep i checked my mail this morning to see if i had a bunch of "revert" emails but thankfully not :) | 22:26 |
clarkb | ianw: I was also going to suggest maybe you double check backups and such now that the ~gerrit2/tmp exclusion landed and all the jobs for that should've run | 22:26 |
clarkb | but other than that I think its mostly answer the occasional question that came up (timburke had an /etc/hosts override for review and another person was looking for firewall update details) | 22:26 |
ianw | can do | 22:27 |
clarkb | overall looking really good. I've been working on my java as a result :) wrote tests for my openid fix and am communicating with a reviewer on that now. I'm hopeful we can get this landed | 22:27 |
ianw | ++ it would be great to not deal with that again! | 22:28 |
ianw | somehow we've got two scripts trying to backup the review db | 22:32 |
mordred | that seems less than optimal | 22:33 |
ianw | oh, it's an old one from before it got the "use a local mariadb flag" | 22:34 |
mordred | clarkb: what's your gerrit change link? | 22:34 |
mordred | clarkb: nm. I see in backscroll | 22:34 |
clarkb | mordred: https://gerrit-review.googlesource.com/c/gerrit/+/312302 | 22:34 |
ianw | there is also a cron job for /usr/local/bin/track-upstream | 22:34 |
ianw | which i think we removed right? | 22:34 |
clarkb | ianw: I know fungi was working on that, but I don't know if it landed /me looks in git logs | 22:34 |
clarkb | a change titled "Good riddance to track-upstream and its cronjob" did merge | 22:35 |
clarkb | ianw: I think you can remove that cronjob | 22:35 |
clarkb | that change was in system-config | 22:35 |
ianw | I0d6edcc34f25e6bfe2bc41d328ac76618b59f62d yep; ok i'll remove the entry | 22:35 |
clarkb | mordred: I was hoping to get some feedback on my assertions there before I push a new patchset, but I'll probably push a new patchset at EOD if I don't hear back before then just to keep things moving | 22:36 |
ianw | ok, root now runs only the two cron jobs for the backups | 22:36 |
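The sort of audit being done here, sketched generically for a migrated server:

```bash
# List root's crontab and look for leftovers from the old deployment.
sudo crontab -l -u root

# System-wide cron entries live in a few other places worth checking too.
grep -r 'track-upstream\|backup' /etc/crontab /etc/cron.d/ 2>/dev/null

# Drop a stale entry interactively (the edit itself gets logged via sudo).
sudo crontab -e -u root
```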
clarkb | agenda has been sent | 22:43 |
clarkb | ianw: one good side effect of keeping the maintenance banner up on review01 has been that it is abundantly clear you are talking to the wrong gerrit | 22:47 |
clarkb | we might want to update the text to say something like "This server has moved. If you are seeing this page then double check your DNS resolution and /etc/hosts file for review.opendev.org." ? Though that may be a one off | 22:48 |
ianw | yeah we can change it to "if you are seeing this, you're in the wrong place" :) | 22:48 |
ianw | if infra-root wants to audit their home dirs etc. for anything they feel is important and migrate it, we can probably shut it down after that | 22:49 |
clarkb | I'll make a note to do that tomorrow | 22:50 |
clarkb | I do want to preserve the gerrit account cleanup records I've been keeping. I can move those | 22:50 |
opendevreview | Ian Wienand proposed openstack/project-config master: afs graphs: track openeuler mirror volume https://review.opendev.org/c/openstack/project-config/+/801397 | 22:58 |
ianw | clarkb: ^ i think this is what pushed afs up recently and will give a more complete picture in the dashboard | 22:59 |
ianw | it would probably be good to have a stacking graph that shows all the volumes usage in context | 23:00 |
clarkb | ianw: ya the opensuse mirror stopped updating (stale lock) and when I went looking I expected it was that mirror. There was talk elsewhere about maybe doing alma linux, and debian is 5GB below its quota limit | 23:00 |
clarkb | anyway wanted to discuss if we thought we needed more disk and if the mirror.yum-puppetlabs is used | 23:01 |
clarkb | I went ahead and approved ^ since it seems straightforward | 23:02 |
clarkb | ianw: if only we could make the distros smaller :) | 23:04 |
fungi | clarkb: ianw: i'm home and skimming nick highlights, but not really here properly until tomorrow... i did delete the track-upstream cronjobs on both the old and new review server, if you check there should have been a sudo crontab -e logged when i did it too. perhaps something put it back? | 23:06 |
fungi | i guess wait and see if it reappears | 23:06 |
ianw | we may have had a run against an older system-config at some point | 23:07 |
clarkb | fungi: ya no rush or worries at the moment. I think it will likely just be a bunch of small updates here and there as we find things to improve | 23:09 |
fungi | Jul 15 13:06:18 review02 sudo: fungi : TTY=pts/6 ; PWD=/home/fungi ; USER=root ; COMMAND=/usr/bin/crontab -e | 23:09 |
fungi | i wonder what replaced it | 23:09 |
clarkb | fungi: if you do have a moment https://review.opendev.org/c/opendev/system-config/+/800274 is one that I'd like to land on Wednesday probably (I should have time to watch and monitor as it goes in). Again no rush given that timeframe but review always welcome | 23:12 |
fungi | i'll load it up in my gertty at least, maybe that'll remind me | 23:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: Point cacti at review02 explicitly https://review.opendev.org/c/opendev/system-config/+/801399 | 23:13 |
ianw | ^ that's one i just thought of, i'm pretty sure cacti is still hanging on to talking to the old server, but it's better to be clear about it | 23:13 |
clarkb | ianw: review.openstack.org points to the new server so it should be getting data from the new one now | 23:14 |
clarkb | but explicit is nice | 23:14 |
opendevreview | Merged openstack/project-config master: afs graphs: track openeuler mirror volume https://review.opendev.org/c/openstack/project-config/+/801397 | 23:19 |
ianw | yeah i guess http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=27&rra_id=all looks right | 23:21 |
ianw | i'm really not sure about the load average results though http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=26&rra_id=all | 23:21 |
clarkb | that looks about right given what I recall from memory before? It might be a bit lower now if we don't have as much memory/io/disk contention | 23:22 |
ianw | also the "used memory" doesn't seem to show up | 23:23 |
ianw | i wonder if cacti isn't so happy with something that focal is doing | 23:24 |
clarkb | ianw: sometimes bouncing the snmpd service on the host (review02 in this case) is sufficient to make things happy again | 23:25 |
clarkb | but it could also be due to the size of the values (they are much larger now) | 23:25 |
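A sketch of the snmpd bounce plus a sanity check from the cacti side; the community string and MIB names are the stock net-snmp ones and assume the MIB files are installed, so they may not match the real config.

```bash
# On review02: restart the SNMP agent.
sudo systemctl restart snmpd

# From the cacti host: confirm the memory counters are readable at all.
snmpget -v2c -c public review02.opendev.org \
    UCD-SNMP-MIB::memTotalReal.0 UCD-SNMP-MIB::memAvailReal.0
```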
ianw | i guess this is not worth too much effort given replacement plans | 23:26 |
opendevreview | Merged openstack/diskimage-builder master: Convert multi line if statement to case https://review.opendev.org/c/openstack/diskimage-builder/+/734479 | 23:31 |