Friday, 2023-11-17

@nexn:matrix.orgNilesX: May I know which release of the StarlingX ISO you used?07:10
@nexn:matrix.orgI am using release 807:11
@maloute:matrix.orgSame m.07:29
@nexn:matrix.orgNilesX: If possible, could you kindly send the direct download link for that?07:31
@nexn:matrix.orgCould I know whether there is any RAID configuration in your environment?07:53
@maloute:matrix.orgThe link to download is on the release page08:10
@maloute:matrix.orgAnd no raid for me08:10
@nexn:matrix.orgokay08:15
@nexn:matrix.org> <@nexn:matrix.org> sent an image.09:18
I hope you can see the alarm raised in fault management; could that maybe be the reason we aren't able to unlock the controller host?
@maloute:matrix.orgif you look at the dashboard when unlocking you should see the different steps it's doing09:46
@maloute:matrix.orgas for your alarm, this seems to be normal since you ran this command just 1 min after unlocking09:46
@nexn:matrix.orgThen what could be the reason we aren't able to unlock the host? We tried with your guidance and the case is still the same.11:10
@nexn:matrix.orgThank you NilesX, we are able to unlock the controller host!12:57
@maloute:matrix.orgNice one! What was different?13:04
@bruce.jones:matrix.orgLooks like review.opendev.org is down15:36
@jreed:matrix.orgCan confirm15:36
@bruce.jones:matrix.orgfungi: is there an ETA?15:36
@fungicide:matrix.orgshould take no more than an hour. this upgrade maintenance was announced some time ago: https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/thread/XT26HFG2FOZL3UHZVLXCCANDZ3TJZM7Q/15:39
@bruce.jones:matrix.orgtyty15:40
@fungicide:matrix.orgif you're not subscribed to service-announce, you might consider doing so. it's very low-traffic15:40
@fungicide:matrix.orgalso for more in-the-moment things, you may want to follow https://fosstodon.org/@opendevinfra/15:41
@fungicide:matrix.orgit was offline for about 10 minutes upgrading, but we're still poking around to make sure everything looks okay so don't be surprised if we have to take it down again for a rollback if we find something we overlooked in staging tests15:44
@fungicide:matrix.orgalso we have zuul offline for a few more minutes, so anything that gets pushed or approved at the moment may need a recheck once it comes back online15:49
@jreed:matrix.orgZuul is still down :/ - Will it automatically pick up jobs when it comes back?16:59
@jreed:matrix.orgZuul is back up!!!17:08
@jreed:matrix.orgBut this is the biggest backlog of jobs I've ever seen... WOW - 17:09
@jreed:matrix.orghttps://zuul.openstack.org/status17:09
@fungicide:matrix.orgjreed: note that most of the builds for those were already completed, only builds which happened to be running when we brought the service down are being rerun now, so the backlog looks worse than it is. the test nodes and node request graphs here give a slightly more accurate picture: https://grafana.opendev.org/d/21a6e53ea4/zuul-status17:15
@fungicide:matrix.orgalso on the running builds graph you can see that it had an average of about 40 builds in progress from each executor before the maintenance, and caught back up to roughly the same within a few minutes of coming back online17:19
@fungicide:matrix.orgit will have missed gerrit events which occurred between 15:30 and 17:05 utc though, so you might have to add recheck comments to any changes you uploaded during that span of time, or remove and reapply approval votes for anything that got approved within that window17:20
@jreed:matrix.orgI think I have one stuck. 17:22
@jreed:matrix.orgI'll see if I can get a core to redo their vote real quick and see if that works. 17:22
@jreed:matrix.orgI don't think simply replying with a comment does it.17:22
@fungicide:matrix.orgstuck as in it got approved while zuul was offline? yeah in that case you'll need a new approval vote on it (which can be done by the same person too, they just have to remove their workflow +1 and then add it again)17:24
@jreed:matrix.orgYes. I'm trying that now.17:24
@jreed:matrix.orgAnd it's running.... that's the way to go. Thanks fungi 17:25
@fungicide:matrix.orgyw17:27
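As a concrete illustration of the two re-trigger approaches fungi describes above (leaving a "recheck" comment, or removing and re-adding the Workflow +1 vote), here is a minimal Python sketch against the Gerrit REST API. This is not necessarily the exact mechanism used in the conversation (the vote was toggled in Gerrit directly); the credentials and change number below are placeholders.

```python
# Minimal sketch: re-triggering Zuul on a Gerrit change via the REST API.
# The credentials and change number are hypothetical placeholders.
import requests

GERRIT = "https://review.opendev.org"
AUTH = ("my-username", "my-http-password")  # Gerrit HTTP credentials (placeholder)
CHANGE = "901234"                           # change number (placeholder)

def post_review(payload):
    """POST a ReviewInput to the change's current revision."""
    url = f"{GERRIT}/a/changes/{CHANGE}/revisions/current/review"
    resp = requests.post(url, json=payload, auth=AUTH, timeout=30)
    resp.raise_for_status()

# Option 1: a "recheck" comment makes Zuul re-run the check pipeline.
post_review({"message": "recheck"})

# Option 2: for a change approved while Zuul was offline, clear and re-add
# Workflow +1 so Zuul sees a fresh approval event and re-enqueues the gate.
post_review({"labels": {"Workflow": 0}})
post_review({"labels": {"Workflow": 1}})
```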
@jreed:matrix.orgWell... zuul ran and one of the jobs failed but I don't think it was a test failure17:49
@jreed:matrix.orghttps://zuul.opendev.org/t/openstack/build/7b5008b924e247c7a1f3eb76fe96151f17:49
@jreed:matrix.orgSaid it couldn't install dependencies??17:49
@jreed:matrix.orgIt passed the normal check just fine... I think zuul is having an issue with it because of the maintenance17:51
@jreed:matrix.orgI guess we should just re-trigger it again?17:52
@jreed:matrix.orgfungi:  Any idea what's up? 17:54
@jreed:matrix.orgI've requeued the job.  Not sure what the heck is happening.17:57
@jreed:matrix.orgFailed trying to reach - https://mirror-int.ord.rax.opendev.org/wheel/debian-11.8-x86_64/pip/  - Is that mirror undergoing maintenance as well?18:00
@fungicide:matrix.orgjreed: that looks like ansible is incorrectly using `debian-11.8` instead of just `debian-11` in the url, likely something to do with how it expands the distro version string. i recall something related to that recently in newer ansible, digging to see if i can find details19:12
@fungicide:matrix.orglooking deeper, that url path is wrong but it's also wrong in the subsequent builds that passed, so normally it's non-fatal (but should still be fixed). i think whatever happened to that build was something else, but i'm still looking19:43
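A small illustration of the version-string expansion issue fungi mentions: the wheel mirror path should use the distro's major version ("debian-11"), but the job's templating expanded the full version ("debian-11.8"). The values below are hard-coded stand-ins for what Ansible exposes as ansible_distribution_version / ansible_distribution_major_version; this is not the actual job code.

```python
# Hypothetical stand-ins for the Ansible facts involved.
full_version = "11.8"                        # like ansible_distribution_version
major_version = full_version.split(".")[0]   # "11", like ansible_distribution_major_version
arch = "x86_64"
mirror = "https://mirror-int.ord.rax.opendev.org"

wrong = f"{mirror}/wheel/debian-{full_version}-{arch}/pip/"   # .../debian-11.8-x86_64/pip/
right = f"{mirror}/wheel/debian-{major_version}-{arch}/pip/"  # .../debian-11-x86_64/pip/

# The wrong path 404s, but the wheel mirror is only an extra index for pip,
# which falls back to the main index; that is why the bad URL is normally
# non-fatal, as fungi notes above.
print(wrong)
print(right)
```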
@fungicide:matrix.orgtox log collection in that job definition seems to be broken, and the console log is massive (i may have to switch to plaintext because it's shredding my browser), so investigation is slow-going but it does at least seem that whatever happened to cause that build to fail is not occurring consistently19:53
@fungicide:matrix.orgalmost 62k lines in that console log19:54
@fungicide:matrix.org`Could not fetch URL https://mirror-int.ord.rax.opendev.org/pypi/simple/pygments/: connection error: HTTPSConnectionPool(host='mirror-int.ord.rax.opendev.org', port=443): Read timed out. - skipping`20:11
@fungicide:matrix.orghttps://zuul.opendev.org/t/openstack/build/7b5008b924e247c7a1f3eb76fe96151f/log/job-output.txt#6123020:11
@fungicide:matrix.orgjreed: that was the fetch that succeeded in later builds, so it does look like some temporary network issue impacted the job node (or possibly the mirror server), such that the https connection timed out reading from the socket at some point when trying to retrieve that index20:12
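The failure, then, was a transient read timeout fetching the pip index from the in-region mirror, not a problem with the change itself. Below is a minimal sketch of probing for that failure mode; note that mirror-int.* hostnames resolve only inside the provider's network, so this would have to run from a job node in that region, and the retry/timeout values are arbitrary.

```python
# Probe the mirror index a few times to distinguish a transient network blip
# (a read timeout, as in the failed build) from a persistent mirror outage.
import time
import requests

INDEX = "https://mirror-int.ord.rax.opendev.org/pypi/simple/pygments/"  # from the build log

def probe(url, attempts=3, timeout=15):
    for i in range(1, attempts + 1):
        try:
            resp = requests.get(url, timeout=timeout)
            print(f"attempt {i}: HTTP {resp.status_code}, {len(resp.content)} bytes")
            return True
        except requests.exceptions.Timeout:
            print(f"attempt {i}: timed out after {timeout}s")
        except requests.exceptions.ConnectionError as exc:
            print(f"attempt {i}: connection error: {exc}")
        time.sleep(2)
    return False

if __name__ == "__main__":
    probe(INDEX)
```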
