| *** tosky has quit IRC | 00:20 | |
| Alex_Gaynor | apt updates appear to be broken on Ubuntu ARM64 machines | 00:30 |
|---|---|---|
| Alex_Gaynor | Can't connect to the apt mirror | 00:30 |
| ianw | kevinz: ^ this is the problem we've seen periodically where hosts seem to shut themselves down | 00:44 |
| ianw | both nb03 and the mirror are in SHUTOFF | 00:44 |
| ianw | i've actually had a netconsole on the mirror all this time, trying to catch this | 00:44 |
| ianw | [3449556.856931] afs: volume location server 23.253.200.228 in cell openstack.org is back up (code 0) | 00:45 |
| ianw | stupid relative timestamps; but i didn't catch any oops or other messages out of the mirror | 00:45 |
| ianw | this suggests to me the cloud shut it down without warning the host | 00:45 |
| ianw | Alex_Gaynor: thanks for helping as we get this more stable :) | 00:46 |
| *** larainema has quit IRC | 02:26 | |
| *** auristor has quit IRC | 02:27 | |
| *** larainema has joined #opendev | 02:29 | |
| *** hamalq_ has quit IRC | 02:32 | |
| *** iurygregory has quit IRC | 03:32 | |
| *** mlavalle has quit IRC | 03:45 | |
| *** mlavalle has joined #opendev | 03:48 | |
| *** mlavalle has quit IRC | 04:11 | |
| *** brinzhang has joined #opendev | 04:12 | |
| *** mlavalle has joined #opendev | 04:13 | |
| *** mlavalle has quit IRC | 04:16 | |
| *** mlavalle has joined #opendev | 04:17 | |
| *** ysandeep|away is now known as ysandeep | 04:17 | |
| *** mlavalle has quit IRC | 04:23 | |
| *** mlavalle has joined #opendev | 04:24 | |
| *** mlavalle has quit IRC | 04:26 | |
| *** mlavalle has joined #opendev | 04:27 | |
| *** auristor has joined #opendev | 04:41 | |
| *** mlavalle has quit IRC | 04:42 | |
| *** mlavalle has joined #opendev | 05:07 | |
| *** mgagne has quit IRC | 05:31 | |
| *** mlavalle has quit IRC | 05:53 | |
| *** mlavalle has joined #opendev | 05:57 | |
| *** brinzhang has quit IRC | 07:02 | |
| *** slaweq has joined #opendev | 08:14 | |
| *** DSpider has joined #opendev | 08:17 | |
| *** slaweq has quit IRC | 08:45 | |
| *** tosky has joined #opendev | 13:14 | |
| *** ykarel has joined #opendev | 14:27 | |
| *** ykarel has quit IRC | 15:06 | |
| Alex_Gaynor | We're now seeing 403s from the apt mirror in the ARM64 (Linaro) cloud. | 16:28 |
| *** klonn has joined #opendev | 17:10 | |
| *** slaweq has joined #opendev | 17:11 | |
| *** klonn has quit IRC | 17:17 | |
| *** klonn has joined #opendev | 17:17 | |
| *** klonn has quit IRC | 17:24 | |
| fungi | ugh, checking iy | 17:30 |
| fungi | it | 17:31 |
| fungi | http://mirror.regionone.linaro-us.opendev.org/pypi/simple/cryptography/ seems to work | 17:34 |
| Alex_Gaynor | Clicking re-run, let's see if it works now | 17:35 |
| fungi | ahh, the afs-backed mirrors seem to return 403 | 17:35 |
| fungi | afs-backed paths on our mirrors in other providers are working, so at least it doesn't seem to be a central afs problem | 17:36 |
| fungi | the openafs lkm is still loaded and afsd is still running | 17:37 |
| *** ysandeep is now known as ysandeep|away | 17:38 | |
| fungi | nothing new in dmesg output since ianw booted the server up at ~00:45 utc | 17:38 |
| fungi | [Sat Dec 19 00:47:15 2020] Unable to handle kernel paging request at virtual address 7f9a50d0d0509187 | 17:40 |
| fungi | [Sat Dec 19 00:47:15 2020] Internal error: Oops: 96000004 [#1] SMP | 17:40 |
| fungi | that seems to have happened during the afs mount bringup at boot, so maybe it broke while booting (maybe it's got a corrupt local cache from the unclean shutdown earlier) | 17:40 |
| fungi | and afsd is unkillable | 17:44 |
| fungi | i'm going to try a soft reboot | 17:45 |
| fungi | #status log rebooting mirror02.regionone.linaro-us.opendev.org in order to attempt to free an unkillable afsd process | 17:46 |
| openstackstatus | fungi: finished logging | 17:46 |
| fungi | its parent is init, and even kill -9 didn't work, stayed in Ss state, not even zombie | 17:46 |
| fungi | curious to see if it will even shutdown | 17:47 |
| fungi | if not, i'll hard reboot it via nova api | 17:47 |
| fungi | it did eventually reboot, and now after afsd taking its sweet time starting up, i can finally get a directory listing again | 17:55 |
| fungi | http://mirror.regionone.linaro-us.opendev.org/debian/ returns content now instead of a 403 forbidden | 17:56 |
| fungi | Alex_Gaynor: sorry for the delay, i think it should be back in working order now | 17:56 |
| Alex_Gaynor | will retry momentarily | 18:00 |
| *** slaweq has quit IRC | 18:10 | |
| *** klonn has joined #opendev | 18:31 | |
| *** klonn has quit IRC | 19:15 | |
| *** prometheanfire has quit IRC | 19:36 | |
| *** prometheanfire has joined #opendev | 19:36 | |
| *** klonn has joined #opendev | 22:19 | |
| *** klonn has quit IRC | 22:41 | |
| *** slittle1 has quit IRC | 22:43 | |
| *** slittle1 has joined #opendev | 22:44 | |
| *** fressi has joined #opendev | 22:58 | |
| ianw | fungi: yeah, several times i've just had to rm -rf the cache dir to get things sane again | 23:23 |
| *** DSpider has quit IRC | 23:25 | |
| ianw | it maybe wouldn't be insane to have a boot job that did that before afs | 23:29 |
| fungi | downside is every boot starts with a cold cache, even if it was a clean/controlled reboot | 23:30 |
| *** tosky has quit IRC | 23:44 | |
| *** fressi has quit IRC | 23:57 | |
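
The closing exchange weighs wiping the AFS cache directory on every boot against the cost of always starting with a cold cache. One possible middle ground, sketched below, is to wipe the cache only when the previous shutdown was not clean. This is an illustration only and not something proposed in the channel: the "clean shutdown" marker file, the function names, and any paths passed in are all hypothetical, and the real cache location depends on how afsd is configured on the host.

```python
import shutil
from pathlib import Path


def prepare_afs_cache(cache_dir: Path, clean_marker: Path) -> bool:
    """Empty cache_dir unless clean_marker exists. Return True if it was wiped.

    Hypothetical boot-time step, meant to run before afsd starts.
    Both paths are assumptions supplied by the caller.
    """
    was_clean = clean_marker.exists()
    # Consume the marker so that a crash before the next clean shutdown
    # is treated as unclean on the following boot.
    clean_marker.unlink(missing_ok=True)
    if was_clean:
        return False  # keep the warm cache after a controlled reboot
    if cache_dir.is_dir():
        # Remove the contents but keep the directory itself in place.
        for entry in cache_dir.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    return True


def mark_clean_shutdown(clean_marker: Path) -> None:
    """Hypothetical shutdown-time step, run after afsd has stopped cleanly."""
    clean_marker.parent.mkdir(parents=True, exist_ok=True)
    clean_marker.touch()
```

With this arrangement a controlled reboot keeps its warm cache, while a host that was shut off underneath the guest (as suspected at 00:45) would come up with an empty one. The marker must live on storage that survives a reboot, otherwise every boot looks unclean.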