clarkb | real 437m38.542s <- one notedb migration | 00:00 |
clarkb | fungi: I think I figured out how the change numbers work the All-Projects repo has a refs/meta/sequence entry that seems to be the counter | 00:19 |
clarkb | so ya redirects may just work | 00:19 |
clarkb | though maybe it has to scan all the repos for the number? not sure how that works | 00:20 |
clarkb | perhaps it looks it up in the index | 00:20 |
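clarkb's guess above can be sketched locally: NoteDB stores the next change number as a plain-text blob on a ref in All-Projects (upstream Gerrit documentation names the ref refs/sequences/changes, close to the refs/meta/sequence clarkb saw; the counter value below is made up):

```shell
# Simulate Gerrit's NoteDB change-number counter: a blob on a ref.
repo=$(mktemp -d)
git init -q "$repo"
# Write the counter value as a blob and point the sequence ref at it.
blob=$(echo 757200 | git -C "$repo" hash-object -w --stdin)
git -C "$repo" update-ref refs/sequences/changes "$blob"
# A lookup just resolves the ref and reads the blob -- no repo scan needed.
git -C "$repo" cat-file -p refs/sequences/changes   # prints 757200
```

This is why redirects by change number don't need to scan every repo: the counter is a single ref, and per-change lookups go through the index.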
*** hamalq has quit IRC | 00:32 | |
openstackgerrit | Fatema Khalid Sherif proposed opendev/storyboard-webclient master: Show story description markdown preview by default https://review.opendev.org/756940 | 01:12 |
openstackgerrit | Fatema Khalid Sherif proposed opendev/storyboard-webclient master: Show story description markdown preview by default https://review.opendev.org/756940 | 01:47 |
*** auristor has quit IRC | 02:17 | |
*** auristor has joined #opendev | 02:20 | |
ianw | i believe this opens the maintenance window for the rax db's .. will keep an eye | 03:02 |
clarkb | thanks. I'm checking irc periodically | 03:06 |
*** ysandeep|away is now known as ysandeep | 03:29 | |
openstackgerrit | Merged opendev/system-config master: Add initial borg backup server https://review.opendev.org/756607 | 03:42 |
ianw | infra-prod-service-bridge timed_out | 03:42 |
ianw | hrm | 03:42 |
ianw | looks like i have also not hooked borg-backup jobs in correctly either | 03:43 |
ianw | oh, no doh that's the hourly runs. still something is wrong | 03:43 |
ianw | # ps -aef | grep ansible-playbook | wc -l | 03:44 |
ianw | 193 | 03:44 |
ianw | logstash-worker02.openstack.org. seems to be the dead host | 03:45 |
ianw | hung tasks as usual i guess | 03:51 |
ianw | (i mean i checked the console and that's what's on it) | 03:51 |
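ianw's `ps -aef | grep ansible-playbook | wc -l` just counts processes; a variant that also flags long-runners makes hung playbooks easier to spot (the one-hour threshold is an assumption, not what ianw used):

```shell
# Count ansible-playbook processes and flag any running over an hour.
# etimes= prints elapsed seconds with no header line.
ps -eo etimes=,args= | awk -v limit=3600 '
  /ansible-playboo[k]/ {      # [k] keeps awk from matching its own cmdline
    total++
    if ($1 > limit) stale++
  }
  END { printf "total=%d stale=%d\n", total+0, stale+0 }'
```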
clarkb | rebooting those is basically always safe | 03:52 |
clarkb | even when not sad | 03:52 |
clarkb | (we'll drop some logs but meh, at a billion records a day that's ok) | 03:52 |
ianw | #status rebooted logstash-worker02.openstack.org | 03:56 |
openstackstatus | ianw: unknown command | 03:56 |
ianw | #status log rebooted logstash-worker02.openstack.org | 03:56 |
openstackstatus | ianw: finished logging | 03:56 |
ianw | i've cleared out everything on bridge that was stuck | 03:56 |
*** ykarel|away has joined #opendev | 04:23 | |
*** ykarel|away is now known as ykarel | 04:28 | |
clarkb | I think we are outside the db window? | 05:01 |
*** marios has joined #opendev | 05:12 | |
*** ykarel has quit IRC | 05:34 | |
*** ykarel has joined #opendev | 05:35 | |
*** rpittau|afk is now known as rpittau | 05:43 | |
ianw | 2am cdt | 05:43 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: install-borg: also install python3-venv https://review.opendev.org/757000 | 05:51 |
*** sshnaidm is now known as sshnaidm|off | 06:06 | |
*** eolivare has joined #opendev | 06:34 | |
*** tkajinam has quit IRC | 06:42 | |
*** tkajinam has joined #opendev | 06:42 | |
*** ralonsoh has joined #opendev | 06:59 | |
*** hashar has joined #opendev | 07:01 | |
*** Dmitrii-Sh has quit IRC | 07:11 | |
*** ysandeep is now known as ysandeep|lunch | 07:25 | |
*** fressi has joined #opendev | 07:27 | |
*** slaweq has joined #opendev | 07:37 | |
*** slaweq has quit IRC | 07:37 | |
*** slaweq has joined #opendev | 07:38 | |
*** tosky has joined #opendev | 07:46 | |
*** moppy has quit IRC | 08:01 | |
*** moppy has joined #opendev | 08:01 | |
*** fressi has left #opendev | 08:20 | |
*** fressi has joined #opendev | 08:23 | |
*** Dmitrii-Sh has joined #opendev | 08:58 | |
openstackgerrit | likui proposed openstack/diskimage-builder master: Switch to unittest mock https://review.opendev.org/757031 | 08:59 |
openstackgerrit | likui proposed openstack/diskimage-builder master: replace imp module https://review.opendev.org/751236 | 09:09 |
*** ysandeep|lunch is now known as ysandeep | 10:01 | |
*** roman_g has joined #opendev | 10:02 | |
*** DSpider has joined #opendev | 10:31 | |
*** priteau has joined #opendev | 10:49 | |
*** Eighth_Doctor has quit IRC | 11:02 | |
*** mordred has quit IRC | 11:02 | |
*** mordred has joined #opendev | 11:11 | |
*** ykarel has quit IRC | 11:16 | |
*** ykarel_ has joined #opendev | 11:16 | |
*** ykarel has joined #opendev | 11:32 | |
*** ykarel_ has quit IRC | 11:33 | |
*** ttx has quit IRC | 11:34 | |
*** Eighth_Doctor has joined #opendev | 11:35 | |
*** ttx has joined #opendev | 11:36 | |
*** slaweq has quit IRC | 12:03 | |
*** slaweq has joined #opendev | 12:17 | |
*** ysandeep is now known as ysandeep|brb | 12:25 | |
*** fressi has quit IRC | 12:31 | |
*** hashar has quit IRC | 12:46 | |
*** rpittau is now known as rpittau|afk | 13:03 | |
*** ysandeep|brb is now known as ysandeep | 13:31 | |
openstackgerrit | Bernard Cafarelli proposed openstack/project-config master: Update neutron stable grafana dashboards https://review.opendev.org/757102 | 13:49 |
*** ykarel has quit IRC | 13:52 | |
*** ykarel has joined #opendev | 13:52 | |
openstackgerrit | Nicolas Alvarez proposed openstack/project-config master: Add initial files to project-config repo. https://review.opendev.org/756717 | 14:16 |
openstackgerrit | Nicolas Alvarez proposed openstack/project-config master: Rename StarlingX Armada App files. https://review.opendev.org/757113 | 14:16 |
*** slaweq has quit IRC | 14:42 | |
*** fressi has joined #opendev | 14:44 | |
*** fressi has quit IRC | 14:47 | |
*** mlavalle has joined #opendev | 14:57 | |
*** ysandeep is now known as ysandeep|away | 14:58 | |
openstackgerrit | Nicolas Alvarez proposed openstack/project-config master: Add SNMP Armada App to StarlingX. https://review.opendev.org/756717 | 15:00 |
*** ykarel is now known as ykarel|away | 15:08 | |
*** eolivare has quit IRC | 15:10 | |
openstackgerrit | Nicolas Alvarez proposed openstack/project-config master: Add SNMP Armada App to StarlingX. https://review.opendev.org/756717 | 15:12 |
*** lpetrut has joined #opendev | 15:19 | |
*** lpetrut has quit IRC | 15:35 | |
*** ykarel has joined #opendev | 15:36 | |
*** ykarel|away has quit IRC | 15:37 | |
*** ykarel has quit IRC | 15:39 | |
*** priteau has quit IRC | 16:06 | |
*** marios is now known as marios|out | 16:10 | |
*** marios|out has quit IRC | 16:23 | |
*** tosky has quit IRC | 16:35 | |
*** priteau has joined #opendev | 16:41 | |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Stop replicating to local git mirror on gerrit https://review.opendev.org/757152 | 16:44 |
clarkb | fungi: as I push changes like ^ up I'll be editing configs on review-test and restarting things there if necessary | 16:45 |
*** hamalq has joined #opendev | 16:45 | |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Disable change.move in gerrit https://review.opendev.org/757153 | 16:50 |
openstackgerrit | Nicolas Alvarez proposed openstack/project-config master: Add SNMP Armada App to StarlingX. https://review.opendev.org/756717 | 16:55 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Stop blocking /p/ in the gerrit apache vhost https://review.opendev.org/757155 | 16:56 |
fungi | clarkb: thanks for the heads up, i'm not testing anything at the moment | 17:02 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Switch to zuul's default gerrit auth type https://review.opendev.org/757156 | 17:03 |
clarkb | fungi: is there a change to fix the cla.html problem yet? | 17:04 |
clarkb | it won't be a problem in prod but only because we'll upgrade gerrit on the existing host | 17:05 |
clarkb | asking because I need to sort out the best way to clean up commentlinks and such and want to avoid conflicts if I can | 17:05 |
clarkb | might just rebase my whole stack on that actually. If you haven't written one yet should I go ahead and add it to my stack ? | 17:07 |
fungi | clarkb: there's not yet, but per your earlier comments about dropping the js file change, maybe we can just repurpose that one to add the cla.html file? | 17:08 |
clarkb | fungi: ya I'm thinking now we should land a change soon that manages the files we use, then I'll do a cleanup change that is WIP'd to remove the ones we don't want later | 17:09 |
clarkb | the reason for that is if we end up on 2.16 then we want to keep at least the css stuff for the old web ui | 17:09 |
clarkb | I can update that change to do the cla.html too if you'd prefer I do it | 17:09 |
clarkb | then I'll rebase the changes above on that | 17:09 |
*** ykarel has joined #opendev | 17:12 | |
fungi | i figure we'll want that file included for 2.16 use too | 17:15 |
clarkb | ++ | 17:17 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Add gerrit static files that were lost in ansiblification https://review.opendev.org/746335 | 17:43 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Stop replicating to local git mirror on gerrit https://review.opendev.org/757152 | 17:43 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Disable change.move in gerrit https://review.opendev.org/757153 | 17:43 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Stop blocking /p/ in the gerrit apache vhost https://review.opendev.org/757155 | 17:43 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Switch to zuul's default gerrit auth type https://review.opendev.org/757156 | 17:43 |
clarkb | fungi: ^ updated that first change and rebased the stack on it. Need to reapply some WIP's but the beginning of that stack should be safe to land | 17:43 |
fungi | i guess 746335 was rebased in addition to being updated. interdiff is yuge | 17:53 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Clean up old Gerrit html theming and commentlinks https://review.opendev.org/757161 | 17:54 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Remove reviewdb config from Gerrit https://review.opendev.org/757162 | 17:54 |
clarkb | fungi: oh yup sorry about that | 17:54 |
clarkb | I'm going to apply those last two changes to review-test now | 17:55 |
*** priteau has quit IRC | 17:58 | |
clarkb | and all that looks good | 17:58 |
*** priteau has joined #opendev | 18:00 | |
johnsom | Are there known issues with the proxies? Failed to fetch https://mirror.bhs1.ovh.opendev.org/ubuntu/dists/bionic/InRelease Could not connect to mirror.bhs1.ovh.opendev.org:443 (158.69.73.218), connection timed out | 18:02 |
johnsom | https://71ded92e78f7ee54474f-70fe5a5f20a67e625f6dcb03e84a8d62.ssl.cf1.rackcdn.com/757158/1/check/octavia-v2-dsvm-scenario/afa1d04/job-output.txt | 18:02 |
fungi | that looks unexpected | 18:03 |
fungi | i wonder if the vm has died | 18:04 |
johnsom | Most of our jobs are red right now | 18:04 |
clarkb | [Fri Oct 9 14:33:46 2020] afs: Lost contact with file server 23.253.73.143 in cell openstack.org (code -1) (all multi-homed ip addresses down for the server) | 18:04 |
clarkb | didn't that just happen? | 18:04 |
fungi | i can ssh into it | 18:04 |
fungi | 14:33:46 is 3.5 hours ago | 18:04 |
clarkb | ya I wonder if it made the afs sad | 18:05 |
fungi | [Fri Oct 9 14:34:30 2020] afs: file server 23.253.73.143 in cell openstack.org is back up (code 0) (multi-homed address; other same-host interfaces may still be down) | 18:05 |
clarkb | if you hit https://mirror.bhs1.ovh.opendev.org/ubuntu/dists/bionic/InRelease it fails though | 18:05 |
clarkb | which implies apache can't read from the fs | 18:05 |
fungi | it saw it again ~45 seconds later | 18:05 |
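The kernel-log lines clarkb and fungi are quoting can be pulled out in one pass (message shapes copied from the pastes above; `dmesg` may need root):

```shell
# Show the most recent OpenAFS connectivity complaints from the kernel log.
dmesg | grep -E 'afs: (Lost contact with file server|file server .* is back up)' | tail -n 5
```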
fungi | i am having trouble accessing /afs/openstack.org/ from that vm | 18:06 |
fungi | i can get to it from other servers, like static.o.o | 18:07 |
clarkb | we can restart the openafsclient service or reboot then probably? | 18:07 |
clarkb | I'm betting its a local issue triggered by the server going away | 18:07 |
*** ykarel has quit IRC | 18:07 | |
*** priteau has quit IRC | 18:08 | |
fungi | yeah, it may not be able to restart though if there are open file handles in /afs | 18:08 |
fungi | trying to restart it now | 18:09 |
*** ralonsoh has quit IRC | 18:10 | |
fungi | it restarted but still hangs trying to access /afs/openstack.org/ | 18:11 |
clarkb | I wonder if the network issues I've got to rax hosts are related | 18:11 |
fungi | i can't ssh into afs02.dfw | 18:11 |
clarkb | ugh | 18:11 |
fungi | afs01.dfw is responding though | 18:12 |
fungi | betting afs02.dfw is hung once again | 18:12 |
fungi | checking oob console | 18:12 |
fungi | hung kernel tasks 323400 seconds after boot (3.75 days ago, i rebooted it 2020-10-05 20:06:49 UTC according to our status log) | 18:15 |
fungi | #status log hard rebooted afs02.dfw.o.o to address a server hung condition | 18:17 |
openstackstatus | fungi: finished logging | 18:17 |
clarkb | it's weird because those servers have been fairly stable until recently. But I guess it could be live migrations or similar | 18:17 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: DNM Forcing a gitea job failure to test gerrit replication https://review.opendev.org/757165 | 18:18 |
fungi | i can ssh into afs02 again | 18:18 |
TheJulia | are mirror issues a known thing right now? | 18:18 |
clarkb | TheJulia: yes, the file server went out to lunch again | 18:19 |
TheJulia | looks like it | 18:19 |
TheJulia | ugh | 18:19 |
clarkb | I've put a hold on https://review.opendev.org/757165 and will use the gitea it builds to test gerrit replication from review-test | 18:19 |
clarkb | probably tomorrow though as I'll need to figure out credentials and all that | 18:19 |
*** roman_g has quit IRC | 18:19 | |
TheJulia | should we be giving CI some time or.... | 18:20 |
TheJulia | Well, should I go enjoy a beverage or go poke patches I guess is what I'm wondering | 18:20 |
clarkb | I expect this will recover as soon as the server finishes rebooting | 18:20 |
TheJulia | k | 18:20 |
*** roman_g has joined #opendev | 18:20 | |
*** roman_g has quit IRC | 18:21 | |
clarkb | https://mirror.bhs1.ovh.opendev.org/ubuntu/dists/ isn't loading yet so not yet | 18:21 |
fungi | yeah, my attempts to ls in /afs/openstack.org/ from it are still stuck | 18:21 |
fungi | load average on mirror.bhs1.ovh.opendev.org is 122 | 18:22 |
fungi | i'm going to stop apache on it for a minute | 18:22 |
TheJulia | would it make sense to just shutdown CI because I suspect it is getting killed with activity too | 18:23 |
TheJulia | With mirrors out to lunch, the jobs are toast anyway | 18:23 |
fungi | well, it will take longer to remove that region from nodepool than it will to get it back on track, worst case i'll reboot it | 18:23 |
TheJulia | k | 18:24 |
* TheJulia goes and checks on the pie | 18:24 | |
fungi | stopping apache on it is taking forever, so may be faster to forcibly reboot the vm | 18:24 |
clarkb | fungi: wfm | 18:24 |
*** tkajinam has quit IRC | 18:25 | |
fungi | #status log hard rebooted mirror01.bhs1.ovh to recover from high load average (apparently resulting from too many hung reads from afs) | 18:26 |
openstackstatus | fungi: finished logging | 18:26 |
fungi | it's still booting up. hopefully it's not prompting for an interactive fsck on the console | 18:27 |
clarkb | you should be able to get the console from ovh | 18:28 |
clarkb | via the normal apis | 18:28 |
fungi | A start job is running for OpenAFS client (3min 4s / 3min 24s) | 18:29 |
fungi | okay, now it's up | 18:29 |
fungi | ls: cannot access '/afs/openstack.org/': No such file or directory | 18:29 |
clarkb | I wonder if it is a network issue between ovh and rax then | 18:30 |
clarkb | on top of the other issue | 18:30 |
clarkb | try restart the openafs client? | 18:30 |
fungi | i can ping both afs01.dfw and afs02.dfw from mirror.bhs1.ovh | 18:30 |
fungi | didn't help | 18:31 |
*** tosky has joined #opendev | 18:32 | |
clarkb | the fact that /afs is entirely empty makes me think it is the client/kernel | 18:33 |
clarkb | if it were just the remote we'd be able to see the other afs fses? | 18:33 |
fungi | `vos status -server afs01.dfw.openstack.org` reports "attachFlags: busy" | 18:34 |
clarkb | fungi: you restarted openafs client? based on systemctl status openafs-client it appears to have been running since 18:26 | 18:35 |
clarkb | er not status ps | 18:35 |
fungi | and for afs02.dfw it says "procedure: Restore" | 18:35 |
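The fileserver checks fungi runs, collected as a script (needs the OpenAFS client tools installed; only syntax-checked here so nothing touches production):

```shell
# Write out the fileserver health checks and verify the script parses.
cat > /tmp/check-afs.sh <<'EOF'
#!/bin/sh -e
vos status -server afs01.dfw.openstack.org   # lists in-flight volume transactions
vos status -server afs02.dfw.openstack.org   # "procedure: Restore" = a release still replaying
EOF
sh -n /tmp/check-afs.sh   # parse only; do not run against prod blindly
```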
clarkb | the afsd is running since 18:26 | 18:35 |
clarkb | but the service is from 18:31 which makes me think it didn't truly restart | 18:35 |
fungi | huh, i did a `sudo systemctl restart openafs-client` | 18:35 |
fungi | for me it says "Active: active (running) since Fri 2020-10-09 18:31:08 UTC; 5min ago" which was shortly after my restart | 18:36 |
clarkb | maybe we should stop then start it | 18:37 |
fungi | yeah, i concur. systemd and the process list are not in agreement | 18:37 |
fungi | it's not stopping, presumably because it's busy | 18:38 |
clarkb | ugh | 18:38 |
fungi | it's been trying to rmmod openafs for 11 minutes | 18:38 |
fungi | shall i add the nl hosts into the emergency disable list and then take bhs1 out of the nodepool configs on the servers? | 18:39 |
clarkb | fungi: you only need to do nl04 and ya that seems like a good idea | 18:40 |
fungi | oh, right | 18:40 |
*** priteau has joined #opendev | 18:42 | |
fungi | does the container need any kicking for a max-servers value change? | 18:42 |
fungi | or does it pick that up automatically when the file is modified? | 18:42 |
clarkb | it should pick it up | 18:43 |
clarkb | it rereads the config file on every pass through its runtime loop | 18:43 |
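Because the launcher rereads its yaml on every pass of the run loop, zeroing max-servers is enough; no container restart is needed. A sketch of the edit (the file path, provider name, and yaml layout here are assumptions about the deployment):

```shell
# Zero max-servers for a region in a nodepool-style config.
conf=$(mktemp)
cat > "$conf" <<'EOF'
providers:
  - name: ovh-bhs1
    pools:
      - name: main
        max-servers: 159
EOF
sed -i 's/max-servers: .*/max-servers: 0/' "$conf"
grep 'max-servers' "$conf"   # the launcher picks this up on its next loop
```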
fungi | i suppose i should check the other mirrors to see if this problem is more widespread | 18:43 |
clarkb | I wonder if we should try another cleaner reboot of the bhs1 mirror | 18:43 |
fungi | i suspect it won't be able to cleanly reboot because of the openafs lkm | 18:44 |
fungi | but trying a graceful reboot now | 18:44 |
fungi | it did manage to shutdown | 18:45 |
fungi | at least according to the console | 18:45 |
fungi | gra1.ovh seems to be working fine | 18:46 |
clarkb | fungi: did you sudo reboot or nova reboot? | 18:46 |
fungi | sudo reboot | 18:47 |
fungi | all three rax mirrors can reach afs | 18:47 |
clarkb | it does seem like it may have shut down services but is waiting for the kernel to be happy | 18:47 |
clarkb | because ssh is refusing connections for a long time which normally doesn't happen on boot | 18:47 |
fungi | both vexxhost mirrors are happy | 18:47 |
fungi | ssh just started | 18:48 |
clarkb | oh now I get pam complaining ya | 18:48 |
*** priteau has quit IRC | 18:48 | |
fungi | it was complaining about hung kernel tasks at shutdown for a few minutes | 18:48 |
fungi | i guess because the openafs driver was unresponsive/busy | 18:48 |
fungi | console says the openafs-client startup is timing out again | 18:50 |
clarkb | I feel like auristor said there was a race that may cause this at one time | 18:50 |
clarkb | but I don't recall if there was a proposed fix (or if I even recall accurately) | 18:50 |
clarkb | fungi: idea: we disable openafs-client and reboot again. Let it come up happy then manually start openafs-client? | 18:51 |
fungi | the construction crew here is wrapping up so i'm going to need to step away for a bit. if this persists at all we're going to need a change to zero the max-servers in git | 18:51 |
clarkb | I can do the service disable and reboot | 18:52 |
fungi | thanks | 18:52 |
clarkb | then try and start it manually | 18:52 |
clarkb | disabled and rebooting now | 18:52 |
clarkb | after confirming there was another rmmod and /afs was empty | 18:52 |
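The disable-reboot-start sequence clarkb just ran, written out as a script (run as root on the mirror; it reboots the host, so it is only syntax-checked here):

```shell
# Recovery plan: keep openafs-client from hanging boot, then start it by hand.
cat > /tmp/afs-recover.sh <<'EOF'
#!/bin/sh -e
systemctl disable openafs-client   # so the unit can't block the boot sequence
reboot
# ...after the host comes back up:
systemctl start openafs-client
ls /afs/openstack.org/             # verify the cell actually mounts
EOF
sh -n /tmp/afs-recover.sh          # parse only
```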
openstackgerrit | Clark Boylan proposed opendev/system-config master: Switch to zuul's default gerrit auth type https://review.opendev.org/757156 | 19:01 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Clean up old Gerrit html theming and commentlinks https://review.opendev.org/757161 | 19:01 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Remove reviewdb config from Gerrit https://review.opendev.org/757162 | 19:01 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Update gerrit container image to 3.2 https://review.opendev.org/757176 | 19:01 |
clarkb | doing the manual start doesn't seem happier | 19:02 |
clarkb | my systemctl start openafs-client hasn't returned after a few minutes | 19:02 |
clarkb | I need to eat lunch but will look at this more after | 19:02 |
clarkb | the region is disabled in nodepool so we should be ok just at lower capacity | 19:03 |
cgoncalves | clarkb, thanks | 19:06 |
*** priteau has joined #opendev | 19:13 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add nim roles https://review.opendev.org/747865 | 19:18 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add nim roles and job https://review.opendev.org/747865 | 19:20 |
*** priteau has quit IRC | 19:27 | |
clarkb | checkvolumes and flushvolume do not work; they say they are not implemented | 19:29 |
clarkb | there is another stuck rmmod happening though so maybe related to unloading from the kernel? | 19:29 |
clarkb | I'm beginning to run out of ideas that aren't rebuild the mirror | 19:30 |
clarkb | it seems the problem is in getting the openafsclient to run at all and not specific to our afs tree | 19:31 |
fungi | i'm still tied up for the moment, but suspect there's some persistent state causing it to not access a working fileserver. like we see when "localhost" winds up in a server list | 19:31 |
clarkb | oh | 19:31 |
fungi | oh, yeah i guess afsd should still be able to start under those conditions though | 19:31 |
clarkb | everything looks fine in /etc/openafs | 19:33 |
clarkb | I'm trying to modprobe openafs just to see if it will complain about something | 19:35 |
clarkb | but it isn't returning | 19:36 |
clarkb | could it be that our dkms build is just bad? | 19:36 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add nim roles and job https://review.opendev.org/747865 | 19:36 |
clarkb | maybe we should force a rebuild/reinstall of openafs-client? | 19:36 |
fungi | it's possible a kernel update triggered a rebuild which got interrupted for some reason | 19:38 |
clarkb | I'm reinstalling openafs-modules-dkms which appears to be rebuilding the modules | 19:39 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add nim roles and job https://review.opendev.org/747865 | 19:40 |
clarkb | and the rebuild is done. I'm going to try and reboot it and start manually again | 19:47 |
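The module rebuild clarkb tried, as a script (Debian/Ubuntu package names; only syntax-checked here since modprobe against a wedged module can hang, as seen above):

```shell
# Rebuild and reload the OpenAFS kernel module from the dkms package.
cat > /tmp/rebuild-openafs.sh <<'EOF'
#!/bin/sh -e
apt-get install --reinstall openafs-modules-dkms  # re-runs the dkms build
dkms status openafs                 # confirm a build exists for the running kernel
modprobe -r openafs || true         # may hang if the old module is stuck
modprobe openafs
EOF
sh -n /tmp/rebuild-openafs.sh       # parse only
```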
clarkb | it's not looking better | 19:53 |
clarkb | still waiting for it to error but it's doing the sit and wait thing. Could be the new kernel isn't compatible with our openafs package or we have some other local state problem | 19:54 |
clarkb | it just timed out | 19:54 |
clarkb | this is a bionic mirror I think we've got at least one focal mirror now | 19:54 |
clarkb | maybe we do just rebuild it | 19:54 |
*** roman_g has joined #opendev | 20:32 | |
*** roman_g has quit IRC | 20:44 | |
clarkb | I'm going to pop out for a bike ride now. I still don't have a good answer other than rebuild | 21:10 |
*** nuclearg1 has joined #opendev | 21:17 | |
*** nuclearg1 has quit IRC | 21:29 | |
*** nuclearg1 has joined #opendev | 21:32 | |
*** hamalq has quit IRC | 21:37 | |
*** moppy has quit IRC | 21:42 | |
*** paramite has quit IRC | 22:00 | |
*** hamalq has joined #opendev | 22:00 | |
johnsom | Failed to fetch https://mirror.mtl01.inap.opendev.org/ubuntu/dists/focal/InRelease Could not connect to mirror.mtl01.inap.opendev.org:443 | 22:06 |
johnsom | inap too | 22:06 |
*** Dmitrii-Sh has quit IRC | 22:08 | |
fungi | mm, that's one of a couple i didn't test | 22:31 |
fungi | it can still access afs | 22:31 |
fungi | is that maybe from a few hours ago? | 22:32 |
fungi | okay, this is nuts. i can connect to the ssh port on it, but https times out? | 22:36 |
corvus | maybe all the apache procs are stuck? | 22:41 |
fungi | maybe. tcpdump says i can reach it, but it's not responding to my syn packets | 22:41 |
*** qchris has quit IRC | 22:41 | |
fungi | yeah, that was likely it | 22:42 |
fungi | stopping apache, making sure all the processes were gone, then starting again seems to have allowed me to get a response | 22:43 |
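fungi's stop-verify-start fix, written out (mirror hostname taken from the failure above; only syntax-checked here since it bounces a production service):

```shell
# Bounce apache and make sure no workers stuck in AFS reads survive.
cat > /tmp/apache-bounce.sh <<'EOF'
#!/bin/sh -e
systemctl stop apache2
pgrep apache2 > /dev/null && pkill -9 apache2 || true   # reap leftover workers
systemctl start apache2
curl -sSf -o /dev/null https://mirror.mtl01.inap.opendev.org/   # confirm it answers
EOF
sh -n /tmp/apache-bounce.sh   # parse only
```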
corvus | oh, that explains why i wasn't seeing anything in netstat, i'm assuming you restarted it right before i started inspecting | 22:43 |
fungi | likely | 22:44 |
fungi | between 2241 and 2242 | 22:44 |
fungi | i'm supposing lsof would have shown open file handles to /afs which might have been timing out from the afs02.dfw restart | 22:45 |
fungi | or failing to time out, rather | 22:45 |
fungi | there are a bunch of old vos operations (listvol, release, partinfo) hanging out in the process lists for mirror-update.openstack.org and mirror-update.opendev.org too which need reaping, looks like | 22:48 |
fungi | i'll terminate them | 22:48 |
fungi | done, they were all from 7+ hours ago | 22:52 |
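A sketch of finding the stray vos operations fungi reaped (the 7-hour threshold matches his "7+ hours ago"; review the list before killing anything):

```shell
# List vos processes older than 7 hours (25200 s): pid, command, subcommand.
ps -eo pid=,etimes=,args= | awk -v limit=25200 \
  '$2 > limit && $3 == "vos" { print $1, $3, $4 }'
```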
fungi | keeping an eye on the periodic volume release logs now to see if they pick back up normal operation | 22:52 |
*** qchris has joined #opendev | 22:53 | |
fungi | looks like it's on track again | 23:14 |
*** mlavalle has quit IRC | 23:19 | |
*** Dmitrii-Sh has joined #opendev | 23:21 | |
*** hamalq has quit IRC | 23:26 | |
*** tosky has quit IRC | 23:26 | |
clarkb | any better ideas for bhs1 mirror? | 23:54 |
fungi | not really, i guess it will be nice to have it on focal anyway? | 23:57 |
clarkb | ya I think we've already started converting them (though I would need to double check that) | 23:58 |
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!