ianw | stephenfin: trying to get to the bottom of a stestr/stevedore/importlib-metadata/python3.7 horror combo issue -> https://github.com/mtreinish/stestr/issues/336 | 03:00 |
ianw | it looks like https://opendev.org/openstack/stevedore/commit/143a3e9f0716690be7343d4d083f65d7624b3d2e in stevedore 3.5.1 should be the fix; but stestr still isn't finding its commands | 03:02 |
ianw | real_groups with old importlib-metadata looks like -> https://paste.opendev.org/show/bZ7yPaO1pVF8aKT2uGof/ | 03:14 |
ianw | real_groups with stevedore 3.5.1 looks like -> https://paste.opendev.org/show/bpqS0OPic22XF284x5Dg/ | 03:14 |
ianw | ... i think they should look the same. it suggests to me the expansion maybe isn't quite right? | 03:15 |
ianw | https://review.opendev.org/c/openstack/stevedore/+/861695 | 03:57 |
ianw | that seems to fix stestr on buster/python 3.7 ... but does it introduce some other issue? i don't know | 03:58 |
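The incompatibility ianw is chasing comes down to the shape of the importlib entry-points API: the importlib_metadata backport used on Python 3.7 returns a plain dict of {group: [EntryPoint, ...]}, while newer releases return a selectable EntryPoints object. A minimal sketch of code that tolerates both shapes (an illustration of the API split, not stevedore's actual fix):

```python
# Minimal sketch of the API split under discussion; not stevedore's code.
try:
    import importlib.metadata as importlib_metadata  # stdlib, python >= 3.8
except ImportError:
    import importlib_metadata  # backport used on python 3.7

def entry_points_for(group):
    eps = importlib_metadata.entry_points()
    if hasattr(eps, 'select'):
        # Newer API: an EntryPoints object queried by keyword.
        return list(eps.select(group=group))
    # Older API: a plain dict keyed by group name.
    return list(eps.get(group, []))
```

If the two shapes get conflated, a consumer like stestr can end up looking for its command plugins in a group mapping that was never expanded, which would be consistent with the differing real_groups pastes above.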
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Pin sphinx to 5.2.3 https://review.opendev.org/c/zuul/zuul-jobs/+/861587 | 04:30 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: linter: Use capitals for names https://review.opendev.org/c/zuul/zuul-jobs/+/854933 | 04:30 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Fix ansible-lint name[template] https://review.opendev.org/c/zuul/zuul-jobs/+/861559 | 04:30 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Add names to include tasks https://review.opendev.org/c/zuul/zuul-jobs/+/861560 | 04:30 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Standarise block/when ordering https://review.opendev.org/c/zuul/zuul-jobs/+/861562 | 04:30 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Update to ansible-lint 6.8.2 https://review.opendev.org/c/zuul/zuul-jobs/+/861563 | 04:30 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] sphinx circular dependencies error https://review.opendev.org/c/zuul/zuul-jobs/+/861588 | 04:30 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Workaround stevedore/python3.7 issues https://review.opendev.org/c/zuul/zuul-jobs/+/861698 | 04:30 |
*** ysandeep|out is now known as ysandeep | 05:47 | |
ianw | chown: invalid group: ‘root:letsencyrpt’ | 06:20 |
ianw | ... ? | 06:20 |
ianw | https://7e827a77180c1e6e432f-3c4e8d8f712aba3e652b0cfd0c30a298.ssl.cf5.rackcdn.com/861138/12/check/system-config-run-letsencrypt/35bbcf9/letsencrypt01.opendev.org/acme.sh/acme.sh.log | 06:20 |
*** ramishra_ is now known as ramishra | 06:38 | |
*** jpena|off is now known as jpena | 07:17 | |
*** ysandeep is now known as ysandeep|lunch | 08:16 | |
*** marios is now known as marios|call | 09:00 | |
*** jpodivin__ is now known as jpodivin | 09:33 | |
*** ysandeep|lunch is now known as ysandeep | 10:37 | |
*** marios|call is now known as marios | 11:33 | |
*** dviroel|out is now known as dviroel | 11:41 | |
stephenfin | ianw: One comment on https://review.opendev.org/c/openstack/stevedore/+/861695 | 12:01 |
*** ysandeep is now known as ysandeep|away | 12:20 | |
frickler | anyone else seeing lags when loading etherpads currently? | 13:03 |
fungi | it was maybe a little slower than usual for me. i'll take a look at the system resource utilization | 13:35 |
*** dasm|off is now known as dasm | 13:49 | |
clarkb | memory and system load look fine. I do note that the root fs is a bit full | 14:09 |
clarkb | it looks like log rotate has rotated out (deleted) one of the old db backups in /var/backups/etherpad-mariadb | 14:10 |
clarkb | the bulk of the disk is consumed by the etherpad container's json log file | 14:14 |
clarkb | I think we should either directly truncate that under the container (ugh) or down then up the container later to restart the log collection with a new container | 14:16 |
clarkb | and then look at redirecting the logs to /var/log/containers in order to log rotate them there | 14:17 |
clarkb | also we should clear out the old backup at /var/backups/etherpad-mariadb/etherpad-mariadb.sql.gz.3 | 14:17 |
fungi | yeah, we can probably do it safely around 18:00 utc if we don't want to wait until the weekend | 14:17 |
clarkb | ya I think start with clearing the extra backup file | 14:19 |
clarkb | then down then up and that should get the disk usage into a happy enough spot where we can add the syslog redirects without being in a rush | 14:19 |
fungi | people will get disconnected from pads they've left up in their browsers, but since there shouldn't be any sessions running at that time it hopefully won't be too disruptive | 14:22 |
fungi | we can #status log it for a bit of added visibility | 14:22 |
clarkb | ++ | 14:22 |
fungi | i have no idea what would happen if we restarted etherpad while a meetpad call is running | 14:23 |
clarkb | it should just fail the document | 14:25 |
clarkb | the call will work (as was the case when we had the cross site domain stuff improperly set up) | 14:25 |
*** ysandeep|away is now known as ysandeep | 14:31 | |
fungi | yeah, just didn't know if it would be able to reconnect clients to the pad after | 14:31 |
*** dviroel is now known as dviroel|dr_appt | 14:52 | |
clarkb | one thing I noticed about the jammy cloud image I used for gitea-lb02 is that it has /boot/efi as a separate partition | 15:31 |
fungi | debian's official images are like that too | 15:31 |
clarkb | Does that imply it is also gpt instead of mbr? | 15:32 |
clarkb | vexxhost booted it just fine so I'm not really worried about it, but curious to see cloud images moving forward like that | 15:32 |
fungi | you should be able to inspect it easily, but i expect so yes | 15:32 |
fungi | `sudo fdisk -l /dev/vda` | 15:33 |
fungi | "Disklabel type: gpt" | 15:33 |
clarkb | is that from gitea-lb02? | 15:33 |
fungi | yes | 15:33 |
clarkb | neat | 15:33 |
fungi | partition types are Linux filesystem on /dev/vda1, BIOS boot on /dev/vda14, and EFI System on /dev/vda15 | 15:34 |
clarkb | oh interesting I guess that implies it can boot legacy or efi | 15:35 |
fungi | vda14 isn't mounted but vda15 is, so that's what i take from it yes | 15:35 |
clarkb | that makes sense for an official cloud image that might be used in many places | 15:35 |
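The "Disklabel type" check fungi ran can also be scripted; a small sketch wrapping the same fdisk invocation (assumes util-linux fdisk and root privileges):

```python
# Sketch wrapping the fdisk check quoted above; requires root.
import subprocess

def disklabel_type(device="/dev/vda"):
    out = subprocess.run(
        ["fdisk", "-l", device],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.splitlines():
        if line.startswith("Disklabel type:"):
            return line.split(":", 1)[1].strip()  # e.g. "gpt" or "dos"
    return None
```

A GPT image that carries both a BIOS boot partition and an EFI System partition, as seen on gitea-lb02, can be booted either way, which is why one image works across clouds.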
fungi | also fstab says it's configured to use swap on /swapfile | 15:36 |
fungi | so no separate swap partition | 15:36 |
clarkb | yup the swapfile is created by our launch scripts | 15:39 |
*** ysandeep is now known as ysandeep|out | 15:44 | |
fungi | ah okay | 15:47 |
clarkb | infra-root the ptg sessions I got up early for today are winding down and I should be around if we want to land https://review.opendev.org/c/opendev/zone-opendev.org/+/861229 | 15:59 |
clarkb | cacti is collecting info from the new server now too so we'll be able to watch and compare that data against the historical data for the old server | 16:00 |
*** marios is now known as marios|out | 16:02 | |
clarkb | and now breakfast since I managed to skip that | 16:03 |
fungi | yeah, i'm ready for 861229 whenever | 16:04 |
fungi | about to go pick up some lunch takeout, but i'll only be gone for like 10-15 minutes | 16:04 |
*** jpena is now known as jpena|off | 16:31 | |
clarkb | fungi: are you back? If so I'll go ahead and approve that change | 16:46 |
*** svinavel_ is now known as svinavel | 16:47 | |
*** dviroel|dr_appt is now known as dviroel | 16:49 | |
fungi | clarkb: yeah, sorry, had stepped away to eat | 17:07 |
fungi | but i'm around and ready to watch/fix stuff | 17:07 |
clarkb | ok approved | 17:08 |
clarkb | next up is looking at etherpad. Earlier I think you mentioned waiting until 1800 for that to be sure people had wound the ptg down? | 17:08 |
fungi | yeah, sessions officially ended at 17:00 but i expect some may run long | 17:09 |
clarkb | wfm | 17:09 |
fungi | but an hour after sessions have officially ended seems like plenty of buffer | 17:09 |
fungi | and the outage should be extremely brief | 17:10 |
clarkb | yup | 17:10 |
opendevreview | Merged opendev/zone-opendev.org master: Swap gitea-lb01 to gitea-lb02 for opendev.org https://review.opendev.org/c/opendev/zone-opendev.org/+/861229 | 17:10 |
clarkb | gitea-lb02 is in dns for opendev.org now | 17:48 |
clarkb | It seems to work for me but please say something if you notice anything odd or unexpected. We can monitor resource utilization via cacti as well and probably clean up gitea-lb01 towards the end of the week if nothing comes up | 17:49 |
fungi | i'm cloning nova now just as a cursory check | 17:51 |
clarkb | ++ | 17:53 |
fungi | that worked | 17:53 |
fungi | i also let #openinfra-events know about the impending etherpad restart | 17:54 |
clarkb | thanks. Before doing that etherpad down and up we should clean up the unneeded backup file if others agree that file is extra | 17:54 |
fungi | looking | 17:55 |
fungi | etherpad-mariadb.sql.gz.3 from 2022-10-11 or something else? | 17:55 |
clarkb | yes that one | 17:56 |
clarkb | it seems that the rotation failed. I suspect due to running out of disk space to move things around | 17:56 |
fungi | yes, it looks like there is currently insufficient space to create a new db backup even | 17:57 |
clarkb | I think we can manually delete it and then down then up -d the containers to clear out the json log file associated with the container. And that should free a bunch of space | 17:57 |
fungi | agreed | 17:57 |
fungi | i don't think .3 is necessarily "extra" but that it was probably created just before space got too tight to rotate any new backups | 17:58 |
clarkb | well I think we only keep like one extra day normally | 17:58 |
clarkb | which is why .1 and .2 don't exist now that .3 won't go away? | 17:58 |
fungi | and since then it's just been overwriting the primary backup file | 17:59 |
fungi | could be | 17:59 |
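The retention behavior being puzzled over matches logrotate-style numbered rotation; a toy sketch of the general scheme (assumption: this is not the actual backup cron's code):

```python
# Toy sketch of "rotate 3"-style numbered rotation; not the real script.
import os

def rotate(path, keep=3):
    # Drop the oldest copy, shift the rest up one slot, then free the
    # unsuffixed name so a fresh dump can be written.
    oldest = f"{path}.{keep}"
    if os.path.exists(oldest):
        os.remove(oldest)
    for n in range(keep - 1, 0, -1):
        src = f"{path}.{n}"
        if os.path.exists(src):
            os.replace(src, f"{path}.{n + 1}")
    if os.path.exists(path):
        os.replace(path, f"{path}.1")
```

If the dump step starts failing once space runs out, no fresh .1/.2 files get produced and the highest-numbered copy can outlive its siblings, which fits what clarkb observed.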
clarkb | fungi: were you planning to do the things or should I? | 18:00 |
clarkb | (I'm around to help either way) | 18:00 |
fungi | i can do the things, no prob | 18:00 |
fungi | working on that now | 18:00 |
clarkb | thanks! | 18:00 |
fungi | first deleting /var/backups/etherpad-mariadb/etherpad-mariadb.sql.gz.3 | 18:00 |
fungi | free space on the rootfs went from 2.9G to 6.4G | 18:01 |
fungi | when we stop and start the container, is there a log deletion step i need to do in between? | 18:02 |
clarkb | fungi: you have to use docker-compose down then docker-compose up -d which will cause docker-compose and docker to delete the containers and create new ones. The container deletion step will wipe the logs | 18:02 |
clarkb | if you stop then start the container process will stop and you'd have to manually delete the files behind docker's back which seems hackier | 18:03 |
fungi | perfect, so no extra step required | 18:03 |
fungi | done. available space on the rootfs is back up to 20G now | 18:04 |
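What makes "down then up" sufficient: each container's stdout/stderr is captured by Docker's json-file driver in a per-container log that is only removed when the container itself is deleted, which docker-compose down does. A sketch using the Docker SDK for Python to see where that space lives (an illustration; needs root so the log paths are readable):

```python
# Sketch: report per-container json log sizes; needs the docker SDK
# (pip install docker) and root to stat files under /var/lib/docker.
import os
import docker

client = docker.from_env()
for container in client.containers.list():
    log_path = container.attrs["LogPath"]  # .../containers/<id>/<id>-json.log
    size = os.path.getsize(log_path) if os.path.exists(log_path) else 0
    print(f"{container.name}: {size / 1024 ** 2:.1f} MiB in {log_path}")
```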
clarkb | heh we just got an email about backups failing. I checked the log and the reason is that we stopped the database server (the docker-compose down) while it was backing up remotely | 18:05 |
clarkb | I think that is fine, but if we are concerned we could manually retrigger the backup crontab entry in a root screen | 18:05 |
clarkb | I'm able to open a couple of etherpads so all looks well from that perspective | 18:06 |
clarkb | according to cacti the gitea-lb02 network traffic is picking up. System load and cpu usage both look to be happy so far | 18:10 |
fungi | yeah, etherpads seem to be working for me | 18:13 |
fungi | database backups kick off at 04:42z daily, so i expect if we interrupted that db backup then it was approximately hung | 18:14 |
clarkb | fungi: there are two different db backups | 18:15 |
clarkb | there are the borg driven backups which happen randomly for each server and stream to the borg backup servers (this is what we interrupted). And there is the local write-a-file-on-the-host db backup for ease of use, which is what we rm'd the stale file for | 18:15 |
fungi | oic | 18:15 |
fungi | it's the local mysqldump which happens at 04:42 | 18:16 |
clarkb | yup | 18:16 |
clarkb | fungi: https://review.opendev.org/q/topic:docker-cleanups+OR+topic:use-new-python is a set of docker image fixes and python modernization changes | 18:17 |
clarkb | if you've got time to take a look at those, a number of them are probably fairly safe to land | 18:17 |
fungi | and indeed, the borg backup to backup01.ord.rax.opendev.org started at 17:49z, about 15 minutes before i took the container down | 18:18 |
fungi | #status log Restarted the services on etherpad.opendev.org in order to free up some disk space | 18:25 |
opendevstatus | fungi: finished logging | 18:25 |
opendevreview | Merged opendev/system-config master: Fixup jinja-init image https://review.opendev.org/c/opendev/system-config/+/861473 | 18:41 |
clarkb | sounds like gitea 1.18 rc0 will be out in a few days | 18:50 |
clarkb | that will include the vendor file identification fix I wrote. I'll try to get a patch up testing it once the tag exists | 18:50 |
fungi | oh, nice! | 18:51 |
opendevreview | Merged openstack/project-config master: Move grafyaml check and gate jobs in repo https://review.opendev.org/c/openstack/project-config/+/861482 | 20:10 |
opendevreview | Merged opendev/grafyaml master: Run pep8 and unittest jobs out of in repo config https://review.opendev.org/c/opendev/grafyaml/+/861483 | 20:22 |
opendevreview | MICHAEL KELLY proposed zuul/zuul-jobs master: helm: Add job for linting helm charts https://review.opendev.org/c/zuul/zuul-jobs/+/861799 | 21:26 |
opendevreview | Clark Boylan proposed opendev/system-config master: Stop updating pip in our docker assemble script https://review.opendev.org/c/opendev/system-config/+/861800 | 21:35 |
*** dasm is now known as dasm|off | 22:15 | |
clarkb | hrm I think there is a chicken and egg in that change for the uwsgi image | 22:17 |
clarkb | the good news with that is it means we do actually test it. The bad news is I have to figure out how to unravel things | 22:17 |
opendevreview | MICHAEL KELLY proposed zuul/zuul-jobs master: helm: Add job for linting helm charts https://review.opendev.org/c/zuul/zuul-jobs/+/861799 | 22:18 |
clarkb | hrm so it does seem that the uwsgi builds properly did not update pip but we've still got the same problem | 22:24 |
clarkb | in this case with the uwsgi package instead of netifaces | 22:24 |
clarkb | maybe my reproduction case with netifaces locally was too trivial and there is something bigger happening | 22:27 |
clarkb | hrm and this doesn't fix nodepool either | 22:28 |
opendevreview | MICHAEL KELLY proposed zuul/zuul-jobs master: helm: Add job for linting helm charts https://review.opendev.org/c/zuul/zuul-jobs/+/861799 | 22:32 |
opendevreview | MICHAEL KELLY proposed zuul/zuul-jobs master: helm: Add job for linting helm charts https://review.opendev.org/c/zuul/zuul-jobs/+/861799 | 22:42 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Pin sphinx to 5.2.3 https://review.opendev.org/c/zuul/zuul-jobs/+/861587 | 23:21 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Workaround stevedore/python3.7 issues https://review.opendev.org/c/zuul/zuul-jobs/+/861698 | 23:25 |
opendevreview | Merged opendev/system-config master: infra-prod-bootstrap-bridge: run directly on bridge https://review.opendev.org/c/opendev/system-config/+/861138 | 23:34 |
opendevreview | Ian Wienand proposed opendev/system-config master: docs: Update force-merge docs for removing votes https://review.opendev.org/c/opendev/system-config/+/861802 | 23:41 |
opendevreview | Merged zuul/zuul-jobs master: Pin sphinx to 5.2.3 https://review.opendev.org/c/zuul/zuul-jobs/+/861587 | 23:48 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Workaround stevedore/python3.7 issues https://review.opendev.org/c/zuul/zuul-jobs/+/861698 | 23:51 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: linter: Use capitals for names https://review.opendev.org/c/zuul/zuul-jobs/+/854933 | 23:51 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Fix ansible-lint name[template] https://review.opendev.org/c/zuul/zuul-jobs/+/861559 | 23:51 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Add names to include tasks https://review.opendev.org/c/zuul/zuul-jobs/+/861560 | 23:51 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Standarise block/when ordering https://review.opendev.org/c/zuul/zuul-jobs/+/861562 | 23:51 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: Update to ansible-lint 6.8.2 https://review.opendev.org/c/zuul/zuul-jobs/+/861563 | 23:51 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] sphinx circular dependencies error https://review.opendev.org/c/zuul/zuul-jobs/+/861588 | 23:51 |