opendevreview | James E. Blair proposed opendev/system-config master: Add zuul-db01 to cacti https://review.opendev.org/c/opendev/system-config/+/915101 | 00:15 |
---|---|---|
corvus1 | i have started a timed import process | 00:20 |
fungi | finally have my q2 paperwork knocked out, so can be semi-helpful again | 00:21 |
opendevreview | James E. Blair proposed opendev/system-config master: Restrict permissions on mariadb compose file https://review.opendev.org/c/opendev/system-config/+/915102 | 00:22 |
corvus1 | fungi: two more changes ^ | 00:22 |
corvus1 | memory during the import looks good; mysql is using 51% as expected. there's about 3Gi available, almost all currently used for buffers/cache | 00:25 |
corvus1 | s/mysql/mariadb/ :) | 00:25 |
corvus1 | s/mariadb/mariadbd/ :) | 00:25 |
corvus1 | it's using about 3/8 cores | 00:26 |
fungi | ah, yep, i did the same perms on the mailman3 server too | 00:58 |
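For context on the two permissions changes above: the compose file for the mariadb service presumably embeds the database credentials, so the intent is to make it readable by root only. A minimal sketch of that kind of tightening; the path and mode here are assumptions, the real values are whatever the reviews settle on:

```python
import os
import stat

# Hypothetical path; the real compose file lives wherever the
# system-config mariadb role writes it on the host.
COMPOSE_FILE = "/etc/zuul-db/docker-compose.yaml"

# Assume root-owned, owner read/write only (0600).
os.chown(COMPOSE_FILE, 0, 0)
os.chmod(COMPOSE_FILE, 0o600)

assert stat.S_IMODE(os.stat(COMPOSE_FILE).st_mode) == 0o600
```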
fungi | resource utilization sounds spot-on | 01:00 |
opendevreview | Merged opendev/system-config master: Mariadb: listen on all IP addresses https://review.opendev.org/c/opendev/system-config/+/915096 | 01:56 |
opendevreview | Merged opendev/system-config master: Add zuul-db01 to cacti https://review.opendev.org/c/opendev/system-config/+/915101 | 01:56 |
corvus1 | real 88m4.022s | 02:07 |
corvus1 | that's the good news. the bad news is that the mariadb 10.11 query planner has come up with a third way of handling these queries, and it's worse than mysql 5.7. i'm going to manually stand up a mysql 8 on this host so we can compare apples to apples, then decide how to proceed. | 02:23 |
opendevreview | Merged opendev/system-config master: Restrict permissions on mariadb compose file https://review.opendev.org/c/opendev/system-config/+/915102 | 03:17 |
*** TheMaster is now known as Unit193 | 09:25 |
fungi | after lengthy debate, it's looking like the importlib.resources "legacy" api is getting un-deprecated: https://discuss.python.org/t/deprecating-importlib-resources-legacy-api/11386/47 | 12:26 |
opendevreview | Dr. Jens Harbott proposed openstack/project-config master: gerritbot: move docs tools to TC channel https://review.opendev.org/c/openstack/project-config/+/915130 | 12:49 |
corvus1 | real 106m25.886s for mysql 8 | 14:16 |
corvus1 | query planner is producing sensible results | 14:17 |
corvus1 | oh this makes things more confusing: https://jira.mariadb.org/browse/MDEV-27302 | 14:18 |
corvus1 | apparently we may not actually be able to tell if the query planner wants to use a backwards index scan | 14:19 |
opendevreview | Thierry Carrez proposed opendev/irc-meetings master: Move release team meeting one hour earlier https://review.opendev.org/c/opendev/irc-meetings/+/915134 | 14:19 |
corvus1 | (but, empirically, it's not, since the query is slow) | 14:19 |
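The observability problem being described: on MySQL 8 a reverse-ordered scan shows up as "Backward index scan" in EXPLAIN's Extra column, but per the MDEV linked above MariaDB may not report it, so slow wall-clock time ends up being the only real signal. A rough sketch of the check; the DSN and table/column names are placeholders, not Zuul's actual schema:

```python
import sqlalchemy as sa

# Placeholder connection string; substitute the real zuul database.
engine = sa.create_engine("mysql+pymysql://zuul:secret@localhost/zuul")

QUERY = "SELECT * FROM zuul_build ORDER BY id DESC LIMIT 50"

with engine.connect() as conn:
    for row in conn.execute(sa.text("EXPLAIN " + QUERY)):
        # On MySQL 8 the Extra column may say "Backward index scan";
        # MariaDB reportedly gives no such hint, hence the guesswork.
        print(dict(row._mapping))
```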
corvus1 | i'm trying one more idea with mariadb; i'm trying to make an explicit descending index on the primary key to see if it (a) will use that automatically, or (b) if i can force it to use it, and if so, if that improves things | 14:37 |
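A sketch of the experiment corvus1 describes: add an explicit descending index, then force it and compare timings against the unhinted query. Table, column, and index names are illustrative only:

```python
import time
import sqlalchemy as sa

engine = sa.create_engine("mysql+pymysql://zuul:secret@localhost/zuul")

with engine.begin() as conn:
    # (a) MariaDB 10.11 accepts DESC in index definitions; does the
    #     planner pick this up on its own?
    conn.execute(sa.text(
        "CREATE INDEX zuul_build_id_desc ON zuul_build (id DESC)"))

with engine.connect() as conn:
    for sql in (
        # unhinted query
        "SELECT * FROM zuul_build ORDER BY id DESC LIMIT 50",
        # (b) forced onto the descending index
        "SELECT * FROM zuul_build FORCE INDEX (zuul_build_id_desc) "
        "ORDER BY id DESC LIMIT 50",
    ):
        start = time.monotonic()
        conn.execute(sa.text(sql)).fetchall()
        print(f"{time.monotonic() - start:.3f}s  {sql}")
```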
fungi | i think i almost understand what that means ;) | 14:37 |
opendevreview | Merged openstack/project-config master: Revert "Temporarily remove release docs semaphores" https://review.opendev.org/c/openstack/project-config/+/914689 | 14:38 |
corvus1 | okay, that's a negative to all of the above :( | 14:39 |
opendevreview | Dr. Jens Harbott proposed openstack/project-config master: gerritbot: move docs tools to TC channel https://review.opendev.org/c/openstack/project-config/+/915130 | 14:44 |
clarkb | so this likely is a known behavior difference between mariadb and mysql? It seems likely to me that we should be able to express what we want in mariadb; the trick is figuring out how, I guess | 14:48 |
clarkb | fungi: if https://review.opendev.org/c/opendev/system-config/+/914895 looks alright to you I should be able to test that rotates properly today | 14:50 |
clarkb | I still have my old key so I don't anticipate getting locked out either. But on the slim possibility that happens, it might be good to only land that when someone else is around to debug | 14:51 |
fungi | ah, yes sorry, i meant to approve that yesterday but got sidetracked by other activities | 14:51 |
opendevreview | Merged opendev/irc-meetings master: Move release team meeting one hour earlier https://review.opendev.org/c/opendev/irc-meetings/+/915134 | 14:58 |
corvus1 | clarkb: i'm not sure how known it is... my understanding is that both mysql 5.7 and mariadb should be able to perform backward index scans, but with some overhead (but that would be entirely acceptable for us). but neither seems to be doing so. meanwhile, 8.0 can perform backward index scans without any extra overhead, and seems to do it automatically. | 14:58 |
corvus1 | clarkb: (then there's the bonus failure of mysql 5.7 of not actually reversing the data at all, and instead returning the first N results instead of the last N) | 14:59 |
fungi | what are the odds that this is the root cause of the lengthy delays in pipeline event processing we've been seeing this week? | 15:07 |
corvus1 | fungi: definitely non-zero | 15:09 |
clarkb | reading about descending indexes in mysql docs it says "DESC in an index definition is no longer ignored but causes storage of key values in descending order." I wonder if the failure for this to help mariadb implies it is still ignored there | 15:10 |
clarkb | that must be something in the sql spec that databases have long ignored due to complexity? | 15:10 |
fungi | i guess it could also be a second-order effect, inefficient queries putting extreme load on the db server, and that's causing it to be unable to process other queries in a timely fashion | 15:10 |
clarkb | https://mariadb.com/kb/en/descending-indexes/ this says they finally implemented it in mariadb 10.11 | 15:10 |
clarkb | but maybe it is buggy | 15:10 |
opendevreview | Merged opendev/system-config master: Rotate clarkbs ssh key https://review.opendev.org/c/opendev/system-config/+/914895 | 15:45 |
corvus1 | clarkb: fungi i think i see a way to fix zuul with mariadb, but it's going to take some non-trivial zuul changes. i think we should proceed with migrating to the mysql8 db i manually set up yesterday, run that while i work on the zuul changes necessary to support all 3 platforms, then maybe next weekend we can migrate to the ansible-managed mariadb. | 16:23 |
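Not the actual Zuul change, but for a sense of what per-backend query tweaks can look like: SQLAlchemy (which Zuul uses) can attach a dialect-specific index hint to a select, so the hint only renders for mysql/mariadb and other backends see the plain query. A rough sketch along those lines, with the table and index names made up:

```python
import sqlalchemy as sa
from sqlalchemy.dialects import mysql

metadata = sa.MetaData()
# Illustrative table only; Zuul's real build table has many more columns.
build = sa.Table(
    "zuul_build", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("result", sa.String(255)),
)

query = (
    sa.select(build)
    .with_hint(build, "FORCE INDEX (PRIMARY)", dialect_name="mysql")
    .order_by(build.c.id.desc())
    .limit(50)
)
# The hint text appears after the table name only in the MySQL rendering.
print(query.compile(dialect=mysql.dialect()))
```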
clarkb | that sounds like a good path forward to me | 16:24 |
fungi | sure, no objection here. let me know what help you need | 16:27 |
corvus1 | i think we're actually all set. the db server is ready, tomorrow we can export/import and then update zuul.conf to point to the new dburi (we can make that change now so it's ready). then also we'll want to merge the index hint change in zuul and restart schedulers/web again after that. but we shouldn't merge that before we switch dbms. | 16:33 |
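The export/import step itself is just a dump from the old server piped into the new one, after which the [database] dburi in zuul.conf points at the new host. A rough sketch of scripting that copy; host names, database name, and credential handling are all placeholders here:

```python
import subprocess

# Placeholders; the real values are whatever the old and new dburi
# entries point at (credentials e.g. via ~/.my.cnf).
SRC = ["mysqldump", "--single-transaction",
       "--host=old-db.example.org", "--user=zuul", "zuul"]
DST = ["mysql", "--host=zuul-db01.opendev.org", "--user=zuul", "zuul"]

dump = subprocess.Popen(SRC, stdout=subprocess.PIPE)
subprocess.run(DST, stdin=dump.stdout, check=True)
dump.stdout.close()
if dump.wait() != 0:
    raise RuntimeError("mysqldump exited non-zero")
```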
fungi | yep, makes sense | 16:34 |
clarkb | I suspect that I'll be around tomorrow having a lazy day (it's cold and rainy and motivation to go out and do something is low) | 16:34 |
clarkb | I'll keep an eye on irc so holler if I can help | 16:35 |
fungi | yeah, i'm planning to be home all day, may be away from the computer from time to time for gardening tasks | 16:35 |
clarkb | the key on bridge seems to have updated and I can still get in. After this cup of tea is brewed I'll ensure I remove my old key from my agent and then spot check things a bit more | 16:44 |
clarkb | and then I need to go look at our PTG doc and try to organize and add content to it | 16:44 |
clarkb | ya this appears to be working for me | 16:50 |
clarkb | fungi: there is a gitea 1.21.10 upgrade change https://review.opendev.org/c/opendev/system-config/+/914292 up for review if you have a moment. I don't think there is a rush on landing that and you indicated a preference for taking it easy through the ptg which is fine by me | 16:57 |
clarkb | the LE job seems to still be failing with that same issue as yesterday, which is also likely why we got an alert today that nb02's cert expires in less than a month | 17:13 |
clarkb | it's interesting that nb02 is the node that also fails in the ansible log | 17:13 |
clarkb | ah ok this is likely the same issue we have had in the past that just mysteriously went away | 17:15 |
clarkb | oh! nb02 is just straight up failing but for whatever reason we progress forward | 17:16 |
clarkb | forward in ansible I mean. Which makes the error later a bit of a red herring /me goes to figure out why nb02 is sad | 17:16 |
clarkb | hrm though maybe those early failures get retried and eventually succeed. That would explain why it continues to proceed later | 17:17 |
clarkb | ok ya the issue is that nb02's acme.sh install is in a modified state, so the tasks to enforce the state we want are not running | 17:20 |
clarkb | then it gives up on nb02 for cert renewal but the certcheck domain list creation doesn't know that and then breaks. So we need to figure out why nb02 is in this state and whether or not to fix it further | 17:21 |
clarkb | acme is stored in /opt/ and /opt filled up on nb02 | 17:26 |
clarkb | My hunch here is that the git operations we try to perform on that repo had a sad due to the disk being full and ended up in this state that causes ansible to bail out | 17:26 |
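A quick sketch of the sort of checks involved here: confirm how full /opt is and whether the acme.sh git checkout reports local modifications (which is roughly what makes the role skip it). The checkout path is an assumption:

```python
import shutil
import subprocess

ACME_DIR = "/opt/acme.sh"  # assumed clone location of the acme.sh repo

total, used, free = shutil.disk_usage("/opt")
print(f"/opt: {free / 2**30:.1f} GiB free of {total / 2**30:.1f} GiB")

# A clean checkout prints nothing here; a full disk can leave partially
# written files behind that show up as modifications.
status = subprocess.run(
    ["git", "-C", ACME_DIR, "status", "--porcelain"],
    capture_output=True, text=True, check=True)
print(status.stdout or "checkout is clean")
```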
clarkb | I'm going to manually move that directory aside in /opt/ so that it can be further inspected/debugged but the next run of the LE playbooks should hopefully set us back into a working state | 17:27 |
clarkb | ya, the file timestamps seem to align with the timing of the disk filling | 17:29 |
clarkb | #status log Reset acme.sh on nb02 as a full disk appears to have corrupted it | 17:30 |
opendevstatus | clarkb: finished logging | 17:30 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add more LE debugging info to our Ansible role https://review.opendev.org/c/opendev/system-config/+/915173 | 17:36 |
clarkb | the next LE run (daily periodic I think) should correct the problem and ^ should make it slightly easier to debug in the future | 17:36 |
fungi | good find | 17:42 |
clarkb | I added that more explicit loop in order to log the node names because when this happened in the past it didn't even log that | 17:43 |
clarkb | we probably would've eventually found it but having the explicit "something wrong with nb02" led to finding it quicker | 17:43 |
opendevreview | Merged opendev/system-config master: Update gitea to v1.21.10 https://review.opendev.org/c/opendev/system-config/+/914292 | 19:29 |
fungi | infra-prod-service-gitea is starting (or possibly already done, it's hard to tell with the current lag in zuul reporting) | 19:45 |
fungi | "/usr/local/bin/gitea web" process started on gitea14 at 19:39, 7 minutes ago | 19:46 |
fungi | https://opendev.org/ currently says "Powered by Gitea Version: v1.21.10" so i guess it's done | 19:47 |
fungi | browsing around, things look the same as always | 19:47 |
fungi | cloning openstack/nova seems to work fine, albeit slowly (but that's not uncommon for my isp unfortunately) | 19:49 |
fungi | yep, git clone completed without issue | 19:54 |
fungi | openbsd 7.5 today. guess it's upgrade time | 20:01 |
Clark[m] | Oh heh I popped out for lunch and missed the upgrade. Thank you for pushing that along | 20:02 |
fungi | no sweat. it occurred without incident | 20:02 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Upgrade Mailman's MariaDB to 10.11 https://review.opendev.org/c/opendev/system-config/+/915183 | 20:06 |
fungi | we can probably do that ^ whenever | 20:06 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Cleanup lingering Mailman 2 playbook https://review.opendev.org/c/opendev/system-config/+/915184 | 20:09 |
clarkb | I've skimmed the 6 gitea notes and they all report the expected version | 20:17 |
fungi | nodes? if so, yes i concur | 20:18 |
clarkb | yup I can't type | 20:19 |
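The verification fungi and clarkb describe is just checking the footer string on each backend. A tiny sketch; the backend URLs listed are guesses (gitea14 is mentioned above, the rest are assumed), not the canonical list:

```python
import urllib.request

EXPECTED = "Gitea Version: v1.21.10"
URLS = ["https://opendev.org/"] + [
    f"https://gitea{n:02d}.opendev.org:3081/" for n in range(9, 15)
]

for url in URLS:
    try:
        page = urllib.request.urlopen(url, timeout=10).read().decode()
        print(url, "OK" if EXPECTED in page else "version string not found")
    except OSError as exc:
        print(url, "unreachable:", exc)
```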
opendevreview | Clark Boylan proposed opendev/system-config master: Add more LE debugging info to our Ansible role https://review.opendev.org/c/opendev/system-config/+/915173 | 20:40 |
opendevreview | Clark Boylan proposed opendev/system-config master: More completely disable ansible galaxy proxy testing https://review.opendev.org/c/opendev/system-config/+/915185 | 20:40 |
clarkb | fungi: ^ that galaxy proxy testing is why 915173 refused to pass | 20:40 |