Thursday, 2022-09-08

ianwas a bonus, it seems that comments in tickets don't work in firefox00:15
opendevreviewMerged opendev/system-config master: Fix docker wait requires at least one argument
fungiianw: in rackspace tickets? it's worked for me00:21
ianwfungi: is what i see :/00:22
ianwanyway, i have filed a ticket on the translate db upgrade because it seems stuck00:23
fungii have had things like that get stuck in the past. seems like they end up in scheduling deadlocks00:24
ianwit says it has "1 replica" but I can't see the replica00:24
ianwso i can't detach that replica, and so the main db is stuck waiting for a replica to detach that i can't access, afaict00:25
clarkbI've restarted the manual zuul_reboot run and it seems happy00:26
clarkbit seems to be doing what I expect now00:27
clarkbze01 did its reboot and started back up again and now ze02 stop ran and we are waiting as expected00:28
clarkband that is running in screen with typical log location00:28
ianwthe translate db is still responding, despite its half-upgraded state in the UI. so i guess i just need to wait for a response on the ticket00:41
opendevreviewMerged zuul/zuul-jobs master: upload-npm : support authToken argument
opendevreviewIan Wienand proposed openstack/project-config master: upload-npm: use token auth
ianwok; rax issue turns out to be what they think is a UI bug in the website; and navigating directly i've managed to detach the replica.  i've updated the connection strings in hiera, run puppet and rebooted both servers.  so that basically completes that03:15
ianw#status log translate and translate-dev RAX hosted mysql instances upgraded to mysql 5.7 03:16
opendevstatusianw: finished logging03:16
ianwi'll remove the old instances -- the upgrade notice says to make sure to do this to avoid their script doing it for us03:17
Clark[m]ianw do we need to update hiera config somewhere to keep puppet doing the right thing?03:29
ianwClark[m]: yep, i've committed those changes on bridge and did a manual run to deploy the updated connect strings on the servers03:30
Clark[m]Excellent thanks03:30
opendevreviewMerged openstack/project-config master: upload-npm: use token auth
slittle1_sorry, internet dropped just after I reported my problem13:23
slittle1_Can't push a signed tag to ssh:// ssh://
slittle1_the tag was created with git tag -s, and verified with git tag -v13:24
slittle1_acl ...13:25
slittle1_[access "refs/tags/*"]13:25
slittle1_createSignedTag = group starlingx-release13:25
slittle1_i'm a member of starlingx-release13:25
fungislittle1_: what error message do you get when pushing that?13:59
fungiis it possible you're not pushing with your username? i don't see any username in the remotes you quoted, so openssh is likely assuming your local username is valid on the gerrit server (and i don't know whether they match in your case)14:00
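Note on the username point above: a quick way to see whether a remote URL carries an explicit username is to parse it. This is only a sketch; the host and project names are hypothetical placeholders, not the remotes from this conversation.

```python
from urllib.parse import urlsplit


def remote_has_username(url: str) -> bool:
    """Return True if the remote URL names a user explicitly (user@host)."""
    return urlsplit(url).username is not None


# Hypothetical remotes for illustration only.
without_user = "ssh://review.example.org:29418/some/project"
with_user = "ssh://alice@review.example.org:29418/some/project"

print(remote_has_username(without_user))
print(remote_has_username(with_user))
```

When no username is present, openssh falls back to the local account name unless an ssh config entry overrides it.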
slittle1_git push gerrit 7.0.014:08
slittle1_Enumerating objects: 1, done.14:08
slittle1_Counting objects: 100% (1/1), done.14:08
slittle1_Writing objects: 100% (1/1), 814 bytes | 814.00 KiB/s, done.14:08
slittle1_Total 1 (delta 0), reused 0 (delta 0), pack-reused 014:08
slittle1_remote: error: branch refs/tags/7.0.0:14:08
slittle1_remote: use a SHA1 visible to you, or get update permission on the ref14:08
slittle1_remote: User: slittle114:08
slittle1_remote: Contact an administrator to fix the permissions14:08
slittle1_remote: Processing changes: refs: 1, done    14:08
slittle1_To ssh://
slittle1_ ! [remote rejected] 7.0.0 -> 7.0.0 (prohibited by Gerrit: update for creating new commit object not permitted)14:08
slittle1_error: failed to push some refs to 'ssh://'14:08
fungislittle1_: are you tagging a commit which doesn't exist on gerrit, or has a commit in its history which isn't already in the public repository? that's suggesting you have14:09
fungimaybe you ended up with some dirty merges in the local repository state?14:10
fungiwhat's the commit id (sha-1 hash) of the commit you tagged as 7.0.0?14:12
slittle1_looks like review 851132 is outstanding14:12
fungiassuming your intent was to tag the current branch tip of the r/stx.7.0 branch, it looks like that would be 16466c2d84d97a3f285a9a7ccd3e8e9144176a02 instead:
slittle1_zuul barfed on, which I would like to include in the tagging14:15
slittle1_Zuul: This change is unable to merge due to a missing merge requirement.14:16
slittle1_Oh: Removed Workflow+1 by chen haochuan14:17
fungiyes, that would count as a merge requirement14:18
fungithe change no longer had sufficient votes to merge when zuul finished testing it14:18
opendevreviewSteven Webster proposed openstack/project-config master: Add STS-Silicom app to StarlingX
*** dviroel is now known as dviroel|lunch14:57
clarkbthe manual run of zuul_reboot is on to ze10 now. Seems to have run happily overnight. Once it gets to the mergers we can confirm that all the changes made are good.15:15
clarkbslittle1_: fungi: it is important that you always update your local copy of the remote branch and tag off of that to ensure you don't end up pushing things you shouldn't. It appears new Gerrit is more selective about that now than in the past which is great, but this has caused problems in the past with pushing code that was never intended for Gerrit and otherwise pushing bad15:24
clarkbFor example even if that change was merged, if it merged with a merge commit then tagging the change commit itself would represent an improper state15:25
clarkbyou'd need to tag the resulting merge commit to include other changes that are mixed into the HEAD of that branch15:25
fungiagreed, that's why the documentation i linked yesterday includes a git pull --ff-only as its first step15:27
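Note on the tagging order discussed above: the sequence can be sketched as a list of git commands that updates the local branch first and only then creates, verifies and pushes the signed tag. The remote, branch and tag names are taken from this conversation as examples; the exact steps in the linked documentation are not reproduced here, so treat the ordering as an assumption.

```python
import shlex


def signed_tag_commands(remote: str, branch: str, tag: str) -> list:
    """Build the git commands for tagging the up-to-date tip of a branch."""
    return [
        ["git", "checkout", branch],
        # Refuse to proceed if the local branch has diverged from the remote.
        ["git", "pull", "--ff-only", remote, branch],
        ["git", "tag", "-s", tag],   # create the signed tag on the branch tip
        ["git", "tag", "-v", tag],   # verify the signature locally
        ["git", "push", remote, tag],
    ]


commands = signed_tag_commands("gerrit", "r/stx.7.0", "7.0.0")
for cmd in commands:
    print(shlex.join(cmd))
```

The fast-forward-only pull is what keeps unmerged local commits out of the tagged history.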
opendevreviewClark Boylan proposed opendev/system-config master: Update python builder and base image
clarkbfungi: I started looking at the jvb thing more closely and found I ran up against the need to figure out what the key-store-path should point to but tldr there seems to be no one who knows how to set this up :/16:55
fungier, um... wow!16:55
clarkbI'm beginning to wonder if that actually means do ssl client cert auth and use the keystore to check it?16:57
clarkbrather than do normal tls connection verification for privacy purposes16:57
clarkb this shows someone generating a keystore16:58
clarkbso I'm now looking at openssl docs16:59
clarkboh wait keytool is a java thing17:01
clarkbI half wonder if this is just a simple way to do unverified encryption via tls. Nothing I can find in docs or forums indicates you need a shared CA across these things so maybe that is all it is doing17:04
clarkbHowever, it is nginx that is proxying these connections. I would've expected nginx to verify the connection17:04
funginot necessarily. like in haproxy we have the option to tell it to care about the ssl cert validity or not17:05
funginginx may similarly not care about validity of backend certs when proxying unless told explicitly that it should17:06
clarkbgood point17:06
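Note on "encrypted but unverified" TLS: the distinction fungi describes for proxies can be illustrated with Python's ssl module, where a client context either checks the peer certificate or skips the check while still encrypting. This is a generic sketch, not the nginx or haproxy configuration itself.

```python
import ssl


def make_client_context(verify: bool) -> ssl.SSLContext:
    """Return a TLS client context that does or does not verify the peer."""
    ctx = ssl.create_default_context()
    if not verify:
        # check_hostname must be switched off before verify_mode can be
        # set to CERT_NONE; the traffic is still encrypted either way.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


verified = make_client_context(verify=True)
unverified = make_client_context(verify=False)
print(verified.verify_mode, unverified.verify_mode)
```

With verification off, a self-signed or randomly generated backend certificate is accepted, which is why such a cert can be enough for a proxied backend.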
clarkb appears to be how you convert our LE provisioned pem to a java keystore file17:06
clarkbof course it isn't a single command or something that can be run by default without user interaction :/17:07
fungiso i guess that's basically just to get the jetty serving your le cert?17:07
clarkbya. Alternatively we could just generate a random cert17:08
clarkband load that in. That would be easier to automate. Then see if that functions17:08
opendevreviewClark Boylan proposed opendev/system-config master: WIP Update colibri for all the JVBs
clarkbfungi: ^ I did everything but the java keystore stuff in that change. Firewall rules, setting WS_SERVER_ID, copying in the new jvb.conf and mounting it into the container and so on.17:20
fungiawesome, curious to see how it tests out17:23
fungionce i'm done making lunch, assuming no new fires, i can work on mm3 imports17:24
clarkbI suspect it won't actually fail because we don't check if jvb functionality is working? Unless the container startup itself breaks due to my config edits17:24
clarkbfungi: sounds good.17:24
clarkbI'm going to work on getting a bike ride in now before it gets hot17:24
fungihave fun!17:25
fungiokay, 6 of 7 production mailing lists (including kata's) are now copied onto the held node via rsync; the openstack site is the last and currently underway, but will take a while due to its sheer size19:21
fungionce that's done i'll fiddle with more test imports, but in the meantime i'm going to pop out to get some vaccines. shouldn't hopefully take too long19:22
opendevreviewClark Boylan proposed opendev/system-config master: Update python builder and base image
fungimanaged to get vaccines and come back before the rsync finished20:41
opendevreviewClark Boylan proposed opendev/system-config master: Update colibri for all the JVBs
clarkbThat's a bit of a "I wonder if this works"20:50
clarkbI'm using the java keytool command to generate a random keystore20:50
clarkbthe zuul reboot playbook just started on the mergers20:50
clarkband now zuul01 is restarting20:58
clarkband now zuul0221:04
clarkbI expect that this playbook will run happily over the weekend based on this run being happy21:04
clarkbthe zuul reboot playbook has completed successfully. I've closed the screen session. Logs are in the normal log file21:09
clarkbinfra-root any idea what might be happening here: it seems like docker is hanging up? All of those image builds are timing out21:46
fungifor the record, `du -sh /var/lib/mailman/import` reports 27G after copying all the current mm2 sites into it21:47
clarkbIt looks like the RUN commands that happen after disabling recommends is an `apt-get update && apt-get install -y dumb-init && apt-get clean`21:49
clarkbI wonder if the apt-get install -y dumb-init is blocking on some expected input from the user21:49
clarkbrunning docker build locally against that dockerfile does not have any issues21:51
clarkbI've rechecked and if things stall out again I'll ssh into the test node to look21:52
fungithe manpage suggests -y should entirely sidestep interaction21:53
ianwclarkb: the linux/arm64 would be the bit i guess ... that's binary emulation21:58
clarkbianw: oh!21:59
clarkbin the past that has been quick enough with file copies, but maybe we're doing some extra work here for some reason22:00
ianwprobably worth a re-run just in case it was a particularly slow node, but yeah something could have changed that we're suddenly building something etc.22:00
clarkbya I already rechecked and have ssh'd into one of the nodes to take a look22:00
clarkbif it is the emulation I should see a cpu sitting at like 100% all the time doing that I expect22:00
clarkbI think I see some of the issue. For amd64 we don't need to upgrade pip it is already up to date but on arm we are a few releases behind?22:03
clarkboh maybe not I built python3.10 locally and maybe they update pip to match some python release version stuff22:03
clarkbya both amd and arm updated pip on python3.8 based images22:04
fungistripping out the "mailman" lists which we won't migrate, we have 48 total currently we could import across all 7 sites22:10
clarkbthis recheck is looking happy so ya maybe just a slow node with the arm emulation22:12
clarkbstill no libc update for bullseye :/22:13
clarkbfungi: and this is on the newer held node right?22:22
clarkbI've just found it and I see your session. I need to clean up my old gitea gravatar testing hold. And I guess once we've got working imports on this newer node we should delete the old hold for mm322:23
clarkbianw: can I delete your dib rocky 9 hold? I think that got sorted22:24
ianwoh yes, please do22:25
clarkbfrickler: you have a couple of holds too. The older of which I'm pretty sure can be deleted too22:26
clarkb here's another unrelated change that had a similar timeout. I think in the same cloud region22:29
clarkbI guess keep an eye out to see if this is a persistent issue that we need to debug22:29
clarkbhalf worried it could be a zuul/ansible thing22:30
clarkbcurrently insufficient evidence to say that though22:31
fungii started a scripted loop import of the site a couple of minutes ago in a screen session on the held node, logging a full transcript to file, with timings for each command22:42
fungithe source of the trivial shell script is in
fungiand it's already done22:46
fungiso around 5 minutes to import all the opendev site lists22:47
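Note on the timed import loop: the actual shell script is not shown in this log, so the following is only a sketch of the general idea, running a list of commands in order and recording how long each one takes. The placeholder commands just start a Python interpreter; real import commands would be substituted.

```python
import subprocess
import sys
import time


def run_with_timings(commands):
    """Run each command in order; return (command, exit code, seconds)."""
    results = []
    for cmd in commands:
        start = time.monotonic()
        proc = subprocess.run(cmd, check=False)
        results.append((cmd, proc.returncode, time.monotonic() - start))
    return results


# Placeholder commands standing in for per-list import steps (hypothetical).
placeholder = [[sys.executable, "-c", "pass"] for _ in range(3)]
timings = run_with_timings(placeholder)
for cmd, code, seconds in timings:
    print(f"{seconds:8.3f}s exit={code} {' '.join(cmd)}")
print(f"total: {sum(t for _, _, t in timings):.3f}s")
```

Summing the per-command timings gives the rough upper bound for an outage window.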
clarkbfungi: for the pipermail redirects I don't think I could use the exact path you had in the old comment due to needing to address things properly by site. If you look at the apache config the path should be clear though22:47
clarkband I guess if you have a suggestion for being more like what you had in mind I'm open to it I just couldn't figure it out with mod rewrite22:48
fungi is the log of the import22:48
fungiick, the percent counter is rather annoying that way22:49
fungistrange that for the computing-force-network ml archive import it warned thus: not importing messages older than Aug. 25, 2022. Use "--since=<date>" to import older messages.23:00
clarkbshould we add a --since for 1900 just to be extra sure they'll all go on all lists?23:02
fungithough it does go on to say "Indexing 3 emails" which is the total number in the imported mbox file, with dates of 2022-08-09, 2022-08-19 and 2022-08-25 respectively23:02
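Note on checking message dates in an mbox: one way to confirm what an archive file contains before an import is to read the Date headers with Python's mailbox module. The sketch below builds a small sample mbox with the three dates mentioned above; the sender address and file name are made up for the example.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage
from email.utils import parsedate_to_datetime


def build_sample_mbox(path, date_headers):
    """Write one minimal message per Date header into a new mbox file."""
    box = mailbox.mbox(path)
    try:
        for index, date in enumerate(date_headers):
            msg = EmailMessage()
            msg["From"] = "sender@example.org"
            msg["Subject"] = f"sample {index}"
            msg["Date"] = date
            msg.set_content("body")
            box.add(msg)
        box.flush()
    finally:
        box.close()


def message_dates(path):
    """Return the sorted calendar dates of all messages in an mbox file."""
    box = mailbox.mbox(path)
    try:
        return sorted(
            parsedate_to_datetime(msg["Date"]).date()
            for msg in box
            if msg["Date"]
        )
    finally:
        box.close()


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.mbox")
    build_sample_mbox(sample, [
        "Tue, 09 Aug 2022 12:00:00 +0000",
        "Fri, 19 Aug 2022 12:00:00 +0000",
        "Thu, 25 Aug 2022 12:00:00 +0000",
    ])
    dates = message_dates(sample)

print(len(dates), dates[0], dates[-1])
```

Comparing the message count and oldest date against what the importer reports shows whether any messages were skipped by a date cutoff.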
fungialso, none of the other archive imports spewed such a warning23:03
fungijust the first one23:03
fungianyway, i'm starting a trial import of all of now23:10
fungioh, interesting, i think some of these are getting skipped for not being in our initial config so we don't have lists created for them to migrate into23:11
fungiError: No such list: openstack-announce@lists.openstack.org23:12
fungiet cetera23:13
clarkbhrm I dropped them all out of the config because I didn't want to have valid working lists for things on the server until we are ready to migrate them23:13
clarkbwhich meant I dropped all but opendev out23:13
fungioh, got it23:13
clarkbfungi: should I undo that? I think we can create them all but mark them as private23:14
clarkband then when we want to migrate mark them public as appropriate and migrate into them23:14
fungiwell, doesn't matter for now. i think this will still import the archives just not the configs23:15
clarkbgot it23:15
fungiso should still be a good test of the db fix23:15
clarkbI'll look into creating them as a locked down list tomorrow then. I'm pretty sure it is possible and sounds like that will make migrating easier23:15
fungiit's currently working on importing the openstack-discuss archive23:16
clarkbon the colibri ws change I seem to be hanging on the keytool generation. Going to run it on the test node in the foreground to see if it is asking for some info23:18
clarkbbah it wants name and org and location info23:19
fungiit's 2020. why can tools not be run unattended/headless?23:19
fungier, 2022 even23:19
fungibut still feels like 2020 isn't over yet23:20
fungiopenstack-discuss archive is 1/3 imported and haven't seen that db error yet23:20
fungican't recall how far in it errored last time23:20
clarkbya its crazy to me they have a bunch of command line flags but none for these values23:23
clarkbI've got input sorted out for the command now. Just need to figure out how to plumb it through ansible23:25
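Note on running keytool unattended: keytool's -genkeypair subcommand accepts a -dname option carrying the distinguished name, which, together with -storepass and -keypass, may avoid the name, org and location prompts. Whether the change under review uses this approach is not stated in the log; the alias, path, password and distinguished name below are hypothetical placeholders.

```python
import shlex


def keytool_genkeypair_cmd(keystore_path, store_password, dname, alias="jvb"):
    """Build a keytool invocation that should not need interactive input."""
    return [
        "keytool", "-genkeypair",
        "-alias", alias,
        "-keyalg", "RSA",
        "-keysize", "2048",
        "-validity", "3650",
        "-dname", dname,               # supplies name/org/location up front
        "-keystore", keystore_path,
        "-storepass", store_password,
        "-keypass", store_password,    # same password for key and store
    ]


# Hypothetical values for illustration only.
cmd = keytool_genkeypair_cmd(
    "/tmp/example-keystore.jks",
    "changeit-example",
    "CN=jvb.example.org, O=Example, C=US",
)
print(shlex.join(cmd))
```

Because the command is a plain argument list, it could be passed to a command-style automation task without any stdin plumbing.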
opendevreviewClark Boylan proposed opendev/system-config master: Update colibri for all the JVBs
clarkbmaybe something like that23:27
ianwclarkb: is one i posted last week in the ongoing saga of console files on the static hosts.  it is a host/playbook var to disable the console stream -- i came around to this thinking it might be a better generic solution than us just running a cron purge.  i think we can use it like :
fungiclarkb: yep, no db packet size error this time23:31
clarkbfungi: excellent, we need to check the mysqldumps too23:32
clarkbbut that is 50% of the problem solved :)23:33
clarkbya I like that approach and will attempt a review tomorrow morning23:37
fungishortly i should have a total archive import time estimate for the site, as a rough upper-bound for outage windows23:43
fungioof, we assumed importing openstack-discuss was a good benchmark, we should have been testing imports of openstack-stable-maint23:53
fungiin recent years that's been averaging over 100 messages a day because of all the broken bitrot jobs23:54
clarkbfungi: wow23:54
clarkbmaybe we should remove the smtp reporter from zuul23:55
fungiat the moment it's around 120 messages a day (and more on weekends)23:55
clarkbif people aren't looking at it and all it does is fill our mailing list disks23:55
fungii've brought it up with elodilles before but it seemed like he wanted to keep getting them23:55
fungithose messages don't end up with random attachments though, so while it probably uses more index it doesn't have the raw message sizes openstack-discuss does23:56

