ianw | i'm really struggling to see why bridge01.opendev.org doesn't pick up variables from the bastion group in https://review.opendev.org/c/opendev/system-config/+/861112 | 00:29 |
---|---|---|
Clark[m] | Is it the zuul level or the inner nested Ansible that isn't picking it up? | 00:35 |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] switch testing bridge name to bridge01.opendev.org https://review.opendev.org/c/opendev/system-config/+/861112 | 00:52 |
ianw | in inner ansible. why it finds it when it uses bridge.openstack.org is really weird :/ | 00:53 |
*** rlandy is now known as rlandy|out | 01:00 | |
ianw | for some reason when we switch the bastion name, it is picking up https://opendev.org/opendev/system-config/src/branch/master/playbooks/zuul/templates/group_vars/control-plane-clouds.yaml.j2 | 01:35 |
opendevreview | Ian Wienand proposed opendev/system-config master: Run jobs with a jammy bridge.openstack.org https://review.opendev.org/c/opendev/system-config/+/857799 | 02:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: testinfra: Update selenium calls https://review.opendev.org/c/opendev/system-config/+/858003 | 02:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: Abstract name of bastion host for testing path https://review.opendev.org/c/opendev/system-config/+/858476 | 02:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: bootstrap-bridge: use abstracted hostname https://review.opendev.org/c/opendev/system-config/+/861031 | 02:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: Convert production playbooks to bastion host group https://review.opendev.org/c/opendev/system-config/+/858486 | 02:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: Run a base test against "old" bridge https://review.opendev.org/c/opendev/system-config/+/860802 | 02:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] switch testing bridge name to bridge01.opendev.org https://review.opendev.org/c/opendev/system-config/+/861112 | 02:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: Move clouds definitions into control-planes-clouds group https://review.opendev.org/c/opendev/system-config/+/861130 | 02:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: Abstract name of bastion host for testing path https://review.opendev.org/c/opendev/system-config/+/858476 | 03:26 |
opendevreview | Ian Wienand proposed opendev/system-config master: bootstrap-bridge: use abstracted hostname https://review.opendev.org/c/opendev/system-config/+/861031 | 03:26 |
opendevreview | Ian Wienand proposed opendev/system-config master: Convert production playbooks to bastion host group https://review.opendev.org/c/opendev/system-config/+/858486 | 03:26 |
opendevreview | Ian Wienand proposed opendev/system-config master: Run a base test against "old" bridge https://review.opendev.org/c/opendev/system-config/+/860802 | 03:26 |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] switch testing bridge name to bridge01.opendev.org https://review.opendev.org/c/opendev/system-config/+/861112 | 03:26 |
frickler | fwiw I noticed that "duplicated IRC verification failed message" symptom on kolla changes earlier. bit confusing, but no easy way to resolve it, I guess | 05:02 |
opendevreview | Ian Wienand proposed opendev/system-config master: infra-prod-bootstrap-bridge: run directly on bridge https://review.opendev.org/c/opendev/system-config/+/861138 | 05:50 |
ianw | clarkb: ^ your comment on https://review.opendev.org/c/opendev/system-config/+/861031/3 inspired 861138. i think i intended the bootstrap process to run against bridge directly; not via the playbooks/zuul/run-production-playbook path. | 05:52 |
ianw | i might have just got distracted on seeing it through. | 05:52 |
*** ysandeep|out is now known as ysandeep | 05:55 | |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] switch testing bridge name to bridge01.opendev.org https://review.opendev.org/c/opendev/system-config/+/861112 | 06:00 |
opendevreview | Takashi Kajinami proposed opendev/system-config master: Add puppetlabs packages for Ubuntu Jammy to mirror https://review.opendev.org/c/opendev/system-config/+/861139 | 06:00 |
*** jpena|off is now known as jpena | 07:18 | |
*** pojadhav is now known as pojadhav|afk | 07:49 | |
*** pojadhav|afk is now known as pojadhav | 08:16 | |
*** ysandeep is now known as ysandeep|afk | 08:17 | |
*** rlandy|out is now known as rlandy | 10:24 | |
opendevreview | Merged opendev/irc-meetings master: Update Barbican meeting chair and time https://review.opendev.org/c/opendev/irc-meetings/+/860929 | 10:29 |
*** ysandeep|afk is now known as ysandeep | 10:42 | |
*** rlandy is now known as rlandy|mtg | 11:09 | |
*** rlandy|mtg is now known as rlandy | 11:50 | |
*** dasm|off is now known as dasm | 13:00 | |
*** knikolla[m] is now known as knikolla | 13:26 | |
*** ysandeep is now known as ysandeep|dinner | 14:19 | |
opendevreview | Merged opendev/system-config master: Revert "Pin version of grafana-oss container" https://review.opendev.org/c/opendev/system-config/+/852056 | 14:48 |
*** dviroel_ is now known as dviroel | 14:55 | |
*** ysandeep|dinner is now known as ysandeep | 15:18 | |
clarkb | frickler: if you have time for https://review.opendev.org/c/opendev/system-config/+/861117 today and don't have objections for us restarting Gerrit later today this will fix the gerrit + ssh rsa key issue some users have run into with newer openssh clients | 15:25 |
*** ysandeep is now known as ysandeep|out | 15:33 | |
*** marios is now known as marios|out | 15:38 | |
*** tkajinam is now known as Guest2971 | 15:43 | |
clarkb | fungi: you up for testing meetpad today to double check the recent update before the PTG starts next week? | 16:31 |
clarkb | if so what is a good time for that? | 16:31 |
fungi | clarkb: i can give it a go in a few minutes if you're available | 16:32 |
clarkb | that works for me | 16:34 |
*** jpena is now known as jpena|off | 16:34 | |
clarkb | we can use https://meetpad.opendev.org/isitbroken just let me know when you are ready | 16:35 |
fungi | clarkb: i've joined if you have a few minutes | 16:43 |
jrosser_ | clarkb: i got some mail from review.openstack.org today so messagelabs is passing those now - i added a rule for the domain. unusually there were no held messages that i could release from quarantine so i'm not sure what was happening to them before. | 16:47 |
clarkb | jrosser_: well it seemed that the remote mail server was rejecting them outright according to the responses our mail server was getting | 16:52 |
clarkb | wouldn't surprise me if they never made it far enough to be quarantined | 16:52 |
fungi | probably someone who uses that same mail service flagged messages from the address as spam or something | 16:53 |
clarkb | I think gitea-lb01 will be replaced with a jammy gitea-lb02 since replacing that server should be really simple. I think all it requires is making the new server and updating DNS | 16:58 |
clarkb | er will be my first jammy replacement | 16:58 |
clarkb | that way I'm not debugging afs on jammy and other jammy things if they come up | 16:58 |
fungi | yeah, should be quite trivial | 16:59 |
fungi | even the dns update ought to be effectively hitless for clients as long as we don't take down the original server until the old ttl expires | 17:00 |
clarkb | yup | 17:01 |
clarkb | http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=66612&rra_id=all I think that shows just under 8GB of memory is appropriate so I'll stick with the same flavor size | 17:02 |
fungi | agree | 17:03 |
clarkb | hrm vexxhost has v3 flavors now though so I should look at this more carefully. This might be the most complicated part :) | 17:03 |
fungi | sizing based on processors may be better in that case | 17:03 |
fungi | we may end up with too few cpus if we choose based on ram | 17:03 |
clarkb | http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=66609&rra_id=all I don't think we are very cpu limited | 17:04 |
fungi | pretty sure the newer flavors there are comparatively generous on ram | 17:04 |
clarkb | v3-standard-2 is 8GB memory and 2 dedicated vcpu. We currently run v2-highcpu-8 which is 8GB memory and 8 vcpu (not sure if dedicated) | 17:06 |
clarkb | but it really seems like we don't use a ton of cpu according to cacti. I think v3-standard-2 will be sufficient | 17:07 |
clarkb | I'll go ahead with v3-standard-2 and we can always start over if necessary | 17:08 |
clarkb | hrm there is no jammy image in vexxhost yet | 17:10 |
clarkb | mnaser__: ^ is that something y'all are planning to do or should I upload our own image? | 17:11 |
mnaser__ | clarkb: feel free to upload one, i think we've got a couple other things we're juggling around | 17:11 |
clarkb | ack | 17:13 |
clarkb | https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img is the image I'll pull down and reupload | 17:13 |
mnaser__ | clarkb: you might want to flip it to raw if you're doing bfv :> | 17:14 |
clarkb | I don't think we're doing bfv. But ya I'll upload raw to make it more versatile | 17:14 |
* clarkb converts it locally first to ensure I don't fill bridge's disk | 17:16 | |
clarkb | I'm uploading the image now finally (took some time to convert and copy it and check hashes and so on) | 17:48 |
*** Guest2868 is now known as diablo_rojo | 17:59 | |
clarkb | I'm working through some ssh issues trying to boot the node now | 18:01 |
clarkb | adding local debugging to launch-node | 18:01 |
clarkb | I think authentication is failing but it is failing suspiciously quickly. I suspect that this may be related to when ssh tells you to go away as the server isn't fully booted yet. I may need to manually boot an instance and test it | 18:08 |
fungi | does passing --keep to the script not help? | 18:09 |
clarkb | that's the next step. I wanted a manual boot to control everything and get a baseline that the image works (it does) | 18:14 |
clarkb | when you use --keep, sometimes cleaning up everything that results isn't straightforward | 18:14 |
clarkb | I'm suspecting that maybe the userwarning about the unknown host key is related | 18:14 |
clarkb | oh nevermind we set the policy to warn which is what it does | 18:19 |
clarkb | I think this might be rsa sha2 + paramiko problems | 18:34 |
clarkb | paramiko doesn't have an ed25519 generate method | 18:38 |
clarkb | we could maybe use ecdsa or upgrade paramiko instead. I guess that should be my next thing to try. Install stuff into a venv. Run out of venv with newer paramiko | 18:38 |
clarkb | but I must eat lunch now | 18:39 |
ianw | clarkb: if you have a sec to think about https://review.opendev.org/c/opendev/system-config/+/861138 which is a slight rework to the bridge bootstrap job, that would be good, as if ok i'll have to rebase the upgrade bits on it. | 19:49 |
ianw | thanks for looking at the jammy images, lmn if i can help ... | 19:50 |
ianw | one other thing i noticed we should probably venv-ize is docker-compose. not sure i want to switch every production host all at once, maybe an opt-in update | 19:51 |
clarkb | I was just looking to see where we install paramiko on bridge and I'm not seeing it | 19:51 |
clarkb | I suspect it may have been pulled in as an ansible dep in the past but it is no longer an ansible dep? | 19:51 |
clarkb | anyway I think the next thing with jammy is to make a virtualenv with a new paramiko and run out of that then see if that fixes the auth issues | 19:52 |
clarkb | yup that fixed it | 19:55 |
clarkb | now to see if we can bootstrap the rest of a jammy node | 19:55 |
*** dviroel is now known as dviroel|biab | 19:55 | |
clarkb | infra-root I think the "fix" here is to update paramiko on bridge to latest. I'm not aware of anything else using paramiko there (ansible uses openssh) | 19:56 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add Jammy gitea-lb02 to our inventory https://review.opendev.org/c/opendev/system-config/+/861226 | 20:10 |
clarkb | it appears that this IP was used by an older server in the past. We have a known hosts entry for it that conflicts. I'm going to clear that out | 20:15 |
opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Add gitea-lb02 to DNS https://review.opendev.org/c/opendev/zone-opendev.org/+/861227 | 20:22 |
opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Prepare opendev.org records for switch to new LB https://review.opendev.org/c/opendev/zone-opendev.org/+/861228 | 20:29 |
opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Swap gitea-lb01 to gitea-lb02 for opendev.org https://review.opendev.org/c/opendev/zone-opendev.org/+/861229 | 20:29 |
opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Reset TTLs for opendev.org records https://review.opendev.org/c/opendev/zone-opendev.org/+/861230 | 20:29 |
clarkb | ianw: I left a note on that change. I think it is mostly workable except for the one thing I called out | 20:31 |
opendevreview | Clark Boylan proposed opendev/system-config master: Fixups for launch node https://review.opendev.org/c/opendev/system-config/+/861231 | 20:39 |
opendevreview | Merged openstack/diskimage-builder master: Added cloud-init growpart element https://review.opendev.org/c/openstack/diskimage-builder/+/855856 | 20:41 |
clarkb | ianw: https://review.opendev.org/c/opendev/system-config/+/861117 is a good one for fixing ssh + rsa + sha1 with gerrit. | 20:43 |
clarkb | I'm doing all the ssh + rsa + sha1 things today :) | 20:43 |
*** dviroel|biab is now known as dviroel | 20:51 | |
ianw | clarkb: hrm, yes good call on that key. that's the key that is used to login to all the hosts, right? | 21:09 |
clarkb | ianw: I think so | 21:28 |
clarkb | Here's haproxy on the test gitea-lb02 jammy node logging what appear to be successful requests https://zuul.opendev.org/t/openstack/build/0681e64943674281bbfa1645e51df878/log/gitea-lb02.opendev.org/haproxy.log | 21:30 |
clarkb | I think https://review.opendev.org/c/opendev/system-config/+/861226 and https://review.opendev.org/c/opendev/zone-opendev.org/+/861227 are probably safe to land | 21:31 |
*** dviroel is now known as dviroel|afk | 21:33 | |
clarkb | if we ever need good examples of what healthy mailing list interactions look like I've been really impressed with the mailman3 users list | 21:58 |
opendevreview | Merged opendev/system-config master: Update our Gerrit images https://review.opendev.org/c/opendev/system-config/+/861117 | 22:10 |
*** rlandy is now known as rlandy|bbl | 22:12 | |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] infra-prod-bootstrap-bridge: run directly on bridge https://review.opendev.org/c/opendev/system-config/+/861138 | 22:23 |
ianw | clarkb: ^ do you want to restart gerrit for 861117? | 22:24 |
clarkb | ianw: ya I can do it. Probably in about an hour just so that it's as quiet as possible first | 22:26 |
ianw | ++ i can do it my afternoon if you like too | 22:26 |
clarkb | I should probably do one to keep in practice :) It should be a docker-compose pull && docker-compose down && docker-compose up -d right? | 22:28 |
ianw | yep that's what i do :) i usually run an inspect on the latest image just to double check it is what i think it is before restarting | 22:30 |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] infra-prod-bootstrap-bridge: run directly on bridge https://review.opendev.org/c/opendev/system-config/+/861138 | 22:33 |
ianw | clarkb: 861226 should be safe to merge, as the host will just sit there until we switch dns? might be good to just get it in and make sure it's running ansible ok, etc? | 22:35 |
clarkb | ianw: yup exactly | 22:39 |
clarkb | infra-root one thing I notice on review02 is that we may want to run `docker image prune -f` there? seems we've got a number of older images and I'm not sure how much value those provide but they do consume disk | 22:41 |
ianw | yeah, i think we can prune most. i guess we just keep them out of an abundance of caution so we can roll backwards | 22:42 |
clarkb | yup, on a number of other services we automatically prune but we don't do that for gerrit (which seems sane? I dunno) | 22:43 |
clarkb | I think it will leave behind images for gerrit 3.2, 3.3, and 3.4 since they are tagged | 22:43 |
clarkb | (which I'm also fine with) | 22:43 |
clarkb | should I go ahead and run that? the upside to running it before we pull the new image too is then we can have the current image left behind untouched by prune for rolling back to | 22:44 |
ianw | can you do something like prune all but the latest 2? | 22:45 |
clarkb | no unfortunately. You can time bound and leave behind tagged images. I guess you can figure out what the last two are and tag those before pruning | 22:46 |
clarkb | maybe do something like pruning everything more than 5 months old? | 22:46 |
clarkb | that should keep the last 2-3 images | 22:47 |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] infra-prod-bootstrap-bridge: run directly on bridge https://review.opendev.org/c/opendev/system-config/+/861138 | 22:47 |
ianw | maybe start with that and just manual prune any left overs? should remove the bulk of them | 22:48 |
clarkb | ya | 22:48 |
clarkb | `sudo docker prune --filter "until=2022-05-01T00:00:00" -f` maybe | 22:49 |
clarkb | running that command locally against the images i have on my laptop it pruned what I expected | 22:50 |
clarkb | everything with a tag was kept and then everything without a tag older than 5 months was removed | 22:51 |
clarkb | however I don't have an untagged image newer than 5 months | 22:51 |
ianw | lgtm; we don't need those 15/20 month old things | 22:51 |
clarkb | ok I'll run that on review02 now | 22:51 |
clarkb | er `sudo docker image prune` not `sudo docker prune` | 22:52 |
clarkb | Total reclaimed space: 17.72GB and the image listing looks sane | 22:54 |
clarkb | one thing we could add to the gerrit automation is a docker image prune with an until of like 6 months | 22:57 |
clarkb | adding an until to all our image prunes to keep the last week might also be good (though I think space is tighter on some hosts) | 22:57 |
clarkb | I'll have to think about that a bit more. Maybe do it on a case by case basis | 22:57 |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] infra-prod-bootstrap-bridge: run directly on bridge https://review.opendev.org/c/opendev/system-config/+/861138 | 22:59 |
clarkb | there are two changes in the openstack gate pipeline that look close to merging. I'll wait for those to clear out then do the gerrit restart | 23:12 |
clarkb | fungi: if you are around this evening it would be great if you can test and confirm removing your sha1 override is functional too, but that can happen tomorrow | 23:12 |
fungi | yeah, i can test, np | 23:13 |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] infra-prod-bootstrap-bridge: run directly on bridge https://review.opendev.org/c/opendev/system-config/+/861138 | 23:18 |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] infra-prod-bootstrap-bridge: run directly on bridge https://review.opendev.org/c/opendev/system-config/+/861138 | 23:27 |
clarkb | the last job I was waiting on is finishing up now. | 23:27 |
clarkb | and it's done. Running docker-compose pull and then will inspect it to double check | 23:29 |
clarkb | opendevorg/gerrit@sha256:9947e82a212c9a00c7171a656e8935485522d509a49d5b97fde5e54fabfaf7c9 seems to match docker hub | 23:30 |
clarkb | proceeding with the down and up -d now | 23:31 |
clarkb | the web ui is up for me | 23:33 |
clarkb | there is an IBM IP spamming the error log about not being able to negotiate ssh connections but that is preexisting in the log pre restart as well | 23:33 |
clarkb | the client only does diffie hellman sha1 variants and gerrit doesn't do those | 23:34 |
clarkb | fungi: are you able to ssh without your config override? | 23:35 |
clarkb | I can using my ed25519 key which doesn't tell us much other than ssh is generally working | 23:35 |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] infra-prod-bootstrap-bridge: run directly on bridge https://review.opendev.org/c/opendev/system-config/+/861138 | 23:37 |
fungi | clarkb: yep! it works | 23:38 |
clarkb | woot | 23:38 |
fungi | i no longer need the override for it | 23:39 |
clarkb | I just realized that we're still installing the 3.5.2 tag for a number of plugins when we build our gerrit war. We should update those to 3.5.3 but I don't think this is urgent. Often times the tags point to the same commit and we were already running on 3.5.3 with 3.5.2 plugins for a few days | 23:40 |
clarkb | working on a change for that now | 23:40 |
clarkb | and then I should write an email followup to the last email I sent about rsa keys | 23:41 |
clarkb | yup confirmed that for 3.5.x all of the 3.5.2 plugins we install have 3.5.3 tagged on the same commit so definitely not urgent, just good bookkeeping | 23:45 |
clarkb | oh but I have discovered that we apparently try to checkout plugins/its-base on stable-3.6 which no longer exists (it got merged into master) so zuul must be using a stale version of that branch? And a stable-3.5 branch exists for that now (instead of master) which is 3 commits behind master all of which appear related to stable-3.6. I'll correct that for consistency too | 23:51 |
*** dasm is now known as dasm|off | 23:52 | |
opendevreview | Ian Wienand proposed opendev/system-config master: infra-prod-bootstrap-bridge: run directly on bridge https://review.opendev.org/c/opendev/system-config/+/861138 | 23:54 |
ianw | clarkb: ^ I think that works around the issue with the root key. and gives us a point to write out other things if we need to | 23:55 |
*** rlandy|bbl is now known as rlandy | 23:55 | |
opendevreview | Clark Boylan proposed opendev/system-config master: Resync gerrit plugin versions to latest gerrit releases https://review.opendev.org/c/opendev/system-config/+/861270 | 23:58 |
clarkb | ok I think that ^ is bookkeeping caught up. Reviewers should pay attention to the its base changes | 23:58 |
clarkb | since that is the only real thing that changes there. | 23:58 |
*** rlandy is now known as rlandy|out | 23:59 |
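
The time-bounded prune discussed around 22:46 to 22:57 (keep tagged images, drop untagged ones older than some cutoff) could be sketched roughly as below. This is only an illustration, not the automation used on review02: the helper names `prune_cutoff` and `build_prune_command`, the reference date, and the 165-day window are assumptions chosen so that the result matches the `until=2022-05-01T00:00:00` value from the log. The command is only constructed here, never executed.

```python
from datetime import datetime, timedelta
from typing import List


def prune_cutoff(now: datetime, keep_days: int) -> str:
    """Format the cutoff for docker's `until` filter (hypothetical helper)."""
    cutoff = now - timedelta(days=keep_days)
    # docker accepts date-formatted timestamps such as 2006-01-02T15:04:05
    return cutoff.strftime("%Y-%m-%dT%H:%M:%S")


def build_prune_command(now: datetime, keep_days: int) -> List[str]:
    """Build (but do not run) the prune command as an argv list."""
    return [
        "docker", "image", "prune",
        "--filter", "until=" + prune_cutoff(now, keep_days),
        "-f",  # skip the confirmation prompt
    ]


if __name__ == "__main__":
    # Assumed reference date; 165 days back lands on 2022-05-01.
    print(" ".join(build_prune_command(datetime(2022, 10, 13), 165)))
```

Without `-a`, `docker image prune` only removes dangling (untagged) images, which is consistent with the observation in the log that tagged images were left behind. Passing the argv list to `subprocess.run` would be the next step once the printed command has been reviewed.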