Tuesday, 2023-02-28

opendevreviewIan Wienand proposed opendev/system-config master: make-tarball: role to archive directories  https://review.opendev.org/c/opendev/system-config/+/865784 00:09
opendevreviewIan Wienand proposed opendev/system-config master: tools/make-backup-key.sh  https://review.opendev.org/c/opendev/system-config/+/866430 00:09
clarkbkevinz: are you around?00:26
clarkbianw: kevinz: re works on arm I think the first thing I would start with is pointing out that the hardware didn't work for the vast majority of the last 6 months00:26
clarkbBut then on top of that we can talk about the cryptography package builds and kolla work?00:27
ianwyeah probably the fact it's fairly constantly busy is probably confirmation enough.  i feel like we could pull stats from zuul00:30
clarkbI can work on drafting something if y'all prefer but I won't be able to get to it until tomorrow. I do think it's a bit unfair to act as if the hardware was available for those 6 months though. Since aiui there were hardware issues at equinix00:30
clarkbianw: ya grafana has info00:30
clarkbanyway I think we should carefully call out that we don't have 6 months of data because the hardware was really only available for the last month or two00:30
clarkbbut then point at the good things that are happening anyway00:31
ianwi wonder if i should just manually delete the  linaro-us-regionone graphite stats00:34
ianwthe reason that shows in the graph is we use linaro-*00:34
clarkbaha https://grafana.opendev.org/d/391eb7bb3c/nodepool-linaro?orgId=1&from=now-30d&to=now&var-region=linaro-regionone is a better graph00:34
ianwour sweep will clear them out eventually, after they don't get updates for a while00:35
clarkbworking on a response there00:35
clarkbianw: something like that maybe?00:55
clarkbfeel free to edit and send or discard and send etc. Otherwise I'll probably send it out tomorrow00:55
clarkband now I need to get our meeting agenda out before dinner00:55
ianwthat looks pretty good to me.  although we're building on osuosl (for now, anyway, until building becomes a special case of a regular test) the fact that we're producing working arm64 images is maybe another good angle.  i'm not sure if they're used, but we have  a tool provably doing it00:58
clarkboh ya ++ to that00:58
clarkbmaybe add that to the paragraph about zuul + ansible00:59
clarkbok agenda is out and I'm being told dinner is ready. ianw if you want to send that email feel free. Otherwise I can send it out tomorrow morning01:02
ianwi think about it a bit more, not quite what i want to say.  i'll re-read it all in a bit but i think maybe sleep on it and send tomorrow01:02
clarkbLooks like they are also asking about plans for the future. Which I haven't really addressed so far01:02
clarkbianw: if you can think of future plans ^ feel free to add those too01:03
ianwwheel builds is another thing we do01:03
opendevreviewIan Wienand proposed opendev/system-config master: make-tarball: role to archive directories  https://review.opendev.org/c/opendev/system-config/+/865784 01:11
opendevreviewIan Wienand proposed opendev/system-config master: tools/make-backup-key.sh  https://review.opendev.org/c/opendev/system-config/+/866430 01:11
opendevreviewIan Wienand proposed opendev/system-config master: make-tarball: role to archive directories  https://review.opendev.org/c/opendev/system-config/+/865784 02:43
opendevreviewIan Wienand proposed opendev/system-config master: tools/make-backup-key.sh  https://review.opendev.org/c/opendev/system-config/+/866430 02:43
opendevreviewIan Wienand proposed opendev/system-config master: make-tarball: add some extraction instructions  https://review.opendev.org/c/opendev/system-config/+/875587 03:49
ianwclarkb: ^ i think that's now ready to go again03:50
ianwyou can validate the process with https://4bf453712ef03cadf4d7-91acd78cd46015ce54b5f888f723113e.ssl.cf2.rackcdn.com/865784/15/check/system-config-run-base/7e31121/bridge99.opendev.org/make-tarball/backup_2023-02-28T03:04:56.tar 03:50
ianwthe key can be reconstructed from https://review.opendev.org/c/opendev/system-config/+/865784/15/playbooks/zuul/templates/group_vars/bastion.yaml.j2 03:51
ianw875587 has a little session example03:51
*** yadnesh is now known as yadnesh|away11:07
bbezakHi. I got ssh key issue in one of job, when connecting to zuul@ - https://zuul.opendev.org/t/openstack/build/2965f8f2f4c84560b1e5ffeaf3ac6c1c13:08
bbezaklooks weird :)13:09
fungibbezak: it happens from time to time, when nova loses track of a vm in a cloud provider's network for some reason and then neutron allocates the same ip address to us for a new vm without realizing something on the network still answers to it, so ansible randomly ends up getting routed to the old vm13:12
fungiif we see a high incidence of it, we usually correlate the ip addresses and then provide a list to the provider, but most of the time they run background cleanup processes to find and delete those automatically anyway13:13
fungilooks like that one is in rackspace's iad region13:15
bbezakok, so vm is not in openstack, but somehow still running on the host13:15
bbezakok, thx for clarification fungi13:17
fungiyes, usually you end up (as the operator) having to do something like track its mac down to a specific hypervisor host through your bridge tables and then use virsh to get a listing of things running there and check them against nova's db13:19
fungibut obviously an automated process there could just periodically dump a list of all virtual machines on the host and then see if any don't match a db entry13:19
fungii think there are some popular openstack ops tools around stuff like that, but since i'm not a production cloud operator these days my understanding of this is mostly theoretical13:20
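The periodic check fungi describes at 13:19 boils down to a set difference. The following is a minimal sketch, not an existing OpenStack ops tool: `find_orphans` and the sample identifiers are hypothetical, and in practice the two inputs would presumably come from a virsh listing on each hypervisor and from nova's database.

```python
def find_orphans(running_on_host, known_to_db):
    """Return the VM identifiers running on a host that have no database entry."""
    # Anything the hypervisor reports that the database does not know about
    # is a candidate leaked VM that may still answer on a reallocated IP.
    return sorted(set(running_on_host) - set(known_to_db))


# Hypothetical sample data for one hypervisor host.
running = ["uuid-a", "uuid-b", "uuid-c"]
db_entries = ["uuid-a", "uuid-c", "uuid-d"]

orphans = find_orphans(running, db_entries)
print(orphans)
```

Entries that exist only in the database (here `uuid-d`) are ignored on purpose, since they may simply live on a different host.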
clarkboh wow gitea09 sync is still not completed16:15
clarkbI'm thinking for the next giteas I'll try to do them assembly line style all at once and maybe we can have the syncs run over a weekend16:16
fricklersomething's wrong with meetbot, for the recent #-qa meeting only one file was saved, also no links were posted to the channel https://meetings.opendev.org/meetings/qa/2023/16:22
fricklerthe neutron-ci meeting that ended just a bit earlier was recorded fine16:22
fricklernothing obvious in the logs that I could find16:23
clarkbcould it be afs quotas?16:24
clarkbthough I would've expected messages back to the channel and only the file issues in that case16:24
clarkbdoesn't look like that volume is on our afs dashboard16:24
fricklerI don't think that is on afs even?16:27
fricklerha, it just missed the earlier #endmeeting16:27
fricklerjust tried again and now all looks well16:27
clarkbI think it is in afs16:27
clarkbor at least the site is, maybe not the raw logs16:28
funginothing is wrong. people sometimes forget that the chair changing their nick after starting the meeting causes them to no longer appear to be the meeting chair16:37
fungimeetbot simply doesn't track nick changes. possible feature addition for the future i suppose16:38
fungithe obvious workarounds are to add a second chair before changing your nick, or changing your nick back to what you started the meeting with16:41
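The possible feature fungi mentions at 16:38 could look roughly like the sketch below. This is not meetbot's code (per the discussion above, meetbot does not track nick changes); `ChairTracker` and its method names are hypothetical and only illustrate carrying chair status across a nick change.

```python
class ChairTracker:
    """Track meeting chairs by nick and follow nick changes (hypothetical)."""

    def __init__(self, starter):
        # Whoever starts the meeting is the first chair.
        self.chairs = {starter}

    def add_chair(self, nick):
        self.chairs.add(nick)

    def on_nick_change(self, old, new):
        # Carry chair status over to the new nick instead of losing it.
        if old in self.chairs:
            self.chairs.remove(old)
            self.chairs.add(new)

    def may_end_meeting(self, nick):
        return nick in self.chairs


tracker = ChairTracker("alice")
tracker.on_nick_change("alice", "alice|away")
print(tracker.may_end_meeting("alice|away"))
```

Without the `on_nick_change` step, the chair set would still contain only the original nick, which matches the behaviour seen with the missed #endmeeting.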
clarkbfungi: frickler: have you had a chance to look at https://etherpad.opendev.org/p/3DcVXw0PBOknv1bgyZWh ? This is an email response to arm about the use of our Works on Arm hardware hosted by equinix running a linaro cloud for arm CI16:56
clarkbLooking to send that out today and any feedback you might have is appreciated16:56
clarkbthe git gc on gitea09 is starting to pack those objects into more space efficient pack files17:16
* clarkb is a bit annoyed the replication sync is still happening. I thought I'd be asking for reviews to add gitea09 to haproxy by now17:16
fungiclarkb: the writeup lgtm, i left a couple of comments17:21
clarkbfungi: the point about linking to the image build logs and images themselves is good. I'll get that added. For the second item I think ianw was trying to convey that we're busy on keeping things running which makes it difficult to find time for writing blogposts17:45
opendevreviewClark Boylan proposed opendev/system-config master: Publish raw images on our nodepool builders  https://review.opendev.org/c/opendev/system-config/+/875775 17:50
clarkbfungi: ^ that comes out of this too17:51
clarkbI'll let ianw respond to the other comment then we can hopefully ship this out in a couple hours17:52
clarkbthere are about 200 more replication tasks before we get into the ~200 or so retries that are queued up18:10
fungii don't remember replication being so slow in the past18:10
clarkbfungi: it's slow for the initial sync18:11
clarkbbut also made slower due to gerrit putting all the data in git18:11
fungii remember it taking hours, but not this long18:11
clarkbya there are a lot more refs now with notedb18:12
fungii wonder if an rsync initially would make the process faster18:12
clarkbit might, but gitea relies on git hooks receiving the data to populate its knowledge of the repo iirc18:13
clarkbI didn't attempt an rsync due to ^18:13
clarkbalso rsync would not apply the gerrit acls (which mostly doesn't matter for us outside of all-projects and all-users)18:14
fungitrue. i couldn't remember if you were copying the database in already18:16
clarkbI did, but since we didn't prevent people from pushing new changes in the interim18:17
clarkbIt's probably possible, it's just this route is likely the most safe18:17
clarkbfungi: for the next servers I think what I want to do after gitea09 proves the process is deploy three new servers all together then run a big sync over the weekend18:20
fungithat makes sense18:24
clarkbfungi: the gerrit trouble yesterday would've impacted replication too18:45
clarkbthough it's still taking quite a bit of time considering18:45
fungioh, yeah i suppose that did chuck a wrench into the gears18:46
clarkbfungi: and between gitea09 being done and adding more new servers I think we should land https://review.opendev.org/c/opendev/system-config/+/875533 to update gitea. I don't want to do that until gitea09 is behind haproxy though18:47
clarkbjust to avoid moving parts18:47
fungisounds good, i agree18:59
ianwclarkb/fungi: yeah was just trying to say we're probably not going to be writing full on blog posts, etc., but if there's a marketing/comms person driving that sort of thing we can certainly help19:02
fungiianw: got it, i included some suggested rephrasing to make that clearer19:03
opendevreviewMerged opendev/system-config master: Publish raw images on our nodepool builders  https://review.opendev.org/c/opendev/system-config/+/875775 19:55
clarkbianw: I think that edit looks good19:58
clarkbianw: do we want to wait for nb04 to list raw images then update the footnotes before sending?19:58
ianwwe can do for our own sake of being technically correct, but i doubt anyone will double check it really :)19:59
clarkband did you have a preference for who sends it? I'm happy either way. But you've put a lot of effort into the arm stuff so you may want to take some of the credit here :)20:00
ianwheh, i'm happy for you to send it20:00
clarkbok, I'll wait for nb04 to list images then send it out20:01
fungii struck through the opening to paragraph #7 since it was somewhat repetitive with the start of paragraph #8 (typo notwithstanding)20:01
clarkbfungi: thanks20:02
fungiso yeah, clean up line 59 once the vhost fix deploys and it's good to go in my opinion20:03
clarkbgitea09 is down to replicating tripleo-heat-templates20:04
clarkbonce that completes I'll trigger a full replication of everything?20:04
clarkbeverything == all repos and all giteas20:04
clarkbare we comfortable with that at this point after yesterday's fun?20:04
ianwi think so, i didn't get any particular response, but also it doesn't seem we're still getting hit with that20:07
fungiyeah, no new hits on my grep of apache logs20:09
clarkbalright I'll trigger that once I see gitea09 finish up20:14
clarkband then I'll remove my WIP from https://review.opendev.org/c/opendev/system-config/+/874175 20:15
clarkbI had to restart/reload apache to pick up the vhost changes. https://nb04.opendev.org/images/ loads interesting info now though20:17
clarkbwill put the email together and send it now20:17
clarkbfull replication has been enqueued20:23
clarkband arm email is sent20:25
clarkbI think the full sync is done already20:52
clarkbwhich makes sense since it should noop most things20:52
clarkbI have un-WIPed https://review.opendev.org/c/opendev/system-config/+/874175 to put gitea09 behind the load balancer20:53
ianwlgtm, i didn't +w in case you want to watch it or something, but gitea09:3000 seems fine20:54
clarkbI think we can probably send it in. Worst case I'll manually remove it from pools and push a revert20:55
clarkbI've approved it20:56
opendevreviewIan Wienand proposed opendev/system-config master: make-tarball: add some extraction instructions  https://review.opendev.org/c/opendev/system-config/+/875587 20:59
clarkbfungi: whats up with this email from gmann on openstack-discuss "Returned mail: Data format error"21:13
fungiwhere did you see it?21:15
gmannclarkb: fungi: I just checked, not sure how it coming 21:15
gmannfungi: https://lists.openstack.org/pipermail/openstack-discuss/2023-February/032494.html 21:15
gmannit was not from my sent item21:15
fungioh, i see, my mailserver rightly detected it as spam21:16
clarkboh someone spoofing then21:16
fungispoofed post, originated from an ip address in korea21:17
fungiit made it through because it was spoofed for an address which is a subscriber and, unlike usually happens with these, it was under the 40kb message limit21:18
fungiif more of these start coming in, i'll put the list in emergency moderation mode, but would prefer not to do that since it would mean every post gets held for moderation and i think i'm the only active moderator these days21:23
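The reason the spoofed post got through, as fungi explains at 21:18, can be written out as a small decision function. This is only a sketch of the described behaviour, not the list server's actual implementation: `is_held_for_moderation`, the byte value used for the 40kb limit, and the assumption that failing posts are held (rather than rejected) are all illustrative.

```python
# The 40kb limit mentioned above; the exact byte count is an assumption.
MAX_MESSAGE_BYTES = 40 * 1024


def is_held_for_moderation(sender, size_bytes, subscribers, emergency=False):
    """Return True if a post would be held instead of delivered."""
    # Emergency moderation holds every post regardless of sender or size.
    if emergency:
        return True
    # Posts claiming a non-subscriber address are held.
    if sender not in subscribers:
        return True
    # Subscriber posts are only held when they exceed the size limit.
    return size_bytes > MAX_MESSAGE_BYTES


# Hypothetical subscriber list and a small spoofed message.
subscribers = {"member@example.org"}
print(is_held_for_moderation("member@example.org", 10_000, subscribers))
```

Because the check only looks at the claimed sender address, a small message spoofing a subscriber passes, while enabling `emergency` would hold it along with every legitimate post.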
gmannfungi: anything I need to do on this? 21:23
fungii don't think so, no21:24
gmannack, thanks 21:24
opendevreviewMerged opendev/system-config master: Add gitea09 to the gitea load balancer  https://review.opendev.org/c/opendev/system-config/+/874175 21:56
clarkbthat got in ahead of the hourly runs22:01
clarkbI'm still getting balanced to gitea01 so hard for me to test that gitea09 is happy through the lb. But cacti shows that it appears to be in use23:22
fungithat's probably the best you can do without trying random machines with different source addresses23:28
clarkbianw: if you get a chance to look at https://review.opendev.org/c/opendev/system-config/+/875533 I'll try to land that tomorrow when I can keep an eye on it now that gitea09 is just like the other 8 gitea servers23:45
ianwnp will do23:46
clarkbianw: in https://review.opendev.org/c/opendev/system-config/+/865784 what caused the rax dns backup contents to be present and now things pass?23:53
ianwumm i think that directory was always there, maybe it was something else?  I had a few issues in the tar generation but i think it's all quoted and a bit simpler now23:54
clarkbah. I seem to recall it complaining specifically about that but maybe it was a quoting problem and I didn't notice that23:54
ianwi just wanted to do two fairly separate directories, but also not always upload too much to the logs23:55
ianwi did double check by manually getting into the generated output23:55
ianwi thought about including a script or something, but i just ended up making some notes -> https://review.opendev.org/c/opendev/system-config/+/875587 23:56
clarkbianw: ya just left some comments on that (minor things)23:58
ianwclarkb: yeah, you'll need the password to import the private key, and decrypt with it.  although if you use gnome it may be cached in between23:59

