Friday, 2022-04-29

ianwit looks like $releasever isn't being expanded in the nodepool builders00:00
clarkbfungi: yes I expect this next run will vos release successfully00:00
clarkbbut it takes time to get to that point so it will be a few minutes before I know for sure00:00
clarkbdo y'all think I need to rerun this script a second time, or just release the lock once I've got success and let the regular process handle it?00:01
ianwclarkb: did you run with NO_TIMEOUT=1 ?00:02
clarkboh I did not. But it's fine, it hasn't been that long00:03
clarkblike it's getting to that error and failing, not getting to a timeout and failing00:04
clarkbit's just like 10 minutes or something00:04
clarkbThis is a complete release of volume 53687094900:15
clarkbhrm maybe that is why I needed to set NO_TIMEOUT?00:15
clarkbianw: ^ do you know what happens if we timeout the vos_release?00:15
ianwactually i think that bit is ok these days00:16
clarkbanyway reprepro completed successfully. We're just waiting on the vos release now. It is running in screen window 2 on mirror-update and it logs to the regular log file00:16
ianwthat is actually an untimed ssh call to do the release on the afs01 directly00:16
clarkbI'll try to check on it after dinner and release the lock, but if it goes longer than that you may need to release the lock00:16
ianwbut yeah, we don't want to kill that if we can avoid it, because then we have to clear the locks and do full releases00:17
clarkbya it's running in screen so it should be fine as long as the timeout doesn't get it00:18
clarkbcrazy idea: default to no timeout then set the timeout flag in the cron jobs so that manual runs don't have to remember00:19
*** dviroel|rover is now known as dviroel|out00:20
ianwwe may be at a point of stability where we could avoid the timeouts altogether.  we haven't had anything hang that i know of in a long time00:21
fungiwell, the timeouts might be hiding it though00:21
ianwtrue, i guess we don't vos release if something times out, so we don't see the partial updates00:22
clarkbReleased volume mirror.ubuntu successfully its done already00:22
clarkbdo you think we need to run it again or can I release the lock?00:22
fungibut i agree it seems like a worthwhile experiment. those timeouts seem more likely to break than fix stuff these days00:22
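The idea floated above (no timeout by default, with cron jobs opting in) could look roughly like the sketch below. This is only an illustration: `MIRROR_TIMEOUT` and `build_command` are hypothetical names, not the variable or helper the real mirror-update scripts use (the log only mentions `NO_TIMEOUT=1`), and it assumes the coreutils `timeout` wrapper is available.

```python
import os


def build_command(base_cmd, env=None):
    """Return base_cmd, wrapped in coreutils `timeout` only when requested.

    MIRROR_TIMEOUT is a hypothetical variable name used for illustration.
    Unset or empty means "no timeout", so a manual run needs no extra flags.
    """
    env = os.environ if env is None else env
    raw = env.get("MIRROR_TIMEOUT", "")
    if not raw:
        # Default path: run the command without any time limit.
        return list(base_cmd)
    # Opt-in path: a cron entry would export MIRROR_TIMEOUT=<seconds>.
    return ["timeout", str(int(raw))] + list(base_cmd)


# A manual run gets the bare command; a cron run gets the wrapped one.
manual = build_command(["reprepro", "update"], env={})
cron = build_command(["reprepro", "update"], env={"MIRROR_TIMEOUT": "3600"})
```

With this shape the burden moves to the crontab, which is written once, rather than to whoever is running the script by hand.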
ianwyay!  i think you can probably release the lock00:22
clarkbdone. I'll leave the screen up for further cleanups00:23
clarkb826GB to 581GB00:24
clarkbnot bad00:24
clarkband now dinner. frickler  I think we can probably continue to add jammy stuff now as that frees up quite a bit of room00:25
ianwok, so the rpm in the centos7 chroot thinks that nothing is installed01:00
opendevreviewSteve Baker proposed openstack/diskimage-builder master: Make centos reset-bls-entries behave the same as rhel  https://review.opendev.org/c/openstack/diskimage-builder/+/83983001:41
opendevreviewSteve Baker proposed openstack/diskimage-builder master: Parse block device lvm lvs size attributes  https://review.opendev.org/c/openstack/diskimage-builder/+/83982901:41
fungi#status log Replaced block storage volume backup01.ord.rax.opendev.org/main02 with main04 in order to avoid service disruption from upcoming provider maintenance activity02:00
opendevstatusfungi: finished logging02:00
opendevreviewOpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml  https://review.opendev.org/c/openstack/project-config/+/83983702:28
opendevreviewMerged openstack/project-config master: Normalize projects.yaml  https://review.opendev.org/c/openstack/project-config/+/83983702:55
ianwok, so we create the initial chroot from outside tools (dnf/rpm) on buster.  it seems that something has changed, and centos 7 rpm can not read the rpmdb created by this03:09
ianwso, rpm -qa returns none.  this flows through to yum, which somehow uses rpm to figure out how to populate $releasedir03:10
ianw$releasever even03:10
ianwand so we see the error where yum can not download 03:10
ianwwhat is weird is that everything is installed.  so if you tell yum manually "--releasever=7" it somehow works.  it goes off and  installs everything, presumably rewrites the rpmdb and the build seems to work03:11
ianwi think i see the problem.  the rpm on the host side is creating a rpmdb in sqlite format.  centos 7 expects a bdb format rpmdb.  these don't share files in common; to centos 7 it looks like a blank directory03:59
ianwyou can convert from bdb -> sqlite, but not the other way (the bdb backend is read-only, it seems).  the workaround of setting releasever and letting the in-chroot rpm figure it out seems like the best idea now04:00
ianwespecially given how much time i want to spend on centos7 issues, which is ~ none04:00
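A quick way to confirm this kind of mismatch is to look at which database files exist in the chroot's rpm directory. The sketch below is an assumption-laden illustration: it assumes the sqlite backend stores its data in `rpmdb.sqlite` and the bdb backend in `Packages`, and the function names and the path in the usage note are hypothetical.

```python
import os

# Assumed marker files for each rpmdb backend (not taken from the log).
BACKEND_MARKERS = {
    "sqlite": "rpmdb.sqlite",
    "bdb": "Packages",
}


def classify_rpmdb(filenames):
    """Return the sorted list of backends whose marker file is present."""
    present = set(filenames)
    return sorted(
        backend
        for backend, marker in BACKEND_MARKERS.items()
        if marker in present
    )


def detect_rpmdb(dbpath):
    """Classify the rpmdb found in a directory such as <chroot>/var/lib/rpm."""
    return classify_rpmdb(os.listdir(dbpath))
```

For example, `detect_rpmdb("/path/to/chroot/var/lib/rpm")` returning only `["sqlite"]` would match the situation described above, where a bdb-only rpm sees what looks like an empty database.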
*** ysandeep|out is now known as ysandeep04:11
ianwnote centos-7-0000264326 is an "accidentally" fixed build that has been uploaded now04:14
ianwi was working on nb01 because this doesn't replicate in the gate environment, and this build went to completion.  i've left the changes in the currently running nb01 container, but if that restarts, it won't build centos-7 again04:15
ianwthis is fine for now until we get fixes in, and at least we have a refreshed image04:16
ianwthis has worked to build the openafs rpms -- see https://zuul.opendev.org/t/openstack/build/eee983e184de48a2a970de2e97f5dc4604:16
*** bhagyashris|ruck is now known as bhagyashris|sick04:37
opendevreviewIan Wienand proposed openstack/diskimage-builder master: yum-minimal: workaround missing $releasedir variable  https://review.opendev.org/c/openstack/diskimage-builder/+/83984004:42
opendevreviewSteve Baker proposed openstack/diskimage-builder master: Parse block device lvm lvs size attributes  https://review.opendev.org/c/openstack/diskimage-builder/+/83982904:55
*** marios is now known as marios|ruck05:06
opendevreviewIan Wienand proposed opendev/system-config master: Test openafs roles on CentOS 9-stream  https://review.opendev.org/c/opendev/system-config/+/83984105:14
*** ysandeep is now known as ysandeep|afk05:39
*** ysandeep|afk is now known as ysandeep06:00
*** jpena|off is now known as jpena07:07
*** ysandeep is now known as ysandeep|lunch07:32
opendevreviewIan Wienand proposed opendev/system-config master: Test openafs roles on CentOS 9-stream  https://review.opendev.org/c/opendev/system-config/+/83984107:36
opendevreviewIan Wienand proposed opendev/system-config master: Remove puppet-kibana  https://review.opendev.org/c/opendev/system-config/+/83986707:36
*** ysandeep|lunch is now known as ysandeep09:16
*** marios|ruck is now known as marios|ruck|lunch10:23
*** marios|ruck|lunch is now known as marios|ruck10:46
*** ysandeep is now known as ysandeep|afk11:03
*** iurygregory__ is now known as iurygregory11:14
*** dviroel|out is now known as dviroel11:21
frickleranyone remember the fix for https://github.com/pypa/setuptools/issues/3197 ? this is now affecting keystone-specs cf. https://zuul.opendev.org/t/openstack/build/101ff54f13ba42d4acf840394851576711:52
fricklerah, https://review.opendev.org/q/topic:setuptools-issue-3197 seems to have some examples11:54
fungifrickler: yeah, i still think there's probably something we could do in pbr itself to solve that, since pbr-using packages want to let pbr decide what files to package and not rely on setuptools' file finder feature at all12:02
fungibut it's easily worked around by just preempting the feature in setuptools through configuration12:03
opendevreviewchandan kumar proposed zuul/zuul-jobs master: [DNM] ovs debug  https://review.opendev.org/c/zuul/zuul-jobs/+/83993812:20
fricklersomeone mentioned that https://wiki.openstack.org/wiki/UsingIRC is using an example pic with a big "freenode" on it. I don't think that it's worth updating the pic, maybe just drop it and add a link to our docs page instead?13:02
fricklermainly just mentioning it because this seems to be a relevant path for newcomers to find us13:02
* frickler updated the wiki page now13:08
*** ysandeep|afk is now known as ysandeep13:36
*** pojadhav is now known as pojadhav|afk13:50
fungithanks! i agree with your suggestion there14:04
fungii hadn't noticed it (and forgot we even had that page)14:05
fungi#status log Replaced block storage volume mirror01.ord.rax.opendev.org/main01 with main02 in order to avoid service disruption from upcoming provider maintenance activity14:25
opendevstatusfungi: finished logging14:25
fungithat pvmove took close to 12 hours even though the volume was only ~256gb. not sure whether it's because of a slow backend or heavy write activity on the fs14:26
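For a sense of scale, the average rate implied by that move can be worked out directly. This is a back-of-the-envelope sketch that treats "~256gb" as 256 GiB and "close to 12 hours" as exactly 12 hours, so the result is only approximate.

```python
def average_throughput_mib_s(size_gib, hours):
    """Average transfer rate in MiB/s for size_gib moved over the given hours."""
    return size_gib * 1024 / (hours * 3600)


# Assumed inputs: 256 GiB moved in 12 hours.
rate = average_throughput_mib_s(256, 12)
print(f"{rate:.1f} MiB/s")
```

That works out to roughly 6 MiB/s on average, which is low enough that either explanation offered above (slow backend or heavy write activity) is plausible.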
fungianyway, that's all three of the volumes they warned us about, so i'll close out that ticket14:27
*** ysandeep is now known as ysandeep|out14:59
fungiclarkb: have you been unenrolling servers from ua before deleting?15:27
fungioh, maybe we didn't enroll things we knew we were planning to decommission15:28
fungistatus.o.o: "UA Infra: Extended Security Maintenance (ESM) is not enabled."15:28
fungiso nothing to unenroll there anyway15:28
Clark[m]fungi: bit of a slow start today. If I remember to unenroll then yes, but a couple have been missed. And ya ELK and friends weren't enrolled due to quantity and expectation they would go away15:37
fungithanks for confirming, and yes it dawned on me after i asked that we had consciously omitted those15:39
*** dviroel is now known as dviroel|lunch15:45
opendevreviewJeremy Stanley proposed opendev/system-config master: Decommission status.openstack.org and services  https://review.opendev.org/c/opendev/system-config/+/83996315:50
fungii expect zuul to tell me i missed something there15:50
fungithe server is offline and being imaged currently15:51
fungiit looks like at a minimum we'll be able to retire puppet-elastic_recheck and puppet-reviewday repos after that merges15:52
fungiprobably also the reviewday repo itself15:52
*** marios|ruck is now known as marios|out15:53
clarkbfungi: yes reviewday itself too. But not e-r itself as it is used still15:54
clarkbjust not in our systems15:54
fungiright15:55
fungithough i would at this point encourage the current e-r maintainers to merge the rdo branch back into master15:55
clarkbya probably a good idea so that people don't get confused15:55
fungistatus.o.o image has saved successfully. i'm deleting the server and associated dns records now16:05
fungiserver instance and dns records all deleted now16:12
fungi#status log Decommissioned the status.openstack.org server as it was no longer hosting any working services: http://lists.openstack.org/pipermail/openstack-discuss/2022-April/028279.html16:13
opendevstatusfungi: finished logging16:13
clarkbfungi: the error on your change above may be related to me missing removal of a repo from our system-config zuul jobs required projects16:14
clarkbfungi: do you want to just fix that in your change or should I push up a separate fix for it?16:14
fungichecking16:15
clarkbspecifically I retired puppet-kibana but didn't remove it from those jobs16:15
fungioh, good catch. i can roll it into this16:16
clarkbthanks16:16
fungiit's also still in modules.env16:16
opendevreviewJeremy Stanley proposed opendev/system-config master: Decommission status.openstack.org and services  https://review.opendev.org/c/opendev/system-config/+/83996316:18
clarkbfungi: looks like I also failed to remove health from groups. I'll take a look at fixing that with a followup momentarily16:21
fungii can add it16:26
opendevreviewClark Boylan proposed opendev/system-config master: Remove health group from our ansible groups  https://review.opendev.org/c/opendev/system-config/+/83996616:26
clarkbfungi: ^ got it16:26
fungioh, you got it16:26
clarkbfungi: looks like ianw found the puppet-kibana thing https://review.opendev.org/c/opendev/system-config/+/83986716:30
clarkbI'm inclined to approve that and then rebase your change on top? Any objections to that?16:30
fungiwfm16:30
fungiapproved it16:30
*** jpena is now known as jpena|off16:30
clarkbanyone know if we've got a jammy ubuntu-ports change yet?16:31
fungii'll also reorder the health cleanup ahead of my change16:31
fungii haven't seen a jammy ubuntu-ports addition yet, but could have missed it16:31
clarkbya a quick search on gerrit shows no results. I'll push one up now that we have room16:32
opendevreviewJeremy Stanley proposed opendev/system-config master: Remove health group from our ansible groups  https://review.opendev.org/c/opendev/system-config/+/83996616:35
opendevreviewJeremy Stanley proposed opendev/system-config master: Decommission status.openstack.org and services  https://review.opendev.org/c/opendev/system-config/+/83996316:35
opendevreviewClark Boylan proposed opendev/system-config master: Mirror Jammy arm64 ubuntu-ports  https://review.opendev.org/c/opendev/system-config/+/83997216:35
clarkbheh that stack has that same diff problem again16:35
fungiyep, because i reordered changes16:36
clarkbfungi: the puppet list is looking very small now :)16:39
clarkbslowly but surely we've made progress16:39
opendevreviewMerged opendev/system-config master: Remove puppet-kibana  https://review.opendev.org/c/opendev/system-config/+/83986716:44
*** dviroel|lunch is now known as dviroel16:47
* fungi cheers16:51
opendevreviewClark Boylan proposed opendev/system-config master: Enable Gerrit httpd requestLog  https://review.opendev.org/c/opendev/system-config/+/83997617:08
clarkbthat is another suggestion that has come out of discussion about Gerrit 3.5's increased memory consumption17:08
clarkbOur test jobs should collect that file and we can compare memory costs between the 3.4 test instance and 3.5 test instance to start17:09
opendevreviewMerged opendev/system-config master: Remove health group from our ansible groups  https://review.opendev.org/c/opendev/system-config/+/83996617:10
fungiyay! 839963 seems to be passing tests, so i guess i didn't miss anything after all17:12
clarkbfungi: we can probably single core approve https://review.opendev.org/c/opendev/system-config/+/839963/ since the server is gone now17:23
fungiyep17:23
fungiplease do17:23
fungiit was announced a week ago and there were no concerns raised, so i think we're well covered17:24
opendevreviewMerged opendev/system-config master: Decommission status.openstack.org and services  https://review.opendev.org/c/opendev/system-config/+/83996317:39
clarkblooking at the gerrit httpd log from the change above and there are some requests that consume more memory on 3.4 and some that use more on 3.518:08
clarkbI think we'll need to recheck it a few times and see if we have a stable baseline within a version of gerrit18:08
clarkband then compare those averages (or discard the info because it is too inconsistent)18:08
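The approach described above (check for a stable baseline within each version, then compare averages or discard) can be sketched as follows. It assumes the per-request memory figures have already been extracted from the httpd logs into plain lists; the 10% spread threshold and the function names are arbitrary, hypothetical choices.

```python
from statistics import mean, pstdev


def summarize(samples):
    """Return (average, relative spread) for memory samples of one request type."""
    avg = mean(samples)
    spread = pstdev(samples) / avg if avg else 0.0
    return avg, spread


def compare(runs_old, runs_new, max_spread=0.10):
    """Relative change from old to new, or None if either baseline is unstable.

    Assumes the old-version average is non-zero.
    """
    avg_old, spread_old = summarize(runs_old)
    avg_new, spread_new = summarize(runs_new)
    if max(spread_old, spread_new) > max_spread:
        # Too inconsistent across rechecks: discard rather than compare.
        return None
    return (avg_new - avg_old) / avg_old
```

Each list would hold one value per recheck for the same request type on the same Gerrit version, so a `None` result means more rechecks (or a different measurement) are needed before drawing any conclusion.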
fungimakes sense18:10
clarkbthat said even if it is too inconsistent in CI I think we should land this for production as I expect it will be far more stable there18:11
clarkbI guess I could try running it without the performance logging thing toggled off and see if we notice a difference in CI too18:23
clarkbya after this recheck I'll reorder the changes so that we get httplogs with and without performance logging disabled18:23
opendevreviewJeremy Stanley proposed opendev/system-config master: Clean up defunct OpenStack mailing lists  https://review.opendev.org/c/opendev/system-config/+/83999018:29
fungithat removes almost half of the remaining lists on lists.openstack.org18:33
clarkbit's got my +2 :)18:33
opendevreviewClark Boylan proposed opendev/system-config master: Enable Gerrit httpd requestLog  https://review.opendev.org/c/opendev/system-config/+/83997622:44
opendevreviewClark Boylan proposed opendev/system-config master: Explicitly disable Gerrit tracing.performanceLogging  https://review.opendev.org/c/opendev/system-config/+/83925122:44
mnaseri don't assume zuul in opendev was restarted with the new unrestricted ansible stuff?22:44
clarkbmnaser: it has not been22:44
clarkbsoon probably, but not yet22:44
mnaseraw okay, i'm looking super forward to it :) probably won't happen on a friday ;)22:44
fungiprobably this weekend22:47
fungior first thing in the week22:48
clarkblooking at the two 3.4 http logs and the two 3.5 http logs they do seem reasonably stable within the same version of gerrit22:51
clarkband then the differences between the two versions are fairly minor22:51
clarkbgit-upload-pack used more memory on 3.4 than 3.522:52
clarkbadding an ssh key used more memory on 3.5 than 3.4 but not significantly so, it's a few percent22:53
clarkbssh key handling is the biggest difference I've been able to find between the two22:54
clarkbI guess we need to see the numbers without performance logging disabled22:54
clarkbbecause nothing here is making me concerned yet22:54
