Monday, 2021-08-23

opendevreviewSteve Baker proposed openstack/diskimage-builder master: Move grubenv to EFI dir
opendevreviewSteve Baker proposed openstack/diskimage-builder master: Support grubby and the Bootloader Spec
opendevreviewSteve Baker proposed openstack/diskimage-builder master: RHEL/Centos 9 does not have package grub2-efi-x64-modules
opendevreviewSteve Baker proposed openstack/diskimage-builder master: Add policycoreutils package mappings for RHEL/Centos 9
opendevreviewSteve Baker proposed openstack/diskimage-builder master: Add reinstall flag to install-packages, use it in bootloader
opendevreviewSteve Baker proposed openstack/diskimage-builder master: Add DIB_YUM_REPO_PACKAGE as an alternative to DIB_YUM_REPO_CONF
opendevreviewSteve Baker proposed openstack/diskimage-builder master: Use non-greedy modifier for SUBRELEASE grep
*** ysandeep|away is now known as ysandeep06:32
*** sshnaidm|afk is now known as sshnaidm06:34
*** jpena|off is now known as jpena07:37
*** rpittau|afk is now known as rpittau07:51
gthiemongeHi Folks, opendevreview disconnected from #openstack-lbaas 2 days ago and still hasn't reappeared07:56
opendevreviewOleg Bondarev proposed openstack/project-config master: Update grafana to reflect dvr-ha job is now voting
*** ykarel is now known as ykarel|lunch08:23
opendevreviewJing Li proposed openstack/diskimage-builder master: Add new element rocky
*** ykarel|lunch is now known as ykarel09:36
opendevreviewJing Li proposed openstack/diskimage-builder master: Add new element rocky
*** ykarel is now known as ykarel|afk10:53
*** arxcruz|off is now known as arxcruz11:01
*** dviroel|out is now known as dviroel|ruck11:31
*** jpena is now known as jpena|lunch11:33
*** jpena|lunch is now known as jpena12:29
opendevreviewsean mooney proposed openstack/project-config master: Add review priority label to nova deliverables
opendevreviewsean mooney proposed openstack/project-config master: Add review priority label to nova deliverables
fungigthiemonge: it doesn't stay in channels indefinitely, it joins and parts from them based on a sort of lru order for whatever channels it needs to announce things in next in order to stay under the 120 max joined channel limit for the network12:55
fungiwas there a change uploaded which should have been announced in #openstack-lbaas in the past two days and wasn't?12:55
gthiemongefungi: oh i didn't know that! I think the activity was low during the weekend12:56
fungiif you notice something get pushed which should have been announced and isn't, let us know and we can check logs to see if something broke, but it's been continuing to announce changes in here so i think it's still working normally12:58
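The join/part behaviour fungi describes can be sketched as a small least-recently-used set. This is only an illustration of the idea, not the bot's actual code: the class and method names are hypothetical, and the only figure taken from the discussion is the 120-channel limit.

```python
from collections import OrderedDict


class ChannelLRU:
    """Hypothetical model: track joined channels, part the least recently used when full."""

    def __init__(self, max_channels=120):
        self.max_channels = max_channels
        self.joined = OrderedDict()  # channel name -> None, oldest first

    def announce(self, channel):
        """Mark a channel as used; return the channel parted to make room, or None."""
        parted = None
        if channel in self.joined:
            # Already joined: just refresh its position in the LRU order.
            self.joined.move_to_end(channel)
        else:
            if len(self.joined) >= self.max_channels:
                # At the limit: part the channel that was used longest ago.
                parted, _ = self.joined.popitem(last=False)
            self.joined[channel] = None
        return parted
```

Under this model a quiet channel is simply the first to be parted, which fits the bot leaving #openstack-lbaas after a low-activity weekend.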
opendevreviewAnanya proposed opendev/elastic-recheck rdo: Fix ER bot to report back to gerrit with bug/error report
opendevreviewsean mooney proposed openstack/project-config master: Add review priority label to nova deliverables
opendevreviewJing Li proposed openstack/diskimage-builder master: Add new element rocky
opendevreviewJing Li proposed openstack/diskimage-builder master: Add new element rocky
opendevreviewsean mooney proposed openstack/project-config master: Add review priority label to nova deliverables
opendevreviewAndreas Jaeger proposed opendev/system-config master: Retire openstack-i18n-de mailing list
clarkbfungi: I don't know why these things occur to me on monday mornings immediately before actually starting work but suddenly I was concerned that our test was also backing up to the backup server with the same target backup location as prod. Turns out prod doesn't have backups so this is a "non issue".14:53
clarkbfungi: I think that enabling backups will be simpler once we have finished the upgrade (we'll be sure to do a snapshot though)14:53
fungiyeah, sounds good14:53
clarkbotherwise we'll have to ensure the borg virtualenv install stuff is redone post upgrade (will have to sort that out on lists.o.o)14:53
clarkbAnyway just added that to the checklist of things14:53
fungialternatively we could plan to move the lists.k.i site onto lists.o.o sooner14:54
fungisince they'll be in sync as far as what version of software they're running14:54
clarkbya, though it serves as a good in place upgrade first platform14:54
fungiright, i mean after we upgrade both14:54
clarkboh ya ++14:54
fungibut only needing to back up one server would be preferable14:55
fungiwe'll also want to let them run for a bit before we do that anyway so we can make sure we're not more resource-constrained after the upgrade14:56
fungias we've seen mailman processes get oom-killed on a semi-regular basis14:56
fungi(that may not be due to an under-sized server, might have been a runaway memory leak or similar)14:57
clarkbgood point14:57
clarkbOk time for breakfast and watering the garden now that I don't have to fix up backups immediately.15:02
clarkbinfra-root landing today to get new gerrit point release image published that we can upgrade to would be nice15:02
clarkb is half related to that in that it will test upgrading from our 3.2 image to our 3.3 image. Both of them get updated in the previous change15:03
*** dviroel|ruck is now known as dviroel|ruck|lunch15:17
*** jpena is now known as jpena|off15:36
mordredclarkb: that looks good - any reason I shouldn't land it?15:43
Clark[m]I think they should both be landable15:47
*** ysandeep is now known as ysandeep|dinner15:51
*** rpittau is now known as rpittau|afk15:58
*** ysandeep|dinner is now known as ysandeep16:10
clarkbWednesday will be the 3 weeks after disabling accounts mark for the last batch of gerrit account cleanups. This means I'll plan to do the second step (which is far less reversible) either wednesday or Thursday to complete that batch16:12
clarkbThen the real fun begins as we'll have ~30 accounts that we can do surgery on after communicating with their owners16:13
*** dviroel|ruck|lunch is now known as dviroel|ruck16:19
clarkbZuul is busy this morning. I guess we're seeing OpenStack's feature freeze build up16:35
fungiin positive news, that means we'll get a good scale test of recent zuul/nodepool development16:35
clarkbfungi: Looking at my todo list realistically I'm not sure where time has gone and don't think I'll be able to do the thing for ricolin_ and OID. Do you think that may still be salvageable? I feel like if it were a month later I'd have all the time in the world but with school starting up and trying to get some of my backlog completed it seems undoable in the current timeline16:36
fungino, i'm in the same boat you are. unanticipated (as well as poorly-anticipated) tasks have jumped up my todo list16:37
clarkbalright I'll write ricolin_ an email to follow up on that16:45
fungithanks, it's unfortunate and i'm sorry it looks like i won't have time for it16:56
opendevreviewMerged opendev/system-config master: Update Gerrit images to most recent releases
opendevreviewMerged opendev/system-config master: Test a gerrit 3.2 -> 3.3 upgrade
opendevreviewClark Boylan proposed opendev/system-config master: Preserve zuul executor SIGTERM behavior
clarkbcorvus: ^ updated zuul executor sigterm config change17:04
corvusclarkb: thanks!17:08
clarkbinfra-root restarting gerrit to pick up is probably a good idea. Maybe early afternoon my time when hopefully zuul has caught up a bit?17:29
fungi#status log ze04 was rebooted at 02:16 UTC today due to a hypervisor host outage in that provider, but appears to be running normally17:29
fungimmm, looks like we've lost openstackstatus17:29
clarkbI wonder if the dns issues on that executor are related17:30
fungier, opendevstatus i mean17:30
fungi2021-08-21 17:12:58     <--     opendevstatus (~opendevst@ has quit (Remote host closed the connection)17:30
fungilooks like we updated /etc/statusbot/statusbot.config and the container was restarted at that time17:31
clarkbcouldn't reconnect for some reason?17:32
fungidunno, statusbot_debug.log also ends at the same time17:33
fungii'll try manually restarting the container again17:33
fungiit's joining channels now17:34
fungimaybe something happened when ansible tried to restart it17:34
fungidocker-compose logs was empty though17:34
fungiseems to be done17:36
fungi#status log Restarted the statusbot container on eavesdrop01 just now, since it seemed to not start cleanly immediately following a configuration update at 2021-08-21 17:1217:36
opendevstatusfungi: finished logging17:36
fungi#status log ze04 was rebooted at 02:16 UTC today due to a hypervisor host outage in that provider, but appears to be running normally17:36
opendevstatusfungi: finished logging17:36
opendevreviewClark Boylan proposed opendev/system-config master: Upgrade gitea to 1.15.0
opendevreviewClark Boylan proposed opendev/system-config master: DNM force gitea failure for interaction
clarkbgitea 1.15.0 exists now. I'm going to put a hold for the jobs running against 800516 for human verification. In particular we want to ensure the image files are hosted as gerrit and paste expect them on top of any other gitea verification17:54
clarkboh I should also double check the template files have no delta between rc3 and the release17:57
clarkbyup no template deltas for the templates we update17:58
*** timburke__ is now known as timburke18:14
*** ysandeep is now known as ysandeep|away19:05
clarkbhttps:// LGTM. is being served as expected too which is consumed by the gerrit theme however not at that url. I don't know if updating the url in the gerrit theme (which does do) will require we restart gerrit19:45
clarkbalso the openstack gate queues are getting longer. Looks like they have had some resets :/19:45
*** slaweq is now known as slaweq_19:50
clarkbI'd still like to do the gerrit restart in about an hour or two. I don't think we should combine the gitea and gerrit updates19:52
clarkbMuch better to do the gerrit update and address any issues that might come up and then sort out gitea later even if gerrit needs a restart to see the new url for the theme19:52
fungii concur19:53
fungiand i should be around, though i might be mid-dinner depending on exactly what time19:53
clarkbI'm happy to be flexible if there is a time that works best for you. I'm finishing up some lunch now but should be around the rest of the afternoon19:54
clarkbI was also hoping the gate queues would go in the opposite direction before restarting gerrit but now worry there may not be enough hours in the day (or week) for that :)19:56
fungithey have shrunk a bit in the last few minutes20:02
clarkbI've done a quick update to our meeting agenda. Anything else to add to that before I sned it out?21:06
funginothink i can thing of, sned away!21:24
clarkbAlright, once the current change at the tip of openstack's gate queue lands we should have about 15 minutes to restart gerrit21:28
clarkbinfra-root ^ is now a good time to do that?21:28
fungistanding by21:28
fungigood with me21:28
clarkbok I'll run the pull now21:28
fungiopendevorg/gerrit   3.2       4a2af078eb62   5 hours ago    810MB21:31
fungithat's the one we want i guess21:31
clarkbalso the mariadb image updated as well so I'll do a full down then up -d21:31
fungiwe also ended up with a new mariadb, yeah saw that21:31
fungi mariadb             10.4      59f9f97d14ce   2 weeks ago    400MB21:31
clarkbthe change I'm waiting on is still doing its thing21:32
clarkbthere is also a release-post job21:32
clarkbI should warn the release team I guess21:32
fungigood thinkin21:33
clarkbok looks like I have a window of about 6 minutes21:40
clarkbfungi: you good for me to down then up -d now?21:40
fungigo for it21:40
fungithat should be enough21:40
fungithe next change merging is an approximate time anyway21:40
fungicould wrap up its last job in the next few seconds, you never know21:40
fungiwant me to send a status notice?21:40
clarkbthe web ui loads for me again21:41
fungi#status notice The Gerrit service on has been restarted for a patch version upgrade, resulting in a brief outage21:41
opendevstatusfungi: sending notice21:41
clarkband shows the expected version21:41
fungigood, good21:41
-opendevstatus- NOTICE: The Gerrit service on has been restarted for a patch version upgrade, resulting in a brief outage21:41
fungialso a good test of the status notices for matrix i suppose21:41
clarkbdo we have that working?21:42
fungino idea, finding out now21:42
clarkbI wish I understood these java jvm log rotation errors so that I could clean them up21:43
clarkbI suppose we could stop recording a jvm log entirely now that we're not under memory pressure21:43
fungiyeah, i don't know that we need it now, and we can always turn it back on again if we decide we do21:56
zigoHi! Looks like I haven't received any announce from release-announce since last 15th of June.21:58
zigofungi: Could you check?21:58
clarkbzigo: did you double check that you are subscribed and that the emails aren't getting sent to spam or similar?21:59
zigoclarkb: I just did try to re-register, and mailman sent me a mail saying I was already registered.21:59
zigoclarkb: It's my own mail server, and I well know my antispam settings are kind of super nice for spammers ... :P22:00
zigo(ie: half of the spam can go through... except the *very* obvious spams)22:00
zigoclarkb: I don't think it's possible that *all* of the release-announce emails would go to spam...22:01
* zigo checks anyways22:01
clarkbzigo: 451 Greylisted22:01
zigoclarkb: Nothing wrong with that, is it?22:01
zigo451 means: please retry later ...22:02
zigoclarkb: I confirm: a quick grep in my .SPAM folder shows no release-announce ...22:02
clarkblooks like it does eventually retry and deliver at least some of those 451 error'd messages22:03
clarkbI'm not sure how to find release-announce specific emails in the exim logs. I guess I need to find their entries in the mailman log somewhere first22:03
zigoclarkb: If I'm not mistaken, I'm registered with, so it should go through / that I have whitelisted in my postfix setup.22:04
clarkbzigo: disabling due to bounce score 5.0 >= 5.022:04
clarkbthat happened on june 2922:04
fungican also check mailman's unsub/bounce logs to see if the address got bounce-disabled22:04
fungior check the subscription status22:04
fungiaha, you beat me to it22:04
zigoclarkb: Can you revert this?22:04
zigoMy mail server doesn't bounce... :)22:05
fungizigo: even you can undo it, just have to use the mailman webui to adjust your subscription preferences22:05
clarkbthen in july it says openstack-announce: has stale bounce info, resetting22:05
clarkbbut I guess it didn't undo the disable just reset the count?22:05
zigofungi: To do that, I have to set a password in mailman, right?22:05
fungiunless you recorded it22:06
zigoI haven't set one... :/22:06
fungibut yeah, we can undo it for you too22:06
clarkbI guess if the greylisting persists long enough then eventually we give up and count it as a bounce?22:06
clarkbso in that case it is an issue22:06
zigofungi: That'd be nice, thanks.22:07
zigoI'm ok-ish not having the backlog from last June.22:07
zigoBut usually, the release-announce mails are very helpful, close to a release.22:07
zigo(ie: I can track what package I missed...)22:07
clarkbsure, but we don't decide when you bounce us right?22:08
zigoI don't bounce !!!22:08
zigoI only greylist ...22:08
clarkbmailman disagrees22:08
clarkbright but if an email fails to deliver for $time eventually it is treated as a bounce ? fungi can confirm that22:08
zigoLet me see if I can add more stuff in my Postfix's "my-network" ...22:09
fungiit would have to be a long time22:09
fungibasically the server would need to retry for days, i doubt the greylisting persists that long per delivery22:09
clarkblooks like the counter only increments once per day22:09
fungiyeah, multiple bounces in a day don't add more than one bounce point22:10
fungiand then the bounce points decay over time too22:10
clarkblooks like there were 12 bounces over 5 days22:10
clarkband that is what tripped it22:10
fungibut we only have something like 10 days of mta log retention, so not much chance of getting details from when it transpired22:10
zigoI confirm that I have both debian MX in my whitelist, so I don't get why this is happening.22:10
clarkboh sorry it was over a longer period june 21 to 2922:11
zigoMy greylisting is set at 600 seconds I believe.22:11
zigoOk, so now you've reset the counter, it should be all good to go, no? :)22:12
clarkbthen on july 12 the bounce score had gone back down to 1.0 and on July 29 it was reset entirely. No new bounces since22:12
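The bounce accounting described above can be sketched roughly as follows. This is a simplified model, not mailman's implementation. The 5.0 threshold and the one-point-per-day rule come from the discussion; the class, its names and the 7-day stale window are assumptions, and the gradual decay mentioned above is reduced to a single stale reset.

```python
from datetime import date, timedelta


class BounceTracker:
    """Hypothetical per-subscriber bounce score, loosely following the log."""

    def __init__(self, threshold=5.0, stale_after=timedelta(days=7)):
        self.threshold = threshold      # "bounce score 5.0 >= 5.0" in the log
        self.stale_after = stale_after  # assumed value, not taken from the log
        self.score = 0.0
        self.last_bounce = None
        self.disabled = False

    def record_bounce(self, day):
        """Record a bounce seen on `day`; return whether delivery is disabled."""
        if self.last_bounce is not None and day - self.last_bounce > self.stale_after:
            # Old bounce info is stale: start counting from zero again.
            self.score = 0.0
        if self.last_bounce != day:
            # Several bounces on the same day add only one point.
            self.score += 1.0
            self.last_bounce = day
        if self.score >= self.threshold:
            self.disabled = True
        return self.disabled
```

In this sketch nothing clears `disabled` once it is set, matching the observation that resetting the score did not restore delivery.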
clarkbzigo: I don't know. the counter reset on the 29th of july and there have been emails since.22:12
clarkbI think we have to tell mailman to undisable your account22:12
zigoOh, got an idea: maybe it stopped when the lists were in spamhaus ?22:13
fungiyes, it can be adjusted on the subscription. an admin can either do it on the subscribers list in the webui or via the cli22:13
zigoBecause then, effectively, it has bounced ...22:13
fungizigo: yes, that seems likely to have been the cause22:13
clarkbfungi: but zigo can do it too right?22:13
fungiclarkb: yeah, he'd have to set a password and then log in22:13
zigoHow do I set a password for my account if it's already registered?22:14
fungisame place22:14
zigoOh, I just enter my addr, and set a password?22:14
fungiwell, enter your addr and ask it to send you one, yes22:14
fungizigo: the password reminder at the bottom22:15
zigoOk, thanks !22:16
zigoHopefully, it will send me a password even if it thought my address bounces? :)22:16
clarkbwe can find out :)22:16
zigoGot it ! :)22:17
fungiit should, the bounce outcome is merely to suspend delivery for your subscription22:17
zigoI'm in.22:18
zigoMy account was disabled indeed.22:18
zigoI mean, receiving.22:18
zigoThanks a lot guys, this was very helpful.22:18
fungiyou're welcome, and sorry about the disruption22:19
clarkbfungi: the ipv4 addr was never added to spamhaus right? maybe we should force ipv4 afterall22:20
zigoNot your fault, IMO.22:20
clarkband ya it got added by rackspace in a bulk block iirc22:20
clarkbor someone added rackspace's bulk ipv6 block22:20
fungiclarkb: well, the v4 block was in the policy blocklist, and we filed an exclusion for it with spamhaus22:21
clarkbfungi: recently?22:21
fungia while back22:21
clarkbgot it. The recent thing was ipv6 specific. We didn't have an exclusion preexisting I guess?22:21
clarkbthen when the block was added we got hit?22:21
zigoI've long disabled IPv6 on mail servers: it's sadly more trouble than help.22:21
fungithe recent v6 block was i think spamhaus assuming all addresses in a /64 are controlled by the same entity22:21
fungiwhich isn't the case in service providers who assign individual /128 addresses instead of networks22:22
zigofungi: Not only spamhaus does this. At least Google does that too.22:22
clarkbaha making assumptions about ipv6 allocations that don't hold in the real world22:23
zigoThey all base reputations of /64 blocks, which indeed, is silly, but there's little one can do about it.22:23
fungithe v6 one wasn't the pbl, it was apparently that they added the entire /64 after some address in it started trying to deliver e-mail with a helo of "localhost.localdomain"22:23
fungi(wasn't us)22:23
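A minimal sketch of why per-/64 reputation hurts here, using Python's standard `ipaddress` module. The helper name is hypothetical and the addresses are from the 2001:db8::/32 documentation range, not the real hosts: two unrelated servers that were each assigned a single /128 look identical to a blocklist that keys on the enclosing /64.

```python
import ipaddress


def same_slash64(addr_a, addr_b):
    """Return True if both IPv6 addresses fall within the same /64."""
    # strict=False lets us pass a host address and get its enclosing network.
    net_a = ipaddress.ip_network(f"{addr_a}/64", strict=False)
    return ipaddress.ip_address(addr_b) in net_a


# Two hypothetical tenants sharing a provider /64:
list_server = "2001:db8:0:1::10"
noisy_neighbor = "2001:db8:0:1::beef"
print(same_slash64(list_server, noisy_neighbor))
```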
fungiworkaround would be to get a global /64 for a neutron network i suppose and put a new listserver in it22:24
clarkbor force ipv422:24
clarkbI think we may already do that for google because they don't trust any mail over v622:24
fungii doubt there are any v6-only mailservers anyway, so we should be able to assume not using v6 wouldn't cut out deliveries for any subscribers22:24
clarkbhrm that is a good point though that that is the risk going with that workaround22:25
zigo.oO(The Internet would have been a lot nicer if there was ways so that a v6 address could communicate with a v4)22:30
fungithe problem there is on the v4 side. there's a standard way to encode a v4 address as v6, but no v4-only system is going to be able to communicate outside the v4 address space without some way to map/translate v6 addresses into v422:36
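The standard encoding fungi mentions is the IPv4-mapped IPv6 address (::ffff:a.b.c.d). A minimal sketch with Python's standard `ipaddress` module follows. The helper names are hypothetical and the address is from a documentation range. It also shows the asymmetry: every v4 address has a v6 form, but an ordinary v6 address has no v4 form.

```python
import ipaddress


def to_mapped(v4):
    """Encode an IPv4 address as an IPv4-mapped IPv6 address."""
    return ipaddress.IPv6Address(f"::ffff:{ipaddress.IPv4Address(v4)}")


def from_mapped(v6):
    """Return the embedded IPv4 address, or None if v6 is not IPv4-mapped."""
    return ipaddress.IPv6Address(str(v6)).ipv4_mapped


mapped = to_mapped("192.0.2.1")
print(from_mapped(mapped))         # the original v4 address
print(from_mapped("2001:db8::1"))  # None: no v4 equivalent exists
```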
*** dviroel|ruck is now known as dviroel|out22:37
clarkbre the gerrit restart, it going so quickly makes me much more confident that we can land and just restart gerrit to pick up the theme change if necessary. That means the focus goes back to ensuring the deployed gitea is functional. The test instance is up at
clarkbfungi: ianw: do one of you want to review to update our executor configs to handle the change in sigterm behavior for zuul executors?22:39
clarkbthe depends on change has merged. The current running zuul should ignore the extra config entry then once we've restarted in the future we want to have it there to avoid graceful stops unless explicitly requested22:40
ianwi feel like i looked at that, did it change?22:40
clarkbianw: yes, zuul switched from using an env var to a config file entry to select the behavior22:40
clarkbps1 handled the env var ps2 does the current config file selector22:41
ianwah, right.  cool, lgtm22:42
stevebakerclarkb, ianw: hey it looks like the openeuler mirror is missing some sqlite.bz2 files, which is breaking dib jobs
ianwstevebaker: hrm, thanks for pointing out, let's see ...23:14
stevebakerianw: thanks. Also fedora mirrors sometimes break dib-functests-bionic-python3 without this fix
ianw looks like it is correctly mirroring23:16
clarkblooks like those updates are from today. But I'm not sure what timezone the timestamp belongs to or if that is 24 or 12 hour time :/23:16
clarkbif it is 12 hour time and we assume a utc timezone the update would've been from an hour and a half ago?23:16
stevebakerit was failing yesterday too fwiw23:17
clarkbalso with the rsync updates we rely on them updating files "safely"23:18
clarkbwith debuntu we have reprepro producing complete mirrors each time we publish but with rpm repos we rely on the publishers not updating out of sequence23:18
ianwit looks like we're in sync with
ianwwhich looks like it is out of sync with ""23:19
clarkb oh yup you just found that23:19
stevebakershould be the source instead of ru-repo?23:20
ianw... maybe?23:21
clarkbthe issue was authentication23:21 requires authentication. We would prefer to not be a special mirror because we don't want people using our mirrors as actual distro mirrors23:22
ianwyeah, not sure if that was fixed though23:22
clarkbat least that was the situation when this was set up iirc. They pointed us at that mirror because we didn't need to auth iirc23:22
ianwi think we need to reach out and if we can't resolve in a timely fashion we can override testing23:22
clarkbya we should be able to stop using our mirror as well?23:23
clarkbthough that might also be flaky depending on how things go23:23
ianwyeah, tbh i think "this is turned off until mirror is fixed" is a better situation for everyone23:23
stevebakerI just checked every mirror and ru-repo is the only one missing those files
clarkbkevinz: ^ fyi23:24
fungii'm in favor of disabling those optional jobs until they have working mirrors23:24
ianwxinliang has been the only contact via irc23:24
clarkbianw: oh I thought kevinz  was involved too?23:24
ianwi think so; xinliang doesn't appear signed in23:25
fungii mean, until they're back to having working mirrors23:25
ianwwe can try the admin@ address there too.  i'll send a mail to that.  in the mean time we can comment out the tests23:26
ianwmail sent ... 23:30
opendevreviewMerged opendev/system-config master: Preserve zuul executor SIGTERM behavior
clarkbWe should be ready to restart zuul again whenever that is appropriate for queues and the state of zuul things23:34
*** yoctozepto4 is now known as yoctozepto23:37
