Wednesday, 2022-02-02

opendevreviewSteve Baker proposed openstack/diskimage-builder master: Allow the bootloader element with the default block device
opendevreviewSteve Baker proposed openstack/diskimage-builder master: Revert "Use rpm -e instead of dnf for cleaning old kernels"
*** dviroel is now known as dviroel|out00:59
*** rlandy|ruck is now known as rlandy|out01:23
ianwtime="2022-02-02T01:26:41Z" level=error msg="invalid internal status, try resetting the pause process with \"podman system migrate\": command required for rootless mode with multiple IDs: exec: \"01:28
ianwnewuidmap\": executable file not found in $PATH"01:28
ianwhrm, not sure why this happens on the builders but not in the gate01:28
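The podman error above says the `newuidmap` executable is not on `$PATH`. A minimal sketch of a pre-flight check for that condition follows. The function name `missing_helpers` is made up for illustration, and checking `newgidmap` as well is an assumption based on the Debian/Ubuntu `uidmap` package shipping both helpers.

```python
import shutil

# Helper binaries used by rootless podman to set up multi-ID user namespaces.
# On Debian/Ubuntu both are shipped by the "uidmap" package.
REQUIRED_HELPERS = ("newuidmap", "newgidmap")


def missing_helpers(path=None):
    """Return the helper names that cannot be found on the given PATH string.

    With path=None the current process's PATH is searched.
    """
    return [name for name in REQUIRED_HELPERS
            if shutil.which(name, path=path) is None]
```

Calling `missing_helpers()` inside the builder container would be expected to return a non-empty list while the `uidmap` package is absent.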
*** lajoskatona_ is now known as lajoskatona01:34
ianwthere's a lot of out-of-date packages in the builder image.  i wonder why that is01:36
ianw... i wonder why we have unstable in there? deb unstable main01:37
ianwoh that's right, we need podman from unstable because it uses incompatible calls01:40
opendevreviewMerged opendev/system-config master: Stop mirroring centos-8
ianwwell, installing uidmap/unstable into the builder containers makes it work ... this leaves the question of why it works in the gate still unresolved01:51
fungido we save the package version info in the gate jobs? or maybe the package install stdout01:58
ianwyes it's all a bit jumbled up but it is in the image build logs, e.g.
ianwWARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers 02:06
ianwERRO[0000] invalid internal status, try resetting the pause process with "podman system migrate": command required for rootless mode with multiple IDs: exec: "newuidmap": executable file not found in $PATH 02:06
ianwboth do not occur in the gate02:06
ianw2022-02-01 05:08:55.764517 | ubuntu-focal | Setting up podman (3.4.4+ds1-1) ...02:06
ianw$ dpkg --list | grep podman02:07
ianwii  podman                                           3.4.4+ds1-1                    amd64        engine to run OCI-based containers in Pods02:07
ianwwe are using the same podman in the gate as deployed on the builders02:07
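The comparison made above (the version in the gate's "Setting up podman" log line against the `dpkg --list` row on the builder) can be sketched as follows. This is only an illustration: both function names are hypothetical, and the parsing assumes the whitespace-separated column layout shown in the pasted output.

```python
import re

# Matches apt/dpkg progress lines such as "Setting up podman (3.4.4+ds1-1) ..."
SETUP_RE = re.compile(r"Setting up (\S+) \(([^)]+)\)")


def version_from_setup_log(log_text, package):
    """Return the version from a 'Setting up <package> (<version>)' log line, or None."""
    for name, version in SETUP_RE.findall(log_text):
        # dpkg may qualify names with an architecture, e.g. "libc6:amd64"
        if name.split(":")[0] == package:
            return version
    return None


def version_from_dpkg_list(dpkg_output, package):
    """Return the version of an installed ('ii') package from `dpkg --list` output, or None."""
    for line in dpkg_output.splitlines():
        fields = line.split(None, 4)  # status, name, version, arch, description
        if len(fields) < 4 or fields[0] != "ii":
            continue
        if fields[1].split(":")[0] == package:
            return fields[2]
    return None
```

If both functions return the same string for `podman`, the gate and the builder are running the same package version, which is what was observed here.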
ianwhrm, we run on a focal node in the gate, but nb02 is bionic02:09
fungithat could be it02:22
fungimaybe we just need to upgrade the builders02:22
ianwi'm thinking in-place upgrade is probably fine.  these aren't much more than container runners 02:23
ianwsigh, i in-place upgraded nb02 and ... same thing03:06
ianwbut still, a useful process.  notes03:06
ianw1) re-run base with --flush-cache03:07
ianw2) rm /usr/local/bin/pip* as the python upgrade breaks the global pip install.  removing these indicates to ensure-pip role to re-install via get-pip.py03:07
ianw3) ansible-playbook -v --limit ./playbooks/service-nodepool.yaml03:08
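Step 2 of the notes above (removing `/usr/local/bin/pip*` so the ensure-pip role reinstalls pip) could be sketched like this. The function names and the `dry_run` default are assumptions added for safety; they are not part of the notes.

```python
import glob
import os


def stale_pip_entry_points(bin_dir="/usr/local/bin"):
    """List the pip* files in bin_dir, i.e. what `rm /usr/local/bin/pip*` would match."""
    return sorted(glob.glob(os.path.join(bin_dir, "pip*")))


def remove_stale_pip(bin_dir="/usr/local/bin", dry_run=True):
    """Remove the pip* files in bin_dir and return their paths.

    With dry_run=True (the default) nothing is deleted; the paths are only reported.
    """
    paths = stale_pip_entry_points(bin_dir)
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

Running it first with the default `dry_run=True` shows what would be deleted before anything is changed on the host.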
*** ysandeep|out is now known as ysandeep03:33
opendevreviewAde Lee proposed zuul/zuul-jobs master: WIP/DNM - Test new version of python3
opendevreviewAde Lee proposed zuul/zuul-jobs master: WIP/DNM - Test new version of python3
opendevreviewAde Lee proposed zuul/zuul-jobs master: WIP/DNM - Test new version of python3
opendevreviewAde Lee proposed zuul/zuul-jobs master: WIP/DNM - Test new version of python3
ianwf34 & f35 have built with the equivalent of applied.  i have no idea why :/06:01
ianwnb02 has been upgraded to focal in-place.  i will do nb01/nb03 tomorrow to ensure things are kept consistent06:01
ianwthis way at least our test environment is more equal to production06:02
*** marios is now known as marios|ruck06:05
opendevreviewDr. Jens Harbott proposed openstack/project-config master: Add py36 variant of
*** amoralej|off is now known as amoralej07:22
lourotclarkb, re: master branch existence check, sure I'll have a look, thanks!08:18
amoralejmay i get attention on ?08:53
opendevreviewAlfredo Moralejo proposed opendev/system-config master: Add CentOS SIGs for CentOS Stream 9 to AFS mirrors
*** ysandeep is now known as ysandeep|lunch09:39
*** ysandeep|lunch is now known as ysandeep10:53
*** rlandy|out is now known as rlandy|ruck11:03
*** dviroel|out is now known as dviroel11:15
*** pojadhav is now known as pojadhav|brb12:19
*** pojadhav|brb is now known as pojadhav12:43
*** amoralej is now known as amoralej|lunch12:56
*** amoralej|lunch is now known as amoralej14:01
*** ysandeep is now known as ysandeep|dinner14:43
*** ysandeep|dinner is now known as ysandeep15:24
clarkbinfra-root I suppose today is probably as good as any to land and then double check unattended-upgrades are still functional a day later?15:41
clarkbany objection to me approving that now?15:41
opendevreviewDr. Jens Harbott proposed openstack/project-config master: Add py36 variant of
fungiclarkb: i say go for it15:43
clarkbthanks done15:45
opendevreviewClark Boylan proposed opendev/system-config master: Remove unused puppet modules
clarkbthat's step 0 to retiring repos. I wanted to make sure there weren't any leftovers in system-config that might confuse us. As far as I can tell that is the cleanup we can safely do today16:08
*** pojadhav is now known as pojadhav|afk16:09
*** dviroel is now known as dviroel|lunch16:09
opendevreviewAlfredo Moralejo proposed opendev/system-config master: Add CentOS SIGs for CentOS Stream 9 to AFS mirrors
amoralejfrickler, i left a comment in wrt failing cs9 jobs, it's kind of chicken-and-egg: we need that patch to be merged to get them fixed16:21
amoralejso my expectation is that, once we have updated the configure-mirrors, which is also enabling crb repo we can make them voting16:22
clarkbamoralej: you should be able to stack the changes and flip the change to voting on the child16:23
amoralejthe problem is that those jobs are not using the proposed change in configure-mirrors until we merge it, that's how i get it16:23
amoralejclarkb, don't those jobs use the last committed?16:23
clarkbthey may not pass immediately but you can recheck once the mirror is updated16:23
amoralejthey will use roles from parent?16:23
clarkbamoralej: no they should largely be self testing16:23
amoralejbut those ensure-pip jobs, i.e. are not taking the change in configure-mirror in the same review16:24
amoralejso it seems only changes in ensure-pip role are tested there?16:24
clarkbcurrently there is no relationship between them right?16:27
clarkbyou have to set that relationship16:27
opendevreviewMerged opendev/system-config master: Manage 10periodic and 20auto-upgrades together
opendevreviewMerged openstack/project-config master: Add py36 variant of
opendevreviewAlfredo Moralejo proposed zuul/zuul-jobs master: Make CentOS9 jobs voting
*** ysandeep is now known as ysandeep|out16:28
fungiamoralej: have you confirmed the failing job actually uses the configure-mirrors role and not the other one (the name of which escapes me at the moment)?16:29
amoraleji'd say so but lemme double check16:30
amoralejyes, they do, but they report "WARNING: CentOS mirrors are not supported either by this role yet. The execution of the job will continue without setting up cached mirrors." which is what i'm fixing16:31
amoraleji've piled a new review making the jobs voting, let's see if the child gets the change16:32
amoralejnop, it's just using it from master branch apparently16:35
amoralejso, how may i create that relationship for child reviews to use roles from parent?16:36
clarkbamoralej: fungi: the base job that configures the repos will use merged code. But then there should be role specific tests that use the updated code in a safe manner.16:36
clarkbwhat I was saying is that for the first part you may need to land the changes one at a time in order. But when you do that you can recheck the child change and confirm it is happy16:36
amoralejok, sounds good to me16:37
clarkbAnd reviewers can approve the parents based on the test results from the sandboxed role tests if they exist16:37
clarkbBut this way reviewers understand the relationship is there16:37
clarkbinfra-root that list may not be complete but I think it is a really good start. Can you look over the list and mark any that you think we shouldn't retire? Feel free to add additional repos that we should retire16:38
*** marios|ruck is now known as marios|out16:48
clarkblourot: fungi: re it would be good to avoid doing this with a throwaway project too16:52
clarkbSomething lower priority that won't be affected greatly by working through any issues that pop up is preferable16:53
fungiyeah, part of the problem with trying to find a balance there is that openstack will want its official deliverables to be consistent, so *if* they decide they're going to rename all their master branches to main, then doing it with something that isn't throw-away would be fine. the problem arises that if they do decide they won't be renaming branches in the other repos and whatever it is needs16:56
fungito have its imported main branch renamed to master... also finding a completely disconnected new deliverable which won't need working integration testing with other projects that use different branch naming could be challenging16:56
fungior finding something to import which doesn't need to be working for possibly months while we work through the logistical tests16:56
clarkbI agree. I'm just still minorly annoyed that we've got like 3 copies of all of openstack in our hosting system due to the way packaging repos work and would like to avoid perpetuating that sort of issue if we can16:57
fungiideally we'd test with the sandbox repo, but it already exists, so deleting it cleanly and reimporting is probably not a valid test16:58
fungi(mainly because i don't know how to truly delete it)16:59
clarkbya. There is the delete repo plugin but as you say untested and who knows what it leaves behind17:00
fungii'm definitely open to suggestions. it's just that finding a unicorn which is a project openstack will use eventually so it's not throw-away, but it doesn't need to be usable for the foreseeable future while we test, is tough17:02
fungii do agree it feels "dirty" to make a repo we know we're just going to retire17:03
fungifrom a governance perspective it would probably be fine to just assign it to the tact sig until we're done with it17:03
clarkbfungi: lourot: a compromise might be to use a smaller dummy repo than one with 1.5k commits?17:05
fungiahh, i didn't even look at which one was being imported. but yeah, we don't need something massive17:06
*** dviroel|lunch is now known as dviroel17:12
*** lbragstad9 is now known as lbragstad17:19
jrossercould i get some advice on this
jrosseri feel like this is a catch-22 somehow17:31
fungijrosser: insofar as that we'll need to remove the nodeset reference from all branches at the same time?17:32
jrosserif you look at the topic on that patch there are tons of similar patches that i've been able to merge17:33
jrosserbut i really don't know why this one fails instantly like that, ironically when it's the very job i'm trying to remove /o\17:34
fungiyes, but the nodeset definition was taken out early today (or last night depending on where in the world you are)17:34
jrosserothers are working where i just remove jobs though, i'm kind of confused17:34
fungiwell, other infra-root folk might have an idea i'm not coming up with, but worst case i can bypass zuul to merge those changes directly via the gerrit api and get things unstuck17:35
jrosserit is getting to that point i think17:35
jrosseri'm kind of tired of this after spending nearly two weeks on it17:35
fungiyeah, we didn't want to have it be so sudden, but the centos mirror operators didn't really give us much choice. by the time we realized they'd ripped all the packages out of the mirrors, we'd already copied that to ours17:36
clarkbI think that usually happens because something is overly aggressive about applying configs from one branch to another17:37
jrosserto make merging my job removal patches harder, that mirror removal has also broken all my stable branches due to an unrelated bug17:37
jrosseri have to go do childcare now but i may need some help to progress these17:38
fungiin this case there are two removal changes for openstack-ansible-os_keystone which are failing (victoria and wallaby) so it's presumably something to do with the config in that particular repo17:38
jrosserthe comments that zuul has left on kind of sums up the horror of this17:39
fungiand yes, to clarkb's point, there might be explicit branch matchers in the configs of those branches?17:39
clarkbside note: ubuntu and debian seem to keep the package repos up for what may as well be forever17:42
clarkbit's a bit disappointing that centos-8 gave everyone a month and then sent it. I mean I wanted to clean things up on our side too but even then was willing to work with people to clean it up over time17:42
fungijrosser: when you have a moment to put together a list of which changes you think need to bypass testing in order to be able to get the others running clean, i'm happy to help try and get you unstuck. just a word of warning, any time i think it's safe to bypass testing on an untested change, i inevitably end up merging a new bug in the process :/17:44
clarkbjust as a sanity check I've looked at the zuul scheduler log for that change reporting and it doesn't say anything more17:45
clarkbso zuul isn't hiding any info from us.17:45
fungiyeah, so i guess turning on debug in the patch wouldn't help in that case17:45
clarkbI wonder if the required-projects list for the OSA functional jobs forces all of them to be evaluated17:46
clarkbwhich leads to this sort of problem where you get a chicken and egg17:46
clarkbor maybe the issue is just that the entire os_keystone repo's config across all branches is validated and since you can't land a change to all branches at the same time it leads to this17:47
clarkbcorvus: ^ are either of those possibilities something that you would expect zuul to do?17:47
corvusaroo?  reading17:48
fungitl;dr is opendev had to remove the centos-8 nodeset, it's being referenced in jobs used in multiple branches of a repository, and seems like we have to simultaneously remove it from multiple branches since the per-branch removal patches are erroring with zuul complaining it can't evaluate the configuration17:51
funginotably, the configuration on those branches refer to jobs in another repository, so it seems like a deadlock17:52
corvusi see that zuul is saying you can't remove that job definition because it's used in pipelines of other repos17:52
corvuslike, you have to remove it from the check+gate pipelines of os_manila before you can delete the job definition17:52
fungiclarkb: maybe terrible idea, but what if we reintroduced the centos-8 nodeset and made it based on the centos-9-stream label?17:52
fungier, centos-8-stream label i mean17:53
clarkbfungi: that would probably work. Jobs might fail but they are likely going to do that anyway17:53
fungijobs referencing the label go from being unresolvable configuration to being maybe-broken builds, but unconfiguring those is likely simpler17:53
corvusmaybe i'm missing something, but the change you linked isn't complaining about labels....17:54
clarkbcorvus: that change17:54
corvusoh i was looking at 82457017:55
clarkbthe one with os_manila appears to be in process via and I expect that stack is resolvable17:55
fungiyeah, there are possibly multiple sources of deadlocks here too. restoring the nodeset might resolve some and not all17:55
corvusanyone have a link to the nodeset removal change?17:55
* fungi gets it17:55
fungiyeah, beat me to it17:56
corvusokay, good, zuul did report that would be a problem but for a tenant that doesn't vote on that change :)17:57
fungiyes, zuul did the right things. we knew it was going to lead to breakage in the openstack tenant17:57
clarkbyes and I think we knew it would be a problem but then redhat/centos forced our hand by completely breaking the distro17:57
clarkbI think what we're confused about now is why a change that removes the use of that label fails because it complains about the use of the label17:58
corvusso yeah, i think you should restore the centos-8 nodeset, then unwind from the outside in17:58
fungii've got the change about ready to push17:58
corvusAH that's the question :)17:58
corvuszuul freezes the old and new configuration :/17:58
corvusthat's how the automatic job delta detection works17:59
clarkband it is the old side that is failing17:59
corvusso it probably failed on the old side17:59
corvusthat could probably be made more robust17:59
clarkbfungi: and if you push a revert up too then we can recheck the revert until it doesn't error from the other tenants18:00
corvuswould either just run too many or not enough jobs, but better than failing18:00
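The failure mode described above (both the old and the new configuration are frozen to compute a job delta, and the old side is the one that fails) can be illustrated with a toy model. This is not Zuul's implementation: the config shape, `freeze`, and `jobs_to_run` are all hypothetical. The fallback shown picks the "run too many jobs" option rather than failing.

```python
def freeze(config):
    """Toy 'freeze': resolve every job's nodeset name to a concrete label.

    config is {"nodesets": {name: label}, "jobs": {job: nodeset_name}}.
    Raises KeyError if a job references a nodeset that is not defined.
    """
    return {job: config["nodesets"][nodeset]
            for job, nodeset in config["jobs"].items()}


def jobs_to_run(old_config, new_config):
    """Return the jobs whose frozen form differs between old and new config."""
    # Errors in the proposed (new) configuration still propagate to the caller.
    new_frozen = freeze(new_config)
    try:
        old_frozen = freeze(old_config)
    except KeyError:
        # The old side is already broken, so no delta can be computed.
        # Fall back to running every job in the new config instead of failing.
        return sorted(new_frozen)
    return sorted(job for job, frozen in new_frozen.items()
                  if old_frozen.get(job) != frozen)
```

In this model, a change that deletes a job pointing at a removed nodeset no longer errors: the old config cannot be frozen, so every remaining job runs.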
opendevreviewJeremy Stanley proposed opendev/base-jobs master: Partially revert "Remove centos-8 as it is EOL"
fungii didn't bring back the arm64 nodeset, as i'm not sure it's involved18:00
clarkbfungi: ya I think we can take it one step at a time +2 from me18:01
clarkbIs anyone interested in reviewing the threaddump that corvus  took of gerrit during the git pull slowness? I'm going to try and take a look at it myself today and attach it to the issue we created and will skim it. But I expect it is quite large and a second set of eyes may be a good idea18:04
fungilooking for sensitive data?18:05
fungisomeone copied it out of the container fs right?18:05
fungilooks like it's ~clarkb/dump18:06
*** amoralej is now known as amoralej|off18:08
fungiit does contain client ip addresses18:09
fungionly one, and it seems to be on line 483118:10
clarkbya sensitive info like that or secrets etc18:11
fungii don't see any obvious signs of keys or credential strings or e-mail addresses18:11
clarkblooks like there are usernames18:14
clarkbthe N/A where the IP addr is that you found would be a username if it was authenticated. The ssh threads have usernames but no IPs18:15
clarkbI'll work on scrubbing that18:15
fungioh, yep i see zuul's username for example18:16
clarkbI think we're ok leaving zuul in there. But I'll scrub the other users out18:17
clarkbfungi: I'm working on dump_sanitized in the same dir and have corrected the things we've identified so far18:18
clarkb'H2 File Lock Watchdog' has some file paths in them, but I don't think they are sensitive18:23
fungiyeah, that seems fine18:25
clarkbthere are replication threads that disclose our replication targets but that info is already public18:29
clarkbI suspect that SSH-Interactive-Worker-85 may be the sort of info that gerrit would be looking at to investigate this issue. Though I'm not sure I understand it fully (I also don't think it exposes anything it shouldn't)18:34
clarkbok I've gone over the whole file skimming it and updating things in dump_sanitized. The end result was removing several usernames and one ip address. The diff between the files shows this. fungi maybe you want to look it over and make sure I didn't miss anything in the cleanup?18:43
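The scrubbing described above (removing usernames and an IP address while leaving `zuul` in place) was done by hand. A rough sketch of automating it follows, under stated assumptions: the placeholder strings and the `sanitize` function are invented for illustration, only IPv4 addresses are handled, and the list of usernames has to be supplied by whoever reviews the dump.

```python
import re

# Dotted-quad IPv4 addresses. Deliberately loose: it can also match
# version-like strings such as "1.2.3.4", which is acceptable when scrubbing.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def sanitize(text, usernames, keep=("zuul",)):
    """Replace IPv4 addresses and the given usernames with placeholders.

    Names listed in `keep` are left untouched. Usernames are matched as whole
    words, so names that begin or end with punctuation may be missed.
    """
    text = IPV4_RE.sub("<ip-redacted>", text)
    for name in usernames:
        if name in keep:
            continue
        text = re.sub(r"\b%s\b" % re.escape(name), "<user-redacted>", text)
    return text
```

A diff between the original and the sanitized output remains the practical way to review what was removed, as was done for `dump` and `dump_sanitized` here.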
fungisure, checking18:44
clarkbI went ahead and approved the partial revert of the centos-8 nodeset removal change18:44
fungiclarkb: the sanitized dump lgtm18:51
fungii don't see any more obviously sensitive info in it18:51
clarkbcool I'll give it a few more hours in case anyone else wants to look then can attach it to the issue18:51
fungisounds great18:52
opendevreviewMerged opendev/base-jobs master: Partially revert "Remove centos-8 as it is EOL"
*** artom__ is now known as artom18:53
clarkbI rechecked the change jrosser pointed out and it is gating now19:02
fungijrosser: ^ may be worth rechecking the changes which complained about the missing nodeset, since it's "sort of" back for now19:02
clarkbLooks like it is running different centos-8 jobs still which may have been part of the problem too19:03
jrosseroh cool, thankyou all for taking a look at that19:05
jrosseri know where those other jobs are from, it's this
jrosserthe patch to os_keystone was removing jobs defined and used in that repo19:06
clarkbyup but if it can't run the jobs to check/gate the repo it would still fail I think.  Hopefully the nodeset with the hacked label for stream will get things moving forward19:06
jrosserbut there are project templates defined outside that repo which also carry centos-8 jobs19:06
clarkbI spot checked the unattended upgrades config change and it appears to have applied as expected. Same source file into two different files managed by the different involved packages19:30
clarkblgtm is what I'm saying :)19:30
*** timburke_ is now known as timburke20:57
clarkbinfra-root when do you think we should rip out the gerrit 3.3 images? Has enough time passed?21:02
clarkbI ask because is impacted by that. Maybe land this change first and then do cleanup?21:02
ianwyeah i think we're unlikely to go back now21:03
ianwclarkb: especially since we have a java developer on the team ;)21:03
clarkbianw: if you have time can you check the review02:~clarkb/dump and ~clarkb/dump_sanitized files to make sure we aren't overdisclosing anything if I upload dump_sanitized to the gerrit bug tracker?21:05
clarkbdump is the original and dump_sanitized I've removed some content from21:05
opendevreviewClark Boylan proposed opendev/system-config master: Remove unused puppet modules
clarkbI left one of my local notetaking comments in ^ and it was bothering me. :)21:08
ianwyep, one sec21:11
admin1hi .. is there a nova command to see all ongoing migrations ? 21:16
clarkbadmin1: we run the software development tools for openstack but don't have much experience running openstack ourselves. You'll probably have better luck in #openstack (I don't know the answer to your question)21:17
admin1sorry .. wrong channel :) 21:19
admin1it was meant for #openstack-nova21:19
*** dviroel is now known as dviroel|out21:22
ianwclarkb: thanks for approving through
clarkband thank you for debugging even if we don't fully understand it yet21:24
ianwmight have noticed from scrollback, in desperation i in-place upgraded nb02 yesterday.  i'll do that for the rest of the builders today to keep things in sync21:24
ianwit is cool21:25
ianwnot sure how much more effective it is than irc and an etherpad though21:25
clarkbhaving the graphs overlaid with actions and checkpoints is nice for tracking changes to the system21:26
clarkbI think that is what they are showing the system does21:26
ianw(this comment will probably age like the famous HN "dropbox is stupid, you can do that with a vps and rsync")21:26
* join_subline irc+etherpad masterrace 👑21:36
fungiclarkb: i was about to ask about ripping out gerrit 3.3 images yesterday. seems like it's time to me21:41
clarkbexcellent. I guess if we can land that reordering of the plugins first then we can base changes to remove 3.3 off of that21:42
clarkbfor some reason I thought we had conflicted with that reordering somehow but seems like we haven't21:42
clarkbthe melody addition must be what I thought would do it21:43
opendevreviewRamil Minishev proposed openstack/diskimage-builder master: Make growvols config path platform independent
opendevreviewRamil Minishev proposed openstack/diskimage-builder master: Add grow_gpt to growvols
opendevreviewRamil Minishev proposed openstack/diskimage-builder master: Make xfs_growfs execution back-compatible
opendevreviewClark Boylan proposed opendev/system-config master: Stop building Gerrit 3.3 images
clarkbSomething like that ?21:59
opendevreviewRamil Minishev proposed openstack/diskimage-builder master: Add grow_gpt to growvols
opendevreviewClark Boylan proposed openstack/project-config master: Switch jeepyb over to building Gerrit 3.4 images
opendevreviewClark Boylan proposed opendev/system-config master: Stop building Gerrit 3.3 images
ianwi'm going to do those builders now22:21
opendevreviewMerged openstack/project-config master: Switch jeepyb over to building Gerrit 3.4 images
ianw#status log in-place upgraded nb0<1,2,3> to focal to better match production to our gate test environment23:09
opendevstatusianw: finished logging23:09
opendevreviewClark Boylan proposed opendev/system-config master: Switch nb01 to focal in testing
clarkbianw: ^ updated to match in system-config-run23:30
ianwclarkb: good catch, thanks23:34
opendevreviewMerged opendev/system-config master: Better organize gerrit plugins in job defs
opendevreviewMerged opendev/system-config master: Remove unused puppet modules
