Wednesday, 2021-04-21

*** brinzhang has joined #opendev00:02
fungisent just now00:14
openstackgerritMerged zuul/zuul-jobs master: ensure-docker: ensure docker.socket is stopped  https://review.opendev.org/c/zuul/zuul-jobs/+/78727100:18
brinzhangmordred: hi, I am a cyborg core, we would like to switch to using launchpad instead of storyboard, but the bugfix and/or feature commits cannot be related to it, so they cannot be counted in the statistics at https://www.stackalytics.io/00:23
brinzhangmordred: could you help us, and have a look? I saw https://launchpad.net/openstack here created by yourself ^00:24
fungibrinzhang: maybe you want to revisit the conversation i had with xinranwang in #openstack-infra earlier: http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2021-04-20.log.html00:35
fungihopefully that answers all your questions00:35
brinzhangfungi: ack, let me see, thanks00:37
*** elod has quit IRC00:40
*** elod has joined #opendev00:42
openstackgerritIan Wienand proposed openstack/diskimage-builder master: dib-run-parts: stop leaving PROFILE_DIR behind  https://review.opendev.org/c/openstack/diskimage-builder/+/78730300:42
fungirackspace e-mailed us about an outage for the ethercalc server around 14:00 utc, if someone gets a chance to check in on it01:25
ianwi'm just cleaning up nb03 and old images from the arm64 region rename, hopefully then we can upload raw images to OSU01:45
fungiawesome01:48
ianwhrm, i might have messed up a bit swapping the names01:57
ianwthere's a bunch of deleting images for "linaro-us" "osuosl" (as opposed to "linaro-us-regionone", "osuosl-regionone") but i don't think anything will pick them up to delete them, as there isn't a provider for that now01:57
ianwtwo thoughts are to manually delete the ZK nodes, or put in a provider by that name with no images by hand to see if it picks them up01:59
openstackgerritWenping Song proposed openstack/project-config master: Change cyborg project track to launchpad  https://review.opendev.org/c/openstack/project-config/+/78730602:00
ianwadding a fake provider seems to be deleting the images02:03
*** artom has quit IRC02:06
ianwi don't know why, but nb03 doesn't seem to want to refresh the images for osu02:14
openstackgerritWenping Song proposed openstack/project-config master: Change cyborg project track to launchpad  https://review.opendev.org/c/openstack/project-config/+/78730602:50
*** amoralej|off has quit IRC03:37
*** ykarel|away has joined #opendev04:06
*** ykarel_ has joined #opendev04:10
*** ykarel|away has quit IRC04:12
fungi#status log Removed temporary block of 161.170.233.0/24 in iptables on gitea-lb01.opendev.org after discussion with operators of the systems therein04:39
openstackstatusfungi: finished logging04:39
*** hamalq has quit IRC04:49
*** vishalmanchanda has joined #opendev04:55
*** ralonsoh has joined #opendev05:29
*** ralonsoh has quit IRC05:33
*** ralonsoh has joined #opendev05:36
*** ralonsoh_ has joined #opendev05:55
*** seongsoocho_ has joined #opendev05:55
*** seongsoocho has quit IRC05:55
*** seongsoocho_ is now known as seongsoocho05:55
*** paladox has quit IRC05:55
*** ykarel_ has quit IRC05:55
*** ykarel__ has joined #opendev05:55
*** ralonsoh has quit IRC05:57
*** mlavalle has quit IRC05:57
*** mlavalle has joined #opendev05:58
*** mnaser has quit IRC05:59
*** mnaser has joined #opendev06:00
*** marios has joined #opendev06:03
openstackgerritIan Wienand proposed opendev/system-config master: nodepool-base: prefer ZK IPv6 addresses  https://review.opendev.org/c/opendev/system-config/+/78731306:11
*** sboyron has joined #opendev06:21
*** ykarel_ has joined #opendev06:38
*** ykarel__ has quit IRC06:40
ianwbullseye images should be making their way out06:43
ianwarm64 is uploaded06:43
openstackgerritIan Wienand proposed opendev/system-config master: nodepool-base: prefer ZK IPv6 addresses  https://review.opendev.org/c/opendev/system-config/+/78731306:56
*** slaweq has joined #opendev07:02
*** fressi has joined #opendev07:07
*** avass has quit IRC07:09
*** avass has joined #opendev07:10
*** andrewbonney has joined #opendev07:11
*** whoami-rajat has joined #opendev07:12
*** ralonsoh_ is now known as ralonsoh07:21
*** gothicserpent has quit IRC07:25
*** gothicserpent has joined #opendev07:27
*** amoralej has joined #opendev07:29
*** rpittau|afk is now known as rpittau07:33
*** eolivare has joined #opendev07:41
*** tosky has joined #opendev07:46
*** ysandeep|away is now known as ysandeep07:49
*** brinzhang has quit IRC07:50
*** ykarel_ has quit IRC07:52
*** brinzhang has joined #opendev07:55
*** jpena|off is now known as jpena07:56
*** ysandeep is now known as ysandeep|lunch08:15
*** ykarel_ has joined #opendev08:27
kevinzianw: Good evening :-)08:46
kevinzianw: is ZK running on the node nb03.opendev.org?08:46
ianwkevinz: nb03.opendev.org talks to the zookeeper cluster of zk*.openstack.org hosts for communication from zuul08:52
kevinzianw: OK, so the ZK connection loss happened today? It was working well before, right?08:53
ianwkevinz: it's been a persistent problem.  it drops the connection, but then connects again quickly08:53
hrwianw: cool (bullseye images)! images means nodes. nodes means wheel cache. so soon bullseye be ready for CI jobs ;D09:10
*** ysandeep|lunch is now known as ysandeep09:23
*** ykarel_ has quit IRC09:34
*** lpetrut has joined #opendev09:39
*** slaweq_ has joined #opendev10:14
*** slaweq has quit IRC10:18
*** slaweq_ is now known as slaweq10:18
*** dtantsur|afk is now known as dtantsur10:37
*** jpena is now known as jpena|lunch11:32
*** sshnaidm has quit IRC12:00
*** amoralej is now known as amoralej|lunch12:07
*** sshnaidm has joined #opendev12:07
hrwhm. python3 crashes in centos-8-stream-aarch64 nodes ;(12:27
hrwhttps://5a819d33a93f73917b61-d92e3f8cc209d4a3d1e66263399702fb.ssl.cf2.rackcdn.com/772479/31/check-arm64/kolla-build-centos8s-source-aarch64/6219eb7/job-output.txt12:27
*** jpena|lunch is now known as jpena12:32
fungi(core dumped) python313:04
fungiwe may have more details in the console json13:04
*** amoralej|lunch is now known as amoralej13:05
funginope, job-output.json doesn't have any additional output, it was probably swallowed by the shell invocation13:06
fungilooks like the last success result was 2021-04-16 20:06:1413:20
fungiso this has potentially been failing for almost 5 days13:21
fungihrw: i set an autohold for that job on https://review.opendev.org/772479 and have added a check arm64 comment to get it rerun13:27
hrwok13:28
fungiso once it fails we should be able to install the symbols and load the core in gdb, hopefully13:29
*** artom has joined #opendev13:38
fungihrw: well, the bad news is it doesn't trivially crash, so we'll probably need to try to replicate what ansible is asking python to do14:05
fungilikely there's some c extension getting involved14:06
hrwok. should we disable testing on centos for now?14:08
fungiprobably so until we get to the bottom of this14:09
hrwok. will prepare patch14:09
*** fressi has quit IRC14:09
fungii may need some help from someone better versed in ansible's internals, but it looks like there are several /tmp/ansible_setup_payload_*/ansible_setup_payload.zip we could try unpacking and running to narrow down the cause of the crash14:12
hrwsorry, I barely know how to use ansible so cannot help14:14
fungiyeah, not a problem, i'm sure we've got folks hanging around in the shadows ;)14:14
hrwhttps://review.opendev.org/c/openstack/kolla/+/787375 submitted to stop jobs14:15
fungiif i had to place a wager, i'd say odds favor some non-stdlib ansible dependency has a c extension compiled for the wrong architecture... but at this stage i have no evidence to support that theory14:16
*** lpetrut has quit IRC14:29
johnsomclarkb Many moons ago I requested to delete the xenial "test" image on tarballs.o.o. I think you wanted to wait a week or such before deleting it. Can we delete that now? It hasn't been updated since 2019....14:33
johnsomhttps://tarballs.opendev.org/openstack/octavia/test-images/14:33
funginot sure what the concern with deleting it back then was, but i can take care of it shortly14:35
johnsomThat would be great, thank you14:35
clarkbjohnsom: ah sorry14:39
clarkbI think the concern was it had just been announced? something like that14:39
johnsomNo worries, just following up14:39
hrwfungi: ok. centos stream 8 aarch64 job disabled15:22
fungithanks, once i'm off my current conference call i hope to get back to trying to come up with a python3 invocation to directly reproduce the crash15:30
*** eolivare has quit IRC15:59
*** tkajinam has quit IRC16:07
*** tkajinam has joined #opendev16:07
zigofungi: Just checking (no pressure): do we have bullseye now? :)16:19
*** chandankumar is now known as raukadah16:22
*** hamalq has joined #opendev16:22
*** hamalq has quit IRC16:23
*** hamalq has joined #opendev16:24
fungizigo: i believe so?16:37
fungii've been fairly busy today but i saw ianw mention while i was asleep that we had nodes booting16:38
zigoAh cool ! :)16:42
fungizigo: a quick check of nodepool says we have bullseye images uploaded to our arm64 providers but not amd6416:44
zigoOh ... :/16:45
fungithough we did build an amd64 image16:45
zigoSo, that's for tomorrow?16:45
fungiaha, it was only built an hour ago so may still be uploading16:45
fungier, no, it started building an hour ago16:45
fungino, wait, that's a minute ago16:46
fungidebian-bullseye-arm64 completed building almost 12 hours ago so maybe there's something wrong with the amd64 image build still16:46
*** amoralej is now known as amoralej|off16:51
zigo:/16:51
fungiyeah, insta-failing: https://nb02.opendev.org/debian-bullseye-0000001505.log16:52
fungiErr:13 https://mirror.dfw.rax.opendev.org/debian-security bullseye-security/updates/main amd64 Packages16:53
fungi404  Not Found [IP: 2001:4800:7819:105:be76:4eff:fe04:9b8a 443]16:53
fungithe /updates is the problem16:53
fungiit's just https://mirror.dfw.rax.opendev.org/debian-security/dists/bullseye-security/main/16:54
fungiso we've got something wrong in our mirror entry for the amd64 image builds but not arm6416:56
*** ysandeep is now known as ysandeep|away16:57
*** marios is now known as marios|out16:57
*** jpena is now known as jpena|off17:03
*** ralonsoh has quit IRC17:09
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: ensure-docker: prevent issue on centos-7 where the socket does not exists  https://review.opendev.org/c/zuul/zuul-jobs/+/78742117:12
*** marios|out has quit IRC17:21
*** rpittau is now known as rpittau|afk17:24
*** dtantsur is now known as dtantsur|afk17:40
openstackgerritClark Boylan proposed opendev/system-config master: Add inmotion cloud to cloud launcher  https://review.opendev.org/c/opendev/system-config/+/78742517:40
*** ocsabat has joined #opendev17:41
*** ocsabat has quit IRC17:53
openstackgerritClark Boylan proposed openstack/project-config master: Add InMotion cloud to nodepool  https://review.opendev.org/c/openstack/project-config/+/78742817:56
*** andrewbonney has quit IRC18:01
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: ensure-docker: do not manage the socket on distro centos  https://review.opendev.org/c/zuul/zuul-jobs/+/78742918:01
*** elod has quit IRC18:04
*** elod has joined #opendev18:06
openstackgerritClark Boylan proposed opendev/system-config master: Add inmotion cloud to cloud launcher  https://review.opendev.org/c/opendev/system-config/+/78742518:18
openstackgerritMerged zuul/zuul-jobs master: ensure-docker: prevent issue on centos-7 where the socket does not exists  https://review.opendev.org/c/zuul/zuul-jobs/+/78742118:19
*** vishalmanchanda has quit IRC18:54
*** whoami-rajat has quit IRC18:55
*** lpetrut has joined #opendev19:32
openstackgerritMerged opendev/system-config master: Add inmotion cloud to cloud launcher  https://review.opendev.org/c/opendev/system-config/+/78742519:32
*** lpetrut has quit IRC19:34
*** hamalq has quit IRC19:44
*** hamalq has joined #opendev19:44
corvusinfra-root: i'd like to restart zuul with the latest batch of zk changes (secrets in zk); any objections or points to consider?19:58
fungicorvus: we've got a deploy item in progress we'd like to see finish soonish (deploying credentials for the inmotion hosting cloud)20:02
fungiother than that no concerns on my end20:03
fungiitem 787425 in the deploy pipeline20:04
corvuscool, i'll give that a little bit20:05
clarkbcorvus: airship did finally do their release aiui so should be good on that front (though I've been busy with this cloud stuff today and haven't gotten to zk cluster upgrades yet)20:07
fungi#status log Deleted /afs/.openstack.org/project/tarballs.opendev.org/openstack/octavia/test-images/test-only-amphora-x64-haproxy-ubuntu-xenial.qcow2 as requested by johnsom20:09
openstackstatusfungi: finished logging20:09
corvusclarkb, fungi: fyi the base and letsencrypt playbooks failed; manage-projects is running now.20:18
fungithanks, i'll take a look. we're mostly concerned with the nodepool and cloud-launcher jobs i think, depending on where base failed at least20:20
corvusfungi: buildset complete: https://zuul.opendev.org/t/openstack/buildset/7e90244873494a059d3f2fe3b6aac7f920:24
fungioh poo, all the other prod jobs were skipped, i guess because they're set to depend on infra-prod-base20:24
fungiyep20:24
corvusfungi: actually i think they skipped due to letsencrypt20:26
fungioh, huh...20:26
fungifatal: [mirror01.regionone.limestone.opendev.org]: UNREACHABLE!20:28
fungiyeah, that's the underlying reason for both builds failing20:29
fungii also can't ssh into it20:30
clarkbI think we can probably make cloud launcher run pretty independent of everything else20:30
clarkbbut figuring out limestone first seems like a good idea20:30
fungii can get to the api for limestone, server list shows the server is in SHUTOFF state20:34
fungii've asked it to start, will check its logs if it lets me ssh in now20:34
johnsomfungi Thank you!20:35
fungisomething seems to have asked that mirror to shutdown at 15:27:47 utc today20:38
fungino indication in syslog of what that was20:39
fungilogan-: ^ just a heads up, if you're around, seems like there may have been some trauma in that cloud20:39
fungimay also be ongoing, but it was at least willing to let me start the mirror server back up again20:40
fungi#status log Started mirror01.regionone.limestone.opendev.org, which seems to have spontaneously shutdown at 15:27:47 UTC today20:41
openstackstatusfungi: finished logging20:41
*** slaweq has quit IRC20:46
*** sboyron has quit IRC20:49
clarkbcorvus: if we can let the next hourly run run through the cloud launcher that should finish about 21:22 ish that would be great20:52
fungicorvus: clarkb: seems like it's safe to go ahead with the zuul restart? i can always manually reenqueue the failed deploy after zuul's back up20:52
clarkbfungi: if we do that we may need to enqueue an hourly run?20:52
fungiheh, messages crossed on the wire20:52
fungiclarkb: well, if zuul's restarted right now, the currently incomplete hourly run will be reenqueued20:52
clarkbah yup20:54
clarkbeither way I guess then20:54
fungiit ends up showing a 0 ref instead of refs/heads/master i think, but we observed the jobs still end up using the correct branch20:55
fungior maybe that's fixed more recently20:55
corvus... :)20:56
corvusyeah, i think there's no harm to the inmotion project by restarting zuul now20:57
fungii agree. clarkb?20:57
corvus(it may even speed it up -- all assuming the zuul deploy doesn't explode)20:58
clarkbya unless the job ends before we grab the queues20:58
clarkbbut we can manually add in an hourly run at that point if necessary20:58
corvusi have run zuul_pull21:00
corvushourly just finished21:00
corvusi will save queues now and restart21:00
corvuszuul is currently iterating through all the projects loading keys into zk from the filesystem21:02
fungiawesome21:02
corvusit's probably doing a little more than 5 projects/sec... we might want to think about making that more efficient...21:03
corvusalso...21:03
corvusi think i'd like to restart the scheduler again immediately after it finishes, just to observe a second startup with keys already in zk21:03
fungiseems like a reasonable precaution21:04
fungithe earlier we catch a problem there the better21:04
corvus(after it finishes loading keys into zk -- we don't need to wait for the cat jobs)21:04
clarkb~5 minutes for loading keys I guess? maybe a little less since projects in zuul is < projects in gerrit?21:06
corvus21:06:15,080 - 21:07:42,111 the second time around21:08
clarkbstatus loads now21:11
corvusre-enqueing21:11
corvus#status log restarted zuul on commit 620d7291b9e7c24bb97633270492abaa74f5a72b21:16
openstackstatuscorvus: finished logging21:16
clarkblooks like we didn't end up with an hourly job. How would I go about enqueuing that?21:17
corvusclarkb: i'll do it in a sec21:17
corvusa job uploaded logs: https://4b2e22934a7c49e91cc6-5f33f4a8f6999785c5e66684a945b77a.ssl.cf1.rackcdn.com/783969/37/check/openstack-tox-pep8/2c04510/21:18
clarkbcool thanks, if you can share the command after that would be great :)21:18
corvusi think that means secrets still work :)21:18
corvusclarkb: i'll just let the current re-enqueue finish21:18
corvusclarkb: i snagged a queue dump while the hourly was still running as well, so hopefully it already has the command in it :)21:18
clarkboh got it21:19
corvuszuul enqueue-ref --tenant openstack --pipeline opendev-prod-hourly --project opendev.org/opendev/system-config --ref refs/heads/master21:19
corvusclarkb: that's what the script came up with and looks reasonable to me, so i'll run that21:19
clarkbcool it is enqueue-ref which was one of my big questions21:19
clarkbthen we just tell it to use master21:19
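For reference, the re-enqueue command corvus ran can be assembled programmatically. This is only a sketch: the flags and values are copied from the command shown in the log, while the helper name `enqueue_ref_command` is hypothetical and is not part of zuul or of the queue-dump script mentioned above.

```python
import shlex


def enqueue_ref_command(tenant, pipeline, project, ref):
    """Build the argv for a `zuul enqueue-ref` call (hypothetical helper)."""
    return [
        "zuul", "enqueue-ref",
        "--tenant", tenant,
        "--pipeline", pipeline,
        "--project", project,
        "--ref", ref,
    ]


# Values taken from the command quoted in the log.
cmd = enqueue_ref_command(
    "openstack",
    "opendev-prod-hourly",
    "opendev.org/opendev/system-config",
    "refs/heads/master",
)
print(shlex.join(cmd))
```

Keeping the command as an argument list avoids quoting mistakes if it is later passed to something like `subprocess.run`.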
corvusclarkb: enqueued21:20
clarkbthanks!21:20
ianwfungi: if i read correctly two things that need action are bullseye amd64 builds and centos8 ansible arm64 is somehow crashing?21:25
fungiianw: yeah, sorry, it's been a crazy day. i haven't made much progress on those. i held an arm64 centos7 node to try and work out why ansible gathering facts leads to a coredump21:26
clarkbdidn't ansible have similar issues with a systemd update once21:26
fungithe bullseye amd64 problem is probably a quick fix... just need to correct the security mirror suite name21:26
clarkbsomething about how ansible gloms onto systemd on centos21:26
fungii'll see if i can figure out where/why bullseye-security is misconfigured21:28
ianwsystemd is involved, well i never!21:28
* ianw clutches pearls21:28
clarkbianw: well I don't know that it is just that I seem to recall similar ansible problems that did involve it once21:28
clarkb:)21:28
ianwclarkb: if you have a quick sec https://review.opendev.org/c/openstack/diskimage-builder/+/787303 should stop leaking all those profile dirs i hope21:29
clarkboh neat, I do have time for a review while I wait for the cloud launcher to run21:30
clarkbianw: does bash trap put things in a stack?21:31
clarkb(just wondering if we need to worry about other exit traps being overwritten)21:31
fungithat's one of those things i would whip up a local test to determine21:32
clarkblooks like it overrides according to trap21:32
clarkber man trap21:32
fungigood to know21:32
ianwit should be running the actual scripts in a subshell though21:33
clarkbyup and man trap also says they are reset per subshell21:33
clarkbjust making sure this won't cause issues, it shouldn't as long as no other trap EXIT exists in this particular script21:33
clarkband there isn't another21:34
clarkb\o/ bridge is now able to talk to the inmotion cloud (I can list images for example)21:36
corvusi had to scroll past 4 pages of zuul status before i saw a failing job; i thought something was wrong for a minute :)21:36
clarkbnow just need cloud launcher to configure the things it configures and next step can do a mirror launch21:36
ianwcorvus: if you have a sec, could you look at https://review.opendev.org/c/opendev/system-config/+/787313 to switch zk for nodepool to ipv621:37
ianwprimarily, i'd like to see if this helps nb03 stay in zookeeper.  it will give us a data point that it really is ipv4 that is getting us cut off (as opposed to more general network issues)21:37
ianwbut generally, i think we're ok to use ipv6 for it21:38
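The idea of preferring IPv6 for the ZooKeeper connection can be illustrated roughly as below. This is a sketch only and is not the content of change 787313, which the log does not show; the function name `prefer_ipv6` and the example addresses (taken from documentation ranges) are assumptions.

```python
import ipaddress


def prefer_ipv6(addresses):
    """Return the IPv6 addresses if any exist, otherwise the IPv4 ones.

    Hypothetical helper: it only shows the selection logic, not how
    nodepool-base actually builds its ZooKeeper host list.
    """
    parsed = [ipaddress.ip_address(a) for a in addresses]
    v6 = [str(a) for a in parsed if a.version == 6]
    v4 = [str(a) for a in parsed if a.version == 4]
    return v6 or v4


# Example with documentation-range addresses (assumed, not real hosts).
print(prefer_ipv6(["203.0.113.10", "2001:db8::1"]))
```

Falling back to IPv4 when no IPv6 address is present keeps hosts without AAAA records working.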
ianwfungi: alright, i'm seeing the bullseye build failure.  i don't think i'm quite understanding how things are laid out incorrectly, and why it hasn't failed in the gate21:40
ianwhttps://mirror.dfw.rax.opendev.org/debian-security/dists/ ... should this just be "bullseye", not "bullseye-security"?21:41
openstackgerritIan Wienand proposed opendev/system-config master: debian-security: fix bullseye codename  https://review.opendev.org/c/opendev/system-config/+/78744721:44
clarkbI want to say bullseye changed how security was handled compared to buster21:45
clarkbI would definitely have fungi ack that one since I don't grok all the debianness there21:45
mordredyeah - I agree - I looked at repo paths a few weeks ago and it's definitely laid out a little different for bullseye21:45
mordredthe sources.list in a bullseye container is different too21:46
ianwhttp://security.debian.org/debian-security/dists/21:46
ianwhrm, indeed21:46
fungisorry, trying to juggle too many discussions at once. the "/updates" in the security suite is the problem21:47
fungifor some reason the amd64 image tries to get bullseye-security/updates/main instead of bullseye-security/main21:48
fungi(that would have been correct for buster, but is not correct for bullseye)21:48
ianwi note that http://security.debian.org/debian-security/dists/bullseye-security/updates/ is a recursive link to itself apparently21:49
ianwhttp://security.debian.org/debian-security/dists/bullseye-security/updates/updates/updates/updates/updates/updates/updates/ ... i wonder how far it can go21:49
clarkbcloud launcher failed. I'm looking at it. I think it managed to update security groups though?21:49
ianwclarkb: note that it will fail on osu using the default sdk21:51
ianwuntil https://review.opendev.org/c/openstack/openstacksdk/+/786148 is incorporated21:52
clarkbBadRequestException: 400: Client Error 400: Client Error for url: https://173.231.255.228:9696/v2.0/security-group-rules, Unrecognized attribute(s) 'remote_address_group_id'21:52
clarkbin this case the failure was in the new cloud, but good to know21:52
clarkbits the same issue21:53
clarkbianw: did you run cloud launcher by hand with a different sdk version somehow?21:53
ianwclarkb: yep21:53
ianwsudo ./venv/bin/ansible-playbook -e ansible_python_interpreter=/home/ianw/system-config/venv/bin/python3 -v ./playbooks/run_cloud_launcher.yaml21:54
ianwthat would probably work for you too21:54
clarkbthanks I'll try it21:54
ianwok, so the gate tests don't use our mirror -> https://zuul.opendev.org/t/openstack/build/933e4b208ea44bfbb97ea08e7bd66c96/log/nodepool/builds/test-image-0000000001.log#22721:54
clarkbianw: is that ./venv/bin/ansible-playbook the same as /home/ianw/system-config/venv ?21:54
ianwumm yep, but i think you could probably just use ansible-playbook21:55
ianw(from path)21:55
ianwthat venv just has an old openstacksdk installed21:56
clarkbok I'll try sudo ansible-playbook -e ansible_python_interpreter=/home/ianw/system-config/venv/bin/python3 -v path/to/normal/system-config-check/playbooks/run_cloud_launcher.yaml21:56
ianw++21:57
fungiso to refresh my memory there, openstacksdk released a behavior change which is broken on anything except wallaby neutron?21:58
clarkbthe cloud we're trying to get going now is victoria so certainly seems like it must be a very recent cloud?21:59
ianwi think that is an accurate summary; though i would stand to be corrected if someone dug through the various points the api bits got released22:00
clarkbwe don't notice it against the other clouds because we don't try to change things if they are already up to date22:00
clarkbso we notice on the new clouds as we enroll them22:00
ianwfungi: sorry, i know you're doing other things22:02
ianwi'm seeing upstream job working with22:02
ianwhttp://security.debian.org/ bullseye-security/updates main22:02
ianwand our job failing with22:02
ianwhttps://mirror.dfw.rax.opendev.org/debian-security bullseye-security/updates main22:02
ianwthe difference being, we don't have an "updates" in our mirror, while upstream does22:02
ianw(http://security.debian.org/debian-security/dists/bullseye-security/updates/)22:03
ianwbut upstream's "updates" appears to be a recursive link to itself22:03
ianwwhich is about the state my brain is now in :)22:03
fungiianw: yeah, i think we owe that to reprepro... pretty sure that updates recursive symlink is a temporary workaround debian added on their mirrors to make upgrades less painful22:09
clarkbianw: fungi: there was an issue in the cloud launcher config that I set up where it improperly applied config to osuosl that I wanted against inmotion22:14
clarkbI'm working on a fix, but I think it may have created a network and subnet (but not router) on osuosl's zuul project :/ I'm not sure if this has an impact on our ability to boot functioning instances there yet22:14
clarkbthe keypairs and security groups should be identical so those don't matter22:15
ianwclarkb: probably unlikely as we hard-code the network to use, as they have two, and used fixed ip's on them22:15
clarkbianw: ok good22:15
clarkbI'll get this fixed up version pushed then look at cleaning up the mess22:15
ianwit might be easier to delete via horizon22:16
clarkbya probably since order matters22:17
openstackgerritClark Boylan proposed opendev/system-config master: Set the correct cloud for opendevzuul-inmotion enrollment  https://review.opendev.org/c/opendev/system-config/+/78745222:17
clarkbthat should fix the problem22:17
fungiianw: see evidence of a somewhat mass bug filing trying to get stuff to stop relying on the symlink... https://duckduckgo.com/?q=site:bugs.debian.org+"bullseye+updates+security"22:18
clarkbI won't do the cleanup yet as it will just get recreated until https://review.opendev.org/c/opendev/system-config/+/787452 merges. Since we are explicit about networks I don't expect this is causing functional issues, it's just messy22:20
ianwfungi: but for the immediate issue; our missing symlink is the problem?  i'm not sure what creates that on our mirror22:20
fungiwell, nothing creates it on our mirror, it's not on our mirror22:21
fungii think our sources.list needs to be fixed22:21
fungiin the images22:21
ianwhttps://mirror.dfw.rax.opendev.org/debian-security/dists/buster/updates/ is there?22:21
fungiyes, but not bullseye. they redid how the security repository is organized starting in bullseye22:22
clarkboh wait my patch is still wrong one moment22:22
openstackgerritClark Boylan proposed opendev/system-config master: Set the correct cloud for opendevzuul-inmotion enrollment  https://review.opendev.org/c/opendev/system-config/+/78745222:23
ianwfungi: right.  *our* mirror doesn't have the updates symlink.  but debian-security does -> http://security.debian.org/debian-security/dists/bullseye-security/updates/22:23
clarkbI'm going to manually run cloud launcher again but against 787452's state to get the new cloud set up22:25
fungiianw: yep, and that's apparently a temporary hack to fix mostly regressions in debian's package testing infrastructure and ease upgrades from buster. deployed bullseye systems shouldn't follow that symlink22:26
fungiso either we add a similar hack symlink to our reprepro mirror, and then fix it properly for the subsequent debian release, or we solve it now22:27
fungiianw: what i haven't figured out yet is where we write out the sources.list file when creating those images22:29
ianwdiskimage_builder/elements/debian-minimal/environment.d/10-debian-minimal.bash is what builds it22:30
ianwdiskimage_builder/elements/debian-minimal/root.d/75-debian-minimal-baseinstall is what ends up writing it22:31
corvusianw: 787313 lgtm22:31
fungiaha, now i see why my naive codesearch patterns weren't turning up any hits22:31
ianwcorvus: thanks; OSU also just added an ipv6 address and AAAA records for their API too22:33
ianwlike WOPR i think we've figured out the best way to diagnose ipv4 nat issues is just not to play the game :)22:34
fungiianw: so i think just moving the DIB_DEBIAN_SECURITY_SUBPATH assignment into the bullseye conditional block will probably address this22:34
fungii'll give that a shot22:34
ianwfungi: perhaps we should just hand-edit on nb01 and see, because the gate job isn't setting us up to use our security mirror22:35
fungiyeah i can try that too22:35
ianw(we can fix that too)22:35
fungibut i'll still push a change first just to record what i'm trying22:36
openstackgerritJeremy Stanley proposed openstack/diskimage-builder master: debian-minimal: bullseye: /updates -> -security  https://review.opendev.org/c/openstack/diskimage-builder/+/78745422:40
fungiianw: ^ so that's what i want to try22:40
fungizigo: ^ next time you're around, in case you want to provide more context22:42
zigofungi: I'm here !22:42
* zigo reads22:42
fungizigo: i'm fairly sure that's why the current images are failing to build22:42
clarkbit seems running the cloud launcher against my checkout is insufficient22:42
clarkb(I think var lookup paths may not be as relative as I want in this case)22:42
ianwfungi: ok, just have to do school run, bib22:43
zigofungi: Because of bullseye/updates -> bullseye-security thingy ?22:43
clarkbI think that means I need https://review.opendev.org/c/opendev/system-config/+/787452 in22:43
ianwclarkb: yeah, ansible.conf is probably making it look in global paths22:43
clarkbianw: ya22:43
clarkbZuul has +1'd https://review.opendev.org/c/opendev/system-config/+/787452 if I can get reviews :)22:43
fungizigo: yep, official debian mirrors have added an updates symlink in bullseye-security as a temporary workaround, but reprepro doesn't know to create that (and our sources.list files shouldn't depend on it anyway)22:44
zigofungi: This thing is a major pain in many components, but it's very helpful for Debian users, so they don't mistake between stable/updates and stable-updates anymore ...22:45
zigofungi: My own mirror does *NOT* have a bullseye/updates folder ...22:45
fungiabsolutely, i followed the discussions on the ml when reorganizing the security layout was proposed, it all makes sense. just means we need to do a bit more special-casing (we already did in fact, but it was incomplete)22:45
clarkbthe opendevci side seems to be fine so I will proceed with launching the mirror22:45
zigoSo that symlink is not an official thing at all.22:46
fungiright, if memory serves it was added to work around some autopkgtests and to ease in-place upgrades from buster22:46
fungithere was an mbf for updating packages which had hard-coded the old pattern22:47
fungibullseye d-i will properly write out sources.list without the /updates though, and dib should be made to do the same22:48
openstackgerritMerged opendev/system-config master: Set the correct cloud for opendevzuul-inmotion enrollment  https://review.opendev.org/c/opendev/system-config/+/78745222:56
clarkbssh failed to the mirror I tried launching. I'm going to try booting something by hand now22:56
clarkbI thought horizon might make this easy. I was wrong22:57
zigofungi: Could you point at the DIB code so that I can try to help?23:06
zigoWhich part contains the brokenness ?23:06
ianwzigo: fungi already proposed the fix :) https://review.opendev.org/c/openstack/diskimage-builder/+/78745423:07
zigoBrilliant ! :)23:07
ianwi can try putting this on nb01 and see if it picks it up23:07
zigoThough: "if [ "${DIB_RELEASE}" = "bullseye" ]; then" is probably not the right way to go...23:07
zigoI would have gone the other way around.23:07
zigoif [ "${DIB_RELEASE}" = "buster" ] || [ "${DIB_RELEASE}" = "stretch" ]; then23:08
zigoBecause now, we have a problem with Bookworms ... :)23:08
fungizigo: yep, and if we decide to do a sid image or something23:09
fungii was trying not to disrupt the current logic there, but i agree it needs future-proofing23:09
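The future-proofing zigo suggests (special-case the old releases, default to the new layout) can be sketched as follows. This is an illustration only: the real logic lives in the bash scripts of the diskimage-builder debian-minimal element, and the names `LEGACY_RELEASES`, `security_suite` and `security_sources_line` are hypothetical.

```python
# Releases named in the log as still using the "<release>/updates" layout.
LEGACY_RELEASES = {"stretch", "buster"}


def security_suite(release):
    """Return the apt suite for the Debian security archive of a release.

    Hypothetical helper: old releases keep "<release>/updates", everything
    else (bullseye and later) gets "<release>-security".
    """
    if release in LEGACY_RELEASES:
        return f"{release}/updates"
    return f"{release}-security"


def security_sources_line(mirror, release):
    """Format one sources.list line for the security archive."""
    return f"deb {mirror} {security_suite(release)} main"


print(security_sources_line(
    "https://mirror.dfw.rax.opendev.org/debian-security", "bullseye"))
```

Listing only the legacy releases means a later release such as bookworm picks up the `-security` form without another code change.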
clarkbthe issue with launching the new mirror seems to be that focal has decided that you cannot ssh in as root and must use ubuntu now?23:17
clarkbThat said when I manually booted an instance root seemed to work, so I wonder if this is impacted by cloud-init somehow23:17
ianwclarkb: you know i think i hit that launching the osu mirror, hand-edited launch.py and thought "that's weird" and forgot about it23:22
ianwfungi: ok, bullseye is building further along now on nb02 with your change applied23:22
clarkbianw: I think I figured it out. It's managed by the ssh key data. The reason my manual boot worked is that our ssh key for infra root is a number of keys and the way they break you is by prefixing the expected single key with a command23:23
clarkbianw: I think if we modify launch-node.py to put the pubkey in twice it would work23:23
clarkbwhich is probably my favorite hack of the last little while23:23
clarkbI'm going to try this23:23
clarkbbut I need to cleanup my instance first23:23
clarkboh we may actually already handle this properly23:26
clarkbjust need to remove root from the front of the attempted list23:27
clarkbstuff seems to be moving now that i did ^23:28
clarkbI think we need to properly catch the error there and check the next one in the list though23:29
clarkbthat is the proper fix23:29
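The "catch the error and check the next one in the list" approach clarkb describes might look roughly like this. It is a sketch under assumptions, not the code in launch-node.py or in change 787461: `LoginRefused`, `first_working_user` and the fake login function are all hypothetical stand-ins.

```python
class LoginRefused(Exception):
    """Hypothetical error raised when a login attempt is rejected."""


def first_working_user(users, try_login):
    """Return the first user for which try_login succeeds, else None.

    try_login is any callable that raises LoginRefused on failure.
    """
    for user in users:
        try:
            try_login(user)
        except LoginRefused:
            continue  # this account is refused, move on to the next one
        return user
    return None


def fake_login(user):
    # Stand-in for a real ssh attempt: pretend the image refuses root.
    if user == "root":
        raise LoginRefused(user)


print(first_working_user(["root", "ubuntu"], fake_login))
```

With this shape the order of the candidate list no longer matters for correctness, only for how many attempts are made.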
*** hamalq has quit IRC23:34
openstackgerritClark Boylan proposed opendev/system-config master: Add iad3.inmotion mirror node  https://review.opendev.org/c/opendev/system-config/+/78745623:37
clarkbI need to get dns address before ^ lands23:39
clarkbworking on dns now23:39
*** tosky has quit IRC23:42
openstackgerritClark Boylan proposed opendev/zone-opendev.org master: Add inmotion mirror to DNS  https://review.opendev.org/c/opendev/zone-opendev.org/+/78745923:43
clarkbok I think DNS first then inventory then we should be good23:44
fungithanks! reviewing23:48
openstackgerritClark Boylan proposed opendev/system-config master: Handle focal's insistence we don't use root in launch-node.py  https://review.opendev.org/c/opendev/system-config/+/78746123:53
clarkbthis last change isn't tested, but I think that is the correct fix to launch node for our root vs ubuntu issues23:53
clarkboh the ns update will be stuck behind the cloud launcher fix too so this might not move too quickly23:53
