corvus | naturally that failed; so i'm going to set another autohold and run again | 00:01 |
---|---|---|
mnasiadka | clarkb: Right, any idea when that might happen? I don’t think DIB is managed in openstack/releases? | 05:39 |
ianw | mnasiadka: dib releases are manual - clarkb happy to make one if you wish | 07:46 |
kevko | Hi, anybody know what is happening with pypi mirror ? | 11:36 |
kevko | Connection to mirror.bhs1.ovh.opendev.org timed out. (connect timeout=60.0)')': /pypi/simple/setuptools/ | 11:36 |
ianw | it does look like mirror.bhs1.ovh.opendev.org is not responding | 11:57 |
ianw | it reports as active from api | 12:02 |
kevko | can anybody check it please ? | 12:06 |
ianw | i can't see the console | 12:16 |
fungi | confirmed, i can ssh into mirror.gra1.ovh but not mirror.bhs1.ovh, checking the nova api | 12:16 |
fungi | server show says it's active, console log show is taking a while to return | 12:18 |
ianw | mirror03.gra1.ovh.opendev.org console works | 12:18 |
ianw | fungi: never returns for me; and also trying to get to it via the OVH mgmt website throws an error | 12:19 |
fungi | console url show also seems to be timing out | 12:19 |
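For reference, the checks described above map onto standard openstackclient commands; a minimal sketch, using the mirror's server name as identified later in this log:

```shell
# What state does nova think the instance is in?
openstack server show mirror02.bhs1.ovh.opendev.org -f value -c status

# Dump the serial console log (the call that was hanging here)
openstack console log show mirror02.bhs1.ovh.opendev.org

# Request a URL for the interactive console
openstack console url show mirror02.bhs1.ovh.opendev.org
```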
ianw | fungi: want me to reboot it, see what happens? i think this is an ovh problem | 12:20 |
fungi | ianw: should we try to do a server reboot? | 12:20 |
fungi | yeah, agreed | 12:20 |
fungi | go for it | 12:21 |
fungi | shows it's in a reboot state | 12:21 |
ianw | yeah, if console doesn't come back maybe a full stop/start | 12:22 |
fungi | https://public-cloud.status-ovhcloud.com/ doesn't indicate any widespread issue in that region at least | 12:23 |
fungi | if we can't get it restarted, we can temporarily turn down that nodepool region | 12:24 |
ianw | sigh, still rebooting, and can't stop it if it's rebooting | 12:25 |
fungi | well, stop would probably have failed similarly | 12:25 |
fungi | i've set max-servers to 0 there for the moment | 12:29 |
fungi | i'll add nl04 to the emergency disable list while i put a change together | 12:29 |
ianw | ++ | 12:33 |
opendevreview | Jeremy Stanley proposed openstack/project-config master: Temporarily disable OVH BHS1 launching in nodepool https://review.opendev.org/c/openstack/project-config/+/931777 | 12:34 |
fungi | infra-root: ^ | 12:34 |
fungi | once that merges i'll take nl04 out of the emergency disable list | 12:35 |
Clark[m] | ianw: I think a dib release should be fine. There have only been a few commits since the recent release | 13:45 |
Clark[m] | fungi: the mirror responds to http(s) for me now | 13:50 |
opendevreview | James E. Blair proposed opendev/zuul-jobs master: Finish upload job https://review.opendev.org/c/opendev/zuul-jobs/+/931355 | 14:01 |
corvus | apparently it needs the segment-size argument ^ | 14:02 |
corvus | i thought that was automatic based on my previous testing, but for some reason, this time it just did one stream | 14:03 |
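For context, the swift CLI only splits an upload into concurrent segment streams when a segment size is given; a hedged sketch (container name, object name, and size are illustrative):

```shell
# Without --segment-size the object goes up as a single stream;
# with it, swift uploads segments in parallel and writes a manifest object
swift upload --segment-size 1073741824 images image.qcow2
```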
fungi | Clark[m]: oh! excellent, i was afk for a few but will check to see if they updated the ticket | 14:03 |
corvus | Clark: fungi ^ if you have a sec to re-review 931355 i think we can try again :) | 14:03 |
fungi | yep, on it | 14:04 |
fungi | lgtm! | 14:05 |
fungi | ovh hasn't replied to the ticket yet as far as i can see, so we should probably wait a bit for a post-mortem before we assume it's staying up | 14:07 |
fungi | also ssh is still timing out for me | 14:08 |
fungi | which is odd since https is working | 14:08 |
corvus | fungi: ssh wfm | 14:09 |
corvus | ipv4 | 14:09 |
fungi | yeah, socket timeout reaching 22/tcp over ipv6 | 14:09 |
fungi | v4 is indeed working | 14:09 |
fungi | aha, 443/tcp isn't reachable over v6 either | 14:10 |
fungi | so ipv6 connectivity to the mirror is still broken | 14:10 |
fungi | the server never rebooted either, "up 321 days" | 14:10 |
fungi | and yes, it's still stuck reporting "reboot" status according to nova | 14:11 |
fungi | i'll try rebooting it from the cli since i can reach it over v4 | 14:11 |
fungi | worth noting, the v6 default route on the node is still there, and through a gateway that's marked reachable in its neighbor table | 14:12 |
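The route and neighbor checks fungi describes look roughly like this when run on the mirror (a sketch; the gateway address is the one quoted later in this log):

```shell
# Is the default IPv6 route still present?
ip -6 route show default

# Gateway state in the neighbor table (REACHABLE/STALE/FAILED)
ip -6 neigh show | grep -i 2607:5300:201:2000::1

# Does the gateway answer ping from the server?
ping -6 -c 3 2607:5300:201:2000::1
```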
corvus | i just lost a connection to bridge | 14:14 |
fungi | sorry, that was me :/ | 14:14 |
corvus | whew. good to clean out the cobwebs every now and then anyway :) | 14:14 |
fungi | yeah, it's on its way back up now | 14:14 |
fungi | and up again | 14:15 |
fungi | #status log Rebooted bridge01 | 14:15 |
opendevstatus | fungi: finished logging | 14:16 |
fungi | at least we're running on a new kernel that way | 14:16 |
fungi | apologies to anyone who had something running there that i accidentally interrupted! | 14:16 |
fungi | so anyway, mirror02.bhs1.ovh has a default v6 route through 2607:5300:201:2000::1 and that's reachable and responding to ping from the server | 14:18 |
fungi | when i traceroute6 from home to the mirror, the last hop that responds is an address in an ovh assignment | 14:21 |
fungi | it's a small (/44) allocation, but unfortunately it has a very generic netname in whois so i don't know how far into their network that really is | 14:22 |
Clark[m] | Is it possible that v6 was the problem all along? | 14:23 |
fungi | entirely possible, i didn't think to try v4 networking, maybe ianw didn't either | 14:24 |
fungi | tracerouting in the opposite direction from the mirror to my home v6 address, it gets two hops through ovh's network and then stops | 14:24 |
fungi | i'll try a command-line reboot from the server now, but have an increasing suspicion it will come back into the same state | 14:25 |
stephenfin | fungi: clarkb: Would one of you be able to change the topic of the #openstack-sdks channel for us. We'd like it to point back to launchpad rather than storyboard | 14:26 |
Clark[m] | It wouldn't surprise me if the control plane is trying to use ipv6 too and failing, so the state changes never occur/register | 14:27 |
stephenfin | fungi: ...and I see you just replied on #openstack-sdks. Sorry for the noise | 14:27 |
corvus | should we consider omitting inactive repos from the on-image cache? | 14:29 |
fungi | i thought we did, i guess we only skip them if they're using the retired acl? | 14:29 |
corvus | oh that may be, i was just assuming based on names i see scrolling by; but yeah, maybe some of these just aren't actually retired | 14:30 |
fungi | we should at least consider also skipping any that aren't in any zuul tenant | 14:30 |
Clark[m] | Yes we should skip any with retired acls. Worth double checking though | 14:30 |
fungi | since we've been removing repos with broken job configs from tenants | 14:30 |
corvus | yes, we do have many fewer repos on the image than in projects.yaml | 14:31 |
fungi | it's also possible that when openstack got its own separate retired project acl, we didn't add it to the list of what to skip | 14:31 |
corvus | so i assume it's working, but we might be able to eke out a small improvement if we add in some other heuristics like zuul tenancy | 14:31 |
corvus | if acl and os.path.basename(acl) == 'retired.config': | 14:32 |
fungi | so, rebooting mirror02.bhs1.ovh neither fixed its v6 reachability nor did it clear the reboot status nova is reporting for it | 14:33 |
corvus | i think that retired.config works for both | 14:33 |
fungi | it remains reachable via ipv4 after the reboot however | 14:33 |
fungi | i'll update the ovh trouble ticket in a few minutes with latest findings | 14:34 |
clarkb | I guess we should still land the nodepool config change since no ipv6 is a problem for job success? | 14:37 |
clarkb | fungi: if you agree ^ maybe go ahead and +A the change? I +2'd it but didn't approve in case we think we can just return it to service somehow | 14:38 |
mnasiadka | ianw, clarkb : I assume we want to hold off with merging https://review.opendev.org/c/openstack/diskimage-builder/+/924421 with that planned release to fix locale and then do a major one after this lands? | 14:39 |
clarkb | mnasiadka: yes that seems like a good plan. Do a quick bugfix release for locales then plan to do a bigger release with the potentially breaking for users change to the rocky element | 14:41 |
clarkb | I don't think that will affect opendev because we always specify vm but other users may be impacted | 14:41 |
opendevreview | James E. Blair proposed opendev/zuul-jobs master: WIP: testing https://review.opendev.org/c/opendev/zuul-jobs/+/931347 | 14:44 |
corvus | clarkb: fungi ^ i'd like to see if we can use ansible env variables to hold the credential information and make it safe to run without no_log -- can you double check that test change before i run it through its paces? | 14:46 |
fungi | lookin | 14:46 |
corvus | (i want to run the job and then see if the fake credential strings show up in any of the logs) | 14:47 |
clarkb | corvus: you seem to have overridden the secret in its entirety so I don't think there is any way for that to leak info | 14:49 |
clarkb | or rather leak sensitive info. We don't know if it will leak the public test data | 14:49 |
opendevreview | Doug Goldstein proposed openstack/project-config master: Update ironic ACL for editHashtags https://review.opendev.org/c/openstack/project-config/+/931799 | 14:50 |
opendevreview | Mohammed Naser proposed zuul/zuul-jobs master: Stop using temporary registry https://review.opendev.org/c/zuul/zuul-jobs/+/931713 | 14:55 |
corvus | clarkb: yep, thanks; just wanted more eyes on that to make sure i didn't miss something :) | 14:57 |
corvus | i'll send it and we can see what happens | 14:58 |
fungi | yes, looks like a safe test | 14:59 |
fungi | and then we can examine the ansible output/manifest to see if the overridden strings show up anywhere | 14:59 |
opendevreview | Merged opendev/zuul-jobs master: Finish upload job https://review.opendev.org/c/opendev/zuul-jobs/+/931355 | 15:03 |
fungi | merge failed | 15:03 |
clarkb | fungi: merge of what failed? | 15:04 |
fungi | Error merging gerrit/opendev/zuul-jobs for 931347,11 | 15:04 |
fungi | the wip child of the change that just merged | 15:04 |
fungi | so we didn't get an actual build in gate to inspect | 15:05 |
fungi | outdated parent? | 15:05 |
clarkb | I think they may have been disconnected in git so not actually sharing a relationship to resolve conflicts | 15:05 |
fungi | ah | 15:05 |
fungi | yes, correct. they merge-conflicted in gate | 15:06 |
opendevreview | Merged openstack/project-config master: Update ironic ACL for editHashtags https://review.opendev.org/c/openstack/project-config/+/931799 | 15:06 |
fungi | so i guess it needs a rebase on the current branch tip | 15:06 |
fungi | mirror.bhs1.ovh is reachable over ipv6 again! | 15:10 |
clarkb | I wonder if ipv6 coming back allowed the nova status to reconcile too | 15:11 |
fungi | status is still "reboot" in nova though | 15:11 |
opendevreview | James E. Blair proposed opendev/zuul-jobs master: WIP: testing https://review.opendev.org/c/opendev/zuul-jobs/+/931347 | 15:16 |
opendevreview | James E. Blair proposed opendev/zuul-jobs master: WIP: testing https://review.opendev.org/c/opendev/zuul-jobs/+/931347 | 15:17 |
corvus | okay i think that dtrt | 15:18 |
corvus | oh! i forgot something in the real change -- the artifact was returned from the role i removed, so i need to add that back. | 15:18 |
corvus | oh here's a pickle -- we were using sdk to get the endpoint to construct the url of the image we uploaded; i wonder if the swift cli can provide that information | 15:21 |
timburke | corvus, running `swift stat -v <container> <obj>` should give you the full URL of the object (among a bunch of other info) | 15:24 |
corvus | timburke: perfect thanks! | 15:24 |
timburke | if you're just interested in the OS_STORAGE_URL, you can get that with `swift auth` | 15:24 |
corvus | oh yeah that'll work too | 15:25 |
corvus | neither of these are hard to parse -- but json output might be cool to have someday :) | 15:26 |
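A sketch of the two commands timburke mentions (container and object names are illustrative):

```shell
# Prints the full object URL along with ETag, size, and other headers
swift stat -v images image.qcow2

# Prints just the storage URL and token as shell exports, e.g.
#   export OS_STORAGE_URL=https://swift.example.com/v1/AUTH_...
#   export OS_AUTH_TOKEN=gAAAA...
swift auth
```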
timburke | wrote up https://bugs.launchpad.net/python-swiftclient/+bug/2083948 | 15:36 |
corvus | ++ | 15:36 |
opendevreview | James E. Blair proposed opendev/zuul-jobs master: Return image artifacts https://review.opendev.org/c/opendev/zuul-jobs/+/931815 | 15:59 |
corvus | clarkb: fungi https://zuul.opendev.org/t/opendev/build/fd8e5201483c4f5688f1368774f50885 the only "leak" i see is the output of the credential id in that error message; i think we decided we weren't very worried about that. i don't see "testcredentialsecret" in any of the output files, and i don't see testcredentialid anywhere except in those error messages, so i'm inclined to think this approach should be safe. | 16:03 |
corvus | if you agree, then https://review.opendev.org/931815 incorporates that, along with getting the url and checksums for returning the artifact. | 16:04 |
fungi | yeah, not long ago i was wondering really how sensitive we found that id, and a quick grep of irc logs indicates we've pasted urls with the id in them with some regularity | 16:05 |
fungi | since it appears as part of a url for at least some systems, i wouldn't consider it worth trying to keep secret | 16:06 |
clarkb | ya the only other thought I've got is wondering what the scope of that credential is. | 16:06 |
clarkb | If it does leak because ansible or swiftclient change, then what is the impact | 16:06 |
clarkb | is it bad enough that we want to try the dedicated user with acl thing and see if we can make that work instead or is the application credential scoped to that service and region already so maybe we care less? I don't know | 16:07 |
fungi | well, to be clear, i meant the parent account id is scattered all over the place, so worrying about a sub-credential id is fairly pointless | 16:07 |
clarkb | fungi: this isn't currently a sub credential aiui fwiw | 16:08 |
clarkb | but I agree that it seems like any id is fine to expose | 16:08 |
fungi | yes, passwords and api keys are what we should be worried about guarding | 16:09 |
corvus | oh the application credential that is the subject of this secret is not our main credential. it is an "application credential" that i created just for uploading images | 16:10 |
corvus | i don't think it's the thing that shows up in the url | 16:10 |
clarkb | corvus: right the thing in the url should be the id portion. What I'm curious to know is if that credential is a global one for the account | 16:11 |
clarkb | I don't know what kind of scoping it has if any | 16:11 |
corvus | yes it is global for the account; the scoping that it has seems pretty coarse | 16:11 |
clarkb | in that case I think my inclination would be to be careful here even if that means continuing to no log things | 16:11 |
corvus | there were like 6 things like "creator" "reader" and some others. no idea what any of them mean. | 16:12 |
clarkb | maybe rewrite them so that we can manually remove the no log to aid in debugging later while still likely being safe? | 16:12 |
fungi | i'm still not especially worried about the application credential id as long as there's still a strong api key or password we're not exposing, but i understand the hesitance | 16:13 |
corvus | clarkb: which part concerns you? having them as env variables? potential leaking of the id? or potential leaking of the secret? | 16:13 |
clarkb | corvus: potential leaking of the secret without no_log if, say, exception handling does the wrong thing | 16:14 |
fungi | in unrelated news, mirror02.bhs1.ovh.opendev.org is Status:ACTIVE again, so probably safe to put that region back into use but i've seen no reply on our trouble ticket about it yet | 16:15 |
jrosser | it is possible to add access rules to an application credential (different to keystone roles), to limit their usability to particular apis | 16:16 |
fungi | looks like the mirror has been up since the manual reboot i performed, so whatever got corrected was only on the backend | 16:16 |
corvus | clarkb: okay, if we're worried about command line clients outputting secrets, then i have no counter to that. but if so, then we should probably not use the environment variables at all since anything could access it and print it. that includes that "swift stat" command i just added. | 16:16 |
clarkb | corvus: historically openstack hasn't been very good about this. Openstack client very explicitly has tried to not leak things that way but historically the other client tools have not | 16:17 |
corvus | jrosser: possible for a user or a cloud admin? | 16:17 |
clarkb | unfortunately, we're not able to use the openstackclient here without hitting other bugs so... | 16:18 |
jrosser | corvus: you can do that as a user https://docs.openstack.org/keystone/latest/user/application_credentials.html#access-rules | 16:18 |
jrosser | what is difficult with it though, is for more complex things, perhaps server create, that you just have to know that glance/neutron/<other> are also involved and those have to be allowed as well | 16:19 |
clarkb | in this case it would just be for swift so maybe we're lucky with a simple case? | 16:19 |
corvus | jrosser: neat. i did not see that in the web ui i used to create the cred; but we could try doing that from the cli. | 16:19 |
jrosser | could well be, yes.... simple things will be much more straightforward | 16:20 |
fungi | fwiw, we've considered any leak of client credentials in client/lib projects a severe security vulnerability unless it occurs when debugging options are enabled | 16:20 |
jrosser | if you're concerned that the application credential is too powerful then access rules are useful, but I have found them to be quite difficult to configure | 16:20 |
clarkb | fungi: but aiui they still occur with non zero frequency in logs/crash handling? | 16:21 |
fungi | they shouldn't, unless it's in debug level log entries | 16:21 |
fungi | and even then, projects have moved toward trying to mask them (there's specific functionality for that in oslo.log now since years) | 16:22 |
corvus | (if we are concerned about this i could have saved some time and not run that test since this is not falsifiable with a test) | 16:22 |
clarkb | swiftclient doesn't use oslo.log I don't think | 16:22 |
opendevreview | James E. Blair proposed opendev/zuul-jobs master: Return image artifacts https://review.opendev.org/c/opendev/zuul-jobs/+/931815 | 16:24 |
corvus | okay, there it is with no env variables and no logging i think | 16:25 |
clarkb | ah but it does use keystoneauth1 which does use oslo.log? anyway, if we scope things I feel more comfortable dropping the no_log; if we're not scoping then I think we should be careful /me reviews the change above | 16:26 |
opendevreview | James E. Blair proposed opendev/zuul-jobs master: Delete images after 72 hours https://review.opendev.org/c/opendev/zuul-jobs/+/931819 | 16:28 |
clarkb | corvus: re the use of async: does that help if we're still doing things serially task by task on a single node? (I'm just trying to understand the benefit to async in the hash tasks) | 16:28 |
corvus | yes, those get backgrounded and then we check the result after the upload | 16:28 |
clarkb | oh I see we check for them later | 16:29 |
clarkb | got it, that's special poll=0 behavior | 16:29 |
corvus | yup | 16:30 |
clarkb | looking at that application credential acl stuff and thinking about jrosser's feedback I wonder if openstack could provide some "recipes" for that | 16:38 |
clarkb | like one for "create server" and another for "swift usage but nothing else" | 16:38 |
clarkb | but I think we could set up swift acls by setting path to /**, service to object/swift/whatevertheofficialserviceis, and then create a rule like that for one each of HEAD,PATCH,GET,POST,PUT ? | 16:39 |
fungi | i've removed nl04 from the emergency disable list, set max-servers back to 120 for ovh-bhs1, abandoned 931777 and closed the ticket in ovh about it | 16:39 |
clarkb | if we wanted to be even fancier we could scope it to the specific container using better path rules | 16:40 |
clarkb | but I think even just "this can only do swift api actions" is a big improvement | 16:40 |
corvus | do we have to specify the method? can we use * instead of GET? | 16:43 |
clarkb | corvus: the docs that were linked only mention wildcards being valid for path not method. But maybe the docs are incomplete? | 16:43 |
jrosser | here is what we ended up with for server create https://paste.opendev.org/show/bLejT3tuJqaUusE61tX4/ | 16:45 |
clarkb | oh swift probably uses DELETE too | 16:45 |
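Assembling clarkb's sketch into a concrete command, a swift-only application credential might look something like this (hedged: "object-store" as the service type is an assumption, as is the credential name, and the linked docs don't say whether the method field accepts wildcards):

```shell
# Hypothetical swift-only application credential; each rule allows one
# HTTP method against any object-store path
openstack application credential create image-uploader --access-rules '[
  {"service": "object-store", "path": "/**", "method": "HEAD"},
  {"service": "object-store", "path": "/**", "method": "GET"},
  {"service": "object-store", "path": "/**", "method": "PUT"},
  {"service": "object-store", "path": "/**", "method": "POST"},
  {"service": "object-store", "path": "/**", "method": "DELETE"}
]'
```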
clarkb | jrosser: recipe #1 right there :) more seriously that sort of thing would probably make a good appendix to the doc you linked to? | 16:47 |
clarkb | then people can add additional ones as they are created and known to work? | 16:47 |
jrosser | that would be great really | 16:47 |
jrosser | i think what was most difficult about it was that as an end user it was pretty opaque why the rule you were working on did not work | 16:48 |
jrosser | it was only by digging through the service logs that you could find some 2nd-order 4xx to then allow that as well | 16:48 |
fungi | keep in mind that if someone compromised the image publishing credentials, they could in theory upload their own image to replace one mid-process before it got pushed to glance and added to the zuul launcher, so could inject a custom version of some binary which then alters our container images of things like zuul or gerrit, backdooring them. or alter openstacksdk release packages so that when we install a new version on our bastion server it's compromised. as such i'm not sure there's a ton of benefit from spending lots of time trying to tightly scope it, since uploading images to the swift container is already a possible key to the kingdom (even if a somewhat circuitous one) | 16:51 |
jrosser | i would also note the gigantic caveat in the docs `Application credentials with access rules require additional configuration of each service that will use it` | 16:51 |
clarkb | jrosser: on the backend you mean? I guess they have to opt into checking the restrictions with keystone? | 16:52 |
corvus | so we'll need to figure out if this will work with rax-flex | 16:52 |
jrosser | https://opendev.org/openstack/keystone/commit/3856cbf10d4d19b9d7797d600ef096b0c04aaedb | 16:53 |
corvus | fungi: zuul can perform checksum when it's doing the cloud upload (i don't think we've written code that does that yet, but we can). we can compare it to the checksum we make when we return the artifact to zuul, so it's effectively a "did someone compromise the intermediary object storage" check. | 16:53 |
fungi | yes, i was thinking that checksum verification would be a good way to thwart that | 16:54 |
corvus | (only trusted points in that are zuul's database and the image build node) | 16:54 |
clarkb | jrosser: thanks. This is good info; if it doesn't work we can feed that back to $cloud | 16:55 |
fungi | afaik rax-flex is basically just a very recent vanilla openstack, and the "weirdness" with our basic credentials is simply because they're using a federated login to their old/existing account system, but the accounts also have local ids within keystone so we should be able to use whatever the usual openstack apis are to lock them down | 16:55 |
fungi | the ids we ended up using in our clouds.yaml are the local keystone ones rather than the federated ones | 16:56 |
clarkb | fungi: ya but according to the doc above they need to configure swift with keystonemiddleware to make this work | 16:56 |
clarkb | which they may or may not do | 16:56 |
fungi | oh, got it | 16:57 |
clarkb | I think everything still checks out on the vanilla cloud side of things and we should avoid problems with the federated logins. But it still needs special configuration | 16:57 |
corvus | remote: https://review.opendev.org/c/zuul/zuul/+/931824 WIP: verify downloaded image checksum [NEW] <-- made a note so i don't forget to write that. | 16:57 |
fungi | thanks! | 16:59 |
clarkb | fwiw I +2'd it because I think merging it with a todo is fine as well | 17:00 |
fungi | probably the biggest concern with that mitigation is time. reading a 25GB file in order to checksum it is not fast, though we could probably parallelize that by checksumming chunks | 17:00 |
fungi | and we have to checksum it twice (once when creating the image, then later when retrieving it for upload to glance), so twice the time | 17:01 |
fungi | maybe we want gpu flavors from our cloud providers ;) | 17:01 |
fungi | then again, it's just as likely to be i/o bound and that's not as easy to solve | 17:04 |
fungi | oh, unless we checksum the swift upload and download chunks since that's already happening in parallel? | 17:05 |
fungi | we already have to read the image from disk to upload it to swift and to glance, so i guess if we do the checksumming inline with those reads we're already stuck with, it shouldn't increase the number of reads from disk | 17:06 |
fungi | we could also checksum inline with the download from swift instead and then not even bother with starting the glance upload if the checksum doesn't match | 17:07 |
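As a rough illustration of that idea (the URL and filename are hypothetical), a single streaming pass can write the image to disk and hash it at the same time, so verification adds no extra read:

```shell
# One pass over the data: tee writes the image to disk while
# sha256sum hashes the same stream
curl -s "$OS_STORAGE_URL/images/image.qcow2" | tee image.qcow2 | sha256sum
# then compare the digest against the checksum zuul recorded with the artifact
```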
fungi | but yeah, regardless, keeping performance in mind in the design will be important | 17:08 |
mnasiadka | ianw: so if you can - then please release a new minor version of DIB - hopefully it fixes Kolla-Ansible Ansible locale issues ;-) | 17:40 |
clarkb | the TC meeting has me mulling an idea. A job setup where you deploy an openstack using devstack/kolla/whatever then pause that job and then have several other jobs run to test various api interactions using various versions of things | 18:27 |
clarkb | for example that the current release of openstacksdk/openstackclient work but also master and maybe the stable releases too | 18:27 |
clarkb | I don't know if that would be more or less headache than having a separate cloud for each of those | 18:27 |
clarkb | but it occurred to me that we could probably share resources if it would be useful | 18:28 |
fungi | in that design it would necessarily use multiple job nodes | 18:31 |
clarkb | particularly if you are booting many VMs | 18:33 |
opendevreview | Merged openstack/project-config master: Switch the remaining opendev zuul tenants to ansible 9 by default https://review.opendev.org/c/openstack/project-config/+/931320 | 19:15 |
clarkb | I've not seen anyone complain about ^ yet. I need to do a school run in a bit and if there are no complaints still I guess I'll proceed with merging that ozj change | 20:43 |
fungi | yeah, all's quiet on the western front | 20:51 |
opendevreview | James E. Blair proposed opendev/zuul-jobs master: Return image artifacts https://review.opendev.org/c/opendev/zuul-jobs/+/931815 | 21:42 |
opendevreview | James E. Blair proposed opendev/zuul-jobs master: Delete images after 72 hours https://review.opendev.org/c/opendev/zuul-jobs/+/931819 | 21:42 |
ianw | just for the logs i did try pinging on ipv4 when mirror02 was down: | 21:42 |
ianw | [iwienand@fedora19 dist-git]$ ping -4 mirror.bhs1.ovh.opendev.org | 21:42 |
ianw | PING mirror.bhs1.ovh.opendev.org (158.69.69.81) 56(84) bytes of data. | 21:42 |
ianw | --- mirror.bhs1.ovh.opendev.org ping statistics --- | 21:42 |
ianw | 3 packets transmitted, 0 received, 100% packet loss, time 2083ms | 21:42 |
ianw | yay clouds | 21:44 |
clarkb | maybe their network gear recovers ipv4 more quickly | 21:52 |
clarkb | looks like I'm still in the clear to land that ozj ansible-lint update. I'll hit the approval button shortly | 21:54 |
clarkb | I've rereviewed my own change to triple check there weren't any silly typos | 22:04 |
clarkb | I'm going to go ahead and self approve it now with only fungi's actual review | 22:05 |
clarkb | ianw: btw we just updated the openafs version built on centos 9 stream because the previous one stopped building there (incompatible function declarations in the kernel and openafs for abort()). I couldn't find anything else that needed to be done to consume that new rpm so I think it must be automatic? | 22:06 |
clarkb | not sure if you recall | 22:06 |
corvus | clarkb: apparently we don't get to register stdout on a no_log task, so my method of obtaining the url from the swift command won't work | 22:08 |
clarkb | corvus: I guess in that case we have to risk it. I did do some digging after fungi pointed out that oslo.log should handle things and while swift and swiftclient don't directly consume oslo.log the keystoneauth lib does and I guess as long as you mark items secret=True it is supposed to handle it automatically for you | 22:10 |
ianw | clarkb: that bump should be it; it should make its way to https://tarballs.opendev.org/openstack/openstack-zuul-jobs/openafs/ which is then used to install | 22:10 |
clarkb | now there has been at least one case of a config option accidentally lacking secret=True in the past, but we're probably fine for this case at least as long as the toolchain stays relatively static? We're installing from the distro right? so that should be the case until we bump the test node up? | 22:10 |
corvus | clarkb: an alternative would be to just hard-code the url. | 22:10 |
corvus | re distro install: yes | 22:11 |
clarkb | corvus: oh ya if it is a static thing in rax-flex that also seems reasonable (with rax proper you have to use the cdn and its a bit more convoluted) | 22:11 |
clarkb | I think it is still static for rax with the cdn but it's some hmac hashed domain name? | 22:11 |
corvus | i think it's https://swift.api.sjc3.rackspacecloud.com/v1/AUTH_f063ac0bb70c486db47bcf2105eebcbd for this account | 22:12 |
clarkb | that does seem workable too then | 22:12 |
clarkb | argh, centos 9 stream just updated the kernel again so the ozj change won't land | 22:13 |
clarkb | I'm going to manually request that nb04 rebuild centos-9-stream now in hopes that maybe I can land that tomorrow morning instead | 22:14 |
clarkb | kernel-devel-aarch64 = 5.14.0-513.el9 is needed by openafs-1.8.12.1-1.el9.aarch64 <- is the latest error. 514 appears to be the new kernel so I think we must be booted on 513 and thus it's looking for those headers but only finding the newer 514 ones? I'm trying to confirm via the job facts | 22:16 |
clarkb | ya BOOT_IMAGE: (hd0,gpt3)/boot/vmlinuz-5.14.0-513.el9.aarch64 | 22:16 |
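A quick way to confirm that kind of mismatch on the node (a sketch):

```shell
# The running kernel, which openafs's package dependency was generated against
uname -r                                  # e.g. 5.14.0-513.el9.aarch64

# The kernel-devel versions actually available in the repos
dnf list --showduplicates kernel-devel
```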
opendevreview | James E. Blair proposed opendev/zuul-jobs master: Return image artifacts https://review.opendev.org/c/opendev/zuul-jobs/+/931815 | 22:16 |
opendevreview | James E. Blair proposed opendev/zuul-jobs master: Delete images after 72 hours https://review.opendev.org/c/opendev/zuul-jobs/+/931819 | 22:16 |
corvus | clarkb: can you re-review 931815 and see if that lgty? | 22:16 |
clarkb | corvus: the artifact change lgtm. What is with the zuul.success to zuul_success change? | 22:19 |
clarkb | is the zuul var loaded at the top level as zuul_success instead of an entry in the zuul dict? | 22:19 |
corvus | yep | 22:19 |
corvus | there was a reason for that i think... since it changes over different playbook runs | 22:20 |
clarkb | got it | 22:20 |
clarkb | +2 from me | 22:20 |
clarkb | I made the image build request on nl01 and that returned an error about trying to build some non-diskimage-builder image. Running the command against nb04 it worked so I guess it's some sort of config issue | 22:27 |
clarkb | also we've got a zk entry for a debian-bookworm-arm64 image build with invalid json in it | 22:28 |
clarkb | doing a dib-image-list shows that. I suspect we can just delete the zk db record for that? | 22:28 |
clarkb | in any case I think the request is in now. A different image is currently building though. Hopefully this will be in a happier spot tomorrow morning | 22:28 |
opendevreview | Merged opendev/zuul-jobs master: Return image artifacts https://review.opendev.org/c/opendev/zuul-jobs/+/931815 | 23:27 |