Thursday, 2024-11-14

opendevreviewMerged opendev/infra-openafs-deb jammy: Update Jammy to 1.8.13  https://review.opendev.org/c/opendev/infra-openafs-deb/+/93502500:41
opendevreviewMerged openstack/diskimage-builder master: Reapply "Make sure dnf won't autoremove packages that we explicitly installed"  https://review.opendev.org/c/openstack/diskimage-builder/+/93499201:48
tonybclarkb: Yeah it's for Noble, I thought I had something workable, but then I discovered that docker-compose (v1 written in python), doesn't work with python >= 3.12 so we'd really need to consider using v2 on distros > jammy (possibly >= jammy)01:49
opendevreviewDoug Goldstein proposed zuul/zuul-jobs master: add roles to install helm-docs  https://review.opendev.org/c/zuul/zuul-jobs/+/93503001:50
jcapitao[m]hello folks, we are currently trying to add CS10 support in DIB https://review.opendev.org/c/openstack/diskimage-builder/+/93404509:07
jcapitao[m]but looks like the worker CPUs do not support x86-64-v3 microarch level09:08
jcapitao[m]is there a way in Zuul/Nodepool to request a node that supports it?09:09
jcapitao[m]apevec: ^ FYI09:09
tonybjcapitao[m]: Not that I'm aware of.  IMO it was a problematic  decision to make that the minimum 09:11
apevecuhoh which cloud is that?09:12
tonybjcapitao[m]: the only ones that might work are onmetal, and raxflex. 09:13
tonybapevec: most of them09:13
jcapitao[m]yeah, it's a bit disruptive even though x86-64-v3 was designed back in ~2013 with Haswell, many CPUs still don't support all the features (e.g. AVX) :/09:20
apevecRAX with Xen is older :)09:25
apevectonyb: could we get a label which is restricted to the providers which do have CPU v3?09:26
tonybthere's also a complication around the kernel/KVM version of the hypervisor.09:27
tonybapevec: yes RAX is, but raxflex is more recent 09:27
tonybapevec: I expect we can09:28
jcapitao[m]tonyb: should we formally request the labels somewhere to track the topic ? 09:49
tonyb no it's okay I'll look at it tomorrow.10:38
tonybI'll firstly verify which providers if any will work.10:39
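
Editor's note: one way to verify whether a given test node could run an x86-64-v3 userland is to compare its CPU flags against the feature list for that level. The sketch below is a minimal illustration and not something from the discussion; the flag spellings are assumed to be the names Linux prints in /proc/cpuinfo (for example LZCNT as "abm" and SSE3 as "pni"), and "xsave" is used as a stand-in for OSXSAVE, which cpuinfo does not list.

```python
# Sketch: check a Linux /proc/cpuinfo "flags" string against x86-64-v3.
# Flag names are the kernel's cpuinfo spellings (an assumption of this sketch).

V2_FLAGS = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}
V3_FLAGS = V2_FLAGS | {
    "avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave",
}


def missing_v3_flags(flags_line):
    """Return the sorted x86-64-v3 flags that are absent from a flags string."""
    present = set(flags_line.split())
    return sorted(V3_FLAGS - present)


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flags string of the first CPU in cpuinfo (Linux only)."""
    with open(path) as handle:
        for line in handle:
            if line.startswith("flags"):
                return line.split(":", 1)[1]
    return ""
```

Running `missing_v3_flags(read_cpu_flags())` on a node gives an empty list when every listed flag is present. As noted in the channel, the hypervisor's kernel/KVM version can also matter, which this check does not cover.
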
fricklerinfra-root: release-team has noticed multiple gitea failures like https://zuul.opendev.org/t/openstack/build/f8ebe67a160847d29610c150b6812d06 GnuTLS recv error (-110): The TLS connection was non-properly terminated.11:26
fricklerchecked all gitea instances and they seem fine in general, so likely some kind of sporadic issue11:42
*** darmach688 is now known as darmach6811:59
opendevreviewTae Park proposed openstack/project-config master: Add repo app-openbao for starlingx  https://review.opendev.org/c/openstack/project-config/+/93515414:49
frickleranother issue I saw when checking logs, not sure if it is a kolla thing, might usually not be noticed since it is getting retried, could be related to raxflex being different https://zuul.opendev.org/t/openstack/build/d7a108adc1c8491a92362bcb40b3767014:57
fricklerI checked a couple of similar rechecks and they all were on raxflex. will put that onto the kolla agenda15:09
fungilooks like mkfs isn't actually setting the requested label, or it's not showing up in the device scan afterward15:16
fungimight be something related to virtio15:17
fungii know we had some initial weirdness with virtio causing the block device mapping to get swizzled around15:18
fungiswap and ephemeral were switched with each other, i think15:18
Clark[m]jcapitao apevec tonyb the closest thing we have today is going to be the nested virt labels. I don't know for sure if they have all the cpu flags you need but the machines are generally more capable and modern. But you limit where your jobs can run pretty substantially 15:35
Clark[m]My suse machines mix in the extra CPU capabilities which I feel like is a nice compromise between not working at all and getting the desired performance improvements. It's unfortunate that this approach isn't more common15:37
*** artom_ is now known as artom15:54
Clark[m]frickler: I wonder if the gitea issues can be traced back to the memory leak15:58
opendevreviewMohammed Naser proposed zuul/zuul-jobs master: Add buildset registry image override  https://review.opendev.org/c/zuul/zuul-jobs/+/93516815:59
Clark[m]It seems what happens is we end up swapping and things slow down quite a bit before an OOM occurs. Maybe that is enough to make TLS sad. I think it fits noonedeadpunk's connection timeout problems better but could fit here too15:59
opendevreviewMohammed Naser proposed zuul/zuul-jobs master: Add buildset registry image override  https://review.opendev.org/c/zuul/zuul-jobs/+/93516816:03
opendevreviewMohammed Naser proposed zuul/zuul-jobs master: Add buildset registry image override  https://review.opendev.org/c/zuul/zuul-jobs/+/93516816:04
opendevreviewMohammed Naser proposed zuul/zuul-jobs master: Allow overriding the buildset registry image  https://review.opendev.org/c/zuul/zuul-jobs/+/84998916:06
opendevreviewMohammed Naser proposed zuul/zuul-jobs master: Allow overriding the buildset registry image  https://review.opendev.org/c/zuul/zuul-jobs/+/84998916:07
opendevreviewMerged openstack/project-config master: Use 2024.2 constraints in master translation jobs  https://review.opendev.org/c/openstack/project-config/+/93440216:13
opendevreviewMerged openstack/project-config master: Add separate acl group for watcher-tempest-plugin  https://review.opendev.org/c/openstack/project-config/+/93435716:13
opendevreviewMerged openstack/project-config master: Update more ironic project ACLs for editHashtags  https://review.opendev.org/c/openstack/project-config/+/93502216:13
fungislittle1: is https://review.opendev.org/935154 (Add repo app-openbao for starlingx) expected?16:15
clarkbone thing I want to try and get done today after parent teacher conferences is prepping a fallback swift container for insecure-ci-registry pruning tomorrow16:59
clarkbif anyone recalls any special setup for that please let me know otherwise I'll probably just login and manually inspect things and see what I can see17:00
slittleplease review https://review.opendev.org/c/openstack/project-config/+/935154 ... Set myself up as the initial core.  I'll take care of the rest.  Thanks17:05
fungislittle: thanks! i asked you in here earlier if that change was on your radar. will approve it now17:06
slittleHa.  Hazard of working from two sites.  Left my hexchat running on the other location17:08
opendevreviewMerged openstack/project-config master: Add repo app-openbao for starlingx  https://review.opendev.org/c/openstack/project-config/+/93515417:10
fungionce that ^ deploys i'll add you to starlingx-app-openbao-core17:10
slittleThanks fungi17:14
corvusclarkb: fungi it's looking like x-delete-after doesn't work in rax-flex.  i just tried it with SLO objects as well as normal objects, and they're not being deleted.17:25
corvusthat has implications for using it for log storage in addition to the image uploads.17:26
fungii wonder if there's some background process in swift that they should be running but aren't17:27
corvusyeah, i believe that's how it's implemented.  i'll send off an email to rax folks in a bit.17:28
fungislittle: done!17:30
fungicorvus: swift docs mention "the swift-object-expirer daemon" so maybe that's missing17:30
clarkboh interesting. That feels like something that falls under the "it's good we're testing it and can provide feedback" category of things17:30
corvusoh wait, it does work -- but the objects are not removed from the listing.  they still appear to be present...18:18
fungihttps://docs.openstack.org/swift/latest/overview_expiring_objects.html#accessing-objects-after-expiration18:38
fungithat i guess18:38
corvusfungi: perhaps?  but it's been a month for some of these objects and they still appear in the listing as if they were present18:43
corvusbut yeah, perhaps it's only done the "expiration" step and still hasn't actually deleted them after a month18:44
corvus(but in the configuration that we can't HEAD or GET them after expiration)18:44
corvusinterestingly, DELETE-ing them returns a 404 but does actually delete them from the listing.18:45
fungi"eventually consistent" (for possibly very long definitions of "eventually")18:46
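
Editor's note: the behaviour discussed above follows from how Swift handles expiry. A client sends `X-Delete-After` with a lifetime in seconds, Swift stores it as an absolute `X-Delete-At` Unix timestamp, and the object stops being served once that time passes even if the expirer has not yet removed its listing entry. A minimal sketch of that bookkeeping, with hypothetical helper names and no network calls:

```python
import time

# Three days, matching the image retention period mentioned in the channel.
THREE_DAYS = 3 * 24 * 60 * 60


def delete_after_header(ttl_seconds):
    """Header a client would send on upload to request expiry after a TTL."""
    return {"X-Delete-After": str(int(ttl_seconds))}


def expected_delete_at(ttl_seconds, now=None):
    """Absolute Unix time that Swift would record as X-Delete-At."""
    now = time.time() if now is None else now
    return int(now) + int(ttl_seconds)


def object_state(delete_at, in_listing, now=None):
    """Classify an object by its expiry time and whether a listing shows it."""
    now = time.time() if now is None else now
    if now < delete_at:
        return "live"
    return "expired, still listed" if in_listing else "removed"
```

The "expired, still listed" state is the one corvus describes: HEAD and GET fail, but the container listing still shows the name.
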
clarkblooks like we're using rax dfw swift for the intermediate registry with a dedicated user18:56
clarkbI'll dig into creating a new container there and give that user access18:57
clarkbI actually wonder if I can just create a container as that user and then problem solved18:57
* clarkb attempts this18:57
clarkbinfra-root python3 -m venv failed on insecure-ci-registry because we don't have the python3-venv package installed. Would it be ok for me to install that to make an openstackclient venv or would you prefer I use an openstackclient docker image or interact with that clouds.yaml on say bridge instead?18:59
corvusi'm happy to have venv installed on that host for easy tool usage19:04
clarkbok I'll go ahead and quickly install that package19:05
corvusand i'm not worried about that being out of sync with system-config; i think our testing keeps that from being a problem19:05
clarkbdo you want me to also push up a change to get it back into sync?19:06
clarkbarg you can't pip install python-openstackclient without a full buildchain because of netifaces19:09
clarkbwhy does the openstackclient need to care about netifaces?19:10
mordredclarkb: openstacksdk uses netifaces19:25
mordredclarkb: https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/cloud/_utils.py#L19119:27
clarkbya but why does the client care?19:27
clarkbyou do name lookups and if you get back ipv6 you use those addresses19:27
clarkbotherwise you use ipv419:27
clarkbseparately I'm having a wonderful time tracking down whatever happened to being able to specify the clouds.yaml path19:28
clarkbit looks like os-client-config's config loader actually gets implemented in openstacksdk?19:28
mordredclarkb: you'd THINK that would be how you would do things - but to provide interface_ip (since servers launched do not default to having anything dns related) - you have to look at the ip's on the server object, and then determine which of those are usable from the calling client19:29
mordredyou know, because we can't have nice things19:29
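
Editor's note: the idea mordred describes, choosing which of a server's addresses the calling client can actually reach, can be sketched without netifaces if the caller already knows whether it has IPv6 connectivity. This is a simplified illustration and not openstacksdk's actual logic; the function name and the `client_has_ipv6` parameter are hypothetical.

```python
import ipaddress


def pick_interface_ip(addresses, client_has_ipv6):
    """Pick an address for a server: IPv6 if the client can use it, else IPv4."""
    parsed = [ipaddress.ip_address(addr) for addr in addresses]
    # Loopback and link-local addresses are never reachable from another host.
    usable = [a for a in parsed if not (a.is_loopback or a.is_link_local)]
    if client_has_ipv6:
        for addr in usable:
            if addr.version == 6:
                return str(addr)
    for addr in usable:
        if addr.version == 4:
            return str(addr)
    return None
```

The hard part in practice is answering `client_has_ipv6`, which is where inspecting the local interfaces comes in.
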
mordredand yes - os-client-config is just a compat shim layer- the relevant code was merged in to openstacksdk long ago19:30
clarkbmordred: the problem with ^ is that it's pure spaghetti. I've gone through python-openstackclient, keystoneauth, os-client-config, and now openstacksdk just to determine if I can set a path to clouds.yaml (which I'm 99% certain was possible before)19:31
clarkband I still haven't answered that question19:31
fungiinstalling python3-venv on any server seems fine to me19:31
fungior on every server, sure19:31
mordredclarkb: fair enough. I was not quite done deprecating old things, sorry about that. :) 19:32
clarkbconfig_file_override = self._get_envvar('OS_CLIENT_CONFIG_FILE')19:32
clarkbok you can still do this but its not documented19:32
clarkbor at least I don't think it is. Checking that is next up19:33
mordredyup19:33
mordredhttps://opendev.org/openstack/openstacksdk/src/branch/master/openstack/config/loader.py#L19919:33
clarkbit is documented in openstacksdk and os-client-config but not in python-openstackclient's documentation/manpage/helpoutput19:34
fungipart of the problem with the config belonging to a library rather than the application, it's not always clear where to document it19:34
mordredyeah - that's a good point. osc should really flow-through some relevant sdk docs so that an osc user doesn't have to know about sdk itself19:35
clarkbya I think linking from osc docs to sdk docs on auth setup would be good in the online docs. Its more painful to do that with manpages and help output though19:35
mordredwe need exportable documentation chunks that the consuming app can include :) 19:35
opendevreviewJames E. Blair proposed opendev/zuul-jobs master: Switch to using openstacksdk for image uploads  https://review.opendev.org/c/opendev/zuul-jobs/+/93521819:36
clarkbI'm going to put this clouds.yaml file on bridge and then run commands from there. I think that is the most straightforward thing at this point (doable because I can override the clouds.yaml path now)19:38
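
Editor's note: the override found above is the `OS_CLIENT_CONFIG_FILE` environment variable read by openstacksdk's config loader. A minimal sketch of using it from a script follows; the helper name and the cloud name in the usage note are hypothetical.

```python
import os


def env_with_clouds_yaml(clouds_yaml_path, base_env=None):
    """Return a copy of the environment that points at a specific clouds.yaml."""
    env = dict(os.environ if base_env is None else base_env)
    # openstacksdk's loader checks this variable before its default search paths.
    env["OS_CLIENT_CONFIG_FILE"] = os.path.abspath(clouds_yaml_path)
    return env
```

The result can be passed to a subprocess, for example `subprocess.run(["openstack", "--os-cloud", "mycloud", "container", "list"], env=env_with_clouds_yaml("/path/to/clouds.yaml"))`, without touching the caller's own environment.
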
clarkbcontainer list results in a forbidden response19:45
clarkbI half expected that but I was hoping this would be easy mode. So I think I need to create the container and then delegate it to this user somehow /me pulls up rax swift docs19:45
opendevreviewJeremy Stanley proposed opendev/infra-openafs-deb focal: Update Focal to 1.8.13  https://review.opendev.org/c/opendev/infra-openafs-deb/+/93522019:46
fungiclarkb: are you trying to delete objects? if so, i'm not sure the unified openstacksdk plumbs through the bulk delete operation19:47
clarkbfungi: no I'm trying to set up a new fallback container that the dedicated registry swift user can read and write to19:48
fungiah19:48
clarkbfungi: in case things go poorly with registry pruning tomorrow. I want the backup container ready to go and just update the config file on the registry to switch over19:49
fungiyep, makes sense19:49
clarkbI thought maybe just maybe the special user could list and create new containers with implicit perms in place for it but doesn't appear to be the case19:49
clarkbI think it was a good exercise though19:49
corvusclarkb: fungi https://review.opendev.org/935218 is a switch *back* to using sdk for the image uploads.  i believe this has workarounds for all known problems, so we should end up actually deleting images after 3 days.19:57
fungiyay!20:00
clarkbany idea what properties     | Access-Log-Delivery='false' this property on the existing registry container means? I've created a new container and it doesn't have that20:00
clarkbbut also it looks like there is no way to set acls via headers in openstackclient?20:01
fungilove the carried-over copyright lines, reminds me how much i miss jhesketh20:01
clarkbthere is container set but that seems to set property values like the one I pasted above20:02
corvusfungi: yeah, i think there's like a line or two that goes all the way back :)20:02
clarkbI'm not seeing an obvious way to set X-Container-Read/X-Container-Write20:02
corvusclarkb: that property does not ring a bell for me20:02
clarkbcorvus: since you've been poking at this a bit recently do you know if openstackclient can set those X-Container-* header values somehow?20:02
fungiclarkb: seems likely to have been in the default properties and then configured out between when that was created and now20:03
clarkbfungi: ya that could be20:03
corvusclarkb: i'm pretty sure we had to use curl for that20:03
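
Editor's note: the curl approach amounts to a POST against the container URL carrying the `X-Container-Read` and `X-Container-Write` headers. A minimal sketch that builds (but does not send) such a request with the standard library; the storage URL, token, and ACL strings are placeholders, and the ACL syntax a given cloud accepts is an assumption to check against its docs.

```python
from urllib.parse import quote
from urllib.request import Request


def build_container_acl_request(storage_url, container, token, read_acl, write_acl):
    """Build a Swift container POST that sets read and write ACL headers."""
    url = storage_url.rstrip("/") + "/" + quote(container, safe="")
    headers = {
        "X-Auth-Token": token,
        "X-Container-Read": read_acl,
        "X-Container-Write": write_acl,
    }
    # POST updates metadata on an existing container.
    return Request(url, headers=headers, method="POST")
```

Passing the result to `urllib.request.urlopen()` would send it; that step is left out here because it needs a real endpoint and token.
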
fungiprobably has no impact on our use either way20:03
clarkbok I skipped breakfast so I'm starving for lunch. I'll pick this back up again after I eat something (I'll also review those changes yall pushed)20:05
clarkbcorvus: ya even in openstackclient tests they just do a raw HEAD request against the container to set the values then check they come back out again on a container show20:06
clarkbtimburke: ^ this probably isn't your highest priority, but as a user I find this particularly frustrating. I should be able to do straightforward tasks like this through the default tool20:07
clarkbtimburke: tl;dr is it doesn't appear that openstackclient is capable of setting read or write acls on swift containers20:07
clarkboh the test is in a mock so its not even that sophisticated20:07
clarkbtimburke: a semi related thing is that I almost never want to use project specific client tools because they very rarely support clouds.yaml, but maybe this has changed. I would say as a user having tasks like this through the main client is super helpful and so is having clouds.yaml support in the project specific clients20:12
clarkbok lunch now. Back in a bit20:13
mordredclarkb: we started slowly adding clouds.yaml support to project specific clients back in the day, but only got through a few of them20:19
clarkbapi operations require tokens. I think by default those tokens have a relatively short lifespan. But I see that osc also has a token revoke command but no token listing. I guess that makes me wonder if tokens are as ephemeral as I thought they may be. Do they last forever?20:57
clarkboh maybe if I issue one the response will tell me when it expires20:58
clarkbreading https://docs.rackspace.com/docs/set-up-cloud-files-and-acls there is an additional complication where all of the curl examples are actually against clouddrive and not swift apis?21:00
clarkbok I got ^ to work except that it created the container in the wrong region21:10
clarkbbecause it wasn't already created and I set the read acl so it helpfully created the container too21:10
* clarkb tries again21:10
clarkbspecifying --os-region-name doesn't seem to work (I ended up installing swift client and i have that almost working except for the region so maybe the -A auth url is wrong?)21:12
clarkbI now have a container that should work in the wrong region (it's the wrong region because it's not the same region as the service using swift)21:14
clarkbso I think you can't use -A/-U/-K with --os-auth-url/--os-region-name/--os-username/--os-password. It's one or the other. Problem is I think I cannot auth with swiftclient using the openstack stuff because that requires special rax key auth21:21
clarkbit works when you use -A/-U/-K because that is the special key auth stuff but then I can't select the region21:22
mordredclarkb: the rax key auth is *very* similar to username/password auth, iirc it's structurally the same, just with a word replaced. I feel like at one point we had a ksa plugin that would do it, but then there were concerns about landing it in ksa itself since it was "non-standard"21:28
clarkbmordred: ya we're using the ksa plugin for not swift21:28
clarkbbut that is driven by clouds.yaml and I don't see any way to drive this via swiftclient other than using -A21:28
clarkbso its either you use swiftclient and get the default region or you use openstacksdk and don't get full functionality or you do everything "bit by bit" with curl or nc21:30
mordredand that's because you need to do a thing that OSC doesn't have exposed, right?21:30
clarkbyes, I want to set read and write acls21:30
mordredare you trying to do this in base with osc? or in python with sdk?21:30
mordreds/base/bash/21:31
clarkbwith osc not the sdk. But the sdk doesn't directly expose it either21:31
clarkbyou can probably manipulate the sdk to do a "raw" post with headers set though21:31
clarkbmy frustration is more along the lines of "this should work the first time with the default client every time"21:31
clarkbI'm a special case of user that is going to bang their head against this until something works. Most people are not21:31
mordredwell - yeah - believe me, of all the people I totally hear you and agree with you21:32
mordredit looks to me like this is directly supported in SDK though21:33
mordredhttps://opendev.org/openstack/openstacksdk/src/branch/master/openstack/object_store/v1/container.py#L7821:33
clarkbthe sdk and osc will read the acls back21:33
clarkbbut I haven't found anywhere where they will write them21:33
clarkboh I see in the docs now so maybe that is my best path forward21:34
clarkbwrite a script that gets the cloud then gets the container then sets the metadata21:34
mordredyeah - that should work. I agree, this should be exposed in OSC for sure21:35
clarkbok I'm going to take a deep breath and then tackle it that way21:35
clarkbI may post up a draft of the script here in a few just to sanity check before I try to run it21:35
timburkeclarkb, as far as expired objects appearing in listings, i remember the rackspace folks always complaining about their expirers running behind. i know we at nvidia have had to tune things a lot to handle hundreds of millions of queue entries21:38
timburkeunfortunately, the overlap of osc and swift developers is basically nil; even worse, the same could be said of the set of swift developers that actively push forward client development21:40
clarkbmordred: https://paste.opendev.org/show/bawRwIdnWn7lm1iBq44W/ does this look good?21:42
clarkbtimburke: ya I also think this is a common issue and not limited to swift. I'm just hitting it with swift due to particular circumstances21:42
mordredclarkb: yes - that seems reasonable - assuming those acls are correct, which I have no idea about21:43
clarkbmordred: yup mostly looking for structural correctness21:43
clarkbthanks21:43
clarkbwell that ran and didn't raise an exception but it doesn't seem to have set the acls either21:47
clarkbtime for more debugging21:47
clarkblooks like maybe I have to explicitly state which metadata key values I want to write back (the example for this doesn't do so but the method docs imply this is necessary)21:51
clarkbproblem is it says "key value pairs" so now I'm like wat21:53
clarkbok after 3 hours I have successfully managed to set the acls21:58
clarkbmordred: thanks for the pointer this wasn't as easy as I would've liked but it was workable21:58
clarkbI'm going to push up a little script into system-config so that its captured somewhere and then also push an sdk docs update21:58
clarkbI need to clean up the container that was auto created for me in the wrong region too /me starts chipping away at that22:00
fungii wonder how far the code distance is from that script to an osc implementation. i guess mostly arguing about what the ui should be22:03
clarkbORD container has been deleted22:03
clarkblooks like the container name is set in private vars so I won't push any changes to WIP before we do the pruning tomorrow. We'll just have to edit the name tomorrow22:04
clarkbassuming we do fallback22:04
opendevreviewClark Boylan proposed opendev/system-config master: Add a rax swift acl setting script  https://review.opendev.org/c/opendev/system-config/+/93522822:14
clarkbmordred: et al ^ the final result22:14
clarkbI wrote a summary of the problems and why this is a solution in that file too22:15
mordredfungi: looking at it briefly, it would not be hard. you're absolutely right, deciding what the cli should be will be the hardest part. like "openstack container set acl" perhaps?22:17
opendevreviewJay Faulkner proposed openstack/diskimage-builder master: [gentoo] Fix+Update CI for 23.0 profile  https://review.opendev.org/c/openstack/diskimage-builder/+/92398522:20
fungiand figuring out whether swift folks need to weigh in on the syntax22:21
fungi(since as timburke said there are approximately zero swift team members with interest in the unified openstack cli or sdk)22:21
opendevreviewJay Faulkner proposed openstack/diskimage-builder master: [gentoo] Fix+Update CI for 23.0 profile  https://review.opendev.org/c/openstack/diskimage-builder/+/92398522:22
mordredtypically service teams have not cared about such syntax discussions22:22
clarkbremote:   https://review.opendev.org/c/openstack/openstacksdk/+/935229 Fix swift metadata setting documentation22:24
clarkbI also updated the sdk docs22:24
mordredof course, looking further, I'm now reminded that cli on the backend is just using the REST adapter out of SDK and not the higher level service objects, because it has its own "higher level" internal API. So - still not hard, but mildly annoying22:24
clarkbok I promised people code reviews, off to do that now22:27
clarkbcorvus: fungi fyi left a -1 on https://review.opendev.org/c/opendev/zuul-jobs/+/935218 mostly concerned about ansible logging of secrets22:38
clarkbplease review and let me know if I've got something wrong there22:39
clarkbcorvus: fwiw I have a feeling my little openstacksdk script would work for raxflex too if we want to test that. Probably on a new container when we're not already distracted though22:49
opendevreviewMerged opendev/infra-openafs-deb focal: Update Focal to 1.8.13  https://review.opendev.org/c/opendev/infra-openafs-deb/+/93522022:55
opendevreviewJay Faulkner proposed openstack/diskimage-builder master: [gentoo] Fix+Update CI for 23.0 profile  https://review.opendev.org/c/openstack/diskimage-builder/+/92398523:31
opendevreviewJames E. Blair proposed opendev/zuul-jobs master: Switch to using openstacksdk for image uploads  https://review.opendev.org/c/opendev/zuul-jobs/+/93521823:54
corvusclarkb: ^ fixed and replied23:55
corvusclarkb: do you want to?  i'm switching to a new container with that patchset because of the detritus in the current container which i can't delete.  now would be a good time to change the acls, etc, if we want.  however, tbh, i'm sort of over it and think we should just use the more normal credentials.23:56
clarkbcorvus: ya I mean I feel like the three hours of investigation make me not want to use it either. But the end result seems usable23:58
clarkb(until the next round of cloud updates and client changes and it breaks again I guess)23:58
clarkbcorvus: oh I didn't realize that prune was explicitly to work around the behavior issue23:58
corvusclarkb: it would also be an experiment to see if that works with flex23:59
clarkbfor some reason I had in my head it was to handle the lack of pruning on the objects you already uploaded23:59
clarkbbut you're swapping containers for that23:59
corvusoh nope, they actually did get deleted23:59
corvusbut their entries are somehow stuck and i can't even delete them through the web23:59
