| *** mrunge_ is now known as mrunge | 06:39 | |
| frickler | looks like canonical is having some issues, not sure if anything other than mirror updates will be affected https://status.canonical.com/ | 12:54 |
|---|---|---|
| Clark[m] | The zuul launcher fix hasn't merged yet. I wonder if we need to dig into these failures in zuul too cc corvus. I didn't think any looked related to the change when I inspected the earlier failures | 14:12 |
| corvus | google api ModuleNotFoundError: No module named 'packaging' | 14:23 |
| corvus | looks like swest is on it | 14:24 |
| clarkb | I see the changes and have added my +2 though you've already approved them. Thanks! | 14:45 |
| oschwart | hello o/ sorry if this has been discussed here before, but I see many OSP projects (probably only stable/2024.X jobs) failing on this method: lib/keystone:create_keystone_accounts | 14:57 |
| clarkb | oschwart: what is osp? | 14:58 |
| oschwart | sorry, openstack | 14:58 |
| oschwart | e.g. https://zuul.opendev.org/t/openstack/build/bf4a0904ee7e4caa80e9262615a13182 or | 14:58 |
| clarkb | oschwart: but I suspect https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/2AVDUXBHX2TNDKNR4MRIHHO5MQJG4RVM/ is the answer you are looking for | 14:58 |
| oschwart | clarkb: right, thanks!! | 14:59 |
| opendevreview | Brian Rosmaita proposed opendev/irc-meetings master: Update cinder Festival of Reviews info https://review.opendev.org/c/opendev/irc-meetings/+/965150 | 16:23 |
| opendevreview | Merged opendev/irc-meetings master: Update cinder Festival of Reviews info https://review.opendev.org/c/opendev/irc-meetings/+/965150 | 16:54 |
| mnasiadka | fungi: Regarding our ARA discussion at the summit - it seems that if there’s an option of running ara-server somewhere in OpenDev, we could try out the distributed sqlite approach (https://ara.readthedocs.io/en/latest/distributed-sqlite-backend.html) and just remove old dbs after a given period of time | 19:58 |
| fungi | yeah, that's probably worth discussing in an opendev meeting, sorry i didn't remember to get it onto this week's agenda | 19:59 |
| mnasiadka | It’s a busy week - if there’s a link where I can add that to the agenda, I can do that myself | 20:00 |
| fungi | we tried running a persistent ara server but the database quickly blew up, if memory serves. but that was also right after the functionality existed so odds are it's improved in the years since | 20:00 |
| fungi | mnasiadka: https://meetings.opendev.org/#OpenDev_Meeting | 20:00 |
| fungi | though discussing here in the meantime is perfectly fine | 20:01 |
| clarkb | we ran it with the embedded db as a regular ci log artifact | 20:01 |
| clarkb | then ara dropped support for doing that | 20:01 |
| mnasiadka | Distributed sqlite creates a file for each job/ansible-playbook run, so it might suit OpenDev better (not sure what backend you used in the past) | 20:01 |
| mnasiadka | Ah, embedded, right | 20:01 |
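For context on the distributed sqlite backend mnasiadka links above, a minimal sketch of what enabling it for a central ara-server might look like. The `ARA_*` setting names and the backend module path are assumptions recalled from those docs, not verified here, and the paths are placeholders.

```python
# Sketch: run ara-server with the distributed sqlite backend so each job's
# ansible.sqlite under a known prefix is served as its own report.
# Setting names are assumptions taken from the ara docs linked above.
import os
import subprocess

env = dict(os.environ)
env.update({
    # Route database access through the distributed sqlite backend (assumed path).
    "ARA_DATABASE_ENGINE": "ara.server.db.backends.distributed_sqlite",
    # URL prefix that marks per-job report directories, e.g.
    # /logs/<job-id>/ara-report/ansible.sqlite (assumed default name).
    "ARA_DISTRIBUTED_SQLITE_PREFIX": "ara-report",
    # Filesystem root under which those per-job databases live (placeholder).
    "ARA_DISTRIBUTED_SQLITE_ROOT": "/var/lib/ara/logs",
})

# Start the API/UI server with the distributed backend enabled.
subprocess.run(["ara-manage", "runserver", "0.0.0.0:8000"], env=env, check=True)
```

Old databases under the root could then simply be pruned on a schedule, as suggested above.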
| clarkb | opendev system-config jobs actually upload ara reports using the file based backend iirc | 20:01 |
| clarkb | https://0b981208b2857e19931c-9f3d82aa4a1e99e8623570283c52345d.ssl.cf2.rackcdn.com/openstack/daaf3ca1de05433c9096289bbc9f663d/bridge99.opendev.org/ara-report/ for example | 20:02 |
| mnasiadka | We have support for uploading html files as part of job logs in Kolla-Ansible - but that’s a lot of html and takes too much time | 20:02 |
| fungi | i think part of the problem we ran into with the idea of a central ara is that at least initially it was oriented toward persistent server environments so having a bunch of different ci jobs with ansible output claiming to be for the same servers (based on inventory names used in the jobs) confused it quickly | 20:03 |
| fungi | again, it may be better in that sort of situation now | 20:04 |
| clarkb | there are issues with compatibility too | 20:05 |
| clarkb | a central ara server is not compatible with every ansible run aiui | 20:05 |
| mnasiadka | If there’s a separate db for each job id - then maybe that problem goes away | 20:05 |
| clarkb | ansible and ara need some alignment across versions | 20:05 |
| mnasiadka | Yeah, they seem to support latest three Ansible versions | 20:06 |
| fungi | so we'd need to run more than one ara depending on what versions of ansible the jobs used i guess? | 20:07 |
| fungi | also is this for nested ansible or job ansible? | 20:07 |
| clarkb | I'll say right now that we shouldn't run an ara for job ansible | 20:07 |
| clarkb | zuul provides a method for viewing the ansible output at the job level. We can improve that if necessary (I have a change up right now that does so) | 20:08 |
| mnasiadka | Kolla and OSA would use that for nested Ansible | 20:08 |
| mnasiadka | And I’m not that sure we need support for ancient Ansible versions | 20:09 |
| clarkb | mnasiadka: no but what happens when opendev, osa, and kolla are all using different versions without mutual ara support? | 20:09 |
| clarkb | just talking out loud here: what if you tarball the file based ara report and publish that. Then you can fetch it locally and file:///path/to/ara/index.html? | 20:10 |
| clarkb | would that mitigate the problem or is that too clunky for people to use? | 20:10 |
| clarkb | or similarly can we upload an ara + sqlite and reduce the total file count (I don't know how the file counts break down between service and data) | 20:11 |
| mnasiadka | Well, that’s probably clunky, but if there’s no other option that’s still better than nothing | 20:12 |
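A rough sketch of the "tarball the file based ara report and publish that" idea from above: render the static report and bundle it into a single artifact so a job uploads one object instead of thousands of tiny files. The output directory and archive names are made up for illustration; `ara-manage generate` is the static-report command whose implementation is discussed further down.

```python
# Sketch: build the static ara report, then tar it up as one log artifact.
import subprocess
import tarfile

report_dir = "ara-report"  # hypothetical output location
subprocess.run(["ara-manage", "generate", report_dir], check=True)

# One tarball means one upload; viewers extract it locally and open
# file:///path/to/ara-report/index.html in a browser.
with tarfile.open("ara-report.tar.gz", "w:gz") as tar:
    tar.add(report_dir)
```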
| mnasiadka | Hmm, uploading sqlite and templating out a script that stands up ara for viewing it might be an option | 20:13 |
| fungi | maybe a prebuilt ara container you can just mount a known location into with your data? | 20:14 |
| mnasiadka | I can experiment with that, maybe we could even point to the sqlite via an http url in ara’s config | 20:15 |
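A sketch of the "templated script that stands up ara for viewing" idea: point a local ara install at a downloaded ansible.sqlite and serve the web UI from it. `ARA_DATABASE_NAME` is my assumption for the relevant setting and should be checked against ara's docs; the port is arbitrary.

```python
#!/usr/bin/env python3
# Sketch: view a downloaded ara sqlite database with a locally installed ara.
import os
import subprocess
import sys

db_path = sys.argv[1] if len(sys.argv) > 1 else "ansible.sqlite"

env = dict(os.environ)
# Assumed setting name: tells ara which sqlite file to read.
env["ARA_DATABASE_NAME"] = os.path.abspath(db_path)

# Browse the report at http://127.0.0.1:8000 while this is running.
subprocess.run(["ara-manage", "runserver", "127.0.0.1:8000"], env=env, check=True)
```

The prebuilt-container variant mentioned later would be the same idea with a `docker run` wrapping it instead of a local install.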
| clarkb | I don't see any new magic in https://codeberg.org/ansible-community/ara/src/branch/master/ara/ui/management/commands/generate.py that might cut down on the total file counts | 20:25 |
| mnasiadka | Ah right, because the databases are not the only files you have on the filesystem | 20:29 |
| clarkb | I was hoping there might be a combined method or some sort of virtual file setup. But no it seems to iterate over every resource and serialize them then render their corresponding page and write it to disk as a distinct file | 20:33 |
| clarkb | and this mimics the website in dynamic mode so that the templates and requests can all be shared. You'd need some sort of proxy to rewrite requests otherwise | 20:33 |
| clarkb | and at that point you're better off just running the dynamic service | 20:33 |
| mnasiadka | True - it seems that downloading the sqlite works; I wasn’t aware we had that support earlier - need to compare how much slower Kolla-Ansible is with ARA running and saving to a sqlite file | 20:40 |
| mnasiadka | Thanks for the discussion ;) | 20:41 |
| clarkb | mnasiadka: you're downloading the sqlite file with a locally running ara install right? And ya as you suggest maybe a simple script that does a docker run of the ara container with a sqlite path is a reasonable solution? | 20:42 |
| mnasiadka | Yeah, we can even template out some README so people can just copy paste a command | 20:42 |
| clarkb | this is interesting: swift's bulk operations middleware can extract tarballs automatically for you | 20:50 |
| clarkb | mattoliver: ^ is there a way for swift users of public clouds to know if that middleware is present? | 20:51 |
| clarkb | mnasiadka: fungi ^ it's possible that we could speed up the ara report uploads by uploading a single tarball and then have swift extract it into separate files when read | 20:51 |
| clarkb | corvus: ^ curious if you know whether other object stores support something similar and whether that could potentially be part of the main zuul upload role(s) or if we have to make it swift specific | 20:52 |
| clarkb | the actual data is not very large as far as I can tell. The issue is that we're uploading many files and each one is a separate round trip with swift, so if we can cut those round trips down we potentially speed the whole thing up | 20:53 |
| clarkb | it's possible this is useful for more than ara too | 20:53 |
| clarkb | looks like rackspace supports bulk operations | 20:57 |
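A sketch of the swift bulk middleware trick described above: upload one tarball and have swift expand it into individual objects server-side via the extract-archive query parameter. The storage URL, token, and container/prefix here are placeholders; whether the middleware is enabled at all depends on the provider, which is the open question in this discussion.

```python
# Sketch: server-side tarball extraction with swift's bulk middleware.
import requests

storage_url = "https://swift.example.com/v1/AUTH_abc123"   # placeholder
token = "gAAAA..."                                          # placeholder token
container_and_prefix = "logs/example-build/ara-report"      # placeholder path

with open("ara-report.tar.gz", "rb") as f:
    resp = requests.put(
        f"{storage_url}/{container_and_prefix}",
        params={"extract-archive": "tar.gz"},  # ask swift to unpack the body
        headers={"X-Auth-Token": token},
        data=f,
    )

# The bulk middleware replies with a summary of created objects and any
# per-file errors; an error status here likely means the middleware is off.
print(resp.status_code)
print(resp.text)
```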
| corvus | clarkb: i don't know the answer to that question. i will note that the swift log upload role uploads 24 files in parallel. | 20:57 |
| clarkb | I can't find evidence that bulk operations is enabled in ovh (it may still be just undocumented) | 20:58 |
| clarkb | corvus: ya I think the issue is that ara is thousands of files so it adds up quickly | 20:58 |
| clarkb | I do think in the general case uploads have been quick enough but when you start having lots of small files it is noticeable. We saw problems with tripleo and their practice of grabbing many many files (often redundantly) | 20:58 |
| clarkb | looks like log uploads for https://zuul.opendev.org/t/openstack/build/daaf3ca1de05433c9096289bbc9f663d/ took about 30 seconds which seems reasonable to me | 21:00 |
| clarkb | but I'm guessing a kolla openstack deployment has a lot more content in the ara export. Not sure what upload times they were experiencing | 21:00 |
| mattoliver | Do a /info call on the endpoint, and you should get json with a bunch of info back. | 21:00 |
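Following mattoliver's pointer, a minimal sketch of that /info check. The discovery document lives at /info on the proxy root (not under the /v1/AUTH_... path); the host is a placeholder, and the capability key names are what I recall swift advertising for the bulk middleware.

```python
# Sketch: ask a swift endpoint which middleware capabilities are enabled.
import requests

info = requests.get("https://swift.example.com/info").json()

# The bulk middleware shows up under these keys when enabled (assumed names).
print("bulk_upload" in info)   # tarball extraction
print("bulk_delete" in info)   # bulk deletes
```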
| corvus | fungi: mnasiadka is there some place where i can catch up on the ara discussion at the summit? i'd like some background. | 21:01 |
| clarkb | mattoliver: ack thanks | 21:01 |
| fungi | corvus: it was a 5-minute aside in the hallway, maybe when i was standing at the zuul pavilion? i don't remember exactly, i just agreed that it was worth bringing up here for an actual discussion | 21:02 |
| corvus | what's the user story? | 21:02 |
| fungi | openstack-ansible's deployment test jobs generate vast volumes of nested ansible output, and generating static html from that and uploading it all takes a very long time, so they were hoping to find some way to make that less painful | 21:03 |
| fungi | mnasiadka: ^ is that an accurate summary? | 21:04 |
| corvus | while also keeping it browseable? (like, without downloading a tarball of the output?) | 21:04 |
| fungi | i'm not sure what compromises would be acceptable | 21:05 |
| fungi | i'm sure ideally it would come with your choice of free cupcake flavors, but obviously there have to be trade-offs | 21:05 |
| fungi | which is why clarkb was asking whether expecting the devs to download a bundle of data and then process it locally for inspection would be too onerous | 21:06 |
| clarkb | right I think the problem is ara specifically (but maybe other tools) generates many many small files. The total data is not large but when you upload each one distinctly the overhead of uploading all of the files can be noticeable | 21:09 |
| clarkb | tripleo had similar issues where half an hour of the job would be spent uploading logs | 21:09 |
| clarkb | this means that tools like ara are less easily consumed by groups generating large ara exports | 21:11 |
| corvus | i don't know if this would be appropriate, but as long as we're brainstorming, i have had a nested ansible write output to the job-output.json file so that the zuul web-ui reads it just like zuul's ansible. with that approach, all the output is in one (possibly large) file. | 21:14 |
| corvus | there may be other ideas worth considering that stop short of that... like multiple files, or converting ara output into json... | 21:14 |
| clarkb | doesn't look like openstackclient knows how to request swift's info endpoint so I need to figure out how to auth manually I guess | 21:15 |
| corvus | clarkb: one thing to keep in mind -- at least one of our clouds has suggested we use the s3 api for swift, so we'd want to know if we could still do that trick in that case. | 21:17 |
| clarkb | corvus: yup that is partly why I wondered if you knew about support in s3 or other systems | 21:18 |
| clarkb | I wonder if this is a general feature or something swift specific | 21:18 |
| clarkb | looks like openstacksdk cloud.get_object_capabilities() may give the info I need so trying that | 21:19 |
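Roughly the call being attempted here: openstacksdk's cloud layer can query swift's capabilities (wrapping the /info endpoint). The cloud and region names below are placeholders for whatever is in clouds.yaml.

```python
# Sketch: query swift capabilities via openstacksdk's cloud layer.
import openstack

conn = openstack.connect(cloud="ovh", region_name="BHS1")  # placeholder names
print(conn.get_object_capabilities())
```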
| clarkb | except now I get 'public endpoint for object-store service in BHS1 region not found' errors | 21:24 |
| clarkb | doing a catalog list shows a public endpoint for swift object-store in that region fwiw | 21:24 |
| clarkb | conn.list_keypairs() works so auth is successful | 21:25 |
| clarkb | I'm guessing this code is just untested or incompatible with the cloud | 21:25 |
| mordred | iirc, there was something different/weird with ovh and swift that isn't "normal" - and I'm guessing I never sorted it out and encoded it into SDK | 21:27 |
| clarkb | mordred: I think I may have figured it out. Looks like the region is BHS not BHS1? I'm trying to test that now | 21:29 |
| mordred | there's frequently an impedance mismatch between non-regional swift deployments and regional keystone catalogs | 21:29 |
| mordred | clarkb: ha. that sounds familiar at least | 21:29 |
| clarkb | `for x in conn.service_catalog: if x['type'] == 'object-store': for y in x['endpoints']: print(y)` confirms we have the endpoints but they seem to be called BHS not BHS1 | 21:30 |
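A runnable version of the catalog check pasted above, listing the object-store endpoints and the region names the catalog actually advertises. Cloud and region names are placeholders.

```python
# Sketch: discover which region name the catalog uses for object-store.
import openstack

conn = openstack.connect(cloud="ovh", region_name="BHS1")  # placeholder names
for service in conn.service_catalog:
    if service["type"] == "object-store":
        for endpoint in service["endpoints"]:
            print(endpoint.get("region"), endpoint.get("url"))
```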
| mordred | sob | 21:30 |
| clarkb | but if I try to use BHS it says this is not a valid region name (I think because our clouds.yaml lists specific regions) | 21:30 |
| clarkb | presumably if I add BHS to the list it would work | 21:30 |
| mordred | yeah. I feel like I added a way to override a region name on a per-service basis somewhere deep in the goo ... possibly for this use case. but I also might be hallucinating that | 21:31 |
| clarkb | I started looking for that maybe I won't give up just yet | 21:32 |
| opendevreview | Merged openstack/diskimage-builder master: Fix RPM DB path for Centos 10 Stream https://review.opendev.org/c/openstack/diskimage-builder/+/963939 | 21:35 |
| clarkb | I ended up just creating a copy of the clouds.yaml file and modifying it to add the extra GRA and BHS regions | 21:40 |
| clarkb | then I was able to connect to BHS instead of BHS1 and call get_object_capabilities. I see things for tempurl and bulk deletions but not for bulk operations | 21:40 |
| clarkb | so I think it isn't supported? | 21:40 |
| clarkb | infra-root do you think we should add these regions that only work for swift to our normal clouds.yaml or do we think it might be too confusing for what we're typically doing which requires the more specific region names? | 21:41 |
| corvus | clarkb: i lean toward "only if needed" due to confusion but maybe it's worth adding a comment so we retain the knowledge? | 21:42 |
| clarkb | corvus: that seems reasonable. I'll work on that change now | 21:42 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Add note about OVH swift regions to clouds.yaml https://review.opendev.org/c/opendev/system-config/+/965187 | 21:45 |
| clarkb | something like this | 21:45 |
| clarkb | re the trixie debian mirror that ended up in the pypi index list: I'm guessing that somewhere mirror_fqdn is overridden to point at the upstream deb repos, not realizing that the same var value is used for pypi | 22:01 |
| clarkb | that makes a lot more sense to me than not having a mirror set confusing the systems | 22:01 |
| clarkb | and ya the role seems to assume if it is running then there is a mirror to configure | 22:07 |
| clarkb | rocky linux does not have mirrors either, but I think we just skip it as an unsupported distro type in configure-mirrors | 22:10 |
| clarkb | I expect alma is the same | 22:10 |
| clarkb | so ya seems like the easiest way to manage not mirroring content is to do so consistently for an entire distro at this moment. I should note that pypi mirror config is always applied as that is considered to not be platform specific | 22:11 |
| clarkb | and then there is the mirror configuration improvement (which I can't currently find) that would allow specific control of each entity | 22:12 |
| clarkb | corvus: did code for that land or maybe it is just a spec. I feel like I looked at it not that long ago too | 22:13 |
| corvus | the "spec" landed as docs updates i think | 22:44 |
| corvus | https://zuul-ci.org/docs/zuul-jobs/latest/mirror.html | 22:44 |
| clarkb | thank you | 22:44 |
| corvus | and i think tonyb was starting to continue that | 22:44 |
| tonyb | I did but didn't actually make a whole lot of progress | 22:47 |