Monday, 2024-12-16

fungiyeah, it was 64m lines long and over 5gb in size when i deleted it00:00
fungii'll look at this again tomorrow with a (hopefully) clearer head, but i think we're at the blow-it-away-and-start-over stage00:01
*** ykarel_ is now known as ykarel06:42
fricklermight be some actual loop caused by a symlink? do you remember one of the file paths?11:06
frickleralso regarding the cirros cert, it seems we might soon need to make the warning threshold for expiry configurable anyway, LE plans to offer certs with just 6d validity https://letsencrypt.org/2024/12/11/eoy-letter-2024/11:08
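[A minimal sketch of the kind of expiry-threshold check being discussed, assuming GNU `date` and `openssl` are available. The throwaway self-signed certificate, paths, and the 6-day threshold are purely illustrative, not from any real deployment:]

```shell
# Hypothetical expiry check: warn when a cert has <= N days of validity left.
workdir=$(mktemp -d)
# Generate a throwaway cert valid for 10 days so the example is self-contained;
# in practice you'd point at the real certificate file instead.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=example.test" \
    -days 10 -keyout "$workdir/key.pem" -out "$workdir/cert.pem" 2>/dev/null
expiry=$(openssl x509 -noout -enddate -in "$workdir/cert.pem" | cut -d= -f2)
expiry_epoch=$(date -d "$expiry" +%s)
now_epoch=$(date +%s)
days_left=$(( (expiry_epoch - now_epoch) / 86400 ))
threshold=6   # this is the value that would need to become configurable
if [ "$days_left" -le "$threshold" ]; then
    echo "WARN: cert expires in $days_left days"
else
    echo "OK: cert expires in $days_left days"
fi
```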
fricklerclarkb: sorry about lack of feedback for the held node. I guess we'd need a fresh node anyway if we wanted to continue to debug the ansible module. but that's also not on my priority list11:09
fungifrickler: the file paths seemed to be every deb file in the mirror12:56
fungiand source packages for them too, e.g.12:58
fungipool/universe/g/golang-github-jacobsa-crypto/golang-github-jacobsa-crypto-dev_0.0~git20161111.0.293ce0c+dfsg1-7_arm64.deb12:58
fungipool/universe/g/golang-github-jacobsa-crypto/golang-github-jacobsa-crypto_0.0~git20161111.0.293ce0c+dfsg1-7.debian.tar.xz12:58
fungipool/universe/g/golang-github-jacobsa-crypto/golang-github-jacobsa-crypto_0.0~git20161111.0.293ce0c+dfsg1-7.dsc12:58
fricklerfungi: weird, sounds like the "big badaboom" approach would be the next best option to try indeed13:44
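[Before blowing a mirror tree away, the symlink-loop theory can be checked cheaply: GNU `find -L` reports "File system loop detected" on stderr when it follows a symlink cycle. A sketch with a synthetic demo tree (the real mirror path would be something like the AFS volume root):]

```shell
# Build a tiny tree containing a deliberate symlink cycle.
demo=$(mktemp -d)
mkdir -p "$demo/pool/universe"
ln -s "$demo/pool" "$demo/pool/universe/loop"   # cycle back to an ancestor
# With -L, GNU find follows symlinks and reports any filesystem loop on
# stderr; a clean tree produces no such diagnostics.
loops=$(find -L "$demo" 2>&1 >/dev/null | grep -c 'File system loop' || true)
echo "loops detected: $loops"
rm -rf "$demo"
```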
fricklerinfra-root: we're down to < 200 zuul config errors now due to various cleanups and there's more pending with the eom-eol transitions. so it would be great to also be able to tackle some of the big non-openstack offenders like https://zuul.opendev.org/t/openstack/config-errors?project=starlingx%2Fzuul-jobs&severity=error&skip=0&limit=50 and x/packstack, anyone willing to help with nagging the relevant 13:52
fricklerfolks?13:52
fungifrickler: looks like packstack is part of rdo, so maybe we can find someone from their crowd to get it back on track (or retire it if development has moved elsewhere)14:07
frickleryes, my hope was that we might have enough redhat people around such that they could handle this kind of thing internally (looking at no infra root in particular ;-D)14:14
ykareljcapitao[m], karolinku[m] if you can check those ^14:50
jcapitao[m]frickler: wrt Packstack you are referring to https://zuul.opendev.org/t/openstack/config-errors?project=x%2Fpackstack&severity=error&skip=0&limit=50 ?14:55
jcapitao[m]thanks ykarel for the ping14:55
fungijcapitao[m]: correct. they could be solved through eol of the affected branches or adjustments to job configs on those branches15:12
fungiinteresting that it has both a stable/yoga and unmaintained/yoga branch15:12
fungilooks like it transitioned to unmaintained/yoga but stable/yoga never got removed15:13
jcapitao[m]hmm those errors were already fixed15:13
fungiin each stable branch or just master? (master isn't reporting errors)15:14
jcapitao[m]hmm actually no I misread15:16
jcapitao[m]lemme fix that by EOLing most of them and fixing the active stable branches15:16

fungisounds great, thanks!15:16
fricklercool, progress \o/15:27
clarkbGerrit does appear to be pruning log files after all. The day offset between the two sets of logfiles persists and it seems to be doing a couple more days than 30 (I think it is only counting compressed files, not the current and yesterday's uncompressed files)15:48
clarkbI think that is probably good enough and we can consider that done and land the followup to remove the cron from ansible completely if others agree15:49
clarkbhttps://review.opendev.org/c/opendev/system-config/+/937278 is the change for that15:49
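[The exact command of the retired cron isn't shown in this log, so this is only a guess at its general shape: delete compressed logs past a retention window while leaving the current uncompressed log alone. Path, filenames, and the 30-day retention are hypothetical; Gerrit's built-in log rotation now covers this:]

```shell
# Hypothetical reconstruction of a log-pruning cron: remove *.gz logs older
# than 30 days. A synthetic log directory stands in for Gerrit's logs/ dir.
logdir=$(mktemp -d)
touch -d '40 days ago' "$logdir/error_log.2024-11-01.gz"   # should be pruned
touch -d '5 days ago'  "$logdir/error_log.2024-12-10.gz"   # should survive
touch "$logdir/error_log"                # current uncompressed log: untouched
# -mtime +30 matches files modified more than 30*24h ago; only .gz files.
find "$logdir" -name '*.gz' -mtime +30 -delete
ls "$logdir"
```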
fungii already voted in favor, happy for you to self-approve it if you don't think it's likely to get any additional feedback15:52
clarkback thanks. It should be a noop at this point as the cronjob is gone15:57
clarkbbut I'll triple check that before approving15:57
opendevreviewClark Boylan proposed opendev/system-config master: Switch mailman role to docker-compose exec  https://review.opendev.org/c/opendev/system-config/+/93779015:59
fungii'm disappearing for a bit to run some pre-travel errands and grab lunch, but should return in an hour or so16:03
opendevreviewClark Boylan proposed opendev/system-config master: Update Gerrit db container to use journald logging  https://review.opendev.org/c/opendev/system-config/+/93779116:05
opendevreviewJoel Capitao proposed openstack/project-config master: Authorize packstack-core to force push to remove branch  https://review.opendev.org/c/openstack/project-config/+/93779216:06
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764116:08
clarkbnow to see if lists and review are happy with noble docker compose and podman16:08
opendevreviewJoel Capitao proposed openstack/project-config master: Authorize packstack-core to force push to remove branch  https://review.opendev.org/c/openstack/project-config/+/93779216:17
fricklerthe config error fix for shade is failing CI as miserably as I expected. if people would review it anyway, I would just go ahead and force-merge it? then we can ignore that repo again until something really serious comes up https://review.opendev.org/937788 (cc gtema)16:20
clarkbfrickler: isn't shade dead and rolled into openstacksdk? I wonder if we should just remove it from zuul instead?16:22
fricklerclarkb: I can try that, yes, though I'm not sure how many references old stable branches might still have16:34
gtemafrickler - +w-ed the change, clarkb - indeed, we could just drop zuul conf from the repo16:34
clarkbfrickler: oh I meant remove it from the zuul tenant config not cleanup the config in shade itself16:47
clarkbthough you could do both16:47
fricklerclarkb: I read that as dropping it from the zuul tenant config, too, that can still trigger issues for other repos that reference it. let me just push a change to test it17:00
clarkboh I see what you mean. I thought you meant needing to clean out .zuul.yaml in all stable branches for the repo17:01
opendevreviewDr. Jens Harbott proposed openstack/project-config master: DNM: Drop openstack/shade from zuul config  https://review.opendev.org/c/openstack/project-config/+/93779717:02
fricklerI guess if we want to really proceed with ^^ we'll need to split it and also include a governance update, but let's see what zuul says first17:02
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764117:05
fricklera second review on the stack at https://review.opendev.org/c/openstack/project-config/+/935696 would be nice17:06
clarkbfrickler: not sure what the commit message means there, the project moved from some other zuul tenant to the openstack tenant?17:08
clarkblooks like we didn't move tenants we just added it to gerrit and zuul17:09
clarkboh there are two repos, one moved from the vexxhost to the openstack tenant and the other is a new repo, that explains my confusion17:10
opendevreviewJoel Capitao proposed openstack/project-config master: Authorize packstack-core to force push to remove branch  https://review.opendev.org/c/openstack/project-config/+/93779217:11
frickleroh, I missed that the second governance patch is still pending, sorry :-(17:22
clarkbit's fine, now we're ready to land things on our side with a quick recheck once governance is sorted17:22
clarkbcool I think 937792 shows mailman and gerrit stuff working happily too. Zuul is the last big one that I've been avoiding because I know we rely on the docker exec containername process in those plays/roles quite a bit and mechanically I know we can convert them to a compatible system but it's still a bit of work to get through17:38
clarkbanyway at this point we've probably got enough data to discuss if we like the approach, have any more concerns we want to test through, etc. I'll make sure that is part of tomorrow's meeting agenda17:38
fungii'll be in the middle of crazy holiday highway traffic during the meeting, but i'm advance registering my preference for that future direction17:55
clarkbfungi: if you have time before you're driving can you review the changes under topic:podman-prep even if it isn't a full review just check out the shape of things and call out any concerns if you have them18:02
fungiyou bet18:07
clarkbI have confirmed that the cronjob on review02 appears to be gone. I will approve the change to remove it from ansible now18:12
fungithanks!18:14
clarkbI'm dropping gerrit 3.10 stuff from the meeting agenda too as ^ was the last thing remaining related to it18:25
fungihaving heard no objections so far, i'll plan to blow away the contents of the ubuntu-ports volume and let our script bootstrap it from scratch again. at least the impact of it being stale for another ~week will likely continue to go unnoticed18:26
clarkbfungi: in theory that won't break running jobs since the ro copy will stay as is for now?18:26
fungiif anybody strongly disagrees with that approach and wants to try their hand at troubleshooting the present state of the mirror, feel free18:26
fungiclarkb: correct, we'll continue serving the old (stale) state until it's done18:27
clarkbthat is my only real concern18:27
fungito be fair, nobody brought it up until i happened to notice it after fixing the stale state of our regular ubuntu mirror (presumably because arm64 jobs aren't as closely scrutinized)18:28
fungii'm just hoping to get it back to working before it becomes a job-affecting issue18:28
clarkb++18:29
fungibut this time of year it seems like we can probably afford to wait the near-week that bootstrapping and vos releasing it from zero will require18:29
clarkbI'm also going to drop backup server pruning/purging from the agenda. I think that reached a reasonable conclusion last week (though we can continue to apply ot to the other backup server when it starts to fill up)18:29
fungisgtm18:29
clarkbok those agenda edits are in. I'll send it out later today after others have a chance to chime in on any other edits18:37
fungii've started recursively deleting all the contents of /afs/.openstack.org/mirror/ubuntu-ports18:39
clarkbI guess I should add a note about ubuntu-ports mirroring18:40
fungionce it completes, i'll start another run of the usual script which should hopefully redownload everything and repopulate the databases18:40
fungiplease do, just be aware i won't be around for that discussion18:40
clarkbyup mostly a "be aware of this situation" thing more than expecting you to chime in tomorrow18:41
fungithat has already finished, starting the script now18:52
fungiit's running in a root screen session on mirror-update, and i'll try to check in on it from time to time over the course of the week18:56
opendevreviewMerged opendev/system-config master: Drop Gerrit log cleanup cron from Ansible  https://review.opendev.org/c/opendev/system-config/+/93727819:33
clarkbhttps://zuul.opendev.org/t/openstack/build/f05e379479794954bab4319521a221a6 zuul reports ^ deployed successfully19:37
clarkb(it should be a noop)19:37
fungiexcellent!19:38
clarkblooks like we're approving the prep changes. One thing to note about that is for some services we may restart automatically and others we may not19:44
fungiyes19:44
clarkbI'll have to take stock of that once things land and work through what doesn't (I know gerrit won't for example)19:45
fungiseems like things are pretty quiet this week, so manual restarts are probably doable whenever is convenient19:45
clarkblooks like gitea won't either19:45
clarkbyup its a good time to work through things like that19:45
fungithe sooner the better as far as i'm concerned19:46
clarkbI think lodgeit may automatically restart19:46
fungimailman should as well19:46
fungibut worth double-checking19:46
clarkbin mailman's case we are only changing the config management checks19:47
fungiit may not restart if images don't change (which they probably won't)19:47
clarkbso I think those should noop19:47
fungiagreed19:47
fungisince we build our own mailman container images, the only unknown is if the upstream mariadb container images change i guess19:48
clarkbya19:49
clarkbthe main thing I think to be on the lookout for is syslog -> journald logging change and we apply that to paste, gitea, and gerrit19:50
clarkbI think paste may be automatic but not gitea or gerrit19:50
clarkbI can work through manual restarts of gitea later today then see where we are at before doing gerrit. That might happen tomorrow19:50
opendevreviewMerged opendev/system-config master: Refactor check for new container images  https://review.opendev.org/c/opendev/system-config/+/93765519:58
clarkbthat change is deploying to gitea09 now which should noop unless there is a new mariadb image20:00
clarkbthen later when the logging change lands I can manually restart things20:00
clarkbI have image and container listings done on gitea09 and gitea10 that I will check when the job is done just to confirm the noop20:00
clarkband then I'll work on lunch20:01
clarkbthe gitea job completed and as far as I can tell we did not grab new images and did not restart services (the expected behavior)20:06
clarkbfungi: I've just realized that the string change in https://review.opendev.org/c/opendev/system-config/+/937717/4/playbooks/roles/gitea/tasks/main.yaml may append a newline to those cron commands20:10
clarkbin theory this isn't an issue but I'm not positive of that20:10
clarkbso that will need checking after it lands I guess20:11
clarkbI think worst case ansible might put a stray newline in the crontab file which is fine20:11
fungii'll check once it deploys20:11
clarkbmeetpad didn't get new images or restart as expected so that check seems to be working in the no-new-images case20:11
fungii've made a dump of the root crontab on gitea09 for comparison20:12
fungii'll diff it after20:13
clarkbcool, based on other command blocks I don't think the command module is bothered by the newline at the end, so it's just the cron entry in there that I'm slightly concerned about20:13
clarkbactually we may have examples of that elsewhere /me looks20:14
clarkbfungi: the gitea db backup cron uses > and not >- (I think >- eats the newline) and that cron job seems to be fine. Your diff will hopefully confirm20:15
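[The `>` vs `>-` distinction being debated here is YAML block-scalar chomping; a minimal illustration with made-up values:]

```yaml
# Folded block scalars, as used for the cron command strings:
keeps_newline: >      # ">" clips: inner newlines fold to spaces, ONE trailing \n kept
  docker-compose -f compose.yaml
  exec -T gitea-web true
strips_newline: >-    # ">-" strips: same folding, no trailing newline at all
  docker-compose -f compose.yaml
  exec -T gitea-web true
# keeps_newline  == "docker-compose -f compose.yaml exec -T gitea-web true\n"
# strips_newline == "docker-compose -f compose.yaml exec -T gitea-web true"
```

Either form is harmless in a crontab entry (cron tolerates a trailing newline), which matches the "worst case is a stray newline" conclusion above.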
clarkband now I need to make lunch20:15
opendevreviewMerged opendev/system-config master: Use docker-compose for container execs in gitea  https://review.opendev.org/c/opendev/system-config/+/93771720:22
opendevreviewMerged opendev/system-config master: Switch mailman role to docker-compose exec  https://review.opendev.org/c/opendev/system-config/+/93779020:22
opendevreviewMerged opendev/system-config master: Log with journald and not syslog in lodgeit docker compose  https://review.opendev.org/c/opendev/system-config/+/93765620:22
clarkbfungi: I think it did what we want and chomped it in ansible somewhere20:31
clarkbhowever now I notice the old command ran docker exec -t which means with a tty and the new one is docker-compose exec -T which means without a tty20:32
clarkbI'm not sure if that matters or not for running git garbage collection20:32
clarkbthe original cron job used -t, so it wasn't something we added later to make things happy. corvus you wrote that original cron job, not sure if you remember if that was needed or not20:34
fungi/var/spool/cron/crontabs/root was last updated 6 minutes ago on gitea0920:34
clarkbwe can probably run the new cronjob manually now if we don't want to wait for it to run and possibly email us angrily20:34
clarkbfungi: huh I just did crontab -l earlier and it showed the new content then20:35
clarkbit still shows the new content so not sure what made it update20:35
fungi-55 16 * * */2 docker exec -t gitea-docker_gitea-web_1 find /data/git/repositories/ -maxdepth 2 -name *.git -type d -execdir git --git-dir={} gc --quiet \;20:35
fungi+55 16 * * */2 /usr/local/bin/docker-compose -f /etc/gitea-docker/docker-compose.yaml exec -T gitea-web find /data/git/repositories/ -maxdepth 2 -name *.git -type d -execdir git --git-dir={} gc --quiet \;20:35
clarkbyup that's what I see from the crontab -l output and I think that all looks good, except maybe we need to drop the -T if the old -t was necessary20:36
clarkbfungi: are you maybe in a position to manually run the new cronjob in a screen and see if it works as is?20:36
fungisure, just a sec20:36
fungiin progress in a root screen session on gitea0920:37
clarkbprobably need to check $? after it returns to see if it was happy or not20:38
fungii planned to, yeah20:38
fungiit seems to take a few minutes to complete20:38
opendevreviewMerged opendev/system-config master: Update gitea containers to use journald logging  https://review.opendev.org/c/opendev/system-config/+/93765720:38
clarkblodgeit did restart and the logging seems to still go to /var/log/containers so I think that is looking good20:40
clarkbhttps://paste.opendev.org/show/br9trC10ppXbuoWBgNaW/ and I made a test paste20:41
clarkbthe giteas do not appear to have restarted for the syslog -> journald change (expected). The haproxy for opendev.org did restart (expected)20:47
clarkbthe haproxy for zuul is about to restart if it hasn't yet20:48
fungigc on gitea09 is still going20:49
clarkbif you ps -elf | grep gc you can usually catch the repo it is doing20:50
clarkbit does seem to be making progress at least20:50
clarkbzuul lb looks happy and did restart20:50
fungiyeah, it's on openstack/tacker so should finish soon i hope20:50
clarkbI think the only unknowns are if git gc is happy (have to check exit code) and restarting services for gitea and gerrit to pick up the new logging config20:51
fungiwell, openstack/cinder now, so not alpha order20:51
clarkbonce gc is happy I'll work on restarting the gitea services on the 6 backends to pickup the new logging config20:52
fungiwell, i don't really know how long the git gc will take20:53
clarkbmy guess is it will be done by the time I'm done with lunch stuff20:54
clarkbanother 10 minutes or so?20:54
clarkbfungi: but I can't stop the gitea container the gc is running in without stopping the gc20:54
clarkbso I have to wait for gitea09 anyway. I could start on 10-14 though20:54
fungiokay, it just finished20:55
fungiroot@gitea09:~# echo $?20:56
fungi020:56
clarkbperfect I guess the old -t was not needed so having -T is correct20:56
fungiyeah. lgtm20:56
clarkbfungi: I detached from the screen and you can close it whenever you like20:56
fungidone20:56
clarkbfungi: you may want to double check the bits of the mailman playbook that I updated too20:57
corvusclarkb: sorry i don't know if the "-t" was required.  my feeling is: whatever works, works, and sounds like fungi has established that.  :)20:57
clarkbcorvus: ++20:57
clarkbfungi: probably need to look at the logs on bridge for that20:57
clarkbI need to relocate back to the office then will work on service restarts to pick up the logging change on the giteas20:58
opendevreviewMerged opendev/system-config master: Update Gerrit db container to use journald logging  https://review.opendev.org/c/opendev/system-config/+/93779121:00
clarkbok processing gitea14 first and will work backward through the list. I'm pulling servers out of the load balancer before I restart them too21:03
clarkbthat all looks good (logs still go to /var/log/containers and containers started without complaint as far as I can tell) I'll proceed through the list21:05
clarkbthat is done21:16
clarkbI'm now of two minds: A) go ahead and restart gerrit once 937791 applies to pick up the journald logging change (I think hourly jobs are currently running so it hasn't applied yet) or B) wait until tomorrow anyway and just check that the giteas and paste don't have anything unexpected from journald logging21:17
clarkbactually looks like the gerrit deploy for journald logging went before the hourly jobs21:21
clarkbdouble checking the disk contents confirms21:22
clarkbI think I'm leaning towards doing the gerrit restart later21:22
clarkbbut happy for others to override me on that and just get it done21:23
clarkbfrom gitea09 /dev/vda1       155G   47G  109G  30% / if we wait on gerrit we can see if that moves significantly21:26
clarkb/dev/root                  39G  8.6G   31G  23% / is paste21:27
tonybjcapitao[m], karolinku[m]: I'm sorry with PTO and being unwell I let the ball drop on the CentOS-10 issues.  care to update me on current state and let me know what help you need from me, if any?22:13
clarkbtonyb: my understanding is the rhel 10 (and also centos 10 stream) have decided that x86-64-v3 is the minimal level of hardware support for those distro releases22:15
clarkbtonyb: problem is very few of our clouds (if any) currently provide hardware with those capabilities (particularly avx is an issue)22:16
clarkbI think of our clouds maybe vexxhost, raxflex, and openmetal can provide that level of cpu capability, but they are far from a majority of available resources and their flavors may not provide those features yet, so another round of updates to those bits may be required22:17
fungias for openmetal, we may need to make our own nova config adjustments to expose that flag, i haven't checked22:17
clarkbmy personal take on this is that red hat has chosen poorly by getting ahead of the cloud providers and what is testable in the wild; other distros are taking very different approaches (suse mixes in v3-capable compiled software on top of the normal distro and alma linux is doing a v2-capable rebuild alongside the default of v3)22:18
fungiregardless, yes, the bulk of our quota comes from regions in ovh and rackspace classic, neither of which support centos 10 guests22:18
clarkbI think that red hat should be working with cloud providers to address this and not using us as a proxy. I don't feel this is a fight we should be involved in22:18
fungi(unless red hat backtracks on their choice of compiler options)22:19
clarkbfungi: I found a post from alma that made it seem that was unlikely let me see if I can dig that up again22:19
clarkbI'm not sure if rockylinux is doing anything different NeilHanlon may know22:19
tonybclarkb: Yeah the minimum bump and implementation thereof was flagged, and it appears ignored.22:20
fungiit's possible red hat is assuming that once centos 10 and rhel 10 release, cloud providers will be forced to upgrade their infrastructure. in the meantime though, we can't really assist with testing it22:20
clarkbhttps://almalinux.org/blog/2024-10-22-introducing-almalinux-os-kitten/ has a "AlmaLinux OS Kitten includes an additional build using x86-64-v2" section22:20
tonybI doubt very much that there will be a back-track.22:20
clarkbya I don't expect a backtrack. More that I don't want to be in the middle expected to solve these problems22:21
fungithen hopefully red hat and the centos community collectively are prepared for it not to be usable in lots of existing places22:21
clarkbif red hat wants to work with clouds to make their distro work in those clouds we'll take advantage of it22:21
tonybSo assuming that we have a cloud that can support v3, we could provide a nodepool label, like we do with nested-virt, for some testing of CentOS-10, or is that not-okay22:22
clarkbso far I'v ebeen operating under the assumption that that isn't ok22:23
clarkbthe reason is if that cloud goes away we can no longer run centos 10 at all22:23
fungi1. someone needs to figure out where those are, and 2. it's likely to represent a very small proportion of our available quota22:23
clarkbwhereas with nested virt if that goes away all of our platforms continue to work only specific jobs don't22:23
clarkband even those specific jobs may work just more slowly22:23
fungi"support" for centos 10 testing might be on equal (or worse) footing to arm testing22:24
clarkbthat also means taking a stance that centos 10 gets to run on only our fastest resources22:24
clarkbwhich I think is unfair from a general scheduling perspective22:24
tonybAh I see the distinction.  I admit I wasn't thinking far enough ahead22:24
clarkbas far as determining where we can run these things we should be logging cpu flags via /proc/cpuinfo captures in every job now22:26
tonybI was thinking only about the DIB aspect, as in making sure that DIB would work with CentOS-10, not the general, now we have images let's run them22:26
clarkbso it should be possible to grab those from jobs that run in every cloud region and see which, if any, of them have the required flags to support v322:26
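[A minimal sketch of the /proc/cpuinfo comparison described above. The flag list is a representative subset, not the authoritative x86-64-v3 definition, and kernel flag naming varies (as noted below, lzcnt may be advertised differently), so treat it as illustrative:]

```shell
# Report which (of a representative subset of) x86-64-v3 flags are missing
# from a /proc/cpuinfo-style flags string.
check_v3() {
    flags=" $1 "
    missing=""
    for f in avx avx2 bmi1 bmi2 fma f16c movbe; do
        case "$flags" in
            *" $f "*) : ;;                       # flag present
            *) missing="$missing $f" ;;          # flag absent
        esac
    done
    if [ -z "$missing" ]; then
        echo "v3-capable (for the checked subset)"
    else
        echo "missing:$missing"
    fi
}
# On a real host you would feed it the live flags line, e.g.:
#   check_v3 "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
check_v3 "fpu sse sse2 avx avx2 bmi1 bmi2 fma f16c movbe"
# -> v3-capable (for the checked subset)
check_v3 "fpu sse sse2 avx"
# -> missing: avx2 bmi1 bmi2 fma f16c movbe
```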
fungii don't personally object to folks working on getting it going, but be aware that from an established public cloud perspective it seems beyond bleeding-edge22:26
fungiwhich is an unusually out-of-character decision for red hat22:27
fungithough maybe the shift in how they look at centos explains it (they probably aren't expecting users to want to boot rhel 10 on public clouds any time soon)22:27
clarkboh hrm I thought we merged the change to get cpuinfo in zuul-jobs but now I'm not finding that in our regular job logs /me digs more22:29
fungithere was discussion of adding it to the zuul/zuul-jobs role that collects routes and disk utilization22:30
clarkbyes I thought that landed22:30
tonybYeah I thought the cpuinfo stuff merged22:31
clarkbhttps://review.opendev.org/c/zuul/zuul-jobs/+/937376 it did land now to find the info in the logs22:31
clarkbhttps://zuul.opendev.org/t/openstack/build/76b63ed3066146d69c9901ef55427e74/console#0/3/5/debian-bullseye22:32
clarkbmaybe we don't write it to the host info file but its there in ansible?22:32
clarkbwe are missing osxsave and lzcnt in ^ which was an openmetal node22:34
clarkbapparently intel considers lzcnt part of bmi1 which we do have but it advertises the feature separately so not sure if we actually have it or not22:35
fungii think there's a filter in the nova/libvirt config that has to include flags we want passed through to guests?22:36
clarkbfor adding centos 10 stream support to dib we can't even run the functests because they chroot and expect executables in the chroot to run iirc :/22:36
fungiso would need job nodes that are capable of running centos 10 binaries22:37
clarkbfungi: the config is a bit more complicated than that. When using kvm (not qemu) you pick host passthrough or a custom model. You can pick from predefined custom models or define your own in libvirt22:37
clarkbfungi: yes I think having any testing for centos 10 in dib requires us to have test nodes capable of running centos 1022:37
fungigot it, so still possible our cpus in openmetal would work and "just" need config adjustments22:37
clarkbalso those cpu models are per hypervisor / nova compute setup not a flavor thing iirc22:38
clarkbfungi: ya or /proc/cpuinfo doesn't report lzcnt because it's part of bmi1 and osxsave is under some other flag and we're good in openmetal already22:38
fungioh, that's a good point22:38
fungii don't know where to find the documentation that would confirm or refute that though22:39
fungiand would probably resort to just trying to run something there instead and see if it works22:39
clarkbya problem with that is it works until you find some other piece of software relying on a feature you thought was good but didn't actually check22:39
clarkbI wonder if there is a tool in linux to get a report of level and what is missing for other levels22:40
tonybYeah okay, that matches what I thought WRT nova and what my testing in openmetal+vexxhost indicated.22:40
tonybclarkb: I expect there is but I don't know of one off the top of my head.22:41
clarkb`ld.so --help` reports it22:43
clarkbbut not what is missing from unsupported levels22:43
clarkbalso we know that some cloud regions don't have consistent levels but we can probably ignore that for now if we establish a baseline22:44
tonybSort of, it reports the variants it supports and which are detected, so if your libc *only* supports v3 you essentially get nothing useful there22:44
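[The loader query above can be sketched defensively; this assumes glibc 2.33+ (older ld.so doesn't understand --help), and probes a few common loader paths since they differ between distros:]

```shell
# Ask the glibc dynamic loader which x86-64 microarchitecture levels it
# considers supported/searched on this machine, falling back gracefully.
report_levels() {
    for ld in /lib64/ld-linux-x86-64.so.2 /lib/ld-linux-x86-64.so.2 \
              /usr/lib64/ld-linux-x86-64.so.2; do
        if [ -x "$ld" ]; then
            # glibc >= 2.33 lists glibc-hwcaps subdirectories like
            # "x86-64-v3 (supported, searched)" in its --help output.
            "$ld" --help 2>/dev/null | grep -E 'x86-64-v[234]' && return 0
            echo "loader present but reports no x86-64 levels"
            return 0
        fi
    done
    echo "no x86-64 glibc loader found"
}
levels=$(report_levels)
echo "$levels"
```

As tonyb cautions above, this reflects what the installed libc was built to detect, so it complements rather than replaces the raw /proc/cpuinfo check.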
clarkbthe mirror on openmetal seems to report v3 is supported and searched22:48
clarkbtonyb: are you saying that ld.so is reporting what it was compiled to support, not what the hardware supports?22:49
tonybclarkb: That's my understanding22:49
clarkbthe jammy mirror in openmetal says v4 is supported but the noble mirror in raxflex does not22:50
tonybHmm okay.  that's confusing :/22:51
clarkbthe focal mirror in both vexxhost regions just errors22:52
clarkbcannot load shared object etc22:52
clarkbfrom that I do suspect that openmetal and raxflex support it22:55
tonybOh I thought vexxhost did too.22:56
clarkbovh mirror says searched and supported22:58
clarkbso maybe the main issue is rax and/or vexxhost? (just haven't been able to get data from vexxhost yet)22:58
clarkbor there is plenty of variance ?22:58
tonybI suspect that RAX for sure has enough variance to be a problem23:00
clarkbtonyb: the nested virt label did not work in initial testing with dib support and that includes raxflex, vexxhost, openmetal, and ovh iirc23:01
clarkbso at least one of them doesn't work. Which may also mean maybe this method of checking is invalid23:01
tonybclarkb: Yeah, I thought that was for $other reasons though23:02
clarkbyou mean a problem other than v3 cpu support?23:02
clarkbunderstanding why the nested virt label didn't work is probably a good starting point since that constrains the problem space a bit23:05
tonybYeah I thought the job did something funky because it had nested support and the $funky failed, but I could easily be wrong23:05
tonybI have a small bash script that should (untested) report which flags are missing23:07
tonybwell it's untested in that the laptop I wrote it on has all the tested flags23:07
clarkbmy local jammy fileserver reports v2 only supported and searched and not v323:13
clarkbwhich I think is accurate for that cpu so this detection method is at least sort of working23:13
clarkbya no avx on that system23:13
tonybCan you run: https://paste.opendev.org/show/b0Zw2AdKxbaInIvhhBLu/ on it ?23:14
tonybnote the final flag "cve12" is my bogus flag to verify that it does fail to detect a flag23:15
clarkbtonyb: ya give me a few (I want to understand it before running it). I also notice that qemu emulation of haswell features requires qemu 7.2 or newer23:17
clarkbwhich may be a problem for the dib tests that build an image and check it (I don't know what version of qemu those currently have)23:17
tonybYeah I don't know about qemu versions either23:18
clarkbTIL here strings23:21
tonybclarkb: you're welcome?23:24
clarkbtonyb: what is the flags="${flags## }" for? It seems to result in the same string at the end as the previous step for me23:25
clarkbtonyb: also I haven't confirmed yet but I think your script may not match flags at the beginning or end of the flags string due to the requirement for the spaces on either side?23:25
tonybjust in case the previous line left spaces at the beginning of the flags23:25
clarkbah ok but I think you do want a space at either end of the list for the case matching?23:26
clarkboh you embed that in the case statement on both sides, nevermind23:26
tonybthe case " ${flags} " in ensures they're there23:27
tonybIt's probably a little stupid to do it that way but I was rushing23:27
clarkbtonyb: local msg is unneeded. I ran it without the v4 detection and got all v1 and v2 are found but most of v3 is not found23:30
clarkbso I think it is working23:30
tonybOkay cool.  Thanks23:30
clarkbmy cpu is from 2016 fwiw23:31
clarkband my not very old laptop and desktop don't support v4 either23:32
clarkbbecause they are amd and like one generation too old23:33
tonybOkay.  I might make a patch for DIB to call that to aid with debugging.23:41
clarkbtonyb: maybe capture the qemu version too. Though I suspect we can infer that by checking the packaged version for the distro after the fact23:44
clarkbcrazy idea time: do everything on arm23:46
tonybLOL23:51

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!