Monday, 2025-03-03

amorinhey corvus Clark[m], I just received a highlight on my name. I partially understand that you want some custom flavors?08:01
frickleramorin: iiuc we have a single custom flavor with 8gb RAM currently, and the question would be whether we could additionally have the same flavor, but with 4gb and 16gb respectively, similar to these https://opendev.org/opendev/zuul-providers/src/branch/master/zuul.d/providers.yaml#L14-L22 . but maybe wait for corvus or Clark[m] to confirm before taking any action09:47
amorinack09:47
corvusamorin: yep, the question is exactly as frickler said.  i seem to recall the current flavors are special so that they are scheduled onto specific hypervisors, so obviously we don't want to do anything without checking with you.  but we have some projects that would be able to run jobs more efficiently with a little more ram, and we're hoping that we could also use smaller nodes on some simple jobs to balance that out, at least a little bit. so14:44
corvuswe were wondering if it would be possible to have 4 and 16gb flavors.14:44
*** priteau is now known as Guest1054914:46
*** priteau2 is now known as priteau14:46
amorinI will check that with the team15:15
amorinwe currently have a CPU overallocation of 2, do you agree to keep that, increase / decrease?15:41
amorin(with the flavor ssd-osFoundation-3)15:41
Clark[m]the main limitation of those flavors seems to be iops and not cpu allocation. So that is probably fine?15:45
fungii don't think we're looking to change any existing oversubscription, though it's good to keep in mind that when we use >50% of our quota we're probably slowing our cpu performance for all jobs there proportionally15:45
fungiwhich reinforces some of the "we're our own noisy neighbor" observations we've made in the past15:46
Clark[m]++15:46
Clark[m]I'm in the middle of doing local updates for my desktop and my irc bouncer. Hoping to be back on IRC in the not too distant future (just fyi that my morning is going to have a very slow start)15:46
amorinI would recommend staying at 2 as well, it would allow us to consume more RAM15:47
amorinor I can reduce the number of CPU given per instance15:47
amorinI am building the figures in an etherpad, will share when ready15:48
Clark[m]corvus: ^ should we reduce the cpu count for the 4GB flavor?15:48
Clark[m]amorin: that is great thanks15:48
Clark[m]I don't think we can change the cpu count for the existing 8GB flavor as that may have unexpected consequences for existing jobs. But maybe 4vcpu 4GB on the smaller flavor is ok?15:48
corvusare we 8cpu/8gb now?15:49
amorinyes15:49
amorinthat's far from optimal15:49
amorinwe have much more memory on the compute15:49
corvusyeah, 4vcpu for the small one sounds fine.15:50
fungisimilarly should we have a higher cpu count on the 16gb flavor, or keep that at 8vcpu?15:54
corvusi think for jobs the memory is more important here, and it sounds like we're already out of balance the "wrong" way on that, so 8/16 might improve the balance?15:56
fungiwfm15:57
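The balance argument above (8 vCPU with 8 GB leaves memory idle, 8 vCPU with 16 GB uses more of it) can be checked with a little arithmetic. The following is a minimal sketch only: the hypervisor size (32 cores, 256 GB RAM) is a hypothetical figure not taken from the log, and the flavor labels are descriptive placeholders rather than real flavor names. Only the overallocation ratio of 2 and the vCPU/RAM pairs come from the discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flavor:
    label: str   # descriptive placeholder, not a real flavor name
    vcpus: int
    ram_gb: int


def instances_per_host(flavor, host_cores, host_ram_gb, cpu_ratio=2.0):
    """How many instances fit on one host, limited by vCPUs or by RAM."""
    by_cpu = int(host_cores * cpu_ratio) // flavor.vcpus
    by_ram = host_ram_gb // flavor.ram_gb
    return min(by_cpu, by_ram)


def ram_used_fraction(flavor, host_cores, host_ram_gb, cpu_ratio=2.0):
    """Fraction of host RAM consumed when the host is packed with this flavor."""
    count = instances_per_host(flavor, host_cores, host_ram_gb, cpu_ratio)
    return count * flavor.ram_gb / host_ram_gb


if __name__ == "__main__":
    # Hypothetical hypervisor: 32 physical cores, 256 GB RAM (assumption).
    HOST_CORES, HOST_RAM_GB = 32, 256
    for flavor in (
        Flavor("8vcpu/8gb (current)", 8, 8),
        Flavor("4vcpu/4gb (small)", 4, 4),
        Flavor("8vcpu/16gb (large)", 8, 16),
    ):
        n = instances_per_host(flavor, HOST_CORES, HOST_RAM_GB)
        frac = ram_used_fraction(flavor, HOST_CORES, HOST_RAM_GB)
        print(f"{flavor.label}: {n} instances, {frac:.0%} of host RAM used")
```

Under these assumed host numbers, CPU is the limiting resource for all three shapes, and only the 8 vCPU/16 GB shape raises the share of RAM that actually gets used.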
amorinhttps://etherpad.opendev.org/p/ovh-flavors15:57
amorinhere is what we propose15:58
corvusi worry that if we change the 8gb ram flavor to lower the vcpu count, that will adversely affect the current jobs -- but if we can keep the current flavor as well, then we can continue to use that, and try out the oif-4-8-80 flavor to see if that's okay.16:03
corvusso -- is it okay to have and use all 4 flavors?  :)16:04
amorinwe will keep the old flavor, yes. One downside is that the old one is not going to mix with the new ones: I mean, if some instances are spawned on some hypervisors, they will prevent new instances from being scheduled there.16:05
corvuswell, hrm, that could be a problem.  we'd want to continue using the current flavor for almost everything, with just a few projects occasionally launching nodes with the new flavors.  this is the first time we've ever run with heterogeneous flavors, so we're just starting to experiment.16:08
corvus(so in practice, i think that means we'd probably never be able to schedule the new flavors, unless the system is nearly idle)16:09
Clark[m]maybe just start with the 16gb and 4GB for now? and limit the exposure. Though now that I've written that I'm not sure that is much of a difference16:20
amorinok, so, that's not as easy as that16:20
amorinlet me explain why we can't mix, and we can end up having another solution16:21
fungicould maybe add a new equivalent to the old flavor, and then we can switch our primary use to it?16:21
amorinI pasted a bunch of explanation at the end of the etherpad16:29
amorinwith a possible new proposal16:30
fungithanks!16:35
clarkbok I think I'm back on IRC now16:50
fungiwelcome back!16:56
fungiinfra-root: over the weekend i tried to get the rest of the rackspace flex region/project updates proposed, if anyone has time to look them over and let me know what may still be missing (e.g. swift bits for niz image uploads?): https://review.opendev.org/q/hashtag:flex-dfw316:59
corvusamorin: i'm still unclear about the flavor thing -- i left another question in the etherpad17:02
corvusbut basically, i'm confused about osFoundation-4 -- i don't know if the new proposal is to have the three new "oif" flavors plus one more new "osFoundation" flavor, or if the new proposal is to only have osFoundation flavors and no oif flavor17:04
clarkbfungi: I made a note about quotas and increasing max-servers on https://review.opendev.org/c/openstack/project-config/+/943106 the old sjc3 value of 32 looks appropriate for the new region and tenants17:12
fungii also just realized i haven't proposed cleanup changes for the old mirror server and project17:12
fungiworking on that now17:12
clarkbediting the cloud launcher details to have a comment about setting mtu to 1500 might be a good followup too (it's less urgent though)17:15
fungiah, good point17:15
clarkbalso as a reminder heads up I have to pop out early afternoon today to do passport things with the kids. Both parents must be present to do that and appointments are extremely limited so I'm taking what I can get17:16
fungigood luck!17:19
opendevreviewJeremy Stanley proposed opendev/system-config master: Remove Ansible for old Rackspace Flex SJC3 mirror  https://review.opendev.org/c/opendev/system-config/+/94319517:25
opendevreviewJeremy Stanley proposed opendev/zone-opendev.org master: Clean up DNS for old Rackspace Flex SJC3 mirror  https://review.opendev.org/c/opendev/zone-opendev.org/+/94319617:26
fungithose should be able to go whenever, i think, since we already switched to the new mirror17:26
clarkbI usually remove dns last so that LE jobs don't fail17:27
fungiright, hence the depends-on17:27
fungii just meant we shouldn't need to wait for the other flex project switching changes17:28
fungii'm also working on cleanup for the old hostvars and cloud entries, but that will depend on both of course17:29
clarkback17:30
fungiheading out to run some quick errands, bbiab17:38
corvusi'm restarting the zuul schedulers to pick up some small bugfixes17:39
corvusdone17:46
clarkbI did a quick check against gitea13 and its response times still appear to be stable for me17:48
clarkbNext up I'm trying to find some container image build and service test change that can act as a canary for the theoretically new docker hub limits17:48
clarkbwas hoping to find some real work I can do but etherpad, gitea, and gerrit are all up to date17:49
clarkbmay just use a DNM change17:49
clarkblodgeit is already on quay17:50
opendevreviewClark Boylan proposed opendev/system-config master: DNM this is just a canary for Docker hub rate limits  https://review.opendev.org/c/opendev/system-config/+/94319717:52
fungiokay, back18:16
corvusfungi: i looked into the application token; it's scoped to the project18:30
corvusso we'll need a new one18:30
corvusi can do that real quick18:30
fungigot it. we'll need buckets too, or does zuul/ansible create them?18:30
fungis/buckets/containers/18:31
corvusnew bucket.  i already made it for testing (just to make sure the old cred couldn't see it)18:31
fungione in each region (sjc3 and dfws)?18:32
fungis/dfws/dfw3/18:32
fungifor https://review.opendev.org/94310418:32
corvusjust one -- we only need to upload to one location since it's temp storage18:33
fungioh, got it, actual uploads to each region are in glance18:33
fungicool, thanks!18:33
corvusyeah.  we could round-robin or something later for HA, but we don't strictly need more than one at a time18:33
clarkbhttps://securitytxt.org/ is this /.well-known/security.txt file something we should consider for opendev (and maybe zuul, openstack, starlingx, etc)?18:38
clarkbseems like a good idea to have a standard url path to look for that info in, even if that ultimately points to richer docs elsewhere18:38
opendevreviewJames E. Blair proposed opendev/zuul-providers master: Switch to new rax flex project for image uploads  https://review.opendev.org/c/opendev/zuul-providers/+/94320018:41
corvusfungi: clarkb ^ that can happen any time (including now)18:41
clarkb+2 from me18:41
clarkband ya that's similar to the mirror where we moved it before other resources because it is fine as long as you can access it18:42
fungilooks like the security.txt is normalized in ietf rfc 9116: https://www.iana.org/assignments/well-known-uris/well-known-uris.xhtml18:43
fungiah, the faq at the bottom of the site you linked mentions that18:45
clarkbit's also in the header18:46
clarkbhas a "read the rfc" link18:46
fungioh, so it is, there's a button. my eyes have been trained to gloss over buttons and look for blocks of text18:46
corvusthe contrast on that button is not great18:49
fungii mostly just thought "i wonder if they registered it with iana"18:50
fungiand then checked the iana registry18:50
fungiand it helpfully linked to the ietf rfc18:50
corvuswe have an operator/developer split for the services we run which makes this a little fuzzy18:51
corvuslike, not sure we want to get a security report for gerrit18:51
corvusand the gerrit project doesn't want to hear if we left our passwords out18:52
clarkbagreed. Maybe we just set it on the main domain opendev.org (served by gitea) and don't set it for everything else?18:52
clarkbto minimize that problem (though we may still get gerrit reports since we host a gerrit)18:52
corvus(of course, misrouted calls are not the worst thing, and better than missing it altogether, assuming the parties are all friendly)18:52
fungiwell, if a user finds a vulnerability in the gerrit service running at review.opendev.org and confidentially reports it to us rather than directly to gerrit, i'd say that's not the worst outcome. we can still check whether it's an upstream bug or a problem we've introduced, and then forward the report as needed18:52
corvusexactly; just worth thinking about the best way to encourage it getting it right the first time.  the web site doesn't seem to talk about that.18:53
corvusclarkb: that sounds like a good first step18:54
fungii might even say it's preferable, because if the reporter doesn't have sufficient understanding to figure out themselves whether it's an upstream gerrit problem or an us gerrit problem, i expect we're in a better position to figure that out than the upstream gerrit security folks are18:54
fungiwe don't want opendev configuration issues, for example, being reported to gerrit upstream and then relying on them to get in touch with us about it18:55
corvusthat's a good point18:56
clarkbhow do we provide comment on the request for comments? I do think this is a corner case that they should consider18:57
clarkbthough I'm not sure how to express it in a productive way yet. And maybe we don't have to if we just point in the right direction18:57
fungithis is similar to the distro/upstream relationship, where if a reporter isn't capable or motivated to work out whether a vulnerability they've discovered is present upstream or introduced downstream by the distro, notifying the package maintainer is the better option18:58
fungithen let the package maintainer communicate it upstream if necessary, since they're probably better-connected with them than the original reporter anyway18:59
corvusi went looking for discussions around this and found this:19:00
corvushttps://github.com/securitytxt/security-txt/issues/18519:00
corvusi have no idea what that is about19:00
clarkbgoing back to the docker hub rate limit canary I think the gitea image build and service test job are going to end up being happy in the end. So it hasn't made things catastrophically worse at least. I do note that it is even possible this is a better situation for us19:00
corvusbecause the only content in that gh issue is two dead links to twitter19:00
clarkbwow19:01
clarkbthe title does seem to match what we are discussing but agreed without the twitter context it is hard to say for sure19:01
fungione of those discussions also led me to https://github.com/disclose/dnssecuritytxt19:02
fungithough i don't see any indication that ever made it into a standard19:03
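For reference, a security.txt file as described by RFC 9116 is a short list of "Field: value" lines served at /.well-known/security.txt, with Contact and Expires as the required fields. The sketch below renders such a file. The contact address, the URLs and the function name render_security_txt are hypothetical placeholders, not anything the participants agreed on.

```python
from datetime import datetime, timezone


def render_security_txt(contacts, expires, policy=None, canonical=None):
    """Render a minimal RFC 9116 security.txt body.

    contacts: list of URIs (e.g. mailto: or https:), at least one required.
    expires:  timezone-aware datetime after which the file is stale.
    """
    if not contacts:
        raise ValueError("RFC 9116 requires at least one Contact field")
    if expires.tzinfo is None:
        raise ValueError("expires must be timezone-aware")

    lines = [f"Contact: {uri}" for uri in contacts]
    stamp = expires.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    lines.append(f"Expires: {stamp}")
    if policy:
        lines.append(f"Policy: {policy}")
    if canonical:
        lines.append(f"Canonical: {canonical}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    print(
        render_security_txt(
            contacts=["mailto:security@example.org"],
            expires=datetime(2026, 1, 1, tzinfo=timezone.utc),
            policy="https://example.org/security-policy",
            canonical="https://example.org/.well-known/security.txt",
        ),
        end="",
    )
```

A Policy link is one place where the operator/upstream routing question raised above could be explained to reporters, though whether that is sufficient is an open question in the discussion.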
opendevreviewClark Boylan proposed opendev/system-config master: Mirror selenium/standalone-firefox to Quay.io  https://review.opendev.org/c/opendev/system-config/+/94320219:06
clarkbanother docker hub hosted image that we can mirror without much trouble to ease the total number of docker hub requests system-config makes19:07
clarkbnote that the name mapping is a little odd for this one19:07
clarkbbah of course I jinxed it. The gitea job did fail pulling haproxy-statsd from docker hub19:09
clarkbit's odd that it took so long to get to that point. Maybe we don't deploy the haproxy load balancer until much later in the job19:10
clarkbno the gitea-lb playbook runs early but the way we run playbooks is using an ansible loop and it doesn't short circuit on first failure19:12
clarkbhrm and until doesn't quite do what we want from a loop though maybe there is a way to construct a rule19:14
clarkbansible 11 adds break_when. I guess until then we just accept this19:14
clarkbthe test job ran in ovh-bhs1 on a jammy node which does configure a global ipv6 addr so we likely hit the /64 problem (as the host only fetched one image from docker hub)19:17
clarkbtonyb: did the bad but maybe necessary idea of having a shim to force ipv4 for docker hub access get proposed?19:18
clarkbI'm thinking if we add that in alongside all the other effort to use docker hub less we'd maximize available quota for each job and have higher success rates19:18
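The "/64 problem" mentioned above can be illustrated as follows, assuming the registry counts requests from IPv6 clients per /64 prefix rather than per address (an assumption in this sketch, as is the function name rate_limit_key). Two test nodes with different global IPv6 addresses in the same /64 would then share one quota, while IPv4 clients are counted individually. The addresses are documentation-range placeholders.

```python
import ipaddress


def rate_limit_key(addr: str) -> str:
    """Return the bucket a client would be counted under.

    Assumption: IPv6 clients are grouped by their /64 prefix,
    IPv4 clients are counted per individual address.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 6:
        # strict=False lets us pass a host address and get its network.
        return str(ipaddress.ip_network(f"{ip}/64", strict=False))
    return str(ip)


if __name__ == "__main__":
    node_a = "2001:db8:1:2::10"       # placeholder addresses
    node_b = "2001:db8:1:2:abcd::1"
    print(rate_limit_key(node_a))
    print(rate_limit_key(node_b))
    print(rate_limit_key(node_a) == rate_limit_key(node_b))
    print(rate_limit_key("203.0.113.7"))
```

This is why a job that fetched only one image can still hit the limit, and why forcing IPv4 for Docker Hub access was floated as a stopgap.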
corvus2025-03-03 15:02:00,616 ERROR zuul.Launcher:   openstack.exceptions.HttpException: HttpException: 504: Server Error for url: https://neutron.api.sjc3.rackspacecloud.com/v2.0/floatingips, 504 Gateway Time-out: nginx20:10
corvusthat's from create_floating_ip20:11
corvusnot sure if that's just something we should ignore until we move to the new project?20:11
fungiprobably worth pointing out to cloudnull or jamesdenton if it recurs often20:14
jamesdentonhey, we are aware of that and actively working on it now20:18
fungithanks jamesdenton!20:19
jamesdentonfungi sent a PM 20:19
opendevreviewMerged opendev/zuul-providers master: Switch to new rax flex project for image uploads  https://review.opendev.org/c/opendev/zuul-providers/+/94320020:30
jamesdentoncorvus mind trying again?20:49
corvusjamesdenton: looks good!  thanks!20:58
jamesdentonsweet! some weird collision of floating IPs in the DB20:59
* corvus backs away slowly....21:06
corvus:)21:06
clarkbI'm going to have to pop out soon, but this is your reminder to edit the meeting agenda or let me know if there are updates you want me to make. I figured we could mention the ovh flavors discussion as well as raxflex tenant swap21:10
clarkbI'm going to drop the service coordinator election topic and the gitea caches topic. I might add another note about docker hub rate limits due to their supposed change on Saturday21:11
clarkbhttps://review.opendev.org/c/opendev/system-config/+/943202 related this mirrors another image so that we don't have to fetch it from docker hub21:13
opendevreviewMerged opendev/system-config master: Mirror selenium/standalone-firefox to Quay.io  https://review.opendev.org/c/opendev/system-config/+/94320221:24
clarkbrecheck of the gitea canary did pass so it's not a complete breakdown (yay)21:25
clarkbI've just made a number of edits to the meeting agenda23:15
clarkbfungi: you don't happen to still be around do you? Thinking now might be a good time to approve the old raxflex mirror cleanup changes?23:22
clarkbthey will take some time to apply because the inventory file landed23:22
clarkbbut should also be fairly noopy23:22
clarkbI'll send the agenda out in about half an hour. Let me know if anything important is missing23:32
tonybNothing from me23:32
clarkbtonyb: did you see my question about forcing docker hub onto ipv4? Curious if that ever ended up in a change to review23:33
tonybCan I host my personal git repos on opendev?23:33
clarkbmostly because I think it may be a useful stopgap in the clouds that have ipv623:33
tonybclarkb: I did not, let me check scrollback23:33
clarkbtonyb: https://opendev.org/inaugust is a small example of that. I think as long as it is licensed appropriately it would be within the rules to do so23:34
tonybOkay I think it's all Apache 2.023:34
tonybAt this point I don't really need CI, I'm happy for check and gate to be noops23:35
clarkbya I think the inaugust repos are super minimal ci23:36
clarkb maybe a linter?23:36
tonybMy standards aren't that high ;P23:37
corvusactually i think inaugust may have some CD23:53
corvusas well as ci23:53
