amorin | hey corvus Clark[m] , I just received a highlight on my name, I partially understand that you want some custom flavors? | 08:01 |
---|---|---|
frickler | amorin: iiuc we have a single custom flavor with 8gb RAM currently, and the question would be whether we could additionally have the same flavor, but with 4gb and 16gb respectively, similar to these https://opendev.org/opendev/zuul-providers/src/branch/master/zuul.d/providers.yaml#L14-L22 . but maybe wait for corvus or Clark[m] to confirm before taking any action | 09:47 |
amorin | ack | 09:47 |
corvus | amorin: yep, the question is exactly as frickler said. i seem to recall the current flavors are special so that they are scheduled onto specific hypervisors, so obviously we don't want to do anything without checking with you. but we have some projects that would be able to run jobs more efficiently with a little more ram, and we're hoping that we could also use smaller nodes on some simple jobs to balance that out, at least a little bit. so | 14:44 |
corvus | we were wondering if it would be possible to have 4 and 16gb flavors. | 14:44 |
*** priteau is now known as Guest10549 | 14:46 | |
*** priteau2 is now known as priteau | 14:46 | |
amorin | I will check that with the team | 15:15 |
amorin | we currently have a CPU overallocation of 2, do you agree to keep that, increase / decrease? | 15:41 |
amorin | (with the flavor ssd-osFoundation-3) | 15:41 |
Clark[m] | the main limitation of those flavors seems to be iops and not cpu allocation. So that is probably fine? | 15:45 |
fungi | i don't think we're looking to change any existing oversubscription, though it's good to keep in mind that when we use >50% of our quota we're probably slowing our cpu performance for all jobs there proportionally | 15:45 |
fungi | which reinforces some of the "we're our own noisy neighbor" observations we've made in the past | 15:46 |
Clark[m] | ++ | 15:46 |
Clark[m] | I'm in the middle of doing local updates for my desktop and my irc bouncer. Hoping to be back on IRC in the not too distant future (just fyi that my morning is going to have a very slow start) | 15:46 |
amorin | I would recommend staying at 2 as well, it would allow us to consume more RAM | 15:47 |
amorin | or I can reduce the number of CPU given per instance | 15:47 |
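The overallocation amorin refers to is normally expressed through Nova's allocation ratios on the compute nodes. A minimal illustration of "a CPU overallocation of 2", not OVH's actual configuration:

```ini
# /etc/nova/nova.conf on a compute node (illustrative only)
[DEFAULT]
# Schedule up to 2 vCPUs per physical CPU (the overallocation of 2 above);
# RAM left at 1.0 here, i.e. no memory oversubscription in this sketch.
cpu_allocation_ratio = 2.0
ram_allocation_ratio = 1.0
```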
amorin | I am building the figures in an etherpad, will share when ready | 15:48 |
Clark[m] | corvus: ^ should we reduce the cpu count for the 4GB flavor? | 15:48 |
Clark[m] | amorin: that is great thanks | 15:48 |
Clark[m] | I don't think we can change the cpu count for the existing 8GB flavor as that may have unexpected consequences for existing jobs. But maybe 4vcpu 4GB on the smaller flavor is ok? | 15:48 |
corvus | are we 8cpu/8gb now? | 15:49 |
amorin | yes | 15:49 |
amorin | that's far from optimal | 15:49 |
amorin | we have much more memory on the compute nodes | 15:49 |
corvus | yeah, 4vcpu for the small one sounds fine. | 15:50 |
fungi | similarly should we have a higher cpu count on the 16gb flavor, or keep that at 8vcpu? | 15:54 |
corvus | i think for jobs the memory is more important here, and it sounds like we're already out of balance the "wrong" way on that, so 8/16 might improve the balance? | 15:56 |
fungi | wfm | 15:57 |
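If the sizes settle out as discussed (4vcpu/4GB alongside the existing 8vcpu/8GB, plus 8vcpu/16GB), the flavor definitions would follow the usual OpenStack admin workflow. A hedged sketch with hypothetical names, an assumed 80GB disk, and an assumed host-aggregate property, since the real definitions are up to OVH:

```shell
# Illustrative only -- names, disk size, and the aggregate key are assumptions;
# the existing ssd-osFoundation-3 flavor is left untouched.
openstack flavor create --vcpus 4 --ram 4096  --disk 80 --private oif-4gb-example
openstack flavor create --vcpus 8 --ram 16384 --disk 80 --private oif-16gb-example

# If the new flavors must land on the same dedicated hypervisors as the
# current one, an extra_specs property matching a host aggregate could be set:
openstack flavor set --property aggregate_instance_extra_specs:ci=true oif-4gb-example
```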
amorin | https://etherpad.opendev.org/p/ovh-flavors | 15:57 |
amorin | here is what we propose | 15:58 |
corvus | i worry that if we change the 8gb ram flavor to lower the vcpu count, that will adversely affect the current jobs -- but if we can keep the current flavor as well, then we can continue to use that, and try out the oif-4-8-80 flavor to see if that's okay. | 16:03 |
corvus | so -- is it okay to have and use all 4 flavors? :) | 16:04 |
amorin | we will keep the old flavor, yes. One downside is that the old one is not going to mix with the new ones, I mean, if some instances are spawned on some hypervisors, they will prevent new instances from landing there. | 16:05 |
corvus | well, hrm, that could be a problem. we'd want to continue using the current flavor for almost everything, with just a few projects occasionally launching nodes with the new flavors. this is the first time we've ever run with heterogeneous flavors, so we're just starting to experiment. | 16:08 |
corvus | (so in practice, i think that means we'd probably never be able to schedule the new flavors, unless the system is nearly idle) | 16:09 |
Clark[m] | maybe just start with the 16gb and 4GB for now? and limit the exposure. Though now that I've written that I'm not sure that is much of a difference | 16:20 |
amorin | ok, so, it's not as easy as that | 16:20 |
amorin | let me explain why we can't mix them, and we may end up with another solution | 16:21 |
fungi | could maybe add a new equivalent to the old flavor, and then we can switch our primary use to it? | 16:21 |
amorin | I pasted a bunch of explanation at the end of the etherpad | 16:29 |
amorin | with a possible new proposal | 16:30 |
fungi | thanks! | 16:35 |
clarkb | ok I think I'm back on IRC now | 16:50 |
fungi | welcome back! | 16:56 |
fungi | infra-root: over the weekend i tried to get the rest of the rackspace flex region/project updates proposed, if anyone has time to look them over and let me know what may still be missing (e.g. swift bits for niz image uploads?): https://review.opendev.org/q/hashtag:flex-dfw3 | 16:59 |
corvus | amorin: i'm still unclear about the flavor thing -- i left another question in the etherpad | 17:02 |
corvus | but basically, i'm confused about osFoundation-4 -- i don't know if the new proposal is to have the three new "oif" flavors plus one more new "osFoundation" flavor, or if the new proposal is to only have osFoundation flavors and no oif flavor | 17:04 |
clarkb | fungi: I made a note about quotas and increasing max-servers on https://review.opendev.org/c/openstack/project-config/+/943106 . The old sjc3 value of 32 looks appropriate for the new region and tenants | 17:12 |
fungi | i also just realized i haven't proposed cleanup changes for the old mirror server and project | 17:12 |
fungi | working on that now | 17:12 |
clarkb | editing the cloud launcher details to have a comment about setting mtu to 1500 might be a good followup too (it's less urgent though) | 17:15 |
fungi | ah, good point | 17:15 |
clarkb | also as a reminder heads up I have to pop out early afternoon today to do passport things with the kids. Both parents must be present to do that and appointments are extremely limited so I'm taking what I can get | 17:16 |
fungi | good luck! | 17:19 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Remove Ansible for old Rackspace Flex SJC3 mirror https://review.opendev.org/c/opendev/system-config/+/943195 | 17:25 |
opendevreview | Jeremy Stanley proposed opendev/zone-opendev.org master: Clean up DNS for old Rackspace Flex SJC3 mirror https://review.opendev.org/c/opendev/zone-opendev.org/+/943196 | 17:26 |
fungi | those should be able to go whenever, i think, since we already switched to the new mirror | 17:26 |
clarkb | I usually remove dns last so that LE jobs don't fail | 17:27 |
fungi | right, hence the depends-on | 17:27 |
fungi | i just meant we shouldn't need to wait for the other flex project switching changes | 17:28 |
fungi | i'm also working on cleanup for the old hostvars and cloud entries, but that will depend on both of course | 17:29 |
clarkb | ack | 17:30 |
fungi | heading out to run some quick errands, bbiab | 17:38 |
corvus | i'm restarting the zuul schedulers to pick up some small bugfixes | 17:39 |
corvus | done | 17:46 |
clarkb | I did a quick check against gitea13 and its response times still appear to be stable for me | 17:48 |
clarkb | Next up I'm trying to find some container image build and service test change that can act as a canary for the theoretically new docker hub limits | 17:48 |
clarkb | was hoping to find some real work I can do but etherpad, gitea, and gerrit are all up to date | 17:49 |
clarkb | may just use a DNM change | 17:49 |
clarkb | lodgeit is already on quay | 17:50 |
opendevreview | Clark Boylan proposed opendev/system-config master: DNM this is just a canary for Docker hub rate limits https://review.opendev.org/c/opendev/system-config/+/943197 | 17:52 |
fungi | okay, back | 18:16 |
corvus | fungi: i looked into the application token; it's scoped to the project | 18:30 |
corvus | so we'll need a new one | 18:30 |
corvus | i can do that real quick | 18:30 |
fungi | got it. we'll need buckets too, or does zuul/ansible create them? | 18:30 |
fungi | s/buckets/containers/ | 18:31 |
corvus | new bucket. i already made it for testing (just to make sure the old cred couldn't see it) | 18:31 |
fungi | one in each region (sjc3 and dfws)? | 18:32 |
fungi | s/dfws/dfw3/ | 18:32 |
fungi | for https://review.opendev.org/943104 | 18:32 |
corvus | just one -- we only need to upload to one location since it's temp storage | 18:33 |
fungi | oh, got it, actual uploads to each region are in glance | 18:33 |
fungi | cool, thanks! | 18:33 |
corvus | yeah. we could round-robin or something later for HA, but we don't strictly need more than one at a time | 18:33 |
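The temporary upload container corvus describes can be created with the standard object-store CLI. A minimal sketch; the container name and clouds.yaml entry here are hypothetical:

```shell
# One container in a single region is enough, since it only holds image
# artifacts temporarily before they are imported into Glance in each region.
openstack --os-cloud rax-flex-new container create zuul-image-uploads
```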
clarkb | https://securitytxt.org/ is this /.well-known/security.txt file something we should consider for opendev (and maybe zuul, openstack, starlingx, etc)? | 18:38 |
clarkb | seems like a good idea to have a standard url path to look for that info in, even if that ultimately points to richer docs elsewhere | 18:38 |
opendevreview | James E. Blair proposed opendev/zuul-providers master: Switch to new rax flex project for image uploads https://review.opendev.org/c/opendev/zuul-providers/+/943200 | 18:41 |
corvus | fungi: clarkb ^ that can happen any time (including now) | 18:41 |
clarkb | +2 from me | 18:41 |
clarkb | and ya that's similar to the mirror where we moved it before other resources because it is fine as long as you can access it | 18:42 |
fungi | looks like the security.txt is normalized in ietf rfc 9116: https://www.iana.org/assignments/well-known-uris/well-known-uris.xhtml | 18:43 |
fungi | ah, the faq at the bottom of the site you linked mentions that | 18:45 |
clarkb | it's also in the header | 18:46 |
clarkb | has a read the rfc link | 18:46 |
fungi | oh, so it is, there's a button. my eyes have been trained to gloss over buttons and look for blocks of text | 18:46 |
corvus | the contrast on that button is not great | 18:49 |
fungi | i mostly just thought "i wonder if they registered it with iana" | 18:50 |
fungi | and then checked the iana registry | 18:50 |
fungi | and it helpfully linked to the ietf rfc | 18:50 |
corvus | we have an operator/developer split for the services we run which makes this a little fuzzy | 18:51 |
corvus | like, not sure we want to get a security report for gerrit | 18:51 |
corvus | and the gerrit project doesn't want to hear if we left our passwords out | 18:52 |
clarkb | agreed. Maybe we just set it on the main domain opendev.org (served by gitea) and don't set it for everything else? | 18:52 |
clarkb | to minimize that problem (though we may still get gerrit reports since we host a gerrit) | 18:52 |
corvus | (of course, misrouted calls are not the worst thing, and better than missing it altogether, assuming the parties are all friendly) | 18:52 |
fungi | well, if a user finds a vulnerability in the gerrit service running at review.opendev.org and confidentially reports it to us rather than directly to gerrit, i'd say that's not the worst outcome. we can still check whether it's an upstream bug or a problem we've introduced, and then forward the report as needed | 18:52 |
corvus | exactly; just worth thinking about the best way to encourage getting it right the first time. the web site doesn't seem to talk about that. | 18:53 |
corvus | clarkb: that sounds like a good first step | 18:54 |
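As a concrete starting point for the opendev.org-only approach, an RFC 9116 file served at /.well-known/security.txt needs only Contact and Expires fields. A minimal sketch; the contact address and expiry date are placeholders, not an agreed policy:

```text
# Reports about services OpenDev operates (not about upstream projects
# such as Gerrit itself) are the intended audience here.
Contact: mailto:security-placeholder@lists.opendev.org
Expires: 2026-03-01T00:00:00Z
Preferred-Languages: en
Canonical: https://opendev.org/.well-known/security.txt
```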
fungi | i might even say it's preferable, because if the reporter doesn't have sufficient understanding to figure out themselves whether it's an upstream gerrit problem or an us gerrit problem, i expect we're in a better position to figure that out than the upstream gerrit security folks are | 18:54 |
fungi | we don't want opendev configuration issues, for example, being reported to gerrit upstream and then relying on them to get in touch with us about it | 18:55 |
corvus | that's a good point | 18:56 |
clarkb | how do we provide comment on the request for comments? I do think this is a corner case that they should consider | 18:57 |
clarkb | though I'm not sure how to express it in a productive way yet. And maybe we don't have to if we just point in the right direction | 18:57 |
fungi | this is similar to the distro/upstream relationship, where if a reporter isn't capable or motivated to work out whether a vulnerability they've discovered is present upstream or introduced downstream by the distro, notifying the package maintainer is the better option | 18:58 |
fungi | then let the package maintainer communicate it upstream if necessary, since they're probably better-connected with them than the original reporter anyway | 18:59 |
corvus | i went looking for discussions around this and found this: | 19:00 |
corvus | https://github.com/securitytxt/security-txt/issues/185 | 19:00 |
corvus | i have no idea what that is about | 19:00 |
clarkb | going back to the docker hub rate limit canary I think the gitea image build and service test job are going to end up being happy in the end. So it hasn't made things catastrophically worse at least. I do note that it is even possible this is a better situation for us | 19:00 |
corvus | because the only content in that gh issue is two dead links to twitter | 19:00 |
clarkb | wow | 19:01 |
clarkb | the title does seem to match what we are discussing but agreed without the twitter context it is hard to say for sure | 19:01 |
fungi | one of those discussions also led me to https://github.com/disclose/dnssecuritytxt | 19:02 |
fungi | though i don't see any indication that ever made it into a standard | 19:03 |
opendevreview | Clark Boylan proposed opendev/system-config master: Mirror selenium/standalone-firefox to Quay.io https://review.opendev.org/c/opendev/system-config/+/943202 | 19:06 |
clarkb | another docker hub hosted image that we can mirror without much trouble to ease the total number of docker hub requests system-config makes | 19:07 |
clarkb | note that the name mapping is a little odd for this one | 19:07 |
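The mirroring itself is essentially a registry-to-registry copy. A hedged sketch of what change 943202 effectively arranges; the destination org and image name are assumptions, hence the "odd" name mapping note above:

```shell
# Copy all architectures of the upstream image from Docker Hub to Quay so
# jobs can pull it without counting against Docker Hub rate limits.
skopeo copy --all \
  docker://docker.io/selenium/standalone-firefox:latest \
  docker://quay.io/opendevmirror/selenium-standalone-firefox:latest
```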
clarkb | bah of course I jinxed it. The gitea job did fail pulling haproxy-statsd from docker hub | 19:09 |
clarkb | it's odd that it took so long to get to that point. Maybe we don't deploy the haproxy load balancer until much later in the job | 19:10 |
clarkb | no the gitea-lb playbook runs early but the way we run playbooks is using an ansible loop and it doesn't short circuit on first failure | 19:12 |
clarkb | hrm and until doesn't quite do what we want from a loop though maybe there is a way to construct a rule | 19:14 |
clarkb | ansible 11 adds break_when. I guess until then we just accept this | 19:14 |
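For reference, the Ansible 11 (ansible-core 2.18) feature mentioned above would let a looped task stop at the first failure instead of running every iteration. A rough sketch, assuming a hypothetical list variable rather than the actual system-config tasks:

```yaml
# Illustrative only: run each playbook in order and stop the loop as soon
# as one fails, rather than continuing through the remaining items.
- name: Run infra playbooks
  ansible.builtin.command: "ansible-playbook {{ item }}"
  loop: "{{ infra_playbooks }}"
  register: playbook_run
  failed_when: false
  break_when:
    - playbook_run.rc != 0
```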
clarkb | the test job ran in ovh-bhs1 on a jammy node which does configure a global ipv6 addr so we likely hit the /64 problem (as the host only fetched one image from docker hub) | 19:17 |
clarkb | tonyb: did the bad but maybe necessary idea of having a shim to force ipv4 for docker hub access get proposed? | 19:18 |
clarkb | I'm thinking if we add that in alongside all the other effort to use docker hub less we'd maximize available quota for each job and have higher success rates | 19:18 |
corvus | 2025-03-03 15:02:00,616 ERROR zuul.Launcher: openstack.exceptions.HttpException: HttpException: 504: Server Error for url: https://neutron.api.sjc3.rackspacecloud.com/v2.0/floatingips, 504 Gateway Time-out: nginx | 20:10 |
corvus | that's from create_floating_ip | 20:11 |
corvus | not sure if that's just something we should ignore until we move to the new project? | 20:11 |
fungi | probably worth pointing out to cloudnull or jamesdenton if it recurs often | 20:14 |
jamesdenton | hey, we are aware of that and actively working on it now | 20:18 |
fungi | thanks jamesdenton! | 20:19 |
jamesdenton | fungi sent a PM | 20:19 |
opendevreview | Merged opendev/zuul-providers master: Switch to new rax flex project for image uploads https://review.opendev.org/c/opendev/zuul-providers/+/943200 | 20:30 |
jamesdenton | corvus mind trying again? | 20:49 |
corvus | jamesdenton: looks good! thanks! | 20:58 |
jamesdenton | sweet! some weird collision of floating IPs in the DB | 20:59 |
* corvus backs away slowly.... | 21:06 | |
corvus | :) | 21:06 |
clarkb | I'm going to have to pop out soon, but this is your reminder to edit the meeting agenda or let me know if there are updates you want me to make. I figured we could mention the ovh flavors discussion as well as the raxflex tenant swap | 21:10 |
clarkb | I'm going to drop the service coordinator election topic and the gitea caches topic. I might add another note about docker hub rate limits due to their supposed change on Saturday | 21:11 |
clarkb | https://review.opendev.org/c/opendev/system-config/+/943202 related this mirrors another image so that we don't have to fetch it from docker hub | 21:13 |
opendevreview | Merged opendev/system-config master: Mirror selenium/standalone-firefox to Quay.io https://review.opendev.org/c/opendev/system-config/+/943202 | 21:24 |
clarkb | recheck of the gitea canary did pass so it's not a complete breakdown (yay) | 21:25 |
clarkb | I've just made a number of edits to the meeting agenda | 23:15 |
clarkb | fungi: you don't happen to still be around do you? Thinking now might be a good time to approve the old raxflex mirror cleanup changes? | 23:22 |
clarkb | they will take some time to apply because the inventory file landed | 23:22 |
clarkb | but should also be fairly noopy | 23:22 |
clarkb | I'll send the agenda out in about half an hour. Let me know if anything important is missing | 23:32 |
tonyb | Nothing from me | 23:32 |
clarkb | tonyb: did you see my question about forcing docker hub onto ipv4? Curious if that ever ended up in a change to review | 23:33 |
tonyb | Can I host my personal git repos on opendev? | 23:33 |
clarkb | mostly because I think it may be a useful stopgap in the clouds that have ipv6 | 23:33 |
tonyb | clarkb: I did not, let me check scrollback | 23:33 |
clarkb | tonyb: https://opendev.org/inaugust is a small example of that. I think as long as it is licensed appropriately it would be within the rules to do so | 23:34 |
tonyb | Okay I think It's all Apache 2.0 | 23:34 |
tonyb | At this point I don't really need CI I'm happy for check and gate to be noops | 23:35 |
clarkb | ya I think the inaugust repos are super minimal ci | 23:36 |
clarkb | maybe a linter? | 23:36 |
tonyb | My standards aren't that high ;P | 23:37 |
corvus | actually i think inaugust may have some CD | 23:53 |
corvus | as well as ci | 23:53 |