*** dtantsur_ is now known as dtantsur | 01:57 | |
opendevreview | Steve Baker proposed openstack/diskimage-builder master: Fix hashbang of non-executed bash libs https://review.opendev.org/c/openstack/diskimage-builder/+/928312 | 03:05 |
frickler | ok, 27 jobs ran on raxflex so far, 92% success, so nothing severely broken I'd say. I think I'm going to bump max-servers to 20 now to give it some more testing while still being able to revert if something goes wrong before the weekend starts | 07:01 |
opendevreview | Jens Harbott proposed openstack/project-config master: Bump max-servers for raxflex-sjc3 to 20 https://review.opendev.org/c/openstack/project-config/+/928321 | 07:32 |
opendevreview | James Page proposed openstack/project-config master: charms: add official repo jobs https://review.opendev.org/c/openstack/project-config/+/928326 | 09:09 |
fungi | frickler: it looks like the current limiting factor in raxflex-sjc3 is ram, we have enough quota for 30 server instances with 8192mb ram each | 12:33 |
fungi | enough cpu for 98 server instances with 4vcpus each, and an instance quota of 48 | 12:35 |
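A back-of-envelope check of those figures, assuming the standard 8192 MB / 4 vCPU node flavor; the quota totals below are reverse-derived from the instance counts quoted above, not confirmed values:

```sh
# Assumed quota totals, derived from the counts fungi quotes above
ram_quota_mb=245760   # 30 instances * 8192 MB
vcpu_quota=392        # 98 instances * 4 vCPUs
instance_quota=48

echo "ram-bound:      $(( ram_quota_mb / 8192 ))"   # 30
echo "cpu-bound:      $(( vcpu_quota / 4 ))"        # 98
echo "instance-bound: ${instance_quota}"            # 48
# max-servers is bounded by the minimum of the three, so RAM limits first
```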
frickler | fungi: yes, that matches what I've seen, do you want to go to 30 immediately? | 12:40 |
fungi | nah, this is fine | 12:40 |
fungi | i'll work on getting the new provider into grafana | 12:41 |
opendevreview | Jeremy Stanley proposed openstack/project-config master: Add RackSpace Flex Nodepool dashboard in Grafana https://review.opendev.org/c/openstack/project-config/+/928351 | 12:45 |
fungi | i *think* that's all it takes | 12:46 |
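For context, these dashboards are grafyaml definitions kept under grafana/ in project-config; a rough sketch of the shape such a file takes follows. The panel layout and stat paths here are illustrative guesses, not the actual contents of change 928351:

```yaml
# grafana/nodepool-rackspace-flex.yaml -- illustrative sketch only
dashboard:
  title: 'Nodepool: Rackspace Flex'
  rows:
    - title: Nodes
      height: 250px
      panels:
        - title: Ready Nodes
          type: graph
          targets:
            - target: stats.gauges.nodepool.provider.raxflex-sjc3.nodes.ready
```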
cardoe | Would you like me to get the quota bumped? | 12:48 |
opendevreview | Merged openstack/project-config master: Bump max-servers for raxflex-sjc3 to 20 https://review.opendev.org/c/openstack/project-config/+/928321 | 12:52 |
fungi | cardoe: eventually, but let's see how it fares for a few days first, we're not even pushing it to our available quota yet anyway | 12:57 |
fungi | and thanks again! | 12:57 |
Clark[m] | That looks about right. I'm avoiding the proper keyboard for now to avoid waking everyone up early | 13:00 |
opendevreview | Merged openstack/project-config master: Add RackSpace Flex Nodepool dashboard in Grafana https://review.opendev.org/c/openstack/project-config/+/928351 | 13:45 |
fungi | https://grafana.opendev.org/d/6d29645669/nodepool3a-rackspace-flex | 13:52 |
fungi | for some reason, `openstack limits show --absolute` is broken with latest client/sdk versions, but if i comment out the bit in the client that tries to retrieve volume limits i can see the compute limits just fine | 14:21 |
fungi | otherwise it gives the obscure error "Proxy.get_limits() got an unexpected keyword argument 'project_id'" | 14:22 |
fungi | but with that, i can see that the current raxflex-sjc3 quotas for the nodepool project are 50 instances, 100 standard instances worth of cpu, and 32 standard instances worth of ram. floating ip quota seems to be unlimited | 14:24 |
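One way around the broken client path is to skip it entirely and query the Nova limits API directly; a hedged sketch, where the endpoint lookup is illustrative and assumes a single public compute endpoint:

```sh
# Fetch compute limits straight from Nova, bypassing the client code
# path that chokes on volume limits
TOKEN=$(openstack token issue -f value -c id)
NOVA_URL=$(openstack endpoint list --service compute \
           --interface public -f value -c URL)
curl -s -H "X-Auth-Token: $TOKEN" "$NOVA_URL/limits" | python3 -m json.tool
```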
opendevreview | Jeremy Stanley proposed openstack/project-config master: Increase raxflex-sjc3 max-servers to 32 https://review.opendev.org/c/openstack/project-config/+/928445 | 14:30 |
fungi | that's ^ for whenever others agree this is looking okay so far | 14:30 |
Clark[m] | fungi: have any example job runs I can quickly check? Maybe tempest jobs too? | 15:02 |
fungi | Clark[m]: https://zuul.opendev.org/t/openstack/build/f3a71d03ddb84c03b82de123c33feb60 was one i was looking at last night | 15:05 |
fungi | cinder-tempest-plugin-lvm-lio-barbican-fips | 15:06 |
fungi | covers all the bases | 15:06 |
fungi | "Took 2 hrs 20 mins 27 secs" | 15:06 |
Clark[m] | ya that seems to be in line with runtimes in rax dfw | 15:09 |
fungi | comparing to earlier successful runs, those are anywhere between 1.5 and 2.5 hours in other providers | 15:09 |
Clark[m] | https://zuul.opendev.org/t/openstack/build/f3a71d03ddb84c03b82de123c33feb60/log/job-output.txt#522-644 also shows swap being configured properly which was one of my main concerns given the disk setup there | 15:09 |
Clark[m] | Looks like I've been signed out of all the things. I'll have to log back in after local updates and then I can +2 the change | 15:09 |
fungi | actually, devstack could get smarter about it and reuse the existing swap device those nodes have, though i think it's only 4gb | 15:10 |
fungi | nova can supply a swap device similar to how it supplies ephemeral disk, and in the case of the flavors in raxflex-sjc3 there's one ready to go | 15:11 |
fungi | and the ephemeral disk is already preformatted ext4 | 15:11 |
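That pairing comes from the flavor definition itself; for illustration, a flavor advertising both devices might be created like this. The name and sizes are guesses, not the actual raxflex-sjc3 flavor:

```sh
# Hypothetical flavor with a preformatted ephemeral disk and a swap
# device, mirroring the layout described above
openstack flavor create example-gp.8.4 \
    --ram 8192 --vcpus 4 --disk 40 \
    --ephemeral 40 --swap 4096   # swap size is in MB
```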
Clark[m] | Oh that's a good idea would speed up the jobs if minimally. But at least we aren't breaking things and ending up with no swap and also have enough disk for devstack | 15:12 |
Clark[m] | Probably just need a check to skip setup if swap is already present | 15:13 |
fungi | yeah, /dev/vdb is the pre-formatted ext4 ephemeral disk, and /dev/vdc is a prepped swap device | 15:13 |
fungi | so in theory it could mount /dev/vdb on /opt and swapon /dev/vdc and save a fair bit of time, yeah | 15:14 |
Clark[m] | Oh except /etc/fstab probably doesn't already have swap in it? So we would need to add the device and mount/swap on? | 15:14 |
fungi | cloud-init does find and add it to fstab, but glean doesn't, no | 15:14 |
Clark[m] | Ok slightly less easy of a fix but not bad compared to what is already being run there | 15:14 |
fungi | but like i said, it's only a 4gb swap device, while devstack wants 8gb swap at the moment, so probably fine/necessary that we're creating it anyway | 15:16 |
Clark[m] | ya though really it would be great if we could get that number back down to 1gb or less... | 15:16 |
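A minimal sketch of the reuse idea from this exchange, assuming the device layout fungi describes (/dev/vdb preformatted ext4, /dev/vdc prepped swap) and that glean has left both out of /etc/fstab:

```sh
# Skip setup if swap is already active, per clarkb's suggestion
if ! swapon --show | grep -q /dev/vdc; then
    swapon /dev/vdc
    echo '/dev/vdc none swap sw 0 0' >> /etc/fstab
fi
# Reuse the ephemeral disk for /opt instead of creating it from scratch
mkdir -p /opt
mount /dev/vdb /opt
echo '/dev/vdb /opt ext4 defaults 0 0' >> /etc/fstab
```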
clarkb | ok +2'd | 15:23 |
clarkb | fungi: is there anything else I should be looking at before fading back into jet lag? | 15:26 |
opendevreview | Elod Illes proposed openstack/project-config master: Use ubuntu-jammy for propose-update-constraints https://review.opendev.org/c/openstack/project-config/+/928452 | 15:28 |
fungi | clarkb: nothing springs to mind. next week we can hopefully try to knock out the etherpad upgrade early | 15:33 |
clarkb | ++ | 15:33 |
clarkb | fungi: I'll let you decide if you want another reviewer before increasing the raxflex max-servers value. Risk seems low on that though so probably fine to go as is? | 15:35 |
fungi | yeah, seems so. i'll self-approve so we can get additional data sooner | 15:36 |
opendevreview | Merged openstack/project-config master: Use ubuntu-jammy for propose-update-constraints https://review.opendev.org/c/openstack/project-config/+/928452 | 15:42 |
opendevreview | Merged openstack/project-config master: Increase raxflex-sjc3 max-servers to 32 https://review.opendev.org/c/openstack/project-config/+/928445 | 15:45 |
clarkb | looks like some kolla aarch64 build jobs are succeeding now. Do we think that could be related to the improvements Ramereth was making? | 15:58 |
Ramereth[m] | I hope so :) | 15:58 |
fungi | raxflex-sjc3 usage is starting to climb above 20 now according to https://grafana.opendev.org/d/6d29645669/nodepool3a-rackspace-flex | 16:06 |
fungi | we had 21 nodes running jobs simultaneously there for a bit | 16:07 |
fungi | it'll be a race to see if we can fully exercise the quota before friday developer activities hit their usual nosedive | 16:09 |
fungi | clarkb: oh, if you do get time, feel free to weigh in on whether we should add https://review.opendev.org/927684 before tagging gerritlib 0.11.0 | 16:18 |
fungi | but it's not urgent, we're not "release on fridays" people after all ;) | 16:18 |
clarkb | done | 16:19 |
fungi | thanks! | 16:21 |
fungi | and yeah, unless somebody goes crazy with a big stack of changes, we're probably waiting for the daily periodic trigger to fully exercise the new max there | 16:24 |
fungi | zuul's test node graph has been on a steady decline since the top of the hour | 16:25 |
opendevreview | Merged opendev/gerritlib master: Only installable on Python 3.10 and later https://review.opendev.org/c/opendev/gerritlib/+/927684 | 16:32 |
fungi | looks like we did get another burst of zuul activity, so ~topped out the quota in raxflex-sjc3 for the past 20 minutes so far | 18:51 |
fungi | yeah, it was maxed for about an hour | 19:33 |
fungi | though it looks like it only managed to boot 31 nodes there, not 32 | 19:36 |
fungi | but no blips on the error node launch attempts graph at all | 21:02 |
fungi | aha! i see why we only booted 31, there's a "frickler-test" instance in error state | 21:03 |
fungi | i'll clean that up | 21:03 |
fungi | now it should use up to 32 instances there | 21:03 |