opendevreview | Merged opendev/system-config master: Update docker-compose restart flags https://review.opendev.org/c/opendev/system-config/+/801667 | 00:04 |
---|---|---|
opendevreview | Ian Wienand proposed opendev/base-jobs master: Switch Debian Stable to Bullseye https://review.opendev.org/c/opendev/base-jobs/+/802639 | 00:51 |
opendevreview | Ian Wienand proposed opendev/system-config master: Add Debian Bullseye testing https://review.opendev.org/c/opendev/system-config/+/802641 | 00:55 |
ianw | clarkb: it looks like the playbook import worked, but skipped? | 01:26 |
ianw | https://0989436ea33e935ace85-42b9b3ca9891e58d539431fcfb5b799d.ssl.cf1.rackcdn.com/802112/6/check/system-config-run-review-3.2/547cc5f/bridge.openstack.org/ara-report/results/374.html | 01:26 |
ianw | hosts: review in https://review.opendev.org/c/opendev/system-config/+/802112/6/playbooks/rename_repos.yaml suggests the group matching isn't working as hoped | 01:28 |
ianw | ahh, no actually i think it is working | 01:39 |
ianw | just the group rename bit didn't do anything https://0989436ea33e935ace85-42b9b3ca9891e58d539431fcfb5b799d.ssl.cf1.rackcdn.com/802112/6/check/system-config-run-review-3.2/547cc5f/bridge.openstack.org/ara-report/results/365.html | 01:39 |
opendevreview | Ian Wienand proposed opendev/system-config master: [for squash] rename testing https://review.opendev.org/c/opendev/system-config/+/802645 | 01:55 |
opendevreview | Tristan Cacqueray proposed opendev/system-config master: Run matrix-gerritbot on eavesdrop https://review.opendev.org/c/opendev/system-config/+/800506 | 02:00 |
opendevreview | Tristan Cacqueray proposed opendev/system-config master: Run matrix-gerritbot on eavesdrop https://review.opendev.org/c/opendev/system-config/+/800506 | 02:13 |
*** ykarel|away is now known as ykarel | 04:44 | |
opendevreview | Merged opendev/base-jobs master: Switch fedora-latest to use fedora-34 https://review.opendev.org/c/opendev/base-jobs/+/795639 | 04:48 |
opendevreview | Ian Wienand proposed openstack/project-config master: Remove debian-stretch disk images https://review.opendev.org/c/openstack/project-config/+/802654 | 04:49 |
opendevreview | Ian Wienand proposed openstack/project-config master: Remove Debian stretch image builds https://review.opendev.org/c/openstack/project-config/+/802655 | 04:49 |
ianw | fungi: i don't think we have too far to go with stretch removal; https://review.opendev.org/q/topic:%22stretch-removal%22+(status:open%20OR%20status:merged) should do it | 04:53 |
ianw | however, i've stacked that on top of mnaser's (mostly) work to remove fedora-32 as it will conflict. that has a bit further to go. we need to pull it out of the stable devstack branches it has got into, and fix up master for f34 | 04:53 |
*** marios is now known as marios|ruck | 05:49 | |
*** rpittau|afk is now known as rpittau | 07:02 | |
*** amoralej|off is now known as amoralej | 07:07 | |
*** ykarel is now known as ykarel|lunch | 08:34 | |
*** zbr is now known as Guest2544 | 09:09 | |
*** sshnaidm|afk is now known as sshnaidm | 09:45 | |
*** ykarel|lunch is now known as ykarel | 10:17 | |
opendevreview | Tristan Cacqueray proposed opendev/system-config master: Run matrix-gerritbot on eavesdrop https://review.opendev.org/c/opendev/system-config/+/800506 | 11:32 |
opendevreview | Tristan Cacqueray proposed opendev/system-config master: Run matrix-gerritbot on eavesdrop https://review.opendev.org/c/opendev/system-config/+/800506 | 12:38 |
*** artom_ is now known as artom | 12:58 | |
*** amoralej is now known as amoralej|lunch | 13:16 | |
fungi | 2021-07-27 20:14:22,121 DEBUG zuul.Pipeline.openstack.gate: [e: 4f75f02688cb4529b4c61e35eb7079d8] Adding node request <NodeRequest 199-0014882562 <NodeSet two-centos-8-nodes [<Node None ('primary',):centos-8-stream>, <Node None ('secondary',):centos-8-stream>]>> for job tripleo-ci-centos-8-scenario000-multinode-oooq-container-updates to item <QueueItem c249608bba964523a9592e8defa507a2 for | 13:32 |
fungi | <Change 0x7f03fd588eb0 openstack/tripleo-heat-templates 800848,2> in gate> | 13:32 |
fungi | i think it's that one (199-0014882562) | 13:32 |
opendevreview | Tristan Cacqueray proposed opendev/system-config master: Run matrix-gerritbot on eavesdrop https://review.opendev.org/c/opendev/system-config/+/800506 | 13:37 |
*** amoralej|lunch is now known as amoralej | 13:38 | |
fungi | looks like it's nl03 | 13:39 |
fungi | 2021-07-27 20:21:30,856 DEBUG nodepool.driver.NodeRequestHandler[nl03.opendev.org-PoolWorker.inap-mtl01-main-e9f6aedc5ff04b16970f8576cadadf81]: [e: 4f75f02688cb4529b4c61e35eb7079d8] [node_request: 199-0014882562] Declining node request because nodes failed | 13:39 |
fungi | it seems to have never given up the lock on the node request | 13:39 |
fungi | i'll restart the nodepool-launcher container on nl03 to free that and any other stuck node requests it might be responsible for | 13:39 |
fungi | #status log Restarted the nodepool-launcher container on nl03.opendev.org in order to free stale node request locks | 13:40 |
opendevstatus | fungi: finished logging | 13:41 |
fungi | looks like there were some periodic jobs stuck for ~80 hours which are getting new node assignments now | 13:42 |
fungi | there's also a stuck build in check for 99 hours i'm hoping this will take care of | 13:43 |
fungi | that seems to have (finally) gotten it | 13:57 |
fungi | apparently the node request in question eventually got replaced: | 14:01 |
fungi | 2021-07-28 06:18:54,574 DEBUG zuul.nodepool: [e: 4f75f02688cb4529b4c61e35eb7079d8] Resubmitting lost node request <NodeRequest 199-0014882562 <NodeSet two-centos-8-nodes [<Node None ('primary',):centos-8-stream>, <Node None ('secondary',):centos-8-stream>]>> | 14:01 |
fungi | 2021-07-28 06:18:57,225 DEBUG zuul.nodepool: [e: 4f75f02688cb4529b4c61e35eb7079d8] Updating node request <NodeRequest 199-0014886787 <NodeSet two-centos-8-nodes [<Node None ('primary',):centos-8-stream>, <Node None ('secondary',):centos-8-stream>]>> | 14:01 |
fungi | and then replaced again: | 14:02 |
fungi | 2021-07-28 11:22:48,893 DEBUG zuul.nodepool: [e: 4f75f02688cb4529b4c61e35eb7079d8] Resubmitting lost node request <NodeRequest 199-0014886787 <NodeSet two-centos-8-nodes [<Node None ('primary',):centos-8-stream>, <Node None ('secondary',):centos-8-stream>]>> | 14:02 |
fungi | 2021-07-28 11:22:48,994 DEBUG zuul.nodepool: [e: 4f75f02688cb4529b4c61e35eb7079d8] Updating node request <NodeRequest 199-0014890476 <NodeSet two-centos-8-nodes [<Node None ('primary',):centos-8-stream>, <Node None ('secondary',):centos-8-stream>]>> | 14:02 |
fungi | but each one ended up getting stuck in inap-mtl01 on nl03 | 14:04 |
fungi | sheer luck i guess? | 14:04 |
sshnaidm | fungi, hi, can you please delete tag 1.5.0-1 from openstack/ansible-collections-openstack? I pushed it by mistake, need 1.5.1-1 | 14:34 |
fungi | sshnaidm: https://docs.opendev.org/opendev/infra-manual/latest/drivers.html#tagging-a-release "Tags can’t be effectively deleted once pushed, so make absolutely certain they’re correct (ideally by locally testing release artifact generation commands and inspecting the results between the tag and push steps above)." | 14:36 |
fungi | tag deletions don't propagate via pull or remote update | 14:37 |
sshnaidm | fungi, ack | 14:38 |
fungi | and even if we pushed a deletion to gerrit, zuul executor and merger caches would continue to have that tag as would users pulling updates to their local clones | 14:38 |
fungi | so generally better to just assume that tags can't be deleted once pushed, and choose alternate means of correcting the problem | 14:39 |
sshnaidm | fungi, yeah, no problem, I can handle it | 14:39 |
fungi | this is one of the reasons the openstack/releases repo exists, to allow for collective review of proposed tags | 14:40 |
sshnaidm | fungi, one more question, I have a pre-release job in repo https://opendev.org/openstack/ansible-collections-openstack/src/branch/master/.zuul.yaml#L456 - when I pushed the pre-release tag 1.5.0-1 it didn't start. According to semver it was lower than the last tag 1.5.0 - is that the reason? Or did something else prevent it from triggering? | 14:42 |
fungi | it's the -1 on the end, that's not semver nor pep 440 compliant | 14:45 |
*** ykarel is now known as ykarel|away | 14:45 | |
fungi | the regular expressions we match on for release and pre-release pipelines can be found here: https://opendev.org/openstack/project-config/src/branch/master/zuul.d/pipelines.yaml | 14:46 |
fungi | release: ^refs/tags/[0-9]+(\.[0-9]+)*$ | 14:46 |
fungi | pre-release: ^refs/tags/[0-9]+(\.[0-9]+)*(a|b|rc)[0-9]+$ | 14:46 |
sshnaidm | https://semver.org/#spec-item-9 | 14:47 |
sshnaidm | it can include hyphen/dash | 14:48 |
sshnaidm | well, anyway, now I see why it's not triggered | 14:49 |
fungi | i should clarify, it's not (semver + pep 440) compliant | 14:50 |
fungi | we do have a pipeline called "tag" though which will match on any tag value | 14:50 |
sshnaidm | fungi, do you have example of pre-release tag which is compliant both with semver and zuul regexp? | 14:57 |
sshnaidm | because I can't find such | 15:02 |
fungi | sshnaidm: sure, https://opendev.org/opendev/git-review/tags shows a 1.27.1.0a1 and a 1.28.0.0a1 which are pep 440 "alpha" prerelease versions for 1.27.1 and 1.28.0 respectively | 15:03 |
fungi | switch the "a" to "b" for a beta release, or to "rc" for a release candidate | 15:03 |
sshnaidm | fungi, I mean a pure semver, not pep440 | 15:03 |
fungi | i think only actual release versions are going to match in that case, not prereleases | 15:04 |
sshnaidm | python -c "import semver;semver.match('1.27.1.0a1', '<1.0.0')" - ValueError: 1.27.1.0a1 is not valid SemVer string | 15:04 |
sshnaidm | fungi, so maybe we can add semver release option to zuul regexp? | 15:05 |
fungi | we should probably discuss if the openstack tenant wants to expand the regular expressions for those pipelines | 15:05 |
sshnaidm | fungi, yeah, would be great, because ansible galaxy for example can process only pure semver tags | 15:06 |
fungi | but also, like i said, there's also the "tag" pipeline which will match on anything, so if your job is able to tell the difference between prereleases and releases (or doesn't need to care) then you can just run the job in the tag pipeline | 15:06 |
sshnaidm | so we can't publish there pre-releases with zuul | 15:06 |
sshnaidm | fungi, yeah, that's a good option I think | 15:07 |
sshnaidm | though with "release" and "pre-release" it's more beautiful :) | 15:07 |
fungi | since openstack predominately publishes its software on pypi, it has opted to use strict pep 440 compliant versions for its prerelease pipeline | 15:07 |
sshnaidm | I see | 15:08 |
fungi | since semver prereleases wouldn't be handled properly by pip and related python packaging ecosystem tooling | 15:08 |
sshnaidm | so it's on the submitter to push the right tag | 15:09 |
sshnaidm | not sure zuul should be involved here.. | 15:09 |
clarkb | you can push pre release via the tag pipeline | 15:11 |
clarkb | the tag pipeline allows you to process any tag | 15:11 |
sshnaidm | btw, maybe worth updating the docs, they don't mention the "tag" pipeline: https://docs.opendev.org/opendev/system-config/latest/zuul.html | 15:12 |
sshnaidm | clarkb, yeah, fungi pointed me to it, seems like a good solution for now | 15:13 |
fungi | sshnaidm: yeah, we probably need to do a better job of moving more of the info about the openstack tenant out of our documentation and to somewhere else like the openstack project teams guide | 15:13 |
clarkb | yesterday we had an alert that the openstackid ssl cert hadn't refreshed but it's happy today | 15:17 |
clarkb | today we have an email saying review's cert didn't update in a timely manner. I wonder if that will self correct like openstackid | 15:17 |
fungi | yeah, saw it as well, another possibility is we're not restarting apache on cert refresh i guess and the apache workers aren't recycling in a timely fashion after graceful | 15:18 |
fungi | and it's roulette as to whether you get a worker which is clinging to the old cert | 15:19 |
clarkb | infra-prod-letsencrypt does claim to have succeeded. I suppose either review02 is in the emergency file and we skipped it or it is the apache worker issue? | 15:22 |
* clarkb wanders off to grab keys and look | 15:22 | |
clarkb | review02 is not in the emergency file and the le cert file has a timestamp from today. I suspect this is apache worker rollover | 15:24 |
fungi | yeah, there's a bit of a race there since we don't have much of a grace period between our cert refresh period and where we start sending alerts for the cert getting close to expiry | 15:28 |
clarkb | fungi: I'm still getting the old cert for review though so the race may not be so tight | 15:31 |
yoctozepto | morning infra - any idea whether gerrit can be configured to allow anyone (or at least cores) to set hashtags on changes? (currently only the change owner can) | 15:33 |
opendevreview | Clark Boylan proposed opendev/system-config master: Test the rename_repos playbook https://review.opendev.org/c/opendev/system-config/+/802112 | 15:34 |
clarkb | ianw: fungi ^ I squashed ianw's fixup change in to that and also added a little more testing on the gitea side | 15:34 |
clarkb | yoctozepto: it can be. Zuul and Ironic have done this. I think we might consider allowing it across the server though? | 15:34 |
yoctozepto | clarkb: OOH, GREAT! I am asking for Kolla today but it seems useful enough for everyone | 15:35 |
clarkb | you can definitely request in individual project acls for now. And we should probably start thinking about adding it to the all projects global acl but that may need some double checking? (I'm not sure where it would go in that acl today) | 15:35 |
yoctozepto | ok, I can handle the kolla part then, thank you very much; I will have a look at ironic | 15:36 |
opendevreview | Radosław Piliszek proposed openstack/project-config master: Allow kolla cores to edit kolla hashtags https://review.opendev.org/c/openstack/project-config/+/802744 | 15:44 |
*** amoralej is now known as amoralej|off | 15:57 | |
clarkb | yuriys: I'm going to find some breakfast, but then I'm pretty much free to do cloud surgery if today is still good for you. Feel free to ping if/when you want to do that | 16:16 |
yuriys | Sounds good! | 16:17 |
mnaser | good morning infra-root. i would appreciate a hold on https://review.opendev.org/c/openstack/loci/+/801526 (loci-keystone). i'm unable to reproduce this failure locally no matter what i try (uwsgi fails to build inside the container for some reason) | 16:17 |
mnaser | all i see in logs is "[thread 6][x86_64-linux-gnu-gcc -pthread] core/routing.o" that goes to "ERROR: Failed building wheel for uwsgi" | 16:19 |
*** rpittau is now known as rpittau|afk | 16:21 | |
yoctozepto | could someone from infra merge this old fix ~> https://review.opendev.org/c/opendev/gerritbot/+/788565 | 16:27 |
yoctozepto | the missing branch name is sad | 16:27 |
clarkb | mnaser: and you have installed the documented uwsgi build deps? | 16:32 |
mnaser | clarkb: yeah.. the error seems that it just died midway through on focal only. i'm trying to reproduce locally from a perfectly _clean_ ubuntu focal image with no deps .. and it works | 16:32 |
mnaser | i got latest pip in an empty ubuntu focal with just python3-pip installed and `/tmp/test/bin/pip wheel --find-links /source-wheels --no-deps --wheel-dir / -c https://raw.githubusercontent.com/openstack/requirements/master/upper-constraints.txt uwsgi` and it builds fine | 16:33 |
clarkb | mnaser: the hold has been set | 16:34 |
mnaser | recheck'd, see ya in 15 minutes :P | 16:34 |
clarkb | yuriys: I'm back now. Just ping when you want to dig into this stuff | 16:37 |
*** marios|ruck is now known as marios|out | 16:38 | |
fungi | i'm around as well, but at some point i'll need to disappear for a bit to pick up cats from the vet (they're in for their 50k mile maintenance) | 16:41 |
yuriys | Great. I've already updated all the CTs. | 16:42 |
clarkb | woot https://review.opendev.org/c/opendev/system-config/+/802112 passes with the extra group verification and gitea side checking | 16:42 |
clarkb | ianw: fungi: ^ I think that is ready for review now | 16:43 |
fungi | tristanC: you may want to take https://review.opendev.org/788565 into account in your matrix gerritbot implementation | 16:45 |
mnaser | infra-root: 158.69.66.127 is the ip for loci-keystone job that's failing if i can have https://github.com/mnaser.keys snuck in? :) -- there's a hold on the system (see above) | 16:46 |
tristanC | fungi: yes thank you, i'll make sure the event types are similar | 16:47 |
clarkb | mnaser: one sec | 16:47 |
clarkb | mnaser: done | 16:49 |
mnaser | clarkb: thank you! | 16:51 |
opendevreview | Merged opendev/gerritbot master: Add branch to all the remaining event messages https://review.opendev.org/c/opendev/gerritbot/+/788565 | 16:54 |
fungi | and now it's time to go get the cats out of hock, back in 30-ish | 17:00 |
mnaser | clarkb: and of course.. it passes on loci-keystone on the run that i have a hold in. can i just recheck and if it fails it'll catch ? | 17:08 |
clarkb | mnaser: yes, the hold will stay open until a failure on that change for that job occurs | 17:09 |
mnaser | ok great, time to recheck.. i'll drop the other jobs in the change so im not wasting ci resources | 17:09 |
clarkb | yoctozepto: fungi: fyi a different gerrit user is asking about the unpack errors from clients after they have upgraded on the gerrit mailing list | 17:12 |
clarkb | I'm updating them with the info we've collected so far | 17:12 |
clarkb | I'll share a traceback with them too if the thread originator doesn't respond with that info (it was requested) | 17:13 |
yoctozepto | clarkb: thanks, good to know | 17:14 |
clarkb | I ended up responding with a sanitized traceback | 17:23 |
clarkb | yoctozepto: fungi: https://groups.google.com/g/repo-discuss/c/AtMvu8rW8gc/m/eog7BLG4BQAJ that settles that | 17:52 |
fungi | okay, back now | 17:55 |
fungi | clarkb: thanks for spotting that report | 17:55 |
yoctozepto | interesting | 17:56 |
opendevreview | Jeremy Stanley proposed openstack/project-config master: Make Sunny an operator on Kata IRC channels https://review.opendev.org/c/openstack/project-config/+/802765 | 17:59 |
*** mtreinish_ is now known as mtreinish | 18:52 | |
clarkb | fungi: +2'd ^ if you want to approve. Finishing up the inmotion cloud stuff now | 19:15 |
fungi | thanks | 19:21 |
*** prometheanfire is now known as Guest2621 | 19:34 | |
corvus | i'd like to restart zuul -- i think it's at a good point for a release, and we should have one as a checkpoint | 19:34 |
corvus | usage looks tame, and i don't see any release stuff going on right now | 19:34 |
opendevreview | Merged openstack/project-config master: Make Sunny an operator on Kata IRC channels https://review.opendev.org/c/openstack/project-config/+/802765 | 19:35 |
fungi | corvus: sounds fine to me | 19:36 |
fungi | i'll mention it in the #openstack-release channel too | 19:37 |
fungi | i'm helping a user who's waiting for that queued project-config deploy job, but i expect it's still a while behind the prod hourly jobs, and it'll probably still work after getting reenqueued | 19:41 |
corvus | restarting now | 19:42 |
corvus | fungi: and yeah, i saw that and figured it'll re-enqueue ok | 19:43 |
fungi | it's a good test at least | 19:43 |
corvus | may actually run sooner | 19:43 |
fungi | yeah, if it gets enqueued ahead of the hourly | 19:44 |
fungi | which it may if we reenqueue with the pipelines in alpha order | 19:44 |
corvus | 2021-07-28 19:44:19,633 WARNING zuul.ConfigLoader: Zuul encountered an error while accessing the repo | 19:45 |
corvus | inaugust/inaugust.com. The error was: | 19:45 |
corvus | ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) | 19:45 |
fungi | or display order, whatever that ordering is in the status page (i guess it's the yaml file order) | 19:45 |
fungi | huh, that was gerrit? | 19:45 |
corvus | yeah i think so | 19:45 |
corvus | that was during opendev tenant configuration | 19:46 |
corvus | so let's check that out when it's done. it's still proceeding | 19:46 |
corvus | my guess is we get the tenant but without the inaugust jobs? | 19:46 |
corvus | it's worth noting there is now a considerable amount of network between the zuul scheduler and gerrit | 19:47 |
fungi | i see it hitting the other inaugust repos in gerrit's ssh log, but not that one | 19:47 |
fungi | and no mention in the error log | 19:48 |
corvus | so the connection may not have made it to the vm (but hard to say for sure) | 19:48 |
corvus | #status log restarted all of zuul on commit 8e4af0ce5e708ec6a8a2bf3a421b299f94704a7e | 19:50 |
opendevstatus | corvus: finished logging | 19:50 |
fungi | if it exceeded the 100 simultaneous ssh connections conntrack overflow rule we have in iptables, that would have rejected the connection with icmp-port-unreachable | 19:51 |
fungi | another possibility is it exceeded the max ssh connections for a single user set in gerrit itself (i think we have that at 64?) | 19:52 |
corvus | it should only have 1 ssh connection | 19:52 |
corvus | or maybe 2 | 19:52 |
fungi | yeah, so highly unlikely to be either of those causes | 19:52 |
corvus | re-enqueueing | 19:53 |
corvus | hrm | 19:53 |
corvus | we ended up with 3 tenants | 19:53 |
fungi | should full-reconfigure pick up the others? | 19:54 |
corvus | i think so | 19:54 |
corvus | i want to see if there are error logs for them | 19:54 |
corvus | oh now there are others | 19:56 |
corvus | may have just not finished the initial load | 19:57 |
corvus | https://zuul.opendev.org/t/opendev/project/opendev.org/inaugust/inaugust.com | 19:57 |
corvus | it does look like it didn't load anything from there | 19:58 |
corvus | i don't see any other connection errors | 19:58 |
corvus | i'm running full-reconfigure on the scheduler | 19:59 |
corvus | fungi: the accessbot job is running now | 20:00 |
fungi | thanks! | 20:00 |
corvus | i'll check back in on the opendev tenant when the full-reconfiguration is complete | 20:01 |
fungi | so maybe it just missed that one repo | 20:01 |
corvus | it's correct now. full-reconfig is still proceeding. | 20:03 |
corvus | complete | 20:06 |
fungi | cloud network blip? | 20:07 |
corvus | my guess yes | 20:07 |
fungi | it is connecting from rackspace to vexxhost since last week | 20:07 |
fungi | so crossing more of the wild internet | 20:07 |
mordred | the wild internet is dark and full of terrors | 20:13 |
fungi | this land was green and good... until the crystal cracked | 20:20 |
corvus | ... until they made the prequel | 20:21 |
* fungi sighs | 20:22 | |
fungi | i gave up on it the moment they showed the skeksis council chamber and there weren't even the right number of them | 20:23 |
fungi | i suppose that's what i get for rewatching the movie right before trying to watch the series, the details they got wrong were still very fresh in my mind | 20:23 |
mordred | I tried watching ... but after the first episode I just didn't care. and I was excited when I hit play | 20:24 |
fungi | same | 20:24 |
corvus | either they completely misunderstood the central premise of the original, or i did | 20:24 |
mordred | maybe they should have explained how the midichlorians related to the crystal | 20:25 |
corvus | could not have hurt | 20:25 |
opendevreview | Yuriy Shyyan proposed opendev/system-config master: 1st commit https://review.opendev.org/c/opendev/system-config/+/802803 | 20:28 |
clarkb | ok we got the cloud back and then did some gerrit things :) | 20:32 |
fungi | very cool! | 20:32 |
clarkb | I need to eat lunch then I will make sure the mirror is properly patched and rebooted then we can reenable that cloud | 20:32 |
clarkb | fungi: I can also push up the opendev/project-config change to track the tapaas rename if you want to work on the etherpad? | 20:33 |
clarkb | you indicated you'd like to continue with that in the tc channel yesterday so I guess we keep doing that. Not sure how much stepping on toes that is if the TC doesn't ack it properly first but I'm willing to pretend that will happen for now :) | 20:33 |
fungi | sure, i was going to do the reverse, but that works too. i should be able to find an old maintenance plan i can copy and hack up to suit | 20:34 |
clarkb | thanks | 20:34 |
clarkb | now I really need to eat lunch | 20:34 |
clarkb | I've removed mirror02.iad3.inmotion.opendev.org from the emergency.yaml file and did manual updates and rebooted it to ensure it was up to date there | 20:51 |
clarkb | ianw: I think we can remove the WIP From https://review.opendev.org/c/openstack/project-config/+/801010 and land that to reenable the region | 20:52 |
yuriys | is rate a manual calculation you guys do | 20:52 |
yuriys | or is that like sleeps between requests | 20:53 |
clarkb | yuriys: no, it's sort of hand wavy. Some clouds had strong rate limits that we set to that value, and others we just set something reasonable | 20:53 |
clarkb | yuriys: ya it basically ensures that you don't make more requests in $period than specified. In many cases I think the rtt on the previous request ends up being long enough that we just proceed with the next request without sleeping | 20:53 |
clarkb | but some clouds have enforced that pretty strongly so we have the ability to tune it if necessary | 20:53 |
yuriys | makes sense, esp on highly trafficked apis | 20:54 |
*** timburke_ is now known as timburke | 20:55 | |
opendevreview | Clark Boylan proposed opendev/project-config master: Add tap-as-a-service rename records https://review.opendev.org/c/opendev/project-config/+/802809 | 21:06 |
clarkb | fungi: ^ ok that is the recording change | 21:06 |
fungi | thanks, i'll record it in the plan | 21:07 |
clarkb | I cut lunch a bit short to make headway on these things so I'm going to go take another break | 21:07 |
fungi | do that | 21:07 |
clarkb | yuriys: have you followed the operator pain points discussion on the openstack-discuss mailing list? | 21:15 |
clarkb | yuriys: I'm thinking we should add the struggles with cells and rabbitmq to that | 21:15 |
clarkb | yuriys: https://etherpad.opendev.org/p/pain-point-elimination is the document where they are capturing stuff. Would you prefer I try to capture it and then you can edit or would it be easier for you to add them? I'm thinking we can add orphaned instances in nova cells when rabbitmq breaks to nova and under kolla we can put something about how restarting rabbitmq isn't reliable? | 21:16 |
clarkb | ok taking that break now. I'll work on capturing that on the etherpad in a bit if you haven't gone ahead and done it (I assume your day is ending soon so I can add those items) | 21:17 |
yuriys | That's a good idea, I'll read through it, I have not heard of the pain point doc. | 21:20 |
clarkb | yuriys: http://lists.openstack.org/pipermail/openstack-discuss/2021-July/023659.html is the main thread, kolla started one too | 21:27 |
clarkb | alright I left two notes around what we found | 21:32 |
yuriys | awesome ty | 21:35 |
fungi | infra-root: i've begun drafting a maintenance plan for friday at https://etherpad.opendev.org/p/project-renames-2021-07-30 with the skeleton steps taken from our process doc. i've added a few implied steps of my own and also left some commentary. whatever we hack up on this can be applied as an update to our official process | 21:52 |
fungi | i can flesh it out with some more specific commands/prose to cut and paste once the process is firmed up | 21:55 |
ianw | clarkb: thanks! i've approved that to reenable region | 21:58 |
opendevreview | Merged openstack/project-config master: Revert "nodepool: set inmotion cloud to zero" https://review.opendev.org/c/openstack/project-config/+/801010 | 22:06 |
clarkb | I don't think we should make these changes to the rename playbook but I think we might want to consider adding group reindexes if practice shows this is necessary after a group rename and possibly add https://review.opendev.org/Documentation/cmd-index-changes-in-project.html for each project we rename too | 22:08 |
clarkb | I suspect it's fine as is and we can manually run those should they be an issue and backfill the playbook after | 22:08 |
clarkb | (I'd avoid adding more reindexing than necessary hence not adding these proactively) | 22:09 |
fungi | yes, i concur | 22:12 |
opendevreview | Monty Taylor proposed opendev/system-config master: Run matrix-eavesdrop on eavesdrop https://review.opendev.org/c/opendev/system-config/+/800320 | 23:36 |
opendevreview | Monty Taylor proposed opendev/system-config master: Run matrix-gerritbot on eavesdrop https://review.opendev.org/c/opendev/system-config/+/800506 | 23:36 |
mordred | clarkb, corvus, ianw: I fixed ianw's issues with tristanC 's patch regarding file matchers - and also we'd missed a similar thing in corvus' patch. both of those should be GTG now ^^ | 23:37 |
corvus | mordred:inaugust.com: thanks! | 23:37 |
corvus | mordred: both lgtm | 23:39 |
mordred | also - fwiw - we had a miss on docker/ircbot in the test job - I included that in fixing the corvus patch | 23:40 |
ianw | fungi: if you have a chance to look @ https://review.opendev.org/c/opendev/base-jobs/+/802639 that switches the stable jobs to bullseye | 23:42 |
ianw | as you noted, i think they're lightly used | 23:43 |
clarkb | corvus: mordred I +2'd the eavesdrop change as I had previously given that an in depth review but only +1'd the gerritbot one since I haven't had a chance to really dig into it yet but noted the issues with testing that were called out had been addressed. I'm not approving anything as I'm currently trying to make some ramen and won't be able to monitor | 23:52 |
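
Editor's note on the 14:46 tag discussion: the sketch below checks a few tag names against the two regular expressions fungi quoted for the release and pre-release pipelines. It is only an illustration using Python's `re` module. The helper name `pipelines_for_tag` is hypothetical, and treating the "tag" pipeline as matching everything is taken from fungi's description rather than from the pipeline configuration itself. Zuul's own matching may differ in detail.

```python
import re

# Patterns as quoted in the log at 14:46 (from project-config pipelines.yaml).
RELEASE = re.compile(r"^refs/tags/[0-9]+(\.[0-9]+)*$")
PRE_RELEASE = re.compile(r"^refs/tags/[0-9]+(\.[0-9]+)*(a|b|rc)[0-9]+$")


def pipelines_for_tag(tag):
    """Return the pipeline names whose ref pattern matches this tag.

    Hypothetical helper for illustration only; not part of Zuul.
    """
    ref = "refs/tags/" + tag
    # Assumption from the log: the "tag" pipeline matches any tag value.
    matched = ["tag"]
    if RELEASE.match(ref):
        matched.append("release")
    if PRE_RELEASE.match(ref):
        matched.append("pre-release")
    return matched


if __name__ == "__main__":
    # Tag names mentioned in the conversation, plus a semver-style prerelease.
    for tag in ["1.5.0", "1.5.0-1", "1.27.1.0a1", "1.5.1-rc.1"]:
        print(tag, "->", pipelines_for_tag(tag))
```

Under these patterns a hyphenated tag such as 1.5.0-1 lands only in the "tag" pipeline, which is consistent with the pre-release job not starting and with the suggestion to run such jobs from the "tag" pipeline instead.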
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!