mordred | ianw: oh - I was just following the naming for things that I found in /etc/apt/trusted.gpg.d | 00:05 |
---|---|---|
mordred | ianw: I'd love being able to just drop it in there with an asc instead of needing to apt-key it | 00:06 |
ianw | mordred: should work, it does for nodepool :) | 00:08 |
fungi | yes, ubuntu focal and debian buster should support it | 00:09 |
fungi | anything older than that won't, but for this that's probably fine? | 00:10 |
mordred | ianw: neat | 00:10 |
ianw | containers, yay! | 00:10 |
ianw | speaking of, xenial is just screwed for pip-and-virtualenv-less ... my thought that virtualenv upgraded the ancient 8.0 pip doesn't work | 00:12 |
ianw | fungi: one thought is that we disable the wheel cache for xenial ... | 00:18 |
fungi | that's a possibility | 00:19 |
fungi | is the wheel cache causing issues, or... | 00:19 |
ianw | what's more important, the cache or shipping unadulterated images? | 00:19 |
ianw | fungi: it's the 8.0 version of pip on xenial that can't figure out how to fall back to pypi if it doesn't exist in our wheel cache | 00:19 |
fungi | oh, too-old packaged pip on xenial can't be configured to find our wheel cache, right | 00:20 |
fungi | can't be configured to correctly use our wheel cache anyway | 00:20 |
ianw | i wonder if you can specify pypi manually *as well* as our cache? | 00:20 |
ianw | oh that's what we do | 00:21 |
fungi | yeah, i think it only uses one or the other | 00:21 |
fungi | unless you're running at least pip... 9.x? | 00:22 |
fungi | i wonder if there isn't a newer python-pip backport we could just install | 00:22 |
fungi | nope, not from xenial-backports anyway | 00:23 |
mordred | so - this is one of those reasons why I've been firmly in the "screw distro supplied pip" camp this whole time | 00:23 |
mordred | I think pip is moving too fast | 00:23 |
mordred | and I think the value in distro-provided pip is negative | 00:23 |
mordred | especially given the lifespan of ubuntu LTS's | 00:24 |
mordred | python-pip on xenial is going to be a nightmare to deal with - and in a year python-pip from bionic will be a nightmare because it won't support depsolving | 00:24 |
fungi | which will probably be fine as long as jobs are solving their "i need pip" at runtime | 00:25 |
mordred | (I *do* like the python -m venv approach - don't get me wrong on that) | 00:25 |
mordred | just saying - I think distro pip should be avoided like the plague in all circumstances | 00:25 |
fungi | ianw: remind me why we can't install newer pip in a virtualenv on xenial and link it somewhere in the default path? | 00:25 |
mordred | pip in a virtualenv installs things into that virtualenv | 00:26 |
mordred | because magic | 00:26 |
fungi | d'oh, yep exactly why ;) | 00:26 |
fungi | i guess the really for realz solution is don't run pip in a system context, use pip to install things in a virtualenv | 00:27 |
ianw | yeah, i just really wanted to be out of this "what owns pip" game | 00:27 |
ianw | fungi: that's the problem though ... you can't get a working virtualenv in our mirror setup with xenial | 00:28 |
mordred | fungi: yeah - but even then- my argument is that if you're going to run pip in a system context to install stuff, you've already decided to pollute the system - so you might as well install pip that way too | 00:28 |
ianw | mordred: yeah ... my hope is that people can opt-in to that though | 00:29 |
ianw | at the very least; if we move this from pip-and-virtualenv into roles we can at least speculatively test | 00:29 |
mordred | ianw: I hear you - and I'm drinking wine, so ignore me - but I'd say this is an area where people having an alternate opinion is going to be habit and not a realistic need | 00:29 |
mordred | ianw: ++ | 00:29 |
mordred | moving into roles == yes | 00:29 |
mordred | ok. I'm not really here - as much as I love debating pip :) | 00:30 |
mordred | ianw: thanks for the tip on https://review.opendev.org/#/c/724344/ - it is now fixed per your suggestion | 00:31 |
ianw | we get back into this crap of "what owns pip". you get-pip.py install with python2 which takes over /usr/local/bin/pip, then install get-pip with python3 which takes it back | 00:31 |
ianw | mordred: cool! it took me a bunch of runs to figure out the asc/gpg thing :) | 00:32 |
mordred | ianw: I'm thrilled you learned it! because the apt-key dance really blows | 00:32 |
fungi | ianw: is there a path to some solution where we don't preinstall any sort of pip on xenial and then let jobs install whatever pip they need however makes the most sense for them? | 00:33 |
fungi | for example, a python3 based integration test probably doesn't need python2 pip installed at all | 00:33 |
ianw | fungi: we really need to be involved early because the base roles use pip: in ansible to do various things | 00:33 |
fungi | oh, hrm, so it's more of an ansible bootstrapping problem then? | 00:34 |
ianw | zuul-jobs bootstrapping i guess | 00:34 |
ianw | https://review.opendev.org/#/c/724776/ is testing the -plain nodes with zuul-jobs | 00:35 |
*** mlavalle has quit IRC | 00:39 | |
*** DSpider has quit IRC | 00:40 | |
*** sangeet has quit IRC | 00:53 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] xenial pip install https://review.opendev.org/724788 | 02:24 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] xenial pip install https://review.opendev.org/724788 | 03:37 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] xenial pip install https://review.opendev.org/724788 | 03:44 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] xenial pip install https://review.opendev.org/724788 | 05:18 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] xenial pip install https://review.opendev.org/724788 | 05:24 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] xenial pip install https://review.opendev.org/724788 | 05:30 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: [wip] xenial pip install https://review.opendev.org/724788 | 05:36 |
*** rchurch has quit IRC | 06:37 | |
*** rchurch has joined #opendev | 06:40 | |
*** ralonsoh has joined #opendev | 07:08 | |
*** DSpider has joined #opendev | 08:16 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Control log archive and user preservation with vars https://review.opendev.org/701381 | 09:25 |
*** tosky has joined #opendev | 09:30 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Do not keep owner when pulling zuul-output https://review.opendev.org/701381 | 09:39 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check for loop_control in with_ type loops https://review.opendev.org/724810 | 09:48 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-repositories: Add role https://review.opendev.org/717507 | 09:49 |
*** smcginnis has quit IRC | 09:56 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-repositories: Add role https://review.opendev.org/717507 | 10:12 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-repositories: Add role https://review.opendev.org/717507 | 10:18 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-repositories: Add role https://review.opendev.org/717507 | 10:31 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-nodejs: refactor to use ensure-repositories https://review.opendev.org/717508 | 10:31 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: refactor to use ensure-repositories https://review.opendev.org/717509 | 10:31 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: install-yarn: always install https://review.opendev.org/717375 | 10:31 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-kubernetes: refactor to use ensure-repositories https://review.opendev.org/717510 | 10:31 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-podman: refactor to use ensure-repositories https://review.opendev.org/717511 | 10:31 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-docker: refactor to use ensure-repositories https://review.opendev.org/717512 | 10:31 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: cleanup: move tests to use ensure-repositories https://review.opendev.org/717513 | 10:31 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: run ensure-nodejs before https://review.opendev.org/717817 | 10:33 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-repositories: Add role https://review.opendev.org/717507 | 10:38 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-nodejs: refactor to use ensure-repositories https://review.opendev.org/717508 | 10:38 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: refactor to use ensure-repositories https://review.opendev.org/717509 | 10:38 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: install-yarn: always install https://review.opendev.org/717375 | 10:38 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-kubernetes: refactor to use ensure-repositories https://review.opendev.org/717510 | 10:38 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-podman: refactor to use ensure-repositories https://review.opendev.org/717511 | 10:38 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-docker: refactor to use ensure-repositories https://review.opendev.org/717512 | 10:38 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: cleanup: move tests to use ensure-repositories https://review.opendev.org/717513 | 10:38 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: run ensure-nodejs before https://review.opendev.org/717817 | 10:39 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-repositories: Add role https://review.opendev.org/717507 | 11:03 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-nodejs: refactor to use ensure-repositories https://review.opendev.org/717508 | 11:03 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: refactor to use ensure-repositories https://review.opendev.org/717509 | 11:03 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: install-yarn: always install https://review.opendev.org/717375 | 11:03 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-kubernetes: refactor to use ensure-repositories https://review.opendev.org/717510 | 11:03 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-podman: refactor to use ensure-repositories https://review.opendev.org/717511 | 11:03 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-docker: refactor to use ensure-repositories https://review.opendev.org/717512 | 11:03 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: cleanup: move tests to use ensure-repositories https://review.opendev.org/717513 | 11:03 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: run ensure-nodejs before https://review.opendev.org/717817 | 11:05 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: run ensure-nodejs before https://review.opendev.org/717817 | 11:09 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check for loop_control in with_ type loops https://review.opendev.org/724810 | 11:17 |
*** tkajinam has quit IRC | 11:25 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check for loop_control in with_ type loops https://review.opendev.org/724810 | 11:39 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check for loop_control in with_ type loops https://review.opendev.org/724810 | 11:45 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-repositories: Add role https://review.opendev.org/717507 | 12:05 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-nodejs: refactor to use ensure-repositories https://review.opendev.org/717508 | 12:05 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: refactor to use ensure-repositories https://review.opendev.org/717509 | 12:05 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: install-yarn: always install https://review.opendev.org/717375 | 12:05 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-kubernetes: refactor to use ensure-repositories https://review.opendev.org/717510 | 12:05 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-podman: refactor to use ensure-repositories https://review.opendev.org/717511 | 12:05 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-docker: refactor to use ensure-repositories https://review.opendev.org/717512 | 12:05 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: cleanup: move tests to use ensure-repositories https://review.opendev.org/717513 | 12:05 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: run ensure-nodejs before https://review.opendev.org/717817 | 12:09 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Write a buildkitd config file pointing to buildset registry https://review.opendev.org/724757 | 12:34 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Write a buildkitd config file pointing to buildset registry https://review.opendev.org/724757 | 12:38 |
*** priteau has joined #opendev | 12:45 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-package-repositories: Add role https://review.opendev.org/717507 | 12:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-nodejs: refactor to use ensure-package-repositories https://review.opendev.org/717508 | 12:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: refactor to use ensure-package-repositories https://review.opendev.org/717509 | 12:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: install-yarn: always install https://review.opendev.org/717375 | 12:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-kubernetes: refactor to use ensure-package-repositories https://review.opendev.org/717510 | 12:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-podman: refactor to use ensure--package-repositories https://review.opendev.org/717511 | 12:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-docker: refactor to use ensure-package-repositories https://review.opendev.org/717512 | 12:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: cleanup: move tests to use ensure-package-repositories https://review.opendev.org/717513 | 12:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: run ensure-nodejs before https://review.opendev.org/717817 | 12:49 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Write buildkitd.toml in use-buildset-registry https://review.opendev.org/724837 | 12:58 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-nodejs: refactor to use ensure-package-repositories https://review.opendev.org/717508 | 13:01 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: refactor to use ensure-package-repositories https://review.opendev.org/717509 | 13:01 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: install-yarn: always install https://review.opendev.org/717375 | 13:01 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-kubernetes: refactor to use ensure-package-repositories https://review.opendev.org/717510 | 13:01 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-podman: refactor to use ensure--package-repositories https://review.opendev.org/717511 | 13:01 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-docker: refactor to use ensure-package-repositories https://review.opendev.org/717512 | 13:01 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: cleanup: move tests to use ensure-package-repositories https://review.opendev.org/717513 | 13:01 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Write a buildkitd config file pointing to buildset registry https://review.opendev.org/724757 | 13:03 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Write buildkitd.toml in use-buildset-registry https://review.opendev.org/724837 | 13:03 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: run ensure-nodejs before https://review.opendev.org/717817 | 13:05 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Remove zuul-jobs notification from #openstack-infra https://review.opendev.org/724838 | 13:06 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: cleanup: move tests to use ensure-package-repositories https://review.opendev.org/717513 | 13:09 |
AJaeger | zbr, see 724838 | 13:11 |
*** diablo_rojo has quit IRC | 13:13 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-package-repositories: Add role https://review.opendev.org/717507 | 13:36 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-nodejs: refactor to use ensure-package-repositories https://review.opendev.org/717508 | 13:36 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: refactor to use ensure-package-repositories https://review.opendev.org/717509 | 13:36 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: install-yarn: always install https://review.opendev.org/717375 | 13:36 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-kubernetes: refactor to use ensure-package-repositories https://review.opendev.org/717510 | 13:36 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-podman: refactor to use ensure--package-repositories https://review.opendev.org/717511 | 13:36 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-docker: refactor to use ensure-package-repositories https://review.opendev.org/717512 | 13:36 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: cleanup: move tests to use ensure-package-repositories https://review.opendev.org/717513 | 13:36 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: use-buildset-registry: fix modify_registries_conf library idempotency https://review.opendev.org/724840 | 13:36 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: use-buildset-registry: do not update ca when not necessary https://review.opendev.org/724841 | 13:40 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Write a buildkitd config file pointing to buildset registry https://review.opendev.org/724757 | 13:43 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Write buildkitd.toml in use-buildset-registry https://review.opendev.org/724837 | 13:44 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-package-repositories: Add role https://review.opendev.org/717507 | 13:49 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-nodejs: refactor to use ensure-package-repositories https://review.opendev.org/717508 | 13:49 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: refactor to use ensure-package-repositories https://review.opendev.org/717509 | 13:49 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: install-yarn: always install https://review.opendev.org/717375 | 13:49 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-kubernetes: refactor to use ensure-package-repositories https://review.opendev.org/717510 | 13:49 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-podman: refactor to use ensure--package-repositories https://review.opendev.org/717511 | 13:49 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-docker: refactor to use ensure-package-repositories https://review.opendev.org/717512 | 13:49 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: cleanup: move tests to use ensure-package-repositories https://review.opendev.org/717513 | 13:49 |
*** smcginnis has joined #opendev | 13:54 | |
corvus | clarkb, fungi: https://review.opendev.org/724199 passes tests now | 13:55 |
fungi | thanks! | 13:59 |
*** tkajinam has joined #opendev | 14:01 | |
*** DSpider has quit IRC | 14:13 | |
*** DSpider has joined #opendev | 14:16 | |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Write a buildkitd config file pointing to buildset registry https://review.opendev.org/724757 | 14:18 |
AJaeger | zbr: adding infra folks to reviews is most often ignored, better ask here... | 14:22 |
zbr | AJaeger: i wonder if there is a better way to assure reviews are done without poking or begging ;) | 14:27 |
AJaeger | zbr: good question. It's tough. clarkb is that something for our virtual PTG? Or for a team meeting? | 14:28 |
zbr | i know we have a serious amount of reviews, and no clean way to distinguish between a random CR and one that is really ready for review. | 14:30 |
zbr | if drafts would be enabled it would be easier | 14:30 |
zbr | sadly it is quite common to add people as reviewers when the code is not even passing the checks | 14:31 |
AJaeger | zbr: most infra folks just ignore being added since there are too many reviews... | 14:32 |
zbr | i know, i was trying to find ways to improve the situation by lowering the amount of noise | 14:33 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add ansible-lint rule to check owner and group is not preserved https://review.opendev.org/724855 | 14:34 |
openstackgerrit | Merged opendev/system-config master: Meetpad: redirect 80 to 443 https://review.opendev.org/724199 | 14:34 |
AJaeger | what is 'Limestone Networks CI ' - reports now on zuul-jobs ;( | 14:43 |
AJaeger | logan-: could you check your config, please? ^ | 14:44 |
AJaeger | logan-: https://review.opendev.org/717375 has "This change depends on a change that failed to merge." | 14:44 |
AJaeger | - by your CI | 14:44 |
logan- | looking into it | 14:44 |
AJaeger | thanks, logan- | 14:46 |
logan- | stopped zuul-scheduler on my side while we investigate. | 14:46 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add ansible-lint rule to check owner and group is not preserved https://review.opendev.org/724855 | 14:48 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Write a buildkitd config file pointing to buildset registry https://review.opendev.org/724757 | 14:50 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add ansible-lint rule to check owner and group is not preserved https://review.opendev.org/724855 | 14:51 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Do not keep owner when pulling zuul-output https://review.opendev.org/701381 | 14:51 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add ansible-lint rule to check owner and group is not preserved https://review.opendev.org/724855 | 14:52 |
fungi | thanks logan-! | 14:52 |
logan- | yep - thanks for the heads up! sorry about that. it will be interesting to see why that happened... it is not a new zuul installation and it just uses zuul-jobs to pull in the upstream jobs library. :/ | 14:53 |
AJaeger | logan-: I think we had that problem in our install as well. This is Zuul v3? Maybe something to discuss on #zuul and check that there is no bug or misconfiguration. | 14:55 |
logan- | ack. will do. thanks | 14:55 |
openstackgerrit | Merged zuul/zuul-jobs master: ensure-package-repositories: Add role https://review.opendev.org/717507 | 14:57 |
openstackgerrit | Merged zuul/zuul-jobs master: ensure-nodejs: refactor to use ensure-package-repositories https://review.opendev.org/717508 | 14:57 |
openstackgerrit | Merged zuul/zuul-jobs master: ensure-yarn: refactor to use ensure-package-repositories https://review.opendev.org/717509 | 14:57 |
openstackgerrit | Merged zuul/zuul-jobs master: install-yarn: always install https://review.opendev.org/717375 | 14:57 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: ensure-yarn: run ensure-nodejs before https://review.opendev.org/717817 | 15:08 |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: fetch-sphinx-tarbal: allow to follow symlinks https://review.opendev.org/724868 | 15:13 |
openstackgerrit | Merged zuul/zuul-jobs master: ensure-kubernetes: refactor to use ensure-package-repositories https://review.opendev.org/717510 | 15:28 |
openstackgerrit | Merged zuul/zuul-jobs master: ensure-podman: refactor to use ensure--package-repositories https://review.opendev.org/717511 | 15:28 |
*** dpawlik has joined #opendev | 15:30 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: haskell-stack-test: add haskell tool stack test https://review.opendev.org/723263 | 15:36 |
openstackgerrit | Merged zuul/zuul-jobs master: ensure-docker: refactor to use ensure-package-repositories https://review.opendev.org/717512 | 15:36 |
openstackgerrit | Merged zuul/zuul-jobs master: cleanup: move tests to use ensure-package-repositories https://review.opendev.org/717513 | 15:39 |
openstackgerrit | Merged zuul/zuul-jobs master: Add Bazel build and ensure roles https://review.opendev.org/693513 | 15:39 |
openstackgerrit | Merged zuul/zuul-jobs master: Do not keep owner when pulling zuul-output https://review.opendev.org/701381 | 15:39 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add ansible-lint rule to check owner and group is not preserved https://review.opendev.org/724855 | 15:42 |
openstackgerrit | Jeremy Stanley proposed openstack/project-config master: Clean up unused IRC channels https://review.opendev.org/724878 | 15:45 |
openstackgerrit | Jeremy Stanley proposed opendev/system-config master: Clean up unused IRC channels https://review.opendev.org/724879 | 15:45 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add ansible-lint rule to check owner and group is not preserved https://review.opendev.org/724855 | 15:48 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add ansible-lint rule to check owner and group is not preserved https://review.opendev.org/724855 | 15:49 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add ansible-lint rule to check owner and group is not preserved https://review.opendev.org/724855 | 15:53 |
clarkb | zuul memory use has started increasing quickly in the last few hours | 15:57 |
clarkb | looks like the scheduler may be growing | 15:57 |
AJaeger | we merged quite a few changes for zuul-jobs | 15:58 |
clarkb | 14:45 UTC seems like roughly when it started to increase the derivative on that line | 15:59 |
AJaeger | https://review.opendev.org/#/c/717507/13 merged at 14:56 UTC | 16:00 |
openstackgerrit | Merged openstack/project-config master: Remove zuul-jobs notification from #openstack-infra https://review.opendev.org/724838 | 16:00 |
AJaeger | was approved at 14:42 UTC | 16:00 |
clarkb | usually when the scheduler increases memory use it's due to leaked config objects iirc | 16:02 |
AJaeger | and we had a confused scheduler, see my comments on #zuul at 14:46 | 16:03 |
fungi | clarkb: AJaeger: according to top it's the zuul-scheduler process and not zuul-web consuming the bulk of the memory this time | 16:05 |
fungi | though zuul-web is still using far more than it probably should too | 16:06 |
clarkb | fungi: yup | 16:06 |
openstackgerrit | Merged zuul/zuul-jobs master: Check for loop_control in with_ type loops https://review.opendev.org/724810 | 16:06 |
clarkb | fungi: usually when it's the scheduler that has run away from us in the past | 16:06 |
fungi | restarting zuul-web might be a relatively hitless way to temporarily free up some more ram for the scheduler though | 16:07 |
*** dpawlik has quit IRC | 16:08 | |
clarkb | https://review.opendev.org/#/c/724778/ is related to disk use on zuul01 | 16:08 |
clarkb | (not memory) | 16:08 |
openstackgerrit | Merged zuul/zuul-jobs master: fetch-sphinx-tarbal: allow to follow symlinks https://review.opendev.org/724868 | 16:09 |
openstackgerrit | Merged zuul/zuul-jobs master: haskell-stack-test: add haskell tool stack test https://review.opendev.org/723263 | 16:10 |
*** mlavalle has joined #opendev | 16:10 | |
clarkb | in earlier investigations was it always zuul-web that was running away? | 16:16 |
openstackgerrit | Jeremy Stanley proposed openstack/project-config master: Create new project for OpenDev Engagement Stats https://review.opendev.org/724886 | 16:25 |
fungi | clarkb: over the past few days it seemed that way, at least | 16:25 |
fungi | but it's possible we're conflating two independent memory leaks | 16:26 |
*** tkajinam has quit IRC | 16:26 | |
clarkb | looking through git history https://review.opendev.org/#/c/718160/ stands out given it touches the config loading. However, it should only add local scope vars which should make this safe | 16:32 |
clarkb | I've not managed to keep up with zuul development in the last few weeks so others may have a better idea of what is happening | 16:34 |
fungi | yeah, same, i've been heads-down on other things | 16:37 |
openstackgerrit | Jeremy Stanley proposed openstack/project-config master: Create new project for OpenDev Engagement Stats https://review.opendev.org/724886 | 16:47 |
openstackgerrit | Jeremy Stanley proposed openstack/project-config master: Add Engagement Statistics to docs index https://review.opendev.org/724892 | 16:47 |
clarkb | is it possible the changes are due to python runtime and/or libc? | 16:59 |
clarkb | we switched from python3.5 to python3.7 and ubuntu xenial libc to debian something | 16:59 |
fungi | entirely possible, sure | 17:01 |
openstackgerrit | Merged zuul/zuul-jobs master: tox: Use 'block: ... always: ...' instead of ignore_errors https://review.opendev.org/723640 | 17:01 |
openstackgerrit | Merged zuul/zuul-jobs master: ensure-sphinx: use failed_when: false instead of ignore_errors: true https://review.opendev.org/723642 | 17:01 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Write a buildkitd config file pointing to buildset registry https://review.opendev.org/724757 | 17:10 |
mordred | clarkb: worth noting we should also be running with jemalloc - but I believe we were running with that before | 17:15 |
clarkb | mordred: we were | 17:15 |
clarkb | though that's worth double checking on the images | 17:16 |
mordred | yeah | 17:16 |
clarkb | mordred: searching for malloc in zuul/zuul doesn't give me any results | 17:16 |
mordred | actually - I don't see libjemalloc1 installed on the scheduler and I don't think we did a remove on it - so it's also possible we _weren't_ running with jemalloc before | 17:17 |
mordred | clarkb: we add jemalloc in the python-base image | 17:17 |
mordred | although IIRC tobiash is also running with jemalloc | 17:17 |
clarkb | mordred: but we have to configure python to use it right? do we do that in python-base too? | 17:17 |
mordred | we set LD_PRELOAD | 17:17 |
mordred | ENV LD_PRELOAD /usr/lib/x86_64-linux-gnu/libjemalloc.so.2 | 17:17 |
clarkb | mordred: looking at puppet we may have only set it on the executors | 17:19 |
clarkb | (before) | 17:19 |
clarkb | mordred: maybe we want to unset the env var for the scheduler and web? | 17:20 |
mordred | clarkb: we actually added it based on graphs from tobiash when he was dealing with memory leaks: d7c0be958df516c70eb7955e1783da991c3d36fb | 17:22 |
mordred | (in system-config) | 17:22 |
clarkb | mordred: ya and we added it to the executors and it did seem to help there | 17:23 |
clarkb | mordred: but it doesn't look like we ever ran the scheduler under it? | 17:24 |
mordred | clarkb: just confirmed though - we DO have LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 in the scheduler container | 17:24 |
mordred | clarkb: yeah - possibly not - so it's certainly a thing to try | 17:24 |
mordred | in our list | 17:24 |
clarkb | mordred: ya and in zuul-web. I checked /proc for those | 17:24 |
openstackgerrit | Merged zuul/zuul-jobs master: ensure-yarn: run ensure-nodejs before https://review.opendev.org/717817 | 17:24 |
clarkb | I guess the other thing to check is that the path to the so is correct | 17:24 |
mordred | but I'd be surprised if it was the cause (also, we're not running current executors with it - so we should fix that) | 17:24 |
mordred | clarkb: yeah - I confirmed that in the scheduler image | 17:24 |
clarkb | our sigusr2 yappi dumps will show object counts | 17:29 |
clarkb | if we end up needing to restart again maybe we can capture that data first | 17:29 |
fungi | this is probably the time to collect it, while the system is still reasonably responsive? | 17:30 |
clarkb | fungi: oh good point | 17:30 |
clarkb | mordred: if we want to send a sigusr2 to zuul-web and/or scheduler we can do that from the host via kill right? we don't need to docker exec? | 17:31 |
mordred | clarkb: uh - unknown. I think I'd probably docker exec it myself - but it's probably fine from the host? | 17:33 |
fungi | i would be very surprised if you needed to docker exec a kill command. doing that from the man system context using the pid from the main process space should be fine | 17:33 |
fungi | s/man/main/ | 17:33 |
fungi | otherwise things like shutdown wouldn't work | 17:33 |
clarkb | ok next thing if we do it from docker exec how do we get the pid in the container to kill ? | 17:38 |
clarkb | there is no ps in our images | 17:38 |
clarkb | maybe doing it from the host is better after all :) | 17:38 |
clarkb | I think our 3 zuul-scheduler processes are pids 1, 6, and 11 in the container. With the first one being dumb-init? | 17:45 |
clarkb | so `sudo docker exec zuul-scheduler_scheduler_1 kill -usr2 6` or `sudo kill -usr2 17601` should be equivalent | 17:46 |
*** priteau has quit IRC | 17:46 | |
clarkb | infra-root we've lost half a gig of memory free while I sorted ^ out | 17:51 |
clarkb | I think my preference is to kill from the host because ps works from the host side and I don't have to look at /proc directly | 17:51 |
clarkb | any opposition to me running `sudo kill -usr2 17601` giving it a minute then running it again? | 17:51 |
clarkb | (running it a second time disables the yappi instrumentation) | 17:52 |
fungi | sounds reasonable to me | 17:56 |
fungi | disables yappi instrumentation and, more importantly, provides the yappi report | 17:56 |
clarkb | I've issued the first | 17:58 |
clarkb | and the second | 17:59 |
clarkb | I'm not sure that worked | 17:59 |
clarkb | may need to issue signals via the container context after all | 17:59 |
clarkb | (I didn't get any of the log output I expected after the second one) | 18:00 |
clarkb | now I'm trying to figure out how to double check pid 6 is parent of 11 (the numbers imply it is but I want to be sure before I exec in the container) | 18:01 |
clarkb | ok I'm like 95% confident pid 6 is what I want | 18:03 |
clarkb | so here goes | 18:03 |
*** smcginnis has quit IRC | 18:03 | |
clarkb | `sudo docker exec zuul-scheduler_scheduler_1 kill -USR2 6` is my command | 18:03 |
clarkb | and there is no kill executable | 18:03 |
clarkb | do I need to write a python program to send the signal? | 18:04 |
clarkb | `sudo docker exec zuul-scheduler_scheduler_1 python3 -c 'import os; import signal; os.kill(6, signal.SIGUSR2)'` is my new command | 18:06 |
clarkb | that worked based on logs | 18:07 |
clarkb | I thought containers were supposed to make this easier :) | 18:07 |
fungi | or we could extend the command socket with some way to trigger those handlers | 18:11 |
clarkb | hrm maybe I just missed it in the logs the first time around. Grepping shows it may have worked | 18:12 |
clarkb | I ran kill from host two times, kill from container failed, and ran python3 kill from container 2 times so we should be steady state with yappi off right now | 18:13 |
fungi | and we probably have two yappi reports so could even compare them to see what the growth between them looks like | 18:15 |
clarkb | there are 718170 mappingproxy objects | 18:15 |
clarkb | that is a lot of dicts | 18:17 |
clarkb | we have 34k NodeSet objects, 33k FileMatcher objects, 32k Job objects | 18:18 |
clarkb | Probably not enough to say we are leaking configs, but it certainly seems to point towards that? | 18:18 |
fungi | how are those trending between the reports? | 18:19 |
clarkb | fungi: they weren't moving but also it's only about a minute between reports which may not be long enough | 18:19 |
*** smcginnis has joined #opendev | 18:19 | |
clarkb | also it seems that zuul-scheduler has shrunk by a couple gigs since I started doing this | 18:20 |
fungi | well, it does at least indicate that the increases aren't continuous but are more likely event-driven | 18:20 |
clarkb | ya | 18:20 |
clarkb | http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=64792&rra_id=all shows the drop | 18:21 |
clarkb | and it started right about when I sent the signals | 18:21 |
clarkb | (weird) | 18:21 |
clarkb | that does make me wonder if there is some sort of python gc issue | 18:22 |
clarkb | and tripping the signal handlers somehow unsticks it | 18:22 |
mordred | clarkb: and the yappi is tickling the gc to actually gc things | 18:22 |
mordred | yeah | 18:22 |
clarkb | because otherwise why would it drop coincident with my signals | 18:22 |
clarkb | does zuul-web have the yappi stuff too? | 18:25 |
clarkb | I wonder if we can reproduce the memory drop by signalling there as well | 18:25 |
clarkb | mordred: oh also the two steps coincide with my two signals | 18:25 |
clarkb | I signaled at 17:58 ish and 18:06 ish | 18:25 |
clarkb | so 18:00 snmp sees first drop and 18:05 sees second? | 18:26 |
clarkb | reproducing the behavior with zuul-web if it supports the signal handler could be interesting though | 18:26 |
clarkb | ya web inherits from ZuulDaemonApp so should be in the same boat | 18:27 |
AJaeger | clarkb: setup a cron job ;) | 18:27 |
clarkb | I'll give zuul-web the two signals now | 18:27 |
*** ralonsoh has quit IRC | 18:29 | |
clarkb | now we have 14GB free | 18:29 |
clarkb | lets see if cacti corroborates at 18:30 | 18:29 |
mordred | clarkb: if it does, it's going to make me think bad thoughts about py3.7 :) | 18:31 |
clarkb | mordred: me too | 18:31 |
clarkb | http://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=64792&rra_id=5&view_type=&graph_start=1588355379&graph_end=1588357607&graph_height=120&graph_width=500&title_font_size=12 | 18:33 |
clarkb | so ya I think signals improve things | 18:33 |
clarkb | its not perfect we are still using significant memory but sending signals causes it to get better | 18:33 |
clarkb | which really does make me think the gc thread is getting stuck and signals unstick it? signals are always handled by the main process thread. I don't know what context gc runs in but wouldn't surprise me if it is main process thread too | 18:34 |
clarkb | AJaeger: ya maybe we should do a kill -USR2 doubletap every hour :) | 18:34 |
clarkb | at least until we have a proper fix anyway | 18:34 |
mordred | clarkb: so - I think with that data - we should try 3.8 and 3.6 | 18:34 |
fungi | if it is something along those lines, then yes, the interpreter version/build seems likely to be what caused this to arise | 18:35 |
clarkb | mordred: ++ | 18:35 |
mordred | let me get a change up to switch to 3.8 - that's the easiest thing to try | 18:35 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 18:51 |
jrosser | I’ve seen a bunch of fetch-sphinx-tarball fail like this today https://zuul.opendev.org/t/openstack/build/218439d35cbd42d09d3f4e474641a26a | 18:52 |
jrosser | it looks like it is stat’ing a nonexistent file and not checking that item.stat is defined? | 18:53 |
*** sgw has joined #opendev | 18:55 | |
clarkb | jrosser: there was a renaming of the loop var item in zuul-jobs because if you nest loops you get bad results with the default item var name | 18:57 |
clarkb | jrosser: I expect this is fallout from that and we need to fix the use of item there to whatever item was renamed to | 18:57 |
jrosser | oh right - that would do it yes | 18:57 |
clarkb | jrosser: https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/fetch-sphinx-tarball/tasks/pdf.yaml#L48 | 18:58 |
clarkb | ya thats the issue we need a change to update line 42 | 18:59 |
clarkb | I'm eating lunch but can work that up afterwards | 18:59 |
jrosser | cool - thanks! | 19:00 |
clarkb | jrosser: thank you for pointing that out | 19:00 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 19:04 |
jrosser | clarkb: I wonder if there is a case where stat.path still does not exist? there might be a lurking bug there | 19:05 |
jrosser | stat.<thing> being undefined has caught me out sooo many times | 19:06 |
openstackgerrit | Jeremy Stanley proposed openstack/project-config master: Add the Gerrit reviewers plugin repository to Zuul https://review.opendev.org/724913 | 19:08 |
openstackgerrit | Jeremy Stanley proposed opendev/system-config master: Add the Gerrit reviewers plugin to Gerrit builds https://review.opendev.org/724914 | 19:09 |
openstackgerrit | Clark Boylan proposed zuul/zuul-jobs master: Fix some item usage where var is zj_* now https://review.opendev.org/724916 | 19:10 |
clarkb | jrosser: ^ I think that should handle it | 19:10 |
AJaeger | how long does it take until gerritbot changes are in effect? 724838 merged at 16:00 UTC and there are still zuul-jobs notifications | 19:10 |
clarkb | AJaeger: I think gerritbot may still not be updating? | 19:11 |
AJaeger | clarkb: still not - I see | 19:11 |
mordred | yeah- we're still not updating (sorry - that's on me - it's been a long-tail of things) | 19:11 |
AJaeger | clarkb: thanks for fixing the zuul-jobs change, missed that in review ;( | 19:11 |
jrosser | clarkb: i've been using this sort of thing quite a lot recently to build new lists with set_fact http://paste.openstack.org/show/792992/ | 19:13 |
jrosser | then you'd get a nice clean list and could drop a bunch of when: | 19:13 |
clarkb | jrosser: because an empty list is the same as when: not empty ? | 19:14 |
mordred | clarkb: yeah - iteration 0 times | 19:15 |
avass | jrosser: sorry for that | 19:19 |
clarkb | infra-root I'm going to try another round of SIGUSR2s | 19:19 |
clarkb | (more datapoints to collect) | 19:19 |
fungi | indeed, will be interesting if we see another drop in allocations | 19:19 |
clarkb | hrm did SIGUSR2 actually restart zuul-web? | 19:20 |
clarkb | I see new pids now | 19:20 |
fungi | that would be interesting if so... wouldn't it need a new container to fork a new zuul-web? | 19:20 |
clarkb | that is exceptionally weird since I checked that it has the handler | 19:20 |
fungi | oh, no right, we use an init within the container don't we? | 19:20 |
clarkb | fungi: not an init capable of restarting processes and it is running under a new container | 19:21 |
clarkb | maybe we restart zuul-web when new images show up and that coincided with my signals. I'm going to see if s | 19:21 |
clarkb | er I'm going to see if sigusr2 causes it to restart again | 19:21 |
fungi | that sounds far more likely | 19:21 |
clarkb | It shouldn't | 19:21 |
clarkb | but I'll be looking closely now | 19:21 |
fungi | the new images theory i mean | 19:22 |
clarkb | 2020-05-01 19:22:26,764 DEBUG zuul.web.ZuulWeb: ZuulWeb stopping | 19:22 |
clarkb | I think zuul-web is buggy with the USR2 handling | 19:23 |
clarkb | that makes me less confident in the gc python3.7 theory though we still saw that with scheduler and it did not restart | 19:23 |
clarkb | it says it is stopping but the process is still running | 19:24 |
clarkb | ah there it goes it finally restarted | 19:24 |
fungi | neat! | 19:25 |
clarkb | once its been long enough for cacti to record the memory use change from that restart I'm going to try sigusr2 on the scheduler again | 19:26 |
clarkb | ok I hit the scheduler again | 19:42 |
clarkb | and again we have a big decrease in memory use | 19:42 |
clarkb | and that process isn't restarting | 19:42 |
clarkb | I think despite the unexpected zuul-web behavior the relationship between signal handling and memory use dropping significantly remains | 19:43 |
clarkb | (and we should continue to pursue mordred's python3.8 change) | 19:43 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 19:47 |
clarkb | http://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=64792&rra_id=5&view_type=&graph_start=1588361233&graph_end=1588362192&graph_height=120&graph_width=500&title_font_size=12 that step down was my scheduler sigusr2 | 19:48 |
clarkb | its like we convinced python gc to actually gc :/ | 19:48 |
clarkb | virtual memory doesn't decrease but I'm pretty sure that isn't really something that can decrease? | 19:49 |
clarkb | once you've allocated pages they are yours | 19:49 |
clarkb | but whether or not they are resident depends on whether or not they are actually used | 19:49 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 19:53 |
clarkb | Zuul should be stable for a while now as it is using very little memory at the moment | 19:54 |
clarkb | I'm going to take that as an opportunity to pop out for a bike ride and not be in front of a computer | 19:54 |
corvus | i'm back and catching up | 19:54 |
clarkb | oh I'll wait a moment in case corvus has questions | 19:54 |
johnsom | Is gerrit down? | 19:55 |
clarkb | johnsom: doesn't seem to be | 19:55 |
clarkb | johnsom: web ui is responsive and loads changes for me | 19:55 |
johnsom | Hmm, one window was stuck "working" after adding a comment, one was stuck loading. | 19:55 |
johnsom | Hmm, seems better now | 19:55 |
corvus | clarkb: ack no questions; fascinating result, agree we should try other python vers | 19:57 |
corvus | 1 question | 19:58 |
sgw | slittle1: I am seeing a couple of download failures with: | 19:58 |
sgw | annobin-9.12-1.el7.x86_64.rpm | 19:58 |
sgw | kernel-headers-4.18.0-147.3.1.el8_1.x86_64.rpm | 19:58 |
corvus | clarkb: do you think the gc theory is a good enough lead right now that we should just pursue that, and i should not bother trying to do some kind of repl-based search for leaked layouts? | 19:58 |
sgw | have you seen this on cengn? | 19:58 |
sgw | oops wrong chat! | 19:58 |
sgw | sorry | 19:58 |
clarkb | corvus: probably? I mean we went from ~12GB resident memory to ~9.5GB the first time then we went from 9.8GB to 1.8GB the second time. And both seem to line up with sending the signals | 19:59 |
corvus | kk | 20:00 |
mordred | still could be some sort of soft leak? | 20:00 |
clarkb | corvus: if we were properly leaking configs I wouldn't expect us to drop like that | 20:00 |
corvus | yeah | 20:00 |
clarkb | mordred: thats possible I suppose. Something that yappi or objgraph unsticks | 20:00 |
mordred | yeah | 20:00 |
corvus | but they shouldn't be modifying any references | 20:00 |
clarkb | corvus: correct | 20:00 |
mordred | yeah | 20:00 |
mordred | I'm just going to keep saying yeah | 20:00 |
corvus | yeah | 20:00 |
AJaeger | yeah ;) | 20:01 |
mordred | AJaeger: :) | 20:01 |
mordred | corvus: https://review.opendev.org/#/c/724908/ is the 3.8 change | 20:01 |
clarkb | corvus: its possible that yappi and objgraph info has useful things but I've not looked at it too closely yet | 20:01 |
AJaeger | clarkb: your change https://review.opendev.org/#/c/724916/ has been waiting for nodes for 19 min - hope the SIGUSR2 did not kill anything. I expected it to be merged by now ;( | 20:03 |
AJaeger | infra-root ^ | 20:03 |
clarkb | AJaeger: hrm I wouldn't have expected it to | 20:03 |
AJaeger | looking at https://zuul.opendev.org/t/zuul/status, it has no jobs listed yet - that's weird | 20:04 |
clarkb | oh hrm there do seem like stuck jobs I agree its weird | 20:04 |
clarkb | and maybe the sigusr2 is related | 20:04 |
corvus | it's doing a reconfig right now | 20:04 |
AJaeger | 20 mins and no jobs listed below is broken | 20:04 |
AJaeger | corvus: for 20 mins? | 20:04 |
clarkb | ya I see node requests getting fulfilled in the logs now | 20:05 |
AJaeger | "Queue lengths: 119 events, 0 management events, 523 results." | 20:05 |
clarkb | implying nodepool at least is happy | 20:05 |
corvus | it's still getting build result reports | 20:05 |
corvus | so gearman is working | 20:05 |
clarkb | corvus: I agree just saw a success go by | 20:05 |
AJaeger | very slowly increasing , now 527 results | 20:05 |
corvus | we may just be doing a really long full reconfigure? | 20:05 |
AJaeger | maybe | 20:06 |
corvus | hrm, i don't see the log messages that would indicate that | 20:06 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 20:09 |
corvus | 2020-05-01 20:06:07,689 DEBUG zuul.Pipeline.openstack.release-approval: [e: ac71bcf82a44409d977b6e0b4a41d2b2] Loading dynamic layout (phase 2) | 20:10 |
corvus | 2020-05-01 20:07:39,137 DEBUG zuul.Pipeline.openstack.release-approval: [e: ac71bcf82a44409d977b6e0b4a41d2b2] Loading dynamic layout complete | 20:10 |
corvus | that's a long time | 20:10 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 20:10 |
clarkb | Thread: 140299783370496 Thread-1 d: True <- from yappi I think that is the main pipeline manager thread | 20:11 |
corvus | normally that takes about 6 seconds | 20:11 |
clarkb | and that was from 19:41 ish | 20:11 |
clarkb | (just pointing out the thread seems to be running) | 20:11 |
corvus | yep, i see it making progress | 20:11 |
clarkb | thinking out loud here. Is it possible yappi is still instrumenting zuul? | 20:13 |
clarkb | that does result in a performance penalty iirc which is why I did it in pairs, but maybe the pairs didn't work or we or yappi have a bug? | 20:13 |
corvus | 2020-05-01 19:41:36,887 DEBUG zuul.stack_dump: Starting Yappi | 20:13 |
clarkb | hrm that should've been a stop | 20:13 |
corvus | that's the last yappi log i see; should there be a complement? | 20:13 |
clarkb | corvus: ya there should've been I always did it in pairs but maybe it was already running before and so I was out of sync or something ? | 20:14 |
clarkb | or maybe I issued too quickly when we were still in the handler (I tried to avoid that by checking log tail) | 20:14 |
corvus | clarkb: want to hit it one more time? | 20:14 |
clarkb | corvus: yes doing so now | 20:14 |
clarkb | done | 20:15 |
clarkb | corvus: looking at logs it looks like my first pair wasn't a pair | 20:16 |
clarkb | it reports stopping now (and hopefully goes much quicker now) | 20:16 |
clarkb | and I'll make a note to check the logs for the start stop messages explicitly in the future | 20:16 |
clarkb | it looks like a bunch of jobs have started | 20:17 |
clarkb | so I think that was it | 20:17 |
clarkb | AJaeger: Thank you for pointing that out | 20:17 |
corvus | yeah seems to be faster now | 20:18 |
AJaeger | we've broken devstack jobs ;( see discussion in #zuul | 20:18 |
corvus | last dynamic layout was 3s | 20:18 |
*** DSpider has quit IRC | 20:18 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Revert "Check for loop_control in with_ type loops" https://review.opendev.org/724923 | 20:24 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: use zj_source.zj_source https://review.opendev.org/724925 | 20:28 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 20:38 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Distinguish zj_source and zj_result and item https://review.opendev.org/724925 | 20:41 |
openstackgerrit | Merged zuul/zuul-jobs master: Fix some item usage where var is zj_* now https://review.opendev.org/724916 | 20:47 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Use zj_sphinx_pdf instead of item https://review.opendev.org/724927 | 20:50 |
openstackgerrit | Merged zuul/zuul-jobs master: Distinguish zj_source and zj_result and item https://review.opendev.org/724925 | 20:52 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 20:54 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 21:01 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 21:02 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 21:04 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Use zj_sphinx_pdf instead of item https://review.opendev.org/724927 | 21:06 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 21:11 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Only run tests for ensure-bazel when it is updated https://review.opendev.org/724933 | 21:15 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 21:21 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 21:26 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 22:03 |
*** osmanlicilegi has quit IRC | 22:07 | |
*** dzho has quit IRC | 22:07 | |
*** factor has quit IRC | 22:07 | |
*** elod has quit IRC | 22:07 | |
*** gouthamr has quit IRC | 22:07 | |
*** Dmitrii-Sh has quit IRC | 22:07 | |
*** stephenfin has quit IRC | 22:07 | |
*** ildikov has quit IRC | 22:07 | |
*** elod has joined #opendev | 22:08 | |
*** osmanlicilegi has joined #opendev | 22:10 | |
*** stephenfin has joined #opendev | 22:11 | |
*** gouthamr has joined #opendev | 22:11 | |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Write a buildkitd config file pointing to buildset registry https://review.opendev.org/724757 | 22:32 |
openstackgerrit | Merged zuul/zuul-jobs master: Use zj_sphinx_pdf instead of item https://review.opendev.org/724927 | 23:05 |
*** rosmaita has joined #opendev | 23:14 | |
*** mlavalle has quit IRC | 23:19 | |
*** tosky has quit IRC | 23:45 | |
openstackgerrit | Merged opendev/storyboard-webclient master: Build container images https://review.opendev.org/697322 | 23:48 |
openstackgerrit | Merged opendev/storyboard-webclient master: Update node to v10 https://review.opendev.org/697324 | 23:50 |
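
The SIGUSR2 "doubletap" discussed between 17:51 and 20:16 relies on a handler that toggles profiling on each signal, so an odd number of signals leaves instrumentation running (which is what slowed the scheduler until the extra signal at 20:14). The following is only a minimal Python sketch of that toggle behaviour, not Zuul's actual handler: `ToggleOnUsr2` and `send_usr2` are hypothetical names invented for illustration, and the "start"/"stop" events stand in for the log lines one would check after each signal.

```python
import os
import signal
import time


class ToggleOnUsr2:
    """Hypothetical stand-in for a daemon whose SIGUSR2 handler toggles profiling."""

    def __init__(self):
        self.profiling = False
        self.events = []  # stands in for "Starting"/"Stopping" log lines

    def handler(self, signum, frame):
        # Each signal flips the state; there is no separate on/off signal.
        self.profiling = not self.profiling
        self.events.append("start" if self.profiling else "stop")

    def install(self):
        # Must be called from the main thread, where Python runs signal handlers.
        signal.signal(signal.SIGUSR2, self.handler)


def send_usr2(pid, count, pause=0.05):
    """Send SIGUSR2 `count` times to `pid`, pausing so each one is handled."""
    for _ in range(count):
        os.kill(pid, signal.SIGUSR2)
        time.sleep(pause)
```

For example, after `d = ToggleOnUsr2(); d.install()`, calling `send_usr2(os.getpid(), 2)` leaves `d.profiling` off again, while a single extra signal turns it back on. Since the sender cannot see that state directly in a real daemon, checking the log for a matching start and stop message after each pair, as noted at 20:16, is the safer habit.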
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!