corvus | fungi: are you around to approve https://review.opendev.org/858812 and https://review.opendev.org/858813 ? | 00:03 |
---|---|---|
fungi | sure, just a sec | 00:06 |
fungi | corvus: both lgtm, i've approved them | 00:08 |
opendevreview | Merged opendev/zone-opendev.org master: Add tracing server to DNS https://review.opendev.org/c/opendev/zone-opendev.org/+/858812 | 00:09 |
opendevreview | Merged opendev/system-config master: Add tracing server to inventory https://review.opendev.org/c/opendev/system-config/+/858813 | 00:22 |
corvus | thanks! | 00:23 |
*** ysandeep|out is now known as ysandeep | 01:31 | |
*** rlandy|bbl is now known as rlandy|out | 01:59 | |
*** ysandeep is now known as ysandeep|afk | 04:31 | |
*** dasm is now known as dasm|off | 04:51 | |
*** jpena|off is now known as jpena | 07:19 | |
opendevreview | Rafal Lewandowski proposed openstack/diskimage-builder master: Added cloud-init growpart element https://review.opendev.org/c/openstack/diskimage-builder/+/855856 | 07:28 |
opendevreview | Rafal Lewandowski proposed openstack/diskimage-builder master: Added cloud-init growpart element https://review.opendev.org/c/openstack/diskimage-builder/+/855856 | 07:52 |
dpawlik | is everything fine with the centos 9 stream images: https://nb01.opendev.org/images/ ? | 08:02 |
*** ysandeep|afk is now known as ysandeep | 08:15 | |
*** akahat|ruck is now known as akahat|ruck|lunch | 08:39 | |
*** ysandeep is now known as ysandeep|lunch | 08:52 | |
*** akahat|ruck|lunch is now known as akahat|ruck | 09:31 | |
*** ysandeep|lunch is now known as ysandeep | 10:01 | |
*** rlandy|out is now known as rlandy|rover | 10:29 | |
hrw | morning | 10:38 |
hrw | can someone tell me which repo holds the code for https://zuul.openstack.org/status? I would like to move 'check-arm64' to be right after 'check' | 10:38 |
hrw | (now I do that with tampermonkey script) | 10:38 |
frickler | hrw: that should be part of zuul code, opendev.org/zuul/zuul. but IIUC the ordering is dynamic anyway trying to make efficient use of the three columns | 10:47 |
*** ysandeep is now known as ysandeep|afk | 10:48 | |
frickler | https://opendev.org/zuul/zuul/src/branch/master/web/src/pages/Status.jsx likely | 10:50 |
hrw | thanks | 11:01 |
*** ysandeep|afk is now known as ysandeep | 11:01 | |
*** frenzyfriday is now known as frenzyfriday|food | 11:11 | |
*** ysandeep is now known as ysandeep|afk | 11:36 | |
*** frenzyfriday|food is now known as frenzyfriday | 12:00 | |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Re-expose our Mailman archives.yaml and robots.txt https://review.opendev.org/c/opendev/system-config/+/858913 | 12:11 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Re-expose our Mailman archives.yaml and robots.txt https://review.opendev.org/c/opendev/system-config/+/858913 | 12:22 |
*** ysandeep|afk is now known as ysandeep | 12:35 | |
*** Tengu is now known as Guest1145 | 12:51 | |
*** Tengu_ is now known as Tengu | 12:51 | |
Clark[m] | dpawlik as far as I know everything is fine. If you are wondering why no images are listed for stream 9 on nb01 that is because nb02 has built both of the images for that distro. If you change the URL to nb02 you will see them there. | 13:05 |
dpawlik | Clark[m]: hey, thanks. That's what I was looking for | 13:31 |
*** ysandeep is now known as ysandeep|out | 13:37 | |
*** dasm|off is now known as dasm | 14:17 | |
clarkb | noonedeadpunk: /usr is still mounted read-only in the zuul test envs and the zuul ansible venvs live in /usr | 15:14 |
clarkb | anyway I don't think the openstack collection or sdk or ansible version conflicts are a thing there. Some of those things are installed together but if you need a specific version you're using nested ansible | 15:15 |
noonedeadpunk | well yes, I do agree that in the jobs we have it's unlikely that any real tests are run with native ansible | 15:24 |
noonedeadpunk | It's indeed more used to launch tox | 15:24 |
clarkb | right, I just want to be clear that I don't think any of those concerns apply | 15:24 |
clarkb | this is purely for executing the job and it isn't a broad enough use case to worry about those extra details | 15:25 |
clarkb | I'm trying to explain this to tripleo right now in the change I pushed to them so that it can land before zuul is upgraded | 15:25 |
noonedeadpunk | Well, I personally find native ansible quite good for performing some simple jobs/checks without asking for a node and wasting extra resources. But I do agree that it's indeed likely not applicable in the current use case | 15:27 |
noonedeadpunk | Though I don't have a good overview of the whole zoo of jobs we might have | 15:28 |
clarkb | right, I'm asking that we stop complicating the situation so that we can address the immediate problem | 15:28 |
clarkb | if you want to optimize jobs that is an entirely separate discussion | 15:28 |
noonedeadpunk | +1 | 15:28 |
clarkb | (and one worth having, I've been optimizing some job tasks recently and have managed to save minutes in each devstack job and more in every multinode job) | 15:29 |
*** marios is now known as marios|out | 15:29 | |
corvus | a jaeger has appeared overnight: https://tracing.opendev.org/ | 15:42 |
clarkb | corvus: we'll need to configure zuul to point at that server once tracing support in zuul lands right? | 15:44 |
corvus | clarkb: yep... we might be able to configure zuul for that before tracing lands actually... | 15:45 |
clarkb | noonedeadpunk: side note, part of the optimization I've done is to stop doing work in ansible native tasks as they are horribly slow, particularly in large loops | 15:45 |
corvus | (since zuul ignores unknown config sections) | 15:45 |
noonedeadpunk | clarkb: well, I think the biggest loop is preparing related projects? At least that's true for OSA | 15:47 |
clarkb | noonedeadpunk: one of the biggest issues was normalizing devstack log files. I fixed that via an ansible module. Other cases are multinode ssh host key and /etc/hosts setup. | 15:48 |
clarkb | I'm sure there are more we can find if we go looking. | 15:48 |
fungi | preparing related projects could certainly fall into that category if a job has tons of them (which i expect is often true for deployment projects like osa) | 15:51 |
noonedeadpunk | clarkb: for us it takes like 12mins per job https://zuul.opendev.org/t/openstack/build/602ef6a3218f474b846f9a57d97efa8d/log/job-output.txt#207-5482 | 15:51 |
fungi | i guess the question is whether that's ansible overhead or git overhead. we can probably do more about the former than the latter | 15:51 |
noonedeadpunk | We have an ansible module for parallel git clone which is quite fast, but will need to be adapted for this specific task | 15:52 |
clarkb | if each task is taking between 1-3 seconds that seems to be the minimum for ansible tasks | 15:52 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-plugins/src/branch/master/plugins/modules/git_requirements | 15:52 |
clarkb | if they are taking longer than that then we aren't hitting the ansible lower bound on task runtime | 15:52 |
clarkb | looks like the initial find loop is probably hitting the limit, and then some of the repo setup tasks are too (that probably depends on the size of the repo) | 15:54 |
noonedeadpunk | well, using a module instead should save quite some time? | 15:54 |
noonedeadpunk | As you don't loop ansible tasks, but do this in threads in the module | 15:54 |
clarkb | yes, rewriting things into modules saves time because you can iterate your loops within a single ansible task | 15:54 |
clarkb | even if you don't parallelize things just removing the ansible task startup overhead can be a huge win | 15:55 |
clarkb | (assuming again that your tasks are limited by that and not something else, these tasks seem to be a mix) | 15:55 |
noonedeadpunk | Just on git clone from remote, using this module vs ansible's native git gives like a 7-10x speedup | 15:55 |
clarkb | unfortunately this is a core bit of zuul-jobs that most jobs in the wild will be using so modifying it is a lot more difficult than say updating devstack log file normalization | 15:56 |
clarkb | but still doable if people are interested | 15:56 |
clarkb | (I'm impressed that our local caches seem to work well enough that we hit the task startup bounds here) | 15:58 |
fungi | it's not necessarily the modifying it that's difficult, but making sure it has solid test coverage against regressions in behavior and coordinating/announcing the replacement | 15:58 |
fungi | more that it's a sensitive piece to modify | 15:58 |
fungi | implying a lot of potential risk for much of zuul's user base | 15:59 |
fungi | so needs a lot of care | 15:59 |
clarkb | yup and we can probably do it bit by bit in that role too | 15:59 |
clarkb | there are a number of steps that run loops and we don't have to replace everything all at once to see improvement | 16:00 |
*** jpena is now known as jpena|off | 16:13 | |
clarkb | oh another side note. Ansible 5 broke pipelining when configured for specific connection types. In theory this means that when we move to ansible 6 we'll get a speedup in jobs | 16:29 |
clarkb | https://review.opendev.org/c/openstack/devstack/+/858436 is my ansible 6 devstack test change let me see if the data there shows any improvement | 16:30 |
clarkb | 4.5 minutes faster between https://zuul.opendev.org/t/openstack/build/e6b8dded3c7f4ebf98f871acba2a3bab/log/job-output.txt and https://zuul.opendev.org/t/openstack/build/5b72dcf1c6ac4064945fe43389940f0a/log/job-output.txt on the same cloud provider in the tempest-full-py3 job. Only a single comparison but at least we haven't disproven the idea yet | 16:32 |
clarkb | https://zuul.opendev.org/t/openstack/build/35e5242ca27d4593b63e9ceebd276d63/log/job-output.txt vs https://zuul.opendev.org/t/openstack/build/d28f3d215537449d913fcde67f6d133c/log/job-output.txt shows us slower under ansible 6 for that job. I think if this helps the improvement is small enough to be subject to the variance within a single cloud provider | 16:36 |
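For context, SSH pipelining is the setting being discussed here. Turning it on in an `ansible.cfg` looks roughly like the snippet below; this is a generic sketch rather than the actual Zuul executor configuration, and it assumes the remote sudoers does not enforce `requiretty`.

```ini
# ansible.cfg -- generic example, not the actual Zuul executor configuration
[ssh_connection]
# Send the module over the already-open SSH connection instead of copying it
# to the node first; saves a couple of round trips per task.
pipelining = True
```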
johnsom | Hi everyone, it appears the promote pipeline failed on a releasenotes job in the upload-afs-roots task. Is there a way to re-run this without making a silly patch? https://review.opendev.org/c/openstack/designate/+/858022 | 16:42 |
johnsom | Our Zed release notes page is 404 and causing the links to not be in the releases patch. | 16:42 |
clarkb | just as a quick sanity check, https://grafana.opendev.org/d/9871b26303/afs?orgId=1 shows we've got disk space and quota available for both the afs disk and volumes | 16:47 |
clarkb | https://zuul.opendev.org/t/openstack/build/a3ce29b5c9294c3e98f368cdd9d3e4ec/log/job-output.txt#152 is where the job failed | 16:47 |
clarkb | johnsom: we can reenqueue the buildset which will also repromote the docs. Is that an issue? | 16:48 |
johnsom | Should not be a problem | 16:48 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Re-expose our Mailman archives.yaml and robots.txt https://review.opendev.org/c/opendev/system-config/+/858913 | 16:49 |
fungi | hopefully that's correct now. yay for tests! | 16:50 |
clarkb | fungi: do we not need a directory with require all granted for those? | 16:52 |
clarkb | or maybe a file? | 16:52 |
clarkb | your tests seem to indicate this is not necessary which surprises me. Maybe alias implies that? | 16:53 |
fungi | apparently not (i tried it briefly on lists.o.o), but the tests should confirm | 16:53 |
fungi | and yes, i too was surprised | 16:54 |
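A rough sketch of the Apache 2.4 pattern under discussion, purely for illustration (the filesystem paths are made up and this is not the actual vhost from 858913): `Alias` only maps a URL onto a path, while access to that path is still decided by whichever `<Directory>`/`<Files>` sections apply, which is why an explicit `Require all granted` would normally be expected.

```apache
# Illustrative only; not the real lists.opendev.org vhost, and the paths
# below are placeholders.
Alias /archives.yaml /srv/mailman-web/archives.yaml
Alias /robots.txt /srv/mailman-web/robots.txt

# Alias maps the URL to a file; access to the file is still governed by the
# matching <Directory> section, hence the expectation of needing this block:
<Directory /srv/mailman-web>
    Require all granted
</Directory>
```

fungi's test results suggest the existing configuration already grants access to that directory, which would make the extra block redundant.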
clarkb | noonedeadpunk: looking at that workspace setup role replacing the first loop should be straightforward with a module. I thought maybe the find module could do it instead but I think it will be too greedy and won't be accurate enough. But a simple module that checks if the path exists and returns true or false for that is an easy improvement to shave off some time | 16:55 |
clarkb | then the next bit can be tackled and so on | 16:55 |
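To make the module idea concrete, here is a minimal sketch of the kind of path-check module being described: it folds an entire Ansible `loop:` of existence checks into one task so the per-task startup cost is paid once. The module name, argument names, and return shape are illustrative assumptions, not anything that exists in zuul-jobs or openstack-ansible-plugins.

```python
#!/usr/bin/python
# Illustrative sketch only: fold a whole loop of "does this cached repo exist?"
# checks into a single Ansible task so the per-task startup overhead is paid once.
# Module name, arguments, and return shape are hypothetical.

import os

from ansible.module_utils.basic import AnsibleModule


def main():
    module = AnsibleModule(
        argument_spec=dict(
            paths=dict(type='list', elements='str', required=True),
        ),
        supports_check_mode=True,
    )

    # One Python loop replaces an Ansible loop over a stat/find task per repo.
    exists = {path: os.path.isdir(path) for path in module.params['paths']}

    module.exit_json(changed=False, exists=exists)


if __name__ == '__main__':
    main()
```

A role would then call this once with the full list of cache paths rather than looping a `stat` task per repository.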
clarkb | johnsom: have any changes landed since that one? | 16:56 |
clarkb | johnsom: that change landed about a week ago and I think redoing docs promotion will overwrite anything more recent | 16:56 |
fungi | unlikely, or else the promote re-run would be unneeded | 16:56 |
clarkb | fungi: well unless the docs promote can run and releasenotes job doesn't | 16:56 |
clarkb | (I'm not sure about that) | 16:56 |
noonedeadpunk | clarkb: well, that's already somewhere in my to-do list which goes quite slowly to be honest :( | 16:57 |
johnsom | clarkb Stuff has landed, but nothing with release notes or docs | 16:57 |
clarkb | ok I'll see about reenquing that now | 16:57 |
clarkb | noonedeadpunk: ya I might take a look at that later today | 16:57 |
clarkb | one idea I had for ansible loops was a meta module that would call another module but keep the looping internal. But it seems you can't really do that without creating new tasks and that will negate any benefit | 16:58 |
clarkb | fungi: does `zuul-client enqueue --tenant openstack --pipeline promote --project opendev.org/openstack/designate --change 858022,1` look correct to you? | 17:02 |
clarkb | noonedeadpunk: actually I think a better approach is to do everything in a single shell script as the next incremental step. We're trying to encode logic in ansible to avoid ansible errors, but we can do all that in a shell script and then loop once | 17:10 |
clarkb | Then a better approach to that would be to rewrite what that single shell script does into an ansible module that can do that looping internally to cut out the loop overhead entirely | 17:11 |
clarkb | but I think doing it step by step is good as reviewing the condensing to a single shell script should be easier and gives us a fallback point if we get the module wrong and need to revert | 17:11 |
clarkb | johnsom: I went ahead and ran that command | 17:12 |
johnsom | Thanks! | 17:13 |
clarkb | johnsom: both jobs are running now | 17:13 |
corvus | keep in mind that doing things in ansible gets us structured result data that's easier to display in the web ui -- so it's a balance! :) | 17:13 |
noonedeadpunk | But you can always create a block/rescue in case the module fails and place the old code in the rescue? | 17:13 |
noonedeadpunk | ultimately the good thing about a module is that you can leverage multiprocessing there compared to a bash script. I'm not sure how wise it is to run that for native ansible though, load-wise | 17:14 |
johnsom | clarkb Fixed. Thanks again | 17:14 |
corvus | (i know clarkb knows this, i mostly just wanted to caution anyone reading "do everything in a single shell script" that we don't really mean *everything* :) | 17:14 |
corvus | (more like everything in a certain logical task) | 17:15 |
clarkb | right in this case "check if git repo cache dir exists, if so clone it to workspace else git init, then remove origin remote" is a logical task of preparing the workspace that was ~4 loops. It can be one | 17:17 |
clarkb | then a further improvement would be to move all of that into a module to remove the ansible task loop entirely and put that into the module | 17:17 |
clarkb | but I think iterative improvements here are a good thing | 17:17 |
corvus | ++ | 17:17 |
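As a rough illustration of where the step-by-step path could eventually end up (and explicitly not the content of 858961, which takes the single-shell-script route first), the logical task clarkb describes could later be folded into a small module along these lines; the module name, argument layout, and paths are hypothetical.

```python
#!/usr/bin/python
# Hypothetical follow-on to the shell-script step: do the whole
# "clone from cache if present, else git init, then drop the origin remote"
# sequence for every repo inside one module call. Not the content of 858961.

import os
import subprocess

from ansible.module_utils.basic import AnsibleModule


def prepare_repo(cache_dir, dest):
    """Seed one workspace repo from the local cache, or start it empty."""
    if os.path.isdir(cache_dir):
        subprocess.run(['git', 'clone', cache_dir, dest], check=True)
        # The cache is only a seed; the real refs are pushed in later, so the
        # origin remote pointing at the cache is dropped.
        subprocess.run(['git', '-C', dest, 'remote', 'rm', 'origin'], check=True)
    else:
        os.makedirs(dest, exist_ok=True)
        subprocess.run(['git', 'init', dest], check=True)


def main():
    module = AnsibleModule(
        argument_spec=dict(
            # Mapping of cached repo path -> workspace destination (hypothetical shape).
            repos=dict(type='dict', required=True),
        ),
    )
    for cache_dir, dest in module.params['repos'].items():
        prepare_repo(cache_dir, dest)
    module.exit_json(changed=True)


if __name__ == '__main__':
    main()
```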
fungi | clarkb: yeah, that invocation looks right to me | 17:17 |
fungi | sorry, was distracted briefly by my weapons of grass destruction | 17:18 |
fungi | and now i'm headed back into the fray | 17:20 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961 | 17:25 |
clarkb | corvus: noonedeadpunk ^ something like that. I think that is a good simple step | 17:25 |
noonedeadpunk | oh, for some reason I thought the ansible git module was being used there, hehe | 17:27 |
noonedeadpunk | yeah, it makes way more sense this way | 17:27 |
noonedeadpunk | yeah, it's really way better this way | 17:32 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961 | 17:37 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961 | 17:48 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858963 | 17:48 |
opendevreview | Clark Boylan proposed opendev/base-jobs master: Use test-prepare-workspace-git in base-test https://review.opendev.org/c/opendev/base-jobs/+/858964 | 17:49 |
fungi | https://zuul.opendev.org/t/openstack/build/7fdb7f42f96f4ec6b91737d01c721524/log/lists.openstack.org/apache2/lists.opendev.org-ssl-access.log#2 | 17:50 |
fungi | i think the job actually needs to fire the script the cronjob would normally run, or at least stick a replacement file there | 17:50 |
clarkb | you can trigger the script with a creates directive on the file paths it creates | 17:51 |
clarkb | then it would fire in CI but not run in prod | 17:51 |
fungi | probably easiest if we just create a temporary file there when deploying the server, and check with testinfra that it gets served? | 17:52 |
clarkb | ya that should work too | 17:52 |
fungi | that way we don't have to block for the script to complete (though i suppose ansible would do that too) | 17:52 |
clarkb | you'll want to make that test specific then (the idea I had was so that we didn't need a separate playbook, but we may already have one) | 17:52 |
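The `creates` idea clarkb floats above would look something like the task below; the script path and output path are placeholders rather than the real mailman cron pieces. In CI the target file is absent so the command runs, while in production it already exists and Ansible skips the task. (fungi's eventual patch goes the simpler route of dropping a placeholder file in place at deploy time instead.)

```yaml
# Hypothetical sketch of the "creates" approach; both paths are placeholders.
- name: Generate archives.yaml if it does not already exist
  command: /usr/local/bin/mailman-archives-yaml   # stand-in for the real cron script
  args:
    creates: /var/lib/mailman/web-data/archives.yaml
```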
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858963 | 17:56 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961 | 17:56 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Re-expose our Mailman archives.yaml and robots.txt https://review.opendev.org/c/opendev/system-config/+/858913 | 18:02 |
fungi | clarkb: how about that ^ approach? | 18:02 |
fungi | that also ensures the list is created on new servers in advance of the daily cron firing | 18:02 |
fungi | oh, i guess i need become: true | 18:03 |
fungi | or do i? we run those playbooks as root already, right? | 18:04 |
clarkb | yes they are already root | 18:04 |
clarkb | I think that looks correct | 18:04 |
fungi | cool, then i guess we'll see what folks (including zuul) think about that | 18:04 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858963 | 18:10 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961 | 18:10 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Re-expose our Mailman archives.yaml and robots.txt https://review.opendev.org/c/opendev/system-config/+/858913 | 20:10 |
* fungi sighs at his inability to indent correctly | 20:10 | |
opendevreview | Merged zuul/zuul-jobs master: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858963 | 21:00 |
clarkb | fungi: do you have time to double check I've modified the test playbook in https://review.opendev.org/c/opendev/base-jobs/+/858964 and won't be affecting production? | 21:12 |
*** rlandy|rover is now known as rlandy|rover|biab | 21:33 | |
opendevreview | James E. Blair proposed opendev/system-config master: Make zk-ca role more generic https://review.opendev.org/c/opendev/system-config/+/858988 | 21:37 |
opendevreview | James E. Blair proposed opendev/system-config master: Export Zuul traces to Jaeger https://review.opendev.org/c/opendev/system-config/+/858989 | 21:48 |
corvus | clarkb: fungi ^ as i was writing the second change, I realized the flaw in using zk-ca for jaeger, so the first change above makes a new ca, and the second change sets up tracing for zuul. | 21:49 |
clarkb | corvus: yup dropped some comments. Seems like a good improvement | 21:55 |
fungi | clarkb: sorry, got sidetracked by dinner, but looking now | 21:56 |
clarkb | I find dinner distracting too :) | 21:56 |
fungi | it was bibimbap, which is a lot of work, so extra distracting | 21:57 |
clarkb | once the base-test job updates I'm going to see if I can reparent an OSA job to it since they were the original example that kicked this off | 21:59 |
corvus | clarkb: yeah, i left the filename alone because it's the "zk-ca.sh" script from zuul... i guess we could change it in that copy? would you prefer that? | 22:03 |
clarkb | corvus: I'm on the fence. I do think it may create confusion later as we use it for other things | 22:04 |
clarkb | corvus: maybe leave a note that it originated from there and update the filename? | 22:04 |
corvus | ok | 22:04 |
opendevreview | Merged opendev/base-jobs master: Use test-prepare-workspace-git in base-test https://review.opendev.org/c/opendev/base-jobs/+/858964 | 22:06 |
opendevreview | James E. Blair proposed opendev/system-config master: Make zk-ca role more generic https://review.opendev.org/c/opendev/system-config/+/858988 | 22:06 |
opendevreview | James E. Blair proposed opendev/system-config master: Export Zuul traces to Jaeger https://review.opendev.org/c/opendev/system-config/+/858989 | 22:06 |
clarkb | remote: https://review.opendev.org/c/openstack/openstack-ansible/+/858992 is testing the prepare-workspace-git changes | 22:10 |
clarkb | it took just under 3 minutes to run through that loop on the openstack-ansible-deploy-aio_metal-debian-bullseye build for ^ and it took 6.5-ish minutes in yoctozepto's example | 22:20 |
clarkb | assuming it actually works that is a decent time saving | 22:20 |
clarkb | ya I think it got it down to 8 minutes or so. A definite improvement, but could still be a lot better | 22:26 |
*** dasm is now known as dasm|off | 22:33 | |
clarkb | yoctozepto is probably sleeping, but when your Friday starts can you check the error at https://zuul.opendev.org/t/openstack/build/c27b758d63224e1bbe5a8c43bce19802/log/job-output.txt#10583 I want to make sure that isn't fallout from my change to prepare-workspace-git. It doesn't look like openstacksdk is one of the repos managed by the zuul job and we don't set it up. So that must be | 22:58 |
clarkb | coming from somewhere else | 22:58 |
clarkb | Looks like https://review.opendev.org/c/openstack/openstack-ansible/+/858981 may be suffering a similar fate. | 23:00 |
clarkb | I'm going to run devstack under base-test too to up my coverage | 23:00 |
clarkb | oh hrm devstack parents to multinode so that is tricky | 23:01 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: DNM reparenting multinode to base-test to test devstack https://review.opendev.org/c/zuul/zuul-jobs/+/858995 | 23:03 |
clarkb | https://review.opendev.org/c/openstack/devstack/+/858996 depends on ^ which should give us that coverage | 23:05 |
corvus | fungi: if you have a minute for https://review.opendev.org/858988 and https://review.opendev.org/858989 that should let us start using the new tracing server | 23:38 |
fungi | i may even have two minutes | 23:42 |
clarkb | ok the devstack jobs are passing. I'm pretty sure the osa issue is independent of my change. The last thing to check is probably that we still set the zuul git state properly, but that is actually done by the other role. I think this is probably safe to merge into production if other people want to review it (and probably better to land tomorrow and not today as I'm about to have | 23:44 |
clarkb | dinner) | 23:44 |
fungi | corvus: those lgtm. trade you for 858913? | 23:46 |
* fungi brokers deals for code review | 23:46 | |
clarkb | also it doesn't improve the speed of devstack tremendously because the repos devstack deals with are much larger and the cloning does dominate | 23:48 |
clarkb | but in osa it is a definite improvement | 23:48 |
corvus | fungi: lgtm thanks! | 23:49 |
fungi | thanks! | 23:51 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: WIP: Fix CORS and endpoint in AWS log upload https://review.opendev.org/c/zuul/zuul-jobs/+/858726 | 23:57 |