opendevreview | Clark Boylan proposed opendev/system-config master: Upgrade to gerrit 3.3.8 https://review.opendev.org/c/opendev/system-config/+/819733 | 00:19 |
---|---|---|
clarkb | New gerrit releases have arrived. Those don't seem urgent but keeping up is good | 00:19 |
*** rlandy_ is now known as rlandy|ruck | 01:43 | |
*** rlandy|ruck is now known as rlandy|out | 01:48 | |
*** odyssey4me is now known as Guest7192 | 03:53 | |
*** pojadhav|afk is now known as pojadhav | 05:17 | |
*** ysandeep|out is now known as ysandeep|ruck | 05:24 | |
opendevreview | Ade Lee proposed zuul/zuul-jobs master: DNM enable_fips role for zuul jobs https://review.opendev.org/c/zuul/zuul-jobs/+/807031 | 06:19 |
opendevreview | Ade Lee proposed zuul/zuul-jobs master: DNM enable_fips role for zuul jobs https://review.opendev.org/c/zuul/zuul-jobs/+/807031 | 07:04 |
*** ysandeep|ruck is now known as ysandeep|lunch | 07:31 | |
*** pojadhav is now known as pojadhav|lunch | 07:42 | |
*** ysandeep|lunch is now known as ysandeep | 08:22 | |
*** bhagyashris_ is now known as bhagyashris | 08:42 | |
*** pojadhav|lunch is now known as pojadhav | 09:10 | |
*** jpena|off is now known as jpena | 09:27 | |
opendevreview | Merged openstack/project-config master: Add ansible-collection-kolla deliverable to kolla project https://review.opendev.org/c/openstack/project-config/+/819326 | 09:59 |
*** mazzy509881292 is now known as mazzy50988129 | 10:04 | |
*** rlandy is now known as rlandy|ruck | 10:44 | |
*** ysandeep is now known as ysandeep|afk | 10:51 | |
*** ysandeep|afk is now known as ysandeep|ruck | 11:06 | |
*** ysandeep|ruck is now known as ysandeep|afk | 13:09 | |
opendevreview | Aurelien Lourot proposed openstack/project-config master: Add NVidia vGPU plugin charm to OpenStack charms https://review.opendev.org/c/openstack/project-config/+/819818 | 13:12 |
opendevreview | Aurelien Lourot proposed openstack/project-config master: Add NVidia vGPU plugin charm to OpenStack charms https://review.opendev.org/c/openstack/project-config/+/819818 | 13:18 |
*** sshnaidm|afk is now known as sshnaidm | 13:29 | |
fungi | https://twitter.com/sbarnea/status/1465357109269348364 https://discuss.python.org/t/should-we-exclude-ci-cd-files-from-sdist-archives-or-not/12253/1 might be relevant to pbr developers | 13:32 |
fungi | i've replied on the latter but i don't do twitter | 13:32 |
*** ysandeep|afk is now known as ysandeep | 13:34 | |
*** ysandeep is now known as ysandeep|ruck | 13:34 | |
*** pojadhav is now known as pojadhav|brb | 14:37 | |
opendevreview | Ade Lee proposed zuul/zuul-jobs master: DNM enable_fips role for zuul jobs https://review.opendev.org/c/zuul/zuul-jobs/+/807031 | 14:49 |
*** pojadhav|brb is now known as pojadhav | 15:14 | |
*** ykarel is now known as ykarel|away | 15:28 | |
*** ysandeep|ruck is now known as ysandeep|out | 15:42 | |
clarkb | you can exclude things with the MANIFEST.in iirc. But also distributions like debian want to run tests against their packages. It's possible that overly aggressive removal of those files will break that use case | 15:45 |
fungi | yeah, though the question was scoped to ci/cd configuration rather than test definitions... but the distinction between the two could also be fuzzy | 15:51 |
fungi | to make an example of a pbr-using package for a project in opendev, debian's autopackagetests are more likely to be directly invoking (s)testr or pytest, not even tox, and definitely not reusing the zuul job definitions | 15:54 |
fungi | in their case they don't want to test in fabricated virtual environments, but rather in a representative chroot of a real installation of the distro packages | 15:54 |
fungi | for zuul, i could see the playbooks/roles being useful as part of the sdist, maybe not the zuul configs though unless they're not dotfiles? but my reply was that excluding those files doesn't really save a measurable amount of space in most cases | 15:56 |
clarkb | ya I guess if the intent is only the CI framework cleanup then maybe you can get away with that, but probably better for package maintainers to explicitly remove them rather than do it automatically | 16:02 |
clarkb | and as you say they tend to be small so if they get included it isn't a big deal | 16:02 |
clarkb | as noted yesterday Gerrit has new releases. https://review.opendev.org/c/opendev/system-config/+/819733 Updates our images to them. I don't think this is super urgent. There is also a note on the gerrit mailing list that at least one installation's plugins broke due to some internal API changes made by these releases. In our case we build gerrit itself from tip of $branch so we'd | 16:14 |
clarkb | need to pin to the prior release if we want to avoid that | 16:14 |
clarkb | all that to say I expect this is fine if it passes our CI checks | 16:14 |
*** jpena is now known as jpena|off | 17:35 | |
corvus | infra-root: https://review.opendev.org/802973 is a good zuul housekeeping change to merge soon | 17:51 |
clarkb | done | 17:52 |
fungi | double-done | 17:52 |
opendevreview | Merged openstack/project-config master: Remove report-build-page from zuul tenant config https://review.opendev.org/c/openstack/project-config/+/802973 | 18:04 |
rlandy|ruck | fungi: hello ... I think the tripleo gate may be stuck ... the top patch 818820,2 has one test running and hanging on a task that should take less than a minute: https://zuul.opendev.org/t/openstack/stream/a822dac3693c4f739d2a7278a33da2f6?logfile=console.log | 18:29 |
rlandy|ruck | I can abort and restore if that is the right action | 18:30 |
rlandy|ruck | oh - nvm ... it just timed out | 18:30 |
fungi | no worries, let me know if you need anything else | 18:31 |
rlandy|ruck | thanks | 18:31 |
clarkb | rlandy|ruck: doesn't look like the job logged much. I wonder if that is due to whatever caused the task to stall out | 18:43 |
clarkb | maybe filesystem problems? the task prior to the one that timed out was doing ceph stuff | 18:43 |
clarkb | filled disk maybe (that can make ansible unable to function) | 18:43 |
fungi | we saw a similar situation with enospc causing swift tests to hang in the fips jobs | 18:45 |
fungi | so the result was a playbook timeout | 18:46 |
clarkb | also it appears that your jobs are logging zuul vars which we already log for all jobs | 18:46 |
clarkb | you shouldn't need to log that separately | 18:46 |
clarkb | rlandy|ruck: is https://opendev.org/openstack/tripleo-ci/src/branch/master/roles/ceph-loop-device/tasks/main.yml#L20-L29 creating two 14GB devices? | 18:52 |
clarkb | certainly I could see that causing problems depending on where that was written and how much disk space is being used | 18:52 |
fungi | we also log things like the available disk space at the start of the build | 18:53 |
clarkb | I think our cleanup-run playbook isn't working anymore though unfortunately | 18:53 |
clarkb | otherwise we could check this maybe | 18:53 |
clarkb | I'll see if that has any info | 18:53 |
fungi | but you can at least tell if the failed build had a fairly small rootfs initially | 18:53 |
rlandy|ruck | https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-8-scenario012-standalone has a bit of a spotty history | 18:54 |
rlandy|ruck | going to rerun and see if that reproduces | 18:54 |
clarkb | ya the get df usage playbook failed with Ansible complete, result RESULT_UNREACHABLE code None which can be caused by a full disk because ansible can't write to the remote | 18:56 |
clarkb | I strongly suspect that is what is causing this and you either need to reduce the number of loop devices or reduce their size or both | 18:56 |
clarkb | or possibly move them to the larger /opt device if they are not there already | 18:57 |
fungi | ansible will sometimes report a host as "unreachable" when it fails to write to its filesystem though | 18:57 |
clarkb | https://zuul.opendev.org/t/openstack/build/a822dac3693c4f739d2a7278a33da2f6/log/job-output.txt#1693-1694 confirms it is making two 14GB devices | 18:57 |
clarkb | fungi: yup exactly because writing to /tmp is part of its connection process | 18:57 |
fungi | i.e. it can ssh into the host successfully but fails to write its script for remote execution | 18:57 |
rlandy|ruck | ok- we have had some jobs return with DISK_FULL | 19:00 |
rlandy|ruck | and the like | 19:00 |
rlandy|ruck | will bring that up | 19:00 |
clarkb | ya I think there is some luck as far as how ansible can detect it | 19:00 |
clarkb | infra-root https://hub.docker.com/r/etherpad/etherpad/tags they pushed 1.18.16 images for etherpad to dockerhub. Should I half revert my 1.18.16 update change to go back to consuming the upstream image? | 20:05 |
clarkb | or do we prefer building our own as updated in the change? | 20:05 |
fungi | i have no preference | 20:13 |
clarkb | ianw: I guess https://review.opendev.org/c/opendev/system-config/+/677903 needs a rebase? | 20:30 |
clarkb | ianw: would you prefer I review it pre rebase first or after? | 20:30 |
clarkb | fungi: I've approved the ansible galaxy proxy update change | 20:34 |
clarkb | (just fyi) | 20:34 |
opendevreview | Merged opendev/storyboard-webclient master: Bindep cleanup and JavaScript updates https://review.opendev.org/c/opendev/storyboard-webclient/+/814053 | 20:37 |
fungi | thanks clarkb! | 20:43 |
ianw | clarkb: sorry, breakfasting; i imagine it does need a merge, let me try | 21:01 |
clarkb | yup no rush, I just noticed that as I started pulling stuff up after eating lunch. Wasn't sure if you wanted us to review pre rebase or not | 21:03 |
opendevreview | Merged opendev/system-config master: Cache Ansible Galaxy on CI mirror servers https://review.opendev.org/c/opendev/system-config/+/818787 | 21:03 |
opendevreview | James E. Blair proposed opendev/system-config master: Add a keycloak server https://review.opendev.org/c/opendev/system-config/+/819923 | 21:09 |
opendevreview | Ian Wienand proposed opendev/system-config master: Make haproxy role more generic https://review.opendev.org/c/opendev/system-config/+/677903 | 21:09 |
opendevreview | Merged opendev/storyboard-webclient master: Update default contact in error message template https://review.opendev.org/c/opendev/storyboard-webclient/+/814041 | 21:18 |
clarkb | ianw: +2'd https://review.opendev.org/c/opendev/system-config/+/677903 but left some comments on being a bit more explicit about file ownership and modes | 21:26 |
clarkb | oh wait that just made me think of another thing | 21:26 |
clarkb | we are probably about 15 minutes away from the mirror update. It is reasonably well tested but please keep an eye out for sad mirrors in case we have to manually update mirrors and revert | 21:28 |
ianw | clarkb: agree on fixing those things up now. will just do school run then push a new change | 21:30 |
clarkb | cool, I have to do a school run in about 45 minutes myself. | 21:30 |
clarkb | fungi: the mirrors should be updating right about now | 21:41 |
fungi | thanks for the heads up, dinner's done so i'll check some apache configs on a few of them shortly | 21:41 |
clarkb | infra-root https://review.opendev.org/c/opendev/system-config/+/818645 is the switch of matrix-gerritbot back to the dedicated user with the updated image | 21:44 |
clarkb | I'm not in a great spot to watch that land right now, but if someone else can review that I'll try to approve it probably tomorrow sometime | 21:45 |
opendevreview | James E. Blair proposed opendev/system-config master: Add a keycloak server https://review.opendev.org/c/opendev/system-config/+/819923 | 21:45 |
fungi | storyboard-webclient-promote-opendev-image is failing on the "promote-docker-image: Get dockerhub JWT token" task, but we hide all the output from that for obvious reasons... any guesses what the problem is? | 21:46 |
corvus | +2 | 21:46 |
clarkb | fungi: zuul had a docker request fail due to some expiration of a token I think. It's possible they are experiencing an outage maybe? | 21:47 |
corvus | fungi: if it's once, could just be dockerhub being flaky; if it's 2x some time apart maybe a real problem? does it have a history of working? | 21:47 |
fungi | oh, good point. i should be able to reenqueue that ref later | 21:47 |
fungi | it was 2x, from different changes which merged in a fairly short window however | 21:48 |
fungi | mmm, no, build history says it's consistently failed as far back as 2020-05-01 | 21:48 |
clarkb | mirrors appear to still function but the /galaxy/ path isn't working for me so may need more work to do galaxy stuff | 21:48 |
fungi | interesting, i added an explicit test for /galaxy/ returning known content | 21:49 |
clarkb | ya I wonder if maybe I'm hitting older not phased out mirror processes | 21:49 |
clarkb | and need them to recycle out for the graceful reload? | 21:49 |
corvus | fungi: could be an incorrectly encoded password (maybe has a newline). i'd start by re-doing the zuul encryption (assuming that cred isn't already working elsewhere) | 21:49 |
fungi | clarkb: possible | 21:50 |
fungi | corvus: yep, thanks i'll see if i can vet and reencode the secret for it | 21:50 |
corvus | every time i've hit that error legitimately, it's been the newline password thing | 21:50 |
fungi | makes sense | 21:50 |
fungi | it's not super critical since we don't (yet) deploy from the container images | 21:51 |
fungi | but would be good to make sure they're working correctly | 21:51 |
fungi | clarkb: yeah, it's stale apache processes. try https://mirror.ca-ymq-1.vexxhost.opendev.org/galaxy/ and you should get (albeit nonfunctional) content | 21:56 |
fungi | looks like the site itself is build from javascript which is included with absolute links the proxy isn't able to rewrite | 21:57 |
fungi | s/build/built/ | 21:57 |
clarkb | I wonder if the galaxy tooling has a raw index like pypi does for pip | 22:06 |
fungi | a sample example was provided last week in #openstack-infra, i'll find it | 22:11 |
fungi | or maybe it was in here | 22:13 |
fungi | not of the index, but https://galaxy.ansible.com/download/community-general-4.0.2.tar.gz | 22:14 |
fungi | https://mirror.ca-ymq-1.vexxhost.opendev.org/galaxy/download/community-general-4.0.2.tar.gz should in theory redirect to the other proxy path | 22:15 |
fungi | and it seems to do so | 22:15 |
fungi | yeah, i get a valid tarball that way | 22:16 |
fungi | sshnaidm: ^ please test more thoroughly when you get a chance | 22:16 |
fungi | the original discussion started in this channel at 2021-11-22 14:46 utc | 22:18 |
fungi | for reference | 22:18 |
opendevreview | James E. Blair proposed opendev/system-config master: Add a keycloak server https://review.opendev.org/c/opendev/system-config/+/819923 | 22:23 |
fungi | starring that ^ to review when my energy isn't already waning, but very exciting! | 22:34 |
ianw | clarkb: i agree looking back on it now it's amazing it finds the right template | 22:45 |
ianw | i think it's intentional https://docs.ansible.com/ansible/latest/user_guide/playbook_pathing.html#the-magic-of-local-paths | 22:48 |
ianw | i'll rename the file to have a gitea- prefix, that should help prevent any confusion over generically named files getting in the mix | 22:51 |
opendevreview | Ian Wienand proposed opendev/system-config master: Make haproxy role more generic https://review.opendev.org/c/opendev/system-config/+/677903 | 22:58 |
opendevreview | Ian Wienand proposed opendev/system-config master: haproxy: map in config as ro https://review.opendev.org/c/opendev/system-config/+/819927 | 22:58 |
clarkb | ianw: ya I suspect that was something they made work since puppet struggled so much with it | 23:06 |
opendevreview | James E. Blair proposed opendev/system-config master: Add a keycloak server https://review.opendev.org/c/opendev/system-config/+/819923 | 23:32 |
clarkb | is anyone able to review https://review.opendev.org/c/opendev/system-config/+/819733 for the gerrit image updates? would be nice to have images built and ready for when zuul restart happens | 23:43 |
corvus | clarkb: 2 comments on that | 23:46 |
opendevreview | Merged opendev/system-config master: infra-prod: clone source once https://review.opendev.org/c/opendev/system-config/+/807808 | 23:47 |
clarkb | corvus: responded. Let me know what you think | 23:49 |
corvus | ianw: +2 on 677903 -- do you want to +w that? i'm around for a bit longer if something goes wrong, but it's probably mostly you :) | 23:49 |
ianw | corvus: thanks; yeah i can watch it | 23:50 |
corvus | clarkb: thanks wfm | 23:50 |
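
The disk-space discussion around 18:52 to 18:57 (two 14GB loop devices possibly filling the filesystem and leaving ansible unable to write to the remote) can be sketched as a rough pre-flight check. This is only an illustration, not anything taken from tripleo-ci: the function names, the 2 GiB reserve, and reading "14GB" as GiB are all assumptions. It also assumes the worst case, where the backing files end up fully populated; sparse files would use less space at first.

```python
import shutil

GIB = 1024 ** 3  # assumption: the "14GB" in the log is treated as 14 GiB


def loop_devices_fit(free_bytes, device_sizes_bytes, reserve_bytes=2 * GIB):
    """Return True if fully populated loop devices still leave reserve_bytes free.

    free_bytes: space currently free on the target filesystem.
    device_sizes_bytes: sizes of the loop-device backing files to create.
    reserve_bytes: hypothetical headroom kept so tools like ansible can
    still write their temporary files.
    """
    return free_bytes - sum(device_sizes_bytes) >= reserve_bytes


def check_path(path, device_sizes_bytes, reserve_bytes=2 * GIB):
    """Apply the same check against the real free space at `path`."""
    free = shutil.disk_usage(path).free
    return loop_devices_fit(free, device_sizes_bytes, reserve_bytes)


if __name__ == "__main__":
    # Two 14 GiB devices, as in the job output linked above.
    planned = [14 * GIB, 14 * GIB]
    for candidate in ("/", "/opt"):
        try:
            print(candidate, check_path(candidate, planned))
        except FileNotFoundError:
            print(candidate, "not present on this host")
```

Running it against both `/` and `/opt` mirrors the suggestion in the log to move the devices to the larger `/opt` device if the root filesystem is too small.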