Tuesday, 2021-11-30

opendevreviewClark Boylan proposed opendev/system-config master: Upgrade to gerrit 3.3.8  https://review.opendev.org/c/opendev/system-config/+/81973300:19
clarkbNew gerrit releases have arrived. Those don't seem urgent but keeping up is good00:19
*** rlandy_ is now known as rlandy|ruck01:43
*** rlandy|ruck is now known as rlandy|out01:48
*** odyssey4me is now known as Guest719203:53
*** pojadhav|afk is now known as pojadhav05:17
*** ysandeep|out is now known as ysandeep|ruck05:24
opendevreviewAde Lee proposed zuul/zuul-jobs master: DNM enable_fips role for zuul jobs  https://review.opendev.org/c/zuul/zuul-jobs/+/80703106:19
opendevreviewAde Lee proposed zuul/zuul-jobs master: DNM enable_fips role for zuul jobs  https://review.opendev.org/c/zuul/zuul-jobs/+/80703107:04
*** ysandeep|ruck is now known as ysandeep|lunch07:31
*** pojadhav is now known as pojadhav|lunch07:42
*** ysandeep|lunch is now known as ysandeep08:22
*** bhagyashris_ is now known as bhagyashris08:42
*** pojadhav|lunch is now known as pojadhav09:10
*** jpena|off is now known as jpena09:27
opendevreviewMerged openstack/project-config master: Add ansible-collection-kolla deliverable to kolla project  https://review.opendev.org/c/openstack/project-config/+/81932609:59
*** mazzy509881292 is now known as mazzy5098812910:04
*** rlandy is now known as rlandy|ruck10:44
*** ysandeep is now known as ysandeep|afk10:51
*** ysandeep|afk is now known as ysandeep|ruck11:06
*** ysandeep|ruck is now known as ysandeep|afk13:09
opendevreviewAurelien Lourot proposed openstack/project-config master: Add NVidia vGPU plugin charm to OpenStack charms  https://review.opendev.org/c/openstack/project-config/+/81981813:12
opendevreviewAurelien Lourot proposed openstack/project-config master: Add NVidia vGPU plugin charm to OpenStack charms  https://review.opendev.org/c/openstack/project-config/+/81981813:18
*** sshnaidm|afk is now known as sshnaidm13:29
fungihttps://twitter.com/sbarnea/status/1465357109269348364 https://discuss.python.org/t/should-we-exclude-ci-cd-files-from-sdist-archives-or-not/12253/1 might be relevant to pbr developers13:32
fungii've replied on the latter but i don't do twitter13:32
*** ysandeep|afk is now known as ysandeep13:34
*** ysandeep is now known as ysandeep|ruck13:34
*** pojadhav is now known as pojadhav|brb14:37
opendevreviewAde Lee proposed zuul/zuul-jobs master: DNM enable_fips role for zuul jobs  https://review.opendev.org/c/zuul/zuul-jobs/+/80703114:49
*** pojadhav|brb is now known as pojadhav15:14
*** ykarel is now known as ykarel|away15:28
*** ysandeep|ruck is now known as ysandeep|out15:42
clarkbyou can exclude things with the MANIFEST.in iirc. But also distributions like debian want to run tests against their packages. It's possible that overly aggressive removal of those files will break that use case15:45
fungiyeah, though the question was scoped to ci/cd configuration rather than test definitions... but the distinction between the two could also be fuzzy15:51
fungito make an example of a pbr-using package for a project in opendev, debian's autopackagetests are more likely to be directly invoking (s)testr or pytest, not even tox, and definitely not reusing the zuul job definitions15:54
fungiin their case they don't want to test in fabricated virtual environments, but rather in a representative chroot of a real installation of the distro packages15:54
fungifor zuul, i could see the playbooks/roles being useful as part of the sdist, maybe not the zuul configs though unless they're not dotfiles? but my reply was that excluding those files doesn't really save a measurable amount of space in most cases15:56
clarkbya I guess if the intent is only the CI framework cleanup then maybe you can get away with that, but probably better for package maintainers to explicitly remove them rather than do it automatically16:02
clarkband as you say they tend to be small so if they get included it isn't a big deal16:02
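Editor's sketch, not part of the conversation: the size argument above can be checked by listing which CI/CD configuration files an sdist actually ships and how many bytes each takes. The pattern list and the function name `ci_files_in_sdist` are assumptions made for illustration, and the list is not exhaustive.

```python
import fnmatch
import tarfile

# Hypothetical patterns for CI/CD config files; extend per project.
CI_PATTERNS = [
    ".zuul.yaml",
    ".zuul.d/*",
    "zuul.d/*",
    ".github/*",
    ".travis.yml",
    ".gitlab-ci.yml",
]


def ci_files_in_sdist(sdist_path, patterns=CI_PATTERNS):
    """Return (relative_name, size_in_bytes) for members matching a CI pattern."""
    found = []
    with tarfile.open(sdist_path) as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            # sdist members live under a "<name>-<version>/" prefix; strip it.
            _, _, relative = member.name.partition("/")
            if any(fnmatch.fnmatch(relative, p) for p in patterns):
                found.append((relative, member.size))
    return found
```

Summing the sizes returned for a real sdist gives a rough idea of how much space excluding these files would save.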
clarkbas noted yesterday Gerrit has new releases. https://review.opendev.org/c/opendev/system-config/+/819733 Updates our images to them. I don't think this is super urgent. There is also a note on the gerrit mailing list that at least one installation's plugins broke due to some internal API changes made by these releases. In our case we build gerrit itself from tip of $branch so we'd16:14
clarkbneed to pin to the prior release if we want to avoid that16:14
clarkball that to say I expect this is fine if it passes our CI checks16:14
*** jpena is now known as jpena|off17:35
corvusinfra-root: https://review.opendev.org/802973 is a good zuul housekeeping change to merge soon17:51
opendevreviewMerged openstack/project-config master: Remove report-build-page from zuul tenant config  https://review.opendev.org/c/openstack/project-config/+/80297318:04
rlandy|ruckfungi: hello ... I think the tripleo gate may be stuck ... the top patch 818820,2 has one test running and hanging on a task that should take less than a minute: https://zuul.opendev.org/t/openstack/stream/a822dac3693c4f739d2a7278a33da2f6?logfile=console.log18:29
rlandy|ruckI can abort and restore if that is the right action18:30
rlandy|ruckoh - nvm ... it just timed out18:30
fungino worries, let me know if you need anything else18:31
clarkbrlandy|ruck: doesn't look like the job logged much. I wonder if that is due to whatever caused the task to stall out18:43
clarkbmaybe filesystem problems? the task prior to the one that timed out was doing ceph stuff18:43
clarkbfilled disk maybe (that can make ansible unable to function)18:43
fungiwe saw a similar situation with enospc causing swift tests to hang in the fips jobs18:45
fungiso the result was a playbook timeout18:46
clarkbalso it appears that your jobs are logging zuul vars which we already log for all jobs18:46
clarkbyou shouldn't need to log that separately18:46
clarkbrlandy|ruck: is https://opendev.org/openstack/tripleo-ci/src/branch/master/roles/ceph-loop-device/tasks/main.yml#L20-L29 creating two 14GB devices?18:52
clarkbcertainly I could see that causing problems depending on where that was written and how much disk space is being used18:52
fungiwe also log things like the available disk space at the start of the build18:53
clarkbI think our cleanup-run playbook isn't working anymore though unfortunately18:53
clarkbotherwise we could check this maybe18:53
clarkbI'll see if that has any info18:53
fungibut you can at least tell if the failed build had a fairly small rootfs initially18:53
rlandy|ruckhttps://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-8-scenario012-standalone has a bit of a spotty history18:54
rlandy|ruckgoing to rerun and see if that reproduces 18:54
clarkbya the get df usage playbook failed with Ansible complete, result RESULT_UNREACHABLE code None which can be caused by a full disk because ansible can't write to the remote18:56
clarkbI strongly suspect that is what is causing this and you either need to reduce the number of loop devices or reduce their size or both18:56
clarkbor possibly move them to the larger /opt device if they are not there already18:57
fungiansible will sometimes report a host as "unreachable" when it fails to write to its filesystem though18:57
clarkbhttps://zuul.opendev.org/t/openstack/build/a822dac3693c4f739d2a7278a33da2f6/log/job-output.txt#1693-1694 confirms it is making two 14GB devices18:57
clarkbfungi: yup exactly because writing to /tmp is part of its connection process18:57
fungii.e. it can ssh into the host successfully but fails to write its script for remote execution18:57
rlandy|ruckok- we have had some jobs return with DISK_FULL19:00
rlandy|ruckand the like19:00
rlandy|ruckwill bring that up19:00
clarkbya I think there is some luck as far as how ansible can detect it19:00
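Editor's sketch, not part of the conversation: the suspected failure mode above (two 14GB loop devices filling the disk) could be guarded against with a free-space check before the backing files are created. The function names, the 5 GiB reserve, and treating "14GB" as GiB are assumptions made for illustration; this is not what the tripleo-ci role does.

```python
import shutil

GIB = 1024 ** 3


def loop_devices_fit(free_bytes, device_count, device_size_gib, reserve_gib=5):
    """Return True if the backing files fit while leaving reserve_gib free."""
    needed = device_count * device_size_gib * GIB
    return free_bytes - needed >= reserve_gib * GIB


def check_path(path, device_count=2, device_size_gib=14):
    """Apply the check to the filesystem that holds `path`."""
    free = shutil.disk_usage(path).free
    return loop_devices_fit(free, device_count, device_size_gib)
```

Running `check_path("/")` and `check_path("/opt")` on a test node would show which filesystem, if either, has room for both devices. Sparse backing files only consume space as they are written, so a check like this is conservative.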
clarkbinfra-root https://hub.docker.com/r/etherpad/etherpad/tags they pushed 1.18.16 images for etherpad to dockerhub. Should I half revert my 1.18.16 update change to go back to consuming the upstream image?20:05
clarkbor do we prefer building our own as updated in the change?20:05
fungii have no preference20:13
clarkbianw: I guess https://review.opendev.org/c/opendev/system-config/+/677903 needs a rebase?20:30
clarkbianw: would you prefer I review it pre rebase first or after?20:30
clarkbfungi: I've approved the ansible galaxy proxy update change20:34
clarkb(just fyi)20:34
opendevreviewMerged opendev/storyboard-webclient master: Bindep cleanup and JavaScript updates  https://review.opendev.org/c/opendev/storyboard-webclient/+/81405320:37
fungithanks clarkb!20:43
ianwclarkb: sorry, breakfasting; i imagine it does need a merge, let me try21:01
clarkbyup no rush, I just noticed that as I started pulling stuff up after eating lunch. Wasn't sure if you wanted us to review pre rebase or not21:03
opendevreviewMerged opendev/system-config master: Cache Ansible Galaxy on CI mirror servers  https://review.opendev.org/c/opendev/system-config/+/81878721:03
opendevreviewJames E. Blair proposed opendev/system-config master: Add a keycloak server  https://review.opendev.org/c/opendev/system-config/+/81992321:09
opendevreviewIan Wienand proposed opendev/system-config master: Make haproxy role more generic  https://review.opendev.org/c/opendev/system-config/+/67790321:09
opendevreviewMerged opendev/storyboard-webclient master: Update default contact in error message template  https://review.opendev.org/c/opendev/storyboard-webclient/+/81404121:18
clarkbianw: +2'd https://review.opendev.org/c/opendev/system-config/+/677903 but left some comments on being a bit more explicit about file ownership and modes21:26
clarkboh wait that just made me think of another thing21:26
clarkbwe are probably about 15 minutes away from the mirror update. It is reasonably well tested but please keep an eye out for sad mirrors in case we have to manually update mirrors and revert21:28
ianwclarkb: agree on fixing those things up now.  will just do school run then push a new change21:30
clarkbcool, I have to do a school run in about 45 minutes myself.21:30
clarkbfungi: the mirrors should be updating right about now21:41
fungithanks for the heads up, dinner's done so i'll check some apache configs on a few of them shortly21:41
clarkbinfra-root https://review.opendev.org/c/opendev/system-config/+/818645 is the switch of matrix-gerritbot back to the dedicated user with the updated image21:44
clarkbI'm not in a great spot to watch that land right now, but if someone else can review that I'll try to approve it probably tomorrow sometime21:45
opendevreviewJames E. Blair proposed opendev/system-config master: Add a keycloak server  https://review.opendev.org/c/opendev/system-config/+/81992321:45
fungistoryboard-webclient-promote-opendev-image is failing on the "promote-docker-image: Get dockerhub JWT token" task, but we hide all the output from that for obvious reasons... any guesses what the problem is?21:46
clarkbfungi: zuul had a docker request fail due to some expiration of a token I think. It's possible they are experiencing an outage maybe?21:47
corvusfungi: if it's once, could just be dockerhub being flaky; if it's 2x some time apart maybe a real problem?  does it have a history of working?21:47
fungioh, good point. i should be able to reenqueue that ref later21:47
fungiit was 2x, from different changes which merged in a fairly short window however21:48
fungimmm, no, build history says it's consistently failed as far back as 2020-05-0121:48
clarkbmirrors appear to still function but the /galaxy/ path isn't working for me so may need more work to do galaxy stuff21:48
fungiinteresting, i added an explicit test for /galaxy/ returning known content21:49
clarkbya I wonder if maybe I'm hitting older not phased out mirror processes21:49
clarkband need them to recycle out for the graceful reload?21:49
corvusfungi: could be an incorrectly encoded password (maybe has a newline). i'd start by re-doing the zuul encryption (assuming that cred isn't already working elsewhere)21:49
fungiclarkb: possible21:50
fungicorvus: yep, thanks i'll see if i can vet and reencode the secret for it21:50
corvusevery time i've hit that error legitimately, it's been the newline password thing21:50
fungimakes sense21:50
fungiit's not super critical since we don't (yet) deploy from the container images21:51
fungibut would be good to make sure they're working correctly21:51
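Editor's sketch, not part of the conversation: the "newline password" problem mentioned above can be caught by checking the secret for trailing line endings before it is encrypted. The helper name `check_secret` is hypothetical and this does not call any Zuul tooling; it only inspects the string that would be fed to it.

```python
def check_secret(raw):
    """Return (cleaned, was_changed) with trailing CR/LF characters removed."""
    cleaned = raw.rstrip("\r\n")
    return cleaned, cleaned != raw
```

A `was_changed` value of `True` for the stored credential would suggest the secret was encoded with a stray newline and should be re-encrypted from the cleaned value.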
fungiclarkb: yeah, it's stale apache processes. try https://mirror.ca-ymq-1.vexxhost.opendev.org/galaxy/ and you should get (albeit nonfunctional) content21:56
fungilooks like the site itself is built from javascript which is included with absolute links the proxy isn't able to rewrite21:57
clarkbI wonder if the galaxy tooling has a raw index like pypi does for pip22:06
fungia sample example was provided last week in #openstack-infra, i'll find it22:11
fungior maybe it was in here22:13
funginot of the index, but https://galaxy.ansible.com/download/community-general-4.0.2.tar.gz22:14
fungihttps://mirror.ca-ymq-1.vexxhost.opendev.org/galaxy/download/community-general-4.0.2.tar.gz should in theory redirect to the other proxy path22:15
fungiand it seems to do so22:15
fungiyeah, i get a valid tarball that way22:16
fungisshnaidm: ^ please test more thoroughly when you get a chance22:16
fungithe original discussion started in this channel at 2021-11-22 14:46 utc22:18
fungifor reference22:18
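Editor's sketch, not part of the conversation: the URL pair quoted above suggests a simple mapping from an upstream Galaxy URL to the same path under `/galaxy/` on a mirror. The function name `to_mirror_url` and the `prefix` parameter are assumptions made for illustration; the actual proxy behaviour is defined by the Apache configuration on the mirrors, not by this code.

```python
from urllib.parse import urlsplit, urlunsplit

UPSTREAM_HOST = "galaxy.ansible.com"


def to_mirror_url(upstream_url, mirror_host, prefix="/galaxy"):
    """Rewrite an upstream Galaxy URL to the equivalent mirror URL."""
    parts = urlsplit(upstream_url)
    if parts.netloc != UPSTREAM_HOST:
        raise ValueError(f"not a {UPSTREAM_HOST} URL: {upstream_url}")
    # Keep path, query and fragment; swap the host and add the proxy prefix.
    return urlunsplit(
        ("https", mirror_host, prefix + parts.path, parts.query, parts.fragment)
    )
```

For the tarball discussed above, passing the upstream download URL and `mirror.ca-ymq-1.vexxhost.opendev.org` yields the mirror URL that fungi tested.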
opendevreviewJames E. Blair proposed opendev/system-config master: Add a keycloak server  https://review.opendev.org/c/opendev/system-config/+/81992322:23
fungistarring that ^ to review when my energy isn't already waning, but very exciting!22:34
ianwclarkb: i agree looking back on it now it's amazing it finds the right template22:45
ianwi think it's intentional https://docs.ansible.com/ansible/latest/user_guide/playbook_pathing.html#the-magic-of-local-paths22:48
ianwi'll rename the file to have a gitea- prefix, that should help prevent any confusion over generically named files getting in the mix22:51
opendevreviewIan Wienand proposed opendev/system-config master: Make haproxy role more generic  https://review.opendev.org/c/opendev/system-config/+/67790322:58
opendevreviewIan Wienand proposed opendev/system-config master: haproxy: map in config as ro  https://review.opendev.org/c/opendev/system-config/+/81992722:58
clarkbianw: ya I suspect that was something they made work since puppet struggled so much with it23:06
opendevreviewJames E. Blair proposed opendev/system-config master: Add a keycloak server  https://review.opendev.org/c/opendev/system-config/+/81992323:32
clarkbis anyone able to review https://review.opendev.org/c/opendev/system-config/+/819733 for the gerrit image updates? would be nice to have images built and ready for when zuul restart happens23:43
corvusclarkb: 2 comments on that23:46
opendevreviewMerged opendev/system-config master: infra-prod: clone source once  https://review.opendev.org/c/opendev/system-config/+/80780823:47
clarkbcorvus: responded. Let me know what you think23:49
corvusianw: +2 on 677903 -- do you want to +w that?  i'm around for a bit longer if something goes wrong, but it's probably mostly you :)23:49
ianwcorvus: thanks; yeah i can watch it23:50
corvusclarkb: thanks wfm23:50
