*** jamesmcarthur has joined #openstack-infra | 00:03 | |
*** jamesmcarthur has quit IRC | 00:08 | |
*** mattw4 has quit IRC | 00:12 | |
*** goldyfruit_ has quit IRC | 00:20 | |
ianw | hrmm, i'd sort of glossed over the openstacksdk part of generating the nodepool images | 00:20 |
---|---|---|
ianw | we want to maintain testing against master of that too | 00:21 |
ianw | which means multiple inheritance (nodepool depends on dib & openstacksdk) | 00:21 |
ianw | that turns things into a different problem than just names | 00:21 |
ianw | fungi / clarkb: on the release, i think https://review.opendev.org/#/c/695981/ can work to do all rsync releases via ssh | 00:22 |
ianw | we could have a system-config dockerfile that builds nodepool+dib+openstacksdk all from zuul checkouts. however, i can't see how to make it so that for production it only installs from releases | 00:25 |
clarkb | we might need two different dockerfiles | 00:25 |
clarkb | I think logic like this is why projects like kolla generate dockerfiles with a templating language? | 00:26 |
clarkb | I don't know that we want to go that far though | 00:26 |
ianw | it just feels very annoying to have different dockerfiles creating testing images and production images | 00:27 |
ianw | as we know, any slight chance of something being different will be, and result in broken production | 00:27 |
clarkb | yeah | 00:29 |
clarkb | mgoddard may have input? | 00:29 |
ianw | i guess i should get something up to build the "speculative" image and we can work from there | 00:30 |
clarkb | it is possible that if we are testing the speculative images we just don't care about releases in production anymore | 00:30 |
clarkb | since the images are known to work via testing | 00:30 |
ianw | yeah, maybe i'm too old fashioned. i guess the status quo is that master nodepool changes are put on the hosts (but don't apply unless manual reload) and dib & openstacksdk are pushed by puppet when new releases come | 00:33 |
ianw | we would not want to restart all nodepool-launchers on new containers | 00:34 |
ianw | i guess that means k8s or something managing them | 00:34 |
ianw | (restart all at once, and kill the world i mean) | 00:35 |
*** jamesmcarthur has joined #openstack-infra | 00:37 | |
clarkb | ianw: currently we use docker-compose and ansible to manage that for gitea | 00:40 |
clarkb | we serialize the gitea updates so that we only ever stop one at a time allowing the other 7 to carry load. We may want to do similar with nodepool though it is less of a concern there | 00:41 |
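The serialized update clarkb describes (only ever stop one backend at a time, so the rest keep carrying load) can be sketched roughly as below. This is a minimal illustration, not the actual ansible/docker-compose playbook: `rolling_update`, `update` and `is_healthy` are hypothetical names, and the real per-host steps are not shown in the log.

```python
from typing import Callable, Iterable, List


def rolling_update(
    hosts: Iterable[str],
    update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Update hosts strictly one at a time.

    Stops at the first host that fails its health check, so at most one
    host is ever out of service because of the update.
    """
    done: List[str] = []
    for host in hosts:
        # Hypothetical per-host step, e.g. pull the new image and restart
        # the container on this host only.
        update(host)
        if not is_healthy(host):
            raise RuntimeError(
                f"{host} unhealthy after update; {len(done)} host(s) already updated"
            )
        done.append(host)
    return done
```

Aborting on the first unhealthy host is a design choice made for this sketch: it leaves the remaining, not-yet-updated hosts serving traffic rather than continuing to restart them.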
*** jamesmcarthur has quit IRC | 00:42 | |
*** sshnaidm has quit IRC | 00:45 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [dnm] nodepool-builder image with nodepool/openstacksdk/dib from master https://review.opendev.org/696000 | 00:55 |
ianw | oooh, 696000, nice round number | 00:56 |
*** sshnaidm has joined #openstack-infra | 01:00 | |
*** Goneri has quit IRC | 01:00 | |
openstackgerrit | Kendall Nelson proposed openstack/cookiecutter master: Update CONTRIBUTING.rst template https://review.opendev.org/696001 | 01:04 |
*** goldyfruit_ has joined #openstack-infra | 01:05 | |
openstackgerrit | Clark Boylan proposed zuul/zuul master: Improve functionality and docs around ansible installation https://review.opendev.org/675403 | 01:06 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [dnm] nodepool-builder image with nodepool/openstacksdk/dib from master https://review.opendev.org/696000 | 01:09 |
*** ociuhandu has joined #openstack-infra | 01:12 | |
*** igordc has quit IRC | 01:13 | |
*** michael-beaver has quit IRC | 01:18 | |
johnsom | ianw Any chance we can get some review on https://review.opendev.org/#/c/695823/ ? | 01:22 |
johnsom | We are trying to get some work around patches merged in Octavia, but the best option is to just fix that element. | 01:22 |
fungi | johnsom: there is also a #openstack-dib channel, in case that helps | 01:24 |
johnsom | ha, yeah, that discussion seems to move back and forth between here and the dib channel. | 01:24 |
*** ociuhandu has quit IRC | 01:28 | |
ianw | johnsom: oh, thanks, was just waiting for ci which was blocked on opensuse | 01:30 |
ianw | speaking of, mirror update should be done | 01:31 |
johnsom | Yeah, I poked it with a recheck, thanks. | 01:31 |
johnsom | It's done enough that the CI passed, lol | 01:31 |
*** rlandy|bbl is now known as rlandy | 01:36 | |
*** jamesmcarthur has joined #openstack-infra | 01:39 | |
*** ociuhandu has joined #openstack-infra | 01:40 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [dnm] nodepool-builder image with nodepool/openstacksdk/dib from master https://review.opendev.org/696000 | 01:42 |
*** jamesmcarthur has quit IRC | 01:43 | |
*** ociuhandu has quit IRC | 01:44 | |
*** ricolin has joined #openstack-infra | 01:46 | |
*** eharney has quit IRC | 01:49 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [dnm] nodepool-builder image with nodepool/openstacksdk/dib from master https://review.opendev.org/696000 | 01:52 |
*** armax has quit IRC | 01:57 | |
openstackgerrit | Merged opendev/storyboard master: Add support for creating attachments https://review.opendev.org/633421 | 02:10 |
openstackgerrit | Merged opendev/storyboard master: Add attachments API availability to /v1/systeminfo https://review.opendev.org/643560 | 02:10 |
*** jamesmcarthur has joined #openstack-infra | 02:10 | |
openstackgerrit | Merged openstack/diskimage-builder master: Stop installing pydistutils.cfg https://review.opendev.org/695823 | 02:12 |
*** jamesmcarthur has quit IRC | 02:15 | |
ianw | so, you can't copy files outside of the docker context .... but we can't really get the zuul source trees in there without some ugly hacks ... it doesn't feel like it's working out | 02:22 |
*** rlandy has quit IRC | 02:22 | |
*** armax has joined #openstack-infra | 02:31 | |
clarkb | does it have to be in the current dir tree? | 02:33 |
ianw | clarkb: yes basically in or underneath the "Dockerfile" you're building | 02:36 |
*** ociuhandu has joined #openstack-infra | 02:38 | |
*** EmilienM|PTO is now known as EmilienM | 02:42 | |
*** ociuhandu has quit IRC | 02:48 | |
*** jamesmcarthur has joined #openstack-infra | 02:52 | |
*** jamesmcarthur has quit IRC | 02:57 | |
*** tonyb has joined #openstack-infra | 03:00 | |
*** diablo_rojo has quit IRC | 03:02 | |
*** ricolin has quit IRC | 03:03 | |
*** apetrich has quit IRC | 03:09 | |
openstackgerrit | Paul Belanger proposed zuul/zuul-jobs master: WIP: Print extra debug info https://review.opendev.org/696013 | 03:21 |
*** jamesmcarthur has joined #openstack-infra | 03:29 | |
*** jamesmcarthur has quit IRC | 03:34 | |
*** ociuhandu has joined #openstack-infra | 03:43 | |
*** ociuhandu has quit IRC | 03:47 | |
*** sshnaidm_ has joined #openstack-infra | 03:53 | |
*** ociuhandu has joined #openstack-infra | 03:53 | |
*** sshnaidm has quit IRC | 03:57 | |
*** hongbin has joined #openstack-infra | 04:00 | |
*** ociuhandu has quit IRC | 04:01 | |
*** ociuhandu has joined #openstack-infra | 04:02 | |
*** ociuhandu has quit IRC | 04:07 | |
*** rh-jelabarre has quit IRC | 04:07 | |
*** ricolin has joined #openstack-infra | 04:08 | |
*** udesale has joined #openstack-infra | 04:14 | |
*** ykarel has joined #openstack-infra | 04:19 | |
*** sshnaidm__ has joined #openstack-infra | 04:25 | |
*** sshnaidm_ has quit IRC | 04:26 | |
*** ricolin has quit IRC | 04:29 | |
*** jamesmcarthur has joined #openstack-infra | 04:31 | |
*** Lucas_Gray has joined #openstack-infra | 04:33 | |
*** jamesmcarthur has quit IRC | 04:35 | |
*** Lucas_Gray has quit IRC | 04:39 | |
*** Wryhder has joined #openstack-infra | 04:39 | |
*** Wryhder is now known as Lucas_Gray | 04:40 | |
*** Lucas_Gray has quit IRC | 05:21 | |
*** ociuhandu has joined #openstack-infra | 05:30 | |
*** ociuhandu has quit IRC | 05:35 | |
*** tkajinam has quit IRC | 05:37 | |
*** tkajinam has joined #openstack-infra | 05:38 | |
openstackgerrit | Merged zuul/zuul-jobs master: Use RDO trunk repos work for openvswitch on centos8 https://review.opendev.org/695833 | 05:40 |
openstackgerrit | Merged zuul/zuul-jobs master: update-test-platforms.py : handle non-voting jobs https://review.opendev.org/695830 | 05:40 |
openstackgerrit | Merged zuul/zuul-jobs master: Make opensuse-15 job voting again https://review.opendev.org/695831 | 05:41 |
*** raukadah is now known as chkumar|rover | 05:41 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Add a Dockerfile and related jobs https://review.opendev.org/693971 | 05:50 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Add a Dockerfile and related jobs https://review.opendev.org/693971 | 05:52 |
*** adam_g has quit IRC | 05:55 | |
*** adam_g has joined #openstack-infra | 05:55 | |
*** markmcclain has quit IRC | 05:56 | |
*** markmcclain has joined #openstack-infra | 05:57 | |
*** jamesmcarthur has joined #openstack-infra | 06:07 | |
*** jamesmcarthur has quit IRC | 06:12 | |
*** hongbin has quit IRC | 06:18 | |
*** dchen has quit IRC | 06:18 | |
*** dchen has joined #openstack-infra | 06:19 | |
ianw | http://lists.openstack.org/pipermail/openstack-infra/2019-November/006529.html is the write-up I promised for the agenda point in the meeting | 06:31 |
*** pcaruana has joined #openstack-infra | 06:32 | |
*** lmiccini has joined #openstack-infra | 06:34 | |
*** udesale has quit IRC | 06:40 | |
*** udesale has joined #openstack-infra | 06:41 | |
*** jtomasek has joined #openstack-infra | 06:45 | |
*** slaweq has joined #openstack-infra | 07:23 | |
*** slaweq has quit IRC | 07:28 | |
*** apetrich has joined #openstack-infra | 07:29 | |
*** apetrich has quit IRC | 07:34 | |
*** udesale has quit IRC | 07:34 | |
*** udesale has joined #openstack-infra | 07:35 | |
*** xinranwang has joined #openstack-infra | 07:55 | |
*** pgaxatte has joined #openstack-infra | 07:55 | |
*** slaweq has joined #openstack-infra | 07:57 | |
*** pkopec has joined #openstack-infra | 08:08 | |
*** tesseract has joined #openstack-infra | 08:16 | |
*** tkajinam has quit IRC | 08:19 | |
*** dchen has quit IRC | 08:24 | |
*** tosky has joined #openstack-infra | 08:25 | |
AJaeger | yeah, openSUSE tumbleweed and 15 are working again. ianw, thanks for the mirror run. | 08:31 |
*** jtomasek has quit IRC | 08:34 | |
*** jtomasek has joined #openstack-infra | 08:34 | |
*** jpena|off is now known as jpena | 08:44 | |
*** ralonsoh has joined #openstack-infra | 08:53 | |
*** lucasagomes has joined #openstack-infra | 08:57 | |
*** priteau has joined #openstack-infra | 08:58 | |
*** jtomasek has quit IRC | 09:04 | |
*** ociuhandu has joined #openstack-infra | 09:04 | |
*** apetrich has joined #openstack-infra | 09:04 | |
*** ociuhandu has quit IRC | 09:08 | |
*** jtomasek has joined #openstack-infra | 09:09 | |
mgoddard | clarkb, ianw: missing a lot of context on what you were discussing, but we use templates for differentiation between distros, source/binary image types, to pass through configuration (e.g. base OS image) and customisation (e.g. install these extra packages) | 09:12 |
mgoddard | loci has a different approach, they use docker build args, which can be passed via CLI args and accessed within the Dockerfile. That can be good for package customisation etc | 09:13 |
mgoddard | another option is to use scripts within the image (ADD/COPY) which are agnostic to the thing that changes (distro, release) and either detect it or have it passed in via environment | 09:14 |
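The build-arg approach mgoddard attributes to loci can be illustrated with a small helper that assembles a `docker build` command line. This is only a sketch: `docker_build_command`, the `INSTALL_FROM` argument and the image tag are hypothetical and do not come from loci, kolla or system-config. Only the `--tag` and `--build-arg` options of `docker build` are real; inside the Dockerfile each value would be declared with an `ARG` instruction.

```python
from typing import Dict, List


def docker_build_command(tag: str, context: str, build_args: Dict[str, str]) -> List[str]:
    """Return an argv list for `docker build` with one --build-arg per entry."""
    cmd = ["docker", "build", "--tag", tag]
    # Sort for a stable, reproducible command line.
    for name, value in sorted(build_args.items()):
        cmd += ["--build-arg", f"{name}={value}"]
    cmd.append(context)
    return cmd


# Hypothetical usage: the same Dockerfile built in two modes.
speculative = docker_build_command("nodepool-builder:test", ".", {"INSTALL_FROM": "source"})
production = docker_build_command("nodepool-builder:latest", ".", {"INSTALL_FROM": "release"})
```

The resulting list can be passed to `subprocess.run` as-is. Whether one Dockerfile with such a switch is preferable to two separate Dockerfiles is exactly the open question in the discussion above.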
ianw | mgoddard: thanks; yeah at least for me the problem was that we were building images pulling across 3 different namespaces (zuul, opendev and openstack) and it meant nothing had a clear home | 09:17 |
ianw | s/was/is/ | 09:17 |
ianw | that's why i think removing our idea that there's one Dockerfile in each project that builds the canonical image might work ... | 09:17 |
ianw | it doesn't matter that nodepool exports a zuul/nodepool and a opendev/nodepool and a <blah>/nodepool ... as long as we're fairly clear about what the purpose of the tools in a namespace is | 09:18 |
mgoddard | ianw: yeah, sounds like it needs to be in one place - either the 'top level' project among those or a separate one | 09:18 |
ianw | yeah, to take advantage of zuul's magic i think it's easier to have multiple dockerfiles in each project (because that way you easily speculatively build images based on zuul's checkout) ... then use the excellent work with the intermediate registries to chain those speculative builds together ... | 09:20 |
ianw | as they say, may you live in interesting times :) | 09:21 |
*** hashar has joined #openstack-infra | 09:27 | |
openstackgerrit | Simon Westphahl proposed zuul/zuul master: Add optional support for circular dependencies https://review.opendev.org/685354 | 09:31 |
*** ociuhandu has joined #openstack-infra | 09:37 | |
*** SotK has quit IRC | 09:37 | |
*** ykarel is now known as ykarel|lunch | 09:42 | |
*** ociuhandu has quit IRC | 09:43 | |
*** ociuhandu has joined #openstack-infra | 09:44 | |
*** SotK has joined #openstack-infra | 09:46 | |
*** ociuhandu has quit IRC | 09:48 | |
*** derekh has joined #openstack-infra | 09:49 | |
*** roman_g has joined #openstack-infra | 09:53 | |
*** pgaxatte has quit IRC | 09:57 | |
*** pgaxatte has joined #openstack-infra | 09:59 | |
*** ociuhandu has joined #openstack-infra | 10:00 | |
*** electrofelix has joined #openstack-infra | 10:09 | |
openstackgerrit | Simon Westphahl proposed zuul/zuul master: Keep task stdout/stderr separate in result object https://review.opendev.org/650276 | 10:11 |
*** xinranwang has quit IRC | 10:15 | |
openstackgerrit | Simon Westphahl proposed zuul/zuul master: Align template formating for reporters https://review.opendev.org/643306 | 10:18 |
*** iurygregory has joined #openstack-infra | 10:19 | |
*** ykarel|lunch is now known as ykarel | 10:28 | |
*** apetrich has quit IRC | 10:49 | |
*** lpetrut has joined #openstack-infra | 10:52 | |
*** takamatsu has quit IRC | 10:55 | |
*** udesale has quit IRC | 10:58 | |
*** dpawlik has quit IRC | 11:00 | |
*** ociuhandu has quit IRC | 11:03 | |
*** pgaxatte has quit IRC | 11:10 | |
*** jaosorior has joined #openstack-infra | 11:13 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: authentication config: add optional token_expiry https://review.opendev.org/642408 | 11:15 |
*** sshnaidm__ is now known as sshnaidm | 11:15 | |
*** priteau has quit IRC | 11:19 | |
*** apetrich has joined #openstack-infra | 11:27 | |
*** apetrich has quit IRC | 11:27 | |
*** dpawlik has joined #openstack-infra | 11:27 | |
*** apetrich has joined #openstack-infra | 11:29 | |
*** dpawlik has quit IRC | 11:31 | |
*** rcernin has quit IRC | 11:40 | |
*** ociuhandu has joined #openstack-infra | 11:41 | |
*** electrofelix has quit IRC | 11:43 | |
*** electrofelix has joined #openstack-infra | 11:43 | |
*** ociuhandu has quit IRC | 11:46 | |
*** dpawlik has joined #openstack-infra | 11:53 | |
*** dpawlik has quit IRC | 11:58 | |
*** jklare has quit IRC | 11:59 | |
*** surpatil has joined #openstack-infra | 11:59 | |
*** derekh has quit IRC | 12:01 | |
*** derekh has joined #openstack-infra | 12:01 | |
*** jklare has joined #openstack-infra | 12:03 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Authorization rules: support YAML nested dictionaries https://review.opendev.org/684790 | 12:05 |
*** rfolco has joined #openstack-infra | 12:05 | |
*** lmiccini has quit IRC | 12:12 | |
*** dtantsur|afk is now known as dtantsur | 12:20 | |
*** udesale has joined #openstack-infra | 12:21 | |
*** apetrich has quit IRC | 12:23 | |
*** kjackal has joined #openstack-infra | 12:24 | |
*** lmiccini has joined #openstack-infra | 12:25 | |
*** Lucas_Gray has joined #openstack-infra | 12:26 | |
*** ccamacho has quit IRC | 12:40 | |
frickler | infra-root: I'm seeing a couple at gate failures due to inap mirror failures, no time to dig into it myself currently. sample https://634701e5b6b6ac718321-331251c5023ba17307c332949286c53b.ssl.cf1.rackcdn.com/695695/3/gate/openstack-tox-py35/42b263f/job-output.txt | 12:45 |
*** rh-jelabarre has joined #openstack-infra | 12:49 | |
*** dpawlik has joined #openstack-infra | 12:51 | |
*** Goneri has joined #openstack-infra | 12:53 | |
*** dpawlik has quit IRC | 12:56 | |
*** weshay|ruck is now known as weshay | 13:00 | |
*** pgaxatte has joined #openstack-infra | 13:00 | |
*** ccamacho has joined #openstack-infra | 13:02 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: zuul_stream: handle module that emit msg as a list https://review.opendev.org/696081 | 13:05 |
*** goldyfruit_ has quit IRC | 13:08 | |
*** rlandy has joined #openstack-infra | 13:11 | |
*** apetrich has joined #openstack-infra | 13:14 | |
*** surpatil has quit IRC | 13:14 | |
*** ociuhandu has joined #openstack-infra | 13:20 | |
*** liuyulong has joined #openstack-infra | 13:23 | |
*** ociuhandu has quit IRC | 13:26 | |
*** jpena is now known as jpena|lunch | 13:26 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: zuul_stream: handle module that emit non str msg https://review.opendev.org/696081 | 13:28 |
openstackgerrit | Simon Westphahl proposed zuul/zuul master: Keep task stdout/stderr separate in result object https://review.opendev.org/650276 | 13:30 |
*** mriedem has joined #openstack-infra | 13:35 | |
*** kjackal has quit IRC | 13:36 | |
*** kjackal_v2 has joined #openstack-infra | 13:37 | |
*** jaosorior has quit IRC | 13:38 | |
openstackgerrit | Jens Harbott (frickler) proposed opendev/system-config master: rsync mirrors: use localauth vos release https://review.opendev.org/695981 | 13:40 |
*** rfolco has quit IRC | 13:47 | |
*** rfolco has joined #openstack-infra | 13:49 | |
*** rfolco has quit IRC | 13:50 | |
rm_work | augh, this pyyaml thing is driving me mad | 13:51 |
rm_work | it's still happening even without novnc https://dc24826bd4516a6de44b-5bffd7009ada1667875c6ae3efa923db.ssl.cf5.rackcdn.com/695947/5/check/octavia-grenade/77b9004/logs/grenade.sh.txt.gz | 13:51 |
rm_work | i assume something ELSE pulled it in too | 13:51 |
rm_work | what I don't get is, I see something like 9 times it was requested to be installed, and it just worked | 13:52 |
rm_work | and then this one time, it explodes? O_o | 13:52 |
rm_work | i'm not sure where you found it being installed originally | 13:54 |
rm_work | hmmm nevermind, seems like the changes we made didn't actually successfully disable novnc, it's still getting installed T_T | 13:57 |
rm_work | https://review.opendev.org/#/c/695947/6/playbooks/legacy/grenade-devstack-octavia/run.yaml | 13:57 |
*** diga_ has joined #openstack-infra | 13:58 | |
rm_work | ahhh nm, it is, it's just done really strangely | 13:58 |
rm_work | https://zuul.opendev.org/t/openstack/build/8d124c9dba8049d1a8a541419031c778/log/logs/old/local_conf.txt.gz#21-54 | 13:58 |
rm_work | enabled and then disabled >_> | 13:59 |
rm_work | but that should work | 13:59 |
*** tkajinam has joined #openstack-infra | 14:00 | |
frickler | rm_work: I think we are seeing the same issue in designate, the reason for the issue, afaict, seems to be that capped pip is installed only for py2, not for py3, and then with the newer pip, some things fail | 14:11 |
rm_work | hmm | 14:12 |
rm_work | yes | 14:12 |
*** dpawlik has joined #openstack-infra | 14:12 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: enqueue: make trigger optional https://review.opendev.org/695446 | 14:14 |
*** jpena|lunch is now known as jpena | 14:18 | |
*** ociuhandu has joined #openstack-infra | 14:22 | |
zbr_ | frickler: *lots* of errors on inap, can't we disable it? | 14:24 |
zbr_ | >340 failures in 24h, http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22Connection%20broken:%20IncompleteRead(0%20bytes%20read)%5C%22 | 14:24 |
frickler | rm_work: so in fact the old stack is being set up with use_python3=false, the new one with true, this breaks. IIUC gmann did make patches such that the old stack would also run with py3, not sure why this doesn't work here | 14:24 |
rm_work | we have some custom grenade stuff | 14:25 |
rm_work | so maybe it isn't using his | 14:25 |
rm_work | need to figure out what he did | 14:25 |
frickler | zbr_: I'd like some other infra-root to double-check, someone should be awake soon | 14:26 |
*** ociuhandu has quit IRC | 14:27 | |
*** rfolco has joined #openstack-infra | 14:28 | |
rm_work | can i just add `export DEVSTACK_GATE_USE_PYTHON3=True` to our grenade script? | 14:28 |
rm_work | looks like it | 14:28 |
rm_work | based on https://review.opendev.org/#/c/695097/ | 14:31 |
rm_work | and a couple other patches he proposed for the same issue I'd guess | 14:31 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: wip: add cleanup-timeout job attribute https://review.opendev.org/696098 | 14:34 |
openstackgerrit | Simon Westphahl proposed zuul/zuul master: Spec for allowing circular dependencies https://review.opendev.org/643309 | 14:35 |
*** goldyfruit has joined #openstack-infra | 14:36 | |
*** goldyfruit_ has joined #openstack-infra | 14:38 | |
openstackgerrit | Merged opendev/system-config master: rsync mirrors: use localauth vos release https://review.opendev.org/695981 | 14:40 |
*** Goneri has quit IRC | 14:41 | |
*** goldyfruit has quit IRC | 14:41 | |
*** dpawlik has quit IRC | 14:44 | |
*** Lucas_Gray has quit IRC | 14:45 | |
rm_work | yeah, inap mirrors borked maybe? | 14:46 |
rm_work | 2019-11-26 14:38:44.957553 | controller | Err:1 http://mirror.mtl01.inap.opendev.org/ubuntu bionic/main amd64 libharfbuzz0b amd64 1.7.2-1ubuntu1 | 14:46 |
rm_work | 2019-11-26 14:38:44.957657 | controller | 404 Not Found [IP: 198.72.125.4 80] | 14:46 |
openstackgerrit | Paul Belanger proposed zuul/zuul-jobs master: Update download-artifact to use zuul.artifacts https://review.opendev.org/696013 | 14:47 |
rm_work | our jobs on inap are dying too | 14:47 |
rm_work | zbr_ / frickler ^^ | 14:47 |
*** ociuhandu has joined #openstack-infra | 14:49 | |
*** Goneri has joined #openstack-infra | 14:53 | |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Make reporting asynchronous https://review.opendev.org/691253 | 14:54 |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Make direct-push configurable on project-level https://review.opendev.org/677109 | 14:54 |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Implement push job in merger https://review.opendev.org/677110 | 14:54 |
openstackgerrit | Felix Schmidt proposed zuul/zuul master: Push changes in GerritReporter if direct-push is enabled https://review.opendev.org/677111 | 14:54 |
*** ociuhandu has quit IRC | 14:54 | |
*** dpawlik has joined #openstack-infra | 14:56 | |
fungi | i suspect the problem with the inap mirror (if the failures you're seeing are for proxied http calls?) is that the apache isn't getting its proxy cache culled fast enough or aggressively enough, so the dedicated /var/cache/apache2 volume is filling up from time to time: http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=67962&rra_id=all | 14:58 |
rm_work | ok so ... we wait for that to get fixed while a bunch of our jobs randomly die? :D | 14:59 |
rm_work | can we set up elastic-recheck for this? | 14:59 |
rm_work | (may have made a meaningless sentence, not sure if that's how that works) | 15:00 |
fungi | right now htcacheclean is running hourly with -l 70200M which may just not be aggressive enough if jobs are caching very large files | 15:00 |
fungi | the problem is "fixed" for the moment because there's room on that fs again | 15:00 |
rm_work | well right now, this means that using inap in zuul is worse than NOT using it, because having one job run on inap and fail means a recheck and like 10+ jobs need to run again | 15:01 |
rm_work | hmm k, so ... i guess I recheck and hope? | 15:01 |
fungi | but presumably still better than not testing anything at all? ;) | 15:02 |
fungi | i can clean up more cached content there, but i'm curious to know why we're not hitting the same problem in other providers which have similar configuration | 15:02 |
rm_work | is not testing one of the options? :D | 15:02 |
fungi | if you don't develop any software then there's no need to test anything, so i suppose? | 15:03 |
fungi | could be the volume of jobs being run there, or i/o performance slowing the htcacheclean run | 15:03 |
fungi | yeah, looks like it could be i/o performance, the 14:00 utc htcacheclean run is still going an hour later | 15:04 |
rm_work | yeah i'm just saying, the result of having it enabled when it's broken isn't just "ah, we don't get additional capacity from it", the result is "we have less capacity available than if it wasn't enabled" | 15:04 |
*** jaosorior has joined #openstack-infra | 15:04 | |
openstackgerrit | Paul Belanger proposed zuul/zuul-jobs master: Update download-artifact to use zuul.artifacts https://review.opendev.org/696013 | 15:04 |
fungi | right, and i'm saying the condition i suspect caused those failures has cleared, at least temporarily, and i'm looking at what we need to do to reduce the risk that it returns | 15:04 |
rm_work | k | 15:05 |
fungi | i think we're simply caching too much with apache for slower filesystems, in the past we tried to keep these down closer to 50gb but they've been increased in recent months i think | 15:05 |
fungi | there's a breaking point where you cache files faster than htcacheclean can remove them, and then you get a runaway train effect (runaway ussuri?) | 15:06 |
fungi | keeping the cache smaller reduces the context that htcacheclean must traverse, to hopefully keep it ahead of the game for subsequent runs | 15:07 |
sshnaidm | mordred, hi, please ping me if you're available in your time, wrt openstack ansible modules | 15:11 |
fungi | utilization is slowly falling there, so once htcacheclean hopefully catches up we can reduce it further. if it starts climbing again i'll disable inap temporarily in nodepool so we can safely wipe the cache before reducing the size we want htcacheclean to keep it at | 15:12 |
rm_work | really really wish it was possible to have zuul restart ONE job | 15:13 |
openstackgerrit | Elod Illes proposed openstack/openstack-zuul-jobs master: Add tox-py37 to periodic-stable-jobs template https://review.opendev.org/696105 | 15:13 |
*** ykarel is now known as ykarel|pto | 15:14 | |
rm_work | and do it while the others are still running so i don't have to wait the entire time and then re-run a bunch of jobs that passed fine already | 15:14 |
fungi | interestingly, some of our other mirrors (rax and ovh) have 128gb volumes for /var/cache/apache2 instead of 100gb | 15:16 |
fungi | that might be why the htcacheclean -l parameter got increased | 15:17 |
*** chkumar|rover is now known as raukadah | 15:17 | |
*** ociuhandu has joined #openstack-infra | 15:19 | |
*** tkajinam has quit IRC | 15:19 | |
fungi | i've temporarily switched out the running htcacheclean process there with a manual one with -l set to 50gb | 15:20 |
fungi | so hopefully we'll get some more breathing room while i continue digging | 15:20 |
rm_work | thanks | 15:20 |
fungi | looks like https://review.openstack.org/575520 raised it from 50gb to 70gb in july of last year | 15:23 |
*** dpawlik has quit IRC | 15:23 | |
fungi | though i think inap has been offline for most of that time, until a few weeks ago, so it's possible we simply didn't have data showing it would be a problem there | 15:24 |
*** yoctozepto has quit IRC | 15:24 | |
*** yoctozepto has joined #openstack-infra | 15:25 | |
*** ociuhandu has quit IRC | 15:28 | |
clarkb | fungi: we could make a 300gb cinder volume and swap in 200gb fs pretty easily? | 15:33 |
clarkb | (extra 100gb for afs) | 15:33 |
fungi | yeah, i'm unconvinced giving it more space will solve the problem though, if it can't manage to delete files down to 70gb | 15:33 |
fungi | we run htcacheclean hourly under flock, but if it takes 3 hours for htcacheclean to traverse 70gb+ data then that's two additional pulses behind it's getting | 15:34 |
clarkb | does it get near 70gb after that 3 hours though? | 15:35 |
fungi | increasing the breathing room might relieve that, i guess, to allow it to skip more htcacheclean pulses? | 15:35 |
clarkb | if so it's only the growth in that window we have to accommodate right? | 15:35 |
fungi | looking at the graph, it was around 70gb as of 0800z but then started climbing | 15:37 |
fungi | filling the filesystem by around 1100z | 15:37 |
clarkb | we are probably very near that boundary then | 15:38 |
fungi | and it remained full until it suddenly began to drop around 1330z | 15:38 |
*** derekh has quit IRC | 15:38 | |
clarkb | and extra disk may be sufficient to get us away from it | 15:38 |
*** derekh has joined #openstack-infra | 15:38 | |
fungi | so basically htcacheclean could not keep up for around 5.5 hours | 15:38 |
clarkb | the cleaning is sudden aiui because it does a scan first to determine what to delete then deletes | 15:38 |
clarkb | as it applies LRU rules to objects if bot already expired | 15:39 |
clarkb | *not | 15:39 |
fungi | by the time it did manage to delete content, it had lost 10gb ground and so only deleted down to around 80gb | 15:39 |
fungi | presumably because it was working off 6-hour-old calculations for what should be expired | 15:39 |
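The numbers fungi reads off the graph allow a rough back-of-the-envelope estimate of how much headroom a larger volume would buy. The sketch below assumes a constant net fill rate, no successful cleaning during the window, and a 100gb cache filesystem on the inap mirror (inferred from the earlier comparison with the 128gb volumes elsewhere). The function names are hypothetical.

```python
def fill_rate_gb_per_hour(start_gb: float, end_gb: float, hours: float) -> float:
    """Average net growth of the cache over an observation window."""
    return (end_gb - start_gb) / hours


def hours_until_full(fs_size_gb: float, used_gb: float, rate_gb_per_hour: float) -> float:
    """Time until the filesystem fills if nothing is cleaned in the meantime."""
    if rate_gb_per_hour <= 0:
        return float("inf")
    return max(fs_size_gb - used_gb, 0.0) / rate_gb_per_hour


# ~70gb at 0800z, full (assumed 100gb) by ~1100z.
rate = fill_rate_gb_per_hour(70, 100, 3)

current = hours_until_full(100, 70, rate)   # today's volume
proposed = hours_until_full(200, 70, rate)  # the 200gb filesystem clarkb suggests
```

Under these assumptions the rate is 10 GB/h, the current volume lasts about 3 hours from the 70gb target, and a 200gb filesystem lasts about 13 hours, which is longer than the roughly 6-hour htcacheclean pass mentioned above. Real load is bursty, so this is an estimate only.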
fungi | but yeah, we could attach another cinder volume, add it to the current vg, extend the volume into it and see if things are able to keep up with the extra headroom | 15:42 |
fungi | but if it's basically only capable of doing an htcacheclean pass every 6 hours under load, i'm concerned that may still end up in a bad place | 15:42 |
*** ociuhandu has joined #openstack-infra | 15:42 | |
fungi | wondering if there are faster storage options there | 15:42 |
fungi | something where we might get better read performance (assuming that's the delay) | 15:43 |
clarkb | ya though raid 0 may be the answer | 15:43 |
clarkb | (which second volume approximates) | 15:44 |
fungi | that's not a terrible idea, though it does double the risk for catastrophic failure of the service from a volume outage (so does attaching a second volume to the vg really) | 15:44 |
clarkb | yup | 15:45 |
clarkb | and with cinder volumes it is really hard to tell if we actually get double the throughput | 15:45 |
fungi | i'm pretty sure lvm2 can stripe blocks across multiple pvs, need to revisit the manpage. if so we could do that without resorting to mdraid | 15:45 |
clarkb | because networking and we don't know what the backend looks like | 15:45 |
clarkb | looks like lvm can do it but we need a new lv | 15:48 |
*** diablo_rojo has joined #openstack-infra | 15:53 | |
*** lmiccini has quit IRC | 15:54 | |
*** michael-beaver has joined #openstack-infra | 15:54 | |
openstackgerrit | Yannick Thomas proposed openstack/project-config master: Create neutron-interconnection repo under x/ namespace https://review.opendev.org/696116 | 15:55 |
*** mriedem has quit IRC | 15:55 | |
*** mriedem has joined #openstack-infra | 15:59 | |
*** dpawlik has joined #openstack-infra | 16:00 | |
*** dtantsur is now known as dtantsur|afk | 16:03 | |
*** dpawlik has quit IRC | 16:04 | |
*** kjackal_v2 has quit IRC | 16:12 | |
*** ijw has joined #openstack-infra | 16:14 | |
clarkb | fungi: also possible that the local root disk performs better than cinder volumes there? we could use apache cache on local disk if so | 16:15 |
gmann | rm_work: frickler yeah that worked for other projects. octavia was facing the broken pipe issue also in that grenade job fix patch.. | 16:16 |
fungi | clarkb: maybe, on the other hand if it doesn't then we fill the rootfs again and *boom* | 16:17 |
clarkb | ya | 16:18 |
*** hashar has quit IRC | 16:19 | |
rm_work | gmann: yeah and ALSO some issue with the installation of osc-placement in nova | 16:19 |
*** jamesmcarthur has joined #openstack-infra | 16:19 | |
rm_work | https://review.opendev.org/#/c/695466/ | 16:19 |
rm_work | so many issues | 16:19 |
rm_work | i think we're going to temporarily switch our grenade job to non-voting, because this is ridiculous, and wait for some of this stuff to work itself out | 16:20 |
rm_work | our gates have been down since the middle of last week | 16:20 |
gmann | rm_work: ok, or I will suggest to make it py2->py2 as it was previously so that you keep running the coverage. later while moving to everything py3 then we fix those issue | 16:21 |
rm_work | hmm, i guess that's a possibility -- but i think we are ALSO about to have py2 issues | 16:22 |
*** tesseract has quit IRC | 16:23 | |
*** pgaxatte has quit IRC | 16:25 | |
*** ociuhandu has quit IRC | 16:33 | |
*** ociuhandu has joined #openstack-infra | 16:34 | |
AJaeger | config-core, could you review https://review.opendev.org/694478, please? | 16:42 |
*** jpena is now known as jpena|brb | 16:45 | |
*** lucasagomes has quit IRC | 16:46 | |
clarkb | as a sanity check the inap volume is mounted noatime,errors=remount-ro,barrier=0 which is in line with our other volume mount options | 16:47 |
clarkb | AJaeger: do you know if the neutron team is aware of the apparent new interest? and if so does that change their plans at all? | 16:50 |
clarkb | (I want to avoid a situation where we fork the project then 6 months later want to replace the openstack/ project with the fork) | 16:51 |
slaweq | clarkb: hi, yes I know about it | 16:51 |
slaweq | but as there was really lack of development in this project for last couple of cycles, we decided to not keep it as stadium project anymore | 16:52 |
slaweq | and I don't think it will change anytime soon | 16:52 |
clarkb | slaweq: right I get all that. My question is now that you've said remove it people are claiming that they will invest time into it. Does that change the calculation at all? Mostly because I really want to avoid a potential unfork in the future | 16:53 |
*** dpawlik has joined #openstack-infra | 16:54 | |
slaweq | clarkb: I don't think so, there is no reason why it couldn't be developed in x/ namespace if people will really want | 16:54 |
slaweq | but based on the past experience with this project, I'm really still not sure how much time they will be able to invest in this project now | 16:55 |
slaweq | so as for now I can say that we will not want to unfork it in e.g. 6 months | 16:55 |
slaweq | clarkb: is that good answer for You? :) | 16:55 |
clarkb | yes | 16:56 |
slaweq | thx :) | 16:57 |
*** dpawlik has quit IRC | 16:58 | |
*** iurygregory has quit IRC | 16:58 | |
johnsom | Ah, ok, so you are already working on the broken inap mirror. | 17:01 |
johnsom | E: Failed to fetch http://mirror.mtl01.inap.opendev.org/ubuntu/pool/main/h/harfbuzz/libharfbuzz0b_1.7.2-1ubuntu1_amd64.deb 404 Not Found [IP: 198.72.125.4 80] | 17:01 |
clarkb | johnsom: well I think we've identified the cause (htcacheclean not cleaning quickly enough) and now trying to sort out if simply adding some headroom is sufficient to avoid the problem or if we need to find faster disk io | 17:03 |
johnsom | Just minutes ago in https://aa07927f3550f63afd7c-beccc8c74927db18ecc6d28abe62d057.ssl.cf1.rackcdn.com/695947/8/check/octavia-v2-dsvm-scenario/d746404/job-output.txt | 17:03 |
clarkb | hrm it has plenty of disk right now ~10GB | 17:03 |
johnsom | Cool, thanks. Hopefully not one of those HPE SSDs... lol | 17:03 |
clarkb | also that is afs not apache cache | 17:03 |
clarkb | (that particular url is I mean) | 17:03 |
clarkb | possible that something else is going on too | 17:04 |
*** sshnaidm is now known as sshnaidm|afk | 17:05 | |
clarkb | -????????? ? ? ? ? ? libharfbuzz0b_1.7.2-1ubuntu1_amd64.deb | 17:06 |
clarkb | that is what ls -l shows for that file on afs "disk" | 17:06 |
clarkb | about 5 hours ago there were a bunch of afs io errors | 17:07 |
clarkb | nothing current, but those errors could still affect that particular file I suppose | 17:07 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: WIP pagure: remove connectors burden and simplify code https://review.opendev.org/696134 | 17:07 |
clarkb | fungi: ^ maybe we should reboot? | 17:08 |
*** udesale has quit IRC | 17:09 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: WIP pagure: remove connectors burden and simplify code https://review.opendev.org/696134 | 17:10 |
clarkb | fungi: I think we should disable inap, increase apache cache fs size (possibly with raid0), reboot, then turn it back on again | 17:10 |
clarkb | I expect that we are close enough to the limit here that simply adding a bit more disk will make things happy again | 17:10 |
clarkb | basically we need to get to where htcacheclean is able to stat its contents before the disk fills up | 17:11 |
clarkb | thoughts? | 17:11 |
fungi | clarkb: yeah, i also wonder if whatever impacted afs could be similarly responsible for htcacheclean's poor performance | 17:13 |
openstackgerrit | Clark Boylan proposed openstack/project-config master: Disable inap https://review.opendev.org/696137 | 17:13 |
rm_work | yeah i think it's always been that package for us -- so possible just that one or a couple of them got corrupted at some point | 17:13 |
fungi | clarkb: also two apache segfaults a few minutes after that burst of afs disk cache read errors | 17:15 |
*** ociuhandu has quit IRC | 17:15 | |
fungi | it's possible the cinder volume these are on is sometimes not responding or responding unreasonably slowly | 17:15 |
clarkb | ya and that could affect afs when it goes to read from the cache | 17:16 |
clarkb | (or contention between the two caches against the volume) | 17:17 |
clarkb | we have a single volume type in that region: solidfire0 | 17:18 |
openstackgerrit | Merged openstack/project-config master: Retire neutron-interconnection project https://review.opendev.org/694478 | 17:19 |
fungi | clarkb: looks like we're also not collecting disk i/o counters from snmp | 17:19 |
*** igordc has joined #openstack-infra | 17:20 | |
fungi | top shows iowait spiking up badly for brief periods though | 17:20 |
fungi | watched it go from around 1% wa to almost 80% a minute ago | 17:21 |
*** jpena|brb is now known as jpena | 17:21 | |
clarkb | benj_: ^ any idea how we can more efficiently make use of cinder there? | 17:21 |
*** jamesmcarthur has quit IRC | 17:22 | |
clarkb | fungi: another option is to use a bigger flavor then hope the root disk performs better | 17:23 |
fungi | yeah, maybe. are there other cinder volume types available, did you happen to notice? | 17:24 |
clarkb | no that is the only type available | 17:24 |
fungi | k, then yeah maybe the rootfs is hypervisor local storage and has less network congestion to worry about | 17:24 |
fungi | but if we're also seeing problems for afs then we'd presumably want both on the rootfs, so need a larger rootfs | 17:25 |
clarkb | yes, we can rebuild the server with a bigger flavor | 17:25 |
fungi | do the larger flavors come with larger rootfs? | 17:27 |
fungi | i know that's not a given in other providers | 17:27 |
AJaeger | config-core, two small reviews, please: https://review.opendev.org/695661 https://review.opendev.org/695401 | 17:27 |
clarkb | fungi: yes, in this case we can double the rootfs | 17:28 |
clarkb | 320GB from 160GB | 17:28 |
clarkb | (doubles a bunch of other stuff too but that may just be what we have to accept) | 17:29 |
fungi | anybody remember how far we got (or where we got stuck) adding a builder for arm64/aarch64 wheels? was it challenges with afs support? | 17:29 |
clarkb | we build wheels in zuul jobs now. I don't think anything on the zuul side has problems with that currently. I'm guessing that it is afs that posed a problem (since we have to write and not just read) | 17:30 |
fungi | do the current wheel builds for amd64 get temporary write access to afs, or is the executor retrieving those and writing them into afs? | 17:31 |
clarkb | fungi: I think a secret is shared on the build node and they write directly | 17:31 |
fungi | k. in that case, yeah, we do still need to solve afs access from arm64 systems | 17:32 |
AJaeger | ianw should know, AFAIK he looked at that... | 17:32 |
*** jaosorior has quit IRC | 17:32 | |
fungi | yeah, i was just going to follow up on the multi-arch sig thread on openstack-discuss where jrosser mentioned needing a prebuilt arm64 wheel cache for osa jobs | 17:33 |
fungi | but can certainly wait until ianw is awake | 17:33 |
openstackgerrit | Merged openstack/project-config master: Disable inap https://review.opendev.org/696137 | 17:34 |
*** ykarel|pto has quit IRC | 17:34 | |
clarkb | fungi: fwiw I'm somewhat inclined to build a new inap mirror on the 320GB root disk flavor since that is straightforward and doesn't involve trying to tweak too many settings | 17:34 |
fungi | i wonder if a refactor of the job to have the executor write those into afs instead would make things easier (but i think we'd want to make sure we only transfer the outstanding delta between what was already built and what changed?) | 17:34 |
clarkb | we can spend all day tuning cinder or an hour setting up new server :) | 17:35 |
fungi | clarkb: yeah, i'm with you there. best we focus our limited available time elsewhere | 17:35 |
mgagne | clarkb: what's the issue? | 17:35 |
fungi | mgagne: we saw read errors from a cinder volume, and are also getting very slow i/o reading from it | 17:36 |
clarkb | mgagne: the thought is that using a vm local root disk (that is bigger) may provide better throughput | 17:36 |
mgagne | is the issue intermittent or is it still going on? | 17:36 |
mgagne | can't disagree on that one (local disk > cinder volume) | 17:37 |
clarkb | mgagne: it is a little bit of both :) intermittent iowait slows down htcacheclean to the point where we run out of disk (because it can't clean quickly enough) and we think that the intermittent issue may cause problems in afs that then persists | 17:37 |
fungi | mgagne: it was apparently particularly bad around 12:00 and 12:45 utc | 17:37 |
clarkb | my thought is go to the 320GB root disk flavor and not use a cinder volume | 17:38 |
mgagne | I'm looking at performance graphs on the backend and we didn't see any issue here. Our cinder volume has a minimum of 400 iops guaranteed and maximum of 4000 if available (which should always be the case here) | 17:39 |
clarkb | mgagne: could it be that network is the bottleneck (if shared with the test nodes?) | 17:39 |
mgagne | we unfortunately don't have any other volume types but I could manually increase the maximum IOPS to double it. | 17:39 |
clarkb | if we double the iops we should be able to see if htcacheclean runs quicker | 17:40 |
fungi | mgagne: can you see whether that instance is exceeding the cinder iops guarantee? | 17:40 |
mgagne | compute nodes have 10g with lacp. But VMs themselves have limited bandwidth depending on the flavor. The storage network isn't limited though. | 17:40 |
clarkb | ok, likely not networking then | 17:41 |
mgagne | I'll check if that information is available | 17:41 |
mgagne | performance graph isn't real time for volumes. I saw some spike to 100% utilization but IOPS reported is way below the maximum. So I'm a bit confused about how the % is computed. Could be based on the minimum. | 17:47 |
*** tobiash_ is now known as tobiash | 17:47 | |
openstackgerrit | Merged openstack/project-config master: Manage pyghmi jobs at project level https://review.opendev.org/695661 | 17:49 |
openstackgerrit | Merged openstack/project-config master: Add gerritbot trigger for microstack https://review.opendev.org/695401 | 17:49 |
*** openstackgerrit has quit IRC | 17:49 | |
mgagne | I think iops are based on 4k. But your iops are way above 4k. Average ~16k with 64k peaks in your case. I suppose increasing maximum would be a good thing | 17:50 |
clarkb | mgagne: fwiw this is the only volume we'll be running there (at least we don't have plans for additional volumes). I'm not sure if that system is shared but if not then raising those iops would probably be ok? | 17:51 |
mgagne | it's shared but it's under utilized afaik. | 17:51 |
mgagne | I doubled the maximum iops. | 17:52 |
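mgagne's point about 4k-based accounting can be made concrete with some rough arithmetic. This sketch assumes (as mgagne only suspects) that the backend charges one IOPS per 4 KiB transferred, so a larger request uses up several IOPS from the budget. The function name `ops_per_sec` is made up for illustration.

```bash
#!/usr/bin/env bash
# ops_per_sec BUDGET_IOPS IO_SIZE_KIB
# Assumption: each request costs ceil(io_size / 4 KiB) IOPS from the budget.
ops_per_sec() {
    local budget="$1" io_kib="$2"
    local cost=$(( (io_kib + 3) / 4 ))   # 4 KiB units per request, rounded up
    echo $(( budget / cost ))
}

ops_per_sec 4000 16   # original maximum, average request size
ops_per_sec 4000 64   # original maximum, peak request size
ops_per_sec 8000 16   # after the maximum was doubled
```

Under that assumption the original 4000 IOPS cap allows about 1000 requests per second at 16k and 250 at 64k, and the doubled cap allows about 2000 at 16k.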
clarkb | but also we can probably switch to root disk if that is better for you all (then use the bigger flavor) | 17:52 |
mgagne | yeah, local disk will always be faster | 17:53 |
mgagne | IIRC RAID10 with SSD is used on the compute nodes. | 17:54 |
clarkb | oh ya in that case maybe I should just go ahead and build a new server on the bigger flavor | 17:54 |
clarkb | fungi: ^ | 17:54 |
fungi | i concur | 17:56 |
*** jklare has quit IRC | 17:56 | |
fungi | sounds like the best way forward at this point | 17:56 |
clarkb | ok new mirror is building. I'll push up inventory and dns changes once that info is known | 17:58 |
fungi | thanks! | 17:59 |
*** jklare has joined #openstack-infra | 18:00 | |
*** derekh has quit IRC | 18:00 | |
*** openstackgerrit has joined #openstack-infra | 18:09 | |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Replace inap mirror with bigger instance https://review.opendev.org/696148 | 18:09 |
clarkb | I think we want to get in the dns update first | 18:09 |
clarkb | so that acme.sh will work | 18:10 |
openstackgerrit | Clark Boylan proposed opendev/zone-opendev.org master: Replace inap mirror https://review.opendev.org/696150 | 18:13 |
clarkb | fungi: ^ As soon as inap in-use count falls to 0 I think we can merge the dns update then the system-config update | 18:13 |
clarkb | in use count is 80 and falling | 18:14 |
*** armax has quit IRC | 18:14 | |
*** pkopec has quit IRC | 18:22 | |
*** ijw has quit IRC | 18:26 | |
*** jaosorior has joined #openstack-infra | 18:26 | |
clarkb | gonna step out for a bit before the meeting | 18:26 |
*** jpena is now known as jpena|off | 18:31 | |
*** rosmaita has quit IRC | 18:32 | |
*** openstackgerrit has quit IRC | 18:35 | |
*** ijw has joined #openstack-infra | 18:37 | |
fungi | http://grafana.openstack.org/d/ykvSNcImk/nodepool-inap?orgId=1 says 48 in use | 18:38 |
*** ijw has quit IRC | 18:40 | |
*** ijw has joined #openstack-infra | 18:40 | |
*** goldyfruit___ has joined #openstack-infra | 18:41 | |
*** goldyfruit_ has quit IRC | 18:43 | |
*** rosmaita has joined #openstack-infra | 18:46 | |
*** electrofelix has quit IRC | 18:50 | |
*** ralonsoh has quit IRC | 18:51 | |
*** goldyfruit_ has joined #openstack-infra | 18:51 | |
*** goldyfruit___ has quit IRC | 18:54 | |
*** dpawlik has joined #openstack-infra | 18:55 | |
*** aedc has joined #openstack-infra | 18:57 | |
*** dpawlik has quit IRC | 18:59 | |
*** ijw_ has joined #openstack-infra | 19:03 | |
*** ijw has quit IRC | 19:06 | |
*** goldyfruit_ has quit IRC | 19:07 | |
*** lpetrut has quit IRC | 19:08 | |
*** goldyfruit has joined #openstack-infra | 19:09 | |
*** goldyfruit_ has joined #openstack-infra | 19:19 | |
*** eharney has joined #openstack-infra | 19:20 | |
*** goldyfruit has quit IRC | 19:22 | |
*** jamesmcarthur has joined #openstack-infra | 19:36 | |
*** aedc has quit IRC | 19:38 | |
*** aedc has joined #openstack-infra | 19:39 | |
*** eharney has quit IRC | 19:46 | |
diablo_rojo | Sorry I missed the meeting. thanks fungi for covering the attachment stuff! | 19:51 |
fungi | np! | 19:51 |
*** rlandy is now known as rlandy|brb | 19:56 | |
*** diga_ has quit IRC | 19:57 | |
*** openstackgerrit has joined #openstack-infra | 20:01 | |
openstackgerrit | Merged opendev/zone-opendev.org master: Replace inap mirror https://review.opendev.org/696150 | 20:01 |
fungi | now that the meeting's over, i'm going to go get some very late lunch and run pre-holiday errands, back laterish | 20:06 |
*** aedc has quit IRC | 20:10 | |
*** rlandy|brb is now known as rlandy | 20:10 | |
*** goldyfruit___ has joined #openstack-infra | 20:17 | |
*** goldyfruit_ has quit IRC | 20:20 | |
*** goldyfruit_ has joined #openstack-infra | 20:21 | |
*** goldyfruit___ has quit IRC | 20:24 | |
*** nhicher has quit IRC | 20:33 | |
*** nhicher has joined #openstack-infra | 20:33 | |
clarkb | I've approved https://review.opendev.org/#/c/696148/ google dns reports the new ip for the mirror now so all the acme stuff should be ready to go | 20:34 |
*** nhicher has quit IRC | 20:35 | |
*** nhicher has joined #openstack-infra | 20:39 | |
*** ccamacho has quit IRC | 20:46 | |
*** tosky has quit IRC | 20:46 | |
*** dpawlik has joined #openstack-infra | 20:55 | |
openstackgerrit | Merged opendev/system-config master: Replace inap mirror with bigger instance https://review.opendev.org/696148 | 20:56 |
*** dpawlik has quit IRC | 21:00 | |
*** jtomasek has quit IRC | 21:00 | |
*** armax has joined #openstack-infra | 21:03 | |
*** rfolco has quit IRC | 21:14 | |
clarkb | inap mirror is ansibling now | 21:20 |
openstackgerrit | Clark Boylan proposed openstack/project-config master: Revert "Disable inap" https://review.opendev.org/696193 | 21:25 |
clarkb | we aren't ready for ^ yet, but wanted it to get through check so that it is ready when we are ready | 21:25 |
*** threestrands has joined #openstack-infra | 21:27 | |
openstackgerrit | Ian Wienand proposed openstack/project-config master: zuul layout: include openstacksdk in zuul tenant for jobs https://review.opendev.org/696194 | 21:29 |
ianw | fungi /clarkb: ^ this would be helpful so i can flesh out the images stuff into working changes | 21:30 |
clarkb | ianw: we might want to sort out the zuul tenant's config errors first? | 21:30 |
clarkb | AJaeger pointed them out and has fixes by way of including more projects in the zuul tenant (which is technically fine but the problems are related to the dns test jobs and similar which I wouldn't expect us to care about in the zuul tenant) | 21:31 |
clarkb | I've not had time to look at it beyond that though | 21:31 |
ianw | hrm, ok ,will look, we did similar for dib and it didn't make things worse, at least | 21:32 |
*** igordc has quit IRC | 21:33 | |
ianw | Job dib-functests-base not defined | 21:34 |
ianw | that's weird, that seems like it's all in the one repo | 21:35 |
*** ijw_ has quit IRC | 21:36 | |
*** cloudnull has quit IRC | 21:37 | |
*** d34dh0r53 has quit IRC | 21:39 | |
*** igordc has joined #openstack-infra | 21:39 | |
*** cloudnull has joined #openstack-infra | 21:39 | |
*** d34dh0r53 has joined #openstack-infra | 21:39 | |
ianw | i agree with the other missing projects, though | 21:48 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Add necessary ansible vars for inap mirror LE https://review.opendev.org/696195 | 21:53 |
clarkb | ianw: fungi ^ I forgot to add that (this is why we don't have an apache or certs on the new host yet) | 21:53 |
ianw | oh yeah, the number change | 21:53 |
*** diablo_rojo has quit IRC | 21:56 | |
*** jaosorior has quit IRC | 21:59 | |
openstackgerrit | Vitaliy Lotorev proposed zuul/zuul master: doc: Clarify that some regexp has restricted syntax https://review.opendev.org/695991 | 22:01 |
*** diablo_rojo has joined #openstack-infra | 22:02 | |
*** ociuhandu has joined #openstack-infra | 22:02 | |
*** rcernin has joined #openstack-infra | 22:04 | |
openstackgerrit | Vitaliy Lotorev proposed zuul/zuul master: doc: Document regexp usage https://review.opendev.org/695991 | 22:05 |
*** ociuhandu has quit IRC | 22:08 | |
*** pcaruana has quit IRC | 22:16 | |
*** slaweq has quit IRC | 22:37 | |
*** pkopec has joined #openstack-infra | 22:40 | |
tonyb | clarkb, ianw: FWIW, I tried running gitea locally to see if I can get more data / debugging info but so far I haven't made a lot of progress. | 22:40 |
tonyb | clarkb, ianw: My nova repo gets 'wedged' every 2nd day but I have a work around and as I'm the only one seeing it I don't think it's anything like a high priority | 22:42 |
*** armax has quit IRC | 22:42 | |
fungi | you're not the only one seeing it, since we can all replicate the issue with a copy of your clone | 22:43 |
fungi | you're just apparently the only one inconvenienced enough by it to give us a heads-up | 22:43 |
*** gfidente has quit IRC | 22:46 | |
tonyb | fungi: fair. I'd really like for someone to reproduce it *without* my repo. Not that I've done anything funky with my repo but still. | 22:47 |
fungi | git fsck doesn't think you've adulterated that repo at the least | 22:49 |
tonyb | fungi: Yeah, I mean it's a repo I've been using for 5ish years but I've only done things you'd expect in the repo | 22:50 |
tonyb | I really want to reproduce it with a local gitea, I think that's the only way we're going to make progress | 22:51 |
fungi | yep, a local master branch tip deployment even | 22:54 |
fungi | i keep reminding myself they're all volunteers with day jobs too | 22:55 |
fungi | the easier we can make it to confirm this bug, the better | 22:55 |
*** dpawlik has joined #openstack-infra | 22:56 | |
*** dpawlik has quit IRC | 23:01 | |
*** rh-jelabarre has quit IRC | 23:06 | |
*** tkajinam has joined #openstack-infra | 23:08 | |
*** slaweq has joined #openstack-infra | 23:11 | |
*** goldyfruit___ has joined #openstack-infra | 23:13 | |
*** slaweq has quit IRC | 23:15 | |
*** goldyfruit_ has quit IRC | 23:16 | |
*** goldyfruit_ has joined #openstack-infra | 23:17 | |
*** goldyfruit___ has quit IRC | 23:20 | |
*** dchen has joined #openstack-infra | 23:23 | |
*** pkopec has quit IRC | 23:28 | |
*** ociuhandu has joined #openstack-infra | 23:30 | |
clarkb | arg the mirror test failed on the fix for inap le | 23:32 |
clarkb | I've rechecked it | 23:32 |
ianw | hrm what was the failure? | 23:34 |
clarkb | still looking | 23:34 |
*** ociuhandu has quit IRC | 23:35 | |
clarkb | https://zuul.opendev.org/t/openstack/build/3681d241ffc84924ac20107828374437/log/job-output.txt#2757 | 23:35 |
ianw | hrm, we don't capture the journal that would tell us why apache was unhappy | 23:35 |
clarkb | ara doesn't have any more than that job-output file either | 23:36 |
clarkb | has the "look in journalctl -xe" message | 23:37 |
ianw | it suggests invalid certs somehow, but the letsencrypt bits seemed to work. it should have self-signed certs deployed | 23:37 |
*** armax has joined #openstack-infra | 23:38 | |
ianw | https://zuul.opendev.org/t/openstack/build/3681d241ffc84924ac20107828374437/log/mirror01.openafs.provider.opendev.org/syslog.txt.gz#1460 | 23:39 |
ianw | actually we do capture it | 23:39 |
clarkb | oh! | 23:39 |
ianw | ... so ... why did that not generate a self-signed cert | 23:39 |
clarkb | https://storage.bhs1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_368/696195/1/gate/system-config-run-mirror/3681d24/bridge.openstack.org/ara-report/result/7c5e2ae6-30cf-4a12-9365-93801c1bb0a4/ that task was skipped | 23:40 |
ianw | https://zuul.opendev.org/t/openstack/build/3681d241ffc84924ac20107828374437/log/job-output.txt#2507 ... it looks like it ran acme.sh, but didn't get a txt record | 23:41 |
ianw | it doesn't seem to have captured *any* output from that | 23:41 |
clarkb | ya acme_txt_required is an empty list | 23:41 |
clarkb | I wonder if this is the same thing we saw on gitea06? | 23:41 |
clarkb | where it seems to do everything but fails anyway? | 23:42 |
ianw | https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/letsencrypt-acme-sh-install/files/driver.sh | 23:43 |
ianw | hrm, we should capture the log file, that would tell us what went wrong | 23:43 |
*** eernst has joined #openstack-infra | 23:44 | |
ianw | we do for the letsencrypt tests, but not the mirror jobs | 23:44 |
*** rfolco has joined #openstack-infra | 23:47 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: mirror jobs: copy acme.sh output https://review.opendev.org/696208 | 23:47 |
*** eernst has quit IRC | 23:48 | |
ianw | but, it still should have failed i would have thought, at least it would stop things earlier | 23:48 |
clarkb | I think the list being empty means nothing to do so it succeeds anyway | 23:49 |
*** goldyfruit_ has quit IRC | 23:49 | |
clarkb | I wonder if we can check that (have something to see if a record is expected? then fail if list is empty anyway?) | 23:49 |
ianw | looking at this with fresh eyes ... https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/letsencrypt-acme-sh-install/files/driver.sh#L39 would do a terrible job when there's a failure | 23:50 |
clarkb | not sure if we know that or if we rely on acme.sh to figure it out | 23:50 |
ianw | it will swallow it | 23:50 |
clarkb | ianw: the log file would have it though right? | 23:50 |
ianw | we'll see it in the log files (from the tee) but it won't fail the script | 23:51 |
*** rfolco has quit IRC | 23:51 | |
*** rfolco has joined #openstack-infra | 23:51 | |
ianw | PIPESTATUS is probably the best way to handle this, do a post-check | 23:52 |
clarkb | should we set pipefail and errexit? | 23:52 |
ianw | i think not because that will bail it before it gets to the tee to stash it in the logfile | 23:53 |
clarkb | good point | 23:53 |
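The PIPESTATUS post-check ianw suggests might look like the sketch below. It is not the actual driver.sh; `run_logged` and the temporary log path are hypothetical names. The command's output still reaches the log through `tee`, and the command's own exit code is checked afterwards instead of being masked by `tee`'s.

```bash
#!/usr/bin/env bash
# Sketch: keep logging via tee but still notice when the command fails.
# Usage: run_logged LOGFILE CMD [ARGS...]
run_logged() {
    local logfile="$1" rc
    shift
    "$@" 2>&1 | tee -a "$logfile"
    # PIPESTATUS[0] is the exit code of the command itself, not of tee.
    # It must be read immediately; any later command overwrites PIPESTATUS.
    rc="${PIPESTATUS[0]}"
    if [ "$rc" -ne 0 ]; then
        echo "command failed with exit code $rc: $*" | tee -a "$logfile" >&2
    fi
    return "$rc"
}

# Example: the failure is logged and still propagated to the caller.
logfile="$(mktemp)"
run_logged "$logfile" sh -c 'echo issuing; exit 3' || echo "caught failure, log kept in $logfile"
```

Since a non-zero code may not always mean a real failure (ianw wants to double check the renewal-not-required case), the post-check may need to allow specific codes rather than treating everything but 0 as an error.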
openstackgerrit | Merged openstack/diskimage-builder master: Add IPv6 support in dhcp-all-interfaces https://review.opendev.org/692110 | 23:54 |
*** rfolco has quit IRC | 23:55 | |
*** rfolco has joined #openstack-infra | 23:55 | |
*** rlandy has quit IRC | 23:58 | |
ianw | have to double check the exit code when renewal isn't required | 23:58 |
*** rlandy has joined #openstack-infra | 23:58 | |
*** diablo_rojo has quit IRC | 23:58 |