*** mlavalle has quit IRC | 00:09 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: Add ensure-virtualenv https://review.opendev.org/723309 | 00:10 |
---|---|---|
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: functests: use ensure-virtualenv https://review.opendev.org/723316 | 00:30 |
openstackgerrit | Merged opendev/system-config master: Improve zuul-web apache config https://review.opendev.org/723711 | 00:38 |
openstackgerrit | Ian Wienand proposed openstack/project-config master: Activate more plain nodes https://review.opendev.org/723769 | 01:20 |
*** DSpider has joined #opendev | 02:28 | |
openstackgerrit | Ian Wienand proposed openstack/project-config master: Activate more plain nodes https://review.opendev.org/723769 | 02:58 |
openstackgerrit | Ian Wienand proposed openstack/project-config master: Remove tumbleweed-plain images https://review.opendev.org/723780 | 02:58 |
openstackgerrit | Merged openstack/project-config master: Remove tumbleweed-plain images https://review.opendev.org/723780 | 03:28 |
openstackgerrit | Merged zuul/zuul-jobs master: k8-logs: use failed_when: instead of ignore_errors: https://review.opendev.org/723647 | 03:56 |
openstackgerrit | Merged zuul/zuul-jobs master: container-logs: use failed_when: instead of ignore_errors: https://review.opendev.org/723648 | 03:57 |
openstackgerrit | Jan Zerebecki proposed openstack/diskimage-builder master: Retry git clone/fetch on timeout https://review.opendev.org/721581 | 04:01 |
*** ykarel|away is now known as ykarel | 04:28 | |
*** sshnaidm|afk is now known as sshnaidm|off | 04:34 | |
ianw | dirk / cmurphy / AJaeger : so with dib 2.36.0 released we now have fresh builds of opensuse-15 and opensuse-tumbleweed images (tumbleweed just finished) | 04:47 |
ianw | i would like to move quickly on getting rid of pip-and-virtualenv from suse, as i think it doesn't have too much exposure to areas where that will be difficult | 04:47 |
ianw | https://review.opendev.org/723769 will add opensuse-15-plain nodes, then i can double check devstack and keystone (https://review.opendev.org/723762) but i expect it to "just work" | 04:48 |
ianw | if there's more areas where suse might be used outside of devstack LMN | 04:48 |
*** ysandeep|away is now known as ysandeep | 05:21 | |
AJaeger | thanks, ianw | 05:42 |
*** dpawlik has joined #opendev | 06:06 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add ensure-virtualenv https://review.opendev.org/723309 | 06:09 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml https://review.opendev.org/723809 | 06:22 |
openstackgerrit | Merged openstack/project-config master: Activate more plain nodes https://review.opendev.org/723769 | 06:32 |
openstackgerrit | Merged openstack/project-config master: Normalize projects.yaml https://review.opendev.org/723809 | 06:55 |
*** rpittau|afk is now known as rpittau | 07:34 | |
*** ykarel is now known as ykarel|afk | 07:35 | |
*** tosky has joined #opendev | 07:36 | |
*** ralonsoh has joined #opendev | 07:52 | |
*** ykarel|afk is now known as ykarel | 07:58 | |
*** diablo_rojo_phon has joined #opendev | 07:59 | |
*** ysandeep is now known as ysandeep|lunch | 08:22 | |
*** ykarel is now known as ykarel|lunch | 08:39 | |
*** ysandeep|lunch is now known as ysandeep | 08:53 | |
*** andreykurilin has joined #opendev | 08:57 | |
AJaeger | infra-root, Zuul is having problems, there are internal server errors | 08:57 |
openstackgerrit | Rico Lin proposed opendev/irc-meetings master: Switch Multi-Arch SIG meeting schedule https://review.opendev.org/723848 | 08:58 |
AJaeger | https://zuul.opendev.org/api/tenant/openstack/status gives 500 | 08:58 |
AJaeger | #status alert Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved. | 09:00 |
openstackstatus | AJaeger: sending alert | 09:00 |
AJaeger | infra-root, https://review.opendev.org/#/c/723148/ shows "EXCEPTION" on changes, Zuul does not look happy at all | 09:00 |
-openstackstatus- NOTICE: Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved. | 09:00 | |
*** ChanServ changes topic to "Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved." | 09:00 | |
frickler | GearmanError("Unable to submit job to any connected servers") | 09:04 |
frickler | I vaguely remember something in backlog about gearman stopping, but don't want to fiddle too much with it right now, let's wait for the usual suspects to show up | 09:05 |
openstackstatus | AJaeger: finished sending alert | 09:06 |
AJaeger | #status alert Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved. | 09:10 |
openstackstatus | AJaeger: sending alert | 09:10 |
AJaeger | frickler: ok, let me make my statement clearer and spam a bit more if this takes longer. Thanks for checking. | 09:11 |
-openstackstatus- NOTICE: Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved. | 09:11 | |
*** ChanServ changes topic to "Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved." | 09:11 | |
frickler | Apr 28 08:52:59 zuul01 kernel: [71922766.912535] zuul-scheduler invoked oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0 | 09:14 |
openstackstatus | AJaeger: finished sending alert | 09:17 |
AJaeger | http://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=64794&rra_id=0&view_type=tree&graph_start=1587979104&graph_end=1588065504 | 09:18 |
AJaeger | http://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=64792&rra_id=0&view_type=tree&graph_start=1587979104&graph_end=1588065504 | 09:19 |
AJaeger | wow | 09:19 |
AJaeger | no wonder | 09:19 |
iurygregory | nice graphs =) | 09:34 |
*** ykarel|lunch is now known as ykarel | 09:48 | |
*** avass has joined #opendev | 10:03 | |
*** owalsh has joined #opendev | 10:09 | |
*** rpittau is now known as rpittau|bbl | 10:10 | |
*** ysandeep is now known as ysandeep|afk | 11:07 | |
openstackgerrit | Jan Zerebecki proposed openstack/diskimage-builder master: Retry zypper when refresh failed https://review.opendev.org/721587 | 11:12 |
openstackgerrit | Jan Zerebecki proposed openstack/diskimage-builder master: Retry git clone/fetch on timeout https://review.opendev.org/721581 | 11:14 |
*** tosky has quit IRC | 11:23 | |
*** iurygregory has quit IRC | 11:26 | |
*** iurygregory has joined #opendev | 11:26 | |
*** tosky has joined #opendev | 11:54 | |
mordred | AJaeger: jeez. what a nice thing | 11:58 |
*** panda|ruck has joined #opendev | 12:02 | |
mordred | infra-root: zuul scheduler is really unhappy - I don't know that I can debug it any further than it is now, but I could certainly restart it | 12:07 |
mordred | I'm not sure whether it's better to do that - which would potentially lose some debugging context - or wait for corvus to be awake, which might be a little longer | 12:08 |
*** rpittau|bbl is now known as rpittau | 12:08 | |
* mordred is leaning towards restarting - since we're pretty dead in the water right now ... | 12:10 | |
mordred | infra-root, config-core: ^^ any input? | 12:11 |
frickler | mordred: from the cacti graphs, we seem to have a memory leak, so after a restart that will likely reappear in a day or two | 12:11 |
frickler | mordred: which makes me support the restart | 12:11 |
mordred | yeah. likely to recur again - hopefully at a time it can be investigated | 12:11 |
* mordred restarting zuul | 12:11 | |
mordred | fwiw: http://paste.openstack.org/show/792821/ | 12:14 |
mordred | warnings in the log | 12:14 |
AJaeger | mordred: I'll take care of those warnings | 12:15 |
mordred | ok. zuul is back up | 12:16 |
AJaeger | thanks, mordred | 12:18 |
AJaeger | mordred: shall we give the all-green again? | 12:19 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Test zuul-executor on focal https://review.opendev.org/723528 | 12:19 |
AJaeger | like status ok Zuul has been restarted, all events are lost, recheck or re-approve any changes submitted since 9:50 UTC. | 12:20 |
AJaeger | #status ok Zuul has been restarted, all events are lost, recheck or re-approve any changes submitted since 9:50 UTC. | 12:23 |
openstackstatus | AJaeger: sending ok | 12:23 |
*** ChanServ changes topic to "OpenDev is a space for collaborative Open Source software development | https://opendev.org/ | channel logs http://eavesdrop.openstack.org/irclogs/%23opendev/" | 12:23 | |
-openstackstatus- NOTICE: Zuul has been restarted, all events are lost, recheck or re-approve any changes submitted since 9:50 UTC. | 12:23 | |
openstackstatus | AJaeger: finished sending ok | 12:30 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx https://review.opendev.org/722339 | 12:30 |
*** dpawlik has quit IRC | 12:36 | |
*** hashar has joined #opendev | 12:37 | |
*** dpawlik has joined #opendev | 12:40 | |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Update to tip of master in periodic jobs https://review.opendev.org/723889 | 12:55 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Base run from master flag off of zuul pipeline https://review.opendev.org/723890 | 12:55 |
*** ykarel is now known as ykarel|afk | 13:01 | |
*** roman_g has joined #opendev | 13:02 | |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Base run from master flag off of zuul pipeline https://review.opendev.org/723890 | 13:02 |
fungi | around now, looks like there was some excitement? | 13:07 |
fungi | i guess the queue backups weren't viable to reenqueue from? | 13:08 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Test zuul-executor on focal https://review.opendev.org/723528 | 13:10 |
*** avass is now known as Guest80970 | 13:10 | |
AJaeger | fungi: they were 4 hours old... | 13:12 |
AJaeger | (if they existed) | 13:12 |
fungi | yeah, looks like we have at best a couple hours of status snapshots retained before they're cleaned up | 13:16 |
fungi | though we had a bunch of old snapshots from before friday's maintenance when the names of the files changed | 13:21 |
fungi | i just cleaned those out with a `find /var/lib/zuul/backup/ -mtime +2 -name \*.json|xargs rm` | 13:21 |
fungi | yeah, we keep between 2-3 hours of snapshots, because we record a snapshot every minute, and then once an hour we delete all but the last 120 of them | 13:23 |
AJaeger | ah, interesting | 13:23 |
mordred | fungi: maybe we should update the curl to not save if it gets an error | 13:23 |
fungi | the manpage mentions a --fail-early option | 13:27 |
fungi | oh, looks like --fail is what we actually want | 13:27 |
mordred | fungi: yeah - I agree | 13:28 |
fungi | the description for that option describes our case | 13:28 |
fungi | don't record the error document, just exit nonzero | 13:29 |
mordred | well ... | 13:29 |
mordred | we don't actually get an error document | 13:30 |
mordred | or - maybe we do and my current test is bong | 13:32 |
fungi | oh, we do get an error document | 13:32 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Add --fail flag to zuul status backup curl https://review.opendev.org/723896 | 13:33 |
fungi | look at /var/lib/zuul/backup/openstack_status_1588071601.json for example | 13:33 |
openstackgerrit | Merged opendev/irc-meetings master: Switch Multi-Arch SIG meeting schedule https://review.opendev.org/723848 | 13:33 |
fungi | it's full of html which includes a bit that says <h2>500 Internal Server Error</h2> | 13:33 |
mordred | fungi: yah - but I betcha we also threw a 500 | 13:33 |
mordred | right | 13:34 |
mordred | ? | 13:34 |
fungi | presumably, maybe syslog has the stderr from it | 13:34 |
fungi | nop | 13:34 |
fungi | e | 13:34 |
fungi | 2>/dev/null | 13:34 |
fungi | so we're silencing stderr which is probably where you'd see it echoed | 13:34 |
mordred | oh well | 13:35 |
fungi | also, any idea where the zuul-scheduler-status-kata-containers and zuul-scheduler-status-prune-kata-containers cronjobs are coming from? | 13:38 |
fungi | are those cruft? | 13:38 |
fungi | it claims they were put there by ansible | 13:38 |
fungi | oh, nevermind, we have a {tenant} jinja replacement | 13:38 |
fungi | so we're instantiating that backup for the openstack and kata-containers tenants only, at the moment | 13:39 |
mordred | yeah | 13:45 |
mordred | maybe we should install it for all tenants? | 13:45 |
mordred | fungi: next time you have a sec, mind looking at https://review.opendev.org/#/c/723022/ ? | 13:46 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: haskell-stack-test: add haskell tool stack test https://review.opendev.org/723263 | 13:54 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Test zuul-executor on focal https://review.opendev.org/723528 | 13:59 |
*** ykarel|afk is now known as ykarel | 14:00 | |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Update to tip of master in periodic jobs https://review.opendev.org/723889 | 14:07 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Base run from master flag off of zuul pipeline https://review.opendev.org/723890 | 14:07 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: haskell-stack-test: add haskell tool stack test https://review.opendev.org/723263 | 14:08 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: Use 'block: ... always: ...' instead of ignore_errors https://review.opendev.org/723640 | 14:08 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ensure-sphinx: use failed_when: false instead of ignore_errors: true https://review.opendev.org/723642 | 14:08 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: go: Use 'block: ... always: ...' and failed_when instead of ignore_errors https://review.opendev.org/723643 | 14:08 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ara-report: use failed_when: false instead of ignore_errors: true https://review.opendev.org/723644 | 14:08 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: use failed_when: instead of ignore_errors: https://review.opendev.org/723653 | 14:08 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: add-build-sshkey: use failed_when: instead of ignore_errors: https://review.opendev.org/723654 | 14:08 |
AJaeger | roman_g: could you abandon the open reviews for airship-in-a-bottle, please? | 14:11 |
openstackgerrit | Guilherme Steinmuller Pimentel proposed openstack/project-config master: Add vexxhost/google-directory-api-linux-agent https://review.opendev.org/723904 | 14:12 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: Use 'block: ... always: ...' instead of ignore_errors https://review.opendev.org/723640 | 14:14 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ensure-sphinx: use failed_when: false instead of ignore_errors: true https://review.opendev.org/723642 | 14:14 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: go: Use 'block: ... always: ...' and failed_when instead of ignore_errors https://review.opendev.org/723643 | 14:14 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ara-report: use failed_when: false instead of ignore_errors: true https://review.opendev.org/723644 | 14:14 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: use failed_when: instead of ignore_errors: https://review.opendev.org/723653 | 14:14 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: add-build-sshkey: use failed_when: instead of ignore_errors: https://review.opendev.org/723654 | 14:14 |
roman_g | AJaeger: will do today. | 14:15 |
AJaeger | great | 14:17 |
*** ysandeep|afk is now known as ysandeep | 14:19 | |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx https://review.opendev.org/722339 | 14:20 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: Use 'block: ... always: ...' instead of ignore_errors https://review.opendev.org/723640 | 14:20 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ensure-sphinx: use failed_when: false instead of ignore_errors: true https://review.opendev.org/723642 | 14:20 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: go: Use 'block: ... always: ...' and failed_when instead of ignore_errors https://review.opendev.org/723643 | 14:20 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ara-report: use failed_when: false instead of ignore_errors: true https://review.opendev.org/723644 | 14:20 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: use failed_when: instead of ignore_errors: https://review.opendev.org/723653 | 14:20 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: add-build-sshkey: use failed_when: instead of ignore_errors: https://review.opendev.org/723654 | 14:20 |
openstackgerrit | Merged opendev/system-config master: Use the sync-project-config role in service-zuul https://review.opendev.org/723022 | 14:28 |
*** rpittau is now known as rpittau|brb | 14:32 | |
corvus | win 12 | 14:34 |
mordred | infra-root: https://review.opendev.org/#/c/723889/ is an impl of the "run from tip of master in periodic" we discussed yesterday. the followup https://review.opendev.org/#/c/723890 is a modification that I think might be more maintainable | 14:36 |
*** jrichard has joined #opendev | 14:39 | |
corvus | mordred: what's the thinking behind the naming of the var as "zuul_base_..."? | 14:41 |
jrichard | How do I get added as the first core reviewer for the https://opendev.org/starlingx/portieris-armada-app repo? | 14:42 |
fungi | jrichard: one of us adds you, and then you can add whoever else you need | 14:44 |
corvus | jrichard, fungi: done | 14:44 |
fungi | thanks corvus! you beat me to it | 14:44 |
jrichard | thanks | 14:46 |
*** jrichard has quit IRC | 14:46 | |
*** mlavalle has joined #opendev | 14:47 | |
mordred | corvus: well - now that you say it - bad thinking - my first patch was using the var in the run-base playbook - which was a mistake | 14:49 |
mordred | corvus: how about I name that something completely different? | 14:49 |
corvus | mordred: sounds great. maybe just squash all that too; i'm strongly in favor of the pipeline check. | 14:52 |
mordred | corvus: ok. cool | 14:52 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Update to tip of master in periodic jobs https://review.opendev.org/723889 | 14:53 |
mordred | corvus: done ^^ | 14:53 |
mordred | corvus: since we're doing scheduler-in-docker-compose now - would it maybe make sense to run gearman in a separate container instead of as a scheduler subprocess? that way if it gets oom-killered, compose could restart it? (obviously we don't want to be ooming) | 14:56 |
*** rpittau|brb is now known as rpittau | 14:57 | |
clarkb | fungi: at 09:30 ish today we see bup running in the dstat data. I think that rules bup out as a cause | 14:57 |
clarkb | fungi: then 10:16 ish we see a bunch of different listinfo processes | 14:58 |
clarkb | fungi: those are started by apache requests which makes me think that indexing bot is the next thing to rule out | 14:58 |
clarkb | fungi: any objection to adding a robots.txt to the apache docroot that tells SEMrush bot to go away? | 14:59 |
*** jrichard has joined #opendev | 14:59 | |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Test zuul-executor on focal https://review.opendev.org/723528 | 15:00 |
fungi | clarkb: yeah, looks like we had two oom kills today, 10:17:43 and 10:18:13 | 15:01 |
corvus | mordred: no objections | 15:01 |
*** ykarel is now known as ykarel|away | 15:01 | |
fungi | clarkb: sounds reasonable... https://www.semrush.com/bot/ https://www.knownhost.com/forums/threads/fighting-semrushbot.4461/ https://forums.digitalpoint.com/threads/how-to-block-semrushbot.2800415/ | 15:03 |
fungi | apparently we're not the only ones it's causing issues fr | 15:03 |
fungi | for | 15:03 |
clarkb | fungi: I do wonder if this is at all related to things jimmy is doing | 15:03 |
clarkb | since it's relatively recent | 15:03 |
roman_g | AJaeger: done | 15:12 |
*** hashar has quit IRC | 15:15 | |
AJaeger | roman_g: thanks. config-core, https://review.opendev.org/#/c/720160/ is ready to retire airship-in-a-bottle, please review | 15:22 |
clarkb | infra-root I've put a robots.txt in /var/www/ on lists.openstack.org. This seems to be served by apache properly. If this changes the OOMing situation we can encode that into puppet | 15:36 |
clarkb | http://lists.opendev.org/robots.txt if you want to see it | 15:37 |
mordred | clarkb: cool | 15:37 |
corvus | clarkb: ++ | 15:43 |
fungi | thanks clarkb!!! | 15:50 |
*** lpetrut has joined #opendev | 15:59 | |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx https://review.opendev.org/722339 | 16:03 |
mordred | clarkb: have a sec for https://review.opendev.org/#/c/723889/ ? | 16:05 |
mordred | clarkb: also https://review.opendev.org/#/c/723105/ | 16:05 |
clarkb | mordred: ya I need to reboot and then find food but reviewing things is on my todo list | 16:06 |
mordred | clarkb: cool. I think both of those would be nice to get in | 16:11 |
clarkb | mordred: corvus also I think we may need to restart apache on zuul scheduler to pick up the caching changes I made | 16:13 |
clarkb | we do seem to get gzipped status json now which is nice | 16:13 |
clarkb | but the js files may not be compressed yet? | 16:13 |
yoctozepto | morning | 16:16 |
AJaeger | config-core, https://review.opendev.org/#/c/720160/ is ready to retire airship-in-a-bottle, please review | 16:16 |
yoctozepto | any idea why zuul chooses python2 on aarch64 nodes in CI? | 16:16 |
yoctozepto | https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_58b/723361/8/check-arm64/kolla-ansible-debian-source-aarch64/58bbcb8/zuul-info/host-info.primary.yaml | 16:17 |
clarkb | yoctozepto: I don't think we explicitly set it? Ansible does autodetection | 16:17 |
clarkb | you can override that though | 16:18 |
*** rpittau is now known as rpittau|afk | 16:18 | |
clarkb | mordred: does setting a git repo like https://review.opendev.org/#/c/723889/3/playbooks/zuul/run-production-playbook.yaml imply update to latest? | 16:18 |
yoctozepto | yeah, it's odd because it does fine on x86_64 though | 16:18 |
clarkb | mordred: looks like update yes is the default? | 16:18 |
yoctozepto | hmm, looks like ansible has no debian defaults | 16:20 |
yoctozepto | and goes fallback | 16:20 |
yoctozepto | and first is /usr/bin/python | 16:20 |
yoctozepto | only then python3 | 16:21 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx https://review.opendev.org/722339 | 16:21 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: DNM Run builder tests on expanded node https://review.opendev.org/724079 | 16:21 |
yoctozepto | so it must be that x86_64 debian images have no python2 or it is linked via /usr/bin/python hmm | 16:21 |
openstackgerrit | Merged zuul/zuul-jobs master: Do not set buildset_fact if it's not present in results.json https://review.opendev.org/723524 | 16:22 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: DNM Run builder tests on expanded node https://review.opendev.org/724079 | 16:23 |
yoctozepto | hmm, seemingly it's the same one | 16:23 |
yoctozepto | need to dig more as to why it works then | 16:23 |
clarkb | yoctozepto: python == python2 | 16:23 |
yoctozepto | thanks clarkb for clearing up the dep | 16:23 |
clarkb | that should be pretty universal on all our systems | 16:23 |
mordred | clarkb: yes - I believe it does | 16:24 |
yoctozepto | clarkb: yeah, that's why it should fail on master :-) | 16:24 |
clarkb | yoctozepto: I don't understand what you mean by that | 16:24 |
clarkb | zuul's python should be completely independent of whatever you are testing's python | 16:24 |
yoctozepto | it fails on aarch64 because of python2 | 16:25 |
yoctozepto | yet not on x86_64 | 16:25 |
yoctozepto | yeah, I mean we might have some dependency on it, need to investigate eh | 16:25 |
yoctozepto | we = kolla-ansible | 16:25 |
*** DSpider has quit IRC | 16:26 | |
clarkb | mordred: comment on https://review.opendev.org/#/c/723889/3 | 16:27 |
yoctozepto | "The setuptools package must be installed for both the Ansible Python interpreter and for the version of Python specified by this option." | 16:30 |
yoctozepto | quirk of the year, thanks ansible! | 16:30 |
yoctozepto | so it's just that x86_64 image has python2 setuptools and it hides this quirk... | 16:31 |
AJaeger | config-core, please review https://review.opendev.org/#/c/723309/ and https://review.opendev.org/#/c/720160/ | 16:32 |
*** diablo_rojo has joined #opendev | 16:40 | |
*** DSpider has joined #opendev | 16:42 | |
clarkb | mordred: where does https://review.opendev.org/#/c/723105/2 set python2 for refstack? | 16:47 |
*** lpetrut has quit IRC | 16:51 | |
fungi | per distutils-sig ml, pip 20.1 is going to be released any moment | 16:52 |
fungi | keep an eye out for new disruption related to that | 16:52 |
mordred | clarkb: in a file on my laptop | 16:53 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Use python3 for ansible https://review.opendev.org/723105 | 16:54 |
mordred | clarkb: how's that? | 16:54 |
clarkb | mordred: +2 thanks. Also see note on the other change you linked | 16:55 |
mordred | clarkb: doh | 17:01 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Update to tip of master in periodic jobs https://review.opendev.org/723889 | 17:01 |
mordred | clarkb: thanks | 17:01 |
mordred | corvus: ^^ clarkb found an oops, can you re-review? | 17:02 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx https://review.opendev.org/722339 | 17:06 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: DNM Run builder tests on expanded node https://review.opendev.org/724079 | 17:06 |
*** roman_g has quit IRC | 17:12 | |
*** roman_g has joined #opendev | 17:13 | |
*** roman_g has quit IRC | 17:20 | |
clarkb | infra-root is now a good time to restart apache on zuul.open*.org? I think that is necessary to pick up the caching changes | 17:44 |
clarkb | (it's really hard to tell if that is working properly due to how apache hashes cached things) | 17:44 |
mordred | clarkb: wfm | 17:45 |
*** dpawlik has quit IRC | 17:48 | |
clarkb | actually I kind of want to remove /etc/apache2/sites-enabled/{4,5}0-zuul* files too | 17:49 |
clarkb | mordred: ^ those are files leftover from puppetry. Are you ok with my moving them aside and restarting to ensure the whole ansible deployed config is happy? | 17:49 |
clarkb | that should help reduce confusion in future debugging too | 17:49 |
mordred | clarkb: yeah please | 17:52 |
clarkb | cool I'll do that shortly | 17:52 |
clarkb | ok I've restarted apache and it is all happy | 18:10 |
clarkb | unfortunately, in the process I've realized the vhost update didn't get applied because this job failed https://zuul.opendev.org/t/openstack/build/d1ed5613516e4735a7a933d877821517 | 18:10 |
clarkb | mordred: ^ fyi | 18:10 |
clarkb | there is a bug in our zuul ansible /me tries to figure it out | 18:12 |
mnaser | hm | 18:14 |
mnaser | is the intermediate registry broken by any chance on opendev right now? | 18:14 |
mnaser | it seems to be returning 404 making jobs fail in post | 18:14 |
mnaser | https://zuul.opendev.org/t/vexxhost/build/2c5e2247629946b18c0d7372def74006 | 18:15 |
mordred | clarkb: it looks like it didn't run at all | 18:15 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Don't restart the zuul scheduler in prod https://review.opendev.org/724115 | 18:16 |
clarkb | mordred: ^ thats at least part of it | 18:16 |
mnaser | actually, looking again, that's 127.0.0.1 that it's not copying from aka buildset registry | 18:17 |
mnaser | so might be a zuul thing | 18:17 |
mordred | clarkb: poop. good catch | 18:17 |
clarkb | mordred: we run zuul-web after zuul-scheduler so we were bailing out before we got to web | 18:17 |
mordred | clarkb: nod | 18:17 |
clarkb | mnaser: I think 404s from the buildset registry are "normal" ? | 18:18 |
clarkb | mnaser: the docker config is supposed to also try docker hub as well iirc | 18:18 |
mnaser | clarkb: this 404 is when it's trying to do the skopeo copy to the intermediate registry | 18:18 |
mnaser | specifically inside the `push-to-intermediate-registry` role | 18:19 |
clarkb | ah ya that wouldn't be a case of normal 404s then | 18:19 |
mnaser | https://opendev.org/zuul/zuul-jobs/commit/6fb73060ec919d4e2364e418db84ce6aaa50492d seems to line up to the timeline of failures | 18:19 |
* mnaser moves to #zuul | 18:20 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: stat for result.json on the executor https://review.opendev.org/724116 | 18:26 |
mnaser | (for completion: that was a zuul-jobs issue we just fixed with ^ opendev is okay -- other than likely being affected by the same thing inside zuul/zuul-jobs) | 18:31 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Test zuul-executor on focal https://review.opendev.org/723528 | 18:34 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Revert "Do not set buildset_fact if it's not present in results.json" https://review.opendev.org/724120 | 18:35 |
*** ysandeep is now known as ysandeep|away | 18:37 | |
fungi | okay, pip 20.1 release is now up at https://pypi.org/project/pip/ so keep your eyes peeled | 18:37 |
fungi | also looks like the plan is that pip 21 (likely a few months out still) will be dropping python 2.7 support | 18:39 |
fungi | most of the meat of 20.1 is described in the 20.1b1 release notes | 18:40 |
fungi | pip freeze output for things installed from not-pypi (particularly distro packages) is changing | 18:41 |
fungi | builds now happen in-place instead of in a temporary location followed by a copy | 18:41 |
fungi | --user and --target options can't be combined any longer | 18:42 |
fungi | there's a preview implementation of the new resolver logic included, but it's disabled by default still | 18:43 |
fungi | invocation as python -m pip now removes the cwd from the import path | 18:44 |
fungi | resolvelib and toml (replacing pytoml) were added to the vendored modules, and versions were updated for vendored copies of certifi, contextlib2, distro, idna, msgpack, packaging, pep517, pyparsing, requests, and urllib3 | 18:46 |
fungi | those are the things most likely to impact jobs we run, i think | 18:47 |
fungi | https://pip.pypa.io/en/stable/news/ for the complete release notes | 18:47 |
*** ralonsoh has quit IRC | 18:51 | |
mordred | fungi: fingers crossed it doesn't break the world | 19:01 |
mordred | ianw, corvus, fungi: got a quick sec for https://review.opendev.org/#/c/723105 ? | 19:02 |
openstackgerrit | Merged zuul/zuul-jobs master: Revert "Do not set buildset_fact if it's not present in results.json" https://review.opendev.org/724120 | 19:09 |
openstackgerrit | Merged openstack/project-config master: Add vexxhost/google-directory-api-linux-agent https://review.opendev.org/723904 | 19:12 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Revert "Revert "Do not set buildset_fact if it's not present in results.json"" https://review.opendev.org/724132 | 19:13 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Revert "Revert "Do not set buildset_fact if it's not present in results.json"" https://review.opendev.org/724132 | 19:23 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Revert "Revert "Do not set buildset_fact if it's not present in results.json"" https://review.opendev.org/724132 | 19:24 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Don't restart the zuul scheduler in prod https://review.opendev.org/724115 | 19:25 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Revert "Revert "Do not set buildset_fact if it's not present in results.json"" https://review.opendev.org/724132 | 19:26 |
clarkb | mordred: https://review.opendev.org/724115 took a different approach there that hopefully makes testing happier | 19:26 |
mordred | clarkb: ++ | 19:29 |
openstackgerrit | Merged opendev/system-config master: Use python3 for ansible https://review.opendev.org/723105 | 19:53 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Test zuul-executor on focal https://review.opendev.org/723528 | 19:55 |
mordred | ianw: were you the one who tracked down ntp vs systemd-timesync before? | 19:56 |
mordred | ianw: if so - see ^^ - there's an issue in focal with systemd-timesync too | 19:56 |
ianw | mordred: that sounds like something simultaneously i remember and also have actively forgotten :) | 19:58 |
ianw | that goes back to ntp for focal? | 20:04 |
ianw | i'm thinking afs01.dfw.openstack.org is not happy | 20:09 |
ianw | ping but no ssh for me ... pulling up a console | 20:09 |
ianw | yeah, console there has ... bad stuff ... hung tasks etc and no response. going to reboot it | 20:11 |
fungi | i wouldn't be surprised if we get a ticket from rackspace in the next few minutes saying there's a problem with the hypervisor host that vm is on | 20:12 |
ianw | ok, it's back ... i'm not sure about static01 now though | 20:13 |
ianw | it's probably worth a reboot too with all the hung i/o apache processes | 20:13 |
ianw | #status log reboot afs01.dfw.openstack.org due to host hypervisor issues killing server | 20:14 |
openstackstatus | ianw: finished logging | 20:14 |
ianw | static rebooting | 20:14 |
ianw | it couldn't ls /afs/openstack.org | 20:15 |
*** rchurch has joined #opendev | 20:16 | |
smcginnis | That explains the failures I saw. | 20:17 |
ianw | ok it seems happy again | 20:17 |
ianw | #status log follow-up reboot of static01 after a lot of hung i/o processes due to afs01 issues | 20:18 |
openstackstatus | ianw: finished logging | 20:18 |
ianw | i think it's ok if a server just goes away ... but when they end up in a weird zombie state is when things go wrong | 20:18 |
ianw | afs server | 20:19 |
ianw | smcginnis: yeah, unfortunately with a server that busy that will have some effect on some jobs :/ | 20:19 |
smcginnis | It was a promote job, so no big deal. It will just get taken care of with the next merge. | 20:20 |
ianw | mordred: "Ubuntu 19.10's systemd package introduced /lib/systemd/system/systemd-timesyncd.service.d/disable-with-time-daemon.conf. This prevents systemd-timesyncd.service from starting if the ntp package has been installed. " | 20:23 |
ianw | https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1849156 | 20:23 |
openstack | Launchpad bug 1849156 in systemd (Ubuntu Eoan) "systemd-timesyncd.service broken on upgrade to 19.10 if ntp was installed" [High,Confirmed] | 20:23 |
ianw | mordred: yeah, we're preinstalling ntp during build -- https://nb04.opendev.org/ubuntu-focal-0000000004.log -- so i think that means systemd-timesyncd masks itself | 20:31 |
clarkb | ianw: thank you for taking care of afs | 20:38 |
*** diablo_rojo has quit IRC | 20:39 | |
ianw | np | 20:40 |
*** diablo_rojo has joined #opendev | 20:42 | |
clarkb | infra-root https://review.opendev.org/#/c/724115/ that passes testing now and should fix our zuul-scheduler ansible applications | 20:42 |
clarkb | (this will have the side effect of updating our apache vhost config to compress javascript and css files as they are quite large for zuul dashboard) | 20:43 |
ianw | i have to afk for a bit but lgtm | 20:50 |
openstackgerrit | Merged zuul/zuul-jobs master: Add ensure-virtualenv https://review.opendev.org/723309 | 20:51 |
clarkb | ianw: ^ fyi | 20:52 |
mordred | clarkb: so that it doesn't fall into the cracks: https://review.opendev.org/#/c/723896/ | 21:09 |
mordred | clarkb: (when we restarted zuul this morning after OOM, there were no status backups to replay) | 21:10 |
clarkb | mordred: looking | 21:10 |
mordred | clarkb, ianw: I'm landing the patch to update nodepool system-config tests to run a real zookeeper. it should have no production impact - mostly just trying to squish the outstanding stacks | 21:12 |
mordred | clarkb: I think you could rebase your reorg of playbooks on top of https://review.opendev.org/#/c/720527/ - I don't expect we have much _structural_ outstanding | 21:13 |
fungi | yeah, it's maybe less awesome because it will now no longer record the content of error reports from the status api, but ultimately that was probably a poor means of debugging intermittent api errors regardless | 21:13 |
mordred | yeah | 21:13 |
fungi | counterargument, maybe hours-old status snapshots are useless | 21:14 |
fungi | it's a good coffee-talk topic | 21:14 |
mordred | maybe - unless zuul has been effectively AWOL the whole time | 21:14 |
mordred | so maybe they're better than nothing? but maybe they are useless | 21:14 |
clarkb | mordred: looking (fwiw I was able to reproduce the zuul test failures that happened in CI locally and so am being distracted trying to sort that out) | 21:17 |
mordred | ooh- that's exciting | 21:19 |
clarkb | hrm we don't collect logs on successful tests so hard to know if this exception is expected or not :) anyway I'm learning things /me dives back into the hole | 21:27 |
*** DSpider has quit IRC | 21:34 | |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Run zookeeper cluster in nodepool jobs https://review.opendev.org/720709 | 21:52 |
*** jrichard has left #opendev | 22:30 | |
openstackgerrit | Merged opendev/glean master: Add container build jobs https://review.opendev.org/723285 | 22:43 |
diablo_rojo | mordred, corvus would I be correct in my understanding you are working on getting jitsi set up? (I apologize for being very out of the loop) | 22:57 |
fungi | diablo_rojo: a poc is hosted at https://meetpad.opendev.org/ and there's a spec outlining the plan at https://docs.opendev.org/opendev/infra-specs/latest/specs/jitsi-meet.html | 23:02 |
corvus | diablo_rojo: yes, i think most of the infra team has pitched in on it actually :) -- we think it's basically done -- at least, done enough for more testing to find out what isn't done | 23:02 |
corvus | i have a small todo to add an http->https redirect (which is important because it does not work over http) | 23:03 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Organize zuul jobs in zuul.d/ dir https://review.opendev.org/722394 | 23:07 |
clarkb | mordred: corvus ^ rebased (actually I just redid it as I think that was simpler) | 23:07 |
diablo_rojo | I had an idea for testing | 23:08 |
diablo_rojo | End of release week I was planning a like.. celebration for ussuri. We could use that and see how it goes? | 23:08 |
diablo_rojo | I don't know how many people will join exactly. | 23:08 |
diablo_rojo | Figured I would throw it out there as an idea. | 23:08 |
diablo_rojo | Assuming that's not too soon? | 23:09 |
diablo_rojo | fungi, thank you! | 23:09 |
fungi | diablo_rojo: note that there's no dial-in access, it's internet-only | 23:09 |
diablo_rojo | ..I think that is okay? I dunno. Hadn't thought it all the way through really. | 23:10 |
corvus | diablo_rojo: that sounds like a great opportunity to throw people at the server and see what happens :) | 23:10 |
johnsom | mordred I see you have been doing some focal work. Have you run into an issue where the instance doesn't bring up the NIC? | 23:10 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Meetpad: redirect 80 to 443 https://review.opendev.org/724199 | 23:11 |
corvus | if i'm nginxing correctly, i think that's what's needed for the redirect ^ | 23:11 |
diablo_rojo | corvus, cool :) I can have a zoom link as a backup, but I figured it would be a good chance to test things in a more.. real world scenario? | 23:11 |
clarkb | johnsom: I don't think we've seen that but we also use glean | 23:11 |
johnsom | I think our issue is we are adding the ifupdown package, which seems to confuse the new networking in focal and/or cloud-init | 23:12 |
johnsom | Ok, thought I would ask before I get too deep into this strangeness. Thanks | 23:13 |
diablo_rojo | corvus, I'll get a link from you maybe early next week? (assuming that patch does the redirect you need) | 23:13 |
corvus | diablo_rojo: also note that meetpad is etherpad-focused, so if it works as designed, then people will see a big etherpad in the middle with faces on the side (but you can still turn on and off the etherpad individually). so it's not the best party venue... or, well, it's like a party venue with a giant whiteboard in the middle... which is great if you want to play ascii pictionary i guess. so, in short: | 23:14 |
corvus | awesome party venue. :) | 23:14 |
corvus | diablo_rojo: no need to wait, just head on over to https://meetpad.opendev.org/ any time | 23:14 |
corvus | diablo_rojo: (just make sure to use https:) | 23:15 |
corvus | (we just need to land that http redirect so that if someone types "meetpad.opendev.org" into their browser without a protocol they go to the right place) | 23:16 |
diablo_rojo | corvus, oh cool. Thanks and will do :) | 23:16 |
diablo_rojo | (and noted) | 23:16 |
*** tosky has quit IRC | 23:39 | |
SotK | ooh that etherpad integration is lovely | 23:46 |
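
The http to https redirect corvus describes at 23:03 and 23:16 (change 724199) is implemented in the web server config, which is not shown in this log. As a rough illustration of the behaviour only, here is a minimal Python sketch. The names `https_location` and `RedirectHandler` are hypothetical and exist only for this example, and the host handling assumes a plain hostname with an optional port (no IPv6 literals).

```python
from http.server import BaseHTTPRequestHandler


def https_location(host_header: str, path: str) -> str:
    """Build the https:// URL that a plain-http request should be sent to.

    Hypothetical helper for illustration; not taken from the actual change.
    """
    # Drop any explicit port (e.g. "example.org:80") so the target uses
    # the default https port instead.
    host = host_header.split(":", 1)[0]
    return f"https://{host}{path}"


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET with a permanent redirect to the https URL."""

    def do_GET(self):
        # Fall back to "localhost" if the client sent no Host header.
        host = self.headers.get("Host", "localhost")
        self.send_response(301)
        self.send_header("Location", https_location(host, self.path))
        self.end_headers()
```

Passing `RedirectHandler` to `http.server.HTTPServer` on a local port would be enough to try this out. With such a redirect in place, someone typing the bare hostname into a browser lands on the https site, which is the case corvus calls out.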