*** dzho has joined #opendev | 01:17 | |
*** escarluka has joined #opendev | 01:33 | |
*** escarluka has quit IRC | 04:28 | |
openstackgerrit | Merged zuul/zuul-jobs master: Only run tests for ensure-bazel when it is updated https://review.opendev.org/724933 | 05:48 |
---|---|---|
*** redrobot has quit IRC | 05:48 | |
*** hiep_mq has joined #opendev | 05:54 | |
*** hiep_mq has quit IRC | 06:10 | |
*** DSpider has joined #opendev | 06:26 | |
*** rchurch has quit IRC | 06:44 | |
*** rchurch has joined #opendev | 06:47 | |
*** sgw has quit IRC | 07:22 | |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Fix irc-meetings publishing https://review.opendev.org/724964 | 07:25 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 08:36 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 08:43 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 08:48 |
*** tosky has joined #opendev | 08:48 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 08:54 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 08:55 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: ansible-lint: use matchplay instead of matchtask https://review.opendev.org/724910 | 09:17 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check blocks recursively for loops https://review.opendev.org/724967 | 09:30 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check blocks recursively for loops https://review.opendev.org/724967 | 09:47 |
*** iurygregory has quit IRC | 09:56 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check blocks recursively for loops https://review.opendev.org/724967 | 12:06 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check blocks recursively for loops https://review.opendev.org/724967 | 12:14 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check blocks recursively for loops https://review.opendev.org/724967 | 12:19 |
AJaeger | infra-root, could you check https://review.opendev.org/724964 - I noticed the irc-meetings job is failing every hour... | 12:30 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check blocks recursively for loops https://review.opendev.org/724967 | 13:01 |
openstackgerrit | Merged openstack/project-config master: Fix irc-meetings publishing https://review.opendev.org/724964 | 13:21 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check blocks recursively for loops https://review.opendev.org/724967 | 13:35 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check blocks recursively for loops https://review.opendev.org/724967 | 13:40 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check blocks recursively for loops https://review.opendev.org/724967 | 13:48 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: use zj_image instead of image as loopvar https://review.opendev.org/725012 | 13:48 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: use zj_log_file instead of item as loop_var https://review.opendev.org/725013 | 13:48 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Check blocks recursively for loops https://review.opendev.org/724967 | 13:53 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Update ansible-lint-rules testsuite to only test with the relevant rule https://review.opendev.org/725014 | 13:58 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Update ansible-lint-rules testsuite to only test with the relevant rule https://review.opendev.org/725014 | 14:07 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Update ansible-lint-rules testsuite to only test with the relevant rule https://review.opendev.org/725014 | 14:13 |
AJaeger | infra-root, publishing irc-meetings fails now in the next step - see https://zuul.opendev.org/t/openstack/build/62ecffee3a734a77b6a7c03d11f0b0e5 | 14:35 |
AJaeger | oh, found the typo... | 14:36 |
fungi | rsync: mkdir "/src/yaml2ical/calendars" | 14:36 |
fungi | hrm | 14:36 |
AJaeger | src -> srv | 14:37 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Fix typo in playbooks/yaml2ical/post.yaml https://review.opendev.org/725015 | 14:37 |
AJaeger | here's the fix ^ | 14:37 |
fungi | good eye! | 14:38 |
AJaeger | infra-root, infra-prod-run-accessbot fails as well - nothing visible from job output. See https://zuul.opendev.org/t/openstack/build/ba0b9eeb95074961b4832ad018c75a34 | 14:38 |
fungi | i'll check /var/log/ansible/run-accessbot.yaml.log on bridge.o.o | 14:39 |
fungi | the log wasn't much help, but i see there's something like 10 hanging ansible runs of that playbook, some nearly a week old | 14:41 |
fungi | root 15871 0.0 0.5 238828 47760 ? S Apr26 0:03 /usr/bin/python3 /usr/local/bin/ansible-playbook -v -f 5 /home/zuul/src/opendev.org/opendev/system-config/playbooks/run-accessbot.yaml | 14:41 |
fungi | that has a child process which is doing an ssh to 2001:4800:7818:104:be76:4eff:fe04:4887 | 14:43 |
fungi | reverse dns says it's | 14:43 |
fungi | eavesdrop01.openstack.org | 14:43 |
fungi | looking through the process list on master, we've also got some stuck manage-projects runs from weeks ago | 14:45 |
fungi | mordred: is there a safe way to time out that sort of stuff? | 14:46 |
fungi | i wonder if intermittent network issues are causing remote ansible calls to get stuck indefinitely and pile up | 14:46 |
fungi | all of these i'm finding, whether for accessbot or manage-projects, have hung doing this over ssh connections: /bin/sh -c 'python3 && sleep 0' | 14:48 |
fungi | and they're setting -o ControlMaster=auto -o ControlPersist=60s | 14:50 |
fungi | so maybe these are the persistent sockets never getting reaped? | 14:50 |
fungi | could be this is just a red herring | 14:50 |
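
Editorial note: on the question of timing out stuck runs, one generic option is to bound each invocation from the caller's side. The following is a minimal sketch, not what the opendev tooling does. `run_with_timeout` is a hypothetical helper name, and it only kills the direct child process, so ssh processes spawned further down the tree might still need separate cleanup.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd and return its exit code, or None if it exceeded the timeout.

    subprocess.run() kills the direct child when the timeout expires and
    then raises TimeoutExpired; grandchildren are not touched.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__":
    # A command that finishes quickly returns its normal exit code.
    print(run_with_timeout([sys.executable, "-c", "pass"], 30))
    # A command that sleeps past the limit is killed and reported as None.
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 1))
```

A wrapper like this would report a hung run as a failure instead of letting it pile up. Whether that is safe for a given playbook is a separate question.
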
openstackgerrit | Merged openstack/project-config master: Fix typo in playbooks/yaml2ical/post.yaml https://review.opendev.org/725015 | 14:52 |
*** zbr has quit IRC | 15:05 | |
*** zbr has joined #opendev | 15:05 | |
AJaeger | success - publish-irc-meetings works - https://zuul.opendev.org/t/openstack/build/9ec8f7d1bbec4980ba8402379b920a4a and eavesdrop has current timestamp | 15:14 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: dhall-diff: add new job https://review.opendev.org/718694 | 15:18 |
fungi | i've hup'd the older hung ssh processes on bridge.o.o associated with manage-projects and accessbot runs for the past few weeks | 15:19 |
AJaeger | thx, fungi | 15:23 |
fungi | well, i doubt that's going to solve whatever's going on with the accessbot runs | 15:25 |
fungi | so i'm going to see if i can find a record of what's going on with it on eavesdrop now | 15:25 |
fungi | the last thing it logged in /var/log/accessbot/ on eavesdrop01.openstack.org was 2020-04-02 22:19:02 | 15:27 |
fungi | but syslog says ansible is running it regularly: | 15:29 |
fungi | May 2 15:18:54 eavesdrop01 python3[8448]: ansible-command Invoked with _uses_shell=False removes=None strip_empty_ends=True chdir=None creates=None warn=True stdin_add_newline=True executable=None stdin=None _raw_params=/usr/local/bin/accessbot argv=None | 15:29 |
fungi | /usr/local/bin/accessbot is a wrapper script which does: exec docker run --rm --net=host -v/etc/accessbot:/etc/accessbot -v/var/log/accessbot:/var/log/accessbot docker.io/opendevorg/accessbot | 15:31 |
*** elod has quit IRC | 15:34 | |
fungi | this could be related though...? | 15:35 |
fungi | May 2 15:18:19 eavesdrop01 cc3116109c7b[6092]: /bin/sh: 1: Syntax error: Unterminated quoted string | 15:35 |
*** elod has joined #opendev | 15:35 | |
fungi | looks like maybe that's getting syslogged from inside a container | 15:35 |
AJaeger | is that happening regularly? | 15:42 |
fungi | each time it's invoked by ansible, i think | 15:42 |
fungi | so probably something's not right inside that container | 15:42 |
fungi | i'm going to try running it from a command-line on eavesdrop and see if it gives me anything useful that's getting swallowed by indirection | 15:43 |
AJaeger | I manually inspected docker/accessbot/accessbot.py - that looks fine | 15:43 |
AJaeger | fungi: good idea | 15:43 |
fungi | i did it from a root shell without the exec | 15:44 |
fungi | yep, that does indeed seem to be where the error's coming from: | 15:44 |
fungi | root@eavesdrop01:~# docker run --rm --net=host -v/etc/accessbot:/etc/accessbot -v/var/log/accessbot:/var/log/accessbot docker.io/opendevorg/accessbot | 15:44 |
fungi | /bin/sh: 1: Syntax error: Unterminated quoted string | 15:45 |
AJaeger | fungi: I have an idea - one moment | 15:47 |
openstackgerrit | Andreas Jaeger proposed opendev/system-config master: Remove incomplete args from accessbot/Dockerfile https://review.opendev.org/725021 | 15:49 |
AJaeger | fungi, is that the culprit ? ^ | 15:49 |
AJaeger | count the " chars... | 15:50 |
fungi | https://opendev.org/opendev/system-config/src/branch/master/docker/accessbot/Dockerfile#L21 has an extra ] | 15:51 |
fungi | oh, yeah, and a bare " | 15:52 |
fungi | AJaeger: i wonder if that was meant to append to /var/log/accessbot/accessbot.log and was merely unfinished | 15:52 |
fungi | mordred: that ^ appears to have been introduced by a change of yours, so maybe you know? | 15:53 |
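
Editorial note: the "count the quote chars" check can be roughly automated. This is a minimal sketch using Python's `shlex`, which only approximates `/bin/sh` quoting rules. `has_balanced_quotes` is a hypothetical helper, and the sample strings are made up for illustration, not taken from the real Dockerfile.

```python
import shlex


def has_balanced_quotes(command):
    """Return True if the command string tokenizes without an open quote.

    shlex.split() raises ValueError ("No closing quotation") when a quoted
    string is never terminated.
    """
    try:
        shlex.split(command)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    # Hypothetical command strings, for illustration only.
    print(has_balanced_quotes('echo "hello"'))  # properly quoted
    print(has_balanced_quotes('echo "hello'))   # bare, unterminated quote
```
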
openstackgerrit | Andreas Jaeger proposed opendev/system-config master: Remove incomplete args from accessbot/Dockerfile https://review.opendev.org/725021 | 15:57 |
AJaeger | fungi: /var/log/accessbot/accessbot.log would be inside the container, so question is how mordred envisioned the logging. | 15:57 |
AJaeger | my change above should fix it - not sure about the logging. Something to iterate on ;) | 15:58 |
fungi | oh, yep, it's running this which already redirects: https://opendev.org/opendev/system-config/src/branch/master/docker/accessbot/accessbot.sh#L17 | 15:58 |
mordred | fungi, AJaeger - yeah - we should be mounting /var/log/accessbot as well as redirecting to it | 15:59 |
AJaeger | the dockerfile calls docker/accessbot/accessbot.sh and that does the redirection, doesn't it? | 16:00 |
AJaeger | I think there's duplication between accessbot.sh and the Dockerfile | 16:00 |
mordred | yeah - but I think that error about unterminated string is bad | 16:01 |
mordred | and also that | 16:01 |
AJaeger | accessbot.sh has : "exec python /usr/local/bin/accessbot.py -c /etc/accessbot/accessbot.config -l /etc/accessbot/channels.yaml >> /var/log/accessbot/accessbot.log 2>&1" | 16:01 |
mordred | why don't we just make the Dockerfile run the script | 16:01 |
AJaeger | mordred, yeah - on it... | 16:01 |
mordred | I think that was likely the idea originally | 16:01 |
mordred | and then - wow | 16:01 |
openstackgerrit | Andreas Jaeger proposed opendev/system-config master: Fix accessbot/Dockerfile https://review.opendev.org/725021 | 16:02 |
AJaeger | this way ^ | 16:02 |
mordred | AJaeger: yes. - that | 16:03 |
fungi | i arrived at the same conclusion, +2 | 16:03 |
AJaeger | will either of you +A now that you both gave +2, please? | 16:04 |
fungi | just did | 16:04 |
AJaeger | thanks | 16:04 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: Add regex for matching ansible-lint output https://review.opendev.org/725027 | 16:04 |
fungi | thanks for fixing! | 16:05 |
AJaeger | mordred: infra-prod-remote-puppet-else is failing occasionally - and outputs quite a few ansible warnings. https://zuul.opendev.org/t/openstack/build/5ce68b5b2b2442f3a0ff67770c1c8a7e | 16:05 |
AJaeger | no idea on that one... | 16:05 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: Add regex for matching ansible-lint output https://review.opendev.org/725027 | 16:06 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: DNM: Test ansible-lint commenting https://review.opendev.org/725028 | 16:06 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: Add regex for matching ansible-lint output https://review.opendev.org/725027 | 16:12 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: DNM: Test ansible-lint commenting https://review.opendev.org/725028 | 16:12 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: update pep8 regex to not require column https://review.opendev.org/725030 | 16:29 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: DNM: Test ansible-lint commenting https://review.opendev.org/725028 | 16:29 |
openstackgerrit | Merged opendev/system-config master: Fix accessbot/Dockerfile https://review.opendev.org/725021 | 17:09 |
AJaeger | \o/ infra-prod-service-eavesdrop passed as deploy job after the merge | 17:17 |
fungi | thanks again for fixing, AJaeger!!! | 17:19 |
AJaeger | that was collaboration - thanks fungi and mordred as well | 17:31 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: update pep8 regex to not require column https://review.opendev.org/725030 | 17:34 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: DNM: Test ansible-lint commenting https://review.opendev.org/725028 | 17:34 |
*** iurygregory has joined #opendev | 18:49 | |
AJaeger | eavesdrop - but we've been looking at accessbot ;( | 18:51 |
AJaeger | Argh ;( | 18:51 |
AJaeger | https://zuul.opendev.org/t/openstack/build/eb449333e8ba470fb5f646ba98f880d7 failed for accessbot. any ideas why this time? | 18:52 |
AJaeger | mordred: "ansible-playbook -v -f 5 /home/zuul/src/opendev.org/opendev/system-config/playbooks/run-accessbot.yaml >> /var/log/ansible/run-accessbot.yaml.log | 18:53 |
AJaeger | and the file accessbot.sh redirects as well - that's duplication again, isn't it? | 18:54 |
AJaeger | mordred: but you mentioned mounting the log directory inside - could you look and fix on Monday. This can wait ;) | 18:55 |
fungi | i'll take another peek and see if there's better logged errors this time | 18:57 |
fungi | "TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items'" maybe a python3 incompatibility? http://paste.openstack.org/show/793002 | 19:01 |
fungi | probably need to recast them as list() before concatenating | 19:02 |
fungi | i'm in the midst of weekend yardwork, but can try to whip up a patch and check for any other py3k problems later today if nobody beats me to it | 19:02 |
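
Editorial note: the `TypeError` quoted above is standard Python 3 behaviour, because `dict.items()` returns a view object that does not support `+`. This is a minimal sketch of the failure and of the `list()` recast suggested in the log. The dictionaries and their contents are hypothetical and not taken from accessbot.py.

```python
# Hypothetical data, for illustration only.
defaults = {"alice": 30}
overrides = {"bob": 10}

# In Python 3, dict.items() returns a view, and views cannot be added.
try:
    defaults.items() + overrides.items()
except TypeError as exc:
    error_message = str(exc)

# Recasting each view as a list restores the Python 2 style concatenation.
combined = list(defaults.items()) + list(overrides.items())

if __name__ == "__main__":
    print(error_message)
    print(combined)
```
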
openstackgerrit | Andreas Jaeger proposed opendev/system-config master: Fix py3 problem in accessbot.py https://review.opendev.org/725036 | 20:12 |
AJaeger | here's the patch - at least that function works now ^ | 20:12 |
AJaeger | thanks, fungi. | 20:12 |
fungi | bigger thanks for pushing a patch! | 20:14 |
AJaeger | ;) | 20:15 |
AJaeger | until the next one ... | 20:15 |
AJaeger | I still think the log files need reviewing but leave that to others to check.. | 20:17 |
fungi | well, the logs are on the server, and the script crashes on the first exception, so without running it on each successive fix the next bug won't become apparent | 20:20 |
fungi | ideally we'd have some means of exercising it in a test, but i suspect that would depend on setting up ircseven and charybdis in an integration test | 20:20 |
fungi | unless someone puts together some good mocks | 20:21 |
openstackgerrit | Merged opendev/system-config master: Fix py3 problem in accessbot.py https://review.opendev.org/725036 | 21:18 |
*** DSpider has quit IRC | 22:07 | |
*** tosky has quit IRC | 23:28 |