*** tkajinam is now known as Guest3966 | 00:09 | |
*** rlandy|bbl is now known as rlandy | 00:20 | |
*** rlandy is now known as rlandy|out | 00:21 | |
opendevreview | Merged opendev/system-config master: Switch bridge to bridge01.opendev.org https://review.opendev.org/c/opendev/system-config/+/861112 | 01:13 |
---|---|---|
*** dasm|rover is now known as dasm|off | 01:18 | |
*** tkajinam is now known as Guest3972 | 01:47 | |
opendevreview | Ian Wienand proposed opendev/system-config master: run-production-bootstrap-bridge: fix bridge name https://review.opendev.org/c/opendev/system-config/+/862665 | 02:14 |
*** raukadah is now known as chandankumar | 03:10 | |
opendevreview | Merged opendev/system-config master: run-production-bootstrap-bridge: fix bridge name https://review.opendev.org/c/opendev/system-config/+/862665 | 03:13 |
ianw | ok, the bridge bootstrap is running better now | 03:16 |
ianw | it's installing ansible | 03:16 |
ianw | × Encountered error while trying to install package. | 03:20 |
ianw | ╰─> netifaces | 03:20 |
ianw | it doesn't have gcc so this failed to build when installing openstacksdk | 03:21 |
ianw | this probably works in the gate because we build a wheel for it ... | 03:21 |
ianw | https://pypi.org/project/netifaces/0.11.0/#files ... i guess no 3.10 wheel | 03:24 |
opendevreview | Ian Wienand proposed opendev/system-config master: install-ansible: unconditionally install build-essential https://review.opendev.org/c/opendev/system-config/+/862666 | 03:28 |
opendevreview | Merged opendev/system-config master: install-ansible: unconditionally install build-essential https://review.opendev.org/c/opendev/system-config/+/862666 | 04:23 |
opendevreview | Ian Wienand proposed opendev/system-config master: install-ansible: also install python3-dev https://review.opendev.org/c/opendev/system-config/+/862668 | 05:13 |
*** marios is now known as marios|ruck | 05:26 | |
*** tkajinam is now known as Guest3986 | 05:43 | |
opendevreview | Merged opendev/system-config master: install-ansible: also install python3-dev https://review.opendev.org/c/opendev/system-config/+/862668 | 06:18 |
ianw | alright, bridge01.opendev.org is bootstrapped -> https://zuul.opendev.org/t/openstack/build/c647c32a0d4242aa9f4392d582deb076 | 06:24 |
*** tkajinam is now known as Guest3988 | 06:53 | |
*** jpena|off is now known as jpena | 07:18 | |
opendevreview | Ian Wienand proposed opendev/system-config master: Add new bridge to allowed root logins https://review.opendev.org/c/opendev/system-config/+/862670 | 07:46 |
noonedeadpunk | folks, do you know anything about centos 8 stream epel mirror? Ie was it dropped/buggy/etc? | 08:47 |
noonedeadpunk | As we got https://zuul.opendev.org/t/openstack/build/c9655845446540368d870da56369785e yesterday and I'm not sure if it's worth just re-checking failed jobs or issue is deeper | 08:59 |
ianw | noonedeadpunk: afaik we have not done anything to it recently | 09:01 |
opendevreview | Merged opendev/system-config master: Add new bridge to allowed root logins https://review.opendev.org/c/opendev/system-config/+/862670 | 09:04 |
noonedeadpunk | hm, ok, will try to re-check then | 09:10 |
frickler | centos isn't exactly known to provide us with high quality mirrors to pull from, we've seen things being out of sync for extended periods before | 09:18 |
noonedeadpunk | :D | 09:19 |
noonedeadpunk | yeah, I know, but decided to double check before wasting resources | 09:19 |
frickler | no obvious error in the rsync logs afaict, so if there still an issue, it likely will affect our upstream mirror, too | 09:29 |
frickler | +is | 09:29 |
*** rlandy|out is now known as rlandy | 10:33 | |
*** soniya is now known as soniya|afk | 10:42 | |
*** dviroel|afk is now known as dviroel | 11:28 | |
frickler | noonedeadpunk: FYI epel 9 seems to be broken for kolla, too https://062b194e5e6e841b5adf-7651d79ea360a4aa04fbe96029f7a5e2.ssl.cf1.rackcdn.com/656603/4/gate/kolla-ansible-rocky9-source/b2fede2/primary/logs/build/000_FAILED_mariadb-server.log | 12:17 |
noonedeadpunk | frickler: yeah we also have failures on centos 9, true | 12:21 |
noonedeadpunk | and recheck failed same way | 12:21 |
fungi | https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/mirror-update/files/epel-mirror-update#L31 says we're mirroring from rsync://pubmirror1.math.uh.edu/fedora-buffet/epel | 12:29 |
opendevreview | Merged zuul/zuul-jobs master: Pin py38 jobs to focal https://review.opendev.org/c/zuul/zuul-jobs/+/862628 | 12:54 |
opendevreview | Merged zuul/zuul-jobs master: Add tox-py311 job https://review.opendev.org/c/zuul/zuul-jobs/+/862629 | 12:54 |
*** soniya|afk is now known as soniya | 13:01 | |
*** dasm|off is now known as dasm|rover | 13:38 | |
noonedeadpunk | got side-pinged. I can spawn a vm and check if it has issues with the original mirror | 14:12 |
frickler | clarkb: fyi I mentioned the storyboard mail on several project channels where I knew they were having that topic during the PTG, and a lot of feedback was that they weren't even aware of the service-discuss list, so maybe you'll want to send a link to openstack-discuss. or maybe fungi as tact lead. | 14:34 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Correct block_storage_endpoint_override for rax https://review.opendev.org/c/opendev/system-config/+/862706 | 15:07 |
fungi | clarkb: ^ frickler spotted that and it seems to allow much simpler venvs with newer openstacksdk/cli | 15:08 |
*** dviroel is now known as dviroel|lunch | 15:08 | |
fungi | i'm able to make it work with a fresh venv on bridge now with just pip install openstackclient 'python-cinderclient<5' | 15:10 |
fungi | make that <8 if we drop the v1 override on the cli | 15:12 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Correct block_storage_endpoint_override for rax https://review.opendev.org/c/opendev/system-config/+/862706 | 15:13 |
clarkb | frickler: ya list membership is something we've struggled with. When we first split opendev and created those lists every email I sent to the project lists told people to subscribe | 15:14 |
clarkb | I think now the front page of opendev.org points people to those lists (if not we should add that) | 15:14 |
fungi | it's listed as "contact info" at the bottom of the page | 15:20 |
fungi | but we should probably make it more prominent | 15:20 |
*** marios|ruck is now known as marios|out | 15:33 | |
clarkb | the mm3 image upstream merged my change to add lynx to the images | 16:23 |
clarkb | another thing that occurred to me is that our fork isn't going to be equivalent to the tagged images on docker hub because our images will be newer. This potentially complicates upgrades later or a shift to the upstream images? | 16:23 |
clarkb | there is a "rolling" tag that was updated recently though which might be closer to what we want? | 16:24 |
*** dviroel|lunch is now known as dviroel | 16:26 | |
opendevreview | Merged opendev/system-config master: Correct block_storage_endpoint_override for rax https://review.opendev.org/c/opendev/system-config/+/862706 | 16:39 |
fungi | oh, that sounds like an interesting option | 16:44 |
fungi | clouds.yaml fix has rolled out. confirming on the new bridge01, this is working: | 16:55 |
fungi | python3 -m venv foo | 16:55 |
fungi | foo/bin/pip install -U pip setuptools wheel | 16:55 |
fungi | foo/bin/pip install openstackclient 'python-cinderclient<8' | 16:56 |
fungi | sudo foo/bin/openstack --os-cloud=openstackci-rax --os-region-name=DFW volume list | 16:56 |
*** jpena is now known as jpena|off | 16:56 | |
frickler | nodepool rollout failed, like all the hourly jobs it seems? https://zuul.opendev.org/t/openstack/builds?job_name=infra-prod-service-nodepool&project=opendev/system-config | 17:19 |
fungi | oh, looking | 17:20 |
frickler | that might be fixed by https://review.opendev.org/c/opendev/system-config/+/862670 which also failed to deploy though | 17:21 |
fungi | oddly, the ansible log doesn't indicate a failed task | 17:22 |
fungi | oh! new bridge | 17:23 |
fungi | TASK [nodepool-base : Get zk config] | 17:24 |
fungi | but we filter the task output, so no clear explanation for what failed with it | 17:24 |
fungi | guess i'll step through it | 17:25 |
fungi | playbooks/roles/nodepool-base/library/make_nodepool_zk_hosts.py still starts with #!/usr/bin/env python3 | 17:26 |
fungi | frickler: ^ i expect that's it, new ansible is touchy about that | 17:27 |
clarkb | fungi: double check since I thought we were going to keep using old ansible and only upgrade once new bridge was stable | 17:27 |
fungi | oh, mebbe | 17:27 |
fungi | ansible-playbook [core 2.11.12] | 17:28 |
clarkb | ya its 2.13.x that had the problem I think that is unlikely to be the issue | 17:29 |
fungi | ansible 4.0.0 | 17:29 |
fungi | ansible-core 2.11.12 | 17:29 |
fungi | that's what we've got in /usr/ansible-venv anyway, so sounds like the problem lies elsewhere | 17:29 |
clarkb | make_nodepool_zk_hosts builds the nodepool config details for our zookeeper stuff out of inventory and host vars | 17:29 |
clarkb | I would double check that those are looking good | 17:30 |
clarkb | I need to pop out soon for an appointment so can't dive in right now myself | 17:31 |
fungi | no worries | 17:31 |
fungi | sudo /usr/ansible-venv/bin/ansible-inventory --list | 17:31 |
fungi | ERROR! Unexpected Exception, this is probably a bug: Object of type bytes is not JSON serializable | 17:31 |
clarkb | this may be the thing I mentioned somewhere which is that we rely on the yamlgroup inventory plugin which wasn't installed when I had a chance to look at new bridge yesterday | 17:32 |
clarkb | ianw thought that the bridge bootstrapping would do it, but maybe not | 17:33 |
fungi | yeah, pip list doesn't report yamlgroup installed | 17:33 |
clarkb | I'm not sure if it was ever pip installed | 17:33 |
clarkb | you drop the script into /etc/ansible/somewhere | 17:33 |
clarkb | and have to set it in your ansible.cfg config file | 17:33 |
clarkb | also maybe cross check with ianw's etherpad that was linked yesterday. This may be a known thing | 17:34 |
fungi | looks like we expect it to be in /etc/ansible/inventory_plugins/yamlgroup.py | 17:34 |
fungi | and it's there, so maybe ansible isn't finding it | 17:34 |
fungi | /etc/ansible/ansible.cfg does have enable_plugins=yaml,yamlgroup,advanced_host_list,ini | 17:35 |
fungi | interestingly, if i add -y it works | 17:37 |
fungi | so the json encoding problem is specific to the default output format | 17:37 |
clarkb | I think -y may mean interpret it as a normal yaml inventory though? which isn't quite right | 17:38 |
opendevreview | Artom Lifshitz proposed zuul/zuul-jobs master: DNM: Debugging failures https://review.opendev.org/c/zuul/zuul-jobs/+/862739 | 17:38 |
clarkb | I would've expected it to use the enabled plugins list to find the right stuff instead | 17:38 |
fungi | oh, i thought -y was telling it to output yaml instead of json | 17:38 |
clarkb | maybe? (the last time I had to fiddle with ansible inventory I couldn't figure out how to make it talk to localhost without a local connection, it's all very confusing) | 17:40 |
fungi | fwiw, ansible-inventory on the old bridge gives the same behaviors from what i can see | 17:47 |
fungi | `sudo /usr/ansible-venv/bin/ansible-config view` also returns configuration with the yamlgroup plugin specified, so it does appear to be reading that config at least | 17:53 |
fungi | `sudo /usr/ansible-venv/bin/ansible zookeeper --list-hosts` gives me zk04.opendev.org, zk05.opendev.org and zk06.opendev.org so i think that means the group is known | 17:55 |
clarkb | ++ | 17:57 |
fungi | i'm not sure how to easily test that plugin on the command line though | 18:01 |
fungi | looks like it requires a hostvars dict and a zk_group dict | 18:02 |
fungi | seems it needs a json blob piped into it | 18:05 |
fungi | and expects a ANSIBLE_MODULE_ARGS key in that | 18:05 |
fungi | i guess this is part of ansible's module specification/protocol | 18:06 |
fungi | i'm able to get the json format correct, but am unclear on what the actual values for hostvars and zk_group should be, so the script is just returning a changed=true failed=true if i provide a nonempty zk_group | 18:15 |
fungi | i'm thinking it should be "zk_group": ["zk04.opendev.org", "zk05.opendev.org", "zk06.opendev.org"] | 18:17 |
Clark[m] | It would be whatever is passed by the Ansible that calls the module | 18:17 |
fungi | yeah, which is hard to suss out because we no_log that task | 18:17 |
fungi | the task passes zk_group: "{{ groups['zookeeper'] }}" which i think gets expanded before being sent into the module | 18:18 |
fungi | also passes hostvars: "{{ hostvars }}" but i have no idea what that is or where it gets set (is that some sort of ansible builtin?) | 18:19 |
fungi | or maybe that's a whole inventory blob? | 18:20 |
frickler | that should be the complete result of gather_facts. but I also have no idea why we set no_log on that task. but in the fail case there should also be an exception msg returned? | 18:45 |
fungi | or an unsafe value in one of the facts could get echoed into the log maybe | 18:48 |
fungi | anyway, i've cracked the input format | 18:48 |
fungi | echo '{"ANSIBLE_MODULE_ARGS": {"hostvars": {"zk04.opendev.org": {"ansible_host": ""}}, "zk_group": ["zk04.opendev.org"]}}' | sudo /usr/ansible-venv/bin/python3 ~zuul/src/opendev.org/opendev/system-config/playbooks/roles/nodepool-base/library/make_nodepool_zk_hosts.py | 18:49 |
fungi | that gets me successful output, albeit with incomplete data, but it's enough to exercise part of the script at least | 18:49 |
fungi | so i can be reasonably sure it executes and isn't raising an exception at least | 18:50 |
frickler | Unable to pass options to module, they must be JSON serializable: Object of type bytes is not JSON serializable | 19:01 |
fungi | where are you getting that? | 19:02 |
fungi | from gather_facts? | 19:02 |
fungi | oh, trying to pass the gather_facts blob in i suppose | 19:02 |
frickler | no, from running that module. I made a copy of nodepool-base in /root/roles on bridge01 and run /root/sn.yaml | 19:03 |
frickler | which is a stripped down version of service-nodepool.yaml | 19:03 |
frickler | I'll add a debug to show the module parameters before that call | 19:04 |
fungi | per above, if i pipe some json into the module i can get a successful response, so i guess that comes down to how ansible is injecting the json | 19:04 |
frickler | so I haven't found which of the hostvars has bytes in it. but it isn't one of the zk_hosts, so with a bit of filtering this passes https://paste.opendev.org/show/bIspSkl8YqUXvC8lPWoo/ | 19:42 |
frickler | with that I'm out for today and leave the fun for ianw I guess | 19:42 |
ianw | o/ ... looking | 19:45 |
ianw | this is infra-prod-nodepool right? | 19:47 |
clarkb | I think so. There was a change to update the clouds.yaml on bridge and the nodepool nodes to fix a cinder issue with rax | 19:50 |
clarkb | it apparently applied to bridge but the nodepool job failed on the zk config module thing | 19:50 |
ianw | bridge run also has : "Failed to lock apt for exclusive operation: Failed to lock directory /var/lib/apt/lists/: E:Could not get lock /var/lib/apt/lists/lock." | 19:52 |
ianw | It is held by process 438359 (python3) | 19:52 |
ianw | ... occasionally | 19:54 |
ianw | so either something is happening the background more frequently updating apt with jammy, or possibly somehow there's some sort of racy thing where ansible steps on itself | 19:54 |
clarkb | unattended upgrades should only run once a day. But ya maybe there is a newer thing doing it more often | 19:56 |
clarkb | in particular the lists lock is checking for package updates I think | 19:57 |
*** dviroel is now known as dviroel|afk | 20:03 | |
ianw | here is the problem serializing -- "zuul_executor_keytab": " | 20:27 |
ianw | this looks like pretty much the same thing i reported and fixed and was rejected @ https://github.com/ansible/ansible/issues/45098 :/ | 20:29 |
jrosser_ | i think that make_nodepool_zk_hosts.py might be done natively a bit like this https://paste.opendev.org/show/bgogpaWUq7TTmwFEHyu0/ | 20:31 |
jrosser_ | right now the python is really a reimplementation of the default filter | 20:32 |
ianw | jrosser_: yes, i have to agree it's probably better to do it like that -- rather than serialising all of hostvars and passing it :) | 20:32 |
ianw | ... but ... we do also have !!binary data in our hostvars, which as 45098 says is actually unsupported | 20:33 |
clarkb | but it also worked before? I guess that may have been due to older pyyaml or something? | 20:33 |
ianw | why it did work is a question. it may be a python3.6 -> python3.10 thing | 20:34 |
ianw | i'm not sure we even use the variable "zuul_executor_keytab" | 20:34 |
ianw | https://review.opendev.org/c/opendev/system-config/+/515181 seemed to remove zuul_launcher_keytab | 20:36 |
ianw | https://review.opendev.org/c/opendev/system-config/+/371818 added it to publish ... | 20:37 |
ianw | yep it works with them commented out | 20:39 |
ianw | i'll do a more thorough investigation in a bit, have to get kid out door | 20:39 |
clarkb | I think that may have existed before zuul secrets existed | 20:40 |
clarkb | and so we had it as part of the executor install? | 20:40 |
ianw | i can't see we ever used zuul_executor_keytab in system-config. the zuul_launcher_keytab we don't use any more | 20:41 |
ianw | all our keytabs are just base64 encoded strings, e.g. | 20:42 |
ianw | foo_keytab: | <base64> | 20:42 |
ianw | i'm not sure how these ..._keytab: !!binary bits hung on in there | 20:43 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP Upgrade to Gitea 1.18 https://review.opendev.org/c/opendev/system-config/+/862661 | 20:57 |
opendevreview | Clark Boylan proposed opendev/system-config master: DNM intentional failure to hold a node https://review.opendev.org/c/opendev/system-config/+/848181 | 20:57 |
opendevreview | Clark Boylan proposed opendev/system-config master: Disable unused gitea features https://review.opendev.org/c/opendev/system-config/+/862756 | 20:57 |
clarkb | I think 862756 is safe to land whenever. I noticed those options when looking at 1.18 configs | 20:58 |
clarkb | I put a hold on 848181 to enable interactive debugging | 20:59 |
clarkb | oh I didn't update its commit message. Oh well | 20:59 |
clarkb | other than nodepool, zuul-jobs and the openstack releases announce job have we seen fallout from the jammy base nodeset update? | 21:01 |
clarkb | does anyone remember what gitea paths showed files as vendored? | 22:07 |
fungi | i bet my irc logs remember | 22:08 |
fungi | checking | 22:08 |
clarkb | I'm looking at the currently deployed setup on opendev.org and half wondering if the fixed lib made it into that already | 22:08 |
fungi | apparently almost everything in https://opendev.org/openstack/oslo.cache/commit/7fb06bc2034d9747c9721c9d3eff06925a4483c6 showed up as "vendored" according to my chat logs | 22:09 |
clarkb | aha its on the commits not the file browser thats what I needed | 22:09 |
fungi | still says "vendored" when i bring that up | 22:09 |
clarkb | https://opendev.org/opendev/system-config/commit/2d9d24d07d73d959241f3a7e4ba50e83542ebed0 is a system config path that shows it. But we can check the 1.18.0-rc0 against that | 22:10 |
clarkb | https://198.72.124.43:3081/opendev/system-config/commit/2d9d24d07d73d959241f3a7e4ba50e83542ebed0 no more vendored | 22:10 |
clarkb | yay my fix worked | 22:11 |
fungi | w00t! | 22:12 |
clarkb | the key was being shown it was on the commit pages and not on the general file browser | 22:13 |
opendevreview | Ian Wienand proposed opendev/system-config master: nodepool-base: don't call out to find zk_hosts https://review.opendev.org/c/opendev/system-config/+/862759 | 22:23 |
*** rlandy is now known as rlandy|bbl | 22:26 | |
ianw | jrosser_: ^ not calling out is about 20 times faster :) | 22:27 |
fungi | wow | 22:28 |
ianw | it's a lot to serialise; bit of a corner case but a good one to be aware of | 22:32 |
clarkb | does running set fact against itself in a loop work like that in ansible? | 22:40 |
ianw | it does seem to -> https://paste.opendev.org/show/bF50adDZ66VuSYQbNMd6/ | 22:51 |
clarkb | I think a lot of the 1.18 stuff may be around features we don't use like federation and email and the proxy protocol. That means the 1.18 upgrade is likely to be straightforward for us. I do want to investigate enabling the proxy protocol though as I think that might help with our logging. The problem is I'm not sure if apache will grok it (haproxy does) | 23:26 |
opendevreview | Ian Wienand proposed opendev/system-config master: bootstrap-bridge: Codify allowed Zuul logins https://review.opendev.org/c/opendev/system-config/+/862761 | 23:35 |
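
Editor's sketch, not part of the channel log: the "Object of type bytes is not JSON serializable" failure discussed at 17:31 and 19:01 can be reproduced with the Python standard library alone, and a small recursive walk can point at the offending variable. This is a minimal illustration under stated assumptions: the `find_bytes` helper, the host name `ze01.opendev.org`, the address, and the keytab value are all hypothetical; only the variable name `zuul_executor_keytab` and the error text come from the log.

```python
import json


def find_bytes(value, path="hostvars"):
    """Yield the path of every bytes value nested inside dicts and lists."""
    if isinstance(value, (bytes, bytearray)):
        yield path
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from find_bytes(item, f"{path}[{key!r}]")
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            yield from find_bytes(item, f"{path}[{index}]")


# Hypothetical stand-in for the real hostvars; a YAML "!!binary" scalar
# is loaded by PyYAML as a Python bytes object like the one below.
hostvars = {
    "zk04.opendev.org": {"ansible_host": "203.0.113.4"},
    "ze01.opendev.org": {"zuul_executor_keytab": b"\x05\x02placeholder"},
}

# Serialising the whole structure fails as soon as one bytes value appears.
try:
    json.dumps(hostvars)
    error = None
except TypeError as exc:
    error = str(exc)

# Walking the structure names the variable that needs attention.
offenders = list(find_bytes(hostvars))
print(error)
print(offenders)
```

Run against a real inventory dump, a walk like this would avoid bisecting hostvars by hand, which is roughly what the filtering at 19:42 had to do.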
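
Editor's sketch, not part of the channel log: the module input worked out at 18:49 is a JSON object with a single `ANSIBLE_MODULE_ARGS` key wrapping the task arguments. The snippet below builds that payload and, following the filtering idea from 19:42, keeps only the hostvars of hosts in the group so that unrelated variables never reach the serialiser. The `module_payload` helper and the extra `other.example.org` entry are hypothetical; the key names and the zk04 values mirror the command in the log.

```python
import json


def module_payload(hostvars, zk_group):
    """Build the stdin JSON for the module, limited to hosts in zk_group."""
    filtered = {host: hostvars[host] for host in zk_group}
    return json.dumps(
        {"ANSIBLE_MODULE_ARGS": {"hostvars": filtered, "zk_group": list(zk_group)}}
    )


# Hypothetical inventory: the second host carries a bytes value that would
# break json.dumps if it were passed through unfiltered.
hostvars = {
    "zk04.opendev.org": {"ansible_host": ""},
    "other.example.org": {"some_keytab": b"\x00\x01"},
}

payload = module_payload(hostvars, ["zk04.opendev.org"])
print(payload)
# The log pipes an equivalent string into the module script, e.g.
#   echo '<payload>' | sudo /usr/ansible-venv/bin/python3 .../make_nodepool_zk_hosts.py
```

This only exercises the input format; as noted at 18:49, the module's output with such a minimal payload is incomplete.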