opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant master: Install mysqlclient devel package https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/888985 | 07:21 |
---|---|---|
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant master: Fix linters and metadata https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/888469 | 07:23 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant master: Fix linters and metadata https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/888469 | 07:23 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts master: Refactor LXC image expiration https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/888278 | 07:25 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts master: Fix linters issue and metadata https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/888180 | 07:26 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts master: Fix linters issue and metadata https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/888180 | 07:27 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts master: Add retries to LXC base build command https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/888750 | 07:27 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-galera_server stable/zed: Add optional compression to mariabackup https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/887143 | 07:44 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Include proper vars_file for rally https://review.opendev.org/c/openstack/openstack-ansible/+/888656 | 07:45 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_rally stable/yoga: Include proper commit in rally_upper_constraints_url https://review.opendev.org/c/openstack/openstack-ansible-os_rally/+/887681 | 07:45 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_rally stable/yoga: Include proper commit in rally_upper_constraints_url https://review.opendev.org/c/openstack/openstack-ansible-os_rally/+/887681 | 07:46 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Include proper vars_file for rally https://review.opendev.org/c/openstack/openstack-ansible/+/888656 | 07:48 |
kleini | https://paste.opendev.org/show/bZj7Yq3mmW8wWi1e9pqj/ <- I have this issue during upgrade to 26.1.2. The SSH keyfiles are there and I can properly read them with ssh-keygen. generated public key matches the public key file. Do you have any hints, what is wrong? ssh-keygen does not ask me for a passphrase for the private key, when showing the public one with ssh-keygen -y -e -f private | 08:26 |
noonedeadpunk | kleini: what mode is file in? | 09:03 |
noonedeadpunk | as IIRC it does fail if it is not 0600 | 09:03 |
noonedeadpunk | And if it is stored in git - it won't be 0600 | 09:04 |
noonedeadpunk | but if you say ssh-keygen can read them... huh | 09:06 |
noonedeadpunk | as I was thinking about this thingy https://github.com/ansible-collections/community.crypto/issues/564 | 09:06 |
noonedeadpunk | kleini: do you have `backend: cryptography` in /etc/ansible/ansible_collections/openstack/osa/roles/ssh_keypairs/tasks/standalone/create_keypair.yml ? | 09:08 |
noonedeadpunk | as this could be fixed with https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/870997 but yeah, it's available only for Antelope and not Zed | 09:09 |
kleini | it was the file permissions. many thanks! | 09:10 |
noonedeadpunk | we probably can backport this patch to Zed | 09:10 |
noonedeadpunk | or you can propose it as well ;) | 09:11 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/2023.1: Use wildcards to specify rabbit/erlang versions https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/888657 | 09:12 |
kleini | on the staging system, I don't have those files in Git, just the configuration without upper constraints, pki, keypairs and so on. because I deploy staging freshly every time, IPs change, container UUIDs change and so on. but for production I have all files in Git. resulting in wrong file permissions for SSH private key files. *facepalm* | 09:13 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/2023.1: Use wildcards to specify rabbit/erlang versions https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/888657 | 09:13 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/2023.1: Use wildcards to specify rabbit/erlang versions https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/888657 | 09:15 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/2023.1: Use wildcards to specify rabbit/erlang versions https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/888657 | 09:16 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/2023.1: Use wildcards to specify rabbit/erlang versions https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/888657 | 09:17 |
opendevreview | Marcus Klein proposed openstack/openstack-ansible-plugins stable/zed: Use cryptography backend for openssh_keypair https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/888658 | 09:18 |
kleini | too easy ;-) | 09:19 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2023.1: Use include_role in task to avoid lack of access to vars https://review.opendev.org/c/openstack/openstack-ansible/+/888659 | 09:20 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/zed: Use include_role in task to avoid lack of access to vars https://review.opendev.org/c/openstack/openstack-ansible/+/888660 | 09:20 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Use include_role in task to avoid lack of access to vars https://review.opendev.org/c/openstack/openstack-ansible/+/889021 | 09:20 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Re-enable CI jobs after rally is fixed https://review.opendev.org/c/openstack/openstack-ansible/+/889016 | 09:23 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Re-enable CI jobs after rally is fixed https://review.opendev.org/c/openstack/openstack-ansible/+/889018 | 09:24 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Pin version of setuptools https://review.opendev.org/c/openstack/openstack-ansible/+/889022 | 09:25 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Pin version of setuptools https://review.opendev.org/c/openstack/openstack-ansible/+/889022 | 09:25 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Pin version of setuptools https://review.opendev.org/c/openstack/openstack-ansible/+/889022 | 09:25 |
noonedeadpunk | kleini: nobody said it will be hard :) but this way it will get reviewed faster | 09:27 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Restore an ability for HAProxy to bind on interal IP https://review.opendev.org/c/openstack/openstack-ansible/+/887577 | 09:28 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/zed: Restore an ability for HAProxy to bind on interal IP https://review.opendev.org/c/openstack/openstack-ansible/+/887574 | 09:29 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2023.1: Gather facts before including common-playbooks https://review.opendev.org/c/openstack/openstack-ansible/+/889023 | 09:30 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-plugins stable/zed: Skip updating service password by default https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/888153 | 09:30 |
kleini | I now have an issue with zookeeper in the production deployment. incoming connections from other zookeeper instances (in containers) seem to come in from their hosts, not the containers. what can be a possible cause for that? IPs and routing look the same as with all other LXC containers. | 10:43 |
kleini | the certificate check fails then because the source IP is wrong and zookeeper drops the connection from the other zookeeper instance. | 10:44 |
noonedeadpunk | kleini: so it's cluster connection that fails? Or client connection? | 10:51 |
noonedeadpunk | As for client connection, there's a bug in tooz library (was fixed quite recently), that does not allow to enable encryption for clients | 10:51 |
noonedeadpunk | but clustering encryption should work | 10:51 |
kleini | it is for the cluster connection | 10:51 |
anskiy | noonedeadpunk: https://zuul.opendev.org/t/openstack/build/42f7de398a5a42a498ffd264914301b1/log/logs/host/glance-api.service.journal-10-08-02.log.txt now it's glance that is broken. I think there is some problem with keystone, not nova | 10:51 |
anskiy | hmm: https://zuul.opendev.org/t/openstack/build/42f7de398a5a42a498ffd264914301b1/log/logs/host/keystone-wsgi-public.service.journal-10-08-02.log.txt#17579 | 10:53 |
noonedeadpunk | anskiy: I have quite vague understanding why this can happen to be frank, and only in upgrade jobs | 10:53 |
noonedeadpunk | ah | 10:53 |
kleini | https://paste.opendev.org/show/bKFZxWBwt20oLxi8memH/ 10.20.150.2-4 are the infra hosts, while 127,132,184 are the zookeeper containers | 10:53 |
noonedeadpunk | but that's kinda "expected" | 10:53 |
noonedeadpunk | kleini: the only guess how this might happen - is that zookeeper attempts to use eth0 instead of eth1 | 10:54 |
noonedeadpunk | and eth0 has src nat | 10:54 |
kleini | okay, so bind seems to be wrong | 10:54 |
noonedeadpunk | But, eth0 should not be routable, as lxcbr0 is isolated | 10:54 |
noonedeadpunk | I _think_ it binds to 0.0.0.0 | 10:54 |
noonedeadpunk | but not sure | 10:55 |
noonedeadpunk | and default route is through eth0 actually | 10:55 |
anskiy | I wonder why does it say 10.20.150.132 in ListenHandler, does it bind on it? | 10:55 |
noonedeadpunk | there are different sets of settings for client and clustering | 10:55 |
anskiy | I mean, what's that address anyway, as it seems, that's not an infra host | 10:56 |
noonedeadpunk | I would need to read zookeeper docs to recall what is what | 10:56 |
noonedeadpunk | but I'd check `ss` output to see where it is bound | 10:56 |
kleini | zookeeper is bound to eth1 address in containers. strangely incoming connections seem to come in from own host not the other containers host... | 11:02 |
noonedeadpunk | just to ensure - zookeeper is not running on hosts as well? | 11:04 |
noonedeadpunk | as might be you ended up with 6 zookeepers or smth? | 11:05 |
kleini | damn, it is | 11:05 |
noonedeadpunk | ok, that's interesting | 11:05 |
kleini | I have 6 zookeepers | 11:05 |
anskiy | noonedeadpunk: well, for `openstack-ansible-upgrade_yoga-aio_metal-ubuntu-focal`, which succeeded glance_service_password is 47 chars, for `openstack-ansible-upgrade-aio_metal-rockylinux-9` it's 61 | 11:05 |
noonedeadpunk | is it our env.d that is failing or your own inventory weird? | 11:05 |
noonedeadpunk | anskiy: well, there's a configuration for keystone to change hashing method to remove this issue | 11:06 |
kleini | https://paste.opendev.org/show/biAjsSQbx4XG3GsA0CVU/ <- issue in inventory | 11:06 |
noonedeadpunk | it's due to bcrypt or smth like that | 11:06 |
noonedeadpunk | kleini: yeah, but I wonder what caused it.... | 11:07 |
noonedeadpunk | you used default env.d file? | 11:07 |
noonedeadpunk | anskiy: password_hash_algorithm https://docs.openstack.org/keystone/latest/configuration/config-options.html#identity.password_hash_algorithm | 11:07 |
noonedeadpunk | bcrypt has a limit of 54, scrypt does not | 11:08 |
kleini | env.d is only modified for having cinder volume in containers for Ceph only backing storage | 11:08 |
noonedeadpunk | anskiy: https://opendev.org/openstack/keystone/src/commit/8ad765e0230ceeb5ca7c36ec3ed6d25c57b22c9d/releasenotes/notes/bug_1543048_and_1668503-7ead4e15faaab778.yaml | 11:09 |
anskiy | noonedeadpunk: so, it's better to change user_secrets generation? | 11:10 |
noonedeadpunk | another way would be to ensure that our tooling does not make passwords longer than 54 by default | 11:10 |
noonedeadpunk | I'd say that for new deployments it makes sense to start using scrypt to be frank... | 11:10 |
noonedeadpunk | anskiy: but I really wonder why it's the issue for this specific patch only | 11:10 |
noonedeadpunk | this should be adjusted then to be 54 max https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/pw-token-gen.py#L89 | 11:11 |
kleini | sorry, I was wrong. there is no zookeeper instance on the hosts. the inventory looks the same in staging | 11:12 |
noonedeadpunk | ugh... Having zookeeper on hosts would be way easier explanation | 11:13 |
noonedeadpunk | but yeah, coordination_all should contain containers and hosts - it is "as designed" | 11:14 |
noonedeadpunk | as playbook runs against `zookeeper_all` | 11:14 |
noonedeadpunk | which should contain only containers | 11:14 |
kleini | I found some iptables rules, maybe causing this | 11:16 |
kleini | https://paste.opendev.org/show/bCO5z7LFXcb6zI383WBB/ <- some masquerading rule for computes and network nodes to access outside world through infra hosts was not strict enough... | 11:26 |
anskiy | noonedeadpunk: there is this thing https://review.opendev.org/c/openstack/openstack-ansible/+/887866 and it's 2023.1 too, which fails in `nova-status upgrade check` (jammy) and glance (rocky) | 11:29 |
noonedeadpunk | kleini: I think at least one of the rules is created by lxc-hosts role | 11:31 |
noonedeadpunk | the one for 10.0.3.0/24 | 11:31 |
noonedeadpunk | not sure about second one though | 11:31 |
kleini | yes. and the other one was from systemd-networkd to allow computes and network nodes to access outside world through infra hosts. in my setup only infra hosts have outside world IPs and floating IPs are reachable from outside | 11:32 |
kleini | computes and network nodes need to access outside world through management network using infra hosts as SNAT gateways | 11:33 |
noonedeadpunk | ah, ok, I see then. | 12:03 |
lsudre | Hi, I try to deploy openstack+ceph with OSA when I run "openstack-ansible setup-infrastructure.yml" I have an issue with this task: [ceph-osd : wait for all osd to be up] skipping osd1 and osd2, and failing with osd3 and retrying 60times. Do you have any ideas, about what is causing this? Thx | 14:02 |
noonedeadpunk | lsudre: worth checking `ceph -s` or `ceph health` | 14:06 |
lsudre | in osd? | 14:07 |
noonedeadpunk | on monitor host | 14:07 |
noonedeadpunk | and might be smth like `ceph osd tree` | 14:07 |
lsudre | auth: unable to find a keyring | 14:07 |
anskiy | lsudre: do you run this on one of controller nodes, as opposed to deploy host? | 14:10 |
noonedeadpunk | is there even monitor service running? | 14:10 |
lsudre | noonedeadpunk: sudo systemctl status ceph-mon.service return inactive | 14:11 |
lsudre | anskiy: in my ceph-mon host | 14:11 |
anskiy | lsudre: service should be called like `ceph-mon@<HOST>` | 14:12 |
lsudre | noonedeadpunk: and sudo systemctl status ceph-mon@os-deploy-ceph-host.service return active and running | 14:13 |
lsudre | anskiy: like this ? ceph-mon@os-deploy-ceph-host.service | 14:13 |
noonedeadpunk | and what's in /etc/ceph then? There should be ceph.conf and keyrings | 14:14 |
lsudre | my infra is: one mon(mon1) and 3 osd ([osd1, osd2, osd3]) | 14:14 |
lsudre | ok now I can run a ceph -s command | 14:14 |
lsudre | the command return a HEALTH_WARN | 14:15 |
lsudre | mon is allowing insecure global_id reclaim 1 MDSs report slow metadata IOs Reduced data availability: 2 pgs inactive OSD count 0 < osd_pool_default_size 3 | 14:15 |
anskiy | lsudre: so what is the status of OSDs: `systemctl status ceph-osd@<OSD ID>`? | 14:16 |
noonedeadpunk | lsudre: `mon is allowing insecure global_id reclaim` is relatively minor | 14:18 |
lsudre | anskiy: in the mon? I haven't this service, only a ceph-osd.target | 14:18 |
noonedeadpunk | nah. on osd node | 14:19 |
noonedeadpunk | saying, osd3 | 14:19 |
lsudre | same shit only ceph-osd.target and is running | 14:20 |
lsudre | the ceph -s command return services: mon: 1 daemons, quorum os-deploy-ceph-host (age 26m) mgr: os-deploy-ceph-host(active, since 26m) mds: 1/1 daemons up osd: 0 osds: 0 up, 0 in | 14:20 |
anskiy | lsudre: I would still try to run `systemctl status ceph-osd@3` on osd3 -- there could be some logs of the previous attempt to start it | 14:21 |
anskiy | or `journalctl -u ceph-osd@3` | 14:22 |
anskiy | or whichever is ID for osd3 | 14:22 |
lsudre | ○ ceph-osd@3.service - Ceph object storage daemon osd.3 Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: enabled) Active: inactive (dead) | 14:24 |
noonedeadpunk | and what if you try to start it? | 14:28 |
noonedeadpunk | or indeed - check journalctl | 14:28 |
lsudre | it asks me for a password | 14:28 |
lsudre | OSD data directory /var/lib/ceph/osd/ceph-3 does not exist; bailing out. | 14:29 |
lsudre | I have no files in /var/lib/ceph/osd/ folder | 14:30 |
lsudre | something is missing in my OSA conf? | 14:32 |
anskiy | lsudre: could you please show us your `openstack_user_config.yml` via paste.opendev.org and user_variables.yml? | 14:32 |
lsudre | sure | 14:32 |
lsudre | https://paste.opendev.org/show/bGDj5FVzWWKEjkaxhKJX/ https://paste.opendev.org/show/bJBfMzednhy5Y5IOgwiL/ | 14:34 |
anskiy | I wonder how the example configurations are supposed to work without setting `lvm_volumes`... | 14:48 |
anskiy | lsudre: so, I suppose, you need to set `lvm_volumes` variable to some disk devices (like `/dev/sdX`) that you're willing to use as OSDs. | 14:52 |
anskiy | noonedeadpunk: so, I've forcefully set https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/pw-token-gen.py#L89 this thing to 64 and successfully bootstrapped antelope, so it's something else -_- | 14:54 |
lsudre | ok thank you for your time | 14:54 |
*** dviroel__ is now known as dviroel | 14:55 | |
noonedeadpunk | anskiy: from what I read in keystone code - it should just "strip" to 54 anything that is longer | 14:56 |
noonedeadpunk | and it happens only to upgrade, so might be smth related to re-hashing... And then when we did "update_password" - it was resetting it, so it was not a concern I guess | 14:57 |
anskiy | noonedeadpunk: wasn't the patch only applicable to nova? As Glance gets 401 too | 14:59 |
noonedeadpunk | Nope, we just disabled resetting password by default | 15:00 |
noonedeadpunk | or updating it | 15:00 |
noonedeadpunk | so if you need to update password - you'd need to define a variable for that | 15:00 |
noonedeadpunk | so it could be result of keystone upgrade. But then it's good we've caught that | 15:04 |
lsudre | anskiy: why I need a volume_group: cinder-volumes in user_variables like the openstack_user_config.yml.test.example when i should use the rbd_volumes specifically made for ceph configuration? | 15:23 |
anskiy | lsudre: Ceph needs some disks to be used as OSDs, so it could provide block storage for your cluster. And defining which devices should be used by Ceph is done like this: https://github.com/ceph/ceph-ansible/blob/main/group_vars/osds.yml.sample#L21-L122. I, for example set this via `lvm_volumes` list for each OSD node. | 15:41 |
opendevreview | Merged openstack/openstack-ansible-openstack_hosts master: Fix linters issue and metadata https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/888455 | 21:59 |
opendevreview | Merged openstack/openstack-ansible-ceph_client master: Fix linters and metadata https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/888216 | 22:01 |
opendevreview | Merged openstack/openstack-ansible-ceph_client master: Apply tags to included tasks https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/888461 | 22:27 |
opendevreview | Merged openstack/openstack-ansible-repo_server master: Fix linters and metadata https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/888280 | 22:43 |
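
The SSH keypair failure discussed between 09:03 and 09:13 came down to the private key file mode: Git only records the executable bit, so a key checked out from a repository does not come back as 0600. Below is a minimal Python sketch of checking and, optionally, tightening the mode before running the playbooks. The function name `ensure_private_key_mode` is hypothetical and is not part of openstack-ansible.

```python
import os
import stat


def ensure_private_key_mode(path, fix=False):
    """Return True if `path` has mode 0600 (owner read/write only).

    With fix=True, a wrong mode is changed to 0600 and True is returned.
    The helper name is an assumption for illustration only.
    """
    # S_IMODE strips the file-type bits and keeps only the permission bits.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode == 0o600:
        return True
    if fix:
        os.chmod(path, 0o600)
        return True
    return False
```

Running it with `fix=False` first gives a read-only check that could be dropped into a pre-deploy script; `fix=True` is the equivalent of `chmod 0600` on the key file.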
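
One option raised between 11:08 and 11:11 for the keystone 401 errors is to make sure generated service passwords never exceed 54 characters, the bcrypt limit quoted in that discussion. The Python sketch below shows one way to cap the length. The function `generate_password`, the 16-character minimum, and the alphabet are assumptions for illustration and are not taken from `pw-token-gen.py`.

```python
import secrets
import string

# Limit quoted in the discussion for keystone's bcrypt hashing.
BCRYPT_MAX_LENGTH = 54
# Assumed alphabet; the real generator may use a different one.
ALPHABET = string.ascii_letters + string.digits


def generate_password(min_length=16, max_length=64):
    """Return a random password whose length never exceeds BCRYPT_MAX_LENGTH."""
    upper = min(max_length, BCRYPT_MAX_LENGTH)
    if min_length > upper:
        raise ValueError("min_length exceeds the allowed maximum length")
    # randbelow(n) returns an integer in [0, n), so length is in [min_length, upper].
    length = min_length + secrets.randbelow(upper - min_length + 1)
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

As noted at 14:54, forcing the length to 64 still bootstrapped Antelope successfully, so a cap like this may not be the whole explanation for the upgrade-job failures.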