opendevreview | Dmitriy Rabotyagov proposed openstack/ansible-hardening master: Use valid value for CREATE_HOME https://review.opendev.org/c/openstack/ansible-hardening/+/908977 | 08:44 |
---|---|---|
jrosser | good morning | 08:50 |
noonedeadpunk | o/ | 08:54 |
noonedeadpunk | it's so nice being just 2 pages of bugs :) | 09:00 |
noonedeadpunk | or well, 3 bugs short of that... | 09:01 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible-plugins master: Add override for gluster host used for bootstrap operations https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/908981 | 10:07 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible master: [doc] Use bootstrap node override for gluster primary upgrade https://review.opendev.org/c/openstack/openstack-ansible/+/908982 | 10:09 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Replace common-tasks with lxc_container_setup role from plugins collection https://review.opendev.org/c/openstack/openstack-ansible/+/908984 | 10:22 |
noonedeadpunk | jrosser: maybe instead of ^ it makes sense to finally just move playbooks as a whole? | 11:36 |
noonedeadpunk | I can check later today on that | 11:36 |
noonedeadpunk | or well, if we wanna go this route ofc | 11:37 |
jrosser | so 908984 is needed for the capi patches anyway (or indeed any extension of osa that is coming from a collection) | 11:44 |
jrosser | and it's a bit independent if we move the other playbooks or not, but yes that's totally an option | 11:44 |
noonedeadpunk | ah, ok, got what you mean | 11:45 |
noonedeadpunk | yeah, makes sense | 11:45 |
jrosser | ELK would be in a similar position if we converted that to a proper collection in the ops repo | 11:45 |
noonedeadpunk | I guess one note though... | 11:45 |
jrosser | oh hold on 908984 is not actually needed to merge the capi patches as we already put the role in the plugins repo | 11:46 |
jrosser | it's more like a tidy up to remove duplicate code | 11:46 |
noonedeadpunk | yeah, but it makes sense before/after moving anyway | 11:46 |
noonedeadpunk | the only thing is we need to update a-c-r as well | 11:46 |
jrosser | maybe that should be `master` rather than a SHA? | 11:47 |
noonedeadpunk | well, might be. we do test master in CI anyway.... | 11:48 |
nixbuilder | I have my new deployment installed with three infra nodes and using the haproxy. To ease troubleshooting I shutdown infra02 and infra03. However when I do that my neutron-metadata is throwing connection errors (https://paste.openstack.org/show/bcZv6gDc5l2nshpWwH2Z/) and a test instance can no longer ping 169.254.169.254. So I assume I missed something on the install that causes problems with high | 11:48 |
nixbuilder | a service. What could I have missed? | 11:48 |
noonedeadpunk | nixbuilder: so, services must reconnect to currently active rabbitmq server | 11:49 |
noonedeadpunk | it's up to client to find one, as long as it has all 3 configured | 11:49 |
noonedeadpunk | one thing about neutron-metadata-agent - is that its logging is not great... | 11:50 |
jrosser | nixbuilder: there might also be more subtle ways to do troubleshooting, shutting down 2 of 3 infra nodes is pretty invasive | 11:50 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Replace common-tasks with lxc_container_setup role from plugins collection https://review.opendev.org/c/openstack/openstack-ansible/+/908984 | 11:53 |
nixbuilder | jrosser: Fortunately this new deployment is not in production yet so it's no big deal. | 11:55 |
jrosser | if you want to make all requests go to one particular host then you can put backends in maintenance in haproxy | 11:55 |
nixbuilder | noonedeadpunk: My rabbitmq server on my remaining infra01 node is up and running... https://paste.openstack.org/show/b8e1NguiEZjILj8WvIbu/ | 11:57 |
noonedeadpunk | so, eventually it might be neutron-metadata who just does not log successful reconnection, for instance | 11:58 |
noonedeadpunk | but also what sucks utterly, is that there's no dst logged which it tried to connect to... | 11:58 |
noonedeadpunk | usually it would include host and port it tried to connect | 11:59 |
jrosser | isn't there some oslo logging syntax you can use | 12:03 |
noonedeadpunk | not sure? as that's basically exception text I assume | 12:04 |
noonedeadpunk | but maybe I just don't know about that :) | 12:05 |
opendevreview | Merged openstack/openstack-ansible-os_nova stable/2023.2: Fix nova device_spec to support multiple values https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/908811 | 12:06 |
nixbuilder | noonedeadpunk: Still doing some testing... brought infra02 back up. Still no joy. Brought infra03 back up... still no joy. Rebooted my test instance... everything is back to normal. So what I am wondering is if one of my infra nodes does crash, does that mean that new instances cannot be created due to failure of high availability??? Like I said... still testing this out. | 12:20 |
noonedeadpunk | um, no, I don't think that's the case in fact. or well, it should not be, at the very least, for multiple reasons | 12:26 |
noonedeadpunk | So apparently, services got disconnected from rabbitmq and need some time to re-connect to it | 12:26 |
noonedeadpunk | the remaining rabbitmq node, with ha queues or quorum queues enabled, should still hold the messages left for services in such cases | 12:27 |
noonedeadpunk | so ideally, once they re-connect, they should consume messages and execute required RPC | 12:27 |
noonedeadpunk | But then it can be that some service didn't, for $reason, send an rpc, which can legitimately result in metadata not providing metadata as it doesn't know anything about the instance | 12:29 |
opendevreview | Merged openstack/openstack-ansible-ceph_client stable/2023.2: Don't load systemd parent service for object cache https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/908809 | 12:30 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_ironic stable/2023.2: Fix a typo in pxe_redfish definition https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/908932 | 12:34 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_ironic stable/2023.1: Fix a typo in pxe_redfish definition https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/908933 | 12:35 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_ironic stable/zed: Fix a typo in pxe_redfish definition https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/908934 | 12:35 |
opendevreview | Merged openstack/openstack-ansible-ops master: Add collection to deploy magnum cluster-api with vexxhost driver https://review.opendev.org/c/openstack/openstack-ansible-ops/+/901450 | 12:58 |
opendevreview | Merged openstack/openstack-ansible-ops master: Cluster API Bootstrapping playbook https://review.opendev.org/c/openstack/openstack-ansible-ops/+/902178 | 13:14 |
opendevreview | Merged openstack/openstack-ansible-ops master: Add role to install and run sonobouy k8s validation tests https://review.opendev.org/c/openstack/openstack-ansible-ops/+/906054 | 13:54 |
mgariepy | noonedeadpunk, for the bridge stuff, i guess it depends on a few factors like lxb vs ovs and ovs openflow vs iptables shim stuff no ? | 14:05 |
opendevreview | Merged openstack/openstack-ansible stable/2023.2: [doc] Update dist upgrade guide for 2023.1 / Ubuntu Jammy https://review.opendev.org/c/openstack/openstack-ansible/+/908804 | 14:06 |
noonedeadpunk | mgariepy: I think not? As even for LXB, br-vxlan can be any interface - it just needs to have an IP from the tunnel network | 14:12 |
noonedeadpunk | same pretty much for br-storage on computes - they just need l3 access? | 14:13 |
noonedeadpunk | and then br-vlan will be added to another bridge - so bridge in a bridge situation | 14:13 |
mgariepy | hmm ok right | 14:16 |
noonedeadpunk | and then on computes/net nodes even br-mgmt can be just interface ... | 14:17 |
noonedeadpunk | we just have the bridges requirement in the docs for consistency of naming | 14:17 |
mgariepy | hmm yeah ok ;) haha | 14:18 |
noonedeadpunk | eventually, if you have sr-iov on controllers - you can do without bridges even there kinda.... | 14:21 |
noonedeadpunk | at least I had a use-case of passing IB devices to containers directly, without br-stor | 14:21 |
mgariepy | yeah but well i like simple networking on the controllers. lol | 14:21 |
noonedeadpunk | sure, totally. | 14:22 |
noonedeadpunk | I guess the point was - would be nice to be explicit that it's written like that just for naming consistency | 14:22 |
mgariepy | i only do bridges on the controllers these days. for lxc containers. | 14:22 |
noonedeadpunk | yeah, fair, same here | 14:22 |
mgariepy | yeah, it does also simplify the config. | 14:22 |
noonedeadpunk | tried OVS bridges, but haven't found any difference more or less | 14:23 |
mgariepy | no need to override the neutron agent config for some esoterically named interfaces.. that change on distro upgrade. | 14:23 |
noonedeadpunk | well, you can name interface to any name with systemd-networkd or netplan? | 14:24 |
mgariepy | that's what i do. | 14:24 |
noonedeadpunk | like nothing stops naming a regular vlan as br-mgmt lol | 14:24 |
mgariepy | yes something stops me. future-self will kill me if i do that | 14:25 |
noonedeadpunk | lol | 14:25 |
noonedeadpunk | true | 14:25 |
noonedeadpunk | would be a sweet 1st april joke to convert some environment to just interfaces but name them as bridges... | 14:25 |
mgariepy | my point being more that we need to either document how to do that, or warn that you will need to do a bunch of overrides. | 14:25 |
mgariepy | who doesn't like interface named like that: enp45s0u1u3u3 haha | 14:27 |
noonedeadpunk | but yeah, I kinda agree here, that a really huge doc refactoring would be needed with examples and drawings | 14:27 |
opendevreview | Merged openstack/openstack-ansible-ceph_client stable/2023.1: Don't load systemd parent service for object cache https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/908810 | 15:39 |
opendevreview | Merged openstack/openstack-ansible master: Allow virtualisation type to be defined in a test scenario https://review.opendev.org/c/openstack/openstack-ansible/+/907327 | 16:16 |
noonedeadpunk | nixbuilder: actually, I've seen `Connection failed: [Errno 111] ECONNREFUSED (retrying in 11.0 seconds): ConnectionRefusedError: [Errno 111] ECONNREFUSED` but that was when I did not define transport_url in neutron.conf at all... | 17:14 |
nixbuilder | noonedeadpunk: Thanks for the tip... let me check that. | 17:15 |
noonedeadpunk | but I doubt it will work on its own again... | 17:17 |
noonedeadpunk | but otherwise I see where it tries to connect at very least... | 17:17 |
noonedeadpunk | found quite /o\ thing in couple of roles: https://opendev.org/openstack/openstack-ansible-os_neutron/src/branch/stable/yoga/tasks/neutron_post_install.yml#L136-L142 | 17:58 |
noonedeadpunk | also present in cinder https://opendev.org/openstack/openstack-ansible-os_cinder/src/branch/master/tasks/cinder_post_install.yml#L131-L137 | 17:58 |
noonedeadpunk | in first case it makes rootwrap.d mode 0640, in second all files in rootwrap.d 750 | 17:58 |
noonedeadpunk | I'm also not convinced at all why this task exists in the first place.... | 17:59 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Fix permissions for rootwrap files https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/909034 | 18:03 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Add VPNaaS OVN support https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/908341 | 18:06 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2023.1: [doc] Update documentation for galera cluster recovery https://review.opendev.org/c/openstack/openstack-ansible/+/907576 | 21:14 |
opendevreview | Merged openstack/openstack-ansible-os_nova stable/2023.2: Evaluate my_ip address once https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/908698 | 22:55 |
opendevreview | Merged openstack/openstack-ansible-os_masakari stable/2023.2: Updated from OpenStack Ansible Tests https://review.opendev.org/c/openstack/openstack-ansible-os_masakari/+/903073 | 22:55 |
opendevreview | Merged openstack/openstack-ansible-os_nova stable/2023.1: Fix nova device_spec to support multiple values https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/908812 | 23:01 |
opendevreview | Merged openstack/openstack-ansible stable/2023.1: [doc] Update dist upgrade guide for 2023.1 / Ubuntu Jammy https://review.opendev.org/c/openstack/openstack-ansible/+/908805 | 23:52 |
opendevreview | Merged openstack/openstack-ansible stable/2023.1: [doc] Update documentation for galera cluster recovery https://review.opendev.org/c/openstack/openstack-ansible/+/907576 | 23:52 |
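
Editorial sketch for the 11:49 and 17:14 exchanges: noonedeadpunk says a client has to find the active rabbitmq server itself, which only works if all three are configured, and that the ECONNREFUSED message does not show which destination was tried. One way to check this by hand is to pull the host list out of the `transport_url` value and probe each endpoint over TCP. This is only a rough sketch. It assumes the usual oslo.messaging URL layout (`rabbit://user:pass@host1:port,user:pass@host2:port/vhost`) with IPv4 addresses or hostnames. The function names `rabbit_endpoints` and `is_reachable` are made up for illustration and are not part of any OpenStack tool.

```python
import socket


def rabbit_endpoints(transport_url, default_port=5672):
    """Return the (host, port) pairs listed in a transport URL.

    Assumption: oslo.messaging-style layout, no bracketed IPv6 hosts.
    """
    scheme, sep, rest = transport_url.partition("://")
    if not sep:
        raise ValueError("expected something like rabbit://...")
    # Everything before the first "/" is the comma-separated host list.
    netloc = rest.split("/", 1)[0]
    endpoints = []
    for entry in netloc.split(","):
        hostport = entry.rsplit("@", 1)[-1]  # drop "user:pass@" if present
        if not hostport:
            continue
        host, colon, port = hostport.partition(":")
        endpoints.append((host, int(port) if colon else default_port))
    return endpoints


def is_reachable(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `rabbit_endpoints()` on the value copied from neutron.conf and then `is_reachable()` on each pair shows whether all three servers are listed and which of them accept connections. A successful TCP connect says nothing about authentication or queue state.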
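
Editorial sketch for jrosser's 11:55 suggestion: instead of powering off infra nodes, individual haproxy backend servers can be put into maintenance through the HAProxy Runtime API command `set server <backend>/<server> state maint`. The snippet below is a minimal illustration, not the procedure anyone in the channel described. The socket path `/var/run/haproxy.stat`, the helper names, and the backend and server names are assumptions, and the stats socket has to be configured with `level admin` for the command to be accepted.

```python
import socket


def maint_command(backend, server, state="maint"):
    """Build a Runtime API 'set server ... state' command line."""
    if state not in ("ready", "drain", "maint"):
        raise ValueError("state must be ready, drain or maint")
    return f"set server {backend}/{server} state {state}\n"


def send_command(command, socket_path="/var/run/haproxy.stat"):
    """Send one command to the admin socket and return the reply.

    The default socket_path is an assumption; check the 'stats socket'
    line in haproxy.cfg on the deployment in question.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(command.encode())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # haproxy closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode()
```

For example, `send_command(maint_command("some_backend", "infra02_server"))` would take one (hypothetically named) server out of rotation, and the same call with `state="ready"` would bring it back.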