opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Bump ansible version to 2.15.9 https://review.opendev.org/c/openstack/openstack-ansible/+/905619 | 08:56 |
---|---|---|
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix inventory defenition for Cloudkitty https://review.opendev.org/c/openstack/openstack-ansible/+/912269 | 08:57 |
kleini | https://paste.opendev.org/show/bZZYrhnNuQMzW6yMpJv3/ I constantly get this warning in all services about lost database connections to the MySQL server during query. How is this supposed to work reliably? The OSA defaults produce even more of those warnings. I already reduced openstack_db_connection_recycle_time to 400 while the timeout on the Galera side remains 600. How can I avoid those warnings? | 11:12 |
gokhan__ | I also get lost connection to mysql errors on antelope 27.3.0 | 11:22 |
noonedeadpunk | kleini: `Lost connection to MySQL server during query` is a really bad thing | 11:27 |
kleini | as said, this is with OSA defaults and happens with all OpenStack services across all three infra nodes, and Galera is totally idle | 11:28 |
noonedeadpunk | I guess there're 2 possible things here - 1st is smth is off with haproxy, for instance, internal VIP is getting in some failover loop | 11:30 |
kleini | haproxy and Galera are totally stable and monitored via Prometheus. no VIP failover nothing. | 11:31 |
kleini | sqlalchemy or oslo.db seem to be so badly implemented that they keep open connections lying around. they time out. then some request comes in and 'select 1' fails. so, this connection_recycle_time does not seem to work reliably | 11:33 |
noonedeadpunk | iirc, select 1 is issued periodically to detect/reset such connection | 11:38 |
noonedeadpunk | I can recall there was some setting regarding that.... | 11:39 |
noonedeadpunk | https://opendev.org/openstack/oslo.db/src/commit/5363ca11c9d4d9a5a9cf5a2be2fc40c52659f258/oslo_db/sqlalchemy/engines.py#L78-L101 | 11:40 |
noonedeadpunk | So I assume, that should be solved with sqlalchemy 2.0.5: https://opendev.org/openstack/oslo.db/commit/64e50494f219b5c06ed79f947c91cdb7f37cb0d6 | 11:44 |
noonedeadpunk | but I think even for 2024.1 it still will be <2.0 | 11:44 |
opendevreview | Merged openstack/openstack-ansible-os_cloudkitty master: Enable CloudKitty APIv2 https://review.opendev.org/c/openstack/openstack-ansible-os_cloudkitty/+/912291 | 11:45 |
opendevreview | Merged openstack/openstack-ansible-os_heat master: Deprecate and remove heat_deferred_auth_method variable https://review.opendev.org/c/openstack/openstack-ansible-os_heat/+/905109 | 11:57 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_heat master: Grant proper privileges to admin user for testing purposes https://review.opendev.org/c/openstack/openstack-ansible-os_heat/+/912108 | 11:58 |
kleini | noonedeadpunk, thank you very much. so I need to be patient regarding that issue and try to ignore those messages in the logs. | 12:23 |
opendevreview | OpenStack Release Bot proposed openstack/openstack-ansible-haproxy_server master: reno: Update master for unmaintained/victoria https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/913016 | 12:25 |
noonedeadpunk | oh no.... | 12:26 |
noonedeadpunk | gerritbot flood is about to start now, I assume | 12:26 |
opendevreview | Merged openstack/openstack-ansible master: Fix physical network mapping for linuxbridge https://review.opendev.org/c/openstack/openstack-ansible/+/912768 | 12:36 |
opendevreview | Merged openstack/openstack-ansible master: Upgrade Gnocchi to 4.6 https://review.opendev.org/c/openstack/openstack-ansible/+/912277 | 12:36 |
noonedeadpunk | fwiw, ^ this still results in broken gnocchi due to "recently" upgraded werkzeug in u-c :( | 12:37 |
noonedeadpunk | just realized that today | 12:37 |
noonedeadpunk | https://github.com/gnocchixyz/gnocchi/pull/1378 was to address it | 12:37 |
opendevreview | OpenStack Release Bot proposed openstack/ansible-hardening master: reno: Update master for unmaintained/xena https://review.opendev.org/c/openstack/ansible-hardening/+/913136 | 12:52 |
spatel | Did you guys upgrade 2023.1 to 2023.2? any good or bad experience? | 13:39 |
opendevreview | Aleksandr Chudinov proposed openstack/openstack-ansible-os_nova master: fix apparmor profile for non-standard nova home https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/912583 | 14:05 |
hamburgler | spatel: minimal issues from our end 2023.1 > 2023.2 when using 28.0.1, think it was just haproxy not logging for us | 15:08 |
spatel | oh! so that is not openstack related issue but just infra stuff. | 15:10 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible stable/wallaby: Remove use of undefined ceph distro job zuul template https://review.opendev.org/c/openstack/openstack-ansible/+/910192 | 15:24 |
noonedeadpunk | sounds like we can be switching to 2024.1 branch slowly | 17:03 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_gnocchi master: Drop default policy file location https://review.opendev.org/c/openstack/openstack-ansible-os_gnocchi/+/913244 | 17:37 |
opendevreview | Merged openstack/openstack-ansible master: Add check_hostname option to db healthcheck tasks https://review.opendev.org/c/openstack/openstack-ansible/+/911150 | 17:44 |
f0o | Specifying haproxy_bind_(in/ex)ternal_lb_vip_interface after everything has been deployed doesnt seem to update the haproxy configs - is that intended? | 18:13 |
noonedeadpunk | Um, it actually should | 18:18 |
f0o | I reran haproxy-install and os-keystone-install and it didnt change anything | 18:18 |
f0o | it triggered creation of certs but the actual haproxy config/s werent touched at all | 18:19 |
noonedeadpunk | f0o: so. I think, you'd need to run smth like setup-openstack.yml --tags haproxy-service-config | 18:19 |
noonedeadpunk | as each service backend/frontend is configured in the service playbooks today | 18:20 |
f0o | oh ok; let me run that | 18:20 |
noonedeadpunk | But I would expect keystone to be re-configured at least from what you ran | 18:20 |
f0o | it finished and no changes done to the configs | 18:26 |
opendevreview | Jimmy McCrory proposed openstack/openstack-ansible-os_nova master: Ensure nova_device_spec is templated as JSON string https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/913248 | 18:30 |
jrosser | f0o: the logic in the haproxy role / service template is pretty complicated - it might be you have some kind of inconsistency in the variables | 18:41 |
f0o | yeah I tried looking into it and noped out of it tbh | 18:42 |
jrosser | do you actually need to bind to an interface? | 18:43 |
jrosser | for 99% of cases the IP is sufficient | 18:43 |
f0o | I have two options; bind to interface or set net.ipv4.tcp_l3mdev_accept=1 | 18:43 |
f0o | I did the latter now | 18:43 |
noonedeadpunk | f0o: aha, so you bound to the interface | 18:44 |
f0o | other way; I didnt bind and now would need to for VRFs to work correctly | 18:44 |
f0o | it's no biggie, can just nuke the rack again and do a resetup with the VRF interfaces straight away | 18:45 |
noonedeadpunk | I do have a setup with working binds to interfaces | 18:45 |
f0o | additive changes sometimes arent additive hah | 18:45 |
jrosser | well it should work | 18:45 |
jrosser | like I say there are probably a bunch of combinations of vars that are not valid | 18:45 |
f0o | likely | 18:45 |
f0o | I will spend some more time tomorrow on it | 18:46 |
f0o | I've had a real stroll down memory lane today with hitting multi-year old bugs in OVS, FRR, VRFs, .... | 18:46 |
noonedeadpunk | So I do have `internal_lb_vip_address` set to FQDN and then `haproxy_bind_external_lb_vip_address: "*"` and `haproxy_bind_external_lb_vip_interface: bond0.3104` | 18:46 |
f0o | ^ that's what I had too | 18:47 |
f0o | when I ran that playbook just now | 18:47 |
jrosser | which one did you run? | 18:47 |
f0o | haproxy-install && os-keystone-install as well as setup-openstack.yml --tags haproxy-service-config as suggested by noonedeadpunk | 18:48 |
f0o | weirdly enough it created the Certs for the *-Interface pair and uploaded them | 18:48 |
noonedeadpunk | and no changes in /etc/haproxy/conf.d/keystone_service ? | 18:48 |
f0o | just never changed the haproxy configs/ | 18:48 |
f0o | noonedeadpunk: nope | 18:49 |
noonedeadpunk | huh | 18:49 |
noonedeadpunk | and you're running 2023.2? | 18:49 |
f0o | yup | 18:49 |
jrosser | noonedeadpunk: there are some for loops in the service template which I was a bit suspicious about | 18:49 |
noonedeadpunk | like - we did run the role yesterday and it worked | 18:49 |
f0o | bad8ffe55b51f8197e71b9d552282480e6a40063 to be precise | 18:50 |
noonedeadpunk | or well, day before yesterday | 18:50 |
noonedeadpunk | f0o: just in case - where did you define these variables? | 18:50 |
jrosser | like if vip_binds somehow ends up with several entries | 18:50 |
noonedeadpunk | group_vars/host_vars | 18:50 |
f0o | in /etc/openstack_deploy/user_variables.yml | 18:50 |
f0o | that's what the docs said | 18:50 |
noonedeadpunk | and you have commented things out in openstack_user_config? | 18:51 |
noonedeadpunk | in global_overrides? | 18:51 |
f0o | commented out? | 18:51 |
noonedeadpunk | so, some part of the docs could suggest adding `haproxy_bind_external_lb_vip_address` there instead of user_variables | 18:51 |
f0o | I still have global_overrides>internal_lb_vip_address defined | 18:52 |
noonedeadpunk | so just trying to check you don't have any conflicts | 18:52 |
noonedeadpunk | that is indeed very interesting | 18:52 |
f0o | I was under the impression global>*_lb_vip_address would dictate the OpenStack endpoints and haproxy_bind_* would bypass those to get alternative binds | 18:52 |
f0o | as per https://docs.openstack.org/openstack-ansible-haproxy_server/latest/configure-haproxy.html#overriding-the-address-haproxy-will-bind-to | 18:52 |
noonedeadpunk | yes, that's exactly what happens | 18:53 |
f0o | ok then it doesnt work :D | 18:53 |
f0o | or it probably does work on a fresh install but not when it's applied afterwards | 18:53 |
noonedeadpunk | nah, it should not matter | 18:53 |
jrosser | f0o: rather than nuke it, you should debug | 18:53 |
noonedeadpunk | it's resulting in a template | 18:53 |
f0o | for what it's worth, I also have letsencrypt enabled... not sure if that adds more complications | 18:54 |
jrosser | add some debug: var=<blah> | 18:54 |
jrosser | followed by a fail: task just before the template, and see what you get in the actual variables used | 18:54 |
f0o | yeah I'll litter the haproxy service.j2 with some debug comments and hope to get some change into the configs | 18:56 |
jrosser | ok cool then we can fix either our docs, the code or your vars and it’s understanding++ | 18:56 |
noonedeadpunk | I would add debug here: https://opendev.org/openstack/openstack-ansible-haproxy_server/src/branch/master/tasks/haproxy_service_config_external.yml#L28 | 18:57 |
f0o | https://github.com/openstack/openstack-ansible-haproxy_server/blob/master/templates/service.j2 << this one | 18:58 |
f0o | :+1: | 18:58 |
f0o | man I'm too used to slack when that's my first instinct | 18:58 |
noonedeadpunk | or maybe even here https://opendev.org/openstack/openstack-ansible-haproxy_server/src/branch/master/tasks/haproxy_service_config.yml#L30 | 18:58 |
f0o | good pointers, will tackle it tomorrow | 19:00 |
f0o | time to walk the dog | 19:01 |
noonedeadpunk | I guess I'd try to print out _haproxy_service_configs_simplified for the beginning.... | 19:01 |
noonedeadpunk | or even better - haproxy_tls_vip_binds | 19:03 |
noonedeadpunk | as basically this logic is the question I guess: https://opendev.org/openstack/openstack-ansible-haproxy_server/src/branch/master/templates/service.j2#L15-L19 | 19:05 |
opendevreview | Jimmy McCrory proposed openstack/openstack-ansible-os_nova master: Ensure nova_device_spec is templated as JSON string https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/913248 | 19:08 |
opendevreview | Jimmy McCrory proposed openstack/openstack-ansible-os_nova master: Ensure nova_device_spec is templated as JSON string https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/913248 | 19:09 |
opendevreview | Jimmy McCrory proposed openstack/openstack-ansible-os_nova master: Ensure nova_device_spec is templated as JSON string https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/913248 | 19:10 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Add support for ovn-bgp-agent deployment https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/909780 | 22:24 |
opendevreview | Merged openstack/openstack-ansible-os_neutron master: Use ansible_facts['processor_vcpus'] instead of fact variable https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/912481 | 23:58 |
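
The debugging approach jrosser and noonedeadpunk describe in the log (print the variables with `debug`, then stop with `fail` just before the template task) could look roughly like the tasks below. This is a minimal sketch, not the role's actual code: the variable names `haproxy_tls_vip_binds` and `_haproxy_service_configs_simplified` are taken from the discussion, while the task names, the exact insertion point, and whether both variables are defined at that point are assumptions.

```yaml
# Temporary debugging tasks (hypothetical names), intended to sit
# immediately before the task that renders templates/service.j2.

- name: DEBUG show the computed TLS VIP binds
  ansible.builtin.debug:
    var: haproxy_tls_vip_binds

- name: DEBUG show the simplified service definitions
  ansible.builtin.debug:
    var: _haproxy_service_configs_simplified

# Abort deliberately so the debug output is the last thing printed
# and no haproxy configuration is rewritten during the inspection run.
- name: DEBUG stop before the template is rendered
  ansible.builtin.fail:
    msg: "Intentional stop for debugging; remove these tasks afterwards."
```

If a variable is not defined at that point, `debug` reports it as undefined rather than failing, which is itself a useful hint. Since the run in the log used `--tags haproxy-service-config`, the added tasks may need to carry or inherit that tag in order to execute; that should be checked against the role rather than assumed.
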