*** dviroel|ruck is now known as dviroel|ruck|Afk | 00:05 | |
*** arxcruz|off is now known as arxcruz | 07:50 | |
jrosser | morning | 08:17 |
---|---|---|
noonedeadpunk | o/ | 08:29 |
*** dviroel|ruck|Afk is now known as dviroel|ruck | 11:16 | |
MrClayPole_ | Morning, I'm still migrating from OSA Train (20.2.6) to OSA Victoria (22.4.1). At https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/stable/victoria/tasks/nova_compute.yml#L16 I'm getting the error "Error when evaluating variable in dynamic parent include path: drivers/{{ nova_virt_type }}/nova_compute_{{ nova_virt_type }}.yml. When using static imports, the parent dynamic include cannot utilize host | 11:17 |
MrClayPole_ | facts or variables from inventory". I've confirmed that nova_virt_type is set to "kvm" so is it that using this type of variable is not supported in this context? | 11:17 |
noonedeadpunk | hm, that's interesting | 12:19 |
noonedeadpunk | MrClayPole_: do you have override of nova_virt_type somewhere? | 12:19 |
MrClayPole_ | I've grep'ed my /etc/openstack-deploy/*.yml and got no hits | 12:19 |
noonedeadpunk | and you don't have /etc/openstack-deploy/host_vars or /etc/openstack-deploy/group_vars ? | 12:20 |
MrClayPole_ | checking .... | 12:20 |
MrClayPole_ | inventory/group_vars/kvm-compute_hosts.yml "nova_virt_type: kvm" | 12:22 |
MrClayPole_ | noonedeadpunk: should I try commenting that out and re-running? | 12:24 |
MrClayPole_ | I don't have a /etc/openstack-deploy/host_vars/ and /etc/openstack-deploy/group_vars/ files don't have that variable set | 12:27 |
noonedeadpunk | well, I guess it's placed there for some reason, right? | 12:29 |
noonedeadpunk | and I'd say it's valid usecase.... | 12:29 |
noonedeadpunk | jrosser: that's actually interesting issue I think ^ | 12:30 |
MrClayPole_ | Unfortunately I didn't build this environment. Let me check with the person that did to see why "nova_virt_type: kvm" is set in /opt/openstack-ansible/inventory/group_vars/kvm-compute_hosts.yml | 12:31 |
noonedeadpunk | MrClayPole_: are you also sure that roles bootstrapped? | 12:31 |
noonedeadpunk | oh, wait | 12:32 |
MrClayPole_ | The bootstrap showed no errors at the end | 12:32 |
noonedeadpunk | can you replace that with include_tasks? https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/stable/victoria/tasks/main.yml#L226 | 12:32 |
MrClayPole_ | when you say replace do you mean in "kvm-compute_hosts.yml" | 12:35 |
noonedeadpunk | no, I mean in the line I posted :) | 12:36 |
noonedeadpunk | so `include_tasks: nova_compute.yml` | 12:36 |
MrClayPole_ | Sorry I'm being slow on the uptake here ... I'm not sure where you want me to put it | 12:36 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/stable/victoria/tasks/main.yml#L226 | 12:37 |
MrClayPole_ | so just to be clear change L226 from "task_import: nova_compute.yml" to "include_tasks: nova_compute.yml" | 12:38 |
noonedeadpunk | wait. why it's `task_import`? | 12:39 |
noonedeadpunk | as it should be currently `import_tasks` | 12:40 |
MrClayPole_ | That's what's showing on L226 on the link | 12:40 |
noonedeadpunk | Do we see different content via the link?:) As I see there "import_tasks: nova_compute.yml" | 12:40 |
MrClayPole_ | I see import tasks on the link you sent but I thought you said "so `include_tasks: nova_compute.yml`" | 12:41 |
MrClayPole_ | it was my typo before it does say "import_task" | 12:44 |
MrClayPole_ | it was my typo before it does say "import_tasks" | 12:44 |
noonedeadpunk | yes, so it should be include_tasks, not import_tasks :) | 12:51 |
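The edit being suggested here, as a minimal sketch. The log does not show the other keys on that task in `tasks/main.yml`, so only the keyword swap is illustrated, and this is the change being tested rather than a confirmed fix:

```yaml
# os_nova tasks/main.yml, around the linked line (illustrative only;
# any `when` or `tags` keys on the real task are omitted here)

# Before (static import, processed when the playbook is parsed):
# - import_tasks: nova_compute.yml

# After (dynamic include, processed at run time):
- include_tasks: nova_compute.yml
```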
MrClayPole_ | noonedeadpunk: Thanks, I seem to be struggling this morning, need to get more sleep tonight. I've made the changes and am running the os_nova playbook again. | 12:55 |
noonedeadpunk | Let me know about the result | 12:55 |
noonedeadpunk | I believe this is pretty valid bug | 12:56 |
jamesdenton | i confirm the bug. ran into it with victoria AIO yesterday | 13:00 |
jamesdenton | had to hardset nova_virt_type | 13:00 |
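Hard-setting the variable could look like the sketch below. The file name `user_variables.yml` is an assumption, since the log does not say where the value was set, and `kvm` is simply the value from the inventory discussed earlier; other environments may need a different one:

```yaml
# Deployer override file, e.g. user_variables.yml
# (assumed location, not stated in the log)
# Pins the virtualization type explicitly instead of relying on
# inventory group_vars for the templated include path.
nova_virt_type: kvm
```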
opendevreview | Merged openstack/openstack-ansible-os_neutron master: Add configuration option for heartbeat_in_pthread https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/833237 | 13:16 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible-os_cinder stable/xena: Add configuration option for heartbeat_in_pthread https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/833863 | 13:17 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible-os_cinder stable/wallaby: Add configuration option for heartbeat_in_pthread https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/833864 | 13:18 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible-os_nova stable/xena: Add configuration option for heartbeat_in_pthread https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/833865 | 13:18 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible-os_nova stable/wallaby: Add configuration option for heartbeat_in_pthread https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/833866 | 13:18 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible-os_neutron stable/xena: Add configuration option for heartbeat_in_pthread https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/833867 | 13:18 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible-os_neutron stable/wallaby: Add configuration option for heartbeat_in_pthread https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/833868 | 13:18 |
MrClayPole_ | noonedeadpunk: Just popped away for some lunch. Back now, It failed with the same error. | 13:30 |
noonedeadpunk | hm | 13:31 |
noonedeadpunk | I need to spawn aio then to check for possible ways to solve that | 13:32 |
MrClayPole_ | I'll back out that patch we just made and wait to hear from you. This is just a test environment on my side, so while I need to get it done it's not critical. | 13:33 |
MrClayPole_ | I'm happy to test once you have a patch :) | 13:33 |
MrClayPole_ | Thanks for the info jamesdenton. I'll hold off on the hardset unless I get pressure from my side to get it done. | 13:35 |
anskiy | noonedeadpunk: hey! you've mentioned yesterday about some mariadb connection drops. Currently I'm observing strange timeouts on X with 10.6.5. Looks like it happens when connection times out on server side and client tries to reconnect. Is it somehow related to your problem? | 14:40 |
noonedeadpunk | anskiy: um... yes but in fact it's different I believe | 14:47 |
noonedeadpunk | After further investigation I believe that the connection is dropped by haproxy for some reason; clients acknowledge that the connection is dropped and log "Server has gone away during query", but MariaDB thinks the connection is still active and drops it only on timeout | 14:49 |
noonedeadpunk | As a result, if a connection with an update statement is killed that way, the table remains locked until the timeout releases it | 14:50 |
noonedeadpunk | The only workaround we have for now is pointing to a specific mariadb container with port forwarding and disabling the haproxy frontend/backend. | 14:51 |
noonedeadpunk | And not sure what goes wrong with haproxy | 14:51 |
noonedeadpunk | at same time several of our regions are running same osa version without single issue | 14:52 |
anskiy | is it 10.6.x? I'm feeling a bit lazy about further investigating into my issue and thinking about just downgrading to 10.5.12 (which worked flawlessly). | 14:55 |
noonedeadpunk | yup, 10.6.5 | 14:56 |
noonedeadpunk | and you can't downgrade just in case | 14:56 |
noonedeadpunk | and the cause is not in mariadb either | 14:56 |
noonedeadpunk | or well, at least from what we see... | 14:57 |
noonedeadpunk | during 10.5 -> 10.6 upgrade mysql system tables are adjusted heavily | 14:57 |
noonedeadpunk | and there're really migrations that could be breaking | 14:57 |
noonedeadpunk | but well... | 14:58 |
noonedeadpunk | you can try:) | 14:58 |
anskiy | well, maybe I'm hitting some other bug... Thanks for the info, was going to check downgrade in Vagrant first anyways | 14:58 |
noonedeadpunk | it sounds suuuuper related to what we see just in case | 14:59 |
noonedeadpunk | anskiy: were you performing some kind of W->X upgrade? | 15:04 |
noonedeadpunk | Or just upgraded mariadb? | 15:04 |
anskiy | noonedeadpunk: during 23.1.2 I went from 10.5.6 to 10.5.12 (which still was deadlocking), then, prior to upgrading to 23.2.0, I upgraded to 10.5.13, which was fine. Then I upgraded to X and it was working okay (probably for a week or so) | 15:13 |
anskiy | there is no actual workload for now, I'm just occasionally poking at it. But deadlocking was so bad, it could've happened on a normal playbook run, when ansible was checking/registering services in Keystone | 15:15 |
anskiy | honestly, it could be anything else, but remembering previous problems with mariadb... | 15:15 |
noonedeadpunk | eventually with keystone - could it be that apache is hitting the connection limit? For test instances it's set to a pretty low value | 15:16 |
noonedeadpunk | as we never saw any lock with keystone. it is always smth that writes intensively to db, like octavia while checking health or neutron while updating ports | 15:17 |
noonedeadpunk | or nova ofc updating list of instances per computes | 15:17 |
anskiy | I can't remember now, but I'm pretty sure it was mariadb's fault. I still have those links to mariadb's Jira in user_variables SCM history :) | 15:20 |
opendevreview | Merged openstack/openstack-ansible-os_keystone master: Drop distributed_lock parameter https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/831786 | 15:20 |
anskiy | with identical symptoms | 15:21 |
noonedeadpunk | hm | 15:22 |
opendevreview | Merged openstack/openstack-ansible-os_gnocchi master: Add availability to define gnocchi_incoming_driver https://review.opendev.org/c/openstack/openstack-ansible-os_gnocchi/+/822905 | 15:22 |
noonedeadpunk | well let me know if downgrade to 10.5.13 will just work... | 15:22 |
jrosser | how many separate mariadb issues do we have | 15:25 |
jrosser | i'm not sure if the deadlock discussion here is the 10.5.6 fails-to-startup problem | 15:26 |
anskiy | noonedeadpunk: well, at first look, admin's password is no longer accepted now :) | 15:36 |
noonedeadpunk | well I have one pretty weird thing that I don't think even related to mariadb, as when we exclude haproxy from chain of connection things just work | 15:37 |
noonedeadpunk | but I can hardly imagine wtf could be with haproxy | 15:38 |
jrosser | we had a network switch flap yesterday here which completely upset haproxy wrt galera in a very unexpected way | 15:38 |
jrosser | andrewbonney: was going to look into that a bit as haproxy said the backend was down when it clearly wasn't | 15:38 |
jrosser | so there is certainly something suspect with the healthcheck | 15:39 |
noonedeadpunk | we don't have backend flapping :( | 15:39 |
anskiy | jrosser: yeah, it's not the same. What's that "fails-to-start" problem? Was it the one during installation on ubuntu/debian? | 15:40 |
opendevreview | Merged openstack/openstack-ansible-os_magnum master: Remove legacy policy.json cleanup handler https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/827444 | 15:41 |
jrosser | yes, where there was an internal "deadlock" inside mariadb that meant the service never started properly | 15:41 |
jrosser | there is an issue on the mariadb jira about that | 15:42 |
jrosser | iirc that's why we did 10.5.6 -> 10.5.12 even on a stable branch where we'd never normally touch the version outside major version upgrades | 15:42 |
*** dviroel|ruck is now known as dviroel|ruck|lunch | 16:02 | |
*** frenzy_friday is now known as frenzyfriday | 16:17 | |
opendevreview | Merged openstack/openstack-ansible master: Connect openstack_pki_regen_ca variable to pki role https://review.opendev.org/c/openstack/openstack-ansible/+/831242 | 16:21 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible stable/xena: Connect openstack_pki_regen_ca variable to pki role https://review.opendev.org/c/openstack/openstack-ansible/+/834017 | 16:40 |
opendevreview | Merged openstack/openstack-ansible master: Replace use of deprecated ANSIBLE_CALLBACK_WHITELIST https://review.opendev.org/c/openstack/openstack-ansible/+/829002 | 16:55 |
*** dviroel|ruck|lunch is now known as dviroel|ruck | 17:15 | |
*** arxcruz is now known as arxcruz|off | 17:18 | |
opendevreview | Merged openstack/openstack-ansible master: Set minimum and maximum microversions for manila api https://review.opendev.org/c/openstack/openstack-ansible/+/827560 | 17:34 |
opendevreview | Merged openstack/openstack-ansible stable/xena: Add test of used SHAs https://review.opendev.org/c/openstack/openstack-ansible/+/831031 | 17:34 |
*** dviroel|ruck is now known as dviroel|ruck|afk | 20:03 |