Tuesday, 2024-05-14

opendevreviewMerged openstack/openstack-ansible-galera_server master: Add distro infra jobs  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/91469102:35
opendevreviewMerged openstack/openstack-ansible-os_octavia master: Implement variables to address oslo.messaging improvements  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/91906902:56
semanticHello, everyone! So I've been testing this https://github.com/openstack/openstack-ansible/commit/d4530e242db7c45c10729123be8d7a8fbab38296 and it seems to me that even with it I still have the problem of services stuck in 'waiting for message' state. I slowly take infra nodes down one at a time, then turn them back on again, and eventually the nova-compute service on some host, or neutron-ovs-agent, gets stuck. I got similar 06:49
semanticbehaviour when trying to use quorum queues... Maybe someone could suggest any ideas? The Rabbit cluster itself reports as healthy with no network partitions when any of the nodes is down.06:49
noonedeadpunkhey07:28
noonedeadpunksemantic: well, for caracal we're landing a bunch of improvements for quorum queues behaviour (which become available only on caracal)07:30
noonedeadpunkbut for the mentioned commit to be respected, you'd need to run pretty much all roles, as the policy is applied per vhost07:31
noonedeadpunkso the question is - how were you testing that?07:31
jrosser_semantic: for https://github.com/openstack/openstack-ansible/commit/d4530e242db7c45c10729123be8d7a8fbab38296 specifically we could ask andrewbonney - he did a lot of work on our rabbitmq related to that07:37
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: [doc] Rename extending-osa page  https://review.opendev.org/c/openstack/openstack-ansible/+/91507807:49
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: [doc] Add information about hook playbooks to the 'extending-osa' docs  https://review.opendev.org/c/openstack/openstack-ansible/+/91955508:00
opendevreviewMerged openstack/openstack-ansible-os_blazar master: Define lock directory for oslo_concurrency  https://review.opendev.org/c/openstack/openstack-ansible-os_blazar/+/91906108:57
opendevreviewMerged openstack/openstack-ansible-os_blazar master: Implement variables to address oslo.messaging improvements  https://review.opendev.org/c/openstack/openstack-ansible-os_blazar/+/91799908:57
opendevreviewMerged openstack/openstack-ansible-os_gnocchi master: Ensure Gnocchi is connected to MySQL coordination with TLS  https://review.opendev.org/c/openstack/openstack-ansible-os_gnocchi/+/91803609:01
noonedeadpunkI think worth adding logic there to add zookeeper for coordination as well ^09:19
opendevreviewMerged openstack/openstack-ansible-os_skyline master: Add designate and masakari to service mapping  https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/91952309:25
opendevreviewMerged openstack/openstack-ansible stable/2023.1: Bump SHAs for 2023.1  https://review.opendev.org/c/openstack/openstack-ansible/+/91906609:39
opendevreviewMerged openstack/openstack-ansible-os_skyline master: Reflect keystone service variables in config  https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/91816009:48
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_skyline master: Switch nginx with Apache  https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/91952910:12
semanticspacesWell, technically i basically do a new cluster installation over empty hosts and then run tests. And I'm little confused with amqp_durable_queues = true, osa does not include this in config, while kolla does. Though it does not seem to change anything in my case.10:15
noonedeadpunksemanticspaces: so improvements I've talked about are these: https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/919059/2/templates/neutron.conf.j210:20
noonedeadpunknot everything is merged yet10:20
noonedeadpunksemanticspaces: also amqp_durable_queues is noop when quorum queues are used10:21
noonedeadpunkhttps://github.com/openstack/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_rabbit.py#L169-L17110:21
jrosser_semanticspaces: what version of openstack-ansible are you testing?10:23
semanticspaces28.2.010:24
noonedeadpunksemanticspaces: if you're playing with setups anyway... can you check out current master?:)10:29
noonedeadpunkas it seems - all core services already do have these quorum improvements merged10:29
noonedeadpunk(except telemetry, magnum, ironic, manila and zun)10:29
semanticspacesi can check master, but we only install nova,glance,neutron,placement,ceilometer,horizon...if i do this what settings should i configure? oslomsg_rabbit_quorum_queues: True only? 10:33
noonedeadpunkyeah10:39
noonedeadpunkall rest should be implied from that10:39
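A minimal sketch of the override discussed above, assuming it is placed in the deployment's user_variables.yml. The file path is an assumption; only the variable name comes from the conversation.

```yaml
# /etc/openstack_deploy/user_variables.yml (assumed location)
# Variable name as given in the conversation; per noonedeadpunk, the
# remaining quorum-related settings should be implied from it.
oslomsg_rabbit_quorum_queues: True
```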
noonedeadpunkok, so changes for ceilometer were not merged yet10:40
noonedeadpunkhttps://review.opendev.org/c/openstack/openstack-ansible-os_ceilometer/+/91810710:40
jrosser_we have this all setup via SCENARIO env var for all-in-ones too?10:40
noonedeadpunkbut the rest are10:40
noonedeadpunkwe do, yes10:40
noonedeadpunkI think `quorum` is a keyword10:41
noonedeadpunkhttps://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/templates/user_variables.aio.yml.j2#L362-L36410:41
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: [doc] Expand documentation on OVN useful commands  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/91358810:56
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone master: Change example to contain domain name instead of UUID  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/91956311:41
opendevreviewMerged openstack/openstack-ansible-os_aodh master: Add service policies defenition  https://review.opendev.org/c/openstack/openstack-ansible-os_aodh/+/91794712:26
spotz[m]noonedeadpunk: You mind if I fix some grammar vs comments on that doc?13:39
noonedeadpunkspotz[m]: I never mind that13:55
spotz[m]Ok patch up in a few14:01
opendevreviewAmy Marrich proposed openstack/openstack-ansible-os_neutron master: [doc] Expand documentation on OVN useful commands  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/91358814:03
noonedeadpunkthanks so much14:05
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_cinder master: Fix rootwrap files idempotency  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/90910414:26
opendevreviewMerged openstack/openstack-ansible-os_cinder master: reno: Update master for unmaintained/zed  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/91915314:44
noonedeadpunk#startmeeting openstack_ansible_meeting15:00
opendevmeetMeeting started Tue May 14 15:00:17 2024 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'openstack_ansible_meeting'15:00
noonedeadpunk#topic office hours15:00
noonedeadpunkwell, it should have been rollcall... but whatever :D15:00
damiandabrowskihi!15:01
noonedeadpunkhuge chunk of https://review.opendev.org/q/topic:%22osa/messaging_improvements%22 has been merged 15:02
noonedeadpunkthanks jrosser_ and andrewbonney for taking time on reviewing that!15:02
jrosser_o/ hello15:02
jrosser_no worries15:02
noonedeadpunksome roles are broken, so I was planning to look deeper during the week about reasons15:02
jrosser_there are some awkward bits left i think15:02
jrosser_but tbh this is not terrible, because these patches for messaging give us a good health check before we release15:03
noonedeadpunkyeah15:03
noonedeadpunkI was expecting a worse situation, kinda15:03
noonedeadpunkI actually tried to add tempest tests for trove, but it expects for datastores to be created outside of tempest runtime15:04
noonedeadpunkwhich is non-trivial due to missing ansible modules for that15:04
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: [doc] Add information about hook playbooks to the 'extending-osa' docs  https://review.opendev.org/c/openstack/openstack-ansible/+/91955515:06
noonedeadpunkwe also have quite some failures for upgrade jobs, which are mainly intermittent, but highly annoying15:06
jrosser_seems to be some amount of tls / rocky / upgrade failures15:07
jrosser_i did look in some logs and it was pretty hard to find anything specific15:07
noonedeadpunkyeah, also haven't found anything too obvious15:07
noonedeadpunkrather than potentially some nodes just being slow or smth...15:08
NeilHanlono/ hiya15:08
NeilHanloni am going to check on repos for rocky today; make sure there aren't bad entries still being served. i.e, mirrors which are not serving 9.415:08
jrosser_tbh i have not seen repo specific errors for ~ a week at least15:09
noonedeadpunkactually, another part we've recently realized is annoying - is that volume types are created regardless of whether the operator wants to manage them through osa15:09
jrosser_just somehow our rocky CI feels less stable than other things15:09
noonedeadpunkor well, naming of default one is really opinionated15:09
NeilHanlonjrosser_: ack; well, i'll still do that, but will check into CI generally and see if I can find anything common15:09
noonedeadpunkto be completely frank - I was spotting issues with Ubuntu as well15:10
jrosser_yep, i also saw what looked like sqla troubles too15:10
noonedeadpunkout of capi topic - one patch left: https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/91664715:12
noonedeadpunkand actually - magnum feels tough on passing ci lately :(15:13
jrosser_yes, that is not super critical, but it's needed to make a bunch of jobs to test all the k8s versions15:13
jrosser_yes it does15:13
noonedeadpunkie - there're some capi intermittent issues15:13
jrosser_and i think there is brokenness in vexxhost nodepool too15:13
noonedeadpunkon top of upgrade failures15:13
jrosser_^ mnaser 15:14
jrosser_yeah i agree that the capi job has failed for unspecific reasons15:14
jrosser_though surprisingly we already collect a ton of logs from the control plane which should make it possible to see whats happening15:14
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_magnum master: Move insecure param to keystone_auth section  https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/90511015:15
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Implement support for octavia-ovn-provider driver  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/86846215:16
noonedeadpunkoctavia-ovn-provider - I still failed to work closer on testing this path :(15:16
jrosser_hmm yes another NODE_FAILURE - thats because we need the 32G flavor for capi tests15:17
jrosser_and something is broken with those15:17
noonedeadpunkor well, there's probably something validly broken with ovn driver in octavia in our aio15:18
noonedeadpunkor well... a different set of tempest tests should run for ovn: https://96de7bed307fb2a6a065-7f155e7c59383dfa4a196ee8803910a9.ssl.cf1.rackcdn.com/868462/16/check/openstack-ansible-deploy-aio_lxc_ovnprovider-ubuntu-jammy/de203dd/logs/openstack/aio1-utility-container-3ffa4338/utility/stestr_results.html15:19
noonedeadpunkas it's getting just `Got NotImplemented error` 15:19
noonedeadpunkwhich is fair....15:19
noonedeadpunkwill try to talk to octavia folks to see how we can test that15:19
noonedeadpunkand if they do have any tempest for that at all..15:20
jrosser_maybe thats just not valid to let octavia tempest go against the ovn provider15:20
noonedeadpunkyeah, but then how to test it...15:20
jrosser_there might be some clue in how neutron set up to test that15:20
jrosser_(i assume this happens.....)15:20
jrosser_perhaps some already existing skip list we can use for example15:21
mgariepyisn't it only that the flavor in octavia is not needed for ovn?15:26
mgariepyhttps://docs.openstack.org/ovn-octavia-provider/latest/admin/driver.html15:26
mgariepy* knows nothing about octavia tho. haha15:26
noonedeadpunkwell, ovn also is only l4 balancing... and only source_ip_port algo iirc15:27
mgariepyDetails: b'{"faultcode": "Server", "faultstring": "Provider \'ovn\' does not support a requested action: This provider does not support validating flavors.", "debuginfo": null}'15:28
noonedeadpunkyeah, which is fair...15:28
noonedeadpunkok, will check on that a bit later15:28
* noonedeadpunk looking through https://etherpad.opendev.org/p/osa-dalmatian-ptg15:29
noonedeadpunkovn-bgp-agent merged...15:29
noonedeadpunkinactive projects are removed...15:29
noonedeadpunkEOM branches...15:29
noonedeadpunkEOM branches are pita...15:30
noonedeadpunkWe outdue to create Zed EOM15:30
noonedeadpunk*overdue15:30
jrosser_we also need to choose if we work in the unmaintained branches at all15:30
noonedeadpunkyeah15:30
jrosser_victoria is nearly OK, and i was planning to work back toward the maintained ones15:31
noonedeadpunkI guess this depends... but for work on them, they should be revived first and I failed to follow on that15:31
johnsomjrosser_ The Octavia tempest plugin will run against the OVN provider, however most of the tests will "skip" as the OVN provider doesn't support many of the features of Octavia. There are gate jobs that run with it.15:31
jrosser_but just not managed to make progress15:31
opendevreviewMerged openstack/openstack-ansible-os_aodh master: Add variable to globally control notifications enablement  https://review.opendev.org/c/openstack/openstack-ansible-os_aodh/+/91794815:31
noonedeadpunkjohnsom: well, we kinda try to run just octavia_tempest_plugin.tests.scenario.v2.test_load_balancer15:32
jrosser_and this happens https://96de7bed307fb2a6a065-7f155e7c59383dfa4a196ee8803910a9.ssl.cf1.rackcdn.com/868462/16/check/openstack-ansible-deploy-aio_lxc_ovnprovider-ubuntu-jammy/de203dd/logs/openstack/aio1-utility-container-3ffa4338/utility/stestr_results.html15:32
noonedeadpunkwhich fails... but I guess I need to check what you're running in gates15:32
noonedeadpunkalso, looking at test list... I wonder if we kinda want to always run `tempest.scenario.test_server_basic_ops.TestServerBasicOps`?15:33
johnsomI know what it is, one second15:33
jrosser_i have to go to another meeting - but andrewbonney was interested in feedback on https://review.opendev.org/q/topic:%22osa/rmq-migrate%2215:33
noonedeadpunkoh, yes15:33
jrosser_particularly fixing up the tags, and whats best approach15:33
johnsomjrosser_ https://github.com/openstack/octavia-tempest-plugin/blob/master/zuul.d/jobs.yaml#L110015:34
jrosser_i did discuss it with him and we could not see the point in tags like 'nova' - and the behaviour of them is just very weird right now15:34
jrosser_johnsom: ahha!15:34
opendevreviewMerged openstack/openstack-ansible-os_trove master: Add variable to globally control notifications enablement  https://review.opendev.org/c/openstack/openstack-ansible-os_trove/+/91722615:35
johnsomWe are strict with the main test jobs15:35
noonedeadpunkok, that explains it15:35
noonedeadpunkI wonder if anything will run though15:35
johnsomMostly just a few TCP and UDP tests work with OVN15:36
opendevreviewMerged openstack/openstack-ansible-os_trove master: Implement variables to address oslo.messaging improvements  https://review.opendev.org/c/openstack/openstack-ansible-os_trove/+/91799715:36
noonedeadpunkso we have just that right now: https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/templates/user_variables_octavia.yml.j2#L1415:38
opendevreviewMerged openstack/openstack-ansible-os_magnum master: reno: Update master for unmaintained/zed  https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/91917115:38
noonedeadpunkas we pretty much want just bare minimal thing to see if it's basically operational15:39
noonedeadpunkwe don't want to take your job by re-running all tests available :D15:39
mgariepyi guess that we need to set: not_implemented_is_error: False15:39
johnsomYep, understandable. Yeah, I think all you need is to set that tempest variable15:39
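A sketch of how that tempest variable might be set through OpenStack-Ansible. The `tempest_tempest_conf_overrides` variable (os_tempest role) and the `loadbalancer-feature-enabled` section (octavia-tempest-plugin) are assumptions not stated in the conversation; only `not_implemented_is_error: False` is.

```yaml
# Assumed override variable and tempest.conf section name; verify against
# the os_tempest role and octavia-tempest-plugin before relying on this.
tempest_tempest_conf_overrides:
  loadbalancer-feature-enabled:
    # Treat NotImplemented responses from the provider as skips, not failures
    not_implemented_is_error: False
```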
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Do not fail on NotImplemented tests for OVN Octavia  https://review.opendev.org/c/openstack/openstack-ansible/+/91959915:40
noonedeadpunklet's see:)15:40
johnsomHere is a list of scenario tests run with OVN: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_833/917076/1/check/neutron-ovn-provider-v2-scenario/83362d1/testr_results.html15:40
noonedeadpunkthanks!15:40
noonedeadpunkok, so  octavia_tempest_plugin.tests.scenario.v2.test_load_balancer.LoadBalancerScenarioTest is part of that, amazing15:41
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Implement support for octavia-ovn-provider driver  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/86846215:41
opendevreviewMerged openstack/openstack-ansible-os_aodh master: Implement variables to address oslo.messaging improvements  https://review.opendev.org/c/openstack/openstack-ansible-os_aodh/+/91794915:48
noonedeadpunkso, looking at ptg doc, we're pretty much done?15:53
noonedeadpunkexcept renaming of groups....15:53
noonedeadpunkbut I think, we will do that for the next release....15:54
noonedeadpunkand potentially - if we wanna replace nginx with apache for Skyline? https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/91952915:54
NeilHanlonI'm pretty okay with it. would rather have haproxy + one other thing, not three15:55
noonedeadpunk#endmeeting16:11
opendevmeetMeeting ended Tue May 14 16:11:54 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:11
opendevmeetMinutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-05-14-15.00.html16:11
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-05-14-15.00.txt16:11
opendevmeetLog:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-05-14-15.00.log.html16:11
noonedeadpunkNeilHanlon: yeah, the only place where nginx is left - repo container16:12
noonedeadpunkworth to be replaced as well I guess...16:12
NeilHanlonjust in time to move everything to Caddyserver! :P 16:14
noonedeadpunkhahaha16:14
opendevreviewMerged openstack/openstack-ansible-os_glance master: Add qos_prefetch_count to variables  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/91908716:35
noonedeadpunkregarding https://review.opendev.org/q/topic:%22osa/rmq-migrate%22 - it looks quite fair to me16:41
noonedeadpunkandrewbonney: ^16:41
andrewbonneyThanks. I'll sort out the full set of patches in the next day or two16:41
noonedeadpunkbut, I think for things like nova - you might need some extra tasks16:41
noonedeadpunklike to detect virt type16:41
noonedeadpunkah, but it has `always`16:42
noonedeadpunkso yeah, not needed :)16:42
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_manila master: DNM  https://review.opendev.org/c/openstack/openstack-ansible-os_manila/+/91960416:45
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_manila master: Add quorum queues support for service  https://review.opendev.org/c/openstack/openstack-ansible-os_manila/+/89891416:46
opendevreviewMerged openstack/openstack-ansible-os_manila master: reno: Update master for unmaintained/zed  https://review.opendev.org/c/openstack/openstack-ansible-os_manila/+/91917317:06
noonedeadpunkok so manila fails on ceph-ansible with `nfs-ganesha : Depends: liburcu6 but it is not installable`19:08
noonedeadpunkas it's liburcu8 on jammy19:10
