16:00:43 <noonedeadpunk> #startmeeting openstack_ansible_meeting
16:00:44 <openstack> Meeting started Tue Jan 26 16:00:43 2021 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:45 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:47 <openstack> The meeting name has been set to 'openstack_ansible_meeting'
16:00:59 <noonedeadpunk> #topic office hours
16:01:16 <noonedeadpunk> o/
16:01:29 <jrosser> hello
16:03:04 <noonedeadpunk> So, regarding renos. First of all we need to stop publishing new ones. I placed a bunch of PRs to cover that. Once that is done, I will go to the infra team about removing the already published ones
16:03:43 <noonedeadpunk> which means that we can just abandon changes like https://review.opendev.org/c/openstack/openstack-ansible-os_mistral/+/768663
16:04:19 <jrosser> yeah, we should do that.... the only one (if its there?) which should stay is on openstack-ansible i guess
16:04:59 <noonedeadpunk> yeah. Also I'm not sure if we should leave for ansible-hardening...
16:05:08 <noonedeadpunk> I've pushed PR but now I'm not so sure
16:05:23 <noonedeadpunk> considering repo is also tagless, then probably yes...
16:05:26 <jrosser> ok well -W any you are not sure about
16:07:33 <noonedeadpunk> that was the only one I think.
16:07:37 <jrosser> i think i may have figured out this tempestconf stuff
16:07:47 <noonedeadpunk> they are kind of broken anyway there
16:08:32 <noonedeadpunk> oh, rly? I tried to push some patch but it was not the cause https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/769966
16:08:37 <noonedeadpunk> or it was not the only one
16:08:49 <jrosser> yes i fiddled with that today
16:09:01 <noonedeadpunk> aha, yes, just noticed that
16:09:25 <jrosser> just seems i made a silly error with the command
16:09:42 <jrosser> [Errno 2] No such file or directory: '/root/workspace/etc/profile.yaml --insecure False'
16:10:00 <jrosser> like it takes the whole thing as the --profile parameter, which is strange
16:11:36 <noonedeadpunk> maybe it's because of ""
16:11:53 <jrosser> also https://opendev.org/openstack/openstack-ansible-os_tempest/src/branch/master/defaults/main.yml#L228 kind of not what i expected either
16:11:56 <noonedeadpunk> since they start before profile and end at the end
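The quoting theory above can be sketched in Python. This is only an illustration, assuming the command is split shell-style; the parser below is a hypothetical stand-in with two options named after the flags in the error message, not the real tempestconf command line.

```python
import argparse
import shlex


def parse(command_line: str) -> argparse.Namespace:
    """Split a command line the way a shell would, then parse it.

    The parser is a hypothetical stand-in, not the real tool.
    """
    parser = argparse.ArgumentParser(prog="tool")
    parser.add_argument("--profile")
    parser.add_argument("--insecure")
    # Drop the program name before handing the arguments to argparse.
    return parser.parse_args(shlex.split(command_line)[1:])


# Quotes opened before the profile path and closed at the very end:
# everything inside them becomes one single argument.
bad = parse('tool --profile "/root/workspace/etc/profile.yaml --insecure False"')

# Quotes closed right after the path: --insecure is a separate option again.
good = parse('tool --profile "/root/workspace/etc/profile.yaml" --insecure False')

print(repr(bad.profile), repr(bad.insecure))
print(repr(good.profile), repr(good.insecure))
```

In the first call the whole quoted string ends up as the value of --profile, which matches the "No such file or directory" path seen in the error.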
16:12:04 <jrosser> there must be a better variable for that
16:13:04 <noonedeadpunk> we use keystone_service_internaluri_insecure everywhere
16:13:16 <noonedeadpunk> or what do you mean?
16:13:43 <jrosser> i think i need a variable that talks about the external IP
16:14:24 <noonedeadpunk> I think we should just make tempestconf use internalurl for interaction?
16:14:53 <jrosser> well, your patch made it do that
16:15:02 <jrosser> it gets the service catalog from the internal endpoint
16:15:18 <jrosser> then uses the public entries in the catalog itself by the look of it
16:15:34 <noonedeadpunk> uh....
16:16:19 <jrosser> i think that may be what is happening here http://paste.openstack.org/show/801994/
16:16:23 <noonedeadpunk> there should be some extra I guess to select internal instead of public
16:16:35 <noonedeadpunk> actually the same issue spatel reported about senlin
16:16:55 <noonedeadpunk> that it connects to keystone through internal but takes public endpoints from catalog
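The behaviour described here (authenticate against the internal Keystone endpoint, then take public entries from the catalog) can be illustrated with a small sketch. The catalog below is a hypothetical, trimmed-down example in the general shape of a Keystone v3 service catalog, with made-up addresses; endpoint_for is an illustrative helper, not code from tempestconf or senlin.

```python
from typing import Dict, List, Optional

# Hypothetical catalog: each service lists one endpoint per interface.
CATALOG: List[Dict] = [
    {
        "type": "identity",
        "endpoints": [
            {"interface": "public", "url": "https://203.0.113.10:5000"},
            {"interface": "internal", "url": "http://172.29.236.101:5000"},
        ],
    },
    {
        "type": "compute",
        "endpoints": [
            {"interface": "public", "url": "https://203.0.113.10:8774/v2.1"},
            {"interface": "internal", "url": "http://172.29.236.101:8774/v2.1"},
        ],
    },
]


def endpoint_for(catalog: List[Dict], service_type: str,
                 interface: str = "public") -> Optional[str]:
    """Return the URL of the first endpoint matching type and interface."""
    for service in catalog:
        if service["type"] != service_type:
            continue
        for endpoint in service["endpoints"]:
            if endpoint["interface"] == interface:
                return endpoint["url"]
    return None


# A client that defaults to "public" ends up on the external address,
# even if the catalog itself was fetched over the internal endpoint.
print(endpoint_for(CATALOG, "compute"))
print(endpoint_for(CATALOG, "compute", interface="internal"))
```

A client only stays on the internal network if the interface is passed explicitly for every catalog lookup, not just for the initial authentication.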
16:17:29 <jrosser> i can't otherwise see how it has discovered the IP of eth0
16:17:35 <spatel> Yes, it's a known issue and may require a senlin code change
16:18:03 <noonedeadpunk> I'm pretty sure you're right about tempestconf picking up public endpoint
16:18:14 <jrosser> anyway, my hope was that adding --insecure would make it not worry about the certificate
16:18:27 <jrosser> i think it may be legitimate behaviour as tempest kind of pretends to be an end user
16:18:29 <noonedeadpunk> yeah
16:18:41 <noonedeadpunk> will see...
16:18:42 <spatel> senlin is acting like an end-user and using all public endpoints by default
16:18:51 <jrosser> there are two things, making the tempestconf discovery not worry about the cert
16:19:15 <jrosser> then it has to properly write out a tempest.conf that *also* doesn't worry about the cert for the actual tests
16:19:51 <jrosser> seems only recently this is possible https://opendev.org/osf/python-tempestconf/commit/f146f810695e83d2a8ce0fcdb94ff32e75ebdb20
16:20:40 <noonedeadpunk> ok, we can set verify: false
16:21:42 <jrosser> do you know where we would do that?
16:21:51 <jrosser> the documentation for this is kind of sparse
16:22:15 <openstackgerrit> Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Use internal endpoint for tempestconf and respect tempest_keystone_interface_insecure  https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/769966
16:22:37 <noonedeadpunk> maybe add to tempest_tempestconf_profile ?
16:23:13 <jrosser> as far as i can see you can't specify this in the profile
16:24:15 <jrosser> i think that this must be picking up a clouds.yaml from somewhere
16:25:00 <jrosser> anyway, we maybe should move on a bit?
16:25:08 <noonedeadpunk> where does it take auth creds from?
16:25:12 <noonedeadpunk> clouds.yaml?
16:25:20 <noonedeadpunk> yeah, let's move
16:25:36 <jrosser> andrewbonney is doing a U->V upgrade in our lab today
16:26:06 <noonedeadpunk> We've done T->V in lab and it went beautifully
16:26:24 <noonedeadpunk> Planning to do it in prod next week
16:26:32 <jrosser> we have a bunch of issues and will make some patches
16:26:48 <jrosser> actually some already done
16:27:48 <jrosser> maybe most surprising was something during setup-hosts restarting all the api containers at the same time
16:27:54 <MickyMan77> jrosser: can you help out, I'm not able to start, stop or restart mysql in the galera_container. It just hangs.. I can't find any log about the issue...
16:29:15 <jrosser> some releasenotes may be missing too, like bind-to-mgmt means rabbitmq containers need /etc/hosts fixing up a bit
16:29:17 <openstackgerrit> Gaudenz Steinlin proposed openstack/openstack-ansible-os_cinder stable/ussuri: Define credentials for nova interaction  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/772539
16:30:00 <openstackgerrit> Merged openstack/openstack-ansible master: Use TCP mode for console if SSL is configured  https://review.opendev.org/c/openstack/openstack-ansible/+/574153
16:30:25 <jrosser> did we have a bug / explanation for the designate pool UUID issue?
16:31:17 <andrewbonney> jrosser:  fwiw our designate issue is this one: https://bugs.launchpad.net/designate/+bug/1897936
16:31:19 <openstack> Launchpad bug 1897936 in Designate "Pool update fails when zones exist" [Undecided,Fix released] - Assigned to Mark Goddard (mgoddard)
16:31:57 <noonedeadpunk> No, I can't remember if we have one
16:32:41 <noonedeadpunk> but it was quite silly that we used a config param that was not present in designate anymore
16:32:59 <noonedeadpunk> what caused containers to restart?
16:33:07 <noonedeadpunk> have you upgraded to focal?
16:33:27 <admin0> MickyMan77, is the container broken or only mysql ?
16:33:37 <spatel> I thought the designate issue had been resolved, i have deployed a couple of times with the UUID patch and no issues so far
16:33:47 <admin0> if container is broken, is it only 1 container, or 3 containers ( if you are in HA setup)
16:33:47 <jrosser> andrewbonney: looks like that patch was reverted https://review.opendev.org/c/openstack/designate/+/755429
16:35:06 <noonedeadpunk> btw is it smth you mentioned about rabbitmq? https://bugs.launchpad.net/openstack-ansible/+bug/1824857
16:35:07 <openstack> Launchpad bug 1824857 in openstack-ansible "Rabbitmq join cluster fail" [Undecided,New]
16:36:00 <openstackgerrit> Merged openstack/openstack-ansible-apt_package_pinning master: [reno] Stop publishing release notes  https://review.opendev.org/c/openstack/openstack-ansible-apt_package_pinning/+/772007
16:36:11 <jrosser> yes, as part of bind-to-mgmt we had 3 patches about that
16:36:58 <noonedeadpunk> have we missed smth from V?
16:37:26 <prometheanfire> has anyone upgraded from ussuri to victoria?
16:37:59 <jrosser> noonedeadpunk: currently this has to be dealt with manually https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/670706 https://review.opendev.org/c/openstack/openstack-ansible-lxc_container_create/+/670705 https://review.opendev.org/c/openstack/openstack-ansible/+/670392
16:38:48 <jrosser> prometheanfire: we discuss this right now for the last 10 mins :)
16:40:11 <noonedeadpunk> I think https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/670706 should not be relevant now?
16:40:47 <noonedeadpunk> or dunno...
16:40:56 <noonedeadpunk> not quite understand why issue is raised now...
16:41:43 <jrosser> i think the activation of bind-to-mgmt across everything comes in for V
16:42:02 <noonedeadpunk> we now generate hosts file quite differently with blockinfile
16:42:11 <jrosser> that is true
16:42:16 <jrosser> do you have a link to that patch?
16:42:16 <noonedeadpunk> so we just need to drop everything except this block?
16:42:35 <noonedeadpunk> https://opendev.org/openstack/openstack-ansible-openstack_hosts/commit/c64e1caf72c20a2ffcce7b1d92e8b8cc8093a808
16:42:41 <jrosser> right so even blockinfile won't remove stuff thats wrong
16:43:59 <jrosser> i think our issue was the hosts files are long lived and had accumulated stuff from many releases
16:44:45 <noonedeadpunk> that's interesting. I think I might have exactly the same issue
16:45:02 <prometheanfire> jrosser: good timing :D
16:45:25 <noonedeadpunk> so eventually dropping all except generated block should help it?
16:46:13 <jrosser> maybe a bit too much to assume the deployer has not put things in there
16:46:21 <jrosser> not sure OSA 'owns' the whole file
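Why a managed block does not clean up long-lived hosts files can be shown with a simplified model of marker-based editing. This is a sketch in the spirit of blockinfile, not Ansible's implementation; the marker strings, host names and addresses are all hypothetical.

```python
from typing import List

BEGIN = "# BEGIN MANAGED BLOCK"
END = "# END MANAGED BLOCK"


def update_managed_block(text: str, block_lines: List[str]) -> str:
    """Replace only the lines between the markers; append if absent.

    Anything outside the markers is left exactly as it was.
    """
    lines = text.splitlines()
    new_block = [BEGIN, *block_lines, END]
    if BEGIN in lines and END in lines:
        start = lines.index(BEGIN)
        end = lines.index(END)
        lines[start:end + 1] = new_block
    else:
        lines.extend(new_block)
    return "\n".join(lines) + "\n"


# Hypothetical hosts file that accumulated an entry from an older release.
hosts = (
    "127.0.0.1 localhost\n"
    "10.0.0.5 rabbit1  # stale entry from an older release\n"
    f"{BEGIN}\n"
    "172.29.236.10 rabbit1\n"
    f"{END}\n"
)

updated = update_managed_block(hosts, ["172.29.236.11 rabbit1"])
print(updated)
```

The entry inside the markers is refreshed, while the stale line above them survives, so removing it stays a manual (or deployer-owned) step.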
16:48:02 <openstackgerrit> Merged openstack/openstack-ansible-plugins master: [reno] Stop publishing release notes  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/772054
16:49:26 <noonedeadpunk> nah, ofc it's not what I'm thinking to script, but actually how it might be solved _when_ we will face it here:)
16:50:40 <jrosser> i think it was a trivial fix manually once we figured what was going on
16:52:38 <noonedeadpunk> yeah...
16:53:41 <noonedeadpunk> I'm still not sure what caused bug https://bugs.launchpad.net/openstack-ansible/+bug/1824857 since what you said should not be the issue for clean deployments
16:53:42 <openstack> Launchpad bug 1824857 in openstack-ansible "Rabbitmq join cluster fail" [Undecided,New]
16:53:54 <noonedeadpunk> Will try to set up 3 nodes sandbox tomorrow
16:54:00 <openstackgerrit> Merged openstack/openstack-ansible-lxc_container_create master: [reno] Stop publishing release notes  https://review.opendev.org/c/openstack/openstack-ansible-lxc_container_create/+/772013
16:54:53 * jrosser just looking at bugs
16:58:29 <noonedeadpunk> the reporter of https://bugs.launchpad.net/openstack-ansible/+bug/1911482 has come here on IRC and said that it was an issue in inventory or smth
16:58:30 <openstack> Launchpad bug 1911482 in openstack-ansible "neutron-l3-agent broken after train upgrade" [Undecided,New]
16:58:55 <noonedeadpunk> I'm wondering if we can move it to incomplete or invalid...
17:00:08 <jrosser> comment "resolved via IRC" and incomplete sounds good
17:01:50 <noonedeadpunk> Should we do anything regarding https://bugs.launchpad.net/openstack-ansible/+bug/1877421 ?
17:01:52 <openstack> Launchpad bug 1877421 in openstack-ansible "Cinder-volume is not able to recognize a ceph cluster on OpenStack Train." [Undecided,Confirmed]
17:05:42 <jrosser> oh,
17:06:06 <jrosser> wasn't there something magical about RBD, in that you couldn't ever have ceph@RBD because it's some kind of keyword?
17:06:21 <jrosser> there was a huge long irc thread about this some time ago
17:06:35 <noonedeadpunk> yeah, I can recall smth like that
17:06:53 <jrosser> as soon as you use ceph@some-other-pool-name it's all ok
17:07:24 <noonedeadpunk> I'm wondering if we have defined smth like this in docs?
17:08:14 <noonedeadpunk> oh....
17:08:16 <noonedeadpunk> #endmeeting