16:00:14 <noonedeadpunk> #startmeeting openstack_ansible_meeting
16:00:14 <openstack> Meeting started Tue Feb  2 16:00:14 2021 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:17 <openstack> The meeting name has been set to 'openstack_ansible_meeting'
16:00:31 <noonedeadpunk> #topic office hours
16:00:38 <noonedeadpunk> \o/
16:02:24 <noonedeadpunk> So, with the release of amqp 5.0.5, it seems we need to speed up on the SSL topic
16:02:39 <noonedeadpunk> I tried to cover some of the comments raised for https://review.opendev.org/c/openstack/openstack-ansible-specs/+/758805
16:02:46 <noonedeadpunk> but maybe let's discuss them?
16:03:22 <noonedeadpunk> `What to do when certs expire (including the root cert) and how that should be managed, including their default lifetimes`
16:03:58 <noonedeadpunk> I'd say that we should just have a flag that will force the role to renew the certificate or root CA
16:05:15 <noonedeadpunk> I don't feel like we should be watching expiration dates out of the box atm, considering we should be able to work with user-provided certificates and Let's Encrypt where applicable
16:05:52 <noonedeadpunk> But yes, we totally need a valid mechanism for updating the root CA without the cluster getting stuck because of that update
16:06:35 <noonedeadpunk> also imo a cert revocation system is kind of overkill at the moment as well
16:07:04 <noonedeadpunk> it's probably a nice-to-have feature, but given where we are at the moment and what needs to be done overall it's kind of...
16:07:26 <noonedeadpunk> Btw I already asked for repo creation https://review.opendev.org/q/topic:%22osa%252Fpki%22+(status:open%20OR%20status:merged)
16:09:54 <jrosser> o/ hello
16:10:01 <andrewbonney> For expiry I guess the main detail is making sure the expiring one remains trusted whilst rollover happens
16:10:47 <jrosser> the root CA would be extremely long lived, and an intermediate is the more likely thing to need rotating?
16:11:27 <jrosser> so probably two different things: rotating the root CA (very, very infrequent unless there's a security incident with it)
16:11:43 <noonedeadpunk> yes, totally, 2 different flags
16:12:07 <jrosser> re-issue service cert + intermediate bundle against a new intermediate, and that should be much much easier than rotating a root
16:12:16 <noonedeadpunk> but again with the old root CA remaining trusted even when the new one is in place
16:13:00 <noonedeadpunk> because otherwise I don't see how to update the root. It was a super clever suggestion I was not aware of
16:13:22 <noonedeadpunk> I mean https://tools.ietf.org/html/rfc4210#section-4.4
16:14:54 <noonedeadpunk> hm, I guess I'm a bit lost in the terminology :( So the intermediate is the root CA, and the "root" is the private key, right?
16:15:31 <noonedeadpunk> or are you talking about something extra? I guess the intermediate is an addition to the CA one?
16:16:06 <jrosser> generally certs for services are not signed directly with the private key of the root CA
16:16:28 <jrosser> the only thing you use that for is to generate an intermediate CA cert/key, and you can have as many of those as you like
16:16:55 <noonedeadpunk> and you issue certificates with intermediate ones?
16:17:02 <jrosser> which is good, because you can revoke/change an intermediate whenever you like without affecting the trust of stuff signed from a different intermediate
16:17:15 <jrosser> it's like a tree
16:17:15 <noonedeadpunk> I just never dug deep into how certs are issued on the provider side
16:17:33 <jrosser> that's why generally you make the root CA valid for a very, very long time
16:18:04 <jrosser> but you can make the lifetime of the intermediates shorter, and the pain of rolling them is really much smaller than if you wanted to roll the entire root CA
16:18:19 <noonedeadpunk> I'm not sure if you can put a CA in the trust store... I guess you can?
16:18:37 <jrosser> oh absolutely, that's pretty much what it contains
16:19:13 <noonedeadpunk> Just in the sense that we won't need to provide the intermediate chain to the services, since they will be trusted system-wide?
16:19:47 <noonedeadpunk> ok, I guess I got the idea. Need to read more anyway
16:20:06 <jrosser> sure, well I think we should write more and maybe test some of this
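(To make the root-vs-intermediate split above concrete, here is a minimal sketch using the community.crypto collection. Every path, name, and lifetime is an illustrative assumption, not part of the eventual role.)

```yaml
# Illustrative sketch only: a long-lived self-signed root CA plus a
# shorter-lived intermediate signed by it; paths/lifetimes are assumptions.
- hosts: localhost
  tasks:
    - name: Create the root CA private key
      community.crypto.openssl_privatekey:
        path: /etc/pki/osa/root-ca.key

    - name: Create a CSR for the root CA (CA:TRUE so it can sign intermediates)
      community.crypto.openssl_csr:
        path: /etc/pki/osa/root-ca.csr
        privatekey_path: /etc/pki/osa/root-ca.key
        common_name: "Example OSA Root CA"
        basic_constraints: ['CA:TRUE']
        basic_constraints_critical: true
        key_usage: ['keyCertSign', 'cRLSign']

    - name: Self-sign the root CA with a very long lifetime
      community.crypto.x509_certificate:
        path: /etc/pki/osa/root-ca.crt
        csr_path: /etc/pki/osa/root-ca.csr
        privatekey_path: /etc/pki/osa/root-ca.key
        provider: selfsigned
        selfsigned_not_after: "+7300d"  # ~20 years

    - name: Create the intermediate CA private key
      community.crypto.openssl_privatekey:
        path: /etc/pki/osa/intermediate.key

    - name: Create a CSR for the intermediate CA
      community.crypto.openssl_csr:
        path: /etc/pki/osa/intermediate.csr
        privatekey_path: /etc/pki/osa/intermediate.key
        common_name: "Example OSA Intermediate CA"
        basic_constraints: ['CA:TRUE', 'pathlen:0']
        basic_constraints_critical: true
        key_usage: ['keyCertSign', 'cRLSign']

    - name: Sign the intermediate with the root (shorter-lived, cheap to rotate)
      community.crypto.x509_certificate:
        path: /etc/pki/osa/intermediate.crt
        csr_path: /etc/pki/osa/intermediate.csr
        provider: ownca
        ownca_path: /etc/pki/osa/root-ca.crt
        ownca_privatekey_path: /etc/pki/osa/root-ca.key
        ownca_not_after: "+1825d"  # ~5 years
```

Service certs would then be issued from the intermediate and shipped together with it as a bundle, so only the root certificate ever needs to live in the system trust store.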
16:22:59 <jrosser> there doesn't seem to be anything too major we have missed from the comments
16:23:12 <noonedeadpunk> yeah, I guess so
16:24:04 <noonedeadpunk> but all the comments were really valid though
16:25:47 <noonedeadpunk> regarding hardening - it seems I got the role unstuck https://review.opendev.org/c/openstack/ansible-hardening/+/771481
16:26:17 <noonedeadpunk> but I'm not sure about one thing there, which makes the role compatible with Ansible 2.10 and later only
16:26:52 <noonedeadpunk> which is the `truthy(convert_bool=True)` filter
16:26:57 <jrosser> for master/OSA that's fine, not sure how much use we get beyond that?
16:32:58 <noonedeadpunk> I guess we can use it for V as well?
16:33:21 <noonedeadpunk> the main concern is that the role is used not only by OSA, I guess
16:33:33 <noonedeadpunk> it has been used even outside of OpenStack...
16:34:00 <jrosser> I expect you used the new 2.10 keyword for a good reason?
16:34:54 <noonedeadpunk> good question... I used it to replace https://opendev.org/openstack/ansible-hardening/src/branch/master/tasks/rhel7stig/accounts.yml#L147 to fix linter warnings
16:35:11 <noonedeadpunk> but... item.value here might be an int, a bool, or a string
16:35:45 <noonedeadpunk> and I'm out of good ideas for how to test them, except comparing to an empty string or using the truthy test...
16:36:04 <noonedeadpunk> because bool of a string will be false, and you can't check the length of an int or bool...
16:36:38 <noonedeadpunk> we can leave it as is and add noqa here
16:39:03 <jrosser> sounds reasonable, as it's a difficult test to do properly
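(For reference, a minimal sketch of the 2.10-only construct being discussed; the loop data and task shape are hypothetical, not the actual accounts.yml content.)

```yaml
# Hypothetical task mirroring the accounts.yml situation: item.value may be
# an int, a bool, or a string, so neither `| bool` nor `| length` alone works.
- name: Apply a setting only when its value is "truthy"
  ansible.builtin.debug:
    msg: "would apply {{ item.key }}={{ item.value }}"
  loop: "{{ {'minlen': 15, 'enforcing': true, 'dictpath': '/usr/share/dict'} | dict2items }}"
  # `truthy` is a Jinja test shipped with ansible-base 2.10+ (used with `is`);
  # convert_bool=True makes strings like 'yes'/'no' behave as booleans first.
  when: item.value is truthy(convert_bool=True)
```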
16:40:34 <jrosser> are there bugs to look at?
16:41:29 <noonedeadpunk> there were no new ones, but there are some untriaged ones left from last year
16:41:37 <noonedeadpunk> #topic bug triage
16:41:51 <noonedeadpunk> I guess I just found an extra one :)
16:41:59 <noonedeadpunk> https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/master/templates/nova.conf.j2#L241 - this will always be false, right?
16:42:10 <noonedeadpunk> good place to use the new truthy filter as well?
16:43:36 <jrosser> that's just broken now?
16:43:41 <jrosser> string -> false
16:43:45 <noonedeadpunk> yeah...
16:43:51 <noonedeadpunk> just faced it
16:44:05 * noonedeadpunk upgrading T->V directly
16:44:35 <jrosser> how does that even work at all then
16:44:47 <jrosser> V ceph job for example
16:44:56 <noonedeadpunk> we have a lot of disk space :p
16:45:07 <noonedeadpunk> so nova uses local storage for ephemeral drives
16:45:23 <jrosser> andrewbonney: ^ one to add to the list! :)
16:46:40 <noonedeadpunk> and nova_rbd_inuse is not defined correctly either... doh
16:47:03 <jrosser> hmm, seems like we need an LP bug for this
16:47:21 <noonedeadpunk> yeah, will spawn some
16:47:39 * noonedeadpunk fixing environment
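(The breakage and one possible fix, sketched in Jinja; the exact option names around the real check in nova.conf.j2 may differ, this is illustrative.)

```jinja
{# Broken: a non-empty pool name such as 'vms' passed through `| bool`
   evaluates to False, so the RBD branch is silently skipped and nova
   falls back to local storage for ephemeral disks. #}
{% if nova_libvirt_images_rbd_pool | bool %}
images_type = rbd
{% endif %}

{# One possible fix: treat any non-empty string as "RBD in use". #}
{% if nova_libvirt_images_rbd_pool | default('') | length > 0 %}
images_type = rbd
images_rbd_pool = {{ nova_libvirt_images_rbd_pool }}
{% endif %}
```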
16:48:30 <openstackgerrit> Merged openstack/openstack-ansible-galera_server stable/victoria: Bring db setup vars in line with other roles  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/772550
16:49:22 <noonedeadpunk> oh, btw, what about the galera issue we're facing...
16:49:36 <noonedeadpunk> I think it might be worth reaching out to the galera folks for some help with that?
16:50:53 <noonedeadpunk> oh, and https://bugs.launchpad.net/openstack-ansible/+bug/1908703
16:50:55 <openstack> Launchpad bug 1908703 in openstack-ansible "federation domain not configured correct" [Undecided,New]
16:51:12 <jrosser> yeah, I made a paste with the journal from when galera had not started properly, that should be useful
16:52:00 <jrosser> gshippey: are you around?
16:52:16 <gshippey> I am
16:52:55 <jrosser> the federation bug just mentioned before, does it look like the example mapping we give in the docs is missing some things for the default domain?
16:57:03 <gshippey> Just had a quick look at the docs, and the domain_id on the trusted_idp is there. Give me a sec, need to find some old patches of mine
16:58:09 <noonedeadpunk> the annoying thing is that gerrit is not linked to LP now
16:58:49 <gshippey> if anything, looking at the keystone_sp structure in https://docs.openstack.org/openstack-ansible-os_keystone/latest/ the federated_identities should be pulling the domain from the idp rather than the other way around
16:59:47 <jrosser> hmm looks like pertoft is not here in irc?
17:00:21 <jrosser> gshippey: if you would be able to follow up to the reply on that bug it would be awesome
17:01:46 <jrosser> noonedeadpunk: we are encountering this in our upgrade work https://github.com/ansible/ansible/issues/72776
17:03:00 <noonedeadpunk> yeah, I saw the patch from andrewbonney, but didn't have time to read the bug carefully
17:03:06 <gshippey> Will do. Essentially, I don't think the domain of the idp functionally matters, and to maintain backwards compatibility, specifying the domain of the idp has to be optional.
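(For context, a trimmed keystone_sp fragment along the lines of the os_keystone docs example being discussed; values are placeholders and the exact attribute set may have changed since.)

```yaml
# Placeholder values; the shape follows the os_keystone documentation example.
keystone_sp:
  trusted_idp_list:
    - name: 'testshib-idp'
      entity_ids:
        - 'https://idp.testshib.org/idp/shibboleth'
      federated_identities:
        # Per the discussion above, the domain here is what should decide
        # where federated users land, rather than a domain set on the idp.
        - domain: Default
          project: fedproject
          group: fedgroup
          role: _member_
      protocols:
        - name: saml2
          mapping:
            name: testshib-idp-mapping
```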
17:03:14 <openstackgerrit> Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_nova master: Fix nova_libvirt_images_rbd_pool check  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/773732
17:03:42 <jrosser> noonedeadpunk: I am not sure if it is triggered by something specific in our environment
17:03:59 <andrewbonney> I'm still investigating at the moment but I think in our case it's because our deploy host doesn't have name resolution for the container hosts
17:04:01 <jrosser> to do with the way hosts vs. IPs are handled in the inventory
17:04:37 <andrewbonney> Or rather that's why it doesn't show up elsewhere
17:05:38 <jrosser> there was discussion recently about whether the OSA tooling should be adding entries to the deploy host's /etc/hosts
17:05:57 <jrosser> because the behaviour currently will be different if infra1 is the deploy host vs. some dedicated deploy host
17:06:46 <noonedeadpunk> and in your case the deploy host is placed on an infra node?
17:06:49 <jrosser> and we would never see this sort of thing in CI jobs because deploy==infra host
17:06:55 <jrosser> no, it's separate
17:07:03 <noonedeadpunk> it's also separate for me...
17:07:23 <noonedeadpunk> but anyway I see nothing wrong with setting up the hosts file on the deploy host as well
17:07:42 <noonedeadpunk> except it's not so easy to achieve, I guess :)
17:07:59 <noonedeadpunk> as we don't want to run the whole openstack_hosts role against the deploy host
17:08:26 <jrosser> no we do not want to do that
17:08:38 <jrosser> I think we let andrewbonney dig into this and see what the root cause is
17:09:12 <jrosser> there is a further instance of it beyond the patch today which cannot be fixed in a straightforward way
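(For the record, a minimal sketch of what populating the deploy host's /etc/hosts could look like without running the whole openstack_hosts role; the 'all_containers' group and the 'container_address' hostvar are assumptions about the inventory, not settled design.)

```yaml
# Hypothetical standalone play run on the deploy host itself.
- hosts: localhost
  become: true
  tasks:
    - name: Ensure container hosts resolve from the deploy host
      ansible.builtin.lineinfile:
        path: /etc/hosts
        # Match any existing entry for this name so reruns update in place.
        regexp: '\s{{ item }}$'
        line: "{{ hostvars[item]['container_address'] }} {{ item }}"
      loop: "{{ groups['all_containers'] | default([]) }}"
      when: hostvars[item]['container_address'] is defined
```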
17:09:33 <noonedeadpunk> Btw I'm wondering if we should release 22.0.1 now (once all the V backports land), and 22.1.0 after that, I guess?
17:09:56 <noonedeadpunk> as a point release is used to mark that it's pretty safe to upgrade? :)
17:10:47 <jrosser> sounds like we are both working through V upgrades on prod environments and catching a few things
17:11:00 <jrosser> so yes a 22.1.0 when all that is settled would be good
17:11:07 <noonedeadpunk> k
17:11:11 <noonedeadpunk> #endmeeting