Monday, 2025-07-07

noonedeadpunkgood morning07:26
jrossero/ morning07:50
jrosserlooks like a missing dependency here https://zuul.opendev.org/t/openstack/build/35dd297230ae4b7280a7780d9be8067107:58
jrosserdistutils for centos1007:58
jrosseri was also thinking that making the "user supplied" choice an explicit backend in the pki role could make things simpler08:03
jrosseras we add extra backends it is kind of odd that user supplied is some special case of "standalone"08:03
jrosserwhen actually it would not use the standalone cert generation at all08:03
jrosserthat would allow the vars that define a cert to become more uniform and reduce "these vars are for backend X" "these vars are for backend Y" that we seem to be going towards08:04
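(As a rough illustration of the "user supplied as its own backend" idea: variable names loosely follow the os_glance defaults linked below, but the `backend` key and overall structure here are a hypothetical sketch, not the current role API.)

    # hypothetical: one uniform cert definition, with the backend made explicit
    # instead of user-supplied being a special case of standalone
    glance_pki_certificates:
      - name: "glance_api_{{ ansible_facts['hostname'] }}"
        backend: standalone          # or hashi_vault, or user_supplied
        cn: "{{ ansible_facts['hostname'] }}"
        san: "DNS:{{ ansible_facts['hostname'] }}"
        signed_by: "{{ glance_pki_intermediate_cert_name }}"   # assumed var name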
noonedeadpunkI think that distutils is provided by setuptools, right?08:10
noonedeadpunkI actually can recall covering it in some other molecule job08:11
jrosseri think thats right yes08:11
noonedeadpunkdamiandabrowski:  ^08:16
damiandabrowskihi! jrosser I'm not sure if I understand, can you prepare some example?08:18
fricklerdistutils is provided by setuptools, but the latter is no longer automatically installed in a venv, you need to make that explicit08:18
jrosserdamiandabrowski: like here https://github.com/openstack/openstack-ansible-os_glance/blob/master/defaults/main.yml#L38008:19
noonedeadpunkah08:20
noonedeadpunkwell, how would it be different from standalone then? 08:20
jrosserfrickler: i think there is a good chance that this is actually missing in the system python, as the missing distutils error is happening inside execution of an ansible module on its target08:20
fricklerbut that should already have been needed with noble, cf. https://review.opendev.org/c/openstack/kolla/+/90758908:21
noonedeadpunkfrickler: yeah, we do it for sure in most cases, but missed some others in molecule it seems08:21
noonedeadpunkor just docker images are slightly different in what they have for ubuntu/centos08:22
damiandabrowskiahhh I see, yes that may be a good idea.08:22
damiandabrowskiIf I understand you correctly, it will slightly reduce the "these vars are for backend X" "these vars are for backend Y" issue08:23
damiandabrowskibecause we won't need a separate src (for standalone) and cert (for hashi_vault) parameter here, right?08:23
noonedeadpunkbut thinking about that, offloading this logic to the role might make sense indeed08:23
damiandabrowskihttps://opendev.org/openstack/openstack-ansible-os_glance/src/commit/1f1dc604b1cf0c5b4c2cd8c502299399e6848b9a/defaults/main.yml#L40508:23
jrosserdamiandabrowski: i was noticing in your patches that the name of the cert really is quite simple `myservice_{{ ansible_facts['hostname'] }}`08:23
jrosserand there's no reason it cannot also be the same format for the standalone, except that it is expecting a path in order to also support user-supplied08:24
noonedeadpunkwe can just go with `name` instead then?08:24
noonedeadpunkor well.08:24
jrosserwell something like that yes for sure, I'm not totally sure what the best option is08:25
noonedeadpunkyeah, it can be name for standalone/vault and keep src for user-supplied?08:25
jrosser"src" is pretty well understood from the copy module so that does make sense for the user-supplied08:25
noonedeadpunkas we still need to pass a user-supplied path somehow, unless we restrict this to the expected structure08:25
noonedeadpunkand say that in order to use user-supplied certs, you have to place them under the expected structure08:26
damiandabrowskibut maybe we can use exactly the same parameter for all backends - standalone, user-supplied and hashi_vault?08:26
damiandabrowskiin this case, src would perfectly fit IMO08:26
damiandabrowskiofc, for the user-supplied backend, the user will need to override the default value of src with the expected cert path08:27
jrosserunless the use of `name` in that case means you *have* to put it in the expected directory08:28
jrosserand `src` allows you to place it wherever you wish08:28
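(A sketch of the distinction being discussed: the variable name follows the os_glance defaults linked above, while the `name`/`src` split shown is the proposal from this conversation rather than merged behaviour.)

    # 'name' would look the cert up in the role's expected directory layout,
    # while 'src' (copy-module style) keeps pointing at an arbitrary,
    # user-supplied path
    glance_pki_install_certificates:
      - name: "glance_api_{{ ansible_facts['hostname'] }}"   # standalone / hashi_vault
        dest: /etc/glance/glance.pem
      - src: "{{ glance_user_ssl_cert }}"                     # user-supplied path
        dest: /etc/glance/glance-user.pem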
noonedeadpunkyeah08:28
damiandabrowskishould we work on this before or after merging hashi_vault patches?08:29
jrosseranyway, i was just thinking that if we tidy up a few things (like noonedeadpunk did with the defaults/group vars typo) and also this, there might be quite an improvement for adding new backends cleanly08:29
jrosseryou did say that you had some trouble with overriding defaults and vars, i think we should address that first08:30
damiandabrowskioh, maybe I wasn't clear enough. I had trouble with this in other roles :D (for ex. ansible-hardening stores its variables in vars/ which is problematic)08:31
jrosserwell it's not problematic, so long as they're not supposed to be overridden :)08:31
jrosserbut yeah it needs care to make it correct08:31
jrosserahhah https://opendev.org/openstack/ansible-role-pki/commit/4e960a1083c71babc05a282a929f29d8f2f4df0208:33
noonedeadpunkoh, well :)08:34
noonedeadpunktime to add it to redhat then08:34
opendevreviewJonathan Rosser proposed openstack/ansible-role-pki master: Add python3-setuptools for redhat-10 based distros.  https://review.opendev.org/c/openstack/ansible-role-pki/+/95421308:35
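(The review above contains the actual change; roughly it amounts to something like this in the role's RedHat-10 vars, with the file and variable names assumed here for illustration.)

    # vars/redhat-10.yml (sketch)
    pki_distro_packages:
      - python3-setuptools   # provides distutils, which the ansible modules need on CentOS/RHEL 10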
damiandabrowskifair point, yeah :D I will try to sort out the vars/ vs. defaults/ issue, but we will definitely need to have some backend-specific defaults in ansible-role-pki anyway08:37
damiandabrowskiand I wonder if it's okay to keep them in defaults/main.yml or move them to something like defaults/hashi_vault.yml and dynamically import this file when needed08:37
damiandabrowskiexample: https://opendev.org/openstack/ansible-role-pki/src/commit/00545ffa46446372b0baf7fdb8a4b99e3eb5926a/defaults/main.yml#L20508:38
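(A sketch of the dynamic per-backend import being proposed; `pki_cert_backend` and the file layout are hypothetical. One caveat: values loaded via include_vars sit at a much higher precedence than defaults/, so they would no longer be overridable from group_vars the way real defaults are.)

    - name: Load backend specific variables
      ansible.builtin.include_vars: "{{ role_path }}/defaults/{{ pki_cert_backend }}.yml"
      when: pki_cert_backend == 'hashi_vault'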
jrosseri think it is fine for them to be in defaults, so long as it is variables that are intended to be overridden08:39
jrosseri would much rather we address this sort of thing https://opendev.org/openstack/ansible-role-pki/src/commit/00545ffa46446372b0baf7fdb8a4b99e3eb5926a/defaults/main.yml#L172-L18008:39
damiandabrowskiokok, will do that08:43
damiandabrowskiI aim to apply improvements to my patches later this week08:43
noonedeadpunkjrosser: btw on your comment here: https://review.opendev.org/c/openstack/openstack-ansible/+/953570 :)08:43
damiandabrowskiit would be nice to gather as much feedback as possible by then :D 08:43
noonedeadpunkthe problem is that despite squid listening on *:3128, it's not actually responding on the management_address, as that address is configured after the service is started08:44
jrosseroh because we change the order of setting up squid and creating the network?08:45
noonedeadpunkbut also, there's a race condition where openstack_hosts fails before networks are configured, as it tries to reach the proxy to install systemd_networkd08:45
noonedeadpunkyeah08:45
noonedeadpunkbut the route I've added is /3208:46
jrosserright but that allows services to contact the external vip?08:46
noonedeadpunkso it does not really give any escape path - it just tells how to reach squid, and to do that not via lxcbr with NAT, but via the mgmt network08:46
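(A minimal illustration of what that /32 route amounts to inside a container; the interface name and variable are placeholders based on the AIO layout discussed below, not the actual patch.)

    - name: Reach squid on the external VIP via the mgmt interface instead of lxcbr0 NAT
      ansible.builtin.command: "ip route add {{ external_lb_vip_address }}/32 dev eth1"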
jrosserunless i misunderstand......08:47
noonedeadpunkhm08:47
jrosserthe proxy scenario is kind of a two-for-the-price-of-one test08:47
jrosserbecause it proves that everything goes via the proxy, or it will fail08:48
jrosserand i think it also prevents misconfigured/broken services from directly using the external vip08:48
noonedeadpunkwell, they will go through the proxy by default anyway then?08:49
jrosserno, because it only sets deployment_environment_variables iirc08:49
noonedeadpunkas public vip is in no_proxy anyway?08:49
jrosserso there is no left-over proxy config once the ansible run has finished08:49
noonedeadpunkwell, then this idea to offload to openstack_host sucks....08:51
noonedeadpunkas with proxy ansible/apt wants to go through it right away08:51
noonedeadpunknot giving any chance to provision networks after bootstrap_aio is completed08:51
jrosserwell yes08:53
jrosserin a real environment the proxy would be something that just exists before you start and you point to it08:53
jrossernot something ever provisioned by openstack-ansible08:53
jrossersimilar case pretty much for step-ca?08:53
noonedeadpunkI think proxy is a bit unique here08:55
jrosseri think that theres two things going on here08:55
noonedeadpunkas the proxy is a prerequisite for `apt` to install packages08:55
jrosserthe setup of networks in openstack-hosts does have benefits for automating more of the OSA specific things08:55
noonedeadpunkwhile step-ca is needed waaaay later08:55
jrosserbut the other case is "test fixtures" that we need which are somehow network related, and squid is the most early of these it seems08:56
noonedeadpunkwell, yes. As in production you won't host squid on any of the openstack hosts anyway09:02
noonedeadpunkit should be somewhere in a different perimeter09:02
jrosserwell maybe thats what we do09:10
noonedeadpunkok, can we start a bit from the beginning here? Just trying to think if I should abandon this patch based on that or maybe not09:10
jrosserwe have some other code that deploys test fixtures right at the start09:10
jrosserand some other interface or whatever that's nothing to do with the OSA deploy09:10
jrosserso it behaves just like it would in production09:10
noonedeadpunkso like another bridge?09:11
noonedeadpunkand add it to all containers?09:11
jrosserand use the .102 IP perhaps so that there's no confusion with the VIP09:11
jrosseroh hmm09:11
jrosseri think it would be OK if the route you'd added was to something that was not the external VIP09:12
noonedeadpunkso the issue is order of execution09:13
noonedeadpunkwe don't have this IP address up until systemd_networkd is restarted, but we already need it to install systemd_networkd by apt09:13
noonedeadpunkand `bootstrap_host_public_address` is ansible_default_ip4_address09:14
noonedeadpunkwhich we already have by default09:14
noonedeadpunkand we can't add an extra IP to the default interface, as it can be a real IP and also restricted by firewall rules or allowed-address pairs09:15
noonedeadpunkso I can't mess with it09:15
noonedeadpunkso the only 2 options I have to not start squid on the public VIP are either to provision a completely separate network in aio as we do today, or give up on the idea of using openstack_hosts for network provisioning09:16
noonedeadpunk(but also squid is listening on *:3128, so it's not even about startup, but just an ordering loophole)09:17
noonedeadpunkI think that with the current approach we do test the proxy connection in quite a good way tbh. As dropping the route or doing smth wrong with squid would result in failures right away.09:18
noonedeadpunkWith that, we actually were not testing proxy connection to the public VIP anyway as we have this today: https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/templates/user_variables.aio.yml.j2#L32209:19
jrosseryes i agree it is pretty robust and catching errors that we would not find in other tests, even for non proxy deployments (like endpoint errors)09:19
noonedeadpunkSo services would go to the public IP without squid regardless09:19
noonedeadpunkwhat the route changes is that SNAT won't be used for reaching the public vip09:21
noonedeadpunkbut we're keeping the interface we communicate over, as previously we were talking through the mgmt interface as well09:22
noonedeadpunkbut yeah, dunno09:22
noonedeadpunkIt's not that I like the proposal, I just don't see a good option to solve that09:22
jrosseralso i think i removed lxcbr0/eth0 from the containers entirely in this config09:25
jrosserso it's not possible to nat via the host at all09:25
jrosserhttps://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/templates/user_variables.aio.yml.j2#L324-L32609:27
noonedeadpunkah, yes, true09:28
noonedeadpunkI think that's why I added route :D09:29
jrosseri think i'm more confused rather than less now09:29
jrosserbut eth1 should be the mgmt subnet09:29
noonedeadpunkand this is what was guaranteeing public ip is not reachable09:29
jrosseroh ok ok public ip is the node ip isn't it09:29
noonedeadpunkand I think it was not reachable at all, as containers didn't have a default route09:29
noonedeadpunkwell09:30
jrosserso i'm getting confused about .100 / .101 for sure09:30
noonedeadpunkwhat we can do... is to have a different IP for proxy for containers and bare metal09:30
noonedeadpunk.100 and .101 are both management network09:30
jrosseryes one is the VIP and one is the bind address for services?09:31
noonedeadpunkas for containers, we can keep having the proxy on the management network, but use the public one for bare metal09:31
noonedeadpunkyes09:31
noonedeadpunkbut for public we don't have VIP09:31
noonedeadpunkso yeah, I can do is_metal check and use either bootstrap_host_public_address or bootstrap_host_management_address depending on the choice09:32
noonedeadpunkfor http_proxy09:32
noonedeadpunkand then we don't need route09:33
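(Roughly the shape of the is_metal conditional being described, using the variable names from this discussion; a sketch only, ignoring how it would need to be escaped when templated into user_variables.aio.yml.j2.)

    http_proxy: "http://{{ (is_metal | bool) | ternary(bootstrap_host_public_address, bootstrap_host_management_address) }}:3128/"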
jrosserso for the "early" things it would use the public address that squid is already bound to09:34
noonedeadpunkyeah09:34
noonedeadpunkfor late things as well, given they're not running in lxc though09:34
noonedeadpunkbut the public vip is locally bound, so you can't prohibit reaching it anyway09:35
noonedeadpunkor well09:35
noonedeadpunkyou can...09:35
noonedeadpunkbut you got it :)09:35
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Offload network provisionment for AIO to openstack_hosts  https://review.opendev.org/c/openstack/openstack-ansible/+/95357009:38
noonedeadpunkso smth like that I guess09:38
noonedeadpunkoh, not really09:39
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Offload network provisionment for AIO to openstack_hosts  https://review.opendev.org/c/openstack/openstack-ansible/+/95357009:40
noonedeadpunkit could be a fair middle-ground....09:41
noonedeadpunkor indeed we should just have a separate network/bridge for squid, which we still do during bootstrap-aio09:42
noonedeadpunkor abandon the patch :)09:42
opendevreviewJonathan Rosser proposed openstack/ansible-role-pki master: Allow certificates to be installed by specifying them by name  https://review.opendev.org/c/openstack/ansible-role-pki/+/95423912:43
jrosserdamiandabrowski: ^ maybe something like this? then for the vault stuff you can also use the `name` key12:43
jrosseri think also this is backward compatible, and we could go through the roles and migrate everything to use `name`12:44
damiandabrowskii liked the idea of a "user-provided" backend a bit more, because it would allow us to have only one variable instead of name and src, but this also looks good at first glance13:18
damiandabrowskiI'll have a deeper look tomorrow13:30
opendevreviewJonathan Rosser proposed openstack/ansible-role-pki master: Allow certificates to be installed by specifying them by name  https://review.opendev.org/c/openstack/ansible-role-pki/+/95423914:38
opendevreviewJonathan Rosser proposed openstack/ansible-role-pki master: Allow certificates to be installed by specifying them by name  https://review.opendev.org/c/openstack/ansible-role-pki/+/95423914:45
opendevreviewJonathan Rosser proposed openstack/ansible-role-pki master: Allow certificates to be installed by specifying them by name  https://review.opendev.org/c/openstack/ansible-role-pki/+/95423915:11
jrosserdamiandabrowski: even if we add a user-provided backend we have to support the vars that are in use today15:11
jrossereven if it's temporary whilst migrating src -> name for most things15:12
jrosserand one thing i'm not sure about is how we "enable" a user provided cert15:12
jrosseras right now it's enough to just define `glance_user_ssl_cert` as the path to the file, and it will work15:13
jrosserno messing with backend settings, or redefining the whole set of certs for glance just to set one of them to be user supplied15:13
jrossersetting that var really implies that some 'user-supplied' functionality is used, however that is implemented15:14
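(As described, today this is the whole "enable" switch: a user-supplied cert for glance looks roughly like this in user overrides. The cert variable name comes from the discussion above; the key counterpart and paths are assumed for illustration.)

    glance_user_ssl_cert: /etc/openstack_deploy/ssl/glance.pem   # path on the deploy host
    glance_user_ssl_key: /etc/openstack_deploy/ssl/glance.key    # assumed companion var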
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-os_glance master: Use 'name' to specify SSL certificates to the PKI role  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/95426915:29
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-os_glance master: Use 'name' to specify SSL certificates to the PKI role  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/95426915:51
damiandabrowskiyeah, you may be right about keeping src...16:22
jamesdenton_jrosser hello! looking back thru some old IRC logs... curious if you're using ASAP^2 in production with OVN17:26
jrosserwell17:27
jrossersome time ago we (well andrewbonney actually) did a POC with it17:27
jrosserand whilst it did basically work it was extremely fragile17:27
jrosseri do have some notes here if there is anything in particular17:30
jamesdenton_Nothing in particular, no. Curious if the fragility was more on the driver side than say, neutron17:30
jamesdenton_Mainly curious to know what throughput you're seeing for GENEVE out of the box17:31
jamesdenton_and besides ASAP^2, are you doing anything special for offloading17:31
jamesdenton_i'm not crazy about DPDK, even after all this time17:31
jrosserfrom what i can see in the test environment we were getting ~5Gbps out of the box between two VM on different hosts with the standard OVN setup17:37
*** jpw_alt is now known as jpw17:37
jrosserand i think we got that up to 18Gbps with the offloading as good as we could get at the time (~2 years ago)17:38
jrosseri think that the fragility was in the composite of all-the-things that had to be just right at the same time17:39
jrosserand all these things are pretty niche features, like vf-lag, the ovs offloading itself, and i think at the time offloaded security groups were a really new feature17:40
jrosserand you need some pretty custom config of the NIC at boot, at every boot as well17:41
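(For context, the per-boot NIC setup referred to is along these lines; this assumes a Mellanox ConnectX-style NIC at PCI 0000:03:00.0 with VFs already created, and is a sketch rather than their exact config.)

    - name: Put the NIC eswitch into switchdev mode for OVS hardware offload
      ansible.builtin.command: devlink dev eswitch set pci/0000:03:00.0 mode switchdev

    - name: Enable TC hardware offload on the uplink interface
      ansible.builtin.command: ethtool -K enp3s0f0 hw-tc-offload on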
jamesdenton_that tracks pretty well. thank you17:43
jrosserit would do 35Gbps between VM on the same host, and that was with 4x iperf threads all at ~85% CPU on the server side17:45
jrosserso that might have been running out of CPU grunt at that point rather than network17:46
jamesdenton_yeah, exactly. We're seeing about 2.2Gbps between hypervisors on some Broadcom 25G NICs17:46
jamesdenton_and that's a single iPerf thread. I can scale that out horizontally but no one thread is > 2Gbps or so17:47
jamesdenton_What's interesting is this offload page mentions Broadcom NICs as supporting offloading - though i'm not sure if the implementation is the same. https://docs.openstack.org/neutron/latest/admin/config-ovs-offload.html17:48
jrosserlooks like we also struggled to set encapsulation + vlan tagging at the same time on what you'd call the "uplink" port from OVS17:49
jrosserfor this test we ended up having to have the tunnel traffic untagged up to the switch17:49
jrosserinteresting - no detail for the broadcom nic though17:50
jamesdenton_that doc doesn't really call out the vtep itself, huh. an important missing detail :D17:51
jrosserdamiandabrowski: why don't we also implement ca_chain and fullchain for the standalone backend18:10
jrosserseems like we leak implementation details of hashi vault out with the need for dual handling of string or list for `type`18:10
