opendevreview | Merged openstack/openstack-ansible stable/zed: Switch spice-html5 source to freedesktop gitlab https://review.opendev.org/c/openstack/openstack-ansible/+/881337 | 00:32 |
---|---|---|
opendevreview | Merged openstack/openstack-ansible stable/zed: Bump OpenStack-Ansible Zed https://review.opendev.org/c/openstack/openstack-ansible/+/881338 | 00:50 |
jrosser | good morning | 07:14 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_ironic master: Add driver type for redfish https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/881450 | 07:19 |
noonedeadpunk | o/ | 07:33 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Add Yoga upgrade jobs https://review.opendev.org/c/openstack/openstack-ansible/+/879884 | 07:34 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Switch ubuntu upgrade jobs to Jammy https://review.opendev.org/c/openstack/openstack-ansible/+/879890 | 07:34 |
noonedeadpunk | psymin[m]: lxc_container_default_mtu is designed for just user_variables.yml | 07:35 |
noonedeadpunk | for the error you've posted - I think it has smth to do with how you've specified provider_networks | 07:36 |
noonedeadpunk | and you should never update role defaults. whatever is there - can be overwritten in user_variables or group_vars | 07:37 |
noonedeadpunk | Though in this specific case neutron_provider_networks either getting defined through openstack_user_config or should be explicitly configured in user_variables. Both ways work | 07:38 |
noonedeadpunk | jrosser: what are your thoughts about this part for tls? https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/876429/comment/ba86394b_5ac67c55/ | 07:40 |
noonedeadpunk | in terms that repo_server_bind_address is defaulted to 0.0.0.0 | 07:40 |
jrosser | yes i think we already have `repo_server_bind_address` dont we | 07:42 |
noonedeadpunk | As that sounds like proper variable at first glance, but then it kinda needs logic to handle 0.0.0.0 case where add all IPs, and that needs facts that we don't have... | 07:42 |
noonedeadpunk | It's set like that `repo_server_bind_address: "{{ openstack_service_bind_address | default('0.0.0.0') }}"` | 07:42 |
noonedeadpunk | and SAN for 0.0.0.0 kinda.... won't work I assume | 07:42 |
jrosser | well - i wonder if actually the default is bogus | 07:43 |
jrosser | because as you say it has to be defined really | 07:43 |
noonedeadpunk | well, it's fine to bind on 0.0.0.0 | 07:43 |
jrosser | some of the roles are designed to work standalone like galera etc | 07:43 |
noonedeadpunk | it's not fine to issue cert for 0.0.0.0 | 07:43 |
jrosser | so for those, the default must be such that it does work | 07:44 |
jrosser | but actually we do the same thing here https://github.com/openstack/openstack-ansible-galera_server/blob/master/defaults/main.yml#L28 | 07:44 |
noonedeadpunk | yup... | 07:45 |
jrosser | which is almost certainly not correct for TLS things | 07:45 |
jrosser | do you think it's better to default these to 127.0.0.1? | 07:45 |
noonedeadpunk | Um, no, then it won't work for sure kinda | 07:45 |
noonedeadpunk | as binding uwsgi to 127.0.0.1 is kinda useless | 07:46 |
jrosser | relatedly i'm not sure that the `| default('ExampleCorpIntermediate') }}"` is really helpful either | 07:46 |
noonedeadpunk | in galera case cert and root are going to be issued if they're not defined | 07:46 |
jrosser | yes | 07:46 |
noonedeadpunk | and we actually use `galera_address` as it's VIP atm | 07:47 |
noonedeadpunk | (for tls) | 07:47 |
jrosser | well - for roles that are not standalone we can probably not specify defaults, as they are confusing or wrong | 07:48 |
jrosser | that way it says "this other variable must be defined" | 07:48 |
noonedeadpunk | yeah, ok, makes sense. pki_dir still worth to be defaulted I guess | 07:50 |
jrosser | well i don't know really | 07:51 |
jrosser | everything can be overridden regardless | 07:51 |
jrosser | if the role cannot create its own CA then the vars are mandatory to be defined | 07:52 |
noonedeadpunk | I'm really not sure what to do with bind_address though to be frank. I could come up with logic that will do like `repo_server_tls_san: "{{ (repo_server_bind_address == '0.0.0.0') | ternary(ansible_facts['all_ipv4_addresses'] | join(',IP:'), repo_server_bind_address) }}"` or smth... | 07:52 |
noonedeadpunk | but that sounds like overkill | 07:53 |
noonedeadpunk | And almost sure we don't have these facts | 07:53 |
noonedeadpunk | or like "{{ (repo_server_bind_address == '0.0.0.0') | ternary(container_address, repo_server_bind_address) }}" | 07:53 |
noonedeadpunk | but it's also kinda same... | 07:54 |
jrosser | i think that is what we have this for? https://github.com/openstack/openstack-ansible/blob/379426ef210d5ffea858a37c006502575efed80b/inventory/group_vars/all/all.yml#L40-L41 | 07:56 |
noonedeadpunk | yeah, exactly... | 07:56 |
jrosser | so unless i'm missing something it's just `repo_server_bind_address: "{{ openstack_service_bind_address }}"` | 07:57 |
noonedeadpunk | yup | 07:57 |
noonedeadpunk | it works by default, totally | 07:57 |
noonedeadpunk | unless you set `openstack_service_bind_address: 0.0.0.0` which might be valid thing to do | 07:57 |
jrosser | oh yes i think thats totally not going to work | 07:58 |
jrosser | it'll break haproxy/metal i think for a start | 07:58 |
noonedeadpunk | sure, but in lxc it will just work | 07:58 |
noonedeadpunk | or in cases where you have haproxy out of controllers | 07:59 |
jrosser | though we did split the service address and the VIP out onto .100/.101 IP for AIO so it *might* be ok? | 07:59 |
noonedeadpunk | or smth like f5 instead of haproxy... | 07:59 |
jrosser | yes | 07:59 |
jrosser | `container_address` is always the local IP on the mgmt net | 08:00 |
jrosser | so really depends if we want to support binding to 0.0.0.0 at all, then we must revisit all the bind-to-mgmt patches i think | 08:00 |
noonedeadpunk | Given this is merged and used https://review.opendev.org/c/openstack/openstack-ansible/+/870113 - then yes :) | 08:01 |
noonedeadpunk | otherwise it can be messy as well | 08:01 |
noonedeadpunk | I think binding to 0.0.0.0 should be totally fine in some scenarios, except tls thing we're currently talking about | 08:02 |
jrosser | right - so actual question is if we should split the vars between the bind address and whatever gets in the certificate | 08:02 |
jrosser | but if you connect to an IP that is not the one in the cert, then it's an error isnt it? | 08:03 |
noonedeadpunk | yeah. And now I see we should | 08:03 |
noonedeadpunk | yes, totally | 08:03 |
jrosser | so what is the case to make them different? | 08:04 |
noonedeadpunk | when you want to bind on zeros? | 08:04 |
noonedeadpunk | and then you need to do like `ansible_facts['all_ipv4_addresses'] | join(',IP:')` | 08:04 |
jrosser | urgh | 08:05 |
jrosser | thats like information leakage too | 08:05 |
jrosser | as you write the whole IP address mapping for the deployment out into the certificates | 08:05 |
noonedeadpunk | yeah, so maybe just ternary(container_address, repo_server_bind_address) then? | 08:06 |
noonedeadpunk | it's a standalone var, so if you want to override it - feel free | 08:07 |
noonedeadpunk | and then take care of having facts | 08:07 |
noonedeadpunk | *you need for that | 08:07 |
jrosser | so is the idea here to have the services accessible on an "ssh" network as well as the mgmt network? | 08:07 |
* jrosser just trying to understand why binding to 0.0.0.0 is helpful | 08:08 | |
noonedeadpunk | I wasn't thinking of SSH network, but might be storage network or smth like that | 08:08 |
jrosser | btw i did patch the uwsgi role to be able to listen on a list of ip | 08:08 |
jrosser | which is also not accounted for in the TLS stuff | 08:08 |
noonedeadpunk | good point | 08:08 |
jrosser | so there is 3 cases, the mgmt ip, a list of ip, and 0.0.0.0 | 08:09 |
noonedeadpunk | like if you want for rgw to talk to keystone through storage net and not mgmt net - that was my thinking | 08:09 |
jrosser | the list of IP was necessary to avoid using 0.0.0.0 in ironic, where the conductor API needs to listen on mgmt net and also bmaas net | 08:09 |
noonedeadpunk | for example, I'm binding haproxy to zeros, but on a specific interface, so that any IP on this interface was served properly | 08:11 |
noonedeadpunk | but that's not related to the topic :) | 08:11 |
jrosser | i think we have added extra VIP for similar purposes | 08:11 |
noonedeadpunk | yeah, so we have a keepalived instance for each haproxy server, and in case of failovers there can be up to 3 VIPs on 1 haproxy | 08:12 |
jrosser | hmm right so we conclude that this is all complicated | 08:13 |
noonedeadpunk | let me check how uwsgi lists are handled though | 08:13 |
noonedeadpunk | aha, just a list instead of string | 08:14 |
jrosser | yes https://github.com/openstack/ansible-role-uwsgi/commit/cd49d19055bc6a82ef8a80784ab257975a65536a | 08:14 |
jrosser | one thing we could do is look at converting that to a list everywhere we use it | 08:15 |
jrosser | then from a certs point of view it would always be a list, even if just one element | 08:15 |
noonedeadpunk | yup, makes sense | 08:15 |
noonedeadpunk | and check for zeros and use management_address/container_address if that's the case? | 08:16 |
jrosser | i wonder about glance | 08:16 |
jrosser | where there is a good chance it is not uwsgi | 08:16 |
noonedeadpunk | also neutron | 08:17 |
noonedeadpunk | ah, but it does support with eventlet | 08:17 |
noonedeadpunk | right | 08:17 |
jrosser | btw the reason i don't much like the 0.0.0.0 stuff is that accidents are easy, particularly if there is a public IP on the host | 08:18 |
jrosser | not binding to 0.0.0.0 but to specific addresses gives you another thing that has to go wrong to accidentally have a service exposed | 08:19 |
noonedeadpunk | you're not really running api's where there's a public network available? | 08:19 |
jrosser | no, but the OSA default deployment does | 08:19 |
jrosser | haproxy on infra nodes | 08:20 |
noonedeadpunk | you mean metal one? | 08:20 |
jrosser | yes | 08:20 |
noonedeadpunk | yeah, I'm thinking about 0.0.0.0 as - just in case someone needs that inside their containers but not skilled enough to figure out what goes wrong | 08:21 |
jrosser | idk how much attention people pay to firewall before the infra nodes, iptables etc | 08:21 |
noonedeadpunk | so we could say - try setting this var to zeros, and let's see how it goes | 08:21 |
jrosser | yes thats reasonable | 08:21 |
jrosser | right yeah, so if you actually set the var to bind to 0.0.0.0 then it should not be a surprise that the certificate gets `ansible_facts['all_ipv4_addresses'] | join(',IP:')` | 08:23 |
noonedeadpunk | but then we don't have this fact gathered as of today | 08:24 |
noonedeadpunk | (haven't checked but quite sure we should not) | 08:24 |
jrosser | no - because on a network node that will be gigantic facts | 08:24 |
jrosser | and takes aaaaaages to run | 08:24 |
jrosser | perhaps we can improve the facts gathering to collect network facts when is_metal==false | 08:25 |
damiandabrowski | regarding uwsgi bind list and "then from a certs point of view it would always be a list, even if just one element" | 08:28 |
damiandabrowski | please note that some roles do not support uwsgi: repo_server, tacker, masakari, swift | 08:30 |
jrosser | the same is true of glance and neutron as we discussed just now | 08:32 |
jrosser | so it's a case of if in general it is possible to support binding to multiple addresses in the uwsgi and non-uwsgi cases | 08:33 |
jrosser | repo_server and horizon don't really count as they both have a web server | 08:33 |
jrosser | thats another piece of tech-debt actually, collapsing repo server and horizon onto the same apache for metal deploys rather than keeping a nginx/apache split | 08:34 |
opendevreview | Dmitriy Rabotyagov proposed openstack/ansible-role-python_venv_build master: Reduce amount of task that are executed https://review.opendev.org/c/openstack/ansible-role-python_venv_build/+/881397 | 08:38 |
damiandabrowski | jrosser: "so it's a case of if in general it is possible to support binding to multiple addresses in the uwsgi and non-uwsgi cases" | 09:10 |
damiandabrowski | it's not possible for non-uwsgi, i just tried to achieve it for swift and tacker | 09:11 |
noonedeadpunk | I don't think it affects logic of cert generation in any way though? | 09:15 |
noonedeadpunk | but yes, for those services if they want to have multiple IPs to listen on, then they need to be bound to 0.0.0.0 | 09:15 |
damiandabrowski | it affects cert generation because in that case, cert would be signed to 0.0.0.0 which i believe won't work | 09:16 |
jrosser | so in that case the cert must contain the list of interface IP, and the service must bind to 0.0.0.0 | 09:17 |
damiandabrowski | to my understanding - yes | 09:17 |
jrosser | damiandabrowski: there are some other things to think about with the ironic role regarding TLS | 09:36 |
jrosser | there is potential communication between ironic-python-agent over br-bmaas and the ironic conductor *backend* | 09:36 |
jrosser | and that has a couple of factors, needing more than one IP in the certificate, and also the CA needing to be present in the IPA ramdisk | 09:37 |
jrosser | additionally there is also a web server in the ironic deployment and i don't know if it is desirable to make that https or not | 09:38 |
jrosser | i've not really thought about it but it's used for ipxe (probably can't be https?) and also as a staging point for glance images retrieved by IPA (perhaps could be https) | 09:39 |
damiandabrowski | ouh, that's surprising that IPA connects directly to conductor backend. Is it described somewhere so I can try to understand the logic? | 09:45 |
jrosser | it can do, if you want it to | 09:46 |
jrosser | see the use of `endpoint_override` here https://docs.openstack.org/openstack-ansible-os_ironic/latest/configure-lxc-example.html | 09:48 |
jrosser | it defaults to whatever the service catalog says but that doesnt really fit with the way OSA deploys the bmaas network with lxc | 09:48 |
jrosser | and each conductor is responsible for a specific subset of the nodes, so it's fine for each conductor to use it's own IP as the callback in the config file | 09:49 |
jrosser | this would be different and use the internal VIP if bmaas network was routed fwiw | 09:49 |
damiandabrowski | do you think we should cover this scenario out-of-the-box? We can probably say this about all services(that customer may want to omit haproxy and connect directly to service backends) | 09:52 |
damiandabrowski | alternatively, we can say that more complex ironic scenarios may require disabling TLS(or dealing with potential issues like injecting CA into IPA ramdisk etc.) | 09:52 |
damiandabrowski | I'm not super familiar with ironic so I guess I'll just trust your opinion. | 09:52 |
jrosser | well theres a class of things like ironic/octavia/heat/trove(?) where there is an agent | 09:55 |
jrosser | tbh what i think we should do for ironic is document the addition of a CA into the IPA ramdisk | 09:55 |
jrosser | or finding out how to tell it to ignore certificate validation | 09:56 |
jrosser | it would be nice if the published agents would work even if it was not optimum | 09:56 |
jrosser | i would really like to improve the ironic CI job using virtualbmc but i just don't have any spare cycles to look at that | 09:57 |
damiandabrowski | okok, let's do as you say then | 09:58 |
damiandabrowski | there's some information about IPA TLS https://docs.openstack.org/ironic-python-agent/latest/install/index.html#ipa-and-tls | 09:58 |
jrosser | hmm second time today i see `fatal: [aio1 -> localhost]: FAILED! => {"changed": false, "dest": "/etc/openstack_deploy/upper-constraints/upper_constraints_70106ea80ce77cc6425ca09790503ba03be14618.txt", "elapsed": 10, "msg": "Connection failure: The read operation timed out", "url": "https://releases.openstack.org/constraints/upper/70106ea80ce77cc6425ca09790503ba03be14618"` | 09:58 |
jrosser | right so `ipa-insecure=1` would be the most straightforward way to use the default IPA image | 10:00 |
damiandabrowski | regarding ironic-ipxe(nginx) TLS support - that's a valid point, I added it to my notes, but now I'm focusing more on adding TLS support to haproxy backends | 10:02 |
jrosser | sure, that feels totally like something to do later | 10:03 |
noonedeadpunk | So, should I try to post some change to repo_server role (as quite trivial one) with conclusions on our todays discussion? | 10:23 |
noonedeadpunk | and then we see if it makes any sense? | 10:24 |
opendevreview | Dmitriy Rabotyagov proposed openstack/ansible-role-python_venv_build master: Reduce amount of task that are executed https://review.opendev.org/c/openstack/ansible-role-python_venv_build/+/881397 | 10:42 |
damiandabrowski | noonedeadpunk: sure, so what is the final conclusion? do something like this + gather network facts for non-metal? | 11:03 |
damiandabrowski | placement_pki_san_ips: "{{ (repo_server_bind_address == '0.0.0.0') | ternary(ansible_facts['all_ipv4_addresses'] | join(',IP:'), repo_server_bind_address) }}" | 11:03 |
damiandabrowski | i wonder how should we handle 0.0.0.0 bind address on metal deployments then | 11:03 |
damiandabrowski | s/placement/repo/g | 11:04 |
noonedeadpunk | to be frank I'm more inclined for management_address or smth like that | 11:04 |
noonedeadpunk | we don't | 11:04 |
damiandabrowski | hmm, can you please clarify what do you mean? as using management_address is exactly what we do now | 11:06 |
noonedeadpunk | I really don't want to gather facts... | 11:06 |
damiandabrowski | so in case i'm not sure what's going to change | 11:06 |
damiandabrowski | yeah, I'd love to avoid that as well | 11:06 |
noonedeadpunk | until `openstack_service_bind_address: 0.0.0.0` | 11:07 |
damiandabrowski | so...are you thinking about adding one extra task(probably above Create and install SSL certificates) to gather all_ipv4_addresses if openstack_service_bind_address = 0.0.0.0? | 11:09 |
damiandabrowski | s/openstack_service_bind_address/repo_server_bind_address/ | 11:10 |
damiandabrowski | i have one more thought about this whole thing as maybe we are trying to make it more complex than it's really needed | 11:15 |
damiandabrowski | so from the beginning: we are trying to secure haproxy backends | 11:16 |
damiandabrowski | so what connects to these backends? haproxy | 11:17 |
damiandabrowski | and IPs of these backends are easily predictable: https://github.com/openstack/openstack-ansible-haproxy_server/blob/master/templates/service.j2#L132 | 11:18 |
damiandabrowski | so maybe it's enough to include in cert's SAN only this single IP used by haproxy? as technically only haproxy should connect to these backends | 11:19 |
noonedeadpunk | damiandabrowski: I think this logic is wrong kinda | 11:24 |
noonedeadpunk | damiandabrowski: https://review.opendev.org/c/openstack/openstack-ansible/+/871483 | 11:26 |
damiandabrowski | it may be, but it would be still either ansible_host or management_address(after we fix haproxy logic) | 11:27 |
damiandabrowski | i think it doesn't change the fact, that haproxy uses one specific IP to connect to its backend and most likely, it's the only IP we need in SAN | 11:28 |
noonedeadpunk | so you think adding variable just defined to management_address? | 11:41 |
noonedeadpunk | basically like you did... | 11:43 |
noonedeadpunk | yeah, okay, you're right here I guess | 11:45 |
damiandabrowski | jrosser: you're not the only one :D | 12:01 |
damiandabrowski | fatal: [aio1_repo_container-41b0947f -> localhost]: FAILED! => {"changed": false, "dest": "/etc/openstack_deploy/upper-constraints/upper_constraints_65245016de7cf2d1e585eeb1378aac6aa6d75de0.txt", "elapsed": 11, "gid": 0, "group": "root", "mode": "0644", "msg": "Connection failure: The read operation timed out", "owner": "root", "size": 12146, "state": "file", "uid": 0, "url": "https://releases.openstack.org/constraints/upper/65245016de7cf2d1e585eeb1378aac6aa6d75de0"} | 12:01 |
jrosser | hmm i wonder what is going on | 12:01 |
jrosser | tbh i am a bit surprised that it tries to do that | 12:02 |
jrosser | we have two different cases to handle this https://github.com/openstack/openstack-ansible-repo_server/blob/master/tasks/repo_install_constraints.yml#L22-L37 | 12:02 |
jrosser | in CI it was supposed to be that the URL was file:// - but i wonder if we have broken that somehow | 12:03 |
jrosser | hmm https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/defaults/repo_packages/openstack_services.yml#L37 | 12:04 |
damiandabrowski | ah i guess you're right, code suggests that in CI upper constraints should be retrieved from local filesystem but they are not | 12:08 |
damiandabrowski | so that's another issue | 12:08 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible-repo_server master: Add TLS support to repo_server backends https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/876429 | 12:12 |
damiandabrowski | noonedeadpunk: ^ | 12:12 |
jrosser | this looks broken https://github.com/openstack/openstack-ansible/blob/379426ef210d5ffea858a37c006502575efed80b/scripts/get-ansible-role-requirements.yml#L100-L104 | 12:13 |
damiandabrowski | yeah, i was just looking at this | 12:14 |
noonedeadpunk | in CI that indeed could break for collections... | 12:14 |
noonedeadpunk | but um... | 12:14 |
damiandabrowski | i wonder why, the file is there: https://zuul.opendev.org/t/openstack/build/2319916e3e5741b7b3514a40e9923366/log/logs/etc/host/openstack_deploy/user_variables_zuulrepos.yml.txt | 12:14 |
jrosser | yes | 12:16 |
jrosser | 2023-04-25 07:48:06.932526 | rockylinux-9 | TASK [Set Zuul sources path] *************************************************** | 12:16 |
jrosser | 2023-04-25 07:48:06.955443 | rockylinux-9 | skipping: [localhost] | 12:16 |
jrosser | ^ doh | 12:16 |
noonedeadpunk | ah, we don't apply that for upgrade jobs | 12:16 |
jrosser | right so the failing job i looked at was an upgrade openstack-ansible-upgrade-aio_metal-rockylinux-9 | 12:16 |
jrosser | yeah | 12:16 |
noonedeadpunk | - "lookup('env', 'UPGRADE_TARGET_BRANCH') == ''" | 12:16 |
noonedeadpunk | Another question I was wondering about - should we use `cn: ansible_facts['hostname']` but then in san: 'DNS:' ~ ansible_facts['fqdn'] for instance? | 12:17 |
damiandabrowski | hmm, i think we don't have to(i.e it's not necessary), but would harm, right? | 12:20 |
damiandabrowski | at least it would allow us to use FQDNs in the future | 12:20 |
damiandabrowski | but wouldn't harm* | 12:21 |
jrosser | i wonder if we can somehow create some sort of template for what needs to be in the certificate | 12:22 |
jrosser | define it once, and use it everywhere | 12:23 |
noonedeadpunk | that would be nice | 12:23 |
jrosser | otherwise if we change our mind it is a ton of roles to patch | 12:23 |
jrosser | or make some silly error | 12:24 |
noonedeadpunk | I think that some service interconnection might use hostnames rather than IPs. rabbit is one example | 12:25 |
jrosser | right - we already need san: "{{ 'DNS:' ~ ansible_facts['hostname'] ~ ',IP:' ~ rabbitmq_node_address }}" for rabbit | 12:25 |
damiandabrowski | so something like: `openstack_pki_san: "{{ 'DNS:' ~ ansible_facts['hostname'] ~ ',IP:' ~ management_address }}"` in ./inventory/group_vars/all/ssl.yml? | 12:28 |
damiandabrowski | and `repo_pki_san: "{{ openstack_pki_san | default('tba...') }}"` | 12:30 |
damiandabrowski | in repo_server defaults | 12:30 |
noonedeadpunk | yeah, might be that | 12:34 |
damiandabrowski | ack | 12:40 |
NeilHanlon | morning (afternoon) all | 13:01 |
jrosser | hello | 13:02 |
damiandabrowski | hi! | 13:03 |
* NeilHanlon forgot to configure the VIPs on his machines when changing from ifcfg scripts to network manager | 13:08 | |
NeilHanlon | oops | 13:08 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible-repo_server master: Add TLS support to repo_server backends https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/876429 | 13:25 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible master: Add support for TLS backends https://review.opendev.org/c/openstack/openstack-ansible/+/879085 | 13:25 |
damiandabrowski | I think it's better to wait for reviews in repo_server before I patch the other roles https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/876429 | 13:26 |
noonedeadpunk | ++ | 13:31 |
NeilHanlon | hm.. | 13:40 |
NeilHanlon | https://paste.opendev.org/show/bBcgUJTPDo8XrUtGxmak/ ssl handshake failures on my rocky 9 deployment | 13:41 |
noonedeadpunk | NeilHanlon: but do you have any backend alive? | 14:05 |
noonedeadpunk | `echo 'show stat' | nc -U /run/haproxy.stat | grep galera` or smth | 14:05 |
NeilHanlon | i always forget you can access haproxy stats from netcat | 14:17 |
NeilHanlon | weird, neutron, horizon, and heat backends are down... time to dig more | 14:24 |
noonedeadpunk | well, there's also ui available | 14:25 |
NeilHanlon | yeah, i just had to dig out my old scriptlet for haproxy stats :P | 14:26 |
NeilHanlon | `echo "show stat" | nc -U /run/haproxy.stat | cut -d "," -f 1,2,5-11,18,24,27,30,36,50,37,56,57,62 | column -s, -t | awk '/(\#|galera)/'` | 14:27 |
mgariepy | you can also enable prometheus exporter from haproxy | 14:28 |
NeilHanlon | but if my cluster isn't running where will I send the logs! :P | 14:28 |
NeilHanlon | but i actually forgot there was a prom exporter in recent haproxy. i've been using old versions for too long, i guess | 14:28 |
mgariepy | you can spam your inbox, or slack or telegram with alertmanager :P | 14:29 |
NeilHanlon | thank you mgariepy | 14:29 |
NeilHanlon | :D | 14:29 |
mgariepy | #toomanyfeatureeverywhere | 14:29 |
NeilHanlon | can send 'em right to your phone, too https://ntfy.sh/ | 14:29 |
mgariepy | i prefer not having alert on my phone haha | 14:30 |
NeilHanlon | me too... but for some important things I allow it | 14:30 |
noonedeadpunk | #startmeeting openstack_ansible_meeting | 15:02 |
opendevmeet | Meeting started Tue Apr 25 15:02:22 2023 UTC and is due to finish in 60 minutes. The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:02 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:02 |
opendevmeet | The meeting name has been set to 'openstack_ansible_meeting' | 15:02 |
noonedeadpunk | #topic rollcall | 15:02 |
noonedeadpunk | sorry for the delay | 15:02 |
NeilHanlon | o/ heya | 15:02 |
noonedeadpunk | was a bit o_O on a raid controller | 15:02 |
noonedeadpunk | o/ | 15:02 |
damiandabrowski | hi! | 15:02 |
jrosser | o/ hello | 15:04 |
noonedeadpunk | #topic office hours | 15:05 |
noonedeadpunk | First of all let me greet NeilHanlon as a new member of OSA Core Reviewers group! Thanks for all work you do and welcome aboard! | 15:05 |
NeilHanlon | thank you! I appreciate all your confidence | 15:06 |
jrosser | excellent | 15:06 |
NeilHanlon | i'll try not to break too many things | 15:06 |
noonedeadpunk | we still do :D | 15:07 |
damiandabrowski | welcome! \o/ | 15:07 |
noonedeadpunk | Next to that small reminder - we have exactly 1 month left for 2023.1 release. And out of agreed stuff we have tls and upgrade jobs that needs landing | 15:09 |
noonedeadpunk | For upgrades I've proposed this patch: https://review.opendev.org/c/openstack/openstack-ansible/+/879884 | 15:11 |
noonedeadpunk | This is another important part and not only for distro jobs: https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/880761 | 15:12 |
noonedeadpunk | Regarding TLS - we have this topic going on - https://review.opendev.org/q/topic:tls-backend | 15:12 |
noonedeadpunk | Separated haproxy config has been merged at this point | 15:13 |
noonedeadpunk | Today during discussion on TLS we agreed to review https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/876429 now. | 15:13 |
noonedeadpunk | Once it will satisfy everyone, damiandabrowski will be able to proceed updating all others | 15:13 |
noonedeadpunk | With that we need to decide on default behaviour. If we want to switch it to have internal tls be default or not | 15:15 |
jrosser | it is good to see this has passed https://review.opendev.org/c/openstack/openstack-ansible/+/879501 | 15:15 |
noonedeadpunk | As at the moment, outside of AIO, internal endpoint won't be covered with TLS. Though galera and rabbitmq communications will be protected | 15:15 |
noonedeadpunk | at least once :D | 15:16 |
noonedeadpunk | but yes, that's quite sweet | 15:16 |
cloudnull | O-nice. TLS backends by default? | 15:18 |
NeilHanlon | very good to see that! :D | 15:18 |
NeilHanlon | have we seen this before? https://paste.opendev.org/show/bJ3qHVpW1RLVqjXlUJKP/ | 15:18 |
cloudnull | that's sweet. | 15:19 |
noonedeadpunk | NeilHanlon: I think yes, and there should be rescue part | 15:19 |
noonedeadpunk | so it should not be critical failure | 15:20 |
jrosser | would be nice to get rid of that | 15:20 |
NeilHanlon | ah, good enough.. trying to debug a different error i'm having during neutron install | 15:20 |
noonedeadpunk | iirc it was tricky, and cloudnull might have better memories of that :) | 15:20 |
NeilHanlon | https://paste.opendev.org/show/bG8sYIvHkrdXeUWmcMZk/ | 15:21 |
noonedeadpunk | Regarding TLS by default - I'm not sure at the moment. | 15:21 |
cloudnull | noonedeadpunk those are things I'd rather forget :D | 15:21 |
noonedeadpunk | fair enough :D | 15:21 |
cloudnull | its awesome seeing that go forward | 15:21 |
cloudnull | ++ nice work | 15:21 |
noonedeadpunk | So regarding TLS I would rather leave defaults as is for now. And maybe enable that on 2024.1 | 15:22 |
noonedeadpunk | As it's quite close to the release and we haven't tested that enough to make it default | 15:22 |
noonedeadpunk | But we totally should create a job that would cover this path for sure | 15:22 |
cloudnull | NeilHanlon re: the mount issue. you can run something like `systemctl status "$(systemd-escape -p --suffix=mount /var/www/repo)"` to see what that mount service unit is doing? | 15:22 |
jrosser | i wonder if we can use `lsmount` or something | 15:23 |
NeilHanlon | cloudnull: yeah the mount itself is fine, from what I can tell, just doesn't support remount | 15:23 |
cloudnull | ah - that could be . | 15:23 |
cloudnull | is that in the service unit? | 15:23 |
noonedeadpunk | cloudnull: https://opendev.org/openstack/ansible-role-systemd_mount/src/branch/master/tasks/systemd_mounts.yml#L75-L85 | 15:23 |
NeilHanlon | regarding my other issue w/ neutron, seems they're just not there, so probably an issue with something else anyways lol | 15:23 |
damiandabrowski | I'm okay with enabling TLS backend by default in 2024.1 | 15:24 |
noonedeadpunk | but I think it's glusterfs in topic | 15:24 |
jrosser | if we need to test if something is a mount then there is this https://github.com/openstack/openstack-ansible-repo_server/blob/stable/zed/tasks/repo_pre_install.yml#L40 | 15:24 |
cloudnull | yeah I knew I remembered this being my fault. | 15:24 |
noonedeadpunk | I tried to check that but realized that can't come up with anything better | 15:25 |
cloudnull | so maybe https://opendev.org/openstack/ansible-role-systemd_mount/src/branch/master/vars/main.yml#L18-L19 just needs to be set to start/stop | 15:25 |
cloudnull | also I guess https://opendev.org/openstack/ansible-role-systemd_mount/src/branch/master/vars/main.yml#L17 never quite worked right | 15:26 |
cloudnull | maybe it used to be silent in older systemd? not sure. | 15:27 |
jrosser | it seems kind of trivial thing but it does alarm a lot of people who see the failed task | 15:28 |
cloudnull | ++ | 15:28 |
NeilHanlon | (like me...) | 15:28 |
noonedeadpunk | damiandabrowski: so I think we should add a job, that will enable TLS for internal/admin endpoints (with rollback of behaviour to just default that is non-tls) and between haproxy/uwsgi | 15:31 |
noonedeadpunk | then we can revert this thing | 15:32 |
noonedeadpunk | (leaving non-tls job as a separate one) | 15:32 |
damiandabrowski | so create a separate job that will deploy openstack with frontend & backend TLS enabled and then disable both backend and frontend TLS? | 15:35 |
noonedeadpunk | let me re-phrase this :) | 15:36 |
noonedeadpunk | right now jobs do deploy frontend with TLS for internal VIP, that is not default behaviour | 15:37 |
noonedeadpunk | So we return main jobs just to defaults | 15:37 |
damiandabrowski | ok, and what's next? how are we going to test tls backend? :D | 15:42 |
noonedeadpunk | Yes, and for TLS backend we add another job for rocky/ubuntu | 15:42 |
damiandabrowski | ok, i think i get it now | 15:44 |
jrosser | it will be much more obvious | 15:44 |
damiandabrowski | Can I count on your help with zuul? | 15:44 |
jrosser | `tls` in the job name to drop in the right vars and off we go | 15:44 |
noonedeadpunk | damiandabrowski: sure, I can make such job when we're ready or just help out :) | 15:44 |
jrosser | you should be able to use what i did for proxy/stepca as a boilerplate for how that works | 15:45 |
damiandabrowski | okok thanks | 15:45 |
damiandabrowski | is it okay to include CI logic in this patch? https://review.opendev.org/c/openstack/openstack-ansible/+/879085 | 15:45 |
damiandabrowski | or should i create separate one? | 15:45 |
jrosser | small patches = good :) | 15:46 |
damiandabrowski | ok ;) | 15:46 |
noonedeadpunk | We have a couple of roles broken btw | 15:52 |
noonedeadpunk | Among them are magnum and zun | 15:52 |
noonedeadpunk | For zun I will try to invest some time and try to see why it's stuck | 15:52 |
noonedeadpunk | regarding magnum - error is that we can't update cluster label/properties. | 15:54 |
noonedeadpunk | *cluster template | 15:54 |
noonedeadpunk | It now sounds to me like there's an issue with the module... | 15:56 |
noonedeadpunk | So we're supplying same `magnum_cluster_templates` but on second execution module just errors out? | 15:56 |
noonedeadpunk | ofc we can comment out https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/templates/user_variables_magnum.yml.j2#L36-L41 but still feels like some module issue after refactoring | 15:58 |
noonedeadpunk | #endmeeting | 16:03 |
opendevmeet | Meeting ended Tue Apr 25 16:03:24 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:03 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-04-25-15.02.html | 16:03 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-04-25-15.02.txt | 16:03 |
opendevmeet | Log: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-04-25-15.02.log.html | 16:03 |
NeilHanlon | thanks noonedeadpunk :) | 16:17 |
opendevreview | Merged openstack/openstack-ansible-openstack_hosts master: Update release name to Antelope https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/880761 | 17:51 |
*** spotz_ is now known as spotz | 18:16 | |
noonedeadpunk | folks, one more vote needed here https://review.opendev.org/c/openstack/openstack-ansible/+/881318 | 19:17 |
damiandabrowski | done | 19:19 |
noonedeadpunk | thanks! | 19:19 |
noonedeadpunk | Kinda wonder if that https://review.opendev.org/c/openstack/ansible-role-python_venv_build/+/881397 makes any difference... | 19:41 |
noonedeadpunk | oh, we've also started building wheels always. That's why I guess CI time has increased... | 19:45 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Enable venv_wheel_build_enable for CI https://review.opendev.org/c/openstack/openstack-ansible/+/752311 | 19:47 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Enable venv_wheel_build_enable for CI https://review.opendev.org/c/openstack/openstack-ansible/+/752311 | 19:47 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Disable venv_wheel_build_enable for CI https://review.opendev.org/c/openstack/openstack-ansible/+/752311 | 19:49 |
damiandabrowski | I'm trying to enable TLS in oslo.cache, no luck so far... I wonder if it's really implemented for oslo_cache.memcache_pool | 19:49 |
noonedeadpunk | I won't be surprised if it's not. | 19:50 |
damiandabrowski | from what I can see, it doesn't matter if tls_enabled is enabled or not, it always uses plain http | 19:50 |
noonedeadpunk | though I think it might depend on the driver a lot | 19:50 |
noonedeadpunk | um, you check with tcpdump? | 19:50 |
damiandabrowski | there is something for bmemcached: https://github.com/openstack/oslo.cache/blob/7fb06bc2034d9747c9721c9d3eff06925a4483c6/oslo_cache/_bmemcache_pool.py | 19:51 |
damiandabrowski | checked tcpdump, it doesn't look like TLS... | 19:53 |
jrosser | memcached content is encrypted so it might not be the worst thing? | 19:54 |
damiandabrowski | yeah, i wonder if it's really worth spending more time on this | 19:55 |
noonedeadpunk | Do we have it encrypted everywhere? | 19:56 |
noonedeadpunk | or well, are we consistent in setting encryption key? | 19:56 |
jrosser | this would be something in keystoneauth maybe? | 19:56 |
noonedeadpunk | that is totally different topic though | 19:56 |
jrosser | also it's used in horizon/oidc use case as well and that might not be encrypted | 19:57 |
noonedeadpunk | ah, we use same key everywhere | 19:57 |
damiandabrowski | oslo.cache bug report w.r.t TLS support: https://bugs.launchpad.net/oslo.cache/+bug/2017700 | 21:16 |
opendevreview | Merged openstack/openstack-ansible stable/xena: Switch spice-html5 source to freedesktop gitlab https://review.opendev.org/c/openstack/openstack-ansible/+/881318 | 22:24 |
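
Note on the mount unit name discussed at 15:22: cloudnull's `systemd-escape -p --suffix=mount /var/www/repo` turns a mount path into the name of its systemd mount unit. The sketch below approximates that escaping in Python so the mapping is easier to see. The function name `escape_path` is hypothetical, and this is a simplification that assumes an absolute path with no `.` or `..` components; the real tool remains the authority.

```python
import re

# Characters that systemd leaves unescaped in unit names.
_SAFE = re.compile(r"[A-Za-z0-9:_.]")


def escape_path(path: str, suffix: str = "mount") -> str:
    """Approximate `systemd-escape -p --suffix=<suffix> <path>` (hypothetical helper)."""
    # Dropping empty components strips leading/trailing slashes
    # and collapses repeated ones.
    parts = [p for p in path.split("/") if p]
    if not parts:
        # The root directory is a special case and maps to "-".
        return "-." + suffix
    joined = "/".join(parts)
    out = []
    for i, byte in enumerate(joined.encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            # Path separators become dashes.
            out.append("-")
        elif (ch == "." and i == 0) or not _SAFE.fullmatch(ch):
            # A leading dot and any other byte are written as \xNN.
            out.append(f"\\x{byte:02x}")
        else:
            out.append(ch)
    return "".join(out) + "." + suffix


print(escape_path("/var/www/repo"))
```

The resulting name can then be passed to `systemctl status`, as in the original suggestion. A literal dash in the path is escaped as `\x2d`, which is why unit names for paths containing dashes look unusual.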