f0o | Good morning/evening - I let ansible run overnight and it did set up everything and the provider net seems correct as per output of ovs-vsctl (bond0 on compute and mlagXYZ on the network nodes); But I'm having an issue with traffic flow on plain vlan (not tested geneve yet). When I start cirros and tap bond0 and br-ext of the hypervisor I see ARP lookups from cirros | 07:34 |
f0o | forwarded to the wire. But when I arping cirros' IP from a physical device on the same vlan I only see the ARP requests on bond0 and not in br-ext. So somewhere bond0 is not forwarding packets into br-ext but is happily forwarding them out of br-ext. Ideas? (SGs are -1 allow) | 07:34 |
f0o | I might have a suspicion; I do use the same vlan on a vlan subinterface on the same bond0... Let me test with a different vlan for the sake of completeness | 07:50 |
noonedeadpunk | o/ | 07:58 |
noonedeadpunk | f0o: so. if your external connectivity happens over vlan (or you can do that over vlan), I'd suggest just forgetting about flat networks | 07:59 |
noonedeadpunk | as you can't really use flat interface for vlan interface | 08:00 |
noonedeadpunk | the hack I did back in the days for flat network to work - was manually creating bond0.1000 (or smth) and adding it as "flat" network | 08:00 |
noonedeadpunk | but it was back in the days when I was not sure about policies/rbac, so had no idea if users with their default permissions can/can not create vlan networks on their own, so decided to "isolate" that way | 08:01 |
noonedeadpunk | so my suggestion would be to forget about flat networks, and just use vlan. You can create an external/shared network in neutron using VLAN as well | 08:02 |
f0o | this isn't a flat network tho | 08:09 |
noonedeadpunk | ah, ok | 08:10 |
f0o | I think that the linux kernel might yank the packet from bond0 and toss it into bond0.2 instead of br-ext | 08:10 |
f0o | since bond0.2 is not on a bridge so it might not cause the packet to fanout | 08:10 |
noonedeadpunk | hm, let me check my ovn sandbox.... | 08:10 |
f0o | just went through the lovely Cisco IOS terminals to get myself a new dummy vlan to just test that theory | 08:11 |
f0o | if I can ping a vm with vlan3 from a non-hypervisor device on vlan3 then that explains why the vlan2 was swallowed | 08:12 |
f0o | also low on caffeine so words are super hard | 08:12 |
noonedeadpunk | I'm just not sure that bond0.2 (if that's the external vlan) should be present for kernel networking at all | 08:15 |
noonedeadpunk | but seems that I crashed my sandbox lately with some other experiment :D | 08:15 |
noonedeadpunk | (or it was some of my colleagues) | 08:15 |
f0o | ah no that's not external networking - it's just some workloads we will run can't do geneve and need to be on L2 connectivity for god knows why | 08:16 |
f0o | I was already looking into just abusing our arista TOR switches to do geneve<>vlan translations but I guess that will pour hell into ansible to add foreign vteps | 08:17 |
jrosser | isn’t geneve/vxlan tunnel completely invisible to the workload? | 08:19 |
jrosser | or you need to integrate some other not-openstack thing? | 08:19 |
f0o | non-openstack thing sadly | 08:20 |
noonedeadpunk | I guess external/non-external doesn't really matter... | 08:21 |
f0o | but I just confirmed my theory; I see arp-requests being pushed from bond0->br-ext on vlan3 | 08:21 |
f0o | vlan2 is swallowed by kernel and tossed into bond0.2 instead of bond0->.2&&br-ext | 08:21 |
jrosser | ah well we do a hybrid for this | 08:21 |
jrosser | we use vxlan for general purpose project networks | 08:21 |
noonedeadpunk | I got an octavia (amphora) instance that uses a vlan as an example - it's also an "internal" vlan | 08:22 |
jrosser | and then for “special” things that need to extend to physical devices we have a bunch of provider vlans in openstack that are shared into projects with neutron RBAC | 08:24 |
jrosser | but actually those are vxlan-evpn on the leaf/spine but no mast integration needed between that and openstack | 08:24 |
jrosser | no *nasty integration | 08:24 |
noonedeadpunk | ok, crap... and how to find vlan wiring in ovn.... | 08:26 |
noonedeadpunk | I bet I saw that in ovs-vsctl output.... | 08:26 |
* jrosser goes to hide behind linuxbridge | 08:27 | |
f0o | woop we have connectivity. So I can confirm that if you use a physical interface as backing for your ovs bridge and you define a vlan subinterface via your OS on that same physical interface, then no packets with that tag will enter OVS, but packets will exit OVS no problem | 08:27 |
noonedeadpunk | but as a matter of fact, I for sure have no "vlan" interfaces anywhere exposed | 08:27 |
noonedeadpunk | so basically bond0 is indeed part of br-ext | 08:27 |
f0o | gonna see if I can use a linux-bridge as physical backing instead which should make the packet be delivered to both OVS and subinterface | 08:28 |
noonedeadpunk | But then all ports of VM are part of br-int | 08:28 |
noonedeadpunk | including the vlan one.... | 08:28 |
f0o | yep | 08:28 |
f0o | ovs doing some 5D-Chess flow policies to move packets around | 08:28 |
noonedeadpunk | But why do you have bond0.2 at kernel space at all? | 08:28 |
f0o | I made the mistake of running dump-flows on br-int and was slapped with a wall of text | 08:28 |
noonedeadpunk | I guess that's the question I'm trying to understand | 08:28 |
f0o | .2 is management network | 08:29 |
noonedeadpunk | aha | 08:29 |
f0o | I was just lazy and tried an existing vlan | 08:29 |
f0o | didn't want to configure a test vlan on the switches and mlags and yaddayadda - thought it would "just work" like with linux-bridges | 08:29 |
* noonedeadpunk still loves linuxbridges | 08:29 | |
f0o | but it's a non-issue since vlans do just work, as long as I don't consume them through the kernel on the hypervisors - I see this as a security plus | 08:30 |
f0o | But I can guarantee somebody will ask me to allow spawning some vm that has access to management-vlan so for that I will do the flat-network like you suggested earlier | 08:30 |
noonedeadpunk | yeah, I don't think you can use vlan on kernel space and use that as neutron vlan... | 08:30 |
noonedeadpunk | I guess that's indeed what confused me | 08:30 |
noonedeadpunk | Probably you can do some kind of wiring though.... Like ovn-bgp-agent does | 08:31 |
noonedeadpunk | But it adds ovs flows to redirect traffic to kernel space and vrf for the way back | 08:31 |
noonedeadpunk | but it's slightly /o\ | 08:31 |
f0o | haven't even started looking at BGP since that'll be likely needed too | 08:31 |
f0o | for floating IPs | 08:31 |
f0o | babysteps | 08:31 |
noonedeadpunk | I am working on it right now | 08:32 |
noonedeadpunk | and what I see right now - all security and isolation benefits you get with ovn - wades away with bgp | 08:32 |
noonedeadpunk | *fades | 08:32 |
f0o | I looked at the docs very briefly and there was a way to only advertise floating IPs instead of all tenant networks but that might be outdated | 08:33 |
noonedeadpunk | the one more or less decent in theory way, is using a standalone ovn cluster, which I haven't even tried yet, as that requires a newer OVN version than present in repos, and has quite serious limitations | 08:33 |
noonedeadpunk | so... I think exposing tenant networks is potentially not what you're thinking about. As it's not about east-west traffic... And moreover, these tenant networks can not overlap in ranges | 08:34 |
noonedeadpunk | so it's... really very private-cloudish-specific-thingy | 08:35 |
jrosser | there’s multiple things isn’t there? | 08:35 |
jrosser | advertising fip to upstream routers would be nice | 08:35 |
noonedeadpunk | yeah, but still tenant networks are all using same vrf from what I got so far | 08:35 |
noonedeadpunk | and that's not for geneve I assume anyway | 08:36 |
jrosser | and I didn’t yet look at all about how that would be for ipv6 | 08:36 |
noonedeadpunk | yeah, actually, for ipv6 you probably indeed wanna expose tenant networks | 08:36 |
noonedeadpunk | when each tenant has its own ipv6 range... | 08:36 |
jrosser | right just like you have to today without ovn | 08:36 |
jrosser | yes and they are unique anyway just because v6 | 08:37 |
noonedeadpunk | yeah | 08:37 |
noonedeadpunk | but for ipv4 it's non-multitenant, imo | 08:37 |
jrosser | would be great to write some guidance on this | 08:39 |
jrosser | just notes in the neutron role docs | 08:39 |
noonedeadpunk | I'm still struggling to understand how exactly it works/ should work | 08:39 |
jrosser | calico was very similar iirc | 08:39 |
jrosser | great, but also not great, depending | 08:39 |
noonedeadpunk | right now playing with external networks, got SRC-NAT and gateway reachable. But a bit struggling with how FIPs are to be exposed. | 08:40 |
noonedeadpunk | As docs are a bit contradictory about - do you need frr on computes or not | 08:40 |
noonedeadpunk | as ideally, when FIPs are centralized - it should be just on net nodes. | 08:41 |
noonedeadpunk | But agent somehow does not catch FIP changes there... | 08:41 |
f0o | just to jump in here; east-west traffic is pretty easy with geneve/vxlan since vteps take away the wire-noise by moving everything into ptp. BGP would only be useful/beneficial for north-south traffic like floating-ips to not have those onlink routed anymore. In our old setup we had quite a lot of broadcast noise, I think ~100mbps just junk noise on the wire since there | 08:41 |
f0o | was no deterministic way to know if an IP existed or not. We ended up writing some go-bgp service that periodically crawls the neutron-db for assignments and whitelists them in our TOR switches | 08:41 |
f0o | hoping to avoid the workaround and get some native neutron<>bgp integration here | 08:42 |
noonedeadpunk | But even for src-nat usecase - traffic just rushes through the default gateway, which is logical... | 08:42 |
noonedeadpunk | but not so good if you wanna have an air-gapped environment... | 08:42 |
noonedeadpunk | As I feel uncomfortable seeing all VM traffic to the world coming through the FORWARDING table where you have all sorts of management traffic as well... | 08:43 |
noonedeadpunk | But then ofc you can create own routing table and redirect traffic towards different gateway - but it's slightly a mess and lack of isolation, imo | 08:44 |
noonedeadpunk | not 100% sure I'm right though | 08:44 |
noonedeadpunk | As I'm not good at many things... (not sure I'm good at anything nowadays :D) | 08:45 |
noonedeadpunk | But networking is probably one of them | 08:45 |
noonedeadpunk | f0o: I think this patch might be a good start for doing that: https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/909780 | 08:46 |
f0o | TIL br-ovs-flat:br-mgmt will kill your networking | 09:12 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Determine if upgrade source branch is stable/ or unmaintained/ https://review.opendev.org/c/openstack/openstack-ansible/+/911583 | 09:19 |
noonedeadpunk | frankly speaking, for me networking is so far the hardest part in openstack | 09:25 |
f0o | cant agree more | 09:25 |
f0o | it was simple with linuxbridge... but now with OVN it adds a whole new layer of complexity. I mean I get the benefits which is why I want to actually get this done "right" but still.. this is now week2 on this adventure | 09:26 |
f0o | granted most time is resetting the hardware | 09:26 |
jrosser | f0o: are you combining the mgmt network in with the OVN/OVS stuff? | 09:27 |
noonedeadpunk | yeah, true. lxb was very easy to understand and debug | 09:28 |
jrosser | ok so we need to find someone who can merge unmaintained patches | 09:29 |
f0o | jrosser: well not anymore hah | 09:29 |
jrosser | f0o: well, i would say "good move" :) | 09:29 |
noonedeadpunk | jrosser: not so many ppl on the list https://review.opendev.org/admin/groups/4d728691952c04b8b2ec828eabc96b98dc124d69,members | 09:29 |
f0o | jrosser: but yeah it was a low-ball attempt to get management-vlan into openstack neutron for consumption because I just *know* that request will come sooner or later | 09:30 |
jrosser | i think you should never never do that tbh | 09:30 |
noonedeadpunk | easier to find ones who will approve our ACL change in gerrit | 09:30 |
jrosser | as you have a massive trust boundary that you should not cross there into the control plane | 09:30 |
f0o | jrosser: yes but... "I just need to test this software in a vm real quick so I need access from the OpenStack for this very short PoC yaddayadda" | 09:31 |
f0o | I can already predict those requests | 09:31 |
noonedeadpunk | actually, you can get full control of your cluster relatively easily by having access to that network | 09:31 |
jrosser | f0o: perhaps some misunderstanding? | 09:31 |
jrosser | the mgmt network is private and internal to the inside workings of openstack | 09:31 |
noonedeadpunk | It's totally fine to get external vlans, but not management one | 09:32 |
jrosser | you don't need access to it, ever, to do anything in a VM as a user | 09:32 |
noonedeadpunk | yeah | 09:32 |
noonedeadpunk | access openstack through public endpoint from VMs | 09:32 |
noonedeadpunk | if you need to interact with API | 09:32 |
f0o | hrm I just smacked it into the general management network... | 09:32 |
jrosser | so, for example | 09:33 |
f0o | but I see your point, could just split it into its own vlan | 09:33 |
jrosser | i have an out-of-band network, which PXE-boots the hosts and whatnot, and for monitoring | 09:33 |
jrosser | thats mine exclusively as cloud-operator | 09:33 |
jrosser | then separate is the openstack mgmt network, which deals with the internals of the control plane and you should consider as "if this is accessible, my cloud is compromised" | 09:34 |
jrosser | ^ these two you can totally make be the same network, and most deployments probably do that | 09:34 |
jrosser | then what your users and workloads see and use is something else altogether | 09:35 |
f0o | I get your point; I over-interpreted management network and used the actual management-network we use for all our gear instead of an openstack-specific management network | 09:37 |
f0o | mea culpa | 09:38 |
jrosser | so to complete the picture, you've then got some "external" networks | 09:38 |
f0o | which immediately fixes the issue I tried to solve because this makes the general management network accessible again since it's a different vlan | 09:38 |
jrosser | and you might have a subnet containing your external haproxy endpoint, perhaps ceph radosgw endpoints, perhaps bind instances for Designate and so on | 09:39 |
f0o | good thing OVN nuked my hosts just now, linuxbridge would've been fine | 09:39 |
f0o | hah | 09:39 |
jrosser | ^ these things all have some outward facing presence from your cloud, toward your users | 09:39 |
jrosser | and that's a separate thing to your public network, but often, a lot of deployments might combine the two together | 09:40 |
jrosser | by reserving a bunch of addresses in the public network for the API endpoints and so on | 09:40 |
jrosser | but you have really total freedom to set that up however you need | 09:41 |
ThiagoCMC | Folks, morning! Which variable should I set so that the `rabbitmq_server` Ansible playbooks (`/etc/ansible/roles/rabbitmq_server`) will only use the Ubuntu packages without adding the third-party APT repository from this "novemberain.org" place? | 09:57 |
noonedeadpunk | `rabbitmq_install_method: distro` I assume | 10:03 |
ThiagoCMC | Hmmm... Thanks! I'll try it now. | 10:06 |
ThiagoCMC | Also, I can't see a similar `_install_method` for the MariaDB. Any other variable name for it so I can also install it from Ubuntu's repositories? | 10:07 |
noonedeadpunk | I'm not sure about mariadb - there's probably no way quite that easy. But I think you should be on the safe side just undefining the external repo vars | 10:36 |
noonedeadpunk | like `galera_repo: {}`, `galera_gpg_keys: []` | 10:37 |
jrosser | you would fall outside what we test for mariadb versions though | 11:01 |
jrosser | so it might be good to have some local testing of what you actually get from the ubuntu repo | 11:02 |
opendevreview | Merged openstack/openstack-ansible-haproxy_server stable/2023.2: Use correct permissions for haproxy log mount https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/911603 | 11:04 |
kleini | Currently my LXC containers get the default MTU of 1500 on their network interfaces. I also see that in the LXC container configuration files. I am digging into the Ansible roles now. What is the best way to configure MTU 9000 for some networks (br-storage and br-lbaas)? I currently have the issue that heartbeats of some amphora VMs are sent with MTU 9000 but they can not reach the Octavia health manager. | 11:14 |
noonedeadpunk | kleini: hold on, I should have a sample somewhere | 11:18 |
noonedeadpunk | kleini: you'd need to add `container_mtu: 9000` to openstack_user_config.yml under global_overrides/provider_networks | 11:19 |
noonedeadpunk | and then dynamic_inventory should take care of that... | 11:19 |
noonedeadpunk | but then my guess would be to run lxc-containers-create --limit octavia_all,lxc_hosts and restart containers | 11:20 |
noonedeadpunk | and actually here's example out of docs: https://opendev.org/openstack/openstack-ansible/src/branch/master/etc/openstack_deploy/openstack_user_config.yml.example#L268 | 11:21 |
ThiagoCMC | noonedeadpunk, it worked! RabbitMQ is from `distro` now. Thanks! | 11:38 |
noonedeadpunk | we actually should add a variable for mariadb | 11:41 |
noonedeadpunk | and add these overrides as part of distro job testing | 11:41 |
noonedeadpunk | I've been thinking about that for a couple of months now... | 11:41 |
g3t1nf0 | hey its me again with the questions :D | 12:08 |
g3t1nf0 | https://docs.openstack.org/openstack-ansible/2023.2/user/prod/example.html according to this I need rsyslog server as well | 12:08 |
noonedeadpunk | nah | 12:09 |
noonedeadpunk | you don't | 12:09 |
g3t1nf0 | also I know that it's preferred to have an application LB in front of the whole openstack, but can I use a server there as well running HAProxy | 12:09 |
noonedeadpunk | we've deprecated rsyslog for a while now... | 12:09 |
noonedeadpunk | Actually, I assume that haproxy we're deploying can be that application LB already | 12:10 |
noonedeadpunk | and you can deploy it wherever you want basically | 12:10 |
g3t1nf0 | so then 3 servers for control plane 3 for the ceph and then just compute? | 12:10 |
noonedeadpunk | pretty much yes | 12:10 |
g3t1nf0 | okay, then do I need 2 networking servers? this way the LB will be on them and not on the control plane | 12:10 |
noonedeadpunk | I guess it's up to you? | 12:11 |
g3t1nf0 | or running it on the control plane will not eat that much cpu ? | 12:11 |
noonedeadpunk | I guess mainly ppl don't bother themselves with standalone LB hosts | 12:11 |
noonedeadpunk | ofc depends on scale | 12:12 |
g3t1nf0 | usually they have f5 :D | 12:12 |
noonedeadpunk | and potentially RGW is a bigger question | 12:12 |
noonedeadpunk | meh, I find haproxy way more performant than nginx | 12:12 |
noonedeadpunk | unless you're talking about hardware ones | 12:12 |
g3t1nf0 | rgw ? | 12:13 |
noonedeadpunk | Rados Gateway - ceph component that provides object storage | 12:13 |
g3t1nf0 | oh yeah I think of letting it sit on the 3 ceph servers | 12:13 |
noonedeadpunk | https://docs.ceph.com/en/latest/radosgw/ | 12:13 |
noonedeadpunk | and also - you can ignore haproxy installation and do f5 or whatever else you want | 12:14 |
g3t1nf0 | no I prefer haproxy | 12:14 |
noonedeadpunk | we're really trying not to invent any lock-in here and to leave operators a variety of choices (for good and for bad) | 12:15 |
g3t1nf0 | last two questions for the OS, I'm gonna go with debian can I use LVM for the main storage? this way I can use snapshots and revert if my install fails thus saving time on reprovisioning | 12:15 |
g3t1nf0 | and the second one | 12:16 |
noonedeadpunk | main storage meaning storage for OS? | 12:16 |
g3t1nf0 | yea | 12:16 |
noonedeadpunk | doesn't matter from our perspective | 12:16 |
noonedeadpunk | you can do ZFS as well :D | 12:17 |
g3t1nf0 | and the second question was about the firewall somewhere I've read that its up to the deployer to configure it, so not sure I understand | 12:17 |
noonedeadpunk | actually, I guess you can also use lvm backend for containers if you want to | 12:17 |
noonedeadpunk | *for lxc containers | 12:17 |
noonedeadpunk | yeah, there's nothing from us that configures a firewall on nodes. | 12:18 |
noonedeadpunk | that's true | 12:18 |
g3t1nf0 | preferred storage for OS is nvme or ssd, does it matter like in k8s ? | 12:18 |
g3t1nf0 | on k8s, nvme is preferred for etcd | 12:19 |
noonedeadpunk | Well, faster storage is always better, but probably not that crucial after all | 12:19 |
noonedeadpunk | or well. rabbit quorum queues reside on the storage... but they're not enabled by default for now. | 12:20 |
noonedeadpunk | so potentially rabbitmq might get hungry for storage like etcd is | 12:20 |
g3t1nf0 | gotcha | 12:20 |
noonedeadpunk | but with classic queues it's all in memory iirc | 12:20 |
g3t1nf0 | I've heard people complaining for rabitmq issues | 12:21 |
noonedeadpunk | yeah | 12:21 |
g3t1nf0 | so doing it with nvme, then 2tb should be sufficient | 12:21 |
noonedeadpunk | but it's way better lately | 12:21 |
g3t1nf0 | no idea, about to test it out :D | 12:21 |
g3t1nf0 | so back to the firewall, routing and firewall rules I have to figure out on my own ? | 12:21 |
noonedeadpunk | and I'd suggest enabling quorum queues then right away | 12:22 |
noonedeadpunk | So, routing is super trivial (close to absent) unless you're talking about neutron + bgp. | 12:22 |
noonedeadpunk | (or well ovn+bgp) | 12:23 |
noonedeadpunk | about firewall - indeed we don't have anything right now for that. | 12:23 |
g3t1nf0 | I do want public not private so I guess I'll have to figure out bgp | 12:23 |
noonedeadpunk | it could be interesting to get some implementation for firewall actually, in a way we do with haproxy... | 12:23 |
noonedeadpunk | well, it's not that you have to... | 12:24 |
g3t1nf0 | is apparmor still supported? is the hardening playbook enabled by default | 12:24 |
noonedeadpunk | I guess plenty of deployments just do use vlan passing to net/compute nodes | 12:24 |
noonedeadpunk | yeah, the hardening playbook is enabled as well as apparmor is | 12:24 |
noonedeadpunk | Though we haven't added new stigs to hardening role for a while | 12:25 |
g3t1nf0 | I like my networks strict, so far what I have has always been specific ports open on the host and on the FW tables | 12:25 |
noonedeadpunk | like contributions in this area would be very welcome... | 12:25 |
noonedeadpunk | Yeah, there're just a lot of them | 12:25 |
noonedeadpunk | So probably nobody has come to a really good combined view yet, also considering all sorts of parts/bits for monitoring, etc | 12:26 |
g3t1nf0 | never touched apparmor but you convinced me not to go the centos way so I'm out on the selinux | 12:26 |
noonedeadpunk | I wasn't convincing against Rocky Linux. I did against centos solely :D | 12:26 |
g3t1nf0 | sorry I just don't trust Rocky at all | 12:26 |
g3t1nf0 | still have bad taste in my mouth for centos state before rh bought them | 12:27 |
g3t1nf0 | one last thing, all the management for openstack should be done through the deployment host, correct? | 12:28 |
g3t1nf0 | and I should avoid making hand changes ? | 12:29 |
g3t1nf0 | stick to iac | 12:30 |
noonedeadpunk | so yes, idea that all configuration changes are made with playbooks, and you just need to maintain state/changes in /etc/openstack_deploy | 12:32 |
noonedeadpunk | playbooks/roles are pretty much idempotent, so running them should not hurt deployment | 12:32 |
noonedeadpunk | unless manual changes were made - then they will be overridden | 12:32 |
g3t1nf0 | perfect, thank you for the time | 12:33 |
g3t1nf0 | going to get my hands dirty | 12:33 |
noonedeadpunk | but ofc you can make them if you want/need for some testing, just don't forget to reflect them in config for future you | 12:33 |
g3t1nf0 | how often should I pull from gh | 12:34 |
noonedeadpunk | we actually have here one of Rocky Linux maintainers - so can't say we don't trust them :) | 12:34 |
noonedeadpunk | when you need to do minor upgrade? | 12:34 |
noonedeadpunk | as we usually suggest checking out to a specific release/tag | 12:34 |
noonedeadpunk | each tag has fixed SHAs of versions for all and each component, which makes environment pretty much fixed/stable | 12:35 |
noonedeadpunk | and reproducible | 12:35 |
g3t1nf0 | and if I want to contribute just extra branch and push ? | 12:35 |
noonedeadpunk | when you wanna make minor upgrade - you checkout to the new tag, do ./scripts/bootstrap-ansible.sh, which pulls in new versions | 12:36 |
noonedeadpunk | You can override SHA of specific component anytime | 12:36 |
noonedeadpunk | and if you want to contribute - we're using gerrit for that. It's a slightly different flow than github's, but imo a more trivial one. Basically you'd need to install the git-review plugin, and then - git commit; git review | 12:37 |
noonedeadpunk | no need to do git push or checkout a branch. Though, checking out to a branch is handy - the branch name is considered as a "topic" for the patch, and you can make a series of patches to different repos this way | 12:38 |
noonedeadpunk | also `git review -f` will delete this branch once the change is pushed | 12:38 |
jrosser | noonedeadpunk: we have full coverage here of iptables with the role that logan wrote | 12:39 |
jrosser | could be interesting to see if that can be somehow contributed | 12:40 |
noonedeadpunk | we have also pretty much full coverage but I don't think the way it's done is applicable and can be contributed - long legacy history behind what we have | 12:40 |
noonedeadpunk | but can help with rules for sure | 12:41 |
jrosser | do have to be careful though with neutron | 12:41 |
noonedeadpunk | I was also a bit thinking of nftables that are currently a backend for iptables... | 12:41 |
noonedeadpunk | yeah, as geneve vs vxlan using different protos? | 12:42 |
noonedeadpunk | *ports | 12:42 |
noonedeadpunk | and then whole BGP | 12:42 |
jrosser | if whatever neutron plugins you use mess with iptables, care is needed not to disturb that with rules from OSA | 12:42 |
jrosser | particularly as the neutron ones are not persistent rules, so you don't know what they are | 12:43 |
jrosser | maybe less relevant problem for OVN? | 12:43 |
jrosser | g3t1nf0: another thing you can think about for network “strictness” is which networks actually need to route to each other (most in OSA don't, regardless of what the docs say) | 12:45 |
jrosser | and also which need any egress or NAT, again most actually don’t | 12:45 |
noonedeadpunk | jrosser: I don't think it's relevant even with OVS today, unless you do hybrid_iptables (like we do /o\) | 12:54 |
noonedeadpunk | as ovs native is known to work nicely for a while now. Though I can't get myself up for testing migration path | 12:54 |
g3t1nf0 | hmm sorry I didn't get that, what? Once I get to Neutron I'll have maybe a bit more understanding what you've meant | 12:54 |
noonedeadpunk | so, VM security groups/permissions and filtering are made by neutron. | 12:55 |
noonedeadpunk | with some drivers/implementations they are made on a kernel space | 12:55 |
noonedeadpunk | through iptables basically | 12:56 |
noonedeadpunk | jrosser: I wonder if we should just document a set of rules that might be needed and some sample of how to configure it using logan-'s role | 12:57 |
noonedeadpunk | as in fact thinking about neutron and all kinds of extra stuff, not sure if it's feasible to enforce them anywhere except maybe the control plane... | 12:57 |
noonedeadpunk | but dunno | 12:57 |
noonedeadpunk | maybe it's doable after all through group vars and extending/merging config.... | 12:58 |
jrosser | well, we do apply everywhere so it is possible | 13:11 |
jrosser | even with linuxbridge | 13:12 |
noonedeadpunk | jrosser: I guess I was more thinking of how to do that with diversity of deployments | 13:28 |
jrosser | yes that would be very difficult | 13:28 |
jrosser | btw also looks like we have openstack-ansible-unmaintained-core group now | 13:28 |
noonedeadpunk | \o/ | 13:31 |
jrosser | i added you https://review.opendev.org/admin/groups/1d7433bc7e2c46fd333e6b1b7bfeaa9a324803d0,members | 13:31 |
noonedeadpunk | awesome, thanks | 13:32 |
jrosser | probably we should just add the existing core group to that | 13:33 |
g3t1nf0 | on the OS that openstack's control node will run on, can I make some modifications or is it preferred to keep it as stock as possible? | 13:40 |
jrosser | g3t1nf0: you can make modifications if you need - OSA tries to stay away from everything you might need specific to your deployment | 13:44 |
jrosser | so you can make whatever local setup you need for things like ntp, mta, ssh etc | 13:45 |
jrosser | it's typical to have some additional "base" setup either via whatever host provisioning you use or some more ansible of your own | 13:45 |
g3t1nf0 | on my rhel systems I'm setting a lot of stuff but never touched debian so I have to do a base setup for it as well, like ssh chrony systemd timers ... | 13:47 |
jrosser | andrewbonney: if you log out/in of gerrit can you see the usual voting options on https://review.opendev.org/c/openstack/openstack-ansible/+/911621 ? | 13:49 |
andrewbonney | No need to log in/out. I can see them now | 13:49 |
jrosser | excellent, thanks | 13:51 |
kleini | noonedeadpunk, thank you very much. With our growing kubernetes cluster on OpenStack I think, we will some day run into https://bugs.launchpad.net/octavia/+bug/2025262 | 13:56 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Determine if upgrade source branch is stable/ or unmaintained/ https://review.opendev.org/c/openstack/openstack-ansible/+/911583 | 14:11 |
f0o | I couldn't leave it be after having a working MVP... I absolutely and entirely nuked the setup again attempting a lift&shift into a new management vlan space | 15:04 |
f0o | the joy of going through a whole resetup again | 15:05 |
noonedeadpunk | f0o: so you have had a working MVP - that's smth already :D | 15:13 |
nixbuilder | I have installed OSA v.27.4.0 and it seems like none of the policy.yaml files were created on the system. Not sure why. Which OSA scripts are responsible for creating the policy files? | 15:14 |
f0o | noonedeadpunk: oh yeah after those pointers with the host_vars and user_vars I basically 1-hit got something running across the halfrack I'm using. That really solved everything for me. But then I went and tweaked stuff like container_mtu, dedicated vlan, ... and Obviously I attempted it without redeployment, just re-running the playbooks... I should've known better :D | 15:17 |
noonedeadpunk | nixbuilder: they're not created unless you have overrides, since all policy defaults are implemented in code | 15:17 |
f0o | So now I'm about to grab a beer at the pub while the rack re-re-re-resets itself once again | 15:17 |
noonedeadpunk | so they're present only when you define overrides | 15:17 |
noonedeadpunk | f0o: very nice to hear, and beer part is especially valuable hehe | 15:22 |
noonedeadpunk | but actually.. .changing mtu should have worked | 15:22 |
f0o | noonedeadpunk: 100%; if I bump into you or jrosser remind me to get you a round too | 15:22 |
noonedeadpunk | though with adding another network/re-wiring stuff might be quite breaking for clusters indeed | 15:22 |
f0o | I think if I would've done the MTU first and then vlan change after (or vice-versa) it would've been fine but I did it all at once and just had the vlans routed between assuming it was "good enough" | 15:23 |
noonedeadpunk | Well, you would also likely be fine by just re-creating containers per controller | 15:23 |
noonedeadpunk | though at mvp might be indeed easier to re-run things | 15:24 |
jrosser | f0o: please do raise bugs for anything broken you find, thats very handy for us | 15:24 |
f0o | MAAS is fast enough to just run through setup once more and have ansible fired afterwards - it'll just do its thing over the evening and tomorrow I'll see the result | 15:25 |
f0o | jrosser: will do, taking notes of all the gotchas already | 15:25 |
jrosser | and also if you want to change / improve something this whole endeavor is "by operators, for operators" so contributions are absolutely welcome | 15:25 |
nixbuilder | noonedeadpunk: Hmmm... well... I had a couple of overrides and yes, the policy.yaml was created. However when using horizon I keep getting errors saying policies do not allow the operations I am trying to execute. If I manually enter the policy the errors go away. | 15:25 |
f0o | 100% will upstream my openstack_user_config.prod.example with the OVN fixes and some comments to why what how | 15:26 |
f0o | including that gotcha that vlan works fine but if your kernel consumes a tag it won't show up in OVS | 15:26 |
nixbuilder | noonedeadpunk: Where are the policy defaults located? | 15:28 |
jrosser | nixbuilder: https://governance.openstack.org/tc/goals/completed/queens/policy-in-code.html | 15:29 |
noonedeadpunk | nixbuilder: actually... for horizon it could be this patch you should try | 15:29 |
noonedeadpunk | https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/910726 | 15:30 |
opendevreview | Merged openstack/openstack-ansible-os_octavia master: Adopt for usage openstack_resources role https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/889879 | 16:19 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2023.2: Bump SHAs for 2023.2 https://review.opendev.org/c/openstack/openstack-ansible/+/911943 | 16:20 |
opendevreview | Merged openstack/openstack-ansible-os_magnum master: Adopt for usage openstack_resources role https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/901185 | 16:21 |
nixbuilder | Where do you guys put the change logs between versions? In other words where would I find the change logs for 27.2.0 and 27.4.0? | 16:23 |
opendevreview | Merged openstack/openstack-ansible stable/zed: Determine if upgrade source branch is stable/ or unmaintained/ https://review.opendev.org/c/openstack/openstack-ansible/+/911584 | 16:26 |
noonedeadpunk | nixbuilder: https://docs.openstack.org/releasenotes/openstack-ansible/2023.1.html | 16:28 |
nixbuilder | noonedeadpunk: As always... thank you! | 16:30 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_magnum master: Move insecure param to keystone_auth section https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905110 | 16:32 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 16:32 |
noonedeadpunk | huh, so we've landed all openstack_resources patches - sweet | 16:34 |
jrosser | feels like we are also very close to getting the capi stuff merged too | 16:46 |
noonedeadpunk | yeah | 16:47 |
*** jamesdenton_ is now known as jamesdenton | 16:48 | |
jrosser | this should be good to go now https://review.opendev.org/c/openstack/openstack-ansible/+/911621 | 16:50 |
-opendevstatus- NOTICE: Jobs that fail due to being unable to resolve mirror.dfw.rackspace.opendev.org can be rechecked. This error was an unexpected side effect of some nodepool configuration changes which have been reverted. | 16:53 | |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-plugins master: Add support for the apply_to parameter for policies https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/911949 | 17:07 |
f0o | is there an easy way to have openstack-ansible create some predefined flavors? | 17:54 |
f0o | tried googling but I got a whole lot of other results unrelated to openstack-ansible including the obvious " just use the openstack module from ansible community" | 17:54 |
g3t1nf0 | what is the standard network driver called? not OVN, the other one, the old one? | 18:05 |
f0o | g3t1nf0: linuxbridge | 18:05 |
g3t1nf0 | thanks | 18:05 |
g3t1nf0 | jrosser: if I provide a list with selinux policies and firewall ports for the different services is that sufficient for selinux enforcing ? | 18:06 |
jrosser | f0o: there is improvement in that regard in the next release | 18:07 |
g3t1nf0 | https://paste.rs/LUWz4.txt so far | 18:07 |
jrosser | we have an ansible role “openstack resources” that can manage those things for you | 18:07 |
f0o | neat! | 18:08 |
noonedeadpunk | but it's only on master | 18:08 |
noonedeadpunk | pretty much new and still needs some polishing/improvement | 18:08 |
noonedeadpunk | But I assume you can try using it | 18:08 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-plugins/src/branch/master/roles/openstack_resources/defaults/main.yml | 18:08 |
f0o | I'll try at least heh | 18:09 |
noonedeadpunk | g3t1nf0: there's also OVS which sits in-between | 18:09 |
noonedeadpunk | g3t1nf0: um, I guess you would need to teach us a bit more on how it's expected to be set up | 18:11 |
g3t1nf0 | noonedeadpunk: should I continue with the policies or is it of no use | 18:11 |
g3t1nf0 | what you mean ? | 18:12 |
g3t1nf0 | the commands are there | 18:12 |
g3t1nf0 | you create the custom policy then apply it if there is no policy | 18:12 |
noonedeadpunk | well, regarding the firewall, I think jrosser and I have some views on it, though the question of how to implement it in a meaningful way for others is still open | 18:13 |
noonedeadpunk | regarding selinux - that is super useful thing to hove, imo | 18:13 |
noonedeadpunk | *have | 18:13 |
jrosser | I think for selinux we need someone to step forward and offer to develop/support that feature | 18:15 |
g3t1nf0 | jrosser: so just the policies are not enough ? | 18:15 |
noonedeadpunk | we for sure can provide some guidance there | 18:15 |
jrosser | g3t1nf0: I have little to no practical experience with RHEL variants | 18:16 |
noonedeadpunk | I guess a question here - what if they change | 18:16 |
jrosser | and therefore motivation to keep the CI in good order for selinux is not high, for me | 18:16 |
noonedeadpunk | I think selinux can work nicely on ubuntu/debian as well though | 18:17 |
g3t1nf0 | there is selinux for ubuntu and debian, debian 100% not sure ubuntu | 18:17 |
g3t1nf0 | https://wiki.debian.org/SELinux | 18:17 |
noonedeadpunk | g3t1nf0: so, can you please submit a bug report with your work/policies then, so we won't lose them, unless you wanna push them directly | 18:18 |
noonedeadpunk | given the amount of things in this paste - is there some "automated" way to come up with a policy? | 18:18 |
g3t1nf0 | https://wiki.ubuntu.com/SELinux lol guess not so much | 18:18 |
g3t1nf0 | automated way? maybe go to chatgpt give it the auditd logs with the denials and ask for a policy, no idea I got them from a person I know that runs openstack | 18:19 |
jrosser | this is the sort of feature we would want to run gate CI on I think | 18:20 |
noonedeadpunk | yeah | 18:20 |
jrosser | so that’s where the challenge comes - needing to maintain that, because if it breaks we have to either fix it or drop it | 18:20 |
jrosser | otherwise we cannot merge anything | 18:20 |
g3t1nf0 | Let me check with the older versions if much has been changed | 18:21 |
jrosser | g3t1nf0: ^ this is why having active contributors is so important - particularly if you have a specific interest in a feature and need that to be maintained | 18:21 |
noonedeadpunk | I've added that topic to etherpad for PTG | 18:22 |
noonedeadpunk | g3t1nf0: or at least document/educate a bit on how you have to fix that :D | 18:23 |
noonedeadpunk | But also question - how we are to implement their distribution. Per role? On playbook level? | 18:24 |
g3t1nf0 | okay I'll create the list first for bobcat then compare with older releases; on first glance not much has changed since Yoga | 18:24 |
noonedeadpunk | g3t1nf0: I guess interesting might be nova/manila | 18:24 |
noonedeadpunk | neutron as well | 18:25 |
g3t1nf0 | neutron has different configs for different servers; I started with OVN first, I'm on linuxbridge now | 18:25 |
g3t1nf0 | Nova is different only between the nova-scheduler, nova-conductor vs nova-compute | 18:26 |
noonedeadpunk | I totally agree with jrosser though on maintaining it part - we are running ubuntu everywhere... | 18:26 |
noonedeadpunk | and I guess main "problem" here - you need to cover either 100% or 0% more or less | 18:27 |
noonedeadpunk | you can't cover one scenario and leave another one broken. | 18:28 |
g3t1nf0 | with the list that jrosser gave me I have the policies for all of them | 18:28 |
noonedeadpunk | oh, huh | 18:28 |
noonedeadpunk | I'm really pretty much interested to see what we can do here, as I don't want this work to be wasted | 18:29 |
g3t1nf0 | keystone, nova, glance, horizon, swift, cinder, neutron, octavia, heat, ceilometer, cloudkitty, trove, magnum, sahara, ironic, manila, designate, barbican | 18:29 |
jrosser | writing an ansible role to apply this stuff is pretty much the trivial bit | 18:29 |
g3t1nf0 | I can do the ansible part as well | 18:30 |
g3t1nf0 | it's a bit tricky with neutron because you have to separate the policy by where it is installed and which network driver is being used, ovn or linuxbridge | 18:30 |
jrosser | I’m about to run out of battery here :/ | 18:31 |
noonedeadpunk | yeah, let's continue this discussion later, ok? | 18:31 |
noonedeadpunk | I need to think about how better to implement that anyway | 18:32 |
noonedeadpunk | as you said - policy should be aware of the context | 18:32 |
g3t1nf0 | https://www.server-world.info/en/note?os=CentOS_Stream_9&p=openstack_bobcat&f=1 here is his blogs with documentation that he did | 18:32 |
noonedeadpunk | so we likely need to place them in service roles. | 18:32 |
noonedeadpunk | and call smth out of that | 18:32 |
noonedeadpunk | but then due to the nature of selinux it indeed feels like it would be more appropriate to run the role against * | 18:36 |
g3t1nf0 | see you tomorrow guys I need to take the dog for a walk | 18:37 |
noonedeadpunk | o/ | 18:37 |
gokhani | Hello folks, I am getting weird issues on galera after upgrading the primary infra host. The galera cluster is working and in sync. But on haproxy the galera backend is down because it can't reach port 9200. I can curl port 9200 on the galera container but on the infra host I can't reach port 9200. https://paste.openstack.org/show/bKWpCIfVrwSz9v2ZzqiV/ . what can prevent this ? | 18:46 |
gokhani | I can curl 3306 port both from host and container. | 18:46 |
noonedeadpunk | gokhani: I guess you're exiting the controller from some weird IP | 18:48 |
noonedeadpunk | like having VIP added as /24 instead of /32 or smth like that | 18:48 |
noonedeadpunk | since service on 9200 will accept connections only from specific IPs | 18:48 |
gokhani | noonedeadpunk: I am trying to curl with galera container ip not with VIP | 18:53 |
gokhani | it is also not reachable with the galera container ip | 18:53 |
noonedeadpunk | src IP I meant | 18:56 |
noonedeadpunk | it matters | 18:56 |
noonedeadpunk | so /etc/systemd/system/mariadbcheck.socket defines an explicit list of IPs from which it can be accessed | 18:58 |
noonedeadpunk | through `IPAddressAllow` | 18:58 |
noonedeadpunk | which is defined by `galera_monitoring_allowed_source` variable | 18:58 |
noonedeadpunk | and it allows only haproxy mgmt ip and localhost by default | 18:59 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/galera_all.yml#L33-L39 | 18:59 |
noonedeadpunk | gokhani: ^ | 18:59 |
noonedeadpunk | and in case the VIP is added with some netmask like /24 or smth - traffic from the controller might flow through the VIP rather than from the management IP as it should | 19:00 |
noonedeadpunk | causing haproxy to mark galera as down | 19:00 |
gokhani | noonedeadpunk: I am checking now | 19:04 |
gokhani | noonedeadpunk: thanks this issue is because of setting ips different from br-mgmt ips in /etc/openstack_deploy/openstack_user_config.yml. I don't know how this ip is named. | 19:11 |
gokhani | after allowing infra host br-mgmt ips in /etc/systemd/system/mariadbcheck.socket, it worked | 19:12 |
gokhani | and a last but basic question: my bash is not completing container names when typing lxc-* commands, which app do I need to install? | 19:13 |
noonedeadpunk | yeah | 19:15 |
noonedeadpunk | I dunno why it's broken, never had time to look at it | 19:16 |
noonedeadpunk | but it's difference of `sudo -s` vs `sudo -i` | 19:16 |
noonedeadpunk | so smth with environment | 19:16 |
noonedeadpunk | so I just taught myself to use `sudo -i` for auto-completion to work | 19:17 |
noonedeadpunk | gokhani: you can really use `galera_monitoring_allowed_source` for a more persistent fix | 19:18 |
noonedeadpunk | but there might be really more surprises though | 19:18 |
gokhani | noonedeadpunk: now with sudo -i it worked. but I previously tried with sudo -i and it didn't work, but now it works :) | 19:19 |
gokhani | thanks noonedeadpunk, I have also added galera_server_proxy_protocol_networks for no surprises :) | 19:20 |
noonedeadpunk | really would be good to take time and figure out why sudo -s is not working anymore... | 19:21 |
noonedeadpunk | I haven't managed in a couple of years now... so chances are low :D | 19:21 |
gokhani | noonedeadpunk: yes it would be good if it could also work with sudo -s | 19:24 |
noonedeadpunk | also - auto-logout is not working with sudo -s | 19:24 |
noonedeadpunk | which is way more critical | 19:25 |
noonedeadpunk | so if you find the reason - let us know) | 19:25 |
gokhani | noonedeadpunk: I have completed the upgrade of infra hosts from victoria to antelope and also from focal to jammy. | 19:25 |
noonedeadpunk | sweet | 19:26 |
noonedeadpunk | that sounds pretty much like an achievement | 19:26 |
gokhani | I only had problems with galera for these ip issues because of my env | 19:26 |
noonedeadpunk | I guess, we talked last year that upgrades are not that scary and better/easier just to keep things up to date :) | 19:28 |
noonedeadpunk | but again - I wonder what kind of issues you had | 19:28 |
noonedeadpunk | So you have separate network for SSH and another as openstack mgmt net? | 19:29 |
noonedeadpunk | because then we have smth new in inventory for these cases | 19:29 |
gokhani | there is a difference when copying .my.cnf because the galera root user was changed from root to admin | 19:29 |
noonedeadpunk | oh, well | 19:30 |
noonedeadpunk | in fact... | 19:30 |
gokhani | Yes I have seperate network for ssh | 19:30 |
noonedeadpunk | in fact root should use just socket auth | 19:30 |
noonedeadpunk | not password auth | 19:30 |
noonedeadpunk | then there's no need for .my.cnf on galera containers | 19:31 |
noonedeadpunk | but we obviously missed that thing out of sight historically | 19:31 |
noonedeadpunk | for ssh !=mgmt - have you checked on https://opendev.org/openstack/openstack-ansible/src/branch/master/doc/source/reference/inventory/configure-inventory.rst#having-ssh-network-different-from-openstack-management-network ? | 19:32 |
gokhani | yes if I am not wrong it is only needed on first utility container | 19:32 |
noonedeadpunk | actually - for all hardware hosts | 19:34 |
gokhani | ohh this is awesome, it is first time I have seen this doc :( | 19:34 |
noonedeadpunk | like for everything defined in openstack_user_config | 19:34 |
noonedeadpunk | ah, about mariadb | 19:34 |
gokhani | I haven't ever checked this. | 19:34 |
noonedeadpunk | yes, mariadb just once | 19:34 |
gokhani | noonedeadpunk: thanks a lot, I have learned new things today :) | 19:36 |
ncuxo | are there some recommendations for partitioning the OS drive ? | 19:40 |
noonedeadpunk | ncuxo: I'm not sure we have written that anywhere | 19:41 |
noonedeadpunk | also I guess it kinda depends | 19:41 |
noonedeadpunk | If you're using LXC containers, and rootfs as backing storage (just folders on the FS) - then they're stored in /var/lib/lxc/ and you might want to have a partition for that | 19:42 |
ncuxo | usually I'm separating /usr /var /var/log /var/log/audit /var/crash /var/tmp /opt /tmp /home /.snapshots | 19:43 |
noonedeadpunk | then there're things in /openstack/ - like backup, log, venvs... | 19:43 |
noonedeadpunk | basically folders that are bind mounted inside containers | 19:44 |
noonedeadpunk | and venvs if smth happens to be on bare metal | 19:44 |
noonedeadpunk | so /openstack might be another thing | 19:44 |
ncuxo | can I use podman or docker ? or only lxc ? | 19:46 |
ncuxo | isn't kolla the one that is using containers, or am I in the wrong place | 19:48 |
jrosser | ncuxo: openstack-ansible uses lxc containers by default, but that is optional | 19:50 |
jrosser | these are nothing at all like docker/podman, but can be considered more like virtual machines | 19:50 |
ncuxo | so for bare metal ? | 19:50 |
jrosser | can you be more specific? | 19:50 |
ncuxo | okay, I need to read some more on LXC then; I thought they were some other kind of container | 19:51 |
jrosser | they are analogous to a vm and run the full systemd and everything | 19:51 |
ncuxo | so lxc is like virtual machine thus openstack-ansible is on bare metal ? | 19:51 |
jrosser | just happen to share the host kernel | 19:51 |
jrosser | no not really, it's like having multiple hosts inside one bare metal host | 19:52 |
jrosser | when we say a “metal” | 19:52 |
ncuxo | I'm perplexed then why we are using docker or podman at all. does lxc have the same "containerisation"? are there worries of escapes? | 19:52 |
ncuxo | sorry I'm just finding a whole new world here and I'm a bit confused why I've never heard of those | 19:53 |
jrosser | well they are not particularly fashionable in a world full of docker and k8s | 19:54 |
jrosser | they are “machine containers” where docker and podman are usually “application containers” | 19:54 |
ncuxo | what happens if one of them is stopped? can it be restarted with systemd like the others ? | 19:55 |
jrosser | there are command line tools to manage lxc containers | 19:55 |
jrosser | start / stop / create / delete etc | 19:55 |
ncuxo | okay I'll be back you gave me homework | 19:56 |
jrosser | ncuxo: see https://linuxcontainers.org/ | 19:57 |
ncuxo | already there | 19:57 |
ncuxo | can I use rocky or centos or rhel as lxc ? | 19:57 |
ncuxo | or everything is alpine | 19:57 |
jrosser | for openstack-ansible we build our own image during deployment that matches the host OS | 19:59 |
jrosser | with debootstrap or dnf to build a minimal filesystem | 19:59 |
jrosser | but for the general case outside OSA there are many different OS container images available to use | 20:00 |
ncuxo | which playbook does that container-lxc-create ? | 20:00 |
jrosser | the image build is done in the lxc_hosts role | 20:00 |
noonedeadpunk | ncuxo: you can do just bare metal as well | 20:01 |
noonedeadpunk | yes, and it's backed by 2 roles basically: https://opendev.org/openstack/openstack-ansible-lxc_container_create - which handles container creation | 20:03 |
noonedeadpunk | and https://opendev.org/openstack/openstack-ansible-lxc_hosts where we prepare basic image for them | 20:03 |
jrosser | yes osa makes the lxc containers optional - they’re not mandatory, and would be arguments both ways depending on circumstances | 20:03 |
noonedeadpunk | but yes, if you run rocky - you will get rocky in the container, if you run centos - you will get centos. We actually talked about letting ppl select smth different as well, but it's not high in prio | 20:04 |
ncuxo | the thing is, if I go with containers anyway, why not deploy openstack on k8s | 20:04 |
noonedeadpunk | I guess because you can't really use much of k8s features? | 20:05 |
noonedeadpunk | but again - we do run CI without containers as well | 20:05 |
noonedeadpunk | really easy to opt out from them | 20:06 |
noonedeadpunk | Like - auto-scaling can't be used as it breaks all kinds of things | 20:06 |
jrosser | well this is all factors in choosing your deployment tool, if you want docker then you have to use kolla, if you want bare metal then you have to choose osa, if you want k8s based then there are other choices besides these | 20:06 |
noonedeadpunk | Self-healing? But systemd kinda does that pretty well by restarting the service | 20:06 |
ncuxo | auto scaling of openstack is broken when deployed with or without lxc | 20:06 |
noonedeadpunk | so to disable any container things, just add `no_containers: true` to openstack_user_config.yml under global_overrides | 20:07 |
ncuxo | sure but I didn't understand what is breaking the autoscaling and self-healing; if I disable lxc and those are then broken, then I prefer to do it with lxc | 20:08 |
noonedeadpunk | nah | 20:09 |
noonedeadpunk | there's no autoscaling regardless | 20:09 |
jrosser | I feel we are having a very confused conversation here | 20:09 |
noonedeadpunk | and nothing is broken | 20:09 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible/src/branch/master/doc/source/reference/inventory/configure-inventory.rst#deploying-directly-on-hosts | 20:09 |
noonedeadpunk | but as I said - you can just define no_containers once under global_overrides | 20:10 |
noonedeadpunk | And actually - then you can fully skip the `provider_networks` definition at all | 20:10 |
noonedeadpunk | will just need to define neutron_provider_networks for mappings, which is a way more trivial thing to do | 20:10 |
ncuxo | are there any benefits of running one over the other | 20:11 |
ncuxo | lxc over baremetal and vise versa | 20:11 |
jrosser | some people feel that the without-lxc setup is simpler, and the deployment is certainly faster | 20:11 |
noonedeadpunk | container is easy to re-create, no conflicting dependencies | 20:11 |
noonedeadpunk | and yes - +1 to metal benefits | 20:12 |
jrosser | however, day-2 operations and maintenance are certainly easier with the lxc setup | 20:12 |
ncuxo | with those LXC you made me rethink our whole infra at work | 20:12 |
noonedeadpunk | well, I guess ppl can argue on day2 as well though | 20:12 |
noonedeadpunk | but I guess it's a matter of taste | 20:13 |
noonedeadpunk | so we allow both of these and operator is free to choose more or less | 20:13 |
jrosser | for me, being able to just delete and reprovision a container if I mess up is a good benefit | 20:13 |
ncuxo | how is security implemented with LXC? as an example, if I run debian then I first have to harden my debian OS, then lxc copies my OS as base :? | 20:14 |
jrosser | but a downside is you now have N times more “systems” to maintain and patch | 20:14 |
noonedeadpunk | ncuxo: pretty much | 20:14 |
noonedeadpunk | well, not fully... | 20:14 |
noonedeadpunk | Sorry, I answered wrongly... | 20:14 |
ncuxo | jrosser: why ? isn't it just redeploy with the new image ? | 20:14 |
noonedeadpunk | ncuxo: the lxc image is built from scratch using debootstrap/rpm as jrosser wrote earlier | 20:15 |
noonedeadpunk | and there's some caching time, during which re-running the role won't try to update it | 20:15 |
jrosser | and then ansible deploys “the things” into the image once it’s booted | 20:15 |
ncuxo | so it is taking my OS as base thus I need to upgrade my OS and recreate all the LXC images ? | 20:15 |
noonedeadpunk | but there's variable that can be adjusted | 20:15 |
noonedeadpunk | nah, it's not taking OS as base | 20:16 |
jrosser | no, it builds a fresh OS filesystem | 20:16 |
jrosser | see debootstrap tool | 20:16 |
noonedeadpunk | I guess the biggest difference from docker here is that the image is just a bare minimal OS | 20:16 |
ncuxo | sorry for the confusion guys, never used LXC, I had a totally different idea of the thing | 20:16 |
noonedeadpunk | it does not contain any services. So if you re-create, then you'd need to re-install service inside it | 20:17 |
jrosser | if lxc looks interesting, you should also check out lxd/incus | 20:17 |
noonedeadpunk | yeah, incus is sweet.... | 20:17 |
jrosser | they are a slightly higher level abstraction of the same thing | 20:17 |
jrosser | with api and fancy stuff | 20:17 |
noonedeadpunk | but then - you can just run `dnf upgrade` inside lxc container | 20:17 |
noonedeadpunk | and it will get updated | 20:17 |
ncuxo | so I can have rhel9 and use LXC to create a minimal rhel9 lxc ? | 20:18 |
ncuxo | why can't I find any rhel lxc templates | 20:18 |
jrosser | well RHEL is non free and can’t be distributed | 20:19 |
noonedeadpunk | sorry, folks, I need to leave now - already quite late | 20:20 |
noonedeadpunk | o/ | 20:20 |
jrosser | o/ | 20:20 |
noonedeadpunk | but... if you run el9 - lxc will be also el9 I assume | 20:20 |
noonedeadpunk | as the way we build images is to take repos from the host | 20:21 |
ncuxo | I have to test this ... I've always built everything on rhel using the ubi images | 20:21 |
noonedeadpunk | and run dnf to create minimal image of a thing | 20:21 |
noonedeadpunk | you can start with AIO really easily | 20:21 |
ncuxo | can AIO be on a cluster ? | 20:22 |
ncuxo | or is it meant just for one machine | 20:22 |
noonedeadpunk | git clone https://opendev.org/openstack/openstack-ansible; cd openstack-ansible; git checkout stable/2023.2; ./scripts/gate-check-commit.sh aio_lxc | 20:22 |
noonedeadpunk | ncuxo: well, for cluster we have MNAIO but it's slightly unmaintained: https://opendev.org/openstack/openstack-ansible-ops/src/branch/master/multi-node-aio | 20:23 |
noonedeadpunk | I've wanted to re-work this concept for a while and have a couple of ideas, but ENOTIME | 20:23 |
ThiagoCMC | jrosser, so, back to Ceph quickly. I managed to install Ceph 18 (Reef) on Ubuntu 22.04 with Bobcat, using Ceph Ansible `stable-7.0`! The `stable-8.0` is definitely broken for Ubuntu 22.04 + Bobcat (with Reef) and 24.04. It seems that Ceph Ansible is not good for Ubuntu 24.04, perhaps unless downgrading Ansible. | 20:35 |
jrosser | ThiagoCMC: maybe you can make a patch to deploy Reef on OSA master? | 20:42 |
jrosser | right so currently we certainly test quincy https://zuul.opendev.org/t/openstack/build/55e1e978d267477eb52e21574d4152f6/log/logs/host/apt/history.log.txt#57 | 20:46 |
nixbuilder | I am stumped... I have been trying to fix this all day to no avail... fresh 27.4.0 install and I get this https://paste.openstack.org/show/beNkSK8rCKyzRKTi9LxP/ | 22:28 |
nixbuilder | This is weird... back to back commands... the first an error the second it works (https://paste.openstack.org/show/bvQOCb1qTdiQDtRcP4Rh/) | 23:03 |
opendevreview | Merged openstack/openstack-ansible-os_neutron master: Add VPNaaS OVN support https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/908341 | 23:36 |