jrosser | good morning | 08:56 |
---|---|---|
noonedeadpunk | o/ mornings | 09:11 |
jrosser | well include_role: / import_role: / roles: behave somewhat differently about vars defined in roles | 09:51 |
jrosser | if you use roles: [ myrole ] then myrole/defaults/main.yml vars are defined in the scope of the playbook for the tasks to use | 09:52 |
jrosser | but not the case for include / import | 09:52 |
noonedeadpunk | so in case of import/include that's in hostvars? | 10:43 |
jrosser | i get undefined | 10:46 |
jrosser | this is for a role that has nothing other than defaults/main.yml in it | 10:46 |
jrosser | a "set some vars" role | 10:46 |
noonedeadpunk | you're talking about latest ansible-core, right? | 10:48 |
noonedeadpunk | or in general? | 10:48 |
jrosser | well specifically this https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/900527 | 10:49 |
jrosser | and i thought i understood how this worked but clearly i don't | 10:50 |
jrosser | noonedeadpunk: looks like reprovisioning an haproxy node is quite exciting with the changes we made to put the config in the service playbooks | 14:07 |
jrosser | depending on what keepalived is set up to do, it could prefer the node you just re-installed, which has no backends yet | 14:07 |
noonedeadpunk | ouh | 14:08 |
noonedeadpunk | So basically we should be skipping keepalived | 14:09 |
jrosser | i guess this could also happen in the past | 14:09 |
jrosser | but the window of brokenness would be small as the haproxy playbook would put everything back pretty quickly | 14:09 |
noonedeadpunk | but not for aio or smth... | 14:09 |
jrosser | only relevant for H/A | 14:09 |
noonedeadpunk | and not fresh deployment | 14:09 |
noonedeadpunk | actually... that is a good catch as we're going to do a controller re-setup soonish | 14:10 |
jrosser | primarily OS upgrades or repairing a broken node | 14:10 |
jrosser | which is exactly what andrewbonney is looking at now | 14:10 |
noonedeadpunk | yeah, hosamali_ looking there now as well | 14:11 |
noonedeadpunk | And actually I don't have any good ideas here except skipping keepalived until all backends are configured | 14:12 |
noonedeadpunk | and then run smth like setup-openstack --tags haproxy-service-config | 14:13 |
andrewbonney | ^ about to fix a bug with that tag :) | 14:13 |
noonedeadpunk | ok :) | 14:13 |
noonedeadpunk | I _think_ I used it already one or two times | 14:13 |
jrosser | i thought there was a way to inhibit keepalived taking the vip | 14:14 |
jrosser | but we do not seem to have that config | 14:14 |
noonedeadpunk | well, weights | 14:14 |
noonedeadpunk | but if you take down keepalived with highest weight.... | 14:15 |
jrosser | sure, but we could use vrrp_track_file | 14:15 |
noonedeadpunk | or well, yes, totally | 14:15 |
jrosser | then there is something to manually set when you know that the backends are all missing | 14:15 |
noonedeadpunk | but how to define if we're missing backends | 14:15 |
jrosser | i am not sure that this can be automated | 14:15 |
noonedeadpunk | or know that they're | 14:15 |
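A rough sketch of the vrrp_track_file idea mentioned above; the file path, weight and task name are made up, and the exact priority/FAULT semantics depend on the weight (see keepalived.conf(5)), so this only illustrates the shape of the mechanism:

```yaml
# keepalived.conf (managed elsewhere) would carry something along the lines of:
#
#   vrrp_track_file haproxy_ready {
#       file /etc/keepalived/haproxy_ready
#       weight 10
#   }
#
# so nodes whose track file holds a positive value get a priority boost, and a
# freshly reinstalled node (file at 0) stays unattractive until an operator or
# playbook step flips it after the backends are configured.
- name: Keep this node's keepalived priority low until backends are configured
  ansible.builtin.copy:
    dest: /etc/keepalived/haproxy_ready
    content: "0\n"    # write e.g. "1" once haproxy-service-config has been run
    mode: "0644"
```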
opendevreview | Andrew Bonney proposed openstack/openstack-ansible-haproxy_server master: Fix undefined fact when running haproxy-service-config tag https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/905867 | 14:16 |
noonedeadpunk | jrosser: well, you can run haproxy-install.yml --skip-tags keepalived | 14:16 |
jrosser | true | 14:16 |
noonedeadpunk | and then just re-run it at the very end | 14:16 |
noonedeadpunk | which is kinda a matter of the OS upgrade process (which we need to update) | 14:17 |
jrosser | or shut the public interface | 14:17 |
*** ianychoi[m] is now known as ianychoi | 14:17 | |
jrosser | well that's not good enough actually because it will have the same breakage on the internal endpoint | 14:18 |
noonedeadpunk | Well, I guess documenting --skip-tags keepalived, then configuring all backends, and afterwards re-running haproxy-install --tags keepalived sounds like an easy/fair workaround | 14:19 |
jrosser | yes that does sound simplest | 14:19 |
andrewbonney | Looks like that bug was actually fixed in https://github.com/openstack/openstack-ansible-haproxy_server/commit/a8a8feba7153622e37b2bde5def1053cc76d30e1 - we're running an older version | 14:21 |
noonedeadpunk | aha, ok, that explains why that sounded common | 14:22 |
noonedeadpunk | andrewbonney: do you know if `setup-openstack --tags haproxy-service-config --limit control01` will work? | 14:25 |
noonedeadpunk | (mainly in terms of limit) | 14:25 |
andrewbonney | I ran it without limit first time around. At first glance that looks like it runs through too quickly to be working properly | 14:29 |
andrewbonney | Yes, steps like this one get missed: TASK [Temporarily copy haproxy_service_configs value from keystone_all to haproxy_all] | 14:31 |
andrewbonney | changed: [infra1_keystone_container-8ea445de] => (item=haproxy1) | 14:31 |
andrewbonney | changed: [infra1_keystone_container-8ea445de] => (item=haproxy2) | 14:31 |
noonedeadpunk | yeah :( | 14:31 |
noonedeadpunk | thanks for checking! | 14:32 |
noonedeadpunk | spatel: hey! I recall you used the ovn bgp agent, didn't you? :) | 15:38 |
spatel | Yes in lab :( but it was chaos | 15:39 |
noonedeadpunk | mhm, I see | 15:39 |
spatel | it was 2 years ago and things have changed and been patched since, so it should be smooth now | 15:39 |
spatel | Are you looking for L3 end to end solution? | 15:39 |
noonedeadpunk | I'm just reading and trying to assess if it's worth it at all, given that time is veeeery limited | 15:39 |
noonedeadpunk | yeah, pretty much | 15:40 |
noonedeadpunk | But I _think_ that Amphora in Octavia won't like it | 15:40 |
noonedeadpunk | at least with exposing tenant networks via bgp | 15:40 |
noonedeadpunk | But the more I read the less I understand kinda | 15:41 |
noonedeadpunk | And it looks like there are soooo many places where things may go wrong | 15:42 |
jrosser | you want it for ipv6? | 15:43 |
spatel | bgp is not fun | 15:45 |
noonedeadpunk | jrosser: I think at this point not only for ipv6 | 15:46 |
NeilHanlon | i'm waiting for ipv12 | 15:46 |
noonedeadpunk | or well. I'm personally not sure I want it per se, but the network folks seem to | 15:46 |
noonedeadpunk | And I kinda understand why on paper | 15:47 |
NeilHanlon | it makes things a lot easier from an integration point of view if you can do, e.g., bgp-unnumbered | 15:47 |
noonedeadpunk | but looking on complexity I'm not absolutely convinced that it's worth it | 15:47 |
spatel | building ovn-bgp is easy but in kernel space packets bounce around a lot and it required some complex troubleshooting | 15:47 |
NeilHanlon | https://www.juniper.net/documentation/us/en/software/nce/nce-225-bgp-unnumbered/index.html | 15:47 |
spatel | I would like it if ovn supported bgp natively | 15:47 |
* noonedeadpunk loves junipers | 15:47 | |
* NeilHanlon too | 15:48 | |
* NeilHanlon also loves the drink of the juniper.... gin | 15:48 | |
noonedeadpunk | spatel: yeah, exactly, but given implementation with frr it feels like soooo much can potentially go wrong | 15:48 |
jrosser | well juniper -> HPE right? | 15:49 |
noonedeadpunk | As with OVN we want to get rid of the complexity of namespaces, but then bgp negates all those simplifications | 15:49 |
noonedeadpunk | jrosser: ugh, I completely missed these news | 15:49 |
NeilHanlon | allegedly.. | 15:50 |
NeilHanlon | we'll see what the US courts do ;) | 15:50 |
jrosser | tbh we (org-wise, not me) are having juniper trouble | 15:50 |
jrosser | we have some chassis based routers that the routing engine just goes AWOL every so often | 15:50 |
jrosser | and then some TOR type switches in virtual chassis (whoever thought that was a good idea....) all decided to go out to lunch simultaneously recently | 15:51 |
noonedeadpunk | I can't recall if I dealt with their core engines very closely.... | 15:51 |
noonedeadpunk | but never had huge issues with switch stacks | 15:52 |
noonedeadpunk | their redundancy/rolling upgrades/failover was waaay better than what I experienced with cisco's... but dunno | 15:53 |
jrosser | yeah | 15:53 |
noonedeadpunk | (or dells) | 15:53 |
spatel | noonedeadpunk my question is why are you looking for OVN-BGP ? | 15:55 |
spatel | what encouraged you? | 15:55 |
spatel | I am running BGP Unnumbered EVPN Fabric with Cisco nexus in my new DC | 15:56 |
spatel | without IPv6 :) | 15:56 |
noonedeadpunk | Getting proper spine/leaf I guess | 15:56 |
jrosser | well i think theres two things | 15:57 |
spatel | ? | 15:57 |
jrosser | you'd need bgp (or routing protocol of choice) for spine/leaf but that doesn't really touch OVN | 15:57 |
jrosser | but the other thing entirely is interfacing the edge of openstack into L3 external networks with BGP only | 15:58 |
spatel | In ovn-bgp your compute nodes will interact with your physical EVPN fabric so in that case you will have 100% L3 network in fabric. | 16:00 |
noonedeadpunk | I have a very vaguely described purpose, so I'm trying to really understand the options and evaluate them | 16:00 |
noonedeadpunk | I guess this full L3 should eliminate the switch stacks, which are crap | 16:00 |
spatel | FRR will run inside your compute and you will create BGP peering with your physical tor switches running BGP | 16:01 |
noonedeadpunk | and always bring down everything when one member goes down | 16:01 |
NeilHanlon | layer 2 is the devil | 16:01 |
jrosser | spatel: for gateway nodes, right? | 16:01 |
noonedeadpunk | spatel: btw, frr is installed/configured separately, right? | 16:01 |
spatel | The other big advantage is you don't need LACP bonds etc.. :) | 16:01 |
noonedeadpunk | jrosser: nah, also for computes | 16:01 |
noonedeadpunk | or well. depends | 16:01 |
jrosser | tenant networks...? | 16:02 |
noonedeadpunk | if you have distributed vips or not | 16:02 |
noonedeadpunk | and tenant networks, yes | 16:02 |
jrosser | well actually i guess i specifically mean the IP of the VTEP | 16:02 |
spatel | This is best example diagram here - https://ltomasbo.wordpress.com/2021/02/04/ovn-bgp-agent-testing-setup/ | 16:03 |
spatel | your compute connected to leaf with BGP peer (using FRR running inside compute nodes) | 16:04 |
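For a concrete idea of what that looks like on a compute node, here is a hedged sketch of the sort of FRR peering config such a setup implies, carried as an Ansible variable; the ASN, interface names and loopback address are made up, and the syntax is FRR's standard unnumbered `neighbor <ifname> interface remote-as external` form:

```yaml
# Hypothetical vars for one compute: peer with both ToR leaves over unnumbered
# L3 links and announce the node's own /32 loopback.
frr_bgp_config: |
  router bgp 4210000001
   bgp router-id 10.0.0.11
   neighbor enp5s0f0 interface remote-as external
   neighbor enp5s0f1 interface remote-as external
   address-family ipv4 unicast
    network 10.0.0.11/32
   exit-address-family
```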
noonedeadpunk | ok, thats very good reading, thanks! | 16:06 |
spatel | the beauty of OVN-BGP is that you can connect two different clouds using l3 routing and BGP magic. Scaling is very easy etc.. no broadcast, no multicast.. | 16:08 |
noonedeadpunk | aha, for exposing tenant networks it looks like they should not intersect in address spaces as well? | 16:09 |
spatel | https://www.openstack.org/videos/summits/berlin-2022/Using-BGP-at-OpenStack-to-interconnect-workloads-across-clouds | 16:09 |
noonedeadpunk | seems I've missed this specific presentation (likely other folks of ours were there) | 16:10 |
noonedeadpunk | but we use vpnaas for that today | 16:10 |
spatel | One big advantage is, if you're running k8s then you can expose your pod IPs to the outside world and ping directly to a pod. Google provides that feature | 16:10 |
noonedeadpunk | as we need to do so only on tenant level | 16:10 |
noonedeadpunk | but how're you gonna do that.... like you kinda need public ip inside the pod? | 16:12 |
noonedeadpunk | or some dst nat? | 16:12 |
jrosser | the non intersecting address space thing is difficult for multitenant clouds | 16:13 |
jrosser | or where you have some template/terraform/whatever that spins up the same thing in multiple projects | 16:14 |
jrosser | at the start we looked very hard at calico but it suffered from exactly the same thing | 16:14 |
jrosser | yeah so in those examples they have to use a subnetpool i think? | 16:18 |
noonedeadpunk | Haven't watched the video from Berlin yet | 16:25 |
noonedeadpunk | but non-overlapping networks are mentioned explicitly in implementation limitations: https://docs.openstack.org/ovn-bgp-agent/latest/contributor/bgp_mode_design.html#limitations | 16:25 |
noonedeadpunk | ah, yes, subnet pools | 16:26 |
noonedeadpunk | but kinda... making tenants using subnet pools for private networks? | 16:27 |
noonedeadpunk | so yeah, as they wrote it's not east/west solution for sure | 16:28 |
noonedeadpunk | even though there're some options to do so | 16:28 |
noonedeadpunk | and what makes things worse, I'm not very good at networking historically | 16:38 |
spatel | Still, I don't understand why you are looking at the ovn-bgp solution? Do you have a specific need which the existing infra is not able to meet? | 16:39 |
noonedeadpunk | spatel: I was told to by our net folks | 16:45 |
noonedeadpunk | more or less | 16:45 |
spatel | They told you to do ovn-bgp or just EVPN fabric at network layer ? | 16:45 |
noonedeadpunk | I quite vaguely understand the reasoning, except ipv6, but apparently they love BGP and want everything to be handled through it or smth | 16:45 |
noonedeadpunk | They want fips to be routed directly to core switches or smth | 16:46 |
spatel | haha! loving BGP is a different thing than BGP loving you :) | 16:46 |
noonedeadpunk | *core routers | 16:46 |
noonedeadpunk | actually, I don't fully understand that either. | 16:47 |
noonedeadpunk | trying to gather info now to have a constructive dialog :D | 16:47 |
noonedeadpunk | and the task lacks the motivation part as well... | 16:48 |
jrosser | fip is one part but also there is IP of tenant router to think about (if you don't advertise tenant subnets) | 16:49 |
* jrosser thinks this all gets real messy real quick | 16:49 | |
spatel | This is what I have in my DC - https://ibb.co/CwJyP1w | 16:49 |
mgariepy | woohoo +2 from dansmith ! https://review.opendev.org/c/openstack/glance_store/+/885581 | 16:49 |
noonedeadpunk | wow | 16:50 |
spatel | Physical switch fabric running EVPN with anycast gateway, so each of my tor-leaf switches is a gateway and the compute machines are standard servers in L2 | 16:50 |
noonedeadpunk | no idea how you did it mgariepy | 16:50 |
mgariepy | haven't done much tbh. just commented and rebased the patch ;p | 16:51 |
mgariepy | just need another core now. so anyone here have contact ? | 16:52 |
mgariepy | ;) | 16:52 |
noonedeadpunk | jrosser: well, reportedly, the bgp agent should also listen for router external ips and advertise them as well | 16:54 |
noonedeadpunk | the messiest part I don't like - plenty of kernel space wiring and messing with OVS flows to create another exit/entrance point | 16:55 |
jrosser | yeah, so i wonder if there is a halfway house where most of the complexity and limitations of that blog post don't apply | 16:55 |
jrosser | where the BGP part only applies to FIP / router IP on designated gateway nodes | 16:55 |
jrosser | then you could have clean interface with upstream routers at a well defined point | 16:55 |
noonedeadpunk | (and computes potentially) | 16:55 |
jrosser | maybe | 16:56 |
jrosser | you'd have to work out how to address that | 16:56 |
jrosser | particularly if it was actual internet | 16:56 |
jrosser | though i believe you would be able to use rfc1918 loopback IP even if the eventual addresses were public | 16:57 |
noonedeadpunk | yeah, true | 16:58 |
jrosser | and i think be suuuuper careful with what services bind to what IP on everything | 16:59 |
noonedeadpunk | basically if provider networks are not passed down to computes, potentially they might be left intact | 16:59 |
noonedeadpunk | we have another level of complexity called AZs though :D | 17:00 |
noonedeadpunk | SO yeah... | 17:00 |
noonedeadpunk | anyway. | 17:00 |
noonedeadpunk | Looking at ovn bgp, it seems it requires an FRR role | 17:00 |
noonedeadpunk | I kinda know one role, but it hasn't been maintained for a while: https://opendev.org/vexxhost/ansible-role-frrouting | 17:02 |
noonedeadpunk | (nor was it fully completed) | 17:02 |
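As a rough scoping exercise (this is not the vexxhost role, whose variables haven't been checked here), the bare minimum such an FRR role would need to cover looks something like the sketch below; the group name and template are hypothetical:

```yaml
- name: Deploy FRR for ovn-bgp-agent hosts
  hosts: frr_hosts
  become: true
  tasks:
    - name: Install FRR
      ansible.builtin.package:
        name: frr
        state: present

    - name: Enable the bgpd daemon
      ansible.builtin.lineinfile:
        path: /etc/frr/daemons
        regexp: '^bgpd='
        line: 'bgpd=yes'
      notify: Restart frr

    - name: Render frr.conf (e.g. from a fragment like the one above)
      ansible.builtin.template:
        src: frr.conf.j2
        dest: /etc/frr/frr.conf
        owner: frr
        group: frr
        mode: "0640"
      notify: Restart frr

  handlers:
    - name: Restart frr
      ansible.builtin.service:
        name: frr
        state: restarted
        enabled: true
```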
noonedeadpunk | spatel: oh, anycast gateway - that might work indeed | 17:12 |
spatel | each of my leaves is a gateway | 17:14 |
noonedeadpunk | spatel: yeah, that is good idea, I kinda like it | 17:15 |
noonedeadpunk | I have close to no idea how much hassle it would be from the net side to do that though | 17:15 |
noonedeadpunk | as eventually the flow would be pretty much the same | 17:16 |
noonedeadpunk | Potential issue here is that we have quite some public networks from different subnets, thus multiple gateways | 17:17 |
noonedeadpunk | but yeah, seems you also have quite some gateways there :D | 17:19 |
noonedeadpunk | huh, then it's really a good question why bgp ovn plugin really exists... | 17:20 |
spatel | haha! its to remove L2 and STP loops in some cases.. + NO LACP | 17:21 |
spatel | l3 gives you more control (I don't have any good example but it does put more traffic engineering in your hands) | 17:21 |
noonedeadpunk | well, given that E/W is still geneve/vxlan - you still need lacp I assume | 17:22 |
spatel | I think simple EVPN with anycast gateway give you good start | 17:22 |
noonedeadpunk | as well as more ways to fuck up | 17:22 |
noonedeadpunk | but yeah | 17:23 |
spatel | why do you need LACP? you will have ECMP (bgp multipath) | 17:23 |
noonedeadpunk | from compute to tor switch? | 17:23 |
spatel | yes | 17:23 |
noonedeadpunk | oh, well | 17:24 |
spatel | from compute to switch you will have L3 link not LACP | 17:24 |
noonedeadpunk | ok, I guess I'm not getting that part | 17:24 |
spatel | advantage is you don't need TOR switch with multi-chassis requirement.. | 17:24 |
noonedeadpunk | as previously you wrote "compute machines are standard servers in L2" | 17:25 |
spatel | https://ltomasbo.wordpress.com/2021/02/04/ovn-bgp-agent-testing-setup/ | 17:25 |
spatel | look at this diagram - the compute has a /32 IP (that is the magic of BGP) | 17:26 |
noonedeadpunk | ah, nah, scrap bgp plugin for now:) I was thinking about anycast option you've proposed :D | 17:26 |
spatel | Trust me, stay away from ovn-bgp and wait for the native OVN BGP feature whenever it comes out | 17:27 |
noonedeadpunk | and that article is exactly ovn-bgp, right? | 17:28 |
noonedeadpunk | sorry for the stupid and annoying questions, just want to ensure I didn't mix everything up | 17:28 |
spatel | yes, it's the python-based ovn-bgp plugin | 17:28 |
spatel | a script, in short | 17:28 |
noonedeadpunk | ++ | 17:29 |
spatel | https://github.com/luis5tb/bgp-agent.git | 17:29 |
noonedeadpunk | Though I'm still not sure how tenant networks will work. Like it feels you'd need another FRR instance to announce these separately from what ovn-bgp will be doing | 17:29 |
noonedeadpunk | yeah, today this has moved into the openstack namespace: https://opendev.org/openstack/ovn-bgp-agent | 17:30 |
noonedeadpunk | but it's pretty much same thing | 17:30 |
noonedeadpunk | spatel: ok, but in your env you have lacp? or also computes are just /32 for vxlan nets? | 17:32 |
spatel | yes I have LACP | 17:32 |
noonedeadpunk | ++ | 17:32 |
spatel | My tor switch support multi-chassis (Cisco VPC) | 17:33 |
spatel | So I have LACP with active-active | 17:33 |
noonedeadpunk | Yeah, I think we have quite similar cisco switches | 17:33 |
noonedeadpunk | ok, yes, I kinda like that design way more than ovn-bgp, so thanks for this great idea ;) | 17:34 |
noonedeadpunk | Now I just need to figure out way for IPv6 :D | 17:35 |
spatel | IPv6 for what | 17:35 |
spatel | why do you need ipv6? | 17:36 |
noonedeadpunk | we have some customers running IPv6 only workloads even | 17:36 |
spatel | sure ipv6 should work.. I am running ipv6 | 17:36 |
spatel | just like standard way ipv4 | 17:37 |
noonedeadpunk | though we had them announced through BGP in OVS :( | 17:38 |
noonedeadpunk | (bgp dr-agent) | 17:38 |
* noonedeadpunk needs to recall why we did it this way in the first place | 17:38 | |
admin1 | "They want fips to be routed directly to core switches or smth" -- if 2 services talk to each other, won't this make the whole traffic go upto the spine and back via the leaf to the node ? | 17:49 |
jrosser | noonedeadpunk: for v6 a whole externally addressed subnet sits behind a tenant router | 17:53 |
jrosser | so the bgp dr agent thing told the upstream router which open stack external router ip was the next hop for each tenant /64 | 17:54 |
noonedeadpunk | yeah, probably that | 17:54 |
jrosser | no nat like there would be for v4 | 17:55 |
noonedeadpunk | This potentially should not be needed with ovn, given distributed FIP, or? | 17:56 |
noonedeadpunk | (when provider networks are reachable from computes) | 17:57 |
noonedeadpunk | but there's no nat overall I assume in ovn... but dunno | 18:06 |
noonedeadpunk | spatel: sorry, me again :D So, you have the provider net gateway announced from leaf switches with anycast, right? And how are you handling storage/mgmt traffic as well as vxlan/geneve itself? | 18:35 |
noonedeadpunk | just static routes with next hop of leaf switch? | 18:35 |