*** mhen_ is now known as mhen | 01:30 | |
opendevreview | Merged openstack/whitebox-tempest-plugin master: Verify vTPM creation after svc restart https://review.opendev.org/c/openstack/whitebox-tempest-plugin/+/927306 | 02:39 |
opendevreview | Martin Kopec proposed openstack/tempest master: Drop centos 8 stream jobs https://review.opendev.org/c/openstack/tempest/+/923152 | 07:29 |
opendevreview | Martin Kopec proposed openstack/tempest master: Parametrize target_dir for the timestamp https://review.opendev.org/c/openstack/tempest/+/913757 | 07:57 |
opendevreview | Martin Kopec proposed openstack/tempest master: Add releasenotes page for version 40.0.0 https://review.opendev.org/c/openstack/tempest/+/928886 | 08:03 |
opendevreview | Merged openstack/tempest master: Fix AttributeError with 'SSHExecCommandFailed' https://review.opendev.org/c/openstack/tempest/+/927424 | 09:33 |
opendevreview | Martin Kopec proposed openstack/tempest master: Parametrize target_dir for the timestamp https://review.opendev.org/c/openstack/tempest/+/913757 | 11:19 |
*** whoami-rajat_ is now known as whoami-rajat | 14:04 | |
opendevreview | Takashi Kajinami proposed openstack/devstack master: Create s3 endpoints in swift https://review.opendev.org/c/openstack/devstack/+/928926 | 14:36 |
opendevreview | Takashi Kajinami proposed openstack/devstack master: Create s3 endpoints in swift https://review.opendev.org/c/openstack/devstack/+/928926 | 14:39 |
opendevreview | Takashi Kajinami proposed openstack/devstack master: Create s3 endpoints in swift https://review.opendev.org/c/openstack/devstack/+/928926 | 14:40 |
opendevreview | Takashi Kajinami proposed openstack/devstack master: Create s3 endpoints in swift https://review.opendev.org/c/openstack/devstack/+/928926 | 14:46 |
frickler | clarkb: fungi: another failure that looks related to raxflex, likely because there is no IPv6 available there https://zuul.opendev.org/t/openstack/build/7d06160f4a0d4ea180378da764f9b661 | 14:46 |
frickler | was discussed in the neutron channel earlier, but I'm pretty sure it needs a fix in devstack. will take a closer look tomorrow unless one of you is faster | 14:48 |
opendevreview | Takashi Kajinami proposed openstack/devstack master: Create s3 endpoints in swift https://review.opendev.org/c/openstack/devstack/+/928926 | 14:48 |
fungi | don't we have any other providers without ipv6 routed? | 14:48 |
frickler | I don't think so. one could argue that nowadays nodes without IPv6 are broken and we should defer raxflex usage until this is fixed. but I still think it is a bug in devstack to have that assumption baked in | 14:52 |
clarkb | inmotion has no ipv6 | 14:52 |
clarkb | ovh has it but I'm not sure if we configure it because for the longest time there wasn't the necessary info available in config drive | 14:52 |
clarkb | you had to get the details from the neutron/nova api directly and then configure things statically (no RAs either) | 14:52 |
clarkb | I think that may have changed transparently for us at some point but I haven't confirmed it | 14:53 |
JayF | Is it possible that Rackspace is forced disabling IPv6 at a kernel level? | 14:54 |
JayF | I know it was practice to do that when I worked there in some environment | 14:54 |
clarkb | we control the kernel | 14:54 |
JayF | Good to know | 14:54 |
fungi | yeah. the kernel command line and kernel package are all part of the images nodepool/dib builds for us | 14:55 |
clarkb | I think we enable things by default since a cloud that only does RAs and no config drive should work too | 14:55 |
opendevreview | Brian Haley proposed openstack/tempest master: Wait for instance ports to become ACTIVE https://review.opendev.org/c/openstack/tempest/+/928471 | 14:58 |
clarkb | an answer to a random stackoverflow question says that debian bookworm complains at times when you edit the interface live (e.g. after it is up'd) | 15:03 |
dtantsur | that would explain the randomness | 15:08 |
clarkb | this is an ubuntu jammy node though | 15:09 |
clarkb | rather than being ipv6 related, since I'm pretty sure inmotion at least is in the same boat, could it be the network device type? | 15:10 |
clarkb | similar to what we saw with ephemeral and swap devices being confused, perhaps we've got a different type of network device and that has different behaviors? | 15:10 |
opendevreview | Ihar Hrachyshka proposed openstack/devstack master: Dump sysctl in worlddump https://review.opendev.org/c/openstack/devstack/+/928929 | 15:10 |
clarkb | module: virtio_net is what is in use on that job and the mtu is small: 1442 https://zuul.opendev.org/t/openstack/build/7d06160f4a0d4ea180378da764f9b661/log/zuul-info/host-info.controller.yaml#386 | 15:13 |
clarkb | could it be that you are trying to set a larger mtu than can transit that "physical" device? | 15:13 |
clarkb | trying to find an example from a kvm host on a different cloud next | 15:13 |
clarkb | from a random job https://zuul.opendev.org/t/openstack/build/a5d6e96a8ba74b2cbfdf43a8d298c8d8/log/zuul-info/host-info.primary.yaml#335-336 this is openmetal (sorry I kept saying inmotion, it's too early to get names right, I meant openmetal) without ipv6 but the mtu is 1500 | 15:15 |
clarkb | so thats my best guess at the moment that the MTU is the problem | 15:15 |
clarkb | thats the same ubuntu jammy test node type using the same virtio_net kernel module in a different cloud but also without ipv6 | 15:16 |
JayF | that is a pretty good thought | 15:20 |
haleyb | clarkb: one thing (bug) we have seen in neutron recently is that when the MTU is below 1280, IPv6 config fails, but this should be well above that | 15:21 |
clarkb | also note that both interfaces have link local ipv6 so we haven't killed ipv6 in the kernel | 15:22 |
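A quick way to check what clarkb notes here, that IPv6 is still enabled in the kernel even without a global address (generic iproute2/sysctl commands, interface names omitted):

    ip -6 addr show scope link               # fe80::/10 addresses present means the kernel still does IPv6
    sysctl net.ipv6.conf.all.disable_ipv6    # 0 means IPv6 has not been disabled globally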
clarkb | haleyb: ya its 1442 which should be plenty of headroom for a couple extra layers of vxlan nesting :) | 15:22 |
frickler | humm, having MTU < 1500 for tenant networks is also a bug IMNSHO | 15:23 |
clarkb | I would say lack of global ipv6 and smaller mtus are not ideal, but things should work regardless | 15:25 |
clarkb | I mean I don't have native ipv6 from my isp yet | 15:25 |
haleyb | i think you'll usually see 1450-ish in a VM unless you have a jumbo overlay, and that's been ok for years | 15:25 |
clarkb | and smaller mtus are common with dsl aiui | 15:25 |
JayF | those assumptions get less true with Ironic, I think. I've gotta look but I think there's a bug around MTU path discovery in OVN or OVS? /me needs to look at notes to remember | 15:25 |
JayF | this is likely why Ironic breaks: I suspect we're manually setting an MTU somewhere | 15:26 |
clarkb | this is still good feedback that we can give rackspace, but I wouldn't say it's broken, just not ideal, and the jobs should handle it because you don't know what people will have on their laptop at home. It's perfectly valid for my local reproducer at home to have no ipv6 and a smaller mtu | 15:26 |
JayF | While we are figuring out how to make the Ironic job happy on flex, is there something we can do to reduce the impact to our CI in the meantime? | 15:26 |
clarkb | JayF: set the job to non voting maybe? | 15:27 |
clarkb | you can't easily exclude a cloud from running your jobs (the only way you can approximate it here is to use the nested virt label since raxflex doesn't participate in that but likely will soon) | 15:27 |
clarkb | and I don't think we should turn off a cloud until we have evidence it is doing something wrong and I don't see any evidence of that yet | 15:28 |
dtantsur | That's not one job, it's many devstack jobs failing at random | 15:28 |
clarkb | dtantsur: are they all ironic devstack jobs? neutron is supposed to find the smallest mtu and calculate what its overlay should be | 15:28 |
dtantsur | I think so | 15:29 |
clarkb | (to address this specific issue because this isn't the first time we've had cloud resources with small MTUs) | 15:29 |
opendevreview | James Parker proposed openstack/whitebox-tempest-plugin master: Update docstring for VirtQEMUdManager https://review.opendev.org/c/openstack/whitebox-tempest-plugin/+/928930 | 15:29 |
clarkb | (but also again it makes things work on laptops in a home environment where we don't have as much say on MTU size) | 15:29 |
dtantsur | we definitely have a ton of MTU logic in our devstack plugin (TheJulia may have more context on it) | 15:29 |
clarkb | also this is almost entirely a problem of our own design :) if openstack didn't use cloud overlays so aggressively then it would be less common to cut into the MTU size | 15:30 |
dtantsur | https://opendev.org/openstack/ironic/commit/cf074202e50426365c761326c8d2ccfcce4ad916 and https://opendev.org/openstack/ironic/commit/40e825ba93c8fb0bcc2d9ef0428b64a46286e0c0 | 15:30 |
clarkb | something something nova-network | 15:30 |
dtantsur | :D | 15:30 |
clarkb | dtantsur: ok we should look at the logs from the failing job to see if those variables are calculated properly | 15:31 |
clarkb | https://zuul.opendev.org/t/openstack/build/7d06160f4a0d4ea180378da764f9b661/log/job-output.txt#2557-2560 | 15:32 |
clarkb | that value is smaller than 1280 | 15:32 |
dtantsur | yeah, it's the ironic calculation https://zuul.opendev.org/t/openstack/build/7d06160f4a0d4ea180378da764f9b661/log/controller/logs/devstacklog.txt#656 | 15:33 |
dtantsur | if I only remembered why we did that... | 15:33 |
clarkb | the comment says you are handling ipv6 tunnels which use an overhead of 100 bytes vs vxlan's 50 | 15:34 |
clarkb | I don't think we use ipv6 tunnels in CI unless ironic is doing that themselves (the default CI overlay networking is all vxlan because it works everywhere) | 15:34 |
clarkb | maybe local_mtu should try something like max(calculated_value, 1280) to see if the issue haleyb calls out is the problem? | 15:35 |
dtantsur | this is the point where I can only drop on the floor, cry and hope that TheJulia is around | 15:35 |
dtantsur | networking is not my strength :( | 15:35 |
dtantsur | but I can try the max() logic sure | 15:36 |
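A minimal sketch of the max() clamp being discussed, assuming the ironic devstack plugin computes the value in bash; the variable names are taken from the conversation and may not match the plugin exactly:

    # hypothetical clamp: never let the derived MTU drop below the IPv6 minimum of 1280 (RFC 8200)
    local_mtu=$(( PUBLIC_BRIDGE_MTU - 100 ))
    if [ "$local_mtu" -lt 1280 ]; then
        local_mtu=1280
    fi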
JayF | it'd be hilarious in an incredibly sad kinda way if we run outta room in the MTU | 15:37 |
JayF | dtantsur: I think clarkb and haleyb hit the nail on the head, I would +2 such a change (assuming it passed CI) | 15:37 |
dtantsur | let's see https://review.opendev.org/c/openstack/ironic/+/928931 | 15:38 |
JayF | I was reading thru scrollback and read the number we were setting and thought "that's too low for linux to accept" before I even saw the earlier comments | 15:38 |
JayF | we have the weirdest problems :) | 15:38 |
clarkb | PPPoE is the standard that often results in smaller MTUs in the home network environment (fwiw) | 15:38 |
clarkb | and is often used with dsl | 15:39 |
* dtantsur does not miss PPPoE | 15:39 | |
haleyb | with just ipv4 you can set mtu to 576 i think, but i did just double-check that setting to 1279 and adding an IPv6 address fails with the EINVAL error | 15:40 |
clarkb | haleyb: thanks! that is probably the issue then | 15:40 |
haleyb | PUBLIC_BRIDGE_MTU=1272 - oh, yuck, yeah that's probably it | 15:42 |
haleyb | too bad the message from /sbin/ip isn't more helpful | 15:42 |
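A rough reproduction of the check haleyb describes, using a throwaway dummy interface; the EINVAL behaviour is as reported above, not independently verified here:

    ip link add dev mtu-test type dummy
    ip link set dev mtu-test mtu 1279
    ip addr add 2001:db8::1/64 dev mtu-test   # expected to fail: RTNETLINK answers: Invalid argument
    ip link set dev mtu-test mtu 1280
    ip addr add 2001:db8::1/64 dev mtu-test   # expected to succeed once the MTU meets the IPv6 minimum
    ip link del dev mtu-test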
clarkb | I'll just have to file that away into the esoteric but useful knowledge bin and hope i remember it next time | 15:44 |
haleyb | at least it has moved out of the neutron tribal knowledge bucket, i literally have my third patch up fixing this issue in neutron, we already changed the API to return a 40x if the mtu is too low for the address family | 15:47 |
JayF | I wouldn't assume because Ironic is in on it that you've expanded the family much ;) | 15:48 |
JayF | I think Ironic and Neutron have always been closer than many other projects since we share a lot of similar problems + ngs / nb ironic projects | 15:48 |
haleyb | neutron + 1 > neutron is all i can hope for :) if that "- 100" changed to be more specific, like IPv6 + Geneve it would probably solve this, let me just double-check that number | 15:53 |
JayF | I think we're probably OK with going lower than needed now that we have a minimum to ensure no breaking | 15:53 |
JayF | I still think there's a piece at work here that Julia was mentioning around needing an even lower MTU than makes sense in OVN use cases | 15:54 |
haleyb | so with OVN, the minimum geneve header is 38 bytes, so adding IPv6 (40) makes it 78 | 15:56 |
haleyb | for example, if you spun-up a multi-node devstack with OVN, IPv4 overlay puts tenant mtu at 1442, IPv6 overlay at 1422, which jives with what i've seen | 15:58 |
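The arithmetic behind those two numbers, using haleyb's figures of 38 bytes for the geneve header as OVN uses it plus the outer IP header:

    echo $(( 1500 - 38 - 20 ))   # 1442: tenant MTU over an IPv4 geneve underlay
    echo $(( 1500 - 38 - 40 ))   # 1422: tenant MTU over an IPv6 geneve underlay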
TheJulia | dtantsur: huh what?! | 16:24 |
TheJulia | dtantsur: sorry, heads down in a gnarly backport down to train, what's up? | 16:24 |
dtantsur | TheJulia: I think we solved it, sorry for bothering | 16:24 |
dtantsur | tl;dr the MTU logic drove our MTU too low to be usable | 16:25 |
TheJulia | in what case?!? | 16:25 |
dtantsur | if my patch works, we're saved. if not, we're.. well.. not | 16:25 |
TheJulia | where?! | 16:25 |
TheJulia | okay | 16:25 |
TheJulia | if it doesn't, do we start a streaming drinking party? | 16:25 |
* TheJulia is game, has outdoor TV and 4k camera and everything. | 16:25 | |
dtantsur | I think it will be our only option :D | 16:26 |
TheJulia | Excellent! | 16:26 |
TheJulia | please summon if it doesn't work, in the mean time, I'm teleporting my brain back to the land of Train | 16:26 |
dtantsur | choochooo! | 16:26 |
TheJulia | chooooo chooooo (Sign when you've lost remaining sanity!) | 16:27 |
clarkb | dtantsur: haleyb JayF TheJulia so one thing to consider is that vxlan is 50 bytes. Ironic is doing overlays to simulate a network for testing purposes. You don't need to support every option available including ipv6 + geneve. You just need one that works for CI. Why not use vxlan consistently since it is small and has worked for years? | 16:43 |
clarkb | that said if you can reduce the -100 math to -78 that may be good enough too | 16:43 |
clarkb | my point is more that unlike a real cloud deployment that may want to support different tooling to accommodate different environments, we can be highly opinionated in what we use for our test-specific overlays, and one criterion may be to choose the most byte-efficient option to avoid problems like this | 16:44 |
JayF | I'm game to have this discussion, but we need Julia to have time to contexualize it and participate | 16:44 |
JayF | I believe some of this was added specifically for our OVN job, but I'm not sure | 16:44 |
TheJulia | If memory serves, when OVN is in the mix it gets encapsulated again and we need to artificially deflate the networking because otherwise it assumes it has 1500 bytes at the wire when with virtual networking it does not | 16:45 |
haleyb | clarkb: ipv4 + vxlan is 50 bytes, ipv6 + vxlan is 70, ipv6 + geneve is 78 which is where i got that number from | 16:45 |
clarkb | haleyb: got it | 16:46 |
TheJulia | (i.e. OVN always assumes it has a bare interface, which is kind of bonkers, but I can see where they came from) | 16:46 |
clarkb | TheJulia: oh you can't tie an OVN interface and a tunnel interface together with a bridge or some sort? | 16:46 |
clarkb | anyway I think it is worth considering using the smallest possible tunnel option when building a fake l2 network for multinode testing and that would be vxlan aiui | 16:47 |
TheJulia | clarkb: you can, it just doesn't grok the mtu is anything less | 16:47 |
clarkb | we don't actually need to support different tooling as long as the other different tooling can run over the top of that opinionated overlay | 16:47 |
clarkb | TheJulia: right you have to manually configure the MTUs lower because without an l3 interface/device there is nothing to respond with icmp fragmentation packets | 16:48 |
TheJulia | well, the bottom line is it doesn't know how to do the reduction in size so we artificially knock the mtu down so the host configures itself because ovn, at least when we looked, entirely lacked support to say "use a smaller mtu" | 16:48 |
TheJulia | which again, is bonkers | 16:48 |
clarkb | since we're joining many l2 connections together we have to manually configure the lowest mtu across all of them | 16:48 |
TheJulia | I have a list of bonkers things | 16:48 |
TheJulia | someplace... | 16:49 |
clarkb | right that is an l2 vs l3 problem and it affects all the things not just ovn | 16:49 |
TheJulia | yeah, but OVN prevents the ability to discover it out of the box | 16:49 |
clarkb | because fragmentation responses rely on icmp, which relies on having an ip address to send them from, which implies an l3 interface | 16:49 |
TheJulia | yup | 16:49 |
TheJulia | which OVN in some cases just doesn't *really* have | 16:50 |
clarkb | right this is true with neutron and ovs or linux bridges too | 16:50 |
clarkb | and it just means we have to manually configure the appropriate min mtu value on all the devices | 16:50 |
TheJulia | well, in those cases, you do have the interfaces with the real bindings so it inherently just sort of works | 16:50 |
TheJulia | because your networking node, or at least attached namespaces, reply | 16:50 |
clarkb | not in the CI setup | 16:50 |
TheJulia | "oh, your mtu is too big!" | 16:50 |
clarkb | or with neutron just out of the box | 16:51 |
clarkb | we struggled with this for years until finally neutron implemented management of mtus explicitly on all the devices iirc | 16:51 |
TheJulia | it requires explicit configuration | 16:51 |
TheJulia | yup | 16:51 |
TheJulia | so when configured, it got asserted and magical happiness and joyous sea shanties ensued | 16:51 |
clarkb | anyway since we can control the CI specific overlay setup we can do things like use vxlan over ipv4 as a rule | 16:51 |
clarkb | now I think it is fair to say maybe we should also allow for vxlan + ipv6 since people may want to run this locally as well | 16:52 |
clarkb | so we could do a 70 byte subtraction for each layer and then get back 30 bytes for each layer which should be enough headroom | 16:52 |
TheJulia | well, you can't control the ovn interaction side of it because it doesn't grok the lower mtu on what it asserts | 16:52 |
haleyb | please use IPv6 + Geneve or OVN won't work | 16:52 |
clarkb | haleyb: why can't you tunnel that over a vxlan tunnel? | 16:53 |
* TheJulia steps away due to the corgi attempting to sign the bark chain | 16:53 | |
clarkb | that was my earlier point: ovn and geneve shouldn't care if you give them an interface that just happens to be part of a bridge with a vxlan tunnel to another node | 16:53 |
clarkb | the current multinode network overlay setup for zuul jobs uses ipv4 + vxlan | 16:54 |
haleyb | clarkb: are you talking about native encapsulation? or is this about just shoving packets into an overlay? | 16:54 |
clarkb | that presents an interface to the jobs on each node in the job with an mtu of host mtu - 50. Then you run ovn or whatever else you want against that interface and potentially reduce the mtu further to handle the extra layer of nesting | 16:54 |
haleyb | if you have ipv4 + vxlan it will work I suppose, but there will be fragments | 16:54 |
clarkb | haleyb: I'm talking about multinode testing in zuul being given a fake l2 network to make the test setup look more like reality since we can't provide them actual l2 networks like that from the cloud providers | 16:55 |
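A generic sketch of that kind of fake l2 link between two test nodes, in plain iproute2 rather than the actual zuul role; names and addresses are illustrative:

    # on node A; mirror on node B with local/remote swapped
    ip link add vx-peer type vxlan id 42 dstport 4789 local 203.0.113.10 remote 203.0.113.20
    ip link add name br-fake-l2 type bridge
    ip link set vx-peer master br-fake-l2
    ip link set vx-peer mtu 1450       # host MTU 1500 minus 50 bytes of IPv4 + vxlan overhead
    ip link set br-fake-l2 mtu 1450
    ip link set vx-peer up
    ip link set br-fake-l2 up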
clarkb | haleyb: why would there be fragments if you know the parent mtu is 1450 for example and then geneve + ipv6 needs another 78 bytes you'd configure the innermost neutron overlays to use 1450-78 byte mtus | 16:55 |
TheJulia | clarkb: my context https://github.com/openstack/ironic/blob/master/doc/source/admin/ovn-networking.rst#maximum-transmission-units | 16:56 |
clarkb | let me see if I can find the old documentation for this in devstack | 16:56 |
haleyb | ok, so a zuul overlay network | 16:56 |
TheJulia | I've not dug into the bug recently to see if there are any updates | 16:56 |
clarkb | haleyb: essentially yes, and my understanding is that ironic configures one of these because they are doing fake baremetal vms that should all live on the same "physical" network | 16:56 |
clarkb | essentially we've got two layers here. The outermost is approximating the physical cables between hosts in a virtual environment. Then you've got the inner tenant isolation cloud network overlays that run over that | 16:57 |
clarkb | we all get confused because the outer layer is actually an overlay layer too because we don't have physical cables between hosts nor do we have fancy control of tenant networks in half the clouds | 16:57 |
JayF | the OVN bug Julia points at I think is a primary root cause of this being more nonsensical in Ironic | 16:58 |
clarkb | I don't think that is actually an ovn bug | 16:58 |
clarkb | it's just how icmp and mtu fragmentation work and unfortunately it requires us to work around it | 16:58 |
haleyb | clarkb: ok, makes sense more now, thanks | 16:58 |
JayF | https://github.com/ovn-org/ovn/blob/main/TODO.rst disagrees clarkb | 16:58 |
JayF | at least they count handling fragmentation on outgoing packets as a todo | 16:59 |
JayF | I don't know enough about OVN to know how many layers are there; but it's directly in that projects' todo doc | 16:59 |
clarkb | JayF: but you have to manually configure it anyway is my point | 16:59 |
clarkb | yes they may not implement the actual icmp fragmentation protocol, but that doesn't matter because you're going to have to manually configure things if you have any l2 only devices | 16:59 |
haleyb | that TODO might be a little out of date if you ask me, but there are some issues where we've not seen a packet-too-big where we expect it | 16:59 |
clarkb | haleyb: https://opendev.org/openstack/devstack-gate/src/commit/9cfd5cca0a3b1dbfe8f1fefd836942d20425f172/multinode_setup_info.txt here is the really old docs on this | 16:59 |
clarkb | JayF: in the case of neutron + ovs or linux bridge we have had the same issues in CI because you end up with like one l3 device for every 5 l2 devices | 17:00 |
clarkb | and the only way to make that reliable is to manually configure the lower mtu across the board and not rely on icmp and automatic fragmentation | 17:00 |
JayF | I think that's what we were trying to do | 17:00 |
JayF | we just set the limbo bar lower than the floor | 17:01 |
clarkb | yes. And my point is we can optimize it even further by using vxlan because it is lighter weight | 17:01 |
haleyb | https://github.com/openstack/neutron/blob/master/doc/source/ovn/gaps.rst - see the section on fragmentation/path mtu there, which can lead to tenant packet issues, but i digress into the weeds | 17:01 |
clarkb | we don't need to support ovn + geneve at that layer as long as ovn + geneve can overlay on top of the original overlay so we end up with 50 bytes + 78 bytes overhead instead of 78 + 78 | 17:01 |
clarkb | as a side note discussing this stuff is about 100x easier with the ability to draw pictures. Text makes it difficult | 17:02 |
haleyb | yes, i would agree that ipv4+vxlan should be fine for your virtual overlay, then anything on top should fit in without going below 1280 | 17:03 |
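A back-of-the-envelope check that the nesting described above stays over the IPv6 floor, with illustrative numbers:

    host_mtu=1500
    fake_l2_mtu=$(( host_mtu - 50 ))     # outer CI overlay: IPv4 + vxlan -> 1450
    tenant_mtu=$(( fake_l2_mtu - 78 ))   # inner tenant overlay: IPv6 + geneve -> 1372
    echo "$tenant_mtu"                   # 1372 >= 1280, so IPv6 still works for the guests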
JayF | yeah, I am having trouble tracking it, but TBH Ironic <> Neutron stuff is a weak point for me I need to beef up on anyway | 17:03 |
clarkb | historically I think the main thing people get confused about is that we're typically ending up with two distinct layers of networks in the CI jobs. The outer layer is a set of overlays that may or may not use the same overlay tech as the "workload" networks but its job is to be there and ensure l2 access amongst the test nodes approximating them all being physically connected | 17:05 |
clarkb | on the same switch | 17:05 |
clarkb | then we have the "workload" networking layer which may or may not run over that "physical" layer, which is providing all of your tenant networks to your VMs or baremetal devices | 17:05 |
clarkb | the "physical" layer is test/CI specific and we don't need to support more than one tool/technology there. | 17:06 |
JayF | the place where it all turns to mush for me is actually in the deepest layers | 17:06 |
JayF | I've done a lot of networking from a "managing a linux-based load balancer/firewall" perspective but less from the "magic cloud-y switch" perspective :D | 17:07 |
clarkb | honestly a lot of the old school stuff maps over well. You have bridges as switches and veth pairs as cables. Most of the problems arise from having a bunch of different implementations for all of these things which sometimes don't play nice together or with standard tooling like tcpdump | 17:09 |
clarkb | haleyb: can you tcpdump ovn interfaces or is it like ovs and you have to set up a tap device to bridge between standard networking tooling and the special stuff? | 17:10 |
JayF | well contributing to this is I've literally never worked a place with an openstack cloud that used an upstream-style ironic+neutron networking solution (well, until $curJob, but I'm not really operationally involved at all in that cloud) :D | 17:13 |
haleyb | clarkb: well there is ovs-tcpdump which makes things a little easier | 17:14 |
clarkb | I guess that is an improvement. Personally I've always found it useful to do something like tcpdump -i any for 30 seconds then read through the capture to see what is happening, and ovs in particular breaks that, which is unfortunate because you can use that as a method to understand how things work | 17:16 |
JayF | I bet that is more helpful in the devstack case. Working on edge servers I often found the signal:noise in a tcpdump was too bad to be useful (especially when the customer won't accept 'your firewall is randomly sending us tcp resets, what are we supposed to do?!' as an answer to their support ticket :D ) | 17:17 |
clarkb | JayF: typically I capture all the things then you can use a tool like wireshark to filter and see different views like "what did dhcp do" or "was there any http traffic" and so on | 17:18 |
clarkb | because yes just looking at a huge raw packet capture all at once is difficult | 17:19 |
clarkb | but if you have all that data then you can start filtering and piece together where things are moving across the different interfaces to, say, negotiate dhcp | 17:19 |
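The capture-everything-then-filter workflow being described, roughly; filenames and filters are illustrative:

    timeout 30 tcpdump -i any -w capture.pcap      # grab everything for 30 seconds
    tcpdump -r capture.pcap 'port 67 or port 68'   # "what did dhcp do"
    tcpdump -r capture.pcap 'tcp port 80'          # "was there any http traffic"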
JayF | back when I was doing this as my day job (we're talking like, 2008-ish), I actually loved the microsoft tool for viewing pcaps, it was way faster than wireshark, at least on windows :D | 17:19 |
haleyb | what, you can't pick a needle out of a 10G pcap haystack with tcpdump? :-p | 17:20 |
opendevreview | Merged openstack/tempest master: Add releasenotes page for version 40.0.0 https://review.opendev.org/c/openstack/tempest/+/928886 | 17:30 |
TheJulia | clarkb: so, the underlying issue is the linux kernel routing code drops pmtu packets if it can't address them back to the source, that was free with OVS with the model of a network namespace being involved where the routing code would come into effect | 17:37 |
* TheJulia is barely following the thread | 17:37 | |
clarkb | that may be new functionality in ovs then? We definitely had to manually explicitly set mtus on all the things with ovs and also with linux bridge | 17:37 |
clarkb | because that would only happen if every interface was configured with an l3 address allowing it to icmp properly | 17:38 |
clarkb | I'm pretty sure that neutron manages all of this explicitly now as a result | 17:39 |
clarkb | I left a comment on dtantsur's change basically suggesting instead of 100 bytes trimmed off you use 78 to accommodate geneve + ipv6, then that will get you more than 1280 bytes for the mtu in the current situation. Also suggested leaving a comment there that 1280 is the min for ipv6 | 17:42 |
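What that review comment amounts to, sketched with the same illustrative variable names as above:

    # hypothetical: subtract only what IPv6 + geneve actually needs instead of a flat 100
    local_mtu=$(( PUBLIC_BRIDGE_MTU - 78 ))   # 78 = 38 (geneve as OVN uses it) + 40 (IPv6 header)
    # 1280 stays the hard floor; below that the kernel refuses IPv6 addresses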
TheJulia | oh, if there is a mismatch and the parent layers dropped stuff, it could not just magically work, thus requiring explicit config to know | 17:42 |
TheJulia | the more layers, the more places stuff can get dropped at | 17:42 |
clarkb | I sent an email to rackspace about the issue just to keep them in the loop and I think this is valuable feedback for public clouds if people hit issues even if they can't fix it immediately or at all | 18:45 |
opendevreview | Goutham Pacha Ravi proposed openstack/devstack-plugin-ceph master: Skip tempest image format tests https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/928303 | 18:51 |
opendevreview | Merged openstack/whitebox-tempest-plugin master: Update docstring for VirtQEMUdManager https://review.opendev.org/c/openstack/whitebox-tempest-plugin/+/928930 | 19:18 |
opendevreview | Sreelakshmi Menon Kovili proposed openstack/whitebox-tempest-plugin master: Discard the cpu-0 from dedicated set https://review.opendev.org/c/openstack/whitebox-tempest-plugin/+/927641 | 20:42 |