Friday, 2024-10-04

*** mhen_ is now known as mhen [02:03]
<Zeth> Hi. I have an issue I was hoping someone can help with. We have a frontend that talks to the API. Sometimes when creating a stack, it takes too long and times out, then the stack gets stuck in DELETE_FAILED due to the disk being in a wrong state. Additionally it sometimes gives 504 responses... In the Heat logs I can see it times out waiting for replies to messages. The logs indicate that the reply queue is missing, and I cannot find anything about said queue in the rabbitmq logs. Similar to the issue described here: https://bugzilla.redhat.com/show_bug.cgi?id=1484543 [07:27]
<Zeth> Is there anyone that can point me in the right direction? [07:27]
*** Guest5395 is now known as i0annis [13:01]
* i0annis is thinking it is awfully quiet here... scared to say hello so as not to pollute the serenity... [13:03]
<i0annis> Here goes nothing then :) - Hello everyone! Been pondering this for several days and thought to ping here: [13:08]
<i0annis> if you create an instance on the provider network and you have confirmed that, from a networking perspective, the interface can reach that network, i.e. the gateway, the security groups are set, a route has been added, and other instances in that CIDR communicate fine, where would you look if you cannot ping the instance from compute, controller or the gateway itself...? [13:12]
<DHE> I start by checking ARP. Either arping on the same layer 2 vlan, or just try to ping and check the ARP table afterwards regardless of success. Most firewalls ignore ARP. If an ARP entry is present, then you can suspect firewalls/ACLs/security groups [13:18]
<i0annis> Much appreciated @DHE. By present I'd assume you mean with the corresponding MAC address? [13:31]
<DHE> yes [13:34]
<DHE> I mean... if there's a different MAC address, that's an IP conflict. [13:34]
<i0annis> currently it comes back incomplete... so it feels like it's something before that... ovs maybe? ml2? really I've gone back and forth so many times I'm thinking of deleting the neutron database and doing it again... [13:41]
<DHE> if you're using ovs, I'm assuming you're using vlan based networking for your provider network(s)? [13:46]
<i0annis> error wise I only see ovs|00002|db_ctl_base|ERR|multiple rows in Manager match "ptcp:6640:127.0.0.1" in neutron-openvswitch-agent but it is still running and I've read it can be ignored [13:47]
<DHE> when you created the network, you named it something, I presume "provider" or such. and in the ML2 config you had to specify a mapping that "provider" maps to an OVS bridge named br-provider or something. You've attached a real NIC to br-provider, right? [13:48]
<i0annis> aye [13:48]
<i0annis> and from the controller instance I can ping the gateway of the provider network through that interface... [13:50]
<DHE> which is interesting because normally that's not possible. br-provider will have switch ACL rules (aka openflow rules) programmed by neutron which will do all kinds of shenanigans that would make that port unusable [13:51]
<DHE> that is, unusable under general rules. the host wouldn't be able to make use of it [13:52]
<DHE> are you making 802.1q vlan devices from the physical port and the controller runs on that? [13:52]
<i0annis> well... it goes through the ens not the br-provider [13:52]
<DHE> that is, you're using `ip link add [..] type vlan` ? [13:54]
<i0annis> no... it's as simple as it could ever get... a default untagged interface connected to the controller, compute (and block) as provider in their own CIDR [13:55]
<DHE> okay, not quite the way I thought... so you'd either set it up as a flat network or match vlan numbers in ovs [13:58]
<i0annis> flat indeed [13:59]
<i0annis> vanilla network [14:00]
<i0annis> I'd paste things here but it could get noisy and I'd hate that... you are welcome to take a look at a tmate session if you like, and we can share those findings back here. I can see the other tags in ovs but I expect it to be like that [14:02]
<i0annis> i.e. Port qr-7d3cd66d-ff tag: 1 [14:04]
<jrosser> you can use paste.opendev.org [14:06]
<i0annis> my bad... it has been a while since irc https://paste.opendev.org/show/bRAo12fS50OLIRIrrvhs/ [14:07]
<i0annis> the mapping between br-provider and the interface (ens33) seems to be working fine, since it was able to identify the MAC address of the gateway through the bridge... [14:14]
<i0annis> https://paste.opendev.org/show/b2ovyPmwwZx3V0MnNsmv/ [14:15]
<DHE> which implies it might be a vlan tag thing... none are present on either br-provider or ens33 so it's using the magic "no tag" vlan (which is considered separate from an actual vlan number) [14:32]
<DHE> side note, I would not trust br-provider to work. I'm a bit surprised it does [14:33]
<i0annis> technically the network itself was created as flat - openstack network create --share --external --provider-physical-network provider --provider-network-type flat provider - and if I understand ovs correctly, it is meant to take the flat traffic and transfer it to the tagged one... - I can try tagging and see where it takes me... [14:50]
<i0annis> any suggestions for replacing ovs? [14:51]
<DHE> I do like OVS. full vlan support, as long as you're plugging into a vlan capable switch as well. [14:51]
<DHE> ironically I know less about flat networks than vlan networks... maybe I should stop lest I give bad advice [14:52]
<i0annis> nah... can't get any worse - don't worry :p [14:53]
<i0annis> could it matter that I'm using only one compute node? [15:03]
