*** mhen_ is now known as mhen | 02:03 | |
Zeth | Hi. I have an issue I was hoping someone could help with. We have a frontend that talks to the API. Sometimes when creating a stack, it takes too long and times out, then the stack gets stuck in DELETE_FAILED due to the disk being in the wrong state. Additionally, it sometimes gives 504 responses... In the Heat logs I can see it times out waiting for replies to messages. The logs indicate that the reply queue is missing, and I cannot find anything about | 07:27 |
---|---|---|
Zeth | said queue in the RabbitMQ logs. Similar to the issue described here: https://bugzilla.redhat.com/show_bug.cgi?id=1484543 | 07:27 |
Zeth | Is there anyone that can point me in the right direction? | 07:27 |
*** Guest5395 is now known as i0annis | 13:01 | |
i0annis | is thinking it is awfully quiet here... scared to say hello so as not to pollute the serenity... | 13:03 |
i0annis | Here goes nothing then :) - Hello everyone! Been pondering on this for several days and thought to ping here: | 13:08 |
i0annis | if you create an instance on the provider network and you have confirmed that, from a networking perspective, the interface can reach that network (i.e. the gateway), the security groups are set, a route has been added, and other instances in that CIDR communicate fine, where would you look if you cannot ping the instance from the compute node, the controller or the gateway itself...? | 13:12 |
DHE | I start by checking ARP. Either arping on the same layer 2 vlan, or just try to ping and check the ARP table afterwards regardless of success. Most firewalls ignore it. If an ARP entry is present, then you can suspect firewalls/ACLs/Security groups | 13:18 |
i0annis | Much appreciated @DHE. By present I'd assume you mean with the corresponding mac address? | 13:31 |
DHE | yes | 13:34 |
DHE | I mean... if there's a different mac address, that's an IP conflict. | 13:34 |
i0annis | currently it comes back incomplete... so it feels like it's something before that... ovs maybe? ml2? really, I've gone back and forth so many times I'm thinking of deleting the neutron database and doing it again... | 13:41 |
DHE | if you're using ovs, I'm assuming you're using a vlan based networking for your provider network(s)? | 13:46 |
i0annis | error wise I only see ovs|00002|db_ctl_base|ERR|multiple rows in Manager match "ptcp:6640:127.0.0.1" in neutron-openvswitch-agent but it is still running and i've read it can be ignored | 13:47 |
DHE | when you created the network, you named it something, I presume "provider" or such. and in the ML2 config you had to specify a mapping that "provider" maps to an OVS bridge named br-provider or something. You've attached a real NIC to br-provider, right? | 13:48 |
i0annis | aye | 13:48 |
i0annis | and from the controller instance I can ping through that interface the gateway of the provider network... | 13:50 |
DHE | which is interesting because normally that's not possible. br-provider will have switch ACL rules (aka openflow rules) programmed by neutron which will do all kinds of shenanigans that would make that port unusable | 13:51 |
DHE | that is, unusable under general rules. the host wouldn't be able to make use of it | 13:52 |
DHE | are you making 802.1q vlan devices from the physical port and the controller runs on that? | 13:52 |
i0annis | well... it goes through the ens not the br-provider | 13:52 |
DHE | that is, you're using `ip link add [..] type vlan` ? | 13:54 |
i0annis | no... it is as simple as it could ever get... a default untagged interface connected to the controller, compute (and block) as provider in their own CIDR | 13:55 |
DHE | okay, not quite the way I thought... so you'd either set it up as a flat network or match vlan numbers in ovs | 13:58 |
i0annis | flat indeed | 13:59 |
i0annis | vanilla network | 14:00 |
i0annis | I'd paste things here but it could get noisy and I'd hate that... you are welcome to take a look at a tmate session if you like, and we can share those findings back here. I can see the other tags in ovs but I | 14:02 |
i0annis | expect it to be like that | 14:02 |
i0annis | i.e Port qr-7d3cd66d-ff tag: 1 | 14:04 |
jrosser | you can use paste.opendev.org | 14:06 |
i0annis | my bad... has been a while since irc https://paste.opendev.org/show/bRAo12fS50OLIRIrrvhs/ | 14:07 |
i0annis | the mapping between br-provider and the interface (ens33) seems to be working fine, since it was able to identify the MAC address of the gateway through the bridge... | 14:14 |
i0annis | https://paste.opendev.org/show/b2ovyPmwwZx3V0MnNsmv/ | 14:15 |
DHE | which implies it might be a vlan tag thing... none are present on either br-provider or ens33 so it's using the magic "no tag" vlan (which is considered separate from an actual vlan number) | 14:32 |
DHE | side note, I would not trust br-provider to work. I'm a bit surprised it does | 14:33 |
i0annis | technically the network itself was created as flat - openstack network create --share --external --provider-physical-network provider --provider-network-type flat provider - and if I understand ovs correctly, it is meant to take the flat traffic and transfer it to the tagged one... - I can try tagging and see where it takes me... | 14:50 |
i0annis | any suggestions replacing ovs? | 14:51 |
DHE | I do like OVS. full vlan support, as long as you're plugging into a vlan capable switch as well. | 14:51 |
DHE | ironically I know less about flat networks than vlan networks... maybe I should stop lest I give bad advice | 14:52 |
i0annis | nah... can't get any worse - don't worry :p | 14:53 |
i0annis | could it matter that I'm using only one compute node? | 15:03 |
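
The ARP triage DHE describes (check the neighbour table after a ping, then branch on what is there) can be sketched as a small Python helper. This is only an illustration, not anything posted in the channel: the function name `classify_neighbor`, the result labels, and the sample addresses and MACs are all hypothetical, and it assumes the usual one-entry-per-line text printed by `ip neigh show`.

```python
# Hypothetical sample of `ip neigh show` output; addresses and MACs are made up.
SAMPLE = (
    "203.0.113.1 dev ens33 lladdr 52:54:00:aa:bb:cc REACHABLE\n"
    "203.0.113.50 dev ens33  INCOMPLETE\n"
)

# What each result suggests, following the reasoning in the conversation.
HINTS = {
    "match": "ARP resolves to the expected MAC: suspect firewalls/ACLs/security groups",
    "conflict": "ARP resolves to a different MAC: likely an IP conflict",
    "incomplete": "no ARP reply: look at layer 2 (bridge mapping, VLAN tagging, OVS flows)",
    "missing": "no neighbour entry at all: ping the address first, then re-check",
}


def classify_neighbor(ip_neigh_output, target_ip, expected_mac):
    """Classify the neighbour-table entry for target_ip.

    Returns one of "match", "conflict", "incomplete" or "missing".
    """
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        if not fields or fields[0] != target_ip:
            continue
        if "lladdr" in fields:
            idx = fields.index("lladdr")
            if idx + 1 < len(fields):
                mac = fields[idx + 1].lower()
                # MAC comparison is case-insensitive.
                return "match" if mac == expected_mac.lower() else "conflict"
        # An entry without a link-layer address (INCOMPLETE/FAILED) means
        # the ARP request was never answered.
        return "incomplete"
    return "missing"


if __name__ == "__main__":
    result = classify_neighbor(SAMPLE, "203.0.113.50", "fa:16:3e:00:00:01")
    print(result, "->", HINTS[result])
```

On a real host the text would come from running `ip neigh show` on the compute or controller node right after a ping attempt. The "incomplete" branch corresponds to what i0annis reports at 13:41.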