jrwr | Nova Errors like this keep me up at night -- requires a full stack reboot to fix https://hastebin.com/raw/ajefokeseh | 00:01 |
---|---|---|
*** rlandy is now known as rlandy|out | 00:07 | |
*** mhen_ is now known as mhen | 01:12 | |
jrwr | Neutron created hundreds of TAP interfaces... I go and try to reboot... kernel panics -- now I gotta wait until I get back into work to iDRAC into them | 01:26 |
*** yadnesh|away is now known as yadnesh | 03:36 | |
*** rlandy|out is now known as rlandy | 10:28 | |
jrwr | ERROR neutron.agent.linux.dhcp [req-2af2610f-2c6b-4854-b3f8-762584e210b3 - - - - -] Unable to plug DHCP port for network d4467036-0416-4526-a8d2-8867033571bd. Releasing port.: ovsdbapp.exceptions.TimeoutException: exceeded timeout 10 seconds, cause: TXN queue is full -- I wonder how I would fix this | 14:56 |
*** yadnesh is now known as yadnesh|away | 16:18 | |
jrwr | If you ever hit a bunch of TAP interfaces desync'd from Neutron that are clogging up OVS/OVN | 18:15 |
jrwr | cat openvswitch/ovs-vswitchd.log | grep "could not open network" | cut -d' ' -f6 | uniq | xargs -n1 -P1 ovs-vsctl del-port br-int | 18:15 |
jrwr | cleared out over 8k TAPs in my OVS today | 18:15 |
JayF | jrwr: Glad you figured some of that out; if you want more guidance about that nova error, you might have more luck emailing openstack-discuss@ | 18:18 |
jrwr | It's a Kolla deployment, but I'll take you up on that | 18:19 |
jrwr | I've had RabbitMQ desync, I've had Cinder, Neutron, and others all get out of sync and just hardlock deployments | 18:20 |
jrwr | it /really/ dislikes 1200 VM deployments | 18:20 |
JayF | I've absolutely used nova past that scale with success, but rabbitmq was consistently the issue I hit scaling it up. | 18:23 |
JayF | Hopefully someone on the list with more knowledge of kolla+VM-nova will be able to help out :D | 18:23 |
jrwr | The hosts are def beefy (8 hosts, each with 2 TB RAM, 200 Gbit/s networking, 256 CPU cores) | 18:24 |
lowercase | I have a much larger deployment than 1200. We don't use OVS and instead opted for an LACP bond on every hypervisor, which is port-channeled into a vPC. Then the IP addresses are BGP-routed to our VMs. | 18:55 |
lowercase | and each hypervisor has a couple of bridges and VLANs that are specific to our deployment; VLANs for customer traffic, management, database, and rabbit are all segmented. | 18:56 |
jrwr | We are doing student competitions -- so it's like 30 VMs and two networks per team (that use the same IPs) with secgroups | 18:58 |
jrwr | times 40 | 18:58 |
jrwr | god I wish I could just dump the traffic on the network | 18:58 |
lowercase | BGP would make that possible. | 19:05 |
lowercase | the OpenStack network range hands out IP addresses in a subnet that is routable by the entire internal network space. | 19:06 |
lowercase | Think floating IPs, but instead, the "floating IPs" are handed out by Neutron and assigned to the VMs dynamically. | 19:07 |
jrwr | fun part is, these networks are isolated; I don't need them to talk to anything, just need NAT for the publics | 19:22 |
jrwr | and they are all on the same L2 VLAN, so I could get away with a good bit | 19:26 |
jrwr | Oh, and it's a uni, so I don't even control my poor switches | 19:29 |
*** lifeless_ is now known as lifeless | 20:14 | |
*** rlandy is now known as rlandy|out | 22:10 | |
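The TAP-cleanup one-liner jrwr pasted at 18:15 can be sketched as a small dry-run script. This is a minimal sketch, not jrwr's exact command: the `stale_taps` helper name and the sample log lines are fabricated for illustration, and the log format is assumed to match stock ovs-vswitchd.log output. It uses `sort -u` to de-duplicate (plain `uniq` only collapses adjacent repeats) and prints the `ovs-vsctl` commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: list TAP device names that ovs-vswitchd failed to open.
# $1: path to an ovs-vswitchd.log file.
stale_taps() {
    # The device name is the 6th space-separated field of the WARN line;
    # sort -u de-duplicates even non-adjacent repeats.
    grep "could not open network" "$1" | cut -d' ' -f6 | sort -u
}

# Demo against a fabricated log excerpt (format assumed, not captured live):
cat > /tmp/ovs-vswitchd.sample.log <<'EOF'
2024-05-01T10:00:01Z|01234|bridge|WARN|could not open network device tap1a2b3c (No such device)
2024-05-01T10:00:02Z|01235|bridge|WARN|could not open network device tap4d5e6f (No such device)
2024-05-01T10:00:03Z|01236|bridge|WARN|could not open network device tap1a2b3c (No such device)
EOF

# Dry run: print one del-port command per stale TAP. Drop the "echo" to
# actually delete the ports from br-int with ovs-vsctl.
stale_taps /tmp/ovs-vswitchd.sample.log \
  | xargs -r -n1 echo ovs-vsctl del-port br-int
```

Running the demo prints `ovs-vsctl del-port br-int tap1a2b3c` and `ovs-vsctl del-port br-int tap4d5e6f`; removing the `echo` turns it into the destructive cleanup, so review the dry-run output first.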
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!