openstackgerrit | Jacky Hu proposed openstack/neutron-lbaas-dashboard master: Replace noop tests with registration test https://review.openstack.org/591261 | 00:37 |
*** longkb has joined #openstack-lbaas | 00:38 | |
openstackgerrit | Jacky Hu proposed openstack/neutron-lbaas-dashboard master: Replace noop tests with registration test https://review.openstack.org/591261 | 01:08 |
openstackgerrit | Jacky Hu proposed openstack/neutron-lbaas-dashboard master: Removes testr and switches cover to karma-coverage https://review.openstack.org/570442 | 01:22 |
*** colby_home has quit IRC | 01:29 | |
*** hongbin has joined #openstack-lbaas | 02:19 | |
*** ramishra has joined #openstack-lbaas | 03:53 | |
*** ramishra has quit IRC | 03:54 | |
*** ramishra has joined #openstack-lbaas | 03:54 | |
*** rcernin has quit IRC | 04:54 | |
*** rcernin has joined #openstack-lbaas | 04:54 | |
*** openstackgerrit has quit IRC | 05:18 | |
*** hongbin has quit IRC | 05:20 | |
*** pcaruana has joined #openstack-lbaas | 05:59 | |
*** pcaruana has quit IRC | 06:05 | |
*** openstackgerrit has joined #openstack-lbaas | 06:12 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/neutron-lbaas master: Imported Translations from Zanata https://review.openstack.org/590173 | 06:12 |
*** pcaruana has joined #openstack-lbaas | 06:19 | |
*** ispp has joined #openstack-lbaas | 06:52 | |
*** nmagnezi has quit IRC | 07:01 | |
*** rcernin has quit IRC | 07:03 | |
*** nmagnezi has joined #openstack-lbaas | 07:20 | |
*** nmagnezi has quit IRC | 07:26 | |
*** nmagnezi has joined #openstack-lbaas | 07:32 | |
*** rpittau has joined #openstack-lbaas | 07:43 | |
*** pcaruana has quit IRC | 07:43 | |
*** pcaruana has joined #openstack-lbaas | 07:57 | |
*** ispp has quit IRC | 08:11 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia-tempest-plugin master: Update links in README.rst https://review.openstack.org/575030 | 08:14 |
*** ktibi has joined #openstack-lbaas | 08:37 | |
*** salmankhan has joined #openstack-lbaas | 09:00 | |
openstackgerrit | Nir Magnezi proposed openstack/octavia master: Leave VIP NIC plugging for keepalived https://review.openstack.org/589292 | 09:16 |
*** ktibi has quit IRC | 09:30 | |
*** ktibi has joined #openstack-lbaas | 09:31 | |
*** rpittau has quit IRC | 09:36 | |
*** rpittau has joined #openstack-lbaas | 09:36 | |
*** celebdor has joined #openstack-lbaas | 10:18 | |
*** numans has quit IRC | 10:37 | |
*** numans has joined #openstack-lbaas | 10:41 | |
*** dmellado has quit IRC | 10:46 | |
*** sapd1 has joined #openstack-lbaas | 11:16 | |
*** longkb has quit IRC | 11:34 | |
*** amuller has joined #openstack-lbaas | 12:01 | |
*** rpittau is now known as rpittau|afk | 12:10 | |
*** rpittau|afk is now known as rpittau | 12:10 | |
*** celebdor has quit IRC | 12:17 | |
*** dmellado has joined #openstack-lbaas | 12:50 | |
*** celebdor has joined #openstack-lbaas | 12:58 | |
*** ramishra has quit IRC | 13:04 | |
*** ramishra has joined #openstack-lbaas | 13:48 | |
*** savvas has joined #openstack-lbaas | 13:51 | |
*** fnaval has joined #openstack-lbaas | 14:00 | |
*** devfaz has joined #openstack-lbaas | 14:45 | |
devfaz | Hi, are there any open bugs regarding heat-tempest-lbaasv2-tests against octavia? | 14:46 |
*** savvas has quit IRC | 15:23 | |
*** pcaruana has quit IRC | 15:37 | |
*** devfaz has left #openstack-lbaas | 16:00 | |
*** devfaz has joined #openstack-lbaas | 16:00 | |
*** pcaruana has joined #openstack-lbaas | 16:40 | |
*** openstackgerrit has quit IRC | 17:19 | |
*** celebdor has quit IRC | 17:23 | |
*** jiteka has quit IRC | 17:31 | |
*** eandersson has quit IRC | 17:32 | |
*** jiteka has joined #openstack-lbaas | 17:34 | |
*** ktibi has quit IRC | 17:34 | |
*** ianychoi_ has joined #openstack-lbaas | 17:38 | |
*** ianychoi has quit IRC | 17:41 | |
*** pcaruana has quit IRC | 17:59 | |
*** salmankhan has quit IRC | 18:15 | |
mnaser | are Guru Meditation reports not available in octavia-health-manager? | 19:06 |
johnsom | mnaser: I am still out on vacation until tomorrow. I thought we had it in all of them. | 19:13 |
mnaser | johnsom: no worries, enjoy it. i might be hitting https://review.openstack.org/#/c/576388/ | 19:14 |
mnaser | i have like 500 threads per octavia-hm process | 19:14 |
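For context, Guru Meditation reports come from oslo.reports, and a service enables them with a one-time setup call at startup. A minimal sketch of the usual wiring, assuming the standard oslo.reports API; the exact module path for Octavia's version info is an assumption, and whether the Queens health manager actually does this is precisely mnaser's question:

```python
# Minimal sketch of how OpenStack services typically enable Guru
# Meditation reports via oslo.reports; not verified against the
# octavia-health-manager entry point on stable/queens.
from oslo_reports import guru_meditation_report as gmr

from octavia import version  # assumed location of Octavia's version info


def main():
    # Registers a signal handler (SIGUSR2 by default) that dumps a
    # report of threads, greenthreads, and config on demand.
    gmr.TextGuruMeditation.setup_autorun(version)
    # ... normal service startup continues here ...
```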
*** lxkong_ has joined #openstack-lbaas | 19:17 | |
*** lxkong has quit IRC | 19:25 | |
*** ptoohill- has quit IRC | 19:25 | |
*** lxkong_ is now known as lxkong | 19:25 | |
*** amotoki has quit IRC | 19:27 | |
*** amotoki has joined #openstack-lbaas | 19:30 | |
colby_ | When addressing Octavia directly via the CLI, I specified the subnet ID, but the network ID and IP address do not match the subnet (it fails with "port not found" when trying to build the LB) | 19:32 |
colby_ | One thing I was wondering: does the management network need to belong to the service project (where the octavia user is) or the admin project? The docs said admin. | 19:33 |
colby_ | sorry, I'm trying to spin up an LB with openstack loadbalancer create | 19:34 |
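For reference, the invocation being described looks roughly like this (the name and UUID are placeholders); --vip-subnet-id is one of several mutually exclusive ways to tell Octavia where the VIP should live:

```shell
$ openstack loadbalancer create --name test-lb \
      --vip-subnet-id <subnet-uuid>
```

Given only a subnet ID, Octavia creates the VIP port on that subnet itself, which lines up with lxkong's answer further down.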
*** salmankhan has joined #openstack-lbaas | 19:39 | |
*** salmankhan has quit IRC | 19:44 | |
*** amuller has quit IRC | 19:56 | |
*** salmankhan has joined #openstack-lbaas | 20:32 | |
rm_work | mnaser: which version are you on? | 21:12 |
rm_work | i think you said you run Master? | 21:12 |
mnaser | rm_work: queens | 21:12 |
rm_work | ah | 21:12 |
mnaser | But up to date from stable branch | 21:12 |
rm_work | hmm | 21:12 |
rm_work | yeah i thought we got backports in for all the applicable HM issues I had found... | 21:13 |
mnaser | This keeps happening... do you think it could be related to the machine having a high core count? | 21:13 |
rm_work | you shouldn't have that many threads though :P | 21:13 |
rm_work | unless by threads you mean, actual processes | 21:13 |
rm_work | because i switched it from threading to multiprocess | 21:13 |
mnaser | It has 40 cores. I end up with like a ton of health manager processes with like 500 or so threads inside each | 21:13 |
rm_work | (specifically because threads were broken) | 21:14 |
rm_work | hmm | 21:14 |
mnaser | I would do ps -T -p pid and it would read some 500 entries | 21:14 |
rm_work | O_o | 21:14 |
rm_work | umm | 21:14 |
rm_work | let me check something | 21:14 |
mnaser | strace with -fF shows it detaching from a ton of things when I Ctrl+C | 21:14 |
mnaser | Also, during all this, I get the warning that health check processing took too long | 21:15 |
rm_work | yeah wtf hold on | 21:16 |
rm_work | no, confirmed it should be using processes not threads on queens | 21:22 |
rm_work | there's no way you should have that many threads | 21:23 |
rm_work | can you spot-check the code to make sure it looks like this? https://github.com/openstack/octavia/blob/stable/queens/octavia/amphorae/drivers/health/heartbeat_udp.py#L66-L67 | 21:24 |
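Paraphrasing from memory what the two linked lines should contain on stable/queens (not a verbatim copy; the config option names are from Octavia's health_manager group as I recall them):

```python
# The fix rm_work refers to: process pools instead of thread pools for
# the health/stats update work, sized by config rather than unbounded.
from concurrent import futures

from oslo_config import cfg

CONF = cfg.CONF

health_executor = futures.ProcessPoolExecutor(
    max_workers=CONF.health_manager.health_update_threads)
stats_executor = futures.ProcessPoolExecutor(
    max_workers=CONF.health_manager.stats_update_threads)
```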
*** abaindur has joined #openstack-lbaas | 21:29 | |
colby_ | will octavia create the ports automatically for the vip in neutron? | 21:32 |
*** rcernin has joined #openstack-lbaas | 22:01 | |
*** abaindur has quit IRC | 22:04 | |
*** abaindur has joined #openstack-lbaas | 22:04 | |
lxkong | colby_: I think so; you can specify a subnet ID and Octavia will create the port itself. | 22:12 |
colby_ | hmm, OK, it does not seem to be working correctly; I'm trying to narrow it down. I'm getting a Port Not Found error. | 22:13 |
*** rtjure has quit IRC | 22:35 | |
rm_work | johnsom: so when you're back tomorrow -- having issues with SOURCE_IP algorithm | 22:44 |
rm_work | it seems to always just direct all traffic to one member regardless of anything | 22:45 |
johnsom | rm_work: stuck in traffic and not driving, so can chat | 22:45 |
rm_work | 3 members, all seem up and good, but only one receiving ALL traffic (from many many different source IPs) | 22:45 |
rm_work | lol | 22:45 |
rm_work | THAT stuck? T_T | 22:45 |
johnsom | Not pushing 5mph | 22:46 |
rm_work | T_T | 22:46 |
rm_work | well, i can paste you stuff later, but | 22:46 |
rm_work | i'm at a loss... verified that everything is good in the config and health is good for all members via the socket stats | 22:46 |
rm_work | everything looks like it SHOULD be balancing | 22:46 |
rm_work | but it just isn't <_< | 22:46 |
rm_work | I don't know how to get further insight into why it's making the routing decisions it is | 22:47 |
johnsom | So, wait, source_ip is session persistence, so yeah, should be stuck | 22:47 |
rm_work | source_ip *balancing alg* | 22:47 |
rm_work | on the pool | 22:47 |
rm_work | it's HTTPS with no termination | 22:47 |
rm_work | no session persistence on the pool | 22:48 |
rm_work | but yes, it should still "do session persistence" kinda | 22:48 |
rm_work | except, LITERALLY ALL TRAFFIC from hundreds of different source IPs, all going to member 1 | 22:48 |
rm_work | nothing to any other member | 22:48 |
rm_work | that isn't "balancing" lol | 22:48 |
johnsom | Hmm, so source_IP is hash based. How many members and what are their weights? | 22:50 |
rm_work | 3 members, all weight 1 | 22:50 |
rm_work | removing the member that is receiving 100% of the traffic appears to cause all traffic to redirect to one of the remaining two <_< | 22:50 |
johnsom | So divide the IP address range into three buckets; that is how it will balance, if my memory is good | 22:51 |
rm_work | ok | 22:51 |
rm_work | so what it IS doing, is this: | 22:51 |
rm_work | traffic comes in -> goes to bucket 1 | 22:51 |
johnsom | Right, you would have a 50-50 chance | 22:51 |
rm_work | ok | 22:51 |
rm_work | so now imagine we have hundreds of unique IPs hitting it | 22:52 |
rm_work | and ALL of them are going to bucket 1 | 22:52 |
johnsom | By hash though | 22:52 |
rm_work | ... | 22:52 |
rm_work | so let's see what the probability is | 22:52 |
rm_work | that all 300 or so unique IPs that have hit the LB | 22:52 |
rm_work | all hash to the same member of those 3 | 22:52 |
rm_work | that's 33%... | 22:52 |
rm_work | ^300 | 22:53 |
rm_work | right? | 22:53 |
johnsom | Could happen, but would be odd. | 22:53 |
rm_work | roughly 1 in 1.4e143 | 22:53 |
rm_work | sooooo PROBABLY NOT lol | 22:53 |
johnsom | It is skewed though, as large blocks of possible IPs are reserved, etc | 22:53 |
rm_work | which brings me back to, something is up with source_Ip | 22:53 |
rm_work | ummmmm | 22:54 |
rm_work | so even within a block | 22:54 |
rm_work | like | 22:54 |
rm_work | imagine 64 sequential IPs | 22:54 |
rm_work | i can't imagine any hashing algorithm that isn't absolute *trash* that would hash all 64 the same | 22:54 |
rm_work | "oh, that IP starts with 172.157, guess it hashes to 3" | 22:55 |
rm_work | https://i.imgur.com/wZ6XzO7.png | 22:58 |
rm_work | we're so far beyond "maybe randomly it just worked out that way" that i can't even begin to describe how ridiculous that suggestion is, lol | 22:59 |
rm_work | maybe if it was like 4 source_IPs *maybe* | 22:59 |
rm_work | but even that is statistically pretty unlikely | 22:59 |
rm_work | like, ~1% | 22:59 |
rm_work | this is hundreds of source IPs, and not a single byte of traffic to any other member | 23:00 |
johnsom | I am not saying it randomly worked out that way, I am saying the hash for source IP by default is pretty dumb. | 23:01 |
johnsom | Check the haproxy docs for the balance and hash-type keywords | 23:02 |
johnsom | It is intended, in default, to be a poor mans session persistence | 23:02 |
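The relevant backend directives from those docs, sketched as they would appear in a rendered config (illustrative names and addresses, not necessarily Octavia's exact template output):

```
backend test-pool
    balance source
    # map-based sdbm is the implicit default; shown here for clarity
    hash-type map-based sdbm
    server member1 10.0.0.11:443 weight 1
    server member2 10.0.0.12:443 weight 1
    server member3 10.0.0.13:443 weight 1
```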
rm_work | yeah | 23:03 |
rm_work | when we have no way to TLS terminate | 23:03 |
johnsom | I would expect that most of the common IPs fit in one bucket with three servers | 23:03 |
rm_work | and someone has a requirement for session persistence :P | 23:03 |
rm_work | you would actually expect their hashing algorithm to be so bad that 100% of IPs hash the same? | 23:04 |
johnsom | You can use round robin with source ip SP | 23:04 |
johnsom | On TCP flows | 23:04 |
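What that suggestion would look like in haproxy terms, sketched (the stick-table size and expiry are illustrative, not Octavia defaults):

```
backend test-pool
    balance roundrobin
    # source-IP session persistence via a stick table instead of
    # source-hash balancing; new clients are still round-robined
    stick-table type ip size 10k expire 30m
    stick on src
```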
rm_work | I will have them try that | 23:05 |
rm_work | but like | 23:05 |
rm_work | should we *disable* source_ip as an algorithm? | 23:05 |
rm_work | if it is this broken? | 23:05 |
rm_work | this is unusable | 23:05 |
johnsom | Well, I can’t see your IP list, but a bucket is a third of the IP space, with one bucket probably being mostly reserved addresses | 23:06 |
rm_work | not just "bad" | 23:06 |
rm_work | one sec | 23:06 |
johnsom | It isn’t broken, I bet it is doing exactly what it is configured to do | 23:06 |
rm_work | ok | 23:08 |
rm_work | i got an IP list | 23:08 |
rm_work | I was wrong, it's not hundreds | 23:08 |
johnsom | Which is not what you want or expect | 23:08 |
rm_work | it's 2855 | 23:08 |
rm_work | 2855 different source IPs | 23:08 |
rm_work | and every single connection, 100%, went to a single member | 23:08 |
johnsom | Yeah, that is a pretty small spread | 23:08 |
rm_work | I can give you the list if you want | 23:08 |
rm_work | are you fucking kidding me right now? lol | 23:08 |
rm_work | almost 3000 IPs | 23:08 |
johnsom | Well, I am on mobile so no use | 23:09 |
rm_work | and "ah yeah, they all hash the same, that makes sense"??? | 23:09 |
rm_work | that is a completely broken hashing algorithm | 23:09 |
rm_work | that's like a random number generator where the code is "return 5 # because I like the number 5" | 23:09 |
johnsom | Sigh | 23:10 |
johnsom | 3k clustered IPs, out of 4.29 billion addresses. Draw two horizontal lines on a whiteboard and divide the IP range evenly, starting with 0.0.0.0 in the first section and ending with 255.255.255.255 in the bottom section. | 23:12 |
*** fnaval has quit IRC | 23:13 | |
johnsom | That is how you have it configured right now | 23:13 |
rm_work | that's not a hashing algorithm at all | 23:13 |
rm_work | so you're saying the SOURCE_IP algorithm isn't hash based | 23:14 |
rm_work | ok no, what you are saying does not line up with what is in the HAProxy configuration manual | 23:16 |
rm_work | "The source IP address is hashed and divided by the total weight of the running servers to designate which server will receive the request." | 23:17 |
johnsom | Ok, well maybe I am wrong. I can’t exactly research right now | 23:17 |
johnsom | Yeah, that quote is what I was saying | 23:17 |
rm_work | but it's *hashed* | 23:18 |
rm_work | "the hash table is a static array containing all alive servers. The hashes will be very smooth, will consider weights, but will be static in that weight changes while a server is up will be ignored. This means that there will be no slow start." | 23:18 |
johnsom | That does not mean it has an even distribution | 23:19 |
rm_work | "sdbm this function was created initially for sdbm (a public-domain reimplementation of ndbm) database library. It was found to do well in scrambling bits, causing better distribution of the keys and fewer splits. It also happens to be a good general hashing function with good distribution" | 23:19 |
johnsom | Making an int from an IP and dividing by the buckets is a hash | 23:20 |
johnsom | Is sdbm the default? | 23:20 |
rm_work | yes | 23:20 |
rm_work | "The default hash type is "map-based" and is recommended for most usages. The | 23:20 |
rm_work | default function is "sdbm", the selection of a function should be based on | 23:20 |
rm_work | the range of the values being hashed." | 23:20 |
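A quick way to sanity-check the distribution claim being argued here: run sdbm over packed IPv4 addresses and bucket modulo three servers. This assumes haproxy feeds the binary address to the hash function, which matches the docs quoted above but is not verified against the haproxy source:

```python
# sdbm as commonly written: h = byte + (h << 6) + (h << 16) - h,
# truncated to 32 bits. Sequential addresses should scatter across
# buckets if the hash is behaving reasonably.
import ipaddress


def sdbm(data: bytes) -> int:
    h = 0
    for byte in data:
        h = (byte + (h << 6) + (h << 16) - h) & 0xFFFFFFFF
    return h


for i in range(1, 11):  # ten sequential addresses from a documentation range
    ip = ipaddress.ip_address("203.0.113.%d" % i)
    print(ip, "-> bucket", sdbm(ip.packed) % 3)
```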
*** salmankhan has quit IRC | 23:25 | |
*** hvhaugwitz has quit IRC | 23:31 | |
*** hvhaugwitz has joined #openstack-lbaas | 23:31 | |
abaindur | johnsom: We made some progress on setting up Octavia after the issues from last week, too | 23:41 |
abaindur | but now i had some questions regarding tenancy | 23:41 |
abaindur | we have to give the octavia user some credentials in the service_auth section, with a tenant specified. Our octavia user is in the services tenant | 23:43 |
abaindur | The issue is, the LBs come up in this tenant. Is there a way to have them come up in the tenant of the user who created the LBs? | 23:43 |
abaindur | Otherwise they are hidden from this tenant | 23:44 |
abaindur | (the amphora VMs) | 23:44 |
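The [service_auth] section being described, sketched with placeholder values (these are standard keystoneauth options; whether to use "service" or a dedicated project is exactly the open question in this exchange):

```ini
[service_auth]
auth_url = http://keystone.example.com:5000/v3
auth_type = password
username = octavia
password = <secret>
project_name = service
user_domain_name = Default
project_domain_name = Default
```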
johnsom | No, it is intended that these are not visible to the tenant and do not impact their service quotas | 23:45 |
johnsom | This is by design | 23:45 |
abaindur | second, does the LB mgmt network need to be shared, or owned by whatever tenant the octavia user is in? I am assuming if the "services" tenant doesn't have access to this network, the amphora can't be booted up on this network | 23:45 |
johnsom | Amphora are not a concept users should be aware of | 23:46 |
abaindur | ah... ok | 23:46 |
abaindur | So I'm guessing we should create a special LB tenant that the octavia user is part of when we specify the creds (since we don't want it in the services tenant), and we should create the LB network as owned by this tenant | 23:47 |
johnsom | Lb-mgmt is purely for internal Octavia use, so it only needs to be available to the octavia service account | 23:47 |
johnsom | Yeah, I think OSA uses services project and octavia username. But projects are cheap... grin | 23:48 |
abaindur | How does the amphora attach to the backend networks we want to load balance to, btw? | 23:49 |
abaindur | Here I am assuming they will not be shared networks, but isolated tenant networks | 23:49 |
johnsom | However, if you do not use the service project, you may need to do some RBAC config in the other services that is usually already handled for the service project | 23:50 |
abaindur | Now that our amphora and LB mgmt net are owned by services (or whatever new tenant we use), can they still attach to the non-shared networks used by the backend servers? | 23:50 |
abaindur | which might be in "tenantA" or B, etc... | 23:51 |
johnsom | We have RBAC rights to attach those tenant networks | 23:51 |
abaindur | I don't believe we are using RBAC... we're using the old admin_or_owner policy.json that was supplied | 23:51 |
johnsom | That is RBAC, just a simple rule set | 23:51 |
johnsom | These RBAC settings are not in Octavia, but the other services | 23:52 |
johnsom | Nova, neutron, glance, barbican, etc. | 23:53 |
johnsom | The service project usually has these rules already in those services | 23:53 |
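For reference, the admin_or_owner rule abaindur mentions is the classic one-liner in oslo.policy JSON syntax, roughly:

```json
{
    "context_is_admin": "role:admin",
    "admin_or_owner": "is_admin:True or project_id:%(project_id)s"
}
```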
abaindur | ah, OK, thanks. Sorry, my knowledge of RBAC is limited :) | 23:55 |
lxkong | johnsom, rm_work, hi could you please take a look at https://storyboard.openstack.org/#!/story/2003413? I'd like to hear your suggestions | 23:57 |
*** irenab has quit IRC | 23:58 | |
*** abaindur has quit IRC | 23:58 | |
*** abaindur has joined #openstack-lbaas | 23:58 | |
johnsom | I can look tomorrow. Still on vacation and disappearing again now | 23:59 |
lxkong | johnsom: kk, have fun :-) | 23:59 |