opendevreview | Arnaud Morin proposed openstack/large-scale master: Move configuration guidelines https://review.opendev.org/c/openstack/large-scale/+/862100 | 11:43 |
opendevreview | Arnaud Morin proposed openstack/large-scale master: Move configuration guidelines https://review.opendev.org/c/openstack/large-scale/+/862100 | 12:12 |
amorin | hey felixhuettner[m] and other large-scalers, I was wondering if you had a specific configuration on sql pool_size and other related config params? | 12:14 |
amorin | https://docs.openstack.org/nova/latest/configuration/config.html#database.max_pool_size | 12:14 |
amorin | we are using mariadb on our side, some of our biggest regions are consuming a LOT of SQL connection to the DB | 12:15 |
amorin | I am not yet sure why, but I have the feeling that some tuning needs to be done on that part | 12:15 |
felixhuettner[m] | we have some tuning on our mariadb-galera side that we could share | 12:16 |
felixhuettner[m] | on the openstack service side its mostly about the connections we want to hold open | 12:16 |
amorin | that would be awesome | 12:16 |
felixhuettner[m] | although that's more random guessing of what works best (since there are no metrics for that) | 12:16 |
amorin | we see sometimes in our neutron logs that the workers reached the pool limit | 12:17 |
amorin | especially under heavy load | 12:17 |
amorin | I can't understand that, as I was expecting each worker to have its own pool | 12:18 |
amorin | and to stop consuming more RPC messages when the pool limit is reached | 12:18 |
felixhuettner[m] | yep i think so too | 12:18 |
felixhuettner[m] | these should all be separate processes | 12:18 |
amorin | from what I can see, the workers are still consuming messages even if there are no more connections in the pool | 12:19 |
amorin | have you seen such things? | 12:19 |
amorin | in your deployment? | 12:19 |
felixhuettner[m] | not that i know of | 12:19 |
amorin | ack | 12:19 |
felixhuettner[m] | we have max_pool_size 15 and max_overflow 25 set | 12:19 |
felixhuettner[m] | and a gigantic amount of workers | 12:20 |
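(Editor's note: the pool settings mentioned above map to the oslo.db `[database]` section that nova, neutron, etc. all read. A minimal sketch, using the values felixhuettner[m] quotes:)

```ini
[database]
# Connections each worker process keeps open in its SQLAlchemy pool.
max_pool_size = 15
# Additional connections a worker may open temporarily under load;
# they are closed again once returned to the pool.
max_overflow = 25
```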
amorin | do you monitor the number of connections on the DB? | 12:20 |
felixhuettner[m] | i think over all neutron-api instances we are at 300 workers :D | 12:20 |
felixhuettner[m] | yep, give me a sec to check what we effectively have there | 12:20 |
amorin | we have something similar; we reduced the pool size (and overflow) to lower the number of connections on the DB, because of RAM issues | 12:20 |
amorin | anyway, having something like 4k connections only for neutron services is huge | 12:21 |
amorin | and seems like bad behavior/design to me | 12:21 |
felixhuettner[m] | it's around 1k connections to neutron database | 12:21 |
felixhuettner[m] | and i guess most of yours are idle as well? | 12:22 |
amorin | 300 workers with 15+25 | 12:22 |
amorin | you can reach something like 12,000 | 12:22 |
amorin | in theory | 12:22 |
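(Editor's note: each worker process has its own pool, so the theoretical worst case is workers × (max_pool_size + max_overflow). A quick check with the numbers quoted above:)

```python
# Worst-case DB connections if every worker fills both its pool and
# its overflow at the same time (values quoted in the conversation).
workers = 300
max_pool_size = 15
max_overflow = 25

peak_connections = workers * (max_pool_size + max_overflow)
print(peak_connections)  # 12000
```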
felixhuettner[m] | yep, gladly that never happened :D | 12:22 |
amorin | it happened to us unfortunately | 12:22 |
amorin | e.g. restarting all neutron agents in one shot | 12:23 |
amorin | or: imagine a drop on RPC, all agents are reviving and doing a full sync at the same time | 12:23 |
felixhuettner[m] | we had that a few times (as we killed our whole rabbit clusters) | 12:24 |
felixhuettner[m] | there it spiked up to 1,600 connections | 12:24 |
amorin | how many computes do you have? | 12:24 |
amorin | computes/agents | 12:24 |
felixhuettner[m] | roughly 400 | 12:24 |
felixhuettner[m] | and a few more agents for l3 | 12:24 |
amorin | ok | 12:25 |
amorin | we have a region with more than 2k, with 2 ovs agents, 1 l3 agent and few dhcp agents | 12:26 |
felixhuettner[m] | poor 1 l3 agent :) | 12:26 |
amorin | 1 l3 per compute | 12:26 |
felixhuettner[m] | what did you set for `connection_recycle_time`? | 12:26 |
felixhuettner[m] | ooooh, ok thats a lot :) | 12:26 |
amorin | our db suffered a lot | 12:27 |
amorin | for connection_recycle_time we just lowered it a little bit, to something like 55 minutes | 12:27 |
amorin | what about you? | 12:27 |
felixhuettner[m] | ok, we have 280 seconds | 12:27 |
amorin | wow | 12:27 |
amorin | so you are recycling connections a lot | 12:28 |
felixhuettner[m] | maybe that helps to keep them down | 12:28 |
amorin | for sure | 12:28 |
felixhuettner[m] | but it heavily helps us to do failovers of databases | 12:28 |
amorin | what about the overhead to establish the connection? | 12:28 |
felixhuettner[m] | since we just need a few min to drain | 12:28 |
amorin | makes sense | 12:29 |
felixhuettner[m] | we did not notice anything yet, although we might not have detailed enough data | 12:29 |
amorin | ok, that's a good point anyway, I'll consider changing this | 12:29 |
felixhuettner[m] | and our haproxy in front of the galera cluster has a connection timeout of 300 seconds, so we don't drop things there | 12:30 |
amorin | we set this timeout to 1H IIRC, that's why we lowered it a little bit | 12:30 |
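(Editor's note: the two timeouts need to agree: `connection_recycle_time` in oslo.db should stay below the idle timeout of any proxy between the services and the database, otherwise the proxy silently drops connections the pool still considers live. A sketch with the values quoted above:)

```ini
[database]
# Proactively recycle pooled connections; keep this below the idle
# timeout of the haproxy (300 s here) sitting in front of galera,
# so the pool never hands out a connection the proxy already dropped.
connection_recycle_time = 280
```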
felixhuettner[m] | +1 | 12:30 |
felixhuettner[m] | do you also run a galera cluster? | 12:31 |
felixhuettner[m] | or native mariadb? | 12:31 |
amorin | yup | 12:31 |
amorin | galera | 12:31 |
felixhuettner[m] | do you also set `gcomm.thread_prio` under `wsrep_provider_options`? | 12:32 |
amorin | hum good question | 12:32 |
felixhuettner[m] | that is the biggest stability gain for our galera cluster that we found | 12:33 |
amorin | I can't see anything like that on our deployment | 12:33 |
felixhuettner[m] | did your galera cluster ever break because of a lot of connections? | 12:33 |
felixhuettner[m] | because it missed heartbeats? | 12:34 |
amorin | yes we had such issues | 12:34 |
felixhuettner[m] | you will want | 12:34 |
felixhuettner[m] | `wsrep_provider_options=gcomm.thread_prio=rr:2` | 12:34 |
felixhuettner[m] | but then you need to start the database with CAP_SYS_NICE | 12:34 |
felixhuettner[m] | then the replication thread gets realtime priority set | 12:34 |
felixhuettner[m] | and client connections don't break the replication anymore | 12:35 |
felixhuettner[m] | (depending on your kernel settings you might need some cgroup magic) | 12:35 |
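(Editor's note: a sketch of the pieces described above; the file paths and unit name are illustrative, adapt them to your distribution. The Linux capability that allows raising scheduling priority is spelled CAP_SYS_NICE:)

```ini
# galera config fragment, e.g. /etc/mysql/conf.d/galera.cnf (path illustrative)
[mysqld]
# Run the galera replication thread with SCHED_RR realtime priority 2,
# so floods of client connections cannot starve it into missing heartbeats.
wsrep_provider_options = "gcomm.thread_prio=rr:2"

# systemd drop-in, e.g. /etc/systemd/system/mariadb.service.d/override.conf
[Service]
# Allow mysqld to raise thread scheduling priorities.
AmbientCapabilities=CAP_SYS_NICE
LimitRTPRIO=infinity
```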
amorin | nice! | 12:35 |
felixhuettner[m] | i guess i'll write that up for the docs | 12:36 |
amorin | nice tip, maybe it would be worth adding that to the docs | 12:36 |
amorin | you were faster :) | 12:36 |
opendevreview | Felix Huettner proposed openstack/large-scale master: Add guide for galera configuration https://review.opendev.org/c/openstack/large-scale/+/862141 | 13:11 |
felixhuettner[m] | there you go | 13:11 |
felixhuettner[m] | would be interested where you see differences | 13:13 |
amorin | nice, thanks! I will talk with the team about this, and I will let you know | 20:15 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!