johnsom | See, this is what I don't get: When deadlock detection is enabled (the default) and a deadlock does occur, InnoDB detects the condition and rolls back one of the transactions (the victim). | 00:00 |
johnsom | So, it should only roll back one. It should still let one complete | 00:01 |
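Since InnoDB rolls back only the deadlock victim, the usual workaround on the application side is to retry the victim transaction rather than surface the error. A minimal sketch using oslo_db's wrap_db_retry (the function, table, and column names below are illustrative, not Octavia's actual code):

    from oslo_db import api as oslo_db_api
    from sqlalchemy import text

    # If this transaction loses the deadlock and is rolled back, wrap_db_retry
    # re-runs it; the winning transaction is left to complete normally.
    @oslo_db_api.wrap_db_retry(max_retries=3, retry_on_deadlock=True)
    def mark_amphora_busy(session, amphora_id):
        session.execute(
            text("UPDATE amphora_health SET busy = 1 "
                 "WHERE amphora_id = :amp_id"),
            {'amp_id': amphora_id})
        session.commit()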
johnsom | rm_work Ah, I see why it just stops.... | 00:07 |
*** gongysh has joined #openstack-lbaas | 00:08 | |
*** sshank has quit IRC | 00:08 | |
*** gongysh has quit IRC | 00:09 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Fix health monitor DB locking. https://review.openstack.org/493252 | 00:11 |
johnsom | Doesn't answer the deadlock, but will cause it to not matter as much. | 00:11 |
*** sshank has joined #openstack-lbaas | 00:13 | |
*** sshank has quit IRC | 00:22 | |
*** xingzhang has joined #openstack-lbaas | 00:25 | |
rm_work | eugh | 00:26 |
rm_work | http://paste.openstack.org/show/618236/ | 00:26 |
rm_work | followed by | 00:26 |
rm_work | http://paste.openstack.org/show/618237/ | 00:27 |
rm_work | this is spectacular | 00:27 |
rm_work | so much bug | 00:27 |
rm_work | this is what i was talking about before i think | 00:27 |
rm_work | the first one is that failovers should be able to ignore status | 00:28 |
rm_work | so it does seem to ALLOW failovers now | 00:30 |
rm_work | but that is pretty lulzy | 00:30 |
xgerman_ | yeah, looks like the wheels are coming off | 00:31 |
johnsom | The top try/catch block? | 00:31 |
johnsom | I mean, it should be ok for that health check to not get a lock, that is "normal" in a way | 00:31 |
johnsom | I probably should modify that get_stale try block to ignore the deadlock event. | 00:32 |
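Roughly what that would look like; a sketch only, with the repository and session names approximating the health manager code and a DBDeadlock treated as "another process already holds the lock":

    from oslo_db import exception as db_exc
    from oslo_log import log as logging
    from oslo_utils import excutils

    LOG = logging.getLogger(__name__)

    def get_stale_amphora(amp_health_repo, lock_session):
        try:
            amp = amp_health_repo.get_stale_amphora(lock_session)
            lock_session.commit()
            return amp
        except db_exc.DBDeadlock:
            # Another health manager won the row lock; nothing stale for
            # us this cycle, so don't treat it as an error.
            LOG.debug('Deadlock detected while checking for stale amphorae; '
                      'another process holds the lock.')
            lock_session.rollback()
            return None
        except Exception:
            with excutils.save_and_reraise_exception():
                lock_session.rollback()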
xgerman_ | it’s still saving the busy? I saw the commit in the calling func, but… | 00:33
rm_work | that was because of a previous failed failover | 00:34 |
rm_work | but | 00:34 |
johnsom | As for the failover status, this is an interesting one. It's locking the LB, which may have other healthy amps.... | 00:34 |
rm_work | basically if it tries to failover when the state is PENDING_UPDATE | 00:34 |
rm_work | it fails | 00:34 |
rm_work | and yeah, the busy stays | 00:34 |
rm_work | i have to figure out the second one | 00:34 |
johnsom | Those revert issues are just missing kwargs | 00:35 |
rm_work | trying to figure out where | 00:35 |
johnsom | https://github.com/openstack/octavia/blob/master/octavia/controller/worker/tasks/database_tasks.py#L922 | 00:36 |
rm_work | ah yeah there's one | 00:36 |
rm_work | and 907 | 00:36 |
rm_work | we should fix up all of those | 00:36 |
johnsom | Should look more like: https://github.com/openstack/octavia/blob/master/octavia/controller/worker/tasks/database_tasks.py#L1058 | 00:36 |
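The difference between the two patterns, roughly (the task and argument names here are illustrative, not copied from database_tasks.py): taskflow passes extra context such as flow_failures into revert(), so a signature without *args/**kwargs raises TypeError during flow reversion.

    from taskflow import task

    class MarkAmphoraAllocatedInDB(task.Task):
        def execute(self, amphora, loadbalancer_id):
            ...

        # Broken: TypeError as soon as taskflow calls
        # revert(amphora, loadbalancer_id, flow_failures=...).
        # def revert(self, amphora, loadbalancer_id):
        #     ...

        # Correct: accept and ignore the extra revert context.
        def revert(self, amphora, loadbalancer_id, *args, **kwargs):
            ...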
johnsom | Yeah, I fixed a ton of those at one point, but more must have slipped in | 00:37 |
johnsom | We probably need a hacking rule for that | 00:37 |
xgerman_ | +1 | 00:38 |
johnsom | Ha, that is currently the only one it looks like | 00:39 |
xgerman_ | It makes sense to me to lock an LB during failover even if it has more than one amp - we can’t guarantee that updates will reach all amps at that point in time | 00:39
xgerman_ | but we should ignore it when we failover another amp | 00:40 |
johnsom | Oh, I don't disagree that it should be locked, I'm just worried that if the update thread is still going on the other o-cw, it's going to mess with the state machine, i.e. unlock it | 00:41
xgerman_ | mmh | 00:41 |
johnsom | I mean it "should" fail out and go to ERROR instead of pending | 00:42 |
rm_work | yep lol | 00:42 |
rm_work | just the one spot | 00:42 |
rm_work | awesome >_> | 00:42 |
johnsom | So, either we don't failover when it's in PENDING_* and wait for it to exit that state or.... | 00:43 |
xgerman_ | well, we always need to failover - uptime is our ultimate goal | 00:43 |
johnsom | Yeah, but I don't want failover of one amp to cause failure of the other.... | 00:44 |
xgerman_ | ok, makes sense - so if we are not running SINGLE we can wait for the update (and hope it doesn’t crash by talking to the defunct amp) | 00:45
rm_work | but yeah that's what i was saying earlier -- "we always need to failover" | 00:45 |
rm_work | so blocking a failover because of an update is kinda >_> | 00:45 |
rm_work | but, yeah, easier said than done since it IS problematic | 00:45 |
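One conservative reading of the options above, as a sketch: skip the failover while the LB is in a PENDING_* state and let the next health check cycle retry it, rather than failing the amphora permanently. The constant names come from octavia.common.constants; the helper itself is hypothetical.

    from octavia.common import constants

    def should_failover_now(load_balancer):
        # Defer (don't abort) failover while another operation owns the LB.
        return load_balancer.provisioning_status not in (
            constants.PENDING_CREATE,
            constants.PENDING_UPDATE,
            constants.PENDING_DELETE)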
rm_work | anyway this HELPS since now failovers *happen*, but now i'm just getting deadlocks like constantly | 00:45 |
johnsom | Did you ever find the deadlock log? | 00:46 |
rm_work | seriously, just spewing them | 00:46 |
rm_work | looking | 00:46 |
rm_work | oh err | 00:47 |
rm_work | wait | 00:47 |
rm_work | am i using INNODB? | 00:47 |
johnsom | I super hope so | 00:47 |
rm_work | err | 00:47 |
rm_work | how do i verify that | 00:47 |
johnsom | http://paste.openstack.org/show/618235/ | 00:47 |
rm_work | i have this set up as percona+xtradb | 00:47
johnsom | Yeah, you are | 00:47 |
johnsom | It was in your status output | 00:48 |
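For completeness, one way to double-check which storage engine the Octavia tables actually ended up on (the connection URL and schema name are assumptions):

    from sqlalchemy import create_engine, text

    engine = create_engine('mysql+pymysql://user:pass@127.0.0.1/octavia')
    with engine.connect() as conn:
        rows = conn.execute(text(
            "SELECT table_name, engine FROM information_schema.tables "
            "WHERE table_schema = 'octavia'"))
        for table_name, storage_engine in rows:
            # Percona XtraDB reports itself as InnoDB here.
            print(table_name, storage_engine)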
rm_work | does XtraDB not override that or something | 00:48
rm_work | XtraDB is madness | 00:48
johnsom | sqlalchemy + mysql is madness | 00:48
rm_work | lol | 00:48 |
*** leitan has quit IRC | 00:50 | |
rm_work | yeah i have no idea where the errors are going <_< | 00:50 |
rm_work | if anywhere | 00:51 |
johnsom | lsof? | 00:52 |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Fix health monitor DB locking. https://review.openstack.org/493252 | 00:55 |
johnsom | That will shut it up | 00:55 |
xgerman_ | ha | 00:55 |
rm_work | lol.... | 00:55 |
rm_work | not sure that's ideal | 00:57 |
johnsom | Well, no. We still need to figure out what is deadlocking. | 00:58 |
rm_work | this is dumb | 01:12 |
rm_work | maybe i need to explicitly configure a log location? | 01:13 |
rm_work | ah percona xtradb is Galera | 01:13 |
johnsom | There should be a mysql variable that defines the error log location | 01:14 |
johnsom | But didn't you see those "row too long" messages? that should have been the error log | 01:14 |
rm_work | yeah i found those on all nodes | 01:15 |
rm_work | but nothing about deadlocks | 01:15 |
rm_work | i don't know if setting that global is working right | 01:15 |
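If the global in question is innodb_print_all_deadlocks, it is worth confirming it actually took and where the error log lives: with it OFF, only the most recent deadlock appears in SHOW ENGINE INNODB STATUS and nothing is written to the error log. A quick check (the connection URL is an assumption):

    from sqlalchemy import create_engine, text

    engine = create_engine('mysql+pymysql://user:pass@127.0.0.1/octavia')
    with engine.connect() as conn:
        print(conn.execute(text("SHOW VARIABLES LIKE 'log_error'")).fetchone())
        print(conn.execute(text(
            "SHOW VARIABLES LIKE 'innodb_print_all_deadlocks'")).fetchone())
        # Requires SUPER, and on a Galera/XtraDB cluster it is per-node, so
        # it has to be set on every node that takes writes:
        # conn.execute(text("SET GLOBAL innodb_print_all_deadlocks = ON"))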
rm_work | hmmmmmm | 01:15 |
rm_work | maybe I need to just ... | 01:15 |
rm_work | only send writes to one node <_< | 01:15 |
rm_work | one sec | 01:15 |
rm_work | doing that | 01:16 |
rm_work | man, what we ARE missing is active/passive | 01:16 |
rm_work | i want to have one node ONLY come up if the other is down | 01:17 |
rm_work | can't really do it with weights | 01:17 |
rm_work | johnsom: k i think that solves it -- so this is not really octavia's problem, so much as galera's optimistic locking and writing to more than one node | 01:19 |
johnsom | Are you kidding me? | 01:19 |
johnsom | Ugh, can't figure out why this regex doesn't work | 01:20 |
johnsom | (.)*def revert\(.+, (?!\*\*kwargs)\): | 01:21 |
rm_work | :3 | 01:28 |
rm_work | this is a little odd | 01:28 |
rm_work | johnsom: so i would say: throw that revert fix into the same HM patch, *remove* the bits that hide the deadlock messages from logs, and we should merge that | 01:33 |
rm_work | since it does solve a problem | 01:34 |
johnsom | I would consider it if I can get this damn regex to work | 01:35 |
*** yamamoto has joined #openstack-lbaas | 01:35 | |
rm_work | yeah i poked at it | 01:37 |
rm_work | not sure wtf | 01:37 |
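For what it's worth, the original pattern can never match because the (?!\*\*kwargs) lookahead sits immediately before \):, which means the argument list would have to literally end in ", ):". Anchoring the lookahead right after "self" avoids that. A sketch of the hacking rule follows; the check name, error code, and regex are illustrative, not an existing Octavia check:

    import re

    # Flag taskflow revert() definitions that don't accept **kwargs.
    # flake8/hacking hands checks the logical line, so wrapped signatures
    # are already joined into a single string before this runs.
    _REVERT_NO_KWARGS = re.compile(r'def revert\(self(?!.*\*\*kwargs).*\):')

    def assert_revert_accepts_kwargs(logical_line):
        """O3xx - revert() must accept **kwargs."""
        if _REVERT_NO_KWARGS.search(logical_line):
            yield (0, 'O3xx: revert() should end with *args, **kwargs so '
                      'taskflow can pass flow_failures and other context')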
*** ssmith has quit IRC | 01:50 | |
*** yamamoto has quit IRC | 02:03 | |
*** gongysh has joined #openstack-lbaas | 02:43 | |
*** yamamoto has joined #openstack-lbaas | 03:04 | |
*** xingzhang has quit IRC | 03:08 | |
*** xingzhang has joined #openstack-lbaas | 03:08 | |
*** yamamoto has quit IRC | 03:09 | |
*** xingzhang has quit IRC | 03:13 | |
*** xingzhang has joined #openstack-lbaas | 03:14 | |
*** rajivk has quit IRC | 03:28 | |
*** reedip has quit IRC | 03:28 | |
*** yamamoto has joined #openstack-lbaas | 03:29 | |
openstackgerrit | Michael Johnson proposed openstack/python-octaviaclient master: Improve error reporting for the octavia plugin https://review.openstack.org/493273 | 04:13 |
johnsom | Ok, that should pass through our fault strings to the user giving better error strings than "Bad Request" | 04:14 |
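The idea, in rough form (the helper below is an illustration assuming a requests-style Response object, not the actual python-octaviaclient code): prefer the API's faultstring over the bare HTTP reason phrase when building the error message.

    def format_api_error(response):
        """Prefer the Octavia API's faultstring over the HTTP reason."""
        try:
            body = response.json()
        except ValueError:
            body = {}
        detail = body.get('faultstring') if isinstance(body, dict) else None
        return detail or response.reason  # e.g. falls back to "Bad Request"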
*** yamamoto has quit IRC | 04:39 | |
*** yamamoto has joined #openstack-lbaas | 04:48 | |
*** yamamoto has quit IRC | 04:55 | |
*** gcheresh has joined #openstack-lbaas | 06:22 | |
*** gongysh has quit IRC | 06:36 | |
*** gcheresh has quit IRC | 06:40 | |
*** yamamoto has joined #openstack-lbaas | 06:54 | |
*** yamamoto has quit IRC | 06:59 | |
*** tesseract has joined #openstack-lbaas | 07:03 | |
*** KeithMnemonic has quit IRC | 07:24 | |
*** yamamoto has joined #openstack-lbaas | 07:27 | |
*** yamamoto has quit IRC | 07:32 | |
*** Alex_Staf has joined #openstack-lbaas | 08:12 | |
*** aojea has joined #openstack-lbaas | 08:19 | |
*** gongysh has joined #openstack-lbaas | 09:28 | |
*** gongysh has quit IRC | 09:28 | |
*** aojea has quit IRC | 09:47 | |
*** amotoki__away is now known as amotoki | 10:51 | |
*** aojea has joined #openstack-lbaas | 10:53 | |
*** aojea has quit IRC | 11:00 | |
*** dasanind has quit IRC | 11:02 | |
*** yamamoto has joined #openstack-lbaas | 11:27 | |
*** yamamoto has quit IRC | 12:01 | |
*** yamamoto has joined #openstack-lbaas | 12:21 | |
*** gcheresh has joined #openstack-lbaas | 12:24 | |
*** Alex_Staf has quit IRC | 12:30 | |
*** Alex_Staf has joined #openstack-lbaas | 12:35 | |
*** aojea has joined #openstack-lbaas | 12:57 | |
*** aojea has quit IRC | 13:01 | |
*** gcheresh has quit IRC | 13:17 | |
*** aojea has joined #openstack-lbaas | 14:58 | |
*** xingzhang has quit IRC | 14:58 | |
*** xingzhang has joined #openstack-lbaas | 14:59 | |
*** aojea has quit IRC | 15:02 | |
*** xingzhang has quit IRC | 15:03 | |
*** ajo has quit IRC | 15:31 | |
*** yamamoto has quit IRC | 15:40 | |
*** yamamoto has joined #openstack-lbaas | 15:41 | |
*** ipsecguy_ has joined #openstack-lbaas | 15:52 | |
*** ipsecguy has quit IRC | 15:56 | |
*** xingzhang has joined #openstack-lbaas | 16:09 | |
*** Alex_Staf has quit IRC | 16:33 | |
*** xingzhang has quit IRC | 16:42 | |
*** aojea has joined #openstack-lbaas | 16:58 | |
*** aojea has quit IRC | 17:03 | |
*** aojea has joined #openstack-lbaas | 17:04 | |
*** tesseract has quit IRC | 17:28 | |
*** xingzhang has joined #openstack-lbaas | 17:42 | |
*** Alex_Staf has joined #openstack-lbaas | 18:05 | |
*** xingzhang has quit IRC | 18:12 | |
*** aojea has quit IRC | 18:13 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Fix octavia logging to be more friendly https://review.openstack.org/493328 | 18:57 |
*** xingzhang has joined #openstack-lbaas | 19:12 | |
*** D33P-B00K has joined #openstack-lbaas | 19:30 | |
*** D33P-B00K has left #openstack-lbaas | 19:30 | |
*** xingzhang has quit IRC | 19:42 | |
*** Alex_Staf has quit IRC | 19:58 | |
*** gcheresh has joined #openstack-lbaas | 20:00 | |
johnsom | Nice, that works for the gates | 20:02 |
*** xingzhang has joined #openstack-lbaas | 20:42 | |
*** aojea has joined #openstack-lbaas | 20:42 | |
openstackgerrit | Merged openstack/neutron-lbaas master: Update reno for stable/pike https://review.openstack.org/492872 | 20:58 |
*** gcheresh has quit IRC | 21:04 | |
*** xingzhang has quit IRC | 21:12 | |
*** aojea has quit IRC | 21:37 | |
*** aojea has joined #openstack-lbaas | 21:37 | |
*** xingzhang has joined #openstack-lbaas | 22:12 | |
*** xingzhang has quit IRC | 22:42 | |
*** aojea has quit IRC | 23:31 | |
*** xingzhang has joined #openstack-lbaas | 23:42 |