Monday, 2021-11-01

*** sshnaidm is now known as sshnaidm\|afk		10:09
*** sshnaidm\|afk is now known as sshnaidm		11:39
evrardjp	Shameless plug: If you know someone who wants to work on Ironic, feel free to give that person this link: https://citynetwork.uhigher.com/en/job-details?job=61508774-cc6f-4269-87c2-3e07162160f7 ... Or to contact me on irc ;)	13:31
spatel	jamesdenton_alt alternative ID :)	14:47
spatel	what happened here?	14:47
*** jamesdenton_alt is now known as jamesdenton		14:48
jamesdenton	maybe an imposter!	14:48
spatel	:)	14:52
spatel	I have a very strange issue going on related to networking	14:52
spatel	i thought maybe you can help guide me or advise me	14:53
spatel	we have a c7000 HP chassis with 16 gen9 blades	14:53
spatel	all blades are configured with an Active-Standby LACP bundle for redundancy.	14:53
spatel	yesterday i noticed one of the blades crashed, and it turned out to be a memory failure. but that created a strange issue: the blade switch went wrong and stopped sending LACP PDUs to the upstream TOR switch, and the switch got isolated :(	14:55
spatel	I have a wild theory that maybe the memory failure created a loop on the switch (not sure how)	14:56
spatel	thinking of configuring PASSIVE LACP on the HP blade switch side so that if anything happens to the server, the switch will shut down the port.	14:57
jamesdenton	hmm	14:59
jamesdenton	was it active-standby or lacp? I think lacp aggregates all links?	15:00
spatel	https://paste.opendev.org/show/810315/	15:01
spatel	This is what i have on Ubuntu server	15:02
spatel	I thought LACP has a mode called active-standby	15:02
jamesdenton	i just blame netplan	15:02
spatel	what do you mean?	15:03
jamesdenton	active-standby corresponds to mode 1 (not lacp), i think, while 802.3ad would be active-active (mode 4 lacp)	15:04
spatel	you are saying in my case it's not an LACP bond, right?	15:04
jamesdenton	right	15:04
jamesdenton	yes	15:04
jamesdenton	so the link must actually go NO-CARRIER, i think, for the failover to occur	15:05
spatel	hmm	15:05
spatel	This bond config doesn't detect my upstream uplink failure :(	15:06
jamesdenton	it would not detect that	15:06
spatel	I am thinking of adding arp_ip_target with the gateway IP, so ARP monitoring can detect an upstream uplink failure	15:06
jamesdenton	never used it myself, but give it a shot	15:08
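
Editor's note: the mode 1 versus mode 4 distinction discussed above can be sketched in a few lines of Python. This is only an illustrative lookup table of the Linux bonding driver's mode numbers and names; the helper names `mode_name` and `uses_lacp` are hypothetical and not part of any tool mentioned in the chat. It assumes the usual kernel naming, where "active-standby" is called active-backup.

```python
# Linux bonding driver modes, keyed by mode number.
BONDING_MODES = {
    0: "balance-rr",
    1: "active-backup",  # one active link, the others on standby; no LACP
    2: "balance-xor",
    3: "broadcast",
    4: "802.3ad",        # the only mode that negotiates LACP
    5: "balance-tlb",
    6: "balance-alb",
}


def mode_name(mode):
    """Accept a mode number or a mode name and return the canonical name."""
    if isinstance(mode, int):
        return BONDING_MODES[mode]
    if mode in BONDING_MODES.values():
        return mode
    raise ValueError(f"unknown bonding mode: {mode!r}")


def uses_lacp(mode):
    """Return True only for 802.3ad (mode 4)."""
    return mode_name(mode) == "802.3ad"


if __name__ == "__main__":
    for m in (1, 4):
        print(m, mode_name(m), "LACP" if uses_lacp(m) else "no LACP")
```

Under this reading, an active-backup bond on the server exchanges no LACP PDUs with the blade switch at all, so any LACP trouble would sit on the blade switch to TOR uplink rather than on the server bond.
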
spatel	This issue is killing me.. whenever a server crashes or memory fails on these blades, it causes the blade switch to break the LACP bond with the TOR switch :(	15:10
spatel	trying to understand the relation between a server crash and the TOR LACP bundle going down.	15:10
spatel	I am seeing the HP 6120XG blade switch stop sending LACP PDUs to the TOR switch, which puts LACP in suspended mode.	15:11
jamesdenton	and then the downlinks to the servers don't recognize that and appear offline?	15:14
spatel	jamesdenton look at this diagram - https://ibb.co/FntGz01	15:16
spatel	for the server both HP 6120 switches are up, but the TOR switch is not getting any LACP PDU packets, so the TOR is putting this switch's LACP port in suspended	15:17
spatel	This incident only happened to switch-A	15:17
spatel	jamesdenton did you test bonding inside a VM?	18:40
jamesdenton	eeeesh, if i did i don't recall	19:43
jamesdenton	having issues?	19:43
spatel	jamesdenton no worries, let me dig and see	21:09
spatel	what vm_memory_high_watermark setting do you guys use for RabbitMQ?	21:09
bjoernt	0.2	21:11
bjoernt	it really depends on how much ram you have and how large the vm should become	21:11
spatel	i have 128GB memory	21:14
spatel	my RabbitMQ keeps dying :(	21:14
spatel	i was getting OOM killer when i had 64GB memory, so i added a bunch more DIMMs and now i have 128GB	21:20
spatel	my current setting is 0.2 so thinking of changing it to 0.4	21:20
bjoernt	depends how it dies. doubling the ram is effectively the same as 0.4 on the old one	21:39
bjoernt	you don't want too-large vms, or the GC will take too long	21:49
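
Editor's note: the arithmetic behind bjoernt's remark can be checked with a short Python sketch. It assumes the relative form of vm_memory_high_watermark, where the value is a fraction of the RAM RabbitMQ detects; the function name `watermark_gib` is hypothetical and used only for illustration.

```python
def watermark_gib(total_ram_gib, fraction):
    """Memory threshold in GiB for a relative watermark.

    Assumes the watermark is simply fraction * total RAM.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return total_ram_gib * fraction


if __name__ == "__main__":
    old = watermark_gib(64, 0.4)    # 0.4 on the old 64GB host
    new = watermark_gib(128, 0.2)   # 0.2 after the upgrade to 128GB
    raised = watermark_gib(128, 0.4)  # the proposed change
    print(f"64GB @ 0.4  -> {old:.1f} GiB")
    print(f"128GB @ 0.2 -> {new:.1f} GiB")
    print(f"128GB @ 0.4 -> {raised:.1f} GiB")
```

So keeping 0.2 on the 128GB host already gives the broker the same absolute threshold that 0.4 would have given on the 64GB host, and moving to 0.4 doubles it again.
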
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org is being restarted quickly for some security updates, but should return to service momentarily		22:09
