Saturday, 2014-01-18

*** jcooley_ has joined #tripleo [00:03]
*** jcooley_ has quit IRC [00:08]
*** ccrouch has quit IRC [00:16]
*** CaptTofu has joined #tripleo [00:18]
*** noslzzp has joined #tripleo [00:19]
*** sdake has joined #tripleo [00:22]
*** sdake has joined #tripleo [00:22]
*** sdake has quit IRC [00:25]
*** UtahDave has quit IRC [00:26]
*** jcooley_ has joined #tripleo [00:30]
*** CaptTofu has quit IRC [00:40]
*** CaptTofu has joined #tripleo [00:40]
*** CaptTofu has quit IRC [00:45]
*** spzala has quit IRC [00:51]
*** newell has quit IRC [01:04]
lifeless: crickets [01:17]
*** CaptTofu has joined #tripleo [01:23]
devananda: lol [01:34]
clarkb: lifeless: isn't it your weekend? :P [01:35]
devananda: hey lifeless, got a few minutes? [01:35]
clarkb: we have a long weekend this side of the planet [01:35]
devananda: clarkb: what's monday again? [01:35]
clarkb: devananda: mlk jr day [01:35]
devananda: ah right [01:35]
derekh: lifeless: got a node running the jjb stuff again, now rebuilding the TE before I clock off [01:49]
*** taps has quit IRC [01:57]
*** noslzzp has quit IRC [02:08]
*** noslzzp has joined #tripleo [02:09]
*** CaptTofu has quit IRC [02:24]
*** coolsvap_away has joined #tripleo [02:52]
openstackgerrit: Derek Higgins proposed a change to openstack-infra/tripleo-ci: Switch test environment users https://review.openstack.org/67614 [02:53]
*** coolsvap has quit IRC [02:54]
derekh: lifeless: looks like TE host on the baremetal cloud can't contact the broker, have updated the etherpad with more up-to-date notes and will pick it back up tomorrow [02:55]
derekh: root@testenv-testenvconfig-lieghn64l4vq:/home/heat-admin# ping 192.168.1.1 [02:56]
derekh: 13 packets transmitted, 0 received, +7 errors, 100% packet loss, time 12065ms [02:56]
*** AaronGr is now known as AaronGr_Zzz [02:57]
*** derekh has quit IRC [02:59]
*** coolsvap_away has quit IRC [03:15]
*** sdake has joined #tripleo [03:28]
*** hewbrocca has quit IRC [03:54]
*** hewbrocca has joined #tripleo [03:54]
*** CaptTofu has joined #tripleo [04:42]
*** CaptTofu has quit IRC [05:15]
*** noslzzp has quit IRC [05:17]
*** rushiagr has joined #tripleo [05:24]
*** vkozhukalov has joined #tripleo [05:30]
openstackgerrit: Tzu-Mainn Chen proposed a change to openstack/tuskar-ui: Add unstyled overcloud resource category page https://review.openstack.org/67632 [05:59]
*** rushiagr is now known as rushiagr_away [06:01]
*** rushiagr_away is now known as rushiagr [06:11]
*** ccrouch has joined #tripleo [06:18]
*** ccrouch has quit IRC [06:23]
*** tzumainn has quit IRC [06:32]
*** CaptTofu has joined #tripleo [07:16]
*** boris-42 has joined #tripleo [07:16]
*** akuznetsov has quit IRC [07:19]
*** rwsu has quit IRC [07:19]
*** CaptTofu has quit IRC [07:20]
*** rushiagr has joined #tripleo [07:32]
*** akuznetsov has joined #tripleo [07:50]
*** akuznetsov has quit IRC [08:32]
*** akuznetsov has joined #tripleo [09:10]
*** CaptTofu has joined #tripleo [09:16]
*** CaptTofu has quit IRC [09:21]
*** akuznetsov has quit IRC [09:47]
*** hewbrocc` has joined #tripleo [09:59]
*** uvirtbot has quit IRC [10:03]
*** hewbrocca has quit IRC [10:04]
*** boris-42 has quit IRC [10:06]
*** e0ne_ has joined #tripleo [10:10]
*** sdake has quit IRC [10:11]
*** e0ne has quit IRC [10:12]
*** akuznetsov has joined #tripleo [10:12]
*** sdake has joined #tripleo [10:14]
*** jcooley_ has quit IRC [10:14]
*** boris-42 has joined #tripleo [10:34]
*** derekh has joined #tripleo [11:00]
*** akuznetsov has quit IRC [11:15]
*** CaptTofu has joined #tripleo [11:18]
*** CaptTofu has quit IRC [11:22]
*** derekh has quit IRC [11:44]
*** derekh has joined #tripleo [11:50]
*** rbrady has joined #tripleo [11:51]
*** akuznetsov has joined #tripleo [11:52]
*** rushiagr has quit IRC [12:08]
*** rushiagr has joined #tripleo [12:11]
*** derekh has quit IRC [12:16]
*** rushiagr2 has joined #tripleo [12:18]
*** rushiagr has quit IRC [12:21]
*** rushiagr2 has quit IRC [12:41]
*** e0ne_ has quit IRC [13:04]
*** e0ne has joined #tripleo [13:04]
*** CaptTofu has joined #tripleo [13:18]
*** CaptTofu has quit IRC [13:21]
*** CaptTofu has joined #tripleo [13:22]
*** derekh has joined #tripleo [13:26]
*** e0ne has quit IRC [13:33]
*** derekh has quit IRC [13:34]
*** e0ne has joined #tripleo [14:09]
*** e0ne has quit IRC [14:13]
*** e0ne has joined #tripleo [14:22]
*** e0ne has quit IRC [14:25]
*** ccrouch has joined #tripleo [14:48]
*** vkozhukalov has quit IRC [15:01]
*** e0ne has joined #tripleo [15:02]
*** e0ne has quit IRC [15:06]
*** CaptTofu has quit IRC [15:36]
*** CaptTofu has joined #tripleo [15:37]
*** CaptTofu has quit IRC [15:42]
*** vkozhukalov has joined #tripleo [15:43]
*** derekh has joined #tripleo [15:47]
*** lynxman has quit IRC [15:49]
*** mordred has quit IRC [15:49]
*** slagle has quit IRC [15:49]
*** rpodolyaka has quit IRC [15:49]
*** lynxman has joined #tripleo [15:52]
*** mordred has joined #tripleo [15:52]
*** slagle has joined #tripleo [15:52]
*** rpodolyaka has joined #tripleo [15:52]
*** lynxman has quit IRC [15:56]
*** mordred has quit IRC [15:56]
*** slagle has quit IRC [15:56]
*** rpodolyaka has quit IRC [15:56]
*** lynxman has joined #tripleo [15:58]
*** mordred has joined #tripleo [15:58]
*** slagle has joined #tripleo [15:58]
*** rpodolyaka has joined #tripleo [15:58]
*** akuznetsov has quit IRC [16:40]
*** derekh has quit IRC [16:45]
*** panda has joined #tripleo [17:29]
*** rushiagr has joined #tripleo [17:31]
*** panda__ has quit IRC [17:32]
*** CaptTofu has joined #tripleo [17:37]
*** CaptTofu has quit IRC [17:42]
*** akuznetsov has joined #tripleo [17:47]
*** marun has joined #tripleo [18:01]
*** marun has quit IRC [18:06]
*** jrist has quit IRC [18:19]
*** UtahDave has joined #tripleo [18:33]
*** jrist has joined #tripleo [18:33]
*** akuznetsov has quit IRC [18:51]
*** noslzzp has joined #tripleo [19:10]
*** CaptTofu has joined #tripleo [19:11]
lifeless: o/ [19:21]
lifeless: more crickets! [19:21]
lifeless: to the theme of 'more cowbell!' [19:21]
*** CaptTofu has quit IRC [19:57]
*** taps has joined #tripleo [19:58]
SpamapS: I got a fever [20:21]
*** UtahDave has quit IRC [20:30]
*** akuznetsov has joined #tripleo [20:35]
lifeless: I think we're going to have to debug this network performance thing asap [20:39]
lifeless: 12kBps is too slow [20:39]
phschwartz: What type of network issue? I have some cycles while testing of a new release here is going on and I can take a look [20:39]
lifeless: phschwartz: on ci-overcloud.tripleo.org, which is a regular cd-overcloud just a different name, so we have a stable base for infra to run in [20:40]
lifeless: phschwartz: instances are getting 12kbps from the internet [20:40]
lifeless: phschwartz: gre overlay network [20:40]
phschwartz: gre+ovs I take it [20:40]
lifeless: yah [20:41]
phschwartz: I have had this issue a few times. What version of ovs is installed? [20:41]
lifeless: ml2 driver [20:41]
lifeless: I just tried clamping the mtu of an instance down, no discernable effect [20:41]
*** akuznetsov has quit IRC [20:41]
lifeless: let me log into the plumbing and I'll answer the ovs question [20:41]
lifeless: phschwartz: ovs-vsctl --version [20:42]
lifeless: ovs-vsctl (Open vSwitch) 1.10.2 [20:42]
lifeless: Compiled Sep 23 2013 14:53:13 [20:42]
lifeless: on the network node [20:42]
phschwartz: No, that won't do it. One of the older ovs installs had an issue with gre networking that caused its in-memory datastore for ovs+gre routing to eat cpu and ram. It would clean the ram, but would leave cpu usage high causing a reduction in traffic routing compute which in turn slows down throughput [20:43]
lifeless: which is able to pull 20MB/s from the host I was testing against [20:43]
phschwartz: Let me check to see if that is the version with the issue or not [20:43]
lifeless: same version on the compute node [20:43]
phschwartz: ok, that is the one I had the issue with that over time would have the same problem. I was running the default ubuntu installed 1.10.2 on 13.04. [20:45]
lifeless: 1.10.2 is bad? [20:45]
phschwartz: Compiling for my local 1.11.0 fixed the issue. The other thing that helped was moving from using the python wrapper for root commands [20:45]
phschwartz: I found it to be with gre [20:46]
phschwartz: Works good with nvp [20:46]
lifeless: ok, that's super useful. Thanks! [20:46]
lifeless: phschwartz: is nvp open source? [20:46]
phschwartz: I found this before I started with Rax, but I think someone in Rax found the same as they moved to nvp and that I know still run 1.10.2 [20:46]
phschwartz: no, it is not. [20:46]
lifeless: ah :) [20:46]
lifeless: ok, so we need to replace the openvswitch packages too [20:47]
lifeless: I can see us just building everything from scratch :/ [20:47]
phschwartz: That was what I did for the fix. Built my own and made a local repo for install [20:47]
phschwartz: I need to look at the bug backlog and work on a few when I have time like this. Haven't had much time lately. [20:47]
lifeless: I'm going to poke deeper on this, as there isn't a CPU problem today, just a throughput problem [20:49]
phschwartz: I think what I found on the ovs mailing lists when I had the issue was that it would eat ram and cpu, then kill the gre threads, and it would severely limit them when it respawns them and that is why it has the issue. [20:50]
lifeless: yeah but this is right from first vm on cloud ever [20:50]
lifeless: it would have to eat them spectacularly fast... [20:50]
phschwartz: I found it to happen very fast [20:52]
lifeless: ok [20:52]
lifeless: order of minutes? [20:52]
phschwartz: If he is on, kbringard in #openstack had the same issue in the ovs+gre setup that at&t was using and helped me locate the issue. He might have more in-depth info still. [20:53]
phschwartz: yes, a matter of minutes [20:53]
lifeless: ok, cool [20:53]
lifeless: so - replacing the package version is going to be a little tricky right now, but will dig into it [20:53]
phschwartz: I would get the slowdown starting within 2-5 min of quantum bringing up networking as a whole for my env. [20:53]
phschwartz: It will be in this case. I had the benefit of a small cluster at the time with no impact of stopping to redo it. [20:54]
lifeless: so one thing that's odd [20:54]
lifeless: when I wget from the instance to the world - slow [20:54]
lifeless: when I rsync up the same content from my home to the instance - fast [20:54]
lifeless: phschwartz: would restarting openvswitch temporarily fix things? [20:55]
phschwartz: defn the same issue that I had then. It was slowness in the computing of routing in the ovs namespaces that were using gre. [20:55]
phschwartz: That would work sometimes, but usually needed a host reboot. [20:55]
lifeless: righto, from the ip router netns I get 160Mbps of throughput to a static file in the UK [20:56]
lifeless: which isn't brilliant but is tolerable [20:57]
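
Note on the figures quoted so far: the log mixes bits and bytes (12kBps, 12kbps, 20MB/s, 160Mbps), so it may help to put them on one scale. The sketch below is only an illustration. It assumes decimal prefixes (k = 1000, M = 1,000,000), which the log does not state, and the names `UNIT_BITS` and `to_bits_per_second` are made up for this note. It covers both readings of the instance figure, since the log writes it once as kBps and once as kbps.

```python
# Multipliers to bits per second. Decimal prefixes are an assumption;
# the log does not say whether k/M mean powers of 1000 or 1024.
UNIT_BITS = {
    "kbps": 1_000,        # kilobits per second
    "kBps": 8_000,        # kilobytes per second (8 bits per byte)
    "Mbps": 1_000_000,    # megabits per second
    "MBps": 8_000_000,    # megabytes per second
}


def to_bits_per_second(value, unit):
    """Convert a throughput figure to bits per second."""
    return value * UNIT_BITS[unit]


# Figures as quoted in the log.
instance_if_bits = to_bits_per_second(12, "kbps")    # "12kbps" reading
instance_if_bytes = to_bits_per_second(12, "kBps")   # "12kBps" reading
network_node = to_bits_per_second(20, "MBps")        # "20MB/s" on the network node
router_netns = to_bits_per_second(160, "Mbps")       # "160Mbps" from the router netns

# How many times slower the instance is than the network node,
# under each reading of the instance figure.
slowdown_if_bytes = network_node / instance_if_bytes
slowdown_if_bits = network_node / instance_if_bits

print(network_node == router_netns)
print(round(slowdown_if_bytes), round(slowdown_if_bits))
```

Under these assumptions the 20MB/s and 160Mbps figures are the same rate, and the instance is slower than the network node by three to four orders of magnitude whichever way the 12k figure is read.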
phschwartz: I would see not even to the net, but between external networks in the datacenter hits where I would get 50-60kbps, and the core network for the env was 160gb and the clusters interconnect was 8 10g ports aggregated with 2 10g aggregated on each host. [21:02]
phschwartz: You can never be 100% positive, but defn sounds like the same issue I was having [21:02]
phschwartz: When I would get rid of namespacing it would improve, but that defeats the purpose [21:03]
lifeless: hmmm [21:03]
lifeless: trusty has 2.0 [21:03]
lifeless: that might be easier [21:03]
phschwartz: here is a mail list thread from OS that someone had the same issue. http://lists.openstack.org/pipermail/openstack/2013-October/002265.html [21:04]
phschwartz: In their case, the only fix was setting up a proxy to get around the issue with the gre namespacing [21:04]
lifeless: yah [21:05]
lifeless: family time, shall dig in in detail this evening [21:06]
lifeless: thanks for the pointers [21:06]
phschwartz: Just had a network eng from LexisNexis (where I used to work) remind me that we also had to turn GRO off on the hardware side as the offloading made the problem happen a lot faster. [21:06]
phschwartz: no problem at all [21:06]
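
Note on the version discussion above: the check phschwartz describes (1.10.2 affected, a locally built 1.11.0 fine, trusty shipping 2.0) can be sketched as a small parser over the `ovs-vsctl --version` output pasted earlier. This is an illustration only. The 1.11.0 threshold is phschwartz's own experience, not a confirmed fix version, and lifeless notes that his symptoms (no CPU problem) may not match. The names `FIXED_IN`, `parse_ovs_version` and `is_suspect` are hypothetical.

```python
import re

# Assumed threshold, taken from phschwartz's report that building 1.11.0
# fixed the issue for him. Not a confirmed upstream fix version.
FIXED_IN = (1, 11, 0)


def parse_ovs_version(output):
    """Extract the version as a tuple of ints from `ovs-vsctl --version` output."""
    match = re.search(r"\(Open vSwitch\)\s+(\d+(?:\.\d+)*)", output)
    if match is None:
        raise ValueError("no Open vSwitch version found in output")
    return tuple(int(part) for part in match.group(1).split("."))


def is_suspect(version, fixed_in=FIXED_IN):
    """True if the version is older than the assumed fixed release."""
    # Tuples compare element by element, so (1, 10, 2) < (1, 11, 0).
    return version < fixed_in


# The output lifeless pasted from the network node.
sample = "ovs-vsctl (Open vSwitch) 1.10.2\nCompiled Sep 23 2013 14:53:13\n"
installed = parse_ovs_version(sample)
print(installed, is_suspect(installed))
```

Comparing integer tuples rather than raw strings avoids mis-ordering versions such as 1.9 and 1.10.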
*** julim has quit IRC [21:16]
*** taps has quit IRC [21:19]
*** boris-42 has quit IRC [21:19]
*** jhurlbert has quit IRC [21:19]
*** sgrasley has quit IRC [21:19]
*** taps has joined #tripleo [21:20]
*** boris-42 has joined #tripleo [21:20]
*** jhurlbert has joined #tripleo [21:20]
*** sgrasley has joined #tripleo [21:20]
*** boris-42 has quit IRC [21:21]
*** boris-42 has joined #tripleo [21:21]
*** d0ugal has joined #tripleo [21:26]
*** derekh has joined #tripleo [21:41]
*** vkozhukalov has quit IRC [21:43]
*** rushiagr has quit IRC [21:50]
*** taps has quit IRC [22:35]
*** ccrouch has quit IRC [22:37]
*** cody-somerville has quit IRC [23:00]
*** e0ne has joined #tripleo [23:00]
*** e0ne has quit IRC [23:07]
*** e0ne has joined #tripleo [23:20]
*** akuznetsov has joined #tripleo [23:25]
derekh: need a bigger VM - Out of memory [23:27]
*** cody-somerville has joined #tripleo [23:35]
*** e0ne has quit IRC [23:46]
