Sunday, 2014-02-09

*** jcooley_ has quit IRC00:13
*** jcooley_ has joined #tripleo00:14
*** jcooley_ has quit IRC00:28
*** jcooley_ has joined #tripleo00:29
*** jcooley_ has quit IRC00:34
*** jcooley_ has joined #tripleo00:35
*** jcooley_ has quit IRC00:57
*** jcooley_ has joined #tripleo00:58
*** jcooley_ has quit IRC01:03
peoplemergeSpamapS: Just looking at the schedule at SCALE, noticed you're giving a talk.  Looking forward to it!01:10
*** cd-undercloud has joined #tripleo01:11
cd-undercloud************** overcloud complete status=1 ************01:11
*** cd-undercloud has quit IRC01:11
*** NearlyFunctional has quit IRC01:17
*** NearlyFunctional has joined #tripleo01:19
*** michchap_ has joined #tripleo01:20
*** yamahata_ has joined #tripleo01:22
*** jrist has quit IRC01:29
*** rwsu has quit IRC01:29
*** openstackgerrit has quit IRC01:29
*** yamahata__ has quit IRC01:29
*** michchap has quit IRC01:30
*** uvirtbot has quit IRC01:30
*** rwsu has joined #tripleo01:37
*** jrist has joined #tripleo01:37
lifelesspeoplemerge: gl!01:42
lifelessSpamapS: ohhai?01:42
lifelessInstanceDeployFailure: Timeout reached while waiting for PXE deploy of instance b2bf8601-240d-4ee8-b19d-93c10ddb26d101:44
*** uvirtbot has joined #tripleo01:50
lifelessStevenK: ah, you don't mirror sources I'm guessing ?02:31
lifelessStevenK: since my shiny new apt-mirror is many more GB than 50 :)02:32
*** cd-undercloud has joined #tripleo02:36
cd-undercloud************** overcloud complete status=1 ************02:36
*** cd-undercloud has quit IRC02:36
SpamapSlifeless: I thought StevenK said 600G?02:48
SpamapSfor "all"02:48
peoplemergelifeless: thanks. Last time I got stuck on testenv https://gist.github.com/peoplemerge/886354202:48
peoplemergeThis time I did it all more thoughtfully.  Rebooting after devtest_setup and before testenv fixed that (libvirt permissions err?)02:50
peoplemergeI was able to build seed without all that complicated manual bridging step SpamapS and I did02:51
peoplemergeit looks good so far.  Gotta go buy some ski gear with wifey, be back after dinner :)02:51
lifelessSpamapS: oh hi! see my question back before you disappeared ;)02:52
SpamapSlifeless: ah, so a) :( more bad machines02:53
SpamapSlifeless: b) The "without heat's help" solution just requires that we infer the change that is coming and react accordingly...02:53
lifelessSpamapS: you say potato I say wtf you talking about willis?02:54
SpamapSlifeless: I'm worried about getting things into Heat actually. I feel that there is hyper-conservative behavior dominating the discussions so it may take a while to land things. :-/02:55
SpamapSlifeless: Could just be my perceptions.02:55
*** CaptTofu has joined #tripleo02:55
lifelessSpamapS: perhaps. still, I want to understand your idea02:55
lifelessI don't, atm02:55
SpamapSlifeless: so we'd just have a pre-server waitcondition and launchconfig that exposes the same things as the current post-server launchconfig and waitcondition (except the server's address) ...02:57
SpamapSlifeless: if we see image ID changing, we'd know that a rebuild is coming, and do any appropriate actions, then ping back that wait condition.02:57
SpamapSlifeless: this is pure evil, but it would provide ammunition to counter the conservative "don't make Heat a workflow engine" folks.02:58
lifelessSpamapS: I see, so the evil bit is that we're duplicating the entire config descriptor02:58
SpamapSand inferring the action02:59
lifelessSpamapS: we only need the image property though02:59
lifelessnone of the rest - just a second cfn source for occ02:59
SpamapSYeah that is true02:59
lifelesswhat about deletes?02:59
SpamapSnothing implicit I can think of.. we'd have to actually set a "you're about to be deleted" flag03:00
lifelessok03:00
lifelessso we can work around heat not doing this intrinsically for rebuild, but not scale down03:00
SpamapSright03:00
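The inference idea above (an image ID change implies a rebuild, while a delete needs an explicit flag) can be sketched roughly as follows. This is a minimal illustration only: the metadata keys `image` and `about_to_be_deleted` and the function name are hypothetical, not anything Heat or os-collect-config defines.

```python
def infer_pending_change(old_metadata, new_metadata):
    """Guess what is about to happen to this server from two metadata snapshots.

    Assumed (hypothetical) keys:
      "image"               - the image ID the server should be running
      "about_to_be_deleted" - explicit flag; deletes cannot be inferred
    Returns "delete", "rebuild", or None.
    """
    # Nothing implicit signals a delete, so it has to be flagged explicitly.
    if new_metadata.get("about_to_be_deleted"):
        return "delete"
    # A changed image ID means a rebuild is coming.
    if old_metadata.get("image") != new_metadata.get("image"):
        return "rebuild"
    return None
```

A caller would compare the last metadata it saw with the freshly fetched copy, take any appropriate action, and then signal the pre-server wait condition.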
lifelessso, I think I'd rather carry a patch than invest in a poor-man's solution that is incomplete like this03:01
lifelesswhat do you think ?03:01
SpamapSLikewise03:04
SpamapSI'm working through the options03:04
SpamapSwhat about the hundreds of updates solution?03:04
SpamapSI _hate_ that one.03:04
SpamapSbut I'm tired of arguing03:04
lifeless2014-02-09 03:02:17.385 21895 ERROR nova.virt.libvirt.driver [-] [instance: 45daff91-84dd-4596-95c2-023afc6436c1] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+tcp://overcloud-novacompute1-bxq4qdzw7bif/system: Unable to resolve address 'overcloud-novacompute1-bxq4qdzw7bif' service '16509': Name or service not known03:04
lifelessSpamapS: hundreds of updates solution?03:04
SpamapSlifeless: oh sorry that is for rolling03:05
lifelessSpamapS: so I'm not seeing these arguments03:05
lifelessSpamapS: I'm entirely happy to go to battle for sanity for you :)03:05
lifelessSpamapS: if you want to focus on making the world a better place through code03:06
SpamapSdunno, have found it hard to even write code for this03:07
SpamapStrying to make a spec that is perfect :=P03:07
lifelessSpamapS: ouch03:07
lifelessSpamapS: why03:07
SpamapSlifeless: not sure. I feel the same about Heat as I did about Juju.. lots of smart people not caring much for their real users.03:09
lifelessSpamapS: ok, so lets fix that.03:10
SpamapSIt could just be that I'm losing weight and projecting my negativity on my current problem set too.03:10
lifelessyou're dieting>03:10
lifeless?03:10
SpamapSYeah a lot03:10
lifelesscool03:10
lifelessI'm trying but man03:10
lifelesswith kids, so hard03:11
lifelesss/s//03:11
SpamapSI hit 260lbs. last week.. which caused me to throw out my back... so I'm on 1600 calories a day03:11
lifelessSpamapS: ah, so you had been anti-dieting03:11
SpamapSalso called "stress eating"03:11
lifelesswhats got you stressed?03:11
SpamapSThat's a profound question that probably has a complicated answer.03:12
lifelessok03:12
lifelessso zaneb seems to agree that having servers optionally notified of coming changes to the server itself via metadata is good short term answer to a bunch of things03:12
lifelessAFAICT we just need to show up with code now03:12
SpamapSYeah I think that could be an easy one to get done.03:12
lifelessunless stevebaker wants a cordon-bleu-print03:13
SpamapSIt's basically my poor-man's solution, but encoded in an argument to the server which is something like "metadata_notification_key: xxx" which will be a key in metadata where we put a dict with a pending action/callback url, etc.03:14
*** mestery__ is now known as mestery03:14
lifelessyes03:15
lifelessthough since we use a decoupled metadata struct03:15
SpamapSmetadata_notification_resource: too :-P03:15
SpamapSor unravel the circular dependency bits03:15
lifelessone thing at a time03:17
SpamapSthe former seems simpler since the circular dep thing will be controversial and possibly complicated03:17
SpamapSlifeless: I think what is getting me is that it feels like updates are the red headed step child of Heat.03:17
SpamapSlifeless: anyway... airing doubts makes them feel less important. Thanks.03:18
lifelessSpamapS: so, I think that CFN (and thus Heat) were built on the basis of not needing what we need, because there are other tools like trove for db's03:21
lifelessbut we're deploying with itself03:21
lifelessso we need to handle scenarios that the older design was able to just avoid03:25
SpamapSagreed03:33
lifelessanyhow, I think we have a simple, and easy answer here.03:48
lifelessso sometime monday it should be working and we can monkey patch it on :)03:49
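The metadata notification scheme discussed above (a key in the server's metadata holding a dict with a pending action and a callback URL) might be consumed on the server roughly like this. It is a sketch under assumptions: the key name `pending_change`, the fields `action` and `callback_url`, and the helper names are all hypothetical and do not describe an existing Heat API.

```python
# Assumed name of the metadata key that "metadata_notification_key" points at.
NOTIFICATION_KEY = "pending_change"


def pending_action(metadata, key=NOTIFICATION_KEY):
    """Return (action, callback_url) if a change is pending, else None."""
    note = metadata.get(key)
    if not isinstance(note, dict) or "action" not in note:
        return None
    return note["action"], note.get("callback_url")


def handle_notification(metadata, prepare, notify):
    """Prepare for a pending change, then signal that the server is ready.

    `prepare(action)` and `notify(callback_url)` are supplied by the caller;
    in a real deployment `notify` would ping the callback URL.
    Returns True if a notification was handled, False otherwise.
    """
    pending = pending_action(metadata)
    if pending is None:
        return False
    action, callback_url = pending
    prepare(action)
    if callback_url:
        notify(callback_url)
    return True
```

Passing `prepare` and `notify` in as callables keeps the sketch independent of how the metadata is fetched or how the callback is actually delivered.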
*** CaptTofu has quit IRC03:52
*** cody-somerville has joined #tripleo03:55
*** cody-somerville has joined #tripleo03:55
*** cd-undercloud has joined #tripleo03:57
cd-undercloud************** overcloud complete status=1 ************03:57
*** cd-undercloud has quit IRC03:57
StevenKlifeless: Yeah, I don't mirror sources. That should only count as another arch, really.04:07
* lifeless tries live block migration with qemi04:07
lifelessqemu04:07
lifelesscall me a daredevil04:07
*** ohadlevy has quit IRC04:09
*** ohadlevy has joined #tripleo04:11
*** ohadlevy is now known as Guest2915004:11
lifelesshmm04:37
lifelessvirNetTLSContextCheckCertFile:117 : Cannot read CA certificate '/etc/pki/CA/cacert.pem': No such file or directory04:37
lifelessbut I didn't turn tls on04:37
lifelessyay :/ https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/97921204:40
uvirtbotLaunchpad bug 979212 in libvirt "libvirtd --listen fails with: Cannot read CA certificate '/etc/pki/CA/cacert.pem': No such file or directory" [Medium,Won't fix]04:40
*** cd-undercloud has joined #tripleo05:19
cd-undercloud************** overcloud complete status=1 ************05:19
*** cd-undercloud has quit IRC05:19
*** killer_prince has joined #tripleo05:32
*** edmund1 has joined #tripleo05:37
*** edmund1 has quit IRC05:42
lifelessI get the feeling I'm manually reproducing all the neutron races right now05:43
lifeless../05:43
*** jcooley_ has joined #tripleo05:46
*** jcooley_ has quit IRC05:50
*** CaptTofu has joined #tripleo05:53
*** CaptTofu has quit IRC05:58
*** vkozhukalov has joined #tripleo06:23
*** cd-undercloud has joined #tripleo06:39
cd-undercloud************** overcloud complete status=1 ************06:39
*** cd-undercloud has quit IRC06:39
*** Guest29150 is now known as ohadlevy06:46
*** ohadlevy is now known as ohadlevy_06:46
*** ohadlevy_ is now known as ohadlevy06:46
*** ohadlevy has joined #tripleo06:46
*** noslzzp has quit IRC06:58
*** akuznetsov has quit IRC07:07
*** akuznetsov has joined #tripleo07:12
*** akuznetsov has quit IRC07:22
*** akuznetsov has joined #tripleo07:37
*** mrunge has joined #tripleo07:51
*** e0ne has joined #tripleo07:53
*** CaptTofu has joined #tripleo07:54
*** CaptTofu has quit IRC07:59
*** cd-undercloud has joined #tripleo08:00
cd-undercloud************** overcloud complete status=1 ************08:00
*** cd-undercloud has quit IRC08:00
lifelessStevenK: https://review.openstack.org/59699 merged - congrats08:10
*** rlandy has joined #tripleo08:11
lifelessgreghaynes: that thing with rootwrap08:13
lifelessgreghaynes: a thought - neutron knows where it's being installed (realpath etc)08:13
lifelessgreghaynes: so, during install, it should specify that location as a valid path for binaries.08:13
peoplemergeOK The seed step is done, following devtest.  Here's where it gets a bit hazy for me.  I guess undercloud is optional here due to the modest size of my effort.08:22
peoplemergedo I run devtest_overcloud.sh on seed?08:23
lifelesspeoplemerge: you're using real machines right?08:24
peoplemergeAlso I don't see the part in docs where we run `nova baremetal-node-create seed 2 4096 128 $NIC` but that does appear to happen on seed08:24
peoplemergelifeless: yes08:24
peoplemergelifeless: desktops, using fake power08:24
lifelesshow many machines?08:25
peoplemergelifeless: 508:25
lifelessso for that, you could do one undercloud, then 1 overcloud control and three hypervisors08:26
lifelesspeoplemerge: I wouldn't skip undercloud, because deploy from real machines is much faster08:26
lifelessits nice to have an actual real box for the undercloud08:26
lifelessso for that, you're registering a single machine with the seed, then you source the seed rc, and then you can run devtest_undercloud08:26
*** mrunge has quit IRC08:27
lifelessthe last few commits added support for metadata in the test environment describing real hardware08:27
lifelessits still a bit rough, but you may want to look at their diffs to get your head around it08:27
lifelessalso, apropos of nothing08:28
lifelessHTF did this ever work : ./etc/neutron/rootwrap.d/l3.filters~:metadata_proxy_local: CommandFilter, /usr/local/bin/neuton-ns-metadata-proxy, root08:28
peoplemergelifeless: sounds good.  better not over think this08:39
peoplemergelifeless: hm I guess my confusion is which commands run where, i.e. run devtest #1-5 on host containing seed, #6 on seed, then run baremetal-node-create, then boot a box and pxe should fire which should make the undercloud.08:45
*** mrunge has joined #tripleo08:45
peoplemergewill review last few commits08:49
* peoplemerge -> bed08:50
*** e0ne has quit IRC08:58
lifelesspeoplemerge: all the commands are run from the same shell08:59
*** e0ne has joined #tripleo08:59
*** cd-undercloud has joined #tripleo09:21
cd-undercloud************** overcloud complete status=1 ************09:21
*** cd-undercloud has quit IRC09:21
*** e0ne has quit IRC09:22
*** mrunge has quit IRC09:37
*** CaptTofu has joined #tripleo09:55
*** CaptTofu has quit IRC10:00
*** SpamapS has quit IRC10:06
*** mestery_ has joined #tripleo10:08
*** mestery has quit IRC10:10
*** SpamapS has joined #tripleo10:13
*** cd-undercloud has joined #tripleo10:42
cd-undercloud************** overcloud complete status=1 ************10:42
*** cd-undercloud has quit IRC10:42
*** jrist has quit IRC10:52
*** jrist has joined #tripleo11:05
*** boris-42_ has joined #tripleo11:11
*** boris-42 has quit IRC11:11
StevenKlifeless: Woo. I owe Zhi Yan Liu at least one beer, possibly more.11:26
*** CaptTofu has joined #tripleo11:55
*** CaptTofu has quit IRC11:56
*** CaptTofu_ has joined #tripleo11:56
*** CaptTofu_ has quit IRC12:01
*** cd-undercloud has joined #tripleo12:03
cd-undercloud************** overcloud complete status=1 ************12:03
*** cd-undercloud has quit IRC12:03
*** CaptTofu has joined #tripleo12:14
*** CaptTofu has quit IRC13:12
*** nijaba_ has quit IRC13:18
*** nijaba has joined #tripleo13:18
*** nijaba has quit IRC13:18
*** nijaba has joined #tripleo13:18
*** cd-undercloud has joined #tripleo13:24
cd-undercloud************** overcloud complete status=1 ************13:24
*** cd-undercloud has quit IRC13:24
*** tserong has quit IRC13:33
*** tserong has joined #tripleo13:39
*** weshay has joined #tripleo14:12
*** nijaba has quit IRC14:28
*** noslzzp has joined #tripleo14:31
*** nijaba has joined #tripleo14:32
*** nijaba has quit IRC14:32
*** nijaba has joined #tripleo14:32
*** cd-undercloud has joined #tripleo14:45
cd-undercloud************** overcloud complete status=1 ************14:45
*** cd-undercloud has quit IRC14:45
mordredStevenK: neat. what did Zhi Yan Liu do?14:56
mordredo m g14:56
mordredyou got glance moved to testrepository?14:56
*** nijaba has quit IRC14:56
* mordred hands StevenK a beer too14:57
*** nijaba has joined #tripleo14:57
*** nijaba has quit IRC14:57
*** nijaba has joined #tripleo14:57
mordredare you tackling keystone too?14:57
SpamapSok, time to pull latest heat15:08
SpamapSinto undercloud15:08
SpamapSwe'll get fixes for 500 error handling and event-list will finally be sane again15:08
mordredwoot15:12
*** nijaba has quit IRC15:13
*** nijaba has joined #tripleo15:16
*** e0ne has joined #tripleo15:17
*** ftcjeff has joined #tripleo15:54
*** openstackgerrit has joined #tripleo15:55
*** ftcjeff has quit IRC16:04
*** edmund has joined #tripleo16:25
*** nijaba has quit IRC16:44
*** nijaba has joined #tripleo16:45
*** nijaba has quit IRC16:45
*** nijaba has joined #tripleo16:45
*** e0ne has quit IRC16:56
*** rlandy has quit IRC17:03
*** cd-undercloud has joined #tripleo17:09
cd-undercloud************** overcloud complete status=1 ************17:09
*** cd-undercloud has quit IRC17:09
*** tserong has quit IRC17:12
*** tserong has joined #tripleo17:13
*** mestery_ is now known as mestery17:19
*** d0ugal has joined #tripleo17:26
*** d0ugal has joined #tripleo17:26
openstackgerritRichard Su proposed a change to openstack/tripleo-image-elements: Add service restart on tgtd for Fedora  https://review.openstack.org/7215417:31
*** rwsu has quit IRC17:32
*** bauzas has joined #tripleo17:34
*** bauzas has quit IRC18:25
*** cd-undercloud has joined #tripleo18:33
cd-undercloud************** overcloud complete status=1 ************18:33
*** cd-undercloud has quit IRC18:33
lifelessoh joy, now ovs-vswitchd segfaults. what did I do?18:35
lifelessalso, morning18:35
*** dkehn__ has joined #tripleo19:03
*** dkehn__ has quit IRC19:05
*** dkehn__ has joined #tripleo19:05
*** dkehn_ has quit IRC19:05
*** dkehn has quit IRC19:06
*** dkehn has joined #tripleo19:07
*** tserong has quit IRC19:14
lifelessahahahahaha19:24
lifeless[    4.013765] init: mountall main process (210) killed by FPE signal19:24
lifelessGeneral error mounting filesystems.19:24
lifelessA maintenance shell will now be started.19:24
lifelessCONTROL-D will terminate this shell and reboot the system.19:24
lifelessroot@demo:~#19:24
*** tserong has joined #tripleo19:24
*** tserong has joined #tripleo19:24
*** d0ugal has quit IRC19:40
*** cd-undercloud has joined #tripleo19:55
cd-undercloud************** overcloud complete status=1 ************19:55
*** cd-undercloud has quit IRC19:55
SpamapSwtf19:58
SpamapSFPE?!!?19:58
SpamapSfor mountall?19:58
SpamapSlifeless: is this on one of our failed boxes?19:59
*** marun has joined #tripleo20:04
SpamapSpeoplemerge: sitting at computer for about 10 more minutes if you need help debugging20:08
*** e0ne has joined #tripleo20:09
lifelessSpamapS: thats on a vm inside the ci-overcloud20:16
lifelessSpamapS: whats more perplexing is this freaking gre tunnel not decapsulating /anything/ in my local vm based testbed20:17
lifelessnow its not20:17
lifelessit was working20:17
SpamapSlifeless: can you please explain again how to attach to the shared screens for ilos btw?20:17
SpamapSit is not explained in /home/shared20:17
SpamapSand I don't use screen, so I am about to throw things at it.20:17
lifelesshah20:18
lifelesssec20:18
SpamapSscreen -x ? screen -foad20:18
lifelessssh to the bastion20:18
lifelessscreen -Ax ilos{1,2}20:18
SpamapSalso I'm not listed in the acl in the screen_ilorc120:18
SpamapSso thats probably half the problem20:18
lifelessSpamapS: neither am I20:19
lifelessSpamapS: so try20:19
lifelessscreen -Ax ilos120:19
lifelessand see what happens20:19
SpamapSoh aclgrp users20:19
SpamapSn/m20:19
SpamapSno screen to be attached matching ilos120:19
SpamapSit has never worked20:19
lifelessSpamapS: try now20:21
SpamapSlifeless: we've been around this merry go round a few times20:21
SpamapSMust run suid root for multiuser support20:21
SpamapSit only works for the user running the screens20:21
lifelessSpamapS: /dont/ fiddle it20:21
SpamapShah no I still have burn marks from the last time20:22
lifelessSpamapS: no, it has worked, last time you 'fixed' it by breaking it.20:22
lifelessSpamapS: so you no touchy.20:22
SpamapSthis is why I don't use the shared screen btw20:22
lifelesssure, if it doesn't work thats a good reason not to use it20:22
SpamapSI'd love to fix it20:22
lifelessbut not to not fix it20:22
SpamapSbut I'm like ---> tmux holmes20:22
lifelessSpamapS: try now20:23
SpamapSsame20:24
lifelessSpamapS: whats the specific error ?20:24
SpamapSclint@bm-aw1az2-freecloud0001:/home/shared$ screen -Ax robertc/ilos120:24
SpamapSMust run suid root for multiuser support.20:24
SpamapS(without your username, "There is no screen to be attached matching ilos1")20:24
lifelesshmmm20:24
lifelessscreen is setuid already20:26
lifeless        6865.ilos2      (12/17/2013 01:29:01 AM)        (Multi, detached)20:26
lifeless        5187.ilos1      (11/22/2013 01:07:55 AM)        (Multi, detached)20:26
SpamapSlifeless: would it be terrible to just use a semaphore rather than use shared screens?20:27
lifelessthey are marked multiuser20:27
lifelessNg: around ?20:27
SpamapSscreen is not suid root btw20:27
lifelessno ?20:27
SpamapSsgid utmp20:27
SpamapSI believe that's a dpkg-reconfigure thing20:28
lifelesssudo dpkg-reconfigure screen -p low finds nowthing20:28
lifelessnothing20:28
*** CaptTofu has joined #tripleo20:29
SpamapSgah20:29
SpamapSok well I tried.. can I just ilo into freecloud 0030 and see why it can't pxe?20:29
lifelessDebian Modifications20:29
lifeless--------------------20:29
lifeless  * added Debian package maintenance files20:29
lifeless  * Use /var/run/screen as socket directory20:29
lifeless  * Make it set-gid "utmp" instead of setuid root20:29
lifeless] dpkg-statoverride --update --add root utmp 4755 /usr/bin/screen20:29
lifeless] chmod 0755 /var/run/screen20:29
SpamapSDebian brokeded it20:29
lifelessdebian, I love you. SOMETIMES.20:29
lifelessso, try now20:30
SpamapSworks20:32
SpamapSugh20:32
SpamapSif you ctrl-c the sleep the window is lost20:32
*** noslzzp has quit IRC20:32
lifelesshaha, yeah20:32
lifelessSpamapS: sec restarting it20:33
lifelessscreen -S ilos1 -c ilo_screenrc1 for reference20:34
SpamapSaye20:34
SpamapSAccess to session denied now20:35
lifelessorly20:36
lifelesstry now?20:36
SpamapSmind if I add 'defzombie rd' to the rcs?20:37
lifelessgimme a sec20:37
lifelessdid you manage to connect?20:37
SpamapSthat will just make the ctrl-c try again if an ilo exits20:37
SpamapSI did20:37
SpamapSthough.. fun.. 'ctrl-A d' cannot detach20:38
lifeless!20:38
lifelessok, I've just tweaked the rcs to get you access on startup20:38
lifelesshopefully20:38
lifelessmake your changes20:38
mordredSpamapS: tmux ftw20:41
lifelessbah20:42
lifelessI bet you like systemd too20:42
mordredSpamapS: btw - I tried spinning up a new raring node in hpcloud to host my IRC stuff (Which is on precise) and SOMETHING about the byobu/tmux config in raring is different from precise and the keybindings are all wonky now20:42
lifelessmordred: you use byobu?20:42
mordredas in, pg-up/pg-dn don't work anymore20:42
mordredlifeless: hell yes20:42
*** vkozhukalov has quit IRC20:42
mordredlifeless: screen is a pile of poop20:43
SpamapSbyobu is tmux now ;)20:43
lifelessmordred: byobu used to be based on screen... so thats a orthogonal answer.20:43
mordredscrews with all of my keybindings - so I avoided it until byobu came along and unbroke it20:43
SpamapSbut I have given up using it.. it wastes so much load20:43
lifelessmordred: I'm just surprised, since i didn't think I knew any byobu users.20:43
mordredme20:43
mordredquite aggressively20:43
lifelessexistence proof20:43
lifelesswho knew20:43
mordred:)20:43
SpamapSI've gotten used to the tmux keys20:44
mordredSpamapS: does tmux break existing keybindings out of the box like screen does?20:44
mordredctrl-a is the big one20:44
mordredthe asstastic fail of screen using that is amazing to me20:44
mordredI'm sorry20:44
lifelessmordred: emacs user?20:44
SpamapSthey use ctrl-b20:44
mordredemacs has been around for SO MANY MORE YEARS20:44
SpamapSwhich I don't use for anything20:44
mordredlifeless: well, yeah - also bash command line editing20:44
lifelessmordred: if you don't have vim mode turned on :P20:45
mordredtyping ctrl-a a  is stupid20:45
SpamapSscreen's upstream is, IIRC, completely dead20:45
mordredlifeless: I do not. I use emacs keybindings20:45
lifelessI should man up and switch to tmux at some point I guess20:45
lifelessI just haven't enjoyed my experiences with tmux so far20:45
mordredbecause they work exceptionally well - unless you're using screen, in which case the world breaks20:45
mordredwhat I like about my weechat-in-tmux-in-byobu experience is that all of the keys do what I expect them to and I did not do extra config anywhere20:46
lifelessok, so back to this openvswitch headf*ck20:46
*** bauzas has joined #tripleo20:46
mordred:)20:46
lifelessmordred: btw, you have mail (that work thread)20:46
lifelessmordred: we have confusion somewhere20:46
mordredlifeless: I'm sure we do20:46
lifelessplus I'm having to reboot the n-o-a every few hours on my undercloud or dhcp goes lalalalala20:49
lifelesssheese20:49
SpamapSlifeless: doesn't it just HUP dnsmasq a lot?20:50
SpamapSwait thats n-d-a20:50
mordredlifeless: responded20:50
SpamapSn-o-a runs ovs-vsctl all the time20:50
*** tserong has quit IRC20:51
lifelessSpamapS: yea, and its doing something that breaks the br-int -> tap glue20:51
lifelessnda HUPs dnsmasq, which has its own issue20:52
lifelessSpamapS: https://bugs.launchpad.net/neutron/+bug/127134420:52
uvirtbotLaunchpad bug 1271344 in tripleo "neutron-dhcp-agent doesn't hand out leases for recently used addresses" [Critical,Triaged]20:52
*** CaptTofu has quit IRC20:56
*** jrist has quit IRC20:58
lifelessmordred: reresponded :P20:58
SpamapSok.. back to Sunday21:00
mordredlifeless: I rereresponded - but I did so to Clint21:04
lifelessyah, I got the beep :P21:06
lifelessits very gentle, OWA.21:06
lifeless'behp'21:06
lifelessoh wow, bad bad bad rules on the compute node vswitch21:10
*** jrist has joined #tripleo21:10
lifelessthat was it21:12
lifelesssheeeese21:12
lifelesslooks like the ovs agent isn't resyncing properly after rabbit disconnects21:13
*** e0ne has quit IRC21:20
* lifeless stabs21:23
lifelessovercloud-novacompute1-bxq4qdzw7bif.novalocal21:23
*** tzumainn has joined #tripleo21:32
mordredany redhat people around who know where the heck haproxy is on a rhel install?21:42
*** Perfectknoppsand has joined #tripleo21:42
lifelesstzumainn probably does21:42
lifelessqemu slow21:43
lifelessthe good news I have migration working, but don't think its live yet21:43
lifelessfor all that its called 'live-migration'21:43
*** e0ne has joined #tripleo21:51
*** CaptTofu has joined #tripleo21:53
mordredlifeless: :)21:55
*** noslzzp has joined #tripleo21:58
*** CaptTofu has quit IRC21:59
SpamapS2014-02-09 22:08:43,071.071 1484 TRACE nova.openstack.common.threadgroup NovaException: Baremetal node: 32 has no available physical interface for virtual interface c6192a37-b6d9-4434-8bd5-739a6513cb4722:08
lifelessSpamapS: fun, new one on me22:09
SpamapSyeah...22:09
lifelessSpamapS: ci-overcloud ?22:09
SpamapSlifeless: looking now not sure22:09
SpamapSon cd-undercloud22:09
SpamapSnova-compute was showing 0 anything available22:09
SpamapSrestarted it and it started spewing this trace22:09
SpamapS| 3198f460-ebd6-43ed-8e8b-4aa9142dd624 | testenv-testenv3-wn42v7iwboxr                   | ACTIVE | None       | Running     | ctlplane=10.10.16.152, 10.10.16.156 |22:10
SpamapSthat's node 3222:10
*** cd-undercloud has joined #tripleo22:10
cd-undercloud************** overcloud complete status=1 ************22:10
*** cd-undercloud has quit IRC22:10
*** Perfectknoppsand has quit IRC22:12
SpamapSweird weird weird22:13
SpamapSneutron port c6192a37-b6d9-4434-8bd5-739a6513cb47 has mac 78:e7:d1:21:66:7622:13
SpamapSwhich is not for bm node 32, but for 422:13
SpamapSwhich is an unallocated machine22:13
SpamapSdoh22:17
*** hewbrocca has quit IRC22:17
lifelessmisregistered node ?22:17
SpamapStwo ports for 3198f460-ebd6-43ed-8e8b-4aa9142dd62422:17
SpamapSI already pasted it even22:17
SpamapS10.10.16.152 and 10.10.16.15622:18
SpamapSNo idea why nova wants to plug node 32 into that22:18
*** e0ne has quit IRC22:20
SpamapS2014-02-09 22:23:17,854.854 32448 WARNING nova.compute.manager [req-d6d585cc-eeb0-470a-b574-ab6d316bfd48 None None] Found 35 in the database and 27 on the hypervisor.22:23
lifelessSpamapS: I mean in nova baremetal's registry22:23
SpamapSlifeless: no the macs are right I re-checked against the info text22:23
SpamapSlifeless: and that node is working fine. nova_bm just tried to plug that port into the wrong instance from what I can tell.22:24
SpamapS2014-02-09 22:29:51,258.258 32448 INFO nova.compute.resource_tracker [-] Compute_service record updated for undercloud:fbe3a166-af9e-4caa-9008-09988efec77c22:30
SpamapS2014-02-09 22:29:51,259.259 32448 AUDIT nova.compute.resource_tracker [-] Auditing locally available compute resources22:30
SpamapS2014-02-09 22:29:51,307.307 32448 AUDIT nova.compute.resource_tracker [-] Free ram (MB): 022:30
SpamapSlifeless: seems like something else is broken..22:30
SpamapSlifeless: no baremetal resources are available22:31
SpamapS2014-02-09 22:29:51,425.425 32448 ERROR nova.network.neutronv2.api [-] *************************************************************************************************************foo {'tenant_id': u'4956c533154c476799c688eda7ed65ab', 'device_id': '6bd2573d-1e19-4baa-96ff-0ca240e1a074'}22:31
SpamapSalso .. that.. ??22:31
lifelessthe foo is debug cruft I added, ignore it22:31
SpamapSah ok22:31
SpamapSlifeless: so I deleted that port from neutron, as the machine that mac is associated with is unallocated. That made the weird trace go away.22:32
lifelessright22:33
lifelessonly one interface row in bm_interfaces. weird.22:34
lifelessSpamapS: you did see that I booted all the undercloud machines22:34
lifelessSpamapS: to find broken ones22:34
lifelessright ?22:34
SpamapSOh so there really are no available nodes?22:34
lifeless[nova boot --num-instances=....22:34
lifelesswell22:34
SpamapSI totally did not see that22:35
lifelessthere should be enough for the overcloud still22:35
lifelesscheck nova list:)22:35
SpamapSwe should drop retries to 0 and do that22:35
SpamapSbut yeah we can use nova-compute.log to find the borken ones22:35
SpamapSlifeless: ok so anyway, overcloud is still getting errors22:37
SpamapSlifeless: and the one I booted got an error22:37
lifelesswhat was the error ?22:37
SpamapSno available hosts22:37
SpamapSbut immediately22:37
SpamapSnot after pxe timeouts22:37
lifelessok so22:37
lifelessnova hypervisor-stats22:37
lifeless| count                | 35      |22:37
lifeless| running_vms          | 35      |22:37
SpamapSyeah22:38
lifeless nova baremetal-node-list | grep undercloud  | wc -l22:38
lifeless3522:38
lifelesswe have 5 marked bad22:38
lifelessnova baremetal-node-list | grep -v undercloud22:38
SpamapSok22:38
SpamapSso, do I delete two of the ACTIVE find brokens, and the overcloud, and then try the stack creation again?22:39
lifelessthe overcloud is 3 of the 3522:39
SpamapSERROR,ERROR22:39
SpamapS1 active only22:39
lifelessnova list| grep ACTIVE | wc -l22:39
lifeless2722:39
lifelessnova list| grep None | wc -l22:40
lifeless3522:40
lifelessthe ERRORS count against a machine it would seem22:40
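The `grep | wc -l` counting above can also be done in one pass over the table that `nova list` prints. The sketch below assumes the column order seen in the row pasted earlier (ID, Name, Status, Task State, Power State, Networks); the function name and the sample table are made up for illustration.

```python
from collections import Counter


def count_statuses(table_text):
    """Count instances per Status column in `nova list` style table output."""
    counts = Counter()
    for line in table_text.splitlines():
        line = line.strip()
        # Skip "+----+" border lines and anything that is not a table row.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header row and malformed rows.
        if len(cells) < 3 or cells[0] == "ID":
            continue
        counts[cells[2]] += 1
    return counts


# Made-up sample in the same shape as `nova list` output.
SAMPLE = """\
+------+--------------+--------+------------+-------------+----------+
| ID   | Name         | Status | Task State | Power State | Networks |
+------+--------------+--------+------------+-------------+----------+
| aaaa | findbroken-1 | ACTIVE | None       | Running     | ctlplane=10.0.0.1 |
| bbbb | findbroken-2 | ERROR  | None       | NOSTATE     |          |
| cccc | findbroken-3 | ACTIVE | None       | Running     | ctlplane=10.0.0.2 |
+------+--------------+--------+------------+-------------+----------+
"""

print(dict(count_statuses(SAMPLE)))
```

Comparing the sum of all statuses with `running_vms` from `nova hypervisor-stats` shows whether ERROR instances are still holding a node.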
SpamapSerrrrrrrrr22:40
lifelessnow, whether thats a bug or not, dunno.22:40
lifelessBut22:40
*** CaptTofu has joined #tripleo22:41
lifelessI would try this - delete an ERROR findbroken22:41
lifelessand try booting one machine with nova boot22:41
lifeless(once nova hypervisor-stats shows running_vms=34)22:41
lifelessSpamapS: whats your goal - make overcloud work ?22:42
SpamapSTIL 'nova hypervisor-stats'22:42
SpamapSlifeless: yes22:42
lifelessok22:42
lifelessso delete the overcloud stack22:42
lifelessand two of 'findbroken ACTIVE'22:42
lifelessthen boot --num-instances 522:42
lifelessthat should flush out the two broken ones from the overcloud22:42
SpamapSthats what I said, I'm setting booty traps22:43
lifelessthen delete 3 findbroken ACTIVE22:43
lifelessthen run tripleo-cd again22:43
lifelessyeah with you now; was being slow22:43
SpamapSlifeless: freecloud0016 looks like not getting PXE22:48
lifelessSpamapS: mac is right? I find tcpdump on the undercloud br-ctlplane is useful to see if the machines DHCP gets out22:49
*** jackrabbit has joined #tripleo22:50
*** jackrabbit has quit IRC22:50
lifelessstab22:51
lifeless2014-02-09 21:51:20.327 3459 ERROR nova.virt.disk.mount.nbd [req-288461c9-9645-47f4-9e62-da5355f0ffa2 554875e48592436e96725c48b070d737 f2dd987b34584852b6a824d32d290a1c] nbd module not loaded22:51
lifelessits not meant to be -trying-22:51
SpamapSnoooo22:51
SpamapSnbd: http://memecrunch.com/meme/26FC/why-won-t-you-die/image.png22:51
lifeless    def _allocate_nbd(self):22:52
lifeless        raise Exception("FOAD")22:52
lifelesscloud-init-nonet[73.79]: waiting 120 seconds for network device22:52
lifelesssheese22:52
SpamapSno I was wrong it is booing into deploy ramdisk22:53
SpamapSbooting even22:53
SpamapSlifeless: it used to wait __forever__22:54
SpamapShad a bit of a knock down drag out (but always friendly) with smoser over that22:54
lifelessSpamapS: my sheese wasn't at cloud-init22:54
SpamapSoh22:54
lifelessit was at this qemu cloud flakiness on networking22:54
lifelesswhich I'm slowly debugging22:55
SpamapSOH hahahaha I totally missed the FOAD22:55
SpamapShmmm... boot into deploy RD shows it just stalling while booting cpus22:55
SpamapSso might be a bad CPU22:55
lifelessahhhh22:55
lifelesswhy can't we have nice things?22:55
SpamapS[    0.591937] smpboot: Booting Node   0, Processors  #2222:55
lifelessso, mark bad?22:56
SpamapS[    0.578597] smpboot: Booting Node   1, Processors  #21 OK22:56
SpamapSYeah and lets get a ticket open22:56
lifelessfor that I'm opening a trello card, recording in -tab.txt and changing the service host in bm_nodes.22:56
lifelessand pinging someone that has cycles to liaise with -ops22:56
lifelesse.g. Ng (though I must read his docs now!)22:56
SpamapSYeah I think he did try to teach us all22:58
SpamapSlifeless: I'm being pulled back to sunday stuff..22:58
SpamapSfreecloud0016 .. bad... confirmed22:58
SpamapSwait no22:59
SpamapStextcons may just be confused22:59
SpamapSbecause it is pingable22:59
lifelessheh23:01
SpamapSweeeiirrdd23:01
SpamapSit worked23:02
SpamapSso....23:02
SpamapS-> Sunday23:02
SpamapSlifeless: test-clint1 is totally deletable23:02
SpamapSI may leave it a bit just bcuz23:02
lifelessssh on qemu sllooww23:05
*** e0ne has joined #tripleo23:12
*** CaptTofu has quit IRC23:18
lifelessomg23:21
lifelessnbd needs such heavy boots to avoid23:21
lifeless            if size:23:22
lifeless                disk.extend(target, size, use_cow=True)23:22
lifelessuses qemu-nbd23:22
*** noslzzp has quit IRC23:22
*** noslzzp has joined #tripleo23:24
*** cd-undercloud has joined #tripleo23:33
cd-undercloud************** overcloud complete status=1 ************23:33
*** cd-undercloud has quit IRC23:33
*** edmund has joined #tripleo23:37
*** jeremydei has quit IRC23:39
*** jeremydei has joined #tripleo23:42
greghayneslifeless: re the neutron rootwrap thing: the full venv path idea works for me23:48
lifelessgreghaynes: btw did you spot the typo in l3.filters?23:49
lifelessgreghaynes: I submitted a patch for it23:49
greghaynesoh, no23:49
lifelessneuton23:50
lifelessgreghaynes: that might possibly be the cause, I haven't checked though23:50
greghaynesah, ill check it out23:50
lifelessSpamapS: whats this about: cloud-init-nonet[70.26]: waiting 120 seconds for network device23:51
greghaynesif it works with that would be kind of.. odd23:51
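A typo like the one in l3.filters is easy to miss by eye. As a rough illustration, a small check can list rootwrap `CommandFilter` entries whose absolute executable path does not exist. This assumes the filter file is INI-style with a `[Filters]` section and `name: FilterClass, executable, user` entries, as in the line quoted earlier; the function name is hypothetical and this is not part of rootwrap itself.

```python
import configparser
import os


def missing_executables(filters_text, exists=os.path.exists):
    """Return (filter_name, path) for CommandFilter entries with a missing path.

    Only absolute paths are checked; bare command names are resolved by
    rootwrap through its configured exec dirs, which this sketch ignores.
    `exists` can be swapped out, which is mainly useful for testing.
    """
    parser = configparser.RawConfigParser()
    parser.read_string(filters_text)
    missing = []
    if not parser.has_section("Filters"):
        return missing
    for name, value in parser.items("Filters"):
        parts = [part.strip() for part in value.split(",")]
        if len(parts) < 2 or parts[0] != "CommandFilter":
            continue
        path = parts[1]
        if os.path.isabs(path) and not exists(path):
            missing.append((name, path))
    return missing
```

Run against each file under rootwrap.d, an entry such as the `neuton-ns-metadata-proxy` one would be reported on any host where only the correctly spelled binary is installed.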
lifelesscloud-init-nonet[78.65]: static networking is now up23:51
lifelessSpamapS: the instance has dhcp configured...23:51
lifelessand wtf23:52
lifelesslive-migrate reboots the instance.23:52
lifelessI don't know *where* to start23:52
lifelessoh23:53
lifeless--block-migration does not do what you might think23:53
