Wednesday, 2013-10-30

*** michchap has joined #tripleo00:05
*** jesusaurus has joined #tripleo00:07
*** michchap_ has joined #tripleo00:30
*** matsuhashi has joined #tripleo00:30
*** michchap has quit IRC00:33
*** cd-undercloud has joined #tripleo00:38
cd-undercloud************** overcloud complete status=1 ************00:38
*** cd-undercloud has quit IRC00:38
*** michchap_ has quit IRC00:44
lifelessstill dabroke00:47
*** michchap has joined #tripleo00:47
*** julim has quit IRC00:57
*** akuznetsov has quit IRC01:01
*** nosnos has joined #tripleo01:03
*** dprince has quit IRC01:04
*** krotscheck has quit IRC01:10
SpamapSlifeless: durn01:16
* SpamapS had to run to fetch kids01:16
*** sdake has joined #tripleo01:19
*** vipul is now known as vipul-away01:31
SpamapSovs-vsctl: 'list-ports' command takes at most 1 arguments (note that options must precede command names and follow a "--" argument)01:35
*** vipul-away is now known as vipul01:36
*** MarkAtwood has joined #tripleo01:42
*** MarkAtwood has quit IRC01:53
*** cd-undercloud has joined #tripleo02:02
cd-undercloud************** overcloud complete status=1 ************02:02
*** cd-undercloud has quit IRC02:02
*** spzala has quit IRC02:44
*** michchap has quit IRC02:48
*** anteaya has quit IRC03:00
*** ehelms is now known as ehelms-afk03:00
*** coolsvap has joined #tripleo03:13
SpamapSlifeless: you looking at this at all?03:16
SpamapSlifeless: | 56efc4c6-557c-443c-a9cf-9b5a698a725d |      | fa:16:3e:5a:cf:e8 | {"subnet_id": "18b96b8d-c989-4820-b330-0c0f7c19c8d5", "ip_address": "10.10.16.172"} |03:16
SpamapSlifeless: I can't figure out where that port comes from03:16
lifelessSpamapS: no, I've been learning ovs plumbing in #openstack03:17
SpamapSlifeless: and when I delete it, it is recreated03:17
lifelessSpamapS: so, start with -show03:17
lifeless| device_owner          | network:dhcp03:18
lifelessSpamapS: give you one guess03:18
SpamapSlifeless: ok, that makes sense.. ;)03:19
SpamapSCommand: ['sudo', 'ovs-vsctl', '--timeout=2', 'list-ports', 'br-int']03:19
SpamapSExit code: 14203:19
SpamapSso this seems like the "WTF" part03:19
SpamapSwhere the logs just don't match whatever is actually happening03:19
*** vipul has quit IRC03:20
*** vipul has joined #tripleo03:22
*** slagle_ has joined #tripleo03:24
*** slagle_ has quit IRC03:24
*** slagle_ has joined #tripleo03:25
*** cd-undercloud has joined #tripleo03:26
cd-undercloud************** overcloud complete status=1 ************03:26
*** cd-undercloud has quit IRC03:26
lifelessSpamapS: ok, so that command works by hand03:27
SpamapSlifeless: right, and it works as neutron the user as well03:29
SpamapSlifeless: 142 is also not an exit code of ovs-vsctl03:29
lifeless       0      Successful program execution.03:29
lifeless       1      Usage, syntax, or configuration file error.03:29
lifeless       2      The bridge argument to br-exists specified the name of a bridge that does not exist.03:29
lifelessgreat minds03:29
SpamapSbit of confusion here in how communicate() works I think03:33
SpamapS    _stdout, _stderr = (process_input and03:33
SpamapS                        obj.communicate(process_input) or03:33
SpamapS                        obj.communicate())03:33
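A minimal sketch of what the expression quoted above does, assuming `obj` is a `subprocess.Popen` object with a piped stdin. `run_with_input` and `echo_cmd` are hypothetical names for illustration, not the neutron helper. The and/or chain calls `communicate(process_input)` when input is supplied and `communicate()` otherwise; since `communicate(None)` sends nothing, one call can cover both cases.

```python
import subprocess
import sys


def run_with_input(cmd, process_input=None):
    """Run cmd, optionally feeding process_input (bytes) to its stdin."""
    obj = subprocess.Popen(cmd,
                           stdin=subprocess.PIPE,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
    # communicate(None) writes nothing and just closes stdin, so a single
    # call covers both branches of the "and ... or ..." idiom.
    stdout, stderr = obj.communicate(process_input)
    return obj.returncode, stdout, stderr


# Echo stdin back using the current interpreter so no external tool is needed.
echo_cmd = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read())"]
```

Calling `run_with_input(echo_cmd, b"hello")` feeds the bytes to the child, while `run_with_input(echo_cmd)` runs it with an empty stdin.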
SpamapSlifeless: turned on debug on neutron.conf03:35
SpamapSgood god that thing is chatty03:35
SpamapSlifeless: btw, that command is streaming as succeeding most of the time.03:36
SpamapSmakes you wonder if there isn't a libovsctl .. or if there should be one..03:36
SpamapSI guess sudo.. so.. would have to be daemonized or something.03:37
lifelessSpamapS: or dacls03:37
lifelessSpamapS: it works by opening sockets associated with the bridges03:38
lifelessSpamapS: just a group for it should work03:38
* SpamapS re-parses the log with -O0 ... must.. not .. optimize....03:38
lifelessSpamapS: that code looks ok; it either passes a stdin in or doesn't03:38
SpamapSlifeless: it's fine, but weird.03:39
SpamapS 1952 mysql     20   0 4578m 197m 8100 S    72  0.2   1807:04 mysqld03:49
SpamapSProbably would cut a few percent of total time if we just tuned mysql a tiny bit.03:49
openstackgerritClint Byrum proposed a change to openstack/tripleo-image-elements: Make register-state-path take multiple paths  https://review.openstack.org/5447003:52
openstackgerritClint Byrum proposed a change to openstack/tripleo-image-elements: Register o-a-c managed files with use-ephemeral  https://review.openstack.org/5447103:52
SpamapSlifeless: so I'm met with an interesting question regarding use-ephemeral.03:56
SpamapS(working on it while I wait for more clues w/ debug = True)03:56
SpamapSlifeless: the question is.. do I depend on use-ephemeral after I change, for example, cinder, to look for its config files in /mnt/etc/cinder?03:57
lifelessI think so03:57
lifelessI think it's fine for us to be able to build non-ephemeral base images03:58
lifelessbut for software setup to work with decoupled state storage, that's an explicit choice03:58
SpamapSit would work without use-ephemeral..03:58
SpamapSit would just be odd because it would have config files in /mnt for seemingly no good reason.03:58
lifelessI'm not confident it would always work03:59
lifelessI think it's easier to reason about if we do it rather than partially do it03:59
SpamapSok that tips my scale :)04:00
lifelessSpamapS: so when are you registering many files ?04:00
SpamapSlifeless: os-apply-config's thingy ;) (see next patch)04:00
lifelessSpamapS: huh04:00
lifelessSpamapS: I don't follow. use words.04:01
SpamapSlifeless: os-apply-config will automatically register all of the files it intends to write "not to /mnt"04:01
SpamapSwell not o-a-c, but the o-a-c element04:01
SpamapSlifeless: we can look at any files being registered that way as TODO's04:01
lifelessSpamapS: that doesn't make sense to me.04:02
lifelessSpamapS: the intent of register-state is that we register directories04:02
lifelessSpamapS: and only exceptionally register files04:02
lifelessSpamapS: doing what you are proposing will register many fiels04:03
lifelessbah04:03
lifelessmany files04:03
SpamapSlifeless: right, these are exceptions.04:03
SpamapSlifeless: these are elements that aren't actually done yet.04:03
lifelessSpamapS: and it's also wrong04:03
lifelessSpamapS: o-a-c may write files to /var/run04:03
SpamapS/var/run is tmpfs04:03
lifelessSpamapS: which should not be relocated.04:03
SpamapSalways04:03
lifelessSpamapS: yes, we both agree on that observation of reality :)04:04
SpamapSoh templated files are being thrown there?04:04
lifelessSpamapS: its a supported, intended use of oac04:04
lifelessSpamapS: IMO04:04
SpamapSWell let me take a step back because it is as much something to help me reason as it is an important change.04:05
*** vipul is now known as vipul-away04:05
lifelessSpamapS: arguably, and I just realised this now, things which are truly transient should perhaps go to /var/run, not /mnt, modulo size04:05
SpamapSPerhaps there isn't value in having a magical config file registering thing in the o-a-c element.04:05
SpamapSlifeless: config files in /var/run makes the reboot less predictable.04:06
SpamapSwe have to order everything to happen after occ04:06
lifelessSpamapS: yeah04:06
SpamapSwith them in /mnt the reboot at least has a chance at working right without a full occ run04:07
lifelessSpamapS: though correct behaviour may not be well defined until the metadata is re-read04:07
lifelessSpamapS: since we don't know how much has changed while we were powered off04:07
SpamapScorrect I agree, but predictable is more desirable than correct, IMO04:07
SpamapSthough I guess we can predict with them in /var/run that all will fail until we can reach metadata.04:08
lifelessSpamapS: it may be that it would be more correct and more predictable but not work as often :)04:08
SpamapShah right04:08
lifelessheh, jinx04:08
lifelessso04:08
lifelesslets put /v/r on the back burner04:08
SpamapSyeah its already stopped simmering :)04:08
lifelessI question the magic of having oac auto register04:08
lifelessI think it binds oac to use-ephemeral too tightly04:09
lifelessmaking it hard to use oac in a regular image04:09
lifelesslike a precious-snowflake long lived server environment04:09
SpamapSwell I made it optional ;)04:10
SpamapSif [ -x register-...] etc04:10
*** vipul-away is now known as vipul04:10
SpamapSbut I think now that I look at all the elements it would affect if use-ephemeral happens to be there.. side effects.. bad..04:10
lifelessright04:10
* SpamapS abandons this lazy, lazy change set :)04:11
lifelessright, work harder, not smarter :P04:11
SpamapSdoh04:12
* SpamapS wonders if a glass bird full of red water can complete this task..04:12
SpamapSno clues as to why we get a 142 error code in the debug-laden logs btw04:13
*** cwolferh has quit IRC04:14
lifelessSpamapS: http://openvswitch.org/pipermail/discuss/2010-November/004516.html04:14
lifelessSpamapS: timeout04:14
lifelessSpamapS: man page needs a bug filed on it04:15
SpamapSOh lovely04:15
SpamapSI was pondering reading the source04:15
lifelessdo we restart ovs in any orc scripts?04:17
lifeless'04:17
lifelessThe usual solution is to make sure that ovsdb-server and ovs-vswitchd04:17
lifelessare running.'04:17
lifeless24 core machine isn't going to be cpu bound ;)04:17
SpamapSno we don't restart them04:18
SpamapSwhat about parallelism causing a deadlock?04:19
SpamapSboth servers are being started at the same time04:19
lifelesswell04:19
lifelessthey should have been running for a month or so now04:19
lifelesspid 9293 and 929404:20
lifelessreasonably low04:20
lifelessbut04:20
SpamapSI restarted them while debugging04:20
lifelessoh04:20
SpamapSok now I get it04:23
SpamapS142 == SIGALRM04:23
SpamapSovs-vsctl should have installed a handler for SIGALRM before setting a timer04:23
lifelesswhy?04:24
lifelessthey're using the timer to die quickly04:24
SpamapS$ python -c 'import signal; import time; signal.alarm(1); time.sleep(2)' ; echo $?04:24
SpamapSAlarm clock04:24
SpamapS14204:24
lifelessyes...04:24
SpamapSI 'spose thats one way to go :)04:24
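The 142 above is the shell convention of 128 plus the signal number (SIGALRM is 14 on Linux). A small sketch of decoding such a status, assuming a POSIX system; `describe_exit` is a hypothetical helper name, and the child command mirrors the one-liner pasted above.

```python
import signal
import subprocess
import sys


def describe_exit(status):
    """Translate a shell-style exit status into a readable description."""
    if status > 128:
        try:
            return "killed by " + signal.Signals(status - 128).name
        except ValueError:
            pass  # not a known signal number; fall through
    return "exited with status %d" % status


# Child arms a 1-second alarm with no handler installed, then sleeps past it.
child = [sys.executable, "-c",
         "import signal, time; signal.alarm(1); time.sleep(2)"]
proc = subprocess.run(child)

# subprocess reports death-by-signal as a negative return code (-N);
# a shell would report the same event as 128 + N.
shell_status = 128 - proc.returncode if proc.returncode < 0 else proc.returncode
print(describe_exit(shell_status))
```

With no SIGALRM handler installed, the default action terminates the process, which is why the timeout surfaces as an exit status rather than as one of the documented ovs-vsctl exit codes.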
SpamapSanyway, I am highly suspicious that the timeout is either lock contention/deadlocking, or something calling ovs-vsctl, then not wait()'ing on it and it not cleaning up its locks.04:26
SpamapSI doubt it is performance related. :)04:26
lifelessso the timeout=204:27
SpamapSof course, with any good race, its possible we just started hitting it because of a change in cosmic rays. :-P04:27
lifelessI think thats to avoid blocking on ovs calls04:27
SpamapSlifeless: doesn't eventlet handle these things? :-P04:28
lifelessSpamapS: you don't want a large number of ovs subprocesses all re-querying the same stuff04:28
SpamapSbut no I get what you're saying04:28
SpamapSthreads don't want to wait a long time for ovs-vsctl04:28
*** cd-undercloud has joined #tripleo04:31
cd-undercloud************** overcloud complete status=0 ************04:31
*** cd-undercloud has quit IRC04:31
lifelessSpamapS: hey so04:31
lifelessssh: connect to host review.openstack.org port 29418: Network is unreachable04:31
lifelessSpamapS: behind the HP firewall.04:31
lifelesswhats the answer ?04:31
SpamapSlifeless: bounce04:32
lifelessSpamapS: I need a detailed answer for a colleague in (I think) bangalore04:32
SpamapSlifeless: I used tsocks to bounce off my IRC bastion04:32
SpamapSlifeless: it was discussed on the openstack-sig list a few weeks ago04:32
SpamapSlifeless: you can do a .ssh/config thing to bounce of any external host and use nc04:32
SpamapSbounce off I should say04:33
SpamapSlifeless: Also I think somebody pointed out an internal SOCKS5 proxy04:33
SpamapS2013-10-30 03:40:56.028 4699 DEBUG neutron.agent.linux.utils [-]04:33
SpamapSCommand: ['sudo', 'ovs-vsctl', '--timeout=2', '--format=json', '--', '--columns=name,external_ids', 'list', 'Interface']04:33
SpamapS2013-10-30 03:41:00.548 4699 DEBUG neutron.agent.linux.utils [-]04:34
SpamapSCommand: ['sudo', 'ovs-vsctl', '--timeout=2', 'list-ports', 'br-int']04:34
SpamapS< 2 seconds apart04:34
lifelessright04:34
lifelessbut how long does it take to execute04:34
SpamapSalso the logging happens after the return code ...04:34
SpamapShm though I'd hope all the readonly commands are shared locks04:35
lifelessyou'd hope04:36
*** ccrouch is now known as ccrouch-afk04:36
lifelessSpamapS: is this whats breaking us ?04:45
SpamapSlifeless: looking now04:51
SpamapSneed to take a break.. but I am suspicious that the locking method is exclusive.04:54
lifelessSpamapS: I mean, is this neutron error correlated or causative of the deploy failures04:55
lifelessheat event-list overcloud04:55
lifeless+---------------------+-------+----------04:55
lifeless...04:55
lifeless| CompletionCondition | 43724 | state changed          | CREATE_COMPLETE    | 2013-10-30T04:27:27Z |04:55
lifeless27 minutes04:56
lifelesswait_for 190 10 stack-ready overcloud04:56
lifeless31m04:57
*** vipul is now known as vipul-away05:02
*** rpodolyaka1 has joined #tripleo05:06
rpodolyaka1morning all05:10
*** rushiagr has joined #tripleo05:10
*** cwolferh has joined #tripleo05:16
*** jpeeler has quit IRC05:16
*** jpeeler has joined #tripleo05:19
*** vipul-away is now known as vipul05:19
*** cwolferh has quit IRC05:30
*** cd-undercloud has joined #tripleo05:34
cd-undercloud************** overcloud complete status=0 ************05:34
*** cd-undercloud has quit IRC05:34
lifelessSpamapS: ^05:36
*** akuznetsov has joined #tripleo05:38
*** akuznetsov has quit IRC05:41
*** sdake has quit IRC05:46
*** svapneel has joined #tripleo05:49
*** coolsvap has quit IRC05:50
*** svapneel has quit IRC05:50
rushiagrhi friends05:50
*** coolsvap has joined #tripleo05:50
rushiagrthe command from devtest_seed doc page:05:51
rushiagrexport SEED_MACS=$(create-nodes $NODE_CPU $NODE_MEM $NODE_DISK $NODE_ARCH 1)05:51
rushiagrdoesn't return any MAC address, that is, nothing is assigned to SEED_MAC05:51
rushiagroh wait! it is seed_mac's'05:51
rushiagrah okay, my mistake. Never mind05:51
*** akuznetsov has joined #tripleo05:56
*** akuznetsov has quit IRC06:05
SpamapSlifeless: working now.. I wonder if debug logging changed the race winner06:12
lifelessSpamapS: I would so hate that06:13
SpamapSI've had it happen a few times before.06:13
SpamapSor times where ptracing a program made it work06:14
lifelessme too06:14
lifelessI still hate it06:14
*** blamar has quit IRC06:14
*** rpodolyaka1 has quit IRC06:17
SpamapSlifeless: ok so list-ports timeout may be a red herring06:19
*** cd-undercloud has joined #tripleo06:25
cd-undercloud************** overcloud complete status=2 ************06:25
*** cd-undercloud has quit IRC06:25
lifelessoh, a 2?!06:25
SpamapS<attribute 'message' of 'exceptions.BaseException' objects> (HTTP Unable to establish connection to http://138.35.77.4:306:26
SpamapS5357/v2.0/OS-KSADM/services)06:26
SpamapSusage: keystone endpoint-create [--region <endpoint-region>] --service <service> [--publicurl <public-url>] [--adminurl <admin-url>] [--internalurl <internal-url>]06:26
SpamapSkeystone endpoint-create: error: argument --service/--service-id/--service_id: expected one argument06:26
SpamapSlifeless: unfortunately I just ran out of gas06:26
SpamapSlifeless: but good luck. :-P06:26
lifelessthanks06:28
*** martyntaylor has joined #tripleo06:34
lifelessNg: when you arrive, https://review.openstack.org/#/c/54385/06:37
lifelessNg: and https://review.openstack.org/#/c/54121/06:37
*** marios_ has quit IRC06:45
*** marios has joined #tripleo06:46
openstackgerritA change was merged to openstack/tripleo-image-elements: Rename heat_watch_server_url to watch_server_url  https://review.openstack.org/5434706:54
rushiagrrpodolyaka: around?06:58
*** rdopieralski has joined #tripleo06:58
openstackgerritA change was merged to openstack/tripleo-heat-templates: Rename heat_watch_server_url to watch_server_url  https://review.openstack.org/5434807:04
*** vipul is now known as vipul-away07:06
rushiagrso the undercloud VM creation via heat fails07:11
rushiagrit is probably due to the recent bug in heat which rpodolyaka talked about yesterday07:12
*** cd-undercloud has joined #tripleo07:17
cd-undercloud************** overcloud complete status=0 ************07:17
*** cd-undercloud has quit IRC07:17
*** jcoufal has joined #tripleo07:38
*** viktors has joined #tripleo07:48
*** jprovazn has joined #tripleo07:54
*** lsmola_ has joined #tripleo08:00
rushiagrsry, had to go. So I just wanted to ask if I need to put only the integers in quotes or also the IP addresses?08:03
*** jtomasek has joined #tripleo08:06
rushiagrit will be a bonus if someone can tell me where to check the logs for this failure. There is obviously no screen for logs08:07
GheRiveromorning tripleo!08:08
lifelessrushiagr: on ubuntu, logs are in /var/log/upstart/08:08
rushiagrlifeless: thanks! I'm on Ubuntu08:09
*** cd-undercloud has joined #tripleo08:13
cd-undercloud************** overcloud complete status=0 ************08:13
*** cd-undercloud has quit IRC08:13
*** pblaho has joined #tripleo08:18
* rpodolyaka has been improving his plumber skills08:31
rpodolyakarushiagr: pong08:31
Ngmorning08:57
*** ifarkas has joined #tripleo08:57
rpodolyakao/08:57
*** ifarkas has quit IRC08:57
*** ifarkas has joined #tripleo08:58
GheRiveroo/08:58
openstackgerritA change was merged to openstack/tripleo-incubator: Allow Neutron to schedule networks lazily  https://review.openstack.org/5412108:59
openstackgerritA change was merged to openstack/tripleo-incubator: Respect storage pool path in create-nodes script  https://review.openstack.org/5438508:59
*** matsuhashi has quit IRC09:03
*** matsuhashi has joined #tripleo09:03
*** derekh has joined #tripleo09:05
*** jistr has joined #tripleo09:06
*** matsuhas_ has joined #tripleo09:08
*** toci-bot has joined #tripleo09:08
toci-botERROR during toci run, see http://54.228.118.193/toci/toci_logs_qr8eldR/09:08
*** toci-bot has quit IRC09:08
*** matsuhashi has quit IRC09:08
*** cd-undercloud has joined #tripleo09:23
cd-undercloud************** overcloud complete status=1 ************09:23
*** cd-undercloud has quit IRC09:23
*** CaptTofu has quit IRC09:24
ifarkashey all, I made a couple of attempts to set up TripleO following the devtest guide, but the 'heat stack-create undercloud-vm' step is consistently failing for me. Here's a log from seed: http://pastebin.com/qYrRmRXM I got Node undercloud-notcompute not found. Does anyone know how to fix this?09:25
*** lucasagomes has joined #tripleo09:32
* rpodolyaka looking09:34
*** che-arne has joined #tripleo09:34
rushiagrrpodolyaka: oops, I was in a meeting09:35
rpodolyakaifarkas: could you show output of 'nova baremetal-node-list' on seed vm (and then baremetal-node-show)?09:35
rushiagrrpodolyaka: do I need to put the IP addresses in quotes too? no?09:36
ifarkasrpodolyaka, sure: http://pastebin.com/FRmVsDs009:37
ifarkasrpodolyaka, I guess the node for undercloud is missing, right?09:37
rpodolyakaifarkas: for some reason the baremetal node is not available09:38
rpodolyakarushiagr: what IP addresses you are talking about? could you please refer to the particular step in devtest guide?09:39
ifarkasrpodolyaka, oh, I messed it up. It's on my machine, not in the seed vm09:39
ifarkasrpodolyaka, sorry, a sec...09:39
rushiagrrpodolyaka: I know that I need to add integers in quotes for the heat template. I wanted to know if I need to put IP addresses in quotes too09:40
ifarkasrpodolyaka, right, so it's the same in the vm too09:40
rpodolyakarushiagr: I suppose, you are talking about the bug in Heat. But it's fixed now. You can just start going through the devtest guide again and everything must be ok now09:41
rushiagrrpodolyaka: ah09:41
rpodolyakarushiagr: but you definitely should not quote all integers in heat templates! ;)09:41
rushiagrrpodolyaka: I dont want to do so many steps again :(09:41
rushiagrrpodolyaka: okay. Slightly noob with heat templates and yaml in general :)09:42
rpodolyakarushiagr: what step are you on?09:42
rushiagrhttp://docs.openstack.org/developer/tripleo-incubator/devtest_undercloud.html09:42
rushiagrrpodolyaka: devtest undercloud step 409:42
rpodolyakaifarkas: nova baremetal-node-show 1 ? (I'm interested in the MAC address)09:43
ifarkasrpodolyaka, http://pastebin.com/ffTq5Q9J09:44
rpodolyakarushiagr: ok, so you don't need to quote either integers or IP addresses. Could you log in to seed vm (ssh root@192.0.2.1), cd to /opt/stack/heat and do "git show"?09:45
rpodolyakaifarkas: rushiagr: sorry guys, will be on a call for the next 15 minutes09:46
rpodolyakabut stay tuned :)09:46
rushiagrrpodolyaka: thats fine, thanks ! :)09:46
ifarkasrpodolyaka, ok, thanks ;-)09:46
rushiagryes I can do that. If you wanted to know the commit hash, it is 614e196f01d1e675d7e512bdc66de10a3a8c8e40 -- done on 28th october09:47
rushiagrifarkas: how to get the logs?09:49
rushiagrifarkas: I'm stuck at the same point actually, the heat stack creation is failing09:49
ifarkasrushiagr, I used 'journalctl -abf|grep -v neutron|tee logfile' on fedora09:50
ifarkasrushiagr, what os are you using for the images?09:51
rushiagrifarkas: Ubuntu09:51
ifarkasrushiagr, you can try eg /var/log/upstart/nova-compute.log09:52
ifarkasrushiagr, or different files for different services09:52
rushiagrifarkas: I don't have nova, or heat, or any openstack project named files in that folder09:59
rushiagris something wrong with my setup?09:59
*** boris-42_ is now known as boris-4210:00
rushiagrI'm trying in my host base OS, and not inside any VM10:00
* rpodolyaka is back10:02
rpodolyakaifarkas: can you show me "virsh dumpxml seed | grep mac" on the host? and "grep power_driver /etc/nova/nova.conf" on seed vm?10:02
rpodolyakarushiagr: oops, this is bad, as this effectively means you have a broken heat10:03
rpodolyakarushiagr: but you can easily overcome that bug10:03
rpodolyakarushiagr: you can use my patch to tripleo-heat-templates instead Ie5c6f766283173e62fe057675279f17baf3dd2f010:04
rushiagrrpodolyaka: http://paste.openstack.org/show/50195/10:06
rushiagrrpodolyaka: these are the changes I made, and tried again, but the stack creation still failed10:06
rpodolyakarushiagr: go to $TRIPLEO_ROOT , cd tripleo-heat-templates and do git review -d Ie5c6f766283173e62fe057675279f17baf3dd2f010:06
rushiagrrpodolyaka: is your patch upstream?10:06
rushiagrrpodolyaka: ah, okay10:07
rpodolyakarushiagr: no, and it mustn't be, as this is a bug in Heat10:07
* rushiagr needs to first set up his machine for the git review process. He got a new laptop last week10:07
rushiagrrpodolyaka: okay, thanks. I'll try that10:08
rpodolyakarushiagr: which is already fixed, so this is just an easy way to overcome this bug, so you wouldn't need to start going through devtest again or upgrade Heat on seed vm10:08
derekhlifeless: looks like ovsdb-server and ovs-vswitchd were restarted last night, did it help?10:17
derekhlifeless: I might restart ovs-controller as well so all three get done10:18
rushiagrgit review is taking too much time, I don't know why :/10:20
rpodolyakarushiagr: you can do  "git fetch https://review.openstack.org/openstack/tripleo-heat-templates refs/changes/22/54322/1 && git checkout FETCH_HEAD" instead10:21
rushiagrrpodolyaka: oh yeah, the link on the review page.. I should've done that before you told me..10:22
rushiagrrpodolyaka: thanks10:22
rushiagrouch, it still fails10:24
rushiagrI need to check logs now..10:24
rushiagrrpodolyaka (or anyone else if rpodolyaka is bored of me :P ): can you please guide me to the logs as to where I should check this?10:25
rpodolyakarushiagr: on ubuntu you should look for logs in /var/log/upstart dir10:26
rpodolyakarushiagr:  there is a separate log for each service running10:26
rushiagrrpodolyaka: http://paste.openstack.org/show/50196/10:27
rushiagrrpodolyaka: what are the relevant services here? or am I missing some services?10:27
rushiagrdir is /var/log/upstart10:27
rpodolyakarushiagr: this seems to be the output from your local machine, not seed vm :)10:28
rushiagrrpodolyaka: ahh, okay10:28
rpodolyakarushiagr: I was talking about the machines we run in devtest guide10:28
rushiagrrpodolyaka: the heat command is to be run in the host machine right? or I goofed this part up completely?10:29
rushiagrhost machine=local machine10:29
rpodolyakarushiagr: yep, all devtest guide commands are run on your machine10:29
rushiagrrpodolyaka: okay10:29
* rushiagr heaves a sigh of relief10:30
rpodolyakarushiagr: it's the OS_* vars that show which machine you are referring to10:30
*** jtomasek has quit IRC10:30
rpodolyakarushiagr: so once you did "source seedrc", heat event-show will be executed for seed vm, source undercloudrc for undercloud vm and so on10:30
rushiagrrpodolyaka: yea, and that should be, and is the seed machine for me10:31
rpodolyakarushiagr: but to see logs, of course, you would need to login to a VM over ssh10:31
rpodolyakarushiagr: on seed vm, you should see the log files for a bunch of openstack services running10:33
*** cd-undercloud has joined #tripleo10:33
cd-undercloud************** overcloud complete status=1 ************10:33
*** cd-undercloud has quit IRC10:33
rushiagrrpodolyaka: yea, I can log into the VM10:33
rushiagrrpodolyaka: yes, I am getting the files now. Looking at it10:33
ifarkasrpodolyaka, sorry, I am back too10:36
ifarkasrpodolyaka, http://pastebin.com/t1RH4Qcd10:36
*** jtomasek has joined #tripleo10:37
*** jtomasek has quit IRC10:37
*** jtomasek has joined #tripleo10:38
rpodolyakaifarkas: oh, I'm sorry, I meant baremetal_0. could you please do "virsh dumpxml baremetal | grep mac" instead?10:39
rpodolyaka*baremetal_010:39
ifarkasrpodolyaka, http://pastebin.com/ndYYe7gu10:43
*** jtomasek has quit IRC10:45
rpodolyakaifarkas: hmm, looks correct. Does your ~/.ssh/authorized_keys on your host contain the ssh key of seed vm?10:47
rpodolyakaifarkas: it ends with  "boot-stack key for use with nova VirtualPowerDriver"10:47
ifarkasrpodolyaka, yes, I have that line10:48
rpodolyakaifarkas: ok, so let's try to boot a baremetal node manually to see if there are any issues with this10:50
rpodolyakaifarkas: so, source seedrc ; nova list10:50
rpodolyakaifarkas: are there any nova instances on the list?10:51
ifarkasrpodolyaka, yes, there is one: http://pastebin.com/FX730PN4 heat is trying to delete the failed stack for more than an hour10:51
rpodolyakaifarkas: can you delete it by hand?10:52
ifarkasrpodolyaka, if I try nova delete, it still stays there.10:53
ifarkasrpodolyaka, or is there any other method to delete it?10:54
rpodolyakaifarkas: can you check nova logs for errors?10:54
*** jtomasek has joined #tripleo10:54
*** markmc has joined #tripleo10:54
ifarkasrpodolyaka, it's weird, I have the very same stacktrace as I had when I was trying to create the undercloud stack: http://pastebin.com/raw.php?i=Kk4fD2UR10:56
*** matsuhas_ has quit IRC11:01
*** matsuhashi has joined #tripleo11:02
rpodolyakaifarkas: hmm, this is strange (I'm checking the source code of virtpowermanager)11:04
*** toci-bot has joined #tripleo11:04
toci-botERROR during toci run, see http://54.228.118.193/toci/toci_logs_bp4sLgG/11:04
*** toci-bot has quit IRC11:04
rpodolyakaifarkas: so we call check_for_node() which returns True (libvirt domain is found), then we call is_power_on() which also calls  check_for_node() and this time it can't find the domain11:05
jprovaznifarkas: # cat /etc/profile.d/virsh.sh11:07
jprovaznexport LIBVIRT_DEFAULT_URI=qemu:///system11:07
*** slagle_ has quit IRC11:08
*** matsuhashi has quit IRC11:11
rushiagrrpodolyaka: okay, I remember I messed up with 'setup-baremetal' step -- I ran it again without supplying any parameters to it11:21
*** ehelms-afk is now known as ehelms11:22
rushiagrthe heat logs says no baremetal flavor exist, which is true in fact11:22
rushiagralso, although virt-manager shows baremetal VM, it is not listed when I do a 'nova list'11:22
*** cd-undercloud has joined #tripleo11:27
cd-undercloud************** overcloud complete status=0 ************11:27
*** cd-undercloud has quit IRC11:27
rushiagrOK, I manually created a baremetal flavor, and now trying again11:27
rushiagrrpodolyaka: another error: http://paste.openstack.org/show/50207/ :(11:34
ifarkasrpodolyaka, I was able to boot the undercloud vm with the help of jprovazn11:36
ifarkasrpodolyaka, the issue was caused by setting LIBVIRT_DEFAULT_URI=qemu:///system in tripleorc while the default uri is qemu:///session11:36
ifarkasrpodolyaka, jprovazn, thanks for your help11:37
*** CaptTofu has joined #tripleo11:37
jprovaznifarkas: great, np11:37
*** akuznetsov has joined #tripleo11:39
rushiagrany help with this folks? http://pastebin.com/raw.php?i=Kk4fD2UR11:43
*** akuznetsov has quit IRC11:43
rushiagrthis happened when I did a heat stack-create in step 4 of devtest_undercloud11:43
GheRiverorushiagr: virsh list --all11:45
rushiagr Id    Name                           State11:45
*** ehelms is now known as ehelms-afk11:45
rushiagr----------------------------------------------------11:45
rushiagr 1     seed                           running11:45
rushiagr 2     baremetal_0                    running11:45
rushiagrGheRivero: ^11:45
*** akuznetsov has joined #tripleo11:47
rpodolyakaifarkas: hey, but this is the first step in devtest guide ;) (export LIBVIRT_DEFAULT_URI=${LIBVIRT_DEFAULT_URI:-"qemu:///system"})11:49
*** akuznetsov has quit IRC11:49
rpodolyakarushiagr: can you show nova logs? (nova-scheduler and nova-compute)11:50
rushiagrrpodolyaka: sure11:50
ifarkasrpodolyaka, yes, I did that and it caused the issue because the default for me is qemu:///session11:51
GheRiverothe problem when changing the LIBVIRT uri is that when the nova driver connects via ssh, uses your user default configuration, not the one exported11:52
GheRivero/home/ghe/.config/libvirt11:52
rpodolyakaifarkas: ohh, I missed the "using  existing variable value" part11:52
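A short sketch of the "use the existing variable value" behaviour of `${LIBVIRT_DEFAULT_URI:-"qemu:///system"}`, written in Python for illustration; `resolve_libvirt_uri` is a hypothetical helper, not part of devtest. The `:-` form only applies the default when the variable is unset or empty, so a value already set in the environment (here qemu:///session) wins.

```python
import os


def resolve_libvirt_uri(environ=None, default="qemu:///system"):
    """Mimic the shell expansion ${LIBVIRT_DEFAULT_URI:-"qemu:///system"}."""
    if environ is None:
        environ = os.environ
    # ":-" falls back when the variable is unset OR empty, hence "or".
    return environ.get("LIBVIRT_DEFAULT_URI") or default


# An existing value is kept; the default applies only when nothing is set.
print(resolve_libvirt_uri({"LIBVIRT_DEFAULT_URI": "qemu:///session"}))
print(resolve_libvirt_uri({}))
```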
rpodolyakabtw, does anybody know what 'session' means here?11:53
GheRiveroecho uri_default = 'qemu:///system > ~/.config/libvirt/libvirt.conf11:53
*** morazi has joined #tripleo11:53
GheRiveroecho uri_default = 'qemu:///system' > ~/.config/libvirt/libvirt.conf11:53
GheRiveromising quote11:53
rpodolyakaI've seen two different uris so far - one for qemu and one for virtualbox, but 'session' is new to me :)11:53
ifarkasrpodolyaka, jprovazn also told me that setting the uri to qemu:///system wouldn't work either because it has different auth mechanism which is incompatible with devtest11:55
GheRiverosession is per user, and system is global (kind of)11:55
openstackgerritTomas Sedovic proposed a change to openstack/tripleo-image-elements: Add Horizon element  https://review.openstack.org/5091811:55
rushiagrrpodolyaka: http://paste.openstack.org/show/50210/11:55
rpodolyakaGheRivero: thanks for clarifying!11:56
rushiagrrpodolyaka: this is the error from nova-compute11:56
ifarkasrpodolyaka, so the solution which worked for me is to set the uri in /etc/profile.d/virsh.sh11:56
rushiagrnova sched has no problems here I guess11:56
rpodolyakarushiagr:  do you have sshd server running on your host?11:56
*** akuznetsov has joined #tripleo11:57
rpodolyakarushiagr: it's needed for VirtualPowerDriver11:57
rushiagrrpodolyaka: how to check that? I didnt install anything apart from what is mentioned in the docs11:58
rpodolyakaifarkas: cool! my libvirt-foo is not that good :)11:58
rpodolyakarushiagr: debian/fedora based distro (on your host)?11:58
ifarkasrpodolyaka, hehe, no worries. I really appreciate your help11:59
GheRiverorushiagr: do you have an ssh key created and added to the .ssh/authorized_keys? (this is done by the install-dependencies script)11:59
rpodolyakaifarkas: np11:59
rushiagrall ubuntu here11:59
rushiagrGheRivero: yes, I checked and the keys were there already11:59
rushiagrGheRivero: on the local machine and seed node both12:00
*** akuznetsov has quit IRC12:00
rpodolyakarushiagr: ps -ef | grep sshd      to check if it's running12:00
rpodolyakarushiagr: aptitude show openssh-server   to check if it's installed12:00
rpodolyakarushiagr: if not, install it12:00
*** ccrouch-afk is now known as ccrouch12:00
*** matsuhashi has joined #tripleo12:02
rushiagrrpodolyaka: owh, not installed. Installing12:03
*** akuznetsov has joined #tripleo12:05
*** akuznetsov has quit IRC12:12
*** lucasagomes is now known as lucas-lunch12:12
openstackgerritA change was merged to openstack/tuskar-ui: Refactor the cached api calls into @cached_property  https://review.openstack.org/5381512:12
*** dprince has joined #tripleo12:14
*** ehelms-afk is now known as ehelms12:17
*** akuznetsov has joined #tripleo12:17
*** slagle has quit IRC12:19
*** jpeeler has quit IRC12:19
*** jpeeler has joined #tripleo12:19
*** akuznetsov has quit IRC12:25
*** nosnos has quit IRC12:27
*** jdob has joined #tripleo12:27
*** cd-undercloud has joined #tripleo12:27
cd-undercloud************** overcloud complete status=0 ************12:27
*** cd-undercloud has quit IRC12:27
*** akuznetsov has joined #tripleo12:29
rushiagrI ran again, and it threw the error that 'deploy_ramdisk' and one more parameter were not passed to the baremetal thingy, which again made me realise the hacks I did when I messed up the 'setup-baremetal' step12:30
rpodolyakarushiagr: what hacks? :)12:30
rushiagr(I think) I fixed those hacks, and ran the heat stack-create again. Hopefully it will work this time.. the stack creation is taking some time, so at least something is right this time :)12:31
rushiagrrpodolyaka: I accidentally ran 'nova flavor-delete baremetal' line from 'setup-baremetal' script, and when I realized my error, reran that script12:32
rushiagrrpodolyaka: but that threw some error. So now I ran all the commands of that script except the 'register-nodes' line, and it seems it is working so far :)12:33
rushiagrthe stack is still 'create-in-progress'12:33
rushiagrrpodolyaka: does this always take this much time, or should I start exploring for issues?12:34
*** jeckersb is now known as jeckersb_gone12:34
rpodolyakarushiagr: no, it's ok12:34
rpodolyakarushiagr: you can watch your VM booting via VNC12:34
rushiagrn00b with VNC :)12:34
rpodolyakarushiagr: do you have virt-manager installed? (GUI for libvirt)12:34
rushiagryes, I do have virt manager12:35
rpodolyakathen run it and double-click your VM12:35
rushiagrwoah! CREATE COMPLETE12:35
rushiagrrpodolyaka: you coming to summit? I owe you a beer now :)12:36
rpodolyakarushiagr: yeah, I'm coming :)12:36
*** akuznetsov has quit IRC12:36
rushiagrrpodolyaka: will meet you there :)12:36
rpodolyakarushiagr: sure, I'd like to meet all of you guys in person and grab a couple of beers ;)12:37
rushiagrhehe, that way you'd be too much drunk to attend the summit :-P12:37
rpodolyaka:)12:37
*** akuznetsov has joined #tripleo12:41
*** CaptTofu has quit IRC12:44
*** CaptTofu has joined #tripleo12:45
*** julim has joined #tripleo12:46
*** akuznetsov has quit IRC12:48
*** akuznetsov has joined #tripleo12:52
*** slagle has joined #tripleo12:56
*** akuznetsov has quit IRC12:58
*** toci-bot has joined #tripleo12:59
toci-botERROR during toci run, see http://54.228.118.193/toci/toci_logs_j70RDAy/12:59
*** toci-bot has quit IRC12:59
*** akuznetsov has joined #tripleo13:01
*** anteaya has joined #tripleo13:02
*** akuznetsov has quit IRC13:02
*** jayg|g0n3 is now known as jayg13:07
*** matsuhashi has quit IRC13:07
*** matsuhashi has joined #tripleo13:08
*** matsuhas_ has joined #tripleo13:09
*** matsuhashi has quit IRC13:10
openstackgerritVictor Sergeyev proposed a change to openstack/python-tuskarclient: Fix response code 301 processing  https://review.openstack.org/5453613:12
*** jeckersb_gone is now known as jeckersb13:14
*** lucas-lunch is now known as lucasagomes13:15
*** tzumainn has joined #tripleo13:19
*** cd-undercloud has joined #tripleo13:25
cd-undercloud************** overcloud complete status=0 ************13:25
*** cd-undercloud has quit IRC13:25
rpodolyakaNg: hey, are you around?13:27
Ngrpodolyaka: hey13:34
rpodolyakaNg: I need your advice on version numbering. So this "Semantic Versioning" doc you showed me is definitely useful, but we don't seem to be following it yet. So I'm unsure if I should make MINOR releases instead of PATCH ones13:36
rpodolyakaNg: e.g. os-collect-config, Clint changed the default polling interval13:36
rpodolyakaNg: this is not a bug-fix, so it should be a MINOR release. Am I right?13:37
Ngrpodolyaka: I would tend to agree, and that we've been doing it wrong so far13:37
Ngwe're changing things enough that almost every release would need to indicate that a bunch of stuff has changed13:37
rpodolyakayep :)13:38
rbradyhaving a problem adding baremetal nodes.  "The server has either erred or is incapable of performing the requested operation." for a post request, get request works fine: http://paste.openstack.org/show/50218/13:38
*** slagle has quit IRC13:38
*** pblaho has quit IRC13:43
rushiagrrpodolyaka: any suggestion on reducing memory of the two baremetal nodes which are created after undercloud VM is up and running?13:44
rushiagrI see with two VMs my lappy filled up 5 out of 8 GBs of RAM13:44
rushiagrrpodolyaka: can I make both of them 1GB?13:44
rpodolyakarushiagr: I'm afraid, not13:45
rpodolyaka*no13:45
rushiagrrpodolyaka: if I understand correctly, these are the overcloud nodes which are baremetally registered with undercloud13:45
rushiagrrpodolyaka: see point 2 of http://docs.openstack.org/developer/tripleo-incubator/devtest_end.html13:45
rushiagrit says 1GB is probably enough13:45
rushiagris this info outdated?13:46
rpodolyakarushiagr: yep. overcloud notcompute nodes runs a bunch of openstack services + mysql + rabbitmq13:46
rpodolyakarushiagr: let  me check my overcloud13:46
dkehnmorning all13:48
rushiagrrpodolyaka: sure13:49
*** slagle has joined #tripleo13:50
rpodolyakarushiagr: you know, it actually might work. My control node uses ~750 MBs, and compute one - ~660 MBs13:50
rushiagrrpodolyaka: thats great, so I'll make 800 and 700 megs for those13:51
rpodolyakarushiagr: 32-bit of course, though it's still risky :)13:51
rpodolyakarushiagr: I would go for 1GB13:51
rpodolyakarushiagr: and you don't know which one of those will be control node13:52
rpodolyakarushiagr:  it's up to scheduler to decide and told Nova you had 2 baremetal nodes with 2GB of RAM each13:52
rushiagrrpodolyaka: yeah, I just discovered that..there is no differentiation yet available in the scripts13:52
rpodolyaka*you told13:52
rpodolyakaby executing setup-baremetal13:52
*** jeckersb is now known as jeckersb_gone13:53
rushiagrI have 3 gig free as of now. So I think I can go with 1G.13:53
* rpodolyaka used to kill google-chrome when going through devtest guide13:54
*** CaptTofu has quit IRC13:55
*** CaptTofu has joined #tripleo13:55
*** john-n-seattle has quit IRC13:58
*** CaptTofu has quit IRC13:58
Nghttp://www.mirantis.com/blog/introducing-rubick/ interesting13:58
*** jeckersb_gone is now known as jeckersb13:59
*** CaptTofu has joined #tripleo13:59
*** toci-bot has joined #tripleo14:00
toci-botERROR during toci run, see http://54.228.118.193/toci/toci_logs_arM0ug5/14:00
*** toci-bot has quit IRC14:00
*** CaptTofu has quit IRC14:04
*** CaptTofu has joined #tripleo14:04
*** CaptTofu has quit IRC14:08
*** CaptTofu has joined #tripleo14:09
*** rushiagr has quit IRC14:09
*** jergerber has joined #tripleo14:09
*** pblaho has joined #tripleo14:11
*** devanand1 is now known as devananda14:11
*** coolsvap has quit IRC14:12
rpodolyakaNg: could you restore https://review.openstack.org/#/c/52206 please?14:20
Ngrpodolyaka: done14:21
rpodolyakaNg: thanks!14:21
*** matsuhas_ has quit IRC14:25
*** matsuhashi has joined #tripleo14:25
rpodolyakaNg: do you mind if I update your patches to openstack-infra/config (if guys leave some comments)? :)14:27
*** matsuhashi has quit IRC14:30
Ngrdopieralski: not at all :)14:30
Ngerr14:30
Ngrpodolyaka: not at all :)14:30
rpodolyakaok :)14:31
*** cd-undercloud has joined #tripleo14:35
cd-undercloud************** overcloud complete status=1 ************14:35
*** cd-undercloud has quit IRC14:35
*** edmund has joined #tripleo14:45
openstackgerritVictor Sergeyev proposed a change to openstack/python-tuskarclient: Fix response code 301 processing  https://review.openstack.org/5453614:50
openstackgerritTzu-Mainn Chen proposed a change to openstack/tuskar-ui: Stop appending rc name to rc flavors  https://review.openstack.org/5455214:51
openstackgerritVictor Sergeyev proposed a change to openstack/python-tuskarclient: Fix response code 301 processing  https://review.openstack.org/5453614:52
NobodyCamGood morning TripleO14:54
shadowermarios: did you talk to anyone about moderating the session you proposed? http://icehousedesignsummit.sched.org/event/8ffcdc04e6a5db58fae60ff3d1374cca14:56
shadowermarios: you're not going in HK iirc14:56
*** spzala has joined #tripleo14:57
mariosshadower: no and no. so actually when i was asked to write something i got the impression it was for you, cos you were away those days15:03
mariosshadower: you're welcome :)15:04
* shadower wanted to wait for someone else to volunteer15:06
* shadower is a lazy bastard in other news15:06
shadowermarios: okay, I'll create the etherpad and moderate the session15:06
mariosshadower: ha. well 3 of them were squashed together right? so martyntaylor and jcoufal will be around anyway15:06
shadowerya15:07
jcoufalshadower: marios's session was squeezed to ours15:07
jcoufaland it's impossible to cover all topics in one session15:08
jcoufalso I will hit that topic a bit but not in big details :(15:08
*** edmund has quit IRC15:08
shadowerjcoufal: ah okay fair enough. There will be an ironic or ceilometer session dedicated to discovery and hardware metrics anyway iirc15:08
shadowerjcoufal: fill you take care of the etherpad, then?15:09
shadower*will15:09
jcoufalshadower: so it was me and martyn who took the session15:09
jcoufalshadower: it's the etherpad I sent you the link at night15:09
*** edmund has joined #tripleo15:10
shadowerjcoufal: huh, I can't find it. Can you send it to me again pls?15:10
jcoufalshadower: will do15:10
jcoufalshadower: done15:11
shadowerjcoufal: I'm an idiot, didn't realise that's what you meant. thanks15:12
jcoufalshadower: np ;)15:12
shadowerok, I added it to the etherpads wiki15:13
jcoufalshadower: noooooo15:14
jcoufal:D15:14
*** UtahDave has joined #tripleo15:15
jcoufalshadower: not finished yet, will add the etherpad a bit later15:16
shadowersorry15:17
jcoufalshadower: np ;)15:17
openstackgerritA change was merged to openstack/tuskar-ui: Stop appending rc name to rc flavors  https://review.openstack.org/5455215:17
*** CaptTofu has quit IRC15:17
*** CaptTofu has joined #tripleo15:18
openstackgerritRadomir Dopieralski proposed a change to openstack/tuskar-ui: Rename all node variables to tuskar_node or baremetal_node  https://review.openstack.org/5456415:18
*** lsmola_ has quit IRC15:22
*** dddtest_5a9fe has joined #tripleo15:22
*** jistr is now known as jistr|afk15:23
*** jprovazn has quit IRC15:25
*** jcoufal is now known as jcoufal|afk15:25
*** MarkAtwood has joined #tripleo15:29
openstackgerritRadomir Dopieralski proposed a change to openstack/tuskar-ui: Rename all node variables to tuskar_node or baremetal_node  https://review.openstack.org/5456415:32
martyntaylorjcoufal|afk: shadower what was the outcome of the design session proposal?15:37
slagledoes the nova-baremetal-deploy-helper service just deploy the disk images to the iscsi targets in serial?15:39
slaglee.g., for the 2 node overcloud, does it wait for the first to finish completely before starting the other?15:40
slagleit seems like that's how it works, just wondering if anyone can confirm15:40
*** rdopieralski has quit IRC15:46
*** blamar has joined #tripleo15:48
shadowermartyntaylor: based on this http://icehousedesignsummit.sched.org/event/8ffcdc04e6a5db58fae60ff3d1374cca the sessions proposed by you and marios will take place on Tuesday from 3:40 to 4:20 pm15:52
martyntaylorshadower: yeah I was asking about the conversation between you and jcoufal|afk15:53
martyntaylorshadower: the sessions are merged, myself and jcoufal|afk met up yesterday to talk about how we can pull them together.  We have an etherpad (that I think he sent you)15:53
martyntaylorshadower: are you thinking of contributing to the session?15:53
rpodolyakaslagle: I can't confirm this, but it seems you are right15:55
*** pblaho has quit IRC15:55
*** jprovazn has joined #tripleo15:57
*** edmund has quit IRC16:00
shadowermartyntaylor: yea why not (if you guys want). I'll be there anyways16:00
shadowermartyntaylor: got to go now, we can discuss it tomorrow16:00
martyntaylorshadower: sure.  I'll be around pretty early, just drop me a ping, catch you later16:02
*** sdake has joined #tripleo16:08
*** jistr|afk is now known as jistr16:13
*** jcoufal|afk is now known as jcoufal16:13
*** slagle has quit IRC16:26
*** slagle has joined #tripleo16:27
*** rushiagr has joined #tripleo16:32
jcoufallifeless: ping16:32
jcoufalmartyntaylor: ping16:33
openstackgerritVictor Sergeyev proposed a change to openstack/python-tuskarclient: Fix response code 301 processing  https://review.openstack.org/5453616:39
*** bauzas has quit IRC16:46
*** jcoufal has quit IRC16:47
SpamapSderekh: do you reason that restarting ovs-controller fixed it?16:54
*** coolsvap has joined #tripleo16:54
derekhSpamapS: I didn't restart it because the next run after that message succeeded16:55
SpamapSderekh: seems like we had a couple of successful deploys in there after you did that.16:55
SpamapSoh16:55
SpamapSI'm going to try a little tuning of mysql at this point. It is going ridiculously slow, and stats show that it is doing a lot of extra work due to lack of innodb buffer pool memory16:55
SpamapSderekh: I have a theory that we're getting deadlocked or just lock-lagged by concurrent ovs-vsctl calls16:56
derekhthe ovs db doesn't use mysql does it? but ya if mysql is running slow tuning would be worth a try16:57
derekhSpamapS: hmm, I can't seem to ssh in anymore16:57
*** CaptTofu has quit IRC16:58
SpamapSme neither16:58
*** CaptTofu has joined #tripleo16:58
SpamapSderekh: almost feels like a firewall issue. I can ssh to other boxes on that vlan and arp responds, but TCP is dead17:01
* SpamapS will try ilo console17:01
derekhok17:01
SpamapSbox is up.. one thing I see are _thousands_ of TIME_WAIT connections to 127.0.0.1:969617:06
SpamapSWhich is neutron-server17:06
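A rough sketch (not from the channel) of how the TIME_WAIT pile-up described above could be counted on Linux by parsing the text of `/proc/net/tcp`. The function name `count_time_wait` is hypothetical. The sketch assumes IPv4, where 127.0.0.1 appears as the hex string `0100007F`, port 9696 appears as `25E0`, and state `06` means TIME_WAIT; it counts only sockets whose remote end is 127.0.0.1:9696.

```python
TIME_WAIT = "06"  # TCP state code for TIME_WAIT in /proc/net/tcp


def count_time_wait(proc_net_tcp_text, remote_ip_hex="0100007F", remote_port=9696):
    """Count TIME_WAIT sockets whose remote address matches ip:port.

    proc_net_tcp_text is the full text of /proc/net/tcp (IPv4 only).
    remote_ip_hex is the address as the kernel prints it (little-endian hex).
    """
    wanted = "%s:%04X" % (remote_ip_hex, remote_port)
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields: sl, local_address, rem_address, st, ...
        if fields[2] == wanted and fields[3] == TIME_WAIT:
            count += 1
    return count


# Example on a live box:
# with open("/proc/net/tcp") as f:
#     print(count_time_wait(f.read()))
```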
dkehnSpamapS: when running the devtest.sh to setup my env using the ml2, I'm waiting for undercloud node to configure br-ctlplane. Is there a good log to monitor on the seed to watch for an error?17:06
SpamapSdkehn: /var/log/upstart/os-collect-config.log17:07
SpamapSoh...17:07
SpamapShrm17:08
SpamapSNg: around? I think our undercloud box got partitioned off vlan2517:09
NgSpamapS: hey17:10
SpamapSNg: only traffic I see is 17:09:41.284821 ARP, Request who-has 138.35.77.1 tell 138.35.77.3, length 2817:10
SpamapSor something we did borked the vlan config17:11
SpamapSNg: btw I'm on the box with conman17:13
NgSpamapS: huh, weird, but yeah, from the bastion I can only ping a handful of the public IPs17:14
Ng3, to be exact17:16
Ng13, 16, 1717:16
SpamapSNg: ok so does this feel like something is just wrong in our vlan?17:16
SpamapSNg: I'm only deferring to you because I don't want to wade into JIRA.. ;)17:16
NgSpamapS: I'm making a note here that you're making me wade into jira ;)17:17
SpamapS"and if you don't show us your weapons, we will be very very angry with you, and write you a very angry letter." -- Hans Blix17:17
NgI would say this definitely earns a trello card17:18
Ngit seems unlikely that this is something we've done, since it's very wide ranging and includes grizzly POC machines17:18
*** SpamapS changes topic to "CRITICAL: vlan trouble and bug #1245852 | Using OpenStack to deploy OpenStack; meetings Tuesday 1900 UTC in #openstack-meeting-alt"17:19
SpamapSNg: I got bumped off my ssh connection sometime between 0700 UTC and 1700 UTC17:20
SpamapSI think derekh could narrow that down even more17:20
SpamapSderekh: ^^ when were you last on the box rougly?17:20
SpamapSroughly even17:20
*** jistr has quit IRC17:21
derekhlogged in around 9:30 UTC and ssh was connected for around 3 hours I would say17:22
NgSpamapS: filed as NET-290717:22
derekhI'm pretty sure I logged myself out though so I'd guess problem started sometime in the last 3 or 4 hours17:23
derekhSpamapS: Ng ^17:23
Ng"The user "clint.byrum@hp.com" does not have permission to view this issue. This user will not be added to the watch list."17:23
NgI hate you jira17:23
* Ng releases gojira to lizard-smash jira17:23
derekhSpamapS: btw when I was on earlier it looked like the problems running ovs-vsctl were gone, so it looks like something ye did last night made the problem vanish17:25
*** vipul-away is now known as vipul17:27
*** jsomara has joined #tripleo17:27
jsomaratzumainn, ahh!17:27
jsomaratzumainn, my VPN is dead17:27
tzumainnjsomara, oh no!17:29
tzumainnwhat does the 's' stand for?17:29
jsomarasuper17:29
tzumainnI don't see that17:30
SpamapSderekh: right, but then we got a =117:30
SpamapS07:35 < cd-undercloud> ************** overcloud complete status=1 ************17:30
SpamapSthat is 14:35 UTC17:30
SpamapSNg: ^^ that is the last time we saw a UTC message17:31
SpamapSerr17:31
SpamapSIRC message17:31
NgSpamapS: ta17:31
* Ng adds that as a comment17:31
* SpamapS looks through tripleo-cd.log for indicators of having a working internet connection17:32
SpamapSNg: did we still have a "move some cable" ticket open? Perhaps somebody moved the wrong cable. ;)17:34
*** cwolferh has joined #tripleo17:37
jsomaratzumainn, i am trying to figure out how to get back on17:39
jsomaratzumainn, the self service portal is broken17:39
NgSpamapS: we do, but those tickets are all going nowhere fast17:39
NgI'll check if there are any updates, but usually they stall out at people even finding the machines ;)17:40
openstackgerritChris Krelle proposed a change to openstack/tripleo-image-elements: Add Ironic elements  https://review.openstack.org/4450017:40
*** cwolferh_ has joined #tripleo17:40
*** derekh has quit IRC17:41
SpamapSNg: I'm adding the card to trello.. unless you are already doing that?17:42
*** cwolferh has quit IRC17:42
*** lsmola_ has joined #tripleo17:42
NgSpamapS: I was not already doing that, so thank you :)17:43
SpamapSNg: you are assigned, and I have nothing else to action on this but please do feel free to put my contact info into the tickets so they can reach me if you aren't around.17:47
*** jsomara has quit IRC17:47
*** CaptTofu has quit IRC17:48
*** CaptTofu has joined #tripleo17:48
*** cody-somerville has quit IRC17:49
*** rushiagr has quit IRC17:52
*** krotscheck has joined #tripleo17:54
*** cody-somerville has joined #tripleo18:04
*** cody-somerville has joined #tripleo18:04
* Ng dinners18:07
NgSpamapS: I'll check in with jira a bit later18:07
SpamapSNg: thanks for driving that. :)18:07
lifelessmorning18:09
*** spzala has quit IRC18:11
*** edmund has joined #tripleo18:15
*** rpodolyaka1 has joined #tripleo18:21
*** markmc has quit IRC18:30
*** jprovazn has quit IRC18:38
Nglifeless: see backscroll, the freecloud public vlan is borked18:49
*** CaptTofu has quit IRC18:50
*** CaptTofu has joined #tripleo18:51
*** jcoufal has joined #tripleo18:51
lifelessNg: waah18:51
lifelessNg: ETA ?18:52
lifelessNg: I can ssh into the bastion18:52
Nglifeless: unknown, I've filed NET-2907 as of 1717UTC, but no response on it so far18:52
NgI filed it as Major. I have the option to upgrade that to Critical and then Blocker, but since this is non-production stuff I was a bit wary of doing either of those18:53
lifelessNg: what is the basis for us saying it's down ?18:53
Nglifeless: from the bastion host we can only ping three other public IPs from the 40 machines18:53
lifelessNg: we should only be able to ping 418:54
lifelessNg: or maybe three18:54
lifelessNg: most will be off18:54
lifelessNg: and the ips are nearly all neutron managed now18:54
lifelessNg: have we checked via ilo that expected machines are on ?18:55
Nglifeless: hmm, interesting. SpamapS was connected to serial on the undercloud machine and couldn't talk to anything else, which is how this was noticed18:55
lifelessNg: root@undercloud-notcompute-jws3awlsb2kh:~#18:55
lifelessNg: via ilo18:55
lifeless        Current message level: 0x00000014 (20)18:55
lifeless                               link ifdown18:55
lifeless        Link detected: yes18:55
lifelessSpamapS: Ng: so , the ipmi addresses are not vlanned18:56
lifelessSpamapS: Ng: they also seem down to me18:56
lifelessI'm going to bounce ye old firmware18:56
lifelessand loong hand on rmmod18:56
lifelessyup18:57
lifelesscan ping ipmi now18:57
lifelesscan ping poc.tripleo.org now18:57
Ngbah18:57
Ngok, I'll ditch the jira ticket then18:58
lifelessflakey undercloud machine is flakey18:58
lifelessNg: well, we have something fundamentally wrong with this machine18:58
lifelessNg: but yeah, the network ticket is 'wolf'18:58
SpamapSlifeless: ah so this is the same eth problem?19:08
SpamapSlifeless: could it just be a bad nIC?19:09
SpamapSNIC19:09
dkehnlifeless: I'm trying to get the ml2 plugin working and I seem to be failing at "Waiting for undercloud node to configure br-ctlplane", and wondering if there is a good way to go about debugging it19:11
*** lucasagomes is now known as lucas-dinner19:11
dkehnlifeless: the otcompute    | 8  | Error: Creation of server undercloud-notcompute-ok5srebdchw6 failed., and the only other thing that I really see is from the ovs-vswitchd.log19:11
dkehnlifeless: netdev_linux|INFO|ioctl(SIOCGIFHWADDR) on tap5913dcfd-b9 device failed: No such device from the log19:12
dkehnlifeless: any thoughts19:12
*** vipul is now known as vipul-away19:25
*** vipul-away is now known as vipul19:25
*** akuznetsov has joined #tripleo19:27
*** akuznetsov has quit IRC19:35
*** martyntaylor has quit IRC19:40
*** martyntaylor1 has joined #tripleo19:40
lifelessSpamapS: its the same problem, yes19:40
lifelessSpamapS: I don't know. Perhaps.19:40
lifelessdkehn: log into the undercloud vm with stack:stack and look at /var/log/upstart/os-collect-config.log19:41
dkehnlifeless: since the undercloud didn't come up, looking at the seed's os-collect-config.log19:43
lifelessdkehn: no19:45
dkehnlifeless: ok19:45
lifelessdkehn: log into the undercloud vm19:45
lifelessdkehn: as I said, I understood the scenario you gave, and the debug instructions I offered are precisely what's needed :)19:45
dkehnlifeless: that ip is not pingable19:45
lifelessdkehn: use the virt-manager console19:45
lifelessdkehn: you often can't ssh in when something goes wrong, e.g. failures to talk to the metadata service.19:46
openstackgerritClint Byrum proposed a change to openstack/diskimage-builder: Do not run apt-get update in offline mode  https://review.openstack.org/5461919:46
lifelessdkehn: so console access should be your go-to thing19:46
dkehnlifeless: using  virsh the only vm running is the seed19:46
dkehnlifeless: assuming that baremetal0 is undercloud19:46
lifelessdkehn: ok19:47
lifelessdkehn: try the deploy again, watching virt-manager to see what gets powered on19:47
lifelessand watch the console of the one that gets powered on19:47
*** spzala has joined #tripleo19:47
lifelessdkehn: the ovs agent error with a random tap device may be relevant, but we need to bisect back to find out how far through it's getting19:48
dkehnlifeless: yepper, reboot probably in order as well and if that fails I'll back out the change to see what we get19:48
dkehnlifeless: ok thanks19:48
lifelessdkehn: another thing you could do is look at the nova-baremetal-deploy-helper log and see if it got pinged and tried a deploy19:48
dkehnlifeless: k19:49
dkehnlifeless: wheres that log located?19:50
lifeless /var/log/upstart/19:51
dkehnlifeless: ok, got me not on the seed upstart, at least on mine19:52
lifelessdkehn: nova-baremetal-deploy-helper.log ? If it's not there, then it hasn't logged anything, which means it never got pinged by the node.19:52
dkehnlifeless: good to know19:53
lifelessdkehn: check for a ....1.gz file too ?19:53
lifelessdkehn: if there is a .1.gz it just means the log got rotated, so you can check it19:53
dkehnlifeless: in upstart nope19:53
lifelessdkehn: if there isn't, it never output.. so19:53
lifelessdkehn: we know that PXE boot of the deploy image failed.19:53
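The check lifeless walks through above (look for the live log, then for a rotated copy such as `.1.gz`) can be sketched as a small helper. This is an illustrative sketch only; the function name `find_service_logs` is hypothetical, and the `/var/log/upstart` directory and service name in the usage comment are taken from the conversation.

```python
import os


def find_service_logs(log_dir, service):
    """Return the names of the live and rotated log files for a service.

    An empty list means the service never wrote a log in log_dir.
    Matches '<service>.log' plus rotated copies like '<service>.log.1.gz'.
    """
    live = service + ".log"
    if not os.path.isdir(log_dir):
        return []
    return sorted(
        name
        for name in os.listdir(log_dir)
        if name == live or name.startswith(live + ".")
    )


# Example on the seed:
# find_service_logs("/var/log/upstart", "nova-baremetal-deploy-helper")
```

An empty result matches the conclusion drawn in the channel: the deploy helper was never contacted, so the failure happened at or before PXE boot of the deploy image.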
SpamapSlifeless: looking at /proc/interrupts, it looks like all eth2 interrupts are being handled on CPU0. I wonder.. irqbalance...?19:58
lifelessSpamapS: it's installed19:58
lifelessii  irqbalance          1.0.3-1ubuntu2 amd619:59
SpamapSah and doing nothing19:59
lifelessps fax | grep irqb19:59
lifeless 1911 ?        Ss     3:32 /usr/sbin/irqbalance19:59
lifeless#Should irqbalance be enabled?19:59
lifelessENABLED="1"19:59
lifeless#Balance the IRQs only once?19:59
lifelessONESHOT="0"19:59
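A hedged sketch (not from the channel) of how the observation "all eth2 interrupts are being handled on CPU0" could be checked programmatically by summing the per-CPU columns of `/proc/interrupts` for one device. The function name `irq_counts_per_cpu` is hypothetical, and the parser assumes the usual layout: a header row of CPU names, then one row per IRQ with one count per CPU followed by a description that contains the device name.

```python
def irq_counts_per_cpu(interrupts_text, device):
    """Sum interrupt counts per CPU for rows whose description mentions device.

    interrupts_text is the full text of /proc/interrupts.
    Returns a list with one total per CPU column in the header.
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = [0] * ncpu
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < ncpu + 2:
            continue  # summary rows such as ERR: carry fewer columns
        counts = fields[1:1 + ncpu]
        label = " ".join(fields[1 + ncpu:])
        if device in label and all(c.isdigit() for c in counts):
            totals = [t + int(c) for t, c in zip(totals, counts)]
    return totals


# Example on a live box:
# with open("/proc/interrupts") as f:
#     print(irq_counts_per_cpu(f.read(), "eth2"))
```

A result where only the first entry is non-zero would correspond to what SpamapS describes.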
lifelessSpamapS: you're in the heat meeting right?20:03
lifelessSpamapS: perhaps we can have a sync() call after that20:03
*** dprince has quit IRC20:06
*** MarkAtwood has quit IRC20:07
rpodolyaka1lifeless: Hey! Ng greatly helped me today to dive into releasing process. This is what I've got so far (https://docs.google.com/document/d/1O5M1vB5v_o4jHUo6BUvEFfIVKUo5kF0OnYe1DeLcnEw/edit?usp=sharing). I was following the rules of Semantic Versioning (http://semver.org/) when picking up new version numbers. Not sure if we have been using it properly so far. Could you and SpamapS take a look and comment when you have a free minute?20:11
lifelessrpodolyaka1: looking20:14
*** morazi is now known as morazi-afk20:14
lifelessrpodolyaka1: I still owe you a brain dump too :)20:14
lifelesspleia2: and you wanted some guidance? Sorry about yesterday, twas one of those days20:14
lifelessrpodolyaka1: don't wait for friday (in fact friday is perhaps worst day to release on; if there are issues, noone around to fix)20:15
pleia2lifeless: no worries, I left some notes in the etherpad for you as I started working on understanding the o-r-c stuff20:16
lifelessrpodolyaka1: bumping minor is fine IMO. Though note that 0.x.y releases have no compat guarantees at all20:16
lifelessrpodolyaka1: so semver rules don't require bumping 0.x ever, strictly speaking.20:16
lifelessrpodolyaka1: I think perhaps we should treat it as 0.major.minor before we hit 1.minor.patch20:17
lifelessrpodolyaka1: that is, for 0.x.y releases, bump the x digit if we have made an incompatible change and the y digit if we are still compatible20:17
lifelessrpodolyaka1: by my reading of semver this isn't prohibited, but we'd want to document that somewhere20:17
lifelessrpodolyaka1: what do you think?20:18
rpodolyaka1lifeless: treating versions as 0.major.minor makes sense IMO20:20
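The 0.major.minor convention proposed above can be sketched as a tiny helper. This is an illustration only; the function name `bump_pre_1_0` is hypothetical, and resetting the last digit to 0 when the middle digit is bumped is an assumption borrowed from the usual Semantic Versioning practice, not something stated in the channel.

```python
def bump_pre_1_0(version, incompatible):
    """Bump a 0.x.y version following the 0.major.minor convention.

    incompatible=True  -> bump x (and reset y to 0, an assumed convention)
    incompatible=False -> bump y
    """
    major, x, y = (int(part) for part in version.split("."))
    if major != 0:
        raise ValueError("this rule only covers 0.x.y versions")
    if incompatible:
        return "0.%d.0" % (x + 1)
    return "0.%d.%d" % (x, y + 1)


# A backwards-compatible change starting from 0.1.4:
# bump_pre_1_0("0.1.4", incompatible=False)
# An incompatible change starting from 0.1.4:
# bump_pre_1_0("0.1.4", incompatible=True)
```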
rpodolyaka1lifeless: e.g. tripleo-image-elements. We switched seed vm to neutron-dhcp-agent. So I should bump X in 0.X.Y, right?20:22
*** martyntaylor1 has left #tripleo20:22
lifelessrpodolyaka1: I think so20:23
lifelesssince thats arguably incompatible20:23
rpodolyaka1lifeless: cool, that was my understanding too :)20:23
rpodolyaka1ok, so I should fix os-collect-config version then20:24
rpodolyaka1as it's totally backwards compatible20:24
lifelessyeah20:24
lifelessalso perhaps include old, new version20:24
rpodolyaka1though it's kind of a feature, not a bug-fix20:24
lifelessso folk don't need to check pypi to see what the current one is20:24
lifelessI've added you to -prl20:25
lifeless-ptl20:25
*** cwolferh_ has quit IRC20:25
rpodolyaka1lifeless: what do you mean by "perhaps include old, new version"?20:25
lifelessin your doc you say 0.2.020:25
rpodolyaka1yep20:25
lifelesswhat was the old version?20:25
rpodolyaka10.1.420:26
lifelessthats what I mean20:26
rpodolyaka1oh :)20:26
*** CaptTofu has quit IRC20:26
rpodolyaka1yeah, that makes sense20:26
*** CaptTofu has joined #tripleo20:26
*** cd-undercloud has joined #tripleo20:26
cd-undercloud************** overcloud complete status=0 ************20:26
*** cd-undercloud has quit IRC20:26
rpodolyaka1\o/20:27
rpodolyaka1night all20:49
*** rpodolyaka1 has quit IRC20:54
*** marun has joined #tripleo20:58
*** lsmola_ has quit IRC21:04
*** spzala has quit IRC21:05
*** dddtest_5a9fe has quit IRC21:11
*** cd-undercloud has joined #tripleo21:20
cd-undercloud************** overcloud complete status=0 ************21:20
*** cd-undercloud has quit IRC21:20
*** jergerber has quit IRC21:25
SpamapSlifeless: so, I have to wonder if the ovs issues were in part related to the NIC problems21:28
SpamapSlifeless: if we get 4 successes in a row I'll close bug 1245852 and move the card to done21:28
*** ehelms is now known as ehelms-afk21:32
lifelessSpamapS: first thing I did when it started glitching was a firmware reset21:40
lifelessSpamapS: so, I don't think so21:40
*** jdob has quit IRC21:40
SpamapSlifeless: yeah it is just wishful thinking21:40
*** marun has quit IRC21:43
*** CaptTofu has quit IRC21:45
*** CaptTofu has joined #tripleo21:45
*** CaptTofu has quit IRC21:47
*** CaptTofu has joined #tripleo21:47
openstackgerritClint Byrum proposed a change to openstack/diskimage-builder: Add an install-packages element  https://review.openstack.org/5464021:49
*** jcoufal has quit IRC21:49
*** jayg is now known as jayg|g0n321:51
lifelessSpamapS: hey so21:56
lifelessSpamapS: https://review.openstack.org/#/c/54640/ really shouldn't dep on the other patch21:56
lifelessSpamapS: and, I'd like some face time :)21:56
SpamapSlifeless: yeah I'm just working on a stream with that dependency22:01
SpamapS(optimizing builds because I seem to be working on things that have me building things a lot)22:01
lifelessSpamapS: sure, but the other patch is deeply problematic22:02
*** MarkAtwood has joined #tripleo22:02
*** jeckersb is now known as jeckersb_gone22:06
Ngdoh, I see offline made it in22:06
* Ng strikes reviving that branch from his todo list22:07
Nghmm, no, I'm on crack, that's di-b22:07
* Ng re-instates the todo for devtest22:07
Ngdefinitely a plane hack, that one :D22:08
lifelesshey, I'm going to stop tripleo-cd after the next successful deploy22:22
lifelessI want to mess around with a solid heat for a bit22:22
*** jtomasek has quit IRC22:26
*** tzumainn has quit IRC22:27
*** cd-undercloud has joined #tripleo22:30
cd-undercloud************** overcloud complete status=1 ************22:30
*** cd-undercloud has quit IRC22:30
*** tzumainn has joined #tripleo22:45
*** sdake has quit IRC22:49
lifelessSpamapS: http://status.openstack.org/reviews/22:52
SpamapSand we're back to failing22:54
lifelessnuuuuuuuuuuuuuuuuts22:54
*** edmund has quit IRC22:55
dkehnlifeless: correct in no PXE boot for the undercloud for ml2, hrm no boot device22:56
dkehnlifeless: wondering if the dhcp_agent.ini interface_driver should be changed22:57
lifelessdkehn: I don't know ;)22:58
dkehnlifeless: me guessing as well22:58
dkehnwill  raise the issue on the neutron22:59
dkehnthanks tons22:59
*** CaptTofu has quit IRC23:22
*** CaptTofu has joined #tripleo23:23
*** UtahDave has quit IRC23:29
*** tzumainn has quit IRC23:33
*** rwsu has quit IRC23:38
*** julim has quit IRC23:39
*** cd-undercloud has joined #tripleo23:43
cd-undercloud************** overcloud complete status=1 ************23:43
*** cd-undercloud has quit IRC23:43
*** rwsu has joined #tripleo23:44
*** julim has joined #tripleo23:45
*** CaptTofu has quit IRC23:48
*** CaptTofu has joined #tripleo23:49
*** lucas-dinner has quit IRC23:57
