SpamapS | lifeless: guessing he means multiple of the same type | 00:00 |
---|---|---|
SpamapS | stevebaker: multiple cfn paths is supposed to be path = x,y,z | 00:00 |
SpamapS | stevebaker: if there ever are dynamic-ish collectors I'd think just [type_key] | 00:00 |
stevebaker | SpamapS: different cfn paths might have different credentials | 00:01 |
SpamapS | stevebaker: good point, I did not plan for that case. | 00:02 |
SpamapS | stevebaker: go forth, and innovate. :) | 00:02 |
lifeless | can configparser nest? | 00:02 |
lifeless | cfn:child ? | 00:02 |
lifeless | cfn:path ? | 00:02 |
stevebaker | SpamapS: I was thinking a block for each [cfn_foo], with the type implied by the prefix, and a collectors value which declares each block | 00:03 |
stevebaker | lifeless: I'll check that | 00:04 |
SpamapS | stevebaker: yes that should work nicely | 00:06 |
SpamapS | IIRC oslo.config can do this | 00:06 |
SpamapS | might have to do cfn:key | 00:06 |
SpamapS | err yeah you said that. ;) | 00:06 |
stevebaker | OK, I might take a look at some point. I may still get away with only a single source | 00:07 |
lifeless | I'd like a stronger separator than _ | 00:08 |
lifeless | but thats bikeshedding | 00:08 |
SpamapS | no I agree | 00:10 |
SpamapS | : is definitely more obviously a data separator | 00:11 |
SpamapS | I'm certain that oslo.config has some way to do this, and if not.. we should patch it to, as this is a normal ini file requirement. | 00:11 |
stevebaker | I haven't found it yet | 00:12 |
SpamapS | lifeless: https://bugs.launchpad.net/tripleo/+bug/1254555 | 00:16 |
uvirtbot | Launchpad bug 1254555 in tripleo "tenant does not see network that is routable from tenant-visible network until neutron-server is restarted" [Critical,Triaged] | 00:16 |
lifeless | SpamapS: thats a brainfuck | 00:16 |
SpamapS | lifeless: reported against Neutron. I suggest we work around it by restarting neutron via SSH just before trying to do floatingip-create. | 00:17 |
SpamapS | lifeless: thoughts on that? | 00:17 |
lifeless | JFDI | 00:18 |
SpamapS | I'm also going to try to start doing this locally | 00:19 |
SpamapS | doing this == reproducing | 00:19 |
stevebaker | SpamapS: is the default occ/orc/oac install set up to regenerate sources.ini when os-collect-config metadata changes? | 00:21 |
SpamapS | stevebaker: yes | 00:21 |
stevebaker | nice, does oslo.config always read from the current file? | 00:22 |
*** matsuhashi has joined #tripleo | 01:03 | |
*** toci-bot has joined #tripleo | 01:11 | |
toci-bot | ERROR during toci run, see http://54.228.118.193/toci/toci_logs_hWX21Gc/ | 01:11 |
*** toci-bot has quit IRC | 01:11 | |
*** matsuhashi has quit IRC | 01:11 | |
*** matsuhashi has joined #tripleo | 01:11 | |
*** matsuhas_ has joined #tripleo | 01:15 | |
*** matsuhashi has quit IRC | 01:16 | |
*** nosnos has joined #tripleo | 01:18 | |
*** rongze has joined #tripleo | 01:19 | |
lifeless | stevebaker: ... yes? | 01:19 |
*** cd-undercloud has joined #tripleo | 01:20 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 01:20 |
*** cd-undercloud has quit IRC | 01:20 | |
*** rongze has quit IRC | 01:24 | |
SpamapS | Waiting for the overcloud stack to be ready | 01:41 |
SpamapS | Timing out - last probe output: | 01:41 |
* SpamapS suspects the mellanox problems are starting again | 01:41 | |
lifeless | want me to stabbbb it ? | 01:42 |
lifeless | SpamapS: ^? | 01:42 |
*** rongze has joined #tripleo | 01:56 | |
*** rongze has quit IRC | 01:57 | |
*** rongze has joined #tripleo | 01:57 | |
*** rongze_ has joined #tripleo | 01:59 | |
*** rongze has quit IRC | 02:02 | |
*** cd-undercloud has joined #tripleo | 02:08 | |
cd-undercloud | ************** overcloud complete status=2 ************ | 02:08 |
*** cd-undercloud has quit IRC | 02:08 | |
*** CaptTofu has quit IRC | 02:20 | |
*** CaptTofu has joined #tripleo | 02:21 | |
*** rongze_ has quit IRC | 03:09 | |
*** cd-undercloud has joined #tripleo | 03:15 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 03:15 |
*** cd-undercloud has quit IRC | 03:15 | |
*** matsuhas_ has quit IRC | 03:17 | |
*** matsuhashi has joined #tripleo | 03:19 | |
*** rongze has joined #tripleo | 03:23 | |
*** greghaynes has quit IRC | 03:27 | |
*** matsuhashi has quit IRC | 03:40 | |
SpamapS | lifeless: yeah if you could | 03:47 |
SpamapS | lifeless: these are like the errors we had before. If we stabbbb and then still get them then at least we'll know these are not bad-network-card related. | 03:47 |
*** CaptTofu has quit IRC | 04:03 | |
*** CaptTofu has joined #tripleo | 04:04 | |
lifeless | SpamapS: doing | 04:04 |
lifeless | rmmod mlx4_en mlx4_core; modprobe mlx4_en; sleep 1; ip address del 10.10.16.169/26 dev eth2; ovs-vsctl del-port br-ctlplane eth2; ovs-vsctl add-port br-ctlplane eth2 | 04:04 |
lifeless | WARNING: /etc/modprobe.d/mellanox.conf line 3: ignoring bad line starting with '/sbin/modprobe' | 04:04 |
lifeless | [4132354.620152] mlx4_core 0000:05:00.0: command 0xc failed: fw status = 0x40 | 04:04 |
lifeless | SpamapS: done, no big backlog this time | 04:05 |
*** boris-42 has joined #tripleo | 04:24 | |
SpamapS | lifeless: ty | 04:30 |
*** cd-undercloud has joined #tripleo | 04:35 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 04:35 |
*** cd-undercloud has quit IRC | 04:35 | |
SpamapS | that one timed out because of the mlx reload, I think | 04:36 |
*** rushiagr has joined #tripleo | 04:42 | |
*** rongze has quit IRC | 04:50 | |
*** matsuhashi has joined #tripleo | 04:51 | |
*** rongze has joined #tripleo | 04:53 | |
SpamapS | 2013-11-25 00:59:56.300 2164 WARNING os_collect_config.cfn [-] 400 Client Error: InvalidParameterValue | 05:24 |
SpamapS | weird | 05:24 |
SpamapS | 2013-11-25 05:24:42.946 18663 ERROR heat.openstack.common.rpc.common [-] Returning exception The Resource (notcomputeConfig) is not available. to caller | 05:25 |
SpamapS | heat ?? wtf? | 05:25 |
*** rpodolyaka has joined #tripleo | 05:27 | |
*** cd-undercloud has joined #tripleo | 05:37 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 05:37 |
*** cd-undercloud has quit IRC | 05:37 | |
*** akuznetsov has quit IRC | 05:40 | |
*** akuznetsov has joined #tripleo | 05:41 | |
*** nosnos_ has joined #tripleo | 05:51 | |
*** nosnos has quit IRC | 05:54 | |
*** matsuhashi has quit IRC | 05:55 | |
*** nosnos has joined #tripleo | 05:57 | |
*** nosnos_ has quit IRC | 05:57 | |
*** matsuhashi has joined #tripleo | 05:59 | |
*** matsuhashi has quit IRC | 06:04 | |
*** rushiagr has quit IRC | 06:05 | |
*** rushiagr has joined #tripleo | 06:05 | |
*** matsuhashi has joined #tripleo | 06:19 | |
*** cd-undercloud has joined #tripleo | 06:39 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 06:39 |
*** cd-undercloud has quit IRC | 06:39 | |
*** jcoufal has joined #tripleo | 06:39 | |
*** matsuhashi has quit IRC | 06:47 | |
*** boris-42 has quit IRC | 06:47 | |
*** rpodolyaka has left #tripleo | 06:50 | |
*** matsuhashi has joined #tripleo | 06:58 | |
*** matsuhashi has quit IRC | 06:59 | |
*** tzumainn has quit IRC | 07:00 | |
*** arata has joined #tripleo | 07:01 | |
*** matsuhashi has joined #tripleo | 07:02 | |
*** toci-bot has joined #tripleo | 07:04 | |
toci-bot | ERROR during toci run, see http://54.228.118.193/toci/toci_logs_OybmZrk/ | 07:04 |
*** toci-bot has quit IRC | 07:04 | |
*** nosnos has quit IRC | 07:05 | |
*** nosnos_ has joined #tripleo | 07:05 | |
*** matsuhashi has quit IRC | 07:23 | |
*** matsuhashi has joined #tripleo | 07:23 | |
*** akuznetsov has quit IRC | 07:28 | |
*** matsuhashi has quit IRC | 07:28 | |
*** matsuhashi has joined #tripleo | 07:29 | |
*** matsuhashi has quit IRC | 07:29 | |
*** matsuhashi has joined #tripleo | 07:29 | |
*** rdopieralski has joined #tripleo | 07:32 | |
*** cd-undercloud has joined #tripleo | 07:41 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 07:41 |
*** cd-undercloud has quit IRC | 07:41 | |
*** rpodolyaka has joined #tripleo | 07:41 | |
rpodolyaka | morning | 07:41 |
lifeless | morning! | 07:43 |
rpodolyaka | ohh, ubuntu, every time I do safe-upgrade something breaks... | 07:45 |
*** akuznetsov has joined #tripleo | 07:48 | |
*** pblaho has joined #tripleo | 07:50 | |
*** michchap has quit IRC | 07:53 | |
*** michchap has joined #tripleo | 07:53 | |
*** edmund has quit IRC | 07:55 | |
*** akuznetsov has quit IRC | 07:59 | |
*** matsuhashi has quit IRC | 08:09 | |
*** mattymo has quit IRC | 08:09 | |
*** matsuhashi has joined #tripleo | 08:10 | |
*** athomas has joined #tripleo | 08:10 | |
*** jtomasek has joined #tripleo | 08:12 | |
*** nosnos_ has quit IRC | 08:14 | |
*** nosnos has joined #tripleo | 08:15 | |
*** hewbrocca has joined #tripleo | 08:27 | |
SpamapS | | notcompute | 63244 | ClientException: The server has either erred or is incapable of performing the requested operation. (HTTP 500) (Request-ID: req-0fe6686c-37de-41ff-b21b-a17259770350) | CREATE_FAILED | 2013-11-25T08:04:50Z | | 08:33 |
SpamapS | weeirrd | 08:33 |
SpamapS | | ac954c30-1f9a-4b26-af2d-9e4c252d58ed | overcloud-notcompute-ngoyx525mzie | ACTIVE | None | Running | ctlplane=10.10.16.171 | | 08:33 |
SpamapS | oh this may be a heat bug... server.get() raising a 500 should not be "create failed"... | 08:34 |
SpamapS | though there is a reasonable chance it is fixed in newer heat's | 08:35 |
SpamapS | req-0fe6686c-37de-41ff-b21b-a17259770350 | 08:38 |
SpamapS | doh | 08:38 |
SpamapS | 2013-11-25 08:04:45,948.948 19625 TRACE nova.api.openstack ConnectionFailed: Connection to neutron failed: Maximum attempts reached | 08:39 |
SpamapS | neutron | 08:39 |
SpamapS | AGAIN | 08:39 |
hewbrocca | film at 11... | 08:39 |
SpamapS | I'm seriously bringing a Jimmy Neutron piñata to the next summit | 08:40 |
lifeless | SpamapS: make sure the nova has the bug fix from late last week | 08:40 |
lifeless | SpamapS: also make sure heat has it too | 08:40 |
SpamapS | this is undercloud | 08:40 |
SpamapS | happy to update everything | 08:40 |
lifeless | SpamapS: oh, then we have the bug. | 08:40 |
SpamapS | but.. you know.. how many things do we change before we are starting over? | 08:40 |
lifeless | SpamapS: at least in nova. | 08:40 |
lifeless | SpamapS: well we were deploying reliably for a week or so | 08:41 |
SpamapS | yes | 08:41 |
SpamapS | I'll try the windows admin method: restart everything | 08:41 |
*** akuznetsov has joined #tripleo | 08:42 | |
lifeless | SpamapS: bleep | 08:42 |
*** cd-undercloud has joined #tripleo | 08:42 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 08:42 |
*** cd-undercloud has quit IRC | 08:42 | |
SpamapS | lifeless: its possible we were having this problem for a while.. I don't know. | 08:42 |
SpamapS | clearly this has layers | 08:42 |
SpamapS | ok so I'm _just_ going to restart neutron-server | 08:43 |
SpamapS | lifeless: btw the neutron bug in the overcloud where we have to restart neutron-server is apparently already under analysis. Apparently they did some kind of optimization for early policy loading that is causing havoc. | 08:44 |
lifeless | hahahahahaaha | 08:44 |
lifeless | sadface | 08:44 |
SpamapS | https://bugs.launchpad.net/neutron/+bug/1251982 | 08:45 |
uvirtbot | Launchpad bug 1251982 in neutron "external network invisible for not-admin users after q-svc reboot" [High,Confirmed] | 08:45 |
SpamapS | IMO that is a Critical, not a High. | 08:45 |
lifeless | I agree | 08:46 |
SpamapS | but it definitely sounds like our problem | 08:46 |
lifeless | it doth indeed | 08:47 |
*** derekh has joined #tripleo | 08:58 | |
*** jprovazn has joined #tripleo | 09:00 | |
*** vkozhukalov has joined #tripleo | 09:08 | |
*** jistr has joined #tripleo | 09:09 | |
SpamapS | lifeless: got any info on what manifests this bug? | 09:11 |
SpamapS | Oh and our wait_for should also fail if the stack goes to a FAILED state. | 09:11 |
*** matsuhashi has quit IRC | 09:12 | |
*** matsuhashi has joined #tripleo | 09:14 | |
SpamapS | lifeless: so, https://bugs.launchpad.net/neutron/+bug/1211915 .. yes.. should we just update nova and neutronclient in nova's venv on undercloud? | 09:17 |
uvirtbot | Launchpad bug 1211915 in neutron/havana "Connection to neutron failed: Maximum attempts reached" [High,Fix committed] | 09:17 |
SpamapS | there seems to be some dissent among users that this is not actually fixed | 09:18 |
lifeless | SpamapS: if heat is talking directly to neutron | 09:19 |
lifeless | SpamapS: then we're not suffering that, it's a neutron bug. | 09:19 |
lifeless | SpamapS: OTOH if it's nova talking to neutron, then it may be. | 09:19 |
lifeless | SpamapS: where is the error happening ? | 09:19 |
SpamapS | no, heat is talking to nova | 09:20 |
lifeless | ok | 09:20 |
SpamapS | nova is returning 500 because of "Connection to neutron failed. Maximum attempts reached" | 09:20 |
lifeless | and nova is whinging? | 09:20 |
lifeless | so | 09:20 |
lifeless | let me have a quick peek | 09:21 |
lifeless | see if the buggy local.py is there | 09:21 |
lifeless | strong_store = corolocal.local | 09:21 |
lifeless | so | 09:21 |
lifeless | it is | 09:21 |
lifeless | I suggest cherrypicking the fix | 09:22 |
Ng | morning | 09:22 |
lifeless | which replaces just local.py | 09:22 |
SpamapS | lifeless: as it is 01:22 here and my fingers are looking awfully fat.. I will wait until tomorrow to do such things | 09:23 |
openstackgerrit | Marios Andreou proposed a change to openstack/tripleo-heat-templates: Adds AvailabilityZone parameter to compute/notcompute templates https://review.openstack.org/58229 | 09:24 |
lifeless | SpamapS: I'll pull it in | 09:24 |
*** markmc has joined #tripleo | 09:25 | |
SpamapS | sweet | 09:25 |
* SpamapS retires for the evening | 09:25 | |
lifeless | nova installed with the cherrypick | 09:26 |
lifeless | occ forced | 09:26 |
lifeless | Ng: morning! | 09:26 |
Ng | hey lifeless | 09:26 |
lifeless | tripleo-cd restarted | 09:26 |
*** lsmola has joined #tripleo | 09:33 | |
*** martyntaylor has joined #tripleo | 09:39 | |
*** viktors has joined #tripleo | 09:42 | |
*** boris-42 has joined #tripleo | 09:42 | |
*** max_lobur has joined #tripleo | 09:46 | |
*** martyntaylor has quit IRC | 09:49 | |
*** martyntaylor has joined #tripleo | 09:55 | |
*** marun has joined #tripleo | 10:00 | |
*** cd-undercloud has joined #tripleo | 10:11 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 10:11 |
*** cd-undercloud has quit IRC | 10:11 | |
*** rongze has quit IRC | 10:13 | |
*** jtomasek has quit IRC | 10:13 | |
*** jtomasek has joined #tripleo | 10:18 | |
*** arata has left #tripleo | 10:19 | |
*** jtomasek has quit IRC | 10:24 | |
*** jtomasek has joined #tripleo | 10:25 | |
*** martyntaylor has quit IRC | 10:28 | |
openstackgerrit | Marios Andreou proposed a change to openstack/tripleo-heat-templates: Adds AvailabilityZone parameter to compute/notcompute templates https://review.openstack.org/58229 | 10:32 |
*** martyntaylor has joined #tripleo | 10:35 | |
openstackgerrit | Mark McLoughlin proposed a change to openstack/tripleo-image-elements: Clarify that boot-stack isn't for baremetal https://review.openstack.org/58247 | 10:36 |
*** panda has joined #tripleo | 10:42 | |
openstackgerrit | Mark McLoughlin proposed a change to openstack/diskimage-builder: Fix typo in source-repositories README https://review.openstack.org/58251 | 10:54 |
openstackgerrit | Mark McLoughlin proposed a change to openstack/tripleo-image-elements: Clarify that boot-stack isn't only for baremetal https://review.openstack.org/58247 | 10:57 |
*** matsuhashi has quit IRC | 10:58 | |
*** cd-undercloud has joined #tripleo | 10:58 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 10:58 |
*** cd-undercloud has quit IRC | 10:58 | |
*** boris-42_ has joined #tripleo | 10:59 | |
*** boris-42 has quit IRC | 11:01 | |
*** nosnos has quit IRC | 11:07 | |
openstackgerrit | A change was merged to openstack/tripleo-image-elements: Clarify that boot-stack isn't only for baremetal https://review.openstack.org/58247 | 11:08 |
openstackgerrit | A change was merged to openstack/diskimage-builder: Fix typo in source-repositories README https://review.openstack.org/58251 | 11:09 |
*** rongze has joined #tripleo | 11:13 | |
*** rongze has quit IRC | 11:18 | |
*** cd-undercloud has joined #tripleo | 11:44 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 11:44 |
*** cd-undercloud has quit IRC | 11:44 | |
*** arata has joined #tripleo | 11:47 | |
*** boris-42 has joined #tripleo | 11:56 | |
*** boris-42_ has quit IRC | 11:57 | |
openstackgerrit | Roman Podoliaka proposed a change to openstack/tripleo-image-elements: Fix building of overcloud compute node image https://review.openstack.org/58265 | 11:59 |
*** boris-42_ has joined #tripleo | 12:01 | |
*** boris-42 has quit IRC | 12:03 | |
*** panda_ has joined #tripleo | 12:03 | |
*** lucasagomes has joined #tripleo | 12:04 | |
*** panda has quit IRC | 12:07 | |
*** arata has left #tripleo | 12:14 | |
*** rongze has joined #tripleo | 12:14 | |
*** michchap has quit IRC | 12:18 | |
*** rongze has quit IRC | 12:18 | |
*** michchap has joined #tripleo | 12:20 | |
*** rongze has joined #tripleo | 12:24 | |
*** akrivoka has joined #tripleo | 12:28 | |
*** lucasagomes is now known as lucas-hungry | 12:29 | |
*** cd-undercloud has joined #tripleo | 12:32 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 12:32 |
*** cd-undercloud has quit IRC | 12:32 | |
openstackgerrit | Roman Podoliaka proposed a change to openstack/tripleo-image-elements: Fix building of overcloud compute node image https://review.openstack.org/58265 | 12:33 |
*** rushiagr has quit IRC | 12:39 | |
openstackgerrit | A change was merged to openstack/tripleo-image-elements: Fix building of overcloud compute node image https://review.openstack.org/58265 | 12:39 |
rpodolyaka | this wait_for until nova is initialized seems to be fragile https://github.com/openstack/tripleo-incubator/blob/master/scripts/devtest_seed.sh#L75 | 12:45 |
rpodolyaka | at least it doesn't work for me | 12:45 |
rpodolyaka | perhaps, we could use wait condition for undercloud too | 12:46 |
rpodolyaka | and for seed vm, maybe ssh + check that os-refresh-config has finished? | 12:46 |
openstackgerrit | James Slagle proposed a change to openstack/tripleo-image-elements: Add element for tripleo-heat-templates. https://review.openstack.org/58274 | 12:53 |
*** martyntaylor has quit IRC | 12:59 | |
*** martyntaylor has joined #tripleo | 13:08 | |
*** bauzas has quit IRC | 13:15 | |
*** cd-undercloud has joined #tripleo | 13:15 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 13:15 |
*** cd-undercloud has quit IRC | 13:15 | |
*** hewbrocca has quit IRC | 13:16 | |
*** bcrochet has quit IRC | 13:21 | |
*** jomara has quit IRC | 13:23 | |
*** rongze has quit IRC | 13:25 | |
*** bcrochet has joined #tripleo | 13:27 | |
*** bauzas has joined #tripleo | 13:29 | |
*** jomara has joined #tripleo | 13:30 | |
*** jdob has joined #tripleo | 13:33 | |
*** lucas-hungry is now known as lucasagomes | 13:34 | |
*** boris-42_ is now known as boris-42 | 13:41 | |
*** dmojoryder has left #tripleo | 13:52 | |
*** arata has joined #tripleo | 13:54 | |
*** lsmola has quit IRC | 13:54 | |
*** arata has left #tripleo | 13:54 | |
*** akrivoka has quit IRC | 14:00 | |
*** noslzzp has joined #tripleo | 14:00 | |
*** CaptTofu has quit IRC | 14:00 | |
*** CaptTofu has joined #tripleo | 14:00 | |
*** dprince has joined #tripleo | 14:01 | |
*** akrivoka has joined #tripleo | 14:01 | |
*** cd-undercloud has joined #tripleo | 14:02 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 14:02 |
*** cd-undercloud has quit IRC | 14:02 | |
*** athomas has quit IRC | 14:02 | |
*** rongze has joined #tripleo | 14:04 | |
*** hewbrocca has joined #tripleo | 14:06 | |
*** tzumainn has joined #tripleo | 14:06 | |
*** morazi has joined #tripleo | 14:07 | |
*** athomas has joined #tripleo | 14:12 | |
*** julim has joined #tripleo | 14:13 | |
openstackgerrit | Dan Prince proposed a change to openstack/tripleo-heat-templates: Use merge.py for the undercloud templates. https://review.openstack.org/57994 | 14:14 |
*** julim has quit IRC | 14:14 | |
*** julim has joined #tripleo | 14:16 | |
*** jdob has quit IRC | 14:16 | |
*** jdob has joined #tripleo | 14:16 | |
*** bauzas1 has joined #tripleo | 14:18 | |
*** bauzas has quit IRC | 14:20 | |
*** jayg|g0n3 is now known as jayg | 14:20 | |
*** rushiagr has joined #tripleo | 14:21 | |
*** ccrouch has joined #tripleo | 14:24 | |
*** lsmola has joined #tripleo | 14:25 | |
*** rongze has quit IRC | 14:26 | |
*** jdob_ has joined #tripleo | 14:30 | |
*** jdob has left #tripleo | 14:31 | |
*** jdob has quit IRC | 14:31 | |
*** rongze_ has joined #tripleo | 14:31 | |
*** edmund has joined #tripleo | 14:38 | |
*** jergerber has joined #tripleo | 14:38 | |
*** CaptTofu has quit IRC | 14:47 | |
*** CaptTofu has joined #tripleo | 14:48 | |
*** cd-undercloud has joined #tripleo | 14:50 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 14:50 |
*** cd-undercloud has quit IRC | 14:50 | |
NobodyCam | Good morning TripleO | 14:53 |
*** martyntaylor1 has joined #tripleo | 14:54 | |
*** martyntaylor has quit IRC | 14:55 | |
openstackgerrit | Derek Higgins proposed a change to openstack-infra/tripleo-ci: Add element to install a testenv worker https://review.openstack.org/58305 | 14:56 |
openstackgerrit | Derek Higgins proposed a change to openstack-infra/tripleo-ci: Add element to install a testenv client https://review.openstack.org/58306 | 14:56 |
*** CaptTofu has quit IRC | 14:57 | |
*** CaptTofu has joined #tripleo | 14:57 | |
*** akrivoka has quit IRC | 14:58 | |
*** beekneemech is now known as bnemec | 15:07 | |
*** akrivoka has joined #tripleo | 15:11 | |
openstackgerrit | Dan Prince proposed a change to openstack-infra/tripleo-ci: Make undercloud-vm.yaml. https://review.openstack.org/58310 | 15:13 |
*** vkozhukalov has quit IRC | 15:13 | |
openstackgerrit | Dan Prince proposed a change to openstack-infra/tripleo-ci: Make undercloud-vm.yaml. https://review.openstack.org/58000 | 15:14 |
openstackgerrit | Dan Prince proposed a change to openstack/tripleo-incubator: Make undercloud-vm.yaml. https://review.openstack.org/57999 | 15:17 |
*** csd has quit IRC | 15:22 | |
*** csd has joined #tripleo | 15:23 | |
*** boris-42 has quit IRC | 15:31 | |
*** jprovazn has quit IRC | 15:32 | |
slagle | dprince: i was ok with you not updating those :) | 15:36 |
*** cd-undercloud has joined #tripleo | 15:38 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 15:38 |
*** cd-undercloud has quit IRC | 15:38 | |
dprince | slagle: sure. I'm sort of like I can't make people happy... so I'll just make it go away. | 15:39 |
dprince | slagle: it wasn't just your feedback. | 15:39 |
slagle | ok :) | 15:39 |
*** CaptTofu has quit IRC | 15:47 | |
*** UtahDave has joined #tripleo | 15:52 | |
*** jdob_ has quit IRC | 15:53 | |
openstackgerrit | Petr Blaho proposed a change to openstack/python-tuskarclient: [WIP] Adds help for subcommands https://review.openstack.org/56257 | 15:54 |
*** hewbrocca has quit IRC | 15:54 | |
*** viktors has quit IRC | 15:56 | |
*** julim has quit IRC | 15:57 | |
*** julim has joined #tripleo | 16:00 | |
*** hewbrocca has joined #tripleo | 16:00 | |
mordred | dprince: I find that making problems go away is my favorite solution :) | 16:10 |
*** pblaho has quit IRC | 16:13 | |
*** hewbrocca has quit IRC | 16:14 | |
*** d0ugal is now known as jud3k | 16:21 | |
*** jud3k is now known as d0ugal | 16:22 | |
*** cd-undercloud has joined #tripleo | 16:24 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 16:24 |
*** cd-undercloud has quit IRC | 16:24 | |
dprince | mordred: or people. But problems are better I suppose. | 16:26 |
*** pblaho has joined #tripleo | 16:31 | |
*** pblaho has quit IRC | 16:32 | |
*** boris-42 has joined #tripleo | 16:32 | |
*** rushiagr has quit IRC | 16:32 | |
*** csd has quit IRC | 16:34 | |
*** lsmola has quit IRC | 16:35 | |
*** max_lobur has quit IRC | 16:40 | |
*** max_lobur has joined #tripleo | 16:41 | |
SpamapS | rpodolyaka: how is the nova wait for fragile? | 16:42 |
*** pblaho has joined #tripleo | 16:43 | |
rpodolyaka | SpamapS: we are doing something like "wait_for 30 10 nova list" there | 16:43 |
SpamapS | rpodolyaka: yes, that seems reasonable. | 16:43 |
rpodolyaka | SpamapS: but by the time "nova list" already succeeds, nova_bm db is not initialized yet | 16:43 |
SpamapS | rpodolyaka: waitcondition is just a less painful poll to heat. | 16:44 |
rpodolyaka | SpamapS: so baremetal calls fail | 16:44 |
SpamapS | rpodolyaka: ahh so nova list is the wrong command | 16:44 |
rpodolyaka | SpamapS: yeah, not sure nova baremetal-node-list is much better though | 16:44 |
rpodolyaka | SpamapS: it works for me, but I'd rather we find something better :) | 16:45 |
*** rushiagr2 has joined #tripleo | 16:45 | |
SpamapS | rpodolyaka: well for undercloud we do have waitconditions, but for the seed, not so much. | 16:46 |
rpodolyaka | SpamapS: yep | 16:46 |
rpodolyaka | what about ssh + check if os-refresh-config has finished? | 16:47 |
rpodolyaka | or it's too brute? :) | 16:47 |
*** rushiagr2 has quit IRC | 16:48 | |
SpamapS | we can do better | 16:49 |
*** blamar has joined #tripleo | 16:49 | |
Ng | if we have to do that, we have to do that, but it's ugly as hell and imo indicates that there is a bug to be fixed somewhere | 16:49 |
Ng | so I really hope we don't have to do that :) | 16:49 |
openstackgerrit | Derek Higgins proposed a change to openstack/tripleo-incubator: Save the output from boot-seed-vm https://review.openstack.org/58336 | 16:49 |
SpamapS | rpodolyaka: how about we just have os-refresh-config start some service when all is well... we can poll that service. | 16:51 |
SpamapS | Ng: this is for the seed. How do you know the seed is initialized? | 16:51 |
rpodolyaka | SpamapS: any ideas on what service to start? | 16:52 |
rpodolyaka | SpamapS: or for seed vm python -m SimpleHTTPServer is ok? | 16:53 |
*** martyntaylor1 has quit IRC | 16:54 | |
Ng | SpamapS: well I would think we want to define an answer to that so we can make sufficient API calls to know | 16:57 |
Ng | it sounds like nova baremetal-node-list might suffice here | 16:58 |
*** pblaho has quit IRC | 17:01 | |
*** akuznetsov has quit IRC | 17:01 | |
*** jergerber has quit IRC | 17:01 | |
*** zrlpll has joined #tripleo | 17:01 | |
zrlpll | hi | 17:03 |
derekh | we have a whole load of scripts in post-configure.d/80-* how about using baremetal-node-list as suggested but move post-configure.d/80-nova-baremetal to post-configure.d/81-nova-baremetal | 17:03 |
*** lsmola has joined #tripleo | 17:03 | |
*** toci-bot has joined #tripleo | 17:04 | |
toci-bot | ERROR during toci run, see http://54.228.118.193/toci/toci_logs_dFVYTvD/ | 17:04 |
*** toci-bot has quit IRC | 17:04 | |
Ng | derekh: that doesn't sound like a terrible thing, although it's a bit implicit maybe | 17:05 |
rpodolyaka | derekh: this must work 99% of times, but I think, the problem we are facing here, is that e.g. 'nova baremetal-node-list' doesn't exactly guarantee that nova_bm has been fully initialized | 17:06 |
rpodolyaka | this is especially true for larger DBs with more migrations, like nova | 17:07 |
Ng | rpodolyaka: I've not got all the context on this discussion I suspect, but the failures we're seeing will be in the setup-baremetal call, right? | 17:07 |
Ng | or is it the setup-neutron? | 17:07 |
rpodolyaka | Ng: yeah | 17:07 |
rpodolyaka | baremetal | 17:08 |
Ng | either way, we can just make the call with a wait_for | 17:08 |
Ng | when the backend is ready, we move on | 17:08 |
rpodolyaka | sure | 17:08 |
Ng | we do this already somewhere in devtest_overcloud afair | 17:09 |
Ng | hmm, no, we don't, we just wait for a port-list to say something we expect | 17:10 |
*** cd-undercloud has joined #tripleo | 17:10 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 17:10 |
*** cd-undercloud has quit IRC | 17:10 | |
Ng | weird | 17:10 |
*** rdopieralski has quit IRC | 17:10 | |
Ng | init-keystone does it though, on a tenant-create | 17:10 |
* rpodolyaka goes offline for a few hours | 17:13 | |
*** vkozhukalov has joined #tripleo | 17:14 | |
*** zrlpll has quit IRC | 17:14 | |
derekh | Ng: yes the wait_for in init-keystone was enough last week before we moved the db-sync's , now the db-sync for keystone is happening and keystone is ready way before nova | 17:15 |
*** CaptTofu has joined #tripleo | 17:36 | |
*** olaph has joined #tripleo | 17:39 | |
lifeless | SpamapS: ^ is that a success I see? | 17:40 |
*** jistr has quit IRC | 17:41 | |
Ng | lifeless: (btw, fwiw, other TLAs, I know the true/false thing in the offline patch is inconsistent and not tremendously efficient, I was playing for minimising the amount of syntax needed for each test of $OFFLINE :) | 17:42 |
Ng | I'm now wondering if I should rename the entire option though, given derekh's excellent suggestion of checking whether the resources actually exist | 17:42 |
Ng | (so making it more like boot-seed-vm's -c) | 17:43 |
lifeless | right | 17:44 |
lifeless | offline is a soft offline | 17:44 |
lifeless | having an offline that isn't the same would be confusing | 17:44 |
Ng | lifeless: indeed, in which case, how do you feel about renaming the option to indicate that it's an attempt to re-use cached assets? | 17:45 |
lifeless | Ng: your patch? or --offline? | 17:46 |
lifeless | Ng: cause --offline doesn't turn on caching; it turns off invalidation. | 17:47 |
lifeless | Ng: derekh's -c turns on caching on things that have no invalidation code path. | 17:47 |
SpamapS | lifeless: yes, we won the race this time ;) | 17:50 |
*** derekh has quit IRC | 17:54 | |
*** cd-undercloud has joined #tripleo | 17:57 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 17:57 |
*** cd-undercloud has quit IRC | 17:57 | |
lifeless | SpamapS: oh, it's fixed? ship it! | 17:57 |
lifeless | doh | 17:57 |
lifeless | terrible timing on that joke | 17:58 |
SpamapS | OR fantastic timing | 17:58 |
SpamapS | lifeless: we are at least always hitting the same problem now | 17:58 |
SpamapS | Unable to find network with name 'ext-net' | 17:58 |
SpamapS | lifeless: I think there is some coordination problem with l3 agent | 17:58 |
lifeless | isn't that the thing you identified with a restart? | 17:59 |
lifeless | dprince: oh hai! | 17:59 |
lifeless | dprince: can we finish analysing your bare metal devtest setup? | 17:59 |
SpamapS | lifeless: What may be happening is they are trying to have the api tell l3 to do things and only making it available to users after the l3 agent agrees to do them, but somewhere that process gets lost and we never see the network until we restart l3 or api. | 17:59 |
SpamapS | lifeless: I'm doing a restart now via ssh (local change on cd-undercloud) ... it seems that it is just making the race a little better | 18:00 |
dprince | lifeless: Sure. FWIW, the libvirt issue solved my problems. | 18:00 |
lifeless | so the api tells the server which commits it and then emits a notification on rabbit and then the l3 reads that and regenerates. | 18:00 |
lifeless | dprince: yes, but thats because your seed config is whack :) | 18:00 |
SpamapS | lifeless: thats how it _should_ work. | 18:00 |
dprince | lifeless: my seed config is stock | 18:00 |
lifeless | dprince: I'm exaggerating slightly. | 18:00 |
SpamapS | lifeless: I'm suggesting that instead they commit it, but don't then update their view of policy/the world until the l3 agent responds | 18:00 |
lifeless | dprince: but 192.168.122.x shouldn't be seen inside the 192.0.2.x network | 18:01 |
dprince | lifeless: Furthermore, other Fedora users hit this same issue. | 18:01 |
*** lucasagomes is now known as lucas-afk | 18:01 | |
lifeless | dprince: not if you want to be able to communicate to that network from another machine than the seed host | 18:01 |
dprince | lifeless: So, should one of our incubator scripts add an IP to the ctrlplane bridge point then? | 18:01 |
lifeless | dprince: that would break the isolation of it being a separate network | 18:02 |
lifeless | dprince: but it is a possible answer for the baremetal case | 18:02 |
lifeless | dprince: another answer is to use a separate real router and not route through the seed | 18:02 |
dprince | lifeless: I'm essentially following devtests, just using bare metal. | 18:02 |
dprince | lifeless: Don't have a real router :(. I suppose markmc might let me expense one though | 18:03 |
* dprince kids | 18:03 | |
lifeless | dprince: I get that, but you've got something different to the current rack region | 18:03 |
lifeless | dprince: so I'd like to get that identified, then look at solutions | 18:03 |
dprince | lifeless: So we add a route so that traffic gets routed: | 18:03 |
dprince | lifeless: 192.0.2.0 192.168.122.81 255.255.255.0 UG 0 0 0 virbr0 | 18:03 |
dprince | lifeless: so that is how seed host traffic gets into the 192.0.2 network | 18:04 |
*** akuznetsov has joined #tripleo | 18:05 | |
lifeless | slagle: btw, hi, would like to catch up with more bw about the demo images thing at some point | 18:05 |
lifeless | dprince: so on https://etherpad.openstack.org/p/dans-tripleo-setup | 18:05 |
dprince | lifeless: I guess the main reason I wanted to fix this was that it would become an issue for people doing the fully virtual setup as well | 18:05 |
lifeless | dprince: where we have (A) eth0 Seed host eth1(B) | 18:05 |
lifeless | dprince: I think the details put in are for your VM, not for the actual host | 18:05 |
lifeless | dprince: so I'd like to tweak the diagram to make those separate | 18:06 |
dprince | lifeless: sure. I missed that. One sec. | 18:06 |
Ng | lifeless: I mean renaming --offline to something pithier than --don't-invalidate-cached-assets | 18:07 |
lifeless | Ng: I'm always open to ideas; considerations here: we'll be breaking a published interface. And we don't (yet) have a better name. | 18:08 |
slagle | lifeless: sure, can do so whenever | 18:10 |
SpamapS | Ng: you don't like --offline? because I like it as it pushes us to actually make it work offline | 18:12 |
dprince | lifeless: are these your ssh keys https://launchpad.net/~lifeless/+sshkeys | 18:12 |
Ng | SpamapS: I was just thinking that its name isn't quite accurate if I make it also check for the existence of the assets | 18:13 |
*** athomas has quit IRC | 18:15 | |
*** rushiagr2 has joined #tripleo | 18:18 | |
*** akrivoka has quit IRC | 18:20 | |
lifeless | dprince: yes | 18:25 |
* Ng leaves a --offline run going and dinners, will need a couple more test runs after, but I hope this is now finished and addressing the many -1s ;) | 18:25 | |
lifeless | Ng: so your patch isn't --offline the way the rest of the codebase is | 18:26 |
lifeless | Ng: I think -c is a better label for your patch | 18:26 |
lifeless | Ng: that was my point | 18:26 |
slagle | anyone recall if it was a conscious decision to have the tuskar element not require horizon? | 18:28 |
dprince | lifeless: you should have access | 18:28 |
dprince | lifeless: sudo to root and then you'll see toci | 18:29 |
dprince | lifeless: 'seed', 'undercloud', and 'overcloud' get you access to the various machines | 18:29 |
lifeless | slagle: tuskar the API? Absolutely. | 18:29 |
lifeless | slagle: horizon on one machine and APIs on others is pretty common | 18:30 |
slagle | lifeless: yea :). i thought the tuskar element was the api and ui. but it's actually just the api | 18:31 |
*** insanidade has joined #tripleo | 18:32 | |
lifeless | dprince: so the other aspect to this | 18:33 |
lifeless | dprince: is that 192.0.2.x is a non-routable network | 18:33 |
lifeless | dprince: as in RFCs say MUST NOT | 18:33 |
*** markmc has quit IRC | 18:33 | |
lifeless | dprince: so I don't want to enable folk doing uncustomised devtest to accidentally violate that | 18:34 |
lifeless | dprince: -> I'd like to tie allowing remote access to a devtest cluster, whether baremetal or not, to selecting an actual network. | 18:34 |
dprince | lifeless: Okay. My use case is a simple one, just trying to follow devtest and make it work. I've pretty much got that now I think (with the libvirt network conflicts fixed) | 18:35 |
lifeless | dprince: sanity checking the layout: seed is on brbm, brbm has a physical ethernet port added to it too ? | 18:37 |
dprince | lifeless: yes. A USB GigE connector | 18:38 |
dprince | lifeless: and that goes to my undercloud/overcloud switch | 18:38 |
lifeless | ok | 18:38 |
* lifeless updates | 18:38 | |
*** max_lobur is now known as max_lobur_afk | 18:39 | |
lifeless | dprince: I still can't see why the traffic is getting natted | 18:40 |
lifeless | dprince: unless the seed is doing it! | 18:40 |
*** rongze_ has quit IRC | 18:40 | |
lifeless | dprince: ssh seed | 18:41 |
lifeless | ssh: Could not resolve hostname seed: Name or service not known | 18:41 |
*** CaptTofu has quit IRC | 18:41 | |
dprince | lifeless: just seed | 18:41 |
dprince | lifeless: its an alias, they all are | 18:41 |
lifeless | dprince: in whose shell ? | 18:42 |
dprince | lifeless: sudo to root | 18:42 |
lifeless | ah :) | 18:42 |
*** cd-undercloud has joined #tripleo | 18:43 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 18:43 |
*** cd-undercloud has quit IRC | 18:43 | |
lifeless | dprince: is your ping from Z to D still running ? | 18:43 |
lifeless | dprince: as in, if I tcpdump I should see it ? | 18:43 |
dprince | lifeless: no | 18:43 |
lifeless | dprince: could you kick it off and leave it running? I want to track down the natting, which is the real issue | 18:43 |
lifeless | dprince: and it's not obviously happening in either the seed host or the seed | 18:44 |
dprince | lifeless: [root@localhost ~]# ping 192.0.2.5 | 18:44 |
dprince | PING 192.0.2.5 (192.0.2.5) 56(84) bytes of data. | 18:44 |
dprince | 64 bytes from 192.0.2.5: icmp_seq=1 ttl=63 time=1.27 ms | 18:44 |
dprince | 64 bytes from 192.0.2.5: icmp_seq=2 ttl=63 time=0.839 ms | 18:44 |
dprince | lifeless: running now... | 18:44 |
*** jcoufal has quit IRC | 18:44 | |
lifeless | yum installing tcpdump :P | 18:44 |
dprince | lifeless: really, I thought it was there? | 18:44 |
*** rushiagr2 has quit IRC | 18:44 | |
lifeless | not on the seed | 18:44 |
dprince | lifeless: oh, right. | 18:45 |
lifeless | ok so the seed sees | 18:45 |
lifeless | 18:44:57.945977 IP 192.168.122.1 > 192.0.2.5: ICMP echo request, id 18434, seq 65, length 64 | 18:45 |
lifeless | -> seed host check time | 18:45 |
dprince | lifeless: yep, traffic all goes through it, I saw the same. | 18:45 |
dprince | lifeless: overcloud always saw it too, | 18:45 |
lifeless | dprince: hey! | 18:46 |
dprince | lifeless: but when the overcloud ran its own libvirt network it gobbled up the outgoing response | 18:46 |
lifeless | dprince: you're pinging from the seed host :) | 18:46 |
lifeless | dprince: I thought you were pinging from a further away machine? | 18:46 |
dprince | lifeless: yes! | 18:46 |
lifeless | dprince: I may be misunderstanding the symptoms | 18:46 |
dprince | lifeless: no, never. | 18:46 |
lifeless | ok, so that relieves my concerns about 192.0.2.x | 18:46 |
lifeless | also it means that fedora is unique in showing this problem | 18:46 |
lifeless | :) | 18:46 |
dprince | lifeless: cool. So I can make this happen again if you want. | 18:46 |
lifeless | or maybe it's not. | 18:47 |
lifeless | Ok, so this is whats happening | 18:47 |
lifeless | you're pinging from the seed host | 18:47 |
dprince | lifeless: I'm not sure it is just a Fedora thing. | 18:47 |
lifeless | it's selecting the next hop interface (virbr0) appropriately, which is why you see 192.168.122.1 | 18:47 |
lifeless | now I'm sure that for that specific case I fixed this ages ago | 18:48 |
openstackgerrit | James Slagle proposed a change to openstack/diskimage-builder: Add option --image-size. https://review.openstack.org/58354 | 18:49 |
dprince | lifeless: Okay. I re-enabled the libvirt default.xml network on the overcloud. So ping is broken again. | 18:49 |
lifeless | elements/seed-stack-config/os-apply-config/var/opt/seed-stack/masquerade | 18:50 |
dprince | lifeless: This will show you the original symptom at least. | 18:50 |
dprince | lifeless: which is ping from the seed host is broken. As is ssh access. | 18:50 |
lifeless | that lets traffic to 192.168.122.1 through unnatted | 18:50 |
lifeless | and nats all other traffic | 18:51 |
lifeless | dprince: is there a bug open for this? I think the original analysis (mine probably :)) was incomplete (months ago) | 18:51 |
lifeless | dprince: if not I'll open one | 18:51 |
dprince | lifeless: before we do so can you have a final look at the broken setup? | 18:52 |
lifeless | dprince: I'm still poking around | 18:52 |
dprince | lifeless: we probably need a bug... although like I said the removal of the default network into the overcloud will fix the issue | 18:52 |
lifeless | dprince: thats more intrusive than we should need to be though | 18:53 |
dprince | lifeless: if you ssh into the seed now, then you can still get to the overcloud that way | 18:53 |
dprince | lifeless: And you'll see my pings no longer have responses to them | 18:53 |
lifeless | dprince: looks ok to me :P | 18:54 |
dprince | lifeless: hmmm. something fixed it! | 18:55 |
lifeless | dprince: that would be me | 18:55 |
*** lucas-afk is now known as lucasagomes | 18:55 | |
lifeless | dprince: fixingish elements/seed-stack-config/os-apply-config/var/opt/seed-stack/masquerade | 18:55 |
lifeless | filing a bug now | 18:55 |
dprince | lifeless: very well then, I'll see what you come up with. | 18:56 |
dprince | lifeless: gotta say though, I can hit this same issue with a much simpler setup | 18:56 |
lifeless | certainly | 18:56 |
dprince | lifeless: for instance if I run a single VM on my laptop | 18:56 |
lifeless | and now my head is out of the sand I get it :) | 18:56 |
dprince | lifeless: and the guest VM runs libvirt on the same range as the host | 18:57 |
dprince | lifeless: you'd hit this same problem, so my initial take was either: 1) modify the guest, or 2) modify the host | 18:57 |
dprince | lifeless: changing either libvirt network range would have resolved the issue... | 18:58 |
dprince | lifeless: the nova-kvm fix we did last week was simply to delete the default libvirt network since it isn't getting used anyway | 18:58 |
lifeless | dprince: I thought I -'d that? | 18:59 |
*** rpodolyaka1 has joined #tripleo | 18:59 | |
dprince | lifeless: well it landed anyway | 18:59 |
lifeless | doh | 18:59 |
lifeless | I think we need a more robust fix | 19:00 |
lifeless | because many people won't think to think about their virbr0 network as a problem | 19:00 |
lifeless | I will - more firmly in future | 19:00 |
dprince | lifeless: well, given that people could hit this same issue with a single node VM I would argue the fix that landed isn't bad or wrong at all. | 19:01 |
lifeless | dprince: It changes a default config inside the deployed nodes in a way that people may not expect. | 19:02 |
dprince | lifeless: https://bugzilla.redhat.com/235961, http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=a83fe2c23efad190a1e00e448f607fe032650fd6 | 19:02 |
lifeless | dprince: Thats the concern I have with it | 19:02 |
dprince | lifeless: just got that from markmc this morning | 19:02 |
lifeless | huh | 19:03 |
dprince | lifeless: I don't disagree that this is worth chasing down, but I also think the solution we used (disabling the default libvirt network) is totally fine | 19:03 |
lifeless | so thats insufficient :) | 19:03 |
lifeless | Probably a ping to 192.168.122.1 on boot would be sufficient to deal in our environment | 19:03 |
dprince | lifeless: maybe | 19:04 |
dprince | lifeless: Ironically, I don't hit this on my laptop at all because I use 192.168.129 locally (a non default libvirt setting). In my dev environment I was trying to stay as stock as possible though. | 19:06 |
* dprince should probably not use the word ironic in this channel | 19:07 | |
lifeless | :P | 19:09 |
lifeless | I've filed https://bugs.launchpad.net/tripleo/+bug/1254836 | 19:09 |
uvirtbot | Launchpad bug 1254836 in tripleo "192.168.122.1 from the seed host leaks into the 192.0.2.x network" [Undecided,New] | 19:09 |
lifeless | since there is a fix in tree, marking it medium | 19:10 |
*** rongze has joined #tripleo | 19:11 | |
dprince | lifeless: might be worth mentioning more about the real life use case here (running libvirt on a guest VM with the same range, etc) | 19:11 |
lifeless | dprince: I'm not sure thats relevant, no? we're solving the seed routing story, not the general case | 19:12 |
lifeless | because the general case is an OS vendor issue | 19:12 |
lifeless | we create a special case where the nodes can't easily tell that they are within a vm environment, where this would matter | 19:12 |
lifeless | so we need to solve that | 19:13 |
dprince | lifeless: to me it gives a good example of a case where we have hit issues here | 19:14 |
lifeless | dprince: feel free to add it ;) | 19:14 |
* dprince likes to make it real | 19:15 | |
*** rongze has quit IRC | 19:17 | |
*** boris-42 has quit IRC | 19:22 | |
*** epim has joined #tripleo | 19:25 | |
*** CaptTofu has joined #tripleo | 19:34 | |
*** CaptTofu has quit IRC | 19:36 | |
*** CaptTofu has joined #tripleo | 19:36 | |
*** cd-undercloud has joined #tripleo | 19:41 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 19:41 |
*** cd-undercloud has quit IRC | 19:41 | |
Ng | hmm, wtf keeps purging neutron from my devtest machine | 19:42 |
Ng | all the other clients persist across runs, but for some reason I keep losing neutron | 19:42 |
*** rongze has joined #tripleo | 19:44 | |
*** rongze has quit IRC | 19:49 | |
Ng | oh, duh, I keep blowing away my local branch, so I'm losing openstack-tools | 19:53 |
Ng | maybe we should put that dir in TRIPLEO_ROOT | 19:53 |
insanidade | lifeless: are you around ? | 19:53 |
SpamapS | horrible hack on the way to work around bug 1254555 | 19:58 |
uvirtbot | Launchpad bug 1254555 in neutron "tenant does not see network that is routable from tenant-visible network until neutron-server is restarted" [High,In progress] https://launchpad.net/bugs/1254555 | 19:58 |
lifeless | OTP at the moment | 19:58 |
lifeless | insanidade: maybe SpamapS / dprince someone can help | 19:58 |
SpamapS | insanidade: whassup? | 19:58 |
dprince | insanidade: hello | 20:00 |
insanidade | hey. thanks :) sorry for the delay. | 20:01 |
insanidade | SpamapS, dprince : basic question on the usage of dib. | 20:01 |
insanidade | SpamapS, dprince : I was wondering how to create a CentOS heat-enabled image. lifeless helped me with some hints. looks like I have to use the rhel element with some parameters. right ? | 20:02 |
openstackgerrit | Clint Byrum proposed a change to openstack/tripleo-incubator: Work around neutron floatingip race condition https://review.openstack.org/58371 | 20:02 |
SpamapS | insanidade: Yes, dprince may also have some insight into that. | 20:03 |
SpamapS | insanidade: What would be great would be if the centos patch were revived so we had centos built-in | 20:03 |
insanidade | SpamapS, dprince : yeah, I have checked the abandoned patch. | 20:04 |
dprince | insanidade: yes. If you can revive the patch perhaps we can see if there are things in common w/ RHEL and go from there. I haven't used the rhel element w/ CentOS myself but I'd guess it gets you pretty close... | 20:05 |
dprince | insanidade: for that matter I'm usually using Fedora these days. slagle: have you tried the RHEL element on Centos at all? | 20:06 |
slagle | dprince: no :( | 20:07 |
insanidade | SpamapS, dprince: I believe my question is pretty basic. I want to create a golden image with CentOS. Please correct me if I'm wrong. the steps I'm thinking about are: | 20:08 |
insanidade | 1) install a CentOS vm and all the packages I need for the app I have to spin up with heat; 2) turn that vm into an image; 3) use that image with dib and the rhel element (with parameters) | 20:09 |
insanidade | does it make sense? | 20:09 |
lifeless | insanidade: CentOS publish reference images | 20:09 |
lifeless | insanidade: so it's 1) use dib to build a golden image with your apps installed, 2) deploy via heat. | 20:10 |
insanidade | lifeless: that's the point - I believe my question is very basic: is dib invoked against the OS I'm currently running? | 20:11 |
lifeless | insanidade: no | 20:11 |
lifeless | insanidade: https://git.openstack.org/cgit/openstack/diskimage-builder/tree/README.md#n124 | 20:12 |
insanidade | lifeless: thanks :) | 20:12 |
insanidade | lifeless: I ask that because, for some reason, an attempt to create a fedora image is failing (a very simple example) | 20:13 |
SpamapS | fedora has some weird mirror issues IIRC | 20:14 |
*** lucasagomes has quit IRC | 20:14 | |
dprince | insanidade: maybe paste the errors to a paste site so we can see them? | 20:15 |
dprince | insanidade: also what OS are you building on? | 20:15 |
slagle | the mirrors can be slow, but that shouldn't fail the build | 20:15 |
SpamapS | slagle: I saw them failing toci builds over the weekend | 20:15 |
slagle | how so? | 20:16 |
insanidade | dprince, lifeless : I'm running on CentOS. Error message: http://paste.openstack.org/show/53939/ | 20:16 |
lifeless | slagle: 504 errors IIRC | 20:16 |
insanidade | lifeless, dprince : the command is: diskimage-builder/bin/disk-image-create vm fedora heat-cfntools -a amd64 -o fedora-heat-cfntools | 20:16 |
lifeless | well that should be running inside the chroot, and the python2.7 output there suggests it is | 20:17 |
*** jtomasek has quit IRC | 20:18 | |
insanidade | lifeless, dprince : do you see anything wrong I might be doing ? | 20:20 |
Ng | slagle: was there a reason for not Approving 58336, ooi? | 20:21 |
openstackgerrit | James Slagle proposed a change to openstack/diskimage-builder: Add option --image-size. https://review.openstack.org/58354 | 20:25 |
dprince | insanidade: I think the heat-cfntools element looks to be missing some dependencies | 20:25 |
dprince | insanidade: not sure though, will have to mess with it... | 20:26 |
*** cd-undercloud has joined #tripleo | 20:26 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 20:26 |
*** cd-undercloud has quit IRC | 20:26 | |
slagle | Ng: none other than I mistakenly glanced over your +2, b/c i'm used to seeing Jenkins on the first line | 20:27 |
dprince | insanidade: what you are doing looks fine though | 20:27 |
slagle | Ng: approved it now | 20:27 |
Ng | slagle: cool :) | 20:27 |
openstackgerrit | A change was merged to openstack/tripleo-image-elements: Add element for tripleo-heat-templates. https://review.openstack.org/58274 | 20:27 |
openstackgerrit | A change was merged to openstack/tripleo-heat-templates: Use merge.py for the undercloud templates. https://review.openstack.org/57994 | 20:28 |
openstackgerrit | A change was merged to openstack-infra/tripleo-ci: Make undercloud-vm.yaml. https://review.openstack.org/58000 | 20:28 |
dprince | lifeless: you need any more debug info from my dev setup? | 20:28 |
openstackgerrit | James Slagle proposed a change to openstack/diskimage-builder: Add option --image-size. https://review.openstack.org/58354 | 20:28 |
* dprince is ready to rebuild! | 20:28 | |
openstackgerrit | A change was merged to openstack/tripleo-incubator: Make undercloud-vm.yaml. https://review.openstack.org/57999 | 20:28 |
pleia2 | SpamapS, lifeless - where are we re: saucy support? (wondering if I should upgrade my dev system or stick with raring for a bit longer) | 20:28 |
insanidade | dprince: would the fact that I'm building that image on a CentOS (vm) be of any importance ? | 20:29 |
pleia2 | (I was about to upgrade before I do my latest round of test env setup) | 20:29 |
dprince | insanidade: so long as its using the chroot I think you should be fine | 20:29 |
SpamapS | pleia2: we've been holding off while we sort out the instability | 20:30 |
pleia2 | SpamapS: ok, glad I asked :) | 20:30 |
dprince | insanidade: I won't say there aren't any weird issues with chroots, but this doesn't look like that sort of issue | 20:30 |
SpamapS | pleia2: it is definitely worth trying to run through devtest with DIB_RELEASE=saucy | 20:30 |
pleia2 | SpamapS: ah, good idea | 20:31 |
SpamapS | ugh and now I think we're back to mellanox fail | 20:31 |
*** cody-somerville has quit IRC | 20:31 | |
SpamapS | which.. would be weird, that was just 12 hours or so. | 20:31 |
*** cody-somerville has joined #tripleo | 20:36 | |
*** cody-somerville has joined #tripleo | 20:36 | |
lifeless | pleia2: as a build host, it's fine | 20:39 |
lifeless | pleia2: as a built image I'm not sure, ask Ng | 20:39 |
lifeless | dprince: no, kick me off | 20:39 |
dprince | lifeless: k, thanks for having a look | 20:40 |
Ng | pleia2: so, we've built saucy images with dib, but I don't think anyone has yet tried a full devtest.sh run with it, or at least if they have, I haven't seen their results | 20:40 |
Ng | last time it was mentioned, I said I should have a run at it, but I haven't yet. I'd like to get my noisy workstation switched off once my current run has finished, but I've made a note to try it tomorrow morning :) | 20:41 |
pleia2 | Ng: ok, thanks | 20:41 |
ccrouch | pleia2: am I remembering correctly that you were looking at tripleo + lxc | 20:44 |
ccrouch | or is that a figment of my imagination? | 20:44 |
openstackgerrit | A change was merged to openstack/tripleo-incubator: Save the output from boot-seed-vm https://review.openstack.org/58336 | 20:44 |
*** rongze has joined #tripleo | 20:46 | |
openstackgerrit | Chris Jones proposed a change to openstack/tripleo-incubator: Add asset re-use support to devtest.sh. https://review.openstack.org/57755 | 20:47 |
pleia2 | ccrouch: I was a few months back, had to abandon due to iscsi limitations within containers | 20:47 |
pleia2 | ccrouch: see https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1226855 | 20:47 |
uvirtbot | Launchpad bug 1226855 in lxc "Cannot use open-iscsi inside LXC container" [Undecided,Confirmed] | 20:47 |
pleia2 | otherwise it actually did ok | 20:47 |
*** rongze has quit IRC | 20:50 | |
ccrouch | thats a shame, so that bug means we wont be able to deploy "bare metal nodes" that are really lxc containers? | 20:51 |
ccrouch | they have to be actual machines or VMs | 20:51 |
lifeless | ccrouch: right | 20:51 |
lifeless | yup, if we want to deploy containers, use the container codepath | 20:51 |
ccrouch | oh you mean dont treat them as regular "bare metal nodes" but have a specific path for deploying containers? | 20:52 |
pleia2 | yeah | 20:52 |
ccrouch | pleia2: so none of the workarounds Serge suggested worked? | 20:53 |
pleia2 | ccrouch: didn't try them. by the time I hit this impasse we had decided to do baremetal testing on our tripleo cloud | 20:54 |
pleia2 | which was the goal of my work all along (lxc was incidental) | 20:54 |
ccrouch | ok got it | 20:54 |
*** dprince has quit IRC | 20:54 | |
pleia2 | ccrouch: I do still have notes around somewhere if you want to work on it, needed to make some changes to tripleo to make lxc work (never committed changes since we didn't go this route) | 20:55 |
pleia2 | ccrouch: https://etherpad.openstack.org/p/tripleobaremetallxc2013 - a bit old devtest.sh-wise now, but has useful notes | 20:56 |
ccrouch | thanks | 20:57 |
*** marun has quit IRC | 20:57 | |
lifeless | ok so,, reviews time for me | 21:00 |
insanidade | dprince, lifeless : looks like fedora is broken. just built an ubuntu image without problems. | 21:00 |
lifeless | insanidade: please do file a bug on diskimage-builder about that | 21:02 |
*** rpodolyaka1 has quit IRC | 21:03 | |
insanidade | lifeless: never did it before. would you help me on that? :) | 21:04 |
pleia2 | insanidade: start here: https://bugs.launchpad.net/diskimage-builder/+filebug | 21:06 |
pleia2 | log in with your launchpad.net account | 21:07 |
insanidade | pleia2: at least I do have a launchpad account :) | 21:07 |
pleia2 | yes, good start :) | 21:07 |
*** marun has joined #tripleo | 21:09 | |
*** vkozhukalov has quit IRC | 21:09 | |
lifeless | jog0: please tell me you're not going to be the key man for the new bug :( | 21:11 |
*** cd-undercloud has joined #tripleo | 21:13 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 21:13 |
*** cd-undercloud has quit IRC | 21:13 | |
*** jtomasek has joined #tripleo | 21:15 | |
jog0 | lifeless: I wasn't planning on it | 21:19 |
jog0 | the libvirt one? | 21:19 |
*** CaptTofu has quit IRC | 21:20 | |
*** CaptTofu has joined #tripleo | 21:20 | |
lifeless | jog0: yeah | 21:21 |
insanidade | pleia2, lifeless : https://bugs.launchpad.net/diskimage-builder/+bug/1254879 | 21:22 |
uvirtbot | Launchpad bug 1254879 in diskimage-builder "Fedora image creation fails" [Undecided,New] | 21:22 |
SpamapS | ok so the new race is that we're trying to nova boot before nova compute is ready. | 21:22 |
SpamapS | I smell another waitcondition | 21:22 |
lifeless | SpamapS: how is nova compute not ready when the wait condition fires? | 21:22 |
SpamapS | lifeless: wait condition is for notcompute | 21:23 |
SpamapS | | fault | {u'message': u'No valid host was found. ', u'code': 500, u'created': u'2013-11-25T16:46:06Z'} | | 21:23 |
SpamapS | works 10 minutes later | 21:23 |
*** CaptTofu has quit IRC | 21:24 | |
SpamapS | lifeless: we can just set CompletionCondition count == 2 | 21:24 |
SpamapS | and pass it into both | 21:24 |
*** CaptTofu has joined #tripleo | 21:24 | |
*** julim has quit IRC | 21:26 | |
slagle | lifeless: do you want to talk about the demo images? i am losing time in the day before I take over for kid duty | 21:27 |
openstackgerrit | Clint Byrum proposed a change to openstack/diskimage-builder: Add a cleanup option to source-repositories https://review.openstack.org/58386 | 21:27 |
lifeless | slagle: oh oops; maybe tomorrow then? kid duty ++ | 21:28 |
lifeless | SpamapS: oh, then yeah, count 2;. | 21:28 |
lifeless | SpamapS: we can get fancy later | 21:28 |
SpamapS | lifeless: in process right now | 21:28 |
slagle | lifeless: sounds good | 21:28 |
lifeless | SpamapS: of course it is :) | 21:28 |
SpamapS | lifeless: unique one is just as easy but not sure about scale | 21:29 |
lifeless | ETOOHARd | 21:30 |
openstackgerrit | Chris Jones proposed a change to openstack/tripleo-incubator: Probe nova baremetal before registering nodes. https://review.openstack.org/58387 | 21:30 |
openstackgerrit | Clint Byrum proposed a change to openstack/tripleo-heat-templates: Wait for o-r-c on nova compute as well. https://review.openstack.org/58388 | 21:33 |
*** insanidade has quit IRC | 21:36 | |
lifeless | SpamapS: is there a way we can introspect rather than saying '2' ? | 21:42 |
SpamapS | lifeless: yes, have two | 21:42 |
lifeless | heh | 21:42 |
SpamapS | lifeless: btw, "resource group" landed last week. We can now scale by the stack rather than by the instance.. using random string generator per nested stack we might actually have accidentally stumbled on unique passwords per node with clone-like scalability. | 21:44 |
SpamapS | unfortunately we still don't have "only delete that one" on the scale down | 21:45 |
*** rongze has joined #tripleo | 21:46 | |
lifeless | SpamapS: cool | 21:48 |
*** rongze has quit IRC | 21:54 | |
*** cd-undercloud has joined #tripleo | 22:01 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 22:01 |
*** cd-undercloud has quit IRC | 22:01 | |
*** UtahDave has quit IRC | 22:07 | |
*** cd-undercloud has joined #tripleo | 22:48 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 22:48 |
*** cd-undercloud has quit IRC | 22:48 | |
*** rongze has joined #tripleo | 22:50 | |
SpamapS | tripleo core: https://review.openstack.org/#/c/58388/ <-- should fix the current source of status=1's | 22:51 |
lifeless | SpamapS: https://review.openstack.org/#/c/57575/ | 22:52 |
SpamapS | doh | 22:53 |
SpamapS | wtf, why is IANA assigning ports in the 32000+ range anyway? | 22:53 |
clarkb | because you have standards bodies, and implementations | 22:54 |
clarkb | honestly it is quite lolzy and I doubt either side will change | 22:54 |
lifeless | SpamapS: http://www.rfc-editor.org/rfc/rfc4340.txt | 22:54 |
*** rongze has quit IRC | 22:55 | |
SpamapS | lifeless: are you just sending me random network related things to make some vague point? | 22:55 |
lifeless | SpamapS: that has the citation for 49151 | 22:56 |
lifeless | SpamapS: I'm looking for a deeper ref | 22:56 |
*** jayg is now known as jayg|g0n3 | 22:56 | |
*** edmund has quit IRC | 22:57 | |
*** julim has joined #tripleo | 22:58 | |
openstackgerrit | A change was merged to openstack/diskimage-builder: Add option --image-size. https://review.openstack.org/58354 | 22:59 |
lifeless | so | 23:01 |
lifeless | rfc 1700 | 23:01 |
lifeless | says | 23:01 |
lifeless | The Registered Ports are in the range 1024-65535. | 23:01 |
lifeless | so 49151 is actually a reduction | 23:02 |
*** vipul is now known as vipul-away | 23:03 | |
lifeless | 1340 still said 65535 | 23:07 |
lifeless | and rfc 6335 doesn't specify | 23:07 |
lifeless | weird | 23:07 |
lifeless | so everyone knows its 49151 | 23:08 |
lifeless | but nothing says why ;) | 23:08 |
openstackgerrit | Chris Jones proposed a change to openstack/tripleo-incubator: Add asset re-use support to devtest.sh. https://review.openstack.org/57755 | 23:08 |
*** julim has quit IRC | 23:09 | |
openstackgerrit | Chris Jones proposed a change to openstack/tripleo-incubator: Add local swift symlink to .gitignore. https://review.openstack.org/58403 | 23:09 |
openstackgerrit | Chris Jones proposed a change to openstack/tripleo-incubator: Improve a boot-seed-vm status message. https://review.openstack.org/58404 | 23:11 |
pleia2 | lifeless: I'm all bridge confused, have time for a quick g+ call to get me unstuck? | 23:11 |
lifeless | sure | 23:11 |
openstackgerrit | A change was merged to openstack/tripleo-heat-templates: Wait for o-r-c on nova compute as well. https://review.openstack.org/58388 | 23:12 |
openstackgerrit | Clint Byrum proposed a change to openstack/tripleo-incubator: Work around neutron floatingip race condition https://review.openstack.org/58371 | 23:14 |
*** julim has joined #tripleo | 23:19 | |
*** vipul-away is now known as vipul | 23:22 | |
*** epim has quit IRC | 23:28 | |
*** cd-undercloud has joined #tripleo | 23:36 | |
cd-undercloud | ************** overcloud complete status=1 ************ | 23:36 |
*** cd-undercloud has quit IRC | 23:36 | |
lifeless | SpamapS: ^ another failure? | 23:43 |
*** vipul is now known as vipul-away | 23:45 | |
lifeless | SpamapS: (I mean, another type of failure?) | 23:46 |
SpamapS | lifeless: no I don't think that one had the new templates yet. | 23:47 |
SpamapS | hmmmm | 23:50 |
SpamapS | -rw-r--r-- 1 root root 10168 Nov 25 23:36 overcloud-source.yaml | 23:50 |
SpamapS | -rw-r--r-- 1 root root 12685 Nov 21 06:39 overcloud.yaml | 23:50 |
SpamapS | Yeah it still hasn't run make overcloud.yaml yet even | 23:50 |
*** rongze has joined #tripleo | 23:51 | |
*** rongze has quit IRC | 23:55 | |
openstackgerrit | James Slagle proposed a change to openstack/tripleo-incubator: Add unit type to help text for memory. https://review.openstack.org/58410 | 23:57 |
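
The port-range exchange near the end of the log (49151 as the top of the registered range) can be sketched in code. This is only an illustration and not anything from the TripleO tree: the boundaries follow the ranges listed in RFC 6335 section 6, while `classify_port`, `registered_overlap`, and the 32768-60999 example range (assumed here as a typical Linux `ip_local_port_range` setting) are hypothetical names and values chosen for the example.

```python
"""Classify TCP/UDP port numbers using the ranges listed in RFC 6335, section 6."""

SYSTEM_PORTS = range(0, 1024)        # "Well Known" ports, 0-1023
USER_PORTS = range(1024, 49152)      # "Registered" ports, 1024-49151
DYNAMIC_PORTS = range(49152, 65536)  # "Private or Ephemeral" ports, 49152-65535


def classify_port(port: int) -> str:
    """Return 'system', 'user', or 'dynamic' for a valid port number."""
    if port in SYSTEM_PORTS:
        return "system"
    if port in USER_PORTS:
        return "user"
    if port in DYNAMIC_PORTS:
        return "dynamic"
    raise ValueError(f"not a valid port number: {port}")


def registered_overlap(low: int, high: int) -> range:
    """Return the part of the inclusive range low..high inside the registered range.

    The result is an empty range when there is no overlap.
    """
    start = max(low, USER_PORTS.start)
    stop = min(high + 1, USER_PORTS.stop)
    return range(start, max(start, stop))


if __name__ == "__main__":
    # Assumed example: a local ephemeral range that starts below 49152.
    overlap = registered_overlap(32768, 60999)
    print(classify_port(49151), classify_port(49152))
    print(f"{len(overlap)} registered ports fall inside the example range")
```

If a host's local ephemeral range starts below 49152, as in the assumed example, part of it overlaps the range from which registered ports are assigned. That is one plausible reading of the "standards bodies, and implementations" remark in the log.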