Tuesday, 2015-05-19

SpamapSgreghaynes: it worked for me when I did that too00:04
*** sdake has quit IRC00:21
*** sdake has joined #tripleo00:22
*** barra204 has joined #tripleo00:26
*** barra204_ has joined #tripleo00:27
*** akrivoka has joined #tripleo00:27
*** barra204_ is now known as shakamunyi00:28
*** jcoufal has joined #tripleo00:32
*** akrivoka_ has joined #tripleo00:35
*** thrash is now known as thrash|g0ne00:36
*** akrivoka has quit IRC00:37
greghaynesSpamapS: Well, did you find a way in which it didn't work?00:37
*** ir2ivps10 has quit IRC00:37
*** sdake has quit IRC00:38
*** barra204_ has joined #tripleo00:40
*** barra204__ has joined #tripleo00:40
*** barra204 has quit IRC00:40
*** shakamunyi has quit IRC00:41
*** akrivoka__ has joined #tripleo00:43
*** barra204_ has quit IRC00:46
*** barra204__ has quit IRC00:46
*** akrivoka_ has quit IRC00:47
*** lazy_prince has joined #tripleo00:48
*** daneyon has quit IRC00:51
*** jcoufal has quit IRC00:51
*** akrivoka__ has quit IRC00:55
openstackgerritgreghaynes proposed openstack/diskimage-builder: Add tests for building *-minimal images  https://review.openstack.org/18116200:59
greghaynesmordred: we should make an installtype for the simple-init element that installs the rust version01:04
clarkbits no longer a beta language so you wont have to rebuild everyday01:10
bkeroit should be 'stable'01:20
*** sdake has joined #tripleo01:37
*** sdake has quit IRC01:44
*** lazy_prince has quit IRC02:00
*** barra204 has joined #tripleo02:25
*** barra204_ has joined #tripleo02:25
*** barra204_ has quit IRC02:32
*** barra204 has quit IRC02:32
*** daneyon has joined #tripleo02:37
*** julim has joined #tripleo02:43
*** noslzzp has joined #tripleo02:55
*** jcoufal has joined #tripleo02:57
*** untriaged-bot has joined #tripleo03:00
untriaged-botUntriaged bugs so far:03:00
openstackLaunchpad bug 1455175 in tripleo "Option to configure gateway through keepalived" [Undecided,New] - Assigned to Mayank (mayank0107)03:00
openstackLaunchpad bug 1449852 in diskimage-builder "Buidling ramdisk with ironic-agent behind proxy fails" [Undecided,In progress] - Assigned to Ramakrishnan G (rameshg87) (rameshg87)03:00
openstackLaunchpad bug 1449854 in diskimage-builder "Ironic agent ramdisk built using disk-image-create fails with iscsi_ilo driver" [Undecided,Fix committed] - Assigned to Ramakrishnan G (rameshg87) (rameshg87)03:00
openstackLaunchpad bug 1454803 in tripleo "puppet: Neutron is not configured with L2 population" [Undecided,New]03:00
openstackLaunchpad bug 1452752 in tuskar "keystone_authtoken section is wrong in default shipped tuskar.conf.sample" [Undecided,Confirmed]03:00
openstackLaunchpad bug 1454802 in tripleo "puppet: Neutron does not use Nova notifications" [Undecided,New]03:00
*** untriaged-bot has quit IRC03:00
*** MasterPiece has quit IRC03:01
*** MasterPiece has joined #tripleo03:02
*** barra204 has joined #tripleo03:08
*** shakamunyi has joined #tripleo03:08
*** panda has quit IRC03:14
*** panda has joined #tripleo03:14
*** jcoufal_ has joined #tripleo03:15
*** jcoufal has quit IRC03:19
*** jcoufal_ has quit IRC03:33
*** daneyon has quit IRC03:36
*** daneyon has joined #tripleo03:36
*** chlong has quit IRC03:54
*** julim has quit IRC03:59
*** chlong has joined #tripleo04:06
*** shakamunyi has quit IRC04:50
*** barra204 has quit IRC04:50
*** eghobo has joined #tripleo04:51
*** lazy_prince has joined #tripleo04:52
*** chlong has quit IRC04:52
*** julim has joined #tripleo04:54
*** chlong has joined #tripleo05:04
*** eghobo_ has joined #tripleo05:14
*** eghobo has quit IRC05:16
*** eghobo has joined #tripleo05:17
*** eghobo_ has quit IRC05:21
*** dasm|afk is now known as dasm05:29
*** eghobo_ has joined #tripleo05:39
*** eghobo has quit IRC05:39
*** julim has quit IRC05:41
*** masco has joined #tripleo05:54
*** julim has joined #tripleo05:54
*** julim has quit IRC06:06
*** dasm has quit IRC06:06
*** dasm has joined #tripleo06:07
*** dasm has quit IRC06:07
*** dasm has joined #tripleo06:08
*** dasm has quit IRC06:08
*** dasm_ has joined #tripleo06:09
*** ukalifon has joined #tripleo06:10
*** eghobo_ has quit IRC06:11
*** eghobo has joined #tripleo06:12
*** julim has joined #tripleo06:16
*** eghobo has quit IRC06:24
*** yog__ has joined #tripleo06:34
*** dasm has joined #tripleo06:36
*** yamahata has joined #tripleo06:37
*** dasm has quit IRC06:39
*** dasm has joined #tripleo06:40
*** regebro has joined #tripleo06:48
*** jprovazn has joined #tripleo06:58
*** mmagr has joined #tripleo07:07
*** noslzzp has quit IRC07:08
*** panda has quit IRC07:13
*** panda has joined #tripleo07:14
*** ifarkas has joined #tripleo07:17
*** jistr has joined #tripleo07:18
*** ishant has joined #tripleo07:25
*** aufi has joined #tripleo07:29
*** gfidente has joined #tripleo07:59
gfidentejistr, morning :)07:59
jistrmorning :)07:59
gfidentelooks like we got something yesterday07:59
gfidenteit seems to work fine for me07:59
jistrgfidente: neat, which patch were you on?08:00
gfidentemy WIP which included yours08:00
gfidenteI see CI failing apparently on the running time08:00
gfidenteso I merged the change which was increasing the HA job timeout08:01
gfidentejistr, so I tested this https://review.openstack.org/#/c/184078/08:03
gfidenteand also https://review.openstack.org/#/c/184043/ and both worked08:03
gfidentemarios, jistr can you check this (and the dep) https://review.openstack.org/#/c/183097/08:05
gfidenteit'd be nice to merge that and then https://review.openstack.org/#/c/183472/08:06
*** yamahata has quit IRC08:06
jistrgfidente: yeah i'm going to test this https://review.openstack.org/#/c/183097/ (+ its long dep chain :) ) and if that gives us a successful deployment, then i think we should merge away08:08
jistrgonna do the remaining visual reviews first08:09
gfidenteFWIW the commit here is wrong https://review.openstack.org/18407208:10
gfidenteit is only moving VIPs08:10
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Enable VIPs via Pacemaker from step 2 instead of step 1  https://review.openstack.org/18407208:11
mariosgfidente: sure man gimme few (just done one round of reviews)08:12
gfidentejistr, the thing I found about clustercheck is that it doesn't work the way it is now08:14
gfidentewe need to add the service into haproxy08:14
gfidentewhich gives back status of synchronization to the resource agent08:14
gfidenteI was going to add that today08:14
jistrgfidente: by "doesn't work" you mean it reports OK even though the cluster is not ready yet08:15
gfidentejistr, so from what I understand reading the script08:16
*** mcornea has joined #tripleo08:16
gfidentethe script itself is started by xinetd and it does work08:16
gfidentebut we miss in haproxy the service which is polling the service08:17
gfidentelet me point at the code which is easier08:17
jistrgfidente: ack. btw the commit message change caused the later patches to depend on [outdated], it should go away if you submit the whole branch08:19
gfidenteL22 there is playing with the results of the xinetd service on 920008:20
jistrah ok08:21
gfidentejistr, the galera resource agent itself doesn't call it https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/galera08:24
gfidenteso I suspect the mechanism is nodes are removed from balancer if unsynched08:24
jistryeah that sounds sensible08:26
jistrso i'm rebuilding from scratch08:26
jistrwith this https://review.openstack.org/#/c/184078/08:27
* jistr fingers crossed08:27
gfidentejistr, so that makes me think08:29
gfidenteif nodes are only removed from balancer08:29
gfidentewhat is one supposed to do at full cluster restart?08:30
gfidenteI'll ping jayg|g0n3 when he is online to see if we miss pieces08:30
jistryeah. IIRC full cluster restarts are an issue in general. Not sure if it's solved in Astapor ATM.08:31
gfidenteso to add that into the balancer we need to make sure clustercheck works in step108:35
*** Goneri has joined #tripleo08:39
*** dguerri is now known as dguerri_08:39
jprovaznMorning, could some of cores take a look at https://review.openstack.org/#/c/173283/ ?08:42
* jistr looking08:42
jprovaznjistr: thanks08:44
*** chlong has quit IRC08:45
jistrjprovazn: is the method only for scaling up? scaling down is going to be different?08:46
jprovaznjistr: yes08:47
jistrjprovazn: then i wonder if it's possible and/or useful to add a check that the new count of the nodes is >= the original count08:48
jprovaznjistr: yes, it's possible, useful too08:49
*** dguerri_ is now known as dguerri08:50
jistrjprovazn: ack. Not a blocker, i'll +2 and if you add it to this patch i'll +2 the new patch set too08:50
*** dguerri is now known as dguerri_08:51
*** jang has joined #tripleo08:52
*** jang1 has joined #tripleo08:52
jprovaznjistr: thanks, I'm about to send scaledown patch, so I would lean to add count check in a separate patch, but if nobody +2 it soon, I might have it done sooner :)08:53
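The count check discussed above could look roughly like the following. This is a minimal sketch, not the code from the tripleo-common patch: the helper name `validate_scale_out` and the per-role count dicts are hypothetical and only illustrate the "new count >= original count" rule.

```python
def validate_scale_out(original_counts, new_counts):
    """Reject a scale-out request that would shrink any role.

    Both arguments are hypothetical dicts mapping role name -> node count;
    the real method may receive its input in a different shape.
    """
    for role, new_count in new_counts.items():
        old_count = original_counts.get(role, 0)
        if new_count < old_count:
            raise ValueError(
                "role %r: new count %d is lower than current count %d; "
                "scaling down is not handled by scale-out"
                % (role, new_count, old_count))
    return True
```

Raising early keeps a scale-down request from silently going through the scale-out path, which matches the idea of handling scale-down in a separate patch.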
gfidentejprovazn, can stack_id and plan_id really be None?08:53
mariosjprovazn: thanks man, added comments, i completely missed https://github.com/openstack/tripleo-common08:53
mariosjprovazn: have added it to my review list for tomorrow morning08:53
jprovaznmarios: thanks08:54
jistrgfidente: so re clustercheck in step 1. I'm not experienced in this area, but could it be that we add it to HAProxy when it's still reporting failure, which means HAProxy would have no backends, and then when galera comes up, clustercheck starts reporting OK and HAProxy notices that and starts balancing over those nodes?08:54
*** jang1 has quit IRC08:54
*** jang has quit IRC08:54
jistrbecause i'm not sure we can get galera up in step 1...08:54
jistrwe need to write the config a step earlier08:55
gfidentejistr, yeah 1 is actually 2, where pcmk is starting services08:55
gfidenteack on letting haproxy/galera race08:56
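The "let haproxy/galera race" idea can be pictured as follows. This is only a sketch under assumptions: it assumes the health check answers HTTP 200 when a node is synced and a non-200 status (or no answer) otherwise, and the function name `eligible_backends` is hypothetical. It models the balancer's decision, not HAProxy itself.

```python
def eligible_backends(check_results):
    """Return the node names a balancer would keep in rotation.

    check_results maps node name -> HTTP status from the health check,
    or None when the check could not connect at all (assumed behaviour).
    """
    return sorted(node for node, status in check_results.items()
                  if status == 200)


# Before galera is up, every check fails, so there are no backends yet.
print(eligible_backends({"ctrl0": 503, "ctrl1": None, "ctrl2": 503}))
# Once nodes report synced, they are picked up without any reconfiguration.
print(eligible_backends({"ctrl0": 200, "ctrl1": 503, "ctrl2": 200}))
```

The point of the design is that the check can be wired in a step early: an empty backend list while galera is still starting is harmless, and nodes join as their checks start passing.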
jprovazngfidente: re none value for plan/stack - the method would fail if not proper ids are passed, but user input is checked in CLI part - https://review.gerrithub.io/#/c/231550/4/rdomanager_oscplugin/v1/overcloud_scale.py08:56
*** dguerri_ is now known as dguerri08:57
gfidentejprovazn, yeah I suppose L32 or L36 will raise something? would it be worth removing the =None default?08:59
jprovazngfidente: yep, doing it now including jistr's check08:59
jprovaznthanks for feedback :)09:00
*** untriaged-bot has joined #tripleo09:00
untriaged-botUntriaged bugs so far:09:00
openstackLaunchpad bug 1455175 in tripleo "Option to configure gateway through keepalived" [Undecided,New] - Assigned to Mayank (mayank0107)09:00
openstackLaunchpad bug 1449852 in diskimage-builder "Buidling ramdisk with ironic-agent behind proxy fails" [Undecided,In progress] - Assigned to Ramakrishnan G (rameshg87) (rameshg87)09:00
openstackLaunchpad bug 1449854 in diskimage-builder "Ironic agent ramdisk built using disk-image-create fails with iscsi_ilo driver" [Undecided,Fix committed] - Assigned to Ramakrishnan G (rameshg87) (rameshg87)09:00
openstackLaunchpad bug 1454803 in tripleo "puppet: Neutron is not configured with L2 population" [Undecided,New]09:00
openstackLaunchpad bug 1452752 in tuskar "keystone_authtoken section is wrong in default shipped tuskar.conf.sample" [Undecided,Confirmed]09:00
openstackLaunchpad bug 1454802 in tripleo "puppet: Neutron does not use Nova notifications" [Undecided,New]09:00
*** untriaged-bot has quit IRC09:00
gfidentejistr, how did the deployment go? it worked for me a couple of times09:01
jistrdevtest_overcloud.sh@495: wait_for_stack_ready -w 3600 10 overcloud09:01
gfidentemarios, +A! :)09:03
mariosgfidente: i _will_ hit you09:05
mariosgfidente: done :)09:05
gfidentemarios, we have an entire week to break CI ... and FIX it! :)09:05
mariosgfidente: you do, i'm off on thursday09:05
mariosgfidente: thus i have an entire day to break stuff for you09:05
* marios goes to work09:05
jistrnow the bad spot09:06
jistrdevtest_overcloud.sh@601: wait_for -w 300 --delay 10 -- nova service-list --binary nova-compute '2>/dev/null' '|' grep 'enabled.*\ up\ '09:06
gfidentejistr, marios, so apparently no OPM build is out yet?09:06
* jistr fingers crossed09:06
gfidentejistr, I also noticed tpe seems to have been rebased actually?09:06
gfidentejistr, I didn't go that far!09:07
mariosgfidente: till yesterday no09:07
mariosgfidente: i was going to revisit that now and see what to do about a test env... o_O09:07
gfidentemarios, please please please https://review.openstack.org/#/c/183096/09:07
mariosgfidente: i also need to update my neutron-* to look like jistr and spredzy|afk with the puppet-pacemaker09:07
gfidentemarios, so I think we have cinder and glance and neutron and keystone pending09:08
mariosgfidente: its like you enjoy the pain09:08
gfidenteall of which need to be updated09:08
jistrgfidente, marios: hmm so it's still failing... That's exactly the same spot where it was failing yesterday when i tested just my patches. I like the refactors you made though, so i'd like to merge all that stuff anyway.09:10
*** regebro has quit IRC09:10
openstackgerritMerged openstack/tripleo-heat-templates: Add a directory for overcloud heat environments  https://review.openstack.org/18309609:10
gfidentejistr, so on the messaging part09:10
*** regebro has joined #tripleo09:10
gfidenteI think we need this: https://review.openstack.org/#/c/181081/09:11
openstackgerritMerged openstack/tripleo-heat-templates: Environment which configures puppet pacemaker.  https://review.openstack.org/18309709:11
gfidenteAND we need to https://github.com/redhat-openstack/astapor/blob/master/puppet/modules/quickstack/manifests/openstack_common.pp#L1509:11
jistryeah that does look like it might affect the issue09:12
gfidenteand if after those it still won't work, we might want to make some pressure on the depends09:12
jistrcurrently the master is broken09:12
jistrit doesn't finish stack-create at all09:12
gfidenteHA master?09:12
jistryeah i meant HA09:12
jistrnow we have a bunch of patches that get us to a successful stack-create at least09:13
jistri'd say let's merge them and continue from there09:13
jistrgfidente, marios: does it sound sensible?09:14
jistri'm not sure we should keep piling up the changes and rebasing09:14
gfidenteit'd be nice to get some eyes09:14
gfidentebut I can't think of many except jayg09:14
*** shardy_ has joined #tripleo09:15
*** shardy has quit IRC09:17
*** jrist has quit IRC09:18
*** shardy_ has quit IRC09:21
*** dguerri is now known as dguerri_09:21
*** shardy has joined #tripleo09:22
*** mmagr is now known as mmagr|afk09:28
*** dguerri_ is now known as dguerri09:28
gfidentemandre, while you're playing with the environments ... https://review.openstack.org/#/c/183472/ :)09:28
gfidentemarios, ^^09:29
*** pelix has joined #tripleo09:29
mariosjistr: yes +10009:29
mariosjistr: we haven't seen green build yet on any of them?09:30
mariosjistr: (ha)09:30
*** mmagr|afk is now known as mmagr09:30
mariosjistr: been checking progress periodically,09:30
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Move sysctl settings into hieradata  https://review.openstack.org/18421009:31
jistrmarios: not green e2e, it fails on post-deploy initialization. But the stack-create (all of the puppet parts) succeed.09:31
mariosjistr: is this the horizon thing waiting for opm (was what i was hitting yesterday )09:31
gfidentemarios, so I am attempting https://review.openstack.org/#/c/184078/ + https://review.openstack.org/#/c/184210/ + https://review.openstack.org/#/c/181081/09:31
gfidentejistr, ^^09:32
jistrmarios: no it's some problem in communication between nova compute and the rest of nova services. Could be rabbitmq related but we don't know for sure yet.09:32
*** mmagr is now known as mmagr|afk09:38
gfidentesysctl though should have been set by the element anyway, jistr can you check on your env?09:40
gfidentemarios, the horizon thing only applies to instack which installs OPM, devtest does checkout of puppet modules from git09:56
mariosgfidente: thanks10:02
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Move sysctl settings into hieradata  https://review.openstack.org/18421010:23
openstackgerritJan Provaznik proposed openstack/tripleo-common: Scale out heat stack  https://review.openstack.org/17328310:25
jprovazngfidente: jistr: marios: ^ when you have a sec10:26
*** trown|outttypeww is now known as trown10:31
*** dguerri is now known as dguerri_10:36
gfidentejistr, IT PASSED!10:50
gfidentejistr, marios and the sysctl change can land later because we already do it via sysctl element, so just this: https://review.openstack.org/#/c/181081/ + https://review.openstack.org/#/c/184078/10:51
gfidenteand we get back green again!10:51
jistrgfidente: wow that's awesome news!10:58
gfidentejistr, try for yourself :)10:58
*** mmagr|afk is now known as mmagr10:58
jistrgfidente: so should we rebase this on top of the HA patches? https://review.openstack.org/#/c/181081/10:59
gfidentethey don't really depend on each other but I can11:00
gfidenteyou want to see it green11:01
gfidenteI know11:01
gfidentelet me try11:01
*** dguerri_ is now known as dguerri11:06
*** dguerri is now known as dguerri_11:08
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Provide RabbitMQ clients with a list of servers instead of VIP  https://review.openstack.org/18108111:09
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Enable VIPs via Pacemaker from step 2 instead of step 1  https://review.openstack.org/18407211:09
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Consolidate use of $pacemaker_master in step 2  https://review.openstack.org/18407811:09
jistrgfidente: thx! i handed out the ticks there ;) Gonna test manually too.11:11
* jistr wants to land green HA so much11:11
gfidentejistr, but as last time ... we'd only be half-there11:11
jistryeah, still missing services pacemaker coverage and maybe diverging from recommended arch here and there, but all the subsequent changes would be tested by CI then instead of manual testing, that's a huge win11:14
*** panda has quit IRC11:14
*** panda has joined #tripleo11:14
*** masco has quit IRC11:19
gfidentejistr, didn't jayg|g0n3  and cwolferh have more changes?11:21
mariosjistr: so the pcmk_resource_create stuff shouldn't be necessary for current instack packages right (puppet-pacemaker puppet-tripleo etc?)11:22
mariosjistr: gfidente: I set export DIB_INSTALLTYPE_puppet_modules=source in instack-build-images -and then also grabbed the pcmk changes ... haven't seen pcs status for a few days ;)11:23
mariosgfidente: your patches go ontop? ^^^11:23
* marios reads back11:23
gfidenteso this should be live CI of topmost https://jenkins06.openstack.org/job/check-tripleo-ironic-overcloud-f20puppet-ha/61/11:23
openstackgerritJan Provaznik proposed openstack/tripleo-common: Scale out heat stack  https://review.openstack.org/17328311:24
jistrmarios: yeah with that you'd also have to switch to a custom branch of t-h-t with my and gfidente's patches11:24
mariosjistr: basically i've been looking for an env i can work with, i think this is it... (add horizon and tidy up neutron-* pcmk resources)11:25
marios(i mean even if it doesn't complete 100% if pcs is doing its thing by that point)11:25
gfidenteyeah problem is they diverged a lot recently11:26
mariosstill running though let's see11:26
jistrmandre: with DIB_INSTALLTYPE_puppet_modules=source you won't need any customizations for puppet modules i think11:26
gfidenteso you want updated OPM, updated TPE and merge updates to THT11:26
mariosgfidente: i think i'm missing opm from these11:26
gfidentejistr, does instack honor that?11:27
jistrmandre: sorry for the noise11:27
mariosgfidente: i rebuilt images to pull pe from src and also patched templates (and rebuilt roles)11:27
gfidentemarios, or you move to devtest11:27
jistrgfidente: i think it should use the same tooling, it only overrides the variable if it wasn't set previously it seems https://github.com/rdo-management/instack-undercloud/blob/e921062e7751a48af09f9a9f5f275fc31a9657e3/elements/undercloud-package-install/environment.d/00-package-install#L1011:28
mariosgfidente: you move to devtest11:28
gfidentejistr, ack, thanks11:29
gfidenteI thought was pulled in by some RPM as dependency11:29
gfidentemarios, it is not so terrible nowadays, pretty stable actually11:30
mariosgfidente: indeed last time i poked was end of march, according to DAS LOG11:31
mariosgfidente: but you're there now, so i'm ok for a bit longer11:32
mariosMay 19 07:19:27 localhost pengine[17336]: warning: unpack_rsc_op_failure: Processing failed op start for galera:0 on ov-jbgefv6jimp-0-uzushvxqjdlv-controller-sg72b4wylo36.novalocal: not configured (6)11:33
mariosis this the startup race (looks like)11:33
marios(might be i only grabbed the 'rabbitmq startup race' and not the other one)11:34
gfidentemarios, those images also still have .novalocal11:35
gfidentethis is going to be epic rebase11:36
jistrepic rebase, yeah :D11:37
gfidenteloads of things could either get fixed OR BREAK! :)11:39
*** thrash|g0ne is now known as thrash11:39
*** jistr is now known as jistr|class11:40
*** pblaho has joined #tripleo11:41
*** jrist has joined #tripleo11:46
*** dguerri_ is now known as dguerri11:49
*** jrist has quit IRC11:52
*** rlandy has joined #tripleo11:59
*** ishant has quit IRC12:04
*** jayg|g0n3 is now known as jayg12:14
lsmola_gfidente: ping12:31
lsmola_gfidente: when I want to deploy ceph, should I use this export CINDER_ISCSI=112:31
lsmola_gfidente: the docs are unclear for me :-)12:31
gfidentehi lsmola_ we need to add some documentation about ceph cause things changed a bit12:32
gfidenteare you trying via instack or via devtest?12:32
lsmola_gfidente: the rdo-manager docs, with rhel12:32
gfidenteso you mean the bits here: https://repos.fedorapeople.org/repos/openstack-m/docs/master/basic_deployment/basic_deployment.html#deploy-the-overcloud ?12:33
lsmola_gfidente: there is basically just this param and number of ceph nodes for configuration documented12:33
lsmola_gfidente: https://repos.fedorapeople.org/repos/openstack-m/docs/internal/master/basic_deployment/basic_deployment.html12:33
lsmola_gfidente: the internal for rhel12:33
*** shardy_ has joined #tripleo12:33
lsmola_gfidente: but it's the same :-)12:34
gfidentelsmola_, so we support multibackend in cinder now12:34
gfidenteif you use CINDER_ISCSI=1 you will get both Ceph and LVM backends enabled12:34
gfidenteif you set CINDER_ISCSI=0, only Ceph12:34
gfidente(backend name in that message should really refer to LVM not ISCSI)12:34
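The CINDER_ISCSI behaviour described above, written out as a small lookup. A sketch only: the function name `cinder_backends` and the backend labels `"ceph"` and `"lvm"` are illustrative, and it assumes a deployment that includes Ceph nodes, as in this conversation.

```python
def cinder_backends(cinder_iscsi):
    """Map the CINDER_ISCSI value to the Cinder backends that get enabled.

    Follows the description in the chat for a deployment with Ceph:
    1 -> Ceph and LVM, 0 -> Ceph only. Labels are illustrative.
    """
    value = str(cinder_iscsi)
    if value == "1":
        return ["ceph", "lvm"]
    if value == "0":
        return ["ceph"]
    raise ValueError("CINDER_ISCSI must be 0 or 1, got %r" % (cinder_iscsi,))


print(cinder_backends(1))
print(cinder_backends("0"))
```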
gfidenteyou comfortable with updating the docs to word it better?12:35
*** shardy has quit IRC12:35
lsmola_gfidente: ok12:35
lsmola_gfidente: I guess only ceph is good for me, since I will deploy only ceph nodes now12:36
lsmola_gfidente: I'll send it to jcoufal to reword it, I don't have access to the docs12:36
gfidenteso to be clear, upstream we do support scaling the LVM nodes as well, not in instack yet though12:36
lsmola_gfidente: or do I? never tried it :-)12:36
lsmola_gfidente: ok12:36
gfidentelsmola_, I think we all do, it is on gerrithub12:36
gfidenteso from instack, when using LVM backend, only the controllers will contribute to the LVM space12:37
lsmola_gfidente: is there any doc for the doc? :-)12:37
gfidentelsmola_, you have to use git-review12:37
gfidentedoc is built from instack inline12:37
lsmola_gfidente: ah12:37
gfidentelsmola_, my fault, not for instack12:38
gfidenteinstack doc https://github.com/rdo-management/instack-undercloud/blob/master/doc/source/basic_deployment/basic_deployment.rst12:38
lsmola_gfidente: ook, I'll ask jcoufal, he will send me how to contribute to the docs :-)12:39
*** shardy_ has quit IRC12:39
*** shardy has joined #tripleo12:40
*** jprovazn has quit IRC12:46
*** adrianopetrich has joined #tripleo12:57
adrianopetrichhey folks12:57
*** jistr|class is now known as jistr12:59
gfidenteso looks like we'll have to do a recheck13:17
gfidentebecause CI failed with Error: Could not start Service[httpd]: Execution of '/sbin/service httpd start' returned 113:17
gfidenteonly on controller-213:17
jistrgfidente: ack on recheck, in the meantime we can investigate13:17
jistrcould be a race condition again13:18
jistri don't see any httpd logs written on the ctrl 213:18
gfidentejistr, my fault, only controller-013:19
mariosguys which one am i missing to still be seeing http://paste.openstack.org/show/228197/13:20
jistrgfidente:  ERROR:scss.expression:Function not found: twbs-font-path:113:20
gfidentejistr, I also noticed from host_info that httpd is actually running on the other nodes even though there is no mention of httpd in os-collect-config13:21
*** dasm is now known as dasm|afk13:21
gfidentemarios, I think you wanted to take on horizon/ha ?13:21
mariosgfidente: when i get an env i will :)13:22
mariosi may have borked something on a rebase with heat templates... gonna try tidy that up a bit and go again13:23
mariosbut first 5 minutes afk13:23
*** jprovazn has joined #tripleo13:23
gfidentemarios, looks like it couldn't start haproxy13:24
gfidentecan you check systemctl status haproxy?13:24
gfidenteor the journalctl to see if you can spot anything about haproxy?13:24
jistrso that twbs-font-path thing -- twbs stands for twitter bootstrap :)13:25
jayggfidente jistr: so if I want to do a fresh deploy to test HA, is there a set of patches you recommend?  or perhaps a single one that is the base for others?13:27
gfidentejayg, we were waiting for you!@13:27
gfidentejayg, get back to master13:27
gfidentecheckout the entire tree from here: https://review.openstack.org/#/c/18108113:27
gfidenteand make sure you get overcloud deployed! :)13:27
jistr^ yeah13:28
jaygcool, I'll give it a shot, thanks!13:28
jistrthat works for both gfidente and me13:28
jaygis there a way in gerrit to grab _all_ those patches at once?13:29
jistrso re that failure, i'm not sure if it's the twitter bootstrap errors that crash it or if it's just that the initial operation (whatever it does) takes too long13:29
jistr httpd.service start-pre operation timed out. Terminating.13:29
gfidentejistr, yeah was thinking same, have to compare with nonha13:29
jistrjayg: just take the last one, it will include the ~10 others13:29
gfidentejayg, use 'checkout'13:29
jistrgit fetch https://review.openstack.org/openstack/tripleo-heat-templates refs/changes/81/181081/7 && git checkout FETCH_HEAD13:30
jaygjistr: thx13:30
gfidentefrom the top-right 'download' menu13:30
jaygI prefer to pull to a new local branch, but I get the idea, thx13:30
jaygalso, any other patches like the puppet-tripleo one that bit me the other day, which I need to get on my control images?13:30
gfidentejistr, same messages but then it started http://logs.openstack.org/81/181081/7/check-tripleo/check-tripleo-ironic-overcloud-f20puppet-nonha/9fa3962/logs/ov-sdfgdlh7b3x-0-3mtvyb4vjsu7-Controller_logs/httpd.txt.gz13:31
jistrjayg: nothing special is required right now i think13:31
gfidentejistr, instead of timing out13:31
jistrgfidente: ack, so it's just horizon trying to eat the world on httpd startup13:32
gfidentelooks like13:32
gfidentein nonha it takes from 21:28 to 21:47 to start13:32
jaygjistr: cool, thx13:33
gfidentein ha at 55:27 starts but at 57:18 times out13:33
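For reference, the two durations quoted above work out as follows. This sketch assumes the marks are minutes:seconds within the same hour, which is an interpretation of the log excerpts rather than something stated in the chat.

```python
def elapsed_seconds(start, end):
    """Seconds between two 'MM:SS' marks within the same hour (assumed format)."""
    def to_seconds(mark):
        minutes, seconds = mark.split(":")
        return int(minutes) * 60 + int(seconds)
    return to_seconds(end) - to_seconds(start)


nonha = elapsed_seconds("21:28", "21:47")  # httpd start in the non-HA job
ha = elapsed_seconds("55:27", "57:18")     # start attempt until timeout in the HA job
print(nonha, ha)  # prints "19 111"
```

So under that reading the non-HA start takes 19 seconds, while the HA job gives up after 111 seconds, a little under two minutes.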
jaygjistr: are you using apache in the resource type like in the ref arch?13:33
jaygcrag had problems with that yesterday13:34
gfidentejayg, horizon is not in pacemaker yet13:34
jistrjayg: i think at this point we don't have it as a resource13:34
jaygah, k13:34
gfidentejayg, so this is the puppet-horizon module starting httpd at some point13:34
jaygwhy not just disable it until it is managed?13:35
gfidentewell, none of the openstack services is at this point, but if it helps in making CI green, we might do it13:36
gfidentesadly leaving work on the head of marios13:36
jistrso the scary thing is that the timeout comes from systemd itself13:37
mariosgfidente: i will hit you many times13:37
gfidentewho can push changes until he gets green CI with horizon13:37
jistr..which means that value might be hard to change13:37
mariosgfidente: my latest run is using your branch above (I see it has all the things pcmk_ etc)13:38
gfidentejistr, on the other hand, almost 2mins is reasonable timeout13:38
mariosgfidente: if it doesn't complete sucessfully13:38
mariosgfidente: i think you know what happens13:38
mariosgfidente: should i expect the horizon error then?13:38
gfidentemarios, so I haven't seen it on the local dev env, but upstream CI is failing on horizon yet13:39
jistrgfidente: yeah that's right. Possibly something went wrong there really, not just taking its time.13:39
mariosgfidente: looks like haproxy started ok this time...13:39
jaygguys, /7 doesn't apply to master without a merge commit, should it be rebased?13:39
mariosjayg: i'd do fresh git clone and then git review -d changid13:40
mariosjayg: assuming you have gerrit id review.openstack.org13:40
jistrjayg: yeah are you sure you have current master?13:40
gfidentemarios, but you still got CREATE_FAILED?13:40
mariosgfidente: waiting still13:40
jaygmarios: ok, I will try that, I think I set up gerrit on test box13:40
* gfidente pushing a temporary thing to disable horizon on top13:41
mariosjayg: yeah if new box you'll need to cp the .ssh/id_rsa.pub to gerrit13:41
* gfidente still have hopes of not disabling it though13:41
jistrgfidente: ack13:41
* marios palmface forgot to change registry13:43
jaygmarios: that did the trick, thanks, will try to redeploy now13:43
* jayg still gerrit n00b13:44
jistrgfidente: so regarding the instance check timeout i mentioned to you, the overcloud instance didn't get created at all, this is from nova-api log: http://fpaste.org/223328/14320430/raw/13:45
jistraand when i tried to re-run the nova boot command that devtest runs:13:47
jistr[root@dell-t5810ws-rdo-10 ~]# nova boot --key-name default --flavor m1.tiny --block-device source=image,id=855b0b2e-9fe6-4789-b58c-aee37dc44a0c,dest=volume,size=3,shutdown=preserve,bootindex=0 demo13:47
jistrERROR (BadRequest): Multiple possible networks found, use a Network ID to be more specific. (HTTP 400) (Request-ID: req-45b0ab52-3dbb-412c-bb37-bec7f06a3463)13:47
gfidentejistr, that is because in CI it runs with demo user credentials13:47
gfidentejistr, so it doesn't have multiple networks13:47
jistrduh thx13:47
jistrah so the instance *is* up, and has the correct ip assigned, but i can't ping it13:50
* jistr proceeds to investigate further13:50
gfidentejistr, I think I know this :)13:50
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Temporarily disable Horizon in HA job  https://review.openstack.org/18425213:51
* jistr is one big ear13:51
gfidentejistr, jprovazn probably remembers as well13:51
* jistr hopes that translates from czech correctly13:51
gfidenteso the reconnect to rabbit from neutron-ovs-agent13:51
gfidentemight cause desync of ovs-agent with neutron-server13:51
gfidenteand you end up with only 'some' of the ovs tunnels in between controller/compute13:52
* jprovazn reads back13:52
gfidenteI used to do this:13:52
gfidenteovs-vsctl show13:52
gfidenteon all three controllers AND compute13:52
gfidentecompute should have a tunnel to each controller13:52
*** dguerri is now known as dguerri`away13:53
gfidenteand controller to each other controller + to compute13:53
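The tunnel check described above can be written down as a comparison between expected and observed peers. A rough sketch: the function names and node names are hypothetical, the observed peers are assumed to have been collected by hand from `ovs-vsctl show` on each node, and no parsing of that output is attempted here.

```python
def expected_tunnel_peers(controllers, computes):
    """Peers each node should have a tunnel to, per the rule in the chat.

    Controllers: every other controller plus every compute.
    Computes: every controller.
    """
    peers = {}
    for node in controllers:
        others = [c for c in controllers if c != node]
        peers[node] = sorted(others + list(computes))
    for node in computes:
        peers[node] = sorted(controllers)
    return peers


def missing_tunnels(expected, observed):
    """Return {node: [peers without a tunnel]} for nodes that are short."""
    gaps = {}
    for node, wanted in expected.items():
        absent = sorted(set(wanted) - set(observed.get(node, ())))
        if absent:
            gaps[node] = absent
    return gaps


expected = expected_tunnel_peers(["ctrl0", "ctrl1", "ctrl2"], ["compute0"])
observed = dict(expected, compute0=["ctrl0", "ctrl2"])  # one tunnel missing
print(missing_tunnels(expected, observed))
```

With three controllers and one compute, every node should end up with three tunnels, so any node listed by `missing_tunnels` is a candidate for the desync case.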
jprovaznjistr, gfidente is the right guy you want to ask about this :) - all I remember is just that he debugged&solved it13:54
gfidentejprovazn, sure13:54
gfidenteespecially solved13:54
jistrhmm interesting. seems all the nodes have 3 tunnels present (to all the remaining nodes of the four)13:57
gfidentejistr, sigh so it's not the same :(13:57
*** lblanchard has joined #tripleo13:58
openstackgerritJohn Trowbridge proposed openstack/diskimage-builder: rhel-common element should not attach when using activation key  https://review.openstack.org/18425314:04
*** Marga_ has joined #tripleo14:06
*** Marga_ has quit IRC14:06
*** Marga_ has joined #tripleo14:07
*** jtomasek has quit IRC14:09
*** lazy_prince has quit IRC14:09
*** dguerri`away is now known as dguerri14:14
gfidentejistr, I would ping ajo!14:18
jistrwe might give the original auth problem some more searching14:19
*** jrist has joined #tripleo14:21
gfidentealso yes14:23
jayghrm, so the good news is I finally get a cluster14:25
jaygthe bad news is it appears puppet did not actually try to add any of the resources to that cluster14:25
gfidentejayg, any at all sounds like it failed14:25
jaygthe last Pacemaker ref I see in my puppet log is turning off stonith (no fails I see at end either)14:25
*** jrist has quit IRC14:25
gfidenteand never goes past that?14:26
gfidenteand the deployment is in create_failed?14:26
*** regebro has quit IRC14:26
gfidenteso I think puppet failed on one of the three nodes14:27
jaygok, I'll log into the other 3 and look further14:27
*** Marga_ has quit IRC14:29
*** dguerri is now known as dguerri`away14:29
*** dguerri`away is now known as dguerri14:29
jayggfidente: ok, it appears that on one node, mongo failed to start, due to a timeout14:33
* jayg considers setting smallfiles=true and retrying deployment14:33
*** ukalifon has quit IRC14:33
gfidentejayg, mongo is not supposed to start until pacemaker does14:33
gfidentebut there is an issue with the module which doesn't allow that yet14:33
gfidenteso it starts together with the cluster14:33
gfidentenot sure why it timed out though14:33
jaygwell, if smallfiles is not true, it takes forever to allocate disk space14:34
jaygwhat module doesn't allow this ordering though?  I am pretty sure we never had mongo start before the cluster in quickstack14:34
gfidenteit starts the daemon when doing the config14:35
mariosgfidente: i also saw the same as jayg i think: Error: /Stage[main]/Mongodb::Server::Service/Mongodb_conn_validator[mongodb]/ensure: change from absent to present failed: Unable to connect to mongodb server!14:35
jaygmarios: that one sometimes fails a couple times while waiting for service to come up14:36
jaygbut the service should eventually come up (mine did not)14:36
mariossorry, i wasn't overly excited about reporting that, the ! was part of the log :)14:36
jaygyou know you were excited14:36
jaygeveryone loves to see mongo fail14:36
*** jrist has joined #tripleo14:39
*** jrist has joined #tripleo14:39
*** sdake has joined #tripleo14:42
*** sdake_ has joined #tripleo14:43
gfidentejayg, can we make some resource agent verbose?14:44
gfidentelooks like we ended up with pacemaker saying the rabbitmq cluster is up but rabbitmqctl says they are isolated14:44
jayggfidente: I am not sure what you mean by agent being verbose - pacemaker only reports on if service responds as running or not, not whether it is working properly14:46
gfidentejayg, yeah so rabbitmq-cluster ocf14:46
gfidentelogging cluster bootstrapped14:47
gfidentebut the rabbitmq nodes are isolated14:47
*** akrivoka has joined #tripleo14:47
*** sdake has quit IRC14:47
jayglet me take a look at that part of the puppet in my checkout here14:47
jistrpresently we're missing ordered=true interleave=true as clone params though, i'm wondering if that could make a difference.14:47
* jistr will try14:48
jaygjistr: you have switched to newest puppet-pacemaker, right?14:48
jistrjayg: yes14:48
jistri know adding them is easy14:49
jaygthere was a patch from spredzy|afk after crag's stuff, which we merged, just want to make sure you have that too14:49
jistri rebuilt the images today so i'm hoping i do14:49
gfidentejistr, looks like ordered is pretty important14:50
*** zbitter has joined #tripleo14:50
jaygyeah, those two look like the biggest diffs in config atm14:52
lsmola_gfidente: is it possible we don't have cinder registered to keystone?14:52
gfidentelsmola_, don't think so, something failed if you don't have it14:52
lsmola_gfidente: I see only object store there14:53
lsmola_gfidente: damn14:53
*** aufi has quit IRC14:53
lsmola_gfidente: which step does the registration?14:53
*** mmagr has quit IRC14:53
gfidentelsmola_, a script launched after the deployment finishes14:53
gfidenteautomatically by instack14:53
gfidenteadding users and endpoints14:54
lsmola_gfidente: hmm, wait, I should be talking to overcloud cinder right? :-)14:54
jistrgfidente, jayg: so looks like ordered=true fixed that14:54
jistrgfidente, jayg: i'll submit a patch14:54
jaygjistr: awesome14:54
lsmola_gfidente: so my bad again :-D14:54
*** zb has joined #tripleo14:57
*** akrivoka_ has joined #tripleo14:58
*** pblaho has quit IRC14:59
*** untriaged-bot has joined #tripleo15:00
untriaged-botUntriaged bugs so far:15:00
openstackLaunchpad bug 1455175 in tripleo "Option to configure gateway through keepalived" [Undecided,New] - Assigned to Mayank (mayank0107)15:00
openstackLaunchpad bug 1449852 in diskimage-builder "Buidling ramdisk with ironic-agent behind proxy fails" [Undecided,In progress] - Assigned to Ramakrishnan G (rameshg87) (rameshg87)15:00
openstackLaunchpad bug 1449854 in diskimage-builder "Ironic agent ramdisk built using disk-image-create fails with iscsi_ilo driver" [Undecided,Fix committed] - Assigned to Ramakrishnan G (rameshg87) (rameshg87)15:00
openstackLaunchpad bug 1454803 in tripleo "puppet: Neutron is not configured with L2 population" [Undecided,New]15:00
openstackLaunchpad bug 1454802 in tripleo "puppet: Neutron does not use Nova notifications" [Undecided,New]15:00
openstackLaunchpad bug 1452752 in tuskar "keystone_authtoken section is wrong in default shipped tuskar.conf.sample" [Undecided,Confirmed]15:00
openstackLaunchpad bug 1456648 in diskimage-builder "rhel-common element tries to attach when using activation key" [Undecided,In progress] - Assigned to John Trowbridge (trown)15:00
*** untriaged-bot has quit IRC15:01
*** zbitter has quit IRC15:01
*** marzif has joined #tripleo15:01
*** akrivoka has quit IRC15:02
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: Clone params for pacemaker rabbitmq resource  https://review.openstack.org/18426315:03
gfidentejistr, on top or outside that tree?15:04
*** akrivoka__ has joined #tripleo15:04
*** yamahata has joined #tripleo15:04
jistrgfidente: on top, sans the horizon bit15:04
gfidenteI'm even more convinced we need to merge things now :)15:04
jistr(it has to be on top of at least some of the moves, otherwise it would conflict)15:04
jistryeah :D15:04
gfidenteah right15:04
gfidentejayg, any chance to cherry-pick https://review.openstack.org/#/c/184263 on top of what you have already?15:05
jayggfidente: already grabbing it  :)15:05
*** shardy_ has joined #tripleo15:05
jaygfound why my last setup failed, repairing then trying again15:05
gfidenteyeah hopefully mongo won't stop you halfway there15:05
gfidentejayg, what was it?15:06
jaygI set smallfiles=true for that this time15:06
gfidentejayg, if you think it is worth it15:06
jaygin addition to smallfiles my puppet-pacemaker was not up to date like we thought it was15:06
gfidentejust push it on the dance floor! :)15:06
jayggfidente: it is not for production, afaik, but sometimes needed for VMs15:06
jaygwe never included it in ofi/quickstack15:07
jaygbut I hacked it in for dev all the time15:07
gfidentegot it15:07
*** julim has quit IRC15:07
*** shardy has quit IRC15:07
gfidentejayg, you default to neutron l3_ha for multiple nodes?15:07
gfidenteI wanted to enable that as well15:07
jaygyes we do15:07
*** akrivoka_ has quit IRC15:07
*** dguerri is now known as dguerri`away15:07
jayggfidente: a big one there is to make sure l2pop is off15:08
gfidenteyeah we don't support it yet15:08
jaygthere is a bug in neutron that makes the two not work together15:09
gfidenteso it is always off AFAIK15:09
jaygso if you dont have l3_ha, what _do_ you have?15:09
jaygwe do either or15:09
*** dguerri`away is now known as dguerri15:09
gfidentel3 relocation15:10
gfidenteand l3_ha either one or the other15:10
*** zbitter has joined #tripleo15:10
* jayg has never used that :)15:10
gfidentework for dvr probably paused, but some is in there already15:10
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: Clone params for pacemaker rabbitmq resource  https://review.openstack.org/18426315:11
*** shardy_ has quit IRC15:11
*** julim has joined #tripleo15:11
*** shardy has joined #tripleo15:11
*** panda has quit IRC15:13
*** panda has joined #tripleo15:14
*** zb has quit IRC15:14
*** zaneb has joined #tripleo15:14
lsmola_gfidente: so running service list, but I see only one volume service, for 2 deployed ceph nodes15:16
gfidentelsmola_, correct, you have cinder-volume running only on controller15:17
gfidenteceph osd dump15:17
gfidenteshould tell you about the ceph storage status15:17
gfidenteceph -s15:17
lsmola_gfidente: and hostname is weird, controller servicer has hostname of controller node, but volume doesn't match any ceph node15:17
*** zbitter has quit IRC15:17
gfidentehostname of the tripleo_ceph instance is customized to arbitrary string on purpose, jistr ^^15:18
gfidenteif you had more controllers they would all share the same string for that backend15:18
gfidenteinstead of their hostname15:18
*** cwolferh has quit IRC15:19
gfidentejayg, and you were probably enabling l3_ha based on number of controllers right? because it doesn't work with 1 node15:19
*** akrivoka__ has quit IRC15:20
*** ParsectiX has joined #tripleo15:21
gfidentejayg, I also wanted to ask about clustercheck15:22
gfidenteping when you're not testing :P15:22
lsmola_gfidente: hmm, so the volume <-> host relation is not exposed in APIs?15:22
*** zb has joined #tripleo15:22
jayggfidente: we allow l3_ha even on one node, but there are a couple other settings that vary if you have one node or more15:23
* jayg wrapping up call :)15:23
*** akrivoka has joined #tripleo15:23
gfidentejayg, isn't it breaking neutron-server on 1 node preventing it from starting?15:23
jayggfidente: let me double check myself, but that has not been a problem15:25
gfidenteack thanks, then I can ask about clustercheck15:25
gfidenteyou should put a bot in the channel15:25
gfidentecapturing questions15:25
gfidenteand giving known good responses15:25
gfidentelsmola_, I am not sure about 'not exposed' ?15:26
*** akrivoka_ has joined #tripleo15:26
lsmola_gfidente: similar to VMs, you can see on which host it is running15:27
*** zaneb has quit IRC15:27
jaygthis is what we do https://github.com/redhat-openstack/astapor/blob/master/puppet/modules/quickstack/manifests/pacemaker/neutron.pp#L10215:27
gfidentelsmola_, ah so cinder-volume is always running on all controllers15:27
lsmola_gfidente: so here it's hidden behind the cluster?15:27
gfidentelsmola_, which piece of information is hidden?15:27
lsmola_gfidente: and the actual deployed volume?15:27
*** zbitter has joined #tripleo15:27
*** daneyon has quit IRC15:28
*** akrivoka__ has joined #tripleo15:28
gfidentelsmola_, the volume is not hosted by a specific node, that is the very purpose of customizing the host setting15:28
gfidenteif we were to do it, volume would only be available if backing host is up15:28
gfidentejayg, YOU HACK THE NUMBERS! :P15:29
* gfidente crying15:29
*** akrivoka__ has quit IRC15:29
gfidentejayg, maybe we should do that too :)15:29
lsmola_gfidente: I thought it would just be replicated on a few hosts?15:29
gfidentelsmola_, it is, but ceph is doing that, not cinder15:29
jistrgfidente, jayg: :)))15:29
lsmola_gfidente: right, so that info is not exposed through cinder?15:30
jayggfidente: I don't hack, them, I set them correctly  :)15:30
gfidentelsmola_, ack now I get it, that info is not even known to cinder15:30
lsmola_gfidente: any idea which command can list this talking to ceph?15:30
*** akrivoka has quit IRC15:30
gfidentejayg, YOU HACK THE NUMBER!15:30
*** zaneb has joined #tripleo15:30
gfidenteand I mean YOU15:30
gfidentenot you15:30
gfidenteand now I WILL too :15:31
*** weshay has joined #tripleo15:31
*** akrivoka_ has quit IRC15:31
jaygooooh, I am watching a second puppet run that may be doing what I want for a change here....15:31
gfidentejayg, good :)15:32
*** zb has quit IRC15:32
*** zbitter has quit IRC15:34
*** whayutin_ has joined #tripleo15:35
*** julim has quit IRC15:37
*** CheKoLyN has joined #tripleo15:38
*** weshay has quit IRC15:38
openstackgerritPino Toscano proposed openstack/diskimage-builder: Cleanup the build directories earlier  https://review.openstack.org/18426815:38
*** CheKoLyN has quit IRC15:38
*** saguilar has joined #tripleo15:39
*** athomas has joined #tripleo15:40
* gfidente wait_for ping15:45
gfidentelsmola_, so cinder doesn't know about what ceph is doing in terms of replication15:46
gfidentejistr, IT REPLIED TO PING!15:47
lsmola_gfidente: ok, I'll try to investigate if I can get that relation from ceph API15:47
jistrgfidente: /me wait_for ping15:47
lsmola_gfidente: if not, I guess we don't need it that much :-)15:47
gfidentelsmola_, with rbd lspool volumes15:48
gfidenteyou get list of objects stored from cinder15:48
gfidenteand uuid matches the cinder volume itself15:48
gfidentebut you don't get which nodes are hosting the volume15:48
gfidenteceph is, in the background, maintaining 2/3 replicas at all times15:48
*** whayutin_ has quit IRC15:49
jistrgfidente: IT DID REPLY TO PING. which means i've just had a first END 2 END SUCCESSFUL HA DEPLOYMENT15:49
gfidenteAHAH :)15:49
lsmola_gfidente: where do I run this command?15:49
gfidentelsmola_, controllers15:49
gfidentejistr, including horizon, as it was for me as well15:50
gfidenteso I am not sure what to do with it15:50
jayggfidente jistr: ok, so after my deploy, I have one seemingly transient failure, but galera running - http://paste.fedoraproject.org/223415/14320505/15:50
gfidenteupstream CI still failing instead15:50
jaygand no rabbit partitions - http://paste.fedoraproject.org/223414/05054114/15:50
gfidentejayg, I have seen the galera_monitor to fail as well15:50
gfidenteI vote for merging THE WHOLE THING15:51
jistrgfidente, jayg: yeah i've seen the error too15:51
lsmola_gfidente: seems like wrong command, rbd lspool volumes, running it on overcloud controller15:51
gfidenteand then get back on galera15:51
jistrgfidente: we don't have enough +2s on some15:51
gfidentemarios, ^^ :)15:51
gfidentejayg, question about galera clustercheck is15:52
gfidenteI see it is polled by haproxy15:52
gfidenteto remove unsynced nodes15:52
jistrgfidente: should we start landing those which do have enough +2s? i'd say yes15:52
gfidenteis it used for full cluster restart as well?15:52
gfidentejistr, sure!15:53
gfidentefrom where we are, it's better to merge15:53
jayggfidente: in what way do you mean?15:53
openstackgerritMerged openstack/tripleo-heat-templates: Fix RabbitMQ startup race  https://review.openstack.org/18139815:53
openstackgerritMerged openstack/tripleo-heat-templates: Update to reflect puppet-pacemaker changes  https://review.openstack.org/18310315:54
openstackgerritMerged openstack/tripleo-heat-templates: Configure HAProxy, Galera and MongoDB before start  https://review.openstack.org/18404315:54
openstackgerritMerged openstack/tripleo-heat-templates: Remove unused enable_pacemaker setting from templates  https://review.openstack.org/18405715:55
*** eghobo has joined #tripleo15:55
* jistr EOD ttyl15:57
*** jistr has quit IRC15:57
gfidentemarios, you around for some merging?15:57
lsmola_gfidente: rbd ls volumes --long15:57
jayggfidente: so unless I read /usr/lib/ocf/resource.d/heartbeat/galera incorrectly, I dont see it using clustercheck wrt restarts15:58
lsmola_gfidente: but don't see any link to hosts, or replications15:58
gfidentejayg, yeah that is why I was wondering15:58
lsmola_gfidente: have to run, will try that tomorrow again15:58
gfidentewhat is purpose of clustercheck, just to report about sync status to haproxy?15:58
jaygwe use it also to make sure galera is ready before trying to set up anything that depends on it15:59
jaygand haproxy takes no action, that is just how it monitors for issues15:59
*** Marga_ has joined #tripleo16:01
gfidenteisn't it removing nodes from sticky table out of sync?16:01
*** saguilar has quit IRC16:01
*** cwolferh has joined #tripleo16:02
*** mestery has joined #tripleo16:04
*** dguerri is now known as dguerri`away16:05
jayggfidente: if the node is down, it would move traffic to a different one, but I think it keeps checking16:06
*** eghobo has quit IRC16:07
*** daneyon has joined #tripleo16:07
gfidenteso we have something which defines different nodes as master/backup16:07
gfidentein haproxy16:07
*** mcornea has quit IRC16:07
gfidenteI can see we prefer the master node when it is up, when we really shouldn't16:07
gfidentebut is there any other difference with the sticky-table approach that you can tell?16:08
jaygcwolferh: maybe you can answer this question better?  wrt stick table+galera+haproxy16:09
*** yamahata has quit IRC16:10
*** dguerri`away is now known as dguerri16:10
gfidentecurrently we do deploy clustercheck but we don't use it from haproxy yet because of the master/backup pre-existing approach16:10
*** ParsectiX has quit IRC16:10
cwolferher, what is the question?16:10
gfidentecwolferh, sec code is easier16:11
*** ParsectiX has joined #tripleo16:11
gfidentewe don't have this in tripleo https://github.com/redhat-openstack/astapor/blob/master/puppet/modules/quickstack/manifests/load_balancer/galera.pp#L2216:11
openstackgerritPino Toscano proposed openstack/diskimage-builder: Cleanup the build directories earlier  https://review.openstack.org/18426816:11
gfidenteinstead we append backup key to all-except-one node16:12
gfidenteyet we deploy clustercheck16:12
*** athomas has quit IRC16:13
jayghmm, well the linked quickstack bit is what the ref arch recommends...16:13
jaygso why arent we doing that?16:13
gfidenteso the questions are: 1) is it needed for full cluster restart in some way? is there more than haproxy polling 9200? and 2) how does sticktable in haproxy compare to the master/backup thing? shall we use sticktable instead?16:13
gfidentejayg, because some of this stuff was in tripleo already16:13
gfidentejayg, it aimed at multiple controllers without pacemaker16:13
*** mestery has quit IRC16:14
cwolferhi don't understand what the master/backup thing is, i thought all nodes were masters when galera was deployed as a pacemaker service16:14
gfidentejayg, so we're slowly 'migrating' the config16:14
jaygah, ok16:14
gfidentecwolferh, all galera nodes are, it is in haproxy we define all-but-one as backup so they only get proxied if master is down16:14
gfidentecwolferh, https://github.com/stackforge/puppet-tripleo/blob/master/manifests/loadbalancer.pp#L571-L58816:15
cwolferhgfidente, n-1 masters and 1 backup, or 1 master and n-1 backups?16:15
gfidenten-1 backups16:15
cwolferheither way though, i think you are better off with stick table16:15
cwolferhif the master goes down, then you are in a round robin scenario between backups it seems to me (naively, not really familiar with master/backup)16:16
gfidentenah it is only using 1 backup at a time16:16
gfidenteas long as one is available it uses the first one it can find16:16
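The master/backup selection gfidente describes above (all-but-one server flagged as backup, and only the first available backup used when the master is down) can be modelled roughly as follows. This is a toy model, not HAProxy code: the server names are hypothetical, and it assumes the backend does not set `option allbackups`, which would spread traffic across all backups instead.

```python
def pick_servers(servers):
    """Return the servers that would receive traffic.

    servers: list of (name, is_backup, is_up) tuples in configuration order.
    Non-backup servers that are up always win. If none is up, only the first
    backup that is up is used (assumed default, i.e. no `option allbackups`).
    """
    active = [name for name, is_backup, is_up in servers
              if is_up and not is_backup]
    if active:
        return active
    for name, is_backup, is_up in servers:
        if is_backup and is_up:
            return [name]
    return []


# Hypothetical three-controller galera backend: one master, two backups.
backend = [
    ("ctrl0", False, True),
    ("ctrl1", True, True),
    ("ctrl2", True, True),
]
print(pick_servers(backend))  # master is up

# Same backend with the master down: only the first backup takes traffic.
master_down = [("ctrl0", False, False)] + backend[1:]
print(pick_servers(master_down))
```

Under these assumptions there is no round robin between backups, which matches the reply to cwolferh's concern.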
gfidenteanyway, I'll track this as a bug so we get on it with some medium priority16:16
*** zaneb has quit IRC16:17
*** saguilar has joined #tripleo16:19
gfidentejayg, so regarding the l3_ha setting instead16:20
gfidentehttps://github.com/redhat-openstack/astapor/blob/master/puppet/modules/quickstack/manifests/pacemaker/neutron.pp#L102 < I see minimum set to 216:20
gfidentebut I get InternalServerError: Not enough l3 agents available to ensure HA. Minimum required 2, available 1.16:20
gfidentewith single node, do you disable it with single node then?16:21
*** daneyon has quit IRC16:21
*** Marga_ has quit IRC16:27
*** Marga_ has joined #tripleo16:27
jayggfidente: hmm, not explicitly that I recall, unless staypuft was doing so16:28
gfidenteyeah because if I enforce min to 1 I get16:29
gfidenteHAMinimumAgentsNumberNotValid: min_l3_agents_per_router config parameter is not valid. It has to be equal to or more than 2 for HA.16:29
jayggfidente: can you paste your neutron config?16:31
gfidentejayg sure16:31
gfidentenot anymore I deleted stack16:31
jaygah, ok16:32
jaygbecause if you look, even with l3_ha= true, if the cluster size were 1, we would set dhcp_agents_per_network=1, and min is the same in either case16:33
gfidentedhcp_agents is fine16:33
jaygthe only bits that change are clone-max and max_l3_agents_per_router16:33
gfidenteit is min_l3_agents the issue16:33
jaygthat value is the default anyway16:33
jaygso there must be a different setting triggering this16:34
gfidenteyes indeed if you don't change it and set l3_ha with 1 neutron server, the neutron client returns the error I pasted16:34
gfidenteclone is only affecting the pacemaker resource no?16:34
jaygah, looking at neutron::server, it must be as you say, that 'DEFAULT/l3_ha':                    value => true;16:36
jaygmust break it16:36
*** mestery has joined #tripleo16:37
gfidenteyep it won't work with less than 216:39
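The constraint discussed above (l3_ha needs at least 2 L3 agents, and min_l3_agents_per_router cannot be set below 2) suggests deriving the setting from the number of controllers. A minimal sketch of that decision, with a hypothetical helper name and a hypothetical return shape; it is not the code from any of the modules linked in the chat:

```python
HA_MINIMUM_AGENTS = 2  # lower bound quoted in the neutron error message above


def l3_ha_settings(controller_count, min_agents=HA_MINIMUM_AGENTS):
    """Decide whether l3_ha can be enabled for a given controller count.

    Hypothetical helper: assumes one L3 agent per controller.
    """
    if min_agents < HA_MINIMUM_AGENTS:
        # Mirrors the HAMinimumAgentsNumberNotValid error seen in the chat.
        raise ValueError("min_l3_agents_per_router must be >= 2 for HA")
    return {
        "l3_ha": controller_count >= min_agents,
        "min_l3_agents_per_router": min_agents,
    }


print(l3_ha_settings(1))  # single node: HA has to stay off
print(l3_ha_settings(3))  # three controllers: HA can be enabled
```

With a single controller this leaves l3_ha off instead of lowering the minimum, since the minimum itself is rejected below 2.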
jayggfidente: you know, now that I have been trying to remember for a bit, I think I have heard reports of this before, and that staypuft would set it to false if one node16:40
gfidenteyou DO NOT HACK THE NUMBERS!16:40
gfidentesee you tomorrow16:41
gfidenteI am done for today16:41
jayggood night!16:41
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Enable NeutronL3HA by default in Pacemaker scenario  https://review.openstack.org/18428916:42
*** ParsectiX has quit IRC16:47
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Enable NeutronL3HA by default in Pacemaker scenario  https://review.openstack.org/18428916:47
*** ParsectiX has joined #tripleo16:47
gfidentejayg, can you vote the deps tree as well?16:49
jayggfidente: do I just do each individually?16:50
gfidenteyeah you have to do individually16:50
gfidentetopmost can't be merged until the deps also are16:50
gfidenteeven if it gets all the votes16:50
jaygsure thing, I'll go through them each as well16:51
gfidentewe switched rabbit client to list of hosts as well16:51
gfidenteand there was some missing sync_db16:52
gfidentethe other three are minor restructuring16:52
gfidenteagain, ttyt :)16:52
jaygk, ttyt16:52
*** gfidente has quit IRC16:53
*** sdake_ is now known as sdake16:55
*** ParsectiX has quit IRC16:57
*** ParsectiX has joined #tripleo16:57
*** trown is now known as trown|lunch17:01
*** Marga_ has quit IRC17:01
*** adrianopetrich_ has joined #tripleo17:03
*** adrianopetrich has quit IRC17:05
*** noslzzp has joined #tripleo17:09
*** noslzzp has quit IRC17:13
*** yog__ has quit IRC17:14
*** mestery has quit IRC17:14
*** adrianopetrich_ has quit IRC17:22
*** mestery has joined #tripleo17:28
*** saguilar has quit IRC17:32
*** sdake has quit IRC17:40
*** sdake has joined #tripleo17:42
*** eghobo has joined #tripleo17:47
*** Marga_ has joined #tripleo17:48
*** daneyon has joined #tripleo17:50
*** MasterPiece has quit IRC17:50
*** sdake has quit IRC17:51
*** david-lyle has joined #tripleo17:53
*** sdake has joined #tripleo17:56
*** daneyon has quit IRC17:59
*** mestery has quit IRC17:59
*** daneyon has joined #tripleo18:00
*** david-lyle has quit IRC18:01
*** adrianopetrich has joined #tripleo18:03
*** eghobo has quit IRC18:03
*** david-lyle has joined #tripleo18:07
*** saguilar has joined #tripleo18:10
*** daneyon has quit IRC18:14
*** daneyon has joined #tripleo18:16
*** daneyon has quit IRC18:18
*** trown|lunch is now known as trown18:18
*** sdake has quit IRC18:18
*** spredzy|afk is now known as spredzy18:18
openstackgerritMerged openstack-infra/tripleo-ci: Replace ci.o.o links with docs.o.o/infra  https://review.openstack.org/18330618:34
*** gfidente has joined #tripleo18:41
*** gfidente has quit IRC18:44
*** alop has joined #tripleo18:48
*** barra204 has joined #tripleo18:50
*** barra204_ has joined #tripleo18:50
*** adrianopetrich has quit IRC18:54
*** david-lyle has quit IRC18:57
*** david-lyle has joined #tripleo19:01
*** jprovazn has quit IRC19:02
*** Goneri has quit IRC19:03
*** cwolferh has quit IRC19:04
*** barra204__ has joined #tripleo19:05
*** barra204_ has quit IRC19:05
*** openstackgerrit has quit IRC19:06
*** openstackgerrit has joined #tripleo19:06
*** barra204 has quit IRC19:06
*** barra204 has joined #tripleo19:06
*** panda has quit IRC19:13
*** panda has joined #tripleo19:14
*** barra204__ has quit IRC19:22
*** barra204 has quit IRC19:22
openstackgerritgreghaynes proposed openstack/diskimage-builder: Install debian locales  https://review.openstack.org/18389119:23
*** MasterPiece has joined #tripleo19:23
openstackgerritgreghaynes proposed openstack/diskimage-builder: Add debian build test case  https://review.openstack.org/18116119:23
openstackgerritgreghaynes proposed openstack/diskimage-builder: Add tests for building *-minimal images  https://review.openstack.org/18116219:23
*** alop has quit IRC19:28
*** akrivoka has joined #tripleo19:33
*** barra204__ has joined #tripleo19:35
*** barra204 has joined #tripleo19:35
*** barra204 has quit IRC19:37
*** barra204__ has quit IRC19:37
*** barra204__ has joined #tripleo19:37
*** barra204 has joined #tripleo19:37
*** ParsectiX has quit IRC19:39
*** ParsectiX has joined #tripleo19:40
*** Marga_ has quit IRC19:42
openstackgerritgreghaynes proposed openstack/diskimage-builder: Add tests for building *-minimal images  https://review.openstack.org/18116219:42
*** daneyon has joined #tripleo19:45
*** akrivoka has quit IRC19:46
*** cwolferh has joined #tripleo19:48
*** david-lyle has quit IRC19:52
*** dguerri is now known as dguerri`away19:54
*** sdake has joined #tripleo20:06
*** jayg is now known as jayg|g0n320:09
*** dasm|afk has quit IRC20:14
*** ParsectiX has quit IRC20:18
*** ParsectiX has joined #tripleo20:18
*** ParsectiX has quit IRC20:22
*** ParsectiX has joined #tripleo20:22
*** daneyon has quit IRC20:24
*** akrivoka has joined #tripleo20:29
*** ifarkas has quit IRC20:31
*** dsneddon has joined #tripleo20:39
*** sdake has quit IRC20:39
*** radez_g0n3 is now known as radez20:41
*** daneyon has joined #tripleo20:41
*** eghobo has joined #tripleo20:42
*** dsneddon has quit IRC20:42
*** daneyon has quit IRC20:45
*** eghobo_ has joined #tripleo20:45
*** radez is now known as radez_g0n320:46
openstackgerritChristopher Dearborn proposed openstack/tripleo-image-elements: Have all os-refresh-config elements use su instead of sudo  https://review.openstack.org/17130320:47
*** eghobo has quit IRC20:48
*** daneyon has joined #tripleo20:53
*** daneyon has quit IRC20:54
*** david-lyle has joined #tripleo20:55
*** akrivoka has quit IRC20:57
*** david-lyle has quit IRC20:58
*** untriaged-bot has joined #tripleo21:00
untriaged-botUntriaged bugs so far:21:00
openstackLaunchpad bug 1455175 in tripleo "Option to configure gateway through keepalived" [Undecided,New] - Assigned to Mayank (mayank0107)21:00
openstackLaunchpad bug 1449852 in diskimage-builder "Buidling ramdisk with ironic-agent behind proxy fails" [Undecided,In progress] - Assigned to Ramakrishnan G (rameshg87) (rameshg87)21:00
openstackLaunchpad bug 1449854 in diskimage-builder "Ironic agent ramdisk built using disk-image-create fails with iscsi_ilo driver" [Undecided,Fix committed] - Assigned to Ramakrishnan G (rameshg87) (rameshg87)21:00
openstackLaunchpad bug 1454803 in tripleo "puppet: Neutron is not configured with L2 population" [Undecided,New]21:00
openstackLaunchpad bug 1454802 in tripleo "puppet: Neutron does not use Nova notifications" [Undecided,New]21:00
openstackLaunchpad bug 1452752 in tuskar "keystone_authtoken section is wrong in default shipped tuskar.conf.sample" [Undecided,Confirmed]21:00
openstackLaunchpad bug 1456648 in diskimage-builder "rhel-common element tries to attach when using activation key" [Undecided,In progress] - Assigned to John Trowbridge (trown)21:00
*** untriaged-bot has quit IRC21:00
*** david-lyle has joined #tripleo21:01
*** eghobo_ has quit IRC21:05
*** trown is now known as trown|outttypeww21:13
*** dguerri`away is now known as dguerri21:16
*** lblanchard has quit IRC21:18
*** barra204__ has quit IRC21:23
*** barra204_ has joined #tripleo21:23
*** barra204__ has joined #tripleo21:23
*** thrash is now known as thrash|g0ne21:23
*** barra204 has quit IRC21:24
*** rlandy has quit IRC21:35
*** david-lyle has quit IRC21:41
*** eghobo has joined #tripleo21:43
*** eghobo_ has joined #tripleo21:46
*** david-lyle has joined #tripleo21:49
*** eghobo has quit IRC21:50
*** eghobo_ has quit IRC21:56
*** barra204 has joined #tripleo21:57
*** barra204__ has quit IRC21:57
*** barra204_ has quit IRC21:58
*** barra204_ has joined #tripleo21:58
*** barra204_ is now known as shakamunyi22:04
*** athomas has joined #tripleo22:08
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Update neutron local_ip to use the tenant network  https://review.openstack.org/17871622:15
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Add a network ports IP mapping resource  https://review.openstack.org/17871422:15
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Add isolated network ports to block storage roles  https://review.openstack.org/18082422:15
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Add isolated network ports to swift roles  https://review.openstack.org/18082322:15
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Add isolated network ports to ceph roles  https://review.openstack.org/18082222:15
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Add isolated network ports to controller roles  https://review.openstack.org/17784622:15
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Add isolated network ports to compute roles  https://review.openstack.org/18082122:15
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Add a ports (ip address) abstraction layer  https://review.openstack.org/17784522:15
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Add isolated net parameters to net-config stacks  https://review.openstack.org/18082022:15
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Wire in optional network creation for overcloud  https://review.openstack.org/17784422:15
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Switch net-config templates to use OS::stack_id  https://review.openstack.org/18434322:15
*** david-lyle has quit IRC22:19
*** david-lyle has joined #tripleo22:21
*** david-lyle has quit IRC22:27
openstackgerritMerged openstack/diskimage-builder: Update install docs to be more user friendly  https://review.openstack.org/17272322:31
*** barra204 has quit IRC22:46
*** shakamunyi has quit IRC22:46
*** rainya has joined #tripleo22:50
*** athomas has quit IRC22:54
*** saguilar has quit IRC23:00
*** Guest38176 is now known as mgagne23:05
*** mgagne has joined #tripleo23:05
*** panda has quit IRC23:13
*** panda has joined #tripleo23:14
*** chlong has joined #tripleo23:19
*** chlong_ has joined #tripleo23:20
*** rainya has quit IRC23:24
*** Marga_ has joined #tripleo23:28
*** chlong has quit IRC23:31
*** david-lyle has joined #tripleo23:35
*** rainya has joined #tripleo23:36
*** daneyon has joined #tripleo23:49
*** eghobo has joined #tripleo23:54
*** daneyon has quit IRC23:59
*** daneyon has joined #tripleo23:59