Friday, 2016-11-18

*** panda is now known as panda|Zz00:01
*** ayoung has quit IRC00:02
openstackgerritMerged openstack/python-tripleoclient: Support whole disk images in TripleO  https://review.openstack.org/39442600:04
*** asalkeld has joined #tripleo00:07
*** ooolpbot has joined #tripleo00:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION00:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242900:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]00:10
*** ooolpbot has quit IRC00:10
*** asalkeld has quit IRC00:12
*** penick has joined #tripleo00:15
*** penick_ has joined #tripleo00:18
*** penick has quit IRC00:19
*** penick_ is now known as penick00:19
openstackgerritMerged openstack/diskimage-builder: Change path for dnf arch override so basearch is not overwritten.  https://review.openstack.org/39917500:20
openstackgerritMerged openstack/diskimage-builder: debian: install dialog package  https://review.openstack.org/39721800:21
*** kjw3 has quit IRC00:21
openstackgerritMerged openstack/diskimage-builder: lib: common-functions: Fix tmpfs umounting  https://review.openstack.org/39200200:22
openstackgerritMerged openstack/diskimage-builder: Cleanup yumdownloader repos  https://review.openstack.org/39592100:23
openstackgerritMerged openstack/diskimage-builder: Disable all repos in os-refresh-config too  https://review.openstack.org/39863000:23
*** asalkeld has joined #tripleo00:27
openstackgerritAndreas Karis proposed openstack/tripleo-heat-templates: Disable Options Indexes in horizon  https://review.openstack.org/39155000:31
*** asalkeld has quit IRC00:32
*** hjensas has quit IRC00:35
*** limao has joined #tripleo00:39
*** ayoung has joined #tripleo00:49
*** morazi has quit IRC00:53
*** dhill_ has quit IRC01:05
*** dhill__ has joined #tripleo01:06
*** ayoung has quit IRC01:06
*** achadha has quit IRC01:10
*** ooolpbot has joined #tripleo01:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION01:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242901:10
*** ooolpbot has quit IRC01:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]01:10
*** achadha has joined #tripleo01:10
*** chlong has quit IRC01:11
*** dhill__ has quit IRC01:12
*** hjensas has joined #tripleo01:12
*** hjensas has joined #tripleo01:12
*** achadha has quit IRC01:13
*** achadha_ has joined #tripleo01:13
*** achadha_ has quit IRC01:15
*** achadha has joined #tripleo01:15
*** penick has quit IRC01:17
*** achadha_ has joined #tripleo01:19
*** achadha has quit IRC01:19
*** achadha_ has quit IRC01:24
*** rhallisey has quit IRC01:24
*** myoung|bbl is now known as myoung01:41
*** myoung is now known as myoung|biab01:42
*** dmacpher has joined #tripleo01:43
*** newmember has joined #tripleo01:48
*** achadha has joined #tripleo01:58
*** tiswanso has joined #tripleo02:02
*** tiswanso has quit IRC02:03
*** achadha has quit IRC02:03
*** tiswanso has joined #tripleo02:03
*** achadha has joined #tripleo02:09
*** ooolpbot has joined #tripleo02:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION02:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242902:10
*** ooolpbot has quit IRC02:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]02:10
openstackgerritgecong proposed openstack/diskimage-builder: Fix a typo  https://review.openstack.org/39934102:31
*** nyechiel has joined #tripleo02:33
*** lblanchard has quit IRC02:43
*** fzdarsky_ has joined #tripleo02:56
*** fzdarsky|afk has quit IRC02:59
*** fragatina has quit IRC03:03
*** nyechiel has quit IRC03:04
*** fragatina has joined #tripleo03:05
*** nyechiel has joined #tripleo03:08
*** fragatin_ has joined #tripleo03:09
*** fragatina has quit IRC03:09
*** ooolpbot has joined #tripleo03:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION03:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242903:10
*** ooolpbot has quit IRC03:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]03:10
*** fragatin_ has quit IRC03:14
*** rlandy has quit IRC03:14
*** Goneri has joined #tripleo03:17
*** numans has joined #tripleo03:25
*** fultonj has quit IRC03:32
*** yamahata has quit IRC03:36
openstackgerritIan Wienand proposed openstack/diskimage-builder: Turn of tracing around pid/chroot check  https://review.openstack.org/39936503:50
openstackgerritIan Wienand proposed openstack/diskimage-builder: Turn off tracing around pid/chroot check  https://review.openstack.org/39936503:51
openstackgerritCao Xuan Hoang proposed openstack/instack-undercloud: Changed author and author-email  https://review.openstack.org/39936704:02
openstackgerritAparna proposed openstack/diskimage-builder: element: proliant-tools: Update hpssacli to ssacli  https://review.openstack.org/39650404:05
*** ooolpbot has joined #tripleo04:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION04:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242904:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]04:10
*** ooolpbot has quit IRC04:10
*** ctayal has joined #tripleo04:14
*** ctayal has quit IRC04:20
*** bana_k has joined #tripleo04:26
*** Goneri has quit IRC04:27
*** tiswanso has quit IRC04:35
*** coolsvap has joined #tripleo04:36
*** ayoung has joined #tripleo04:43
*** tzumainn has quit IRC04:55
*** nyechiel has quit IRC04:58
*** yamahata has joined #tripleo04:58
*** pgadiya has joined #tripleo05:03
*** masco has joined #tripleo05:06
*** ooolpbot has joined #tripleo05:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION05:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242905:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]05:10
*** ooolpbot has quit IRC05:10
*** udesale has joined #tripleo05:18
openstackgerritMerged openstack/tripleo-puppet-elements: Add puppet-qdr module  https://review.openstack.org/37348805:21
*** prateek has joined #tripleo05:24
*** I has joined #tripleo05:24
*** I is now known as Guest6625405:24
openstackgerritNoam Angel proposed openstack/diskimage-builder: add option to configure cloud-init to allow password authentication  https://review.openstack.org/39176505:25
*** bana_k has quit IRC05:26
*** bana_k has joined #tripleo05:27
*** Guest66254 has quit IRC05:29
*** achadha has quit IRC05:57
*** ooolpbot has joined #tripleo06:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION06:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242906:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]06:10
*** ooolpbot has quit IRC06:10
*** achadha has joined #tripleo06:10
*** mgagne has quit IRC06:18
*** limao has quit IRC06:20
*** limao has joined #tripleo06:21
*** mgagne has joined #tripleo06:21
*** mgagne is now known as Guest5228506:21
*** dmacpher has quit IRC06:24
*** abehl has joined #tripleo06:40
*** lmiccini has joined #tripleo06:40
*** iranzo has joined #tripleo06:51
*** iranzo has joined #tripleo06:51
*** dsariel has joined #tripleo06:55
*** achadha has quit IRC06:57
openstackgerritAlejandro Andreu proposed openstack/puppet-tripleo: Changes default MidoNet API port on HAProxy  https://review.openstack.org/39912506:59
*** b00tcat has joined #tripleo07:00
*** myoung|biab is now known as myoung|pto07:08
*** ooolpbot has joined #tripleo07:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION07:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242907:10
*** ooolpbot has quit IRC07:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]07:10
*** tesseract has joined #tripleo07:18
*** tesseract is now known as Guest9031307:18
*** rasca has joined #tripleo07:19
*** achadha has joined #tripleo07:29
*** chandankumar has joined #tripleo07:34
dciabrindtrainor if the galera issue is still there, can you paste "pcs status" somewhere?07:39
*** jaosorior has joined #tripleo07:39
*** mhenkel has joined #tripleo07:43
jaosoriord0ugal: hey dude, could you take a look at this when you have time? https://review.openstack.org/#/c/397381/07:44
jaosoriord0ugal: this would be how it's used in t-h-t https://review.openstack.org/#/c/397350/07:45
*** pcaruana has joined #tripleo07:45
*** ebarrera has joined #tripleo07:47
*** hjensas has quit IRC07:48
*** cylopez has joined #tripleo07:56
*** ealcaniz has joined #tripleo07:59
*** _milan_ has joined #tripleo07:59
openstackgerritmathieu bultel proposed openstack-infra/tripleo-ci: Implement overcloud upgrade job - Mitaka -> Newton  https://review.openstack.org/32375008:01
*** openstackgerrit has quit IRC08:03
*** openstackgerrit has joined #tripleo08:03
openstackgerritmathieu bultel proposed openstack-infra/tripleo-ci: Implement overcloud upgrade job - Mitaka -> Newton  https://review.openstack.org/32375008:07
*** ooolpbot has joined #tripleo08:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION08:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242908:10
*** ooolpbot has quit IRC08:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]08:10
*** amoralej|off is now known as amoralej08:12
openstackgerritFlavio Percoco proposed openstack/tripleo-quickstart: Pass the libvirt_uri to the pool-define command  https://review.openstack.org/39914108:14
*** hogepodge has quit IRC08:15
*** florianf has joined #tripleo08:20
openstackgerritChristian Schwede proposed openstack/tripleo-heat-templates: Add system-uuid based hostname entries  https://review.openstack.org/35864308:21
*** mcornea has joined #tripleo08:25
openstackgerritAlejandro Andreu proposed openstack/puppet-tripleo: Changes default MidoNet API port on HAProxy  https://review.openstack.org/39912508:26
*** jprovazn has joined #tripleo08:28
openstackgerritMerged openstack/tripleo-heat-templates: Correct AllNodesDeploySteps depends_on  https://review.openstack.org/39798308:31
*** panda|Zz is now known as panda08:32
*** hjensas has joined #tripleo08:33
*** jpena|off is now known as jpena08:35
*** arxcruz has joined #tripleo08:37
*** achadha has quit IRC08:42
*** lifeless has quit IRC08:45
*** pmannidi has quit IRC08:45
*** lifeless has joined #tripleo08:47
*** lazy_prince has quit IRC08:47
*** fzdarsky_ is now known as fzdarsky08:48
*** bana_k has quit IRC08:50
*** lazy_prince has joined #tripleo08:52
*** shardy has joined #tripleo08:53
*** jpich has joined #tripleo08:55
*** pmannidi has joined #tripleo08:58
ccamachogood morning guys!08:59
shardyMorning all!09:01
pandaccamacho: and girls.09:01
shardyWe released ocata-1 yesterday, 2 blueprints and 118 bugs fixed!09:01
shardyGreat work everybody :)09:01
shardyhttps://launchpad.net/tripleo/+milestone/ocata-109:01
pandaand how many bugs created ? :)09:02
shardypanda: sssh! ;)09:02
shardywe don't talk about those :)09:02
*** gfidente has joined #tripleo09:03
*** gfidente has quit IRC09:03
*** gfidente has joined #tripleo09:03
shardyhttps://www.explainxkcd.com/wiki/index.php/1739:_Fixing_Problems09:03
*** paramite has joined #tripleo09:07
*** ooolpbot has joined #tripleo09:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION09:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242909:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]09:10
*** ooolpbot has quit IRC09:10
matbushardy: hey :) i commented the LP : https://bugs.launchpad.net/tripleo/+bug/158312509:13
openstackLaunchpad bug 1583125 in tripleo "There is no 'major version' upgrades job for ci " [High,In progress] - Assigned to mbu (mat-bultel)09:13
matbushardy: the full upgrade job works09:13
matbushardy: but the one from last night reached the 290-minute timeout :/09:13
*** derekh has joined #tripleo09:14
*** tremble has joined #tripleo09:14
matbushardy: so i removed the ceilometer migration (specific to M->N) and re-kicked the experimental job to see what happens09:14
openstackgerritJuan Antonio Osorio Robles proposed openstack/instack-undercloud: Deploy heat APIs over httpd  https://review.openstack.org/39483709:14
shardymatbu: Ok - what are our options to reduce the time?  I saw a patch from bnemec to always use cached images09:14
shardymatbu: is your test using multinode or ovb ?09:15
matbushardy: ovb09:15
matbushardy: i didn't see the patch from ben, but could be a solution09:15
derekhshardy: so I got the env setup to deploy an overcloud with 40 compute nodes, but http://paste.openstack.org/show/589683/09:15
matbushardy: i think it was about 20 minutes short, because it was in the converge step when it timed out09:15
shardyI was also thinking we could have a more minimal test based on multinode, as I've been testing upgrades locally with an all-in-one setup09:16
shardythe coverage isn't as good but it's another option to at least exercise some of the upgrade of the controlplane09:16
shardymatbu: ack, OK I'll check out the patch and let's see if we can reduce the time somehow09:17
matbushardy: yes, btw i need to figure out why the undercloud-upgrades failed09:17
shardymatbu: one option would be to deploy with less services - e.g start by just proving we can upgrade some subset of the services normally enabled09:17
matbushardy: AFAIK this job wasn't really doing a major UC upgrade09:17
matbushardy: yep, i was wondering if we could deploy only the controller (1 node) ? (looks stupid but ...)09:18
shardymatbu: yeah, we could do that via the multinode job, which is already faster than ovb09:19
shardyslagle has that working with more than one node now too09:19
shardyderekh: nice, thanks, looking!09:19
matbushardy: we can switch to multinode09:19
jaosoriorderekh: any idea where that error is coming from? Is it from heat or from here?09:19
jaosorior*from where09:19
shardyderekh: you're missing a patch for the undercloud, sec09:20
shardyhttps://review.openstack.org/#/c/398396/09:20
shardyderekh: we need that patch landed to stable/newton, it fixes the broken increase of the yaql iterators limit09:20
shardyin heat.conf09:21
shardyyou can alternatively manually fix it and restart heat-engine09:21
shardywe bumped it from 200 to 1000 due to this issue09:21
derekhshardy: ok, I'll bump it up and retry09:21
shardyjaosorior: it's from heat09:21
shardyderekh: nice, thanks!09:21
derekhjaosorior: the error is in the heat-engine log, looks like I got a fix09:22
jaosoriornice :D09:22
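
The manual fix shardy mentions (raising the yaql iterators limit in heat.conf from 200 to 1000, then restarting heat-engine) could be scripted roughly as below. This is only a sketch: the section and option names `[yaql]` / `limit_iterators` are assumptions not spelled out in the discussion, the sample config text is made up, and `configparser` drops comments when rewriting, so it is not a drop-in tool for a real heat.conf.

```python
import configparser
import io


def bump_yaql_iterator_limit(conf_text, new_limit=1000):
    """Return conf_text with [yaql] limit_iterators set to new_limit.

    The section/option names are assumed; check them against your
    heat.conf before relying on this.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("yaql"):
        parser.add_section("yaql")
    parser.set("yaql", "limit_iterators", str(new_limit))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Made-up sample standing in for the relevant part of heat.conf.
sample = "[DEFAULT]\ndebug = false\n\n[yaql]\nlimit_iterators = 200\n"
updated = bump_yaql_iterator_limit(sample)
print(updated)
```

After writing the file back, heat-engine still has to be restarted for the new limit to take effect, as noted in the conversation.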
openstackgerritFrederic Lepied proposed openstack/tripleo-specs: Step by step validation Spec  https://review.openstack.org/37233609:26
shardyflepied: Hi, good morning!09:29
shardyflepied: thanks for the update on the step by step validation spec09:29
flepiedmorning shardy09:30
flepiedshardy: catching up after my pto ;-)09:30
*** lucas-afk is now known as lucasagomes09:31
*** shardy has quit IRC09:34
*** shardy has joined #tripleo09:34
* shardy had connection drop09:36
shardyflepied: I've been thinking, we could do it via ansible, with the exact same approach I'm proposing for upgrades here:09:36
shardyhttps://review.openstack.org/#/c/393448/10/puppet/services/heat-engine.yaml09:36
shardywe're already using ansible for pre-flight validations so at least we'd be sticking to one tool09:36
shardyinterested in your views on that - I'm happy to write a patch with a prototype if you like the idea09:36
jaosoriorgfidente: out of curiosity, is there a reason why ceph rgw uses civetweb as a frontend instead of the apache fastcgi frontend?09:40
gfidentejaosorior I think performances09:42
*** achadha has joined #tripleo09:42
jaosoriorgfidente: when ceph is running over civetweb, is it running under a user specific for civetweb (like is the case for apache, that the user and group are httpd) or is it under a ceph user?09:44
gfidenteso the user can be configured and we default to ceph09:46
jaosorioralright, thanks dude09:47
jaosoriorgfidente: I'm checking what's up with getting SSL for ceph (starting with rgw)09:47
gfidentewe could do ssl termination in civetweb09:47
gfidenteit's just an argument for the binding option09:47
*** achadha has quit IRC09:48
jaosoriorgfidente: yeah, it doesn't seem too hard09:48
jaosoriorthe format is the same as haproxy09:48
jaosorior(key and cert in the same pem file)09:48
jaosoriorand we just gotta add "s" to the end of the port to specify that it should be using TLS09:48
gfidentewell, we always have haproxy on front so "public ssl" can be managed there without changes to civetweb no?09:48
jaosoriorright09:48
jaosoriorthat's not an issue09:48
jaosoriorfor internal TLS we need the traffic between haproxy and civetweb to be encrypted as well09:49
gfidenteok so yes civetweb I think expects the s appended to the port and the path to the .pem file09:49
jaosorioryeah, that's what I could discern from the docco09:49
jaosoriorI need to try to do a ceph deployment though09:49
gfidenteis key and cert in same .pem complicated to get?09:49
jaosoriornah, we do exactly the same for haproxy09:50
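
As a rough illustration of what jaosorior and gfidente describe (key and cert in one .pem file, and an "s" appended to the port to turn on TLS), a minimal sketch follows. The function names, the `ssl_certificate=` option spelling, the cert-then-key ordering, and the example path are assumptions for illustration, not something confirmed in this conversation.

```python
def bundle_pem(cert_pem, key_pem):
    """Concatenate a certificate and its private key into one PEM bundle.

    The cert-then-key order is an assumption; the discussion only says
    both live in the same .pem file.
    """
    return cert_pem.strip() + "\n" + key_pem.strip() + "\n"


def civetweb_frontend(port, pem_path=None):
    """Build a civetweb frontend option string.

    Without pem_path the port is plain; with it, 's' is appended to the
    port and the bundle path is passed along (option name assumed).
    """
    if pem_path is None:
        return f"civetweb port={port}"
    return f"civetweb port={port}s ssl_certificate={pem_path}"


# Hypothetical port and path, for illustration only.
print(civetweb_frontend(8080))
print(civetweb_frontend(8080, "/etc/ceph/rgw.pem"))
```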
gfidenteso if you want I have an environment09:50
*** derekjhyang has quit IRC09:51
gfidentewhere you can mess with rgw09:51
gfidenteor I could try the submission myself09:51
jaosoriorawesome09:51
jaosoriorthanks dude09:51
jaosorioryou'll need a FreeIPA deployment though09:51
jaosoriorI'll ping you when I have something usable09:51
gfidenteack we can test it there09:52
jaosoriorgfidente: at some point I gotta think about the traffic between monitors and OSDs though09:52
jaosoriorand I guess that's gonna be a bit problematic, since the nodes are configured without the domain, right? And if we would change the hosts to use the domains, that would end up with issues in the CRUSH map, or not?09:53
*** masco has quit IRC09:54
gfidentewait I think you're thinking ahead of me09:55
gfidenteI don't think we can encrypt traffic in between the nodes natively09:55
jaosoriorgfidente: what would we need to do if we want to encrypt it?09:56
gfidenteso you're proposing to add terminations to do encryption transparently in front of every node?09:56
jaosoriorgfidente: for every service, yeah, the point is to end up with TLS everywhere.09:56
*** dtantsur|afk is now known as dtantsur09:56
*** jtomasek has joined #tripleo09:57
gfidentethough the clients would not support encryption either09:59
gfidenteand they need to get to both the monitors and the osds09:59
gfidenteso it looks like we'd need something like a vpn overlay10:00
jaosoriorgfidente: yeah... swift has the same issue apparently10:00
jaosorioralright, well, that's good to know10:00
gfidenteamongst the replicas you mean?10:01
jaosorioryeah10:01
jaosoriorI was given the same recommendation from the swift folks: "get the nodes on a vpn instead"10:02
jaosoriorok, meanwhile I'll get TLS for the civetweb front-end10:04
jaosoriorsince that seems doable10:05
shardyVPN seems like a large overhead for high traffic storage networks which are already isolated10:05
jaosoriorgfidente: thanks for the info dude10:05
jaosoriorshardy: any other suggestions?10:05
*** rbowen has joined #tripleo10:05
jaosoriorshardy: not using TLS is not a solution for some use-cases, especially when they have to meet government regulations10:05
*** ealcaniz has quit IRC10:06
shardyjaosorior: I'm questioning the requirement I guess - we've already isolated that traffic, it could be on a physically separate network where hardware provides a secure transport between disparate locations10:06
shardyI don't get why you'd demand potentially horribly slow software encryption between nodes in the same rack10:07
shardyif folks have access to the rack, over the wire encryption doesn't really help10:07
gfidenteon the other hand, crazy but if we managed to get a vpn overlay on any given network, I wonder if that wouldn't be the "less intrusive" solution to encrypt all internal communications10:07
* gfidente hides10:08
jaosoriorshardy: regulations...10:08
jaosoriorgfidente: I wouldn't mind deleting all the TLS parts if we end up doing that, to be honest.10:09
shardyjaosorior: Every bank I ever worked with terminated https in a hardware load balancer10:09
shardytunneling storage traffic via hardware is no different10:09
gfidentejaosorior I don't know how hard it would be to get os-net-config to create a vpn10:09
jaosoriorneither do I10:09
gfidentejaosorior it isn't necessarily cleaner or easier10:09
gfidenteit's probably less intrusive on the services configuration10:09
gfidentebut not easier to manage I mean10:10
jaosoriorfor sure10:10
*** ooolpbot has joined #tripleo10:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION10:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242910:10
*** ooolpbot has quit IRC10:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]10:10
shardyhttp://docs.openstack.org/developer/heat/template_guide/openstack.html#OS::Neutron::VPNService10:10
pandaanother promotion blocker: https://bugs.launchpad.net/os-client-config/+bug/164289710:10
openstackLaunchpad bug 1642897 in os-client-config "osc commands fail when using os-client-config >= 1.23.0" [Undecided,New]10:10
gfidenteshardy those would be fake networks in the UC though10:10
gfidentewe need onc to actually do the thing in ifcfg too10:10
jaosoriorshardy: honestly, not up to me, apparently some clients really want TLS everywhere support10:10
shardyjaosorior: yeah, I'm just saying let's fully understand the requirement10:11
shardyfolks often say they want one thing, but don't actually know what they really need ;)10:11
*** katkapilatova has joined #tripleo10:12
*** nyechiel has joined #tripleo10:13
*** hewbrocca_afk is now known as hewbrocca10:14
openstackgerritChristian Schwede proposed openstack/tripleo-heat-templates: Add system-uuid based hostname entries  https://review.openstack.org/35864310:18
*** limao has quit IRC10:22
*** akrivoka has joined #tripleo10:32
*** fragatina has joined #tripleo10:35
*** fragatina has quit IRC10:36
*** fragatina has joined #tripleo10:37
*** yamahata has quit IRC10:41
openstackgerritJulie Pichon proposed openstack/tripleo-ui: Add pxe_drac to the node registration driver list  https://review.openstack.org/39951210:43
*** udesale has quit IRC10:50
*** nyechiel has quit IRC10:51
shardyjaosorior: Hey do you happen to know what the default cache configuration is for keystone when deployed via TripleO?10:57
rascaguys, do we have a diagram somewhere summarizing the relations between services in newton? I'm trying to debug a problem with glance (timeout) but this service is fine, so I'm looking at swift, but would like to have an idea of the overall picture of the relations10:57
shardyjaosorior: I don't see anything explicitly configured in the profile, and I'm trying to deploy a minimal keystone only setup10:58
openstackgerritGael Chamoulaud proposed openstack/tripleo-quickstart: WIP: Update OOOQ to ansible 2.2  https://review.openstack.org/39819410:58
shardyhttp://hardysteven.blogspot.co.uk/2016/08/tripleo-composable-services-101.html10:58
shardyI did that previously as described there, but it doesn't work anymore, trying to figure out what changed10:58
jaosoriorI don't know what the default cache is10:59
shardyjaosorior: Ok, I'll dig into it10:59
jaosoriorshardy: stable/newton or master?10:59
shardywe deploy memcache and redis, and AFAICS we don't configure keystone to use either, although puppet-keystone supports both11:00
shardyjaosorior: on master11:00
shardythe blog was written before we cut newton11:00
*** rbowen has quit IRC11:00
shardyNow haproxy won't deploy without keepalived, and keystone is running but unresponsive11:01
shardyhttps://review.openstack.org/#/c/399152/ aims to fix the keepalived coupling11:01
*** jaosorior is now known as jaosorior_lunch11:04
dtantsurhi folks! how is stable/newton CI feeling today?11:04
*** arxcruz has quit IRC11:06
openstackgerritMerged openstack/tripleo-heat-templates: Use j2 loops in post.j2.yaml  https://review.openstack.org/39623011:08
openstackgerritMartin André proposed openstack/tripleo-heat-templates: Containerized Services for Composable Roles  https://review.openstack.org/33065911:09
*** arxcruz has joined #tripleo11:09
*** ooolpbot has joined #tripleo11:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION11:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242911:10
*** ooolpbot has quit IRC11:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]11:10
*** jkilpatr has quit IRC11:11
*** jtomasek has quit IRC11:13
gfidenteshardy https://review.openstack.org/#/c/399152 any clue how come ha was still passing?11:18
gfidenteI mean before the patch11:18
gfidente(ha job)11:18
*** nyechiel has joined #tripleo11:21
openstackgerritMarios Andreou proposed openstack/tripleo-specs: Composable Service Upgrades  https://review.openstack.org/39211611:32
openstackgerritJulie Pichon proposed openstack/tripleo-ui: Add pxe_drac to the node registration driver list  https://review.openstack.org/39951211:32
shardygfidente: it was hard-coded to true, and it's still true because hiera keepalived_enabled is true11:33
shardygfidente: it only fixes the case where you want to do a local deployment without keepalived11:33
gfidenteshardy no in HA job it should be false11:33
gfidenteHA job does not use keepalived11:34
shardygfidente: where do we disable keepalived?11:34
gfidentehuh but we are not disabling it indeed11:34
shardyafaics it's always on11:34
gfidentethough this seems a mistake to me11:34
shardybandini: ^^ ?11:35
gfidentebandini ^^11:35
openstackgerritGael Chamoulaud proposed openstack/tripleo-quickstart: WIP: Update OOOQ to ansible 2.2  https://review.openstack.org/39819411:35
*** katkapilatova has left #tripleo11:35
openstackgerritArx Cruz proposed openstack/tripleo-quickstart: Configuring vcpus for undercloud, controller and compute nodes  https://review.openstack.org/39659311:35
gfidenteshardy indeed it does not start11:36
gfidenteit's configured but not enabled on boot11:37
shardyugh11:38
gfidenteyeah11:38
shardywe set a boolean in ./puppet/services/pacemaker/haproxy.yaml:            enable_keepalived: false11:38
gfidentethe VIPs are managed by pcmk in the HA job11:38
gfidenteah cook11:38
gfidente*cool11:38
shardyOk I'll have to fix that, as it will break with composable upgrades11:38
gfidenteso we could just set the resource to None11:38
shardywe cannot have any services disabled except via OS::Heat::None or removing from the *Services lists11:38
gfidenteagreed11:39
gfidentecan I take it?11:39
shardygfidente: ack, thanks - I'll post a patch which does that now11:39
gfidenteoh okay I won't do it then11:39
gfidenteadd me on review11:39
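
The approach agreed on above (disable a service by mapping it to OS::Heat::None rather than via a boolean) amounts to a resource_registry entry in a Heat environment file. A small sketch that renders such an environment is below; the `OS::TripleO::Services::<Name>` key pattern and the `Keepalived` name are assumptions here, and the helper is hypothetical rather than part of t-h-t.

```python
def disable_services_env(service_names):
    """Render a Heat environment that maps each service to OS::Heat::None.

    The registry key pattern is assumed; verify the exact resource names
    against the templates you deploy with.
    """
    lines = ["resource_registry:"]
    for name in service_names:
        lines.append(f"  OS::TripleO::Services::{name}: OS::Heat::None")
    return "\n".join(lines) + "\n"


env_text = disable_services_env(["Keepalived"])
print(env_text)
```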
*** jtomasek has joined #tripleo11:41
*** anton has quit IRC11:41
akrivokajpich: hey I just spotted a copy-paste error I made when refactoring the driver fields UI components11:42
*** prateek has quit IRC11:42
akrivokajpich: https://github.com/openstack/tripleo-ui/blob/master/src/js/components/nodes/driver_fields/PXEAndIPMIToolDriverFields.js#L511:42
akrivokajpich: the class should be called PXEAndIPMIToolDriverFields, of course11:43
akrivokajpich: do you want to include the fix with your patch https://review.openstack.org/#/c/399512 since it touches the same files?11:43
*** jaosorior_lunch is now known as jaosorior11:44
*** anton has joined #tripleo11:44
jpichakrivoka: I don't actually touch that one, I think it'd fit better on its own? How come the import didn't fail since the class doesn't exist?11:44
akrivokajpich: because javascript :)11:45
akrivokajpich: I guess when you export default, and then import from another file, it does not matter what you name the thing you imported11:45
akrivokajpich: no probs then, I'll submit a fix11:46
jpichakrivoka: I see I still have a long way to go. Great refactoring btw, I accidentally looked at an older version of these files first and was full of sad11:46
akrivokahaha11:46
openstackgerritSteven Hardy proposed openstack/tripleo-heat-templates: Disable keepalived for HA deployments via t-h-t  https://review.openstack.org/39955411:48
shardygfidente: ^^11:48
jpichakrivoka: Actually thinking further it might be easier to sneakily backport the typo fix if I include it in my fix, if that refactoring is in Newton as well?11:49
akrivokajpich: it is in Newton, yeah go for it :)11:50
jpichakrivoka: Theoretically no backport is allowed without associated bug AFAIU11:50
jpichakrivoka: Alright11:50
*** nyechiel has quit IRC11:53
jaosoriorshardy: do you happen to know if the heat-templates that has the fix I did for the content-type is available in the promoted images?11:54
openstackgerritFrederic Lepied proposed openstack/tripleo-specs: Step by step validation Spec  https://review.openstack.org/37233611:54
openstackgerritSteven Hardy proposed openstack/puppet-tripleo: Remove conditional in keepalived profile  https://review.openstack.org/39955611:54
shardyjaosorior: current-tripleo hasn't promoted in two weeks, so I'd guess not11:54
shardyyou can check by looking at the current-tripleo repo11:54
gfidentethere is bandini leaving +2s :)11:54
shardyhttp://buildlogs.centos.org/centos/7/cloud/x86_64/rdo-trunk-master-tripleo/11:55
bandinigfidente: yeah saw the backlog now ;) when I disabled keepalive that way I was still obviously not entirely clear on the composable stuff :)11:55
gfidenteor we didn't have it finished yet maybe11:56
gfidentebut I was more emphasizing on the +211:56
*** pkovar has joined #tripleo11:56
bandini:)11:57
shardybandini: I have an upgrade related question - does it make sense to stop haproxy and (pacemaker|keepalived) as a first step of upgrades, before stopping any other services?11:58
shardyI asked the same question to marios yesterday but didn't see a reply11:59
shardyI'm initially testing my WIP upgrades patch with a simple nonha environment, and haproxy spews errors if you don't stop it first, then all the services as a second step11:59
shardyI assume we handle this in the pcmk case by taking down the pacemaker cluster, which has dependencies such that haproxy is stopped first?12:00
bandinishardy: correct we basically do "bring down the cluster, yum update, bring up the cluster" (I am skipping over a bunch of details but yeah)12:00
bandiniactually right after the yum upgrade we actually start a minimal subset of services (the ones needed for the schema upgrade tools to work) and then start the rest12:01
bandinishardy: does that answer your question?12:02
pandawhat's the upstream for "import gear" we are using in tripleo-ci ?12:02
shardybandini: Ok, I'm interested in making this work for the non pacemaker case, so I assume it will be 1. stop haproxy + keepalived, 2. stop all services, 3. yum update, 4. start db etc, 5. db sync, 6. start services12:02
shardybandini: yep I think so, thanks12:02
bandinishardy: that sounds about right12:02
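
The non-pacemaker upgrade ordering shardy lists can be sketched as an ordered step list with a runner that stops at the first failure. Only the step order comes from the discussion; the step descriptions, `run_upgrade`, and the dry-run callback are hypothetical and run no real commands.

```python
# Step order as listed in the conversation; wording is paraphrased.
UPGRADE_STEPS = [
    "stop haproxy and keepalived",
    "stop all remaining services",
    "yum update",
    "start the database and other prerequisites",
    "run db sync",
    "start services",
]


def run_upgrade(run_step, steps=UPGRADE_STEPS):
    """Run steps in order and stop at the first one that fails.

    run_step is a callable taking a step description and returning
    True on success. Returns the list of steps that completed.
    """
    completed = []
    for step in steps:
        if not run_step(step):
            break
        completed.append(step)
    return completed


def dry_run(step):
    """Hypothetical runner that only reports what it would do."""
    print("would run:", step)
    return True


done = run_upgrade(dry_run)
```

Keeping the order in a single list makes it easy to see that the load balancer is stopped before the services behind it, which is the point raised about haproxy errors.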
*** dsariel has quit IRC12:05
*** ooolpbot has joined #tripleo12:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION12:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242912:10
*** ooolpbot has quit IRC12:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]12:10
openstackgerrityolanda.robla proposed openstack-infra/tripleo-ci: Enable consuming packages for a feature branch  https://review.openstack.org/39956212:10
openstackgerritJuan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Pass hostname to ceph-rgw  https://review.openstack.org/39956312:10
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Make Ceph RGW bind to hostname instead of IP  https://review.openstack.org/39956412:13
jaosoriorgfidente: what do you think? ^^12:13
pandaderekh: what's the upstream for "import gear" we are using in tripleo-ci ?12:14
*** abregman|afk has quit IRC12:16
derekhpanda: http://git.openstack.org/cgit/openstack-infra/gear/12:17
openstackgerritJustin Kilpatrick proposed openstack/tripleo-quickstart: Add retries to ipxe rom installation  https://review.openstack.org/38981812:20
openstackgerrityolanda.robla proposed openstack-infra/tripleo-ci: Enable consuming packages for a feature branch  https://review.openstack.org/39956212:21
*** derekjhyang has joined #tripleo12:21
flaper87jrist: hey, could you take a look here ? https://review.openstack.org/#/c/330659/ ? :D12:21
*** jbadiapa has quit IRC12:22
*** jbadiapa has joined #tripleo12:22
gfidentejaosorior nah it can't12:22
gfidenteit only works with IPs12:22
gfidenteI tried this before :(12:22
*** jkilpatr has joined #tripleo12:22
pandaderekh: thanks12:23
jaosoriorgfidente: :(12:24
gfidentejaosorior yeah commented in https://review.openstack.org/#/c/399563/112:24
*** ccamacho is now known as ccamacho|lunch12:25
openstackgerritJulie Pichon proposed openstack/tripleo-ui: Add pxe_drac to the node registration driver list  https://review.openstack.org/39951212:26
*** pkovar has quit IRC12:28
*** tobias-fiberdata has joined #tripleo12:29
hewbroccaflaper87: did you mean jistr12:30
hewbroccaor jstir12:30
*** hogepodge has joined #tripleo12:31
hewbroccaYes, I think you meant jistr12:31
hewbroccaHe's off today and (I think) all next week, moving12:32
*** tobias_fiberdata has quit IRC12:32
flaper87hewbrocca: yeah, I meant jistr12:32
flaper87:(12:32
flaper87ok, will ping him on monday12:32
hewbroccaWe have a bot somewhere named jstir12:33
hewbroccajust to mix it up :D12:33
*** rhallisey has joined #tripleo12:33
openstackgerritJavier Peña proposed openstack-infra/tripleo-ci: Properly set distro branch in DLRN when STABLE_RELEASE=newton  https://review.openstack.org/39957812:36
hewbroccaflaper87: maybe shardy could +A in his absence, or dprince?12:36
*** lucasagomes is now known as lucas-hungry12:37
flaper87hewbrocca: that'd be awesome, although dprince hacked on that patch12:38
flaper87I think he can +2 but not +A12:38
flaper87shardy: if you have time https://review.openstack.org/#/c/330659/12:39
*** dtantsur is now known as dtantsur|brb12:39
weshaypanda, any open issues on the undercloud neutron-db-manage failing that you know of?12:42
*** abregman has joined #tripleo12:42
weshaypanda, oh ya.. https://bugs.launchpad.net/tripleo/+bug/164157112:43
openstackLaunchpad bug 1641571 in tripleo "CI: master jobs fail on neutron-db-manage" [High,Confirmed]12:43
pandaweshay: yes12:43
weshayhttps://bugs.launchpad.net/networking-cisco/+bug/164131112:44
openstackLaunchpad bug 1641311 in networking-cisco "neutron-db-manage fails after https://review.openstack.org/#/c/394201/" [High,Confirmed]12:44
pandaweshay: it has a link to the already opened12:44
pandayes12:44
weshaypanda, was that just the overcloud?12:44
weshayI'm seeing the issue on the undercloud12:44
weshayon master12:44
pandaweshay: link ? today there's another promotion blocker that gives similar errors on the undercloud12:45
weshaysqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1045, u"Access denied for user 'neutron'@'192.168.24.1' (using password: YES)")12:45
gfidenteshardy I don't think we can *include* environment files from another right?12:45
weshaypanda, I'm working on the etherpad12:45
weshaylinks are ther12:45
weshaye12:45
pandaweshay: yeah saw that, I think this happens because all the previous keystone tasks fail, and no user neutron is created12:47
openstackgerritGiulio Fidente proposed openstack-infra/tripleo-ci: Use HCI Ceph in HA job  https://review.openstack.org/33808812:48
weshaypanda, ah I see /me scrolling up12:48
*** fultonj has joined #tripleo12:48
weshaypanda, ya.. from 2016-11-18 03:27:20,039 INFO: Error: /Stage[main]/Neutron::Keystone::Auth/Keystone::Resource::Service_identity[neutron]/Keystone_user[neutron]: Could not evaluate: Execution of '/bin/openstack domain list --quiet --format csv' returned 1: __init__() got an unexpected keyword argument 'project_domain_id' (tried 24, for a total of 170 seconds)12:48
pandaweshay: yes, caused by https://bugs.launchpad.net/os-client-config/+bug/1642897, that too is in the etherpad12:49
openstackLaunchpad bug 1642897 in os-client-config "osc commands fail when using admin_token plugin in keystoneauth1" [Undecided,New]12:49
pandaamoralej looks at the weirdo results that finish at 4am, and by the time I discover the same failures at 10:30 when periodic jobs finish, he's usually already done the root cause analysis :)12:51
mariosshardy: sorry, i was afk and only just remembered the ping from you last night, getting setup and will read back12:51
amoralejwe've been hitting it since yesterday in rdo-ci panda12:52
pandaamoralej: ah, ok. I think this is good, rdo ci catching things a lot earlier.12:55
openstackgerritJohn Trowbridge proposed openstack/tripleo-puppet-elements: Add qpid-dispatch-router to overcloud-controller element  https://review.openstack.org/37348912:55
weshaypanda, amoralej ya.. that's why we have it run every 4 hours12:55
mariosshardy: so did it answer your question (ie. yes in the current workflow the controlplane comes down, pcs cluster etc)12:55
weshayI don't have access to mark this as critical12:55
amoralejyeap it does its work12:55
weshayfor tripleo12:56
openstackgerritDerek Higgins proposed openstack/tripleo-heat-templates: Nothing to see here  https://review.openstack.org/39958312:56
amoraleji'm waiting for dtroyer to be online12:56
weshaypanda, amoralej we need this escalated via the process in mojo12:56
amoralejbut we may need to pin os-client-config back12:56
amoraleji was waiting to see if upstream awareness makes it work12:56
*** abregman is now known as abregman|afk12:57
amoralejand people seems not to be online yet12:57
marios14:02 < shardy> bandini: Ok, I'm interested in making this work for the non pacemaker case, so I assume it will be 1. stop haproxy + keepalived, 2. stop all services, 3.  yum update, 4. start db etc, 5. db sync, 6. start services12:57
mariosshardy: sounds like the current workflow ^12:57
amoralejdo you think we should escalate it already weshay?12:57
weshayamoralej, yes.. because master hasn't passed in 2 weeks now12:57
weshayamoralej, ocata1 is now ish12:57
amoralejyea, we are chaining issues...12:58
amoraleji will send the mail12:58
weshayamoralej, https://dashboards.rdoproject.org/rdo-dev12:58
amoraleji know, i know, don't make me look at those reds...12:58
weshayamoralej, panda if that said 3-5 days vs. 14d it wouldn't be as critical12:58
pandaweshay: ocata-1 was released yesterday12:58
weshaybut we also have to get a passing job for an upcoming test day12:58
weshaypanda, ya.. but rdo hasn't released ocata-112:59
weshaybecause we don't have a build12:59
*** pkovar has joined #tripleo13:02
*** pgadiya has quit IRC13:02
pandaamoralej: mind if I send the escalation ?13:02
amoralejok, no problem, i was writing it but if you have it, send it13:03
*** jkilpatr has quit IRC13:03
*** dsneddon is now known as dsneddon_afk13:04
jpichThe backport at https://review.openstack.org/#/c/396523/ is green and has several +2s, if someone would like to give it the missing +A13:06
*** morazi has joined #tripleo13:09
*** jeckersb is now known as jeckersb_gone13:10
*** ooolpbot has joined #tripleo13:11
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION13:11
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242913:11
*** ooolpbot has quit IRC13:11
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]13:11
pandaweshay: amoralej there's already a proposed patch: https://review.openstack.org/#/c/39891713:13
*** jayg|g0n3 is now known as jayg13:13
amoralejyeah, i've seen it13:14
*** jkilpatr has joined #tripleo13:14
*** trown|outtypewww is now known as trown13:15
amoralejwe need someone from the osc team to look at it13:16
mhenkelhi All, quick question on enabling neutron dhcp for internal_api and management networks:13:17
*** prateek has joined #tripleo13:17
mhenkelis it sufficient to set InternalApiNetEnableDHCP and ManagementNetEnableDHCP to true?13:17
mhenkelI did that in my env file but my neutron networks are created without dhcp13:18
mhenkelis there anything else I need to do?13:18
jkilpatrso is overcloud prep config now required for overcloud deployments but has not been added to the default extras file in quickstart?13:21
jkilpatrfun13:21
pandajkilpatr: ?13:23
*** pradk has joined #tripleo13:24
jkilpatrpanda, just chasing down an error it looks like the network-environmental templating task has been moved to a new role that's not landed yet in some places.13:24
pandajkilpatr: I see it in quickstart-extras-requirements.txt13:24
pandajkilpatr: you have a link to the error ?13:24
*** flepied has quit IRC13:25
jkilpatrpanda, it just says the role isn't there maybe i should check more carefully then.13:25
pandaweshay: amoralej escalation sent13:26
jkilpatrpanda, ok I think I see the issue, I was trying to add it to my playbook but the example on the github has the full role name instead of the shortened name13:27
pandajkilpatr: ok, beware that from monday (probably) all roles will lose the ansible-role-tripleo- prefix13:28
*** rain has joined #tripleo13:30
*** rain is now known as leanderthal13:30
*** lucas-hungry is now known as lucasagomes13:31
*** jkilpatr has quit IRC13:31
*** rlandy has joined #tripleo13:33
*** jbadiapa has quit IRC13:34
shardyjaosorior: was there a bug raised ref https://review.openstack.org/#/c/398127/ and https://review.openstack.org/#/c/398128 ?13:35
shardyI thought there was one but it's not linked from the patches13:35
*** ccamacho|lunch is now known as ccamacho13:36
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Add verify required and CA bundle to haproxy  https://review.openstack.org/39959113:40
*** fultonj has quit IRC13:40
*** flepied has joined #tripleo13:41
*** fultonj has joined #tripleo13:42
*** jaosorior has quit IRC13:42
*** amoralej is now known as amoralej|lunch13:42
*** jbadiapa has joined #tripleo13:43
*** Vijayendra_ has joined #tripleo13:46
openstackgerritSteven Hardy proposed openstack/puppet-tripleo: Remove explicit hiera calls for heat in keystone profile  https://review.openstack.org/39959513:47
*** jpena is now known as jpena|lunch13:47
openstackgerritSteven Hardy proposed openstack/puppet-tripleo: Remove explicit hiera calls for heat in keystone profile  https://review.openstack.org/39959513:47
*** dtantsur|brb is now known as dtantsur13:48
*** Vijayendra has quit IRC13:48
*** abregman|afk is now known as abregman13:49
openstackgerritSteven Hardy proposed openstack/tripleo-heat-templates: Use keystone profile parameter to pass heat password  https://review.openstack.org/39959913:51
openstackgerritSteven Hardy proposed openstack/puppet-tripleo: Remove explicit hiera calls for heat in keystone profile  https://review.openstack.org/39959513:51
shardyslagle, jaosorior: ^^13:51
matbushardy: ho i wonder, i figured out the badstatusline thing on my env. I discovered during the upgrade of the undercloud to newton release, puppet was setting all the workers of the UC services to 213:53
matbushardy: which was really low sized, so i override this value with a bigger one13:53
hewbroccamatbu: wait, is that the bug that's been blocking CI?13:54
shardymatbu: interesting - it should be the number of cores in most cases I think13:54
shardyamoralej|lunch: ^^13:54
hewbroccaI thought we pushed it down to fix undercloud memory issues13:55
*** lblanchard has joined #tripleo13:55
trownya increasing workers will increase memory pressure for sure13:55
*** nyechiel has joined #tripleo13:55
matbuhewbrocca: yep, but the "badstatusline" is really generic python error13:56
hewbroccaRight13:56
matbuhewbrocca: so maybe there were several different errors behind it13:56
shardyhewbrocca: we did tune a few things down a bit, but for the undercloud heat, two workers is not enough given the hundreds of stacks we're throwing at it13:56
hewbroccano clearly not13:56
hewbroccano wonder the API response is so slow...13:56
matbushardy: ack, i wasn't sure of the expected behavior, the number of cores should have been 8 .. i set it to 6 and it worked fine13:56
hewbroccarook: you around?13:57
shardyyeah, we shouldn't have gone that low, especially for heat - we put a minimum of 4 in the heat codebase a while back because of these sorts of issues13:57
shardybut if you pass an explicit 2 we'll respect it13:57
shardyso yeah, my local box only has two heat-engine workers13:57
shardynum_engine_workers = 213:58
shardythat'll do it13:58
matbushardy: but i think it depend also of the box itself. i have a local box with a ssd and it's pretty fast , even with a low size of workers13:58
trownhmm so we need to patch instack-undercloud hiera for heat workers?13:58
shardywell it appears someone already has, sec13:58
shardythe default should be the number of cores, or 4, whichever is more13:58
trownya, I have never been able to reproduce this issue on my dev env with ssd13:58
hewbroccaAt some point we thought it was OK to make it 213:59
trownwell, we made it 2 or (total CPU)/2 for all services in puppet14:00
hewbroccaoh dear14:00
trownI think that is still fine, but for heat on the undercloud seems we need to override that14:00
shardyhttps://review.openstack.org/#/q/I9ed855648e23b0a7e452e6a840a92779fa3f6d4814:00
shardyOk so we'll need to revisit that I think14:00
hewbroccaSounds that way14:00
matbuhewbrocca: when i hit memory issues on the undercloud in upgrade, i used to set "number of core / 2 == heat_engine_workers"14:00
openstackgerritChristian Schwede proposed openstack/tripleo-heat-templates: Make Ceilometer notifications non-blocking  https://review.openstack.org/39198514:00
hewbroccaI'm all about saving memory on the overcloud nodes14:01
hewbroccabut frankly14:01
hewbroccaI think we should test with as big an undercloud as we need14:01
hewbroccaand no bigger14:01
hewbrocca:)14:01
trownya that is not a static quantifiable thing though :P14:01
hewbroccaIf that means we need an undercloud with 32GB RAM to get a proper test14:01
hewbroccafine14:01
matbu"The os_workers fact will be 2 for unless the cpu count is greater than 8 with an incremental increase of 1 worker for every 4 processors until 32 processors."14:02
*** abregman has quit IRC14:03
trownI wonder if just giving undercloud 8 vcpus in CI would work14:03
trownthe os_workers default feels sane for production14:03
trownso long as we say 8 cores is minimum for undercloud14:04
shardytrown: amoralej|lunch said he'd also hit this with an 8vcpu env, so I think the main issue is the bottleneck of workers for some services we hit hard14:04
*** artom has quit IRC14:04
shardyIME 4vcpus is fine provided you also have 4 heat-engine workers14:04
shardymwhahaha: Hey, we're discussing os_workers14:05
mwhahaha?14:06
shardymwhahaha: turns out vcpus/2 doesn't work for some services, particularly heat where we probably want number of CPUs up to some limit, perhaps 814:06
trownhmm, ya I guess uniform workers maybe doesnt work for undercloud14:06
matbushardy: trown maybe we should just document this by saying : if you have 8 cores or less on this UC, you should override the workers to 4 or 614:06
mwhahahaIt's tunable14:06
*** ctayal has joined #tripleo14:06
shardymwhahaha: in the past we hit issues in CI due to the bottleneck of throwing huge stacks at heat with small number of workers, and https://review.openstack.org/#/c/387523 took us from a minimum of 4 to 214:06
shardymwhahaha: cool, that was my question  :)14:07
*** ctayal has quit IRC14:07
shardye.g can we tune this, or should we just go back to the heat calculated default14:07
mwhahahaTune14:07
mwhahahaThe defaults are bad14:07
*** ctayal has joined #tripleo14:07
mwhahahaThe number of CPUs is not a good default14:07
*** tiswanso has joined #tripleo14:07
mwhahahaYou could osworkers *2 for heat14:08
trownthat is a good idea14:08
shardymwhahaha: for heat, we want the number of CPUs up to a limit of 8, but with a minimum of 4 - what's the cleanest way to do that?14:08
shardyyeah, we can't do that in the hiera interpolation tho, can we?14:08
mwhahahaNot sure I'd have to test it14:08
mwhahahaI'm on pto today so I'm not at a computer at the moment14:09
shardymwhahaha: ack, OK we'll figure it out14:09
shardythanks14:09
openstackgerritwes hayutin proposed openstack/tripleo-quickstart: [WIP] first pass at composable roles, this is just the config  https://review.openstack.org/39960914:09
mwhahahaOs workers is num CPUs / 414:09
*** ooolpbot has joined #tripleo14:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION14:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242914:10
*** ooolpbot has quit IRC14:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]14:10
mwhahahaWith a cap of 8 so if you * 2 you should get some decent defaults. The issue is for a CPU count of <414:10
shardyOk for now I'm going to stop us setting heat::engine::num_engine_workers: "%{::os_workers}"14:10
shardythat's a quick fix for CI14:10
shardythen we can work out the capping afterwards14:10
hewbroccashardy: +114:10
mwhahahaWhy not just override it in the CI environment14:11
matbushardy: we can just override the value an extra hieradata file14:11
mwhahahaYea that14:11
shardymwhahaha: because it's also hitting folks in the development environments14:11
matbuwith hieradata_override =14:11
matbuin undercloud.conf14:11
shardy(and we've got three different CI systems all breaking with this)14:12
mwhahahaSo if you increase it you're going to hit Mem usage issues14:12
mwhahahaSo you need to figure out which is worse14:12
mwhahahaEspecially cause heat is the #1 Mem user14:13
rookhewbrocca: sup?14:13
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: Only start the deploy if the Heat stack isn't already in progress  https://review.openstack.org/39895914:13
shardymwhahaha: Yeah, we've got to find a balance for sure14:13
mwhahahaalternatively drop the workers logic into puppet-tripleo14:14
shardybut we know 4 is a good minimum for heat in this environment based on previous issues14:14
hewbroccarook: just curious -- on your OSP10 scale testing14:14
hewbroccaWhat do you set Heat workers to on the undercloud?14:14
shardymwhahaha: yeah, the problem is the undercloud tho, and we're not yet using puppet-tripleo there14:15
*** abregman has joined #tripleo14:15
trownI kind of like the hiera override as the temp solution... we only need to do that in tripleo-ci and tripleo-quickstart, both of which already have a default hiera override file14:15
shardyI can wire it in via puppet-stack-config.pp tho14:15
trownand are not branchless14:15
trownotherwise we have to make an instack-undercloud change and backport it... then if we fix in some better way later we would have to again fix and backport?14:16
trowns/and are not branchless/and ARE branchless/14:16
gfidenterook :)14:16
mwhahaha%{scope('::os_workers') * 2}14:17
*** hjensas has quit IRC14:17
mwhahahaShould work by the way14:17
openstackgerrityolanda.robla proposed openstack-infra/tripleo-ci: Enable consuming packages for a feature branch  https://review.openstack.org/39956214:17
trownmwhahaha: thanks. shardy, I can submit that ^14:18
mwhahahaYou would get a min of 4 all the time with some sliding scale up to 1614:18
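The worker-count options discussed above can be compared with a short sketch. It assumes the os_workers rule is "CPUs / 4, minimum 2, capped at 8" as described in the chat, and it assumes integer division, which the chat does not specify. The function names are hypothetical and exist only for this illustration; they are not the puppet fact or any heat API.

```python
def os_workers(cpus: int) -> int:
    """The os_workers rule as described in the chat: cpus / 4, minimum 2, cap 8.

    Integer division is an assumption made for this sketch.
    """
    return min(max(cpus // 4, 2), 8)


def doubled_os_workers(cpus: int) -> int:
    """mwhahaha's suggestion for heat: os_workers * 2 (ranges from 4 to 16)."""
    return os_workers(cpus) * 2


def clamped_cpu_count(cpus: int, low: int = 4, high: int = 8) -> int:
    """shardy's stated goal for heat: the CPU count, at least 4, at most 8."""
    return min(max(cpus, low), high)


if __name__ == "__main__":
    print("cpus  os_workers  os_workers*2  clamp(cpus,4,8)")
    for cpus in (2, 4, 8, 16, 32, 64):
        print(
            f"{cpus:>4}  {os_workers(cpus):>10}  "
            f"{doubled_os_workers(cpus):>12}  {clamped_cpu_count(cpus):>15}"
        )
```

Under these assumptions an 8 vCPU undercloud gets 2 workers from os_workers alone, 4 from the doubled value, and 8 from the clamped CPU count, which is the trade-off between heat throughput and memory use being debated.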
*** Goneri has joined #tripleo14:18
shardymwhahaha, trown: OK sounds good, go for it :)14:19
shardythanks for the help mwhahaha14:19
trownshardy: we only need to do that for engine workers and not api right?14:19
trownya thanks mwhahaha... now go back to PTO :)14:19
shardytrown: Yeah I'd start with the engine workers14:19
*** cdearborn has joined #tripleo14:20
*** dsariel has joined #tripleo14:20
gfidenteshardy kind of related note14:22
gfidenteif we batch create of ::Server, then ::ResourceGroup will still wait for all members to come up before starting, right?14:23
openstackgerritMerged openstack/instack-undercloud: correctly spell yaql_limit_iterators  https://review.openstack.org/39839614:23
openstackgerritMerged openstack/os-net-config: Stop dhclient in os-net-config if interface not set for DHCP  https://review.openstack.org/39849814:23
beagles\o/ ^^14:23
rookhewbrocca default 1:1 cpu:worker14:23
rookgfidente: ? :)14:24
jristflaper87: of course I can look but I think you mean jistr :)14:24
gfidenterook not sure if you saw the email from Ben but sounds like we're going to reprise that, probably together14:24
gfidenteor at least /me would like to14:24
jristmatbu: I was playing destiny last night and there was someone named matbu so I thought it was you14:25
rookreprise what gfidente ? (i haven't seen it, still on leave, but lurking)14:25
gfidenterook but I think if you could reply adding some details that would be cool14:25
gfidenterook ah didn't know you were out, sorry14:25
rookhewbrocca shardy is the discussion to reduce heat workers?14:26
matbujrist: hehe nop, it wasn't me :)14:26
hewbroccarook: the other way14:26
derekhshardy: the 40 node deployment appears to be stalled, I'm seeing plenty of errors in signals being sent to heat-api-cfn and I think 2 compute nodes think they have nothing to do but heat is waiting for a signal14:26
derekhshardy: http://chunk.io/f/81a87c10941745de8a7df8c9ce3a397b14:27
hewbroccarook: looks like we have been setting Heat workers too low in CI which is causing performance problems14:27
shardyderekh: ack, ref discussion above, can you please check num_engine_workers on the undercloud?14:27
shardyderekh: and how many cpus does the undercloud VM have?14:27
derekhshardy: num_engine_workers = 214:28
*** jeckersb_gone is now known as jeckersb14:28
derekhshardy: 8xvCPU14:28
rook#num_engine_workers = <None>14:28
*** nyechiel has quit IRC14:29
*** ccamacho has quit IRC14:29
rookoh, this must be quickstart?14:29
shardyderekh: ack - OK that may be part of the problem - I'd suggest increasing that to num_engine_workers = 4 at least (perhaps even 6 or 8)14:29
rookmine isn't quickstart.14:29
derekhshardy: http://paste.openstack.org/show/589718/14:29
shardyderekh: https://review.openstack.org/#/c/387523/ reduced the number of workers across the board, which did reduce memory usage, but I think is the cause of some of our performance problems14:29
derekhrook: I'm not trying this with quickstart either14:29
openstackgerritJohn Trowbridge proposed openstack/instack-undercloud: Increase the default number of workers for heat engine  https://review.openstack.org/39961914:30
rookhm, must be newer code than what I have14:30
rookah, i see the change shardy references... looking14:30
derekhshardy: ok, I'm gonna quick this attempt and start a new deploy14:30
*** ccamacho has joined #tripleo14:30
derekhshardy: nothing has happened in 40 minutes14:30
*** derekjhyang has quit IRC14:31
derekh*quit14:31
trownderekh: https://review.openstack.org/399619 is the patch to change default to #CPU for heat engine rather than #CPU/214:31
trownthough... with 8vCPU undercloud... shouldnt engine workers be set to 4 with os_workers?14:31
shardyderekh: Ok, there may be other issues, but I'm sure that engine count won't work for such a large deployment14:31
shardytrown: no, it's cpus/414:32
openstackgerritmathieu bultel proposed openstack-infra/tripleo-ci: Implement overcloud upgrade job - Mitaka -> Newton  https://review.openstack.org/32375014:32
shardy"(number of cpus/4) or 2 but is capped at 8"14:32
derekhya, mine is set to cpu/414:32
rookhow did we determine 8 was a sweet spot?14:32
trownah ok... my commit message is slightly wrong then14:33
*** numans has quit IRC14:33
* rook wonders if we consulted with each of the teams (storage,networking,etc) to come up with 814:33
shardyrook: I don't think we did14:33
* mwhahaha points to a mailing list message with no feedback14:33
shardyrook:  a while back we hit RPC timeouts using 2 heat workers, and at that time increasing the minimum to 4 in heat fixed it14:33
rookok... I know for a fact that < workers for neutron == less performance.14:34
mwhahahaso this was tested with other services and this is just the default, it's tunable. but the goal was not to use $::processorcount anymore14:34
rookglance might not be a huge hit... Swift -- my guess would be a hit.14:34
mwhahahadue to the fact that on baremetal that's terrible14:34
shardymwhahaha: yup, understood14:34
openstackgerritJohn Trowbridge proposed openstack/instack-undercloud: Increase the default number of workers for heat engine  https://review.openstack.org/39961914:34
rookmwhahaha so, not just with the UC, but OC this is changing?14:34
gfidenterook shardy though from multiple parties I hear that batching is good practice14:34
mwhahahaso it's the default in the puppet modules14:34
shardymwhahaha: it's just that we've been burned by this before - we know that certain services on the undercloud are hit very hard, so we can't tune them down too far14:35
rookcorrect mwhahaha14:35
gfidenteand was experimenting with different places where it could be used14:35
mwhahahawhich THT controls14:35
rookit isn't just processes, but the # of open DB connections.14:35
mwhahahaon the UC we lowered it because of memory issues since we have 17 different services running14:35
rooksure14:35
mwhahahaso using 17 * $::processorcount = bad time14:35
rooki mean, look at the IO we see on the UC14:36
rookwhat are you going to do about that?14:36
trownI think the only services that are hit hard on UC are swift and heat though14:36
trownand swift is only in a couple bursts14:36
rookwell the bursts can be quite large.14:36
rookie 40 node deployment at once.14:37
mwhahahathe only thing we tune for swift is proxy14:37
shardyYeah, I think just special-casing the new defaults for heat and perhaps swift should work OK14:37
trownfor neutron it is really just an IP manager on UC, so the lower default is good14:37
rooktrown I agree -- we only spawn 10 at a time, so interface creation should be low.14:38
mwhahahaso for ref, https://review.openstack.org/#/c/386696/ that's what we tuned down14:38
mwhahahawe don't tune swift server, only proxy14:38
rookmwhahaha: i _think_ the proxy is the biggest hitter though -- i could be wrong though, I have only played a little bit with object-stores14:39
* derekh kicks off a new 40 node deployment14:39
rookderekh 3 controllers, 40 compute?14:39
mwhahahai would assume swift server is the IO14:40
rookmwhahaha yup, that & provisioning14:40
derekhrook: yup14:40
rookderekh so, I had to bump timeouts with previous newton releases.14:40
mwhahahaso we could tune swift server, but we dont and the upstream module has 1 as the default and i think that's for $reasons14:40
rookah derekh initial deployment is good, nm14:41
rookderekh: https://gist.github.com/jtaleric/4f422ccbf89d7c413d68e0d3cdbaabbf14:41
mwhahahahttps://github.com/openstack/puppet-swift/blob/master/manifests/storage/server.pp#L5414:41
derekhrook: I'm deploying with "-t 480 " , were there others you had to increase?14:41
rookderekh: solid.14:42
*** amoralej|lunch is now known as amoralej14:42
*** bnemec has joined #tripleo14:42
*** abehl has quit IRC14:42
derekhrook: btw this is all on virt (using OVB) so not exactly replicating a production deployment14:44
derekhbut I think worth looking at at least14:44
rookderekh have the steps to do OVB work?14:47
rookderekh: I would like to compare : real deployments to ovb.14:47
trownbnemec: was just about to +A https://review.openstack.org/#/c/399146/ for the same reason :)14:47
derekhrook: not sure what your question meant14:48
openstackgerritFlorian Fuchs proposed openstack/tripleo-ui: Adds basic internationalization support  https://review.openstack.org/39962614:48
rookderekh I think there is more overhead with real deployments (ie, IPMI flakeyness) -- so using OVB might be good for consistency.14:48
*** bnemec is now known as beekneemech-semi14:48
*** beekneemech-semi is now known as bnemec-semi-here14:49
rookderekh so you are using OVB for your deployment, do you have a document/etherpad/napkin-with-nodes on how to set that up.14:49
* bnemec-semi-here needs to document --quintupleo for ovb14:50
derekhrook: possibly , we're still doing ipmi but not against hardware with potential problems, and no POST time so that's a plus14:50
rookoh trust me.. IPMI on Dell vs Supermicro is night and day14:50
derekhrook: you first need access to a OVB cloud, then follow the steps in the OVB repo https://github.com/cybertron/openstack-virtual-baremetal14:51
rookwell, how do i create said ovb cloud14:51
rookderekh lets say I want to create one in the scale lab.14:51
*** jpena|lunch is now known as jpena14:51
*** rook is now known as rook-baby14:51
*** charliejllewelly has joined #tripleo14:51
derekhrook: I've also put together instructions of using RH2 here for some developers that use it https://etherpad.openstack.org/p/tripleo-devenvs14:52
*** abregman is now known as abregman|afk14:52
derekhrook-baby: the README in that repo gives details on how to set it up14:52
*** chlong has joined #tripleo14:52
jristflorianf: : could I pull down your patch and have some strings i18n?14:53
derekhrook-baby: I'll send you the etherpad with the notes on how we setup RH1 and RH214:53
florianfjrist: It's still WIP. But if you want to review/test, type localStorage.setItem('language', 'fr') in your console and reload the page. some strings will be marked with 'FR'14:55
openstackgerritHarry Rybacki proposed openstack/tripleo-quickstart: Add tuned check to remote provision role  https://review.openstack.org/39636214:55
jrist!!!14:55
openstackjrist: Error: "!!" is not a valid command.14:55
dtantsurfolks, do we always use swift as glance backend in undercloud (similar question for overcloud)?14:55
jristflorianf: cool14:55
jristflorianf: gonna check it out shortly14:55
trowndtantsur: ya, except in the overcloud on some multinode scenario jobs where we dont deploy swift14:56
jpichflorianf: How goes getting the dependency rpm updated? Or will it work with the older version of the library for now?14:56
dtantsurtrown, ack, thanks. some ironic drivers imply glance backed by swift, hence my question.14:56
florianfjpich: It does work with the older version. I thought it might be a good idea to update the dependency as part of a more general review of all our current dependency versions.14:57
*** panda is now known as panda|bbl14:58
jpichflorianf: Cool! Review sounds good to me, though isn't that a massive task??14:58
amoralejshardy, trown, so tuning heat workers may help in the badstatusline issue14:58
trownamoralej: did you try it out?14:59
hewbroccaamoralej: so we think14:59
amoralejyeah, make sense14:59
amoralejafter i hit the error yesterday in my test machine14:59
florianfjpich: probably. but can we get around it? we have our deps pretty tightly managed, so at some point we need to check for useful updates, I guess...14:59
amoraleji've been running it in a loop14:59
amoralejabout 8-10 times15:00
amoralejwith haproxy and openstack overcloud deploy running in debug and didn't hit again15:00
openstackgerritMerged openstack/tripleo-quickstart: Add tuned check to remote provision role  https://review.openstack.org/39636215:00
amoralejwith the same server and undercloud image15:00
openstackgerritMerged openstack/tripleo-quickstart: Pass the libvirt_uri to the pool-define command  https://review.openstack.org/39914115:00
trownamoralej: I put up https://review.openstack.org/399619 to increase heat engine workers15:01
jpichflorianf: Among other things, yeah...15:01
jpichflorianf: Let's open a bug to track this, if it can/should be done outside of the blueprint?15:01
amoralejso in a 8vcpus system we'd move from 2 to 4 workers?15:02
trownamoralej: also, I think https://review.openstack.org/#/c/396362/ will make it so we get similar performance in all CI nodes rather than 3/4 being much slower15:02
trownamoralej: yep15:02
openstackgerritMerged openstack/tripleo-puppet-elements: Add qpid-dispatch-router to overcloud-controller element  https://review.openstack.org/37348915:02
openstackgerritMerged openstack/instack-undercloud: Increase Mistral Task Size limit  https://review.openstack.org/39652315:02
shardyamoralej: yep, minimum of 4, possibly more for large deployments and/or if you have a lot of ram15:02
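The sizing rule discussed above can be sketched roughly as follows. This is only an illustration, not the content of review 399619: the conversation confirms just two data points (an 8 vCPU system going to 4 workers, and a minimum of 4), so the "half the vCPUs" rule and the function name `heat_engine_workers` are assumptions.

```python
def heat_engine_workers(vcpus: int, floor: int = 4) -> int:
    """Hypothetical sizing rule: half the vCPUs, never fewer than `floor`.

    Only the floor of 4 and the "8 vCPUs -> 4 workers" case come from the
    discussion; the vcpus // 2 scaling is an assumption for illustration.
    """
    if vcpus < 1:
        raise ValueError("vcpus must be a positive integer")
    return max(floor, vcpus // 2)


if __name__ == "__main__":
    for vcpus in (2, 8, 16):
        print(vcpus, "vCPUs ->", heat_engine_workers(vcpus), "workers")
```

A floor like this trades memory for responsiveness on small nodes, which is the compromise raised later in the channel.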
florianfjpich: good idea. yeah, outside the blueprint sounds fine to me. It's really not needed to implement what we need.15:02
jpichflorianf: Cool :)15:04
amoralejtrown, i think the tuned profile will help also, yes15:04
flaper87my quickstart run is failing to boot the undercloud because the pool disk is owned by root: http://paste.openstack.org/show/589722/ has this happened to other folks?15:08
openstackgerritMerged openstack/tripleo-quickstart: Add blockstorage to default node flavor  https://review.openstack.org/39657715:08
flaper87I'm trying to get quickstart to install tripleo on f2415:08
flaper87I know it's not supported but I'm working on that as I try this15:08
*** ooolpbot has joined #tripleo15:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION15:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242915:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]15:10
*** ooolpbot has quit IRC15:10
shardyflaper87: https://review.openstack.org/#/c/384892/ looks related?15:10
flaper87yeah, might be15:11
trownflaper87: hmm, must be something different with fedora... my env has the pool owned by stack user15:12
trownflaper87: did you run quickstart.sh as root user?15:13
*** ctayal_ has joined #tripleo15:13
flaper87trown: no, it seems to be related to https://review.openstack.org/#/c/384892/15:14
trownhmm... actually that should not matter for this issue I think15:14
flaper87trown: :D15:14
dtrainordciabrin, sure, i'll be doing another deployment shortly15:15
*** ctayal has quit IRC15:15
trownflaper87: I wonder if virsh is behaving differently on fedora vs centos when volumes are created15:18
trownflaper87: this task in particular seems like it would create the volume as non-root user if run as non-root user15:19
trownhttps://github.com/openstack/tripleo-quickstart/blob/master/roles/libvirt/setup/overcloud/tasks/main.yml#L70-L7815:19
flaper87trown: trying something out but that might actually be the case15:19
flaper87trown: just added become/become_user to these two https://github.com/openstack/tripleo-quickstart/blob/master/roles/libvirt/setup/undercloud/tasks/main.yml#L146-L16415:20
flaper87let's see if that has any effect15:20
flaper87otherwise, I'd say it's virsh's fault and need to figure out what it's doing differently15:20
*** abregman|afk has quit IRC15:22
trownflaper87: ya checking on a fedora box if the pool gets created differently with the same xml15:23
*** prateek has quit IRC15:23
flaper87trown: adding become did nothing15:25
*** prateek has joined #tripleo15:26
*** jlinkes has joined #tripleo15:27
openstackgerritwes hayutin proposed openstack/tripleo-common: Add python-memcached to agent container.  https://review.openstack.org/39858215:29
*** abregman has joined #tripleo15:29
trownflaper87: hmm... just manually running the pool creation steps as a non-root user on F23 resulted in pool directory and volumes being owned by that non-root user15:29
*** prateek has quit IRC15:30
flaper87T_T15:32
*** andrey-mp has joined #tripleo15:34
*** jcoufal has joined #tripleo15:34
andrey-mpHi! I'm trying to deploy Newton with tripleo to CentOS7. But my deployment gets stuck on os-collect-config. It tries to run something that uses the 'lsb_release' tool, but this tool is absent. When I install 'redhat-lsb-core', os-collect-config finishes and the deploy goes further.15:37
andrey-mpWhat is my mistake here?15:37
*** rook-baby is now known as rook15:37
pradkcan i get some reviews on https://review.openstack.org/#/c/396439/ and https://review.openstack.org/#/c/396435/15:38
andrey-mpHow can I build an overcloud image with this tool?15:38
rookderekh++15:38
pradkplease*15:38
shardyandrey-mp: how did you build your image?  My local overcloud image contains that package15:39
andrey-mpshardy: with this instruction - http://docs.openstack.org/developer/tripleo-docs/basic_deployment/basic_deployment_cli.html15:40
andrey-mpshardy: code is here - https://github.com/cloudscaling/redhat-kvm/blob/master/__undercloud-install-2-as-stack-user.sh#L3715:40
*** markmc` is now known as markmc15:43
rookshardy: mwhahaha hewbrocca so the thought is to increase # of workers because of failed deployments? -- sorry, trying to catch up on the convo.15:43
rookshardy mwhahaha hewbrocca I thought we were hitting mem issues.15:43
rookwith too many workers15:43
*** panda|bbl is now known as panda15:44
shardyrook: yeah, but now we're hitting response issues because of too few workers ;)15:44
*** abregman is now known as abregman|afk15:44
shardycan't satisfy both constraints unfortunately15:44
*** hoobaman has quit IRC15:44
openstackgerritMerged openstack/tripleo-common: Fernet Key management  https://review.openstack.org/39738115:44
shardyrook: the compromise is probably to increase at least the heat-engine workers, despite the increase in memory usage15:45
shardyrook: FWIW the heat memory usage issues which IIRC prompted this were largely resolved late in newton15:45
shardyhttp://people.redhat.com/~shardy/heat/plots/heat_before_after_end_newton.png15:46
pandaflaper87: if you add --teardown all, does it change anything ? It's not the first time I hear of this problem, and this usually works around it, until we find some peace to deal with it properly.15:46
rookshardy: nice -- have those fixes landed in RDO?15:47
shardyrook: should have, they're all in stable/newton AFAIK15:47
rookshardy we just need to keep an eye on this.15:47
mwhahaharook: it might be ok if we just up heat only. the total num procs was an issue15:47
*** pradk has quit IRC15:47
rookmwhahaha: it is the # of available workers to service requests.15:48
rooksince we are hitting timeouts15:48
rookthe other possibility is to increase the RPC timeout?15:48
*** [1]cdearborn has joined #tripleo15:48
shardyI think we already increased it to a very high value15:49
shardythe problem this time was RPC didn't time out, so haproxy did15:49
rookah ha15:49
rookok15:49
rookwtf15:49
rookHAProxy on the UC15:49
shardyincreasing timeouts isn't a good solution IMO15:50
rooki agree15:50
mwhahahayea i was referring to the history behind the original worker decrease15:50
rookbut trying to tune workers without data across the board is abd.15:50
rookbad15:50
*** pcaruana has quit IRC15:50
*** HenryG has quit IRC15:50
mwhahahawe had data15:50
rookfor more than just heat mwhahaha ?15:50
mwhahahayea15:50
rookbecause you are tuning across the board.15:50
rookalrighty15:50
mwhahahafrom puppet and fuel15:51
rooki came into the convo late15:51
mwhahahathis is a many release thing15:51
*** HenryG has joined #tripleo15:51
mwhahahathe openstack defaults of proc count only works for small vms, aka devstack15:51
openstackgerrityolanda.robla proposed openstack/tripleo-quickstart: Create directories with root  https://review.openstack.org/38489215:52
mwhahahawe've also done rounds of perf tests15:52
rooksure mwhahaha, this has bit us in the ass plenty of times.15:52
rookmwhahaha: yup, we have.15:52
rookmwhahaha we show that # of workers does increase response times with some services.15:52
rookbut if you have 96 cores.... you don't really need 96 workers.15:52
mwhahahaanyway, afk. if you want to chat further feel free to hit me up next week15:52
rookbut to limit to 8 might be too low.15:52
*** kjw3 has joined #tripleo15:53
*** pradk has joined #tripleo15:54
*** ctayal_ has quit IRC15:54
*** chandankumar has quit IRC15:55
*** ctayal has joined #tripleo15:55
rookshardy: so the change of # of workers is something pretty recent15:57
rookhm, 5 weeks ago15:57
openstackgerritwes hayutin proposed openstack/tripleo-quickstart: [WIP] config for containerized-compute  https://review.openstack.org/39334815:57
openstackgerritMerged openstack/puppet-tripleo: Replace hard-coded haproxy/keepalived coupling  https://review.openstack.org/39915215:59
openstackgerritMerged openstack-infra/tripleo-ci: Add worker config envs in toci_gate_test  https://review.openstack.org/39914615:59
rbradyhonza: I looked into supporting logging a bit15:59
bnemec-semi-hereandrey-mp: There was a bug in dib for newton that means lsb_release will be missing if you don't include ceph.  The workaround is to include the ceph repo.16:00
openstackgerritBen Nemec proposed openstack-infra/tripleo-ci: Restore worker configs for mitaka and below  https://review.openstack.org/39865216:00
*** jprovazn has quit IRC16:01
rbradyhonza: trying to use what's already in zaqar (e.g. custom pipeline stage or custom syslog notifier) was seen somewhat as abusing zaqar16:01
rbradyhonza: also, we'd run into the same problem with zaqar that we do for using mistral WRT permissions or logging.conf16:01
rbradyhonza: two other options left.  a logging service in tripleo or some sort of logging service in openstack itself we could use for tripleo16:02
*** cdearborn has quit IRC16:03
*** dsavineau has joined #tripleo16:05
*** ebarrera has quit IRC16:06
flaper87trown: what libvirt_uri did you use in your test?16:09
*** Guest90313 has quit IRC16:09
trownflaper87: qemu:///session16:10
*** ooolpbot has joined #tripleo16:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION16:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242916:10
*** ooolpbot has quit IRC16:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]16:10
flaper87trown: a-ha, could you try with qemu:///system ?16:10
*** rajinir has joined #tripleo16:10
derekhshardy: rook, keystone has been running along at 110%+ CPU for a while now and using its fair share of RAM (1.35G although it doesn't seem to be increasing any more)16:11
trownflaper87: hmm, then I would need to use sudo so I think it would be expected to have it owned by root16:11
derekh18436 keystone  20   0 2409372 1.353g   6988 S 115.2  8.7 240:46.82 keystone-admin  -DFOREGROUND16:11
trownflaper87: you are explicitly setting system in your quickstart config?16:12
flaper87trown: I did because of a different bug, ok, think I found my issue16:12
shardyderekh: ouch, that doesn't sound too good16:12
flaper87trown: the fix for that other bug landed already16:12
trownflaper87: ah that is maybe the patch I merged this morning :)16:12
shardyderekh: I noticed today that we're not enabling any cache backend for keystone16:12
shardynot sure if that could be related16:13
*** bana_k has joined #tripleo16:13
derekhshardy: maybe, logs suggest it's currently getting about (or getting through about) 3 token requests a second16:14
shardywow16:14
shardywell, at least you found something, hopefully we can figure out ways to make that a lot faster :)16:14
andrey-mpbnemec-semi-here: thanks!16:14
derekhshardy: and if I'm right these are connections to keystone queueing up?16:16
derekh[root@undercloud-scale httpd]# netstat -pn | grep -i 35357 | grep ESTABLISHED | grep http | wc16:16
derekh     46     322    464616:16
*** bana_k has quit IRC16:18
*** penick has joined #tripleo16:18
shardyderekh: looks like it16:19
*** jlinkes has quit IRC16:19
yolandahi derekh , can i get your review on https://review.openstack.org/399562 ?16:19
derekhshardy: or maybe ignore that netstat, I could be counting things twice16:19
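On the double-counting worry: when client and server run on the same host, each TCP connection can show up twice in netstat output, once with 35357 as the local port and once as the foreign port. A minimal sketch that counts only the server side is below. It assumes the usual Linux `netstat -pn` column order (proto, recv-q, send-q, local, foreign, state); the sample addresses and PIDs are made up for illustration.

```python
def count_established_to_port(netstat_output: str, port: int) -> int:
    """Count ESTABLISHED TCP sockets whose *local* address ends in :port."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, unix sockets and anything too short to parse.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, state = fields[3], fields[5]
        if state == "ESTABLISHED" and local.rsplit(":", 1)[-1] == str(port):
            count += 1
    return count


# Made-up sample: one connection seen from both ends, plus a TIME_WAIT socket.
SAMPLE = """\
tcp        0      0 192.0.2.1:35357     192.0.2.1:51000     ESTABLISHED 1234/httpd
tcp        0      0 192.0.2.1:51000     192.0.2.1:35357     ESTABLISHED 5678/python
tcp        0      0 192.0.2.1:35357     192.0.2.1:51002     TIME_WAIT   -
"""

if __name__ == "__main__":
    print(count_established_to_port(SAMPLE, 35357))
```

Matching on the local port rather than grepping for the number anywhere in the line is the design choice that avoids counting the client end of the same connection.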
shardyderekh: We pass the same token around to all the services, so they're all going to hit the DB to have the token authenticated16:19
*** ramishra has quit IRC16:19
shardywhich is going to be particularly bad if we're letting keystone hit the db every time without caching I guess16:20
*** bana_k has joined #tripleo16:20
shardyactually16:20
shardythe caching settings I'm referring to are for the overcloud keystone, but I guess the same applies16:20
*** ramishra has joined #tripleo16:21
derekhyolanda: looking16:24
*** achadha has joined #tripleo16:25
*** jpena is now known as jpena|brb16:27
openstackgerritRonelle Landy proposed openstack/tripleo-quickstart: Remove OVB stack cleanup dependance on network isolation type  https://review.openstack.org/39967816:30
*** tremble has quit IRC16:30
dtrainorThe guide on deploying an SSL Overcloud goes into pretty fine detail but omits some steps on finding some needed information.  One thing that I'm running into is needing a predictable IP for the IP-based SSL configuration.  Where is the pool of Public VIPs located?  http://docs.openstack.org/developer/tripleo-docs/advanced_deployment/ssl.html16:33
dtrainorPublic is not External, correct?16:33
dtrainorSpecifically, I'm looking for the value - or list of available values - I can use for PublicVirtualFixedIPs16:34
pandaderekh: in tripleo-ci, what's the part that actually creates the ovb stack ? testenv-client is launching a gearman worker, but I can't find the function the worker is using to create the stack16:34
openstackgerritDimitri Savineau proposed openstack/puppet-tripleo: Allow neutron_options customization for dashboard  https://review.openstack.org/39792716:34
derekhpanda: http://git.openstack.org/cgit/openstack-infra/tripleo-ci/tree/scripts/te-broker/start_workers.sh16:35
pandaderekh: ack, thanks16:35
derekhpanda: that starts X workers (but doesn't create the env), then when they are connected to, this script creates the envs http://git.openstack.org/cgit/openstack-infra/tripleo-ci/tree/scripts/te-broker/create-env16:36
yolandahi derekh, we still cannot test because we do not have the package available for that16:37
yolandaa new feature/v2 branch needs to be added to packaging16:38
*** larstobi has quit IRC16:40
hewbroccawild re keystone16:40
hewbroccashardy: it's almost like we need a cloud operator to look at our undercloud settings and tell us what to improve...16:40
hewbroccaI wonder where we could find one of those16:41
*** dsariel has quit IRC16:41
*** larstobi has joined #tripleo16:42
shardyhewbrocca: heh, +1, although there's no one-size answer I'm sure we can do better16:42
derekhyolanda: ok, should we wait for it? or is this needed first?16:43
yolandayep, we shall wait for it16:44
*** achadha has quit IRC16:44
hewbroccaMight be worth a mail to Graeme16:44
*** achadha has joined #tripleo16:44
shardyhewbrocca: yup - derekh do you want to drop him a mail with your findings and/or raise a bug with details?16:45
openstackgerritMartin André proposed openstack/tripleo-common: Update container images to point to newton  https://review.openstack.org/39968716:46
shardyI was planning to raise one re the cache thing but wanted to understand the options better first16:46
derekhshardy: will open a bug16:47
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: Only start the deploy if the Heat stack isn't already in progress  https://review.openstack.org/39895916:47
derekhshardy: looks like my deployment has stalled again, 12 minutes since anything output from the deploy command + keystone is no longer being hit as hard16:49
openstackgerritSteven Hardy proposed openstack/tripleo-heat-templates: WIP prototyping composable upgrades with Heat+Ansible  https://review.openstack.org/39344816:49
derekhshardy: and plenty of these http://paste.openstack.org/show/589737/16:50
shardyderekh: Hmm, is that with the increased heat engine workers?16:51
derekhshardy: yes16:51
derekhshardy: it's set to 4, confirmed 4 are running and the parent16:52
shardyhrm, looks like the heat requests are still timing out tho, you'll probably see an RPC timeout in the heat logs associated with that16:52
shardyderekh: actually, we're polling swift16:52
shardyso it may be more workers are needed there too16:53
shardybut I'd expect a different error in that case16:53
derekhshardy: want access to this box, /me is supposed to be doing other things, I can try the swift thing first though16:53
shardyderekh: sure, if you can pm me details I'll take a look16:53
derekh2 swift proxies running16:53
shardyOk, I'd guess that's not enough, because all 40 boxes will be polling swift tempurls16:54
hewbroccaLOL16:55
*** ccamacho has quit IRC16:58
*** rasca has quit IRC16:59
derekhshardy: should I bump up the swift proxy workers and try again?17:02
d0ugalThis is a silly question, but after using tripleo.sh --delorean-build, how are others installing the rpm?17:02
shardyderekh: Yeah, best I can tell it's an error hitting the tempurl from the request collector in os-collect-config17:03
derekhshardy: ok, will try it out17:03
shardyNot sure what a reasonable number is, I'd probably try 8 if there's enough ram17:04
*** ctayal has quit IRC17:05
*** jpena|brb is now known as jpena17:05
openstackgerritDimitri Savineau proposed openstack/puppet-tripleo: Allow neutron_options customization for dashboard  https://review.openstack.org/39792717:06
*** rickflare has quit IRC17:07
shardyd0ugal: there are a few ways, you can either virt-customize it into the image, use deploy artifacts to install it, put it in a local yum repo and have the nodes update from it17:07
derekhshardy: rook https://bugs.launchpad.net/tripleo/+bug/164300617:08
openstackLaunchpad bug 1643006 in tripleo "keystone maxed out during overcloud deploy" [Undecided,New]17:08
shardyd0ugal: or you can add a local repo to the undercloud yum.repos.d, and add that to OVERCLOUD_IMAGES_DIB_YUM_REPO_CONF when calling tripleo.sh --overcloud-images17:08
*** ooolpbot has joined #tripleo17:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION17:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242917:10
*** ooolpbot has quit IRC17:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]17:10
derekhshardy: bumping it to 817:10
d0ugalshardy: I want to install it on the undercloud - so I guess a local yum repo is what I want17:11
*** charliejllewelly has quit IRC17:11
* d0ugal finally goes and learns something about yum17:12
*** panda is now known as panda|bbl17:12
amoralejpanda, we've reverted os-client-config pin to 1.22.0 for the osc issue17:15
amoralejpanda|bbl ^17:15
amoralejwehay ^17:16
derekhshardy: the new deploy is running, feel free to check in on it later, I'll probably poke at it a bit over the weekend to see if I can find anything out17:16
amoraleji've launched the promotion pipeline in rdo, let's see17:16
panda|bblweshay: ^17:16
openstackgerritChristian Schwede proposed openstack/tripleo-quickstart: Rename objectstorage flavor to swift-storage  https://review.openstack.org/39970317:16
shardyderekh: ack will do, thanks!17:16
openstackgerritLucas Alvares Gomes proposed openstack/tripleo-quickstart: WIP: VirtualBMC support for tripleo-quickstart  https://review.openstack.org/39970417:17
*** paramite has quit IRC17:18
*** hewbrocca is now known as hewbrocca_afk17:22
*** derekh has quit IRC17:24
*** yamahata has joined #tripleo17:25
*** dtantsur is now known as dtantsur|afk17:26
openstackgerritMarios Andreou proposed openstack/tripleo-heat-templates: Make the openvswitch 2.4->2.5 upgrade more robust  https://review.openstack.org/39970817:28
openstackgerritSteven Hardy proposed openstack/tripleo-heat-templates: WIP prototyping composable upgrades with Heat+Ansible  https://review.openstack.org/39344817:31
*** lucasagomes is now known as lucas-afk17:32
*** Guest52285 is now known as mgagne17:33
*** mgagne has quit IRC17:33
*** mgagne has joined #tripleo17:33
*** mcornea has quit IRC17:37
*** trown is now known as trown|lunch17:44
*** jpich has quit IRC17:53
*** achadha has quit IRC17:54
openstackgerritMerged openstack-infra/tripleo-ci: Properly set distro branch in DLRN when STABLE_RELEASE=newton  https://review.openstack.org/39957817:56
*** cylopez has quit IRC17:57
*** chandankumar has joined #tripleo18:02
*** fzdarsky is now known as fzdarsky|afk18:02
*** dbecker has quit IRC18:04
*** lmiccini has quit IRC18:06
*** jpena is now known as jpena|off18:06
openstackgerritBrent Eagles proposed openstack/os-net-config: Add support for enabling hotplug on interfaces  https://review.openstack.org/39466018:09
*** ooolpbot has joined #tripleo18:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION18:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242918:10
*** ooolpbot has quit IRC18:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]18:10
*** achadha has joined #tripleo18:11
*** ccamacho has joined #tripleo18:11
*** ccamacho has quit IRC18:11
weshaylucas-afk++18:13
*** achadha_ has joined #tripleo18:13
*** achadha_ has quit IRC18:14
*** achadha_ has joined #tripleo18:14
*** achadha_ has quit IRC18:15
*** achadha has quit IRC18:15
*** achadha has joined #tripleo18:15
*** yamahata has quit IRC18:22
dtrainorHi.  I'm getting a failed deployment when trying to set PublicVirtualFixedIPs for an ip-based SSL Overcloud deployment.  The error I'm getting is:  Resource CREATE failed: InvalidIpForNetworkClient: resources.PublicVirtualIP.resources.ExternalPort: IP address 10.12.148.193 is not a valid IP for any of the subnets on the specified network. Neutron server returns request_ids: ['req-1fde4302-a5d4-483b-9109-19f2e8a5a6e7']18:41
dtrainorThe IP that I'm using (10.12.148.193) is in fact in my ExternalAllocationPools range18:42
dtrainorhmm I don't need environments/ips-from-pool.yaml do I?18:44
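The Neutron error above complains about subnets, not allocation pools, so one thing worth checking is whether the fixed IP falls inside the CIDR of the subnet that was actually created. A small sketch using Python's standard `ipaddress` module follows. The CIDRs and pool bounds are hypothetical placeholders, since the log does not show the real network settings; only the IP 10.12.148.193 comes from the conversation.

```python
import ipaddress
from typing import Iterable


def ip_in_any_subnet(ip: str, subnet_cidrs: Iterable[str]) -> bool:
    """Return True if `ip` belongs to at least one of the given CIDRs."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in subnet_cidrs)


def ip_in_pool(ip: str, start: str, end: str) -> bool:
    """Return True if `ip` lies within the inclusive range start..end."""
    return (
        ipaddress.ip_address(start)
        <= ipaddress.ip_address(ip)
        <= ipaddress.ip_address(end)
    )


if __name__ == "__main__":
    fixed_ip = "10.12.148.193"  # from the error message in the log
    # Hypothetical values: substitute the real subnet CIDR and pool bounds.
    print(ip_in_any_subnet(fixed_ip, ["10.0.0.0/24"]))
    print(ip_in_any_subnet(fixed_ip, ["10.12.148.0/24"]))
    print(ip_in_pool(fixed_ip, "10.12.148.190", "10.12.148.200"))
```

An address can sit inside the configured pool range and still be rejected if the subnet CIDR that was deployed does not contain it, which is one possible explanation for the error and not a confirmed diagnosis.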
*** _milan_ has quit IRC18:50
*** milan has joined #tripleo18:55
*** pkovar has quit IRC18:56
*** yamahata has joined #tripleo18:56
*** oshvartz has joined #tripleo18:59
*** jkilpatr has joined #tripleo19:02
*** cwolferh has quit IRC19:03
*** kjw3 has quit IRC19:03
*** chandankumar has quit IRC19:04
*** milan has quit IRC19:05
*** milan has joined #tripleo19:05
*** trown|lunch is now known as trown19:07
*** cwolferh has joined #tripleo19:07
jkilpatrrlandy, any idea why my browbeat job is trying to use a network environmental yaml even when I have network seperation disabled.19:07
jkilpatr?19:07
jkilpatrs/seperation/isolation19:08
jkilpatrnevermind I think I see a fix landed a few hours ago19:09
jkilpatrhttps://github.com/redhat-openstack/ansible-role-tripleo-overcloud/commit/51d96d1a23064002d0c7d4f4a8c357677f3f8c7819:09
*** ooolpbot has joined #tripleo19:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION19:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242919:10
*** ooolpbot has quit IRC19:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]19:10
*** penick has quit IRC19:11
rlandyjkilpatr: hello!!19:13
rlandyjkilpatr: pls see internal19:13
*** penick has joined #tripleo19:14
gfidentehave good weekend tripleo19:16
*** gfidente has quit IRC19:16
*** milan has quit IRC19:20
openstackgerritSagi Shnaidman proposed openstack-infra/tripleo-ci: DONT REVIEW: Test timeout  https://review.openstack.org/39341519:21
*** milan has joined #tripleo19:21
*** dsariel has joined #tripleo19:24
*** jbadiapa has quit IRC19:25
*** ctayal has joined #tripleo19:29
*** milan has quit IRC19:36
*** dsneddon_afk is now known as dsneddon19:37
*** leanderthal is now known as leanderthal|afk19:39
*** abregman|afk has quit IRC19:39
*** bnemec-semi-here has quit IRC19:39
*** milan has joined #tripleo19:41
*** milan has quit IRC19:50
*** amoralej is now known as amoralej|off19:51
*** iranzo has quit IRC19:52
*** milan has joined #tripleo20:01
*** tzumainn has joined #tripleo20:02
*** milan has quit IRC20:05
*** hjensas has joined #tripleo20:07
*** panda|bbl is now known as panda20:08
*** penick has quit IRC20:09
*** ooolpbot has joined #tripleo20:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION20:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164242920:10
*** ooolpbot has quit IRC20:10
openstackLaunchpad bug 1642429 in tripleo "CI: low-memory template doesn't apply and jobs are killed by oom" [Critical,Triaged]20:10
*** abehl has joined #tripleo20:10
*** tzumainn has quit IRC20:14
*** ctayal has quit IRC20:15
*** ctayal has joined #tripleo20:16
*** dsariel has quit IRC20:16
*** akrivoka has quit IRC20:20
*** milan has joined #tripleo20:21
*** fultonj has quit IRC20:23
*** cwolferh has quit IRC20:27
*** cwolferh has joined #tripleo20:29
slagleis it just my patches, or is anyone else seeing this error in CI20:30
slagleNov 18 19:37:25 localhost os-collect-config: #033[1;31mError: /Stage[main]/Rabbitmq::Config/Rabbitmq_erlang_cookie[/var/lib/rabbitmq/.erlang.cookie]/content: change from NFNKSOGWBJTZCZDNLQEN to weDcaWyG7Obatc48K8T6 failed: Execution of '/usr/bin/puppet resource service rabbitmq-server ensure=stopped' returned 1: /usr/share/gems/gems/json-1.7.7/lib/json/common.rb:155:in `encode': "\xC3" on US-ASCII (Encoding::InvalidByteSequenceError)20:30
slaglehmm, I think it's across the board20:31
*** milan has quit IRC20:35
openstackgerritTim Rozet proposed openstack/puppet-tripleo: Adds auto-detection for VIP interfaces  https://review.openstack.org/39040020:38
*** bana_k has quit IRC20:39
pandaslagle: that sounds new.20:39
*** milan has joined #tripleo20:41
openstackgerritTim Rozet proposed openstack/puppet-tripleo: Adds auto-detection for VIP interfaces  https://review.openstack.org/39040020:41
slagleyea20:43
*** penick has joined #tripleo20:44
*** ipsecguy has joined #tripleo20:45
slaglepanda: filed a bug: https://bugs.launchpad.net/tripleo/+bug/164305920:49
openstackLaunchpad bug 1643059 in tripleo "CI: jobs failing with Error: /Stage[main]/Rabbitmq::Config/Rabbitmq_erlang_cookie[/var/lib/rabbitmq/.erlang.cookie]/content: <snip> `encode': "\xC3" on US-ASCII (Encoding::InvalidByteSequenceError)" [Critical,Triaged]20:49
slaglei think it might be the new puppet-remote package20:49
*** ipsecguy_ has quit IRC20:49
panda\xC3 is the utf-8 core for é20:50
pandacode*20:50
pandaat least part of it20:51
*** milan has quit IRC20:51
*** chlong has quit IRC20:53
*** shardy has quit IRC20:59
*** milan has joined #tripleo21:01
*** arxcruz has quit IRC21:02
dsneddonHey everyone, is there a way to test-build the overcloud stack without deploying? I just want to see what the value of certain resources will be when I deploy, to help iterating while developing.21:02
*** jayg is now known as jayg|g0n321:02
openstackgerritJohn Trowbridge proposed openstack/instack-undercloud: Increase the default number of workers for heat engine  https://review.openstack.org/39961921:03
dsneddonzaneb, ^^^?21:03
zanebdsneddon: there's a stack-preview command21:04
dsneddonzaneb, Ah, thanks, I'll look that up.21:04
zanebdsneddon: not sure if it will do what you want or not, but worth a look21:04
dsneddonzaneb, Yes, it looks like exactly what I want.21:05
*** milan has quit IRC21:05
*** mhenkel has quit IRC21:08
*** ooolpbot has joined #tripleo21:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION21:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164305921:10
*** ooolpbot has quit IRC21:10
openstackLaunchpad bug 1643059 in tripleo "CI: jobs failing with Error: /Stage[main]/Rabbitmq::Config/Rabbitmq_erlang_cookie[/var/lib/rabbitmq/.erlang.cookie]/content: <snip> `encode': "\xC3" on US-ASCII (Encoding::InvalidByteSequenceError)" [Critical,Triaged]21:10
*** fragatina has quit IRC21:11
*** florianf has quit IRC21:16
*** milan has joined #tripleo21:17
*** jkilpatr has quit IRC21:17
*** rlandy has quit IRC21:17
*** mhenkel has joined #tripleo21:19
openstackgerritMerged openstack/tripleo-heat-templates: Use default Sensu redact  https://review.openstack.org/39828121:25
pandaslagle: https://github.com/paramite/puppet-remote/issues/2 it wasn't new21:33
*** jeckersb is now known as jeckersb_gone21:34
*** andrey-mp has left #tripleo21:35
*** dmarlin_ has left #tripleo21:35
pandaslagle: the á in Mágr encoding is \xC3\xA1 in UTF-8. Weird because JSON.parse is reading that correctly in local irb. but anyway21:35
*** fragatina has joined #tripleo21:41
slagleit might be the á, or it could have been the invalid json21:47
slaglehe merged my PR that fixed the invalid json, and a new package is available, we can see if it's fixed21:48
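For reference, the byte values behind the "\xC3" in the error can be checked quickly. The failing code in the log is Ruby; the Python snippet below is only an illustration of the same encoding facts. The helper name `decodes_as_ascii` is made up here.

```python
A_ACUTE = "\u00e1"  # the character á

utf8_bytes = A_ACUTE.encode("utf-8")         # two bytes, starting with 0xC3
latin1_bytes = A_ACUTE.encode("iso-8859-1")  # a single byte


def decodes_as_ascii(data: bytes) -> bool:
    """Return True if `data` is valid US-ASCII, False otherwise."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(utf8_bytes, latin1_bytes)
    print(decodes_as_ascii(utf8_bytes))
```

Both á and é start with the byte 0xC3 in UTF-8, so the error message alone does not say which character was involved, only that non-ASCII UTF-8 data reached code expecting US-ASCII.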
*** ctayal has quit IRC21:53
*** trown is now known as trown|outtypewww21:54
*** lblanchard has quit IRC22:01
*** penick has quit IRC22:01
*** penick has joined #tripleo22:03
*** lblanchard has joined #tripleo22:04
*** lblanchard has quit IRC22:08
*** ooolpbot has joined #tripleo22:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION22:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164305922:10
*** ooolpbot has quit IRC22:10
openstackLaunchpad bug 1643059 in tripleo "CI: jobs failing with Error: /Stage[main]/Rabbitmq::Config/Rabbitmq_erlang_cookie[/var/lib/rabbitmq/.erlang.cookie]/content: <snip> `encode': "\xC3" on US-ASCII (Encoding::InvalidByteSequenceError)" [Critical,Triaged]22:10
*** tiswanso has quit IRC22:13
*** dsavineau has quit IRC22:19
*** jcoufal has quit IRC22:23
*** milan has quit IRC22:27
*** tiswanso has joined #tripleo22:36
*** tiswanso has quit IRC22:40
*** dsneddon is now known as dsneddon_afk22:43
*** penick has quit IRC22:52
pandaslagle: how did you trigger new package creation ?22:53
*** cl has joined #tripleo22:53
*** cl has quit IRC22:54
*** [1]cdearborn has quit IRC22:59
*** panda is now known as panda|Zz23:08
*** ooolpbot has joined #tripleo23:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION23:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/164305923:10
openstackLaunchpad bug 1643059 in tripleo "CI: jobs failing with Error: /Stage[main]/Rabbitmq::Config/Rabbitmq_erlang_cookie[/var/lib/rabbitmq/.erlang.cookie]/content: <snip> `encode': "\xC3" on US-ASCII (Encoding::InvalidByteSequenceError)" [Critical,Triaged]23:10
*** ooolpbot has quit IRC23:10
*** bana_k has joined #tripleo23:14
openstackgerritSagi Shnaidman proposed openstack-infra/tripleo-ci: DONT REVIEW: Test timeout  https://review.openstack.org/39341523:18
*** b00tcat has quit IRC23:25
*** b00tcat has joined #tripleo23:25
*** b00tcat has joined #tripleo23:25
*** abehl has quit IRC23:25
*** bfournie has quit IRC23:28
*** achadha_ has joined #tripleo23:30
*** achadha_ has quit IRC23:31
*** achadha has quit IRC23:31
*** achadha has joined #tripleo23:31
*** achadha_ has joined #tripleo23:35
*** achadha has quit IRC23:36
*** mhenkel has quit IRC23:39
*** achadha_ has quit IRC23:39
*** fragatin_ has joined #tripleo23:50
*** fragatina has quit IRC23:53
*** bana_k has quit IRC23:55
*** bana_k has joined #tripleo23:55

Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!