Thursday, 2016-06-16

*** openstack has joined #tripleo05:58
*** psanchez has joined #tripleo05:58
jaosoriorgreat05:58
*** rain has joined #tripleo05:58
jaosoriorbandini: It seems that none of the openstack components that are not running over httpd support reload though05:58
jaosoriorwhat do you recommend in those cases?05:58
*** rain is now known as Guest6167605:58
*** saneax is now known as saneax_AFK05:59
jaosoriorFor example; I have this WIP for heat https://review.openstack.org/#/c/327069/10 but I haven't put any command there.05:59
bandinijaosorior: in that case it gets a bit tricky, because if you do a "pcs restart openstack-<service>" it will restart all the dependent services too05:59
*** mbound has quit IRC05:59
jaosoriorbandini; can't I just do systemctl restart openstack-heat-api?06:00
bandinijaosorior: well it can get racy. basically if, while you do the restart via systemctl, pacemaker monitors that service it will see that it is down so it will try to stop and start it again. so at this point you're racing with pcmk for that06:02
bandinimaybe it works, but it can break06:03
jaosoriorbandini: Well, it is a post-save command; all I want is that the service re-reads the configuration so it can read the new certificates06:03
jaosoriorso if pacemaker tries to restart it again; it shouldn't be that problematic, as I don't do anything after the post save command06:04
*** pkovar has joined #tripleo06:04
*** tremble has joined #tripleo06:07
bandinijaosorior: let me have a think about that06:09
*** ooolpbot has joined #tripleo06:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION06:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/159277606:10
openstackLaunchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed]06:10
*** ooolpbot has quit IRC06:10
*** dciabrin has quit IRC06:10
*** jprovazn has joined #tripleo06:14
*** yolanda has joined #tripleo06:18
*** yolanda has quit IRC06:21
*** ramishra has quit IRC06:21
*** florianf has quit IRC06:22
*** florianf has joined #tripleo06:22
*** yolanda has joined #tripleo06:24
*** rcernin has joined #tripleo06:24
*** saneax_AFK is now known as saneax06:25
*** ramishra has joined #tripleo06:26
*** apetrich has quit IRC06:34
*** apetrich has joined #tripleo06:34
*** jcoufal has quit IRC06:36
*** coolsvap has quit IRC06:37
*** coolsvap has joined #tripleo06:38
bandinijaosorior: do you happen to have a newton env where the ci issue is seen?06:39
jaosoriorbandini: No dude :/06:40
jaosoriorI don't have an accessible machine06:40
jaosorioraaand I have a deployment I test with, but it's from before the issue06:40
jaosoriorI think ccamacho had one06:40
bandiniah right, I will try to set one up then06:41
bandinijaosorior: btw I will do a short write-up about the issues of systemctl restart <service> when <service> is managed by pacemaker. I think it will be useful for everyone (me included ;)06:43
jaosorioralright!06:43
jaosoriorthanks dude06:43
bandinithe reasons why things might break are totally not obvious and are related to how the systemd APIs work06:44
*** mburned has quit IRC06:46
*** xinwu has quit IRC06:47
*** mburned has joined #tripleo06:50
*** panda has quit IRC06:50
*** panda has joined #tripleo06:51
*** afazekas is now known as afazekas|dentist06:52
bandinijaosorior: ever seen this https://paste.fedoraproject.org/379899/66059914/ after running tripleo.sh --repo-setup and then --undercloud /06:52
bandini?06:52
hewbroccamorning folks06:54
bandiniyo hewbrocca06:54
*** mburned has quit IRC06:56
hewbroccaIt seems like EmilienM made some progress on the upgrade job last night but it's still not fixed06:59
bandiniyeah am looking into that as well06:59
*** ramishra has quit IRC07:00
*** mburned has joined #tripleo07:00
*** anshul has joined #tripleo07:01
hewbroccaIt's an interesting one...07:01
*** anshul is now known as Guest3623807:01
bandiniI have one theory, am trying to get a newton env installed, but of course some repo screwage is playing against me07:02
hewbroccashocking07:03
hewbroccaOK, good stuff, thanks07:03
bandinilol07:03
*** mikelk has joined #tripleo07:05
*** ramishra has joined #tripleo07:05
*** oshvartz has joined #tripleo07:06
jaosoriorbandini: Yes, I usually just remove that package and try again07:07
bandinijaosorior: oh boy. ok. trying07:08
* hewbrocca facepalm07:09
bandinijaosorior: can you get me /var/lib/pacemaker/cib/cib.xml from your newton install?07:09
*** tesseract has joined #tripleo07:09
*** ooolpbot has joined #tripleo07:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION07:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/159277607:10
*** ooolpbot has quit IRC07:10
openstackLaunchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed]07:10
bandininope it fails afterwards as the undercloud tries to reinstall it and it barfs07:10
*** mcornea has joined #tripleo07:11
*** jpich has joined #tripleo07:12
bandiniasked in rdo as well07:12
*** coolsvap has quit IRC07:15
jaosoriorbandini, will do07:15
*** coolsvap has joined #tripleo07:15
*** pcaruana has joined #tripleo07:17
bandinijaosorior: Gracias Güey ;)07:18
jaosoriora huevo07:19
*** ifarkas has joined #tripleo07:19
*** ramishra has quit IRC07:19
bandinilol07:19
*** ramishra has joined #tripleo07:19
jaosoriorbandini: http://pastebin.com/hQnf8VZM07:23
jaosoriorbut it's not the same environment as the failure. like I said, I got that from before it started07:23
jaosoriormaybe you could compare that though07:23
*** jpena|off is now known as jpena07:23
*** hjensas__ has joined #tripleo07:23
bandinijaosorior: yeah as soon as I get an install working07:25
*** Guest61676 is now known as leanderthal07:25
*** ramishra has quit IRC07:26
jaosorioror poke ccamacho when he's back online07:26
bandiniyeah that is probably faster :D07:27
*** ramishra has joined #tripleo07:28
jaosoriorhewbrocca: Do you know if we're deploying cinder over httpd?07:28
jaosoriormarios ^^07:29
*** shardy has joined #tripleo07:30
*** ramishra has quit IRC07:32
jaosoriorshardy hey dude, quick question; do you know if we're deploying cinder over httpd?07:32
*** ohamada has joined #tripleo07:33
hewbroccajaosorior: Hmm no I really have no idea07:34
bandinijaosorior: I believe we are not (unless something changed very recently)07:34
shardyjaosorior: I don't think we are - it should be easy enough to check via ps ax | grep cinder on a controller?07:34
shardyand/or looking at the httpd conf07:35
*** ccamacho has joined #tripleo07:35
ccamachoMorning guys!07:36
openstackgerritOded Shvartz proposed openstack/tripleo-common: overcloud-odl : add new image file definition  https://review.openstack.org/26688107:36
jaosoriorccamacho: hey dude, what's up?07:37
bandinishardy: for the record, I narrowed down the heat issue I mentioned yesterday. It seems os-collect-config on the nodes gets confused after we yum update to mitaka (from liberty)07:37
bandininot sure yet why, but let's call it progress ;)07:37
*** ramishra has joined #tripleo07:37
shardybandini: ah, well good to know you've got a handle on it07:37
shardythe o-c-c thing doesn't sound good tho07:37
bandiniyeah I need to narrow down what is going on exactly07:38
ccamachojaosorior, the upgrades issue on CI: yesterday I managed to reproduce the error with an old patch that had passed CI, so I don't think it is related to THT or puppet-tripleo07:38
shardybandini: Ok, well thanks for looking into it :)07:38
bandinishardy: let's see how it goes ;)07:38
jaosoriorccamacho: Do you still have that environment up and running?07:39
jaosoriorccamacho: If so, would it be possible for you to fetch /var/lib/pacemaker/cib/cib.xml from a controller?07:39
*** ebarrera has joined #tripleo07:42
*** dmacpher has quit IRC07:43
*** zoli_gone-proxy is now known as zoliXXL07:43
*** zoliXXL is now known as zoli|wfh07:45
mariosjaosorior: o/ no we are not afaik07:45
*** ramishra has quit IRC07:46
*** ramishra has joined #tripleo07:46
ccamachojaosorior, I have the environment with the error deployed07:47
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for keystone  https://review.openstack.org/32702907:47
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for heat  https://review.openstack.org/32706907:47
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for glance API and registry  https://review.openstack.org/32747307:47
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for RabbitMQ  https://review.openstack.org/32748207:47
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for cinder-api  https://review.openstack.org/32885907:47
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Add fact to get the fqdn for a host in the different networks  https://review.openstack.org/32929907:47
ccamachojaosorior, sure, give a sec07:47
*** olap has joined #tripleo07:49
ccamachojaosorior, http://paste.openstack.org/show/516489/07:50
*** dtantsur|afk is now known as dtantsur07:50
jaosoriorbandini^^07:53
bandinilooking, thanks07:54
*** numans has quit IRC07:54
*** saneax is now known as saneax_AFK07:56
jaosoriorbandini https://www.diffchecker.com/5jgcdjda07:59
bandinijaosorior: http://acksyn.org/files/tripleo/juan-working.pdf and http://acksyn.org/files/tripleo/carlos-broken.pdf08:00
bandiniI need to see stuff ;)08:00
jaosoriorok08:00
ccamachoyeahp, from yesterday's checks there might be something smelly with rabbit...08:00
jaosoriorso the main difference also is that neutron now depends on openstack-core08:00
jaosoriorwait08:01
jaosoriorno08:01
jaosoriorthat neutron doesn't depend anymore on openstack-core08:01
bandinithere are quite a few constraints that went away08:01
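The comparison discussed above (which ordering constraints exist in one cib.xml but not in the other) can be sketched in a few lines of Python. This is only an illustration, not the tool anyone in the channel used: the two XML fragments and their resource names are made up, and the sketch assumes the simple `rsc_order` form with `first` and `then` attributes, ignoring constraints expressed through resource sets.

```python
import xml.etree.ElementTree as ET

# Tiny, made-up CIB fragments; a real cib.xml is much larger.
WORKING = """<cib><configuration><constraints>
  <rsc_order id="o1" first="openstack-core-clone" then="neutron-server-clone"/>
  <rsc_order id="o2" first="rabbitmq-clone" then="openstack-core-clone"/>
</constraints></configuration></cib>"""

BROKEN = """<cib><configuration><constraints>
  <rsc_order id="o2" first="rabbitmq-clone" then="openstack-core-clone"/>
</constraints></configuration></cib>"""


def order_constraints(cib_xml):
    """Return the set of (first, then) pairs found in rsc_order elements."""
    root = ET.fromstring(cib_xml)
    pairs = set()
    for el in root.iter("rsc_order"):
        first, then = el.get("first"), el.get("then")
        # Constraints built from resource sets have no first/then; skip them.
        if first and then:
            pairs.add((first, then))
    return pairs


def removed_constraints(old_xml, new_xml):
    """Ordering constraints present in old_xml but missing from new_xml."""
    return sorted(order_constraints(old_xml) - order_constraints(new_xml))


for first, then in removed_constraints(WORKING, BROKEN):
    print(f"went away: {first} -> {then}")
```

Comparing sets of (first, then) pairs rather than raw text keeps the result independent of constraint ids and element order, which a plain text diff would flag as noise.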
bandiniworking on installing my newton env08:01
ccamachomy CI was deployed with this patch https://review.openstack.org/#/c/328361/08:02
ccamachowhich was the last patch without the error08:02
ccamachoif you want access to the environment just send me the public keys08:03
jaosoriorwithout?08:04
jaosoriorthought you had an environment with the error08:04
ccamachothat env has the error08:04
*** aufi has joined #tripleo08:05
jaosorioralright08:05
*** coolsvap_ has joined #tripleo08:06
*** coolsvap_ has quit IRC08:06
*** osp has quit IRC08:06
*** coolsvap_ has joined #tripleo08:07
ccamachobut my last test from yesterday was to deploy the previous patch before the first upgrade error on the upgrades job, to see if it is related to THT or puppet-tripleo, but the error was reproduced... so from our last tests from yesterday it might be an error caused by some rabbit problem...08:07
*** athomas has joined #tripleo08:07
*** coolsvap has quit IRC08:07
*** jaosorior has quit IRC08:07
ccamachobandini how did you create this? "http://acksyn.org/files/tripleo/juan-working.pdf and http://acksyn.org/files/tripleo/carlos-broken.pdf"08:07
*** jaosorior has joined #tripleo08:08
*** remix_auei is now known as remix_tj08:09
*** ooolpbot has joined #tripleo08:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION08:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/159277608:10
*** ooolpbot has quit IRC08:10
openstackLaunchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed]08:10
*** hjensas_ has joined #tripleo08:11
*** chem```` has joined #tripleo08:12
*** chem``` has quit IRC08:14
*** hjensas__ has quit IRC08:14
chem````ccamacho: hello, has a solution to the CI problem been found ?08:19
*** chem```` is now known as chem08:20
ccamachochem````: not yet man..08:20
*** jprovazn has quit IRC08:21
chemccamacho: when you have deployed your test env, did you use all git version of puppet as you described in your documentation ?08:22
chemccamacho: I just want to make sure I have the right env to do the tests08:22
openstackgerrityolanda.robla proposed openstack/tripleo-quickstart: Allow to specify templates path on overcloud deployment  https://review.openstack.org/32955608:24
*** thegodfather is now known as fabbione08:26
ccamachochem yeahp, you can use from latest master patch or any other prior08:26
*** paramite has joined #tripleo08:27
ccamachochem also if you want, you can have access to my env so you don't have to invest time deploying08:27
*** apetrich has quit IRC08:28
*** lucas-afk is now known as lucasagomes08:28
chemccamacho: one of my ideas was to bisect the puppet-tripleo/tht code to find if it's something there08:28
chemccamacho: in the meantime I accept your proposal :)08:29
ccamachowhat I did was to deploy latest patch (Failed) and the last patch without having the error on CI (Also failed), so I don't think it is related to tripleo-heat-templates or puppet-tripleo08:30
chemhttps://launchpad.net/%7Esofer-athlan-guyot/+sshkeys08:30
chemccamacho:^08:30
chemccamacho: ha oki, thanks for the information08:31
chemccamacho: when you say "last patch" you mean tht/puppet-tripleo combo ?08:31
*** apetrich has joined #tripleo08:32
ccamachochem yeahp08:32
openstackgerritKarthik S proposed openstack/tripleo-specs: New Spec: tripleo-ovs-dpdk  https://review.openstack.org/31387108:36
openstackgerritSanjay Upadhyay proposed openstack/tripleo-specs: new spec: tripleo-sriov  https://review.openstack.org/31387208:43
*** saneax_AFK is now known as saneax08:44
*** coolsvap_ is now known as coolsvap08:44
*** coolsvap has quit IRC08:45
*** coolsvap has joined #tripleo08:45
openstackgerritDerek Higgins proposed openstack-infra/tripleo-ci: Add infrstucture scripts to prepare rh2  https://review.openstack.org/29524308:46
openstackgerritDerek Higgins proposed openstack-infra/tripleo-ci: [WIP] Add support for OVB based CI  https://review.openstack.org/32952108:46
*** derekh has joined #tripleo08:47
bandiniccamacho: https://github.com/mbaldessari/pcs/tree/wip-graph-support08:48
hewbroccaderekh: hey, how did rh2 go yesterday08:48
derekhhewbrocca: good, all up again now, won't be reinstalling it again, just doing some sanity tests now and then will submit a patch to add it to infra08:48
ccamachobandini, thanks!08:49
derekhhewbrocca: I redeployed it last night and recorded the process, started editing the recording to compact it a bit, 3 hours of editing and I'm only half way through ....08:49
derekhhewbrocca: should finish the recording before the end of the week08:50
hewbroccaderekh: that is really excellent08:54
chembandini: what is this pcs graph support branch doing ?08:55
chembandini: never mind ... I've read the commit.08:56
hewbroccaderekh: What's the next step, actually enlisting the cloud in nodepool?08:56
chembandini: is this the tool that creates that http://acksyn.org/files/tripleo/wsgi-openstack-core.pdf08:57
derekhhewbrocca: 1. submit patch to infra to add the cloud to nodepool, 2. add an experimental OVB based job, 3. get it working, 4. switch the job to voting, 5. remove the existing jobs running on rh1, 6. move the rack etc...08:58
*** mgould|afk is now known as mgould08:59
*** jprovazn has joined #tripleo08:59
hewbroccavery good09:00
*** dtantsur is now known as dtantsur|brb09:00
shardyderekh: Is it possible to have more than one experimental job - we'd just have multiple jobs in the "check experimental" pipeline?09:00
shardyderekh: I'd like to add a job that enables heat convergence for check experimental09:00
derekhshardy: do you mean more than one experimental job running on rh2?09:01
hewbroccajistr: shardy is making tempting promises of rolling update support for SoftwareDeployments09:01
derekhshardy: or just in general, are multiple experimental jobs possible?09:01
hewbroccaassuming we can get jdob to do some work for a change09:01
shardyderekh: No, I just meant generally09:01
jistrhehe09:02
derekhshardy: yup, I think that's fine, if it's a problem, we could make the rh2 one non experimental but non voting09:02
*** dmk0202 has joined #tripleo09:02
shardyhewbrocca, jistr: ramishra has kindly offered to pick it up today as I'm going to be on PTO for a couple of days09:02
hewbroccaahh, too bad, then I can't hassle jdob about it09:03
hewbrocca:)09:03
hewbroccaramishra: thanks, all BS aside09:03
shardyhewbrocca: don't worry, I'm currently writing a heat spec that we can probably hassle jdob to implement ;)09:03
hewbroccaexcellent09:04
shardy(merge strategy for environments)09:04
jistrhewbrocca, shardy: yea we discussed the upgrades in general yesterday. Heat is also getting YAQL support which should mean we wouldn't have to do hacks to process data, hopefully. Still i think we need to have a PoC to see how we'd approach things in practice. We're capturing the discussion at https://etherpad.openstack.org/p/tripleo-composable-upgrades-discussion09:04
shardyjistr: FYI the pattern will end up something like this:09:05
shardyhttp://paste.fedoraproject.org/379924/4660678809:05
shardyjistr: yaql support already landed in newton heat FYI09:05
*** nijaba has quit IRC09:06
*** ifarkas has quit IRC09:06
bandinichem: correct you do pcs -f <cib> constraint order graph > file.dot && dot -Tpdf file.dot > file.pdf09:07
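For readers without the pcs branch mentioned above, the idea behind the graph (ordering constraints rendered as a DOT file for `dot -Tpdf`) can be approximated with a short Python sketch. This is not bandini's implementation: the sample CIB fragment and resource names are invented, and only simple `rsc_order` elements with `first`/`then` attributes are handled.

```python
import xml.etree.ElementTree as ET

# Made-up CIB fragment for illustration; not taken from a real deployment.
SAMPLE_CIB = """<cib><configuration><constraints>
  <rsc_order id="o1" first="rabbitmq-clone" then="openstack-core-clone"/>
  <rsc_order id="o2" first="openstack-core-clone" then="openstack-heat-api-clone"/>
</constraints></configuration></cib>"""


def order_graph_dot(cib_xml):
    """Render first -> then ordering constraints as a Graphviz DOT digraph."""
    root = ET.fromstring(cib_xml)
    lines = ["digraph order {"]
    for el in root.iter("rsc_order"):
        first, then = el.get("first"), el.get("then")
        # Skip constraints that use resource sets instead of first/then.
        if first and then:
            lines.append(f'  "{first}" -> "{then}";')
    lines.append("}")
    return "\n".join(lines)


print(order_graph_dot(SAMPLE_CIB))
```

Redirecting the printed output to `file.dot` and running `dot -Tpdf file.dot > file.pdf` would then produce a picture in the same spirit as the PDFs linked earlier.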
jistrshardy: ack, tahnks09:07
jistr*thanks09:07
bandinichem: I need to get some time to upstream it09:07
shardyjistr: here's a simple example: http://paste.fedoraproject.org/379926/0680231409:07
openstackgerritMerged openstack/diskimage-builder: Add python logger configuration  https://review.openstack.org/32540909:07
shardyhttp://docs.openstack.org/developer/heat/template_guide/hot_spec.html#yaql09:07
chembandini: that's great, it just makes it plain and simple what the constraints were doing exactly.09:07
shardyjistr: note the yaql docs are... not good, but I've been reading the code to fill out the gaps09:08
shardyperhaps we can help with some docs patches to improve that09:08
shardymistral docs also contain some examples09:08
jistrshardy: yea that's pretty awesome i think. That gives Heat incomparably more data mangling power than it had until now.09:08
shardy\o/09:08
openstackgerritMerged openstack/diskimage-builder: Introspect logging testing more  https://review.openstack.org/32807109:08
shardyjistr: the nice thing is we can take a list, manipulate it, then str_replace (or list_join) supports transparently serializing to json09:09
*** apetrich has quit IRC09:10
*** ooolpbot has joined #tripleo09:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION09:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/159277609:10
*** ooolpbot has quit IRC09:10
openstackLaunchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed]09:10
hewbroccagrrr09:11
jistrshardy: just wondering now, if we want to create a value processed through YAQL and we want to reuse it in multiple places in a template, would there be a way to do that without using a nested stack like the one in what you pasted? E.g. a resource of some sort...09:12
*** apetrich has joined #tripleo09:12
*** nijaba has joined #tripleo09:13
*** nijaba has joined #tripleo09:13
jistrshardy: i can imagine we could have a OS::TripleO::Yaql type that could accept `params` array, `yaql` string and give out `result` output, but it might be useful to support something in that sense out of the box09:13
jistrjust an idea though09:13
shardyjistr: Yeah there has been discussion on that, for now we'd have to either accept some duplication or do as you say and store the data in a nested stack09:15
shardyjistr: there was also discussion of an OS::Heat::Value native resource type09:16
shardyI thought there was even a patch but I can't find it atm09:16
shardyimplementing such a thing would be incredibly easy tho09:17
jistrshardy: yea i thought something like this https://paste.fedoraproject.org/379930/66068577/09:17
shardyYeah, that would work, but it adds to our overload-of-nested-stacks problem09:18
shardyprobably fine as a first step tho09:18
jistrand OS::Heat::Value sounds good to me by the name of it :) Something to get around the fact that we don't have variables available, just resources. So we'd have a resource that represents a variable.09:18
jistrand OS::Heat::Value could get rid of the nested stack overload perhaps if implemented within Heat :)09:19
*** milan has quit IRC09:20
shardyjistr: see this thread for context: http://lists.openstack.org/pipermail/openstack-dev/2016-April/091432.html09:22
jistrthanks09:22
shardywe'll have to check on the current status of any implementation09:22
*** florianf has quit IRC09:23
*** electrofelix has joined #tripleo09:27
*** fzdarsky|afk has joined #tripleo09:27
*** apetrich has quit IRC09:28
*** apetrich has joined #tripleo09:28
jistrshardy: interesting read. Maybe the advantage of OS::Heat::Value over computable parameter defaults would be ability to reference other resources within the same stack.09:32
jistr... read their outputs for example09:32
openstackgerritAttila Darazs proposed openstack/tripleo-quickstart: Allow passing extra arguments to overcloud deploy script  https://review.openstack.org/32897009:33
mgouldmorning everyone09:33
mgouldcould someone please do a release of instack-undercloud stable/mitaka, so one of my colleagues can QE introspection of UEFI-only nodes?09:35
shardyjistr: yep that's true09:36
*** tosky has joined #tripleo09:36
*** florianf has joined #tripleo09:37
shardymgould: sure - FYI note that it is possible for anyone to propose a release to openstack/releases now09:37
*** apetrich has quit IRC09:37
mgouldshardy: excellent, thanks!09:37
shardywe just need to look at the most recent passing stable periodic job and take the hashes from there09:37
*** apetrich has joined #tripleo09:38
*** hjensas_ has quit IRC09:38
shardyderekh: did we ever reinstate the periodic cistatus anywhere?09:38
shardyI can pull the results via a script but I wasn't sure if there was an easier interface like the tripleo.org page we had previously09:39
derekhshardy: nope but sshnaidm has a different status page that can be used, gimme a sec and I'll find it09:39
derekhshardy: http://status-tripleoci.rhcloud.com/09:40
derekhbrb09:40
shardyderekh: thanks!09:41
*** fzdarsky|afk is now known as fzdarsky09:43
*** mbound has joined #tripleo09:56
*** ifarkas has joined #tripleo09:57
*** tosky has quit IRC09:59
*** milan has joined #tripleo10:00
*** dciabrin has joined #tripleo10:01
*** tosky has joined #tripleo10:06
*** sambetts|afk is now known as sambetts10:09
*** ooolpbot has joined #tripleo10:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION10:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/159277610:10
*** ooolpbot has quit IRC10:10
openstackLaunchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed]10:10
*** dtantsur|brb is now known as dtantsur10:13
openstackgerritCarlos Camacho proposed openstack/tripleo-docs: Composable roles within services Tutorial  https://review.openstack.org/31151210:14
*** akrivoka has joined #tripleo10:14
*** dciabrin has quit IRC10:17
shardySimple docs patch that looks ready to merge if anyone has a moment: https://review.openstack.org/#/c/32969910:17
openstackgerritMichele Baldessari proposed openstack/tripleo-heat-templates: DO NOT MERFGE - debug why upgrade job fails  https://review.openstack.org/33006910:19
*** noslzzp has joined #tripleo10:20
bandinijistr, ccamacho, EmilienM: [re upgrade jobs failing] I pushed https://review.openstack.org/330069 because I want to verify if we are hitting https://bugzilla.redhat.com/show_bug.cgi?id=132746910:20
openstackbugzilla.redhat.com bug 1327469 in pacemaker "pengine wants to start services that should not be started" [Urgent,New] - Assigned to kgaillot10:20
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: WIP Use Mistral for baremetal introspection  https://review.openstack.org/32778010:22
*** sirushti has quit IRC10:26
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: WIP Use Mistral for baremetal introspection  https://review.openstack.org/32778010:26
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: Use Mistral for baremetal introspection  https://review.openstack.org/32778010:29
*** jtomasek_ has joined #tripleo10:30
*** dsariel has quit IRC10:31
*** mikelk has quit IRC10:32
*** akrivoka has quit IRC10:32
*** sirushti has joined #tripleo10:33
ccamachobandini ack10:33
shardyderekh, sshnaidm: http://status-tripleoci.rhcloud.com/ indicates periodic-tripleo-ci-centos-7-ha-mitaka is constantly failing, and there are only 4 results, is that right?10:36
sshnaidmshardy, I don't include there passed results (yet)10:36
shardydo we have another script that can summarize all the periodic jobs run?10:36
shardysshnaidm: heh, passed results are the only thing I'm interested in :)10:37
sshnaidmshardy, I see :) will add this10:37
shardyI've been using tripleo-jobs-gerrit.py for non-periodic job results, it'd be good if we could get a similar script into tripleo-ci that supports the periodic jobs10:37
shardyor, even better wire this back into the tripleo.org status pages10:38
shardyevery time we need to do a release, the first thing required is the latest periodic job pass for a particular branch10:38
shardygetting that is kind of a hassle atm10:38
shardyCan anyone direct me to the logs for the latest passed stable/mitaka periodic job?10:39
*** hjensas_ has joined #tripleo10:42
*** apetrich has quit IRC10:46
derekhsshnaidm: also, can you remove the f22 jobs, they don't exist any longer10:50
sshnaidmderekh, yeah, right10:50
derekhshardy: http://logs.openstack.org/periodic/periodic-tripleo-ci-centos-7-ha-mitaka/d3b1c10/console.html10:51
shardyderekh: thanks!10:51
mariosguys is anyone aware of a change in nova for the way we set the scheduler_driver or scheduler_host_manager for newton? (filed https://bugs.launchpad.net/tripleo/+bug/1593182 for now)10:52
openstackLaunchpad bug 1593182 in tripleo "failed openstack-nova-scheduler after updating undercloud from mitaka to newton packages" [Medium,Triaged] - Assigned to Marios Andreou (marios-b)10:52
*** akrivoka has joined #tripleo10:56
*** ccamacho is now known as ccamacho|lunch10:56
*** pkovar has quit IRC10:58
*** jtomasek_ has quit IRC10:59
jokke_hi10:59
openstackgerritBrad P. Crochet proposed openstack/puppet-tripleo: Add Mistral profiles  https://review.openstack.org/32343111:00
jokke_is it common that the gate-tripleo-ci-centos-7-upgrades job fails on overcloud controller resource exhaustion?11:02
*** olap has quit IRC11:04
*** olap has joined #tripleo11:05
openstackgerritMerged openstack/tripleo-quickstart: use environmental variables for ansible ssh configuration  https://review.openstack.org/32912411:06
*** dsariel has joined #tripleo11:06
sshnaidmderekh, can you please review it https://review.openstack.org/#/c/326055/ ?11:06
jistrjokke_: hi, currently the -upgrades job fails altogether, i don't think we've pinned down the root cause yet, bandini pushed https://review.openstack.org/330069 to investigate11:08
jistrjokke_: some info on that is here https://bugs.launchpad.net/tripleo/+bug/159277611:08
openstackLaunchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed]11:08
openstackgerritMerged openstack/tripleo-quickstart: Adds nested blocks to skip steps if there are no overcloud VMs  https://review.openstack.org/32961711:09
jokke_jistr: I'm looking at the overcloud-controller-0 messages logs11:09
jokke_there is at least a note of high loads (over 10) after which pacemaker starts trying to fence the services off11:10
*** ooolpbot has joined #tripleo11:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION11:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/159277611:10
*** ooolpbot has quit IRC11:10
openstackLaunchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed]11:10
*** ramishra has quit IRC11:10
jokke_there are a lot of high cpu load warnings from crmd (and tons of session starts for rabbitmq, which I dunno if it's normal)11:12
hewbroccajokke_: I would say it is not common -- this is a new development that we are trying to pin down11:13
jistrhmm high load could be an issue, but regarding switching services off, it may be switching them off on purpose, as the failures seem to happen during a stack-update where we switch things off/on to apply config changes11:13
jokke_I doubt this is expected Jun 15 13:15:55 localhost pengine[11468]: warning: Forcing ip-fd00.fd00.fd00.3000..10 away from overcloud-controller-0 after 1000000 failures (max=1000000)11:14
jokke_looking at these logs http://logs.openstack.org/18/311218/6/check-tripleo/gate-tripleo-ci-centos-7-upgrades/a985e87/logs/overcloud-controller-0/var/log/messages11:14
openstackgerritJohn Trowbridge proposed openstack/tripleo-quickstart: Use quickstart.sh to manage venv in all ci-scripts  https://review.openstack.org/33004011:14
jistrjokke_: yea that isn't expected i think11:14
jistrhttps://bugs.launchpad.net/tripleo/+bug/1592776/comments/611:14
openstackLaunchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed]11:14
*** ramishra has joined #tripleo11:16
*** thrash|g0ne is now known as thrash11:20
jokke_jistr: added my notes from the oc-controller to that bug11:24
jistrthanks :)11:24
shardymgould: see https://review.openstack.org/330476 for the instack-undercloud stable/mitaka release11:27
*** tobias_fiberdata has joined #tripleo11:28
shardycoolsvap: Hey, if you'd like to help with stable releases, you might like to review the sha's in the CI results referenced there11:28
shardyand figure out which other components are due releases (quite a few I suspect)11:28
*** noslzzp has quit IRC11:28
shardyhttp://logs.openstack.org/periodic/periodic-tripleo-ci-centos-7-ha-mitaka/d3b1c10/console.html#_2016-06-16_06_24_36_68511:29
coolsvapshardy, sure11:29
shardyyou can see the sha's used in the test from the package names11:29
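Pulling the hashes out of package names by hand gets tedious across many repos, so here is a rough Python sketch of the idea. The package names below are made up, and the naming scheme (a `git<hash>` marker inside the release field) is an assumption; the pattern would need adjusting to whatever the console log actually shows.

```python
import re

# Made-up package names; the "git<hash>" release suffix is an assumption.
SAMPLE_PACKAGES = [
    "instack-undercloud-4.0.1-0.20160601000000git0a1b2c3.el7.centos.noarch",
    "some-other-package-1.2.3-1.el7.noarch",
]

# name, then "-<version>-<release>" where the release contains git<hex hash>.
SHA_RE = re.compile(
    r"^(?P<name>.+?)-\d[^-]*-[^-]*git(?P<sha>[0-9a-f]{7,40})\."
)


def package_shas(names):
    """Map package name to the short git hash embedded in its release field."""
    result = {}
    for pkg in names:
        match = SHA_RE.match(pkg)
        if match:  # packages without a git marker are simply skipped
            result[match.group("name")] = match.group("sha")
    return result


for name, sha in package_shas(SAMPLE_PACKAGES).items():
    print(f"{name}: {sha}")
```

Packages that carry no hash in their name are left out of the result rather than guessed at.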
shardyEmilienM: ^^ FYI I proposed an instack-undercloud release11:29
shardyI'll be on PTO for a couple of days, so if there's review changes required, it'd be good if you or someone else can push them :)11:30
shardycoolsvap: Note that as in this case for instack-undercloud, several repos were tagged directly, and don't have deliverables in openstack/releases11:31
shardyAIUI the right thing is just to add them and continue with versioning from the current tag11:31
*** lucasagomes is now known as lucas-hungry11:31
mgouldshardy: thanks!11:36
* mgould looks11:36
*** weshay has joined #tripleo11:37
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for heat  https://review.openstack.org/32706911:37
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for glance API and registry  https://review.openstack.org/32747311:37
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for RabbitMQ  https://review.openstack.org/32748211:37
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for cinder-api  https://review.openstack.org/32885911:37
openstackgerritFlavio Percoco proposed openstack-infra/tripleo-ci: Update Fedora Atomic URL  https://review.openstack.org/33047711:38
*** tobias_fiberdata has quit IRC11:43
*** pkovar has joined #tripleo11:45
coolsvapshardy got couple of mins i have some queries11:48
shardycoolsvap: I'm about to drop out on PTO for the afternoon, but yes if they're quick queries ;)11:49
coolsvapshardy, after looking at the logs i see some components like diskimage-builder need updates11:49
coolsvapplease let me know if i am going correct?11:50
coolsvapif yes how independent components are updated in releases11:50
shardycoolsvap: there was some discussion on the ML about diskimage-builder being a special case, I'm not sure how that concluded11:50
openstackgerrityolanda.robla proposed openstack/tripleo-quickstart: Allow to specify templates path on overcloud deployment  https://review.openstack.org/32955611:50
shardyso I'd hold off any dib releases until we can chat with some of the diskimage-builder-core folks11:50
*** jpena is now known as jpena|lunch11:50
coolsvapalright11:51
shardycoolsvap: the main focus is to ensure we've got semi-recent tags on all the other tripleo specific repos11:51
shardyhttps://github.com/openstack-infra/tripleo-ci/blob/master/scripts/tripleo-jobs-gerrit.py#L1711:52
*** ramishra has quit IRC11:52
shardyproject list there, we can exclude the dib stuff and tripleo-ci11:52
coolsvapmaybe i need some more time, i will trouble EmilienM11:53
coolsvapshardy, ^^11:53
coolsvapi am now looking at instack and i have the same query as diskimage-builder11:54
*** dprince has joined #tripleo11:54
coolsvapshardy, but i got it right what you did with instack-undercloud11:56
*** noslzzp has joined #tripleo11:57
openstackgerritIhar Hrachyshka proposed openstack/tripleo-heat-templates: Stop passing charset=utf8 for neutron database connection option  https://review.openstack.org/33049011:57
shardycoolsvap: we can release instack via the same method as for instack-undercloud I think11:58
shardy(we could even do it in the same patch if there are dependencies between the two)11:59
coolsvapshardy, so i will need to update in independent or mitaka?11:59
coolsvapthats my query11:59
openstackgerritFlavio Percoco proposed openstack/tripleo-heat-templates: Add StepConfig to docker compute-post.yaml  https://review.openstack.org/33049211:59
shardycoolsvap: Honestly I'm not sure - we've aligned with the milestone releases from newton, so I just replicated that pattern for mitaka, and there is an existing instack-undercloud deliverable for liberty12:00
*** apetrich has joined #tripleo12:00
coolsvapi think in independent12:00
coolsvapinstack does not have stable branches12:00
shardycoolsvap: good point, lets see how it's tagged in the governance repo12:01
coolsvap    tags:12:02
coolsvap        - release:cycle-with-intermediary12:02
coolsvap        - type:library12:02
shardyHmm, so we need to either cut branches or change that12:02
*** jschlueter is now known as jschlueter|afk12:02
shardythanks for pointing out the inconsistency12:02
*** ramishra has joined #tripleo12:03
shardyI need to go now, perhaps we can sort this out next week12:03
shardyor, please work with EmilienM and slagle_ to resolve it12:03
coolsvapshardy, sure12:03
shardycoolsvap: thanks!12:03
shardygood to have more eyes on this :)12:03
*** shardy has quit IRC12:04
*** MaxPC has joined #tripleo12:05
*** jayg|g0n3 is now known as jayg12:08
*** coolsvap has quit IRC12:10
*** ramishra has quit IRC12:10
*** ramishra has joined #tripleo12:12
*** rhallisey has joined #tripleo12:15
*** slagle_ is now known as slagle12:16
*** ramishra has quit IRC12:16
hewbroccabandini: any more ideas on this failure?12:16
*** ibravo2 has joined #tripleo12:20
*** ibravo2 has quit IRC12:20
*** jschlueter|afk is now known as jschlueter12:21
*** ramishra has joined #tripleo12:21
*** lucas-hungry is now known as lucasagomes12:22
*** rcernin has quit IRC12:24
*** myoung|afk has joined #tripleo12:26
ccamacho|lunchbandini, I'm still reading through the bugs; I pulled the changes from https://review.openstack.org/#/c/330069/3 to see why it is failing, but I'm not able to determine why the cluster is not starting...12:26
*** ccamacho|lunch is now known as ccamacho12:26
Nghey folks, I have a really dumb question :)12:27
hewbroccaNg: Remember, there are no stupid questions...12:28
Nggiven an OS::Heat::SoftwareConfig script in a template, what actually executes that script?12:28
hewbroccaOnly stupid people12:28
hewbrocca:D12:28
hewbroccajistr: ^^^12:28
NgI was guessing it would be something in the os-* toolchain, but I've picked through them and if it's there, I'm not seeing it :)12:29
jistrNg: it's a OS::Heat::SoftwareDeployment or OS::Heat::SoftwareDeploymentGroup resource12:29
jistrthe difference is that *Group takes multiple servers to apply the config (script) on12:30
*** trown|outtypewww is now known as trown12:30
*** tobias_fiberdata has joined #tripleo12:30
*** pradk has joined #tripleo12:31
jistrNg: here is an example https://github.com/openstack/tripleo-heat-templates/blob/4bd42ea849e6d50d16610377b5e89504f5a4f412/extraconfig/tasks/major_upgrade_pacemaker.yaml#L53-L6512:31
jistrNg: or if you mean what runs the script on the target machine, it is os-collect-config + os-refresh-config i think12:33
Ngjistr: so conveniently, that's actually what I'm looking at - we're reliably seeing Step2 not get executed. It just sits in CREATE_IN_PROGRESS12:34
*** jcoufal has joined #tripleo12:34
EmilienMgood morning12:34
hewbroccahey, isn't that the same bug bandini and ccamacho hit yesterday?12:34
Ngbut yeah, I did mean the actual script on the target machine. I'm trying to work my way back up the stack12:34
Nghewbrocca: yep, I'm pitching in trying to track it down12:34
hewbroccacool12:35
Ngand I figured I'd start from where does that script end up and what is supposed to execute it12:35
slagleit's the 55-heat-config orc script12:35
hewbroccaSeems like a reasonable approach12:35
Ngslagle: awesome, thanks12:35
slagleas long as occ has collected the deployment12:35
Ngafaics Step2 shows up in the json that occ produces12:36
slaglei usually do "ps axjf | grep -A 5 os-refresh-config"12:36
slaglethat will show you if it's actually running something or not12:37
Ngit's not, and I've tried running it by hand and adding debugging and it doesn't seem to try and do any script execution12:38
Nginterestingly, we don't seem to have a 55-heat-config anywhere in /usr/libexec/os-refresh-config/, yet the Step1 gets executed fine12:38
*** rcernin has joined #tripleo12:38
ccamachomorning EmilienM!12:38
EmilienMjistr: do we have some progress on upgrade job failure?12:38
Nglikely the key factor here is that before step2, a yum update runs, and if that is commented out, step2 does get called12:39
jistrEmilienM: i didn't look into it personally, bandini posted further debugging patch https://review.openstack.org/#/c/33006912:39
ccamachoEmilienM bandini added to https://review.openstack.org/#/c/330069/12:39
*** julim has joined #tripleo12:39
Ngnot found a side-effect of that yet though, there aren't any maintainer scripts for orc/oac (occ isn't updated)12:40
jistrEmilienM: the CI results aren't there yet12:40
*** rlandy has joined #tripleo12:40
*** athomas has quit IRC12:40
Ngok, just rebuilt the environment cleanly, without Step1 having run either, and 55-heat-config exists12:41
* Ng will poke further. thanks everyone :)12:41
jistrNg: good catch12:42
hewbroccao noes12:42
hewbroccathe 55-heat-config bug again12:42
EmilienMjistr, bandini: see my comment here: https://review.openstack.org/#/c/330069/3/extraconfig/tasks/pacemaker_resource_restart.sh12:43
pradkjistr, Hi, what do you think of https://review.openstack.org/#/c/330096/12:43
EmilienMjust fyi, HTH12:43
*** bfournie has joined #tripleo12:46
*** rodrigods has quit IRC12:46
*** rodrigods has joined #tripleo12:46
*** athomas has joined #tripleo12:46
*** bfournie1 has joined #tripleo12:50
*** bfournie has quit IRC12:50
*** dciabrin has joined #tripleo12:51
hewbroccajistr: which job is this where it breaks12:53
hewbroccathe 55-heat-config thing12:53
hewbroccamburned: ^^^12:53
*** rcernin has quit IRC12:54
*** ramishra has quit IRC12:54
hewbroccaNg, bandini this thing where Heat stops -- does it happen on a regular deploy or only on the upgrade job12:54
Ngit's the upgrade12:55
jistrhewbrocca: it's not a job, it's manual upgrade testing. We don't have upgrades covered by CI (despite having an upgrade job :) ).12:55
hewbroccaOh!12:55
hewbroccaI see, OK12:55
Ngah yes, sorry, we're testing this manually12:55
hewbroccaand upgrade testing of what, exactly?12:55
mburnedhewbrocca: jistr:  https://bugzilla.redhat.com/show_bug.cgi?id=127818112:55
openstackbugzilla.redhat.com bug 1278181 in openstack-heat-templates "55-heat-config shouldn't use /var/run for it's DEPLOYED_DIR" [Unspecified,Closed: errata] - Assigned to sbaker12:55
*** mikelk has joined #tripleo12:55
mburnedhewbrocca: that was the issue we had to handle specifically in kilo based updates12:55
Nghewbrocca: liberty to mitaka. bandini has more detail than I do on this, I'm just trying to help out :)12:56
mburnedhewbrocca: we documented it a bit here:  https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html/Director_Installation_and_Usage/sect-Updating_the_Overcloud.html12:56
hewbroccaliberty to mitaka12:57
*** Goneri has joined #tripleo12:57
jistryea though the BZ is a bit different. The BZ is about the data of 55-heat-config disappearing, while now we have 55-heat-config itself disappearing12:57
hewbroccaright12:57
mburnedbut it should have only been a problem for that single version of heat-templates, not anything newer12:57
hewbroccaseems like some mitaka package update must be removing it?12:57
*** fultonj has joined #tripleo12:57
jistryea if it happens after the step where yum update is executed, it could be some faulty script in one of the RPMs12:58
*** ramishra has joined #tripleo12:58
hewbroccajistr: and is it liberty->mitaka or OSP8->OSP912:59
jistrhewbrocca: i think bandini was testing upstream13:00
matbuhewbrocca: jistr yep we were testing upstream13:00
matbu(well we are)13:00
hewbroccaMight be time to go hassle #rdo13:00
*** cdearborn has joined #tripleo13:00
*** jpena|lunch is now known as jpena13:03
*** tzumainn has joined #tripleo13:05
*** ramishra has quit IRC13:05
*** ramishra has joined #tripleo13:05
*** rcernin has joined #tripleo13:06
jistrmatbu: is it an environment where i could take a peek too?13:06
matbujistr: yep13:07
NgI think it's openstack-tripleo-image-elements13:07
Ngthe rpm has a post-install that does an rsync with --delete against /usr/libexec/os-refresh-config13:08
*** cllewellyn_ has joined #tripleo13:11
*** cllewellyn__ has joined #tripleo13:11
Ngthat seems like it would be a bug - if you've built an image with elements that have orc scripts not in openstack-tripleo-image-elements, those orc scripts are going to be wiped out if you upgrade o-t-i-e13:13
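The effect Ng describes can be illustrated with a minimal sketch. It assumes the post-install step behaves like a mirror-with-delete of a package payload directory onto the live script directory; the directory names and the `20-os-apply-config` file below are hypothetical stand-ins, and the function only imitates the `--delete` semantics in pure Python rather than calling rsync.

```python
import tempfile
from pathlib import Path


def mirror_with_delete(src, dst):
    """Copy every file from src into dst, then delete files dst has but src lacks.

    This imitates the effect of `rsync -a --delete src/ dst/` on a flat directory.
    Returns the names that were deleted from dst.
    """
    for f in src.iterdir():
        (dst / f.name).write_bytes(f.read_bytes())
    source_names = {f.name for f in src.iterdir()}
    deleted = []
    for f in sorted(dst.iterdir()):
        if f.name not in source_names:
            f.unlink()
            deleted.append(f.name)
    return deleted


def demo():
    """Show a script from another element being wiped by the mirror step."""
    with tempfile.TemporaryDirectory() as tmp:
        payload = Path(tmp) / "package-payload"  # hypothetical: scripts shipped by the RPM
        live = Path(tmp) / "configure.d"         # hypothetical: live os-refresh-config dir
        payload.mkdir()
        live.mkdir()
        (payload / "20-os-apply-config").write_text("#!/bin/sh\n")
        (live / "20-os-apply-config").write_text("#!/bin/sh\n")
        # Installed at image build time by a different element, so not in the payload.
        (live / "55-heat-config").write_text("#!/bin/sh\n")
        deleted = mirror_with_delete(payload, live)
        remaining = sorted(p.name for p in live.iterdir())
        return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

Anything that is not part of the payload disappears from the live directory, which matches the symptom of 55-heat-config vanishing after the yum update.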
Ngbandini: ^^ (for when you're back)13:14
jistrNg: yea i think you're right, well spotted. BTW i have liberty downstream now deployed, and i don't have openstack-tripleo-image-elements installed.13:14
*** dciabrin has quit IRC13:15
jistrso one issue is that o-t-i-e RPM perhaps shouldn't be doing this13:15
jistrand another issue is that o-t-i-e probably shouldn't be on overcloud13:16
Ngit seems reasonable that it might want to remove its stale orc scripts13:16
Ngbut this is a fairly blunt hammer :)13:16
matbujistr: you hit the same issue with downstream atm ?13:16
jistrhttps://paste.fedoraproject.org/380037/8299314/raw/13:16
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for glance API and registry  https://review.openstack.org/32747313:17
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for RabbitMQ  https://review.openstack.org/32748213:17
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for cinder-api  https://review.openstack.org/32885913:17
jistrmatbu: i just went through the undercloud upgrade, which seems to have gone fine. I'm not testing further yet b/c we probably miss a lot of things for the migrations downstream still.13:17
jistrso t-i-e is being pulled in by instack-undercloud, which is being pulled in by python-tripleoclient13:18
*** zoli|wfh is now known as zoli|lunch13:18
jistri'm not sure if we need any of those packages installed on overcloud... probably not?13:18
jistralso having a hard python-tripleoclient dependency on instack-undercloud is a bit weird too13:19
openstackgerritAdriano Petrich proposed openstack/tripleo-quickstart: inject debug options on the under/overcloud images  https://review.openstack.org/32999913:19
matbujistr: i can try with a --exclude=t-i-e13:19
jistrmatbu: that would be cool. Also if we have a liberty overcloud before upgrading at some point, might be worth checking if openstack-tripleo-image-elements is installed at all. I wonder if we're introducing weird deps that pull in packages that have no business in overcloud.13:22
*** jcoufal has quit IRC13:23
Ngjistr: at least in the environment I'm using, pre-upgrade, o-t-i-e is installed13:23
jistrthe yum log says13:23
Ng(just checked on overcloud-controller-0)13:23
jistrJun 16 12:59:51 Updated: openstack-tripleo-image-elements-5.0.0-0.20160613170807.5feb901.el7.centos.noarch13:23
jistrso yea what Ng says13:24
slaglei dont think tie should be installed on the overcloud13:25
*** akshai has joined #tripleo13:26
slagleif it is, the images need to be built with the element-manifest element, otherwise as you've seen the rpm %post is going to wipe everything out13:26
*** akshai_ has joined #tripleo13:27
matbujistr: ack, i'll look if it's installed on the overcloud before upgrading13:27
*** myoung|afk has quit IRC13:29
jistrslagle: trying to track down what pulls in python-tripleoclient but can't find mention of it anywhere in t-p-e or t-i-e13:30
*** akshai has quit IRC13:31
*** ramishra has quit IRC13:31
*** myoung|afk has joined #tripleo13:32
hewbroccaeurrgh13:33
hewbroccaThis image deployment thing13:33
hewbroccahow was this going to help us again?13:33
matbuhewbrocca: wow that make me think about something ...13:35
jistrwell tbh the overcloud deployment takes ~30 minutes and not ~3 hours like in Staypuft times with kickstart :)13:35
hewbroccaThat's good I suppose13:36
matbuhewbrocca: a few weeks ago, I tested the downstream upgrade (7 to 8) with tripleo-quickstart. When I used some images built by Wes, I got the same issue (heat stuck); when I used the mburned images, it passed correctly13:36
jistrmatbu: huh that could be a lead. Some customization somewhere is building funky images.13:37
*** ramishra has joined #tripleo13:37
matbuhewbrocca: I never understood why ... I didn't look further because the official images were working.13:37
matbujistr: yep13:37
hewbroccamburned: ^^^13:37
slaglejistr: yea, i'd start with the overcloud image build log13:38
slagleor, go to a liberty ci job from tripleo-ci, and download the host_info.txt for one of the oc nodes13:39
jistri have the "normal" images built with tripleo.sh, and i don't have t-i-e on my env. It's master though, not liberty.13:39
slagleit has rpm -qa output in there13:39
mburnedinternally, for osp 7/8 we build using openstack overcloud image build13:39
mburnednothing special13:39
jistryea i also don't have t-i-e in OSP 8 env (downstream liberty)13:40
jistrso it's something specific to the way we build RDO images probably13:40
matbujistr: which images are used for RDO?13:41
jistrmatbu, bandini: so you deployed with tripleo-quickstart, using downloaded images rather than built? maybe this? https://github.com/openstack/tripleo-quickstart/blob/master/config/release/liberty.yml13:41
*** tserong has quit IRC13:41
matbujistr: yep for me, idk for bandini13:41
*** tserong has joined #tripleo13:42
openstackgerritMarios Andreou proposed openstack/tripleo-docs: Upgrade documentation  https://review.openstack.org/30898513:43
trownjistr: matbu, oh didnt realize we were troubleshooting RDO images13:43
*** lblanchard has joined #tripleo13:44
jistrtrown: can we get to what builds them?13:45
trownhttps://github.com/redhat-openstack/ansible-role-tripleo-image-build/blob/master/vars/default_package_list.yml are the packages that get pre-installed13:45
trownI guess one of those is pulling in t-i-e?13:45
slaglerdo uses it's own code to build images?13:46
jistrtrown: it's python-tripleoclient -> instack-undercloud -> openstack-tripleo-image-elements13:46
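The chain jistr spells out can be traced mechanically. The sketch below walks a "requires" map breadth-first to find how one package pulls in another; the map is hand-written from the three packages named in the chat and is an assumption, not output from a real repository query.

```python
from collections import deque

# Hypothetical "requires" map, limited to the packages named in the chat.
REQUIRES = {
    "python-tripleoclient": ["instack-undercloud"],
    "instack-undercloud": ["openstack-tripleo-image-elements"],
    "openstack-tripleo-image-elements": [],
}


def dependency_path(requires, start, target):
    """Return the shortest chain of packages from start to target, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for dep in requires.get(path[-1], []):
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None


if __name__ == "__main__":
    chain = dependency_path(
        REQUIRES, "python-tripleoclient", "openstack-tripleo-image-elements"
    )
    print(" -> ".join(chain))
```

On a real system the map would have to be filled from the package manager's dependency data; the search itself stays the same.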
trownslagle: it uses the tripleo-common library to do the image building, but it pre-installs a bunch of packages since it is used for undercloud image as well13:47
trownslagle: so as not to install 300 packages twice13:47
jistrtrown: i think we should remove python-tripleoclient from that list for overcloud, and only have it in undercloud13:48
*** egafford has joined #tripleo13:48
trownjistr: cool that is easy to do, there is an undercloud package install var for just that purpose, I will put up a patch shortly13:49
slagletrown: maybe we could maintain the list upstream13:49
myoung|afktrown: I'll watch for it (review)13:49
*** myoung|afk is now known as myoung13:49
jistrtrown: thanks13:49
*** dciabrin has joined #tripleo13:49
hewbroccawell there you go13:49
weshay:)13:49
hewbroccawhat's the chance this also has something to do with this pacemaker issue bandini is arguing with13:50
trownslagle: ya I have been trying to think of an alternative where we don't have that list at all, and turn the overcloud image into an undercloud image **AFTER** it is built with DIB/tripleo-common13:50
hewbroccalike remove a bunch of stuff?13:51
trownya remove stuff, add stack user, add stuff13:51
*** hewbrocca is now known as hewbrocca-afk13:51
jistrhewbrocca-afk: it's almost surely unrelated to the pacemaker issue, the pacemaker issue fails on -upgrade job where we build overcloud images through tripleo.sh13:52
*** jcoufal has joined #tripleo13:52
matbujistr: which pacemaker issue bandini hit ? I think the latest issue was the heat hanging on step213:53
*** Larion has joined #tripleo13:53
ccamachoEmilienM can't find anything useful. On the controller, neutron-server is not started after the update; after a "sudo systemctl restart neutron-server" it starts, but the update still can't finish or hits another error. I think it is rabbit related, but I'm not sure how to test whether rabbit is actually working properly.13:55
trownmyoung: https://review.gerrithub.io/28065513:55
*** dciabrin has quit IRC13:56
ccamachoEmilienM, jaosorior, chem any new clue?13:56
*** tobias_fiberdata has quit IRC13:56
EmilienMccamacho: I haven't worked on it yes13:57
EmilienMyet*13:57
EmilienMbandini: any news?13:57
openstackgerritHarry Rybacki proposed openstack/tripleo-quickstart: [WIP] Add scale to roles gate  https://review.openstack.org/32954213:58
chemccamacho: I'm going back to pacemaker analysis and following the date in multiple log file.  I've found that ipaddr failure happens also in "successful" ha-proxy update.  I'm going to log that in the launchpad13:58
matbujistr: so yes, t-i-e is installed on liberty13:58
EmilienMjistr: could you reproduce the pacemaker issue?13:59
jistrpradk: i think a tweak is required at https://review.openstack.org/#/c/330096/ but otherwise i think it's good, thanks14:02
*** masco has quit IRC14:03
*** tserong has quit IRC14:05
jtomasekI am seeing this error randomly on several service apis in recent undercloud setup, is it known thing? http://paste.openstack.org/show/516666/14:06
pradkjistr, so you mean instead of keystone_cli.users.find .. do service.find?14:08
*** tserong has joined #tripleo14:08
jistrpradk: yea i think that should handle it better. E.g. i don't think there's a 'cinderv2' user, so the 'cinderv2' initialization would always be retried if we looked for users instead.14:09
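jistr's point can be shown with plain sets, without touching the keystone client at all. The user and service names below are hypothetical snapshots chosen so that 'cinderv2' exists as a service but not as a user, as the chat suggests; they are not read from a real Keystone.

```python
# Hypothetical snapshots of what Keystone might hold after a first deploy.
EXISTING_USERS = {"cinder", "glance", "heat"}
EXISTING_SERVICES = {"cinder", "cinderv2", "glance", "heat"}


def needs_init(wanted, existing):
    """Return the wanted names that are missing from the existing set."""
    return sorted(name for name in wanted if name not in existing)


if __name__ == "__main__":
    wanted = ["cinder", "cinderv2"]
    # Checking against users flags 'cinderv2' on every run.
    print(needs_init(wanted, EXISTING_USERS))
    # Checking against services reports nothing left to initialize.
    print(needs_init(wanted, EXISTING_SERVICES))
```

With a user-based check, a service that never gets its own user looks uninitialized forever, so its setup would be retried on every run.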
pradkjistr, i see.. ok lemme see what call keystone client supports for services check14:10
*** fzdarsky is now known as fzdarsky|afk14:11
*** dsariel has quit IRC14:12
*** tobias_fiberdata has joined #tripleo14:14
*** tserong has quit IRC14:15
*** Larion has quit IRC14:17
*** tserong has joined #tripleo14:18
*** zoli|lunch is now known as zoli|wfh14:19
openstackgerritHarry Rybacki proposed openstack/tripleo-quickstart: [WIP] Add scale to roles gate  https://review.openstack.org/32954214:20
*** dciabrin has joined #tripleo14:23
EmilienMjistr, chem, bandini: is there someone on the pacemaker issue in upgrade job?14:23
chemEmilienM: well, I'm reading corosync log and deploying a fresh quickstart14:24
jistrEmilienM: i'm tasked with OSP 9 primarily, looking into https://bugzilla.redhat.com/show_bug.cgi?id=1341350 now14:25
openstackbugzilla.redhat.com bug 1341350 in rhel-osp-director "rhel-osp-director: registering the overcloud images fails on first attempt with "500 Internal Server Error: Failed to upload image 51672726-cc40-40ea-9ca0-1f8b2267313c (HTTP 500)"" [Unspecified,Assigned] - Assigned to jstransk14:25
EmilienMjistr: right but our CI is currently blocked, I spent almost my whole day on it yesterday and didn't make much progress, /me really needs help14:26
EmilienMjistr: I'm also tasked to things but imho we should fix our CI first14:26
jistrtrying to look for https://review.openstack.org/#/c/330069 in zuul but can't find it14:27
jistrwe may need to recheck?14:27
EmilienMI did recheck..14:28
*** jistr is now known as jistr|mtg14:28
*** ifarkas has quit IRC14:28
EmilienMwhat worries me is that our CI is red14:28
EmilienMand nobody really cares14:28
*** hewbrocca-afk is now known as hewbrocca14:30
hewbroccaEmilienM: I wouldn't say that, we've been working on fixing it all day14:30
EmilienMcool, I hope I was wrong then.14:31
chemEmilienM: see my new comment on the launchpad, maybe you will see there something that eludes me14:32
EmilienMok14:32
pabelangerEmilienM: I feel your pain. Not to pile on to the ci pipeline on tripleo, I found it difficult to contribute simply because of how long the tests were taking.14:33
*** ramishra has quit IRC14:34
pabelangersome feedback would be to break out testing into smaller portions vs. an end-to-end testing model.14:34
hewbroccapabelanger: yes, slagle has done some significant work on that14:36
*** ramishra has joined #tripleo14:37
hewbroccafolks, I think we need to down tools until this upgrade job is green again14:38
EmilienMtrown: can we revert our last promotion?14:38
*** tserong has quit IRC14:38
EmilienMif it's related to the latest version of pacemaker or something?14:38
EmilienMI propose we revert what is blocking us, and let bandini and other pacemaker gurus to figure what is going wrong14:39
matbujistr|mtg: bandini hewbrocca just removing the t-i-e package (and deps) unblocks it and it works \o/14:39
hewbroccamatbu: well that's good news14:40
hewbroccabut I guess it doesn't fix the upgrade job, right?14:40
hewbroccaor does it...14:40
*** pcaruana has quit IRC14:41
matbuhewbrocca: nop cause, the tripleo ci job is about minor upgrade14:41
openstackgerritHarry Rybacki proposed openstack/tripleo-quickstart: [WIP] Add scale to roles gate  https://review.openstack.org/32954214:41
hewbroccayeah, I thought not14:42
hewbroccaEmilienM: I think the problem is that we have not been able to identify at all, what commit causes the upgrade job to fail14:42
trownmatbu: myoung I think https://review.gerrithub.io/280666 is a more long term fix to get rid of package drift between RDO images and upstream14:43
EmilienMI'm going to compare (again) latest successful job and a recent failing job and see what is diff14:43
EmilienMI did it already but it was not super useful14:43
openstackgerritPradeep Kilambi proposed openstack/python-tripleoclient: Run post deploy config on force  https://review.openstack.org/33009614:43
ccamachohewbrocca ++  yeahp, I don't think it is related to tht or puppet-tripleo, as I have deployed patches from last week.. and had the same problem.. I'm not able to get the error..14:43
ccamachoEmilien I did that already: with a patch from last week, the upgrade failed with the same issue...14:44
chemEmilienM: you had a link on the package diff, do you still have it ?14:44
hewbroccaEmilienM: can you identify a task force to nail this down14:44
EmilienMchem: yes but its not relevant14:44
*** trown is now known as trown|brb14:44
EmilienMit was https://www.diffchecker.com/f9hkwqmj14:44
hewbroccaName  your people14:44
EmilienMbut I'm doing it again14:44
chemEmilienM:thanks14:44
openstackgerritAttila Darazs proposed openstack/tripleo-quickstart: Fix script creation modes to standard 755  https://review.openstack.org/33062014:45
ccamachoI can give you /etc /var/log from an OK env and a NOK /etc /var/log from the same env after the update...14:45
EmilienMhewbrocca: I was wrong and it seems chem ccamacho are also working on it. Let's continue together now14:45
hewbroccaOK EmilienM and you need bandini as well?14:45
EmilienMccamacho: yeah? that would be cool14:45
ccamachojust a sec14:45
EmilienMhewbrocca: oh yeah, for sure14:45
hewbroccaOK14:45
myoungtrown, looked quickly and that seems like a rational solution to a real problem.  I'll take a deeper look later on.14:45
hewbroccaEmilienM: You might also want to grab beekhof14:47
EmilienMok, package diff: https://www.diffchecker.com/8qlic49z14:47
EmilienMon left, successful job, on right, failing job14:47
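The left/right comparison EmilienM is doing by hand in a web diff tool can also be scripted. This is a minimal sketch that assumes each job's package listing is available as plain text with one package per line, in the style of `rpm -qa` output; the package strings are placeholders, not taken from any real job.

```python
def package_diff(good_listing, bad_listing):
    """Return (only_in_good, only_in_bad) for two one-package-per-line listings."""
    good = {line.strip() for line in good_listing.splitlines() if line.strip()}
    bad = {line.strip() for line in bad_listing.splitlines() if line.strip()}
    return sorted(good - bad), sorted(bad - good)


# Placeholder package strings, not taken from any real job.
GOOD = """\
example-common-1.0-1.el7.noarch
example-agent-2.3-1.el7.noarch
"""

BAD = """\
example-common-1.0-1.el7.noarch
example-agent-2.4-1.el7.noarch
example-extra-0.1-1.el7.noarch
"""

if __name__ == "__main__":
    only_good, only_bad = package_diff(GOOD, BAD)
    print("only in successful job:", only_good)
    print("only in failing job:", only_bad)
```

A changed version shows up once on each side, so upgrades and newly pulled-in packages are both visible in the result.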
chemEmilienM: it's undercloud, do you have overcloud ?14:48
EmilienMso there is dkms, tht, i-u and os-net-config14:48
*** cllewellyn_ has quit IRC14:48
*** cllewellyn__ has quit IRC14:48
EmilienMright, that's undercloud14:48
EmilienMdamn, there is no rpm-qa logs on overcloud I think14:49
*** tserong has joined #tripleo14:49
slagleEmilienM: there is14:49
openstackgerritDerek Higgins proposed openstack-infra/tripleo-ci: Switch back to current-tripleo pre promotion  https://review.openstack.org/11101114:49
slagleEmilienM: in /var/log/host_info.txt14:49
derekhthats ^^ testing CI before we did the last promotion14:49
slagleor host_info.log14:49
*** jaosorior has quit IRC14:49
EmilienMslagle: ok thanks, I missed that14:49
derekhEmilienM: ^14:49
EmilienMoh awesome14:50
* EmilienM autoslap14:50
*** panda has quit IRC14:50
*** panda has joined #tripleo14:50
ccamachoEmilienM https://we.tl/EXKKlP4PjB14:51
*** trown|brb is now known as trown14:52
EmilienMccamacho: excellent, thx, let me look now14:52
chemI'm off for 30min14:52
ccamachoyou have there logs from both controller and compute before/after the upgrade14:52
ccamachoall /etc and /var/logs14:53
ccamachoEmilienM ^14:53
EmilienMexcellent14:54
chemI'm unoff ... wrong time14:55
EmilienMok I don't get it14:56
EmilienMhttps://www.diffchecker.com/wnc1vjrp14:56
EmilienMI need to try again, maybe I did wrong14:56
*** tserong has quit IRC14:56
*** ramishra has quit IRC14:57
*** adarazs is now known as adarazs_afk14:57
openstackgerritAna Krivokapic proposed openstack/tripleo-ui: Disable "Assign Nodes" link if no nodes are available  https://review.openstack.org/33062714:58
*** tremble has quit IRC14:58
*** dtantsur is now known as dtantsur|bbl14:58
openstackgerritHarry Rybacki proposed openstack/tripleo-quickstart: [WIP] Add scale to roles gate  https://review.openstack.org/32954214:58
*** ramishra has joined #tripleo14:59
*** chem has quit IRC14:59
*** chem has joined #tripleo15:00
*** tserong has joined #tripleo15:02
EmilienMok I have something more helpful15:03
EmilienMhttps://www.diffchecker.com/qpmfhinu15:03
openstackgerritMerged openstack/tripleo-quickstart: Fix script creation modes to standard 755  https://review.openstack.org/33062015:05
EmilienMderekh, trown: what do you think if we come back to the previous repo we were using?15:05
EmilienMit seems that all diff is related to the latest promotion15:05
*** ebarrera has quit IRC15:06
trownfine by me15:06
chemEmilienM: this is a nightmare, so many packages ...15:06
EmilienMchem: not so much, most of them are openstack related15:06
*** dciabrin has quit IRC15:07
EmilienMbut yeah, it's not an easy one15:07
EmilienMtrown: how to proceed, by patching tripleo-ci?15:07
EmilienM70/8b/708bf15975d8bfb4b7bc9426a86369d82c0d4dd9_cbd0900e was working well15:09
openstackgerritBrad P. Crochet proposed openstack/tripleo-puppet-elements: Add zaqar package to controller image  https://review.openstack.org/33063615:12
*** aufi has quit IRC15:14
*** ramishra has quit IRC15:14
*** coolsvap has joined #tripleo15:16
*** ramishra has joined #tripleo15:16
EmilienMtrown: how can we unpromote our CI?15:16
EmilienMtrown: should we do something in tripleo-ci/scripts/tripleo.sh ?15:16
*** jmiu_ has joined #tripleo15:17
*** olap has quit IRC15:17
trownEmilienM: sorry on a call, I think I can just manually do it15:17
trownderekh: wdyt?15:17
EmilienMexcept if we have an immediate solution, I propose we do it15:18
*** fzdarsky|afk is now known as fzdarsky15:18
*** jistr|mtg is now known as jistr15:19
jistrderekh: if you have the issue reproduced, you could try logging into the controller and run "crm_resource --wait -VVV"15:19
openstackgerritSanjay Upadhyay proposed openstack/python-tripleoclient: Tripleoclient leaks temporary files  https://review.openstack.org/33063815:20
*** rbrady has quit IRC15:20
jistrthat should repeatedly print some info (what resources it's waiting for to start or stop) which could help us debug that further15:20
derekhtrown: EmilienM I pushed a patch about 30 minutes ago to test the previous repo; if it works we could either merge the patch (or something similar) or switch back the current-tripleo links on the mirror and trunk servers15:20
derekhtrown: EmilienM https://review.openstack.org/#/c/111011/6815:21
jistrthat's what bandini is trying to do with the CI patch https://review.openstack.org/#/c/330069/15:21
EmilienMderekh: excellent. Thanks.15:21
derekhjistr: ack, will see how it goes, thanks15:21
*** ifarkas has joined #tripleo15:24
ccamachojistr: for the CI issue   [heat-admin@overcloud-controller-0 ~]$ sudo crm_resource --wait -VVV15:25
ccamacho  notice: LogActions:Start   httpd:0(overcloud-controller-0)15:25
ccamacho  notice: LogActions:Start   httpd:0(overcloud-controller-0)15:25
ccamacho  notice: LogActions:Start   httpd:0(overcloud-controller-0)15:25
derekhjistr: ccamacho I got this every 2 seconds15:27
derekh  notice: LogActions:   Start   openstack-nova-scheduler:0      (overcloud-controller-0)15:27
derekh  notice: LogActions:   Start   openstack-cinder-volume (overcloud-controller-0 - blocked)15:27
derekh  notice: LogActions:   Start   openstack-nova-api:0    (overcloud-controller-0)15:27
derekh  notice: LogActions:   Start   openstack-nova-conductor:0      (overcloud-controller-0)15:27
derekhcinder-volume blocked?15:27
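Since this output repeats every two seconds, a small filter can help spot the blocked entries. The sketch below assumes only the line shape visible in the pasted output ("LogActions:", an action, a resource name, then a parenthesized node with an optional "- blocked" suffix); it is not based on any documented Pacemaker output format.

```python
import re

# Pattern inferred from the pasted lines only; the real format may vary.
LOG_ACTION = re.compile(
    r"LogActions:\s*(?P<action>\w+)\s+(?P<resource>[^\s(]+)\s*\((?P<detail>[^)]*)\)"
)

# Lines copied from the chat above.
SAMPLE = """\
  notice: LogActions:   Start   openstack-nova-scheduler:0      (overcloud-controller-0)
  notice: LogActions:   Start   openstack-cinder-volume (overcloud-controller-0 - blocked)
  notice: LogActions:   Start   openstack-nova-api:0    (overcloud-controller-0)
  notice: LogActions:   Start   openstack-nova-conductor:0      (overcloud-controller-0)
"""


def blocked_resources(output):
    """Return resource names whose pending action is marked as blocked."""
    blocked = []
    for line in output.splitlines():
        match = LOG_ACTION.search(line)
        if match and match.group("detail").strip().endswith("blocked"):
            blocked.append(match.group("resource"))
    return blocked


if __name__ == "__main__":
    print(blocked_resources(SAMPLE))
```

The same function could be fed the captured output of the command from a file, which makes it easier to see whether the blocked set changes over time.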
derekhconstraints say its started after openstack-cinder-scheduler15:28
jistrhmm so we probably are hitting https://bugzilla.redhat.com/show_bug.cgi?id=132746915:28
openstackbugzilla.redhat.com bug 1327469 in pacemaker "pengine wants to start services that should not be started" [Urgent,New] - Assigned to kgaillot15:28
derekhand further up the logs,15:29
derekhJun 16 14:03:34 overcloud-controller-0 systemd[1]: openstack-cinder-scheduler.service: main process exited, code=killed, status=14/ALRM15:29
*** rbrady has joined #tripleo15:31
jistrccamacho: is your output from the stuck upgrade too, didn't it change to what derekh pasted after a while?15:33
EmilienMjistr: I don't understand, if you look at the packaging diff, we don't have an update on pacemaker bits, or do we?15:33
*** openstackgerrit has quit IRC15:34
EmilienMjistr: https://www.diffchecker.com/qpmfhinu15:34
*** openstackgerrit has joined #tripleo15:34
ccamacho  nop now is empty..15:34
ccamachojistr ^15:34
*** adarazs_afk is now known as adarazs15:35
*** mikelk has quit IRC15:35
ccamachojistr weird, after running sudo pcs status --full I had a lot of services stopped (including httpd), now they are started15:37
ccamacho..15:37
*** milan has quit IRC15:37
ccamachowithout doing anything...15:37
*** leanderthal is now known as leanderthal|afk15:38
jistrhmm yea...15:39
ansiwen_will in the end (after everything is composable) puppet/manifests/overcloud_compute.pp disappear completely?15:40
chemccamacho: pacemaker has a life of its own, and starts and stops services as it sees fit; look into /var/log/cluster/corosync.log15:40
jistrderekh: could you please paste `pcs constraint show | grep nova` executed on a controller15:40
derekhso could this be the problem, cinder scheduler not stopping when it should?15:41
derekhJun 16 14:02:34 overcloud-controller-0 systemd[1]: Stopping OpenStack Cinder Scheduler Server...15:41
derekhJun 16 14:03:34 overcloud-controller-0 systemd[1]: openstack-cinder-scheduler.service: main process exited, code=killed, status=14/ALRM15:41
derekhjistr: yup15:41
ansiwen_and what is ::nova::compute::neutron, is that an independent service as well, EmilienM?15:41
EmilienMansiwen_: no15:42
openstackgerritBrad P. Crochet proposed openstack/python-tripleoclient: Add Zaqar password to deployment  https://review.openstack.org/33065015:42
EmilienMit's nova.conf parameters for neutron15:42
EmilienMbut not used anywhere AFAIK15:42
derekhjistr: http://paste.openstack.org/show/516688/15:42
EmilienMansiwen_: https://github.com/openstack/puppet-nova/blob/master/manifests/compute/neutron.pp15:43
ansiwen_EmilienM: thanks15:43
ansiwen_EmilienM: so, will puppet/manifests/overcloud_compute.pp disappear in the end or will there be something left?15:44
chemjistr: it looks like it's missing the openstack-core-clone  openstack-nova-consoleauth-clone colocation15:44
*** saneax is now known as saneax_AFK15:44
ansiwen_ccamacho: is puppet-nova something you should also mention in your walkthrough?15:45
EmilienMansiwen_: disappear15:45
ccamachoansiwen_ puppet-nova nop15:45
EmilienMansiwen_: why that?15:45
jistrchem: i'm not sure if that's needed though15:46
jistrderekh: thanks, that's not what i was hoping for, it actually looks ok :)15:46
ansiwen_because it is another external repository beside tripleo-puppet-elements that seems to be imported by default?15:47
*** flaper87 has joined #tripleo15:47
derekhjistr: ok, want to log onto this box and have a look?15:47
ansiwen_EmilienM: ok, so, if I want to let overcloud_compute.pp disappear, what could I do next?15:48
*** mcornea has quit IRC15:48
chemjistr: but then it doesn't form a group with all the openstack-nova-* services15:48
*** ibravo has joined #tripleo15:48
EmilienMansiwen_: you can sync with pradk for telemetry stuff15:49
chemjistr: it's striking that it's the services under this branch that want to restart http://file.rdu.redhat.com/~mbaldess/lp1569444/newton-jiri.pdf15:49
*** mcornea has joined #tripleo15:49
*** mcornea has quit IRC15:49
EmilienMansiwen_: we have ceilometer compute agent15:49
EmilienMansiwen_: i'll finish nova15:49
*** mcornea has joined #tripleo15:49
ansiwen_ok, pradk, give me work please :-)15:50
*** tobias_fiberdata has quit IRC15:52
ansiwen_ccamacho: did you read my answer? I thought if you mention tripleo-puppet-elements, maybe it's also worth mentioning other external repositories?15:52
EmilienMansiwen_: you can review my stuff https://review.openstack.org/#/q/topic:tripleo/nova/compute-composable15:52
ansiwen_ccamacho: because without EmilienM I would not have known to look at puppet-nova for some definitions15:53
*** tobias_fiberdata has joined #tripleo15:53
EmilienMdprince: when you get a moment, can you look & give feedback on https://review.openstack.org/#/q/topic:tripleo/nova/compute-composable please?15:53
*** panda is now known as panda|afk15:53
jistrderekh: yea that would be nice if you can PM me login details15:53
derekhjistr: ack15:54
*** Larion has joined #tripleo15:56
openstackgerritimain proposed openstack/tripleo-heat-templates: WIP: Containerized Services for Composable Roles  https://review.openstack.org/33065915:57
*** dmacpher has joined #tripleo15:57
*** akshai has joined #tripleo15:58
*** tobias_fiberdata has quit IRC15:59
*** zoli|wfh is now known as zoli|gone15:59
ccamachohey  ansiwen_ as the tutorial is based on a simple patch, it is not covering puppet-nova15:59
dprinceEmilienM: yep15:59
pradkansiwen_, sure what are you looking for in particular?16:00
openstackgerritAthlan-Guyot sofer proposed openstack/tripleo-heat-templates: TEST: do not merge.  Testing if colocation help.  https://review.openstack.org/33066116:00
pradkansiwen_, did you wrap up the libvirt  realtime support thing? i see you on that in trello16:00
*** zoli|gone is now known as zoli_gone-proxy16:00
*** tesseract has quit IRC16:00
*** xinwu has joined #tripleo16:01
EmilienMpradk: what is that?16:01
*** dprince has quit IRC16:01
*** akshai_ has quit IRC16:01
chemjistr: I'm testing adding a colocation constraint, just to see.16:01
pradkEmilienM, ansiwen_ probably knows better on the details of that16:01
*** dprince has joined #tripleo16:02
jistrchem: ack, thanks. This is a bug in pacemaker, previously we've had luck with working around it by adding more constraints, but ordering constraints were enough IIRC.16:02
ccamachojistr, is it normal that this service is the only one stopped? [heat-admin@overcloud-controller-0 ~]$ sudo pcs status --full  | grep -i Stopped16:02
ccamachohttpd(systemd:httpd):(target-role:Stopped) Stopped16:02
ccamachoStopped: [ overcloud-controller-0 ]16:02
EmilienMchem: did we remove it when we landed composable nova?16:03
*** Larion has quit IRC16:04
jistrccamacho: probably not, but depends where the operations failed and if any recovery was done etc. I'm looking at derekh's deployment and it's not in the same state as yours probably.16:04
chemEmilienM: I'm practically positive that no, it was not removed, but let me check again16:04
ansiwen_pradk: no, I have no idea yet, eglynn created that trello card. don't know the details about it yet16:04
jistrEmilienM: i don't think so. It was my first thought, that composable nova removed some constraints, but i wasn't able to find anything like that.16:04
jistrthat's why i asked derekh to paste those constraints before :)16:05
ansiwen_pradk: I'm just looking for composable role tasks, because this has high prio at the moment.16:05
EmilienMjistr: no, I didn't remove any constraint I think I kept it for later :)16:05
*** lucasagomes is now known as lucas-afk16:06
pradkansiwen_, ah ok sure.. for ceilo i have started something, but only converted the controller stuff.. there is still compute as ceilo-compute agent runs on compute.. you can look into that if you want16:06
*** milan has joined #tripleo16:06
jistrwhat keeps me puzzled is why we started hitting it all of a sudden. But it might be that we just shuffled around puppet code so maybe some order of pcs calls changed which makes the pcs bug fire up in a different way...16:06
ansiwen_pradk: sounds good16:07
EmilienMchem: oh wait I found something16:07
ansiwen_pradk: any related pointers? shall I look at your ceilo controller parts?16:08
EmilienMrequire      => Pacemaker::Resource::Ocf['openstack-core'], is missing for some nova resource16:08
*** hewbrocca is now known as hewbrocca-afk16:08
pradkansiwen_, see puppet/manifests/overcloud_compute.pp and look for ceilo specific config.. that's what we need to convert to composability16:08
pradkansiwen_, should be quite easy i think16:08
chemEmilienM: I know I pinged you about this.  But it's not relevant for pacemaker config16:08
pradkansiwen_, puppet/compute.yaml has the params16:09
jistrchem, EmilienM: as long as it's not missing from the constraint definitions, it should probably be ok...16:09
ansiwen_pradk: easy is good for me :-)16:10
*** ansiwen_ is now known as ansiwen16:10
chemEmilienM: the heat orchestration ensures that ocf['openstack-core'] is created before16:10
pradkansiwen_, and whatever hiera data is in puppet/hieradata/compute.yaml16:10
*** ooolpbot has joined #tripleo16:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION16:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/159277616:10
*** ooolpbot has quit IRC16:10
openstackLaunchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed]16:10
*** akshai_ has joined #tripleo16:10
openstackgerritEmilien Macchi proposed openstack/puppet-tripleo: profiles/nova/pacemaker/consoleauth: add missing require  https://review.openstack.org/33066916:10
EmilienMchem: ^16:10
ansiwenpradk: I don't get that separation into hieradata, btw...16:10
EmilienMchem: other services are good, I checked16:10
derekhjistr: btw, cinder-scheduler is running on the controller, I did that, when I noticed that cinder-volume was blocked16:11
*** myoung is now known as myoung|biab16:12
chemEmilienM: when I saw the error it was on the top of my hit list.  But I cannot see how it can mess around there ...16:12
*** akshai has quit IRC16:13
jistrderekh: ack thanks, i already cycled the resources status start/stop so it should be ok now i think anyway16:13
*** xinwu has quit IRC16:15
*** myoung|biab has quit IRC16:17
* bandini back16:18
*** dprince has quit IRC16:20
openstackgerritLars Kellogg-Stedman proposed openstack/tripleo-quickstart: return global control of force_cached_image  https://review.openstack.org/33016616:20
*** dprince has joined #tripleo16:21
*** ramishra has quit IRC16:22
EmilienMchem: I'm sure it won't help but let's give it a try16:22
openstackgerritLars Kellogg-Stedman proposed openstack/tripleo-quickstart: return global control of force_cached_image  https://review.openstack.org/33016616:25
chemEmilienM: I'm kinda in the same mood ... let's try some stuff16:25
*** ramishra has joined #tripleo16:26
bandiniEmilienM, jistr: I see that the upgrade job was not triggered on my review. ci overloaded or something else?16:28
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: Work around Nova blocking openstack-core stop  https://review.openstack.org/33068216:28
jistrbandini, EmilienM, chem: ^^ this helped on derekh's environment16:29
jistrbandini: indeed i think we're hitting the bug 'pengine wants to start its favorite services'16:29
jistrwhat's funny is that this time it doesn't seem that we're missing any constraints16:30
EmilienMwait16:30
chemjistr: hehe16:30
EmilienMit sounds super related to https://review.openstack.org/33066916:30
EmilienMjistr: ^16:30
* bandini still catching up16:31
jistrEmilienM: it's worth a shot for sure, but the resource is defined fine, and the constraint too. As long as the `require      => Pacemaker::Resource::Ocf['openstack-core'],` is on the constraint definition, it shouldn't matter that it isn't on the resource definition.16:31
*** dtantsur|bbl is now known as dtantsur16:32
EmilienMjistr: I was wondering if this new failure is related to our patches for nova composable16:32
bandinijistr: I see though that in newton a bunch of previously existing constraints went missing16:33
bandinijistr: http://acksyn.org/files/tripleo/juan-working.pdf vs http://acksyn.org/files/tripleo/carlos-broken.pdf16:33
chemEmilienM: one way to find out would be to roll back all the nova composition that landed yesterday.  But once again, I cannot see what would be missing or broken16:33
jistrEmilienM: i think it probably is, but that doesn't necessarily mean that there's something wrong in them, it might have just started triggering a pcmk bug by doing operations in different (but maybe not wrong) order.16:34
*** ramishra has quit IRC16:34
EmilienMok16:34
jistrbandini: oh yea that carlos-broken.pdf doesn't look nice :)16:34
bandinijistr: that is why we started triggering this, methinks16:35
derekhEmilienM: trown my test of the old repo failed, 2016-06-16 16:25:07.571379 Stack overcloud CREATE_FAILED16:35
derekhEmilienM: trown didn't get as far as the update16:35
EmilienMdamn16:35
derekhEmilienM: trown going to retrigger it now16:35
bandinijistr: shall we have a 5min sync up? I was gone for most of the afternoon, not sure what has been tried already16:35
jistrbandini: ok so you mean in fact it's blocked on Nova resources, but it's broken by the composable services for Neutron and Gnocchi?16:36
chembandini: where are all the neutron-server resources ... ?16:36
bandinijistr: it's one theory, yes16:36
ccamachochem bandini  my deployment is using an older patch (Composable Nova).. I will try to go more in the past..16:36
bandinichem: some constraints disappeared for those as well16:36
trownEmilienM: derekh, we also have the option to try to move forward... osc_lib is packaged and being added as a dep to openstackclient package shortly16:36
bandininow if I could deploy a damn newton install, I would not be so useless16:36
openstackgerritDerek Higgins proposed openstack-infra/tripleo-ci: Switch back to current-tripleo pre promotion  https://review.openstack.org/11101116:37
derekhtrown: EmilienM ^^ retrying16:37
EmilienMack16:38
EmilienMtrown: yeah16:38
EmilienMI'm fine with the THT workaround now16:38
*** yamahata has joined #tripleo16:39
chembandini: when/why did they disappear? I don't see when they were removed16:40
*** ramishra has joined #tripleo16:40
bandinichem: I have to look around in git logs, no idea yet16:40
*** oshvartz has quit IRC16:42
*** dmk0202 has quit IRC16:42
ccamachoguys just to be clear. this is not working as openstack-core is not Started, right? "check_resource openstack-core stopped 1800"16:43
chemccamacho: not exactly16:44
ccamachommm chem, can you give me more details?16:44
*** jpich has quit IRC16:44
*** xinwu has joined #tripleo16:44
jistrwhat's tricky about these issues is that CI will never uncover them. B/c when the patch goes through -upgrades CI, it's deployed with old correct t-h-t, and then upgraded to incorrect t-h-t/o-p-m, but the constraints are still in place from the deploy. And the other jobs will deploy with the new (possibly incorrect) code, but they don't do service stops.16:45
chemccamacho: when the pcs resource stop openstack-core is done, the cluster remains in a transition state because some resources that depend on openstack-core (and so should be off) are started.  After being in transition for more than 1800 sec the script aborts16:45
jistrSo this is something we need to be careful about when extracting composable roles...16:45
chemccamacho: that's how I understand it anyway, if someone can crosscheck16:45
ccamachochem ack16:46
bandinichem: I believe you described it correctly, yes16:46
jistrchem: i'm not sure if they're still started, maybe it's more that pacemaker *wants* them to start but they can't start because they have constraints in place that prevent that.16:46
bandiniyep it tries but somehow it can't/won't16:47
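A minimal sketch of the wait-and-abort behaviour chem describes above: poll a resource until it reaches the wanted state, and give up once the timeout (1800 seconds in the failing job) has elapsed. The function name `wait_for_state` and the injected `get_state` callable are hypothetical and chosen for illustration; the `check_resource` helper quoted earlier is not reproduced here, and no real pcs call is made.

```python
import time


def wait_for_state(get_state, wanted, timeout, interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until it returns `wanted` or `timeout` seconds pass.

    Returns True when the state was reached, False on timeout.
    `clock` and `sleep` are injectable so the loop can be tested
    without waiting in real time.
    """
    deadline = clock() + timeout
    while True:
        if get_state() == wanted:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Fake resource (an assumption for the demo): reports "started"
    # twice, then "stopped".
    states = iter(["started", "started", "stopped"])
    reached = wait_for_state(lambda: next(states), "stopped",
                             timeout=5, interval=0.01)
    print("reached" if reached else "timed out")
```

In the situation discussed here the state never converges, because pacemaker keeps wanting to start dependent resources, so a loop of this shape would run until the deadline and report a timeout.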
chembandini: I'm digging into a non-working log of corosync and the neutron resources and order *are* there16:47
openstackgerritBen Nemec proposed openstack/tripleo-docs: Document how to clean up a pacemaker cluster after failed update  https://review.openstack.org/33069516:47
chembandini: so it seems that the second graph misses them because it doesn't use them.  Not the stuff we're looking for I think16:47
bandinichem: the neutron resources are there in http://acksyn.org/files/tripleo/carlos-broken.pdf but the ordering constraints seem to be gone no?16:49
chembandini: no in the log they are present : rsc_order first="neutron-netns-cleanup-clone" first-action="start" id="order-neutron-netns-cleanup-clone-neutron-openvswitch-agent-clone-mandatory" then="neutron-openvswitch-agent-clone" then-action="start"/>16:49
chembandini: that's from non working cluster/corosync.txt16:50
bandinichem: do you have neutron-server -> neutron-openvswitch-agent too?16:50
chembandini: looking16:50
bandiniI don't see that one in the broken one16:50
*** ebarrera has joined #tripleo16:50
jistryea i don't see that one16:51
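The manual check being done above (is a given ordering constraint present in the CIB or not?) can be sketched in a few lines. This is an illustration under assumptions: the CIB fragment below is a hypothetical, trimmed-down sample built around the one `rsc_order` element pasted in the channel, the helper names are made up, and only simple `first`/`then` constraints are handled (constraints expressed as resource sets would need extra handling).

```python
import xml.etree.ElementTree as ET


def ordering_pairs(cib_xml):
    """Return the set of (first, then) pairs from all rsc_order elements."""
    root = ET.fromstring(cib_xml)
    return {(el.get("first"), el.get("then")) for el in root.iter("rsc_order")}


def missing_orderings(cib_xml, expected):
    """Return the expected (first, then) pairs that the CIB does not define."""
    present = ordering_pairs(cib_xml)
    return [pair for pair in expected if pair not in present]


# Hypothetical, trimmed-down CIB fragment for illustration only.
SAMPLE_CIB = """
<cib>
  <configuration>
    <constraints>
      <rsc_order first="neutron-netns-cleanup-clone" first-action="start"
                 id="order-neutron-netns-cleanup-clone-neutron-openvswitch-agent-clone-mandatory"
                 then="neutron-openvswitch-agent-clone" then-action="start"/>
    </constraints>
  </configuration>
</cib>
"""

# Orderings we expect to find, taken from the discussion above.
EXPECTED = [
    ("neutron-netns-cleanup-clone", "neutron-openvswitch-agent-clone"),
    ("neutron-server-clone", "neutron-openvswitch-agent-clone"),
]

if __name__ == "__main__":
    for first, then in missing_orderings(SAMPLE_CIB, EXPECTED):
        print(f"missing: {first} then {then}")
```

Running the same comparison against a CIB dumped from a working and from a broken cluster would list exactly the orderings that went missing between the two.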
bandinijistr: what do you think about testing a test review that restores all the missing constraints and then we see how it behaves?16:52
jistr++16:52
derekhjistr: I gotta run soon, if you're gonna want to redeploy that overcloud, I'll have to kill the old one before I go16:53
jistrderekh: i will not need a redeploy, will just test the constraint changes live16:53
jistrderekh: thanks and have a good evening16:53
derekhjistr: ok, ttyl16:53
chembandini: no this one is definitely gone !16:53
EmilienMbandini: I also did https://review.openstack.org/#/c/330669/16:54
chembandini: looking at a working one, it was there16:54
EmilienMbandini: to catch up with how it was before our work on composable nova roles16:54
chembandini: I think this is a goooood lead :)16:54
*** mbound has quit IRC16:54
*** ramishra has quit IRC16:55
bandinior we just kill all the constraints except those specified in the NG HA spec and we deal with the fallout16:55
bandinibit of a big hammer, but we're headed that way anyway16:55
bandiniEmilienM: looking16:56
*** xinwu has quit IRC16:58
ccamachochem indeed starting all the services results in ERROR: cluster finished transition but openstack-core was not in stopped state, exiting16:59
*** derekh has quit IRC16:59
ccamachoI'll do it without starting openstack-core16:59
bandinijistr: btw. what is the background for doing all those "pcs resource disable" thingies instead of cluster stop and start?17:00
bandinido you remember the specifics?17:01
jistrbandini: hmm no i don't remember. We could change it if we wanted i guess. The only potential issue i see is that full cluster restart might take more time (the cluster needs to re-form).17:04
chembandini: the patchset where it was moved to composable role is I896e5dfe6fae49371c9fe7f47c4364eb6f621b0717:04
jistrbandini: also we'd bump the VIPs17:04
jistrand there was this issue with cluster stop going bad b/c of pacemaker communicating via VIP on the client side and the VIP disappearing in the middle17:05
chembandini: relative to what is present in the puppet-tripleo it seems that we added if $enable_dhcp to $enable_ovs for the constraint to take place17:05
jistr(we discovered that one later, so it's not an original reason for this decision, but still we'd have to tackle that if we decided to change today)17:05
*** mbound has joined #tripleo17:05
bandinijistr: I see17:06
jistrso i added17:06
jistr    pcs constraint order neutron-server-clone then neutron-openvswitch-agent-clone17:06
jistr    pcs constraint order neutron-openvswitch-agent-clone then neutron-dhcp-agent-clone17:06
jistr    pcs constraint order openstack-core-clone then openstack-gnocchi-metricd-clone17:06
jistry17:06
*** akshai_ has quit IRC17:06
jistrbut still it gets stuck17:06
jistrnot sure if i forgot about some17:06
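To see what the three `pcs constraint order` commands above imply, the sketch below parses commands of that exact shape into (first, then) pairs and derives one valid start order with a topological sort. It is an illustration only: the helper names are hypothetical, the parser handles just the simple `A then B` form, and it does not talk to pacemaker. By default pacemaker ordering constraints are symmetrical, so a stop order would be the reverse of the start order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parse_order_commands(lines):
    """Parse 'pcs constraint order A then B' lines into (A, B) pairs."""
    pairs = []
    for line in lines:
        words = line.split()
        if (len(words) == 6
                and words[:3] == ["pcs", "constraint", "order"]
                and words[4] == "then"):
            pairs.append((words[3], words[5]))
    return pairs


def start_order(pairs):
    """Return one start order in which every 'first' precedes its 'then'."""
    graph = {}
    for first, then in pairs:
        graph.setdefault(first, set())
        graph.setdefault(then, set()).add(first)  # `then` depends on `first`
    return list(TopologicalSorter(graph).static_order())


COMMANDS = [
    "pcs constraint order neutron-server-clone then neutron-openvswitch-agent-clone",
    "pcs constraint order neutron-openvswitch-agent-clone then neutron-dhcp-agent-clone",
    "pcs constraint order openstack-core-clone then openstack-gnocchi-metricd-clone",
]

if __name__ == "__main__":
    for name in start_order(parse_order_commands(COMMANDS)):
        print(name)
```

A contradictory set of constraints (A then B, B then A) makes `static_order()` raise `graphlib.CycleError`, which is a cheap sanity check before applying a batch of new orderings.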
bandiniwant to paste the CIB so we can check?17:07
bandiniis sahara covered?17:07
*** trown is now known as trown|lunch17:08
bandinichem: ack17:08
*** jpena is now known as jpena|off17:08
*** ifarkas has quit IRC17:08
jistrbandini: possibly sahara is not, i worked off the carlos-broken.pdf17:08
chembandini: hum ... let me dig further I may have made a mistake. ...17:09
chembandini:(for the patchset)17:09
bandinijistr: send me a CIB and we doublecheck together, if you want17:10
*** ooolpbot has joined #tripleo17:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION17:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/159277617:10
*** ooolpbot has quit IRC17:10
openstackLaunchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,In progress] - Assigned to Jiří Stránský (jistr)17:10
jistrbandini: it's derekh's machine, it's bit challenging to get it out, gimme a sec :D17:12
openstackgerritBen Nemec proposed openstack/tripleo-docs: Undercloud install is not a script anymore  https://review.openstack.org/33071417:13
*** xinwu has joined #tripleo17:13
bandinijistr: :)17:13
chembandini: this is the right one Ia61295943e67efe354a51a26fe4540f288ff6ede17:15
chembandini: but it's sooo old!17:15
*** leanderthal|afk has quit IRC17:16
*** ohamada has quit IRC17:18
*** mcornea has quit IRC17:20
jistrbandini: sahara isn't deployed there at all it seems17:20
*** dtantsur is now known as dtantsur|afk17:20
*** sambetts is now known as sambetts|afk17:21
jistrbandini: does it look like something's still loose in that CIB?17:21
*** rcernin has quit IRC17:21
*** athomas has quit IRC17:22
bandinijistr: not really no. do you see the exact same symptoms?17:22
* bandini need to feed the kids17:22
bandinibbiab17:22
jistryea, the exact same services17:22
*** panda|afk is now known as panda17:23
openstackgerritBen Nemec proposed openstack/tripleo-common: Allow updating of nodes in baremetal import  https://review.openstack.org/33071717:23
openstackgerritBen Nemec proposed openstack/tripleo-common: Update default branch in .gitreview  https://review.openstack.org/33071817:23
jistrsame as derekh pasted earlier https://paste.fedoraproject.org/380181/66097809/raw/17:23
* EmilienM afk lunch bbiab too17:24
jistri'm gonna call it a day, but https://review.openstack.org/#/c/330682/ should get us out of trouble for the time being17:25
*** mcornea has joined #tripleo17:25
EmilienMjistr: thanks, I'll monitor it17:25
chemdoes someone know what happened to the NeutronEnableOVSAgent because it seems that neutron::enable_ovs_agent in puppet-tripleo will always be false?  I must be missing something here ...17:26
bandiniI will be here tonight for another issue anyway, so I will be around17:26
*** akshai has joined #tripleo17:28
*** leanderthal|afk has joined #tripleo17:29
chemjistr: on your platform could you check the value of neutron::enable_ovs_agent in the hieradata ?17:29
ccamachosee you tomorrow guys, I'll also be watching the patch17:29
*** ccamacho has quit IRC17:30
jistrchem: says 'nil', the key isn't set17:32
jistrchem: https://paste.fedoraproject.org/380187/60983821/raw/17:33
* jistr going for real :)17:33
jistro/17:33
chemjistr: bye :)17:34
*** akrivoka has quit IRC17:34
*** tosky has quit IRC17:34
openstackgerritDan Sneddon proposed openstack/tripleo-specs: Specification for tripleo-lldp-validation blueprint  https://review.openstack.org/32920317:38
chemdprince: would you mind if I make the changes to https://review.openstack.org/#/c/299643/ (the upload scripts), I need a break from the ha-upgrade problem17:38
paramiteHi guys, can somebody give me a hint what I should fix on "ERROR: Failed to validate: : resources.ControllerServiceChain: : Failed to validate nested template: Invalid type (list)"17:38
paramite?17:39
*** mcornea has quit IRC17:39
chemdprince: if you also have some time to explain to me how neutron::enable_ovs_agent can ever be true as NeutronEnableOVSAgent has disappeared from tht, I must be missing something, but I can't find it17:41
*** jprovazn has quit IRC17:44
dprincechem: looking17:47
dprincechem: go ahead and make changes on https://review.openstack.org/#/c/299643/17:49
dprincechem: NeutronEnableOVSAgent was removed, yes. Instead we now set this in your Heat environment's resource_registry: http://git.openstack.org/cgit/openstack/tripleo-heat-templates/tree/environments/neutron-nuage-config.yaml#n717:51
*** mgould is now known as mgould|afk17:52
*** myoung|biab has joined #tripleo17:53
*** myoung|biab is now known as myoung17:54
chemdprince: ok, thanks for the explanation.17:54
*** florianf has quit IRC17:54
*** mcornea has joined #tripleo17:56
matbujistr: marios bandini hewbrocca-afk  so i got a successful upgrade L->M17:57
matbujistr: marios bandini controller+compute+ceph+converge17:57
chemmatbu: what was the trick ?17:57
matbui just had to remove the t-i-e from all the nodes after upgrading17:57
matbuchem: ^17:58
*** electrofelix has quit IRC17:58
matbuchem: i mean major rdo upgrade17:58
chemmatbu: so that's not related to the CI ha-upgrade ?18:00
matbuchem: ha no sorry :(18:00
chemmatbu: hehe, np, good news anyway :)18:01
matbuchem: i need to dig a bit deeper into those jobs, i'm not very familiar with them18:01
*** paramite has quit IRC18:03
*** coolsvap has quit IRC18:06
*** ebalduf has joined #tripleo18:06
*** coolsvap has joined #tripleo18:07
*** rcernin has joined #tripleo18:07
ayoungDuring Openstack Overcloud deploy, how does heat talk with the (new) controller instance?  Is it all via the Metadata server?18:08
jstir_ayoung: ITYM 'OpenStack'.18:08
*** akshai has quit IRC18:08
ayoungjstir_, I actually mean openstack since it is the CLI18:09
trown|lunchlol, trollbot18:09
ayoungso openstack overcloud deploy -e thetimehascomethewalrussaidtotalkofmanythings18:09
*** trown|lunch is now known as trown18:09
chemdprince: oki, I've had a hard look at the paste and all, but I still don't see how https://github.com/openstack/puppet-tripleo/blob/master/manifests/profile/pacemaker/neutron.pp#L52 can become true18:09
*** ooolpbot has joined #tripleo18:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION18:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/159277618:10
*** ooolpbot has quit IRC18:10
openstackLaunchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,In progress] - Assigned to Jiří Stránský (jistr)18:10
*** akshai has joined #tripleo18:11
openstackgerritRyan Brady proposed openstack/tripleo-common: Add baremetal workflows  https://review.openstack.org/30020018:11
*** coolsvap has quit IRC18:12
ayoungtrown, is my diagram correct here: https://adam.younglogic.com/wp-content/uploads/2016/06/VM-config-changes-via-Heat.png18:12
ayoungI assume Heat actually *is* the metadata service but listening on a magic port...18:13
trownayoung: ya that is the one part of your diagram I am not totally sure of... everything up to the heat circle is right, and from os-collect-config on is right18:14
ayoungtrown, what controls all the stages of the update?18:14
*** pkovar has quit IRC18:18
trownayoung: not sure what you mean18:19
trownayoung: the puppet manifests in tripleo-heat-templates have a step variable18:19
ayoungtrown, when you run an update, the log shows all the stages of the deplou18:19
ayoungdeploy18:19
ayounglike18:20
ayoung2016-06-16 17:48:54 [overcloud-AllNodesExtraConfig-htynk23jryte]: UPDATE_IN_PROGRESS Stack UPDATE started18:20
ayoung2016-06-16 17:48:54 [overcloud-AllNodesExtraConfig-htynk23jryte]: UPDATE_COMPLETE Stack UPDATE completed successfully18:20
ayoungerr I can do better18:20
ayoungactually, yah, like that...I realize Heat is processing each of those templates in turn, but is it also synchronizing with the os-collect-config service on the remote host?18:21
trownya when os-collect-config finishes a puppet apply it signals back to heat18:22
trownplease don't ask how ;P /me does not know18:23
*** xinwu has quit IRC18:24
ayoungtrown, I'm sure shady has it in a presentation somewhere18:25
EmilienMchem: it sounds like the workaround does not work https://review.openstack.org/#/c/330682/18:28
EmilienMsame for my patch https://review.openstack.org/33066918:28
chemEmilienM: mine works https://review.openstack.org/#/c/330661/ :)18:29
chemEmilienM: soooo green, I cannot believe it18:30
EmilienMw00t18:31
EmilienMwow18:31
EmilienMchem: can you update commit message ? and we can land it18:31
chemEmilienM: doing18:32
*** csd_ has quit IRC18:32
chemEmilienM: hope it's not a lucky neutron hitting the right hard drive somewhere in a datacenter18:32
EmilienMlol18:32
EmilienMit's worth trying18:32
EmilienMtrown: you ok to land this patch ^ after commit message update?18:33
EmilienMasking other reviewers too: bandini, bnemec, dprince ^18:33
EmilienMchem: it would be even better to have it in puppet-tripleo no?18:34
bnemecEmilienM: Which one are we merging?18:35
chemEmilienM: as I told you, if it was only up to me I would move all those constraints to puppet-tripleo and in the end remove them from tht18:35
EmilienMbnemec: chem did https://review.openstack.org/#/c/330661/ and it seems like it works.18:35
EmilienMchem: ok so we can keep it there now18:36
chemEmilienM: arghhh, but tomorrow we move them all oki ?18:37
bnemecEmilienM: Okay, I only have a vague idea of what that's doing, but since it passed CI I'm okay with merging it.18:37
EmilienMchem: either way works18:37
EmilienMI prefer puppet-tripleo18:37
EmilienMbut since it's passing CI in THT18:37
chemEmilienM: me too18:37
EmilienMlet's do it now in THT18:37
EmilienMand we'll iterate tomorrow18:37
chemEmilienM: I'm adjusting18:37
EmilienMjust update commit message and I propose we land it18:38
*** Guest36238 has quit IRC18:38
chemEmilienM: ack18:38
openstackgerritAthlan-Guyot sofer proposed openstack/tripleo-heat-templates: Colocation make a group for pckm nova resources.  https://review.openstack.org/33066118:42
openstackgerritAthlan-Guyot sofer proposed openstack/tripleo-heat-templates: Colocation make a group for pcmk nova resources.  https://review.openstack.org/33066118:42
*** dmk0202 has joined #tripleo18:44
chemEmilienM: let's hope it passes the gate, but I won't see that in real time.  I'm off.18:45
*** dmk0202 has quit IRC18:45
*** akshai_ has joined #tripleo18:45
*** chem is now known as chem|off18:46
*** saneax_AFK is now known as saneax18:46
*** ebarrera has quit IRC18:46
EmilienMchem|off: thanks18:46
*** akshai has quit IRC18:47
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Composable opencontrail plugin  https://review.openstack.org/32847118:47
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Drop extraconfig for neutron-nuage.yaml  https://review.openstack.org/32774218:47
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Drop extraconfig for neutron-opencontrail.yaml  https://review.openstack.org/32847218:47
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Composable neutron nuage plugin  https://review.openstack.org/32774118:47
*** akshai_ has quit IRC18:50
*** panda has quit IRC18:50
*** panda has joined #tripleo18:50
openstackgerritSanjay Upadhyay proposed openstack/python-tripleoclient: Tripleoclient leaks temporary files  https://review.openstack.org/33063818:51
EmilienMbnemec: I asked infra to unqueue it so we skip all waiting time in the loooonng queue :)18:54
openstackgerritLars Kellogg-Stedman proposed openstack/tripleo-quickstart: return global control of force_cached_image  https://review.openstack.org/33016618:55
*** mcornea has quit IRC18:59
openstackgerritMerged openstack/tripleo-heat-templates: Colocation make a group for pcmk nova resources.  https://review.openstack.org/33066118:59
EmilienMyeah ^18:59
*** yolanda has quit IRC19:00
openstackgerritLars Kellogg-Stedman proposed openstack/tripleo-quickstart: return global control of force_cached_image  https://review.openstack.org/33016619:00
* EmilienM removing alert tag on the bug19:00
openstackgerritEmilien Macchi proposed openstack/puppet-tripleo: Implement Libvirt profile  https://review.openstack.org/32968219:02
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: First iteration of libvirt as a composable service  https://review.openstack.org/32968619:02
openstackgerritEmilien Macchi proposed openstack/puppet-tripleo: Create libvirt micro-service  https://review.openstack.org/32971419:02
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: Enable libvirt as a micro-service  https://review.openstack.org/32971819:03
*** xinwu has joined #tripleo19:05
*** saneax is now known as saneax_AFK19:08
*** ebalduf has quit IRC19:09
openstackgerritRyan Brady proposed openstack/tripleo-common: [WIP] Fix exception within deployment plan actions  https://review.openstack.org/33075519:16
*** fultonj has quit IRC19:23
*** dprince has quit IRC19:27
*** chem|off` has joined #tripleo19:31
*** chem|off has quit IRC19:33
*** MaxPC has quit IRC19:36
*** dmk0202 has joined #tripleo19:40
openstackgerritMerged openstack/tripleo-quickstart: make --requirements cumulative  https://review.openstack.org/33008619:40
EmilienMbnemec: what were your results when you enabled iptables rules by default on the overcloud?19:41
bnemecEmilienM: It breaks HA right now.  I have a series of patches up to fix it, except I just realized I haven't pushed my latest working local branch for review. :-)19:42
EmilienMbnemec: ah19:43
openstackgerritMerged openstack/tripleo-quickstart: return global control of force_cached_image  https://review.openstack.org/33016619:43
*** mbound has quit IRC19:43
openstackgerritBen Nemec proposed openstack/tripleo-heat-templates: Enable firewall by default on the overcloud  https://review.openstack.org/32183319:43
openstackgerritBen Nemec proposed openstack/tripleo-heat-templates: Allow pacemaker ports in firewall  https://review.openstack.org/33024919:43
openstackgerritBen Nemec proposed openstack/tripleo-heat-templates: Stop using deprecated port param in firewall rules  https://review.openstack.org/33075919:43
openstackgerritBen Nemec proposed openstack/tripleo-heat-templates: Allow sahara ports in firewall  https://review.openstack.org/33076019:43
*** dmk0202 has quit IRC19:43
bnemecEmilienM: ^19:44
bnemecThose should get it working.19:44
EmilienMexcellent19:44
openstackgerritJohn Trowbridge proposed openstack/tripleo-quickstart: use the ansible-role-tripleo-inventory to override native inventory  https://review.openstack.org/32993819:48
EmilienMbnemec: I have an idea how to make it composable19:49
EmilienMbnemec: part of our roles19:49
EmilienMbnemec: I'll work on it based on your patches so we can land yours first19:49
EmilienMif you don't mind19:49
bnemecEmilienM: No, that sounds good.  I definitely want to keep the fixes and the composable work separate because I think the fixes will need to be backported.19:50
*** fragatin_ has quit IRC19:51
EmilienMbnemec: definitely19:52
*** lblanchard has quit IRC19:56
openstackgerritMerged openstack/tripleo-quickstart: Move default stopping point to just before overcloud deploy  https://review.openstack.org/33017619:58
*** ebalduf has joined #tripleo20:01
*** timothyb89 has joined #tripleo20:01
*** jayg is now known as jayg|g0n320:03
timothyb89hi all, does anyone here have experience running diskimage-builder behind a proxy? I'm trying to build a nodepool image but hitting proxy errors even with the usual env vars set20:08
*** fragatina has joined #tripleo20:10
*** fragatina has quit IRC20:10
*** fragatina has joined #tripleo20:11
*** jcoufal has quit IRC20:12
rlandyayoung: hello - I'm hitting keystoneauth/identity errors when deploying a baremetal overcloud. During deploy (or sometimes after introspection), "ERROR (ConnectFailure): Unable to establish connection to http://<ip>:5000/v2.0/tokens". Until that point, things were going along fine and all actions requiring auth worked.20:12
ayoungrlandy, did something shoot Keystone?20:13
*** akshai has joined #tripleo20:13
*** noslzzp has quit IRC20:13
*** fzdarsky is now known as fzdarsky|afk20:14
rlandyayoung: I am guessing it just runs out of steam (resources) at some point.20:14
rlandyayoung: I've run a few times on two different baremetal environments20:15
rlandyif I need to run introspection twice, for example, I could hit the error even before attempting deploy20:15
rlandyI could share the env with you if that helps20:16
ayoungrlandy, nope20:16
ayoungrlandy, it's not a Keystone issue AFAICT20:16
ayoungyou just need to figure out what is killing Keystone20:16
rlandyah ok20:16
ayoungif you have the env up and running, look at the logs20:17
rlandyI'm thinking ironic20:17
ayoungjournalctl, /var/log/keystone, etc.20:17
*** ayoung has quit IRC20:18
*** openstackstatus has joined #tripleo20:20
*** ChanServ sets mode: +v openstackstatus20:20
*** lucas-afk has quit IRC20:21
*** dmk0202 has joined #tripleo20:23
*** lucasagomes has joined #tripleo20:28
openstackgerritAndreas Florath proposed openstack/diskimage-builder: Refactor: block-device handling  https://review.openstack.org/31959120:30
openstackgerritJeff Peeler proposed openstack/tripleo-common: [WIP] Fix exception within deployment plan actions  https://review.openstack.org/33075520:32
*** mbound has joined #tripleo20:33
openstackgerritGabriele Cerami proposed openstack/tripleo-quickstart: Update downloaded images to latest delorean repos  https://review.openstack.org/32789820:38
openstackgerritGabriele Cerami proposed openstack/tripleo-quickstart: Move ironic config to post install  https://review.openstack.org/32830020:38
*** ibravo has quit IRC20:46
*** [1]cdearborn has joined #tripleo20:47
openstackgerritMichele Baldessari proposed openstack/tripleo-heat-templates: [WIP] Initial work to dump and restore galera db during major upgrades  https://review.openstack.org/32520520:48
openstackgerritJeff Peeler proposed openstack/tripleo-common: [WIP] Fix exception within deployment plan actions  https://review.openstack.org/33075520:51
*** cdearborn has quit IRC20:53
*** [1]cdearborn has quit IRC20:54
*** rhallisey has quit IRC20:58
openstackgerritStephanie Miller proposed openstack/diskimage-builder: Ironic agent kernel should be owned by user building image  https://review.openstack.org/33078320:59
*** trown is now known as trown|outtypewww21:00
openstackgerritEmilien Macchi proposed openstack/puppet-tripleo: keystone: deploy composable firewall rules  https://review.openstack.org/33078521:04
EmilienMbnemec: it's a PoC but see ^21:04
EmilienMslagle, bnemec: you guys working on firewall, please give any feedback before I continue this thing ^21:04
*** chem|off` has quit IRC21:07
*** shivrao has joined #tripleo21:07
openstackgerritStephanie Miller proposed openstack/diskimage-builder: Ironic agent kernel should be owned by user building image  https://review.openstack.org/33078321:28
*** Guest36238 has joined #tripleo21:29
*** fzdarsky|afk has quit IRC21:29
*** rcernin has quit IRC21:36
*** dmk0202 has quit IRC21:38
*** Goneri has quit IRC21:45
openstackgerritayoung proposed openstack/tripleo-quickstart: Allow for multiple undercloud nodes  https://review.openstack.org/31574921:50
*** chem has joined #tripleo21:50
*** yamahata has quit IRC21:51
*** chem is now known as chem|off21:52
*** panda is now known as panda|Zz22:00
openstackgerritBen Nemec proposed openstack/tripleo-docs: Add Liberty and Mitaka admonitions and use them  https://review.openstack.org/33080222:01
openstackgerritEmilien Macchi proposed openstack/puppet-tripleo: keystone: deploy composable firewall rules  https://review.openstack.org/33078522:02
*** dmk0202 has joined #tripleo22:04
*** egafford has quit IRC22:06
*** ayoung has joined #tripleo22:07
openstackgerritEmilien Macchi proposed openstack/puppet-tripleo: Implement Libvirt profile  https://review.openstack.org/32968222:09
openstackgerritEmilien Macchi proposed openstack/puppet-tripleo: Create libvirt micro-service  https://review.openstack.org/32971422:10
*** akshai has quit IRC22:11
openstackgerritayoung proposed openstack/tripleo-quickstart: Provision Identity VM  https://review.openstack.org/32833522:14
ayounglarsks, so...kind of harsh with the -2s there.  Do you really object that strenuously to the approach?  I'm not certain I can actually meet your suggestions.22:16
openstackgerritEmilien Macchi proposed openstack/puppet-tripleo: nova/api: include ::nova::network::neutron  https://review.openstack.org/32952922:16
openstackgerritAthlan-Guyot sofer proposed openstack/tripleo-heat-templates: WIP: integration of the new puppet pacemaker.  https://review.openstack.org/30240922:16
*** weshay has quit IRC22:18
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: compute: align rabbitmq configuration with nova-base service  https://review.openstack.org/33002222:18
EmilienMayoung: looking at https://review.openstack.org/#/c/328373/22:22
EmilienMso you want ipa server on undercloud, right?22:23
ayoungEmilienM, yeah22:23
*** ebalduf has quit IRC22:23
ayoungEmilienM, but it needs to meet a few different scenarios22:23
EmilienMayoung: why not contriting to instack undercloud?22:23
EmilienMcontributing*22:23
ayoungEmilienM, because I thought instack and quickstart were merging22:24
EmilienMayoung: afaik we use puppet to set up & configure things, no?22:24
ayoungI can put it anywhere, but I went through the work to get it here because this seems to be the development tool of choice22:24
ayoungEmilienM, that would be a very different set of code22:24
ayoungEmilienM, this is all ansible based stuff we did last summer coming on over22:25
*** rlandy has quit IRC22:25
EmilienMso iirc, we can install things with puppet with instack undercloud OR with ansible with quickstart?22:25
EmilienMs/iirc/iiuc/22:25
ayoungthere was some work done on getting IPA set up by Puppet back when Lynn Root worked in our group, but I have not looked at it in a long time22:26
EmilienMI'm not sure we take a right approach here22:26
EmilienMeveryone is working on instack-undercloud now22:26
EmilienMwe integrated recent services, etc22:26
EmilienMmaybe I'm wrong22:26
EmilienMslagle: do you have thoughts on ^ ?22:26
ayoungSo...I need to put IPA on a separate machine as it conflicts ports with Swift22:26
ayoungand, as I said, we are trying to mirror a common deployment where IPA or some other IdP already exists22:27
ayoungso, when I said "on undercloud" I really meant "on a separate VM next to the undercloud"22:27
EmilienMah ok !22:27
EmilienMI was confused :)22:27
ayoungEmilienM, IPA is very opinionated, and, since it configures the Apache server, it will get in the way of Keystone and Horizon as well22:28
EmilienMI'm off now22:28
ayoungif we ever put Horizon on the undercloud....22:28
ayoungthanks for your interest22:28
EmilienMok22:28
EmilienMayoung: I'll look at it22:28
EmilienMthanks!22:28
ayoungEmilienM, I am just not sure I can do what larsks is asking for in the last of the three patches: make it a separate role22:29
ayoungthe Identity VM is generated when you build the overall structure22:29
ayoungHe might not want that, and would instead have it generated as its own role, but that is a much more significant rewrite22:29
*** xinwu has quit IRC22:30
*** dmk0202 has quit IRC22:39
*** xinwu has joined #tripleo22:44
*** bfournie1 has quit IRC22:48
*** saneax_AFK is now known as saneax22:49
*** ayoung has quit IRC23:07
*** pradk has quit IRC23:07
openstackgerritIan Wienand proposed openstack/diskimage-builder: Pre-install pip/virtualenv packages  https://review.openstack.org/32747223:15
openstackgerritIan Wienand proposed openstack/diskimage-builder: Pre-install pip/virtualenv packages  https://review.openstack.org/32747223:21
slagleEmilienM: honestly, i'm not familiar with ipa enough to know if it should be an integrated service on the uc or not23:37
slagleit would depend on what types of resources and configuration it needs23:37
slagleand if it plays nicely alongside other services23:37
*** chlong has quit IRC23:42
*** ayoung has joined #tripleo23:44
*** Guest36238 has quit IRC23:46
*** mbound has quit IRC23:59
