Thursday, 2016-03-03

openstackgerritRichard Su proposed openstack/tripleo-heat-templates: Store events in Ceilometer
openstackgerritJohn Trowbridge proposed openstack/tripleo-heat-templates: Manage keystone initialization directly in t-h-t manifests
*** panda has quit IRC03:37
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Add Management Network For System Administration.
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Add all isolated networks to all nodes.
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Support network isolation without external nets
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Add network templates for multiple NIC configuration
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Fix yaml validation errors in multiple-nics templates
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Adding ManagementIpSubnet to linux bridge net conf
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Change default bond-mode
openstackgerritMerged openstack/diskimage-builder: Add Gentoo to the list of supported distributions
*** jtomasek has joined #tripleo06:33
openstackgerritSergey Gotliv proposed openstack/puppet-tripleo: Sahara integration
*** jaosorior has joined #tripleo07:15
openstackgerritSergey Gotliv proposed openstack/tripleo-heat-templates: Sahara Integration
openstackgerritSergey Gotliv proposed openstack/tripleo-heat-templates: Removing Sahara password default
jaosoriorIs anyone getting these errors still? Error: /Stage[main]/Keystone/Exec[keystone-manage bootstrap]: keystone-manage bootstrap --bootstrap-password MQ67mevhB4RRReeR6bnz2ujqz returned 1 instead of one of [0]07:40
hewbroccajaosorior: morning08:12
jaosoriorhewbrocca: Hey dude08:13
hewbroccaAnyone have any idea if dprince was able to get the testenvs reworked for network isolation last night08:14
*** shardy has joined #tripleo08:37
*** devvesa has joined #tripleo08:39
jaosoriormarios, shardy: Have you guys seen this error when doing an HA deployment? Error: /Stage[main]/Keystone/Exec[keystone-manage bootstrap]: keystone-manage bootstrap --bootstrap-password MQ67mevhB4RRReeR6bnz2ujqz returned 1 instead of one of [0]08:42
mariosjaosorior: yeah i think there was a bug for it, lemme check08:43
mariosjaosorior:  might be this (bug linked in review ) "controller/ha: disable keystone-manage bootstrap."08:44
jaosoriormarios: Ah, nice, thanks!08:45
ishanthi folks, Can someone please review ?09:08
jprovaznOC fails to deploy with "DiscoveryFailure: Could not determine a suitable URL for the plugin" in neutron logs - , anyone hit this?09:40
openstackgerritMarios Andreou proposed openstack/tripleo-heat-templates: Introduce a UpgradeScriptDeliveryWorfklow as part of tripleo upgrades
shardyshadower: yes, sec10:35
hewbroccaLOL yes10:36
shadowerI admit, the question was more about "can someone paste the link here"10:36
shardyshadower: that's the liberty backport, the master commit is referenced in the t-h-t commit:10:37
shadowershardy: thanks10:37
shardy"Note that this depends on a fix for heat bug #1539737"10:37
openstackbug 1539737 in heat liberty "str_replace params can't be from parameters" [Undecided,Fix committed] - Assigned to Steven Hardy (shardy)10:37
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: Upgrades: install zaqarclient
lucasagomesshardy, morning. Any updates re UEFI?10:58
*** tosky has joined #tripleo10:59
shardylucasagomes: Hi, not yet, been working my way through morning code reviews first :)11:00
lucasagomesshardy, oh fair enough, lemme know if something is needed11:01
shardylucasagomes: sure, will do, thanks for the help!11:01
dtantsurliberty backport ^^11:03
shardydtantsur: Hey, FYI I tested that and it works really nicely :)11:03
dtantsurcool :)11:04
shardydtantsur: question re backport - is it reasonable to expect existing users, e.g RDO, to have images with the python-hardware library?11:04
dtantsurshardy, we've merged the appropriate change quite some time ago... but I can swap the default to False, if it makes you feel safer11:04
dtantsur(then we'll have to make it True again downstream, I guess)11:05
shardydtantsur: No, it's fine, I just wanted to confirm - we could check with trown, but I'm just wary of anything that might break folks with existing images on update11:05
dtantsurtrown|outtypewww, ^^^11:05
shardyAny idea which commit switched the image build to include the library?11:05
dtantsurlemme find11:05
dtantsurshardy, merged 21 days ago11:06
shardyOk, let's chat with trown before landing then, as that's not been present in tripleoclient for a super-long time11:07
dtantsurshardy, on one hand we don't want people to get broken by just updating u-c.. on the other, this change only matters when (re)building undercloud11:07
shardydtantsur: that's fine with me, just need to call it out in the commit message11:09
dtantsurdefinitely.. then we'll reenable it with a downstream patch11:09
dtantsuralso good morning pic for this channel:
hewbroccavery nice11:33
shadowerError: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class ::nova::db::mysql_api at /var/lib/heat-config/heat-config-puppet/420b2f5a-cc4a-4b90-bd91-6265421ac0fc.pp:506 on node overcloud-controller-0.localdomain11:34
shadoweranyone else seeing that?11:34
shardydtantsur: hey, I was looking to revive so we can land it for mitaka (allow yaml nodes file)11:40
shardyI'm not super familiar with openstackclient - I assume the --type is wired in as a common option to osc, and we can deprecate the old json/csv options?11:40
shardyI just copied what was there, obviously ;)11:40
dtantsurshardy, --type is not something standard afaik, I'm just suggesting a more OSC-ish way to handle it..11:41
dtantsurshardy, actually I see much more value in detecting the format by extension11:41
shardydtantsur: sure, I agree - I was just a bit confused by the "please deprecate old options" remark11:42
dtantsurcause really typing --yaml file.yaml is a bit lame :)11:42
shardye.g it implies there's a "new" way I should be doing it :)11:42
dtantsurshardy, well... ideally it should be --file-type=yaml --file-type=csv etc11:42
shardydtantsur: agreed, I can for sure add some auto-detecting stuff11:42
dtantsurwhat we do now feels a bit wrong.. but I can live with it, especially if we add autodetection11:43
shardydtantsur: Ok, thanks - I'll try reworking it later, just wanted to confirm as I couldn't see any osc type args to plug into11:43
dtantsurshardy, yeah, it's not an existing thing, it's just an example how we should have implemented it11:44
shardyack, I'll see what I can do to improve it, thanks!11:44
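A minimal sketch of the extension-based detection discussed above. The helper name `detect_nodes_format` and the extension table are hypothetical, chosen only for illustration; this is not the tripleoclient implementation. The formats json, csv and yaml are the ones named in the discussion.

```python
import os

# Hypothetical mapping from file extension to format name.
# json, csv and yaml are the formats mentioned in the discussion.
_EXTENSION_FORMATS = {
    ".json": "json",
    ".csv": "csv",
    ".yaml": "yaml",
    ".yml": "yaml",
}


def detect_nodes_format(path, explicit=None):
    """Return the nodes-file format.

    An explicitly requested format wins; otherwise the format is
    guessed from the file extension (case-insensitive).
    """
    if explicit is not None:
        if explicit not in set(_EXTENSION_FORMATS.values()):
            raise ValueError("unsupported format: %s" % explicit)
        return explicit
    ext = os.path.splitext(path)[1].lower()
    try:
        return _EXTENSION_FORMATS[ext]
    except KeyError:
        raise ValueError("cannot detect format from extension %r" % ext)
```

With something like this, an explicit option is only needed when the extension is missing or misleading, which avoids having to type `--yaml file.yaml`.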
*** ohamada has quit IRC11:45
*** ohamada has joined #tripleo11:48
slagleshardy: hi, when you get a chance could you re-review, i think i made the improvement you were asking for11:53
shardyslagle: thanks, sorry I thought I'd done that one yesterday11:56
*** olap has joined #tripleo11:56
shardyslagle: you can actually unregister nodes via e.g the spacecmd tool which talks to the xmlrpc API11:56
shardybut I guess we won't have credentials to do that11:56
*** cmyster has quit IRC11:56
slagleshardy: i didnt know about spacecmd, but it doesnt look like it's installed by default on rhel11:57
openstackgerritMerged openstack/tripleo-heat-templates: Add Satellite 5 support
openstackgerritJames Slagle proposed openstack/tripleo-heat-templates: Add Satellite 5 support
*** saneax is now known as saneax_AFK12:02
*** rhallisey has joined #tripleo12:17
*** rhallisey has quit IRC12:18
*** rhallisey has joined #tripleo12:18
*** gfidente has joined #tripleo12:21
gfidentemarios, jrist I have something I want to share with you guys12:22
hewbroccagfidente: I'm dying of suspense12:25
mariossup gfidente12:28
*** jaosorior has quit IRC12:39
gfidenteslagle, I knew you would say it12:39
gfidentebut you didn't see jrist's owl that's why12:39
*** jaosorior has joined #tripleo12:40
*** dprince has joined #tripleo12:40
shadowergfidente: did you come across these deployment errors?
shadowerI'm getting them when trying to deploy the overcloud (both master THT and with the ipv6 patches)12:41
gfidenteshadower, nope, I had an old version of memcached so memcached doesn't bind on the ipv6 address though12:42
gfidentebut you don't see that? are you deploying on centos?12:42
shadowergfidente: on rhel12:42
*** dprince has quit IRC12:46
*** masco has quit IRC12:46
gfidenteshadower, so I'm going to try that, back in a few12:47
shadowercould you point me in the right direction on debugging these? I looked at the
shadowerbut it looks like the other ones12:48
openstackgerritMerged openstack/tripleo-heat-templates: Moves the swift start/stop into the file
shadowergfidente: ^12:49
shadowerso could that mean that ::nova::db::mysql_api module isn't on the machine? Where/how could I check that?12:49
gfidenteshadower, so the error on the compute looks like the version of puppet-nova is not good for the templates12:51
gfidentethe module expects neutron_admin_password and we are not passing it12:51
gfidenteif this was changed in the puppet module we probably need to figure out which patch it is and add it into OPM12:52
gfidenteand I'd say this is same reason for the error you see on the controller too12:52
shadoweryeah I was afraid of that12:53
*** dprince has joined #tripleo12:53
gfidenteI think we need to git blame puppet-nova and see what we miss12:54
gfidenteshadower, I'd say this is the problem we're trying to fix with the idea of deploying the modules as artifacts12:57
gfidentedprince, ^^12:57
gfidente(I mean in general)12:57
dprincegfidente: is something wrong w/ puppet-nova today?12:59
gfidentedprince, no not in CI13:00
gfidenteit's probably the OPM builds13:00
gfidentewas just pointing out we figured this problem existed and had ideas13:01
*** miles is now known as mgould13:01
gfidente*I should say exists13:01
dprincegfidente: the puppet module swift thing finally landed.13:02
gfidenteyes exactly13:02
slaglejprovazn: a single instance on the overcloud? or the undercloud?13:11
jprovaznslagle: on undercloud13:11
dprincehewbrocca: progress... I'm trying not to break CI so testing very carefully13:11
hewbroccadprince: excellent13:11
hewbroccaObviously, there is a reason I keep harassing you about this... I'm anxious to get IPv6 landed13:13
gfidentejayg, I think I found the two submissions which blocked shadower13:20
gfidentejayg, I am not sure what is the approach here though13:20
jayggfidente: ah, cool, was going to say that I just updated opm yesterday or the day before, so I could give you a commit list to puppet-nova13:20
gfidenteif we prefer to port those to liberty or to not use the features in tht13:20
gfidentejayg, I think we need them in liberty eventually because this is for the liberty release13:21
gfidentedprince, what's your take?13:21
gfidentedo we port the puppet nova changes to liberty too or rather not use them in tht/liberty?13:21
gfidentecause you own both13:21
jaygI have no idea if they are needed in liberty, I haven't looked at these before, though I see EmilienM was a reviewer, so he may know13:22
jaygoh, and co-author on one13:22
gfidentejayg, I think porting is good13:23
jaygyeah, if it is not a hardship, and will make less change from liberty -> mitaka, I would go for it13:23
gfidenteme too13:23
gfidentelet's see if it's hard :)13:23
gfidentejayg, which one you take?13:24
jayggfidente: haha, you mean liberty backport or changing tht?13:24
gfidenteporting the changes to liberty13:24
jayggfidente: I don't care, I can take the api_db one if you want13:26
gfidentejayg, no that actually seems to go for mitaka only13:27
gfidenteso we need a change to not do that in tht I think13:27
gfidentefor tht/liberty13:28
*** trown|outtypewww is now known as trown13:28
jaygah, tht/liberty is already doing it the mitaka way?13:28
*** rlandy has joined #tripleo13:28
* jayg confused13:28
shardyI saw an issue today where if you delete an in-progress stack, nova can 500 which causes the delete to fail, but this sounds different13:35
trownshardy: this happens with successful stacks and unsuccessful stacks, and the stack actually gets fully cleaned up except for the heat database13:36
trownshardy: as in all of the nova/neutron stuff gets deleted, but there is still a DELETE_FAILED stack sticking around, and retrying the delete fails every time13:37
trownshardy: rhallisey hit the same thing13:37
trownshardy: it is really easy to reproduce using trunk heat, what other detail would be helpful?13:38
shardytrown: if you can raise a launchpad bug with more details that would be good - even better if there's a non-TripleO (e.g devstack) reproducer13:42
shardywe have functional tests for deleting stacks, so it'd be a little surprising if deleting is completely broken13:43
trownshardy: have you been using trunk heat? or the repo setup of tripleoci13:43
shardye.g trace the cause of the DELETE_FAILED back from the heat-engine.log13:43
trownshardy: the main evidence of it not being tripleo specific is the simple ping test stack on the overcloud failing in the same way13:44
trowndelete failing I should say, which makes the ping test hang13:45
dtantsurliberty backport that passed gate, 1x +2 required: :)13:45
shardytrown: ack - I may have a slightly older (week old) heat actually, it's locally built but I've not updated for a few days due to PTO13:45
shardytrown: let me pull latest and try to reproduce13:45
trownshardy: ah yep, this started yesterday13:46
shardytrown: Ok, I pulled latest heat and so far can't reproduce, can you please raise a LP bug including the heat-engine traceback associated with the DELETE_FAILED?14:04
shardythere are quite a few possible reasons for a delete to fail, not all heat bugs :)14:04
trownshardy: sure will do14:04
trownshardy: I will try to get a trace from both the pingtest and the overcloud scenarios14:05
shardytrown: great, that should give us a few clues, thanks!14:05
jaosoriorshardy: What was the error you were seeing when deleting the overcloud stack? I'm seeing something where deleting the stack just stalls when trying to delete the nova servers14:05
jaosoriorhappens if I deploy with network isolation14:05
shardyjaosorior: nova was returning a 500 when interrupting a deployment and trying to delete the stack when the nodes were still spawning (ironic nodes in wait-call-back)14:06
shardynot yet fully investigated, it may be a nova or ironic bug tho14:06
shardylucasagomes: hey, so I re-tested with boot_mode:uefi, and got some strange results14:07
jaosoriorshardy: That's different in my case then. It just stalls trying to delete the nova servers14:07
shardyjaosorior: I'd check the nova logs I guess14:08
shardylucasagomes: so, with just boot_option:local (no boot_mode:uefi), the box will pxe boot OK, and the image appears deployed, but then won't boot (assume broken bootloader, just a flashing cursor, no grub)14:08
shardylucasagomes: however, if you then add boot_mode:uefi, it won't pxe boot anymore, but something happens which fixes the bootloader, so it will always boot from the local disk14:09
shardye.g the image deployed without boot_mode:uefi14:09
shardyI can't figure out a way to have both the deploy pxe boot part, and the local disk-image boot part work14:09
lucasagomesshardy, interesting, I think what fixes the local boot is that it will create an EFI partition14:10
lucasagomesshardy, lemme look at the code to see if I can spot something about the PXE boot14:10
shardylucasagomes: aha14:10
lucasagomesshardy, does it show any error?14:10
shardylucasagomes: no, it just drops through the pxe boot stage as if the server isn't responding, then boots from the local disk14:11
shardylucasagomes: if I remove boot_mode:uefi then pxe boot & the deploy ramdisk work fine again14:12
lucasagomesshardy, oh, so there's an option called "uefi_pxe_bootfile_name" and "uefi_pxe_config_template"14:12
lucasagomesshardy, you may want to set it to the same values as the pxe_bootfile_name and pxe_config_template14:12
lucasagomessince you are using that compatibility mode14:12
lucasagomesby default it will try to load the uefi pxe template and the efi ROM14:12
lucasagomeswhich may not be present in ur system14:12
lucasagomesthe options are under the [pxe] section14:13
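A minimal sketch of the suggestion above, assuming the configuration can be read as a plain INI file: it copies `pxe_bootfile_name` and `pxe_config_template` from the `[pxe]` section onto `uefi_pxe_bootfile_name` and `uefi_pxe_config_template`. The helper name `mirror_pxe_options` is hypothetical, and the values in the usage example are placeholders, not real defaults.

```python
import configparser


def mirror_pxe_options(conf_text):
    """Copy the legacy PXE settings in [pxe] onto their uefi_* counterparts.

    Returns the parsed configuration; options missing from [pxe]
    are left untouched.
    """
    # Interpolation is disabled so values are copied verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    for name in ("pxe_bootfile_name", "pxe_config_template"):
        if parser.has_option("pxe", name):
            parser.set("pxe", "uefi_" + name, parser.get("pxe", name))
    return parser


# Placeholder values for illustration only.
example = """
[pxe]
pxe_bootfile_name = example.bin
pxe_config_template = /tmp/example.template
"""
conf = mirror_pxe_options(example)
print(conf.get("pxe", "uefi_pxe_bootfile_name"))
```

This only makes sense for the compatibility-mode case described in the conversation, where the machine PXE boots with the legacy boot file even though `boot_mode:uefi` is set.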
*** links has joined #tripleo14:14
shardylucasagomes: I see, got it, changed those and re-testing, thanks!14:15
lucasagomescool, let's see how it goes14:15
lucasagomesI've never tried this UEFI pxe compat thing, so it's new for me14:16
*** tzumainn has joined #tripleo14:16
*** Goneri has quit IRC14:17
EmilienMcan we +A please ? reviewed lot of times, CI passing, etc...14:18
trownEmilienM: CI is not passing on that14:19
*** lblanchard has joined #tripleo14:19
EmilienMtrown: HA job is failing14:19
EmilienMbut if you look in the CI history, my patch used to pass14:19
trownEmilienM: ya, HA job is voting14:19
EmilienMand it's not related14:19
shardylucasagomes: cool - it's ipxe loading the deploy ramdisk now :)14:20
lucasagomesshardy, w00t, let's see how the deployment goes14:20
hewbroccaindeed, could we land that?14:21
dtantsurUEFI needs a lot of testing..14:21
lucasagomesdtantsur, yeah... it's a diff case here too cause shardy has some PXE compat layer for UEFI14:21
lucasagomesso it doesn't need to use the EFI ROM for pxe booting it14:22
dtantsurmeh, so many combinations14:22
lucasagomeswe need to come up with some good documentation for such things14:22
lucasagomesdtantsur, exactly, we have so many permutations that it makes it hard to write about it concisely14:22
shardyYeah, this may be something of a special case, because it's a fairly new desktop machine14:23
*** tiswanso has joined #tripleo14:23
lucasagomesshardy, yeah, I'm happy we can support it with what we have already14:23
lucasagomesbut would be good to make it more usable if we have better instructions about how to set it up14:23
shardy(tarox box with an asus z170m motherboard if anyone else has same/similar)14:23
*** tiswanso has quit IRC14:24
*** tiswanso has joined #tripleo14:24
lucasagomesshardy, cool, I added to my todo list to try to come up with some sane documentation for UEFI14:25
*** tosky has quit IRC14:25
trownjdob: we are just merging things without even all CI passing now?14:26
* trown gives up14:26
shardylucasagomes: sounds good - I'll probably document what I have from a TripleO perspective too, even if it's just a blog post14:26
jdobtrown: that was a special request from EmilienM since the failure wasn't related to his change14:26
lucasagomesshardy, nice, thanks for that!14:26
jdobthat was also like three seconds ago, you're on top of shit :)14:26
lucasagomesshardy, did the whole deployment go through now?14:27
trownjdob: ya I was on that review :p14:27
*** tosky has joined #tripleo14:27
shardylucasagomes: still deploying the image14:27
* shardy goes to make tea14:27
trownit is hard enough unfubaring trunk without us merging stuff that is not even passing our mocked up world CI14:28
jdobok, i have both EmilienM and trown telling me "i give up"; you two sort this shit out14:29
EmilienMi'm fixing bugs, my patch pass CI, got never merged, I spent time to rebase them, not pass CI all the time.14:29
EmilienMand now it's not approved14:29
EmilienMso yeah, take my patch if you want or I can abandon it14:30
*** links has quit IRC14:30
jdobi removed the +A; again, you two sort it out14:31
*** rbrady has quit IRC14:31
*** rbrady has joined #tripleo14:31
slaglebnemec: shardy : can i get a backport +a?
slagleor jdob , you seem to be in the review mood ^14:36
trownEmilienM: you cant complain about trunk delorean not moving forward and want to merge stuff that did not pass CI14:36
trownEmilienM: those are mutually exclusive views of the world14:36
EmilienMtrown: did I complain about trunk delorean?14:37
openstackgerritJohn Trowbridge proposed openstack/tripleo-heat-templates: Manage keystone initialization directly in t-h-t manifests
trownEmilienM: well you have moved puppet off of the current-passed-ci link...14:37
trownEmilienM: I assume because that is not updating frequently enough14:38
EmilienMtrown: how is it a problem for you, I'm ahead of current-passed-ci so I can see issues earlier and fix in Puppet when needed14:38
*** absubram has joined #tripleo14:38
EmilienMit's actually a good thing for you that I'm ahead14:38
EmilienMwe're gating things earlier so fix faster and reduce your workload on fixing upstream issues14:38
jristlol gfidente nice owl14:38
trownok... lets just merge everything... let downstream sort it out14:39
EmilienMand increase the chance of successful weirdo jobs for puppet openstack14:39
EmilienMtrown: I think we misunderstand, I never asked for that. I don't understand your concern about "Puppet OpenStack CI not using current-passed-ci"14:39
EmilienMtrown: what is the problem with that?14:40
*** trozet has joined #tripleo14:40
trownEmilienM: as long as it is not a response to the RDO promote job always failing, I have no problem14:41
trownEmilienM: I took it as a reaction to that, which I think is a mistake to react that way14:41
EmilienMyou misunderstood I think14:41
trownfair enough14:42
shardylucasagomes: so, it definitely got further, it appears to have written the image, unmounted the filesystem, but then got stuck before signalling ironic and halting the machine14:42
shardythe node is stuck in "deploying", and the deploy ramdisk appears to have hung14:42
EmilienMtrown: but anyway: again, puppet openstack CI is using an RDO URL ahead of current-passed-ci because we don't want to wait for you to promote and find bugs we need to fix in puppet modules - we want to go ahead and make sure "when you promote, your puppet jobs will pass" - does it make sense?14:43
* shardy may have to look for a legacy boot mode option ;)14:43
trownEmilienM: sure. as much as any of our spaghetti of CI everywhere makes sense.14:44
jaygif anyone has time, have a simple patch that could use a review -
EmilienMtrown: weirdo makes sense to me, fwiw14:44
lucasagomesshardy, hmm... and nothing in the logs? Because Ironic should be instructing the ramdisk to power off now14:45
trownEmilienM: weirdo just runs your jobs :p14:45
trownin a slightly different context14:45
trownit makes the most sense of anything to me too fwiw14:45
shardylucasagomes: ah, it's not hung, it's stuck in a loop, where it keeps mounting, then unmounting sda3 every few minutes14:47
shardyI see corresponding activity in the conductor log, sec14:47
lucasagomesshardy, lemme try to look at the code14:48
lucasagomesI wonder if it's the local boot code that may be broken14:49
*** pradk_ has joined #tripleo14:53
*** pradk has quit IRC14:53
*** pradk_ is now known as pradk14:53
*** dustins has joined #tripleo14:55
shardylucasagomes: aha, there is an error attempting to install the bootloader:14:56
*** Goneri has joined #tripleo14:56
shardyIt seems to just try again rather than fail after that14:57
*** liverpooler has quit IRC15:02
dprinceJust a heads up there are going to be a few intermittent CI failures from the last 2 hours15:04
dprinceanything that ran on testenv9 has failed... and it was my fault15:04
dprincefixing it now...15:04
*** rcernin has joined #tripleo15:04
slaglebnemec: since you merged the add swap patch...if you could also add this to your review queue...
jdobthanks for the heads up15:16
trownI would like folks to be extra careful with the HA job, since that is precisely what I am trying to fix on trunk15:16
EmilienMtrown: do you think they should be voting?15:17
EmilienM(voting like, blocking Gerrit merge for real)15:17
*** bvandenh has quit IRC15:17
trownEmilienM: as in on the gate queue?15:17
trownyes... that would be ideal15:17
trownderek has tried to explain to me why that creates problems for tripleo, but I do not fully grok the argument15:18
*** hjensas has quit IRC15:18
rhalliseytrown, are you having heat stack-delete success today?15:27
trownrhallisey: nope... I did just manage to get a trace in heat-engine.log... is this what you are seeing
trownrhallisey: odd thing is that the ping test also fails to delete the overcloud, but I would assume it is for a different reason... havent got a deployed overcloud yet today to find out15:29
hewbroccaEmilienM, trown you guys got an arrangement worked out?15:30
trownhewbrocca: that makes it sound like we are getting married15:30
rhalliseytrown, I did notice that it was able to delete pretty much everything except the controller and compute stacks15:30
trownwhich my wife would not be pleased about15:31
hewbroccaMaybe we could merge some more of EmilienM patches while we're at it :)15:31
EmilienMtrown: lol15:31
*** hjensas has joined #tripleo15:34
trownrhallisey: any chance you want to file a heat bug for the stack-delete issue? I will get to it later today if not15:36
rhalliseyI like your logs I may steal them15:36
trownfor sure15:37
trownI just learned about this week15:37
trownpretty neat service15:37
slagleneat, that looks even easier to use than clbin15:40
trowncredit to larsks for the suggestion15:41
dtantsurfolks, please review a tiny liberty fix, found during internal testing:
d0ugalAnyone around that understands the interaction between t-h-t and puppet-swift?15:49
trownd0ugal: anything in particular?15:50
trownd0ugal: I think is what drives creating the rings... but I dont think there is any logic for adding a new node to the existing rings15:57
*** aufi has quit IRC15:57
hewbroccathere wouldn't be, no15:57
hewbroccait's hard15:57
hewbroccayou actually need something like a resource agent15:57
d0ugalslagle: ^15:57
*** yamahata has joined #tripleo15:57
trownwhich then makes scaling a bit of a non-starter15:57
*** jprovazn has quit IRC16:01
pradkCould i please request some more reviews so we can get this in -
pradktrown, looks pretty neat, nice find16:06
*** rcernin has quit IRC16:06
shardyBe good if we could land this os-cloud-config change:
trowncould someone who understands how the puppet managed keystone init is supposed to work look at that THT patch ^... the pacemaker case is still not working16:15
openstackgerritMarios Andreou proposed openstack/tripleo-common: Adds override for the overcloud node user in upgrade-non-controller
jistrd0ugal: yea, from what you post it looks like scaling swift might have never been supported16:30
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Make OpenStack service ports configurable in HAProxy
jistrd0ugal: proper ring-amending logic, i didn't look into this at all but it might not be very easy to achieve unless someone has already implemented something that we can just plug in (even then it would need integration work, testing incl. upgrades). IIRC one cannot just recreate new rings from scratch, as that could result in a reshuffling of what's stored where, which is very undesirable. Not sure if it would mean data loss too. This is just16:39
jistrmy non-swift-expert view of things so it's better to check with someone else i think, perhaps someone from swift team directly.16:39
jistri saw a doc about it somewhere16:40
*** olap has quit IRC16:40
openstackgerritAthlan-Guyot sofer proposed openstack/tripleo-heat-templates: Fix password issue with mysql address for ceilometer
d0ugaljistr: ugh, okay. I think I am convinced that this is way over my head anyway16:46
d0ugaljistr: Thanks for the input.16:46
openstackgerritJason Guiditta proposed openstack/puppet-tripleo: loadbalancer: fix Redis timeout HAproxy config
trowndprince: seems like the puppet keystone bootstrapping is not working17:10
trown and are evidence of that17:11
dprincetrown: yes, but give me a few. I'm building new testenv's ATM for the network ISO job...17:12
trowndprince: ah right, forgot about the other fire, no problem17:13
jistrd0ugal: been scanning through code a bit more, it looks like we generate the rings on each node separately, and count on the operation being deterministic. There seems to be some support in puppet-swift to manipulate the rings and we use it, but i don't know if we can count on ring *updates* being deterministic too. Perhaps a more usual way would be to build the rings ahead of time in a single place and feed the same ring files to all nodes17:16
jistras-is. This might get easier with splitting HW vs. SW stack. Deploy HW -> generate/amend rings from HW stack info -> deploy SW (e.g. fencing config would benefit from split stack exactly the same way). I'd probably have to spend a lot more time on this to be of any real help though, i haven't played with Swift in TripleO at all yet.17:16
*** Marga_ has quit IRC17:16
*** mbound has quit IRC17:16
*** Marga_ has joined #tripleo17:17
d0ugaljistr: heh, I just started to learn about it this week. You have been very helpful, if nothing else we can better evaluate how much work is required.17:19
*** fgimenez has quit IRC17:21
gfidentejistr, d0ugal I was thinking to replace swift with ceph rgw entirely ...17:22
gfidentebut hey swift is probably one of the most successful openstack core projects17:23
*** jaosorior has joined #tripleo17:26
*** jistr has quit IRC17:33
jaosoriorit really blocked a lot of stuff for me17:50
jaosoriorand I see nothing in the nova logs17:50
bnemecAlthough I would have to get back to a point where I can deploy a stack to know for sure.17:51
bnemecI think it's probably a Heat problem.17:51
bnemecIn my case I don't think it was even getting to where it would talk to Nova.17:51
d0ugalgfidente: Yeah, I think lots of people want to use it17:51
jaosoriornow I'm trying to deploy with stable/liberty17:51
jaosoriorlets see if that helps17:51
gfidented0ugal, it's actually easier to manage on our side I think17:52
gfidenteand if we really want to use ceph by default for cinder a/a17:52
jaosoriorbnemec: If you get some time, could you take a quick look at this CR? I've been trying to test that too, and it fails in the CI with problems with the database connection... which is damn strange, since I don't touch those ports :/17:52
gfidentethen one could scale storage for all glance/swift/cinder by just scaling ceph17:53
gfidentebut there are downsides too, like the api isn't really on a 1.1 feature parity17:53
gfidentetty tomorrow guys!17:53
jaosoriorgfidente: Have a good one dude17:54
gfidentejaosorior :)17:54
*** ccamacho has quit IRC17:54
trownrhallisey: thanks... I do not have the delete_in_progress for a few hours symptom... I get to delete_failed pretty quick18:00
rhalliseyoh really18:00
rhalliseymine just hangs18:00
rhalliseyadd your comments because we are doing slightly different things18:00
trowndo you have some automation around heat stack-delete? is it possible it is failing because of the new confirmation in heat client?18:01
trownwill do18:01
*** masco has joined #tripleo18:02
jaosoriorrhallisey: I see the same fwiw18:03
jaosoriorit just stalls18:03
rhalliseyjaosorior, ya it's lame :/18:03
jaosoriorand stays in DELETE_IN_PROGRESS18:03
bnemecjaosorior: I'm seeing this in the hieradata from the failed SSL CI run: controller.yaml:neutron::server::notifications::auth_url: ://:/v318:04
bnemecI'm kind of amazed it isn't failing harder than it is.18:04
*** hjensas has quit IRC18:05
*** ohamada has quit IRC18:05
jaosoriorbnemec: What the hell18:05
bnemecI'm so tired of this notification auth BS.18:06
jaosoriorbnemec: Well. I do remember there were some changes recently related to the nova-neutron configuration... just don't remember exactly which18:06
jaosoriorbnemec: pretty big headache18:06
jaosoriorbnemec: So it seems somewhere it's not being set up... or some url is messed up (might be a typo somewhere... or something of the sort)18:07
bnemecjaosorior: Oh, I know what the problem is.18:08
jaosoriorwell, that makes sense18:10
bnemecI added a depends-on to the ci patch.  That will probably get it working.18:11
bnemecOr at least get us past this problem to the next one. :-)18:11
jaosoriorbnemec: is stable/liberty switching to keystone v3? or is that only in master?18:11
bnemecjaosorior: Not sure.  There have been so many keystone-related changes lately that I can't keep track of what happened where. :-/18:12
jaosoriorssl used to kinda work... but if the keystone changes went in to stable/liberty, then we need to backport your commit too18:12
jaosoriorEmilienM do you know if stable/liberty is also switching to keystonev3?18:13
EmilienMjaosorior: what do you mean by switching to keystone v3 ? v3 API is exposed since Juno I think18:14
jaosoriorEmilienM: even though the API has been exposed, the services were hardcoded to use v2.018:14
jaosoriorEmilienM: No worries, just asked in case you had that info from the top of your head. We can figure it out18:15
jaosoriorthanks dude18:15
bnemecjaosorior: No keystone v3 endpoints in liberty yet, so we should be okay there.18:15
jaosoriorbnemec: alright18:16
jaosoriorbnemec: you had tried to backport the overcloud ssl job to stable/liberty, right?18:16
openstackgerrityolanda.robla proposed openstack/diskimage-builder: Generate fedora-atomic images using dib
openstackgerritBen Nemec proposed openstack/instack-undercloud: TEST: Run the ssl ci change against stable
bnemecjaosorior: ^18:16
bnemecI didn't retry it again last night because I kept fscking up the rebase of the ci change and didn't want to waste any more CI time on it. :-)18:17
*** trown is now known as trown|lunch18:17
* jaosorior is in suspense18:18
*** Marga_ has quit IRC18:19
* bnemec is dropping f-bombs and it's barely noon18:19
bnemecThis bodes well for the rest of the day. ;-)18:20
jaosoriorhahaha know the feel18:21
bnemecFortunately I have an appointment this afternoon so there will be an enforced break from broken OpenStack services.18:22
jaosorioron my side I'm not so lucky. It's 8:23 pm and I'm still here O_O18:23
bnemecUgh.  Go home.  Or at least step away from the laptop if you're already home. :-)18:24
jaosoriorbnemec: yeah, that's the downside of working from home18:24
jaosoriorand I made the mistake of putting my working spot in my bedroom18:24
ayoungEmilienM, so, yeah, no HTTPD in this Keystone, still eventlet, even with the Heat changes.  Both undercloud and overcloud18:25
bnemecslagle: Left a review on  It's going to need a change to work with SSL.  We can merge it as-is, but since I assume it needs backporting it's probably easier to do it as one patch.18:27
jaosoriorbnemec: I guess that also needs changes in puppet-tripleo18:27
openstackgerritJames Slagle proposed openstack/instack-undercloud: Nova should not sync power state of overcloud nodes
bnemecjaosorior: I assumed that was already done or it wouldn't work at all.18:28
jaosoriorand maybe os-cloud-config?18:28
jaosoriorbnemec: Let me double check18:28
bnemecjaosorior: puppet-tripleo is actually in the depends-on:
bnemecSo that's done anyway.18:29
jaosoriorbnemec: True, found it18:29
jaosoriorbnemec: I was looking at stable/liberty initially, that's why I thought it still needed to be done18:30
jaosorioranyway, I gotta go18:31
bnemecjaosorior: Have a good night.18:31
jaosoriorbnemec: If at some point in the day you can take a look at to see if it makes sense or I messed up, would really appreciate it18:31
jaosoriorhave a good one guys!18:31
slaglebnemec: k, thanks. yea, it has to work with ssl18:31
bnemecOh, pradk, you should also see my comment on
*** jaosorior has quit IRC18:31
pradkbnemec, sure, i'll address that now, thx for the review18:32
openstackgerritPradeep Kilambi proposed openstack/tripleo-heat-templates: Deploy Aodh services, replacing Ceilometer Alarm
pradkbnemec, done, thx again18:35
*** loser_ is now known as panda18:36
*** gmmaha has joined #tripleo18:36
bnemecpradk: slagle: +218:36
ayoungshardy, if the heat template says to deploy the undercloud Keystone in Apache, why would it still be in Eventlet?18:36
shardyayoung: because the undercloud isn't deployed via heat?18:37
gmmahaHi, i have been using bifrost/ironic standalone to deploy some machines and i was wondering if there was a dib element to modify the grub cmdline parameters on the image..18:37
gmmahatrying to enable net.ifnames=1 in the ubuntu image that's being deployed on machines18:37
ayoungshardy, ah...good point, I meant overcloud18:37
ayoungshardy, I'm debugging both in this deploy18:38
openstackgerritAthlan-Guyot sofer proposed openstack/tripleo-heat-templates: Remove unsafe "unset" defaults
ayoungEmilienM, one big clue on the overcloud:  it seems to have failed after Keystone.  the other services are running, but not entered into the service catalog;  may be related18:38
shardyayoung: aha :)18:38
ayoungshardy, according to your debugging steps...18:39
ayoung| d7e339be-6d09-4bad-9007-33e62c560734 | overcloud-ControllerNodesPostDeployment-oyoxyg3vuak6-ControllerOvercloudServicesDeployment_Step5-fltpqc6ebthr | CREATE_FAILED | 2016-03-02T16:07:10 | None         | ec28bbc2-91f2-4dbc-8c51-6b0cfe143c53 |18:39
ayoungshardy, and then18:40
ayoung| 0                                           | 82621f7d-e775-4203-9104-44b5f38d6042          | OS::Heat::StructuredDeployment                                                  | CREATE_FAILED   | 2016-03-02T16:07:10 | overcloud-ControllerNodesPostDeployment-oyoxyg3vuak6-ControllerOvercloudServicesDeployment_Step5-fltpqc6ebthr |18:40
openstackgerritAthlan-Guyot sofer proposed openstack/tripleo-heat-templates: updating enable_ceph conditions for controller
*** masco has quit IRC18:42
ayoungCould not evaluate: Cannot allocate memory18:44
ayoungshardy, so, not related18:44
*** trown|lunch has quit IRC18:44
*** rcernin has joined #tripleo18:44
*** rwsu has quit IRC18:45
*** mgould has quit IRC18:45
ayoungshardy, yeah...thought I had upped the memory to 8092.  4096 is not enough anymore for a control node18:46
shardyayoung: Yeah, unfortunately several things have been added which increased memory requirements somewhat18:48
shardyayoung: another option is which adds some swap18:48
shardymaybe useful if you're only just running out of actual memory18:48
ayoungshardy, It's ok, I think I can get away with the large nodes...18:48
*** akrivoka has quit IRC18:50
ayoungshardy, will export NODE_MEM=6144 work?  Can we do non-powers of 2?18:50
ayoungwell..I guess we'll find out18:51
shardyayoung: yup should do18:51
shardyIIRC the undercloud defaults to 6G18:51
ayoungI set it to 818:52
shardyYeah, good plan, 6 won't be enough without swap18:52
*** leanderthal is now known as leanderthal|afk18:53
*** trown has joined #tripleo18:55
ayoungEmilienM, so...the failure was due to memory size.  I've just rebuilt the VMs, and about to redeploy the undercloud using tripleo.sh18:55
ayounganything I should try to capture?18:56
*** shardy has quit IRC18:56
EmilienMayoung: good to know, I was not aware of that18:56
*** jcoufal has quit IRC18:58
*** athomas has quit IRC18:59
*** Marga_ has joined #tripleo19:01
*** dshulyak has left #tripleo19:01
*** rwsu has joined #tripleo19:03
*** Marga_ has quit IRC19:03
*** tosky has quit IRC19:05
*** Marga_ has joined #tripleo19:06
ayoungEmilienM, Notice: /Stage[main]/Keystone::Wsgi::Apache/Apache::Vhost[keystone_wsgi_main]/Concat[10-keystone_wsgi_main.conf]/Exec[concat_10-keystone_wsgi_main.conf]/returns: executed successfully19:08
ayoungthat looks good, right?19:08
EmilienMayoung: yes Sir!19:08
ayoungEmilienM, and yet19:08
ayoung[stack@instack ~]$ ps -ef | grep keystone19:08
ayoungkeystone 22582 22580  9 19:08 ?        00:00:00 keystone-admin  -DFOREGROUND19:08
ayoungkeystone 22583 22580  9 19:08 ?        00:00:00 keystone-main   -DFOREGROUND19:08
ayoungstack    22648 22557  0 19:08 pts/1    00:00:00 grep --color=auto keystone19:08
ayoungEmilienM, I bet it is the systemd config...let's look19:09
ayoungEmilienM, yep19:10
ayoungless /usr/lib/systemd/system/openstack-keystone.service19:10
ayoungI think that should maybe just get yanked, or chkconfiged off19:11
ayoung$ systemctl status openstack-keystone.service19:12
ayoung● openstack-keystone.service - OpenStack Identity Service (code-named Keystone)19:12
ayoung   Loaded: loaded (/usr/lib/systemd/system/openstack-keystone.service; disabled; vendor preset: disabled)19:12
*** xinwu has quit IRC19:16
*** ifarkas has quit IRC19:17
ayoungEmilienM, I'm the one that should be facepalming19:17
ayoungroot     22580     1  0 19:08 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND19:17
ayoungkeystone 22582 22580  0 19:08 ?        00:00:03 keystone-admin  -DFOREGROUND19:17
ayoungIt's working fine....19:17
* ayoung bone head19:17
ayoungI am surprised that there is only one process each for main and admin, but...that might be a config option to fix later19:18
trownayoung: do you have any thoughts on how to further troubleshoot
ayoungtrown, is that using keystone-manage bootstrap?19:21
trownor EmilienM ^, it is like the service tenant create part of keystone::role::admin is not working19:21
trownayoung: yep19:21
ayoungtrown, well, let me see if that creates the "project"19:21
EmilienMtrown: what do the puppet logs say?19:21
EmilienMcan you connect to keystone api?19:21
EmilienMpuppet uses osclient to connect to keystone api and manage resources19:22
trownEmilienM: sort of... the service is running, but I can't auth to it19:22
trownEmilienM: as in I can curl it and get the top level version json, but /tokens not so much19:22
ayoung looks like it only creates one project?  Is the admin project created on your system trown ?19:23
EmilienMtrown: /me currently dealing with CI issues in puppet19:23
ayoungtrown, OK, two domains (one root) one project (admin)19:24
ayoungone role assignment19:24
ayoungadmin user has admin role on admin project19:25
ayoungtrown, can you auth as Admin?19:25
ayoungtrown, ah...and nothing in the endpoint table19:25
ayoungI think I saw a commit for this...19:25
trownayoung: ya...19:26
trownayoung: even more odd, is that this only happens for the pacemaker managed keystone... the nonha job does not hit this19:26
ayoungtrown, maybe that is still using ADMIN_TOKEN19:28
dprinceEmilienM: trown there was a bad testenv this morning. I was testing a new build of the testenv's (and I broke it)19:29
rhalliseytrown, ramishra might have a solution for us fyi19:29
rhalliseycheck the bug19:29
trownayoung: I guess it would have to be? but it doesn't seem like it: says just give me puppet defaults19:29
dprinceEmilienM, trown: on that note, I'd like to deploy a new testenv now. Any issues if I just go on and do that?19:29
trowndprince: go for it19:29
trownayoung: and the puppet defaults are now to run the bootstrap19:30
rhalliseytrown, work around would be to use old neutronclient19:30
dprinceEmilienM, trown: okay, anything running would fail. We can recheck them later...19:30
trownoh wow... neutronclient is the problem?19:30
rhalliseytrown, I haven't confirmed yet, but will test later19:30
trownrhallisey: thanks!19:30
ayoungtrown, on a non-HA node, you get a service catalog with an identity endpoint?19:31
trownayoung: ya, I can fully validate the deployed overcloud with the exact packages/images for nonha19:31
trownayoung: it is only HA that is not fully configuring keystone... though it still gets to CREATE_COMPLETE19:32
ayoungtrown, include ::keystone::endpoint  where does that come from?  I am not fluent in puppet yet19:32
*** electrofelix has quit IRC19:33
trownayoung: no worries, I have been staring at this for 2 days, so I know where all the tripleo/puppet bits come from19:33
trownayoung: so that endpoint manifest is what is meant to create the identity endpoint19:34
ayoungtrown, so, that injects the endpoint in, but I can't see how it authenticates19:34
ayoung    user_domain         => $user_domain,19:35
ayoung    project_domain      => $project_domain,  seem like they are for a token19:35
ayoungbut if SERVICE_TOKEN is set, or whatever the newish term is for that, I think they would get ignored19:35
ayoung keystone_admin_token => hiera('keystone::admin_token'),  is specified
trownayoung: hmm that is only in the midonet stuff though19:38
trownwhich is not run in either scenario for me19:38
ayoungtrown, yeah, but I wonder if it "pollutes" the auth plugin...somehow?19:38
ayoungit's the only thing I can see.  It needs to be yanked anyway19:38
trownit would only run if this conditional is true
ayoungtrown, so, if we are going to do bootstrap, we should disable admin_token both in the config file and in the paste pipeline19:41
ayoungthe latter is tough because we bury it in a non-config location19:41
trownya, and I think that is unrelated to the issue blocking delorean promotion19:41
ayoungbut I think now if we set ADMIN_TOKEN=19:41
ayoungfor None19:42
trownsince that would be the same in the ha and nonha cases19:42
ayoungwe can be assured that ADMIN_TOKEN is not being used, and catch errors like this19:42
openstackgerritBen Nemec proposed openstack/tripleo-common: Add capabilities filter for Nova
*** dmacpher has quit IRC19:55
*** ayoung has quit IRC19:56
*** mkovacik has quit IRC19:58
*** jtomasek_ has quit IRC19:58
*** mkovacik has joined #tripleo20:04
*** shardy has joined #tripleo20:10
pradkhmm any idea why ci is failing with "testenv-client - ERROR - The command hasn't completed but the testenv worker has released the environment. Killing all processes."20:14
*** yamahata has quit IRC20:14
slaglepradk: i think there is some rebuilding of the testenv's going on20:15
pradkah ok20:16
*** penick has quit IRC20:16
*** pradk has quit IRC20:20
*** rwsu has quit IRC20:25
*** rwsu has joined #tripleo20:25
*** dmsimard has joined #tripleo20:27
*** penick has joined #tripleo20:28
*** dmacpher has joined #tripleo20:32
*** ayoung has joined #tripleo20:33
*** dmacpher has quit IRC20:33
*** dmacpher has joined #tripleo20:36
*** pradk has joined #tripleo20:43
*** stevebak` is now known as stevebaker21:00
*** mbound has joined #tripleo21:01
*** lblanchard has quit IRC21:01
*** jayg is now known as jayg|g0n321:04
*** penick has quit IRC21:18
*** ayoung has quit IRC21:24
*** jprovazn has quit IRC21:25
*** xinwu has joined #tripleo21:35
EmilienMpradk: in theory, would be useless if we land one day21:39
pradkEmilienM, ok is that expected to land in liberty?21:40
*** gmmaha has left #tripleo21:40
EmilienMpradk: I'm not sure21:40
EmilienMI wish :)21:40
EmilienMbut IIRC there is a problem during upgrade21:40
*** penick has joined #tripleo21:42
*** rcernin has quit IRC21:45
EmilienMdprince: I noticed that running tripleo jobs in puppet module CI was good but not enough, sometimes some patches can break HA jobs. Do you think we can switch non-ha to ha? or is it too hardware expensive?21:48
dprinceEmilienM: ha job is slower21:49
EmilienMyeah but not that much21:49
dprinceEmilienM: and non-ha is more stable21:49
dprinceEmilienM: but sure, the HA job does give extra coverage21:50
EmilienMit's 1h35 versus 2h21:50
*** jtomasek_ has joined #tripleo21:50
EmilienMmhh ok21:50
EmilienMlet's keep like this21:50
EmilienMdprince: the keystone bootstrap stuff broke ha job last time21:50
dprinceEmilienM: for now, or we can bring it up in next week's TripleO IRC meeting21:50
dprinceEmilienM: I just rebuilt all the testenv's btw21:51
dprinceEmilienM: jobs probably failed over the last 2 hours, that was me being disruptive21:51
EmilienMcan I run recheck?21:52
dprinceEmilienM: anyway, hopefully now things work again21:52
dprinceEmilienM: before running recheck en masse maybe just do one and see if it is green21:52
*** dshulyak has joined #tripleo21:52
dprinceEmilienM: I'd like an open queue to test some things21:52
*** penick has quit IRC21:53
dprinceEmilienM: if you see failing jobs try and send me the testenv number. Like testenv6 or something...21:56
dprinceEmilienM: this puppet-cinder job is on a new worker
EmilienMdprince: ok I'll let you know21:57
*** trozet has quit IRC22:01
*** gfidente has quit IRC22:02
*** shivrao has quit IRC22:06
*** dshulyak has quit IRC22:06
*** shivrao has joined #tripleo22:07
*** shardy has quit IRC22:10
*** jtomasek_ has quit IRC22:11
*** jtomasek has quit IRC22:19
*** penick has joined #tripleo22:29
*** trown is now known as trown|outtypewww22:30
*** dprince has quit IRC22:36
*** penick has quit IRC22:52
*** pradk has quit IRC23:11
*** penick has joined #tripleo23:13
*** ayoung has joined #tripleo23:18
*** morazi has quit IRC23:35
