14:00:27 <shardy> #startmeeting tripleo
14:00:28 <openstack> Meeting started Tue May 10 14:00:27 2016 UTC and is due to finish in 60 minutes. The chair is shardy. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:29 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:31 <openstack> The meeting name has been set to 'tripleo'
14:00:39 <shardy> #topic rollcall
14:00:41 <jdob> o/
14:00:47 <shardy> Hey all, who's around?
14:00:57 <tzumainn> hi!
14:01:05 <dprince> hi
14:01:14 <EmilienM> o/
14:01:56 <beagles> o/
14:02:09 <derekh> o/
14:02:22 <shardy> Ok then, let's get started :)
14:02:32 <shardy> #topic agenda
14:02:32 <shardy> * one off agenda items
14:02:32 <shardy> * bugs
14:02:32 <shardy> * Projects releases or stable backports
14:02:32 <shardy> * CI
14:02:35 <shardy> * Specs
14:02:35 <rhallisey> hi
14:02:39 <shardy> * open discussion
14:02:47 <shardy> Anyone have anything to add to the one-off items?
14:02:57 <shardy> there's one from me and one from beagles atm:
14:03:05 <shardy> #link https://wiki.openstack.org/wiki/Meetings/TripleO#One-off_agenda_items
14:03:21 <shadower> hey (sorry for being late)
14:03:34 <shardy> hey shadower, np
14:03:50 <shardy> #topic one off agenda items
14:04:04 <shardy> beagles: Hey, do you want to cover your item first here?
14:04:10 <beagles> shardy: sure
14:04:25 <shardy> #info bug tag for partial bug fixes
14:04:40 <beagles> I've been going through some of the older bugs, especially the ones that are "in progress"
14:05:06 <d0ugal> o/
14:05:10 <beagles> there have been patches landed as workarounds, but it's not clear that they are needed any longer, so I would like to tag bugs like this for now
14:05:22 <beagles> so that we can go through and clean them as time and resources permit
14:05:37 <EmilienM> I like the idea, especially for things related to our CI
14:06:13 <shardy> there is already a CI tag, but +1 on having a possibly-obsolete-hack-workaround tag (naming to be determined)
14:06:38 <EmilienM> ++
14:06:47 <shardy> anything that allows us to divide and conquer so we start purging old and no longer valid bugs gets my +1 :)
14:06:54 <shardy> beagles: thanks for helping out here!
14:07:04 <beagles> cool... I'm open to suggestions on the name. Naming is always the hardest part
14:07:21 <beagles> shardy: np .. it's good stuff. Nice way to get some historical context on things
14:08:06 <beagles> temp-workaround might be reasonably descriptive and not as scary as "hack"
14:08:31 <beagles> anyways, that's that in a nutshell; we can bikeshed on names later
14:08:33 <shardy> beagles: +1. or just "workaround" would be OK I guess
14:08:46 <EmilienM> both wfm
14:08:49 <beagles> k
14:08:51 <shardy> we can decide the name in #tripleo later, thanks!
14:09:05 <shardy> Ok, next topic was midcycle plans
14:09:28 <shardy> this was mentioned a couple of times recently, and I'd not considered organizing any f2f meetup this time
14:09:36 <EmilienM> +1
14:09:49 <shardy> what do folks think, should we aim for some sort of virtual hackfest/meetup around the middle of the cycle?
14:10:00 <beagles> +1 on virtual meetup
14:10:05 <shardy> I could arrange a series of focussed topic video calls or something
14:10:13 <marios> shardy: sounds good
14:10:14 <EmilienM> +1 on using #openstack-sprint + videoconf if needed
14:10:21 <shadower> yeah, better than traveling
14:10:23 <ccamacho> +1 virtual meetup
14:10:38 <shardy> obviously we can do that at any time, but it'd maybe be good to encourage some high-bandwidth discussion via some google hangouts or whatever
14:10:46 <EmilienM> or you can come to Quebec and I can cook French cooking
14:11:03 <beagles> :)
14:11:16 <slagle> +1 to virtual
14:12:14 <shardy> Ok, sounds like vague consensus, I'll start a ML thread where we can decide the date and agenda
14:12:29 <EmilienM> shardy: I can help you start the agenda
14:12:37 <shardy> #info agreed to aim for virtual mid-cycle meetup, ML thread pending
14:12:45 <derekh> I'm happy to go virtual if bnemec turns on his cam so we can see what kind of gestures he is making at the screen ;-)
14:13:01 <shardy> Any other one-off items before we continue?
14:13:06 <bnemec> derekh: :-)
14:13:36 <shardy> #topic bugs
14:13:56 <shardy> #link https://bugs.launchpad.net/tripleo/
14:14:23 <derekh> Current bug on trunk https://bugs.launchpad.net/tripleo/+bug/1580076
14:14:24 <openstack> Launchpad bug 1580076 in tripleo "Upgrades job failing pingtest with "Message: No valid host was found."" [Critical,Triaged]
14:14:37 <derekh> causing all upgrades jobs to fail
14:15:25 <slagle> hmm, is that happening before or after the stack-update?
14:15:26 <EmilienM> weird, I swear I saw the pingtest working last night
14:15:27 <shardy> There's also https://bugs.launchpad.net/tripleo/+bug/1580170 which looks like a puppet module version mismatch on liberty->mitaka upgrade, possibly need to get more info on that
14:15:28 <openstack> Launchpad bug 1580170 in tripleo "overcloud upgrade liberty to mitaka failed" [Undecided,New]
14:15:41 <slagle> i would assume after, since other jobs are passing the pingtest
14:15:42 <EmilienM> (I saw the pingtest working *after* stack update last night)
14:15:48 <slagle> i wonder if we have a real upgrades bug?
14:15:56 <ccamacho> derekh, it doesn't only affect upgrades, also (sometimes) creating the overcloud
14:16:14 <derekh> slagle: after, I have a hunch this patch started the problem https://review.openstack.org/#/c/312300/1 but it's currently just a hunch
14:16:27 <ccamacho> I'm trying to reproduce it creating the overcloud in my CI
14:16:44 <derekh> ccamacho: ok
14:17:24 <derekh> slagle: yup, we may possibly have found a real upgrades bug
14:17:29 <EmilienM> derekh: why?
14:17:38 <EmilienM> why 312300? do you have logs?
14:18:17 <dprince> derekh: have you tested a puppet pin that uses keystone prior to that yet?
14:18:52 <derekh> EmilienM: dunno, don't worry about it being that patch until I run some tests; it's just a hunch based on the auth problem we're seeing, when the problem started, and the timing of that patch
14:18:57 <EmilienM> dprince: it's nova
14:19:12 <EmilienM> derekh: 5th May iirc
14:19:40 <ccamacho> derekh, agreed on the starting time for the issues
14:19:55 <EmilienM> derekh: it can't be this one, the upgrade job passed at 2016-05-05 23:52:20
14:20:06 <EmilienM> and the puppet-nova patch merged at May 5 6:23 PM
14:20:18 <EmilienM> source: http://tripleo.org/cistatus.html and https://review.openstack.org/#/c/312300/
14:20:31 <dprince> EmilienM: okay, either way I'd say let's test them both to see
14:20:41 * EmilienM checks in tripleo logs to check we had the commit
14:21:17 <shardy> Ok, so it sounds like we've got enough eyes on this issue, are there any other bugs folks want to highlight?
14:21:30 <derekh> EmilienM: ok, btw I'm not suggesting we jump into reverting it or anything, just letting people know what my current train of thought was
14:21:46 <EmilienM> derekh: wait, I checked in the logs, and this is the commit in puppet-nova that worked: https://github.com/openstack/puppet-nova/commits/b108a7c36bbc733b3aa90786540e978f5c0ec059
14:21:55 <EmilienM> and we don't have the one you mentioned, so it's still a possibility
14:22:17 <derekh> dprince: will try a temprevert, kind of tried something similar here but it didn't even get to the ping test https://review.openstack.org/#/c/314510/
14:22:22 <derekh> EmilienM: ack
14:22:32 <shardy> Just a reminder to please target bugs when you triage them, e.g. if it's an actual bug in TripleO pieces vs a CI fix
14:22:35 <shardy> https://launchpad.net/tripleo/+milestone/newton-1
14:22:56 <shardy> Then we can burn down the open list for the milestone and know when it's a good time to release
14:22:56 <dprince> derekh: I see. I should know to always check the "nothing to see" patches first ;)
14:23:15 <derekh> dprince: you should have learned by now ;-)
14:23:22 <EmilienM> derekh: well, 314510 has the same effect as a temprevert... and it fails :(
14:23:47 <derekh> EmilienM: ya but it didn't get as far as the ping test
14:23:51 <derekh> EmilienM: going to recheck now
14:23:58 <dprince> derekh: let's recheck that one I think
14:24:09 <shardy> and also, re the cleanup beagles has been helping with - please review the list of bugs raised by you, and close any ye-olde ones which are no longer valid
14:24:25 <derekh> dprince: done
14:25:20 <shardy> Ok, shall we continue and defer further discussion re bug #1580076 to after the meeting?
14:25:21 <openstack> bug 1580076 in tripleo "Upgrades job failing pingtest with "Message: No valid host was found."" [Critical,Triaged] https://launchpad.net/bugs/1580076
14:25:29 <EmilienM> ++
14:25:40 <derekh> +1
14:25:42 <shardy> #topic Projects releases or stable backports
14:26:12 <shardy> #link http://releases.openstack.org/newton/schedule.html
14:26:26 <shardy> So, I wanted to run a plan past you all for the n-1 milestone
14:27:04 <shardy> I was thinking it'd be good to do a coordinated release of all the tripleo pieces, based on a passing periodic CI job, around the time of the n-1 milestone (e.g. in about 3 weeks time)
14:27:32 <shardy> I'll probably write a script that can scrape the latest periodic CI pass and propose all-the-things to openstack/releases
14:27:46 <shardy> then at any time, we can tag a release for a combination of things we know to work
14:28:02 <EmilienM> just an FYI about puppet modules, we might produce a first newton release by the end of this month
14:28:17 <dprince> shardy: would be cool to display that on tripleo.org perhaps too. Maybe in the CI status page or something
14:28:41 <slagle> shardy: once the releases were done, how would people consume them?
14:29:19 <derekh> shardy: your script can look at this file http://trunk.rdoproject.org/centos7/current-tripleo/versions.csv to see what version of each project is included (just FYI in case you didn't know)
14:29:33 <slagle> just wondering how we'd be able to definitively install an n-1
14:29:34 <shardy> slagle: that's a good question - I've not yet figured that out - I was hoping we could wire up tripleo-quickstart to enable an easy deploy for a given release
14:29:49 <shardy> obviously we can also publish the delorean hash for the passing CI run
14:30:06 <shardy> but I was hoping we could get the tagged repos more directly consumable
14:30:26 * shardy looks around for trown
14:31:21 <derekh> shardy: haven't seen trown around, I guess his wife had the baby, it could be a while before we see him
14:31:36 <shardy> slagle: I guess I was focussing on the first step, which is to define a point-in-time release which we expect to work, and has a known bunch of features/bug-fixes in it
14:31:50 <shardy> slagle: you're right, we need to then define and document how folks consume it
14:31:55 <shardy> derekh: ah, cool
14:32:24 <slagle> shardy: k, i think we could come up with something
14:32:45 <shardy> If folks are OK with the idea of milestone related releases, we can do that and figure out the consuming of it
14:33:18 <dprince> shardy: I always use trunk, but whatever :)
14:34:18 <shardy> Ok, let's table the release discussion and work it out over the next couple of weeks
14:34:24 <shardy> anything else release related?
14:34:52 <shardy> https://review.openstack.org/#/c/308236/ has some discussion re our stable branches FYI
14:35:11 <shardy> there's some resistance to our application for the follows-stable tag
14:36:33 <shardy> feel free to pitch in there if you have an opinion - I may start a ML thread on the same topic
14:36:56 <shardy> #topic CI
14:37:09 <shardy> So, other than the upgrades job, any CI news to discuss?
14:37:25 <EmilienM> yes
14:37:31 <shardy> the newly discovered step removal aka "turbo" option? :)
14:37:34 <dprince> shardy: Emilien is chopping some of the steps from deployment
14:37:37 <EmilienM> yesterday I released puppet-ceph
14:37:43 <slagle> all jobs now 7 minutes faster!
14:37:44 <dprince> that should speed things up I think
14:37:48 <slagle> upgrades job now 14 minutes faster!
14:37:55 <EmilienM> and we need to bump puppet-ceph to stable/hammer
14:37:58 <EmilienM> please review https://review.openstack.org/#/c/314311/
14:38:05 <EmilienM> I'm not sure this patch does what I want
14:38:32 <shardy> Cool, well nice work on the optimization! :)
14:38:45 <dprince> EmilienM: we aren't using stable for the other modules though?
14:38:47 <EmilienM> and yeah, I'm also working on step6 removal https://review.openstack.org/314253
14:38:51 <derekh> It's a tripleo improvement though, not just CI; I wouldn't like people thinking we just fixed something in CI
14:38:52 <dprince> EmilienM: why puppet-ceph?
14:38:53 <EmilienM> dprince: yeah but ceph is special
14:39:04 <shardy> I've been in discussion with zaneb and other heat folks, and it sounds like the heat memory usage issues are likely to be improved soon
14:39:16 <EmilienM> dprince: not a lot of people are working on this module, and we found out tripleo can't deploy Jewel *right now*, so better to pin
14:39:36 <sshnaidm> I'd like you to consider tempest for periodic nonha jobs, please: https://review.openstack.org/#/c/297038/
14:39:46 <shardy> https://review.openstack.org/#/c/311837/ is the first step if you'd like to follow it
14:40:20 <dprince> EmilienM: https://review.openstack.org/#/c/184844/ was an idea too
14:40:47 <slagle> shardy: was wondering if we should set up a periodic job to test with convergence?
14:41:08 <slagle> shardy: what do you think our plans should be around convergence? do you think we could switch in newton?
14:41:09 <EmilienM> dprince: should I update my patch or does mine work too?
14:41:19 <shardy> slagle: Probably not yet - I tested it locally quite recently, it makes the memory usage and DB utilization/timeout issues much, much worse atm
14:41:28 <dprince> slagle: yeah, hearing about the memory usage of convergence at Austin was a bit concerning
14:41:29 <slagle> i see, ok
14:42:02 <shardy> slagle: I think we should probably wait for things to get optimized and configure for non-convergence at least until later in Newton
14:42:37 <shardy> Heat may switch the default relatively soon, but IMHO the benefits don't yet outweigh the performance issues for the TripleO use-case
14:42:43 <dprince> EmilienM: I'd just like to clarify what we are doing where. If we do it in your patch, at the very least add a comment explaining why. Initially I thought we'd have a separate element for this though. That was my point.
14:42:55 <EmilienM> dprince: commit message is not enough?
14:43:14 <dprince> EmilienM: #comment
14:43:21 <EmilienM> kk
14:43:40 <shardy> slagle: we could set up a periodic job tho I suppose, or maybe an experimental job that we can try
14:44:18 <dprince> shardy: the non-convergence path would eventually get removed though, right?
14:44:35 <dprince> shardy: so we'd be on borrowed time to fully migrate over once heat switches?
14:44:40 <shardy> dprince: eventually I guess, but there's no discussion of that happening anytime soon
14:45:12 <shardy> it's not even deprecated yet, they'll switch the default, then we'd need *at least* two full cycles before anything got removed; I anticipate it being longer than that
14:45:26 <EmilienM> shardy: what is the progress on dropping the OPM rpm in stable jobs and using git?
14:46:03 <shardy> EmilienM: I proposed an approach, but dprince had a preference for a different implementation - I've not yet had time to revisit, so it's still TODO
14:46:20 <shardy> if anyone wants to pick that up, feel free
14:46:29 <EmilienM> ack
14:46:55 <shardy> we actually need to switch master to using the per-module delorean packages too, ref trown's item last week
14:47:09 <shardy> Ok, anything else re CI before we continue?
14:47:19 <derekh> Main thing I've got to report is that it looks like we're going ahead with the HW upgrade tomorrow
14:47:25 <derekh> I'll likely take the cloud down around 1pm UTC, and expect it down for about 12 hours
14:47:36 <derekh> sshnaidm: once that's done, assuming it speeds things up even more, I think we can start thinking about adding tempest to the periodic job
14:47:48 <dprince> derekh: hope it goes well
14:47:48 <sshnaidm> derekh, cool, thanks
14:48:00 <shardy> derekh: nice - do you need any help or do you have the bringup post-upgrade covered?
14:48:05 <dprince> I've got one thing that has to happen before the 14th as well. Very important
14:48:14 <derekh> dprince: certs?
14:48:14 <dprince> We need a new SSL cert :/
14:48:23 <dprince> derekh: yep :)
14:48:29 * derekh meant to remind dprince last week, oops
14:48:38 <shardy> heh, good reminder :)
14:48:41 <dprince> derekh: I got my own reminder. So it's on the list
14:48:47 <bnemec> Who needs certs when your cloud is down for upgrades? :-)
14:48:57 <shardy> OK #topic Specs
14:48:59 <sshnaidm> another topic - please comment on my elastic-recheck related mail on the mailing list if you're interested
14:49:02 <sshnaidm> sorry :)
14:49:03 <dprince> bnemec: well, if it comes back up we'll need 'em
14:49:06 <shardy> #topic Specs
14:49:17 <derekh> shardy: I think I've got it handled, will ping people if I need extra hands
14:49:26 <shardy> So, there's a ML message about two specs related to opnfv
14:49:32 <shardy> would be good to get some eyes on those
14:49:59 <shardy> #link https://review.openstack.org/#/c/313871/
14:50:12 <shardy> #link https://review.openstack.org/#/c/313872/
14:50:49 <shardy> I also added https://blueprints.launchpad.net/tripleo/+spec/custom-roles to track the fully-composable model we discussed at summit, which those two will probably require
14:51:18 <shardy> #link http://lists.openstack.org/pipermail/openstack-dev/2016-May/094287.html
14:52:21 <beagles> fwiw, I don't think they actually know how the dpdk one is going to work out just yet. The SR-IOV one is "closer" in terms of reality at the moment
14:52:44 <beagles> there are some non-tripleo related things to work out with respect to dpdk
14:53:03 <shardy> beagles: cool, please comment on the specs
14:53:10 <beagles> yup
14:53:18 <shardy> there's a lot of detail in both, but not that much clarity on the actual implementation AFAICT
14:53:39 <shardy> Anything else spec related?
14:54:05 <shardy> https://launchpad.net/tripleo/+milestone/newton-1
14:54:30 <shardy> we only have two features on the n-1 list, so we should add anything we expect to land in the next 2-3 weeks
14:54:46 <shardy> #topic open discussion
14:54:58 <shardy> Sorry, only 5 mins left - anything to raise?
14:55:34 <dprince> shardy: I added a new spec too
14:55:37 * derekh runs out the door a little early
14:55:38 <dprince> #link https://blueprints.launchpad.net/tripleo/+spec/remote-execution
14:56:02 <dprince> The power there is really in the Mistral workflow bits... but the CLI work shows it nicely I think
14:56:42 <EmilienM> I like it, but I wondered about security here, and the possibility of running malicious software remotely
14:57:01 <slagle> can we try and link etherpads tracking patches/reviews into the blueprints?
14:57:09 <dprince> EmilienM: I'm using the same mechanism we use for Heat. So nothing new really, I think
14:57:13 <shardy> dprince: Nice, I saw that, looks good - it'd be interesting to see how that aligns with operator requests re e.g. running ansible against a dynamic inventory generated by TripleO
14:57:14 <EmilienM> if someone gets admin credentials, it's easy to run malicious software remotely
14:57:15 <slagle> i noticed there were ones for composable services and mistral
14:57:19 <slagle> etherpads, that is
14:57:22 <slagle> but no one can find them
14:57:31 <shardy> EmilienM: it's already easy to do that if you have credentials
14:57:42 <marios> EmilienM: if you already have access to the pvt keys then you can do whatever you want anyway (e.g. from undercloud)
14:57:45 <EmilienM> ok, I'm just highlighting, just in case.
14:58:09 <EmilienM> slagle: https://etherpad.openstack.org/p/tripleo-composable-roles-work ?
14:58:23 <shardy> slagle: good point, let's link them from the whiteboards on the blueprints
14:58:28 <slagle> EmilienM: yes, pads like that
14:58:41 <slagle> the link isn't discoverable unless you already know it
14:58:41 <dprince> composable services is here: https://etherpad.openstack.org/p/tripleo-composable-services
14:59:00 <slagle> lol, or we end up with 2 etherpads :)
14:59:13 <shardy> <facepalm>
14:59:16 <slagle> qed
14:59:20 <shardy> Ok, we're out of time - thanks all!
14:59:24 <shardy> #endmeeting
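
Under the releases topic above, shardy mentions writing a script to scrape the latest passing periodic CI run, and derekh points at the versions.csv published under current-tripleo as its input. A minimal sketch of such a script might look like the following; the URL comes from the log, but the exact column names in versions.csv ("Project", "Source Sha") are an assumption here, so the parsing may need adjusting against the real file:

```python
# Hypothetical sketch of the release-helper script discussed in the meeting:
# read the versions.csv for the last passing periodic CI run and extract a
# project -> git sha mapping that could seed an openstack/releases proposal.
import csv
import io
import urllib.request

# URL taken from the meeting log (derekh, 14:29:19).
VERSIONS_URL = "http://trunk.rdoproject.org/centos7/current-tripleo/versions.csv"


def parse_versions(csv_text):
    """Return {project_name: source_sha} from a versions.csv payload.

    The column names "Project" and "Source Sha" are assumptions about the
    file layout, not verified against the real versions.csv.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Project"]: row["Source Sha"] for row in reader}


def fetch_versions(url=VERSIONS_URL):
    """Download and parse the versions.csv for the current-tripleo run."""
    with urllib.request.urlopen(url) as resp:
        return parse_versions(resp.read().decode("utf-8"))
```

From the mapping returned by `fetch_versions()` one could then emit deliverable entries for an openstack/releases change, which is the "propose all-the-things" step shardy describes; publishing the matching delorean hash alongside it would cover the consumption question slagle raises.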