14:00:58 <dprince> #startmeeting tripleo
14:01:03 <openstack> Meeting started Tue Feb 23 14:00:58 2016 UTC and is due to finish in 60 minutes. The chair is dprince. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:04 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:05 <d0ugal> o/
14:01:07 <openstack> The meeting name has been set to 'tripleo'
14:01:16 <jaosorior> o/
14:01:28 <slagle> hi
14:01:29 <dprince> anyone around for a tripleo meeting
14:01:43 <derekh> o/
14:01:46 <dtantsur> o/
14:01:50 <jtomasek> o/
14:01:56 <gfidente> o/
14:02:30 <jdob> o/
14:03:14 <leanderthal> o/
14:03:19 <dprince> #topic agenda
14:03:19 <dprince> * bugs
14:03:19 <dprince> * Projects releases or stable backports
14:03:19 <dprince> * CI
14:03:21 <dprince> * Specs
14:03:27 <dprince> * Create stable branch of DIB to help phase out old ramdisk http://lists.openstack.org/pipermail/openstack-dev/2016-February/086738.html
14:03:33 <dprince> * open discussion
14:03:57 <trown> o/
14:04:08 <dprince> There is one extra item above to discuss creating a DIB branch. Any other topics to add for this week that I missed?
14:05:21 <dprince> okay, lets go
14:05:28 <dprince> #topic bugs
14:06:41 <dprince> I'm not actually sure there are bugs on all of these patches but it would be good to get eyes on the depends-on here: https://review.openstack.org/#/c/278553/
14:07:01 <dprince> as these are blocking us from testing current
14:07:16 <trown> dprince: it is down to just 2 patches, 1 for undercloud and 1 for THT
14:07:30 <trown> and both derekh and I got a successful pingtest with those
14:08:15 <dprince> trown: nice. So is the plan that once the tripleo-ci patch passes we land them all?
14:08:37 <trown> dprince: that would be my hope ya
14:09:13 <derekh> trown: we'll have to edit out part of the patch to tripleo.sh but ideally yes I think so
14:10:20 <dprince> okay, I've got one quick undercloud bug to (finally) get zaqar working: https://review.openstack.org/#/c/283221/
14:10:21 <trown> derekh: right, we will need to use current-tripleo, but we can take the hash for it from what (hopefully) passes ci
14:10:32 * derekh points to the change to DELOREAN_REPO_URL
14:11:05 <dprince> derekh: thanks
14:11:16 <derekh> trown: yup, I also have a sleep 600 in there, I put that in yesterday when I noticed nodes weren't ready but I may have just been crazy
14:11:53 <trown> derekh: ah cool, I have a slightly better scriptlet we could put there that checks nova hypervisor-stats
14:12:01 <derekh> trown: dprince we can just remove the sleep and merge, if it causes problems put the sleep back in while we investigate
14:12:18 <slagle> derekh: was it that oc services weren't up after create_complete?
14:12:52 <slagle> i've seen that in the ha job. i put in that 1 hack for crm_resource --wait in tripleo-ci
14:12:54 <trown> slagle: no, before deploy, ironic nodes not available
14:13:01 <slagle> oh ok
14:13:09 <derekh> slagle: not sure, iirc on friday when I tried a deploy I had loads of 500's from services
14:13:51 <derekh> slagle: when I came back on monday and tried overcloud deploy on the same undercloud it went ok
14:14:12 <derekh> slagle: so I figured the weekend between deploy/register and deploy helped somehow
14:14:37 <derekh> anyways, if nobody else has seen it lets drop it and merge
14:14:47 <derekh> if it causes a problem add a quick sleep in
14:14:54 <derekh> and figure it out
14:15:02 <trown> +1
14:15:03 <dprince> okay, that sounds like a plan
14:15:04 <slagle> k, wfm
14:15:23 <marios> o/ sorry also on a call
14:15:41 <dprince> #topic Projects releases or stable backports
14:16:06 <dprince> I'd actually like to mention the desire to create a stable branch for DIB here I think
14:16:13 <dtantsur> ++
14:16:20 <dprince> dtantsur: want to drive this?
14:16:28 <dprince> #link http://lists.openstack.org/pipermail/openstack-dev/2016-February/086738.html
14:16:30 <trown> I thought DIB was supposed to be backwards compatible?
14:16:42 <dtantsur> please check the link, I'll give you tl;dr
14:17:00 <dtantsur> we (ironic) want to drop support for the old bash ramdisk from our code
14:17:13 <dtantsur> if we do that, we won't be able to gate on DIB any more
14:17:24 <derekh> its supposed to be, but has anybody tried to build F19 images lately? /me would be surprised if it worked
14:17:25 <dtantsur> then we're losing DIB support for our stable branches as well
14:17:41 <dtantsur> (it can get broken at any moment)
14:18:09 <dtantsur> the easiest way out for everyone is for DIB to get a stable/liberty branch (ideally, stable/mitaka would work too)
14:18:27 <dtantsur> then we can just drop things from both our master and DIB itself, and live happily
14:18:28 <trown> hmm... so a stable branch for a single element?
14:18:41 <dtantsur> not sure what you mean by "for a single element"...
14:18:55 <dprince> trown: we already have stable branches for other projects. t-h-t
14:18:56 <trown> the bash deploy ramdisk is a single element
14:19:06 <dtantsur> yes, but branch is for project, not for element
14:19:09 <dprince> trown: not a single element. All of DIB
14:19:25 <dtantsur> also while I agree that DIB is supposed to be backward compatible, things do happen from time to time
14:19:27 <trown> ya, but the rest of DIB does not need a stable branch, so we create it for that single element
14:19:34 <slagle> dtantsur: why the jump from "if ironic drops bash ramdisk support" -> "can't gate on dib"
14:19:48 <dtantsur> slagle, DIB gate is running ironic master and will fail if we remove the code
14:20:21 <slagle> what part of the DIB gate? where it uses the bash ramdisk?
14:20:28 <derekh> why can't we just stop maintaining it, like any other element not used in CI
14:20:34 <dprince> dtantsur: if we update the DIB gate to use IPA would that solve it?
14:20:34 <slagle> why dont we just update that to not use the bash ramdisk?
14:20:43 <slagle> ^^ my question too
14:20:58 <dtantsur> slagle, gate-tempest-dsvm-ironic-pxe_ssh-dib
14:21:17 <dtantsur> dprince, yes, but we'll lose coverage for this element, and our stable branches may get hurt
14:21:34 <dtantsur> the old ramdisk was perfectly supported in Kilo and Liberty
14:21:35 <trown> well, really just liberty...
14:21:56 <dprince> dtantsur: so we create a stable/liberty as a buffer. No harm done
14:22:07 <dprince> dtantsur: and then we update the CI job to use IPA anyways?
14:22:18 <dtantsur> dprince, yep, then you feel free to do anything on master
14:22:31 <dtantsur> cause you won't affect our stable gates
14:22:41 <dtantsur> so yes, we're removing the bash-based gate
14:22:58 <slagle> i disagree with changing the dib backwards compatibility expectation on account of this
14:23:02 <dtantsur> then it's a good point that we should get an IPA-based gate
14:23:16 <slagle> if we need to do something as a buffer to get us by, that's fine
14:23:21 <trown> I am so confused... we remove the bash-based gate, why do we need the bash deploy element?
14:23:28 <dprince> slagle: I don't think we are changing backwards compat. Support for this element is just being removed
14:23:41 <dtantsur> trown, we can remove it
14:23:51 <dtantsur> if we have stable branches for DIB, I mean
14:23:52 <slagle> dprince: as long as this is clear. we're not free to do anything on master once/if the stable branch is there
14:23:56 <dprince> dtantsur: another option if people seem to push back on this (for whatever reason). Just leave the (broken) code in DIB and add a comment to the readme
14:24:12 <slagle> infra relies heavily on dib, etc
14:24:27 <dtantsur> dprince, well... then someone can land a change breaking our stable gates, still relying on DIB and the old ramdisk..
14:24:30 <trown> dprince: I prefer that option
14:24:53 <trown> if we create a stable/liberty for DIB, then that means we get that branch for liberty delorean packages
14:25:07 <slagle> i really dont think we want that
14:25:07 <dtantsur> probably
14:25:17 <trown> which seems a bit not ideal, if it is just meant to save us from one unsupported element
14:25:29 <dprince> dtantsur: I view creating a branch as a nice safe place for people to hang out while they update to IPA. Sounds like some people would prefer we not do that. So just add a comment to the readme and move on.
14:25:41 <dtantsur> what comment?
14:26:00 <dprince> dtantsur: a comment to the DIB element for the old bash ramdisk
14:26:03 <dtantsur> I'm sorry guys, I don't get how a comment will prevent people from breaking stable gates.....
14:26:13 <dtantsur> especially somewhere in a base element
14:26:17 <dprince> dtantsur: We'll just drop the gates
14:26:38 <dtantsur> dprince, how do you ensure you don't break ironic stable/liberty then?
14:26:53 <jroll> dropping the stable gates is equivalent to dropping old ramdisk support on liberty IMO
14:27:00 <dprince> dtantsur: leave the gates on for those branches. Using DIB master since there is no branch
14:27:17 <dtantsur> dprince, how do we prevent DIB from breaking us?
14:27:32 <dtantsur> since DIB master will no longer be cogated with ironic at all
14:27:37 <derekh> dtantsur: can't the stable job get run on DIB master ?
14:27:47 <dprince> dtantsur: To be clear creating a stable branch is the cleanest idea here. We created stable branches for everything else so what harm does it cause to do it for DIB too?
14:28:05 <dtantsur> derekh, maybe? I'm not sure how much infra would hate us for mixing it.. also what about requirements? how do we merge them?
14:28:26 <dtantsur> dprince, are you asking me? :) I'm not against it, you probably want to ask slagle
14:28:28 <trown> dprince: it means we freeze liberty DIB... unless we backport every DIB change
14:28:46 <dprince> trown: probably a good idea anyways
14:28:53 <bnemec> DIB makes backward compatibility promises. If we break those, that's a bug that we should fix.
14:29:03 <slagle> i'm against changing the expectation of the project being backwards compatible on account of this reason
14:29:14 <slagle> a stable branch is also a lot of maintenance for $someone
14:29:20 <bnemec> I don't think we should branch an entire project because we _might_ screw up our backwards compatibility promise.
14:29:20 <dprince> This isn't a DIB backwards compat promise. It is a feature that is going away in the element
14:29:35 <dtantsur> bnemec, that's what the whole openstack does
14:29:47 <slagle> dtantsur: that's not true
14:29:55 <slagle> some projects use stable branches, some do not
14:30:20 * derekh steps out for a minute
14:30:21 <slagle> earlier on, it was decided, with wide consensus that dib would not use a stable branch and be backwards compatible
14:30:24 <dtantsur> a couple of telemetry projects do not iirc
14:30:39 <dtantsur> ok, the vast majority of openstack projects, including libraries and clients
14:30:51 <dprince> The problem here is DIB has code it can't control. I think the issue here is that some of these elements don't belong in DIB because they break the promise
14:31:08 <dprince> This is just an element that is going away.
14:31:10 <bnemec> If Ironic is that concerned about dib breaking their gate, put a cap on dib in the stable branches.
14:31:30 <jroll> bnemec: that will affect all of openstack, unfortunately
14:31:37 <dprince> bnemec: Ironic wants to dump a feature that will break our gate
14:31:42 <dtantsur> bnemec, that's an option, but it will 1. put a cap on DIB for all stable/liberty branches, including tripleo itself, 2. prevent anyone from landing DIB fixes in liberty
14:33:42 <dprince> Anytime someone tries to optimize something by not creating a branch it gets complicated I think. To me the simplest thing is just to create a stable/liberty branch and move on
14:33:51 <slagle> having to land dib fixes in liberty is exactly what i want to avoid
14:34:42 <dtantsur> slagle, well, then if it gets broken (e.g. external mirror change), it's broken forever
14:34:47 <dprince> slagle: I understand your desire to avoid this work. But I think it is the cleanest solution here
14:34:50 <dtantsur> cause it will be capped for liberty release of all projects
14:35:12 <bnemec> I mean, if we make a breaking change in dib's base elements, that's going to break Ironic's gate anyway because they're still running against master on master.
14:35:30 <bnemec> And that's a bug we need to fix anyway.
14:35:38 <dtantsur> bnemec, we won't be gating DIB any more
14:35:43 <dtantsur> so no, you won't
14:36:28 <trown> wait... if you wont be gating DIB... whats the big deal?
14:36:41 <slagle> ya, my head just 'sploded too
14:36:44 <jroll> we still need to gate DIB on stable/liberty
14:36:45 <dtantsur> trown, not on master
14:36:50 <bnemec> Fine, but my point stands. _If_ we break the job, then it's a bug that should be fixed.
14:36:51 <slagle> is that the end game here?
14:36:53 <dtantsur> trown, slagle, bnemec was talking about master
14:37:03 <bnemec> That is _not_ a reason to branch dib.
14:37:20 <dtantsur> bnemec, sigh... but you won't have any jobs on master... so you will break us, and we'll come back reverting things in hope it will help, etc..
14:37:46 <trown> seems like the reverse is true for tripleo almost everywhere
14:37:58 <dtantsur> what we do right now is reinventing the whole path that led openstack to stable branches, to be honest
14:39:14 <dprince> dtantsur: okay. Lack of consensus. But I don't think anything is blocked by not doing anything
14:39:15 <lucasagomes> we still can gate on DIB on master, but using the ironic-agent element (instead of the deploy-ironic one)
14:39:26 <slagle> dprince: that may be. i would like to understand if infra has a take on this as well though, given they are a heavy consumer as well
14:39:28 <dtantsur> lucasagomes, that's not relevant to the discussion
14:39:52 <bnemec> It is actually.
14:40:12 <bnemec> It gives us test coverage of everything except the bash ramdisk element.
14:40:16 <dtantsur> bnemec, no
14:40:19 <dprince> dtantsur: yep, just use DIB master for the stable ironic/liberty branches.
14:40:46 <dtantsur> dprince, that's what we do, how does it solve the problem?
14:41:04 <dtantsur> bnemec, IPA is built in a completely different way. even using a different command
14:41:15 <dtantsur> from DIB point of view, IPA is not a ramdisk, we build it as a disk image
14:41:44 <bnemec> dtantsur: ramdisk-image-create is literally a symlink to disk-image-create. There's far less difference than you might think.
14:41:56 <sambetts> if DIB master is meant to be backward compatible can't we leave the bash ramdisk in there and just remove the gate jobs from Ironic master/mitaka and the Ironic code that supports the old ramdisk, then just leave a comment in the old ramdisk README that says this is only supported up to Ironic liberty
14:41:58 <dtantsur> bnemec, yeah, but base elements are different, at least used to be
14:42:18 <trown> sambetts: ya, that would be my preference
14:42:19 <dtantsur> sambetts, "is meant to be" does not mean it always is. that's what the gate guarantees
14:43:00 <sambetts> ?
14:43:13 <dtantsur> sambetts, if it's not tested, it's broken :)
14:43:27 <dprince> dtantsur: sounds like we aren't even close to resolving this. Sorry. I thought this would be a simple thing :/
14:43:32 <sambetts> it's tested in stable/liberty, just not in master or stable/mitaka
14:43:34 <dtantsur> yeah...
14:43:41 <dprince> dtantsur: can you start another thread for TripleO regarding this topic
14:43:57 <dtantsur> well... I can try, but looks like we need workarounds
14:43:58 <dprince> dtantsur: explain the sides, creating the branch, extra work involved by some in maintaining that, etc.
14:44:13 <dtantsur> anyway, thanks dprince for bringing it up
14:44:21 <dprince> dtantsur: np
14:44:23 <dprince> #topic CI
14:44:33 <jroll> well, there's two clear paths: 1) make a branch, drop the support in ironic now, 2) don't make branch, drop ironic support in three cycles
14:44:41 <jroll> lots of tradeoffs :)
14:44:41 <dprince> derekh: CI is working right? :)
14:44:45 * jroll shuts up now
14:44:59 <sambetts> jroll: just because its in DIB doesn't mean that mitaka has to support it right?
14:45:01 <derekh> dprince: yup, at the moment
14:45:17 <jroll> sambetts: let's take this elsewhere, we're off topic now
14:45:21 <dprince> cool.
14:45:29 <derekh> dprince: problems with a new mariadb package yesterday
14:45:33 <dprince> jroll: yeah, lets just start another thread on this for now
14:45:42 <dprince> derekh: which are solved now right?
14:45:44 <gfidente> a little OT but about CI, I had a submission to rename the ceph job into upgrades but it hasn't landed yet, not sure if you guys want to vote on it https://review.openstack.org/#/c/281997/1
14:46:07 <trown> derekh: dprince, ya RDO will still want to move to mariadb10, so we need to figure out what all went wrong there and try to get fixes in place
14:46:17 <dprince> gfidente: thanks, I will look
14:46:39 <dprince> gfidente: was that the right patch?
14:46:46 <trown> there was at least one issue (missing clustercheck binary) that was a packaging fault, but even after adjusting for that I could not get our galera to bootstrap with the new package
14:46:54 <gfidente> there is this one as well https://review.openstack.org/#/c/260466/ which potentially runs upgrades in the upgrade job
14:47:14 <dprince> #link https://bugs.launchpad.net/tripleo/+bug/1547660
14:47:14 <openstack> Launchpad bug 1547660 in tripleo "Could not find command '/usr/bin/clustercheck'" [Critical,Triaged]
14:47:16 <dprince> trown: ^^?
14:47:21 <gfidente> dprince, https://review.openstack.org/277419
14:47:23 <trown> ya, that one is a packaging fault
14:47:41 <dprince> gfidente: thanks, that is it
14:47:42 <trown> dprince: so we reverted the package from the deps repo in RDO
14:48:18 <gfidente> #link tripleo ceph rename into upgrade https://review.openstack.org/277419
14:48:44 <gfidente> #link trigger upgrades in ci https://review.openstack.org/#/c/260466/
14:49:29 <dprince> okay lets move on
14:49:33 <dprince> #topic specs
14:49:35 <derekh> gfidente: thanks for the reminder, will look again
14:50:23 <dprince> getting mostly positive feedback on the Mistral spec https://review.openstack.org/#/c/280407/
14:50:49 <dprince> rbrady: anything you'd like to add here?
14:51:20 <rbrady> dprince: nope. I was giving it a day or so for feedback and then going to update
14:51:35 <dprince> rbrady: cool, sounds good
14:52:22 <dprince> Any other issues on specs this week?
14:53:46 <dprince> #topic open discussion
14:54:15 <bnemec> Missed this in the CI topic, but I think https://review.openstack.org/#/c/282462/ would help our HA job stability a bunch.
14:54:40 <gfidente> thanks bnemec !
14:54:41 <bnemec> Probably half or more of the failures I'm seeing are the nohostfound due to swift getting OOM'd.
14:54:48 <trown> I will be demoing tripleo-quickstart 2 weeks from tomorrow: https://www.youtube.com/watch?v=4O8KvC66eeU
14:55:09 <marios> i'd like to highlight this bug https://bugs.launchpad.net/heat/+bug/1539541 which is a problem for upgrades at the moment
14:55:09 <openstack> Launchpad bug 1539541 in heat "Can't ignore updates to OS::Nova::Server" [High,In progress] - Assigned to Steve Baker (steve-stevebaker)
14:55:18 <bnemec> Need to talk to derekh about whether we can increase the size of the undercloud again.
14:55:55 <gfidente> hey guys I wanted to raise a question as well, I was thinking to use something like https://review.openstack.org/#/c/270189/ to switch the networking configuration to using hostnames instead of ips
14:56:03 <gfidente> on the basis that this should help with cleaner ipv6 support
14:56:04 <dprince> bnemec: +2
14:56:11 <gfidente> do you think that is doable and worth it?
14:56:40 <marios> gfidente: thanks will take a look. time is the main concern before i even look at the change
14:56:41 <dprince> trown: did you see my comment about distro support for tripleo-quickstart?
14:57:01 <marios> gfidente: but ultimately would help solve all of the '[]' issues with ipv6
14:57:06 <trown> dprince: yep, it supports everything that instack-virt-setup does
14:57:12 <bnemec> gfidente: No idea whether it's doable, but it seems like a good idea.
14:57:13 <gfidente> marios, exactly, my main point is that by using names we don't need the conditionals to cope with [] and : in the ip addresses
14:57:27 <gfidente> bnemec, marios but it'll need some review time :)
14:57:38 <marios> bnemec: its a trap!
14:57:44 <gfidente> I know bnemec -1s my submissions all the time so that's good
14:57:56 <gfidente> marios volunteers too? ;)
14:58:06 <trown> dprince: or python-tripleoclient for that matter
14:58:19 <derekh> bnemec: ack, we can probably bump it again another 1G if needed
14:58:45 <dprince> trown: that may be. but I think it gets us farther from multi-distro support because some of the scripts in the incubator (used by instack-virt-setup) did in fact support multiple distributions
14:59:00 <bnemec> derekh: I think we do, although the extra swap usage doesn't seem to have slowed the job down any. It still finished in the same time as the ceph job.
14:59:18 <trown> dprince: I think ansible is better at supporting multi-distro than random bash scripts though
14:59:35 <dprince> trown: that was my real question. Does it get us in a better place...
14:59:45 <trown> dprince: for example, tripleo-quickstart uses the generic 'package' module instead of 'yum'
15:00:08 <derekh> bnemec: ok, lets merge that and see how it goes, if speed isn't affected then maybe its enough
15:00:13 <trown> that said, I have not tried it at all on anything except fedora and centos
15:00:23 <dprince> trown: cool, I might like to see package name abstractions. I know when I came to TripleO I was really discouraged to see everything hard coded to Debian package names
15:00:36 <dprince> trown: now the opposite is true in some cases. Hard coded to RH
15:00:38 <bnemec> derekh: Just pulled the trigger.
15:00:49 <dprince> oops. out of time
15:00:57 <dprince> Thanks everyone. Sorry about the quick cutoff
15:01:01 <derekh> bnemec: ack
15:01:06 <dprince> #endmeeting