18:00:03 <vgridnev> #startmeeting sahara
18:00:08 <openstack> Meeting started Thu Aug 18 18:00:03 2016 UTC and is due to finish in 60 minutes.  The chair is vgridnev. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:09 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:12 <openstack> The meeting name has been set to 'sahara'
18:00:19 <egafford> o/
18:00:26 <vgridnev> #link https://wiki.openstack.org/wiki/Meetings/SaharaAgenda
18:01:22 <vgridnev> o/
18:01:34 <raissa> o/
18:02:16 <tosky> hi
18:02:27 <mionkin_> Hi
18:02:50 <esikachev> hi
18:02:59 <vgridnev> #topic News / updates
18:03:52 <egafford> Not a ton of progress this week I fear; lot of internal release prep stuff. I do have a shared resource model that I like for the image gen CLI now, though.
18:04:00 <tellesnobrega> o/
18:04:20 <mionkin_> I added Kafka in CDH 5.5 and 5.7 in Sahara and sahara elements (patches are on review), also working on fixing ci jobs in sahara and updating grenade job with new CLI calls (patch is on review)
18:04:38 <vgridnev> trying to finish kerberos implementation; internal stuff around testing sahara.
18:04:59 <vgridnev> also some preparations around N3 release, and so on
18:05:09 <esikachev> working on fixing sahara-ci (problem with cinder+ceph)
18:06:01 <tosky> working on importing the last bits of tests not in sahara-tests
18:06:19 <vgridnev> tosky, is it going to be done today with Sergey?
18:06:37 <tosky> vgridnev: we agreed on working on it later; I will be online till late
18:06:49 <SergeyLukjanov> I'm looking at the patches right now
18:06:49 <vgridnev> ok, cool
18:06:54 <tosky> at least the import, then we need to have the "allows merges" review approved by infra
18:07:06 <vgridnev> SergeyLukjanov, hei
18:07:20 <tosky> then I can push the merge patches, and then I can push the WIP review to move the code to the right place :)
18:08:37 <vgridnev> #topic Summit preparations: Episode 1
18:09:22 <tosky> (I prefer when it starts from Episode IV, but...)
18:09:40 <vgridnev> so, we need to understand how many fishbowls, workrooms we need
18:09:46 <egafford> tosky: I felt like someone was going to go there, and gave it pretty good odds of being you. ;)
18:10:22 <egafford> Our fishbowls were pretty sparsely attended last time.
18:10:48 <SergeyLukjanov> ++ for starting from ep. IV (much better karma)
18:11:30 <vgridnev> so, I'm thinking that something around the previous values is good (probably we can take smaller values). we had discussions of EDP, Horizon, Tests, API, Release models, Image generation
18:12:11 <egafford> I think that those continue to be our broad areas of concern.
18:12:12 <vgridnev> + contributors meetup (halfday)
18:14:02 <vgridnev> so, last time we had 2 fishbowls, 6 workrooms, and a half-day contributors meetup, so I think we can probably take 6-7 workrooms
18:14:42 <egafford> Yeah, I'm uncertain that we need any fishbowls.
18:15:10 <egafford> Depends on the workroom size I guess, but they've always been big enough.
18:15:58 <vgridnev> ok, thanks folks for the feedback, I will send the request soon
18:16:28 <vgridnev> #topic Newton 3 release: features status
18:17:43 <vgridnev> as of today, we have 2 weeks to complete features without FFE, + around 2 weeks before RC1
18:18:43 <egafford> The image gen CLI abstraction is totally usable but has no plugin implementations yet. I think ambari is nearly done, and the others will be easier because they use a lot of shared resources and framework, but testing will tell.
18:19:04 <vgridnev> so I want to be sure that everything that we need has landed; let's track features in the etherpad
18:19:08 <vgridnev> #link https://etherpad.openstack.org/p/sahara-review-priorities
18:19:55 <vgridnev> I'm inviting assignees to complete this etherpad
18:20:02 <tosky> I couldn't check them, but shouldn't we include also the changes for pagination (unless the last patch I've seen was merged)?
18:20:45 <vgridnev> egafford, I think that we can grant FFE for image generation, since it doesn't touch the provisioning part much
18:21:40 <vgridnev> tosky, I'll add that to the etherpad, the only change that we still need is in the dashboard
18:22:02 <egafford> vgridnev: Yeah, it's very safe, especially if we don't enable the validation piece (which is completely possible, per plugin.)
18:22:39 <egafford> Just adding the get_image_argument and pack_image methods has literally zero impact on the sahara service.
18:23:00 <vgridnev> for the plugins api there are a few dashboard changes which have to land
18:23:41 <vgridnev> that will also help us find the right changes to review
18:23:59 <vgridnev> ok, next topic
18:24:08 <vgridnev> #topic CI status
18:25:23 <vgridnev> esikachev, it's your time. Actually I know that most probably we will have a somewhat unstable but working version of CI tomorrow; there were some issues around ceph + cinder integration
18:26:31 <esikachev> yes, we have a problem with cinder+ceph, as a workaround we are trying to use LVM as the backend for cinder
18:27:01 <esikachev> devops told me that this method was unstable
18:27:33 <esikachev> now, 42 lab uses LVM
18:28:40 <esikachev> I plan to reinstall ubuntu (xenial) on labs 42-43 in the coming weeks
18:28:47 <vgridnev> additionally, since N3 is coming, we need a plan for how changes can be merged if CI is not working well. I'll delegate some engineers (mionkin, mikhail) to perform manual tests of changes. To be fair, we don't have large features which could potentially break everything
18:29:10 <vgridnev> what do you think, team?
18:29:36 <esikachev> maybe we can add some temporary gate jobs?
18:29:47 <tosky> manual tests for each change (so for each backend) is... heavy
18:29:49 <esikachev> for example spark, vanilla
18:30:07 <vgridnev> esikachev, I'm not quite sure that it will work well.
18:30:31 <tosky> if only nested virtualization was enabled and reliable on the main gates...
18:31:00 <vgridnev> So, current fake plugin tests are running with 2 very small vms
18:31:20 <esikachev> let's check it on test commits in sahara-tests
18:32:46 <vgridnev> esikachev, how would we upload the vanilla image to the node where the tests are performed?
18:33:06 <vgridnev> wget from sahara-files is not a good idea
18:33:18 <esikachev> :(
18:35:04 <vgridnev> the only problem with that approach is that we have A LOT of mapr changes that have to be tested
18:35:21 <vgridnev> the current list
18:35:23 <vgridnev> https://review.openstack.org/#/q/owner:groghkov%2540maprtech.com+status:open
18:35:54 <vgridnev> oops, there is one change to manila
18:37:39 <esikachev> vgridnev: we can temporarily add some filters for jobs for the mapr plugin
18:37:53 <esikachev> on sahara-ci
18:38:16 <tosky> esikachev: just to be sure: devops told you that cinder on LVM was unstable; is that the current configuration, which still fails, or is this instability just going to happen from time to time?
18:39:01 <esikachev> this configuration killed lab for 2-3 days
18:39:15 <esikachev> for example, kernel panic
18:39:21 <tosky> oh, I see
18:39:34 <vgridnev> but it was cycles ago, so something has probably changed
18:39:42 <esikachev> yep
18:40:00 <tosky> but is it the configuration which is enabled now then?
18:40:32 <esikachev> yes, on one devstack
18:42:03 <vgridnev> so, I think that we can accept manual tests for some exceptions (like code that touches only the cdh plugin or ambari). In many cases, like changes around provisioning, it's enough to perform tests on the fake plugin
18:42:10 <vgridnev> at least
18:42:27 <esikachev> ok
18:42:58 <vgridnev> egafford, what do you think about my last statement?
18:44:30 <vgridnev> egafford seems to be afk
18:45:00 <egafford> Sorry; mildly distracted.
18:45:29 <tosky> esikachev: is there any hope of fixing the issues with ceph and restoring the previous state? Are the current failures mostly (or just) due to timeouts because of cinder on LVM?
18:45:44 <egafford> Hm... this is really tricky.
18:47:06 <egafford> I suppose I think that realistically, we won't be able to sustain development without functioning CI, and also that we know that from time to time in extreme circumstances we need to stop listening to CI that's too broken.
18:47:42 <esikachev> tosky: if cinder+lvm works badly I will reinstall ubuntu and restore cinder+ceph
18:47:57 <tosky> ok
18:48:28 <egafford> So I guess I think that making a policy of "these are the circumstances under which we ignore CI" is nearly doomed; it's always going to be a mix of time pressure and the test halo impact of the change.
18:49:22 <egafford> I tend to agree that on plugin-specific changes, it can be okay to approve with manual testing; I actually really dislike the idea of approving core changes using only the fake plugin.
18:50:06 <egafford> That works in a model in which vendors are supporting and maintaining their own plugins (like Cinder); I don't think it works very well here. Our plugins are ours, and they are the application; if they're low-quality, so are we.
18:50:12 <egafford> Does that make sense?
18:51:38 <vgridnev> So, we can discuss each possible exception of approving without CI on core changes
18:52:09 <egafford> That'd be my suggestion, yeah.
18:52:17 <tosky> yep
18:52:20 <egafford> Evaluate the risk.
18:52:26 <esikachev> vgridnev: maybe try to disable cinder volumes in the scenario yaml files?
18:52:48 <esikachev> mapr uses ephemeral drives
18:53:45 <vgridnev> esikachev, that makes sense. We can create copies for templates but without volumes)
18:54:06 <tellesnobrega> I'm with egafford on this, evaluate each case on core changes
18:54:11 <vgridnev> Or we can create a commit that will be reverted later
18:55:17 <vgridnev> I like the variant with copies of the templates (with a -volumes suffix)
18:56:09 <vgridnev> tosky, what do you think about that ^^ ?
18:57:11 <vgridnev> by default we will not test with volumes. Probably not all users can use volumes
18:57:59 <vgridnev> so, 3 mins left
18:58:40 <tosky> uhm, yeah, we should maybe think of a way to not duplicate all the contents of the yamls
18:59:14 <vgridnev> we can continue discussing on openstack-sahara. thanks everyone for attending!
18:59:20 <vgridnev> #endmeeting