18:00:03 <vgridnev> #startmeeting sahara
18:00:08 <openstack> Meeting started Thu Aug 18 18:00:03 2016 UTC and is due to finish in 60 minutes. The chair is vgridnev. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:09 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:12 <openstack> The meeting name has been set to 'sahara'
18:00:19 <egafford> o/
18:00:26 <vgridnev> #link https://wiki.openstack.org/wiki/Meetings/SaharaAgenda
18:01:22 <vgridnev> o/
18:01:34 <raissa> o/
18:02:16 <tosky> hi
18:02:27 <mionkin_> Hi
18:02:50 <esikachev> hi
18:02:59 <vgridnev> #topic News / updates
18:03:52 <egafford> Not a ton of progress this week I fear; lot of internal release prep stuff. I do have a shared resource model that I like for the image gen CLI now, though.
18:04:00 <tellesnobrega> o/
18:04:20 <mionkin_> I added Kafka in CDH 5.5 and 5.7 in Sahara and sahara elements (patches are on review), also working on fixing ci jobs in sahara and updating grenade job with new CLI calls (patch is on review)
18:04:38 <vgridnev> trying to finish kerberos implementation; internal stuff around testing sahara.
18:04:59 <vgridnev> also some preparations around N3 release, and so on
18:05:09 <esikachev> working on fixing sahara-ci (problem with cinder+ceph)
18:06:01 <tosky> working on importing the last bits of tests not in sahara-tests
18:06:19 <vgridnev> tosky, is going to be done today with Sergey?
18:06:37 <tosky> vgridnev: we agreed on working on it later; I will be online till late
18:06:49 <SergeyLukjanov> I'm looking on the patches right now
18:06:49 <vgridnev> ok, cool
18:06:54 <tosky> at least the import, then we need to have the "allows merges" review approved by infra
18:07:06 <vgridnev> SergeyLukjanov, hei
18:07:20 <tosky> then I can push the merge patches, and then I can push the WIP review to move the code to the right place :)
18:08:37 <vgridnev> #topic Summit preparations: Episode 1
18:09:22 <tosky> (I prefer when it start from Episode IV, but...)
18:09:40 <vgridnev> so, we need to understand how many fishbowls, workrooms we need
18:09:46 <egafford> tosky: I felt like someone was going to go there, and gave it pretty good odds of being you. ;)
18:10:22 <egafford> Our fishbowls were pretty sparsely attended last time.
18:10:48 <SergeyLukjanov> ++ for starting from ep. IV (much better karma)
18:11:30 <vgridnev> so, I thinking that something around previous values is good (probably we can take smaller values). we did discussions of EDP, Horizon, Tests, API, Release models, Image generations
18:12:11 <egafford> I think that those continue to be our broad areas of concern.
18:12:12 <vgridnev> + contributors meetup (halfday)
18:14:02 <vgridnev> so, last time we had 2fb, 6wr, cm:half, so I think we can take probably 6-7 workrooms
18:14:42 <egafford> Yeah, I'm uncertain that we need any fishbowls.
18:15:10 <egafford> Depends on the workroom size I guess, but they've always been big enough.
18:15:58 <vgridnev> ok, thanks folks on feedback, I will send request soon
18:16:28 <vgridnev> #topic Newton 3 release: features status
18:17:43 <vgridnev> since today, we have 2 weeks to complete feature without FFE, + around 2 weeks before RC1
18:18:43 <egafford> The image gen CLI abstraction is totally usable and has no plugin implementations yet. I think ambari is nearly done, and the others will be easier because they use a lot of shared resources and framework, but testing will tell.
18:19:04 <vgridnev> so I want to be sure that everything that we need is landed; let's track features at etherpad
18:19:08 <vgridnev> #link https://etherpad.openstack.org/p/sahara-review-priorities
18:19:55 <vgridnev> I'm inviting assignees to complete this etherpad
18:20:02 <tosky> I couldn't check them, but shouldn't we include also the changes for pagination (unless the last patch I've seen was merged)?
18:20:45 <vgridnev> egafford, I think that we can grant FFE for image generation, since it doesn't touch provisioning part a lot
18:21:40 <vgridnev> tosky, I'll include that to etherpad, the only change that we need is a dashboard
18:22:02 <egafford> vgridnev: Yeah, it's very safe, especially if we don't enable the validation piece (which is completely possible, per plugin.)
18:22:39 <egafford> Just adding the get_image_argument and pack_image methods has literally zero impact on the sahara service.
18:23:00 <vgridnev> on plugins api there are few dashboard changes which have to be landed
18:23:41 <vgridnev> that also will help us to find right change to review
18:23:59 <vgridnev> ok, next topic
18:24:08 <vgridnev> #topic CI status
18:25:23 <vgridnev> esikachev, it's your time. Actually I know that most probably we will have kind of unstable but working of CI version tomorrow; there were some issues around ceph + cinder integration
18:26:31 <esikachev> yes, we have a problem with cinder+ceph, as workaround we try to use LVM as backend for cinder
18:27:01 <esikachev> devops tell me that this method was unstable
18:27:13 <esikachev> *told
18:27:33 <esikachev> now, 42 lab uses LVM
18:28:40 <esikachev> i plan reinstall ubuntu on 42-43 labs on xenial in the next weeks
18:28:47 <vgridnev> additionally, since N3 is coming, we need to have a plan how changes can be merged if CI is not working well. I'll delegate some engineers ( mionkin, mikhail) to perform manual tests of changes. Fairly we don't have large features which can potentially break everything
18:29:10 <vgridnev> what do you think team?
18:29:36 <esikachev> maybe, we can add some gate temporary jobs?
18:29:47 <tosky> manual tests for each changes (so for each backend) is... heavy
18:29:49 <esikachev> for example spark, vanilla
18:30:07 <vgridnev> esikachev, I'm not quite sure that it will be working well.
18:30:31 <tosky> if only nested virtualization was enabled and reliable on the main gates...
18:31:00 <vgridnev> So, current fake plugin tests are running with 2 very small vms
18:31:20 <esikachev> let's check it on testcommits in sahara-tests
18:32:46 <vgridnev> esikachev, how to upload vanilla image to node where tests are performed?
18:33:06 <vgridnev> wget from sahara-files is not good idea
18:33:18 <esikachev> :(
18:35:04 <vgridnev> the only problem with that approach is that we have A LOT of mapr changes that have to be tested
18:35:21 <vgridnev> the current list
18:35:23 <vgridnev> https://review.openstack.org/#/q/owner:groghkov%2540maprtech.com+status:open
18:35:54 <vgridnev> oops, there is one change to manila
18:37:39 <esikachev> vgridnev: we can temporary add some filters for jobs for mapr plugin
18:37:53 <esikachev> on sahara-cu
18:37:56 <esikachev> ci
18:38:16 <tosky> esikachev: just to be sure: devops told you that the current configuration (cinder on LVM), is this the current configuration which still fails, or is this instability just going to happen from time to time?
18:39:01 <esikachev> this configuration killed lab for 2-3 days
18:39:15 <esikachev> for example, kernel panic
18:39:21 <tosky> oh, I see
18:39:34 <vgridnev> but it was cycles ago, so probably something was changed
18:39:42 <esikachev> yep
18:40:00 <tosky> but is it the configuration which is enabled now then?
18:40:32 <esikachev> yes, on one devstack
18:42:03 <vgridnev> so, I think that we can accept manual tests for some exceptions (like code that touches only cdh plugin or ambari). In many cases like around provisioning, it's enough to perform tests on fake plugin
18:42:10 <vgridnev> at least
18:42:27 <esikachev> ok
18:42:58 <vgridnev> egafford, what do you think about my last statement?
18:44:30 <vgridnev> egafford seems to be afk
18:45:00 <egafford> Sorry; mildly distracted.
18:45:29 <tosky> esikachev: is there any hope in fixing the issues with ceph and restoring the previous state? Are the current failures mostly (or just) due to timeout because of cinder on LVM?
18:45:44 <egafford> Hm... this is really tricky.
18:47:06 <egafford> I suppose I think that realistically, we won't be able to sustain development without functioning CI, and also that we know that from time to time in extreme circumstances we need to stop listening to CI that's too broken.
18:47:42 <esikachev> tosky: if cinder+lvm are going to work bad i will reinstall ubuntu and restore cinder+ceph
18:47:57 <tosky> ok
18:48:28 <egafford> So I guess I think that making a policy of "these are the circumstances under which we ignore CI" is nearly doomed; it's always going to be a mix of time pressure and the test halo impact of the change.
18:49:22 <egafford> I tend to agree that on plugin-specific changes, it can be okay to approve with manual testing; I actually really dislike the idea of approving core changes using only the fake plugin.
18:50:06 <egafford> That works in a model in which vendors are supporting their own plugins and maintaining (like Cinder); I don't think it works very well here. Our plugins are ours, and they are the application; if they're low-quality, so are we.
18:50:12 <egafford> Does that make sense?
18:51:38 <vgridnev> So, we can discuss each possible exception of approving without CI on core changes
18:52:09 <egafford> That'd be my suggestion, yeah.
18:52:17 <tosky> yep
18:52:20 <egafford> Evaluate the risk.
18:52:26 <esikachev> vgridnev: maybe try to disable cinder volumes in scenario yaml files?
18:52:48 <esikachev> mapr use ephemeral drive
18:53:45 <vgridnev> esikachev, that makes sense. We can create copies for templates but without volumes)
18:54:06 <tellesnobrega> im with egafford on this, evaluate each case on core changes
18:54:11 <vgridnev> Or we can create a commit that will be removed later
18:54:21 <vgridnev> reverted*
18:55:17 <vgridnev> I like a variant with copy of templates (with -volumes suffix)
18:56:09 <vgridnev> tosky, what do you think about that ^^ ?
18:57:11 <vgridnev> by default we will not test with volumes. Probably not all users can use volumes
18:57:59 <vgridnev> so, 3 mins left
18:58:40 <tosky> uhm, yeah, we should maybe think a way to not duplicate all the contents of the yamls
18:59:14 <vgridnev> we can continue discussing on openstack-sahara. thanks everyone for attending!
18:59:20 <vgridnev> #endmeeting