14:00:06 <tellesnobrega> #startmeeting sahara
14:00:06 <openstack> Meeting started Thu Jun 21 14:00:06 2018 UTC and is due to finish in 60 minutes. The chair is tellesnobrega. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:07 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:10 <openstack> The meeting name has been set to 'sahara'
14:00:21 <tosky> o/
14:01:07 <jeremyfreudberg> o/
14:01:38 <zchkun> o/
14:02:42 <tellesnobrega> #topic News/Updates
14:03:21 <tosky> everything is aws^H^H^H^Hbroken
14:03:22 <tosky> ehm
14:03:48 <tellesnobrega> even your keyboard lol
14:04:38 <tosky> short version: a keystone change (new default roles) uncovered a not well-known assumption (roles are not case-sensitive)
14:04:53 <tosky> and a bug in trust handling (which Jeremy reproduced, and posted some patches which are not yet merged)
14:05:22 <tosky> that's why we can't merge anything
14:05:38 <tellesnobrega> I'm working on some bugs: the extjs one and the ambari TLS v1 one in sahara-image-elements (I will bring this up later under its own topic); I'm also waiting on a patch to openstackclient to continue the boot-from-volume work
14:05:44 <jeremyfreudberg> my summary: i spent the beginning of the week diagnosing and proposing fixes to keystone, and yesterday working on s3... running into a lot of last-minute jar/classpath problems
14:06:11 <tosky> oh, my summary: I played with devstack-plugin-ceph, which is finally able to deploy a working radosgw
14:06:33 <tosky> so I'm testing a scenario job with radosgw instead of swift, useful for two reasons:
14:06:37 <tosky> - testing of S3 too
14:06:59 <jeremyfreudberg> (tosky: i find the failed cluster scaling on the radosgw job really strange)
14:07:00 <tosky> - testing of python 3 (swift is not ported, and a mixed environment with only swift on python 2 does not work right now)
14:07:29 <tosky> jeremyfreudberg: probably resources, it looked like a timeout; I recreated the test disabling the ceph integration with nova, glance and cinder, and it's passing
14:07:29 <tellesnobrega> also, I'm working on updating the storm plugin versions, and later spark
14:07:53 <jeremyfreudberg> tosky: ah, makes sense
14:09:52 <tellesnobrega> zchkun, any news from you?
14:10:28 <zchkun> sorry, no progress has been made this week, but I'd like to pick up some other work in my spare time, I just don't know what yet
14:11:26 <tellesnobrega> ok, let's move on. zchkun, I may have some work for you, if you want
14:11:44 <zchkun> ok, no problem
14:12:16 <tellesnobrega> #topic Ambari TLS v1
14:12:54 <tellesnobrega> we have a bug filed by tosky about the update that breaks ambari versions below 2.4.3.0
14:13:29 <tellesnobrega> we started an etherpad to gather info on it so we can try to reach a decision on how to properly fix this issue
14:13:33 <tellesnobrega> #link https://etherpad.openstack.org/p/sahara-ambari-tls
14:14:45 <tosky> it's a complicated issue, the original story contains more details
14:14:53 <tosky> uh, we need a link to the story in the etherpad
14:15:04 <tellesnobrega> please take a look at it and get familiar with it so we can discuss later
14:15:07 <tellesnobrega> I will update it
14:15:36 <tellesnobrega> done
14:16:32 <tosky> the short version: with recent updates of java, ambari-agent tries to use a now-deprecated version of encryption while talking to ambari-server, and it fails
14:17:19 <tosky> the latest Ambari 2.4.x fixes it, which is good if you use queens and sahara-image-pack for the images (it was fixed there), but not sahara-image-elements; and that means that pike is broken
14:18:49 <tosky> there are a few possible solutions for how to handle this, with different degrees of complication
14:18:51 <tosky> but yeah
14:18:52 <tosky> life
14:20:22 <tellesnobrega> I don't think we will have a solution today, so please read the etherpad, see the options, and let's try to get a solution asap
14:21:08 <tosky> I just added another option
14:21:49 <tellesnobrega> cool
14:22:23 <tellesnobrega> we could try to get it down to the 2 or 3 best solutions
14:22:29 <tellesnobrega> and try to decide from there
14:22:32 <tellesnobrega> let's see how it goes
14:23:50 <tellesnobrega> tosky, do you want to go over the options now?
14:24:53 <tosky> so, options 1 and 2 are about pinning: either older system packages, or per ambari version
14:25:28 <tosky> this seems easy but it leaves out a lot of updates, and it may not be so quick to go back to older versions
14:25:40 <tosky> 3 is about dropping support (the kthxbye solution)
14:25:53 <tosky> support for ambari in pike, basically
14:26:06 <tosky> and ambari/sahara-image-elements from queens
14:26:07 <tellesnobrega> I guess that one is the worst one
14:26:13 <tellesnobrega> pike is too soon
14:27:01 <tosky> 4 requires coding: create an (ugly) patch to be applied to the version of ambari-agent used (2.2.1.0 and 2.2.0.x, the versions that we use) which forces the new protocol
14:27:36 <tosky> 5 is: check whether Ambari 2.4, available with sahara-image-elements too in some older versions, works
14:27:47 <tosky> there have been reports of bugs, though, so it would need testing
14:28:25 <tosky> 6 (just added) is: allow new packages, but relax security a bit for those older versions, and re-enable the old SSL protocols *if possible*
14:28:32 <tosky> (not sure if they have been removed at compile time)
14:28:42 <tosky> it won't be worse than what users have now
14:28:46 <tosky> that's it
14:29:59 <tellesnobrega> ok, I guess 4, 5 and 6 are the best ones
14:31:45 <tellesnobrega> I like 4, without the (ugly)
14:31:56 <tellesnobrega> and 5 is the best one if it works, I guess
14:32:09 <tellesnobrega> we could start testing 5 and, if it fails, fall back to 4
14:34:29 <tellesnobrega> tosky, does that make sense to you? jeremyfreudberg?
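
For context on option 4: ambari-agent is a Python service, so "forcing the new protocol" essentially means pinning the agent's TLS-wrapped socket to TLSv1.2 instead of letting it fall back to the protocols that the newer JDK disables on the server side. The snippet below is only a rough sketch of that idea, not the actual patch being discussed; the server host, port, and function name are hypothetical placeholders, and newer ambari-agent releases reportedly expose a force_https_protocol option in ambari-agent.ini for the same purpose, which would need checking against the 2.2.x agents mentioned above.

import socket
import ssl

# Hypothetical placeholders; the real agent reads these from its config.
AMBARI_SERVER = "ambari-server.example.com"
AGENT_PORT = 8441

def connect_to_ambari_server():
    """Open the agent->server socket, negotiating TLSv1.2 only."""
    sock = socket.create_connection((AMBARI_SERVER, AGENT_PORT))
    # Pin the handshake to TLSv1.2 so the now-disabled older protocols
    # are never attempted.
    context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
    # Ambari agents commonly talk to a server with a self-signed certificate,
    # so certificate verification is left out of this sketch.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context.wrap_socket(sock)
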
14:34:48 <jeremyfreudberg> i think 5 is worth trying
14:35:23 <tellesnobrega> I think so too
14:36:09 <tellesnobrega> let's do it this way and move on from there
14:36:13 <tosky> I really don't know; the article on the cloudera site that explained the issue does not say which change exactly happened in the jdk, but maybe 6 could be easy
14:36:19 <tosky> anyway, yeah, let's move on
14:36:24 <tosky> there is at least one other big topic
14:36:56 <tellesnobrega> we can have 6 too, but I would like to try 5 first
14:37:07 <tellesnobrega> which topic are you thinking about? the keystone issue?
14:37:37 <tosky> the exact status of S3, and I think that jeremyfreudberg has some updates there
14:37:51 <tellesnobrega> ok
14:38:00 <tellesnobrega> let me just run a quick topic
14:38:02 <tellesnobrega> and then we move on to that
14:38:10 <jeremyfreudberg> yup
14:38:13 <tellesnobrega> #topic Plugins upgrade
14:39:08 <tellesnobrega> right now we are still missing storm 1.2*, spark 2.3 and 2.2.1, and mapr 6
14:39:13 <tellesnobrega> cdh 5.13 is there
14:39:22 <tellesnobrega> probably we are not doing 5.14
14:39:41 <tellesnobrega> and maybe we skip 5.12
14:39:57 <tellesnobrega> zchkun, that is the work that you could do, if you want
14:40:01 <tellesnobrega> work on mapr 6
14:40:08 <tellesnobrega> I'm working on storm and will do spark after
14:40:19 <zchkun> ok
14:41:06 <tellesnobrega> let me know if you need help
14:41:08 <tellesnobrega> ok
14:41:11 <zchkun> does mapr 6 need to be merged before rocky?
14:41:21 <tellesnobrega> that is the plan
14:41:36 <tosky> we would like to do it at least :)
14:41:55 <tellesnobrega> do you think you can handle that?
14:41:58 <zchkun> ok, I'll do my best
14:42:04 <tellesnobrega> thanks
14:42:07 <jeremyfreudberg> it's not absolutely essential
14:42:13 <tellesnobrega> let me know if you need my help
14:42:24 <jeremyfreudberg> btw, if you look at the git history, this will be the first mapr upgrade done without any mapr people
14:42:24 <tellesnobrega> yes, not essential, but desirable
14:43:18 <zchkun> ok, I'll get to know it first
14:43:42 <jeremyfreudberg> yes, even if the upgrade is not finished, it will be very nice to have somebody with knowledge of it
14:43:56 <jeremyfreudberg> so, thanks zchkun for being willing to learn
14:44:46 <zchkun> ok, thanks :)
14:45:46 <tellesnobrega> cool
14:45:53 <tellesnobrega> tosky, what is the topic you want now?
14:46:05 <tosky> S3 :)
14:46:23 <tellesnobrega> #topic S3
14:46:55 <tosky> sooo, support for S3; we have the basics in place and working correctly with later versions of Vanilla
14:47:09 <tosky> but talking with jeremyfreudberg, it looks like the status for vendor plugins is a bit more complex
14:47:26 <jeremyfreudberg> well, not just vendor plugins, edp with s3 data sources is getting complicated everywhere
14:47:43 <tosky> ok, could you summarize the status?
14:47:59 <jeremyfreudberg> yes, typing
14:48:41 <jeremyfreudberg> to put it in one sentence, there are a lot of unseen issues that i missed during my manual testing and that are now popping up when doing "real" edp
14:48:46 <jeremyfreudberg> 1) some problem with the oozie classpath
14:48:59 <jeremyfreudberg> 2) some problem with the older hadoop-aws jar
14:49:27 <jeremyfreudberg> 3) some problem with spark jobs against s3 using an incompatible version of hadoop-common
14:49:45 <jeremyfreudberg> 2 and 3 will both involve some jar patching or replacing
14:50:35 <jeremyfreudberg> 1 is a bit of a mystery, working on it now
14:50:45 <jeremyfreudberg> basically, i will need to be making image changes, again
14:50:52 <tosky> is it something that will end up like the fork of hadoop-swiftfs, or is a lighter solution possible? (additional jars, for example)
14:52:31 <jeremyfreudberg> an additional jar won't help, really what we need is an edit
14:54:12 <tosky> an edit means a change in the jar code used by the plugin and its rebuild?
14:55:03 <jeremyfreudberg> we will probably have to host jars on tarballs.o.o
14:55:26 <jeremyfreudberg> it won't be an actual fork, though
14:56:19 <tosky> if it's not a fork and it's not an additional jar, what is it going to be? :)
14:57:10 <jeremyfreudberg> maybe something like https://docs.oracle.com/javase/tutorial/deployment/jar/update.html to edit the jar from central.maven.org
14:57:25 <jeremyfreudberg> or otherwise, just a patch file to apply to the source code hosted elsewhere
14:57:47 <tosky> oh, patch a jar file, I see
14:57:50 <jeremyfreudberg> an "actual fork" would mean hosting the entire source code of whatever library, and i don't want to do that
14:58:33 <tosky> right, but from the point of view of our workflow, apart from the cost of updating it later to the upstream version, it's not much different
14:58:56 <tosky> so does that mean that the support for S3 that Cloudera advertises is not enough for our usage?
14:59:06 <tosky> and more importantly: do we have a roadmap/table of what to do?
14:59:13 * jeremyfreudberg looks at the clock
14:59:20 <tosky> (we are running out of time, we can continue on #openstack-sahara)
14:59:37 <jeremyfreudberg> the answer is the cloudera jar is insufficient
15:00:03 <tellesnobrega> let me close the meeting
15:00:10 <tellesnobrega> we'll continue on #openstack-sahara
15:00:17 <tellesnobrega> #endmeeting
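
A note on the jar-edit approach discussed above: the Oracle page jeremyfreudberg links to describes replacing individual entries inside an existing archive (jar uf some.jar path/to/Patched.class) rather than forking the whole library. Since a jar is just a zip file, the same edit can be scripted; the sketch below is a rough illustration under that assumption, and the jar name, entry path, and patched class file are hypothetical examples, not the actual artifacts that would be published on tarballs.o.o.

import zipfile

# Hypothetical example inputs: an upstream jar from central.maven.org, a single
# rebuilt .class file, and the name of the edited jar to publish.
ORIGINAL_JAR = "hadoop-aws-2.7.3.jar"
PATCHED_JAR = "hadoop-aws-2.7.3-patched.jar"
REPLACEMENTS = {
    "org/apache/hadoop/fs/s3a/S3AFileSystem.class": "build/S3AFileSystem.class",
}

def patch_jar(src, dst, replacements):
    """Copy src to dst, swapping the listed entries for locally patched files."""
    with zipfile.ZipFile(src) as zin, \
         zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            if item.filename in replacements:
                with open(replacements[item.filename], "rb") as patched:
                    zout.writestr(item, patched.read())
            else:
                zout.writestr(item, zin.read(item.filename))

patch_jar(ORIGINAL_JAR, PATCHED_JAR, REPLACEMENTS)
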