*** sgotliv has quit IRC | 00:56 | |
*** ekarlso has quit IRC | 01:30 | |
*** ekarlso has joined #openstack-sahara | 01:45 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/python-saharaclient: Updated from global requirements https://review.openstack.org/281007 | 02:08 |
---|---|---|
*** egafford has quit IRC | 02:38 | |
*** crobertsrh is now known as _crorh | 02:40 | |
*** coolsvap|away is now known as coolsvap | 02:52 | |
*** dave-mccowan has quit IRC | 04:22 | |
*** links has joined #openstack-sahara | 04:41 | |
*** links has quit IRC | 04:57 | |
*** itisha has quit IRC | 05:17 | |
*** Poornima has joined #openstack-sahara | 05:46 | |
*** apavlov has joined #openstack-sahara | 05:48 | |
*** nkrinner has joined #openstack-sahara | 05:56 | |
*** sgotliv has joined #openstack-sahara | 06:25 | |
*** apavlov has quit IRC | 06:45 | |
-openstackstatus- NOTICE: A problem with the mirror used for CI jobs in the rax-iad region has been corrected. Please recheck changes that recently failed jobs on nodes in rax-iad. | 06:49 | |
*** akuznetsov has joined #openstack-sahara | 07:01 | |
*** akuznetsov has quit IRC | 07:04 | |
*** akuznetsov has joined #openstack-sahara | 07:04 | |
*** akuznetsov has quit IRC | 07:04 | |
*** akuznetsov has joined #openstack-sahara | 07:05 | |
*** akuznetsov has quit IRC | 07:10 | |
openstackgerrit | Jaxon Wang proposed openstack/sahara-tests: Add CDH 5.5.0 scenario test https://review.openstack.org/281092 | 07:32 |
*** rcernin has joined #openstack-sahara | 07:42 | |
openstackgerrit | Jaxon Wang proposed openstack/sahara-tests: Add more infomation when create cluster failed for scenario test https://review.openstack.org/281095 | 07:42 |
openstackgerrit | lu huichun proposed openstack/sahara: [EDP] Add suspend_job() for sahara edp engine(oozie implementation) https://review.openstack.org/201448 | 08:00 |
openstackgerrit | Merged openstack/sahara: Adding doc about distributed periodics https://review.openstack.org/276682 | 08:27 |
*** _degorenko|afk is now known as degorenko | 08:28 | |
*** pcaruana has joined #openstack-sahara | 08:42 | |
openstackgerrit | Michael Ionkin proposed openstack/sahara: Added scaling support for HDP 2.2 / 2.3 https://review.openstack.org/193081 | 08:42 |
*** nkrinner has quit IRC | 09:22 | |
*** nkrinner has joined #openstack-sahara | 09:22 | |
openstackgerrit | Michael Ionkin proposed openstack/sahara: Added scaling support for HDP 2.2 / 2.3 https://review.openstack.org/193081 | 09:46 |
*** pino|work has quit IRC | 09:52 | |
*** dmitryme has quit IRC | 09:52 | |
*** zhiyan has quit IRC | 09:52 | |
*** dmitryme has joined #openstack-sahara | 09:53 | |
*** pino|work has joined #openstack-sahara | 09:53 | |
*** zhiyan has joined #openstack-sahara | 09:58 | |
*** zhiyan has quit IRC | 10:03 | |
*** rcernin has quit IRC | 10:03 | |
*** rcernin has joined #openstack-sahara | 10:04 | |
*** Poornima has quit IRC | 10:08 | |
*** tmckay has quit IRC | 10:08 | |
*** chlong has quit IRC | 10:08 | |
*** aignatov has quit IRC | 10:08 | |
*** Poornima has joined #openstack-sahara | 10:08 | |
*** chlong has joined #openstack-sahara | 10:08 | |
*** aignatov has joined #openstack-sahara | 10:09 | |
*** tmckay has joined #openstack-sahara | 10:10 | |
*** zhiyan has joined #openstack-sahara | 10:12 | |
openstackgerrit | Xi Yang proposed openstack/sahara: Remove cinder v1 api support https://review.openstack.org/270623 | 10:25 |
openstackgerrit | Evgeny Sikachev proposed openstack/sahara-ci-config: Add CDH 5.5.0 to sahara-ci https://review.openstack.org/281180 | 10:39 |
*** esikachev has joined #openstack-sahara | 10:43 | |
openstackgerrit | Evgeny Sikachev proposed openstack/sahara-ci-config: Add CDH 5.5.0 to sahara-ci https://review.openstack.org/281180 | 10:46 |
*** esikachev has quit IRC | 10:49 | |
*** dmitryme has quit IRC | 10:49 | |
*** DuncanT has quit IRC | 10:49 | |
*** al_indig_ has quit IRC | 10:49 | |
*** logan- has quit IRC | 10:49 | |
*** al_indigo has joined #openstack-sahara | 10:49 | |
*** dmitryme has joined #openstack-sahara | 10:50 | |
*** logan- has joined #openstack-sahara | 10:52 | |
*** raildo-afk has quit IRC | 10:54 | |
*** raildo-afk has joined #openstack-sahara | 10:57 | |
*** DuncanT has joined #openstack-sahara | 11:02 | |
*** rcernin has quit IRC | 11:03 | |
*** al_indigo has quit IRC | 11:04 | |
*** zhiyan has quit IRC | 11:04 | |
*** Poornima has quit IRC | 11:04 | |
*** witlessb has quit IRC | 11:04 | |
*** Erming__ has quit IRC | 11:04 | |
*** _crorh has quit IRC | 11:04 | |
*** elmiko has quit IRC | 11:04 | |
*** NikitaKonovalov has quit IRC | 11:04 | |
*** NikitaKonovalov has joined #openstack-sahara | 11:04 | |
*** al_indigo has joined #openstack-sahara | 11:04 | |
*** Erming has joined #openstack-sahara | 11:04 | |
*** Poornima has joined #openstack-sahara | 11:04 | |
*** witlessb has joined #openstack-sahara | 11:05 | |
*** crobertsrh has joined #openstack-sahara | 11:07 | |
*** elmiko has joined #openstack-sahara | 11:09 | |
openstackgerrit | Merged openstack/sahara-ci-config: Add CDH 5.5.0 to sahara-ci https://review.openstack.org/281180 | 11:12 |
openstackgerrit | Vitaly Gridnev proposed openstack/sahara: [wip] implement sending health notifications https://review.openstack.org/281194 | 11:12 |
*** zhiyan has joined #openstack-sahara | 11:16 | |
*** esikachev has joined #openstack-sahara | 11:22 | |
*** crobertsrh has quit IRC | 11:39 | |
*** logan- has quit IRC | 11:39 | |
*** aignatov has quit IRC | 11:39 | |
*** krotscheck has quit IRC | 11:39 | |
*** alazarev has quit IRC | 11:39 | |
*** _mattf has quit IRC | 11:39 | |
*** alazarev has joined #openstack-sahara | 11:40 | |
*** _mattf has joined #openstack-sahara | 11:40 | |
*** aignatov has joined #openstack-sahara | 11:40 | |
*** crobertsrh has joined #openstack-sahara | 11:40 | |
*** logan- has joined #openstack-sahara | 11:42 | |
*** krotscheck has joined #openstack-sahara | 11:42 | |
*** egafford has joined #openstack-sahara | 11:46 | |
*** rcernin has joined #openstack-sahara | 11:49 | |
*** aignatov has quit IRC | 11:50 | |
*** tmckay has quit IRC | 11:50 | |
*** bapalm has quit IRC | 11:50 | |
*** degorenko has quit IRC | 11:50 | |
*** bapalm has joined #openstack-sahara | 11:50 | |
*** aignatov has joined #openstack-sahara | 11:50 | |
*** tmckay has joined #openstack-sahara | 11:53 | |
*** degorenko has joined #openstack-sahara | 11:53 | |
*** openstack has joined #openstack-sahara | 12:04 | |
*** htruta has joined #openstack-sahara | 12:06 | |
*** hogepodge has quit IRC | 12:07 | |
openstackgerrit | Evgeny Sikachev proposed openstack/sahara-tests: [wip]Fix using proxy node for checks https://review.openstack.org/279447 | 12:10 |
*** coolsvap is now known as coolsvap|away | 12:11 | |
openstackgerrit | Evgeny Sikachev proposed openstack/sahara-tests: [wip] put input datasources to hdfs https://review.openstack.org/280701 | 12:13 |
*** vgridnev has joined #openstack-sahara | 12:21 | |
*** witlessb has quit IRC | 12:24 | |
*** DuncanT has quit IRC | 12:24 | |
*** nkrinner has quit IRC | 12:24 | |
*** kgalanov has quit IRC | 12:24 | |
*** sreshetn1ak has quit IRC | 12:24 | |
*** jamielennox has quit IRC | 12:24 | |
*** zigo has quit IRC | 12:24 | |
*** sreshetnyak has joined #openstack-sahara | 12:25 | |
*** nkrinner has joined #openstack-sahara | 12:25 | |
*** nkrinner has quit IRC | 12:25 | |
*** nkrinner has joined #openstack-sahara | 12:25 | |
*** zigo has joined #openstack-sahara | 12:26 | |
*** witlessb has joined #openstack-sahara | 12:27 | |
*** jamielennox has joined #openstack-sahara | 12:30 | |
*** elmiko has quit IRC | 12:32 | |
*** NikitaKonovalov has quit IRC | 12:32 | |
*** agireud has quit IRC | 12:32 | |
*** openstackgerrit_ has quit IRC | 12:32 | |
*** openstackgerrit has quit IRC | 12:32 | |
*** SergeyLukjanov has quit IRC | 12:32 | |
*** egafford has quit IRC | 12:32 | |
*** elmiko has joined #openstack-sahara | 12:32 | |
*** elmiko has quit IRC | 12:32 | |
*** elmiko has joined #openstack-sahara | 12:32 | |
*** NikitaKonovalov has joined #openstack-sahara | 12:32 | |
*** openstackgerrit has joined #openstack-sahara | 12:33 | |
*** openstackgerrit_ has joined #openstack-sahara | 12:34 | |
*** vgridnev_ has joined #openstack-sahara | 12:36 | |
*** SergeyLukjanov has joined #openstack-sahara | 12:37 | |
*** agireud has joined #openstack-sahara | 12:38 | |
*** vgridnev has quit IRC | 12:38 | |
*** DuncanT has joined #openstack-sahara | 12:41 | |
*** kgalanov has joined #openstack-sahara | 12:43 | |
*** raildo-afk is now known as raildo | 13:09 | |
*** n-anzen has quit IRC | 13:10 | |
*** kgalanov has quit IRC | 13:20 | |
*** kgalanov has joined #openstack-sahara | 13:22 | |
*** raildo is now known as raildo-afk | 13:24 | |
*** esikachev has quit IRC | 13:25 | |
*** egafford has joined #openstack-sahara | 13:27 | |
*** raildo-afk is now known as raildo | 13:29 | |
*** vgridnev_ has quit IRC | 13:32 | |
*** vgridnev has joined #openstack-sahara | 13:32 | |
*** dave-mccowan has joined #openstack-sahara | 13:33 | |
*** esikachev has joined #openstack-sahara | 13:34 | |
openstackgerrit | Merged openstack/sahara: CDH plugin versionhandler refactoring https://review.openstack.org/261192 | 13:34 |
openstackgerrit | Merged openstack/sahara: Add test cases for CDH plugin config_helper https://review.openstack.org/253494 | 13:35 |
openstackgerrit | Vitaly Gridnev proposed openstack/sahara: base cluster verifications implementation https://review.openstack.org/273587 | 13:36 |
*** vgridnev_ has joined #openstack-sahara | 13:39 | |
*** vgridnev has quit IRC | 13:42 | |
*** dhellmann has quit IRC | 13:44 | |
*** dhellmann has joined #openstack-sahara | 13:47 | |
openstackgerrit | Vitaly Gridnev proposed openstack/sahara: cloudera health checks implementation https://review.openstack.org/279007 | 13:47 |
openstackgerrit | Vitaly Gridnev proposed openstack/sahara: base cluster verifications implementation https://review.openstack.org/273587 | 13:47 |
openstackgerrit | Vitaly Gridnev proposed openstack/sahara: ambari health check implementation https://review.openstack.org/280203 | 13:50 |
openstackgerrit | Vitaly Gridnev proposed openstack/sahara: implement sending health notifications https://review.openstack.org/281194 | 13:50 |
*** vgridnev_ has quit IRC | 13:58 | |
*** hogepodge has joined #openstack-sahara | 14:00 | |
*** crobertsrh1 has joined #openstack-sahara | 14:06 | |
*** vgridnev_ has joined #openstack-sahara | 14:07 | |
*** egafford has quit IRC | 14:09 | |
openstackgerrit | Merged openstack/sahara: Replace assertNotEqual(None,) with assertIsNotNone https://review.openstack.org/280788 | 14:09 |
openstackgerrit | Vitaly Gridnev proposed openstack/sahara: cloudera health checks implementation https://review.openstack.org/279007 | 14:11 |
*** Poornima has quit IRC | 14:17 | |
openstackgerrit | Evgeny Sikachev proposed openstack/sahara-tests: Fix using proxy node for checks https://review.openstack.org/279447 | 14:26 |
openstackgerrit | Evgeny Sikachev proposed openstack/sahara-tests: Disable ssl_verify as default https://review.openstack.org/280762 | 14:28 |
*** vgridnev__ has joined #openstack-sahara | 14:30 | |
*** vgridnev_ has quit IRC | 14:33 | |
*** witlessb has quit IRC | 14:43 | |
*** witlessb has joined #openstack-sahara | 14:44 | |
openstackgerrit | Vitaly Gridnev proposed openstack/sahara: ambari health check implementation https://review.openstack.org/280203 | 14:47 |
openstackgerrit | Evgeny Sikachev proposed openstack/sahara-tests: Add autoregistering of image https://review.openstack.org/281315 | 14:50 |
*** tmckay has quit IRC | 14:51 | |
*** vgridnev__ has quit IRC | 14:51 | |
*** vgridnev__ has joined #openstack-sahara | 14:52 | |
*** vgridnev__ has quit IRC | 14:55 | |
*** vgridnev__ has joined #openstack-sahara | 14:56 | |
*** vgridnev__ has quit IRC | 14:58 | |
*** dhellmann has quit IRC | 15:05 | |
*** dhellmann has joined #openstack-sahara | 15:05 | |
*** vgridnev__ has joined #openstack-sahara | 15:08 | |
rickflare | morning | 15:15 |
elmiko | hi | 15:16 |
rickflare | hey elmiko how are you? | 15:16 |
elmiko | sleepy... | 15:16 |
elmiko | you? | 15:16 |
rickflare | same | 15:16 |
rickflare | hoping tmckay is back | 15:16 |
rickflare | so we can finish this spark stuff | 15:17 |
rickflare | i think I have finally gotten over my neutron networking hurdle | 15:17 |
elmiko | i'm sure he'll be around at some point | 15:17 |
elmiko | \o/ | 15:17 |
crobertsrh | rickflare: where did you leave off? | 15:19 |
crobertsrh | tmckay sent me a text saying that you guys ran into something with a spark 1.6 classpath? Is that right? | 15:20 |
*** Erming has quit IRC | 15:20 | |
elmiko | vgridnev__: saw this yesterday, thought you might find it interesting: http://blog.kortar.org/?p=279 | 15:20 |
*** Erming has joined #openstack-sahara | 15:20 | |
rickflare | so | 15:21 |
rickflare | we got the job | 15:21 |
rickflare | but it kept saying done with errors | 15:21 |
rickflare | i also found a bug with horizon | 15:21 |
rickflare | when you add the binary path in switf | 15:21 |
rickflare | swift | 15:22 |
rickflare | if you add swift://container/binary | 15:22 |
rickflare | horizon will put it in as swift://swift://container/binary | 15:22 |
crobertsrh | That rings a bell. It may have been written up already. I'll double check in launchpad. | 15:22 |
rickflare | and then we will not be able to delete it | 15:22 |
crobertsrh | It might have already been fixed for a future release. | 15:22 |
elmiko | that bug has been fixed | 15:23 |
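[Editor's note] The doubled prefix described above (`swift://container/binary` stored as `swift://swift://container/binary`) is the kind of bug an idempotent normalizer avoids. The following is a minimal sketch only, not the actual Horizon fix; the function name `normalize_swift_url` is hypothetical.

```python
SWIFT_SCHEME = "swift://"


def normalize_swift_url(user_input: str) -> str:
    """Return the location with exactly one leading swift:// prefix.

    Hypothetical helper for illustration; not code from Horizon or Sahara.
    """
    path = user_input.strip()
    # Strip every leading copy of the scheme so that input which already
    # carries the prefix is not prefixed a second time.
    while path.startswith(SWIFT_SCHEME):
        path = path[len(SWIFT_SCHEME):]
    return SWIFT_SCHEME + path
```

With this sketch, `container/binary`, `swift://container/binary`, and `swift://swift://container/binary` all normalize to the same value, so the stored URL no longer depends on whether the user typed the scheme.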
crobertsrh | thanks elmiko: it was sounding really familiar | 15:24 |
rickflare | ok | 15:25 |
rickflare | so it's in mitaka | 15:25 |
crobertsrh | Yes | 15:25 |
* rickflare feels like he is actually contributing now. | 15:25 | |
crobertsrh | Absolutely | 15:26 |
rickflare | yea so we were having a class path issue | 15:26 |
crobertsrh | Any bugs that you do find, definitely write-up on launchpad: https://bugs.launchpad.net/sahara | 15:26 |
rickflare | i was telling tmckay | 15:27 |
rickflare | I hope to prep to make a big sell to my customers on using sahara | 15:27 |
crobertsrh | Excellent | 15:27 |
rickflare | the next thing I'm really going to need to do is find out how to use sahara image elements | 15:27 |
crobertsrh | that'll be a piece of cake | 15:28 |
rickflare | to be able to produce hardened images that actually work | 15:28 |
rickflare | right now these images are far too loose | 15:28 |
rickflare | and open | 15:28 |
rickflare | they must be locked down | 15:28 |
rickflare | and some may even need to have selinux enabled | 15:28 |
crobertsrh | Ah, I see | 15:28 |
*** vgridnev__ has quit IRC | 15:28 | |
crobertsrh | Might be an interesting bit of work. | 15:29 |
rickflare | so I have salt formulas | 15:29 |
rickflare | that can do a lot of this | 15:29 |
rickflare | but getting salt into the images was not that bad | 15:30 |
rickflare | however it seems like this is something that would be best handled by heat | 15:30 |
rickflare | and I really need to understand more the order in which things happen and where | 15:30 |
rickflare | so I can be certain I am injecting in the correct locations | 15:30 |
rickflare | if that makes sense | 15:30 |
*** vgridnev__ has joined #openstack-sahara | 15:31 | |
crobertsrh | Right. There is a fair amount of documentation for Sahara. Some of it might help you figure out the flow. | 15:31 |
rickflare | yea I am about to start really digging into it | 15:31 |
rickflare | ive been so slowed down by install issues | 15:31 |
rickflare | but I feel they have been resolved | 15:32 |
rickflare | I did another install | 15:32 |
crobertsrh | Yeah, seems like you're up and running nicely | 15:32 |
rickflare | and I now know how to get this integrated now | 15:32 |
rickflare | vlans and more will come later | 15:32 |
rickflare | but for now this is fine | 15:32 |
*** vgridnev__ has quit IRC | 15:35 | |
*** vgridnev__ has joined #openstack-sahara | 15:37 | |
*** vgridnev__ has quit IRC | 15:42 | |
*** vgridnev__ has joined #openstack-sahara | 15:42 | |
*** nkrinner has quit IRC | 15:43 | |
*** esikachev has quit IRC | 15:48 | |
*** vgridnev__ has quit IRC | 15:52 | |
crobertsrh | rickflare: Here is the Sahara blueprint page: https://blueprints.launchpad.net/sahara | 15:52 |
*** vgridnev__ has joined #openstack-sahara | 15:52 | |
rickflare | ok | 15:52 |
*** tmckay has joined #openstack-sahara | 15:52 | |
*** vgridnev__ has quit IRC | 15:52 | |
rickflare | so I can just type up what I'd like to see done and submit it? | 15:52 |
crobertsrh | Yeah. If you have ideas on the "how", feel free to add them there as well. | 15:53 |
rickflare | ok | 15:53 |
rickflare | ill do this soon as I finish that reading | 15:53 |
rickflare | and understand the flow more | 15:54 |
crobertsrh | Once the idea is a bit better baked, you or someone else will write up a "spec", which is much more detailed. | 15:54 |
crobertsrh | Here's an example of what our specs look like: https://review.openstack.org/#/c/245571/10/specs/mitaka/edp-log-enhancement.rst | 15:54 |
rickflare | understood | 15:56 |
openstackgerrit | Merged openstack/sahara-tests: Define variables via args in scenario tests https://review.openstack.org/270699 | 16:00 |
elmiko | did tmckay end up finding a bug for rickflare to fix? | 16:02 |
tmckay | elmiko, no | 16:03 |
tmckay | something weird with tox, crobertsrh and I can see errors but not everywhere | 16:03 |
elmiko | k, i'll try to work up a softball today ;) | 16:03 |
tmckay | so we need something simple. I'm looking into one -- we discovered yesterday that rickflare submitted a spark job without a main class and it let him. That should be an error | 16:04 |
crobertsrh | tmckay: I upgraded tox and pep8, but still see the same errors | 16:04 |
elmiko | i was gonna format another bandit fix for a bug, should be simple | 16:04 |
elmiko | i'm scared to try the tests now... | 16:05 |
tmckay | rickflare, also I need to track down that spark 1.6 issue. What version of sahara are you running? Is it from the git, or a package? | 16:05 |
*** degorenko is now known as _degorenko|afk | 16:05 | |
elmiko | crobertsrh, tmckay, fyi, i've started running tox from a venv as the fedora versions have caused issues for me | 16:06 |
tmckay | crobertsrh, another bug we found -- he added swift:// to a url in horizon, and it came through as swift://swift:// | 16:06 |
crobertsrh | oh, tmckay: re pep8, I still see the errors on my ubuntu (14) machine, but NOT on my fedora 23 machine. | 16:06 |
elmiko | tmckay: that was fixed | 16:06 |
crobertsrh | elmiko: probably wise | 16:06 |
rickflare | hey hey | 16:06 |
rickflare | this was from using the sahara image element build | 16:06 |
tmckay | elmiko, when? so he must have an older version | 16:06 |
rickflare | i did a tox build of the image | 16:06 |
elmiko | tmckay: it was fixed a month or two ago | 16:07 |
tmckay | rickflare, but your sahara binaries, sahara-engine and sahara-api -- how do you install them? from packstack? what package? | 16:07 |
tmckay | I need to see if I can patch your spark edp engine. Or, you might need to kill the cluster and launch a spark 1.3.1 cluster instead (rickflare) | 16:08 |
tmckay | elmiko, heh, when I was on review vacation. doh | 16:08 |
tmckay | elmiko, when you say it was fixed a few months ago, you mean the swift://swift:// or being able to run a job without a main class? | 16:11 |
elmiko | tmckay: swift://swift:// | 16:12 |
elmiko | looking for the fix now | 16:12 |
rickflare | brb guys | 16:12 |
crobertsrh | might have been fixed before we left the horizon repo | 16:12 |
elmiko | i think it was | 16:12 |
tmckay | was thinking about usability after helping rickflare, I was wondering if an additional field on a job binary to store a list of the runnable classes in a jar would be a nice option | 16:13 |
tmckay | so when you create the binary, you tag it with a list of class names optionally | 16:14 |
tmckay | then when someone goes to run it, they don't have to guess, or go dig up the binary and run "jar tf" on it to see what's in there | 16:14 |
crobertsrh | Might be useful. Partially solved by the job template interface stuff that was added. | 16:14 |
tmckay | that's always a pain point for me | 16:14 |
tmckay | crobertsrh, yeah, but does it have a list of main class values? It could be added there, maybe | 16:15 |
crobertsrh | No, nothing about a "list", but it does include a field for the one that you'll need | 16:15 |
tmckay | didn't show rickflare the job interface stuff yet | 16:16 |
crobertsrh | How common is it for a jar to have multiple runnables crammed inside? | 16:16 |
tmckay | I'll take a look | 16:16 |
crobertsrh | Is it only something that we tend to see in the example jars? | 16:16 |
crobertsrh | Or do people do that "for realz"? | 16:16 |
*** esikachev has joined #openstack-sahara | 16:17 | |
tmckay | I don't know, it happens to me all the time. I never remember what the class is inside a jar. It should be autodetectable imho (maybe it is, I don't know much about Java) | 16:17 |
tmckay | kind of annoyed that hadoop is written in it to begin with | 16:18 |
tmckay | :) | 16:18 |
crobertsrh | Yeah, I see what you're saying. I've felt that pain more than once. | 16:18 |
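[Editor's note] The "jar tf" step mentioned above can be approximated without a JDK, because a jar is a zip archive. The sketch below uses only the Python standard library; the function names are hypothetical and this is not part of Sahara. It lists top-level class names and reads `Main-Class` from the manifest when one is declared. It does not inspect bytecode, so it cannot tell which classes actually define a `main` method.

```python
import zipfile


def list_jar_classes(jar_path):
    """Return sorted, dotted names of the top-level classes in a jar."""
    with zipfile.ZipFile(jar_path) as jar:
        names = jar.namelist()
    classes = []
    for name in names:
        # Skip non-class entries and inner/anonymous classes (their file
        # names contain '$').
        if name.endswith(".class") and "$" not in name:
            classes.append(name[: -len(".class")].replace("/", "."))
    return sorted(classes)


def manifest_main_class(jar_path):
    """Return the Main-Class declared in the manifest, or None.

    Simplification: manifest values wrapped onto continuation lines
    are not rejoined here.
    """
    with zipfile.ZipFile(jar_path) as jar:
        try:
            raw = jar.read("META-INF/MANIFEST.MF")
        except KeyError:
            # The archive has no manifest entry.
            return None
    for line in raw.decode("utf-8", "replace").splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "main-class":
            return value.strip()
    return None
```

A job binary could be tagged with the output of `list_jar_classes` at creation time, which is roughly the optional "list of class names" field floated in the discussion.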
*** esikachev has quit IRC | 16:24 | |
*** apavlov has quit IRC | 16:33 | |
*** pcaruana has quit IRC | 16:35 | |
*** vgridnev__ has joined #openstack-sahara | 17:03 | |
*** vgridnev__ has quit IRC | 17:04 | |
elmiko | tmckay, crobertsrh, fyi, just ran a fresh tox from a venv with python3.4 and tox 2.3.1 on f23. everything passed | 17:25 |
crobertsrh | python3.4, eh? | 17:25 |
elmiko | yea, i'm trying to use py3 for more stuff these days | 17:25 |
crobertsrh | Things pass nicely on my f23 box, just not on my other (ubuntu) machine | 17:26 |
elmiko | weird... | 17:26 |
elmiko | first thing i check these days is the tox version | 17:26 |
crobertsrh | actually, I think I see a few bashate warnings, but that's it...still "succeeded" | 17:26 |
crobertsrh | Yeah, tox version is often key | 17:26 |
elmiko | i get bashates too | 17:26 |
*** apavlov has joined #openstack-sahara | 17:32 | |
tmckay | elmiko, I found a patch for rickflare, a real bug | 17:33 |
* tmckay lunch | 17:33 | |
elmiko | tmckay: ack | 17:34 |
*** DuncanT has quit IRC | 17:34 | |
*** zigo has quit IRC | 17:34 | |
*** jamielennox has quit IRC | 17:35 | |
*** zigo has joined #openstack-sahara | 17:36 | |
rickflare | tmckay sorry | 17:37 |
rickflare | i am back | 17:37 |
rickflare | had a fire I had to tend to | 17:37 |
*** DuncanT has joined #openstack-sahara | 17:39 | |
rickflare | lmk when you are back and we can resume working on that job | 17:40 |
*** vgridnev__ has joined #openstack-sahara | 17:42 | |
*** jamielennox has joined #openstack-sahara | 17:44 | |
*** thumpba has joined #openstack-sahara | 17:51 | |
*** esikachev has joined #openstack-sahara | 18:10 | |
*** Erming has quit IRC | 18:12 | |
*** Erming_ has joined #openstack-sahara | 18:12 | |
*** egafford has joined #openstack-sahara | 18:13 | |
tmckay | rickflare, back. so, question of the day is what version of sahara are you running? Where did it come from? | 18:21 |
* tmckay looks for that spark 1.6 classpath issue | 18:22 | |
tmckay | rickflare, also have a simple bug for you to fix | 18:22 |
rickflare | ok | 18:26 |
rickflare | awesome | 18:26 |
rickflare | im back! | 18:26 |
rickflare | I got my tunes on | 18:27 |
rickflare | im ready to rock | 18:27 |
tmckay | rickflare, ok, to get that spark job to run that we launched yesterday (the last one), I believe you need this patch https://review.openstack.org/#/c/276734/4 | 18:30 |
tmckay | that should make the spark.xml file available on the classpath for spark-submit | 18:31 |
tmckay | this is why I was asking where your sahara came from -- to make sure you don't already have it | 18:31 |
*** egafford has quit IRC | 18:32 | |
rickflare | ah | 18:33 |
rickflare | is this patch | 18:33 |
rickflare | applied in the instance | 18:33 |
rickflare | or on the openstack host | 18:33 |
rickflare | ccccccejlbrtdgklbjdhunvtjiuhlubiitdgdvcngcee | 18:33 |
rickflare | whoops | 18:33 |
tmckay | the openstack host, sahara controller. If you're running spark 1.6, though, I'm a little confused, it should already be there. unless of course you're using a spark 1.6 image and sahara cluster thinks it's a 1.3.1 cluster | 18:34 |
rickflare | thats what it is | 18:35 |
rickflare | i bet | 18:35 |
rickflare | because i have not seen 1.6.0 | 18:35 |
rickflare | anywhere in horizon | 18:35 |
rickflare | and I have 1.3.1 selected | 18:35 |
tmckay | okay, it's a little clearer now. So, two choices -- in this case, I don't think there is really much difference from a sahara perspective between spark 1.3 and spark 1.6 | 18:36 |
rickflare | k | 18:36 |
*** rcernin has quit IRC | 18:36 | |
tmckay | so, we can 1) delete the cluster, generate a spark 1.3.1 image, relaunch the cluster, and you should be good or 2) patch your sahara in /usr/lib to have the fix so the classpath works for 1.6 | 18:37 |
tmckay | #1 is the "right" thing to do | 18:37 |
tmckay | #2 is a hack | 18:37 |
tmckay | but relatively simple | 18:37 |
tmckay | your choice | 18:37 |
rickflare | let hack for now | 18:37 |
rickflare | lets hack | 18:37 |
rickflare | now where do I patch> | 18:38 |
rickflare | ? | 18:38 |
tmckay | okay :) so you should be able to download that patch from gerrit as a patch file, then go to /usr/lib/pythonX.X/site-packages/sahara and apply the patch with "patch -p1 < whatever.patch" | 18:39 |
* tmckay that should be where packstack put it | 18:39 | |
* tmckay double checks | 18:40 | |
tmckay | yeah, I think that's right | 18:40 |
rickflare | forgive me but how do i download the patch | 18:40 |
tmckay | k, doing it alongside you, hold on | 18:40 |
rickflare | im a gerrit newb | 18:40 |
tmckay | download link up in the right-hand corner, you can get it as a zip | 18:41 |
rickflare | patch file? it looks like a diff | 18:41 |
tmckay | yeah, that's it | 18:42 |
tmckay | hmm, maybe the format patch link works better, hold on | 18:43 |
rickflare | yea | 18:44 |
rickflare | because | 18:44 |
rickflare | im getting a file to patch prompt | 18:44 |
rickflare | when I run the patch command | 18:44 |
rickflare | which means its not working | 18:44 |
tmckay | yeah, I bumped up a level to site-packages and did patch -p1 < sahara/blah.patch, but the changes failed. if you're using RDO packages it may be too old. what do you have for rpm -qa | grep sahara | 18:47 |
rickflare | openstack-sahara-api-3.0.0-5.cc218ddgit.el7.noarch | 18:47 |
rickflare | openstack-sahara-common-3.0.0-5.cc218ddgit.el7.noarch | 18:47 |
rickflare | python-saharaclient-0.11.1-1.el7.noarch | 18:47 |
rickflare | openstack-sahara-engine-3.0.0-5.cc218ddgit.el7.noarch | 18:47 |
tmckay | ok, so, liberty. shouldn't be too different. | 18:49 |
*** egafford has joined #openstack-sahara | 18:55 | |
*** egafford has left #openstack-sahara | 18:55 | |
*** esikachev has quit IRC | 18:57 | |
openstackgerrit | Grigoriy Rozhkov proposed openstack/sahara: Remove unsupported MapR plugin versions https://review.openstack.org/266444 | 19:04 |
*** rcernin has joined #openstack-sahara | 19:06 | |
*** vgridnev__ has quit IRC | 19:07 | |
*** vgridnev__ has joined #openstack-sahara | 19:11 | |
*** vgridnev__ has quit IRC | 19:14 | |
*** vgridnev__ has joined #openstack-sahara | 19:18 | |
*** vgridnev__ has quit IRC | 19:19 | |
*** vgridnev__ has joined #openstack-sahara | 19:22 | |
*** vgridnev__ has quit IRC | 19:25 | |
*** egafford has joined #openstack-sahara | 19:26 | |
*** vgridnev__ has joined #openstack-sahara | 19:32 | |
*** egafford has quit IRC | 19:33 | |
*** vgridnev__ has quit IRC | 19:38 | |
*** vgridnev__ has joined #openstack-sahara | 19:38 | |
*** esikachev has joined #openstack-sahara | 19:42 | |
*** vgridnev__ has quit IRC | 19:44 | |
*** apavlov has quit IRC | 19:45 | |
tmckay | rickflare, btw, here is the bug I have for you https://bugs.launchpad.net/sahara/+bug/1546701 | 19:53 |
openstack | Launchpad bug 1546701 in Sahara "Validation for main class checks if the key is present but does not check non-null" [High,Triaged] - Assigned to Trevor McKay (tmckay) | 19:53 |
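[Editor's note] The bug title above describes a check that accepts a main class key that is present but null. A minimal sketch of a stricter check follows. It assumes the main class is carried under `configs` in the job configuration with the key `edp.java.main_class`; the function name and error text are hypothetical, and this is not the patch that fixed the bug.

```python
MAIN_CLASS_KEY = "edp.java.main_class"  # assumed config key for the main class


def check_main_class(job_configs):
    """Raise ValueError unless a non-empty main class string is configured.

    Hypothetical validator for illustration; not Sahara's own code.
    """
    configs = (job_configs or {}).get("configs") or {}
    main_class = configs.get(MAIN_CLASS_KEY)
    # Presence of the key is not enough: None, empty, and whitespace-only
    # values must be rejected as well.
    if not isinstance(main_class, str) or not main_class.strip():
        raise ValueError("%s must be set to a non-empty class name" % MAIN_CLASS_KEY)
```

The point of the sketch is the difference between `MAIN_CLASS_KEY in configs` (true even when the value is `None`) and testing the value itself.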
*** vgridnev__ has joined #openstack-sahara | 20:02 | |
tmckay | vgridnev__, hi. so, we don't think swift works with spark 1.6? | 20:04 |
vgridnev__ | it should be working, as I know. Michael Ionkin got a fix, as I know | 20:05 |
vgridnev__ | tmckay, ^^ | 20:05 |
vgridnev__ | hm, so many __ at the end of nickname | 20:05 |
tmckay | vgridnev__, ah, the classpath fix with working dir? | 20:05 |
*** apavlov has joined #openstack-sahara | 20:06 | |
vgridnev__ | yep | 20:06 |
tmckay | hmm, ok. I was helping rickflare, he was running a spark 1.6 image and sahara from liberty (we patched it for the classpath fix), job started running but got a socket timeout trying to authenticate to keystone | 20:07 |
tmckay | weird, since he could ping it | 20:07 |
tmckay | but, this is maybe unsupported -- running a 1.6 image under liberty. | 20:07 |
vgridnev__ | I've got an idea | 20:07 |
tmckay | I had him generate a 1.3.1 image and he's going to launch a new cluster | 20:07 |
vgridnev__ | https://bugs.launchpad.net/sahara/+bug/1486173 | 20:08 |
openstack | Launchpad bug 1486173 in Sahara "SocketTimeoutException on multi domain enviroment" [Undecided,New] | 20:08 |
tmckay | ah, interesting | 20:08 |
tmckay | that is the same error we got | 20:10 |
tmckay | not sure what it means by "running the same job on the default domain worked as expected" | 20:10 |
vgridnev__ | maybe we should configure node_domain in default section? | 20:11 |
vgridnev__ | https://github.com/openstack/sahara/blob/e0e20b2e33349373568c01493614a4388f4ab10c/sahara/config.py#L73 | 20:12 |
tmckay | hmm, maybe | 20:12 |
vgridnev__ | tmckay, also for reference: https://bugs.launchpad.net/sahara/+bug/1192193 | 20:15 |
openstack | Launchpad bug 1192193 in Sahara "Savanna should determine domain name dynamically" [Low,Incomplete] | 20:15 |
tmckay | thanks, vgridnev__ | 20:16 |
*** vgridnev__ has quit IRC | 20:21 | |
*** vgridnev__ has joined #openstack-sahara | 20:28 | |
*** vgridnev__ has quit IRC | 20:28 | |
*** vgridnev has joined #openstack-sahara | 20:28 | |
*** vgridnev has quit IRC | 20:32 | |
rickflare | interesting | 20:33 |
*** krotscheck is now known as krotscheck_dcm | 20:34 | |
tmckay | rickflare, also, here is a debug hint for you | 20:39 |
tmckay | because of the way the job_launch.log is written, you can open up that log in vi, copy the job launch command, and execute from the command line in the job run dir on the master node | 20:40 |
tmckay | and re-run the job manually without launching from sahara | 20:40 |
tmckay | output from spark will stream on your console | 20:41 |
tmckay | This is a good way to play with arguments, or if you're tweaking network config, etc | 20:42 |
rickflare | ahh | 20:44 |
rickflare | ok good to know | 20:44 |
rickflare | spawning my new cluster as we speak | 20:44 |
tmckay | got another debug hint for you, too. With a tweak to the hadoop core site, you can do "hadoop fs -ls swift://demo.sahara/myfile" and that will let you know if hadoop can access swift | 20:47 |
tmckay | If it works, you get a dir listing. If not, you'll get the same exception that you would have gotten from a job run | 20:48 |
tmckay | rickflare, really fast test when debugging network config issues ^^ | 20:48 |
tmckay | I can tell you how to modify the core-site.xml, simple | 20:48 |
* tmckay anticipating that maybe 1.3.1 cluster will have the same problem | 20:48 | |
tmckay | Networking, by Charles Dickens. It was the best of times, it was the worst of times. | 20:49 |
rickflare | ok | 20:55 |
rickflare | cluster is back up | 20:55 |
tmckay | alright, give the relaunch of that same job a try and let's see what happens | 20:55 |
rickflare | and I am running 1.3.1 | 20:55 |
rickflare | like an idiot | 20:55 |
rickflare | I deleted the job | 20:55 |
tmckay | heh! | 20:55 |
rickflare | i got to relaunch it | 20:55 |
tmckay | okay. just remember, main class, swift configs, input output args | 20:56 |
*** thumpba has quit IRC | 20:57 | |
rickflare | can you message me | 20:57 |
rickflare | those args again | 20:57 |
rickflare | for the admin | 20:57 |
rickflare | and password | 20:57 |
*** raildo is now known as raildo-afk | 21:00 | |
*** esikachev has quit IRC | 21:18 | |
tmckay | elmiko, crobertsrh, egafford, anyone else around, you ever see this as an error trying to write to swift from spark? Read works fine, write to local hdfs works fine | 21:32 |
tmckay | https://cryptbin.com/i92#72387cfa10b2a429c9e6b5659588306e | 21:32 |
crobertsrh | lookin' | 21:32 |
tmckay | craziest thing, this is liberty, with a fresh spark 1.3.1 image | 21:33 |
tmckay | everything is working but this last silly error -- spark.xml is there, etc etc | 21:33 |
crobertsrh | I don't think I've seen that. I haven't tried spark jobs recently though. I'll try a quick one. I think I can get a cluster up quickly. | 21:33 |
crobertsrh | oh, bonus...I have a cluster already | 21:34 |
elmiko | cryptbin, that's a new one for me | 21:34 |
tmckay | worse than that, the container got created ... | 21:34 |
tmckay | wonder if there is something in it | 21:35 |
elmiko | my first guess would be to check the swift logs | 21:35 |
crobertsrh | yeah, +1 to swift logs | 21:35 |
elmiko | maybe some sort of issue with acls or the user groups or something | 21:35 |
elmiko | also, you should be able to check the core-site.xml (i think) to double check the credentials that are used to validate with swift | 21:36 |
elmiko | that would be on the cluster node | 21:36 |
tmckay | rickflare ^^, great idea. mine the swift logs on the controller | 21:36 |
elmiko | and log in to the cluster master node to dbl check creds, imo | 21:36 |
tmckay | elmiko, we've got the creds right, we can hadoop fs -ls from the cluster node | 21:36 |
tmckay | we hacked core-site | 21:37 |
elmiko | ack | 21:37 |
elmiko | could be an issue with write acls on swift? | 21:37 |
tmckay | maybe, it created the output container, and created a _temporary file successfully. huh? | 21:38 |
rickflare | guys i'll be back | 21:38 |
tmckay | elmiko, then some mysterious permission denied. makes no sense | 21:38 |
rickflare | i got to bug out now | 21:38 |
elmiko | tmckay: weird... | 21:39 |
rickflare | i've seen this before with hadoop | 21:39 |
rickflare | i'll be back | 21:39 |
elmiko | take care rickflare | 21:39 |
tmckay | alright, well, we're doing everything right at this point | 21:39 |
tmckay | I am officially clueless | 21:39 |
* elmiko really hopes rickflare wears a cape at summit | 21:39 | |
tmckay | oh yea, gotta confirm his bug | 21:40 |
tmckay | elmiko, https://bugs.launchpad.net/sahara/+bug/1546701 | 21:40 |
openstack | Launchpad bug 1546701 in Sahara "Validation for main class checks if the key is present but does not check non-null" [High,Triaged] - Assigned to Trevor McKay (tmckay) | 21:40 |
tmckay | verified for spark, 100% it crashes java too but I haven't gotten a generic cluster up | 21:40 |
elmiko | crazy... | 21:41 |
elmiko | good find though | 21:41 |
tmckay | elmiko, crobertsrh, either of you have a vanilla/cdh/hdp cluster up? | 21:41 |
tmckay | elmiko, it's like a 2 line fix :) | 21:41 |
crobertsrh | I don't, just spark atm | 21:41 |
elmiko | let me look | 21:41 |
tmckay | k, I'll try. I tried an hdp2 but it hung in configure forever | 21:41 |
elmiko | yea, i have a vanilla 2.7.1 up | 21:42 |
tmckay | ooo, oooo, can you run a java wordcount from edp-examples? | 21:42 |
elmiko | i can try, yea | 21:42 |
tmckay | just want to run it, and leave the main class blank on purpose | 21:42 |
elmiko | ok | 21:43 |
tmckay | the dynamite should go boom | 21:43 |
elmiko | so, set up a new job, but leave the main class blank? | 21:43 |
tmckay | yep | 21:43 |
tmckay | it should launch, and oozie should barf. It should not fail on the sahara side, or at least not during validation | 21:43 |
elmiko | k | 21:44 |
tmckay | validation should be catching it but isn't (at least for spark) | 21:44 |
tmckay | ooo, lookie there, I have a centos7 vanilla 2.6 image lying around | 21:47 |
elmiko | man, we need to update the docs in those edp-examples | 21:47 |
tmckay | it's like Christmas | 21:47 |
elmiko | lol | 21:47 |
tmckay | yeah, EDP needs some love | 21:47 |
tmckay | elmiko, rickflare has made it abundantly clear to me indirectly that log aggregation has to happen | 21:48 |
tmckay | really | 21:48 |
tmckay | for N, that is all I'm going to do, unless someone with authority makes me do otherwise ;-) | 21:48 |
tmckay | it is crazy stupid that I've got him poking around in /tmp/spark-edp to debug | 21:49 |
elmiko | +1 | 21:49 |
elmiko | i think we've all agreed that logs could be better | 21:49 |
tmckay | been deferred long enough, I think it is now "the most important thing" | 21:49 |
elmiko | well, it's on our short list of "most important things" ;) | 21:50 |
tmckay | lol, yeah | 21:50 |
tmckay | it's just big, and hard | 21:50 |
elmiko | yup | 21:50 |
elmiko | also, for the regex help stuff, we should totally create these little context "?" icons. it is super helpful in the job template creation form | 21:51 |
tmckay | sounds good. I am unfamiliar with the "?" icons | 21:51 |
tmckay | I wanted this cycle to be all about usability, then I got wooed by baremetal | 21:51 |
elmiko | look at job template create | 21:52 |
elmiko | lol! | 21:52 |
tmckay | darn you baremetal. you're dead to me | 21:52 |
elmiko | ok, job launched without issue (no main lib specified) | 21:53 |
tmckay | k, should get an error | 21:55 |
crobertsrh | tmckay: Are you still trying to get the wordcount job to run? Or did I miss that great success? | 21:55 |
elmiko | yup, it got killed | 21:55 |
tmckay | elmiko, main class you mean, not main lib, right? | 21:55 |
elmiko | yea, main class | 21:55 |
elmiko | i only specified the args | 21:55 |
tmckay | crobertsrh, no, ended with that permission denied. but it created the output dir in swift, and it created the temporary file | 21:56 |
tmckay | then choked | 21:56 |
crobertsrh | Ok, just making sure I haven't missed anything. | 21:56 |
crobertsrh | I did crank up my cluster and run wordcount. Managed to run for me. | 21:57 |
elmiko | what *is* the main class for word count? | 21:57 |
crobertsrh | sahara.edp.spark.SparkWordCount | 21:58 |
elmiko | thanks | 21:58 |
crobertsrh | of course :) | 21:58 |
elmiko | i'm not confident that this will work even with the main class | 21:58 |
*** crobertsrh1 has quit IRC | 21:58 | |
elmiko | but, i have terrible luck running jobs | 21:59 |
tmckay | elmiko, crobertsrh gave you the main class for spark | 22:00 |
tmckay | for java, it's different. This is a java job you're running, right? | 22:00 |
crobertsrh | ah...my bad :) | 22:01 |
* tmckay checks for it | 22:01 | |
crobertsrh | I was indeed in spark land | 22:01 |
elmiko | ha! i typed that and didn't even think about it | 22:01 |
elmiko | yes, this is java | 22:01 |
tmckay | org/openstack/sahara/examples/WordCount | 22:01 |
tmckay | with . for /, of course | 22:02 |
tmckay | so, guys, another interesting usability wrinkle in this | 22:02 |
elmiko | and do i add arguments in the arguments section on the configure tab of launch job or do i use the interface arguments tab? | 22:02 |
tmckay | apparently you can specify main class with a MANIFEST | 22:02 |
tmckay | elmiko, you can just use arguments on the configure tab to keep it simple | 22:03 |
elmiko | but either will work? | 22:03 |
tmckay | but, even though Java/spark may allow main class in a MANIFEST, we currently are requiring a --class argument for spark, and I'm not sure Oozie will handle a manifest specification (but it might) | 22:03 |
tmckay | elmiko, yeah | 22:04 |
tmckay | wouldn't it be cool if users could build their jars so that they didn't need a main class value? | 22:04 |
tmckay | that would be awesome | 22:04 |
elmiko | huh... i'm just terrible at actually operating sahara... sigh | 22:04 |
tmckay | yeah, me too, you get rusty after a while. This exercise with rf has been good | 22:04 |
elmiko | and of course, now that the job has failed i have no clue why it failed... | 22:04 |
tmckay | back in the trenches | 22:05 |
elmiko | yup | 22:05 |
elmiko | just like when i joined ;) | 22:05 |
tmckay | elmiko, you can follow the Oozie console link and find out ... | 22:05 |
elmiko | yea, i need to set up the routes and everything though. my devstack box is another machine | 22:05 |
tmckay | elmiko, only if you want to -- I think you've proven what I wanted. I'm going to try to spin up my own | 22:06 |
elmiko | it can wait, i'm in the middle of reviewing crobertsrh stuff atm | 22:06 |
tmckay | thanks for checking | 22:06 |
elmiko | np | 22:06 |
openstackgerrit | Merged openstack/python-saharaclient: Updated from global requirements https://review.openstack.org/281007 | 22:14 |
openstackgerrit | Merged openstack/sahara: Add support running Sahara as wsgi app https://review.openstack.org/262492 | 22:19 |
openstackgerrit | Vitaly Gridnev proposed openstack/sahara: honor api_insecure parameters https://review.openstack.org/279996 | 22:23 |
openstackgerrit | Vitaly Gridnev proposed openstack/sahara: implement sending health notifications https://review.openstack.org/281194 | 22:24 |
*** tmckay has left #openstack-sahara | 22:44 | |
*** apavlov has quit IRC | 22:48 |
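
Editor's note on bug 1546701, discussed above ("Validation for main class checks if the key is present but does not check non-null"): the following is a minimal sketch of the kind of check tmckay describes as "like a 2 line fix", not the actual Sahara patch. The config key name `edp.java.main_class`, the `job_configs["configs"]` layout, the `ValidationError` class, and the function name are all assumptions made for illustration.

```python
# Hypothetical sketch only; names and structure are assumptions, not Sahara code.
MAIN_CLASS_KEY = "edp.java.main_class"  # assumed name of the main class config key


class ValidationError(ValueError):
    """Raised when a job execution request fails validation (illustrative)."""


def check_main_class_present(job_configs):
    """Return the main class, or raise if it is missing, None, or blank.

    Testing only `MAIN_CLASS_KEY in configs` would accept a key whose value
    is None or an empty string, which is the gap described in the bug title.
    """
    configs = (job_configs or {}).get("configs") or {}
    main_class = configs.get(MAIN_CLASS_KEY)
    if not isinstance(main_class, str) or not main_class.strip():
        raise ValidationError(
            "%s must be set to a non-empty class name" % MAIN_CLASS_KEY
        )
    return main_class.strip()
```

With a check like this, a launch request that leaves the main class blank would be rejected at validation time instead of failing later in Oozie or Spark, which is the behaviour the test run above was meant to expose.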