*** jamielennox is now known as jamielennox|away | 00:25 | |
*** jamielennox|away is now known as jamielennox | 00:31 | |
*** tuanluong has joined #openstack-sahara | 00:35 | |
*** abalutoiu has quit IRC | 00:46 | |
*** shuyingya has joined #openstack-sahara | 01:38 | |
*** shuyingya has quit IRC | 01:44 | |
*** shuyingya has joined #openstack-sahara | 01:44 | |
*** shuyingya has quit IRC | 01:44 | |
*** shuyingya has joined #openstack-sahara | 01:47 | |
*** shuyingya has quit IRC | 03:03 | |
*** shuyingya has joined #openstack-sahara | 03:03 | |
*** Poornima has joined #openstack-sahara | 03:59 | |
*** Poornima has quit IRC | 03:59 | |
*** Poornima has joined #openstack-sahara | 04:11 | |
*** tellesnobrega has quit IRC | 04:12 | |
*** tellesnobrega has joined #openstack-sahara | 04:18 | |
*** vgridnev has joined #openstack-sahara | 05:03 | |
*** nkrinner_afk is now known as nkrinner | 05:16 | |
*** tellesnobrega has quit IRC | 05:16 | |
*** tellesnobrega has joined #openstack-sahara | 05:29 | |
*** gauPaw has joined #openstack-sahara | 05:42 | |
*** gauPaw is now known as gautam | 05:43 | |
*** rcernin has joined #openstack-sahara | 05:44 | |
*** Poornima has quit IRC | 05:45 | |
*** gautam has quit IRC | 05:48 | |
*** Poornima has joined #openstack-sahara | 05:54 | |
openstackgerrit | Evgeny Sikachev proposed openstack/sahara-tests master: Move methods from runner to utils https://review.openstack.org/445954 | 06:31 |
*** pgadiya has joined #openstack-sahara | 06:40 | |
*** Poornima has quit IRC | 06:49 | |
*** Poornima has joined #openstack-sahara | 06:51 | |
*** Poornima has quit IRC | 06:52 | |
*** Poornima has joined #openstack-sahara | 06:53 | |
*** Poornima has quit IRC | 06:53 | |
*** Poornima has joined #openstack-sahara | 06:54 | |
*** abalutoiu has joined #openstack-sahara | 07:00 | |
*** pgadiya has quit IRC | 07:01 | |
*** tesseract has joined #openstack-sahara | 07:08 | |
*** anshul has joined #openstack-sahara | 07:10 | |
*** anshul has quit IRC | 07:10 | |
*** pgadiya has joined #openstack-sahara | 07:10 | |
*** anshul has joined #openstack-sahara | 07:10 | |
*** anshul has quit IRC | 07:16 | |
*** Poornima has quit IRC | 07:17 | |
*** gautam has joined #openstack-sahara | 07:19 | |
*** Poornima has joined #openstack-sahara | 07:21 | |
*** gautam has quit IRC | 07:23 | |
*** gautam has joined #openstack-sahara | 07:23 | |
*** pcaruana has joined #openstack-sahara | 07:27 | |
*** anshul has joined #openstack-sahara | 07:27 | |
*** Poornima has quit IRC | 07:29 | |
*** vgridnev has quit IRC | 07:33 | |
-openstackstatus- NOTICE: Jobs in gate are failing with POST_FAILURE. Infra roots are investigating | 07:44 | |
*** ChanServ changes topic to "Jobs in gate are failing with POST_FAILURE. Infra roots are investigating" | 07:44 | |
*** rcernin has quit IRC | 07:54 | |
*** tesseract has quit IRC | 07:54 | |
*** tesseract has joined #openstack-sahara | 07:55 | |
*** rcernin has joined #openstack-sahara | 07:56 | |
*** Poornima has joined #openstack-sahara | 07:58 | |
*** shuyingya has quit IRC | 08:13 | |
-openstackstatus- NOTICE: logs.openstack.org has corrupted disks, it's being repaired. Please avoid rechecking until this is fixed | 08:25 | |
*** ChanServ changes topic to "logs.openstack.org has corrupted disks, it's being repaired. Please avoid rechecking until this is fixed" | 08:25 | |
*** abalutoiu_ has joined #openstack-sahara | 08:35 | |
*** abalutoiu has quit IRC | 08:37 | |
*** shuyingya has joined #openstack-sahara | 08:42 | |
*** shuyingya has quit IRC | 08:46 | |
*** remix_tj has joined #openstack-sahara | 08:51 | |
remix_tj | hello, i'm running newton and while creating a CDH 5.7 cluster i get this error: Creating cluster failed for the following reason(s): '' | 08:53 |
remix_tj | if i dig in the logs i get also this bad python traceback: https://pastebin.com/SvKJSsPK | 08:54 |
*** pgadiya has quit IRC | 09:02 | |
*** pgadiya has joined #openstack-sahara | 09:18 | |
*** Poornima has quit IRC | 09:21 | |
ltosky[m] | Uhm, I will have better connectivity soon, but: does the instance have connectivity? Did you assign a floating ip pool in the node group templates? | 09:23 |
*** pgadiya has quit IRC | 09:26 | |
*** Poornima has joined #openstack-sahara | 09:26 | |
*** pgadiya has joined #openstack-sahara | 09:30 | |
*** tosky has joined #openstack-sahara | 09:35 | |
* tosky in his proper IRC self | 09:35 | |
tosky | remix_tj: sooo | 09:35 |
remix_tj | tosky: now i enabled the debug, let's see what's happening | 09:36 |
remix_tj | i forgot to say that the guest image i'm using is the one downloaded from mirantis repository | 09:37 |
tosky | which should work, but let's see | 09:38 |
tosky | (worst case, you can rebuild it) | 09:38 |
tosky | remix_tj: but about the two questions above, did you check them? | 09:39 |
*** abalutoiu_ has quit IRC | 09:57 | |
remix_tj | Tosky yes connectivity is ok | 10:00 |
tosky | and the floating ip pool in the templates? | 10:03 |
remix_tj | yes, everything ok | 10:03 |
remix_tj | i can ssh to vms without problems | 10:04 |
remix_tj | wow, now i got ECONNREFUSED, fuck. | 10:04 |
remix_tj | . | 10:05 |
*** abalutoiu has joined #openstack-sahara | 10:07 | |
remix_tj | in this case with econnrefused i see that it wants to connect to http://10.6.124.16:7180/api/v8/commands/58 and then reports the error | 10:11 |
tosky | cloudera manager | 10:13 |
tosky | what address is that one? On the private network of the tenant? | 10:14 |
remix_tj | 10.6.124.0/22 is my floating ip pool | 10:14 |
tosky | do you see the ECONNREFUSED from the cluster nodes (master/workers) or from sahara-engine.log ? | 10:15 |
remix_tj | sahara-engine.log | 10:15 |
remix_tj | i see that there are also some requests to cloudera manager api before that error and seems to be ok | 10:16 |
remix_tj | i'm posting now a part of the log | 10:16 |
remix_tj | tosky: https://pastebin.com/ZFFi8ZAV | 10:17 |
tosky | like it's dying | 10:18 |
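
The ECONNREFUSED above came from sahara-engine trying to reach the Cloudera Manager API on port 7180. A quick way to tell whether the manager is still listening is a plain TCP connection attempt. This is a minimal sketch, not Sahara code: the helper name `manager_port_open` and the default timeout are assumptions, and only the port number is taken from the URL in the log.

```python
import socket


def manager_port_open(host, port=7180, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper for illustration; 7180 is the Cloudera Manager
    port seen in the URL from sahara-engine.log.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ECONNREFUSED, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address taken from the log line quoted above.
    print(manager_port_open("10.6.124.16"))
```

A `False` result here only says that nothing accepted the connection; it does not say why, so the next step is still to look at the manager node itself.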
tosky | and if you log into the cloudera manager node? (maybe through the web console, if you set a root password on the image) | 10:19 |
remix_tj | java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0 | 10:19 |
remix_tj | !!!! | 10:19 |
openstack | remix_tj: Error: "!!!" is not a valid command. | 10:19 |
remix_tj | maybe the image is broken | 10:19 |
tosky | which flavor did you use? | 10:19 |
*** tuanluong has quit IRC | 10:20 | |
tosky | and what is the layout of the nodes? | 10:20 |
remix_tj | 1 node dedicated to the manager with m1.medium (2cpu 4gb ram) | 10:21 |
tosky | go for 6 | 10:21 |
remix_tj | 6 gigs? | 10:21 |
tosky | yep | 10:21 |
remix_tj | ok, i'll try now | 10:21 |
tosky | at least for the manager | 10:21 |
remix_tj | then i wait for the empty error to appear... | 10:21 |
tosky | this is what we use on the gates (the yaml is a custom one, but the structure should be sufficiently self-describing): http://git.openstack.org/cgit/openstack/sahara-tests/tree/sahara_tests/scenario/defaults/newton/cdh-5.7.0.yaml.mako | 10:22 |
remix_tj | tosky: maybe the empty error is because the manager was crashing during the http request? | 10:23 |
tosky | in my local tests, I used 6GiB of RAM for large_flavor_id | 10:24 |
tosky | maybe | 10:24 |
tosky | without the manager nothing can happen :) | 10:24 |
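
The sizing advice in this exchange (6 GiB for the manager node instead of the 4 GB of m1.medium) could be expressed as a simple pre-flight check. A minimal sketch, assuming a hypothetical `check_manager_flavor` helper; the threshold is the value suggested in this conversation, not a figure from Sahara or Cloudera documentation.

```python
# Hypothetical pre-flight check; not part of Sahara.
# The 6 GiB threshold is the value suggested in this conversation.
MIN_MANAGER_RAM_MB = 6 * 1024


def check_manager_flavor(flavor_name, ram_mb, min_ram_mb=MIN_MANAGER_RAM_MB):
    """Return a warning string if the flavor looks too small, else None."""
    if ram_mb < min_ram_mb:
        return (
            f"flavor {flavor_name!r} has {ram_mb} MB RAM; "
            f"at least {min_ram_mb} MB is suggested for the manager node"
        )
    return None


if __name__ == "__main__":
    # m1.medium as described above: 2 CPUs, 4 GB RAM.
    print(check_manager_flavor("m1.medium", 4096))
```

Returning a message rather than raising keeps the check advisory, which fits a threshold that comes from experience rather than from a documented requirement.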
*** nkrinner is now known as nkrinner_afk | 10:24 | |
remix_tj | maybe it's documented somewhere.... | 10:25 |
remix_tj | but it's a bad crash | 10:25 |
tosky | did you see anything else before that 'java invoked oom-killer'? | 10:26 |
remix_tj | some service activations | 10:27 |
remix_tj | nothing else. Sorry but i just destroyed the vm | 10:28 |
remix_tj | now i'm trying with your suggestion | 10:28 |
remix_tj | i'm going to lunch and be back in 20 mins | 10:28 |
*** abalutoiu has quit IRC | 10:31 | |
*** tellesnobrega has quit IRC | 10:33 | |
*** abalutoiu has joined #openstack-sahara | 10:46 | |
*** Poornima has quit IRC | 10:52 | |
remix_tj | tosky: we moved to a new error, Failed to Provision Hadoop Cluster: Failed to format NameNode. Error ID: 0ed7fc6c-b04a-43f2-833c-0e5de951f02b | 10:55 |
remix_tj | but seems that now cloudera manager issue has been fixed | 10:56 |
tosky | uhm, still in sahara-engine, right? More details? | 10:56 |
remix_tj | this is an error on the interface, now i look at the logs | 10:57 |
remix_tj | tosky: https://pastebin.com/6cehzE0T | 10:58 |
tosky | the error comes from cloudera manager, uhm | 11:01 |
remix_tj | i see that credentials on cloudera manager are set | 11:01 |
remix_tj | tosky: if i go in cloudera manager interface i see 4 configuration errors, 3 about hdfs01 (Missing required value: DataNode Data Directory, | 11:03 |
remix_tj | Missing required value: HDFS Checkpoint Directories, Missing required value: NameNode Data Directories) one about yarn01 (Missing required value: NodeManager Local Directories) | 11:03 |
tosky | I think those may be red herrings | 11:05 |
remix_tj | There is insufficient memory for the Java Runtime Environment to continue. | 11:06 |
remix_tj | this is reported in the error | 11:06 |
tosky | oh | 11:06 |
remix_tj | this cloudera is quite resource expensive :-P | 11:06 |
tosky | yeah | 11:06 |
* tosky -> lunchtime | 11:10 | |
remix_tj | maybe there should be some more information from sahara side to say what's going wrong and suggest some actions to users | 11:12 |
*** anshul has quit IRC | 11:16 | |
*** tellesnobrega has joined #openstack-sahara | 11:42 | |
*** Poornima has joined #openstack-sahara | 11:42 | |
tosky | that would be good for sure, even if it's not easy to find the reason | 11:43 |
tosky | still at least a hardcoded list of checks would be useful, like 'check on the manager if the manager is running' | 11:43 |
tosky | would you mind filing a bug with some suggestions? The issue is broad, but at least report about oom_killer or processes killed would be useful | 11:43 |
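
One of the checks suggested here, reporting OOM-killer activity, can be approximated by scanning kernel log lines for the message seen earlier in this log. A minimal sketch, assuming the lines have already been collected from the node (for example from `dmesg` output); the function name `find_oom_events` is hypothetical and not part of Sahara.

```python
import re

# Matches kernel lines such as:
#   java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
OOM_RE = re.compile(r"(?P<proc>\S+) invoked oom-killer:")


def find_oom_events(log_lines):
    """Return the names of processes that triggered the OOM killer."""
    events = []
    for line in log_lines:
        match = OOM_RE.search(line)
        if match:
            events.append(match.group("proc"))
    return events


if __name__ == "__main__":
    sample = [
        "java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0",
        "some unrelated kernel message",
    ]
    print(find_oom_events(sample))
```

A non-empty result would be enough for an error message to point the user at memory sizing instead of showing an empty failure reason.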
remix_tj | Yes for sure i'll do on monday | 11:46 |
remix_tj | Fyi with more ram everything worked well | 11:47 |
*** nkrinner_afk is now known as nkrinner | 11:47 | |
*** iwonka has quit IRC | 11:47 | |
remix_tj | Min requirements by cloudera were satisfied by my old flavor, but that was not enough | 11:48 |
remix_tj | maybe also Sahara documentation may include some minimum requirements for each kind of node that compose a cluster deployed by a given plugin | 11:55 |
tosky | but that should be just a reference to the documentation of each plugin; the risk is duplicating an information which lives elsewhere | 11:56 |
remix_tj | Anyway thank you very much because that error was driving me crazy | 11:57 |
tosky | we can't do anything if Cloudera uses more resources than what they suggest :) | 11:57 |
remix_tj | Is there no one from cloudera? | 12:04 |
tosky | here? | 12:05 |
remix_tj | Yes | 12:11 |
*** iwonka has joined #openstack-sahara | 12:14 | |
tosky | I think only MapR contributes directly nowadays | 12:15 |
tosky | we had Intel developers working on Cloudera, but I'm not sure that they are still working on it | 12:16 |
*** Poornima has quit IRC | 12:22 | |
*** Poornima has joined #openstack-sahara | 12:24 | |
*** pgadiya has quit IRC | 12:24 | |
*** chlong has joined #openstack-sahara | 12:40 | |
*** iurygregory has joined #openstack-sahara | 12:43 | |
iurygregory | Hello people, does anyone know if it's possible to create a cluster with instances based on volumes? | 12:45 |
tosky | iurygregory: we have an approved spec for ocata, but I'm not sure about the status | 12:55 |
tosky | unfortunately specs.openstack.org does not work, but I can find the original review | 12:55 |
tosky | https://review.openstack.org/#/c/349516/ | 12:56 |
tosky | iurygregory: check ^^ | 12:57 |
tosky | uhm, but I don't see that flag implemented, so maybe the code is not there yet | 12:57 |
iurygregory | tosky, oh thanks | 12:58 |
*** Poornima has quit IRC | 13:16 | |
*** dave-mccowan has joined #openstack-sahara | 13:19 | |
*** tellesnobrega has quit IRC | 13:31 | |
*** nkrinner is now known as nkrinner_afk | 14:16 | |
*** jamielennox is now known as jamielennox|away | 14:16 | |
*** shuyingya has joined #openstack-sahara | 14:40 | |
*** shuyingya has quit IRC | 14:44 | |
openstackgerrit | Luigi Toscano proposed openstack/sahara master: _get_os_distrib() can return 'redhat', add mapping https://review.openstack.org/452230 | 14:49 |
*** rcernin has quit IRC | 15:08 | |
*** vgridnev has joined #openstack-sahara | 15:45 | |
*** pcaruana has quit IRC | 15:46 | |
*** tosky has quit IRC | 16:45 | |
*** abalutoiu has quit IRC | 16:53 | |
*** vgridnev has quit IRC | 17:14 | |
*** vgridnev has joined #openstack-sahara | 17:15 | |
*** tosky has joined #openstack-sahara | 17:48 | |
*** tellesnobrega has joined #openstack-sahara | 17:50 | |
*** vgridnev has quit IRC | 18:11 | |
*** abalutoiu has joined #openstack-sahara | 18:13 | |
*** vgridnev has joined #openstack-sahara | 18:15 | |
*** tesseract has quit IRC | 18:21 | |
*** vgridnev has quit IRC | 19:29 | |
*** abalutoiu has quit IRC | 19:41 | |
-openstackstatus- NOTICE: lists.openstack.org will be offline from 20:00 to 23:00 UTC for planned upgrade maintenance | 19:59 | |
openstackgerrit | mimansa04 proposed openstack/sahara master: added timeout function in health check function https://review.openstack.org/449308 | 20:14 |
openstackgerrit | Merged openstack/sahara master: _get_os_distrib() can return 'redhat', add mapping https://review.openstack.org/452230 | 21:06 |
*** gautam has quit IRC | 21:47 | |
-openstackstatus- NOTICE: The upgrade maintenance for lists.openstack.org has been completed and it is back online. | 21:51 | |
*** iwonka has quit IRC | 22:37 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/sahara master: Updated from global requirements https://review.openstack.org/451410 | 22:54 |
*** tosky has quit IRC | 23:47 |