Friday, 2017-03-31

*** jamielennox is now known as jamielennox|away00:25
*** jamielennox|away is now known as jamielennox00:31
*** tuanluong has joined #openstack-sahara00:35
*** abalutoiu has quit IRC00:46
*** shuyingya has joined #openstack-sahara01:38
*** shuyingya has quit IRC01:44
*** shuyingya has joined #openstack-sahara01:44
*** shuyingya has quit IRC01:44
*** shuyingya has joined #openstack-sahara01:47
*** shuyingya has quit IRC03:03
*** shuyingya has joined #openstack-sahara03:03
*** Poornima has joined #openstack-sahara03:59
*** Poornima has quit IRC03:59
*** Poornima has joined #openstack-sahara04:11
*** tellesnobrega has quit IRC04:12
*** tellesnobrega has joined #openstack-sahara04:18
*** vgridnev has joined #openstack-sahara05:03
*** nkrinner_afk is now known as nkrinner05:16
*** tellesnobrega has quit IRC05:16
*** tellesnobrega has joined #openstack-sahara05:29
*** gauPaw has joined #openstack-sahara05:42
*** gauPaw is now known as gautam05:43
*** rcernin has joined #openstack-sahara05:44
*** Poornima has quit IRC05:45
*** gautam has quit IRC05:48
*** Poornima has joined #openstack-sahara05:54
openstackgerritEvgeny Sikachev proposed openstack/sahara-tests master: Move methods from runner to utils  https://review.openstack.org/44595406:31
*** pgadiya has joined #openstack-sahara06:40
*** Poornima has quit IRC06:49
*** Poornima has joined #openstack-sahara06:51
*** Poornima has quit IRC06:52
*** Poornima has joined #openstack-sahara06:53
*** Poornima has quit IRC06:53
*** Poornima has joined #openstack-sahara06:54
*** abalutoiu has joined #openstack-sahara07:00
*** pgadiya has quit IRC07:01
*** tesseract has joined #openstack-sahara07:08
*** anshul has joined #openstack-sahara07:10
*** anshul has quit IRC07:10
*** pgadiya has joined #openstack-sahara07:10
*** anshul has joined #openstack-sahara07:10
*** anshul has quit IRC07:16
*** Poornima has quit IRC07:17
*** gautam has joined #openstack-sahara07:19
*** Poornima has joined #openstack-sahara07:21
*** gautam has quit IRC07:23
*** gautam has joined #openstack-sahara07:23
*** pcaruana has joined #openstack-sahara07:27
*** anshul has joined #openstack-sahara07:27
*** Poornima has quit IRC07:29
*** vgridnev has quit IRC07:33
-openstackstatus- NOTICE: Jobs in gate are failing with POST_FAILURE. Infra roots are investigating07:44
*** ChanServ changes topic to "Jobs in gate are failing with POST_FAILURE. Infra roots are investigating"07:44
*** rcernin has quit IRC07:54
*** tesseract has quit IRC07:54
*** tesseract has joined #openstack-sahara07:55
*** rcernin has joined #openstack-sahara07:56
*** Poornima has joined #openstack-sahara07:58
*** shuyingya has quit IRC08:13
-openstackstatus- NOTICE: logs.openstack.org has corrupted disks, it's being repaired. Please avoid rechecking until this is fixed08:25
*** ChanServ changes topic to "logs.openstack.org has corrupted disks, it's being repaired. Please avoid rechecking until this is fixed"08:25
*** abalutoiu_ has joined #openstack-sahara08:35
*** abalutoiu has quit IRC08:37
*** shuyingya has joined #openstack-sahara08:42
*** shuyingya has quit IRC08:46
*** remix_tj has joined #openstack-sahara08:51
remix_tjhello, i'm running newton and while creating a CDH 5.7 cluster i get this error: Creating cluster failed for the following reason(s): ''08:53
remix_tjif i dig in the logs i get also this bad python traceback: https://pastebin.com/SvKJSsPK08:54
*** pgadiya has quit IRC09:02
*** pgadiya has joined #openstack-sahara09:18
*** Poornima has quit IRC09:21
ltosky[m]Uhm, I will have better connectivity soon, but: does the instance have connectivity? Did you assign a floating ip pool in the node group templates?09:23
*** pgadiya has quit IRC09:26
*** Poornima has joined #openstack-sahara09:26
*** pgadiya has joined #openstack-sahara09:30
*** tosky has joined #openstack-sahara09:35
* tosky in his proper IRC self09:35
toskyremix_tj: sooo09:35
remix_tjtosky: now i enabled the debug, let's see what's happening09:36
remix_tji forgot to say that the guest image i'm using is the one downloaded from mirantis repository09:37
toskywhich should work, but let's see09:38
tosky(worst case, you can rebuild it)09:38
toskyremix_tj: but about the two questions above, did you check them?09:39
*** abalutoiu_ has quit IRC09:57
remix_tjTosky yes connectivity is ok10:00
toskyand the floating ip pool in the templates?10:03
remix_tjyes, everything ok10:03
remix_tji can ssh to vms without problems10:04
remix_tjwow, now i got ECONNREFUSED, fuck.10:04
remix_tj.10:05
*** abalutoiu has joined #openstack-sahara10:07
remix_tjin this case with econnrefused i see that it wants to connect to http://10.6.124.16:7180/api/v8/commands/58 and then reports an error10:11
toskycloudera manager10:13
toskywhat address is that one? On the private network of the tenant?10:14
remix_tj10.6.124.0/22 is my floating ip pool10:14
toskydo you see the ECONNREFUSED from the cluster nodes (master/workers) or from sahara-engine.log ?10:15
remix_tjsahara-engine.log10:15
remix_tji see that there are also some requests to cloudera manager api before that error and seems to be ok10:16
remix_tji'm posting now a part of the log10:16
remix_tjtosky: https://pastebin.com/ZFFi8ZAV10:17
toskylike it's dying10:18
toskyand if you log into the cloudera manager node? (maybe through the web console, if you set a root password on the image)10:19
remix_tjjava invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=010:19
remix_tj!!!!10:19
openstackremix_tj: Error: "!!!" is not a valid command.10:19
remix_tjmaybe the image is broken10:19
toskywhich flavor did you use?10:19
*** tuanluong has quit IRC10:20
toskyand what is the layout of the nodes?10:20
remix_tj1 node dedicated to the manager with m1.medium (2cpu 4gb ram)10:21
toskygo for 610:21
remix_tj6 gigs?10:21
toskyyep10:21
remix_tjok, i'll try now10:21
toskyat least for the manager10:21
remix_tjthen i wait for the empty error to appear...10:21
toskythis is what we use on the gates (the yaml is a custom one, but the structure should be sufficiently self-describing): http://git.openstack.org/cgit/openstack/sahara-tests/tree/sahara_tests/scenario/defaults/newton/cdh-5.7.0.yaml.mako10:22
remix_tjtosky: maybe the empty error is because the manager was crashing during the http request?10:23
toskyin my local tests, I used 6GiB of RAM for large_flavor_id10:24
toskymaybe10:24
toskywithout the manager nothing can happen :)10:24
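The ECONNREFUSED reported in sahara-engine.log means nothing was accepting connections on the manager's API port at that moment (7180 in the URL quoted above). A minimal sketch of a reachability check follows; the function name `port_is_open` and the timeout value are assumptions for illustration, not part of Sahara.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only; it is not Sahara code.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers a refused connection (ECONNREFUSED), timeouts and
        # unreachable hosts alike.
        return False
```

For the cluster discussed here, the call would look like `port_is_open("10.6.124.16", 7180)`. A False result only says the port is not reachable; it does not say why the manager stopped listening.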
*** nkrinner is now known as nkrinner_afk10:24
remix_tjmaybe is documented somewhere....10:25
remix_tjbut is a bad crash10:25
toskydid you see anything else before that 'java invoked oom-killer'?10:26
remix_tjsome service activations10:27
remix_tjnothing else. Sorry but i just destroyed the vm10:28
remix_tjnow i'm trying with your suggestion10:28
remix_tji'm going to lunch and be back in 20 mins10:28
*** abalutoiu has quit IRC10:31
*** tellesnobrega has quit IRC10:33
*** abalutoiu has joined #openstack-sahara10:46
*** Poornima has quit IRC10:52
remix_tjtosky: we moved to a new error, Failed to Provision Hadoop Cluster: Failed to format NameNode. Error ID: 0ed7fc6c-b04a-43f2-833c-0e5de951f02b10:55
remix_tjbut seems that now cloudera manager issue has been fixed10:56
toskyuhm, still in sahara-engine, right? More details?10:56
remix_tjthis is an error on the interface, now i look at the logs10:57
remix_tjtosky: https://pastebin.com/6cehzE0T10:58
toskythe error comes from cloudera manager, uhm11:01
remix_tji see that credentials on cloudera manager are set11:01
remix_tjtosky: if i go in cloudera manager interface i see 4 configuration errors, 3 about hdfs01 (Missing required value: DataNode Data Directory,11:03
remix_tjMissing required value: HDFS Checkpoint Directories, Missing required value: NameNode Data Directories) one about yarn01 (Missing required value: NodeManager Local Directories)11:03
toskyI think those may be red herrings11:05
remix_tj There is insufficient memory for the Java Runtime Environment to continue.11:06
remix_tjthis is reported in the error11:06
toskyoh11:06
remix_tjthis cloudera is quite resource expensive :-P11:06
toskyyeah11:06
* tosky -> lunchtime11:10
remix_tjmaybe there should be some more information from sahara side to say what's going wrong and suggest some actions to users11:12
*** anshul has quit IRC11:16
*** tellesnobrega has joined #openstack-sahara11:42
*** Poornima has joined #openstack-sahara11:42
toskythat would be good for sure, even if it's not easy to find the reason11:43
toskystill at least a hardcoded list of checks would be useful, like 'check on the manager if the manager is running'11:43
toskywould you mind filing a bug with some suggestions? The issue is broad, but at least report about oom_killer or processes killed would be useful11:43
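One entry in such a list of checks could be a scan of the manager node's kernel log for OOM-killer activity, like the "java invoked oom-killer" line quoted earlier. This is only a sketch: the function name `find_oom_events` is hypothetical, and how the kernel log text would be fetched from the node is left out.

```python
import re

# Matches kernel log lines such as:
#   java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
OOM_INVOKED = re.compile(r"(?P<process>\S+) invoked oom-killer")


def find_oom_events(log_text):
    """Return the names of processes that triggered the OOM killer.

    Hypothetical helper for illustration only; it is not Sahara code.
    """
    return [m.group("process") for m in OOM_INVOKED.finditer(log_text)]
```

A non-empty result would be a strong hint that the instance flavor has too little RAM for the services placed on it.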
remix_tjYes for sure i'll do on monday11:46
remix_tjFyi with more ram everything worked well11:47
*** nkrinner_afk is now known as nkrinner11:47
*** iwonka has quit IRC11:47
remix_tjMin requirements by cloudera were satisfied by my old flavor, but that was not enough11:48
remix_tj maybe also Sahara documentation may include some minimum requirements for each kind of node that compose a cluster deployed by a given plugin11:55
toskybut that should be just a reference to the documentation of each plugin; the risk is duplicating an information which lives elsewhere11:56
remix_tjAnyway thank you very much because that error was driving me crazy11:57
toskywe can't do anything if Cloudera uses more resources than what they suggest :)11:57
remix_tjIs there no one from cloudera?12:04
toskyhere?12:05
remix_tjYes12:11
*** iwonka has joined #openstack-sahara12:14
toskyI think only MapR contributes directly nowadays12:15
toskywe had Intel developers working on Cloudera, but I'm not sure that they are still working on it12:16
*** Poornima has quit IRC12:22
*** Poornima has joined #openstack-sahara12:24
*** pgadiya has quit IRC12:24
*** chlong has joined #openstack-sahara12:40
*** iurygregory has joined #openstack-sahara12:43
iurygregoryHello people, does anyone know if it's possible to create a cluster with instances based on volumes?12:45
toskyiurygregory: we have an approved spec for ocata, but I'm not sure about the status12:55
toskyunfortunately specs.openstack.org does not work, but I can find the original review12:55
toskyhttps://review.openstack.org/#/c/349516/12:56
toskyiurygregory: check ^^12:57
toskyuhm, but I don't see that flag implemented, so maybe the code is not there yet12:57
iurygregorytosky, oh thanks12:58
*** Poornima has quit IRC13:16
*** dave-mccowan has joined #openstack-sahara13:19
*** tellesnobrega has quit IRC13:31
*** nkrinner is now known as nkrinner_afk14:16
*** jamielennox is now known as jamielennox|away14:16
*** shuyingya has joined #openstack-sahara14:40
*** shuyingya has quit IRC14:44
openstackgerritLuigi Toscano proposed openstack/sahara master: _get_os_distrib() can return 'redhat', add mapping  https://review.openstack.org/45223014:49
*** rcernin has quit IRC15:08
*** vgridnev has joined #openstack-sahara15:45
*** pcaruana has quit IRC15:46
*** tosky has quit IRC16:45
*** abalutoiu has quit IRC16:53
*** vgridnev has quit IRC17:14
*** vgridnev has joined #openstack-sahara17:15
*** tosky has joined #openstack-sahara17:48
*** tellesnobrega has joined #openstack-sahara17:50
*** vgridnev has quit IRC18:11
*** abalutoiu has joined #openstack-sahara18:13
*** vgridnev has joined #openstack-sahara18:15
*** tesseract has quit IRC18:21
*** vgridnev has quit IRC19:29
*** abalutoiu has quit IRC19:41
-openstackstatus- NOTICE: lists.openstack.org will be offline from 20:00 to 23:00 UTC for planned upgrade maintenance19:59
openstackgerritmimansa04 proposed openstack/sahara master: added timeout function in health check function  https://review.openstack.org/44930820:14
openstackgerritMerged openstack/sahara master: _get_os_distrib() can return 'redhat', add mapping  https://review.openstack.org/45223021:06
*** gautam has quit IRC21:47
-openstackstatus- NOTICE: The upgrade maintenance for lists.openstack.org has been completed and it is back online.21:51
*** iwonka has quit IRC22:37
openstackgerritOpenStack Proposal Bot proposed openstack/sahara master: Updated from global requirements  https://review.openstack.org/45141022:54
*** tosky has quit IRC23:47
