*** rcernin has quit IRC | 06:57 | |
*** pcaruana has joined #openstack-sahara | 07:25 | |
openstackgerrit | Tobias Urdin proposed openstack/puppet-sahara master: Remove deprecated parameters https://review.openstack.org/620808 | 08:23 |
---|---|---|
*** tosky has joined #openstack-sahara | 08:48 | |
*** tellesnobrega_ is now known as tellesnobrega | 10:00 | |
tellesnobrega | tosky, morning, I see that you have a solution for the vanilla issue | 10:01 |
tellesnobrega | did that make the cluster start? | 10:01 |
tosky | hi, yes, now the behavior using ubuntu and centos7 is the same | 10:01 |
tosky | both fail on EDP and scale -_-' but at least it's a start | 10:02 |
tosky | did you hit the same issue? | 10:03 |
tellesnobrega | I just woke up, will test it soon | 10:04 |
tellesnobrega | creating all vanilla versions for ubuntu and centos7 | 10:33 |
tellesnobrega | and I will test it out | 10:33 |
openstackgerrit | Merged openstack/sahara-image-elements master: firstboot: make rc-local start after cloud-init https://review.openstack.org/621302 | 10:50 |
openstackgerrit | Luigi Toscano proposed openstack/sahara-image-elements master: Plain Ubuntu image are still based on Xenial https://review.openstack.org/621540 | 10:59 |
openstackgerrit | Merged openstack/sahara stable/queens: Add DEBIAN_FRONTEND=noninteractive in front of apt-get install commands https://review.openstack.org/621351 | 11:20 |
openstackgerrit | Merged openstack/sahara stable/rocky: doc: restructure the image building documentation https://review.openstack.org/621306 | 11:20 |
tosky | tellesnobrega: do you think it's worth backporting (and adapting) the doc refactoring patch also to queens? It's going to be supported for 8+ months, until August (and then in extended maintenance mode) | 11:47 |
tellesnobrega | tosky, it is a long time, so I would say yes | 11:49 |
tosky | oki :) | 11:49 |
tellesnobrega | tosky, just created a worklist with apiv2 stuff https://storyboard.openstack.org/#!/worklist/533 | 11:57 |
tosky | I noticed the updates on storyboard | 12:43 |
tellesnobrega | tosky, so, I just started a centos7 vanilla 2.7.1 cluster | 12:52 |
tellesnobrega | without your fix | 12:54 |
tellesnobrega | and the cluster is active | 12:54 |
*** dave-mccowan has joined #openstack-sahara | 12:57 | |
tosky | tellesnobrega: yes, and it may happen, if the order of the services happens to be the lucky one | 13:07 |
tellesnobrega | hum, I see | 13:07 |
openstackgerrit | Tobias Urdin proposed openstack/puppet-sahara master: Deprecate ZeroMQ https://review.openstack.org/621568 | 13:12 |
tosky | added a story to track the API v2 changes required for sahara-tests | 13:31 |
tosky | can I tag it as high-priority? | 13:31 |
*** dave-mccowan has quit IRC | 13:32 | |
tellesnobrega | yes please | 13:34 |
*** dave-mccowan has joined #openstack-sahara | 14:54 | |
openstackgerrit | Telles Mota Vidal Nóbrega proposed openstack/sahara master: Fixing cluster scale https://review.openstack.org/616193 | 17:02 |
tosky | tellesnobrega: could this fix ^^ solve the issue that I noticed when scaling vanilla clusters using the template in sahara-tests? | 17:08 |
tellesnobrega | yes | 17:08 |
tellesnobrega | it does | 17:08 |
tosky | oh! | 17:08 |
tosky | let me try it then | 17:08 |
tosky | now I get it :) | 17:08 |
tellesnobrega | that is when I noticed this change was needed | 17:09 |
openstackgerrit | Luigi Toscano proposed openstack/sahara-image-elements stable/rocky: firstboot: make rc-local start after cloud-init https://review.openstack.org/621641 | 17:17 |
tellesnobrega | tosky, | 17:37 |
tellesnobrega | scaling works here | 17:37 |
tellesnobrega | EDP still failing | 17:37 |
tosky | I see mixed errors for EDP failures; on the node where I use radosgw, some are related to some swift URL being unreachable, others related to weird errors on retrieving some blocks from HDFS | 17:43 |
tosky | I need to check on the system which uses swift instead of radosgw, but it was failing too (maybe a subset of the failures) | 17:43 |
tellesnobrega | I see | 17:50 |
tosky | tellesnobrega: did you notice which job(s) failed specifically? In my last re-run on the current code rocky + the patch, the first run of EDP jobs saw only one KILLED job | 17:58 |
tosky | and that's the Hive job | 17:58 |
tosky | nothing new | 17:58 |
tosky | I'm waiting for the second run of the jobs, after the scaling operation | 17:58 |
tosky | in the meantime I re-run the tests for vanilla 2.8.2/centos7 on the split plugin/devstack deployment | 17:59 |
tosky | let's see | 17:59 |
tosky | I will also run mapr afterwards (as mapr scenario test also includes scaling, and it's faster than ambari) | 17:59 |
tellesnobrega | on centos7, vanilla 2.7.1 | 18:00 |
tellesnobrega | AssertionError: Job with id=47344935-b912-43e3-9639-1e1449eee700, name=test-d92a1077, type=Pig has status FAILED | 18:00 |
tellesnobrega | Job with id=5956755b-4276-4288-a798-854c3feb06d6, name=test-f786028b, type=MapReduce has status FAILED | 18:00 |
tellesnobrega | Job with id=af766279-eac2-418f-849f-94341c7fff5e, name=test-da3316f7, type=MapReduce.Streaming has status FAILED | 18:00 |
tellesnobrega | Job with id=440bb3b9-c0be-42b8-8b86-12818448bde7, name=test-f3e76d23, type=Java has status FAILED | 18:00 |
tellesnobrega | Job with id=1c636e54-bc4c-44b6-9548-2ec67d756082, name=test-59888187, type=Hive has status FAILED | 18:00 |
tosky | that's the split version? | 18:01 |
tellesnobrega | no, master | 18:01 |
tosky | anyway, I'm also going to build a centos7/vanilla 2.7.1 image | 18:02 |
tellesnobrega | ok | 18:02 |
tosky | do you see any special error for those jobs from the web console? | 18:02 |
tellesnobrega | not right now, I will run again and see how that goes | 18:04 |
tosky | getting closer | 18:05 |
tosky | you will need another rebase for sure :P | 18:05 |
tellesnobrega | too much stuff changing? | 18:05 |
tosky | few useful patches | 18:06 |
tellesnobrega | nice | 18:06 |
tellesnobrega | no worries on rebase, I think I got a good handle on it now | 18:06 |
Gaasmann | About the problem with telnetlib.Telnet for cdh, what would you think about a patch like that? http://paste.openstack.org/show/736588/ | 18:51 |
Gaasmann | 2018-12-03 18:46:50.102 20 DEBUG sahara.plugins.cdh.client.http_client [req-d67860ab-0093-4c8f-8452-9b5ceec3941f 78e9b31dfce642fa9995a58d017458d1 a4b84a529e9e4eea8ff7bfbe51c48e32 - - -] [instance: none, cluster: 337be666-a115-46d4-9b73-3e704e14b0ec] Method: GET, URL: http://192.168.52.18:7180/api/v8/users/admin execute | 18:52 |
Gaasmann | /var/lib/kolla/venv/local/lib/python2.7/site-packages/sahara/plugins/cdh/client/http_client.py:124 | 18:52 |
Gaasmann | this timeout, maybe the same issue? | 18:52 |
tosky | that timeout seems to be related, do you have a longer stacktrace? | 19:13 |
tosky | about the patch, it may not work in general | 19:13 |
tosky | oh | 19:13 |
tosky | uhm, maybe it can, you are exploiting the usage of ssh on one of the nodes | 19:14 |
tosky | let me test it with my "normal" deployment | 19:15 |
Gaasmann | tosky: the longer stacktrace http://paste.openstack.org/show/736591/ | 19:40 |
Gaasmann | For the patch, I use what seems to be used for the preparation/configuration of the cluster so I guess it uses the Remote/SshRemoteDriver classes | 19:43 |
Gaasmann | (it makes the debug log a bit verbose though) | 19:45 |
tosky | Gaasmann: that stacktrace looks like another direct call to the API | 19:49 |
tosky | I guess we need to wrap all API calls somehow | 19:49 |
* tosky bbl | 19:49 | |
Gaasmann | that is the first API call I see during the cluster creation. I guess it's possible to ssh and run a curl command locally but it sounds like a quick and dirty fix | 19:54 |
tosky | I'd say: please send that patch as it is (it worked for me on a "normal" cloudera deployment), and then let's see if it makes sense to extend it to support all cases where sahara uses the CDH API | 20:25 |
tosky | I'd also suggest extending the scope of the story to address all the possible issues of the same kind; if we need to split the commits, we can use different tasks | 20:26 |
tosky | Gaasmann: you may want to edit the main content of the story, instead of just adding a comment :) | 20:48 |
Gaasmann | good idea :-) | 20:50 |
tosky | Gaasmann: I think that storyboard supports user editing of the comments, but the feature is disabled on the openstack instance | 21:59 |
openstackgerrit | Luigi Toscano proposed openstack/sahara master: DNM TESTONLY py3 test: remove i18n call to db exceptions https://review.openstack.org/600689 | 22:06 |
*** goldyfruit has joined #openstack-sahara | 22:14 | |
*** rcernin has joined #openstack-sahara | 22:18 | |
*** pcaruana has quit IRC | 22:18 | |
openstackgerrit | Merged openstack/sahara-image-elements stable/rocky: firstboot: make rc-local start after cloud-init https://review.openstack.org/621641 | 22:41 |
openstackgerrit | Luigi Toscano proposed openstack/sahara-image-elements stable/queens: firstboot: make rc-local start after cloud-init https://review.openstack.org/621721 | 22:47 |
*** irclogbot_2 has joined #openstack-sahara | 23:09 | |
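
The CDH exchange above (18:51 to 20:25) is about replacing a direct reachability check (the log mentions `telnetlib.Telnet`) with one that runs on a cluster node over ssh. The paste itself is not reproduced in this log, so what follows is only a rough sketch of the idea and not the proposed patch. `port_is_open` and `remote_port_check_command` are hypothetical names, and the second helper assumes `bash` and the coreutils `timeout` command exist on the node and that the host value is trusted.

```python
import shlex
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This is the "direct" check: it only works when the caller can route
    to the address itself.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network and timeouts.
        return False


def remote_port_check_command(host, port, timeout=5):
    """Build a shell command that performs the same check on a node.

    Hypothetical helper: the command exits with status 0 when the port
    accepts a connection, using bash's /dev/tcp pseudo-device. It is meant
    to be executed through whatever ssh mechanism already reaches the node.
    """
    probe = "exec 3<>/dev/tcp/{}/{}".format(host, int(port))
    return "timeout {} bash -c {}".format(int(timeout), shlex.quote(probe))
```

The direct variant fails when the controller cannot reach the cluster's internal network, which seems to be the situation described in the log; the command-building variant moves the probe to a machine that can.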