火花& EC2上的SparkR配置 - Java超时

Date: 2015-02-17 19:58:00

Tags: r amazon-ec2 apache-spark

I am trying to run Spark and SparkR on a small EC2 cluster using the provided scripts and instructions. Whenever I request an operation that requires a computation over an RDD (e.g., collect() or reduce()), I get the error logged below. The workers appear to launch correctly: if I only call parallelize(), I can see them running through the master's web UI.
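
For context, here is a minimal sketch of the pattern that fails, written against the AMPLab SparkR-pkg API of that era; the master URL is a placeholder:

library(SparkR)

# Connect to the cluster's standalone master (placeholder URL)
sc <- sparkR.init(master = "spark://ec2-master.example.com:7077")

rdd <- parallelize(sc, 1:1000)  # succeeds on its own; workers show up in the web UI
collect(rdd)                    # any action that computes over the RDD hits the timeout
reduce(rdd, "+")                # same failure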

The error I get is similar to the one in Intermittent Timeout Exception using Spark, and I have already worked through all of the solutions there (fixing the URLs in the conf files, disabling the firewall, etc.) with no luck.
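
The 30-second limit in the log below corresponds to spark.akka.askTimeout in Spark 1.x. A sketch of how such properties can be passed from SparkR (the sparkEnvir argument is from the SparkR-pkg README; the URL is a placeholder), though raising them was not the fix here:

# Pass Spark properties when creating the context; values are illustrative
sc <- sparkR.init(
  master     = "spark://ec2-master.example.com:7077",
  sparkEnvir = list("spark.akka.askTimeout" = "120",   # default is 30 seconds
                    "spark.akka.timeout"    = "200")
)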

Here is the error log; thanks in advance for your help:

15/02/17 19:10:22 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/02/17 19:10:22 INFO spark.SecurityManager: Changing view acls to: root,-
15/02/17 19:10:22 INFO spark.SecurityManager: Changing modify acls to: root,-
15/02/17 19:10:22 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root, -); users with modify permissions: Set(root, -)
15/02/17 19:10:23 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/02/17 19:10:23 INFO Remoting: Starting remoting
15/02/17 19:10:23 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@-.ec2.internal:60218]
15/02/17 19:10:23 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 60218.
15/02/17 19:10:53 ERROR security.UserGroupInformation: PriviledgedActionException as:- cause:java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException: Unknown exception in doAs
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1134)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:115)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:161)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.security.PrivilegedActionException: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        ... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:127)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)
        ... 7 more

1 Answer:

Answer 0 (score: 0)

Ultimately, the problem was solved through a combination of:

- Updates to SparkR, which have resolved a number of serialization issues.

- Recognizing that the spark-ec2 scripts require the control node and the master node to be the same machine.

- Replacing calls to parallelize() with distributing the data via Hadoop (HDFS) and then loading it from there, as sketched below.
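
A sketch of that last workaround, assuming the old SparkR-pkg API and the ephemeral HDFS that spark-ec2 sets up; paths, ports, and hostnames are placeholders:

# First push the data into HDFS from the shell on the master, e.g.:
#   /root/ephemeral-hdfs/bin/hadoop fs -put mydata.txt /data/mydata.txt

library(SparkR)
sc <- sparkR.init(master = "spark://ec2-master.example.com:7077")

# Read from HDFS instead of shipping a local R object with parallelize()
lines <- textFile(sc, "hdfs://ec2-master.example.com:9000/data/mydata.txt")
count(lines)  # actions now run on the executors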

I am writing an introduction to SparkR for R programmers, which I hope will help people deal with this kind of thing in the future.