Spark on AWS EC2 throws EOFException when starting a job

Time: 2014-09-12 18:04:48

Tags: exception hadoop amazon-web-services apache-spark

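I am trying to run my Spark job on a Spark cluster that I created with the provided spark-ec2 script. I can run the SparkPi example, but whenever I run my own job I get this exception: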

Exception in thread "main" java.io.IOException: Call to ec2-XXXXXXXXXX.compute-1.amazonaws.com/10.XXX.YYY.ZZZZ:9000 failed on local exception: java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
        at org.apache.hadoop.ipc.Client.call(Client.java:1075)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at com.sun.proxy.$Proxy6.setPermission(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at com.sun.proxy.$Proxy6.setPermission(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.setPermission(DFSClient.java:1042)
        at org.apache.hadoop.hdfs.DistributedFileSystem.setPermission(DistributedFileSystem.java:531)
        at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:93)
        at org.apache.spark.util.FileLogger.start(FileLogger.scala:70)
        at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:71)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:252)
        at com.here.traffic.collection.archiver.IsoCcMergeJob$.isoMerge(IsoCcMergeJob.scala:55)
        at com.here.traffic.collection.archiver.IsoCcMergeJob$.main(IsoCcMergeJob.scala:11)
        at com.here.traffic.collection.archiver.IsoCcMergeJob.main(IsoCcMergeJob.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:804)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:749)
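
The trace passes through EventLoggingListener.start and FileLogger.createLogDir, so the failing RPC is the creation of the event log directory on HDFS (port 9000) during SparkContext construction, before any job code runs. As a diagnostic only, a minimal sketch like the following turns event logging off to see whether the context then starts. The object name and app name are hypothetical; spark.eventLog.enabled is the standard Spark setting. This does not fix an underlying HDFS client/server mismatch, it only helps isolate it.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical diagnostic job, submitted with spark-submit (which supplies the master URL).
object EventLogCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("event-log-check")            // hypothetical app name
      .set("spark.eventLog.enabled", "false")   // skip the HDFS event log directory setup

    val sc = new SparkContext(conf)
    try {
      // Trivial action to confirm that the context is usable.
      println(sc.parallelize(1 to 10).count())
    } finally {
      sc.stop()
    }
  }
}
```

If the context starts with event logging disabled, the problem is confined to the HDFS RPC call rather than to the S3 input or output.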

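From what I found while searching for a solution, this looks like it could be a Hadoop library version mismatch, but I confirmed that Spark uses 1.0.4 and that my job is compiled against the same version.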
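
One way to double-check this at runtime, rather than at compile time, is to print the Hadoop version that is actually on the job's classpath. This is a minimal sketch, assuming hadoop-core or hadoop-common is available on the classpath; the object name is hypothetical, while org.apache.hadoop.util.VersionInfo is part of Hadoop.

```scala
import org.apache.hadoop.util.VersionInfo

// Hypothetical helper: run it with the same classpath as the failing job.
object HadoopVersionCheck {
  def main(args: Array[String]): Unit = {
    // Version of the Hadoop client library that was actually loaded.
    println("Hadoop version on classpath: " + VersionInfo.getVersion)

    // Jar the class was loaded from; useful when the job jar bundles its own Hadoop classes.
    val source = classOf[VersionInfo].getProtectionDomain.getCodeSource
    println("Loaded from: " + (if (source != null) source.getLocation else "unknown"))
  }
}
```

If the printed version differs from the one the cluster's NameNode runs, the job jar is probably bundling a different Hadoop client than the one it was meant to be compiled against.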

To give more context, my job performs a left outer join of two files stored in S3 and writes the result back to S3.
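
For reference, a job of that shape looks roughly like the sketch below. This is not the original IsoCcMergeJob code: the bucket names, paths, the "key,value" line format, and the object name are all assumptions made for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits, required on Spark 1.0.x

// Hypothetical job: left outer join of two text files in S3, result written back to S3.
object LeftJoinSketch {
  // Assumed input format: one "key,value" record per line.
  def toPair(line: String): (String, String) = {
    val idx = line.indexOf(',')
    if (idx < 0) (line, "") else (line.substring(0, idx), line.substring(idx + 1))
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("left-join-sketch"))
    try {
      // Hypothetical S3 locations; s3n:// is the scheme used by Hadoop 1.x clients.
      val left  = sc.textFile("s3n://example-bucket/left.csv").map(toPair)
      val right = sc.textFile("s3n://example-bucket/right.csv").map(toPair)

      // Keeps every key from the left side; right-side values are wrapped in Option.
      val joined = left.leftOuterJoin(right)

      joined
        .map { case (key, (l, r)) => s"$key,$l,${r.getOrElse("")}" }
        .saveAsTextFile("s3n://example-bucket/joined")
    } finally {
      sc.stop()
    }
  }
}
```

Nothing in this flow touches HDFS explicitly, which is consistent with the exception coming from Spark's own startup rather than from the join.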

Any idea what might be going wrong?

1 answer:

Answer 0: (score: 1)

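I had a similar experience with the ec2 script. Once we switched to the Cloudera distribution (5.1) for both the cluster (installed with a simple apt-get) and the jar dependencies, almost all of the version problems went away.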

Install Spark: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Installation-Guide/cdh5ig_spark_installation.html

Add Spark as a dependency (search the page for "spark"):

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH-Version-and-Packaging-Information/cdhvd_cdh5_maven_repo.html
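
For an sbt build, the dependency setup might look like the fragment below. Treat it as a sketch: the repository URL and the exact version strings (Spark 1.0.0 and Hadoop 2.3.0 for CDH 5.1.0) are assumptions and should be checked against the page linked above before use.

```scala
// build.sbt (sketch; verify the versions against the Cloudera Maven repository page)
scalaVersion := "2.10.4"

// Assumed location of the Cloudera Maven repository.
resolvers += "cloudera-repos" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

libraryDependencies ++= Seq(
  // "provided" keeps these jars out of the job's assembly, so the cluster's own copies are used.
  "org.apache.spark"  %% "spark-core"    % "1.0.0-cdh5.1.0" % "provided",
  "org.apache.hadoop" %  "hadoop-client" % "2.3.0-cdh5.1.0" % "provided"
)
```

Marking both artifacts as provided is what prevents a second, mismatched Hadoop client from being packaged into the job jar.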