我正在运行一个节点cassandra 2.0.3和Apache Spark 2.0.3
我创建了一个scala程序来使用Spark hadoop API创建一个RDD来访问Cassandra DB。
另外应该在bashrc中为spaark设置什么环境变量,因为我在spark-env.sh中使用下面的配置
export SPARK_MASTER_IP="10.0.3.15"
export SPARK_MASTER_PORT="7077"
export SCALA_HOME="/home/Desktop/CD/scala-2.9.3"
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_INSTANCES=1
export SPARK_WORKER_DIR="/home/Desktop/CD/spark-0.8.0-incubating/sparkdata"
我的示例scala代码如下
val job=new Job()
job.setInputFormatClass(classOf[ColumnFamilyInputFormat])
val host: String = "localhost"
val port: String = "9160"
ConfigHelper.setInputInitialAddress(job.getConfiguration(), host)
ConfigHelper.setInputRpcPort(job.getConfiguration(), port)
ConfigHelper.setInputColumnFamily(job.getConfiguration(), "demodb", "emp")
ConfigHelper.setInputPartitioner(job.getConfiguration(), "Murmur3Partitioner")
CqlConfigHelper.setInputColumns(job.getConfiguration(), "empid,deptid,first_name,last_name")
CqlConfigHelper.setInputWhereClauses(job.getConfiguration(),"empid=104")
// Make a new Hadoop RDD
val casRdd = sc.newAPIHadoopRDD(job.getConfiguration(),
classOf[CqlPagingInputFormat],
classOf[Map[String, ByteBuffer]],
classOf[Map[String, ByteBuffer]])
println(casRdd.count())
但是,当我在Spark Master上运行此作业时,它无法完成作业并提供以下日志。
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException
java.lang.RuntimeException
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:661)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.<init>(CqlPagingRecordReader.java:297)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader.initialize(CqlPagingRecordReader.java:163)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:86)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:74)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:237)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:226)
at org.apache.spark.scheduler.ResultTask.run(ResultTask.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:0 as TID 12 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:0 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 2 (task 0.0:2)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 1]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:2 as TID 13 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:2 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 1 (task 0.0:1)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 2]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:1 as TID 14 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:1 as 2144 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 5 (task 0.0:5)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 3]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:5 as TID 15 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:5 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 11 (task 0.0:11)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 4]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:11 as TID 16 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:11 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 8 (task 0.0:8)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 5]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:8 as TID 17 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:8 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 3 (task 0.0:3)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 6]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:3 as TID 18 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:3 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 6 (task 0.0:6)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 7]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:6 as TID 19 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:6 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 9 (task 0.0:9)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 8]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:9 as TID 20 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:9 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 7 (task 0.0:7)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 9]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:7 as TID 21 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:7 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 10 (task 0.0:10)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 10]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:10 as TID 22 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:10 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 4 (task 0.0:4)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 11]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:4 as TID 23 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:4 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 12 (task 0.0:0)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 12]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:0 as TID 24 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:0 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 13 (task 0.0:2)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 13]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:2 as TID 25 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:2 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 14 (task 0.0:1)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 14]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:1 as TID 26 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:1 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 15 (task 0.0:5)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 15]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:5 as TID 27 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:5 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 16 (task 0.0:11)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 16]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:11 as TID 28 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:11 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 17 (task 0.0:8)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 17]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:8 as TID 29 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:8 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 18 (task 0.0:3)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 18]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:3 as TID 30 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:3 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 19 (task 0.0:6)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 19]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:6 as TID 31 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:6 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 20 (task 0.0:9)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 20]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:9 as TID 32 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:9 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 21 (task 0.0:7)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 21]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:7 as TID 33 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:7 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 22 (task 0.0:10)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 22]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:10 as TID 34 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:10 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 23 (task 0.0:4)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 23]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:4 as TID 35 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:4 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 24 (task 0.0:0)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 24]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:0 as TID 36 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:0 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 25 (task 0.0:2)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 25]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:2 as TID 37 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:2 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 26 (task 0.0:1)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 26]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:1 as TID 38 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:1 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 27 (task 0.0:5)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 27]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:5 as TID 39 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:5 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 29 (task 0.0:8)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 28]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:8 as TID 40 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:8 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 28 (task 0.0:11)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 29]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:11 as TID 41 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:11 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 30 (task 0.0:3)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 30]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:3 as TID 42 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:3 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 31 (task 0.0:6)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 31]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:6 as TID 43 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:6 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 32 (task 0.0:9)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 32]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:9 as TID 44 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:9 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 33 (task 0.0:7)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 33]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:7 as TID 45 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:7 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 34 (task 0.0:10)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 34]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:10 as TID 46 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:10 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 35 (task 0.0:4)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 35]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:4 as TID 47 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:4 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 36 (task 0.0:0)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 36]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:0 as TID 48 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:0 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 37 (task 0.0:2)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 37]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:2 as TID 49 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:2 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 38 (task 0.0:1)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 38]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:1 as TID 50 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:1 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 39 (task 0.0:5)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 39]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:5 as TID 51 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:5 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 40 (task 0.0:8)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 40]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:8 as TID 52 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:8 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 41 (task 0.0:11)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 41]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:11 as TID 53 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:11 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 42 (task 0.0:3)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 42]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:3 as TID 54 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:3 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 43 (task 0.0:6)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 43]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:6 as TID 55 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:6 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 44 (task 0.0:9)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 44]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:9 as TID 56 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:9 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 45 (task 0.0:7)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 45]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:7 as TID 57 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:7 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 46 (task 0.0:10)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 46]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:10 as TID 58 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:10 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 47 (task 0.0:4)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 47]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:4 as TID 59 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:4 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 48 (task 0.0:0)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 48]
14/01/03 16:34:16 ERROR cluster.ClusterTaskSetManager: Task 0.0:0 failed more than 4 times; aborting job
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Remove TaskSet 0.0 from pool
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 51 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 49 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 50 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 52 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 53 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 54 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 52 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 51 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 55 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 53 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 56 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 54 because its task set is gone
14/01/03 16:34:16 INFO scheduler.DAGScheduler: Failed to run count at myown.scala:75
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 57 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 55 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 56 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 58 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 59 because its task set is gone
[error] (run-main) org.apache.spark.SparkException: Job failed: Task 0.0:0 failed more than 4 times
org.apache.spark.SparkException: Job failed: Task 0.0:0 failed more than 4 times
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:760)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:758)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:758)
at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:379)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:441)
at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:149)
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 59 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 57 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 58 because its task set is gone
所以基本上我很困惑并且很难解决这个问题因为我不明白我的scala代码或者spark主从通信或者spark环境配置是否有问题。
请求指导我。
答案 0 :(得分:2)
这个答案可能会帮助您实施:Spark with Cassandra input/output
Datastax宣布了Spark的官方Cassandra驱动程序。使用此解决方案,您无需实现Hadoop接口,因为这是Cassandra和Spark之间的直接桥梁。
答案 1 :(得分:0)
正如您所看到的那样here这是因为Cassandra的CQL RecordReader中抛出了一般的未处理异常。
尝试在启用调试日志的情况下运行Spark,您应该得到导致此问题的错误。
我的猜测问题在于您尝试在jobConfiguration中设置ColumnFamilyInputFormat并将classOf [CqlPagingInputFormat]传递给newHadoopApiRdd方法。但是htat只是猜测。
虽然we目前仅针对Cassandra 1.2.12发布了Calliope,但我们的客户已成功使用它与Cassandra 2.x无任何变化。 Calliope为您提供了一个更简单,更高级别的API,可以与Spark的Cassandra一起使用。我建议你试一试。