Spark JavaSQLContext: object is not an instance of declaring class

Date: 2015-01-06 22:29:54

Tags: cassandra apache-spark

I am trying out a simple example using the Spark Cassandra Connector.

I am using Cassandra 2.0.9 and Spark 1.1.0.

When I execute an SQL query against a JavaRDD built from a CassandraJavaRDD, I get the following error:

    06 Jan 2015 17:01:28,077 DEBUG Cluster        : Cannot connect with protocol V3, trying V2
06 Jan 2015 17:01:28,075 DEBUG Connection     : Connection[/127.0.0.1:9042-1, inFlight=0, closed=true] closing connection
06 Jan 2015 17:01:28,077 DEBUG Connection     : Connection[/127.0.0.1:9042-1, inFlight=0, closed=true] has already terminated
06 Jan 2015 17:01:28,080 DEBUG Connection     : Connection[/127.0.0.1:9042-2, inFlight=0, closed=false] Transport initialized and ready
06 Jan 2015 17:01:28,080 DEBUG ControlConnection: [Control connection] Refreshing node list and token map
06 Jan 2015 17:01:28,090 DEBUG ControlConnection: [Control connection] Refreshing schema
06 Jan 2015 17:01:28,142 DEBUG ControlConnection: [Control connection] Refreshing node list and token map
06 Jan 2015 17:01:28,156 DEBUG ControlConnection: [Control connection] Successfully connected to /127.0.0.1:9042
06 Jan 2015 17:01:28,156 INFO  Cluster        : New Cassandra host /127.0.0.1:9042 added
06 Jan 2015 17:01:28,156 INFO  CassandraConnector: Connected to Cassandra cluster: Test Cluster
06 Jan 2015 17:01:28,156 INFO  LocalNodeFirstLoadBalancingPolicy: Adding host 127.0.0.1 (datacenter1)
06 Jan 2015 17:01:28,167 DEBUG Connection     : Connection[/127.0.0.1:9042-3, inFlight=0, closed=false] Transport initialized and ready
06 Jan 2015 17:01:28,176 DEBUG Connection     : Connection[/127.0.0.1:9042-4, inFlight=0, closed=false] Transport initialized and ready
06 Jan 2015 17:01:28,176 DEBUG Session        : Added connection pool for /127.0.0.1:9042
06 Jan 2015 17:01:28,177 INFO  LocalNodeFirstLoadBalancingPolicy: Adding host 127.0.0.1 (datacenter1)
06 Jan 2015 17:01:28,194 DEBUG CassandraRDD   : Fetching data for range token("m_id") > ? AND token("m_id") <= ? with SELECT "m_id", "m_name" FROM "my_keyspace"."m_table" WHERE token("m_id") > ? AND token("m_id") <= ? ALLOW FILTERING with params [-3710785879179969863,-3308243544180364096]
06 Jan 2015 17:01:28,624 DEBUG CassandraRDD   : Row iterator for range token("m_id") > ? AND token("m_id") <= ? obtained successfully.
06 Jan 2015 17:01:28,633 DEBUG CassandraRDD   : Fetched 1 rows from my_keyspace.m_table for partition 0 in 0.455 s.
06 Jan 2015 17:01:28,634 ERROR Executor       : Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: object is not an instance of declaring class
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2.apply(JavaSQLContext.scala:100)
    at org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2.apply(JavaSQLContext.scala:100)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1.apply(JavaSQLContext.scala:100)
    at org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1.apply(JavaSQLContext.scala:99)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:389)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1165)
    at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
    at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
    at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
    at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
    at org.apache.spark.scheduler.Task.run(Task.scala:54)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
06 Jan 2015 17:01:28,640 DEBUG LocalActor     : [actor] received message StatusUpdate(0,FAILED,java.nio.HeapByteBuffer[pos=0 lim=2723 cap=2723]) from Actor[akka://sparkDriver/deadLetters]
06 Jan 2015 17:01:28,641 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_0, runningTasks: 0
06 Jan 2015 17:01:28,644 INFO  TaskSetManager : Starting task 1.0 in stage 0.0 (TID 1, localhost, ANY, 6186 bytes)
06 Jan 2015 17:01:28,645 INFO  Executor       : Running task 1.0 in stage 0.0 (TID 1)
06 Jan 2015 17:01:28,645 DEBUG LocalActor     : [actor] handled message (4.569476 ms) StatusUpdate(0,FAILED,java.nio.HeapByteBuffer[pos=2723 lim=2723 cap=2723]) from Actor[akka://sparkDriver/deadLetters]
06 Jan 2015 17:01:28,645 DEBUG LocalActor     : [actor] received message StatusUpdate(1,RUNNING,java.nio.HeapByteBuffer[pos=0 lim=0 cap=0]) from Actor[akka://sparkDriver/deadLetters]
06 Jan 2015 17:01:28,647 WARN  TaskSetManager : Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.IllegalArgumentException: object is not an instance of declaring class
        sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:601)
        org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2.apply(JavaSQLContext.scala:100)
        org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2.apply(JavaSQLContext.scala:100)
        scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
        org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1.apply(JavaSQLContext.scala:100)
        org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1.apply(JavaSQLContext.scala:99)
        scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:389)
        scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1165)
        org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
        org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        org.apache.spark.scheduler.Task.run(Task.scala:54)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:722)
06 Jan 2015 17:01:28,649 ERROR TaskSetManager : Task 0 in stage 0.0 failed 1 times; aborting job
06 Jan 2015 17:01:28,650 DEBUG BlockManager   : Getting local block broadcast_0
06 Jan 2015 17:01:28,650 DEBUG BlockManager   : Level for block broadcast_0 is StorageLevel(true, true, false, true, 1)
06 Jan 2015 17:01:28,650 DEBUG BlockManager   : Getting block broadcast_0 from memory
06 Jan 2015 17:01:28,650 DEBUG LocalActor     : [actor] handled message (5.114644 ms) StatusUpdate(1,RUNNING,java.nio.HeapByteBuffer[pos=0 lim=0 cap=0]) from Actor[akka://sparkDriver/deadLetters]
06 Jan 2015 17:01:28,650 DEBUG Executor       : Task 1's epoch is 0
06 Jan 2015 17:01:28,655 INFO  TaskSchedulerImpl: Cancelling stage 0
06 Jan 2015 17:01:28,658 DEBUG LocalActor     : [actor] received message KillTask(1,false) from Actor[akka://sparkDriver/deadLetters]
06 Jan 2015 17:01:28,659 INFO  Executor       : Executor is trying to kill task 1.0 in stage 0.0 (TID 1)
06 Jan 2015 17:01:28,659 INFO  TaskSchedulerImpl: Stage 0 was cancelled
06 Jan 2015 17:01:28,659 DEBUG LocalActor     : [actor] handled message (0.860446 ms) KillTask(1,false) from Actor[akka://sparkDriver/deadLetters]
06 Jan 2015 17:01:28,661 INFO  DAGScheduler   : Failed to run count at JavaSchemaRDD.scala:42
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.IllegalArgumentException: object is not an instance of declaring class
        sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:601)
        org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2.apply(JavaSQLContext.scala:100)
        org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2.apply(JavaSQLContext.scala:100)
        scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
        org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1.apply(JavaSQLContext.scala:100)
        org.apache.spark.sql.api.java.JavaSQLContext$$anonfun$1$$anonfun$apply$1.apply(JavaSQLContext.scala:99)
        scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:389)
        scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1165)
        org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
        org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        org.apache.spark.scheduler.Task.run(Task.scala:54)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:722)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
06 Jan 2015 17:01:28,667 DEBUG DAGScheduler   : Removing running stage 0

Here is my code:

    SparkConf conf = new SparkConf();
    conf.setAppName("Spark Cassandra demo");
    conf.setMaster("local");
    conf.set("spark.cassandra.connection.host", "127.0.0.1");
    JavaSparkContext sc = new JavaSparkContext(conf);

    SparkContextJavaFunctions javaFunctions = CassandraJavaUtil.javaFunctions(sc);
    logger.debug("javaFunctions=[" + javaFunctions + "]");
    CassandraJavaRDD<CassandraRow> mCassandraRDD = javaFunctions.cassandraTable("my_keyspace", "m_table");
    logger.debug("mCassandraRDD=[" + mCassandraRDD + "]");

    // Map each CassandraRow to an MObject bean.
    mCassandraRDD.map(new Function<CassandraRow, MObject>() {

        @Override
        public MObject call(CassandraRow row) throws Exception {
            MObject mObject = new MObject();
            mObject.setId(row.getString("m_id"));
            mObject.setName(row.getString("m_name"));
            return mObject;
        }

    });

    // Apply the MObject schema and register the result as a temp table.
    JavaSQLContext sqlCtx = new JavaSQLContext(sc);
    JavaSchemaRDD schemaMObject = sqlCtx.applySchema(mCassandraRDD, MObject.class);
    schemaMObject.registerTempTable("MOBJECT_SPARK");

    JavaSchemaRDD johnRDD = sqlCtx.sql("SELECT * FROM MOBJECT_SPARK WHERE name='john'");
    System.out.println("Count=[" + johnRDD.count() + "]");
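
One thing I notice in the snippet above: the result of the .map(...) call is never assigned, so applySchema is invoked on the raw CassandraJavaRDD<CassandraRow> while being told the rows are MObject beans; the reflective getter calls would then target the wrong class, which seems consistent with the "object is not an instance of declaring class" message. A minimal sketch of the wiring I would expect instead (untested, and assuming MObject is a serializable JavaBean with getId/getName getters):

    // Keep the mapped RDD instead of discarding it, so that applySchema
    // reflects over actual MObject instances rather than CassandraRows.
    JavaRDD<MObject> mObjectRDD = mCassandraRDD.map(new Function<CassandraRow, MObject>() {
        @Override
        public MObject call(CassandraRow row) throws Exception {
            MObject mObject = new MObject();
            mObject.setId(row.getString("m_id"));
            mObject.setName(row.getString("m_name"));
            return mObject;
        }
    });

    JavaSQLContext sqlCtx = new JavaSQLContext(sc);
    // Apply the schema to the RDD that actually contains MObject instances.
    JavaSchemaRDD schemaMObject = sqlCtx.applySchema(mObjectRDD, MObject.class);
    schemaMObject.registerTempTable("MOBJECT_SPARK");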

Still, I am not sure exactly what is wrong with this code.

Any input is appreciated.

Thanks

0 Answers:

There are no answers.