Spark HBase RDD throws an exception

Date: 2014-12-17 15:56:51

Tags: hadoop hbase apache-spark

I am trying to read from HBase using the following code:

JavaPairRDD<ImmutableBytesWritable, Result> pairRdd = ctx
        .newAPIHadoopRDD(conf, TableInputFormat.class,
                ImmutableBytesWritable.class,
                org.apache.hadoop.hbase.client.Result.class).cache();

System.out.println(pairRdd.count());

but I get the exception java.lang.IllegalStateException: unread block data.

Please find the code below:

SparkConf sparkConf = new SparkConf().setAppName("JavaSparkSQL");
sparkConf.set("spark.master", "spark://192.168.50.247:7077");

/* String[] stjars = {"/home/BreakDown/SparkDemo2/target/SparkDemo2-0.0.1-SNAPSHOT.jar"};
sparkConf.setJars(stjars); */
JavaSparkContext ctx = new JavaSparkContext(sparkConf);
JavaSQLContext sqlCtx = new JavaSQLContext(ctx);

Configuration conf = HBaseConfiguration.create();
conf.set("hbase.master", "192.168.50.73:60000");
conf.set("hbase.zookeeper.quorum", "192.168.50.73");
conf.set("hbase.zookeeper.property.clientPort", "2181");
conf.set("zookeeper.session.timeout", "6000");
conf.set("zookeeper.recovery.retry", "1");

conf.set("hbase.mapreduce.inputtable", "employee11");

Any pointers would be of great help.

Spark 1.1.1 (Hadoop 2 build), Hadoop 2.2.0, HBase 0.98.8-hadoop2

Please find below the stack trace:

14/12/17 21:18:45 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/12/17 21:18:46 INFO AppClient$ClientActor: Connecting to master spark://192.168.50.247:7077...
14/12/17 21:18:46 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
14/12/17 21:18:46 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20141217211846-0035
14/12/17 21:18:47 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 192.168.50.253, ANY, 1256 bytes)
14/12/17 21:18:47 INFO BlockManagerMasterActor: Registering block manager 192.168.50.253:41717 with 256.4 MB RAM, BlockManagerId(0, 192.168.50.253, 41717, 0)
14/12/17 21:18:48 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 192.168.50.253): java.lang.IllegalStateException: unread block data
        java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2420)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1380)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
        java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
        org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:160)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:724)
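
One hedged observation: an "unread block data" IllegalStateException during task deserialization is commonly reported when the executors are missing, or have a different version of, the classes the driver serialized (for example the HBase client jars), which is exactly what the commented-out setJars call would address. A minimal sketch, assuming hypothetical jar locations:

// Ship the application jar (and, if needed, the HBase client jars) to the
// executors so task deserialization sees the same classes as the driver.
// The paths below are hypothetical placeholders; adjust to the real files.
String[] jars = {
        "/home/BreakDown/SparkDemo2/target/SparkDemo2-0.0.1-SNAPSHOT.jar"
};
sparkConf.setJars(jars);

// Alternative: put a local HBase lib directory on every executor's
// classpath (assumed install location; adjust as needed).
sparkConf.set("spark.executor.extraClassPath", "/opt/hbase/lib/*");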

0 Answers:

No answers