The code below works against a local HBase, but I hit the exception shown further down when testing against EMR with data on S3.
// Imports used by this snippet
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Spark conf
SparkConf sparkConf = new SparkConf().setMaster("local[4]").setAppName("My App");
JavaSparkContext jsc = new JavaSparkContext(sparkConf);

// HBase conf (note: the property is "clientPort", not "client.port")
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "localhost");
conf.set("hbase.zookeeper.property.clientPort", "2181");

// Submit scan into the HBase conf
// conf.set(TableInputFormat.SCAN, TableMapReduceUtil.convertScanToString(scan));
conf.set(TableInputFormat.INPUT_TABLE, "mytable");
conf.set(TableInputFormat.SCAN_ROW_START, "startrow");
conf.set(TableInputFormat.SCAN_ROW_STOP, "endrow");

// Get an RDD backed by TableInputFormat
JavaPairRDD<ImmutableBytesWritable, Result> source = jsc
    .newAPIHadoopRDD(conf, TableInputFormat.class,
        ImmutableBytesWritable.class, Result.class);

// Process the RDD
System.out.println("&&&&&&&&&&&&&&&&&&&&&&& " + source.count());
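For reference, the commented-out line above can be used instead of the individual `SCAN_ROW_START`/`SCAN_ROW_STOP` keys: serialize a full `Scan` into the configuration and let `TableInputFormat` pick it up when computing splits. A minimal sketch, assuming `hbase-client` and the HBase MapReduce module are on the classpath; the table name, row keys, and the `ScanConfSketch` class name are placeholders:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanConfSketch {
    public static Configuration buildConf() throws IOException {
        Configuration conf = HBaseConfiguration.create();
        conf.set(TableInputFormat.INPUT_TABLE, "mytable"); // placeholder table

        // Build the Scan once, then serialize it into the job configuration;
        // TableInputFormat deserializes it in getSplits()/createRecordReader().
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes("startrow")); // placeholder row keys
        scan.setStopRow(Bytes.toBytes("endrow"));
        scan.setCaching(500);       // rows fetched per RPC round trip
        scan.setCacheBlocks(false); // usually disabled for full-table batch scans
        conf.set(TableInputFormat.SCAN,
                 TableMapReduceUtil.convertScanToString(scan));
        return conf;
    }
}
```

The resulting `Configuration` can then be passed to `newAPIHadoopRDD` exactly as in the snippet above; the advantage is that caching, column selection, and filters travel with the serialized `Scan` instead of needing one `conf.set` per knob.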
18/05/04 00:22:02 INFO MetricRegistries: Loaded MetricRegistries class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
18/05/04 00:22:02 ERROR TableInputFormat: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
Caused by: java.lang.IllegalAccessError: tried to access class org.apache.hadoop.metrics2.lib.MetricsInfoImpl from class org.apache.hadoop.metrics2.lib.DynamicMetricsRegistry
    at org.apache.hadoop.metrics2.lib.DynamicMetricsRegistry.newGauge(DynamicMetricsRegistry.java:139)
    at org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSourceImpl.<init>(MetricsZooKeeperSourceImpl.java:59)
    at org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSourceImpl.<init>(MetricsZooKeeperSourceImpl.java:51)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
    ... 42 more
Exception in thread "main" java.io.IOException: Cannot create a record reader because of a previous error. Please look at the previous logged lines in the task's full log for more details.
    at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:270)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormat.getSplits(TableInputFormat.java:256)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:125)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2094)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1158)
    at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:455)
    at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:45)
    at HbaseScan.main(HbaseScan.java:60)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalStateException: The input format instance has not been properly initialized. Ensure you call initializeTable either in your constructor or initialize method
    at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getTable(TableInputFormatBase.java:652)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:265)
    ... 20 more

WITH ALL APACHE HBASE LIBS:

18/05/04 04:05:54 ERROR TableInputFormat: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormat.initialize(TableInputFormat.java:202)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:259)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormat.getSplits(TableInputFormat.java:256)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:125)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2094)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1158)
    at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:455)
    at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:45)
    at HbaseScan.main(HbaseScan.java:60)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
    ... 24 more
Caused by: java.lang.RuntimeException: Could not create interface org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSource Is the hadoop compatibility jar on the classpath?
    at org.apache.hadoop.hbase.CompatibilitySingletonFactory.getInstance(CompatibilitySingletonFactory.java:75)
    at org.apache.hadoop.hbase.zookeeper.MetricsZooKeeper.<init>(MetricsZooKeeper.java:38)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.<init>(RecoverableZooKeeper.java:130)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:143)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:181)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:155)
    at org.apache.hadoop.hbase.client.ZooKeeperKeepAliveConnection.<init>(ZooKeeperKeepAliveConnection.java:43)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveZooKeeperWatcher(ConnectionManager.java:1737)