HBase class not found exception with Spark (Java)

Date: 2017-03-20 17:37:23

Tags: java apache-spark hbase

I am trying to communicate with HBase from Spark, using the code below:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.spark.JavaHBaseContext;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

SparkConf sparkConf = new SparkConf().setAppName("HBaseRead");
JavaSparkContext jsc = new JavaSparkContext(sparkConf);
Configuration conf = HBaseConfiguration.create();
conf.addResource(new Path("/etc/hbase/conf/core-site.xml"));
conf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"));
JavaHBaseContext hbaseContext = new JavaHBaseContext(jsc, conf);

Scan scan = new Scan();
scan.setCaching(100); // fetch 100 rows per RPC round trip

// read the "climate" table as an RDD of (row key, row) pairs
JavaRDD<Tuple2<ImmutableBytesWritable, Result>> hbaseRdd = hbaseContext.hbaseRDD(TableName.valueOf("climate"), scan);

System.out.println("Number of Records found : " + hbaseRdd.count());

When I execute this, I get the following error:

Exception in thread "dag-scheduler-event-loop" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/regionserver/StoreFileWriter
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
    at java.lang.Class.getDeclaredMethod(Class.java:2128)
    at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475)
    at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
    at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
    ...

I have not found any solution via Google. Does anyone have an idea?

-------- EDIT --------

I am using Maven. My pom looks like this:

<dependencies>   
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-server</artifactId>
        <version>1.3.0</version>
    </dependency>        

    <dependency>
        <groupId>org.sharegov</groupId>
        <artifactId>mjson</artifactId>
        <version>1.4.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.5.2</version>
    </dependency> 

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.10</artifactId>
        <version>1.5.2</version>
    </dependency>

    <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>spark-csv_2.10</artifactId>
        <version>1.5.0</version>
    </dependency>

    <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>spark-xml_2.10</artifactId>
        <version>0.3.5</version>
    </dependency>        

    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-spark</artifactId>
        <version>2.0.0-SNAPSHOT</version>                               
    </dependency>

</dependencies>

I build the application together with its dependencies using the maven-assembly-plugin, as sketched below.
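For reference, a minimal sketch of such an assembly configuration; the descriptor and phase binding here are common defaults, not taken from the original pom:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-assembly-plugin</artifactId>
    <configuration>
        <descriptorRefs>
            <!-- packages the application and all its dependencies into one jar -->
            <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
    </configuration>
    <executions>
        <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
                <goal>single</goal>
            </goals>
        </execution>
    </executions>
</plugin>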

1 Answer:

Answer 0 (score: 0)

You are getting a NoClassDefFoundError because Spark cannot find the HBase jars on its classpath. You need to provide the required jars explicitly via the --jars parameter of spark-submit when launching the job.
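A sketch of such a launch command; the jar names, paths, class name, and master are illustrative assumptions, not taken from the original post:

# point --jars at the HBase jars installed on your cluster (comma-separated)
spark-submit --class com.example.HBaseRead \
    --master yarn \
    --jars /path/to/hbase-client.jar,/path/to/hbase-common.jar,/path/to/hbase-server.jar,/path/to/hbase-spark.jar \
    myapp-jar-with-dependencies.jar

The jars listed after --jars are shipped to the executors and prepended to their classpath, which is what the driver-side NoClassDefFoundError above is complaining about.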