HBase-Spark connector issue in Cloudera: java.lang.AbstractMethodError

Date: 2019-02-21 10:57:01

Tags: apache-spark-sql

I am trying to write a Spark DataFrame to HBase, but when I perform any action or a write/save method on that DataFrame, I get the following exception:

    java.lang.AbstractMethodError
        at org.apache.spark.Logging$class.log(Logging.scala:50)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.log(HBaseFilter.scala:121)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.buildFilters(HBaseFilter.scala:124)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD.getPartitions(HBaseTableScan.scala:60)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)

Here is my code:

    import org.apache.spark.sql.{SQLContext, _}
    import org.apache.spark.sql.execution.datasources.hbase._
    import org.apache.spark.{SparkConf, SparkContext}

    def catalog = s"""{
         |"table":{"namespace":"default", "name":"Contacts"},
         |"rowkey":"key",
         |"columns":{
         |"rowkey":{"cf":"rowkey", "col":"key", "type":"string"},
         |"officeAddress":{"cf":"Office", "col":"Address", "type":"string"},
         |"officePhone":{"cf":"Office", "col":"Phone", "type":"string"},
         |"personalName":{"cf":"Personal", "col":"Name", "type":"string"},
         |"personalPhone":{"cf":"Personal", "col":"Phone", "type":"string"}
         |}
         |}""".stripMargin

    def withCatalog(cat: String): DataFrame = {
      spark.sqlContext
        .read
        .options(Map(HBaseTableCatalog.tableCatalog -> cat))
        .format("org.apache.spark.sql.execution.datasources.hbase")
        .load()
    }

    val df = withCatalog(catalog)
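
For reference, the write/save call I am attempting looks roughly like this (a sketch in the SHC style; the HBaseTableCatalog.newTable option and the region count "5" are illustrative assumptions, not part of my failing code):

    // Write the DataFrame back to HBase using the same catalog;
    // newTable asks the connector to create the table with 5 regions
    // if it does not exist yet.
    df.write
      .options(Map(
        HBaseTableCatalog.tableCatalog -> catalog,
        HBaseTableCatalog.newTable -> "5"))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .save()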

I was able to create the DataFrame, but when I perform

    df.show()

it gives me the error:

    java.lang.AbstractMethodError
        at org.apache.spark.Logging$class.log(Logging.scala:50)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.log(HBaseFilter.scala:121)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.buildFilters(HBaseFilter.scala:124)
        at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD.getPartitions(HBaseTableScan.scala:60)

Please suggest something. I created the catalog for the table imported from HBase and built the DataFrame on top of it, on the following stack:

- Spark 1.6
- HBase 1.2.0-cdh5.13.3
- Cloudera

1 answer:

Answer 0 (score: 0)

I ran into the same problem while using hbase-spark 1.2.0-cdh5.8.4.

I tried compiling against version 1.2.0-cdh5.13.0, and after that the error no longer occurred. You should try recompiling the source code against that version, or use a higher one.
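
A minimal build sketch of that version bump (the repository URL and artifact coordinates are my assumptions based on the CDH releases named above; adjust them to whatever your cluster ships):

    // build.sbt (sketch): pin the connector to the newer CDH release
    resolvers += "cloudera" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-sql"   % "1.6.0" % "provided",
      "org.apache.hbase"  % "hbase-spark" % "1.2.0-cdh5.13.0"
    )

After rebuilding, make sure the connector jar on the driver and executors matches the cluster's Spark version: an AbstractMethodError like the one above typically means the connector was compiled against a different Spark API (here the old org.apache.spark.Logging trait) than the one it runs against.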