I am trying to write a Spark DataFrame to HBase, but whenever I perform any action on the DataFrame, or call a write/save method on it, I get the following exception:
java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:50)
at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.log(HBaseFilter.scala:121)
at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.buildFilters(HBaseFilter.scala:124)
at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD.getPartitions(HBaseTableScan.scala:60)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
----------
Here is my code:
import org.apache.spark.sql.{SQLContext, _}
import org.apache.spark.sql.execution.datasources.hbase._
import org.apache.spark.{SparkConf, SparkContext}
def catalog = s"""{
    |"table":{"namespace":"default", "name":"Contacts"},
    |"rowkey":"key",
    |"columns":{
    |"rowkey":{"cf":"rowkey", "col":"key", "type":"string"},
    |"officeAddress":{"cf":"Office", "col":"Address", "type":"string"},
    |"officePhone":{"cf":"Office", "col":"Phone", "type":"string"},
    |"personalName":{"cf":"Personal", "col":"Name", "type":"string"},
    |"personalPhone":{"cf":"Personal", "col":"Phone", "type":"string"}
    |}
    |}""".stripMargin

def withCatalog(cat: String): DataFrame = {
  spark.sqlContext
    .read
    .options(Map(HBaseTableCatalog.tableCatalog -> cat))
    .format("org.apache.spark.sql.execution.datasources.hbase")
    .load()
}
val df = withCatalog(catalog)
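
For reference, the write/save call mentioned above is the standard shc write path, roughly like this sketch; HBaseTableCatalog.newTable is the connector's option for the number of regions to pre-split when the target table does not yet exist, and the value "5" here is only an example:

// A sketch of the write path that triggers the same AbstractMethodError.
df.write
  .options(Map(HBaseTableCatalog.tableCatalog -> catalog,
               HBaseTableCatalog.newTable -> "5"))
  .format("org.apache.spark.sql.execution.datasources.hbase")
  .save()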
I was able to create the DataFrame, but as soon as I perform
df.show()
it gives me the error:
java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:50)
at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.log(HBaseFilter.scala:121)
at org.apache.spark.sql.execution.datasources.hbase.HBaseFilter$.buildFilters(HBaseFilter.scala:124)
at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD.getPartitions(HBaseTableScan.scala:60)
Please suggest what I can try. I import the table from HBase, create the catalog, and build the DataFrame on top of it, using:
Spark 1.6
HBase 1.2.0-cdh5.13.3 (Cloudera)
Answer 0 (score: 0)
I ran into the same problem; I was using hbase-spark 1.2.0-cdh5.8.4.
I tried compiling against version 1.2.0-cdh5.13.0, and after that the error no longer occurred. You should try recompiling the source against a matching version, or use a more recent release.
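
As a minimal sketch of that fix, assuming an sbt build and the Cloudera artifact repository (adjust the versions to whatever your CDH cluster actually ships), aligning the connector with the Spark version would look something like:

// build.sbt -- a sketch; the exact versions must match your cluster.
resolvers += "cloudera" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

libraryDependencies ++= Seq(
  // Spark shipped with CDH 5.13.3
  "org.apache.spark" %% "spark-core" % "1.6.0-cdh5.13.3" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.6.0-cdh5.13.3" % "provided",
  // hbase-spark connector compiled against the same CDH line
  "org.apache.hbase" % "hbase-spark" % "1.2.0-cdh5.13.0"
)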