Error reading an HBase 1.2.5 table in Spark 2.4

Time: 2019-12-08 12:20:25

Tags: scala apache-spark hbase

Spark version: 2.4.0, HBase version: 1.2.5, Scala version: 2.11.8

Everything is set up in a local VM.

Packages imported:

spark-shell --packages com.hortonworks:shc-core:1.1.1-2.1-s_2.11,org.apache.hadoop:hadoop-common:2.7.3,org.apache.hbase:hbase-common:1.2.5,org.apache.hbase:hbase-client:1.2.5,org.apache.hbase:hbase-protocol:1.2.5,org.apache.hbase:hbase-hadoop2-compat:1.2.5,org.apache.hbase:hbase-server:1.2.5 --repositories http://repo.hortonworks.com/content/groups/public/

In the HBase shell:

creating table:
 create  'cardata','software','hardware','other'

inserting data to table:

put 'cardata','v001_H','hardware:alloy_wheels','yes'
put 'cardata','v001_H','hardware:anti_Lock_break','yes'
put 'cardata','v001_H','software:electronic_breakforce_distribution','yes'
put 'cardata','v001_H','software:terrain_mode','yes'
put 'cardata','v001_H','software:traction_control','yes'
put 'cardata','v001_H','software:stability_control','yes'
put 'cardata','v001_H','software:cruize_control','yes'
put 'cardata','v001_H','other:make','hyundai'
put 'cardata','v001_H','other:model','i10'
put 'cardata','v001_H','other:variant','sportz'

Code:

import org.apache.spark.sql.execution.datasources.hbase._
import spark.implicits._


def carCatalog = s"""{
"table":{"namespace":"default", "name":"cardata"},
"rowkey":"key",
"columns":{
"alloy_wheels":{"cf":"hardware", "col":"alloy_wheels", "type":"string"},
"anti_Lock_break":{"cf":"hardware", "col":"anti_Lock_break", "type":"string"},
"electronic_breakforce_distribution":{"cf":"software", "col":"electronic_breakforce_distribution", "type":"string"},
"terrain_mode":{"cf":"software", "col":"terrain_mode", "type":"string"},
"traction_control":{"cf":"software", "col":"traction_control", "type":"string"}
}
}""".stripMargin

val hbaseDF=spark.read.options(Map(HBaseTableCatalog.tableCatalog->carCatalog)).format("org.apache.spark.sql.execution.datasources.hbase").load()

Error:

java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods$.parse(Lorg/json4s/JsonInput;Z)Lorg/json4s/JsonAST$JValue;
  at org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog$.apply(HBaseTableCatalog.scala:257)
  at org.apache.spark.sql.execution.datasources.hbase.HBaseRelation.<init>(HBaseRelation.scala:80)
  at org.apache.spark.sql.execution.datasources.hbase.DefaultSource.createRelation(HBaseRelation.scala:51)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
  ... 53 elided

1 Answer:

Answer 0 (score: 0)

A NoSuchMethodError usually means that the version of a library found at runtime is not the version the code was compiled against.

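Your platform may add its own default libraries to the classpath on your behalf (for example, when you launch through spark-submit), and these can conflict with the library versions you used during development.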
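One way to confirm such a conflict is to ask the JVM where a class was actually loaded from and which overloads of a method it exposes. The following is a minimal sketch using only standard Java reflection; the helper names `loadedFrom` and `overloads` are hypothetical and were not part of the original question or answer.

```scala
// Returns the location (usually a jar path) a class was loaded from, if the JVM reports one.
// Classes loaded by the bootstrap class loader have no code source, so the result is None.
def loadedFrom(className: String): Option[String] =
  Option(Class.forName(className).getProtectionDomain.getCodeSource)
    .flatMap(cs => Option(cs.getLocation))
    .map(_.toString)

// Lists every public overload of a method, for comparison with the
// signature quoted in the NoSuchMethodError message.
def overloads(className: String, method: String): Seq[String] =
  Class.forName(className).getMethods
    .filter(_.getName == method)
    .map(_.toString)
    .toSeq
```

In spark-shell, `loadedFrom("org.json4s.jackson.JsonMethods$")` should show which json4s jar is on the classpath, and `overloads("org.json4s.jackson.JsonMethods$", "parse").foreach(println)` lists the `parse` signatures that jar provides. If none of them takes exactly `(JsonInput, boolean)`, the runtime json4s differs from the one the connector was compiled against.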

A common solution is called shading, which you can define when building the assembly jar (for example, in build.sbt).

You can refer to the following link: https://github.com/sbt/sbt-assembly#shading
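As a rough illustration of what such a shading rule could look like for this error, here is a build.sbt fragment. It is a sketch under assumptions, not a verified fix: it assumes the sbt-assembly plugin is already enabled, that the application is packaged as an assembly jar and run with spark-submit instead of pulling packages into spark-shell, and that the jar bundles a json4s version that still provides the `parse(JsonInput, Boolean)` method named in the stack trace. The target package name `shaded.json4s` is arbitrary.

```scala
// build.sbt fragment; assumes sbt-assembly is already added in project/plugins.sbt.
// Renames json4s classes inside the assembly jar, including the references to them
// in bundled dependencies, so they no longer collide with the json4s version
// that is already on Spark's own classpath.
assemblyShadeRules in assembly := Seq(
  // "shaded.json4s" is a hypothetical target package chosen for illustration.
  ShadeRule.rename("org.json4s.**" -> "shaded.json4s.@1").inAll
)
```

Newer sbt versions write the same key with slash syntax, `assembly / assemblyShadeRules`. Shading only helps if the expected json4s classes are actually inside the jar, so it is worth inspecting the assembled jar's contents before submitting it.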