Spark HBase connector - exception "java.lang.UnsupportedOperationException: empty.tail"

Asked: 2017-03-01 17:12:42

Tags: apache-spark hbase apache-spark-sql

We are on HDP 2.4.2, with Spark 1.6 compiled against Scala 2.10.5. The HBase version is 1.1.2.2.4.2.0-258.

The environment is a basic development cluster (< 10 nodes) running both HBase and Spark in cluster mode.

Trying to import some data from HBase into a Spark DataFrame with the Spark HBase connector fails with the following error:

Exception in thread "main" java.lang.UnsupportedOperationException: empty.tail
    at scala.collection.TraversableLike$class.tail(TraversableLike.scala:445)
    at scala.collection.mutable.ArraySeq.scala$collection$IndexedSeqOptimized$super$tail(ArraySeq.scala:45)
    at scala.collection.IndexedSeqOptimized$class.tail(IndexedSeqOptimized.scala:123)
    at scala.collection.mutable.ArraySeq.tail(ArraySeq.scala:45)
    at org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog.initRowKey(HBaseTableCatalog.scala:150)
    at org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog.<init>(HBaseTableCatalog.scala:164)
    at org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog$.apply(HBaseTableCatalog.scala:239)
    at hbaseReaderHDPCon$.main(hbaseReaderHDPCon.scala:42)
    at hbaseReaderHDPCon.main(hbaseReaderHDPCon.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

This is what is happening at line 42 of my code:

val cat =
      s"""{
          |"table":{"namespace":"myTable", "name":"person", "tableCoder":"PrimitiveType"},
          |"rowkey":"ROW",
          |"columns":{
            |"col0":{"cf":"person", "col":"detail", "type":"string"}
          |}
          |}""".stripMargin
    val scon = new SparkConf()
    val sparkContext = new SparkContext(scon)
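For context, the exception itself is just what Scala collections throw when `.tail` is called on an empty sequence; `HBaseTableCatalog.initRowKey` builds the key schema from the catalog columns mapped to the row key, and when none are declared that sequence is empty. A minimal plain-Scala sketch of the failure mode (illustration only, not the connector's code):

```scala
object EmptyTailDemo extends App {
  // .tail on an empty collection throws UnsupportedOperationException --
  // the same call HBaseTableCatalog.initRowKey makes on the (empty) list
  // of row-key columns when the catalog maps no column to the row key.
  val thrown =
    try { Seq.empty[Int].tail; false }
    catch { case _: UnsupportedOperationException => true }
  println(s"tail on empty seq threw: $thrown")
}
```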

1 Answer:

Answer 0 (score: 5):

Given your code, I think the "columns" field in your catalog is missing the row key. Below is an example that works for me. I'm using Spark 2.0 (SparkSession), but it should also work with Spark 1.6:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

    val catalog =
        s"""{
            |"table":{"namespace":"default", "name":"person"},
            |"rowkey":"id",
            |"columns":{
            |"id":{"cf":"rowkey", "col":"id", "type":"string"},
            |"name":{"cf":"info", "col":"name", "type":"string"},
            |"age":{"cf":"info", "col":"age", "type":"string"}
            |}
            |}""".stripMargin

    val spark = SparkSession
        .builder()
        .appName("HbaseWriteTest")
        .getOrCreate()

    val df = spark
        .read
        .options(
            Map(
                HBaseTableCatalog.tableCatalog -> catalog
            )
        )
        .format("org.apache.spark.sql.execution.datasources.hbase")
        .load()
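The decisive difference from the question's catalog is that one column ("id") is mapped to the reserved column family "rowkey". A standalone sketch of the check the connector effectively performs (a simplification for illustration, not the connector's actual code):

```scala
object CatalogRowKeyCheck extends App {
  // (name, cf) pairs mirroring the "columns" section of each catalog
  val fixedCols  = Seq("id" -> "rowkey", "name" -> "info", "age" -> "info")
  val brokenCols = Seq("col0" -> "person")

  // The connector treats the family name "rowkey" as reserved for key columns
  def rowKeyCols(cols: Seq[(String, String)]) =
    cols.collect { case (name, "rowkey") => name }

  println(s"fixed catalog key columns:  ${rowKeyCols(fixedCols)}")
  println(s"broken catalog key columns: ${rowKeyCols(brokenCols)}")
}
```

With the broken catalog the key-column list is empty, and that empty list is what the connector's `initRowKey` ends up calling `.tail` on.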