WARN FileOutputCommitter: Output Path is null in setupJob()

Asked: 2017-06-20 14:15:20

Tags: apache-spark spark-streaming

When I use Spark Streaming to write data to HBase with saveAsHadoopDataset(jobConf), I get a warning in the log. Can someone help me figure out what I'm missing? Here is my code:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.dstream.DStream
    import org.apache.hadoop.mapred.JobConf
    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.mapred.TableOutputFormat
    import org.apache.hadoop.hbase.client.Put
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.util.Bytes

    def main(args: Array[String]): Unit = {
      val conf = new SparkConf().setAppName(args(0)).setMaster("local[4]")
      val ssc = new StreamingContext(conf, Seconds(1))

      val wordcountDStream: DStream[(String, Int)] =
        ssc.socketTextStream(args(1), args(2).toInt)
          .flatMap(_.split(" "))
          .map((_, 1))
          .reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(10), Seconds(3))

      val jobConf = new JobConf(HBaseConfiguration.create())
      jobConf.set("hbase.zookeeper.quorum", args(3))
      jobConf.set("zookeeper.znode.parent", "/hbase")
      jobConf.setOutputFormat(classOf[TableOutputFormat])
      jobConf.set(TableOutputFormat.OUTPUT_TABLE, "SparkWordCount")

      wordcountDStream.repartition(1).foreachRDD { rdd =>
        rdd.sortBy(_._2, ascending = false)
          .zipWithUniqueId()
          .filter(_._2 < 5) // keep the top five words
          .map { case ((word, count), rank) =>
            val put = new Put(Bytes.toBytes((rank + 1).toString))
            put.addColumn(Bytes.toBytes("result"), Bytes.toBytes("word"), Bytes.toBytes(word))
            put.addColumn(Bytes.toBytes("result"), Bytes.toBytes("count"), Bytes.toBytes(count.toString))
            (new ImmutableBytesWritable, put)
          }
          .saveAsHadoopDataset(jobConf)
      }

      ssc.start()
      ssc.awaitTermination()
    }

And my console log:

17/06/20 22:06:15 WARN FileOutputCommitter: Output Path is null in setupJob()
17/06/20 22:06:24 WARN FileOutputCommitter: Output Path is null in commitJob()

Much appreciated!

1 Answer:

Answer 0 (score: 0)

Has your problem been solved?

I also got this WARN message, and my process got stuck in saveAsHadoopDataset().
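For what it's worth, the warning itself is usually harmless: the old mapred-API org.apache.hadoop.hbase.mapred.TableOutputFormat extends FileOutputFormat, so Spark's commit path falls back to FileOutputCommitter, which complains that no file output path is set; HBase writes don't use one. One way to avoid the warning (a sketch only; it is an assumption that this also addresses any hang, which may have a separate cause) is to switch to the new-API org.apache.hadoop.hbase.mapreduce.TableOutputFormat, whose committer is a no-op, and write with saveAsNewAPIHadoopDataset:

    // Sketch using the new mapreduce API. The table name and quorum come
    // from the question; everything else is an illustrative assumption.
    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
    import org.apache.hadoop.mapreduce.Job

    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set("hbase.zookeeper.quorum", args(3))
    hbaseConf.set(TableOutputFormat.OUTPUT_TABLE, "SparkWordCount")

    val job = Job.getInstance(hbaseConf)
    job.setOutputFormatClass(classOf[TableOutputFormat[ImmutableBytesWritable]])

    // Inside foreachRDD, instead of saveAsHadoopDataset(jobConf), write the
    // (ImmutableBytesWritable, Put) RDD with:
    //   putRdd.saveAsNewAPIHadoopDataset(job.getConfiguration)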