When I use Spark Streaming to write data into HBase with saveAsHadoopDataset(jobConf), I get a warning log. Can someone help me figure out what I am missing?
Here is my code:
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapred.JobConf
import org.apache.spark.SparkConf
import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.{Seconds, StreamingContext}

def main(args: Array[String]): Unit = {
  val conf = new SparkConf().setAppName(args(0)).setMaster("local[4]")
  val ssc = new StreamingContext(conf, Seconds(1))

  // Word counts over a 10-second window, sliding every 3 seconds.
  val wordcountDStream: DStream[(String, Int)] =
    ssc.socketTextStream(args(1), args(2).toInt)
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(10), Seconds(3))

  // Configure the HBase output format (old mapred API).
  val jobConf = new JobConf(HBaseConfiguration.create())
  jobConf.set("hbase.zookeeper.quorum", args(3))
  jobConf.set("zookeeper.znode.parent", "/hbase")
  jobConf.setOutputFormat(classOf[TableOutputFormat])
  jobConf.set(TableOutputFormat.OUTPUT_TABLE, "SparkWordCount")

  // Write the top 5 words of each batch into HBase, row key = rank.
  wordcountDStream.repartition(1).foreachRDD { rdd =>
    rdd.sortBy(_._2, ascending = false)
      .zipWithUniqueId()
      .filter(_._2 < 5)
      .map { case ((word, count), rank) =>
        val put = new Put(Bytes.toBytes((rank + 1).toString))
        put.addColumn(Bytes.toBytes("result"), Bytes.toBytes("word"), Bytes.toBytes(word))
        put.addColumn(Bytes.toBytes("result"), Bytes.toBytes("count"), Bytes.toBytes(count.toString))
        (new ImmutableBytesWritable, put)
      }
      .saveAsHadoopDataset(jobConf)
  }

  ssc.start()
  ssc.awaitTermination()
}
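As an aside, the write above goes through the old org.apache.hadoop.mapred API, whose default FileOutputCommitter appears to be what emits the "Output Path is null" warnings. For comparison only, here is a minimal sketch (not part of the original question) of writing the same (ImmutableBytesWritable, Put) pairs through the newer mapreduce API; the table name and quorum are taken from the code above, and putsRdd is a placeholder for the pair RDD built inside foreachRDD:

  // Sketch only: equivalent write via the newer mapreduce API.
  import org.apache.hadoop.hbase.mapreduce.{TableOutputFormat => NewTableOutputFormat}
  import org.apache.hadoop.mapreduce.Job

  val hbaseConf = HBaseConfiguration.create()
  hbaseConf.set("hbase.zookeeper.quorum", args(3))
  hbaseConf.set(NewTableOutputFormat.OUTPUT_TABLE, "SparkWordCount")

  val job = Job.getInstance(hbaseConf)
  job.setOutputFormatClass(classOf[NewTableOutputFormat[ImmutableBytesWritable]])
  job.setOutputKeyClass(classOf[ImmutableBytesWritable])
  job.setOutputValueClass(classOf[Put])

  // putsRdd: RDD[(ImmutableBytesWritable, Put)], i.e. the pairs built above (placeholder name).
  putsRdd.saveAsNewAPIHadoopDataset(job.getConfiguration)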
And here is my console log:
17/06/20 22:06:15 WARN FileOutputCommitter: Output Path is null in setupJob()
17/06/20 22:06:24 WARN FileOutputCommitter: Output Path is null in commitJob()
Thanks very much!
Answer 0 (score: 0)
Has your problem been solved?
I also got this WARN message, and my process got stuck in saveAsHadoopDataset().