I have a text file in the following format:
2018-01-19 12:00 Info
2018-01-20 12:00 Info
2018-01-21 12:00 Error
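I have written code that uses several map functions to build a case class and then filter on Error or Info. Is there a better way to write this that reduces the number of map calls?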
case class Test(date: String, time: String, log: String)
val input = sc.textFile(inputPath)
val filter = input.map(x => x.split(" ")).map(x => Test(x(0), x(1), x(2))).filter(_.log == "[" + logLevel + "]")
val mapCount = filter.map { case Test(f1, f2, f3) => (f1, 1) }
mapCount.persist()
val r = mapCount.reduceByKey(_ + _).collect
val rd = sc.makeRDD(r.toList).saveAsTextFile(outputPath)
Output when Info is passed as logLevel:
(2018-01-19 ,1)
(2018-01-20 ,1)
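
For reference, here is a minimal sketch of the kind of single-pass version I am after. It is not a tested Spark job: it uses plain Scala collections in place of the RDD so it can run on its own, the helper name countByDate is hypothetical, and it assumes the level appears in the file without brackets, as in the sample lines above.

```scala
// Sample lines copied from the question; in Spark these would come from sc.textFile(inputPath).
val lines = Seq(
  "2018-01-19 12:00 Info",
  "2018-01-20 12:00 Info",
  "2018-01-21 12:00 Error"
)

// Hypothetical helper: split, filter by level, and key by date in one flatMap,
// then sum the counts per date.
def countByDate(lines: Seq[String], logLevel: String): Map[String, Int] =
  lines
    .flatMap { line =>
      line.split(" ") match {
        // Assumption: the third field is the bare level, e.g. "Info", not "[Info]".
        case Array(date, _, level) if level == logLevel => Some(date -> 1)
        // Malformed lines and other levels are dropped.
        case _ => None
      }
    }
    .groupBy(_._1)
    .map { case (date, pairs) => date -> pairs.map(_._2).sum }

val infoCounts = countByDate(lines, "Info")
```

On an RDD the same flatMap could presumably be followed by reduceByKey(_ + _) and then saveAsTextFile(outputPath) directly on the result, which would avoid the collect and sc.makeRDD round trip through the driver.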