The raw data looks like this:
YAPM1,20100901,23:36:01.563,Quote,,,,,,,4563,,,,,,
YAPM1,20100901,23:36:03.745,Quote,,,,,4537,,,,,,,,
The first line has extra empty columns. I parse the data like this:
val tokens = List.fromString(line, ',')
Result:
List(YAPM1, 20100901, 23:36:01.563, Quote, 4563)
List(YAPM1, 20100901, 23:36:03.745, Quote, 4537)
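With the resulting lists, there is currently no way to tell which lines have the extra columns, because the empty fields are dropped. What should I do?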
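Answer 0 (score: 11)
Use String's split and pass -1 as the second argument! A negative limit keeps the empty strings, including the trailing ones: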
scala> "a,b,c,d,,,,".split(",")
res1: Array[java.lang.String] = Array(a, b, c, d)
scala> "a,b,c,d,,,,".split(",", -1)
res2: Array[java.lang.String] = Array(a, b, c, d, "", "", "", "")
FYI, List.fromString is deprecated in favor of String's split.
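Applied to the lines from the question, this could look roughly like the sketch below. The names SplitDemo, parse and nonEmptyColumns are made up for illustration; only split(",", -1) comes from the answer above. Once the empty fields are kept, the index of each non-empty field tells you which column a value sits in.

```scala
object SplitDemo {
  // Split on commas; the -1 limit keeps empty fields, including trailing ones.
  def parse(line: String): Array[String] = line.split(",", -1)

  // Hypothetical helper: positions of the fields that actually hold a value.
  def nonEmptyColumns(tokens: Array[String]): Seq[Int] =
    tokens.indices.filter(i => tokens(i).nonEmpty)

  def main(args: Array[String]): Unit = {
    // The two sample lines from the question.
    val lines = Seq(
      "YAPM1,20100901,23:36:01.563,Quote,,,,,,,4563,,,,,,",
      "YAPM1,20100901,23:36:03.745,Quote,,,,,4537,,,,,,,,"
    )
    for (line <- lines) {
      val tokens = parse(line)
      val filled = nonEmptyColumns(tokens).mkString(", ")
      println(s"${tokens.length} columns, values at indices: $filled")
    }
  }
}
```

Comparing the printed indices between lines shows where a value has shifted to a different column, which the default split (without -1) cannot reveal for trailing fields.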