I have a data file in which each line looks like this:
A
B
C
QW
OO
P
...
Now I want to merge every three lines into one, like this:
ABC
QWOOP
...
How should I write this function?
e.g. val data = sc.textFile("path")
Thanks!
Answer 0 (score: -1)
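val lineRdd = sc.textFile("path")
val yourRequiredRdd = lineRdd
  // Pair each line with its position in the file.
  .zipWithIndex
  // Key each line by its group number (index / 3); keep the index for ordering.
  .map({ case (line, index) => (index / 3, (index, line)) })
  // Collect the (index, line) pairs of each group into one list.
  .aggregateByKey(List.empty[(Long, String)])(
    { case (aggrList, (index, line)) => (index, line) :: aggrList },
    { case (aggrList1, aggrList2) => aggrList1 ++ aggrList2 }
  )
  // Restore the original line order inside each group, then concatenate.
  .map({ case (key, aggrList) =>
    aggrList
      .sortBy({ case (index, line) => index })
      .map({ case (index, line) => line })
      .mkString("")
  })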
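
To see the grouping idea without a cluster, here is a minimal sketch of the same index / 3 keying on a plain Scala collection. It is not Spark code: the object name MergeEveryThreeLines and the helper mergeEvery are made up for illustration, and the input is the sample from the question.

```scala
object MergeEveryThreeLines {
  // Hypothetical helper (not part of Spark): merges every `n` consecutive
  // lines into one string, using the same "index / n" key as the answer above.
  def mergeEvery(lines: Seq[String], n: Int): Seq[String] =
    lines.zipWithIndex
      // Lines 0..n-1 get key 0, lines n..2n-1 get key 1, and so on.
      .groupBy { case (_, index) => index / n }
      .toSeq
      // A Map has no guaranteed order, so sort the groups by their key.
      .sortBy { case (group, _) => group }
      .map { case (_, members) =>
        members
          .sortBy { case (_, index) => index } // original order inside the group
          .map { case (line, _) => line }
          .mkString("")
      }

  def main(args: Array[String]): Unit = {
    val lines = Seq("A", "B", "C", "QW", "OO", "P")
    // Prints ABC on the first line and QWOOP on the second.
    mergeEvery(lines, 3).foreach(println)
  }
}
```

The sketch sorts the groups by key before concatenating. In the RDD version the order of the merged lines after aggregateByKey is not guaranteed, so if the output order matters it is probably worth keeping the key and calling sortByKey before dropping it.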