Unable to split an array by a special character in Spark

Time: 2020-06-30 06:48:25

Tags: apache-spark

Hi, I have this DataFrame (runnerListByPositionDataframe):

+------------+---------------------------------+
|runner      |positions                        |
+------------+---------------------------------+
|azerty      |[10, 8, 11,, 1, 5, 4, 1, 9, 7, 1]|
+------------+---------------------------------+

I am trying to split the positions array on a number. For example, I need:

+------------+----------------------------------------+
|runner      |positions                               |
+------------+----------------------------------------+
|azerty      |[[10, 8, 11,, 1] , [5, 4, 1], [9, 7, 1]]|
+------------+----------------------------------------+

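Each time a 1 appears in positions, the current array should end and a new one should start, so that I end up with an array of arrays.

To do this, I tried: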

val result: Dataset[(Seq[Int], Seq[Int])] = runnerListByPositionDataframe.map((runner: Row) => {
  val positions: Seq[Int] = runner.getAs[Seq[Int]]("positions")
  val positionsSplited: (Seq[Int], Seq[Int]) = positions.splitAt(positions.indexWhere(x => {
    x == 0
  }))
  positionsSplited
})

result.show(false)

But I got:

+-----------+-----------------------+
|_1         |_2                     |
+-----------+-----------------------+
|[10, 8, 11]|[, 1, 5, 4, 1, 9, 7, 1]|
+-----------+-----------------------+

Can anyone help?

Thanks

1 answer:

Answer 0: (score: 1)


Here is a brute-force approach I can think of to get the desired output:

spark>=2.4
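
The answer's own code is not shown here, so the following is only a minimal sketch of one way to get the desired grouping, not the answerer's method. It is plain Scala with no Spark dependency, and the names `SplitAfterSketch` and `splitAfter` are hypothetical. It closes a chunk after every element that matches a predicate, and it models the empty slot in the array as `None` so that it cannot be mistaken for a number.

```scala
object SplitAfterSketch {
  // Hypothetical helper: walk the sequence once and close the current chunk
  // right after every element for which isDelimiter returns true.
  def splitAfter[A](xs: Seq[A])(isDelimiter: A => Boolean): Seq[Seq[A]] = {
    val start = (Vector.empty[Vector[A]], Vector.empty[A])
    val (closed, rest) = xs.foldLeft(start) { case ((chunks, current), x) =>
      val extended = current :+ x
      if (isDelimiter(x)) (chunks :+ extended, Vector.empty[A])
      else (chunks, extended)
    }
    // Keep trailing elements that were never followed by a delimiter.
    if (rest.isEmpty) closed else closed :+ rest
  }

  def main(args: Array[String]): Unit = {
    // The positions from the question; None stands in for the empty slot.
    val positions: Seq[Option[Int]] = Seq(
      Some(10), Some(8), Some(11), None, Some(1),
      Some(5), Some(4), Some(1),
      Some(9), Some(7), Some(1))

    // Three chunks, each ending with Some(1).
    splitAfter(positions)(_.contains(1)).foreach(println)
  }
}
```

In the question's code, the split probably happened at the empty slot because a null element read through `Seq[Int]` is unboxed to 0, which then satisfies `x == 0`. Under that assumption, reading the column as `Seq[java.lang.Integer]` and testing `x != null && x.intValue == 1` inside the `map` would avoid the problem before passing the values to a helper like the one above.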