Convert a PySpark DataFrame column from a list/array entry to a nested (double) list/array entry

Time: 2018-08-09 15:49:09

Tags: python pyspark pyspark-sql

I want to convert a PySpark DataFrame that has the following schema:

root
 |-- top: long (nullable = true)
 |-- inner: struct (nullable = true)
 |    |-- inner1: long (nullable = true)
 |    |-- inner2: long (nullable = true)
 |    |-- inner3: date (nullable = true)
 |    |-- inner4: date (nullable = true)

To:

root
 |-- top: long (nullable = true)
 |-- inner: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- inner1: long (nullable = true)
 |    |    |-- inner2: long (nullable = true)
 |    |    |-- inner3: date (nullable = true)
 |    |    |-- inner4: date (nullable = true)

In other words, each row should change from

top | [ inner1, inner2, inner3, inner4]

to

top | [[inner1, inner2, inner3, inner4]]

0 answers:

No answers