pyspark - Flattening a struct of arrays

Time: 2018-10-03 18:23:41

Tags: dataframe pyspark

root
|-- first_name: string
|-- last_name: string
|-- degrees: struct
|    |-- A: array
|    |   |-- element: struct
|    |   |   |-- school: string
|    |   |   |-- advisor1: string
|    |   |   |-- advisor2: string
|    |-- B: array
|    |   |-- element: struct
|    |   |   |-- school: string
|    |   |   |-- advisor1: string
|    |   |   |-- advisor2: string
|    |   |   |-- attrn: string

How can I flatten this schema so that it is easier to query from Hive?

I need to explode each row so that the output looks like this:

first_name,last_name,A,A.school,A.advisor1,A.advisor2, NULL
first_name,last_name,B,B.school,B.advisor1,B.advisor2, B.attrn
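One possible way to get that shape is to explode `degrees.A` and `degrees.B` separately, pad the column that `A` lacks with NULL, and union the two results. The following is only a sketch under assumptions: the helper names `branch_exprs` and `flatten_degrees`, the alias `d`, and the output column name `degree` are hypothetical, and `df` is assumed to be a DataFrame with exactly the schema shown above.

```python
# Field names are taken from the schema above; A has no `attrn` field.
ALL_FIELDS = ["school", "advisor1", "advisor2", "attrn"]
BRANCHES = [
    ("A", ["school", "advisor1", "advisor2"]),
    ("B", ["school", "advisor1", "advisor2", "attrn"]),
]


def branch_exprs(branch, fields, all_fields):
    """Build Spark SQL select expressions for one exploded branch.

    `d` is the (hypothetical) alias given to the exploded array element.
    Fields the branch does not have are filled with a typed NULL so that
    both branches end up with the same columns in the same order.
    """
    exprs = ["first_name", "last_name", "'{}' AS degree".format(branch)]
    for name in all_fields:
        if name in fields:
            exprs.append("d.{0} AS {0}".format(name))
        else:
            exprs.append("CAST(NULL AS STRING) AS {}".format(name))
    return exprs


def flatten_degrees(df):
    """Return one row per element of degrees.A and degrees.B."""
    result = None
    for branch, fields in BRANCHES:
        # explode() emits one row per array element
        exploded = df.selectExpr(
            "first_name",
            "last_name",
            "explode(degrees.{}) AS d".format(branch),
        )
        part = exploded.selectExpr(*branch_exprs(branch, fields, ALL_FIELDS))
        # union() matches columns by position, which is safe here because
        # every part is built from ALL_FIELDS in the same order
        result = part if result is None else result.union(part)
    return result
```

With this sketch, `flatten_degrees(df)` would have the columns `first_name, last_name, degree, school, advisor1, advisor2, attrn`, which is a flat layout that can be queried directly. Note that `explode` drops rows whose array is null or empty, so people without any entry in `A` or `B` would not appear in the result.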

0 answers:

No answers yet