I have a DataFrame with the schema below. The number of element fields is unknown, but some of them (for example element1 and element3) must exist and be unique.
root
|-- context: struct (nullable = true)
| |-- key: string (nullable = true)
| |-- data: struct (nullable = true)
| | |-- dimensions: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- element1: string (nullable = true)
| | | | |-- element2: string (nullable = true)
| | | | |-- element3: string (nullable = true)
| | | | |-- *** : string (nullable = true)
| | | | |-- elementN: string (nullable = true)
How can I transform it into a schema like this?
root
|-- context: struct (nullable = true)
| |-- key: string (nullable = true)
| |-- element1: string (nullable = true)
| |-- element3: string (nullable = true)
Thank you very much.
Answer 0 (score: 0)
Could you try the explode function? The following links cover the same kind of problem and are worth reading:
Extract columns in nested Spark DataFrame
Extract value from structure within an array of arrays in spark using scala
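A minimal sketch of the explode approach in Scala, under a few assumptions that are not in the question: each entry of context.data.dimensions is a struct carrying the element fields exactly as the printed schema shows, and one output row per array entry is acceptable. The case classes, the object name FlattenDimensions, the helper names sample and flatten, and the sample values are all hypothetical and exist only to make the example runnable.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, struct}

object FlattenDimensions {
  // Hypothetical case classes mirroring the question's schema
  // (only three element fields are modelled here).
  case class Dim(element1: String, element2: String, element3: String)
  case class Data(dimensions: Seq[Dim])
  case class Context(key: String, data: Data)
  case class Record(context: Context)

  // Made-up sample rows: one key with two array entries, one key with a single entry.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      Record(Context("k1", Data(Seq(Dim("a1", "a2", "a3"), Dim("b1", "b2", "b3"))))),
      Record(Context("k2", Data(Seq(Dim("c1", "c2", "c3")))))
    ).toDF()
  }

  // explode produces one row per array entry; the struct is then rebuilt
  // from the key and the two fields that are known to exist.
  def flatten(df: DataFrame): DataFrame =
    df.withColumn("dim", explode(col("context.data.dimensions")))
      .select(
        struct(
          col("context.key").as("key"),
          col("dim.element1").as("element1"),
          col("dim.element3").as("element3")
        ).as("context")
      )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-dimensions")
      .getOrCreate()
    val result = flatten(sample(spark))
    result.printSchema()
    result.show(false)
    spark.stop()
  }
}
```

Note that explode drops rows whose array is null or empty; explode_outer (available since Spark 2.2) keeps them with null fields instead. If the array is known to hold exactly one entry per row, selecting col("context.data.dimensions").getItem(0) would avoid the explode altogether.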