How to explode a Spark DataFrame

Time: 2020-01-02 21:17:44

Tags: scala apache-spark apache-spark-sql databricks

I exploded a nested schema, but I am not getting the result I want.

Before the explode, the DataFrame looks like this:

df.show()

+----------+----------------------------------------------------------+
|CaseNumber|SourceId                                                  |
+----------+----------------------------------------------------------+
|0         |[{"id":"1","type":"Sku"},{"id":"22","type":"ContractID"}]|
|1         |[{"id":"3","type":"Sku"},{"id":"24","type":"ContractID"}]|
+----------+----------------------------------------------------------+

I would like it to look like this:

+----------+---+----------+
|CaseNumber|Sku|ContractId|
+----------+---+----------+
|0         |1  |22        |
|1         |3  |24        |
+----------+---+----------+
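
For reference, the transformation described above (one row per CaseNumber, one column per "type" value) can be sketched as an explode followed by a pivot. This is only a minimal sketch under stated assumptions: it assumes SourceId is an array of structs with string fields id and type, which the question does not confirm. The Source case class, the ExplodeAndPivot object, and the pivotSources helper are hypothetical names introduced for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, first}

// Hypothetical element type: the question does not show the real schema of SourceId.
case class Source(id: String, `type`: String)

object ExplodeAndPivot {

  // Explode SourceId into one row per element, then pivot the "type" values into columns.
  // Assumes SourceId is array<struct<id:string,type:string>>.
  def pivotSources(df: DataFrame): DataFrame =
    df.select(col("CaseNumber"), explode(col("SourceId")).as("src"))
      // Flatten the struct so the pivot works on plain top-level columns.
      .select(col("CaseNumber"), col("src.type").as("type"), col("src.id").as("id"))
      .groupBy("CaseNumber")
      // Listing the expected pivot values up front fixes the output columns.
      .pivot("type", Seq("Sku", "ContractID"))
      .agg(first("id"))
      .orderBy("CaseNumber")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explode-pivot-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample rows mirroring the table in the question.
    val df = Seq(
      (0, Seq(Source("1", "Sku"), Source("22", "ContractID"))),
      (1, Seq(Source("3", "Sku"), Source("24", "ContractID")))
    ).toDF("CaseNumber", "SourceId")

    pivotSources(df).show()
    spark.stop()
  }
}
```

The pivoted column takes its name from the data, so it comes out as ContractID; a withColumnRenamed("ContractID", "ContractId") call would match the spelling in the desired output. If SourceId is actually stored as a JSON string rather than an array of structs, it would probably need to be parsed first (for example with from_json and an explicit schema) before the explode.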

0 answers:

No answers