I have a JSON document with entries like the following:
{
"data":[
[
1,
"CD07C40E-4943-44B0-BF5E-370DA2133E25",
1,
1320919663,
"386118",
1320919663,
"386118",
"{\n \"invalidCells\" : {\n \"1669635\" : \" \"\n }\n}",
null,
" --T::00",
null,
null,
null,
[
null,
null,
null,
null,
null
],
null
],
[
2,
"152ECD05-2301-43C7-88C5-085199623DA7",
2,
1320919663,
"386118",
1320919663,
"386118",
"{\n}",
"6900 37th Av S",
"Medic Response",
1320881580,
"47.540683",
"-122.286131",
[
null,
"47.540683",
"-122.286131",
null,
false
],
"F110104166"
],
[
3,
"311ED596-51B7-4E70-A293-12378C61F0A6",
3,
1320919663,
"386118",
1320919663,
"386118",
"{\n}",
"N 50th St / Stone Way N",
"Aid Response",
1320881520,
"47.665034",
"-122.340207",
[
null,
"47.665034",
"-122.340207",
null,
false
],
"F110104164"
]
]
}
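I already have the columns from the earlier structure in the JSON document, which I won't list here. All I want now is to take the values from the "data" array above and populate a table with them.
I tried the following, but it does not seem to work: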
import org.apache.spark.sql.functions._
seattlefire.withColumn("data", explode(seattlefire("data")).as("data_flattened")).show(false)
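For reference, here is a minimal sketch of the result I am after. It rests on several assumptions: the inline `json` string is a shortened stand-in for the document above, the output column names (`address`, `type`, `datetime`, `latitude`, `longitude`, `incident_number`) are hypothetical because the real ones come from the part of the document I did not list, and it relies on Spark inferring each mixed-type inner array as an array of strings. Reading the real file would probably also need `spark.read.option("multiLine", true).json(path)`, since the document spans many lines.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode}

// Local session for the sketch; in spark-shell, getOrCreate() reuses the existing one.
val spark = SparkSession.builder()
  .appName("flatten-data-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Shortened stand-in for the document above: a single record in "data".
val json =
  """{"data":[[2,"152ECD05-2301-43C7-88C5-085199623DA7",2,1320919663,"386118",1320919663,"386118","{\n}","6900 37th Av S","Medic Response",1320881580,"47.540683","-122.286131",[null,"47.540683","-122.286131",null,false],"F110104166"]]}"""

// Parse the JSON string; "data" is expected to be inferred as array<array<string>>
// because the inner arrays mix numbers, strings, booleans and nested arrays.
val raw = spark.read.json(Seq(json).toDS())

// One output row per inner array.
val rows = raw.select(explode(col("data")).as("row"))

// Pick values by position and cast them; the column names are hypothetical.
val table = rows.select(
  col("row").getItem(8).as("address"),
  col("row").getItem(9).as("type"),
  col("row").getItem(10).cast("long").as("datetime"),
  col("row").getItem(11).cast("double").as("latitude"),
  col("row").getItem(12).cast("double").as("longitude"),
  col("row").getItem(14).as("incident_number")
)

table.show(false)
```

The positions 8 to 14 are simply where the address, type, timestamp, coordinates and incident number appear in the sample records, so they would need adjusting if the real column order differs.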