I have a problem with a Spark table. My table is:
[
{
"year" : "2013",
"title" : "Turn It Down, Or Else!",
"info" : {
"directors" : [
"Alice Smith",
"Bob Jones"
],
"release_date" : "2013-01-18T00:00:00Z",
"rating" : "6.2",
"genres" : [
"Comedy",
"Drama"
],
"image_url" : "http://ia.media-imdb.com/images/N/O9ERWAU7FS797AJ7LU8HN09AMUP908RLlo5JF90EWR7LJKQ7@@._V1_SX400_.jpg",
"plot" : "A rock band plays their music at high volumes, annoying the neighbors.",
"rank" : "11",
"running_time_secs" : "5215",
"actors" : [
"David Matthewman",
"Ann Thomas",
"Jonathan G. Neff"
]
}
}
]
I need to split the "Data" column into rows. This is the table I am working with:
# Source: spark<?> [?? x 4]
AssetConnectDeviceKey CreateDate FaultStatus Data
* <chr> <dttm> <int> <chr>
1 0037005B4834500C20323250 2019-03-19 11:02:52 1 F@BBZL,CSSAA
2 0037005B4834500C20323250 2019-03-19 11:02:54 1 F@BBZL
3 0037005B4834500C20323250 2019-03-19 11:02:54 1 F@BBZL
4 0037005B4834500C20323250 2019-03-19 11:03:24 1 F@BBZL,QBBBC
# ... with more rows
With an ordinary data frame I can do it, like this:
# Source: spark<?> [?? x 4]
AssetConnectDeviceKey CreateDate FaultStatus Data
* <chr> <dttm> <int> <chr>
1 0037005B4834500C20323250 2019-03-19 11:02:52 1 F@BBZL
2 0037005B4834500C20323250 2019-03-19 11:02:52 1 CSSAA
3 0037005B4834500C20323250 2019-03-19 11:02:54 1 F@BBZL
4 0037005B4834500C20323250 2019-03-19 11:02:54 1 F@BBZL
5 0037005B4834500C20323250 2019-03-19 11:03:24 1 F@BBZL
6 0037005B4834500C20323250 2019-03-19 11:03:24 1 QBBBC
# ... with more rows
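For reference, here is a minimal sketch of the local (non-Spark) version. It assumes `tidyr::separate_rows()` is the function doing the split, and it rebuilds the four rows shown above by hand as a tibble. The name `df` and the UTC time zone are my assumptions for illustration, not part of the real data.

```r
library(tibble)
library(tidyr)

# Hypothetical local copy of the four rows shown above
# (the time zone is assumed to be UTC).
df <- tibble(
  AssetConnectDeviceKey = rep("0037005B4834500C20323250", 4),
  CreateDate = as.POSIXct(
    c("2019-03-19 11:02:52", "2019-03-19 11:02:54",
      "2019-03-19 11:02:54", "2019-03-19 11:03:24"),
    tz = "UTC"
  ),
  FaultStatus = rep(1L, 4),
  Data = c("F@BBZL,CSSAA", "F@BBZL", "F@BBZL", "F@BBZL,QBBBC")
)

# Split "Data" on commas, producing one row per value.
# sep = "," is given explicitly because the default separator
# would also split on "@".
result <- separate_rows(df, Data, sep = ",")
print(result)
```

The other columns are repeated for every value that comes out of a split "Data" cell, which is the shape I want to get from the Spark tbl as well.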
However, I cannot do the same thing on a Spark tbl. How can I do this?