I don't know how to get the relevant information out of this SQL column type:
array<
struct<
day_of_week:string,
start:bigint,
duration:bigint,
enabled:boolean,
created_at:timestamp,
deleted_at:timestamp
>
>
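This column holds the daily opening hours of the restaurants in the database. Some restaurants have changed their daily opening hours, so I don't actually need some of the entries stored in the table. All I need is the current opening hours of every restaurant.
Here is an example of the column I am trying to get information from: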
[
{
"day_of_week": "4",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-02-23T10:47:15.033+0000",
"deleted_at": "2018-10-22T18:27:40.403+0000"
},
{
"day_of_week": "7",
"start": 64800000,
"duration": 359,
"enabled": true,
"created_at": "2018-10-22T18:29:11.030+0000",
"deleted_at": null
},
{
"day_of_week": "5",
"start": 64800000,
"duration": 359,
"enabled": true,
"created_at": "2018-10-22T18:29:11.030+0000",
"deleted_at": null
},
{
"day_of_week": "6",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-10-22T18:27:40.397+0000",
"deleted_at": "2018-10-22T18:27:42.074+0000"
},
{
"day_of_week": "7",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-10-22T18:27:40.397+0000",
"deleted_at": "2018-10-22T18:27:42.074+0000"
},
{
"day_of_week": "1",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-10-22T18:27:42.069+0000",
"deleted_at": "2018-10-22T18:29:11.035+0000"
},
{
"day_of_week": "6",
"start": 64800000,
"duration": 359,
"enabled": true,
"created_at": "2018-10-22T18:29:11.030+0000",
"deleted_at": null
},
{
"day_of_week": "7",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-10-22T18:27:42.069+0000",
"deleted_at": "2018-10-22T18:29:11.035+0000"
},
{
"day_of_week": "2",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-02-23T10:47:15.033+0000",
"deleted_at": "2018-10-22T18:27:40.403+0000"
},
I am not interested in this entry, because it was deleted on 2018-10-22:
[{"day_of_week":"4","start":64800000,"duration":359,"enabled":false,
"created_at":"2018-02-23T10:47:15.033+0000","deleted_at":"2018-10-22T18:27:40.403+0000"}
But I am interested in every part of this column that looks like the following, because it shows the current opening hours for day_of_week 7:
"day_of_week":"7","start":64800000,"duration":359,"enabled":true,
"created_at":"2018-10-22T18:29:11.030+0000","deleted_at":null
I have tried to get all the elements of the column, but it only returns the first matching element of each cell and nothing more:
LATERAL VIEW explode(shifts.`day_of_week`) exploded_table as day_of_week
LATERAL VIEW explode(shifts.`start`) exploded_table as start
LATERAL VIEW explode(shifts.`enabled`) exploded_table as enabled
LATERAL VIEW explode(shifts.`duration`) exploded_table as duration
Can anyone help me?
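Also, I assume that "start":64800000 is the opening time and that "duration":359 is how long the restaurant stays open, but I don't know how to interpret these numbers either. Does "start":64800000 mean 7 am, 8 am, or 9 am? And is "duration":359 7 hours, or 9 hours?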
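One reading that yields sensible restaurant hours is to treat start as milliseconds after midnight and duration as minutes. The schema does not state any units, so this is only an assumption. The short Python sketch below just checks the arithmetic under that assumption; the helper as_clock is a hypothetical name, not part of any library.

```python
from datetime import timedelta

# Assumption (not confirmed by the source): `start` is milliseconds after
# midnight and `duration` is minutes.
start_ms = 64800000
duration_min = 359

opens_at = timedelta(milliseconds=start_ms)
closes_at = opens_at + timedelta(minutes=duration_min)


def as_clock(delta):
    """Format a timedelta measured from midnight as HH:MM (wraps at 24h)."""
    total_minutes = int(delta.total_seconds()) // 60
    return f"{total_minutes // 60 % 24:02d}:{total_minutes % 60:02d}"


print(as_clock(opens_at), as_clock(closes_at))  # 18:00 23:59
```

Under these assumed units the entry would mean "open from 18:00 for 5 hours 59 minutes". If the units are different (for example seconds for duration), the result changes accordingly, so it is worth confirming with whoever owns the data.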
Sorry for the long post, but I am new to SQL, and I am the only resource I have here for working out the things I don't know.
Thanks in advance for any help.
Answer 0 (score: 0)
TLDR:
For a dataframe df with schema:
key:integer
data:array
element:struct
day_of_week:string
start:decimal(38,0)
duration:decimal(38,0)
enabled:boolean
created_at:string
deleted_at:string
which is registered as a global temp view named test (hence global_temp.test in the query), the array can be exploded with:
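-- explode() turns each array element into its own row (aliased ed);
-- the struct fields are then read with dot notation
select key, a.ed.day_of_week,
       a.ed.start, a.ed.duration,
       a.ed.enabled, a.ed.created_at, a.ed.deleted_at
from (select key, explode(data) as ed from global_temp.test) a
-- keep only entries that were never deleted, i.e. the current opening hours
where a.ed.deleted_at is null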
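To make the logic of the query easier to follow, here is a minimal plain-Python sketch of the same two steps: flatten the array into one record per element, then keep only the elements whose deleted_at is null. It does not use Spark. The rows list, the key value 1 and the function name current_hours are made up for illustration, and the three entries are copied from the sample in the question.

```python
# Hypothetical input: (key, data) pairs, where data is a list of dicts
# standing in for the array<struct<...>> column.
rows = [
    (1, [
        {"day_of_week": "4", "start": 64800000, "duration": 359, "enabled": False,
         "created_at": "2018-02-23T10:47:15.033+0000",
         "deleted_at": "2018-10-22T18:27:40.403+0000"},
        {"day_of_week": "7", "start": 64800000, "duration": 359, "enabled": True,
         "created_at": "2018-10-22T18:29:11.030+0000", "deleted_at": None},
        {"day_of_week": "5", "start": 64800000, "duration": 359, "enabled": True,
         "created_at": "2018-10-22T18:29:11.030+0000", "deleted_at": None},
    ]),
]


def current_hours(rows):
    """Yield one flat record per array element that has not been deleted."""
    for key, data in rows:
        for ed in data:                      # mirrors: explode(data) as ed
            if ed["deleted_at"] is None:     # mirrors: where a.ed.deleted_at is null
                yield {"key": key, **ed}


result = list(current_hours(rows))
for r in result:
    print(r["key"], r["day_of_week"], r["start"], r["duration"], r["enabled"])
```

Exploding the whole struct once, rather than exploding each field separately as in the question's LATERAL VIEW attempt, keeps the fields of one opening-hours entry together on the same row.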