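I am trying to run this query in Hive, which uses the sum(...) over(...) function, but it fails with an error.
I tried using select distinct in the subquery, but it still does not work.
Is there a mistake in the join?
Here is my SQL code: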
select
c.driver_city_name,
c.driver_car_brand,
c.year,
count(distinct c.driver_id)over(PARTITION BY c.driver_car_brand,c.driver_city_name,c.year),
sum(c.supply_hr)over(PARTITION BY c.driver_car_brand,c.driver_city_name,c.year),
sum(c.work_hr)over(PARTITION BY c.driver_car_brand,c.driver_city_name,c.year)
from (
select
a.driver_city_name,
a.driver_car_brand,
a.driver_id,
year(b.reg_date_cheling) as year,
d.onlinetime as supply_hr,
d.charge_time_length/60/60 as work_hr
from gulfstream_dw.dw_v_driver_base a
join (
select
driver_id,
reg_date_cheling
from g_bi.t_brx_xinzheng_driver_diaodu_1
where driver_city_id in (17,18,2,3,1,5,10,4,24,23)
and concat_ws('-',year,month,day) BETWEEN '2016-11-19' and '2017-02-21'
) b on a.driver_id = b.driver_id
join (
select
e.driver_id,
sum(e.onlinetime) as onlinetime, -- online duration (unit: hours)
sum(e.charge_time_length) as charge_time_length -- total charged duration (unit: seconds)
from(
select distinct
concat_ws('-',year,month,day) date1,
driver_id,
onlinetime, -- online duration (unit: hours)
charge_time_length -- total charged duration (unit: seconds)
from gulfstream_dw.dw_m_driver_strategy
where concat_ws('-',year,month,day) between '2016-11-19' and '2017-02-21'
and onlinetime > 0
) e
group by e.driver_id
) d on a.driver_id = d.driver_id
where driver_city_id in (17,18,2,3,1,5,10,4,24,23)
and to_date(max_success_strive_time) BETWEEN '2016-11-20' and '2017-02-20'
and concat_ws('-',year,month,day) BETWEEN '2016-11-19' and '2017-02-21'
and a.driver_car_brand in (
"奇瑞-E3",
"吉利-远景",
)
group by a.driver_city_name,a.driver_car_brand,a.driver_id,year(b.reg_date_cheling),d.onlinetime,d.charge_time_length/60/60
)c
group by c.driver_city_name,c.driver_car_brand,c.year
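But I get this error:
Error type: SEMANTIC_FAILED Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 6:6 Invalid column reference 'supply_hr'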
If I leave out the two sum over functions and keep only the count over function, the query runs successfully.
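
One possible reading of the error, offered as a sketch rather than a confirmed fix: in SQL, window functions are evaluated after GROUP BY, so in a grouped query an over(...) call can only reference grouping columns or aggregates. Neither supply_hr nor work_hr is in the outer GROUP BY, which would explain the invalid column reference. Since the PARTITION BY columns are the same as the GROUP BY columns, plain aggregates may be what is intended. In the example below, the CTE c and all of its rows are made up; it only stands in for the inner subquery of the query above.

```sql
-- Hypothetical stand-in for the inner subquery "c"; these rows are invented.
with c as (
  select 'CityA' as driver_city_name, 'BrandX' as driver_car_brand,
         2016 as year, 1 as driver_id, 10.0 as supply_hr, 4.0 as work_hr
  union all
  select 'CityA', 'BrandX', 2016, 2, 6.0, 2.5
  union all
  select 'CityB', 'BrandX', 2017, 3, 8.0, 3.0
)
-- Plain aggregates with GROUP BY: one row per (city, brand, year),
-- with no over(...) clause for Hive to resolve after grouping.
select
  c.driver_city_name,
  c.driver_car_brand,
  c.year,
  count(distinct c.driver_id) as driver_cnt,
  sum(c.supply_hr) as supply_hr,
  sum(c.work_hr) as work_hr
from c
group by c.driver_city_name, c.driver_car_brand, c.year;
```

If the window form is preferred, the other direction would be to keep the over(...) calls and drop the outer GROUP BY, at the cost of getting one output row per input row instead of one per group.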