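I have a common table expression with a window function, and I keep getting this error message:
Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 82:6 Invalid column reference 'gcr_amt' in definition of CTE pro_orders [ select o.shopper_id as pro_shopper_id, date_format(o.order_date, 'YYYYMM') as ym_order, sum(o.gcr_amt) as total_gcr, sum(case when o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt end) as new_gcr, sum(o.gcr_amt) over (partition by o.shopper_id rows between 12 preceding and 1 following) as 12months_direct_gcr from dp_enterprise.uds_order o inner join combined_shopper_level_data cs on cs.pro_shopper_id = o.shopper_id and cs.year_month = date_format(o.order_date, 'YYYYMM') where o.exclude_reason_desc is Null group by o.shopper_id, o.order_date ] used as po at Line 83:5
My CTE looks like this: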
pro_orders as (
select o.shopper_id as pro_shopper_id,
date_format(o.order_date, 'YYYYMM') as ym_order,
sum(o.gcr_amt) as total_gcr,
sum(case when o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt end) as new_gcr,
sum(o.gcr_amt) over (partition by o.shopper_id, cs.year_month order by cs.year_month desc rows between 12 preceding and 0 following) as 12months_direct_gcr
from dp_enterprise.uds_order o
right join combined_shopper_level_data cs on cs.pro_shopper_id = o.shopper_id and cs.year_month = date_format(o.order_date, 'YYYYMM')
group by o.shopper_id, o.order_date
),
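I don't use window functions often, so my syntax may be off. In plain English, what I want is a rolling 12-month total of the metric "gcr".
So for the row with shopper_id 123abc in year_month 201901, I want to add the gcr of the previous 11 months to the gcr of the current row's month, for a 12-month total. I'm not sure my window function is set up correctly for that.
The year_month column has the format YYYYMM, e.g. 201901.
Is my window function set up correctly for my goal?
How can I get past this error message?
Edit: I still get an error message with the following CTE: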
pro_orders as (
select o.shopper_id as pro_shopper_id,
cs.year_month,
sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month then o.gcr_amt else 0 end) as total_gcr,
sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month and o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt else 0 end) as new_gcr,
sum(sum(o.gcr_amt)) over (partition by o.shopper_id
order by cs.year_month desc
rows between 12 preceding and 0 following)
as 12months_direct_gcr
from combined_shopper_level_data cs
left join dp_enterprise.uds_order o on o.shopper_id = cs.pro_shopper_id
where o.exclude_reason_desc is Null
group by o.shopper_id, cs.year_month
),
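This results in a similar error message:
Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 83:10 Invalid column reference 'gcr_amt' in definition of CTE pro_orders [ select o.shopper_id as pro_shopper_id, cs.year_month, sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month then o.gcr_amt else 0 end) as total_gcr, sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month and o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt else 0 end) as new_gcr, sum(sum(o.gcr_amt)) over (partition by o.shopper_id order by cs.year_month desc rows between 12 preceding and 0 following) as 12months_direct_gcr from combined_shopper_level_data cs left join dp_enterprise.uds_order o on o.shopper_id = cs.pro_shopper_id where o.exclude_reason_desc is Null group by o.shopper_id, cs.year_month ] used as po at Line 87:5
Answer 0 (score: 1)
You have an aggregation query, so the window function looks a bit odd: it has to be applied to the aggregated value. The basic idea is this: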
sum(sum(o.gcr_amt)) over (partition by o.shopper_id, cs.year_month
order by cs.year_month desc
rows between 12 preceding and 0 following
) as 12months_direct_gcr
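That still won't work. First, cs.year_month appears in both the order by and the partition by, so every partition holds a single month and the ordering has nothing to order. Second, cs.year_month is not in the group by.
Assuming you have a value for every month, you can use: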
sum(sum(o.gcr_amt)) over (partition by o.shopper_id
order by cs.year_month desc
rows between 12 preceding and 0 following
) as 12months_direct_gcr
and put cs.year_month in the group by (this may require adjusting other parts of the query).
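On whether the frame matches the stated goal, here is a small self-contained sketch of how frame rows are counted. The CTE name monthly, the column gcr, and the three rows are made up for illustration and are not taken from the question's tables. Rows "preceding" are counted in the direction of the order by, so with desc they are the later months. A 2-row frame keeps the example short.

```sql
-- Illustrative data only: one shopper, three consecutive months.
with monthly as (
  select '123abc' as shopper_id, '201811' as year_month, 10 as gcr union all
  select '123abc', '201812', 20 union all
  select '123abc', '201901', 30
)
select shopper_id,
       year_month,
       gcr,
       -- ascending order: "1 preceding" is the month before the current one
       sum(gcr) over (partition by shopper_id
                      order by year_month asc
                      rows between 1 preceding and current row) as trailing_2m,
       -- descending order: "1 preceding" is the month after the current one
       sum(gcr) over (partition by shopper_id
                      order by year_month desc
                      rows between 1 preceding and current row) as leading_2m
from monthly;
-- year_month  trailing_2m  leading_2m
-- 201811      10           30
-- 201812      30           50
-- 201901      50           30
```

By the same reasoning, the current month plus the previous 11 would be order by cs.year_month (ascending) with rows between 11 preceding and current row; 12 preceding plus the current row spans 13 rows. The frame counts rows, not calendar months, so it relies on the assumption above that every month has a row.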
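For readability, I would also recommend left join rather than right join. For me (and most people), "keep all rows in the first table I just read" is cognitively much simpler than "keep all rows in some table that I will read at the end of the from clause".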
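Edit:
Hive may have a restriction on using window functions in aggregation queries (which would surprise me, because the two are processed separately). I can't find a specific reference for this. If so, do the aggregation in a subquery and apply the window function in the outer query.
Without such a restriction, I think the complete query is: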
with pro_orders as (
    -- select, partition, and group on the cs key: with a left join,
    -- o.shopper_id is NULL for months that have no matching order
    select cs.pro_shopper_id,
           cs.year_month,
           sum(coalesce(o.gcr_amt, 0)) as total_gcr,
           sum(case when o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt else 0 end) as new_gcr,
           sum(sum(o.gcr_amt)) over (partition by cs.pro_shopper_id
                                     order by cs.year_month desc
                                     rows between 12 preceding and 0 following
                                    ) as 12months_direct_gcr
    from combined_shopper_level_data cs left join
         dp_enterprise.uds_order o
         on o.shopper_id = cs.pro_shopper_id and
            -- 'yyyy' is the calendar year in Java date patterns; 'YYYY' is the week-based year
            date_format(o.order_date, 'yyyyMM') = cs.year_month and
            o.exclude_reason_desc is Null
    group by cs.pro_shopper_id, cs.year_month
),
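The subquery fallback mentioned above could be sketched roughly as follows. This is not a tested fix for the Hive error, only one way to separate the two steps. The CTE name monthly is hypothetical; the tables and columns come from the question. The sketch makes three assumptions that differ from the question's code: it groups on cs.pro_shopper_id so that months without orders keep their shopper key, it uses the pattern 'yyyyMM' (calendar year) instead of 'YYYYMM' (week-based year), and it uses an ascending 11 preceding frame on the assumption that the goal is the current month plus the previous 11.

```sql
with monthly as (
    -- Step 1: plain aggregation, one row per shopper and month.
    select cs.pro_shopper_id,
           cs.year_month,
           sum(coalesce(o.gcr_amt, 0)) as total_gcr,
           sum(case when o.product_pnl_new_renewal_name = 'New Purchase'
                    then o.gcr_amt else 0 end) as new_gcr
    from combined_shopper_level_data cs
    left join dp_enterprise.uds_order o
           on o.shopper_id = cs.pro_shopper_id
          and date_format(o.order_date, 'yyyyMM') = cs.year_month  -- assumption: calendar year
          and o.exclude_reason_desc is null
    group by cs.pro_shopper_id, cs.year_month
),
pro_orders as (
    -- Step 2: the window function only sees already-aggregated columns,
    -- so no aggregate is nested inside it.
    select pro_shopper_id,
           year_month,
           total_gcr,
           new_gcr,
           sum(total_gcr) over (partition by pro_shopper_id
                                order by year_month
                                rows between 11 preceding and current row  -- assumption: 12 rows = 12 months
                               ) as 12months_direct_gcr
    from monthly
)
select * from pro_orders;
```

Because the frame counts rows, the 12-month total is only correct if combined_shopper_level_data has exactly one row per shopper and month with no gaps.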