I have the following query in SQL:
select midquery.account, midquery.name, midquery.label, midquery.labelfrequency
from(
-- Count the appearance of each label.
select count(*) as labelfrequency, account, name, label
from(
select account, name, label from myTable
) innerquery
group by account, name, label
) midquery
-- Select most frequent values only.
where rank() over
(partition by midquery.account, midquery.name
order by midquery.labelfrequency desc) = 1
The idea is to find the most frequent label for each (account, name) pair. When I run this query, I get the following error:
Error while compiling statement: FAILED: SemanticException [Error 10002]: Line 12:74 Invalid column reference 'labelfrequency': (possible column names are: labelfrequency, account, name, label)
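I don't quite understand why the interpreter cannot find the column labelfrequency, yet is able to list it among the possible column names. Do you have any advice on how to fix this?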
Edit: If I move rank() into the select list, I do get results:
select midquery.account, midquery.name, midquery.label, midquery.labelfrequency,
rank() over (partition by midquery.account, midquery.name
order by midquery.labelfrequency desc)
from(
-- Count the appearance of each label.
select count(*) as labelfrequency, account, name, label
from(
select account, name, label from myTable
) innerquery
group by account, name, label
) midquery
Answer 0 (score: 1)
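Window functions are simply not allowed in the WHERE clause. There are good reasons for this, but you can think of it as just another rule of SQL, similar to column aliases not being recognized there.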
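(The real reason is that one would have to specify how a window function behaves when there are several filter conditions. It is (almost?) impossible to come up with a coherent set of rules.)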
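As a minimal sketch of the usual workaround, keeping the shape of the query from the question: compute rank() in a derived table, give it an alias, and filter on that alias one level up. The names `ranked` and `label_rank` are introduced here purely for illustration; everything else is taken from the question, and the query has not been run against Hive.

```sql
-- Keep only the top-ranked label(s) for each (account, name) pair.
select ranked.account, ranked.name, ranked.label, ranked.labelfrequency
from (
    -- Rank the labels by frequency within each (account, name) pair.
    -- `label_rank` is a hypothetical alias, not part of the original query.
    select midquery.account, midquery.name, midquery.label, midquery.labelfrequency,
           rank() over (partition by midquery.account, midquery.name
                        order by midquery.labelfrequency desc) as label_rank
    from (
        -- Count the appearances of each label.
        select count(*) as labelfrequency, account, name, label
        from myTable
        group by account, name, label
    ) midquery
) ranked
-- At this level label_rank is an ordinary column, so WHERE may filter on it.
where ranked.label_rank = 1;
```

Because rank() gives tied rows the same rank, this can return more than one label per pair when frequencies tie. If exactly one row per pair is wanted, row_number() could be used instead, at the cost of picking arbitrarily among ties.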
That said, you can simplify the query:
select t.account, t.name, t.label, t.labelfrequency
from (select count(*) as labelfrequency, account, name, label,
rank() over (partition by account, name
order by count(*) desc
) as seqnum
from myTable t
group by account, name, label
) t
where seqnum = 1;
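That is, window functions and aggregate functions can be combined in the same query. And you do not need a subquery just to select a handful of columns.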