I have a query:
select substr(name,7,50) as location, points,sum(if (p1=r1,10,-10))as total from
dq.data
group by points,location order by location,total desc
which produces this data:
FRANCE |0|2|0|0|0|0|1 110.0
FRANCE |0|2|1|0|1|2|1 100.0
FRANCE |0|2|0|0|0|1|1 100.0
FRANCE |0|2|1|0|0|1|1 100.0
FRANCE |0|2|0|1|1|2|1 100.0
FRANCE |0|2|0|0|1|1|1 100.0
GERMANY |1|0|2|2|2|1|0 120.0
GERMANY |1|0|2|2|2|0|0 110.0
GERMANY |1|0|2|2|2|2|0 110.0
GERMANY |1|0|2|2|2|0|2 110.0
GERMANY |1|0|2|2|2|1|1 110.0
I want the highest total for each location, along with the associated points.
I should end up with:
FRANCE |0|2|0|0|0|0|1 110.0
GERMANY |1|0|2|2|2|1|0 120.0
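I believe I need to use a subquery and MAX(total), but I can't get it to work.
In the subquery I want to select points, but I don't want to group by it, which is apparently not allowed.
How can I do this?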
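Answer 0: (score: 3)
Your intuition is correct. You can do this by computing the maximum total per location and then joining that back to the original data: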
select t.*
from (select substr(name,7,50) as location, points,sum(if (p1=r1,10,-10))as total
from dq.data
group by points,location
) t join
(select location, max(total) as maxtotal
from (select substr(name,7,50) as location, points,sum(if (p1=r1,10,-10))as total
from dq.data
group by points,location
) t
group by location
) tsum
on t.location = tsum.location and t.total = tsum.maxtotal
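Note that this version returns more than one row for a location if several rows tie for the top total.
I'm not familiar with google-bigquery. If it supports the "with" statement, you can simplify the query as follows: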
with t as (select substr(name,7,50) as location, points,sum(if (p1=r1,10,-10))as total
from dq.data
group by points,location
)
select t.*
from t join
(select location, max(total) as maxtotal
from t
group by location
) tsum
on t.location = tsum.location and t.total = tsum.maxtotal
If it supports window functions (such as row_number()), you can eliminate the explicit join entirely.
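As a rough sketch of that window-function variant, assuming the engine accepts ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...): the inner query is the one from the question, and the alias names `ranked` and `rn` are made up here for illustration.

```sql
-- Rank the aggregated rows inside each location by total (highest first),
-- then keep only the top-ranked row per location.
SELECT location, points, total
FROM (
  SELECT location, points, total,
         ROW_NUMBER() OVER (PARTITION BY location ORDER BY total DESC) AS rn
  FROM (
    -- Same aggregation as in the question.
    SELECT substr(name,7,50) AS location, points,
           sum(if (p1=r1,10,-10)) AS total
    FROM dq.data
    GROUP BY points, location
  ) t
) ranked
WHERE rn = 1
ORDER BY location
```

Unlike the join versions, this returns exactly one row per location even when there are ties, and which of the tied rows is kept is arbitrary. Swapping ROW_NUMBER() for RANK() should bring back all tied rows, matching the behaviour of the join.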
Answer 1: (score: 0)
I recently ran into a similar problem and solved it along these lines:
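-- MySQL only; needs ONLY_FULL_GROUP_BY disabled, and the surviving row is not guaranteed.
SELECT location, points, total
FROM (
  SELECT substr(name,7,50) AS location, points, sum(if (p1=r1,10,-10)) AS total
  FROM dq.data GROUP BY points, location
  ORDER BY location, total DESC
) tmp
GROUP BY location;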
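I'm not sure it will run as-is, because my database is MySQL, but it's a nice intuitive solution. Sort the subquery so that the rows you want to lose are the ones dropped by the aggregation.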