Selecting the highest value from a GROUP BY query

Time: 2012-06-21 15:27:17

Tags: sql google-bigquery

I have a query:

select substr(name,7,50) as location, points,sum(if (p1=r1,10,-10))as total from
dq.data 
group by points,location order by location,total desc

It produces this data (location, points, total):

FRANCE  |0|2|0|0|0|0|1  110.0    
FRANCE  |0|2|1|0|1|2|1  100.0    
FRANCE  |0|2|0|0|0|1|1  100.0    
FRANCE  |0|2|1|0|0|1|1  100.0    
FRANCE  |0|2|0|1|1|2|1  100.0    
FRANCE  |0|2|0|0|1|1|1  100.0
GERMANY |1|0|2|2|2|1|0  120.0    
GERMANY |1|0|2|2|2|0|0  110.0    
GERMANY |1|0|2|2|2|2|0  110.0    
GERMANY |1|0|2|2|2|0|2  110.0    
GERMANY |1|0|2|2|2|1|1  110.0

For each location, I want the highest total and the points associated with it.

I should end up with:

FRANCE  |0|2|0|0|0|0|1  110.0
GERMANY |1|0|2|2|2|1|0  120.0

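I believe I need a subquery with MAX(total), but I can't get it to work. In the subquery I want to select points without grouping by it, which is evidently not allowed.

How can I do this?

2 Answers:

Answer 0: (score: 3)

Your intuition is correct. You can do this by computing the maximum total per location and then joining it back to the aggregated data: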

select t.*
from (select substr(name,7,50) as location, points,sum(if (p1=r1,10,-10))as total
      from dq.data 
      group by points,location
     ) t join
     (select location, max(total) as maxtotal
      from (select substr(name,7,50) as location, points,sum(if (p1=r1,10,-10))as total
            from dq.data 
            group by points,location
           ) t
      group by location
     ) tsum
     on t.location = tsum.location and t.total = tsum.maxtotal

Note that if several rows tie for the top total in a location, this version returns all of them.
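
The tie behaviour can be checked on a small scale. The sketch below is not BigQuery: it uses Python's built-in sqlite3 module, and the table `t` is a hypothetical stand-in for the aggregated subquery (location, points, total). The rows are borrowed from the question, except that one GERMANY total is changed to 120.0 on purpose to create a tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the aggregated subquery "t" in the answer.
conn.execute("create table t (location text, points text, total real)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        ("FRANCE", "|0|2|0|0|0|0|1", 110.0),
        ("FRANCE", "|0|2|1|0|1|2|1", 100.0),
        ("GERMANY", "|1|0|2|2|2|1|0", 120.0),
        ("GERMANY", "|1|0|2|2|2|0|0", 120.0),  # total altered to force a tie
    ],
)

# Same join-back shape as the answer: max total per location, joined to t.
join_back = """
select t.location, t.points, t.total
from t join
     (select location, max(total) as maxtotal
      from t
      group by location
     ) tsum
     on t.location = tsum.location and t.total = tsum.maxtotal
order by t.location, t.points
"""
rows = conn.execute(join_back).fetchall()
for row in rows:
    print(row)
```

FRANCE comes back once, while GERMANY comes back twice because both of its rows share the maximum total.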

I'm not familiar with google-bigquery. If it supports the "with" statement, you can simplify the query like this:

with t as (select substr(name,7,50) as location, points,sum(if (p1=r1,10,-10))as total
           from dq.data 
           group by points,location
          )
select t.*
from t join
     (select location, max(total) as maxtotal
      from t
      group by location
     ) tsum
     on t.location = tsum.location and t.total = tsum.maxtotal

If it supports window functions (such as row_number()), you can eliminate the explicit join entirely.
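
A minimal sketch of the window-function variant, assuming the engine supports row_number() with partition by. It is run here against SQLite (3.25 or later) through Python's sqlite3 module rather than BigQuery. The table `t` is again a hypothetical stand-in for the aggregated subquery, filled with a few rows from the question, and `seqnum` is just an illustrative column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the aggregated subquery "t" in the answer.
conn.execute("create table t (location text, points text, total real)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        ("FRANCE", "|0|2|0|0|0|0|1", 110.0),
        ("FRANCE", "|0|2|1|0|1|2|1", 100.0),
        ("GERMANY", "|1|0|2|2|2|1|0", 120.0),
        ("GERMANY", "|1|0|2|2|2|0|0", 110.0),
    ],
)

# Number the rows within each location from highest total down,
# then keep only the first row of each location.
ranked = """
select location, points, total
from (select t.*,
             row_number() over (partition by location
                                order by total desc, points) as seqnum
      from t
     ) ranked
where seqnum = 1
order by location
"""
top_rows = conn.execute(ranked).fetchall()
for row in top_rows:
    print(row)
```

Unlike the join, row_number() keeps exactly one row per location even when totals tie. The extra `points` in the order by is only there to make that choice deterministic; rank() in its place would keep all tied rows.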

Answer 1: (score: 0)

I recently ran into a similar problem and solved it in a similar way:

SELECT substr(name,7,50) as location, points,sum(if (p1=r1,10,-10))as total
FROM ( 
   SELECT * FROM dq.data ORDER BY location,sum(if (p1=r1,10,-10)) desc 
) tmp
GROUP BY points,location;

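I'm not sure it will run as-is, since my database is MySQL, but it is a nice, intuitive solution: order the subquery so that the row you want to keep comes first when the grouping collapses the rows.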