Calculating the percentage of each category that meets a condition in SQL Server

Time: 2019-02-04 02:40:34

Tags: sql sql-server subquery correlated-subquery

I have a table containing information about flights between cities, like this:

    origin_city  dest_city       time
    Dothan AL    Atlanta GA       171
    Dothan AL    Atlanta GA       171
    Dothan AL    Elsewhere AL       2
    Dothan AL    Elsewhere AL       2
    Dothan AL    Elsewhere AL       2
    Boston MA    New York NY        5
    Boston MA    City MA            1
    New York NY  Boston MA          5
    New York NY  Boston MA          5
    New York NY  Boston MA          5
    New York NY  Poughkipsie NY     2

For each origin city, I want to find the percentage of flights that take less than 3 hours. The result should look like this:

    Dothan AL    60
    Boston MA    50
    New York NY  25

This is the code I thought would work:

     SELECT F.origin_city as origin_city,    
       ((SELECT COUNT(*) FROM Flights as F2
       WHERE F2.actual_time < 3) / (SELECT COUNT(*) FROM Flights as  F3)) * 100
     AS percentage
     FROM Flights as F
     GROUP BY F.origin_city
     ORDER BY percentage;
     GO

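When I run it, I get a list of origin cities and a percentage column as expected, but the percentage is always 0. I am still quite confused by subqueries (as you can see).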

2 Answers:

Answer 0: (score: 1)

I would do this using AVG() with a conditional expression (plain conditional aggregation, no window function needed):

    SELECT F.origin_city AS origin_city,
           -- 100.0 keeps the average in decimal arithmetic
           AVG(CASE WHEN F.actual_time < 3 THEN 100.0 ELSE 0 END) AS percentage
    FROM Flights F
    GROUP BY F.origin_city
    ORDER BY percentage;

This assumes that the time is measured in hours. According to Google Maps, you can walk from Dothan to Atlanta in 68 hours, so 171 looks suspicious.
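
As a quick sanity check, here is a minimal sketch that runs the conditional-average query against the sample rows from the question. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server, and it assumes the duration column is named actual_time as in the queries. The helper name short_flight_percentages is made up for illustration.

```python
import sqlite3

# Sample rows from the question. SQLite is an assumed stand-in for SQL Server,
# and actual_time is assumed to be the name of the duration column.
ROWS = [
    ("Dothan AL", "Atlanta GA", 171),
    ("Dothan AL", "Atlanta GA", 171),
    ("Dothan AL", "Elsewhere AL", 2),
    ("Dothan AL", "Elsewhere AL", 2),
    ("Dothan AL", "Elsewhere AL", 2),
    ("Boston MA", "New York NY", 5),
    ("Boston MA", "City MA", 1),
    ("New York NY", "Boston MA", 5),
    ("New York NY", "Boston MA", 5),
    ("New York NY", "Boston MA", 5),
    ("New York NY", "Poughkipsie NY", 2),
]


def short_flight_percentages(rows, limit=3):
    """Return (origin_city, percentage of flights shorter than `limit`)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Flights (origin_city TEXT, dest_city TEXT, actual_time INTEGER)"
    )
    conn.executemany("INSERT INTO Flights VALUES (?, ?, ?)", rows)
    # Each flight contributes 100.0 or 0, so the per-city average is the percentage.
    query = """
        SELECT F.origin_city,
               AVG(CASE WHEN F.actual_time < ? THEN 100.0 ELSE 0 END) AS percentage
        FROM Flights AS F
        GROUP BY F.origin_city
        ORDER BY percentage
    """
    result = conn.execute(query, (limit,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for city, pct in short_flight_percentages(ROWS):
        print(city, pct)
```

On the sample data this gives 25, 50 and 60 percent for New York NY, Boston MA and Dothan AL, which matches the result the question asks for.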

Answer 1: (score: 0)

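Your percentage is computed over the whole table rather than per origin city. In addition, dividing one integer count by another is integer division in SQL Server, which truncates the result to 0, so multiply by 100.0 before dividing. Try something like this:

    SELECT F.origin_city AS origin_city,
           -- multiply by 100.0 first so the division is not integer division
           SUM(CASE WHEN F.actual_time < 3 THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS percentage
    FROM Flights AS F
    GROUP BY F.origin_city
    ORDER BY percentage;
    GO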
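
The constant 0 reported in the question is what integer division produces: a count divided by a larger count truncates to 0 before the multiplication by 100 happens. The following is only a small sketch of that effect. It assumes SQLite (through Python's sqlite3) as a stand-in, since it truncates integer-by-integer division the same way SQL Server does. The numbers 3 and 5 are the Dothan AL counts from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 3 short flights out of 5 in total, as for Dothan AL in the question.
# First column: integer division happens before multiplying by 100.
# Second column: multiplying by 100.0 first forces decimal arithmetic.
int_pct, dec_pct = conn.execute(
    "SELECT (3 / 5) * 100, 3 * 100.0 / 5"
).fetchone()
conn.close()

print(int_pct, dec_pct)
```

The first value comes out as 0 and the second as 60.0, so the order of operations and the decimal literal are what matter here.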

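FWIW, the problem with your current subqueries is that nothing correlates them with the current row of the outer query, so each one counts the whole table. You could rewrite the query as:

    SELECT F.origin_city AS origin_city,
           -- each subquery is restricted to the outer row's origin city;
           -- 100.0 avoids integer division
           (SELECT COUNT(*) FROM Flights AS F2
             WHERE F2.origin_city = F.origin_city AND F2.actual_time < 3) * 100.0
           / (SELECT COUNT(*) FROM Flights AS F3
               WHERE F3.origin_city = F.origin_city) AS percentage
    FROM Flights AS F
    GROUP BY F.origin_city
    ORDER BY percentage;
    GO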

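However, there is no need to re-query the table for every row when the grouped query above already has all the data needed for the calculation.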
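
To illustrate that the two formulations agree, here is a rough sketch that runs a correlated-subquery version and a single-pass grouped version side by side on the question's sample rows. It assumes SQLite (through Python's sqlite3) in place of SQL Server, a column named actual_time, and decimal arithmetic via 100.0 in both queries. The names CORRELATED, GROUPED and run are invented for this example.

```python
import sqlite3

# Sample rows from the question. SQLite is an assumed stand-in for SQL Server.
ROWS = [
    ("Dothan AL", "Atlanta GA", 171),
    ("Dothan AL", "Atlanta GA", 171),
    ("Dothan AL", "Elsewhere AL", 2),
    ("Dothan AL", "Elsewhere AL", 2),
    ("Dothan AL", "Elsewhere AL", 2),
    ("Boston MA", "New York NY", 5),
    ("Boston MA", "City MA", 1),
    ("New York NY", "Boston MA", 5),
    ("New York NY", "Boston MA", 5),
    ("New York NY", "Boston MA", 5),
    ("New York NY", "Poughkipsie NY", 2),
]

# Re-queries Flights twice per output row, filtered to the outer row's city.
CORRELATED = """
    SELECT F.origin_city,
           (SELECT COUNT(*) FROM Flights AS F2
             WHERE F2.origin_city = F.origin_city AND F2.actual_time < 3) * 100.0
           / (SELECT COUNT(*) FROM Flights AS F3
               WHERE F3.origin_city = F.origin_city) AS percentage
    FROM Flights AS F
    GROUP BY F.origin_city
    ORDER BY percentage
"""

# Computes both counts from the rows already collected in each group.
GROUPED = """
    SELECT F.origin_city,
           SUM(CASE WHEN F.actual_time < 3 THEN 1 ELSE 0 END) * 100.0 / COUNT(*)
               AS percentage
    FROM Flights AS F
    GROUP BY F.origin_city
    ORDER BY percentage
"""


def run(query):
    """Load the sample rows into an in-memory table and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Flights (origin_city TEXT, dest_city TEXT, actual_time INTEGER)"
    )
    conn.executemany("INSERT INTO Flights VALUES (?, ?, ?)", ROWS)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(run(CORRELATED))
    print(run(GROUPED))
```

Both queries return the same rows here; the grouped form simply gets there without the extra subqueries.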