Select or filter columns based on the rank of multiple fields in PostgreSQL

Time: 2017-12-21 19:45:12

Tags: sql postgresql-9.4 rank

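In one of my tables I have several fields, each with its own rank field. All of these fields share a common grouping attribute, and I need to find the best-ranked value of each column, which can sit in any record of the group. For example, consider the following data: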

+---------+---------------+-----------+-----------------+-------------+----------------------+------------+
| Country |     City      | City_Rank |     Artist      | Artist_Rank |        Movie         | Movie_Rank |
+---------+---------------+-----------+-----------------+-------------+----------------------+------------+
| USA     | Las Vegas     |         2 | Louis C.K       |           2 | Justice League       |          3 |
| USA     | New York City |         3 | Michael Flynn   |           3 | IT                   |          1 |
| USA     | Los Angeles   |         1 | Matt Lauer      |           1 | Get Out              |          2 |
| UK      | Leeds         |         2 | Jack Maynard    |           3 | Beauty and the Beast |          2 |
| UK      | Manchester    |         3 | Charlie Gard    |           1 | Wonder Woman         |          1 |
| UK      | London        |         1 | Shannon Mathews |           2 | Logan                |          3 |
+---------+---------------+-----------+-----------------+-------------+----------------------+------------+
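To reproduce the sample data, something like the following setup could be used. The table name Table1 is borrowed from the answers below, and the column types (text and integer) are assumptions, since the question does not state them.

```sql
-- Hypothetical setup: table name and column types are assumed, not given in the question.
CREATE TABLE Table1 (
    Country     text,
    City        text,
    City_Rank   integer,
    Artist      text,
    Artist_Rank integer,
    Movie       text,
    Movie_Rank  integer
);

-- The six sample rows from the table above.
INSERT INTO Table1
    (Country, City, City_Rank, Artist, Artist_Rank, Movie, Movie_Rank)
VALUES
    ('USA', 'Las Vegas',     2, 'Louis C.K',       2, 'Justice League',       3),
    ('USA', 'New York City', 3, 'Michael Flynn',   3, 'IT',                   1),
    ('USA', 'Los Angeles',   1, 'Matt Lauer',      1, 'Get Out',              2),
    ('UK',  'Leeds',         2, 'Jack Maynard',    3, 'Beauty and the Beast', 2),
    ('UK',  'Manchester',    3, 'Charlie Gard',    1, 'Wonder Woman',         1),
    ('UK',  'London',        1, 'Shannon Mathews', 2, 'Logan',                3);
```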

Now I need the rank-1 City, Artist and Movie in a single record, grouped by Country. So the expected output is:

+---------+------------------+--------------------+-------------------+
| Country | Best_Ranked_City | Best_Ranked_Artist | Best_Ranked_Movie |
+---------+------------------+--------------------+-------------------+
| USA     | Los Angeles      | Matt Lauer         | IT                |
| UK      | London           | Charlie Gard       | Wonder Woman      |
+---------+------------------+--------------------+-------------------+

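I have many more attributes with rank fields like these. I can reach the desired output by building one dataset per rank field, each with a filter condition (where rank = 1), and then joining those datasets on the grouping field.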
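For reference, the filter-and-join approach described above might look roughly like this for the three sample attributes. This is only a sketch: the table name Table1 is an assumption (it is the name used in the answers), and the aliases c, a and m are made up for illustration.

```sql
-- One filtered subquery per rank field, joined back together on Country.
-- Table1 and the aliases c, a, m are illustrative assumptions.
SELECT c.Country,
       c.City   AS Best_Ranked_City,
       a.Artist AS Best_Ranked_Artist,
       m.Movie  AS Best_Ranked_Movie
FROM (SELECT Country, City   FROM Table1 WHERE City_Rank   = 1) AS c
JOIN (SELECT Country, Artist FROM Table1 WHERE Artist_Rank = 1) AS a
  ON a.Country = c.Country
JOIN (SELECT Country, Movie  FROM Table1 WHERE Movie_Rank  = 1) AS m
  ON m.Country = c.Country;
```

Each additional ranked attribute adds one more filtered pass over the table and one more join, which is where the cost comes from.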

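However, the table has millions of records, so this is quite expensive, and filtering and joining the dataset several times does not seem like the best way to solve the problem. I arrived at the rank of each field with the Rank() window function, after applying some business logic.

If possible, I would like to solve the rest of the problem with window functions only.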

2 answers:

Answer 0: (score: 1)

  

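"I arrived at the rank of each field with the Rank() window function, after applying some business logic."

I guess there is some query that calculates the ranks, followed by a pivot operation that produces the summary table shown in the question.
It would be best to eliminate that pivot operation, so that the input data produced by the query looks like this: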

| country | category |            cat_value | rank_value |
|---------|----------|----------------------|------------|
|      UK |   Artist |         Jack Maynard |          3 |
|      UK |   Artist |      Shannon Mathews |          2 |
|      UK |   Artist |         Charlie Gard |          1 |
|      UK |     City |                Leeds |          2 |
|      UK |     City |           Manchester |          3 |
|      UK |     City |               London |          1 |
|      UK |    Movie |                Logan |          3 |
|      UK |    Movie | Beauty and the Beast |          2 |
|      UK |    Movie |         Wonder Woman |          1 |
|     USA |   Artist |            Louis C.K |          2 |
|     USA |   Artist |        Michael Flynn |          3 |
|     USA |   Artist |           Matt Lauer |          1 |
|     USA |     City |            Las Vegas |          2 |
|     USA |     City |          Los Angeles |          1 |
|     USA |     City |        New York City |          3 |
|     USA |    Movie |       Justice League |          3 |
|     USA |    Movie |                   IT |          1 |
|     USA |    Movie |              Get Out |          2 |

If that is not possible, the result set can be unpivoted with the following query:

SELECT Country, 'City' as category, City as cat_value, City_Rank as rank_value
FROM Table1
UNION ALL
SELECT Country, 'Artist' as category, Artist as cat_value, Artist_Rank as rank_value
FROM Table1
UNION ALL
SELECT Country, 'Movie' as category, Movie as cat_value, Movie_Rank as rank_value
FROM Table1
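As a possible alternative to the three UNION ALL branches, the same unpivot can be sketched with a LATERAL join over a VALUES list (available since PostgreSQL 9.3), which reads each row of Table1 once instead of three times. The aliases t and v are illustrative, and this assumes City, Artist and Movie share a compatible type.

```sql
-- Unpivot each source row into three (category, value, rank) rows.
-- Aliases t and v are illustrative; assumes the three value columns have compatible types.
SELECT t.Country,
       v.category,
       v.cat_value,
       v.rank_value
FROM Table1 AS t
CROSS JOIN LATERAL (
    VALUES ('City',   t.City,   t.City_Rank),
           ('Artist', t.Artist, t.Artist_Rank),
           ('Movie',  t.Movie,  t.Movie_Rank)
) AS v (category, cat_value, rank_value);
```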

Once the table is unpivoted, selecting the items with rank = 1 is very simple:

SELECT * FROM unpivot_table WHERE rank_value = 1

Another pivot can then be applied to that result.

The final query might look like this (live demo: http://sqlfiddle.com/#!17/05e53/5):

With unpivot_me As (
SELECT Country, 'City' as category, City as cat_value, City_Rank as rank_value
FROM Table1
UNION ALL
SELECT Country, 'Artist' as category, Artist as cat_value, Artist_Rank as rank_value
FROM Table1
UNION ALL
SELECT Country, 'Movie' as category, Movie as cat_value, Movie_Rank as rank_value
FROM Table1
)


SELECT Country,
       Max( case when category = 'City' Then cat_value End) As Best_Ranked_City,
       Max( case when category = 'Artist' Then cat_value End) As Best_Ranked_Artist,
       Max( case when category = 'Movie' Then cat_value End) As Best_Ranked_Movie
FROM unpivot_me
WHERE rank_value = 1 
GROUP BY Country

| country | best_ranked_city | best_ranked_artist | best_ranked_movie |
|---------|------------------|--------------------|-------------------|
|      UK |           London |       Charlie Gard |      Wonder Woman |
|     USA |      Los Angeles |         Matt Lauer |                IT |

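Answer 1: (score: 0)

Use the window function max() with a CASE condition inside it that keeps the value only where the rank is 1, partitioned by Country. This puts the rank-1 value of each desired column on every row of its country. Then filter on one of the rank fields being equal to 1 (any of the available rank fields will do), which leaves one row per country as long as that rank has no ties. Here is the SQL: http://sqlfiddle.com/#!17/05e53/18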

WITH T1 AS (
    SELECT Country,
           City_Rank,
           -- Spread each rank-1 value across all rows of its country.
           MAX(CASE WHEN City_Rank = 1 THEN City ELSE '' END)
               OVER (PARTITION BY Country) AS Best_Ranked_City,
           MAX(CASE WHEN Artist_Rank = 1 THEN Artist ELSE '' END)
               OVER (PARTITION BY Country) AS Best_Ranked_Artist,
           MAX(CASE WHEN Movie_Rank = 1 THEN Movie ELSE '' END)
               OVER (PARTITION BY Country) AS Best_Ranked_Movie
    FROM Table1
)
-- Keep a single row per country by filtering on one of the rank fields.
SELECT Country, Best_Ranked_City, Best_Ranked_Artist, Best_Ranked_Movie
FROM T1
WHERE City_Rank = 1;
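If a plain aggregate is acceptable instead of a window function, the same conditional-max idea can be sketched in a single GROUP BY pass, with no follow-up filter. The version below uses the aggregate FILTER clause, which is available from PostgreSQL 9.4 onward; this is a variation on the answer, not the answerer's own query.

```sql
-- One pass over Table1: each MAX only sees the rows where its rank field is 1.
-- Returns NULL (rather than '') for a country that has no rank-1 row in a column.
SELECT Country,
       MAX(City)   FILTER (WHERE City_Rank   = 1) AS Best_Ranked_City,
       MAX(Artist) FILTER (WHERE Artist_Rank = 1) AS Best_Ranked_Artist,
       MAX(Movie)  FILTER (WHERE Movie_Rank  = 1) AS Best_Ranked_Movie
FROM Table1
GROUP BY Country;
```

Since Rank() can produce ties, note that MAX would return the alphabetically last of several rank-1 values; if ties matter, a deterministic tie-breaker would have to be chosen.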