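In one of my tables I have multiple fields, each with its own rank field. All of these fields share a common grouping attribute, and I need to find the best-ranked value of each column, which may sit in any record of the group. For example, consider the following data: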
+---------+---------------+-----------+-----------------+-------------+----------------------+------------+
| Country | City | City_Rank | Artist | Artist_Rank | Movie | Movie_Rank |
+---------+---------------+-----------+-----------------+-------------+----------------------+------------+
| USA | Las Vegas | 2 | Louis C.K | 2 | Justice League | 3 |
| USA | New York City | 3 | Michael Flynn | 3 | IT | 1 |
| USA | Los Angeles | 1 | Matt Lauer | 1 | Get Out | 2 |
| UK | Leeds | 2 | Jack Maynard | 3 | Beauty and the Beast | 2 |
| UK | Manchester | 3 | Charlie Gard | 1 | Wonder Woman | 1 |
| UK | London | 1 | Shannon Mathews | 2 | Logan | 3 |
+---------+---------------+-----------+-----------------+-------------+----------------------+------------+
Now I need the rank-1 City, Artist, and Movie in a single record, grouped by Country. So the expected output is:
+---------+------------------+--------------------+-------------------+
| Country | Best_Ranked_City | Best_Ranked_Artist | Best_Ranked_Movie |
+---------+------------------+--------------------+-------------------+
| USA | Los Angeles | Matt Lauer | IT |
| UK | London | Charlie Gard | Wonder Woman |
+---------+------------------+--------------------+-------------------+
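I have many more attributes that come with rank fields. I can reach the desired output by building one dataset per rank field with the filter rank = 1, and then joining those datasets on the group field.
However, with millions of records in the table this is quite expensive, and filtering and joining the dataset multiple times does not seem to be the best way to solve this. I arrived at the rank of each field using the Rank() window function, after applying some business logic on top of it. I would like to take this further using window functions only, if possible.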
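Answer 0 (score: 1)
"I arrived at the rank of each field using the Rank() window function, after applying some business logic on top of it."
I guess there is some query that calculates the ranks, followed by a pivot operation that produces the summary table shown in the question.
It would be best to eliminate that pivot operation, so that the input data produced by this query looks like this: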
| country | category | cat_value | rank_value |
|---------|----------|----------------------|------------|
| UK | Artist | Jack Maynard | 3 |
| UK | Artist | Shannon Mathews | 2 |
| UK | Artist | Charlie Gard | 1 |
| UK | City | Leeds | 2 |
| UK | City | Manchester | 3 |
| UK | City | London | 1 |
| UK | Movie | Logan | 3 |
| UK | Movie | Beauty and the Beast | 2 |
| UK | Movie | Wonder Woman | 1 |
| USA | Artist | Louis C.K | 2 |
| USA | Artist | Michael Flynn | 3 |
| USA | Artist | Matt Lauer | 1 |
| USA | City | Las Vegas | 2 |
| USA | City | Los Angeles | 1 |
| USA | City | New York City | 3 |
| USA | Movie | Justice League | 3 |
| USA | Movie | IT | 1 |
| USA | Movie | Get Out | 2 |
If that is not possible, the result set can be unpivoted like this:
SELECT Country, 'City' as category, City as cat_value, City_Rank as rank_value
FROM Table1
UNION ALL
SELECT Country, 'Artist' as category, Artist as cat_value, Artist_Rank as rank_value
FROM Table1
UNION ALL
SELECT Country, 'Movie' as category, Movie as cat_value, Movie_Rank as rank_value
FROM Table1
Once the table is unpivoted, selecting the items with rank = 1 is very simple:
SELECT * FROM unpivot_table WHERE rank_value = 1
The result can then be pivoted again.
The final query might look like this (live demo: http://sqlfiddle.com/#!17/05e53/5):
With unpivot_me As (
SELECT Country, 'City' as category, City as cat_value, City_Rank as rank_value
FROM Table1
UNION ALL
SELECT Country, 'Artist' as category, Artist as cat_value, Artist_Rank as rank_value
FROM Table1
UNION ALL
SELECT Country, 'Movie' as category, Movie as cat_value, Movie_Rank as rank_value
FROM Table1
)
SELECT Country,
Max( case when category = 'City' Then cat_value End) As Best_Ranked_City,
Max( case when category = 'Artist' Then cat_value End) As Best_Ranked_Artist,
Max( case when category = 'Movie' Then cat_value End) As Best_Ranked_Movie
FROM unpivot_me
WHERE rank_value = 1
GROUP BY Country
| country | best_ranked_city | best_ranked_artist | best_ranked_movie |
|---------|------------------|--------------------|-------------------|
| UK | London | Charlie Gard | Wonder Woman |
| USA | Los Angeles | Matt Lauer | IT |
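To check the unpivot, filter, and pivot query without relying on the linked fiddle, the following is a minimal sketch that loads the question's sample rows into an in-memory SQLite database and runs the same query. SQLite and the helper name `best_ranked` are assumptions made only to keep the example self-contained; the fiddle above runs PostgreSQL, and an `ORDER BY` is added here so the output order is deterministic.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    ("USA", "Las Vegas", 2, "Louis C.K", 2, "Justice League", 3),
    ("USA", "New York City", 3, "Michael Flynn", 3, "IT", 1),
    ("USA", "Los Angeles", 1, "Matt Lauer", 1, "Get Out", 2),
    ("UK", "Leeds", 2, "Jack Maynard", 3, "Beauty and the Beast", 2),
    ("UK", "Manchester", 3, "Charlie Gard", 1, "Wonder Woman", 1),
    ("UK", "London", 1, "Shannon Mathews", 2, "Logan", 3),
]

# Same shape as the answer's query: unpivot, keep rank 1, pivot back.
QUERY = """
WITH unpivot_me AS (
    SELECT Country, 'City' AS category, City AS cat_value, City_Rank AS rank_value
    FROM Table1
    UNION ALL
    SELECT Country, 'Artist' AS category, Artist AS cat_value, Artist_Rank AS rank_value
    FROM Table1
    UNION ALL
    SELECT Country, 'Movie' AS category, Movie AS cat_value, Movie_Rank AS rank_value
    FROM Table1
)
SELECT Country,
       MAX(CASE WHEN category = 'City'   THEN cat_value END) AS Best_Ranked_City,
       MAX(CASE WHEN category = 'Artist' THEN cat_value END) AS Best_Ranked_Artist,
       MAX(CASE WHEN category = 'Movie'  THEN cat_value END) AS Best_Ranked_Movie
FROM unpivot_me
WHERE rank_value = 1
GROUP BY Country
ORDER BY Country
"""


def best_ranked(rows):
    """Hypothetical helper: build Table1 in memory and run QUERY against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Table1 (Country TEXT, City TEXT, City_Rank INTEGER, "
        "Artist TEXT, Artist_Rank INTEGER, Movie TEXT, Movie_Rank INTEGER)"
    )
    conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


unpivot_result = best_ranked(ROWS)
for row in unpivot_result:
    print(row)
```

Note that the unpivot reads Table1 once per ranked attribute, so with many rank fields the number of `UNION ALL` branches grows accordingly.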
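Answer 1 (score: 0)
Use the max() window function with a CASE condition inside it that keeps the value where the rank is 1, partitioned by Country. This puts the rank-1 value of each required column on every row of that country. Then filter on one of the rank fields being 1 (any of the available rank fields can be used for the filter). Here is the SQL: http://sqlfiddle.com/#!17/05e53/18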
With T1 as (
    select Country,
           City_Rank,
           max(case when City_Rank = 1 then City else '' end)
               over (partition by Country) as Best_Ranked_City,
           max(case when Artist_Rank = 1 then Artist else '' end)
               over (partition by Country) as Best_Ranked_Artist,
           max(case when Movie_Rank = 1 then Movie else '' end)
               over (partition by Country) as Best_Ranked_Movie
    from Table1
)
select Country, Best_Ranked_City, Best_Ranked_Artist, Best_Ranked_Movie
from T1
where City_Rank = 1;
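The same CASE expressions can also go inside a plain aggregate with GROUP BY, which reads Table1 once and needs neither the unpivot nor the final filter on a rank field. This is a variant, not what either answer posted, and it uses an aggregate rather than a window function. The sketch below assumes SQLite (via Python's sqlite3) only so that it runs as-is; the `else ''` branch is dropped on purpose, so a country with no rank-1 row for a column yields NULL instead of an empty string.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    ("USA", "Las Vegas", 2, "Louis C.K", 2, "Justice League", 3),
    ("USA", "New York City", 3, "Michael Flynn", 3, "IT", 1),
    ("USA", "Los Angeles", 1, "Matt Lauer", 1, "Get Out", 2),
    ("UK", "Leeds", 2, "Jack Maynard", 3, "Beauty and the Beast", 2),
    ("UK", "Manchester", 3, "Charlie Gard", 1, "Wonder Woman", 1),
    ("UK", "London", 1, "Shannon Mathews", 2, "Logan", 3),
]

# Conditional aggregation: one pass over Table1, one output row per Country.
QUERY = """
SELECT Country,
       MAX(CASE WHEN City_Rank = 1   THEN City   END) AS Best_Ranked_City,
       MAX(CASE WHEN Artist_Rank = 1 THEN Artist END) AS Best_Ranked_Artist,
       MAX(CASE WHEN Movie_Rank = 1  THEN Movie  END) AS Best_Ranked_Movie
FROM Table1
GROUP BY Country
ORDER BY Country
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (Country TEXT, City TEXT, City_Rank INTEGER, "
    "Artist TEXT, Artist_Rank INTEGER, Movie TEXT, Movie_Rank INTEGER)"
)
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?, ?, ?, ?)", ROWS)
grouped_result = conn.execute(QUERY).fetchall()
conn.close()

for row in grouped_result:
    print(row)
```

One reason to consider this form: filtering on `City_Rank = 1` after the window step returns one row per rank-1 city, so a tie produced by Rank() would duplicate the country, and a country with no rank-1 city would disappear. With GROUP BY there is always exactly one row per country; on a tie, MAX keeps the alphabetically last value.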