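Filtering the results returned by a SQL query

Time: 2011-11-28 17:54:23

Tags: sql sql-server sql-server-2008 greatest-n-per-group

I've been struggling with this all afternoon. It seems simple, but I must be missing something!

I have a query that returns some data; two of the columns it returns are "PackageWeight" and "PackageGroup". Essentially, I want to filter this data so that only one row is shown for each "PackageGroup": the row with the highest value in the "PackageWeight" column.

It seems simple, but I can't get it to work in SQL Server using a combination of TOP 1 and GROUP BY.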

SELECT VendorID, PackageID, PackageWeight, PackageGroup
  FROM (SELECT VendorID, COUNT(*) AS qty
          FROM VendorServices
         GROUP BY VendorID
       ) cs
  JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
          FROM PackageServices
          JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
          GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
       ) ps ON cs.qty >= ps.qty
  WHERE (SELECT COUNT(*)
          FROM VendorServices cs2
          JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
         WHERE cs2.VendorID = cs.VendorID
           AND ps2.PackageID = ps.PackageID
       ) = ps.qty

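This query returns the full data set that I need to filter. However, my attempts so far have all failed :(

Any help is much appreciated!

EDIT - Thanks to the contributors below, I now have the following query: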

with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup,
    RANK() over (partition by PackageGroup order by PackageWeight desc) as [rank]
FROM (SELECT VendorID, COUNT(*) AS qty
    FROM VendorServices
    GROUP BY VendorID
    ) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
    FROM PackageServices
    JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
    GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
    ) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
    FROM VendorServices cs2
    JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
    WHERE cs2.VendorID = cs.VendorID
    AND ps2.PackageID = ps.PackageID
    ) = ps.qty
)

select *
from result_cte
WHERE [rank] = 1
ORDER BY VendorID

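So far, so good. I will still look at the APPLY operator suggested by @gbn, since it is new to me, and I still need to do some testing to make sure this query works 100% of the time. However, the initial signs are good!

Thanks to everyone who has contributed so far.

EDIT 2 - Unfortunately, after populating the database with more sample data, this query does not work correctly. It appears to miss some entries.

Perhaps I need to explain a bit more about what is going on here. The data returned by my original query lists every customer in the system, together with a derived PackageID (calculated by that query) and the weight and group assigned to that package in a lookup table.

I need to filter the original result table so that each customer gets no more than one package from each group (each customer may have packages from one or more groups, but may not have a package from every group).

I'll take a fresh look at it tomorrow, as I suspect this may be a case of not being able to see the wood for the trees!

Thanks all.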

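4 answers:

Answer 0: (score: 1)

Can you try this? It is not bulletproof if you have multiple records with the same weight in the same group. There are other ways to handle that.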

with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup
FROM (SELECT VendorID, COUNT(*) AS qty
    FROM VendorServices
    GROUP BY VendorID
    ) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
    FROM PackageServices
    JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
    GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
    ) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
    FROM VendorServices cs2
    JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
    WHERE cs2.VendorID = cs.VendorID
    AND ps2.PackageID = ps.PackageID
    ) = ps.qty
)

select *
from result_cte
where result_cte.PackageWeight = (select top 1 highestweight.PackageWeight from result_cte highestweight
                                where highestweight.PackageGroup = result_cte.PackageGroup
                                order by highestweight.PackageWeight desc)
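
The caveat about equal weights can be reproduced on a tiny data set. The following is only a sketch under stated assumptions: it runs through Python's built-in sqlite3 module rather than SQL Server (so TOP 1 becomes LIMIT 1), and the packages table and its rows are invented for illustration.

```python
import sqlite3

# Hypothetical sample data: packages 1 and 2 tie for the top weight in group 0.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages (PackageID INTEGER, PackageWeight INTEGER, PackageGroup INTEGER)"
)
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?)",
    [(1, 5, 0), (2, 5, 0), (3, 1, 0), (4, 9, 1), (5, 2, 1)],
)

# Keep every row whose weight equals the highest weight found in its group.
# SQLite has no TOP 1, so LIMIT 1 plays that role in the correlated subquery.
rows = conn.execute(
    """
    SELECT p.PackageID, p.PackageWeight, p.PackageGroup
    FROM packages AS p
    WHERE p.PackageWeight = (SELECT h.PackageWeight
                             FROM packages AS h
                             WHERE h.PackageGroup = p.PackageGroup
                             ORDER BY h.PackageWeight DESC
                             LIMIT 1)
    ORDER BY p.PackageGroup, p.PackageID
    """
).fetchall()

print(rows)  # group 0 contributes two rows because of the tie
```

Because the filter compares weights rather than picking a single row, both tied packages in group 0 survive, which is the situation the answer warns about.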

Or you can do this:

with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup,
    ROW_NUMBER() over (partition by PackageGroup order by PackageWeight desc) as [row]
FROM (SELECT VendorID, COUNT(*) AS qty
    FROM VendorServices
    GROUP BY VendorID
    ) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
    FROM PackageServices
    JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
    GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
    ) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
    FROM VendorServices cs2
    JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
    WHERE cs2.VendorID = cs.VendorID
    AND ps2.PackageID = ps.PackageID
    ) = ps.qty
)

select *
from result_cte
where [row] = 1
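
To see how ROW_NUMBER differs from the RANK used in the question's first edit when weights tie, here is a small sketch. It is an illustration under assumptions, not the original poster's data: it uses Python's sqlite3 module (window functions need SQLite 3.25 or later) in place of SQL Server, a made-up packages table, and an extra PackageID tie-breaker in the ROW_NUMBER ordering so the chosen row is deterministic.

```python
import sqlite3

# Hypothetical sample data: packages 1 and 2 tie for the top weight in group 0.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages (PackageID INTEGER, PackageWeight INTEGER, PackageGroup INTEGER)"
)
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?)",
    [(1, 5, 0), (2, 5, 0), (3, 1, 0), (4, 9, 1), (5, 2, 1)],
)

# ROW_NUMBER assigns 1 to exactly one row per partition, even when weights tie.
# The PackageID tie-breaker is an addition that makes the pick repeatable.
with_row_number = conn.execute(
    """
    SELECT PackageID, PackageWeight, PackageGroup
    FROM (SELECT PackageID, PackageWeight, PackageGroup,
                 ROW_NUMBER() OVER (PARTITION BY PackageGroup
                                    ORDER BY PackageWeight DESC, PackageID) AS rn
          FROM packages) AS ranked
    WHERE rn = 1
    ORDER BY PackageGroup, PackageID
    """
).fetchall()

# RANK assigns 1 to every row that shares the top weight, so ties all pass.
with_rank = conn.execute(
    """
    SELECT PackageID, PackageWeight, PackageGroup
    FROM (SELECT PackageID, PackageWeight, PackageGroup,
                 RANK() OVER (PARTITION BY PackageGroup
                              ORDER BY PackageWeight DESC) AS rk
          FROM packages) AS ranked
    WHERE rk = 1
    ORDER BY PackageGroup, PackageID
    """
).fetchall()

print(with_row_number)
print(with_rank)
```

With ROW_NUMBER each group yields one row; with RANK the tied group yields two.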

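Answer 1: (score: 0)

If more than one package in a group has the same maximum weight, are you willing to accept a single arbitrary vendor and PackageID? If so, just add aggregates on them and on PackageWeight: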

SELECT max(VendorID), max(PackageID), max(PackageWeight), PackageGroup
...
GROUP BY PackageGroup

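Otherwise, you need to do what E.Y. suggests and perform a nested query to first find the maximum weight for each group, then handle the duplicates (if any) yourself.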
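
The two options in this answer can be compared side by side. This is a rough sketch with invented data, run through Python's sqlite3 module instead of SQL Server; the results table stands in for the output of the question's query. Note that separate MAX aggregates are computed per column, so the combined row may not exist in the data, whereas joining back to the per-group maximum returns real rows.

```python
import sqlite3

# Hypothetical stand-in for the rows returned by the question's query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (VendorID INTEGER, PackageID INTEGER, "
    "PackageWeight INTEGER, PackageGroup INTEGER)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [(10, 2, 5, 0), (20, 1, 3, 0), (30, 3, 9, 1)],
)

# Option 1: aggregate every column. Each MAX is taken independently.
aggregated = conn.execute(
    """
    SELECT MAX(VendorID), MAX(PackageID), MAX(PackageWeight), PackageGroup
    FROM results
    GROUP BY PackageGroup
    ORDER BY PackageGroup
    """
).fetchall()

# Option 2: find the maximum weight per group first, then join back to
# fetch the actual rows that carry that weight (ties would all be returned).
joined = conn.execute(
    """
    SELECT r.VendorID, r.PackageID, r.PackageWeight, r.PackageGroup
    FROM results AS r
    JOIN (SELECT PackageGroup, MAX(PackageWeight) AS MaxWeight
          FROM results
          GROUP BY PackageGroup) AS m
      ON m.PackageGroup = r.PackageGroup
     AND m.MaxWeight = r.PackageWeight
    ORDER BY r.PackageGroup, r.VendorID
    """
).fetchall()

print(aggregated)
print(joined)
```

In this sample, the aggregated version pairs vendor 20 with package 2 for group 0, a combination that never appears in the table, while the join returns vendor 10 with package 2.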

Answer 2: (score: 0)

You can use the MAX function:

SELECT * FROM #one
lbs groups
5   0
4   0
1   0
9   1
2   1     

SELECT groups,MAX(lbs)
FROM #one
GROUP BY groups

groups  (No column name)
0   5
1   9

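Answer 3: (score: 0)

Thanks to Eric.K.Yung's post. I ended up solving this using his query, but with VendorID (which is actually the CustomerID) added to the "partition by" part of the query. This ensures that packages are returned for all customers.

Thanks to all contributors. The final query is: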

with result_cte as
(
SELECT VendorID, PackageID, PackageWeight, PackageGroup,
    ROW_NUMBER() over (partition by PackageGroup, VendorID order by PackageWeight desc) as [row]
FROM (SELECT VendorID, COUNT(*) AS qty
    FROM VendorServices
    GROUP BY VendorID
    ) cs
JOIN (SELECT PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup, COUNT(*) AS qty
    FROM PackageServices
    JOIN lookupPackages ON PackageServices.PackageID = lookupPackages.PackageID
    GROUP BY PackageServices.PackageID, lookupPackages.PackageWeight, lookupPackages.PackageGroup
    ) ps ON cs.qty >= ps.qty
WHERE (SELECT COUNT(*)
    FROM VendorServices cs2
    JOIN PackageServices ps2 ON cs2.ServiceTypeID = ps2.ServiceID
    WHERE cs2.VendorID = cs.VendorID
    AND ps2.PackageID = ps.PackageID
    ) = ps.qty
)

select *
from result_cte
where [row] = 1
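
The effect of adding VendorID to the partition can be checked on a small example. The sketch below is an assumption-based illustration: it uses Python's sqlite3 module (SQLite 3.25 or later for window functions) rather than SQL Server, and the results table and its rows are invented to stand in for the derived data set.

```python
import sqlite3

# Hypothetical stand-in for the derived rows: two vendors, two package groups.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (VendorID INTEGER, PackageID INTEGER, "
    "PackageWeight INTEGER, PackageGroup INTEGER)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [(1, 10, 5, 0), (1, 11, 3, 0), (1, 21, 4, 1), (2, 11, 3, 0), (2, 20, 7, 1)],
)


def top_rows(partition_columns):
    """Return the heaviest row per partition.

    partition_columns is only ever one of the fixed strings used below,
    so formatting it into the SQL text is safe in this sketch.
    """
    sql = f"""
        SELECT VendorID, PackageID, PackageWeight, PackageGroup
        FROM (SELECT VendorID, PackageID, PackageWeight, PackageGroup,
                     ROW_NUMBER() OVER (PARTITION BY {partition_columns}
                                        ORDER BY PackageWeight DESC) AS rn
              FROM results) AS numbered
        WHERE rn = 1
        ORDER BY VendorID, PackageGroup
    """
    return conn.execute(sql).fetchall()


# One row per group across all vendors: some vendors lose their packages.
by_group = top_rows("PackageGroup")

# One row per group and vendor: every vendor keeps its best package per group.
by_group_and_vendor = top_rows("PackageGroup, VendorID")

print(by_group)
print(by_group_and_vendor)
```

Partitioning by PackageGroup alone keeps only two rows overall, which matches the "missing entries" symptom from the second edit; partitioning by both columns keeps one row per vendor and group.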