SQL Server: a grouping problem that's annoying me

Time: 2010-06-14 22:22:05

Tags: sql sql-server database-partitioning

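I've been working with SQL Server for the better part of a decade, and this grouping (or partitioning, or ranking... I'm not sure what the answer is!) problem has me stumped. It feels like it should be an easy one. I'll generalize my problem:

Let's say I have 3 employees (don't worry about them quitting or anything... there are always 3), and I keep track of how I distribute their salaries on a monthly basis.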

Month   Employee  PercentOfTotal
--------------------------------
1       Alice     25%
1       Barbara   65%
1       Claire    10%

2       Alice     25%
2       Barbara   50%
2       Claire    25%

3       Alice     25%
3       Barbara   65%
3       Claire    10%

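As you can see, I paid them the same percentages in months 1 and 3, but in month 2 I gave Alice the same 25% while Barbara got 50% and Claire got 25%.

What I want to know is all the distinct distributions I've ever given. In this case there would be two: one for months 1 and 3, and one for month 2.

I'd expect the results to look something like this (NOTE: the ID, or sequencer, or whatever, doesn't matter):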

ID      Employee  PercentOfTotal
--------------------------------
X       Alice     25%
X       Barbara   65%
X       Claire    10%

Y       Alice     25%
Y       Barbara   50%
Y       Claire    25%

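Seems easy, right? I'm stumped! Does anyone have an elegant solution? I put together this solution just now while writing the question, and it seems to work, but I'm wondering if there's a better way. Or maybe a different way, from which I'll learn something.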

WITH temp_ids (Month)
AS
(
  SELECT DISTINCT MIN(Month)
    FROM employees_paid
  GROUP BY PercentOfTotal
)
SELECT EMP.Month, EMP.Employee, EMP.PercentOfTotal
  FROM employees_paid EMP
         JOIN temp_ids IDS ON EMP.Month = IDS.Month
GROUP BY EMP.Month, EMP.Employee, EMP.PercentOfTotal

Thanks, everyone! -Ricky

5 answers:

Answer 0 (score: 4)

This gives you an answer in a slightly different format from the one you asked for:

SELECT DISTINCT
    T1.PercentOfTotal AS Alice,
    T2.PercentOfTotal AS Barbara,
    T3.PercentOfTotal AS Claire
FROM employees_paid T1
JOIN employees_paid T2
  ON T1.Month = T2.Month AND T1.Employee = 'Alice' AND T2.Employee = 'Barbara'
JOIN employees_paid T3
  ON T2.Month = T3.Month AND T3.Employee = 'Claire'

Result:

Alice   Barbara  Claire
25%     50%      25%
25%     65%      10%

If you want, you can use UNPIVOT to turn this result set into the form you asked for.

SELECT rn AS ID, Employee, PercentOfTotal
FROM (
    SELECT *, ROW_NUMBER() OVER (ORDER BY Alice, Barbara, Claire) AS rn
    FROM (
        SELECT DISTINCT
            T1.PercentOfTotal AS Alice,
            T2.PercentOfTotal AS Barbara,
            T3.PercentOfTotal AS Claire
        FROM employees_paid T1
        JOIN employees_paid T2 ON T1.Month = T2.Month AND T1.Employee = 'Alice'
                                                      AND T2.Employee = 'Barbara'
        JOIN employees_paid T3 ON T2.Month = T3.Month AND T3.Employee = 'Claire'
    ) T1
) p UNPIVOT (PercentOfTotal FOR Employee IN (Alice, Barbara, Claire)) AS unpvt

Result:

ID  Employee  PercentOfTotal  
1   Alice     25%
1   Barbara   50%      
1   Claire    25%             
2   Alice     25%             
2   Barbara   65%              
2   Claire    10%               

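Answer 1 (score: 3)

What you want is to treat each month's distribution as a signature, or pattern of values, and then look for that signature in other months. What isn't clear is whether the employee who received each value matters as much as the breakdown of percentages. For example, would Alice = 65%, Barbara = 25%, Claire = 10% count as the same as month 3 in your example? In my example, I presumed it would not be the same. Similar to Martin Smith's solution, I build the signature by multiplying each percentage by a power of 10 chosen by the employee's rank within the month, then summing the results. This assumes all percentage values are less than 1. If someone could have a percentage of 110%, for example, that would cause problems for this solution.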

With Employees As
    (
    Select 1 As Month, 'Alice' As Employee, .25 As PercentOfTotal
    Union All Select 1, 'Barbara', .65
    Union All Select 1, 'Claire', .10
    Union All Select 2, 'Alice', .25
    Union All Select 2, 'Barbara', .50
    Union All Select 2, 'Claire', .25
    Union All Select 3, 'Alice', .25
    Union All Select 3, 'Barbara', .65
    Union All Select 3, 'Claire', .10
    )
    , EmployeeRanks As
    (
    Select Month, Employee, PercentOfTotal
        , Row_Number() Over ( Partition By Month Order By Employee, PercentOfTotal ) As ItemRank
    From Employees
    )
    , Signatures As
    (
    Select Month
        , Sum( PercentOfTotal * Cast( Power( 10, ItemRank ) As bigint) ) As SignatureValue
    From EmployeeRanks
    Group By Month
    )
    , DistinctSignatures As
    (
    Select Min(Month) As MinMonth, SignatureValue
    From Signatures
    Group By SignatureValue
    )
Select E.Month, E.Employee, E.PercentOfTotal
From Employees As E
    Join DistinctSignatures As D
        On D.MinMonth = E.Month
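
To make the signature idea concrete, here is a minimal sketch that lists the signature computed for each month of the sample data. It reuses the ranking and weighting from the query above. The VALUES table constructor (which assumes SQL Server 2008 or later) and the alias SampleData are my own shorthand, not part of the original answer.

```sql
-- Sketch: show the per-month signature used to compare distributions.
-- Assumption: SQL Server 2008+ (VALUES table constructor). SampleData is a hypothetical alias.
WITH Employees (Month, Employee, PercentOfTotal) AS
(
    SELECT Month, Employee, PercentOfTotal
    FROM (VALUES
        (1, 'Alice', 0.25), (1, 'Barbara', 0.65), (1, 'Claire', 0.10),
        (2, 'Alice', 0.25), (2, 'Barbara', 0.50), (2, 'Claire', 0.25),
        (3, 'Alice', 0.25), (3, 'Barbara', 0.65), (3, 'Claire', 0.10)
    ) AS SampleData (Month, Employee, PercentOfTotal)
),
EmployeeRanks AS
(
    -- Rank employees alphabetically within each month: Alice = 1, Barbara = 2, Claire = 3.
    SELECT Month, Employee, PercentOfTotal,
           ROW_NUMBER() OVER (PARTITION BY Month ORDER BY Employee) AS ItemRank
    FROM Employees
)
-- Weight each percentage by 10^rank and add the weighted values up per month.
SELECT Month,
       SUM(PercentOfTotal * CAST(POWER(10, ItemRank) AS bigint)) AS SignatureValue
FROM EmployeeRanks
GROUP BY Month
ORDER BY Month;
```

On the sample data, months 1 and 3 share one signature (0.25*10 + 0.65*100 + 0.10*1000 = 167.5) and month 2 gets another (0.25*10 + 0.50*100 + 0.25*1000 = 302.5), which is why grouping by SignatureValue collapses months 1 and 3. A weighted sum like this is not unique in general: (0.20, 0.50, 0.30) and (0.30, 0.39, 0.31) both come to 352, so it is best read as a sketch that fits data like the sample.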

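Answer 2 (score: 2)

If I understand correctly, then for a general solution I think you need to concatenate each whole group into a single string, for example Alice:0.25, Barbara:0.50, Claire:0.25, and then select the distinct groups. Something like the following would do it (rather clunky):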

WITH EmpSalaries
AS
(

SELECT 1 AS Month, 'Alice' AS Employee, 0.25 AS PercentOfTotal UNION ALL
SELECT 1 AS Month, 'Barbara' AS Employee, 0.65 UNION ALL
SELECT 1 AS Month, 'Claire' AS Employee, 0.10 UNION ALL

SELECT 2 AS Month, 'Alice' AS Employee, 0.25 UNION ALL
SELECT 2 AS Month, 'Barbara' AS Employee, 0.50 UNION ALL
SELECT 2 AS Month, 'Claire' AS Employee, 0.25 UNION ALL

SELECT 3 AS Month,  'Alice' AS Employee, 0.25 UNION ALL
SELECT 3 AS Month,  'Barbara' AS Employee, 0.65 UNION ALL
SELECT 3 AS Month,  'Claire' AS Employee, 0.10 
),
Months AS 
(
SELECT DISTINCT Month FROM EmpSalaries
),
MonthlySummary AS
(
SELECT Month,
Stuff(
            (
            Select ', ' + S1.Employee + ':' + cast(PercentOfTotal as varchar(20))
            From EmpSalaries As S1
            Where S1.Month = Months.Month
            Order By S1.Employee
            For Xml Path('')
            ), 1, 2, '') As Summary
FROM Months
)
SELECT * FROM EmpSalaries
WHERE Month IN (SELECT MIN(Month)
                FROM MonthlySummary
                GROUP BY Summary)
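
On newer versions the same "concatenate the group, then compare" idea can probably be written more compactly. The sketch below assumes SQL Server 2017 or later, where STRING_AGG is available; the original answer predates that function, so this is an alternative spelling and not the answerer's method. The alias SampleData is hypothetical.

```sql
-- Sketch: build one summary string per month with STRING_AGG instead of FOR XML PATH.
-- Assumption: SQL Server 2017+ (STRING_AGG ... WITHIN GROUP).
WITH EmpSalaries (Month, Employee, PercentOfTotal) AS
(
    SELECT Month, Employee, PercentOfTotal
    FROM (VALUES
        (1, 'Alice', 0.25), (1, 'Barbara', 0.65), (1, 'Claire', 0.10),
        (2, 'Alice', 0.25), (2, 'Barbara', 0.50), (2, 'Claire', 0.25),
        (3, 'Alice', 0.25), (3, 'Barbara', 0.65), (3, 'Claire', 0.10)
    ) AS SampleData (Month, Employee, PercentOfTotal)
),
MonthlySummary AS
(
    -- One row per month, e.g. 'Alice:0.25, Barbara:0.65, Claire:0.10'.
    -- Ordering by Employee keeps the string stable regardless of row order.
    SELECT Month,
           STRING_AGG(Employee + ':' + CAST(PercentOfTotal AS varchar(20)), ', ')
               WITHIN GROUP (ORDER BY Employee) AS Summary
    FROM EmpSalaries
    GROUP BY Month
)
-- Keep only the earliest month for each distinct summary string.
SELECT E.Month, E.Employee, E.PercentOfTotal
FROM EmpSalaries AS E
WHERE E.Month IN (SELECT MIN(Month) FROM MonthlySummary GROUP BY Summary)
ORDER BY E.Month, E.Employee;
```

On the sample data this should return the rows for months 1 and 2, since month 3 produces the same summary string as month 1.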

Answer 3 (score: 2)

I assume performance won't be great (because of the subquery).

SELECT * FROM employees_paid where Month not in (
     SELECT
          a.Month
     FROM
          employees_paid a
          INNER JOIN employees_paid b ON 
               (a.employee = b.employee AND 
               a.PercentOfTotal = b.PercentOfTotal AND 
               a.Month > b.Month)
     GROUP BY
          a.Month,
          b.Month
     HAVING
          Count(*) = (SELECT COUNT(*) FROM employees_paid c 
               where c.Month = a.Month)
     )
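
  1. The inner SELECT does a self-join to identify matching employee and percentage combinations (other than those within the same month). The > in the JOIN ensures that only one set of matches is taken, i.e. if the Month1 entries equal the Month3 entries, we get only the Month3-Month1 combination and not Month1-Month3, Month3-Month1 and Month3-Month3.
  2. We then group by each month-month combination and COUNT the matched entries.
  3. The HAVING then excludes month pairs that don't have as many matches as the later month has entries.
  4. The outer SELECT gets all entries except those for the months returned by the inner query (the months with a full set of matches against an earlier month).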
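
The steps above can be checked by running the inner query on its own. The following is a minimal sketch against the question's three sample months; the column aliases LaterMonth, EarlierMonth and Matches and the alias SampleData are mine, added only to make the output readable.

```sql
-- Sketch: the inner query in isolation, to see which later months fully match an earlier one.
-- Assumption: SQL Server 2008+ (VALUES table constructor). Aliases are hypothetical.
WITH employees_paid (Month, Employee, PercentOfTotal) AS
(
    SELECT Month, Employee, PercentOfTotal
    FROM (VALUES
        (1, 'Alice', 0.25), (1, 'Barbara', 0.65), (1, 'Claire', 0.10),
        (2, 'Alice', 0.25), (2, 'Barbara', 0.50), (2, 'Claire', 0.25),
        (3, 'Alice', 0.25), (3, 'Barbara', 0.65), (3, 'Claire', 0.10)
    ) AS SampleData (Month, Employee, PercentOfTotal)
)
SELECT a.Month AS LaterMonth, b.Month AS EarlierMonth, COUNT(*) AS Matches
FROM employees_paid AS a
INNER JOIN employees_paid AS b
    ON a.Employee = b.Employee
   AND a.PercentOfTotal = b.PercentOfTotal
   AND a.Month > b.Month          -- keep only one direction of each month pair
GROUP BY a.Month, b.Month
-- Keep a pair only when every entry of the later month found a match.
HAVING COUNT(*) = (SELECT COUNT(*) FROM employees_paid AS c WHERE c.Month = a.Month);
```

On the sample data the only surviving pair is month 3 against month 1, with 3 matches. The pairs 2-1 and 3-2 match on Alice's 25% alone and are filtered out by the HAVING, so the outer NOT IN removes only month 3.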

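Answer 4 (score: 2)

"I put together this solution just now while writing the question, and it seems to work"

I don't think it works. Here I've added two more groups (months 4 and 5 respectively), which I would consider to be distinct, yet the result is the same, i.e. months 1 and 2 only: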

WITH employees_paid (Month, Employee, PercentOfTotal)
AS 
(
 SELECT 1, 'Alice', 0.25
 UNION ALL
 SELECT 1, 'Barbara', 0.65
 UNION ALL
 SELECT 1, 'Claire', 0.1
 UNION ALL
 SELECT 2, 'Alice', 0.25
 UNION ALL
 SELECT 2, 'Barbara', 0.5
 UNION ALL
 SELECT 2, 'Claire', 0.25
 UNION ALL
 SELECT 3, 'Alice', 0.25
 UNION ALL
 SELECT 3, 'Barbara', 0.65
 UNION ALL
 SELECT 3, 'Claire', 0.1
 UNION ALL
 SELECT 4, 'Barbara', 0.25
 UNION ALL
 SELECT 4, 'Claire', 0.65
 UNION ALL
 SELECT 4, 'Alice', 0.1
 UNION ALL
 SELECT 5, 'Diana', 0.25
 UNION ALL
 SELECT 5, 'Emma', 0.65
 UNION ALL
 SELECT 5, 'Fiona', 0.1
), 
temp_ids (Month)
AS
(
 SELECT DISTINCT MIN(Month)
   FROM employees_paid
  GROUP 
     BY PercentOfTotal
)
SELECT EMP.Month, EMP.Employee, EMP.PercentOfTotal
  FROM employees_paid AS EMP
       INNER JOIN temp_ids AS IDS 
          ON EMP.Month = IDS.Month
 GROUP 
    BY EMP.Month, EMP.Employee, EMP.PercentOfTotal;
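
To see why months 4 and 5 drop out, it may help to look at what the temp_ids CTE computes before DISTINCT is applied. The sketch below is my own illustration using the same fifteen rows; the alias SampleData and the column alias FirstMonth are hypothetical.

```sql
-- Sketch: the grouping behind temp_ids, one row per individual percentage value.
-- Assumption: SQL Server 2008+ (VALUES table constructor).
WITH employees_paid (Month, Employee, PercentOfTotal) AS
(
    SELECT Month, Employee, PercentOfTotal
    FROM (VALUES
        (1, 'Alice',   0.25), (1, 'Barbara', 0.65), (1, 'Claire', 0.10),
        (2, 'Alice',   0.25), (2, 'Barbara', 0.50), (2, 'Claire', 0.25),
        (3, 'Alice',   0.25), (3, 'Barbara', 0.65), (3, 'Claire', 0.10),
        (4, 'Barbara', 0.25), (4, 'Claire',  0.65), (4, 'Alice',  0.10),
        (5, 'Diana',   0.25), (5, 'Emma',    0.65), (5, 'Fiona',  0.10)
    ) AS SampleData (Month, Employee, PercentOfTotal)
)
-- The question's query groups by the percentage alone, ignoring who received it.
SELECT PercentOfTotal, MIN(Month) AS FirstMonth
FROM employees_paid
GROUP BY PercentOfTotal
ORDER BY PercentOfTotal;
```

Each of 0.10, 0.25 and 0.65 first appears in month 1, and 0.50 first appears in month 2, so the only months that ever reach the join are 1 and 2. Months 4 and 5 reuse the same individual values with different employees, so they can never surface, even though their distributions differ.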