Find a unique record using min and max in the same query with multiple conditions

Time: 2018-08-26 17:46:41

Tags: sql sql-server tsql

I have a common table expression with the following fields:

product.identifier, ingredient.identifier, ingredient.cost,
ingredient.isActive, ingredient.isPrimary

I am trying to find a single record among multiple records based on the following criteria:

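  1. If isActive = 1 and isPrimary = 1, choose that record.
  2. If the record with isPrimary = 1 has isActive = 0, choose the record with the highest (max) cost among those where isPrimary = 0 and isActive = 1.
  3. If all records from step 2 have the same cost, choose the oldest (min) record based on ingredient.Identifier.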

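Finding each of these on its own is straightforward, but combining the logic into a single statement does not work as expected. Here is the expected output I am trying to match, followed by my incorrect SQL: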

product ingredient cost  isActive isPrimary   isChosenRecord

-- isActive and isPrimary example                            
1       10         1.00  1        1           yes
1       11         1.10  1        0           no
2       20         2.00  1        1           yes
2       22         2.15  1        0           no

-- primary record is inactive, choose max cost record
3       30         3.00  0        1           no
3       31         3.10  1        0           no
3       32         3.20  1        0           yes
4       40         4.00  0        1           no
4       41         4.10  1        0           no
4       42         4.20  1        0           yes

-- primary record is inactive, all records have same cost, choose oldest record
5       50         5.00  0        1           no
5       51         5.00  1        0           yes
5       52         5.00  1        0           no
6       60         6.00  0        1           no
6       61         6.00  1        0           yes
6       62         6.00  1        0           no

; with [ActiveRecordsCTE] as
(
    select
        ProductIdentifier = p.Identifier,
        IngredientIdentifier = i.Identifier,
        i.Cost, i.isActive, i.isPrimary
    from Product p
    inner join Ingredient i on i.Identifier = p.Identifier
    where i.isActive = 1

),

[CalculatedPrimaryRecords] AS 
(
    SELECT
        r.ProductIdentifier,
        r.IngredientIdentifier
    FROM ActiveRecordsCTE r
    WHERE r.IsPrimary = 1

    UNION

    -- get the oldest records
    SELECT
        r.ProductIdentifier,
        IngredientIdentifier = min(r.IngredientIdentifier)
    FROM
    (
        -- get most expensive record by cost
        SELECT
            r.ProductIdentifier,
            r.IngredientIdentifier
        FROM ActiveRecordsCTE a
        CROSS APPLY
        (
            -- get most expensive record per product
            SELECT
                r.ProductIdentifier
                ,MaxAssetValue = MAX(r.Cost)
            FROM ActiveRecordsCTE b
            WHERE b.IsPrimary = a.IsPrimary
                AND a.ProductIdentifier = b.ProductIdentifier
                AND a.IngredientIdentifier = b.IngredientIdentifier
            GROUP BY b.ProductIdentifier
        ) ca
        WHERE a.IsPrimary = 0
            -- exclude records that are included in the statement above
            AND a.ProductIdentifier NOT IN
            (
                SELECT ProductIdentifier
                FROM ActiveRecordsCTE
                WHERE IsPrimary = 1
            )
    ) sub
    GROUP BY sub.ProductIdentifier
)

select * from [CalculatedPrimaryRecords]

1 answer:

Answer 0 (score: 1)

Use row_number() for this kind of prioritization:

with cte as ( . . . )
select t.*
from (select cte.*,
             row_number() over (partition by product
                                order by (case when isActive = 1 and isPrimary = 1 then 1
                                               when isActive = 0 and isPrimary = 1 then 2
                                               else 3
                                          end),
                                         cost desc, 
                                         identifier asc
                               ) as seqnum
      from cte
     ) t
where seqnum = 1;

This makes some assumptions that seem consistent with the question:

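  • isActive and isPrimary only take the values 0 and 1.
  • If no record has isPrimary = 1, you still want a record. (If not, those products can easily be filtered out.)
  • identifier is not defined in your sample data.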
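
One caveat worth checking: as written, the case expression ranks an inactive primary row (2) ahead of active non-primary rows (3), so the query matches the question's expected output only when the CTE has already dropped inactive rows, as the asker's ActiveRecordsCTE does with where i.isActive = 1. The sketch below is a variant, not the query above, that assumes inactive rows are still present and therefore ranks active non-primary rows second. The table variable @Sample and its column names are hypothetical; the rows are copied from the question's expected-output table.

```sql
-- Hypothetical table variable holding the question's sample rows.
declare @Sample table (
    product    int,
    ingredient int,
    cost       decimal(9, 2),
    isActive   bit,
    isPrimary  bit
);

insert into @Sample (product, ingredient, cost, isActive, isPrimary)
values (1, 10, 1.00, 1, 1), (1, 11, 1.10, 1, 0),
       (2, 20, 2.00, 1, 1), (2, 22, 2.15, 1, 0),
       (3, 30, 3.00, 0, 1), (3, 31, 3.10, 1, 0), (3, 32, 3.20, 1, 0),
       (4, 40, 4.00, 0, 1), (4, 41, 4.10, 1, 0), (4, 42, 4.20, 1, 0),
       (5, 50, 5.00, 0, 1), (5, 51, 5.00, 1, 0), (5, 52, 5.00, 1, 0),
       (6, 60, 6.00, 0, 1), (6, 61, 6.00, 1, 0), (6, 62, 6.00, 1, 0);

select t.product, t.ingredient, t.cost
from (select s.*,
             row_number() over (partition by s.product
                                -- 1: active primary; 2: any other active row;
                                -- 3: inactive rows (assumed never preferred)
                                order by (case when s.isActive = 1 and s.isPrimary = 1 then 1
                                               when s.isActive = 1 then 2
                                               else 3
                                          end),
                                         s.cost desc,        -- highest cost first
                                         s.ingredient asc    -- then lowest identifier
                               ) as seqnum
      from @Sample s
     ) t
where t.seqnum = 1
order by t.product;
-- Chosen ingredients per product: 10, 20, 32, 42, 51, 61
```

If the CTE already filters on isActive = 1, the second and third branches collapse and the original expression gives the same picks.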

EDIT:

If you want to be fancy, you can use top (1) with ties:

select top (1) with ties cte.*
from cte
order by row_number() over (partition by product
                            order by (case when isActive = 1 and isPrimary = 1 then 1
                                           when isActive = 0 and isPrimary = 1 then 2
                                           else 3
                                      end),
                                     cost desc, 
                                     identifier asc
                          );
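
This returns one row per product because the outer order by sorts on the row number itself: every row numbered 1 ties for first place, and with ties keeps all of them. A minimal sketch on a hypothetical table variable @T (rows taken from products 1 and 2 in the question, ordered by cost only for brevity):

```sql
-- Hypothetical table variable; only the columns needed for the demo.
declare @T table (product int, ingredient int, cost decimal(9, 2));

insert into @T (product, ingredient, cost)
values (1, 10, 1.00), (1, 11, 1.10),
       (2, 20, 2.00), (2, 22, 2.15);

-- Within each product the most expensive row gets row number 1.
-- All rows numbered 1 tie on the outer ORDER BY, so WITH TIES keeps them all.
select top (1) with ties t.*
from @T t
order by row_number() over (partition by t.product
                            order by t.cost desc, t.ingredient asc);
-- Returns ingredient 11 for product 1 and ingredient 22 for product 2
```
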

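I actually prefer the row_number() solution because I am not sure what to do in the case where no record has isPrimary = 1, and it is easier to add logic to that solution to filter out those records.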
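
As a sketch of that extra filtering, assuming the intent is to skip products that have no isPrimary = 1 row at all, a windowed max() can flag such products next to the row number. The table variable @Sample, the hasPrimary column, and product 7 are hypothetical; product 1's rows come from the question.

```sql
-- Hypothetical table variable: product 1 from the question plus a
-- made-up product 7 that has no primary row.
declare @Sample table (
    product    int,
    ingredient int,
    cost       decimal(9, 2),
    isActive   bit,
    isPrimary  bit
);

insert into @Sample (product, ingredient, cost, isActive, isPrimary)
values (1, 10, 1.00, 1, 1), (1, 11, 1.10, 1, 0),
       (7, 70, 2.00, 1, 0), (7, 71, 2.50, 1, 0);

select t.product, t.ingredient, t.cost
from (select s.*,
             row_number() over (partition by s.product
                                order by (case when s.isActive = 1 and s.isPrimary = 1 then 1
                                               when s.isActive = 0 and s.isPrimary = 1 then 2
                                               else 3
                                          end),
                                         s.cost desc,
                                         s.ingredient asc
                               ) as seqnum,
             -- 1 if the product has at least one primary row, else 0
             max(case when s.isPrimary = 1 then 1 else 0 end)
                 over (partition by s.product) as hasPrimary
      from @Sample s
     ) t
where t.seqnum = 1
  and t.hasPrimary = 1;   -- drop products without any primary row
-- Returns only product 1 (ingredient 10); product 7 is filtered out
```
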