SQL: Tracking changes of the MAX value over a time range

Time: 2012-09-24 07:38:03

Tags: sql oracle10g group-by

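We have a so-called new product evaluation process, which is fairly complex. Products are evaluated for different regions and applications. After each evaluation step the product gets a new result score. This score (between 0 and 10) indicates how far the product has come in the process. With each step it either stays the same or increases, but it never decreases, and odd numbers mark products that failed an evaluation. The highest score is called the product status.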

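I now want to select all products that have an even status (2,4,8,10) at the startDate (including that status), together with their status at the endDate of the time range.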

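(I also want to select all new products that entered the process during that time range, but I think that can easily be done in a second statement.)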

The problem I have is how to include the initial status in the output as well. This is my SQL statement:

SELECT 
  MyTable.product_id, 
  MyTable.REGION, 
  MyTable.SEGMENT,
  Max(MyTable.result) AS NEW_STATUS
  FROM 
     MyTable INNER JOIN (
     SELECT
      product_id, 
      REGION, 
      SEGMENT, 
      Max(result) AS INITIAL_STATUS
    FROM
      MyTable
    WHERE
      DATE <= to_date(:startDate)
    GROUP BY
      product_id, REGION, SEGMENT
    HAVING
      Max(result) IN(2,4,8,10)
   ) initial_status ON MyTable.product_id = initial_status.product_id    
  WHERE
    MyTable.DATE <= to_date(:endDate)
  GROUP BY
    MyTable.product_id, 
    MyTable.REGION, 
    MyTable.SEGMENT;

How can I include initial_status in the output without affecting the max / group by? (This is Oracle, but I am no expert, so maybe something Oracle-specific can help.)

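Edit:

The data is a 1-to-many relationship: 1 product, many evaluations. Each evaluation has a Region, segment, result and evaluation_date (plus other data that is not relevant here). Some sample data, denormalized here: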

product_id    Region    Segment    Result    date
    1           US         AB         2    20.05.2012
    1           EU         TS         4    13.06.2012
    1           US         AB         4    01.09.2012
  234           US         AB         2    09.09.2012

Expected output for the sample above, with a date range of 26 June 2012 to 21 September 2012:

product_id    Region    Segment    Initial_Status    New_Status
    1            US        AB             2              4
    1            EU        TS             4              4 (this did not change)
  234            US        AB           (null)           2 ( new entry)

I know my current SQL cannot achieve this, in particular showing the new values.

2 answers:

Answer 0: (score: 0)

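It sounds like you need the set operation UNION in a sub-query, together with analytic functions. The benefit of analytic functions is that you only need a single table scan.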

  

"I now want to select all products that have an even status (2,4,8,10) at the startDate"

This would be:

select product_id, region, segment, initial_status, new_status
  from ( select product_id, region, segment, initial_status, date
                -- The maximum status over all time per product_id,
                -- region and segment
              , max(initial_status) over 
                   ( partition by product_id, region, segment ) as new_status
           from my_table
                )
       -- Restrict on where 
 where ( date <= to_date(:start_date, <format model>)
          -- If you only want even you can use mod
         and mod(initial_status, 2) = 0
             )
    or new_status = initial_status
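
To see what the analytic MAX in the query above does, here is a small sketch that runs the same kind of window function over the sample rows from the question. It uses Python's sqlite3 module as a stand-in for Oracle (window functions need SQLite 3.25 or later). The column names result and evaluation_date and the ISO date strings are assumptions made for this illustration, not the real schema.

```python
import sqlite3

# In-memory database holding the four sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (product_id INTEGER, region TEXT, segment TEXT,"
    " result INTEGER, evaluation_date TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, "US", "AB", 2, "2012-05-20"),
        (1, "EU", "TS", 4, "2012-06-13"),
        (1, "US", "AB", 4, "2012-09-01"),
        (234, "US", "AB", 2, "2012-09-09"),
    ],
)

# MAX(...) OVER (PARTITION BY ...) keeps every row and attaches the
# per-group maximum to each of them, unlike GROUP BY which collapses rows.
rows = conn.execute(
    """
    SELECT product_id, region, segment, result,
           MAX(result) OVER (PARTITION BY product_id, region, segment)
               AS new_status
      FROM my_table
     ORDER BY product_id, region, evaluation_date
    """
).fetchall()

for row in rows:
    print(row)
```

Every evaluation row survives, and both rows of product 1 in US/AB carry the same new_status, which is why the outer query can still filter on the individual row's date and value.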

You can then get everything that is new:

select product_id, region, segment, initial_status, new_status
  from ( select product_id, region, segment, initial_status
              , initial_status as new_status, date
                -- Minimum date this product_id, region, segment
                -- combination was entered
              , min(date) over 
                   ( partition by product_id, region, segment ) as min_date
                -- Find the most recent record for this combination
              , rank() over ( partition by product_id, region, segment
                                  order by date desc ) as rnk
           from my_table
                )
       -- By putting this condition in the outer-select
       -- you ensure you only get completely new records
 where min_date >= to_date(:start_date, <format model>)
       -- If you have multiple records that were entered for a single pk
       -- between startdate and enddate you only want the most recent one.
   and rnk = 1

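Finally, you can combine the two with a UNION. If you can guarantee that there is no overlap, use UNION ALL instead: it does not perform a DISTINCT operation, which makes the query faster.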

select query1
 union
select query2
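
The difference between UNION and UNION ALL can be checked with a tiny sketch. It again uses Python's sqlite3 module as a stand-in for Oracle, and the literal rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two branches return the identical row (1, 4); a third returns (234, 2).
branches = "SELECT 1 AS product_id, 4 AS new_status {op} SELECT 1, 4 {op} SELECT 234, 2"

# UNION removes duplicate rows across the branches.
union_rows = sorted(conn.execute(branches.format(op="UNION")).fetchall())

# UNION ALL simply concatenates the branches and keeps duplicates.
union_all_rows = sorted(conn.execute(branches.format(op="UNION ALL")).fetchall())

print(union_rows)
print(union_all_rows)
```

If the two sub-queries can return the same product, UNION ALL would list it twice, so it is only a safe replacement when the branches are known to be disjoint.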

Note that you could also merge these into a single query. It won't look pretty, but it may be more efficient:

select product_id, region, segment, initial_status, new_status
  from ( select product_id, region, segment, initial_status
                -- date has to be selected here, because the outer
                -- where clause filters on it
              , date
              , min(date) over 
                   ( partition by product_id, region, segment ) as min_date
              , rank() over ( partition by product_id, region, segment
                                  order by date desc ) as rnk
              , max(initial_status) over 
                   ( partition by product_id, region, segment ) as new_status
           from my_table
                )
 where ( min_date >= to_date(:start_date, <format model>)
         and rnk = 1
             )
    or ( ( date <= to_date(:start_date, <format model>)
            and mod(initial_status, 2) = 0
                )
        or new_status = initial_status
           )
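
For comparison, the expected output from the question can also be reproduced with conditional aggregation. This is a different technique from the UNION approach above, not the answerer's method, and only a sketch: it runs on Python's sqlite3 module instead of Oracle, and the column names result and evaluation_date, the ISO date strings, and the bind names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (product_id INTEGER, region TEXT, segment TEXT,"
    " result INTEGER, evaluation_date TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, "US", "AB", 2, "2012-05-20"),
        (1, "EU", "TS", 4, "2012-06-13"),
        (1, "US", "AB", 4, "2012-09-01"),
        (234, "US", "AB", 2, "2012-09-09"),
    ],
)

# The CASE expression only feeds rows up to the start date into the first
# MAX, so initial_status is NULL for products that entered later.
# The second MAX sees every row up to the end date.
sql = """
    SELECT product_id, region, segment,
           MAX(CASE WHEN evaluation_date <= :start_date THEN result END)
               AS initial_status,
           MAX(result) AS new_status
      FROM my_table
     WHERE evaluation_date <= :end_date
     GROUP BY product_id, region, segment
    HAVING MAX(CASE WHEN evaluation_date <= :start_date THEN result END) IS NULL
        OR MAX(CASE WHEN evaluation_date <= :start_date THEN result END)
               IN (2, 4, 8, 10)
     ORDER BY product_id, region
"""
report = conn.execute(
    sql, {"start_date": "2012-06-26", "end_date": "2012-09-21"}
).fetchall()

for row in report:
    print(row)
```

The HAVING clause keeps groups whose initial status is one of the listed even values, plus the new entries that have no initial status at all.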

Answer 1: (score: 0)

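Just for documentation, I came up with the following query. I know it covers some requirements that were not part of the original question. Some of them exist to deal with bad data.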

SELECT 
  product_id, 
  REGION, 
  SEGMENT,
  initial_status,
  NEW_STATUS,
  "Comment",
  Count("Comment")  OVER (PARTITION BY 
      "Comment"
    ) "Counter"
from(
SELECT DISTINCT
  myTable.product_id, 
  myTable.REGION, 
  myTable.SEGMENT,
  initial_status.initial_status,
  Max(myTable.result) 
    OVER (PARTITION BY 
      myTable.product_id, 
      myTable.REGION, 
      myTable.SEGMENT
    ) NEW_STATUS,
  CASE WHEN initial_status.initial_status <> Max(myTable.result) 
    OVER (PARTITION BY 
      myTable.product_id, 
      myTable.REGION, 
      myTable.SEGMENT
    ) THEN 'Changed' ELSE 'Same' END as "Comment"  
  FROM 
    myTable INNER JOIN (
     SELECT
      product_id, 
      REGION, 
      SEGMENT, 
      Max(result) AS INITIAL_STATUS
    FROM
      myTable
    WHERE
      DATE <= to_date(:startDate)
      OR DATE is null
    GROUP BY
      product_id, REGION, SEGMENT
    HAVING
      Max(result) IN(2,4,8,10)
   ) initial_status 
    ON 
      myTable.product_id = initial_status.product_id
      AND myTable.REGION = initial_status.REGION
      AND (
        myTable.SEGMENT = initial_status.SEGMENT
        OR (myTable.SEGMENT is null AND initial_status.SEGMENT is null)
      )
  WHERE
    myTable.DATE <= to_date(:endDate)
UNION ALL
SELECT 
  myTable.product_id, 
  myTable.REGION, 
  myTable.SEGMENT,
  null AS initial_status,
  Max(myTable.result) 
    OVER (PARTITION BY 
      myTable.product_id, 
      myTable.REGION, 
      myTable.SEGMENT
    ) NEW_STATUS,
'New' As "Comment"
FROM myTable
WHERE evaluation_date BETWEEN to_date(:startDate) + 1 AND to_date(:endDate)
AND stage <> 'Stage 0')
ORDER BY
    product_id ASC;
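
The extra IS NULL branch in the join condition on SEGMENT above matters because a plain equality never matches two NULLs. A minimal sketch of that effect, using Python's sqlite3 module as a stand-in for Oracle and a made-up two-row table named evals:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE evals (product_id INTEGER, region TEXT, segment TEXT)")
conn.executemany(
    "INSERT INTO evals VALUES (?, ?, ?)",
    [(1, "US", None), (1, "EU", "TS")],  # one row has no segment
)

# Plain equality: NULL = NULL is not true, so the US row finds no partner.
plain_matches = conn.execute(
    """
    SELECT COUNT(*)
      FROM evals a JOIN evals b
        ON a.product_id = b.product_id
       AND a.region = b.region
       AND a.segment = b.segment
    """
).fetchone()[0]

# Same join with the NULL-tolerant condition used in the answer.
null_safe_matches = conn.execute(
    """
    SELECT COUNT(*)
      FROM evals a JOIN evals b
        ON a.product_id = b.product_id
       AND a.region = b.region
       AND (a.segment = b.segment
            OR (a.segment IS NULL AND b.segment IS NULL))
    """
).fetchone()[0]

print(plain_matches, null_safe_matches)
```

Without the extra branch, evaluations with a missing segment would silently drop out of the inner join.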