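We have a so-called New Product Evaluation process, which is fairly complex. Products are evaluated for different regions and applications. After each evaluation step the product receives a new result score. This score (between 0 and 10) indicates how far the product has progressed through the process. With each step it stays the same or increases but never decreases, and odd numbers mark products that failed an evaluation. The highest score is called the product status.
I now want to select all products that have an even status (2,4,8,10) at the startDate, returning that status together with their status at the endDate of a time range.
(I also want to select all new products that entered the process within that time range, but I think that can easily be done in a second statement.)
The problem I have is how to include the initial status in the output alongside the new one. This is my SQL statement: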
SELECT
    MyTable.product_id,
    MyTable.REGION,
    MyTable.SEGMENT,
    Max(MyTable.result) AS NEW_STATUS
FROM
    MyTable INNER JOIN (
        SELECT
            product_id,
            REGION,
            SEGMENT,
            Max(result) AS INITIAL_STATUS
        FROM
            MyTable
        WHERE
            DATE <= to_date(:startDate)
        GROUP BY
            product_id, REGION, SEGMENT
        HAVING
            Max(result) IN(2,4,8,10)
    ) initial_status ON MyTable.product_id = initial_status.product_id
WHERE
    MyTable.DATE <= to_date(:endDate)
GROUP BY
    MyTable.product_id,
    MyTable.REGION,
    MyTable.SEGMENT;
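How can I include initial_status in the output without affecting the max / group by? (This is Oracle, but I am no expert, so maybe some Oracle-specific feature could help.)
Edit:
The data is a one-to-many relationship: one product, many evaluations. Each evaluation has a region, segment, result and evaluation_date (plus other data that is not relevant here). Here is some denormalized sample data: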
product_id  Region  Segment  Result  date
1           US      AB       2       20.05.2012
1           EU      TS       4       13.06.2012
1           US      AB       4       01.09.2012
234         US      AB       2       09.09.2012
Expected output for the sample above, with a date range of 26.06.2012 to 21.09.2012:
product_id  Region  Segment  Initial_Status  New_Status
1           US      AB       2               4
1           EU      TS       4               4           (this did not change)
234         US      AB       (null)          2           (new entry)
I know my current SQL cannot produce this, especially not the rows for new entries.
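To reproduce the problem, the sample rows could be loaded into a scratch table along the following lines. This is only a sketch: the table name my_table and the column types are assumptions, not the real schema. The date column is named evaluation_date here, as in the description above, because DATE is a reserved word in Oracle.

```sql
-- Hypothetical scratch table holding the sample rows from the question.
CREATE TABLE my_table (
    product_id      NUMBER       NOT NULL,
    region          VARCHAR2(10) NOT NULL,
    segment         VARCHAR2(10),
    result          NUMBER(2)    NOT NULL,
    evaluation_date DATE
);

-- ANSI date literals avoid any dependency on the session's NLS_DATE_FORMAT.
INSERT INTO my_table VALUES (1,   'US', 'AB', 2, DATE '2012-05-20');
INSERT INTO my_table VALUES (1,   'EU', 'TS', 4, DATE '2012-06-13');
INSERT INTO my_table VALUES (1,   'US', 'AB', 4, DATE '2012-09-01');
INSERT INTO my_table VALUES (234, 'US', 'AB', 2, DATE '2012-09-09');
COMMIT;
```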
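Answer 0 (score: 0)
This sounds like a job for analytic functions in subqueries, combined with a UNION set operation. The benefit of analytic functions is that you only need a single table scan.
"I now want to select all products that have an even status (2,4,8,10) at the startDate"
That would be: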
select product_id, region, segment, initial_status, new_status
  from ( select product_id, region, segment, initial_status, date
                -- The maximum status over all time per product_id,
                -- region and segment
              , max(initial_status) over
                  ( partition by product_id, region, segment ) as new_status
           from my_table
                )
 -- Restrict on where
 where ( date <= to_date(:start_date, <format model>)
         -- If you only want even you can use mod
         and mod(initial_status, 2) = 0
         )
    or new_status = initial_status
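To see what the analytic form buys you, it may help to run it on the question's sample rows. Unlike GROUP BY, MAX(...) OVER keeps every input row and attaches the per-group maximum to each one. The sketch below assumes the score column is called result and the date column evaluation_date (as in the question's edit); the inline evaluations CTE is a stand-in for the real table.

```sql
-- Sample rows from the question, inlined so the query is self-contained.
WITH evaluations (product_id, region, segment, result, evaluation_date) AS (
    SELECT 1,   'US', 'AB', 2, DATE '2012-05-20' FROM dual UNION ALL
    SELECT 1,   'EU', 'TS', 4, DATE '2012-06-13' FROM dual UNION ALL
    SELECT 1,   'US', 'AB', 4, DATE '2012-09-01' FROM dual UNION ALL
    SELECT 234, 'US', 'AB', 2, DATE '2012-09-09' FROM dual
)
SELECT product_id, region, segment, result, evaluation_date,
       -- Per-group maximum, repeated on every row of the group.
       MAX(result) OVER (PARTITION BY product_id, region, segment) AS max_result
  FROM evaluations
 ORDER BY product_id, region, segment, evaluation_date;
```

All four rows come back. Both rows for product 1 / US / AB carry max_result = 4, so the earlier row still shows its own result (2) next to the latest status.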
You can then get all the new ones:
select product_id, region, segment, initial_status, new_status
  from ( select product_id, region, segment, initial_status
              , initial_status as new_status, date
                -- Minimum date this product_id, region, segment
                -- combination was entered
              , min(date) over
                  ( partition by product_id, region, segment ) as min_date
                -- Find the most recent record for this combination
              , rank() over ( partition by product_id, region, segment
                                  order by date desc ) as rnk
           from my_table
                )
 -- By putting this condition in the outer-select
 -- you ensure you only get completely new records
 where min_date >= to_date(:startdate, <format_model>)
   -- If you have multiple records that were entered for a single pk
   -- between startdate and enddate you only want the most recent one.
   and rnk = 1
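As a rough check of the min_date / rnk filter, here is the same idea applied to the question's sample rows. The column names result and evaluation_date and the literal start date 26.06.2012 are taken from the question; the CTE names are made up for this illustration.

```sql
WITH evaluations (product_id, region, segment, result, evaluation_date) AS (
    SELECT 1,   'US', 'AB', 2, DATE '2012-05-20' FROM dual UNION ALL
    SELECT 1,   'EU', 'TS', 4, DATE '2012-06-13' FROM dual UNION ALL
    SELECT 1,   'US', 'AB', 4, DATE '2012-09-01' FROM dual UNION ALL
    SELECT 234, 'US', 'AB', 2, DATE '2012-09-09' FROM dual
),
ranked AS (
    SELECT product_id, region, segment, result, evaluation_date,
           -- First time this combination was evaluated.
           MIN(evaluation_date) OVER
               (PARTITION BY product_id, region, segment) AS min_date,
           -- 1 marks the most recent evaluation of the combination.
           RANK() OVER (PARTITION BY product_id, region, segment
                            ORDER BY evaluation_date DESC) AS rnk
      FROM evaluations
)
SELECT product_id, region, segment, result AS new_status, min_date
  FROM ranked
 WHERE min_date >= DATE '2012-06-26'   -- first seen on or after the start date
   AND rnk = 1;                        -- keep only the latest row per combination
```

With the sample data, only product 234 / US / AB passes both filters, because the other two combinations were first evaluated before the start date.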
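Lastly, you can add these together with a UNION. If you can guarantee that there is no overlap, use UNION ALL instead: it does not perform a DISTINCT operation, which makes the query more performant.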
select query1
 union
select query2
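The difference between the two operators is easy to see on a pair of identical rows. This is a toy example against dual, not part of the real query:

```sql
-- UNION removes duplicate rows: this returns one row.
SELECT 1 AS product_id, 'US' AS region FROM dual
UNION
SELECT 1, 'US' FROM dual;

-- UNION ALL keeps them: this returns two rows.
SELECT 1 AS product_id, 'US' AS region FROM dual
UNION ALL
SELECT 1, 'US' FROM dual;
```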
Note that these could also be merged into a single query. It will not look pretty, but it may be more efficient:
select product_id, region, segment, initial_status, new_status
  from ( select product_id, region, segment, initial_status
                -- date must be selected here because the outer WHERE filters on it
              , date
              , min(date) over
                  ( partition by product_id, region, segment ) as min_date
              , rank() over ( partition by product_id, region, segment
                                  order by date desc ) as rnk
              , max(initial_status) over
                  ( partition by product_id, region, segment ) as new_status
           from my_table
                )
 where ( min_date >= to_date(:start_date, <format model>)
         and rnk = 1
         )
    or ( ( date <= to_date(:start_date, <format model>)
           and mod(initial_status, 2) = 0
           )
         or new_status = initial_status
         )
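For comparison, here is a different single-scan technique that is not used in the queries above: conditional aggregation. A CASE inside MAX computes the status as of the start date, while a plain MAX gives the status as of the end date. Treat it as a sketch under assumptions: the columns are named result and evaluation_date as in the question's edit, the dates are hard-coded literals where a real query would use bind variables, and the inline CTE stands in for the real table.

```sql
WITH evaluations (product_id, region, segment, result, evaluation_date) AS (
    SELECT 1,   'US', 'AB', 2, DATE '2012-05-20' FROM dual UNION ALL
    SELECT 1,   'EU', 'TS', 4, DATE '2012-06-13' FROM dual UNION ALL
    SELECT 1,   'US', 'AB', 4, DATE '2012-09-01' FROM dual UNION ALL
    SELECT 234, 'US', 'AB', 2, DATE '2012-09-09' FROM dual
)
SELECT product_id, region, segment,
       -- Highest score reached up to the start date; NULL if the
       -- combination did not exist yet (a new entry).
       MAX(CASE WHEN evaluation_date <= DATE '2012-06-26'
                THEN result END)                AS initial_status,
       -- Highest score reached up to the end date.
       MAX(result)                              AS new_status
  FROM evaluations
 WHERE evaluation_date <= DATE '2012-09-21'
 GROUP BY product_id, region, segment
       -- Keep the even initial statuses listed in the question, plus new entries.
HAVING MAX(CASE WHEN evaluation_date <= DATE '2012-06-26'
                THEN result END) IN (2, 4, 8, 10)
    OR MAX(CASE WHEN evaluation_date <= DATE '2012-06-26'
                THEN result END) IS NULL
 ORDER BY product_id, region, segment;
```

On the sample data this returns the three rows of the expected output from the question: (1, EU, TS, 4, 4), (1, US, AB, 2, 4) and (234, US, AB, null, 2).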
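Answer 1 (score: 0)
Just for the record, I came up with the following query. I know it includes some requirements that were not stated in the initial question. Some of them are there to deal with bad data.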
SELECT
    product_id,
    REGION,
    SEGMENT,
    initial_status,
    NEW_STATUS,
    "Comment",
    Count("Comment") OVER (PARTITION BY
        "Comment"
    ) "Counter"
FROM (
    SELECT DISTINCT
        myTable.product_id,
        myTable.REGION,
        myTable.SEGMENT,
        initial_status.initial_status,
        Max(myTable.result)
            OVER (PARTITION BY
                myTable.product_id,
                myTable.REGION,
                myTable.SEGMENT
            ) NEW_STATUS,
        CASE WHEN initial_status.initial_status <> Max(myTable.result)
            OVER (PARTITION BY
                myTable.product_id,
                myTable.REGION,
                myTable.SEGMENT
            ) THEN 'Changed' ELSE 'Same' END as "Comment"
    FROM
        myTable INNER JOIN (
            SELECT
                product_id,
                REGION,
                SEGMENT,
                Max(result) AS INITIAL_STATUS
            FROM
                myTable
            WHERE
                DATE <= to_date(:startDate)
                OR DATE is null
            GROUP BY
                product_id, REGION, SEGMENT
            HAVING
                Max(result) IN(2,4,8,10)
        ) initial_status
        ON
            myTable.product_id = initial_status.product_id
            AND myTable.REGION = initial_status.REGION
            AND (
                myTable.SEGMENT = initial_status.SEGMENT
                OR (myTable.SEGMENT is null AND initial_status.SEGMENT is null)
            )
    WHERE
        myTable.DATE <= to_date(:endDate)
    UNION ALL
    SELECT
        myTable.product_id,
        myTable.REGION,
        myTable.SEGMENT,
        null AS initial_status,
        Max(myTable.result)
            OVER (PARTITION BY
                myTable.product_id,
                myTable.REGION,
                myTable.SEGMENT
            ) NEW_STATUS,
        'New' As "Comment"
    FROM myTable
    WHERE evaluation_date BETWEEN to_date(:startDate) + 1 AND to_date(:endDate)
        AND stage <> 'Stage 0')
ORDER BY
    product_id ASC;
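The extra "is null" branch in the join condition on SEGMENT is there because NULL = NULL does not evaluate to true in SQL, so rows with a missing segment would otherwise drop out of the join. A small illustration with made-up data (the lhs and rhs CTEs are hypothetical):

```sql
WITH lhs (product_id, segment) AS (
    SELECT 1, 'AB' FROM dual UNION ALL
    SELECT 2, CAST(NULL AS VARCHAR2(10)) FROM dual
),
rhs (product_id, segment) AS (
    SELECT 1, 'AB' FROM dual UNION ALL
    SELECT 2, CAST(NULL AS VARCHAR2(10)) FROM dual
)
SELECT
    -- Plain equality: the NULL/NULL pair is not counted as a match.
    SUM(CASE WHEN lhs.segment = rhs.segment
             THEN 1 ELSE 0 END) AS plain_matches,
    -- NULL-safe comparison, as in the join above.
    SUM(CASE WHEN lhs.segment = rhs.segment
               OR (lhs.segment IS NULL AND rhs.segment IS NULL)
             THEN 1 ELSE 0 END) AS null_safe_matches
FROM lhs
JOIN rhs ON lhs.product_id = rhs.product_id;
```

Here plain_matches is 1 and null_safe_matches is 2: product 2 only matches once the IS NULL branch is added.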