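I'm looking for a way to get the MEDIAN of a series of start and end dates (lots and lots of dates). However, it has to be specific to each invoice number. See the sample data below.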
invoice_no   invoice start date   invoice end date
4006         11/14/2001           12/15/2004
20071        11/29/2001           02/01/2003
19893        11/30/2001           12/02/2001
19894        11/30/2001           12/04/2001
004          10/22/2002           10/31/2002
004          12/02/2002           10/31/2002
004          01/19/2002           10/31/2002
004          05/10/2002           10/31/2002
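The goal is to find the median between the start date and the end date.
For an invoice that appears only once, the median is simply the value between the start date and the end date for that invoice_no. In many cases, though, an invoice shows up the way '004' does: it repeats several times with different dates. The concept is still the same. I need the median between the two dates, and it still has to be reported per invoice number.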
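To make the single-row case concrete, here is a minimal Python sketch (not part of my SQL). It assumes that "median" for an invoice with one row means the point halfway between the two dates, rounded down to a whole day; the helper name `midpoint` is made up for illustration.

```python
from datetime import date

def midpoint(start: date, end: date) -> date:
    """Return the date halfway between start and end, rounded down to a whole day."""
    # Assumption: for a single-row invoice, "median" means this halfway point.
    return start + (end - start) // 2

# Invoice 19894 from the sample data: 11/30/2001 to 12/04/2001
print(midpoint(date(2001, 11, 30), date(2001, 12, 4)))  # 2001-12-02
```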
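I want to filter the data as much as possible. I realized I can also use WHERE STATUS <> 'REJECTED', which should help keep out a lot of the questionable dates. I also only want a range of a few months, so I added a BETWEEN on DATETIME values as well.
Code so far (not working... this logic seems to work when there is a single date column, but now we are dealing with two dates, so I'm not sure):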
WITH tmp AS
(
    SELECT invoice_no,
           invoice_start_date, invoice_end_date, check_date, status_code,
           CAST(COUNT(*) OVER (PARTITION BY invoice_no) AS float) AS total,
           ROW_NUMBER() OVER (PARTITION BY invoice_no
                              ORDER BY invoice_start_date, invoice_end_date, check_date) AS rn
    FROM INVOICE_HEADER
    INNER JOIN INVOICE_HEADER_CUSTOM
        ON INVOICE_HEADER.invoice_id = INVOICE_HEADER_CUSTOM.invoice_id
    WHERE status_code <> 'REJECTED'
      AND Check_Date BETWEEN CONVERT(DATETIME, '2014-12-01 00:00:00', 102)
                         AND CONVERT(DATETIME, '2014-12-31 00:00:00', 102)
)
SELECT *
FROM tmp
WHERE (total / 2.0 - 1) < rn AND rn < (total / 2.0 + 1)
Answer 0 (score: 1)
OK... try a query like the one on this page:
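DECLARE @Median DECIMAL(12, 2);  -- declaration added so the snippet runs; the type is an assumption

SELECT @Median = AVG(1.0 * val)
FROM
(
    SELECT val,
           c  = COUNT(*) OVER (),
           rn = ROW_NUMBER() OVER (ORDER BY val)
    FROM dbo.EvenRows
) AS x
WHERE rn IN ((c + 1)/2, (c + 2)/2);  -- the middle row (odd count) or the two middle rows (even count)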
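As a quick sanity check on that WHERE clause, the sketch below mimics the row selection in Python. It is only an illustration: `median_row_numbers` and `median` are hypothetical helper names, and `//` stands in for SQL Server's integer division, which gives the same result for positive integers.

```python
def median_row_numbers(c: int) -> set:
    """1-based row numbers kept by `rn IN ((c + 1)/2, (c + 2)/2)` with integer division."""
    return {(c + 1) // 2, (c + 2) // 2}

def median(values: list) -> float:
    """Average the values whose row numbers survive the filter."""
    ordered = sorted(values)
    keep = median_row_numbers(len(ordered))
    picked = [v for rn, v in enumerate(ordered, start=1) if rn in keep]
    return sum(picked) / len(picked)

print(median_row_numbers(5))  # {3}
print(median_row_numbers(4))  # {2, 3}
print(median([1, 3, 5, 7]))   # 4.0
```

With an odd count both expressions land on the same row, so the average is just that row's value; with an even count they pick the two middle rows.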
Answer 1 (score: 1)
If you mean the median of the whole set of start and end dates, put them into a single column:
WITH t AS (
      SELECT invoice_no, invoice_start_date, invoice_end_date, check_date, status_code
      FROM INVOICE_HEADER INNER JOIN
           INVOICE_HEADER_CUSTOM
           ON INVOICE_HEADER.invoice_id = INVOICE_HEADER_CUSTOM.invoice_id
      WHERE status_code <> 'REJECTED' AND
            Check_Date BETWEEN CONVERT(DATETIME, '2014-12-01 00:00:00', 102) AND
                               CONVERT(DATETIME, '2014-12-31 00:00:00', 102)
     ),
     t2 as (
      -- stack the start and end dates into one column d, then number the rows
      select d, row_number() over (order by d) as seqnum,
             count(*) over () as cnt
      from (select invoice_start_date as d from t
            union all
            select invoice_end_date as d from t
           ) u
     )
-- keep the middle row(s) and return the point halfway between them;
-- dateadd and datediff must use the same unit (hour)
select dateadd(hour, datediff(hour, min(d), max(d)) / 2, min(d))
from t2
where 2 * seqnum in (cnt, cnt + 1, cnt + 2);
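To see what pooling the two columns does, and how the same idea could be applied per invoice_no as the question asks, here is a rough Python sketch over the sample rows from the question. It is an illustration under assumptions rather than a translation of the query: the query above returns a single value for the whole set, whereas this sketch groups by invoice first, and the helper names `median_date` and `median_by_invoice` are made up.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (invoice_no, start date, end date)
rows = [
    ("4006",  date(2001, 11, 14), date(2004, 12, 15)),
    ("20071", date(2001, 11, 29), date(2003, 2, 1)),
    ("19893", date(2001, 11, 30), date(2001, 12, 2)),
    ("19894", date(2001, 11, 30), date(2001, 12, 4)),
    ("004",   date(2002, 10, 22), date(2002, 10, 31)),
    ("004",   date(2002, 12, 2),  date(2002, 10, 31)),
    ("004",   date(2002, 1, 19),  date(2002, 10, 31)),
    ("004",   date(2002, 5, 10),  date(2002, 10, 31)),
]

def median_date(dates):
    """Median of a list of dates; for an even count, the midpoint of the two middle dates."""
    ordered = sorted(dates)
    n = len(ordered)
    lo, hi = ordered[(n - 1) // 2], ordered[n // 2]
    return lo + (hi - lo) // 2  # rounded down to a whole day

def median_by_invoice(rows):
    """Pool start and end dates per invoice_no, then take the median of each pool."""
    pooled = defaultdict(list)
    for invoice_no, start, end in rows:
        pooled[invoice_no].extend([start, end])
    return {inv: median_date(ds) for inv, ds in pooled.items()}

result = median_by_invoice(rows)
print(result["19894"])  # 2001-12-02
print(result["004"])    # 2002-10-31
```

In SQL the per-invoice version would presumably mean adding invoice_no to the stacked column and partitioning both window functions by it, but that is untested here.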