Say I have a table like the following:
date        add_days
2015-01-01  5
2015-01-04  2
2015-01-11  7
2015-01-20  10
2015-01-30  1
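What I want to do is compute days_balance, i.e. check whether date is later or earlier than the previous date + N days (add_days), and accumulate the days when the rows form a continuous series.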
So the algorithm should work like this:
for i in 2:N_rows {
    days_balance[i] := date[i-1] + add_days[i-1] - date[i]
    if days_balance[i] >= 0 then
        date[i] := date[i] + days_balance[i]
}
The expected result should be:
date        days_balance
2015-01-01  0
2015-01-04  2
2015-01-11  -3
2015-01-20  -2
2015-01-30  0
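To make the pseudocode above concrete, here is a minimal Python sketch of the same loop run on the sample data. The function name `balance_with_shift` is made up for illustration, and the sketch assumes the first row's balance is 0, as in the expected result.

```python
from datetime import date, timedelta

def balance_with_shift(rows):
    """rows: list of (date, add_days) tuples sorted by date."""
    if not rows:
        return []
    dates = [d for d, _ in rows]   # working copy; entries may be shifted forward
    balances = [0]                 # assumption: the first row has a balance of 0
    for i in range(1, len(rows)):
        prev_end = dates[i - 1] + timedelta(days=rows[i - 1][1])
        bal = (prev_end - dates[i]).days
        if bal >= 0:
            # the series is continuous, so push this row's date forward
            dates[i] = dates[i] + timedelta(days=bal)
        balances.append(bal)
    return balances

rows = [
    (date(2015, 1, 1), 5),
    (date(2015, 1, 4), 2),
    (date(2015, 1, 11), 7),
    (date(2015, 1, 20), 10),
    (date(2015, 1, 30), 1),
]
print(balance_with_shift(rows))  # [0, 2, -3, -2, 0]
```

Only a non-negative balance is carried into the next row (by shifting the date); a negative balance is reported but not carried forward.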
Is this possible in pure SQL? I imagine it would be a join with some condition, but I can't see how to implement it.
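Answer 0 (score: 1)
I'm posting another answer because it may be useful to compare the two, since they use different approaches (this one does a single n^2-style join, the other uses a recursive CTE). This one exploits the fact that you don't have to compute days_balance for every preceding row before computing it for a given row; you only need the sums of add_days and days_since_prev over the preceding rows.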
-- drop the scratch table only if it already exists
IF OBJECT_ID('junk', 'U') IS NOT NULL DROP TABLE junk;
CREATE TABLE junk(date DATETIME, add_days INT);
INSERT INTO junk VALUES
('2015-01-01', 5),
('2015-01-04', 2),
('2015-01-11', 7),
('2015-01-20', 10),
('2015-01-30', 1);

WITH cte AS
(
    -- number the rows and compute the gap in days to the previous row (0 for the first row)
    SELECT ROW_NUMBER() OVER (ORDER BY date) i, date, add_days,
           ISNULL(DATEDIFF(DAY, LAG(date) OVER (ORDER BY date), date), 0) days_since_prev
    FROM junk
)
, combinedWithAllPreviousDaysCte AS
(
    -- get first row explicitly since it has no preceding rows
    SELECT i [curr_i], date [curr_date], add_days [curr_add_days], days_since_prev [curr_days_since_prev],
           0 [prev_add_days], 0 [prev_days_since_prev]
    FROM cte
    WHERE i = 1
    UNION ALL
    SELECT curr.i [curr_i], curr.date [curr_date], curr.add_days [curr_add_days], curr.days_since_prev [curr_days_since_prev],
           prev.add_days [prev_add_days], prev.days_since_prev [prev_days_since_prev]
    FROM cte curr
    JOIN cte prev ON curr.i > prev.i -- join to all previous days
)
SELECT curr_i, curr_date,
       SUM(prev_add_days) - curr_days_since_prev - SUM(prev_days_since_prev) [days_balance]
FROM combinedWithAllPreviousDaysCte
GROUP BY curr_i, curr_date, curr_days_since_prev
ORDER BY curr_i;
Output:
+--------+-------------------------+--------------+
| curr_i | curr_date | days_balance |
+--------+-------------------------+--------------+
| 1 | 2015-01-01 00:00:00.000 | 0 |
| 2 | 2015-01-04 00:00:00.000 | 2 |
| 3 | 2015-01-11 00:00:00.000 | -3 |
| 4 | 2015-01-20 00:00:00.000 | -5 |
| 5 | 2015-01-30 00:00:00.000 | -5 |
+--------+-------------------------+--------------+
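As a cross-check of the arithmetic behind the join-and-sum query, the following Python sketch computes the same expression, SUM(prev_add_days) - curr_days_since_prev - SUM(prev_days_since_prev), directly on the sample data. It is an illustration only, not a replacement for the SQL; the function name `balance_by_sums` is hypothetical.

```python
from datetime import date

def balance_by_sums(rows):
    """rows: list of (date, add_days) tuples sorted by date."""
    dates = [d for d, _ in rows]
    add_days = [a for _, a in rows]
    # days_since_prev: 0 for the first row, otherwise the gap to the previous date
    gaps = [0] + [(dates[i] - dates[i - 1]).days for i in range(1, len(rows))]
    result = []
    for i in range(len(rows)):
        # sum over all preceding rows, mirroring the "curr.i > prev.i" join
        result.append(sum(add_days[:i]) - gaps[i] - sum(gaps[:i]))
    return result

rows = [
    (date(2015, 1, 1), 5),
    (date(2015, 1, 4), 2),
    (date(2015, 1, 11), 7),
    (date(2015, 1, 20), 10),
    (date(2015, 1, 30), 1),
]
print(balance_by_sums(rows))  # [0, 2, -3, -5, -5]
```

Since the gaps up to and including the current row add up to the number of days since the first date, the expression reduces to "total add_days of all preceding rows minus days elapsed since the first row". For the last row that is 24 - 29 = -5.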
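Answer 1 (score: 0)
Well, I think I have it with a recursive CTE (sorry, I only have Microsoft SQL Server available at the moment, so it may not be PostgreSQL-compatible).
Also, I think your expected results are off (see the comments above). If not, this can be modified to match your math.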
-- drop the scratch table only if it already exists
IF OBJECT_ID('junk', 'U') IS NOT NULL DROP TABLE junk;
CREATE TABLE junk(date DATETIME, add_days INT);
INSERT INTO junk VALUES
('2015-01-01', 5),
('2015-01-04', 2),
('2015-01-11', 7),
('2015-01-20', 10),
('2015-01-30', 1);

WITH cte AS
(
    -- number the rows and compute the gap in days to the previous row (0 for the first row)
    SELECT ROW_NUMBER() OVER (ORDER BY date) i, date, add_days,
           ISNULL(DATEDIFF(DAY, LAG(date) OVER (ORDER BY date), date), 0) days_since_prev
    FROM junk
)
, recursiveCte (i, date, add_days, days_since_prev, days_balance, math) AS
(
    -- anchor: the first row starts with a zero balance
    SELECT TOP 1
        i,
        date,
        add_days,
        days_since_prev,
        0 [days_balance],
        CAST('no math for initial one, just has zero balance' AS varchar(max)) [math]
    FROM cte WHERE i = 1
    UNION ALL --recursive step now
    SELECT
        curr.i,
        curr.date,
        curr.add_days,
        curr.days_since_prev,
        prev.days_balance - curr.days_since_prev + prev.add_days [days_balance],
        CAST(prev.days_balance AS varchar(max)) + ' - ' + CAST(curr.days_since_prev AS varchar(max)) + ' + ' + CAST(prev.add_days AS varchar(max)) [math]
    FROM cte curr
    JOIN recursiveCte prev ON curr.i = prev.i + 1
)
SELECT i, DATEPART(day, date) [day], add_days, days_since_prev, days_balance, math
FROM recursiveCte
ORDER BY date;
The results are as follows:
+---+-----+----------+-----------------+--------------+------------------------------------------------+
| i | day | add_days | days_since_prev | days_balance | math |
+---+-----+----------+-----------------+--------------+------------------------------------------------+
| 1 | 1 | 5 | 0 | 0 | no math for initial one, just has zero balance |
| 2 | 4 | 2 | 3 | 2 | 0 - 3 + 5 |
| 3 | 11 | 7 | 7 | -3 | 2 - 7 + 2 |
| 4 | 20 | 10 | 9 | -5 | -3 - 9 + 7 |
| 5 | 30 | 1 | 10 | -5 | -5 - 10 + 10 |
+---+-----+----------+-----------------+--------------+------------------------------------------------+
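The recurrence in the "math" column can be sketched in a few lines of Python. The sketch also shows one possible way to adapt it to the question's pseudocode, on the assumption that the intent there is to carry only non-negative balances forward; the function name `balance_by_recurrence` and the `carry_negative` flag are invented for this illustration.

```python
from datetime import date

def balance_by_recurrence(rows, carry_negative=True):
    """rows: list of (date, add_days) tuples sorted by date.

    carry_negative=True  mirrors the recursive step in the SQL above.
    carry_negative=False resets a negative balance to 0 before the next
    row, which is what the question's pseudocode effectively does.
    """
    result = []
    prev_balance = 0
    for i, (current_date, _) in enumerate(rows):
        if i == 0:
            balance = 0  # the first row has no previous row
        else:
            prev_date, prev_add_days = rows[i - 1]
            days_since_prev = (current_date - prev_date).days
            carried = prev_balance if carry_negative else max(prev_balance, 0)
            balance = carried - days_since_prev + prev_add_days
        result.append(balance)
        prev_balance = balance
    return result

rows = [
    (date(2015, 1, 1), 5),
    (date(2015, 1, 4), 2),
    (date(2015, 1, 11), 7),
    (date(2015, 1, 20), 10),
    (date(2015, 1, 30), 1),
]
print(balance_by_recurrence(rows))                        # [0, 2, -3, -5, -5]
print(balance_by_recurrence(rows, carry_negative=False))  # [0, 2, -3, -2, 0]
```

If the clamped variant is what is wanted, the corresponding change in the recursive step would presumably be to replace prev.days_balance in the days_balance expression with CASE WHEN prev.days_balance > 0 THEN prev.days_balance ELSE 0 END; this has not been run against SQL Server here.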
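Answer 2 (score: -1)
I don't quite understand how your algorithm returns the expected result. Still, let me share a technique I came up with that may help.
This only works if the final data set is exported to Excel, and even then it won't work in every scenario, depending on the format the data set is exported in, but here goes...
If you're familiar with Excel formulas: I've found that if you write an Excel formula as an extra field in your SQL, Excel evaluates that formula as soon as the data lands in the sheet (what works best for me is simply copying and pasting the result into Excel, so that it doesn't get formatted as text).
So, for your example, here is something you could do (again, note that I don't understand your algorithm, so this may be wrong; it's only meant to show the concept):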
SELECT
date
, add_days
, '=INDEX($1:$65536,ROW()-1,COLUMN()-2)'
||'+INDEX($1:$65536,ROW()-1,COLUMN()-1)'
||'-INDEX($1:$65536,ROW(),COLUMN()-2)'
AS "days_balance[i]"
,'=IF(INDEX($1:$65536,ROW(),COLUMN()-1)>=0'
||',INDEX($1:$65536,ROW(),COLUMN()-3)'
||'+INDEX($1:$65536,ROW(),COLUMN()-1))'
AS "date[i]"
FROM
myTable
ORDER BY /*Ensure to order by whatever you need for your formula to work*/
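The key to making this work is using the INDEX formula function to select cells relative to the current cell's position. So ROW()-1 tells it to take the value from the previous record, and COLUMN()-2 means take the value from two columns to the left of the current one. You can't use cell references such as A2+B2-A3, because the row numbers won't change on export and the formula would assume fixed column positions.
I used || for SQL string concatenation so that the formulas are easier to read on screen.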
I tried this in Excel; it doesn't match your expected results. But if this technique suits you, just correct the Excel formula.
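To see what the first formula evaluates to, here is a rough Python sketch that mimics the relative lookups INDEX(ROW()-1, COLUMN()-2), INDEX(ROW()-1, COLUMN()-1) and INDEX(ROW(), COLUMN()-2) on the sample data. It assumes the formula sits in the third column, treats dates as plain day numbers, and skips the first data row because it has no previous record; the function name `relative_balance` is made up.

```python
from datetime import date

def relative_balance(rows):
    """rows: list of (date, add_days) tuples in sheet order."""
    # column 0 = date as a day number, column 1 = add_days,
    # column 2 = the cell holding the formula
    grid = [[d.toordinal(), a] for d, a in rows]
    col = 2
    out = [None]  # first data row: no previous record to look at
    for r in range(1, len(grid)):
        out.append(grid[r - 1][col - 2] + grid[r - 1][col - 1] - grid[r][col - 2])
    return out

rows = [
    (date(2015, 1, 1), 5),
    (date(2015, 1, 4), 2),
    (date(2015, 1, 11), 7),
    (date(2015, 1, 20), 10),
    (date(2015, 1, 30), 1),
]
print(relative_balance(rows))  # [None, 2, -5, -2, 0]
```

The third row comes out as -5 rather than the expected -3 because the formula reads the original date column, not the shifted date[i] that the second formula produces, so a balance from one row never reaches the next.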