I have two tables: balance and calendar.

balance:

Account  Date        Balance
1111     01/01/2014  100
1111     02/01/2014  156
1111     03/01/2014  300
1111     04/01/2014  300
1111     07/01/2014  468
1112     02/01/2014  300
1112     03/01/2014  300
1112     06/01/2014  300
1112     07/01/2014  350
1112     08/01/2014  400
1112     09/01/2014  450
1113     01/01/2014  30
1113     02/01/2014  40
1113     03/01/2014  45
1113     06/01/2014  45
1113     07/01/2014  60
1113     08/01/2014  50
1113     09/01/2014  20
1113     10/01/2014  10

calendar:

date        business_day_ind
01/01/2014  N
02/01/2014  Y
03/01/2014  Y
04/01/2014  N
05/01/2014  N
06/01/2014  Y
07/01/2014  Y
08/01/2014  Y
09/01/2014  Y
10/01/2014  Y
Joining the two gives me one row per account and calendar date, with the balance left blank where none exists:

1111  01/01/2014  100  N
1111  02/01/2014  156  Y
1111  03/01/2014  300  Y
1111  04/01/2014  300  Y
1111  05/01/2014       N
1111  06/01/2014       N
1111  07/01/2014  468  Y
1111  08/01/2014       Y
1111  09/01/2014       Y
1111  10/01/2014       Y
1112  01/01/2014       N
1112  02/01/2014  300  Y
1112  03/01/2014  300  Y
1112  04/01/2014       N
1112  05/01/2014       N
1112  06/01/2014  300  Y
1112  07/01/2014  350  Y
1112  08/01/2014  400  Y
1112  09/01/2014  450  Y
1112  10/01/2014       Y

I need an efficient way (preferably one that does not involve multiple steps) to limit the dates to the latest date on which each account has a balance (07/01/2014 for account 1111, 09/01/2014 for account 1112).

Desired output:

1111  01/01/2014  100  N
1111  02/01/2014  156  Y
1111  03/01/2014  300  Y
1111  04/01/2014  300  Y
1111  05/01/2014       N
1111  06/01/2014       N
1111  07/01/2014  468  Y
1112  01/01/2014       N
1112  02/01/2014  300  Y
1112  03/01/2014  300  Y
1112  04/01/2014       N
1112  05/01/2014       N
1112  06/01/2014  300  Y
1112  07/01/2014  350  Y
1112  08/01/2014  400  Y
1112  09/01/2014  450  Y
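After filling in the missing days, I plan to impute the previous business day's balance for them. To do that, I plan to find the previous business day for each date and then update the missing rows by joining back to the original balance table, using acct and previous business day as the key.

Thanks.

I am using Greenplum Database.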
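
The imputation plan described above (look up each date's previous business day, then join back to balance on account and that day) might be sketched roughly as follows. This is only an illustration under assumptions: `filled` is a hypothetical table or view holding the gap-filled rows (one row per account and calendar date, with a NULL balance on the gaps), the date columns are of a real date type rather than text, and business_day_ind holds 'Y'/'N' as in the sample data.

```sql
-- Hypothetical input: filled(account, date, balance, business_day_ind),
-- i.e. the gap-filled result with balance NULL on the missing days.
select f.account,
       f.date,
       coalesce(f.balance, b.balance) as balance,  -- keep a real balance, otherwise borrow one
       f.business_day_ind
from filled f
left join (
    -- For every calendar date, the latest business day strictly before it.
    select c.date, max(p.date) as prev_business_day
    from calendar c
    join calendar p on p.date < c.date and p.business_day_ind = 'Y'
    group by c.date
) pbd on pbd.date = f.date
left join balance b
       on b.account = f.account
      and b.date = pbd.prev_business_day
order by f.account, f.date;
```

A row stays NULL when the previous business day has no balance for that account either, so a run of consecutive gaps may need a different approach, for example joining to the latest earlier date that actually has a balance.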
Answer 0 (score: 0)

A possible approach is to put a second select in a subquery. For example:
select ... from calendar a left outer join balance b on a.date = b.date
where a.date <= (select max(date) from balance c where b.Account = c.Account )
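
One thing to watch in a query of this shape: on the days where the left join finds no balance row, b.Account is NULL, so the correlated subquery returns NULL and the filter drops exactly the gap rows that are wanted. A variant that avoids this, offered as a sketch only and using the table and column names from the question, computes each account's last balance date first and joins the calendar to that:

```sql
-- Assumes balance(account, date, balance) and calendar(date, business_day_ind)
-- as described in the question, with date stored as a real date type.
select m.account, c.date, b.balance, c.business_day_ind
from (
    -- Last date on which each account has a balance.
    select account, max(date) as max_date
    from balance
    group by account
) m
join calendar c
  on c.date <= m.max_date          -- one row per account per calendar day up to its last balance
left join balance b
  on b.account = m.account
 and b.date = c.date               -- NULL balance where the account has no row that day
order by m.account, c.date;
```

Because the list of accounts comes from balance itself, no separate accounts table is needed.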
Answer 1 (score: 0)

I assume you have a third table, accounts:
select
    accounts.account,
    calendar.date,
    balance.balance,
    calendar.business_day_ind
from
    accounts
    cross join lateral (
        select *
        from calendar
        where calendar.date <= (
            select max(date)
            from balance
            where balance.account = accounts.account)) as calendar
    left join balance on (balance.account = accounts.account and balance.date = calendar.date)
order by
    accounts.account, calendar.date;
Answer 2 (score: 0)

This was an interesting challenge!
CREATE TABLE balance
(account int, balance_date timestamp, balance int)
DISTRIBUTED BY (account, balance_date);
INSERT INTO balance
values (1111,'01/01/2014', 100),
(1111, '02/01/2014', 156),
(1111, '03/01/2014', 300),
(1111, '04/01/2014', 300),
(1111, '07/01/2014', 468),
(1112, '02/01/2014', 300),
(1112, '03/01/2014', 300),
(1112, '06/01/2014', 300),
(1112, '07/01/2014', 350),
(1112, '08/01/2014', 400),
(1112, '09/01/2014', 450),
(1113, '01/01/2014', 30),
(1113, '02/01/2014', 40),
(1113, '03/01/2014', 45),
(1113, '06/01/2014', 45),
(1113, '07/01/2014', 60),
(1113, '08/01/2014', 50),
(1113, '09/01/2014', 20),
(1113, '10/01/2014', 10);
CREATE TABLE calendar
(calendar_date timestamp, business_day_ind boolean)
DISTRIBUTED BY (calendar_date);
INSERT INTO calendar
values ('01/01/2014', false),
('02/01/2014', true),
('03/01/2014', true),
('04/01/2014', false),
('05/01/2014', false),
('06/01/2014', true),
('07/01/2014', true),
('08/01/2014', true),
('09/01/2014', true),
('10/01/2014', true);
analyze balance;
analyze calendar;
Now the query.
select d.account, d.my_date, b.balance, c.business_day_ind
from (
    select account, start_date + interval '1 month' * (generate_series(0, duration)) AS my_date
    from (
        select account, start_date, (date_part('year', duration) * 12 + date_part('month', duration))::int as duration
        from (
            select start_date, age(end_date, start_date) as duration, account
            from (
                select account, min(balance_date) as start_date, max(balance_date) as end_date
                from balance
                group by account
            ) as sub1
        ) as sub2
    ) sub3
) as d
left outer join balance b on d.account = b.account and d.my_date = b.balance_date
join calendar c on c.calendar_date = d.my_date
order by d.account, d.my_date;
Result:
account | my_date | balance | business_day_ind
---------+---------------------+---------+------------------
1111 | 2014-01-01 00:00:00 | 100 | f
1111 | 2014-02-01 00:00:00 | 156 | t
1111 | 2014-03-01 00:00:00 | 300 | t
1111 | 2014-04-01 00:00:00 | 300 | f
1111 | 2014-05-01 00:00:00 | | f
1111 | 2014-06-01 00:00:00 | | t
1111 | 2014-07-01 00:00:00 | 468 | t
1112 | 2014-02-01 00:00:00 | 300 | t
1112 | 2014-03-01 00:00:00 | 300 | t
1112 | 2014-04-01 00:00:00 | | f
1112 | 2014-05-01 00:00:00 | | f
1112 | 2014-06-01 00:00:00 | 300 | t
1112 | 2014-07-01 00:00:00 | 350 | t
1112 | 2014-08-01 00:00:00 | 400 | t
1112 | 2014-09-01 00:00:00 | 450 | t
1113 | 2014-01-01 00:00:00 | 30 | f
1113 | 2014-02-01 00:00:00 | 40 | t
1113 | 2014-03-01 00:00:00 | 45 | t
1113 | 2014-04-01 00:00:00 | | f
1113 | 2014-05-01 00:00:00 | | f
1113 | 2014-06-01 00:00:00 | 45 | t
1113 | 2014-07-01 00:00:00 | 60 | t
1113 | 2014-08-01 00:00:00 | 50 | t
1113 | 2014-09-01 00:00:00 | 20 | t
1113 | 2014-10-01 00:00:00 | 10 | t
(25 rows)
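I had to get the min and max date for each account and then use generate_series to generate the months between the two dates. It would have been a cleaner query if you wanted a record for each day, but I had to use another subquery to get the result by month.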
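
For comparison, a per-day version of the same idea might look like the sketch below. It reuses the balance and calendar tables defined in this answer and assumes the sample dates are meant as DD/MM/YYYY (so the data would need to be loaded with a DMY DateStyle rather than the monthly reading shown above). Subtracting two dates gives the number of days between them as an integer, so the year/month arithmetic and one level of nesting drop out.

```sql
-- Assumes balance(account, balance_date, balance) and
-- calendar(calendar_date, business_day_ind) as created above,
-- with the values interpreted as consecutive days.
select d.account, d.my_date, b.balance, c.business_day_ind
from (
    -- One row per account per day between its first and last balance date.
    select account,
           start_date + interval '1 day'
               * generate_series(0, end_date::date - start_date::date) as my_date
    from (
        select account, min(balance_date) as start_date, max(balance_date) as end_date
        from balance
        group by account
    ) as sub1
) as d
left outer join balance b on d.account = b.account and d.my_date = b.balance_date
join calendar c on c.calendar_date = d.my_date
order by d.account, d.my_date;
```

Like the monthly query, this starts each account at its own first balance date. To start every account at the first calendar date, as in the question's desired output, min(balance_date) could be replaced by the earliest calendar_date.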