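I need help getting the total balance of all customers for each day when back-tracking through the data.
I have the following table structure in a Postgres database: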
Table1: accounts (acc)
|id|acc_created|
|1 |2019-01-01 |
|2 |2019-01-01 |
|3 |2019-01-01 |
Table2: transactions
|transaction_id|acc_id|balance|txn_created |
|1 |1 |100 |2019-01-01 07:00:00|
|2 |1 |50 |2019-01-01 16:32:10|
|3 |1 |25 |2019-01-01 22:10:59|
|4 |2 |200 |2019-01-02 18:34:22|
|5 |3 |150 |2019-01-02 15:09:43|
|6 |1 |125 |2019-01-04 04:52:31|
|7 |1 |0 |2019-01-05 05:10:00|
|8 |2 |300 |2019-01-05 12:34:56|
|9 |3 |120 |2019-01-06 23:59:59|
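The transactions table shows the balance of an account after each transaction on it.
To be honest, I am not sure how to write the query, or whether I am overthinking this. I know it will involve last_value() and coalesce(), and possibly lag() and lead(). The conditions I want to satisfy are basically:
An account takes the last balance value of the day. (E.g., on 2019-01-01 the balance of acc_id = '1' is $25, and the balances of acc_id = '2' and '3' are $0.)
On days when an account has no transactions, its balance is carried over from its previous balance. (E.g., on 2019-01-03 the balance of acc_id = '1' is $25.)
Finally, I want the total balance of all accounts aggregated by date. (E.g., at the end of 2019-01-02 the total balance should be $375 (= 25 + 200 + 150).)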
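The three conditions above can be cross-checked outside the database with a small Python sketch. This is not part of the original question; the function name `daily_totals` and the in-memory `TRANSACTIONS` list are illustrative assumptions, and the rows are copied from the sample transactions table.

```python
from datetime import date, datetime, timedelta

# Sample rows from the transactions table: (acc_id, balance, txn_created).
TRANSACTIONS = [
    (1, 100, "2019-01-01 07:00:00"),
    (1, 50, "2019-01-01 16:32:10"),
    (1, 25, "2019-01-01 22:10:59"),
    (2, 200, "2019-01-02 18:34:22"),
    (3, 150, "2019-01-02 15:09:43"),
    (1, 125, "2019-01-04 04:52:31"),
    (1, 0, "2019-01-05 05:10:00"),
    (2, 300, "2019-01-05 12:34:56"),
    (3, 120, "2019-01-06 23:59:59"),
]


def daily_totals(transactions, start, end):
    """Sum each account's most recent balance as of the end of every day."""
    # Order all transactions chronologically.
    events = sorted(
        (datetime.fromisoformat(ts), acc_id, balance)
        for acc_id, balance, ts in transactions
    )
    latest = {}  # acc_id -> last known balance (absent means $0 so far)
    totals = {}
    i = 0
    day = start
    while day <= end:
        # Apply every transaction that happened on or before this day.
        while i < len(events) and events[i][0].date() <= day:
            _, acc_id, balance = events[i]
            latest[acc_id] = balance
            i += 1
        # Accounts without a transaction today keep their previous balance.
        totals[day] = sum(latest.values())
        day += timedelta(days=1)
    return totals


for day, total in daily_totals(TRANSACTIONS, date(2019, 1, 1), date(2019, 1, 6)).items():
    print(day, total)
```

Because later transactions overwrite earlier ones for the same account, the last balance of each day wins, and days without activity simply reuse whatever is already stored.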
I have tried the following query:
SELECT date_trunc('day',date), sum(balance_of_day) FROM (
SELECT txn_created as date,
acc_id,
row_number() over (partition BY acc_id ORDER BY txn_created ASC) as order_of_created,
last_value(balance) over (partition by acc_id ORDER BY txn_created RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as balance_of_day
FROM transactions) X
where X.order_of_created = 1
GROUP BY 1
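However, this only gives me a total balance for days on which some account had a transaction.
The expected final result (based on the sample data) is: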
|date |total_balance|
|2019-01-01 |25 |
|2019-01-02 |375 |
|2019-01-03 |375 |
|2019-01-04 |475 |
|2019-01-05 |450 |
|2019-01-06 |420 |
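I do not need a breakdown by account number, only the cumulative balance of all customers at the end of each day. Please let me know how I can solve this. Thank you very much!
Answer 0 (score: 1)
You can use some cool Postgres features to accomplish this. First, to get the last balance of each day, use DISTINCT ON: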
SELECT DISTINCT on(acc_id, txn_created::date)
transaction_id, acc_id, balance, txn_created::date as day
FROM transactions
ORDER BY acc_id, txn_created::date, txn_created desc;
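For readers less familiar with DISTINCT ON, the following Python sketch mimics what the query above keeps: one row per (acc_id, day), namely the one with the latest timestamp. The helper name `last_balance_per_day` and the tuple layout are assumptions made for illustration; the rows are the sample data from the question.

```python
from datetime import date, datetime

# Sample rows: (transaction_id, acc_id, balance, txn_created).
TRANSACTIONS = [
    (1, 1, 100, "2019-01-01 07:00:00"),
    (2, 1, 50, "2019-01-01 16:32:10"),
    (3, 1, 25, "2019-01-01 22:10:59"),
    (4, 2, 200, "2019-01-02 18:34:22"),
    (5, 3, 150, "2019-01-02 15:09:43"),
    (6, 1, 125, "2019-01-04 04:52:31"),
    (7, 1, 0, "2019-01-05 05:10:00"),
    (8, 2, 300, "2019-01-05 12:34:56"),
    (9, 3, 120, "2019-01-06 23:59:59"),
]


def last_balance_per_day(transactions):
    """Keep, for each (acc_id, day), the transaction with the latest timestamp."""
    latest = {}
    for txn_id, acc_id, balance, ts in transactions:
        when = datetime.fromisoformat(ts)
        key = (acc_id, when.date())
        # Mirrors ORDER BY ..., txn_created DESC: the newest row per key wins.
        if key not in latest or when > latest[key][0]:
            latest[key] = (when, txn_id, balance)
    # Return (transaction_id, balance) per key, ordered by acc_id then day.
    return {
        key: (txn_id, balance)
        for key, (_, txn_id, balance) in sorted(latest.items())
    }


per_day = last_balance_per_day(TRANSACTIONS)
# Account 1 had three transactions on 2019-01-01; only the last one survives.
print(per_day[(1, date(2019, 1, 1))])
```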
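To work out the balance on any given day, we build a date range for each row that includes the current row's day but excludes the next row's day, partitioned by acc_id: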
SELECT transaction_id, acc_id, balance, daterange(day, lead(day, 1) OVER (partition by acc_id order by day), '[)')
FROM (
SELECT DISTINCT on(acc_id, txn_created::date)
transaction_id, acc_id, balance, txn_created::date as day
FROM transactions
ORDER BY acc_id, txn_created::date, txn_created desc
) sub;
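The half-open range idea can be illustrated in Python as well. In this sketch, `DAILY` is assumed to hold the last balance per account per transaction day for the sample data, and `balance_ranges` and `balance_on` are hypothetical helper names. An upper bound of `None` stands in for the unbounded range that Postgres produces when lead() returns NULL for an account's last row.

```python
from datetime import date

# (acc_id, day, balance): last balance on each day the account transacted.
DAILY = [
    (1, date(2019, 1, 1), 25),
    (1, date(2019, 1, 4), 125),
    (1, date(2019, 1, 5), 0),
    (2, date(2019, 1, 2), 200),
    (2, date(2019, 1, 5), 300),
    (3, date(2019, 1, 2), 150),
    (3, date(2019, 1, 6), 120),
]


def balance_ranges(daily):
    """Pair each row with [day, next_day) for the same account."""
    rows = sorted(daily)  # by acc_id, then day
    ranges = []
    for idx, (acc_id, day, balance) in enumerate(rows):
        nxt = rows[idx + 1] if idx + 1 < len(rows) else None
        # Like lead(day) OVER (PARTITION BY acc_id ORDER BY day).
        upper = nxt[1] if nxt is not None and nxt[0] == acc_id else None
        ranges.append((acc_id, day, upper, balance))
    return ranges


def balance_on(ranges, acc_id, day):
    """Return the balance whose range contains `day`, or None if none does."""
    for acc, lower, upper, balance in ranges:
        if acc == acc_id and lower <= day and (upper is None or day < upper):
            return balance
    return None


ranges = balance_ranges(DAILY)
# 2019-01-03 falls inside [2019-01-01, 2019-01-04) for account 1.
print(balance_on(ranges, 1, date(2019, 1, 3)))
```

Since the lower bound is inclusive and the upper bound exclusive, the ranges of one account never overlap, so at most one range matches any date.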
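Finally, join against generate_series, matching wherever a date from generate_series is contained in the daterange created in the previous step. The date ranges are intentionally non-overlapping, so we can safely query any date. An account has no range before its first transaction, so it contributes nothing to the sum on those days, which matches the $0 starting balance in the question.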
WITH balances as (
SELECT transaction_id, acc_id, balance, daterange(day, lead(day, 1) OVER (partition by acc_id order by day), '[)') as drange
FROM (
SELECT DISTINCT on(acc_id, txn_created::date)
transaction_id, acc_id, balance, txn_created::date as day
FROM transactions
ORDER BY acc_id, txn_created::date, txn_created desc
) sub
)
SELECT d::date, sum(balance)
FROM generate_series('2019-01-01'::date, '2019-01-06'::date, '1 day') as g(d)
JOIN balances ON d::date <@ drange
GROUP BY d::date
ORDER BY d::date;
d | sum
------------+-----
2019-01-01 | 25
2019-01-02 | 375
2019-01-03 | 375
2019-01-04 | 475
2019-01-05 | 450
2019-01-06 | 420
(6 rows)