I have a table of purchases:
player_id | date | registration_date | price
pl1 | 2019-01-21 | 2019-01-20 | 20
pl1 | 2019-01-23 | 2019-01-20 | 10
pl1 | 2019-01-24 | 2019-01-20 | 15
After applying groupArray to `date`, arrayCumSum to `price`, and expanding the result with arrayJoin, I get a table with the cumulative sum for each day that has a purchase:
player_id | date | registration_date | sum_price
pl1 | 2019-01-21 | 2019-01-20 | 20
pl1 | 2019-01-23 | 2019-01-20 | 30
pl1 | 2019-01-24 | 2019-01-20 | 45
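The query behind this step is not shown in the question. A minimal sketch of what it could look like follows, assuming the table name `test.purchases01` used in the answer and a hypothetical alias `registrationDate`; the asker's real query may differ.

```sql
/* Sketch only: the asker's actual query is not shown.
   Table name test.purchases01 is borrowed from the answer. */
SELECT
    player_id,
    result.1 AS date,
    registrationDate AS registration_date,
    result.2 AS sum_price
FROM
(
    SELECT
        player_id,
        min(registration_date) AS registrationDate,
        /* groupArray does not guarantee element order, so sort by date first */
        arraySort(x -> x.1, groupArray((date, price))) AS purchases,
        /* running total of prices over the date-ordered purchases */
        arrayCumSum(x -> x.2, purchases) AS cum_prices,
        /* pair each date with its running total and emit one row per pair */
        arrayJoin(arrayMap(i -> (purchases[i].1, cum_prices[i]), arrayEnumerate(purchases))) AS result
    FROM test.purchases01
    GROUP BY player_id
)
ORDER BY player_id, date
```

This only produces rows for days that actually have a purchase, which is why the gaps still need to be filled.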
But I need to add the missing dates from the registration date up to today (today is '2019-01-25'):
player_id | date | registration_date | sum_price
pl1 | 2019-01-20 | 2019-01-20 | 0
pl1 | 2019-01-21 | 2019-01-20 | 20
pl1 | 2019-01-22 | 2019-01-20 | 20
pl1 | 2019-01-23 | 2019-01-20 | 30
pl1 | 2019-01-24 | 2019-01-20 | 45
pl1 | 2019-01-25 | 2019-01-20 | 45
How can I do this?
Answer 0 (score: 1)
Try this:
SELECT
    player_id,
    result.1 AS date,
    registrationDate AS registration_date,
    result.2 AS sum_price
FROM
(
    SELECT
        player_id,
        /* all (date, price) pairs of the player */
        groupArray((date, price)) AS purchases,
        min(registration_date) AS registrationDate,
        /* every day from registration through 2019-01-25, inclusive */
        arrayMap(x -> registrationDate + x, range(toUInt32(toDate('2019-01-25') - registrationDate + 1))) AS dates,
        /* days with no purchase: arrayFirstIndex returns 0 when nothing matches */
        arrayFilter(x -> arrayFirstIndex(p -> p.1 = x, purchases) = 0, dates) AS missed_dates,
        /* zero-price placeholders for those days */
        arrayMap(x -> (x, 0), missed_dates) AS dummy_purchases,
        arraySort(x -> x.1, arrayConcat(purchases, dummy_purchases)) AS all_purchases,
        /* running total over the date-ordered list */
        arrayCumSum(x -> x.2, all_purchases) AS cum_prices,
        arrayMap(index -> (all_purchases[index].1, cum_prices[index]), arrayEnumerate(all_purchases)) AS flat_result,
        /* one output row per (date, running total) pair */
        arrayJoin(flat_result) AS result
    FROM test.purchases01
    GROUP BY player_id
)
/* result
┌─player_id─┬───────date─┬─registration_date─┬─sum_price─┐
│ pl1 │ 2019-01-20 │ 2019-01-20 │ 0 │
│ pl1 │ 2019-01-21 │ 2019-01-20 │ 20 │
│ pl1 │ 2019-01-22 │ 2019-01-20 │ 20 │
│ pl1 │ 2019-01-23 │ 2019-01-20 │ 30 │
│ pl1 │ 2019-01-24 │ 2019-01-20 │ 45 │
│ pl1 │ 2019-01-25 │ 2019-01-20 │ 45 │
└───────────┴────────────┴───────────────────┴───────────┘
*/
/* Prepare test data */
CREATE TABLE test.purchases01
(
`player_id` String,
`date` Date,
`registration_date` Date,
`price` int
)
ENGINE = Memory;
INSERT INTO test.purchases01
VALUES ('pl1', '2019-01-21', '2019-01-20', 20),
('pl1', '2019-01-23', '2019-01-20', 10),
('pl1', '2019-01-24', '2019-01-20', 15);
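The calendar-building step of the query can be checked on its own. The sketch below isolates it with the two sample dates from the question; the alias `lastDate` is a name introduced here for illustration and is not part of the original answer.

```sql
/* Standalone look at the date-range step.
   Both literal dates are the sample values from the question. */
WITH
    toDate('2019-01-20') AS registrationDate,
    toDate('2019-01-25') AS lastDate  /* hypothetical alias; today() could replace the literal */
SELECT
    arrayMap(
        x -> registrationDate + x,                        /* Date + N adds N days */
        range(toUInt32(lastDate - registrationDate + 1))  /* offsets 0 .. 5 */
    ) AS dates
```

For these inputs the array holds the six dates from 2019-01-20 through 2019-01-25. The `+ 1` makes the end date inclusive; without it the last day would be dropped. In the full query, swapping the hard-coded `toDate('2019-01-25')` for `today()` should keep the range current, assuming the server's date is the one you want to report against.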