Suppose I have some data and run a query that returns something like this:
-----------------------------
trunc(date) | location | sum |
-----------------------------
14-June-11 | B | 5 |
-----------------------------
13-June-11 | B | 5 |
-----------------------------
14-June-11 | C | 5 |
-----------------------------
13-June-11 | C | 5 |
-----------------------------
SELECT TRUNC(DATE_TIME), MIN(LOCATION) AS LOCATION, SUM(CREDIT) AS SUM
FROM (SELECT *
      FROM TABLEA A
      WHERE A.DATE_TIME >= TO_DATE('13/JUN/2011','dd/mon/yyyy')
        AND A.DATE_TIME <= TO_DATE('15/JUN/2011','dd/mon/yyyy'))
GROUP BY TRUNC(DATE_TIME), LOCATION
There is also another table, B, which holds a list of locations:
----
A |
----
B |
----
C |
----
I want something like this:
-----------------------------
trunc(date) | location | sum |
-----------------------------
14-June-11 | A | 0 |
-----------------------------
14-June-11 | B | 5 |
-----------------------------
14-June-11 | C | 5 |
-----------------------------
13-June-11 | A | 0 |
-----------------------------
13-June-11 | B | 5 |
-----------------------------
13-June-11 | C | 5 |
-----------------------------
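I tried a right join against table B, but I can't get two separate records for location A, one for 14-June-11 and one for 13-June-11. Any suggestions or help would be greatly appreciated.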
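To illustrate why a plain outer join against table B is not enough, here is a minimal sketch using Python's built-in sqlite3 module. This is an assumption made only to get something runnable: the question targets Oracle, so SQLite's date() stands in for TRUNC, and a LEFT JOIN from tableb plays the role of the right join. The table names tablea and tableb, the column types, and the sample rows are hypothetical, chosen to match the output shown above.

```python
import sqlite3

# Hypothetical sample data: locations B and C have credits on two days,
# location A has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (date_time TEXT, location TEXT, credit INTEGER);
    CREATE TABLE tableb (location TEXT);
    INSERT INTO tablea VALUES
        ('2011-06-13 09:00:00', 'B', 5),
        ('2011-06-13 10:00:00', 'C', 5),
        ('2011-06-14 09:00:00', 'B', 5),
        ('2011-06-14 10:00:00', 'C', 5);
    INSERT INTO tableb VALUES ('A'), ('B'), ('C');
""")

# Outer join the location list onto the per-day sums.
# Location A matches nothing, so it yields ONE row with a NULL day,
# not one row per day.
plain_outer_join = conn.execute("""
    SELECT s.day, b.location, COALESCE(s.total, 0)
    FROM tableb b
    LEFT JOIN (
        SELECT date(date_time) AS day, location, SUM(credit) AS total
        FROM tablea
        GROUP BY date(date_time), location
    ) s ON s.location = b.location
    ORDER BY s.day DESC, b.location
""").fetchall()

for row in plain_outer_join:
    print(row)
```

The unmatched location comes back once, with no date attached, which is the behaviour described in the question: the outer join has no way of knowing that A should be repeated for every day.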
Answer 0: (score: 3)
There is no need to read tablea a second time just to get the distinct date_time values. This is a job for the often-overlooked partitioned outer join.
Here is an answer that uses that feature. (The tables and inserts from William Robertson's answer will help set it up.)
select a.date_time
     , b.location
     , coalesce(a.asum, 0) as asum
from   ( select   trunc(a.date_time) date_time
                , a.location
                , sum(a.credit) as asum
         from     tablea a
         where    a.date_time between date '2011-06-13' and date '2011-06-15'
         or       a.date_time is null
         group by trunc(a.date_time), a.location ) a
       -- This is the key part here...
       partition by (date_time)
       right join tableb b on a.location = b.location
order by 1 desc, 2;
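The PARTITION BY clause makes the outer join operate separately for each distinct value of date_time, creating the null outer-joined rows for each value as needed.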
+-----------+----------+------+
| DATE_TIME | LOCATION | ASUM |
+-----------+----------+------+
| 14-JUN-11 | A        |    0 |
| 14-JUN-11 | B        |    5 |
| 14-JUN-11 | C        |    5 |
| 13-JUN-11 | A        |    0 |
| 13-JUN-11 | B        |    5 |
| 13-JUN-11 | C        |    5 |
+-----------+----------+------+
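For readers without an Oracle instance at hand, the effect of the partitioned outer join can be sketched in plain Python. This is only an emulation of the semantics described above, not how Oracle executes it; the function name partitioned_right_join and the sample rows are hypothetical.

```python
from collections import defaultdict


def partitioned_right_join(sums, locations):
    """Emulate an outer join partitioned by day.

    sums      -- iterable of (day, location, total) rows, already aggregated
    locations -- the full list of locations every day should show
    """
    # Split the aggregated rows into one partition per day.
    by_day = defaultdict(dict)
    for day, location, total in sums:
        by_day[day][location] = total

    result = []
    # Outer join the location list against each partition separately;
    # a location missing from a partition gets 0 (the COALESCE step).
    for day in sorted(by_day, reverse=True):
        for location in sorted(locations):
            result.append((day, location, by_day[day].get(location, 0)))
    return result


# Hypothetical per-day sums matching the thread's example.
sums = [
    ("2011-06-14", "B", 5),
    ("2011-06-14", "C", 5),
    ("2011-06-13", "B", 5),
    ("2011-06-13", "C", 5),
]
filled = partitioned_right_join(sums, ["A", "B", "C"])
for row in filled:
    print(row)
```

Each day is treated as its own small table, so location A is filled in once per day rather than once overall.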
Answer 1: (score: 1)
How about this:
with dates as
( select distinct trunc(date_time) as date_time from tablea )
select trunc(d.date_time)
, b.location
, coalesce(sum(credit),0) as sum
from dates d
cross join tableb b
left join tablea a
on a.location = b.location
and trunc(a.date_time) = d.date_time
where d.date_time between date '2011-06-13' and date '2011-06-15' -- filter on the calendar dates so out-of-range days cannot come back as zero rows
group by d.date_time, b.location
order by 1 desc, 2;
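The same idea (build every date and location combination first, then outer join the data onto it) can be tried without Oracle. Below is a minimal sketch using Python's built-in sqlite3 module, which is an assumption made for the sake of a runnable example: date() stands in for TRUNC, and the date range is applied to the dates CTE, which is one way to keep out-of-range days from showing up. The answer's own test data is not reproduced in this thread, so the rows below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (date_time TEXT, location TEXT, credit INTEGER);
    CREATE TABLE tableb (location TEXT);
    INSERT INTO tablea VALUES
        ('2011-06-13 09:00:00', 'B', 5),
        ('2011-06-13 10:00:00', 'C', 5),
        ('2011-06-14 09:00:00', 'B', 5),
        ('2011-06-14 10:00:00', 'C', 5),
        ('2011-06-20 09:00:00', 'B', 7);  -- outside the requested range
    INSERT INTO tableb VALUES ('A'), ('B'), ('C');
""")

dense_rows = conn.execute("""
    WITH dates AS (
        -- every calendar day in the range that has at least one row
        SELECT DISTINCT date(date_time) AS day
        FROM tablea
        WHERE date(date_time) BETWEEN '2011-06-13' AND '2011-06-15'
    )
    SELECT d.day, b.location, COALESCE(SUM(a.credit), 0) AS total
    FROM dates d
    CROSS JOIN tableb b              -- all day/location combinations
    LEFT JOIN tablea a               -- attach credits where they exist
           ON a.location = b.location
          AND date(a.date_time) = d.day
    GROUP BY d.day, b.location
    ORDER BY d.day DESC, b.location
""").fetchall()

for row in dense_rows:
    print(row)
```

With this sample data the query returns six rows, one per day and location, and the 20 June row is left out because its day never enters the dates CTE.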